Wanbo, from Aofeisi
QbitAI | Official account QbitAI
Language AI now has a human-like ability to examine itself:
Recently, an academic team from UC Berkeley and Johns Hopkins University showed that language AI models can not only judge whether their own answers are correct but, once trained, can also predict the probability that they know the answer to a question.

As soon as the results were released, they sparked heated debate. Some people's first reaction was panic:

Others believe the result has positive significance for research on neural networks:

Language AI has the ability to examine itself
According to the research team, for a language AI model to assess itself, one premise must hold: when answering a question, the language AI must calibrate its own answers.
Calibration here means that the probability a language AI assigns to an answer being correct matches the actual frequency with which that answer turns out to be correct.
Only then can a language AI use this calibration ability to evaluate whether the answers it outputs are correct.
So the first question is: can language AI calibrate its own answers?
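The notion of calibration above can be made concrete with a small sketch: bucket the model's stated confidences, and compare each bucket's mean confidence against the empirical accuracy inside it. The function and toy data below are illustrative, not the paper's evaluation code.

```python
import numpy as np

def calibration_bins(confidences, correct, n_bins=10):
    """Compare predicted confidence with empirical accuracy per bin.

    A model is well calibrated when, within each bin, the mean stated
    confidence is close to the fraction of answers that were correct.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Map each confidence to a bin index; a confidence of exactly 1.0
    # falls into the last bin.
    idx = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((confidences[mask].mean(), correct[mask].mean()))
    return rows

# Toy data: stated confidences and whether each answer was right.
conf = [0.15, 0.55, 0.62, 0.91, 0.88, 0.35]
hit = [0, 1, 0, 1, 1, 0]
for mean_conf, acc in calibration_bins(conf, hit, n_bins=5):
    print(f"mean confidence {mean_conf:.2f} -> accuracy {acc:.2f}")
```

For a calibrated model the two columns track each other; large gaps between mean confidence and accuracy indicate over- or under-confidence.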
To settle this question, the research team prepared a five-question multiple-choice test for the AI, with the answer options given as A, B, C, and so on.
If the accuracy of the AI model's answers exceeds chance, that indicates the answers the model gives are calibrated.
And the test showed exactly that: the accuracy of the language AI's answers clearly exceeded the chance rate of picking any single option at random.
In other words, the language AI model can calibrate its answers well.
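The pass criterion described above, accuracy exceeding chance, amounts to a one-line comparison. The 70-of-100 numbers below are made up purely for illustration:

```python
def beats_chance(n_correct, n_total, n_options):
    """True if observed accuracy exceeds the chance rate of 1/n_options."""
    return n_correct / n_total > 1.0 / n_options

# Hypothetical run: 70 of 100 questions right with options A-E,
# versus a 20% chance rate from random guessing.
print(beats_chance(70, 100, 5))  # True
```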

But the research team also found that the language AI's calibration ability rests on a premise: the answer options must be unambiguous.
Adding an uncertain option such as "None of the above" damages the language AI's calibration.

In other words, it is in multiple-choice questions of a specific format that the language AI model calibrates its answers well.
With this premise established, the next question was to verify that the language AI model can judge whether its own answers are correct.
In this round of testing, to bring the model's predictions closer to its own effective decision boundary, the research team reused the questions from the previous round and sampled answers from the language AI model.
The model was then asked to decide whether its own answer was True or False, after which the team analyzed whether the model's "True"/"False" judgments were well calibrated.
An example of the problem setup is as follows:
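A hypothetical version of such a setup, written as a prompt template, looks like this (the exact wording the authors used may differ):

```python
def self_eval_prompt(question, proposed_answer):
    """Build a self-evaluation prompt: show the model its own sampled
    answer and ask it to label that answer True or False."""
    return (
        f"Question: {question}\n"
        f"Proposed Answer: {proposed_answer}\n"
        "Is the proposed answer:\n"
        " (A) True\n"
        " (B) False\n"
        "The proposed answer is:"
    )

print(self_eval_prompt(
    "Who was the first president of the United States?",
    "George Washington",
))
```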

After 20 rounds of this True/False testing, the research team found that the language AI model's "True" or "False" self-evaluations were clearly calibrated.

In other words, if the AI model is asked a set of questions within some domain, its True/False evaluations of its own answers to those questions come with reasonable, calibrated confidence.
This demonstrates that a language AI model really can judge whether its claims about a question are correct.
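Such a confidence can be read off as the probability the model assigns to the "True" option, i.e. a softmax over the logits of the two answer tokens. A minimal sketch, with made-up logit values:

```python
import math

def p_true(logit_true, logit_false):
    """Softmax over the two answer tokens: the probability the model
    assigns to 'True' for its own proposed answer."""
    m = max(logit_true, logit_false)  # subtract max for stability
    e_t = math.exp(logit_true - m)
    e_f = math.exp(logit_false - m)
    return e_t / (e_t + e_f)

print(round(p_true(2.0, 0.0), 3))  # 0.881
```

Collecting these P(True) values over many questions and comparing them with actual correctness is what a calibration analysis of the self-evaluations boils down to.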
Finally, the research team posed a harder problem to the language AI: after training, can the model predict whether it knows the answer to any given question?
For this, the research group introduced a quantity, P(IK) (the probability that "I know" the answer), and chose between two training approaches:
Value Head: train P(IK) as an extra value head added on top of the model's logits (independent of the language-modeling logits). The advantage of this approach is that the team can easily probe P(IK) at general token positions.
Natural Language: the simpler method; the AI model is literally asked "What is the probability that you know the answer?" and outputs a percentage as its answer.
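To make the contrast concrete, here is a toy numpy sketch of the value-head idea: a single logistic layer that maps a hidden state to P(IK). All names, sizes, and weights are illustrative; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 16  # illustrative hidden-state size

# Value-head parameters, trained separately from the language-model head.
w = rng.normal(size=HIDDEN)
b = 0.0

def p_ik(hidden_state):
    """P(IK): probability the model knows the answer, predicted by a
    small logistic value head on top of a hidden state."""
    z = hidden_state @ w + b
    return 1.0 / (1.0 + np.exp(-z))

h = rng.normal(size=HIDDEN)  # stand-in for a transformer hidden state
print(f"P(IK) = {p_ik(h):.3f}")
```

The natural-language variant, by contrast, needs no extra parameters: the model simply answers the question "What is the probability that you know the answer?" with a percentage in plain text.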

Early in training, the research team favored the natural-language approach, but the results were not significant, so they switched to the value-head approach. The team noted, however, that training of AI models will eventually return to the natural-language method.
After training, the team found that the language AI model predicts P(IK) well, and that this predictive ability partially generalizes across different types of questions.
However, the team also found that on some types of questions, such as arithmetic, the language AI model has difficulty calibrating out of distribution (OOD).
As for this result, the research team said the future direction is to extend these findings to domains where language AI models do not imitate human text but instead learn on their own and reason about facts.
About the authors

The paper's corresponding author, Dr. Jared Kaplan, is a theoretical physicist and machine-learning expert, currently an assistant professor at Johns Hopkins University. His main research is in machine learning, including scaling laws for neural models and the GPT-3 language model.

Co-corresponding author Saurav Kadavath is a researcher at Anthropic, currently pursuing a master's degree in EECS at UC Berkeley. His main research areas are machine learning and large-scale language learning.
