Editor's note: Many real-world tasks have goals that are complex or hard to specify in detail, which makes it difficult to measure how well a machine performs them. One solution is for humans to provide the training signal through demonstrations or judgments, but this approach tends to break down on complex tasks. OpenAI has now proposed a method for generating training signals for such complex tasks. What follows is Lunzhi's compilation of the original post.
The technique we propose is called iterated amplification, and it lets us specify complex behaviors and goals that are beyond human scale. Instead of providing labeled data or a reward function, our approach breaks a task down into multiple simpler subtasks. Although the idea is at an early stage and has only been tried on simple algorithmic tasks, we have decided to share it in its initial state because we believe it could prove a very useful way to help ensure AI safety.
Paper address: arxiv.org/abs/1810.08575
If we want to train a machine learning system to perform a task, we need a training signal: a way to measure how well the system is doing, so that it can learn to do better. Labels in supervised learning and rewards in reinforcement learning are both training signals. Machine learning research is usually organized around the assumption that a training signal is already available and focuses on learning from it, but in reality the training signal has to come from somewhere. Without a training signal, a task cannot be learned; with the wrong training signal, a system can learn wrong and even dangerous behavior. Improving our ability to generate training signals therefore helps both with learning new tasks and with AI safety.
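As a minimal illustration of the two forms of training signal mentioned above (a hypothetical sketch, not code from the paper; the function names are our own):

```python
import numpy as np

# Supervised learning: the signal is a label provided for each input.
def supervised_signal(prediction: int, label: int) -> float:
    # 0/1 loss: the learner is told exactly whether it was wrong.
    return float(prediction != label)

# Reinforcement learning: the signal is a scalar reward for behavior.
def reward_signal(agent_position: np.ndarray, goal: np.ndarray) -> float:
    # Negative distance to the goal: reward grows as the agent gets closer.
    return -float(np.linalg.norm(agent_position - goal))
```

Iterated amplification is about what to do when neither kind of signal can be written down or supplied by a human for the full task.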
How do we currently generate training signals? Sometimes the goal we want can be evaluated algorithmically, such as computing the score in a game of Go or checking whether a target score was reached. Most realistic tasks do not admit an algorithmic training signal, but we can usually obtain one by having humans either perform the task or judge the AI's performance. Many tasks, however, are so complex that humans can neither perform nor judge them well, such as designing a complex transit system or managing every detail of the security of a large network of computers.
Iterated amplification is a method of generating training signals for this second class of tasks. It assumes that even though a human cannot perform or judge the whole task directly, they can break it into meaningful pieces. In the computer-network example, a human can decompose "defend the servers and routers" into "consider attacks on the servers", "consider attacks on the routers", and "consider how these two kinds of attacks might interact". We further assume that a human can complete very small instances of the task, such as "identify whether a specific line in a log file is suspicious". If both assumptions hold, we can build a training signal for large tasks out of human training signals on small tasks.
To implement iterated amplification, we first sample small subtasks and train the AI system by imitating human demonstrations of how to solve them. We then collect slightly larger tasks, solving them by having humans split each one into small pieces that the now-trained AI can handle. We use the solutions to these slightly harder tasks, obtained with human help, as a training signal to train the AI to solve them directly. We then repeat the process on still more composite tasks. If the process works, it eventually yields a fully automated system that can solve highly complex tasks even though no direct training signal existed at the start. This process is somewhat like the expert iteration used in AlphaGo Zero, except that expert iteration amplifies an existing training signal, while iterated amplification builds a training signal from scratch. It also resembles several recent learning algorithms that use problem decomposition at test time to solve a task, but differs in that it operates without any prior training signal.
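A minimal sketch of this loop, assuming hypothetical stand-ins (`human_solve_small`, `human_decompose`, `combine`, `model`, `sample_tasks`) for the real components, might look like this; it is an illustration of the idea, not OpenAI's actual implementation:

```python
# Illustrative sketch of the iterated amplification loop described above.
# All helper names are hypothetical stand-ins supplied by the caller.

def amplify(task, model, human_decompose, combine):
    """Solve a task the model can't yet handle directly: a human splits it
    into subtasks, the current model answers each, and the human-specified
    combine step assembles the answers into a solution."""
    subtasks = human_decompose(task)
    sub_answers = [model.predict(t) for t in subtasks]
    return combine(task, sub_answers)

def train_iterated_amplification(model, human_solve_small, human_decompose,
                                 combine, sample_tasks, n_rounds):
    # Round 0: imitate humans on tiny tasks they can solve directly.
    for task in sample_tasks(difficulty=0):
        model.update(task, target=human_solve_small(task))

    # Later rounds: the amplified (human + model) system supplies the
    # training signal for tasks one level harder than the model can solve.
    for level in range(1, n_rounds + 1):
        for task in sample_tasks(difficulty=level):
            target = amplify(task, model, human_decompose, combine)
            model.update(task, target=target)
    return model
```

The key design point is that the human never produces a direct answer to a large task; they only decompose it and combine subanswers, so the cost of human supervision stays roughly constant as task difficulty grows.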
Experiments
Based on our earlier experience studying AI safety via debate, we judged that directly tackling tasks beyond human scale was too hard for a prototype project. Using the actual behavior of humans as the training signal also adds complexity, so we have not addressed that yet either. In our first experiment, we instead tried to amplify an algorithmic training signal, to show that iterated amplification can work in this simple setting. We also restricted our attention to supervised learning.
We tested the method on five toy algorithmic tasks. Each has a direct algorithmic solution, which we pretend not to know (for example, finding the shortest path between two points in a graph); each can also be solved by laboriously combining small solution pieces by hand. We used iterated amplification to learn the direct algorithm using only those pieces as the training signal, simulating the situation where a human knows how to combine pieces of a solution but cannot provide a direct training signal.
On all five tasks (permutation powering, sequential assignments, wildcard search, shortest path, and union-find), the method matched the performance of learning the task directly through supervised learning.
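To make the decomposition concrete, here is a hedged sketch of how the shortest-path task might be split into the pieces mentioned above: a path query over 2k hops is answered by combining two k-hop subqueries over a midpoint. This recursion is the kind of structure the learned model replaces; the actual decomposition in the paper may differ in its details.

```python
# Illustrative recursion for the shortest-path toy task: a query allowing
# `hops` edges is answered by combining two half-length subqueries.

def shortest_path(dist, src, dst, hops):
    """Shortest distance from src to dst using at most `hops` edges,
    where dist[u][v] is the direct edge length (inf if absent)."""
    if hops == 1:
        return dist[src][dst]  # base case a human can answer directly
    half = hops // 2
    # Subtask answers (supplied by the trained model during amplification):
    return min(shortest_path(dist, src, mid, half) +
               shortest_path(dist, mid, dst, hops - half)
               for mid in range(len(dist)))

INF = float("inf")
dist = [[0, 1, INF],
        [INF, 0, 2],
        [INF, INF, 0]]
print(shortest_path(dist, 0, 2, 2))  # -> 3 (via the middle node)
```

Because the diagonal of `dist` is zero, shorter paths are covered automatically; the combination step is purely mechanical, which is what lets small-task supervision scale up.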
The amplification approach shares many features with our earlier work on AI safety via debate. Like debate, it trains a model to directly perform or judge tasks that humans cannot complete on their own, with humans providing indirect supervision through an iterative process; the specific mechanics, however, differ. In future work, we plan to add mechanisms for feedback from real humans.