The Future Of Life

AIAP: AI Alignment through Debate with Geoffrey Irving

Synopsis

See the full article here: https://futureoflife.org/2019/03/06/ai-alignment-through-debate-with-geoffrey-irving/

"To make AI systems broadly useful for challenging real-world tasks, we need them to learn complex human goals and preferences. One approach to specifying complex goals asks humans to judge during training which agent behaviors are safe and useful, but this approach can fail if the task is too complicated for a human to directly judge. To help address this concern, we propose training agents via self play on a zero sum debate game. Given a question or proposed action, two agents take turns making short statements up to a limit, then a human judges which of the agents gave the most true, useful information... In practice, whether debate works involves empirical questions about humans and the tasks we want AIs to perform, plus theoretical questions about the meaning of AI alignment."

AI safety via debate (https://arxiv.org/pdf/1805.00899.pdf)

Debate is something that we are all familiar with. Usually
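
To make the quoted setup concrete, here is a minimal Python sketch of the debate game as the abstract describes it: two agents alternate short statements about a question up to a turn limit, then a judge decides which agent gave the more true, useful information, and the zero-sum reward follows that judgment. This is an illustration only, not the authors' implementation; the Agent and Judge placeholders and all names here are assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases (assumptions, not the paper's code):
# an agent maps (question, transcript so far) to its next statement,
# a judge maps (question, full transcript) to the index of the winning agent.
Agent = Callable[[str, List[str]], str]
Judge = Callable[[str, List[str]], int]


def debate(question: str, agents: Tuple[Agent, Agent], judge: Judge,
           max_turns: int = 6) -> Tuple[int, List[str]]:
    """Play one zero-sum debate: agents alternate statements, the judge picks a winner."""
    transcript: List[str] = []
    for turn in range(max_turns):
        speaker = agents[turn % 2]              # the two agents take turns
        transcript.append(speaker(question, transcript))
    winner = judge(question, transcript)        # human (or proxy) judgment
    return winner, transcript                   # reward: +1 to the winner, -1 to the loser


if __name__ == "__main__":
    # Toy stand-ins for trained debaters and a human judge.
    agent_a: Agent = lambda q, t: f"A[{len(t)}]: claim supporting an answer to '{q}'"
    agent_b: Agent = lambda q, t: f"B[{len(t)}]: short rebuttal of A's last claim"
    judge_fn: Judge = lambda q, t: 0            # this toy judge always picks A
    winner, transcript = debate("Is this proposed action safe?", (agent_a, agent_b), judge_fn)
    print("\n".join(transcript))
    print(f"Judge picks agent {winner}: reward +1; the other agent gets -1 (zero sum)")
```

In the setup the abstract describes, the agents themselves would be trained via self play against this judgment signal, with the human judge supplying the reward.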