After Things Podcast
The Matrix Adventure and AI Revelations
- Author: Various
- Narrator: Various
- Publisher: Podcast
Information:
Synopsis
The episode opens with a long discussion of OpenAI's Strawberry / o1-style reasoning models. Andrew Mayne explains that these models seem to work better when asked to break problems into steps, use tools, and reason through tasks in a more structured way than ordinary one-shot chat models. The hosts compare this to prompt engineering, discuss examples such as decimal comparisons and counting the R's in "strawberry," and talk about how longer structured prompts, patience, and matching the model to the task can improve results.

Later, the conversation broadens into AI evaluations, benchmark gaming, model stacking, tool use, and concerns about AI persuasion. Andrew argues that leaderboard results can be misleading and that models often look strong in short tests but deteriorate with longer contexts, while Justin notes that eval methods themselves are still immature. They also discuss a Science paper about GPT-4 Turbo persuading people away from conspiracy beliefs, which Andrew frames as manipulative and a