2025 AI: The Year in Review
The storylines that shaped AI in 2025 — from training paradigm shifts to the tools that actually shipped.
1. RL becomes a second scaling axis
The year's big shift: pushing capability through post-training with reinforcement learning, often with verifiable, answer-checkable rewards (RLVR), rather than through ever-bigger pretraining alone. DeepSeek-R1 and Kimi k1.5 are the reference points.
The counter-narrative: a NeurIPS runner-up paper argues that RLVR gains often reflect the model getting better at "finding the right path" among reasoning traces it could already produce, rather than learning fundamentally new reasoning patterns. Important nuance.
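To make RLVR concrete, here is a minimal sketch of an answer-checkable reward, assuming a toy exact-match grader (all names here are hypothetical; production setups swap in unit tests for code, symbolic checkers for math, and so on):

```python
import re

def verifiable_reward(model_output: str, gold_answer: str) -> float:
    """Return 1.0 iff the model's final stated answer matches the reference.

    Toy exact-match grader; real RLVR pipelines use domain-specific
    verifiers (unit tests, symbolic math checkers, etc.).
    """
    # Take the last "Answer: ..." line the model emitted.
    answers = re.findall(r"Answer:\s*(.+)", model_output)
    if not answers:
        return 0.0  # unparseable output earns no reward
    return 1.0 if answers[-1].strip() == gold_answer.strip() else 0.0
```

Binary, automatically checkable signals like this are what let RL scale without human raters in the loop.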
2. Agentic coding tools took off
Codex became a cloud software-engineering agent. Claude Code positioned itself as terminal-native agentic coding. Karpathy's "vibe coding" coinage became shorthand for a broad shift: people shipping software by iterating with LLMs rather than writing everything by hand.
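Under the hood, these tools share a propose, execute, observe loop. A minimal sketch, with a hypothetical `llm()` standing in for a real model API:

```python
import subprocess

def llm(prompt: str) -> str:
    """Stand-in for a chat-completion call (hypothetical; wire up any provider)."""
    raise NotImplementedError

def agent_step(goal: str, history: list[str]) -> None:
    """One propose -> execute -> observe iteration of a coding agent."""
    prompt = f"Goal: {goal}\n" + "\n".join(history) + "\nNext shell command:"
    command = llm(prompt).strip()  # model proposes the next shell command
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=60)
    observation = (result.stdout + result.stderr)[-2000:]  # keep only the tail
    history.append(f"$ {command}\n{observation}")  # feeds the next iteration
```

Real products add sandboxing, diff review, and planning on top, but the loop is the core.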
Reality check: METR's randomized trial suggests early-2025 AI tools initially made experienced open-source developers roughly 19% slower on their own repositories, with early signs the gap narrows as developers gain experience with the tools.
3. Embodied AI becomes a data flywheel story
Three companies to watch:
- Physical Intelligence: The "robotic foundation model" play. Open-sourced pi-0 weights and code. Notable artifact: generalist vision-language-action (VLA) policies.
- Sunday Robotics: A data-collection glove co-designed to mirror the robot's hand. "Skill capture" as the engine: human demonstrations become training data (see the sketch after this list).
- 1X: Impressive team and a credible path to a home humanoid, anchored by the NEO launch.
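The simplest version of "demonstrations become training data" is behavior cloning: supervised regression from observations to actions. A minimal sketch, with made-up dimensions and random tensors standing in for real glove recordings:

```python
import torch
from torch import nn

class BCPolicy(nn.Module):
    """Toy behavior-cloning policy: observation vector in, action vector out."""
    def __init__(self, obs_dim: int = 64, act_dim: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Each demonstration yields (observation, action) pairs;
# training is plain supervised regression over those pairs.
policy = BCPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
obs, act = torch.randn(128, 64), torch.randn(128, 7)  # stand-in demo batch
opt.zero_grad()
loss = nn.functional.mse_loss(policy(obs), act)
loss.backward()
opt.step()
```

Real VLA policies condition on images and language and use richer action heads, but the flywheel logic is the same: more demonstrations, better policy, more useful robot, more demonstrations.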
4. Open-source pressure stayed real
DeepSeek-R1's "strong reasoning, cheap" positioning was a persistent gravitational force all year. China's model releases pushed incumbents to respond faster.
5. Architecture changes get elevated again
Not just scale. Google's Titans and MIRAS were positioned as mechanisms for fast, massive-context handling by updating a "core memory" at test time. A NeurIPS best paper showed head-specific sigmoid-gated attention delivering consistent gains.
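A minimal sketch of the gated-attention idea, under one plausible reading: a sigmoid gate computed from the layer input and applied elementwise to the attention output (so each head gets its own gate values) before the output projection:

```python
import torch
from torch import nn
import torch.nn.functional as F

class GatedAttention(nn.Module):
    """Causal self-attention with a head-specific sigmoid output gate."""
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.gate = nn.Linear(d_model, d_model)  # gate values per head channel
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, time, d_model) -> (batch, heads, time, head_dim)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for z in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        attn = attn.transpose(1, 2).reshape(b, t, d)
        # sigmoid gate modulates each head's output before the projection
        return self.out(torch.sigmoid(self.gate(x)) * attn)
```

The gate gives each head a cheap way to suppress or pass its output per token, a lightweight change the paper reports as yielding consistent gains.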
6. Continual learning re-enters center stage
Google Research is pushing "Nested Learning" as a paradigm aimed at catastrophic forgetting. Dwarkesh Patel's framing of continual learning as a core missing capability shaped the year's discourse.
7. Math + AI becomes academically legitimate
Terry Tao reflected on how AI assistance helps improve proofs. Coverage of AI progress on Erdős problems reinforced the trend. Math is increasingly seen as a field where leading academics can productively use AI assistance.
8. Thinking Machines Lab enters the scene
Mira Murati's company positions itself around high talent density and blog-driven technical depth. A signal of where OpenAI diaspora talent is landing.
9. AGI timelines stay in flux
Major lab leaders kept adjusting their estimates. The discourse shifted from "if" to "when" to "what does it even mean." Goalposts keep moving as capabilities expand.
What to watch in 2026
- Will RLVR generalize beyond math/code verification?
- Does embodied AI hit meaningful deployment outside labs?
- How do agentic tools change developer workflows at scale?
- Will open-source maintain pressure or consolidate?