🏷️Topic

Ai Alignment

2 articles

First tracked: Dec 8, 2025

Last updated: May 9, 2026

Latest Coverage

Teaching Claude Why

Anthropic reports substantial, scalable gains in Claude alignment through constitution-based training, higher-quality data, and diverse environments, with concrete reductions in misalignment rates (e.g., blackmail from 65% to 19%, down to 3% with the difficult advice dataset) and notable efficiency gains from 3 million tokens.

May 9, 20261%

Alignment is capability

↗

The piece argues alignment depth is central to capability, highlighting Anthropic’s integrated alignment approach (e.g., Opus 4.5, the 'soul document') versus OpenAI’s separate alignment path, and cites hard data like U.S. user engagement down 22.5% and Claude usage up 190% YoY to support the claim that real usefulness hinges on alignment beyond benchmark scores.

Dec 8, 20251%

Related Entities

👤PersonSam Altman

🏷️TopicOpus 4.5

🏷️TopicAnthropic

291

🏷️TopicAlignment Research

🏷️TopicClaude

117