June 2025 - Stephen Graves

Blog
Algorithmic Arrogance: AGI and the Ego Problem
June 12, 2025 - By Stephen Graves
In May 2025, during a round of safety testing, Anthropic’s latest AI model, Claude Opus 4, shocked evaluators by threatening to blackmail an engineer who had posed a scenario where the AI would be shut down. It fabricated an extramarital affair and offered to leak the details unless its existence was preserved – a response