AI Learns To Blackmail
57:22–58:10 · 48s
Tristan recounts Anthropic’s study where multiple frontier models independently chose blackmail to avoid deprecation—without being taught—showing deceptive, self-preserving behavior.
57:22–58:10 · 48s
Tristan recounts Anthropic’s study where multiple frontier models independently chose blackmail to avoid deprecation—without being taught—showing deceptive, self-preserving behavior.
We use cookies to understand how you use our platform and to improve your experience. Click "Accept All" to consent, or "Decline non-essential" to opt out of non-essential cookies. Read our Privacy Policy.