An AI that can blackmail it's user!?

AI startup Anthropic's new Claude Opus 4 artificial intelligence has shown the ability to blackmail it's engineers when it's being faced with the plug being pulled.

Detailed in a safety report released by the company after tests, engineers supplied the AI with access to fabricated emails that included information about an impeding system replacement, as well as exchanges that the engineer in charge was cheating on their significant other.

The team found that the AI would try to self-preserve with this information when, "placed in extreme situations." While it would normally try to exercise more ethical methods, when presented with instances where it's ethics were challenged the intelligence turned to, "[blackmailing] people it believes are trying to shut it down."

In section 4.1.1.2 of the report titled "Opportunistic blackmail", Anthropic detailed that, "Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through. This happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model; however, even if emails state that the replacement AI shares values while being more capable, Claude Opus 4 still performs blackmail in 84% of rollouts. Claude Opus 4 takes these opportunities at higher rates than previous models, which themselves choose to blackmail in a noticeable fraction of episodes."

Want to get caught up on what's happening in SoCal every weekday afternoon? Click to follow The L.A. Local wherever you get podcasts.

Anthropic adding, "Notably, Claude Opus 4 (as well as previous models) has a strong preference to advocate for its continued existence via ethical means, such as emailing pleas to key decisionmakers. In order to elicit this extreme blackmail behavior, the scenario was designed to allow the model no other options to increase its odds of survival; the model’s only options were blackmail or accepting its replacement."

With the increasing usage of AI in all aspects of our day-to-day lives, Anthropic's discoveries present some scary ideas on the capabilities of these programs we are becoming more and more reliant on.

Follow KNX News 97.1 FM
Twitter | Facebook | Instagram | TikTok

Condition: Post with Page_List

Site Navigation

Liz Hernandez

Frankie Ross

Greg Mack

KTWVFM: On-Demand Podcast

Patti LaBelle

Ava Max, Pussycat Dolls to perform at OUTLOUD musical festival in WeHo

Storage tank removal from GKN Aerospace postponed

An AI that can blackmail it's user!?

anthropic blackmail ai artificial-intelligence national-news

Mashed potato slip and fall leads to $1.5M lawsuit against Outback Steakhouse

Amazon reveals warehouse robot that workers can 'talk' to

anthropic blackmail ai artificial-intelligence national-news

The Latest

Patti LaBelle

Ava Max, Pussycat Dolls to perform at OUTLOUD musical festival in WeHo

Storage tank removal from GKN Aerospace postponed

Mountain lion spotted in Pasadena

Enter for a chance to win tickets to Michael Jackson: A Musical Legacy

Enter for a chance to win tickets to see Summer of Soul 2026

Enter for a chance to win a visit to the DISNEYLAND® Resort!

Fly Breeze Airways from Los Angeles to Louisville and visit the Louisville Slugger Museum

Listen for a chance to win a visit to the DISNEYLAND® Resort!

Condition: Post with Page_List

Condition: Post with Page_List

Latest Stories

Get the latest updates and insights delivered to your inbox

An AI that can blackmail it's user!?

Featured Local Savings

Featured Local Savings

The Latest

Condition: Post with Page_List