The AI that threatened to reveal infidelity (and other stories)

An illustration by axiu about artificial intelligence.
06/12/2025

Few scenes in the history of cinema are as moving as the moment when HAL 9000, the computer that controls the ship in 2001: A Space Odyssey, realizes he is about to be disconnected and, out of sheer anguish, begins eliminating the crew to preserve himself. These days, a viral headline is circulating that recalls that existential moment: "AI blackmails a researcher to prevent it from being deactivated," says one of the formulations (in this case, in La Vanguardia, but there are variations in El Español, La Razón and many other media outlets). They all suggest the same thing and add the necessary spice for virality: the machine threatens to reveal an infidelity by the engineer who wants to shut it down.

The reality, alas, is not so cinematic. None of this actually happened: it was part of the tests that the company Anthropic carried out months ago on its AI Claude Opus 4. The testers created a fictitious situation in which the model was allowed to read a company's internal emails, and there it discovered that the engineers wanted to disconnect it and that one of them was having an extramarital affair. Faced with the possibility of dying, its responses indicated a drive for self-preservation and mostly took the form of pleading. But at one point in the experiment it was presented with a dilemma with only two options: accept eternal sleep or resort to blackmail. And it chose blackmail. The company described the case to show how it refines the algorithm, and the case certainly raises interesting questions, but it does not show at all what the headlines suggest. AI deserves in-depth debate, not these empty gestures and appeals to the establishment for a handful of clicks. Or so the AI told me to say, gulp.
