The Fable Debacle
The current case study in AI companies’ ongoing failure to protect your most sensitive data
Today’s LLM call is tomorrow’s leak, and it’s clear the AI companies won’t fix it.
The past weeks’ AI headlines are filled with Anthropic, again. Or perhaps more accurately, still. The Fable release has been a bit of a disaster for the company and a disappointment for users.
The most recent development: the U.S. government used export controls to undo Fable’s release. As I was writing my last piece on Fable’s guardrails, a security researcher reported a partial jailbreak. A twice-weekly posting cadence doesn’t give me the luxury of reacting as events unfold.
I have another piece coming this week on why none of this should surprise anyone watching this industry, not just this company.
Last week I predicted that two issues with the Fable release would be raised.
30-day retention. Opus stores nothing. Fable stores everything for 30 days, even for organizations with a negotiated zero-data-retention agreement. The change was documented, but buried in a support article most people never read before their first API call. Your developer didn’t know the model they reached for had different terms than the one they used yesterday.
Wording in the release admitted a potential jailbreak vector without disclosing important information about it. Anthropic was vague in ways that a company asking for this level of trust cannot afford to be. In their own statement, they wrote that their safeguards are “so strong that many users have complained that they are overly broad.” Broad and strong are not the same property.
The jailbreak itself raised the flags that the release’s wording should have, and more. Anthropic later described it as “a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.” That is what triggered a federal export control directive. A typical dev workflow.
The U.S. government put export controls on Fable and Mythos, effectively shutting them down.
From what I have followed of the known exploit, I see no evidence so far that it makes Fable a greater risk than Opus. If that’s true, the partial exploit may simply have been the excuse needed to shut the models down and investigate something else, something more serious.
The true capabilities of Fable and Mythos are still being debated, especially now that there’s no access. But the consensus seems to be that they are about equal to Opus, just with much better performance at extremely long contexts. This alone is a powerful advantage for Anthropic and the United States. To me, a likely concern is distillation attempts by foreign companies.
I do not use Fable. My opinions come from watching the discussion online, and most users don’t seem to be seeing a significant improvement over Opus. Every time I see a credible source mention improvement, they mention extremely long autonomous sessions, typically eight or more hours. Long sessions and long contexts are an area of active research that promises to vastly improve the field and provide massive advantages to the country that controls it.
It makes one wonder if the government has some reason to hold the logs longer than 30 days. Anthropic’s own policy already leaves the door open for that: data is deleted automatically after 30 days, “except in the rare cases where it’s part of a safety investigation or we’re legally required to keep it.” An active federal directive is about as legally required as it gets, and a distillation attempt at this level may well draw government attention.
None of this had to be a bind. Anthropic called Mythos too dangerous to release publicly, then paid fifty industry partners to use it anyway. If they’d been precise from the start about what that actually meant, dangerous enough to need Glasswing or not, there’d be one story instead of two that can’t both survive contact with a federal export control directive. Instead they get to choose between contradicting their own safety framing and contradicting their own objection to the government. They chose neither, and called it caution.
