Anthropic reopens Fable 5, and publishes cyber safeguards plus jailbreak criteria

SNACK three-line summary

Anthropic has reopened Fable 5 to users worldwide. The model returned on July 1, after access had been paused in mid-June under a U.S. government directive.
On July 2, the company also published details of its cyber safety classifiers and of a draft jailbreak severity framework. The post is less about model performance than about which requests should be blocked and which defensive requests should be allowed.
The key point is that Anthropic is reopening a powerful AI while also disclosing the operating rules around it. Developers and general readers should watch not only the new model news, but also the scope of safeguards, false positives, and blocked requests.

Image source: Anthropic official newsroom

Snackgirls editor note

AIKO: “This is not simply a story about a model coming back. It is a case where the company also explained how safeguards should be described and verified when releasing a powerful AI.”

Red: “For developers, Fable 5 reopening matters, but they also need to watch whether legitimate security-testing requests could be blocked too. The faster the model, the more important the rules of use become.”

What has reopened?

In an official post on June 30, Anthropic said Claude Fable 5 would again be available to users worldwide from July 1. The affected services are Claude Platform, Claude.ai, Claude Code, and Claude Cowork. Access to Fable 5 and Mythos 5 had previously been suspended after a U.S. government directive, and Game Sunakku also covered that suspension at the time.

The core of this follow-up is not simply “service resumed.” Alongside the restored access, Anthropic disclosed the safety classifiers applied to Fable 5, reviews by governments and partners, and discussions around jailbreak severity criteria. In other words, it reads more like an operating report explaining why a powerful model is blocked in some cases and what it will allow when reopened.

What the cyber safety classifier does

In its July 2 post, Anthropic divided cybersecurity-related requests to Fable 5 into four levels. The structure blocks requests with a high potential for major harm, allows everyday defensive requests, and may monitor or block some boundary cases.

Put simply, Anthropic has placed something like a traffic light next to the model to judge whether the AI may help with a given piece of security work. In this simplified picture, green means ordinary safe requests, red means requests with a high potential for harm, and yellow means requests where defensive and abusive possibilities are mixed. Anthropic said it has treated this yellow-light area more conservatively in Fable 5.
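For developers, the traffic-light analogy can be sketched in a few lines of code. This is an illustration only: the `Tier` names, the `decide` function, and the `conservative` flag are all hypothetical, invented here to mirror the analogy. They are not Anthropic's classifier, its API, or the level definitions in its post.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical tiers taken from the traffic-light analogy, not from Anthropic."""
    GREEN = "ordinary safe request"
    YELLOW = "mixed defensive and abusive potential"
    RED = "high potential for harm"


def decide(tier: Tier, conservative: bool = True) -> str:
    """Return 'allow', 'monitor', or 'block' for a request already assigned a tier.

    The `conservative` flag is an assumption. It stands in for the article's
    point that the yellow area is handled more conservatively in Fable 5.
    """
    if tier is Tier.RED:
        return "block"
    if tier is Tier.GREEN:
        return "allow"
    # Yellow: boundary cases may be monitored or blocked.
    return "block" if conservative else "monitor"


if __name__ == "__main__":
    for tier in Tier:
        print(tier.name, "->", decide(tier))
```

The point of the sketch is that only the yellow branch changes when the policy becomes more conservative. That is also where legitimate defensive requests are most likely to be caught.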

Why bring up jailbreak criteria separately?

Anthropic said it is working with Glasswing partners including Amazon, Microsoft, and Google on a draft AI jailbreak severity framework. Here, a jailbreak means a prompting method intended to bypass a model’s safeguards. The company explains that because not every bypass attempt carries the same risk, a consistent way to classify severity is needed.

This part matters for general readers too. Future controversies around AI models will be hard to reduce to a single line of “it was broken” or “it was not broken.” Readers will need to look at which capability was unlocked, whether there is a realistic chance of harm, and whether defensive use is being blocked as well. This announcement can be seen as part of a trend toward disclosing AI safety issues in a more concrete, product-spec-like way.

What users should watch carefully

Still, stronger safeguards do not automatically mean every user will have a smoother experience. Anthropic also acknowledged that conservative classifiers may more often block legitimate coding and debugging requests. In particular, defensive requests such as security checks, vulnerability analysis, and code review may look sensitive in wording, so users need to watch for false positives.
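One simple way to watch for false positives is to keep a small log of requests your team considers legitimate and count how many were blocked. Below is a minimal sketch, assuming a hypothetical log format. The `RequestLog` fields and the sample entries are made up for illustration and are not real measurements.

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    """Hypothetical record of one request; fields are illustrative only."""
    description: str
    legitimate: bool  # judged legitimate by the user or team
    blocked: bool     # whether the request was blocked


def false_positive_rate(logs: list[RequestLog]) -> float:
    """Share of legitimate requests that were blocked (0.0 if none are legitimate)."""
    legit = [log for log in logs if log.legitimate]
    if not legit:
        return 0.0
    return sum(log.blocked for log in legit) / len(legit)


if __name__ == "__main__":
    # Made-up sample entries, not real data.
    sample = [
        RequestLog("code review of a login handler", True, False),
        RequestLog("vulnerability analysis of our own service", True, True),
        RequestLog("debugging a failing unit test", True, False),
        RequestLog("security checklist for a release", True, False),
    ]
    print(f"{false_positive_rate(sample):.0%}")  # prints 25%
```

Tracking this figure over time shows whether rewording a defensive request, or a change in the safeguards, actually reduces how often legitimate work is blocked.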

In short, this Fable 5 follow-up is both news of a model returning and a public example of how AI safety operations are being handled. As performance rises, product news cannot be understood by model names alone. Going forward, readers will need the habit of looking at access rights, blocking criteria, false-positive risks, and third-party review processes together.

Sources and checked date · Published 2026-07-02 / Checked 2026-07-03T01:05:53+00:00
