
Claude Fable 5 and Claude Mythos 5: What Developers Need to Know

A clear guide to Claude Fable 5 and Claude Mythos 5, covering model specs, API changes, refusal handling, fallback options, and billing.

Claude Fable 5 and Claude Mythos 5

You're building something serious and you need the most capable Claude model available. But the moment you hear terms like "safety classifiers," "refusals," and "fallback credit," things start to feel complicated fast.

Here's the good news: once you understand the core concepts, it all clicks. Claude Fable 5 is Anthropic's most powerful publicly available model, built for demanding reasoning and long-horizon tasks. Claude Mythos 5 shares those same capabilities but skips the safety classifiers and is available only to select partners.

This guide breaks down both models, explains how refusals work, and shows you exactly how to handle them in your integration.


Model Overview

Model | API ID | Availability
Claude Fable 5 | claude-fable-5 | General availability
Claude Mythos 5 | claude-mythos-5 | Limited (Project Glasswing only)

Both models support a 1M token context window and up to 128k output tokens per request. Pricing is $10 per million input tokens and $50 per million output tokens.
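As a quick worked example of the pricing above, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` is illustrative and not part of any SDK, and the calculation assumes the flat rates quoted in this guide with no caching discounts applied.

```python
# Rates quoted in this guide, in USD per million tokens.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the flat rates above."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    )


# A 200k-token prompt with a 4k-token answer:
# 0.2 * $10 + 0.004 * $50 = $2.00 + $0.20 = $2.20
print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

Output tokens cost five times as much as input tokens, so long generations tend to dominate the bill even when the prompt is large.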

Claude Fable 5 is available on the Claude API, Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry. Claude Mythos 5 requires approval through Project Glasswing.


Key API Changes for These Models

These behaviors are specific to Fable 5 and Mythos 5 and do not apply to Opus, Sonnet, or Haiku.

Adaptive thinking is always on. You cannot disable it with thinking: {"type": "disabled"}. Use the effort parameter to control how deeply the model thinks.

Raw thinking is never returned, and by default no readable thinking text is shown. Set display: "summarized" if you want a readable summary of the model's thinking.

json
{
  "thinking": {
    "display": "summarized"
  }
}

When continuing a multi-turn conversation, pass thinking blocks back unchanged.
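A minimal sketch of what "unchanged" means in practice, working on plain dictionaries rather than SDK objects. The helper name `append_assistant_turn` is hypothetical, and the exact fields of a thinking block shown here (`thinking`, `signature`) are an assumption for illustration; the point is only that the assistant's content list is copied back verbatim.

```python
import copy


def append_assistant_turn(messages: list[dict], response: dict) -> list[dict]:
    """Return a new history with the assistant turn appended verbatim.

    Every content block, including thinking blocks, is copied as-is:
    nothing is filtered, reordered, or edited.
    """
    return messages + [
        {"role": "assistant", "content": copy.deepcopy(response["content"])}
    ]


# Hypothetical response content; the exact block fields are an assumption.
response = {
    "content": [
        {"type": "thinking", "thinking": "Summary of reasoning...", "signature": "abc"},
        {"type": "text", "text": "Here is the answer."},
    ]
}

history = [{"role": "user", "content": "First question"}]
history = append_assistant_turn(history, response)
history.append({"role": "user", "content": "Follow-up question"})
```

Avoid the tempting shortcut of keeping only the text blocks when building the next request, since that drops the thinking blocks the model expects to see again.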


Supported Features at Launch

  • Effort control
  • Task budgets (beta header: task-budgets-2026-03-13)
  • Memory tool
  • Context editing / tool result clearing (beta header: context-management-2025-06-27)
  • Compaction
  • Vision

Understanding Refusals

Claude Fable 5 has built-in safety classifiers. When a request is declined, you get a normal HTTP 200 response, not an error, with stop_reason: "refusal".

json
{
  "stop_reason": "refusal",
  "stop_details": {
    "type": "refusal",
    "category": "cyber",
    "explanation": "This request was declined because it could enable cyber harm."
  },
  "content": [],
  "usage": {
    "input_tokens": 412,
    "output_tokens": 0
  }
}

Refusal categories you may encounter:

Category | Meaning
cyber | Request may enable malware or exploit development. Can also trigger on legitimate security work.
bio | Request may enable biological harm. Can also trigger on regular life sciences work.
reasoning_extraction | Request asks the model to reproduce its internal reasoning as text. Use adaptive thinking instead.

Billing note: A refusal before any output is free. You are not charged, and it does not count against rate limits. A mid-stream refusal bills normally for what was already generated.

Always detect refusals by checking stop_reason === "refusal" directly. Do not rely on stop_details, which can be null.
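The check above can be sketched in Python as follows, operating on the parsed JSON body as a plain dictionary. The helper names `is_refusal` and `refusal_category` are illustrative, not SDK functions.

```python
from typing import Optional


def is_refusal(response: dict) -> bool:
    """A refusal is identified by stop_reason alone."""
    return response.get("stop_reason") == "refusal"


def refusal_category(response: dict) -> Optional[str]:
    """Return the category if present; stop_details may be null or absent."""
    details = response.get("stop_details") or {}
    return details.get("category")


with_details = {
    "stop_reason": "refusal",
    "stop_details": {"type": "refusal", "category": "cyber"},
}
without_details = {"stop_reason": "refusal", "stop_details": None}

for resp in (with_details, without_details):
    if is_refusal(resp):
        # The category is optional context, never the detection signal.
        print("refused:", refusal_category(resp) or "no details")
```

Keeping detection and categorization in separate functions means a null stop_details can never cause a refusal to be missed.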


How to Handle Refusals: Three Options

Your setup | Best approach
Claude API or Claude Platform on AWS | Server-side fallback
TypeScript, Python, Go, Java, or C# SDK | SDK middleware
Ruby, PHP, or raw HTTP | Manual retry with fallback credit

Server-Side Fallback (Simplest Setup)

Add a fallbacks list and a beta header. If Fable 5 declines, the API automatically retries on the next model. You get one response.

typescript
const response = await client.beta.messages.create({
  model: "claude-fable-5",
  max_tokens: 1024,
  messages,
  betas: ["server-side-fallback-2026-06-01"],
  fallbacks: [{ model: "claude-opus-4-8" }]
});

python
response = client.beta.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
    fallbacks=[{"model": "claude-opus-4-8"}],
    betas=["server-side-fallback-2026-06-01"],
)

The response tells you which model actually answered via the top-level model field. A fallback content block marks where one model handed off to another.

To check if the fallback ran:

python
fallback_ran = any(
    iteration.type == "fallback_message"
    for iteration in response.usage.iterations or []
)

Limits: Server-side fallback is beta on Claude API and Claude Platform on AWS only. It is not available on Bedrock, Vertex AI, Microsoft Foundry, or Message Batches.


SDK Middleware (Works on Any Platform)

Configure fallback once on the client. All requests through client.beta.messages handle retries automatically.

python
from anthropic import Anthropic, BetaFallbackState, BetaRefusalFallbackMiddleware

client = Anthropic(
    middleware=[BetaRefusalFallbackMiddleware([{"model": "claude-opus-4-8"}])],
)

state = BetaFallbackState()

with state:
    message = client.beta.messages.create(
        max_tokens=1024,
        model="claude-fable-5",
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )

typescript
// Anthropic is the SDK's default export; the fallback helpers are named exports.
import Anthropic, { BetaFallbackState, betaRefusalFallbackMiddleware } from "@anthropic-ai/sdk";

const client = new Anthropic({
  middleware: [betaRefusalFallbackMiddleware([{ model: "claude-opus-4-8" }])]
});

const fallbackState = new BetaFallbackState();

const message = await client.beta.messages.create(
  {
    model: "claude-fable-5",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello, Claude" }]
  },
  { fallbackState }
);

Share one BetaFallbackState instance across turns so follow-up requests stay pinned to the model that accepted the first message. The middleware is not yet available in Ruby or PHP.


Manual Retry with Fallback Credit (Ruby, PHP, Raw HTTP)

When you build the retry yourself, you can avoid paying twice for prompt caching by using a fallback credit token included in the refusal response.

Basic flow:

  1. Send the initial request with the header anthropic-beta: fallback-credit-2026-06-01.
  2. On refusal, read stop_details.fallback_credit_token.
  3. Retry on the fallback model with the same body plus fallback_credit_token.

bash
# Initial request
response=$(curl -sS https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: fallback-credit-2026-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Claude"}]
  }')

token=$(jq -r '.stop_details.fallback_credit_token // empty' <<< "$response")

if [[ -n "$token" ]]; then
  curl -sS https://api.anthropic.com/v1/messages \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: fallback-credit-2026-06-01" \
    -H "content-type: application/json" \
    -d "$(jq -n --arg t "$token" '{
      model: "claude-opus-4-8",
      max_tokens: 1024,
      messages: [{"role": "user", "content": "Hello, Claude"}],
      fallback_credit_token: $t
    }')"
fi

Credit tokens expire after 5 minutes and are single-use. The credit shows up as lower cache_creation_input_tokens and higher cache_read_input_tokens on the retry.
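For readers scripting the retry outside of a shell, here is a minimal Python sketch of step 3 that also respects the 5-minute expiry. It only builds the request body and leaves the HTTP call to you. The helper name `build_retry_body` and its `received_at` parameter are assumptions made for illustration.

```python
import time
from typing import Optional

CREDIT_TTL_SECONDS = 5 * 60  # tokens expire after 5 minutes, per this guide


def build_retry_body(
    original_body: dict,
    refusal: dict,
    fallback_model: str,
    received_at: float,
    now: Optional[float] = None,
) -> dict:
    """Build the fallback request body from a refused response.

    The credit token is attached only if one was returned and it is
    still inside the 5-minute window; otherwise the retry goes out
    without it (and without the caching credit).
    """
    now = time.time() if now is None else now
    body = {**original_body, "model": fallback_model}
    token = (refusal.get("stop_details") or {}).get("fallback_credit_token")
    if token and now - received_at < CREDIT_TTL_SECONDS:
        body["fallback_credit_token"] = token
    return body
```

Because tokens are single-use, build the retry body once per refusal and discard the token after the request is sent, whether or not it succeeds.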


Common Mistakes to Avoid

  • Retrying on the same model. A model that refused will almost always refuse again. Always point retries at a fallback model.
  • Treating refusals like errors. Refusals return HTTP 200. Standard error-rate monitoring will miss them entirely. Add explicit tracking for stop_reason === "refusal".
  • Forgetting sub-agents. The fallbacks parameter does not propagate into tool calls made inside an agent. Configure fallback on each sub-agent call too.
  • Using stop_details to detect refusals. It can be null even on a real refusal. Always check stop_reason directly.
  • Mixing middleware and server-side fallback. Use one or the other, never both on the same request.
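Since refusals arrive as HTTP 200 responses, they need their own counter. The sketch below shows one way to track them from parsed response bodies. `RefusalTracker` is a hypothetical helper, and in a real service you would likely forward these counts to your existing metrics system instead of keeping them in memory.

```python
from collections import Counter


class RefusalTracker:
    """Counts refusals separately from HTTP errors."""

    def __init__(self) -> None:
        self.total = 0
        self.refusals: Counter = Counter()

    def record(self, response: dict) -> None:
        """Record one successful (HTTP 200) response body."""
        self.total += 1
        if response.get("stop_reason") == "refusal":
            # stop_details can be null, so fall back to "unknown".
            details = response.get("stop_details") or {}
            self.refusals[details.get("category") or "unknown"] += 1

    def refusal_rate(self) -> float:
        """Fraction of recorded responses that were refusals."""
        if self.total == 0:
            return 0.0
        return sum(self.refusals.values()) / self.total


tracker = RefusalTracker()
tracker.record({"stop_reason": "end_turn"})
tracker.record({"stop_reason": "refusal", "stop_details": {"category": "cyber"}})
tracker.record({"stop_reason": "refusal", "stop_details": None})
print(dict(tracker.refusals), tracker.refusal_rate())
```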

Data Retention

Claude Fable 5 and Claude Mythos 5 are Covered Models. They carry a 30-day data retention requirement and are not available under zero data retention agreements.


Q&A

1. What is the difference between Claude Fable 5 and Claude Mythos 5?

They share the same underlying capabilities. Claude Fable 5 includes safety classifiers and is publicly available. Claude Mythos 5 removes those classifiers and is available only to approved partners through Project Glasswing.

2. Can I disable thinking on Claude Fable 5?

No. Adaptive thinking is always on for Fable 5 and Mythos 5. You can control depth using the effort parameter, but you cannot turn thinking off entirely.

3. Will I be charged for a refused request?

Not if the refusal happens before any output is generated. The token counts appear in usage but are not billed and do not consume rate limits. A mid-stream refusal is billed normally for the output already produced.

4. Which models can I fall back to from Claude Fable 5?

At launch, the permitted fallback target is Claude Opus 4.8 (claude-opus-4-8). You can check permitted targets via the Models API when using the server-side-fallback-2026-06-01 beta header.

5. Is server-side fallback available on Amazon Bedrock?

No. Server-side fallback is in beta on the Claude API and Claude Platform on AWS only. On Bedrock, Vertex AI, and Microsoft Foundry, use the SDK middleware instead.

6. What is sticky routing?

Once a conversation falls back to a different model, later requests in that conversation are served directly by the fallback model without retrying Fable 5. This avoids paying for an attempt that would predictably be refused again. The pin lasts approximately one hour.
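If you build retries by hand, you can approximate the same idea on the client. The sketch below is an illustration of the concept only, not how the server implements it. The `StickyRouter` class, the conversation ID key, and the exact one-hour TTL are all assumptions.

```python
import time
from typing import Optional

STICKY_TTL_SECONDS = 60 * 60  # "approximately one hour", per the answer above


class StickyRouter:
    """Remembers which model a conversation fell back to, for a limited time."""

    def __init__(self, primary_model: str) -> None:
        self.primary_model = primary_model
        self._pins: dict[str, tuple[str, float]] = {}

    def pin(self, conversation_id: str, model: str, now: Optional[float] = None) -> None:
        """Pin a conversation to the fallback model that accepted it."""
        now = time.time() if now is None else now
        self._pins[conversation_id] = (model, now + STICKY_TTL_SECONDS)

    def model_for(self, conversation_id: str, now: Optional[float] = None) -> str:
        """Return the pinned model while the pin is live, else the primary."""
        now = time.time() if now is None else now
        pinned = self._pins.get(conversation_id)
        if pinned and now < pinned[1]:
            return pinned[0]
        self._pins.pop(conversation_id, None)  # expired or never pinned
        return self.primary_model


router = StickyRouter("claude-fable-5")
router.pin("conv-1", "claude-opus-4-8")
print(router.model_for("conv-1"), router.model_for("conv-2"))
```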

7. What is fallback credit and do I always need it?

Fallback credit prevents you from paying twice for prompt caching when you retry a refused request on a different model. You do not need to manage it manually if you use server-side fallback or the SDK middleware, both of which apply it automatically.

8. How do I know which model actually served my response?

Check the top-level model field in the response. With server-side fallback, a fallback content block also marks where the handoff occurred, and usage.iterations gives a per-model breakdown.

9. Can I use the fallbacks parameter in Message Batches?

No. Including fallbacks in a batch request produces a per-item error. To handle refusals in batches, collect the refused items from results and resubmit them on a fallback model as a new batch.
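The collect-and-resubmit step might look like the sketch below. It assumes each batch request is a dictionary with `custom_id` and `params`, and that a refused item appears in the results as a succeeded entry whose message has stop_reason "refusal". Treat both shapes as assumptions to verify against the results you actually receive. `build_fallback_batch` is a hypothetical helper name.

```python
def build_fallback_batch(
    requests: list[dict], results: list[dict], fallback_model: str
) -> list[dict]:
    """Return new batch requests for every item whose message was refused."""
    params_by_id = {r["custom_id"]: r["params"] for r in requests}
    retry = []
    for item in results:
        result = item.get("result") or {}
        message = result.get("message") or {}
        if result.get("type") == "succeeded" and message.get("stop_reason") == "refusal":
            # Same params as before, pointed at the fallback model.
            params = {**params_by_id[item["custom_id"]], "model": fallback_model}
            retry.append({"custom_id": item["custom_id"], "params": params})
    return retry
```

Reusing the original custom_id values makes it straightforward to merge the second batch's results back into the first.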

10. What should I do if the credit token is rejected?

Follow the degradation ladder: first retry with the same token but without the appended assistant message; if that is also rejected, retry without the token at all. If the error says "redemption temporarily unavailable," that is transient, so retry the same way within the 5-minute token window instead of moving to the next step.
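The ladder can be sketched as a small loop. Everything about error handling here is an assumption: `CreditRejected` is a hypothetical exception that your own `send` function would raise when the API rejects the token, and the transient case is detected by matching the message text quoted above. A production version would also add backoff between transient retries and stop retrying once the 5-minute token window has passed.

```python
from typing import Callable, Optional

TRANSIENT_MARKER = "redemption temporarily unavailable"


class CreditRejected(Exception):
    """Hypothetical error that `send` raises when the API rejects the token."""


def retry_with_ladder(
    send: Callable[[dict], dict],
    body: dict,
    token: str,
    appended_assistant: Optional[dict] = None,
    max_transient_retries: int = 3,
) -> dict:
    """Walk the degradation ladder until one request is accepted."""
    # Each rung is (use_token, use_appended_message), most complete first.
    rungs = [(True, True), (True, False), (False, False)]
    if appended_assistant is None:
        rungs = rungs[1:]  # nothing to append, so the first rung is redundant

    for use_token, use_appended in rungs:
        request = dict(body)
        if use_appended:
            request["messages"] = body["messages"] + [appended_assistant]
        if use_token:
            request["fallback_credit_token"] = token

        transient_retries = 0
        while True:
            try:
                return send(request)
            except CreditRejected as err:
                transient = TRANSIENT_MARKER in str(err).lower()
                if transient and transient_retries < max_transient_retries:
                    transient_retries += 1
                    continue  # transient: repeat the same rung
                break  # rejected: drop to the next rung

    raise RuntimeError("all rungs of the degradation ladder were rejected")
```

Passing `send` in as a callable keeps the ladder logic independent of the HTTP client, which also makes it easy to exercise with a fake sender in tests.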


Made with ❤️ by Mun Bock Ho

Copyright ©️ 2026