Claude Sonnet 5: What's Actually New

This guide breaks down what's actually new in Anthropic's latest Sonnet-tier model, Claude Sonnet 5.

Key Specifications of Claude Sonnet 5

| Feature | Claude Sonnet 5 |
|---|---|
| Description | Speed and Intelligence |
| API model ID | claude-sonnet-5 |
| Context window | 1M tokens |
| Max output tokens | 128k |
| Latency | Fast |
| Extended thinking | No |
| Adaptive thinking | Yes |
| Knowledge cutoff | Jan 2026 |

Three Behavior Changes You Need to Know

These are the changes that can actually break your existing code if you skip them.

1. Adaptive thinking is on by default

On Claude Sonnet 4.6, a request with no thinking field just ran without thinking. On Claude Sonnet 5, that same request now runs with adaptive thinking automatically.

If you want the old behavior, turn it off explicitly:

```python
thinking = {"type": "disabled"}
```

One thing to watch: max_tokens covers thinking tokens plus your response text combined. If your workload used to run without thinking, revisit your max_tokens value so your response doesn't get cut off.
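As a minimal sketch of the opt-out, the helpers below assemble request arguments with thinking disabled and check whether a response stopped because it ran into max_tokens. The names build_request and was_truncated and the 4096 default are illustrative assumptions, not part of any SDK. The stop_reason value "max_tokens" is what the Messages API reports when output is cut off at the limit.

```python
from typing import Any


def build_request(
    prompt: str, max_tokens: int = 4096, thinking_enabled: bool = True
) -> dict[str, Any]:
    """Assemble keyword arguments for client.messages.create (hypothetical helper)."""
    request: dict[str, Any] = {
        "model": "claude-sonnet-5",
        "max_tokens": max_tokens,  # shared by thinking tokens and response text
        "messages": [{"role": "user", "content": prompt}],
    }
    if not thinking_enabled:
        # Explicit opt-out, restoring the no-thinking behavior described above.
        request["thinking"] = {"type": "disabled"}
    return request


def was_truncated(response: Any) -> bool:
    """Return True if the response ended because it hit the max_tokens limit."""
    return getattr(response, "stop_reason", None) == "max_tokens"
```

The dictionary can be passed as client.messages.create(**build_request(...)). If was_truncated returns True for a workload that used to fit, raising max_tokens is the first thing to try.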

2. Sampling parameters are no longer accepted

Setting temperature, top_p, or top_k to anything other than the default now returns a 400 error.

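```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# This will fail on Claude Sonnet 5: a non-default temperature returns a 400 error
response = client.messages.create(
    model="claude-sonnet-5",
    temperature=0.7,
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}],
)
```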

Fix: remove these parameters entirely, or leave them at default. If you were using temperature to steer tone or creativity, move that guidance into your system prompt instead.

This same restriction already applies to Claude Opus 4.7, so it's not a totally new pattern.
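If requests are built in one place, the fix can be applied there once. The sketch below assumes requests are held as plain keyword-argument dictionaries. strip_sampling_params is a hypothetical helper name, and it removes the three parameters outright rather than checking whether they happen to equal the default.

```python
from typing import Any

# Parameters that Claude Sonnet 5 rejects when set to a non-default value.
SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def strip_sampling_params(request: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request without any sampling parameters."""
    return {key: value for key, value in request.items() if key not in SAMPLING_PARAMS}
```

The original dictionary is left untouched, so the same request can still be sent unchanged to an older model that accepts these parameters.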

3. Manual extended thinking is removed

Manually setting a thinking budget no longer works:

```python
# Not supported on Claude Sonnet 5 (returns 400)
thinking = {"type": "enabled", "budget_tokens": 32000}

# Use this instead
thinking = {"type": "adaptive"}
```

This matches Claude Opus 4.8 and Claude Opus 4.7. If your code manually sets budget_tokens, switch to adaptive thinking and use the effort parameter if you need more control over how much the model thinks.
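For codebases that build the thinking field programmatically, a small converter can handle the switch. This is a sketch under the assumption that thinking configs are plain dictionaries; migrate_thinking is a hypothetical name. The effort parameter is deliberately left out because this article does not show its request shape, so check the API reference before adding it.

```python
from typing import Any, Optional


def migrate_thinking(thinking: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Convert a manual thinking budget into an adaptive thinking config."""
    if thinking is None:
        # No thinking field: Claude Sonnet 5 applies adaptive thinking by default.
        return None
    if thinking.get("type") == "enabled":
        # Manual budgets are rejected, so budget_tokens is dropped entirely.
        return {"type": "adaptive"}
    # "adaptive" and "disabled" configs are passed through as copies.
    return dict(thinking)
```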

The New Tokenizer (and Why It Matters for Your Budget)

Claude Sonnet 5 uses a new tokenizer. The same text now produces roughly 30% more tokens than it did on Claude Sonnet 4.6.

Nothing about the API shape changes: your requests, responses, and streaming events all work the same way. But anything measured in tokens will look different.

Here's what to double check:

  • Token counts: old counts from Claude Sonnet 4.6 don't apply anymore. Recount your prompts against Claude Sonnet 5.
  • Effective context size: the window is still 1M tokens, but each token holds less text, so the same window fits less content than before.
  • max_tokens limits: a limit that worked fine on Claude Sonnet 4.6 might now truncate your output. Recheck any limit set close to your expected output length.
  • Cost per request: pricing per token hasn't changed, but since your text now produces more tokens, your actual bill per request can shift.
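To get a feel for the size of the shift before recounting, the arithmetic can be sketched as follows. The 1.30 factor is an assumption taken from the "roughly 30%" figure above. The real ratio depends on your text, so treat these numbers as planning estimates and not as a substitute for an actual recount.

```python
# Assumption: the article's "roughly 30% more tokens"; actual ratios vary by content.
TOKEN_INFLATION = 1.30
CONTEXT_WINDOW = 1_000_000  # Claude Sonnet 5 context window, in tokens


def estimate_new_tokens(old_tokens: int) -> int:
    """Estimate the Claude Sonnet 5 count for text measured on Claude Sonnet 4.6."""
    return round(old_tokens * TOKEN_INFLATION)


def effective_window_in_old_tokens() -> int:
    """Express the context window in terms of the amount of text the old tokenizer counted."""
    return int(CONTEXT_WINDOW / TOKEN_INFLATION)
```

Under this assumption, a prompt measured at 10,000 tokens on Claude Sonnet 4.6 comes out near 13,000 tokens, and the 1M window holds about as much text as roughly 769,000 old-tokenizer tokens did.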

Claude Sonnet 5 vs Claude Sonnet 4.6

| | Claude Sonnet 4.6 | Claude Sonnet 5 |
|---|---|---|
| Default thinking | Off | Adaptive (on) |
| Manual thinking budget | Deprecated | Removed (400 error) |
| Custom sampling params | Allowed | Returns 400 error |
| Tokenizer | Older | New (~30% more tokens for same text) |
| Priority Tier | Supported | Not supported |
| Assistant message prefilling | Not supported | Not supported |

Pricing of Claude Sonnet 5

| Claude Sonnet 5 pricing | Input (per M tokens) | Output (per M tokens) |
|---|---|---|
| Standard | $3 | $15 |
| Intro (through Aug 31, 2026) | $2 | $10 |
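The prices above translate into a per-request cost with simple arithmetic. The sketch below uses only the figures from this table. The function name and the "standard" and "intro" tier labels are illustrative assumptions.

```python
# USD per million tokens, taken from the pricing table above.
PRICES = {
    "standard": {"input": 3.00, "output": 15.00},
    "intro": {"input": 2.00, "output": 10.00},  # through Aug 31, 2026
}


def request_cost_usd(input_tokens: int, output_tokens: int, tier: str = "standard") -> float:
    """Estimate the cost of one request from its input and output token counts."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000
```

For example, a request with 13,000 input tokens and 1,300 output tokens costs about $0.0585 at standard pricing and about $0.039 at intro pricing. Output tokens here include any thinking tokens, since those count against max_tokens.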

Where You Can Use Claude Sonnet 5

Claude Sonnet 5 is available through:

  • Claude API
  • Amazon Bedrock (AWS)
  • Google Cloud
  • Microsoft Foundry (preview)

Claude Sonnet 5 also supports zero data retention for organizations with a ZDR agreement.

How to Migrate Your Code

Step one is simple: swap the model ID.

```python
model = "claude-sonnet-4-6"  # Before
model = "claude-sonnet-5"    # After
```

Then work through this checklist:

  1. Recount your tokens. Use the token counting API against Claude Sonnet 5 and adjust any max_tokens limits that are close to your expected output size.
  2. Replace manual thinking budgets. Swap budget_tokens for {"type": "adaptive"}.
  3. Strip out sampling parameters. Remove any non-default temperature, top_p, or top_k values from your requests.
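The recount in step 1 can be sketched with the Anthropic Python SDK's client.messages.count_tokens method, which returns an object with an input_tokens field. The helper names, the client-as-argument design, and the 20% margin below are assumptions for illustration. Pick a margin that suits your own workload.

```python
from typing import Any


def recount_input_tokens(
    client: Any, messages: list[dict[str, Any]], model: str = "claude-sonnet-5"
) -> int:
    """Count prompt tokens for the given model via the token counting API."""
    result = client.messages.count_tokens(model=model, messages=messages)
    return result.input_tokens


def needs_more_headroom(
    expected_output_tokens: int, max_tokens: int, margin: float = 0.2
) -> bool:
    """Flag a max_tokens limit that sits too close to the expected output size."""
    # The 20% default margin is an arbitrary illustrative choice.
    return expected_output_tokens * (1 + margin) > max_tokens
```

In real use, client would be anthropic.Anthropic(). Passing it in as an argument keeps the helper easy to exercise with a stub in unit tests.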

Everything else, including tool definitions and response formats, stays the same. Assistant message prefilling was already unsupported on Claude Sonnet 4.6, so that's not a new limitation.

Q&A

1. Do I need to rewrite my whole integration to use Claude Sonnet 5?

No. The API shape is unchanged, so it's close to a drop-in replacement: change the model ID, then handle the three behavior changes above.

2. Why is thinking suddenly on by default?

Claude Sonnet 5 uses adaptive thinking by default instead of running with thinking off, which was the default on Claude Sonnet 4.6. You can disable it if you don't want it.

3. Can I still set a specific thinking budget in tokens?

No. Manual extended thinking is removed. Use adaptive thinking with the effort parameter instead.

4. Why am I getting a 400 error when I set temperature?

Claude Sonnet 5 no longer accepts non-default values for temperature, top_p, or top_k. Remove them from your request.

5. Will my old token counts still be accurate?

No. The new tokenizer produces about 30% more tokens for the same text, so old counts from Claude Sonnet 4.6 are no longer valid.

6. Does the new tokenizer change how I call the API?

No. Requests, responses, and streaming events keep the same structure. Only your token-based measurements are affected.

7. Is Claude Sonnet 5 more expensive than Claude Sonnet 4.6?

Per-token pricing is the same. But since the same text now produces more tokens, your total cost per request can be higher.

8. Does Claude Sonnet 5 support Priority Tier?

No, Priority Tier isn't available for Claude Sonnet 5 at this time.

9. Can I use Claude Sonnet 5 on Amazon Bedrock?

Yes, through Claude in Amazon Bedrock and Claude Platform on AWS. It's not available on the legacy Bedrock InvokeModel or Converse APIs.

10. What's the biggest capability improvement in Claude Sonnet 5?

The largest gains over Claude Sonnet 4.6 are in coding and agentic tasks.



Made with ❤️ by Mun Bock Ho

Copyright ©️ 2026