Claude Sonnet 4.0 API

Claude Sonnet 4.0 API is a balanced, high-performance model designed for production teams that need strong reasoning, safe outputs, and predictable costs. Use the Claude Sonnet 4.0 API for support agents, document analysis, and developer workflows where quality and budget both matter.

Run With API
Using coding CLIs? Run Claude 4.0 Sonnet via EvoCode — One API for Code Agents & CLIs. (View Docs)
$

PRICING

PLANCONTEXT WINDOWMAX OUTPUTINPUTOUTPUTCACHE WRITECACHE READ
Claude Sonnet 4.0200.0K64.0K
200.0K$2.55-15%
$3.00Official Price
>200.0K$5.10-15%
$6.00Official Price
200.0K$12.75-15%
$15.00Official Price
>200.0K$19.13-15%
$22.50Official Price
200.0K$3.19-15%
$3.75Official Price
>200.0K$6.38-15%
$7.50Official Price
200.0K$0.256-15%
$0.300Official Price
>200.0K$0.511-15%
$0.600Official Price
Claude Sonnet 4.0 (Beta)200.0K64.0K
200.0K$0.780-74%
$3.00Official Price
>200.0K$1.56-74%
$6.00Official Price
200.0K$3.90-74%
$15.00Official Price
>200.0K$5.85-74%
$22.50Official Price
200.0K$0.975-74%
$3.75Official Price
>200.0K$1.95-74%
$7.50Official Price
200.0K$0.078-74%
$0.300Official Price
>200.0K$0.156-74%
$0.600Official Price

Web Search Tool

Server-side web search capability

$0.011/search

Pricing Note: Price unit: USD / 1M tokens

Cache Hit: Price applies to cached prompt tokens.

Two ways to run Claude Sonnet 4.0 — pick the tier that matches your workload.

  • · Claude Sonnet 4.0: the default tier for production reliability and predictable availability.
  • · Claude Sonnet 4.0 (Beta): a lower-cost tier with best-effort availability; retries recommended for retry-tolerant workloads.

Claude Sonnet 4.0 API — Balanced Intelligence for Production

Ship reliable AI experiences with the Claude Sonnet 4.0 API, combining practical latency with strong reasoning for real teams and real workloads.

Hero showcase of AI model feature 1

What can you build with the Claude Sonnet 4.0 API?

Customer support agents

Create support assistants that resolve tickets end-to-end with the Claude Sonnet 4.0 API. It maintains brand tone, understands long customer histories, and can call tools to fetch orders or update CRM records. Teams use the Claude Sonnet 4.0 API to reduce handle time, increase resolution quality, and keep replies consistent across languages and channels.

Support showcase of AI model feature 2

Document analysis and extraction

Turn contracts, reports, and logs into structured summaries with the Claude Sonnet 4.0 API. With long-context options, the Claude Sonnet 4.0 API can read large documents, answer precise questions, and output JSON that fits your schema. This is ideal for compliance reviews, knowledge bases, and analytics pipelines that need accuracy and traceable summaries.

Documents showcase of AI model feature 3

Developer copilots and code review

Ship coding copilots that review diffs, propose fixes, and explain design choices. The Claude Sonnet 4.0 API brings Claude 4 reasoning to everyday engineering tasks, with a pricing tier that fits teams scaling PR summaries, refactors, and architecture guidance. Use the Claude Sonnet 4.0 API to keep reviews fast, helpful, and consistent across large codebases.

Coding showcase of AI model feature 4

Why teams choose the Claude Sonnet 4.0 API

Claude Sonnet 4.0 API balances capability, cost, and reliability for production AI.

Balanced performance

Strong reasoning with practical latency for daily workflows.

Clear cost planning

Transparent base pricing with caching and batch options.

Production readiness

Tool use, structured outputs, and long-context options.

How to integrate the Claude Sonnet 4.0 API

From API key to production workflows in minutes with the Claude Sonnet 4.0 API.

1

Step 1 — Authenticate

Create an API key, set the Sonnet 4 model alias, and send a first prompt from your app or backend.

2

Step 2 — Add tools

Define tools and JSON Schema inputs so the model returns structured, actionable results for your workflow.

3

Step 3 — Optimize

Use caching or batch processing, then monitor usage, latency, and quality as you scale the Claude Sonnet 4.0 API.

Claude Sonnet 4.0 API capabilities

Practical features that match real product needs

Cost

Transparent base pricing

Claude Sonnet 4 is priced at $3 per million input tokens and $15 per million output tokens. This clear baseline helps teams forecast costs and pick the right model for production workloads.

Caching

Prompt caching rates

Prompt caching uses separate rates: 5-minute cache writes are 1.25x base input, 1-hour cache writes are 2x, and cache reads are 0.1x. This makes repeated context far cheaper over time.

Context

1M context beta pricing

The 1M context window is in beta for usage tier 4 or custom rate limits and is only available for Claude Sonnet 4 and 4.5. Requests over 200K input tokens use premium rates: $6 input and $22.50 output per MTok.

Efficiency

Batch processing savings

Batch processing provides a 50% discount on both input and output tokens for asynchronous jobs, which can lower costs for large-scale ingestion and nightly automation.

Tools

Tool use with JSON Schema

Tool definitions include an input_schema that uses JSON Schema to define parameters. This keeps tool calls predictable and improves reliability for agents that must execute actions or return structured data.

Platforms

Multimodal and multilingual

All current Claude models support text and image input, text output, multilingual capabilities, and vision. Claude models are available via the Anthropic API and on AWS Bedrock, Google Vertex AI, and Microsoft Foundry.

Frequently Asked Questions

Everything you need to know about the product and billing.

Claude Sonnet 4.0 API is positioned as a high-performance, balanced model for production teams that need strong reasoning without premium cost. It is a practical default for customer support agents, document analysis, and developer copilots that must remain accurate and reliable at scale. The Claude Sonnet 4.0 API also fits teams that plan to add tool use, structured outputs, and long-context workflows over time, while keeping latency and spend predictable for day-to-day operations.
Claude Sonnet 4 is priced at $3 per million input tokens and $15 per million output tokens. Prompt caching uses separate rates for cache writes and cache reads, and batch processing applies a 50% discount on input and output for asynchronous jobs. If you enable the 1M context beta and your request exceeds 200K input tokens, premium long-context rates apply. Always confirm current rates on the official pricing page before final budgeting.
Claude Sonnet 4 supports a 1M token context window in beta for organizations in usage tier 4 or with custom rate limits, and that 1M option is only available for Claude Sonnet 4 and 4.5. Requests above 200K input tokens are billed at premium long-context rates, while smaller prompts use standard pricing. This makes the Claude Sonnet 4.0 API a strong fit for large documents, long conversations, and multi-file reviews that would otherwise require chunking.
Yes. The Claude Sonnet 4.0 API supports tool use, and each tool definition includes an input_schema that follows JSON Schema to define parameters. This makes tool calls predictable, easier to validate, and safer to automate. Tool definitions and tool calls count toward token usage, so include them in cost estimates. For agents that must fetch data or trigger actions, schema-based tool inputs reduce parsing errors and improve reliability.
Prompt caching reduces cost for repeated context by separating cache write and cache read pricing. On the pricing page, 5-minute cache writes are 1.25x base input, 1-hour cache writes are 2x, and cache reads are 0.1x. This is useful when you reuse long system prompts, policies, or static documents across many requests. For high-volume workflows, caching can cut total spend while keeping response quality consistent.
Yes. Anthropic states that all current Claude models support text and image input, text output, multilingual capabilities, and vision. That means the Claude Sonnet 4.0 API can interpret screenshots, charts, or scanned documents and respond in multiple languages. If your workflow includes visual data, this keeps analysis and reporting in a single model rather than switching providers, which is helpful for global support and analytics teams.
Claude models are available via the Anthropic API and on third-party platforms including AWS Bedrock, Google Vertex AI, and Microsoft Foundry. This gives teams options for procurement, data residency, and infrastructure alignment. If you deploy across multiple platforms, standardize prompts and evaluation checks so the Claude Sonnet 4.0 API behaves consistently across regions and environments.