Claude Sonnet 4.6 API
Claude Sonnet 4.6 is Anthropic's best balance of speed, intelligence, and cost — a versatile model for coding, agentic workflows, and everyday tasks with a 200K context window and 128K max output. Access it through EvoLink's unified API.
PRICING
| PLAN | CONTEXT WINDOW | MAX OUTPUT | INPUT | OUTPUT | CACHE WRITE | CACHE READ |
|---|---|---|---|---|---|---|
| Claude Sonnet 4.6 | 200.0K | 128.0K | ≤200.0K$2.55-15% $3.00Official Price >200.0K$5.10-15% $6.00Official Price | ≤200.0K$12.75-15% $15.00Official Price >200.0K$19.125-15% $22.50Official Price | ≤200.0K$3.188-15% $3.75Official Price >200.0K$6.375-15% $7.50Official Price | ≤200.0K$0.256-15% $0.300Official Price >200.0K$0.511-15% $0.600Official Price |
| Claude Sonnet 4.6 (Beta) | 200.0K | 128.0K | ≤200.0K$0.780-74% $3.00Official Price >200.0K$1.56-74% $6.00Official Price | ≤200.0K$3.90-74% $15.00Official Price >200.0K$5.85-74% $22.50Official Price | ≤200.0K$0.975-74% $3.75Official Price >200.0K$1.95-74% $7.50Official Price | ≤200.0K$0.078-74% $0.300Official Price >200.0K$0.156-74% $0.600Official Price |
Server-side web search capability
Pricing Note: Price unit: USD / 1M tokens
Cache Hit: Price applies to cached prompt tokens.
Two ways to run Claude Sonnet 4.6 — pick the tier that matches your workload.
- · Claude Sonnet 4.6: the default tier for production reliability and predictable availability.
- · Claude Sonnet 4.6 (Beta): a lower-cost tier with best-effort availability; retries recommended for retry-tolerant workloads.
Claude Sonnet 4.6 API — Anthropic's best-balanced model
Claude Sonnet 4.6 (Claude 4.6 Sonnet) delivers the ideal balance of intelligence, speed, and cost with a 200K context window and up to 128K output tokens for coding, agents, and complex workflows.

What can you build with the Claude Sonnet 4.6 API?
Versatile Coding Assistant
Use Sonnet 4.6 for day-to-day coding tasks — architecture, refactors, code review, and bug fixing. With up to 128K output tokens and a 200K context window, handle large codebases and generate comprehensive diffs, test suites, and implementation plans in a single request.

Reliable Agent Workflows
Build agents that plan, call tools, and maintain context across multi-step tasks. Sonnet 4.6 balances intelligence and speed for agent-heavy workflows, delivering reliable tool use and consistent outputs at a fraction of flagship pricing.

Extended Thinking & Analysis
Enable extended thinking for complex reasoning tasks. Sonnet 4.6 supports deeper analysis when needed while keeping costs predictable — ideal for research, planning, and technical strategy where you need more than a quick answer.

Why teams choose the Claude Sonnet 4.6 API on EvoLink
Get Anthropic's best-balanced model with stable model IDs, prompt caching, and unified routing through EvoLink's single API key.
Best balance of speed, intelligence, and cost
Sonnet 4.6 is purpose-built for teams that need strong performance across coding, analysis, and agentic tasks without flagship pricing.
128K max output for large-scale generation
Generate comprehensive code, documentation, and analysis in a single request — double the output capacity of previous models.
Cost control with prompt caching
Prompt caching supports 5-minute and 1-hour caches, and cache hits are billed at 0.1x the base input rate to reduce repeat costs.
How to integrate the Claude Sonnet 4.6 API
Connect through EvoLink, choose your model ID, and start building in minutes.
Step 1 — Create your EvoLink API key
Sign up for EvoLink to get a single API key that routes to Anthropic, Bedrock, or Vertex AI.
Step 2 — Select the model ID
Use `claude-sonnet-4-6` to access the latest Sonnet 4.6 model through EvoLink's unified API.
Step 3 — Optimize quality and cost
Claude Sonnet 4.6 supports extended thinking for complex tasks and prompt caching to lower repeat costs — at $3/$15 per million tokens.
Claude Sonnet 4.6 API capabilities
Key specs and model features for production use
200K Context Window
Read large documents or codebases in a single request without chunking.
128K Max Output
Generate long-form answers, plans, and code without early truncation — double previous limits.
Extended Thinking
Enable deeper reasoning when tasks become complex, with predictable cost scaling.
Vision + Multilingual Input
Accept text and image inputs with strong multilingual understanding.
Prompt Caching Rates
Cache writes and reads are priced separately; cache hits are billed at 0.1x the base input price.
Stable IDs & Aliases
Aliases auto-upgrade to the newest snapshot, while versioned IDs keep results consistent.
Claude Sonnet 4.6 API - FAQ
Everything you need to know about the product and billing.