
Claude Fable 5 API Developer Guide: Setup, Routing, Cost, and Evaluation

- Can you call claude-fable-5?
- What is the smallest safe request?
- Which facts are confirmed by Anthropic documentation?
- Which assumptions still need to be verified in your EvoLink account, route, and logs?
- When should Fable 5 replace or escalate beyond Opus 4.8?
- How should you control context, cache behavior, output length, safeguards, and fallback cost?
- What should be measured before routing user traffic to Fable 5?
Fast Verdict
Use Fable 5 when a wrong answer is expensive: repo-scale architecture, difficult refactors, long-horizon agents, critical long-context analysis, final pre-release decision synthesis, and other tasks where a stronger answer can justify a premium route. Most high-value Claude workloads should still start with Opus 4.8 as a strong premium default. Simpler or high-volume requests should remain on Sonnet or Haiku when they meet the quality bar.
| Decision point | Recommendation |
|---|---|
| First model to test on frontier-difficulty Claude tasks | Claude Fable 5 |
| Default premium route for many hard Claude tasks | Claude Opus 4.8 |
| High-volume simple tasks | Sonnet or Haiku |
| Fable 5 model ID | claude-fable-5 |
| Anthropic list price | $10 / MTok input, $50 / MTok output |
| Context window | 1M tokens |
| Maximum output | 128K tokens |
| Main rollout risk | Cost drift, plus safeguard behavior in sensitive workflows |
Table of Contents
- What is Claude Fable 5?
- Confirmed facts and assumptions to verify
- Quickstart: call Claude Fable 5 through EvoLink
- Request structure
- Code examples: curl, Node.js, and Python
- Pricing and production cost model
- Long context, output, and caching strategy
- When to use Fable 5 instead of Opus 4.8
- Safeguards, refusals, and fallback planning
- Pre-production evaluation framework
- Phased rollout plan on EvoLink
- Monitoring and logging checklist
- Common mistakes
- Sources and FAQ
What Is Claude Fable 5?
Claude Fable 5 is an Anthropic model with the API model ID claude-fable-5. Anthropic positions it above the Opus tier for the most demanding reasoning, long-horizon agentic work, and complex coding tasks.
For EvoLink users, the more important product framing is:
Fable 5 is a premium route inside a multi-model system, not a reason to upgrade every Claude call to the most expensive model.
That distinction matters because developers are asking more than "what was released?" They want to know:
- what model ID to use;
- whether API access is available;
- how much it costs;
- whether safeguards can affect legitimate technical work;
- whether it should replace Opus 4.8;
- how to test it without uncontrolled spend.
This guide is organized around those production questions.
Confirmed Facts and Assumptions to Verify
Separate two kinds of information before you ship: facts documented by Anthropic, and production behavior that still needs verification in your EvoLink account, route, and logs.
| Area | Status | Documented fact | What EvoLink users should verify |
|---|---|---|---|
| Model name | Documented by Anthropic | Claude Fable 5 | Whether the route is enabled for your EvoLink account |
| Model ID | Documented by Anthropic | claude-fable-5 | Whether the model ID succeeds on your endpoint |
| Availability | Documented by Anthropic | Generally available on listed Claude channels beginning June 9, 2026 | Account, region, billing, and route-level availability |
| Context window | Documented by Anthropic | 1M token context window | Real request size, timeout behavior, and cost at your workload shape |
| Maximum output | Documented by Anthropic | 128K output tokens | Current EvoLink route limits and response behavior |
| List price | Documented by Anthropic | $10 / MTok input, $50 / MTok output | Current EvoLink credits, discounts, and SKU billing |
| Prompt caching | Pricing documented by Anthropic | Cache write and cache hit prices are listed separately | Whether the current EvoLink route supports the exact caching behavior you plan to use |
| Adaptive thinking | Documented by Anthropic | Adaptive thinking is described for Fable 5 | Which advanced controls EvoLink exposes for your route |
| Safeguards | Documented by Anthropic | Higher-risk requests may receive additional handling | Whether your sensitive workflows behave as expected |
| Mythos 5 | Documented by Anthropic | Limited availability through Project Glasswing and approved channels | Do not assume self-serve EvoLink availability |
What to Verify Before Rollout
The model facts are documented, but production integration depends on your EvoLink account, route configuration, and observability. Do not infer these from the model announcement. Verify them before a rollout:
| Capability to confirm | Why it matters |
|---|---|
| Anthropic route availability in your account | Prevents a launch plan based on a route you cannot call |
| Exact model ID behavior | Confirms claude-fable-5 resolves in your environment |
| Token limits and request body size | A 1M context window does not remove gateway, product, or timeout constraints |
| Streaming behavior | Long responses need predictable UX and timeout handling |
| Tool/function calling support | Agent workloads may depend on tools, but support should be tested |
| Prompt caching behavior | Cost models change if caching is unavailable or configured differently |
| Safety and refusal behavior | Sensitive workflows need expected fallback and user messaging |
| Billing logs | Teams need real cost data, not only list-price estimates |
| Fallback routing | You need to know where traffic goes when Fable is unavailable or rejected |
Quickstart: Call Claude Fable 5 Through EvoLink
Set model to claude-fable-5, send the request to https://direct.evolink.ai/v1/messages, and include max_tokens.
curl https://direct.evolink.ai/v1/messages \
  -H "Authorization: Bearer $EVOLINK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Review this migration plan and identify the highest-risk assumptions."
      }
    ]
  }'
Keep the first request simple. Do not combine first-call testing with a giant context window, tool use, streaming, caching, and a long output target at the same time. First confirm that the route works. Then add production features one at a time.
Request Structure
The first request should answer six questions:
| Field | What to check |
|---|---|
| model | Is the route using claude-fable-5? |
| max_tokens | Is output bounded before real users hit the route? |
| messages | Does the Claude Messages API request contain a meaningful user task? |
| system | If needed, is the system instruction set as a top-level field rather than a chat message? |
| metadata or logging fields | Can you later identify this request in billing and observability? |
| fallback policy | What happens if the request fails, is refused, or times out? |
If your app uses a unified model abstraction, keep Fable 5 behind a policy layer rather than hardcoding it everywhere. That makes it easier to compare Fable, Opus, Sonnet, and Haiku without rewriting product code.
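A policy layer of this kind could be sketched as follows. The `TaskProfile` fields, the difficulty labels, and the `choose_model` function are hypothetical names chosen for illustration. Only `claude-fable-5` is a model ID taken from this guide; the other route identifiers are placeholders to replace with the IDs your EvoLink account exposes.

```python
from dataclasses import dataclass

# Only "claude-fable-5" comes from this guide. The other IDs are placeholders
# (assumptions); replace them with the model IDs your account actually exposes.
ROUTES = {
    "frontier": "claude-fable-5",
    "premium": "opus-route-placeholder",
    "standard": "sonnet-route-placeholder",
    "light": "haiku-route-placeholder",
}


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of a request, filled in by your own classifier."""
    difficulty: str            # "simple", "normal", "hard", or "frontier"
    high_stakes: bool          # True when a wrong answer is expensive
    high_volume: bool = False  # True for bulk, latency- or cost-sensitive traffic


def choose_model(task: TaskProfile) -> str:
    """Return a model ID for the task so product code never names a model directly."""
    if task.difficulty == "frontier" and task.high_stakes:
        return ROUTES["frontier"]
    if task.difficulty in ("hard", "frontier"):
        return ROUTES["premium"]
    if task.high_volume or task.difficulty == "simple":
        return ROUTES["light"]
    return ROUTES["standard"]
```

Because every call site asks `choose_model` for an ID, changing the escalation rule or swapping a route is a one-file change, and the same task set can be replayed against different routes for comparison.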
Code Examples: curl, Node.js, and Python
curl
curl https://direct.evolink.ai/v1/messages \
  -H "Authorization: Bearer $EVOLINK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Summarize the risks in this deployment checklist."
      }
    ]
  }'
Node.js
// Top-level await requires an ES module (or wrap the call in an async function).
const response = await fetch('https://direct.evolink.ai/v1/messages', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.EVOLINK_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-fable-5',
    max_tokens: 1024, // bound the output on the first call
    messages: [
      {
        role: 'user',
        content: 'Find the weakest assumptions in this architecture proposal.',
      },
    ],
  }),
})
// Fail loudly on non-2xx responses instead of parsing an error body as a result.
if (!response.ok) {
  throw new Error(`EvoLink request failed: ${response.status}`)
}
const data = await response.json()
console.log(data)
Python
import os

import requests

response = requests.post(
    "https://direct.evolink.ai/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ['EVOLINK_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-fable-5",
        "max_tokens": 1024,  # bound the output on the first call
        "messages": [
            {
                "role": "user",
                "content": "Review this incident report and identify prevention gaps.",
            }
        ],
    },
    timeout=120,  # seconds; long responses may need a larger value
)
# Raise an exception on non-2xx responses.
response.raise_for_status()
print(response.json())
For advanced Claude Messages API features such as streaming, tools, multimodal input, caching, or thinking controls, verify the current EvoLink docs and your account configuration before depending on them in production.
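Once a call succeeds, it helps to read the reply text and token counts out of the parsed JSON rather than printing the whole body. The sketch below assumes the EvoLink route returns the Anthropic Messages API response shape (a `content` list of blocks, a `stop_reason`, and a `usage` object); confirm that against a real response from your route. `extract_text_and_usage` is a hypothetical helper name.

```python
def extract_text_and_usage(data: dict) -> dict:
    """Pull the reply text, stop reason, and token counts out of a parsed response.

    Assumes the Anthropic Messages API response shape: a "content" list of
    blocks, a "stop_reason" string, and a "usage" object with token counts.
    """
    # Concatenate only text blocks; other block types are ignored here.
    text = "".join(
        block.get("text", "")
        for block in data.get("content", [])
        if block.get("type") == "text"
    )
    usage = data.get("usage", {})
    return {
        "text": text,
        "stop_reason": data.get("stop_reason"),
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }
```

In the Anthropic response format, a `stop_reason` of `max_tokens` means the reply was cut off at the output cap, which is worth logging before raising `max_tokens`.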
Pricing and Production Cost Model
The list price is only the starting point. For production, the better question is:
What is the cost of one accepted, useful answer after retries, long context, cache behavior, fallback, and human review?
Anthropic's list price for Claude Fable 5 is $10 / MTok input and $50 / MTok output. Prompt caching has separate cache write and cache hit prices. EvoLink users should compare those public list prices with current EvoLink pricing, credits, discounts, and billing logs.
| Cost driver | Why it grows | Control lever |
|---|---|---|
| Long input context | Large repos, logs, document packs, and repeated instructions | Retrieval, file selection, chunking, summarization, prompt caching |
| Long output | Plans, code, tables, and audit reports can become expensive | Explicit answer shape, section limits, max output, staged generation |
| Retries | Errors, refusals, timeouts, and poor prompts multiply cost | Better prompts, smaller test sets, fallback policy, monitoring |
| Premium-by-default routing | Every simple task pays Fable-level cost | Route only high-value hard tasks to Fable |
| Hidden fallback | Fallback routes may hide quality or cost changes | Log selected model and fallback reason |
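The "cost of one accepted answer" question can be made concrete with a small calculation. This is a minimal sketch using only the Anthropic list prices cited in this guide. It does not model cache write or cache hit prices, EvoLink credits, or discounts, and the function names are hypothetical.

```python
# Anthropic list prices cited in this guide, in USD per million tokens.
# Cache write and cache hit prices are not modeled here.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """List-price cost in USD of a single request."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def cost_per_accepted_answer(attempts, accepted: int) -> float:
    """Total cost of all attempts (retries included) divided by accepted answers.

    attempts: list of (input_tokens, output_tokens) tuples, one per request sent.
    """
    if accepted <= 0:
        raise ValueError("no accepted answers; cost per accepted answer is undefined")
    total = sum(request_cost(i, o) for i, o in attempts)
    return total / accepted
```

For example, a request with 200,000 input tokens and 4,000 output tokens costs $2.00 + $0.20 = $2.20 at list price. If three such requests are needed to produce two accepted answers, each accepted answer costs $3.30, which is the number to compare across routes.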
Long Context, Output, and Caching Strategy
A 1M token context window is powerful, but it is not free capacity. Treat it as a capability for hard cases, not a default input size.
Use this policy before sending large context:
| Step | Question |
|---|---|
| Select | Does the model need the whole corpus, or only relevant files and excerpts? |
| Compress | Can repeated boilerplate, logs, or history be summarized safely? |
| Cache | Is the repeated instruction or corpus stable enough to benefit from caching? |
| Bound output | Do you need a full report, or a decision plus evidence? |
| Measure | Did Fable reduce retries or improve accepted output enough to justify cost? |
For coding agents, the best pattern is usually not "send the whole repository every time." A better pattern is:
- classify the task;
- retrieve relevant files;
- summarize stable project context;
- send Fable 5 only when the task crosses a difficulty threshold;
- log whether the answer was accepted.
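The "retrieve relevant files" step in the pattern above can be sketched as a budgeted selection. The relevance scores are assumed to come from your own retriever, the four-characters-per-token estimate is a crude assumption rather than a real tokenizer, and the function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Crude size estimate (assumption: roughly 4 characters per token)."""
    return max(1, len(text) // 4)


def select_context(scored_files, budget_tokens: int):
    """Pick the most relevant files that fit within the token budget.

    scored_files: list of (relevance_score, path, text) tuples from your retriever.
    Returns (selected_paths, estimated_tokens_used).
    """
    selected, used = [], 0
    # Highest relevance first; files that do not fit are skipped, not truncated.
    for _score, path, text in sorted(scored_files, key=lambda f: f[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # a smaller, less relevant file may still fit
        selected.append(path)
        used += cost
    return selected, used
```

Logging the estimated token count next to the actual input tokens reported for the request shows how far the heuristic drifts for your codebase.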
When to Use Fable 5 Instead of Opus 4.8
The core comparison is not "which model is stronger in the abstract?" The production question is "which route should handle this request?"
| Workload | Start with Opus 4.8 | Escalate to Fable 5 |
|---|---|---|
| Code review | Normal PR review, localized bug checks, routine refactors | Repo-scale architecture, migration risk, multi-service reasoning |
| Coding agents | Short tool loops and common implementation tasks | Long-horizon planning, difficult recovery, high-autonomy workflows |
| Long documents | Standard summaries and extraction | Cross-document conflict analysis and high-stakes synthesis |
| Security-adjacent work | Benign policy or defensive summaries with known behavior | Sensitive prompts only after safeguard testing and fallback design |
| Product decisions | Routine analysis | High-cost decisions where a weak answer creates real downstream work |
Safeguards, Refusals, and Fallback Planning
Any discussion of Fable 5 has to cover safeguards as well as capability. Anthropic documentation and launch coverage describe additional handling for higher-risk areas such as cybersecurity, biology, chemistry, and model-distillation-related requests.
For EvoLink users, the right response is not to ignore safeguards or overstate them. Build a small, realistic test set and log outcomes.
| Test area | What to verify | What to log |
|---|---|---|
| Defensive security prompts | Legitimate analysis completes as expected | refusal reason, fallback model, accepted output |
| Research or scientific prompts | Benign workflows are not unexpectedly blocked | prompt category, user-facing message, reviewer result |
| Coding prompts | Normal repo tasks are stable | selected model, tool calls, latency, retry count |
| Risky or policy-sensitive prompts | The app handles refusal safely | route decision, user message, fallback behavior |
| Long-context prompts | Large inputs do not cause runaway spend | input tokens, output tokens, cache usage, timeout |
Fallback should be visible. A fallback that silently changes model behavior can create debugging problems and misleading evaluation results.
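One way to keep fallback visible is to return the routing outcome alongside the output. In this sketch, `send` is your own client function and `RouteError` is a hypothetical exception your client would raise on a failure, refusal, or timeout; neither is an EvoLink or Anthropic API.

```python
import logging

logger = logging.getLogger("model_routing")


class RouteError(Exception):
    """Hypothetical error your client raises for a failure, refusal, or timeout."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def call_with_fallback(send, prompt: str, primary: str, fallback: str) -> dict:
    """Try the primary model, then the fallback, and record what happened.

    send(model, prompt) is your own client function (an assumption here); it
    returns the response text or raises RouteError.
    """
    record = {
        "requested_model": primary,
        "selected_model": primary,
        "fallback_reason": None,
    }
    try:
        record["output"] = send(primary, prompt)
    except RouteError as err:
        # Record and log the switch so evaluation results are not misattributed.
        record["selected_model"] = fallback
        record["fallback_reason"] = err.reason
        logger.warning("fallback from %s to %s: %s", primary, fallback, err.reason)
        record["output"] = send(fallback, prompt)
    return record
```

An error from the fallback call is left to propagate, so a double failure surfaces to the caller instead of being swallowed.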
Pre-Production Evaluation Framework
Before routing Fable 5 to production, create a small evaluation harness from real tasks.

| Evaluation dimension | What to test | Pass condition |
|---|---|---|
| Quality | Human acceptance, correctness, completeness | Beats Opus 4.8 on target hard tasks |
| Cost | tokens, retries, cache hit rate, output length | Higher token cost is offset by better outcomes |
| Latency | time to first useful answer and total completion time | Acceptable for the user workflow |
| Safety behavior | refusal, fallback, sensitive-category handling | Predictable and observable |
| Reliability | error rate and retry rate | Stable enough for limited production |
| Routing | whether escalation rules select the right prompts | Only valuable tasks reach Fable |
Suggested Evaluation Set
Start with 20 to 50 tasks:
- 10 difficult code or repo tasks;
- 10 long-context analysis tasks;
- 5 sensitive but legitimate prompts if your product needs them;
- 5 high-value decision prompts;
- 5 known Opus 4.8 failure cases;
- 5 ordinary tasks that should remain on lower-cost routes.
The last group matters. A good routing policy knows when to use Fable 5 and when not to use it.
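After running the evaluation set on each route, the results can be reduced to two numbers per model: acceptance rate and cost per accepted answer. The record fields below (`model`, `accepted`, `cost_usd`) are hypothetical; use whatever your harness records.

```python
from collections import defaultdict


def summarize(results):
    """Aggregate evaluation results per model.

    results: list of dicts with "model" (str), "accepted" (bool), and
    "cost_usd" (float) keys, one dict per evaluated task run.
    """
    totals = defaultdict(lambda: {"runs": 0, "accepted": 0, "cost_usd": 0.0})
    for r in results:
        t = totals[r["model"]]
        t["runs"] += 1
        t["accepted"] += 1 if r["accepted"] else 0
        t["cost_usd"] += r["cost_usd"]

    summary = {}
    for model, t in totals.items():
        summary[model] = {
            "acceptance_rate": t["accepted"] / t["runs"],
            # None when nothing was accepted, to avoid dividing by zero.
            "cost_per_accepted": (
                t["cost_usd"] / t["accepted"] if t["accepted"] else None
            ),
        }
    return summary
```

Summarizing the "ordinary tasks" group separately is useful: if a lower-cost route already reaches the same acceptance rate there, those tasks should not be escalated.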
Phased Rollout Plan on EvoLink
Do not migrate all traffic at once. Use a staged rollout.
| Stage | Traffic | Goal |
|---|---|---|
| Lab test | Internal prompts only | Confirm route access and baseline quality |
| Replay test | Historical hard prompts | Compare against Opus 4.8 |
| Shadow test | Same user request, Fable result not shown | Measure quality and cost safely |
| Limited production | Internal users or trusted customers | Validate real behavior |
| Policy rollout | Only requests matching escalation rules | Control cost |
| Review cycle | Weekly review during the first month | Tune prompts, routing, and guardrails |
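The stages in the table can be encoded as a single gate that decides, for a live request, whether to call Fable 5 and whether to show its result. This is a sketch of one reading of the table; the `Stage` names and the `fable_decision` function are hypothetical, and the review cycle is a process step rather than a traffic gate, so it is not modeled.

```python
from enum import Enum


class Stage(Enum):
    LAB = "lab"
    REPLAY = "replay"
    SHADOW = "shadow"
    LIMITED = "limited"
    POLICY = "policy"


def fable_decision(stage: Stage, is_internal_or_trusted: bool,
                   matches_escalation_rule: bool):
    """Return (call_fable, show_fable_result) for a live user request."""
    if stage in (Stage.LAB, Stage.REPLAY):
        # Offline stages: Fable runs on test or historical prompts only.
        return (False, False)
    if stage is Stage.SHADOW:
        # Call Fable on the same request, but never show its result.
        return (True, False)
    if stage is Stage.LIMITED:
        # Only internal users or trusted customers see Fable output.
        return (is_internal_or_trusted, is_internal_or_trusted)
    # Policy rollout: only requests matching escalation rules reach Fable.
    return (matches_escalation_rule, matches_escalation_rule)
```

In practice a shadow stage would usually sample a fraction of requests to cap spend; that sampling is left out here to keep the gate readable.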
Monitoring and Logging Checklist
If you cannot observe Fable 5 behavior, you should not route production traffic to it.
Log these fields:
| Field | Why it matters |
|---|---|
| model | Confirms which model was selected |
| route family | Compares Fable, Opus, Sonnet, and Haiku |
| prompt category | Identifies sensitive or high-cost workloads |
| input tokens | Tracks context growth |
| output tokens | Tracks the most expensive side of the request |
| cache usage | Shows whether repeated context is optimized |
| latency | Measures user impact |
| retry count | Reveals hidden cost |
| fallback model | Shows route changes |
| refusal or error reason | Supports debugging and product messaging |
| accepted output | Connects model cost to business value |
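The fields in the table map naturally onto one structured log record per request. The sketch below is an assumption about shape, not an EvoLink schema: the `RequestLog` name, the field types, and the choice of milliseconds for latency are all illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RequestLog:
    """One record per model request, mirroring the checklist fields."""
    model: str
    route_family: str                       # e.g. "fable", "opus", "sonnet", "haiku"
    prompt_category: str
    input_tokens: int
    output_tokens: int
    cache_usage: Optional[int]              # what you can record depends on your route
    latency_ms: int
    retry_count: int
    fallback_model: Optional[str]           # None when no fallback happened
    refusal_or_error_reason: Optional[str]  # None on a normal completion
    accepted_output: Optional[bool]         # None until a user or reviewer decides

    def to_json(self) -> str:
        """Serialize to one JSON line with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)
```

Writing one JSON line per request keeps the records easy to join against billing exports later.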
Common Mistakes
| Mistake | Better approach |
|---|---|
| Routing all Claude traffic to Fable 5 | Escalate only hard, high-value requests |
| Testing only one clever prompt | Replay real production traces |
| Ignoring output length | Bound answer shape and budget |
| Treating 1M context as free space | Retrieve, compress, cache, and measure |
| Assuming all advanced parameters are available | Verify EvoLink route support first |
| Hiding fallback behavior | Log it and make it debuggable |
| Estimating cost from vendor list price only | Measure completed-task cost on EvoLink |
| Mixing Fable 5 and Mythos 5 messaging | Fable is the generally available route; Mythos is limited availability |
Sources
- EvoLink Claude Messages API documentation
- Anthropic models overview
- Anthropic pricing
- Anthropic Fable 5 and Mythos 5 launch docs
- Business Insider coverage of Claude Fable 5
- The Verge coverage of Claude Fable 5 and Mythos 5
- Wired coverage of the launch
FAQ
What is the Claude Fable 5 API model ID?
claude-fable-5.
Is Claude Fable 5 generally available?
Anthropic documentation lists Claude Fable 5 as generally available beginning June 9, 2026 on listed Claude channels. EvoLink users should still verify account-level route access before moving production traffic.
How much does Claude Fable 5 cost?
$10 / MTok input and $50 / MTok output, with separate prompt-caching prices. EvoLink users should confirm current route pricing and billing behavior in EvoLink before final cost planning.
Does Claude Fable 5 support 1M context?
Yes. Anthropic documents a 1M token context window for Claude Fable 5.
What is the maximum output for Claude Fable 5?
Anthropic documents a maximum output of 128K tokens.
Should Claude Fable 5 replace Claude Opus 4.8?
Not by default. Treat Fable 5 as a premium escalation route for the hardest tasks. Keep Opus 4.8 as a strong premium default until your own evaluation proves that Fable should carry more traffic.
Is Claude Fable 5 good for coding agents?
Yes, but route it intentionally. It is best suited for repo-scale planning, difficult refactors, long tool loops, and high-risk decisions. Simpler coding tasks should usually remain on Opus, Sonnet, or lower-cost routes when they meet quality requirements.
What safeguards should developers test?
Test defensive security, research, scientific, compliance, and other sensitive workflows that your product actually handles. Log refusals, fallback route, prompt category, and accepted output so the behavior is observable.
Can Claude Mythos 5 be called through EvoLink?
Do not assume self-serve EvoLink access to Claude Mythos 5. Anthropic describes Mythos 5 as limited availability through Project Glasswing and approved customer channels. This guide focuses on Claude Fable 5.
What should be monitored after launch?
Monitor model, token usage, cache usage, latency, retries, fallback route, refusal reason, error rate, and whether the output was accepted by the user or reviewer.


