GPT-5.4 API
The first general-purpose model with native computer use, 1.05M-token context, 128K max output, and significantly improved token efficiency.

Billing Rules
- Input/output billed per 1M tokens.
- Cached input: 90% discount.
- Prompts over 272K input tokens: 2x input and 1.5x output pricing applies to the full session.
- Regional processing: 10% uplift.
- Reasoning tokens count as output.
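The billing rules above can be sketched as a cost estimator. This is an illustrative calculation using the rates on this page, not an official SDK; how the >272K multiplier interacts with cached tokens is an assumption here (applied to uncached input only).

```python
STANDARD_INPUT = 2.50            # USD per 1M input tokens (official price)
STANDARD_OUTPUT = 15.00          # USD per 1M output tokens
CACHE_READ = 0.25                # cached input: 90% discount
LONG_PROMPT_THRESHOLD = 272_000  # above this, 2x input / 1.5x output

def estimate_cost(input_tokens, cached_tokens, output_tokens, regional=False):
    """Estimate USD cost for one session. Include reasoning tokens in
    output_tokens, since reasoning tokens bill as output."""
    input_rate, output_rate = STANDARD_INPUT, STANDARD_OUTPUT
    if input_tokens > LONG_PROMPT_THRESHOLD:
        input_rate *= 2      # >272K tier: 2x input ...
        output_rate *= 1.5   # ... and 1.5x output for the full session
    uncached = input_tokens - cached_tokens
    cost = (uncached * input_rate
            + cached_tokens * CACHE_READ
            + output_tokens * output_rate) / 1_000_000
    if regional:
        cost *= 1.10         # regional processing: 10% uplift
    return round(cost, 4)

# 100K input (40K of it cached), 20K output, no regional uplift:
print(estimate_cost(100_000, 40_000, 20_000))  # → 0.46
```

At the >272K tier the effective rates come out to $5.00/1M input and $22.50/1M output, matching the prompt-tier rows in the pricing table.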
PRICING
| Plan | Context Window | Max Output | Input | Output | Cache Read |
|---|---|---|---|---|---|
| GPT-5.4 | 1.05M | 128K | $2.00 (20% off official $2.50) | $12.00 (20% off official $15.00) | $0.20 (20% off official $0.25) |
| GPT-5.4 (Beta) | 1.05M | 128K | $0.65 (74% off official $2.50) | $3.90 (74% off official $15.00) | $0.065 (74% off official $0.25) |

Pricing Note: all prices are USD per 1M tokens. Cache Read pricing applies to cached prompt tokens.
Two ways to run GPT-5.4 — pick the tier that matches your workload.
- GPT-5.4: the default tier for production reliability and predictable availability.
- GPT-5.4 (Beta): a lower-cost tier with best-effort availability; suited to retry-tolerant workloads.
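For retry-tolerant workloads, a common pattern is to try the cheaper Beta tier with backoff and fall back to the standard tier. A minimal sketch, assuming a caller-supplied client function and a hypothetical `gpt-5.4-beta` model string (check your dashboard for the exact identifier):

```python
import time

def complete(call, prompt, retries=3, backoff=0.5):
    """Try the best-effort Beta tier with exponential backoff,
    then fall back to the standard production tier.

    `call(model, prompt)` is any client function that returns a
    completion string or raises on failure.
    """
    for attempt in range(retries):
        try:
            return call("gpt-5.4-beta", prompt)   # cheaper, best-effort
        except Exception:
            time.sleep(backoff * 2 ** attempt)    # 0.5s, 1s, 2s, ...
    return call("gpt-5.4", prompt)                # reliable fallback
```

Injecting the client function keeps the retry policy testable and independent of any particular SDK.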
| Model | Metric | Official | EvoLink | Credits |
|---|---|---|---|---|
| GPT-5.4 | Input (Standard) | $2.50 / 1M | TBA | TBA |
| GPT-5.4 | Input (Cached) | $0.25 / 1M | TBA | TBA |
| GPT-5.4 | Input (>272K Prompt Tier) | $5.00 / 1M | TBA | TBA |
| GPT-5.4 | Output (Standard) | $15.00 / 1M | TBA | TBA |
| GPT-5.4 | Output (>272K Prompt Tier) | $22.50 / 1M | TBA | TBA |
If your selected tier is unavailable, requests automatically fail over to the next cheapest available option, maintaining 99.9% uptime at the best possible price.
Capabilities
Native computer use: first general-purpose model that operates computers
GPT-5.4 is the first general-purpose model with native, state-of-the-art computer-use capabilities. It can click, type, and navigate software with screenshots plus keyboard/mouse commands without requiring a separate specialized model. On OSWorld-Verified, GPT-5.4 scores 75.0%, surpassing human performance at 72.4%.
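Computer use typically runs as a loop that alternates screenshots and model-chosen actions. The action schema ("click", "done") and function names below are hypothetical, illustrating the control flow rather than the actual request format:

```python
def run_computer_task(goal, take_screenshot, send_to_model, execute):
    """Alternate screenshots and actions until the model signals done.

    take_screenshot(): returns the current screen image.
    send_to_model(goal, screenshot, history): returns the next action dict.
    execute(action): performs a click/type action on the machine.
    """
    history = []
    while True:
        shot = take_screenshot()
        action = send_to_model(goal=goal, screenshot=shot, history=history)
        if action["type"] == "done":
            return action.get("result")
        execute(action)          # e.g. click at coordinates, type text
        history.append(action)
```

Consult the actual API reference for the real action vocabulary and safety controls before running anything like this against a live machine.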

1.05M-token context with better token efficiency
Process entire codebases, book-length documents, or months of conversation history in one request. GPT-5.4's 1.05M window is roughly 2.6x GPT-5.2's 400K context, and the model uses significantly fewer tokens for equivalent tasks, reducing usage costs and improving speed.

Deep reasoning with adjustable effort
Use five reasoning levels: none, low, medium, high, and xhigh. For the hardest tasks, increase effort to deepen reasoning quality. GDPval reaches 83.0% (matching or exceeding professionals across 44 occupations) versus GPT-5.2 at 70.9%.

Why Developers Choose GPT-5.4
Frontier capability, broader tools, and practical integration through EvoLink.
Full tool ecosystem with Tool Search
Web search, file search, image generation, code interpreter, hosted shell, computer use, MCP, and tool search are natively supported. Tool Search helps agents select and use the right tools across large connector ecosystems.
Better results with fewer tokens
GPT-5.4 is OpenAI's most token-efficient reasoning model. Compared with GPT-5.2, it generally uses fewer tokens for equivalent tasks, often improving speed and effective cost per job.
One key, zero setup
Access GPT-5.4 with one EvoLink API key. Migration from GPT-5.2 is drop-in for most integrations by changing one model string.
How to Integrate
Three steps from key creation to production monitoring.
Get your API key
Sign up on EvoLink, generate your API key, and use it immediately with GPT-5.4 and 47+ other models.
Send your request
POST with model set to "gpt-5.4", your messages array, and optional parameters.
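The step above can be sketched with the standard library alone. The endpoint URL below is a placeholder (substitute the actual base URL from your EvoLink dashboard), and the response shape will follow your provider's chat-completions format:

```python
import json
import urllib.request

API_URL = "https://api.evolink.example/v1/chat/completions"  # placeholder

def build_request(api_key, messages, model="gpt-5.4"):
    """Assemble the POST: model string, messages array, bearer auth."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def chat(api_key, messages):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(api_key, messages)) as resp:
        return json.load(resp)
```

Optional parameters such as a reasoning-effort setting go in the same payload dict; see the parameter reference for names and accepted values.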
Deploy and monitor
Track usage, costs, and reasoning-token consumption in the EvoLink dashboard and scale workflows when ready.
Key Features
Core strengths for production agents, coding systems, and enterprise workflows.
1.05M Context Window
Process entire repositories and book-length documents in one request.
128K Max Output
Generate complete reports and long implementations in one response.
Native Computer Use
Operate software via screenshots and keyboard/mouse commands (OSWorld 75.0%, human 72.4%).
Tool Search
Agents can automatically identify and use the right tools in larger ecosystems.
Token Efficiency
Uses fewer tokens than GPT-5.2 for equivalent problem-solving in many workloads.
Prompt Caching
Cached input pricing at $0.25 per 1M tokens, a 90% discount from standard input.
Benchmarks: GPT-5.4 vs GPT-5.2
Verified benchmark deltas highlight stronger professional performance, tool use, browsing quality, and computer-use reliability.
| Benchmark | GPT-5.4 | GPT-5.2 |
|---|---|---|
| GDPval | 83.0% | 70.9% |
| SWE-Bench Pro | 57.7% | 55.6% |
| OSWorld (Human: 72.4%) | 75.0% | 47.3% |
| Toolathlon | 54.6% | 46.3% |
| BrowseComp | 82.7% | 65.8% |
| MMMU-Pro | 81.2% | 79.5% |
| Factual errors per claim | 33% fewer | Baseline |
| Factual errors per response | 18% fewer | Baseline |
Data Summary
GPT-5.4
gpt-5.4-2026-03-05 | $2.50/$15/$0.25 | 1.05M/128K | reasoning none→xhigh | all tools
GPT-5.4 Thinking
ChatGPT only, not an API model
What Changed from V1
- Added GPT-5.4 Thinking clarification (ChatGPT only, not an API model).
- Moved native computer use to lead capability (OSWorld 75.0% > human 72.4%).
- Added token-efficiency positioning (fewer tokens, lower effective cost).
- Added Tool Search capability details.
- Added benchmark comparison section versus GPT-5.2.
- Updated SEO title and meta description for quick-start intent.
Frequently Asked Questions
Everything you need to know about the product and billing.
Related Resources
Internal links for release notes, pricing analysis, comparisons, and migration decisions.