Gemini 2.5 Pro API
Gemini 2.5 Pro gives teams a long-context reasoning model for deep analysis, code review, and complex planning. With Gemini 2.5 Pro on EvoLink, you can route requests with a single API key, track usage by project, and keep compliance-friendly controls for production apps.
PRICING
| PLAN | CONTEXT WINDOW | MAX OUTPUT | INPUT | OUTPUT |
|---|---|---|---|---|
| Gemini 2.5 Pro | 1.05M | 65.5K | ≤200K: $1.00 (-20%, official $1.25); >200K: $2.00 (-20%, official $2.50) | ≤200K: $8.00 (-20%, official $10.00); >200K: $12.00 (-20%, official $15.00) |
| Gemini 2.5 Pro (Beta) | 1.05M | 65.5K | ≤200K: $0.325 (-74%, official $1.25); >200K: $0.650 (-74%, official $2.50) | ≤200K: $2.60 (-74%, official $10.00); >200K: $3.90 (-74%, official $15.00) |
Pricing note: prices are in USD per 1M tokens. The ≤200K / >200K tiers refer to the request's prompt token count.
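As a sketch of how the tiered rates combine, here is a small estimator based on the table above. It assumes, as in Google's official tiering, that the higher rate applies to the whole request once the prompt exceeds 200K tokens:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int, beta: bool = False) -> float:
    """Estimate the cost of one request from the rate table above.

    Assumes the >200K rates apply to the entire request once the prompt
    exceeds 200,000 tokens.
    """
    # (input $/1M, output $/1M) for the ≤200K and >200K tiers.
    rates = ((0.325, 2.60), (0.650, 3.90)) if beta else ((1.00, 8.00), (2.00, 12.00))
    in_rate, out_rate = rates[1] if input_tokens > 200_000 else rates[0]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a 100K-token prompt with a 10K-token answer on the standard tier comes to about $0.18.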
Two ways to run Gemini 2.5 Pro — pick the tier that matches your workload.
- Gemini 2.5 Pro: the default tier for production reliability and predictable availability.
- Gemini 2.5 Pro (Beta): a lower-cost tier with best-effort availability; recommended for retry-tolerant workloads.
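Because the Beta tier is best-effort, a simple retry wrapper is usually enough to absorb transient failures. A minimal sketch, not tied to any particular SDK:

```python
import random
import time

def call_with_retries(request_fn, max_attempts: int = 4, base_delay: float = 1.0):
    """Retry a best-effort Beta-tier call with exponential backoff and jitter.

    `request_fn` is any zero-argument callable (e.g. a lambda wrapping your
    SDK call) that raises on a transient failure.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Sleep base * 2^attempt, scaled by up to 2x for jitter.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```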
Gemini 2.5 Pro for long-context reasoning and tool use
Gemini 2.5 Pro supports up to 1,048,576 input tokens with text output, so long files, PDFs, and multi-turn workflows stay in one conversation. Use multimodal inputs and structured outputs to turn large context into reliable actions.

What can Gemini 2.5 Pro help you build?
Long-context understanding
Gemini 2.5 Pro can read large documents, codebases, and PDFs in a single request and keep intent consistent across long conversations. Load policies, specs, and prior chat history, then ask for summaries, risk checks, or decisions without heavy chunking or constant re-prompts.

Multimodal analysis
Gemini 2.5 Pro accepts text, images, audio, video, and PDF inputs while returning clear text answers. That means you can combine meeting audio with slides, add screenshots to a bug report, or attach a contract PDF and ask for a risk summary in a single flow.
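Through the OpenAI-compatible endpoint, mixed inputs use the standard content-parts message format. A minimal sketch (the question and image URL below are placeholders):

```python
def build_multimodal_message(question: str, image_url: str) -> dict:
    """Build one OpenAI-style user message mixing a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Attach a screenshot to a bug-report question:
msg = build_multimodal_message(
    "What UI bug does this screenshot show?",
    "https://example.com/screenshot.png",
)
```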

Structured workflows
Gemini 2.5 Pro supports function calling, structured outputs, URL context, and file search so your app can move from insight to action. Use JSON-shaped responses for data extraction, approvals, or routing, then ground results with search or maps when accuracy matters most.
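One way to get JSON-shaped responses through the OpenAI-compatible endpoint is `response_format` with a JSON schema. A sketch for contract extraction; the schema and field names are illustrative, not a fixed contract:

```python
def build_extraction_request(contract_text: str) -> dict:
    """Build a chat request whose reply must match a JSON schema (illustrative fields)."""
    schema = {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total_usd": {"type": "number"},
            "risk_flags": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["vendor", "total_usd", "risk_flags"],
    }
    return {
        "model": "gemini-2.5-pro",
        "messages": [
            {"role": "user", "content": f"Extract key terms from:\n{contract_text}"},
        ],
        # Constrain the reply to the schema so downstream code can parse it.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "contract_extract", "schema": schema},
        },
    }
```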

Why teams pick this model
Teams choose Gemini 2.5 Pro for long-context reasoning, multimodal inputs, and production-ready controls like structured outputs and grounding. On EvoLink, the model is accessible through OpenAI-compatible or native Gemini endpoints.
Long-context confidence
Up to 1,048,576 input tokens and 65,536 output tokens help keep large documents and long histories in a single request.
Reliable structure
Function calling and structured outputs help generate consistent JSON for automation and downstream systems.
Operational clarity
Caching and Batch API support reduce cost on repeated workloads, while search or maps grounding improves trust.
How to use Gemini 2.5 Pro
Use Gemini 2.5 Pro through EvoLink with either OpenAI SDK compatibility or the native Gemini endpoint.
Step 1 - Prepare context
Collect the files, links, or transcripts you need, then ask for an outline or summary before deep analysis.
Step 2 - Choose the API format
Call /v1/chat/completions for OpenAI SDK compatibility, or use /v1beta/models/gemini-2.5-pro:{method} for native Gemini features.
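A minimal sketch of the OpenAI-compatible path using only the standard library; the base URL and API key below are placeholders for your EvoLink values:

```python
import json
import urllib.request

EVOLINK_BASE = "https://api.evolink.example/v1"  # placeholder base URL
API_KEY = "YOUR_EVOLINK_API_KEY"                 # placeholder key

def build_chat_request(prompt: str, model: str = "gemini-2.5-pro") -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{EVOLINK_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

# To send: urllib.request.urlopen(build_chat_request("Summarize this spec."))
```

The same payload works unchanged with the official OpenAI SDK by pointing its `base_url` at EvoLink.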
Step 3 - Generate, review, improve
Evaluate outputs, add constraints, and cache repeated context blocks to reduce cost on large, recurring jobs.
Key capabilities
Built for long, reliable reasoning
1M-class context window
Gemini 2.5 Pro supports up to 1,048,576 input tokens and up to 65,536 output tokens, so long documents and multi-step work stay in a single request.
Multimodal inputs
This model accepts text, image, audio, video, and PDF inputs, then returns text output that is easy to store, search, or pass to other systems.
Structured outputs and tools
Get function calling and structured outputs to format responses as JSON, so your workflows can parse results, trigger actions, and avoid brittle post-processing.
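Function calling follows the standard tools format on the OpenAI-compatible endpoint. A sketch with a made-up `route_ticket` tool to show the shape:

```python
def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Wrap a JSON-schema parameter spec as an OpenAI-style tool declaration."""
    return {
        "type": "function",
        "function": {"name": name, "description": description, "parameters": parameters},
    }

# Hypothetical tool: let the model route a support ticket to a queue.
route_ticket = make_tool(
    "route_ticket",
    "Send a support ticket to the right queue.",
    {
        "type": "object",
        "properties": {
            "queue": {"type": "string", "enum": ["billing", "bug", "sales"]},
            "priority": {"type": "integer", "minimum": 1, "maximum": 3},
        },
        "required": ["queue", "priority"],
    },
)

# Pass [route_ticket] as the `tools` field of a chat completions request;
# the model replies with a tool call whose arguments match the schema.
```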
Grounding and URL context
Use search grounding, maps grounding, URL context, and file search to improve accuracy and reduce hallucinations when factual precision matters.
Caching and batch support
Caching is supported for repeated long-context prompts, and Batch API support lets you process large queues efficiently when latency is less important than throughput.
Reasoning with known limits
This model has a January 2025 knowledge cutoff, so pair it with grounding or fresh sources when you need the most current information.