Gemini 2.5 Pro API

Gemini 2.5 Pro gives teams a long-context reasoning model for deep analysis, code review, and complex planning. With Gemini 2.5 Pro on EvoLink, you can route requests with a single API key, track usage by project, and keep compliance-friendly controls for production apps.

Run With API
Using coding CLIs? Run Gemini 2.5 Pro via EvoCode — One API for Code Agents & CLIs. (View Docs)

PRICING

Plan                   Context window  Max output  Prompt size  Input ($/1M)                   Output ($/1M)
Gemini 2.5 Pro         1.05M           65.5K       ≤200K        $1.00 (-20%; official $1.25)   $8.00 (-20%; official $10.00)
                                                   >200K        $2.00 (-20%; official $2.50)   $12.00 (-20%; official $15.00)
Gemini 2.5 Pro (Beta)  1.05M           65.5K       ≤200K        $0.325 (-74%; official $1.25)  $2.60 (-74%; official $10.00)
                                                   >200K        $0.650 (-74%; official $2.50)  $3.90 (-74%; official $15.00)

Pricing note: all prices are USD per 1M tokens.
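As a rough illustration of how the per-1M-token rates translate into a per-request bill, the sketch below estimates cost from token counts. It assumes the >200K tier rate applies to the whole request once the prompt crosses 200,000 tokens, mirroring the table above; confirm the exact tiering behavior in your dashboard.

```python
def estimate_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Estimate request cost in USD from per-1M-token rates.

    Assumes the >200K tier rate applies to the entire request once
    the prompt exceeds 200,000 tokens (an assumption -- verify billing).
    """
    tier = "large" if input_tokens > 200_000 else "small"
    in_rate, out_rate = rates[tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Standard-tier rates from the table: (input, output) USD per 1M tokens.
STANDARD = {"small": (1.00, 8.00), "large": (2.00, 12.00)}

cost = estimate_cost(input_tokens=150_000, output_tokens=4_000, rates=STANDARD)
print(f"${cost:.3f}")  # → $0.182
```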

Two ways to run Gemini 2.5 Pro — pick the tier that matches your workload.

  • Gemini 2.5 Pro: the default tier for production reliability and predictable availability.
  • Gemini 2.5 Pro (Beta): a lower-cost tier with best-effort availability; best suited to retry-tolerant workloads, with client-side retries recommended.
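For the Beta tier, a simple client-side retry loop with exponential backoff is usually enough to smooth over best-effort availability. The sketch below is a generic pattern, not EvoLink-specific behavior; the failure mode (e.g. HTTP 429/503) and sensible delays depend on your client.

```python
import random
import time

def call_with_retries(make_request, max_attempts=4, base_delay=1.0):
    """Retry a best-effort request with exponential backoff and jitter.

    `make_request` is any zero-argument callable that raises on a
    transient failure (e.g. HTTP 429/503 from the Beta tier).
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Back off 1x, 2x, 4x base_delay, plus jitter scaled by base_delay.
            time.sleep(base_delay * (2 ** attempt + random.random()))
```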

Gemini 2.5 Pro for long-context reasoning and tool use

Gemini 2.5 Pro accepts up to roughly one million input tokens and returns text output, so long files, PDFs, and multi-turn workflows stay in one conversation. Use multimodal inputs and structured outputs to turn large context into reliable actions.


What can Gemini 2.5 Pro help you build?

Long-context understanding

Gemini 2.5 Pro can read large documents, codebases, and PDFs in a single request and keep intent consistent across long conversations. Load policies, specs, and prior chat history, then ask for summaries, risk checks, or decisions without heavy chunking or constant re-prompts.


Multimodal analysis

Gemini 2.5 Pro accepts text, images, audio, video, and PDF inputs while returning clear text answers. That means you can combine meeting audio with slides, add screenshots to a bug report, or attach a contract PDF and ask for a risk summary in a single flow.


Structured workflows

Gemini 2.5 Pro supports function calling, structured outputs, URL context, and file search so your app can move from insight to action. Use JSON-shaped responses for data extraction, approvals, or routing, then ground results with search or maps when accuracy matters most.
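When a workflow acts on JSON-shaped responses, it helps to validate the reply before routing on it. The sketch below shows one way to do that; the field names (`ticket_id`, `category`, `priority`) are hypothetical examples, not part of any EvoLink or Gemini schema.

```python
import json

# Hypothetical routing schema -- replace with the fields your app needs.
REQUIRED_FIELDS = {"ticket_id": str, "category": str, "priority": str}

def parse_routing_reply(reply_text: str) -> dict:
    """Parse and validate a JSON routing decision from the model.

    Raises ValueError if the reply is not JSON or is missing fields,
    so the caller can re-prompt instead of acting on bad data.
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply was not valid JSON: {exc}") from exc
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

reply = '{"ticket_id": "T-1042", "category": "billing", "priority": "high"}'
print(parse_routing_reply(reply)["category"])  # → billing
```

Rejecting malformed replies early keeps a re-prompt loop cheap compared with debugging a bad downstream action.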


Why teams pick this model

Teams choose Gemini 2.5 Pro for long-context reasoning, multimodal inputs, and production-ready controls like structured outputs and grounding, and access it on EvoLink via OpenAI-compatible or native Gemini endpoints.

Long-context confidence

Up to 1,048,576 input tokens and 65,536 output tokens help keep large documents and long histories in a single request.

Reliable structure

Function calling and structured outputs help generate consistent JSON for automation and downstream systems.

Operational clarity

Caching and Batch API support reduce cost on repeated workloads, while search or maps grounding improves trust.

How to use Gemini 2.5 Pro

Use Gemini 2.5 Pro through EvoLink with either OpenAI SDK compatibility or the native Gemini endpoint.


Step 1 - Prepare context

Collect the files, links, or transcripts you need, then ask for an outline or summary before deep analysis.


Step 2 - Choose the API format

Call /v1/chat/completions for OpenAI SDK compatibility, or use /v1beta/models/gemini-2.5-pro:{method} for native Gemini features.
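The two request shapes differ mainly in URL and body layout. The sketch below builds both payloads without sending anything; the base URL is an assumption (check your EvoLink dashboard for the real host), while the paths come from the step above.

```python
# Assumed base URL -- verify the actual host in your EvoLink dashboard.
BASE = "https://api.evolink.ai"

def openai_style_request(prompt: str) -> dict:
    """OpenAI-compatible chat completion request for /v1/chat/completions."""
    return {
        "url": f"{BASE}/v1/chat/completions",
        "json": {
            "model": "gemini-2.5-pro",
            "messages": [{"role": "user", "content": prompt}],
        },
    }

def native_request(prompt: str, method: str = "generateContent") -> dict:
    """Native Gemini request for /v1beta/models/gemini-2.5-pro:{method}."""
    return {
        "url": f"{BASE}/v1beta/models/gemini-2.5-pro:{method}",
        "json": {"contents": [{"parts": [{"text": prompt}]}]},
    }

req = openai_style_request("Summarize this contract.")
print(req["url"])
```

Either shape is then sent with your usual HTTP client and a Bearer token in the Authorization header.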


Step 3 - Generate, review, improve

Evaluate outputs, add constraints, and cache repeated context blocks to reduce cost on large, recurring jobs.

Key capabilities

Built for long, reliable reasoning

Context

1M-class context window

Gemini 2.5 Pro supports up to 1,048,576 input tokens and up to 65,536 output tokens, so long documents and multi-step work stay in a single request.

Multimodal

Multimodal inputs

This model accepts text, image, audio, video, and PDF inputs, then returns text output that is easy to store, search, or pass to other systems.

Tools

Structured outputs and tools

Use function calling and structured outputs to format responses as JSON, so your workflows can parse results, trigger actions, and avoid brittle post-processing.

Grounding

Grounding and URL context

Use search grounding, maps grounding, URL context, and file search to improve accuracy and reduce hallucinations when factual precision matters.

Efficiency

Caching and batch support

Caching is supported for repeated long-context prompts, and Batch API support lets you process large queues efficiently when latency is less important than throughput.

Trust

Reasoning with known limits

This model includes a January 2025 knowledge cutoff, so pair it with grounding or fresh sources when you need the most current information.

Frequently Asked Questions

Everything you need to know about the product and billing.

What is Gemini 2.5 Pro best suited for?

Gemini 2.5 Pro is strongest when you need deep reasoning across long context, such as multi-document reviews, complex code analysis, or planning that spans many constraints. Because the model accepts large prompts, you can keep policies, specs, and historical context together and ask for a single, coherent response. It is also well-suited to multimodal workflows where text needs to be combined with images, audio, video, or PDFs. For production apps, structured outputs help keep results consistent.
How large are the context and output limits?

Gemini 2.5 Pro supports an input limit of up to 1,048,576 tokens and an output limit of up to 65,536 tokens. In practice, that means it can take very large documents, long chat histories, or combined media inputs in a single request. If you push the maximum, plan for longer response times and higher costs. For everyday work, many teams stay below the limit and use the extra headroom to reduce chunking and preserve continuity.
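As a rough pre-flight check against these limits, you can estimate token counts with the common ~4 characters/token heuristic for English text. This is an approximation only; real token counts vary by content and tokenizer, so leave headroom or count tokens with a tokenizer API.

```python
INPUT_LIMIT = 1_048_576   # max input tokens
OUTPUT_LIMIT = 65_536     # max output tokens

def fits_in_context(texts, reserved_output=8_192, chars_per_token=4):
    """Rough check that combined inputs fit the context window.

    Uses the ~4 characters/token heuristic (an approximation);
    real token counts vary, so keep a safety margin.
    """
    estimated = sum(len(t) for t in texts) // chars_per_token
    fits = estimated <= INPUT_LIMIT and reserved_output <= OUTPUT_LIMIT
    return estimated, fits

tokens, ok = fits_in_context(["x" * 400_000, "y" * 100_000])
print(tokens, ok)  # → 125000 True
```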
Which inputs and outputs does Gemini 2.5 Pro support?

Gemini 2.5 Pro accepts text, images, audio, video, and PDF inputs and returns text output. This makes the model practical for workflows like summarizing a PDF, extracting insights from a meeting recording, or explaining a video clip in plain language. Because output is text-only, it is easy to store, search, or send downstream to analytics and business systems. If you need multimodal outputs, you can pair it with specialized media models in EvoLink.
Does Gemini 2.5 Pro support structured outputs and function calling?

Yes. Gemini 2.5 Pro supports function calling and structured outputs, which lets you request JSON-shaped responses for consistent data extraction and routing. This is useful when you want Gemini 2.5 Pro to populate a form, classify tickets, or produce structured summaries for dashboards. You can define the fields you need, validate output more easily, and reduce manual cleanup. For high-stakes workflows, combine structured outputs with grounding to improve reliability.
Can Gemini 2.5 Pro ground answers in external sources?

Gemini 2.5 Pro supports URL context and file search, plus grounding options such as search or maps grounding. This means Gemini 2.5 Pro can reference specific sources, link to pages, and anchor answers in retrievable material. When you need trustworthy results, provide the sources you care about, ask the model to cite them, and keep prompts focused. Grounding is especially helpful for policy, compliance, and customer-support scenarios where accuracy matters.
How do I call Gemini 2.5 Pro on EvoLink?

EvoLink provides two paths: an OpenAI SDK compatible endpoint at /v1/chat/completions and a native Gemini endpoint at /v1beta/models/gemini-2.5-pro:{method}. Gemini 2.5 Pro works with either option, so you can keep existing OpenAI-style tooling or use the native format for Gemini-specific features. Both flows use Bearer token authentication and can stream responses; the native endpoint also supports async mode with an X-Async-Mode header.
How is Gemini 2.5 Pro priced?

Google publishes official Gemini 2.5 Pro pricing for its API, with standard paid tiers that vary by prompt size. As listed by Google, prompts up to 200K tokens are priced at $1.25 per 1M input tokens and $10 per 1M output tokens, while larger prompts cost more; caching and storage have separate rates. EvoLink usage depends on your routing and plan, so check your dashboard for the most accurate pricing and cost controls.
What is the knowledge cutoff?

Gemini 2.5 Pro lists a January 2025 knowledge cutoff, so it may not know very recent events or changes. When freshness matters, use Gemini 2.5 Pro with URL context, file uploads, or grounding so the model can rely on current sources you provide. You can also prompt it to separate cited facts from assumptions, which helps reviewers verify accuracy. This approach keeps responses useful while still benefiting from the model's long-context reasoning.