Gemini 3 Flash Preview API
Access Google's Gemini 3 Flash Preview (gemini-3-flash-preview) through EvoLink with OpenAI SDK compatibility and native Gemini API support. Send text, image, video, audio, and PDF inputs with a 1,048,576 token context window, plus caching and batch options for production workloads.
PRICING
| PLAN | CONTEXT WINDOW | MAX OUTPUT | INPUT | OUTPUT | CACHE READ |
|---|---|---|---|---|---|
| Gemini 3 Flash | 1.05M | 65.5K | ≤200.0K$0.400-20% $0.500Official Price >200.0K$0.400-20% $0.500Official Price | ≤200.0K$2.40-20% $3.00Official Price >200.0K$2.40-20% $3.00Official Price | ≤200.0K$0.040-19% $0.050Official Price >200.0K$0.040-19% $0.050Official Price |
| Gemini 3 Flash (Beta) | 1.05M | 65.5K | ≤200.0K$0.130-74% $0.500Official Price >200.0K$0.130-74% $0.500Official Price | ≤200.0K$0.780-74% $3.00Official Price >200.0K$0.780-74% $3.00Official Price | ≤200.0K$0.013-74% $0.050Official Price >200.0K$0.013-74% $0.050Official Price |
Pricing Note: Price unit: USD / 1M tokens
Cache Hit: Price applies to cached prompt tokens.
Two ways to run Gemini 3 Flash — pick the tier that matches your workload.
- · Gemini 3 Flash: the default tier for production reliability and predictable availability.
- · Gemini 3 Flash (Beta): a lower-cost tier with best-effort availability; retries recommended for retry-tolerant workloads.
Gemini 3 Flash Preview API on EvoLink
Built for speed and scale, Gemini 3 Flash Preview understands text, images, video, audio, and PDFs, and handles massive context (up to 1M tokens). It delivers clear, reliable answers for real-time assistants, document understanding, and media analysis.

What You Can Build with Gemini 3 Flash Preview
Multimodal Inputs, Reliable Text Outputs
A single request can include text, images, video, audio, or PDFs and return text output. This makes it easy to summarize meetings, review media, and extract structured insights without separate pipelines.

1M-Token Context for Long Sessions
Handle up to 1,048,576 input tokens and 65,536 output tokens in a single request. That lets you keep long documents, codebases, or multi-turn chats in one coherent context.

Tools, Grounding, and Reasoning
Use thinking and structured outputs with function calling, code execution, file search, search grounding, and URL context. Batch API and caching are supported for scale and cost control.

Why Use EvoLink for Gemini 3 Flash Preview
Run gemini-3-flash-preview via OpenAI SDK format or Google Native API format with official Gemini capabilities and pricing.
One Integration, Two Formats
Call Gemini 3 Flash Preview in OpenAI SDK or native Gemini format without changing app logic.
Batch + Caching Savings
Use batch processing and context caching to lower repeat costs while scaling high-volume workloads safely.
Ready for Production Use
Multimodal inputs, long context, and tool support cover real production assistants, analysis, and automation workflows.
How to Call Gemini 3 Flash Preview
Choose OpenAI SDK or Google Native API format, then send your request.
Step 1 - Choose API Format
OpenAI SDK format: POST /v1/chat/completions with model "gemini-3-flash-preview". Native API format: POST /v1beta/models/gemini-3-flash-preview:{method} with method generateContent or streamGenerateContent.
Step 2 - Add Auth and Inputs
Include Authorization: Bearer <token>. Send messages/contents with text or multimodal parts (image, video, audio, PDF).
Step 3 - Stream or Scale
Enable streaming for real-time UX, or use X-Async-Mode to return a task ID. Combine batch and caching for cost-efficient high-volume runs.
Technical Specs
Official model capabilities for gemini-3-flash-preview
1,048,576 Input Tokens
Up to 1,048,576 input tokens and 65,536 output tokens.
Multimodal Inputs
Text, image, video, audio, and PDF inputs with text output.
Thinking + Structured Outputs
Thinking and structured outputs are supported for reliable, machine-readable results.
Function Calling + Tools
Function calling, code execution, and file search are supported.
Caching + Batch
Context caching and Batch API are supported for repeated or large-scale workloads.
Search Grounding + URL Context
Search grounding and URL context are supported (Google Maps grounding is not).
Gemini 3 Flash Preview API FAQs
Everything you need to know about the product and billing.