Release Watch

Gemini 3.5 Flash API Is Now Available: Model ID, Pricing, and Production Notes

Name: EvoLink AI Model API Platform
Brand: EvoLink
Availability: InStock

EvoLink Team

Product Team

May 18, 2026

Updated on May 20, 2026

6 min read

Update (May 20, 2026): Google has updated its official Gemini API documentation. Gemini 3.5 Flash is now listed as generally available and stable for scaled production use. The model ID is gemini-3.5-flash. This page has been updated from a release-watch format to reflect confirmed availability.

For the full Gemini 3.5 Flash API page with pricing, code examples, and use cases, see Gemini 3.5 Flash API on EvoLink.

TL;DR

Gemini 3.5 Flash is now generally available (GA) and marked stable for production use.
Model ID: gemini-3.5-flash — confirmed in Google's official Gemini API docs.
Pricing: $1.50 input / $9.00 output per 1M tokens (standard tier), with context caching and batch discounts available.
Context: 1,048,576 input tokens / 65,536 output tokens.
Key strengths: agentic workflows, coding agents, sub-agent deployment, long-horizon tasks.
Not a preview — production teams can route traffic to it with confidence.

What Changed Since May 18

On May 18, 2026, this page reported that Google's official Gemini API docs did not list Gemini 3.5 Flash. Here is what has changed:

Item	May 18 status	Current status (May 20)
Official release	Not confirmed	Generally available, stable
Model ID	Not confirmed	`gemini-3.5-flash`
Pricing	Not confirmed	$1.50 input / $9.00 output per 1M tokens
Context window	Not confirmed	1M input / 65K output
Tool calling	Not confirmed	Function calling, structured outputs, code execution supported
Context caching	Not confirmed	Supported
Batch API	Not confirmed	Supported
Production status	Not available	Stable for scaled production use

Confirmed Capabilities

Google's official documentation describes Gemini 3.5 Flash as a model built for real-world tasks with frontier-level intelligence at Flash-tier speed and cost. The key capabilities confirmed:

Agentic Workflows

Gemini 3.5 Flash is optimized for agentic workflows, parallel execution loops, and sub-agent deployment. Function calling, structured outputs, and code execution are all supported natively.

Coding Tasks

The model handles code generation, debugging, refactoring, and test writing at Flash-tier speed, making it a strong candidate for coding agent loops where each iteration consumes tokens.

Long-Horizon Tasks

With 1M input context, it can process full codebases, multi-document analysis, legal review, research synthesis, and PDF-heavy workflows without context truncation.

Multimodal Input

Text, image, video, audio, and PDF inputs are supported with unified pricing — no separate surcharges for audio or video input.

Pricing Overview

Tier	Input (per 1M tokens)	Output (per 1M tokens)
Standard	$1.50	$9.00
Context caching	Reduced input cost	Same output
Batch API	Additional discounts	Additional discounts

For detailed EvoLink pricing with credit conversion, see the Gemini 3.5 Flash API page.

What This Means for Production Teams

You Can Route Production Traffic Now

Gemini 3.5 Flash is not a preview or experimental model. Google marks it as stable for scaled production use. This means you can plan production routing, cost budgets, and SLAs around it.

Evaluate It for Agent and Coding Workloads

Google explicitly positions this model for agentic workflows and coding tasks. If you are running coding agents, multi-step automation, or tool-orchestrated pipelines, this is a model worth benchmarking against your current default.

Compare Within the Gemini Family

Model	Best for	Cost profile
Gemini 3.5 Flash	Agentic workflows, coding, long-horizon tasks	Flash-tier
Gemini 3.1 Pro	Hardest reasoning, complex analysis	Higher cost
Gemini 3.1 Flash Lite	High-volume batch, simple extraction	Lowest cost

Use EvoLink for Unified Access

EvoLink provides OpenAI-compatible access to Gemini 3.5 Flash alongside other models. One API key, one billing system, and configurable routing between Flash, Pro, and models from other providers.

Access Gemini 3.5 Flash on EvoLink →

Updated Evaluation Checklist

Now that the model is available, here is what to verify on your actual workloads:

Dimension	What to measure
Latency	Time to first token and full completion on your prompt distribution
Quality	Task success rate, schema adherence, hallucination rate
Cost	Token cost including retries, fallbacks, and context caching savings
Tool use	Function calling reliability, structured output accuracy
Agent loops	Cost per full agent session (see cost example)
Fallback rate	How often Flash must escalate to Pro

Official Sources

FAQ

Is Gemini 3.5 Flash now available in the API?

Yes. As of May 2026, Google's official Gemini API documentation lists Gemini 3.5 Flash as generally available and stable for scaled production use. The model ID is gemini-3.5-flash.

What is the model ID for Gemini 3.5 Flash?

The confirmed model ID is gemini-3.5-flash. Use this exact string in API requests.

What is Gemini 3.5 Flash pricing?

Standard pricing is $1.50 per 1M input tokens and $9.00 per 1M output tokens. Context caching and Batch API provide additional cost savings. See the Gemini 3.5 Flash API page for EvoLink-specific pricing.

Is Gemini 3.5 Flash production-ready?

Yes. Google marks it as stable for scaled production use. It is not a preview or experimental model.

What is Gemini 3.5 Flash best for?

According to Google's official documentation, it is optimized for agentic workflows, coding agents, sub-agent deployment, long-horizon tasks, and cost-efficient production inference with 1M context.

How does Gemini 3.5 Flash compare to Gemini 3 Flash?

Gemini 3.5 Flash is the current-generation Flash model with frontier-level intelligence, stronger agentic and coding performance, and built-in reasoning. Gemini 3 Flash is the previous generation.

Can I access Gemini 3.5 Flash through EvoLink?

Yes. EvoLink provides access via OpenAI-compatible requests and native Gemini API format. See the Gemini 3.5 Flash API page for details.

All Posts

#Gemini 3.5 Flash #Gemini API #Google AI #Flash models #release watch