
Gemini 2.5 Flash Lite API

Use Gemini 2.5 Flash Lite on EvoLink through OpenAI-compatible or native Gemini requests. This route is positioned for cheap, high-volume text workloads where cost control matters more than moving up to a stronger Gemini model.
Price:

$0.081 (~5.8 credits) per 1M input tokens; $0.321 (~23.1 credits) per 1M output tokens

$0.0083 (~0.6 credits) per 1M cache read tokens; $0.240 (~17.3 credits) per 1M audio tokens

Google Search grounding charged separately per query.
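As a rough illustration of the rates above, a small helper can estimate the dollar cost of a request. The rates are hard-coded from this page; the credits conversion and per-query Google Search grounding charges are left out.

```python
# Rates from the pricing above, in USD per 1M tokens.
RATE_INPUT = 0.081
RATE_OUTPUT = 0.321
RATE_CACHE_READ = 0.0083
RATE_AUDIO = 0.240

PER_M = 1_000_000

def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_read_tokens: int = 0, audio_tokens: int = 0) -> float:
    """Estimated USD cost of one request on this route.

    Google Search grounding is billed separately per query and is
    not included here.
    """
    return (input_tokens * RATE_INPUT
            + output_tokens * RATE_OUTPUT
            + cache_read_tokens * RATE_CACHE_READ
            + audio_tokens * RATE_AUDIO) / PER_M

# 1M tokens in and 1M out: $0.081 + $0.321 = $0.402.
print(round(estimate_cost(1_000_000, 1_000_000), 3))
```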

Highest stability with guaranteed 99.9% uptime. Recommended for production environments.

Use the same API endpoint for all versions. Only the model parameter differs.


Gemini 2.5 Flash Lite is Google's lowest-cost Gemini text route on EvoLink. Use it for translation, classification, extraction, tagging, and summarization at scale when low token cost matters more than stepping up to Gemini 2.5 Flash or Pro.


Request model ID

gemini-2.5-flash-lite


Best-fit workloads

Translation and localization pipelines

Use Flash Lite for large batches of product copy, support content, help-center articles, and multilingual backlogs where keeping per-request cost low matters more than upgrading to a stronger reasoning model.
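A translation backlog like this can be sketched as one OpenAI-compatible payload per item, all pinned to the Flash Lite route. The prompt wording and helper names here are illustrative, not a documented pattern.

```python
def translation_messages(text: str, target_lang: str) -> list:
    """Chat messages for a single translation request (OpenAI-compatible shape)."""
    return [
        {"role": "system",
         "content": f"Translate the user's text into {target_lang}. "
                    "Return only the translation."},
        {"role": "user", "content": text},
    ]

def translation_batch(texts, target_lang):
    """One payload per backlog item, all pinned to the Flash Lite route."""
    return [
        {"model": "gemini-2.5-flash-lite",
         "messages": translation_messages(t, target_lang)}
        for t in texts
    ]

batch = translation_batch(["Add to cart", "Out of stock"], "German")
```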


Classification, tagging, and extraction queues

A practical fit for queues that label, sort, normalize, or extract structured fields from large volumes of tickets, forms, catalog content, CRM notes, or internal text records.
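For queues like these, a common pattern is to ask the model for strict JSON and parse defensively, falling back to a catch-all label when the reply drifts off-format. The label set and prompt below are illustrative assumptions, not part of the route.

```python
import json

LABELS = ("billing", "bug", "feature_request", "other")

def tagging_messages(ticket_text: str) -> list:
    """Ask for a strict JSON object so downstream parsing stays cheap."""
    return [
        {"role": "system",
         "content": "Classify the support ticket. Respond with JSON only: "
                    f'{{"label": one of {list(LABELS)}, "summary": string}}'},
        {"role": "user", "content": ticket_text},
    ]

def parse_tag(raw_reply: str) -> dict:
    """Defensive parse: fall back to 'other' if the model drifts off-format."""
    try:
        data = json.loads(raw_reply)
        if data.get("label") in LABELS:
            return data
    except (json.JSONDecodeError, AttributeError):
        pass
    return {"label": "other", "summary": raw_reply[:200]}
```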


Summarization and batch text processing

Use it as the low-cost layer for summarizing long text, compressing repetitive content, or preprocessing datasets before routing only the harder cases to Gemini 2.5 Flash or Gemini 2.5 Pro.
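The escalation step can be sketched as a routing function. The confidence score is something you derive yourself (for example, from a validation heuristic on the Flash Lite output), and the thresholds are illustrative, not tuned values.

```python
def escalation_route(confidence: float) -> str:
    """Pick the next route for an item after a cheap Flash Lite first pass.

    `confidence` is a score you compute from the first-pass result;
    the 0.8 / 0.5 thresholds here are illustrative assumptions.
    """
    if confidence >= 0.8:
        return "gemini-2.5-flash-lite"  # the cheap layer's answer stands
    if confidence >= 0.5:
        return "gemini-2.5-flash"       # stronger general-purpose route
    return "gemini-2.5-pro"             # premium reasoning for the hard tail
```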


When to choose this route

Flash Lite makes the most sense as the low-cost layer in a Gemini routing strategy: one gateway, one auth pattern, and a clearer split between cheap bulk processing and stronger Gemini routes.

Choose Flash Lite when cost and throughput lead the decision

If the workload is mostly translation, tagging, extraction, summarization, or batch text cleanup, Flash Lite is the practical starting route because it keeps unit cost low without creating a separate integration path.

Do not default to Flash Lite for harder reasoning-heavy work

Flash Lite is not the route to center your stack on when the task quality threshold is higher, the reasoning path is more complex, or you expect too many edge cases to slip through a cheap first-pass layer.

Move up to Gemini 2.5 Flash or Pro when quality matters more than price

Upgrade to Gemini 2.5 Flash for a stronger general-purpose route, or Gemini 2.5 Pro when the task justifies a more capable premium model. EvoLink makes that routing split easier to operate behind one gateway.

How to start

Use this page as a quick route guide: pick the request format, use the correct model ID, and leave detailed request syntax to the docs.


Step 1 - Choose the Request Format

Call Gemini 2.5 Flash Lite through OpenAI-compatible requests or native Gemini requests, depending on the stack you already run.
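A minimal OpenAI-compatible request can be sketched with the standard library alone. The base URL, environment variable names, and `/chat/completions` path are assumptions; check the EvoLink docs for the actual endpoint and auth details.

```python
import json
import os
import urllib.request

# Hypothetical endpoint and key variable -- confirm against the EvoLink docs.
BASE_URL = os.environ.get("EVOLINK_BASE_URL", "https://api.evolink.example/v1")
API_KEY = os.environ.get("EVOLINK_API_KEY", "")

def build_chat_request(prompt: str) -> dict:
    """OpenAI-compatible chat payload targeting the Flash Lite route."""
    return {
        "model": "gemini-2.5-flash-lite",  # model ID from this page
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict) -> dict:
    """POST the payload to the (assumed) chat completions path."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```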


Step 2 - Use the Correct Model ID

Use the request model ID "gemini-2.5-flash-lite" for this route.


Step 3 - Route the Right Workloads Here

Use Flash Lite for translation, classification, extraction, tagging, summarization, and batch text processing. Move up only when the task needs a stronger Gemini route.
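The workload split above can be expressed as a simple routing table. The task names are illustrative; the point is to keep the bulk text tasks on Flash Lite and send everything else up a tier by default.

```python
# Bulk text tasks that stay on the cheap route (illustrative task names).
FLASH_LITE_TASKS = frozenset(
    {"translate", "classify", "extract", "tag", "summarize"}
)

def model_for(task: str) -> str:
    """Keep bulk text tasks on Flash Lite; escalate everything else."""
    return "gemini-2.5-flash-lite" if task in FLASH_LITE_TASKS else "gemini-2.5-flash"
```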

Core capabilities and limits

The main limits and production signals that matter when deciding whether this route fits your workload.

Context

1,048,576 Input Tokens

Supports up to 1,048,576 input tokens for long prompts, large documents, and bulk text processing.

Output

65,536 Max Output Tokens

Best suited to compact outputs such as labels, summaries, extracted fields, and text responses.

Input

Text + Audio In, Text Out

Accepts text and audio input, with text output for transcription-adjacent and text processing workflows.

Caching

Implicit Caching

Repeated context can benefit from implicit caching, which helps reduce cost on overlapping requests.
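Implicit caching generally benefits requests that share a common prefix, so one practical pattern (an assumption about how cache matching works, not a documented guarantee of this route) is to put the large, stable context first and the per-item text last:

```python
def build_prompt(shared_context: str, item_text: str) -> str:
    """Stable context first, variable text last, so requests share a prefix."""
    return f"{shared_context}\n---\n{item_text}"

# A stable style guide reused across every item in the batch (illustrative).
guide = "House style: keep translations under 80 characters."
prompts = [build_prompt(guide, t) for t in ("Add to cart", "Checkout")]
# Every prompt starts with the same prefix, which implicit caching can reuse.
```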

Scale

Batch API

Supports Batch API for queued, offline, or other high-volume processing patterns.

Pricing

Lowest-Cost Gemini Text Route

Positioned below Gemini 2.5 Flash in both capability and price, making it the practical budget layer for bulk text workloads.

Gemini 2.5 Flash Lite API FAQs

Everything you need to know about the product and billing.

Is Flash Lite cheaper than Gemini 2.5 Flash?
Yes. Flash Lite is positioned below Gemini 2.5 Flash in both price and capability, and is meant for lower-cost bulk text workloads.

Does EvoLink support OpenAI-compatible requests for this route?
Yes. EvoLink supports both OpenAI-compatible requests and native Gemini requests for this route.

Which model ID should I send?
Use "gemini-2.5-flash-lite" as the request model ID.

What are the context and output limits?
Gemini 2.5 Flash Lite supports up to 1,048,576 input tokens and up to 65,536 output tokens.

Does this route accept audio input?
Yes. This route supports text and audio input, with text output.

How does caching affect cost?
Implicit caching can reduce repeated token cost when requests share overlapping context, which is useful for recurring prompts and batch-style workloads.

When should I choose Flash Lite over Gemini 2.5 Flash?
Choose Flash Lite when translation, tagging, extraction, summarization, and other high-volume text tasks need the lowest practical cost. Move up to Flash when you need a stronger general-purpose route.

What workloads is this route best for?
It is best for translation, classification, extraction, tagging, summarization, and other batch text processing workloads where cost and throughput matter more than using a stronger model by default.

Does Gemini 2.5 Flash Lite support function calling?
Yes. Gemini 2.5 Flash Lite supports function calling, but it is usually best positioned as a low-cost text route rather than the strongest option for the hardest tool-heavy reasoning tasks.

Next steps for Gemini routing

Where Flash Lite fits in the Gemini family

Use Flash Lite for bulk text processing, move to Gemini 2.5 Flash when you need a stronger general-purpose route, and move to Gemini 2.5 Pro when the task justifies premium reasoning quality.

From here, move to the right Gemini route or into the docs once Flash Lite's role in your stack is clear.