Gemini Omni Flash API

Gemini Omni Flash API on EvoLink — video generation and video editing through one API key, async task workflow, and callback support.
Price:

  • $1.275 (~86.7 credits) per 1M input tokens
  • $14.875 (~1,011.5 credits) per 1M video output tokens
  • $7.650 (~520.2 credits) per 1M other output tokens

Token-based billing. Actual cost follows the usage object returned by the API.

Highest stability with guaranteed 99.9% uptime. Recommended for production environments.

Use the same video endpoint for all modes. Only the model parameter differs.

Output is 720p with audio. In the Playground, duration defaults to Auto; set a fixed duration of 3-10s to override it.

Choose landscape, portrait, or Auto to let the provider select the output ratio.

Auto lets the provider decide the output duration (estimated as 10s). Choose 3-10s to send a fixed duration.

Gemini Omni Flash API on EvoLink

Use Gemini Omni Flash on EvoLink for text-to-video, image-to-video, reference-to-video, and video editing through one unified video API. Public discussion often frames Gemini Omni as a video counterpart to Nano Banana because it brings multimodal video creation and conversational editing into short-form workflows. On EvoLink, the practical value is API access: EvoLink model IDs, async task workflow, callback support, token-based usage visibility, and the same API key used for Veo, Seedance, Kling, and other video models.

Billing Rules

  • Gemini Omni Flash is billed by token usage. The task returns a credits_reserved estimate on creation and settles from the actual usage tokens once the task completes.
  • Text input: counted from the prompt tokens.
  • Video input: 5,792 tokens per second of input video.
  • Video output: 5,792 tokens per second of 720p video (audio included).
  • Video edit: the output follows the input video, so this mode does not accept duration or aspect_ratio.
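
The token rules above can be turned into a rough pre-bill estimate. The sketch below is illustrative only: the function names are hypothetical, the credit rates are copied from the price line on this page, and any "other output" token count you pass in is a guess rather than a documented constant. Settled charges follow the usage object returned by the API.

```python
# Rough pre-bill estimate; settled charges come from the API's usage object.
TOKENS_PER_SECOND = 5_792  # per second of 720p video, input or output

# Credits per 1M tokens, copied from the price line on this page.
CREDITS_PER_M = {"input": 86.7, "video_output": 1011.5, "other_output": 520.2}


def video_tokens(seconds: float) -> int:
    """Tokens for a clip of the given length (input or output video)."""
    return round(seconds * TOKENS_PER_SECOND)


def estimate_credits(output_seconds=10, input_video_seconds=0,
                     prompt_tokens=0, other_output_tokens=0) -> float:
    """Estimate credits for one task. Auto duration is estimated as 10 s."""
    input_tokens = prompt_tokens + video_tokens(input_video_seconds)
    credits = (
        input_tokens * CREDITS_PER_M["input"]
        + video_tokens(output_seconds) * CREDITS_PER_M["video_output"]
        + other_output_tokens * CREDITS_PER_M["other_output"]
    ) / 1_000_000
    return round(credits, 4)
```

With a 10-second output and 1,000 other output tokens, estimate_credits(other_output_tokens=1000) comes to about 59.106 credits. For a video edit, pass input_video_seconds so the input clip is counted on the input meter.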

Pricing

Mode | Item | Meter | Price per 1K tokens | Credits per 1K tokens
--- | --- | --- | --- | ---
Text to Video | Output video | Video output tokens | $0.015 | 1.0115
Text to Video | Input text / image / video | Input tokens | $0.0013 | 0.0867
Text to Video | Thinking / text output | Other output tokens | $0.0077 | 0.5202

If a route is down, EvoLink automatically switches to the next cheapest available route, ensuring 99.9% uptime at the best possible price.

EvoLink price estimate: gemini-omni-flash (Auto estimated as 10s)

Figures are pre-bill estimates. Actual charges follow the upstream usage tokens returned by the model.

  • Your estimate: ~$0.869 (59.106 credits)
  • Official price: ~$1.023 (69.537 credits); EvoLink saves ~15%

Tokens per task:

  • Video output: 57,920
  • Text input: 0
  • Other output: 1,000

What can you build with Gemini Omni API?

Chat-Based Video Editing

Generate a clip with Gemini Omni, then refine it in conversation — "make the lighting warmer", "replace the red car". The workflow is designed for iterative edits while preserving the surrounding scene, subject identity, and motion as much as the selected route supports.

Object Replacement and Scene Rewrite

Swap an object in frame, remove an unwanted element, or rewrite a scene while preserving identity and motion. Useful for ad creative iteration and product variant rendering without external editing tools.

Reference Image Workflow

Pass a reference image and Gemini Omni anchors character identity, lighting, and color across the generated video. Combine with chat-based editing to refine specific shots without losing visual consistency.

Audio-Capable Video Generation

Gemini Omni Flash routes can return short video outputs with audio where supported by the selected mode, reducing the need to stitch a separate TTS or sound-design pipeline into first-pass generation.

How Gemini Omni Compares — All models on one EvoLink API key

Gemini Omni is most interesting for workflow rather than raw fidelity alone: multimodal inputs, conversational editing, and a practical EvoLink route for testing it beside Veo, Seedance, and Kling with one API key.

Chat-Native Editing Workflow

Gemini Omni is positioned around conversational video editing, while Veo 3.1 and Seedance 2.0 are usually evaluated first as generation routes. For multi-turn refinement, this is the workflow difference to test.

Long-Context Character Consistency

Gemini Omni is reported to benefit from Gemini context and world knowledge for continuity across multi-input and edit-heavy workflows. Treat this as a behavior to evaluate in your own storyboard or short-video pipeline.

No Google Cloud Project — Same Async Pattern as Veo and Seedance

No GCP setup, no Vertex billing, no separate region approval. If you already run video generation through EvoLink, adding Gemini Omni is a one-parameter change — same request shape, same task lifecycle as Veo 3.1, Seedance 2.0, and Kling.

Gemini Omni vs Veo 3.1 vs Seedance 2.0 — Side-by-side comparison

Three models commonly shortlisted for production video workflows in 2026. All three accessible through one EvoLink API key.

Feature | Gemini Omni | Veo 3.1 | Seedance 2.0
--- | --- | --- | ---
EvoLink price | Token-based | From $0.50/s | From $0.092/s
Quality | 720p | 720p / 1080p, 4K upscaling where available | 480p / 720p / 1080p
Native audio | Yes | Yes | Yes
Reference control | Text + image + chat edit | Text + image | Text + image + video + audio
Video length | 3-10s / Auto | Short clips with Extend for longer scenes where supported | 4–15s
Editing | Conversational editing workflow | Generation-first | V2V mode
Best for | Short-form editing and multi-input workflows | Cinematic baseline | Multimodal reference production

How to Integrate Gemini Omni API

Three steps to your first Gemini Omni video task. Same integration pattern as Veo 3.1, Seedance 2.0, and Kling 3.0.

Step 1 — Get Your API Key

Sign up on EvoLink.ai and generate your API key from the dashboard. No Google Cloud project required.

Step 2 — Submit Generation Task

POST to /v1/videos/generations with one of the Gemini Omni Flash model names and your prompt. Add duration for 3-10 second or Auto generation modes, image_urls for image-to-video or reference-to-video, video_urls for video edit, and callback_url for completion notification. The API processes asynchronously and returns a task id.
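
A minimal sketch of the submit step, using only the Python standard library. The base URL and the Bearer authorization header are assumptions that this page does not state, so confirm both against the EvoLink API reference before use. The helper names are hypothetical; the payload fields come from the request parameters documented on this page.

```python
import json
import urllib.request

# ASSUMPTION: base URL and Bearer auth are not confirmed by this page.
BASE_URL = "https://api.evolink.ai"


def build_submit_request(api_key, prompt,
                         model="gemini-omni-flash-text-to-video",
                         duration="auto", aspect_ratio="16:9",
                         callback_url=None):
    """Build (but do not send) a POST to /v1/videos/generations."""
    payload = {"model": model, "prompt": prompt,
               "duration": duration, "aspect_ratio": aspect_ratio}
    if callback_url:
        payload["callback_url"] = callback_url
    return urllib.request.Request(
        BASE_URL + "/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def submit(request, timeout=30):
    """Send the request and return the decoded task object as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling submit(build_submit_request(key, "A calm product shot")) would return the task object, whose id is used in the next step. This builder targets the generation modes; video edit does not accept duration or aspect_ratio, so those fields would need to be left out for that mode.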

Step 3 — Retrieve Video Result

Use the task ID to poll the status endpoint, or wait for the callback_url webhook. When status reaches completed, you receive a download URL for the generated MP4. Links are valid for 24 hours.
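
The polling half of this step can be sketched as a small loop. Because this page does not give the status endpoint path, the status call is passed in as fetch_task, a hypothetical caller-supplied function that returns the task object as a dict. The terminal states completed and failed are the ones named on this page.

```python
import time

TERMINAL_STATES = {"completed", "failed"}


def wait_for_task(fetch_task, task_id, interval=5.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_task(task_id) until the task reaches a terminal state.

    fetch_task is a hypothetical callable that queries the status endpoint
    and returns the task object as a dict with a "status" key.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {task.get('status')!r}")
        sleep(interval)
```

The sleep and clock arguments exist only so the loop can be exercised without real waiting. Since result links expire after 24 hours, a caller would typically download the MP4 as soon as the returned task shows completed.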

Gemini Omni API Capabilities

Technical specifications for production video workflows.

Editing

Chat-Based Video Editing

Multi-turn refinement in a conversational workflow, with scene continuity depending on the selected route and input quality.

Output

720p, 3-10s / Auto Clips

720p output with configurable 3-10 second or Auto clips for generation modes. Auto is estimated as 10 seconds. Video edit accepts one MP4 input up to 10 seconds.

Modes

Text-to-Video and Image-to-Video

T2V from prompts and I2V with reference image input. Chat editing applies to outputs of either mode.

Audio

Audio-Capable Video Output

Short video outputs can include audio where supported by the selected Gemini Omni Flash route.

Consistency

Long-Context Character Consistency

Designed for stronger continuity across multi-input and edit-heavy workflows; validate consistency on your own production prompts.

Workflow

Async API with Task ID and Callback

Submit a task, receive an ID, poll status or configure a callback_url. Same lifecycle as other EvoLink video models.

Cost Example — Gemini Omni pricing estimates

  • 100 × 3-10s/Auto clips for a social media batch
  • 1,000 × 3-10s/Auto clips per month at production scale
  • 1 generation + 3 edits in a multi-turn workflow

Use the Pricing tab above for current token-based rates. Select the workflow by changing the model parameter.
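
As a rough illustration of these three scenarios, the sketch below applies the per-1M-token USD rates and the 5,792 tokens-per-second figure quoted on this page. It rests on several assumptions: every clip runs 10 seconds (the Auto estimate), prompt tokens and other output tokens are ignored, and each edit consumes a 10-second input video and returns a 10-second output. Real charges follow the usage tokens returned by the API and may differ.

```python
# USD per 1M tokens, copied from the price line on this page.
USD_PER_M_INPUT = 1.275
USD_PER_M_VIDEO_OUTPUT = 14.875
TOKENS_PER_SECOND = 5_792  # 720p video, input or output


def generation_cost(seconds=10):
    """USD for one generated clip, counting video output tokens only."""
    return seconds * TOKENS_PER_SECOND * USD_PER_M_VIDEO_OUTPUT / 1_000_000


def edit_cost(seconds=10):
    """USD for one video edit: input video tokens plus output video tokens."""
    input_usd = seconds * TOKENS_PER_SECOND * USD_PER_M_INPUT / 1_000_000
    return input_usd + generation_cost(seconds)


scenarios = {
    "100 clips": 100 * generation_cost(),
    "1,000 clips": 1_000 * generation_cost(),
    "1 generation + 3 edits": generation_cost() + 3 * edit_cost(),
}
for name, usd in scenarios.items():
    print(f"{name}: ~${usd:.2f}")
```

Under those assumptions the three scenarios come to roughly $86, $862, and $3.67. Shorter fixed durations scale the figures down linearly.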

Gemini Omni API Frequently Asked Questions

Everything you need to know about the product and billing.

Gemini Omni is Google's multimodal video model family announced at Google I/O 2026, with Omni Flash discussed as a short-form video route for text, image, video, and audio inputs. Compared with Veo 3.1, Gemini Omni is more interesting for conversational editing and multi-input workflows, while Veo remains a strong cinematic generation baseline.
Billing follows the usage tokens returned by the API, with separate token meters for input, video output, and other output. Check the Pricing table above for current rates.
No. EvoLink provides access via one API key. No Google Cloud project, no Vertex billing, no separate region approval. Same authentication as Veo 3.1 and Seedance 2.0 on EvoLink.
Four modes are available: gemini-omni-flash-text-to-video, gemini-omni-flash-image-to-video, gemini-omni-flash-reference-to-video, and gemini-omni-flash-video-edit. All share the same async video API endpoint.
Yes. Pass a callback_url (HTTPS) when submitting the task and EvoLink can POST task updates to your endpoint when the task reaches a terminal state. Polling the task status endpoint also works if you do not provide a callback URL.
Failed tasks return a failed status with an error reason. For application-level retry, inspect the error, keep the original parameters for debugging, and resubmit only when the input or transient failure mode is clear.
Yes — this is one of Gemini Omni's main workflow differences. Use a natural-language edit instruction and validate how well the selected route preserves the surrounding scene, subject identity, and motion across iterations.
Generation modes support configurable 3-10 second or Auto clips. Auto is estimated as 10 seconds for reservation. Video edit accepts one MP4 input up to 10 seconds. For longer narratives, chain multiple clips using long-context character consistency.
Yes. Pass a reference image URL and Gemini Omni uses it as an identity anchor for the generated video.
Seedance 2.0 has strong benchmark and multimodal reference signals, while Veo 3.1 remains a strong cinematic generation baseline with advanced Flow and extension workflows. Gemini Omni is different because developers are evaluating it for conversational editing, multi-input generation, and short-form iteration.
Yes. EvoLink exposes Gemini Omni, Veo 3.1, Nano Banana 2, and the rest of the Gemini family through a single API key. Switch by changing the model parameter.
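
For the callback and failure handling described in the answers above, a receiving endpoint might classify each POST body along these lines. This is a sketch under a loud assumption: this page does not show the callback payload, so the code assumes it is the same JSON task object returned at creation, with id and status fields. The function name and return strings are hypothetical.

```python
import json


def handle_callback(body: bytes) -> str:
    """Classify a callback POST body and say what the caller should do next.

    ASSUMPTION: the body is a JSON task object with "id" and "status",
    like the creation response; this page does not document the payload.
    """
    try:
        task = json.loads(body)
    except ValueError:  # also covers UnicodeDecodeError and JSONDecodeError
        return "ignored: not JSON"
    if not isinstance(task, dict) or "id" not in task:
        return "ignored: no task id"
    status = task.get("status")
    if status == "completed":
        return f"store result for {task['id']}"
    if status == "failed":
        return f"inspect error for {task['id']}"
    return f"non-terminal update for {task['id']}: {status}"
```

Keeping the handler a pure function makes it easy to plug into any web framework and to test with recorded payloads. On a failed status, the original request parameters should be kept for debugging before any resubmission.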

All Gemini Video API Models

EvoLink provides unified access to Google's video and media model family through a single API key. All models share the same EvoLink API endpoint. Switch models with one parameter.

POST /v1/videos/generations

Create Gemini Omni Flash Video Task

Text to Video uses the unified EvoLink video generation endpoint. Select the mode by changing the model parameter.

Asynchronous processing returns a task ID. Use it to poll the task status, or provide callback_url for completion notifications.

Generated outputs should be stored in your own system when result URLs are time-limited.

Request Parameters

model (string, required, default: gemini-omni-flash-text-to-video)

Gemini Omni Flash model name. Fixed to gemini-omni-flash-text-to-video for text-to-video generation.

Example: gemini-omni-flash-text-to-video

prompt (string, required)

Natural-language instruction describing the requested video.

Example: Create a cinematic product video with smooth camera motion and natural audio ambience

aspect_ratio (string, optional, default: 16:9)

Output aspect ratio. Use auto to let the provider choose.

Value | Description
--- | ---
16:9 | Landscape video
9:16 | Portrait video
auto | Let the provider choose the output ratio

Example: 16:9

duration (integer or string, optional, default: 10 if omitted)

Output video duration in seconds. The Playground sends auto by default.

Value | Description
--- | ---
3-10 | Any integer from 3 to 10 seconds. If omitted, the API default is 10 seconds.
auto | Let the provider decide the output duration. Playground sends auto by default and estimates it as 10 seconds.

Notes
  • Use auto to let the model decide the duration; reservations estimate auto as 10 seconds
  • Affects the estimated reservation; completed tasks are billed from API usage tokens

Example: auto

callback_url (string, optional)

Optional HTTPS callback address, called after task completion.

Notes
  • Use polling if no callback_url is provided
  • Store outputs promptly when result URLs are time-limited

Example: https://your-domain.com/webhooks/video-task-completed
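
The parameter rules above can be checked client-side before a task is submitted. The sketch below is a hypothetical helper, not part of any EvoLink SDK; it encodes only the constraints listed on this page for the text-to-video mode, and the server remains the authority on what is accepted.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "auto"}


def validate_generation_params(params: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    for field in ("model", "prompt"):
        value = params.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} is required and must be a non-empty string")
    ratio = params.get("aspect_ratio", "16:9")
    if not isinstance(ratio, str) or ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect_ratio must be 16:9, 9:16, or auto")
    duration = params.get("duration", 10)
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(duration, int) and not isinstance(duration, bool)
    if not (duration == "auto" or (is_int and 3 <= duration <= 10)):
        problems.append("duration must be an integer from 3 to 10, or auto")
    url = params.get("callback_url")
    if url is not None and not str(url).startswith("https://"):
        problems.append("callback_url must be an HTTPS URL")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem at once.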

Request Example

{
  "model": "gemini-omni-flash-text-to-video",
  "prompt": "Create a cinematic product video with smooth camera motion and natural audio ambience",
  "aspect_ratio": "16:9",
  "duration": "auto",
  "callback_url": "https://your-domain.com/webhooks/video-task-completed"
}

Response Example

{
  "id": "task-video-xxxxxxxx",
  "model": "gemini-omni-flash-text-to-video",
  "object": "video.generation.task",
  "status": "processing",
  "progress": 0,
  "task_info": {
    "estimated_time": 60,
    "can_cancel": false,
    "video_duration": 10
  },
  "usage": {
    "credits_reserved": 59.1089,
    "billing_rule": "per_token"
  },
  "type": "video",
  "created": 1782940800
}
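
A short sketch of how a client might pull the tracking fields out of a creation response like the one above. The TaskTicket class and function name are hypothetical; the field names are taken from the response example, and the unit of estimated_time is not stated on this page, so it is stored as-is.

```python
import json
from dataclasses import dataclass


@dataclass
class TaskTicket:
    task_id: str
    status: str
    credits_reserved: float
    estimated_time: int  # unit not stated on this page; kept as returned


def parse_task_response(text: str) -> TaskTicket:
    """Extract the fields needed to track a task and its credit reservation."""
    body = json.loads(text)
    return TaskTicket(
        task_id=body["id"],
        status=body["status"],
        credits_reserved=body.get("usage", {}).get("credits_reserved", 0.0),
        estimated_time=body.get("task_info", {}).get("estimated_time", 0),
    )
```

The credits_reserved value is only the reservation made at creation; the settled amount comes from the usage tokens once the task completes.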

Billing Rules

Gemini Omni Flash is billed by token usage. The task returns a credits_reserved estimate on creation and settles from the actual usage tokens once the task completes. Token counts per material:

  • Text input — counted from the prompt tokens.
  • Video output — 5,792 tokens per second of 720p video (audio included).
  • Duration only affects the reservation estimate; Auto is estimated as 10 seconds.