Gemini Omni API

Gemini Omni API on EvoLink — video generation and chat-based editing through one API key, async task workflow, and callback support.

Highest stability with guaranteed 99.9% uptime. Recommended for production environments.

Use the same API endpoint for all versions. Only the model parameter differs.

Google Gemini Omni API and AI Video Generation Model

Use the Gemini Omni API to run Google's chat-based video model for text-to-video generation, image-to-video, and conversational editing through a single EvoLink API key. Unlike Veo 3.1, Gemini Omni treats editing as a first-class capability — refine clips in conversation instead of regenerating from scratch. Available globally on EvoLink with async task workflow, callback support, and no Google Cloud project required. The Pricing tab above shows current rates for the Pro and Flash routes.

What can you build with Gemini Omni API?

Chat-Based Video Editing

Generate a clip with Gemini Omni, then refine it in conversation — "make the lighting warmer", "replace the red car". The model rewrites only the affected frames and keeps the rest pixel-stable. No regenerate-from-scratch loop.

Object Replacement and Scene Rewrite

Swap an object in frame, remove an unwanted element, or rewrite a scene while preserving identity and motion. Useful for ad creative iteration and product variant rendering without external editing tools.

Reference Image Workflow

Pass a reference image and Gemini Omni anchors character identity, lighting, and color across the generated video. Combine with chat-based editing to refine specific shots without losing visual consistency.

Native Audio-Synced Generation

Gemini Omni outputs synchronized audio and video in a single inference pass — footsteps match impacts, dialogue matches lip movement. No separate TTS or sound design pipeline.

How Gemini Omni Compares — All models on one EvoLink API key

Gemini Omni isn't the fidelity leader — Seedance 2.0 currently tops public benchmarks. Where Gemini Omni wins is workflow: chat-based editing, long-context consistency, and the simplest production path among Google video models.

Chat-Native Editing Workflow

Gemini Omni is the only major Google video model with editing as a first-class capability. Veo 3.1 and Seedance 2.0 are generation-first. For multi-turn refinement, this is the structural difference.

Long-Context Character Consistency

Gemini Omni inherits Gemini's long-context window to maintain character, outfit, and props across multiple shots in the same task. Reduces manual reference-management work in storyboard production.

No Google Cloud Project — Same Async Pattern as Veo and Seedance

No GCP setup, no Vertex billing, no separate region approval. If you already run video generation through EvoLink, adding Gemini Omni is a one-parameter change — same request shape, same task lifecycle as Veo 3.1, Seedance 2.0, and Kling.

Gemini Omni vs Veo 3.1 vs Seedance 2.0 — Side-by-side comparison

Three models commonly shortlisted for production video workflows in 2026. All three are accessible through one EvoLink API key.

Feature            | Gemini Omni              | Veo 3.1            | Seedance 2.0
EvoLink price      | TBC                      | From $0.50/s       | From $0.092/s
Quality            | 720p / 1080p (TBC)       | 720p / 1080p / 4K  | 480p / 720p / 1080p
Native audio       | Yes                      | Yes                | Yes
Reference control  | Text + image + chat edit | Text + image       | Text + image + video + audio
Video length       | ~10s                     | 4–8s + 16s extend  | 4–15s
Editing            | Chat-native, multi-turn  | Generation-first   | V2V mode
Best for           | Editing-heavy workflows  | Cinematic baseline | Multimodal reference production

How to Integrate Gemini Omni API

Three steps to your first Gemini Omni video task. Same integration pattern as Veo 3.1, Seedance 2.0, and Kling 3.0.

Step 1 — Get Your API Key

Sign up on EvoLink.ai and generate your API key from the dashboard. No Google Cloud project required.

Step 2 — Submit Generation Task

POST to /v1/videos/generations with model: gemini-omni-pro (or gemini-omni-flash) and your prompt. Optionally include a reference image URL for image-to-video and a callback_url for completion notification. The API processes asynchronously and returns a task id.

Step 3 — Retrieve Video Result

Use the task ID to poll the status endpoint, or wait for the callback_url webhook. When status reaches completed, you receive a download URL for the generated MP4. Links are valid for 24 hours.
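
The polling half of Step 3 can be sketched as below. This is a minimal illustration, not EvoLink's client: the source does not give the status endpoint path, so the HTTP call is left to a caller-supplied fetch_status function (hypothetical), and only the "completed" and "failed" status values that appear on this page are treated as terminal.

```python
import time
from typing import Callable, Dict

# "completed" and "failed" are the only terminal statuses named in the source.
# Other terminal values (for example a cancelled state) are unknown and would
# need to be added once confirmed.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    fetch_status: Callable[[str], Dict],
    task_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status(task_id) until a terminal status or the timeout.

    fetch_status is a hypothetical callable that performs the HTTP request
    against the task status endpoint and returns the decoded JSON object.
    """
    waited = 0.0
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if waited >= timeout_s:
            raise TimeoutError(
                f"task {task_id} still {task.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval_s)
        waited += interval_s
```

Injecting the fetch and sleep functions keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.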

Gemini Omni API Capabilities

Technical specifications for production video workflows.

Editing

Chat-Based Video Editing

Multi-turn refinement in a single chat thread. Only affected frames re-render.

Output

Up to 1080p, ~10s Clips

720p and 1080p output tiers with clips up to approximately 10 seconds. Designed for short-form content and storyboard shots.

Modes

Text-to-Video and Image-to-Video

T2V from prompts and I2V with reference image input. Chat editing applies to outputs of either mode.

Audio

Native Synchronized Audio

Picture and audio generated jointly — dialogue, ambient sound, and impact effects sync with on-screen action.

Consistency

Long-Context Character Consistency

Inherits Gemini's long-context window to maintain character and props across shots in the same task.

Workflow

Async API with Task ID and Callback

Submit a task, receive an ID, poll status or configure a callback_url. Same lifecycle as other EvoLink video models.
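
Because the FAQ on this page states that failed callbacks are retried up to 3 times, a receiver may see the same task event more than once. The sketch below shows one way to make callback handling idempotent. The callback body format is not documented here, so the code assumes (hypothetically) that it is a JSON object carrying the same id and status fields as the task object in the Response Example.

```python
import json
from typing import Set, Tuple


def handle_callback(raw_body: bytes, seen: Set[Tuple[str, str]]) -> str:
    """Classify one callback delivery as 'processed', 'duplicate', or 'ignored'.

    seen holds (task id, status) pairs that were already handled. In a real
    service this would live in a database or cache rather than in memory.
    """
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return "ignored"
    if not isinstance(event, dict):
        return "ignored"

    # Assumed field names, mirroring the task object shown on this page.
    task_id, status = event.get("id"), event.get("status")
    if not task_id or not status:
        return "ignored"

    key = (task_id, status)
    if key in seen:
        return "duplicate"  # a retried delivery; do not repeat side effects
    seen.add(key)
    return "processed"
```

The function is deliberately free of any web framework, so it can be called from whichever HTTPS handler receives the POST.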

Cost Example — Gemini Omni pricing estimates

100 × 10s clips for social media batch

Pricing to be announced

1,000 × 10s clips/month at production scale

Pricing to be announced

1 generation + 3 edits multi-turn workflow

Pricing to be announced

Iterate with gemini-omni-flash, then promote winners to gemini-omni-pro. Pricing details will be published when the route goes live.
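
Until rates are published, the three scenarios above can only be expressed as task counts. The sketch below is a rough estimator under stated assumptions: billing is per task and failed tasks are not billed (both from the FAQ on this page), each edit turn counts as its own billed task (an assumption, not confirmed by the source), and price_per_task is a placeholder the caller must supply.

```python
def billed_tasks(clips: int, edits_per_clip: int = 0, failed: int = 0) -> int:
    """Count billable tasks: one generation plus its edits per clip, minus failures."""
    if clips < 0 or edits_per_clip < 0 or failed < 0:
        raise ValueError("counts must be non-negative")
    total = clips * (1 + edits_per_clip)
    if failed > total:
        raise ValueError("failed tasks cannot exceed submitted tasks")
    return total - failed


def estimate_cost(tasks: int, price_per_task: float) -> float:
    """Multiply task count by a caller-supplied placeholder price."""
    return round(tasks * price_per_task, 2)


# Task counts for the three scenarios listed above.
social_batch = billed_tasks(clips=100)                # 100 tasks
monthly_scale = billed_tasks(clips=1000)              # 1000 tasks
edit_workflow = billed_tasks(clips=1, edits_per_clip=3)  # 4 tasks
```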

Gemini Omni API Frequently Asked Questions

Everything you need to know about the product and billing.

Gemini Omni is Google's chat-based video generation model, announced at Google I/O 2026. Unlike Veo 3.1 — which is generation-first with cinematic text-to-video and image-to-video output — Gemini Omni treats editing as a first-class capability. Veo 3.1 still leads on raw cinematic fidelity and 4K output; Gemini Omni leads on multi-turn editing workflow.
Billed per task. Two routes: gemini-omni-pro for high-quality output and gemini-omni-flash for cost-efficient iteration. Audio generation is included. Check the Pricing table above for current rates.
No. EvoLink provides access via one API key. No Google Cloud project, no Vertex billing, no separate region approval. Same authentication as Veo 3.1 and Seedance 2.0 on EvoLink.
Pro is the higher-quality route for production output. Flash is the cost-efficient route for iteration and A/B testing. Both share the same async API — switch by changing the model parameter.
Yes. Pass a callback_url (HTTPS) when submitting the task and EvoLink will POST the result to your endpoint on completion, failure, or cancellation. Failed callbacks retry up to 3 times with 1s/2s/4s backoff. Polling the task status endpoint also works.
Failed tasks return a failed status with an error reason. Failed tasks are not billed. For application-level retry, treat the task as idempotent and resubmit with the same parameters.
Yes — this is Gemini Omni's core differentiator. Pass the previous task ID along with an edit instruction in natural language, and the model rewrites only the affected frames. Multi-turn editing in a single task is supported.
The current route generates clips up to approximately 10 seconds. For longer narratives, chain multiple clips using long-context character consistency.
Yes. Pass a reference image URL and Gemini Omni uses it as an identity anchor for the generated video.
Seedance 2.0 leads on raw text-to-video benchmark scores and supports the broadest multimodal reference inputs. Veo 3.1 is the cinematic baseline with 4K and 16s extension. Gemini Omni differentiates on chat-based editing and long-context consistency.
Yes. EvoLink exposes Gemini Omni, Veo 3.1, Nano Banana 2, and the rest of the Gemini family through a single API key. Switch by changing the model parameter.
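
The editing answer above describes passing the previous task ID together with a natural-language instruction. A minimal sketch of building such a request body follows. The field name previous_task_id is hypothetical, since this page does not document the edit request schema, and the default model value is taken from the route names mentioned in the integration steps.

```python
from typing import Dict


def build_edit_payload(
    previous_task_id: str,
    instruction: str,
    model: str = "gemini-omni-pro",
) -> Dict[str, str]:
    """Build a request body for one edit turn on an existing video task."""
    if not previous_task_id:
        raise ValueError("previous_task_id is required for an edit turn")
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    return {
        "model": model,
        "prompt": instruction.strip(),
        # Hypothetical field name: confirm against the live route documentation.
        "previous_task_id": previous_task_id,
    }


payload = build_edit_payload("task-video-xxxxxxxx", "make the lighting warmer")
```

Each edit turn would then be submitted and tracked like any other task, using the ID returned by the previous turn as the input to the next.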

All Gemini Video API Models

EvoLink provides unified access to Google's video and media model family through a single API key. All models share the same EvoLink API endpoint. Switch models with one parameter.

POST /v1/videos/generations

Create Gemini Omni Video Task

Confirm live route fields before production use

Submit a Gemini Omni video task through EvoLink using the live supported request fields.

Asynchronous processing returns a task ID. Use it to poll the task status, or provide callback_url when callback support is documented for the route.

Store completed outputs in your own system when result URLs are time-limited.
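
One way to follow this advice is to copy each finished file into your own storage as soon as the task completes. The sketch below uses only the Python standard library. The name of the field that carries the result URL is not given on this page, so the function takes the URL directly, and the destination path is whatever your system uses.

```python
import shutil
import urllib.request
from pathlib import Path


def store_output(result_url: str, dest: Path) -> Path:
    """Stream a completed video from its time-limited URL to local storage."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a partial download never
    # appears under the final file name.
    tmp = dest.with_suffix(dest.suffix + ".part")
    with urllib.request.urlopen(result_url, timeout=60) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
    tmp.replace(dest)
    return dest
```

Streaming with shutil.copyfileobj avoids holding the whole MP4 in memory, which matters once many clips are archived in a batch.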

Core Request Parameters

model (string, required; default: gemini-omni)

EvoLink model parameter for the Gemini Omni route. Confirm the live value before launch.

Example: gemini-omni

prompt (string, required)

Text prompt describing the desired video workflow.

Example: Create a short product video with smooth camera motion and clean studio lighting

callback_url (string, optional)

Optional HTTPS callback for task completion when supported by the live route.

Notes
  • Use polling if callback_url is not enabled for the route
  • Store outputs promptly when result URLs are time-limited

Example: https://your-domain.com/webhooks/video-task-completed

Request Example

{
  "model": "gemini-omni",
  "prompt": "Create a short product video with smooth camera motion and clean studio lighting",
  "callback_url": "https://your-domain.com/webhooks/video-task-completed"
}
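
The request above could be sent from Python roughly as follows. This is a sketch under assumptions: the API host is not stated on this page, so BASE_URL is a placeholder, and Bearer-token authentication is assumed rather than confirmed. Only the endpoint path and the three body fields come from the source.

```python
import json
import urllib.request

# Placeholder host: replace with the real EvoLink API base URL.
BASE_URL = "https://api.example.com"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST /v1/videos/generations request without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def submit(api_key: str, payload: dict) -> dict:
    """Send the request and return the decoded task object."""
    req = build_generation_request(api_key, payload)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


request = build_generation_request(
    "YOUR_API_KEY",
    {
        "model": "gemini-omni",
        "prompt": "Create a short product video with smooth camera motion and clean studio lighting",
        "callback_url": "https://your-domain.com/webhooks/video-task-completed",
    },
)
```

Separating request construction from sending makes it possible to inspect or log the exact body before any task is created.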

Response Example

{
  "id": "task-video-xxxxxxxx",
  "model": "gemini-omni",
  "object": "video.generation.task",
  "status": "pending",
  "progress": 0,
  "task_info": {
    "can_cancel": true
  },
  "type": "video"
}
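
A response of this shape can be decoded into a small typed object before it is stored or polled. The sketch below reads only the fields shown in the example above. Treating progress and task_info as optional is a defensive assumption, since this page does not say which fields are guaranteed.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoTask:
    id: str
    model: str
    status: str
    progress: int
    can_cancel: bool


def parse_task(raw: str) -> VideoTask:
    """Decode a task creation response into a VideoTask."""
    data = json.loads(raw)
    return VideoTask(
        id=data["id"],
        model=data["model"],
        status=data["status"],
        # Defaults below are assumptions for fields that might be absent.
        progress=int(data.get("progress", 0)),
        can_cancel=bool(data.get("task_info", {}).get("can_cancel", False)),
    )


task = parse_task("""
{
  "id": "task-video-xxxxxxxx",
  "model": "gemini-omni",
  "object": "video.generation.task",
  "status": "pending",
  "progress": 0,
  "task_info": {"can_cancel": true},
  "type": "video"
}
""")
```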