Kling O3 API

Kling O3 (V3 Omni) is a next-generation video model with text-to-video, image-to-video, reference-to-video, video editing, and custom element creation. It supports 3-15 second videos with per-second billing.

Billing Rules

• Price shown is per second
• Duration range: 3-10 seconds
• Total = price/second × duration
• Sound is forced off when video input is present

Pricing

Model	Mode	Quality	Price per second	Credits per second
Kling O3 Reference to Video	Video Generation	720p (Popular)	$0.1125	8.1
Kling O3 Reference to Video	Video Generation	1080p	$0.1501	10.8054
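
The billing rules and the pricing table above can be combined into a small estimator. This is a minimal sketch rather than an official calculator: the function name `estimate_credits` is hypothetical, and the per-second credit values are copied from the table, so check your dashboard for the prices that apply to your group.

```python
from decimal import Decimal

# Per-second credit prices copied from the pricing table above.
CREDITS_PER_SECOND = {"720p": Decimal("8.1"), "1080p": Decimal("10.8054")}

MIN_SECONDS = 3   # lower bound of the duration range
MAX_SECONDS = 10  # upper bound for reference-to-video


def estimate_credits(duration: int, quality: str = "720p") -> Decimal:
    """Return the estimated credits for one reference-to-video task."""
    if quality not in CREDITS_PER_SECOND:
        raise ValueError(f"unsupported quality: {quality!r}")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        raise ValueError(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds")
    # Total = price/second x duration
    return CREDITS_PER_SECOND[quality] * duration


print(estimate_credits(5))           # 40.5
print(estimate_credits(8, "1080p"))  # 86.4432
```

Decimal arithmetic is used so that totals such as 40.5 are not distorted by binary floating-point rounding.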

If a provider is down, requests are automatically routed to the next cheapest available one, ensuring 99.9% uptime at the best possible price.

Kling O3 API for next-generation video creation

Build with the latest Kling V3 Omni model. Generate videos from text, images, or references, and edit existing footage — all through one unified API with 3-15 second output support.

What can you build with the Kling O3 API?

Text-to-video creation

Generate videos directly from text prompts with Kling O3. Describe scenes, actions, and styles in natural language and let the model produce 3-15 second clips ready for marketing, social media, or creative projects.

Image and reference-driven video

Use images or reference videos to guide generation. Kling O3 supports image-to-video and reference-to-video modes, giving teams precise control over visual style, character consistency, and scene composition.

AI-powered video editing

Edit and transform existing footage with Kling O3's video editing mode. Apply style transfers, adjust scenes, and refine content without starting from scratch — ideal for iterating on commercial content at scale.

Why teams choose Kling O3

Kling O3 brings the latest V3 Omni architecture with four specialized modes — text-to-video, image-to-video, reference-to-video, and video editing — in a single model family.

Four specialized modes

Text, image, reference, and editing modes cover the full video creation workflow.

Latest V3 Omni architecture

Built on Kling's newest generation for improved quality and consistency.

Flexible 3-15s output

Generate videos from 3 to 15 seconds with per-second billing.

How to integrate the Kling O3 API

From input to production-ready video in three steps.

Choose your mode

Select text-to-video, image-to-video, reference-to-video, or video editing based on your workflow needs.

Submit a generation task

Send your request with prompts, images, or references. Track the async task until results are ready.

Review and iterate

Download results, compare variations, and reuse the same structure for fast iteration across campaigns.

Core capabilities of the Kling O3 API

Next-generation video AI with four specialized modes

Text

Text-to-video generation

Generate videos purely from text descriptions. Kling O3 interprets natural language prompts to produce dynamic video content without requiring any visual input.

Image

Image-to-video transformation

Transform static images into dynamic videos. Provide reference images and let Kling O3 animate them with natural motion and scene dynamics.

Reference

Reference video guidance

Use existing videos as references to guide new generation. This mode helps maintain visual consistency and style across multiple outputs.

Edit

AI video editing

Edit and transform existing footage with AI-powered tools. Apply style changes, scene adjustments, and creative transformations without manual editing.

Billing

Per-second billing

Pay only for what you generate with per-second billing. Videos range from 3 to 15 seconds, giving teams precise cost control for every project.

V3 Omni architecture

Built on Kling's latest V3 Omni foundation, delivering improved visual quality, better motion coherence, and more accurate prompt following.

Frequently Asked Questions

Everything you need to know about the product and billing.

The Kling O3 API provides access to Kling's latest V3 Omni video model through EvoLink. It supports four modes: text-to-video, image-to-video, reference-to-video, and video editing. Videos run from 3 up to 15 seconds with per-second billing; reference-to-video is capped at 10 seconds. Use your EvoLink dashboard for current pricing and availability.

Kling O3 offers four modes: text-to-video for generating from prompts, image-to-video for animating images, reference-to-video for style-guided generation using reference videos, and video editing for transforming existing footage. Each mode is optimized for different production workflows.

Kling O3 generates videos between 3 and 15 seconds. Billing is per-second within this range. Videos shorter than 3 seconds are billed at the 3-second minimum. This range is suitable for social media clips, ads, and short-form content.

Kling O3 uses per-second billing. Text-to-video and image-to-video are priced at 5.4 credits per second, while reference-to-video and video editing are priced at 8.1 credits per second. The minimum billing is 3 seconds and maximum is 15 seconds. Check your EvoLink dashboard for your group's specific pricing.

Kling O3 is built on the newer V3 Omni architecture and adds text-to-video as a new mode. It also introduces reference-to-video for style-guided generation. The video duration range is 3-15 seconds compared to O1's varying ranges. O3 represents the latest generation with improved quality and consistency.

Start with a clear subject and describe the action, mood, and setting in simple terms. For image-to-video, provide high-quality reference images. For reference-to-video, use videos that match your desired style. Consistency improves when your prompt structure stays stable across runs.

Limits, pricing, and available modes are determined by your provider and region. Use your EvoLink dashboard and API responses as the source of truth. Check the API documentation for the most current constraints and parameters.

API Reference

Authentication

All APIs require Bearer Token authentication.

Header

Authorization: Bearer YOUR_API_KEY
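
As an illustration of the header above, the following sketch builds the Authorization header in Python. The helper name `auth_headers` and the environment variable `EVOLINK_API_KEY` are assumptions made for this example, not names defined by the API.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the Bearer Token header required by every endpoint."""
    # EVOLINK_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("EVOLINK_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source control; passing it explicitly is convenient in tests.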

POST /v1/videos/generations

Create Video

Kling O3 Reference to Video (kling-o3-reference-to-video) generates videos guided by reference video style and motion features using the V3 Omni model. The reference video serves as a feature reference (not direct editing).

Processing is asynchronous: use the returned task ID to query the task status.

Generated video links are valid for 24 hours, so save them promptly.

Important Notes

A reference video is required (video_url, video_urls, or video).
Max duration: 10 seconds (shorter than text/image-to-video's 15s).
Sound is forced off when video input is present — sound parameter is ignored.
Video format: MP4/MOV, ≤ 200MB, ≥ 3s, 720-2160px, 24-60fps. Max 1 video.
With video input: images + subjects ≤ 4, no video-character subjects.
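
The notes above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, assuming the payload is a plain dict shaped like the request examples in this reference. The function name `validate_reference_request` is hypothetical, and the 2500-character prompt limit comes from the prompt parameter notes.

```python
def validate_reference_request(payload: dict) -> list:
    """Return a list of problems found in a reference-to-video payload."""
    problems = []

    # A reference video is required via video_url, video_urls, or video.
    if not any(payload.get(key) for key in ("video_url", "video_urls", "video")):
        problems.append("a reference video is required")

    # Duration is limited to 3-10 seconds in this mode (default 5).
    duration = payload.get("duration", 5)
    if not 3 <= duration <= 10:
        problems.append("duration must be between 3 and 10 seconds")

    # With video input, reference images plus subjects must not exceed 4.
    images = payload.get("image_urls") or []
    subjects = (payload.get("model_params") or {}).get("element_list") or []
    if len(images) + len(subjects) > 4:
        problems.append("images + subjects must be 4 or fewer")

    # Prompts are optional but capped at 2500 characters.
    if len(payload.get("prompt") or "") > 2500:
        problems.append("prompt exceeds 2500 characters")

    return problems
```

An empty list means none of these rules were violated; the server remains the source of truth for validation.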

Request Parameters

model (string, required; default: kling-o3-reference-to-video)

Video generation model name.

Example: kling-o3-reference-to-video

prompt (string, optional)

Text prompt describing what kind of video to generate with reference guidance.

Notes

Max 2500 characters
Optional

Example: Maintain the same motion style, switch to a snowy background.

video_url (string, required)

Reference video URL. At least one of video_url, video_urls, or video must be provided.

Notes

Priority: video_url and video_urls take the first video; video is lowest priority
Format: MP4/MOV
Max size: 200MB
Duration: ≥ 3 seconds
Resolution: 720-2160px width/height
Frame rate: 24-60fps
Max 1 video (multiple videos only use the first)

Example: https://example.com/reference.mp4
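
The reference video limits above can be checked locally before uploading. This is a sketch under stated assumptions: `VideoInfo` is a hypothetical container for metadata you have already measured with your own tooling, "200MB" is treated as 200 × 1024 × 1024 bytes, and the 720-2160px range is applied to both width and height.

```python
from dataclasses import dataclass


@dataclass
class VideoInfo:
    """Metadata the caller has already measured (hypothetical container)."""
    extension: str     # e.g. "mp4"
    size_bytes: int
    duration_s: float
    width: int
    height: int
    fps: float


MAX_BYTES = 200 * 1024 * 1024  # assumes "200MB" means 200 MiB


def reference_video_problems(info: VideoInfo) -> list:
    """Return the documented limits that this video violates."""
    problems = []
    if info.extension.lower().lstrip(".") not in {"mp4", "mov"}:
        problems.append("format must be MP4 or MOV")
    if info.size_bytes > MAX_BYTES:
        problems.append("file is larger than 200MB")
    if info.duration_s < 3:
        problems.append("duration must be at least 3 seconds")
    # Assumes the 720-2160px range applies to both width and height.
    if not all(720 <= side <= 2160 for side in (info.width, info.height)):
        problems.append("width and height must be 720-2160px")
    if not 24 <= info.fps <= 60:
        problems.append("frame rate must be 24-60fps")
    return problems
```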

image_urls (array, optional)

Optional reference image URLs for style/scene guidance.

Notes

Optional, for style/scene/subject reference
With video: images + subjects ≤ 4

Example: ["https://example.com/style.jpg"]

keep_original_sound (boolean, optional; default: true)

Whether to keep the original sound from the reference video.

Value	Description
true	Preserve original audio
false	Discard original audio

Example: true

duration (integer, optional; default: 5)

Specifies the generated video duration in seconds.

Notes

Range: 3-10 seconds (shorter than text/image-to-video's 15s)
Base price: 8.1 credits per second
Minimum billing: 3 seconds

Example: 5

aspect_ratio (string, optional)

Video aspect ratio.

Value	Description
16:9	Landscape video
9:16	Portrait video
1:1	Square video

Example: 16:9

quality (string, optional; default: 720p)

Video resolution quality. Affects billing multiplier.

Value	Description
720p	Standard 720P (1.0x base)
1080p	High quality 1080P (1.334x base)

Notes

Sound is forced off, so only quality affects the multiplier

Example: 720p

callback_url (string, optional)

HTTPS callback address after task completion.

Notes

Triggered on completion, failure, or cancellation
HTTPS only, no internal IPs
Max length: 2048 chars
Timeout: 10s, Max 3 retries

Example: https://your-domain.com/webhooks/video-task-completed
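
The callback constraints above lend themselves to a quick client-side check. The sketch below uses only the Python standard library. The function name `callback_url_problems` is hypothetical, and treating private, loopback, and link-local addresses as "internal" is an assumption about what the service rejects. Hostnames are not resolved, so only literal IP addresses are inspected.

```python
import ipaddress
from urllib.parse import urlsplit


def callback_url_problems(url: str) -> list:
    """Return the documented callback_url rules that this URL violates."""
    problems = []
    if len(url) > 2048:
        problems.append("URL is longer than 2048 characters")

    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("URL must use HTTPS")

    host = parts.hostname or ""
    if not host:
        problems.append("URL has no host")
        return problems

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return problems  # a DNS name, not a literal IP; nothing more to check

    # Assumption: "internal" covers private, loopback, and link-local ranges.
    if address.is_private or address.is_loopback or address.is_link_local:
        problems.append("URL must not point at an internal IP")
    return problems
```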

model_params.multi_shot (boolean, optional; default: false)

Enable multi-shot mode for generating videos with multiple camera angles or scenes.

Notes

When enabled, shot_type and multi_prompt become relevant

Example: true

model_params.shot_type (string, optional)

Shot type for multi-shot mode. Required when multi_shot is true.

Value	Description
customize	Custom per-shot prompts and durations

Notes

Only effective when multi_shot=true

Example: customize

model_params.multi_prompt (array, optional)

Per-shot prompt array. Required when multi_shot=true and shot_type=customize. Each item defines a shot segment.

Notes

Format: [{index: number, prompt: string, duration: string}, ...]
Max 6 shots
Total duration of all shots should match the requested duration

Example: [{"index": 1, "prompt": "Scene one", "duration": "3"}, {"index": 2, "prompt": "Scene two", "duration": "5"}]
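
To show how the three multi-shot parameters fit together, here is a sketch that assembles a `model_params` object from a list of shots. The helper name `build_multi_shot_params` is hypothetical. It follows the example above in numbering shots from 1 and sending durations as strings, and it treats a mismatch between the shot total and the requested duration as an error, which is stricter than the "should match" wording.

```python
def build_multi_shot_params(shots, total_duration):
    """Build model_params for multi-shot mode.

    shots is a list of (prompt, seconds) tuples in playback order.
    """
    if not shots:
        raise ValueError("at least one shot is required")
    if len(shots) > 6:
        raise ValueError("multi-shot mode accepts at most 6 shots")
    if sum(seconds for _, seconds in shots) != total_duration:
        raise ValueError("shot durations must add up to the requested duration")
    return {
        "multi_shot": True,
        "shot_type": "customize",
        "multi_prompt": [
            # Index starts at 1 and duration is a string, as in the example above.
            {"index": i, "prompt": prompt, "duration": str(seconds)}
            for i, (prompt, seconds) in enumerate(shots, start=1)
        ],
    }
```

For instance, `build_multi_shot_params([("Scene one", 3), ("Scene two", 5)], 8)` yields the same `multi_prompt` array as the example, to be sent with `"duration": 8`.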

model_params.element_list (array, optional)

Subject library list for referencing pre-trained subjects in the video.

Notes

Format: [{element_id: long}, ...]
No video-character subjects supported
With video: images + subjects ≤ 4
Reference subjects in prompt using <<<element_N>>> placeholder

Example: [{"element_id": 789012}]
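
A prompt that mentions a subject which is not in `element_list` is easy to miss. The sketch below flags such placeholders. It assumes that N in `<<<element_N>>>` is the 1-based position of the subject in `element_list`, which matches the request example with a single subject but is not stated explicitly in this reference. The function name `unknown_placeholders` is hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"<<<element_(\d+)>>>")


def unknown_placeholders(prompt: str, element_list: list) -> list:
    """Return placeholder numbers in the prompt that have no matching subject.

    Assumes N in <<<element_N>>> is the 1-based position in element_list.
    """
    referenced = {int(n) for n in PLACEHOLDER.findall(prompt)}
    return sorted(n for n in referenced if not 1 <= n <= len(element_list))
```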

model_params.watermark_info (object, optional)

Watermark configuration for the generated video.

Notes

Format: {enabled: boolean}

Example: {"enabled": false}

Request Example

{
  "model": "kling-o3-reference-to-video",
  "prompt": "Maintain the same motion style, switch to a snowy background",
  "video_url": "https://example.com/reference.mp4",
  "duration": 5,
  "aspect_ratio": "16:9",
  "quality": "720p"
}

Request Example (With Reference Image + Subject)

{
  "model": "kling-o3-reference-to-video",
  "prompt": "<<<element_1>>> walking in the same scene style",
  "video_url": "https://example.com/reference.mp4",
  "image_urls": ["https://example.com/style.jpg"],
  "duration": 8,
  "quality": "1080p",
  "keep_original_sound": false,
  "model_params": {
    "element_list": [{"element_id": 789012}]
  }
}
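
The request bodies above could be sent from Python roughly as follows, using only the standard library. This is a sketch, not official client code: `BASE_URL` is a placeholder because the API host is not stated in this section, the JSON `Content-Type` header is assumed from the JSON examples, and the function names are hypothetical.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not stated in this section.
BASE_URL = "https://api.example.com"


def build_generation_request(payload: dict, api_key: str,
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/videos/generations request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed from the JSON examples
        },
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the prepared request and decode the JSON task object."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating construction from sending makes the request easy to inspect or log before any network call is made.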

Response Example

{
  "created": 1757169743,
  "id": "task-unified-1757169743-o3r2v",
  "model": "kling-o3-reference-to-video",
  "object": "video.generation.task",
  "progress": 0,
  "status": "pending",
  "task_info": {
    "can_cancel": true,
    "estimated_time": 240,
    "video_duration": 5
  },
  "type": "video",
  "usage": {
    "billing_rule": "per_second",
    "credits_reserved": 40.5,
    "user_group": "default"
  }
}
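
Because tasks are asynchronous, a client typically polls until the task leaves the `pending` state. The status query endpoint is not described in this section, so the sketch below takes a caller-supplied `fetch_task` function (hypothetical) that returns the task object for an ID. Only `pending` appears in the response example; the terminal status names used here are assumptions and should be replaced with the documented values.

```python
import time
from typing import Callable

# Assumed terminal status names; only "pending" is confirmed by the example above.
ASSUMED_TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_task(
    task_id: str,
    fetch_task: Callable[[str], dict],
    terminal_statuses=ASSUMED_TERMINAL_STATUSES,
    interval_s: float = 5.0,
    max_polls: int = 120,
) -> dict:
    """Poll fetch_task(task_id) until the task reaches a terminal status."""
    for _ in range(max_polls):
        task = fetch_task(task_id)
        if task.get("status") in terminal_statuses:
            return task
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Supplying `callback_url` in the request avoids polling altogether; the loop is a fallback for clients that cannot receive webhooks.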