Kling O3 API
Price: $0.075 - $0.125 (~5.4 - 9 credits) per second of video
Highest stability with guaranteed 99.9% uptime. Recommended for production environments.
Use the same API endpoint for all versions. Only the model parameter differs.
At least one image is required (first frame, end frame, or reference images).
- First frame (video starts from this image): JPG, JPEG, or PNG; maximum file size 10MB; maximum files: 1
- End frame (video ends at this image): JPG, JPEG, or PNG; maximum file size 10MB; maximum files: 1
- Reference images (style/scene/subject references, not first/end frames): JPG, JPEG, or PNG; maximum file size 10MB; maximum files: 7
Billing Rules
- Price shown is per second
- Duration range: 3-15 seconds
- Total = price/second × duration
Pricing
| Model | Mode | Quality | Sound | Price |
|---|---|---|---|---|
| Kling O3 Image to Video | Video Generation | 720p | Off | $0.075/second (5.4 credits) |
| Kling O3 Image to Video | Video Generation | 720p | On | $0.100/second (7.2036 credits) |
| Kling O3 Image to Video | Video Generation | 1080p | Off | $0.100/second (7.2036 credits) |
| Kling O3 Image to Video | Video Generation | 1080p | On | $0.125/second (9.0018 credits) |
If it is down, we automatically switch to the next cheapest available option, ensuring 99.9% uptime at the best possible price.
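The billing rule (total = price/second × duration) can be sketched in a few lines. This is a minimal illustration rather than an official calculator: the rates are copied from the pricing table above, and the function name `estimate_cost_usd` is a hypothetical helper, not part of the API.

```python
from decimal import Decimal

# USD per second, copied from the pricing table above.
PRICE_PER_SECOND = {
    ("720p", "off"): Decimal("0.075"),
    ("720p", "on"): Decimal("0.100"),
    ("1080p", "off"): Decimal("0.100"),
    ("1080p", "on"): Decimal("0.125"),
}


def estimate_cost_usd(duration: int, quality: str = "720p", sound: str = "off") -> Decimal:
    """Return price/second x duration for one video (hypothetical helper)."""
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    try:
        rate = PRICE_PER_SECOND[(quality, sound)]
    except KeyError:
        raise ValueError(f"unknown quality/sound combination: {quality}/{sound}") from None
    return rate * duration


print(estimate_cost_usd(5))                 # 5 s at 720p, sound off
print(estimate_cost_usd(10, "1080p", "on"))  # 10 s at 1080p, sound on
```

`Decimal` is used instead of floats so that totals stay exact to the cent when many estimates are summed.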
Kling O3 (3.0 Omni) API Pricing, Playground, and Integration
Access Kling O3 through EvoLink's unified API gateway. Run text-to-video, image-to-video, reference-to-video, and video editing workflows with one integration, online testing, and 3-15 second output support.
Kling O3 pricing starts at $0.075 per second on EvoLink, compared to $0.084 on the official Kling API. Access all four video modes — text-to-video, image-to-video, reference-to-video, and video editing — with free credits to start.

Kling O3 overview and what changed from Kling 3.0
Kling O3 (Kling 3.0 Omni) is the most capable video model in the Kling AI family. It extends Kling 3.0 with reference-to-video and video editing — four modes total through a single API.
Choose O3 over standard Kling 3.0 when your workflow needs more than prompt-driven generation. Available on EvoLink at $0.075/s (vs $0.084 official) with free credits and playground access.
Kling O3 API video modes
Kling O3 Text-to-Video API
Generate videos directly from text prompts with Kling O3. Describe scenes, actions, and styles in natural language and let the model produce 3-15 second clips ready for marketing, social media, or creative projects.

Kling O3 Image-to-Video and Reference-to-Video API
Use images or reference videos to guide generation. Kling O3 supports image-to-video and reference-to-video modes, giving teams precise control over visual style, character consistency, and scene composition.

Kling O3 Video Editing API
Edit and transform existing footage with Kling O3's video editing mode. Apply style transfers, adjust scenes, and refine content without starting from scratch — ideal for iterating on commercial content at scale.

Why teams use Kling O3 through EvoLink
Kling O3 combines four production-ready video modes in one model family, while EvoLink gives teams unified access, predictable billing, and a faster integration path.
Four specialized modes
Text, image, reference, and editing modes cover the full video creation workflow.
Latest V3 Omni architecture
Built on Kling's newest generation for improved quality and consistency.
Flexible 3-15s output
Generate videos from 3 to 15 seconds with per-second billing.
How to integrate the Kling O3 API
Test a mode online, send an async request, and move approved outputs into production.
Choose your mode
Select text-to-video, image-to-video, reference-to-video, or video editing based on your workflow needs.
Submit a generation task
Send your request with prompts, images, or references. Track the async task until results are ready.
Review and iterate
Download results, compare variations, and reuse the same structure for fast iteration across campaigns.
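The "track the async task" step usually comes down to a polling loop. The sketch below makes that concrete under stated assumptions: the status-query endpoint is not described in this section, so the lookup is passed in as a `fetch_status` callable, and the status strings (`"completed"`, `"failed"`, `"cancelled"`) and the `video_url` field are hypothetical placeholders that may not match the real response.

```python
import time
from typing import Callable, Mapping

# Hypothetical status names; the real API may use different values.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], Mapping[str, object]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, object]:
    """Call fetch_status(task_id) repeatedly until a terminal status appears."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)


# Usage with a fake status source, so the sketch runs without network access.
responses = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "video_url": "https://example.com/out.mp4"},
])
result = wait_for_task("task-123", lambda _task_id: next(responses), sleep=lambda _s: None)
print(result["status"])
```

In production, `fetch_status` would wrap the real status query, and a `callback_url` webhook can replace polling entirely.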
Core capabilities of Kling O3
Four production-ready video modes through one unified API
Text-to-video generation
Generate videos purely from text descriptions. Kling O3 interprets natural language prompts to produce dynamic video content without requiring any visual input.
Image-to-video transformation
Transform static images into dynamic videos. Provide reference images and let Kling O3 animate them with natural motion and scene dynamics.
Reference video guidance
Use existing videos as references to guide new generation. This mode helps maintain visual consistency and style across multiple outputs.
AI video editing
Edit and transform existing footage with AI-powered tools. Apply style changes, scene adjustments, and creative transformations without manual editing.
Per-second billing
Pay only for what you generate with per-second billing. Videos range from 3 to 15 seconds, giving teams precise cost control for every project.
V3 Omni architecture
Built on Kling's latest V3 Omni foundation, delivering improved visual quality, better motion coherence, and more accurate prompt following.
All Kling AI Models
EvoLink provides unified API access to the full Kling model family. All models share the same API key, and you switch models by changing one parameter.
API Reference
Authentication
All APIs require Bearer Token authentication.
Authorization: Bearer YOUR_API_KEY
/v1/videos/generations
Create Video
Kling O3 Image to Video (kling-o3-image-to-video) transforms static images into dynamic videos using the V3 Omni model. Supports first frame, end frame, reference images, subject control, multi-shot, and sound effects.
Requests are processed asynchronously: use the returned task ID to query the task status.
Generated video links are valid for 24 hours, so save them promptly.
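As a rough illustration of the authentication and endpoint described above, the following sketch builds (but does not send) a request using only the standard library. Several details are assumptions: `BASE_URL` is a placeholder host, the `POST` method and JSON body encoding are not stated in this section, and `build_create_video_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the host given in your account documentation.
BASE_URL = "https://api.example.com"


def build_create_video_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a Bearer-authenticated request for /v1/videos/generations without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),  # assumed: JSON request body
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumed: the HTTP method is not stated in this section
    )


request = build_create_video_request(
    "YOUR_API_KEY",
    {
        "model": "kling-o3-image-to-video",
        "image_start": "https://example.com/first-frame.jpg",
        "duration": 5,
    },
)
print(request.get_method(), request.full_url)
```

Passing the object to `urllib.request.urlopen(request)` would submit the task. The response shape is not documented here, so the task ID should be read from whatever field the real response provides.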
Important Notes
- At least one image is required: first frame (image_start), end frame (image_end), or reference images (image_urls).
- First-frame priority: image_start > image > image_url.
- image_urls are reference images (not first/end frames).
- If total images > 2, end frame (image_end) is not supported.
- Image requirements: JPG/JPEG/PNG, ≤ 10MB, width/height ≥ 300px, aspect ratio 1:2.5 ~ 2.5:1.
- Video duration: 3-15 seconds, billed per second.
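The image rules above can be checked on the client before a task is submitted. The sketch below is one possible reading of those rules, not the service's own validation: it assumes "total images" counts the first frame, the end frame, and every reference image, and all function names are hypothetical.

```python
from typing import Optional

MIN_SIDE_PX = 300


def resolve_first_frame(params: dict) -> Optional[str]:
    """Apply the documented priority: image_start > image > image_url."""
    for key in ("image_start", "image", "image_url"):
        if params.get(key):
            return params[key]
    return None


def validate_image_inputs(params: dict) -> None:
    """Raise ValueError if the combination of image fields breaks the rules above."""
    first = resolve_first_frame(params)
    end = params.get("image_end")
    references = params.get("image_urls") or []
    # Assumption: the total counts first frame + end frame + reference images.
    total = len(references) + (1 if first else 0) + (1 if end else 0)
    if total == 0:
        raise ValueError("at least one image is required")
    if end and total > 2:
        raise ValueError("image_end is not supported when total images > 2")


def check_dimensions(width: int, height: int) -> None:
    """Check width/height >= 300px and aspect ratio within 1:2.5 to 2.5:1."""
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        raise ValueError("width and height must be at least 300px")
    # Integer form of 1/2.5 <= width/height <= 2.5, which avoids float rounding.
    if not (5 * width >= 2 * height and 2 * width <= 5 * height):
        raise ValueError("aspect ratio must be between 1:2.5 and 2.5:1")


validate_image_inputs({"image_start": "https://example.com/first-frame.jpg"})
check_dimensions(1280, 720)
```

File format and the 10MB size limit are not checked here, since they require access to the file itself rather than just its URL.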
Request Parameters
model (string, required). Default: kling-o3-image-to-video
Video generation model name.
Example: kling-o3-image-to-video

prompt (string, optional)
Text prompt describing what kind of motion and video to generate.
Notes
- Max 2500 characters
- Optional for image-to-video
- Reference elements using <<<element_1>>> syntax
Example: A gentle breeze moves through the scene, creating subtle motion and life.

image_start (string, optional)
First-frame image URL. At least one image is required: image_start, image_end, or image_urls.
Notes
- Priority: image_start > image > image_url
- JPG/JPEG/PNG format
- Max size: 10MB
- Width/height ≥ 300px, aspect ratio 1:2.5 ~ 2.5:1
Example: https://example.com/first-frame.jpg

image_end (string, optional)
End-frame image URL. Can be used alone or with image_start.
Notes
- Optional
- Last frame requires a first frame (image_start)
- Not supported when total images > 2
- Same format requirements as image_start
Example: https://example.com/end-frame.jpg

image_urls (array, optional)
Reference image URL array (not first/end frames). Used for style, scene, or subject reference.
Notes
- Optional
- These are reference images, NOT first/end frames
- Same format requirements as image_start
Example: ["https://example.com/ref-scene.jpg"]

duration (integer, optional). Default: 5
Specifies the generated video duration in seconds.
Notes
- Range: 3-15 seconds (integer)
- Base price: 5.4 credits per second
- Minimum billing: 3 seconds
Example: 5

aspect_ratio (string, optional)
Video aspect ratio. When a first-frame image is provided, this can be omitted (auto-adapts to image ratio).
| Value | Description |
|---|---|
| 16:9 | Landscape video |
| 9:16 | Portrait video |
| 1:1 | Square video |
Example: 16:9

quality (string, optional). Default: 720p
Video resolution quality. Affects billing multiplier.
| Value | Description |
|---|---|
| 720p | Standard 720P (1.0x base) |
| 1080p | High quality 1080P (1.334x base) |
Example: 720p

sound (string, optional). Default: off
Sound effect control. Affects billing multiplier.
| Value | Description |
|---|---|
| off | No sound effects (1.0x) |
| on | Generate sound effects (1.334x) |
Notes
- Combined multiplier: 720p+off=1.0x, 720p+on=1.334x, 1080p+off=1.334x, 1080p+on=1.667x
Example: off

callback_url (string, optional)
HTTPS callback address after task completion.
Notes
- Triggered on completion, failure, or cancellation
- HTTPS only, no internal IPs
- Max length: 2048 chars
- Timeout: 10s, Max 3 retries
Example: https://your-domain.com/webhooks/video-task-completed

model_params.multi_shot (boolean, optional). Default: false
Enable multi-shot mode for generating videos with multiple camera angles or scenes.
Notes
- When enabled, the prompt parameter is ignored; use multi_prompt instead
- Sum of all shot duration values must equal total video duration
Example: true

model_params.shot_type (string, optional)
Shot type for multi-shot mode. Required when multi_shot is true.
| Value | Description |
|---|---|
| customize | Custom per-shot prompts and durations |
Notes
- Only effective when multi_shot=true
Example: customize

model_params.multi_prompt (array, optional)
Per-shot prompt array. Required when multi_shot=true and shot_type=customize. Each item defines a shot segment.
Notes
- Format: [{index: number, prompt: string, duration: string}, ...]
- Max 6 shots, each shot prompt max 512 characters
- Sum of all shot durations must equal total video duration
- When used, top-level prompt can be empty
Example: [{"index": 1, "prompt": "Scene one", "duration": "5"}, {"index": 2, "prompt": "Scene two", "duration": "5"}]

model_params.element_list (array, optional)
Subject element list for consistent character appearance. Elements are created via the kling-custom-element model.
Notes
- Format: [{element_id: string}, ...]
- Max 3 elements when first-frame image is set
- Video character elements not supported (only multi-image elements supported)
- element_id is obtained from the kling-custom-element creation result
Example: [{"element_id": "123456"}]

model_params.watermark_info (object, optional)
Watermark configuration for the generated video.
Notes
- Format: {enabled: boolean}
Example: {"enabled": false}
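To show how the multi-shot parameters fit together, here is a minimal sketch that assembles a request body and enforces the documented limits (max 6 shots, 512 characters per shot prompt, total duration of 3-15 seconds). It assumes the dotted names such as `model_params.multi_shot` map to a nested `model_params` object in the JSON body, which this section does not confirm. `build_multi_shot_payload` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def build_multi_shot_payload(first_frame_url: str, shots: Sequence[Tuple[str, int]]) -> dict:
    """Build a request body from (prompt, seconds) pairs, one pair per shot."""
    if not 1 <= len(shots) <= 6:
        raise ValueError("multi_prompt supports between 1 and 6 shots")
    for prompt, seconds in shots:
        if len(prompt) > 512:
            raise ValueError("each shot prompt is limited to 512 characters")
        if seconds < 1:
            raise ValueError("each shot needs a positive duration")
    # Deriving duration from the shots keeps "sum of shot durations == total" true.
    total = sum(seconds for _, seconds in shots)
    if not 3 <= total <= 15:
        raise ValueError("total duration must be between 3 and 15 seconds")
    return {
        "model": "kling-o3-image-to-video",
        "image_start": first_frame_url,
        "duration": total,
        # Assumption: dotted parameter names are sent as a nested object.
        "model_params": {
            "multi_shot": True,
            "shot_type": "customize",
            "multi_prompt": [
                # Per-shot duration is a string, following the documented format.
                {"index": i, "prompt": prompt, "duration": str(seconds)}
                for i, (prompt, seconds) in enumerate(shots, start=1)
            ],
        },
    }


payload = build_multi_shot_payload(
    "https://example.com/first-frame.jpg",
    [("Scene one", 5), ("Scene two", 5)],
)
print(payload["duration"], len(payload["model_params"]["multi_prompt"]))
```

The top-level `prompt` is omitted because it is ignored when multi-shot mode is enabled.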