Kling O1 API
Kling O1 is a video generation model available in image-to-video, video editing, and fast video editing variants. It supports 3-20 second videos and accepts reference images for style-guided generation.
Kling O1 API for unified video generation and editing
Build modern video workflows with Kling O1. Use one API to create new clips from prompts, refine existing footage, and keep characters and scenes consistent across outputs for marketing, social content, and commerce.

What can you build with the Kling O1 API?
Prompt-to-video storytelling
Turn short creative briefs into videos with Kling O1 and keep the same look across multiple outputs. This is useful for social campaigns, brand series, or episodic content where consistency matters more than one-off experimentation.

Reference-driven edits
Use Kling O1 to refine or rework existing footage with instruction-based edits. Keep the core subject intact while adjusting style, lighting, or scene details so teams can iterate fast without a full re-shoot.

Commercial content at scale
Kling O1 is positioned for production teams in advertising, e-commerce, and social media. Use it to generate variations, keep brand tone consistent, and deliver content at the pace required by modern channels.

Why teams choose Kling O1
Kling O1 emphasizes unified creation and editing with consistency across characters and scenes, which reduces rework and keeps creative pipelines predictable.
Unified multimodal workflow
Text, image, video, and subject inputs live in one model.
Consistency-first outputs
Maintain recognizable characters and scenes across clips.
Production-friendly focus
Built for film, social, ads, and commerce workflows.
How to integrate the Kling O1 API
A simple flow from input to production-ready video.
Choose inputs and mode
Select text, image, video, or subject inputs based on your workflow and desired output type.
Submit a generation task
Send your request with instructions and any references, then track the task until results are ready.
Review and iterate
Download results, compare variations, and reuse the same structure for fast iteration.
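The "track the task" step can be sketched as a polling loop. This page says tasks are asynchronous and identified by a task ID, but it does not document the status endpoint, so the sketch below takes a caller-supplied `fetch_status` function. The function name `wait_for_task`, the `status` field, and the status values are all assumptions for illustration, not part of the documented API.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status names; the real API may use different values.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_task(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status(task_id) repeatedly until the task is finished.

    fetch_status is supplied by the caller and should return the task as a
    dict with a "status" key (an assumed shape, not a documented one).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Passing `fetch_status` and `sleep` as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise with a fake in tests. Because generated video links expire after 24 hours, the caller would typically download the result as soon as this function returns.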
Core capabilities of the Kling O1 API
Unified video creation and editing in one model
Unified multimodal engine
Kling O1 is introduced as a unified multimodal model that combines generation and editing in a single system. This allows teams to keep one integration while handling both new clip creation and edits across the same workflow.
Text, image, video, and subject inputs
Public descriptions highlight that Kling O1 supports text, image, video, and subject inputs. This gives creators more ways to control outputs and reduces guesswork when consistent results are required.
Consistency for characters and scenes
Kling O1 is positioned to address the consistency challenge in AI video generation. This helps teams keep character identity, props, and scene details aligned across multiple clips.
Generation plus editing workflows
Instead of switching tools, Kling O1 brings generation and editing tasks into one engine. This is useful for marketing teams that need to create, then refine, without breaking continuity.
Commercial content use cases
The model is described as suitable for film, television, social media, advertising, and e-commerce workflows. That makes it a practical choice for teams building content at scale.
Multimodal visual language
Kling O1 is built on a multimodal visual language framework. This helps it interpret intent across text and visual references so outputs align more closely with creative direction.
API Reference
Authentication
All APIs require Bearer Token authentication.
Authorization: Bearer YOUR_API_KEY

Create Video
/v1/videos/generations
The Kling O1 Image to Video model (kling-o1-image-to-video) transforms static images into dynamic videos.
Processing is asynchronous: the request returns a task ID, which you use to query the task status.
Generated video links are valid for 24 hours, so download and store the results promptly.
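As a minimal sketch of how a Create Video request might be assembled, the example below builds (but does not send) a request using only the Python standard library. The path `/v1/videos/generations`, the bearer header, and the parameter names come from this reference. The base URL is a placeholder, and the POST method and JSON body encoding are assumptions, since this page does not state them. The helper name `build_create_video_request` is hypothetical.

```python
import json
import urllib.request

# Placeholder host: only the path below is taken from this reference.
BASE_URL = "https://api.example.com"


def build_create_video_request(
    api_key,
    prompt,
    image_urls,
    aspect_ratio="16:9",
    duration=5,
    callback_url=None,
):
    """Build an unsent urllib request for the Create Video endpoint."""
    payload = {
        "model": "kling-o1-image-to-video",
        "prompt": prompt,
        "image_urls": list(image_urls),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if callback_url is not None:
        payload["callback_url"] = callback_url

    return urllib.request.Request(
        BASE_URL + "/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),  # assumption: JSON body
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: the HTTP method is not stated on this page
    )
```

The returned object can be passed to `urllib.request.urlopen` once the real base URL is known. The response format is not described here, so parsing the task ID out of the response is left to the caller.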
Important Notes
- At least one input image is required for image-to-video generation.
- Maximum 2 images per request.
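The image limits can be checked on the client before a task is submitted. The sketch below covers the count limit (1 to 2 images) and the formats listed for `image_urls` in this reference (.jpg, .jpeg, .png, .webp). Matching on the URL path extension is a heuristic assumed here for illustration, since the server may inspect the actual file. The 10MB size limit is not checked because that would require fetching each image. The name `validate_image_urls` is hypothetical.

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")
MAX_IMAGES = 2


def validate_image_urls(image_urls):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if not 1 <= len(image_urls) <= MAX_IMAGES:
        problems.append(f"expected 1 to {MAX_IMAGES} images, got {len(image_urls)}")
    for url in image_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not an absolute http(s) URL: {url}")
        elif not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
            # Heuristic only: judges the format by the path extension.
            problems.append(f"unsupported image format: {url}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue with a request at once.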
Request Parameters
model (string, required)
Default: kling-o1-image-to-video
Video generation model name.
Example: kling-o1-image-to-video

prompt (string, required)
Prompt describing what kind of motion and video to generate.
Notes
- Limited to 2000 tokens
Example: A gentle breeze moves through the scene, creating subtle motion and life.

image_urls (array, required)
Input image URL list for image-to-video generation.
Notes
- At least 1 image required
- Max 2 images per request
- Max size: 10MB per image
- Formats: .jpg, .jpeg, .png, .webp
- URLs must be directly viewable by the server
Example: ["http://example.com/image1.jpg", "http://example.com/image2.jpg"]

aspect_ratio (string, optional)
Default: 16:9
Video aspect ratio.
| Value | Description |
|---|---|
| 16:9 | Landscape video |
| 9:16 | Portrait video |
| 1:1 | Square video |
Example: '16:9'

duration (integer, optional)
Default: 5
Specifies the generated video duration in seconds.
| Value | Description |
|---|---|
| 5 | 5 seconds duration (Base price) |
| 10 | 10 seconds duration (2x price) |
Notes
- Billing is based on duration: 8.064 credits per second
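The per-second rate makes the cost of each duration option easy to work out: 5 seconds comes to 40.32 credits and 10 seconds to 80.64 credits. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding (the function name `estimate_credits` is hypothetical, and the sketch assumes no other charges apply):

```python
from decimal import Decimal

CREDITS_PER_SECOND = Decimal("8.064")
SUPPORTED_DURATIONS = (5, 10)


def estimate_credits(duration_s: int = 5) -> Decimal:
    """Estimate the credits billed for one video of the given duration."""
    if duration_s not in SUPPORTED_DURATIONS:
        raise ValueError(f"duration must be one of {SUPPORTED_DURATIONS}")
    return CREDITS_PER_SECOND * duration_s
```

For a batch of variations, the total is the sum of `estimate_credits` over the durations requested.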
Example: 5

callback_url (string, optional)
HTTPS callback address after task completion.
Notes
- Triggered on completion, failure, or cancellation
- Sent after billing confirmation
- HTTPS only, no internal IPs
- Max length: 2048 chars
- Timeout: 10s, Max 3 retries
Example: https://your-domain.com/webhooks/video-task-completed
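Because callbacks time out after 10 seconds and may be retried up to 3 times, a receiver generally wants to acknowledge quickly and tolerate duplicate deliveries. The sketch below shows one way to do that. The payload shape is not documented on this page, so the JSON body and the `task_id` field are assumptions, and `handle_callback` is a hypothetical helper that a web framework route would call with the raw request body.

```python
import json
import queue


def handle_callback(body: bytes, seen_task_ids: set, work_queue: queue.Queue) -> int:
    """Return the HTTP status code to send back for one callback delivery.

    Heavy work (downloading the video, updating records) is deferred to a
    consumer of work_queue so the response can be sent well within the
    10 second timeout.
    """
    try:
        event = json.loads(body)
        task_id = event["task_id"]  # hypothetical field name
    except (ValueError, KeyError, TypeError):
        return 400
    if not isinstance(task_id, str):
        return 400

    # A retried delivery for an already-seen task is acknowledged but not re-queued.
    if task_id not in seen_task_ids:
        seen_task_ids.add(task_id)
        work_queue.put(event)
    return 200
```

In a real deployment the in-memory set would likely be replaced with persistent storage so deduplication survives restarts.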