Kling 3.0 API
Price: $0.075-$0.150 (about 5.4-10.8 credits) per second of video
Highest stability with guaranteed 99.9% uptime. Recommended for production environments.
Use the same API endpoint for all versions. Only the model parameter differs.
Billing Rules
- Price shown is per second
- Duration range: 3-15 seconds
- Total = price/second × duration
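The billing rule above is plain arithmetic. A minimal sketch, assuming whole-second durations and a hypothetical helper name `total_price` (not part of any EvoLink SDK); `Decimal` is used to avoid floating-point rounding in currency values:

```python
from decimal import Decimal

MIN_DURATION = 3   # seconds, from the duration range above
MAX_DURATION = 15

def total_price(price_per_second: Decimal, duration: int) -> Decimal:
    """Total = price/second × duration, for a duration of 3-15 seconds."""
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(f"duration must be {MIN_DURATION}-{MAX_DURATION} seconds")
    return price_per_second * duration

# A 5-second clip at $0.075 per second
print(total_price(Decimal("0.075"), 5))  # 0.375
```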
Pricing
| Model | Mode | Quality | Sound | Price |
|---|---|---|---|---|
| Kling 3.0 Text to Video | Video Generation | 720p | Off | $0.075/second (5.4 credits) |
| Kling 3.0 Text to Video | Video Generation | 720p | On | $0.113/second (8.1 credits) |
| Kling 3.0 Text to Video | Video Generation | 1080p | Off | $0.100/second (7.1982 credits) |
| Kling 3.0 Text to Video | Video Generation | 1080p | On | $0.150/second (10.8 credits) |
If the current route is down, we automatically switch to the next-cheapest available one, ensuring 99.9% uptime at the best possible price.
Kling 3.0 API Pricing, Playground, and Integration
Access Kling 3.0 through EvoLink's unified API. Use text-to-video and image-to-video routes with async delivery, per-second pricing, and one integration path for production workflows.
Kling 3.0 pricing starts at $0.075 per second on EvoLink, compared to $0.084 on the official Kling API. Generate 3-15 second videos from text or images with free credits to start, no deposit required.

Kling 3.0 overview and version history
Kling 3.0 is the standard video generation model in the Kling AI family by Kuaishou. Two modes — text-to-video and image-to-video — produce 3-15 second clips at 720p or 1080p with per-second billing.
Compared to Kling 2.1 and 1.6, version 3.0 improved motion quality, scene coherence, and prompt adherence. It also added multi-shot support, AI sound effects, and subject control for consistent characters across clips. Access Kling 3.0 on EvoLink with free credits, a built-in playground, and pricing lower than the official rate.
Kling 3.0 API video modes and workflow features
Kling 3.0 Text-to-Video API
Generate videos directly from text prompts with Kling 3.0. Describe scenes, actions, and styles in natural language and let the model produce 3-15 second clips ready for marketing, social media, or creative projects.

Kling 3.0 Image-to-Video API
Use images to guide video generation. Kling 3.0 supports image-to-video mode, giving teams precise control over visual style, character consistency, and scene composition.

Kling 3.0 Multi-Shot and Sound Effects
Create complex multi-shot videos with scene transitions and add AI-generated sound effects. Kling 3.0 supports customizable shot sequences and audio generation for professional-quality output.

Why teams use Kling 3.0 through EvoLink
Kling 3.0 gives teams text-to-video and image-to-video access through one gateway, making pricing, routing, and production integration easier to manage.
One API for two core Kling 3.0 modes
Use the same integration path for text-to-video and image-to-video, instead of splitting implementation across separate vendor setups.
Cleaner production integration
Async task handling, one API key, and unified billing make it easier to run Kling 3.0 inside internal tools, creator products, and automation workflows.
Predictable per-second pricing
3-15 second output windows and visible quality options help teams estimate cost before sending production traffic.
How to integrate the Kling 3.0 API
From input to production-ready video in three steps.
Choose your mode
Select text-to-video or image-to-video based on your workflow needs.
Submit a generation task
Send your request with prompts or images. Track the async task until results are ready.
Review and iterate
Download results, compare variations, and reuse the same structure for fast iteration across campaigns.
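The submit-then-track flow in the second step can be sketched as a polling loop. This is a minimal sketch under stated assumptions: this page does not document the status endpoint or its status values, so `fetch_status` is a caller-supplied function, and the `status` field with its `"completed"` and `"failed"` values are hypothetical placeholders.

```python
import time
from typing import Callable

# Assumed terminal states; the real status values are not documented here.
TERMINAL_STATES = {"completed", "failed"}

def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
) -> dict:
    """Poll fetch_status(task_id) until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        time.sleep(interval)
```

In practice `fetch_status` would wrap an authenticated HTTP call to the task-status endpoint and return the parsed JSON response.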
Kling 3.0 API capabilities
Text-to-video and image-to-video access through one production-ready gateway
Text-to-video generation
Generate videos purely from text descriptions. Kling 3.0 interprets natural language prompts to produce dynamic video content without requiring any visual input.
Image-to-video transformation
Transform static images into dynamic videos. Provide reference images and let Kling 3.0 animate them with natural motion and scene dynamics.
Multi-shot support
Create complex multi-shot videos with customizable scene transitions, per-shot prompts, and duration control for professional video production.
Sound effects
Add AI-generated sound effects to your videos. Toggle sound on or off based on your needs, with transparent pricing for audio generation.
Per-second billing
Pay only for what you generate with per-second billing. Videos range from 3 to 15 seconds, giving teams precise cost control for every project.
720p & 1080p quality
Choose between standard 720p and high-quality 1080p output resolution to balance quality and cost for your specific use case.
All Kling AI Models
EvoLink provides unified API access to the full Kling model family. All models share the same API key, and you switch models by changing one parameter.
API Reference
Authentication
All APIs require Bearer Token authentication.
Authorization: Bearer YOUR_API_KEY

Create Video
Endpoint: /v1/videos/generations
Kling 3.0 Text to Video (kling-v3-text-to-video) generates videos from text prompts using the 3.0 model. Supports single-shot and multi-shot modes with optional sound effects.
Processing is asynchronous: use the returned task ID to query status.
Generated video links are valid for 24 hours; save them promptly.
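A minimal sketch of building the Create Video request with Python's standard library. The base URL `https://api.example.com` is a placeholder (this page gives only the path `/v1/videos/generations`), and the use of POST with a JSON body is an assumption rather than something the page states. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder; the real host is not given here
CREATE_PATH = "/v1/videos/generations"

def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated Create Video request."""
    return urllib.request.Request(
        url=BASE_URL + CREATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer Token authentication
            "Content-Type": "application/json",
        },
        method="POST",  # assumed; the page does not state the HTTP method
    )

request = build_create_request(
    "YOUR_API_KEY",
    {
        "model": "kling-v3-text-to-video",
        "prompt": "A golden retriever running through a sunlit meadow, cinematic slow motion.",
        "duration": 5,
    },
)
# urllib.request.urlopen(request) would send it; the response should carry the task ID.
```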
Important Notes
- Text-to-video mode: no image input required.
- Video duration: 3-15 seconds, billed per second.
- Pricing varies by quality and sound: 720p+off = 1.0x, 720p+on = 1.5x, 1080p+off = 1.333x, 1080p+on = 2.0x.
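The multipliers above combine with the base rate of 5.4 credits per second (the 720p, sound-off rate) as a simple lookup. A minimal sketch, with the hypothetical helper name `credits_for`:

```python
from decimal import Decimal

BASE_CREDITS_PER_SECOND = Decimal("5.4")  # 720p with sound off

# (quality, sound) -> billing multiplier, as listed in the notes above
MULTIPLIERS = {
    ("720p", "off"): Decimal("1.0"),
    ("720p", "on"): Decimal("1.5"),
    ("1080p", "off"): Decimal("1.333"),
    ("1080p", "on"): Decimal("2.0"),
}

def credits_for(duration: int, quality: str = "720p", sound: str = "off") -> Decimal:
    """Credits charged for a clip: base rate × multiplier × duration."""
    if not 3 <= duration <= 15:
        raise ValueError("duration must be 3-15 seconds")
    return BASE_CREDITS_PER_SECOND * MULTIPLIERS[(quality, sound)] * duration

print(credits_for(5))                  # 27.00
print(credits_for(5, "1080p", "off"))  # 35.9910
print(credits_for(10, "1080p", "on"))  # 108.00
```

Note that 5.4 × 1.333 gives 7.1982 credits per second, which matches the 1080p sound-off rate in the pricing table.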
Request Parameters
model (string, required, default: kling-v3-text-to-video)
Video generation model name.
Example: kling-v3-text-to-video

prompt (string, required)
Text prompt describing the video to generate. When multi_shot=true and shot_type=customize, this can be empty (use multi_prompt instead).
Notes
- Max 2500 characters
- Reference elements using <<<element_1>>> syntax in the prompt
Example: A golden retriever running through a sunlit meadow, cinematic slow motion.

duration (integer, optional, default: 5)
Specifies the generated video duration in seconds.
Notes
- Range: 3-15 seconds (integer)
- Base price: 5.4 credits per second
- Minimum billing: 3 seconds
Example: 5

aspect_ratio (string, optional)
Video aspect ratio.
| Value | Description |
|---|---|
| 16:9 | Landscape video |
| 9:16 | Portrait video |
| 1:1 | Square video |
Example: 16:9

quality (string, optional, default: 720p)
Video resolution quality. Affects billing multiplier.
| Value | Description |
|---|---|
| 720p | Standard 720P (1.0x base) |
| 1080p | High quality 1080P (1.333x base) |
Example: 720p

sound (string, optional, default: off)
Sound effect control. Affects billing multiplier.
| Value | Description |
|---|---|
| off | No sound effects (1.0x) |
| on | Generate sound effects (1.5x) |
Notes
- Combined multiplier: 720p+off=1.0x, 720p+on=1.5x, 1080p+off=1.333x, 1080p+on=2.0x
Example: off

callback_url (string, optional)
HTTPS callback address after task completion.
Notes
- Triggered on completion, failure, or cancellation
- HTTPS only, no internal IPs
- Max length: 2048 chars
- Timeout: 10s, Max 3 retries
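The callback constraints above can be checked client-side before a task is submitted. A minimal sketch using only the standard library; the helper name `validate_callback_url` is hypothetical, and treating private, loopback, and link-local IP literals as "internal IPs" is an assumption about what the service rejects. Hostnames are not resolved, so a DNS name pointing at an internal address would pass this check.

```python
import ipaddress
from urllib.parse import urlsplit

MAX_CALLBACK_URL_LENGTH = 2048

def validate_callback_url(url: str) -> None:
    """Raise ValueError if url breaks the documented callback constraints."""
    if len(url) > MAX_CALLBACK_URL_LENGTH:
        raise ValueError("callback_url exceeds 2048 characters")
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("callback_url must use HTTPS")
    host = parts.hostname
    if not host:
        raise ValueError("callback_url has no host")
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return  # a DNS name, not an IP literal; not resolved in this sketch
    if address.is_private or address.is_loopback or address.is_link_local:
        raise ValueError("callback_url must not point at an internal IP")
```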
Example: https://your-domain.com/webhooks/video-task-completed

model_params.multi_shot (boolean, optional, default: false)
Enable multi-shot mode for generating videos with multiple camera angles or scenes.
Notes
- When enabled with shot_type=customize, the top-level prompt is ignored; use multi_prompt instead
- Sum of all shot duration values must equal total video duration
Example: true

model_params.shot_type (string, optional)
Shot type for multi-shot mode. Required when multi_shot is true.
| Value | Description |
|---|---|
| customize | Custom per-shot prompts and durations |
| intelligence | AI auto-plans shots based on prompt |
Notes
- Only effective when multi_shot=true
Example: customize

model_params.multi_prompt (array, optional)
Per-shot prompt array. Required when multi_shot=true and shot_type=customize. Each item defines a shot segment.
Notes
- Format: [{index: number, prompt: string, duration: string}, ...]
- Max 6 shots, each shot prompt max 512 characters
- Sum of all shot durations must equal total video duration
- When used, top-level prompt can be empty
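The multi_prompt rules above can be checked before submission. A minimal sketch with a hypothetical helper `validate_multi_prompt`; it assumes each shot's duration is a string holding a whole number of seconds, as in the format note, and that at least one shot is required (the page states only the maximum).

```python
MAX_SHOTS = 6
MAX_SHOT_PROMPT_CHARS = 512

def validate_multi_prompt(shots: list, total_duration: int) -> None:
    """Raise ValueError if shots break the documented multi_prompt rules."""
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1-{MAX_SHOTS} shots, got {len(shots)}")
    for shot in shots:
        if len(shot["prompt"]) > MAX_SHOT_PROMPT_CHARS:
            raise ValueError(
                f"shot {shot['index']} prompt exceeds {MAX_SHOT_PROMPT_CHARS} characters"
            )
    # Durations arrive as strings, so convert before summing.
    shot_total = sum(int(shot["duration"]) for shot in shots)
    if shot_total != total_duration:
        raise ValueError(f"shot durations sum to {shot_total}, expected {total_duration}")

shots = [
    {"index": 1, "prompt": "A person on a hilltop", "duration": "5"},
    {"index": 2, "prompt": "Camera pulls back", "duration": "5"},
]
validate_multi_prompt(shots, total_duration=10)  # passes: 5 + 5 == 10
```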
Example: [{"index": 1, "prompt": "A person on a hilltop", "duration": "5"}, {"index": 2, "prompt": "Camera pulls back", "duration": "5"}]

negative_prompt (string, optional)
Negative prompt describing what you don't want in the video.
Notes
- Max 2500 characters
- Optional
Example: blurry, watermark, text, low quality

model_params.watermark_info (object, optional)
Watermark configuration for the generated video.
Notes
- Format: {enabled: boolean}
Example: {"enabled": false}
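Putting the parameters together, a single-shot request body can be assembled and checked locally. A minimal sketch, assuming a hypothetical helper `build_payload` and assuming that `model_params` is a nested object in the JSON body (the dotted names such as `model_params.watermark_info` suggest this, but the page does not show a full request).

```python
import json
from typing import Optional

MODEL = "kling-v3-text-to-video"
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
QUALITIES = {"720p", "1080p"}
SOUND_MODES = {"off", "on"}
MAX_PROMPT_CHARS = 2500

def build_payload(
    prompt: str,
    duration: int = 5,
    quality: str = "720p",
    sound: str = "off",
    aspect_ratio: Optional[str] = None,
    negative_prompt: Optional[str] = None,
) -> dict:
    """Assemble a single-shot request body, validating the documented limits."""
    if not prompt or len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt must be 1-2500 characters in single-shot mode")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    if quality not in QUALITIES:
        raise ValueError(f"quality must be one of {sorted(QUALITIES)}")
    if sound not in SOUND_MODES:
        raise ValueError(f"sound must be one of {sorted(SOUND_MODES)}")
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "duration": duration,
        "quality": quality,
        "sound": sound,
    }
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
        payload["aspect_ratio"] = aspect_ratio
    if negative_prompt is not None:
        if len(negative_prompt) > MAX_PROMPT_CHARS:
            raise ValueError("negative_prompt must be at most 2500 characters")
        payload["negative_prompt"] = negative_prompt
    # Assumed nesting for the dotted model_params.* parameter names.
    payload["model_params"] = {"watermark_info": {"enabled": False}}
    return payload

body = build_payload(
    "A golden retriever running through a sunlit meadow, cinematic slow motion.",
    aspect_ratio="16:9",
    negative_prompt="blurry, watermark, text, low quality",
)
print(json.dumps(body, indent=2))
```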