Wan 2.5 API
Turn short prompts or reference images into ready-to-post videos with synced audio using Wan 2.5 API on Evolink AI.
Wan 2.5 API for AI video with sound
Generate short HD videos with native audio, lip-sync, and social-ready framing from simple text or image inputs, all via a clean Evolink AI API.

What is Wan 2.5 API on Evolink AI?
Text-to-video with audio
Wan 2.5 API lets you send a short text prompt and receive a cinematic video clip complete with auto-generated sound, voice, or ambience so your content is ready for TikTok, Reels, and Shorts without extra editing. Instead of stitching tools together, you get visuals, pacing, and audio in one pass, which makes it easy to test hooks, concepts, and ad angles at scale across different social media accounts.

Image-to-video for product stories
With Wan 2.5 API you can upload a single key visual, such as a product shot or character design, and turn it into a short, dynamic clip that still feels consistent with the original image. This works well for turning static catalog images into scroll-stopping ads, motion posters, or story snippets where the camera moves, the light changes, and audio reinforces your brand message in a few seconds.

Built for social-first creators
Wan 2.5 API via Evolink AI is designed for social media creators and indie SaaS builders who care about speed, volume, and consistency more than academic benchmarks. It focuses on short 3–10 second clips, vertical and square formats, and audio that stays aligned with what is on screen. The resulting videos can drop straight into content calendars, UGC templates, or automated posting systems without hand-fixing every render.

Why choose Wan 2.5 API via Evolink AI?
Wan 2.5 API combines Alibaba’s audio-visual model with Evolink AI’s simple routing so you focus on ideas, not cloud configs or complex infrastructure.
Audio and video in one pass
Most AI video tools still make you juggle separate models for visuals and sound, which introduces friction, file management, and sync issues. Wan 2.5 API generates video and audio together, including lip-sync and ambient sound, so what you get already feels like a finished short-form clip. For creators and SaaS products that live or die by speed, that one-step workflow removes a lot of invisible overhead.
Simple Wan 2.5 integration
Wan 2.5 itself is an Alibaba Tongyi Wanxiang model, but Evolink AI wraps it into a straightforward Wan 2.5 API so you do not need to deal with region settings, separate console projects, or complex billing dashboards. You call a single endpoint with clear parameters and Evolink AI handles routing to the underlying Wan 2.5 model, making it much easier for developers and creators to plug video generation into their products or content workflows.
Optimized for short-form experimentation
Wan 2.5 API focuses on short, punchy clips in HD so you can rapidly iterate creative ideas instead of waiting for long renders. For marketers, agencies, and growth teams, this aligns perfectly with constant testing across audiences, geos, and hooks. You can generate many small, targeted variations, see what performs, and reinvest in the concepts that actually move metrics.
How to use Wan 2.5 API
Connect Wan 2.5 API through Evolink AI and move from prompt to published video in a few simple steps.
Connect your Evolink AI account
Sign up for or log in to Evolink AI, create an API key, and enable Wan 2.5 API access so your app can securely call the video generation endpoints without touching Alibaba Cloud directly.
Send prompts, images, and basic settings
Choose text-to-video or image-to-video, write a clear prompt, optionally upload a reference image, set the duration and aspect ratio, then send a simple JSON request to the Wan 2.5 API route.
Receive, review, and publish your clips
Fetch the generated Wan 2.5 video URL, preview audio and visuals, then plug it into your editor, scheduler, or SaaS interface for immediate download, posting, or further automation.
Key Wan 2.5 API features
Wan 2.5 API on Evolink AI focuses on real-world social and marketing use cases rather than lab demos, so every feature maps to a clear creator benefit.
Native audio and lip-sync
Wan 2.5 API renders video with audio by default, including voices, effects, or music, so your team no longer needs a separate soundtrack pipeline just to make clips feel alive.
Short HD clips for social feeds
The model is tuned for 3–10 second HD videos so you hit the sweet spot for TikTok, Reels, and ad placements without wasting budget or time on overly long renders that nobody watches.
Text or image as flexible input
You can start from a simple text script or reuse an existing image as your base, which lets you adapt Wan 2.5 API to ideation, product showcases, and creator tools inside the same stack.
Multi-language prompt and audio support
Wan 2.5 API is comfortable with Chinese and English prompts and can keep audio aligned, which is especially useful when your audience spans multiple regions and languages online.
Consistent motion and control
The model offers smoother motion and better camera dynamics than older Wan versions, so videos feel more cinematic and less like janky demos, even when you move fast on campaigns.
Built for automation and SaaS
Because Wan 2.5 API runs through Evolink AI, you can plug it into cron jobs, no-code tools, or full SaaS backends to auto-generate video assets based on schedules, feeds, or prompts.
Wan 2.5 API vs other AI video models
Compare Wan 2.5 API with leading AI video backbones on cost, duration, and ideal use cases so you can choose the right model for each project.
| Model | Duration | Resolution | Price | Strength |
|---|---|---|---|---|
| Wan 2.5 API | 3–10 second clips focused on short-form hooks and social stories | Up to 1080p HD with lower tiers at 480p and 720p for budget control | Around $0.05 per second for HD video generation in many pay-per-use setups | Balanced quality, cost, and speed with native audio and lip-sync for social-first workflows |
| Kling 2.6 | 5–10 second clips, with options for longer high-motion shots | Up to 1080p with strong motion realism and physics for complex scenes | Commonly around $0.07–$0.14 per second depending on resolution and priority tier | Very strong motion quality and physics, good for realistic avatar videos and dynamic product shots |
| Seedance 1.5 Pro | 4–12 second audio-video clips with flexible dialogue settings | Up to 1080p with tightly synchronized audio and video | Often positioned near $0.05 per second for 720p audio-video generation in competitive offerings | Joint audio-video model with precise lip-sync and dialogue control, great for talking heads and explainers |
| Sora 2 | 10–20 second cinematic clips suitable for hero assets | 720p to 4K with high-end cinematic quality and detailed motion | Typical guidance puts standard Sora 2 around $0.10 per second for 720p, with higher rates for 1080p and 4K | Top-tier realism and storytelling power for flagship campaigns and premium branded content |
| Veo 3 | Short to mid-length clips tuned for cinematic storytelling | High-resolution output up to 4K depending on provider and plan | Frequently listed close to $0.40 per second for higher-end video generations in external pricing tables | High-end cinematic aesthetic suitable for trailers, launch videos, and professional creative studios |
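To make the per-second figures in the table concrete, here is a small arithmetic sketch in Python. The rates are the approximate values quoted in the table (the low end where a range is given), not official pricing, and they refer to different resolutions, so the comparison is only indicative. The `APPROX_RATE_PER_SECOND` mapping and the `estimate_cost` helper are names made up for this illustration.

```python
# Approximate per-second rates in USD, copied from the comparison table.
# These are rough figures, not official price lists.
APPROX_RATE_PER_SECOND = {
    "Wan 2.5 API": 0.05,
    "Kling 2.6": 0.07,  # low end of the quoted $0.07-$0.14 range
    "Seedance 1.5 Pro": 0.05,
    "Sora 2": 0.10,
    "Veo 3": 0.40,
}


def estimate_cost(model: str, seconds: int, clips: int = 1) -> float:
    """Return an approximate cost in USD for `clips` clips of `seconds` each."""
    rate = APPROX_RATE_PER_SECOND[model]
    return round(rate * seconds * clips, 2)


if __name__ == "__main__":
    # Compare a batch of twenty 10-second clips across the listed models.
    for name in APPROX_RATE_PER_SECOND:
        cost = estimate_cost(name, seconds=10, clips=20)
        print(f"{name}: about ${cost:.2f}")
```

A batch estimate like this is mainly useful for budgeting variation tests before committing to one model for a campaign.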
API Reference
Authentication
All APIs require Bearer Token authentication.
Authorization: Bearer YOUR_API_KEY

/v1/videos/generations
Create Video
The Wan 2.5 image-to-video model (wan2.5-image-to-video) generates a video from an input image.
Asynchronous processing mode, use the returned task ID to .
Generated video links are valid for 24 hours, so save them promptly.
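As a rough illustration of the authentication and endpoint described above, the following Python sketch builds (but does not send) a request with the Bearer token header. The host `https://api.example.com` is a placeholder, the use of `POST` is an assumption because this section does not state the HTTP method, and `build_create_video_request` is a hypothetical helper name. The body fields reuse the example values from the parameter reference.

```python
import json
import urllib.request

# ASSUMPTION: placeholder host; the real base URL is not given in this section.
BASE_URL = "https://api.example.com"
ENDPOINT = "/v1/videos/generations"  # path taken from the reference above


def build_create_video_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build, without sending, a JSON request that carries the Bearer token."""
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # ASSUMPTION: the HTTP method is not stated here
    )


request = build_create_video_request(
    "YOUR_API_KEY",
    {
        "model": "wan2.5-image-to-video",
        "prompt": "A cat playing piano",
        "image_urls": ["https://example.com/image1.png"],
    },
)
# Sending it would look like: urllib.request.urlopen(request, timeout=30)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any credits are spent.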
Request Parameters
model (string, required, default: wan2.5-image-to-video)
Video generation model name.
Example: wan2.5-image-to-video

prompt (string, required)
Prompt describing what kind of video to generate from the input image.
Notes
- Limited to 2000 tokens
Example: A cat playing piano

duration (integer, optional)
Duration of the generated video, in seconds.
| Value | Description |
|---|---|
| 5 | 5 seconds |
| 10 | 10 seconds |
Notes
- Pre-charged based on the requested duration; the final charge is based on the duration of the generated video
Example: 5

quality (string, optional, default: 720p)
Video quality.
| Value | Description |
|---|---|
| 480p | Lower quality, lower price |
| 720p | Standard quality (default) |
| 1080p | High quality, higher price |
Example: 720p

image_urls (array, required)
List of reference image URLs; the image is used as the first frame for image-to-video generation.
Notes
- 1 image required for image-to-video generation
- Max size: 10MB per image
- Formats: .jpeg, .jpg, .png (no transparency), .bmp, .webp
- Resolution: width and height must each be between 360 and 2000 pixels
- URLs must be directly accessible by the server
Example: https://example.com/image1.png

prompt_extend (boolean, optional, default: true)
Whether to enable intelligent prompt rewriting.
Notes
- When enabled, a large language model will optimize the prompt
- Effective for prompts that lack detail or are too simple
Example: true

callback_url (string, optional)
HTTPS callback address called when the task finishes.
Notes
- Triggered on completion, failure, or cancellation
- Sent after billing confirmation
- HTTPS only, no internal IPs
- Max length: 2048 chars
- Timeout: 10s, Max 3 retries
Example: https://your-domain.com/webhooks/video-task-completed
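The documented limits can be checked on the client before a request is sent. The sketch below is one way to do that in Python. `build_payload` is a hypothetical helper, and its `duration=5` default is an assumption borrowed from the example value, since no default is documented. It does not check the 2000-token prompt limit, the image size or resolution, or whether the callback host is an internal IP, because those need a tokenizer, the image itself, or DNS resolution.

```python
from urllib.parse import urlparse

ALLOWED_DURATIONS = {5, 10}
ALLOWED_QUALITIES = {"480p", "720p", "1080p"}
MAX_CALLBACK_URL_LENGTH = 2048


def build_payload(prompt, image_url, duration=5, quality="720p",
                  prompt_extend=True, callback_url=None):
    """Validate arguments against the documented limits and return a request body."""
    if not prompt:
        raise ValueError("prompt is required")
    if not image_url:
        raise ValueError("exactly one image URL is required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5 or 10 seconds")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError("quality must be 480p, 720p, or 1080p")

    payload = {
        "model": "wan2.5-image-to-video",
        "prompt": prompt,
        "duration": duration,
        "quality": quality,
        "image_urls": [image_url],  # the reference asks for exactly one image
        "prompt_extend": prompt_extend,
    }

    if callback_url is not None:
        if urlparse(callback_url).scheme != "https":
            raise ValueError("callback_url must use HTTPS")
        if len(callback_url) > MAX_CALLBACK_URL_LENGTH:
            raise ValueError("callback_url must be at most 2048 characters")
        payload["callback_url"] = callback_url

    return payload


payload = build_payload(
    "A cat playing piano",
    "https://example.com/image1.png",
    callback_url="https://your-domain.com/webhooks/video-task-completed",
)
```

Failing fast on an invalid duration or a non-HTTPS callback avoids submitting a task that would be rejected or whose completion notice would never arrive.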