MiniMax-M2.5 API
$0.181 (~13 credits) per 1M input tokens; $0.719 (~51.8 credits) per 1M output tokens
$0.024 (~1.7 credits) per 1M cache read tokens
Web search tool charged separately per request.
Highest stability with guaranteed 99.9% uptime. Recommended for production environments.
Use the same API endpoint for all versions. Only the model parameter differs.
MiniMax-M2.5 API Pricing and Access for Reasoning Workloads
Route MiniMax-M2.5 through EvoLink for coding agents, repo Q&A, research, and document analysis with 204K context, built-in web search, and prompt caching. Start with OpenAI-compatible access and pricing from $0.18/1M input tokens.
Access and workflow fit
Best fit: Coding agents
Access: OpenAI-compatible
Context: 204K window
Built-in: Web search + caching

What can you build with the MiniMax-M2.5 API?
Intelligent Coding Assistants
Build coding copilots and autonomous agents that handle repo Q&A, code generation, bug triage, and review workflows. MiniMax-M2.5 is a strong fit when your product needs long-context code understanding and step-by-step reasoning in one text API.

Research & Analysis with Web Search
Use MiniMax-M2.5 for research agents, market scans, and knowledge workflows that need fresh web data. Search can be enabled only when needed, helping teams balance answer quality, latency, and cost.

Document Processing & Summarization
Process contracts, reports, support transcripts, and long internal knowledge bases without aggressive chunking. The 204K context window is useful for structured summaries, extraction pipelines, and document comparison tasks.

Why teams choose the MiniMax-M2.5 API
Teams choose MiniMax-M2.5 on EvoLink when they need long-context reasoning, predictable token pricing, and faster onboarding than a separate vendor-specific integration.
Lower-friction integration
Keep the OpenAI-style request shape, use one EvoLink key, and plug MiniMax-M2.5 into coding agents or gateway-style workflows without building a MiniMax-specific integration path first.
Predictable production cost
Visible token pricing makes budgeting easier: input starts at $0.18/1M, output at $0.72/1M, and cache reads at $0.024/1M for repeated prompts.
Reasoning plus live retrieval
Use 204K context for large prompts and turn on built-in web search for research or verification flows that need fresh information.
How to integrate the MiniMax-M2.5 API
Keep your existing OpenAI client, point it to EvoLink, set the model to MiniMax-M2.5, and use the same route for coding-agent, repo Q&A, and long-context workflows.
Step 1 — Authenticate
Create an EvoLink API key, set the EvoLink base URL, and send requests with standard Bearer authentication.
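A minimal sketch of this step using the official `openai` Python SDK, which handles Bearer authentication for you. The base URL below is a placeholder, not the real endpoint; substitute the one from your EvoLink dashboard.

```python
from openai import OpenAI

# Point the standard OpenAI client at EvoLink. The SDK sends the key
# as a Bearer token on every request.
client = OpenAI(
    api_key="YOUR_EVOLINK_API_KEY",             # your EvoLink key
    base_url="https://api.evolink.example/v1",  # placeholder; use your EvoLink base URL
)
```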
Step 2 — Set required fields
Send `model: MiniMax-M2.5` with your `messages` array. Reuse stable system prompts and prefixes to benefit from prompt caching on repeated workloads.
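A sketch of the minimal request, reusing the `client` from Step 1. The system prompt and user message are illustrative; the key point is that `model` and `messages` are the only required fields, and keeping the system prompt byte-identical across calls lets repeated prefixes qualify for cache-read pricing.

```python
# A stable prefix: reuse this exact string across requests so the
# shared prefix can be served from the prompt cache.
STABLE_SYSTEM_PROMPT = "You are a code-review assistant for our monorepo."

response = client.chat.completions.create(
    model="MiniMax-M2.5",
    messages=[
        {"role": "system", "content": STABLE_SYSTEM_PROMPT},
        {"role": "user", "content": "Summarize the failing tests in this diff: ..."},
    ],
)
print(response.choices[0].message.content)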
Step 3 — Tune outputs
Adjust temperature, top_p, max_tokens, and stream as usual. Turn on `enable_search` only when needed, then choose `search_strategy: turbo` or `max` based on latency and coverage.
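A sketch of output tuning plus opt-in web search. Standard sampling fields go through as usual; `enable_search` and `search_strategy` are provider-specific, so passing them via the SDK's `extra_body` is an assumption about how the gateway forwards them, and the user message is illustrative.

```python
# Tune sampling and stream tokens; enable search only for this request.
stream = client.chat.completions.create(
    model="MiniMax-M2.5",
    messages=[{"role": "user", "content": "What changed in this week's release notes?"}],
    temperature=0.3,
    top_p=0.9,
    max_tokens=1024,
    stream=True,
    extra_body={
        "enable_search": True,       # turn on only when fresh web data is needed
        "search_strategy": "turbo",  # "turbo" favors latency, "max" favors coverage
    },
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

Because search adds per-request cost and latency, gating `enable_search` per call (rather than globally) keeps routine requests cheap while research flows still get live retrieval.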
MiniMax-M2.5 API features for production teams
Concrete controls and deployment details rather than a generic model overview.
Reasoning model for text workloads
Use MiniMax-M2.5 for coding, structured analysis, and multi-step text tasks where response quality matters more than lightweight chat output.
204K Context Window
Fit long documents, large prompts, and multi-turn context into one request before you reach for aggressive chunking or multi-pass orchestration.
Search modes for fresh data
Enable real-time retrieval with `enable_search: true` and choose `turbo` or `max` depending on whether speed or broader coverage is more important.
OpenAI SDK Compatible
Move existing OpenAI-style clients onto MiniMax-M2.5 by changing the base URL and model name instead of rebuilding your integration path for coding tools or internal agents.
Prompt Caching
Repeated prefixes and system prompts can be billed more efficiently, which helps recurring agent workflows and high-volume production traffic.
Alibaba Cloud deployment path
This route is deployed on Alibaba Cloud, giving EvoLink users low-latency access and a production-oriented delivery path.
MiniMax-M2.5 API FAQs
Everything you need to know about the product and billing.