Product Pricing

Explanation: Prices exclude applicable taxes. Specific tax obligations are subject to local tax regulations and will be calculated at checkout based on your jurisdiction. Here, 1M = 1,000,000. The prices in the table represent the cost per 1M tokens consumed.
kimi-k2 series models will be officially discontinued on May 25, 2026 and will no longer be maintained or supported. Please use the latest Kimi model kimi-k2.6 for continued support and enhanced reasoning capabilities.
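
As a rough illustration of the per-1M-token pricing described above, the following sketch estimates the pre-tax cost of a single request. The `Price` class, the `estimate_cost` function, the split into separate input and output prices, and the example prices are all assumptions made for illustration; take actual prices from the pricing table.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000  # "1M" as defined in the pricing note


@dataclass(frozen=True)
class Price:
    """Hypothetical container for per-1M-token prices, excluding tax."""
    input_per_million: float
    output_per_million: float


def estimate_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the pre-tax cost of one request, in the currency of `price`."""
    input_cost = input_tokens / TOKENS_PER_MILLION * price.input_per_million
    output_cost = output_tokens / TOKENS_PER_MILLION * price.output_per_million
    return input_cost + output_cost


# Made-up prices for illustration only; they are not taken from the pricing table.
example_price = Price(input_per_million=1.00, output_per_million=4.00)
example_cost = estimate_cost(example_price, input_tokens=250_000, output_tokens=50_000)
```

Taxes are not included in this estimate, since they are calculated at checkout based on jurisdiction.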

Model Description

  • Kimi K2 is a Mixture-of-Experts (MoE) foundation model with exceptional coding and agent capabilities, featuring 1 trillion total parameters and 32 billion activated parameters. In benchmark evaluations covering general knowledge reasoning, programming, mathematics, and agent-related tasks, the K2 model outperforms other leading open-source models
  • kimi-k2-0905-preview: Context length 256k. Based on kimi-k2-0711-preview, with enhanced agentic coding abilities, improved frontend code quality and practicality, and better context understanding
  • kimi-k2-turbo-preview: Context length 256k. High-speed version of Kimi K2, always aligned with Kimi K2 (kimi-k2-0905-preview). Same model parameters as Kimi K2, with an output speed of 60 tokens/sec, up to a maximum of 100 tokens/sec
  • kimi-k2-0711-preview: Context length 128k
  • kimi-k2-thinking: Context length 256k. A thinking model with general agentic and reasoning capabilities, specializing in deep reasoning tasks (see Usage Notes)
  • kimi-k2-thinking-turbo: Context length 256k. High-speed version of kimi-k2-thinking, suitable for scenarios requiring both deep reasoning and extremely fast responses
  • Supports ToolCalls, JSON Mode, Partial Mode, and internet search functionality
  • Does not support vision functionality
  • Supports automatic context caching. Cached tokens are charged at the cache-hit input price. You can view cost details of type “context caching” in the console
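
The context lengths listed above can be captured in a small lookup table to check which models can hold a given prompt. This is only a sketch: the `CONTEXT_LENGTH_K` table and the `models_that_fit` helper are hypothetical names, and reading "k" as 1,000 tokens is an assumption, since the exact token count is not stated here.

```python
# Context lengths from the model list, in units of "k" tokens.
CONTEXT_LENGTH_K = {
    "kimi-k2-0905-preview": 256,
    "kimi-k2-turbo-preview": 256,
    "kimi-k2-0711-preview": 128,
    "kimi-k2-thinking": 256,
    "kimi-k2-thinking-turbo": 256,
}

# Assumption: "k" is read as 1,000 tokens, the more conservative interpretation.
TOKENS_PER_K = 1_000


def models_that_fit(required_tokens: int) -> list[str]:
    """Return the models whose context length can hold `required_tokens`."""
    return [
        name
        for name, length_k in CONTEXT_LENGTH_K.items()
        if required_tokens <= length_k * TOKENS_PER_K
    ]


# A 200,000-token request excludes the 128k model under this reading.
candidates = models_that_fit(200_000)
```

The required token count should cover both the prompt and the expected output, since both occupy the context window.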