Shanghai Firm Bets on Open-Source Strategy, Efficiency Claims

Shanghai artificial intelligence startup MiniMax released a new open-source large language model, positioning it as a direct competitor to American and other Chinese models.
The company released the MiniMax-M1 model under the Apache 2.0 license, making it fully open source – unlike many rivals that publish their models under more restrictive terms. Meta’s Llama family is governed by a community license that restricts commercial use, and DeepSeek’s models are only partially open source. M1’s licensing permits unrestricted commercial and research use.
“In complex, productivity-oriented scenarios, M1’s capabilities are top-tier among open-source models,” MiniMax asserted in a blog post accompanying the launch. The company claims M1 surpasses domestic closed-source models and approaches the leading overseas models, while offering what it describes as “the industry’s best cost-effectiveness.”
MiniMax says its model performs competitively on benchmark tests against leading proprietary and open models, including OpenAI’s o3, Google’s Gemini 2.5 Pro, Anthropic’s Claude 4 Opus and DeepSeek R1. The company cited evaluations on AIME 2024, LiveCodeBench, SWE-bench Verified, Tau-bench and MRCR. MiniMax did not rank its model universally superior, instead saying that performance varies across benchmarks. As with most vendor-supplied results, the claims have not been independently verified, but the model’s code and weights are available on GitHub.
MiniMax says that M1’s ability to handle long-context tasks is one of its key advantages. The model can process up to one million input tokens, a context window that rivals Google’s Gemini 2.5 Pro and is eight times the capacity of DeepSeek R1. Its output generation reaches 80,000 tokens, higher than DeepSeek’s 64,000 but trailing OpenAI’s o3, which handles up to 100,000 output tokens.
The company says its in-house architecture, which includes a “lightning attention” mechanism, enables more efficient training and inference, especially in long-context scenarios. MiniMax says this method allows the M1 model to process 80,000-token reasoning tasks using about 30% of the computing power required by DeepSeek R1.
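MiniMax’s documentation describes lightning attention as a variant of linear attention, in which attention is computed through a running state rather than a full token-by-token comparison, so per-token cost stays constant as the context grows. The sketch below illustrates that general idea only; the feature map and all implementation details here are illustrative assumptions, not the company’s actual mechanism.

```python
import numpy as np

def causal_linear_attention(Q, K, V):
    """Causal linear attention: O(n * d^2) work instead of the O(n^2 * d) of
    standard softmax attention. A running (d x d) state summarizes all past
    tokens, so each new token costs the same regardless of context length."""
    n, d = Q.shape
    phi = lambda x: np.maximum(x, 0) + 1e-6  # simple positive feature map (assumption)
    Qf, Kf = phi(Q), phi(K)
    S = np.zeros((d, V.shape[1]))  # running sum of outer products k_t v_t^T
    z = np.zeros(d)                # running sum of k_t, for normalization
    out = np.empty_like(V)
    for t in range(n):
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z + 1e-6)
    return out
```

Because the state `S` has fixed size, memory and compute scale linearly with sequence length, which is why this family of methods is attractive for million-token contexts.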
MiniMax describes a reinforcement learning strategy called CISPO as part of its approach to optimizing training efficiency. Details of the method are included in the model’s technical documentation, which the company published alongside the code. The reinforcement learning phase of training used 512 Nvidia H800 GPUs over three weeks, which MiniMax estimates cost $537,400 in rental fees.
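Taken at face value, the stated figures imply a rental rate of roughly $2 per H800 GPU-hour. This back-of-the-envelope check assumes "three weeks" means exactly 21 days of continuous use, which the company does not specify:

```python
# Implied rental rate from MiniMax's stated RL training cost.
gpus = 512
hours = 21 * 24            # assumes 3 weeks = 21 full days (not stated by MiniMax)
total_cost = 537_400       # USD, per MiniMax's estimate
rate = total_cost / (gpus * hours)
print(f"~${rate:.2f} per H800 GPU-hour")
```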
The release adds to competition in China’s large model landscape, where firms such as DeepSeek, Alibaba-backed Qwen and Baidu’s Ernie are vying to match or surpass Western players. MiniMax itself is backed by investors such as Alibaba Group, Tencent and IDG Capital.