Alibaba Cloud’s Qwen team on May 20 announced Qwen3.7-Max, a proprietary large language model built for autonomous agent tasks such as coding, office automation and long-horizon execution that spans hours without human intervention.

The Qwen team positions the model as a shift from chat-based artificial intelligence (AI) to systems that can independently plan, reason and execute complex workflows. It is available immediately through Alibaba Cloud Model Studio, with API access that supports integration with multiple agent frameworks, including Anthropic’s Claude Code, OpenClaw and Qwen Code, according to the team [1][2]. A companion model, Qwen3.7-Plus, adds vision input capabilities [3].

Performance Benchmarks

On specialized coding benchmarks, Qwen3.7-Max scored 60.6 on SWE-Pro, 78.3 on SWE-Multilingual and 69.7 on Terminal Bench 2.0-Terminus, outperforming several competitor models, according to the Qwen team [4][5]. On reasoning benchmarks, the model achieved 92.4 on GPQA Diamond and 97.1 on HMMT 2026 Feb, according to the same source.

On general-purpose agent benchmarks, Qwen3.7-Max scored 60.8 on MCP-Mark and 76.4 on MCP-Atlas. The model also scored 56.6 on the Artificial Analysis Intelligence Index, according to a separate report [6]. Based on the figures released by the Qwen team, these results place Qwen3.7-Max among the top-tier models for coding and reasoning.

Agent Capabilities and Use Cases

The model functions as a coding agent capable of frontend prototyping and complex multi-file engineering, and as an office productivity assistant through MCP (Model Context Protocol) integrations and multi-agent orchestration, according to the Qwen team [1][7]. It demonstrated sustained autonomous execution in a 35-hour kernel optimization run involving over 1,000 tool calls, achieving a 10.0x speedup over a reference implementation, the team stated [8][9][10].

“Qwen3.7-Max proves that the autonomous agent era is no longer a theoretical projection; it is a present reality capable of executing complex tasks,” VentureBeat reported, citing the Qwen team [8]. The model’s ability to operate for extended periods without human oversight marks a departure from earlier AI models, which required frequent user prompting.

Agent Scaling and Training Methodology

The Qwen team used an environment scaling approach that increases the quality and diversity of agentic training environments, and it ran benchmark evaluations on unseen, out-of-domain environments, according to the team’s technical description [1][2]. Training used cross-harness and cross-verifier reinforcement learning (RL) to improve generalization.

During training, the model autonomously monitored for reward hacking and added 13 new heuristic rules over 80 hours of training to prevent exploitation of the reward signal, according to the Qwen team [1][2]. This self-monitoring capability reflects a growing emphasis on training robustness and alignment in advanced AI systems.

Availability and Conclusion

Qwen3.7-Max is now available through Alibaba Cloud Model Studio, with API access priced at $1.20 per million tokens, according to pricing data [11][3]. The model supports a preserve_thinking feature for agentic tasks, and the Qwen team said complex projects that typically require one to two weeks of team effort can be completed within hours [1][5].
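
For a rough sense of what the quoted rate means for a long agent run, the arithmetic can be sketched in Python. The sketch assumes a single flat rate of $1.20 per million tokens for all tokens, which the cited pricing data does not confirm, and the workload figure (20,000 tokens per tool call) is hypothetical rather than a number from the Qwen team.

```python
# Assumption: one flat rate applies to every token (input and output alike).
PRICE_PER_MILLION_TOKENS_USD = 1.20


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Return the estimated API cost in US dollars for a given token count."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 1,000 tool calls averaging 20,000 tokens each.
    tool_calls = 1_000
    tokens_per_call = 20_000  # illustrative guess, not a reported figure
    total = tool_calls * tokens_per_call
    print(f"{total:,} tokens cost about ${estimate_cost_usd(total):.2f}")
```

Under those assumptions, a run of that size would consume 20 million tokens. Actual costs depend on how the provider bills input, output and cached tokens.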

As centralized AI development accelerates, the emergence of models capable of sustained autonomous execution raises broader questions about control and dependency. While the technology can boost productivity, its reliance on centralized cloud infrastructure reinforces the kind of top-down control that advocates of decentralization warn against [12]. For now, Qwen3.7-Max represents a significant milestone in the move toward AI agents that act independently for extended periods.

References

  1. qwen.ai. “Qwen3.7: The Agent Frontier”. May 20, 2026.
  2. Alibaba Cloud Community. “Qwen3.7: The Agent Frontier”.
  3. Codersera. “Qwen 3.7 Max: Alibaba’s May 2026 Flagship Guide”.
  4. DataCamp. “Qwen3.7-Max: Features, Benchmarks, and the Agent Frontier”.
  5. ExplainX. “Qwen 3.7-Max: The Agent Frontier and Long-Horizon Autonomy”.
  6. Marktechpost. “Qwen Introduces Qwen3.7-Max: A Reasoning Agent Model With a 1M-token Context Window”. May 22, 2026.
  7. YouTube. “Qwen3.7-Max: The Agent Frontier”.
  8. VentureBeat. “Alibaba’s proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic’s Claude Code”. May 22, 2026.
  9. TechTimes. “Qwen3.7-Max Wrote Its Own Chip’s Software in 35-Hour Run: Alibaba’s Full-Stack Bet”. May 21, 2026.
  10. AskSurf. “Qwen3.7-Max shifts the frontier from chat to sustained agent execution”. May 22, 2026.
  11. Automatio. “Qwen 3.7 Max: The Agent Frontier at $1.20/M tokens”.
  12. NaturalNews.com. “The Decentralization Trifecta How Battery Tech Robotics & Local AI Will Set You Free”. February 06, 2026.
  13. Dimick William. “Python 3 books in 1 Beginners guide Data science and Machine learning”.
