Alibaba unveils new contender among AI language models

January 29, 2026 This week, Alibaba Cloud’s Qwen Team unveiled Qwen3-Max-Thinking, a proprietary language model designed to compete directly with top-tier reasoning systems from U.S. and European labs. The Chinese AI lab says the model can match and, in some benchmarks, surpass the reasoning performance of OpenAI’s GPT-5.2 and Google’s Gemini 3 Pro, while running at a fraction of the cost.

The release lands at a pivotal moment. Western firms have largely defined what “reasoning models” look like: systems often described as applying System-2 style logic to multi-step problem solving. Qwen’s latest results suggest that the gap between those firms and their Chinese rivals has narrowed or, in some cases, closed.

Qwen is already a familiar name in AI circles. Over the past year, the team has shipped a wide range of open-source models spanning text, image, and speech, earning adoption well beyond China. The models even drew praise from Airbnb CEO Brian Chesky, who previously said the company was using Qwen’s free, open models as a cost-effective alternative to U.S.-based offerings.

Unlike earlier releases, Qwen3-Max-Thinking is proprietary. According to the Qwen Team, the goal was not incremental improvement but architectural leverage.

At the core of the model is a technique the team calls “test-time scaling,” which allows the system to spend more compute on difficult problems while avoiding redundant reasoning. Instead of generating many answers and selecting the best one, Qwen3-Max-Thinking iteratively refines its reasoning, identifying dead ends early and focusing processing power on unresolved uncertainties.
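The idea of spending compute in proportion to difficulty can be illustrated with a toy sketch. This is not Qwen's implementation, whose details the team has not published here; it is an analogy that assumes a "reasoning problem" can be stood in for by a numerical estimate refined step by step. The names `refine_until_confident`, `make_sqrt_task`, and the `uncertainty` function are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RefineResult:
    answer: float
    steps_used: int


def refine_until_confident(
    initial: float,
    refine: Callable[[float], float],
    uncertainty: Callable[[float], float],
    tolerance: float = 1e-6,
    max_steps: int = 50,
) -> RefineResult:
    """Refine a single candidate answer, stopping as soon as it is good enough.

    Easy problems exit after few or zero steps; hard ones consume more of
    the budget. No pool of independent answers is generated and ranked.
    """
    answer = initial
    for step in range(max_steps):
        if uncertainty(answer) <= tolerance:
            return RefineResult(answer, step)
        answer = refine(answer)
    return RefineResult(answer, max_steps)


def make_sqrt_task(target: float) -> Tuple[Callable[[float], float], Callable[[float], float]]:
    """Toy stand-in for a reasoning problem: approximate sqrt(target)."""

    def refine(x: float) -> float:
        # One Newton step toward sqrt(target).
        return 0.5 * (x + target / x)

    def uncertainty(x: float) -> float:
        # Residual used as a crude proxy for remaining doubt.
        return abs(x * x - target)

    return refine, uncertainty


easy = refine_until_confident(1.0, *make_sqrt_task(1.0))
hard = refine_until_confident(1.0, *make_sqrt_task(1_000_000.0))
print("easy steps:", easy.steps_used)
print("hard steps:", hard.steps_used)
```

The easy task stops immediately because the starting guess is already correct, while the hard task uses more refinement steps before its residual falls below the tolerance. A best-of-N approach, by contrast, would spend the same fixed budget on both.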

Alibaba Cloud says this approach delivers measurable gains. On GPQA, a PhD-level science benchmark, the model improved its score from 90.3 to 92.8. On LiveCodeBench v6, performance rose from 88.0 to 91.4. In agentic tasks that combine reasoning with live web search, Qwen3-Max-Thinking also outperformed competing models on “Humanity’s Last Exam,” a benchmark designed to resist memorization.

Beyond raw reasoning, the model integrates autonomous tool use. It can decide when to search the web, run code, or retrieve stored context without explicit user prompts.
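A rough sketch of what such tool routing looks like in an agent loop follows. Everything here is an assumption made for illustration: the tool functions are stubs, and the keyword-based `choose_tool` policy is a hypothetical placeholder for a decision that, in a real model, would be learned rather than hard-coded.

```python
import re
from typing import Callable, Dict, Optional


# Hypothetical tool stubs. Real tools would call a search API,
# a code sandbox, or a memory store.
def web_search(query: str) -> str:
    return f"[search results for: {query}]"


def run_code(query: str) -> str:
    return f"[executed code for: {query}]"


def recall_context(query: str) -> str:
    return f"[stored context matching: {query}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "run_code": run_code,
    "recall_context": recall_context,
}


def choose_tool(query: str) -> Optional[str]:
    """Toy routing policy: pick a tool name, or None to answer directly."""
    q = query.lower()
    if re.search(r"\b(latest|today|news|current)\b", q):
        return "web_search"
    if re.search(r"\b(compute|calculate|run|simulate)\b", q):
        return "run_code"
    if re.search(r"\b(earlier|previously|remember)\b", q):
        return "recall_context"
    return None


def answer(query: str) -> str:
    """Dispatch to a tool if the policy selects one; otherwise answer directly."""
    tool = choose_tool(query)
    if tool is None:
        return f"[direct answer to: {query}]"
    return TOOLS[tool](query)


print(answer("What is the latest GPU news?"))
print(answer("Calculate 17 factorial"))
print(answer("Explain recursion"))
```

The point of the sketch is the control flow: the user never names a tool, and the system decides per query whether one is needed at all.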

Pricing appears to be a key part of the strategy. Alibaba Cloud has positioned Qwen3-Max-Thinking below many flagship Western models on a per-token basis, while charging separately for advanced agent features such as web search and agent orchestration. Some tools, including code execution and web extraction, are free for a limited time.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
