Alibaba unveils new contender in AI language models

January 29, 2026

This week, Alibaba Cloud’s Qwen Team unveiled Qwen3-Max-Thinking, a proprietary language model designed to compete directly with top-tier reasoning systems from U.S. and European labs. The Chinese AI lab says the model can match, and in some benchmarks surpass, the reasoning performance of OpenAI’s GPT-5.2 and Google’s Gemini 3 Pro, while running at a fraction of the cost.

The release lands at a pivotal moment. Western firms have largely defined what “reasoning models” look like: systems often described as using System-2 style logic for multi-step problem solving. Qwen’s latest results suggest the gap between those firms and Chinese labs has narrowed or, in some cases, closed.

Qwen is already a familiar name in AI circles. Over the past year, the team has shipped a wide range of open-source models spanning text, image, and speech, earning adoption well beyond China. The models even drew praise from Airbnb CEO Brian Chesky, who previously said the company was using Qwen’s free, open models as a cost-effective alternative to U.S.-based offerings.

Unlike earlier releases, Qwen3-Max-Thinking is proprietary. According to the Qwen Team, the goal was not incremental improvement but architectural leverage.

At the core of the model is a technique the team calls “test-time scaling,” which allows the system to spend more compute on difficult problems while avoiding redundant reasoning. Instead of generating many answers and selecting the best one, Qwen3-Max-Thinking iteratively refines its reasoning, identifying dead ends early and focusing processing power on unresolved uncertainties.
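The general idea of spending more compute only where it is needed can be shown with a toy sketch. This is not the Qwen Team's implementation, which the article does not describe in detail. It assumes a hypothetical `refine` step and a hypothetical `confidence` score, and uses a simple numeric problem (approximating a square root) as a stand-in for a reasoning task.

```python
from typing import Callable, Tuple


def refine_until_confident(
    initial: float,
    refine: Callable[[float], float],
    confidence: Callable[[float], float],
    threshold: float = 0.999,
    max_steps: int = 20,
) -> Tuple[float, int]:
    """Refine one answer repeatedly, stopping once confidence is high enough.

    Easy inputs exit after a few steps; hard inputs use more of the budget.
    All names here are illustrative assumptions, not a real model API.
    """
    answer = initial
    steps = 0
    while confidence(answer) < threshold and steps < max_steps:
        answer = refine(answer)
        steps += 1
    return answer, steps


TARGET = 2.0  # toy task: find x such that x * x is close to TARGET


def newton_step(x: float) -> float:
    # One refinement step (Newton's method for the square root of TARGET).
    return 0.5 * (x + TARGET / x)


def confidence(x: float) -> float:
    # Toy confidence score: approaches 1.0 as x * x approaches TARGET.
    return 1.0 - min(1.0, abs(x * x - TARGET))


# A good starting guess stands in for an easy problem, a poor one for a hard problem.
easy_answer, easy_steps = refine_until_confident(1.5, newton_step, confidence)
hard_answer, hard_steps = refine_until_confident(100.0, newton_step, confidence)

print(f"easy: {easy_steps} steps, hard: {hard_steps} steps")
```

In this sketch the harder starting point consumes more refinement steps before the stopping rule is met, which mirrors the idea of allocating compute by difficulty instead of generating a fixed number of candidate answers for every problem.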

Alibaba Cloud says this approach delivers measurable gains. On GPQA, a PhD-level science benchmark, the model improved its score from 90.3 to 92.8. On LiveCodeBench v6, performance rose from 88.0 to 91.4. In agentic tasks that combine reasoning with live web search, Qwen3-Max-Thinking also outperformed competing models on “Humanity’s Last Exam,” a benchmark designed to resist memorization.

Beyond raw reasoning, the model integrates autonomous tool use. It can decide when to search the web, run code, or retrieve stored context without explicit user prompts.

Pricing appears to be a key part of the strategy. Alibaba Cloud has positioned Qwen3-Max-Thinking below many flagship Western models on a per-token basis, while charging separately for advanced agent features such as web search and agent orchestration. Some tools, including code execution and web extraction, are free for a limited time.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
