Qwen3-Math

In August 2024, the Alibaba Cloud Dev Team introduced the first mathematical LLMs in the Qwen family with the release of the Qwen2-Math series. A month later, they followed up with the open-source Qwen2.5-Math series, which includes the base models Qwen2.5-Math-1.5B/7B/72B, the instruction-tuned models Qwen2.5-Math-1.5B/7B/72B-Instruct, and the Qwen2.5-Math-RM-72B reward model.

In contrast to the Qwen2-Math series, which only supported Chain-of-Thought (CoT) reasoning for solving English math problems, the upcoming Qwen3-Math series will extend support to both CoT and Tool-Integrated Reasoning (TIR) for solving math problems in both Chinese and English. This expansion will result in substantial performance improvements across Chinese and English mathematics benchmarks when using CoT reasoning, marking a significant upgrade from the Qwen2.5-Math models.

While CoT is essential for improving the reasoning abilities of large language models, it struggles with tasks that demand high computational precision or complex mathematical and algorithmic procedures, such as finding the roots of a quadratic equation or computing the eigenvalues of a matrix. TIR addresses these challenges by letting the model delegate precise computation, symbolic manipulation, and algorithmic steps to an external tool rather than carrying them out in free text. Leveraging TIR, the Qwen3-Math-1.5B/7B/72B-Instruct models are expected to score strongly on the MATH benchmark, demonstrating significant improvements in these areas.
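To make the TIR idea concrete, here is a minimal, self-contained sketch of the kind of code a tool-integrated model might emit for the two example tasks above. The helper names are hypothetical and use only the standard library; the actual interpreter sandbox behind the Qwen models is not described here.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots")
    r = math.sqrt(disc)
    return sorted(((-b - r) / (2 * a), (-b + r) / (2 * a)))

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix [[a, b], [c, d]], obtained from its
    characteristic polynomial x^2 - trace*x + det = 0."""
    (a, b), (c, d) = m
    return solve_quadratic(1, -(a + d), a * d - b * c)

# Instead of estimating these values in its chain of thought, a TIR model
# would emit code like this, execute it, and fold the exact result back
# into its reasoning.
print(solve_quadratic(1, -5, 6))        # roots of x^2 - 5x + 6 -> [2.0, 3.0]
print(eigenvalues_2x2([[2, 1], [1, 2]]))  # -> [1.0, 3.0]
```

The key design point is that the model's text carries the reasoning while the interpreter carries the arithmetic, so precision no longer depends on the model's token-by-token calculation.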

Explore Qwen models on Hugging Face.