The Qwen team at Alibaba has released a new version of its open-source reasoning AI model, posting several impressive benchmark results.
Meet Qwen3-235B-A22B-Thinking-2507. Over the past three months, the Qwen team has been working to scale what it calls the model's "thinking capability", improving both the quality and depth of its reasoning.
The result of their efforts is a model that excels at tasks which usually demand human expertise: logical reasoning, complex mathematics, scientific problems, and advanced coding. In these areas, the new Qwen model sets the standard for open-source models.
On reasoning benchmarks, Qwen's latest open-source model scores 92.3 on AIME25 (mathematics) and 74.1 on LiveCodeBench v6 (coding). It also holds its own on broader capability tests, earning 79.7 on Arena-Hard v2, which measures how well a model's outputs match human preferences.
At the heart of the model is its sheer scale: 235 billion parameters in total. However, it uses a Mixture-of-Experts (MoE) architecture, which means only a small portion of those parameters (approximately 22 billion) is active at any one time. Think of it as a large team of 128 specialists on call, of whom only the eight best suited to a particular task are brought in to actually do the work.
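The routing idea behind MoE can be sketched in a few lines of Python. The expert count (128) and top-k (8) below come from the article; everything else (the gating scores, the function name, the dimensions) is made up for illustration and is not Qwen's actual implementation:

```python
import numpy as np

NUM_EXPERTS = 128  # experts available per MoE layer
TOP_K = 8          # experts actually activated per token

def route_token(gate_logits: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Pick the TOP_K highest-scoring experts for one token and
    normalise their scores into mixing weights (softmax over the chosen 8)."""
    top = np.argsort(gate_logits)[-TOP_K:]  # indices of the 8 best experts
    weights = np.exp(gate_logits[top])
    weights /= weights.sum()                # weights now sum to 1
    return top, weights

# Hypothetical gating scores for a single token:
rng = np.random.default_rng(0)
experts, weights = route_token(rng.normal(size=NUM_EXPERTS))
print(len(experts))  # 8 — only these experts' parameters run for this token
```

Because each token touches only 8 of the 128 experts, the compute per token corresponds to roughly 22B of the 235B parameters.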
Perhaps one of its most impressive features is its huge memory: the model has a native context length of 262,144 tokens, a major advantage for tasks that involve digesting large amounts of information.
For developers and tinkerers, the Qwen team has made it easy to get started. The model is available on Hugging Face, and you can deploy it using tools like SGLang or vLLM to create your own API endpoint. The team also points to its Qwen-Agent framework as the best way to make use of the model's tool-calling skills.
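As a rough deployment sketch (exact flags vary by library version and hardware; the model ID is the one published on Hugging Face, and the 8-GPU tensor-parallel setting is an assumption for a model of this size):

```shell
# Option 1: serve with vLLM behind an OpenAI-compatible endpoint
vllm serve Qwen/Qwen3-235B-A22B-Thinking-2507 --tensor-parallel-size 8

# Option 2: serve with SGLang
python -m sglang.launch_server \
    --model-path Qwen/Qwen3-235B-A22B-Thinking-2507 \
    --tp 8
```

Either command exposes an HTTP endpoint you can query with any OpenAI-compatible client.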
To get the best performance from the model, the Qwen team shares some tips. It suggests an output length of around 32,768 tokens for most tasks, raised to 81,920 tokens for very complex tasks to give the AI enough room to "think". It also helps to give the model specific instructions in the prompt, such as asking it to "reason step by step" on a maths problem, for the most accurate and clearly structured answers.
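Those tips translate directly into request parameters. A minimal sketch, where the helper name is hypothetical and the OpenAI-style parameter name is an assumption, but the token budgets are the ones the Qwen team recommends:

```python
def generation_params(complex_task: bool = False) -> dict:
    """Request parameters following the Qwen team's guidance:
    ~32K output tokens for most tasks, ~80K for very hard ones."""
    return {
        # Generous output budget so the model has room to "think":
        "max_tokens": 81920 if complex_task else 32768,
        # The step-by-step instruction goes in the prompt itself,
        # e.g. append "Please reason step by step." for maths problems.
    }

print(generation_params())                   # {'max_tokens': 32768}
print(generation_params(complex_task=True))  # {'max_tokens': 81920}
```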
This new Qwen release offers a powerful open-source reasoning AI that is comparable to some of the best proprietary models, especially on complex, brain-bending tasks. It will be exciting to see what developers ultimately build with it.
(Image by Tung Lam)
Want to learn more about AI and big data from industry leaders? Check out the AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events, including the Intelligent Automation Conference, BlockX, Digital Transformation Week, and the Cyber Security & Cloud Expo.
Check out other upcoming Enterprise Technology events and webinars with TechForge here.