A chip stacking strategy has emerged as China’s innovative response to U.S. semiconductor regulations, but can this approach really close the performance gap with Nvidia’s advanced GPUs? As the U.S. government tightens export controls on cutting-edge chip manufacturing technology, Chinese researchers are proposing a bold workaround: stacking older chips that can be produced domestically to approach the performance of the advanced chips that can no longer be accessed.
Core Concept: Build Up, Not Forward
The chip stacking strategy is based on a deceptively simple premise: if you can’t make more advanced chips, use the chips you can make to build smarter systems. Wei Shaojun, vice president of the China Semiconductor Industry Association and a professor at Tsinghua University, recently outlined to the South China Morning Post an architecture that uses three-dimensional hybrid bonding to combine 14-nanometer logic chips with 18-nanometer DRAM.
This matters because U.S. export regulations specifically target the production of logic chips below 14nm and DRAM below 18nm. Wei’s proposal works precisely within these technological limits, using processes that remain accessible to Chinese manufacturers.
The technical approach includes something called “software-defined near-memory computing.” Instead of moving data back and forth between processors and memory, the main bottleneck for AI workloads, the chip stacking strategy places processors and memory in close proximity through vertical stacking.
3D hybrid bonding technology creates direct copper-to-copper connections with a pitch of less than 10 micrometers, essentially eliminating the physical distance that slows down traditional chip architectures.
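Why proximity helps can be illustrated with a back-of-envelope roofline model. The sketch below uses entirely hypothetical throughput and bandwidth figures (none come from the article) to show how a memory-bound workload, such as the matrix-vector multiplies common in AI inference, speeds up when memory bandwidth rises, while compute capacity barely matters:

```python
# Illustrative "memory wall" model. All numbers are hypothetical
# assumptions for demonstration, not measurements of any real device.

def workload_time(flops, bytes_moved, peak_flops, mem_bw):
    """Roofline-style estimate: execution time is bounded by whichever
    is slower, compute (flops/peak_flops) or data movement (bytes/bw)."""
    return max(flops / peak_flops, bytes_moved / mem_bw)

# A matrix-vector multiply (typical of AI inference) with an 8192x8192
# FP16 weight matrix: ~2*N^2 FLOPs, but the whole matrix must be read.
n = 8192
flops = 2 * n * n                 # ~134 million FLOPs
bytes_moved = 2 * n * n           # FP16 weights, 2 bytes each: ~128 MiB

peak = 120e12                     # 120 TFLOPS, the figure Wei cites
far_memory_bw = 2e12              # 2 TB/s: assumed conventional off-chip DRAM
near_memory_bw = 10e12            # 10 TB/s: assumed stacked near-memory link

t_far = workload_time(flops, bytes_moved, peak, far_memory_bw)
t_near = workload_time(flops, bytes_moved, peak, near_memory_bw)
print(f"conventional: {t_far*1e6:.1f} us, near-memory: {t_near*1e6:.1f} us")
```

In both cases the data-movement term dominates the compute term, which is exactly the regime the near-memory argument targets.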
Performance claims and reality checks
Wei claims this configuration could rival Nvidia’s 4nm GPUs while significantly reducing cost and power consumption, citing performance of 2 TFLOPS per watt, or 120 TFLOPS in total. There’s just one problem: Nvidia’s A100 GPU, the chip Wei uses for comparison, actually delivers up to 312 TFLOPS, more than 2.5 times the claimed figure.
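The arithmetic behind that reality check is simple enough to reproduce. The sketch below uses only the figures quoted above (2 TFLOPS per watt, 120 TFLOPS total, and the A100’s published 312 TFLOPS dense FP16 Tensor Core peak):

```python
# Sanity-check the cited figures. These are the numbers from the
# article and Nvidia's published A100 specs, not new measurements.
claimed_tflops_per_watt = 2.0
claimed_total_tflops = 120.0
a100_fp16_tflops = 312.0   # A100 dense FP16 Tensor Core peak

implied_power_w = claimed_total_tflops / claimed_tflops_per_watt
gap = a100_fp16_tflops / claimed_total_tflops
print(f"implied power draw: {implied_power_w:.0f} W")  # 60 W
print(f"A100 advantage: {gap:.1f}x")                   # 2.6x
```

The implied 60 W envelope is where the efficiency argument lives: even at 2.6x lower peak throughput, the claimed design would undercut the A100’s power budget by a wide margin, if the numbers hold.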
This discrepancy highlights questions about the feasibility of the chip stacking strategy. The architectural innovation is real, but the performance gap remains large: stacking older chips doesn’t magically erase the benefits of advanced process nodes, which deliver better power efficiency, higher transistor density, and better thermal performance.
Why China is betting on this approach
The strategic logic behind the chip stacking strategy goes beyond pure performance metrics. Huawei’s founder, Ren Zhengfei, has articulated the philosophy: “Achieve cutting-edge performance by stacking and clustering chips, rather than competing on a node-by-node basis.” This represents a change in how China approaches its semiconductor challenges.
Consider the alternative: TSMC and Samsung are pushing 3nm and 2nm processes that remain completely out of reach for Chinese manufacturers. Rather than fighting an unwinnable battle for process node leadership, the chip stacking strategy proposes competing on system architecture and software optimization.
Then there is CUDA. Nvidia’s dominance in AI computing lies not only in its hardware but also in its CUDA software ecosystem. Wei describes this as a “triple dependency” spanning model, architecture, and ecosystem.
Chinese chip designers pursuing traditional GPU architectures must either replicate CUDA’s functionality or persuade developers to abandon a mature, widely adopted platform. The chip stacking strategy sidesteps this dependency by proposing an entirely different computing paradigm.
Feasibility issues
Does the chip stacking strategy actually work? The technical foundation is solid: 3D chip stacking is already used worldwide in high-bandwidth memory and advanced packaging. The innovation lies in applying these techniques to create entirely new computing architectures rather than simply improving existing designs.
However, several challenges stand in the way. First, thermal management becomes very difficult when multiple active processing dies are stacked. Chips built on 14nm generate significantly more heat than those on modern 4nm or 5nm processes, and stacking them compounds the problem.
Second, the yield of 3D stacking is notoriously difficult to optimize. A defect in any layer can compromise the entire stack. Third, the software ecosystem needed to effectively use such an architecture does not yet exist and will take years to mature.
The most realistic assessment is that the chip stacking strategy could be effective for workloads where memory bandwidth matters more than raw computational speed. AI inference tasks, certain data analysis operations, and specialized applications may benefit. But matching Nvidia’s performance across the full range of AI training and inference tasks remains a distant goal.
What it means for the AI chip wars
The emergence of chip stacking strategy as a focus of China’s semiconductor development signals a strategic shift. Rather than trying to replicate Western chip designs at inferior process nodes, China is exploring architectural alternatives that leverage available manufacturing strengths.
It remains unclear whether the chip stacking strategy will be successful in closing the performance gap with Nvidia. What is clear is that China’s semiconductor industry is adapting to the regulations by pursuing innovation in areas less affected by export restrictions, such as system design, packaging technology, and co-optimization of software and hardware.
For the global AI industry, this means a more complex competitive landscape. Nvidia’s current dominance faces challenges from traditional competitors like AMD and Intel, as well as entirely new architectural approaches that could redefine what an “AI chip” looks like.
The chip stacking strategy, whatever its current limitations, represents exactly this kind of architectural disruption and is worth monitoring closely.