Notes: Huawei’s 910C And CM384: A Strategic Shock To The NVIDIA Thesis (Pt.2)

Summary
- Huawei's AI-native 910C and CM384 systems now rival NVIDIA’s Blackwell in real-world performance, despite using older 7nm and HBM2e components.
- By vertically integrating chip, system, cloud, and power infrastructure, Huawei is scaling compute in ways general-purpose GPGPU systems like NVIDIA’s Rubin cannot.
- While Huawei remains uninvestable, its rise threatens NVIDIA’s pricing power, ecosystem dominance, and long-term competitiveness in China and beyond.
Ascend’s Adoption Gap and System-Level Trade-Offs
Although Huawei’s Ascend platform is proving far more capable than many initially expected, it still faces key limitations. Most notably, bottom-up adoption of Ascend remains weak compared to NVIDIA. This is partly due to Ascend’s advanced NPU architecture, which requires deeper optimization work to unlock its full potential — something that deters many smaller engineering teams and independent developers.
Although the H20 offers roughly half the BF16 compute of the 910B, it shipped with larger and faster HBM3 memory and sold at a similar price point. As a result, the H20 significantly outsold the 910B throughout 2024. Following the release of DeepSeek V3 and R1, demand for the H20 surged in China. Leading AI customers, including DeepSeek, ByteDance, Tencent, and Alibaba, each placed H20 orders exceeding $1bn, even as U.S. sanctions intensified and Chinese authorities advised companies to reduce reliance on NVIDIA. This demand spike was largely due to a shortage of 910B supply, which constrained Chinese hyperscalers' ability to meet growing inference needs. ByteDance and Tencent, for instance, have integrated LLMs natively into core applications like Douyin (TikTok China) and WeChat, platforms with over 700 million MAUs, driving enormous, urgent demand for inference compute.
Another structural issue is the state of Huawei’s AI software stack. While Huawei has made significant progress with its Ascend ecosystem and improved the usability of CANN (Compute Architecture for Neural Networks), the platform remains optimized primarily for large enterprises and high-volume inference workloads like DeepSeek. For developers working outside these common use cases, or for those building from the ground up using open tooling, the experience is more fragmented. In these scenarios, engineers may run into issues quickly, and unless they work at a large enterprise with direct Huawei support, resolving them often means waiting on technical service or working around limitations independently.
Finally, relying entirely on optical modules lets Huawei build larger scale-up clusters with higher MFU (model FLOPs utilization) and better scaling linearity while lowering power density per rack, but it comes at a cost. Huawei still needs to improve cost-effectiveness, either by cutting optical connection costs or by increasing rack density and using copper cables for intra-rack communication.
910C: Made in China?
Contrary to widespread belief, the Ascend 910C is, in our view, either already being or soon to be mass-produced entirely within mainland China. Specifically, it is being fabricated by SMIC (Semiconductor Manufacturing International Corporation), China’s leading foundry, based in the People’s Republic of China (PRC). This contrasts with TSMC, which is located in Taiwan, officially known as the Republic of China (ROC).