DeepSeek V4 Trained on Huawei Chips, Not Nvidia. That Matters.

PressBot · 4 min read

On April 24, 2026, DeepSeek dropped V4 — a 1.6 trillion parameter model trained entirely on Huawei Ascend chips instead of Nvidia hardware. That single decision matters more than the benchmark scores.

The Hardware Story Everyone Should Be Reading

DeepSeek’s earlier V3 model trained on 2,048 Nvidia H800 GPUs. The company faced investigations over whether it acquired restricted Nvidia hardware through intermediaries. V4 sidesteps that supply chain entirely.

DeepSeek partnered with Huawei, which provided its “Supernode” technology — large clusters of Ascend 950 chips purpose-built for AI workloads. As Wei Sun, principal analyst at Counterpoint Research, pointed out: V4 runs on domestic chips from Huawei and Cambricon, a clean break from the Nvidia dependency that defined R1’s training.

The performance numbers back it up. Analyst Rui Ma highlighted that DeepSeek validated its fine-grained Expert Parallelism (EP) scheme on Huawei Ascend NPUs, achieving a 1.50x to 1.73x speedup on non-Nvidia platforms. That's not parity; that's a blueprint for AI development that doesn't route through a single chipmaker's export licenses.
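
If expert parallelism is new to you, here's a toy illustration of the general technique, with made-up shapes and top-1 routing; this shows the idea, not DeepSeek's actual scheme. A mixture-of-experts layer shards its experts across devices, and each token is dispatched to whichever device hosts its chosen expert:

```python
# Toy expert parallelism (EP): experts in a mixture-of-experts layer are
# sharded across devices, and tokens are dispatched to the device hosting
# their top-scoring expert. Illustrative only, not DeepSeek's scheme.
import numpy as np

rng = np.random.default_rng(0)
num_experts, num_devices = 8, 4      # 2 experts per device
d_model, num_tokens = 16, 10

# Router: score every token against every expert, keep the top-1 choice.
router_w = rng.normal(size=(d_model, num_experts))
tokens = rng.normal(size=(num_tokens, d_model))
chosen = (tokens @ router_w).argmax(axis=1)   # top-1 expert per token

# EP placement: expert e lives on device e % num_devices. Dispatch groups
# tokens by device; "fine-grained" means routing at expert granularity.
for device in range(num_devices):
    local = [e for e in range(num_experts) if e % num_devices == device]
    count = int(np.isin(chosen, local).sum())
    print(f"device {device}: hosts experts {local}, receives {count}/{num_tokens} tokens")
```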

What’s Actually Inside V4

V4 ships in two variants:

  • V4-Pro — 1.6 trillion parameters, 1 million token context window, flagship model
  • V4-Flash — 284 billion parameters, same context window, built for cost-sensitive production

The architecture includes Manifold-Constrained Hyper-Connections (mHC), which replace traditional residual connections. The practical result: signal propagation stays stable across a massive parameter count without the training instability that typically plagues models at this scale.
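
We won't pretend to have mHC's exact math, so here's a heavily simplified toy of the hyper-connections idea it builds on, under one loud assumption: that the manifold constraint amounts to normalizing the stream-mixing matrix toward doubly stochastic (Sinkhorn-style) so mixing stays close to an identity map. Every shape below is illustrative:

```python
# Toy sketch of hyper-connections with a normalization constraint. Instead
# of one residual stream (y = x + f(x)), keep n parallel streams and mix
# them with a learned matrix. ASSUMPTION: the "manifold constraint" is
# modeled here as projecting that matrix toward doubly stochastic via
# Sinkhorn iterations, which keeps total stream mass (and thus signal
# scale) stable with depth. Not DeepSeek's implementation.
import numpy as np

def sinkhorn(m, iters=10):
    """Alternate row/column normalization toward a doubly stochastic matrix."""
    m = np.exp(m)                              # ensure positive entries
    for _ in range(iters):
        m /= m.sum(axis=1, keepdims=True)      # rows sum to 1
        m /= m.sum(axis=0, keepdims=True)      # columns sum to 1
    return m

rng = np.random.default_rng(0)
n_streams, d_model = 4, 8
streams = rng.normal(size=(n_streams, d_model))  # parallel residual streams
layer_out = rng.normal(size=(d_model,))          # stand-in for f(x)

mix = sinkhorn(rng.normal(size=(n_streams, n_streams)))
streams = mix @ streams                          # constrained stream mixing
streams[0] += layer_out                          # write the layer output back

# Column sums of ~1 mean mixing neither amplifies nor attenuates the streams.
print(np.round(mix.sum(axis=0), 3))
```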

Here’s the efficiency gain that matters for production use: in the 1M-token context setting, V4-Pro requires only 27% of the single-token inference FLOPs and 10% of the KV cache compared to DeepSeek-V3.2. Million-token context becomes a standard API feature, not a premium surcharge.
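
To make that concrete, here's the back-of-envelope version. The bytes-per-token figure below is a placeholder we made up; only the 10% ratio comes from the release:

```python
# What "10% of the KV cache" buys you at a 1M-token context. The baseline
# per-token KV size is a hypothetical placeholder, not a published figure.
kv_bytes_per_token = 100_000        # assumed V3.2 KV bytes per token
context_tokens = 1_000_000          # the 1M-token setting

baseline_gb = kv_bytes_per_token * context_tokens / 1e9   # 100 GB
v4_gb = 0.10 * baseline_gb                                # 10 GB
print(f"KV cache per 1M-token sequence: {baseline_gb:.0f} GB -> {v4_gb:.0f} GB")
# Roughly 10x more concurrent long-context sessions fit on the same hardware,
# which is why million-token context can be priced as a standard feature.
```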

The Pricing Makes the Argument

Put these numbers side by side:

  • DeepSeek V4-Pro: $3.48 per million output tokens
  • DeepSeek V4-Flash: $0.28 per million output tokens
  • OpenAI GPT-5.4: $30 per million output tokens
  • Anthropic Claude Opus 4.6: $25 per million output tokens

V4-Pro is roughly 8.6x cheaper than GPT-5.4 on output tokens. V4-Flash is over 100x cheaper. And DeepSeek is running a 75% promotional discount on V4-Pro until May 5, 2026, while simultaneously cutting cache-hit prices across its entire API suite to one-tenth of previous levels.
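
To see what those list prices mean on a real invoice, here's the arithmetic for a hypothetical month with 50 million output tokens (input-token and caching costs ignored for simplicity):

```python
# Monthly bill for a hypothetical 50M-output-token workload, using the
# per-million-output-token prices listed above. Input and cache pricing
# are deliberately ignored to keep the comparison simple.
prices = {
    "DeepSeek V4-Pro": 3.48,
    "DeepSeek V4-Flash": 0.28,
    "OpenAI GPT-5.4": 30.00,
    "Anthropic Claude Opus 4.6": 25.00,
}
monthly_output_millions = 50
for model, price in prices.items():
    print(f"{model}: ${price * monthly_output_millions:,.2f}/month")
# 30 / 3.48 ≈ 8.6x (V4-Pro vs GPT-5.4); 30 / 0.28 ≈ 107x (V4-Flash vs GPT-5.4)
```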

DeepSeek has tied future pricing to Huawei’s hardware roadmap — V4-Pro prices could drop further when Ascend 950 supernodes ship at scale in the second half of 2026. The cost advantage isn’t a launch promotion. It’s structural.

MIT License: The Open Source Angle

Both V4 models are open-weight on Hugging Face under the MIT License — one of the most permissive licenses available. Developers can use, copy, modify, and distribute the weights for any purpose, commercial or otherwise.

This matters because the open-source AI community has been watching a slow consolidation. The best models increasingly sit behind proprietary APIs with opaque pricing. DeepSeek V4 pushes back on that trend with a model that competes at the top of benchmarks and ships with no strings attached.

For anyone building AI-powered tools — chatbots, content pipelines, code assistants — V4’s open weights mean you can self-host a model with million-token context for the cost of compute alone. No per-token API fees. No vendor lock-in.
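
Here's a minimal self-hosting sketch using Hugging Face transformers. The repo id is our guess based on DeepSeek's naming pattern, so check the actual model card; and even the 284B Flash variant needs serious multi-GPU hardware in practice:

```python
# Minimal self-hosting sketch with Hugging Face transformers. The repo id
# "deepseek-ai/DeepSeek-V4-Flash" is an assumption based on DeepSeek's
# naming pattern; verify it against the real model card. A model this size
# needs multi-GPU sharding, which device_map="auto" handles naively.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V4-Flash"   # hypothetical repo id
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto", trust_remote_code=True
)

inputs = tok("Summarize this site's latest post:", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```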

What This Means If You Run WordPress

If you manage a WordPress site with AI integrations, the DeepSeek V4 story is about options. Every time a capable open model ships at aggressive pricing, it applies downward pressure on the providers you're already using. Claude and Gemini users both benefit when competition forces every provider to improve its price-to-performance ratio.

PressBot Pro currently supports Anthropic Claude, Google Gemini, and OpenAI models through a BYOK (Bring Your Own Key) setup — you pay your AI provider directly, and your costs track the market. When API prices drop industry-wide, your per-conversation costs drop too. The V4 release is exactly the kind of competitive pressure that accelerates those drops across every provider. And who knows: at these prices, adding DeepSeek as an ultra-budget option sounds very tempting!
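
For the curious, wiring up a provider like this is usually a small change, because most of them, DeepSeek included, expose an OpenAI-compatible endpoint. The snippet below uses DeepSeek's current "deepseek-chat" alias; whether V4 gets its own alias is our assumption until the docs say so:

```python
# BYOK sketch: pay DeepSeek directly via its OpenAI-compatible endpoint.
# The "deepseek-chat" alias is DeepSeek's current default; a V4-specific
# alias is an assumption until the official docs confirm one.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_KEY",            # your own key, billed at market rate
    base_url="https://api.deepseek.com",    # OpenAI-compatible API surface
)
resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Draft a short product update post."}],
)
print(resp.choices[0].message.content)
```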

The Huawei training story adds a second dimension. A viable non-Nvidia training path means the AI supply chain is less fragile than it was six months ago. More hardware competition means more models, lower prices, and fewer bottlenecks that could spike API costs overnight.

The Takeaway

DeepSeek V4 proved three things at once: you can train a frontier model without Nvidia, you can price it at a fraction of the incumbents, and you can release it under MIT. Each of those facts individually would be significant. Together, they shift what’s possible for every developer and business building on AI.

If you’re running AI on your WordPress site and want to keep your costs low as the market evolves, PressBot Pro’s BYOK model ensures you always pay market rate — no markup, no middleman. Set up your AI provider keys and start managing your site through conversation.

Ready when you are

Add AI to your WordPress.

Free forever. Unlimited conversations. Bring your own keys, keep your data on your server.