DeepSeek V4 Preview Drops: 1M Context, 1.6T MoE, Fully Open-Source

Krasa AI

2026-04-24

DeepSeek released preview versions of its V4 model family on Friday, pushing the Chinese lab back into direct competition with OpenAI and Anthropic at the frontier. The launch includes two variants — V4-Pro and V4-Flash — both open-weight, both designed to compete on reasoning, coding, and autonomous agent workflows.

The comparison with R1 is unavoidable. V4 lands roughly 15 months after DeepSeek's R1 upended the AI industry and briefly tanked Nvidia's stock. The follow-up is bigger, longer-context, and still free to download.

What DeepSeek actually shipped

V4-Pro is a 1.6 trillion parameter mixture-of-experts (MoE) model with about 49 billion parameters active at inference time. V4-Flash is the smaller sibling: 284 billion total parameters, 13 billion active, tuned for speed and cost efficiency rather than raw benchmark wins.
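To put those parameter counts in perspective, the short sketch below computes what share of each model's weights is active for any given token. It uses only the figures quoted above; the dictionary layout and function name are illustrative choices, not anything DeepSeek publishes.

```python
# Parameter counts as quoted in the article, in billions.
MODELS = {
    "V4-Pro": {"total_b": 1600, "active_b": 49},
    "V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_b / total_b


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

On these numbers, both models route each token through only a few percent of their total weights, which is where the inference-cost advantage of MoE comes from.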

Both models ship with a 1 million token context window — the same size that OpenAI and Anthropic have only recently moved to on their top tiers. For enterprise use, that means entire codebases, multi-hundred-page legal contracts, or months of customer transcripts can be processed in a single prompt.
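Whether a given corpus actually fits in a 1M-token window depends on the tokenizer. The sketch below gives a back-of-the-envelope check, assuming roughly four characters per token. That ratio is a common rule of thumb for English text, not a property of DeepSeek's tokenizer, and the function names and output reserve are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # window size quoted in the article
CHARS_PER_TOKEN = 4                # assumption: rough heuristic, not V4's tokenizer


def estimated_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(documents: list[str], reserve_for_output: int = 8_000) -> bool:
    """Check whether all documents plus an output reserve fit in the window."""
    total = sum(estimated_tokens(doc) for doc in documents)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS


# Example: a 400,000-character contract is estimated at 100,000 tokens.
print(fits_in_context(["x" * 400_000]))
```

For a real deployment, the estimate should be replaced with a count from the model's own tokenizer.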

The headline architectural change is what DeepSeek calls Hybrid Attention Architecture. The company claims it dramatically reduces the memory and compute overhead of long-context attention, cutting KV-cache requirements by roughly 90% compared with conventional transformer designs. That's the engineering work that makes 1M context economically viable at open-source pricing.
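To see why a 90% cut matters, it helps to work through the standard KV-cache formula for a conventional transformer. The sketch below does that for a single 1M-token sequence. The layer count, head count, and head dimension are hypothetical placeholders chosen for illustration. They are not V4's actual configuration, which the article does not give.

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """KV-cache size for one sequence in a conventional transformer.

    The leading factor of 2 covers storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, NOT DeepSeek V4's real architecture.
baseline = kv_cache_bytes(seq_len=1_000_000, n_layers=60,
                          n_kv_heads=8, head_dim=128)

# Apply the roughly 90% reduction that DeepSeek claims.
reduced = baseline * (1 - 0.90)

GIB = 1024 ** 3
print(f"baseline: {baseline / GIB:.1f} GiB per sequence")
print(f"after 90% cut: {reduced / GIB:.1f} GiB per sequence")
```

Under these assumed dimensions, the baseline cache for one full-length sequence runs to hundreds of gibibytes, so a tenfold reduction is the difference between needing a multi-GPU node per request and fitting on a single accelerator.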

Benchmarks and the coding claim

DeepSeek is positioning V4-Pro specifically against closed-source frontier models on math and code. The company's own published numbers show V4-Pro leading all open models and approaching parity with GPT-5 and Claude Opus 4 on standard coding benchmarks like SWE-Bench and LiveCodeBench.

V4-Flash is the more interesting product for most developers. At 13B active parameters, it runs on a single high-end GPU — making it directly competitive with Mistral's and Meta's latest open releases on price-performance, with a much larger context window than either.
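One caveat on the single-GPU point: active parameters drive per-token compute, but an MoE deployment typically still has to keep all expert weights in memory or offload them. The sketch below estimates weight memory for V4-Flash's 284 billion total parameters at a few precisions. The precisions are assumptions for illustration; the article does not say which formats DeepSeek ships.

```python
def weight_memory_gb(total_params_b: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bytes = total_params_b * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


V4_FLASH_TOTAL_B = 284  # total parameters in billions, as quoted in the article

# Assumed precisions; the shipped weight formats are not stated in the article.
for bits in (16, 8, 4):
    gb = weight_memory_gb(V4_FLASH_TOTAL_B, bits)
    print(f"{bits}-bit weights: about {gb:.0f} GB")
```

The estimate covers weights only and leaves out the KV cache and activations, so whether the model fits on one card will depend on the quantization used and the GPU's memory capacity.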

Morningstar senior equity analyst Ivan Su described V4 as a "competent" follow-up but noted it's not as disruptive as R1's debut last year. That reads as fair: V4 is an iteration, not a paradigm shift. The paradigm shift was R1 proving a Chinese lab could train a frontier-grade model at a fraction of U.S. lab budgets.

The Huawei compute angle

To train V4, DeepSeek partnered with Huawei on what Huawei is calling "Supernode" technology — large clusters of its Ascend 950 chips linked into training-scale systems. Huawei confirmed the partnership in a statement on Friday.

Why this matters: U.S. export controls have kept the most advanced Nvidia chips away from Chinese labs for more than three years. The combination of Supernode and Ascend 950 is the strongest public evidence yet that the Chinese domestic AI stack can train a frontier-class model without Nvidia silicon. That's a significant strategic development, both for Beijing's AI self-sufficiency goals and for Nvidia's long-term China revenue outlook.

Who this affects

For developers and startups, V4 is immediately usable. The weights are hosted on Hugging Face and DeepSeek's own platform, with API access available through the DeepSeek API at a fraction of the price of closed frontier models. Expect a wave of fine-tunes and derivative models over the next few weeks.

For enterprises in regulated industries, V4's appeal is obvious: a frontier-quality model that can be self-hosted, audited, and run behind a firewall. That's the category where DeepSeek has quietly been building enterprise presence over the past year.

For OpenAI and Anthropic, V4 tightens the competitive squeeze on the middle of the market. Flagship models still have the edge on complex reasoning, but the gap on standard coding and retrieval workloads has narrowed to the point where many enterprises will pick the cheaper, self-hostable option.

What's next

DeepSeek says the preview will be followed by general availability within a few weeks, along with full technical documentation and a research paper. The V4-Pro and V4-Flash weights are already available to download for developers and researchers.

Watch for three things. First, whether third-party benchmarks confirm DeepSeek's published numbers — the company has historically been accurate, but independent verification takes time. Second, whether U.S. policymakers react to the Huawei Ascend 950 training story with fresh chip restrictions. And third, whether Western enterprises actually deploy V4 in production, or whether geopolitical considerations keep it a research-only option in the U.S. and Europe.

Bottom line

DeepSeek V4 isn't the shock that R1 was, but it doesn't need to be. It's proof that open-source frontier models are keeping pace with closed ones, and that a Chinese lab running on domestic silicon can ship at the frontier. For developers, that means a genuinely competitive open alternative at the top of the market. For the industry, it means the "open versus closed" gap keeps shrinking every quarter.
