DeepSeek V4 is out, and it’s actually interesting for a few reasons

DeepSeek just released V4, their new flagship model, and I’ve been poking around the benchmarks and pricing all morning. This is their first big move since R1 turned them into a household name back in January 2025, and honestly, it’s refreshing to see them ship something substantial after months of relative quiet.

For those who missed the R1 saga — that model was trained on limited compute and still managed to punch way above its weight, sparking a wave of open-weight releases from Chinese AI labs. DeepSeek went from obscure research team to China’s poster child for AI ambitions almost overnight. But since then, it’s been mostly silence, punctuated by personnel shuffles, delayed launches, and the usual regulatory noise from both Washington and Beijing.

V4 breaks that silence. And while I don’t think it’ll shake the industry the way R1 did — that was a once-in-a-career moment — there are three things about this release that genuinely caught my attention.

Open-source that actually competes

DeepSeek is sticking with the open-source playbook, which I respect. V4 comes in two flavors: V4-Pro, the big one for coding and agent tasks, and V4-Flash, a leaner version that’s faster and cheaper to run. Both can be downloaded and modified, and both are available through the API. The pricing is almost absurdly low: V4-Pro runs $1.74 per million input tokens and $3.48 per million output tokens, a fraction of what OpenAI or Anthropic charge. V4-Flash is even cheaper, at $0.14 per million input tokens and $0.28 per million output tokens. That’s practically pocket change for building applications.
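To make those prices concrete, here is a quick back-of-the-envelope sketch in Python. The per-million-token prices are the ones quoted in this post. The monthly workload (50 million input tokens, 10 million output tokens) is a number I made up for illustration, and the function name is mine, not part of any DeepSeek API.

```python
# Prices in USD per million tokens, as quoted in this post.
PRICES = {
    "V4-Pro": {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated API cost in USD for a given token volume."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload, chosen only for illustration.
    monthly_input = 50_000_000
    monthly_output = 10_000_000
    for model in PRICES:
        cost = estimate_cost(model, monthly_input, monthly_output)
        print(f"{model}: ${cost:.2f} per month")
```

With that made-up workload, V4-Pro comes to about $121.80 a month and V4-Flash to about $9.80.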

On benchmarks, V4-Pro holds its own against the closed-source heavyweights: Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. Against open-source rivals like Alibaba’s Qwen-3.5 or Z.ai’s GLM-5.1, it pulls ahead on coding, math, and STEM problems. DeepSeek also ran an internal survey of 85 experienced developers, and over 90% included V4-Pro among their top choices for coding tasks. It’s a vendor-run survey, so take it with a grain of salt, but it suggests real-world usability rather than just benchmark tuning.

Memory efficiency that matters

The standout technical feature here is the context window. Both versions handle 1 million tokens, which is roughly the length of a trilogy of novels. That’s not just a spec sheet flex — it’s genuinely useful for tasks like analyzing entire codebases, reviewing long legal documents, or processing extended conversations without losing the thread.
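If you want a rough idea of whether your own material fits in a 1 million token window, a crude estimate is enough to start with. The sketch below assumes about 4 characters per token, which is a common rule of thumb for English text and not a property of DeepSeek’s tokenizer; real counts for code or other languages can differ a lot. The function names and the 8,000-token output reserve are my own choices.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate. chars_per_token is a rule-of-thumb assumption."""
    return int(len(text) / chars_per_token)


def fits_in_context(texts, context_limit=1_000_000, reserved_for_output=8_000):
    """Return (estimated_total_tokens, fits) for a list of documents.

    reserved_for_output leaves room for the model's reply; the default
    is an arbitrary illustrative value.
    """
    total = sum(estimate_tokens(t) for t in texts)
    return total, total <= context_limit - reserved_for_output


if __name__ == "__main__":
    # Stand-in documents; in practice these would be file contents.
    docs = ["x" * 400_000, "y" * 1_200_000]
    total, ok = fits_in_context(docs)
    print(f"~{total} tokens, fits: {ok}")
```

For anything close to the limit, count with the model’s actual tokenizer instead of trusting this estimate.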

DeepSeek achieved this with a new architecture that’s more memory-efficient than previous approaches. I’ve seen a lot of models claim long context windows only to fail miserably on “needle in a haystack” tests, where you hide a specific fact in a sea of text and ask the model to retrieve it. DeepSeek claims V4 handles these tests well, though I’d want to see independent verification before getting too excited. Still, the design is a meaningful step forward for open-source models, which have typically lagged behind closed-source ones on memory management.
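For anyone who wants to run that kind of independent check, the test itself is simple to set up. Here is a minimal sketch of a needle-in-a-haystack harness. Everything in it is illustrative: `ask_model` stands for whatever function you use to call a model (I am not assuming any particular client library), and `short_memory_model` is a fake model that only "sees" the last 2,000 characters of the prompt, included so the harness can be run without an API key.

```python
FILLER = "The sky was grey that morning."


def build_haystack(needle, total_sentences, depth):
    """Insert needle at a relative depth (0.0 = start, 1.0 = end) in filler text."""
    sentences = [FILLER] * total_sentences
    sentences.insert(int(depth * total_sentences), needle)
    return " ".join(sentences)


def run_sweep(ask_model, needle, expected, question, depths, total_sentences=1000):
    """Return {depth: True/False} for whether the expected fact was retrieved."""
    results = {}
    for depth in depths:
        context = build_haystack(needle, total_sentences, depth)
        prompt = f"{context}\n\nQuestion: {question}"
        answer = ask_model(prompt)
        results[depth] = expected.lower() in answer.lower()
    return results


def short_memory_model(prompt):
    """Fake model for demonstration: it only reads the last 2,000 characters."""
    visible = prompt[-2000:]
    return "7421" if "7421" in visible else "I don't know."


if __name__ == "__main__":
    results = run_sweep(
        ask_model=short_memory_model,
        needle="The secret code is 7421.",
        expected="7421",
        question="What is the secret code?",
        depths=[0.0, 0.5, 1.0],
    )
    for depth, found in results.items():
        print(f"depth {depth}: {'found' if found else 'missed'}")
```

The fake model misses the needle unless it sits near the end of the prompt, which is exactly the failure pattern this test is designed to expose. Swapping in a real model call and scaling `total_sentences` up toward the advertised context length is where it gets interesting.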

The timing is interesting

DeepSeek has been operating under a microscope. The US government has been tightening export controls on AI chips, and China’s own regulatory environment has been unpredictable. Personnel departures and delayed launches added to the narrative that DeepSeek might be losing momentum. Releasing V4 now, with strong performance and aggressive pricing, feels like a deliberate signal — they’re not slowing down.

What I find more interesting is how this plays into the broader open-source vs. closed-source debate. DeepSeek is proving that you can build frontier-level models without a proprietary moat, and that the open-source ecosystem can sustain itself economically through API pricing that undercuts everyone else. If V4 gains traction, it could force the big players to rethink their pricing strategies, which would be good for everyone building on top of these models.

That said, I’m not convinced V4 will trigger another R1-style frenzy. The landscape has changed — there are more capable open-source models now, and the novelty of a Chinese lab outperforming expectations has worn off. But as a solid, affordable, open-source model that handles long contexts and competes on benchmarks, V4 earns its place. I’ll be watching to see how the developer community actually adopts it, because that’s where the real test lies.
