April 8, 2026

China's AI Pricing Wave: Why Zhipu Is Raising Prices and Still Gaining Users

For years, the conventional wisdom on China's AI market held that Chinese large language model companies competed on one thing: price. The "Hundred-Model War" drove token prices toward zero as labs fought to win developers with rock-bottom API rates. Zhipu AI is now testing whether that era is over.

On February 11, 2026, Zhipu launched GLM-5 and raised prices on its GLM Coding Plan by at least 30%. The promotional entry point that had made GLM-4.7 accessible at roughly $3 per month disappeared. New subscribers faced higher tiers across the board. Existing users were grandfathered in—a classic retention move—but for anyone coming in fresh, the economics had changed.

Then came the earnings report.

What the Numbers Show

Zhipu's 2025 full-year results, reported by Reuters and confirmed by the South China Morning Post, told a counterintuitive story:

Metric | Value | Change
Total revenue | CNY 724.3M (~$105.2M) | +131.9% YoY
Cloud API revenue | CNY 190.4M | +292.6% YoY
Market cap (post-earnings) | HK$400B ($51B) | +25% on report day
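
The year-over-year percentages above imply prior-year baselines that the report figures quoted here do not state. As a rough check, the sketch below backs them out from the reported values and growth rates. The helper name `implied_prior` is illustrative, and the results are derived arithmetic, not reported numbers.

```python
# Back out the prior-year value implied by a reported figure and its YoY growth.
def implied_prior(current: float, yoy_pct: float) -> float:
    """Return the prior-year value given the current value and YoY change in percent."""
    return current / (1 + yoy_pct / 100)


# (2025 value in CNY millions, YoY growth in percent), both taken from the table
reported = {
    "total_revenue": (724.3, 131.9),
    "cloud_api_revenue": (190.4, 292.6),
}

for name, (value, growth) in reported.items():
    prior = implied_prior(value, growth)
    print(f"{name}: implied 2024 figure of about CNY {prior:.1f}M")

# Cloud API share of total revenue in 2025, using the two reported values
api_share = reported["cloud_api_revenue"][0] / reported["total_revenue"][0]
print(f"cloud API share of 2025 revenue: {api_share:.1%}")
```

On these inputs, cloud API revenue is still only about a quarter of the total, so the headline growth rate starts from a small base.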

The critical detail: these gains came after cumulative API price increases exceeding 80% compared to end-of-2023 levels. Usage did not fall. It rose.

Zhang Peng, Zhipu's CEO, explained the dynamic on the earnings call:

"The tokens used for actual work are 10 times, even 100 times, those used for simple Q&A."

The implication is that as AI moves from chatty experimentation into production pipelines (agent loops, multi-step tool calls, automated code review), the number of tokens consumed per task rises by an order of magnitude or more. Users are not counting tokens; they are counting outputs.

Why the Price Hikes Are Sticking

Several forces are converging to give Zhipu genuine pricing power for the first time.

The model closed the gap. GLM-5 is a 744-billion-parameter Mixture-of-Experts model. On SWE-bench Verified—a benchmark measuring how well a model resolves real GitHub issues—it scored 77.8, surpassing Google Gemini 3 Pro (76.2) and approaching Anthropic Claude Opus 4.6 (80.9). On the ClawBench agent evaluation for March 2026, Zhipu's GLM-5-Turbo topped the global ranking with a score of 93.9. When a model can genuinely replace a senior engineer's code reviews, its $30 plan starts looking cheap.

The infrastructure bet is paying off. Zhipu has spent heavily on inference optimization, compressing per-token serving costs by 50% through architectural improvements. It is also deep into software-hardware co-design with domestic AI chips—a strategic move that insulates it from U.S. export controls on advanced Nvidia hardware while keeping compute supply growing.

Agentic workflows create sticky demand. When a software team's CI pipeline calls an API thirty times per pull request to run static analysis, style checks, security scans, and test generation, a token is not a token—it is a component of a workflow. Switching models mid-workflow is expensive in engineering time. That stickiness gives providers leverage they did not have when users were just pasting prompts into a chat window.
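
To put rough numbers on the workflow point, here is a minimal sketch of per-pull-request spend. Only the thirty calls per pull request comes from the example above. The tokens per call and the price per million tokens are hypothetical placeholders, not Zhipu's rates, and the 30% increase simply borrows the size of the Coding Plan hike for illustration.

```python
from dataclasses import dataclass


@dataclass
class PipelineProfile:
    calls_per_pr: int          # from the article's example: thirty calls per pull request
    tokens_per_call: int       # hypothetical average of prompt plus completion tokens
    price_per_m_tokens: float  # hypothetical blended price, USD per million tokens

    def tokens_per_pr(self) -> int:
        return self.calls_per_pr * self.tokens_per_call

    def cost_per_pr(self) -> float:
        return self.tokens_per_pr() / 1_000_000 * self.price_per_m_tokens


# Same workload before and after a hypothetical 30% price increase
baseline = PipelineProfile(calls_per_pr=30, tokens_per_call=8_000, price_per_m_tokens=1.00)
after_hike = PipelineProfile(calls_per_pr=30, tokens_per_call=8_000, price_per_m_tokens=1.30)

delta = after_hike.cost_per_pr() - baseline.cost_per_pr()
print(f"tokens per PR: {baseline.tokens_per_pr():,}")
print(f"cost per PR: ${baseline.cost_per_pr():.3f} -> ${after_hike.cost_per_pr():.3f}")
print(f"extra cost per PR: ${delta:.3f}")
```

Under these assumed inputs the increase amounts to cents per pull request, which is the kind of difference a team would weigh against the engineering time needed to re-validate a pipeline on another model.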

The Competitive Context

Zhipu is not alone in raising prices. Moonshot, MiniMax, Alibaba, and ByteDance all implemented tiered or higher pricing through 2025. The Chinese AI price war, which once resembled a race to give models away, appears to be shifting toward a market where best-in-class models can command premiums.

This stands in tension with the prevailing Western narrative that Chinese AI advances on cost alone. The reality is more layered. Chinese labs like Zhipu and DeepSeek have indeed driven down per-token costs through architectural efficiency: MoE, aggressive quantization, smarter training recipes. But the companies now gaining pricing power are the ones that also closed the capability gap. Cheap-and-worse no longer wins; cheap-and-competitive does.

Nine of China's top ten internet companies have integrated GLM models into production, primarily for code generation, automated workflows, and agent execution. The platform has passed 4 million developers and enterprises globally, with 242,000 paying developers as of the March earnings report. OpenRouter data from mid-March showed Chinese models accounting for 7.359 trillion tokens in a single week, exceeding U.S. model call volume for multiple consecutive weeks.

What Is Still Unresolved

None of this means the pricing pressure story is over. Zhipu reported a net loss of CNY 4.72 billion ($662 million) for 2025, driven by R&D spending of CNY 3.18 billion ($462 million). The company is investing at a pace that assumes the revenue trajectory continues accelerating. A slowdown in enterprise AI adoption, new entrants undercutting on price for specific coding niches, or a shift in developer preference toward open-weight self-hosting could all complicate the picture.

There are also technical limits. GLM-5 requires roughly 1,490 GB of memory to run locally—double GLM-4.7's footprint. Zhipu itself acknowledged that "compute is very tight" even before the GLM-5 launch, and the model faced temporary rate limiting after launch due to demand exceeding capacity. For teams relying on Zhipu's API, infrastructure availability is a real operational risk.
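
The 1,490 GB figure is close to what a back-of-the-envelope estimate gives for storing 744 billion parameters at 16 bits each. The source does not say which precision the figure assumes, so treat the sketch below as one plausible reading. It counts weights only and ignores KV cache, activations, and runtime overhead.

```python
PARAMS = 744e9  # parameter count stated in the article

# Bytes per parameter for common storage formats (standard sizes, not Zhipu-specific)
BYTES_PER_PARAM = {
    "16-bit (bf16/fp16)": 2.0,
    "8-bit (int8/fp8)": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


for fmt, nbytes in BYTES_PER_PARAM.items():
    print(f"{fmt}: about {weight_memory_gb(PARAMS, nbytes):,.0f} GB")
```

Because a Mixture-of-Experts model must keep every expert resident even though only some are active per token, the full parameter count, not the active count, drives this footprint.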

The analyst Lukas Petersson, co-founder of Andon Labs, raised a more structural concern after reviewing GLM-5 traces: the model achieves goals through aggressive tactics but shows less situational awareness in complex, long-horizon tasks compared to Claude. Benchmarks can flatten the difference between a model that solves the problem and a model that solves it reliably over months of production use.

The Broader Signal

What makes Zhipu's trajectory worth watching is not the price hike in isolation—it is the combination of price hikes, rising usage, and a stock that surged 34% on the GLM-5 launch and another 25% on earnings.

That pattern says something specific: a meaningful segment of the AI developer market has moved past the "which model is free" question to the "which model saves me the most engineering time" question. And for that segment, Zhipu's GLM-5 is winning on value even as it loses on price.

Whether this pricing power holds through a full business cycle—or whether it triggers competitive retaliation from rivals like Moonshot, DeepSeek, or the wave of new entrants—is the open question that will define the next phase of China's AI monetization story.

Priya Nanda

Applied AI editor tracking copilots, model products, AI interfaces, and the business reality behind practical automation.
