The emergence of the DeepSeek R1 model has sparked a wave of skepticism about the long-term demand for computing power, causing a notable dip in Nvidia's stock price. In a world increasingly driven by artificial intelligence, one has to wonder: is the demand for computing capability really on the verge of retreating?

While some skeptics express concerns, AI chip startups view this moment through a different lens. To them, it represents not a threat but a monumental opportunity to grow and innovate. As more customers adopt and integrate the open-source DeepSeek model into their operations, demand for inference chips and computing power has visibly surged.

One of Nvidia's competitors, Cerebras Systems, has positioned itself as a provider of AI chips and also offers cloud services through its dedicated computing clusters. Notably, the firm launched Cerebras Inference in August of last year, which it claims is the fastest AI inference solution available globally. According to CEO Andrew Feldman, the release of the DeepSeek R1 model has led to "one of the largest spikes in service demand we've ever experienced." He further noted that developers are eager to use open-source models like DeepSeek R1 as alternatives to expensive, closed models from OpenAI. "Just as cost reductions in earlier technology sectors, such as personal computers and the internet, expanded accessibility, AI is on a similar long-term growth trajectory," he said.

Meanwhile, another AI chip manufacturer, Etched, has reported a growing influx of interest from dozens of companies since the release of the DeepSeek inference model, a shift that has prompted them to reallocate spending from training clusters to inference clusters. Sid Sheth, CEO of d-Matrix, stated, "The emergence of DeepSeek R1 demonstrates that smaller open models can be trained to be as powerful as, or even more powerful than, larger proprietary models, and at significantly lower cost. As smaller models proliferate, we are likely to see a surge in the era of inference."

From the perspective of these chip startups and industry analysts, DeepSeek may indeed accelerate the AI cycle's transition from training to inference and promote the adoption of new chip technology.

Phelix Lee, a semiconductor analyst at Morningstar, succinctly articulated the distinction between training and inference: "In simple terms, AI training involves creating a tool or algorithm, while inference is about applying this tool in real-world contexts." He points out that while AI training relies heavily on computing power, inference can run on less advanced chips that perform a narrower range of tasks. This underscores the evolving landscape of AI technology, where the focus has shifted from merely developing capabilities to applying them effectively in practice.
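To make that distinction concrete, here is a minimal, purely illustrative Python sketch using a toy linear model (nothing DeepSeek-specific): training repeatedly adjusts parameters against data, while inference is a single cheap evaluation of the frozen model.

```python
# Toy illustration of the training/inference split Lee describes.
# "Training" fits parameters with gradient descent (compute-heavy, done once);
# "inference" evaluates the fitted model (cheap, done once per request).

def train(xs, ys, lr=0.01, steps=1000):
    """Training: repeatedly adjust parameters to fit the data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(w, b, x):
    """Inference: one forward pass with frozen parameters."""
    return w * x + b

if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]   # underlying rule: y = 2x + 1
    w, b = train(xs, ys)                  # thousands of passes over the data
    print(infer(w, b, 10))                # a single multiply-add per query
```

The asymmetry is the point: training loops over the data thousands of times, while each inference query costs only a multiply and an add, which is why inference can run on simpler, more specialized chips.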

However, even as the landscape shifts, evidence suggests that computing power remains in high demand. A recent incident involving DeepSeek illustrates this point: on February 6, it unexpectedly suspended API service recharges, with a grayed-out button signaling the feature's unavailability. The official statement cited "tight server resources" and emphasized that recharges were paused to prevent business disruption for users; existing recharge balances could still be used.

To put things into perspective, if DeepSeek handles an average of 100 million visits per day, with ten inquiries per visit and roughly 1,000 tokens per inquiry, the real-time inference compute demand would reach an astounding 1.6×10^19 operations per second. In a typical inference scenario using H100 cards at FP8 precision and 50% utilization, meeting that demand would require about 16,177 H100 cards, or approximately 51,282 A100 cards. This analysis highlights the substantial requirements stemming from the low-cost inference models spearheaded by DeepSeek. As such models become prevalent, the resulting reduction in inference costs is bound to trigger a flourishing of applications, multiplying overall computing power demand.
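For readers who want to check the arithmetic, here is a hedged back-of-envelope sketch in Python. The visit, inquiry, and token counts come from the paragraph above; the ~671B parameter count, the ~2 operations per parameter per token rule of thumb, and the H100/A100 peak-throughput figures are outside assumptions that roughly reproduce the article's numbers, not values confirmed by DeepSeek.

```python
# Back-of-envelope reproduction of the article's estimate. Assumptions
# (not confirmed by DeepSeek): ~2 ops per parameter per token for a
# ~671B-parameter model, H100 FP8 peak ~2,000 TFLOPS, A100 INT8 peak
# ~624 TOPS, and 50% utilization on both cards.

SECONDS_PER_DAY   = 86_400
visits_per_day    = 100e6       # 100 million visits per day (article's figure)
queries_per_visit = 10          # ten inquiries per visit
tokens_per_query  = 1_000       # ~1,000 tokens per inquiry
params            = 671e9       # assumed total parameter count
ops_per_token     = 2 * params  # ~2 operations per parameter per token

tokens_per_sec = visits_per_day * queries_per_visit * tokens_per_query / SECONDS_PER_DAY
demand_ops     = tokens_per_sec * ops_per_token
print(f"compute demand: {demand_ops:.2e} ops/s")   # ~1.55e19, i.e. ~1.6x10^19

utilization = 0.5
h100_peak   = 2_000e12  # FP8, dense, approximate
a100_peak   = 624e12    # INT8, approximate

print(f"H100 cards: {demand_ops / (h100_peak * utilization):,.0f}")  # ~15,500 (article: 16,177)
print(f"A100 cards: {demand_ops / (a100_peak * utilization):,.0f}")  # ~49,800 (article: 51,282)
```

The small gaps between these outputs and the article's card counts come down to rounding and to exactly which peak-throughput figures one assumes for each card.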

The evolution of AI technology suggests a shift in investment strategies as the field matures.
