On February 8, the release of the DeepSeek R1 model sent shockwaves through the tech community, prompting intense debate over the future demand for computational power. The event triggered significant swings in the stock price of Nvidia, the leading player in the AI hardware market, and raised an underlying question: are we on the brink of a decline in demand for computational power?

Interestingly, from the perspective of AI chip startups, this is far from a threat. In fact, they see it as an expansive opportunity for growth. With the rise of DeepSeek's open-source model, companies are increasingly adopting it, which in turn is fueling a surge in demand for inference chips and computing power.

Cerebras Systems, a competitor of Nvidia, focuses on providing AI chips while also offering cloud services through its proprietary computing clusters. In August of the previous year, the company introduced what it billed as "the world's fastest AI inference solution," Cerebras Inference. CEO Andrew Feldman told CNBC that following the DeepSeek R1 launch, Cerebras experienced "one of the largest surges in service demand ever." Developers, he explained, are eager to replace OpenAI's expensive, closed models with the more accessible DeepSeek R1. According to Feldman, the drop in cost enables much broader global adoption, and AI is embarking on a trajectory of sustained growth much like the one seen with the personal computer and the internet.

Similarly, another AI chip manufacturer, Etched, reported a wave of interest from dozens of companies since the launch of DeepSeek's inference model. These companies are shifting spending from training clusters to inference clusters. "DeepSeek-R1 has proven that inference computing has become the cutting-edge approach for every major model supplier," a company representative said. The catch is that serving these models to millions of users requires ever more computing power, whatever the inherent costs.

Sid Sheth, the CEO of AI chip startup d-Matrix, likewise commented that smaller open models can rival or even surpass larger proprietary models in capability, and at a significantly lower cost. This trend could further accelerate the shift into the inference era, underscoring the changing dynamics of the AI landscape.

For chip startups and industry analysts, DeepSeek is poised to expedite the transition from training to inference in AI, paving the way for the adoption of new chip technologies. In simple terms, training is the process of building a model or algorithm, while inference is the process of applying that model to real-world inputs. According to Phelix Lee, a semiconductor analyst at Morningstar, heavy reliance on cutting-edge computation is a hallmark of training, whereas inference can take place on less advanced chips that perform a narrower range of tasks.
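To make that distinction concrete, here is a minimal sketch in PyTorch (an illustrative choice; the article names no framework, and the tiny model and random data are hypothetical). The training step computes gradients and updates weights, which is what demands high-end accelerators; the inference step is a single forward pass with gradients disabled.

```python
import torch
import torch.nn as nn

# A toy model, standing in for a real network.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: forward pass, backward pass, and a weight update.
# The gradient computation roughly doubles memory and compute needs.
x, y = torch.randn(64, 16), torch.randint(0, 2, (64,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()   # extra cost unique to training
optimizer.step()

# Inference: a single forward pass with gradients disabled --
# far lighter, so it can run on less advanced chips.
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
```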

However, DeepSeek's own recent activities suggest that demand for computational power may still be outpacing supply. On February 6, DeepSeek unexpectedly suspended recharges for its API service, graying out the recharge button. The official statement explained: "Current server resources are tight, and to prevent any impact on your business, we have suspended API service recharges. Existing recharge amounts can still be used; we appreciate your understanding!" The interruption raises concerns about whether the infrastructure can handle the surging demand sparked by the new model.

Research conducted by Guotai Junan indicated staggering computational requirements under a hypothetical scenario in which DeepSeek receives 100 million visits daily, with each visit involving 10 responses and each response consuming 1,000 tokens (roughly 750 English words). Under these inference assumptions, if DeepSeek ran H100 cards at FP8 precision and 50% utilization, the estimated requirement would be 16,177 H100 cards, or 51,282 A100 cards.
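The arithmetic behind that estimate can be reproduced with a short back-of-the-envelope calculation. Note that the per-card throughput figures below are assumptions back-solved from the article's card counts, not published benchmarks:

```python
# Back-of-the-envelope reproduction of the Guotai Junan estimate.
# NOTE: the per-card throughput numbers are assumptions inferred by
# working backward from the article's totals, not measured figures.

DAILY_VISITS = 100_000_000    # hypothetical 100 million visits per day
RESPONSES_PER_VISIT = 10      # 10 responses per visit
TOKENS_PER_RESPONSE = 1_000   # ~750 English words each
UTILIZATION = 0.5             # assumed 50% hardware utilization
SECONDS_PER_DAY = 86_400

# Assumed sustained throughput per card, in tokens per second.
ASSUMED_TOKENS_PER_SEC = {"H100 (FP8)": 1_430.9, "A100": 451.4}

total_tokens_per_day = DAILY_VISITS * RESPONSES_PER_VISIT * TOKENS_PER_RESPONSE

for card, tok_per_sec in ASSUMED_TOKENS_PER_SEC.items():
    tokens_per_card_per_day = tok_per_sec * UTILIZATION * SECONDS_PER_DAY
    cards_needed = total_tokens_per_day / tokens_per_card_per_day
    print(f"{card}: ~{cards_needed:,.0f} cards")
# -> H100 (FP8): ~16,177 cards; A100: ~51,281 cards
```

The takeaway is the scale: the scenario implies on the order of a trillion tokens generated per day, which no modest cluster could serve.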

As the landscape shifts and cost-effective inference models like DeepSeek gain popularity, a notable reduction in inference costs and pricing could catalyze a boom in applications that require extensive computational power.
