On February 8, the release of the DeepSeek R1 model sent shockwaves through the tech community, prompting intense debate over the future demand for computational power. The event triggered significant swings in the stock price of Nvidia, the leading player in the AI hardware market, and raised an underlying question: are we on the brink of a decline in demand for computational power?

Interestingly, from the perspective of AI chip startups, this is far from a threat. In fact, they see it as an expansive opportunity for growth. With the rise of DeepSeek's open-source model, companies are increasingly adopting it, which in turn is fueling a surge in demand for inference chips and computing power.

Cerebras Systems, a competitor of Nvidia, focuses on providing AI chips while also offering cloud services through its proprietary computing clusters. In August of the previous year, the company introduced what it billed as "the world's fastest AI inference solution," Cerebras Inference. CEO Andrew Feldman told CNBC that following the DeepSeek R1 launch, Cerebras experienced "one of the largest surges in service demand ever." Developers, he explained, are eager to replace OpenAI's expensive, closed models with the more accessible DeepSeek R1. According to Feldman, the drop in cost enables much broader global adoption, and AI is embarking on a trajectory of sustained growth much like the one seen with the personal computer and the internet.

Similarly, another AI chip manufacturer, Etched, reported a wave of interest from dozens of companies since the launch of DeepSeek's inference model. These companies are shifting spending from training clusters to inference clusters. "DeepSeek-R1 has proven that inference computing has become the cutting-edge approach for every major model supplier," a company representative said. The catch is that serving these models to millions of users requires ever more computing power, whatever the inherent costs.

Sid Sheth, the CEO of AI chip startup d-Matrix, likewise commented that smaller open models can rival or even surpass larger proprietary models in capability, and at a significantly lower cost. This trend could further accelerate the shift into the inference era, underscoring the changing dynamics of the AI landscape.

For chip startups and industry analysts, DeepSeek is poised to expedite the transition from training to inference in AI, paving the way for the adoption of new chip technologies. In simple terms, training is the process of building a model or algorithm, while inference is the process of applying that model to real-world inputs. According to Phelix Lee, a semiconductor analyst at Morningstar, heavy reliance on cutting-edge computation is a hallmark of training, whereas inference can take place on less advanced chips that perform a narrower range of tasks.
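To make that distinction concrete, here is a minimal sketch in PyTorch (an illustrative choice; the article names no framework, and the tiny model and random data are hypothetical). The training step computes gradients and updates weights, which is what demands high-end accelerators; the inference step is a single forward pass with gradients disabled.

```python
import torch
import torch.nn as nn

# A toy model, standing in for a real network.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: forward pass, backward pass, and a weight update.
# The gradient computation roughly doubles memory and compute needs.
x, y = torch.randn(64, 16), torch.randint(0, 2, (64,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()   # extra cost unique to training
optimizer.step()

# Inference: a single forward pass with gradients disabled --
# far lighter, so it can run on less advanced chips.
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
```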

However, DeepSeek's own recent activities suggest that demand for computational power may still be outpacing supply. On February 6, DeepSeek unexpectedly suspended recharges for its API service, graying out the recharge button. The official statement explained: "Current server resources are tight, and to prevent any impact on your business, we have suspended API service recharges. Existing recharge amounts can still be used; we appreciate your understanding!" The interruption raises concerns about whether the infrastructure can handle the surging demand sparked by the new model.

Research conducted by Guotai Junan indicated staggering computational requirements under a hypothetical scenario in which DeepSeek receives 100 million visits daily, with each visit involving 10 responses and each response consuming 1,000 tokens (roughly 750 English words). Under these inference assumptions, if DeepSeek ran H100 cards at FP8 precision and 50% utilization, the estimated requirement would be 16,177 H100 cards, or 51,282 A100 cards.
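The arithmetic behind that estimate can be reproduced with a short back-of-the-envelope calculation. Note that the per-card throughput figures below are assumptions back-solved from the article's card counts, not published benchmarks:

```python
# Back-of-the-envelope reproduction of the Guotai Junan estimate.
# NOTE: the per-card throughput numbers are assumptions inferred by
# working backward from the article's totals, not measured figures.

DAILY_VISITS = 100_000_000    # hypothetical 100 million visits per day
RESPONSES_PER_VISIT = 10      # 10 responses per visit
TOKENS_PER_RESPONSE = 1_000   # ~750 English words each
UTILIZATION = 0.5             # assumed 50% hardware utilization
SECONDS_PER_DAY = 86_400

# Assumed sustained throughput per card, in tokens per second.
ASSUMED_TOKENS_PER_SEC = {"H100 (FP8)": 1_430.9, "A100": 451.4}

total_tokens_per_day = DAILY_VISITS * RESPONSES_PER_VISIT * TOKENS_PER_RESPONSE

for card, tok_per_sec in ASSUMED_TOKENS_PER_SEC.items():
    tokens_per_card_per_day = tok_per_sec * UTILIZATION * SECONDS_PER_DAY
    cards_needed = total_tokens_per_day / tokens_per_card_per_day
    print(f"{card}: ~{cards_needed:,.0f} cards")
# -> H100 (FP8): ~16,177 cards; A100: ~51,281 cards
```

The takeaway is the scale: the scenario implies on the order of a trillion tokens generated per day, which no modest cluster could serve.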

As the landscape shifts and cost-effective inference models like DeepSeek gain popularity, a notable reduction in inference costs and pricing could catalyze a boom in applications that require extensive computational power.
