The emergence of the DeepSeek R1 model has sparked a wave of skepticism about the long-term demand for computing power, causing a notable dip in Nvidia's stock price. In a world increasingly driven by artificial intelligence, one has to wonder: is the demand for computing capability really on the verge of retreating?

While some skeptics express concerns, AI chip startups view this moment through a different lens. To them, it represents not a threat but a monumental opportunity to grow and innovate. As more customers adopt and integrate the open-source DeepSeek model into their operations, demand for inference chips and computing power has visibly surged.

One of Nvidia's competitors, Cerebras Systems, has positioned itself as a provider of AI chips and also offers cloud services through its dedicated computing clusters. Notably, the firm launched Cerebras Inference in August of last year, which it claims is the fastest AI inference solution available globally. According to CEO Andrew Feldman, the release of the DeepSeek R1 model has led to "one of the largest spikes in service demand we've ever experienced." He further noted that developers are eager to use open-source models like DeepSeek R1 as alternatives to expensive, closed models from OpenAI. "Just as cost reductions in earlier technology sectors, such as personal computers and the internet, expanded accessibility, AI is on a similar long-term growth trajectory," he said.

Meanwhile, another AI chip manufacturer, Etched, has reported a growing influx of interest from dozens of companies since the release of the DeepSeek inference model, a shift that has prompted them to reallocate spending from training clusters to inference clusters. Sid Sheth, CEO of d-Matrix, stated, "The emergence of DeepSeek R1 demonstrates that smaller open models can be trained to be as powerful as, or even more powerful than, larger proprietary models, and at significantly lower cost. As smaller models proliferate, we are likely to see a surge in the era of inference."

From the perspective of these chip startups and industry analysts, DeepSeek may indeed accelerate the AI cycle's transition from training to inference and promote the adoption of new chip technology.

Phelix Lee, a semiconductor analyst at Morningstar, succinctly articulated the distinction between training and inference: "In simple terms, AI training involves creating a tool or algorithm, while inference is about applying this tool in real-world contexts." He points out that while AI training relies heavily on computing power, inference can run on less advanced chips that perform a narrower range of tasks. This underscores the evolving landscape of AI technology, where the focus has shifted from merely developing capabilities to applying them effectively in practice.
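To make that distinction concrete, here is a minimal, purely illustrative Python sketch using a toy linear model (nothing DeepSeek-specific): training repeatedly adjusts parameters against data, while inference is a single cheap evaluation of the frozen model.

```python
# Toy illustration of the training/inference split Lee describes.
# "Training" fits parameters with gradient descent (compute-heavy, done once);
# "inference" evaluates the fitted model (cheap, done once per request).

def train(xs, ys, lr=0.01, steps=1000):
    """Training: repeatedly adjust parameters to fit the data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(w, b, x):
    """Inference: one forward pass with frozen parameters."""
    return w * x + b

if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]   # underlying rule: y = 2x + 1
    w, b = train(xs, ys)                  # thousands of passes over the data
    print(infer(w, b, 10))                # a single multiply-add per query
```

The asymmetry is the point: training loops over the data thousands of times, while each inference query costs only a multiply and an add, which is why inference can run on simpler, more specialized chips.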

However, even as the landscape shifts, evidence suggests that computing power remains in high demand. A recent incident involving DeepSeek illustrates this point: on February 6, it unexpectedly suspended API service recharges, with a grayed-out button signaling the feature's unavailability. The official statement cited "tight server resources" and emphasized that recharges were paused to prevent business disruption for users; existing recharge balances could still be used.

To put things into perspective, if DeepSeek handles an average of 100 million visits per day, with ten inquiries per visit and roughly 1,000 tokens per inquiry, the real-time inference compute demand would reach an astounding 1.6×10^19 operations per second. In a typical inference scenario using H100 cards at FP8 precision and 50% utilization, meeting that demand would require about 16,177 H100 cards, or approximately 51,282 A100 cards. This analysis highlights the substantial requirements stemming from the low-cost inference models spearheaded by DeepSeek. As such models become prevalent, the resulting reduction in inference costs is bound to trigger a flourishing of applications, multiplying overall computing power demand.
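For readers who want to check the arithmetic, here is a hedged back-of-envelope sketch in Python. The visit, inquiry, and token counts come from the paragraph above; the ~671B parameter count, the ~2 operations per parameter per token rule of thumb, and the H100/A100 peak-throughput figures are outside assumptions that roughly reproduce the article's numbers, not values confirmed by DeepSeek.

```python
# Back-of-envelope reproduction of the article's estimate. Assumptions
# (not confirmed by DeepSeek): ~2 ops per parameter per token for a
# ~671B-parameter model, H100 FP8 peak ~2,000 TFLOPS, A100 INT8 peak
# ~624 TOPS, and 50% utilization on both cards.

SECONDS_PER_DAY   = 86_400
visits_per_day    = 100e6       # 100 million visits per day (article's figure)
queries_per_visit = 10          # ten inquiries per visit
tokens_per_query  = 1_000       # ~1,000 tokens per inquiry
params            = 671e9       # assumed total parameter count
ops_per_token     = 2 * params  # ~2 operations per parameter per token

tokens_per_sec = visits_per_day * queries_per_visit * tokens_per_query / SECONDS_PER_DAY
demand_ops     = tokens_per_sec * ops_per_token
print(f"compute demand: {demand_ops:.2e} ops/s")   # ~1.55e19, i.e. ~1.6x10^19

utilization = 0.5
h100_peak   = 2_000e12  # FP8, dense, approximate
a100_peak   = 624e12    # INT8, approximate

print(f"H100 cards: {demand_ops / (h100_peak * utilization):,.0f}")  # ~15,500 (article: 16,177)
print(f"A100 cards: {demand_ops / (a100_peak * utilization):,.0f}")  # ~49,800 (article: 51,282)
```

The small gaps between these outputs and the article's card counts come down to rounding and to exactly which peak-throughput figures one assumes for each card.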

The evolution of AI technology suggests a shift in investment strategies as the field matures.
