Nvidia GPU rentals drop sharply, but experts say no cause for panic

A recent plunge in rental prices for Nvidia GPUs has stirred debate across the artificial intelligence sector, with some Western outlets describing the trend as a rental bubble on the verge of bursting.

According to 36Kr, rental prices for Nvidia’s key products in China have indeed fluctuated considerably.

Nvidia’s H100, typically rented in nodes of eight cards, initially saw market rates of RMB 120,000–180,000 (USD 16,800–25,200) per card per month earlier this year. That rate has since dropped to around RMB 75,000 (USD 10,500).

Similarly, the consumer-grade Nvidia 4090, which sold for RMB 18,000–19,000 (USD 2,500–2,700) per card at the peak of the cryptocurrency mining boom, rented for roughly RMB 13,000 (USD 1,800) per card at the start of 2024. Rentals are now priced at around RMB 7,000–8,000 (USD 980–1,120).

In just ten months, rental prices for these two popular Nvidia models have fallen by about 50%, a stark contrast to the days when they were highly coveted.
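The “about 50%” figure can be checked against the price ranges quoted above. The short sketch below does only that arithmetic; the function name is illustrative, and the inputs are the RMB figures cited in this article. Measured from the midpoint of the earlier H100 range (RMB 150,000), the drop to RMB 75,000 is exactly 50%, while the ends of the quoted ranges give declines of roughly 38% to 58%.

```python
def pct_decline(old_price: float, new_price: float) -> float:
    """Return the percentage drop from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


# H100: RMB 120,000-180,000 earlier this year, now around RMB 75,000
h100_from_low = pct_decline(120_000, 75_000)
h100_from_mid = pct_decline(150_000, 75_000)
h100_from_high = pct_decline(180_000, 75_000)

# 4090: roughly RMB 13,000 at the start of 2024, now RMB 7,000-8,000
rtx4090_smallest = pct_decline(13_000, 8_000)
rtx4090_largest = pct_decline(13_000, 7_000)

print(f"H100: {h100_from_low:.1f}% to {h100_from_high:.1f}% "
      f"(midpoint {h100_from_mid:.1f}%)")
print(f"4090: {rtx4090_smallest:.1f}% to {rtx4090_largest:.1f}%")
```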

Despite the alarmist tone of Western media coverage, industry experts in China suggest there’s no need to panic. An industry veteran noted that rental prices for typical computing power chips fall by about 80% over five years. Nvidia’s H100 and 4090 chips, released in 2022, have been on the market for two years, making the price decline a natural progression.
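The veteran’s rule of thumb can be turned into a rough two-year expectation. The sketch below assumes prices fall by the same percentage every year (a constant geometric decline); that is an assumption made here for illustration, not something the source states, and the function and parameter names are hypothetical. Under that assumption, an 80% drop over five years implies a decline of roughly 47% after two years, which is in line with the roughly 50% fall reported above.

```python
def remaining_fraction(years: float,
                       total_drop: float = 0.80,
                       horizon_years: float = 5.0) -> float:
    """Fraction of the original rental price left after `years`,
    assuming a constant yearly rate of decline (an assumption)."""
    # Yearly retention factor that compounds to (1 - total_drop)
    # over the full horizon.
    annual_retention = (1 - total_drop) ** (1 / horizon_years)
    return annual_retention ** years


annual_decline = 1 - remaining_fraction(1)
decline_after_two = 1 - remaining_fraction(2)

print(f"Implied yearly decline: {annual_decline:.1%}")
print(f"Implied decline after two years: {decline_after_two:.1%}")
```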

Several factors are behind the drop in rental prices for Nvidia GPUs, chiefly Nvidia’s product cycle and shifts in supply and demand in the computing power market. This evolving market landscape is prompting rapid adjustments within the Chinese computing power industry.

Demand and supply for computing power

Nvidia’s declining rental prices coincide with the company’s transition from older to newer product lines.

One insider explained that Nvidia’s new Blackwell architecture chip, the GB200, offers lower unit computing costs compared to the H100. To save costs, many AI companies are choosing to wait for its launch, which has reduced demand for older models.

Jensen Huang, Nvidia’s CEO, described the new product as experiencing intense demand, likening supply allocation to walking a tightrope, with the risk of alienating key customers at every step.

However, despite the anticipation, Nvidia’s new chip has faced delays. Nvidia engineers have pointed to TSMC’s adoption of a new packaging technology, while TSMC claims that Nvidia’s accelerated production demands have cut validation time short. Originally scheduled for release in Q3 2024, the chip is now delayed to Q4 or possibly early 2025.

A chip industry analyst told 36Kr that, once the GB200 officially launches, rental prices for Nvidia’s older chips could drop further, and that prices are unlikely to stabilize within the next six months.

The current supply-demand imbalance in the computing power market is also contributing to the plunge in rental prices for Nvidia chips.

China’s approach to building its computing power sector contrasts with overseas models. Domestically, the industry has focused on constructing massive computing centers first, then developing AI applications to fill them, akin to a “hammer searching for a nail” approach. Internationally, the market is more commercially driven, with companies building computing centers in response to client demand.

Over the past two years, around 13,000 intelligent computing centers have sprung up across China, placing the nation second globally in computing power capacity at 246 exaflops by mid-2024, with intelligent computing power growing over 65% year-on-year.

This surge sparked a hoarding trend for Nvidia’s H100 chips in China. As these chips made their way into the country, often via Hong Kong and Singapore, some industry insiders noticed that demand for pre-training, once a major consumer of computing power, had waned.

Since early 2024, while demand for inference and model fine-tuning has grown, it hasn’t reached the explosive growth many anticipated. “We have yet to see a ‘killer app’ for AI or a clear application scenario,” one industry insider said.

Given an oversupply of computing power and fewer immediate applications, rental prices for Nvidia chips were bound to drop.

Shifting from buying to renting

Historically, computing power companies mainly sold Nvidia hardware, colloquially known as “selling metal.” However, with shifts in computing power demand, hardware-only models have grown less viable. As Nvidia’s rental prices have tumbled, the downstream AI sector has adjusted its approach to computing power, with companies moving toward renting rather than outright purchases to avoid significant capital investment and preserve cash flow.

Upstream computing power companies are also adapting by offering more flexible rental options.

Previously, AI firms renting Nvidia cards typically committed to multi-node setups with annual contracts. This year, however, clients have become more cost-conscious and dispersed, leading to a rise in demand for fractional rentals.

“Some computing centers now allow clients to rent just a few Nvidia cards for only a few hours,” said one industry source, likening the change to a shift from leasing an entire floor for a year to renting a single room by the hour.

This trend, however, means longer payback periods for computing power providers. A source estimated for 36Kr that an H100-based computing center can take over five years to break even.

Meanwhile, computing power companies are enhancing the granularity of their services and gradually expanding into higher levels of service, such as model and application layers.

36Kr reports that some operators now provide not only computing power but also model fine-tuning for downstream AI clients. They are also targeting computing-heavy sectors such as finance, healthcare, and renewable energy, identifying specific scenarios that could benefit from leasing additional computing power.

An industry expert noted that, by bundling various AI services, the payback period on hardware costs could be shortened to as little as two years.
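To see what the gap between a five-year and a two-year payback implies, the sketch below uses a simple undiscounted payback model (hardware cost divided by yearly net income). The model, the function names, and the placeholder cost of 100 units are assumptions for illustration; the article gives no hardware cost or income figures. Under this model, cutting the payback period from five years to two requires yearly net income per unit of hardware to rise about 2.5 times.

```python
def payback_years(hardware_cost: float, annual_net_income: float) -> float:
    """Simple undiscounted payback period in years."""
    return hardware_cost / annual_net_income


def required_income_multiple(current_payback: float,
                             target_payback: float) -> float:
    """How much yearly net income must grow, for the same hardware
    cost, to move from current_payback to target_payback."""
    return current_payback / target_payback


multiple = required_income_multiple(5, 2)

# Placeholder units only, not figures from the article.
hardware_cost = 100.0
rental_only_income = hardware_cost / 5  # gives a five-year payback
bundled_income = rental_only_income * multiple

print(f"Income multiple needed: {multiple}x")
print(f"Payback with bundling: "
      f"{payback_years(hardware_cost, bundled_income)} years")
```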

These adjustments reflect a maturing sector. As the dust settles after two years of rapid expansion, companies are adopting a more measured perspective toward Nvidia’s chips—assets that were once deemed irreplaceable.

KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Qiu Xiaofen for 36Kr.