Can Cloudflare's Edge AI Inference Reshape Cost Economics?
Key Takeaways
NET uses a custom Rust-based Infire engine to maximize GPU utilization and lower AI inference cost.
NET runs models closer to users, caching weights at the edge to cut latency and speed startups.
NET relies on off-the-shelf hardware to add capacity fast and generate revenues earlier.
Cloudflare’s (NET - Free Report) AI inference strategy differs from that of the hyperscalers. Instead of renting out server capacity and aiming to earn multiples on hardware costs, as hyperscalers do, Cloudflare maximizes system efficiency and utilization per capital-expenditure dollar spent on its infrastructure, optimizing the cost structure of AI inference.
While hyperscalers wrestle with the GPU utilization paradox, Cloudflare is maximizing GPU utilization and minimizing overhead costs. The company uses Infire, a custom large language model (LLM) inference engine written in Rust and built explicitly for its hardware and edge network.
The system uses fewer CPUs and GPUs for more throughput, runs models closer to users, and improves startup speed and efficiency, while hyperscalers still battle high latency, GPUs left underutilized by CPU limitations, and networking bottlenecks. Here’s how Cloudflare is solving these bottlenecks with Infire.
Infire consists of an OpenAI-compatible HTTP server, a batcher, and an LLM inference engine on which the models run. When a model is scheduled to run, Infire downloads its weights from R2 storage and caches them locally on the edge node, so subsequent loads are faster and inference can start quickly.
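Cloudflare has not published Infire’s source, and the engine itself is written in Rust, so the sketch below is illustrative only: a minimal Python version of the cache-or-download flow described above. The endpoint, bucket name, and cache path are hypothetical, and the sketch relies on the fact that R2 exposes an S3-compatible API, so a standard S3 client such as boto3 can talk to it.

```python
import os
import boto3  # R2 exposes an S3-compatible API, so a standard S3 client works

# All names below are hypothetical, for illustration only.
R2_ENDPOINT = "https://<account-id>.r2.cloudflarestorage.com"
BUCKET = "model-weights"
CACHE_DIR = "/var/cache/models"  # local disk on the edge node

s3 = boto3.client("s3", endpoint_url=R2_ENDPOINT)

def load_weights(model_id: str) -> str:
    """Return a local path to the model's weights, hitting R2 only on a cache miss."""
    local_path = os.path.join(CACHE_DIR, model_id)
    if not os.path.exists(local_path):      # cold start: fetch from R2 once
        os.makedirs(CACHE_DIR, exist_ok=True)
        s3.download_file(BUCKET, model_id, local_path)
    return local_path                       # warm start: weights already on disk
```

Because the weights stay on the node after the first download, repeat startups skip the network round trip entirely, which is what allows inference to begin quickly.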
Cloudflare’s supply chain is also highly optimized. The company uses off-the-shelf hardware, especially in tier-1 cities, which lets it set up quickly and start generating revenues before the hardware is fully paid for, lending it flexibility and fast response times when capacity needs to be added.
How Competitors Fare Against Cloudflare
When it comes to AI inference and edge deployment, Cloudflare’s strategy differs sharply from that of hyperscalers and traditional cloud and inference providers like Amazon (AMZN - Free Report) and Microsoft (MSFT - Free Report). The hyperscale data centers behind Amazon Web Services and Microsoft Azure are large-scale facilities built for high-volume data processing, data storage, and massive workloads. That approach, however, comes with higher power consumption and latency.
Amazon Web Services addresses this with Lambda@Edge, a feature of Amazon CloudFront that lets users run code closer to their application, which improves performance and reduces latency. Microsoft has taken a different tack with a hybrid cloud strategy that lets customers run AI workloads on-premises at the edge.
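To make the Lambda@Edge model concrete, here is a minimal viewer-request handler in Python, one of the runtimes Lambda@Edge supports. The event shape is the standard CloudFront event passed to such functions; the header name is hypothetical and purely illustrative.

```python
# Minimal Lambda@Edge viewer-request handler (Python runtime).
# CloudFront invokes this at the edge location nearest the viewer,
# before the request travels to the origin.
def handler(event, context):
    request = event["Records"][0]["cf"]["request"]

    # Example edge-side logic: tag the request with a custom header.
    # The header name is hypothetical, for illustration only.
    request["headers"]["x-served-at-edge"] = [
        {"key": "X-Served-At-Edge", "value": "true"}
    ]

    return request  # returning the request lets it continue to the origin
```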
NET Price Performance, Valuation and Estimates
Shares of Cloudflare have risen 9.9% in the past six months against the Zacks Internet – Software industry’s decline of 3.1%.
Cloudflare 6-Month Price Performance Chart
Image Source: Zacks Investment Research
From a valuation standpoint, Cloudflare trades at a forward price-to-sales ratio of 26.19X, much higher than the industry’s average of 4.86X.
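Put differently, at 26.19X versus the industry’s 4.86X, Cloudflare commands roughly a 5.4-times premium to the group multiple (26.19 / 4.86 ≈ 5.4).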
NET Forward 12-Month (P/S) Valuation Chart
Image Source: Zacks Investment Research
The Zacks Consensus Estimate for Cloudflare’s 2025 earnings implies year-over-year growth of 21.3%. The estimate for 2025 has been revised upward in the past 30 days.
Cloudflare currently carries a Zacks Rank #2 (Buy). You can see the complete list of today’s Zacks #1 Rank (Strong Buy) stocks here.