The GPU Bottleneck: Navigating Supply and Demand in AI Development

In this interview, I'm speaking with Ahmad Shadid. We'll discuss the critical importance of GPUs for AI development and the challenges posed by the current GPU shortage. We'll also touch on the shortage's broader impact on various industries and the exploration of alternative computing technologies.

Please introduce yourself and tell us what you do.

My name is Ahmad Shadid. I am the CEO and Co-Founder of io.net. Prior to io.net, I founded an institutional-grade quant trading firm called Dark Tick. I have a background in low-latency systems and ML engineering.

Can you explain why GPUs have become so critical for AI development, and what has driven the recent surge in demand?

GPU compute has undoubtedly been the focus of AI development, since GPUs' ability to process in parallel and handle larger computational workloads has made them invaluable to the industry. Sam Altman said it well: compute is possibly the most valuable resource in the world right now. The AI “arms race” has accelerated the shortage of GPUs as companies, and even governments, attempt to acquire as much compute capacity as possible to secure a lead in AI/ML development.

How do modern AI algorithms' computational requirements differ from traditional computing tasks, and why are GPUs particularly suited to meet these needs?

Simply put, GPU architecture means each GPU has thousands to tens of thousands of cores, while CPUs typically have tens of cores. This enables GPUs to process enormous data sets in parallel and train AI models effectively.

What inspired the founding of IO.NET, and what specific gaps or opportunities in the AI and computing landscape were you aiming to address?

io.net is a decentralized physical infrastructure network, or DePIN, that aggregates co-located and geographically distributed GPU and CPU nodes into a single network.
By aggregating underutilized supply across independent data centers, crypto miners, and consumer devices, io.net can offer fast, cheap, and, perhaps most importantly, flexible compute for AI/ML companies and other workloads. io.net was inspired by my experience while running Dark Tick. Since we ran models daily on over 200 tickers and tuned them in real time during the trading session, our quant trading firm was consuming enormous amounts of compute. To offset costs, we built our own “DePIN” and aggregated compute from multiple sources for our own use. When OpenAI launched a few years later, we noticed they also used a modified version of the same network architecture that Dark Tick used, Ray. We realized we had a huge opportunity to become a major player in decentralized cloud computing and solve the compute shortage problem that the AI industry faced.

Could you explain how IO.NET works, particularly how it enables more efficient use of GPU resources among AI developers and researchers?

To put it simply, io.net functions like an Airbnb for GPU compute, providing a decentralized version of AWS. Diving a little deeper, io.net is an aggregator that can cluster and virtualize thousands of GPUs across multiple locations to provide AI/ML-ready compute capacity. Since we’re tapping into underutilized or latent capacity, we can provide this compute at a much lower cost to AI engineers and researchers who need to train, tune, and run inference on their models.

Can you elaborate on the primary causes behind the massive GPU shortage in the public cloud, and how it's affecting companies relying on cloud services like AWS, GCP, or Azure?

The GPU shortage is simply due to the constraints of manufacturing GPUs while demand skyrockets. In its popular form, Moore’s law has hardware performance doubling roughly every eighteen months, while the demand for GPUs grows ten times over the same period.
Cloud computing providers simply cannot acquire hardware and deploy infrastructure quickly enough, and manufacturers like NVIDIA cannot produce chips fast enough. This means centralized cloud costs are increasing, wait times for cloud compute are growing longer, and builders have fewer flexible options (location, speed, hardware, etc.).

What would be the immediate and long-term implications of a GPU shortage on AI research and development?

As the GPU shortage worsens, we’ll simply see less innovation from startups, especially those that can’t access large amounts of VC funding or are attempting to bootstrap their companies. We’ll also see value and attention accrue to fewer companies, and a lack of innovation and participation from developing countries, since they will get priced out of the compute market.

Can you provide insights into how you plan to increase accessibility to GPUs for AI researchers and developers? Are there specific technologies or platforms you're developing to address this issue?

At its core, io.net’s differentiator is the network and orchestration layers that enable us to combine thousands of distributed GPUs into a single cluster and make that cluster ready for AI/ML workloads. Whether it’s model inference, tuning, or training, io.net can provide lower-cost compute to AI research and development by making previously inaccessible compute capacity available to builders.

What have been the most significant challenges in building IO.NET, especially in a rapidly evolving and highly competitive market?

Like any two-sided marketplace, getting past the cold start problem is always challenging. It took io.net three months to reach 25,000 GPUs in the network and less than one month after that to reach 300,000 GPUs. As the supply side grows, we can more rapidly grow the demand side and build on the network effects.
Part of this growth also highlighted areas of the platform that needed additional work to scale, and we’re glad that we could pressure-test our infrastructure with the community and customers before launching the full version of io.net at the end of April.

How could a prolonged GPU shortage affect industries increasingly reliant on AI technologies, such as healthcare, automotive, and finance?

The GPU shortage does have an impact beyond AI. If you think about industries that consume a lot of compute but are not immediately categorized as AI, you’ll find that some of the industries you mentioned, like healthcare, automotive, and finance, depend heavily on compute. You have medical research and pharmaceutical development in healthcare, self-driving vehicles in automotive, and quant trading and other modeling in finance, all of which require large amounts of compute power.

Are there viable alternatives to GPUs for AI development, such as TPUs or FPGAs? How do they compare in performance and cost-effectiveness for different types of AI applications?

There are some interesting alternatives being developed today, like TPUs, FPGAs, and LPUs, which are entering the market to address the GPU compute shortage. However, the problem isn’t simply GPUs; it’s that hardware production overall is falling short of demand growth, which is limiting access to that hardware and increasing costs for all builders.

How far are we from seeing a shift or diversification in the types of hardware used for AI tasks? What challenges need to be overcome to make alternative hardware more mainstream?

It’s too early to call it a shift away from GPUs for AI/ML workloads. There is definitely a lot of promising work being done, but at the end of the day, mass adoption and scaling up production are the problems.
It took NVIDIA years to fine-tune its hardware for AI use cases and then more years to build up its supply chain and manufacturing scale to the point where it could actually serve demand at scale.

Given the current trends, how do you foresee the balance between supply and demand for GPUs evolving in the next few years?

I don’t think we are near the end of the compute shortage. Compute is like digital oil, and we are in the middle of a technological industrial revolution. AI/ML for consumers has just started to take hold, and it’s already making an enormous splash across multiple industries.