
How to Prioritize AI Projects Amidst GPU Constraints

by Prerak Garg, August 6th, 2023

Too Long; Didn't Read

The rise of Generative AI, driven by large language models like GPT-4, is reshaping tech industry strategies. Incorporating AI features via LLMs has become crucial, but a challenge arises due to the ongoing GPU shortage and high costs. The demand for high-end GPUs (e.g., A100s, H100s) for AI services has overwhelmed manufacturers, causing a supply shortage. Even major cloud platforms like AWS and Azure have had to implement quota systems. GPU shortage is affecting OpenAI's ChatGPT advancement, hindering API availability and larger "context windows." Tech product leaders face a dilemma: delivering AI-powered features while dealing with GPU constraints. They must strategically prioritize products using a new framework based on Contribution Per GPU. The proposed framework involves identifying metrics like Revenue, Market Share, and Daily Active Users, calculating Contribution Per GPU, and prioritizing products accordingly. While this approach provides strategic clarity and objectivity, it may not capture all strategic aspects. Exceptional cases should be considered thoughtfully. The GPU shortage challenge can be turned into an opportunity by using the Contribution Per GPU framework, helping companies maximize ROI and focus on long-term success.


Rise of Generative AI and GPU shortage

Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT's meteoric rise has prompted the global tech industry to reassess and prioritize Generative AI, reshaping product strategies in real time.


Integrating LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it's not all smooth sailing. One challenge looms large for product leaders: the GPU shortage and its spiraling costs.


The growing number of AI startups and services has created intense demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to keep up. Online forums like Reddit are abuzz with frustration over GPU availability, echoing the sentiment across the tech community. The situation has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn't just squeeze startups; it’s a stumbling block for tech giants like OpenAI.


At a recent off-the-record meeting in London, OpenAI's CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s advancement. Altman reportedly lamented that the dearth of computing power has resulted in subpar API availability and has obstructed OpenAI from rolling out larger "context windows" for ChatGPT.


Prioritizing AI Features

On one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of Generative AI. On the other, they grapple with the harsh realities of GPU capacity constraints. It's a complex juggling act in which ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. That method, logical in a world where computational resources were abundant, now demands reevaluation. In the current paradigm, where computing rather than software talent is the constraint, product leaders must redefine how they prioritize products and features, bringing GPU limitations to the forefront of strategic decision-making.


Planning around capacity constraints might seem unusual for the tech industry, but it's a commonplace strategy in other industries. The underlying concept is straightforward: time spent on the constrained resource is the most valuable thing you have, and the objective is to maximize the value per unit of time spent on that constraint. As a former consultant, I've successfully applied this framework across various industries, and I believe tech product leaders can use a similar approach to prioritize products or features while GPU constraints persist. When applying the framework, the most straightforward measure of value is profitability. In tech, however, profitability isn't always the right metric, particularly when venturing into a new market or product. I have therefore adapted the framework to the success metrics generally used in tech, outlining a simple four-step process:


  1. Contribution: First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its worth. Some concrete examples might include:
  • An increase in Revenue and Profit

  • Gains in Market Share

  • Growth in the Number of Daily/Monthly Active Users


  2. Number of GPUs Required: Gauge the number of GPUs needed for each product or feature. Focus on key factors such as:
  • Number of Queries per User per Day

  • Number of Daily Active Users

  • The complexity of the Query (i.e., how many tokens each query consumes)


  3. Calculate Contribution per GPU: Break it down to the specifics. How does each GPU contribute to the overall goal? Understanding this will give you a clear picture of where your GPUs are best allocated.


  4. Prioritize Products Based on Contribution per GPU: Now it's time to make the tough decisions. Rank your products by their Contribution per GPU and fund them in that order, starting with the highest Contribution per GPU, so that your limited resources are channeled into the areas where they'll make the most impact (see the code sketch after this list).
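To make these steps concrete, here is a minimal sketch in Python. Everything in it is illustrative: the class and function names are mine, the sizing formula in estimate_gpus_required is a deliberately crude back-of-the-envelope calculation, and tokens_per_gpu_per_day is a placeholder you would replace with the throughput your serving stack actually achieves.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A product or feature competing for GPUs."""
    name: str
    contribution: float   # Step 1: the North Star metric (e.g., revenue in $M, market share in %)
    gpus_required: int    # Step 2: estimated GPUs needed to serve it

    @property
    def contribution_per_gpu(self) -> float:
        # Step 3: how much of the North Star metric each GPU buys you
        return self.contribution / self.gpus_required


def estimate_gpus_required(daily_active_users: int,
                           queries_per_user_per_day: float,
                           tokens_per_query: int,
                           tokens_per_gpu_per_day: float) -> int:
    """Rough sizing from the factors in step 2 (illustrative only)."""
    daily_tokens = daily_active_users * queries_per_user_per_day * tokens_per_query
    return max(1, math.ceil(daily_tokens / tokens_per_gpu_per_day))


def prioritize(candidates: list[Candidate], gpu_budget: int) -> list[Candidate]:
    """Step 4: rank by Contribution per GPU and fund greedily until the budget runs out."""
    ranked = sorted(candidates, key=lambda c: c.contribution_per_gpu, reverse=True)
    funded, remaining = [], gpu_budget
    for c in ranked:
        if c.gpus_required <= remaining:
            funded.append(c)
            remaining -= c.gpus_required
    return funded
```

The greedy pass mirrors step 4 of the framework. For a handful of products it is usually good enough, though strictly speaking GPU allocation is a knapsack problem, so an exhaustive search could occasionally do better at the margins.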


With GPU constraints no longer a blind spot, but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let's visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:



|                                  | Product A | Product B  | Product C | Product D |
|----------------------------------|-----------|------------|-----------|-----------|
| Revenue Potential (Contribution) | $100M     | $80M       | $50M      | $25M      |
| Number of GPUs Required          | 1000      | 450        | 500       | 50        |
| Contribution Per GPU             | $0.1M/GPU | $0.18M/GPU | $0.1M/GPU | $0.5M/GPU |


Although Product A has the highest revenue potential, it doesn't yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you could maximize total potential revenue.

Let's say you have a total of 1000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100M. However, applying the prioritization strategy described above, you could achieve $155M in revenue:

| Priority Order | Product   | Revenue Gain | GPUs |
|----------------|-----------|--------------|------|
| 1              | Product D | $25M         | 50   |
| 2              | Product B | $80M         | 450  |
| 3              | Product C | $50M         | 500  |
| Total          |           | $155M        | 1000 |
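For what it's worth, running the earlier sketch on this scenario (contributions in $M) reproduces the same allocation and the $155M total:

```python
products = [
    Candidate("Product A", contribution=100, gpus_required=1000),
    Candidate("Product B", contribution=80,  gpus_required=450),
    Candidate("Product C", contribution=50,  gpus_required=500),
    Candidate("Product D", contribution=25,  gpus_required=50),
]
funded = prioritize(products, gpu_budget=1000)
print([c.name for c in funded])             # ['Product D', 'Product B', 'Product C']
print(sum(c.contribution for c in funded))  # 155, i.e., $155M
```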

The same method can be applied to other contribution metrics, such as market share gain:


|                                  | Product A  | Product B  | Product C  | Product D  |
|----------------------------------|------------|------------|------------|------------|
| Market Share Gain (Contribution) | 5%         | 4%         | 2.5%       | 1.25%      |
| Number of GPUs Required          | 1000       | 450        | 500        | 50         |
| Contribution Per GPU             | 0.005%/GPU | 0.009%/GPU | 0.005%/GPU | 0.025%/GPU |

Similarly, selecting Product A alone would have led to a market share gain of 5%, whereas applying the prioritization strategy described above yields a 7.75% gain:

| Priority Order | Product   | Market Share Gain | GPUs |
|----------------|-----------|-------------------|------|
| 1              | Product D | 1.25%             | 50   |
| 2              | Product B | 4%                | 450  |
| 3              | Product C | 2.5%              | 500  |
| Total          |           | 7.75%             | 1000 |
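Because the sketch treats contribution as an opaque number, switching the North Star metric only changes the inputs, not the code:

```python
# Same helper, different metric: contribution is now market share gain in percentage points.
products = [
    Candidate("Product A", contribution=5.0,  gpus_required=1000),
    Candidate("Product B", contribution=4.0,  gpus_required=450),
    Candidate("Product C", contribution=2.5,  gpus_required=500),
    Candidate("Product D", contribution=1.25, gpus_required=50),
]
funded = prioritize(products, gpu_budget=1000)
print(sum(c.contribution for c in funded))  # 7.75 (% market share gain)
```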


Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on the Contribution Per GPU, you're strategically aligning resources where they can make the most substantial difference, whether in terms of revenue, market share, or any other defining metric.


But the advantages don't stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It's enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.


However, no one-size-fits-all solution exists, and it's worth acknowledging the potential limitations of this method. For instance, the approach may not always capture the strategic importance of certain investments. Exceptions to the framework can and should be made, but they ought to be carefully considered rather than becoming the norm. This maintains the integrity of the process and ensures that any deviations are made with the broader strategic context in mind.


Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, and it calls for new ways of managing resources. In the words of the great strategist Sun Tzu, "In the midst of chaos, there is also opportunity." The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed framework offers a strategic way to make those calls: by zeroing in on Contribution Per GPU, companies can maximize their return on investment, aligning resources where they'll make the most impact and focusing on what matters most to their long-term success.


References

https://nypost.com/2023/06/07/openais-sam-altman-complained-chip-shortage-is-delaying-chatgpt-plans/