
How to Prioritize AI Projects Amidst GPU Constraints

by Prerak Garg, August 6th, 2023

Too Long; Didn't Read

The rise of Generative AI, driven by large language models (LLMs) like GPT-4, is reshaping tech industry strategies. Incorporating AI features via LLMs has become crucial, but the ongoing GPU shortage and high costs stand in the way. Demand for high-end GPUs (e.g., A100s, H100s) for AI services has overwhelmed manufacturers, causing a supply shortage. Even major cloud platforms like AWS and Azure have had to implement quota systems. The GPU shortage is also slowing the advancement of OpenAI's ChatGPT, hindering API availability and the rollout of larger "context windows." Tech product leaders face a dilemma: delivering AI-powered features while dealing with GPU constraints. They must strategically prioritize products using a new framework based on Contribution Per GPU. The proposed framework involves identifying a contribution metric such as Revenue, Market Share, or Daily Active Users, calculating Contribution Per GPU, and prioritizing products accordingly. While this approach provides strategic clarity and objectivity, it may not capture all strategic aspects, so exceptional cases should be considered thoughtfully. With the Contribution Per GPU framework, the GPU shortage can be turned into an opportunity, helping companies maximize ROI and focus on long-term success.


Rise of Generative AI and GPU shortage

Generative AI, enabled by large language models (LLMs) like GPT-4, has caused shockwaves in the tech world. ChatGPT's meteoric rise has triggered the global tech industry to reassess and prioritize Generative AI, reshaping product strategies in real-time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it's not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

The increasing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. The situation has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn't just squeeze startups; it’s a stumbling block for tech giants like OpenAI.

At a recent off-the-record meeting in London, OpenAI's CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s advancement. Altman reportedly lamented that the dearth of computing power has resulted in subpar API availability and has obstructed OpenAI from rolling out larger "context windows" for ChatGPT.

Prioritizing AI Features

On one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of Generative AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It's a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now demands reevaluation. In the current paradigm, where compute rather than software talent is the constraint, product leaders must redefine how they prioritize products and features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints might seem unusual for the tech industry, but it's a commonplace strategy in other industries. The underlying concept is straightforward: the most valuable factor is the time spent on the constrained resource, and the objective is to optimize the value per unit of time spent on that constraint. As a former consultant, I've successfully applied this framework across various industries. I believe that tech product leaders can use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability. However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I have adapted the framework to align with the success metrics generally used in tech, outlining a simple four-step process:

1. Contribution: First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its worth. Some concrete examples might include:

An increase in Revenue and Profit
Gains in Market Share
Growth in the Number of Daily/Monthly Active Users

2. Number of GPUs Required: Gauge the number of GPUs needed for each product or feature. Focus on key factors such as:

Number of Queries per User per Day
Number of Daily Active Users
The complexity of the Query (i.e., how many tokens each query consumes)

3. Calculate Contribution per GPU: Break it down to the specifics. How does each GPU contribute to the overall goal? Understanding this will give you a clear picture of where your GPUs are best allocated.
4. Prioritize Products Based on Contribution per GPU: Now, it's time to make the tough decisions. Rank your products by their Contribution per GPU, and then line them up accordingly. Focus on the products with the highest Contribution per GPU first, ensuring that your limited resources are channeled into the areas where they'll make the most impact.
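
To make the second and third steps more concrete, here is a minimal Python sketch of how the GPU estimate and the Contribution per GPU might be computed. The per-GPU throughput (`tokens_per_gpu_per_second`), the peak-to-average traffic ratio, and all input numbers are illustrative assumptions rather than figures from a real deployment; actual throughput depends heavily on the model, the hardware, and the serving setup.

```python
import math

SECONDS_PER_DAY = 86_400


def gpus_required(daily_active_users, queries_per_user_per_day, tokens_per_query,
                  tokens_per_gpu_per_second, peak_to_average=1.0):
    """Rough number of GPUs needed to serve a feature's token volume.

    tokens_per_gpu_per_second and peak_to_average are assumed inputs that
    you would have to measure for your own model and traffic pattern.
    """
    tokens_per_day = daily_active_users * queries_per_user_per_day * tokens_per_query
    average_tokens_per_second = tokens_per_day / SECONDS_PER_DAY
    peak_tokens_per_second = average_tokens_per_second * peak_to_average
    return math.ceil(peak_tokens_per_second / tokens_per_gpu_per_second)


def contribution_per_gpu(contribution, gpus):
    """Contribution (revenue, market share, users, ...) earned per GPU."""
    return contribution / gpus


# Hypothetical feature: 1M daily active users, 10 queries per user per day,
# 1,000 tokens per query, 500 tokens/s per GPU, peak traffic at 2x the average.
gpus = gpus_required(1_000_000, 10, 1_000,
                     tokens_per_gpu_per_second=500, peak_to_average=2.0)
print(gpus)  # 463 under these assumptions
```

Sizing for peak rather than average traffic is a design choice in this sketch; a team that can queue or batch requests might size closer to the average instead.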

With GPU constraints no longer a blind spot, but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let's visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

	Product A	Product B	Product C	Product D
Revenue Potential (Contribution)	$100M	$80M	$50M	$25M
Number of GPUs Required	1000	450	500	50
Contribution Per GPU	$0.1M/GPU	$0.18M/GPU	$0.1M/GPU	$0.5M/GPU

Although Product A has the highest revenue potential, it doesn't yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you could maximize total potential revenue.

Let's say you have a total of 1000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100M. However, applying the prioritization strategy described above, you could achieve $155M in revenue:

Priority Order	Product	Revenue Gain	GPUs
1	Product D	$25M	50
2	Product B	$80M	450
3	Product C	$50M	500
Total		$155M	1000
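
The ranking above can be reproduced with a short Python sketch. It assumes a simple greedy rule: sort by Contribution per GPU, then fund each product in turn if it still fits within the remaining GPU budget. The `Product` class and `prioritize` function are hypothetical names used only for illustration, and the numbers are the ones from the table.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    contribution: float  # revenue potential in $M
    gpus: int            # GPUs required

    @property
    def contribution_per_gpu(self):
        return self.contribution / self.gpus


def prioritize(products, gpu_budget):
    """Fund products in descending Contribution per GPU while they fit."""
    ranked = sorted(products, key=lambda p: p.contribution_per_gpu, reverse=True)
    selected, remaining = [], gpu_budget
    for product in ranked:
        if product.gpus <= remaining:
            selected.append(product)
            remaining -= product.gpus
    return selected


products = [
    Product("Product A", 100, 1000),
    Product("Product B", 80, 450),
    Product("Product C", 50, 500),
    Product("Product D", 25, 50),
]

selected = prioritize(products, gpu_budget=1000)
total_revenue = sum(p.contribution for p in selected)
# Selects Product D, Product B, and Product C for $155M on 1000 GPUs.
print([p.name for p in selected], total_revenue)
```

Products A and C tie at $0.1M per GPU; Product A is skipped here only because its 1000 GPUs no longer fit once D and B are funded.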

The same method can be applied to other contribution metrics, such as market share gain:

	Product A	Product B	Product C	Product D
Market Share Gain (Contribution)	5%	4%	2.5%	1.25%
Number of GPUs Required	1000	450	500	50
Contribution Per GPU	0.005%/GPU	0.0089%/GPU	0.005%/GPU	0.025%/GPU

Similarly, selecting Product A alone would yield a market share gain of 5%. However, applying the prioritization strategy described above, you could achieve a 7.75% market share gain:

Priority Order	Product	Market Share Gain	GPUs
1	Product D	1.25%	50
2	Product B	4%	450
3	Product C	2.5%	500
Total		7.75%	1000
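
Ranking by Contribution per GPU is a heuristic, so on a small portfolio it can be worth checking the result against every possible combination. The Python sketch below does that for the market share figures, using the 450 GPUs for Product B shown in the priority table. The `best_subset` function is a hypothetical helper written for this illustration, and brute force like this is only practical for a handful of products.

```python
from itertools import combinations

# (market share gain in percentage points, GPUs required)
products = {
    "Product A": (5.0, 1000),
    "Product B": (4.0, 450),
    "Product C": (2.5, 500),
    "Product D": (1.25, 50),
}


def best_subset(products, gpu_budget):
    """Try every combination and keep the highest gain that fits the budget."""
    names = list(products)
    best_names, best_gain = (), 0.0
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            gpus = sum(products[name][1] for name in combo)
            gain = sum(products[name][0] for name in combo)
            if gpus <= gpu_budget and gain > best_gain:
                best_names, best_gain = combo, gain
    return best_names, best_gain


best_names, best_gain = best_subset(products, gpu_budget=1000)
# Products B, C, and D together give 7.75 points on exactly 1000 GPUs.
print(best_names, best_gain)
```

In this example the exhaustive search agrees with the ranking; with other numbers the two can differ, which is one reason to treat the ranking as a starting point for discussion.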

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on the Contribution Per GPU, you're strategically aligning resources where they can make the most substantial difference, whether in terms of revenue, market share, or any other defining metric.

But the advantages don't stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It's enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it's worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Exceptions to the framework can and should be made, but they ought to remain carefully considered exceptions rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with the broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, and they need new ways of managing resources. In the words of the great strategist Sun Tzu, "In the midst of chaos, there is also opportunity." The GPU shortage is indeed a challenge, but with the right approach, it may also be a catalyst for differentiation and success. The proposed framework, built on Contribution Per GPU, offers a strategic way to prioritize: companies can maximize their return on investment by aligning resources where they'll make the most impact and focusing on what matters most to their long-term success.

References

https://nypost.com/2023/06/07/openais-sam-altman-complained-chip-shortage-is-delaying-chatgpt-plans/