Summary: An Interview with Nvidia CEO Jensen Huang About AI's iPhone Moment – Stratechery by Ben Thompson (stratechery.com)
9,437 words · HTML page
One Line
Nvidia CEO Jensen Huang discusses the prospect of inference becoming the primary way software is operated, Nvidia's interest in developing consumer AI GPUs and smaller language models that could run on cell phones, and how diversity, redundancy, data centers, and cloud services figure into the company's new business model.
Key Points
- ChatGPT and AI's "iPhone moment" have increased demand for generative AI models, leading to an acceleration in demand for training and inference.
- Nvidia is responding to the demand by working on large language models and inference platforms, but faces constraints in meeting demand due to factors such as chip production, data center availability, and customer demand.
- Nvidia is shifting towards delivering its system as a whole and working closely with cloud providers to optimize performance for customers, launching with Oracle as its first cloud partner, treating clouds as an OEM-like layer.
Summaries
194 word summary
Nvidia CEO Jensen Huang discusses the prospect of inference becoming the primary way software is operated, with generative AI eventually available on every computer. Nvidia is interested in developing consumer AI GPUs and smaller, more performant versions of large language models that could run on cell phones within 10 years. The interview highlights the company's commitment to working with industries to use its NeMo and Picasso models with their own data. Nvidia's new business model involves working directly with customers to accelerate their end-to-end MLOps platforms and host their applications in the cloud. Huang emphasizes the importance of diversity and redundancy in building resilience for large companies, and of investing in fabs and interconnects for scaling up computing systems. He also points to data centers as the future of computing and to Nvidia's move into cloud services. The impact of ChatGPT and AI's "iPhone moment" on Nvidia's business is discussed: an acceleration in demand for training and inference has left Nvidia constrained by factors such as chip production, data center availability, and customer demand. Huang emphasizes the importance of being strict about inventory and purchase order obligations.
481 word summary
Nvidia CEO Jensen Huang discusses the impact of ChatGPT and AI's "iPhone moment" on Nvidia's business in an interview conducted in March 2023. ChatGPT is an easy-to-use application with incredible generative AI capabilities that caused reverberations in every industry, waking it up to the potential of generative AI. Demand for generative AI models has increased as they are integrated into large applications such as Microsoft Office and Google Docs, and there is an urgency to train larger models and create supporting models for fine-tuning, alignment, guardrailing, and augmentation. This has led to an acceleration in demand for training and inference, and Nvidia is responding by working on large language models and inference platforms. The construction of AI supercomputers involves thousands of components and is a heavy process, and the company faces constraints in meeting demand due to factors such as chip production, data center availability, and customer demand. Nvidia had to take write-downs on gaming and data center products after a disappointing year in sales, and Huang emphasizes the importance of being strict about inventory and purchase order obligations.
Huang also discusses the company's ability to quickly change course and develop new platforms for different use cases in the inference business. He emphasizes the importance of diversity and redundancy in building resilience for large companies, and of investing in fabs and interconnects for scaling up computing systems. Huang highlights data centers as the future of computing and Nvidia's move into cloud services. The company is shifting towards delivering its system as a whole and working closely with cloud providers to optimize performance for customers, launching with Oracle as its first cloud partner, treating clouds as an OEM-like layer.
The interview highlights the company's commitment to working with industries to use its NeMo and Picasso models with their own data. Huang emphasizes that Nvidia wants to work with everyone, including AWS. The interview also discusses Nvidia's new business model of working directly with customers to accelerate their end-to-end MLOps platforms and host their applications in the cloud. Nvidia can help accelerate end-to-end processes and refine platforms for both established companies and newcomers to the field. Huang recommends cloud services like SageMaker or Azure ML for those new to machine learning or AI, unless they require a bespoke framework platform. Finally, Huang discusses Nvidia's focus on solving problems that only it can solve, rather than competing with others on things that everyone can do.
Huang also discusses the future potential of inference becoming the primary way software is operated, with generative AI eventually available on every computer. Inference will be done both on local devices and in the cloud, and Nvidia is interested in developing consumer AI GPUs and smaller, more performant versions of large language models that could run on cell phones within 10 years.
1660 word summary
Nvidia CEO Jensen Huang discusses the advancement of Moore's Law and the potential for inference to become the primary way software is operated in the future. He predicts that generative AI will eventually be available on every computer and that inference will be done both on local devices and in the cloud. Huang also mentions Nvidia's interest in developing consumer AI GPUs and the potential for smaller, more performant versions of large language models to run on cell phones within 10 years. On the question of centralized versus localized compute for AI, Huang explains that Nvidia focuses on solving problems that only it can solve, rather than competing with others on things that everyone can do. Nvidia celebrates the success of its partners, whether they use CUDA or not, and prioritizes making it easy and cost-effective for them to achieve their goals. Huang also addresses concerns about commoditization and efforts by companies like Meta to expand PyTorch. He discusses the benefits of working with Nvidia for applications that require high-quality generative AI video-based storytelling; for other applications, such as spell checkers, he recommends cloud services that are already GPU-accelerated. Huang believes most companies should go directly to the cloud and work with their partners to ensure infrastructure and services are accelerated, but for those who need experts to develop acceleration layers or algorithms, working directly with Nvidia is beneficial.
He also suggests cloud services like SageMaker or Azure ML for those new to machine learning or AI, unless they require a bespoke framework platform, and notes that the explosion of innovation in AI may occur in a fundamentally different layer of the stack going forward, further up on top of it. The interview covers Nvidia's role in assisting companies with accelerated computing and AI. There are two groups of customers: established companies that Nvidia has long been assisting and newcomers to the field. Nvidia can help accelerate end-to-end processes and refine platforms. It works with large platforms like Amazon, Microsoft, and Google's Vertex AI, but also assists smaller companies without large engineering teams, and is well positioned to help companies that have only recently realized the need for AI and generative AI. Huang discusses the company's new business model of working directly with customers to accelerate their end-to-end MLOps platforms and host their applications in the cloud; because DGX Cloud puts Nvidia in the browser, the company can engage directly with customers. Huang also notes that the company already works directly with end users and developers in industries such as healthcare, automotive, and video games. Fulfillment of the system has historically been through someone else, but now Nvidia can fulfill it directly or through another CSP or OEM. The new business model gives Nvidia an opportunity to generate subscription revenue instead of relying solely on product sales. Huang also emphasizes the company's commitment to supporting industries in using its NeMo and Picasso models with their own data, and says that Nvidia wants to work with everyone, including HP, Dell, Lenovo, and AWS. While AWS was notably absent from Nvidia's recent joint press release, Huang believes AWS has an interest in continuing to partner with Nvidia deeply.
The interview also touches on the technical aspects of DGX Cloud and how Nvidia will intermediate relationships with host providers for its customers. Huang discusses the collaboration between Nvidia and cloud providers to optimize performance for customers: the architecture is not diverse enough for there to be a noticeable difference between cloud providers, though there may be slight differences in experience. Nvidia works closely with each cloud provider to integrate its software architecture into the provider's system. Oracle is a good example of a cloud provider that will build the full Nvidia stack, while AWS has its own competitive Nitro layer; Nvidia is launching with Oracle as its first cloud partner, treating clouds as an OEM-like layer. Huang describes the company's shift towards delivering its system as a whole, rather than piecemeal through different clouds. This is the company's largest business model extension and involves a large and growing service organization to help people with their models. Nvidia works closely with cloud service providers (CSPs) and their salesforces and marketing to offer fully optimized computers compatible with its software stack. The goal is to extend Nvidia's architecture to all CSPs and provide the same software stack on any cloud, multi-cloud, hybrid cloud, or at the edge. Nvidia integrates its system into various companies and works with them to understand their needs and APIs, which allows Nvidia to be a vertically integrated systems company while still connecting with the world. Huang discusses the importance of data centers as the future of computing and how Nvidia has built a vertically integrated system that operates from the cloud to the edge. He emphasizes the need for a software-defined system that can orchestrate the entire fleet of computers inside data centers as if it were one, with a separation of the compute plane and the control plane.
Huang also discusses Nvidia's move into cloud services, such as DGX Cloud and Omniverse Cloud, which were pre-announced in previous interviews. Transparency is important to Nvidia's partners who depend on them, and their goal is to build a computing platform that's available everywhere. The CEO of Nvidia, Jensen Huang, discussed the importance of diversity and redundancy in building resilience for large companies. This includes investing in fabs (fabrication plants) in the United States and elsewhere, which can be more expensive but necessary for supply chain resilience. Huang also talked about the importance of interconnects for scaling up computing systems and the trade-off between speed and effectiveness. He also addressed the limitations of fused chips in complying with export controls, but assured that they still serve the needs of customers and run the same software. The interview with Nvidia CEO Jensen Huang discusses the company's ability to quickly change course, such as taking A100s and changing them to A800s for China, which was possible due to the speed of the chip being fine and the limitation being the memory and interconnect speed. The core technology for inference products at scale for data centers exists, and four new platforms have been developed for different use cases, including large language models and video editing at full film quality scale with generative AI. The scale of inference business has gone through a step function, and Nvidia is racing to meet demand while also focusing on generative AI work done in the cloud. The interview also touches on changes in the way Nvidia thinks about the business post-ChatGPT. Nvidia CEO, Jensen Huang, discusses the challenges of building AI supercomputers, including the need for switches, NICs, cables, and data center space. The construction of these supercomputers involves thousands of components and is a heavy process. 
The company faces constraints in meeting demand due to factors such as chip production, data center availability, and customer demand, and had to take write-downs on gaming and data center products after a disappointing year in sales. Huang emphasizes the importance of being strict about inventory and purchase order obligations. Demand for generative AI models has increased due to their integration into large applications such as Microsoft Office and Google Docs, and there is an urgency to train larger models and create supporting models for fine-tuning, alignment, guardrailing, and augmentation. This has led to an acceleration in demand for training and inference, and Nvidia is responding by working on large language models and inference platforms. The ChatGPT model is a significant development that allows programming with natural language and represents a phase shift in computing; companies are now considering the implications for their industries, competition, products, and business models. Huang discusses the three properties of computing platforms and how they apply to AI. ChatGPT and generative AI have driven an inflection point in AI adoption and created a new computing model, and the accessibility of this new computer and the applications that can be built with it are brand new. Understanding the language of proteins and chemicals can lead to new opportunities for companies. ChatGPT, AI's "iPhone moment", has opened people's minds to the possibilities of AI: executives who were not previously engaged with AI are now interested in it, and large language models will learn the language of everything that has structure, including the physical world. Huang was surprised by the effectiveness and widespread use of ChatGPT, an easy-to-use application with incredible generative AI capabilities that caused reverberations in every industry and woke it up to the potential of generative AI.
Within 60 days, hundreds of startups were created and VCs were funding them. ChatGPT is unquestionably the easiest-to-use application of its kind, performing tasks that consistently surprise just about everyone; its impact has dramatically changed views on the potential of generative AI. In this interview, conducted in March 2023 on the occasion of this week's GTC conference, Nvidia CEO Jensen Huang discusses the impact of ChatGPT and what he calls AI's iPhone moment on Nvidia's business. He also touches on the announcement of Nvidia's new DGX Cloud service, how Nvidia responded to the Biden administration's export controls, TSMC's new plant in Arizona, running AI locally, and Nvidia's position in the stack in an LLM world. The interview is lightly edited for clarity. The frequency of Nvidia's semiannual conferences might seem aggressive, but given Nvidia's central role in AI, Huang notes that their last talk seems like it was years ago.