
AWS and NVIDIA announce strategic partnership to deliver new supercomputing infrastructure, software and services for generative AI

AWS delivers first cloud AI supercomputer with NVIDIA Grace Hopper Superchip and AWS UltraCluster scalability
・NVIDIA DGX Cloud, featuring NVIDIA GH200 NVL32 for the first time, is coming to AWS
・The two companies will partner on Project Ceiba, the world's fastest GPU-powered AI supercomputer and the latest NVIDIA DGX Cloud supercomputer, for NVIDIA's AI research and development and custom model building
・New Amazon EC2 instances powered by NVIDIA GH200, H200, L40S and L4 GPUs will accelerate generative AI, HPC, design and simulation workloads
・NVIDIA software on AWS, including the NeMo LLM framework, NeMo Retriever, and BioNeMo, will power custom model development, semantic search, and drug discovery


Las Vegas, November 28, 2023 – At AWS re:Invent, Amazon Web Services, Inc., an Amazon.com, Inc. (NASDAQ: AMZN) company, and NVIDIA (NASDAQ: NVDA) announced an expansion of their strategic partnership to provide customers with the latest infrastructure, software and services to accelerate innovation in generative AI. The companies will bring together the best of NVIDIA and AWS technologies: NVIDIA's latest multi-node systems with next-generation GPUs, CPUs and AI software; state-of-the-art virtualization and security with the AWS Nitro System; and the scalability of Elastic Fabric Adapter (EFA) interconnects and UltraClusters. Together they provide an ideal solution for training foundation models and building generative AI applications.
This expanded partnership builds on the companies' long-standing relationship, which has powered the generative AI era by providing early machine learning (ML) pioneers with the compute performance needed to advance the state of the art.
Our expanded partnership to accelerate generative AI across all industries includes:
・AWS will be the first cloud provider to bring the NVIDIA(R) GH200 Grace Hopper Superchip with new multi-node NVLink(TM) technology to the cloud. The NVIDIA GH200 NVL32 multi-node platform connects 32 Grace Hopper Superchips into a single instance using NVIDIA NVLink and NVSwitch(TM) technology. The platform will be available on Amazon Elastic Compute Cloud (Amazon EC2) instances connected with Amazon's powerful networking (EFA), advanced virtualization (AWS Nitro System), and hyperscale clustering (Amazon EC2 UltraClusters), enabling mutual customers to scale to thousands of GH200 Superchips.
・NVIDIA and AWS announced that NVIDIA DGX(TM) Cloud (https://www.nvidia.com/ja-jp/data-center/dgx-cloud/), NVIDIA's AI-training-as-a-service, will be hosted on AWS. It will be the first DGX Cloud to feature the GH200 NVL32, giving developers the largest shared memory available in a single instance. DGX Cloud on AWS will accelerate the training of cutting-edge generative AI and large language models with over 1 trillion parameters.
・NVIDIA and AWS will collaborate on Project Ceiba to build the world's fastest GPU-powered AI supercomputer. This large-scale system, powered by GH200 NVL32 and Amazon EFA interconnects, will be hosted by AWS for NVIDIA's own research and development team. The first-of-its-kind supercomputer will be equipped with 16,384 NVIDIA GH200 Superchips, capable of 65 exaflops of AI processing, and will be used by NVIDIA to power its next wave of generative AI innovation.
・AWS will introduce three new Amazon EC2 instances. P5e instances, equipped with NVIDIA H200 Tensor Core GPUs (https://www.nvidia.com/ja-jp/data-center/h200/), target large-scale, cutting-edge generative AI and HPC workloads. G6 and G6e instances feature NVIDIA L4 GPUs (https://www.nvidia.com/ja-jp/data-center/l4/) and NVIDIA L40S GPUs (https://www.nvidia.com/ja-jp/data-center/l40s/) respectively, supporting a wide range of applications such as AI fine-tuning, inference, and graphics and video workloads. G6e instances are particularly suited to developing 3D workflows, digital twins and other generative-AI-enabled 3D applications using NVIDIA Omniverse(TM) (https://www.nvidia.com/ja-jp/omniverse/), a platform for building and connecting 3D applications.
Adam Selipsky, CEO of AWS, said: “AWS and NVIDIA have partnered for more than 13 years, starting with the world’s first GPU cloud instance. Today, we offer the broadest range of NVIDIA GPU solutions for workloads including graphics, gaming, high performance computing, machine learning, and now generative AI. We continue to innovate with NVIDIA to make AWS the best place to run GPUs, combining the next-generation NVIDIA Grace Hopper Superchip with AWS’s powerful EFA networking, hyperscale clustering with EC2 UltraClusters, and advanced virtualization with the Nitro System.”
Jensen Huang, founder and CEO of NVIDIA, said: “Generative AI is transforming cloud workloads, with accelerated computing becoming the foundation for diverse content generation. With a shared mission to deliver for our customers, NVIDIA and AWS will partner across the entire compute stack, from AI infrastructure, acceleration libraries, and foundational models to generative AI services.”
New Amazon EC2 instances combine cutting-edge technology from NVIDIA and AWS
AWS will become the first cloud provider to offer the NVIDIA GH200 Grace Hopper Superchip with multi-node NVLink technology. Each GH200 Superchip combines an Arm-based Grace CPU and an NVIDIA Hopper(TM) architecture GPU on the same module. A single Amazon EC2 instance with GH200 NVL32 provides up to 20 TB of shared memory, enabling terabyte-scale workloads.
These instances leverage AWS's third-generation Elastic Fabric Adapter (EFA) interconnect to deliver low-latency, high-bandwidth networking throughput of up to 400 Gbps per Superchip, allowing customers to scale to thousands of GH200 Superchips on EC2 UltraClusters.
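As a rough sketch of how such EFA-attached, cluster-placed capacity is requested on EC2: customers create a cluster placement group and ask for instances with an EFA network interface. The parameter keys below are genuine EC2 `RunInstances` fields, but the instance type, security group, and subnet IDs are hypothetical placeholders (no GH200-based instance type name was given in the announcement), and no AWS call is made here.

```python
# Sketch of an EC2 RunInstances request for EFA-attached, cluster-placed
# capacity (UltraCluster-style). The instance type and resource IDs are
# hypothetical; this only builds the request dict and makes no AWS call.
request = {
    "InstanceType": "p5-gh200.48xlarge",  # hypothetical GH200 NVL32 type
    "MinCount": 1,
    "MaxCount": 1,
    "Placement": {"GroupName": "gh200-ultracluster"},  # cluster placement group
    "NetworkInterfaces": [
        {
            "DeviceIndex": 0,
            "InterfaceType": "efa",  # Elastic Fabric Adapter for low-latency networking
            "Groups": ["sg-0123456789abcdef0"],        # hypothetical security group
            "SubnetId": "subnet-0123456789abcdef0",    # hypothetical subnet
        }
    ],
}

# With credentials configured, this dict would be passed as:
#   boto3.client("ec2").run_instances(**request)
print(request["NetworkInterfaces"][0]["InterfaceType"])  # efa
```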
AWS instances powered by the GH200 NVL32 give customers on-demand access to supercomputer-class performance, which is critical for large-scale AI/ML workloads that must distribute complex tasks, such as foundation models, recommender systems, and vector databases, across multiple nodes.
EC2 instances with NVIDIA GH200 provide 4.5 TB of HBM3e memory, 7.2x more than current-generation H100-powered EC2 P5 instances, allowing customers to run larger models while improving training performance. In addition, the CPU-to-GPU memory interconnect provides seven times the bandwidth of PCIe, enabling chip-to-chip communication that expands the total memory available to applications.
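The 7.2x figure can be sanity-checked from published per-GPU capacities, assuming the comparison baseline is a P5 instance with eight 80 GB H100 GPUs:

```python
# Back-of-the-envelope check of the memory comparison above.
gh200_instance_memory_gb = 4.5 * 1024  # 4.5 TB of HBM3e per GH200 NVL32 instance
p5_instance_memory_gb = 8 * 80         # assumed P5 baseline: 8x H100, 80 GB HBM3 each
ratio = gh200_instance_memory_gb / p5_instance_memory_gb
print(round(ratio, 1))  # 7.2
```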
AWS instances equipped with the GH200 NVL32 will be AWS's first AI infrastructure with liquid cooling, allowing dense server racks to operate at maximum performance and efficiency.
EC2 instances powered by GH200 NVL32 also leverage the AWS Nitro System, the platform behind next-generation EC2 instances. The Nitro System offloads I/O functions from the host CPU/GPU to dedicated hardware, delivering more consistent performance and improved security that protects customer code and data during processing.
AWS hosts NVIDIA DGX Cloud powered by Grace Hopper for the first time
AWS will collaborate with NVIDIA to host NVIDIA DGX Cloud with GH200 NVL32 NVLink infrastructure. NVIDIA DGX Cloud is an AI supercomputing service that gives enterprises rapid access to multi-node supercomputing for training the most complex LLMs and generative AI models. It includes NVIDIA AI Enterprise (https://www.nvidia.com/ja-jp/data-center/products/ai-enterprise/) software and direct access to NVIDIA AI experts.
Groundbreaking Project Ceiba supercomputer accelerates NVIDIA's AI development
The Project Ceiba supercomputer that AWS and NVIDIA are building will integrate AWS services such as Amazon Virtual Private Cloud (VPC) for encrypted networking and Amazon Elastic Block Store for high-performance block storage, giving NVIDIA access to these capabilities for its own workloads.
NVIDIA will use this supercomputer for research and development to advance AI in areas such as LLM, graphics and simulation, digital biology, robotics, self-driving cars, and climate change prediction with Earth-2.
NVIDIA and AWS accelerate generative AI, HPC and simulation
To facilitate the development, training, and inference of the largest LLMs, AWS P5e instances will feature NVIDIA's latest H200 GPUs, which offer 141 GB of HBM3e GPU memory, 1.8x the capacity of the H100 GPU and 1.4x faster. This GPU memory boost, along with up to 3,200 Gbps of EFA networking enabled by the AWS Nitro System, lets customers continue building, training, and deploying cutting-edge models on AWS.
To provide cost-effective, energy-efficient solutions for video, AI, and graphics workloads, AWS announced new Amazon EC2 G6e instances featuring NVIDIA L40S GPUs and G6 instances featuring L4 GPUs. These new offerings help startups, enterprises, and researchers meet their AI and high-fidelity graphics needs.
G6e instances are built to handle complex workloads such as generative AI and digital twin applications. With NVIDIA Omniverse, developers can create photorealistic 3D simulations that are contextualized and refined with real-time data from services such as AWS IoT TwinMaker, intelligent chatbots, assistants, and search and summarization tools. Amazon Robotics and Amazon Fulfillment Centers will combine digital twins built with NVIDIA Omniverse and AWS IoT to optimize warehouse design and flow, train more intelligent robot assistants, and improve deliveries to customers.
The L40S GPU delivers up to 1.45 petaflops of FP8 processing performance and features ray-tracing cores providing up to 209 teraflops of ray-tracing performance. The L4 GPUs in G6 instances offer a low-cost, energy-efficient solution for deploying AI models for natural language processing, language translation, AI video and image analysis, speech recognition, and personalization. The L40S GPU also accelerates graphics workloads, such as creating and rendering real-time, cinema-quality graphics and streaming games. All three instances are expected to be available next year.
NVIDIA software on AWS accelerates generative AI development
Additionally, NVIDIA announced software to accelerate generative AI development on AWS. The NVIDIA NeMo(TM) Retriever microservice (https://nvidianews.nvidia.com/news/nemo-retriever-generative-ai-microservice) is a new tool for creating highly accurate chatbots and summarization tools using accelerated semantic search. NVIDIA BioNeMo(TM) (https://blogs.nvidia.com/blog/bionemo-on-aws-generative-ai-drug-discovery/), currently available on Amazon SageMaker, will soon be available on NVIDIA DGX Cloud on AWS. BioNeMo enables pharmaceutical companies to speed up drug discovery by simplifying and accelerating model training on their own data.
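To make "accelerated semantic search" concrete: retrieval services of this kind embed documents and queries as vectors and return the document closest to the query. The sketch below illustrates only the general idea with hand-made toy vectors; it does not use the NeMo Retriever API, and the document names and embeddings are invented for illustration.

```python
import math

# Toy illustration of embedding-based semantic search: documents and
# queries become vectors, and the closest document by cosine similarity
# is retrieved. The tiny hard-coded "embeddings" stand in for a real
# embedding model; production systems use accelerated vector indexes.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

documents = {
    "gpu_doc": [0.9, 0.1, 0.0],  # pretend embedding of a GPU article
    "bio_doc": [0.1, 0.9, 0.2],  # pretend embedding of a biology article
}

def retrieve(query_embedding):
    # Rank documents by similarity to the query; return the best match.
    return max(documents, key=lambda name: cosine(documents[name], query_embedding))

print(retrieve([0.8, 0.2, 0.1]))  # gpu_doc
```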
NVIDIA software on AWS is helping Amazon bring new innovations to its services and operations. AWS uses the NVIDIA NeMo framework (https://blogs.nvidia.com/blog/nemo-amazon-titan/) to train some of its next-generation Amazon Titan LLMs. Amazon Robotics has begun using NVIDIA Omniverse Isaac (https://blogs.nvidia.co.jp/2023/11/30/gpu-aws-omniverse-isaac-sim-robots/) to build digital twins that automate, optimize, and plan its autonomous warehouses in virtual environments before deploying them into the real world.
About NVIDIA
Since its founding in 1993, NVIDIA (https://www.nvidia.com/ja-jp/) (NASDAQ: NVDA) has been a pioneer in accelerated computing. The GPU, which the company invented in 1999, fueled the growth of the PC gaming market, redefined computer graphics, ignited the modern AI era, and is helping to digitize industries. NVIDIA is now a full-stack computing company with data-center-scale products that are reshaping the industry. For more information, visit:
https://nvidianews.nvidia.com/



