2024-04-19
Full throttle: NVIDIA teams with Google Cloud to accelerate AI development


NVIDIA and Google Cloud have announced a new collaboration to help global startups accelerate the creation of generative AI applications and services.

The announcement, made at the Google Cloud Next '24 conference in Las Vegas on April 10, will combine NVIDIA Inception, NVIDIA's startup acceleration program, with the Google for Startups Cloud Program. The move will expand the reach of cloud credits, go-to-market support, and technical expertise to help startups deliver value to customers faster.

NVIDIA Inception is a global program supporting more than 18,000 startups. Through the partnership, eligible members will gain an accelerated pathway to Google Cloud infrastructure and access to Google Cloud credits, with up to $350,000 available to AI-focused startups.

The partnership is the latest in a series of announcements by the two companies aimed at helping businesses of all sizes reduce the cost of, and barriers to, building generative AI applications; the heavy investment AI development requires remains a particular constraint for startups.

You need a full-stack AI platform

In February, Google DeepMind launched Gemma, an advanced open model series. NVIDIA recently partnered with Google to roll out optimizations across all NVIDIA AI platforms for Gemma, helping to reduce customer costs and accelerate innovation efforts for domain-specific use cases.

Teams from the two companies worked closely to accelerate Gemma's performance with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference on NVIDIA GPUs. Gemma was built from the same research and technology used to create Google DeepMind's powerful Gemini models.
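To make the inference piece concrete, here is a minimal sketch using TensorRT-LLM's high-level Python LLM API, assuming a recent tensorrt_llm release and a locally available Gemma checkpoint; the model id and sampling settings are illustrative, not part of the announcement.

```python
# Minimal sketch: running Gemma through TensorRT-LLM's high-level LLM API.
# Assumes a recent tensorrt_llm release with the Python LLM API and that the
# Hugging Face checkpoint "google/gemma-7b" is available locally.
from tensorrt_llm import LLM, SamplingParams

def main():
    # The checkpoint is compiled into an optimized TensorRT engine for the
    # local GPU on first use; subsequent runs reuse the cached engine.
    llm = LLM(model="google/gemma-7b")

    prompts = ["Explain what an inference engine does in one sentence."]
    params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=64)

    # generate() batches the prompts and runs optimized inference on the GPU.
    for output in llm.generate(prompts, params):
        print(output.outputs[0].text)

if __name__ == "__main__":
    main()
```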

NVIDIA NIM microservices, included in the NVIDIA AI Enterprise software platform, will work with Google Kubernetes Engine (GKE) to provide a streamlined path for deploying optimized AI models into production. NIM is built on inference engines such as NVIDIA Triton Inference Server and TensorRT-LLM, supports a broad range of leading AI models, and delivers seamless, scalable AI inference to accelerate generative AI deployment in the enterprise.
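As an illustration of what a simplified path to production on GKE can look like, the sketch below creates a one-replica NIM deployment with the official Kubernetes Python client; the container image tag, GPU count, and namespace are assumptions, and a real deployment would also wire in an NGC credentials secret and a model cache volume.

```python
# Sketch: deploying a NIM container to a GKE cluster with the official
# Kubernetes Python client. Image tag and resource settings are illustrative.
from kubernetes import client, config

def deploy_nim():
    # Uses the kubeconfig created by `gcloud container clusters get-credentials`.
    config.load_kube_config()

    container = client.V1Container(
        name="gemma-nim",
        image="nvcr.io/nim/google/gemma-7b:latest",  # hypothetical NIM image tag
        ports=[client.V1ContainerPort(container_port=8000)],
        resources=client.V1ResourceRequirements(
            limits={"nvidia.com/gpu": "1"}  # request one GPU from the node pool
        ),
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "gemma-nim"}),
        spec=client.V1PodSpec(containers=[container]),
    )
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="gemma-nim"),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector=client.V1LabelSelector(match_labels={"app": "gemma-nim"}),
            template=template,
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)

if __name__ == "__main__":
    deploy_nim()
```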

The Gemma family of models, including Gemma 7B, RecurrentGemma, and CodeGemma, is available from the NVIDIA API directory. Users can try the models from a browser, prototype against an API endpoint, or self-host them with NIM.
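For prototyping against an API endpoint, a hedged sketch: the NVIDIA API directory exposes OpenAI-compatible endpoints, so the standard openai client works; the base URL and model id below follow the catalog's published conventions and should be treated as assumptions.

```python
# Sketch: prototyping against a Gemma endpoint from the NVIDIA API directory
# using the OpenAI-compatible client. Swapping base_url for a self-hosted NIM
# service (e.g. http://localhost:8000/v1) works the same way.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # NVIDIA API catalog endpoint
    api_key=os.environ["NVIDIA_API_KEY"],  # key issued via build.nvidia.com
)

completion = client.chat.completions.create(
    model="google/gemma-7b",  # model id as listed in the API directory
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    temperature=0.7,
    max_tokens=64,
)
print(completion.choices[0].message.content)
```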

GKE and the Google Cloud HPC Toolkit also make it easier to deploy the NVIDIA NeMo framework on Google Cloud. Developers can automate and scale the training and serving of generative AI models, and customizable blueprints let them spin up a turnkey environment to start development quickly.

NVIDIA NeMo in NVIDIA AI Enterprise is also available on the Google Cloud Marketplace, giving customers another way to easily access NeMo and other frameworks to accelerate AI development.

To further expand the availability of NVIDIA-accelerated generative AI computing, Google Cloud also announced that A3 Mega instances will be generally available next month. An extension of the A3 virtual machine family powered by NVIDIA H100 Tensor Core GPUs, the new instances double the GPU-to-GPU network bandwidth of A3 VMs.

New Google Cloud Confidential VMs on A3 will also add support for confidential computing, helping customers protect the confidentiality and integrity of sensitive data and secure applications and AI workloads during training and inference, with no code changes required to use H100 GPU acceleration. These GPU-backed Confidential VMs will be available in preview this year.

Up next: NVIDIA Blackwell architecture GPUs

NVIDIA's newest GPUs, based on the NVIDIA Blackwell platform, will come to Google Cloud early next year in two configurations: the NVIDIA HGX B200 and the NVIDIA GB200 NVL72.

The HGX B200 is designed for the most demanding AI, data analytics and high-performance computing workloads. The GB200 NVL72 is specifically designed for training and real-time inference of next-generation large-scale trillion-parameter models.

The NVIDIA GB200 NVL72 connects 36 Grace Blackwell Superchips, each combining two NVIDIA Blackwell GPUs with one NVIDIA Grace CPU over a 900 GB/s chip-to-chip interconnect. A single NVIDIA NVLink domain supports up to 72 Blackwell GPUs with 130 TB/s of aggregate bandwidth. By overcoming the communication bottlenecks of the previous generation, the system can operate as a single GPU, delivering up to 30x faster real-time LLM inference and up to 4x faster training.
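As a sanity check, the quoted totals line up arithmetically; the short snippet below reproduces them, taking the 1.8 TB/s per-GPU NVLink bandwidth (not stated in this article) as an assumption from NVIDIA's published Blackwell specifications.

```python
# Quick arithmetic check of the GB200 NVL72 figures quoted above.
superchips = 36
gpus_per_superchip = 2
nvlink_per_gpu_tb_s = 1.8  # assumption: 1.8 TB/s NVLink bandwidth per Blackwell GPU

gpus = superchips * gpus_per_superchip
print(gpus)  # 72 GPUs in one NVLink domain

aggregate_tb_s = gpus * nvlink_per_gpu_tb_s
print(round(aggregate_tb_s))  # 130 -> matches the ~130 TB/s aggregate figure
```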

The NVIDIA GB200 NVL72 is a multi-node rack-scale extension system that will use Google Cloud's fourth-generation advanced liquid cooling system.

NVIDIA announced in March that NVIDIA DGX Cloud, an AI platform for enterprise developers optimized for the demands of generative AI, is generally available on A3 VMs powered by H100 GPUs. DGX Cloud with GB200 NVL72 will also come to Google Cloud in 2025.
