NVIDIA Blackwell B100, B200 GPU Specs and Availability

At NVIDIA GTC 2024, we got a preview of the next generation of high-performance GPUs with the announcement of the Blackwell architecture.

Let’s go through what the Blackwell architecture offers for AI training and inference use cases, and how it compares to the best NVIDIA GPUs currently on the market: the A100, H100 and H200.

NVIDIA Blackwell architecture 

The Blackwell GPU is designed with the specific purpose of handling data center-scale generative AI workflows. Architecturally, Blackwell GPUs combine two reticle-limited dies into a single, unified GPU with a 10 terabyte-per-second chip-to-chip interface.

NVIDIA named their latest generation of high-performance GPU architecture in honor of the American mathematician and statistician David H. Blackwell.

The Blackwell GPU is the largest GPU ever built, with over 208 billion transistors. It also introduces major advancements, including the second-generation Transformer Engine and Confidential Computing capabilities aimed at supporting encrypted enterprise use of generative AI training, inference and federated learning.
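
To make the Transformer Engine claim concrete, here is a minimal sketch of how FP8 mixed precision is used today with NVIDIA's Transformer Engine library on Hopper-class GPUs. The Blackwell announcement says the second-generation engine extends this approach down to FP4, but no FP4 API details were published, so treat this FP8 flow as illustrative only; the layer sizes and recipe settings below are arbitrary.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Delayed-scaling recipe: FP8 scale factors are derived from a rolling
# history of per-tensor amax values instead of being recomputed per step.
fp8_recipe = recipe.DelayedScaling(margin=0, amax_history_len=16)

# A single Transformer Engine layer; sizes here are arbitrary.
model = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda")

# Inside this context, supported layers execute their matmuls in FP8
# while master weights stay in higher precision.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = model(x)
y.sum().backward()
```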

[Figure: NVIDIA Blackwell GPU architecture]

What can we expect from the Blackwell architecture?

HGX B100 and HGX B200 

NVIDIA plans to release Blackwell GPUs in two different HGX AI supercomputing form factors: the B100 and the B200. While the two will share many of the same components, the B200 will have a higher maximum thermal design power (TDP) and higher overall performance across FP4, FP8, INT8, FP16, FP32 and FP64 workloads.

At the time of the announcement, both the B100 and B200 were expected to ship with the same 192 GB of HBM3e memory and up to 8 TB/s of memory bandwidth.

B200 vs B100 Spec Sheet Comparison 

| Specification | HGX B200 | HGX B100 |
| --- | --- | --- |
| Blackwell GPUs | 8 | 8 |
| FP4 Tensor Core | 144 PFLOPS | 112 PFLOPS |
| FP8/FP6/INT8 | 72 PFLOPS | 56 PFLOPS |
| Fast Memory | Up to 1.5 TB | Up to 1.5 TB |
| Aggregate Memory Bandwidth | Up to 64 TB/s | Up to 64 TB/s |
| Aggregate NVLink Bandwidth | 14.4 TB/s | 14.4 TB/s |
| FP4 Tensor Core (per GPU) | 18 PFLOPS | 14 PFLOPS |
| FP8/FP6 Tensor Core (per GPU) | 9 PFLOPS | 7 PFLOPS |
| INT8 Tensor Core (per GPU) | 9 petaOPS | 7 petaOPS |
| FP16/BF16 Tensor Core (per GPU) | 4.5 PFLOPS | 3.5 PFLOPS |
| TF32 Tensor Core (per GPU) | 2.2 PFLOPS | 1.8 PFLOPS |
| FP32 (per GPU) | 80 TFLOPS | 60 TFLOPS |
| FP64 Tensor Core (per GPU) | 40 TFLOPS | 30 TFLOPS |
| FP64 (per GPU) | 40 TFLOPS | 30 TFLOPS |
| GPU Memory / Bandwidth (per GPU) | Up to 192 GB HBM3e / Up to 8 TB/s | Up to 192 GB HBM3e / Up to 8 TB/s |
| Max Thermal Design Power (TDP) | 1,000 W | 700 W |
| Interconnect | NVLink: 1.8 TB/s; PCIe Gen6: 256 GB/s | NVLink: 1.8 TB/s; PCIe Gen6: 256 GB/s |

Note: All petaFLOPS and petaOPS figures are quoted with sparsity, except FP64, which is dense.
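
The sparsity caveat matters when comparing against dense figures from other vendors: NVIDIA's quoted tensor-core rates assume 2:4 structured sparsity, which doubles throughput over dense math. A quick back-of-envelope sketch (per-GPU figures from the table above) converts them to dense-equivalent numbers:

```python
# Quoted per-GPU tensor-core rates (PFLOPS, with 2:4 structured sparsity).
sparse_pflops = {
    "B200 FP4": 18.0, "B100 FP4": 14.0,
    "B200 FP8": 9.0, "B100 FP8": 7.0,
    "B200 FP16/BF16": 4.5, "B100 FP16/BF16": 3.5,
}

# 2:4 sparsity doubles throughput, so the dense rate is half the quoted one.
for name, sparse in sparse_pflops.items():
    print(f"{name}: {sparse} PFLOPS sparse -> {sparse / 2} PFLOPS dense")
```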

GB200 Grace Blackwell Superchip 

In addition to the HGX form factors, NVIDIA have announced a new Superchip combining two Blackwell Tensor Core GPUs and one NVIDIA Grace CPU. These Superchips can be connected in clusters; for example, the NVL72 configuration connects 36 Grace CPUs and 72 Blackwell GPUs into what is effectively one massive GPU, delivering 30x faster LLM inference than the H100.
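
As a back-of-envelope check on the NVL72 configuration, the rack-level totals follow directly from the per-GPU figures quoted in this article; the aggregates below are derived, not official NVIDIA numbers:

```python
# NVL72: 36 GB200 Superchips, each with 2 Blackwell GPUs and 1 Grace CPU.
superchips = 36
gpus = superchips * 2                        # 72 Blackwell GPUs

hbm_per_gpu_gb = 192                         # HBM3e per GPU (from spec table)
bw_per_gpu_tb_s = 8                          # memory bandwidth per GPU

total_hbm_tb = gpus * hbm_per_gpu_gb / 1000  # ~13.8 TB of HBM3e in the rack
total_bw_tb_s = gpus * bw_per_gpu_tb_s       # ~576 TB/s aggregate bandwidth

print(f"{gpus} GPUs, {total_hbm_tb:.1f} TB HBM3e, {total_bw_tb_s} TB/s")
```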

[Figure: NVIDIA GB200 Grace Blackwell Superchip]

See more information on the H100 specs and performance.

Blackwell vs Hopper Comparison 

Ahead of the launch of the Blackwell generation of GPUs, NVIDIA have released benchmark comparisons against the Hopper architecture.

In large-model training (such as GPT-MoE-1.8T), the B200 is 3x faster than the H100.

The B200 also achieves up to 15x higher inference performance compared to the H100 on large models such as GPT-MoE-1.8T.

[Figure: NVIDIA GB200 vs H100 comparison]

Remember, too, that before the Blackwell series is released, the Hopper architecture will get a serious upgrade in the form of the H200.

GB200 vs H100 Benchmarks 

NVIDIA have released benchmark data comparing the GB200 Superchip to the NVIDIA H100.

NVIDIA B200 vs. AMD MI300X 

Another major point of comparison is the MI300X, recently released by AMD. While NVIDIA have a very clear leadership position in the market for AI-focused GPUs, AMD and other major players have brought competing products to market.

While the MI300X is already available today, the most relevant comparison is with the B200, as both high-performance GPUs should see broader availability by early 2025.

| Specification | NVIDIA B200 | AMD MI300X |
| --- | --- | --- |
| GPU Memory | 192 GB | 192 GB |
| Memory Type | HBM3e | HBM3 |
| Peak Memory Bandwidth | 8 TB/s | 5.3 TB/s |
| Interconnect | NVLink: 1.8 TB/s; PCIe Gen6: 256 GB/s | PCIe Gen5: 128 GB/s |
| Max Thermal Design Power (TDP) | 1,000 W | 750 W |

Naturally, you can’t compare NVIDIA and AMD GPUs on hardware specs alone. NVIDIA still holds a massive edge in software, with many AI engineers finding it difficult to move away from CUDA and the frameworks built on top of it.

Availability and pricing 

Don’t expect to get access to these high-performance devices any time soon. The earliest the B100 is expected to be available is Q4 2024, and the B200 is not likely to arrive before 2025. No release date has been announced for the GB200.

Also, NVIDIA have not released any pricing information. 

You don’t need to wait to get access to high-performance GPUs. DataCrunch offers a broad range of premium NVIDIA GPUs at competitive prices. See the latest cloud GPU pricing and availability.

Bottom line on Blackwell architecture 

Competition is increasing in the GPU race, and NVIDIA is not looking to rest on its laurels. The early release of specs for the Blackwell architecture suggests that NVIDIA will keep offering the highest-performance GPUs both today and in the years to come.

It may be quite some time before you get access to the B100, B200 or GB200, but when they do arrive, you can expect DataCrunch to give you quick access, fair pricing and in-depth performance benchmarks.