NVIDIA V100 GPU Specs and Price in 2024

The NVIDIA V100 is a legendary piece of hardware that has earned its place in the history of high-performance computing. Launched in 2017, the V100 introduced us to the age of Tensor Cores and brought many advancements through the innovative Volta architecture.

However, in recent years, the landscape of AI hardware has continued to evolve. Newer models such as the NVIDIA A100 and H100 have entered the market, offering superior performance and capabilities. This raises an important question for AI engineers: what is the best use for the V100 in today’s diverse GPU landscape? Let’s answer this question looking through V100 specs, performance and pricing from today’s perspective.

Introduction to the V100

The NVIDIA V100, launched in May 2017, marked a significant milestone in the GPU industry. It was the first GPU based on NVIDIA's Volta architecture, which introduced several groundbreaking technologies designed to accelerate AI and high-performance computing (HPC) workloads. The V100 was a successor to the P100, bringing substantial improvements in both performance and efficiency.

Volta Architecture

The V100 is built on the Volta architecture, which represents a major leap forward from previous generations. Volta's design focuses on improving parallel processing capabilities and improving energy efficiency. Key features include a new streaming multiprocessor (SM) architecture, a redesigned memory hierarchy, and the introduction of Tensor Cores.

Tensor Cores

One of the key aspects of the V100 is its inclusion of Tensor Cores, specialized hardware designed to accelerate tensor operations essential for deep learning. Tensor Cores enable mixed-precision computing, allowing the V100 to perform matrix multiplications at unprecedented speeds. This results in significant performance boosts for tasks like neural network training and inference.

Power Consumption and Thermal Design

With a TDP of 300 Watts, the V100 is designed to deliver high performance while maintaining energy efficiency. The GPU's cooling solutions are designed to manage this power consumption effectively, ensuring stable operation even under heavy computational loads.

Key V100 Specifications

At the time of its introduction, the V100 was the most advanced GPU in the World. While it has been overtaken in specifications by many newer models, it still has solid specifications capable of high-performance computing.

Here is a spec sheet of what you can expect to find from the V100 GPU:

CUDA Cores: 5,120

Tensor Cores: 640

Base Clock Speed: 1,215 MHz

Boost Clock Speed: 1,380 MHz

Memory: 16 GB or 32GB HBM2

Memory Bandwidth: 900 GB/s

TDP (Thermal Design Power): 300 Watts

How the V100 Compares to the P100

As the first ever GPU to feature Tensor Cores, it’s difficult to evaluate the V100 against earlier models. However, the most natural comparison is with the P100 utilizing Pascal architecture. Compared to the P100, the V100 brought in a sizable increase in CUDA Cores (5,120 vs 3,584) and a substantial increase in memory bandwidth (900 GB/s vs. 720GB/s.)

Thanks to the Tensor Core technology, the V100 was the world’s first GPU to break the 100 teraFLOPS (TFLOPS) barrier of deep learning performance. According to NVIDIA benchmarks, the V100 can perform deep learning tasks 12x faster than the P100.

At the time of launch NVIDIA also made a point of comparing V100's AI training and inference performance compared to a Intel Gold [email protected]/3.9Hz Turbo CPU.

A100 Datasheet Comparison vs V100 and H100

Today the more natural comparison is with the A100 and H100 specs. In this comparison, the V100 falls short on many key elements.

GPU Features	NVIDIA V100	NVIDIA A100	NVIDIA H100 SXM5
GPU Board Form Factor	SXM2	SXM4	SXM5
SMs	80	108	132
TPCs	40	54	66
FP32 Cores / SM	64	64	128
FP32 Cores / GPU	5020	6912	16896
FP64 Cores / SM (excl. Tensor)	32	32	64
FP64 Cores / GPU (excl. Tensor)	2560	3456	8448
INT32 Cores / SM	64	64	64
INT32 Cores / GPU	5120	6912	8448
Tensor Cores / SM	8	4	4
Tensor Cores / GPU	640	432	528
Texture Units	320	432	528
Memory Interface	4096-bit HBM2	5120-bit HBM2	5120-bit HBM3
Memory Bandwidth	900 GB/sec	1555 GB/sec	3.35 TB/sec
Transistors	21.1 billion	54.2 billion	80 billion
Max thermal design power (TDP)	300 Watts	400 Watts	700 Watts

* see detailed comparisons of V100 vs A100 and A100 vs H100.

V100 Performance Metrics

The V100 is still a solid performer in various performance metrics, making it suitable for both training and inference of deep learning models:

FP32 (Single-Precision) Performance: 15.7 TFLOPS

FP64 (Double-Precision) Performance: 7.8 TFLOPS

These metrics highlight the V100's versatility, allowing it to deliver high performance for a range of precision levels required by different AI applications.

Limitations of the V100

The major disadvantage of the V100 is that it does not support BF16 (or bfloat16) data type. This makes it difficult to train today's larger models with the V100. Still, the V100 has it’s use in both inference and fine-tuning, as it has good availability at comparatively low cost compared to newer GPUs.

V100 Use Cases Today

The V100 has been widely adopted in various industries and applications. Here are a few examples:

Deep Learning Training: You can still use the V100 for training of smaller deep learning models, especially ones that don’t require BF16 precision support.

Inference Tasks: The V100's ability to handle high-throughput inference tasks makes it suitable for deploying trained models in production environments.

High-Performance Computing: Beyond AI, the V100 can be used in scientific computing, simulations, and other HPC applications due to its double-precision performance.

Famously OpenAI (with support from Microsoft) used 10,000 V100s to train the GPT-3 language model.

NVIDIA V100 Pricing

Today you can find the NVIDIA V100 offered as individual instances as well as clusters of up to 8 NVLink connected GPUs. In recent years the price of the V100 has gone down, meaning that you can get on-demand from Cloud GPU providers like DataCrunch based on your unique needs and requirements.

In a recent review of Cloud GPU pricing, we found that the cost difference between hyperscalers such as Amazon AWS, Google Cloud Platform and Microsoft Azure was up to 8 times greater than independent AI-computing specialists such as DataCrunch.

aws vs gcp vs azure cloud gpu cost comparison vm price

Current on-demand prices of A100 instances at DataCrunch:

1 x V100 16GB: $0.39/hour
4 x V100 16GB: $1.56/hour
8 x V100 16 GB: $3.12/hour

* see real-time pricing for V100.

nvidia v100 price - dynamic cost per hour

Bottom line on the V100

The V100 is a legendary GPU with a deserved place among the most influential pieces of hardware in the development of artificial intelligence.

While it has been overshadowed by newer models, like the A100 and the H100, the V100 remains today a cost-efficient solution for operations like inference and fine-tuning of AI models. See three creative use-cases for the V100 today.

If you’re looking to try out the V100, spin up an instance with DataCrunch today.