
NVIDIA V100 GPU Specs and Price in 2024


The NVIDIA V100 is a legendary piece of hardware that has earned its place in the history of high-performance computing. Launched in 2017, the V100 introduced us to the age of Tensor Cores and brought many advancements through the innovative Volta architecture. 

However, in recent years the landscape of AI hardware has continued to evolve. Newer models such as the NVIDIA A100 and H100 have entered the market, offering superior performance and capabilities. This raises an important question for AI engineers: what is the best use for the V100 in today’s diverse GPU landscape? Let’s answer this question by looking at the V100’s specs, performance, and pricing from today’s perspective.

NVIDIA V100 with NVLink

Introduction to the V100 

The NVIDIA V100, launched in May 2017, marked a significant milestone in the GPU industry. It was the first GPU based on NVIDIA's Volta architecture, which introduced several groundbreaking technologies designed to accelerate AI and high-performance computing (HPC) workloads. The V100 was a successor to the P100, bringing substantial improvements in both performance and efficiency. 

Volta Architecture 

The V100 is built on the Volta architecture, which represents a major leap forward from previous generations. Volta's design focuses on improving parallel processing capabilities and energy efficiency. Key features include a new streaming multiprocessor (SM) architecture, a redesigned memory hierarchy, and the introduction of Tensor Cores.

Tensor Cores 

One of the key aspects of the V100 is its inclusion of Tensor Cores, specialized hardware designed to accelerate tensor operations essential for deep learning. Tensor Cores enable mixed-precision computing, allowing the V100 to perform matrix multiplications at unprecedented speeds. This results in significant performance boosts for tasks like neural network training and inference. 
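The core trick behind mixed-precision computing is that Tensor Cores multiply FP16 values but accumulate the products in FP32. Why that matters can be sketched in pure Python by emulating an FP16 accumulator with the `struct` module's half-precision format (a simplified illustration of the numerics, not of actual Tensor Core hardware):

```python
import struct

def fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

# One dot-product term, as Tensor Cores see it: an FP16 x FP16 product.
term = fp16(fp16(0.01) * fp16(0.01))  # ~1e-4

n = 2048
exact = 0.01 * 0.01 * n  # 0.2048

# Naive FP16 accumulation: rounding error compounds as the sum grows.
fp16_sum = 0.0
for _ in range(n):
    fp16_sum = fp16(fp16_sum + term)

# Mixed precision (Tensor Core style): FP16 products, wide accumulator.
mixed_sum = 0.0
for _ in range(n):
    mixed_sum += term

print(f"exact: {exact:.6f}")
print(f"fp16:  {fp16_sum:.6f}")   # drifts far from the true sum
print(f"mixed: {mixed_sum:.6f}")  # stays close to the true sum
```

The FP32 accumulator keeps long dot products (the building block of every matrix multiply in a neural network) accurate while the inputs stay in cheap, fast FP16.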

Power Consumption and Thermal Design

With a TDP of 300 Watts, the V100 is designed to deliver high performance while maintaining energy efficiency. The GPU's cooling solutions are designed to manage this power consumption effectively, ensuring stable operation even under heavy computational loads.

Key V100 Specifications 

At the time of its introduction, the V100 was the most advanced GPU in the world. While many newer models have since overtaken it on paper, its specifications are still solid enough for serious high-performance computing.

Here is a spec sheet of what you can expect to find from the V100 GPU: 

CUDA Cores: 5,120 

Tensor Cores: 640 

Base Clock Speed: 1,215 MHz 

Boost Clock Speed: 1,380 MHz (PCIe) / 1,530 MHz (SXM2) 

Memory: 16 GB or 32 GB HBM2 

Memory Bandwidth: 900 GB/s 

TDP (Thermal Design Power): 300 Watts 
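The 900 GB/s bandwidth figure follows directly from the memory interface: a 4096-bit bus moving data on both clock edges. A quick back-of-the-envelope check in Python (the ~877 MHz HBM2 memory clock, i.e. roughly 1.75 GT/s effective, is an assumption not listed in the spec sheet above):

```python
# Sanity-check the V100's quoted HBM2 bandwidth from its bus width.
bus_width_bits = 4096                    # from the spec sheet
bytes_per_transfer = bus_width_bits / 8  # 512 bytes moved per transfer
effective_rate = 1.754e9                 # ~877 MHz HBM2 clock, double data rate (assumed)

bandwidth_gb_s = bytes_per_transfer * effective_rate / 1e9
print(f"{bandwidth_gb_s:.0f} GB/s")  # ~900 GB/s, matching the spec sheet
```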

How the V100 Compares to the P100 

As the first GPU to feature Tensor Cores, the V100 is difficult to evaluate against earlier models. The most natural comparison, however, is with the Pascal-based P100. Compared to the P100, the V100 brought a sizable increase in CUDA cores (5,120 vs. 3,584) and a substantial increase in memory bandwidth (900 GB/s vs. 720 GB/s).

Thanks to the Tensor Core technology, the V100 was the world’s first GPU to break the 100 teraFLOPS (TFLOPS) barrier of deep learning performance. According to NVIDIA benchmarks, the V100 can perform deep learning tasks 12x faster than the P100.

NVIDIA V100 vs. CPU

At the time of launch, NVIDIA also made a point of comparing the V100's AI training and inference performance against an Intel Gold [email protected]/3.9 GHz Turbo CPU.

A100 Datasheet Comparison vs V100 and H100 

Today the more natural comparison is with the A100 and H100 specs. In this comparison, the V100 falls short on many key elements. 

| GPU Features | NVIDIA V100 | NVIDIA A100 | NVIDIA H100 SXM5 |
| --- | --- | --- | --- |
| GPU Board Form Factor | SXM2 | SXM4 | SXM5 |
| SMs | 80 | 108 | 132 |
| TPCs | 40 | 54 | 66 |
| FP32 Cores / SM | 64 | 64 | 128 |
| FP32 Cores / GPU | 5120 | 6912 | 16896 |
| FP64 Cores / SM (excl. Tensor) | 32 | 32 | 64 |
| FP64 Cores / GPU (excl. Tensor) | 2560 | 3456 | 8448 |
| INT32 Cores / SM | 64 | 64 | 64 |
| INT32 Cores / GPU | 5120 | 6912 | 8448 |
| Tensor Cores / SM | 8 | 4 | 4 |
| Tensor Cores / GPU | 640 | 432 | 528 |
| Texture Units | 320 | 432 | 528 |
| Memory Interface | 4096-bit HBM2 | 5120-bit HBM2 | 5120-bit HBM3 |
| Memory Bandwidth | 900 GB/sec | 1555 GB/sec | 3.35 TB/sec |
| Transistors | 21.1 billion | 54.2 billion | 80 billion |
| Max thermal design power (TDP) | 300 Watts | 400 Watts | 700 Watts |

  * see detailed comparisons of V100 vs A100 and A100 vs H100

V100 Performance Metrics 

The V100 is still a solid performer in various performance metrics, making it suitable for both training and inference of deep learning models: 

FP32 (Single-Precision) Performance: 15.7 TFLOPS 

FP64 (Double-Precision) Performance: 7.8 TFLOPS 

Tensor (Mixed-Precision) Performance: 125 TFLOPS 

These metrics highlight the V100's versatility, allowing it to deliver high performance for a range of precision levels required by different AI applications. 
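These throughput figures can be reproduced from the core counts with the standard formula: peak FLOPS = 2 (one fused multiply-add per cycle) × cores × clock. Note that the 15.7/7.8 TFLOPS numbers correspond to the SXM2 variant's 1,530 MHz boost clock; the PCIe card's 1,380 MHz boost yields roughly 14 and 7 TFLOPS instead. A quick check:

```python
# Peak throughput = 2 ops per FMA * core count * boost clock (SXM2: 1530 MHz).
boost_clock_hz = 1.53e9

fp32_cores = 5120
fp64_cores = 2560

fp32_tflops = 2 * fp32_cores * boost_clock_hz / 1e12
fp64_tflops = 2 * fp64_cores * boost_clock_hz / 1e12

print(f"FP32: {fp32_tflops:.1f} TFLOPS")  # 15.7
print(f"FP64: {fp64_tflops:.1f} TFLOPS")  # 7.8
```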

Limitations of the V100 

The major disadvantage of the V100 is that it does not support the BF16 (bfloat16) data type, which makes training today's larger models on it difficult. Still, the V100 has its uses in both inference and fine-tuning, as it offers good availability at comparatively low cost next to newer GPUs. 

V100 Use Cases Today 

The V100 has been widely adopted in various industries and applications. Here are a few examples: 

Deep Learning Training: You can still use the V100 for training of smaller deep learning models, especially ones that don’t require BF16 precision support. 

Inference Tasks: The V100's ability to handle high-throughput inference tasks makes it suitable for deploying trained models in production environments. 

High-Performance Computing: Beyond AI, the V100 can be used in scientific computing, simulations, and other HPC applications due to its double-precision performance. 

Famously, OpenAI (with support from Microsoft) used 10,000 V100s to train the GPT-3 language model. 

NVIDIA V100 Pricing 

Today you can find the NVIDIA V100 offered as individual instances as well as in clusters of up to 8 NVLink-connected GPUs. In recent years the price of the V100 has come down, meaning you can rent it on demand from cloud GPU providers like DataCrunch based on your needs and requirements. 

In a recent review of cloud GPU pricing, we found that prices at hyperscalers such as Amazon AWS, Google Cloud Platform, and Microsoft Azure were up to 8 times higher than at independent AI-computing specialists such as DataCrunch.

AWS vs. GCP vs. Azure cloud GPU cost comparison (VM price)

Current on-demand prices of V100 instances at DataCrunch: 

* see real-time pricing for V100.

NVIDIA V100 price: dynamic cost per hour

Bottom line on the V100 

The V100 is a legendary GPU with a deserved place among the most influential pieces of hardware in the development of artificial intelligence.

While it has been overshadowed by newer models like the A100 and the H100, the V100 remains a cost-efficient option today for workloads like inference and fine-tuning of AI models. See three creative use-cases for the V100 today.

If you’re looking to try out the V100, spin up an instance with DataCrunch today.