
The perfect storage server


Fast Storage

DataCrunch is proud to offer our customers 100,000 IOPS, included with the default NVMe storage option. Of course, we always strive to do better, so behind the scenes we're working on the next generation of storage, aiming for 300,000 or more IOPS and 5-10 GB/s per client!

In this blog post, we discuss the bottlenecks of our current storage servers and what the perfect storage server would look like for us today.

Let's talk NUMA!

Our current storage servers are populated with 2 network cards, offering a total of 4 ports to the internal network. However, the NUMA configuration of these servers is not ideal, which leaves performance on the table.

[Figure: NUMA layout of the current storage servers]

The image shows how an uneven NUMA distribution leads to less-than-ideal results. When a requested file is stored on a disk attached to NUMA 0, the data has to hop to another NUMA domain to find its way out of the server.

Netflix has a great talk on this: https://people.freebsd.org/~gallatin/talks/euro2021.pdf
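To see which NUMA domain each network card and NVMe controller sits on, Linux exposes a numa_node file for PCI devices in sysfs. The snippet below is a minimal sketch, not our production tooling: the function names numa_node and numa_map are made up for illustration, and it assumes a Linux host where /sys/class/net and /sys/class/nvme are present.

```python
from pathlib import Path
from typing import Dict, Optional


def numa_node(device_dir: Path) -> Optional[int]:
    """Return the NUMA node stored in <device_dir>/numa_node, or None if absent."""
    node_file = device_dir / "numa_node"
    if not node_file.is_file():
        # Virtual devices (for example the loopback interface) have no PCI device.
        return None
    # The kernel reports -1 when it knows no NUMA affinity for the device.
    return int(node_file.read_text().strip())


def numa_map(sysfs_root: Path = Path("/sys")) -> Dict[str, Optional[int]]:
    """Map every network interface and NVMe controller to its NUMA node."""
    result: Dict[str, Optional[int]] = {}
    for class_name in ("net", "nvme"):
        class_dir = sysfs_root / "class" / class_name
        if not class_dir.is_dir():
            continue
        for entry in sorted(class_dir.iterdir()):
            result[f"{class_name}/{entry.name}"] = numa_node(entry / "device")
    return result
```

Calling numa_map() on a storage node and printing the result gives a quick overview of which disks share a domain with a network port and which ones have to cross over.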

Hopping across NUMA domains adds latency and can cause additional bottlenecks. Since the servers are outfitted with 8-channel DDR4 memory, we have a total memory bandwidth of ~170 GB/s. To push 400 Gbit/s (or 50 GB/s) out of the server, we need a memory bandwidth of ~100 GB/s, because each byte is written into memory once when it is read from disk and read from memory again when it is sent to the network. The memory bandwidth headroom is not huge and we are pushing close to the limit of 8-channel DDR4-3200, another reason why NUMA considerations come into play.
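The arithmetic behind these numbers can be sketched in a few lines. This is a back-of-the-envelope model, not a measurement: the factor of 2 assumes every byte crosses the memory bus twice (once written from disk, once read by the network card), and the function names are ours. The theoretical peak uses the standard 8 bytes per transfer per DDR4 channel; the ~170 GB/s figure is the one quoted above.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert a line rate in Gbit/s to GB/s."""
    return gbit_per_s / 8


def required_memory_bandwidth_gbyte(network_gbit_per_s: float,
                                    memory_passes: int = 2) -> float:
    """Memory bandwidth needed to sustain a given network rate.

    memory_passes=2 is an assumption: one write into RAM from disk
    and one read out of RAM towards the network card.
    """
    return gbit_to_gbyte(network_gbit_per_s) * memory_passes


def ddr_peak_gbyte(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak of a DDR setup with 64-bit (8-byte) channels."""
    return channels * mega_transfers_per_s * 8 / 1000


network_gbit = 400          # total line rate of the server, from the text above
usable_gbyte = 170          # approximate memory bandwidth, from the text above

required = required_memory_bandwidth_gbyte(network_gbit)
peak = ddr_peak_gbyte(channels=8, mega_transfers_per_s=3200)

print(f"network payload: {gbit_to_gbyte(network_gbit):.0f} GB/s")
print(f"memory needed:   {required:.0f} GB/s")
print(f"DDR4-3200 peak:  {peak:.1f} GB/s (theoretical)")
print(f"share of usable: {required / usable_gbyte:.0%}")
```

Under these assumptions the network traffic alone claims well over half of the usable memory bandwidth, before the CPU or any cross-NUMA traffic gets a share.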

The perfect server

Let's dream for a minute about what the perfect server would look like. We would have a perfectly even distribution of network ports and disks across the NUMA domains. Here's how that would look:

[Figure: the perfect NUMA layout]

With 4 ports at 100 Gbit/s, this server would be perfect for us, and it could be built today.

Such a configuration should deliver 100 GB/s per node when using 4 ports at 200 GbE (800 Gbit/s in total), PCIe Gen5 and DDR5!
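Why the even distribution matters can be illustrated with a toy model. It is a rough sketch under stated assumptions: reads are spread evenly over all disks, all ports carry the same share of traffic, and data leaves through a local port whenever one has capacity. The disk counts below are hypothetical, and the names Layout, is_balanced and cross_node_fraction are made up for this example.

```python
from typing import Dict, Tuple

# NUMA node -> (number of disks, number of network ports)
Layout = Dict[int, Tuple[int, int]]


def is_balanced(layout: Layout) -> bool:
    """True when every NUMA node has the same number of disks and of ports."""
    disk_counts = {disks for disks, _ in layout.values()}
    port_counts = {ports for _, ports in layout.values()}
    return len(disk_counts) == 1 and len(port_counts) == 1


def cross_node_fraction(layout: Layout) -> float:
    """Estimate the share of outgoing traffic that must cross NUMA domains."""
    total_disks = sum(disks for disks, _ in layout.values())
    total_ports = sum(ports for _, ports in layout.values())
    crossing = 0.0
    for disks, ports in layout.values():
        disk_share = disks / total_disks   # traffic originating on this node
        port_share = ports / total_ports   # traffic this node's ports can carry
        # Whatever the local ports cannot carry has to hop to another node.
        crossing += max(0.0, disk_share - port_share)
    return crossing


# Hypothetical layouts: 16 disks and 4 ports in both cases.
lopsided = {0: (8, 0), 1: (8, 4)}   # all ports on one NUMA node
even = {0: (8, 2), 1: (8, 2)}       # disks and ports split evenly

for name, layout in (("lopsided", lopsided), ("even", even)):
    print(name, is_balanced(layout), f"{cross_node_fraction(layout):.0%}")
```

In this model the lopsided layout sends half of all traffic across the NUMA boundary, while the even layout sends none.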

And thus our quest for lightning-fast storage continues.