NVIDIA® B200 SXM6: Increased capacity in early June 2025
GDPR and ISO 27001 compliant

Serverless Containers

Create fast and scalable endpoints

for containerized models

Queue-based auto-scaling • Pay per use • Scale to zero
Deploy a container
Partners who trust our services:
  • Freepik
  • Black Forest
  • 1X
  • ManifestAI
  • Nex
  • Sony
  • Harvard University
  • NEC
  • Korea University
  • MIT
  • Findable

DataCrunch Serverless Containers

Create inference endpoints that serve your containerized models flexibly, pulled from any container registry
  • Deploy

    Package your models in containers and deploy from any registry (Docker Hub, GitHub, etc.) using the API, CLI, or UI.
  • Scale

    Auto-scale based on the number of incoming requests. Scale up to hundreds of GPUs, or down to zero when idle.
  • Monitor

    Get logs and metrics on resource utilization and application behavior in the UI or as endpoints for Prometheus or Loki.
  • Pay per usage

    Pay only for compute that is in active use, with no charges for idle time. Start, stop, or hibernate instantly via the UI or API.
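The deploy step above amounts to submitting a container spec to the platform. A minimal sketch of assembling such a spec is below; the field names and values are illustrative assumptions, not the documented DataCrunch API schema (see the docs for the real one).

```python
# Hedged sketch: building a deployment request for a containerized model.
# Field names here are illustrative assumptions, not the DataCrunch schema.
import json

def build_deployment(name, image, gpu_type, min_replicas=0, max_replicas=8):
    """Assemble a deployment spec: any registry image, scale-to-zero via min_replicas=0."""
    return {
        "name": name,
        "container": {"image": image},        # e.g. pulled from Docker Hub or GHCR
        "compute": {"type": gpu_type},        # e.g. "H100 SXM5 80GB"
        "scaling": {"min": min_replicas, "max": max_replicas},
    }

spec = build_deployment("whisper-api", "docker.io/acme/whisper:latest", "H100 SXM5 80GB")
print(json.dumps(spec, indent=2))
```

The same spec could be submitted via the API, CLI, or UI, whichever fits your workflow.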

Deployment types

  • Continuous

    Automatically scales to usage for continuous inference workloads

  • Jobs

    Long-running tasks with safe downscaling to prevent interruption

Deploy a container • Need other configurations?

Pricing

Pay only for the compute that is in active use

Interruptible spot pricing is available at a 50% discount

Multi-GPU options available in 1x, 2x, and 4x configurations

| Compute type | CPU | RAM | GPU VRAM | On-demand price | Spot price |
|---|---|---|---|---|---|
| B200 SXM6 180GB | 30 | 240 GB | 180 GB | $5.39/h | $2.69/h |
| H200 SXM5 141GB | 21 | 175 GB | 141 GB | $4.13/h | $2.06/h |
| H100 SXM5 80GB | 21 | 175 GB | 80 GB | $3.98/h | $1.99/h |
| A100 SXM4 80GB | 21 | 110 GB | 80 GB | $1.75/h | $0.88/h |
| A100 SXM4 40GB | 21 | 110 GB | 40 GB | $1.29/h | $0.65/h |
| L40S 48GB | 20 | 58 GB | 48 GB | $1.29/h | $0.65/h |
| RTX6000 Ada 48GB | 10 | 58 GB | 48 GB | $1.29/h | $0.65/h |

Note: All specs and prices are for a 1x GPU configuration.
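As a concrete example of the pay-per-use model, the monthly cost arithmetic for a bursty workload follows directly from the table rates; the 120 active hours below are an illustrative assumption.

```python
# Pay-per-use cost arithmetic using the H100 rates from the table above.
ON_DEMAND = 3.98   # H100 SXM5 80GB, $/h on-demand
SPOT = 1.99        # interruptible spot, $/h (50% of on-demand)

active_hours = 120  # assumed hours of actual inference per month; idle time is free
on_demand_cost = active_hours * ON_DEMAND
spot_cost = active_hours * SPOT

print(f"on-demand: ${on_demand_cost:.2f}, spot: ${spot_cost:.2f}")
# on-demand: $477.60, spot: $238.80
```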

Auto-scaling support

Control scaling sensitivity based on queue length per replica, with additional scaling metrics and attributes available.
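The queue-length-per-replica rule can be sketched as a simple ceiling division; the target value and bounds here are illustrative assumptions, not DataCrunch defaults.

```python
# Sketch of queue-length based auto-scaling: one replica per `target_per_replica`
# queued requests, bounded by min/max. Values are illustrative, not platform defaults.
import math

def desired_replicas(queue_length, target_per_replica=10, min_replicas=0, max_replicas=100):
    """Scale so each replica serves at most `target_per_replica` queued requests;
    an empty queue scales to zero when min_replicas == 0."""
    if queue_length <= 0:
        return min_replicas
    needed = math.ceil(queue_length / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))

print(desired_replicas(0))     # 0   (scale to zero)
print(desired_replicas(25))    # 3   (ceil(25 / 10))
print(desired_replicas(5000))  # 100 (capped at max_replicas)
```

Lowering `target_per_replica` makes scaling more aggressive; raising it trades latency for cost.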

See the docs.

Sync and async requests

Access your container directly in synchronous mode, or use our cloud platform to run your workloads asynchronously.
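The two request modes follow a common pattern: a synchronous call returns the result directly, while an asynchronous call is accepted into a queue and polled later. The sketch below uses a fake in-process transport so it runs offline; the URLs and response fields are illustrative assumptions, not the DataCrunch API.

```python
# Sketch of sync vs. async request patterns. `post` is a stand-in for an HTTP
# client (e.g. requests.post); URLs and fields are illustrative assumptions.
import uuid

JOBS = {}  # pretend server-side job store

def post(url, payload):
    if url.endswith("/sync"):
        # Synchronous: block until the container answers, get the result inline.
        return {"status": 200, "result": f"echo:{payload['input']}"}
    # Asynchronous: request is accepted (202) and queued; poll with the job id.
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "queued", "input": payload["input"]}
    return {"status": 202, "job_id": job_id}

def get_status(job_id):
    return JOBS[job_id]["status"]

sync_resp = post("https://example.invalid/endpoint/sync", {"input": "hello"})
async_resp = post("https://example.invalid/endpoint/async", {"input": "hello"})
print(sync_resp["result"], async_resp["status"])  # echo:hello 202
```

Async mode suits long-running inference where clients should not hold a connection open.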

See the docs.

Metrics

Get detailed, time-series metrics on replica count, GPU and CPU utilization, request rates, inference duration, and queue size with complementary endpoints for Prometheus or Loki.
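If you scrape the metrics endpoint yourself, Prometheus-compatible endpoints serve the standard text exposition format; a minimal parser is sketched below. The metric names in the sample are illustrative placeholders, not DataCrunch's actual metric names.

```python
# Parse a Prometheus text-exposition payload, as served by a metrics endpoint.
# Metric names below are illustrative placeholders.
SAMPLE = """\
# HELP replica_count Current number of running replicas
# TYPE replica_count gauge
replica_count{deployment="whisper-api"} 3
# HELP queue_size Requests waiting per deployment
# TYPE queue_size gauge
queue_size{deployment="whisper-api"} 17
"""

def parse_metrics(text):
    """Return {metric_name: value} for simple gauge sample lines; comments skipped."""
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        name_part, value = line.rsplit(" ", 1)
        out[name_part.split("{", 1)[0]] = float(value)
    return out

print(parse_metrics(SAMPLE))  # {'replica_count': 3.0, 'queue_size': 17.0}
```

In practice you would point Prometheus at the endpoint directly rather than parse by hand.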


Logs

Access container-level logs in real time to debug errors, trace requests, and monitor applications directly from the UI.


API and SDK

Interact with Serverless Containers through the DataCrunch Public API or the official Python SDK.

Get real-time observability your way

Integrate DataCrunch Serverless Containers with your Prometheus or Grafana stack

Looking for something different?

Other inference services

You can also reach us via the contact form

Customer feedback

What they say about us...

  • Quote

    Having direct contact between our engineering teams enables us to move incredibly fast. Being able to deploy any model at scale is exactly what we need in this fast-moving industry. DataCrunch enables us to deploy custom models quickly and effortlessly.

    Iván de Prado Head of AI at Freepik
  • Quote

    From deployment to training, our entire language model journey was powered by DataCrunch's clusters. Their high-performance servers and storage solutions allowed us to maintain smooth operations and maximum uptime, and to focus on achieving exceptional results without worrying about hardware issues.

    José Pombal AI Research Scientist at Unbabel
  • Quote

    DataCrunch powers our entire monitoring and security infrastructure with exceptional reliability. We also enforce firewall restrictions to protect against unauthorized access. Thanks to DataCrunch, our training clusters run smoothly and securely.

    Nicola Sosio ML Engineer at Prem AI
Deploy now