Simli is on a mission to make interactive, lifelike AI avatars a foundational part of future digital experiences across e-commerce, customer support, corporate training, EdTech, and beyond. Achieving this requires the production-grade GPU compute, ultra-low latency, and cost-efficiency that DataCrunch delivers.
Simli benefits from a fast-moving partnership with DataCrunch, collaborating on the design and configuration of bare-metal GPU clusters that meet the requirements for Simli’s real-time inference workloads.
In addition, Simli uses on-demand GPU resources from DataCrunch to scale with user requests and to run research and development workloads. Thanks to how DataCrunch handles disk objects, Simli sees 30–50% faster startup times and resulting cost savings.
About Simli
Simli develops compute-efficient, interactive AI avatars targeting businesses seeking to enhance AI-powered applications, spanning various sectors like e-commerce, customer service, and EdTech. Their primary focus is on providing an excellent and user-friendly developer experience.
Simli’s interactive AI offering utilizes a novel 3D neural architecture based on Gaussian splatting, a new graphics primitive offering high visual fidelity and compute efficiency. Unlike alternatives that often rely on video-based lip-syncing, Simli's neural network provides full control over 3D animation, allowing the entire character's face – not just the lips – to be animated in response to audio. This results in more dynamic avatars that are also more cost-efficient.
Simli, an early-stage startup, was founded by an experienced team of AI builders and serial founders who understood the importance of time-to-market. Their story of starting in stealth mode and moving fast through rapid experimentation to production-grade workloads is similar to what many other AI-native startups experience.
Although Simli initially relied on credits from traditional hyperscalers, the team knew that the hyperscalers could not provide the customization, flexibility, and responsiveness that more agile AI neocloud providers could.
“DataCrunch is the perfect mix of being nimble and having a stable, production-grade product. With DataCrunch, we can promise our customers high uptimes and competitive SLAs.” – Lars Vågnes, Founder & CEO, Simli
The Challenge: Delivering Lifelike, Interactive Avatars Cost-Efficiently
Creating avatars that respond in real time to human speech with full facial expressions – not just lip-syncing – is no small feat. It requires ultra-low latency and production-grade stability, all while remaining cost-efficient.
“We needed production-grade reliability with pricing that made sense for a startup. DataCrunch hit that sweet spot.” – Lars Vågnes
To deliver a compelling user experience, Simli created custom workflows, including their proprietary load balancer and the creation of WebRTC-based peer-to-peer connections with end-users. These network-sensitive processes, which involve running multiple interacting Docker instances, required a customized bare-metal GPU cluster from DataCrunch.
Business benefits:
- 2–3× more avatar sessions per dollar compared to hyperscalers
- 30–50% faster GPU startup times, from 5 minutes to under 2 minutes
- Significant cost reduction through flexible disk management and access to a broad range of GPU models
- Cost-effective inference scaling for real-time interactive services
The DataCrunch Solution
After evaluating multiple AI neocloud providers, Simli chose DataCrunch for its:
- Production-grade reliability
- Developer-first experience
- Self-service GPU availability
- Ultimate cost-efficiency
“We found that other providers often offered cheaper GPUs but lacked the reliability required for a production-level, low-latency, API-driven service like ours,” noted Lars, explaining why DataCrunch stood out as the optimal choice.
“Startup times and compute costs both dropped significantly. We’re now streaming more avatars, faster, and at lower costs than ever.” – Lars Vågnes
Solution requirements:
- <300ms latency
- Fast GPU startup times - under 2 minutes
- Access to a broad spectrum of GPU models
- API-driven automation and scaling
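As an illustration of how requirements like these could be checked in practice, here is a minimal sketch that compares measured values against the two numeric thresholds in the list. The sample data, the choice of the 95th percentile, and all function names are assumptions made for illustration; the source does not describe how Simli measures latency or startup time.

```python
import math

# Thresholds taken from the requirements list above.
LATENCY_BUDGET_MS = 300   # "<300ms latency"
STARTUP_BUDGET_S = 120    # "under 2 minutes"


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of the samples; pct is in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def meets_requirements(latencies_ms: list[float], startup_s: float) -> bool:
    """True if p95 latency and GPU startup time are both within budget.

    Using p95 rather than the mean is an assumption of this sketch.
    """
    p95 = percentile(latencies_ms, 95)
    return p95 < LATENCY_BUDGET_MS and startup_s < STARTUP_BUDGET_S


# Hypothetical measurements, not real Simli data.
latencies = [180.0, 210.0, 195.0, 250.0, 240.0,
             205.0, 220.0, 190.0, 230.0, 280.0]
print(meets_requirements(latencies, startup_s=95.0))
```

A percentile is used here because a real-time service is usually judged by its slowest typical responses rather than by its average; a different statistic could be substituted without changing the structure.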
Simli purpose-built its stack for cost-efficient real-time inference. The broad availability of GPU models from DataCrunch allowed Simli to explore options and find the best fit for each of its critical workloads in terms of latency and cost. A reliable network infrastructure was also critical for Simli's WebRTC-based real-time connections.
With DataCrunch, Simli experienced 30–50% faster GPU startup times (reduced from 5 minutes to under 2 minutes). They attributed this improvement largely to how DataCrunch handles disk objects, which allows Simli to pre-load large amounts of data onto disks and attach them efficiently to GPUs when instances are provisioned. In addition, DataCrunch makes it easy to detach and reattach disks and to avoid preemption issues with on-demand resources.
Team: The Developer Experience
With only a few engineers split between building the product and managing infrastructure, every minute saved matters. Simli’s agile team – especially their lead engineer, Antony Kiroles – benefited from fewer headaches, better uptime, and a developer-first offering from DataCrunch.
“Quality of life went up. We don’t have to deal with the quirks and preemptions we had on other platforms. Ultimately, a great developer experience is being able to run workloads when and where you need to, without sales delays, needing to contact support, or getting stuck in strange Docker environments.” – Lars Vågnes
“Our overall experience with DataCrunch's cloud platform has been great. Things work reliably, and the customer support and responsiveness allow for rapid problem resolution like quick quota adjustments,” Lars emphasized.
Key Takeaways
The fast-moving partnership between Simli and DataCrunch demonstrates several critical success factors for AI-native startups:
- Cost Optimization: Achieving 2–3× better compute efficiency enables sustainable scaling of interactive applications.
- Performance Requirements: Meeting the sub-300ms latency needed for real-time applications requires specialized infrastructure.
- Reliability: Production-grade stability is essential for API services, especially for interactive applications.
- Support Quality: Direct access to technical support teams enables rapid problem resolution and knowledge sharing.
- Flexibility: Fast access to a wide range of GPU models allows optimizing workloads for latency and cost.
Simli's API offering now delivers interactive AI avatars at significantly lower cost while maintaining the <300ms latency required for real-time interactions.
Looking Ahead: Unlocking Interactive AI
Simli is on a mission to reshape the economics of interactive AI. After a year in development, Simli unveils Trinity-1, the first real-time, interactive Gaussian avatar API.
Trinity-1 opens up interactive AI to millions of users at less than 1 cent per minute, compared with market rates of 5–20 cents per minute.
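To put the per-minute figures in perspective, the following is a minimal sketch of the arithmetic. It assumes a flat per-minute price billed in US dollars and a hypothetical usage of 100,000 avatar minutes per month; the usage figure and the function name are illustrative and do not come from Simli. It treats 1 cent as an upper bound for Trinity-1 and 5 to 20 cents as the market range quoted above.

```python
from fractions import Fraction

# Prices in cents per minute, taken from the figures quoted above.
TRINITY_MAX_CENTS = Fraction(1)                   # "less than 1 cent" (upper bound)
MARKET_RANGE_CENTS = (Fraction(5), Fraction(20))  # quoted market range


def monthly_cost_usd(minutes: int, cents_per_minute: Fraction) -> Fraction:
    """Cost in dollars, assuming a flat per-minute price with no discounts."""
    return minutes * cents_per_minute / 100


minutes = 100_000  # hypothetical monthly usage, for illustration only
trinity_upper = monthly_cost_usd(minutes, TRINITY_MAX_CENTS)
market_low, market_high = (monthly_cost_usd(minutes, p) for p in MARKET_RANGE_CENTS)

print(f"Trinity-1: under ${float(trinity_upper):,.0f}")
print(f"Market:    ${float(market_low):,.0f} to ${float(market_high):,.0f}")
print(f"Ratio:     at least {market_low / trinity_upper}x to {market_high / trinity_upper}x")
```

Because the Trinity-1 price is stated only as an upper bound, the printed ratios are lower bounds on the difference under these assumptions.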
- To learn more about how Simli unlocks interactive AI with Trinity-1, visit www.simli.com or watch the demo
- To power your AI product with base and on-demand GPU resources, get started with DataCrunch Cloud Platform