X · @togethercompute June 23, 2026 · X / Twitter

.@cartesia runs one of the hardest inference workloads: real-time voice. Their stack has to keep long-lived streams moving, serve millions of audio mi…

.@cartesia runs one of the hardest inference workloads: real-time voice.Their stack has to keep long-lived streams moving, serve millions of audio minutes a day, and hold model latency around 90ms.Together gives them the managed GPU infrastructure and low-level cluster control to run that in production.Together AI: htt

Read original