OpenAI Diversifies Its Compute Stack With Multi-Billion Dollar Cerebras Partnership
OpenAI agrees to purchase up to 750 megawatts of Cerebras inference capacity and confirms an investment in Merge Labs, the brain-computer interface startup founded by Sam Altman.

By Creati.ai Editorial Team
January 16, 2026
In a landmark move that signals a decisive shift in the artificial intelligence hardware landscape, OpenAI has announced a multi-billion dollar partnership with chip startup Cerebras Systems. This strategic alliance, revealed on January 15, 2026, aims to secure massive computing capacity dedicated specifically to AI inference, addressing the skyrocketing demand from the company’s reported 800 million weekly users.
Coming on the heels of this infrastructure overhaul, OpenAI also confirmed today, January 16, a significant investment in Merge Labs, a brain-computer interface (BCI) startup founded by Sam Altman. Together, these developments paint a picture of an industry giant aggressively fortifying its supply chain while simultaneously expanding the frontiers of human-AI interaction.
For years, the generative AI revolution has been powered almost exclusively by Nvidia’s graphics processing units (GPUs). However, as model adoption transitions from training to large-scale deployment, the bottleneck has shifted. OpenAI’s agreement to purchase up to 750 megawatts of computing capacity from Cerebras over the next three years marks one of the most significant diversifications in AI infrastructure to date.
The deal highlights a critical evolution in the AI lifecycle: the era of "inference at scale." While training requires the raw, versatile power of Nvidia’s H100 or Blackwell clusters, inference—the act of the model generating responses to user prompts—demands low latency and high throughput. Cerebras, known for its dinner-plate-sized Wafer-Scale Engines (WSE), offers a specialized architecture that eliminates the memory bandwidth bottlenecks common in traditional GPU clusters.
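To see why memory bandwidth, rather than raw compute, dominates inference speed, a back-of-envelope calculation helps: during autoregressive decoding, each generated token requires streaming roughly all of the model's weights through memory, so single-stream throughput is bounded by bandwidth divided by model size. The sketch below illustrates this reasoning; all hardware figures are illustrative assumptions, not vendor specifications.

```python
# Back-of-envelope: decode throughput for a memory-bandwidth-bound LLM.
# Each generated token streams (roughly) all model weights from memory,
# so tokens/sec per stream ~ memory bandwidth / bytes of weights.
# All hardware numbers are illustrative assumptions, not vendor specs.

def decode_tokens_per_sec(mem_bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on single-stream decode speed (tokens/second)."""
    return mem_bandwidth_gb_s / model_gb

MODEL_GB = 140.0  # e.g. a 70B-parameter model at FP16 (~2 bytes/param)

# Off-chip HBM on a datacenter GPU (assumed ~3,300 GB/s)
gpu_bound = decode_tokens_per_sec(3_300, MODEL_GB)

# Aggregate on-chip SRAM bandwidth, wafer-scale class (assumed ~20,000,000 GB/s,
# with the model sharded across the wafer's cores)
wafer_bound = decode_tokens_per_sec(20_000_000, MODEL_GB)

print(f"GPU-class bound:   ~{gpu_bound:,.0f} tokens/s per stream")
print(f"Wafer-class bound: ~{wafer_bound:,.0f} tokens/s per stream")
```

The gap between the two bounds, several orders of magnitude, is the core of the architectural argument, even if real-world throughput lands well below either ceiling.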
Sachin Katti, an OpenAI executive, emphasized the necessity of this shift, stating, "OpenAI's compute strategy is to build a resilient portfolio that matches the right systems to the right workloads. Cerebras adds a dedicated low-latency inference solution to our platform. That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people."
The core of this partnership lies in Cerebras’s unique approach to chip design. Unlike traditional processors that are cut from a silicon wafer and packaged individually, Cerebras builds a single, massive chip that encompasses the entire wafer. This design allows for communication speeds between cores that are orders of magnitude faster than what is possible between separate GPUs linked by cables.
For OpenAI’s "O" series models (such as o3 and the rumored o4), which rely on complex "chain-of-thought" reasoning, inference latency is a critical performance metric. The Cerebras architecture is particularly well-suited for these long-context, reasoning-heavy workloads where data needs to move instantly between memory and compute units.
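The latency sensitivity of reasoning models comes down to simple multiplication: a chain-of-thought model may generate thousands of hidden reasoning tokens before the first visible answer token, so per-token latency is multiplied by the length of that hidden chain. A minimal sketch, using assumed (not measured) figures:

```python
# Why reasoning models are so latency-sensitive: hidden chain-of-thought
# tokens are generated before the answer begins, multiplying the wait.
# Token counts and speeds below are assumptions for illustration only.

def time_to_first_answer_s(reasoning_tokens: int, tokens_per_sec: float) -> float:
    """Seconds a user waits while the model 'thinks' before answering."""
    return reasoning_tokens / tokens_per_sec

REASONING_TOKENS = 2_000  # assumed hidden chain-of-thought length

for label, tps in [("GPU-class (assumed 50 tok/s)", 50.0),
                   ("low-latency inference (assumed 1,000 tok/s)", 1_000.0)]:
    wait = time_to_first_answer_s(REASONING_TOKENS, tps)
    print(f"{label}: ~{wait:.1f} s before the answer starts")
```

Under these assumptions, the same model goes from a 40-second pause to a 2-second one, which is the difference between an unusable product and a conversational one.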
Andrew Feldman, co-founder and CEO of Cerebras, compared the shift to the broadband revolution: "Just as broadband transformed the internet, real-time inference will transform AI, enabling entirely new ways to build and interact with AI models."
The following table outlines the key technical and operational differences driving OpenAI’s decision to integrate Cerebras into its stack.
| **Feature/Metric** | **Traditional GPU Clusters (e.g., Nvidia)** | **Cerebras Wafer-Scale Engine** |
|---|---|---|
| Architecture Design | Individual chips connected via interconnects (NVLink) | Single massive chip (wafer-scale) with on-chip interconnect |
| Memory Bandwidth | High, but limited by off-chip data transfer | Extremely high (model weights held in on-chip SRAM) |
| Primary Use Case | General purpose (Training & Inference) | Specialized High-Performance Inference & Training |
| Scalability Challenge | Requires complex cabling and networking switches | Scales by adding "CS" nodes; simpler cluster management |
| Latency Profile | Variable, dependent on network hops | Deterministic and ultra-low |
| Energy Efficiency | High power consumption per token | Optimized for specific workloads, potentially lower per-token energy |
This partnership is not an isolated event but part of a broader "resilient portfolio" strategy orchestrated by OpenAI. Throughout late 2025, reports surfaced of OpenAI collaborating with Broadcom to develop custom inference chips and planning to deploy AMD’s latest Instinct accelerators.
By bringing Cerebras into the fold, OpenAI is effectively hedging its bets against supply chain disruptions and pricing power concentration. The dominance of a single hardware supplier has long been a risk factor for AI labs; diversifying into custom silicon and alternative architectures provides OpenAI with leverage and security.
While the Cerebras deal shores up the digital infrastructure, OpenAI’s latest investment points toward a future of biological integration. On January 16, 2026, OpenAI announced a strategic investment in Merge Labs, a brain-computer interface (BCI) startup.
Founded by Sam Altman, Merge Labs operates in a space popularized by Neuralink, aiming to develop safe, scalable interfaces that can bridge the human brain with artificial intelligence. Unlike the immediate utility of the Cerebras deal, this investment is a long-term bet on the "human-in-the-loop" evolving into "human-integrated-with-loop."
As we move further into 2026, the narrative of "AI scaling" is changing. It is no longer just about training larger models on more data; it is about the logistics of intelligence. How do you serve a reasoning model to a billion users instantly? How do you power it efficiently? And ultimately, how do humans interact with it most effectively?
OpenAI’s dual announcements this week point toward answers to each of these questions.
For developers and enterprise users of the OpenAI API, the Cerebras integration is expected to roll out in phases through 2028, with initial "low-latency" endpoints likely becoming available later this year. This promises to unlock a new tier of real-time AI applications, from instant voice translation to autonomous agents that can "think" and act in milliseconds.
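For developers evaluating such endpoints, the metric that matters most for real-time applications is time-to-first-token (TTFT): how long a streaming response takes to start. The harness below is a self-contained sketch of measuring it; `fake_stream` is a stand-in stub that simulates server-side delay, not OpenAI's actual API.

```python
import time
from typing import Callable, Iterator

def fake_stream(first_token_delay_s: float, n_tokens: int = 5) -> Iterator[str]:
    """Stand-in for a streaming chat endpoint (NOT the real OpenAI API):
    sleeps to simulate server-side 'thinking', then yields tokens."""
    time.sleep(first_token_delay_s)
    for i in range(n_tokens):
        yield f"tok{i}"

def time_to_first_token(make_stream: Callable[[], Iterator[str]]) -> float:
    """Measure seconds until the first streamed token arrives."""
    start = time.monotonic()
    next(make_stream())  # block until the first token is yielded
    return time.monotonic() - start

ttft = time_to_first_token(lambda: fake_stream(first_token_delay_s=0.2))
print(f"time to first token: {ttft:.2f} s")
```

Swapping the stub for a real streaming client call would let the same harness compare a standard endpoint against a low-latency one as they become available.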
Creati.ai Analysis: The "Chip Wars" are far from over, but the battlefield has expanded. By securing capacity from Cerebras, OpenAI has not only solved a looming bottleneck but has also crowned a new contender in the hardware race. The year 2026 is shaping up to be defined not just by how smart AI models get, but by the silicon engines that drive them.