Juice Labs’ Quest to Solve the Tech World’s GPU Hunger
OpenAI’s unleashing of ChatGPT onto the world in 2022 lifted GPUs out of obscurity. One day they were something only the most serious gamers, crypto miners and scientists working with supercomputers concerned themselves with, and the next they were being featured in headlines by the likes of CNN and Fox Business.
The AI hype pushed demand for the data center-grade variety of these chips beyond suppliers’ ability to churn them out, and the world was worried that the shortage could stifle its ability to get vacation planning advice from a robot.
One idea for alleviating the shortage is to make GPUs that aren’t being used by their owner at a given moment accessible for someone else to use temporarily. This is hard to do, because GPUs act as accelerators for CPUs, which offload computational tasks to them. Normally, both types of chips are part of the same system, often on the same motherboard. This close proximity and direct bus connection make it easier to optimize the speed at which they pass data to each other. Using a GPU remotely means overcoming the extra latency between the machine hosting the GPU and the machine that needs to use it.
This is where Juice Labs comes in. Its software makes it possible to expose GPUs to remote applications via network connections and unlock the opportunity to tap into underutilized GPUs sitting in modern data centers.
'The Biggest Tech Trend in 20 Years'
To explain why the problem Juice Labs wants to solve is important, let's step back and talk about the role of GPUs in modern computing.
As the name “Graphics Processing Unit” suggests, these chips were originally designed to offload graphics processing (video and image rendering) from a computer’s CPU to free up its resources for other things and as a result speed up the system overall. But because GPUs pack enormous computing power that can be used to perform calculations of any type in parallel, they have been adopted for offloading other types of processing as well: first in traditional supercomputers and eventually systems designed expressly for training machine learning and AI models and running inference for applications that put those models to use.
A core piece of the infrastructure that enables the wildly popular AI tools we have at our disposal today, GPUs have become a hot commodity. Steve Golik, co-founder and CEO of Juice Labs, believes that GPU-accelerated computing has become "the biggest tech trend in 20 years."
Even when they are available, however, high-performance GPUs are pricey, costing tens of thousands of dollars each. Unless your workloads will actually use them at full capacity on an ongoing basis, it's hard to make a case for buying your own. Juice Labs' GPU-over-IP software, available in both an open source version and an enterprise edition, allows businesses to connect over a network to remote GPUs located in third-party servers or data centers.
This makes it possible for companies to essentially rent the expensive hardware from other companies that have capacity to spare. And they can do it without having to migrate workloads between servers because Juice Labs’ software enables direct GPU access via a network connection, an attractive advantage over other methods of renting GPU capacity, such as "GPU-as-a-Service" cloud offerings that require workloads to operate directly on the physical servers where the GPUs live.
Juice Labs is making it possible to get more use out of a "tremendously valuable resource that is usually used inefficiently due to local access problems," Golik says. The company's ultimate goal is to end what he calls "dark GPU": resources that sit underutilized in a world desperately in need of them.
Art of the Possible
Golik admits that what Juice Labs is doing is no easy task: "We're doing something that has never been done before, which is stretching a GPU-dependent workload over the network."
That's challenging because GPUs are designed to move vast volumes of data over PCIe buses on a motherboard, not the network. Finding a way to transmit enormous amounts of information from GPUs to applications over a network interface without suffering a major performance hit is difficult, to put it mildly.
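To get a rough sense of the gap, here is a back-of-envelope sketch comparing how long a single buffer takes to cross a local PCIe link versus common network links. The bandwidth figures are approximate nominal peaks rather than measurements, the 8 GB payload is a hypothetical example, and the function name is made up for illustration; real-world throughput will be lower on every link.

```python
# Back-of-envelope estimate of transfer time over different links.
# Bandwidths are approximate nominal peaks (assumptions), not benchmarks.
LINKS_GB_PER_S = {
    "PCIe 4.0 x16 (approx.)": 32.0,
    "100 Gb Ethernet": 12.5,   # 100 gigabits/s divided by 8
    "10 Gb Ethernet": 1.25,    # 10 gigabits/s divided by 8
}


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float,
                     latency_s: float = 0.0) -> float:
    """Idealized time to move size_gb gigabytes: fixed latency plus size / bandwidth."""
    return latency_s + size_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    payload_gb = 8.0  # hypothetical payload size
    for name, bandwidth in LINKS_GB_PER_S.items():
        print(f"{name}: {transfer_seconds(payload_gb, bandwidth):.2f} s")
```

Even in this idealized model, the same payload takes many times longer over a 10 Gb network link than over the local bus, which is the gap any GPU-over-IP approach has to close.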
Using techniques like sophisticated data compression and caching, however, Juice Labs has found a way to make it work. The company's software achieves performance outcomes that are in the "high 90s percentage-wise" compared to accessing GPUs within a server, Golik says.
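Juice Labs does not spell out how its compression and caching work, so the following is only a toy illustration of the general idea, not the company's implementation. It assumes a hypothetical sender that remembers which buffers the remote side already holds: a repeated buffer is replaced by a short hash reference, and a new one is compressed before it goes on the wire. The class and method names are invented for this sketch.

```python
import hashlib
import zlib


class TransferCache:
    """Toy sender-side model: send a reference for repeats, compress everything else."""

    def __init__(self) -> None:
        # Digests of buffers assumed to be already stored on the remote side.
        self.seen: set[bytes] = set()

    def encode(self, payload: bytes) -> tuple[str, bytes]:
        digest = hashlib.sha256(payload).digest()
        if digest in self.seen:
            # Cache hit: only the 32-byte digest needs to cross the network.
            return ("ref", digest)
        self.seen.add(digest)
        # Cache miss: send the compressed payload.
        return ("data", zlib.compress(payload))


if __name__ == "__main__":
    cache = TransferCache()
    buffer = bytes(1_000_000)  # hypothetical, highly compressible buffer of zeros
    for attempt in range(2):
        kind, body = cache.encode(buffer)
        print(f"send {attempt + 1}: {kind}, {len(body)} bytes on the wire")
```

How much this helps depends entirely on the workload: data that is already compressed or never repeats gains little from either technique.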
Juice Labs' user experience is simple. The solution runs as "user space software that you install on Linux or Windows just like you would any other app," Golik explains. From there, the software exposes local GPUs as virtual devices to remote applications, which can connect to the GPUs without requiring custom application logic or complex operating system setups.
The Quest for Optimization
The optimizations that Juice Labs has built into its software are only part of Golik's vision for combining the performance of GPUs with the flexibility of network-connected workloads. The company is also working with data center providers, including Equinix, to plan further optimizations at the network level that would improve performance even more.
By taking advantage of direct data center interconnection, for example, Juice Labs could bring the performance of its solution even closer to that of local GPUs. "Equinix could give us the bandwidth and latency optimizations we need to reach the next level of performance," Golik says.
Not only that, but he also sees data center operators as being in a great position to help connect companies with excess GPU resources to those who might benefit from them. "We envision working closely to find Equinix customers who want to put their GPUs online or who need more GPU and can't get it," he explains.
For now, the three-year-old company is still working on fully building out its vision. But it's not too early to say that by transforming GPUs into a resource that can be consumed quickly and efficiently over the network, Juice Labs is addressing a real need in the market and just may reshape the way businesses everywhere think about GPU-dependent workloads.