Lumai’s Photonic Chip Harnesses Light for Big AI Compute Speedup
Silicon photonics is emerging as a way to move massive amounts of data among GPUs and CPUs in HPC systems, but what if you could compute purely with light? That’s the gist of the optical computing technology developed by Lumai, which today launched its first server, dubbed Iris.
Lumai was spun out of Oxford University, where it pioneered a novel approach to processing data in light. The company’s three-dimensional optical technology uses a combination of lasers and membranes to encode data as light and then perform a series of calculations on it.
“At the heart of AI, it’s about vector-matrix or matrix-matrix multiplication,” said Phil Burr, head of product for Lumai. “What we do is we encode those incoming vectors in light. We effectively do a copy for free in light by passing that vector in through a lens. And then we copy that vector across a matrix. So we encode matrix values in the transmissivity of that membrane.”
The optical technology allows users to do the same type of calculations that are done on TPUs and GPUs, he said. One of the advantages that Lumai’s technology brings (besides energy savings) is the capability to calculate very large matrices, up to 2,048 by 2,048.
(Source: Lumai)
“That means it’s very efficient, very fast,” Burr said. “If you’re doing that in hardware, you can’t do it with such a large matrix. You’d have to subdivide into much smaller matrices. And then you have to move that data around to reconstitute the matrix and that’s very wasteful. So that’s what makes [Lumai’s approach] really efficient.”
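To make Burr’s point concrete, here is a minimal sketch (not Lumai’s implementation) contrasting a vector-matrix multiply done in one pass with the same product subdivided into tiles. Digital hardware with a smaller multiply unit must split a large matrix into blocks and shuttle partial sums around to reconstitute the result; an optical pass over the full matrix avoids that data movement.

```python
def matvec(x, M):
    """Full vector-matrix product y = x @ M in a single pass."""
    n = len(x)
    return [sum(x[i] * M[i][j] for i in range(n)) for j in range(len(M[0]))]

def tiled_matvec(x, M, tile):
    """Same product, split into tile-by-tile blocks with partial-sum
    accumulation -- the extra bookkeeping and data movement that a
    sufficiently large one-shot matrix multiply avoids."""
    n, m = len(x), len(M[0])
    y = [0.0] * m
    for i0 in range(0, n, tile):          # loop over row blocks
        for j0 in range(0, m, tile):      # loop over column blocks
            for j in range(j0, min(j0 + tile, m)):
                y[j] += sum(x[i] * M[i][j]
                            for i in range(i0, min(i0 + tile, n)))
    return y

# Small demo: both paths produce the same result, but the tiled path
# touches each block separately and accumulates partial results.
x = [1.0, 2.0, 3.0, 4.0]
M = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
assert matvec(x, M) == tiled_matvec(x, M, tile=2)
```

The numbers here are tiny for readability; the argument is about scale. At 2,048 by 2,048, the tiling and re-accumulation overhead on digital hardware becomes the waste Burr describes.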
Lumai is launching three servers in its Iris family: Nova, Aura, and Tetra. Nova is available today for evaluation by hyperscalers, neo-clouds, enterprises, and research institutions. It’s capable of running Llama 8B and 70B using a hybrid processor. The company’s volume product, Aura, is slated for 2028, while Tetra is penciled in for 2029.
Lumai says its technology can perform the same matrix multiplication for AI inference workloads while using 90% less electricity compared to a GPU-based system. This is due to a peculiar scaling characteristic of 3D optical computing.
“There is a cost of conversion, from digital to optical,” Burr said. “The power cost of that is proportional to the width of that vector, whereas the performance is the square. So essentially as you increase that matrix size, the efficiency goes up.”
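The scaling Burr describes can be written out as a back-of-the-envelope calculation (the constants below are invented for illustration): conversion energy grows linearly with the vector width n, while the work done in one n-by-n pass grows with n squared, so operations per unit of energy improve roughly linearly as the matrix gets larger.

```python
# Hypothetical energy units; the constant is made up for illustration only.
CONVERSION_COST_PER_ELEMENT = 1.0

def ops_per_energy(n):
    """Operations performed per unit of conversion energy for an
    n-by-n optical pass, under the linear-cost assumption above."""
    ops = n * n                                # multiply-accumulates per pass
    energy = CONVERSION_COST_PER_ELEMENT * n   # conversion scales with width
    return ops / energy                        # grows linearly in n

# Quadrupling the matrix width quadruples the efficiency in this model.
assert ops_per_energy(2048) / ops_per_energy(512) == 4.0
```

This is why the 2,048-by-2,048 matrix size matters: in this model, the larger the optical matrix, the more compute each joule of digital-to-optical conversion buys.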
Lumai’s optical computers utilize many commercial off-the-shelf technologies that are available in the data center today, including lasers. The same types of lasers that are used for silicon photonics can also be used to power computing with Lumai’s technology, Burr said.
Lumai Iris Nova server (Source: Lumai)
“So there’s already volume manufacturing essentially,” he said. “We don’t need to create any new materials. And so actually in volume this will be lower cost than an Nvidia GPU.”
Similarly, the software stack for programming and operating Lumai servers isn’t as exotic as the optical computation. Lumai plugs into existing data flows, and applications can be developed using frameworks like PyTorch. Lumai develops the hardware-specific kernels that allow developers to program Iris servers using PyTorch, he said.
The huge energy demands of AI threaten to derail the technology before it fully gets off the ground. Nearly half of all data center projects in the U.S. will be delayed or cancelled this year, according to Bloomberg, with the availability of electricity and electrical components (such as transformers) cited as a primary reason.
“Part of the reason there’s so much interest is that people understand that silicon scaling has pretty much stopped,” Burr said. “Yes, you can go down a node, but the benefits are much less now and, essentially, to get more performance, you have more power, more complexity. The packages get bigger and hotter. And so they look at the roadmaps from conventional digital systems and their own roadmaps of software demand, and the two things don’t match. So they recognize [there’s] a need to find new technology.”
Lumai is positioning its technology as a way to power the new world of agentic AI without breaking the energy budget. Specifically, it’s targeting the prefill stage of AI inference, which is often compute bound and benefits from hefty processors that can chew through large amounts of data quickly, such as GPUs and TPUs. The decode stage, by contrast, is typically memory bound.
“As the industry transitions into the inference era, we are simultaneously crossing the threshold into the post-silicon era,” stated Dr. Xianxin Guo, CEO and co-founder of Lumai. “By shifting the computation paradigm from electrons to photons, Lumai can deliver an order-of-magnitude increase in performance with significant energy savings.”
This article first appeared in HPCwire.