d-Matrix Inc., a hardware startup from Santa Clara, California, has unveiled its first AI processor, Corsair, designed to enhance AI inference without relying on traditional GPUs or costly high-bandwidth memory (HBM).
The processor is backed by Microsoft and promises significant performance and cost benefits for generative AI models.
Corsair is already available to early-access customers, with broader availability set for the second quarter of 2025. The processor is built specifically for demanding AI inference workloads.
For instance, Corsair achieves 60,000 tokens per second at 1 ms per token when running Llama3 8B in a single server.
In more demanding scenarios, such as with Llama3 70B models, Corsair delivers 30,000 tokens per second at 2 ms per token, resulting in major savings in energy and operational costs compared to traditional GPU-based solutions.
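The two figures above can be related with simple arithmetic. The sketch below assumes, and this is an assumption rather than something d-Matrix states, that the per-token latency applies to each individual user stream and that the tokens-per-second figure is the server-wide total across all streams. The function name `concurrent_streams` is hypothetical.

```python
def concurrent_streams(aggregate_tokens_per_s: float, ms_per_token: float) -> float:
    """Estimate how many parallel streams the quoted figures imply.

    Assumption: ms_per_token is the latency seen by one stream, and
    aggregate_tokens_per_s is the total across every stream on the server.
    """
    # One stream emitting a token every ms_per_token milliseconds
    # produces this many tokens per second.
    per_stream_tokens_per_s = 1000.0 / ms_per_token
    return aggregate_tokens_per_s / per_stream_tokens_per_s


# Llama3 8B figures: 60,000 tokens/s at 1 ms per token
print(concurrent_streams(60_000, 1.0))  # 60.0
# Llama3 70B figures: 30,000 tokens/s at 2 ms per token
print(concurrent_streams(30_000, 2.0))  # 60.0
```

Under that reading, both configurations work out to roughly 60 simultaneous streams, which is consistent with the company's emphasis on interactive, multi-user workloads.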
The Corsair processor uses Nighthawk and Jayhawk II tiles and a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, optimized for large-model inference with digital in-memory computation (DIMC) and versatile datatype processing, including block floating point (BFP).
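For readers unfamiliar with block floating point, the general idea is that a block of values shares a single exponent while each value keeps only a small integer mantissa. The sketch below is a generic illustration of that idea, not d-Matrix's actual format: the 8-bit mantissa width, the block contents, and the function names `bfp_quantize` and `bfp_dequantize` are all assumptions chosen for the example.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Encode a block of floats as (shared_exponent, integer mantissas).

    Illustrative only: real BFP formats differ in block size, mantissa
    width, and rounding rules.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so e is the
    # smallest exponent that keeps every value in the block below 2**e.
    shared_exp = math.frexp(max_abs)[1]
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits=8):
    """Recover approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


exp, mants = bfp_quantize([0.5, -0.25, 0.125, 0.0])
print(exp, mants)                   # 0 [64, -32, 16, 0]
print(bfp_dequantize(exp, mants))   # [0.5, -0.25, 0.125, 0.0]
```

Because the exponent is stored once per block instead of once per value, the per-value arithmetic reduces to small-integer operations, which is the usual motivation for using such formats in inference hardware.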
Corsair's chiplet packaging integrates memory and computation to maximize efficiency. The processor conforms to the industry-standard PCIe Gen5 full-height, full-length card form factor, offering scalable performance when paired with DMX Bridge cards.
Corsair's design also includes 2,400 TFLOPs of peak 8-bit compute, 2GB of integrated performance memory, and up to 256GB of off-chip memory capacity. Micron Technology, a key partner of Nvidia, is also collaborating with d-Matrix on the project.
d-Matrix initially planned to launch the Corsair processor in late 2023, but adjusted its architecture to meet the growing demand for generative AI.
This pivot allowed Corsair to integrate enhancements tailored for transformer models and emerging applications like agentic AI and interactive video generation.
Sid Sheth, cofounder and CEO of d-Matrix, emphasized the company's focus on addressing the challenges of AI inference, calling Corsair a groundbreaking platform for high-speed token generation, particularly for applications requiring interactivity with multiple users.