
Microsoft-Backed Startup Launches AI Processor Without GPUs or Expensive HBM Memory


d-Matrix Inc., a hardware startup from Santa Clara, California, has unveiled its first AI processor, Corsair, designed to enhance AI inference without relying on traditional GPUs or costly high-bandwidth memory (HBM).

The company is backed by Microsoft, and it says Corsair offers significant performance and cost benefits for generative AI models.

Corsair is already available to early-access customers, with broader availability set for the second quarter of 2025. It is built specifically for demanding AI inference workloads.

For instance, Corsair achieves 60,000 tokens per second at 1 ms per token when running Llama3 8B in a single server.

With the larger Llama3 70B model, Corsair delivers 30,000 tokens per second at 2 ms per token, which d-Matrix says translates into major energy and operating-cost savings compared with traditional GPU-based solutions.
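To put the two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the per-token latency applies to each user's stream and that the tokens-per-second figure is the total across all streams; the article does not spell this out, so treat both as assumptions. The function name is ours, not d-Matrix's.

```python
def implied_concurrent_streams(aggregate_tokens_per_s: float, ms_per_token: float) -> float:
    """Number of parallel streams implied by a total rate and a per-stream latency.

    Assumption (not stated in the article): each stream emits one token every
    `ms_per_token` milliseconds, and the total rate is the sum over all streams.
    """
    per_stream_tokens_per_s = 1000.0 / ms_per_token
    return aggregate_tokens_per_s / per_stream_tokens_per_s


# Figures quoted in the article
llama3_8b = implied_concurrent_streams(60_000, 1.0)
llama3_70b = implied_concurrent_streams(30_000, 2.0)

print(llama3_8b, llama3_70b)  # prints: 60.0 60.0
```

Under those assumptions, both quoted figures work out to about 60 simultaneous streams, which fits the company's emphasis on interactive use by multiple users.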

The Corsair processor uses Nighthawk and Jayhawk II tiles and a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, optimized for large-model inference with digital in-memory computation (DIMC) and versatile datatype processing, including block floating point (BFP).
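For readers unfamiliar with block floating point, the general idea is that a group of numbers shares a single exponent while each number keeps only a small integer mantissa. The sketch below illustrates that generic idea only. The 8-bit mantissa width, the block contents, and the function names are assumptions chosen for illustration, and nothing here describes Corsair's actual number format.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to signed integer mantissas plus one shared exponent.

    Generic illustration of block floating point; the mantissa width is an assumption.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1
    _, shared_exp = math.frexp(max_abs)
    # One scale for the whole block, sized so the largest value fits the mantissa range
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return mantissas, shared_exp


def bfp_dequantize(mantissas, shared_exp, mantissa_bits=8):
    """Reconstruct approximate floats from the mantissas and the shared exponent."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


mantissas, shared_exp = bfp_quantize([0.5, -0.25, 0.125, 0.0])
print(mantissas, shared_exp)                  # prints: [64, -32, 16, 0] 0
print(bfp_dequantize(mantissas, shared_exp))  # prints: [0.5, -0.25, 0.125, 0.0]
```

Storing one exponent per block rather than per value is what makes the format compact: most of the arithmetic can then run on small integers, at the cost of some precision for values much smaller than the largest one in the block.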

Corsair’s chiplet packaging integrates memory and computation to maximize efficiency. The processor conforms to the industry-standard PCIe Gen5 full-height, full-length card form factor and can scale performance when paired with DMX Bridge cards.

Corsair’s design also includes 2,400 TFLOPS of peak 8-bit compute, 2 GB of integrated performance memory, and up to 256 GB of off-chip memory capacity. Micron Technology, a key Nvidia partner, is also collaborating with d-Matrix on Corsair.

d-Matrix initially planned to launch the Corsair processor in late 2023, but adjusted its architecture to meet the growing demand for generative AI.

This pivot allowed Corsair to integrate enhancements tailored for transformer models and emerging applications like agentic AI and interactive video generation.

Sid Sheth, cofounder and CEO of d-Matrix, emphasized the company’s focus on addressing the challenges of AI inference, calling Corsair a groundbreaking platform for high-speed token generation, particularly for applications requiring interactivity with multiple users.

Written by
Sazid Kabir

