MemryX Inc., a company delivering production AI inference acceleration, announced its strategic roadmap for the MX4. The next-generation accelerator is engineered to scale the company’s “at-memory” dataflow architecture from edge deployments into the data center, leveraging 3D hybrid-bonded memory to eliminate the industry’s most pressing bottleneck: the “memory wall.”
MemryX is currently in production with its MX3 silicon, delivering >20× better performance per watt than mainstream GPUs for targeted AI inference applications. With MX4, MemryX is extending that production-proven foundation to address data center workloads increasingly constrained not by compute, but by memory capacity, bandwidth, and energy efficiency.
MemryX has now signed an agreement with a next-generation 3D memory partner to execute a dedicated 2026 test chip program, validating a targeted ~5µm-class hybrid-bonded interface and direct-to-tile memory integration. The partner is not disclosed at this time.
The announcement comes as the semiconductor industry increasingly prioritizes deterministic inference architectures for the next era of AI processing, a shift reinforced by recent multibillion-dollar licensing and investment activity across AI hardware, such as Nvidia’s $20B deal with Groq, which underscores the strategic value of efficient inference solutions. While the first generation of dataflow solutions proved the efficiency of 2D SRAM, MemryX is moving into the third dimension to address the power, cost, and complexity constraints of frontier AI workloads.
Software Continuity: Leveraging the MX3 Compiler Foundation
MemryX plans to leverage its mature, production-proven MX3 software stack – including its compiler and runtime – as the foundation for MX4. While MX4 introduces new capabilities to support larger memory footprints and data center-scale configurations, the roadmap is designed to preserve key elements of the MX3 programming model and toolchain to accelerate adoption and shorten time-to-deployment for existing and new customers.
Beyond LLMs: Powering Frontier Inference
While Large Language Models (LLMs) remain a priority, the data center is rapidly evolving toward Large Action Models (LAMs), high-resolution multimodal vision, and real-time recommendation engines. These “frontier workloads” require massive memory capacity and predictable throughput that traditional 2.5D HBM-based architectures struggle to provide efficiently.
The MX4 addresses this by physically bonding high-bandwidth memory directly to compute tiles, shifting the focus from data movement back to high-efficiency computation.
SOURCE: PRNewswire
























