NeuReality, an AI infrastructure technology company, announced the release of a new software developer portal and demo for easy installation of its full software stack and APIs. This marks a significant milestone for NeuReality following the delivery and activation of its 7nm AI inference server-on-a-chip, the NR1 NAPU, and the successful bring-up of its entire NR1 AI hardware and software system in the first quarter.
The NR1™ AI Inference Solution enables businesses and governments to run newly trained AI models and existing AI applications without over-investing millions in scarce GPUs. Regardless of AI accelerator performance, CPUs remain the primary performance bottleneck in AI inference, resulting in excessive power consumption and cost, and putting the most exciting AI innovations out of reach for the majority of organizations to install and operate today.
The NR1 system was deemed customer-ready in Q1 2024 after the NAPU arrived from TSMC Taiwan in December, followed by a successful bring-up and system integration in just 90 days. “Completing the seamless bring-up and integration of a sophisticated server-on-a-chip device and a comprehensive hardware/software AI system within a small startup team is nothing short of remarkable,” said Ilan Avital, chief R&D officer at NeuReality.
The system successfully met its target functionality and performance requirements, covering server-on-chip (SoC), IP, and software aspects. This achievement marked its readiness for early customer pilots, particularly with cloud service providers and the financial services and healthcare sectors, for computer vision, automatic speech recognition, and natural language processing – laying an affordable foundation for generative AI, multi-modality, and more advanced technologies to come. NeuReality attributed the swift bring-up of the NR1 system to its robust architecture and to rigorous emulation testing conducted in collaboration with Synopsys before the 2023 tape-out.
The accompanying Software Development Kit (SDK) is designed exclusively for high-volume, high-variety AI workloads in enterprise data centers. It contains hierarchical tools for all types of compute engines and XPUs, along with optimized partitioning – making it easy to install, manage, and scale while giving developers their time back from the traditional complexity of deploying AI Inference.
NeuReality’s solution delivers an unprecedented developer experience, with significant flexibility to deploy the most advanced and complex AI pipelines more easily, based on the specific needs of each project. It provides developers with a toolchain for complete AI pipeline acceleration, along with orchestration, provisioning, and inference runtime APIs, to streamline the AI deployment workflow. All of this and more is now documented in a new technical whitepaper intended to inspire innovators to focus urgently on end-to-end data center efficiency and resource optimization to enable affordable AI deployments.
Citing a 35% AI adoption rate globally and a rate below 25% in the U.S., NeuReality is focused on lowering market barriers for mainstream industries. “It’s simply out of reach to the majority of businesses,” added Avital. “We can start changing that now by reducing high power consumption at the source – and educating customers that the ideal AI Inference servers require fundamentally different and more efficient server configurations than big supercomputers and high-end GPUs used in AI training.”
For example, according to the company, NeuReality’s NR1-S™ AI Inference Appliance outperforms the Nvidia DGX H100 System: it offers the same deep learning processing performance but with 6x the data processing performance, at half the price, one-third of the energy consumption, and half the physical space – all without requiring a host CPU in the system. The NR1 engineering involved packing 6.5x more processing power onto the NR1 NAPU, equivalent to 830 CPU cores in a single 4U chassis, while at the same time having enough power to host 10 Nvidia GPUs or any AI accelerator.
SOURCE: Business Wire