Recogni Inc., the Generative AI inference company, announced its patented logarithmic number system for AI, Pareto, which provides benefits for all main AI chip design criteria without compromise. Designed to change how the world runs GenAI, Pareto radically simplifies AI compute by turning multiplications into additions, making Recogni’s chips smaller, faster, and less energy-hungry.
Helping hyperscalers, compute providers, and enterprises maximize utilization of compute, space, and energy
The latest GenAI models demand multiplications and additions on the order of petaFLOPS, posing challenges in power consumption and computational speed. Pareto addresses these challenges by effectively converting multiplications into additions, significantly reducing power usage and execution time without compromising accuracy. Recogni is first to market with a logarithmic system that outperforms other quantized number systems for GenAI inference. The company’s years of research have culminated in a solution that provides:
- Smaller chip size: The efficiency of Pareto enables a more compact chip design, allowing significantly more compute in a data center and reducing costs
- Lower power consumption: Pareto reduces the power requirements of AI models, outperforming traditional FP8 and FP16 formats, and enabling sustainable AI computing
- High accuracy: AI models using Pareto exhibit an accuracy drop of less than 0.1% with 16-bit precision and less than 1% with 8-bit precision, without the need for retraining
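The general principle behind any logarithmic number system, turning multiplications into additions, can be illustrated with a minimal sketch. This is not Recogni's Pareto format, whose encoding is not described here; the sign-plus-fixed-point-log2 layout, the `FRAC_BITS` constant, and the function names `encode`, `decode`, and `multiply` are all assumptions made for illustration.

```python
import math

# Hypothetical number of fractional bits used to store log2 of the magnitude.
# This is an illustrative choice, not a parameter of Pareto.
FRAC_BITS = 8


def encode(x: float) -> tuple[int, int]:
    """Encode a nonzero real as (sign bit, fixed-point log2 of its magnitude)."""
    if x == 0:
        # A real design needs a dedicated code for zero; omitted in this sketch.
        raise ValueError("zero has no logarithm and needs a special code")
    sign = 1 if x < 0 else 0
    log_fixed = round(math.log2(abs(x)) * (1 << FRAC_BITS))
    return sign, log_fixed


def decode(value: tuple[int, int]) -> float:
    """Convert (sign bit, fixed-point log2) back to an ordinary float."""
    sign, log_fixed = value
    magnitude = 2.0 ** (log_fixed / (1 << FRAC_BITS))
    return -magnitude if sign else magnitude


def multiply(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Multiply two encoded values: XOR the signs, add the logarithms."""
    return a[0] ^ b[0], a[1] + b[1]


if __name__ == "__main__":
    product = multiply(encode(3.0), encode(-5.0))
    print(decode(product))  # close to -15, up to quantization error
```

In this sketch the product costs one integer addition and one XOR, in place of a hardware multiplier. The trade-off in logarithmic number systems generally is that addition of two values becomes the harder operation and is typically handled with approximations or lookup tables; how Pareto handles it is not covered in this announcement.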
“With Pareto, we accelerate the world’s AI ambitions,” said Marc Bolitho, CEO of Recogni. “Pareto’s logarithmic number system has the lowest average error and highest performance for AI models. By turning multiplications into additions, Pareto significantly reduces power consumption, latency, and chip size, making it the optimal choice for modern AI chip design. Organizations running GenAI inference can now keep operating costs lower than any other technology and ensure uncompromised AI model quality for the widest variety of multi-modal GenAI Inference applications and use cases.”
Helping developers to bring new models to production in less time and with higher accuracy
Extensive testing on various AI models, including Mixtral-8x22B, Llama3-70B, Falcon-180B, Stable Diffusion XL, and Llama3.1-405B, shows that Pareto achieves a relative accuracy of over 99.9% compared to the trained high-precision baseline model, while consuming significantly less power.
Pareto offers FP16 accuracy at lower power consumption than competitors’ FP8, enabling developers to deploy trained models quickly and efficiently. Because Pareto converts models instantly with virtually no loss in accuracy, it eliminates the need for time-consuming retraining.
“Our goal is and has always been to directly address both the needs of businesses and machine learning developers,” said Gilles Backhus, founder and VP of AI at Recogni. “With Pareto we came up with a number system that allows businesses to instantly deploy their models at high power efficiency with virtually no loss across all key performance and accuracy metrics. While companies using standard math are spending considerable time converting models to lower precision to reduce the power and operational expenses, Pareto allows companies to bring new models to production faster and cheaper while maintaining high accuracy.”
Seven years of development to change how the world runs GenAI
Recogni has proven the benefits of Pareto. Its initial chip was designed and manufactured on TSMC’s 7nm process with the company’s first generation of logarithmic math, exceeding performance expectations and proving all hypotheses.
In the coming months, Recogni will announce a technology partnership that will make the power of Pareto more widely available, accelerating a new wave of progress.
SOURCE: PRNewsWire