H2O.ai Announces the Launch of Danube3 Series, Surpassing Apple and Rivaling Microsoft with Latest Small Language Models

H2O.ai, the open-source leader in Generative AI and machine learning, is excited to announce the global release of the H2O-Danube3 series, the latest addition to its suite of small language models. This series, now available on Hugging Face, includes the H2O-Danube3-4B and the compact H2O-Danube3-500M, both designed to push the boundaries of natural language processing (NLP) and make advanced capabilities accessible to a wider audience.

“We are incredibly excited about the H2O-Danube3 series – a leap forward in making small language models more powerful and accessible. The H2O-Danube3-4B and H2O-Danube3-500M models are designed to push the envelope in terms of performance, outpacing competitors like Apple and rivaling even Microsoft’s offerings. These models are not just high-performing but also economically efficient and easily deployable on edge devices, making them perfect for enterprise and offline applications,” said Sri Ambati, CEO and Founder of H2O.ai.

“With H2O-Danube3, we continue to democratize advanced NLP capabilities, ensuring they are within reach for a wider audience while maintaining sustainability. The versatility of these models spans from enhancing chat applications to supporting research and on-device solutions, truly embodying our mission to bring AI to everyone,” added Sri Ambati.

H2O-Danube3-4B: A New Benchmark in NLP

The H2O-Danube3-4B model, trained on an impressive 6 trillion tokens, has achieved a stellar score of over 80% on the 10-shot HellaSwag benchmark. This performance not only surpasses Apple’s OpenELM-3B but also rivals Microsoft’s Phi3 4B, setting a new standard in the field.

H2O-Danube3-500M: Compact Yet Powerful

The H2O-Danube3-500M model, trained on 4 trillion tokens, demonstrates remarkable efficiency and versatility. It has achieved the highest scores in 8 out of 12 academic benchmarks when compared to similarly sized models, such as Alibaba’s Qwen2. Despite its compact size, the H2O-Danube3-500M is designed to handle a wide range of applications, from chatbots and research to on-device solutions.

Complementing H2O-Danube2 with Advanced Capabilities

The H2O-Danube3 series builds on the foundation laid by the H2O-Danube2 models. The new models are trained primarily on English text drawn from high-quality web data, Wikipedia, academic texts, synthetic texts, and other high-quality sources. They then undergo a final stage of supervised tuning specifically for chat applications, so that they can serve diverse user needs.

Key Features:

  • High Efficiency: Designed for efficient inference on consumer hardware and edge devices, H2O-Danube3 models can even run fully offline on modern smartphones with H2O AI Personal GPT.
  • Open Access: All models are openly available under the Apache 2.0 license on Hugging Face.
  • Competitive Performance: Extensive evaluations show that H2O-Danube3 models achieve highly competitive results across various academic, chat, and fine-tuning benchmarks.
  • Use Cases: The models are suitable for a range of applications, including chatbot integration, fine-tuning for specific tasks such as sequence classification, question answering, or token classification, and offline use cases.
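
To give a rough sense of why models of this size can fit on consumer hardware and smartphones, the following back-of-the-envelope sketch estimates the memory needed to hold the weights alone. It rests on assumptions that are not in the announcement: the parameter counts are the nominal figures implied by the model names (4 billion and 500 million), and the precisions shown (fp16, int8, int4) are common choices for illustration, not formats H2O.ai has stated it ships. Activations, the KV cache, and runtime overhead are ignored, so real usage will be higher.

```python
# Weight-only memory estimate. All figures are illustrative assumptions:
# parameter counts are nominal values read off the model names, and the
# precisions are generic options, not confirmed release formats.
NOMINAL_PARAMS = {
    "H2O-Danube3-4B": 4_000_000_000,
    "H2O-Danube3-500M": 500_000_000,
}

# Bytes used to store one parameter at each assumed precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: int, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def estimate_all() -> dict:
    """Return {model name: {precision: weight memory in GB}}."""
    return {
        name: {p: weight_memory_gb(n, p) for p in BYTES_PER_PARAM}
        for name, n in NOMINAL_PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in estimate_all().items():
        cells = ", ".join(f"{p}: {gb:.2f} GB" for p, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the larger model needs about 8 GB for weights at 16-bit precision and about 2 GB at 4-bit, while the smaller model stays at or below 1 GB at every precision shown.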

SOURCE: BusinessWire