Google Introduces Gemini Embedding 2 to Power Multimodal AI Applications

Google has released Gemini Embedding 2, its first natively multimodal embedding model, designed to help developers build AI systems that can understand and retrieve data across media formats. Available in public preview through the Gemini API and Vertex AI, the model maps text, images, video, audio, and documents into a single shared embedding space, enabling more efficient retrieval, search, and classification across formats. It captures the semantic intent of data in more than 100 languages, simplifying AI systems that previously had to stitch together separate models for each media type.
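
To give a rough sense of what a single embedding space means in practice, the sketch below uses the embed_content call from the google-genai Python SDK. The model ID "gemini-embedding-2" and the ability to pass image parts to embed_content are assumptions based on the article's description, not confirmed API details; only the general call pattern mirrors how existing Gemini embedding models are used today.

```python
# pip install google-genai
# Minimal sketch: embedding text (and, hypothetically, an image) via the
# Gemini API. The model ID "gemini-embedding-2" and image support in
# embed_content are assumptions based on the article, not confirmed API usage.
from google import genai
from google.genai import types

client = genai.Client()  # reads the GEMINI_API_KEY environment variable

# Text embedding: this call pattern matches existing Gemini embedding models.
text_result = client.models.embed_content(
    model="gemini-embedding-2",  # hypothetical model ID
    contents="A red bicycle leaning against a brick wall",
)
text_vector = text_result.embeddings[0].values
print(f"text vector dimensions: {len(text_vector)}")

# Image embedding (assumption): if the model is natively multimodal, the same
# endpoint would presumably accept image bytes and return a vector in the
# same space, making direct text-to-image comparison possible.
with open("bicycle.jpg", "rb") as f:
    image_part = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

image_result = client.models.embed_content(
    model="gemini-embedding-2",  # hypothetical model ID
    contents=image_part,         # image input is an assumption
)
image_vector = image_result.embeddings[0].values
```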

For developers and enterprises, the model supports a range of applications, including retrieval-augmented generation, recommendation systems, sentiment analysis, clustering, and cross-modal search such as retrieving images from text queries. It generates high-dimensional vector representations that make it practical to compare items across large datasets, helping AI applications scale. The release is part of Google's broader effort to build out multimodal AI infrastructure for application development.
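
To make the cross-modal search use case concrete, here is a small sketch that ranks pre-computed image embeddings against a text query embedding by cosine similarity. The vectors are random placeholders standing in for whatever the model would return, and the dimensionality is purely illustrative.

```python
# Minimal cross-modal retrieval sketch: rank images by cosine similarity
# between a text query vector and pre-computed image vectors. The vectors
# here are random placeholders standing in for real model output.
import numpy as np

DIM = 3072  # illustrative; actual dimensionality depends on the model

rng = np.random.default_rng(0)
image_vectors = {  # filename -> embedding, normally produced by the model
    "bicycle.jpg": rng.normal(size=DIM),
    "sunset.jpg": rng.normal(size=DIM),
    "kitten.jpg": rng.normal(size=DIM),
}
query_vector = rng.normal(size=DIM)  # embedding of e.g. "a red bicycle"

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Score every image against the text query; best matches come first.
ranked = sorted(
    image_vectors.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
for name, vec in ranked:
    print(name, round(cosine_similarity(query_vector, vec), 4))
```

Because all media types share one embedding space, the same nearest-neighbor ranking works whether the query is text, an image, or audio, which is what removes the need for separate per-format models.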

Read More: Gemini Embedding 2: Our first natively multimodal embedding model