Snowflake, the AI Data Cloud company, announced Polaris Catalog at its annual user conference, Snowflake Summit 2024. Polaris Catalog is a vendor-neutral, open catalog implementation for Apache Iceberg, the open standard of choice for implementing data lakehouses, data lakes, and other modern architectures. Polaris Catalog will be open sourced in the next 90 days to provide enterprises and the entire Iceberg community with new levels of choice, flexibility, and control over their data, with full enterprise security and Apache Iceberg interoperability with Amazon Web Services (AWS), Confluent, Dremio, Google Cloud, Microsoft Azure, Salesforce, and more.
“Organizations want open storage and interoperable query engines without lock-in. Now, with the support of industry leaders, we are further simplifying how any organization can easily access their data across diverse systems with increased flexibility and control,” said Christian Kleinerman, EVP of Product, Snowflake. “Polaris Catalog extends Snowflake’s commitment to Apache Iceberg as the open standard of choice, and signals the intent from industry leaders in enabling customers and the wider Iceberg community to harness their data through an open and neutral approach, empowering cross-engine interoperability on that data.”
Polaris Catalog Introduces New Levels of Interoperability for Apache Iceberg
Apache Iceberg graduated from incubation to become a top-level Apache Software Foundation project in May 2020, and has since surged in popularity to become a leading open source data table format. With Polaris Catalog, users now gain a single, centralized place for any engine to find and access an organization’s Iceberg tables with full, open interoperability. Polaris Catalog relies on Iceberg’s open source REST protocol, which provides an open standard for users to access and retrieve data from any engine that supports the Iceberg REST API, including Apache Flink, Apache Spark, Dremio, Python, Trino, and more.
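To make the REST protocol concrete, the following is a minimal sketch of how a client might build the URLs for a few read-only catalog routes. The route shapes follow the Apache Iceberg REST catalog specification; the host name, the namespace and table names, and the helper function `iceberg_rest_paths` are hypothetical, and authentication is omitted. Nothing here is specific to Polaris Catalog.

```python
from urllib.parse import quote


def iceberg_rest_paths(base_url: str, namespace: str, table: str, prefix: str = "") -> dict:
    """Build request URLs for a few read-only Iceberg REST catalog routes.

    Hypothetical helper for illustration; it only formats URLs and sends nothing.
    """
    root = base_url.rstrip("/") + "/v1"
    # A catalog may scope its routes under an optional prefix, which a client
    # learns from the /v1/config response.
    scoped = root + ("/" + quote(prefix, safe="") if prefix else "")
    ns = quote(namespace, safe="")
    tbl = quote(table, safe="")
    return {
        "config": root + "/config",
        "list_namespaces": scoped + "/namespaces",
        "list_tables": f"{scoped}/namespaces/{ns}/tables",
        "load_table": f"{scoped}/namespaces/{ns}/tables/{tbl}",
    }


# Assumed endpoint and names, for illustration only.
urls = iceberg_rest_paths("https://catalog.example.com/api/catalog", "sales", "orders")
for name, url in urls.items():
    print(f"GET {url}  # {name}")
```

Because every engine issues the same requests against the same routes, the catalog behind the URL can change without changes to the engines that read from it.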
Organizations can get started within minutes by running Polaris Catalog hosted in Snowflake’s AI Data Cloud (Snowflake-hosted public preview coming soon), or they can self-host it in their own infrastructure using container platforms such as Docker or Kubernetes. Because Polaris Catalog’s backend implementation will be open source, organizations can freely swap the hosting infrastructure without vendor lock-in.
Leading Organizations Join the Polaris Catalog Community
Part of what makes Apache Iceberg so powerful is its vibrant community of diverse adopters, contributors, and commercial offerings. To ensure Polaris Catalog can meet the evolving needs of the wider community and landscape, Snowflake is collaborating with the Iceberg ecosystem to drive the project forward.
This comes on the heels of Snowflake and Microsoft’s recent partnership expansion, which creates more seamless interoperability between Snowflake and Fabric. This interoperability is possible because of Snowflake’s and Microsoft’s commitment to supporting the industry’s leading open standards for storage formats – Apache Iceberg and Apache Parquet. Now with Polaris Catalog, both organizations continue to partner with a joint mission of enabling all users to harness their enterprise data, regardless of where it is stored, to create AI-powered applications at scale.
“From day one at Microsoft, we’ve been focused on empowering every user on the planet to achieve more, and this starts with a strong data foundation. Through our support and contributions to open data standards, including Delta Parquet, Apache Iceberg, and Apache XTable, we’re furthering this mission by enabling organizations with a new level of open data interoperability, so they can do more with their data,” said Arun Ulagaratchagan, Corporate Vice President, Azure Data, Microsoft. “Snowflake continues to serve as a strategic partner of ours, and we’re excited by their willingness to work with the Iceberg community on an open catalog to empower our joint customers and the wider open-source community with more flexibility and control over their open Iceberg data.”
Snowflake, which serves as the data foundation powering thousands of global customers’ cross-cloud data and AI workloads, and the rapidly growing Iceberg community, with its innovation and open source skill sets, will together continue to simplify the interoperability of data across engines.
Snowflake Continues to Extend Open Source Commitments
Polaris Catalog follows a slew of recent open source commitments from Snowflake, including its investments in Iceberg Tables, which allow Snowflake customers to work with data in their own storage in the Apache Iceberg format, while still benefiting from Snowflake’s ease of use, performance, and unified governance.
Snowflake also recently announced Snowflake Arctic, one of the most open, enterprise-grade large language models (LLMs) on the market. As part of Snowflake’s commitment to open source, it released not only Arctic’s weights under an Apache 2.0 license, but also extensive details of how the model was trained, through a series of cookbooks. In addition, Snowflake supports the Streamlit open source community, which now has over 275K monthly active developers and over 6 million monthly application views. Since Snowflake acquired Streamlit in March 2022, the open source community has continued to flourish, growing over 500 percent in the past two years, as Snowflake and Streamlit continue to invest in cutting-edge open source advancements for developers.
SOURCE: BusinessWire