Four years ago, Databricks saw significant complexity in the data landscape: separate catalogs for every platform, siloed governance tools across clouds, and no unified way to secure AI assets. We pioneered Unified Governance by launching Unity Catalog, an open, flexible catalog layer to manage access, lineage, auditing, and discovery across all data and AI assets.
Today, Unity Catalog has become the foundation of the Databricks Data Intelligence Platform and the industry’s only unified governance solution for data and AI across formats, clouds, and engines. From open data sharing to fine-grained security and data governance, Unity Catalog helps organizations bring context, control, and confidence to their data estate.
At this year’s Data + AI Summit, we’re announcing major innovations across Unity Catalog, delivering the best catalog for Apache Iceberg™, new business user experiences, and intelligent governance to protect sensitive data and ensure trusted data quality at scale.
Here’s what’s new.
The Best Catalog for Apache Iceberg™
Organizations adopting a lakehouse are often forced to choose between Delta Lake and Apache Iceberg™. That choice creates artificial silos: limiting the data and AI tools that teams can use, fragmenting governance, and locking metadata into format-specific catalogs.
Unity Catalog eliminates the need to choose. Built on open standards, Unity Catalog is the only unified catalog that works seamlessly across formats, engines, and clouds, making it the foundation of the open lakehouse. Over the past year, following the acquisition of Tabular, we’ve invested deeply in Apache Iceberg to extend this vision. We’re excited to announce:
- Full support for the Iceberg REST Catalog API, allowing external engines to read (Generally Available) and write (Public Preview) to Unity Catalog–managed Iceberg tables. This is a major differentiator in the market, eliminating format lock-in and enabling full interoperability unmatched by any other solution.
- Iceberg managed tables are now in Public Preview, delivering best-in-class price and performance, liquid clustering, predictive optimization, and full integration with Databricks and across external engines, including Trino, Snowflake, and Amazon EMR.
- Iceberg catalog federation is in Public Preview, enabling you to govern and query Iceberg tables managed in AWS Glue, Hive Metastore, and Snowflake Horizon without copying data.
- Delta Sharing for Iceberg is now in Private Preview, allowing you to share Unity Catalog tables and Delta tables with any recipient using Delta Sharing and consume them in any client that supports the Iceberg REST Catalog API.
Together, these capabilities break down format silos and set Unity Catalog apart as the only catalog that delivers truly open, unified governance and interoperability. Check out our blog on Iceberg support to learn more about these announcements.
Expanding Unity Catalog to business users
Data platforms shouldn’t stop at the technical user. Business users need a clear, consistent way to find, trust, and work with data. Unity Catalog now offers a unified foundation for business context to bridge the gap between data and business teams.
Unity Catalog Metrics: One semantic layer for all data and AI workloads
Inconsistent metric definitions across tools and teams have long caused confusion, misalignment, and a lack of trust in data. Unity Catalog Metrics, now in Public Preview on AWS, Azure, and GCP and Generally Available later this summer, solves this by making business metrics first-class assets in the lakehouse. Unlike metrics defined only in the BI layer, which limit reuse and integration, defining metrics at the data layer makes business semantics reusable across all workloads, like dashboards, AI models, and data engineering jobs. Unity Catalog Metrics are also fully addressable via SQL to ensure that everyone in the organization can have the same view of metrics, no matter what tool they choose.
- Define once, use everywhere: Create metrics once in Unity Catalog and use them across AI/BI Dashboards, Genie, Notebooks, SQL, and Lakeflow jobs. Upcoming integrations will extend support to BI tools like Tableau, Hex, Sigma, ThoughtSpot, and Omni, and to observability tools like Anomalo and Monte Carlo.
- Governed and auditable by default: Certified metrics come with auditing and lineage out of the box, enabling trusted, compliant insights across teams.
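The "define once, use everywhere" idea can be illustrated with a small conceptual sketch. This is not the Unity Catalog Metrics API: `Metric`, `MetricRegistry`, and the `total_revenue` metric are hypothetical names invented for illustration. The sketch only shows why a single shared definition keeps every consumer's numbers in agreement.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Mapping

Row = Mapping[str, float]


@dataclass(frozen=True)
class Metric:
    """A named business metric with a single, shared definition (hypothetical type)."""
    name: str
    description: str
    compute: Callable[[Iterable[Row]], float]


class MetricRegistry:
    """Toy stand-in for a central catalog of metric definitions (hypothetical)."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def define(self, metric: Metric) -> None:
        # Refuse silent redefinition so that every consumer sees one definition.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def evaluate(self, name: str, rows: Iterable[Row]) -> float:
        return self._metrics[name].compute(rows)


registry = MetricRegistry()
registry.define(
    Metric(
        name="total_revenue",
        description="Sum of order amounts",
        compute=lambda rows: sum(r["amount"] for r in rows),
    )
)

orders = [{"amount": 120.0}, {"amount": 80.0}, {"amount": 50.0}]

# Two different "tools" evaluate the same definition instead of re-implementing it.
dashboard_value = registry.evaluate("total_revenue", orders)
notebook_value = registry.evaluate("total_revenue", orders)
```

Because both callers go through the registry, a change to the definition reaches every consumer at once, which is the property the data-layer approach is meant to provide.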
“Unity Catalog Metrics gives us a central place to define business KPIs and standardize semantics across teams, ensuring everyone works from the same trusted definitions across dashboards, SQL, and AI applications.”
— Richard Masters, Vice President, Data & AI, Virgin Atlantic
“Unity Catalog Metrics represents an exciting opportunity for Tableau customers to leverage the value of centralized governance with Databricks Unity Catalog. Through our deep integration and expanding roadmap with Databricks, we’re thrilled to help remove the friction for our customers in leveraging Databricks to define their core business metrics.”
— Nicolas Brisoux, Sr. Director Product Management, Tableau
New curated discovery experiences with intelligent insights
To fully empower business users, you need to make trusted data easy to find, understand, and use. Unity Catalog is extending its business-aware governance with a new Discover experience, now in Private Preview: a curated internal marketplace of certified data products organized by business domains like Sales, Marketing, or Finance.
AI-powered recommendations and data steward curation help surface the highest-value assets, such as metrics, dashboards, tables, AI agents, and Genie spaces, which are enriched with documentation, ownership, and usage insights. New intelligent signals highlight data quality, usage patterns, relationships, and certification status, helping users quickly assess trust and relevance. Plus, with Databricks Assistant built in, users can ask natural language questions and get clear, context-aware answers based on governed metrics.
We’re also introducing new intelligent capabilities across Databricks to make data discovery easier and more intuitive, wherever users work in the platform. Powered by Unity Catalog, these features help teams find trusted data faster and understand its context at a glance.
- Domains (Coming soon): Organize data by business area to align discovery with the organization’s operations.
- Certifications and Deprecation Tags (Beta): Signal data trust and business relevance across datasets, metrics, and dashboards. Tagged assets prominently display their status in authoring surfaces like the SQL editor, keeping data quality signals visible throughout the user workflow. Certifications and deprecation tags are available as part of the Tag Policies Beta.
- Request for Access (Public Preview): To streamline delivery, users can request data access directly on the asset.
Additional advanced governance capabilities now available
High-leverage governance with scalable, attribute-driven controls
Scaling data governance becomes increasingly challenging as organizations grow, with more users, teams, and data assets to manage. Static policies and manual controls can’t keep up, leading to governance gaps, security risks, and operational bottlenecks.
To address these challenges, Unity Catalog now provides intelligent automation and flexible, scalable controls to classify sensitive data, enforce policy consistently, and accelerate secure data access across the lakehouse.
Attribute-based access control (ABAC): Define flexible access policies using tags that can be applied at the catalog, schema, or table level. ABAC is available in Beta for row and column-level security on AWS, Azure, and GCP.
Tag policies: Tag policies implement a governance layer for how tags are created, assigned, and used across Databricks. These account-level policies ensure tags remain consistent and trusted, supporting everything from data classification to cost attribution. Tag policies are available in Beta on AWS, Azure, and GCP.
Data classification: Intelligently detect and tag sensitive data across Unity Catalog. New data is scanned within 24 hours to automatically detect new PII, minimizing manual effort and allowing teams to stay on top of data access. When used with ABAC, Data classification automatically protects sensitive data based on your access control policies. Data classification is available in Beta on AWS, Azure, and GCP.
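To make the tag-driven model concrete, here is a minimal conceptual sketch of attribute-based access control. It does not use any Databricks API: `TagStore`, `Policy`, and `can_read` are hypothetical names, and the rule that tags set on a catalog or schema apply to the tables beneath them is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set


@dataclass
class TagStore:
    """Maps a securable name ("main", "main.hr", "main.hr.salaries") to its tags."""
    tags: Dict[str, Set[str]] = field(default_factory=dict)

    def effective_tags(self, table: str) -> Set[str]:
        # Assumption for this sketch: a table inherits tags from its schema and catalog.
        parts = table.split(".")
        result: Set[str] = set()
        for depth in range(1, len(parts) + 1):
            result |= self.tags.get(".".join(parts[:depth]), set())
        return result


@dataclass(frozen=True)
class Policy:
    """Anything carrying `tag` is readable only by members of `allowed_groups`."""
    tag: str
    allowed_groups: FrozenSet[str]


def can_read(
    table: str,
    user_groups: Set[str],
    store: TagStore,
    policies: Iterable[Policy],
) -> bool:
    tags = store.effective_tags(table)
    for policy in policies:
        if policy.tag in tags and not (user_groups & policy.allowed_groups):
            return False
    return True


# One tag on a schema plus one policy covers every table under that schema.
store = TagStore({"main.hr": {"pii"}})
policies = [Policy(tag="pii", allowed_groups=frozenset({"hr_admins"}))]

analyst_hr = can_read("main.hr.salaries", {"analysts"}, store, policies)
admin_hr = can_read("main.hr.salaries", {"hr_admins"}, store, policies)
analyst_sales = can_read("main.sales.orders", {"analysts"}, store, policies)
```

The point of the design is that the policy refers to the tag and never to individual tables, so newly tagged data (for example, columns flagged by automated classification) is covered without writing another rule.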
“Implementing column masking across more than 5,000 tables was an enormous manual effort. With ABAC, we’re able to apply consistent policies dynamically, drastically improving both speed and governance.”
— Ramesh Balasubramanyan, Databricks Admin, SAIF
“Databricks Data Classification has been a game-changer in our data privacy and security strategy. Paired with ABAC, it enables us to automatically secure sensitive data without limiting the data that our analysts need. The biggest benefit has been speed, with automated classification and masking significantly reducing manual overhead, freeing up our resourcing and saving our team countless hours every week.”
— Mary Tesfay, Data & Analytics Lead, Corp IT, Navitas
Automated data quality monitoring at scale
Unity Catalog now intelligently detects and helps resolve data quality issues across all your tables with data quality monitoring, available in Beta on AWS, Azure, and GCP. Data quality monitoring checks freshness (how recently data has been updated) and completeness (whether data volumes are as expected) using data intelligence across entire schemas. Users can understand the health of data at a glance with health indicators, while data owners can understand the priority of issues based on downstream lineage, uncover the root cause, and set alerts using built-in logging and dashboards.
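As a rough illustration of the two checks named above, the sketch below implements a simple freshness test and a simple completeness test. It is not how Unity Catalog computes these signals: `TableStats`, `is_fresh`, `is_complete`, the 24-hour staleness window, and the 20% tolerance around the recent average row count are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class TableStats:
    """Minimal per-table statistics for the sketch (hypothetical type)."""
    last_updated: datetime
    row_count: int


def is_fresh(stats: TableStats, now: datetime, max_staleness: timedelta) -> bool:
    """Freshness: the table was updated within the allowed staleness window."""
    return now - stats.last_updated <= max_staleness


def is_complete(
    stats: TableStats,
    recent_row_counts: Sequence[int],
    tolerance: float = 0.2,
) -> bool:
    """Completeness: the row count is within `tolerance` of the recent average."""
    expected = mean(recent_row_counts)
    return abs(stats.row_count - expected) <= tolerance * expected


now = datetime(2025, 6, 12, 12, 0)
history = [1000, 1020, 980]  # row counts from earlier loads (made-up values)

healthy = TableStats(last_updated=datetime(2025, 6, 12, 6, 0), row_count=950)
stale = TableStats(last_updated=datetime(2025, 6, 10, 6, 0), row_count=1000)
short_load = TableStats(last_updated=datetime(2025, 6, 12, 6, 0), row_count=500)

healthy_ok = is_fresh(healthy, now, timedelta(hours=24)) and is_complete(healthy, history)
stale_fresh = is_fresh(stale, now, timedelta(hours=24))
short_complete = is_complete(short_load, history)
```

A production monitor would presumably learn the expected update cadence and volume from each table's history instead of using fixed thresholds; the fixed values here just keep the example easy to follow.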
Get started with Unity Catalog, the foundation of Data Intelligence
Unity Catalog continues evolving as the industry’s only unified governance layer, the foundation for secure, intelligent, and business-aware data platforms. Whether you’re building AI agents, delivering BI dashboards, or sharing data across organizations, Unity Catalog connects it all through a single, open catalog.
To get started, follow the Unity Catalog guides for AWS, Azure, and GCP.
Watch the Data + AI Summit 2025 keynote from Matei Zaharia, Co-founder and Chief Technology Officer at Databricks, to learn more about these latest announcements.
Register for Data + AI Summit and explore the data and AI governance track.