Add databricks-iceberg skill by irfanelahi-ds · Pull Request #188 · databricks-solutions/ai-dev-kit

irfanelahi-ds · 2026-02-26T05:35:19Z

Why this matters

Apache Iceberg is one of the most active areas of customer demand, particularly around interoperating with the wider Iceberg ecosystem. It comes with many nuances and user journeys: creating and sharing managed Iceberg tables; enabling UniForm on existing Delta tables so that Iceberg clients such as Snowflake or Trino can read them without ETL; reading foreign Iceberg tables managed by another Iceberg REST Catalog (IRC), e.g. Snowflake, from Databricks; and debugging why PyIceberg or OSS Spark can't connect to Unity Catalog (UC). Without a dedicated skill, Claude has no grounded reference for Databricks-specific Iceberg behaviour and falls back on generic Iceberg docs that don't reflect how UC actually implements things (e.g. `PARTITIONED BY` mapping to Liquid Clustering, the IRC endpoint path, `EXTERNAL USE SCHEMA` requirements, and the vended-credential flow).

This PR closes that gap.

What's included

| File | Coverage |
| --- | --- |
| `1-managed-iceberg-tables.md` | Native Iceberg DDL/DML, Liquid Clustering (`PARTITIONED BY` vs `CLUSTER BY`), Predictive Optimization, Iceberg v3, limitations |
| `2-uniform-and-compatibility.md` | External Iceberg Reads (UniForm) for regular Delta tables; Compatibility Mode for Streaming Tables and Materialized Views in SDP |
| `3-iceberg-rest-catalog.md` | IRC endpoint, auth (PAT/OAuth), credential vending, IP access list requirements |
| `4-snowflake-interop.md` | Bidirectional Snowflake↔Databricks: catalog integration (vended creds, AWS/Azure/GCS), foreign catalogs, networking gotchas |
| `5-external-engine-interop.md` | PyIceberg and OSS Spark connection configs; troubleshooting guide |

Some examples of specific problems this solves for customers

Snowflake ↔ Databricks interop: Step-by-step setup for both directions, tried and tested.
`PARTITIONED BY` vs `CLUSTER BY` confusion: Clarifies that both produce Liquid Clustering and identical Iceberg metadata for external engines, and spells out exactly when each requires `TBLPROPERTIES` changes.
External engine setup: Correct PyIceberg and OSS Spark configs against the UC IRC, including the version constraints and cloud bundle requirements that cause silent failures.
Networking blind spot: IP access list requirements for external engines hitting the IRC endpoint are underdocumented. This skill covers them explicitly.
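To make the external-engine item concrete, here is a minimal sketch of the catalog properties a PyIceberg or OSS Spark client might pass when pointing at the UC IRC. It is not taken from the skill files. The endpoint path in `IRC_PATH`, the helper names, and the choice of token-based auth are assumptions for illustration; check them against `3-iceberg-rest-catalog.md` and `5-external-engine-interop.md` before relying on them.

```python
from typing import Dict

# Assumption: path of the Unity Catalog Iceberg REST endpoint. Verify it
# against the skill's IRC reference or the Databricks docs before use.
IRC_PATH = "/api/2.1/unity-catalog/iceberg-rest"


def irc_uri(workspace_url: str) -> str:
    """Join a workspace URL and the (assumed) IRC path without doubling slashes."""
    return workspace_url.rstrip("/") + IRC_PATH


def pyiceberg_properties(workspace_url: str, uc_catalog: str, token: str) -> Dict[str, str]:
    """Build properties for pyiceberg.catalog.load_catalog(name, **properties)."""
    return {
        "type": "rest",
        "uri": irc_uri(workspace_url),
        "warehouse": uc_catalog,  # name of the Unity Catalog catalog
        "token": token,           # PAT or OAuth access token
    }


def spark_conf(spark_catalog: str, workspace_url: str, uc_catalog: str, token: str) -> Dict[str, str]:
    """Build the equivalent Iceberg catalog settings for an OSS Spark session."""
    prefix = f"spark.sql.catalog.{spark_catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": irc_uri(workspace_url),
        f"{prefix}.warehouse": uc_catalog,
        f"{prefix}.token": token,
    }


def connect_pyiceberg(workspace_url: str, uc_catalog: str, token: str):
    """Open the catalog with PyIceberg (imported lazily; needs network access)."""
    from pyiceberg.catalog import load_catalog

    return load_catalog("uc", **pyiceberg_properties(workspace_url, uc_catalog, token))
```

Calling `connect_pyiceberg(...)` and then `list_namespaces()` on the result is a quick connectivity check. If that call fails from an external engine while the same token works inside the workspace, the IP access list mentioned above is one thing to rule out.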

Adds a new skill covering Apache Iceberg on Databricks:
- Managed Iceberg tables (DDL, DML, Liquid Clustering, Iceberg v3)
- External Iceberg Reads / UniForm and Compatibility Mode for Delta tables
- Iceberg REST Catalog (IRC): auth, credential vending, IP access list guidance
- Snowflake interoperability (bidirectional: catalog integration + foreign catalogs)
- External engine interop: PyIceberg, OSS Spark, EMR, Flink, Kafka Connect

Registers the skill in README.md and install_skills.sh.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Correct factual error: PO is not auto-enabled, it must be explicitly enabled via ALTER ... SET DBPROPERTIES
- Add enable examples at catalog, schema, and table levels
- Add automatic statistics collection as a PO capability
- Add ANALYZE TABLE as the manual statistics collection alternative

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Remove blockquote repeating PARTITIONED BY/CLUSTER BY behaviour already covered in Critical Rules
- Remove 4 Common Issues rows that duplicate Critical Rules: write.metadata.path, Iceberg library in DBR, PARTITIONED BY Liquid Clustering, and CLUSTER BY v2 DV/row-tracking requirement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Clarify IRC endpoint description in 1-managed-iceberg-tables.md
- Improve wording of Future Modes note in 2-uniform-and-compatibility.md
- Remove Limitations section from 3-iceberg-rest-catalog.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Irfan Elahi and others added 4 commits February 26, 2026 16:18

irfanelahi-ds wants to merge 4 commits into databricks-solutions:main from irfanelahi-ds:feature/databricks-iceberg-skill
