SQL overview #573

Open
kbatuigas wants to merge 16 commits into rp-sql from DOC-2049-redpanda-sql-introduction-and-overview

Contributor

@kbatuigas kbatuigas commented May 4, 2026

This pull request makes significant improvements to the Redpanda SQL documentation, focusing on restructuring and clarifying key concepts, updating navigation, and enhancing learning objectives and use case explanations. The most important changes are summarized below.

Documentation Restructuring and Navigation Updates:

  • The main overview for Redpanda SQL has been rewritten and moved to a new file, overview.adoc, which now serves as the entry point for understanding Redpanda SQL, its architecture, and use cases. The previous overview file, what-is-redpanda-sql.adoc, has been deleted, and navigation links have been updated accordingly. [1] [2] [3]

Content and Conceptual Enhancements:

  • The new overview provides a detailed explanation of Redpanda SQL’s architecture, supported workloads, query patterns, and technical differentiators, including vectorized execution, columnar storage, decoupled storage/compute, and optimized data transfer.
  • The oltp-vs-olap.adoc page has been updated to clarify the distinction between OLTP and OLAP in the context of streaming data, and now includes explicit learning objectives and personas. [1] [2]

Reference and Comparison Improvements:

  • The redpanda-sql-vs-postgresql.adoc page has been enhanced to clarify its purpose as a reference, add learning objectives, and include a TODO for further engineering review of compatibility differences. The section on error handling differences has also been clarified. [1] [2]

Catalogs and Querying Workflow Clarification:

  • The redpanda-catalogs.adoc page has been rewritten to clarify the Redpanda catalog model, its components, and typical usage, including examples and learning objectives. The page topic type is now set to "concept" and personas are specified.

References:
[1] [2] [3] [4] [5] [6] [7] [8]

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 19 May

Page previews

Redpanda SQL > Get Started > Redpanda SQL Overview
Redpanda SQL > Get Started > Redpanda SQL Overview > OLTP vs OLAP
Redpanda SQL > Get Started > Redpanda SQL Overview > Redpanda SQL vs PostgreSQL
Redpanda SQL > Query Data > Redpanda Catalogs

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)


netlify Bot commented May 4, 2026

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit 28b71e6
🔍 Latest deploy log https://app.netlify.com/projects/rp-cloud/deploys/6a0e73d8016263000825f8f1
😎 Deploy Preview https://deploy-preview-573--rp-cloud.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

Contributor

coderabbitai Bot commented May 4, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3e4a2bbe-7cb4-4e6f-b3bd-c97b1ae30e0e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@kbatuigas kbatuigas force-pushed the DOC-2049-redpanda-sql-introduction-and-overview branch 2 times, most recently from 3597669 to 908c8b1 Compare May 11, 2026 19:54
@kbatuigas kbatuigas marked this pull request as ready for review May 11, 2026 23:16
@kbatuigas kbatuigas requested a review from a team as a code owner May 11, 2026 23:16
@kbatuigas kbatuigas requested a review from takidau May 11, 2026 23:21
Contributor Author

Removed the tables under Functions and Mathematical operators as they didn't seem to describe any actual differences from PostgreSQL, and so may not be worth keeping. Are there any actual known differences w.r.t. functions and operators (other than the one with JSON)?

@kbatuigas kbatuigas force-pushed the DOC-2049-redpanda-sql-introduction-and-overview branch from d341c3b to 60a4bec Compare May 13, 2026 18:46
@PeterCorless

Feedback for: https://deploy-preview-573--rp-cloud.netlify.app/redpanda-cloud/sql/get-started/redpanda-sql-vs-postgresql/

Currently:

Redpanda SQL aims for close compatibility with PostgreSQL but differs in some functions, operators, and behaviors. Use this page to check which features are supported and where Redpanda SQL diverges from PostgreSQL.

Suggested edit to above paragraph:

Redpanda SQL aims for close compatibility with PostgreSQL semantics, yet differs significantly in design and function. Use this page to check which features are supported and where Redpanda SQL diverges from PostgreSQL.

For example, PostgreSQL is an online transactional processing (OLTP) database by default, whereas Redpanda SQL is an online analytical processing (OLAP) query engine.

Many transaction processing functions for PostgreSQL are not available in Redpanda SQL, such as the ability to write or upsert data directly.

Instead, Redpanda SQL relies on data being written to Apache Kafka-compatible topics in Redpanda Streaming. Redpanda SQL can then query topics in local storage (a "hot storage" tier), as well as Apache Iceberg-compatible tables written to object storage ("cold storage"). Redpanda SQL performs a federated query, using the topics as a row store and the Iceberg tables as a column store, and performs a seamless, deduplicated join across both.

Another key thing to note: Redpanda SQL, while semantically compatible, is not code compatible with PostgreSQL. It cannot use common PostgreSQL plugins such as pgvector, PostGIS, or pg_cron.

kbatuigas and others added 14 commits May 18, 2026 20:27
… and catalogs

Tightens the PostgreSQL framing in the overview (compatible query engine
implementing the Postgres wire protocol and a Postgres-based dialect, not
a full Postgres database). Aligns Iceberg references with the v1 product
scope: only Iceberg tables created from Iceberg-enabled Redpanda topics
are queryable; no external Iceberg lakehouses or REST catalogs. Collapses
the overview's "Query Iceberg tables" and "Bridge queries" sections into
"Query Iceberg topics".

Rewrites the Redpanda Catalogs page with the named-collection-of-source-data
framing, leads with default_redpanda_connection auto-creation, and adds a
storage > catalog > tables hierarchy. Replaces the prior CREATE-flow
walkthrough with a smaller demo using default_redpanda_connection.

Per PM SME 2026-05-07.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kbatuigas kbatuigas force-pushed the DOC-2049-redpanda-sql-introduction-and-overview branch from b65e870 to d278b7f Compare May 19, 2026 03:29
:learning-objective-2: Identify the query patterns Redpanda SQL supports
:learning-objective-3: Describe the architectural characteristics that enable those patterns

Redpanda SQL turns your Redpanda glossterm:topic[,topics], including their Iceberg-translated history, into queryable SQL surfaces inside your Redpanda Bring Your Own Cloud (BYOC) glossterm:cluster[]. Built as a column-oriented online analytical processing (OLAP) engine, Redpanda SQL runs analytical queries over streaming and historical data without moving or duplicating data. It is a PostgreSQL-compatible query engine that implements the PostgreSQL wire protocol and a PostgreSQL-based SQL dialect, so you can connect with any PostgreSQL client, including `psql`, JDBC, DBeaver, and DataGrip.
Member

@grzebiel do we fully support JDBC?

|Quick results for frequently accessed data
|Consistently fast response to requests

|Audience

@kbatuigas The rest of this page looks good, but this one feels really weird to me. What does 'market-oriented information' even mean?

Is there a better way to phrase this, @adam-szymanski @ndrsbl?

Contributor Author

This was from the migrated doc; I changed it to "Application end users" and "Business analysts and decision-makers"

Comment thread modules/sql/pages/get-started/overview.adoc Outdated
:learning-objective-2: Identify the query patterns Redpanda SQL supports
:learning-objective-3: Describe the architectural characteristics that enable those patterns

Redpanda SQL turns your Redpanda glossterm:topic[,topics], including their Iceberg-translated history, into queryable SQL surfaces inside your Redpanda Bring Your Own Cloud (BYOC) glossterm:cluster[]. Built as a column-oriented online analytical processing (OLAP) engine, Redpanda SQL runs analytical queries over streaming and historical data without moving or duplicating data. It is a PostgreSQL-compatible query engine that implements the PostgreSQL wire protocol and a PostgreSQL-based SQL dialect, so you can connect with any PostgreSQL client, including `psql`, JDBC, DBeaver, and DataGrip.

'translation' is an internal/low-level implementation detail.

"Redpanda SQL turns your live Redpanda topics and their history in Apache Iceberg into queryable SQL tables inside your Redpanda Bring Your Own Cloud (BYOC) cluster"

Contributor Author

Changed to "Redpanda SQL turns your Redpanda topics, including their history in Apache Iceberg, into queryable SQL tables inside your Redpanda Bring Your Own Cloud (BYOC) cluster."

Comment thread modules/sql/pages/get-started/overview.adoc Outdated
Comment thread modules/sql/pages/get-started/overview.adoc Outdated
Comment thread modules/sql/pages/query-data/redpanda-catalogs.adoc Outdated
Comment thread modules/sql/pages/query-data/redpanda-catalogs.adoc Outdated
Comment thread modules/sql/pages/query-data/redpanda-catalogs.adoc Outdated
Comment thread modules/sql/pages/query-data/redpanda-catalogs.adoc Outdated
[source,sql]
----
-CREATE TABLE production_redpanda=>user_events
+CREATE TABLE default_redpanda_catalog=>user_events

@kbatuigas I think we also want to explain the layered catalog model here.

Meaning something like this:

"To query topics with history in Apache Iceberg (Iceberg Topic), a Redpanda catalog is created USING an existing Iceberg catalog

<>

In Redpanda BYOC, both catalogs are pre-created for the BYOC cluster, allowing you to immediately query Iceberg topics in the local cluster with a simple CREATE TABLE statement"

I say this because they need to understand that this single catalog is a 'layered catalog' (maybe we should use that term), and not a vanilla Redpanda catalog, at least in BYOC. And if they run DESCRIBE on the catalog, I think they will see the Iceberg information showing that the catalog was created USING the Iceberg catalog (automatically, but they can still see this fact). Without this explanation, it could be confusing.

Contributor Author

Went with

In Redpanda BYOC, both catalogs are created and linked for you when Redpanda SQL is enabled, so you can query an Iceberg-enabled topic with a simple CREATE TABLE against default_redpanda_catalog. No manual CREATE ICEBERG CATALOG or CREATE REDPANDA CATALOG ... USING CATALOG is required. To inspect the layered relationship, run DESCRIBE REDPANDA CATALOG default_redpanda_catalog.
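
The two statements named in this thread can be put together as a short sketch. It assumes the `CREATE TABLE` line shown in the diff hunk is a complete statement, which this thread does not confirm. `user_events` is the topic name used in that hunk.

```sql
-- Expose the Iceberg-enabled topic through the pre-created catalog.
-- Assumption: no further clauses are required after the table reference.
CREATE TABLE default_redpanda_catalog=>user_events;

-- Inspect the layered relationship between the Redpanda catalog
-- and the Iceberg catalog it was created from.
DESCRIBE REDPANDA CATALOG default_redpanda_catalog;
```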

:learning-objective-2: Explain why Redpanda SQL uses an OLAP model

-Redpanda SQL uses an OLAP (Online Analytical Processing) model--optimized for analytical queries over large datasets--rather than the OLTP (Online Transaction Processing) model used by traditional relational databases. This makes OLAP suitable for querying Redpanda topics at scale. This page explains the differences between OLTP and OLAP and how they apply to querying data with Redpanda SQL.
+Redpanda SQL uses an OLAP (Online Analytical Processing) model, optimized for analytical queries over large datasets, rather than the OLTP (Online Transaction Processing) model used by traditional relational databases. This makes OLAP suitable for querying Redpanda glossterm:topic[,topics] at scale.
Contributor

Suggested change
-Redpanda SQL uses an OLAP (Online Analytical Processing) model, optimized for analytical queries over large datasets, rather than the OLTP (Online Transaction Processing) model used by traditional relational databases. This makes OLAP suitable for querying Redpanda glossterm:topic[,topics] at scale.
+Redpanda SQL uses an OLAP (online analytical processing) model, optimized for analytical queries over large datasets, rather than the OLTP (online transaction processing) model used by traditional relational databases. This makes OLAP suitable for querying Redpanda glossterm:topic[,topics] at scale.

* Record entry: Storing data like student score records, warehouse inventory, or customer service ticketing systems

-== What is OLAP?
+== Online Analytical Processing (OLAP)
Contributor

Suggested change
-== Online Analytical Processing (OLAP)
+== Online analytical processing (OLAP)

== Online Analytical Processing (OLAP)

-OLAP stands for Online Analytical Processing and provides data analysis for business decisions. With OLAP, you can get information on multiple databases and data types with the ability to analyze them at the same time, even with complex queries.
+OLAP stands for Online Analytical Processing and provides data analysis for business decisions. With OLAP, you can query information across multiple databases and data types simultaneously, including complex queries.
Contributor

Suggested change
-OLAP stands for Online Analytical Processing and provides data analysis for business decisions. With OLAP, you can query information across multiple databases and data types simultaneously, including complex queries.
+OLAP provides data analysis for business decisions. With OLAP, you can query information across multiple databases and data types simultaneously, including complex queries.

I don't think we need to spell it out in both heading and text

* [ ] {learning-objective-1}
* [ ] {learning-objective-2}

== Online Transaction Processing (OLTP)
Contributor

Suggested change
-== Online Transaction Processing (OLTP)
+== Online transaction processing (OLTP)


== Online Transaction Processing (OLTP)

Online Transaction Processing (OLTP) supports transaction-oriented applications under a 3-tier architecture (such as a https://en.wikipedia.org/wiki/Third_normal_form[3NF^] approach). OLTP administers day-to-day transactions through a relational database.
Contributor

Suggested change
-Online Transaction Processing (OLTP) supports transaction-oriented applications under a 3-tier architecture (such as a https://en.wikipedia.org/wiki/Third_normal_form[3NF^] approach). OLTP administers day-to-day transactions through a relational database.
+OLTP supports transaction-oriented applications under a three-tier architecture (such as a https://en.wikipedia.org/wiki/Third_normal_form[3NF^] approach). OLTP administers day-to-day transactions through a relational database.

:learning-objective-2: Identify the query patterns Redpanda SQL supports
:learning-objective-3: Describe the architectural characteristics that enable those patterns

Redpanda SQL turns your Redpanda glossterm:topic[,topics], including their history in Apache Iceberg, into queryable SQL tables inside your Redpanda Bring Your Own Cloud (BYOC) glossterm:cluster[]. Built as a column-oriented online analytical processing (OLAP) engine, Redpanda SQL runs analytical queries over streaming and historical data without moving or duplicating data. It is a PostgreSQL-compatible query engine that implements the PostgreSQL wire protocol and a PostgreSQL-based SQL dialect, so you can connect with any PostgreSQL client, including `psql`, JDBC, DBeaver, and DataGrip.
Contributor

Suggested change
-Redpanda SQL turns your Redpanda glossterm:topic[,topics], including their history in Apache Iceberg, into queryable SQL tables inside your Redpanda Bring Your Own Cloud (BYOC) glossterm:cluster[]. Built as a column-oriented online analytical processing (OLAP) engine, Redpanda SQL runs analytical queries over streaming and historical data without moving or duplicating data. It is a PostgreSQL-compatible query engine that implements the PostgreSQL wire protocol and a PostgreSQL-based SQL dialect, so you can connect with any PostgreSQL client, including `psql`, JDBC, DBeaver, and DataGrip.
+Redpanda SQL turns your Redpanda glossterm:topic[,topics], including their history in Apache Iceberg, into queryable SQL tables inside your Redpanda Streaming Bring Your Own Cloud (BYOC) glossterm:cluster[]. Built as a column-oriented online analytical processing (OLAP) engine, Redpanda SQL runs analytical queries over streaming and historical data without moving or duplicating data. It is a PostgreSQL-compatible query engine that implements the PostgreSQL wire protocol and a PostgreSQL-based SQL dialect, so you can connect with any PostgreSQL client, including `psql`, JDBC, DBeaver, and DataGrip.

6 participants