Skip to content

fix(clickhouse): use unquote (not unquote_plus) for URL credentials; decode database#2436

Open
Bartok9 wants to merge 2 commits into
Canner:mainfrom
Bartok9:fix/clickhouse-url-unquote
Open

fix(clickhouse): use unquote (not unquote_plus) for URL credentials; decode database#2436
Bartok9 wants to merge 2 commits into
Canner:mainfrom
Bartok9:fix/clickhouse-url-unquote

Conversation

@Bartok9

@Bartok9 Bartok9 commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Summary

  • The ClickHouse connection_url path now decodes userinfo with unquote instead of unquote_plus, and also unquotes the database name parsed from the path.
  • Fixes silent credential corruption when a password contains a literal +, and a missing-decode bug for percent-encoded database names.

Motivation

In _build_clickhouse_client_kwargs (core/wren/src/wren/connector/clickhouse.py), the URL username/password were decoded with unquote_plus. But + only means "space" inside a URL query string — never in the userinfo or path (RFC 3986). So a password like pw+1 (written literally in the URL) was silently turned into pw 1, and authentication failed with no clear cause.

Every other connector in this package (mssql, mysql, oracle, trino) uses plain unquote for exactly this reason. This brings ClickHouse in line.

Separately, the database name parsed from the URL path was passed through raw (never decoded), so a percent-encoded database name (e.g. my%20analytics) reached ClickHouse verbatim.

unquote still decodes %20→space, %40@, etc., so the existing decode test continues to pass unchanged.

Verification

Real output on current upstream/main (fix reverted, tests present):

E         - my analytics
E         + my%20analytics
FAILED tests/unit/test_clickhouse_helpers.py::test_clickhouse_url_preserves_literal_plus_in_credentials
FAILED tests/unit/test_clickhouse_helpers.py::test_clickhouse_url_decodes_percent_encoded_database
2 failed, 2 passed

With the fix applied:

python3 -m pytest tests/unit/test_clickhouse_helpers.py -q
25 passed in 0.77s
  • Two new regression tests (literal + preserved; database path decoded); the pre-existing %40/%20 decode test still passes.
  • Path is Apache-2.0 (core/**).

Summary by CodeRabbit

  • Bug Fixes
    • Improved ClickHouse connection URL parsing so username, password, and database values are decoded correctly.
    • Preserved literal + characters in credential userinfo instead of converting them to spaces.
    • Properly decoded percent-encoded database names from the URL path.
  • Tests
    • Added regression coverage to verify correct + handling in credentials and percent-decoding of database names.

…decode database

The connection_url path decoded userinfo with unquote_plus, which turns a
literal '+' into a space. In a URL, '+' only means space inside a query
string — never in userinfo or the path — so a password like 'pw+1' was
silently corrupted to 'pw 1' and auth failed. Switch to unquote (matching
the mssql/mysql/oracle/trino connectors) and also decode the database name
parsed from the path, which was passed through raw.
@github-actions github-actions Bot added python Pull requests that update Python code core labels Jul 4, 2026
@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4e12d5fc-cf1e-405b-905b-0c6fee512d7a

📥 Commits

Reviewing files that changed from the base of the PR and between c54497d and ca41b2e.

📒 Files selected for processing (1)
  • core/wren/src/wren/connector/clickhouse.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/wren/src/wren/connector/clickhouse.py

Walkthrough

The ClickHouse connector now percent-decodes username, password, and database URL segments with unquote instead of unquote_plus, preserving literal + characters. Tests and inline comments were updated to match the new parsing behavior.

Changes

ClickHouse credential decoding

Layer / File(s) Summary
Switch credential decoding to unquote
core/wren/src/wren/connector/clickhouse.py
Replaces unquote_plus with unquote, updates the URL-handling comments, and applies percent-decoding to username, password, and database path values.
Regression tests for decoding behavior
core/wren/tests/unit/test_clickhouse_helpers.py
Updates the test description and adds cases for preserving literal + in userinfo and decoding percent-encoded database names.

Estimated code review effort: 2 (Simple) | ~10 minutes

Possibly related PRs

  • Canner/WrenAI#1281: Adjusts ClickHouse URL handling by encoding the password on output, which is closely related to this decoding change.

Suggested reviewers: wwwy3y3, goldmedal

Poem

I hopped through URLs, light and spry,
Saw plus signs stay as plus, oh my!
Percent codes softened, neat and true,
ClickHouse paths now parse right through,
🐇 A tidy nibble, a correct “+” too.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main ClickHouse URL decoding changes, including credential decoding and database path decoding.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant