Skip to content

fix(type-mapping): fall back on sqlglot TokenError, not just ParseError#2433

Open
Bartok9 wants to merge 1 commit into
Canner:mainfrom
Bartok9:fix/type-mapping-tokenerror
Open

fix(type-mapping): fall back on sqlglot TokenError, not just ParseError#2433
Bartok9 wants to merge 1 commit into
Canner:mainfrom
Bartok9:fix/type-mapping-tokenerror

Conversation

@Bartok9

@Bartok9 Bartok9 commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Summary

  • parse_type / translate_type in core/wren/src/wren/type_mapping.py now fall back to the raw string on any sqlglot failure, not just ParseError.
  • Fixes an uncaught TokenError crash on malformed type strings (unterminated quotes, stray control chars).

Motivation

Both functions document: "Falls back to original string if parsing fails." They only caught (sqlglot.errors.ParseError, ValueError). But sqlglot's tokenizer raises sqlglot.errors.TokenError, which is a SqlglotError and not a subclass of ParseError. So a type string like "unterminated (or one containing a stray \x00) escaped the fallback and propagated up, crashing callers (e.g. parse_types / translate_types over a batch of introspected columns).

The fix catches sqlglot.errors.SqlglotError, the common parent of both ParseError and TokenError, matching the documented fallback contract.

Verification

Real output on current upstream/main (fix reverted, tests present):

>           raise TokenError(f"Error tokenizing '{context}'") from e
E           sqlglot.errors.TokenError: Error tokenizing '"unterminate'
...
FAILED tests/unit/test_type_mapping.py::test_parse_type_falls_back_on_tokenizer_error["unterminated]
FAILED tests/unit/test_type_mapping.py::test_translate_type_falls_back_on_tokenizer_error
2 failed, 1 passed

With the fix applied:

python3 -m pytest tests/unit/test_type_mapping.py -q
48 passed in 2.77s
  • Added 3 regression cases (2 for parse_type, 1 for translate_type) that fail without the fix.
  • Path is Apache-2.0 (core/**).

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling of invalid SQL type inputs so type parsing and translation now safely fall back to the original value instead of failing.
    • Fixed edge cases involving malformed text such as unterminated quotes or stray control characters.
    • Added tests to cover the new fallback behavior and ensure these inputs are returned unchanged.

parse_type/translate_type promise to fall back to the raw string when
parsing fails, but only caught (ParseError, ValueError). sqlglot's
tokenizer raises TokenError (a SqlglotError, not a ParseError subclass)
on unterminated quotes or stray control chars in a type string, so those
inputs crashed callers instead of falling back. Catch SqlglotError, which
is the common parent of ParseError and TokenError.
@github-actions github-actions Bot added python Pull requests that update Python code core labels Jul 4, 2026
@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dc9bfce5-4187-42cc-8527-95d225230293

📥 Commits

Reviewing files that changed from the base of the PR and between 5ff51e4 and a9ee91a.

📒 Files selected for processing (2)
  • core/wren/src/wren/type_mapping.py
  • core/wren/tests/unit/test_type_mapping.py

Walkthrough

Exception handling in parse_type and translate_type (core/wren/src/wren/type_mapping.py) is broadened to catch sqlglot.errors.SqlglotError and ValueError instead of only ParseError, falling back to the original type string. New unit tests verify this fallback on tokenizer errors.

Changes

Type Mapping Fallback Handling

Layer / File(s) Summary
Broaden exception handling
core/wren/src/wren/type_mapping.py
parse_type and translate_type now catch sqlglot.errors.SqlglotError (plus ValueError) instead of only ParseError, including during re-serialization, falling back to the original type_str.
Tokenizer-error fallback tests
core/wren/tests/unit/test_type_mapping.py
New tests assert parse_type and translate_type return the original input string on tokenizer errors like unterminated quotes or stray control characters.

Estimated code review effort: 1 (Trivial) | ~5 minutes

Possibly related PRs

  • Canner/WrenAI#2410: Introduced the translate_type feature whose error-handling is now broadened in this PR.

Poem

A quote left dangling, a stray control key,
No longer would crash our type-parsing spree.
Now SqlglotError is caught with care,
Original strings fall back, safe and rare.
Hop, hop, hooray — tests confirm it's true! 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main fix: malformed sqlglot type strings now fall back instead of crashing, though the change also covers broader SqlglotError handling.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant