Skip to content

Conversation

@yashmayya
Copy link
Contributor

  • Legacy PQL behavior in RequestUtils::getLiteral replaces '' with ' in string literals ([PQL Cleanup] Remove all PQL related code #8626)
  • Calcite SQL parser already handles this standard SQL escaping.
  • This caused double-unescaping: '''' (4 quotes) → '' (Calcite) → ' (RequestUtils::getLiteral) instead of staying as ''
  • This patch fixes the above issue and introduces a config to retain the existing behavior for easier migration.
  • Added InstanceConfigProvider singleton in pinot-spi for global access to instance config (broker / server, can be extended to controller / minion) from any module. Note that this pattern can be used in various scenarios in the future but is particularly useful here because we need to access the configuration from a static context that can be called from a broker path (Main SQL query parsing, time series request handler, RLS filter rewriter) or a server path (JSON index filter parsing, theta sketch agg function, time series server side execution).

@yashmayya yashmayya added backward-incompat Referenced by PRs that introduce or fix backward compat issues query bugfix labels Dec 29, 2025
@yashmayya yashmayya force-pushed the fix-sse-quote-unescaping branch 2 times, most recently from d8f8ae2 to 7dfb5c8 Compare December 29, 2025 20:45
@yashmayya yashmayya force-pushed the fix-sse-quote-unescaping branch from 7dfb5c8 to f23ac59 Compare December 29, 2025 20:46
@codecov-commenter
Copy link

codecov-commenter commented Dec 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.29%. Comparing base (72be4e6) to head (8609f30).
⚠️ Report is 48 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17438      +/-   ##
============================================
+ Coverage     63.26%   63.29%   +0.03%     
- Complexity     1474     1476       +2     
============================================
  Files          3152     3163      +11     
  Lines        187881   188826     +945     
  Branches      28765    28900     +135     
============================================
+ Hits         118855   119516     +661     
- Misses        59810    60045     +235     
- Partials       9216     9265      +49     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.25% <100.00%> (+0.04%) ⬆️
java-21 63.24% <100.00%> (+<0.01%) ⬆️
temurin 63.29% <100.00%> (+0.03%) ⬆️
unittests 63.29% <100.00%> (+0.03%) ⬆️
unittests1 55.56% <58.33%> (-0.10%) ⬇️
unittests2 34.05% <91.66%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yashmayya yashmayya marked this pull request as ready for review December 30, 2025 15:30
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a double-unescaping bug in the single-stage query engine where SQL string literals containing multiple consecutive single quotes were incorrectly processed. The Calcite SQL parser already handles standard SQL escaping (converting '' to '), but legacy PQL behavior in RequestUtils::getLiteral was performing an additional unescaping pass, causing strings like '''' to become ' instead of the correct ''.

Key changes:

  • Adds a configuration flag (pinot.sse.parsing.legacy.literal.unescaping) to control the legacy unescaping behavior, defaulting to false (disabled)
  • Introduces InstanceConfigProvider singleton for global access to instance configuration across modules
  • Updates broker and server startup to initialize the InstanceConfigProvider

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java Adds configuration constants for the legacy literal unescaping feature flag
pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceConfigProvider.java New singleton class providing global access to instance configuration
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java Initializes InstanceConfigProvider during server startup
pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BaseBrokerStarter.java Initializes InstanceConfigProvider during broker startup
pinot-common/src/main/java/org/apache/pinot/common/utils/request/RequestUtils.java Updates getLiteral to conditionally apply legacy unescaping based on configuration
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/OfflineClusterIntegrationTest.java Adds integration tests for SQL string literal escaping behavior
pinot-common/src/test/java/org/apache/pinot/sql/parsers/CalciteSqlParserTest.java Adds comprehensive unit tests for the double-unescaping fix

Comment on lines 251 to 254
boolean useLegacyUnescaping =
InstanceConfigProvider.getProperty(CommonConstants.Helix.CONFIG_OF_SSE_LEGACY_LITERAL_UNESCAPING,
CommonConstants.Helix.DEFAULT_SSE_LEGACY_LITERAL_UNESCAPING);
literal.setStringValue(useLegacyUnescaping ? StringUtils.replace(node.toValue(), "''", "'") : node.toValue());
Copy link

Copilot AI Dec 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration property is read on every literal conversion, which could be called many times per query. Consider caching this configuration value as a static final field initialized during class loading or reading it once per query to avoid repeated lookups.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid concern, will refactor.

String.valueOf(Integer.MAX_VALUE);

// SQL parsing
public static final String CONFIG_OF_SSE_LEGACY_LITERAL_UNESCAPING =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is a broker config, consider following the existing config pattern: pinot.broker.sse.*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't just a broker config though, since it can be used in some server side paths as well. From the PR description:

Added InstanceConfigProvider singleton in pinot-spi for global access to instance config (broker / server, can be extended to controller / minion) from any module. Note that this pattern can be used in various scenarios in the future but is particularly useful here because we need to access the configuration from a static context that can be called from a broker path (Main SQL query parsing, time series request handler, RLS filter rewriter) or a server path (JSON index filter parsing, theta sketch agg function, time series server side execution).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other pattern we already have is pinot.query.sse.* which can be applied to multiple components

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I don't see any such configs, where do you see this config pattern?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinot.query.mse.stats.mode

@yashmayya yashmayya requested a review from Jackie-Jiang January 9, 2026 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backward-incompat Referenced by PRs that introduce or fix backward compat issues bugfix query

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants