support sharded parquet files in parquet converter and queryable #7189

yeya24 · 2026-01-07T00:27:20Z

What this PR does:

This PR stores the number of shards in the parquet converter marker as well as in the bucket index. This enables sharded parquet conversion, which is not supported today, on both the write path and the read path. The read path looks at the parquet marker to learn how many shards a block has, and therefore how many files to read.
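As a rough illustration of the read path described above, here is a minimal sketch. The `converterMark` type, its `Shards` field and JSON key, the `numShards` and `shardFiles` helpers, the single-shard fallback for markers without a count, and the `<shard>.labels.parquet` / `<shard>.chunks.parquet` file layout are all assumptions made for this example, not the names or layout used in the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// converterMark is a hypothetical stand-in for the parquet converter marker.
// The field names and JSON keys are assumptions, not taken from the PR.
type converterMark struct {
	Version int `json:"version"`
	Shards  int `json:"shards,omitempty"`
}

// numShards treats a marker without a shard count as a single shard.
// This fallback is an assumed way to stay compatible with older markers.
func numShards(m converterMark) int {
	if m.Shards <= 0 {
		return 1
	}
	return m.Shards
}

// shardFiles lists the files a reader would open for one block.
// The "<shard>.labels.parquet" / "<shard>.chunks.parquet" layout is assumed.
func shardFiles(blockID string, m converterMark) []string {
	n := numShards(m)
	files := make([]string, 0, 2*n)
	for i := 0; i < n; i++ {
		files = append(files,
			fmt.Sprintf("%s/%d.labels.parquet", blockID, i),
			fmt.Sprintf("%s/%d.chunks.parquet", blockID, i),
		)
	}
	return files
}

func main() {
	var m converterMark
	if err := json.Unmarshal([]byte(`{"version":1,"shards":2}`), &m); err != nil {
		panic(err)
	}
	for _, f := range shardFiles("01EXAMPLEBLOCK", m) {
		fmt.Println(f)
	}
}
```

The point of the sketch is only that the reader derives the file list from the marker instead of listing the bucket.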

Note that to keep this PR small, I only changed the parquet queryable and left the parquet store gateway untouched. Ideally, the store gateway should load the bucket index so that it can tell how many shards each parquet block has. But today the parquet store gateway doesn't sync the bucket index at all.

The plan is to add more shard info to the parquet converter marker, such as the min and max metric name for each shard, so that we can prune the shards to query based on the metric name, since our parquet files are sorted by metric name. That is left for a future implementation.
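To make the pruning idea concrete, here is a minimal sketch of how per-shard min/max metric names could be used. Nothing here exists in this PR: the `shardRange` type, its fields, and the `pruneShards` helper are hypothetical, and the sketch only covers an equality match on the metric name.

```go
package main

import "fmt"

// shardRange is hypothetical: the inclusive min and max metric name
// stored for one shard of a block.
type shardRange struct {
	Index     int
	MinMetric string
	MaxMetric string
}

// pruneShards returns the indexes of shards whose [MinMetric, MaxMetric]
// range can contain the given metric name. Names are compared as plain
// byte-ordered strings, which is an assumption about the sort order.
func pruneShards(shards []shardRange, metric string) []int {
	var keep []int
	for _, s := range shards {
		if metric >= s.MinMetric && metric <= s.MaxMetric {
			keep = append(keep, s.Index)
		}
	}
	return keep
}

func main() {
	shards := []shardRange{
		{Index: 0, MinMetric: "alertmanager_alerts", MaxMetric: "go_goroutines"},
		{Index: 1, MinMetric: "go_info", MaxMetric: "up"},
	}
	fmt.Println(pruneShards(shards, "http_requests_total"))
}
```

A regex or negative matcher on the metric name could not be pruned this way and would still need to read every shard.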

Which issue(s) this PR fixes:
Fixes #7175

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: yeya24 <[email protected]>

SungJin1212

LGTM

pull-request-size bot added the size/L label Jan 7, 2026

yeya24 added 3 commits January 6, 2026 17:34

support sharded parquet files in parquet converter and queryable

0e26186

Signed-off-by: yeya24 <[email protected]>

use latest parquet version

259b921

Signed-off-by: yeya24 <[email protected]>

rebase master

8908b3e

Signed-off-by: yeya24 <[email protected]>

yeya24 force-pushed the parquet-converter-shards branch from 3282ac3 to 8908b3e Compare January 7, 2026 01:36

SungJin1212 approved these changes Jan 7, 2026

dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 7, 2026

yeya24 merged commit ec124e5 into cortexproject:master Jan 8, 2026
74 of 77 checks passed

yeya24 deleted the parquet-converter-shards branch January 8, 2026 22:42
