feat: add rewrite manifests #774

Open
zhjwpku wants to merge 1 commit into apache:main from zhjwpku:support_rewrite_manifests

zhjwpku commented Jun 23, 2026

Collaborator

No description provided.

zhjwpku force-pushed the support_rewrite_manifests branch 7 times, most recently from 18bade1 to 7de6949 (June 28, 2026 06:40)
zhjwpku marked this pull request as ready for review (June 28, 2026 06:42)
zhjwpku force-pushed the support_rewrite_manifests branch from 7de6949 to 5c1ee1f (July 1, 2026 11:52)
zhjwpku force-pushed the support_rewrite_manifests branch from 5c1ee1f to dd9c7e6 (July 1, 2026 16:56)
auto new_writer = [this, &schema](const RewriteWriter& rewrite_writer)
    -> Result<std::unique_ptr<ManifestWriter>> {
  std::optional<int64_t> first_row_id = std::nullopt;
  if (base().format_version >= 3 && rewrite_writer.content == ManifestContent::kData) {

Member

If the table was upgraded from v2, existing live files may still have null per-file row IDs. In that case ManifestEntryAdapterV3::GetFirstRowId falls back to the writer-level value, so those files get 0, and the manifest list skips assignment because the manifest now has a non-null first_row_id. Java leaves the rewrite writer’s manifest firstRowId null via newManifestWriter, which allows manifest-list assignment when needed after an upgrade.
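
To make the concern concrete, here is a minimal sketch of the two steps described above. It is a simplified model, not the iceberg-cpp API: `EffectiveFirstRowId` and `AssignAtManifestList` are hypothetical helper names, and the only behavior assumed is what this comment states (per-file value falls back to the writer-level value, and the manifest list assigns only when the manifest value is null).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the per-entry fallback: use the file's own
// first_row_id when present, otherwise the writer-level value.
std::optional<int64_t> EffectiveFirstRowId(
    std::optional<int64_t> file_first_row_id,
    std::optional<int64_t> writer_first_row_id) {
  return file_first_row_id.has_value() ? file_first_row_id : writer_first_row_id;
}

// Hypothetical model of manifest-list assignment: a manifest that already
// carries a first_row_id is left alone; only a null value gets the next ID.
std::optional<int64_t> AssignAtManifestList(
    std::optional<int64_t> manifest_first_row_id, int64_t next_row_id) {
  return manifest_first_row_id.has_value()
             ? manifest_first_row_id
             : std::optional<int64_t>(next_row_id);
}
```

Under this model, a file upgraded from v2 (null per-file ID) written through a writer whose first_row_id is 0 ends up with 0, and the manifest list then keeps 0 instead of assigning from the table's next row ID. With a null writer-level value, the file stays null and the manifest list is free to assign.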

const bool has_direct_replacements = !deleted_manifests_.empty() ||
                                     !added_manifests_.empty() ||
                                     !rewritten_added_manifests_.empty();
if (!cluster_by_func_ && !predicate_ && has_direct_replacements) {

Member

Java does not rewrite at all when clusterByFunc == null, while this rewrites when predicate_ is set. Is this intentional?
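
The difference can be sketched as two predicates. This is an illustration only: both function names are hypothetical, the first copies the condition quoted in the hunk above, and it assumes the guarded branch is the path that keeps manifests without rewriting (the branch body is not shown here). The second encodes only what this comment says about Java.

```cpp
// Condition as quoted in this PR's hunk. Assumption: the branch it guards
// is the "no rewrite" path.
bool TakesNoRewritePathInThisPr(bool has_cluster_by_func, bool has_predicate,
                                bool has_direct_replacements) {
  return !has_cluster_by_func && !has_predicate && has_direct_replacements;
}

// Java behavior as described in this comment: no rewrite whenever the
// cluster-by function is null, regardless of any predicate.
bool SkipsRewriteAsDescribedForJava(bool has_cluster_by_func) {
  return !has_cluster_by_func;
}
```

With no cluster-by function but a predicate set, the first predicate is false (the rewrite proceeds) while the second is true (no rewrite), which is the case the question is about.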
