Feature: Handle read only columns #437

Open
driv3r wants to merge 6 commits into main from feat/handle-generated-columns

@driv3r driv3r commented Apr 16, 2026

ref: #400 & @proton-lisandro-pin

The back and forth is taking a bit of time, so let's speed this up. I've cherry-picked your commits so the attribution is preserved. The CLA was signed on the original PR; if you could sign the one here as well, that would be 👍

Handle MySQL Generated Columns (STORED and VIRTUAL) in Data Replication

This PR adds support for MySQL generated columns (both VIRTUAL and STORED) to Ghostferry, enabling proper handling of computed columns during selective data replication.

Problem Statement

MySQL 8.0.23 introduced significant changes to how generated columns are handled in binary log ROW events:

  • Virtual columns are completely omitted from binlog events (not stored on disk)
  • Stored columns are included in binlog events (computed once and persisted)

Without special handling, Ghostferry would fail or produce incorrect results when replicating tables with generated columns, as it would attempt to insert values into columns that cannot be modified or would have incorrect column positions.

Solution

This PR implements a comprehensive solution with four key components:

  1. Row Expansion - Detects when MySQL omits virtual columns from binlog events and re-inserts nil sentinels to maintain consistent full-schema column indexing throughout the pipeline.

  2. Insert Value Filtering - Filters out generated column values before constructing INSERT statements, allowing only modifiable columns to be inserted while using proper column metadata for value escaping.

  3. Unsigned Integer Normalization - Fixed the order of operations: row expansion happens before unsigned integer normalization, ensuring consistent full-schema column indexing throughout.

  4. Verification with Generated Columns - Includes all columns (including generated) in fingerprint queries to detect divergence when computed values differ between source and target databases.
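As a rough illustration of components 1 and 3, here is a minimal sketch of row expansion followed by unsigned normalization. The `Column` type and the `expandRow` and `normalizeUnsigned` helpers are hypothetical names invented for this example; they are not Ghostferry's actual types or functions, and the sketch assumes the binlog row simply omits virtual columns as described above.

```go
package main

import "fmt"

// Column is a hypothetical, simplified column description
// (an assumption for this sketch, not Ghostferry's real schema type).
type Column struct {
	Name     string
	Virtual  bool // VIRTUAL generated column, assumed absent from the binlog row
	Unsigned bool
}

// expandRow returns a row with one slot per schema column, placing nil
// where a virtual column was omitted. Full-width rows are returned as-is.
func expandRow(cols []Column, row []interface{}) ([]interface{}, error) {
	if len(row) == len(cols) {
		return row, nil
	}
	stored := 0
	for _, c := range cols {
		if !c.Virtual {
			stored++
		}
	}
	if len(row) != stored {
		return nil, fmt.Errorf("row has %d values, want %d or %d", len(row), stored, len(cols))
	}
	out := make([]interface{}, 0, len(cols))
	next := 0
	for _, c := range cols {
		if c.Virtual {
			out = append(out, nil) // nil sentinel keeps later indexes aligned
			continue
		}
		out = append(out, row[next])
		next++
	}
	return out, nil
}

// normalizeUnsigned reinterprets signed values of unsigned columns.
// It indexes by full-schema position, so it must run after expandRow.
func normalizeUnsigned(cols []Column, row []interface{}) {
	for i, c := range cols {
		if !c.Unsigned {
			continue
		}
		if v, ok := row[i].(int32); ok {
			row[i] = uint32(v)
		}
	}
}

func main() {
	cols := []Column{
		{Name: "id"},
		{Name: "doubled", Virtual: true},
		{Name: "counter", Unsigned: true},
	}
	// The virtual column is missing, so the row has two values for three columns.
	full, err := expandRow(cols, []interface{}{int64(1), int32(-1)})
	if err != nil {
		panic(err)
	}
	normalizeUnsigned(cols, full)
	fmt.Println(full) // prints [1 <nil> 4294967295]
}
```

Running the normalization on the unexpanded row would look up `counter` at index 2 while its value sits at index 1, which is the kind of misalignment the ordering fix avoids.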

Changes

Core Implementation

  • dml_events.go: Modified INSERT event handling to filter out generated columns and improved binlog event processing for MySQL 8.0.23+ compatibility
  • table_schema_cache.go: Added column classification and filtering utilities:
    • IsColumnGenerated() - Identifies virtual and stored columns
    • NonGeneratedColumnNames() - Returns only insertable columns
    • FilterGeneratedColumnsOnRowData() - Removes generated column values from rows
  • row_batch.go: Updated row batch handling for filtered column data
  • iterative_verifier.go: Updated verification logic to handle generated columns

Test Coverage

  • Added unit tests for generated column handling with edge cases:
    • Virtual columns before unsigned integer columns
    • Generated columns before JSON columns (ensures JSON casting is preserved)
    • Mixed VIRTUAL and STORED columns
  • Added integration tests confirming:
    • Verification detects divergence in computed generated column values
    • Stored and virtual generated columns are handled correctly
    • Interrupt/resume scenarios work with generated columns
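The divergence check in the integration tests can be pictured with a small in-memory fingerprint. This is only a sketch under stated assumptions: `rowFingerprint` is a hypothetical helper, and the PR describes fingerprinting through queries against the databases, not in Go as shown here. The point is simply that a fingerprint covering every column changes when a generated value differs, while one covering only the insertable columns does not.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// rowFingerprint (hypothetical) hashes every value in the row, including
// generated columns, so any differing value changes the result.
func rowFingerprint(row []interface{}) string {
	h := md5.New()
	for _, v := range row {
		if v == nil {
			h.Write([]byte("NULL"))
		} else {
			fmt.Fprintf(h, "%v", v)
		}
		h.Write([]byte{0}) // separator so adjacent values cannot run together
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Columns: id, price, total. Here total is assumed to be a generated
	// column whose expression differs between source and target.
	source := []interface{}{1, 10, 20}
	target := []interface{}{1, 10, 21}

	// All columns included: the divergence is visible.
	fmt.Println(rowFingerprint(source) != rowFingerprint(target))

	// Only the non-generated columns: the rows look identical.
	fmt.Println(rowFingerprint(source[:2]) == rowFingerprint(target[:2]))
}
```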

Edge Cases Handled

  • ✅ Virtual columns before unsigned integer columns
  • ✅ Generated columns before JSON columns (JSON casting preserved)
  • ✅ Mixed VIRTUAL and STORED columns in same table
  • ✅ MySQL version differences (pre-8.0.23 vs 8.0.23+)
  • ✅ Interrupt/resume with generated columns
  • ✅ Verification with computed column divergence detection

Testing

  • 5 new commits with comprehensive testing
  • Tests for critical edge cases involving generated columns and other column types
  • Integration tests verifying both inline and checkpoint verification modes
  • Fixed race condition in interrupt/resume tests

Related Issue

Closes #338

This PR modifies all `INSERT` logic so that Ghostferry no longer attempts
to insert into virtual (a.k.a. generated) MySQL columns, which otherwise
breaks the ferrying process.

See also #338.
@driv3r driv3r self-assigned this Apr 16, 2026
@driv3r driv3r added the enhancement and go labels Apr 16, 2026
@driv3r driv3r marked this pull request as ready for review April 16, 2026 19:28
@driv3r driv3r requested review from a team, austenLacy, forge33 and milanatshopify April 16, 2026 19:29

driv3r commented Apr 16, 2026

Hey @plisandro! There were a bunch more things to tackle here. Everything should be in place now, so feel free to review and test, and please sign the CLA mentioned in the checks for the contributions 👍

@proton-lisandro-pin

> Hey @plisandro ! There was a bunch more things to tackle here, everything should be in place now, feel free to review and test, as well as sign the CLA mentioned in the checks for the contributions 👍

This is much appreciated, thank you! Been testing it today and your PR seems to work well. CLA is now signed as well 😄

driv3r commented Apr 17, 2026

Hey @plisandro I'm not super familiar with the CLA stuff, but the error says:

  • @plisandro: Sign the CLA and comment "I have signed the CLA!" to re-run the checks and have your PR reviewed.

Leave a comment and let's see. Also, I think you may have committed under your original account, so you might need to comment and sign under it as well 🤔

@plisandro

I have signed the CLA!

@plisandro

> leave a comment and lets see, also I think you may have committed under your original account, so you might need to leave/sign under it as well 🤔

Done, and apologies for the confusion. I'm in the process of merging accounts now, and this was actually the last contribution left with the old one 🤦

driv3r commented Apr 17, 2026

@plisandro no problem!

Labels

enhancement, go

Development

Successfully merging this pull request may close these issues.

Trouble with virtual generated columns

3 participants