[BUG] Fix for psync After Restarting AOF-Enabled Instances May Introduce New Data Inconsistency #2904

@arthurkiller

Regarding the fix for psync after restarting AOF-enabled instances (#2366, #2677): it may introduce a new data inconsistency, for the following reasons.

  1. The repl_offset stored in the RDB only reflects the value at the time of the save. With multipart AOF, the AOF files loaded afterwards do not advance repl_offset, so the stored value cannot be used directly for synchronization.
  2. An offset derived from the AOF files does not represent the offset of the server's replication buffer, because some data in master-replica synchronization advances the replication offset without being written to the AOF.
  3. repl_offset cannot be persisted in real time, so synchronization cannot rely on a single recorded replication_offset.
  4. Matching offsets alone do not guarantee a valid synchronization. Non-idempotent commands (e.g., incrby) can directly break data consistency, for example when one of them is applied a second time.
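
To make points 2 to 4 concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyReplica`, `encoded_size`, and `commands_from` are hypothetical names, command size is approximated by string length rather than the real wire encoding, and a `PING` stands in for traffic that advances the replication offset without reaching the AOF. It is not the server's actual code path.

```python
# Toy model only: names and the size encoding are illustrative assumptions,
# not the server's real structures or wire format.

def encoded_size(cmd):
    """Stand-in for the number of bytes a command occupies in the stream."""
    return len(" ".join(cmd))


class ToyReplica:
    def __init__(self, repl_offset=0):
        self.data = {}
        self.repl_offset = repl_offset
        self.aof = []  # commands that were persisted to the AOF

    def execute(self, cmd):
        _, key, delta = cmd
        self.data[key] = self.data.get(key, 0) + int(delta)

    def feed(self, cmd):
        """Consume one command from the replication stream."""
        self.repl_offset += encoded_size(cmd)  # every command advances the offset
        if cmd[0] == "INCRBY":                 # only writes reach the AOF
            self.execute(cmd)
            self.aof.append(cmd)


def commands_from(stream, offset):
    """Yield the commands starting at or after `offset`, as a backlog would."""
    position = 0
    for cmd in stream:
        if position >= offset:
            yield cmd
        position += encoded_size(cmd)


STREAM = [("INCRBY", "counter", "5"), ("PING",), ("INCRBY", "counter", "5")]

# Before the restart: the replica consumes the whole stream.
replica = ToyReplica()
for cmd in STREAM:
    replica.feed(cmd)
aof_bytes = sum(encoded_size(c) for c in replica.aof)  # smaller than repl_offset

# After the restart: the saved offset is stale (taken after the first command),
# but the AOF already contains both INCRBY commands.
restarted = ToyReplica(repl_offset=encoded_size(STREAM[0]))
for cmd in replica.aof:
    restarted.execute(cmd)  # AOF loading restores data, not the offset

# A partial resync from the stale offset replays the second INCRBY.
for cmd in commands_from(STREAM, restarted.repl_offset):
    restarted.feed(cmd)
```

In this model the restarted replica ends with the same `repl_offset` as the original one, yet its `counter` has been incremented three times instead of two, which is the kind of silent divergence described in point 4.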

On one hand, we still need to load the data, because this replica node may be promoted to a master node in the future. On the other hand, if the master node of this replica is still alive, the entire data loading process will be futile, because another psync synchronization will be required anyway.
