Conversation

@FGasper (Collaborator) commented Nov 12, 2025

This adds an opt-in mode to tail the oplog directly instead of reading a change stream. Oplog tailing is less resource-intensive on the server than a change stream, which yields better performance and allows verification of more migrations.

Oplog tailing carries a few caveats (or else it would be the default when possible):

  • The oplog’s format is undocumented, by design. A future MongoDB version may not work with the present logic. Change streams, by contrast, are fully documented and intended for use in applications like Migration Verifier.
  • To capture transactions that are already in progress when verification starts, oplog tailing actually begins from the start optime of the oldest such transaction. This means a few extra rechecks may be enqueued at the start of a verification.

Additional notes:

  • Since migration-verifier always reads with majority read concern, we can ignore the term and hash that, together with the timestamp, normally comprise an oplog entry’s “optime”.
  • ChangeStreamReader.start() is now part of ChangeReaderCommon so that its retry bits can be shared with the new OplogReader type.
  • This includes a nascent agg library, which simplifies aggregation usage in Go by providing types for various expressions (a rough sketch of the idea follows this list).
  • The retryer now wraps errors with the function description message.
  • To ensure that OplogReader doesn’t read too far “back in time”, the integration test suite now kills all transactions between test methods (also sketched after this list).
  • Some of the tests are reordered a bit so that document insertion happens before verifier creation. This was originally done to troubleshoot test failures but seems worth retaining for general clarity.
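
To illustrate the agg-library idea mentioned above, here is a rough sketch of how expression types such as agg.Eq, agg.And, and agg.Not might be defined so that they marshal to the corresponding MQL operators. This is a hypothetical reconstruction based on the usage visible later in this review, not the actual package in this PR; the type shapes and mongo-go-driver v2 import paths are assumptions.

package agg

import "go.mongodb.org/mongo-driver/v2/bson"

// Eq marshals to {"$eq": [lhs, rhs]}.
type Eq [2]any

func (e Eq) MarshalBSON() ([]byte, error) {
	return bson.Marshal(bson.D{{Key: "$eq", Value: bson.A{e[0], e[1]}}})
}

// And marshals to {"$and": [expr, expr, ...]}.
type And []any

func (a And) MarshalBSON() ([]byte, error) {
	return bson.Marshal(bson.D{{Key: "$and", Value: bson.A(a)}})
}

// Not marshals to {"$not": [expr]}.
type Not [1]any

func (n Not) MarshalBSON() ([]byte, error) {
	return bson.Marshal(bson.D{{Key: "$not", Value: bson.A{n[0]}}})
}

With types like these, a filter such as agg.And{agg.Eq{"$op", "c"}, …} (as in the diff excerpt later in this review) marshals directly into a pipeline stage instead of being hand-written as nested bson.D/bson.A literals.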
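
And for the transaction-killing note above, a minimal sketch of what such a test hook could look like, assuming mongo-go-driver v2 and the server’s killAllSessions command; the helper name and the actual mechanism used by the test suite are assumptions:

import (
	"context"

	"go.mongodb.org/mongo-driver/v2/bson"
	"go.mongodb.org/mongo-driver/v2/mongo"
)

// killAllTransactions kills every server session, which aborts any in-progress
// transactions, so the next test method starts from a clean slate.
func killAllTransactions(ctx context.Context, client *mongo.Client) error {
	// killAllSessions with an empty array targets all sessions on the server.
	res := client.Database("admin").RunCommand(ctx, bson.D{
		{Key: "killAllSessions", Value: bson.A{}},
	})
	return res.Err()
}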

@FGasper FGasper marked this pull request as ready for review November 24, 2025 14:49
@FGasper FGasper marked this pull request as draft November 24, 2025 15:54
@FGasper FGasper marked this pull request as ready for review November 24, 2025 17:06
@FGasper FGasper requested a review from visemet December 3, 2025 01:04

@FGasper (Author) commented Dec 3, 2025

@visemet I’ve added you as a reviewer for just the oplog-specific parts of this. (oplog.go in particular)

Thank you!

@tdq45gj (Collaborator) left a comment:

I'm not familiar with the oplog tailing logic, but it looks good to me overall. I've left some small comments.

@FGasper FGasper requested a review from tdq45gj December 4, 2025 01:33

@tdq45gj (Collaborator) left a comment:

lgtm

ctx context.Context,
client *mongo.Client,
) (OpTime, OpTime, error) {
oldestTxn, err := getOldestTransactionTime(ctx, client)

Collaborator commented:

Is there a ticket to track swapping the order of getOldestTransactionTime() and getLatestVisibleOplogOpTime() per the issue described in TOOLS-4015?

@FGasper (Author) replied:

🤦 This is brand-new logic … I’m not sure why I wrote it this way. Fixed, and thank you.
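
For readers unfamiliar with these helpers: the “latest visible oplog optime” half can, roughly speaking, be obtained by reading the newest entry of local.oplog.rs. The sketch below is an illustration under that assumption (mongo-go-driver v2 paths assumed); the PR’s actual getLatestVisibleOplogOpTime and getOldestTransactionTime may work differently.

import (
	"context"

	"go.mongodb.org/mongo-driver/v2/bson"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)

// latestVisibleOplogTS reads the newest oplog entry's "ts" field.
// Hypothetical sketch only; the PR's helper may differ.
func latestVisibleOplogTS(ctx context.Context, client *mongo.Client) (bson.Timestamp, error) {
	oplog := client.Database("local").Collection("oplog.rs")

	// Sort by $natural descending to get the most recent entry.
	opts := options.FindOne().SetSort(bson.D{{Key: "$natural", Value: -1}})

	var entry struct {
		TS bson.Timestamp `bson:"ts"`
	}
	if err := oplog.FindOne(ctx, bson.D{}, opts).Decode(&entry); err != nil {
		return bson.Timestamp{}, err
	}
	return entry.TS, nil
}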

README.md Outdated
## `tailOplog`
The verifier will read the oplog continually instead of reading a change stream. This is generally faster, but it doesn’t work in sharded clusters.

Collaborator commented:

Should we change "doesn't work" to be "requires connecting directly to each shard"?

@FGasper (Author) replied:

I’ve been under the impression that direct shard connections are something we want people not to do. But, revised.

}

func getOplogDocIDExpr(docroot string) any {
// $switch was new in MongoDB 4.4, so use $cond instead.

Collaborator commented:

$switch was added in MongoDB 3.4 (SERVER-10689). Can this comment be removed?

@FGasper (Author) replied:

I must have misread 3.4 as 4.4 … of course it doesn’t matter because the expr form only works in 4.4+. Removed.
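
For context on what this helper computes: in the oplog, an update entry carries the document’s _id under o2, while inserts and deletes carry it under o. A doc-ID expression therefore branches on the op field, roughly as sketched below. This is an illustration only; the PR’s actual getOplogDocIDExpr takes a docroot parameter and presumably handles applyOps nesting as well.

import "go.mongodb.org/mongo-driver/v2/bson"

// oplogDocIDExprSketch is a hypothetical, simplified version of a doc-ID
// extraction expression for top-level oplog entries.
func oplogDocIDExprSketch() any {
	return bson.D{{Key: "$cond", Value: bson.A{
		// if: this is an update entry
		bson.D{{Key: "$eq", Value: bson.A{"$op", "u"}}},
		// then: updates keep the _id in o2
		"$o2._id",
		// else: inserts and deletes keep it in o
		"$o._id",
	}}}
}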

// op=c is for applyOps, and also to detect forbidden DDL.
agg.And{
agg.Eq{"$op", "c"},
agg.Not{helpers.StringHasPrefix{

Collaborator commented:

[question] Do we limit ourselves to ignoring DDL operations on the "config" database because it would be more complex MQL to also ignore DDL operations on the "admin" database? I don't anticipate op=c oplog entries on the "config" database after a cluster is first initialized so I wanted to double check the intention here (I'm not proposing any changes).

@FGasper (Author) replied:

I believe I copied this part from mongo-speedcam, which I essentially “reverse-engineered” from mongomirror.

The ideal, actually, is to detect & fail on any DDL event. In that regard, would it be better to remove the filtering on config here so we catch DDL events on config?

return nil, bson.Timestamp{}, errors.Wrapf(err, "extracting applyOps[%d].op", i)
}

err = parseOneDocumentOp(opName, latestTS, opRaw)

Collaborator commented:

[note] I imagine that because this logic is only used by the verifier, it is OK for rechecks to be performed spuriously on documents that a multi-statement transaction attempted to modify but ultimately aborted. As a comparison point, change streams only emit change events for multi-statement transactions after they are committed.

This behavior of the oplog reader is also what enables its resume token to only be aware of the 'ts' timestamp position within the oplog collection.

I'm unsure whether this is the best spot to call this out, or whether there's a high-level overview of how the oplog reader mode behaves where it could be summarized instead.

@FGasper (Author) replied:

I added a note at the top of this file about the two cases (including this one) where the oplog reader triggers “extra” rechecks.
