SPOC-484: TRANSDISCARD/SUB_DISABLE: switch from subtransaction replay to read-only dry-run logging#400
Conversation
…-run replay In TRANSDISCARD and SUB_DISABLE modes, the second (retry) pass replays the transaction as read-only. Each DML handler logs a row to spock.exception_log instead of attempting the operation. The record that originally caused the error carries the real error message; other records get an empty string. The failed record is identified by xact_action_counter (saved as failed_action in SpockExceptionLog), which is unique per DML within a transaction. In DISCARD mode, each DML still runs inside a subtransaction; only the failed ones are logged. Add dry_run_logging regression test covering TRANSDISCARD (absent table), DISCARD (truncated table), and SUB_DISABLE (deleted row), with INSERT_EXISTS conflict on drl_t1 to verify that dry-run modes leave spock.resolutions empty while DISCARD populates it.
📝 WalkthroughWalkthroughThis pull request enhances exception handling and retry replay logic in the replication system. It adds a Changes
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/spock_apply.c (1)
2263-2334:⚠️ Potential issue | 🟠 MajorPotential duplicate exception logging in TRANSDISCARD/SUB_DISABLE mode.
In the SQL handler, for TRANSDISCARD/SUB_DISABLE modes:
- Lines 2281-2288 log the exception directly
- Then
failedis set tofalseat line 2289- The code falls through to line 2314 where
should_log_exception(false)is calledshould_log_exception(false)returnstruewhenexception_behaviour == TRANSDISCARD || SUB_DISABLE(lines 310-312)- Lines 2326-2333 log the exception again
This could result in duplicate entries in
spock.exception_logfor SQL operations in dry-run modes.🐛 Proposed fix to prevent duplicate logging
Add an early return or skip the second logging block for TRANSDISCARD/SUB_DISABLE:
if (exception_behaviour == TRANSDISCARD || exception_behaviour == SUB_DISABLE) { /* * TRANSDISCARD and SUB_DISABLE: skip the DDL and log the * discarded operation directly to spock.exception_log. */ char *error_msg = (my_exception_log_index >= 0 && xact_action_counter == exception_log_ptr[my_exception_log_index].failed_action && exception_log_ptr[my_exception_log_index].initial_error_message[0] != '\0') ? exception_log_ptr[my_exception_log_index].initial_error_message : NULL; sql = JsonbToCString(NULL, &queued_message->message->root, 0); exception_command_counter++; add_entry_to_exception_log(remote_origin_id, replorigin_session_origin_timestamp, remote_xid, 0, 0, NULL, NULL, NULL, NULL, sql, queued_message->role, "SQL", error_msg); - failed = false; + /* Already logged above; skip the should_log_exception block below */ } else { /* DISCARD MODE needs hard way - try block and subtransactions */ PG_TRY(); { exception_command_counter++; BeginInternalSubTransaction(NULL); handle_sql(queued_message, tx_just_started, &sql); } PG_CATCH(); { failed = true; RollbackAndReleaseCurrentSubTransaction(); edata = CopyErrorData(); xact_had_exception = true; } PG_END_TRY(); if (!failed) ReleaseCurrentSubTransaction(); - } - /* Let's create an exception log entry if true. */ - if (should_log_exception(failed)) - { - /* - * Use current error message if operation failed, otherwise use - * initial_error_message for context (e.g., in DISCARD mode when - * SQL succeeds but we're logging it because of a previous error). - */ - char *error_msg = failed ? edata->message : - (my_exception_log_index >= 0 && - exception_log_ptr[my_exception_log_index].initial_error_message[0] != '\0' ? - exception_log_ptr[my_exception_log_index].initial_error_message : NULL); - - add_entry_to_exception_log(remote_origin_id, - replorigin_session_origin_timestamp, - remote_xid, - 0, 0, - NULL, NULL, NULL, NULL, - sql, queued_message->role, - "SQL", - error_msg); + + /* Let's create an exception log entry if true (DISCARD mode only). */ + if (should_log_exception(failed)) + { + /* + * Use current error message if operation failed, otherwise use + * initial_error_message for context (e.g., in DISCARD mode when + * SQL succeeds but we're logging it because of a previous error). + */ + char *error_msg = failed ? edata->message : + (my_exception_log_index >= 0 && + exception_log_ptr[my_exception_log_index].initial_error_message[0] != '\0' ? + exception_log_ptr[my_exception_log_index].initial_error_message : NULL); + + add_entry_to_exception_log(remote_origin_id, + replorigin_session_origin_timestamp, + remote_xid, + 0, 0, + NULL, NULL, NULL, NULL, + sql, queued_message->role, + "SQL", + error_msg); + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/spock_apply.c` around lines 2263 - 2334, The TRANSDISCARD/SUB_DISABLE branch logs an exception via add_entry_to_exception_log and then falls through to the generic logging block which calls should_log_exception and logs again, causing duplicates; fix by recording that the exception was already logged (e.g., set a local flag like exception_logged = true after the first add_entry_to_exception_log in the TRANSDISCARD/SUB_DISABLE branch) and then skip the later add_entry_to_exception_log call when exception_logged is true (or check exception_behaviour directly before calling add_entry_to_exception_log in the final block); reference symbols: exception_behaviour, TRANSDISCARD, SUB_DISABLE, add_entry_to_exception_log, should_log_exception, failed, exception_log_ptr, my_exception_log_index.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/spock_apply.c`:
- Around line 2263-2334: The TRANSDISCARD/SUB_DISABLE branch logs an exception
via add_entry_to_exception_log and then falls through to the generic logging
block which calls should_log_exception and logs again, causing duplicates; fix
by recording that the exception was already logged (e.g., set a local flag like
exception_logged = true after the first add_entry_to_exception_log in the
TRANSDISCARD/SUB_DISABLE branch) and then skip the later
add_entry_to_exception_log call when exception_logged is true (or check
exception_behaviour directly before calling add_entry_to_exception_log in the
final block); reference symbols: exception_behaviour, TRANSDISCARD, SUB_DISABLE,
add_entry_to_exception_log, should_log_exception, failed, exception_log_ptr,
my_exception_log_index.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ee09d8b3-4d1d-4de2-92e6-415759235dd9
⛔ Files ignored due to path filters (2)
tests/regress/expected/dry_run_logging.outis excluded by!**/*.outtests/regress/expected/replication_set.outis excluded by!**/*.out
📒 Files selected for processing (6)
Makefileinclude/spock_exception_handler.hsrc/spock_apply.csrc/spock_exception_handler.ctests/regress/sql/dry_run_logging.sqltests/regress/sql/replication_set.sql
spock.exception_loginstead of attempting the operation. Only the record that originally caused the error carries the real error message; other records get an empty string.xact_action_counter(saved asfailed_actioninSpockExceptionLog), which is unique per DML within a transaction.dry_run_loggingregression test covering all three exception modes with different breakage methods (absent table, truncated table, deleted row) and an INSERT_EXISTS conflict to verify that dry-run modes leavespock.resolutionsempty while DISCARD populates it.