feat(dbapi): use inline begin to eliminate BeginTransaction RPC#1502
feat(dbapi): use inline begin to eliminate BeginTransaction RPC#1502waiho-gumloop wants to merge 4 commits intogoogleapis:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello @waiho-gumloop, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request optimizes transaction handling in the Spanner DBAPI by leveraging the existing inline begin mechanism. It removes an unnecessary explicit Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable optimization by removing the explicit BeginTransaction RPC call in transaction_checkout. Instead, it leverages the existing inline begin functionality, where the transaction is initiated with the first SQL execution. This change effectively reduces latency by saving one network round-trip per transaction. The modification is well-supported by a new unit test ensuring begin() is not called, and the detailed description provides a thorough safety analysis, confirming the change is sound.
f176484 to
880081e
Compare
…ests Add test_dbapi_inline_begin.py with 7 mockserver tests that verify: - Read-write DBAPI transactions send no BeginTransactionRequest - First ExecuteSqlRequest uses TransactionSelector(begin=...) - Read + write + commit request sequence is correct - DML-only transactions use inline begin - Read-only transactions still use explicit BeginTransaction - Transaction retry after abort works with inline begin Update existing mockserver tests that expected BeginTransactionRequest for read-write DBAPI transactions: - test_tags.py: Remove BeginTransactionRequest from expected sequences for all read-write tag tests, adjust tag index offsets - test_dbapi_isolation_level.py: Verify isolation level on the inline begin field of ExecuteSqlRequest instead of BeginTransactionRequest Made-with: Cursor
a98de4c to
48d4df8
Compare
Made-with: Cursor
- Consolidate 3 redundant single-read tests into one comprehensive test that verifies: no BeginTransactionRequest, inline begin on first ExecuteSqlRequest, correct request sequence, and correct query results - Rename test_second_statement_uses_transaction_id to test_read_then_write_full_lifecycle with additional assertions: CommitRequest.transaction_id matches the transaction ID from inline begin - Strengthen test_rollback to verify RollbackRequest is sent with a non-empty transaction_id (was only checking no BeginTransactionRequest) - Add CommitRequest assertions to abort retry test: both the aborted and successful commits carry valid transaction IDs - Assert cursor.fetchall() return values in read tests to verify inline begin doesn't corrupt result set metadata - Add RollbackRequest import Made-with: Cursor
Fixes #1503
Summary
Connection.transaction_checkout()currently callsTransaction.begin()explicitly before the first query, which sends a standaloneBeginTransactiongRPC RPC. This is unnecessary because theTransactionclass already supports inline begin — piggybackingBeginTransactiononto the firstExecuteSql/ExecuteBatchDmlrequest viaTransactionSelector(begin=...).This PR removes the explicit
begin()call, letting the existing inline begin logic inexecute_sql(),execute_update(), andbatch_update()handle transaction creation. This saves one gRPC round-trip per transaction (~16ms measured on the emulator).What changed
google/cloud/spanner_dbapi/connection.py—transaction_checkout():self._transaction.begin()on L413_transaction_id=Nonetests/unit/spanner_dbapi/test_connection.py:test_transaction_checkout_does_not_call_beginto assertbegin()is not calledtests/mockserver_tests/test_dbapi_inline_begin.py(new):BeginTransactionRequestsent, firstExecuteSqlRequestusesTransactionSelector(begin=...), transaction ID reuse on second statement, rollback, read-only unaffected, retry after aborttests/mockserver_tests/test_tags.py:BeginTransactionRequestfrom expected RPC sequences, adjusted tag index offsetstests/mockserver_tests/test_dbapi_isolation_level.py:ExecuteSqlRequest.transaction.begin.isolation_levelinstead ofBeginTransactionRequest.options.isolation_levelWhy this is safe
The inline begin code path already exists and is battle-tested —
Session.run_in_transaction()creates aTransactionwithout callingbegin()and relies on the same inline begin logic (session.py L566).Specific safety analysis:
transaction_checkout()callers always execute SQL immediately: It's only called fromrun_statement()→execute_sql()andbatch_dml_executor→batch_update(). Both set_transaction_idvia inline begin before any commit/rollback path.execute_sql/execute_update/batch_updatehandle_transaction_id is None: They acquire a lock, use_make_txn_selector()which returnsTransactionSelector(begin=...), and store the returned_transaction_id(transaction.py L612-L623).rollback()handles_transaction_id is None: Skips the RPC — correct when no server-side transaction exists (transaction.py L163).commit()handles_transaction_id is None: Falls back to_begin_mutations_only_transaction()for mutation-only transactions (transaction.py L263-L267).Retry mechanism is compatible:
_set_connection_for_retry()resets_spanner_transaction_started=False, so replayed statements go throughtransaction_checkout()again, create a freshTransaction, and use inline begin.PEP 249 conformance
This change is fully conformant with PEP 249 (DB-API 2.0). The spec does not define a
begin()method — transactions are implicit. The PEP author clarified on the DB-SIG mailing list that "transactions start implicitly after you connect and after you call.commit()or.rollback()", and the mechanism by which the driver starts the server-side transaction is an implementation detail. Deferring the server-side begin to the first SQL execution (as psycopg2 and other mature DB-API drivers do) is the standard approach. The observable transactional semantics — atomicity betweencommit()/rollback()calls — are unchanged.Performance impact
Before (4 RPCs per read-write transaction):
After (3 RPCs per read-write transaction):
Measured ~16ms savings per transaction on the Spanner emulator.
Context
This optimization was identified while profiling SQLAlchemy/DBAPI performance against Spanner. The DBAPI was created ~2 years before inline begin support was added to the Python client library (inline begin landed in PR #740, Dec 2022). The
run_in_transactionpath was updated to use inline begin, buttransaction_checkoutwas not.Test plan
tests/unit/spanner_dbapi/)begin()is not called bytransaction_checkout()tests/mockserver_tests/test_dbapi_inline_begin.py)test_tags.py,test_dbapi_isolation_level.py)tests/system/test_dbapi.py)