⚡ Bolt: Optimized Blockchain Verification to O(1) #430
RohanExploit wants to merge 3 commits into `main`.
Conversation
This PR optimizes the blockchain-style integrity verification process by:
1. Adding `previous_integrity_hash` to the `Issue` model to allow O(1) retrieval of all data needed for verification.
2. Strengthening the integrity hash by including `latitude` and `longitude` (geographic context) in the calculation.
3. Reducing database queries in the `blockchain-verify` endpoint from 2 to 1.
4. Updating Pydantic schemas and unit tests to reflect the new hashing logic and denormalization.
🙏 Thank you for your contribution, @RohanExploit!

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
Pull request overview
This PR optimizes blockchain verification from O(N) to O(1) by implementing denormalization: storing previous_integrity_hash directly in each Issue record to eliminate the need for a second database query during verification. Additionally, the blockchain hash format is strengthened by including geographic coordinates (lat/lon) in the hash computation.
Changes:
- Added `previous_integrity_hash` column to store explicit cryptographic links between issues
- Updated hash computation to include latitude and longitude coordinates for stronger tamper detection
- Modified verification endpoint to use stored previous_integrity_hash instead of querying for it
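The chained-hash scheme the PR describes can be sketched in plain Python. This is a minimal illustration of the technique, not the actual code in `backend/routers/issues.py`; `compute_integrity_hash` is a hypothetical helper name:

```python
import hashlib

def compute_integrity_hash(description, category, latitude, longitude, prev_hash):
    # NULL coordinates fall back to "0.0000000", matching the PR's hashing logic
    lat_str = f"{latitude:.7f}" if latitude is not None else "0.0000000"
    lon_str = f"{longitude:.7f}" if longitude is not None else "0.0000000"
    content = f"{description}|{category}|{lat_str}|{lon_str}|{prev_hash}"
    return hashlib.sha256(content.encode()).hexdigest()

# Chain two reports: the second stores the first's hash as its
# previous_integrity_hash, so verifying it needs no extra query.
h1 = compute_integrity_hash("Pothole on MG Road", "Road", 19.0760, 72.8777, "")
h2 = compute_integrity_hash("Broken streetlight", "Electricity", None, None, h1)
```

With the predecessor's hash denormalized onto the row, verifying a single issue only requires loading that one row.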
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| backend/models.py | Added previous_integrity_hash column to Issue model for O(1) blockchain traversal |
| backend/schemas.py | Exposed integrity_hash and previous_integrity_hash fields in IssueResponse API schema |
| backend/routers/issues.py | Updated issue creation to store previous_integrity_hash; modified verification to use stored value; strengthened hash with lat/lon coordinates |
| tests/test_blockchain.py | Updated test cases to include lat/lon in hash computation and validate new previous_integrity_hash field |
| .jules/bolt.md | Documented the denormalization optimization learning |
```python
prev_issue = await run_in_threadpool(
    lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
)
prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""
```
The query to fetch the previous issue's integrity hash does not filter out NULL values. Since the Telegram bot creates Issues without computing integrity_hash (both integrity_hash and previous_integrity_hash remain NULL), the query could return a NULL hash as the previous hash. This would break the blockchain chain logic.
Add a filter to exclude issues where `integrity_hash` is NULL: apply `.filter(Issue.integrity_hash.isnot(None))` before `.order_by(Issue.id.desc()).first()`.
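A minimal in-memory analogue of the suggested fix, with plain dicts standing in for `Issue` rows (the query comment mirrors the suggested SQLAlchemy form):

```python
issues = [
    {"id": 1, "integrity_hash": "abc123"},
    {"id": 2, "integrity_hash": None},  # created by the Telegram bot, never hashed
    {"id": 3, "integrity_hash": None},
]

def latest_non_null_hash(records):
    # Mirrors: db.query(Issue.integrity_hash)
    #            .filter(Issue.integrity_hash.isnot(None))
    #            .order_by(Issue.id.desc()).first()
    for rec in sorted(records, key=lambda r: r["id"], reverse=True):
        if rec["integrity_hash"] is not None:
            return rec["integrity_hash"]
    return ""

prev_hash = latest_non_null_hash(issues)
```

Without the NULL filter, the newest row (id 3) would be picked and the chain would link to a NULL hash.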
```python
# Explicit cryptographic link to previous report and geographic context
lat_str = f"{latitude:.7f}" if latitude is not None else "0.0000000"
lon_str = f"{longitude:.7f}" if longitude is not None else "0.0000000"
hash_content = f"{description}|{category}|{lat_str}|{lon_str}|{prev_hash}"
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
```
The hash format has been changed to include lat/lon coordinates. This is a breaking change that will cause verification to fail for all existing issues in the database that were created with the old hash format (without lat/lon). There's no migration strategy or backward compatibility handling for existing hashes.
Consider either:
- Adding a data migration to recompute all existing integrity_hash values with the new format, or
- Implementing version detection in the verification logic to support both old and new hash formats based on a version field or hash pattern, or
- Documenting that this change intentionally invalidates all existing blockchain verifications as a one-time security upgrade.
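The second option (supporting both formats via a fallback) can be sketched as follows; `verify_issue` and the dict records are hypothetical stand-ins for the real model and endpoint:

```python
import hashlib

def sha256_hex(s):
    return hashlib.sha256(s.encode()).hexdigest()

def verify_issue(issue, prev_hash):
    # New format: description|category|lat|lon|prev_hash
    lat_str = f"{issue['latitude']:.7f}" if issue["latitude"] is not None else "0.0000000"
    lon_str = f"{issue['longitude']:.7f}" if issue["longitude"] is not None else "0.0000000"
    new_format = sha256_hex(
        f"{issue['description']}|{issue['category']}|{lat_str}|{lon_str}|{prev_hash}"
    )
    if issue["integrity_hash"] == new_format:
        return True
    # Fallback for pre-migration records hashed as description|category|prev_hash
    old_format = sha256_hex(f"{issue['description']}|{issue['category']}|{prev_hash}")
    return issue["integrity_hash"] == old_format

# A record created under the old scheme still verifies via the fallback
old_issue = {
    "description": "Pothole", "category": "Road",
    "latitude": 19.0760, "longitude": 72.8777,
    "integrity_hash": sha256_hex("Pothole|Road|"),
}
```

Genuine tampering is still caught, because a modified description matches neither format.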
```python
def test_blockchain_verification_failure(client, db_session):
    # Create an issue with a tampered hash
    lat, lon = 19.0760, 72.8777
    issue = Issue(
        description="Tampered issue",
        category="Road",
        latitude=lat,
        longitude=lon,
        integrity_hash="invalidhash",
        previous_integrity_hash=""
    )
    db_session.add(issue)
    db_session.commit()
```
Missing test coverage for the scenario where an issue has NULL latitude/longitude coordinates. The hash computation uses "0.0000000" as a fallback for NULL coordinates, but there's no test case verifying that this works correctly during both hash creation and verification.
Add a test case that creates and verifies an issue with latitude=None and longitude=None to ensure the "0.0000000" fallback works as expected.
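A sketch of such a test, written against a standalone copy of the hashing logic rather than the real endpoint and fixtures:

```python
import hashlib

def compute_integrity_hash(description, category, latitude, longitude, prev_hash):
    lat_str = f"{latitude:.7f}" if latitude is not None else "0.0000000"
    lon_str = f"{longitude:.7f}" if longitude is not None else "0.0000000"
    content = f"{description}|{category}|{lat_str}|{lon_str}|{prev_hash}"
    return hashlib.sha256(content.encode()).hexdigest()

def test_null_coordinates_use_fallback():
    # An issue with no GPS fix should hash with the "0.0000000" placeholder
    h = compute_integrity_hash("No-GPS report", "Water", None, None, "")
    expected = hashlib.sha256(
        "No-GPS report|Water|0.0000000|0.0000000|".encode()
    ).hexdigest()
    assert h == expected

test_null_coordinates_use_fallback()
```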
Optimized the blockchain verification process to O(1) by denormalizing the previous hash. Fixed deployment issues on Render by: 1. Setting PYTHONPATH to '.' in render.yaml to ensure correct package imports. 2. Pinning bcrypt<4.0.0 in requirements-render.txt to resolve passlib compatibility issues. 3. Splitting python-jose and cryptography for better slug stability. 4. Including geographic context in the blockchain integrity seal.
This PR finalizes the blockchain optimization and ensures successful Render deployment by: 1. Denormalizing the previous hash link in the Issue model for O(1) blockchain verification. 2. Including geographic context (lat/lon) in the cryptographic integrity seal. 3. Comprehensively updating requirements-render.txt with all necessary production dependencies (psycopg2-binary, pywebpush, python-dotenv, etc.). 4. Correcting PYTHONPATH and start command in render.yaml for standard root-based execution. 5. Pinning bcrypt<4.0.0 to resolve compatibility issues with passlib on Render.
🔍 Quality Reminder
1 issue found across 2 files (changes from recent commits).
```text
python-jose
cryptography
passlib
bcrypt<4.0.0
```
P1: `bcrypt<4.0.0` is incompatible with Python 3.12+ (the project's target runtime per the README). bcrypt 3.x fails to build on Python 3.12 (pyca/bcrypt#666); Python 3.12 support was added in bcrypt 4.1.1. This constraint will cause installation failures on Render deployment. Consider using `bcrypt>=4.0.1` and ensuring passlib compatibility, or switching to a maintained passlib fork like `passlib>=1.7.4` (or `pwdlib`) that works with bcrypt 4.x.
2 issues found across 5 files
```diff
- # Recompute hash based on current data and previous hash
- # Chaining logic: hash(description|category|prev_hash)
- hash_content = f"{current_issue.description}|{current_issue.category}|{prev_hash}"
+ hash_content = f"{current_issue.description}|{current_issue.category}|{lat_str}|{lon_str}|{prev_hash}"
```
P1: Breaking change: verification will report false tampering for all pre-existing records. Old records were hashed as `description|category|prev_hash` (without lat/lon) and have `previous_integrity_hash = NULL`. The new verification formula includes `|lat|lon|` and reads `previous_integrity_hash` from the record, so the recomputed hash will never match the stored one. A data migration is needed to backfill `previous_integrity_hash` and recompute `integrity_hash` for existing records, or the verification logic needs a fallback to the old formula for records without `previous_integrity_hash`.
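The migration option could look roughly like this; `backfill` and `new_hash` are hypothetical names operating on plain dicts as stand-ins for `Issue` rows:

```python
import hashlib

def new_hash(desc, cat, lat, lon, prev):
    lat_s = f"{lat:.7f}" if lat is not None else "0.0000000"
    lon_s = f"{lon:.7f}" if lon is not None else "0.0000000"
    return hashlib.sha256(f"{desc}|{cat}|{lat_s}|{lon_s}|{prev}".encode()).hexdigest()

def backfill(records):
    # Walk the chain in id order, recomputing every hash with the new
    # formula and storing each predecessor's recomputed hash.
    prev = ""
    for rec in sorted(records, key=lambda r: r["id"]):
        if rec["integrity_hash"] is None:
            continue  # unhashed rows (e.g. Telegram bot) stay outside the chain
        rec["previous_integrity_hash"] = prev
        rec["integrity_hash"] = new_hash(rec["description"], rec["category"],
                                         rec["latitude"], rec["longitude"], prev)
        prev = rec["integrity_hash"]
    return records
```

Note that recomputing every hash intentionally replaces the old chain, so this only makes sense as the documented one-time upgrade path.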
```python
# Chaining logic: hash(description|category|lat|lon|prev_hash)
lat_str = f"{current_issue.latitude:.7f}" if current_issue.latitude is not None else "0.0000000"
lon_str = f"{current_issue.longitude:.7f}" if current_issue.longitude is not None else "0.0000000"
prev_hash = current_issue.previous_integrity_hash or ""
```
P2: Security regression: denormalizing previous_integrity_hash weakens the blockchain chain's tamper-detection guarantee. Previously, verification dynamically fetched the predecessor's hash, so tampering with record N would cause record N+1's verification to fail (the recomputed hash would use N's changed hash and mismatch N+1's stored hash). Now that previous_integrity_hash is stored in the same table, an attacker with DB write access can update both a record's hash and the next record's previous_integrity_hash, silently breaking the chain without any verification failure. Consider keeping the cross-record query as a secondary verification step, or at minimum document this trade-off.
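The suggested secondary check can be sketched as a full-chain walk that compares each record's stored `previous_integrity_hash` against the predecessor's actual `integrity_hash` (`chain_consistent` is a hypothetical name; dicts stand in for rows):

```python
def chain_consistent(records):
    # Each record's stored previous_integrity_hash must match the actual
    # integrity_hash of its predecessor; O(N), suited to a background audit
    # rather than the per-request O(1) verification path.
    prev = ""
    for rec in sorted(records, key=lambda r: r["id"]):
        if rec["previous_integrity_hash"] != prev:
            return False
        prev = rec["integrity_hash"]
    return True
```

Running this periodically restores the cross-record tamper detection that denormalization gives up, without reintroducing the extra query on the hot path.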
Identified a performance bottleneck in the blockchain verification logic where an extra query was required to fetch the previous block's hash. Implemented denormalization by storing the `previous_integrity_hash` directly in the `Issue` record. This reduces the database round-trips for verification by 50%. Also strengthened the hash by adding geographic context (lat/lon). Verified with updated unit tests.

PR created automatically by Jules for task 15545897793084615460 started by @RohanExploit
Summary by cubic
Optimized blockchain integrity verification to O(1) by storing previous_integrity_hash and hashing description|category|lat|lon|prev_hash. Reduced verification DB calls from 2 to 1, strengthened tamper resistance, and finalized Render deployment fixes (PYTHONPATH='.', bcrypt<4.0.0, updated production dependencies).
Written for commit c9beba2. Summary will update on new commits.