
⚡ Bolt: Optimize blockchain verification to O(1) by storing previous hash#474

Open
RohanExploit wants to merge 4 commits into main from
bolt-optimize-blockchain-o1-verification-15333545155765582858

Conversation

Owner

@RohanExploit RohanExploit commented Feb 25, 2026

Identified and optimized a performance bottleneck in the blockchain integrity verification process. By storing the hash of the preceding record directly in the current issue (as a back-link), the system now performs verification with a single database lookup instead of two, achieving $O(1)$ database roundtrip efficiency for the link verification. Indices were also added to ensure fast lookups on audit trails. Full backward compatibility is maintained via a fallback mechanism for legacy records.
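The back-link scheme described above can be sketched in a few lines. The hash construction below (SHA-256 over the payload joined with the previous hash) is illustrative only, not the exact formula used in the codebase:

```python
import hashlib

def compute_hash(payload: str, previous_hash: str) -> str:
    """Hash the record's payload together with its back-link."""
    return hashlib.sha256(f"{payload}|{previous_hash}".encode()).hexdigest()

def verify_record(payload: str, stored_hash: str, previous_hash: str) -> bool:
    """O(1) check: recompute from the stored back-link, no second lookup needed."""
    return compute_hash(payload, previous_hash) == stored_hash

# Chain two records: the second stores the first's hash as its back-link.
h1 = compute_hash("pothole on MG Road", "")
h2 = compute_hash("broken streetlight", h1)
assert verify_record("broken streetlight", h2, h1)
assert not verify_record("tampered description", h2, h1)
```

Because the parent hash is stored on the record itself, verification needs only the one row already fetched, instead of a second query for the previous record.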


PR created automatically by Jules for task 15333545155765582858 started by @RohanExploit


Summary by cubic

Stores each issue’s previous hash to make blockchain verification O(1) with a single DB lookup. Updates the verification API, expands migrations with indexes and new tracking fields, keeps a fallback for legacy records, and stabilizes Render builds and CORS.

  • New Features

    • Added previous_integrity_hash to Issue and populate it on creation.
    • Verification API now includes previous_hash in the response.
    • Tests cover O(1) storage and verification, including previous_hash in responses.
  • Refactors

    • Migrations hardened: set String/VARCHAR(255), added indexes on integrity_hash/previous_integrity_hash, and added reference_id (unique), verified_at, assigned_at, resolved_at, assigned_to; standardized lengths for location/audio paths.
    • Deployment/runtime fixes: pinned bcrypt<4.0.0, split python-jose/cryptography, removed async_lru and indic-nlp-library, added a simple TTL cache in gemini_summary, and made CORS safer when FRONTEND_URL is missing in production.

Written for commit a079f17. Summary will update on new commits.

Summary by CodeRabbit

  • New Features
    • Records now store the previous integrity hash, enabling O(1) integrity verification and exposing previous_hash in verification responses.
  • Bug Fixes / Performance
    • Verification path optimized to avoid multi-step cross-record lookups, improving verification speed and scalability.
  • Documentation
    • Added documented entry describing the chaining-based O(1) integrity verification approach.

…hash

- Added `previous_integrity_hash` to `Issue` model to allow single-query verification.
- Added database indexes to `integrity_hash` and `previous_integrity_hash` columns.
- Refactored `verify_blockchain_integrity` to use the stored previous hash, reducing DB roundtrips.
- Updated `create_issue` to populate the new field during report creation.
- Enhanced `BlockchainVerificationResponse` schema for transparency.
- Updated blockchain tests to verify the O(1) storage and verification logic.
- Implemented database migration in `init_db.py` for indexes.
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings February 25, 2026 14:07
@netlify

netlify bot commented Feb 25, 2026

👷 Deploy Preview for fixmybharat is processing.

🔨 Latest commit: a079f17
🔍 Latest deploy log: https://app.netlify.com/projects/fixmybharat/deploys/699f0e40afef900008750b85

@github-actions

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

@coderabbitai

coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Adds O(1) blockchain-style chaining by storing previous_integrity_hash on Issue records, creates indexes for O(1) lookups, updates creation and verification logic and schemas, adds tests and documentation, and adjusts ancillary modules (CORS, caching, requirements).

Changes

  • Database schema & migrations (backend/models.py, backend/init_db.py): add previous_integrity_hash column (String(255), indexed); change integrity_hash to String(255) and index it; add CREATE INDEX IF NOT EXISTS for integrity_hash and previous_integrity_hash; adjust several VARCHAR lengths.
  • API, router & schema (backend/routers/issues.py, backend/schemas.py): persist previous_integrity_hash when creating issues; the verification endpoint reads the stored previous hash (with a legacy fallback) and returns previous_hash in BlockchainVerificationResponse.
  • Tests (tests/test_blockchain.py): add test_blockchain_o1_storage asserting previous_integrity_hash is populated and that the verification response includes previous_hash and is valid.
  • Docs / design note (.jules/bolt.md): add an entry describing "Chaining for O(1) Integrity Verification" dated 2024-05-30.
  • App startup & infra (backend/main.py, backend/requirements-render.txt): refine FRONTEND_URL/CORS handling (prod vs dev branching); adjust dependency pins and removals (remove async_lru, separate python-jose/cryptography, passlib/bcrypt changes).
  • MLA summary caching (backend/gemini_summary.py): replace async_lru with a simple in-module timestamped cache and fallback behavior on generation failure.
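The idempotent index migration summarized above can be sketched against an in-memory SQLite database. The table shape and index names here are assumptions for illustration; the real backend/init_db.py targets the project's actual schema:

```python
# Illustrative sketch only: real init_db.py targets the project's actual DB.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS issues ("
    "id INTEGER PRIMARY KEY, "
    "integrity_hash VARCHAR(255), "
    "previous_integrity_hash VARCHAR(255))"
)
# IF NOT EXISTS keeps the migration safe to re-run on existing databases.
conn.execute("CREATE INDEX IF NOT EXISTS ix_issues_integrity_hash "
             "ON issues (integrity_hash)")
conn.execute("CREATE INDEX IF NOT EXISTS ix_issues_previous_integrity_hash "
             "ON issues (previous_integrity_hash)")
conn.commit()

indexes = {row[1] for row in conn.execute("PRAGMA index_list('issues')")}
```

Re-running the same statements on an existing database is a no-op, which is what makes this style of migration safe for both fresh and upgraded deployments.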

Sequence Diagram

sequenceDiagram
    actor Client
    participant Router as Issue Router
    participant DB as Database

    Note over Client,DB: Create Issue (chaining)
    Client->>Router: POST /issues (payload)
    Router->>DB: SELECT latest Issue.integrity_hash
    DB-->>Router: latest_integrity_hash
    Router->>DB: INSERT Issue (integrity_hash, previous_integrity_hash=latest_integrity_hash)
    DB-->>Router: Insert OK
    Router-->>Client: 201 Created

    Note over Client,DB: O(1) Verification
    Client->>Router: GET /blockchain/verify/{issue_id}
    Router->>DB: SELECT Issue (integrity_hash, previous_integrity_hash) WHERE id=issue_id
    DB-->>Router: current_issue (with previous_integrity_hash)
    Router->>Router: Recompute hash using previous_integrity_hash
    Router->>Router: Compare recomputed vs stored integrity_hash
    Router-->>Client: BlockchainVerificationResponse { previous_hash, computed_hash, current_hash, is_valid }

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

ECWoC26-L3, size/m

Poem

🐰
I hop through records, tail held high,
Each hash a breadcrumb in the sky.
Previous links make checks so light,
One lookup, one hop — everything's right! 🥕🔗

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 55.56%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the PR title clearly and specifically describes the main optimization (adding previous-hash storage to achieve O(1) blockchain verification), which directly aligns with the core changes across all modified files.
  • Description check ✅ Passed: the PR description is comprehensive, covering objectives, changes, testing, and deployment considerations, though it lacks explicit checklist completion and formal template structure.




Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes the blockchain integrity verification system by storing the previous issue's integrity hash directly in each issue record, achieving O(1) verification performance instead of O(N). The optimization maintains full backward compatibility through a fallback mechanism for legacy records that don't have the cached previous hash.

Changes:

  • Added previous_integrity_hash column to the Issue model with index for efficient lookups
  • Updated issue creation to store the previous issue's hash at the time of creation
  • Modified verification endpoint to use the stored hash with fallback to legacy query for old records
  • Added new API response field to expose the previous hash in verification results

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

  • backend/models.py: added previous_integrity_hash column with index to the Issue model
  • backend/init_db.py: added database migration for the new column and indexes on both hash columns
  • backend/routers/issues.py: updated issue creation to store the previous hash; optimized verification to O(1) with a legacy fallback
  • backend/schemas.py: added previous_hash field to BlockchainVerificationResponse
  • tests/test_blockchain.py: added a new test verifying O(1) storage and previous_hash in the response
  • .jules/bolt.md: documented the optimization learning (with minor formatting issues)


**Action:** Always maintain strict return type consistency for core utilities. Use type hints and verify all call sites when changing a function's signature. Ensure that performance-oriented optimizations (like returning multiple processed formats) are applied uniformly.

## 2024-05-30 - Chaining for O(1) Integrity Verification
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.
Copilot AI Feb 25, 2026

There's a formatting issue: "(1)$" should be "O(1)" to properly denote Big O notation. The dollar sign appears to be a typo or formatting error.

Suggested change
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single O(1) database lookup.

Copilot uses AI. Check for mistakes.
Comment on lines +199 to +200
integrity_hash=integrity_hash,
previous_integrity_hash=prev_hash
Copilot AI Feb 25, 2026

The blockchain optimization is only being applied to web-based issue creation in this endpoint. However, bot.py (line 74-79) and voice.py (line 260-276) also create Issues but do not compute or store integrity_hash or previous_integrity_hash. This creates an inconsistent blockchain where some issues are chained and others are not. Consider either: 1) Implementing blockchain hashing for all issue creation paths to maintain a complete chain, or 2) Documenting this as an intentional design decision if only web submissions need blockchain verification.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/routers/issues.py (1)

173-177: ⚠️ Potential issue | 🟠 Major

Chain head assignment is non-atomic, allowing concurrent forked links.

prev_hash is read and then reused for insert without serialization. Concurrent requests can bind to the same parent hash, creating multiple children that all verify as valid, which breaks strict linear-chain semantics.

Please serialize head selection + insert in one critical section (e.g., DB advisory lock or equivalent transactional chain-head lock).

Also applies to: 199-200

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/issues.py` around lines 173 - 177, The chain-head read
(prev_issue/prev_hash computed via
run_in_threadpool/db.query(Issue.integrity_hash)) is non-atomic with the
subsequent insert and must be serialized; wrap the head-selection and the insert
that uses prev_hash in a single DB-critical section (e.g., acquire a DB
advisory/transactional lock or use SELECT ... FOR UPDATE within a transaction)
so no two requests can read the same head and create concurrent children. Modify
the code around the prev_issue/prev_hash lookup and the later insert routine to
begin a transaction or obtain an advisory lock (using your DB's advisory lock
API), re-read the head inside that lock/transaction, compute prev_hash, perform
the insert, then release/commit; apply the same fix to the other occurrence that
reads prev_hash (the block referenced near lines 199-200).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.jules/bolt.md:
- Line 42: Replace the accidental complexity token "single (1)$ database lookup"
in the learning note with the correct notation, e.g. "single O(1) database
lookup" (or "constant-time (O(1)) database lookup"); locate the exact phrase
"single (1)$ database lookup" in the text and update it to "single O(1) database
lookup" to fix the typo.

In `@tests/test_blockchain.py`:
- Line 98: Replace the explicit boolean comparison in the test assertion —
change the line asserting data["is_valid"] == True to use a direct truth
assertion (i.e., assert data["is_valid"]) in tests/test_blockchain.py so the
test uses Python idiomatic truthiness for the data["is_valid"] value.

---

Outside diff comments:
In `@backend/routers/issues.py`:
- Around line 173-177: The chain-head read (prev_issue/prev_hash computed via
run_in_threadpool/db.query(Issue.integrity_hash)) is non-atomic with the
subsequent insert and must be serialized; wrap the head-selection and the insert
that uses prev_hash in a single DB-critical section (e.g., acquire a DB
advisory/transactional lock or use SELECT ... FOR UPDATE within a transaction)
so no two requests can read the same head and create concurrent children. Modify
the code around the prev_issue/prev_hash lookup and the later insert routine to
begin a transaction or obtain an advisory lock (using your DB's advisory lock
API), re-read the head inside that lock/transaction, compute prev_hash, perform
the insert, then release/commit; apply the same fix to the other occurrence that
reads prev_hash (the block referenced near lines 199-200).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39d5fbb and edce47c.

📒 Files selected for processing (6)
  • .jules/bolt.md
  • backend/init_db.py
  • backend/models.py
  • backend/routers/issues.py
  • backend/schemas.py
  • tests/test_blockchain.py

**Action:** Always maintain strict return type consistency for core utilities. Use type hints and verify all call sites when changing a function's signature. Ensure that performance-oriented optimizations (like returning multiple processed formats) are applied uniformly.

## 2024-05-30 - Chaining for O(1) Integrity Verification
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.

⚠️ Potential issue | 🟡 Minor

Fix the complexity notation typo in the learning note.

single (1)$ database lookup looks accidental; use single O(1) database lookup (or equivalent wording).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md at line 42, Replace the accidental complexity token "single
(1)$ database lookup" in the learning note with the correct notation, e.g.
"single O(1) database lookup" (or "constant-time (O(1)) database lookup");
locate the exact phrase "single (1)$ database lookup" in the text and update it
to "single O(1) database lookup" to fix the typo.

response = client.get(f"/api/issues/{issue2_id}/blockchain-verify")
assert response.status_code == 200
data = response.json()
assert data["is_valid"] == True

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, check if the file exists and read the specific line
if [ -f tests/test_blockchain.py ]; then
  echo "=== Line 98 in tests/test_blockchain.py ==="
  sed -n '98p' tests/test_blockchain.py
  echo ""
  echo "=== Context around line 98 (lines 95-101) ==="
  sed -n '95,101p' tests/test_blockchain.py
  echo ""
  echo "=== Search for all assert...==True patterns in the file ==="
  rg -n 'assert\s+.+==\s*True' tests/test_blockchain.py
else
  echo "File tests/test_blockchain.py not found"
fi

Repository: RohanExploit/VishwaGuru

Length of output: 650


Use direct truth assertion for is_valid.

Avoid explicit comparison to True in assertions; prefer assert data["is_valid"] to satisfy linting and follow Python idioms.

✅ Minimal fix
-    assert data["is_valid"] == True
+    assert data["is_valid"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert data["is_valid"] == True
assert data["is_valid"]
🧰 Tools
🪛 Ruff (0.15.2)

[error] 98-98: Avoid equality comparisons to `True`; use `if data["is_valid"]:` for truth checks

Replace with data["is_valid"]

(E712)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_blockchain.py` at line 98, Replace the explicit boolean comparison
in the test assertion — change the line asserting data["is_valid"] == True to
use a direct truth assertion (i.e., assert data["is_valid"]) in
tests/test_blockchain.py so the test uses Python idiomatic truthiness for the
data["is_valid"] value.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 6 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/routers/issues.py">

<violation number="1" location="backend/routers/issues.py:639">
P1: The blockchain verification no longer validates that the stored previous_integrity_hash matches the actual previous issue’s integrity_hash, so tampering with the prior issue can go undetected when verifying this issue. Consider validating the backlink against the real previous issue before reporting success.</violation>
</file>

<file name=".jules/bolt.md">

<violation number="1" location=".jules/bolt.md:42">
P3: Fix the malformed inline formatting in this sentence so the learning note renders correctly (remove the stray `$` or use proper `O(1)` formatting).</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

)
prev_hash = prev_issue_hash[0] if prev_issue_hash and prev_issue_hash[0] else ""
else:
prev_hash = current_issue.previous_integrity_hash
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 25, 2026

P1: The blockchain verification no longer validates that the stored previous_integrity_hash matches the actual previous issue’s integrity_hash, so tampering with the prior issue can go undetected when verifying this issue. Consider validating the backlink against the real previous issue before reporting success.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/issues.py, line 639:

<comment>The blockchain verification no longer validates that the stored previous_integrity_hash matches the actual previous issue’s integrity_hash, so tampering with the prior issue can go undetected when verifying this issue. Consider validating the backlink against the real previous issue before reporting success.</comment>

<file context>
@@ -615,24 +616,27 @@ def get_user_issues(
+        )
+        prev_hash = prev_issue_hash[0] if prev_issue_hash and prev_issue_hash[0] else ""
+    else:
+        prev_hash = current_issue.previous_integrity_hash
 
     # Recompute hash based on current data and previous hash
</file context>

**Action:** Always maintain strict return type consistency for core utilities. Use type hints and verify all call sites when changing a function's signature. Ensure that performance-oriented optimizations (like returning multiple processed formats) are applied uniformly.

## 2024-05-30 - Chaining for O(1) Integrity Verification
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 25, 2026

P3: Fix the malformed inline formatting in this sentence so the learning note renders correctly (remove the stray $ or use proper O(1) formatting).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .jules/bolt.md, line 42:

<comment>Fix the malformed inline formatting in this sentence so the learning note renders correctly (remove the stray `$` or use proper `O(1)` formatting).</comment>

<file context>
@@ -37,3 +37,7 @@
 **Action:** Always maintain strict return type consistency for core utilities. Use type hints and verify all call sites when changing a function's signature. Ensure that performance-oriented optimizations (like returning multiple processed formats) are applied uniformly.
+
+## 2024-05-30 - Chaining for O(1) Integrity Verification
+**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.
+**Action:** Always store the hash of the preceding record in the current record if integrity chaining is required, allowing for immediate verification without secondary queries.
</file context>
Suggested change
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single (1)$ database lookup.
**Learning:** Chained data structures (like blockchains) that require cross-record lookups for verification can suffer from O(N) or O(log N) latency as the dataset grows. Storing the "back-link" (the previous hash) directly on the current record transforms verification into a single O(1) database lookup.

- Pinned `bcrypt<4.0.0` in `backend/requirements-render.txt` to avoid build failures on Render.
- Split `python-jose` and `cryptography` for cleaner dependency management.
- Refined `backend/init_db.py` to use `VARCHAR(255)` for better compatibility.
- Updated `Issue` model to use `String(255)` for hash columns.
- Verified app startup and migration logic on fresh and existing databases.
- Optimized blockchain integrity verification from O(log N) to O(1) by storing `previous_integrity_hash` on each report.
- Added database indexes for `integrity_hash` and `previous_integrity_hash`.
- Fixed Render deployment by pinning `bcrypt<4.0.0` and removing `async_lru`.
- Hardened `init_db.py` migrations with `VARCHAR(255)` and column existence checks.
- Enhanced CORS safety in `main.py` for production environments.
- Added verification tests in `tests/test_blockchain.py`.
@github-actions

🔍 Quality Reminder

Thanks for the updates! Please ensure:
- Your changes don't break existing functionality
- All tests still pass
- Code quality standards are maintained

*The maintainers will verify that the overall project flow remains intact.*

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/gemini_summary.py">

<violation number="1" location="backend/gemini_summary.py:100">
P2: The new manual cache is unbounded and never evicts entries, so the cache can grow indefinitely with unique MLA/issue combinations and leak memory over time. Add a max size and evict old entries (or reintroduce an LRU) when storing results.</violation>
</file>


try:
return await retry_with_exponential_backoff(_generate_mla_summary_with_gemini, max_retries=2)
summary = await retry_with_exponential_backoff(_generate_mla_summary_with_gemini, max_retries=2)
_summary_cache[cache_key] = (summary, current_time)
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 25, 2026
P2: The new manual cache is unbounded and never evicts entries, so the cache can grow indefinitely with unique MLA/issue combinations and leak memory over time. Add a max size and evict old entries (or reintroduce an LRU) when storing results.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/gemini_summary.py, line 100:

<comment>The new manual cache is unbounded and never evicts entries, so the cache can grow indefinitely with unique MLA/issue combinations and leak memory over time. Add a max size and evict old entries (or reintroduce an LRU) when storing results.</comment>

<file context>
@@ -84,7 +96,9 @@ async def _generate_mla_summary_with_gemini() -> str:
     try:
-        return await retry_with_exponential_backoff(_generate_mla_summary_with_gemini, max_retries=2)
+        summary = await retry_with_exponential_backoff(_generate_mla_summary_with_gemini, max_retries=2)
+        _summary_cache[cache_key] = (summary, current_time)
+        return summary
     except Exception as e:
</file context>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/gemini_summary.py`:
- Around line 48-50: The manual _summary_cache dict with SUMMARY_CACHE_TTL is
unbounded and can grow forever; replace it with a bounded cache (e.g., an LRU or
TTL-aware bounded structure) and evict old entries when the size exceeds a
MAX_CACHE_SIZE. Concretely: replace _summary_cache usage with a bounded cache
implementation (for example collections.OrderedDict-based LRU or
cachetools.TTLCache) keyed the same way, set a MAX_CACHE_SIZE constant (e.g.,
1000), ensure entries also respect SUMMARY_CACHE_TTL, and update any places that
read/write _summary_cache (symbols: _summary_cache and SUMMARY_CACHE_TTL and the
functions that access them around the existing uses) to use the new cache API so
old/rare keys are evicted automatically. Ensure thread/async safety if the
module is used concurrently.
- Line 71: The current cache_key is built by concatenating fields into a string
(cache_key = f"{district}_{assembly_constituency}_{mla_name}_{issue_category}")
which can collide if values contain underscores; change cache_key to use an
immutable tuple (e.g., (district, assembly_constituency, mla_name,
issue_category)) wherever the key is created and looked up, and update any cache
get/set usages that reference cache_key so they use the tuple form instead of
the concatenated string.
- Around line 99-101: When GEMINI is disabled (genai is None), avoid calling
retry_with_exponential_backoff(_generate_mla_summary_with_gemini, ...) and
returning after delay; instead detect genai is None before the retry and
immediately return the fallback summary (and update _summary_cache[cache_key] if
existing code caches fallbacks) so no retry/delay occurs; specifically modify
the block around retry_with_exponential_backoff to short-circuit when genai is
None, calling the same fallback-path used elsewhere rather than invoking
_generate_mla_summary_with_gemini.

In `@backend/main.py`:
- Around line 143-147: Replace the startup ValueError for invalid FRONTEND_URL
with a critical log and a safe fallback: instead of raising when frontend_url
does not start with "http://" or "https://", call the application's logger to
log a critical message that includes the invalid frontend_url and then set
allowed_origins = [] (maintaining the same fail-closed behavior as missing
values); update the check around frontend_url and allowed_origins in main.py to
perform this logging+fallback rather than raising so a config typo won't crash
the service.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between edce47c and a10dad5.

📒 Files selected for processing (5)
  • backend/gemini_summary.py
  • backend/init_db.py
  • backend/main.py
  • backend/models.py
  • backend/requirements-render.txt
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/init_db.py
  • backend/models.py

Comment on lines +48 to +50
# Simple cache to avoid async-lru dependency
_summary_cache = {}
SUMMARY_CACHE_TTL = 86400 # 24 hours

⚠️ Potential issue | 🟠 Major

The manual cache is unbounded and can grow indefinitely.

The TTL check prevents stale reuse, but it never evicts old or rarely hit keys, so memory can grow without bound under diverse request inputs.

Suggested change
 _summary_cache = {}
 SUMMARY_CACHE_TTL = 86400  # 24 hours
+SUMMARY_CACHE_MAX_ENTRIES = 5000
@@
-    if cache_key in _summary_cache:
+    # Opportunistic cleanup
+    if len(_summary_cache) > SUMMARY_CACHE_MAX_ENTRIES:
+        expired = [k for k, (_, ts) in _summary_cache.items() if current_time - ts >= SUMMARY_CACHE_TTL]
+        for k in expired:
+            _summary_cache.pop(k, None)
+
+    if cache_key in _summary_cache:
         val, ts = _summary_cache[cache_key]
         if current_time - ts < SUMMARY_CACHE_TTL:
             return val
@@
-        _summary_cache[cache_key] = (summary, current_time)
+        _summary_cache[cache_key] = (summary, current_time)

Also applies to: 74-77, 100-100


Returns:
A short paragraph describing the MLA's role and responsibilities
"""
cache_key = f"{district}_{assembly_constituency}_{mla_name}_{issue_category}"

⚠️ Potential issue | 🟡 Minor

Use a tuple cache key to avoid accidental collisions.

Line 71 concatenates the values with `_`, which can collide when the fields themselves contain underscores. A tuple key is safer.

Suggested change
-    cache_key = f"{district}_{assembly_constituency}_{mla_name}_{issue_category}"
+    cache_key = (district, assembly_constituency, mla_name, issue_category)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cache_key = f"{district}_{assembly_constituency}_{mla_name}_{issue_category}"
cache_key = (district, assembly_constituency, mla_name, issue_category)
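The collision is easy to demonstrate; the field values below are made up for illustration:

```python
# Two distinct inputs collide once the fields are joined with "_":
district1, constituency1 = "a_b", "c"
district2, constituency2 = "a", "b_c"

string_key1 = f"{district1}_{constituency1}"
string_key2 = f"{district2}_{constituency2}"
assert string_key1 == string_key2 == "a_b_c"  # collision: one cache slot for two inputs

# Tuple keys keep the fields separate and are hashable, so they
# work directly as dict keys with no collision:
tuple_key1 = (district1, constituency1)
tuple_key2 = (district2, constituency2)
assert tuple_key1 != tuple_key2
```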

Comment on lines +99 to +101
summary = await retry_with_exponential_backoff(_generate_mla_summary_with_gemini, max_retries=2)
_summary_cache[cache_key] = (summary, current_time)
return summary

⚠️ Potential issue | 🟠 Major

Skip retry path when Gemini is disabled.

When GEMINI_API_KEY is absent (`genai = None`), line 99 still runs the retry loop and delays the response before falling back. Return the fallback immediately in that mode.

Suggested change
 async def generate_mla_summary(
@@
-    async def _generate_mla_summary_with_gemini() -> str:
+    if genai is None:
+        return _get_fallback_summary(mla_name, assembly_constituency, district)
+
+    async def _generate_mla_summary_with_gemini() -> str:
🧰 Tools
🪛 Ruff (0.15.2)

[warning] 101-101: Consider moving this statement to an else block

(TRY300)

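A minimal sketch of the short-circuit, assuming the `genai` sentinel the comment describes; `_get_fallback_summary` and its wording are stand-ins, not the module's exact signatures:

```python
import asyncio

# `genai` stands in for the SDK handle; it is None when GEMINI_API_KEY is absent.
genai = None

def _get_fallback_summary(mla_name, constituency, district):
    # Hypothetical fallback text, standing in for the module's real one.
    return f"{mla_name} represents {constituency} in {district}."

async def generate_mla_summary(mla_name, constituency, district):
    # Short-circuit before any retry/backoff machinery when Gemini is
    # disabled, so callers get the fallback with no artificial delay.
    if genai is None:
        return _get_fallback_summary(mla_name, constituency, district)
    raise NotImplementedError("retry_with_exponential_backoff path elided")
```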

Comment on lines +143 to +147
if not (frontend_url.startswith("http://") or frontend_url.startswith("https://")):
raise ValueError(
f"FRONTEND_URL must be a valid HTTP/HTTPS URL. Got: {frontend_url}"
)
allowed_origins = [frontend_url]

⚠️ Potential issue | 🟠 Major

Avoid hard-crashing the app on malformed FRONTEND_URL.

Line 144 raises at startup, so a config typo can take the whole service down. Prefer logging a critical message and falling back to `allowed_origins = []` (the same fail-closed behavior used for missing values).

Suggested change
-    if not (frontend_url.startswith("http://") or frontend_url.startswith("https://")):
-        raise ValueError(
-            f"FRONTEND_URL must be a valid HTTP/HTTPS URL. Got: {frontend_url}"
-        )
-    allowed_origins = [frontend_url]
+    if not (frontend_url.startswith("http://") or frontend_url.startswith("https://")):
+        logger.critical(
+            "FRONTEND_URL is invalid (%s). CORS will be disabled for safety.",
+            frontend_url,
+        )
+        allowed_origins = []
+    else:
+        allowed_origins = [frontend_url]
🧰 Tools
🪛 Ruff (0.15.2)

[warning] 144-146: Avoid specifying long messages outside the exception class

(TRY003)

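The logging-plus-fallback pattern could look like the sketch below; `compute_allowed_origins` is a hypothetical helper, not a function in `main.py`:

```python
import logging
from typing import List, Optional

logger = logging.getLogger("app")

def compute_allowed_origins(frontend_url: Optional[str]) -> List[str]:
    """Fail closed: a missing or malformed FRONTEND_URL yields no CORS origins."""
    if not frontend_url:
        logger.critical("FRONTEND_URL is not set; CORS will be disabled.")
        return []
    if not frontend_url.startswith(("http://", "https://")):
        # Log loudly instead of raising, so a typo degrades CORS
        # rather than crashing the service at startup.
        logger.critical("FRONTEND_URL is invalid (%s); CORS will be disabled.", frontend_url)
        return []
    return [frontend_url]
```

The returned list would then feed the existing CORS middleware setup unchanged.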

- Added missing columns (`reference_id`, `verified_at`, `assigned_at`, `resolved_at`, `assigned_to`) to `migrate_db` in `init_db.py`.
- Ensured `reference_id` has a unique index during migration.
- Removed unused/problematic dependencies `async_lru` and `indic-nlp-library` from ALL requirements files.
- Replaced `async_lru` with a manual async-safe cache in `gemini_summary.py`.
- Fixed `bcrypt` pinning and split optional extras in `requirements-render.txt`.
- Enhanced migration robustness with explicit `VARCHAR(255)` lengths.
- Maintained O(1) blockchain verification optimization.
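The back-link idea behind the O(1) claim can be illustrated with a self-contained hash chain. The dict keys mirror the `integrity_hash`/`previous_integrity_hash` columns from the PR, but the hashing scheme itself is a simplified stand-in for the application's actual one:

```python
import hashlib

def record_hash(payload: str, previous_hash: str) -> str:
    # Each record's hash commits to its payload and its predecessor's hash.
    return hashlib.sha256(f"{previous_hash}:{payload}".encode()).hexdigest()

# Build a small chain; each record stores its own hash AND a copy of
# the predecessor's hash (the back-link).
chain = []
prev = "0" * 64  # genesis sentinel
for payload in ["issue-1", "issue-2", "issue-3"]:
    h = record_hash(payload, prev)
    chain.append({
        "payload": payload,
        "integrity_hash": h,
        "previous_integrity_hash": prev,
    })
    prev = h

def verify_link(record: dict) -> bool:
    # Single lookup: the predecessor's hash is already stored on this row,
    # so no second DB roundtrip is needed to fetch the previous record.
    expected = record_hash(record["payload"], record["previous_integrity_hash"])
    return expected == record["integrity_hash"]

assert all(verify_link(r) for r in chain)
```

Tampering with any record's payload breaks its own link check immediately, which is what makes the per-record verification a single-row operation.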
