Skip to content

Fix FTS5 corruption on publish#768

Merged
ascorbic merged 8 commits into
mainfrom
opencode/issue649-20260425090851
May 18, 2026
Merged

Fix FTS5 corruption on publish#768
ascorbic merged 8 commits into
mainfrom
opencode/issue649-20260425090851

Conversation

@ask-bonk
Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot commented Apr 25, 2026

What does this PR do?

Fixes the SQLITE_CORRUPT_VTAB (database disk image is malformed) error reported on every publish/edit on collections with search enabled.

Root cause

The FTS5 virtual table is created with content='ec_<slug>', which makes it an external content FTS5 table — the row data lives in the content table and FTS5 only stores the inverted index. But the AFTER UPDATE/DELETE triggers used the contentless-table sync pattern:

DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;

For external-content tables, DELETE FROM fts WHERE rowid = ? causes FTS5 to read column values from the backing content table to figure out which tokens to remove. By the time an AFTER UPDATE trigger fires, the row in ec_<collection> already holds the NEW values. FTS5 therefore removes NEW tokens while the index still holds OLD tokens, drifting the inverted index out of sync until SQLite refuses further reads with SQLITE_CORRUPT_VTAB. Same problem in the AFTER DELETE trigger — by then the row is gone, so FTS5 can't find any column values to remove. This explains every observation in the issue thread:

  • Only published-content edits trigger it (each edit is an UPDATE that fires the broken trigger).
  • "Disable search" fixes it (drops the broken triggers, stopping further damage).
  • 'rebuild' works after dropping/recreating the table but not always on its own (the corruption can be in the FTS shadow tables themselves).
  • It hits 0.4.0, 0.5.0, and 0.6.0 — the trigger code is identical across those versions.

Fix

  1. Rewrote the AFTER UPDATE and AFTER DELETE triggers in packages/core/src/search/fts-manager.ts to use the documented external-content-safe pattern from https://www.sqlite.org/fts5.html#external_content_tables:

    INSERT INTO fts(fts, rowid, col1, col2) VALUES('delete', OLD.rowid, OLD.col1, OLD.col2);

    This passes the OLD column values directly so FTS5 removes the right tokens regardless of what's currently in the content table.

  2. Gated 'delete' on OLD.deleted_at IS NULL so the trigger only removes rows that were actually indexed. The INSERT trigger skips soft-deleted rows; issuing 'delete' for a never-indexed rowid is itself a corruption trigger. Without this gate, restore-from-trash, permanent-delete, and edit-while-trashed would all raise SQLITE_CORRUPT_VTAB.

  3. Added migration 039_fix_fts5_triggers that rebuilds the FTS index for every search-enabled collection on upgrade. The migration runs once at startup before any request can hit the publish/trash paths, so upgrading sites get the fix on next deploy without depending on a search-endpoint visit to trigger lazy repair. verifyAndRepairIndex at runtime stays as cheap belt-and-braces drift detection (missing table or row-count mismatch) — the expensive integrity-check probe and sqlite_master-scanning legacy-trigger detection from earlier commits have been removed.

Tests

packages/core/tests/integration/search/fts-corruption.test.ts covers the trigger fix:

  • Publish → edit → republish round-trip (the reported reproduction).
  • 25 edit/publish cycles with FTS integrity-check after each.
  • Full trash lifecycle (soft-delete → restore → integrity-check).
  • Permanent-delete of soft-deleted row.
  • Edit a row while it's in the trash.

packages/core/tests/unit/database/migrations/039_fix_fts5_triggers.test.ts covers the migration:

  • No-op on databases without search-enabled collections.
  • Rebuilds FTS index for every search-enabled collection.
  • Repairs an already-corrupted index (asserts stale tokens are gone after migration, and new tokens are present).
  • Disables FTS cleanly when no fields are searchable.
  • Ignores collections where search_config.enabled = false.
  • Idempotent re-run.

All affected tests pass; full core suite passes.

Closes #649

Type of change

  • Bug fix
  • Feature (requires maintainer-approved Discussion)
  • Refactor (no behavior change)
  • Translation
  • Documentation
  • Performance improvement
  • Tests
  • Chore (dependencies, CI, tooling)

Checklist

  • I have read CONTRIBUTING.md
  • pnpm typecheck passes
  • pnpm lint passes
  • pnpm test passes (or targeted tests for my change)
  • pnpm format has been run
  • I have added/updated tests for my changes (if applicable)
  • User-visible strings in the admin UI are wrapped for translation and pnpm locale:extract has been run (if applicable) — N/A, no admin UI changes
  • I have added a changeset (if this PR changes a published package)
  • New features link to an approved Discussion: https://github.com/emdash-cms/emdash/discussions/...

AI-generated code disclosure

  • This PR includes AI-generated code

Originally opened by /ultrabonk (Claude Opus 4.7 via opencode + ask-bonk) in response to my comment on #649. Subsequent commits — including the trash-lifecycle gate, the migration-based approach, and the test strengthening — were also AI-generated (Claude Opus 4.7 via opencode), iterated against Copilot reviewer feedback and an adversarial second-opinion review.

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 25, 2026

🦋 Changeset detected

Latest commit: 0e150f6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 13 packages
Name Type
emdash Patch
@emdash-cms/cloudflare Patch
@emdash-cms/fixture-perf-site Patch
@emdash-cms/perf-demo-site Patch
@emdash-cms/cache-demo-site Patch
@emdash-cms/admin Patch
@emdash-cms/auth Patch
@emdash-cms/blocks Patch
@emdash-cms/gutenberg-to-portable-text Patch
@emdash-cms/x402 Patch
create-emdash Patch
@emdash-cms/auth-atproto Patch
@emdash-cms/plugin-embeds Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
🔵 In progress
View logs
emdash-playground ccf2d57 Apr 25 2026, 09:22 AM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
🔵 In progress
View logs
emdash-demo-cache ccf2d57 Apr 25 2026, 09:22 AM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
🔵 In progress
View logs
emdash-perf-coordinator ccf2d57 Apr 25 2026, 09:22 AM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 25, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
emdash-demo-cache 0e150f6 May 18 2026, 06:06 PM

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 25, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Apr 25, 2026

Open in StackBlitz

@emdash-cms/admin

npm i https://pkg.pr.new/@emdash-cms/admin@768

@emdash-cms/auth

npm i https://pkg.pr.new/@emdash-cms/auth@768

@emdash-cms/blocks

npm i https://pkg.pr.new/@emdash-cms/blocks@768

@emdash-cms/cloudflare

npm i https://pkg.pr.new/@emdash-cms/cloudflare@768

emdash

npm i https://pkg.pr.new/emdash@768

create-emdash

npm i https://pkg.pr.new/create-emdash@768

@emdash-cms/gutenberg-to-portable-text

npm i https://pkg.pr.new/@emdash-cms/gutenberg-to-portable-text@768

@emdash-cms/x402

npm i https://pkg.pr.new/@emdash-cms/x402@768

@emdash-cms/plugin-ai-moderation

npm i https://pkg.pr.new/@emdash-cms/plugin-ai-moderation@768

@emdash-cms/plugin-atproto

npm i https://pkg.pr.new/@emdash-cms/plugin-atproto@768

@emdash-cms/plugin-audit-log

npm i https://pkg.pr.new/@emdash-cms/plugin-audit-log@768

@emdash-cms/plugin-color

npm i https://pkg.pr.new/@emdash-cms/plugin-color@768

@emdash-cms/plugin-embeds

npm i https://pkg.pr.new/@emdash-cms/plugin-embeds@768

@emdash-cms/plugin-forms

npm i https://pkg.pr.new/@emdash-cms/plugin-forms@768

@emdash-cms/plugin-webhook-notifier

npm i https://pkg.pr.new/@emdash-cms/plugin-webhook-notifier@768

commit: 0e150f6

@ascorbic
Copy link
Copy Markdown
Collaborator

/review

@ask-bonk
Copy link
Copy Markdown
Contributor Author

ask-bonk Bot commented Apr 25, 2026

@ascorbic Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

@ascorbic
Copy link
Copy Markdown
Collaborator

/review

@ask-bonk
Copy link
Copy Markdown
Contributor Author

ask-bonk Bot commented Apr 25, 2026

LGTM!

github run

@ascorbic
Copy link
Copy Markdown
Collaborator

/ultrareview

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes recurring SQLITE_CORRUPT_VTAB on publish/edit for collections with FTS enabled by correcting FTS5 trigger semantics for external-content tables and adding automatic detection/repair of legacy unsafe triggers/corrupted indexes.

Changes:

  • Update FTS5 UPDATE/DELETE triggers to use the external-content-safe VALUES('delete', OLD...) pattern.
  • Add legacy-trigger detection via sqlite_master inspection and rebuild on upgrade when unsafe triggers are found.
  • Add an integrity-check pass during verification and introduce integration regression tests covering corruption and auto-repair.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
packages/core/src/search/fts-manager.ts Fixes trigger behavior for external-content FTS5 tables; adds legacy trigger detection and integrity-check based repair.
packages/core/tests/integration/search/fts-corruption.test.ts Adds regression + stress tests for corruption and verifies auto-repair behavior.
.changeset/fts-corrupt-vtab-fix.md Documents the patch release and explains root cause/fix at a high level.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 135 to +141
const fieldList = searchableFields.join(", ");
const newFieldList = searchableFields.map((f) => `NEW.${f}`).join(", ");
// `'delete'` takes the FTS5 virtual table name as the first column,
// then the rowid being removed, then the OLD value of every column
// declared on the FTS5 table (in declaration order: id, locale,
// then each searchable field).
const oldFieldList = searchableFields.map((f) => `OLD.${f}`).join(", ");
Copy link

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If searchableFields can be empty, fieldList becomes an empty string and the generated SQL ends up with a trailing comma in the column list (and an extra comma in VALUES), which is a syntax error. Either enforce searchableFields.length > 0 in validateInputs (and/or at the call site), or branch the SQL generation to omit ", ${fieldList}" / ", ${oldFieldList}" when there are no searchable fields.

Copilot uses AI. Check for mistakes.
Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +165 to +166
INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});
Copy link

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If searchableFields can be empty, fieldList becomes an empty string and the generated SQL ends up with a trailing comma in the column list (and an extra comma in VALUES), which is a syntax error. Either enforce searchableFields.length > 0 in validateInputs (and/or at the call site), or branch the SQL generation to omit ", ${fieldList}" / ", ${oldFieldList}" when there are no searchable fields.

Copilot uses AI. Check for mistakes.
Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +423 to +428

const result = await sql<{ sql: string | null }>`
SELECT sql FROM sqlite_master
WHERE type = 'trigger'
AND tbl_name = ${this.getContentTableName(collectionSlug)}
AND name LIKE ${`${ftsTable}_%`}
Copy link

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQLite LIKE treats "_" as a single-character wildcard, and ftsTable names contain underscores (e.g. "_emdash_fts_pages"). This pattern can unintentionally match non-FTS triggers and cause false-positive rebuilds. Prefer matching exact trigger names (e.g. IN (${ftsTable}_insert, ${ftsTable}_update, ${ftsTable}delete)) or escape "" and "%" and add an ESCAPE clause.

Suggested change
const result = await sql<{ sql: string | null }>`
SELECT sql FROM sqlite_master
WHERE type = 'trigger'
AND tbl_name = ${this.getContentTableName(collectionSlug)}
AND name LIKE ${`${ftsTable}_%`}
const insertTrigger = `${ftsTable}_insert`;
const updateTrigger = `${ftsTable}_update`;
const deleteTrigger = `${ftsTable}_delete`;
const result = await sql<{ sql: string | null }>`
SELECT sql FROM sqlite_master
WHERE type = 'trigger'
AND tbl_name = ${this.getContentTableName(collectionSlug)}
AND name IN (${insertTrigger}, ${updateTrigger}, ${deleteTrigger})

Copilot uses AI. Check for mistakes.
Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +512 to +520
// Check: FTS5 integrity-check. This catches corruption that the row
// count check misses (e.g. orphaned tokens in segments where the
// docsize entry exists but points to garbage). Throws on a corrupt
// index; treat the throw itself as the signal to rebuild.
try {
await sql
.raw(`INSERT INTO "${ftsTable}"("${ftsTable}") VALUES('integrity-check')`)
.execute(this.db);
} catch (err) {
Copy link

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running FTS5 'integrity-check' on every verify pass can be expensive for large indexes and will directly impact startup time if ensureSearchHealthy runs at boot. Consider gating the integrity-check behind a cheaper heuristic (e.g. only after legacy-trigger detection, after row-count mismatch, or on a lower-frequency schedule), or making it configurable (e.g. "deep check" mode) to avoid consistently paying the full scan cost.

Copilot uses AI. Check for mistakes.
Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +431 to +436
const legacyMarker = `DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid`;
for (const row of result.rows) {
if (row.sql && row.sql.includes(legacyMarker)) {
return true;
}
}
Copy link

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legacy-trigger detection relies on an exact SQL substring match (including quoting and whitespace). If older shipped versions differed in formatting/quoting, hasLegacyTriggers can miss genuinely unsafe triggers and skip the rebuild. To make this more robust, consider normalizing whitespace and matching via a small regex (e.g. "DELETE\s+FROM\s+"?"?\s+WHERE\s+rowid\s*=\s*OLD\.rowid") or matching multiple expected variants.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL — the fix introduces a new SQLITE_CORRUPT_VTAB regression on restore / permanent-delete / edit-while-trashed.

Empirically verified against better-sqlite3 12.8.0 (vanilla SQLite, same as D1). With the PR's triggers in place, a normal user lifecycle now throws SQLITE_CORRUPT_VTAB:

Lifecycle: insert active → soft-delete → restore
  soft delete OK
  restore FAILED: SQLITE_CORRUPT_VTAB - database disk image is malformed

Lifecycle: insert active → soft-delete → permanent delete
  permanent delete FAILED: SQLITE_CORRUPT_VTAB

Lifecycle: edit a soft-deleted row
  edit-while-deleted FAILED: SQLITE_CORRUPT_VTAB

Root cause: the INSERT trigger has WHEN NEW.deleted_at IS NULL, so soft-deleted rows are never added to the FTS index. The new UPDATE / DELETE triggers issue 'delete' unconditionally, including for rows where OLD.deleted_at IS NOT NULL (i.e., the row is not in the FTS index at all). FTS5's 'delete' command is not a no-op when the rowid is absent — it tries to remove tokens that don't exist and corrupts state, throwing SQLITE_CORRUPT_VTAB and aborting the parent UPDATE/DELETE. The previous broken DELETE FROM fts WHERE rowid = OLD.rowid form silently no-op'd in that case, so this case worked by accident before.

Reachable from production via ContentRepository.restore() / ContentRepository.permanentDelete() (used by /api/.../restore and the trash endpoints, see packages/core/src/api/handlers/content.ts:765,810). Anyone who soft-deletes content on a search-enabled collection then tries to restore or permanently-delete it will hit the same error this PR claims to fix.

Fix: gate the 'delete' on OLD.deleted_at IS NULL (the row was actually indexed). Verified working — a full lifecycle (insert → edit → soft-delete → restore → edit → soft-delete → permanent-delete) survives integrity-check at every step:

CREATE TRIGGER ..._update AFTER UPDATE ON ec_pages BEGIN
  INSERT INTO fts(fts, rowid, id, locale, title, content)
  SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, OLD.title, OLD.content
  WHERE OLD.deleted_at IS NULL;
  INSERT INTO fts(rowid, id, locale, title, content)
  SELECT NEW.rowid, NEW.id, NEW.locale, NEW.title, NEW.content
  WHERE NEW.deleted_at IS NULL;
END;

CREATE TRIGGER ..._delete AFTER DELETE ON ec_pages BEGIN
  INSERT INTO fts(fts, rowid, id, locale, title, content)
  SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, OLD.title, OLD.content
  WHERE OLD.deleted_at IS NULL;
END;

The fts-corruption test file should add coverage for the trash lifecycle (restore + permanent-delete + edit-while-soft-deleted) — none of the three current tests exercise a row that was inserted with deleted_at IS NOT NULL or transitions through the trash, which is exactly why this regression slipped through.

Not blocking diagnosis — the diagnosis and the choice of 'delete' command are correct, and the legacy-trigger detection + integrity-check additions are good. But this needs the OLD-side gate before merging or the bug it ships will be the same SQLITE_CORRUPT_VTAB on a different operation.

Comment thread packages/core/src/search/fts-manager.ts Outdated
BEGIN
DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;
INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL — this 'delete' is unconditional, but the INSERT trigger above only indexes rows where NEW.deleted_at IS NULL. So when an UPDATE fires with OLD.deleted_at IS NOT NULL (restore from trash, edit a row that's currently in the trash, or any update on a row that was inserted soft-deleted), the OLD row is not in the FTS index and FTS5's 'delete' command throws SQLITE_CORRUPT_VTAB — the same error this PR is fixing.

Reproduced against vanilla SQLite (better-sqlite3 12.8.0):

insert active → soft-delete → restore  =>  SQLITE_CORRUPT_VTAB
insert active → soft-delete → edit     =>  SQLITE_CORRUPT_VTAB

Reachable from production via ContentRepository.restore() (packages/core/src/database/repositories/content.ts:625), used by the restore endpoint at packages/core/src/api/handlers/content.ts:765. Any user who soft-deletes content on a search-enabled collection and then tries to restore it will hit corruption.

Fix: gate on OLD.deleted_at IS NULL so we only 'delete' rows that were actually indexed:

INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList}
WHERE OLD.deleted_at IS NULL;

Verified that this form survives the full lifecycle (insert → edit → soft-delete → restore → edit → soft-delete → permanent-delete) with 'integrity-check' passing at every step. The previous (buggy) DELETE FROM fts WHERE rowid = OLD.rowid was silently a no-op when the rowid wasn't in the index, which is why this case happened to work before — 'delete' is not a no-op.

Comment thread packages/core/src/search/fts-manager.ts Outdated
BEGIN
DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;
INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL — same issue as the UPDATE trigger. A DELETE FROM ec_<slug> on a row whose deleted_at IS NOT NULL (i.e. the user permanently-deletes a soft-deleted row from trash, the standard flow via ContentRepository.permanentDelete() -> packages/core/src/api/handlers/content.ts:810) will fire this trigger with OLD values that were never inserted into the FTS index, and FTS5 throws SQLITE_CORRUPT_VTAB, aborting the user's DELETE.

Verified end-to-end against vanilla SQLite. Same fix:

INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList}
WHERE OLD.deleted_at IS NULL;

.execute(db),
).resolves.toBeDefined();
});
});
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test gap that hid the regression: none of the three new tests exercise a row that was inserted soft-deleted, transitions through the trash, or comes back through restore()/permanentDelete(). The new triggers throw SQLITE_CORRUPT_VTAB in all three of those cases (verified empirically), but the test suite never goes there, so it ships green.

Please add coverage for the trash lifecycle once the trigger gating is fixed — at minimum: insert active row → soft-delete → restore → integrity-check, and insert → soft-delete → permanent-delete → integrity-check. These are the ones that currently break and would have caught this in CI.

@ask-bonk
Copy link
Copy Markdown
Contributor Author

ask-bonk Bot commented Apr 25, 2026

Review posted with three line-anchored comments. Found a critical regression: the new triggers throw SQLITE_CORRUPT_VTAB on restore from trash, permanent-delete from trash, and edits to soft-deleted rows — empirically reproduced against vanilla SQLite. The INSERT trigger has WHEN NEW.deleted_at IS NULL so soft-deleted rows are never indexed, but the new UPDATE/DELETE triggers issue 'delete' unconditionally. The old buggy DELETE FROM fts WHERE rowid = OLD.rowid silently no-op'd on missing rowids; the documented 'delete' command does not. Fix is to gate the 'delete' on OLD.deleted_at IS NULL.

Review URL: #768 (review)

github run

@ascorbic ascorbic closed this Apr 26, 2026
@ascorbic ascorbic reopened this Apr 26, 2026
@ascorbic
Copy link
Copy Markdown
Collaborator

/ultrabonk look at the review comments and fix the issues

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>
@ask-bonk
Copy link
Copy Markdown
Contributor Author

ask-bonk Bot commented Apr 26, 2026

All search tests pass (10/10). Summary of changes:

  1. Critical fix (fts-manager.ts): Gated the 'delete' command in both UPDATE and DELETE triggers on OLD.deleted_at IS NULL. Without this, restore-from-trash, permanent-delete, and edit-while-trashed all raised SQLITE_CORRUPT_VTAB (the same error the original PR was meant to fix).
  2. Tightened legacy-trigger detection (hasLegacyTriggers): Switched from a LIKE pattern (where underscores are wildcards) to exact IN (...) matching of trigger names, and a whitespace-tolerant regex for the unsafe SQL fragment so older shipped variants are caught.
  3. Defensive guard in createTriggers: Throws if searchableFields is empty rather than producing malformed SQL with trailing commas.
  4. Three new lifecycle tests covering the regression: soft-delete -> restore -> integrity-check, soft-delete -> permanent-delete, and edit-while-soft-deleted.

What does this PR do?

Addresses review feedback on PR #768. The FTS5 trigger fix introduced a new SQLITE_CORRUPT_VTAB regression on restore, permanent-delete, and edit-while-trashed flows. This PR fixes that and tightens a few rough edges flagged by Copilot.

The regression

The INSERT trigger has WHEN NEW.deleted_at IS NULL, so soft-deleted rows are never indexed. The new UPDATE/DELETE triggers issued 'delete' unconditionally. FTS5's 'delete' command is not a no-op for missing rowids (the previous broken DELETE FROM fts WHERE rowid = OLD.rowid form silently was, which is why this case worked by accident before) and raises SQLITE_CORRUPT_VTAB when the row was never in the index. Reachable from production via ContentRepository.restore() and ContentRepository.permanentDelete() -- any user who soft-deletes content on a search-enabled collection then tries to restore or permanently-delete it would hit the same error #768 was meant to fix.

Fixes (packages/core/src/search/fts-manager.ts)

  1. Gate 'delete' on OLD.deleted_at IS NULL in both UPDATE and DELETE triggers using INSERT ... SELECT ... WHERE OLD.deleted_at IS NULL. The OLD-side index state matches the INSERT trigger's NULL-only indexing condition.
  2. Tighten hasLegacyTriggers detection -- switched from a LIKE pattern (underscores in _emdash_fts_<slug> are SQL wildcards) to exact IN (...) matching of trigger names, plus a whitespace-tolerant regex match for the unsafe fragment so older shipped variants with different formatting are still caught.
  3. Defensive guard in createTriggers -- throws if searchableFields is empty rather than producing malformed SQL with trailing commas. The call sites (enableSearch, verifyAndRepairIndex) already filter this out, but the function should fail loud rather than silent.

Tests

Three new lifecycle tests in fts-corruption.test.ts:

  • soft-delete → restore → integrity-check (the main regression case)
  • soft-delete → permanent-delete (DELETE on a row not in the index)
  • soft-delete → direct UPDATE (edit-while-trashed)

All 10 search tests pass; full core suite at 2577 passing (1 pre-existing failure in wordpress-slug-sanitization.test.ts that's a workspace self-resolution issue, present on main and unrelated to this change).

Type of change

  • Bug fix
  • Feature (requires maintainer-approved Discussion)
  • Refactor (no behavior change)
  • Translation
  • Documentation
  • Performance improvement
  • Tests
  • Chore (dependencies, CI, tooling)

Checklist

  • I have read CONTRIBUTING.md
  • pnpm typecheck passes
  • pnpm lint passes (no new diagnostics in changed files; remaining 25 are pre-existing on main)
  • pnpm test passes (or targeted tests for my change)
  • pnpm format has been run
  • I have added/updated tests for my changes (if applicable)
  • User-visible strings in the admin UI are wrapped for translation and pnpm locale:extract has been run (if applicable) — N/A, no admin UI changes
  • I have added a changeset (if this PR changes a published package) — updated existing changeset
  • New features link to an approved Discussion: https://github.com/emdash-cms/emdash/discussions/...

AI-generated code disclosure

  • This PR includes AI-generated code

github run

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +515 to +524
// Check: legacy/broken triggers from a pre-fix install. These corrupt
// the index on every UPDATE/DELETE, so any site that has them needs a
// rebuild even if the row counts happen to match right now.
if (fields.length > 0 && (await this.hasLegacyTriggers(collectionSlug))) {
console.warn(
`FTS index for "${collectionSlug}" has legacy sync triggers from a pre-fix EmDash version. Rebuilding to install corruption-safe triggers.`,
);
await this.rebuildIndex(collectionSlug, fields, config?.weights);
return true;
}
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verifyAndRepairAll() (and therefore the legacy-trigger auto-heal) is currently only invoked via EmDashRuntime.ensureSearchHealthy, which the codebase calls from the search endpoints. Content mutations (publish/edit/delete/restore) don’t call this health check, so an upgraded site with legacy triggers/corruption can still hit SQLITE_CORRUPT_VTAB on publish before any search request ever triggers the rebuild.

Consider ensuring the verify/repair path runs before content writes for search-enabled collections (e.g. call ensureSearchHealthy()/verifyAndRepairIndex() from publish/edit/trash routes, or trigger a targeted rebuild when a mutation fails with SQLITE_CORRUPT_VTAB).

Copilot uses AI. Check for mistakes.
Comment thread packages/core/src/search/fts-manager.ts Outdated
Comment on lines +553 to +560
// Check: FTS5 integrity-check. This catches corruption that the row
// count check misses (e.g. orphaned tokens in segments where the
// docsize entry exists but points to garbage). Throws on a corrupt
// index; treat the throw itself as the signal to rebuild.
try {
await sql
.raw(`INSERT INTO "${ftsTable}"("${ftsTable}") VALUES('integrity-check')`)
.execute(this.db);
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verifyAndRepairIndex now runs an unconditional FTS5 'integrity-check' whenever an index exists and row counts match. This can be significantly more expensive than the row-count check for large indexes and will run on the first search request in each worker lifetime.

If this is intended, it may be worth documenting/justifying the expected cost. Otherwise, consider making integrity-check conditional (e.g. only after detecting legacy triggers / mismatch / a prior crash marker, or rate-limiting it with a persisted “last checked” timestamp) so search latency doesn’t regress for large sites.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

This PR has been inactive for 14 days. It will be closed automatically in 7 days if there is no further activity.

If you're still working on this, please push an update or leave a comment.

@github-actions github-actions Bot added stale and removed stale labels May 10, 2026
@claughinghouse
Copy link
Copy Markdown

Adding a note that this is still an active branch since the root issue is unresolved.

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
emdash-i18n 0e150f6 May 18 2026, 06:05 PM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
emdash-perf-coordinator 0e150f6 May 18 2026, 06:05 PM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
emdash-playground 0e150f6 May 18 2026, 06:07 PM

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
docs 0e150f6 May 18 2026, 06:06 PM

ascorbic added 2 commits May 18, 2026 17:32
Replaces the runtime auto-repair plumbing (hasLegacyTriggers detection
plus FTS5 'integrity-check' probe inside verifyAndRepairIndex) with a
one-shot migration 039_fix_fts5_triggers that rebuilds every
search-enabled collection's FTS index at boot.

Addresses two outstanding review concerns on this PR:

1. Auto-repair previously only fired from /api/search and
   /api/search/suggest. The publish path didn't call it, so an upgrading
   site with a corrupted index still hit SQLITE_CORRUPT_VTAB on its
   first publish unless somebody happened to hit a search endpoint
   first. The migration runs once at startup before any request can
   reach the affected paths.

2. verifyAndRepairIndex ran a full FTS5 'integrity-check' on every
   first-search per isolate. For large indexes that scan is expensive.
   The migration takes the upgrade-rebuild cost once at deploy time
   instead of pinning it onto search latency.

verifyAndRepairIndex now sticks to cheap belt-and-braces checks:
missing FTS table -> rebuild, row-count mismatch -> rebuild. The
expensive recovery cases are handled at boot.

The trigger fix itself (corruption-safe 'delete' command with the
OLD.deleted_at IS NULL gate) is unchanged. fts-corruption.test.ts
keeps the trash-lifecycle and stress tests; the legacy-trigger
repair scenario moves to the new migration unit test.
Two fixes from adversarial review of bb81027:

1. Add IF NOT EXISTS to CREATE VIRTUAL TABLE and CREATE TRIGGER in
   migration 039. D1 has no advisory migration lock (see runner.ts:264),
   so two isolates on a cold first deploy of this version can both enter
   migration 039 simultaneously. The DROP IF EXISTS / CREATE / populate
   sequence was previously not safe under that race -- the loser would
   hit 'table already exists' or 'trigger already exists' on its CREATE
   statements, fail, and Kysely's race handler only catches the final
   UNIQUE constraint on _emdash_migrations.name.

   With IF NOT EXISTS on every CREATE, concurrent migrators converge:
   FTS5 INSERT into an external-content table dedupes by rowid via the
   docsize shadow table (verified empirically on better-sqlite3 12.x),
   so a double populate ends with one row per content row regardless of
   isolate ordering.

2. Strengthen the 'repairs a database whose FTS index is already
   corrupted' test so it actually proves stale tokens are removed.
   The previous assertion was integrity-check + docsize=1, both of
   which pass on the unrepaired corrupted index because the legacy
   broken trigger leaves the inverted index with both OLD and NEW
   tokens but the docsize shadow table happens to still show one row.

   The new assertion uses distinct OLD/NEW tokens ('aardvark' vs
   'zebra') and verifies (a) the OLD token is in the index before the
   broken UPDATE, (b) the OLD token is STILL in the index after the
   broken UPDATE (proving the corruption exists), (c) after the
   migration, the OLD token is gone and the NEW token is present.

   Verified the strengthened test catches a neutered migration: with
   rebuildIndex commented out, the test fails on the post-migration
   stale-token assertion as expected.
@ascorbic
Copy link
Copy Markdown
Collaborator

/review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment on lines +180 to +186
// `IF NOT EXISTS` on all CREATE statements so two isolates running
// the migration concurrently (D1 has no advisory lock, so this can
// happen on cold starts of a fresh deploy) don't fail each other
// with "table already exists" / "trigger already exists" errors.
// The drop-then-create sequence is benign across isolates: FTS5
// silently dedupes duplicate-rowid INSERTs via the docsize shadow
// table, so a double populate ends with one row per content row.
Comment on lines +432 to +440
* Verify FTS index integrity and rebuild if drift is detected.
*
* Cheap belt-and-braces check, run lazily on the first search request
* per isolate. The expensive cases (corrupted indexes from pre-fix
* EmDash versions, broken legacy triggers) are handled at boot time by
* migration `039_fix_fts5_triggers`, not here. This routine sticks to:
*
* Checks for row count mismatch between content table and FTS table.
* 1. FTS table missing while config says search is enabled -> rebuild.
* 2. Row count mismatch between content table and FTS docsize -> rebuild.
Comment on lines +7 to +8
* `pragma integrity_check` (which FTS5 hooks via `'integrity-check'`) does not
* report a malformed disk image.
@ask-bonk
Copy link
Copy Markdown
Contributor Author

ask-bonk Bot commented May 18, 2026

The most recent review (Copilot 2026-04-26) was before commits bb81027 (move to migration) and bf1ed6d (idempotent DDL). Both commits address those review concerns. The migration approach explicitly addresses Copilot's "verifyAndRepairAll is only invoked via search endpoints" concern by running before content writes during boot.

This PR looks solid. No bugs to flag. Skipping the review.

LGTM!

github run

- Clarify concurrency comment in migration 039: call out that the
  drop-window where a still-running old-version isolate could fire
  triggers against the missing FTS table is pre-existing (same window
  exists in FTSManager.rebuildIndex), not introduced by this PR.
- Fix misleading header comment in fts-corruption.test.ts: FTS5
  exposes 'integrity-check' as its own command on the FTS table, not
  via PRAGMA integrity_check.

PR body was also updated separately to reflect the migration-based
approach (the original description still referenced hasLegacyTriggers
and the integrity-check probe, both removed in bb81027).
@ascorbic ascorbic enabled auto-merge (squash) May 18, 2026 18:07
@ascorbic ascorbic merged commit 121f173 into main May 18, 2026
37 checks passed
@ascorbic ascorbic deleted the opencode/issue649-20260425090851 branch May 18, 2026 18:13
@emdashbot emdashbot Bot mentioned this pull request May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQLite Database Corrupt Error on new site

3 participants