Fix FTS5 corruption on publish by ask-bonk[bot] · Pull Request #768 · emdash-cms/emdash

ask-bonk · 2026-04-25T09:22:06Z

What does this PR do?

Fixes the SQLITE_CORRUPT_VTAB (database disk image is malformed) error reported on every publish/edit on collections with search enabled.

Root cause

The FTS5 virtual table is created with content='ec_<slug>', which makes it an external content FTS5 table — the row data lives in the content table and FTS5 only stores the inverted index. But the AFTER UPDATE/DELETE triggers used the contentless-table sync pattern:

DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;

For external-content tables, DELETE FROM fts WHERE rowid = ? causes FTS5 to read column values from the backing content table to figure out which tokens to remove. By the time an AFTER UPDATE trigger fires, the row in ec_<collection> already holds the NEW values. FTS5 therefore removes NEW tokens while the index still holds OLD tokens, drifting the inverted index out of sync until SQLite refuses further reads with SQLITE_CORRUPT_VTAB. Same problem in the AFTER DELETE trigger — by then the row is gone, so FTS5 can't find any column values to remove. This explains every observation in the issue thread:

Only published-content edits trigger it (each edit is an UPDATE that fires the broken trigger).
"Disable search" fixes it (drops the broken triggers, stopping further damage).
'rebuild' works after dropping/recreating the table but not always on its own (the corruption can be in the FTS shadow tables themselves).
It hits 0.4.0, 0.5.0, and 0.6.0 — the trigger code is identical across those versions.

Fix

Rewrote the AFTER UPDATE and AFTER DELETE triggers in packages/core/src/search/fts-manager.ts to use the documented external-content-safe pattern from https://www.sqlite.org/fts5.html#external_content_tables:
```
INSERT INTO fts(fts, rowid, col1, col2) VALUES('delete', OLD.rowid, OLD.col1, OLD.col2);
```
This passes the OLD column values directly so FTS5 removes the right tokens regardless of what's currently in the content table.
Gated 'delete' on OLD.deleted_at IS NULL so the trigger only removes rows that were actually indexed. The INSERT trigger skips soft-deleted rows; issuing 'delete' for a never-indexed rowid is itself a corruption trigger. Without this gate, restore-from-trash, permanent-delete, and edit-while-trashed would all raise SQLITE_CORRUPT_VTAB.
Added migration 039_fix_fts5_triggers that rebuilds the FTS index for every search-enabled collection on upgrade. The migration runs once at startup before any request can hit the publish/trash paths, so upgrading sites get the fix on next deploy without depending on a search-endpoint visit to trigger lazy repair. verifyAndRepairIndex at runtime stays as cheap belt-and-braces drift detection (missing table or row-count mismatch) — the expensive integrity-check probe and sqlite_master-scanning legacy-trigger detection from earlier commits have been removed.

Tests

packages/core/tests/integration/search/fts-corruption.test.ts covers the trigger fix:

Publish → edit → republish round-trip (the reported reproduction).
25 edit/publish cycles with FTS integrity-check after each.
Full trash lifecycle (soft-delete → restore → integrity-check).
Permanent-delete of soft-deleted row.
Edit a row while it's in the trash.

packages/core/tests/unit/database/migrations/039_fix_fts5_triggers.test.ts covers the migration:

No-op on databases without search-enabled collections.
Rebuilds FTS index for every search-enabled collection.
Repairs an already-corrupted index (asserts stale tokens are gone after migration, and new tokens are present).
Disables FTS cleanly when no fields are searchable.
Ignores collections where search_config.enabled = false.
Idempotent re-run.

All affected tests pass; full core suite passes.

Closes #649

Type of change

Checklist

I have read CONTRIBUTING.md
pnpm typecheck passes
pnpm lint passes
pnpm test passes (or targeted tests for my change)
pnpm format has been run
I have added/updated tests for my changes (if applicable)
User-visible strings in the admin UI are wrapped for translation and pnpm locale:extract has been run (if applicable) — N/A, no admin UI changes
I have added a changeset (if this PR changes a published package)
New features link to an approved Discussion: https://github.com/emdash-cms/emdash/discussions/...

AI-generated code disclosure

This PR includes AI-generated code

Originally opened by /ultrabonk (Claude Opus 4.7 via opencode + ask-bonk) in response to my comment on #649. Subsequent commits — including the trash-lifecycle gate, the migration-based approach, and the test strengthening — were also AI-generated (Claude Opus 4.7 via opencode), iterated against Copilot reviewer feedback and an adversarial second-opinion review.

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>

changeset-bot · 2026-04-25T09:22:10Z

🦋 Changeset detected

Latest commit: 0e150f6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 13 packages

Name	Type
emdash	Patch
@emdash-cms/cloudflare	Patch
@emdash-cms/fixture-perf-site	Patch
@emdash-cms/perf-demo-site	Patch
@emdash-cms/cache-demo-site	Patch
@emdash-cms/admin	Patch
@emdash-cms/auth	Patch
@emdash-cms/blocks	Patch
@emdash-cms/gutenberg-to-portable-text	Patch
@emdash-cms/x402	Patch
create-emdash	Patch
@emdash-cms/auth-atproto	Patch
@emdash-cms/plugin-embeds	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

cloudflare-workers-and-pages · 2026-04-25T09:22:12Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
🔵 In progress View logs	emdash-playground	`ccf2d57`	Apr 25 2026, 09:22 AM

cloudflare-workers-and-pages · 2026-04-25T09:22:12Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
🔵 In progress View logs	emdash-demo-cache	`ccf2d57`	Apr 25 2026, 09:22 AM

cloudflare-workers-and-pages · 2026-04-25T09:22:14Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
🔵 In progress View logs	emdash-perf-coordinator	`ccf2d57`	Apr 25 2026, 09:22 AM

cloudflare-workers-and-pages · 2026-04-25T09:22:14Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	emdash-demo-cache	`0e150f6`	May 18 2026, 06:06 PM

github-actions · 2026-04-25T09:22:17Z

All contributors have signed the CLA ✍️ ✅
_{Posted by the CLA Assistant Lite bot.}

pkg-pr-new · 2026-04-25T09:23:34Z

Open in StackBlitz

@emdash-cms/admin

npm i https://pkg.pr.new/@emdash-cms/admin@768

@emdash-cms/auth

npm i https://pkg.pr.new/@emdash-cms/auth@768

@emdash-cms/blocks

npm i https://pkg.pr.new/@emdash-cms/blocks@768

@emdash-cms/cloudflare

npm i https://pkg.pr.new/@emdash-cms/cloudflare@768

emdash

npm i https://pkg.pr.new/emdash@768

create-emdash

npm i https://pkg.pr.new/create-emdash@768

@emdash-cms/gutenberg-to-portable-text

npm i https://pkg.pr.new/@emdash-cms/gutenberg-to-portable-text@768

@emdash-cms/x402

npm i https://pkg.pr.new/@emdash-cms/x402@768

@emdash-cms/plugin-ai-moderation

npm i https://pkg.pr.new/@emdash-cms/plugin-ai-moderation@768

@emdash-cms/plugin-atproto

npm i https://pkg.pr.new/@emdash-cms/plugin-atproto@768

@emdash-cms/plugin-audit-log

npm i https://pkg.pr.new/@emdash-cms/plugin-audit-log@768

@emdash-cms/plugin-color

npm i https://pkg.pr.new/@emdash-cms/plugin-color@768

@emdash-cms/plugin-embeds

npm i https://pkg.pr.new/@emdash-cms/plugin-embeds@768

@emdash-cms/plugin-forms

npm i https://pkg.pr.new/@emdash-cms/plugin-forms@768

@emdash-cms/plugin-webhook-notifier

npm i https://pkg.pr.new/@emdash-cms/plugin-webhook-notifier@768

commit: 0e150f6

ascorbic · 2026-04-25T09:28:59Z

/review

ask-bonk · 2026-04-25T10:12:00Z

@ascorbic Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

ascorbic · 2026-04-25T10:20:28Z

/review

ask-bonk · 2026-04-25T10:25:11Z

LGTM!

github run

ascorbic · 2026-04-25T10:32:18Z

/ultrareview

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes recurring SQLITE_CORRUPT_VTAB on publish/edit for collections with FTS enabled by correcting FTS5 trigger semantics for external-content tables and adding automatic detection/repair of legacy unsafe triggers/corrupted indexes.

Changes:

Update FTS5 UPDATE/DELETE triggers to use the external-content-safe VALUES('delete', OLD...) pattern.
Add legacy-trigger detection via sqlite_master inspection and rebuild on upgrade when unsafe triggers are found.
Add an integrity-check pass during verification and introduce integration regression tests covering corruption and auto-repair.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
packages/core/src/search/fts-manager.ts	Fixes trigger behavior for external-content FTS5 tables; adds legacy trigger detection and integrity-check based repair.
packages/core/tests/integration/search/fts-corruption.test.ts	Adds regression + stress tests for corruption and verifies auto-repair behavior.
.changeset/fts-corrupt-vtab-fix.md	Documents the patch release and explains root cause/fix at a high level.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-25T10:38:08Z

 		const fieldList = searchableFields.join(", ");
 		const newFieldList = searchableFields.map((f) => `NEW.${f}`).join(", ");
+		// `'delete'` takes the FTS5 virtual table name as the first column,
+		// then the rowid being removed, then the OLD value of every column
+		// declared on the FTS5 table (in declaration order: id, locale,
+		// then each searchable field).
+		const oldFieldList = searchableFields.map((f) => `OLD.${f}`).join(", ");


If searchableFields can be empty, fieldList becomes an empty string and the generated SQL ends up with a trailing comma in the column list (and an extra comma in VALUES), which is a syntax error. Either enforce searchableFields.length > 0 in validateInputs (and/or at the call site), or branch the SQL generation to omit ", ${fieldList}" / ", ${oldFieldList}" when there are no searchable fields.

Copilot · 2026-04-25T10:38:09Z

+				INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
+				VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});


If searchableFields can be empty, fieldList becomes an empty string and the generated SQL ends up with a trailing comma in the column list (and an extra comma in VALUES), which is a syntax error. Either enforce searchableFields.length > 0 in validateInputs (and/or at the call site), or branch the SQL generation to omit ", ${fieldList}" / ", ${oldFieldList}" when there are no searchable fields.

Copilot · 2026-04-25T10:38:09Z

+
+		const result = await sql<{ sql: string | null }>`
+			SELECT sql FROM sqlite_master
+			WHERE type = 'trigger'
+			AND tbl_name = ${this.getContentTableName(collectionSlug)}
+			AND name LIKE ${`${ftsTable}_%`}


SQLite LIKE treats "_" as a single-character wildcard, and ftsTable names contain underscores (e.g. "_emdash_fts_pages"). This pattern can unintentionally match non-FTS triggers and cause false-positive rebuilds. Prefer matching exact trigger names (e.g. IN (${ftsTable}_insert, ${ftsTable}_update, ${ftsTable}delete)) or escape "" and "%" and add an ESCAPE clause.

Suggested change

const result = await sql<{ sql: string | null }>`

SELECT sql FROM sqlite_master

WHERE type = 'trigger'

AND tbl_name = ${this.getContentTableName(collectionSlug)}

AND name LIKE ${`${ftsTable}_%`}

const insertTrigger = `${ftsTable}_insert`;

const updateTrigger = `${ftsTable}_update`;

const deleteTrigger = `${ftsTable}_delete`;

const result = await sql<{ sql: string | null }>`

SELECT sql FROM sqlite_master

WHERE type = 'trigger'

AND tbl_name = ${this.getContentTableName(collectionSlug)}

AND name IN (${insertTrigger}, ${updateTrigger}, ${deleteTrigger})

Copilot · 2026-04-25T10:38:09Z

+		// Check: FTS5 integrity-check. This catches corruption that the row
+		// count check misses (e.g. orphaned tokens in segments where the
+		// docsize entry exists but points to garbage). Throws on a corrupt
+		// index; treat the throw itself as the signal to rebuild.
+		try {
+			await sql
+				.raw(`INSERT INTO "${ftsTable}"("${ftsTable}") VALUES('integrity-check')`)
+				.execute(this.db);
+		} catch (err) {


Running FTS5 'integrity-check' on every verify pass can be expensive for large indexes and will directly impact startup time if ensureSearchHealthy runs at boot. Consider gating the integrity-check behind a cheaper heuristic (e.g. only after legacy-trigger detection, after row-count mismatch, or on a lower-frequency schedule), or making it configurable (e.g. "deep check" mode) to avoid consistently paying the full scan cost.

Copilot · 2026-04-25T10:38:09Z

+		const legacyMarker = `DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid`;
+		for (const row of result.rows) {
+			if (row.sql && row.sql.includes(legacyMarker)) {
+				return true;
+			}
+		}


Legacy-trigger detection relies on an exact SQL substring match (including quoting and whitespace). If older shipped versions differed in formatting/quoting, hasLegacyTriggers can miss genuinely unsafe triggers and skip the rebuild. To make this more robust, consider normalizing whitespace and matching via a small regex (e.g. "DELETE\s+FROM\s+"?"?\s+WHERE\s+rowid\s*=\s*OLD\.rowid") or matching multiple expected variants.

ask-bonk

CRITICAL — the fix introduces a new SQLITE_CORRUPT_VTAB regression on restore / permanent-delete / edit-while-trashed.

Empirically verified against better-sqlite3 12.8.0 (vanilla SQLite, same as D1). With the PR's triggers in place, a normal user lifecycle now throws SQLITE_CORRUPT_VTAB:

Lifecycle: insert active → soft-delete → restore
  soft delete OK
  restore FAILED: SQLITE_CORRUPT_VTAB - database disk image is malformed

Lifecycle: insert active → soft-delete → permanent delete
  permanent delete FAILED: SQLITE_CORRUPT_VTAB

Lifecycle: edit a soft-deleted row
  edit-while-deleted FAILED: SQLITE_CORRUPT_VTAB

Root cause: the INSERT trigger has WHEN NEW.deleted_at IS NULL, so soft-deleted rows are never added to the FTS index. The new UPDATE / DELETE triggers issue 'delete' unconditionally, including for rows where OLD.deleted_at IS NOT NULL (i.e., the row is not in the FTS index at all). FTS5's 'delete' command is not a no-op when the rowid is absent — it tries to remove tokens that don't exist and corrupts state, throwing SQLITE_CORRUPT_VTAB and aborting the parent UPDATE/DELETE. The previous broken DELETE FROM fts WHERE rowid = OLD.rowid form silently no-op'd in that case, so this case worked by accident before.

Reachable from production via ContentRepository.restore() / ContentRepository.permanentDelete() (used by /api/.../restore and the trash endpoints, see packages/core/src/api/handlers/content.ts:765,810). Anyone who soft-deletes content on a search-enabled collection then tries to restore or permanently-delete it will hit the same error this PR claims to fix.

Fix: gate the 'delete' on OLD.deleted_at IS NULL (the row was actually indexed). Verified working — a full lifecycle (insert → edit → soft-delete → restore → edit → soft-delete → permanent-delete) survives integrity-check at every step:

CREATE TRIGGER ..._update AFTER UPDATE ON ec_pages BEGIN
  INSERT INTO fts(fts, rowid, id, locale, title, content)
  SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, OLD.title, OLD.content
  WHERE OLD.deleted_at IS NULL;
  INSERT INTO fts(rowid, id, locale, title, content)
  SELECT NEW.rowid, NEW.id, NEW.locale, NEW.title, NEW.content
  WHERE NEW.deleted_at IS NULL;
END;

CREATE TRIGGER ..._delete AFTER DELETE ON ec_pages BEGIN
  INSERT INTO fts(fts, rowid, id, locale, title, content)
  SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, OLD.title, OLD.content
  WHERE OLD.deleted_at IS NULL;
END;

The fts-corruption test file should add coverage for the trash lifecycle (restore + permanent-delete + edit-while-soft-deleted) — none of the three current tests exercise a row that was inserted with deleted_at IS NOT NULL or transitions through the trash, which is exactly why this regression slipped through.

Not blocking diagnosis — the diagnosis and the choice of 'delete' command are correct, and the legacy-trigger detection + integrity-check additions are good. But this needs the OLD-side gate before merging or the bug it ships will be the same SQLITE_CORRUPT_VTAB on a different operation.

ask-bonk · 2026-04-25T10:39:33Z

 			BEGIN
-				DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;
+				INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
+				VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});


CRITICAL — this 'delete' is unconditional, but the INSERT trigger above only indexes rows where NEW.deleted_at IS NULL. So when an UPDATE fires with OLD.deleted_at IS NOT NULL (restore from trash, edit a row that's currently in the trash, or any update on a row that was inserted soft-deleted), the OLD row is not in the FTS index and FTS5's 'delete' command throws SQLITE_CORRUPT_VTAB — the same error this PR is fixing.

Reproduced against vanilla SQLite (better-sqlite3 12.8.0):

insert active → soft-delete → restore => SQLITE_CORRUPT_VTAB insert active → soft-delete → edit => SQLITE_CORRUPT_VTAB

Reachable from production via ContentRepository.restore() (packages/core/src/database/repositories/content.ts:625), used by the restore endpoint at packages/core/src/api/handlers/content.ts:765. Any user who soft-deletes content on a search-enabled collection and then tries to restore it will hit corruption.

Fix: gate on OLD.deleted_at IS NULL so we only 'delete' rows that were actually indexed:

INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList}) SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList} WHERE OLD.deleted_at IS NULL;

Verified that this form survives the full lifecycle (insert → edit → soft-delete → restore → edit → soft-delete → permanent-delete) with 'integrity-check' passing at every step. The previous (buggy) DELETE FROM fts WHERE rowid = OLD.rowid was silently a no-op when the rowid wasn't in the index, which is why this case happened to work before — 'delete' is not a no-op.

ask-bonk · 2026-04-25T10:39:33Z

 			BEGIN
-				DELETE FROM "${ftsTable}" WHERE rowid = OLD.rowid;
+				INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
+				VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});


CRITICAL — same issue as the UPDATE trigger. A DELETE FROM ec_<slug> on a row whose deleted_at IS NOT NULL (i.e. the user permanently-deletes a soft-deleted row from trash, the standard flow via ContentRepository.permanentDelete() -> packages/core/src/api/handlers/content.ts:810) will fire this trigger with OLD values that were never inserted into the FTS index, and FTS5 throws SQLITE_CORRUPT_VTAB, aborting the user's DELETE.

Verified end-to-end against vanilla SQLite. Same fix:

INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList}) SELECT 'delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList} WHERE OLD.deleted_at IS NULL;

ask-bonk · 2026-04-25T10:39:33Z

+				.execute(db),
+		).resolves.toBeDefined();
+	});
+});


Test gap that hid the regression: none of the three new tests exercise a row that was inserted soft-deleted, transitions through the trash, or comes back through restore()/permanentDelete(). The new triggers throw SQLITE_CORRUPT_VTAB in all three of those cases (verified empirically), but the test suite never goes there, so it ships green.

Please add coverage for the trash lifecycle once the trigger gating is fixed — at minimum: insert active row → soft-delete → restore → integrity-check, and insert → soft-delete → permanent-delete → integrity-check. These are the ones that currently break and would have caught this in CI.

ask-bonk · 2026-04-25T10:39:39Z

Review posted with three line-anchored comments. Found a critical regression: the new triggers throw SQLITE_CORRUPT_VTAB on restore from trash, permanent-delete from trash, and edits to soft-deleted rows — empirically reproduced against vanilla SQLite. The INSERT trigger has WHEN NEW.deleted_at IS NULL so soft-deleted rows are never indexed, but the new UPDATE/DELETE triggers issue 'delete' unconditionally. The old buggy DELETE FROM fts WHERE rowid = OLD.rowid silently no-op'd on missing rowids; the documented 'delete' command does not. Fix is to gate the 'delete' on OLD.deleted_at IS NULL.

Review URL: #768 (review)

github run

ascorbic · 2026-04-26T05:32:34Z

/ultrabonk look at the review comments and fix the issues

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>

ask-bonk · 2026-04-26T05:41:19Z

All search tests pass (10/10). Summary of changes:

Critical fix (fts-manager.ts): Gated the 'delete' command in both UPDATE and DELETE triggers on OLD.deleted_at IS NULL. Without this, restore-from-trash, permanent-delete, and edit-while-trashed all raised SQLITE_CORRUPT_VTAB (the same error the original PR was meant to fix).
Tightened legacy-trigger detection (hasLegacyTriggers): Switched from a LIKE pattern (where underscores are wildcards) to exact IN (...) matching of trigger names, and a whitespace-tolerant regex for the unsafe SQL fragment so older shipped variants are caught.
Defensive guard in createTriggers: Throws if searchableFields is empty rather than producing malformed SQL with trailing commas.
Three new lifecycle tests covering the regression: soft-delete -> restore -> integrity-check, soft-delete -> permanent-delete, and edit-while-soft-deleted.

What does this PR do?

Addresses review feedback on PR #768. The FTS5 trigger fix introduced a new SQLITE_CORRUPT_VTAB regression on restore, permanent-delete, and edit-while-trashed flows. This PR fixes that and tightens a few rough edges flagged by Copilot.

The regression

The INSERT trigger has WHEN NEW.deleted_at IS NULL, so soft-deleted rows are never indexed. The new UPDATE/DELETE triggers issued 'delete' unconditionally. FTS5's 'delete' command is not a no-op for missing rowids (the previous broken DELETE FROM fts WHERE rowid = OLD.rowid form silently was, which is why this case worked by accident before) and raises SQLITE_CORRUPT_VTAB when the row was never in the index. Reachable from production via ContentRepository.restore() and ContentRepository.permanentDelete() -- any user who soft-deletes content on a search-enabled collection then tries to restore or permanently-delete it would hit the same error #768 was meant to fix.

Fixes (`packages/core/src/search/fts-manager.ts`)

Gate 'delete' on OLD.deleted_at IS NULL in both UPDATE and DELETE triggers using INSERT ... SELECT ... WHERE OLD.deleted_at IS NULL. The OLD-side index state matches the INSERT trigger's NULL-only indexing condition.
Tighten hasLegacyTriggers detection -- switched from a LIKE pattern (underscores in _emdash_fts_<slug> are SQL wildcards) to exact IN (...) matching of trigger names, plus a whitespace-tolerant regex match for the unsafe fragment so older shipped variants with different formatting are still caught.
Defensive guard in createTriggers -- throws if searchableFields is empty rather than producing malformed SQL with trailing commas. The call sites (enableSearch, verifyAndRepairIndex) already filter this out, but the function should fail loud rather than silent.

Tests

Three new lifecycle tests in fts-corruption.test.ts:

soft-delete → restore → integrity-check (the main regression case)
soft-delete → permanent-delete (DELETE on a row not in the index)
soft-delete → direct UPDATE (edit-while-trashed)

All 10 search tests pass; full core suite at 2577 passing (1 pre-existing failure in wordpress-slug-sanitization.test.ts that's a workspace self-resolution issue, present on main and unrelated to this change).

Type of change

Checklist

I have read CONTRIBUTING.md
pnpm typecheck passes
pnpm lint passes (no new diagnostics in changed files; remaining 25 are pre-existing on main)
pnpm test passes (or targeted tests for my change)
pnpm format has been run
I have added/updated tests for my changes (if applicable)
User-visible strings in the admin UI are wrapped for translation and pnpm locale:extract has been run (if applicable) — N/A, no admin UI changes
I have added a changeset (if this PR changes a published package) — updated existing changeset
New features link to an approved Discussion: https://github.com/emdash-cms/emdash/discussions/...

AI-generated code disclosure

This PR includes AI-generated code

github run

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-26T09:16:15Z

+		// Check: legacy/broken triggers from a pre-fix install. These corrupt
+		// the index on every UPDATE/DELETE, so any site that has them needs a
+		// rebuild even if the row counts happen to match right now.
+		if (fields.length > 0 && (await this.hasLegacyTriggers(collectionSlug))) {
+			console.warn(
+				`FTS index for "${collectionSlug}" has legacy sync triggers from a pre-fix EmDash version. Rebuilding to install corruption-safe triggers.`,
+			);
+			await this.rebuildIndex(collectionSlug, fields, config?.weights);
+			return true;
+		}


verifyAndRepairAll() (and therefore the legacy-trigger auto-heal) is currently only invoked via EmDashRuntime.ensureSearchHealthy, which the codebase calls from the search endpoints. Content mutations (publish/edit/delete/restore) don’t call this health check, so an upgraded site with legacy triggers/corruption can still hit SQLITE_CORRUPT_VTAB on publish before any search request ever triggers the rebuild.

Consider ensuring the verify/repair path runs before content writes for search-enabled collections (e.g. call ensureSearchHealthy()/verifyAndRepairIndex() from publish/edit/trash routes, or trigger a targeted rebuild when a mutation fails with SQLITE_CORRUPT_VTAB).

Copilot · 2026-04-26T09:16:16Z

+		// Check: FTS5 integrity-check. This catches corruption that the row
+		// count check misses (e.g. orphaned tokens in segments where the
+		// docsize entry exists but points to garbage). Throws on a corrupt
+		// index; treat the throw itself as the signal to rebuild.
+		try {
+			await sql
+				.raw(`INSERT INTO "${ftsTable}"("${ftsTable}") VALUES('integrity-check')`)
+				.execute(this.db);


verifyAndRepairIndex now runs an unconditional FTS5 'integrity-check' whenever an index exists and row counts match. This can be significantly more expensive than the row-count check for large indexes and will run on the first search request in each worker lifetime.

If this is intended, it may be worth documenting/justifying the expected cost. Otherwise, consider making integrity-check conditional (e.g. only after detecting legacy triggers / mismatch / a prior crash marker, or rate-limiting it with a persisted “last checked” timestamp) so search latency doesn’t regress for large sites.

github-actions · 2026-05-10T12:29:19Z

This PR has been inactive for 14 days. It will be closed automatically in 7 days if there is no further activity.

If you're still working on this, please push an update or leave a comment.

claughinghouse · 2026-05-11T17:43:52Z

Adding a note that this is still an active branch since the root issue is unresolved.

cloudflare-workers-and-pages · 2026-05-18T16:12:09Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	emdash-i18n	`0e150f6`	May 18 2026, 06:05 PM

cloudflare-workers-and-pages · 2026-05-18T16:12:09Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	emdash-perf-coordinator	`0e150f6`	May 18 2026, 06:05 PM

cloudflare-workers-and-pages · 2026-05-18T16:12:09Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	emdash-playground	`0e150f6`	May 18 2026, 06:07 PM

cloudflare-workers-and-pages · 2026-05-18T16:12:12Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	docs	`0e150f6`	May 18 2026, 06:06 PM

Replaces the runtime auto-repair plumbing (hasLegacyTriggers detection plus FTS5 'integrity-check' probe inside verifyAndRepairIndex) with a one-shot migration 039_fix_fts5_triggers that rebuilds every search-enabled collection's FTS index at boot. Addresses two outstanding review concerns on this PR: 1. Auto-repair previously only fired from /api/search and /api/search/suggest. The publish path didn't call it, so an upgrading site with a corrupted index still hit SQLITE_CORRUPT_VTAB on its first publish unless somebody happened to hit a search endpoint first. The migration runs once at startup before any request can reach the affected paths. 2. verifyAndRepairIndex ran a full FTS5 'integrity-check' on every first-search per isolate. For large indexes that scan is expensive. The migration takes the upgrade-rebuild cost once at deploy time instead of pinning it onto search latency. verifyAndRepairIndex now sticks to cheap belt-and-braces checks: missing FTS table -> rebuild, row-count mismatch -> rebuild. The expensive recovery cases are handled at boot. The trigger fix itself (corruption-safe 'delete' command with the OLD.deleted_at IS NULL gate) is unchanged. fts-corruption.test.ts keeps the trash-lifecycle and stress tests; the legacy-trigger repair scenario moves to the new migration unit test.

Two fixes from adversarial review of bb81027: 1. Add IF NOT EXISTS to CREATE VIRTUAL TABLE and CREATE TRIGGER in migration 039. D1 has no advisory migration lock (see runner.ts:264), so two isolates on a cold first deploy of this version can both enter migration 039 simultaneously. The DROP IF EXISTS / CREATE / populate sequence was previously not safe under that race -- the loser would hit 'table already exists' or 'trigger already exists' on its CREATE statements, fail, and Kysely's race handler only catches the final UNIQUE constraint on _emdash_migrations.name. With IF NOT EXISTS on every CREATE, concurrent migrators converge: FTS5 INSERT into an external-content table dedupes by rowid via the docsize shadow table (verified empirically on better-sqlite3 12.x), so a double populate ends with one row per content row regardless of isolate ordering. 2. Strengthen the 'repairs a database whose FTS index is already corrupted' test so it actually proves stale tokens are removed. The previous assertion was integrity-check + docsize=1, both of which pass on the unrepaired corrupted index because the legacy broken trigger leaves the inverted index with both OLD and NEW tokens but the docsize shadow table happens to still show one row. The new assertion uses distinct OLD/NEW tokens ('aardvark' vs 'zebra') and verifies (a) the OLD token is in the index before the broken UPDATE, (b) the OLD token is STILL in the index after the broken UPDATE (proving the corruption exists), (c) after the migration, the OLD token is gone and the NEW token is present. Verified the strengthened test catches a neutered migration: with rebuildIndex commented out, the test fails on the post-migration stale-token assertion as expected.

ascorbic · 2026-05-18T17:33:16Z

/review

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

+	// `IF NOT EXISTS` on all CREATE statements so two isolates running
+	// the migration concurrently (D1 has no advisory lock, so this can
+	// happen on cold starts of a fresh deploy) don't fail each other
+	// with "table already exists" / "trigger already exists" errors.
+	// The drop-then-create sequence is benign across isolates: FTS5
+	// silently dedupes duplicate-rowid INSERTs via the docsize shadow
+	// table, so a double populate ends with one row per content row.


+	 * Verify FTS index integrity and rebuild if drift is detected.
+	 *
+	 * Cheap belt-and-braces check, run lazily on the first search request
+	 * per isolate. The expensive cases (corrupted indexes from pre-fix
+	 * EmDash versions, broken legacy triggers) are handled at boot time by
+	 * migration `039_fix_fts5_triggers`, not here. This routine sticks to:
 	 *
-	 * Checks for row count mismatch between content table and FTS table.
+	 *   1. FTS table missing while config says search is enabled -> rebuild.
+	 *   2. Row count mismatch between content table and FTS docsize -> rebuild.


+ * `pragma integrity_check` (which FTS5 hooks via `'integrity-check'`) does not
+ * report a malformed disk image.


ask-bonk · 2026-05-18T17:40:08Z

The most recent review (Copilot 2026-04-26) was before commits bb81027 (move to migration) and bf1ed6d (idempotent DDL). Both commits address those review concerns. The migration approach explicitly addresses Copilot's "verifyAndRepairAll is only invoked via search endpoints" concern by running before content writes during boot.

This PR looks solid. No bugs to flag. Skipping the review.

LGTM!

github run

- Clarify concurrency comment in migration 039: call out that the drop-window where a still-running old-version isolate could fire triggers against the missing FTS table is pre-existing (same window exists in FTSManager.rebuildIndex), not introduced by this PR. - Fix misleading header comment in fts-corruption.test.ts: FTS5 exposes 'integrity-check' as its own command on the FTS table, not via PRAGMA integrity_check. PR body was also updated separately to reflect the migration-based approach (the original description still referenced hasLegacyTriggers and the integrity-check probe, both removed in bb81027).

Fix FTS5 corruption on publish

ccf2d57

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>

ask-bonk Bot mentioned this pull request Apr 25, 2026

SQLite Database Corrupt Error on new site #649

Closed

github-actions Bot added area/core size/L labels Apr 25, 2026

github-actions Bot added the cla: needed label Apr 25, 2026

ascorbic mentioned this pull request Apr 25, 2026

ci(bonk): fix concurrency cancellation, missing PR templates, long timeouts #769

Merged

18 tasks

Merge branch 'main' into opencode/issue649-20260425090851

597cd10

Merge branch 'main' into opencode/issue649-20260425090851

2e41e2b

github-actions Bot added cla: signed and removed cla: needed labels Apr 25, 2026

ascorbic requested a review from Copilot April 25, 2026 10:31

Copilot started reviewing on behalf of ascorbic April 25, 2026 10:37 View session

Copilot AI reviewed Apr 25, 2026

View reviewed changes

ask-bonk Bot commented Apr 25, 2026

View reviewed changes

ascorbic closed this Apr 26, 2026

ascorbic reopened this Apr 26, 2026

Fix FTS trash-lifecycle corruption

e77a8d7

Co-authored-by: ascorbic <ascorbic@users.noreply.github.com>

github-actions Bot added size/XL and removed size/L labels Apr 26, 2026

ascorbic requested a review from Copilot April 26, 2026 09:11

Copilot started reviewing on behalf of ascorbic April 26, 2026 09:11 View session

Copilot AI reviewed Apr 26, 2026

View reviewed changes

github-actions Bot added stale and removed stale labels May 10, 2026

Merge branch 'main' into opencode/issue649-20260425090851

7b6f3cb

ascorbic added 2 commits May 18, 2026 17:32

ascorbic requested a review from Copilot May 18, 2026 17:33

Copilot started reviewing on behalf of ascorbic May 18, 2026 17:34 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

ascorbic enabled auto-merge (squash) May 18, 2026 18:07

ascorbic merged commit 121f173 into main May 18, 2026
37 checks passed

ascorbic deleted the opencode/issue649-20260425090851 branch May 18, 2026 18:13

emdashbot Bot mentioned this pull request May 18, 2026

ci: release #1096

Open

		INSERT INTO "${ftsTable}"("${ftsTable}", rowid, id, locale, ${fieldList})
		VALUES('delete', OLD.rowid, OLD.id, OLD.locale, ${oldFieldList});

		* `pragma integrity_check` (which FTS5 hooks via `'integrity-check'`) does not
		* report a malformed disk image.

Conversation

ask-bonk Bot commented Apr 25, 2026 • edited by ascorbic Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Root cause

Fix

Tests

Type of change

Checklist

AI-generated code disclosure

Uh oh!

changeset-bot Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 25, 2026

Deploying with Cloudflare Workers

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 25, 2026

Deploying with Cloudflare Workers

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 25, 2026

Deploying with Cloudflare Workers

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

github-actions Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pkg-pr-new Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ascorbic commented Apr 25, 2026

Uh oh!

ask-bonk Bot commented Apr 25, 2026

Uh oh!

ascorbic commented Apr 25, 2026

Uh oh!

ask-bonk Bot commented Apr 25, 2026

Uh oh!

ascorbic commented Apr 25, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot left a comment

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot commented Apr 25, 2026

Uh oh!

ascorbic commented Apr 26, 2026

Uh oh!

ask-bonk Bot commented Apr 25, 2026 •

edited by ascorbic

Loading

changeset-bot Bot commented Apr 25, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented Apr 25, 2026 •

edited

Loading

github-actions Bot commented Apr 25, 2026 •

edited

Loading

pkg-pr-new Bot commented Apr 25, 2026 •

edited

Loading

Fixes (`packages/core/src/search/fts-manager.ts`)

cloudflare-workers-and-pages Bot commented May 18, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented May 18, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented May 18, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented May 18, 2026 •

edited

Loading