Introduces .iafbt format version 2 (default writer) — see
docs/design/bundle-format-v2.md for the public spec.
Changes:
- Backtest dataclass gains an engine_type field ('vector' | 'event' |
None) plus vector_runs / event_runs / vector_metrics / event_metrics
derived properties. Engine-tagged at construction by VectorBacktest
and EventBacktest service paths; legacy / unknown bundles keep the
engine-agnostic backtest_runs / backtest_summary view.
- Bundle envelope v2 routes runs into engine-specific slots
(vector_runs / event_runs) based on engine_type. v1 envelopes
remain readable indefinitely.
- Eight heavy metric time-series (equity_curve, drawdown_series,
cumulative_return_series, rolling_sharpe_ratio, monthly_returns,
yearly_returns, twr_equity_curve, twr_drawdown_series) are
extracted into embedded Parquet blobs with int64 epoch-ms
timestamps and float64 values, replacing the v1 inline
list-of-(value, ISO-string) shape.
- save_bundle gains format_version (default 2, accepts 1 for
downgrade) and float32_ohlcv (off by default) options.
- open_bundle and Backtest.open gain summary_only=True for bulk
listing pipelines that skip the per-run blob decode.
OHLCV side store, content addressing, and LazyOhlcvDict are
unchanged. Public OHLCV dedup protocol spec added at
docs/design/ohlcv-dedup-protocol.md for upload-style integrations.
17 bundle tests pass (8 new v2-specific cases); full suite passes
1681/1681 against unchanged behaviour for v1 bundles.
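To make the shape change for the metric time-series concrete, here is a minimal sketch of turning a v1-style inline list of (value, ISO-string) pairs into the two typed columns described above (int64 epoch-ms timestamps and float64 values). The helper names `to_epoch_ms` and `split_series` are hypothetical and not taken from the codebase, and the sketch stops before the Parquet encoding step.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(iso_string):
    """Parse an ISO-8601 timestamp into integer epoch milliseconds."""
    parsed = datetime.fromisoformat(iso_string)
    if parsed.tzinfo is None:
        # Assumption for this sketch: naive timestamps are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division on timedeltas avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def split_series(v1_series):
    """Split [(value, iso_string), ...] into (timestamps_ms, values) columns."""
    timestamps_ms = [to_epoch_ms(ts) for _, ts in v1_series]
    values = [float(value) for value, _ in v1_series]
    return timestamps_ms, values


v1_equity_curve = [
    (1000.0, "2024-01-01T00:00:00+00:00"),
    (1012.5, "2024-01-01T01:00:00+00:00"),
]
timestamps_ms, values = split_series(v1_equity_curve)
```

The two resulting lists map directly onto an int64 column and a float64 column, which is what makes the columnar encoding compact compared with repeated ISO strings.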
Measured on real production-shape bundles (~570 KB at level 7, 7 runs × 2192 hourly portfolio snapshots each):
  level 7 → ~570 KB
  level 19 → ~489 KB (−14%)
  level 22 → ~489 KB (saturated)
Decode speed is unchanged in zstd's design (decoder is independent of encoder level). For a 12,500-bundle workload (~64 GB at level 7) this trims the archive to ~55 GB at zero behavioural cost. Level 19 is the highest level still in the standard tier (no --ultra flag, no special memory window), so the bytes are readable by any stock zstd reader.
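The percentages above can be reproduced with a few lines of arithmetic. This sketch only reuses the figures quoted in the message and assumes the per-bundle ratio carries over unchanged to the whole archive.

```python
def relative_change(before, after):
    """Fractional size change when going from `before` to `after`."""
    return (after - before) / before


size_level_7_kb = 570
size_level_19_kb = 489

per_bundle_change = relative_change(size_level_7_kb, size_level_19_kb)

# Assumption: the archive shrinks by the same ratio as the sample bundle.
archive_level_7_gb = 64
archive_level_19_gb = archive_level_7_gb * size_level_19_kb / size_level_7_kb

print(f"per-bundle change: {per_bundle_change:.1%}")
print(f"projected archive size: {archive_level_19_gb:.1f} GB")
```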
Captures the storage architecture proposal that came out of the
v8.9 size-review measurements:
- Why per-file compression has hit its ceiling (~14% headroom max)
- Three-tier model: Index (SQL) + Columnar bulk (Parquet) +
Content-addressed chunks (S3-compatible)
- .iafbt demoted from storage primitive to deterministic export
format; v1 and v2 remain readable forever
- Local-only OSS path (LocalTieredStore) keeps the framework
self-contained without a server
- Phased migration: v8.10 read-side helpers, v8.11 store
abstraction, Finterion remote store closed-source
Companion to docs/design/bundle-format-v2.md and
docs/design/ohlcv-dedup-protocol.md.
…phase 1)
Lift the existing untyped flat-row index helper into a public, typed Tier-1 contract:
* New BacktestIndexRow dataclass (domain/backtesting/backtest_index_row.py) with identity / provenance / config / nested summary_metrics + forward-compat extras. Lossless to_flat_dict / from_flat_dict round trip for Parquet, SQL and JSON sinks.
* New Backtest.index_row(bundle_path=None) method. Builds without decoding any v2 Parquet metric blobs, so it works against bundles loaded with Backtest.open(..., summary_only=True). This is the fast read path the upcoming 'iaf index' CLI (phase 2) and any tiered store implementation (phase 3) will rely on.
* _backtest_to_index_row in backtest_utils now delegates to BacktestIndexRow.to_flat_dict() so the wire shape and the in-memory shape are a single source of truth (no behavioural change for the existing index.parquet sidecar).
* Re-export BacktestIndexRow from the domain and top-level packages.
* docs/design/tiered-backtest-storage.md §3.1 + roadmap row updated to reference the typed contract.
Tests: 5 new (in-memory derivation, flat round-trip incl. NaN, unknown columns landing in extras, derivation from a summary_only=True bundle load). Full backtests suite green (29/29).
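The flat round trip with forward-compatible extras can be pictured roughly as follows. This is an illustrative sketch only: `IndexRowSketch` and `SummarySketch` are hypothetical stand-ins with two fields each, and the `summary_` column prefix in the flat form is an assumption; the real BacktestIndexRow carries many more fields.

```python
import math
from dataclasses import dataclass, field, fields


@dataclass
class SummarySketch:
    sharpe_ratio: float = math.nan
    total_return: float = math.nan


@dataclass
class IndexRowSketch:
    run_id: str
    algorithm_id: str
    summary_metrics: SummarySketch = field(default_factory=SummarySketch)
    extras: dict = field(default_factory=dict)

    def to_flat_dict(self):
        """Flatten nested summary metrics into prefixed top-level columns."""
        flat = {"run_id": self.run_id, "algorithm_id": self.algorithm_id}
        for f in fields(self.summary_metrics):
            flat[f"summary_{f.name}"] = getattr(self.summary_metrics, f.name)
        flat.update(self.extras)
        return flat

    @classmethod
    def from_flat_dict(cls, flat):
        """Rebuild the row; unknown columns are preserved in `extras`."""
        summary_names = {f.name for f in fields(SummarySketch)}
        summary, extras = {}, {}
        for key, value in flat.items():
            if key in ("run_id", "algorithm_id"):
                continue
            name = key[len("summary_"):] if key.startswith("summary_") else None
            if name in summary_names:
                summary[name] = value
            else:
                extras[key] = value
        return cls(flat["run_id"], flat["algorithm_id"],
                   SummarySketch(**summary), extras)


row = IndexRowSketch("run-1", "sma-cross", SummarySketch(sharpe_ratio=1.2))
flat = row.to_flat_dict()
flat["column_from_newer_version"] = 7  # unknown to this sketch's schema
restored = IndexRowSketch.from_flat_dict(flat)
```

Routing unknown columns into `extras` rather than dropping them is what lets an older reader pass through rows written by a newer writer without losing data.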
CI flake8 flags the import as unused without the corresponding entry in __all__. Phase 1 missed adding it.
Tier-1 SQLite index over a folder of .iafbt bundles, building on the phase-1 BacktestIndexRow contract.
* services/backtest_index/sqlite_index.py: SqliteBacktestIndex with create / open / upsert / upsert_many / iter_rows / query. Every scalar field of BacktestSummaryMetrics is promoted to its own summary_<name> SQL column so analysts can filter without opening any bundle (e.g. WHERE summary_sharpe_ratio > 1.0). parameters / strategy_ids / extras round-trip as JSON text. WAL mode for safe concurrent reads. Forward-only additive schema migration via PRAGMA user_version.
* cli/index_command.py + new 'iaf index <bundle-dir>' click command: walks the directory, opens each bundle with summary_only=True (no Parquet metric-blob decode), derives BacktestIndexRow, upserts. --output, --absolute-paths and --no-progress flags.
* docs/design/tiered-backtest-storage.md updated with phase-2 status.
Tests: 12 new (8 SqliteBacktestIndex unit tests + 4 CLI integration tests via click.testing.CliRunner). Full repo suite green (1698 passed, 42 skipped). Lint clean.
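The index mechanics described above (promoted summary_<name> columns, JSON text for parameters, WAL mode, PRAGMA user_version for migrations) can be sketched with the standard-library sqlite3 module. The table name, the three-column schema, and the helper names are assumptions made for illustration and do not reflect the real SqliteBacktestIndex schema.

```python
import json
import os
import sqlite3
import tempfile

SCHEMA_VERSION = 1  # hypothetical version number for this sketch


def open_index(path):
    """Open (or create) the index and apply forward-only migrations."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # readers do not block the writer
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    if current < 1:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS backtests ("
            " bundle_path TEXT PRIMARY KEY,"
            " summary_sharpe_ratio REAL,"
            " parameters TEXT)"
        )
        conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
    conn.commit()
    return conn


def upsert(conn, bundle_path, sharpe_ratio, parameters):
    """Insert a row, replacing any existing row for the same bundle."""
    conn.execute(
        "INSERT OR REPLACE INTO backtests VALUES (?, ?, ?)",
        (bundle_path, sharpe_ratio, json.dumps(parameters)),
    )
    conn.commit()


index_path = os.path.join(tempfile.mkdtemp(), "index.sqlite")
conn = open_index(index_path)
upsert(conn, "a.iafbt", 0.4, {"window": 20})
upsert(conn, "b.iafbt", 1.7, {"window": 50})
upsert(conn, "b.iafbt", 1.8, {"window": 50})  # same key: row is replaced

# Filter on the promoted column without opening any bundle.
winners = [
    row[0]
    for row in conn.execute(
        "SELECT bundle_path FROM backtests WHERE summary_sharpe_ratio > 1.0"
    )
]
```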
feat(cli): `iaf index` + SqliteBacktestIndex — epic #540 phase 2
feat(backtest): BacktestIndexRow DTO + Backtest.index_row() — epic #540 phase 1
Closes the remaining Phase 2 gaps for epic #540:
- Backtest.scalar_summary() alias (canonical Phase 1 naming).
- SqliteBacktestIndex tracks bundle_mtime_ns + bundle_size (schema v2); is_up_to_date() lets the indexer skip unchanged bundles.
- build_index(..., incremental=True) skips up-to-date bundles by default; 'iaf index --rebuild' forces a full reindex.
- 4 new tests covering skip, re-ingest on mtime bump, --rebuild, and the scalar_summary alias.
- scripts/bench_540_phase2.py: acceptance benchmark. At 12,500 bundles: cold build 86s, incremental 536ms, list top-20 in 8.3ms (12x under the 100ms target), index footprint 2.5 MiB.
- examples/storage_layer_demo/: end-to-end walkthrough of write -> index -> list -> rank -> open-with-summary-only, plus inline backtest report.
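The skip-unchanged-bundles check can be sketched as a comparison of a recorded (mtime_ns, size) pair against the file's current stat result. The function names and the in-memory `recorded` dict below are hypothetical; the message only states that the index stores bundle_mtime_ns and bundle_size.

```python
import os
import tempfile


def fingerprint(path):
    """Return the (mtime_ns, size) pair used to detect a changed bundle."""
    stat = os.stat(path)
    return stat.st_mtime_ns, stat.st_size


def is_up_to_date(recorded, path):
    """True when the recorded fingerprint still matches the file on disk."""
    return recorded.get(path) == fingerprint(path)


bundle = os.path.join(tempfile.mkdtemp(), "run.iafbt")
with open(bundle, "wb") as fh:
    fh.write(b"payload")

recorded = {bundle: fingerprint(bundle)}  # stands in for the index columns
unchanged = is_up_to_date(recorded, bundle)

# Bump the modification time by one second to simulate a rewritten bundle.
stat = os.stat(bundle)
os.utime(bundle, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
after_touch = is_up_to_date(recorded, bundle)
```

A stat call per bundle is far cheaper than decoding it, which is consistent with the gap between the cold and incremental build times quoted above.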
Introduces the storage seam that decouples *where* a backtest is persisted from the rest of the framework. Phase 3a is intentionally scoped to the Protocol + a thin adapter over today's .iafbt layout; LocalTieredStore (Tier-2 Parquet + Tier-3 chunks) and FinterionStore land in follow-up PRs.
- BacktestStore Protocol: write / open / exists / delete / iter_handles / iter_index_rows / __len__ / __contains__. Mirrors today's Backtest.save_bundle / Backtest.open semantics so LocalDirStore is a 1:1 adapter.
- StoreHandle: opaque str token (relative bundle path for LocalDirStore; uuid7 run_id for the upcoming tiered stores).
- StoreError + StoreHandleNotFoundError.
- Optional capability mixin SupportsCopyFrom, declared as a separate runtime_checkable Protocol so 'iaf migrate-store' (Phase 3d) and 'finterion push' (closed-source) can isinstance-test for it. Future capabilities (SupportsRelations for the strategy/version/report graph, SupportsContentAddressedChunks for Tier-3 dedup) follow the same pattern.
- LocalDirStore: handle = bundle path relative to the root, so the store stays portable across moves. Sidecar SqliteBacktestIndex (built lazily and incrementally, with the same machinery as 'iaf index') backs iter_index_rows so listing does not re-decode bundles. Path-traversal guards reject handles that escape the root.
- Tests (19, all passing): Protocol/SupportsCopyFrom conformance, round-trip, summary_only, default-handle derivation, sidecar index caching, copy_from with and without a handle subset, handle normalisation, path-traversal rejection, missing-handle error, delete idempotency.
Targeted suite (store + index + cli): 86 / 86 passing.
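Two of the ideas above lend themselves to a short sketch: an optional capability declared as a runtime_checkable Protocol, and a guard that rejects handles escaping the store root. All class and function names below are hypothetical and simplified; the real copy_from signature and guard logic may differ.

```python
from pathlib import Path
from typing import Iterable, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsCopyFromSketch(Protocol):
    """Optional capability: callers isinstance-test before relying on it."""

    def copy_from(self, src: object,
                  handles: Optional[Iterable[str]] = None) -> int: ...


class DirOnlyStore:
    """A store that does not offer the optional capability."""


class CopyCapableStore:
    def copy_from(self, src, handles=None):
        return 0  # placeholder: a real store would copy bundles here


def resolve_handle(root, handle):
    """Map a relative handle to a path, rejecting escapes from the root."""
    root = Path(root).resolve()
    candidate = (root / handle).resolve()
    if root not in candidate.parents:
        raise ValueError(f"handle escapes store root: {handle!r}")
    return candidate


def is_safe_handle(root, handle):
    try:
        resolve_handle(root, handle)
    except ValueError:
        return False
    return True


can_copy = isinstance(CopyCapableStore(), SupportsCopyFromSketch)
cannot_copy = isinstance(DirOnlyStore(), SupportsCopyFromSketch)
```

Note that a runtime_checkable isinstance test only checks that the method exists, not its signature, so a conformance test suite is still needed to catch behavioural divergence.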
…540 phase 3b)
Second slice of Phase 3 of epic #540. Adds a real tiered storage implementation that ships the analytics value (cross-run DuckDB / Polars queries) without yet replacing the canonical .iafbt bundle.
Layout under <root>:
  index.sqlite                                  Tier-1 (always in sync)
  bundles/<handle>.iafbt                        canonical bytes
  parquet/portfolio_snapshots/run_id=<h>/...    Tier-2 hive-partitioned
  parquet/trades/run_id=<h>/...                 Tier-2
  parquet/orders/run_id=<h>/...                 Tier-2
Phase 3b deliberately keeps the bundle as the canonical representation; Tier-2 sidecars are auxiliary, written best-effort, and a malformed sidecar never blocks a write or a read. This trivially preserves byte-identical Backtest round-trips today. Byte-identical Tier-2 -> Backtest reassembly (no bundle on the read path) is Phase 3d.
- decompose.py: Backtest -> flat record lists for snapshots / trades / orders, adding run_id and window_name columns so downstream tools group cleanly across walk-forward windows. Extension point for metric_series and any future kind is the DATASETS tuple.
- LocalTieredStore: implements BacktestStore + SupportsCopyFrom. write() saves the bundle, upserts the Tier-1 row, and writes hive-partitioned Parquet sidecars per dataset. delete() removes all three tiers. iter_index_rows() serves from SQLite directly. rebuild_index() recreates Tier-1 from the bundles (useful after a software upgrade that adds new index columns).
- scan('portfolio_snapshots' | 'trades' | 'orders') returns a pyarrow.dataset.Dataset that DuckDB / Polars can query across every run with partition pruning on run_id.
- 15 new tests: Protocol + SupportsCopyFrom conformance, three-tier layout, handle normalisation, round-trip, summary_only, Tier-1 always-in-sync (write/delete/len), Tier-2 cross-run scan, copy_from from LocalDirStore, rebuild_index, missing-handle errors. Includes a synthetic-records test that asserts hive partitions are written and that scan() returns the expected rows + columns.
Targeted suite (backtest_store + backtest_index + cli): 101 / 101 passing.
…e (epic #540 phase 3c)
Wires LocalTieredStore into the existing OHLCV side-store machinery so identical (symbol, timeframe) Parquet bytes are written exactly once and shared across every bundle that references them.
- write() now routes save_bundle's OHLCV writes to <root>/ohlcv/ whenever backtest.ohlcv is non-empty. The bundle envelope keeps its content-addressed manifest unchanged, so old bundles remain readable.
- open() forwards the same shared directory to open_bundle so OHLCV lookups resolve regardless of what path the bundle was originally written with.
- delete() intentionally does NOT touch ohlcv/. Chunks are globally shared; orphans are reclaimed via garbage_collect_ohlcv(dry_run=…).
- Introspection helpers required by the dedup-upload protocol (docs/design/ohlcv-dedup-protocol.md):
  * iter_ohlcv_hashes() / ohlcv_referenced_hashes()
  * ohlcv_stored_hashes()
  * ohlcv_stats() -> stored_blobs / stored_bytes / referenced_blobs / orphan_blobs / missing_blobs
  * garbage_collect_ohlcv(dry_run=False)
Manifests are decoded straight from the bundle envelope (_decode_payload) so the cost is one msgpack read per bundle, with no full Backtest instantiation.
9 new tests:
- No OHLCV -> no chunk dir created.
- Identical OHLCV is stored once across distinct handles (dedup).
- Different OHLCV yields separate chunks.
- Round-trip via store.open() resolves OHLCV from the shared dir.
- delete() keeps still-referenced chunks; orphans only after GC.
- garbage_collect_ohlcv(dry_run=True) lists without deleting; the real call removes them.
- iter_ohlcv_hashes() emits per-reference; ohlcv_referenced_hashes() dedups.
- Hash strings are 64-char lowercase hex (matches the upload protocol spec).
Targeted suite (backtest_store + backtest_index + cli): 110 / 110 passing.
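The dedup and garbage-collection behaviour described above follows the usual content-addressing pattern, sketched here with the standard library. SHA-256 is an assumption (the message only says hashes are 64-char lowercase hex), and `put_chunk` / `garbage_collect` are hypothetical names, not the store's API.

```python
import hashlib
import tempfile
from pathlib import Path


def put_chunk(chunk_dir, payload):
    """Store payload under its content hash; identical bytes are written once."""
    digest = hashlib.sha256(payload).hexdigest()  # assumed hash function
    target = chunk_dir / digest
    if not target.exists():
        chunk_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
    return digest


def garbage_collect(chunk_dir, referenced, dry_run=False):
    """Return orphaned chunk names; delete them unless dry_run is set."""
    orphans = sorted(p.name for p in chunk_dir.iterdir()
                     if p.name not in referenced)
    if not dry_run:
        for name in orphans:
            (chunk_dir / name).unlink()
    return orphans


chunk_dir = Path(tempfile.mkdtemp()) / "ohlcv"

# Two bundles reference the same OHLCV bytes; a third chunk is unique.
manifest_a = [put_chunk(chunk_dir, b"BTC/EUR 1h candles")]
manifest_b = [put_chunk(chunk_dir, b"BTC/EUR 1h candles"),
              put_chunk(chunk_dir, b"ETH/EUR 1h candles")]
stored_after_write = len(list(chunk_dir.iterdir()))

# Deleting bundle B leaves its unique chunk orphaned until GC runs.
referenced = set(manifest_a)
would_remove = garbage_collect(chunk_dir, referenced, dry_run=True)
stored_after_dry_run = len(list(chunk_dir.iterdir()))
removed = garbage_collect(chunk_dir, referenced)
stored_after_gc = len(list(chunk_dir.iterdir()))
```

Separating delete from reclamation keeps delete cheap and safe: a chunk can only be removed once a scan confirms that no remaining manifest references it.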
…phase 3d)
Closes the open Phase 3 deliverables that turn the new store abstraction into something users can actually move data through:
- iaf migrate-store --from <kind> --src <path> --to <kind> --dst <path> delegates to dst.copy_from(src), so it is incremental, restartable, and tier-aware: when the destination is a local-tiered store, identical OHLCV chunks are written exactly once across the destination regardless of how many bundles reference them (Phase 3c invariant). Optional --handles subset selector for partial migrations.
- migrate_store() programmatic helper for in-process pipelines.
- BacktestStoreContractTest: a parameterised conformance suite that runs identical scenarios against every concrete store implementation (LocalDirStore, LocalTieredStore today, future remote stores tomorrow). Catches divergence as a failing subTest with the store class name in the label. Covers Protocol + SupportsCopyFrom conformance, write/open round-trip, summary_only, exists, idempotent delete, missing-handle errors, listing, iter_index_rows, and copy_from with both full and subset handle selection.
- bug fix in LazyOhlcvDict: items() and values() were inheriting the empty backing dict's iteration, so any code path that did 'for k, v in bt.ohlcv.items()' silently dropped every blob after a tiered round-trip. Now both methods walk the manifest and materialise lazily on access. Caught by the migration dedup test.
Note on what is *not* in this PR: byte-identical Tier-2 -> Backtest reassembly (so .iafbt could become export-only) is intentionally deferred. The current model where the bundle is canonical and Tier-1/2/3 are derived is simpler, preserves the existing round-trip contract bit-for-bit, and is what every test in the contract suite already exercises against both stores.
Targeted suite (backtest_store + backtest_index + cli): 128 / 128 passing + 26 subTests. Full non-scenario suite: 1705 / 1705 passing with no regressions from the LazyOhlcvDict fix.
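The LazyOhlcvDict failure mode is easy to reproduce with any dict subclass that materialises values lazily in __getitem__ but leaves iteration to the (still empty) backing dict. The two classes below are hypothetical reductions of the pattern, not the actual implementation.

```python
class BrokenLazyDict(dict):
    """Loads values on first lookup, but iteration sees only loaded keys."""

    def __init__(self, manifest, loader):
        super().__init__()
        self._manifest = manifest  # key -> content hash
        self._loader = loader      # content hash -> decoded blob

    def __getitem__(self, key):
        if not super().__contains__(key):
            super().__setitem__(key, self._loader(self._manifest[key]))
        return super().__getitem__(key)


class FixedLazyDict(BrokenLazyDict):
    """Iteration walks the manifest and materialises each value on access."""

    def __iter__(self):
        return iter(self._manifest)

    def __len__(self):
        return len(self._manifest)

    def keys(self):
        return self._manifest.keys()

    def values(self):
        return (self[key] for key in self._manifest)

    def items(self):
        return ((key, self[key]) for key in self._manifest)


manifest = {"BTC/EUR-1h": "hash-1", "ETH/EUR-1h": "hash-2"}


def loader(content_hash):
    return f"blob for {content_hash}"  # stands in for a Parquet read


broken_items = list(BrokenLazyDict(manifest, loader).items())
fixed_items = list(FixedLazyDict(manifest, loader).items())
```

With the broken variant, a loop over items() runs zero times and raises nothing, which is why the bug surfaced only through a test that compared contents after a round trip.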
…L dashboard
Two new sections in examples/storage_layer_demo/demo.py:
- 6b. _print_backtest_full_report(): per-run breakdown (window / days / orders / trades / positions / final_value), end-of-backtest positions snapshot, first few trades, and a richer slice of per-run BacktestMetrics (cagr, annual_volatility, max_drawdown_absolute, gross_profit/loss, best_trade, max consecutive wins/losses) with safe n/a fallbacks. Built on top of the existing compact _print_backtest_report().
- 9. Storage layer -> HTML dashboard: wires the Tier-1 SQLite index, the Tier-2 LocalDirStore and the BacktestReport HTML dashboard end-to-end. rank_index() picks the top-N bundles from SQLite alone, store.open(handle) materialises just those via the BacktestStore protocol, BacktestReport(backtests=[...]).save() renders a self-contained interactive HTML dashboard. Demonstrates that the new storage layer plugs straight into the existing reporting stack with no glue code.
README updated to describe both new sections.
- New feature bullet linking the storage_layer_demo
- New '<details>' section explaining Tier-1 SQLite index, Tier-2 BacktestStore adapters (LocalDirStore / LocalTieredStore) and Tier-3 content-addressed OHLCV chunks
- Python + CLI workflow showing the canonical pattern: build_index -> rank_index -> store.open(handle) -> BacktestReport(backtests=[...]).save(...)
- Links to examples/storage_layer_demo/ for the runnable end-to-end
Add a 'From backtest results to a report' subsection under 'Backtest Analysis & Dashboard' demonstrating the canonical paths from a Backtest (or list of Backtests) to a BacktestReport:
- single event-driven app.run_backtest(...)
- a sweep via app.run_vector_backtests(..., backtest_storage_directory=...)
- loading a persisted folder back via BacktestReport.open(directory_path=..., workers=-1)
Cross-links to the Backtest Storage Layer section for sweeps that scale into the thousands.
New 'Getting Started/Backtest Storage Layer' page covering:
- mental model (Tier-1 SQLite / Tier-2 Parquet / canonical .iafbt /
Tier-3 content-addressed OHLCV)
- the BacktestStore protocol and when to pick LocalDirStore vs
LocalTieredStore
- the canonical 5-step developer workflow:
run sweep -> build index -> filter/rank in SQLite ->
materialise winners -> render report
- 'Avoid overloading your report.html': size-vs-bundle table,
the BacktestReport.open(directory_path=...) anti-pattern,
rules of thumb for narrow vs mega-reports
- pointers to examples/storage_layer_demo and the migrate-store CLI
Wired into the Getting Started sidebar between backtest-reports and
deployment, and added a tip block on backtest-reports.md pointing to
it for users with thousands of backtests.
feat(bundle): format v2 + groundwork for tiered storage rewrite (epic #540)
feat(store): epic #540 phase 3 (a-d) + docs/demo — rescue merge to dev
…e-scale analysis
Replaces the one-line cross-link with a runnable Python snippet (build_index -> rank_index -> store.open(handle) -> BacktestReport(backtests=[...]).save(...)) so users see exactly how to keep their report.html from blowing up on large sweeps.
Adds a callout explaining that for team-scale workflows (searching, filtering and annotating across thousands of backtests in a server-backed UI), the storage layer is best paired with a quant infrastructure provider such as Finterion.
… Finterion infographic
…report builder mock
…with partner-focused intro
…s, remove batch_one examples
- Fix fill_missing_timeseries_data writing back to source CSV during backtest preparation (ccxt.py, backtest_service.py: save_to_file=False)
- Fix test_backtest_report.py and test_backtesting.py saving backtest reports into fixture directories (use tempfile instead)
- Fix PyArrow schema conflict by dropping run_id column from Parquet sidecar payload in local_tiered_store.py
- Add missing public API exports: Pipeline, Factor, CustomFactor, Filter, AverageDollarVolume, BacktestIndex, BUNDLE_FORMAT_VERSION, migrate_backtests, load_ipython_extension
- Add CLI --prune, --archive-dir, --dry-run options to rank command
- Add strategy showcase callout to README
- Call teardown_sqlalchemy() + gc.collect() before os.remove() in initialize_storage() to release file locks on Windows
- Restore backslash-to-forward-slash conversion in SQLite URIs for Windows path compatibility
- Export teardown_sqlalchemy from infrastructure module
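The backslash-to-forward-slash conversion can be illustrated with a one-line helper. The function name `sqlite_uri` is hypothetical, and the sketch assumes the SQLAlchemy-style `sqlite:///` prefix for file-backed databases.

```python
def sqlite_uri(database_path):
    """Build a SQLite URI, normalising Windows backslashes to forward slashes."""
    return "sqlite:///" + database_path.replace("\\", "/")


windows_uri = sqlite_uri("C:\\data\\backtests\\app.sqlite3")
posix_uri = sqlite_uri("/var/data/app.sqlite3")
```

POSIX paths pass through unchanged, so the same helper can be used on every platform.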
Summary
Merge dev branch into main — 51 commits covering Pipeline API, tiered backtest storage, CLI improvements, bug fixes, and test hardening.
Key changes
New features
- Pipeline, Factor, CustomFactor, Filter, AverageDollarVolume
- BacktestIndex for fast backtest lookups
- --prune, --archive-dir, --dry-run options for rank command
- load_ipython_extension for Jupyter magic support
Bug fixes
- fill_missing_timeseries_data writing back to source CSV during backtest prep (save_to_file=False)
- PyArrow schema conflict (dropped run_id column from Parquet sidecar payload)
Tests