
Release: dev → main #550

Merged
MDUYN merged 36 commits into main from dev
May 18, 2026

Conversation

Collaborator

@MDUYN MDUYN commented May 16, 2026

Summary

Merge dev branch into main — 51 commits covering Pipeline API, tiered backtest storage, CLI improvements, bug fixes, and test hardening.

Key changes

New features

  • Pipeline API: Pipeline, Factor, CustomFactor, Filter, AverageDollarVolume
  • Tiered backtest storage (SQLite + Parquet + OHLCV)
  • BacktestIndex for fast backtest lookups
  • CLI --prune, --archive-dir, --dry-run options for rank command
  • load_ipython_extension for Jupyter magic support

Bug fixes

  • Fix fill_missing_timeseries_data writing back to source CSV during backtest prep (save_to_file=False)
  • Fix tests mutating fixture files (use tempdir for backtest report saves)
  • Fix PyArrow schema conflict (drop run_id column from Parquet sidecar payload)
  • Add missing public API exports

Tests

  • 1844 tests pass, 43 skipped
  • Zero fixture mutations after full suite run
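The "zero fixture mutations" check can be illustrated with a minimal sketch, assuming the fix amounts to copying fixtures into a temporary directory before anything writes to them. `save_report` and `digest_tree` are hypothetical stand-ins, not functions from this repository.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest_tree(directory):
    """Hash every file path and its bytes so any mutation changes the digest."""
    digest = hashlib.sha256()
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(directory).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def save_report(directory):
    """Hypothetical stand-in for code under test that writes into a directory."""
    (Path(directory) / "report.html").write_text("<html></html>")


with tempfile.TemporaryDirectory() as workspace:
    # Fake fixture directory, created here only so the sketch is self-contained.
    fixtures = Path(workspace) / "fixtures"
    fixtures.mkdir()
    (fixtures / "backtest.json").write_text("{}")
    before = digest_tree(fixtures)

    with tempfile.TemporaryDirectory() as output_dir:
        # Work on a throwaway copy so the writer never touches the fixture itself.
        working_copy = Path(output_dir) / "copy"
        shutil.copytree(fixtures, working_copy)
        save_report(working_copy)
        report_written = (working_copy / "report.html").exists()

    after = digest_tree(fixtures)
```

Comparing `before` and `after` across a whole suite run is one way to assert that no test wrote into a fixture directory.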

MDUYN added 30 commits May 10, 2026 19:44

Introduces .iafbt format version 2 (default writer) — see
docs/design/bundle-format-v2.md for the public spec.

Changes:
- Backtest dataclass gains an engine_type field ('vector' | 'event' |
  None) plus vector_runs / event_runs / vector_metrics / event_metrics
  derived properties. Engine-tagged at construction by VectorBacktest
  and EventBacktest service paths; legacy / unknown bundles keep the
  engine-agnostic backtest_runs / backtest_summary view.
- Bundle envelope v2 routes runs into engine-specific slots
  (vector_runs / event_runs) based on engine_type. v1 envelopes
  remain readable indefinitely.
- Eight heavy metric time-series (equity_curve, drawdown_series,
  cumulative_return_series, rolling_sharpe_ratio, monthly_returns,
  yearly_returns, twr_equity_curve, twr_drawdown_series) are
  extracted into embedded Parquet blobs with int64 epoch-ms
  timestamps and float64 values, replacing the v1 inline
  list-of-(value, ISO-string) shape.
- save_bundle gains format_version (default 2, accepts 1 for
  downgrade) and float32_ohlcv (off by default) options.
- open_bundle and Backtest.open gain summary_only=True for bulk
  listing pipelines that skip the per-run blob decode.
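To make the shape change concrete, here is a minimal sketch of turning a v1-style inline list of (value, ISO-string) pairs into two typed columns (int64 epoch-ms timestamps, float64 values). It uses only the standard library and stops short of Parquet encoding. `to_columnar` is a hypothetical helper, and treating naive timestamps as UTC is an assumption.

```python
from array import array
from datetime import datetime, timezone


def to_columnar(series):
    """Split (value, ISO-8601 string) pairs into two typed columns."""
    timestamps_ms = array("q")  # signed 64-bit integers: epoch milliseconds
    values = array("d")         # 64-bit floats
    for value, iso_timestamp in series:
        moment = datetime.fromisoformat(iso_timestamp)
        if moment.tzinfo is None:
            # Assumption: timestamps without an offset are UTC.
            moment = moment.replace(tzinfo=timezone.utc)
        timestamps_ms.append(round(moment.timestamp() * 1000))
        values.append(float(value))
    return timestamps_ms, values


# v1-style inline shape: one (value, ISO string) tuple per point.
equity_curve_v1 = [
    (1000.0, "2026-01-01T00:00:00+00:00"),
    (1012.5, "2026-01-01T01:00:00+00:00"),
]
timestamps_ms, values = to_columnar(equity_curve_v1)
```

Each point then costs a fixed 16 bytes before compression, instead of a float plus a repeated timestamp string.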

OHLCV side store, content addressing, and LazyOhlcvDict are
unchanged. Public OHLCV dedup protocol spec added at
docs/design/ohlcv-dedup-protocol.md for upload-style integrations.

17 bundle tests pass (8 new v2-specific cases); full suite passes
1681/1681 against unchanged behaviour for v1 bundles.

Measured on real production-shape bundles (~570 KB at level 7,
7 runs × 2192 hourly portfolio snapshots each):

  level 7  →  ~570 KB
  level 19 →  ~489 KB  (−14%)
  level 22 →  ~489 KB  (saturated)

Decode speed is unchanged in zstd's design (decoder is independent
of encoder level). For a 12,500-bundle workload (~64 GB at level 7)
this trims the archive to ~55 GB at zero behavioural cost.

Level 19 is the highest level still in the standard tier (no
--ultra flag, no special memory window), so the bytes are readable
by any stock zstd reader.
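The percentages above follow directly from the measured sizes. A quick check, assuming the whole archive shrinks by the same ratio as the sampled bundle:

```python
# Measured bundle sizes from the commit message, in KB.
LEVEL_7_KB = 570
LEVEL_19_KB = 489

# Relative saving when moving from level 7 to level 19.
saving = 1 - LEVEL_19_KB / LEVEL_7_KB

# Assumption: the 64 GB archive compresses by the same ratio as the sample.
archive_level_7_gb = 64
archive_level_19_gb = archive_level_7_gb * LEVEL_19_KB / LEVEL_7_KB

print(f"saving: {saving:.1%}")
print(f"archive at level 19: {archive_level_19_gb:.1f} GB")
```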
Captures the storage architecture proposal that came out of the
v8.9 size-review measurements:

  - Why per-file compression has hit its ceiling (~14% headroom max)
  - Three-tier model: Index (SQL) + Columnar bulk (Parquet) +
    Content-addressed chunks (S3-compatible)
  - .iafbt demoted from storage primitive to deterministic export
    format; v1 and v2 remain readable forever
  - Local-only OSS path (LocalTieredStore) keeps the framework
    self-contained without a server
  - Phased migration: v8.10 read-side helpers, v8.11 store
    abstraction, Finterion remote store closed-source

Companion to docs/design/bundle-format-v2.md and
docs/design/ohlcv-dedup-protocol.md.
…phase 1)

Lift the existing untyped flat-row index helper into a public, typed
Tier-1 contract:

* New BacktestIndexRow dataclass (domain/backtesting/backtest_index_row.py)
  with identity / provenance / config / nested summary_metrics +
  forward-compat extras. Lossless to_flat_dict / from_flat_dict round
  trip for Parquet, SQL and JSON sinks.
* New Backtest.index_row(bundle_path=None) method. Builds without
  decoding any v2 Parquet metric blobs, so it works against bundles
  loaded with Backtest.open(..., summary_only=True). This is the
  fast read path the upcoming 'iaf index' CLI (phase 2) and any
  tiered store implementation (phase 3) will rely on.
* _backtest_to_index_row in backtest_utils now delegates to
  BacktestIndexRow.to_flat_dict() so the wire shape and the in-memory
  shape are a single source of truth (no behavioural change for the
  existing index.parquet sidecar).
* Re-export BacktestIndexRow from the domain and top-level packages.
* docs/design/tiered-backtest-storage.md §3.1 + roadmap row updated
  to reference the typed contract.
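A rough sketch of the flat round trip described above, with nested summary metrics flattened into prefixed keys and unknown columns collected in extras. The class names, the fields, and the `summary_` prefix at this layer are illustrative assumptions, not the actual BacktestIndexRow definition.

```python
import math
from dataclasses import dataclass, field, fields


@dataclass
class SummaryMetricsSketch:
    sharpe_ratio: float = math.nan
    total_return: float = math.nan


@dataclass
class IndexRowSketch:
    run_id: str
    strategy_name: str
    summary_metrics: SummaryMetricsSketch = field(default_factory=SummaryMetricsSketch)
    extras: dict = field(default_factory=dict)

    def to_flat_dict(self):
        flat = {"run_id": self.run_id, "strategy_name": self.strategy_name}
        # Nested metrics become top-level, prefixed scalar columns.
        for metric in fields(self.summary_metrics):
            flat[f"summary_{metric.name}"] = getattr(self.summary_metrics, metric.name)
        flat.update(self.extras)  # unknown columns travel alongside known ones
        return flat

    @classmethod
    def from_flat_dict(cls, flat):
        remaining = dict(flat)
        metrics = SummaryMetricsSketch(**{
            metric.name: remaining.pop(f"summary_{metric.name}", math.nan)
            for metric in fields(SummaryMetricsSketch)
        })
        return cls(
            run_id=remaining.pop("run_id"),
            strategy_name=remaining.pop("strategy_name"),
            summary_metrics=metrics,
            extras=remaining,  # anything unrecognised lands in extras
        )


row = IndexRowSketch("run-1", "sma-cross", SummaryMetricsSketch(sharpe_ratio=1.4))
flat = row.to_flat_dict()
# Simulate a column written by a newer version of the schema.
restored = IndexRowSketch.from_flat_dict({**flat, "future_column": 7})
```

NaN needs care in round-trip tests because `nan != nan`; comparing with `math.isnan` avoids a false failure.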

Tests: 5 new (in-memory derivation, flat round-trip incl. NaN, unknown
columns landing in extras, derivation from a summary_only=True bundle
load). Full backtests suite green (29/29).
CI flake8 flags the import as unused without the corresponding entry
in __all__. Phase 1 missed adding it.
Tier-1 SQLite index over a folder of .iafbt bundles, building on the
phase-1 BacktestIndexRow contract.

* services/backtest_index/sqlite_index.py: SqliteBacktestIndex with
  create / open / upsert / upsert_many / iter_rows / query. Every
  scalar field of BacktestSummaryMetrics is promoted to its own
  summary_<name> SQL column so analysts can filter without opening
  any bundle (e.g. WHERE summary_sharpe_ratio > 1.0). parameters /
  strategy_ids / extras round-trip as JSON text. WAL mode for safe
  concurrent reads. Forward-only additive schema migration via
  PRAGMA user_version.
* cli/index_command.py + new 'iaf index <bundle-dir>' click command:
  walks the directory, opens each bundle with summary_only=True (no
  Parquet metric-blob decode), derives BacktestIndexRow, upserts.
  --output, --absolute-paths and --no-progress flags.
* docs/design/tiered-backtest-storage.md updated with phase-2 status.
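The index design above (promoted summary_<name> columns, JSON text for structured fields, WAL mode, PRAGMA user_version for migrations) can be sketched with the standard sqlite3 module. The table name, column set and `upsert` helper are assumptions for illustration; this is not the SqliteBacktestIndex implementation.

```python
import json
import os
import sqlite3
import tempfile

SCHEMA_VERSION = 1

db_path = os.path.join(tempfile.mkdtemp(), "index.sqlite")
connection = sqlite3.connect(db_path)
# WAL lets readers proceed while a writer is active.
journal_mode = connection.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Forward-only migration: only create what the stored version is missing.
if connection.execute("PRAGMA user_version").fetchone()[0] < SCHEMA_VERSION:
    connection.execute(
        "CREATE TABLE IF NOT EXISTS backtests ("
        " bundle_path TEXT PRIMARY KEY,"
        " summary_sharpe_ratio REAL,"
        " parameters TEXT)"
    )
    connection.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")


def upsert(bundle_path, sharpe_ratio, parameters):
    """Insert a row, replacing any earlier row with the same bundle_path."""
    connection.execute(
        "INSERT OR REPLACE INTO backtests VALUES (?, ?, ?)",
        (bundle_path, sharpe_ratio, json.dumps(parameters)),
    )


upsert("a.iafbt", 0.4, {"fast": 10})
upsert("b.iafbt", 1.7, {"fast": 20})
upsert("a.iafbt", 0.6, {"fast": 10})  # same key: replaces the earlier row
connection.commit()

# Filter on a promoted metric column without opening any bundle.
strong = connection.execute(
    "SELECT bundle_path, parameters FROM backtests"
    " WHERE summary_sharpe_ratio > 1.0 ORDER BY summary_sharpe_ratio DESC"
).fetchall()
row_count = connection.execute("SELECT COUNT(*) FROM backtests").fetchone()[0]
stored_version = connection.execute("PRAGMA user_version").fetchone()[0]
```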

Tests: 12 new (8 SqliteBacktestIndex unit tests + 4 CLI integration
tests via click.testing.CliRunner). Full repo suite green
(1698 passed, 42 skipped). Lint clean.
feat(cli): `iaf index` + SqliteBacktestIndex — epic #540 phase 2
feat(backtest): BacktestIndexRow DTO + Backtest.index_row() — epic #540 phase 1
Closes the remaining Phase 2 gaps for epic #540:

- Backtest.scalar_summary() alias (canonical Phase 1 naming).
- SqliteBacktestIndex tracks bundle_mtime_ns + bundle_size (schema v2);
  is_up_to_date() lets the indexer skip unchanged bundles.
- build_index(..., incremental=True) skips up-to-date bundles by default;
  'iaf index --rebuild' forces a full reindex.
- 4 new tests covering skip, re-ingest on mtime bump, --rebuild, and the
  scalar_summary alias.
- scripts/bench_540_phase2.py: acceptance benchmark. At 12,500 bundles:
  cold build 86s, incremental 536ms, list top-20 in 8.3ms (12x under the
  100ms target), index footprint 2.5 MiB.
- examples/storage_layer_demo/: end-to-end walkthrough of write -> index
  -> list -> rank -> open-with-summary-only, plus inline backtest report.
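The incremental skip can be sketched as a comparison of the stored (mtime_ns, size) pair against a fresh `os.stat`. The in-memory dict stands in for the two SQLite columns, and both function bodies are assumptions that only borrow the names used above.

```python
import os
import tempfile

indexed = {}  # path -> (mtime_ns, size); stands in for the SQLite columns


def is_up_to_date(path):
    """True when the file's mtime and size match what was last indexed."""
    stat = os.stat(path)
    return indexed.get(path) == (stat.st_mtime_ns, stat.st_size)


def build_index(paths, incremental=True):
    """Return the paths that were (re)ingested on this pass."""
    ingested = []
    for path in paths:
        if incremental and is_up_to_date(path):
            continue  # unchanged bundle: skip the expensive open
        stat = os.stat(path)
        indexed[path] = (stat.st_mtime_ns, stat.st_size)
        ingested.append(path)
    return ingested


bundle = os.path.join(tempfile.mkdtemp(), "run.iafbt")
with open(bundle, "wb") as handle:
    handle.write(b"payload")

first_pass = build_index([bundle])    # cold build ingests the bundle
second_pass = build_index([bundle])   # nothing changed, nothing ingested

# Bump the mtime by one second to simulate a rewritten bundle.
stat = os.stat(bundle)
os.utime(bundle, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
third_pass = build_index([bundle])

forced_pass = build_index([bundle], incremental=False)  # like --rebuild
```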
Introduces the storage seam that decouples *where* a backtest is
persisted from the rest of the framework. Phase 3a is intentionally
scoped to the Protocol + a thin adapter over today's .iafbt layout;
LocalTieredStore (Tier-2 Parquet + Tier-3 chunks) and FinterionStore
land in follow-up PRs.

- BacktestStore Protocol: write / open / exists / delete / iter_handles
  / iter_index_rows / __len__ / __contains__. Mirrors today's
  Backtest.save_bundle / Backtest.open semantics so LocalDirStore is
  a 1:1 adapter.
- StoreHandle: opaque str token (relative bundle path for
  LocalDirStore; uuid7 run_id for the upcoming tiered stores).
- StoreError + StoreHandleNotFoundError.
- Optional capability mixin SupportsCopyFrom — declared as a separate
  runtime_checkable Protocol so 'iaf migrate-store' (Phase 3d) and
  'finterion push' (closed-source) can isinstance-test for it. Future
  capabilities (SupportsRelations for the strategy/version/report
  graph, SupportsContentAddressedChunks for Tier-3 dedup) follow the
  same pattern.
- LocalDirStore: handle = bundle path relative to the root, so the
  store stays portable across moves. Sidecar SqliteBacktestIndex
  (built lazily, incrementally — same machinery as 'iaf index') backs
  iter_index_rows so listing does not re-decode bundles. Path-traversal
  guards reject handles that escape the root.
- Tests (19, all passing): Protocol/SupportsCopyFrom conformance,
  round-trip, summary_only, default-handle derivation, sidecar index
  caching, copy_from with and without a handle subset, handle
  normalisation, path-traversal rejection, missing-handle error,
  delete idempotency.
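Two of the ideas above, the runtime-checkable capability Protocol and the path-traversal guard, can be sketched as follows. The class names and `path_for` are hypothetical, and the real Protocol has more members than shown here.

```python
import tempfile
from pathlib import Path
from typing import Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class SupportsCopyFromSketch(Protocol):
    """Optional capability: isinstance() only checks that copy_from exists."""

    def copy_from(self, source: object, handles: Optional[Sequence[str]] = None) -> int:
        ...


class DirStoreSketch:
    def __init__(self, root: str) -> None:
        self.root = Path(root).resolve()

    def path_for(self, handle: str) -> Path:
        """Map a relative handle to a path, rejecting anything outside the root."""
        candidate = (self.root / handle).resolve()
        if self.root not in candidate.parents:
            raise ValueError(f"handle escapes the store root: {handle!r}")
        return candidate

    def copy_from(self, source, handles=None) -> int:
        return 0  # copying itself is out of scope for this sketch


class ReadOnlyStoreSketch:
    """Has no copy_from, so it does not satisfy the capability Protocol."""


store = DirStoreSketch(tempfile.mkdtemp())
can_copy = isinstance(store, SupportsCopyFromSketch)
cannot_copy = isinstance(ReadOnlyStoreSketch(), SupportsCopyFromSketch)
inside = store.path_for("sweeps/run-1.iafbt")

try:
    store.path_for("../outside.iafbt")
    escaped = True
except ValueError:
    escaped = False
```

Keeping the capability as a separate Protocol means a caller can probe for it at runtime without every store having to implement it.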

Targeted suite (store + index + cli): 86 / 86 passing.
…540 phase 3b)

Second slice of Phase 3 of epic #540. Adds a real tiered storage
implementation that ships the analytics value (cross-run DuckDB /
Polars queries) without yet replacing the canonical .iafbt bundle.

Layout under <root>:
  index.sqlite                                    Tier-1 (always in sync)
  bundles/<handle>.iafbt                          canonical bytes
  parquet/portfolio_snapshots/run_id=<h>/...      Tier-2 hive-partitioned
  parquet/trades/run_id=<h>/...                   Tier-2
  parquet/orders/run_id=<h>/...                   Tier-2

Phase 3b deliberately keeps the bundle as the canonical representation;
Tier-2 sidecars are auxiliary, written best-effort, and a malformed
sidecar never blocks a write or a read. This trivially preserves
byte-identical Backtest round-trips today. Byte-identical
Tier-2 -> Backtest reassembly (no bundle on the read path) is Phase 3d.

- decompose.py: Backtest -> flat record lists for snapshots / trades /
  orders, adding run_id and window_name columns so downstream tools
  group cleanly across walk-forward windows. Extension point for
  metric_series and any future kind is the DATASETS tuple.

- LocalTieredStore: implements BacktestStore + SupportsCopyFrom.
  write() saves the bundle, upserts the Tier-1 row, and writes
  hive-partitioned Parquet sidecars per dataset. delete() removes all
  three tiers. iter_index_rows() serves from SQLite directly.
  rebuild_index() recreates Tier-1 from the bundles (useful after a
  software upgrade that adds new index columns).

- scan('portfolio_snapshots' | 'trades' | 'orders') returns a
  pyarrow.dataset.Dataset that DuckDB / Polars can query across every
  run with partition pruning on run_id.

- 15 new tests: Protocol + SupportsCopyFrom conformance, three-tier
  layout, handle normalisation, round-trip, summary_only, Tier-1
  always-in-sync (write/delete/len), Tier-2 cross-run scan, copy_from
  from LocalDirStore, rebuild_index, missing-handle errors. Includes
  a synthetic-records test that asserts hive partitions are written
  and that scan() returns the expected rows + columns.
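A small sketch of the decomposition and the hive-style partition layout described above. The record fields, helper names and the `part-0.parquet` file name are assumptions, and no Parquet file is actually written.

```python
from pathlib import PurePosixPath


def decompose_trades(run_id, windows):
    """Flatten {window_name: [trade dicts]} into rows tagged for cross-run grouping."""
    return [
        {"run_id": run_id, "window_name": window_name, **trade}
        for window_name, trades in windows.items()
        for trade in trades
    ]


def partition_dir(dataset, run_id):
    """Hive-style partition directory: the key=value pair is part of the path."""
    return PurePosixPath("parquet") / dataset / f"run_id={run_id}"


def run_id_from_partition(path):
    """Recover the partition value from a path, as a partition-aware reader would."""
    for part in PurePosixPath(path).parts:
        if part.startswith("run_id="):
            return part.split("=", 1)[1]
    return None


rows = decompose_trades(
    "run-1",
    {
        "window-0": [{"symbol": "BTC/EUR", "net_gain": 12.5}],
        "window-1": [{"symbol": "BTC/EUR", "net_gain": -3.0}],
    },
)
target = partition_dir("trades", "run-1")
recovered = run_id_from_partition(target / "part-0.parquet")
```

Because the partition value lives in the directory name, a reader can prune whole runs by path alone. It also means a run_id column stored inside the file duplicates information the path already carries.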

Targeted suite (backtest_store + backtest_index + cli): 101 / 101 passing.
…e (epic #540 phase 3c)

Wires LocalTieredStore into the existing OHLCV side-store machinery
so identical (symbol, timeframe) Parquet bytes are written exactly
once and shared across every bundle that references them.

- write() now routes save_bundle's OHLCV writes to <root>/ohlcv/
  whenever backtest.ohlcv is non-empty. The bundle envelope keeps
  its content-addressed manifest unchanged, so old bundles remain
  readable.
- open() forwards the same shared directory to open_bundle so OHLCV
  lookups resolve regardless of what path the bundle was originally
  written with.
- delete() intentionally does NOT touch ohlcv/. Chunks are globally
  shared; orphans are reclaimed via garbage_collect_ohlcv(dry_run=…).
- Introspection helpers required by the dedup-upload protocol
  (docs/design/ohlcv-dedup-protocol.md):
    * iter_ohlcv_hashes() / ohlcv_referenced_hashes()
    * ohlcv_stored_hashes()
    * ohlcv_stats() -> stored_blobs / stored_bytes / referenced_blobs
                       / orphan_blobs / missing_blobs
    * garbage_collect_ohlcv(dry_run=False)
  Manifests are decoded straight from the bundle envelope
  (_decode_payload) so the cost is one msgpack read per bundle —
  no full Backtest instantiation.

9 new tests:
- No OHLCV -> no chunk dir created.
- Identical OHLCV is stored once across distinct handles (dedup).
- Different OHLCV yields separate chunks.
- Round-trip via store.open() resolves OHLCV from the shared dir.
- delete() keeps still-referenced chunks; orphans only after GC.
- garbage_collect_ohlcv(dry_run=True) lists without deleting; the
  real call removes them.
- iter_ohlcv_hashes() emits per-reference; ohlcv_referenced_hashes()
  dedups.
- Hash strings are 64-char lowercase hex (matches the upload protocol
  spec).
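The dedup and garbage-collection behaviour listed above can be sketched with an in-memory content-addressed store. SHA-256 is an assumption here (it produces the 64-char lowercase hex strings mentioned, but the commit does not name the hash function), and `ChunkStoreSketch` is a hypothetical stand-in for the <root>/ohlcv/ directory.

```python
import hashlib


class ChunkStoreSketch:
    def __init__(self):
        self.blobs = {}      # hash -> bytes (stands in for the shared chunk dir)
        self.manifests = {}  # handle -> list of hashes referenced by that bundle

    def write(self, handle, ohlcv_blobs):
        hashes = []
        for blob in ohlcv_blobs:
            digest = hashlib.sha256(blob).hexdigest()  # assumed hash function
            self.blobs.setdefault(digest, blob)        # identical bytes stored once
            hashes.append(digest)
        self.manifests[handle] = hashes

    def delete(self, handle):
        # Chunks are shared, so deleting a bundle never removes them directly.
        self.manifests.pop(handle, None)

    def referenced_hashes(self):
        return {digest for hashes in self.manifests.values() for digest in hashes}

    def garbage_collect(self, dry_run=False):
        """Return orphaned hashes; delete them unless dry_run is set."""
        orphans = sorted(set(self.blobs) - self.referenced_hashes())
        if not dry_run:
            for digest in orphans:
                del self.blobs[digest]
        return orphans


chunks = ChunkStoreSketch()
chunks.write("run-a", [b"btc-1h"])
chunks.write("run-b", [b"btc-1h", b"eth-1h"])
stored_after_writes = len(chunks.blobs)    # btc-1h is shared between both runs

chunks.delete("run-b")
stored_after_delete = len(chunks.blobs)    # eth-1h is orphaned but still present

listed = chunks.garbage_collect(dry_run=True)
stored_after_dry_run = len(chunks.blobs)
removed = chunks.garbage_collect()
stored_after_gc = len(chunks.blobs)
```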

Targeted suite (backtest_store + backtest_index + cli): 110 / 110 passing.
…phase 3d)

Closes the open Phase 3 deliverables that turn the new store
abstraction into something users can actually move data through:

- iaf migrate-store --from <kind> --src <path> --to <kind> --dst <path>
  delegates to dst.copy_from(src), so it is incremental, restartable,
  and tier-aware: when the destination is a local-tiered store,
  identical OHLCV chunks are written exactly once across the
  destination regardless of how many bundles reference them
  (Phase 3c invariant). Optional --handles subset selector for
  partial migrations.
- migrate_store() programmatic helper for in-process pipelines.
- BacktestStoreContractTest: a parameterised conformance suite
  that runs identical scenarios against every concrete store
  implementation (LocalDirStore, LocalTieredStore today, future
  remote stores tomorrow). Catches divergence as a failing subTest
  with the store class name in the label. Covers Protocol +
  SupportsCopyFrom conformance, write/open round-trip, summary_only,
  exists, idempotent delete, missing-handle errors, listing,
  iter_index_rows, and copy_from with both full and subset handle
  selection.
- bug fix in LazyOhlcvDict: items() and values() were inheriting
  the empty backing dict's iteration, so any code path that did
  'for k, v in bt.ohlcv.items()' silently dropped every blob after
  a tiered round-trip. Now both methods walk the manifest and
  materialise lazily on access. Caught by the migration dedup
  test.
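The LazyOhlcvDict bug is a general pitfall of subclassing `dict` with lazy values: the inherited `items()` and `values()` read the backing storage, which is empty until something is materialised. A minimal reproduction and fix, using a hypothetical `LazyDictSketch` rather than the real class:

```python
class LazyDictSketch(dict):
    """dict subclass whose values are loaded from a manifest on first access."""

    def __init__(self, manifest, loader):
        super().__init__()         # backing dict starts empty
        self._manifest = manifest  # key -> content hash
        self._loader = loader

    def __getitem__(self, key):
        if not super().__contains__(key):
            super().__setitem__(key, self._loader(self._manifest[key]))
        return super().__getitem__(key)

    def __iter__(self):
        return iter(self._manifest)

    def __len__(self):
        return len(self._manifest)

    def __contains__(self, key):
        return key in self._manifest

    def keys(self):
        return self._manifest.keys()

    # The fix: walk the manifest instead of the (possibly empty) backing dict.
    def items(self):
        return ((key, self[key]) for key in self._manifest)

    def values(self):
        return (self[key] for key in self._manifest)


loads = []


def load_blob(content_hash):
    loads.append(content_hash)
    return f"blob:{content_hash}"


lazy = LazyDictSketch({"BTC/EUR": "h1", "ETH/EUR": "h2"}, load_blob)

# What the inherited method sees before anything is materialised: nothing.
inherited_view = list(dict.items(lazy))

# The overridden method walks the manifest and loads on access.
fixed_view = list(lazy.items())
values_view = list(lazy.values())  # already materialised, so no reloads
```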

Note on what is *not* in this PR: byte-identical Tier-2 -> Backtest
reassembly (so .iafbt could become export-only) is intentionally
deferred. The current model where the bundle is canonical and
Tier-1/2/3 are derived is simpler, preserves the existing
round-trip contract bit-for-bit, and is what every test in the
contract suite already exercises against both stores.

Targeted suite (backtest_store + backtest_index + cli): 128 / 128
passing + 26 subTests. Full non-scenario suite: 1705 / 1705 passing
with no regressions from the LazyOhlcvDict fix.
…L dashboard

Two new sections in examples/storage_layer_demo/demo.py:

- 6b. _print_backtest_full_report(): per-run breakdown (window /
  days / orders / trades / positions / final_value), end-of-backtest
  positions snapshot, first few trades, and a richer slice of
  per-run BacktestMetrics (cagr, annual_volatility,
  max_drawdown_absolute, gross_profit/loss, best_trade, max
  consecutive wins/losses) with safe n/a fallbacks. Built on top
  of the existing compact _print_backtest_report().

- 9. Storage layer -> HTML dashboard: wires the Tier-1 SQLite
  index, the Tier-2 LocalDirStore and the BacktestReport HTML
  dashboard end-to-end. rank_index() picks the top-N bundles from
  SQLite alone, store.open(handle) materialises just those via the
  BacktestStore protocol, BacktestReport(backtests=[...]).save()
  renders a self-contained interactive HTML dashboard. Demonstrates
  that the new storage layer plugs straight into the existing
  reporting stack with no glue code.

README updated to describe both new sections.

- New feature bullet linking the storage_layer_demo
- New '<details>' section explaining Tier-1 SQLite index, Tier-2
  BacktestStore adapters (LocalDirStore / LocalTieredStore) and
  Tier-3 content-addressed OHLCV chunks
- Python + CLI workflow showing the canonical pattern:
  build_index -> rank_index -> store.open(handle) ->
  BacktestReport(backtests=[...]).save(...)
- Links to examples/storage_layer_demo/ for the runnable end-to-end

Add a 'From backtest results to a report' subsection under
'Backtest Analysis & Dashboard' demonstrating the canonical paths
from a Backtest (or list of Backtests) to a BacktestReport:

- single event-driven app.run_backtest(...)
- a sweep via app.run_vector_backtests(..., backtest_storage_directory=...)
- loading a persisted folder back via BacktestReport.open(directory_path=..., workers=-1)

Cross-links to the Backtest Storage Layer section for sweeps that
scale into the thousands.

New 'Getting Started/Backtest Storage Layer' page covering:

- mental model (Tier-1 SQLite / Tier-2 Parquet / canonical .iafbt /
  Tier-3 content-addressed OHLCV)
- the BacktestStore protocol and when to pick LocalDirStore vs
  LocalTieredStore
- the canonical 5-step developer workflow:
    run sweep -> build index -> filter/rank in SQLite ->
    materialise winners -> render report
- 'Avoid overloading your report.html': size-vs-bundle table,
  the BacktestReport.open(directory_path=...) anti-pattern,
  rules of thumb for narrow vs mega-reports
- pointers to examples/storage_layer_demo and the migrate-store CLI

Wired into the Getting Started sidebar between backtest-reports and
deployment, and added a tip block on backtest-reports.md pointing to
it for users with thousands of backtests.

feat(bundle): format v2 + groundwork for tiered storage rewrite (epic #540)
feat(store): epic #540 phase 3 (a-d) + docs/demo — rescue merge to dev
…e-scale analysis

Replaces the one-line cross-link with a runnable Python snippet
(build_index -> rank_index -> store.open(handle) ->
BacktestReport(backtests=[...]).save(...)) so users see exactly
how to keep their report.html from blowing up on large sweeps.

Adds a callout explaining that for team-scale workflows
(searching, filtering and annotating across thousands of
backtests in a server-backed UI), the storage layer is best
paired with a quant infrastructure provider such as Finterion.

MDUYN added 6 commits May 12, 2026 18:11

- Fix fill_missing_timeseries_data writing back to source CSV during
  backtest preparation (ccxt.py, backtest_service.py: save_to_file=False)
- Fix test_backtest_report.py and test_backtesting.py saving backtest
  reports into fixture directories (use tempfile instead)
- Fix PyArrow schema conflict by dropping run_id column from Parquet
  sidecar payload in local_tiered_store.py
- Add missing public API exports: Pipeline, Factor, CustomFactor, Filter,
  AverageDollarVolume, BacktestIndex, BUNDLE_FORMAT_VERSION, migrate_backtests,
  load_ipython_extension
- Add CLI --prune, --archive-dir, --dry-run options to rank command
- Add strategy showcase callout to README
- Call teardown_sqlalchemy() + gc.collect() before os.remove() in
  initialize_storage() to release file locks on Windows
- Restore backslash-to-forward-slash conversion in SQLite URIs for
  Windows path compatibility
- Export teardown_sqlalchemy from infrastructure module
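The two Windows fixes can be illustrated with a standard-library sketch: normalising backslashes before building a SQLite URI, and releasing the connection before removing the database file. `sqlite_uri` is a hypothetical helper, and `connection.close()` stands in for teardown_sqlalchemy(), whose behaviour is not shown in this PR.

```python
import gc
import os
import sqlite3
import tempfile


def sqlite_uri(database_path):
    """Build a sqlite:/// URI with forward slashes, whatever the host OS."""
    return "sqlite:///" + str(database_path).replace("\\", "/")


windows_uri = sqlite_uri(r"C:\data\backtests\app.sqlite3")

# Release every handle on the file before deleting it; on Windows an open
# handle makes os.remove() fail with a permission error.
database_file = os.path.join(tempfile.mkdtemp(), "app.sqlite3")
connection = sqlite3.connect(database_file)
connection.execute("CREATE TABLE runs (id INTEGER)")
connection.commit()
connection.close()   # stand-in for the framework's teardown step
del connection
gc.collect()         # drop any lingering references that still hold the file
os.remove(database_file)
removed = not os.path.exists(database_file)
```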
@MDUYN MDUYN merged commit 6cbdd97 into main May 18, 2026
8 checks passed