Skip to content

Propagate spec to v#198

Merged
0xGeorgii merged 17 commits into
mainfrom
propagate-spec-to-v
May 19, 2026
Merged

Propagate spec to v#198
0xGeorgii merged 17 commits into
mainfrom
propagate-spec-to-v

Conversation

@0xGeorgii
Copy link
Copy Markdown
Contributor

No description provided.

0xGeorgii and others added 16 commits May 18, 2026 20:24
Bundles GitHub issues #16-#22 across the type-checker, codegen, wasm-to-v
translator, CLI, and docs.

- Type-checker now opens a real scope per spec block via enter_spec, with
  cached re-entry across register_types, collect_function_and_constant_definitions
  and infer_def. Same-name disambiguation by scope; flatten_defs_with_spec_inner
  removed.
- Codegen threads FunctionOrigin::{TopLevel,SpecInner(spec_name)} through
  visit_function_definition, mangles spec-inner func/method/sret keys by
  "<Spec>.<name>", keeps name-section output unmangled, and gates exports
  to top-level functions only.
- New custom WASM section `inference.spec_funcs` written by wasm-codegen
  and consumed by wasm-to-v carries per-spec function-index lists; sorted
  for deterministic output, skipped when empty so compile-mode WASM is
  byte-identical to pre-feature.
- wasm-to-v emits one `Definition <mod>__<Spec>_specs : list N` per spec
  (sorted, `(@nil N)` for empty lists), the structural `valid_<mod>` theorem,
  and one `valid_<mod>__<Spec>` per spec. Empty specs are now preserved
  end-to-end so they survive into the proof artifact.
- New core/wasm-to-v/src/errors.rs (closes the CLAUDE.md gap) with
  WasmToVError variants for InvalidRocqIdentifier, RocqStdlibShadow,
  EmbeddedSpecMismatch, UnsupportedFeature, and WasmParse. Identifier
  validation rejects empty names, leading non-alpha, invalid chars, length
  > 255, Rocq stdlib shadows, reserved keywords, and `__` separator
  collisions. Strict canonical LEB128 enforced in the parser; bounded
  allocation against remaining payload; duplicate-section rejection;
  decode-time validation of embedded names.
- Public APIs: wasm_to_v signature gains `&FxHashMap<String, Vec<u32>>`;
  CodegenOutput exposes spec_func_indices_by_spec(); inference crate
  re-exports FxHashMap and WasmToVError/InvalidIdentifierReason so
  consumers don't need direct deps on rustc-hash or the translator crate.
- 19 new tests under tests/src/spec_propagation.rs covering scoping,
  export gating, per-spec emission, empty-list emission, invalid-name
  rejection paths, compile-mode byte-identity, explicit-vs-embedded
  precedence, and decode-time payload validation.

Breaking: CodegenOutput shape, wasm_to_v signature, and the ValidModule
arity flip are called out in CHANGELOG. Downstream Rocq libraries need
the new ValidSpec predicate; see core/wasm-to-v/ROCQ_CONTRACT.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the user-facing surface of the spec-funcidx feature.

- `infc --mode {compile,proof}` (default `compile`) and `infs build --mode
  proof` forward through. Proof mode keeps spec functions, emits the
  `inference.spec_funcs` custom section, and now auto-enables the `-v`
  flag so the .v artifact lands alongside the .wasm. Default behavior is
  unchanged: `infc foo.inf` produces a byte-identical .wasm to before.
- Replace `path.file_stem().to_str().unwrap()` with a fallback to
  "module" so non-UTF-8 paths no longer panic the CLI on Windows.
- Add the missing `[#195]` link reference in CHANGELOG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the hand-rolled LEB128 reader in the wasm-to-v decoder with
inf_wasmparser::BinaryReader, which already enforces canonical encoding
and validates UTF-8 at the length-prefixed string read. Adds a leading
varuint32 version byte to the inference.spec_funcs payload so future
format revisions trip a loud "unsupported version" branch instead of
silently misparsing the rest of the section.

- New SPEC_FUNCS_SECTION_VERSION constant (=1) re-exported from both
  wasm-codegen and wasm-to-v as the single source of truth.
- Encoder emits version-then-count; decoder reads version first and
  rejects mismatches with WasmToVError::WasmParse carrying "version".
- Decoder now delegates all LEB128/string reads to BinaryReader; the
  hand-rolled read_leb128_u32 is removed along with its test.
- Existing corruption tests prepend the version byte so they exercise
  the targeted defenses (truncated count, count-exceeds-payload,
  invalid UTF-8, duplicate section) rather than tripping the version
  check.
- New test: decode_spec_funcs_unsupported_version_is_rejected.
H6: cross-spec same-named structs/enums are rejected at registration time
(previously silently used the first-registered layout). Mangling struct
identity would require carrying spec context through every type access
(field projection, sret layouts, method dispatch); the rejection avoids
that blast radius and surfaces a clear `RegistrationFailed` diagnostic.
Functions stay mangleable across specs.

A5: `SpecScopeGuard` wraps `&mut Compiler` and forwards via Deref/DerefMut
so `current_spec` is reset on every exit path from `visit_function_definition`,
including the `?` early return from `compute_frame_layout`. The wrapping
shape avoids the partial-borrow conflict that a slot-shaped guard
(`&mut self.current_spec`) would have with `self.<method>()` calls inside
the function body.
Mirrors the existing --mode proof -> -v rule. Previously `infc foo.inf -v`
silently ran in compile mode, which strips spec functions in codegen, so
the emitted .v contained only the structural module/theorem and zero
per-spec definitions - a UX trap on a feature most users were reaching
for. After this change:

- `infc foo.inf -v` resolves to proof mode, emitting per-spec
  Definition/Theorem pairs in the .v.
- `infc foo.inf --mode compile -v` keeps the prior behavior for users
  who legitimately want V output from a spec-stripped WASM.
- `infc foo.inf` (no flags) is byte-identical to before.

Implementation: `Cli::mode` and `BuildArgs::mode` switch to
`Option<CliMode>`/`Option<BuildMode>` so absence is distinguishable from
explicit `--mode compile`. `normalize_args` resolves the Option to a
concrete mode based on the (mode, -v) pair.
Section A of the review follow-ups:

- H1: SpecScopeGuard now saves the previous current_spec on `enter`
  and restores it on drop. Fixes the inverted invariant baked into
  the unit tests and adds a nested-guard composition test.
- H2: Introduces FnKey enum (Free / Method / SpecFree / SpecMethod)
  as the structured key type for func_name_to_idx, func_array_returns,
  func_struct_returns, and method_mangled_names values. The previous
  string-based scheme could textually collide a spec-inner free fn
  with a struct method under "<X>.<Y>"; this collision is impossible
  by construction. Drops the runtime assert! guards that used to
  trip on legal source. Adds current_fn_key for sret-emission lookup.
- A3 (M5): take_spec_func_indices_by_spec now debug-asserts a single
  drainage call.
- A4 (M6): codegen entry debug-asserts compile mode never produces
  recorded spec function indices.
- A5 (M7): Threads module_name through inference_wasm_codegen::codegen
  and inference::codegen so the CLI's file stem now flows into the
  WASM name subsection (previously hardcoded "output"). Callers
  updated: cli/main.rs, tests/utils.rs, tests/spec_propagation.rs,
  tests/codegen/wasm/inference_wrapper.rs.
- A7 (L2): Consolidates clippy::cast_possible_truncation #[allow]
  attributes into u32::try_from(...).expect(...) sites.
- A8 (L3): spec_section encode_payload now collects Vec<(&str, &[u32])>
  per the project's slice-of-refs preference.
- A9 (L5): collect_emittable_functions returns Result; nested specs
  surface as CodegenError::NestedSpecsNotSupported rather than being
  silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section B of the review follow-ups:

- B1 (H3): Adds TypeCheckError::SpecFunctionShadowsTopLevel and a new
  `check_spec_function_shadows_top_level` phase that walks each spec
  block's inner function definitions after the collect-function phase
  and rejects any whose bare name also exists as a top-level function
  via the new SymbolTable::lookup_function_in_root helper. Order-
  independent: the check runs after both top-level and spec-inner
  functions are registered. Diagnostic message names the shadow rule
  and points the user at renaming.
- B2 (M1): Introduces a small SpecScopeGuard in the type checker that
  wraps &mut TypeChecker with Deref/DerefMut forwarding, enters the
  spec scope on construction and pops on drop. Replaces three open-
  coded enter_spec/pop_scope pairs across register_type_for_def,
  collect_for_def, and infer_def. Pop now runs on every exit path
  including unwind.
- Updates three spec_propagation tests that previously asserted the
  permitted-shadowing behavior: two flip to assert the rejection
  diagnostic, one is rewritten with distinct names so it still
  exercises spec-aware call resolution without colliding identifiers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section C of the review follow-ups:

- C1 (M8): decode_spec_funcs_section now rejects trailing bytes after
  the last well-formed entry. Without this guard, a corrupted or
  hand-crafted payload could carry arbitrary tail bytes past the
  canonical schema.
- C2 (M9): translate_bytes now rejects a second standard WASM `name`
  custom section. The previous code silently used the last one,
  clobbering func_names_map and func_locals_name_map without warning.
- C3 (M16): Caps individual spec name length at MAX_SPEC_NAME_LEN
  (255 bytes, matching the validate_rocq_identifier TooLong limit).
  A hand-crafted payload that advertised a multi-MB name would no
  longer allocate that much memory before validation.
- C4 (L7): Replaces the `_ => todo!()` arm in the parser payload
  match with a structured WasmToVError::WasmParse, matching the
  project policy of never panicking on user input.

Updates embedded_name_section_with_invalid_module_name_is_rejected to
strip the codegen-emitted `name` section before appending its hand-
crafted replacement, since C2 now rejects the duplicate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section D of the review follow-ups:

- D1 (H7): inference orchestrator now re-exports
  SPEC_FUNCS_SECTION_NAME and SPEC_FUNCS_SECTION_VERSION so tests
  and downstream consumers share a single source of truth without
  reaching into inference_wasm_codegen directly. Sweeps tests/src/
  spec_propagation.rs to use the orchestrator re-export.
- D2 (M14): Adds per-field doc comments to
  WasmToVError::EmbeddedSpecMismatch so the `explicit` (caller-
  supplied) vs `embedded` (decoded from the WASM custom section)
  distinction is documented at the type level.
- D3 (M15): Adds #[non_exhaustive] to WasmToVError. The sibling
  InvalidIdentifierReason already had it; future variant additions
  are no longer breaking for downstream consumers that pattern-match.
  Downcast via anyhow::Error::downcast_ref::<WasmToVError>() is
  unaffected.
- D4 (L22): Fixes the CodegenOutput doctest in output.rs which
  referenced `inference::FxHashMap` from inside the wasm-codegen
  crate, where it cannot be reached without an orchestrator cycle.
  Reworded as "the inference orchestrator also re-exports it".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section E of the review follow-ups:

- E1 (M10): eprint_translation_error in core/cli/src/main.rs gains
  explicit arms for WasmToVError::WasmParse and ::UnsupportedFeature
  so users see purpose-specific diagnostics ("malformed binary" vs
  "feature not yet supported by the translator") instead of the
  generic "WASM->V translation failed" fallback. Wildcard arm is
  retained as required by #[non_exhaustive].
- E2 (M11): normalize_args now emits a stderr warning when
  --mode proof is combined with --parse or --analyze (which exclude
  codegen and therefore produce no .v output). Purely informational;
  exit code unchanged. Helps users discover the silent no-op.
- E3 (M12): apps/infs/src/commands/build.rs drops the re-implementation
  of the -v/--mode proof implication. infs now forwards only the
  explicit --mode the user passed; infc::normalize_args owns the
  symmetric rule, removing the dual source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section F (H10) of the review follow-ups:

Adds `Nat` (capitalized) to REJECTED_ROCQ_STDLIB_NAMES. The Rocq
prelude auto-opens `Coq.Init.Nat`, which provides `Nat.add`,
`Nat.eqb`, and the rest of Peano-nat arithmetic. Emitting
`Module Nat. ... End Nat.` from a source file named `nat.inf` or
`Nat.inf` would shadow these across the entire generated proof,
causing user-supplied proofs and tactics that mention `Nat.x` to
silently resolve to the spec's empty `Nat` namespace.

Updates the rationale comment in validate_rocq_identifier to match.

The broader denylist sweep (`andb`, `orb`, `negb`, `I`, etc.) and
the `Module Inference. ... End Inference.` wrapping that would
make the denylist obsolete remain deferred per the review plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section G (M18-partial) of the review follow-ups: hardening tests
paired with sections C and D.

New tests in scenario_6b_embedded_data_validation:

- embedded_spec_funcs_with_trailing_bytes_is_rejected — covers C1.
  A well-formed payload with a stray byte appended must surface
  WasmParse with `trailing` in the message.
- embedded_spec_funcs_idx_count_exceeds_payload_is_rejected — covers
  the per-spec idx_count guard (separate from the existing
  embedded_spec_funcs_count_exceeds_payload_is_rejected which
  covers the outer count). Declares idx_count = u32::MAX with no
  index bytes and asserts the bounded-allocation rejection.
- duplicate_wasm_name_section_is_rejected — covers C2. baseline_wasm
  already carries a `name` section; appending a second triggers
  the new duplicate guard.
- embedded_name_section_overrides_caller_mod_name — positive test
  paired with B3. Strips the codegen-emitted `name` section and
  appends a hand-crafted one, then asserts the override flows into
  the emitted Rocq output (`Theorem valid_<embedded>` appears,
  `<caller>` does not).

Tightens embedded_spec_funcs_duplicate_section_is_rejected to
downcast to WasmToVError::WasmParse and assert the diagnostic
substring instead of merely is_err().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section H of the review follow-ups:

- H1 (H8): Renames every CHANGELOG issue ref from [#NN] to [issue#NN]
  so they no longer collide visually with PR refs ([pr#NN]). Touches
  every body reference plus the link-definition block at the bottom.
  Affected numbers: 16, 17, 18, 19, 20, 21, 22, 97.
- H2 (L15): Adds [issue#22] tag to the `-v implies proof` CLI bullet
  for symmetry with neighboring entries.
- H3 (M17): Moves the (@nil N) bullet from `### Compiler` to
  `### Rocq Translation`, where the change actually lives.
- H4 (L20): core/wasm-to-v/README.md no longer claims to "re-export"
  the section constants — it consumes them at the decode boundary.
- H5 (L21): book/compilation_targets.md gains a "Selecting a mode at
  the CLI" pointer documenting `infs build --mode {compile,proof}`
  and the `-v` implication.

Also adds two user-visible entries for new behavior on this branch:

- Type Checker: SpecFunctionShadowsTopLevel rejection (Section B).
- Rocq Translation: module-name subsection now reflects the input
  file stem instead of `"output"` (Section A, M7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Georgii Plotnikov <accembler@gmail.com>
@0xGeorgii 0xGeorgii marked this pull request as ready for review May 19, 2026 11:51
@0xGeorgii 0xGeorgii self-assigned this May 19, 2026
@0xGeorgii 0xGeorgii requested a review from Copilot May 19, 2026 11:51
@0xGeorgii 0xGeorgii added the refactoring Project supplements label May 19, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR propagates spec function metadata through WASM codegen into the WASM-to-Rocq translation path and adds .inf fixture coverage for proof-mode spec behavior.

Changes:

  • Reworks codegen finalization to emit WASM and move out the per-spec function index map in one step.
  • Moves Rocq identifier validation for caller-provided spec maps to the public translation boundary.
  • Adds .inf integration fixtures and tests covering spec methods, top-level calls from specs, multiple specs, compile/proof mode differences, and existing with_spec.inf wiring.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
core/wasm-codegen/src/compiler.rs Updates spec index ownership/finalization and method lookup internals.
core/wasm-codegen/src/lib.rs Uses the new finish_and_take flow and snapshots has_main.
core/wasm-to-v/src/wasm_parser.rs Validates caller-supplied module/spec names before parsing.
core/wasm-to-v/src/translator.rs Removes duplicate spec-name validation from translation.
tests/src/spec_propagation_inf.rs Adds end-to-end .inf fixture tests for spec propagation.
tests/src/lib.rs Registers the new integration test module.
tests/test_data/inf/spec_method.inf Adds fixture for spec-contained struct methods.
tests/test_data/inf/spec_calls_top.inf Adds fixture for spec-inner calls to top-level functions.
tests/test_data/inf/three_specs.inf Adds fixture for multiple specs including an empty spec.
tests/test_data/inf/mixed_compile_proof.inf Adds fixture for compile/proof mode comparison.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread core/wasm-codegen/src/compiler.rs Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
core/wasm-codegen/src/compiler.rs 96.22% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Georgii Plotnikov <accembler@gmail.com>
@0xGeorgii 0xGeorgii merged commit ef62a21 into main May 19, 2026
6 checks passed
@0xGeorgii 0xGeorgii deleted the propagate-spec-to-v branch May 19, 2026 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactoring Project supplements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants