Propagate spec to v#198
Merged
Merged
Conversation
Bundles GitHub issues #16-#22 across the type-checker, codegen, wasm-to-v translator, CLI, and docs. - Type-checker now opens a real scope per spec block via enter_spec, with cached re-entry across register_types, collect_function_and_constant_definitions and infer_def. Same-name disambiguation by scope; flatten_defs_with_spec_inner removed. - Codegen threads FunctionOrigin::{TopLevel,SpecInner(spec_name)} through visit_function_definition, mangles spec-inner func/method/sret keys by "<Spec>.<name>", keeps name-section output unmangled, and gates exports to top-level functions only. - New custom WASM section `inference.spec_funcs` written by wasm-codegen and consumed by wasm-to-v carries per-spec function-index lists; sorted for deterministic output, skipped when empty so compile-mode WASM is byte-identical to pre-feature. - wasm-to-v emits one `Definition <mod>__<Spec>_specs : list N` per spec (sorted, `(@nil N)` for empty lists), the structural `valid_<mod>` theorem, and one `valid_<mod>__<Spec>` per spec. Empty specs are now preserved end-to-end so they survive into the proof artifact. - New core/wasm-to-v/src/errors.rs (closes the CLAUDE.md gap) with WasmToVError variants for InvalidRocqIdentifier, RocqStdlibShadow, EmbeddedSpecMismatch, UnsupportedFeature, and WasmParse. Identifier validation rejects empty names, leading non-alpha, invalid chars, length > 255, Rocq stdlib shadows, reserved keywords, and `__` separator collisions. Strict canonical LEB128 enforced in the parser; bounded allocation against remaining payload; duplicate-section rejection; decode-time validation of embedded names. - Public APIs: wasm_to_v signature gains `&FxHashMap<String, Vec<u32>>`; CodegenOutput exposes spec_func_indices_by_spec(); inference crate re-exports FxHashMap and WasmToVError/InvalidIdentifierReason so consumers don't need direct deps on rustc-hash or the translator crate. - 19 new tests under tests/src/spec_propagation.rs covering scoping, export gating, per-spec emission, empty-list emission, invalid-name rejection paths, compile-mode byte-identity, explicit-vs-embedded precedence, and decode-time payload validation. Breaking: CodegenOutput shape, wasm_to_v signature, and the ValidModule arity flip are called out in CHANGELOG. Downstream Rocq libraries need the new ValidSpec predicate; see core/wasm-to-v/ROCQ_CONTRACT.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the user-facing surface of the spec-funcidx feature.
- `infc --mode {compile,proof}` (default `compile`) and `infs build --mode
proof` forward through. Proof mode keeps spec functions, emits the
`inference.spec_funcs` custom section, and now auto-enables the `-v`
flag so the .v artifact lands alongside the .wasm. Default behavior is
unchanged: `infc foo.inf` produces a byte-identical .wasm to before.
- Replace `path.file_stem().to_str().unwrap()` with a fallback to
"module" so non-UTF-8 paths no longer panic the CLI on Windows.
- Add the missing `[#195]` link reference in CHANGELOG.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the hand-rolled LEB128 reader in the wasm-to-v decoder with inf_wasmparser::BinaryReader, which already enforces canonical encoding and validates UTF-8 at the length-prefixed string read. Adds a leading varuint32 version byte to the inference.spec_funcs payload so future format revisions trip a loud "unsupported version" branch instead of silently misparsing the rest of the section. - New SPEC_FUNCS_SECTION_VERSION constant (=1) re-exported from both wasm-codegen and wasm-to-v as the single source of truth. - Encoder emits version-then-count; decoder reads version first and rejects mismatches with WasmToVError::WasmParse carrying "version". - Decoder now delegates all LEB128/string reads to BinaryReader; the hand-rolled read_leb128_u32 is removed along with its test. - Existing corruption tests prepend the version byte so they exercise the targeted defenses (truncated count, count-exceeds-payload, invalid UTF-8, duplicate section) rather than tripping the version check. - New test: decode_spec_funcs_unsupported_version_is_rejected.
H6: cross-spec same-named structs/enums are rejected at registration time (previously silently used the first-registered layout). Mangling struct identity would require carrying spec context through every type access (field projection, sret layouts, method dispatch); the rejection avoids that blast radius and surfaces a clear `RegistrationFailed` diagnostic. Functions stay mangleable across specs. A5: `SpecScopeGuard` wraps `&mut Compiler` and forwards via Deref/DerefMut so `current_spec` is reset on every exit path from `visit_function_definition`, including the `?` early return from `compute_frame_layout`. The wrapping shape avoids the partial-borrow conflict that a slot-shaped guard (`&mut self.current_spec`) would have with `self.<method>()` calls inside the function body.
Mirrors the existing --mode proof -> -v rule. Previously `infc foo.inf -v` silently ran in compile mode, which strips spec functions in codegen, so the emitted .v contained only the structural module/theorem and zero per-spec definitions - a UX trap on a feature most users were reaching for. After this change: - `infc foo.inf -v` resolves to proof mode, emitting per-spec Definition/Theorem pairs in the .v. - `infc foo.inf --mode compile -v` keeps the prior behavior for users who legitimately want V output from a spec-stripped WASM. - `infc foo.inf` (no flags) is byte-identical to before. Implementation: `Cli::mode` and `BuildArgs::mode` switch to `Option<CliMode>`/`Option<BuildMode>` so absence is distinguishable from explicit `--mode compile`. `normalize_args` resolves the Option to a concrete mode based on the (mode, -v) pair.
Section A of the review follow-ups: - H1: SpecScopeGuard now saves the previous current_spec on `enter` and restores it on drop. Fixes the inverted invariant baked into the unit tests and adds a nested-guard composition test. - H2: Introduces FnKey enum (Free / Method / SpecFree / SpecMethod) as the structured key type for func_name_to_idx, func_array_returns, func_struct_returns, and method_mangled_names values. The previous string-based scheme could textually collide a spec-inner free fn with a struct method under "<X>.<Y>"; this collision is impossible by construction. Drops the runtime assert! guards that used to trip on legal source. Adds current_fn_key for sret-emission lookup. - A3 (M5): take_spec_func_indices_by_spec now debug-asserts a single drainage call. - A4 (M6): codegen entry debug-asserts compile mode never produces recorded spec function indices. - A5 (M7): Threads module_name through inference_wasm_codegen::codegen and inference::codegen so the CLI's file stem now flows into the WASM name subsection (previously hardcoded "output"). Callers updated: cli/main.rs, tests/utils.rs, tests/spec_propagation.rs, tests/codegen/wasm/inference_wrapper.rs. - A7 (L2): Consolidates clippy::cast_possible_truncation #[allow] attributes into u32::try_from(...).expect(...) sites. - A8 (L3): spec_section encode_payload now collects Vec<(&str, &[u32])> per the project's slice-of-refs preference. - A9 (L5): collect_emittable_functions returns Result; nested specs surface as CodegenError::NestedSpecsNotSupported rather than being silently dropped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section B of the review follow-ups: - B1 (H3): Adds TypeCheckError::SpecFunctionShadowsTopLevel and a new `check_spec_function_shadows_top_level` phase that walks each spec block's inner function definitions after the collect-function phase and rejects any whose bare name also exists as a top-level function via the new SymbolTable::lookup_function_in_root helper. Order- independent: the check runs after both top-level and spec-inner functions are registered. Diagnostic message names the shadow rule and points the user at renaming. - B2 (M1): Introduces a small SpecScopeGuard in the type checker that wraps &mut TypeChecker with Deref/DerefMut forwarding, enters the spec scope on construction and pops on drop. Replaces three open- coded enter_spec/pop_scope pairs across register_type_for_def, collect_for_def, and infer_def. Pop now runs on every exit path including unwind. - Updates three spec_propagation tests that previously asserted the permitted-shadowing behavior: two flip to assert the rejection diagnostic, one is rewritten with distinct names so it still exercises spec-aware call resolution without colliding identifiers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section C of the review follow-ups: - C1 (M8): decode_spec_funcs_section now rejects trailing bytes after the last well-formed entry. Without this guard, a corrupted or hand-crafted payload could carry arbitrary tail bytes past the canonical schema. - C2 (M9): translate_bytes now rejects a second standard WASM `name` custom section. The previous code silently used the last one, clobbering func_names_map and func_locals_name_map without warning. - C3 (M16): Caps individual spec name length at MAX_SPEC_NAME_LEN (255 bytes, matching the validate_rocq_identifier TooLong limit). A hand-crafted payload that advertised a multi-MB name would no longer allocate that much memory before validation. - C4 (L7): Replaces the `_ => todo!()` arm in the parser payload match with a structured WasmToVError::WasmParse, matching the project policy of never panicking on user input. Updates embedded_name_section_with_invalid_module_name_is_rejected to strip the codegen-emitted `name` section before appending its hand- crafted replacement, since C2 now rejects the duplicate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section D of the review follow-ups: - D1 (H7): inference orchestrator now re-exports SPEC_FUNCS_SECTION_NAME and SPEC_FUNCS_SECTION_VERSION so tests and downstream consumers share a single source of truth without reaching into inference_wasm_codegen directly. Sweeps tests/src/ spec_propagation.rs to use the orchestrator re-export. - D2 (M14): Adds per-field doc comments to WasmToVError::EmbeddedSpecMismatch so the `explicit` (caller- supplied) vs `embedded` (decoded from the WASM custom section) distinction is documented at the type level. - D3 (M15): Adds #[non_exhaustive] to WasmToVError. The sibling InvalidIdentifierReason already had it; future variant additions are no longer breaking for downstream consumers that pattern-match. Downcast via anyhow::Error::downcast_ref::<WasmToVError>() is unaffected. - D4 (L22): Fixes the CodegenOutput doctest in output.rs which referenced `inference::FxHashMap` from inside the wasm-codegen crate, where it cannot be reached without an orchestrator cycle. Reworded as "the inference orchestrator also re-exports it". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section E of the review follow-ups:
- E1 (M10): eprint_translation_error in core/cli/src/main.rs gains
explicit arms for WasmToVError::WasmParse and ::UnsupportedFeature
so users see purpose-specific diagnostics ("malformed binary" vs
"feature not yet supported by the translator") instead of the
generic "WASM->V translation failed" fallback. Wildcard arm is
retained as required by #[non_exhaustive].
- E2 (M11): normalize_args now emits a stderr warning when
--mode proof is combined with --parse or --analyze (which exclude
codegen and therefore produce no .v output). Purely informational;
exit code unchanged. Helps users discover the silent no-op.
- E3 (M12): apps/infs/src/commands/build.rs drops the re-implementation
of the -v/--mode proof implication. infs now forwards only the
explicit --mode the user passed; infc::normalize_args owns the
symmetric rule, removing the dual source of truth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section F (H10) of the review follow-ups: Adds `Nat` (capitalized) to REJECTED_ROCQ_STDLIB_NAMES. The Rocq prelude auto-opens `Coq.Init.Nat`, which provides `Nat.add`, `Nat.eqb`, and the rest of Peano-nat arithmetic. Emitting `Module Nat. ... End Nat.` from a source file named `nat.inf` or `Nat.inf` would shadow these across the entire generated proof, causing user-supplied proofs and tactics that mention `Nat.x` to silently resolve to the spec's empty `Nat` namespace. Updates the rationale comment in validate_rocq_identifier to match. The broader denylist sweep (`andb`, `orb`, `negb`, `I`, etc.) and the `Module Inference. ... End Inference.` wrapping that would make the denylist obsolete remain deferred per the review plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section G (M18-partial) of the review follow-ups: hardening tests paired with sections C and D. New tests in scenario_6b_embedded_data_validation: - embedded_spec_funcs_with_trailing_bytes_is_rejected — covers C1. A well-formed payload with a stray byte appended must surface WasmParse with `trailing` in the message. - embedded_spec_funcs_idx_count_exceeds_payload_is_rejected — covers the per-spec idx_count guard (separate from the existing embedded_spec_funcs_count_exceeds_payload_is_rejected which covers the outer count). Declares idx_count = u32::MAX with no index bytes and asserts the bounded-allocation rejection. - duplicate_wasm_name_section_is_rejected — covers C2. baseline_wasm already carries a `name` section; appending a second triggers the new duplicate guard. - embedded_name_section_overrides_caller_mod_name — positive test paired with B3. Strips the codegen-emitted `name` section and appends a hand-crafted one, then asserts the override flows into the emitted Rocq output (`Theorem valid_<embedded>` appears, `<caller>` does not). Tightens embedded_spec_funcs_duplicate_section_is_rejected to downcast to WasmToVError::WasmParse and assert the diagnostic substring instead of merely is_err(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Section H of the review follow-ups: - H1 (H8): Renames every CHANGELOG issue ref from [#NN] to [issue#NN] so they no longer collide visually with PR refs ([pr#NN]). Touches every body reference plus the link-definition block at the bottom. Affected numbers: 16, 17, 18, 19, 20, 21, 22, 97. - H2 (L15): Adds [issue#22] tag to the `-v implies proof` CLI bullet for symmetry with neighboring entries. - H3 (M17): Moves the (@nil N) bullet from `### Compiler` to `### Rocq Translation`, where the change actually lives. - H4 (L20): core/wasm-to-v/README.md no longer claims to "re-export" the section constants — it consumes them at the decode boundary. - H5 (L21): book/compilation_targets.md gains a "Selecting a mode at the CLI" pointer documenting `infs build --mode {compile,proof}` and the `-v` implication. Also adds two user-visible entries for new behavior on this branch: - Type Checker: SpecFunctionShadowsTopLevel rejection (Section B). - Rocq Translation: module-name subsection now reflects the input file stem instead of `"output"` (Section A, M7). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Georgii Plotnikov <accembler@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR propagates spec function metadata through WASM codegen into the WASM-to-Rocq translation path and adds .inf fixture coverage for proof-mode spec behavior.
Changes:
- Reworks codegen finalization to emit WASM and move out the per-spec function index map in one step.
- Moves Rocq identifier validation for caller-provided spec maps to the public translation boundary.
- Adds
.infintegration fixtures and tests covering spec methods, top-level calls from specs, multiple specs, compile/proof mode differences, and existingwith_spec.infwiring.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
core/wasm-codegen/src/compiler.rs |
Updates spec index ownership/finalization and method lookup internals. |
core/wasm-codegen/src/lib.rs |
Uses the new finish_and_take flow and snapshots has_main. |
core/wasm-to-v/src/wasm_parser.rs |
Validates caller-supplied module/spec names before parsing. |
core/wasm-to-v/src/translator.rs |
Removes duplicate spec-name validation from translation. |
tests/src/spec_propagation_inf.rs |
Adds end-to-end .inf fixture tests for spec propagation. |
tests/src/lib.rs |
Registers the new integration test module. |
tests/test_data/inf/spec_method.inf |
Adds fixture for spec-contained struct methods. |
tests/test_data/inf/spec_calls_top.inf |
Adds fixture for spec-inner calls to top-level functions. |
tests/test_data/inf/three_specs.inf |
Adds fixture for multiple specs including an empty spec. |
tests/test_data/inf/mixed_compile_proof.inf |
Adds fixture for compile/proof mode comparison. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Georgii Plotnikov <accembler@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.