Skip to content

feat(pt/dpmodel): add sequential_update for dpa3#5355

Open
iProzd wants to merge 1 commit intodeepmodeling:masterfrom
iProzd:0328_seq_update
Open

feat(pt/dpmodel): add sequential_update for dpa3#5355
iProzd wants to merge 1 commit intodeepmodeling:masterfrom
iProzd:0328_seq_update

Conversation

@iProzd
Copy link
Copy Markdown
Collaborator

@iProzd iProzd commented Mar 30, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added sequential update mode configuration for DPA3 descriptor. This new boolean option enables an alternative update sequence within neural network layers, allowing edge and angle updates to be applied sequentially with different residual application timing. The mode is compatible with specific update style configurations to ensure correct model behavior.

Copilot AI review requested due to automatic review settings March 30, 2026 09:45
@iProzd iProzd requested review from njzjz and wanghan-iapcm March 30, 2026 09:46
n_multi_edge_message,
precision,
add_chg_spin_ebd,
sequential_update,

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable sequential_update is not used.
n_multi_edge_message,
precision,
add_chg_spin_ebd,
sequential_update,

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable sequential_update is not used.
n_multi_edge_message,
precision,
add_chg_spin_ebd,
sequential_update,

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable sequential_update is not used.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new sequential_update mode for the DPA3 repflow layers, wiring it through configuration/argcheck and implementing the sequential edge/angle/node update order in both the PyTorch and array-api (dpmodel) backends.

Changes:

  • Introduce sequential_update as a configurable repflow argument (docs + argcheck) and propagate it through DPA3 initialization.
  • Implement sequential-update execution paths in RepFlowLayer for both deepmd/pt and deepmd/dpmodel.
  • Expand DPA3 tests to exercise sequential_update across consistency and JIT scenarios.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
source/tests/pt/model/test_dpa3.py Adds sequential_update coverage to PT consistency and JIT tests.
source/tests/consistent/descriptor/test_dpa3.py Adds sequential_update to cross-backend descriptor consistency matrix.
deepmd/utils/argcheck.py Documents and registers sequential_update in DPA3 repflow arg schema.
deepmd/pt/model/descriptor/repflows.py Threads sequential_update into PT repflow block and per-layer construction.
deepmd/pt/model/descriptor/repflow_layer.py Implements _forward_sequential and serialization of sequential_update.
deepmd/pt/model/descriptor/dpa3.py Propagates sequential_update from DPA3 args into PT repflow descriptor block.
deepmd/dpmodel/descriptor/repflows.py Threads sequential_update, implements _call_sequential, and serializes it.
deepmd/dpmodel/descriptor/dpa3.py Adds sequential_update to RepFlowArgs API + docs and propagates into repflow construction.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dim=-1,
)
padding_edge_angle_update = torch.where(
full_mask.unsqueeze(-1), padding_edge_angle_update, edge_ebd
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In sequential_update mode, the non-smooth padding path falls back to edge_ebd (original) when building padding_edge_angle_update. Since Phase 1 already produced edge_ebd_s1 (edge-self updated), padding with the original embedding can partially undo the sequential semantics for neighbors beyond a_sel. Use the current edge embedding for padding (i.e., edge_ebd_s1) so the edge-angle update starts from the edge-self-updated state.

Suggested change
full_mask.unsqueeze(-1), padding_edge_angle_update, edge_ebd
full_mask.unsqueeze(-1), padding_edge_angle_update, edge_ebd_s1

Copilot uses AI. Check for mistakes.
padding_edge_angle_update = xp.where(
xp.expand_dims(full_mask, axis=-1),
padding_edge_angle_update,
edge_ebd,
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the sequential_update path, the non-smooth padding branch uses edge_ebd (original) as the fallback value when constructing padding_edge_angle_update. Because Phase 1 already computed edge_ebd_s1 (edge-self-updated), padding with the original edge embedding breaks the intended sequential semantics for neighbors beyond a_sel. Pad with the current edge state (edge_ebd_s1) instead.

Suggested change
edge_ebd,
edge_ebd_s1,

Copilot uses AI. Check for mistakes.
"When True, updates are applied sequentially: edge self → angle self (using updated edge) "
"→ edge angle (using updated angle) → node (using final edge), "
"instead of the default parallel mode where all updates use original embeddings. "
"Currently only supports update_style='res_residual'."
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new sequential_update docs mention the update_style='res_residual' restriction, but the implementation also requires update_angle=True (see RepFlowLayer/RepFlowArgs checks). Please document this requirement here as well so users don’t discover it only via a runtime error.

Suggested change
"Currently only supports update_style='res_residual'."
"Currently only supports update_style='res_residual' and requires `update_angle=True`."

Copilot uses AI. Check for mistakes.
When True, updates are applied sequentially: edge self → angle self (using updated edge)
→ edge angle (using updated angle) → node (using final edge),
instead of the default parallel mode where all updates use original embeddings.
Currently only supports ``update_style='res_residual'``.
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sequential_update parameter docstring doesn’t mention that it requires update_angle=True, but RepFlowArgs.__init__ now raises a ValueError when sequential_update is True and update_angle is False. Please reflect this constraint in the docstring to prevent confusing configuration errors.

Suggested change
Currently only supports ``update_style='res_residual'``.
Currently only supports ``update_style='res_residual'`` and requires ``update_angle=True``;
otherwise, a ``ValueError`` will be raised during initialization.

Copilot uses AI. Check for mistakes.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

This pull request adds a new sequential_update boolean parameter to the DPA3 descriptor's repflow architecture. When enabled with appropriate configuration constraints, it changes the update order within repflow layers from a parallel scheme to a sequential scheme: edge self update, angle self update, edge-angle update, then node updates.

Changes

Cohort / File(s) Summary
Configuration and Argument Definition
deepmd/utils/argcheck.py, deepmd/dpmodel/descriptor/dpa3.py
Added sequential_update: bool parameter with validation constraints (requires update_style=="res_residual" and update_angle==True). Extended serialization to include the new flag.
dpmodel Backend Repflow Implementation
deepmd/dpmodel/descriptor/repflows.py
Implemented _call_sequential() method in RepFlowLayer to execute sequential update order. Added parameter propagation through DescrptBlockRepflows and validation in both classes.
PyTorch Backend Repflow Implementation
deepmd/pt/model/descriptor/repflow_layer.py, deepmd/pt/model/descriptor/repflows.py
Mirrored implementation: added _forward_sequential() method in RepFlowLayer for PyTorch; propagated sequential_update through DescrptBlockRepflows. Extended serialization in both classes.
Descriptor Wiring
deepmd/dpmodel/descriptor/dpa3.py, deepmd/pt/model/descriptor/dpa3.py
Forward sequential_update from RepFlowArgs into DescrptBlockRepflows constructors in both backends.
Test Coverage
source/tests/consistent/descriptor/test_dpa3.py, source/tests/pt/model/test_dpa3.py
Added parameterization for sequential_update flag in both test suites; included control-flow guard to skip invalid configuration combinations.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Descriptor Config
    participant Config as RepFlowLayer
    participant EdgeUp as Edge/Angle Updates
    participant NodeUp as Node Updates
    
    rect rgba(100, 150, 200, 0.5)
    Note over Client,NodeUp: Parallel Update (Default: sequential_update=False)
    Client->>Config: Initialize with sequential_update=False
    Config->>EdgeUp: Apply edge_self, angle_self, edge_angle updates
    EdgeUp->>NodeUp: Compute all nodes in parallel
    NodeUp-->>Config: Return updated embeddings
    end
    
    rect rgba(150, 200, 100, 0.5)
    Note over Client,NodeUp: Sequential Update (sequential_update=True)
    Client->>Config: Initialize with sequential_update=True
    Config->>EdgeUp: 1. Apply edge_self + apply residual
    EdgeUp->>EdgeUp: 2. Apply angle_self + apply residual
    EdgeUp->>EdgeUp: 3. Apply edge_angle + apply residual
    EdgeUp->>NodeUp: 4. Compute nodes using updated edge embeddings
    NodeUp-->>Config: Return sequentially-updated embeddings
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

Python

Suggested reviewers

  • wanghan-iapcm
  • njzjz
  • caic99
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.03% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(pt/dpmodel): add sequential_update for dpa3' clearly and concisely describes the main change: adding a sequential_update feature to the DPA3 descriptor across both PyTorch (pt) and model (dpmodel) components.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
deepmd/utils/argcheck.py (1)

1553-1559: Mirror the full sequential_update constraint in the schema layer.

These docs only mention the update_style='res_residual' restriction, and argcheck will still accept sequential_update=true with update_angle=false even though RepFlowArgs rejects it later. An extra_check on the enclosing repflow argument would keep config validation and generated docs aligned with runtime behavior.

Also applies to: 1690-1696

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/utils/argcheck.py` around lines 1553 - 1559, Update the repflow
argument schema to mirror the runtime constraint described in
doc_sequential_update by adding an extra_check that forbids
sequential_update=True unless update_style=='res_residual' AND
update_angle==True (matching RepFlowArgs). Specifically, modify the repflow
schema's extra_check to reference the sequential_update, update_style, and
update_angle fields and raise a validation error if sequential_update is true
while either update_style != 'res_residual' or update_angle is false, so the
schema-level validation and generated docs align with RepFlowArgs runtime
checks.
source/tests/consistent/descriptor/test_dpa3.py (1)

166-167: Consider prefixing unused unpacked variable with underscore.

The static analysis tool (Ruff RUF059) flags sequential_update as unpacked but never used in this property. While this follows the existing pattern of unpacking all parameters for consistency, you could prefix it with _ to satisfy the linter:

-            sequential_update,
+            _sequential_update,

This same pattern appears in skip_pd, skip_dp, skip_tf, setUp, rtol, and atol properties (lines 188, 210, 232, 298, 405, 433).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/consistent/descriptor/test_dpa3.py` around lines 166 - 167, The
unpacked parameter name sequential_update in the property (and similarly in
skip_pd, skip_dp, skip_tf, setUp, rtol, atol) is unused and triggers Ruff
RUF059; rename it to _sequential_update (and prefix the corresponding unused
variables in the listed properties with an underscore) to indicate intentional
unused unpacking while preserving the existing tuple unpack pattern and tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/descriptor/dpa3.py`:
- Line 285: The serialized payload for repflow_args now includes a new key
"sequential_update" but DescrptDPA3.serialize() still emits "@version: 2",
causing old readers to attempt RepFlowArgs(**data) and crash; update
DescrptDPA3.serialize() to bump the serialization version (e.g., to 3) so that
readers can detect the new format, and ensure the version constant/metadata that
serialize() returns is updated accordingly (refer to DescrptDPA3.serialize and
the repflow_args payload construction to locate the change).

In `@deepmd/dpmodel/descriptor/repflows.py`:
- Around line 1602-1622: The non-smooth branch pads masked/out-of-a_sel slots
using edge_ebd, which reverts those slots to the post-update edge state and
breaks the intended sequential ordering; change the padding in the xp.where that
sets padding_edge_angle_update so it uses the phase-1 edge state (the edge
embedding/state used before the angle-stage update) instead of edge_ebd, keeping
the full_mask logic and all other flow (respect
smooth_edge_update/use_dynamic_sel) intact.

---

Nitpick comments:
In `@deepmd/utils/argcheck.py`:
- Around line 1553-1559: Update the repflow argument schema to mirror the
runtime constraint described in doc_sequential_update by adding an extra_check
that forbids sequential_update=True unless update_style=='res_residual' AND
update_angle==True (matching RepFlowArgs). Specifically, modify the repflow
schema's extra_check to reference the sequential_update, update_style, and
update_angle fields and raise a validation error if sequential_update is true
while either update_style != 'res_residual' or update_angle is false, so the
schema-level validation and generated docs align with RepFlowArgs runtime
checks.

In `@source/tests/consistent/descriptor/test_dpa3.py`:
- Around line 166-167: The unpacked parameter name sequential_update in the
property (and similarly in skip_pd, skip_dp, skip_tf, setUp, rtol, atol) is
unused and triggers Ruff RUF059; rename it to _sequential_update (and prefix the
corresponding unused variables in the listed properties with an underscore) to
indicate intentional unused unpacking while preserving the existing tuple unpack
pattern and tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 51893b84-b7f2-45e4-b52f-ff66721d1ab0

📥 Commits

Reviewing files that changed from the base of the PR and between 0828604 and c4ed1f9.

📒 Files selected for processing (8)
  • deepmd/dpmodel/descriptor/dpa3.py
  • deepmd/dpmodel/descriptor/repflows.py
  • deepmd/pt/model/descriptor/dpa3.py
  • deepmd/pt/model/descriptor/repflow_layer.py
  • deepmd/pt/model/descriptor/repflows.py
  • deepmd/utils/argcheck.py
  • source/tests/consistent/descriptor/test_dpa3.py
  • source/tests/pt/model/test_dpa3.py

"use_exp_switch": self.use_exp_switch,
"use_dynamic_sel": self.use_dynamic_sel,
"sel_reduce_factor": self.sel_reduce_factor,
"sequential_update": self.sequential_update,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bump the DPA3 serialization version for the new repflow_args field.

Line 285 adds a new key to the persisted repflow_args payload, but DescrptDPA3.serialize() still advertises @version: 2. A pre-change v2 reader will accept the file and then die on RepFlowArgs(**data) with an unexpected sequential_update kwarg instead of rejecting the newer format cleanly.

Suggested fix
-            "@version": 2,
+            "@version": 3,
-        check_version_compatibility(version, 2, 1)
+        check_version_compatibility(version, 3, 1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/dpmodel/descriptor/dpa3.py` at line 285, The serialized payload for
repflow_args now includes a new key "sequential_update" but
DescrptDPA3.serialize() still emits "@version: 2", causing old readers to
attempt RepFlowArgs(**data) and crash; update DescrptDPA3.serialize() to bump
the serialization version (e.g., to 3) so that readers can detect the new
format, and ensure the version constant/metadata that serialize() returns is
updated accordingly (refer to DescrptDPA3.serialize and the repflow_args payload
construction to locate the change).

Comment on lines +1602 to +1622
if not self.smooth_edge_update:
if self.use_dynamic_sel:
raise NotImplementedError(
"smooth_edge_update must be True when use_dynamic_sel is True!"
)
full_mask = xp.concat(
[
a_nlist_mask,
xp.zeros(
(nb, nloc, self.nnei - self.a_sel),
dtype=a_nlist_mask.dtype,
device=array_api_compat.device(a_nlist_mask),
),
],
axis=-1,
)
padding_edge_angle_update = xp.where(
xp.expand_dims(full_mask, axis=-1),
padding_edge_angle_update,
edge_ebd,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use the phase-1 edge state in the non-smooth padding branch.

Line 1621 falls back to edge_ebd, so whenever smooth_edge_update=False the edge-angle stage for masked / out-of-a_sel slots is still driven by the pre-update edge state. That breaks the intended sequential ordering in this mode.

Suggested fix
         padding_edge_angle_update = xp.where(
             xp.expand_dims(full_mask, axis=-1),
             padding_edge_angle_update,
-            edge_ebd,
+            edge_ebd_s1,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/dpmodel/descriptor/repflows.py` around lines 1602 - 1622, The
non-smooth branch pads masked/out-of-a_sel slots using edge_ebd, which reverts
those slots to the post-update edge state and breaks the intended sequential
ordering; change the padding in the xp.where that sets padding_edge_angle_update
so it uses the phase-1 edge state (the edge embedding/state used before the
angle-stage update) instead of edge_ebd, keeping the full_mask logic and all
other flow (respect smooth_edge_update/use_dynamic_sel) intact.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 95.29412% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.35%. Comparing base (a7e9fed) to head (c4ed1f9).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
deepmd/dpmodel/descriptor/repflows.py 92.50% 6 Missing ⚠️
deepmd/pt/model/descriptor/repflow_layer.py 97.56% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5355      +/-   ##
==========================================
+ Coverage   82.28%   82.35%   +0.07%     
==========================================
  Files         797      809      +12     
  Lines       82102    83330    +1228     
  Branches     4003     4067      +64     
==========================================
+ Hits        67555    68625    +1070     
- Misses      13338    13491     +153     
- Partials     1209     1214       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Collaborator

@wanghan-iapcm wanghan-iapcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Serialization version not bumped (Bug)

sequential_update is added to serialize() in both dpmodel and PT RepFlowLayer, but @version remains at 2 and deserialize() has no backward compat for the missing field. Loading a
model serialized without sequential_update will raise KeyError. Per project convention, this must be done in the same commit:

  • Bump @version to 3
  • Update check_version_compatibility(data.pop("@version"), 3, 1)
  • Add data.setdefault("sequential_update", False) in deserialize() for versions < 3

This applies to both RepFlowLayer and DescrptRepflows in both dpmodel and PT backends.

2. Massive code duplication (~800 lines)

_call_sequential (dpmodel, ~410 lines) and _forward_sequential (PT, ~375 lines) are nearly identical copies of the existing call/forward methods with different residual application
order. Most of the code (edge_info construction, angle compression, node_sym, node_edge, etc.) is identical. This will be a maintenance burden — any future change to the update logic must
be applied in 4 places (call, forward, _call_sequential, _forward_sequential).

Consider refactoring to share code — e.g., factor out the individual phase computations (edge_self_update, angle_self_update, edge_angle_update, node_update) as methods, then have the
sequential vs parallel paths just call them in different order with different inputs.

3. Consistency test doubles parameter combinations

Adding (False, True) for sequential_update to the consistency test test_dpa3.py doubles all DPA3 test cases. Since sequential_update=True only works with update_angle=True, and the test already parametrizes update_angle with (True,) only, this is fine functionally — but it significantly increases CI time for a feature that doesn't affect cross-backend serialization format (the weights are the same, just applied differently).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants