feat(pt_expt): add missing losses (spin, DOS, tensor, property) #5345
wanghan-iapcm wants to merge 3 commits into deepmodeling:master
Conversation
Add EnergySpinLoss, DOSLoss, TensorLoss, and PropertyLoss to the dpmodel and pt_expt backends. Add serialize/deserialize to the PT losses. Add mae=True support to the dpmodel EnergyLoss and EnergySpinLoss for dp test metrics. Extend the pt_expt get_loss() factory to cover all loss types. Add cross-backend consistency tests parameterized over loss_func and mae.
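The factory extension described above can be sketched as a plain type dispatch. The loss class names come from the PR description, but everything else here (the constructor signatures, the registry, the `type` strings such as `"ener_spin"`) is an illustrative stand-in, not the actual deepmd-kit API:

```python
class EnergyLoss:
    """Illustrative stub; the real losses live in deepmd's loss modules."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class DOSLoss(EnergyLoss):
    pass


class EnergySpinLoss(EnergyLoss):
    pass


class TensorLoss(EnergyLoss):
    pass


class PropertyLoss(EnergyLoss):
    pass


# Hypothetical mapping from config "type" strings to loss classes.
LOSS_REGISTRY = {
    "ener": EnergyLoss,
    "dos": DOSLoss,
    "ener_spin": EnergySpinLoss,
    "tensor": TensorLoss,
    "property": PropertyLoss,
}


def get_loss(loss_params: dict, start_lr: float):
    params = dict(loss_params)  # copy so the caller's config is not mutated
    loss_type = params.pop("type", "ener")
    try:
        cls = LOSS_REGISTRY[loss_type]
    except KeyError:
        raise ValueError(f"Unknown loss type: {loss_type!r}") from None
    params["starter_learning_rate"] = start_lr
    return cls(**params)
```

A registry keyed by the `type` field keeps the factory flat as further loss types are added, instead of growing an if/elif chain.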
Code scanning / CodeQL reported the same warning ("Signature mismatch in overriding method") at four overriding `call()` methods, each with this signature:

```python
def call(
    self,
    learning_rate: float,
    natoms: int,
    model_dict: dict[str, Array],
    label_dict: dict[str, Array],
    mae: bool = False,
) -> tuple[Array, dict[str, Array]]:
```
📝 Walkthrough

Adds five loss implementations (DOS, EnergySpin, Property, Tensor, and Energy MAE reporting), re-exports them, and extends Loss.call with an optional `mae` flag.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Trainer as Training Loop
    participant Model as Model
    participant Loss as Loss Module
    participant Labels as LabelDict
    participant Metrics as Metrics Dict
    Trainer->>Model: forward(input)
    Model-->>Trainer: model_dict (predictions)
    Trainer->>Loss: call(learning_rate, natoms, model_dict, Labels, mae=True)
    Loss->>Loss: select array namespace (xp/jnp/torch)
    Loss->>Loss: reshape preds & labels, apply mask if present
    Loss->>Labels: read locality / atom weights (find_* / atom_weight)
    Loss->>Loss: compute term losses (dos/cdf/atomic/global/tensor/energy/forces/virial)
    Loss->>Loss: scale terms by scheduled prefactors (coef = lr / starter_lr)
    Loss->>Loss: aggregate weighted loss and compute rmse/mae metrics
    Loss-->>Trainer: return (loss, more_loss)
```
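The `call()` contract in the diagram can be illustrated with a toy single-term loss. This is a hedged sketch in plain NumPy with one energy-like term and a made-up scheduled prefactor; it only mirrors the shape of the flow (scheduled coefficient, aggregate loss, optional `mae` metrics), not the real deepmd-kit implementation:

```python
import numpy as np


def call_loss(learning_rate, starter_lr, start_pref, limit_pref,
              pred, label, mae=False):
    """Toy single-term loss mirroring the diagrammed flow."""
    # Scheduled prefactor: interpolate between start and limit coefficients
    # as the learning rate decays (coef = lr / starter_lr).
    coef = learning_rate / starter_lr
    pref = limit_pref + (start_pref - limit_pref) * coef
    # Reshape predictions and labels, then compute the squared difference.
    diff = np.reshape(pred, (-1,)) - np.reshape(label, (-1,))
    l2 = float(np.mean(diff**2))
    loss = pref * l2
    more_loss = {"rmse": float(np.sqrt(l2))}
    if mae:  # extra metrics only when requested, as with the new mae flag
        more_loss["mae"] = float(np.mean(np.abs(diff)))
    return loss, more_loss
```

When `mae=False` the returned metrics dict carries only the rmse keys, so existing consumers that compare key sets are unaffected.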
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning), ✅ 2 passed
Actionable comments posted: 12
🧹 Nitpick comments (6)
deepmd/pt/loss/property.py (1)
226-239: Normalize `out_bias`/`out_std` in the serialized payload.

These fields are returned as-is right now. If either is a `torch.Tensor`, the payload stops being backend-neutral and becomes awkward to persist or compare. Converting tensor-like values to plain lists here keeps the schema stable.

💡 Proposed fix
```diff
 def serialize(self) -> dict:
     """Serialize the loss module."""
+    def _as_list(value: list | torch.Tensor | None) -> list | None:
+        if value is None:
+            return None
+        return torch.as_tensor(value).detach().cpu().tolist()
+
     return {
         "@class": "PropertyLoss",
         "@version": 1,
         "task_dim": self.task_dim,
         "var_name": self.var_name,
         "loss_func": self.loss_func,
         "metric": self.metric,
         "beta": self.beta,
-        "out_bias": self.out_bias,
-        "out_std": self.out_std,
+        "out_bias": _as_list(self.out_bias),
+        "out_std": _as_list(self.out_std),
         "intensive": self.intensive,
     }
```

Based on learnings, serialization of dictionaries that include `torch.Tensor` objects should use `pickle.dumps`/`pickle.loads` instead of JSON, as JSON cannot serialize tensors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `deepmd/pt/loss/property.py` around lines 226-239: the serialize method of PropertyLoss returns out_bias and out_std as-is, which can be torch.Tensors and break backend-neutrality; update PropertyLoss.serialize to detect tensor- or array-like values for the "out_bias" and "out_std" fields and convert them to plain Python lists (e.g., call .tolist() for torch.Tensor / numpy.ndarray; leave None or already-serializable types unchanged) before returning the dict, so the payload remains JSON-friendly and stable across backends.

deepmd/pt_expt/train/training.py (1)
78-113: Copy/filter locally in `get_loss()` instead of mutating caller-owned objects.

The new branches write derived fields back into `loss_params`, and the tensor branch mutates the sequence returned by `_model.model_output_type()` with `pop()`. That makes the factory depend on whether those objects are shared by the caller/model. Working on local copies avoids config bleed-through and accidental model-state mutation.

💡 Proposed fix
```diff
 def get_loss(
     loss_params: dict[str, Any],
     start_lr: float,
     _ntypes: int,
     _model: Any,
 ) -> EnergyLoss:
+    loss_params = dict(loss_params)
     loss_type = loss_params.get("type", "ener")
     if loss_type == "ener":
         loss_params["starter_learning_rate"] = start_lr
         return EnergyLoss(**loss_params)
     elif loss_type == "dos":
 @@
     elif loss_type == "tensor":
-        model_output_type = _model.model_output_type()
-        if "mask" in model_output_type:
-            model_output_type.pop(model_output_type.index("mask"))
+        model_output_type = [
+            name for name in _model.model_output_type() if name != "mask"
+        ]
         tensor_name = model_output_type[0]
         loss_params["tensor_size"] = _model.model_output_def()[tensor_name].output_size
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `deepmd/pt_expt/train/training.py` around lines 78-113: in get_loss(), avoid mutating caller-owned objects. Make a local shallow/deep copy of loss_params (e.g., local_params = loss_params.copy(), or deepcopy when nested) and operate on that when adding starter_learning_rate, tensor_size, label_name, tensor_name, task_dim, var_name, intensive, etc. Likewise, don't call pop() on the sequence returned by _model.model_output_type(); copy it into a new list (e.g., model_output_type = list(_model.model_output_type())) and filter out "mask" with a non-mutating approach (list comprehension) before reading tensor_name. Finally, pass the local_params to the respective constructors (EnergyLoss, DOSLoss, EnergySpinLoss, TensorLoss, PropertyLoss).

source/tests/pt_expt/loss/test_dos.py (1)
27-54: Add at least one masked DOS fixture.

`_make_data()` never populates `model_pred["mask"]`, so these tests never hit the masked DOS/CDF reductions: the part of the port that replaced boolean indexing with mask-based averaging. One masked case here would cover the riskiest branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `source/tests/pt_expt/loss/test_dos.py` around lines 27-54: the tests never exercise masked reductions because _make_data() does not set model_pred["mask"]. Update _make_data to add a boolean torch.tensor model_pred["mask"] with at least one False element so the masked DOS/atom_dos branch runs (make the mask shape match the atom_dos axes, e.g. (nframes, natoms) or a broadcastable shape, and use dtype=torch.bool on the same device). If the loss code expects a corresponding label mask, include any required label mask key as well so the masked path is exercised for atom_dos/dos reductions.

source/tests/pt_expt/loss/test_tensor.py (1)
132: Prefix unused variables with an underscore to satisfy the linter.

The static analysis correctly identifies unused variables. Per coding guidelines, `ruff check` must pass.

🔧 Suggested fix

```diff
- l_dp, more_dp = dp_loss(learning_rate, natoms, model_pred_np, label_np)
+ l_dp, _more_dp = dp_loss(learning_rate, natoms, model_pred_np, label_np)
```

And at line 177:
```diff
- l0, more0 = loss0(learning_rate, natoms, model_pred, label)
+ l0, _more0 = loss0(learning_rate, natoms, model_pred, label)
```

Also applies to: 177-177
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `source/tests/pt_expt/loss/test_tensor.py` at line 132: the linter flags unused variables returned from dp_loss (l_dp, more_dp); update the call sites (the dp_loss(...) invocations at the shown locations) to prefix any unused return names with an underscore (e.g., _l_dp, _more_dp, or whichever specific variables are unused at each site) so the variables remain assigned but satisfy ruff; apply the same change to the other occurrence around line 177 where dp_loss is called.

source/tests/pt_expt/loss/test_property.py (1)
113: Prefix the unused variable with an underscore.

🔧 Suggested fix

```diff
- l_dp, more_dp = dp_loss(learning_rate, natoms, model_pred_np, label_np)
+ l_dp, _more_dp = dp_loss(learning_rate, natoms, model_pred_np, label_np)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `source/tests/pt_expt/loss/test_property.py` at line 113: the tuple assignment from dp_loss currently binds l_dp and more_dp, but more_dp is not used; rename the unused variable to _more_dp (e.g., change `l_dp, more_dp = dp_loss(...)` to `l_dp, _more_dp = dp_loss(...)`) so the intent is clear and linters won't flag it; update the assignment in test_property.py where dp_loss is called and ensure no other references expect `more_dp`.

deepmd/dpmodel/loss/property.py (1)
47-58: Fail fast on invalid `PropertyLoss` configs.

A typo in `loss_func`, `beta <= 0` for `smooth_mae`, or any non-positive `out_std` only blows up once `call()` hits the divisions at lines 104-105 and 110-113. Rejecting those cases in `__init__` would turn late NaNs into immediate config errors.

Example guardrails
```diff
 ) -> None:
+    valid_loss_funcs = {"smooth_mae", "mae", "mse", "rmse", "mape"}
+    if loss_func not in valid_loss_funcs:
+        raise ValueError(
+            f"Invalid loss_func '{loss_func}'. Must be one of {sorted(valid_loss_funcs)}."
+        )
+    if loss_func == "smooth_mae" and beta <= 0:
+        raise ValueError("beta must be > 0 for smooth_mae")
+    if out_std is not None and any(v <= 0 for v in out_std):
+        raise ValueError("out_std must contain only positive values")
     if metric is None:
         metric = ["mae"]
```

Also applies to: 92-105, 108-125
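The guardrails can be exercised in isolation. The stand-in class below copies only the suggested checks; the attribute names mirror the review comment, but the class itself is hypothetical, not the real PropertyLoss:

```python
class PropertyLossSketch:
    """Stand-in carrying only the fail-fast checks from the review comment."""

    VALID_LOSS_FUNCS = {"smooth_mae", "mae", "mse", "rmse", "mape"}

    def __init__(self, loss_func="smooth_mae", beta=1.0, out_std=None):
        if loss_func not in self.VALID_LOSS_FUNCS:
            raise ValueError(
                f"Invalid loss_func {loss_func!r}; "
                f"expected one of {sorted(self.VALID_LOSS_FUNCS)}"
            )
        if loss_func == "smooth_mae" and beta <= 0:
            raise ValueError("beta must be > 0 for smooth_mae")
        if out_std is not None and any(v <= 0 for v in out_std):
            raise ValueError("out_std must contain only positive values")
        # Stored values are now safe for the divisions performed in call().
        self.loss_func = loss_func
        self.beta = beta
        self.out_std = out_std
```

Validating in the constructor means a bad training config fails at startup with a readable message instead of producing NaNs mid-run.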
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `deepmd/dpmodel/loss/property.py` around lines 47-58: in PropertyLoss.__init__, validate config arguments up front instead of letting call() produce NaNs. Check that loss_func is one of the supported strings used by call (e.g., "smooth_mae", "mse", etc.); if loss_func == "smooth_mae", enforce beta > 0; and validate that out_std (and any provided out_bias) entries are all positive non-zero numbers. If any check fails, raise a ValueError with a clear message. Also normalize/convert metric if needed and store validated values on the instance so call() can safely assume loss_func, beta, and out_std/out_bias are valid and non-zero. Reference the same names used in the class (PropertyLoss.__init__, PropertyLoss.call, loss_func, beta, out_std, out_bias) so these guards are easy to locate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/dpmodel/loss/dos.py`:
- Around line 150-157: Update pyproject.toml to require array-api-compat >= 1.9
so the Array API cumulative_sum used in dpmodel.loss.dos (see usage:
xp.cumulative_sum in the cumulative_sum calls that reshape
model_dict["atom_dos"] and label_dict["atom_dos"]) is available at runtime;
change the dependency entry for "array-api-compat" to "array-api-compat >= 1.9"
(or equivalent spec) and run dependency install/lock to pick up the new version.
In `@deepmd/dpmodel/loss/ener_spin.py`:
- Line 255: The code currently computes more_loss["rmse"] = xp.sqrt(loss)
regardless of loss type, which mislabels an L1/MAE aggregate as RMSE; in the
function handling loss aggregation (use references to loss_func, loss, xp.sqrt
and more_loss["rmse"]) only compute and set more_loss["rmse"] when loss_func
indicates an MSE-based loss (e.g., "mse") and otherwise either omit the "rmse"
key or set a neutral/explicit MAE aggregate key (e.g., more_loss["mae"] or
more_loss["aggregate"]) so logs and dp test report the correct metric. Ensure
the gating condition checks loss_func before calling xp.sqrt and that downstream
consumers expect the alternative key when in MAE mode.
- Around line 176-206: Default find_force_mag to 0.0 and short-circuit the
magnetic-force branch when the effective weight or valid counts are zero: change
the default assignment of find_force_mag to 0.0 and after pref_fm = pref_fm *
find_force_mag, check if pref_fm == 0.0 or n_valid == 0 and return/skip all FM
reductions; in the MAE branch guard the per-frame division by per_frame_count_fm
by either filtering out frames with zero count (e.g. only sum per_frame_sum_fm /
per_frame_count_fm for frames where per_frame_count_fm>0) or skip the term if no
valid frames, and only then compute and add l2_force_mag_loss /
l1_force_mag_loss to loss and set entries in more_loss (affecting symbols:
find_force_mag, pref_fm, mask_mag, n_valid, l2_force_mag_loss,
l1_force_mag_loss, per_frame_count_fm, more_loss, display_if_exist).
In `@deepmd/dpmodel/loss/tensor.py`:
- Around line 173-183: The label key names are inconsistent: label_requirement
builds a DataRequirementItem with key "atomic_weight" but the call() method
reads label_dict["atom_weight"]; make them match by choosing one canonical key
and updating the other. Edit the DataRequirementItem entry in label_requirement
or the access in call() so both use the same string (referencing the
DataRequirementItem creation for "atomic_weight" and the call() lookup of
"atom_weight"), and ensure any downstream uses/defaults (e.g., default=1.0)
remain correct.
- Around line 149-162: The label_requirement property currently registers the
wrong prefix ("atomic_"+self.label_name) causing a mismatch with the call()
method which expects "atom_"+self.label_name; update the DataRequirementItem key
in label_requirement to use "atom_"+self.label_name (keep the other fields and
condition on self.has_local_weight, and leave label_name, tensor_size and the
DataRequirementItem construction unchanged) so the label key matches what call()
(and other loss classes like ener.py, dos.py) expects.
In `@deepmd/pt/loss/dos.py`:
- Around line 271-286: The serialize() for DOSLoss is missing the constructor
flag "inference", causing inference-only instances to lose that state on
round-trip; update DOSLoss.serialize to include "inference": self.inference (and
likewise add the same field in the adjacent serialize block at 288-294 if
present) so that deserialization/.__init__() receives the original inference
boolean and preserves which terms are active when reloading.
In `@deepmd/pt/loss/ener_spin.py`:
- Around line 412-430: The serialize() method in the EnergySpinLoss class does
not include the inference flag in its output dictionary, causing deserialized
objects with inference=True to lose this state and default back to
inference=False. Add the inference field to the dictionary returned by
serialize(), but only include it when the value is True to maintain backward
compatibility with existing serialized configs that use the default training
mode. Make sure to also apply this same fix to the serialize() method in
EnergySpinLoss (also mentioned at lines 432-438).
In `@deepmd/pt/loss/tensor.py`:
- Around line 212-231: serialize() for TensorLoss omits the inference flag so
deserialized instances default to False; update TensorLoss.serialize to include
the inference attribute (e.g., "inference": self.inference) and ensure
TensorLoss.deserialize accepts and passes that field through (preserve it in
data.copy() handling and cls(**data) usage) so the inference state round-trips
and assertions relying on has_local_weight/has_global_weight remain valid.
In `@source/tests/consistent/loss/test_ener.py`:
- Line 94: The tuple unpacking in the test (the assignment of self.param to
use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) defines
variables that are not used; rename those unused slots by prefixing them with an
underscore (e.g., _use_huber, _enable_atom_ener_coeff) so ruff no longer flags
them, and apply the same change to the other unpackings in this file where
enable_atom_ener_coeff is unused (replace use_huber/enable_atom_ener_coeff with
_use_huber/_enable_atom_ener_coeff in the self.param unpack statements).
- Line 96: The test skip condition must also skip TF/Paddle when mae cases are
used; update the boolean expression (the return line shown) to include the mae
flag (e.g., return CommonTest.skip_tf or loss_func == "mae" or f_use_norm or
mae) so TF/Paddle runs are avoided for mae=True; ensure this aligns with how
TestEnerGF sets mae and that extract_ret()/CommonTest._compare_ret() remain
consistent with build_tf()/eval_pd() behavior.
In `@source/tests/pt_expt/loss/test_dos.py`:
- Line 125: The unused local variables returned from dp_loss are causing RUF059;
rename the unused bindings more_dp and more0 to _more_dp (or simply _ ) to mark
them as intentionally unused. Update the assignment statements that call dp_loss
(the one assigning l_dp, more_dp and the one assigning l_dp, more0) so the
second tuple element is renamed, and run ruff check/format to verify the warning
is resolved.
In `@source/tests/pt_expt/loss/test_ener_spin.py`:
- Line 144: The test assigns unused return values to locals that trigger Ruff
RUF059; rename the unused locals returned from dp_loss and any other unused
locals (e.g., more_dp and more0) to a conventional unused name (e.g., _more_dp
or simply _ ) so they are intentionally ignored and Ruff stops flagging them;
update the call site using dp_loss (and the other occurrence around the second
reported location) to use the new unused-name variables and run ruff check/ruff
format before committing.
---
Nitpick comments:
In `@deepmd/dpmodel/loss/property.py`:
- Around line 47-58: In PropertyLoss.__init__, validate config arguments up
front instead of letting call() produce NaNs: check that loss_func is one of the
supported strings used by call (e.g., "smooth_mae", "mse", etc.), if loss_func
== "smooth_mae" enforce beta > 0, and validate out_std (and any provided
out_bias) entries are all positive non-zero numbers; if any check fails raise a
ValueError with a clear message. Also normalize/convert metric if needed and
store validated values on the instance so call() can safely assume loss_func,
beta, out_std/out_bias are valid and non-zero. Ensure you reference the same
names used in the class (PropertyLoss.__init__, PropertyLoss.call, loss_func,
beta, out_std, out_bias) so these guards are easy to locate.
In `@deepmd/pt_expt/train/training.py`:
- Around line 78-113: In get_loss(), avoid mutating caller-owned objects: make a
local shallow/deep copy of loss_params (e.g., local_params = loss_params.copy()
or deepcopy when nested) and operate on that for adding starter_learning_rate,
tensor_size, label_name, tensor_name, task_dim, var_name, intensive, etc.;
likewise, don't call pop() on the sequence returned by
_model.model_output_type() — copy it into a new list (e.g., model_output_type =
list(_model.model_output_type())) and filter out "mask" via a non-mutating
approach (list comprehension) before reading tensor_name; finally pass the
local_params to the respective constructors (EnergyLoss, DOSLoss,
EnergySpinLoss, TensorLoss, PropertyLoss).
In `@deepmd/pt/loss/property.py`:
- Around line 226-239: The serialize method of PropertyLoss returns out_bias and
out_std as-is which can be torch.Tensors and break backend-neutrality; update
PropertyLoss.serialize to detect tensor- or array-like values for the "out_bias"
and "out_std" fields and convert them to plain Python lists (e.g., call
.tolist() for torch.Tensor / numpy.ndarray, leave None or already-serializable
types unchanged) before returning the dict so the payload remains JSON-friendly
and stable across backends.
In `@source/tests/pt_expt/loss/test_dos.py`:
- Around line 27-54: The tests never exercise masked reductions because
_make_data() does not set model_pred["mask"]; update _make_data to add a boolean
torch.tensor model_pred["mask"] with at least one False element so the masked
DOS/atom_dos branch runs (make the mask shape match the atom_dos axes—e.g.,
(nframes, natoms) or a broadcastable shape—and use dtype=torch.bool on the same
device), and if the loss code expects a corresponding label mask include any
required label mask key as well so the masked-path is exercised for atom_dos/dos
reductions.
In `@source/tests/pt_expt/loss/test_property.py`:
- Line 113: The tuple assignment from dp_loss currently binds l_dp and more_dp
but more_dp is not used; rename the unused variable to _more_dp (e.g., change
"l_dp, more_dp = dp_loss(...)" to "l_dp, _more_dp = dp_loss(...)") so the intent
is clear and linters won't flag it; update the assignment in test_property.py
where dp_loss is called and ensure no other references expect "more_dp".
In `@source/tests/pt_expt/loss/test_tensor.py`:
- Line 132: The linter flags unused variables returned from dp_loss (l_dp,
more_dp); update the call sites (the dp_loss(...) invocation at the shown
locations) to prefix any unused return names with an underscore (e.g., _l_dp,
_more_dp or whichever specific variable(s) are unused at each site) so the
variables remain assigned but satisfy ruff; apply the same change to the other
occurrence around line 177 where dp_loss is called.
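Several of the inline prompts above (dos.py, ener_spin.py, tensor.py) concern serialize() dropping the `inference` flag on round-trip. A minimal sketch of the suggested fix, using a stand-in class rather than the real loss modules:

```python
class LossSketch:
    """Minimal serialize/deserialize round-trip preserving the inference flag."""

    def __init__(self, starter_learning_rate=1.0, inference=False):
        self.starter_learning_rate = starter_learning_rate
        self.inference = inference

    def serialize(self) -> dict:
        data = {
            "@class": "LossSketch",
            "@version": 1,
            "starter_learning_rate": self.starter_learning_rate,
        }
        if self.inference:
            # Emit only when non-default, keeping old payloads loadable.
            data["inference"] = True
        return data

    @classmethod
    def deserialize(cls, data: dict) -> "LossSketch":
        data = data.copy()
        data.pop("@class", None)
        data.pop("@version", None)
        return cls(**data)
```

Emitting the flag only when it is True matches the backward-compatibility note in the ener_spin prompt: existing serialized configs without the key still deserialize to the default training mode.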
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7228a7bd-845d-4205-866b-dab0daaa93f0
📒 Files selected for processing (26)
- deepmd/dpmodel/loss/__init__.py
- deepmd/dpmodel/loss/dos.py
- deepmd/dpmodel/loss/ener.py
- deepmd/dpmodel/loss/ener_spin.py
- deepmd/dpmodel/loss/loss.py
- deepmd/dpmodel/loss/property.py
- deepmd/dpmodel/loss/tensor.py
- deepmd/pt/loss/dos.py
- deepmd/pt/loss/ener_spin.py
- deepmd/pt/loss/property.py
- deepmd/pt/loss/tensor.py
- deepmd/pt_expt/loss/__init__.py
- deepmd/pt_expt/loss/dos.py
- deepmd/pt_expt/loss/ener_spin.py
- deepmd/pt_expt/loss/property.py
- deepmd/pt_expt/loss/tensor.py
- deepmd/pt_expt/train/training.py
- source/tests/consistent/loss/test_dos.py
- source/tests/consistent/loss/test_ener.py
- source/tests/consistent/loss/test_ener_spin.py
- source/tests/consistent/loss/test_property.py
- source/tests/consistent/loss/test_tensor.py
- source/tests/pt_expt/loss/test_dos.py
- source/tests/pt_expt/loss/test_ener_spin.py
- source/tests/pt_expt/loss/test_property.py
- source/tests/pt_expt/loss/test_tensor.py
Codecov Report

❌ Patch coverage is …

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master    #5345      +/-   ##
==========================================
+ Coverage   82.28%   82.35%   +0.07%
==========================================
  Files         797      806       +9
  Lines       82100    82635     +535
  Branches     4003     4003
==========================================
+ Hits        67557    68058     +501
- Misses      13336    13369      +33
- Partials     1207     1208       +1
```

☔ View full report in Codecov by Sentry.
The eval_pd methods in TestEner and TestEnerGF were missing the mae parameter, causing a key mismatch when comparing Paddle output against the reference (dpmodel), which includes mae_* keys.
♻️ Duplicate comments (3)
source/tests/consistent/loss/test_ener.py (3)
100: ⚠️ Potential issue | 🟠 Major

Use `_`-prefixed names for the still-unused tuple slots.

`skip_pd()` never reads `use_huber`/`enable_atom_ener_coeff`, and `setUp()` never reads `enable_atom_ener_coeff`. Ruff is already flagging these unpackings. As per coding guidelines (`**/*.py`): always run `ruff check .` and `ruff format .` before committing changes or CI will fail.

💡 Proposed fix
```diff
- (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param
+ (_use_huber, _enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param
 ...
- (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, mae) = self.param
+ (use_huber, _enable_atom_ener_coeff, loss_func, f_use_norm, mae) = self.param
```

Also applies to: 119-119
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/consistent/loss/test_ener.py` at line 100, The tuple unpacking in the test (the line assigning (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param) exposes unused names that Ruff flags; change unused slots to underscore-prefixed names (for example _use_huber and _enable_atom_ener_coeff or simply _ , as appropriate) so only actually-used variables (loss_func, f_use_norm, _mae) remain named; update the same pattern at the other occurrence around line 119, and run ruff check . and ruff format . before committing.
94-96: ⚠️ Potential issue | 🟠 Major

Skip TF when the extra `mae_*` metrics are enabled.

The non-TF paths now forward `mae=self.mae`, but `build_tf()` still cannot request those metrics. On TF-enabled runs this turns the consistency check into a key-set mismatch. Add `or mae` here; Paddle is already covered because `eval_pd()` now passes `mae`.

💡 Proposed fix
- (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param + (_use_huber, _enable_atom_ener_coeff, loss_func, f_use_norm, mae) = self.param # Skip TF for MAE loss tests (not implemented in TF backend) - return CommonTest.skip_tf or loss_func == "mae" or f_use_norm + return CommonTest.skip_tf or loss_func == "mae" or f_use_norm or mae🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/consistent/loss/test_ener.py` around lines 94 - 96, The TF-skip condition needs to also skip when extra MAE metrics are requested: update the return expression in the test where (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param (and where CommonTest.skip_tf is checked) to include the mae flag (e.g., add "or mae" / "or self.mae" to the predicate), so the clause becomes CommonTest.skip_tf or loss_func == "mae" or f_use_norm or mae to avoid key-set mismatches with TF backends.
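The suggested predicate is easy to check on its own. The function below is a stand-alone restatement of the proposed skip expression, with `CommonTest.skip_tf` replaced by a plain boolean argument:

```python
def should_skip_tf(skip_tf_env: bool, loss_func: str,
                   f_use_norm: bool, mae: bool) -> bool:
    """Skip the TF backend whenever it cannot produce comparable outputs:
    it implements neither the MAE loss nor the mae_* metric keys."""
    return skip_tf_env or loss_func == "mae" or f_use_norm or mae
```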
355: ⚠️ Potential issue | 🟠 Major

`TestEnerGF` also needs to skip TF now that every comparable backend forces `mae=True`.

`extract_ret()` now compares `mae_*`, and each non-TF evaluator in this class opts into those metrics. `build_tf()` still has no matching path, so TF-enabled environments will compare different result keys unless this class skips TF entirely or wires `mae=True` through the TF backend too.

💡 Proposed fix
```diff
- skip_tf = CommonTest.skip_tf
+ skip_tf = True  # TF loss path does not emit mae_* metrics
```

Also applies to: 457-457, 469-469, 497-497, 515-515, 535-535, 547-547
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/consistent/loss/test_ener.py` at line 355, TestEnerGF must skip TensorFlow runs because extract_ret() now expects mae_* keys that build_tf() does not emit; update the test class to skip TF by adding or setting a TF-skip flag (e.g., set skip_tf_expt = True or set skip_tf_expt = not INSTALLED_TF_EXPT) alongside the existing skip_pt_expt assignment in TestEnerGF so that TF backends are not executed; refer to TestEnerGF, extract_ret(), and build_tf() when making the change and apply the same TF-skip update at the other occurrences mentioned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@source/tests/consistent/loss/test_ener.py`:
- Line 100: The tuple unpacking in the test (the line assigning (use_huber,
enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param) exposes
unused names that Ruff flags; change unused slots to underscore-prefixed names
(for example _use_huber and _enable_atom_ener_coeff or simply _ , as
appropriate) so only actually-used variables (loss_func, f_use_norm, _mae)
remain named; update the same pattern at the other occurrence around line 119,
and run ruff check . and ruff format . before committing.
- Around line 94-96: The TF-skip condition needs to also skip when extra MAE
metrics are requested: update the return expression in the test where
(use_huber, enable_atom_ener_coeff, loss_func, f_use_norm, _mae) = self.param
(and where CommonTest.skip_tf is checked) to include the mae flag (e.g., add "or
mae" / "or self.mae" to the predicate), so the clause becomes CommonTest.skip_tf
or loss_func == "mae" or f_use_norm or mae to avoid key-set mismatches with TF
backends.
- Line 355: TestEnerGF must skip TensorFlow runs because extract_ret() now
expects mae_* keys that build_tf() does not emit; update the test class to skip
TF by adding or setting a TF-skip flag (e.g., set skip_tf_expt = True or set
skip_tf_expt = not INSTALLED_TF_EXPT) alongside the existing skip_pt_expt
assignment in TestEnerGF so that TF backends are not executed; refer to
TestEnerGF, extract_ret(), and build_tf() when making the change and apply the
same TF-skip update at the other occurrences mentioned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5aab57a0-c401-4db3-8577-85db0d5538e3
📒 Files selected for processing (1)
source/tests/consistent/loss/test_ener.py
- Change find_* label defaults from 1.0 to 0.0 in dpmodel losses (ener_spin, dos, tensor) to match PT behavior: missing labels should zero out the contribution, not assume they exist.
- Prefix unused unpacked variables with _ to satisfy ruff RUF059.
Actionable comments posted: 2
♻️ Duplicate comments (1)
deepmd/dpmodel/loss/tensor.py (1)
149-183: ⚠️ Potential issue | 🔴 Critical

Align `label_requirement` keys with `call()` lookups (`atom_*`).

Lines 156 and 176 use `atomic_*` keys, but `call()` reads `atom_*` keys at lines 93 and 83. This mismatch can suppress local loss unexpectedly and break atomic-weighted execution.

🐛 Proposed fix
```diff
 if self.has_local_weight:
     label_requirement.append(
         DataRequirementItem(
-            "atomic_" + self.label_name,
+            "atom_" + self.label_name,
             ndof=self.tensor_size,
             atomic=True,
             must=False,
             high_prec=False,
         )
     )
 @@
 if self.enable_atomic_weight:
     label_requirement.append(
         DataRequirementItem(
-            "atomic_weight",
+            "atom_weight",
             ndof=1,
             atomic=True,
             must=False,
             high_prec=False,
             default=1.0,
         )
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/dpmodel/loss/tensor.py` around lines 149 - 183, label_requirement currently registers keys with the "atomic_" prefix ("atomic_"+self.label_name and "atomic_weight") but call() expects "atom_*" keys; update the DataRequirementItem keys in the label_requirement property so they use "atom_"+self.label_name for the local label entry and "atom_weight" for the atomic weight entry (leave other fields the same) so the entries created by label_requirement match the lookups performed by call(); reference symbols: label_requirement, call(), DataRequirementItem, self.label_name, has_local_weight, enable_atomic_weight.
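One way to prevent this class of prefix drift, sketched here with a hypothetical helper rather than the project's actual code, is to derive the key on both sides from a single function:

```python
def atom_label_key(label_name: str) -> str:
    """Canonical per-atom label key, shared by label_requirement and call()."""
    return "atom_" + label_name


ATOM_WEIGHT_KEY = "atom_weight"

# label_requirement side: keys registered for the data loader.
requirement_keys = {atom_label_key("polarizability"), ATOM_WEIGHT_KEY}

# call() side: lookups derive from the same helper, so the prefixes cannot diverge.
lookup_key = atom_label_key("polarizability")
```

With one canonical constructor for the key, a rename touches a single definition and the requirement/lookup pair can never silently disagree again.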
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/dpmodel/loss/ener_spin.py`:
- Around line 137-148: The code overwrites more_loss["mae_e"] with a differently
normalized value when loss_func == "mae" and mae is True; to fix, ensure
consistent MAE reporting by either not reassigning more_loss["mae_e"] in the if
mae: block when loss_func == "mae" (add guard: if not (self.loss_func == "mae")
before setting more_loss["mae_e"]) or compute a single base value (e.g.,
base_mae = xp.mean(abs_diff_e) or xp.mean(xp.abs(energy_pred - energy_label))
once) and reuse it to set both more_loss["mae_e"] (with or without atom_norm
consistently) and more_loss["mae_e_all"]; reference symbols: self.loss_func,
mae, more_loss["mae_e"], abs_diff_e, atom_norm, and display_if_exist to locate
and adjust the assignments.
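A minimal sketch of the suggested guard (hypothetical `report_mae` helper; names `abs_diff_e` and `atom_norm` follow the prompt, this is not the actual loss code):

```python
import numpy as np

def report_mae(loss_func, mae, abs_diff_e, atom_norm):
    # Sketch of the guard suggested above: when loss_func == "mae" the
    # per-atom-normalized value is recorded once, and the `mae` block only
    # adds the un-normalized mae_e_all instead of overwriting mae_e.
    more_loss = {}
    if loss_func == "mae":
        more_loss["mae_e"] = float(np.mean(abs_diff_e)) * atom_norm
    elif mae:
        more_loss["mae_e"] = float(np.mean(abs_diff_e))
    if mae:
        more_loss["mae_e_all"] = float(np.mean(abs_diff_e))
    return more_loss

out = report_mae("mae", True, np.array([2.0, 4.0]), atom_norm=0.5)
print(out)  # mae_e stays normalized: {'mae_e': 1.5, 'mae_e_all': 3.0}
```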
In `@deepmd/dpmodel/loss/tensor.py`:
- Line 1: Run ruff locally and fix all lint issues in
deepmd/dpmodel/loss/tensor.py and other Python files: execute `ruff check .` to
list violations, run `ruff format .` to auto-fix formatting, then manually
correct remaining offenses (e.g., unused imports, line-length, naming, or type
issues) in the reported functions/classes in this module and others, and finally
re-run `ruff check .` and `ruff format .` to ensure no violations remain before
committing.
---
Duplicate comments:
In `@deepmd/dpmodel/loss/tensor.py`:
- Around line 149-183: label_requirement currently registers keys with the
"atomic_" prefix ("atomic_"+self.label_name and "atomic_weight") but call()
expects "atom_*" keys; update the DataRequirementItem keys in the
label_requirement property so they use "atom_"+self.label_name for the local
label entry and "atom_weight" for the atomic weight entry (leave other fields
the same) so the entries created by label_requirement match the lookups
performed by call(); reference symbols: label_requirement, call(),
DataRequirementItem, self.label_name, has_local_weight, enable_atomic_weight.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: c951210d-cc65-4df2-963e-3474242d33f2
📒 Files selected for processing (9)

- deepmd/dpmodel/loss/dos.py
- deepmd/dpmodel/loss/ener_spin.py
- deepmd/dpmodel/loss/tensor.py
- source/tests/consistent/loss/test_ener.py
- source/tests/pt_expt/loss/test_dos.py
- source/tests/pt_expt/loss/test_ener.py
- source/tests/pt_expt/loss/test_ener_spin.py
- source/tests/pt_expt/loss/test_property.py
- source/tests/pt_expt/loss/test_tensor.py
✅ Files skipped from review due to trivial changes (3)
- source/tests/pt_expt/loss/test_ener.py
- source/tests/pt_expt/loss/test_dos.py
- source/tests/pt_expt/loss/test_property.py
🚧 Files skipped from review as they are similar to previous changes (3)
- source/tests/consistent/loss/test_ener.py
- source/tests/pt_expt/loss/test_tensor.py
- deepmd/dpmodel/loss/dos.py
Summary
- Add `EnergySpinLoss`, `DOSLoss`, `TensorLoss`, `PropertyLoss` to the dpmodel (array_api) and pt_expt (`@torch_module` wrapper) backends
- Add `serialize()`/`deserialize()` to the PT loss classes (ener_spin, dos, tensor, property) for cross-backend consistency testing
- Add `mae=True` support to dpmodel `EnergyLoss` and `EnergySpinLoss` (extra MAE metrics for `dp test`)
- Extend the pt_expt `get_loss()` factory to handle all loss types
- Add cross-backend consistency tests parameterized over `loss_func` and `mae`

Details
dpmodel losses (`deepmd/dpmodel/loss/`)

Array-API compatible implementations ported from `deepmd/pt/loss/`. Key adaptations:

- `torch.cumsum` → `xp.cumulative_sum`
- no in-place mutation of `model_pred`
deepmd/pt/loss/. Key adaptations:torch.cumsum→xp.cumulative_summodel_predin-place)pt_expt wrappers (
deepmd/pt_expt/loss/)Thin
@torch_modulewrappers inheriting from dpmodel classes, following existingEnergyLosspattern.Known limitations
- `out_std`/`out_bias` must be provided explicitly (dpmodel losses can't access the model at forward time). Defaults to identity normalization if not provided.
- `inference` parameter: not ported from the PT losses; it only suppresses `l2_*` intermediate metrics and is never actually used (`inference=True` is never constructed anywhere in the codebase).
- `denoise` loss: not ported (internal DPA pretraining, not user-facing).

Test plan
- `python -m pytest source/tests/pt_expt/loss/ -v`: all 42 tests pass
- `python -m pytest source/tests/consistent/loss/ -v`: all 342 tests pass (358 skipped for unavailable backends)
- `python -m pytest source/tests/universal/dpmodel/loss/ -v`: existing tests still pass

Summary by CodeRabbit
New Features
Integration
Tests
Chores