Skip to content

Experiment cleaning up light class#734

Draft
TheJulianJES wants to merge 7 commits into
zigpy:devfrom
TheJulianJES:tjj/light_class_cleanup_experiments_1
Draft

Experiment cleaning up light class#734
TheJulianJES wants to merge 7 commits into
zigpy:devfrom
TheJulianJES:tjj/light_class_cleanup_experiments_1

Conversation

@TheJulianJES
Copy link
Copy Markdown
Contributor

@TheJulianJES TheJulianJES commented Apr 1, 2026

DRAFT. We need full test coverage first. This is also still experimental.

Proposed change

This adds a _check_result method to the light class that raises if the status is not SUCCESS.
Depending on the _ZCLFailureAction, the exception handling differs for the transition timing.

#733 is also pulled in temporarily. Other clean-ups are applied as well.

Background

The light class basically works perfectly, except for turning off an effect when also switching color modes (affecting some lights). This would be fixed with PR #733. However, the light class also has potential to be cleaned up a bit.

In order to handle transitions and intermediate command failures correctly, you need a lot of conditionals to correctly handle all possible failure states. Other Zigbee implementations often do not do this correctly, causing incorrect state in Home Assistant when one command fails to deliver.

However, we can still clean up our light class a lot. One idea for this failed command handling is the one implemented in this PR. We likely want more test coverage for all edge-cases (group lights, assumed state, turn on after being turned off with a transition (internal brightness differs), ...) before trying any clean-up PRs though.

Additional information

Somewhat related:

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 82.60870% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.80%. Comparing base (696ab1c) to head (4ccf42b).
⚠️ Report is 3 commits behind head on dev.

Files with missing lines Patch % Lines
zha/application/platforms/light/__init__.py 82.60% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #734      +/-   ##
==========================================
+ Coverage   97.63%   97.80%   +0.16%     
==========================================
  Files          62       62              
  Lines       10814    10825      +11     
==========================================
+ Hits        10558    10587      +29     
+ Misses        256      238      -18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@zigpy-review-bot zigpy-review-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engaging with this as an experiment, per the draft label.

What the cleanup is buying

The _check_result + _ZCLCommandFailure(action) indirection collapses six near-identical fail-and-return blocks into one try/except plus seven typed call sites. That's a real readability win, and the two failure-handling actions (COMPLETE_IF_NO_LISTENER vs START_TIMER) cleanly capture what was previously implicit in each block.

A few side-benefits picked up along the way that are net positive and worth keeping independent of the broader refactor:

  • The _async_unsub_transition_listener reorder (line 833) — old code nulled self._transition_listener before _tracked_handles.remove(...), which meant the remove was always a no-op against None (silently swallowed by contextlib.suppress(ValueError)). The handle was leaking into _tracked_handles until something else cleaned it up. New order actually removes it. This is a latent-bug fix, not a stylistic cleanup — could stand on its own.
  • The PR #733 logic is folded in correctly: an active colorloop is now deactivated before color commands when starting from on (early-deactivate at line 492), and deactivate_effect_after_turn_on covers the from-off case after commands. The two new tests (test_turn_on_disables_effect_before_color_commands, test_turn_on_from_off_still_disables_stale_effect) lock both paths in. Cleaner than #733's standalone shape.
  • The transition_time ternary unpacking and the explicit if/else on execute_if_off_supported are honest readability improvements with no semantic change.

Risks worth tightening before promoting from draft

1. The early-deactivate ZCL call has no failure handling

The new pre-command deactivate at line 492 sets self._effect = EFFECT_OFF inside _async_deactivate_color_loop regardless of the ZCL result. If the device is unreachable for that command, we'll record the effect as off while the device may still be looping. The old end-of-function deactivate had the same property, so this is technically not a regression — but the new position (before the main turn-on commands run) makes the optimism more consequential: the rest of the turn-on flow then runs against a potentially-wrong internal effect state. Worth deciding whether the early-deactivate should use _check_result with one of the two failure actions, or whether the comment-block should explicitly call out "we optimistically mark effect off here too."

2. Cross-PR overlap with #767

#767 (open, non-draft, fixing on/off-only revert) edits the same async_turn_off block — specifically the _zha_config_enable_light_transitioning_flag gate becomes a set_transition_flag = brightness_supported and ... predicate at line 636-ish. Your async_turn_off transition_time ternary cleanup overlaps lexically; the eventual merge of #767 + this PR will need to fold both. Worth either rebasing on top of #767 before further work here, or coordinating with the author. #742 (transition-entities-disabled-by-default) is also draft and touches helpers.py — no file overlap with this PR but the same general light-cleanup space, so worth keeping in the same review window.

3. _ZCLFailureAction doesn't compose if a third action is ever needed

The enum currently has two members and the dispatch in the except _ZCLCommandFailure block is an if/elif with no fallback. If a future change wants a third failure mode (e.g. "complete unconditionally", "propagate as user-facing exception"), the dispatch grows linearly. Not a blocker — two values is the right shape for today — but if you're considering whether this generalises further (group-light, assumed-state, or the other edge-cases the PR body mentions), a small match on exc.action with a case _: guard is cheaper to extend than the current if/elif. Also, the except clause silently falls through when neither action matches; a default branch logging the unexpected value would be cheap insurance.

4. Tests don't exercise the failure paths

The two new tests are great for the success path of effect-handling, but the substantial structural change is the failure-dispatch through _ZCLCommandFailure. None of the new tests force a non-SUCCESS status to confirm each of the seven _check_result call sites maps to the right _ZCLFailureAction. Without that, a future refactor could flip a COMPLETE_IF_NO_LISTENER to START_TIMER (or vice versa) at one call site and CI would stay green. The PR body acknowledges "need full test coverage first" — concretely, one parametrised test per call site, varying which mocked cluster call returns Status.FAILURE, would cover this.

5. Minor: _async_turn_on_commands still has # noqa: C901

The split moved the colorloop and flash logic out into _async_turn_on_impl, but the seven failure points and three conditional command paths in _async_turn_on_commands still trip the complexity check. Not a blocker — but if the goal is "clean up," splitting the new helper further along the new_color_provided_while_off / execute_if_off_supported axes would let the noqa go away. Optional.

Verdict

Not a verdict — this is a draft / experiment. Direction is good; the _check_result abstraction is worth pursuing. The early-deactivate placement is the most-load-bearing behavioural change and deserves a clear stance on failure handling before promoting from draft. Test coverage for the failure dispatch is the other gate.

and self._effect == EFFECT_COLORLOOP
and effect != EFFECT_COLORLOOP
):
await self._async_deactivate_color_loop(t_log)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This early-deactivate ZCL call has no _check_result_async_deactivate_color_loop unconditionally sets self._effect = EFFECT_OFF (line 567) even when the command failed. Old code's deactivate happened at the end of the function and had the same property, so this isn't a regression — but the new position makes the optimism more consequential: the rest of the turn-on path then runs against potentially-wrong internal effect state.

Worth deciding whether this should also use _check_result (probably COMPLETE_IF_NO_LISTENER since no transition timer was started yet for the effect itself) or whether the comment should explicitly document "we optimistically mark effect off even if the deactivate failed."

elif exc.action == _ZCLFailureAction.START_TIMER:
self.async_transition_start_timer(transition_time)
self.debug("turned on: %s", t_log)
return
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if/elif here silently falls through if a future _ZCLFailureAction member is added without updating this dispatch. Cheap insurance: replace with match exc.action: case ... case _: <log> (or add an else: raise AssertionError(...) to force the failure to surface). Optional for two-member enum, useful guard if this grows.

t_log["color_loop_set"] = result
self._effect = EFFECT_OFF

async def _async_turn_on_commands( # noqa: C901
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: the # noqa: C901 was inherited from _async_turn_on_impl. After the split, _async_turn_on_impl no longer needs it (good — already removed), but this helper still trips the check. If you want to drop the noqa here too, a further split along the new_color_provided_while_off / execute_if_off_supported axes would do it.

# needed for correct state with light groups
self._brightness = 1
self._off_with_transition = transition is not None
self._off_with_transition = True
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behaviour-equivalent because this line is already inside if transition is not None: (line 732), so transition is not None was always True here — the new = True is just the explicit form. Fine, but worth a one-line commit-message note since the diff reads like a behaviour change at first glance.

with contextlib.suppress(ValueError):
self._tracked_handles.remove(self._transition_listener)

self._transition_listener = None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latent-bug fix worth calling out in the commit message: old order set self._transition_listener = None before _tracked_handles.remove(self._transition_listener), so the remove always tried to delete None (not in the list, silently swallowed by contextlib.suppress(ValueError)). The handle was leaking into _tracked_handles until the next cleanup path. This reorder is independent of the rest of the refactor and could be its own PR if you want to land it ahead of the larger experiment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants