Added WRED with affected Leaf/LC/FM model check #379

Open

Priyanka-Patil14 wants to merge 2 commits into datacenter:v4.1.0-dev from Priyanka-Patil14:bugfix/CSCwt50713

Priyanka-Patil14 commented Apr 9, 2026

Summary

Adds a new pre-upgrade validation check to detect fabric nodes at risk due to CSCwt50713, where WRED-enabled QoS combined with specific Leaf/LC/FM hardware models can cause N9504 spine crashes after upgrading to affected ACI releases.

Detection Logic

Three gates must all be true to trigger a FAIL:

1. Version Gate – Target version is in the affected range:
   - ACI 6.1(x) older than 6.1(6a)
   - ACI 6.2(x) older than 6.2(2a)
2. Feature Gate – WRED is enabled (qosCong.algo = wred)
3. Hardware Gate – Any of the following affected models is present:
   - Leaf: N9K-C9236C, N9K-C92300YC, N9K-C9272Q, N9K-C92304QC
   - LC: N9K-C92304QC
   - FM: N9K-C9504-FM-E, N9K-C9508-FM-E, N9K-C9516-FM-E
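
The three gates above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names, the `FIXED_IN` table layout, and the version parser are assumptions, and the parser only understands versions written like `6.1(6a)`. The model list and fixed releases are copied from the description.

```python
import re

# Affected models copied from the PR description.
AFFECTED_MODELS = {
    "N9K-C9236C", "N9K-C92300YC", "N9K-C9272Q", "N9K-C92304QC",  # Leaf / LC
    "N9K-C9504-FM-E", "N9K-C9508-FM-E", "N9K-C9516-FM-E",        # FM
}

# First fixed release per affected train, keyed by (major, minor).
# Hypothetical layout: 6.1 is fixed in 6.1(6a), 6.2 in 6.2(2a).
FIXED_IN = {(6, 1): (6, "a"), (6, 2): (2, "a")}


def parse_version(version):
    """Split a version such as '6.1(6a)' into (major, minor, maint, patch)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)", version)
    if not match:
        raise ValueError("unrecognized version format: %r" % version)
    major, minor, maint, patch = match.groups()
    return int(major), int(minor), int(maint), patch


def version_gate(tversion):
    """True when the target version is in an affected train and older than the fix."""
    major, minor, maint, patch = parse_version(tversion)
    fixed = FIXED_IN.get((major, minor))
    if fixed is None:
        return False
    return (maint, patch) < fixed


def feature_gate(qos_cong_objects):
    """True when at least one qosCong object has algo set to 'wred'."""
    return any(
        obj.get("qosCong", {}).get("attributes", {}).get("algo") == "wred"
        for obj in qos_cong_objects
    )


def hardware_gate(models):
    """Return the affected models found among the given model strings."""
    return sorted(set(models) & AFFECTED_MODELS)


def is_at_risk(tversion, qos_cong_objects, models):
    """All three gates must hold for the check to fail."""
    return (
        version_gate(tversion)
        and feature_gate(qos_cong_objects)
        and bool(hardware_gate(models))
    )
```

For example, `is_at_risk("6.1(5c)", [{"qosCong": {"attributes": {"algo": "wred"}}}], ["N9K-C9504-FM-E"])` evaluates to `True`, while the same inputs with target `6.1(6a)` evaluate to `False`.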

Testing

5 unit test cases added under tests/checks/wred_affected_model_check/
All 5 passed
Validated on live fabric (fab3-apic1): confirmed FAIL_O with real hit on node 201 (FAB3-S1, N9K-C9504-FM-E)


          Added WRED with affected Leaf/LC/FM model check

96dbf31

Author

Priyanka-Patil14 commented Apr 10, 2026

WredCheck_APIC_Output_logs.txt
WredCheck_Pytest_Logs.txt

Uploaded the test logs.

Harinadh-Saladi reviewed


Harinadh-Saladi left a comment

Pls address the comments given and also Pls add the bug details in validations.md file. It's missing.
Pls execute the script on Fab3 and share PASS, FAIL and NA logs. Will review it.

tests/checks/wred_affected_model_check/test_wred_affected_model_check.py

+              @pytest.mark.parametrize(
+                  "tversion, fabric_nodes, icurl_outputs, expected_result, expected_data",
+                  [

Harinadh-Saladi Apr 10, 2026

Pls add a comment for each test case explaining what it does, then I will review.

Author

Priyanka-Patil14 Apr 13, 2026 (edited)

Updated. Added comments to all the test cases.

tests/checks/wred_affected_model_check/test_wred_affected_model_check.py

+                  "tversion, fabric_nodes, icurl_outputs, expected_result, expected_data",
+                  [
+                      (
+                          None,

Harinadh-Saladi Apr 10, 2026

Pls add the json files and read the json files for each test case and provide the test result accordingly instead of hard-coding here. Pls follow the existing structure.

Author

Priyanka-Patil14 Apr 13, 2026

Updated. Replaced all hardcoded data with JSON fixture files
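
A minimal sketch of the fixture-driven approach requested above, assuming a small loader helper. The name `read_fixture` and the file name `qosCong_wred.json` are hypothetical, and a temporary directory stands in for the test directory so the snippet runs on its own; the repository's actual helper may differ.

```python
import json
import os
import tempfile


def read_fixture(directory, filename):
    """Load one JSON fixture file and return the parsed data."""
    with open(os.path.join(directory, filename)) as handle:
        return json.load(handle)


# Stand-in for a fixture directory; a real test would read files
# checked in next to the test module instead of writing them here.
fixture_dir = tempfile.mkdtemp()
with open(os.path.join(fixture_dir, "qosCong_wred.json"), "w") as handle:
    json.dump([{"qosCong": {"attributes": {"algo": "wred"}}}], handle)

# Each parametrized case would reference loaded fixtures like this one
# rather than embedding the API response inline.
qos_cong = read_fixture(fixture_dir, "qosCong_wred.json")
print(qos_cong[0]["qosCong"]["attributes"]["algo"])
```

Keeping responses in files makes each parametrized case a short tuple of fixture references plus the expected result, which is easier to review than inline dictionaries.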

aci-preupgrade-validation-script.py Outdated

+                  headers = ["Node ID", "Node Name", "Source", "Model"]
+                  data = []
+                  recommended_action = (
+                      'Detected affected node(s) with WRED enabled. '

Harinadh-Saladi Apr 10, 2026

Pls check appropriate recommended action for this issue and add in a single line

Author

Priyanka-Patil14 Apr 13, 2026

Updated.

aci-preupgrade-validation-script.py Outdated

+                      'Detected affected node(s) with WRED enabled. '
+                      'Review software fix options and engage TAC.'
+                  )
+                  doc_url = 'https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwt50713'

Harinadh-Saladi Apr 10, 2026

This doc url is incorrect, pls add right url

Author

Priyanka-Patil14 Apr 13, 2026

Updated. Changed the doc URL to point to the validation's entry in the GitHub docs.

aci-preupgrade-validation-script.py

+                  )
+                  doc_url = 'https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwt50713'
+                  if not tversion:

Harinadh-Saladi Apr 10, 2026

No need to add tversion missing check, if tversion is not given script will prompt for tversion to provide the input.

Author

Priyanka-Patil14 Apr 10, 2026

This is consistent with the existing pattern used across the codebase. It handles the debug mode case where a user may run a single check without providing a target version, and the check needs to handle that gracefully instead of throwing an exception. Keeping it for consistency.
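
The guard being discussed can be pictured with the sketch below. `CheckResult`, the status strings, and `wred_check_sketch` are hypothetical stand-ins; the script's own result type and constants are not shown in this thread.

```python
from collections import namedtuple

# Hypothetical stand-ins for the script's result type and status values.
CheckResult = namedtuple("CheckResult", ["status", "msg"])
MANUAL = "MANUAL CHECK REQUIRED"
PASS = "PASS"


def wred_check_sketch(tversion=None):
    """Return early with a readable result when no target version is given."""
    if not tversion:
        # Without this guard, the version comparison below would have
        # nothing to parse when the check is run on its own.
        return CheckResult(MANUAL, "Target version not supplied.")
    # The version, feature and hardware gates would be evaluated here.
    return CheckResult(PASS, "")
```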

aci-preupgrade-validation-script.py Outdated

+                  wred_enabled = False
+                  for cong in qosCong:
+                      algo = cong.get('qosCong', {}).get('attributes', {}).get('algo', '')
+                      if algo.lower() == 'wred':

Harinadh-Saladi Apr 10, 2026

From the moquery output, I can see the value of the algo attribute is always in lower case. So there is no need to convert it to lower case before validating.

Author

Priyanka-Patil14 Apr 13, 2026

Updated.

aci-preupgrade-validation-script.py Outdated

+                      algo = cong.get('qosCong', {}).get('attributes', {}).get('algo', '')
+                      if algo.lower() == 'wred':
+                          wred_enabled = True
+                          break

Harinadh-Saladi Apr 10, 2026

If the wred_enabled flag is True, you break out of the loop. What if we have multiple objects? The loop will then not iterate over the remaining objects. Can you check the code, validate it with multiple WRED-enabled objects, and share the logs?

Author

Priyanka-Patil14 Apr 13, 2026

For the break comment, I validated it with 4 objects where WRED was at position 3. The loop exits after finding wred at position 3 and skips the 4th object, but the result is still correctly FAIL_O. The break is intentional here: we only need to know whether WRED is enabled anywhere, so once we find one wred object the answer is yes and there is no need to continue. I have also added a test case to cover this scenario.

Please find the pytest logs attached.
wred_break_validation.txt
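
The scenario described in this reply (4 objects, WRED at position 3) can be reproduced with a small sketch. The function names and the `visited` counter are illustrative and not part of the PR; `tail-drop` is used here as the non-WRED value.

```python
def wred_enabled_with_break(qos_cong_objects):
    """Stop at the first WRED object; also report how many objects were visited."""
    enabled = False
    visited = 0
    for obj in qos_cong_objects:
        visited += 1
        algo = obj.get("qosCong", {}).get("attributes", {}).get("algo", "")
        if algo == "wred":
            enabled = True
            break
    return enabled, visited


def wred_enabled_full_scan(qos_cong_objects):
    """Inspect every object, then report whether any of them uses WRED."""
    flags = [
        obj.get("qosCong", {}).get("attributes", {}).get("algo", "") == "wred"
        for obj in qos_cong_objects
    ]
    return any(flags)


def make(algo):
    return {"qosCong": {"attributes": {"algo": algo}}}


objects = [make("tail-drop"), make("tail-drop"), make("wred"), make("tail-drop")]
print(wred_enabled_with_break(objects))  # (True, 3)
print(wred_enabled_full_scan(objects))   # True
```

Both variants agree on the boolean answer; the early exit only skips objects that cannot change it.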

aci-preupgrade-validation-script.py Outdated

+                  }
+                  def is_affected_model(model):
+                      m = (model or '').upper()

Harinadh-Saladi Apr 10, 2026

Pls use a meaningful variable name instead of the letter 'm'. Also, why are we converting it to upper case here? We can change the case to upper only if we are not already getting it that way; all the hardware models we get are in upper case. Pls check whether we get lower case anywhere and convert only if required.

aci-preupgrade-validation-script.py

+                      if attr.get('id'):
+                          node_name_map[attr.get('id')] = attr.get('name', '')
+                  impacted = set()

Harinadh-Saladi Apr 10, 2026

Pls use generic variable names as per the structure of the script.

Author

Priyanka-Patil14 Apr 13, 2026

Updated. Replaced the variable names with generic ones to match the script's conventions.

aci-preupgrade-validation-script.py Outdated

+                      model = attr.get('model', '')
+                      if not is_affected_model(model):
+                          continue
+                      dn = attr.get('dn', '')

Harinadh-Saladi Apr 10, 2026

I could see dn extraction and node_regex parsing logic is duplicated in both LC and FM loops. Can you implement with a small helper, so that parsing can be implemented once and reused.

Author

Priyanka-Patil14 Apr 13, 2026

Updated.
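
One way to factor the duplicated parsing into a helper is sketched below. The regex, the helper names, and the sample dn strings are assumptions for illustration and are not taken from the PR diff.

```python
import re

# Assumed dn shape: topology/pod-<pod>/node-<node>/...
NODE_REGEX = re.compile(r"topology/pod-(?P<pod>\d+)/node-(?P<node>\d+)")


def node_id_from_dn(dn):
    """Return the node ID embedded in a dn, or None when it does not match."""
    match = NODE_REGEX.search(dn or "")
    return match.group("node") if match else None


def collect_affected(records, source, affected_models):
    """Shared by the LC and FM loops: keep (node_id, source, model) for affected models."""
    rows = []
    for attr in records:
        model = attr.get("model", "")
        if model not in affected_models:
            continue
        node_id = node_id_from_dn(attr.get("dn", ""))
        if node_id is None:
            continue
        rows.append((node_id, source, model))
    return rows


# Illustrative attribute dictionaries, not real fabric output.
fm_records = [
    {"dn": "topology/pod-1/node-201/sys/ch/fcslot-1/fc", "model": "N9K-C9504-FM-E"},
    {"dn": "topology/pod-1/node-202/sys/ch/fcslot-1/fc", "model": "N9K-C9504-FM-G"},
]
print(collect_affected(fm_records, "FM", {"N9K-C9504-FM-E"}))
```

With this shape, the LC and FM loops differ only in the records and the `source` label they pass in.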


          Addressed PR review comments

Author

Priyanka-Patil14 commented Apr 13, 2026

> Pls address the comments given and also Pls add the bug details in validations.md file. It's missing. Pls execute the script on Fab3 and share PASS, FAIL and NA logs. Will review it.

WRED_PASS:FAIL:NA_APIC_Logs.txt

Please find the attached logs. Executed on fab3 for PASS, FAIL and NA scenario.
