Added WRED with affected Leaf/LC/FM model check#379
Added WRED with affected Leaf/LC/FM model check#379Priyanka-Patil14 wants to merge 2 commits intodatacenter:v4.1.0-devfrom
Conversation
|
WredCheck_APIC_Output_logs.txt Uploaded the test logs. |
Harinadh-Saladi
left a comment
There was a problem hiding this comment.
Pls address the comments given and also Pls add the bug details in validations.md file. It's missing.
Pls execute the script on Fab3 and share PASS, FAIL and NA logs. Will review it.
|
|
||
| @pytest.mark.parametrize( | ||
| "tversion, fabric_nodes, icurl_outputs, expected_result, expected_data", | ||
| [ |
There was a problem hiding this comment.
Pls add the comments for each test cases to understand what test case is doing, then will review.
There was a problem hiding this comment.
Updated. Added comments to all the test cases.
| "tversion, fabric_nodes, icurl_outputs, expected_result, expected_data", | ||
| [ | ||
| ( | ||
| None, |
There was a problem hiding this comment.
Pls add the json files and read the json files for each test case and provide the test result accordingly instead of hard-coding here. Pls follow the existing structure.
There was a problem hiding this comment.
Updated. Replaced all hardcoded data with JSON fixture files
aci-preupgrade-validation-script.py
Outdated
| headers = ["Node ID", "Node Name", "Source", "Model"] | ||
| data = [] | ||
| recommended_action = ( | ||
| 'Detected affected node(s) with WRED enabled. ' |
There was a problem hiding this comment.
Pls check appropriate recommended action for this issue and add in a single line
aci-preupgrade-validation-script.py
Outdated
| 'Detected affected node(s) with WRED enabled. ' | ||
| 'Review software fix options and engage TAC.' | ||
| ) | ||
| doc_url = 'https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwt50713' |
There was a problem hiding this comment.
This doc url is incorrect, pls add right url
There was a problem hiding this comment.
Updated. Changed doc url to point to the GitHub docs validation
| ) | ||
| doc_url = 'https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwt50713' | ||
|
|
||
| if not tversion: |
There was a problem hiding this comment.
No need to add tversion missing check, if tversion is not given script will prompt for tversion to provide the input.
There was a problem hiding this comment.
This is consistent with the existing pattern used across the codebase. It handles the debug mode case where a user may run a single check without providing a target version, and the check needs to handle that gracefully instead of throwing an exception. Keeping it for consistency.
aci-preupgrade-validation-script.py
Outdated
| wred_enabled = False | ||
| for cong in qosCong: | ||
| algo = cong.get('qosCong', {}).get('attributes', {}).get('algo', '') | ||
| if algo.lower() == 'wred': |
There was a problem hiding this comment.
I could see the value of the attribute algo is in lower case only from moquery output. So need to covert it into lower case and validate.
aci-preupgrade-validation-script.py
Outdated
| algo = cong.get('qosCong', {}).get('attributes', {}).get('algo', '') | ||
| if algo.lower() == 'wred': | ||
| wred_enabled = True | ||
| break |
There was a problem hiding this comment.
If wred_enabled flag is True then you're coming out of the loop. What if we have multiple objects? then the loop will not be iterated for other objects. Can you check the code and validate with multiple wred enabled objects and share the logs
There was a problem hiding this comment.
For the break comment, I validated it with 4 objects where WRED was at position 3, The loop exits after finding wred at position 3 and skips the 4th object, but the result is still correctly FAIL_O. The break is intentional here since we just need to know if WRED is enabled anywhere once we find one wred object the answer is yes, so there is no need to continue. I have also added a test case to cover this scenario.
Please find the pytest logs attached.
wred_break_validation.txt
aci-preupgrade-validation-script.py
Outdated
| } | ||
|
|
||
| def is_affected_model(model): | ||
| m = (model or '').upper() |
There was a problem hiding this comment.
Pls keep the meaningful variable name instead of letter 'm' and why are we converting it into upper case here? We can chnage the case to upper if we are not getting, All the hardware models we're getting in upper case. Pls check if we are getting in lower case anywhere and convert if required.
| if attr.get('id'): | ||
| node_name_map[attr.get('id')] = attr.get('name', '') | ||
|
|
||
| impacted = set() |
There was a problem hiding this comment.
Pls use generic variable names as per the structure of the script.
There was a problem hiding this comment.
Updated. Replaced generic variable names to match the script's conventions.
aci-preupgrade-validation-script.py
Outdated
| model = attr.get('model', '') | ||
| if not is_affected_model(model): | ||
| continue | ||
| dn = attr.get('dn', '') |
There was a problem hiding this comment.
I could see dn extraction and node_regex parsing logic is duplicated in both LC and FM loops. Can you implement with a small helper, so that parsing can be implemented once and reused.
WRED_PASS:FAIL:NA_APIC_Logs.txt Please find the attached logs. Executed on fab3 for PASS, FAIL and NA scenario. |
Summary
Adds a new pre-upgrade validation check to detect fabric nodes at risk due to CSCwt50713, where WRED-enabled QoS combined with specific Leaf/LC/FM hardware models can cause N9504 spine crashes after upgrading to affected ACI releases.
Detection Logic
Three gates must all be true to trigger a FAIL:
Version Gate – Target version is in the affected range:
Feature Gate – WRED is enabled (
qosCong.algo = wred)Hardware Gate – Any of the following affected models are present:
Testing
tests/checks/wred_affected_model_check/