[netrid] Check response validity when calling all_flights#1388
[netrid] Check response validity when calling all_flights#1388the-glu wants to merge 1 commit intointeruss:mainfrom
Conversation
BenjaminPelletier
left a comment
There was a problem hiding this comment.
Some checks expects the query to fail (e.g. with too large area checks) so checks are not done every time. I noticed that some
Misbehaviorwon't work as they expect the initial query to pass, no? (Here there will no flight if area is too large)(But that previous behavior).
Yes, making sure our documentation is correct and actual actions match documentation are both important, and I think there is probably some existing tech debt that will need to be resolved as part of the main issue. I think the core scenario does work currently though; here is the sequence view artifact from the latest CI run.
Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.
diagonal_kmis computed twice to avoid returning two values, but we could change that.
I think the strong presumption to favor clarity over maximum performance holds here and computing twice is no problem if that makes the code easier to read, understand, maintain, etc.
monitoring/uss_qualifier/scenarios/astm/netrid/common/misbehavior.py
Outdated
Show resolved
Hide resolved
|
|
||
| self.record_queries(sp_observation.queries) | ||
|
|
||
| if not sp_observation.success and not self._is_area_too_large(rect): |
There was a problem hiding this comment.
As you noticed in your PR comment, rid.all_flights doesn't just perform an ISA query; it performs an ISA query (potentially) followed by a bunch of queries for flights and flight details from individual USSs. The DSS failing to return ISAs is a violation of different requirements than the core USS failing to respond correctly to a flights query or a flight details query. Probably part of #1386 involves resolving the TODO just above this section.
There was a problem hiding this comment.
Yes, I think the todo is resolved, however the name is wrong (and probably wrong for existing ISA query as well, since they do the same).
Do you have a better name in mind?
There was a problem hiding this comment.
For misbehaviour I switched to "Initial queries" and improved description based on comment bellow, but I think the point still hold for display data evaluator ones.
There was a problem hiding this comment.
Remember that test steps are where actions are performed, and then checks (following the actions) determine if any requirements were violated based on the information obtained from those actions. With that in mind, I'm not sure what "Initial queries" would be checking. I think this first check description makes sense:
interuss.f3411.dss_endpoints.SearchISAs requires a USS providing a DSS instance to implement the DSS endpoints of the OpenAPI specification. If uss_qualifier is unable to query the DSS for ISAs, this check will fail.
But, it seems like the thing we're checking is "Did the (single) DSS query to search for ISAs succeed?" I would expect that check to be named something like "Successful ISA search" or "DSS ISAs search succeeded" or "ISA query succeeded" or "ISA query success".
This check description makes less sense to me (additional emphasis added):
interuss.f3411.dss_endpoints.SearchISAs requires a USS providing a DSS instance to implement the DSS endpoints of the OpenAPI specification. Then, in order to properly test whether the SP handles authentication correctly, after identifying the SP contact information via its ISA in the DSS, this step will first attempt to do a flights request with the proper credentials to confirm that the requested data is indeed available to any authorized query. If uss_qualifier is unable to query the DSS for ISAs or flights, this check will fail.
This seems to be documenting a check (the first sentence) then another, separate test step (second sentence). Actions in a test step belong in the test step documentation (one documentation level higher than check documentation) and then checks should describe only how the information already obtained will be evaluated to determine if a requirement was violated.
So, it seems like there are three actions in this test step (due to rid.all_flights):
- Query DSS for ISAs
- For each ISA found, query the USS for flights
- For each flight found, query the USS for flight details
The InterUSS requirement interuss.f3411.dss_endpoints.SearchISAs applies only to the information obtained from action 1. interuss.f3411.dss_endpoints.SearchISAs doesn't relate at all to the information obtained in actions 2 nor 3 -- instead, presumably requirements NET0710,1 and NET0710,2 apply to actions 2 and 3, respectively (when those queries are expected to succeed). But, these should be two separate checks since the information we're evaluating for NET0710,1 is coming from action 2 and the information we're evaluating for NET0710,2 is coming from action 3 -- the action, information, and relevant requirement are all different between action 2/NET0710,1 and action 3/NET0720,2 so they shouldn't be combined into a single check.
The original problem prompting #1386 was that these two checks AND the ISA query check were all combined into a single check. When that check failed, it was extremely unclear what had actually gone wrong. That's because a check for A && B && C && D failing provides little information as to whether A failed, B failed, and/or C failed. This was especially problematic because the check only documented that D was being checked, so the fact that B was failing was extremely difficult to determine.
monitoring/uss_qualifier/scenarios/astm/netrid/v22a/misbehavior.md
Outdated
Show resolved
Hide resolved
monitoring/uss_qualifier/scenarios/astm/netrid/v22a/misbehavior.md
Outdated
Show resolved
Hide resolved
monitoring/uss_qualifier/scenarios/astm/netrid/display_data_evaluator.py
Outdated
Show resolved
Hide resolved
I think that the issue that in misbehavior, the test should not try to do authentication tests with too large area (nor multiple ones since it's the same), because it will directly fail at initial step, therefore not finding any flights to test.
Commenting line 100/101, and doing on "self._rid_version.max_details_diagonal_km * 1000 - 100" check seems to do it more cleanly:
(Also print-based-debugging show similar result, however I see the 3 pass/pooling (too large without flights to test, clustered ok and details ok, but I don't understand why the rapport show only 2 "Initial queries" (and one check)), the two OK pass seems to be merged)
Yes, with the new explicit check it will be better (see previous screenshots) so we're probably better now. |
Answering to myself: Because the pooler stop at the first function returning result. That seems wrong, I would just switch to a fix, The goal is to evaluate authentification, not if initial query fail because it's too large; and the dedicated function to test search area is already restricting to this for that reason. (I can side fix it there or in another PR) |
The authentication step still seems valid with area-too-large. The correct response to a properly-authenticated "GET /flights" request is 413 (area too large). When that same "GET /flights" request is made without authentication, the correct response is 401 (413 reveals some information to an unauthenticated user). It does seem like good coverage would probably (also) include a correct response of 200 to compare against the correct unauthenticated response of 401, but the current test scenario documentation doesn't specifically call for a 200 or 413 to compare against 401, so comparing 413 to 401 does seem to accurately fulfill/implement the test scenario documentation.
Yes, I think a separate, explicit check for the DSS query is an improvement and could be sufficient for a single PR. But, to fully fix the scenario, I think the DSS query check should point to the DSS query it's checking (specify query_timestamp) and each query should have at least 1 check. A query with no check means "the result of this query could be literally anything and there would be nothing of concern to report" (in that case, why make the query at all?) In the new screenshot, the USS /flights query in e29 appears to have no checks. So, it could return literally anything (413: error, 200: flights, 200: cake recipe) and we wouldn't be justified reporting any concerns. If that's actually true, we probably shouldn't make the query. In the more likely event that's not true, we should document the check we'd perform -- presumably we expect that query to work properly, which means if it didn't return 413 we'd be concerned per NET0710,1. |


Fix #1386 by checking
all_flightsresponses for errors, before doing anything else.A few points:
ISA querycheck, but is it really an 'ISA query' since it's also querying flights?Misbehaviorwon't work as they expect the initial query to pass, no? (Here there will no flight if area is too large)(But that previous behavior).diagonal_kmis computed twice to avoid returning two values, but we could change that.