fix(query): passwords and secrets improvements to "Avoiding TF resource access" allow rules #7905
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.








Reason for Proposed Changes
INITIAL NOTE - the "resource reference" (terraform), whenever mentioned, refers to something like this :
data.aws_kms_secrets.sci_app_kms_secrets.plaintext["ayreshirerarran_password"]Where an "element" would be each individual block/resource/field mentioned like "data", "aws_kms_secrets", "plaintext" etc.
Currently the "Avoiding TF resource access" allow rule for the "Generic Password" and "Generic Secret" regex rules have a few issues that are causing false positive results.
Current regex: (without the double slashing from the actual regex_rules)
"(?i)['\"]?password['\"]?\s*=\s*(([a-zA-z_]+(.))?[a-zA-z_]+\s*(.)\s*[a-zA-z_]+(.)[a-zA-z_]+)?(\s*:\s*null|null)$""(?i)['\"]?secret[_]?(key)?['\"]?\s*=\s*([a-zA-z_]+(.))?[a-zA-z_]+(.)[a-zA-z_]+(.)[a-zA-z_]+"On the first regex we can see quite a few issues:
(.)was almost certainly meant to represent the literal dot character and written as\.instead.nullafter the=sign but the way it is formulated is nonsensical. With the current regex we can captures strings such as:"password" = test.test.test : null"password" = test.test.testnull(\s*:\s*null|null)capture at the end, not capturing simple references like:"password" = valid.resource.reference"password" = valid.referencevalid.referencewould not be captured.$and the lack of:as an alternative to the=are notable.For the second regex a lot of problems persist:
:, and not capturing resource references of 2 elements persist.([a-zA-z_]+(.))?excerpt of the regex is useless since the allow rules capture on a per line basis, with this any number of resource references with 3+ elements set within a given line will be captured no matter how many elements are used.secret = re.rwould not be captured.(.)strings such as :secret = reference_caught_for_wrong.reasonsecret = mypasswordsecret = a3b4c@mypasswordsecretsecret = r1and3omwo2rd3shere4[a-zA-z_]separated by any character at least 2 times will be captured. This means that to stop it from capturing we would need something like 2 numeric values or special characters in a row inside the strings such as "13" or "@%". Needless to say this regex is way to broad mostly due to the wrongful use of the dot notation. As of now basically any string of length 5 or more would be wrongfully captured.", but this might not be true in the future or in specific scenarios not yet accounted for.The reason all these issues were noticed in the first place stems from the new examples in sample negative59 where one other issue requires fixing from the regex to allow them to work properly : allowing resources with index referencing such as
random_password.client_password[each.key].result.Proposed Changes
Updated regex as follows :
"(?i)['\"]?password['\"]?\s*=\s*(([a-zA-z_]+(.))?[a-zA-z_]+\s*(.)\s*[a-zA-z_]+(.)[a-zA-z_]+)?(\s*:\s*null|null)$""(?i)['\"]?password['\"]?\s*[:=]\s*(([a-zA-Z_]+(\".+\"|\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?\.[a-zA-Z_]+(\".+\"|\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?)\s*([\[\.\)\]\}\$]|(:\s*null))|null)"Starting with the overall structure of the new regex we have a few parts we can breakdown:
"(?i)['\"]?password['\"]?\s*[:=]\s*:.(([a-zA-Z_]+(\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?\.[a-zA-Z_]+(\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?)\s*.".(\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?should be broad enough to identify proper referencing and simple integer use whilst also not consuming large unrelated strings which would happen with a simple approach such as(\[.+\])for example.\s*used between a resource reference element and a "(.)" value were removed since it does not make sense to ever have spacing in that way.([\[\.\)\]\}\$]|(:\s*null))- the second to last segment of the regex shows the possible values after the initial resource referencing:[- this will enable index referencing on the last element of the resource referencing done.- this will enable referencing that is longer than 2 elements (the most common scenario)),]and}- this allows the reference to be set within a number of structures like the for statement (]) or the object used in the "jsonencode" on the new negative59 test.$- just like before ending the line after the declaration is allowed|null)null" value that could be set to the target password field, representing a value that should not be flagged as sensitive.Note that although i cannot see a reason to include "
(:\s*null)" as a possibility after the resource referencing portion of the regex i kept it to maintain the original regex's ability to capture said values, values such as"password" = test.test.test : null.Regex for "Generic Secret"
"(?i)['\"]?secret[_]?(key)?['\"]?\s*=\s*([a-zA-z_]+(.))?[a-zA-z_]+(.)[a-zA-z_]+(.)[a-zA-z_]+""(?i)['\"]?secret[_]?(key)?['\"]?\s*[:=]\s*(([a-zA-Z_]+(\".+\"|\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?\.[a-zA-Z_]+(\".+\"|\[([a-zA-Z_]+\.[a-zA-Z_]+.*|\d+)\])?)\s*([\.\)\]\$]|(:\s*null))|null)"This regex should not require much explanation, it is now nearly identical to the previous one with the only change being the starting portion focusing on "secrets" over "passwords"; which is to be expected since they are used for the same reason, literally boasting the same "description" which is "Avoiding TF resource access" on the regex_rules.json.
Final Note
"password" : random_password["index"].clien...", and yet even though none of the allow rules cover the case and the capture regex for "Generic Password" does capture the value, it is not flagged; even when I removed all allow rules the results persisted. This allowed me to keep the 2 new regex shorter than expected but I can't help but wonder why the value is not flagged in the first place.I submit this contribution under the Apache-2.0 license.