ext/strings: add runtime cost tracking for string extension functions #1304
Closed
MindflareX wants to merge 1 commit into google:master
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
The string extension library (ext/strings.go) did not register any OverloadCostTracker entries, causing all string extension function calls to fall through to the default case in costCall(), which charges a cost of 1 regardless of input size or output size. This is inconsistent with ext/lists.go, ext/sets.go, and ext/regex.go, which all register proper cost trackers for their functions.

The missing cost tracking allows resource exhaustion when CostLimit() is used as a defense mechanism:

- indexOf()/lastIndexOf(): O(N*M) naive nested-loop search with cost=1. A 500K string searched against a 200K pattern burns 24 seconds of CPU with no cost limit error.
- join(): Concatenates a list of strings into a single output with cost=1. Joining 10K strings of 10K chars each produces 587 MB of allocation with no cost limit error.
- replace(): Output can be much larger than input when replacing short patterns with long strings, tracked as cost=1.

This change adds OverloadCostTracker registrations for indexOf, lastIndexOf, join, replace, lowerAscii, upperAscii, reverse, trim, and split. The cost formulas follow the patterns used by the built-in string operations in interpreter/runtimecost.go:

- indexOf/lastIndexOf: O(n*m) * StringTraversalCostFactor
- join/replace: O(result_size) * StringTraversalCostFactor
- lowerAscii/upperAscii/reverse/trim: O(input_size) * StringTraversalCostFactor
- split: O(result_size) * StringTraversalCostFactor + ListCreateBaseCost
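As a rough illustration of the cost formulas listed above, the following standalone sketch computes them as plain functions. It is not the code from this change and does not use the cel-go API: the function names are hypothetical, the constant values are assumptions (a traversal factor of 0.1 is modelled as a rounded-up integer division by 10, and the list base cost is taken to be 10), and the actual rounding behavior in cel-go may differ.

```go
package main

import "fmt"

// Assumed values for illustration only. cel-go defines constants named
// StringTraversalCostFactor and ListCreateBaseCost; the numbers used here
// are assumptions, not read from the library.
const (
	traversalDivisor   = 10 // stands in for a traversal factor of 0.1
	listCreateBaseCost = 10
)

// traversalCost scales a size by the assumed traversal factor, rounding up.
func traversalCost(size uint64) uint64 {
	return (size + traversalDivisor - 1) / traversalDivisor
}

// indexOfCost models indexOf/lastIndexOf: str_size * substr_size * factor.
// Note: the product can overflow uint64 for extremely large inputs; a real
// tracker would need to guard against that.
func indexOfCost(strSize, substrSize uint64) uint64 {
	return traversalCost(strSize * substrSize)
}

// resultSizeCost models join/replace: result_size * factor.
func resultSizeCost(resultSize uint64) uint64 {
	return traversalCost(resultSize)
}

// inputSizeCost models lowerAscii/upperAscii/reverse/trim: input_size * factor.
func inputSizeCost(inputSize uint64) uint64 {
	return traversalCost(inputSize)
}

// splitCost models split: result_size * factor + list creation base cost.
func splitCost(resultSize uint64) uint64 {
	return traversalCost(resultSize) + listCreateBaseCost
}

func main() {
	// The 500K-by-200K search from the description, under the assumed factor.
	fmt.Println("indexOf:", indexOfCost(500_000, 200_000))
	// Joining 10K strings of 10K chars yields roughly 100M characters of output.
	fmt.Println("join:", resultSizeCost(10_000*10_000))
	fmt.Println("lowerAscii:", inputSizeCost(1_000))
	fmt.Println("split:", splitCost(1_000))
}
```

Under these assumed constants, each of the large inputs described above would be charged far more than a limit of 100, whereas a flat cost of 1 never would.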
6b2c404 to 7d02e44
Collaborator
There's already an open PR for this which hasn't yet responded to feedback.
Summary
The string extension library (ext/strings.go) does not register any OverloadCostTracker entries in its ProgramOptions() method, causing all string extension function calls to fall through to the default case in costCall() (interpreter/runtimecost.go:337), which charges a cost of 1 regardless of input size or output size.

This is inconsistent with ext/lists.go, ext/sets.go, and ext/regex.go, which all register proper cost trackers.

Impact
The missing cost tracking allows resource exhaustion when CostLimit() is used as a defense mechanism:

- indexOf()/lastIndexOf(): Uses an O(N×M) naive nested-loop search implementation (ext/strings.go:608-619). A 500K string searched against a 200K almost-matching pattern burns 24 seconds of CPU at CostLimit(100) with no error.
- join(): Concatenates N strings into a single output string. Joining 10,000 strings of 10,000 chars each allocates 587 MB at CostLimit(100) with no error.
- replace(): Output can be significantly larger than input (e.g., replacing each character with a long string). replace("a", "<1000 chars>") on 100K input produces 95 MB with no cost error.
- lowerAscii()/upperAscii()/reverse()/split()/trim(): All perform O(N) work on the input string but are tracked as cost=1.

Fix
This change adds OverloadCostTracker registrations for all string extension functions. The cost formulas follow the patterns already used by built-in string operations in interpreter/runtimecost.go:

- indexOf/lastIndexOf: str_size * substr_size * StringTraversalCostFactor
- join/replace: result_size * StringTraversalCostFactor
- lowerAscii/upperAscii/reverse/trim: input_size * StringTraversalCostFactor
- split: result_size * StringTraversalCostFactor + ListCreateBaseCost

Test
Added TestStringsCostTracking, which verifies:

- indexOf/lastIndexOf/join operations correctly exceed cost limits

All existing tests pass.