-
-
Notifications
You must be signed in to change notification settings - Fork 33.6k
Description
Bug report
Bug description:
CPython main at c5b3722, ./configure --enable-optimizations --with-lto, Linux 6.17.9-arch1-1
Consider the following small program that does some CPU-bound work in a recursive function:
import sys, random
def foo(x):
for _ in range(x):
random.sample(range(x), x)
if x > 0:
foo(x-1)
for _ in range(x):
random.sample(range(x), x)
foo(int(sys.argv[1]))When I profile this program with the command line python -m profiling.sampling run --sort cumul-pct tachyon-recursive-cumul.py 500, I get the following output:
Captured 100001 samples in 10.00 seconds
Sample rate: 10000.06 samples/sec
Error rate: 16.89%
Profile Stats:
nsamples sample% tottime (s) cumul% cumtime (s) filename:lineno(function)
256/13217760 0.3 0.026 15903.2 1321.776 tachyon-recursive-cumul.py:7(foo)
620/113462 0.7 0.062 136.5 11.346 tachyon-recursive-cumul.py:5(foo)
0/83110 0.0 0.000 100.0 8.311 tachyon-recursive-cumul.py:11(<module>)
0/83110 0.0 0.000 100.0 8.311 _sync_coordinator.py:193(_execute_script)
0/83110 0.0 0.000 100.0 8.311 _sync_coordinator.py:234(main)
0/83110 0.0 0.000 100.0 8.311 _sync_coordinator.py:251(<module>)
0/83110 0.0 0.000 100.0 8.311 <frozen runpy>:88(_run_code)
0/83110 0.0 0.000 100.0 8.311 <frozen runpy>:198(_run_module_as_main)
52397/70127 63.0 5.240 84.4 7.013 random.py:446(Random.sample)
150/6409 0.2 0.015 7.7 0.641 tachyon-recursive-cumul.py:9(foo)
5553/5736 6.7 0.555 6.9 0.574 random.py:448(Random.sample)
4965/4965 6.0 0.496 6.0 0.496 random.py:249(Random._randbelow_with_getrandbits)
4513/4513 5.4 0.451 5.4 0.451 random.py:250(Random._randbelow_with_getrandbits)
2849/2849 3.4 0.285 3.4 0.285 random.py:251(Random._randbelow_with_getrandbits)
2501/2501 3.0 0.250 3.0 0.250 random.py:252(Random._randbelow_with_getrandbits)
Legend:
nsamples: Direct/Cumulative samples (direct executing / on call stack)
sample%: Percentage of total samples this function was directly executing
tottime: Estimated total time spent directly in this function
cumul%: Percentage of total samples when this function was on the call stack
cumtime: Estimated cumulative time (including time in called functions)
filename:lineno(function): Function location and name
Summary of Interesting Functions:
Functions with Highest Direct/Cumulative Ratio (Hot Spots):
1.000 direct/cumulative ratio, 17.8% direct samples: random.py:(Random._randbelow_with_getrandbits)
0.764 direct/cumulative ratio, 69.7% direct samples: random.py:(Random.sample)
0.000 direct/cumulative ratio, 1.2% direct samples: tachyon-recursive-cumul.py:(foo)
Functions with Highest Call Frequency (Indirect Calls):
13336605 indirect calls, 16047.4% total stack presence: tachyon-recursive-cumul.py:(foo)
83110 indirect calls, 100.0% total stack presence: tachyon-recursive-cumul.py:(<module>)
83110 indirect calls, 100.0% total stack presence: _sync_coordinator.py:(_execute_script)
Functions with Highest Call Magnification (Cumulative/Direct):
12999.6x call magnification, 13336605 indirect calls from 1026 direct: tachyon-recursive-cumul.py:(foo)
1.3x call magnification, 17913 indirect calls from 57950 direct: random.py:(Random.sample)
Per the legend "cumul%: Percentage of total samples when this function was on the call stack", I expect foo to have about 100% cumul%. Instead it has 15903.2% and/or 136.5%. The module correctly has 100%, so the problem seems to relate to recursion. Perhaps the profiler is counting each stack frame in each sample?
The actual program I want to profile has a highly recursive part and a nonrecursive part. I want to use the cumulative percent/time to compare the amount of time spent in each part.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux