This profiles the GPU Mode QR v2 problem from reference-kernels and downloads
Nsight Compute details that AI agents can read directly. The full .ncu-rep
GUI report is still included for local inspection.
curl -fsSL https://raw.githubusercontent.com/gpu-mode/popcorn-cli/main/install.sh | bash
popcorn register discord
Restart your terminal if popcorn is not found after installation.
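A quick way to tell whether a restart is needed is to check if the current shell can resolve the binary. This is a minimal sketch using the standard `command -v` builtin; the `popcorn_status` variable is a name chosen here for illustration and is not part of the CLI.

```shell
# Check whether the popcorn binary is visible on PATH in this shell session.
if command -v popcorn >/dev/null 2>&1; then
  popcorn_status="found"
else
  popcorn_status="missing"
fi

# Report the result; "missing" suggests opening a new terminal and retrying.
echo "popcorn: $popcorn_status"
```

If this prints `popcorn: missing` right after installation, open a new terminal and run the check again before continuing.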
mkdir -p qr-v2-profile
cd qr-v2-profile
curl -O https://raw.githubusercontent.com/gpu-mode/reference-kernels/main/problems/linalg/qr_v2/submission.py
The profiler uses the hosted GPU Mode NCU service:
export POPCORN_BREV_PROFILER_URL=https://http--brev-profiler-proxy--dxfjds728w5v.code.run
The following command profiles benchmarks[0] from
reference-kernels/problems/linalg/qr_v2/task.yml:
popcorn submit submission.py \
--leaderboard qr_v2 \
--profile-brev \
--benchmark-index 0 \
--no-tui
The first QR v2 benchmark shape is:
batch: 20; n: 32; cond: 1; seed: 43214
After the run finishes, the CLI downloads and extracts files like:
profile.0-batch-20-n-32-cond-1-seed-43214.zip
profile.0-batch-20-n-32-cond-1-seed-43214/ncu-details.txt
profile.0-batch-20-n-32-cond-1-seed-43214/ncu-details.csv
profile.0-batch-20-n-32-cond-1-seed-43214/profile.ncu-rep # optional GUI report
Use ncu-details.txt or ncu-details.csv as the default artifact for AI
analysis. The CLI prints clickable links for these detail files.
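To hand the detail files to an agent without clicking through the links, they can also be located from the shell. The sketch below assumes the extracted layout listed above (an ncu-details.txt and ncu-details.csv inside each profile.* directory); `list_ncu_details` is a hypothetical helper defined here, not a popcorn command.

```shell
# Print the paths of all Nsight Compute detail files under a directory.
# Usage: list_ncu_details [dir]   (defaults to the current directory)
list_ncu_details() {
  find "${1:-.}" -type f \
    \( -name 'ncu-details.txt' -o -name 'ncu-details.csv' \) | sort
}

# List the detail files under the current working directory.
list_ncu_details .
```

The .ncu-rep report is deliberately left out of the match, since it is a binary file meant for the Nsight Compute GUI.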
The last line printed by the CLI is a command that opens the optional GUI report on macOS:
open -a "NVIDIA Nsight Compute" 'profile.0-batch-20-n-32-cond-1-seed-43214/profile.ncu-rep'
To profile all benchmarks, omit --benchmark-index:
popcorn submit submission.py \
--leaderboard qr_v2 \
--profile-brev \
--no-tui
This profiles every entry in the benchmarks: list of the QR v2 task.yml, not
the tests: list. It produces one zip, plus extracted details and an optional
.ncu-rep file, per benchmark shape.
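With several benchmark shapes, it can help to confirm that every extracted directory actually contains a details file before starting analysis. The following is a sketch that assumes each shape extracts into its own profile.* directory, as in the single-benchmark run; `summarize_profiles` is a hypothetical helper, not part of the CLI.

```shell
# Report, for each extracted profile.* directory, whether ncu-details.txt exists.
# Usage: summarize_profiles [dir]   (defaults to the current directory)
summarize_profiles() {
  root="${1:-.}"
  # The trailing slash makes the glob match directories only, not the .zip files.
  for dir in "$root"/profile.*/; do
    [ -d "$dir" ] || continue   # glob matched nothing
    if [ -f "${dir}ncu-details.txt" ]; then
      echo "ok      ${dir%/}"
    else
      echo "missing ${dir%/}"
    fi
  done
}

# Summarize the profiles extracted into the current working directory.
summarize_profiles .
```

A `missing` line points at a shape whose archive may not have downloaded or extracted completely.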
For correctness testing:
popcorn submit submission.py --leaderboard qr_v2 --gpu B200 --mode test --no-tui
For leaderboard submission:
popcorn submit submission.py --leaderboard qr_v2 --gpu B200 --mode leaderboard --no-tui