Commit 3b97fca

chore(a/b): update tests/README.md

Fix incorrect instructions about running `tools/ab_test.py` with manually
collected test results. Move the A/B visualization section after the manual
test collection section, as the `tools/ab_plot.py` script uses the same
inputs as `tools/ab_test.py`.

Signed-off-by: Egor Lazarchuk <[email protected]>
1 parent 45db986 commit 3b97fca

File tree

1 file changed: +22 −31 lines changed

tests/README.md

Lines changed: 22 additions & 31 deletions
@@ -251,26 +251,6 @@ schedule an A/B-Test in buildkite, the `REVISION_A` and `REVISION_B` environment
 variables need to be set in the "Environment Variables" field under "Options" in
 buildkite's "New Build" modal.
 
-### A/B visualization
-
-To create visualization of A/B runs use `tools/ab_plot.py` script. It supports
-creating `pdf` and `table` outputs with multiple directories as inputs. Example
-usage:
-
-```sh
-./tools/ab_plot.py a_path b_path --output_type pdf
-```
-
-Alternatively using `devtool` running the script in the dev container with
-pre-installed dependencies.
-
-```sh
-./tools/devtool sh ./tools/ab_plot.py a_path b_path --output_type pdf
-```
-
-> [!NOTE] Generating `pdf` output may take some time for tests with a lot of
-> permutations.
-
 ### Beyond commit comparisons
 
@@ -279,26 +259,37 @@ arbitrary environment (such as comparison how the same Firecracker binary
 behaves on different hosts).
 
 For this, run the desired tests in your environments using `devtool` as you
-would for a non-A/B test. The only difference to a normal test run is you should
-set two environment variables: `AWS_EMF_ENVIRONMENT=local` and
-`AWS_EMF_NAMESPACE=local`:
+would for a non-A/B test. This will produce `test_results` directories which
+will contain `metrics.json` files for each run test.
+
+The `tools/ab_test.py` script can find and use these `metrics.json` files in the
+provided directories to compare runs:
 
 ```sh
-AWS_EMF_ENVIRONMENT=local AWS_EMF_NAMESPACE=local tools/devtool -y test -- integration_tests/performance/test_boottime.py::test_boottime
+tools/ab_test.py analyze <path to A `test_results`> <path to B `test_results`>
 ```
 
-This instructs `aws_embedded_metrics` to dump all data series that our A/B-Test
-orchestration would analyze to `stdout`, and pytest will capture this output
-into a file stored at `./test_results/test-report.json`.
+This will then print the same analysis described in the previous sections.
 
-The `tools/ab_test.py` script can consume these test reports, so next collect
-your two test report files to your local machine and run
+#### Visualization
+
+To create visualization of A/B runs use `tools/ab_plot.py` script. It supports
+creating `pdf` and `table` outputs using same `metrics.json` files used by
+`tools/ab_test.py`. Example usage:
 
 ```sh
-tools/ab_test.py analyze <first test-report.json> <second test-report.json>
+./tools/ab_plot.py <path to A `test_results`> <path to B `test_results`> --output_type pdf
 ```
 
-This will then print the same analysis described in the previous sections.
+Alternatively using `devtool` running the script in the dev container with
+pre-installed dependencies.
+
+```sh
+./tools/devtool sh ./tools/ab_plot.py <path to A `test_results`> <path to B `test_results`> --output_type pdf
+```
+
+> [!NOTE] Generating `pdf` output may take some time for tests with a lot of
+> permutations.
 
 #### Troubleshooting
 
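The rewritten README tells readers to point `tools/ab_test.py analyze` at two `test_results` directories and let it find the `metrics.json` files inside. As a rough illustration of that workflow only — this is not Firecracker's actual implementation, and the directory layout and `metrics.json` schema below are invented for the sketch — such a comparison could look like:

```python
# Hypothetical sketch of comparing two `test_results` trees by their
# metrics.json files. The real logic lives in Firecracker's tools/ab_test.py;
# the file layout and JSON schema here are assumptions made for illustration.
import json
import pathlib
import statistics
import tempfile

def collect_metrics(results_dir):
    """Recursively gather metric samples from every metrics.json under results_dir."""
    samples = {}
    for path in pathlib.Path(results_dir).rglob("metrics.json"):
        for name, values in json.loads(path.read_text()).items():
            samples.setdefault(name, []).extend(values)
    return samples

def compare(a_dir, b_dir):
    """Return (mean A, mean B, relative delta) for each metric present in both runs."""
    a, b = collect_metrics(a_dir), collect_metrics(b_dir)
    report = {}
    for name in sorted(a.keys() & b.keys()):
        mean_a, mean_b = statistics.mean(a[name]), statistics.mean(b[name])
        report[name] = (mean_a, mean_b, (mean_b - mean_a) / mean_a)
    return report

# Fabricated sample data standing in for two collected `test_results` trees.
root = pathlib.Path(tempfile.mkdtemp())
for run, values in [("a", [10.0, 12.0]), ("b", [11.0, 13.0])]:
    d = root / run / "test_boottime"
    d.mkdir(parents=True)
    (d / "metrics.json").write_text(json.dumps({"boot_time_us": values}))

print(compare(root / "a", root / "b"))
```

The real tool additionally runs statistical significance tests on the two sample sets rather than only comparing means; this sketch just shows the discover-then-compare shape the README describes.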
