Enable buck-native x86 simulator test for QNN op tests #20494
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20494
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit ae8279b with merge base fa5d85a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@billmguo has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109606746.
Force-pushed from b190fc6 to 970dea0.
Summary:
The QNN operator tests in `backends/qualcomm/tests/test_qnn_delegate.py` already support host x86_64 simulator execution via `--enable_x86_64`, but were only runnable as a standalone argparse script after a CMake build. There was no `buck test` path: `test_qnn_delegate` is a `python_library` (not a test target), and the host `qnn_executor_runner` binary was Android-only.

This change wires up a buck-native, internal-only test target `//executorch/backends/qualcomm/tests:test_qnn_delegate_x86` that runs the FP16 and quantized operator suites (`TestQNNFloatingPointOperator`, `TestQNNQuantizedOperator`) on the x86 QNN simulator with no device and no CMake build tree.

Changes:
- `backends/qualcomm/runtime/targets.bzl`: add `CXX` to the `pal` library platforms. It already ships `pal/src/linux/*.cpp`, but was gated to `[ANDROID]`, which blocked any host build of `:runtime` (an exported dep). This is a Buck/CMake parity fix.
- `examples/qualcomm/executor_runner/targets.bzl`: add `CXX` to `qnn_executor_runner` so the host runner binary builds, and add `//executorch/kernels/portable:generated_lib` to its deps. The CMake runner links `full_portable_ops_lib` + `quantized_ops_lib`; the Buck runner had only the quantized lib, so ops that leave a CPU-fallback node (e.g. `acos` -> `aten::asin.out`, `cast`, `index_copy`, `index_put`, `logical_and`, `avg_pool1d`) aborted the runner with a "Missing operator" error.
- `backends/qualcomm/tests/test_qnn_delegate_x86.py` (new): under `buck test` the file's argparse `__main__`/`setup_environment()` never runs, so this wrapper sets the equivalent `TestQNN` class attributes (`enable_x86_64`, `backend`, `soc_model`) at import and subclasses the operator TestCases so the runner discovers them.
- `backends/qualcomm/tests/BUCK`: add the `test_qnn_delegate_x86` `python_test`, gated behind `runtime.is_oss` via a top-level conditional expression (the BUCK dialect forbids top-level `if`/`def`). It stays out of the OSS graph to preserve the `test-qnn-buck-build-linux` CI signal.

The QNN x86 SDK libs and host runner are supplied via `env` (`QNN_SDK_ROOT`, `LD_LIBRARY_PATH`, `QNN_EXECUTOR_RUNNER`). `CUDA_VISIBLE_DEVICES` is forced empty because these tests are CPU-only (calibration, AOT compile, x86 sim); without it, parallel test shards each grabbed CUDA and exhausted GPU memory.

The fbcode and xplat copies of all four files are kept byte-identical per the existing twin convention.

Differential Revision: D109606746
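The wrapper pattern described in the summary above (set the shared class attributes at import time, then subclass the operator suites so the test runner can discover them) might look roughly like the sketch below. It is self-contained for illustration only: `TestQNN` and `TestQNNFloatingPointOperator` here are hypothetical stand-ins for the real classes, the `"htp"` and `"SM8650"` values are placeholders, and `TestQNNFloatingPointOperatorX86` and `run_wrapper_suite` are made-up names, not taken from the PR.

```python
import unittest


class TestQNN(unittest.TestCase):
    """Hypothetical stand-in for the shared QNN test base class."""

    # Defaults that the argparse entry point would normally overwrite.
    enable_x86_64 = False
    backend = None
    soc_model = None


class TestQNNFloatingPointOperator(TestQNN):
    """Hypothetical stand-in for one operator suite."""

    def test_runs_on_simulator(self):
        # A real test would lower a model and run it; this only checks config.
        self.assertTrue(self.enable_x86_64)


# Import-time configuration: no __main__ block or argparse is involved,
# so the settings are in place as soon as a runner imports this module.
TestQNN.enable_x86_64 = True
TestQNN.backend = "htp"  # placeholder value (assumption)
TestQNN.soc_model = "SM8650"  # placeholder value (assumption)


class TestQNNFloatingPointOperatorX86(TestQNNFloatingPointOperator):
    """Subclass defined in the wrapper module so discovery picks it up here."""


def run_wrapper_suite():
    """Load and run the wrapper subclass the way a unittest runner would."""
    suite = unittest.TestLoader().loadTestsFromTestCase(
        TestQNNFloatingPointOperatorX86
    )
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the attributes live on the base class, every inherited test sees them without any per-test setup.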
Force-pushed from 970dea0 to 224e6d6.
Force-pushed from 224e6d6 to c0da9c3.
Summary:
The QNN operator tests in `backends/qualcomm/tests/test_qnn_delegate.py` already support host x86_64 simulator execution via `--enable_x86_64`, but were only runnable as a standalone argparse script after a CMake build. There was no `buck test` path: `test_qnn_delegate` is a `python_library` (not a test target), and the host `qnn_executor_runner` binary was Android-only.

This change wires up a buck-native, internal-only test target `//executorch/backends/qualcomm/tests:test_qnn_delegate_x86` that runs the FP16 and quantized operator suites (`TestQNNFloatingPointOperator`, `TestQNNQuantizedOperator`) on the x86 QNN simulator with no device and no CMake build tree.

Changes:
- `backends/qualcomm/runtime/targets.bzl`: add `CXX` to the `pal` library platforms (gated `if is_fbcode()`). It already ships `pal/src/linux/*.cpp`, but was gated to `[ANDROID]`, which blocked any host build of `:runtime` (an exported dep).
- `examples/qualcomm/executor_runner/targets.bzl`: add `CXX` to `qnn_executor_runner` (gated `if is_fbcode()`) so the host runner binary builds, and add `//executorch/kernels/portable:generated_lib` to its deps. The CMake runner links `full_portable_ops_lib` + `quantized_ops_lib`; the Buck runner had only the quantized lib, so ops that leave a CPU-fallback node (e.g. `acos` -> `aten::asin.out`, `cast`, `index_copy`, `index_put`, `logical_and`, `avg_pool1d`) aborted the runner with a "Missing operator" error. The `CXX` (host) surface is gated to `is_fbcode()` because it is only used by the internal x86 simulator test; in OSS, `CXX` includes macOS (no QNN host libs), so the host runner/pal stay Android-only there, restoring the original OSS build surface.
- `backends/qualcomm/tests/test_qnn_delegate_x86.py` (new): under `buck test` the file's argparse `__main__`/`setup_environment()` never runs, so this wrapper sets the equivalent `TestQNN` class attributes (`enable_x86_64`, `backend`, `soc_model`) at import and subclasses the operator TestCases so the runner discovers them.
- `backends/qualcomm/tests/BUCK`: add the `test_qnn_delegate_x86` `python_test`, gated behind `runtime.is_oss` via a top-level conditional expression (the BUCK dialect forbids top-level `if`/`def`). It stays out of the OSS graph to preserve the `test-qnn-buck-build-linux` CI signal.

The QNN x86 SDK libs and host runner are supplied via `env` (`QNN_SDK_ROOT`, `LD_LIBRARY_PATH`, `QNN_EXECUTOR_RUNNER`). `CUDA_VISIBLE_DEVICES` is forced empty because these tests are CPU-only (calibration, AOT compile, x86 sim); without it, parallel test shards each grabbed CUDA and exhausted GPU memory.

The fbcode and xplat copies of all four files are kept byte-identical per the existing twin convention.

Differential Revision: D109606746
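The `is_fbcode()` gating in this revision boils down to choosing a platform list with a conditional expression. The sketch below illustrates the idea in plain Python rather than Starlark: `ANDROID` and `CXX` are string stand-ins for the real platform constants, the function name `host_gated_platforms` is hypothetical, and the `is_fbcode` flag is passed in as a boolean instead of being queried from the build system.

```python
# String stand-ins for the Buck platform constants (assumption for illustration).
ANDROID = "android"
CXX = "cxx"


def host_gated_platforms(is_fbcode):
    """Return the platform list for a target whose host variant is internal-only.

    Internally the host (CXX) variant is added next to Android; in OSS the
    target stays Android-only, which matches the original OSS build surface.
    """
    return [ANDROID, CXX] if is_fbcode else [ANDROID]
```

For example, `host_gated_platforms(False)` yields only the Android platform, so an OSS macOS host build never tries to resolve the QNN host libraries.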
Force-pushed from c0da9c3 to 3545844.
Force-pushed from 3545844 to 3b8432d.
Summary:
The QNN operator tests in `backends/qualcomm/tests/test_qnn_delegate.py` already support host x86_64 simulator execution via `--enable_x86_64`, but were only runnable as a standalone argparse script after a CMake build. There was no `buck test` path: `test_qnn_delegate` is a `python_library` (not a test target), and the host `qnn_executor_runner` binary was Android-only.

This change wires up a buck-native, internal-only test target `//executorch/backends/qualcomm/tests:test_qnn_delegate_x86` that runs the FP16 and quantized operator suites (`TestQNNFloatingPointOperator`, `TestQNNQuantizedOperator`) on the x86 QNN simulator with no device and no CMake build tree.

Changes:
- `backends/qualcomm/runtime/targets.bzl`: add `CXX` to the `pal` library platforms so they match `:logging` and `:runtime` (both `[ANDROID, CXX]`). `:pal` ships `pal/src/linux/*.cpp` and is an exported dep of `:runtime`, so its host (CXX) variant must exist for the `:runtime` CXX build to resolve on Linux -- both the OSS `buck2 build //backends/qualcomm/...` job (`test-qnn-buck-build-linux`) and the internal x86 simulator runner. (Not gated on `is_fbcode()`: the OSS Linux job builds the `:runtime` CXX variant too, so `pal` must provide CXX there as well.)
- `examples/qualcomm/executor_runner/targets.bzl`: add `CXX` to `qnn_executor_runner` (gated `if is_fbcode()`) so the host runner binary builds, and add `//executorch/kernels/portable:generated_lib` to its deps. The CMake runner links `full_portable_ops_lib` + `quantized_ops_lib`; the Buck runner had only the quantized lib, so ops that leave a CPU-fallback node (e.g. `acos` -> `aten::asin.out`, `cast`, `index_copy`, `index_put`, `logical_and`, `avg_pool1d`) aborted the runner with a "Missing operator" error. The runner's `CXX` (host) surface is gated to `is_fbcode()` because it is only used by the internal x86 simulator test; in OSS, `CXX` includes macOS (no QNN host libs), so the host runner stays Android-only there, restoring the original OSS build surface.
- `backends/qualcomm/tests/test_qnn_delegate_x86.py` (new): under `buck test` the file's argparse `__main__`/`setup_environment()` never runs, so this wrapper sets the equivalent `TestQNN` class attributes (`enable_x86_64`, `backend`, `soc_model`) at import and subclasses the operator TestCases so the runner discovers them.
- `backends/qualcomm/tests/BUCK`: add the `test_qnn_delegate_x86` `python_test`, gated behind `runtime.is_oss` via a top-level conditional expression (the BUCK dialect forbids top-level `if`/`def`). It stays out of the OSS graph to preserve the `test-qnn-buck-build-linux` CI signal.

The QNN x86 SDK libs and host runner are supplied via `env` (`QNN_SDK_ROOT`, `LD_LIBRARY_PATH`, `QNN_EXECUTOR_RUNNER`). `CUDA_VISIBLE_DEVICES` is forced empty because these tests are CPU-only (calibration, AOT compile, x86 sim); without it, parallel test shards each grabbed CUDA and exhausted GPU memory.

The fbcode and xplat copies of all four files are kept byte-identical per the existing twin convention.

Reviewed By: YIWENX14
Differential Revision: D109606746
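The BUCK-side gating and `env` wiring described above can be sketched as follows. This is plain Python standing in for the BUCK dialect, under several assumptions: `python_test` is a stub that only records its arguments, the helper functions `x86_sim_env` and `declare_x86_test` exist purely to make the sketch testable (a real BUCK file could not use `def` at top level), and the `lib/x86_64-linux-clang` subdirectory used for `LD_LIBRARY_PATH` is an assumed SDK layout rather than something the PR states.

```python
def python_test(**kwargs):
    """Stub for the Buck `python_test` rule; it just returns its arguments."""
    return kwargs


def x86_sim_env(qnn_sdk_root, runner_path):
    """Build the environment handed to the test process."""
    return {
        "QNN_SDK_ROOT": qnn_sdk_root,
        # Assumed location of the host QNN libraries inside the SDK.
        "LD_LIBRARY_PATH": qnn_sdk_root + "/lib/x86_64-linux-clang",
        "QNN_EXECUTOR_RUNNER": runner_path,
        # Empty string hides all GPUs so parallel shards stay CPU-only.
        "CUDA_VISIBLE_DEVICES": "",
    }


def declare_x86_test(is_oss, qnn_sdk_root, runner_path):
    """Declare the test only outside OSS, using one conditional expression."""
    return None if is_oss else python_test(
        name="test_qnn_delegate_x86",
        srcs=["test_qnn_delegate_x86.py"],
        env=x86_sim_env(qnn_sdk_root, runner_path),
    )
```

Writing the gate as `None if is_oss else python_test(...)` keeps the whole declaration a single expression, which is the shape a dialect without top-level `if` statements allows.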
Force-pushed from 3b8432d to ae8279b.