Merged
3 changes: 2 additions & 1 deletion scripts/install-tuolumne.sh
@@ -22,4 +22,5 @@ for f in *.so*; do
 if patchelf --print-needed "$f" 2>/dev/null | grep -Fxq "$OLD"; then
 echo "STILL NEEDS $OLD -> $f"
 fi
-done
+done
+cd -
@PatrickRMiles (Collaborator), Mar 5, 2026

Is this cd - necessary?

Reply from the PR author (Collaborator):

It returns the user to the directory they were in before the script ran. Otherwise they will end up in .venvs/scaffoldvenv-tuo/lib/python3.11/site-packages/torch/lib.
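
The behavior discussed in this thread can be sketched as follows. A script's cd only changes the caller's directory when the script is sourced, which the reply implies is the case for install-tuolumne.sh. The temporary directory below is a hypothetical stand-in for torch/lib, and the subshell variant is an alternative that is not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the torch/lib directory the install script enters.
workdir="$(mktemp -d)"
start_dir="$(pwd)"

# Pattern used in the PR: change directory, do the work, then go back.
cd "$workdir"
# ... the patchelf checks would run here ...
cd - >/dev/null   # returns to $OLDPWD; cd - prints the directory, so silence it
after_cd_dash="$(pwd)"

# Alternative (an assumption, not from the PR): do the work in a subshell,
# so the caller's working directory never changes in the first place.
( cd "$workdir" && : )
after_subshell="$(pwd)"

rmdir "$workdir"
```

Either approach leaves the user where they started. The subshell form also restores the directory if a command inside it fails partway through.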

1 change: 0 additions & 1 deletion scripts/scaffold-tuolumne-torchpypi.job
@@ -13,7 +13,6 @@ ml cce/21.0.0 cray-mpich/9.1.0 rocm/7.1.0 rccl/fast-env-slows-mpi

# Use ccl plugin that we manually built with install-rccl.sh
export NCCL_NET_PLUGIN=../aws-ofi-nccl.git/install/lib/librccl-net.so
export NCCL_NET="AWS Libfabric"

torchrun-hpc -N 1 -n 1 $(which scaffold) generate_fractals -c $(pwd)/ScaFFold/configs/benchmark_default.yml
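
The NCCL_NET_PLUGIN export above uses a path relative to the working directory, so it only resolves when the job starts from the expected location. A minimal sketch of an early check follows. resolve_plugin is a hypothetical helper that is not part of this repository, and the sketch assumes the plugin loader accepts an absolute path the same way it accepts the relative one.

```shell
#!/usr/bin/env bash
# resolve_plugin: hypothetical helper, not part of the repository's scripts.
# Prints the absolute path of an existing file, or fails with a message.
resolve_plugin() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "plugin not found: $path" >&2
    return 1
  fi
  # Resolve the directory in a subshell, then re-append the file name.
  echo "$(cd "$(dirname "$path")" && pwd)/$(basename "$path")"
}

# Usage sketch (path copied from the job script; it must exist in your checkout):
# NCCL_NET_PLUGIN="$(resolve_plugin ../aws-ofi-nccl.git/install/lib/librccl-net.so)" \
#   && export NCCL_NET_PLUGIN
```

Failing before torchrun-hpc launches makes a wrong working directory visible immediately, rather than surfacing later as a plugin that could not be loaded.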

3 changes: 0 additions & 3 deletions scripts/scaffold-tuolumne.job
@@ -15,9 +15,6 @@ ml cce/21.0.0 cray-mpich/9.1.0 rocm/7.1.0 rccl/fast-env-slows-mpi
# (2) Removing libmpi may cause segfault on mpi4py import
export LD_PRELOAD="/opt/rocm-7.1.0/llvm/lib/libomp.so /opt/cray/pe/mpich/9.1.0/ofi/gnu/11.2/lib/libmpi_gnu.so.12"

# Ensure using libfabric. NCCL_NET_PLUGIN should be unecessary to set for WCI wheel.
export NCCL_NET="AWS Libfabric"

torchrun-hpc -N 1 -n 1 $(which scaffold) generate_fractals -c $(pwd)/ScaFFold/configs/benchmark_default.yml

# Uncomment if you want torch profiling