
Commit 24e3d6c

docs: add a page about HPE PALS
Signed-off-by: Howard Pritchard <[email protected]>
1 parent d1083f9 commit 24e3d6c

2 files changed: +93, -0 lines changed


docs/launching-apps/index.rst

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ same command).
   lsf
   tm
   gridengine
+  pals

   unusual
   troubleshooting

docs/launching-apps/pals.rst

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
Launching with HPE PALS
=======================

Open MPI supports two modes of launching parallel MPI jobs on HPE
systems with HPE PALS (Parallel Application Launch Service) installed
and enabled:

#. Using Open MPI's full-featured ``mpirun`` launcher.
#. Using the PALS "direct launch" capability.

Unless there is a strong reason to use ``aprun`` for direct launch, the
Open MPI team recommends using ``mpirun`` for launching jobs on these
systems.

PALS is available on HPE systems that use PBS/Torque as the resource
manager. It provides a PMIx server, albeit with some limitations
compared to recent PRRTE releases.

Information about PALS can be found at `HPE's support portal
<http://support.hpe.com/>`_. Search for **parallel application launch
service**.

.. note:: Open MPI has only been tested against PALS 1.5.0. PALS
   support was introduced in PRRTE starting with release 4.0.0.

Since PALS is currently only available on HPE systems managed with
PBS, also see the **Launching with PBS/Torque** documentation
(:doc:`tm`).

Verify PALS support
-------------------

The ``prte_info`` command can be used to determine whether or not an
installed Open MPI includes PALS support:

.. code-block:: sh

   shell$ prte_info | grep pals

If the Open MPI installation includes support for PALS, you
should see lines similar to those below. Note the MCA version
information varies depending on which version of PRRTE is
installed.

.. code-block::

   MCA ess: pals (MCA v2.1.0, API v3.0.0, Component v5.0.0)
   MCA plm: pals (MCA v2.1.0, API v2.0.0, Component v5.0.0)

Using ``mpirun``
----------------

This section assumes there is PALS support in the PRRTE being used by
the Open MPI installation.
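
If it is unclear whether that is the case, one way to check at launch
time is to raise PRRTE's ``plm`` framework verbosity and look for the
``pals`` component in the selection output. This is only a sketch:
``--prtemca`` and ``plm_base_verbose`` are general Open MPI v5.x /
PRRTE mechanisms rather than anything PALS-specific, and the exact
output varies between releases.

.. code-block:: sh

   # Ask PRRTE's plm framework to report component selection;
   # "pals" should appear in the output on a PALS-enabled system
   shell$ mpirun --prtemca plm_base_verbose 10 -np 1 mpi-hello-world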

When ``mpirun`` is launched in a PBS job, ``mpirun`` will
automatically utilize the PALS infrastructure for launching and
controlling the individual MPI processes.

.. note:: Using ``mpirun`` is the recommended method for launching
   Open MPI jobs on HPE systems where PALS is available. This is
   primarily due to limitations in the PMIx server provided in PALS.

For example:

.. code-block:: sh

   # Allocate a PBS job with 32 slots on 1 node
   shell$ qsub -I -l select=1:ncpus=32:mpiprocs=32,filesystems=home -lwalltime=0:30:00 -Afoobar
   qsub: waiting for job XXX to start
   qsub: job XXX ready

   # Now run an Open MPI job on all the slots allocated by PBS
   shell$ mpirun mpi-hello-world

This will run the 32 MPI processes on the node that was allocated by PBS.
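
``mpirun``'s usual process-count and mapping options still apply when
PALS is used underneath. The following is a sketch using standard Open
MPI ``mpirun`` options (not PALS-specific), and the second command
assumes a 2-node allocation:

.. code-block:: sh

   # Run only 16 of the 32 allocated slots
   shell$ mpirun -np 16 mpi-hello-world

   # Control the layout explicitly: 8 processes, 4 per node
   shell$ mpirun -np 8 --map-by ppr:4:node mpi-hello-world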

Using PALS "direct launch" functionality
----------------------------------------

The HPE PALS 1.5.0 documentation states that it comes pre-built with
PMIx support. By default, however, the PALS ``aprun`` launcher does
not use PMIx. To use the launcher's PMIx capabilities, either the
``--pmi=pmix`` command line option must be passed or the ``ALPS_PMI``
environment variable must be set to ``pmix``:

.. code-block:: sh

   shell$ aprun -n 4 -N 2 --pmi=pmix mpi-hello-world

or

.. code-block:: sh

   shell$ ALPS_PMI=pmix aprun -n 4 -N 2 mpi-hello-world

In these examples, four instances of the application are started, two
instances per node.
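
When several ``aprun`` invocations appear in the same job script,
exporting the environment variable once may be more convenient than
repeating the command line option. A sketch, assuming a POSIX shell;
the second application name is only a placeholder:

.. code-block:: sh

   shell$ export ALPS_PMI=pmix
   shell$ aprun -n 4 -N 2 mpi-hello-world
   shell$ aprun -n 8 -N 4 another-mpi-app   # placeholder application name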

See the PALS ``aprun`` man page for documentation on how to use this
command.
