MPI Tutorial

NOTE: the mpich installation on Millennium depends on "gexec". If "gexec" isn't working from your account, mpich will not run for you.

NOTE: On the CITRIS and PSI BATCH clusters, you need to launch your processes using qsub and properly formatted Torque (PBS) batch scripts.
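For reference, a properly formatted Torque (PBS) batch script for an MPI job might look like the sketch below. This is an assumption, not a script taken from the CITRIS or PSI documentation: the resource request, the mpirun invocation, and the file names (mpijob, mpiprog, mpijob.pbs) are placeholders you would adjust for those clusters.

#!/bin/sh
#PBS -N mpijob                 # job name (placeholder)
#PBS -l nodes=4:ppn=2          # ask Torque for 4 nodes, 2 processors each
#PBS -l walltime=00:10:00      # 10-minute wall-clock limit
#PBS -j oe                     # merge stdout and stderr into one file

cd $PBS_O_WORKDIR              # start in the directory qsub was run from
mpirun -np 8 ./mpiprog         # launch 8 MPI processes

Submit it with qsub mpijob.pbs and check its status with qstat.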

Index

PATH                           Transport    Compiler
/usr/mill/pkg/mpich            Ethernet     gcc/gfortran
/usr/mill/pkg/mpich-gm         Myrinet GM   gcc/gfortran
/usr/mill/pkg/mpich-intel      Ethernet     Intel icc/ifort
/usr/mill/pkg/mpich-gm-intel   Myrinet GM   Intel icc/ifort

Web Manual for MPI and MPE

The web manual contains a complete listing of all MPI commands, MPI routines, and MPE routines: mpiCC, mpicc, mpif77, mpif90, mpirun, MPI_* functions, MPIO_* functions, CLOG_* functions, MPE_* functions, and more.

Using Millennium mpirun

Millennium mpirun differs from the standard mpich mpirun:

  • It dynamically generates a load-balanced list based on real-time data instead of using a flat file listing of nodes in the cluster.
  • It processes the -nodes option (see examples below).
  • It does not allow the -machinefile option. Instead, it processes the GEXEC_SVRS environment variable (if it is set).

Examples

Run 32 MPI processes on 32 unique nodes:

mpirun -np 32 ./myprog

Run 32 MPI processes on 16 unique nodes (2 processes on each of 16 nodes):

mpirun -np 32 -nodes 16 ./myprog

Run 32 MPI processes on 8 unique nodes (4 processes on each of 8 nodes):

mpirun -np 32 -nodes 8 ./myprog

Choose 4 hosts to run MPI jobs on:

export GEXEC_SVRS="mm10 mm12 mm14 mm83" (Bourne shell)
setenv GEXEC_SVRS "mm10 mm12 mm14 mm83" (C shell)

Run 4 MPI processes on those 4 hosts (1 process per host):

 mpirun -np 4 ./myprog

Run 8 MPI processes on those 4 hosts (2 processes per host):

 mpirun -np 8 ./myprog

Notice that you don't need to specify the -nodes option, since it's implicit in your GEXEC_SVRS environment variable.

Run 4 MPI processes on the first two nodes in your GEXEC_SVRS list (2 processes per host):

mpirun -np 4 -nodes 2 ./myprog

mpirun accepts the -v option for verbose output if you want more information about what it's doing.

mpicc example

Download the sample program hello-world.c.

Compile the sample MPI program.

mpicc hello-world.c -o hello-world

Run the sample MPI program.

mpirun -np 2 ./hello-world

Output from the sample program should look like this:

I am master. I am sending the message.
I am slave. I am receiving the message.
The message is: Hello World!
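
The contents of hello-world.c are not reproduced on this page. As a rough sketch only (an assumption, not the actual file behind the link above), a two-process master/slave program consistent with that output could look like this:

/* Sketch of a two-process MPI hello-world; not the actual hello-world.c. */
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;
    char msg[64];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                      /* master: rank 0 sends */
        printf("I am master. I am sending the message.\n");
        strcpy(msg, "Hello World!");
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {               /* slave: rank 1 receives */
        printf("I am slave. I am receiving the message.\n");
        MPI_Recv(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        printf("The message is: %s\n", msg);
    }

    MPI_Finalize();
    return 0;
}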

More advanced mpicc examples

-mpilog

This option builds a version of your MPI program that generates MPI log files. There are three different log formats: ALOG, CLOG, and SLOG, selected by the $MPE_LOG_FORMAT environment variable. At present, only upshot, using ALOG, is installed on Millennium.

export MPE_LOG_FORMAT="ALOG"
mpicc -mpilog -o mpiprog mpiprog.c
      (Substitute mpiprog with the name of your program.)
mpirun -np x mpiprog
      

Your job will run normally, but you will see logfile messages. The log file will be in the same directory as your mpi program, with an .alog extension.

  
upshot mpiprog.alog
      

The upshot window will pop up showing the log file. Hit the “Options” button to configure upshot. Hit the “Setup” button to view the graphical representation of your mpi program.
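
The -mpilog build also links in the MPE logging library, which lets you mark your own program states so they show up as named, colored regions in upshot. The sketch below is an assumption, not part of the original tutorial; MPE_Log_get_event_number, MPE_Describe_state, and MPE_Log_event are among the MPE_* routines listed in the web manual above, but exact behavior may vary with the MPE version installed on Millennium, and the file name mpe-states.c is just a placeholder.

/* Sketch: logging a custom "compute" state with MPE (assumed API).    */
/* Build with: mpicc -mpilog -o mpe-states mpe-states.c                 */
#include <stdio.h>
#include "mpi.h"
#include "mpe.h"

int main(int argc, char *argv[])
{
    int rank, ev_start, ev_end, i;
    double sum = 0.0;

    MPI_Init(&argc, &argv);               /* with -mpilog, logging starts here */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* All processes grab the same pair of event numbers; rank 0 names them. */
    ev_start = MPE_Log_get_event_number();
    ev_end   = MPE_Log_get_event_number();
    if (rank == 0)
        MPE_Describe_state(ev_start, ev_end, "compute", "red");

    MPE_Log_event(ev_start, 0, "start compute");
    for (i = 0; i < 1000000; i++)         /* stand-in for real work */
        sum += (double)i;
    MPE_Log_event(ev_end, 0, "end compute");

    printf("rank %d: sum = %g\n", rank, sum);
    MPI_Finalize();                       /* with -mpilog, the log file is written here */
    return 0;
}

Run it with mpirun as above and open the resulting .alog file in upshot; the "compute" regions should appear on each process timeline.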

-mpitrace

This will generate traces of all MPI calls.

mpicc -mpitrace -o mpiprog mpiprog.c
mpirun -np x mpiprog
      

Be prepared for verbose output from your program about which MPI_* calls it is making.

-mpianim

This option is the least useful but way cool: it lets you watch a real-time animation of your mpi program.

mpicc -mpianim -o mpiprog mpiprog.c -L/usr/X11R6/lib -lX11 -lm
mpirun -np x ./mpiprog
      

You'll see a window pop up with a dot representing each node of your mpi group. Arrows will flash when a message is sent.

Running mpich over Myrinet using the General Messages (GM) API

Each node of the Millennium Cluster has a 4MB Myrinet 2000 PCI card connected by Myrinet-serial cable to a 9U Myrinet 2000 switch. Myrinet is a very low-latency, high-bandwidth interconnect.

Each node has 6 usable General Messages (GM) API communication ports (2 through 7). A gm port broker was developed to make GM work in a multi-user environment. The broker creates per-user, dynamically load-balanced gm configuration files that avoid port conflicts. Due to the limited number of ports, only one mpich process per machine is allowed.

The GM version of mpich is ideal for mpi programs that do heavy message passing. If your mpi program is very computational without much message passing, please use the p4 (Ethernet) version of mpich to conserve gm ports. GM remote processes are spawned in exactly the same way as in the p4 version.
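
As a rough illustration of what "heavy message passing" means here, consider a two-process ping-pong like the sketch below (an assumption, not part of the original tutorial); its runtime is dominated by per-message latency, which is exactly what Myrinet improves.

/* Sketch: two-process ping-pong to estimate round-trip message latency. */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, i, iters = 10000;
    char byte = 0;
    double t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);          /* start both processes together */
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average round-trip time: %g usec\n",
               (t1 - t0) / iters * 1.0e6);

    MPI_Finalize();
    return 0;
}

Compiled and run with mpirun -np 2 under both the p4 and GM builds, a program like this makes the latency difference between Ethernet and Myrinet easy to see.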

Using the GM version of mpich:

export PATH="/usr/mill/pkg/mpich-gm/bin:${PATH}"

Recompile your mpi program if it was originally compiled to use the p4 interface. Check if your program is compiled for p4 by running a quick test: ./mpiprog -p4help. If you get the p4 help page, you must recompile.

mpicc -o mpiprog mpiprog.c
mpirun -np x mpiprog

If you forget to recompile your program and it is still linked against the p4 library, it will run only as rank 0 (the master) under the gm mpirun script, causing errors.

mpich with the Intel Compilers

Two additional versions of MPICH are installed. They're configured so that mpicc uses the Intel compilers (icc/ifort) instead of the usual gcc/gfortran.

  • /usr/mill/pkg/mpich-intel ← runs over Ethernet, built with Intel icc/ifort
  • /usr/mill/pkg/mpich-gm-intel ← runs over Myrinet GM, built with Intel icc/ifort
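
To use one of these builds, prepend its bin directory to your PATH, mirroring the GM example above, then recompile and run your program (the mpich-intel path is shown; mpich-gm-intel works the same way):

export PATH="/usr/mill/pkg/mpich-intel/bin:${PATH}"
mpicc -o mpiprog mpiprog.c
mpirun -np x mpiprog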
 