% cat lamhosts
# a 2-node LAM
tbag.osc.edu
alex.osc.edu

Each machine will be given a node identifier (nodeid) starting with 0 for the first listed machine, 1 for the second, etc.
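The numbering rule above can be illustrated with a minimal sketch. It is not part of LAM: the helper name assign_nodeids is hypothetical, and the sketch assumes one hostname per line with "#" starting a comment, as in the listing shown.

```python
def assign_nodeids(boot_schema_text):
    """Map nodeid -> hostname in listing order (hypothetical helper, not a LAM API)."""
    hosts = []
    for line in boot_schema_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if line:
            hosts.append(line.split()[0])     # first token is taken as the hostname
    # The first listed machine becomes n0, the second n1, and so on.
    return {f"n{i}": host for i, host in enumerate(hosts)}


lamhosts = """\
# a 2-node LAM
tbag.osc.edu
alex.osc.edu
"""

for nodeid, host in assign_nodeids(lamhosts).items():
    print(nodeid, host)
```

With the two-machine file above, this pairs n0 with tbag.osc.edu and n1 with alex.osc.edu, the same pairing that recon and lamboot report.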
The recon tool verifies that the cluster is bootable.

% recon -v lamhosts
recon: testing n0 (tbag.osc.edu)
recon: testing n1 (alex.osc.edu)

The lamboot tool actually starts LAM on the specified cluster.

% lamboot -v lamhosts
LAM 6.0 - Ohio Supercomputer Center
hboot n0 (tbag.osc.edu)...
hboot n1 (alex.osc.edu)...

lamboot returns to the UNIX shell prompt. LAM does not force a canned environment or a "LAM shell". The tping command builds user confidence that the cluster and LAM are running.

% tping -c1 N
1 byte from 2 nodes: 0.009 secs
% hcc -o foo foo.c -lmpi
% hf77 -o foo foo.f -lmpi
% mpirun -v n0-1 foo
2445 foo running on n0 (o)
361 foo running on n1

An application with multiple programs must be described in an application schema, a file that lists each program and its target node(s).

% cat appfile
# 1 master, 2 slaves
master n0
slave n0-1
% mpirun -v appfile
3292 master running on n0 (o)
3296 slave running on n0 (o)
412 slave running on n1
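To make the schema-to-process mapping concrete, here is a small sketch that expands an application schema into one (program, node) pair per process. It is an illustration only, not how mpirun is implemented: the helpers expand_nodes and expand_schema are hypothetical, and only the two node forms that appear above ("n0" and "n0-1") are handled. Numbering the processes in schema order is an assumption, chosen because it matches the listing above.

```python
def expand_nodes(spec):
    """Expand 'n0' or 'n0-1' into node indices (only these two forms are handled)."""
    body = spec[1:]  # strip the leading 'n'
    if "-" in body:
        first, last = body.split("-")
        return list(range(int(first), int(last) + 1))
    return [int(body)]


def expand_schema(schema_text):
    """Return (program, nodeid) pairs, one per process, in schema order."""
    procs = []
    for line in schema_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        program, spec = line.split()
        procs.extend((program, f"n{i}") for i in expand_nodes(spec))
    return procs


appfile = """\
# 1 master, 2 slaves
master n0
slave n0-1
"""

# Assumption: processes are numbered in the order the schema lists them.
for index, (program, node) in enumerate(expand_schema(appfile)):
    print(index, program, node)
```

For the appfile shown, the sketch yields one master on n0 and one slave on each of n0 and n1, which agrees with the three processes that mpirun -v reports.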
% mpitask
TASK (G/L)    FUNCTION     PEER|ROOT    TAG    COMM     COUNT    DATATYPE
0/0 master    Recv         ANY          ANY    WORLD    1        INT
1 slave       <running>
2 slave       <running>

Process rank 0 is blocked receiving a message consisting of a single integer from any source rank and any message tag, using the MPI_COMM_WORLD communicator. The other processes are running.

% mpimsg
SRC (G/L)    DEST (G/L)    TAG    COMM     COUNT    DATATYPE    MSG
0/0          1/1           7      WORLD    4        INT         n0,#0

Later, we see that a message sent by process rank 0 to process rank 1 is buffered and waiting to be received. It was sent with tag 7 using the MPI_COMM_WORLD communicator and contains 4 integers.

% lamclean -v
killing processes, done
sweeping messages, done
closing files, done
sweeping traces, done
% wipe -v lamhosts
tkill n0 (tbag.osc.edu)...
tkill n1 (alex.osc.edu)...