Setting up an MPI cluster on Ubuntu involves the following steps:
1. Install OpenMPI on all machines.
$ sudo apt-get install libopenmpi-dev openmpi-bin openmpi-doc
2. Install OpenSSH on all machines.
$ sudo apt-get install openssh-server openssh-client
3. Select one machine as the master, say M. The others will be slaves, say S.
4. Now we need to make sure M can log on to all S machines without a password.
Create a user with the same name on all machines, say 'fawad'.
Log in on M as 'fawad' and generate a public/private key pair.
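$ ssh-keygen -t dsa
This command will generate the files id_dsa and id_dsa.pub in the ~/.ssh folder. id_dsa is the private key and id_dsa.pub is the public key. We keep the private key on M and copy the public key to all the slave machines. Note that recent OpenSSH releases disable DSA keys by default. On those systems use '-t rsa' or '-t ed25519' instead, which produce id_rsa/id_rsa.pub or id_ed25519/id_ed25519.pub, and substitute that file name in the steps below.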
5. Use scp to copy id_dsa.pub to the home directory of fawad on all S machines.
6. On the S machines do:
Log in as fawad.
$ mkdir -p ~/.ssh
$ cd ~/.ssh
$ cat ~/id_dsa.pub >> authorized_keys
7. Now the M machine will be able to log on to all S machines without a password.
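If the login still asks for a password, the usual suspects are a missing ~/.ssh directory or permissions that sshd considers too open. The following is a minimal sketch of step 6 as a shell function, assuming the public key was copied to the S machine as in step 5. The name install_pubkey is a hypothetical helper, not part of OpenSSH, and the 700/600 modes are the conventional values rather than something taken from this setup.

```shell
# Sketch: append a public key to authorized_keys with conventional permissions.
# install_pubkey is a hypothetical helper name, not an OpenSSH command.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-"$HOME/.ssh"}   # second argument is optional, defaults to ~/.ssh

    # Create the directory if needed and make it private to the user.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1

    auth_file="$ssh_dir/authorized_keys"
    touch "$auth_file" || return 1

    # Append the key only if that exact line is not already present,
    # so running the step twice does not duplicate the entry.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file" || return 1
    fi
    chmod 600 "$auth_file"
}
```

On an S machine this would be called as 'install_pubkey ~/id_dsa.pub'. Where ssh-copy-id is installed, running 'ssh-copy-id fawad@S' on M (with S replaced by the slave's host name) should achieve the same result in one step.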
8. On the M machine:
Write an MPI program and save it. In my case I used the following code and saved it as hello_nodes.c:
/*
  A simple example program using MPI - Hello World

  The program consists of one receiver process and N-1 sender
  processes. The sender processes send a message consisting
  of their hostname to the receiver. The receiver process prints
  out the values it receives in the messages.

  Compile the program with 'mpicc hello_nodes.c -o hello_nodes'
  To run the program on four processors do 'mpiexec -n 4 ./hello_nodes'
*/

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  const int tag = 42;               /* Message tag */
  int id, ntasks, source_id, err, i;
  MPI_Status status;
  char msg[MPI_MAX_PROCESSOR_NAME]; /* Message array, large enough for any host name */

  err = MPI_Init(&argc, &argv);     /* Initialize MPI */
  if (err != MPI_SUCCESS) {
    printf("MPI_Init failed!\n");
    exit(1);
  }
  err = MPI_Comm_size(MPI_COMM_WORLD, &ntasks); /* Get nr of tasks */
  if (err != MPI_SUCCESS) {
    printf("MPI_Comm_size failed!\n");
    exit(1);
  }
  err = MPI_Comm_rank(MPI_COMM_WORLD, &id);     /* Get id of this process */
  if (err != MPI_SUCCESS) {
    printf("MPI_Comm_rank failed!\n");
    exit(1);
  }

  /* Check that we run on at least two processors */
  if (ntasks < 2) {
    printf("You have to use at least 2 processors to run this program\n");
    MPI_Finalize();                 /* Quit if there is only one processor */
    exit(0);
  }

  /* Process 0 (the receiver) does this */
  if (id == 0) {
    int length;
    MPI_Get_processor_name(msg, &length); /* Get name of this processor */
    printf("Hello World from process %d running on %s\n", id, msg);
    for (i = 1; i < ntasks; i++) {
      err = MPI_Recv(msg, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, MPI_ANY_SOURCE, tag,
                     MPI_COMM_WORLD, &status); /* Receive a message from any sender */
      if (err != MPI_SUCCESS) {
        printf("Error in MPI_Recv!\n");
        exit(1);
      }
      source_id = status.MPI_SOURCE; /* Get id of sender */
      printf("Hello World from process %d running on %s\n", source_id, msg);
    }
  }

  /* Processes 1 to N-1 (the senders) do this */
  else {
    int length;
    MPI_Get_processor_name(msg, &length); /* Get name of this processor */
    err = MPI_Send(msg, length + 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD);
    if (err != MPI_SUCCESS) {
      printf("Process %i: Error in MPI_Send!\n", id);
      exit(1);
    }
  }

  err = MPI_Finalize();             /* Terminate MPI */
  if (err != MPI_SUCCESS) {
    printf("Error in MPI_Finalize!\n");
    exit(1);
  }
  if (id == 0) printf("Ready\n");
  return 0;
}
9. Compile the code.
$ mpicc hello_nodes.c -o hello_nodes
10. Run the code on the same machine.
fawad@fawad-virtual-machine:~$ mpirun -np 4 hello_nodes
Hello World from process 0 running on fawad-virtual-machine
Hello World from process 2 running on fawad-virtual-machine
Hello World from process 1 running on fawad-virtual-machine
Hello World from process 3 running on fawad-virtual-machine
11. Create a file called hostfile, which contains the slave computers in the MPI cluster:
fawad@fawad-virtual-machine:~$ cat hostfile
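As a rough sketch of the format: an Open MPI hostfile lists one machine per line, optionally followed by slots=N to say how many processes may be placed there, and lines starting with # are comments. The function below writes such a file. The two host names are the ones that appear in the run output of step 13; the slots values are assumptions chosen to add up to the 10 processes used there, and write_hostfile is a hypothetical helper name.

```shell
# Sketch: write an Open MPI hostfile, one machine per line.
# Host names come from the sample output in this post; the slots
# counts are assumed values, not the author's actual settings.
write_hostfile() {
    out=${1:-hostfile}   # optional output path, defaults to ./hostfile
    cat > "$out" <<'EOF'
# one host per line; the number after the name limits processes on that host
fawad-virtual-machine slots=5
ubuntu slots=5
EOF
}
```

Calling 'write_hostfile' with no argument creates the file in the current directory, which is where the mpirun command in step 13 looks for it.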
12. Copy the compiled binary hello_nodes to the same path on all slave machines using scp.
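With more than a couple of machines, the copy in step 12 is easier to script. A minimal sketch, assuming the file to distribute is the compiled hello_nodes binary and that the host names can be read from the first field of each hostfile line. copy_to_hosts and the COPY variable are hypothetical names introduced here; COPY defaults to scp and can be set to echo for a dry run.

```shell
# Sketch: copy a file to the same relative path on every host in a hostfile.
# copy_to_hosts and COPY are hypothetical names used for illustration.
copy_to_hosts() {
    file=$1
    hostfile=${2:-hostfile}
    self=$(uname -n)   # name of the machine we are running on

    # Strip comments, then take the first field (the host name) of each line.
    sed 's/#.*//' "$hostfile" | awk 'NF { print $1 }' |
    while read -r host; do
        # Skip the local machine so the file is never copied onto itself.
        [ "$host" = "$self" ] && continue
        # stdin is redirected so the copy command cannot eat the host list.
        ${COPY:-scp} "$file" "$host:$file" < /dev/null ||
            echo "copy to $host failed" >&2
    done
}
```

Running 'COPY=echo copy_to_hosts hello_nodes hostfile' first prints the arguments that would be passed to scp, one line per host, without transferring anything.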
13. Then run the following command to run the job on the machines listed in the hostfile:
fawad@fawad-virtual-machine:~$ mpirun --hostfile hostfile -np 10 hello_nodes
Hello World from process 0 running on ubuntu
Hello World from process 6 running on ubuntu
Hello World from process 4 running on ubuntu
Hello World from process 8 running on ubuntu
Hello World from process 3 running on fawad-virtual-machine
Hello World from process 2 running on ubuntu
Hello World from process 9 running on fawad-virtual-machine
Hello World from process 5 running on fawad-virtual-machine
Hello World from process 7 running on fawad-virtual-machine
Hello World from process 1 running on fawad-virtual-machine
DONE :).
The next step is to set up an NFS share so that we do not have to copy the binary to every machine each time it changes.
