How are MPI_Scatter and MPI_Gather used from C?

This is a common misunderstanding among people new to MPI of how its operations work, particularly the collective operations: they try to call broadcast (MPI_Bcast) from rank 0 alone, expecting the call to somehow "push" the data to the other processors. But that's not really how MPI routines work; most MPI communication requires both the sender and the receiver to make MPI calls.

In particular, MPI_Scatter() and MPI_Gather() (and MPI_Bcast, and many others) are collective operations; they have to be called by all of the tasks in the communicator. All processors in the communicator make the same call, and the operation is performed. (That's why scatter and gather both take a "root" process as one of the parameters, where all the data comes from / goes to.) Doing it this way gives the MPI implementation a lot of scope to optimize the communication patterns.

So here's a simple example (Updated to include gather):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int size, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int *globaldata = NULL;
    int localdata;

    /* Only the root allocates and fills the global array. */
    if (rank == 0) {
        globaldata = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            globaldata[i] = 2*i + 1;

        printf("Processor %d has data: ", rank);
        for (int i = 0; i < size; i++)
            printf("%d ", globaldata[i]);
        printf("\n");
    }

    /* Every rank calls MPI_Scatter; each receives one int from the root. */
    MPI_Scatter(globaldata, 1, MPI_INT, &localdata, 1, MPI_INT,
                0, MPI_COMM_WORLD);

    printf("Processor %d has data %d\n", rank, localdata);
    localdata *= 2;
    printf("Processor %d doubling the data, now has %d\n", rank, localdata);

    /* Every rank calls MPI_Gather; the root collects one int from each. */
    MPI_Gather(&localdata, 1, MPI_INT, globaldata, 1, MPI_INT,
               0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Processor %d has data: ", rank);
        for (int i = 0; i < size; i++)
            printf("%d ", globaldata[i]);
        printf("\n");

        free(globaldata);
    }

    MPI_Finalize();
    return 0;
}

Running it gives:

gpc-f103n084-$ mpicc -o scatter-gather scatter-gather.c -std=c99
gpc-f103n084-$ mpirun -np 4 ./scatter-gather
Processor 0 has data: 1 3 5 7 
Processor 0 has data 1
Processor 0 doubling the data, now has 2
Processor 3 has data 7
Processor 3 doubling the data, now has 14
Processor 2 has data 5
Processor 2 doubling the data, now has 10
Processor 1 has data 3
Processor 1 doubling the data, now has 6
Processor 0 has data: 2 6 10 14