- Last time
- Wrapped up “Critical thinking” segment. Went through a case study, saw more than 100X speed up
- Removed function-call optimization blockers; applied loop unrolling & re-association; added wide-register vectorization
- Started discussion about parallel computing via message passing (multi-process parallel computing)
- Covered the hardware aspects related to HPC
- Today
- HPC via MPI: discuss the basic ideas/paradigms
- MPI point-to-point communication
- CUDA: A kernel (a small snippet of code) is run by all threads spawned via an execution configuration
- OpenMP: All threads execute an omp parallel region, work sharing
- MPI: The entire code is executed in parallel by all processes
- An MPI program branches based on the process rank (see the sketch below)
- Very similar to GPU computing, where one thread does work based on its thread index
- Very similar to OpenMP function omp_get_thread_num()
- Each MPI process has its own program counter and virtual address space
- The same variable names exist in every process, but they live in different virtual memories and can hold different values
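- A minimal sketch of this rank-based branching (the printed messages are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    /* Every process executes this same code; behavior branches on the rank,
       much like a CUDA thread branching on its thread index */
    if (rank == 0) {
        printf("Rank 0 of %d: doing the coordinator's work\n", size);
    } else {
        printf("Rank %d of %d: doing a worker's share\n", rank, size);
    }

    MPI_Finalize();
    return 0;
}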
- MPI can be used whenever it is possible for processes to exchange messages:
- Distributed memory systems
- Network of workstations
- One workstation with many cores
- Data is passed through the main memory instead of a network
- Different ranks share the same physical memory, but they are each tied to separate virtual memory spaces
- P2P: Simplest form of message passing communication
- One process sends a message to another process (MPI_Send, MPI_Recv)
int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
- buf: starting point of the message with count elements, each described with datatype
- dest: rank of the destination process within the comm communicator
- tag: used to distinguish between different messages
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
- Envelope information is returned in an MPI_Status object
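- A minimal send/receive sketch between rank 0 and rank 1 (message size, values, and tag are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 4, tag = 42;              /* illustrative values */
    double msg[4] = {1.0, 2.0, 3.0, 4.0};

    if (rank == 0) {
        /* send count doubles to rank 1 within MPI_COMM_WORLD */
        MPI_Send(msg, count, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        /* the receive buffer must be large enough for the incoming message */
        MPI_Recv(msg, count, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &status);
        printf("Rank 1 received a message from rank %d\n", status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}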
- A custom communicator can be created using
MPI_Comm_create(MPI_COMM_WORLD, new_group, &MY_COMM_WORLD);
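- A possible way to set up new_group for the call above, e.g. a communicator holding only the even ranks (the helper name and the even-rank choice are illustrative):

#include <mpi.h>

/* Sketch: build MY_COMM_WORLD out of the even ranks of MPI_COMM_WORLD.
   Assumes MPI_Init has already been called; ranks not in the group
   receive MPI_COMM_NULL from MPI_Comm_create. */
void make_even_comm(MPI_Comm *MY_COMM_WORLD) {
    MPI_Group world_group, new_group;
    int size;

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);  /* group of all ranks */

    int n = (size + 1) / 2;
    int ranks[n];                                  /* even ranks: 0, 2, 4, ... */
    for (int i = 0; i < n; i++) ranks[i] = 2 * i;

    MPI_Group_incl(world_group, n, ranks, &new_group);
    MPI_Comm_create(MPI_COMM_WORLD, new_group, MY_COMM_WORLD);

    MPI_Group_free(&new_group);
    MPI_Group_free(&world_group);
}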
- MPI data types and their C counterparts (e.g. MPI_INT ↔ int, MPI_FLOAT ↔ float, MPI_DOUBLE ↔ double, MPI_CHAR ↔ char): see table below
- The order of messages between a given sender and receiver is preserved, i.e. messages do not overtake each other
- Receiver can use wildcards to receive from any source/tag: MPI_ANY_SOURCE / MPI_ANY_TAG
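- A sketch of a wildcard receive; the envelope is inspected afterwards through the MPI_Status fields (buffer size and data type are illustrative):

#include <mpi.h>
#include <stdio.h>

/* Assumes MPI is initialized and at most 100 ints are expected from an
   unknown sender with an unknown tag */
void receive_from_anyone(void) {
    int buf[100];
    MPI_Status status;

    MPI_Recv(buf, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);

    int count;
    MPI_Get_count(&status, MPI_INT, &count);   /* how many ints actually arrived */
    printf("Got %d ints from rank %d with tag %d\n",
           count, status.MPI_SOURCE, status.MPI_TAG);
}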
- For a communication to succeed:
- Sender must specify a valid destination rank
- Receiver must specify a valid source rank
- The communicator must be the same
- Tags must match
- Message data types must match
- Receiver's buffer must be large enough
- MPI_Send and MPI_Recv are blocking: MPI_Send does not return until the send buffer can safely be reused, and MPI_Recv does not return until the message has arrived in the receive buffer
- Eager mode vs. Rendezvous mode
- Eager mode: For small messages, the content of the buffer is picked up right away by the MPI runtime, so the send can return before a matching receive is posted
- Rendezvous mode: For large messages, the send waits for the receiver to post a matching receive before the runtime transfers the actual message data
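- The mode matters in practice: in the sketch below both ranks call MPI_Send first, which may happen to complete for small messages (the runtime buffers them eagerly) but deadlocks for large ones, since in rendezvous mode each send waits for a receive that is never posted (the message size is illustrative):

#include <mpi.h>

#define N (1 << 20)              /* illustrative message size (in doubles) */

/* Run with exactly 2 ranks. Swapping the send/recv order on one rank,
   or using MPI_Sendrecv, removes the deadlock. */
int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    static double sendbuf[N], recvbuf[N];
    int other = 1 - rank;        /* the other of the two ranks */

    MPI_Send(sendbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}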