Releases: pmodels/mpich

v4.3.0rc3

14 Jan 21:27
4f08552
v4.3.0rc3 Pre-release

Changes in 4.3

  • Support the MPI memory allocation kinds side document.

  • Support the MPI ABI proposal. Configure with --enable-mpi-abi and build with
    mpicc_abi. By default, mpicc still builds and links with the MPICH ABI.

  • Experimental API MPIX_Op_create_x. It supports a user callback function with
    an extra_state context and an op destructor callback. This lets language
    bindings use a proxy function for language-specific user callbacks.

  • Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. These allow
    user error handlers to have an extra_state context and a corresponding
    destructor, which lets language bindings implement user error handlers via a
    proxy.

  • Experimental API MPIX_Request_is_complete. This is a pure request state query
    function that neither invokes progress nor frees the request. It should help
    applications that want to separate task dependency checking from the progress
    engine to avoid progress contention, especially in multi-threaded contexts.
    It is also useful for tools that profile non-deterministic calls such as
    MPI_Test.

  • Experimental API MPIX_Async_start. This function lets applications inject
    progress hooks into MPI progress. It allows an application to implement
    custom asynchronous operations that are progressed by MPI, without
    implementing a separate progress mechanism that may either take additional
    resources or contend with MPI progress and hurt performance. It also allows
    applications to create custom MPI operations, such as MPI nonblocking
    collectives, and achieve near-native performance.

  • Added benchmark tests test/mpi/bench/p2p_{latency,bw}.

  • Added CMA support in CH4 IPC.

  • Added IPC read algorithm for intranode Allgather and Allgatherv.

  • Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy
    for inter-NUMA shm communication.

  • Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.

  • ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work.
    MPI_Open_port will fail because the ucx port name exceeds the current
    MPI_MAX_PORT_NAME of 256. As a workaround, pass the info hint
    "port_name_size" and use a larger port name buffer.

  • PMI-1 defines PMI_MAX_PORT_NAME, which may differ from MPI_MAX_PORT_NAME and
    is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts an info
    hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the port
    name does not fit in "port_name_size", it returns a truncation error.

  • Autogen now defaults to -yaksa-depth=2.

  • MPIR_CVAR_CH4_ROOTS_ONLY_PMI now defaults to on.

  • Added ch4 netmod API am_tag_send and am_tag_recv.

  • Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.

  • The make check target now runs ROMIO tests.

  • Add back handle conversion macros (f2c/c2f) to preserve ABI
    compatibility with older MPICH libraries

  • Fix compilation issue with g++ in -std=gnu++20 mode

  • Fix bug in MPI_ANY_SOURCE handling observed using the libfabric CXI
    provider

  • Add NIC information to error messages in ch4:ofi netmod
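
The ABI and CVAR items above translate into a build step and a few environment variables. The following is a minimal sketch, not a recommended setup: the install prefix, source file name, process count, and CVAR values are illustrative assumptions, and the timeout is assumed to be given in seconds. It relies on MPICH reading a CVAR from an environment variable of the same name.

```shell
# Hypothetical install prefix; adjust to your system.
PREFIX="$HOME/mpich-4.3"

# Build MPICH with MPI ABI support enabled (flag named in the notes above).
./configure --prefix="$PREFIX" --enable-mpi-abi
make && make install

# mpicc_abi builds against the MPI ABI; plain mpicc still uses the MPICH ABI.
# app.c is a placeholder for your own source file.
"$PREFIX/bin/mpicc_abi" -o app_abi app.c
"$PREFIX/bin/mpicc" -o app app.c

# CVARs are set as environment variables of the same name.
# Values below are assumptions for illustration only.
export MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE=1   # non-temporal memcpy across NUMA domains
export MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT=60     # assumed unit: seconds

"$PREFIX/bin/mpiexec" -n 4 ./app
```

Unsetting the variables restores the defaults, so the settings can be limited to a single debugging run.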

v4.3.0rc2

02 Jan 20:40
v4.3.0rc2 Pre-release

Changes in 4.3

  • Same as the list under v4.3.0rc3, without its last two items (the
    MPI_ANY_SOURCE fix for the libfabric CXI provider and the NIC information
    in ch4:ofi error messages).

v4.3.0rc1

17 Dec 18:24
f763d57
v4.3.0rc1 Pre-release

Changes in 4.3

  • Same as the list under v4.3.0rc3, up to and including the item on the make
    check target running ROMIO tests. The four items after it are not listed
    for this release candidate.

MPICH 4.3.0b1

15 Nov 21:00
d1b04e8
MPICH 4.3.0b1 Pre-release

Changes in 4.3

  • Same as the list under v4.3.0rc3, up to and including the item on the make
    check target running ROMIO tests. The four items after it are not listed
    for this beta.

v4.2.3

11 Oct 14:26
09ca854

Changes in 4.2.3

  • Update embedded libfabric to fix a build issue on FreeBSD

  • Fix HIP support for use with AMD GPUs

  • Fix potential invalid context issue in CUDA memory hooks

  • Fix GPU fallback path in ch4/ofi for Intel GPU buffers

  • Fix IPC handle destruction with Level Zero API (Intel GPU)

  • Fix potential crash in MPI_ISENDRECV with derived datatypes

  • Fix bug in persistent MPI_GATHER that incorrectly cached buffer
    contents at init time

  • Fix memory allocation bug in ROMIO GPFS driver

  • Fix missing error names in ch4/ofi netmod

  • Fix potential hang in multi-VCI active message RMA

  • Fix bug in ch3 large count support with derived datatypes

  • Fix manpage generation to provide aliases for large-count versions

  • Fix potential crash in Hydra with long PMI command messages

  • Fix bug in exit status capture in Hydra when there are multiple
    processes with non-zero exit

  • Fix implementation of C/Fortran status conversion functions

  • Fix implementation of MPI_Type_create_f90_xxx functions

v4.2.3rc1

25 Sep 17:38
ead3768
v4.2.3rc1 Pre-release

Changes in 4.2.3

  • Same as the list under v4.2.3.

v4.2.2

03 Jul 14:59
7a413b5

Changes in 4.2.2

  • Update embedded libfabric to v1.20.1 and fix compilation with GCC 14.

  • Fix dataloop support for MPIX_Type_iov routines

  • Fix crash in Hydra when system has many local ip addresses

  • Fix RMA fallback check in ch4:ofi netmod

  • Fix MPI_UNDEFINED handling in mpi_f08 module

  • Fix Slurm environment variable inheritance in Hydra

  • Fix multiple issues with large count support in ROMIO

  • Fix potential hang in init using PMIx client and nonstandard keys

  • Fix crash if PMIx client cannot get appnum from server during init

  • Fix other build errors and warnings

For a full list of commits see: v4.2.1...v4.2.2

v4.2.1

17 Apr 21:21
442c4af

Changes in 4.2.1

  • Disable flat_namespace to build libmpifort on macOS by default

  • Prefix warning messages with "MPICH"

  • Add --disable-doc configure option

  • Fix support for building MPICH Fortran support with Xcode 15 on macOS

  • Fix bug in MPI_WIN_SHARED_QUERY when window is not shared

  • Fix bug in ch4/ofi gpu pipelining

  • Fixes for Intel GPU support

  • Fix memory leak in ch4/shm collectives

  • Fix bug in MPI_COMM_SPLIT with intercommunicators and non-zero root

  • Fix bug in DAOS ROMIO driver

  • Fix bug in cycling error code array

  • Return an error if there is failure to create a datatype in mpi_f08
    module for noncontiguous user data

  • Return an error when shared memory segment creation fails

For a full list of commits see: v4.2.0...v4.2.1

v4.1.3

27 Feb 21:57
0481cb2

Changes in 4.1.3

  • Ignore errors when shutting down ch4/ofi netmod to avoid crashing in
    MPI_FINALIZE. Debug builds will now warn about unclean shutdown.

  • Add missing PMPI_Status_{f082f,f2f08} in mpi_f08 module

  • Fix names for large count subroutines in mpi_f08 module

  • Fix return argument name to be ierror in Fortran mpi module

  • Fix bug in persistent synchronous send that could result in completion
    before matching

  • Fix integer overflow bugs in ROMIO GPFS driver

  • Fix bug in ch4/ucx netmod when partial data is sent to a noncontig
    recv

  • Fix bug in MPI_REDUCE with MPI_IN_PLACE in ch4/shm collectives

  • Fix status object returned by MPI_IRECV with MPI_PROC_NULL

  • Fix memory leak in release_gather collectives

  • Fix integer overflow bugs in MPI_(I)ALLGATHERV

  • Return MPI_ERR_TYPE if MPI_TYPE_GET_ENVELOPE is used with a large
    count datatype

  • Return an error if no valid libfabric provider is found by ch4/ofi
    netmod

  • Return an error if no executable is given to mpiexec (Hydra)

  • Return an error when cudaMalloc fails in the GPU tests

  • Return an error if MPI_ACCUMULATE datatypes are mismatched

  • Return an error when shared memory segment creation fails

  • Return an error if there is failure to create a datatype in mpi_f08
    module for noncontiguous user data

v4.2.0

09 Feb 19:31
ebc00ab

Changes in 4.2.0

  • Complete support for the MPI 4.1 specification

  • Experimental thread communicator feature (e.g. MPIX_Threadcomm_init).
    See paper "Frustrated With MPI+Threads? Try MPIxThreads!",
    https://doi.org/10.1145/3615318.3615320.

  • Experimental datatype functions MPIX_Type_iov_len and MPIX_Type_iov

  • Experimental op MPIX_EQUAL for MPI_Reduce and MPI_Allreduce (intra
    communicator only)

  • Use --with-{pmi,pmi2,pmix}=[path] to configure an external PMI library.
    Convenience options for Slurm and Cray are deprecated. Use --with-pmi=oldcray
    for older Cray environments.

  • Error checking default changed to runtime (used to be all).

  • Use the error handler bound to MPI_COMM_SELF as the default error handler.

  • Use ierror instead of ierr in the "use mpi" Fortran interface. This affects
    user code that passes the argument by explicit keyword, e.g. call
    MPI_Init(ierr=arg). "ierror" is the name specified in the MPI specification.
    The subroutine interfaces in "mpi.mod" have only existed since 4.1.

  • Handle conversion functions, such as MPI_Comm_c2f, MPI_Comm_f2c, etc., are
    no longer macros. MPI-4.1 requires these to be actual functions.

  • Yaksa updated to auto detect the GPU architecture and only build for
    the detected arch. This applies to CUDA and HIP support.

  • MPI_Win_shared_query can be used on windows created by MPI_Win_create and
    MPI_Win_allocate, in addition to windows created by MPI_Win_allocate_shared.
    MPI_Win_allocate will create shared memory whenever feasible, including between
    spawned processes on the same node.

  • Fortran mpi.mod supports Type(c_ptr) buffer output for MPI_Alloc_mem,
    MPI_Win_allocate, and MPI_Win_allocate_shared.

  • New functions added in MPI-4.1: MPI_Remove_error_string, MPI_Remove_error_code,
    and MPI_Remove_error_class

  • New functions added in MPI-4.1: MPI_Request_get_status_all,
    MPI_Request_get_status_any, and MPI_Request_get_status_some.

  • New function added in MPI-4.1: MPI_Type_get_value_index.

  • New functions added in MPI-4.1: MPI_Comm_attach_buffer, MPI_Session_attach_buffer,
    MPI_Comm_detach_buffer, MPI_Session_detach_buffer,
    MPI_Buffer_flush, MPI_Comm_flush_buffer, MPI_Session_flush_buffer,
    MPI_Buffer_iflush, MPI_Comm_iflush_buffer, and MPI_Session_iflush_buffer.
    Also added constant MPI_BUFFER_AUTOMATIC to allow automatic buffers.

  • Support for "mpi_memory_alloc_kinds" info key. Memory allocation kind
    requests can be made via argument to mpiexec, or as info during
    session creation. Kinds supported are "mpi" (with standard defined
    restrictors) and "system". Queries for supported kinds can be made on
    MPI objects such as sessions, comms, windows, or files. MPI 4.1 states
    that supported kinds can also be found in MPI_INFO_ENV, but it was
    decided at the October 2023 meeting that this was a mistake and will
    be removed in an erratum.

  • Fix potential crash in GPU memory hooks
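
The external PMI options in the list above are given at configure time. The following is a rough sketch of the two forms the notes mention. The PMIx install path is a hypothetical placeholder, and any other configure options your site needs are omitted.

```shell
# Configure against an external PMIx installation.
# /opt/pmix is a hypothetical path; use the location of your own install.
./configure --with-pmix=/opt/pmix

# For older Cray environments, the notes name this form instead.
./configure --with-pmi=oldcray
```

The same pattern applies to --with-pmi and --with-pmi2, each taking the path to the corresponding library.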