# Parallel HDF5

It is possible to read and write [parallel HDF5](https://portal.hdfgroup.org/display/HDF5/Parallel+HDF5) files using MPI.
For this, the HDF5 binaries loaded by HDF5.jl must have been compiled with parallel support, and linked to the specific MPI implementation that will be used for parallel I/O.

Parallel-enabled HDF5 libraries are usually included in computing clusters and linked to the available MPI implementations.
They are also available via the package manager of a number of Linux distributions.

Finally, note that the MPI.jl package is lazy-loaded by HDF5.jl using [Requires](https://github.com/JuliaPackaging/Requires.jl).
In practice, this means that in Julia code, `MPI` must be imported _before_ `HDF5` for parallel functionality to be available.

## Setting up Parallel HDF5

The following step-by-step guide assumes one already has access to parallel-enabled HDF5 libraries linked to an existing MPI installation.

### [1. Using system-provided MPI libraries](@id using_system_MPI)

Using a system-provided MPI library can be done with MPIPreferences.jl.
After installing MPIPreferences.jl and running
`julia --project -e 'using MPIPreferences; MPIPreferences.use_system_binary()'`,
MPIPreferences.jl identifies the available MPI implementation and stores the information in a `LocalPreferences.toml` file.
See the [MPI.jl docs](https://juliaparallel.org/MPI.jl/stable/configuration/#Using-a-system-provided-MPI-backend) for details.

### [2. Using parallel HDF5 libraries](@id using_parallel_HDF5)

!!! note "Migration from HDF5.jl v0.16 and earlier"
    How to use a system-provided HDF5 library changed in HDF5.jl v0.17.
    Previously, the library path was set via the environment variable `JULIA_HDF5_PATH`, which required rebuilding HDF5.jl afterwards.
    The environment variable has been removed and no longer has an effect in HDF5.jl v0.17 and later; for backward compatibility with older HDF5.jl versions it may still be useful to **also** set it.
    Instead, proceed as described below.

As detailed in [Using custom or system provided HDF5 binaries](@ref), set the preferences `libhdf5` and `libhdf5_hl` to the full paths of the parallel HDF5 libraries.
This can be done by:

```julia
julia> using HDF5

julia> HDF5.API.set_libraries!("/path/to/your/libhdf5.so", "/path/to/your/libhdf5_hl.so")
```

### 3. Loading MPI-enabled HDF5

In Julia code, MPI.jl must be loaded _before_ HDF5.jl for MPI functionality to be available:

```julia
using MPI
using HDF5

@assert HDF5.has_parallel()
```

### Notes to HPC cluster administrators

More information on setting this up on an HPC cluster can be found in the [docs of MPI.jl](https://juliaparallel.org/MPI.jl/stable/configuration/#Notes-to-HPC-cluster-administrators).
After performing steps [1.](@ref using_system_MPI) and [2.](@ref using_parallel_HDF5), the `LocalPreferences.toml` file could look something like the following:

```toml
[MPIPreferences]
_format = "1.0"
abi = "OpenMPI"
binary = "system"
libmpi = "/software/mpi/lib/libmpi.so"
mpiexec = "/software/mpi/bin/mpiexec"

[HDF5]
libhdf5 = "/path/to/your/libhdf5.so"
libhdf5_hl = "/path/to/your/libhdf5_hl.so"
```

## Reading and writing data in parallel

A parallel HDF5 file may be opened by passing an `MPI.Comm` (and optionally an `MPI.Info`) object to [`h5open`](@ref).
For instance:

```julia
comm = MPI.COMM_WORLD
info = MPI.Info()

ff = h5open(filename, "w", comm, info)
```

MPI-distributed data is typically written by first creating a dataset describing the global dimensions of the data.
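As a point of orientation, here is a minimal sketch of the overall structure of such a program; the file name `parallel_example.h5` and the launch command are illustrative, and the dataset creation and writes shown in the next example go where indicated. The script would typically be started with an MPI launcher, e.g. `mpiexec -n 4 julia script.jl`.

```julia
using MPI
using HDF5

MPI.Init()

comm = MPI.COMM_WORLD
info = MPI.Info()

# Open (create) the file collectively on all processes
ff = h5open("parallel_example.h5", "w", comm, info)

# ... create datasets and write data here, as shown in the example below ...

close(ff)  # closing the file is also a collective operation

MPI.Finalize()
```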
The following example writes a `10 × Nproc` array distributed over `Nproc` MPI processes.

```julia
Nproc = MPI.Comm_size(comm)
myrank = MPI.Comm_rank(comm)

M = 10
A = fill(myrank, M)  # local data
dims = (M, Nproc)    # dimensions of global data

# Create dataset
dset = create_dataset(ff, "/data", datatype(eltype(A)), dataspace(dims))

# Write local data
dset[:, myrank + 1] = A
```

Note that metadata operations, such as `create_dataset`, must be called _collectively_ (on all processes at the same time, with the same arguments), but the actual writing to the dataset may be done independently.
See [Collective Calling Requirements in Parallel HDF5 Applications](https://portal.hdfgroup.org/display/HDF5/Collective+Calling+Requirements+in+Parallel+HDF5+Applications) for the exact requirements.

Sometimes, it may be more efficient to write data in chunks, so that each process writes to a separate chunk of the file.
This is especially the case when data is uniformly distributed among MPI processes.
In this example, this can be achieved by passing `chunk=(M, 1)` to `create_dataset`.

For better performance, it is sometimes preferable to perform [collective I/O](https://portal.hdfgroup.org/display/HDF5/Introduction+to+Parallel+HDF5) when reading and writing datasets in parallel.
This is achieved by passing `dxpl_mpio=:collective` to `create_dataset`.
See also the [HDF5 docs](https://portal.hdfgroup.org/display/HDF5/H5P_SET_DXPL_MPIO).

A few more examples are available in [`test/mpio.jl`](https://github.com/JuliaIO/HDF5.jl/blob/master/test/mpio.jl).
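Combining the chunking and collective-I/O options described above, a minimal sketch might look as follows; it reuses `ff`, `A`, `dims`, `M`, and `myrank` from the write example earlier, and the dataset name `/data_chunked` is purely illustrative:

```julia
# Each process owns one M×1 column, so use that as the chunk size and
# request collective MPI-IO transfers for the data.
dset = create_dataset(ff, "/data_chunked", datatype(eltype(A)), dataspace(dims);
                      chunk=(M, 1), dxpl_mpio=:collective)

# As before, every process writes its own column of the global array
dset[:, myrank + 1] = A
```

Whether chunking and collective transfers actually improve performance depends on the file system and the MPI-IO implementation, so it is worth benchmarking both variants for your workload.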