Tips for Using IRATE and writing I/O modules

An extremely useful tool for parallel simulations is HDF5’s external links. These allow the contents of an HDF5 file to be scattered over many physical files while appearing to anyone accessing it as though it were all one file. A typical use case is a simulation where multiple nodes of a cluster process a single snapshot simultaneously. For the sake of example, imagine a simulation where a single snapshot is processed by two separate nodes that each create HDF5 files “node1.h5” and “node2.h5”, each containing dark matter particle datasets that follow the format standard specified in Particle Data. An IRATE file might be generated to house these datasets like so:

import h5py

#assume simulation.h5 already has the Cosmology and SimulationProperties groups

#open for read/write, creating the file if it does not exist
f = h5py.File('simulation.h5', 'a')

snap1 = f.create_group('Snapshot0001')
snap1data = snap1.create_group('ParticleData')

#link the root group of each node's file into the ParticleData group
snap1data['Dark_node1'] = h5py.ExternalLink('node1.h5', '/')
snap1data['Dark_node2'] = h5py.ExternalLink('node2.h5', '/')

f.close()

The file ‘simulation.h5’ can now be loaded and manipulated just like any other IRATE file. Because the links store relative paths, as long as “node1.h5” and “node2.h5” are kept in the same directory as “simulation.h5”, everything will work fine.
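To illustrate the transparency of external links, the following self-contained sketch builds a small worker file and a master file and then reads the linked data back through the master file. The dataset name ``Position`` and the array contents are illustrative, not part of the IRATE standard:

```python
import numpy as np
import h5py

# Create a worker file holding an example dataset (name is illustrative).
with h5py.File('node1.h5', 'w') as node:
    node.create_dataset('Position', data=np.zeros((100, 3)))

# Build the master file and link the worker file's root group into it.
with h5py.File('simulation.h5', 'w') as f:
    snapdata = f.create_group('Snapshot0001/ParticleData')
    snapdata['Dark_node1'] = h5py.ExternalLink('node1.h5', '/')

# Reading through the link looks identical to reading a local dataset.
with h5py.File('simulation.h5', 'r') as f:
    pos = f['Snapshot0001/ParticleData/Dark_node1/Position']
    print(pos.shape)  # prints (100, 3)
```

Note that if “node1.h5” is missing or moved out of the master file’s directory, the link itself still exists but attempting to access it raises an error, so the master file remains openable even when worker files are absent.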