status: Accepted in H5MD 1.1
Define a consistent scheme to store lists of particles and particle tuples (pairs, triples, etc.) in an H5MD file. This is of importance to define arbitrary groups of particles or connectivity among particles.
http://article.gmane.org/gmane.science.simulation.h5md.user/604
The group I work with does a fair amount of analysis on bond lengths, angle
distributions and dihedral angle distributions as validation for force field
development. So our main use case for a 'topology' module would be to store
the pairs, 3- and 4-tuples of atoms that define the bonds, angles and
dihedrals.
That said, I have use cases for wanting to store lists of non-bonded pairs
of atoms such as opposing carbons on a ring or designated 'endpoint' atoms
that can be used to represent the overall orientation of a larger molecule,
and in this use case a specific 'bond' list would not really be appropriate.
(...)
I would definitely agree that a standard storage scheme, would be useful,
even if the structure of a 'topology' section varies by module. That way,
even if the location of the lists varies, at least the same routines can be
used to read/write the data when it is located. As a possible example:
<tuple-list>
+-- type: String
+-- dimension: Integer[]
\-- values: Integer[n-tuples][D]
Where a pair list would be dimension(D) = 2, and the values in list would be
the particle IDs. Given what Konrad pointed out about flexibility
vs. specificity ,the "type" string could have specific list of acceptable
values, much like the boundary attribute in the 'box' item. For my
particular use cases, I can see the following types of atom-tuple lists
being useful as topology/connectivity information: bonds, angles, dihedrals,
chains, and 'other'.
http://article.gmane.org/gmane.science.simulation.h5md.user/637
A specific problem of 2) is that for non-trivial forcefields (proteins
etc.), a simple bond list is not of much use. What you want is *all*
forcefield terms. I can't think of any practical use for just the bonds.
A list of particles is an H5MD element:
<list_name>: Integer[N]
+-- particles_group: String[]
where N
is the length of the list and particles_group
is the
name of an existing group within /particles
.
Note: Alternatives to store particles_group
particles_group
point to the respective group in
/particles
using a hard/soft link, which ensures that this group
exists.If the corresponding particles_group
does not possess the id
element, the values in list_name
correspond to the indexing of the
elements in particles_group
. Else, the values in list_name
must
be put in correspondence with the equal values in the id
element.
If a fill value (see § 6.6 in (“HDF5 User’s guide,” n.d.)) is defined, entries whose value is set to this value are ignored.
A list of tuples is an H5MD element:
<tuples_list_name>: Integer[N,T]
+-- particles_group: String[]
where N
is the length of the list, T
is the size of the tuples
and particles_group
is the name an existing group within
/particles
. Both N
and T
may indicate variable dimensions.
The interpretation of the values stored within the tuples is done as for a list of particles.
If a fill value (see § 6.6 in (“HDF5 User’s guide,” n.d.)) is defined, tuples with all entries set to this value are ignored. If only certain entries within a tuple are set to the fill value, these entries are ignored decreasing the size of this tuple.
**Note:**
The scheme should also allow for the storage of neighbour lists, i.e., the
size of the tuples is not fixed.
As the lists of particles and tuples above are H5MD elements, they can be stored either as time-dependent groups or time-independent datasets.
As an example, a time-dependent list of pairs is stored as:
<pair_list_name>
+-- particles_group: String[]
\-- value: Integer[variable,N,2]
\-- step: Integer[variable]
\-- time: Float[variable]
The dimension denoted by N
may be variable.
“HDF5 User’s guide.” n.d. http://www.hdfgroup.org/HDF5/doc/UG.