Memory use / redundant rereads with large 3DS mesh?


(Erik Keever) #1

Hello,

I am visualizing a large (1536x1536x384) structured mesh, a 60° wedge in cylindrical coordinates, and what I see in the ParaView Memory Inspector suggests that something is “wrong” with the amount of memory being used.

The mesh coordinates are stored in one monolithic HDF5 file (about 10GB of float32s), and the data frames are stored one frame per file in additional files (3D_XYZ_###.h5, each holding 10 float64 fields and totalling about 72GB per file).
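For reference, the sizes quoted above can be checked with a little arithmetic. This is only a back-of-envelope sketch: the dimensions and field count are taken from this post, and it assumes the geometry is stored as float32 XYZ triples and each field as a single float64 scalar per node.

```python
# Back-of-envelope sizes for the mesh described in this post.
DIMS = (384, 1536, 1536)   # from the Topology Dimensions attribute
N_FIELDS = 10              # fields per frame file, as stated in the post


def n_nodes(dims):
    """Number of nodes in a structured mesh with the given dimensions."""
    n = 1
    for d in dims:
        n *= d
    return n


nodes = n_nodes(DIMS)
geometry_bytes = nodes * 3 * 4        # XYZ triples of float32
field_bytes = nodes * 8               # one float64 scalar field
frame_bytes = field_bytes * N_FIELDS  # one 3D_XYZ_###.h5 file

for label, nbytes in [
    ("geometry", geometry_bytes),
    ("one field", field_bytes),
    ("one frame file", frame_bytes),
]:
    print(f"{label:>15}: {nbytes / 1e9:6.2f} GB ({nbytes / 2**30:6.2f} GiB)")
```

The node count matches the 905969664 rows of the geometry dataset, so the "10GB" and "72GB" figures are consistent with the declared dimensions.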

The output, as seen in ParaView, can be colored by vtkProcessID and clearly shows that it is distributed among however many MPI ranks the pvserver job has (good). However, total memory consumption scales as 10GB x (# of ranks), which suggests that every rank is loading the entire mesh, which is… problematic.

  • In the PV timer log, with only one data field loaded, every updateDisplay shows the pvservers reading TWO massive data arrays, which suggests that they are re-reading the entire geometry file. How can I specify that the mesh is static and does NOT need to be re-read?
  • How can I avoid every rank [apparently] reading entire, un-partitioned fields into memory? Does chunking in the HDF5 files matter?
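To make the second question concrete, here is a rough sketch of what I would expect a partitioned read to look like. It is not how the ParaView Xdmf reader works internally (that is exactly what I am asking about); `slab_for_rank` and the rank count of 8 are hypothetical, and the sketch ignores the extra layer of shared boundary points that neighboring pieces of a structured grid would need.

```python
# Illustration only: split the slowest axis of the mesh evenly across ranks
# and estimate how much geometry each rank would hold if reads were partitioned.
DIMS = (384, 1536, 1536)       # from the Topology Dimensions attribute
BYTES_PER_NODE_GEOMETRY = 3 * 4  # XYZ triple of float32


def slab_for_rank(dims, rank, nranks):
    """Return (start, count) of the hyperslab along axis 0 owned by `rank`.

    Remainder planes are handed to the lowest-numbered ranks so that the
    slabs are contiguous and cover the whole axis exactly once.
    """
    base, extra = divmod(dims[0], nranks)
    count0 = base + (1 if rank < extra else 0)
    start0 = rank * base + min(rank, extra)
    start = (start0,) + (0,) * (len(dims) - 1)
    count = (count0,) + tuple(dims[1:])
    return start, count


def slab_geometry_bytes(count):
    """Bytes of geometry for a slab with the given per-axis node counts."""
    n = 1
    for c in count:
        n *= c
    return n * BYTES_PER_NODE_GEOMETRY


nranks = 8  # hypothetical pvserver size
for rank in range(nranks):
    start, count = slab_for_rank(DIMS, rank, nranks)
    gb = slab_geometry_bytes(count) / 1e9
    print(f"rank {rank}: start={start} count={count} geometry ~{gb:.2f} GB")
```

With a decomposition like this, total geometry memory would stay near 10GB regardless of the number of ranks, rather than growing as 10GB x (# of ranks) as I observe.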

Relevant bits of the XDMF metadata file:

<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf xmlns:xi="http://www.w3.org/2003/XInclude" Version="2.0">
  <Domain>
    <Grid Name="TimeSeries" GridType="Collection" CollectionType="Temporal">
      <!-- Frame 1 - special - contains actual definition of geometry -->
      <Grid Name="thedomain" GridType="Uniform">
        <Topology name="thetopo" TopologyType="3DSMesh" Dimensions="384 1536 1536"/>
        <Geometry name="thegeo" GeometryType="XYZ">
          <DataItem Dimensions="905969664 3" NumberType="Float" Precision="4" Format="HDF">100urun_geometry1.h5:/geometry_mesh</DataItem>
        </Geometry>
        <Time Value="0.000000" />
        <Attribute Name="mass" Active="1" AttributeType="Scalar" Center="Node">
          <DataItem Dimensions="384 1536 1536" NumberType="Float" Precision="4" Format="HDF">3D_XYZ_00000.h5:/fluid1/mass</DataItem>
        </Attribute>
<!-- 9 more data fields in attribute tags -->
      </Grid>
      <!-- Frame 2 of output -->
      <Grid GridType="Uniform">
        <Topology Reference="/Xdmf/Domain/Grid/Grid/Topology[@name='thetopo']" />
        <Geometry Reference="/Xdmf/Domain/Grid/Grid/Geometry[@name='thegeo']" />
        <Time Value="0.058249" />
        <Attribute Name="mass" Active="1" AttributeType="Scalar" Center="Node">
          <DataItem Dimensions="384 1536 1536" NumberType="Float" Precision="4" Format="HDF">3D_XYZ_00400.h5:/fluid1/mass</DataItem>
        </Attribute>
<!-- 9 more data fields in attribute tags -->
      </Grid>
<!-- 23 more frames -->
    </Grid>
  </Domain>
</Xdmf>

And the h5dump -H output from the geometry file:

HDF5 "100urun_geometry1.h5" {
GROUP "/" {
   DATASET "geometry_mesh" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 905969664, 3 ) / ( 905969664, 3 ) }
   }
}
}

And some of the fields from one of the data files:

HDF5 "3D_XYZ_00000.h5" {
GROUP "/" {
    GROUP "fluid1" {
      DATASET "ener" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 384, 1536, 1536 ) / ( 384, 1536, 1536 ) }
      }
      DATASET "mass" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 384, 1536, 1536 ) / ( 384, 1536, 1536 ) }
      }
      DATASET "momX" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 384, 1536, 1536 ) / ( 384, 1536, 1536 ) }
      }
      DATASET "momY" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 384, 1536, 1536 ) / ( 384, 1536, 1536 ) }
      }
      DATASET "momZ" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 384, 1536, 1536 ) / ( 384, 1536, 1536 ) }
      }
   }
}
}