Hi all!
First of all, let me start by thanking everyone for the work you are doing on this software, and for the help I am sure we will receive.
We are developing a CFD code (SOD2D, BSC_SOD2D / sod2d_gitlab · GitLab) and our output results are written in the VTKHDF format.
Until now we have had no problems with it: we are able to generate the output files and read them properly using ParaView. Since we use high-order Lagrangian elements, we have two options for saving the meshes/results:
- Using high-order Lagrange hexahedra (we interpolate the results onto an equidistant node distribution)
- Linearising the mesh (we “transform” and divide each p-order element into several first-order hexahedra; see the sketch right after this list)
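For context, here is a minimal sketch of the bookkeeping behind the two modes (the cell-type ids are the standard VTK ones; the helper names are just illustrative, not SOD2D code):

VTK_HEXAHEDRON = 12           # linear 8-node hexahedron
VTK_LAGRANGE_HEXAHEDRON = 72  # arbitrary-order Lagrange hexahedron

def nodes_per_lagrange_hex(p):
    # A p-order Lagrange hex carries (p + 1)**3 equidistant nodes.
    return (p + 1) ** 3

def linearised_hexes_per_element(p):
    # Linearising splits each p-order hex into p**3 first-order hexes,
    # each contributing 8 connectivity entries.
    return p ** 3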
Up to here, no problem at all; everything was working well!
The problem arose last week, when we pushed the code further and computed a case on a mesh with more than 1 billion nodes. Trying to open the mesh or the results in ParaView produces the following error:
[…]
HDF5-DIAG: Error detected in HDF5 (1.12.1) thread 0:
  #000: /builds/gitlab-kitware-sciviz-ci/build/superbuild/hdf5/src/src/H5Dio.c line 179 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: /builds/gitlab-kitware-sciviz-ci/build/superbuild/hdf5/src/src/H5VLcallback.c line 2011 in H5VL_dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #002: /builds/gitlab-kitware-sciviz-ci/build/superbuild/hdf5/src/src/H5VLcallback.c line 1978 in H5VL__dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /builds/gitlab-kitware-sciviz-ci/build/superbuild/hdf5/src/src/H5VLnative_dataset.c line 159 in H5VL__native_dataset_read(): could not get a validated dataspace from file_space_id
    major: Invalid arguments to routine
    minor: Bad value
  #004: /builds/gitlab-kitware-sciviz-ci/build/superbuild/hdf5/src/src/H5S.c line 266 in H5S_get_validated_dataspace(): selection + offset not within extent
    major: Dataspace
    minor: Out of range
( 202.168s) [pvserver.46 ]vtkHDFReaderImplementat:864 ERR| vtkHDFReader (0x1514c150): Error H5Dread start: 18446744071577530368, 140159467271832, 0 count: 1555968, 354777680, 354776672
( 202.168s) [pvserver.46 ] vtkHDFReader.cxx:440 ERR| vtkHDFReader (0x1514c150): Cannot read the Connectivity array
( 202.168s) [pvserver.46 ] vtkExecutive.cxx:753 ERR| vtkPVCompositeDataPipeline (0x1514fe70): Algorithm vtkFileSeriesReader(0x1514e570) returned failure for request: vtkInformation (0x15227980)
Debug: Off
Modified Time: 163221
Reference Count: 1
Registered Events: (none)
Request: REQUEST_DATA
FROM_OUTPUT_PORT: 0
ALGORITHM_AFTER_FORWARD: 1
FORWARD_DIRECTION: 0
[…]
I have checked the mesh/results files with our own code and the values look OK (I think). I don't know if the issue could be related to int32 vs. int64 for vtkIdType: this mesh goes above the int32 limit, and in fact we had to refactor our code for these cases so we can store larger global ids. Of course, the mesh is partitioned across several ranks (5520 ranks for this particular case), so the local ids never reach the int32 limit, but maybe something overflows when reading the HDF5 file, e.g. the vtkIdType offset variable in vtkHDFReader.cxx. No idea, just a guess…
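One hint in that direction: the first "start" value printed in the H5Dread error above (18446744071577530368) becomes negative when reinterpreted as a signed 64-bit integer, and the result even fits in an int32, which smells like a 32-bit overflow somewhere. A quick decoding check (plain Python, nothing ParaView-specific):

import struct

start = 18446744071577530368  # the offending "start" from the error above
# Reinterpret the unsigned 64-bit value as a signed 64-bit integer:
signed = struct.unpack("<q", struct.pack("<Q", start))[0]
print(signed)                    # -2132021248
print(-2**31 <= signed < 2**31)  # True: the value fits in an int32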
We have tried two different versions of ParaView (5.10.1 & 5.11) and get the same error. We asked our cluster's support team and they told us that the ParaView installations on the cluster are the precompiled binaries, so I expect the VTK_USE_64BIT_IDS compilation flag to be enabled, but I cannot be sure…
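In case it is useful for reproducing the check: I believe the id width can be queried from the ParaView Python shell; a small sketch (my assumption is that GetDataTypeSize() reports 8 bytes when VTK_USE_64BIT_IDS is enabled):

from paraview import vtk

# vtkIdTypeArray stores vtkIdType values, so its element size tells us
# whether this build uses 32-bit (4 bytes) or 64-bit (8 bytes) ids.
print(vtk.vtkIdTypeArray().GetDataTypeSize())  # expect 8 for 64-bit ids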
In summary, our code is able to read and use the mesh file (stored in HDF5), but ParaView cannot open it and fails with the error posted above. All the meshes we generated up to now gave no problem; the error only appeared with this very large mesh. Let me show the values (h5dump header):
HDF5 "cube-5520.hdf" {
GROUP "VTKHDF" {
   ATTRIBUTE "Type" {
      DATATYPE  H5T_STRING {
         STRSIZE 16;
         STRPAD H5T_STR_NULLPAD;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
   }
   ATTRIBUTE "Version" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
   }
   GROUP "CellData" {
      DATASET "mpi_rank" {
         DATATYPE  H5T_STD_U8LE
         DATASPACE  SIMPLE { ( 1073741824 ) / ( 1073741824 ) }
      }
   }
   DATASET "Connectivity" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 8589934592 ) / ( 8589934592 ) }
   }
   GROUP "FieldData" {
   }
   DATASET "NumberOfCells" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 5520 ) / ( 5520 ) }
   }
   DATASET "NumberOfConnectivityIds" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 5520 ) / ( 5520 ) }
   }
   DATASET "NumberOfPoints" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 5520 ) / ( 5520 ) }
   }
   DATASET "Offsets" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 1073747344 ) / ( 1073747344 ) }
   }
   GROUP "PointData" {
   }
   DATASET "Points" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 1147407183, 3 ) / ( 1147407183, 3 ) }
   }
   DATASET "Types" {
      DATATYPE  H5T_STD_U8LE
      DATASPACE  SIMPLE { ( 1073741824 ) / ( 1073741824 ) }
   }
}
}
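For what it's worth, the bookkeeping in the file looks self-consistent: each of the 5520 pieces stores nCells + 1 offsets, and 1073741824 + 5520 = 1073747344 matches the Offsets extent, while 1073741824 linear hexes × 8 nodes = 8589934592 matches the Connectivity extent. A small h5py sketch of the kind of consistency check I mean (our own helper, not part of SOD2D or ParaView):

import h5py

# Check the per-piece bookkeeping of the VTKHDF file
# (dataset names as in the h5dump above).
with h5py.File("cube-5520.hdf", "r") as f:
    g = f["VTKHDF"]
    n_cells = g["NumberOfCells"][:]          # one entry per piece (5520)
    n_conn = g["NumberOfConnectivityIds"][:]
    n_pts = g["NumberOfPoints"][:]

    # Each piece stores nCells + 1 offsets, hence the extra 5520 entries.
    assert g["Offsets"].shape[0] == n_cells.sum() + n_cells.shape[0]
    assert g["Types"].shape[0] == n_cells.sum()
    assert g["Connectivity"].shape[0] == n_conn.sum()
    assert g["Points"].shape[0] == n_pts.sum()

    # These global extents are what exceed the int32 limit (2**31 - 1):
    print("total connectivity ids:", int(n_conn.sum()))  # 8589934592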
Has anyone faced similar issues with large VTKHDF files?
Any hints on how to solve or debug the problem? As a first attempt I'm thinking of ‘splitting’ the mesh into several smaller files, but of course that is not the desired solution.
It would be difficult to share the mesh file, since it is 125 GB… but if you need it…
Thanks a lot!