vtkGhostType array missing

I’m working on a filter which requires a rectilinear grid with ghost cells and I want to know which cell is a ghost cell when running an algorithm on the dataset. To my current understanding the vtkGhostType array is meant for exactly this purpose.
But this array is sometimes missing. For example, it is set when I open a *.vtr file and then apply my filter, which requires a ghost level, but it is not set when I use the Xdmf2 reader instead.

Who is responsible for setting the vtkGhostType array, and how can I make sure it is available?

It seems the pipeline generates it if the source sets CAN_PRODUCE_SUB_EXTENT():
https://gitlab.kitware.com/vtk/vtk/-/blob/2959413ff190bc6e3ff40f5b6c1342edd2e5233f/Common/ExecutionModel/vtkStreamingDemandDrivenPipeline.cxx#L968-988

In the wiki I found the explanation “[…] Such readers should not set CAN_PRODUCE_SUB_EXTENT() but set CAN_HANDLE_PIECE_REQUEST() and handle both UPDATE_EXTENT() and pieces/ghosts internally. […]” https://vtk.org/Wiki/VTK/Parallel_Pipeline#Structured_Data_Readers_and_Filters

The Xdmf2 reader seems to be such a reader: it sets only CAN_HANDLE_PIECE_REQUEST(). Does this mean the Xdmf2 reader has a bug because it does not set vtkGhostType, or is this something I must do in my custom filter?

When a reader sets CAN_PRODUCE_SUB_EXTENT(), the pipeline takes care of converting a piece request (in the form piece_m of n_pieces) to an extent (in the form (imin, imax, jmin, jmax, kmin, kmax)). The pipeline will also take any requested ghost levels into account, potentially asking the reader to produce overlaps. After the reader produces the data, the pipeline will call a method to generate the vtkGhostType array.
If a reader sets CAN_HANDLE_PIECE_REQUEST(), it is up to the reader to deal with all of the above including producing the vtkGhostType array. I am not sure what the Xdmf2 reader does with the ghost level request. If it doesn’t honor the request, then it is a missing feature. If it honors the request and generates the ghost levels but not the vtkGhostType array, it is a bug. My guess is that it doesn’t honor the request because otherwise you would see overlaps in parallel.
Try the Universal Ghost Cell Generator after the reader.
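To make the piece-to-extent conversion described above more concrete, here is a minimal pure-Python sketch. It is not VTK's actual extent translation code: the function name `piece_to_extent` is hypothetical, and it assumes the simplest possible strategy of cutting the whole extent into slabs along the k axis and growing each slab by the requested number of ghost levels.

```python
def piece_to_extent(whole, piece, n_pieces, ghost_level=0):
    """Map (piece, n_pieces) to a point extent, split into slabs along k.

    Hypothetical helper for illustration only; VTK may split differently.
    `whole` is (imin, imax, jmin, jmax, kmin, kmax) in point indices.
    """
    imin, imax, jmin, jmax, kmin, kmax = whole
    n_cells = kmax - kmin  # number of cell layers along k

    # Distribute the cell layers as evenly as possible over the pieces.
    start = kmin + (piece * n_cells) // n_pieces
    end = kmin + ((piece + 1) * n_cells) // n_pieces

    # Grow by the requested ghost levels, clamped to the whole extent.
    start = max(kmin, start - ghost_level)
    end = min(kmax, end + ghost_level)
    return (imin, imax, jmin, jmax, start, end)


whole_extent = (0, 10, 0, 10, 0, 10)
without_ghosts = piece_to_extent(whole_extent, 1, 2)
with_ghosts = piece_to_extent(whole_extent, 1, 2, ghost_level=1)
```

With one ghost level, the second of two pieces starts one cell layer earlier, which is the overlap with its neighbor.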

Exactly this is happening: ghost cells are generated, but not the vtkGhostType array. The overlaps are clearly visible when rendering the rectilinear grid from the Xdmf reader as Surface, coloring by vtkProcessId, and enabling transparent surfaces (and of course I need to add a filter afterwards that asks for a ghost level).

The reader needs to call GenerateGhostArray() then. See vtkStreamingDemandDrivenPipeline::ExecuteDataEnd() for how it is used. We are not really maintaining the Xdmf readers anymore, but if someone makes a fix, I am sure we can upstream it. What is your HDF5 file layout like? There may be alternatives to the Xdmf reader that are easier to maintain (and potentially do not require the XML file).
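As a rough illustration of what such a ghost array contains, here is a pure-Python sketch. It does not reproduce VTK's GenerateGhostArray() implementation; the helper name `ghost_flags` is hypothetical, and the flag value 1 is assumed to correspond to vtkDataSetAttributes::DUPLICATECELL. The idea is simply that every cell outside the extent the piece owns gets flagged.

```python
DUPLICATECELL = 1  # assumed to mirror vtkDataSetAttributes::DUPLICATECELL


def ghost_flags(extent, owned):
    """Return one flag per cell of `extent`, with i varying fastest.

    Hypothetical helper for illustration only. Both arguments are point
    extents (imin, imax, jmin, jmax, kmin, kmax). Cell i spans points
    i..i+1, so a cell is owned when owned_min <= i < owned_max on each axis.
    """
    imin, imax, jmin, jmax, kmin, kmax = extent
    oi0, oi1, oj0, oj1, ok0, ok1 = owned
    flags = []
    for k in range(kmin, kmax):
        for j in range(jmin, jmax):
            for i in range(imin, imax):
                inside = (oi0 <= i < oi1 and
                          oj0 <= j < oj1 and
                          ok0 <= k < ok1)
                flags.append(0 if inside else DUPLICATECELL)
    return flags


# A 2 x 1 x 3 cell block where the last k layer is a ghost layer.
flags = ghost_flags((0, 2, 0, 1, 0, 3), (0, 2, 0, 1, 0, 2))
```

A filter can then skip every cell whose flag is nonzero.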

I tried to make a fix for the Xdmf reader: https://gitlab.kitware.com/vtk/vtk/-/merge_requests/8074

Out of interest, what alternative file formats are you thinking of?
The data I’m using are volume-of-fluid simulations. There is basically a single HDF5 file for each timestep and each property, with a single 3D array per file (plus one HDF5 file with the rectilinear grid coordinates, and the XML file).
But the files are written by an HPC Fortran simulation code, and I got them from another institute that develops these simulations. So in the short term, fixing the Xdmf reader is probably easier than switching formats. But if the Xdmf reader is more or less deprecated within VTK/ParaView, they may also be interested in knowing about alternatives for the future.

I didn’t mean switching formats. I meant switching readers. IMO the easiest option for you would be to use the recently integrated h5py module to develop a custom reader in Python. It would be a plugin. Probably 50-100 lines long and could easily support all of the features of Xdmf. I can provide some pointers if you’d like.
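One piece such a Python reader would likely need is a way to read only the requested extent from each file. The sketch below is only an assumption about how that could look: the helper name `extent_to_cell_slices` is hypothetical, and it assumes the 3D array holds cell data stored in (k, j, i) order, which may not match how the Fortran code writes it.

```python
def extent_to_cell_slices(extent):
    """Convert a VTK point extent to slices over a cell-data array.

    Hypothetical helper for illustration only. Assumes the array is
    indexed as array[k, j, i] and that extent indices start at 0.
    A point extent (min, max) covers the cells min .. max-1.
    """
    imin, imax, jmin, jmax, kmin, kmax = extent
    return (slice(kmin, kmax), slice(jmin, jmax), slice(imin, imax))


# Slices for a piece that covers k = 4..10 of a 10 x 10 x 10 cell grid.
slices = extent_to_cell_slices((0, 10, 0, 10, 4, 10))
```

The resulting tuple can be used to index an h5py dataset or a NumPy array, for example `dataset[slices]`, so that only the cells of the piece are read.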

OK, now I get your idea. That sounds interesting. If you have a few pointers, that would be helpful, and I can give it a try.

I’m a bit worried about performance with the extra data passing through Python, but probably only trying it and benchmarking will tell :)