extract surface on a distributed dataset

Alexandre_Minot · May 25, 2023, 4:54pm

Hello,

When I use Extract Surface on data distributed by MPI, the MPI boundaries get extracted as well. This makes sense, but I would like to avoid it. Is there a way to get only the surfaces of the full domain?

Here is how to reproduce:

mpirun -np 4 pvserver
connect a ParaView client to the server
create a wavelet source
convert it to a partitioned dataset collection (not necessary, but that's what my real data is)
extract surface
clip to see the inside

Many thanks,
Alexandre

Andy_Bauer · May 25, 2023, 5:02pm

Have you tried the Ghost Cells Generator filter? Or you could generate the ghost cells yourself inside of whatever is generating your data.

Alexandre_Minot · May 25, 2023, 5:08pm

Thanks for your quick answer.

Indeed I have and it does not fix the issue.

Andy_Bauer · May 25, 2023, 5:43pm

I just noticed that you have a partitioned dataset collection (https://vtk.org/doc/nightly/html/classvtkPartitionedDataSetCollection.html#details). Since by definition the partitions may be essentially unrelated (e.g. some could be unstructured grids and some could be image data), there’s no real way in general to figure out how to make the items in a partitioned dataset collection fit back together. If it were a partitioned dataset (https://vtk.org/doc/nightly/html/classvtkPartitionedDataSet.html), then the Ghost Cells Generator filter should work, since there should be homogeneity within the items.

So, I think you’d need to either specify the ghost information yourself when generating the data or use a different dataset type.

Alexandre_Minot · May 25, 2023, 5:54pm

Thanks for your insight. The conversion from an image data to a partitioned dataset collection creates a collection of one partitioned dataset. I can use ExtractBlock to extract all the wavelet data into a single partitioned dataset, to which I applied the Ghost Cells Generator. I still end up with the MPI boundaries, and for some reason some of the surfaces are missing. Any ideas?

mwestphal · May 26, 2023, 7:10am

Could you share an actual dataset? I just want to check that we are trying to fix the actual issue at play here.

Alexandre_Minot · May 26, 2023, 2:35pm

My data comes to ParaView through Catalyst2, so it’s not easy to reproduce offline. If I write the data to disk, then I can’t figure out how to distribute it again using ParaView only.

utkarsh.ayachit · May 30, 2023, 1:59pm

Seems to me there’s a bug in the Ghost Cells Generator filter. Let me try to explain what is expected here, since there seems to be some confusion, based on a quick glance at the responses.

Since you’re using Catalyst2, the data coming into ParaView will be either a PartitionedDataSetCollection (PDC) or a PartitionedDataSet (PD). In either case, for different partitions of a data block split across ranks, all individual datasets will show up as partitions in a single PD.
Unless ghost information is already present in the data being provided to Catalyst, there’s no automatic addition of ghost arrays in the pipeline. So if you directly apply the Extract Surface filter to a PD (or PDC), you’ll see internal faces.
Now, one is expected to use the Ghost Cells Generator. It is intended to process one PD at a time and exchange boundary elements to generate a ghost layer for each dataset in the PD. This seems not to be working as expected. I suspect there’s a bug in handling ghost layers when partitions are structured datasets, a case the implementation should support, IIRC (cc @Yohann_Bearzi).
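
To illustrate why ghost cells matter for surface extraction, here is a minimal pure-Python sketch. It is a toy model, not VTK's implementation: cells are unit quads on a 4x2 grid, "faces" are their edges, and a face belongs to the surface when exactly one cell in the partition uses it. The helper names (`cell_edges`, `extract_surface`) and the two-partition split are made up for the illustration.

```python
from collections import Counter

def cell_edges(i, j):
    """Return the four edges of the unit quad whose lower-left corner is (i, j)."""
    a, b, c, d = (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
    return [frozenset(e) for e in ((a, b), (b, c), (c, d), (d, a))]

def extract_surface(owned, ghosts=()):
    """Toy surface extraction for one partition.

    An edge is kept when it belongs to an owned cell and is used by exactly
    one cell among owned + ghost cells. Edges of ghost cells are never emitted.
    """
    counts = Counter(
        e for cell in list(owned) + list(ghosts) for e in cell_edges(*cell)
    )
    return {e for cell in owned for e in cell_edges(*cell) if counts[e] == 1}

NX, NY = 4, 2
all_cells = [(i, j) for i in range(NX) for j in range(NY)]

# Hypothetical split across two ranks: columns 0-1 and columns 2-3.
left = [c for c in all_cells if c[0] < 2]
right = [c for c in all_cells if c[0] >= 2]

# Reference: surface of the undistributed domain.
reference = extract_surface(all_cells)

# Without ghost cells, each rank sees the partition interface as a boundary.
naive = extract_surface(left) | extract_surface(right)
interface = naive - reference

# With one layer of ghost cells copied from the neighbor, the interface
# edges are shared by two cells and drop out of the surface.
left_ghosts = [c for c in right if c[0] == 2]
right_ghosts = [c for c in left if c[0] == 1]
with_ghosts = extract_surface(left, left_ghosts) | extract_surface(right, right_ghosts)

print(len(reference), len(naive), len(with_ghosts))  # 12 14 12
print(with_ghosts == reference)  # True
```

The two extra edges in the naive result are exactly the edges along the partition interface, which is the analogue of the MPI boundaries showing up in the extracted surface.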

wascott · May 30, 2023, 4:21pm

@utkarsh.ayachit @cory.quammen @Yohann_Bearzi Utkarsh, thanks for the analysis. If someone writes this up, I will make sure it gets funded. Be sure to mention me in the bug…

Alexandre_Minot · June 1, 2023, 4:20pm

Thanks for your answers.

When I apply the GhostCellGenerator on our simulation data coming to ParaView through Catalyst, I get

(   3.778s) [pvbatch.5       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 1740893683! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.9       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: -419430401! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.1       ]vtkGenericDataArray.txx:389    ERR| vtkIntArray (0x13fb94e0): Unable to allocate 1099528404992 elements of size 4 bytes. 
(   3.786s) [pvbatch.7       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 16777215! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.4       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 9872! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.4       ]vtkGenericDataArray.txx:389    ERR| vtkDoubleArray (0x2ca8c3e0): Unable to allocate 5967269506265907200 elements of size 8 bytes.

The same Python script works when applied to data read from disk. I have tried to reproduce this by running the CxxPolyhedraV2 example from paraview/Examples/Catalyst2/CxxPolyhedra. In that example, I have added a GhostCellGenerator filter to the producer (see attached). This seems to work correctly, though. Since I cannot reproduce the problem on a standalone case, I’m not sure how to phrase the problem were I to open an issue.

catalyst_pipeline.py (1.2 KB)
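
For an issue report, it may help to summarize the log above per rank. The sketch below uses only the Python standard library and the six log lines quoted in this post. The regular expressions and the `parse_log` helper are made up for this illustration, not part of ParaView. Viewing the reported type ids as 32-bit hex patterns is only a triage aid: valid VTK type ids are small integers (VTK_DOUBLE is 11), so values like these suggest a buffer being misread somewhere, but that reading is an assumption, not a diagnosis.

```python
import re

# Log lines copied from the post above.
LOG = """\
(   3.778s) [pvbatch.5       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 1740893683! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.9       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: -419430401! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.1       ]vtkGenericDataArray.txx:389    ERR| vtkIntArray (0x13fb94e0): Unable to allocate 1099528404992 elements of size 4 bytes.
(   3.786s) [pvbatch.7       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 16777215! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.4       ]   vtkAbstractArray.cxx:430   WARN| Unsupported data type: 9872! Setting to VTK_DOUBLE
(   3.786s) [pvbatch.4       ]vtkGenericDataArray.txx:389    ERR| vtkDoubleArray (0x2ca8c3e0): Unable to allocate 5967269506265907200 elements of size 8 bytes.
"""

# Hypothetical patterns written for this log format only.
WARN_RE = re.compile(r"\[pvbatch\.(\d+)\s*\].*Unsupported data type: (-?\d+)!")
ALLOC_RE = re.compile(
    r"\[pvbatch\.(\d+)\s*\].*Unable to allocate (\d+) elements of size (\d+) bytes"
)

def parse_log(text):
    """Return (warnings, allocations) as lists of integer tuples."""
    warnings, allocations = [], []
    for line in text.splitlines():
        m = WARN_RE.search(line)
        if m:
            warnings.append((int(m.group(1)), int(m.group(2))))
            continue
        m = ALLOC_RE.search(line)
        if m:
            allocations.append(tuple(int(g) for g in m.groups()))
    return warnings, allocations

def as_hex32(value):
    """Show a (possibly negative) type id as a 32-bit bit pattern."""
    return f"0x{value & 0xFFFFFFFF:08X}"

warnings, allocations = parse_log(LOG)

for rank, type_id in warnings:
    print(f"rank {rank}: type id {type_id} ({as_hex32(type_id)})")

for rank, count, size in allocations:
    tib = count * size / 2**40
    print(f"rank {rank}: requested {count} x {size} bytes (~{tib:.3g} TiB)")
```

A summary like this, together with the dataset type (polyhedral unstructured grid coming through Catalyst) and the ParaView version, gives a compact description of the failure even without a standalone reproducer.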

utkarsh.ayachit · June 2, 2023, 1:33pm

Hmm, so your data is an unstructured grid with polyhedral elements and not uniform rectilinear grids, as was the test case mentioned earlier? If so, then it’s believable that the ghost-cell generator has a bug dealing with polyhedral elements. I'm not sure if it’s expected to support polyhedral elements yet (@Yohann_Bearzi, is it?).

Alexandre_Minot · June 2, 2023, 1:35pm

Indeed, my data coming through Catalyst is on a mesh consisting of polyhedra.