Catalyst V2 API spacing not being passed properly for vtkImageData

It looks like the spacing parameter isn’t being passed through the new Catalyst v2 API to construct vtkImageData. For the Catalyst2 CxxImageDataExample the spacing should be:

  double spacing[3] = { 1, 1.1, 1.3 };

but it’s at the default spacing of [1, 1, 1].

Per the Blueprint spec, it should be passed as spacing/dx, spacing/dy, etc. The example has a bug where it’s being passed as spacing/x, spacing/y, … hence the issue.
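
For reference, here’s a minimal sketch of the corrected node setup using the conduit C++ API (the Catalyst2 examples go through a thin conduit_cpp wrapper with the same path syntax); the dims values here are hypothetical:

  #include <conduit.hpp>

  void describe_uniform_coords(conduit::Node& mesh)
  {
    const double spacing[3] = { 1, 1.1, 1.3 };

    mesh["coordsets/coords/type"] = "uniform";
    mesh["coordsets/coords/dims/i"] = 18; // number of points per axis (hypothetical)
    mesh["coordsets/coords/dims/j"] = 34;
    mesh["coordsets/coords/dims/k"] = 34;

    // Correct per the Blueprint spec: spacing goes under dx/dy/dz.
    mesh["coordsets/coords/spacing/dx"] = spacing[0];
    mesh["coordsets/coords/spacing/dy"] = spacing[1];
    mesh["coordsets/coords/spacing/dz"] = spacing[2];

    // The buggy example wrote "spacing/x", "spacing/y", "spacing/z" instead,
    // paths a Blueprint consumer doesn't look for, so the resulting
    // vtkImageData fell back to the default spacing of [1, 1, 1].
  }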

Yup, it looks like it’s the example that had the bug.

Testing a bit more with this example, now in parallel: each partition has the same origin, and the extents look like they all start at 0. I’m guessing the example should have the origin adjusted for each process according to the Blueprint spec, correct? I’m not sure how this should work with topologically regular grids and partitioned datasets, though, as it will likely require quite a bit of reconstruction under the covers to get things working properly. Maybe I’m missing something from not knowing enough about Blueprint.

There too, the issue is simply with the example. The adaptor needs to adjust the origin accordingly, since Conduit Blueprint has no notion of extents and only understands dimensions.
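
For concreteness, a minimal sketch (hypothetical function and parameter names) of that adjustment: the adaptor folds the VTK-style local extent into the per-rank origin and dims that Blueprint understands:

  #include <conduit.hpp>

  void set_local_coords(conduit::Node& mesh, const int localExtent[6],
                        const double globalOrigin[3], const double spacing[3])
  {
    // Blueprint has no extents, so bake the extent's starting indices into a
    // per-rank origin.
    mesh["coordsets/coords/origin/x"] = globalOrigin[0] + localExtent[0] * spacing[0];
    mesh["coordsets/coords/origin/y"] = globalOrigin[1] + localExtent[2] * spacing[1];
    mesh["coordsets/coords/origin/z"] = globalOrigin[2] + localExtent[4] * spacing[2];

    // dims are point counts along each axis: (max - min + 1).
    mesh["coordsets/coords/dims/i"] = localExtent[1] - localExtent[0] + 1;
    mesh["coordsets/coords/dims/j"] = localExtent[3] - localExtent[2] + 1;
    mesh["coordsets/coords/dims/k"] = localExtent[5] - localExtent[4] + 1;
  }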

Looking more closely at the example, it should have adjusted the origin. Unless I am misunderstanding the original code – which I believe you wrote – grid.GetLocalPoint(0, origin) should have given me the local origin for each of the grids, which should not be 0 on all ranks.

No, the original example has a global origin that is the same on each process, and the extent is different on each process – similar to how a vtkImageData is set in parallel. I’d guess that this is also how a vtkImageData should be set inside a vtkPartitionedDataSet, but I’m not 100% sure on that.
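
For contrast, this is the parallel vtkImageData convention being described (a sketch; the rank-local extents here are hypothetical):

  #include <vtkImageData.h>

  void setup_rank_local_image(vtkImageData* image, int rank)
  {
    image->SetOrigin(0.0, 0.0, 0.0);  // global: identical on every rank
    image->SetSpacing(1.0, 1.1, 1.3); // global: identical on every rank

    // Only the extent differs per rank, e.g. a 1D decomposition along x
    // where rank r owns the 17 points [16*r, 16*(r+1)]:
    image->SetExtent(16 * rank, 16 * (rank + 1), 0, 32, 0, 32);
  }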

Conduit appears to treat the origin and dimensions the way VisIt does, where the origin and dimensions are independent on each process and piecing them together is done at a higher level (from what little understanding I have). That is, each extent in VTK terms would be [0, Nx, 0, Ny, 0, Nz] on each proc.

In any case, “origin/x”, “origin/y” should be local to each rank and not global… feel free to update the example accordingly.

So, a question about this: should operations like Extract Subset work at the block level or at the vtkPartitionedDataSet level? In my simulation code each process has an image data with dimensions of [17, 33, 33]. When I run in parallel, save it out, load the partitioned dataset into the ParaView master, and do an Extract Subset on it with a VOI of [0, 8, 0, 8, 0, 8], I get the following (the reader output is shown in Outline representation and the Extract Subset output in Surface With Edges representation).

Basically, it appears to be doing the Extract Subset on each individual block instead of over the entire, pieced-together dataset; the latter is what I would expect and want in this case.

vtkPartitionedDataSet is not fully supported by VTK/ParaView yet. It’s ongoing work that will most likely be completed by 5.10/5.11. But yes, filters like Extract Subset need to handle partitioned datasets of uniform grids where the extents across ranks don’t share the same origin. Either that, or this will need to be handled in vtkConduitSource, which is a good place for it.

@berkgeveci, any thoughts?

On thinking about this further, here are my thoughts:

  • with partitioned datasets, partitions can be of mixed types, i.e. some are structured data and others are not. That being the case, any requirement that per-partition local extents use a global origin is unreasonable and potentially impossible.
  • to support use-cases like Extract Subset, maybe we add a new filter that the user can apply to align local extents across partitions when the user knows it makes sense, i.e. all partitions comprise structured data and form a filled volume (see the sketch after this list).
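
To make the second point concrete, here’s a rough sketch (entirely hypothetical; no such VTK filter exists today) of the core computation such an alignment filter would do: recover each partition’s extent offset from its origin, assuming all partitions are uniform grids sharing one spacing:

  #include <cmath>

  void compute_extent_offset(const double partOrigin[3],
                             const double globalOrigin[3],
                             const double spacing[3], int offset[3])
  {
    for (int axis = 0; axis < 3; ++axis)
    {
      // Only meaningful when the partitions sit on a common lattice, i.e.
      // this division comes out (near) integer; globalOrigin would be, e.g.,
      // the component-wise minimum over all partition origins.
      offset[axis] = static_cast<int>(
        std::lround((partOrigin[axis] - globalOrigin[axis]) / spacing[axis]));
    }
  }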

Yeah, I’m probably mixing up concepts here in this long-running discussion. Maybe vtkMultiPieceDataSet would be a better choice for the Catalyst V2 API implementation, though. I don’t really see many folks creating a structured dataset on one process and an unstructured dataset on another, all within the same channel. Maybe Conduit allows it, but at some point structure becomes more important than flexibility; with too much flexibility and no control it gets tough to do anything useful with the data. Think NetCDF without the CF conventions (even with CF conventions it gets confusing at times).