Multiblock datasets (vtkMultiBlockDataSet) have been used in ParaView to represent a collection of datasets. Besides being a container for multiple datasets, called blocks, a multiblock dataset also lets us define relationships between those blocks in a hierarchical fashion. While conceptually this sounds great, in practice the implementation introduced several challenges, especially for developers writing algorithms that work with multiblock datasets in a distributed fashion. Some of the challenges are as follows:
- It’s not easy to distinguish between blocks that are parts of a whole, i.e. simply split into chunks for parallel processing, and blocks that define a logical grouping, e.g. assemblies. While vtkMultiPieceDataSet was supposed to help there, in practice it’s hardly used.
- The hierarchy needs to be consistent across ranks in parallel processing. This places an undue burden on readers and filters, which then often resort to merging blocks together.
- The index used to identify nodes (called composite-index) is unintuitive and affected by even the slightest change to the hierarchy.
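To make the last point concrete, here is a minimal sketch in plain Python (no VTK involved). It assumes the composite-index is assigned by a depth-first, pre-order walk of the hierarchy with the root at 0, which is my understanding of how the flat index is numbered. The nested-dict tree, the block names and the `composite_indices` helper are all made up for illustration.

```python
def composite_indices(tree, root_name="root"):
    """Number every node of a nested {name: children} dict in pre-order.

    Assumption: this mimics how a composite-index is assigned (root is 0,
    then depth-first). Node names are assumed unique in this toy example.
    """
    indices = {}

    def visit(name, children):
        indices[name] = len(indices)  # next unused index
        for child_name, grandchildren in children.items():
            visit(child_name, grandchildren)

    visit(root_name, tree)
    return indices


# A hypothetical hierarchy: leaves are empty dicts.
before = {
    "Elements": {"block_a": {}, "block_b": {}},
    "SideSets": {"side_1": {}},
}
# The same hierarchy after adding a single element block.
after = {
    "Elements": {"block_a": {}, "block_b": {}, "block_c": {}},
    "SideSets": {"side_1": {}},
}

before_idx = composite_indices(before)
after_idx = composite_indices(after)

# "side_1" itself did not change, yet its index did.
print(before_idx["side_1"], after_idx["side_1"])
```

Anything that stored the index of `side_1` (a color assignment, a selection, a script) now silently points at a different node, whereas a reference by name would still be valid.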
Over the past several months, we have been working on a replacement for multiblock datasets. The new design comprises three new classes: vtkPartitionedDataSet, vtkPartitionedDataSetCollection and vtkDataAssembly, described here.
Initially, the thought was that we’d start slowly converting readers that produce multiblock datasets to use this new data model, and thus reduce usage of multiblock datasets to the point where we could deprecate them for good. With that in mind, when the time came to rewrite the Exodus reader, we opted to implement that reader to produce this new data model. The reader was easy. The new data model did indeed make it a lot easier to write this reader, which no longer had to do crazy gymnastics to ensure that the structure lined up across all ranks. However, the devil is always in the details, and that’s where things started getting more complicated rather than easier. For this reader to be usable in ParaView, filters and UI components need to support two different data types: multiblock datasets and the new partitioned-dataset collections. That means additional complexity in several of these already complex filters, even if only until we remove the multiblock dataset related logic.
Rather than going down this highly unmaintainable path, here’s an alternative: we deprecate multiblock datasets right off the bat!
Once the initial shock has subsided, if we think about what exactly this means, it may not be too hard a pill to swallow (or so I hope).
- Core components of ParaView that currently use multiblock datasets will be converted to use partitioned-dataset collections instead. Thus, filters like Redistribute Dataset, Resample Dataset etc. will need to be converted. The same is true for rendering components, i.e. mappers, the Multiblock Inspector etc. From the user’s point of view, these are all internal changes. In fact, things may get a little better: instead of using some silly composite-index to set block colors and other properties in the Multiblock Inspector, for example, the user will use block names, which are more intuitive.
- Now, what about all the readers that are reading in multiblock datasets? While over time we should convert all such readers and data producers, in the interim we will develop a filter that converts any multiblock dataset to a partitioned-dataset collection + data-assembly. ParaView can internally apply this filter so that users don’t have to explicitly use it. This lets all existing multiblock sources continue to work.
- To support custom filters that only work on multiblock datasets, we will develop a filter that converts a partitioned-dataset collection + data-assembly to a multiblock dataset, the inverse of the filter described earlier. This too can be applied automatically under the covers by ParaView when it encounters a multiblock filter being applied to a partitioned-dataset collection pipeline.
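The idea behind the two conversion filters can be sketched in a few lines of plain Python. To be clear, this is not the implementation of either filter and it uses no VTK API. It assumes that the collection is a flat list of leaf datasets and that the assembly is a separate tree which only refers to positions in that list. The function names `to_collection` and `to_multiblock`, the nested-dict layout and the placeholder strings standing in for datasets are all hypothetical.

```python
def to_collection(multiblock):
    """Split a nested {name: child} hierarchy into (partitions, assembly).

    Leaves (anything that is not a dict) are appended to the flat
    `partitions` list. `assembly` mirrors the hierarchy but stores only the
    position of each leaf in that list.
    """
    partitions = []

    def visit(node):
        assembly = {}
        for name, child in node.items():
            if isinstance(child, dict):
                assembly[name] = visit(child)       # logical grouping
            else:
                assembly[name] = len(partitions)    # reference by position
                partitions.append(child)
        return assembly

    return partitions, visit(multiblock)


def to_multiblock(partitions, assembly):
    """Inverse of to_collection: rebuild the nested hierarchy."""
    return {
        name: to_multiblock(partitions, ref) if isinstance(ref, dict) else partitions[ref]
        for name, ref in assembly.items()
    }


# Hypothetical multiblock hierarchy; strings stand in for real datasets.
multiblock = {
    "Elements": {"block_a": "grid A", "block_b": "grid B"},
    "SideSets": {"side_1": "surface 1"},
}

partitions, assembly = to_collection(multiblock)
round_trip = to_multiblock(partitions, assembly)
print(partitions)
print(assembly)
print(round_trip == multiblock)
```

The point of the split is that the data (the flat list) and the grouping (the assembly) can now vary independently: a rank with no data for a block does not have to fake the hierarchy, and a different grouping can be layered over the same list.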