Plugin for extremely large number of multiblock zones

Hi

I’m attempting to figure out the best way to get ParaView to efficiently visualize my dataset. Briefly, the dataset consists of a refined oct-tree mesh, where each of the resulting leaf cells is further refined into 8^3 cells. It is essentially a hybrid of a structured and an unstructured mesh, where the oct-tree is unstructured and the leaves are themselves structured.
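To make the layout concrete, here is a minimal sketch of how such a hybrid mesh could be indexed. All names (`Leaf`, `CellLocation`, `locateCell`, `cellCenter`) are made up for illustration and are not VTK types, and the sketch assumes the leaves are simply stored in a flat array, with cells numbered leaf by leaf and `i` varying fastest inside a leaf.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; none of this is part of VTK.
constexpr int kLeafDim = 8;  // cells per leaf edge
constexpr std::int64_t kCellsPerLeaf = kLeafDim * kLeafDim * kLeafDim;  // 512

// One oct-tree leaf: an implicit 8x8x8 uniform block described only by its
// minimum corner and edge length. No per-cell connectivity is stored.
struct Leaf {
  std::array<double, 3> origin;  // minimum corner of the leaf
  double size;                   // edge length of the leaf
};

// Where a cell lives: which leaf, and which (i, j, k) inside that leaf.
struct CellLocation {
  std::int64_t leaf;
  int i, j, k;
};

// Split a flat cell id into a leaf index and a local structured index
// (i varies fastest, then j, then k).
inline CellLocation locateCell(std::int64_t cellId) {
  assert(cellId >= 0);
  const std::int64_t local = cellId % kCellsPerLeaf;
  return CellLocation{cellId / kCellsPerLeaf,
                      static_cast<int>(local % kLeafDim),
                      static_cast<int>((local / kLeafDim) % kLeafDim),
                      static_cast<int>(local / (kLeafDim * kLeafDim))};
}

// Centre of a cell, computed from the leaf geometry alone.
inline std::array<double, 3> cellCenter(const Leaf& leaf, const CellLocation& c) {
  const double h = leaf.size / kLeafDim;  // cell edge length
  return {leaf.origin[0] + (c.i + 0.5) * h,
          leaf.origin[1] + (c.j + 0.5) * h,
          leaf.origin[2] + (c.k + 0.5) * h};
}
```

The point of the sketch is that only the leaves need explicit storage; everything inside a leaf follows from its origin and size.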

My first thought was simply to “unroll” all the oct-tree cells, treat each one of them as a vtkImageData, and then combine them into a vtkMultiBlockDataSet. Unfortunately, the number of zones in this multiblock dataset will be in the millions, and I suspect vtkMultiBlockDataSet can’t handle this.

To experiment with this, I wrote a simple plugin ‘reader’ that just generates many vtkImageData datasets and dumps them into a vtkMultiBlockDataSet. The important part is as follows:

int vtkMyReader::RequestData(vtkInformation *vtkNotUsed(request),
                               vtkInformationVector **vtkNotUsed(inputVector),
                               vtkInformationVector *outputVector)
{
 
  // get the info object
  vtkInformation *outInfo = outputVector->GetInformationObject(0);

  // get the output
  vtkMultiBlockDataSet *output =
    vtkMultiBlockDataSet::SafeDownCast(outInfo->Get(vtkMultiBlockDataSet::DATA_OBJECT()));
  vtkMultiBlockDataSet* rootNode = output;

  this->UpdateProgress(0.0);
  

  const int Nx = 32;
  const int Ny = 32;
  const int Nz = 32;
  rootNode->SetNumberOfBlocks(Nx*Ny*Nz);
  printf("Start adding..");
  int counter = 0;
  for (int k = 0; k < Nz; ++k) {
    for (int j = 0; j < Ny; ++j) {
      for (int i = 0; i < Nx; ++i) {
        printf("Creating Zone %d\n", counter);
        auto origin = vtkVector3d(static_cast<double>(i), static_cast<double>(j), static_cast<double>(k));
        rootNode->SetBlock(counter, CreateCubeDataSet(8, origin, 1.0 / static_cast<double>(8)));

        std::string str = "Zone" + std::to_string(counter);
        rootNode->GetMetaData(counter)->Set(vtkCompositeDataSet::NAME(), str);
        counter++;
      }
    }
  }

  this->UpdateProgress(1.0);
  return 1;
}

This will generate 32^3 = 32,768 zones and 256^3 total cells. The actual generation of the zones is very quick (~1 s), but after the loader finishes there is a pause of about 2 minutes before anything is displayed.

This is running on my own build of ParaView (HEAD detached at v5.11.0-RC1), compiled with “RelWithDebInfo”. (I tried to use the official 5.11.0-RC1 download, but the plugin doesn’t work there… a topic for another time.)

Is there something obviously wrong with the loader (source)? Or is this approach eventually going to be doomed to fail anyway?

Any thoughts would be helpful.

Thanks
Gaetan

Try vtkPartitionedDataSet. Those 2 minutes are probably spent generating metadata for each block. With vtkPartitionedDataSet (or its legacy equivalent vtkMultiPieceDataSet), ParaView skips per-block metadata generation. A version of vtkPartitionedDataSet is used for AMR (Berger-Colella) meshes, and we regularly work with numbers of patches around that number. Having said all of that, you may have better luck converting the whole thing to an unstructured grid, given the overhead of having that many blocks…

Thanks for the suggestion. I’ve actually had a bit more luck with the vtkNonOverlappingAMR type. For a 1024^3 cube (> 1B cells) represented with 8^3 boxes (~2 million boxes in total), the load time to get the “outline” representation on a quad-core Haswell-era machine is about 45 seconds, which I would consider pretty good.

It would appear that other operations with this dataset type are not so great. My attempt at making a slice through a 512^3 dataset (64^3 = 262,144 boxes) took about 2 minutes. I would have expected this operation to be nearly instant for this type of data.
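The expectation of a near-instant slice comes from the fact that an axis-aligned plane only touches one layer of boxes. A minimal sketch of that culling step, assuming a regular 64^3 lattice of unit boxes as in the 512^3 test case (the `Box` type and function names are hypothetical, and this says nothing about how ParaView’s own slice filter is implemented):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned box; stands in for one 8^3 leaf block.
struct Box {
  double min[3];
  double max[3];
};

// True if the plane x[axis] == position passes through the box.
// The interval is half-open so a plane on a shared face hits one layer only.
inline bool planeCutsBox(const Box& b, int axis, double position) {
  assert(axis >= 0 && axis < 3);
  return b.min[axis] <= position && position < b.max[axis];
}

// Build an n x n x n lattice of unit boxes (same layout as the test reader).
inline std::vector<Box> makeUnitBoxes(int n) {
  std::vector<Box> boxes;
  boxes.reserve(static_cast<std::size_t>(n) * n * n);
  for (int k = 0; k < n; ++k) {
    for (int j = 0; j < n; ++j) {
      for (int i = 0; i < n; ++i) {
        const double x = i, y = j, z = k;
        boxes.push_back(Box{{x, y, z}, {x + 1.0, y + 1.0, z + 1.0}});
      }
    }
  }
  return boxes;
}

// Indices of the boxes a slice actually needs to visit.
inline std::vector<std::size_t> boxesCutByPlane(const std::vector<Box>& boxes,
                                                int axis, double position) {
  std::vector<std::size_t> hits;
  for (std::size_t n = 0; n < boxes.size(); ++n) {
    if (planeCutsBox(boxes[n], axis, position)) {
      hits.push_back(n);
    }
  }
  return hits;
}
```

For the 64^3 lattice, a plane normal to one axis cuts 64^2 = 4,096 of the 262,144 boxes, so only about 1.6% of the blocks need any cell-level work. Even this brute-force scan over all bounding boxes is cheap compared with processing every block.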

There are many other issues as well, such as the “slice” representation of the volume not working, most of the AMR-specific filters being grayed out, the “AMR Slice with Plane” filter segfaulting, etc.

I can follow up with a separate topic for those other issues.

With respect to the unstructured form… I was really trying to avoid having to commit a massive amount of additional memory to describe the entire dataset using unstructured data. For a billion hexahedral cells, you’re looking at 64 GB just to hold the connectivity in memory (8 point ids per cell, assuming 8-byte ints).
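The 64 GB figure can be checked with a few lines of arithmetic. In the sketch below the function names are hypothetical, and the “implicit” cost assumes each 8^3 block is described by just four doubles (an origin and a spacing); point coordinates and field data are left out of both numbers.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for hexahedral connectivity: 8 point ids per cell.
inline std::uint64_t hexConnectivityBytes(std::uint64_t cells,
                                          std::uint64_t bytesPerId) {
  return cells * 8 * bytesPerId;
}

// Bytes needed to describe the same cells as implicit structured blocks.
// Assumption: each block stores only an origin (3 doubles) and a spacing
// (1 double).
inline std::uint64_t implicitBlockBytes(std::uint64_t cells,
                                        std::uint64_t cellsPerBlock) {
  assert(cellsPerBlock > 0);
  return (cells / cellsPerBlock) * 4 * sizeof(double);
}
```

With 10^9 cells and 8-byte ids, `hexConnectivityBytes` gives 64 × 10^9 bytes. For the 1024^3 case, `implicitBlockBytes(1ULL << 30, 512)` comes to 64 MiB under the stated assumption, roughly a thousand times less than the explicit connectivity.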

Thanks,
Gaetan

Yeah, I understand the memory issue. These kinds of datasets are always tricky. There is the hyperoctree class that might fit, but I am not sure if it supports 8^3 subdivision. I believe that it is a straight octree for the children. Unstructured grid has issues too, in that you will have “T-junctions” (hanging nodes) that some filters will not like. The right solution is of course to implement a new data type, but that is not a straightforward process. We are still struggling with the hyperoctree implementation.

I’m starting to realize a new data type might be in order. I’m fairly new to ParaView/VTK, but from what I understand you would make a custom data type that inherits from vtkDataSet (or vtkDataObject?) and implements the required functionality. Then you would need custom filters for the required operations on that data type. In practice, all the volume data itself needs to display is an “outline”; anything else you do with the data would be via a filter, such as a slice, isocontour, etc. Then the filters that operate on a vtkDataSet should just “work”.

In the meantime I’ll keep plugging away at the vtkNonOverlappingAMR object to see if I can more fully understand its shortcomings for my application.

Gaetan

That’s exactly right. If you can subclass from vtkDataSet and implement its API, you get most functionality. If you subclass from vtkDataObject, you have to implement all of your own filters pretty much. It is somewhat challenging to implement the vtkDataSet API for this type of data though. @Yohann_Bearzi can say more as he thought about it a lot.