Datasets to evaluate ParaView algorithms

For research purposes I have made some changes to ParaView’s code, and I would like to test whether these changes bring any performance improvement at scale on large datasets. So my question is: does the ParaView community have large datasets available, ideally originating from real scientific simulations, that I could use for this purpose?

You can generate large datasets using the sources in ParaView, but these do not come from real scientific simulations. Other than that, OpenFOAM has some quite big datasets to simulate.
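As a rough illustration of the "sources" route, here is a minimal sketch that picks a Wavelet source extent for a target data size. It assumes the Wavelet source writes a single 4-byte float scalar per point (its default `RTData` array) and ignores any other memory overhead, so treat the numbers as estimates. The helper names (`wavelet_points`, `half_extent_for_bytes`, `make_wavelet`) are made up for this example; only `Wavelet` and its `WholeExtent` property come from `paraview.simple`.

```python
import math

# Assumption: one float32 scalar per point (the Wavelet source's default RTData array).
BYTES_PER_POINT = 4


def wavelet_points(half_extent):
    """Number of points for a WholeExtent of [-h, h] on each axis."""
    side = 2 * half_extent + 1
    return side ** 3


def half_extent_for_bytes(target_bytes, bytes_per_point=BYTES_PER_POINT):
    """Smallest half extent whose point data reaches at least target_bytes."""
    points_needed = math.ceil(target_bytes / bytes_per_point)
    side = max(1, math.ceil(points_needed ** (1.0 / 3.0)))
    # Correct for floating-point error in the cube root.
    while side ** 3 < points_needed:
        side += 1
    while side > 1 and (side - 1) ** 3 >= points_needed:
        side -= 1
    # A symmetric extent has 2*h + 1 points per axis, so round up to fit.
    return side // 2


def make_wavelet(half_extent):
    """Create the source; only works inside pvpython or with paraview importable."""
    from paraview.simple import Wavelet  # imported lazily on purpose
    h = half_extent
    return Wavelet(WholeExtent=[-h, h, -h, h, -h, h])


if __name__ == "__main__":
    h = half_extent_for_bytes(2 ** 30)  # aim for roughly 1 GiB of scalars
    print(h, wavelet_points(h) * BYTES_PER_POINT)
```

In pvpython you would then call `make_wavelet(h)` and apply whatever filters you want to benchmark. Keep in mind that this is synthetic data, so it will not reproduce the load imbalance of a real simulation output.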

What kind of datasets are you looking for?

I’m looking for any kind of dataset that I could visualize, with a size from several GB up to a TB. If the dataset came with some explanation of the visualizations typically done with it, that would be great. If it was used in some other visualization-related publication, that would be awesome, since I would have something to cite.

Also, since I’m interested in parallel rendering, a dataset that can be split across processes in a way that reflects how it was originally generated would be a must.

Do you have a link to those OpenFOAM datasets?

I think @olesenm must know :slight_smile:

The Deep Water Impact Dataset is available. It was used for the 2018 IEEE Visualization SciVis Contest, and visualizations from it won the 2017 Visualization Showcase at SC17. It is also used/referenced in the ghost cell generator paper. It is a relatively large unstructured grid of hexahedra, which was originally a grid of octrees in the originating simulation. We are working on getting a new machine to serve up the data and make it available, hopefully in the next month. In the meantime, I have some of the yA31 timesteps on Google Drive that I can share with you, if you like. We frequently use this dataset to test emerging ParaView versions on parallel distributed-memory supercomputers. I believe others have used it to test volume renderers and other parallel algorithms in VTK/ParaView.