Applying ParaView pipelines on different datasets (originating from the same code)


Good afternoon,

I have been using ParaView for over a year but still struggle with one significant issue: I run MHD simulations on a daily basis with one code and generate vtu output that I analyze with ParaView. I built a complex pipeline that does coordinate transformations for vector fields, computes derived quantities from the output, etc. Building the pipeline takes me more than 3 hours!

The problem is that when I run my code again with new parameters I have to build the pipeline again because the output data files are not identical. I usually run my code with a different number of cores (each of which writes its part of the vtu output to a separate file) depending on whether I run the code locally or on our cluster. I usually also have a different number of iterations/timesteps before convergence, so I have a different number of data files even if I use the same computational mesh.

So far I have been only partially successful in reusing my pipelines on different runs of the same code:

  • I linked (and renamed) one temporal snapshot into a separate directory and was then able to read in that single time step using the feature “only import data from this directory”. This works for a fixed number of cores (= a fixed number of output files) and one single time step. I had to build the same pipeline separately for, say, 8 and 60 cores, and I cannot study the temporal evolution with this pipeline.
  • I tried to merge all the vtu output (from different I/O cores and different time steps) into one h5 file and build the whole pipeline on top of it. However, this approach also fails when I load that pipeline on data from a new run, because a different number of timesteps per run is unavoidable for my implicit code.
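
The linking step from the first bullet can be scripted. What follows is only a minimal sketch, not part of the workflow described above: the function name `stage_snapshot`, the staging directory and the `piece_N.vtu` naming are all hypothetical, and it assumes a filesystem that supports symbolic links.

```python
import os
from pathlib import Path


def stage_snapshot(piece_files, stage_dir, stem="piece"):
    """Symlink one snapshot's per-core .vtu files into stage_dir under fixed names.

    piece_files: paths written by the I/O cores for a single time step.
    Returns the list of link paths that were created.
    """
    stage = Path(stage_dir)
    stage.mkdir(parents=True, exist_ok=True)

    # Remove links left over from a previous run so stale pieces do not linger.
    for old in stage.glob(f"{stem}_*.vtu"):
        if old.is_symlink():
            old.unlink()

    links = []
    for index, source in enumerate(sorted(map(Path, piece_files))):
        # Hypothetical fixed naming scheme: piece_0.vtu, piece_1.vtu, ...
        link = stage / f"{stem}_{index}.vtu"
        os.symlink(source.resolve(), link)
        links.append(link)
    return links
```

This only automates the renaming. It does not remove the limitation described in the bullet: a state built for 8 pieces still does not match a run written from 60 cores.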

At this point I don’t know whether I should keep using ParaView. The data analysis and rendering work very well, but I cannot afford the time to rebuild complex pipelines (pvsm state files) every time I want to analyze data from a new run of the same code, and I don’t know whether this issue can be fixed somehow.

Thanks a lot for your time and feedback!

(Mathieu Westphal (Kitware)) #2

I’m not sure what the problem is here. Can’t you use state files and change the inputs?

On a side note, you may be interested in Catalyst.


Thank you Mathieu for your reply!

Can’t you use statefiles and change the inputs?

Well, that was exactly my question. I save my pipelines in statefiles. The problem is that in ParaView, pipelines and the data they act upon are not separate entities: loading the data is part of the pipeline. What I can do is merge the data into one h5 file, so that step 1 of my pipeline is to read in that file. But this approach does not allow me to load my statefile on a different h5 file (with a different number of timesteps, say).

So my question is exactly what you asked me in return: is it possible to modify the statefiles for this specific purpose, and how? Let’s say my new h5 file has 1300 timesteps instead of the 500 in my original statefile. Do I have to open the statefile in an editor and copy the lines associated with reading a single time step 800 times? Apart from being cumbersome, I don’t even know whether that is possible. When I naively tried to load my statefile on the output of a different run, I got an error message saying that my timesteps are not the same, which of course they aren’t, since my CFL changes adaptively and so on. In the end I also have to run the same simulations on different unstructured tetrahedral meshes with different resolutions.

So my real question is: can this be fixed in the statefile so that I do not have to create a new statefile for every new run? If yes, I can devote more time to studying ParaView beyond the GUI; if not, this would be a good time to find out so that I can switch to some other 3-D data analysis platform. Some colleagues of mine, for instance, don’t have these problems using VisIt.

I would like to stick with ParaView but at the moment I don’t know how to proceed with this problem. If you say I can solve it with Catalyst, I will devote my time to learning it. In the ParaView manual I didn’t find this problem addressed where pipelines and statefiles are discussed. But again, so far I have mostly used the GUI and am not an expert. Maybe I should save the state as a Python script and modify that somehow. I program in Python regularly, so if you think that would be a doable approach I’ll go for it. At the moment I just don’t know which approach would be best and, most importantly, whether this can be done at all.

Thanks for your feedback!

(Mathieu Westphal (Kitware)) #4

Considering that you want to apply the exact same pipeline with the exact same properties, a state file (.pvsm) should work fine. If it doesn’t, you may be encountering a bug.

If you want to make some modifications, I would suggest using a Python state file instead, which is much easier to modify, or even using a macro to recreate your pipeline.

To test this, just change the type of the file when saving your state.
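
As a rough illustration of why a Python state file is easier to adapt: the reader call in a saved Python state carries a hard-coded `FileName` list, and that list can be rebuilt from whatever files a given run produced. The sketch below is an assumption-laden example, not the exact content of any saved state. The helper names `natural_key`, `collect_inputs` and `load_run` are hypothetical, it assumes one `.vtu` file per time step (per-core pieces would need to be merged or grouped first), and the reader should be whichever one the saved state actually uses.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'out_10.vtu' after 'out_9.vtu'."""
    parts = re.split(r"(\d+)", Path(path).name)
    return [int(p) if p.isdigit() else p for p in parts]


def collect_inputs(run_dir, pattern="*.vtu"):
    """Return every file in run_dir matching pattern, in natural order."""
    files = sorted(Path(run_dir).glob(pattern), key=natural_key)
    if not files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {run_dir}")
    return [str(f) for f in files]


def load_run(run_dir, pattern="*.vtu"):
    """Create the reader for one run; the rest of the saved pipeline follows it."""
    # Imported lazily so the helpers above can be used without ParaView installed.
    from paraview.simple import XMLUnstructuredGridReader

    # Assumption: the saved state reads .vtu files with this reader.
    # The file list is rebuilt per run instead of being hard-coded.
    return XMLUnstructuredGridReader(FileName=collect_inputs(run_dir, pattern))
```

With something like this at the top of the script, the number of time steps no longer has to match the run the state was originally saved from, since the list is discovered each time the script runs.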

Catalyst is not a solution for your specific problem, just a way to do in-situ analysis with your simulation.

So if I understand you correctly: no matter whether the data I am analyzing sit on a different unstructured mesh (with a different resolution, say), have a different number of time steps, or were written from a different number of I/O cores before being merged into one .h5 file, the pipeline in my pvsm statefile should load that data correctly (ideally without my intervention), and if it does not, it is either a bug or the code I am using writes the vtu in a non-ideal way?

I will give Python statefiles a try; maybe I can see more easily from there what has to be modified from one run to the next.

Thank you very much for your input!

(Mathieu Westphal (Kitware)) #6

it is either a bug or the code I am using writes the vtu in a non-ideal way?


Thanks again for the hint that, in principle, the pipeline/statefiles should work without my intervention (as long as the h5 file for a particular run lies in the same directory as the unchanged statefile and keeps the same name, of course).

I found that I had written the statefile with ParaView 5.5.2 Qt5 at my institute and was trying to load that same state with ParaView 5.6 on my private laptop at home. After installing 5.5.2, loading the state on data from different runs suddenly worked!
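
A version mismatch like this can be spotted before loading a state. The following is a minimal sketch under one assumption that should be checked against an actual file: that the .pvsm XML records the writing version in a `version` attribute on a `ServerManagerState` element. The function names are hypothetical, and the running ParaView version is passed in as a plain string rather than queried from ParaView.

```python
import xml.etree.ElementTree as ET


def state_file_version(pvsm_path):
    """Return the version string recorded in a .pvsm file, or None if absent."""
    root = ET.parse(pvsm_path).getroot()
    # Assumption: the version sits on a <ServerManagerState version="..."> element,
    # which may be the root itself or nested somewhere below it.
    if root.tag == "ServerManagerState":
        node = root
    else:
        node = root.find(".//ServerManagerState")
    return node.get("version") if node is not None else None


def same_release(state_version, running_version):
    """Compare major.minor only, so '5.5.2' and '5.5.0' count as the same release."""
    def major_minor(version):
        return tuple(int(x) for x in version.split(".")[:2])

    return major_minor(state_version) == major_minor(running_version)
```

For the situation described here, `same_release("5.5.2", "5.6.0")` is false, which would have flagged the laptop installation as a possible cause before any pipeline was rebuilt.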

This is great indeed because I was really afraid I couldn’t do this sort of job with ParaView. And without the really handy possibility of converting the many vtu output files into one h5 file in ParaView, it probably would not have worked without a lot of manual modification of the statefile to adapt it to each particular run.

(Mathieu Westphal (Kitware)) #8

State files are supposed to be backward compatible, but this can indeed cause issues. I’m glad you found a solution.