It seems like this discussion forum has been opened just recently. Thanks for that!
I have a question regarding caching while running pvbatch in parallel:
I am trying to read a large dataset with 200 million cells on a visualization node with 512 GB RAM using 16 MPI processes.
My Python script reads in the data, applies some filters (including D3 for data repartitioning), and writes the resulting data to disk. It then repeats this for each timestep.
When I run the script, it takes about 10 minutes until the first timestep is completely written to disk. Profiling the memory footprint with top shows that all pvbatch processes together use about 200 GB of RAM (the first one alone uses 50 GB!).
Shortly after the second timestep is loaded into memory, the processes get killed by a system watchdog that does not allow processes to use more than 256 GB of RAM.
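To get per-process numbers from inside the script rather than reading them off top, a helper like the one below could be called after each timestep is written. This is only a sketch using Python's standard resource module, not part of my actual script: the function names and the rank argument are made up for illustration, and it assumes a Unix system. It reports the peak resident set size, which never decreases, so it can show whether memory grows from one timestep to the next but not whether memory was released in between.

```python
import resource
import sys


def peak_rss_gb():
    """Return this process's peak resident set size in GB (10**9 bytes)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1e9


def log_peak(step, rank=0):
    """Print and return one log line for the given timestep.

    `rank` is a hypothetical label supplied by the caller; how the MPI
    rank is obtained inside pvbatch is not shown here.
    """
    line = "rank {} step {}: peak RSS {:.2f} GB".format(rank, step, peak_rss_gb())
    print(line)
    return line
```

Calling log_peak(step) once per timestep on every process would show whether the peak roughly doubles when the second timestep is loaded.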
It seems like pvbatch caches the first timestep. Is there a possibility to disable that caching or to completely flush the data that has been loaded so far before continuing?
I also tried a pipeline using the ForceTime filter and calling paraview.simple.Delete() before going to the next timestep, but that didn’t remove the first timestep from memory.
I’m using ParaView 5.2.0 with Mesa (because production runs will be executed on a cluster without graphics cards). The script is called using
mpiexec pvbatch ~/viz/apply_D3_volume.py --use-offscreen-rendering
Thanks in advance and best regards