I bought a new machine for research. It has 12 physical CPUs (24 logical CPUs), and I wanted to process a 3D model written as vtu files. What I have is 12 .vtu files, all referenced in a single .pvtu file so they can be read in parallel in ParaView.
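For context, a .pvtu file is just a small XML index that lists the per-piece .vtu files. A minimal sketch (filenames and the array name `pressure` are hypothetical placeholders, not from the actual model) looks roughly like this:

```xml
<?xml version="1.0"?>
<VTKFile type="PUnstructuredGrid" version="0.1" byte_order="LittleEndian">
  <PUnstructuredGrid GhostLevel="0">
    <PPoints>
      <PDataArray type="Float32" NumberOfComponents="3"/>
    </PPoints>
    <PPointData Scalars="pressure">
      <PDataArray type="Float32" Name="pressure"/>
    </PPointData>
    <!-- one Piece entry per .vtu file; 12 pieces here -->
    <Piece Source="model_0.vtu"/>
    <Piece Source="model_1.vtu"/>
    <!-- ... pieces 2 through 11 ... -->
  </PUnstructuredGrid>
</VTKFile>
```

Each `Piece` is the unit the parallel reader hands to a process, which is why the piece count matters for the question below.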
When I open the .pvtu file in ParaView 5.8 (the compiled version downloaded directly for Mac), ParaView distributes the data to only 12 processes (vtkProcessId ranges from 0 to 11, so 12 CPUs), although I have 24 pvserver processes actively working on it (autoMPI limit = 24).
Do you have any idea why it's doing this?
ParaView will only read one file into one process. Otherwise, especially on supercomputers, you would have horrible interprocess communication. This is true for the VT* file formats and Exodus. So, 12 files get read into 12 processes. The only file format that I know of that will efficiently "spread" across numerous processes is the .raw format.
Thanks for the reply. When I use 6 CPUs, ParaView doesn't read one file into one process; it reads 2 files into each process. If that is possible, then it should also be possible to distribute 12 files across 24 CPUs.
Of course. You can always have multiple files read into each process. For instance, you could have all 12 files read into one process.
ParaView is an open-source product, and of course stands on the shoulders of its users. If you would like to contribute a modification of the VTM reader, we would appreciate it…
It's a nice invitation, but I'm not the kind of person who can modify code or anything like that.
Reading two files into one process and appending the results is an easier problem than partitioning one file across two processes (or at least there are more decisions to make with the latter problem).
You can use the Redistribute DataSet filter to repartition the data from 12 processes to all 24.
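The same repartitioning can be scripted. A rough pvpython sketch, assuming a ParaView build whose Python module exposes the Redistribute DataSet filter as `RedistributeDataSet` (available in releases newer than 5.8; filenames here are hypothetical):

```python
# Run under MPI with ParaView's batch interpreter, e.g.:
#   mpiexec -np 24 pvbatch redistribute.py
from paraview.simple import *

# Read the 12-piece parallel file; each rank gets whole pieces.
reader = XMLPartitionedUnstructuredGridReader(FileName='model.pvtu')

# Repartition the data across all 24 ranks.
redistributed = RedistributeDataSet(Input=reader)

# Write the repartitioned data back out as a new .pvtu set.
SaveData('model_24.pvtu', proxy=redistributed)
```

This is a sketch, not a tested script; in ParaView 5.8 itself the D3 filter mentioned later in the thread plays a similar role.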
If I use Redistribute DataSet to go to 24 cores on data that was already distributed to 12 cores when I first loaded it into ParaView, then I would have 2 datasets in the ParaView Pipeline Browser. They are copies of the same data: one distributed across 12 cores of my computer, the other across 24 cores of the same computer. I feel like I'm losing efficiency by doubling the data and keeping two different distributions. What do you think?
Indeed, that is a concern. One way to get around it is to save out the dataset from the 24 processes, reset ParaView, and load in the single new dataset.
@ofbodur: If I can add something, when working with the parallel VTK formats you should always generate the finest granularity you may need at any point, since the readers are unable to redistribute the data but can combine partitions together.
Thanks Mathieu and Cory. Using the D3 filter, which distributes the dataset across the number of pvserver processes ParaView was opened with, I was able to save the data already partitioned for 24 CPUs. I also loaded the 24-way-partitioned data (.pvtu) into ParaView with only 12 CPUs activated (autoMPI = 12). However, the distribution of data across CPUs was not even, which is fine for now.
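As an alternative to relying on autoMPI to pick the process count, the number of ranks can be set explicitly by launching pvserver under MPI yourself. A minimal sketch (the port number is an arbitrary example; `mpiexec` and `pvserver` must be on your PATH):

```shell
# Start 24 MPI-parallel server processes listening on port 11111,
# then connect the ParaView client to localhost:11111.
mpiexec -np 24 pvserver --server-port=11111
```

This makes it explicit how many ranks the data will be partitioned across when you apply D3 or save a repartitioned .pvtu.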
Also, when I save it as a state file, I hope the number of CPUs doesn't matter… Otherwise, it would be very hard to share a state file.
The number of CPUs does not matter for state files.
Thanks @mwestphal, it's very useful to know that.
Does the operating system matter for state files? For example, if I share a state file created on macOS, can someone open it on Windows using the same version of ParaView?
The OS does not matter; state files are cross-platform.
The only limitation is the ParaView version. Opening a state file with a newer version of ParaView is generally supported; opening a state file with an older version is never supported.