ParaView uses only 12 of my 24 logical CPUs (based on vtkProcessId) when opening a .pvtu file referencing 12 .vtu files

Hi All,

I bought a new machine for research with 12 physical CPUs (24 logical CPUs), and I want to process a 3D model written as .vtu files. I have 12 .vtu files, all referenced in a single .pvtu file so that ParaView can read them in parallel.
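For reference, a .pvtu file is just an XML index that lists the piece files. A minimal sketch of what mine looks like (the piece file names and array details here are hypothetical):

```xml
<?xml version="1.0"?>
<VTKFile type="PUnstructuredGrid" version="0.1" byte_order="LittleEndian">
  <PUnstructuredGrid GhostLevel="0">
    <PPoints>
      <PDataArray type="Float32" NumberOfComponents="3"/>
    </PPoints>
    <!-- one Piece entry per .vtu file; 12 pieces in my case -->
    <Piece Source="model_0.vtu"/>
    <Piece Source="model_1.vtu"/>
    <!-- ... -->
    <Piece Source="model_11.vtu"/>
  </PUnstructuredGrid>
</VTKFile>
```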

When I open the .pvtu file in ParaView 5.8 (the precompiled binary for Mac), ParaView distributes the data to only 12 CPUs (vtkProcessId runs from 0 to 11, so 12 CPUs), even though I have 24 pvserver processes actively working on it (Auto MPI limit = 24).

Do you have any idea why it’s doing so?

ParaView will only read one file into one process. Otherwise, especially on supercomputers, you would have horrible interprocess communication. This is true for VT* files and Exodus. So, 12 files get read into 12 processes. The only file format I know of that will efficiently spread data across numerous processes is the .raw format.

Thanks for the reply. When I use 6 CPUs, ParaView doesn't read one file into one process; it reads 2 files into each process. If that's possible, then it should also be possible to distribute 12 files across 24 CPUs.
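The 6-CPU behavior is consistent with a static assignment of whole piece files to ranks: pieces are never split, so with more ranks than pieces, some ranks simply get nothing. A rough sketch of that arithmetic (not ParaView's actual code, just an illustration):

```python
def assign_pieces(num_pieces, num_ranks):
    """Assign whole piece files to ranks round-robin; a piece is never split."""
    assignment = {rank: [] for rank in range(num_ranks)}
    for piece in range(num_pieces):
        assignment[piece % num_ranks].append(piece)
    return assignment

# 12 pieces on 6 ranks: every rank reads 2 files
print(assign_pieces(12, 6))
# 12 pieces on 24 ranks: ranks 0-11 read one file each, ranks 12-23 sit idle
print(assign_pieces(12, 24))
```

This is why combining files onto fewer processes works, while spreading 12 files over 24 processes leaves half of them idle.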

Of course. You can always have multiple files read into each process. For instance, you could have all 12 files read into one process.

ParaView is an open-source product and, of course, stands on the shoulders of its users. If you would like to contribute a modification to the VTM reader, we would appreciate it…


Thanks Walter,

It’s a nice invitation, but I’m not the kind of person who can modify code like that.

Reading two files into one process and appending the results is an easier problem than partitioning one file across two processes (or at least there are more decisions to make with the latter).

You can use the Redistribute DataSet filter to repartition the data from 12 processes to all 24.
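What repartitioning aims for can be sketched with plain arithmetic: the cells of the 12 loaded pieces get split so that all 24 ranks hold roughly equal shares. A toy illustration (the per-piece cell counts are made up):

```python
def rebalance(cell_counts, num_ranks):
    """Split a total cell count evenly across num_ranks, handing out the
    remainder one cell at a time (the target of a balanced repartition)."""
    total = sum(cell_counts)
    base, extra = divmod(total, num_ranks)
    return [base + (1 if rank < extra else 0) for rank in range(num_ranks)]

# 12 uneven pieces rebalanced over 24 ranks
pieces = [1000] * 6 + [1500] * 6   # hypothetical cell counts per .vtu
print(rebalance(pieces, 24))
```

The real filter also has to decide *which* cells move where (it partitions space, not just counts), but the goal is the same: every rank ends up with close to total/24 cells instead of half the ranks holding everything.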

Thanks Cory,

If I use Redistribute DataSet to repartition to 24 cores the data that was already distributed to 12 cores when first loaded, then I would have two datasets in the ParaView Pipeline Browser. They are copies of the same data: one distributed across 12 cores of my computer, the other across 24 cores of the same machine. I feel like I lose efficiency by doubling the data and keeping two different distributions around. What do you think?

Indeed, that is a concern. One way around it is to save out the dataset from the 24 processes, reset ParaView, and load the single dataset back in.


@ofbodur: If I can add something, when working with the parallel VTK formats you should always generate the finest partitioning you may need at any point, since the readers cannot redistribute the data but can combine partitions together.


Thanks Mathieu and Cory. Using the D3 filter, which redistributes the dataset to the number of processes ParaView (pvserver) was started with, I was able to save the data already partitioned for 24 CPUs. I also loaded the 24-way-partitioned data (.pvtu) into ParaView with only 12 CPUs active (Auto MPI limit = 12). However, the distribution of data to the CPUs was not even, which is fine for now.

Also, when I save a state file, I hope the number of CPUs doesn’t matter… Otherwise, it would be very hard to share state files.


The number of CPUs does not matter for state files.


Thanks @mwestphal, that’s very useful to know.
