I’m trying to use paraview to analyze large numerical datasets in parallel on our cluster. What I want to do is extract data at several slices, and convert them to numpy arrays for further post-processing. Currently I’m using something like this:
    from paraview.simple import *
    import vtk
    from vtk.numpy_interface import dataset_adapter as dsa
    from mpi4py import MPI
    import numpy as np

    data = XMLMultiBlockDataReader(FileName=[<InputFile>])
    xPos = np.linspace(0.32, 0.8, 10)
    sliceData = dict()
    for pos in xPos:
        slice1 = Slice(Input=data)
        slice1.SliceType = 'Plane'
        slice1.SliceOffsetValues = [0.0]
        slice1.SliceType.Origin = [pos, 0, 0.0]
        slice1.SliceType.Normal = [1, 0, 0.0]
        sliceDataPV = servermanager.Fetch(slice1)
        sliceDataNP = dsa.WrapDataObject(sliceDataPV)
        # extract point coordinates
        sliceData["x"] = sliceDataNP.Points.Arrays[:, 0]
        sliceData["y"] = sliceDataNP.Points.Arrays[:, 1]
        sliceData["z"] = sliceDataNP.Points.Arrays[:, 2]
        sliceData["pressure"] = sliceDataNP.PointData["pressure"].Arrays
This works well as long as I run my script in serial mode. Due to the large file sizes, however, I also want to be able to run it in parallel. Since the data is then distributed between the different processes, each process holds only part of the sliceData, which breaks my post-processing.
My first attempt to resolve this was to gather all the data on one (or all) processes using mpi4py, similar to what's described here: mpi4py and vtk
So I just added the following lines to the beginning of my script:
    gc = vtk.vtkMultiProcessController.GetGlobalController()
    comm = vtk.vtkMPI4PyCommunicator.ConvertToPython(gc.GetCommunicator())
And inside the for-loop I tried gathering the data with the following code:
    sendcounts = comm.allreduce(len(sliceData["pressure"]), op=MPI.SUM)
    pressureGathered = np.empty(sendcounts, dtype=np.float64)
    comm.Allgatherv([sliceData["pressure"], MPI.DOUBLE], [pressureGathered, MPI.DOUBLE])

(Note: I use MPI.DOUBLE here to match the np.float64 buffer; with MPI.FLOAT the types would disagree.)
However, pvbatch always hangs at the allreduce call and stops doing anything: there is no error, and the CPU usage of the processes drops to almost zero. If I run the same commands under plain python with only mpi4py and numpy arrays, everything works as expected.
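One thing I considered (this is an assumption on my part, not something I have confirmed): the arrays coming out of the dataset adapter may be non-contiguous views into VTK memory, while mpi4py's capital-letter, buffer-based collectives expect contiguous buffers. A small helper to force a contiguous float64 copy before any collective would look like this (as_mpi_buffer is a name I made up):

```python
import numpy as np

def as_mpi_buffer(a):
    """Return a C-contiguous float64 copy suitable for mpi4py's
    buffer-based (capital-letter) collectives."""
    return np.ascontiguousarray(np.asarray(a), dtype=np.float64)

# a strided view is not contiguous; the helper copies it into a plain buffer
strided = np.arange(10.0)[::2]
buf = as_mpi_buffer(strided)
```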
So I’d like to know: is there anything special I need to take into account when gathering VTK-backed arrays? Or is there a different way to get all the slice data onto one process?