Correct way of working with a decomposed OpenFOAM case

Hello,
I wanted to get feedback on the correct way to post-process OpenFOAM cases using ParaView.
For clarity, I will use the terms case and simulation interchangeably.
For background:

  1. I have a simulation that is in its decomposed state (i.e., it has not been reconstructed, and the case is divided across the processor* folders).
  2. I have an empty .foam file (created by touch sim.foam).
  3. I have my post-processing Python file (created from a trace in ParaView).
    Right now, the post-processing takes a considerable chunk of time in the total workflow (simulation + post-processing), and I would like to know whether I am doing things correctly or whether I am simply short on computing power to ‘accelerate’ this section of the workflow.
    Some information:
  • my PC has 8 cores with 2 threads each (mpirun is limited to 8, and I see 16 processors in htop)
  • my PC has an NVIDIA Corporation GA104GLM [RTX A3000 Mobile] graphics card
  • my case is divided into 8 subdomains (the case has processor0, 1,…, 7)
  • this is the About info from ParaView:
Client Information:
Version: 5.13.2-1040-g1a57559ebc
VTK Version: 9.4.0-533-g5184be8117
Qt Version: 5.15.10
vtkIdType size: 64bits
Embedded Python: On
Python Library Path: /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/lib/python3.12
Python Library Version: 3.12.7 (main, Jan 13 2025, 06:08:27) [GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
Python Numpy Support: On
Python Numpy Path: /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/lib/python3.12/site-packages/numpy
Python Numpy Version: 1.26.4
Python Matplotlib Support: On
Python Matplotlib Path: /home/franco/.local/lib/python3.12/site-packages/matplotlib
Python Matplotlib Version: 3.10.0
Python Testing: Off
MPI Enabled: On
ParaView Build ID: superbuild c6d7e04a2f5e215b44466da4f91952c39f6d5cd2 (master)
Disable Registry: Off
Test Directory: 
Data Directory: 
SMP Backend: TBB
SMP Max Number of Threads: 16
OpenGL Vendor: Intel
OpenGL Version: 4.6 (Core Profile) Mesa 24.3.4-1~24.04-tux1
OpenGL Renderer: Mesa Intel(R) UHD Graphics (TGL GT1)
Accelerated filters overrides available: No

Connection Information:
Remote Connection: No

Currently I am running the ParaView script using:
pvbatch ./paraviewScript.py
where pvbatch is a simple alias defined in .bashrc:
alias pvbatch='/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/pvbatch'
Also, if I open my simulation with paraview sim.foam and color it by vtkBlockColors, I see that the geometry is all the same color (the internal processor boundaries are nevertheless visible).


Am I missing or doing something wrong?
Should I add something to the Python script (inside the file)?
Should I run a different command instead of simply pvbatch ./paraviewScript.py?
Should I change a setting in the ParaView GUI?
Thanks in advance,

The results look correct to me.

So is there no way to improve the speed of the post-processing by using mpirun or something like that?

is there no way to improve the speed of the post-processing by using mpirun or something like that?

Yes, of course: run with MPI, e.g.:

mpiexec -np 4 pvbatch ./paraviewScript.py

  1. Should the number of processes be equal to the number of subdomains of the OF case?
  2. Is it the same to use mpiexec or mpirun in this case?
  3. Is it really running the script in parallel, or is it running the same script n times (n being the number of processes given to mpiexec in this case)? I see the print output of my Python file n times.

Generally, yes

Is it the same to use mpiexec or mpirun in this case?

You should use the one that ParaView has been compiled against. AFAICT you are using the binary release of ParaView, which means you should use:

/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/mpiexec

Is it really running the script in parallel, or is it running the same script n times?

Yes, if you are using the right mpiexec.
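If you want to double-check from within the script, here is a minimal sketch using paraview.servermanager (the print format is just illustrative):

from paraview import servermanager

# ask ParaView's process module for the MPI layout of this run; with the
# bundled mpiexec the partition count should match the -np value
pm = servermanager.vtkProcessModule.GetProcessModule()
print('partition %d of %d' % (pm.GetPartitionId(), pm.GetNumberOfLocalPartitions()))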

I am sorry, I might have misunderstood you. Which of these is the correct command?
A. (mpiexec is the bin/mpiexec and pvbatch is a flag?)
/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/mpiexec -np 4 pvbatch ./paraviewScript.py
B. (mpiexec is the bin/mpiexec and pvbatch is actually the path to the bin/pvbatch)
/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/mpiexec -np 4 /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/pvbatch ./paraviewScript.py
C. (mpiexec is the executable on the terminal and pvbatch is a flag?)
mpiexec -np 4 pvbatch ./paraviewScript.py
D. (mpiexec is the executable on the terminal and pvbatch is actually the path to the bin/pvbatch)
mpiexec -np 4 /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/pvbatch ./paraviewScript.py

Thanks for the clarification.

B. (mpiexec is the bin/mpiexec and pvbatch is actually the path to the bin/pvbatch)
/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/mpiexec -np 4 /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/pvbatch ./paraviewScript.py


Okay, now I am getting only ‘one’ print, which makes more sense. Thanks!


Hello Mathieu,
I am still facing some problems.
If I run:
pvbatch paraviewScript.py
the script finishes correctly without printing any errors.
If I run:
/home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/mpiexec -np 8 /home/franco/Programs/ParaView/ParaView-5.13.20250113-MPI-Linux-Python3.12-x86_64/bin/pvbatch ./paraviewScript.py
I get the following a lot of times:

(   6.669s) [pvbatch.1       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x17165b40): No points in data set
(   6.669s) [pvbatch.3       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x202a62e0): No points in data set
(   6.669s) [pvbatch.4       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x2aef43f0): No points in data set
(   6.647s) [pvbatch.5       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x27cc5a90): No points in data set
(   6.669s) [pvbatch.6       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x78e5b80): No points in data set
(   6.648s) [pvbatch.7       ]vtkPConnectivityFilter.:489    ERR| vtkPConnectivityFilter (0x3cdfce40): No points in data set
(   6.669s) [pvbatch.3       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x202a62e0): An error occurred on at least one process.
(   6.647s) [pvbatch.5       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x27cc5a90): An error occurred on at least one process.
(   6.648s) [pvbatch.7       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x3cdfce40): An error occurred on at least one process.
(   6.669s) [pvbatch.1       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x17165b40): An error occurred on at least one process.
(   6.669s) [pvbatch.4       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x2aef43f0): An error occurred on at least one process.
(   6.669s) [pvbatch.6       ]vtkPConnectivityFilter.:499    ERR| vtkPConnectivityFilter (0x78e5b80): An error occurred on at least one process.
(   6.669s) [pvbatch.1       ]       vtkExecutive.cxx:729    ERR| vtkPVCompositeDataPipeline (0x1749a050): Algorithm vtkPConnectivityFilter (0x17165b40) returned failure for request: vtkInformation (0x188170c0)
  Debug: Off
  Modified Time: 15318083
  Reference Count: 1
  Registered Events: (none)
  Request: REQUEST_DATA
  FROM_OUTPUT_PORT: 0
  ALGORITHM_AFTER_FORWARD: 1
  FORWARD_DIRECTION: 0

Also, I am recovering some data from the filters using dsa/Fetch etc., and when I save it to a CSV file I can see that some values are now NaN instead of the correct values (in serial they are saved correctly).
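(For reference, the getFieldsFromFilters helper used in my script is essentially a thin wrapper around servermanager.Fetch and the dsa dataset adapter; a simplified sketch of the idea:)

from paraview import servermanager
from paraview.vtk.numpy_interface import dataset_adapter as dsa

def getFieldsFromFilters(filter, fields):
    # Fetch gathers the filter's output from the server ranks to the
    # process driving the script; dsa exposes the arrays as numpy
    wrapped = dsa.WrapDataObject(servermanager.Fetch(filter))
    # IntegrateVariables puts integrated point fields in PointData and
    # cell fields (e.g. Volume) in CellData, so look in both
    return [wrapped.PointData[f] if f in wrapped.PointData.keys()
            else wrapped.CellData[f] for f in fields]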

Looks like you have data distribution issues. My guess would be that the connectivity filter in your pipeline is not happy with empty partitions. You may want to avoid having empty partitions.

Hmmmm…
And how would one ensure this, or work around the issue? I mean, each processor (I imagine that is what you call partitions) has data. Once I use the Connectivity filter the surface will not be on all procs, since it is used to extract the free surface in a water-air simulation and the partitioning is vertical (so the free surface would be in only one or two procs).

You can use the RedistributeDataSet filter.
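For instance, inserted just upstream of the filter that errors out (a minimal sketch; connectivityInput stands for whatever feeds your Connectivity filter):

from paraview.simple import Connectivity, RedistributeDataSet

# rebalance the cells across the MPI ranks so that no rank is left with
# an empty partition before the connectivity computation runs
redistributed = RedistributeDataSet(Input=connectivityInput)
connectivity = Connectivity(Input=redistributed)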

Mmmm, still there… Here is a snippet of the part of my script that might be generating the issue:



###
# integrate values for kMass
###

# clip at z = zMax_riserWalls_0
clipRiser = Clip(registrationName='clip_Riser', Input=filterForData,
                 Invert=1)
clipRiser.ClipType.Normal = [0.0, 0.0, 1.0]
clipRiser.ClipType.Origin = [0, 0, zMax_riserWalls_0]

# integrate variables over the clipRiser data
integrateVariablesRiser = IntegrateVariables(registrationName='IntegrateVariables_clipRiser', Input=clipRiser,
                                             DivideCellDataByVolume=1)

# clip at the average z position of the free surface
clipFreeSurface = Clip(registrationName='clip_FreeSurface', Input=filterForData,
                       Invert=1)
clipFreeSurface.ClipType.Normal = [0.0, 0.0, 1.0]
clipFreeSurface.ClipType.Origin = [0, 0, averageZ]

# integrate variables over the clipFreeSurface data
integrateVariablesFreeSurface = IntegrateVariables(registrationName='IntegrateVariables_clipFreeSurface', Input=clipFreeSurface,
                                                   DivideCellDataByVolume=0)

# threshold over alpha.air between 0 and 0.505
threshold_alphaAir = Threshold(registrationName='Threshold_alpha.air', Input=filterForData,
                               Scalars=['CELLS', 'alpha.air'],
                               # Scalars=['POINTS', 'alpha.air'],
                               LowerThreshold=0,
                               UpperThreshold=0.505)

# integrate variables over the threshold_alphaAir data
integrateVariablesThreshold_alphaAir = IntegrateVariables(registrationName='IntegrateVariables_threshold1', Input=threshold_alphaAir,
                                                          DivideCellDataByVolume=1)


##
# process the data while stepping through the time steps (the clipping boundaries should evolve with the time steps!)
##





# redistribute each integrated dataset across the ranks (one unique registration name per filter)
integrateVariablesGlobal = RedistributeDataSet(registrationName='RedistributeDataSet_Global', Input=integrateVariablesGlobal)
integrateVariablesRiser = RedistributeDataSet(registrationName='RedistributeDataSet_Riser', Input=integrateVariablesRiser)
integrateVariablesFreeSurface = RedistributeDataSet(registrationName='RedistributeDataSet_FreeSurface', Input=integrateVariablesFreeSurface)
integrateVariablesThreshold_alphaAir = RedistributeDataSet(registrationName='RedistributeDataSet_Threshold_alphaAir', Input=integrateVariablesThreshold_alphaAir)


timeSteps=[]
zPositions=[]
momentumAir=[]
momentumWater=[]
kMassClipRiser=[]
kMassClipFreeSurface=[]
kMassThreshold=[]
kMassComplete=[]
gasHoldupVolume=[]
for t in timeSteps_:
    try:
        # update the value of the free-surface average z position
        valuesZ = getFieldsFromFilters(filter=clip2, fields=['z'])
        averageZ = np.mean(valuesZ)

        valuesMomentums = getFieldsFromFilters(filter=integrateVariablesGlobal, fields=['momentum.air', 'momentum.water', 'kMass'])

        valuesRiser = getFieldsFromFilters(filter=integrateVariablesRiser, fields=['kMass'])

        valuesFreeSurface = getFieldsFromFilters(filter=integrateVariablesFreeSurface, fields=['kMass', 'Volume'])

        valuesThreshold = getFieldsFromFilters(filter=integrateVariablesThreshold_alphaAir, fields=['kMass'])
        

        valuesToSave=[
                    t, # 0
                    averageZ, # 1
                    valuesMomentums[0][0], # 2
                    valuesMomentums[1][0], # 3
                    valuesMomentums[2][0], # 4
                    valuesRiser[0][0], # 5
                    valuesFreeSurface[0][0]/valuesFreeSurface[1][0], # 6
                    valuesFreeSurface[1][0], # 7
                    valuesThreshold[0][0] # 8
                    ]
        timeSteps.append(valuesToSave[0])
        
        zPositions.append(valuesToSave[1])

        momentumAir.append(valuesToSave[2])
        momentumWater.append(valuesToSave[3])
        kMassComplete.append(valuesToSave[4])

        kMassClipRiser.append(valuesToSave[5])

        kMassClipFreeSurface.append(valuesToSave[6])
        gasHoldupVolume.append(valuesToSave[7])

        kMassThreshold.append(valuesToSave[8])

        clipFreeSurface.ClipType.Origin = [0,0, averageZ]
        extractFreeSurface.ClosestPoint = [0,0,averageZ]
        clip1.ClipType.Origin = [0,0, averageZ+factorOfDiameter*0.5*(xMax-xMin)]
        clip2.ClipType.Origin = [0,0, averageZ-factorOfDiameter*0.5*(xMax-xMin)]
        updateAnimation(filterForData,timeStep=t)
        updateFilters([clip2,integrateVariablesGlobal,integrateVariablesRiser,integrateVariablesFreeSurface,integrateVariablesThreshold_alphaAir],t)
    except Exception:
        # stop at the first time step that fails (e.g. if a filter errors out)
        break