Paraview parallel

Dear all,

I am very interested in using parallel processing in ParaView, but I have no idea how to set it up. I read that the following variable should be set: PARAVIEW_USE_MPI = ON. The problem is that I installed ParaView from the Ubuntu repositories:
sudo apt-get install paraview
So my first question is: can the packaged version handle parallel processing?

If the packaged version cannot handle parallel processing, I will have to download the ParaView source code. Some time ago I ran into a lot of problems trying to install ParaView from source. So, if you do not mind, please help me install it, since the wiki (https://www.paraview.org/Wiki/ParaView:Build_And_Install) is too brief for me. That is, please tell me the steps needed for a good installation: how to install Qt5, the required libraries, etc.

Once I have properly configured ParaView for parallel processing, could you tell me the steps required to use it? I have read that I must connect to a server (I have access to a cluster), but I have no idea how to do it. I have searched the Internet but have not found a solution to my problem.
Also, I tried to replicate the steps in section 15.7, "Parallel processing in paraview and pvpython", of the ParaView Guide 5.0.0. When executing mpirun -np 4 pvserver, the startup message appears:
Waiting for client...
Connection URL: cs://myhost:11111
Accepting connection(s): myhost:11111

but paraview doesn’t start up.

I am using Ubuntu 16.04 and paraview version 5.0.1.

Thank you so much in advance.

Best regards,
Guillermo

Guillermo,
I think you would be better off installing from the binary distribution.

You will get a newer ParaView, and if the file has MPI in its name, you know it was compiled with MPI.

https://www.paraview.org/download/

Make sure you install the same version on the client and on the server. Then you can run on your server:

<paraview_install>/bin/mpiexec -np 4 <paraview_install>/bin/pvserver

Then you can use your client to connect to this server.

<paraview_install>/bin/paraview -url=<url>

Where <url> is the connection URL printed by the server.

Dan
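
For reference, the two commands above can be assembled as in this minimal Python sketch. It only builds and prints the command lines; it does not launch anything. The install path /opt/paraview is a hypothetical placeholder, the helper names are made up for illustration, and the -url flag is copied from the post above (its spelling may differ between ParaView versions, so check paraview --help).

```python
import shlex
from urllib.parse import urlsplit


def server_command(install_dir, ranks):
    """Build the pvserver launch command from the post (bundled mpiexec)."""
    return [
        f"{install_dir}/bin/mpiexec",
        "-np",
        str(ranks),
        f"{install_dir}/bin/pvserver",
    ]


def client_command(install_dir, url):
    """Build the client command; the -url flag is taken from the post."""
    return [f"{install_dir}/bin/paraview", f"-url={url}"]


def parse_connection_url(url):
    """Split a URL such as cs://myhost:11111 into (scheme, host, port)."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port


# Hypothetical install location; replace with your own <paraview_install>.
install_dir = "/opt/paraview"
# Connection URL as printed by pvserver in the first post.
url = "cs://myhost:11111"

print(shlex.join(server_command(install_dir, 4)))
print(shlex.join(client_command(install_dir, url)))
print(parse_connection_url(url))
```

The server command runs on the machine that holds the data, and the client command runs on your desktop once the server prints its "Waiting for client" message.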

Hi Dan,

Thank you very much for your quick reply. And thanks for the recommendation to upgrade, it has been a large improvement.

However, I have been investigating, and I think it is possible to run ParaView in parallel on my local machine (my own computer). Is it possible? If so, how?

Thank you very much, and sorry for the inconvenience, but I am completely new to this topic.

Guillermo

The simplest way is to enable the “Auto MPI” mode.

  • Open ParaView
  • Edit -> Settings
  • Enable advanced options (the cogwheel at the top right)
  • Scroll down to (or search for) “MultiCore Support”
  • Enable AutoMPI and set the number of cores
  • Click OK and restart ParaView
  • You are now running in parallel locally.

Hi Mathieu,

Thank you very much to all of you.

Best regards,
Guillermo

Hi Mathieu,

I also compiled ParaView from source with PARAVIEW_USE_MPI=ON. I built from source in order to enable GDAL. The version I compiled is 5.8.0.

Although I enabled AutoMPI under “MultiCore Support”, ParaView doesn’t start with multiple processes. It also doesn’t give any warning or error, which is very strange…

I am on a Mac Pro running macOS Catalina 10.15.3.

Auto MPI has been shown to no longer work on certain systems.
See here: https://gitlab.kitware.com/paraview/paraview/-/issues/18420

It will most likely be deprecated and removed in the future.

I would suggest running the servers manually instead.

Is MPI the same as using multiple cores on a single machine? For instance, on Windows, if I have a machine with 2 physical CPUs, each with 48 cores, would I set the number of cores to 2 or to 96?

In the old days, you could use 96 (well, 95 to give the client a core) to split serial algorithms across your computing cores. Nowadays, quite a few core operations in ParaView/VTK are multithreaded to take advantage of multiple cores that share memory if you’ve compiled with a symmetric multiprocessing backend (such as TBB) enabled. Hence, you are probably better off most of the time not using MPI at all unless you need to, e.g., because you are running on a cluster of machines each with its own memory space.
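
The core count arithmetic in that answer (96 cores, 95 left for the servers) can be written out as a small Python sketch. The function name and the one-core reservation for the client are assumptions for illustration. Note that os.cpu_count() reports logical cores, which can be more than the physical core count.

```python
import os


def suggested_ranks(total_cores, reserve_for_client=1):
    """Number of MPI ranks after leaving some cores for the client."""
    # Never go below one rank, even on a single-core machine.
    return max(1, total_cores - reserve_for_client)


# The machine from the question: 2 physical CPUs with 48 cores each.
print(suggested_ranks(2 * 48))  # 95

# The same rule applied to the current machine (logical cores).
detected = os.cpu_count() or 1
print(suggested_ranks(detected))
```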

@cory.quammen I’ve noticed ParaView is EXTREMELY slow to open with MPI on, and it never loads my files (micro-CT DICOM stacks in my case) with MPI on, but it does with MPI turned off.

I am using the pre-built Windows installer binary ParaView-5.9.1-MPI-Windows-Python3.8-msvc2017-64bit.exe. I also have the Windows MPI installed.

When I turn on AutoMPI and enable up to 47 cores, mpiexec does run, and after about a minute (as opposed to maybe 10 seconds with it turned off) ParaView opens. I then load my data (I cannot share my data) and when I hit visualize (i.e. click the eyeball), without MPI it takes maybe 20-30 seconds to load. With MPI, it just hangs and never loads.

Do not use AutoMPI.

Should I not use MPI at all, or just not AutoMPI? Should I use the other Windows binary that does not include MPI?

You should use MPI if you can distribute your data.

You should not use AutoMPI.

  1. Distribute across cores on a single CPU, across CPUs on 1 machine in separate sockets, or distribute across CPUs on a cluster of workstations?

  2. Are there instructions for distributing data within the ParaView GUI? e.g. It is an easy task for me to, for example, perform the same OpenCV Gaussian blurring filter in Python across thousands of images in parallel by distributing the images across multiple cores. However, MPI is not a prerequisite to do this (i.e. I do not have to install MSMPI to utilize the Python multiprocessing and threading libraries). How can I achieve the same results via ParaView in the GUI and why is MPI necessary?

This depends on your workflow and process. If you are using multithreaded filters, then on a single machine or on a single CPU, using MPI may not be needed at all.

On a cluster, using MPI is needed.

Are there instructions for distributing data within the ParaView GUI? e.g. It is an easy task for me to, for example, perform the same OpenCV Gaussian blurring filter in Python across thousands of images in parallel by distributing the images across multiple cores. However, MPI is not a prerequisite to do this (i.e. I do not have to install MSMPI to utilize the Python multiprocessing and threading libraries).

You need to make sure your data is distributed, show the “processId” field to check.
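
To illustrate what a distributed dataset and its process id field look like, here is a conceptual sketch in plain Python. It is not ParaView code and does not reflect how ParaView actually partitions data; partition_slices and process_id_field are hypothetical helpers that split an image stack into contiguous blocks of slices, one block per MPI rank. A field that is 0 everywhere means all the data sits on one process.

```python
def partition_slices(n_slices, n_ranks):
    """Split slice indices 0..n_slices-1 into contiguous blocks, one per rank."""
    base, extra = divmod(n_slices, n_ranks)
    blocks = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional slice each.
        count = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + count))
        start += count
    return blocks


def process_id_field(n_slices, n_ranks):
    """For each slice, record the rank that owns it (a stand-in for processId)."""
    ids = [0] * n_slices
    for rank, block in enumerate(partition_slices(n_slices, n_ranks)):
        for index in block:
            ids[index] = rank
    return ids


# Ten slices spread over four ranks: several distinct ids appear.
print(process_id_field(10, 4))  # [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]

# Ten slices on a single rank: the field is constant, nothing is distributed.
print(process_id_field(10, 1))  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```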

How can I achieve the same results via ParaView in the GUI and why is MPI necessary?

You still need to run pvserver with MPI and connect to it, then you need to make sure your data is distributed. After that, everything works as usual.

You may want to follow a ParaView course to learn all this.

MPI is necessary because this is how ParaView handles distributed computing.

YES! I would love this! I have been scouring YouTube for information and it all feels kind of jumbled. Do you have good resources you would recommend? Where are the best places I can learn?

I have the same request for CMake. I would love to learn and between the Kitware books and YouTube, I still feel a little lost.

https://www.kitware.com/what-we-offer/#training

Is there a list of which filters are multithreaded and which are not?

I assume things like the programmable filter will require MPI (with servers launched separately)?

Is there a list of which filters are multithreaded and which are not?

I do not think there is, but all common filters are.

I assume things like the programmable filter will require MPI (with servers launched separately)?

These three subjects, programmable filters, MPI and multithreading, are unrelated.

This is unrelated to the initial topic. If you have further questions, please open your own topic.