That should work, though if you run into MPI errors you may want to try a pvbatch build without MPI, or use pvpython. Can we get this added to the ParaView docs, or at least updated in the headers?
I am getting a segmentation fault in both ospray_mpi_worker and pvbatch. I suspect it is because I am using an OSPRay compiled against the system MPI, while my ParaView is the downloaded osmesa binary. I am now trying to compile ParaView against the system MPI.
Error output attached.
Thanks,
Reynaldo

pvM0.e69290 (9.3 KB)
I was able to run your script locally on my Mac using ParaView 5.7 and OSPRay 1.8.5. I did notice that ParaView does not seem to shut down OSPRay correctly and reports some errors, but it still ran through the script correctly. Is the stall you are seeing after it exports sphere.png? Another thought I had: the stalling you are seeing may be hiding some other error, possibly a linking issue finding module_mpi. Are you able to run that same script with the mpi module loaded, but without enabling the mpi device?
It's going to be tricky to debug this issue from my end, unfortunately, since I can't reproduce it locally. My hunch is that your error output is actually caused by a linking issue.
To help narrow this down, can you please try running that same script without MPI? Keep the env var that loads the mpi module, but remove the env flag that selects the mpi device. If that fails, try running it without the mpi module at all.
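To be concrete, a sketch of what I mean, assuming the `OSPRAY_LOAD_MODULES` and `OSPRAY_DEFAULT_DEVICE` environment variables (adjust the names to whatever your launch scripts actually set, and substitute your own script path):

```shell
# Keep the module load: this still exercises the dynamic linking of module_mpi,
# which is the thing I suspect is failing.
export OSPRAY_LOAD_MODULES=mpi

# Drop the device selection: do NOT ask for the mpi offload device.
unset OSPRAY_DEFAULT_DEVICE

# your_script.py is a placeholder for the sphere script you have been running
pvbatch your_script.py
```

If this run succeeds but re-adding the device variable fails, that points at the MPI device setup rather than linking.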
If it succeeds, could you try running ospExampleViewer with the same MPI split launch you used for ParaView?
```
#o: initMPI::OSPonRanks: 1/5
#o: initMPI::OSPonRanks: 4/5
#o: initMPI::OSPonRanks: 0/5
#o: initMPI::OSPonRanks: 2/5
#o: initMPI::OSPonRanks: 3/5
master: Made 'worker' intercomm (through split): 0x55f638663690
#w: app process -1/-1 (global 4/5
#w: app process 0/1 (global 0/5
master: Made 'worker' intercomm (through split): 0x55ffe68f5690
#w: app process -1/-1 (global 2/5
master: Made 'worker' intercomm (through split): 0x55bcb14196c0
#w: app process -1/-1 (global 1/5
master: Made 'worker' intercomm (through split): 0x55ae68a54690
#w: app process -1/-1 (global 3/5
master: Made 'worker' intercomm (through intercomm_create): 0x563ecc7da500
#osp.mpi.master: processing/sending work item 0
#osp.mpi.master: done work item, tag 0: N6ospray3mpi4work15SetLoadBalancerE
#w: running MPI worker process 0/4 on pid 8468@dellrg
#w: running MPI worker process 3/4 on pid 8471@dellrg
#w: running MPI worker process 2/4 on pid 8470@dellrg
#w: running MPI worker process 1/4 on pid 8469@dellrg
Running on: dellrg /home/reynaldo/tmp/test 2019-12-06 09:30:49.403919
#osp.mpi.master: processing/sending work item 1
#ospray: trying to look up renderer type 'scivis' for the first time
#osp.mpi.master: done work item, tag 1: N6ospray3mpi4work10NewObjectTINS_8RendererEEE
#osp.mpi.master: processing/sending work item 2
…
```
I think the problem is that pvbatch is meant to run ParaView distributed, and it is likely sending MPI messages that conflict with OSPRay's on the other ranks in that mpirun command. pvpython works because it does not send any MPI messages. I'm not sure there is an easy way around this, short of perhaps connecting to a separate ring of pvservers to run some data analysis distributed, or using OSPRay in an MPI mode other than offload.
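A sketch of the separate-ring workaround, assuming everything runs on one machine and the default port (the script name is a placeholder):

```shell
# Terminal 1: start a separate ring of pvservers to do the distributed work.
# By default pvserver listens on port 11111.
mpirun -np 4 pvserver

# Terminal 2: drive the ring from a serial pvpython client. The client itself
# sends no MPI messages, so it sidesteps the pvbatch conflict.
# analysis.py (placeholder) would begin with:
#   from paraview.simple import Connect
#   Connect("localhost")
pvpython analysis.py
```

The trade-off is that OSPRay then renders on the pvserver ranks rather than inside the client process.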