In situ crash when trying to re-initialize

Hi,

I have this very simple sample code using ParaView to render a fractal in situ: https://github.com/mdorier/MandelbulbCatalystExample

Right now I call InSitu::Initialize() at the beginning, InSitu::CoProcess() at every iteration, and InSitu::Finalize() at the end (see src/Mandelbulb.cpp).

For a research project, however, I need to be able to finalize and re-initialize the in situ part at every iteration. If I move InSitu::Initialize() and InSitu::Finalize() inside the iteration (i.e. right before and after InSitu::CoProcess()), the second time the in situ part is initialized, the code gives me the following warning:

( 15.685s) [pvbatch.0 ]vtkSocketController.cxx:50 WARN| vtkSocketController (0x44a4350): Already initialized.

and then crashes with this long stack trace:

Stack trace:
[truncated]
123     0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
122     0x2b7d31abc47d _PyEval_EvalFrameDefault + 37373
121     0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
120     0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
119     0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
118     0x2b7d31adf695 _PyCFunction_FastCallDict + 37
117     0x2b7d31adf4d8 _PyMethodDef_RawFastCallDict + 456
116     0x2b7d31bb2b4d /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x168b4d) [0x2b7d31bb2b4d]
115     0x2b7d31bb57db PyEval_EvalCode + 27
114     0x2b7d31bb57ae PyEval_EvalCodeEx + 62
113     0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
112     0x2b7d31abc1aa _PyEval_EvalFrameDefault + 36650
111     0x2b7d31bd02be PyImport_ImportModuleLevelObject + 1422
110     0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
109     0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
108     0x2b7d31ade6c7 _PyFunction_FastCallDict + 183
107     0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
106     0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
105     0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
104     0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
103     0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
102     0x2b7d31ae0af3 PyCFunction_Call + 243
101     0x2b7d31bb0f78 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x166f78) [0x2b7d31bb0f78]
100     0x2b7d31bd030f PyImport_ImportModuleLevelObject + 1503
99      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
98      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
97      0x2b7d31ade8dc _PyFunction_FastCallDict + 716
96      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
95      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
94      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
93      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
92      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
91      0x2b7d31aba6ed _PyEval_EvalFrameDefault + 29805
90      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
89      0x2b7d31abc47d _PyEval_EvalFrameDefault + 37373
88      0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
87      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
86      0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
85      0x2b7d31adf695 _PyCFunction_FastCallDict + 37
84      0x2b7d31adf4d8 _PyMethodDef_RawFastCallDict + 456
83      0x2b7d31bb2b4d /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x168b4d) [0x2b7d31bb2b4d]
82      0x2b7d31bb57db PyEval_EvalCode + 27
81      0x2b7d31bb57ae PyEval_EvalCodeEx + 62
80      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
79      0x2b7d31abc1aa _PyEval_EvalFrameDefault + 36650
78      0x2b7d31bd02be PyImport_ImportModuleLevelObject + 1422
77      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
76      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
75      0x2b7d31ade6c7 _PyFunction_FastCallDict + 183
74      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
73      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
72      0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
71      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
70      0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
69      0x2b7d31ae0af3 PyCFunction_Call + 243
68      0x2b7d31bb0f78 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x166f78) [0x2b7d31bb0f78]
67      0x2b7d31bd030f PyImport_ImportModuleLevelObject + 1503
66      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
65      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
64      0x2b7d31ade8dc _PyFunction_FastCallDict + 716
63      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
62      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
61      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
60      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
59      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
58      0x2b7d31aba6ed _PyEval_EvalFrameDefault + 29805
57      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
56      0x2b7d31abc47d _PyEval_EvalFrameDefault + 37373
55      0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
54      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
53      0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
52      0x2b7d31adf695 _PyCFunction_FastCallDict + 37
51      0x2b7d31adf4d8 _PyMethodDef_RawFastCallDict + 456
50      0x2b7d31bb2b4d /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x168b4d) [0x2b7d31bb2b4d]
49      0x2b7d31bb57db PyEval_EvalCode + 27
48      0x2b7d31bb57ae PyEval_EvalCodeEx + 62
47      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
46      0x2b7d31abc1aa _PyEval_EvalFrameDefault + 36650
45      0x2b7d31bd02be PyImport_ImportModuleLevelObject + 1422
44      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
43      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
42      0x2b7d31ade6c7 _PyFunction_FastCallDict + 183
41      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
40      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
39      0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
38      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
37      0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
36      0x2b7d31ae0af3 PyCFunction_Call + 243
35      0x2b7d31bb0f78 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x166f78) [0x2b7d31bb0f78]
34      0x2b7d31bd030f PyImport_ImportModuleLevelObject + 1503
33      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
32      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
31      0x2b7d31ade8dc _PyFunction_FastCallDict + 716
30      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
29      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
28      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
27      0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
26      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
25      0x2b7d31aba6ed _PyEval_EvalFrameDefault + 29805
24      0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
23      0x2b7d31abc47d _PyEval_EvalFrameDefault + 37373
22      0x2b7d31ade983 _PyFunction_FastCallKeywords + 147
21      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
20      0x2b7d31abbe72 _PyEval_EvalFrameDefault + 35826
19      0x2b7d31adf695 _PyCFunction_FastCallDict + 37
18      0x2b7d31adf4d8 _PyMethodDef_RawFastCallDict + 456
17      0x2b7d31bb2b4d /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x168b4d) [0x2b7d31bb2b4d]
16      0x2b7d31bb57db PyEval_EvalCode + 27
15      0x2b7d31bb57ae PyEval_EvalCodeEx + 62
14      0x2b7d31bb5461 _PyEval_EvalCodeWithName + 2785
13      0x2b7d31abc1aa _PyEval_EvalFrameDefault + 36650
12      0x2b7d31bd030f PyImport_ImportModuleLevelObject + 1503
11      0x2b7d31adffcd _PyObject_CallMethodIdObjArgs + 173
10      0x2b7d31adfd60 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x95d60) [0x2b7d31adfd60]
9       0x2b7d31ade8dc _PyFunction_FastCallDict + 716
8       0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
7       0x2b7d31aba35b _PyEval_EvalFrameDefault + 28891
6       0x2b7d31ab213b /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/python-3.7.8-xe6or5bd6urzfux6vhc2asxszabx2666/lib/libpython3.7m.so.1.0(+0x6813b) [0x2b7d31ab213b]
5       0x2b7d31ab5e9a _PyEval_EvalFrameDefault + 11290
4       0x2b7d31b21497 PyObject_IsTrue + 55
3       0x2b7d4c7da1fc /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/py-numpy-1.19.1-6frc7dvyvazw7k5krmsxkc3v5frqdhte/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so(+0x14f1fc) [0x2b7d4c7da1fc]
2       0x2b7d4c7d9177 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/py-numpy-1.19.1-6frc7dvyvazw7k5krmsxkc3v5frqdhte/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so(+0x14e177) [0x2b7d4c7d9177]
1       0x2b7d4c7d58f8 /nfs2/mdorier/spack/opt/spack/linux-ubuntu14.04-ivybridge/gcc-8.2.1/py-numpy-1.19.1-6frc7dvyvazw7k5krmsxkc3v5frqdhte/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so(+0x14a8f8) [0x2b7d4c7d58f8]
0       0x2b7d35a35330 /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330) [0x2b7d35a35330]
(  16.532s) [pvbatch.0       ]                       :0     FATL| Signal: SIGSEGV

After further investigation, it seems the warning is caused by the initialization of the processor, here, and the crash is caused by the initialization of the pipeline here.

How can I fix this problem?

Initialize/Finalize are simply not intended to be called multiple times in the lifetime of the process. What is your intent behind calling these multiple times? Maybe there are alternative options.

We are trying to make an elastic in situ visualization mechanism, i.e. changing the number of processes at run time in-between iterations. We are using our own custom child classes of vtkMultiProcessController and vtkCommunicator, but this could be applied to MPI-based communication as well: we can create a vtkMPICommunicator from an MPI_Comm, then create a vtkMPIController and call SetCommunicator() with the previously created communicator, and finally call vtkMultiProcessController::SetGlobalController(). Doing this before initializing the Processor and Pipeline makes the in situ code run only on the processes from the MPI_Comm we initially provided. However, if later we want to use more (or fewer) processes, we need to be able to re-initialize our in situ code with a new MPI communicator. Since the initialization of the Processor and Pipeline involves MPI collective communications, we can’t simply initialize on the added processes; we need to re-initialize the existing processes as well.

Ah! Alas, I can’t think of an easy way to address this short of going through each step of the init/finalize code and then addressing the issues. These methods were indeed not intended to be called multiple times.

Maybe another option is to replace the vtkCommunicator used by the controller under the covers, rather than replacing the global controller. All filters etc. hang on to the controller, which they typically obtain using vtkMultiProcessController::GetGlobalController in the constructor. Maybe a trick would be to replace the communicator underneath. That would require that all callbacks etc. are preserved, so it may be some work too.
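To make the difference between the two approaches concrete, here is a minimal sketch. It uses plain stand-in types (`Communicator`, `Controller`, `Filter`) that only mimic the relevant behavior; they are hypothetical and are not the VTK classes, and the group sizes 4 and 6 are made up for illustration. It only shows why an object that grabbed the global controller in its constructor keeps seeing the old group when the controller is replaced, but sees the new group when the communicator is swapped underneath.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for a communicator: records this rank and the group size.
struct Communicator {
  int rank;
  int size;
};

// Stand-in for a controller that owns a replaceable communicator.
class Controller {
public:
  explicit Controller(std::shared_ptr<Communicator> comm) : Comm(std::move(comm)) {}
  void SetCommunicator(std::shared_ptr<Communicator> comm) { Comm = std::move(comm); }
  int GetNumberOfProcesses() const { return Comm->size; }

  static void SetGlobalController(std::shared_ptr<Controller> c) { Global() = std::move(c); }
  static std::shared_ptr<Controller> GetGlobalController() { return Global(); }

private:
  static std::shared_ptr<Controller>& Global() {
    static std::shared_ptr<Controller> instance;
    return instance;
  }
  std::shared_ptr<Communicator> Comm;
};

// Stand-in for a filter that obtains the global controller once, in its constructor.
struct Filter {
  std::shared_ptr<Controller> Ctrl;
  Filter() : Ctrl(Controller::GetGlobalController()) {}
  int NumberOfProcesses() const { return Ctrl->GetNumberOfProcesses(); }
};

std::shared_ptr<Controller> MakeController(int rank, int size) {
  return std::make_shared<Controller>(std::make_shared<Communicator>(Communicator{rank, size}));
}

// Replace the global controller: the existing filter still holds the old one.
int SizeSeenAfterReplacingController() {
  Controller::SetGlobalController(MakeController(0, 4));
  Filter filter;
  Controller::SetGlobalController(MakeController(0, 6));
  return filter.NumberOfProcesses();
}

// Replace only the communicator: the existing filter sees the new group.
int SizeSeenAfterReplacingCommunicator() {
  Controller::SetGlobalController(MakeController(0, 4));
  Filter filter;
  Controller::GetGlobalController()->SetCommunicator(
    std::make_shared<Communicator>(Communicator{0, 6}));
  return filter.NumberOfProcesses();
}
```

In this toy model the first function reports the stale size and the second reports the updated one. It says nothing about callbacks or other state a real controller carries, which is the part that would still need care.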

We are trying that, but the problem we have is that new processes, when joining, have to initialize their processor and pipeline, and this initialization involves collective communications, so they block in some communication functions, expecting messages to arrive from processes that won’t send them because they have already initialized their processor/pipeline.

Maybe the solution is to have new processes initialize their pipeline using a controller/communicator with just themselves in it, and then replace the communicator with one that includes the remaining processes?

Worth a shot. There are still going to be some tricks to play to make the root node in the new group of processes act as a non-root node once it’s subsumed in the larger group.
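As a rough illustration of the root/non-root concern, the sketch below models a process that initializes with a communicator containing only itself (so it is rank 0 of 1) and is then moved into a larger group. All types here (`Communicator`, `Controller`, `CachingPipeline`, `QueryingPipeline`) are hypothetical stand-ins, not VTK or ParaView classes, and whether any real pipeline component caches its rank at initialization is an assumption made purely for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for a communicator: records this rank and the group size.
struct Communicator {
  int rank;
  int size;
};

// Stand-in for a controller whose communicator can be swapped.
class Controller {
public:
  explicit Controller(std::shared_ptr<Communicator> comm) : Comm(std::move(comm)) {}
  void SetCommunicator(std::shared_ptr<Communicator> comm) { Comm = std::move(comm); }
  int GetLocalProcessId() const { return Comm->rank; }

private:
  std::shared_ptr<Communicator> Comm;
};

// Hypothetical pipeline that decides once, at initialization, whether it is root.
struct CachingPipeline {
  bool IsRoot;
  explicit CachingPipeline(const Controller& c) : IsRoot(c.GetLocalProcessId() == 0) {}
};

// Hypothetical pipeline that asks the controller every time it needs to know.
struct QueryingPipeline {
  const Controller* Ctrl;
  explicit QueryingPipeline(const Controller& c) : Ctrl(&c) {}
  bool IsRoot() const { return Ctrl->GetLocalProcessId() == 0; }
};

struct JoinResult {
  bool cachedSaysRoot;
  bool queriedSaysRoot;
};

// Initialize alone (rank 0 of 1), then join a larger group with a new rank.
JoinResult JoinLargerGroup(int newRank, int newSize) {
  Controller controller(std::make_shared<Communicator>(Communicator{0, 1}));
  CachingPipeline cached(controller);
  QueryingPipeline queried(controller);
  controller.SetCommunicator(std::make_shared<Communicator>(Communicator{newRank, newSize}));
  return {cached.IsRoot, queried.IsRoot()};
}
```

With `JoinLargerGroup(4, 5)`, the caching variant still believes it is root after the swap while the querying variant does not. Any state of the first kind would need to be found and refreshed for the swap to be safe.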