Catalyst deadlock with AMR data

Hi all, I’m attempting to run Catalyst with an AMR dataset. In this run I’m rendering 10 isosurfaces with a custom opacity map. I’ve found that with non-trivial numbers of MPI ranks (e.g. 256, 512, 1024) Catalyst deadlocks consistently. A stack trace follows.

Is this issue on the radar? Would it be worth filing a bug report?

<$ps>: =========================================================
<$ps>: Process id 20197 Caught SIGTERM
<$ps>: Program Stack:
<$ps>: WARNING: The stack trace will not use advanced capabilities because this is a release build.
<$ps>: 0x2aaacb834c10 : ??? [(???) ???:-1]
<$ps>: 0x2aaaabe91da4 : MPIDI_Cray_shared_mem_coll_bcast [(libmpich_gnu_51.so.3) ???:-1]
<$ps>: 0x2aaaabea3967 : MPIR_CRAY_Barrier [(libmpich_gnu_51.so.3) ???:-1]
<$ps>: 0x2aaaabdb4af3 : MPIR_Barrier_impl [(libmpich_gnu_51.so.3) ???:-1]
<$ps>: 0x2aaaabdb55b1 : MPI_Barrier [(libmpich_gnu_51.so.3) ???:-1]
<$ps>: 0x2aaab3d3e57e : vtkMPICommunicator::AllReduceVoidArray(void const*, void*, long long, int, int) [(libvtkParallelMPI-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac545bb81 : vtkKdTreeManager::AddDataSetToKdTree(vtkDataSet*) [(libvtkPVVTKExtensionsRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac545bf6b : vtkKdTreeManager::AddDataObjectToKdTree(vtkDataObject*) [(libvtkPVVTKExtensionsRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac545cdc5 : vtkKdTreeManager::GenerateKdTree() [(libvtkPVVTKExtensionsRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac815111d : vtkPVDataDeliveryManager::RedistributeDataForOrderedCompositing(bool) [(libvtkPVClientServerCoreRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac81aa19f : vtkPVRenderView::Render(bool, bool) [(libvtkPVClientServerCoreRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac81a3fca : vtkPVRenderView::StillRender() [(libvtkPVClientServerCoreRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab53bad40 : vtkPVRenderViewCommand(vtkClientServerInterpreter*, vtkObjectBase*, char const*, vtkClientServerStream const&, vtkClientServerStream&, void*) [(libvtkPVServerManagerApplication-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab302efde : vtkClientServerInterpreter::CallCommandFunction(char const*, vtkObjectBase*, char const*, vtkClientServerStream const&, vtkClientServerStream&) [(libvtkClientServer-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab302f7fa : vtkClientServerInterpreter::ProcessCommandInvoke(vtkClientServerStream const&, int) [(libvtkClientServer-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab303002e : vtkClientServerInterpreter::ProcessOneMessage(vtkClientServerStream const&, int) [(libvtkClientServer-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab303043d : vtkClientServerInterpreter::ProcessStream(vtkClientServerStream const&) [(libvtkClientServer-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaaeee49a2 : vtkPVSessionCore::ExecuteStreamInternal(vtkClientServerStream const&, bool) [(libvtkPVServerImplementationCore-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaaeee47c2 : vtkPVSessionCore::ExecuteStream(unsigned int, vtkClientServerStream const&, bool) [(libvtkPVServerImplementationCore-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaaeee2ef5 : vtkPVSessionBase::ExecuteStream(unsigned int, vtkClientServerStream const&, bool) [(libvtkPVServerImplementationCore-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac7933cfa : vtkSMViewProxy::StillRender() [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac78e91dc : vtkSMRenderViewProxy::RenderForImageCapture() [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac79364d2 : vtkSMViewProxy::CaptureWindowInternal(int, int) [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac7933f75 : vtkSMViewProxy::CaptureWindowSingle(int, int) [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac7934cc8 : vtkSMViewProxy::CaptureWindow(int, int) [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac78fcc32 : vtkSMSaveScreenshotProxy::vtkStateView::CaptureImage() [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac78f89b8 : vtkSMSaveScreenshotProxy::CaptureImage() [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaac78fc298 : vtkSMSaveScreenshotProxy::WriteImage(char const*) [(libvtkPVServerManagerRendering-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaee33310d : ??? [(???) ???:-1]
<$ps>: 0x2aaaae034ce6 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae034539 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae034539 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaadfabfcd : ??? [(???) ???:-1]
<$ps>: 0x2aaaadf7c4f3 : PyObject_Call [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae030075 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae034539 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae034539 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae03655c : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae037f30 : PyEval_EvalCodeEx [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae038349 : PyEval_EvalCode [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae05a376 : PyRun_StringFlags [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaae05cb9b : PyRun_SimpleStringFlags [(libpython2.7.so.1.0) ???:-1]
<$ps>: 0x2aaaad6cc551 : vtkPythonInterpreter::RunSimpleString(char const*) [(libvtkPythonInterpreter-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaae3498c2 : vtkCPPythonScriptPipeline::CoProcess(vtkCPDataDescription*) [(libvtkPVPythonCatalyst-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaab5c975fb : vtkCPProcessor::CoProcess(vtkCPDataDescription*) [(libvtkPVCatalyst-pv5.5.so.1) ???:-1]
<$ps>: 0xade63c : ??? [(???) ???:-1]
<$ps>: 0xaa3ed0 : ??? [(???) ???:-1]
<$ps>: 0x632177 : ??? [(???) ???:-1]
<$ps>: 0x5870de : ??? [(???) ???:-1]
<$ps>: 0x58a71c : ??? [(???) ???:-1]
<$ps>: 0x422b43 : ??? [(???) ???:-1]
<$ps>: 0x2aaacc002725 : __libc_start_main [(libc.so.6) ???:-1]
<$ps>: 0x4290b9 : ??? [(???) ???:-1]
<$ps>: =========================================================

Hi Burlen,

Can you try turning off ordered compositing to see if that solves the deadlock? If it does, then it narrows down the problem to data layout/redistribution. I think the “Depth Peeling” and “Depth Peeling for Volumes” settings in the Rendering tab of user preferences in ParaView control ordered compositing. Hopefully those properties will show up in the trace generated for Catalyst.
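For readers unfamiliar with how a frame like `MPI_Barrier` under `vtkKdTreeManager::AddDataSetToKdTree` can hang, here is a minimal sketch of one generic failure mode: ranks that disagree on how many collective calls to make. This is only a hypothesis for illustration; nothing in this thread confirms it is the cause here. The sketch uses Python threads and `threading.Barrier` as a stand-in for MPI ranks and `MPI_Barrier`, and the function name `run_ranks` and the per-rank block counts are made up for the example.

```python
import threading

def run_ranks(blocks_per_rank, timeout=1.0):
    """Simulate one collective call per local block; return the ranks that got stuck.

    blocks_per_rank[i] is the (hypothetical) number of local blocks on rank i.
    """
    barrier = threading.Barrier(len(blocks_per_rank))  # stand-in for MPI_Barrier
    stuck = []
    lock = threading.Lock()

    def rank_main(rank, n_blocks):
        try:
            for _ in range(n_blocks):
                # Every rank must reach the collective the same number of times.
                barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # The other ranks never arrived; real MPI would block forever here.
            with lock:
                stuck.append(rank)

    threads = [
        threading.Thread(target=rank_main, args=(rank, n))
        for rank, n in enumerate(blocks_per_rank)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stuck)

# Equal block counts: every collective is matched on all ranks.
print(run_ranks([2, 2, 2]))
# One rank has an extra block: its last collective is never matched.
print(run_ranks([1, 1, 2], timeout=0.3))
```

The timeout exists only so the simulation terminates; an actual MPI barrier has no timeout, which is why the run has to be killed to obtain a stack trace.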

I’ll do you one better: here is a dataset that can be used to reproduce the issue in the GUI.

Steps to reproduce:

  1. Untar the dataset below.
  2. Run ParaView in client-server mode with 256 MPI ranks.
  3. Load the state file below and point it to the dataset.
  4. In the LUT editor, enable opacity mapping.

The Catalyst runs were made with 5.5.2 and didn’t deadlock until the third timestep. The steps to reproduce above were tested with 5.5.1 on NERSC’s Cori, where the deadlock occurs right away. The stack trace is pretty much identical.
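To check that two traces really do match, it can help to strip the addresses and compare only the symbol names. Below is a minimal sketch that parses lines in the `<$ps>: ADDRESS : SYMBOL [(LIBRARY) ...]` format shown in the trace above. The helper names (`parse_frames`, `native_frames`, `same_call_chain`) are made up for this example and are not part of ParaView or VTK.

```python
import re

# Matches frames of the form:
#   <$ps>: 0x2aaaabdb55b1 : MPI_Barrier [(libmpich_gnu_51.so.3) ???:-1]
FRAME_RE = re.compile(r"^<\$ps>: (0x[0-9a-f]+) : (.+) \[\((.+?)\) .*\]$")

def parse_frames(trace_text):
    """Return (function name, library) pairs for resolved frames, innermost first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # banner, warning, or separator line
        _address, symbol, library = match.groups()
        if symbol == "???":
            continue  # unresolved frame
        # Drop the argument list so only the function name remains.
        frames.append((symbol.split("(", 1)[0], library))
    return frames

def native_frames(frames):
    """Drop Python interpreter frames, which add little to the comparison."""
    return [(name, lib) for name, lib in frames if not lib.startswith("libpython")]

def same_call_chain(trace_a, trace_b):
    """True if both traces resolve to the same sequence of function names."""
    names_a = [name for name, _ in native_frames(parse_frames(trace_a))]
    names_b = [name for name, _ in native_frames(parse_frames(trace_b))]
    return names_a == names_b

sample = """\
<$ps>: Program Stack:
<$ps>: 0x2aaacb834c10 : ??? [(???) ???:-1]
<$ps>: 0x2aaaabdb55b1 : MPI_Barrier [(libmpich_gnu_51.so.3) ???:-1]
<$ps>: 0x2aaab3d3e57e : vtkMPICommunicator::AllReduceVoidArray(void const*, void*, long long, int, int) [(libvtkParallelMPI-pv5.5.so.1) ???:-1]
<$ps>: 0x2aaaae034ce6 : PyEval_EvalFrameEx [(libpython2.7.so.1.0) ???:-1]
"""

for name, lib in native_frames(parse_frames(sample)):
    print(name, "in", lib)
```

Comparing names rather than raw lines means differences in load addresses between runs do not count as mismatches.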

link to the dataset
https://drive.google.com/open?id=18vnoP_Nm9uaWCV2F8m_AISksXIszJN17

link to the pvsm
https://drive.google.com/open?id=10nvlbQSn98dl4dcQZxvvVjcUnAer4yhS