Seg Fault when initializing pipeline


#1

Hey All,

I’ve been integrating Catalyst with our in-house solver, but I’ve run into an issue. Calling pipeline->Initialize(scripts[i]) causes a seg fault. Valgrind pinpoints the failure at line 1013 of “obmalloc.c”, which appears to be part of the Python source code. The offending line is:

if (Py_ADDRESS_IN_RANGE(p, pool)) 

where pool is defined as

pool = POOL_ADDR(p);

But I can’t figure out what could be causing a seg fault. The only argument passed to pipeline->Initialize is the script path, and for testing I’ve hardcoded that to be the path to a single simple script.

Is some object not being created properly elsewhere in the code? Does anyone have any suggestions on where to start looking for a cause here?

Thanks.


(Andy Bauer) #2

Hi,

There’s not a lot of context in your question to help diagnose the issue. I would suggest trying one of the examples in the Examples/Catalyst subdirectory of the source code and see if that works for you. Alternatively, if you can provide a full example of the bad behavior we should be able to help figure out what’s wrong.


#3

Hey Andy,

Good point, I guess I didn’t provide much detail. I have looked at the sample codes, and I wrote a toy code based on them so that I can debug this issue outside our production code. The main function is simply

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

  PopZoneData zone;

  BuildGrid(&zone);
  BuildSolution(&zone);

  POP_INITIALIZE(argc - 1, argv + 1);
  int timeSteps = 1000;
  for (unsigned int timeStep = 0; timeStep < timeSteps; timeStep++)
    {
      double time = timeStep * 0.1;
      incrementSolution(&zone, time);
      int lastTimeStep = 0;
      if (timeStep == (timeSteps - 1))
        {
          lastTimeStep = 1;
        }
      COPROCESS(&zone, 1, time, timeStep, lastTimeStep);
    }
  POP_FINALIZE();

  MPI_Finalize();
}

“zone” is just an overarching structure that contains the grid and solution data for the processor. The code runs into problems immediately when it enters POP_INITIALIZE, which looks like:

vtkCPProcessor* Processor = NULL;
vtkMultiBlockDataSet* VTKGrid;

POP_RC POP_INITIALIZE(int numScripts, char* scripts[])
{
  POP_RC rc = POP_RC_OK;

  if (Processor == NULL)
    {
      Processor = vtkCPProcessor::New();
      Processor->Initialize();
    }
  else
    {
      Processor->RemoveAllPipelines();
    }
  for (int i=0; i < numScripts; i++)
    {
      vtkNew<vtkCPPythonScriptPipeline> pipeline;
      std::cout<<"The " << i <<"th entry in scripts is: " << scripts[i] << std::endl;
      pipeline->Initialize(scripts[i]);
      Processor->AddPipeline(pipeline.GetPointer());
    }
  return rc;
}

After calling pipeline->Initialize it throws the seg fault. As far as I can tell everything in my INITIALIZE is straight out of the sample codes, and “scripts” returns exactly what I would expect.

Edit: forgot to add, here’s the stack output by valgrind for this error

==14311== Invalid read of size 4
==14311== at 0xDDA1423: PyObject_Free (obmalloc.c:1013)
==14311== by 0xDE087AD: compiler_unit_free (compile.c:447)
==14311== by 0xDE11BA0: compiler_exit_scope (compile.c:544)
==14311== by 0xDE11BA0: compiler_mod (compile.c:1213)
==14311== by 0xDE11BA0: PyAST_Compile (compile.c:292)
==14311== by 0xDE26DBE: Py_CompileStringFlags (pythonrun.c:1433)
==14311== by 0xDDFD453: builtin_compile (bltinmodule.c:570)
==14311== by 0xDE075A1: call_function (ceval.c:4350)
==14311== by 0xDE075A1: PyEval_EvalFrameEx (ceval.c:2987)
==14311== by 0xDE081CD: PyEval_EvalCodeEx (ceval.c:3582)
==14311== by 0xDE082E1: PyEval_EvalCode (ceval.c:669)
==14311== by 0xDE26EBB: run_mod (pythonrun.c:1376)
==14311== by 0xDE26EBB: PyRun_StringFlags (pythonrun.c:1339)
==14311== by 0xDE281FF: PyRun_SimpleStringFlags (pythonrun.c:974)
==14311== by 0xD09A8E8: vtkPythonInterpreter::RunSimpleString(char const*) (vtkPythonInterpreter.cxx:355)
==14311== by 0x58A2131: vtkCPPythonScriptPipeline::Initialize(char const*) (vtkCPPythonScriptPipeline.cxx:215)


(Andy Bauer) #4

I don’t see anything suspicious in the code you shared. My suggestion at this point would be to make sure that the Python and system libraries that ParaView Catalyst is built against are consistent with the ones that your simulation code and adaptor are built against. Another thing to check is that the Python libraries are consistent with the Python interpreter. For example, the build could pick up Python 2.7 libraries and header files while also finding a Python 3 interpreter, which would cause hard-to-diagnose issues. In older versions of ParaView this happened surprisingly often, but I haven’t seen it much recently. By the way, what version of ParaView are you using?
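
One way to catch the kind of header/library mismatch described above is to compare the Python version the adaptor was compiled against with the version of the library it actually runs with. The sketch below is only an illustration: `MajorMinor` and `SamePythonSeries` are hypothetical helper names, not part of ParaView or Python. It assumes the two version strings would come from the `PY_VERSION` macro and from `Py_GetVersion()`, both of which `Python.h` provides.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the leading "major.minor" part of a Python
// version string, e.g. "2.7" from "2.7.12" or from
// "2.7.12 (default, ...) [GCC ...]". Returns an empty string if the text
// does not start with digits, a dot, and more digits.
std::string MajorMinor(const std::string& version)
{
  std::size_t i = 0;
  while (i < version.size() && std::isdigit(static_cast<unsigned char>(version[i])))
  {
    ++i;
  }
  if (i == 0 || i >= version.size() || version[i] != '.')
  {
    return "";
  }
  std::size_t j = i + 1;
  while (j < version.size() && std::isdigit(static_cast<unsigned char>(version[j])))
  {
    ++j;
  }
  if (j == i + 1)
  {
    return "";
  }
  return version.substr(0, j);
}

// Hypothetical helper: true when both strings name the same major.minor
// series, e.g. headers from 2.7.x and a runtime library from 2.7.y.
bool SamePythonSeries(const std::string& headerVersion, const std::string& runtimeVersion)
{
  const std::string header = MajorMinor(headerVersion);
  return !header.empty() && header == MajorMinor(runtimeVersion);
}
```

In an adaptor that includes `Python.h`, the check would be `SamePythonSeries(PY_VERSION, Py_GetVersion())`; a `false` result would mean the headers and the loaded library disagree on the major.minor version. The comparison itself makes no Python calls, so it can be tried without linking against any Python at all.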


#5

Hey Andy,

Thanks for the reply. First off, I’m using ParaView 5.4.1.

A library mismatch is plausible. Since posting I’ve done some more tests, including downloading and compiling CxxFullExample verbatim and running it with my build of ParaView. Valgrind reports the same invalid read when initializing the pipeline, which suggests the problem is consistent and not in my adaptor code itself. However, it doesn’t seem to be fatal there, as the code runs to completion regardless and produces sensible outputs. Oddly, when I modify the CxxFullExample code to call my version of initialize instead (but otherwise leave the code unmodified), the seg fault causes a crash, despite the two versions being functionally identical.

“Another thing to check is that the Python libraries are consistent with the Python interpreter.”

I’ve made sure that $PYTHONHOME, as set when compiling our source code and the adaptor, is consistent with the PYTHON variables I gave CMake when compiling ParaView. Can you suggest a way to double-check which Python libraries and interpreter are actually being used?

Thanks again.


(Andy Bauer) #6

When building ParaView you’ll want to check the following CMake options (note they’re probably advanced options):

 PYTHON_EXECUTABLE                /usr/bin/python
 PYTHON_EXTRA_LIBS
 PYTHON_INCLUDE_DIR               /usr/include/python2.7
 PYTHON_LIBRARY                   /usr/lib/x86_64-linux-gnu/libpython2.7.so
 PYTHON_MODULE_mpi4py.MPI_BUILD   ON
 PYTHON_MODULE_mpi4py.dl_BUILD_   ON
 PYTHON_UTIL_LIBRARY              /usr/lib/x86_64-linux-gnu/libutil.so

I’ve included what I have on my workstation. You may also want to try unsetting all Python environment variables such as $PYTHONHOME, and removing any references to Python libraries from your $LD_LIBRARY_PATH, just to be safe.
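
To see what the run-time environment actually contains before unsetting anything, the adaptor could report the relevant variables at startup. This is a minimal sketch under stated assumptions: `SetVariables` and `PathEntriesContaining` are hypothetical helper names, and the choice of which variables to inspect (for example PYTHONHOME, PYTHONPATH and LD_LIBRARY_PATH) is an assumption rather than an exhaustive list.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns "NAME=value" for each listed variable that
// is currently set in the process environment.
std::vector<std::string> SetVariables(const std::vector<std::string>& names)
{
  std::vector<std::string> result;
  for (const std::string& name : names)
  {
    const char* value = std::getenv(name.c_str());
    if (value != nullptr)
    {
      result.push_back(name + "=" + value);
    }
  }
  return result;
}

// Hypothetical helper: splits a colon-separated search path and keeps the
// entries that contain `needle` (the match is case-sensitive).
std::vector<std::string> PathEntriesContaining(const std::string& searchPath,
                                               const std::string& needle)
{
  std::vector<std::string> hits;
  std::istringstream stream(searchPath);
  std::string entry;
  while (std::getline(stream, entry, ':'))
  {
    if (entry.find(needle) != std::string::npos)
    {
      hits.push_back(entry);
    }
  }
  return hits;
}
```

Calling `SetVariables({"PYTHONHOME", "PYTHONPATH"})` and passing the value of LD_LIBRARY_PATH to `PathEntriesContaining` with the needle `"python"` would show which settings could steer the embedded interpreter toward a different installation than the one ParaView was built against. Because the match is case-sensitive, a directory named with a capital "Python" would need a second lookup.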


#7

Unsetting all Python environment variables? I needed to compile with a non-default version of Python for our system, and I found when I tried compiling without $PYTHONHOME set it threw a fit because the Python versions didn’t match.