Segmentation fault when building with VTK-m

Hello!

I’m trying to build ParaView with VTK-m enabled, but all binaries (paraview, pvpython, and the servers) crash with SIGSEGV (segmentation fault) at startup, during initialization, before any message, shell, or window appears.
Without VTK-m but with CUDA/OptiX the binaries run fine. Other CUDA apps that I build also run fine, so I think my CUDA install is alright.

I tried changing VTKm_CUDA_Architecture from “native” to “pascal” (I’m using a 1060 card here), but both give the SIGSEGV. I also tried enabling CUDA_SEPARABLE_COMPILATION, with no success, and building with Debug enabled, but that did not improve the backtrace.

I found a mention of a similar error in a CMake merge request (https://gitlab.kitware.com/cmake/cmake/merge_requests/457) but don’t know if it’s related. Maybe I need to manually set CMAKE_CUDA_FLAGS to something like “-gencode arch=compute_61,code=sm_61” (I use this in my Ph.D. project with success).
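
A minimal sketch of how such a flag might be passed at configure time. The source directory is a placeholder rather than a path from this thread, and it is an assumption that the VTK-m part of the ParaView build honors CMAKE_CUDA_FLAGS. The script only prints the configure command so it can be inspected before being run.

```shell
#!/bin/sh
# Sketch only: SRC_DIR is a hypothetical placeholder for the ParaView source tree.
SRC_DIR="../paraview"

# Compute capability 6.1 corresponds to the GTX 1060 (Pascal).
SM_ARCH="61"
CUDA_FLAGS="-gencode arch=compute_${SM_ARCH},code=sm_${SM_ARCH}"

# Print the configure command for inspection instead of running it.
printf 'cmake -DPARAVIEW_USE_CUDA=ON -DCMAKE_CUDA_FLAGS="%s" %s\n' "$CUDA_FLAGS" "$SRC_DIR"
```

For a different card, only SM_ARCH should need to change, since both the compute_ and sm_ parts of the flag are derived from it.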

I’m using CMake 3.14.6, CUDA 10.0.130 (with GCC 7.4.0), GCC 9.2.0 for C++ code, and NVIDIA driver 440.31. If more information is needed, please let me know.

Can anyone point me in the right direction to fix this?

The debug backtrace from cuda-gdb with CMAKE_BUILD_TYPE=Debug is:

0x00007ffff359139a in __cudaRegisterLinkedBinary(__fatBinC_Wrapper_t const*, void (*)(void**), void*) ()
   from /root/src/paraview_build/bin/../lib64/libvtkPVServerManagerApplication-pv5.7.so.1
(cuda-gdb) bt
#0  0x00007ffff359139a in __cudaRegisterLinkedBinary(__fatBinC_Wrapper_t const*, void (*)(void**), void*) ()
   from /root/src/paraview_build/bin/../lib64/libvtkPVServerManagerApplication-pv5.7.so.1
#1  0x00007ffff3590380 in __cudaRegisterLinkedBinary_48_tmpxft_0000622a_00000000_6_ClipWithField_cpp1_ii_7724b337 ()
   from /root/src/paraview_build/bin/../lib64/libvtkPVServerManagerApplication-pv5.7.so.1
#2  0x00007ffff421670d in __sti____cudaRegisterAll ()
   from /root/src/paraview_build/bin/../lib64/libvtkPVServerManagerApplication-pv5.7.so.1
#3  0x00007ffff7fe318a in _dl_rtld_di_serinfo () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7fe3286 in _dl_rtld_di_serinfo () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ffff7fd50ca in ?? () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000001 in ?? ()
#7  0x00007fffffffe28b in ?? ()
#8  0x0000000000000000 in ?? ()

Thanks for your time!

@robert.maynard @allison.vacanti

Does this occur if you have VTK-m enabled, and OptiX disabled?

Edit:
Follow-up: are you building ParaView master?

I’ve tried a clean build with only VTK-m enabled and it worked fine, even with PARAVIEW_USE_CUDA=ON. Then just enabling ray tracing and VisRTX also worked fine, with no SIGSEGV.

I’m using master @697b01a200.

It may be related to some flag I enabled in the previous build, or to the fact that I didn’t “make clean” between builds; I really don’t know.
I will try to reproduce the problem here.

This sounds like the classic issue where two dynamic libraries are both using static CUDA and therefore end up with multiple duplicate symbols.

I think the quickest solution is to add -Xlinker muldefs to your CMAKE_CUDA_FLAGS and see if that builds properly.