Error: "Some error in socket processing"

Hi all,

I’m trying to render some OpenFOAM data on AWS and, to simplify things, I’m just using Mesa CPU rendering. I want to check whether I can save an image of my data set while running in parallel across ten nodes with 16 threads apiece. So, I start my pvserver as follows:

#!/bin/bash
#SBATCH -N 8
#SBATCH -n 192
mpi=/shared/opt/ParaView-5.6.0-osmesa-MPI-Linux-64bit/bin/mpiexec
$mpi -n 192 /shared/opt/ParaView-5.6.0-osmesa-MPI-Linux-64bit/bin/pvserver --force-offscreen-rendering --mpi

Then try to run this simple script using pvpython from the same binary download:

from paraview.simple import *
from paraview import servermanager

# connect to server
connection = servermanager.Connect('ip-172-31-0-19', 11111)

# make a sphere source and a render view
sphere = Sphere()
view = servermanager.CreateRenderView()

# save an image of the view
WriteImage('test.png')

Unfortunately, I get this error on the server:

Waiting for client...
Connection URL: cs://ip-172-31-0-19:11111
Accepting connection(s): ip-172-31-0-19:11111
Client connected.
ERROR: In /home/buildslave/dashboards/buildbot/paraview-pvbinsdash-linux-shared-release_osmesa_superbuild/build/superbuild/paraview/src/VTK/Parallel/Core/vtkSocketCommunicator.cxx, line 808
vtkSocketCommunicator (0x1f72a10): Could not receive tag. 1

ERROR: In /home/buildslave/dashboards/buildbot/paraview-pvbinsdash-linux-shared-release_osmesa_superbuild/build/superbuild/paraview/src/ParaViewCore/ClientServerCore/Core/vtkTCPNetworkAccessManager.cxx, line 297
vtkTCPNetworkAccessManager (0x110f110): Some error in socket processing.

What’s up here?

I am getting the exact same error trying to get the NVIDIA IndeX plugin to work. Any dataset I add will cause the following errors on the server side:

( 231.995s) [pvserver ]vtkSocketCommunicator.c:808 ERR| vtkSocketCommunicator (0x55a388dedf10): Could not receive tag. 1

( 231.996s) [pvserver ]vtkTCPNetworkAccessMana:297 ERR| vtkTCPNetworkAccessManager (0x55a388b76f50): Some error in socket processing. Exiting...

and then crash both the server and the ParaView client. Any insight into what’s actually going on here would be enormously appreciated.

This error is highly nonspecific and could be caused by many things.

Please provide precise steps to reproduce.

Thanks for the fast response!

I’m working from the instructions here, specifically the ParaView Server/Running with Docker section:

https://ngc.nvidia.com/catalog/containers/nvidia-hpcvis:paraview-index

I can connect to the Docker container from ParaView, and browse files inside of it, but attempting to visualize any dataset causes the crashes. I’ve tried adding Wavelets, as the instructions suggest, and tried loading custom datasets. Both produce the same two error messages.

My guess is that the issue is with the IndeX plugin and not with ParaView, but I was hoping to get more information about what that error actually means so I can troubleshoot it myself.
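
For readers asking the same question: a message like "Could not receive tag" is consistent with the server trying to read from a socket whose other end has gone away, for example because the client process aborted. That reading is an assumption, not a confirmed diagnosis of this case. A minimal Python sketch of what a receiver sees when its peer closes the connection (the helper name peer_closed is hypothetical and is not ParaView code):

```python
import socket


def peer_closed(sock: socket.socket) -> bool:
    """Return True if a blocking read shows that the peer closed the connection."""
    # Try to read a small fixed-size header, as a tagged protocol might.
    data = sock.recv(4)
    # An empty result from recv() means the other end closed the socket.
    return data == b""


# A local socket pair stands in for the client/server connection.
server_end, client_end = socket.socketpair()
client_end.close()  # simulate the client process going away
closed = peer_closed(server_end)
server_end.close()
print("peer closed:", closed)
```

If this is what happens, the socket errors are a symptom, and the interesting messages are whatever the other side printed just before it went away.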

I have the same issue, in 5.9.0 (both server and client), and have exact steps to reproduce them:

1: on a cluster, start a pvserver. I just use a single thread server which is already difficult enough for now.
2: on my home computer, I set up an ssh tunnel to the compute node to which the pvserver was assigned.
3: I connect my client to the server. I get about 20 errors (that I somehow think I should actually solve first), but the connection is made, as the server log confirms.
4: I open a very simple example vtu file from John Burkardt: https://people.math.sc.edu/Burkardt/data/vtu/ugridex.vtu
5: [apply]
6: I actually see a rendering of the model (and a lot more errors).
7: there are two quantities in the vtu file: csum and nmax. When the selector ‘solid color’ is changed to either of those two, then, even before I click Apply, these two errors appear on the server side, and both server and client crash.

I hope this helps. Best wishes,

Lukas

No issues here.

Errors: please share these.

SSH: please share your ssh tunnel command.

What is your rendering setup? Remote or local? OpenGL? EGL? OSMesa?

Dear Mathieu,

Thank you for your response.

I have to do a walk of shame here…

The reason for these errors was partly discovered:
For the client I used the 5.9.0 prebuilt binaries.
For the server I compiled 5.9.0 manually, because I was dumb enough to actually try and run the prebuilt pvserver with our system mpirun (Open MPI 4.0.4), and not with the mpiexec so helpfully provided with the binaries. This caused the error that is typically associated with the build not being properly parallel:

The first thread is fine, and all other threads show:

Socket error in call to bind. Address already in use.
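
One plausible mechanism, stated here as an assumption rather than something confirmed in this thread: when the launcher and the binary's MPI do not match, every process may start as an independent server and try to listen on the same port, so only the first one succeeds. A minimal Python sketch of that failure mode, independent of ParaView (the helper name try_bind is hypothetical):

```python
import errno
import socket


def try_bind(port: int) -> socket.socket:
    """Bind and listen on 127.0.0.1:port, returning the listening socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("127.0.0.1", port))
        s.listen(1)
    except OSError:
        s.close()
        raise
    return s


# The first "server" gets a free port from the OS (port 0 means "pick one").
first = try_bind(0)
port = first.getsockname()[1]

# A second "server" asking for the same port is refused.
try:
    second = try_bind(port)
    second.close()
    bind_error = None
except OSError as exc:
    bind_error = exc.errno
finally:
    first.close()

print("second bind failed with EADDRINUSE:", bind_error == errno.EADDRINUSE)
```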

To prevent this, I compiled ParaView myself with our system MPI, which eventually enabled me to run a server in parallel with it, but then there are subtle differences that make the two (the prebuilt and my system-MPI built) incompatible.

Because our various users will eventually run ParaView clients on their home machines, I wished to allow them to use the prebuilt binaries, and to prevent them all from having to do a manual compile, but that would only work when the server runs from that same set of binaries as well, which it now does.

Now all I have to do is to give the GPU nodes of the cluster a sufficient X environment so that pvserver can open the render window, which it currently cannot.

The SSH tunnel command was:

ssh -f -L 11111:gpu001:11111 eejit.geo.uu.nl

with the name of the login node being eejit.geo.uu.nl, and the GPU node being gpu001.
That now works, and it is possible to load data and view models, although ‘remote rendering disabled’ until I fix the X environment stuff.
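
Before pointing the client at a tunnel like the one above, it can help to confirm that something is listening on the local end. A minimal Python sketch, assuming the tunnel forwards local port 11111 as in the command above (the helper name port_is_open is hypothetical):

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, port_is_open("localhost", 11111) should return True once the tunnel is up. Note that this only confirms the local end of the tunnel: ssh accepts the local connection even if nothing is listening on gpu001:11111, so a True result does not prove that pvserver itself is reachable.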

We are running this with OpenGL

We wish to eventually use remote rendering, because the models of some of the users can run into the several tens of gigabytes, and while that is easy enough for the cluster, it is too much for their little laptops. :wink:

I will let you know when I have the X stuff set up properly and managed to enable remote rendering.

For completeness’ sake, although no longer really relevant for me: the runtime errors (on the client side) from the version conflict were:


( 10.639s) [paraview ]vtkOSPRayMaterialLibrar:280 ERR| vtkOSPRayMaterialLibrary (0x141469e0): JSON parsing error: * Line 1, Column 1
Syntax error: value, object or array expected.

( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.603s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.607s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.608s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.608s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0x2175950): Could not parse property: CellOffset
( 11.608s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.608s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.609s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0x1517dfa0): Could not parse property: CellOffset
( 11.609s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.609s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.609s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0x2177d50): Could not parse property: CellOffset
( 11.610s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.610s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.610s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0x23ab7a0): Could not parse property: CellOffset
( 11.610s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.611s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.611s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0x2179060): Could not parse property: CellOffset
( 11.611s) [paraview ] vtkPVSessionCore.cxx:372 ERR| vtkPVSessionCore (0x15192ca0): Object type: vtkTextProperty, could not find requested method: “SetCellOffset”
or the method was called with incorrect arguments.

while processing
Message 0 = Invoke
Argument 0 = vtk_object_pointer {vtkTextProperty (0x2175a50)}
Argument 1 = string_value {SetCellOffset}
Argument 2 = int32_value {0}

( 11.611s) [paraview ] vtkPVSessionCore.cxx:373 ERR| vtkPVSessionCore (0x15192ca0): Aborting execution for debugging purposes.
############ ABORT #############
( 11.611s) [paraview ] vtkSIProxy.cxx:621 ERR| vtkSIProxy (0xda7610): Could not parse property: CellOffset
( 11.614s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2175950): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9098
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.614s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x1517dfa0): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9099
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.615s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2177d50): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9100
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.615s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x23ab7a0): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9101
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.615s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2179060): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9102
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.615s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0xda7610): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9103
location: 17
[paraview_protobuf.ProxyState.property] {
name: “FontSize”
value {
type: INT
integer: 12
}
}

( 11.616s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.616s) [paraview ] VisRTXBackend.cxx:42 WARN| VisRTX Error: Unsupported device
( 11.701s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2175950): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9098
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/XLabelProperties”

( 11.701s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x1517dfa0): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9099
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/XTitleProperties”

( 11.702s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2177d50): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9100
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/YLabelProperties”

( 11.702s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x23ab7a0): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9101
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/YTitleProperties”

( 11.702s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0x2179060): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9102
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/ZLabelProperties”

( 11.703s) [paraview ] vtkSIProxy.cxx:302 ERR| vtkSIProxy (0xda7610): Incorrect message received. Missing xml_group and xml_name information.
global_id: 9103
location: 17
[paraview_protobuf.ProxyState.log_name]: “RenderView1/AxesGrid/GridAxes3DActor/ZTitleProperties”

To prevent this, I compiled Paraview myself with our system MPI, which eventually enabled me to run a server in parallel with it, but then there are subtle differences that make the two (the prebuilt and my system-MPI built) incompatible.

You can also use the --system-mpi option with our binaries.

but then there are subtle differences that make the two (the prebuilt and my system-MPI built) incompatible.
Because our various users will eventually run Paraview clients on their home machines, I wished to allow them to use the prebuilt binaries, and to prevent them all from having to do a manual compile, but that would only work when the server runs from that same set of binaries as well, which it now does.

If that is so, this is a bug. The only requirement is that the ParaView version is strictly the same.
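
A quick way to apply that rule is to compare the version strings reported by the client and the server before connecting. A minimal Python sketch, assuming each side's version can be obtained as text containing a major.minor.patch number (the helper names and the example strings are illustrative, not ParaView API):

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the first major.minor.patch triple found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def same_version(client_text: str, server_text: str) -> bool:
    """Return True only if both sides report exactly the same version."""
    return parse_version(client_text) == parse_version(server_text)


# Example strings only; substitute whatever your client and server report.
print(same_version("paraview version 5.9.0", "paraview version 5.9.0"))
print(same_version("paraview version 5.9.0", "paraview version 5.9.1"))
```

This checks the version number only. It says nothing about build options such as the MPI implementation or rendering backend.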

although ‘remote rendering disabled’ until I fixed the X environment stuff.

We are running this with OpenGL

We wish to eventually use remote rendering, because the models of some of the users can run into the several tens of gigabytes, and while that is easy enough for the cluster, it is too much for their little laptops.

Yes, you need to use remote rendering in that case.