We are using ParaView (5.6.0) to visualize CFD data remotely. Our workflow is to start a pvserver on the remote machine and then connect to it from the ParaView client.
What we discovered recently (by accident) is the following:
If UserA opens a port with pvserver and UserB (un-)intentionally connects with his Paraview to the same port, he is granted access to the files of UserA, i.e. when browsing files to open (File->Open) UserB starts in the remote-folder of UserA.
Is this behavior known?
Is there anything we can do to prevent such a behavior (either on the pvserver or the network/ssh side)?
Otherwise it will be too much of a security risk for us to use the pvserver, if we expose data to other users.
I hope someone can clarify, what is happening here.
Yes, when a pvserver waits on a socket connection, it does not perform authentication by default. If you are worried about accidental (or adversarial) connections, you should use the --connect-id option. You launch pvserver with the --connect-id option and give it a random number that only the appropriate client knows. During the handshake the client has to supply the same number or the connection is refused. This makes accidental connections nearly impossible and adversarial connections difficult.
Another option you might consider is the --reverse-connection option. When this mode is on, the pvserver connects back to the client GUI rather than the other way around. Assuming your users are running ParaView on their local desktops, it makes inappropriate connections much less likely.
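A minimal sketch of how the two launch modes above could be scripted. It only builds the pvserver command lines; it does not start anything. The function names, the 9-digit range for the id, and the host name desktop.example.com are illustrative assumptions. The flags themselves (--connect-id, --reverse-connection, --client-host, --server-port) are regular pvserver options, but check pvserver --help for your version.

```python
import secrets
import shlex


def new_connect_id() -> int:
    """Draw a 9-digit integer from the OS random source (range is an arbitrary choice)."""
    return secrets.randbelow(900_000_000) + 100_000_000


def forward_command(connect_id: int, port: int = 11111) -> list:
    """Server listens on `port` and waits for the client to connect."""
    return ["pvserver", f"--server-port={port}", f"--connect-id={connect_id}"]


def reverse_command(connect_id: int, client_host: str, port: int = 11111) -> list:
    """Server connects back to a client that is already listening on `port`."""
    return [
        "pvserver",
        "--reverse-connection",
        f"--client-host={client_host}",
        f"--server-port={port}",
        f"--connect-id={connect_id}",
    ]


if __name__ == "__main__":
    cid = new_connect_id()
    # desktop.example.com is a placeholder for the machine running the ParaView GUI.
    for cmd in (forward_command(cid), reverse_command(cid, "desktop.example.com")):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The client has to be given the same connect id, otherwise the handshake fails as described above.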
Thank you for the quick answer.
Just to be clear: I would start the server with pvserver --connect-id=<random number>?
Then, on the remote “cluster”, anyone who uses e.g. “top” to monitor processes could easily see the connect-id and then connect to the pvserver, or am I mistaken here?
Yes, since it is a command line option, anyone with login access to the server will be able to query the command line and see the “secret” code. If you are only worried about accidental connections, that should not be an issue. However, if you are worried about adversaries that have access to the server connecting, that would be an issue. Most HPC systems I have seen restrict login access to service nodes to prevent problems of this nature, but this might not be an option for you.
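To make the exposure concrete, here is a small sketch of what "query the command line" means on a typical Linux system, where /proc/<pid>/cmdline is usually readable by other users. It is an illustration only: the child process is a sleeping Python interpreter standing in for pvserver, and read_cmdline and demo are hypothetical helper names. On systems without /proc the helper simply returns None.

```python
import subprocess
import sys
from pathlib import Path


def read_cmdline(pid: int):
    """Return the argument list of `pid` as exposed by Linux /proc, or None if unavailable."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        return None
    # Arguments are separated by NUL bytes.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def demo():
    """Start a stand-in process with a 'secret' argument and read it back from /proc."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)", "--connect-id=123456"]
    )
    try:
        return read_cmdline(child.pid)
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(demo())
```

Tools such as ps and top show the same information, which is why a secret passed as a command line argument only protects against accidental connections.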
A way around the problem is to use the rather obscure ParaView feature of placing command line arguments in a pvx file. A pvx file is a simple XML file that can store, among other configuration, command line arguments for the server program. As long as you appropriately protect the file, an adversary will not be able to read your secret connect id.
As an example, let’s say you create a file named secret_args.pvx with the following contents.
You can then launch the server with pvserver secret_args.pvx. It will set the connect-id to 123456, and no one will be able to get that information through ps.
The tricky part is creating this file. It is most secure to generate a new connect-id every time you launch the server, which means you will have to rebuild this file each time. Building a file like this with a shell script is easy enough, but then where does the shell script get the number from? Your best bet is probably to generate the file on the client side and then scp it over to the server. (Doing this is left as an exercise for the reader.)
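One way the exercise could be approached on the client side is sketched below. It writes a pvx file with a fresh random connect id and owner-only permissions, ready to be copied to the server. Important assumptions: the XML layout in PVX_TEMPLATE (a Process element of type "server" containing an Option element) is a reconstruction of the pvx format and is not taken from this thread, so verify it against the ParaView documentation for your version. The function name write_secret_pvx and the 9-digit id range are also illustrative.

```python
import os
import secrets
from pathlib import Path

# ASSUMPTION: reconstructed pvx layout; verify against the ParaView documentation.
PVX_TEMPLATE = """<?xml version="1.0" ?>
<pvx>
  <Process Type="server">
    <Option Name="connect-id" Value="{connect_id}" />
  </Process>
</pvx>
"""


def write_secret_pvx(path, connect_id=None) -> int:
    """Write a pvx file readable only by its owner and return the connect id used."""
    if connect_id is None:
        connect_id = secrets.randbelow(900_000_000) + 100_000_000
    path = Path(path)
    # Remove any stale copy so the file is created fresh with the requested mode.
    try:
        path.unlink()
    except FileNotFoundError:
        pass
    # O_EXCL makes creation fail if someone else creates the file in the meantime.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(PVX_TEMPLATE.format(connect_id=connect_id))
    return connect_id


if __name__ == "__main__":
    cid = write_secret_pvx("secret_args.pvx")
    print("connect id for the client:", cid)
```

After copying the file to the server (for example with scp, which should keep it out of other users' reach if the target directory is private), the server is started with pvserver secret_args.pvx and the client is given the same id.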
To reiterate what Ken has said, there are really three levels of security.
1. Use a reverse connection: start the client first, then the server. You KNOW where the client is, and the client's socket is tied up as soon as the connection starts. Thus, no one else can be on the receiving side (unless they had already started a client that was waiting).
2. Use --connect-id. Again, if you are reverse connecting, only you know the connect id when you tie up the client-side port. Thus, no one else can get at your data.
3. Once the connection is made, you move off port 11111 onto your own unique port.
With regard to not encrypting, and agreeing with Ken, I assume that the network you are on is appropriate for your data. If you are on an open network with access to the internet, you have lots of security issues above and beyond just the ParaView connection.
@Kenneth_Moreland: I may be mistaken, but if an adversary has access to the computer where the server is running (as in your connect-id example), can’t they also access the data after it has been decrypted by ssh?
I don’t consider myself an expert in security, so take my statements with a grain of salt. It’s my understanding that the real risk of sending data unencrypted over a network (particularly a WAN), is that it is impossible to control what equipment the data go through, so someone could be snooping the data at any point in the network.
Even on the local machine, pvserver is pretty vulnerable before the socket connection is made, which is the whole point of the connect-id conversation we have been having on this thread. However, once the connection is made (let’s say local process to local process for the purpose of this discussion) then the OS itself should protect your program’s memory and the memory that goes between the two processes that you own from anyone with a different account (assuming they don’t have root access). If an adversary has managed to breach the OS’s protections to get to this information, then they already “own” your computer, and getting access to a pvserver is the least of your problems.
First of all: Thank you for all the answers!
Just to clarify the network “setup” that I was referring to:
We (our company) have an internal network with ssh access to a remote cluster (where the pvserver will be running). Only employees of the company (should) have access to that cluster.
So the connection itself should be safe (no “third party” intervening or “owning” our computers).
As far as I understand, the only vulnerability is the window in which I have started a pvserver but have not yet connected to it with my local ParaView client. Otherwise no one should be able to interfere.
We were just uncertain, if the pvserver itself poses any threat to the user’s data.
Maybe I can propose the connect-id and/or remote-connection solution to our IT security team and the head of our department.
Remote connection, where ParaView itself runs on the server, would force you to use Mesa rendering, limiting your rendering capabilities. I would not recommend it for large datasets or advanced rendering techniques.
Mathieu, I’m being stupid. I don’t think that St Ruoff said anything about running paraview client on the clusters.
Even if he did, it wouldn’t matter, assuming he was using a remote server with mesa. The mesa server does all of the rendering, and just pushes images back to the client (for datasets of any reasonable size).