How does Catalyst co-processing render the image in distributed cases?

Hello everyone,

There is a working example that uses Catalyst to render the data:
(https://github.com/mdorier/MandelbulbCatalystExample/blob/master/src/Mandelbulb.cpp#L44 )

I was able to run this example and generate the image in the distributed case (the name of the Python script is a parameter of the program):
(https://github.com/mdorier/MandelbulbCatalystExample/blob/master/scripts/render.py)

I am curious how Catalyst renders the image when there are multiple processes. It seems that every MPI process generates its own partition of the data, yet only one image of the global view is produced at each time step, even when I run the program with multiple processes.

I still have not figured out how Catalyst does this. I am not sure whether the data is reduced onto one process somewhere and then rendered by that process. If there is communication between the different processes, how does Catalyst implement it in this example?

Thanks a lot for your help!

To add some precision about what Zhe is doing: we wanted to (1) check whether we could provide our own implementation of a VTK communicator and (2) check which collective communication functions would be called when using some Catalyst scripts.

We have copied the vtkMPICommunicator and vtkMPIController implementations from the VTK source, renamed them MochiCommunicator and MochiController (replacing “vtkMPI” with “Mochi” everywhere in the source), and then in our Catalyst adaptor we do the following before creating the vtkCPProcessor:

MochiCommunicator *communicator = MochiCommunicator::New();
MochiController *controller = MochiController::New();
controller->SetCommunicator(communicator);
controller->Initialize(nullptr, nullptr, 1);
vtkMultiProcessController::SetGlobalController(controller);

Each function in these two classes has a print statement so we could see which ones are called.
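
For illustration, the instrumentation in each copied method is roughly as follows (SendVoidArray is just one example of the communication virtuals that vtkMPICommunicator implements; the body is the one copied from the VTK source):

#include <iostream>
#include "MochiCommunicator.h" // our renamed copy of vtkMPICommunicator.h

int MochiCommunicator::SendVoidArray(const void* data, vtkIdType length, int type,
                                     int remoteProcessId, int tag)
{
  // Print statement added at the top of every copied method so we can see
  // which entry points Catalyst actually exercises during co-processing.
  std::cout << "[MochiCommunicator] SendVoidArray" << std::endl;

  // ... body copied verbatim from vtkMPICommunicator::SendVoidArray ...
  return 1; // placeholder for the copied implementation's return value
}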

The Python script that Zhe is using does an isocontour rendering of a Mandelbulb fractal. When distributed across multiple processes, each process computes a region of this fractal.

What we noticed is that we do see calls to the functions that initialize the controller and communicator, as well as calls to functions such as Duplicate, all of which correspond to our initialization of the controller and communicator. However, during the actual co-processing we don't see any calls to communication functions such as sends and receives. Our guess is that either (1) the rendering algorithm doesn't care and uses MPI functions directly, or (2, more likely) somewhere down the line something (maybe the Python script itself?) resets the global controller to an MPI one, overriding the one we have installed.

The compositing is done using IceT. IceT doesn't use the VTK communicator, but uses MPI directly. @Kenneth_Moreland, any suggestions on how IceT's MPI calls could be routed through some mechanism other than MPI?

Actually we tracked down the address of the global controller before and after co-processing, and we can see it changing, so it seems VTK is replacing the global controller with another one (probably an MPI one).

Regarding IceT: is there any way of not relying on IceT for compositing? We are trying to implement a non-MPI communicator so it’s going to be pretty limiting if IceT forces us to have MPI.

vtkCPProcessor::Initialize sets up the global controller during initialization.

Only vtkCPProcessor::Initialize(vtkMPICommunicatorOpaqueComm& comm, const char* workingDirectory) does, but we are not calling this version. We are calling the version that only takes a workingDirectory argument.

EDIT: after checking, we can confirm that Initialize (the version we call) does indeed modify the controller. Should we set the global controller after creating the vtkCPProcessor?

Update: calling SetGlobalController after initializing the processor works; we can now see functions from our MochiCommunicator being called during co-processing.
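
For reference, a minimal sketch of the ordering that works (the wrapping function name is ours for illustration; the Mochi classes are the renamed copies described above):

#include <vtkCPProcessor.h>
#include <vtkMultiProcessController.h>
#include "MochiCommunicator.h"
#include "MochiController.h"

void InitializeAdaptor()
{
  // Initialize Catalyst first: this call replaces the global controller
  // internally, which is what was overriding our controller before.
  vtkCPProcessor* processor = vtkCPProcessor::New();
  processor->Initialize();

  // Install our controller afterwards so it is the one in effect during
  // co-processing.
  MochiCommunicator* communicator = MochiCommunicator::New();
  MochiController* controller = MochiController::New();
  controller->SetCommunicator(communicator);
  controller->Initialize(nullptr, nullptr, 1);
  vtkMultiProcessController::SetGlobalController(controller);
}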

It is possible to replace the IceT communicator to use something other than MPI. IceT has its own abstraction for a communicator, and you can write one that does not use MPI.

The reason why the one in ParaView/Catalyst uses MPI directly rather than vtkCommunicator is that IceT requires asynchronous sends and receives (i.e. ISend and IRecv) and vtkCommunicator does not support these.

If MochiCommunicator supports asynchronous sends/receives, it should be possible to have the IceT library use it.
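
For reference, the stock MPI-backed setup looks roughly like this (the pattern from the IceT tutorials); a non-MPI backend would supply its own IceTCommunicator at the point where icetCreateMPICommunicator is called:

#include <mpi.h>
#include <IceT.h>
#include <IceTMPI.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  // IceT's communicator abstraction is the opaque IceTCommunicator handle.
  // The stock backend wraps an MPI communicator.
  IceTCommunicator comm = icetCreateMPICommunicator(MPI_COMM_WORLD);
  IceTContext context = icetCreateContext(comm);

  // The context keeps its own reference to the communicator, so the handle
  // created above can be released right away.
  icetDestroyMPICommunicator(comm);

  // ... compositing work goes here ...

  icetDestroyContext(context);
  MPI_Finalize();
  return 0;
}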

Yes, MochiCommunicator will support non-blocking send/recv. I was looking at IceT just now, and I was actually thinking that I could write an IceT communicator that relies on vtkCommunicator, so that any non-MPI implementation (not just ours) would be supported. It's good that you let me know about the need for non-blocking functions.

Would it be useful if I contributed such an IceT communicator anyway? Maybe you could add a member function to vtkCommunicator to query which features (such as non-blocking send/recv) are supported by the actual implementation. Then, wherever IceT is used, instead of downcasting to a vtkMPICommunicator, you would first check whether the vtkCommunicator has the right features and use a VTK-based IceT communicator implementation instead?
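
Something along these lines, sketched with stand-in classes so it compiles on its own (SupportsNonBlockingCommunication is a made-up name, not existing VTK API):

#include <iostream>

class Communicator // stand-in for vtkCommunicator
{
public:
  virtual ~Communicator() = default;
  // Proposed feature query: implementations that provide non-blocking
  // send/recv would override this to return true.
  virtual bool SupportsNonBlockingCommunication() const { return false; }
};

class MochiLikeCommunicator : public Communicator // stand-in for MochiCommunicator
{
public:
  bool SupportsNonBlockingCommunication() const override { return true; }
};

// Stand-in for the place that currently downcasts to vtkMPICommunicator
// before handing things to IceT.
void ChooseIceTBackend(const Communicator& comm)
{
  if (comm.SupportsNonBlockingCommunication())
  {
    std::cout << "use a vtkCommunicator-based IceT communicator\n";
  }
  else
  {
    std::cout << "fall back to the MPI communicator that ships with IceT\n";
  }
}

int main()
{
  MochiLikeCommunicator comm;
  ChooseIceTBackend(comm); // picks the VTK-based path
  return 0;
}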

(The alternative is that we write an IceT communicator specifically for our library, and we would have to patch VTK anyway to prevent it from downcasting to MPI).

EDIT: I actually see non-blocking send/recv implemented in the vtkMPICommunicator, here…?
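
For example, a rough usage sketch of what I see there (exact overloads may differ between VTK versions):

#include <vtkMPICommunicator.h>

void PingPong(vtkMPICommunicator* comm, int rank)
{
  // vtkMPICommunicator exposes non-blocking calls through NoBlockSend /
  // NoBlockReceive plus a Request object.
  vtkMPICommunicator::Request request;
  int value = 42;
  if (rank == 0)
  {
    comm->NoBlockSend(&value, 1, /*remoteProcessId=*/1, /*tag=*/99, request);
    request.Wait(); // complete the non-blocking send
  }
  else if (rank == 1)
  {
    int received = 0;
    comm->NoBlockReceive(&received, 1, /*remoteProcessId=*/0, /*tag=*/99, request);
    request.Wait(); // complete the non-blocking receive
  }
}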

If you added support for non-blocking communication to the generic vtkCommunicator superclass, then I think it would make a lot of sense for IceT to just use that within ParaView/Catalyst. So, yes, that would be a useful contribution.

As an aside, yes, vtkMPICommunicator supports asynchronous communication, but its superclass does not. Since that only works with MPI anyway, I did not feel that building a custom IceT communicator to use vtkMPICommunicator was any more useful than just using the MPI communicator that comes with IceT.

Ok thanks!

I also see sendrecv and alltoall, as well as getting the rank and size of the communicator (this information is protected, without accessor member functions).