Proposal: Catalyst V2 Bidirectional API

Francois_Mazen · May 20, 2021, 12:09pm

The purpose of this post is to propose an API for driving a background simulation from the ParaView graphical interface (GUI). The user wants to access arbitrary simulation parameters, tweak them, and send them back to the simulation engine in order to change the way the simulation is running. Ultimately, the user should be able to stop the simulation engine.

The Catalyst API already defines and implements a way to get data directly from the simulation engine and to run a specific visualization pipeline on it. A proof-of-concept from @utkarsh.ayachit shows that it is possible to send information back to the simulation engine. This experiment is based on the first version of the Catalyst API, through the Sensei abstraction.

In this proposal, we want to use the ongoing V2 version of Catalyst to implement a bidirectional API.

The Catalyst V2 API defines an abstract way to describe the data via a conduit node. For each time step, the user simply calls the catalyst_execute method to process its data with a custom pipeline previously declared in the catalyst_initialize method.

To extend this behavior for bidirectional communication, the catalyst_execute method should handle the simulation parameters that the user wants to visualize and change interactively. The user would use a dedicated key in the node dictionary like catalyst/driving_parameters.

ParaView needs a mesh to associate the parameters that the user wants to display in a pipeline. Unless it’s a limitation, the content of the catalyst/driving_parameters key should not mention this mesh but just the parameters and their associated values.
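As a rough illustration of the layout described above, the sketch below uses a plain Python dict as a stand-in for the conduit node. The parameter names (`frequency`, `amplitude`) and the helper `set_path` are hypothetical; only the `catalyst/driving_parameters` key comes from this proposal.

```python
def set_path(node, path, value):
    """Store value under a slash-separated path, creating child dicts as needed."""
    keys = path.split("/")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Plain dict standing in for the conduit node passed to catalyst_execute.
execute_node = {}
# Hypothetical parameters: names and values only, no mesh description.
set_path(execute_node, "catalyst/driving_parameters/frequency", 2.5)
set_path(execute_node, "catalyst/driving_parameters/amplitude", 0.1)
```

The point of the sketch is only that the key holds name/value pairs and nothing about the mesh ParaView may use internally to display them.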

If the conduit node cannot be changed during the catalyst execution (for thread safety, for instance), then we may add a dedicated method that returns the changed parameters: conduit_cpp::Node* catalyst_execute_and_get_updated_parameters(...).

For the GUI description, the user has to define which parameters he/she wants to display and the way they should be displayed. The standard mechanism in ParaView for such a purpose is the XML proxy description like in Utkarsh’s sensei prototype.

Hence, the user will provide an XML file that associates graphical elements with a vtkSteeringDataGenerator class. The XML file should also contain a dedicated XML tag InitializePropertiesWithCatalystDrivingParameters, with behavior similar to SenseiInitializePropertiesWithMesh, in order to explicitly link the driving parameter names to their GUI representations.

We can imagine two ways to pass this XML file through the Catalyst API: either at compile time or at run time.
At compile time, the XML file is embedded in the simulation build system.
At run time, the XML file is passed to Catalyst during the initialization of the process. The Catalyst V2 way is the catalyst_initialize method with a dedicated key in the argument node like catalyst/driving_ui_description_file.
I would prefer the run-time solution in order to break the dependency between the parameters that the simulation engine provides and their representation in the ParaView GUI. Hence changing the way the parameters are displayed and modified would not require rebuilding the simulation engine.
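A minimal sketch of the run-time option, assuming a plain dict in place of the conduit node given to catalyst_initialize. The environment variable `SIM_STEERING_UI_XML`, the default file name, and the function `build_initialize_node` are all made up for illustration; only the key catalyst/driving_ui_description_file is taken from the proposal.

```python
import os


def build_initialize_node(environ=os.environ):
    """Build the initialize node, reading the UI description path at run time."""
    # Hypothetical variable name: lets the UI file change without a rebuild.
    ui_file = environ.get("SIM_STEERING_UI_XML", "steering_ui.xml")
    return {"catalyst": {"driving_ui_description_file": ui_file}}


# Passing an explicit mapping makes the behavior easy to check.
node = build_initialize_node({"SIM_STEERING_UI_XML": "/tmp/custom_ui.xml"})
```

Because the path is resolved when the simulation starts, swapping the XML file changes the GUI without touching the simulation binary.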

Of course, the proposed key names and method names are subject to change.

Feel free to comment on this proposal.

utkarsh.ayachit · May 20, 2021, 1:57pm

Thanks for this proposal. Overall, I really like this. A capability like this will definitely be nice to add. A few comments.

Can we use Catalyst 2.0 as the term for the Catalyst+Conduit API? CatalystV2 is used here to indicate the newer version of the Catalyst Python scripts. I know it’s still confusing, so if people have better suggestions for a uniform naming convention, we should adopt that and adopt that soon.

I don’t follow this. What does it mean that catalyst/driving_parameters should not mention a mesh unless it’s a limitation?

I am not a huge fan of this approach. I think we should probably assume an immutable input node. This will make it easier, especially in cases where the Catalyst API is being used under layers that involve in-transit or other such scenarios. So instead of adding this alternate...execute API, we could add a void catalyst_results(conduit_node* params) that can return any “results” generated by Catalyst. Whether they are changes to driving_parameters or something else is immaterial.
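To make the immutable-input idea concrete, here is a toy model in Python. `MockCatalyst` and `double_frequency` are hypothetical stand-ins, dicts replace conduit nodes, and nothing here is the real Catalyst API; the sketch only shows an execute call that never modifies its input and a separate results call that fills a caller-provided node.

```python
import copy


class MockCatalyst:
    """Toy stand-in for a Catalyst implementation; not the real API."""

    def __init__(self, pipeline):
        self._pipeline = pipeline  # callable: node -> dict of results
        self._latest = {}

    def execute(self, node):
        # Hand the pipeline a deep copy so the caller's node is never modified.
        self._latest = self._pipeline(copy.deepcopy(node))

    def results(self, params):
        # Fill the caller-provided node, mirroring
        # void catalyst_results(conduit_node* params).
        params.clear()
        params.update(copy.deepcopy(self._latest))


def double_frequency(node):
    # Hypothetical pipeline that produces a changed parameter as its result.
    node["frequency"] *= 2
    return {"frequency": node["frequency"]}


catalyst = MockCatalyst(double_frequency)
sim_node = {"frequency": 2.5}
catalyst.execute(sim_node)

out = {}
catalyst.results(out)
```

Here the pipeline mutates its own copy freely, while `sim_node` stays exactly as the simulation built it.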

If we go with this approach, this enables another simplification in how driving parameters are specified. We can totally discard any need for an extension of the protocol to add catalyst/driving_parameters. A simulation can just use another named channel to post the data that defines the control parameters. In the sensei example, we use the oscillators mesh for that purpose. Now, whether the pipeline will modify them or something else depends on the pipeline itself. Thus, next we need an independent mechanism in the Catalyst Python API for the script to “post” results. For that, we can add some API to ParaView’s Python module available in the Catalyst in situ scripts.

The UI stuff, however, I feel should be deferred to the Python pipeline code. One strong reason for that is that the UI stuff will be tightly dependent on the ParaView version and may change from version to version. It’s better if the simulation didn’t have to know about which version of ParaView is being used to provide appropriate UI information. We could probably discuss how to do that independently once we’ve sorted out the bi-directional design.

wascott · May 20, 2021, 4:35pm

My only comment would be I strongly dislike using “Catalyst 2.0” vs “CatalystV2” to represent different things. May I propose that the new Catalyst+Conduit API be called “CatalystAPI 2.0”? The other one is harder for me, how about “CatalystScript 2.0” or CatalystTrace 2.0?

@Kenneth_Moreland Any thoughts on names?

utkarsh.ayachit · May 20, 2021, 4:37pm

Okay, maybe it’s worth starting a separate thread to discuss the naming convention and focus on the bi-direction API discussion in this post.

thread started here

boonth · May 20, 2021, 4:50pm

Very nice proposal.

@utkarsh.ayachit, this is related to what I’ve been thinking about for doing analysis over time with Catalyst. For example, finding the maximum of a variable over time for all points in a mesh. There needs to be a way to send data computed in the python pipeline back to the simulation, and the void catalyst_results() function sounds like the exact thing we need for that.

Francois_Mazen · May 21, 2021, 7:20am

Thanks all for your feedback!

I was influenced by the sensei prototype, where the parameters are linked to the oscillator mesh to define control parameters. So I wanted to emphasize that the mesh is not required at all: just keep the parameter names and values in the conduit node.

Does it mean that the simulation engine can call catalyst_execute and catalyst_results at its own pace? For instance, querying results only every X calls to catalyst_execute?

The simulation engine adaptor uses the C API. How will it post and get parameters through a Python script? The conduit node provides a nice bridge, so I thought that passing parameters in a dedicated key at each catalyst execution was the way to go.

Currently the UI stuff is an XML file. Do you plan to embed this XML file in the Python code or define Python methods to describe the UI?

So I understand that the current XML file format is likely to change with ParaView version, whereas the python API should not change?

utkarsh.ayachit · May 21, 2021, 11:42am

Indeed. Calling catalyst_results, like calling catalyst_execute, is completely optional. catalyst_results will return the results generated in the most recent catalyst_execute call.
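A small sketch of a simulation loop that polls results only every few steps. Everything here is a toy: `make_mock_catalyst`, `QUERY_EVERY`, and the rule that lowers `dt` from cycle 4 onward are hypothetical, and dicts stand in for conduit nodes.

```python
def make_mock_catalyst():
    """Return (execute, results) closures sharing one 'latest results' slot."""
    latest = {}

    def execute(node):
        # Each call overwrites the results of the previous call.
        latest.clear()
        latest["cycle"] = node["cycle"]
        # Hypothetical steering rule: request a smaller dt from cycle 4 on.
        latest["dt"] = 0.05 if node["cycle"] >= 4 else node["dt"]

    def results(params):
        params.clear()
        params.update(latest)

    return execute, results


execute, results = make_mock_catalyst()
QUERY_EVERY = 3  # hypothetical: poll results on every 3rd execute
dt = 0.1
seen_cycles = []
for cycle in range(1, 10):
    execute({"cycle": cycle, "dt": dt})
    if cycle % QUERY_EVERY == 0:
        params = {}
        results(params)
        seen_cycles.append(params["cycle"])
        dt = params["dt"]
```

In this toy, results produced at cycles the simulation did not poll are simply overwritten, which matches the "most recent execute" semantics described above.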

Note, we’re saying that the results, similar to the channels, are simply “meshes”. Thus, the Python script can create a vtkTable (or simply a vtkFieldData, if we so choose), and pass that back. The harness should do the reverse of what vtkConduitSource does to pass that table back as a conduit_node for results. This can be used by the Python script for passing arbitrary results. However, this can be made simpler, as in the oscillator example. vtkSteeringDataGenerator, when present in the pipeline, can automatically be checked by the harness to grab the results and pass them back.

Both are possible. It can still be an XML file which defines how the GUI gets generated. We could also offer vtkPythonAlgorithm-like support to create a definition using a Python class plus method decorators, rather than having the users write clumsy XML.
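To illustrate the class-plus-decorators idea without depending on ParaView, here is a self-contained sketch. The decorator `int_property`, the class `SteeringParameters`, and the function `describe` are hypothetical; the generated element names are only loosely modeled on ParaView proxy XML and are not claimed to be a valid proxy definition.

```python
import xml.etree.ElementTree as ET


def int_property(name, default, minimum, maximum):
    """Hypothetical decorator: tag a setter with the UI metadata it needs."""
    def wrap(func):
        func._ui = {"name": name, "default": default, "min": minimum, "max": maximum}
        return func
    return wrap


class SteeringParameters:
    def __init__(self):
        self.values = {}

    @int_property("PhiResolution", default=16, minimum=0, maximum=1000)
    def set_phi_resolution(self, x):
        self.values["PhiResolution"] = x


def describe(cls):
    """Build an XML description from the decorated methods of cls."""
    root = ET.Element("Proxy", name=cls.__name__)
    for attr in vars(cls).values():
        meta = getattr(attr, "_ui", None)
        if meta is None:
            continue  # not a decorated setter
        prop = ET.SubElement(root, "IntVectorProperty", name=meta["name"],
                             default_values=str(meta["default"]))
        ET.SubElement(prop, "IntRangeDomain", name="range",
                      min=str(meta["min"]), max=str(meta["max"]))
    return ET.tostring(root, encoding="unicode")


xml_text = describe(SteeringParameters)
```

The user writes only the decorated class; the XML becomes a generated detail that can follow whatever the running ParaView version expects.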

Anything in ParaView can change. That is what has happened historically and, despite our best efforts, will keep happening. With Catalyst 2.0, what we need to strive really hard to do is avoid changing the Catalyst API and the ParaView-Catalyst Blueprint unless necessary, and then only in backwards-compatible ways.

Now assume the UI XML changed in ParaView 6.0. If the simulation had compiled it in, it would need to know which version of the UI to compile in based on the ParaView version. At compile time, this is simply not available, since the simulation can be built with the stub, which has no information about the ParaView version. The pipeline scripts, when executing, can check the ParaView version and, as tedious as it is, support two versions of both the Python script and the UI XML, for ParaView 5.* and ParaView 6.*. This will not require any changes to the simulation code itself.

The same argument holds for providing UI XML via a conduit_node. Potential solutions with this approach include:

provide a way to determine the ParaView version using catalyst_about before catalyst_initialize, so the sim can use that to pass the appropriate UI XML.
extend the blueprint to support adding multiple UI XML versions for different versions of ParaView.

Both of these are a little clunky, IMO. So deferring that switch to a single place – the pipeline scripts – makes things more maintainable.
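The single switch point in the pipeline script could be as small as the sketch below. The function name and the two file names are hypothetical, and the version is passed in as a plain tuple; in a real script it would come from ParaView's Python module.

```python
def pick_ui_description(paraview_version):
    """Choose the UI XML matching the running ParaView major version."""
    major = paraview_version[0]
    if major >= 6:
        return "steering_ui_v6.xml"  # hypothetical file for ParaView 6.*
    return "steering_ui_v5.xml"      # hypothetical file for ParaView 5.*


chosen = pick_ui_description((5, 9))
```

The simulation code never sees this choice, which is the maintainability argument made above.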

Francois_Mazen · May 25, 2021, 9:28am

Trying to sum up, feel free to adjust.

From a user perspective:

parameters are sent to ParaView via a dedicated channel in the conduit node of the catalyst_execute call. Key path: catalyst/channels/[channel-name]
void catalyst_results(conduit_node* params) is called by the simulation to get new parameter values. The given node is filled by Catalyst with the most recent parameter values.
the parameters UI is defined in the Python catalyst script, with a dedicated Python class with method decorators

Decorators example from the existing ParaView plugin python API:

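    # Methods of a VTKPythonAlgorithmBase subclass; smproperty and smdomain
    # are imported from paraview.util.vtkAlgorithm.
    @smproperty.intvector(name="PhiResolution", default_values=16)
    @smdomain.intrange(min=0, max=1000)
    def SetPhiResolution(self, x):
        self._realAlgorithm.SetPhiResolution(x)
        self.Modified()

    @smproperty.intvector(name="ThetaResolution", default_values=16)
    def SetThetaResolution(self, x):
        self._realAlgorithm.SetThetaResolution(x)
        self.Modified()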

This python script is given to ParaView via the catalyst_initialize method, node’s key: catalyst/scripts
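Seen from the simulation side, the nodes in this summary might be laid out roughly as follows. Plain dicts stand in for conduit nodes; the channel name `steering`, the script file name, the parameter names, and the helper `channel_data` are hypothetical, and the exact content under the channel is an assumption rather than something settled in this thread.

```python
# Node for catalyst_initialize: the pipeline script under catalyst/scripts.
initialize_node = {"catalyst": {"scripts": {"steering": "steering_pipeline.py"}}}

# Node for catalyst_execute: parameters posted on a dedicated channel,
# i.e. under catalyst/channels/[channel-name].
execute_node = {
    "catalyst": {
        "channels": {
            "steering": {
                # Assumed layout: control parameters as the channel's data.
                "data": {"frequency": 2.5, "amplitude": 0.1},
            }
        }
    }
}


def channel_data(node, channel_name):
    """Return the data posted on one channel of an execute node."""
    return node["catalyst"]["channels"][channel_name]["data"]


posted = channel_data(execute_node, "steering")
```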

Internal challenges:

generate a vtkTable (or mesh) on the ParaView side with the given parameters.
pass this table back to the simulation, by creating a kind of reversed vtkConduitSource. See vtkSteeringDataGenerator for inspiration.
create a Python API to describe the parameters UI: a class to access/mutate parameter values, and decorators to specify the UI

For the UI, the need is very similar to the current ParaView Python plugin API, but I’m not sure we should merge them because we don’t want the Catalyst API to depend on ParaView stuff.

utkarsh.ayachit · May 25, 2021, 11:19am

The UI is nothing more than a proxy definition. So that will indeed be the same as the existing Python Plugin API. I don’t see any reason to create a parallel one. The only thing that is Catalyst-specific is where the values for this proxy’s properties are set up on each iteration using an input mesh. That can be done by a function in a Catalyst-specific Python module.