Trying to understand Catalyst (Live) design choices

I’ve been looking more closely into Catalyst 2 and have some questions about why particular things are the way they are. If there are misconceptions on my end then I apologize (and am happy to be corrected).

  • To do live visualization with ParaView it seems the simulation side will initiate the connection to the ParaView GUI. Why was that setup chosen, instead of the other way around, where the GUI client connects to the running simulation when it chooses to? The latter would match the usual way of connecting to a ParaView server running on an HPC system. Plus, in many cases it would avoid reverse SSH tunneling to get an outward connection from the HPC system to a non-public IP address.
  • Does the previous point not imply that a Catalyst-enabled simulation and the ParaView GUI need to be co-scheduled? Or can the GUI connection be made at any point during the simulation lifetime?
  • Why tie the visualization setup to the simulation side by having to pass one or more Python scripts exported from ParaView? Why not have a Catalyst-enabled simulation be a data source only, with the visualization/analysis parts on the client-side only?

Thanks in advance for any feedback.

One key point here is that Catalyst was designed to run in batch mode on HPC systems: live visualization is “just” a utility feature built on top of it, so it may lack some options.

A typical use case is as follows:

  • simulation runs on some HPC nodes
  • simulation makes API call to Catalyst, on same or other HPC nodes
  • Catalyst acts as a kind of pvserver
  • Catalyst may try to connect to a ParaView GUI
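As a rough sketch of the "simulation makes API calls to Catalyst" step: Catalyst 2 is driven by hierarchical Conduit nodes that the simulation fills in and passes to the initialize and execute calls. The snippet below only mimics that layout with plain nested dicts so it runs without Catalyst installed. The path names follow the ParaView Catalyst conventions as I understand them and should be checked against the documentation; `set_path`, the two `make_*` helpers, the script path and the channel name `grid` are hypothetical.

```python
def set_path(node, path, value):
    """Set a value in a nested dict using a Conduit-style 'a/b/c' path."""
    keys = path.split("/")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def make_initialize_params(script_path):
    """Parameters handed over once, at start-up (stand-in for a Conduit node)."""
    node = {}
    # Which Catalyst implementation to load (assumed to be the ParaView one).
    set_path(node, "catalyst_load/implementation", "paraview")
    # The pipeline script exported from ParaView; the path is hypothetical.
    set_path(node, "catalyst/scripts/script0", script_path)
    return node


def make_execute_params(timestep, time, channel="grid"):
    """Parameters handed over at every simulation step."""
    node = {}
    set_path(node, "catalyst/state/timestep", timestep)
    set_path(node, "catalyst/state/time", time)
    set_path(node, f"catalyst/channels/{channel}/type", "mesh")
    # The mesh itself would go under catalyst/channels/<channel>/data.
    return node


init_params = make_initialize_params("/path/to/pipeline.py")
step_params = make_execute_params(timestep=3, time=0.3)
```

In a real adaptor these would be actual Conduit nodes passed to the Catalyst initialize and execute calls, which is also where the Python pipeline script from the question comes in.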

Some HPC systems do not allow inbound connections, so a reverse connection is then needed. Also note that allowing connections from the GUI (as is possible with a classic pvserver) raises the question of who may connect to a given Catalyst instance in a multi-user context. We have some features implemented for pvserver (like the ConnectID; see “Command Line Arguments” in the ParaView Documentation 6.1.0), but not for Catalyst.

If no GUI is found, there is no error: the Catalyst and simulation processes continue, and the connection is retried at the next timestep (or as configured).
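To illustrate the "no GUI, no error, retry later" behaviour, here is a generic sketch using plain sockets. It is not Catalyst's actual implementation; `try_connect`, `run_steps` and `retry_every` are hypothetical names, and the only point is that a failed connection attempt does not stop the time loop.

```python
import socket


def try_connect(host, port, timeout=0.2):
    """Return a connected socket, or None if nobody is listening."""
    try:
        return socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, unreachable: not treated as an error.
        return None


def run_steps(n_steps, host, port, retry_every=1):
    """Toy time loop that keeps stepping whether or not a viewer is reachable.

    Returns the step at which a connection was established, or None.
    """
    conn = None
    connected_at = None
    for step in range(n_steps):
        # ... the simulation would advance one timestep here ...
        if conn is None and step % retry_every == 0:
            conn = try_connect(host, port)
            if conn is not None:
                connected_at = step
    if conn is not None:
        conn.close()
    return connected_at
```

With this shape the viewer does not have to be up when the simulation starts; it only has to be listening at one of the steps where a connection is attempted.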

To run without a GUI (again, that was the main use case in mind), a pipeline has to be passed in somehow, so this is done on the simulation side.
Creating a pipeline from the GUI could be possible, but it was just never designed or implemented.

Hope it helps,

Ah, that helps! I never considered “in-situ viz” to include batch-style (non-interactive) visualization, with the viz pipeline configurable through Python. To me it always meant interactive human-in-the-loop viz (as writing one or more visualizations from a simulation feels fairly common, not needing Catalyst). But apparently I’m not the only one grappling with the term “in-situ”.

Thanks, this indeed helps me better understand Catalyst.

By the way, the Catalyst examples linked from the ParaView docs (e.g. from the Introduction page of the ParaView Documentation 6.1.0) don’t point to the new location: https://gitlab.kitware.com/paraview/catalyst-examples

Good catch!

We will update it soon.

By the way, how is this string interpolation I see in the Catalyst examples supposed to work?

producer = TrivialProducer(registrationName="${args.channel-name}")

Is that something that ParaView/Catalyst should replace? I ask because I see the non-interpolated name in ParaView (e.g. with the CxxImageData example).

Also, when using Save Catalyst State from ParaView, the resulting script does not contain a catalyst_execute(info) function. All the example scripts do have that function, but is that merely to show some info as each Catalyst step is executed?

Correct, Save Catalyst State doesn’t emit a catalyst_execute(info) function, and that’s expected.

The generated script is declarative. It builds the pipeline and registers extractors with triggers at import time. The implementation handles each step: update the producers, evaluate triggers, fire the extractors that are due. No user function is needed for that.

catalyst_execute(info) is an optional callback. If the module defines it, the implementation calls it each step; if not, the extractor processing still happens. You add it yourself only when you need imperative logic that extractors can’t express: conditional output, steering, catalyst_results(info) round-trips, or custom analysis.

So the generated script omits it by design. Add it manually if you need a per-step hook.
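A toy model of that per-step behaviour, in plain Python so it runs without ParaView, might look like the sketch below. It is an illustration under assumptions, not the real implementation: `Extractor`, `run_step` and the frequency-based trigger are hypothetical, the order of extractor processing versus the callback is a guess, and the `cycle` and `time` attributes on the stand-in `info` object are assumed for the example.

```python
from types import SimpleNamespace


class Extractor:
    """Hypothetical extractor with a simple 'every N steps' trigger."""

    def __init__(self, name, frequency):
        self.name = name
        self.frequency = frequency

    def is_due(self, cycle):
        return cycle % self.frequency == 0


def run_step(module, extractors, cycle, time):
    """Process one step for a loaded pipeline script (toy version)."""
    # Extractor processing happens regardless of what the script defines.
    fired = [e.name for e in extractors if e.is_due(cycle)]
    # The per-step hook is optional: call it only if the script defines it.
    callback = getattr(module, "catalyst_execute", None)
    if callable(callback):
        callback(SimpleNamespace(cycle=cycle, time=time))
    return fired


# A generated script: registers extractors, defines no callback.
generated_script = SimpleNamespace()

# A hand-edited script: same extractors, plus a per-step hook.
seen_cycles = []
custom_script = SimpleNamespace(
    catalyst_execute=lambda info: seen_cycles.append(info.cycle)
)

extractors = [Extractor("png", 2), Extractor("vtk", 3)]
fired_generated = run_step(generated_script, extractors, cycle=6, time=0.6)
fired_custom = run_step(custom_script, extractors, cycle=4, time=0.4)
```

Both scripts get their extractors fired; only the second one also sees the per-step call.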