Missing support for coordset fields in Mesh blueprint

To understand the issue, let’s take a look at a typical Unstructured Grid blueprint for specifying data arrays associated with points.

coordsets:
  coords0:
    type: "explicit"
    values:
      x: [....]
      y: [....]
      z: [....]
topologies:
  mesh0:
    coordset: "coords0"
    elements:
      shape: "..."
      connectivity: [......]
fields:
  field_vertex0:
    association: "vertex"
    topology: "mesh0"
    ...

Notice that when specifying a field, one has to specify a topology key that points to a named topology.
This immediately introduces an ambiguity in the definition: in what order are the tuples of the field_vertex0 array specified? The implementation in VTK/ParaView assumes they are in the same order as the coordinates in the coordset.
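A minimal Python sketch of that ordering assumption may help. It uses plain dicts rather than the Conduit API, and the coordinates, the "quad" shape, the connectivity, and the field values are all made up for illustration. Tuple i of the field is paired with coordinate i, whether or not the topology references that point.

```python
# Plain-dict stand-in for a blueprint tree; every value here is hypothetical.
mesh = {
    "coordsets": {
        "coords0": {
            "type": "explicit",
            "values": {
                "x": [0.0, 1.0, 1.0, 0.0, 2.0],
                "y": [0.0, 0.0, 1.0, 1.0, 2.0],
                "z": [0.0, 0.0, 0.0, 0.0, 0.0],
            },
        }
    },
    "topologies": {
        "mesh0": {
            "type": "unstructured",
            "coordset": "coords0",
            # One quad; point 4 is never referenced.
            "elements": {"shape": "quad", "connectivity": [0, 1, 2, 3]},
        }
    },
    "fields": {
        "field_vertex0": {
            "association": "vertex",
            "topology": "mesh0",
            "values": [10.0, 11.0, 12.0, 13.0, 14.0],
        }
    },
}


def vertex_field_by_coord_order(mesh, field_name):
    """Pair each field tuple with a point, assuming tuple i belongs to coordinate i."""
    field = mesh["fields"][field_name]
    topo = mesh["topologies"][field["topology"]]
    coords = mesh["coordsets"][topo["coordset"]]["values"]
    points = list(zip(coords["x"], coords["y"], coords["z"]))
    if len(points) != len(field["values"]):
        raise ValueError("field length does not match coordset length")
    return list(zip(points, field["values"]))


def unused_points(mesh, topo_name):
    """Return indices of coordset points that the topology never references."""
    topo = mesh["topologies"][topo_name]
    n_points = len(mesh["coordsets"][topo["coordset"]]["values"]["x"])
    used = set(topo["elements"]["connectivity"])
    return [i for i in range(n_points) if i not in used]


pairs = vertex_field_by_coord_order(mesh, "field_vertex0")
orphans = unused_points(mesh, "mesh0")
```

Here the field carries a value for point 4 even though mesh0 never uses it, which is exactly the situation where the ordering assumption matters.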

If every point location specified in coordsets/coords0 is indeed used by topologies/mesh0, then this seems reasonable. But nothing requires that: coords0 may have extra point locations specified. In fact, in codes like Exodus and ICON, multiple topology patches reference the same vertex coordinates. So here’s how the mesh may look in such a case:

coordsets:
  coordsAll:
    type: "explicit"
    values:
      x: [....]
      y: [....]
      z: [....]
topologies:
  mesh0:
    coordset: "coordsAll"
    elements:
      shape: "..."
      connectivity: [......]
  mesh1:
    coordset: "coordsAll"
    elements:
      shape: "..."
      connectivity: [......]
  ....
fields:
  field_vertex0:
    association: "vertex"
    topology: "mesh0"
    ...

Now the field_vertex0 definition becomes even more dubious. Should I repeat it for each topologies/mesh*? Should I specify it only once? What happens if I have multiple coordsets?
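To make the two possible readings concrete, here is a small Python sketch with plain lists and made-up values (it is not the Conduit API, and the connectivity is hypothetical). If tuples follow coordset order, one array sized to coordsAll serves every topology; if tuples were instead limited to the vertices each topology uses, every topology would need its own, smaller array.

```python
# Six shared points in coordsAll; one field value per point (hypothetical data).
field_values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Two quads that share the edge between points 1 and 4.
connectivity = {
    "mesh0": [0, 1, 4, 3],
    "mesh1": [1, 2, 5, 4],
}


def referenced_vertices(topo_name):
    """Sorted, de-duplicated point indices used by one topology."""
    return sorted(set(connectivity[topo_name]))


def values_seen_by(topo_name):
    """Field values a topology reads when tuples follow coordset order."""
    return {v: field_values[v] for v in referenced_vertices(topo_name)}


seen_mesh0 = values_seen_by("mesh0")
seen_mesh1 = values_seen_by("mesh1")
shared = sorted(set(connectivity["mesh0"]) & set(connectivity["mesh1"]))
```

Both topologies read the same entries for the shared points 1 and 4, so repeating the field per topology would only repeat metadata, as long as the coordset-order reading is the one in effect.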

To avoid all this ambiguity, I propose we extend the blueprint so that a field can specify a coordset instead of a topology, e.g.:

fields:
  field_vertex0:
    association: "vertex" # only vertex can be specified here.
    coordset: "<coordset name>"
    ...
  field_element0:
    association: "element" # association "vertex" should be deprecated
    topology: "<topology name>"
    ...

This seems to make sense as an improvement to get rid of the ambiguity. I need to think about this a bit more, but right now my main concern is that this should be a change in Conduit itself and not a change in the copy of Conduit that’s packaged with Catalyst. Have you brought this up on the Conduit development site?

Personally, I think Conduit assumes that nearly all array information will be zero-copied from the source, so having repeated fields for each mesh is going to be OK. With zero-copy, multiple fields that share the same vertex association carry only the slight overhead of extra meta-information for each array, not the full memory overhead of storing the entire field data as a deep copy.
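As a rough analogy only (this is standard-library Python, not Conduit; real code would presumably rely on Conduit's external/zero-copy setters), the sketch below shows two hypothetical per-topology field entries that wrap the same buffer. Only the small metadata dicts are duplicated.

```python
from array import array

# One simulation-owned buffer (hypothetical values).
shared_values = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Two field entries, one per topology, both viewing the same memory.
# The field names are made up for this illustration.
fields = {
    "field_vertex0_mesh0": {
        "association": "vertex",
        "topology": "mesh0",
        "values": memoryview(shared_values),
    },
    "field_vertex0_mesh1": {
        "association": "vertex",
        "topology": "mesh1",
        "values": memoryview(shared_values),
    },
}

# A write to the source buffer is visible through both views: nothing was copied.
shared_values[2] = 42.0
view0 = fields["field_vertex0_mesh0"]["values"]
view1 = fields["field_vertex0_mesh1"]["values"]
```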

I haven’t brought it up there yet. I wanted to check how the Catalyst community felt about it first, and also to confirm I wasn’t missing something that is already supported.

So it seems that the Mesh blueprint can indeed support coordset fields by using an implicit points topology as follows:

coordsets:
  coordsAll:
    type: "explicit"
    values:
      x: [....]
      y: [....]
      z: [....]
topologies:
  all_points:
    coordset: "coordsAll"
    type: "points"
fields:
  field_vertex0:
    association: "vertex"
    topology: "all_points"
    ...

I haven’t confirmed whether VTK/ParaView interprets such fields correctly/efficiently. But at least we don’t need to extend the blueprint to support this use case!
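For anyone who wants to sanity-check this layout before handing it to Catalyst, here is a minimal Python sketch using plain dicts (not the Conduit API). The coordinates, the "tri" topologies, and the helper name expected_vertex_tuples are hypothetical. It assumes, per my reading of the blueprint, that a "points" topology has one implicit vertex per coordinate, so the field needs exactly one tuple per point in coordsAll.

```python
mesh = {
    "coordsets": {
        "coordsAll": {
            "type": "explicit",
            "values": {
                "x": [0.0, 1.0, 0.0, 1.0],
                "y": [0.0, 0.0, 1.0, 1.0],
                "z": [0.0, 0.0, 0.0, 0.0],
            },
        }
    },
    "topologies": {
        # Element topologies can still share the coordset.
        "mesh0": {
            "type": "unstructured",
            "coordset": "coordsAll",
            "elements": {"shape": "tri", "connectivity": [0, 1, 2]},
        },
        "mesh1": {
            "type": "unstructured",
            "coordset": "coordsAll",
            "elements": {"shape": "tri", "connectivity": [1, 3, 2]},
        },
        # Implicit topology covering every point exactly once.
        "all_points": {"type": "points", "coordset": "coordsAll"},
    },
    "fields": {
        "field_vertex0": {
            "association": "vertex",
            "topology": "all_points",
            "values": [10.0, 11.0, 12.0, 13.0],
        }
    },
}


def expected_vertex_tuples(mesh, field_name):
    """Tuple count for a vertex field attached to a 'points' topology."""
    field = mesh["fields"][field_name]
    if field["association"] != "vertex":
        raise ValueError("only vertex fields are handled in this sketch")
    topo = mesh["topologies"][field["topology"]]
    if topo["type"] != "points":
        raise ValueError("field is not attached to a points topology")
    return len(mesh["coordsets"][topo["coordset"]]["values"]["x"])


n_expected = expected_vertex_tuples(mesh, "field_vertex0")
n_actual = len(mesh["fields"]["field_vertex0"]["values"])
```

With Conduit's Python module available, the equivalent tree could presumably also be run through its blueprint verification, which I have not done here.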