Understanding the *.vtu file format

ChriWiChris · August 31, 2018, 9:35pm

Hi,

I have a problem understanding the *.vtu file format.

The files generated by ParaView using the D3 filter with AssignCellsUniquely look like this:

<VTKFile type="UnstructuredGrid" version="2.1" byte_order="LittleEndian" header_type="UInt64">
  <UnstructuredGrid>
    <Piece NumberOfPoints="27"                   NumberOfCells="48"                  >
      <PointData GlobalIds="___D3___GlobalNodeIds">
        <DataArray type="Int64" Name="___D3___GlobalNodeIds" format="appended" RangeMin="0"                    RangeMax="26"                   offset="0"                   >
        </DataArray>
        <DataArray type="UInt8" Name="vtkGhostType" format="appended" RangeMin="0"                    RangeMax="1"                    offset="224"                 />
      </PointData>
      <CellData>
        <DataArray type="Int64" Name="vtkOriginalCellIds" format="appended" RangeMin="0"                    RangeMax="47"                   offset="259"                 />
        <DataArray type="UInt8" Name="vtkGhostType" format="appended" RangeMin="0"                    RangeMax="1"                    offset="651"                 />
      </CellData>
      <Points>
        <DataArray type="Float32" Name="Points" NumberOfComponents="3" format="appended" RangeMin="0"                    RangeMax="3.4641016151"         offset="707"                 >
          <InformationKey name="L2_NORM_RANGE" location="vtkDataArray" length="2">
            <Value index="0">
              0
            </Value>
            <Value index="1">
              3.4641016151
            </Value>
          </InformationKey>
        </DataArray>
      </Points>
      <Cells>
        <DataArray type="Int64" Name="connectivity" format="appended" RangeMin=""                     RangeMax=""                     offset="1039"                />
        <DataArray type="Int64" Name="offsets" format="appended" RangeMin=""                     RangeMax=""                     offset="2583"                />
        <DataArray type="UInt8" Name="types" format="appended" RangeMin=""                     RangeMax=""                     offset="2975"                />
      </Cells>
    </Piece>
  </UnstructuredGrid>
  <AppendedData encoding="raw">
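To inspect a header like the one above programmatically, here is a minimal Python sketch. It assumes the file is read as bytes and that everything before the `<AppendedData` tag is well-formed XML once a closing `</VTKFile>` is appended. The function name `list_data_arrays` is made up for illustration and is not part of VTK.

```python
import xml.etree.ElementTree as ET

def list_data_arrays(blob: bytes):
    """Return (Name, type, offset) for every DataArray in the XML header."""
    # Cut the file off before the binary section, which is not valid XML.
    head = blob[: blob.index(b"<AppendedData")]
    # Close the root element by hand so the header parses on its own.
    root = ET.fromstring(head + b"</VTKFile>")
    return [
        (a.get("Name"), a.get("type"), int(a.get("offset")))  # int() ignores the padding spaces
        for a in root.iter("DataArray")
    ]
```

Called as `list_data_arrays(open(path, "rb").read())`, this returns one tuple per array, which makes it easy to compare the gaps between consecutive offsets with the expected array sizes.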

This documentation
https://www.vtk.org/wp-content/uploads/2015/04/file-formats.pdf
doesn’t say anything about ___D3___GlobalNodeIds or vtkGhostType.
A web search didn’t turn up any details either, so can you please point me to some information on how those attributes are used in this file format and what they mean?

Thanks in advance and greetings

Chris

danlipsa · September 1, 2018, 1:26am

Here is some information on vtkGhostType

https://blog.kitware.com/ghost-and-blanking-visibility-changes/
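As a quick illustration: vtkGhostType holds one UInt8 per point or cell, and each value is read as a set of bit flags. The sketch below assumes the flag values defined in `vtkDataSetAttributes` (worth checking against your VTK version). The helper `describe_ghost` is a made-up name.

```python
# Flag values assumed from vtkDataSetAttributes::CellGhostTypes / PointGhostTypes.
CELL_FLAGS = {
    1: "DUPLICATECELL",
    2: "HIGHCONNECTIVITYCELL",
    4: "LOWCONNECTIVITYCELL",
    8: "REFINEDCELL",
    16: "EXTERIORCELL",
    32: "HIDDENCELL",
}
POINT_FLAGS = {1: "DUPLICATEPOINT", 2: "HIDDENPOINT"}

def describe_ghost(value, flags):
    """Return the names of all flag bits set in one vtkGhostType byte."""
    return [name for bit, name in flags.items() if value & bit]
```

A value of 0 means the point or cell belongs to this piece. A value of 1 marks it as a duplicate that is owned by another piece.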

ChriWiChris · September 3, 2018, 9:12am

Thank you for the information, this is very helpful.

The only thing that remains unclear to me is the labelling of the GhostNodes/GhostCells:
In my file(s) I have 27 nodes with 48 cells and I created 2 partitions (=2 processors).
The vtkGhostType array for the nodes contains 35 bytes: 25 times 0 and 9 times 1 and one time 128.
The vtkGhostType array for the cells contains 56 bytes: 31 times 0 and 24 times 1 and one time 68.

Those values are the same in both the first and the second *.vtu file. The only difference is that in the second file the order of the nodes/cells is different.

How are those values mapped to the nodes/cells? In my view, those arrays should have the same length as the nodes/cells arrays, i.e. 27 and 48 respectively. Or is there an internal convention that those arrays are always 8 bytes longer than expected?
And how is it determined which processor gets which cells?

Thanks in advance and greetings

ChriWiChris · September 3, 2018, 2:23pm

I investigated a bit more with some dummy examples and now I understand the mapping of the vtkGhostType array to the nodes/cells and how the communication can be set up between processors.

But I still don’t know why those arrays always have 8 additional bytes with zero values in them. Does anyone have an idea?

The last thing that I have to mention is that in the documentation (see first post) I can find the sentence:

Each DataArray’s data are stored contiguously and appended immediately after the previous DataArray’s data without a separator.

But in fact I found several additional bytes that don’t act as a separator but also have nothing to do with the data that is displayed.
That is also true for the beginning of the raw block. The documentation says

The DataArray’s offset attribute indicates the file position offset from the first character after the underscore to the beginning of its data.

I had one ParaView-generated file where this was true and another where I found some unexpected bytes between the underscore and the actual data.
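One possible reading of those extra bytes, which nobody in this thread has confirmed: in raw appended mode, each array block seems to begin with a length field whose integer type is given by `header_type`, followed by the payload. A minimal Python sketch of reading one block under that assumption, for uncompressed little-endian data only (`read_appended_block` is a hypothetical helper, not a VTK function):

```python
import struct

def read_appended_block(blob: bytes, offset: int, header_size: int = 8) -> bytes:
    """Return the payload bytes of one uncompressed raw appended array."""
    tag = blob.index(b"<AppendedData")
    start = blob.index(b"_", tag) + 1        # first byte after the underscore
    pos = start + offset                     # block start: length field, then data
    fmt = "<Q" if header_size == 8 else "<I" # UInt64 or UInt32 header, little-endian
    (nbytes,) = struct.unpack_from(fmt, blob, pos)
    return blob[pos + header_size : pos + header_size + nbytes]
```

If this assumption holds, the `offset` attribute points at the length field rather than at the first data value, which would account for bytes in front of the data that are not part of it.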

danlipsa · September 3, 2018, 2:35pm

Probably those extra cells / points are the ghost cells/points duplicated because of the partition.

ChriWiChris · September 3, 2018, 2:46pm

In that case the array would have M additional bytes, where M is the number of ghost nodes/cells. But the array always has exactly 8 additional bytes, independent of the number of ghost nodes/cells in my files and independent of the total number of partitions.
And the value in those 8 bytes is always (!) zero, which would mean that the node/cell mapped to this value is part of the partition and not one of the ghost nodes/cells.

Again, here I wonder where those 8 bytes are mapped to, as each node/cell already has a value mapped to it.
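The offsets in the header from the first post can be checked with a little arithmetic. The Python sketch below assumes, without confirmation from this thread, that every array is preceded by one 8-byte UInt64 size field, matching `header_type="UInt64"`. The array sizes are taken from NumberOfPoints="27" and NumberOfCells="48".

```python
HEADER_SIZE = 8  # assumed: one UInt64 length field in front of each array

# (name, number of values, bytes per value), in file order
arrays = [
    ("___D3___GlobalNodeIds", 27, 8),   # Int64 per point
    ("vtkGhostType (points)", 27, 1),   # UInt8 per point
    ("vtkOriginalCellIds", 48, 8),      # Int64 per cell
    ("vtkGhostType (cells)", 48, 1),    # UInt8 per cell
    ("Points", 27 * 3, 4),              # Float32, 3 components per point
]

def expected_offsets(arrays, header_size=HEADER_SIZE):
    """Return each array's start offset and the position after the last array."""
    offsets, pos = [], 0
    for _name, count, width in arrays:
        offsets.append(pos)
        pos += header_size + count * width
    return offsets, pos

print(expected_offsets(arrays))  # ([0, 224, 259, 651, 707], 1039)
```

These values match the offset attributes in the file, and the final position 1039 is the offset of the connectivity array. Under this reading, the 8 extra bytes are not mapped to any node or cell.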

danlipsa · September 3, 2018, 3:25pm

Can you share your data? I could take a look at this.

Thanks,

Dan

ChriWiChris · September 3, 2018, 4:22pm

Thank you, Dan. Here is the link to the samples:
https://syncandshare.lrz.de/dl/fi9EoWWNvvuj4HCiguPWF6Ku

There is one *.pvtu file that splits a cube into 2 pieces and one *.pvtu file that splits the same cube into 4 pieces.

Greetings, Chris

danlipsa · September 3, 2018, 8:37pm

Take a look at the attached ParaView state file. In this state file, I load square4x4_p2_0 and then I rename vtkGhostType to ghost_type. This will enable ParaView to render the cells marked as ghosts.
So, at least for this file, things seem to be OK: You have 8 ghost cells that are not rendered by ParaView.

You can see the selected cells by navigating the spreadsheet.

You can take a look at the _1 file using a similar method. It should have another 8 ghost cells that are not rendered by ParaView.

ghost_cells.pvsm (308 KB)

ChriWiChris · September 3, 2018, 9:23pm

Hi Dan,

thank you for your effort.
I learned a lot from your state file and the use of the SpreadSheetView.

Indeed, the rendering of the file is right. What I am wondering is why the file generated by ParaView has some “disturbing” bytes and doesn’t match the documentation, although the rendering works nicely.

I just tried to delete those bytes from the file and adjust the offsets, but then ParaView gave me an error saying that the array is too short.

I think I will just accept that I have to prepend 8 bytes with zero values to the vtkGhostType arrays, and everything should work out.

Again, thanks for your help.

Greetings, Chris