Issues with Load State using a Custom Reader

Hello,

I’ve created a custom reader using VTKPythonAlgorithmBase (ParaView 5.6.0) that reads .xmf files and does some extra data manipulation on the multi-block dataset (MBDS) before it gets passed down the pipeline.

It seems to work when opening a file (either a single time state or a time series); however, things break when I try to load a state saved using this custom reader.

The problem seems to stem from the file name never actually being passed to my SetFileName method. On LoadState it always passes a file name of ‘None’, and I don’t know how to fix that.

I also see that upon initially opening a file, the SetFileName method is called twice. Once with a file name of ‘None’ and a second time with the correct file name.
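One defensive pattern for the double call described above is to normalize the incoming name before storing it. This is only a minimal sketch and does not address the state-loading problem itself. `normalize_filename` is a hypothetical helper, and treating the literal string 'None' as "no file yet" is an assumption based on the check in the reader code further down.

```python
import os


def normalize_filename(name):
    """Return an absolute path, or None when no usable name was given.

    Hypothetical helper: treats None, an empty string, and the literal
    string 'None' as "no file selected yet".
    """
    if name is None or name == '' or name == 'None':
        return None
    return os.path.abspath(name)


# Inside the reader, SetFileName could then begin with something like:
#     name = normalize_filename(name)
#     if name != self._filename: ...
print(normalize_filename(None))
print(normalize_filename('None'))
print(os.path.isabs(normalize_filename('some_xdmf_file.xmf')))
```

With this in place, the first call with ‘None’ leaves the stored file name untouched, and only a real path marks the reader as modified.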

Another funny thing I’ve noticed is that the GetDataArraySelection_Cell/Point methods get called multiple times, either upon opening a file or loading a saved state. I have no idea why this is.

Below I have included the reader as well as a test script. I’ve also attached them.

Test script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import subprocess
import paraview.simple as pvs

p = './myXdmfReader.py'
f = './some_xdmf_file.xmf'
state = './reader_debug.pvsm' 

save = 1
load = 1

subprocess.call('printf "\033c"', shell=True)
pvs.LoadPlugin(p, remote=True, ns=globals())
if save:
    print('\n#################')
    print('#--- Reading File')
    print('#################\n')
    r = MyXDMFReader(FileName=f)
    pvs.SaveState(state)
    pvs.ResetSession()
if load:
    print('\n##################')
    print('#--- Loading State')
    print('##################\n')
    pvs.LoadState(state)
print('')

The reader (less the intensive data manipulation):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from paraview.util.vtkAlgorithm import *
import os
import sys
import inspect
curdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
sys.path.insert(0, curdir)

def createModifiedCallback(anobject):
    print('createModifiedCallback')
    import weakref
    weakref_obj = weakref.ref(anobject)
    anobject = None
    def _markmodified(*args, **kwargs):
        o = weakref_obj()
        if o is not None:
            o.Modified()
    return _markmodified

@smproxy.reader(name="MyXDMFReader", label="My XDMF Reader",
                extensions="xmf", file_description="XMF files", support_reload=False)
class PythonXdmfReader(VTKPythonAlgorithmBase):
    def __init__(self):
        print('__init__')
        VTKPythonAlgorithmBase.__init__(self, nInputPorts=0, nOutputPorts=1, outputType='vtkMultiBlockDataSet')
        self._filename = None
        self._ndata = None
        self._timesteps = None

        #--- debug
        self.cell_selection = 0
        self.point_selection = 0
        #---

        from vtkmodules.vtkCommonCore import vtkDataArraySelection
        self._arrayselection_cd = vtkDataArraySelection()
        self._arrayselection_cd.AddObserver("ModifiedEvent", createModifiedCallback(self))
        self._arrayselection_pd = vtkDataArraySelection()
        self._arrayselection_pd.AddObserver("ModifiedEvent", createModifiedCallback(self))

    def _get_raw_data(self, requested_time=None):
        print('_get_raw_data')
        if self._ndata is not None:
            if requested_time is not None:
                self._ndata.UpdateTimeStep(requested_time)
                self._ndata.Update()
                print('  requested time:', requested_time)
                return self._ndata
            print('  ndata:', 'exists')
            return self._ndata

        print('  filename:', repr(self._filename))
        if self._filename is None or self._filename == 'None':
            raise RuntimeError("No filename specified\n")

        from paraview.vtk.vtkIOXdmf2 import vtkXdmfReader
        import numpy as np
        self._ndata = vtkXdmfReader()
        self._ndata.CanReadFile(self._filename)
        self._ndata.SetFileName(self._filename)
        self._ndata.UpdateInformation()
        self._ndata.Update()
        self._timesteps = None
        executive = self.GetExecutive()
        if len(self._ndata.GetOutputInformation(0).Get(executive.TIME_STEPS())) > 1:
            self._timesteps = np.sort(self._ndata.GetOutputInformation(0).Get(executive.TIME_STEPS()))

        # array stuff
        cd_arrays = []
        pd_arrays = []
        for i in range(self._ndata.GetNumberOfCellArrays()):
            cd_arrays.append(self._ndata.GetCellArrayName(i))
        for i in range(self._ndata.GetNumberOfPointArrays()):
            pd_arrays.append(self._ndata.GetPointArrayName(i))
        cd_arrays = list(set(cd_arrays))
        pd_arrays = list(set(pd_arrays))
        for aname in cd_arrays:
            self._arrayselection_cd.AddArray(aname)
        for aname in pd_arrays:
            self._arrayselection_pd.AddArray(aname)

        # some work here to save data for use later
        # import os, someCustomPackage
			
        return self._get_raw_data(requested_time)

    def _get_timesteps(self):
        print('_get_timesteps')
        self._get_raw_data()
        return self._timesteps.tolist() if self._timesteps is not None else None

    def _get_update_time(self, outInfo):
        print('_get_update_time')
        executive = self.GetExecutive()
        timesteps = self._get_timesteps()
        if timesteps is None or len(timesteps) == 0:
            return None
        elif outInfo.Has(executive.UPDATE_TIME_STEP()) and len(timesteps) > 0:
            utime = outInfo.Get(executive.UPDATE_TIME_STEP())
            dtime = timesteps[0]
            for atime in timesteps:
                if atime > utime:
                    return dtime
                else:
                    dtime = atime
            return dtime
        else:
            assert(len(timesteps) > 0)
            return timesteps[0]

    def _get_array_selection(self):
        print('_get_array_selection')
        self._get_raw_data()
        return self._arrayselection_cd, self._arrayselection_pd

    @smproperty.stringvector(name="FileName")
    @smdomain.filelist()
    @smhint.filechooser(extensions="xmf", file_description="XMF files")
    def SetFileName(self, name):
        """Specify filename for the file to read."""
        print('SetFileName')
        print('  new filename:', repr(name))
        if self._filename != name:
            import os
            self._filename = os.path.abspath(name)
            self._ndata = None
            self._timesteps = None
            self.Modified()

    @smproperty.doublevector(name="TimestepValues", information_only="1", si_class="vtkSITimeStepsProperty")
    def GetTimestepValues(self):
        print('GetTimestepValues')
        return self._get_timesteps()

    @smproperty.dataarrayselection(name="Point Arrays")
    def GetDataArraySelection_Point(self):
        print('GetDataArraySelection_Point')
        self.point_selection += 1
        print('  point selection count:', self.point_selection)
        return self._get_array_selection()[1]

    @smproperty.dataarrayselection(name="Cell Arrays")
    def GetDataArraySelection_Cell(self):
        print('GetDataArraySelection_Cell')
        self.cell_selection += 1
        print('  cell selection count:', self.cell_selection)
        return self._get_array_selection()[0]

    def RequestInformation(self, request, inInfoVec, outInfoVec):
        print('RequestInformation')

        executive = self.GetExecutive()
        outInfo = outInfoVec.GetInformationObject(0)
        outInfo.Remove(executive.TIME_STEPS())
        outInfo.Remove(executive.TIME_RANGE())
        timesteps = self._get_timesteps()
        if timesteps is not None:
            for t in timesteps:
                outInfo.Append(executive.TIME_STEPS(), t)
            outInfo.Append(executive.TIME_RANGE(), timesteps[0])
            outInfo.Append(executive.TIME_RANGE(), timesteps[-1])
        return 1

    def RequestData(self, request, inInfoVec, outInfoVec):
        print('RequestData')
        data_time = self._get_update_time(outInfoVec.GetInformationObject(0))
        raw_data = self._get_raw_data(data_time)

        # array stuff
        for i in range(raw_data.GetNumberOfCellArrays()):
            aname = raw_data.GetCellArrayName(i)
            if self._arrayselection_cd.ArrayIsEnabled(aname):
                raw_data.SetCellArrayStatus(aname, 1)
            else:
                raw_data.SetCellArrayStatus(aname, 0)

        for i in range(raw_data.GetNumberOfPointArrays()):
            aname = raw_data.GetPointArrayName(i)
            if self._arrayselection_pd.ArrayIsEnabled(aname):
                raw_data.SetPointArrayStatus(aname, 1)
            else:
                raw_data.SetPointArrayStatus(aname, 0)

        raw_data.Update()

        ds = raw_data.GetOutputDataObject(0)

        # this is where I would apply some extra data to the MBDS before passing it along
        # import someCustomPackage

        from paraview import vtk
        output = vtk.vtkMultiBlockDataSet.GetData(outInfoVec)
        output.ShallowCopy(ds)

        if data_time is not None:
            output.GetInformation().Set(output.DATA_TIME_STEP(), data_time)
        return 1

debug_reader.py (620 Bytes)
myXdmfReader.py (7.5 KB)

I’ve now tried saving and loading a state file generated with the example vtkPythonAlgorithm CSV reader, and it shows the same issues. Most importantly, the file name is never actually retrieved from the .pvsm state file and passed to the reader’s SetFileName method.

Is this a bug of vtkPythonAlgorithm or perhaps just not supported yet?

I have run into this exact issue with my custom reader too.
Has there been any solution or workaround for this?
Thanks
Marco

Indeed a bug

Any comment on when this bug will be resolved?

Hard to say, as this bug has not been funded yet. Feel free to contribute a fix yourself, or contact our commercial services if you can help!
https://www.kitware.com/what-we-offer/#support

FYI:
Upon further experimentation I discovered that my custom reader works if I save a Python-style state file instead of a .pvsm state file.
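For anyone scripting this workaround, the sketch below writes a Python state script to disk. It rests on an assumption that should be checked against your ParaView version: that `paraview.smstate.get_state()` can be called without arguments and returns the state as Python source text. `save_python_state` and its `get_state` parameter are hypothetical names, and the parameter exists only so the function can be exercised without ParaView installed.

```python
def save_python_state(path, get_state=None):
    """Write a Python state script to `path` and return the script text.

    `get_state` is a hypothetical injection point; when omitted, the
    function falls back to paraview.smstate.get_state.
    """
    if get_state is None:
        # Assumption: smstate.get_state() returns the current state as
        # Python source text when called with no arguments.
        from paraview import smstate
        get_state = smstate.get_state
    script = get_state()
    with open(path, 'w') as fh:
        fh.write(script)
    return script
```

Inside a pvpython session with the plugin loaded and the reader created, calling `save_python_state('./reader_debug.py')` would take the place of the `pvs.SaveState(state)` call in the test script.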

Great, I’ve updated the bug to include this workaround.

The fix is to update the _get_array_selection method as follows:

    def _get_array_selection(self):
        print('_get_array_selection')
        # self._get_raw_data() <--- remove this line
        return self._arrayselection_cd, self._arrayselection_pd

Readers should provide access to the vtkDataArraySelection object even before the filename is set or RequestInformation is called.
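The pattern can be illustrated without VTK. In this sketch, `ArraySelection` is a tiny stand-in for vtkDataArraySelection and `ReaderSketch` is a hypothetical reader; neither is ParaView code. It assumes that a caller may touch the selection before the file name is set, which is the situation the fix above guards against.

```python
class ArraySelection(object):
    """Minimal stand-in for vtkDataArraySelection (illustration only)."""

    def __init__(self):
        self._status = {}

    def AddArray(self, name):
        # Register the array as enabled unless a status is already recorded.
        self._status.setdefault(name, True)

    def DisableArray(self, name):
        self._status[name] = False

    def ArrayIsEnabled(self, name):
        return self._status.get(name, False)


class ReaderSketch(object):
    """Hypothetical reader showing where the selection object lives."""

    def __init__(self):
        self._filename = None
        # The selection exists from construction, independent of any file.
        self._selection = ArraySelection()

    def GetDataArraySelection(self):
        # No file access here, so this is safe before SetFileName.
        return self._selection

    def SetFileName(self, name):
        self._filename = name

    def RequestInformation(self, array_names):
        # File-dependent work is deferred to this point.
        for name in array_names:
            self._selection.AddArray(name)


reader = ReaderSketch()
# A status restored before any file name is known:
reader.GetDataArraySelection().DisableArray('pressure')
reader.SetFileName('some_xdmf_file.xmf')
reader.RequestInformation(['pressure', 'velocity'])
sel = reader.GetDataArraySelection()
print(sel.ArrayIsEnabled('pressure'), sel.ArrayIsEnabled('velocity'))
```

The getter never opens the file, so it cannot fail for lack of a file name, and a status recorded early survives once the arrays are registered.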

I’ve updated the example accordingly.

This is awesome news.
I’ve been away on vacation a few weeks but will implement this update promptly now.

To date we’ve had our users follow the standard reader with a programmable filter to do the data manipulation needed. We wrapped it up in a macro to make it as simple as possible, but this will certainly streamline our workflow further.

Thanks everyone, both for the workaround and the actual fix.