How can I read non-ASCII FileNames from pvpython?

Hi guys,

I have saved a .py state file via File -> Save State…, with only one PLY reader proxy in the current state:


Note that its FileName is not an ASCII-encodable string. The resulting .py state file contains the following lines:

# ----------------------------------------------------------------
# setup the data processing pipelines
# ----------------------------------------------------------------

# create a new 'PLY Reader'
ply = PLYReader(FileNames=['C:/tmp/??-.ply'])

This state file cannot be loaded back through File -> Load State…
This is exactly my situation: I need to read a file with non-ASCII characters in its path. So my question is, how can I load such files through Python scripts (macros, state files, or especially pvpython.exe for offscreen processing)?

This state file cannot be loaded back through File-> Load State…

Indeed, nice find. Issue logged here:
https://gitlab.kitware.com/paraview/paraview/issues/19504

A workaround I can think of is to edit the Python state file manually to correct the non-UTF-8 character.
Another workaround would be to use the Python trace feature instead of the Python state.
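The manual edit could also be scripted. Below is a minimal sketch, assuming the state file contains a single `FileNames=[...]` entry as in the example above; `set_first_filenames` is a hypothetical helper name, not part of ParaView. It only repairs the text of the script and saves it as UTF-8. Whether the reader can then open the file still depends on the platform issues discussed in the rest of this thread.

```python
import io
import re


def set_first_filenames(state_file, correct_path):
    """Rewrite the first FileNames=[...] entry in a state file and save it as UTF-8."""
    # errors="replace" lets us read the file even if the garbled path
    # contains bytes that are not valid UTF-8.
    with io.open(state_file, "r", encoding="utf-8", errors="replace") as f:
        text = f.read()

    new_entry = "FileNames=[{!r}]".format(correct_path)
    # A function is used as the replacement so that backslashes in the
    # path are not interpreted as regex escapes.
    text, count = re.subn(r"FileNames=\[[^\]]*\]", lambda m: new_entry, text, count=1)
    if count == 0:
        raise ValueError("no FileNames=[...] entry found in " + state_file)

    with io.open(state_file, "w", encoding="utf-8") as f:
        f.write(text)
    return count
```

Usage would look like `set_first_filenames('state.py', 'C:/tmp/中.ply')`, where `state.py` is a placeholder for your own state file.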

Thanks for your reply

I have just tested Trace; it generates the right code:

# create a new 'PLY Reader'
ply = PLYReader(FileNames=['C:/tmp/中.ply'])

but the script cannot be run through “Python Shell -> Run Script”. The error message is:

Warning: In D:\ParaView56\VTK\IO\PLY\vtkPLYReader.cxx, line 135
vtkPLYReader (000001F667E1AFE0): Could not open PLY file

ERROR: In D:\ParaView56\VTK\Common\ExecutionModel\vtkExecutive.cxx, line 782
vtkPVCompositeDataPipeline (000001F6691B2040): Algorithm vtkFileSeriesReader(000001F6691B2130) returned failure for request: vtkInformation (000001F66964F970)
  Debug: Off
  Modified Time: 625380
  Reference Count: 1
  Registered Events: (none)
  Request: REQUEST_DATA
  FORWARD_DIRECTION: 0
  ALGORITHM_AFTER_FORWARD: 1
  FROM_OUTPUT_PORT: 0

It seems that even though the character ‘中’ is encoded correctly in the .py file, the Python interpreter still cannot decode it correctly from that file.

Is there any way to specify the encoding for the Python interpreter embedded in ParaView?
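As a diagnostic step (not a fix), it may help to print which encodings the embedded interpreter reports. This is a small sketch using only standard-library calls; `encoding_report` is a hypothetical helper name. Note that the error above comes from vtkPLYReader, so the file is opened in C++ and these Python settings may not be the deciding factor.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings the running Python interpreter reports."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
        # sys.stdout may be replaced by an object without an encoding
        # attribute inside an embedded shell, hence getattr.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for key, value in sorted(encoding_report().items()):
    print(key, value)
```

Running this both in the ParaView Python shell and in pvpython, and comparing the output with a machine where the script works, should show whether the interpreters are configured differently.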

I can’t reproduce this issue with ParaView 5.7.0. Can you download it and try again?
https://www.paraview.org/download/

Hi Mathieu,
I have downloaded 5.7 version at
https://www.paraview.org/paraview-downloads/download.php?submit=Download&version=v5.7&type=binary&os=Windows&downloadFile=ParaView-5.7.0-Windows-Python3.7-msvc2015-64bit.zip

After the same process (Trace -> Python Shell -> Run Script), I still got the same result. Here is the screenshot; the filename is not decoded correctly:

Maybe the system encoding of my OS (Windows) is different from yours?

That’s a possibility; I’m using Linux with an en_US UTF-8 locale.

So is there something I can do to read these files from Python scripts? Maybe specify the encoding through the ParaView command line, or compile a UTF-8 version by changing some CMake configuration? After all, I definitely cannot go steal your computer :)
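One stopgap that does not depend on ParaView's encoding handling would be to copy the file to an ASCII-only path before handing it to the reader. The sketch below assumes a Python 3 based pvpython and an ASCII-only target directory that you supply; `copy_to_ascii_path` is a hypothetical helper, not a ParaView function, and nobody in this thread has tested it.

```python
import hashlib
import os
import shutil


def copy_to_ascii_path(src, ascii_dir):
    """Copy src into ascii_dir under an ASCII-only name and return the new path."""
    stem, ext = os.path.splitext(os.path.basename(src))
    # Derive a stable ASCII name from the original (possibly non-ASCII) stem.
    digest = hashlib.sha1(stem.encode("utf-8")).hexdigest()[:12]
    dst = os.path.join(ascii_dir, "pv_" + digest + ext)
    # Fail early if the result is still not pure ASCII, for example because
    # the directory or the extension contains non-ASCII characters.
    dst.encode("ascii")
    shutil.copyfile(src, dst)
    return dst


# Intended use inside a pvpython script (paths are examples only):
# ply = PLYReader(FileNames=[copy_to_ascii_path('C:/tmp/中.ply', 'C:/tmp')])
```

The copy costs disk space and time for large files, so this is only a stopgap until the path handling itself is fixed.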

I have tested adding “# -*- coding: utf-8 -*-” to the script, but it does not work.
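A possible explanation for why the coding line does not help: it only tells Python how to decode the source file, not how the path is later converted to bytes when the file is opened. The sketch below illustrates that the same name has different byte representations in UTF-8 and in code page 936. That this machine uses cp936 as its ANSI code page is an assumption based on the Simplified Chinese system language, and that this mismatch is the actual cause inside ParaView is a guess, not something confirmed here.

```python
# -*- coding: utf-8 -*-
name = u"中.ply"

utf8_bytes = name.encode("utf-8")  # bytes a UTF-8 aware caller would pass along
ansi_bytes = name.encode("cp936")  # bytes a cp936 ANSI file API would expect

# The same file name has two different byte representations.
print(repr(utf8_bytes))
print(repr(ansi_bytes))

# Interpreting the UTF-8 bytes as cp936 produces a different, wrong name,
# so a lookup on disk with that name would fail.
misread = utf8_bytes.decode("cp936", errors="replace")
print(misread == name)
```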

Which version of Windows and which language are you using?

Hi Mathieu,
My OS version is Windows 10 1803, and the language is Chinese (Simplified).

I have tested another option that may help us address this issue.
In Windows Settings, there is an option introduced with the Windows 10 1803 update:
[image]
which looks like this on my computer:
[image]

With this check box checked, pvpython scripts can read files with non-ASCII characters in their paths. Unfortunately, this option may cause errors in other applications I use daily.

Based on the information in the dialog above, it seems that Windows treats ParaView as one of the “programs that do not support Unicode”, which is apparently not true. So maybe explicitly telling the operating system that ParaView supports Unicode would fix this problem without enabling that option system-wide?

That’s really interesting information here. Thanks for your debug. I will investigate this.

I think the work done by @todoooo in 5.8 should have resolved most of these issues. Is 5.8 working for you?

@ymjia The Unicode changes are not in ParaView 5.8, but if you build from the master branches of ParaView and VTK, this problem is fixed. There’s no need to force Windows to use a global UTF-8 setting.

Ah, right. Forgot they came in late in the 5.8 cycle. They’ll be in 5.9 though (due out in the fall).

@ymjia In the meantime you can try this solution for Windows 10. It provides a per-process UTF-8 setting.


https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-code-page

Thanks for your reply. I have tested master (12606e87b7bd9b258c79f9146a48b001541fffb0), cloned today (2020.04.10).
Things seem to have gotten worse:
The build compiled from master can read files whose paths are all ASCII characters, but when I try to open a file with a non-ASCII character in its path, here is the GIF:

If I drag that file in, ParaView prints an error message:

I have tested both the Windows installer and the version compiled from master; the error message is the same.

I also have a 5.6 build on my PC, and it can read this file:

So is there something I have missed in the CMake configuration or the operating system settings?

Looks like a regression.

@Joachim_Pouderoux

Did you also build from VTK master?

Yes, I have updated the submodules recursively. The current VTK commit in my local repo is af056c90caf6d553c6ebe9e2f430bd23c85a524c (2020.04.08 by Charles Gueunet).

I suppose the ParaView master branch still builds against an official release of VTK (possibly 8.2). Perhaps @cory.quammen can shed some light on this?

Just out of interest what is the code page setting on your Windows machine?
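If it helps, the active ANSI code page can be queried from Python itself. This is a minimal sketch that calls the Win32 `GetACP` function through `ctypes`; `active_code_page` is a hypothetical helper name, and it returns `None` on non-Windows systems, where the call does not exist.

```python
import ctypes
import sys


def active_code_page():
    """Return the process's active ANSI code page on Windows, else None."""
    if sys.platform != "win32":
        return None
    # GetACP takes no arguments and returns the code page identifier,
    # for example 936 for Simplified Chinese (GBK) or 65001 for UTF-8.
    return ctypes.windll.kernel32.GetACP()


print(active_code_page())
```

Running it in pvpython with and without the system-wide UTF-8 option enabled should show whether the setting actually reaches the ParaView process.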

What’s the full stack trace for that error?

ParaView master tracks VTK master pretty closely. For ParaView releases, there’s a branch in VTK to track that (mainly handled by ParaView developers rather than bothering the normal development process).