Resample to Image filter seems to be using a lot of memory

Andy_Bauer · December 20, 2018, 5:09pm

Hi,

I’m working with a multiblock dataset of rectilinear grids with blanking. It has 1024 blocks and two cell data arrays (one unsigned char array and one float array). The total number of cells is 8,245,157,400, and I’m running pvserver with 32 MPI processes. The memory inspector reports that pvserver is taking up less than 6%/6 GiB of the system memory on each of the 8 nodes I’m using (I’m running 4 MPI processes per node). When I apply the Resample to Image filter with Sampling Dimensions of [1500, 1200, 1200], resulting in 2,154,963,899 cells, the memory inspector reports that pvserver is now taking up between 27%/34 GiB and 52%/66 GiB of memory. Note that I’m still using the outline representation throughout, so there shouldn’t be any rendering operations taking a significant amount of memory anywhere. Any ideas on ways to reduce the memory footprint of the Resample to Image filter?

Thanks,
Andy
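
For scale, here is a rough back-of-the-envelope estimate of how large the resampled image itself should be. It is only a sketch under stated assumptions: that the output carries the two input arrays as one unsigned char (1 byte) and one float (4 bytes) value per point, plus a 1-byte vtkValidPointMask, and that the data is spread evenly over the 8 nodes. The variable names are illustrative only.

```python
# Sampling Dimensions from the post (these are point counts per axis).
dims = (1500, 1200, 1200)

points = dims[0] * dims[1] * dims[2]
cells = (dims[0] - 1) * (dims[1] - 1) * (dims[2] - 1)

# Assumed per-point payload in bytes: unsigned char + float + validity mask.
# This is an assumption about the output layout, not something the post states.
bytes_per_point = 1 + 4 + 1

total_bytes = points * bytes_per_point
nodes = 8
per_node_gib = total_bytes / nodes / 2**30

print(f"points: {points:,}")
print(f"cells:  {cells:,}")
print(f"estimated array payload: {total_bytes / 2**30:.2f} GiB total, "
      f"{per_node_gib:.2f} GiB per node")
```

Under these assumptions the array payload comes to a small fraction of the per-node growth reported above (from under 6 GiB to between 34 and 66 GiB), which suggests the extra memory is not the output arrays alone.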

cory.quammen · January 3, 2019, 11:02pm

I looked into this a bit and I’m not sure what would be causing the memory spike. My initial guess was that the entire output dataset was being created on each rank, but after some further investigation this does not appear to be the case. Another guess would be that some internal filter could be deleted after it is used but is retained instead.

Should we create an issue to investigate and potentially fix the memory usage for this filter?

wascott · January 4, 2019, 12:11am

Andy/Cory,
I’m also very interested in this bug. Would it be possible to replicate it with one of the sources, such as Unstructured Cell Types?

cory.quammen · January 4, 2019, 1:27am

Alan,

There seems to be some excessive memory use with the following pipeline:

ParaView client, connect to remote server with 4 ranks
Load can.ex2
D3
Resample To Image, resolution 500x500x500
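
The same pipeline could be scripted for repeatable testing. The following is a minimal pvpython sketch, not a verified reproduction script: the function name `build_pipeline`, the host, the port, and the file path are placeholders, and it assumes a pvserver with 4 ranks is already running and that the `paraview.simple` module is available.

```python
def build_pipeline(exodus_path, host="localhost", port=11111, resolution=500):
    """Rebuild the steps listed above: load, D3, Resample To Image.

    host, port and exodus_path are placeholders for your own setup.
    The import is done here so the module loads without ParaView installed.
    """
    from paraview.simple import Connect, OpenDataFile, D3, ResampleToImage

    # Connect to the already running remote server.
    Connect(host, port)

    # Load can.ex2 (or any other dataset) with the default reader settings.
    reader = OpenDataFile(exodus_path)

    # Redistribute the data across ranks.
    partitioned = D3(Input=reader)

    # Resample onto a regular image of resolution^3 points.
    resampled = ResampleToImage(Input=partitioned)
    resampled.SamplingDimensions = [resolution, resolution, resolution]

    # Force execution so memory use can be checked in the memory inspector.
    resampled.UpdatePipeline()
    return resampled


# Example call (path is a placeholder):
# build_pipeline("/path/to/can.ex2")
```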

wascott · January 4, 2019, 1:52am

OK, I just tried that, and on 16 ranks, using the memory inspector, we are using about 19 GBytes. Note that I didn’t turn any variables on (took the default).

So, we have 128 million points. We have 4 variables (Global Element Id (idtype), Global Pedigree Id (int), Object Id (idtype) and vtkValidPointMask (char)).

That looks really excessive. Does this say that we need 180’ish bytes for an idtype, int, idtype and char? Ouch!
The different ranks have different amounts of memory usage. Why? After D3 and then Resample to Image, rank 0 has 1.04 GBytes, rank 5 has 3.29 GBytes and rank 14 has 0.720 GBytes. Shouldn’t all ranks be using exactly the same amount of memory? We just created an image stack!
According to the timer log, the majority of the time is spent in vtkResampleToImage. Should I take a look and try to figure out why it’s taking so long? This is from rank 5.

Execute vtkResampleToImage id: 11297, 6.71772 seconds
do-delivery: ResampleToImage1(UniformGridRepresentation)/GridAxes
do-delivery: ResampleToImage1(UniformGridRepresentation)/OutlineRepresentation
do-delivery: ResampleToImage1(UniformGridRepresentation)/SelectionRepresentation/Selection
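
The bytes-per-point comparison above can be checked with a short calculation. This is a sketch under assumptions: vtkIdType is taken as 8 bytes (true only for builds with 64-bit ids), int as 4 bytes, char as 1 byte, "19 GBytes" is read as 19e9 bytes, and the point count is taken as 500^3 from the 500x500x500 resolution (slightly below the rounded 128 million quoted above).

```python
# Assumed element sizes in bytes for the four arrays named in the post.
array_bytes = {
    "Global Element Id (idtype)": 8,   # assumes 64-bit vtkIdType
    "Global Pedigree Id (int)": 4,
    "Object Id (idtype)": 8,           # assumes 64-bit vtkIdType
    "vtkValidPointMask (char)": 1,
}

points = 500 ** 3                      # 500x500x500 sampling resolution
payload_per_point = sum(array_bytes.values())
expected_total_bytes = points * payload_per_point

observed_total_bytes = 19e9            # "about 19 GBytes", read as decimal GB
observed_per_point = observed_total_bytes / points

print(f"payload per point:  {payload_per_point} bytes")
print(f"expected payload:   {expected_total_bytes / 1e9:.2f} GB")
print(f"observed per point: {observed_per_point:.0f} bytes")
print(f"observed/expected:  {observed_total_bytes / expected_total_bytes:.1f}x")
```

The exact per-point figure depends on how the 19 GBytes and the point count are rounded, but under these assumptions the observed usage is several times the raw array payload.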

Andy_Bauer · January 4, 2019, 2:21pm

I just took another look at the vtkPResampleToImage filter code and my wild, completely uninformed guess is that the high memory use is due to using DIY2. I just added an MR to VTK to clear out a member variable of the parent vtkResampleToImage class (https://gitlab.kitware.com/vtk/vtk/merge_requests/5027), but that’s not helping reduce the memory use at all.