# Script to take the mean of data-sets

Hi Everybody,
I’m studying a cyclic phenomenon and have 24 datasets for each period i (3 periods in total), so 72 datasets overall. I need to study the period-averaged data of this phenomenon, such that the averaged dataset representing time 0 is given by

```
mean_dataset(1) = (dataset(1) + dataset(25) + dataset(49)) / 3
```

where the sum is taken over the values of a scalar or vector field at the same x, y, z location (or node location; all the datasets have the same mesh).

I think this is a job for a script, since I don’t need to visualize anything.
For the average of N datasets (if I have to study N periods), I just have to read 2 datasets at a time (RAM limitation), take their sum divided by N, save the result in a temporary dataset, free up memory (except for the temporary dataset), load the next dataset, add it divided by N to the temporary dataset, save this new result in the temporary dataset, free up memory again, and so on.
I think this can work, but I don’t know how to access the field data in a consistent way so that the sum is taken at the same node location, or how to free up memory.
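The accumulation described above can be sketched in plain Python, independent of any ParaView API. This is only an illustration under stated assumptions: each field is a flat list of node values in the same node order, and `load_fields` is a hypothetical stand-in for reading one file at a time. It sums first and divides once at the end, which gives the same mean as adding each field divided by N.

```python
from typing import Iterable, Iterator, List, Sequence


def streaming_mean(fields: Iterable[Sequence[float]]) -> List[float]:
    """Average equally sized fields while holding only one of them at a time."""
    total: List[float] = []
    count = 0
    for field in fields:
        if count == 0:
            total = [0.0] * len(field)
        elif len(field) != len(total):
            raise ValueError("all fields must have the same number of nodes")
        # Add node by node; this assumes identical node ordering in every field.
        for i, value in enumerate(field):
            total[i] += value
        count += 1
    if count == 0:
        raise ValueError("no fields were provided")
    return [s / count for s in total]


def load_fields() -> Iterator[List[float]]:
    """Hypothetical loader: yields one field per period, lazily."""
    yield [1.0, 2.0, 3.0]  # e.g. dataset 1
    yield [3.0, 4.0, 5.0]  # e.g. dataset 25
    yield [5.0, 6.0, 7.0]  # e.g. dataset 49


mean_field = streaming_mean(load_fields())
print(mean_field)  # [3.0, 4.0, 5.0]
```

Because the loader is a generator, each field becomes eligible for garbage collection as soon as the next one is read, so only the accumulator and the current field are in memory.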

`TemporalStatistics`?

The problem is that I cannot read all time-steps at once…

The `TemporalStatistics` filter will not read in all time-steps at once. Rather, it will read in each time step one at a time and accumulate statistics like mean.

To apply temporal statistics, I have to load the datasets corresponding to the same phase angle individually and then apply the temporal statistics filter. So I’m opening all the files associated with the same phase angle at once.
I also have another big problem: my mesh oscillates periodically and, due to numerical errors, the meshes at the same phase angle are not exactly the same, nor are they written in the same way (the nodes are written in a random order at every time step). Can ParaView overcome this problem?

I think I am confused about what you are trying to do. Going back to your original post, you have 24 datasets, each with 3 periods. Putting the averaging across periods aside for the moment, it sounds like the processing of each of your 24 datasets is completely independent. If that’s the case, then you can write a Python script that iterates over the datasets, such as:

```
for ds in range(24):
    # Compute the mean of the 3 periods of dataset ds (to be discussed later)
    # Write result
    pass  # placeholder so the loop body is valid Python
```

Is this so far what you mean or am I missing something?

I have a total of 72 datasets, as stated in my original post. These 72 datasets consist of 3 groups of 24 datasets, each group representing one period of the periodic phenomenon I’m studying. So I have 24 datasets per period, for a total of 3 periods. I want to take the mean between datasets representing the same point in the phase domain.
Let’s number these 72 datasets from 1 to 72 in temporal order.
Datasets 1, 25 and 49 will be averaged to form averaged dataset number 1, corresponding to the 0 degree phase angle.
Datasets 2, 26 and 50 will be averaged to form averaged dataset number 2, corresponding to the 15 degree phase angle.

Datasets 24, 48 and 72 will be averaged to form averaged dataset number 24, corresponding to the 345 degree phase angle.
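The numbering above follows a simple rule, which a short sketch can make explicit. The function names are illustrative only; the 1-based numbering and the 24 phases per period come from the description above.

```python
from typing import List

PHASES_PER_PERIOD = 24  # 24 datasets per period, i.e. one every 15 degrees


def datasets_for_phase(phase: int, n_periods: int,
                       per_period: int = PHASES_PER_PERIOD) -> List[int]:
    """Return the 1-based dataset numbers to average for a 1-based phase index."""
    if not 1 <= phase <= per_period:
        raise ValueError("phase must be between 1 and per_period")
    return [phase + per_period * p for p in range(n_periods)]


def phase_angle(phase: int, per_period: int = PHASES_PER_PERIOD) -> float:
    """Phase angle in degrees for a 1-based phase index."""
    return 360.0 * (phase - 1) / per_period


for phase in (1, 2, 24):
    print(phase, phase_angle(phase), datasets_for_phase(phase, n_periods=3))
# 1 0.0 [1, 25, 49]
# 2 15.0 [2, 26, 50]
# 24 345.0 [24, 48, 72]
```

With more periods, only `n_periods` changes; the grouping rule stays the same.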

Two problems arise now:

1. How do I take the mean of these datasets in practice? I see no option in ParaView other than loading the 3 files at once and using the temporal statistics filter. I don’t want to read them all at once because in the future I will probably have 10 or 20 periods to average.

2. Due to numerical errors, the meshes corresponding to:
period 1, phase 0;
period 2, phase 0;
period 3, phase 0;

period 1, phase x;
period 2, phase x;
period 3, phase x;

are not the same for the same phase x. In theory they should be identical because the mesh movement is periodic, but in practice the meshes at the same phase angle differ by:
a) small numerical errors in the node coordinates
b) random ordering of the nodes in the mesh file.
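One possible way to deal with both (a) and (b) outside ParaView is to match nodes by their coordinates, snapped to a grid that is coarser than the numerical error but finer than the node spacing. The sketch below is not a ParaView feature, just an illustration; `match_nodes` and the tolerance value are assumptions. Snapping can fail for a node sitting right on a cell boundary, in which case a nearest-neighbour search (for example with `scipy.spatial.cKDTree`) would be more robust.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def _key(point: Point, tol: float) -> Tuple[int, ...]:
    """Snap a point to an integer grid of spacing tol."""
    return tuple(round(c / tol) for c in point)


def match_nodes(reference: Sequence[Point], other: Sequence[Point],
                tol: float) -> List[int]:
    """Return perm such that other[perm[i]] is the node matching reference[i]."""
    lookup: Dict[Tuple[int, ...], int] = {}
    for j, point in enumerate(other):
        key = _key(point, tol)
        if key in lookup:
            raise ValueError("two nodes fall in the same cell; reduce tol")
        lookup[key] = j
    perm = []
    for i, point in enumerate(reference):
        key = _key(point, tol)
        if key not in lookup:
            raise ValueError(f"no match found for reference node {i}")
        perm.append(lookup[key])
    return perm


# Same three nodes, written in a different order and with small coordinate errors.
mesh_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mesh_b = [(1.0000001, 0.0, 0.0), (0.0, 0.9999999, 0.0), (1e-8, 0.0, 0.0)]
field_b = [10.0, 20.0, 30.0]  # values stored in mesh_b's node order

perm = match_nodes(mesh_a, mesh_b, tol=1e-3)
field_b_aligned = [field_b[j] for j in perm]  # now in mesh_a's node order
print(perm, field_b_aligned)  # [2, 0, 1] [30.0, 10.0, 20.0]
```

Once every dataset of a given phase has been reordered to the node order of one reference mesh, the fields can be averaged node by node.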

Thank you

P.S.
Loading them individually seems to prevent ParaView from recognizing them as a time series. They are named casename_time, with time written as, for example, 0.34005.
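If the files have to be grouped by hand, the time value can be parsed out of each name and used to sort them. This is a minimal sketch assuming names of the form casename_time with an optional extension; the `wing` case name and the `.vtu` extension are hypothetical examples, not taken from the actual data.

```python
import re
from typing import List, Sequence, Tuple

# casename, underscore, a decimal time value, then an optional extension.
_NAME = re.compile(
    r"^(?P<case>.+)_(?P<time>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
    r"(?P<ext>\.[A-Za-z][A-Za-z0-9]*)?$"
)


def sort_by_time(names: Sequence[str]) -> List[Tuple[float, str]]:
    """Return (time, name) pairs sorted by the time encoded in each name."""
    parsed = []
    for name in names:
        match = _NAME.match(name)
        if match is None:
            raise ValueError(f"cannot parse a time value from {name!r}")
        parsed.append((float(match.group("time")), name))
    parsed.sort()
    return parsed


names = ["wing_0.34005.vtu", "wing_0.1.vtu", "wing_1.02.vtu"]  # hypothetical
ordered = sort_by_time(names)
print([name for _, name in ordered])
# ['wing_0.1.vtu', 'wing_0.34005.vtu', 'wing_1.02.vtu']
```

With the names in temporal order, the file at position k (counting from 0) belongs to phase k % 24, so the files for one phase angle can be collected and handed to a reader as an explicit series.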