How to modify a numpy.recarray using its two views
20-08-2019
Question
I am new to Python and NumPy, and I have a problem: I cannot modify a numpy.recarray through masked views. I read a recarray from a file, create two masked views, and then try to modify the values in a for loop. Here is some example code:
import numpy as np
import matplotlib.mlab as mlab
dat = mlab.csv2rec(args[0], delimiter=' ')
m_Obsr = dat.is_observed == 1
m_ZeroScale = dat[m_Obsr].scale_mean < 0.01
for d in dat[m_Obsr][m_ZeroScale]:
    d.scale_mean = 1.0
But when I write the result to a file:
newFile = args[0] + ".no-zero-scale"
mlab.rec2csv(dat[m_Obsr][m_ZeroScale], newFile, delimiter=' ')
all the scale_mean values in the file are still zero.
I must be doing something wrong. Is there a proper way of modifying values of the view? Is it because I am applying two views one by one?
Thank you.
Solution
I think you have a misconception about the term "masked views" and should (re-)read The Book (now freely downloadable) to clarify your understanding.
I quote from section 3.4.2:
Advanced selection is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean. Advanced selection always returns a copy of the data (contrast with basic slicing that returns a view).
What you're doing here is advanced selection (of the Boolean kind), so you get a copy and never bind it to a name: you make your changes on that copy, let it go away, and then write a fresh copy taken from the original.
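To see the difference for yourself, here is a minimal sketch. The four-row recarray is a made-up stand-in for the file contents (only the field names come from the question), and np.shares_memory is used to check whether a result still points at the original data:

```python
import numpy as np

# Hypothetical stand-in for the data read from the file;
# only the field names are taken from the question.
dat = np.rec.fromrecords(
    [(1, 0.0), (0, 0.0), (1, 0.5), (1, 0.001)],
    names="is_observed,scale_mean",
)

m_obsr = dat.is_observed == 1

# Boolean (advanced) selection builds a new array: a copy, not a view.
selected = dat[m_obsr]
selection_shares = np.shares_memory(selected, dat)

# Basic slicing, by contrast, returns a view onto the same memory.
sliced = dat[0:2]
slice_shares = np.shares_memory(sliced, dat)

# Writing to the copy leaves the original array untouched.
selected.scale_mean[:] = 1.0
original_values = dat.scale_mean.tolist()

print(selection_shares, slice_shares, original_values)
```

The selection does not share memory with dat, the slice does, and dat.scale_mean keeps its original values after the copy is changed.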
Once you understand the issue the solution should be simple: make your copy once, make your changes on that copy, and write that same copy. I.e.:
dat = mlab.csv2rec(args[0], delimiter=' ')
m_Obsr = dat.is_observed == 1
m_ZeroScale = dat[m_Obsr].scale_mean < 0.01
# Bind the copy to a name so the changes made below are kept
the_copy = dat[m_Obsr][m_ZeroScale]
for d in the_copy:
    d.scale_mean = 1.0
newFile = args[0] + ".no-zero-scale"
mlab.rec2csv(the_copy, newFile, delimiter=' ')
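If the goal is instead to change the values in dat itself, which is not what the code above does, one possible approach is to combine both conditions into a single mask over the full array and assign through the field. This is only a sketch: the sample data and the lower-case mask name are invented for illustration, and it relies on field access (dat.scale_mean) returning a view while indexed assignment writes in place:

```python
import numpy as np

# Hypothetical stand-in for the data read from the file.
dat = np.rec.fromrecords(
    [(1, 0.0), (0, 0.0), (1, 0.5), (1, 0.001)],
    names="is_observed,scale_mean",
)

# One Boolean mask over the full array, combining both conditions.
m_zero_scale = (dat.is_observed == 1) & (dat.scale_mean < 0.01)

# dat.scale_mean is a view of the field; assigning through an index
# expression writes into dat itself, so no copy is made here.
dat.scale_mean[m_zero_scale] = 1.0

result = dat.scale_mean.tolist()
print(result)
```

Only the observed rows with a near-zero scale_mean are changed; the unobserved row keeps its value.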