What is the most practical python data structure for a time series of grid data?

https://stackoverflow.com/questions/22759302

24-06-2023

Question

I am working with a time series of grid data:

'2014-01-01'

0 1 1 0 0 1
0 1 1 0 1 1
1 1 1 0 0 1
0 1 0 0 0 1
0 1 1 0 1 1

'2014-01-02'

0 1 0 0 0 1
0 1 1 0 1 1
1 0 1 0 1 1
0 1 0 0 0 1
1 1 1 0 1 0

etc. ...

I have been using 3D NumPy arrays, but I am not satisfied with handling the third dimension as datetime objects, and would perhaps like to use a pandas time series for this.

Is there a generally accepted best practice for working with such data in Python?

Update:

I used 1s and 0s above to not make things too messy, but my data are values (of different types) over a geospatial grid (WGS84 lon/lat) over time.

I would like to be able to perform computations on the data within a date range, either on the whole grid or on some slice of it (using grid indexing). The pandas time series is nice because it gives you the benefits of both datetime objects (with .month, .year, etc. attributes) and datetime64 scalar simplicity (e.g. data.date=1995).

Solution

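Maybe a structured array (or record array) would better suit your needs.

For example, if n is the number of entries you require, and assuming your numeric arrays are boolean:

import numpy as np
n = 1  # number of dated grids to store
# each record: a 10-byte date string plus a 5x6 boolean grid
series = np.zeros(n, dtype=[('date', '|S10'), ('data', 'b1', (5, 6))])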
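To illustrate how the date-range and grid-slice computations from the question might look with this approach, here is a minimal sketch. It makes one substitution that is an assumption on my part, not part of the suggestion above: the date field is stored as datetime64[D] rather than the '|S10' byte string, so that dates compare as dates. The names grid_dtype, series, in_range and window are purely illustrative.

```python
import numpy as np

# Assumed layout: one record per day, each holding a 5x6 boolean grid.
# datetime64[D] replaces the '|S10' string field (an assumption made here
# so that range comparisons work on real dates).
grid_dtype = np.dtype([("date", "datetime64[D]"), ("data", "b1", (5, 6))])

dates = np.arange("2014-01-01", "2014-01-04", dtype="datetime64[D]")
series = np.zeros(dates.size, dtype=grid_dtype)
series["date"] = dates

# Fill the first two days with the example grids from the question.
series["data"][0] = [[0, 1, 1, 0, 0, 1],
                     [0, 1, 1, 0, 1, 1],
                     [1, 1, 1, 0, 0, 1],
                     [0, 1, 0, 0, 0, 1],
                     [0, 1, 1, 0, 1, 1]]
series["data"][1] = [[0, 1, 0, 0, 0, 1],
                     [0, 1, 1, 0, 1, 1],
                     [1, 0, 1, 0, 1, 1],
                     [0, 1, 0, 0, 0, 1],
                     [1, 1, 1, 0, 1, 0]]

# Select a date range with a boolean mask on the date field.
in_range = ((series["date"] >= np.datetime64("2014-01-01"))
            & (series["date"] <= np.datetime64("2014-01-02")))
subset = series[in_range]

# Slice the grid (rows 1-2, columns 2-4) for every selected day,
# then count the True cells per day.
window = subset["data"][:, 1:3, 2:5]
counts = window.sum(axis=(1, 2))
print(counts)
```

Accessing series["data"] gives an ordinary 3D array of shape (days, 5, 6), so the usual NumPy slicing and reductions apply, while the date field stays attached to each grid.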

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow