esa_cci_sm

Tools to convert ESA CCI SM image files into a time series format.

Installation

This Python package and all required dependencies can be installed from PyPI via pip

pip install esa_cci_sm

On macOS, if you get "ImportError: Pykdtree failed to import its C extension", it might be necessary to install the pykdtree package from conda-forge

conda install -c conda-forge pykdtree

Data download

Download ESA CCI SM data files either from the public CEDA data store via HTTPS or from the CEDA FTP server, using a client such as FileZilla or wget

Host: anon-ftp.ceda.ac.uk (no user name or password required)
Directory: /neodc/esacci/soil_moisture

For example, the following command downloads the v9.1 COMBINED data for the year 2023

wget ftp://anon-ftp.ceda.ac.uk/neodc/esacci/soil_moisture/data/daily_files/COMBINED/v09.1/2023/*.nc
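
If several years or products are needed, the download pattern can be generated in a small script. The following is a minimal sketch that assumes the server layout follows the wget example above (product, then version, then year); the helper name ceda_ftp_url is hypothetical and not part of this package.

```python
def ceda_ftp_url(product: str, major: int, minor: int, year: int) -> str:
    """Build the FTP glob pattern for one year of daily files.

    The directory layout is an assumption taken from the wget example
    in this README; verify it on the server for other versions.
    """
    base = "ftp://anon-ftp.ceda.ac.uk/neodc/esacci/soil_moisture/data/daily_files"
    version = f"v{major:02d}.{minor}"  # e.g. 9, 1 -> "v09.1"
    return f"{base}/{product}/{version}/{year}/*.nc"


if __name__ == "__main__":
    # Print one pattern per year; each line can be passed to wget.
    for year in (2022, 2023):
        print(ceda_ftp_url("COMBINED", 9, 1, year))
```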

Time series creation

After installing this package via pip, you have access to a command line tool that converts ESA CCI SM image files to CF-compliant time series. We use the Orthogonal multidimensional array representation as implemented in the pynetCF Python library.

Note that we assume that the downloaded images are stored in yearly subfolders like

/tmp/img/
├── 1978/
│   ├── ESACCI-SOILMOISTURE-L3S-SSMV-PASSIVE-19781101000000-fv09.1.nc
│   ├── ESACCI-SOILMOISTURE-L3S-SSMV-PASSIVE-19781102000000-fv09.1.nc
│   ├── ...
...
├── 2023/
│   ├── ...
│   ├── ESACCI-SOILMOISTURE-L3S-SSMV-PASSIVE-20231231000000-fv09.1.nc
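
Before converting, it can help to check that every file sits in the yearly subfolder matching its date. The sketch below parses the file name pattern shown in the tree above and lists files that do not fit; it is an independent helper, not part of this package, and the names FILENAME_RE, parse_image_name and misplaced_files are hypothetical. Only the SSMV naming shown in this README is matched.

```python
import re
from datetime import datetime
from pathlib import Path

# Pattern derived from the example file names in this README (assumption:
# other products or versions may use a different naming scheme).
FILENAME_RE = re.compile(
    r"^ESACCI-SOILMOISTURE-L3S-SSMV-(?P<product>[A-Z]+)-"
    r"(?P<stamp>\d{14})-fv(?P<version>\d{2}\.\d+)\.nc$"
)


def parse_image_name(name):
    """Return product, date and version of an image file name, or None."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    return {
        "product": match["product"],
        "date": datetime.strptime(match["stamp"], "%Y%m%d%H%M%S"),
        "version": match["version"],
    }


def misplaced_files(root):
    """Yield .nc files whose parent folder is not the year in their name."""
    for path in sorted(Path(root).glob("*/*.nc")):
        info = parse_image_name(path.name)
        if info is None or path.parent.name != f"{info['date']:%Y}":
            yield path


if __name__ == "__main__":
    for path in misplaced_files("/tmp/img"):
        print("unexpected location:", path)
```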

The following command would then take the daily images from 1991 to 2023 in the path /tmp/img and convert the data for grid cells over land into time series. The time series are then stored in /tmp/ts.

ccism_reshuffle /tmp/img /tmp/ts 1991-01-01 2023-12-31 --land_points True

Afterwards, in Python, the data can be read as pandas DataFrames.

>>> from esa_cci_sm.interface import CCITs
>>> ds = CCITs("/tmp/ts", ioclass_kws={'read_bulk': True})
>>> ds.read(15, 45)  # lon, lat
                  sm  sm_uncertainty  flag  ...  mode  sensor            t0
1991-01-01  0.424880        0.094507     0  ...     1       2   7670.175000
1991-01-02       NaN             NaN    24  ...     2       2           NaN
1991-01-03       NaN             NaN     8  ...     0       2           NaN
...              ...             ...   ...  ...   ...     ...           ...
2023-12-29  0.495448        0.039983     0  ...     3   21536  19720.051575
2023-12-30  0.426107        0.055060     0  ...     3   16416  19721.147066
2023-12-31  0.390103        0.030294     0  ...     3   21600  19722.117129
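
Because the result is a regular pandas DataFrame, the usual pandas tools apply. The sketch below rebuilds two columns of the example output above by hand (so it runs without any data files) and keeps only observations with flag 0. It assumes that flag 0 marks observations without detected inconsistencies; check the product documentation for your version. The helper name unflagged_sm is hypothetical.

```python
import pandas as pd

# Values copied from the example output above (columns "sm" and "flag" only).
ts = pd.DataFrame(
    {
        "sm": [0.424880, float("nan"), float("nan"), 0.495448, 0.426107, 0.390103],
        "flag": [0, 24, 8, 0, 0, 0],
    },
    index=pd.to_datetime(
        ["1991-01-01", "1991-01-02", "1991-01-03",
         "2023-12-29", "2023-12-30", "2023-12-31"]
    ),
)


def unflagged_sm(df):
    """Return soil moisture where flag == 0 (assumed to mean 'no issue')."""
    return df.loc[df["flag"] == 0, "sm"].dropna()


good = unflagged_sm(ts)
print(good)
```

The same function could be applied to the DataFrame returned by ds.read(15, 45).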

Supported Products

At the moment this package supports ESA CCI soil moisture data versions 3 to 9 in netCDF format (reading and time series creation) with a spatial sampling of 0.25 degrees.

Contribute

We are happy if you want to contribute. Please raise an issue explaining what is missing or if you find a bug. We will also gladly accept pull requests against our master branch for new features or bug fixes.

Setup

Setup of a complete development environment with conda can be performed using the following commands:

git clone git@github.com:TUW-GEO/esa_cci_sm.git --recursive esa_cci_sm
cd ./esa_cci_sm
conda create -n esa_cci_sm python=3.12
conda activate esa_cci_sm
pip install -e .[testing]

To check out our test data files, you need to have Git LFS installed on your machine.

Guidelines

If you want to contribute please follow these steps:

Fork the esa_cci_sm repository to your account.
Clone the repository; make sure you use git clone ... --recursive to also get the test data repository.
Make a new feature branch from the esa_cci_sm master branch.
Add your feature.
Include tests for your contributions in one of the test directories. We use pytest, so a simple function called test_my_feature is enough.
Submit a pull request to our master branch.

Reading ESA CCI SM images

To read the ESA CCI image data, we recommend using Python tools such as xarray or netCDF4.

However, there are also reader classes provided in this package. These are mainly used by the image-to-timeseries conversion tool.

The raw ESA CCI SM netCDF files can be read in two ways.

import os
from datetime import datetime
from esa_cci_sm.interface import CCI_SM_025Img

# read several parameters
parameter = ['sm', 'sm_uncertainty']
# the class is initialized with the exact filename.
image_path = os.path.join(os.path.dirname(__file__), 'tests', 'esa_cci_sm-test-data',
                          'esa_cci_sm_dailyImages', 'v04.2', 'combined', '2016')
image_file = 'ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED-20160607000000-fv04.2.nc'
img = CCI_SM_025Img(os.path.join(image_path, image_file), parameter=parameter)

# reading returns an image object which contains a data dictionary
# with one array per parameter. The returned data is a global 0.25 degree
# image/array.
image = img.read()

All the ESA CCI SM data in a directory structure can be accessed by date. The filename is automatically built from the given date.

from esa_cci_sm.interface import CCI_SM_025Ds

parameter = 'sm'
img = CCI_SM_025Ds(data_path=os.path.join(os.path.dirname(__file__),
                                          'tests', 'esa_cci_sm-test-data',
                                          'esa_cci_sm_dailyImages',
                                          'v04.2', 'combined'),
                   parameter=parameter)

image = img.read(datetime(2016, 6, 7, 0))

To read all images between two dates, the esa_cci_sm.interface.CCI_SM_025Ds.iter_images() iterator can be used.
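
To illustrate how a date maps to a file in the yearly folder structure, the following sketch lists the expected file path for every day in a date range. It is written independently of the package and does not show how CCI_SM_025Ds works internally; the function name daily_image_paths and its default product and version are assumptions taken from the test file name used above.

```python
import os
from datetime import datetime, timedelta


def daily_image_paths(root, start, end, product="COMBINED", version="04.2"):
    """Yield (date, path) for each day from start to end, inclusive.

    The file name pattern and the yearly subfolders follow the examples
    in this README; other products or versions may differ.
    """
    day = start
    while day <= end:
        fname = (
            f"ESACCI-SOILMOISTURE-L3S-SSMV-{product}-"
            f"{day:%Y%m%d}000000-fv{version}.nc"
        )
        yield day, os.path.join(root, f"{day:%Y}", fname)
        day += timedelta(days=1)


if __name__ == "__main__":
    for day, path in daily_image_paths(
        "/tmp/img", datetime(2016, 6, 7), datetime(2016, 6, 9)
    ):
        # Report whether the expected file is present on disk.
        print(day.date(), path, os.path.exists(path))
```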

Variable names for ESA CCI Soil Moisture

ESA CCI SM variables as named in the netCDF image files (and in time series created from netCDF images) for different products and versions

short_name	Parameter	Units
dnflag	Day / Night Flag
flag	Flag
freqbandID*	Frequency Band Identification
lat	Latitude	[degrees_north]
lon	Longitude	[degrees_east]
mode	Satellite Mode
sensor	Sensor Flag
sm	Volumetric Soil Moisture	[m3 m-3]
sm_uncertainty	Volumetric Soil Moisture Uncertainty	[m3 m-3]
t0	Observation Timestamp	[days since 1970-01-01 00:00:00 UTC]
time	Time	[days since 1970-01-01 00:00:00 UTC]

* “freqbandID” is named “freqband” in older versions (before v3) of the data set.
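
The t0 values are given in fractional days since 1970-01-01 00:00:00 UTC. A minimal sketch for converting them to Python datetimes with the standard library follows; the helper name t0_to_datetime is hypothetical, and missing values (NaN) are returned as None.

```python
import math
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # reference date from the units in the table


def t0_to_datetime(t0_days):
    """Convert fractional days since 1970-01-01 to a naive UTC datetime."""
    if t0_days is None or math.isnan(t0_days):
        return None  # no observation on that day
    return EPOCH + timedelta(days=t0_days)


if __name__ == "__main__":
    # 7670.175 is the t0 value of the first row in the time series example.
    print(t0_to_datetime(7670.175))
```

For the value 7670.175 from the time series example, the result falls on 1991-01-01, which matches the index of that row.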