pys5p package
Subpackages
Submodules
pys5p.ckd_io module
This module contains the class CKDio.
- class pys5p.ckd_io.CKDio(ckd_dir: Path | None = None, ckd_version: int = 1, ckd_file: Path | None = None)[source]
Bases:
object
Read Tropomi CKD from the Static CKD product or from dynamic CKD products.
- Parameters:
ckd_dir (Path, default=Path('/nfs/Tropomi/share/ckd')) – Directory where the CKD files are stored
ckd_version (int, default=1) – Version of the CKD
ckd_file (Path, optional) – Full path to the CKD file. Default is None, in which case the CKD file is searched for in the directory ckd_dir, using ckd_version in the glob-string
Notes
Not all CKD are defined or derived for all bands. You can request a CKD for one band or for a channel (bands: ‘12’, ‘34’, ‘56’, ‘78’). Do not mix bands from different channels.
The option to have dynamic CKD is not used for the Tropomi mission; only for S/W version 1 is a dynamic CKD product defined. This product contained the OCAL CKD and was not updated automatically. For version 2, all CKD are stored in one product, where some CKD have a time axis to correct for in-flight degradation.
Therefore, the logic to find a CKD is implemented as follows:
ckd_dir, defines the base directory to search for the CKD products (see below).
ckd_file, defines the full path to the (static) CKD product; (version 1) any product with dynamic CKD has to be in the same directory.
Version 1:
Static CKD are stored in one file: glob('*_AUX_L1_CKD_*')
Dynamic CKD are stored in two files:
UVN, use glob('*_ICM_CKDUVN_*')
SWIR, use glob('*_ICM_CKDSIR_*')
Version 2+:
All CKD in one file: glob('*_AUX_L1_CKD_*')
Dynamic CKD are empty
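The file-search logic described above can be sketched with pathlib; the helper name and the exact glob handling are illustrative and not CKDio's actual implementation:

```python
from pathlib import Path
import tempfile

def find_static_ckd(ckd_dir: Path) -> Path:
    """Hypothetical helper mirroring the search logic described above.

    The static CKD (version 1) and the combined product (version 2+)
    both match '*_AUX_L1_CKD_*'; version-1 dynamic CKD live next to it
    as '*_ICM_CKDUVN_*' (UVN) and '*_ICM_CKDSIR_*' (SWIR).
    """
    files = sorted(ckd_dir.glob('*_AUX_L1_CKD_*'))
    if not files:
        raise FileNotFoundError(f'no static CKD product in {ckd_dir}')
    return files[-1]  # pick the latest product

# demonstrate with a dummy directory layout
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'S5P_OPER_AUX_L1_CKD_20180101.h5').touch()
    found = find_static_ckd(Path(tmp))
    print(found.name)
```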
- absirr(qvd: int = 1, bands: str = '78') Dataset [source]
Return absolute irradiance responsivity.
- Parameters:
qvd (int, default: 1) – Tropomi QVD identifier. Valid values are 1 or 2
bands (str, default: '78') – Tropomi bands [1..8] or channels [‘12’, ‘34’, ‘56’, ‘78’]
- absrad(bands: str = '78') Dataset [source]
Return absolute radiance responsivity.
- Parameters:
bands (str, default: '78') – Tropomi bands [1..8] or channels [‘12’, ‘34’, ‘56’, ‘78’]
- darkflux(bands: str = '78') Dataset [source]
Return dark-flux CKD, SWIR only.
- Parameters:
bands (str, default: '78') – Tropomi SWIR bands ‘7’, ‘8’ or both ‘78’
- dn2v_factors() ndarray [source]
Return digital number to Volt CKD, SWIR only.
Notes
The DN2V factor has no error attached to it.
- dpqf(threshold: float | None = None, bands: str = '78') Dataset [source]
Return detector pixel-quality flags (boolean), SWIR only.
- Parameters:
threshold (float, optional) – Value between [0..1], default is to read the threshold from CKD
bands (str, default='78') – Tropomi SWIR bands ‘7’, ‘8’, or both ‘78’
- Return type:
xarray.Dataset
- get_param(ds_name: str, band: str = '7') ndarray | float [source]
Return value(s) of a CKD parameter from the Static CKD product.
- Parameters:
ds_name (str) – Name of the HDF5 dataset, e.g. ’pixel_full_well’
band (str, default='7') – Band identifier ‘1’, ‘2’, …, ‘8’
- Returns:
CKD parameter value
- Return type:
numpy.ndarray or scalar
Notes
Datasets of size=1 are returned as a scalar
Handy function for scalar HDF5 datasets, such as:
dc_reference_temp
dpqf_threshold
pixel_full_well
pixel_fw_flag_thresh
- noise(bands: str = '78') Dataset [source]
Return readout-noise CKD, SWIR only.
- Parameters:
bands (str, default: '78') – Tropomi bands [1..8] or channels [‘12’, ‘34’, ‘56’, ‘78’]
- offset(bands: str = '78') Dataset [source]
Return offset CKD, SWIR only.
- Parameters:
bands (str, default: '78') – Tropomi SWIR bands ‘7’, ‘8’ or both ‘78’
- pixel_quality(bands: str = '78') Dataset [source]
Return detector pixel-quality mask (float [0, 1]), SWIR only.
- Parameters:
bands (str, default: '78') – Tropomi SWIR bands ‘7’, ‘8’ or both ‘78’
- prnu(bands: str = '78') Dataset [source]
Return Pixel Response Non-Uniformity (PRNU).
- Parameters:
bands (str, default: '78') – Tropomi bands [1..8] or channels [‘12’, ‘34’, ‘56’, ‘78’]
- relirr(qvd: int = 1, bands: str = '78') tuple[dict] | None [source]
Return relative irradiance correction.
- Parameters:
bands (str, default: '78') – Tropomi bands [1..8] or channels [‘12’, ‘34’, ‘56’, ‘78’]
qvd (int) – Tropomi QVD identifier. Valid values are 1 or 2, default: 1
- Returns:
CKD for relative irradiance correction as dictionaries with keys:
band: Tropomi spectral band ID
mapping_cols: coarse irregular mapping of the columns
mapping_rows: coarse irregular mapping of the rows
cheb_coefs: Chebyshev parameters for elevation and azimuth for pixels on a coarse irregular grid
- Return type:
dict
pys5p.error_propagation module
This module contains routines to divide or add uncertainties.
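As a rough sketch of what such routines compute, the standard first-order propagation formulas for independent errors look like this (the function names are illustrative, not the module's actual API):

```python
import numpy as np

def unc_sum(sigma_a, sigma_b):
    """Uncertainty of a + b (or a - b) for independent errors."""
    return np.sqrt(sigma_a ** 2 + sigma_b ** 2)

def unc_div(a, sigma_a, b, sigma_b):
    """Uncertainty of a / b: relative errors add in quadrature."""
    return np.abs(a / b) * np.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
```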
pys5p.get_data_dir module
This module contains routine get_data_dir.
- pys5p.get_data_dir.get_data_dir()[source]
Obtain directory with test datasets.
Limited to UNIX/Linux/macOS operating systems
- This module checks if the following directories are available:
/data/$USER/pys5p-data
/Users/$USER/pys5p-data
environment variable PYS5P_DATA_DIR
- It expects the data to be organized in the subdirectories:
CKD, which should contain the SWIR dpqf CKD
OCM, which should contain at least one directory of an on-ground calibration measurement with one or more OCAL LX products
L1B, which should contain at least one offline calibration, irradiance and radiance product
ICM, which should contain at least one in-flight calibration product
pys5p.icm_io module
pys5p.l1b_io module
pys5p.l1b_patch module
pys5p.lv2_io module
This module contains the class LV2io to access Tropomi level-2 products.
- class pys5p.lv2_io.LV2io(lv2_product: Path)[source]
Bases:
object
A class to read Tropomi Level-2 (offline) products.
- Parameters:
lv2_product (Path) – full path to S5P Tropomi level 2 product
Notes
The Python h5py module can read the operational netCDF4 products without any problems; however, the SRON science products contain incompatible attributes. This should be fixed when more up-to-date netCDF software is used to generate the products. Currently, the Python netCDF4 module is used to read the science products.
- property algorithm_version: str | None
Return version of the level 2 algorithm.
- property coverage_time: tuple
Return start and end of the measurement coverage time.
- property creation_time: str
Return creation date/time of the level 2 product.
- get_attr(attr_name: str, ds_name: str | None = None) ndarray | None [source]
Obtain value of an HDF5 file attribute or dataset attribute.
- Parameters:
attr_name (str) – name of the attribute
ds_name (str, optional) – name of dataset, default is to read the product attributes
- get_data_as_xds(name: str, data_sel: tuple[slice | int] | None = None) DataArray [source]
Read dataset from the PRODUCT/target_product group.
- Parameters:
name (str) – name of dataset with level 2 data
data_sel (numpy slice) – a 3-dimensional numpy slice: time, scan_line, ground_pixel
- Return type:
xarray.DataArray
- get_dataset(name: str, data_sel: tuple[slice | int] | slice | None = None, fill_as_nan: bool = True) ndarray [source]
Read level 2 dataset from PRODUCT group.
- Parameters:
name (string) – name of dataset with level 2 data
data_sel (numpy slice) – a 3-dimensional numpy slice: time, scan_line, ground_pixel
fill_as_nan (boolean) – Replace (float) FillValues with NaNs when True
- Return type:
numpy.ndarray
- get_geo_bounds(extent: list[float, float, float, float] | None, data_sel: tuple[slice | int] | None) ndarray | tuple [source]
Return bounds of latitude/longitude as a mesh for plotting.
- Parameters:
extent (list) – Select data to cover a region with geolocation defined by [lon_min, lon_max, lat_min, lat_max]; a matching numpy slice is returned
data_sel (numpy slice) – a 3-dimensional numpy slice: time, scan_line, ground_pixel. Note that ‘data_sel’ will be overwritten when ‘extent’ is defined
- Returns:
data_sel (numpy slice) – Select slice of data which covers geolocation defined by extent. Only provided if extent is not None.
out (dictionary) – With numpy arrays for latitude and longitude
- get_geo_data(geo_dsets: str | None = None) dict [source]
Return data of selected datasets from the GEOLOCATIONS group.
- Parameters:
geo_dsets (str, optional) –
Name(s) of datasets, comma separated. Default:
operational: ‘latitude,longitude’
science: ‘latitude_center,longitude_center’
- Returns:
dictionary with arrays of selected datasets
- Return type:
dict
- property orbit: int
Return reference orbit number.
- property processor_version: str | None
Return version of the level 2 processor.
- property product_version: str
Return version of the level 2 product.
- property ref_time: datetime | None
Return reference start time of measurements.
- property science_product: bool
Check if product is a science product.
pys5p.ocm_io module
This module contains the class OCMio to access on-ground calibration data.
- class pys5p.ocm_io.OCMio(ocm_product: Path)[source]
Bases:
object
A class to read Tropomi on-ground calibration products (Lx).
- Parameters:
ocm_product (Path) – Full path to on-ground calibration measurement
- get_attr(attr_name) ndarray | None [source]
Obtain value of an HDF5 file attribute.
- Parameters:
attr_name (string) – name of the attribute
- get_coverage_time() tuple[str, str] [source]
Return start and end of the measurement coverage time.
- get_delta_time() ndarray | None [source]
Return offset from the reference start time of measurement.
- get_exposure_time() ndarray | None [source]
Return the exact pixel exposure time of the measurements.
- get_msm_attr(msm_dset: str, attr_name: str) str | None [source]
Return attribute of measurement dataset “msm_dset”.
- Parameters:
msm_dset (str) – name of measurement dataset
attr_name (str) – name of the attribute
- Returns:
value of attribute “attr_name”
- Return type:
scalar or numpy.ndarray
- get_msm_data(msm_dset: str, fill_as_nan: bool = True, frames: list[int, int] | None = None, columns: list[int, int] | None = None) dict [source]
Return data of measurement dataset msm_dset.
- Parameters:
msm_dset (str) – name of measurement dataset; if msm_dset is None, the names of the available datasets are shown
columns ([i, j]) – Slice data on fastest axis (columns) as, from index ‘i’ to ‘j’
frames ([i, j]) – Slice data on slowest axis (time) as, from index ‘i’ to ‘j’
fill_as_nan (boolean) – replace (float) FillValues with NaNs
- Returns:
Python dictionary with names of msm_groups as keys
- Return type:
dict
- read_direct_msm(msm_dset: str, dest_sel: tuple[slice | int] | None = None, dest_dtype: type[Any] | None = None, fill_as_nan: bool = False) dict | None [source]
Return data of measurement dataset msm_dset (fast implementation).
- Parameters:
msm_dset (string) – Name of measurement dataset
dest_sel (numpy slice) – Selection must be the output of numpy.s_[<args>].
dest_dtype (numpy dtype) – Perform type conversion
fill_as_nan (boolean) – Replace (float) FillValues with NaNs when True
- Returns:
Python dictionary with names of msm_groups as keys
- Return type:
dict
- select(ic_id: int | None = None, *, msm_grp: str | None = None) int [source]
Select a measurement as BAND%/ICID_<ic_id>_GROUP_%.
- Parameters:
ic_id (int) – used as “BAND%/ICID_{}_GROUP_%”.format(ic_id)
msm_grp (str) – select measurement group with name msm_grp
When both ic_id and msm_grp are None, all measurement groups are shown
- Returns:
Number of measurements found
- Return type:
scalar
Notes
- Updated object attributes:
bands : available spectral bands
pys5p.rls module
Implementation of the Relative Least-Squares regression (RLS).
The RLS regression is used to find the linear dependence y(x) = c0 + c1 * x that best describes the data before and after correction. The least-squares sum uses absolute residuals y_i - (c0 + c1 * x_i) divided by the expected signals c1 * x_i. Offset c0 has an arbitrary size and should not affect the fit result. Weight factors are chosen to effectively spread the data points evenly over the whole range of x, making the result less sensitive to the actual spacing between the data points.
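One simple way to realise such a relative-residual fit is to divide the model by x and solve an ordinary least-squares problem. The sketch below is a simplified stand-in, not the exact pys5p algorithm, which adds its own weighting scheme and error estimates:

```python
import numpy as np

def rls_fit_sketch(xdata: np.ndarray, ydata: np.ndarray):
    """Fit y = c0 + c1*x with residuals divided by x (sketch)."""
    # dividing the model by x turns the relative-residual problem into
    # ordinary least squares in transformed variables:
    #   y/x = c1 + c0 * (1/x)
    # polyfit returns [slope, intercept] = [c0, c1]
    c0, c1 = np.polyfit(1.0 / xdata, ydata / xdata, 1)
    return c0, c1

# exact synthetic data recovers the coefficients exactly
x = np.array([1.0, 2.0, 3.0, 4.0])
c0, c1 = rls_fit_sketch(x, 2.0 + 3.0 * x)
```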
- pys5p.rls.rls_fit(xdata: ndarray, ydata: ndarray | MaskedArray) tuple [source]
Perform RLS regression finding linear dependence y(x) = c0 + c1 * x.
- Parameters:
xdata (ndarray, shape (M,)) – X-coordinates of the M sample points (xdata[i], ydata[…, i]) The array values have to be positive and increasing
ydata (MaskedArray or ndarray, shape (..., M)) – Y-coordinates of the sample points
- Returns:
c0, c1, std_c0, std_c1 – coefficients of the linear dependence and their standard deviations
- Return type:
tuple of ndarrays
Notes
Calling a rls-function with MaskedArrays is much slower than with plain ndarrays.
The coefficients are set to NaN when the number of samples is less than 2.
The standard deviations can only be calculated when the number of samples is larger than two; otherwise the standard deviations are set to zero.
- pys5p.rls.rls_fit0(xdata: ndarray, ydata: ndarray | MaskedArray) tuple [source]
Perform RLS regression finding linear dependence y(x) = c1 * x.
- Parameters:
xdata (ndarray, shape (M,)) – X-coordinates of the M sample points (xdata[i], ydata[…, i]) The array values have to be positive and increasing
ydata (MaskedArray or ndarray, shape (..., M)) – Y-coordinates of the sample points
- Returns:
c1, std_c1 – coefficients of the linear dependence and their standard deviations
- Return type:
tuple of ndarrays
Notes
The coefficients are set to NaN when the number of samples is less than 2.
The standard deviations can only be calculated when the number of samples is larger than two; otherwise the standard deviations are set to zero.
pys5p.s5p_msm module
This module contains the class S5Pmsm.
Warning
Deprecated; this module is no longer maintained.
- class pys5p.s5p_msm.S5Pmsm(dset: Dataset | ndarray, data_sel: tuple[slice | int] | None = None, datapoint: bool = False)[source]
Bases:
object
A class to hold an HDF5 dataset and its attributes.
- Parameters:
dset (h5py.Dataset or ndarray) – h5py dataset from which the data is read; its data is used to initialize the S5Pmsm object
data_sel (numpy slice) – a numpy slice generated with, for example, numpy.s_
datapoint (bool) – indicates that the dataset is a compound of type datapoint
- Returns:
numpy structure with dataset data and attributes, including data,
fillvalue, coordinates, units, …
- biweight(data_sel: tuple[slice | int] | None = None, axis: int = 0, keepdims: bool = False) S5Pmsm [source]
Reduce this S5Pmsm data by applying biweight along some dimension.
- Parameters:
data_sel (numpy slice) – A numpy slice generated with, for example, numpy.s_. Can be used to skip the first and/or last frame
axis (int, default=0) – Axis or axes along which the biweight medians are computed.
keepdims (bool, default=False) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
- Returns:
S5Pmsm object with its data (value & error) replaced by its biweight
medians along one axis. The coordinates are adjusted, accordingly.
- concatenate(msm: S5Pmsm, axis: int = 0) S5Pmsm [source]
Concatenate two measurement datasets, the current with another.
- Parameters:
msm (pys5p.S5Pmsm) – an S5Pmsm object
axis (int, default=0) – The axis for which the array will be joined.
- Returns:
The data of the new dataset is concatenated to the existing data along
an existing axis. The affected coordinate is also extended.
Note: the arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
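The shape rule in the note is numpy's own concatenation rule, which S5Pmsm.concatenate follows for its value (and error) arrays:

```python
import numpy as np

a = np.zeros((2, 5))
b = np.ones((3, 5))              # same shape except along axis 0
c = np.concatenate((a, b), axis=0)
print(c.shape)                   # (5, 5)
```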
- fill_as_nan()[source]
Replace fillvalues in data with NaNs.
Works only on datasets with HDF5 datatype ‘float’ or ‘datapoints’
- nanmean(data_sel: tuple[slice | int] | None = None, axis: int = 0, keepdims: bool = False) S5Pmsm [source]
Reduce this S5Pmsm data by applying mean along some dimension.
- Parameters:
data_sel (numpy slice, optional) – A numpy slice generated with, for example, numpy.s_. Can be used to skip the first and/or last frame
axis (int, default=0) – Axis or axes along which the means are computed.
keepdims (bool, default=False) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
- Returns:
S5Pmsm object with its data (value & error) replaced by its nanmean
and standard deviation along one axis.
The coordinates are adjusted, accordingly.
- nanmedian(data_sel: tuple[slice | int] | None = None, axis: int = 0, keepdims: bool = False) S5Pmsm [source]
Reduce this S5Pmsm data by applying median along some dimension.
- Parameters:
data_sel (numpy slice, optional) – A numpy slice generated with, for example, numpy.s_. Can be used to skip the first and/or last frame
axis (int, default=0) – Axis or axes along which the medians are computed.
keepdims (bool, default=False) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
- Returns:
S5Pmsm object with its data (value & error) replaced by its nanmedian
and standard deviation along one axis.
The coordinates are adjusted, accordingly.
- nanpercentile(vperc: int | list[float], data_sel: tuple[slice | int] | None = None, axis: int = 0, keepdims: bool = False) S5Pmsm [source]
Return percentile(s) of the data in the S5Pmsm.
- Parameters:
vperc (int or list of float) – Percentile or sequence of percentiles to compute; supply at least one and at most three values in the range [0, 100].
data_sel (numpy slice) – A numpy slice generated with, for example, numpy.s_. Can be used to skip the first and/or last frame
axis (int, default=0) – Axis or axes along which the percentiles are computed.
keepdims (bool, default=False) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
- Returns:
S5Pmsm object with the original data replaced by the percentiles along
one of the axis, see below. The coordinates are adjusted, accordingly.
You should supply at least one percentile and at most three:
- vperc is an ‘int’ or len(vperc) == 1: ‘value’ is replaced by its (nan-)percentile vperc; ‘error’ is unchanged
- len(vperc) == 2: vperc is sorted; ‘value’ is replaced by its (nan-)median; ‘error’ is replaced by percentile(‘value’, (vperc[0], vperc[1]))
- len(vperc) == 3: vperc is sorted; ‘value’ is replaced by percentile(‘value’, vperc[1]); ‘error’ is replaced by percentile(‘value’, (vperc[0], vperc[2]))
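The three vperc cases map onto plain numpy.nanpercentile calls, for example:

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

# int or len(vperc) == 1: 'value' becomes that percentile
value = np.nanpercentile(data, 50)

# len(vperc) == 2: 'value' becomes the median, 'error' the two percentiles
lo, hi = np.nanpercentile(data, [25, 75])

# len(vperc) == 3: 'value' becomes the middle one, 'error' the outer pair
p16, p50, p84 = np.nanpercentile(data, [16, 50, 84])
```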
- set_coords(coords_data: list[ndarray], coords_name: list[str] | None = None)[source]
Set coordinates of data.
- Parameters:
coords_data (list of ndarrays) – list with coordinates data for each dimension
coords_name (list of strings) – list with the names of each dimension
- set_coverage(coverage: tuple[str, str], force: bool = False)[source]
Set the coverage attribute, as (coverageStart, coverageEnd).
- Parameters:
coverage (tuple[str, str]) –
force (bool, default=False) – overwrite when force is true
Notes
Both elements are expected to be datetime objects.
- set_long_name(name: str, force: bool = False)[source]
Set the long_name attribute, overwrite when force is true.
- set_units(units: str | None, force: bool = False)[source]
Set the units attribute, overwrite when force is true.
pys5p.swir_region module
Return the usable area on the SWIR detector.
There are two definitions:
'illuminated':
Detector area illuminated by external sources, defined as
a rectangular area where the signal is at least 50% of the
maximum signal. Coordinates: rows [11:228], columns [16:991].
'level2':
A smaller area used in official SWIR level 1B (ir)radiance
products. Coordinates: rows [12:227], columns [20:980].
Notes
Row 257 of the SWIR detector is neglected.
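Using the coordinates quoted above, a minimal sketch of both functions could look like the following, assuming a 256 x 1000 pixel grid for both bands combined (row 257 neglected, per the note above); this is an illustration, not the module's actual implementation:

```python
import numpy as np

def coords_sketch(mode: str = 'illuminated'):
    """Row/column slices of the usable SWIR area (both bands)."""
    if mode == 'level2':
        return np.s_[12:227, 20:980]
    return np.s_[11:228, 16:991]

def mask_sketch(mode: str = 'illuminated') -> np.ndarray:
    """Boolean mask: True for pixels inside the usable area."""
    msk = np.zeros((256, 1000), dtype=bool)
    msk[coords_sketch(mode)] = True
    return msk

m = mask_sketch()
```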
- pys5p.swir_region.coords(mode: str = 'illuminated', band: str = '78') tuple[slice, slice] [source]
Return slice defining the illuminated region on the SWIR detector.
- Parameters:
mode ({'illuminated', 'level2'}, optional) – default is ‘illuminated’
band (str, optional) – select band 7 or 8, default is both bands
- pys5p.swir_region.mask(mode: str = 'illuminated', band: str = '78') ndarray [source]
Return mask of the illuminated region.
- Parameters:
mode ({'illuminated', 'level2'}, optional) – default is ‘illuminated’
band (str, optional) – select band 7 or 8, default is both bands
Notes
Pixels within the illuminated region are set to True.
pys5p.swir_texp module
Calculate the Tropomi SWIR exposure time from detector settings.
- pys5p.swir_texp.swir_exp_time(int_delay: int, int_hold: int) float [source]
Calculate the correct SWIR exposure time from detector settings.
- Parameters:
int_delay (int) – parameter int_delay from the instrument_settings
int_hold (int) – parameter int_hold from the instrument_settings
- Returns:
exact (SWIR) pixel exposure time
- Return type:
float
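A sketch of the conversion from detector settings to exposure time; the 1.25 µs master-clock period and the 65540 offset used below are assumptions, so check the pys5p source for the authoritative constants:

```python
def swir_exp_time_sketch(int_delay: int, int_hold: int) -> float:
    """SWIR pixel exposure time in seconds (sketch).

    Assumes a 1.25 us master-clock period and a fixed register
    offset of 65540; both constants are assumptions here.
    """
    return 1.25e-6 * (65540 - int_delay + int_hold)
```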
pys5p.version module
Provide access to the software version as obtained from git.
Module contents
This is the SRON Python package pys5p.
It contains software to read Sentinel-5p Tropomi ICM, L1B and L2 products.