General

About

The TOARgridding tool projects data from the TOAR-II database (https://toar-data.fz-juelich.de/) onto a grid. The user can select the

  • variable,

  • statistical aggregation,

  • time period,

  • rectangular lat-lon grid of custom resolution,

  • (optional) filtering according to the station metadata,

  • and (optional) a data aggregation mode at each station.

Base URL

https://toar-data.fz-juelich.de/api/v2/toargridding

Response: Description and documentation of available REST services (this document).

Services

The TOARgridding service currently offers four services:

  • test_auth: Checks whether your user rights allow you to obtain gridded products

  • reg-lat-lon-grid: request a gridded product with a regular lat-lon grid and custom resolution

  • status: Check the current status of your query

  • result: This hidden endpoint allows you to download your results. status will automatically redirect you here.

Query arguments

To control the database queries, and hence the response of the TOARgridding REST service, you can add arguments to the service URL. These arguments must adhere to the format <argument_name>=<value>. The first argument is preceded by a ‘?’ character and all further arguments are separated by ‘&’ characters.
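As a minimal sketch of this format, Python's standard library can assemble the query string. The base URL and argument names are taken from this document; the helper name build_url and the trailing slash after the service name are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://toar-data.fz-juelich.de/api/v2/toargridding"


def build_url(service: str, arguments: dict) -> str:
    """Append arguments as ?name=value&name=value to the service URL."""
    # Skip unset (None) arguments so they do not appear in the URL.
    given = {name: value for name, value in arguments.items() if value is not None}
    query = urlencode(given)
    url = f"{BASE_URL}/{service}/"
    return f"{url}?{query}" if query else url


url = build_url(
    "reg_lat_lon_grid",
    {"sampling": "daily", "lat_res": 1.0, "lon_res": 2.0},
)
print(url)
```

Note that urlencode percent-encodes reserved characters, so a comma inside a value is sent as %2C.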

Response format

The response is asynchronous. Rather than receiving your requested result, you will receive a unique task identifier for your request. This ID can be used to check the status of your request. Once your result is ready, the ID will redirect you to it. This approach has been chosen because the queries are known to take a long time to process.
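The polling pattern this implies can be sketched as follows. This is an illustration only: fetch is a hypothetical callable standing in for an HTTP request to the status link, and it is assumed to return a dict while the task is pending and raw bytes once the redirect delivers the result. The default interval of 15 minutes mirrors the update interval given in the example status output in this document.

```python
import time
from typing import Callable, Union


def wait_for_result(
    fetch: Callable[[], Union[dict, bytes]],
    interval_s: float = 15 * 60,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll until fetch() returns the result payload instead of a status dict."""
    for attempt in range(max_polls):
        reply = fetch()
        if isinstance(reply, bytes):
            return reply  # the task finished and the result was delivered
        if attempt < max_polls - 1:
            sleep(interval_s)  # task still pending: wait before asking again
    raise TimeoutError(f"no result after {max_polls} status checks")


# Demonstration with a stand-in for the service:
# two pending replies, then the (fake) result payload.
replies = iter([{"task_status": "RUNNING"}, {"task_status": "RUNNING"}, b"netcdf-bytes"])
result = wait_for_result(lambda: next(replies), sleep=lambda seconds: None)
```

Passing sleep as a parameter keeps the sketch testable without waiting; in a real script the default time.sleep would be used.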

Description of Services

test_auth:

This endpoint can be used to check whether you are permitted to use the TOARgridding service. It also allows you to check whether the OpenID Connect token has been passed correctly through the header.

Gridding

TOARgridding uses a user-defined grid to combine all stations within a cell. The resulting dataset reports the per-cell mean, standard deviation and number of stations.
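To illustrate the idea, the following sketch bins a few stations into grid cells and computes the three reported quantities. It is not the service's implementation: the station coordinates and values are made up, the cell-edge convention (counting from the south-west corner) and the use of the population standard deviation are assumptions.

```python
import math
from collections import defaultdict
from statistics import fmean, pstdev


def grid_stations(stations, lat_res, lon_res):
    """Group (lat, lon, value) tuples by grid cell and return per-cell statistics."""
    cells = defaultdict(list)
    for lat, lon, value in stations:
        # Cell indices count from the south-west corner (-90, -180);
        # this edge convention is an assumption made for the sketch.
        row = math.floor((lat + 90.0) / lat_res)
        col = math.floor((lon + 180.0) / lon_res)
        cells[(row, col)].append(value)
    return {
        cell: {"mean": fmean(values), "std": pstdev(values), "n": len(values)}
        for cell, values in cells.items()
    }


# Three hypothetical stations; the first two fall into the same cell.
stations = [(50.9, 6.4, 30.0), (50.2, 7.9, 40.0), (48.1, 11.6, 35.0)]
grid = grid_stations(stations, lat_res=1.0, lon_res=2.0)
for cell, stats in sorted(grid.items()):
    print(cell, stats)
```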

Regular lat-lon grid: reg_lat_lon_grid

The first supported grid is a regular latitude-longitude grid covering the whole world. It is available via the following endpoint: https://toar-data-dev.fz-juelich.de/api/v2/toargridding/reg_lat_lon_grid/

This endpoint has a number of required and optional query parameters. The required parameters are:

| Name | Type | Description |
|------|------|-------------|
| daterange | string | Date range in the format YYYY-MM-DD,YYYY-MM-DD or YYYY-MM-DDThh:mm:ss,YYYY-MM-DDThh:mm:ss, with the first date being the start date and the second the end date. A time of day is accepted for compatibility reasons but is ignored when creating the gridded product. |
| sampling | string | Temporal sampling of the data. Possible values for TOARgridding are 'daily', 'monthly', 'annual'. |
| lat_res | float | The latitude resolution of the grid in degrees. |
| lon_res | float | The longitude resolution of the grid in degrees. |
| statistic | string | Statistical aggregation of the data before gridding. |

Last but not least, we require a variable name. This can be provided either as variable_name or as variable_id. Providing both at the same time is not allowed. Both values follow the definition within the TOAR-II database.
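A client-side check of these rules before sending a request might look like the sketch below. The function name check_arguments is hypothetical, and the checks only reflect what is stated above; the service may apply further validation of its own.

```python
from datetime import datetime

ALLOWED_SAMPLING = {"daily", "monthly", "annual"}


def check_arguments(args: dict) -> None:
    """Raise ValueError if the required arguments are missing or inconsistent."""
    for name in ("daterange", "sampling", "lat_res", "lon_res", "statistic"):
        if args.get(name) is None:
            raise ValueError(f"missing required argument: {name}")
    if args["sampling"] not in ALLOWED_SAMPLING:
        raise ValueError(f"unsupported sampling: {args['sampling']}")
    # daterange is "start,end"; an optional time of day is accepted as well.
    start, end = (datetime.fromisoformat(part) for part in args["daterange"].split(","))
    if start > end:
        raise ValueError("start date lies after end date")
    # Exactly one of variable_name and variable_id may be given.
    has_name = args.get("variable_name") is not None
    has_id = args.get("variable_id") is not None
    if has_name == has_id:
        raise ValueError("provide exactly one of variable_name and variable_id")


arguments = {
    "daterange": "2013-01-01,2013-01-08",
    "sampling": "daily",
    "lat_res": 1.0,
    "lon_res": 2.0,
    "statistic": "mean",
    "variable_name": "o3",
}
check_arguments(arguments)  # passes silently
```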

As optional metadata fields, we support

| Name | Default value | Description |
|------|---------------|-------------|
| station metadata | different | TOARgridding also supports filtering according to all other metadata fields supported by the analysis service, except for the daterange, sampling, mergedTS flag and statistics, which are explicitly controlled by this service. |
| data_quality_flags | AllOK | Data quality flags to be considered for the gridding. The flags are separated by commas. The default is to consider only data that passed all checks, including the automated check of the TOAR data infrastructure. |
| data_aggregation_mode | string | This allows you to influence the data aggregation at each station. Details can be found in this document. |

Python Example:

If you want to use the full capabilities of the service, you can create scripts, for example in Python:

import requests
import liboidcagent as agent

oidc_shortname = "my_shortname"

required_arguments = {
    "daterange" : "2013-01-01,2013-01-08",
    "sampling" : "daily",
    "lat_res" : 1.0,
    "lon_res" : 2.0,
    "statistic" : "mean",
    "variable_name" : "o3",
    "variable_id" : None # as we provide a variable_name
    }

#additional metadata for station filtering:
optional_arguments = {
    "type_of_area" : "Rural",
}

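# obtain an access token from the locally configured OIDC agent
headers = {"AccessToken" : agent.get_access_token(oidc_shortname)}

# merge both dictionaries (the | operator requires Python 3.9 or newer)
req = required_arguments | optional_arguments
response = requests.get("https://toar-data-dev.fz-juelich.de/api/v2/toargridding/reg-lat-lon-grid/", params=req, headers=headers)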

Here, we obtain an access token for the Helmholtz ID from the OIDC agent, which has to be set up on your system.

Detailed Description of request parameters

Data aggregation mode:

In addition to extracting all time series from the TOAR-II database, TOARgridding offers two further options for processing time series recorded at the same station.

A brief reminder on timeseries and stations

The TOAR-II database uses timeseries, each of which is associated with a station. One or more physical sensors are mounted at an individual station. These can measure different variables or, in some cases, the same variable using different techniques. A station may also be part of different networks that contribute data to the TOAR-II database. A more detailed description of the included data can be found in Chapter Three: The TOAR data processing workflow. In the case of gridding, multiple timeseries at one station can lead to systematic errors. For example, a station’s statistical weight increases if it contributes twice.

Station averaging

The time series at each station can be averaged before gridding. This results in the same statistical weight for each station. Depending on the calculated statistical aggregates, this can introduce or remove systematic errors in the data analysis. This option is enabled by setting data_aggregation_mode="meanTSByStation". It is an alternative to the merged timeseries, so the two operations cannot be combined.
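The effect on the statistical weight can be shown with a small sketch. The station identifiers and values are invented, each value stands for the aggregate of one timeseries at a single time step, and the code only illustrates the principle, not the service's internal processing.

```python
from collections import defaultdict
from statistics import fmean


def mean_by_station(timeseries):
    """Average the values of all timeseries that share a station."""
    by_station = defaultdict(list)
    for station_id, value in timeseries:
        by_station[station_id].append(value)
    return {station: fmean(values) for station, values in by_station.items()}


# Hypothetical grid cell: station "A" contributes two timeseries, station "B" one.
timeseries = [("A", 30.0), ("A", 34.0), ("B", 50.0)]

# Without station averaging, station "A" enters the cell mean twice.
cell_mean_all = fmean(value for _, value in timeseries)

# With station averaging, "A" and "B" each enter the cell mean once.
cell_mean_by_station = fmean(mean_by_station(timeseries).values())

print(cell_mean_all, cell_mean_by_station)
```

In this example the cell mean shifts towards station "B" once both stations carry equal weight.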

Merged Timeseries

The TOAR data infrastructure introduced a feature called timeseries_merged as an alternative to the classical timeseries extraction. The merging process is described in more detail in the official documentation (TODO: add link). TOARgridding can use this feature by setting data_aggregation_mode="mergedTS".

status

This endpoint enables you to check the current status of your request by entering its unique ID. Requests that have been accepted also redirect to this endpoint. The following is an example of the output:

{
    "task_id"     : "175405xe-8136-4c59-85e8-2764720bf735",
    "status"      : "https://toar-data.fz-juelich.de/api/v2/toargridding/status/175405xe-8136-4c59-85e8-2764720bf735",
    "task_status" : "RUNNING",
    "info"        : {
                    "processed timeseries" : 780,
                    "total timeseries"   : 1308,
                    "status update every" : 130
    },
    "help"        : "Please wait a moment and call the status link (update interval: 15min). It will redirect to the download as soon as the job is finished. See info for progress on data retrieval from analysis service."
}
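A client could turn the info block into a progress figure, as in the sketch below. It assumes that the field names match the example above and that info may be absent or empty before data retrieval has started; both are assumptions, and progress_percent is a hypothetical helper.

```python
import json

# Shortened copy of the example status output.
status_text = """
{
    "task_status": "RUNNING",
    "info": {
        "processed timeseries": 780,
        "total timeseries": 1308,
        "status update every": 130
    }
}
"""


def progress_percent(status: dict) -> float:
    """Return the share of processed timeseries in percent (0.0 if unknown)."""
    info = status.get("info") or {}
    total = info.get("total timeseries", 0)
    done = info.get("processed timeseries", 0)
    return 100.0 * done / total if total else 0.0


status = json.loads(status_text)
print(f"{status['task_status']}: {progress_percent(status):.1f} % of timeseries processed")
```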

Output Formats

Output is provided only as netCDF4 files. The files include metadata that follow the CF Conventions.

Contributors

Contributors include all projects, organizations and persons that are associated with any timeseries of a gridded dataset in the roles “contributor” and “originator”. The metadata of each gridded product contains a ready-to-use link that provides direct access to the contributors via the unique job ID. The default output format is a ready-to-use list of all the programmes, organizations and persons that contributed to this dataset. This list is sorted alphabetically within each of the three categories. The organizations provided include the affiliations of all persons, as stored in the TOAR-II database. The second option is JSON (append ?format=json to the request URL), which provides full information on all roles associated with the provided timeseries IDs. These data should be processed to fit your needs.