A Representational State Transfer (REST) service that allows retrieval of analysis products from the Tropospheric Ozone Assessment Report (TOAR) database of surface ozone observations.
This documentation describes the URL architecture and query options of the TOAR analysis REST interface. For general information on REST, please consult other resources.
https://toar-data.fz-juelich.de/api/v2/analysis/
Response: Description and documentation of available REST services (this document).
The following analysis services are available and described individually below. Each service is invoked by appending its name and possible query arguments to the base URL.
data: get hourly data from the database
  timeseries: get hourly time series data
  map: get a snapshot of one variable at one point in time
statistics: get aggregated data from the database
  map: get a snapshot of aggregated values of one variable
trends: get trends of aggregated data from the database
status: check the current status of your query
result: get the query result
In order to control the database queries and hence the response of the TOAR analysis REST service, you can add arguments to the service URL. These arguments must adhere to the format <argument_name>=<value>. The first argument is prepended by a ? character, all other arguments are separated by & characters.
The response can be either synchronous or asynchronous. A synchronous response returns the requested result directly. An asynchronous response does not return the result itself but a unique task identifier for your request. This id can be used to check the status of your request; once the result is ready, the status endpoint redirects you to it. The asynchronous approach is chosen for queries that are expected to take more time to process.
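The URL convention above (service name appended to the base URL, a `?` before the first argument, `&` between subsequent ones) can be sketched in Python. `build_query_url` is a hypothetical helper, not part of the service, and the `variable_id` value in the example is illustrative:

```python
from urllib.parse import urlencode

# Base URL of the TOAR analysis REST service.
BASE_URL = "https://toar-data.fz-juelich.de/api/v2/analysis/"

def build_query_url(service, **options):
    """Join a service name and query options onto the base URL.

    urlencode joins the options with '&'; the leading '?' separates
    them from the path. safe=",:" keeps list separators and times
    readable instead of percent-encoding them.
    """
    url = BASE_URL + service + "/"
    if options:
        url += "?" + urlencode(options, safe=",:")
    return url

# e.g. a data/map query for one variable at one point in time:
example = build_query_url("data/map", datetime="2012-12-01T12:00", variable_id=5)
```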
The services below grant access to the hourly data of the TOAR database.
https://toar-data.fz-juelich.de/api/v2/analysis/data/timeseries/[?QUERY-OPTIONS]
where QUERY-OPTIONS are:
any combination of query options from both TOARDB REST interface - 2.4 Stationmeta and TOARDB REST interface - 2.5 Timeseries
daterange = <list of two datetimes: date range for which to extract data>
flags = <list of strings: only select data points with the specified quality flags> (for a description of flags and all available flag names see User Guide - 5.2 Data Quality Flags)
format = <string> (json|csv) (default: json)
Response: The query will return a unique task identifier and a link to check the status of your query.
Result: {"task_id":"94e3888a-33f8-4adf-a6d6-4d8627c9ecc0","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/94e3888a-33f8-4adf-a6d6-4d8627c9ecc0"}
To retrieve the result, send a request to the status endpoint with your task identifier. If the result is ready, you will be redirected to it. The result is a zip archive containing one file per time series in the format you have chosen.
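A data/timeseries query could be assembled as follows. The date range shown is an illustrative placeholder; any combination of Stationmeta/Timeseries filter options could be added to the same dictionary:

```python
from urllib.parse import urlencode

# Assemble a data/timeseries query URL; safe=",:" keeps the list
# separator and the times in the daterange readable.
base = "https://toar-data.fz-juelich.de/api/v2/analysis/data/timeseries/"
params = {
    "daterange": "2010-01-01T00:00,2011-12-31T23:59",  # two datetimes
    "format": "csv",
}
url = base + "?" + urlencode(params, safe=",:")
```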
https://toar-data.fz-juelich.de/api/v2/analysis/data/map/[?QUERY-OPTIONS]
where QUERY-OPTIONS are:
datetime = <datetime: date and time for which to extract data>
variable_id = <integer: variable to extract>
bounding_box = <list of four numbers: bounding_box (min_lat,min_lon,max_lat,max_lon) in degrees_north/degrees_east to define a geographical rectangle (do not set anything for global extraction)> (default: None)
format = <string> (json|csv) (default: json)
Response: The query will return tuples of latitude, longitude and value at the location in the specified format.
Result:
[{"lat":47.81564999946587,"lon":13.03488,"value":68.0628783549876},
{"lat":47.8055555994659,"lon":13.043333,"value":65.96268275498761},
...
{"lat":53.2465,"lon":6.60894,"value":43.978797594987604},
{"lat":52.0918,"lon":6.60537,"value":54.1289075949876}]
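Because the data/map response is plain JSON, standard tooling suffices to work with it. A short sketch using two of the tuples from the example result above, picking the location with the highest value:

```python
import json

# Two entries from the example data/map result shown above.
result_text = (
    '[{"lat":47.81564999946587,"lon":13.03488,"value":68.0628783549876},'
    '{"lat":53.2465,"lon":6.60894,"value":43.978797594987604}]'
)
points = json.loads(result_text)

# Find the station location with the highest value.
highest = max(points, key=lambda p: p["value"])
```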
All statistics are calculated and reported in the local time of the station where the data originated, without shifts for daylight saving time.
https://toar-data.fz-juelich.de/api/v2/analysis/statistics/[?QUERY-OPTIONS]
where QUERY-OPTIONS are:
any combination of query options from both TOARDB REST interface - 2.4 Stationmeta and TOARDB REST interface - 2.5 Timeseries
daterange = <list of two datetimes: date range for which to extract data>
flags = <list of strings: only select data points with the specified quality flags> (for a description of flags and all available flag names see User Guide - 5.2 Data Quality Flags)
sampling = <string: temporal aggregation to use> (for available values see ALLOWED_SAMPLING_VALUES)
statistics = <list of strings: statistics to calculate> (for available values and details see 3. Available Statistics)
seasons = <list of strings: seasons to use for seasonal aggregations> (for available values see SEASON_DICT) (default: "DJF,MAM,JJA,SON")
crops = <list of strings: crops to use for vegseason aggregations> (for available values see ALLOWED_CROPS_VALUES) (default: "wheat,rice")
min_data_capture = <number: minimal fraction of available hourly values in the aggregation interval to report an aggregated value, must be between 0 and 1> (default: 0.75)
metadata_scheme = <string: select how much metadata is returned> (basic|extended|full) (default: full)
format = <string> (raw|by_statistic) (for details on the formats see 4. Aggregated Output Formats) (default: raw)
Response: The query will return a unique task identifier and a link to check the status of your query.
Result: {"task_id":"e2b17c39-6f80-4083-9bb8-f90cd72812b9","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/e2b17c39-6f80-4083-9bb8-f90cd72812b9"}
To retrieve the result, send a request to the status endpoint with your task identifier. If the result is ready, you will be redirected to it. The result is a zip archive containing files in the format you have chosen.
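A statistics query could look like the sketch below. The sampling, statistics names and date range are illustrative; consult ALLOWED_SAMPLING_VALUES and 3. Available Statistics for the actual lists:

```python
from urllib.parse import urlencode

# Assemble a statistics query URL with a few typical options.
base = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"
params = {
    "daterange": "2000-01-01T00:00,2009-12-31T23:59",
    "sampling": "annual",
    "statistics": "mean,dma8epa",
    "min_data_capture": 0.75,   # the default, shown for explicitness
    "format": "by_statistic",
}
url = base + "?" + urlencode(params, safe=",:")
```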
https://toar-data.fz-juelich.de/api/v2/analysis/statistics/map/[?QUERY-OPTIONS]
where QUERY-OPTIONS are:
daterange = <list of two datetimes: date range for which to extract data>
variable_id = <integer: variable to extract>
bounding_box = <list of four numbers: bounding_box (min_lat,min_lon,max_lat,max_lon) in degrees_north/degrees_east to define a geographical rectangle (do not set anything for global extraction)> (default: None)
statistics = <list of strings: statistics to calculate> (for available values and details see 3. Available Statistics)
format = <string> (json|csv) (default: json)
Response: The query will return a unique task identifier and a link to check the status of your query.
Result: {"task_id":"5a0beddf-a1c4-4584-9fb7-d5e98bafcd46","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/5a0beddf-a1c4-4584-9fb7-d5e98bafcd46"}
To retrieve the result, send a request to the status endpoint with your task identifier. If the result is ready, you will be redirected to it. The result is a zip archive containing files in the format you have chosen.
All statistics are calculated in the local time of the station where the data originated, without shifts for daylight saving time.
Daily aggregates will report the trend in ppbv/day and monthly aggregates will report the trend in ppbv/month.
https://toar-data.fz-juelich.de/api/v2/analysis/trends/[?QUERY-OPTIONS]
where QUERY-OPTIONS are:
any combination of query options from both TOARDB REST interface - 2.4 Stationmeta and TOARDB REST interface - 2.5 Timeseries
daterange = <list of two datetimes: date range for which to extract data>
flags = <list of strings: only select data points with the specified quality flags> (for a description of flags and all available flag names see User Guide - 5.2 Data Quality Flags)
sampling = <string: temporal aggregation to use> (daily|monthly)
statistics = <list of strings: statistics to calculate> (for available values and details see 3. Available Statistics)
seasons = <list of strings: seasons to use for seasonal aggregations> (for available values see SEASON_DICT) (default: "DJF,MAM,JJA,SON")
crops = <list of strings: crops to use for vegseason aggregations> (for available values see ALLOWED_CROPS_VALUES) (default: "wheat,rice")
min_data_capture = <number: minimal fraction of available hourly values in the aggregation interval to report an aggregated value, must be between 0 and 1> (default: 0.75)
method = <string: regression analysis method to use> (OLS|quant) (default: quant)
quantiles = <list of numbers: quantiles to use when using quantile regression, must be between 0 and 1>
num_samples = <number: number of sampled trends in moving block bootstrap> (default: 50)
metadata_scheme = <string: select how much metadata is returned> (basic|extended|full) (default: full)
format = <string> (json_simple|by_stat_quant) (for details on the formats see 4. Aggregated Output Formats) (default: json_simple)
Response: The query will return a unique task identifier and a link to check the status of your query.
Result: {"task_id":"24666af1-5a51-4223-b8fc-c7d2d5f0070e","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/24666af1-5a51-4223-b8fc-c7d2d5f0070e"}
To retrieve the result, send a request to the status endpoint with your task identifier. If the result is ready, you will be redirected to it. The result is a zip archive containing files in the format you have chosen.
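A trends query with quantile regression could be composed as follows. The statistic, quantiles and date range are illustrative placeholders:

```python
from urllib.parse import urlencode

# Assemble a trends query URL using quantile regression.
base = "https://toar-data.fz-juelich.de/api/v2/analysis/trends/"
params = {
    "daterange": "2000-01-01T00:00,2019-12-31T23:59",
    "sampling": "monthly",
    "statistics": "dma8epa",
    "method": "quant",              # the default
    "quantiles": "0.25,0.5,0.75",   # only used with method=quant
    "num_samples": 50,              # moving block bootstrap samples
}
url = base + "?" + urlencode(params, safe=",:")
```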
https://toar-data.fz-juelich.de/api/v2/analysis/status/[task_id]
Response: If the result is not ready yet, the response repeats the task id and the status URL. If the result is ready, you will be redirected to the result endpoint.
Example: https://toar-data.fz-juelich.de/api/v2/analysis/status/e2b17c39-6f80-4083-9bb8-f90cd72812b9
Result:
{"task_id":"e2b17c39-6f80-4083-9bb8-f90cd72812b9","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/e2b17c39-6f80-4083-9bb8-f90cd72812b9"}
or
redirect to https://toar-data.fz-juelich.de/api/v2/analysis/result/e2b17c39-6f80-4083-9bb8-f90cd72812b9
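The poll-until-redirected pattern can be sketched as below. `wait_for_result` is a hypothetical helper; `fetch` stands in for whatever HTTP client you use (with the `requests` library, for example, `requests.get` follows the redirect automatically, so the final `r.url` reveals whether you landed on the result endpoint):

```python
import time

def wait_for_result(status_url, fetch, interval=10.0, max_tries=60):
    """Poll the status endpoint until the service redirects to the result.

    `fetch` is any callable returning (final_url, body). A final URL
    containing /result/ means the redirect has happened and the body
    is the zip archive.
    """
    for _ in range(max_tries):
        final_url, body = fetch(status_url)
        if "/result/" in final_url:
            return final_url, body
        time.sleep(interval)
    raise TimeoutError("result not ready after %d polls" % max_tries)
```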
https://toar-data.fz-juelich.de/api/v2/analysis/result/[task_id]
Response: A zip archive containing the query result in the format you requested.
Example: https://toar-data.fz-juelich.de/api/v2/analysis/result/e2b17c39-6f80-4083-9bb8-f90cd72812b9
Result: zip archive
The statistic descriptions below assume the default minimal data capture of 75%. If you specify a different min_data_capture, that value is used instead.
For more details see supplement 1 of Schultz et al. (2017).
Name | Description |
---|---|
aot40 | Daily 12-h AOT40 values are accumulated from hourly values for the 12-h period from 08:00h until 19:59h. AOT40 is defined as cumulative ozone above 40 ppb. If less than 75% of hourly values (i.e. less than 9 out of 12 hours) are present, the cumulative AOT40 is considered missing. If data capture in the daily 12-h window is 75% or greater, the sum is scaled by the fractional data capture (ntotal/nvalid). For monthly, seasonal, summer, or annual statistics, the daily AOT40 values are accumulated over the aggregation period and scaled by (ntotal/nvalid) days. If less than 75% of days are valid, the value is considered missing. |
avgdma8epax | Average value of the daily dma8epax statistics during the aggregation period. |
count | Number of available values in the aggregation period. |
dark_aot40 | As aot40, but using solar elevation <= 5 degrees to identify "dark" hours. |
dark_avg | As mean, but using solar elevation <= 5 degrees to identify "dark" hours. |
data_capture | Fraction of valid (hourly) values available in the aggregation period. |
daylight_aot40 | As aot40, but using solar elevation > 5 degrees to identify "daytime" hours. |
daylight_avg | As mean, but using solar elevation > 5 degrees to identify "daytime" hours. |
daytime_avg | Daytime average is defined as average of hourly values for the 12-h period from 08:00h to 19:59h. All hourly values in the aggregation period are averaged, and the resulting value is valid if at least 75% of hourly values are present. |
diurnal_cycle | Diurnal cycle (must be given without any other statistics). |
dma8epa | Daily maximum 8-hour average statistics according to the US EPA definition. 8-hour averages are calculated for 24 bins starting at 0 h local time. The 8-h running mean for a particular hour is calculated on the concentration for that hour plus the following 7 hours. If less than 75% of data are present (i.e. less than 6 hours), the average is considered missing. When the aggregation period is "seasonal", "summer", or "annual", the 4th highest daily 8-hour maximum of the aggregation period will be computed. Note that in contrast to the official EPA definition, a daily value is considered valid if at least one 8-hour average is valid. |
dma8epa_strict | As dma8epa, but additionally, a diurnal 8-hour maximum value is only saved if at least 18 out of the 24 8-hour averages are valid. This is the official dma8epa definition. |
dma8epax | As dma8epa, but using the new US EPA definition of the daily 8-hour window from 7 h local time to 23 h local time. |
dma8epax_strict | As dma8epax, but additionally, a diurnal 8-hour maximum value is only saved if at least 13 out of the 17 8-hour averages are valid. This is the official dma8epax definition. |
dma8eu | As dma8epa, but using the EU definition of the daily 8-hour window starting from 17 h of the previous day. When the aggregation period is "seasonal", "summer", or "annual", the 26th highest daily 8-hour maximum of the aggregation period will be computed. |
dma8eu_strict | As dma8eu, but additionally, a diurnal 8-hour maximum value is only saved if at least 18 out of the 24 8-hour averages are valid. This is the official dma8eu definition. |
drmdmax1h | Maximum of the 3-months running mean of daily maximum 1-hour mixing ratios during the aggregation period. |
m7_avg | Daytime mean values (9-16h). |
max | Maximum in the aggregation period. |
max1h | Daily maximum hourly value. |
mean | Average value in the aggregation period. |
median | Median value in the aggregation period. |
min | Minimum in the aggregation period. |
nighttime_avg | Same as daytime_avg, but averaged over the daily interval from 20:00 h to 07:59 h.
nvgt050 | Number of days with exceedance of the dma8epax value above 50 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt060 | Number of days with exceedance of the dma8epax value above 60 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt070 | Number of days with exceedance of the dma8epax value above 70 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt080 | Number of days with exceedance of the dma8epax value above 80 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt090 | Number of days with exceedance of the daily max1h_values above 90 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt100 | Number of days with exceedance of the daily max1h_values above 100 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgt120 | Number of days with exceedance of the daily max1h_values above 120 ppb. The value is marked as missing if less than 75% of days contain valid data. |
nvgtall | nvgt050+nvgt060+nvgt070+nvgt080+nvgt090+nvgt100+nvgt120.
p05 | Fifth-percentile of hourly values in the aggregation period. |
p10 | As p05, but for the 10th-percentile. |
p25 | As p05, but for the 25th-percentile. |
p75 | As p05, but for the 75th-percentile. |
p90 | As p05, but for the 90th-percentile. |
p95 | As p05, but for the 95th-percentile.
p98 | As p05, but for the 98th-percentile. |
p99 | As p05, but for the 99th-percentile. |
percentiles1 | p25+p50+p75. |
percentiles2 | p05+p10+p25+p50+p75+p90+p95 (+p98+p99 if aggregation period is "summer" or "annual").
somo10 | Sum of excess of daily maximum 8-h means (EU Airbase standard with relaxed criterion: dma8eu) over the cut-off of 10 ppb, i.e. 20 µg/m3 calculated for all days in the aggregation period. SOMO10 will be set to missing if less than 75% of days are available. The quantity will be weighted by the number of theoretical days over the number of available days. |
somo10_strict | As somo10, but using dma8eu_strict for data capture. |
somo35 | As somo10, but accumulating ozone values above 35 ppb. |
somo35_strict | As somo10_strict, but accumulating ozone values above 35 ppb. |
stddev | Standard deviation in the aggregation period. |
w126 | Daily W126 index is accumulated using hourly values for the 12-h period from 08:00h until 19:59h. W126 = SUM(wi*Ci) with weight wi = 1/[1 + M*exp(-A*Ci/1000)], where M = 4403, A = 126, and where Ci is the hourly average O3 mixing ratio in units of ppb. If there are less than 9 valid hourly values in the 12-hour window, the daily value is considered missing. If data capture in the daily 12-h window is 75% or greater, the sum is scaled by the fractional data capture (ntotal/nvalid). Seasonal, summer, or annual statistics are calculated as the sum over the daily W126 values. Results are marked as missing if less than 75% of daily values are valid. |
w126_24h | As w126, but using all 24 hours of a day. |
w90 | Daily maximum W90 5-h Experimental Exposure Index: EI = SUM(wi*Ci) with weight wi = 1/[1 + M*exp(-A*Ci/1000)], where M = 1400, A = 90, and where Ci is the hourly average O3 mixing ratio in units of ppb (Lefohn et al., 2010). For each day, 24 W90 indices are computed as 5-hour sums, requiring that at least 4 of the 5 hours are valid data (75%). If a sample consists of only 4 data points, a fifth value shall be constructed from averaging the 4 valid mixing ratios. For aggregation periods "month", "season", "summer", or "annual", the 4th highest W90 value is computed, but only if at least 75% of days in this period have valid W90 values. |
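Two of the definitions above can be illustrated with a short, simplified sketch (not the service's actual implementation): daily AOT40 and a dma8epa-style daily maximum 8-hour average, each computed from a list of 24 hourly ozone values in ppb, with None marking a missing hour:

```python
def daily_aot40(hourly):
    """Cumulative ozone above 40 ppb for hours 08:00-19:59 (12 values).

    Needs at least 9 of the 12 hours (75%); the sum is scaled by the
    fractional data capture ntotal/nvalid.
    """
    window = hourly[8:20]
    valid = [v for v in window if v is not None]
    if len(valid) < 9:
        return None
    raw = sum(max(v - 40.0, 0.0) for v in valid)
    return raw * len(window) / len(valid)

def daily_max8h(hourly):
    """Daily maximum 8-hour average (dma8epa-style).

    24 windows start at each local hour; a window needs at least 6 of
    its 8 hours (75%). Hours spilling past 23 h belong to the next day
    and are treated as missing in this simplified sketch.
    """
    padded = list(hourly) + [None] * 7
    averages = []
    for start in range(24):
        vals = [v for v in padded[start:start + 8] if v is not None]
        if len(vals) >= 6:
            averages.append(sum(vals) / len(vals))
    return max(averages) if averages else None
```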
Following are the descriptions of the available output formats for aggregated time series data. If you want to try different output formats to find the best one for your needs, you can run your query with a low limit (e.g. limit=3) to inspect the different outputs.
raw: This output format creates csv files in the same way as TOARDB REST interface - 2.7 Data. The zip archive will contain one csv file per time series. The name of each individual csv file is "<time_series_id>.csv".
by_statistic: This output format creates one csv file per requested statistic and one additional csv file with all the metadata. Each row in every file contains the information (either metadata or aggregated values) for one time series. All files have the same number and order of rows, so you can match the metadata and the different aggregates for each time series via the row position. The metadata file is called "metadata.csv" and the files with the aggregated values are called "<statistic>.csv".
json_simple: This output format creates one JSON file per time series, statistic and quantile (if using quantile regression). Each JSON file contains a dictionary with one key holding all the metadata and a second key holding the calculated trend, uncertainty and p-value. The files are called "<time_series_id>_<statistic>_<quantile>.json".
by_stat_quant: This output format creates one csv file per requested statistic and quantile and one additional csv file with all the metadata. Each row in every file contains the information (either metadata or trend values) for one time series. All files have the same number and order of rows, so you can match the metadata and the different trends for each time series via the row position. The metadata file is called "metadata.csv" and the files with the trend values are called "<statistic>_<quantile>.csv".
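The positional row matching described above can be sketched as follows. The column names and values are invented for illustration and are not the service's actual output:

```python
import csv
import io

# In the row-aligned csv formats, row i of every file refers to the
# same time series, so the files can simply be zipped together.
metadata_csv = "timeseries_id,station_name\n101,StationA\n102,StationB\n"
mean_csv = "2000,2001\n31.2,30.8\n28.5,29.1\n"

meta_rows = list(csv.DictReader(io.StringIO(metadata_csv)))
mean_rows = list(csv.reader(io.StringIO(mean_csv)))[1:]  # skip header

# Pair each station's metadata with its aggregated values by position.
combined = [
    (meta["station_name"], values)
    for meta, values in zip(meta_rows, mean_rows)
]
```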