5. Statistics

All statistics are calculated and reported in the local time of the station where the data originated, without shifts for daylight saving time.
https://toar-data.fz-juelich.de/api/v2/analysis/statistics/[?QUERY-OPTIONS]


where QUERY-OPTIONS are:

any combination of query options from both TOARDB REST interface - 2.4 Stationmeta and TOARDB REST interface - 2.5 Timeseries

daterange = <list of two datetimes: date range for which to extract data>

flags = <list of strings: only select data points with the specified quality flags> (for a description of flags and all available flag names see User Guide - 5.2 Data Quality Flags)

sampling = <string: temporal aggregation to use> (for available values see ALLOWED_SAMPLING_VALUES)

statistics = <list of strings: statistics to calculate> (for available values and details see 5.2. Available Statistics)

seasons = <list of strings: seasons to use for seasonal aggregations> (for available values see SEASON_DICT) (default: “DJF,MAM,JJA,SON”)

crops = <list of strings: crops to use for vegeseason aggregations> (for available values see ALLOWED_CROPS_VALUES) (default: “wheat,rice”)

min_data_capture = <number: minimal fraction of available hourly values in the aggregation interval to report an aggregated value, must be between 0 and 1> (default: 0.75)

metadata_scheme = <string: select how much metadata is returned> (basic|extended|full) (default: full)

format = <string> (raw|by_statistic) (for details on the formats see 8. Aggregated Output Formats)


Response: The query will return a unique task identifier and a link to check the status of your query.
Example: https://toar-data.fz-juelich.de/api/v2/analysis/statistics/?country=DE&variable_id=5&limit=3&daterange=2010-01-01T00:00:00,2020-12-31T23:59:59&flags=AllOK&sampling=annual&statistics=mean,median,min,max
Result: {"task_id":"e2b17c39-6f80-4083-9bb8-f90cd72812b9","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/e2b17c39-6f80-4083-9bb8-f90cd72812b9"}

To retrieve the result, send a request to the status endpoint with your task identifier. Once the result is available, you will be redirected to it. The result is a zip archive containing files in the format you have chosen.
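The request and polling workflow described above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library. The helper names (build_statistics_url, parse_task_response, download_result) are hypothetical, and the way a finished task is detected (a zip content type after the redirect) is an assumption, since the exact response of the status endpoint is not specified here.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"


def build_statistics_url(options):
    """Join query options into a request URL; list values become comma-separated."""
    flat = {
        key: ",".join(map(str, value)) if isinstance(value, (list, tuple)) else str(value)
        for key, value in options.items()
    }
    # Keep ',' and ':' unescaped so the URL looks like the documented example.
    return BASE_URL + "?" + urlencode(flat, safe=",:")


def parse_task_response(body):
    """Extract the task identifier and the status link from the JSON response."""
    payload = json.loads(body)
    return payload["task_id"], payload["status"]


def download_result(status_url, target, wait_seconds=30.0, max_tries=20):
    """Poll the status link and save the zip archive once it is delivered.

    Assumption: a finished task is recognisable by a zip content type
    on the (redirected) response; adjust this check if the service differs.
    """
    for _ in range(max_tries):
        with urllib.request.urlopen(status_url) as response:  # follows redirects
            if "zip" in response.headers.get("Content-Type", ""):
                with open(target, "wb") as archive:
                    archive.write(response.read())
                return True
        time.sleep(wait_seconds)
    return False
```

A typical call sequence would be to open the URL from build_statistics_url, pass the response body to parse_task_response, and hand the returned status link to download_result.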

5.1. Map

https://toar-data.fz-juelich.de/api/v2/analysis/statistics/map/[?QUERY-OPTIONS]


where QUERY-OPTIONS are:

daterange = <str: comma separated start and end date and time for which to extract data>

variable_id = <integer: variable to extract>

bounding_box = <list of four numbers: bounding_box (min_lat,min_lon,max_lat,max_lon) in degrees_north/degrees_east to define a geographical rectangle (do not set anything for global extraction)> (default: None)

statistics = <list of strings: statistics to calculate> (for available values and details see 5.2. Available Statistics)

format = <string> (json|csv) (default: json)


Response: The query will return a unique task identifier and a link to check the status of your query.
Example: https://toar-data.fz-juelich.de/api/v2/analysis/statistics/map/?daterange=2010-01-01T00:00:00,2020-12-31T23:59:59&variable_id=5&bounding_box=50,6,52,8&statistics=avgdma8epax&format=csv
Result: {"task_id":"5a0beddf-a1c4-4584-9fb7-d5e98bafcd46","status":"https://toar-data.fz-juelich.de/api/v2/analysis/status/5a0beddf-a1c4-4584-9fb7-d5e98bafcd46"}

To retrieve the result, send a request to the status endpoint with your task identifier. Once the result is available, you will be redirected to it. The result is a zip archive containing files in the format you have chosen.
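The bounding_box parameter is easy to get wrong because of its (min_lat,min_lon,max_lat,max_lon) order. The following sketch assembles a map request and checks the corner order before sending anything. The function names are hypothetical, and the range checks (latitudes within -90 to 90, longitudes within -180 to 180, no box crossing the date line) are assumptions rather than documented constraints of the service.

```python
from urllib.parse import urlencode

MAP_URL = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/map/"


def format_bounding_box(min_lat, min_lon, max_lat, max_lon):
    """Check the corner order and return the comma-separated parameter value."""
    # Assumed limits; the service itself may accept a different convention.
    if not (-90 <= min_lat <= max_lat <= 90):
        raise ValueError("expected -90 <= min_lat <= max_lat <= 90")
    if not (-180 <= min_lon <= max_lon <= 180):
        raise ValueError("expected -180 <= min_lon <= max_lon <= 180")
    return ",".join(f"{value:g}" for value in (min_lat, min_lon, max_lat, max_lon))


def build_map_url(daterange, variable_id, statistics, bounding_box=None, fmt="json"):
    """Assemble the map request URL; omit bounding_box for a global extraction."""
    params = {"daterange": ",".join(daterange), "variable_id": variable_id}
    if bounding_box is not None:
        params["bounding_box"] = format_bounding_box(*bounding_box)
    params["statistics"] = ",".join(statistics)
    params["format"] = fmt
    # Keep ',' and ':' unescaped so the URL matches the documented example.
    return MAP_URL + "?" + urlencode(params, safe=",:")
```

Calling build_map_url with the values of the example above reproduces the example URL, and leaving bounding_box unset drops the parameter from the query.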

5.2. Available Statistics

The descriptions below state the minimal fraction of available hourly data as 75% (the default). If you set a different min_data_capture, that value is used instead.

For more details, see supplement 1 of Schultz et al. (2017).
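As an illustration of the data capture rule, the sketch below reports a mean only if enough hourly values are present. It is not the service's implementation: the function name aggregate_mean and the use of None for missing hours are assumptions made for this example.

```python
from statistics import fmean


def aggregate_mean(hourly_values, min_data_capture=0.75):
    """Return the mean of the valid hourly values, or None if too few are present.

    hourly_values holds one entry per hour of the aggregation interval,
    with None marking a missing hour (a convention chosen for this sketch).
    """
    valid = [value for value in hourly_values if value is not None]
    if not valid or len(valid) / len(hourly_values) < min_data_capture:
        return None
    return fmean(valid)
```

With the default of 0.75, a day with 18 of 24 hourly values still yields a mean, while a day with 17 values is reported as missing.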

Name

Description

aot40

Daily 12-h AOT40 values are accumulated using hourly values for the 12-h period from 08:00h until 19:59h. AOT40 is defined as cumulative ozone above 40 ppb. If less than 75% of hourly values (i.e. fewer than 9 out of 12 hours) are present, the cumulative AOT40 is considered missing. When data capture in the daily 12-h window is 75% or greater, the sum is scaled by (ntotal/nvalid) to compensate for the missing hours.
For monthly, seasonal, summer, or annual statistics, the daily AOT40 values are accumulated over the aggregation period and scaled by (ntotal/nvalid) days. If less than 75% of days are valid, the value is considered missing.

avgdma8epax

Average value of the daily dma8epax statistics during the aggregation period.

count

Number of available values in the aggregation period.

dark_aot40

As aot40, but using solar elevation <= 5 degrees to identify “dark” hours.

dark_avg

As mean, but using solar elevation <= 5 degrees to identify “dark” hours.

data_capture

Fraction of valid (hourly) values available in the aggregation period.

daylight_aot40

As aot40, but using solar elevation > 5 degrees to identify “daytime” hours.

daylight_avg

As mean, but using solar elevation > 5 degrees to identify “daytime” hours.

daytime_avg

Daytime average is defined as average of hourly values for the 12-h period from 08:00h to 19:59h. All hourly values in the aggregation period are averaged, and the resulting value is valid if at least 75% of hourly values are present.

diurnal_cycle

Diurnal cycle (must be given without any other statistics).

dma8epa

Daily maximum 8-hour average statistics according to the US EPA definition. 8-hour averages are calculated for 24 bins starting at 0 h local time. The 8-h running mean for a particular hour is calculated on the concentration for that hour plus the following 7 hours. If less than 75% of data are present (i.e. less than 6 hours), the average is considered missing.
When the aggregation period is “seasonal”, “summer”, or “annual”, the 4th highest daily 8-hour maximum of the aggregation period will be computed.
Note that in contrast to the official EPA definition, a daily value is considered valid if at least one 8-hour average is valid.

dma8epa_strict

As dma8epa, but additionally, a diurnal 8-hour maximum value is only saved if at least 18 out of the 24 8-hour averages are valid. This is the official dma8epa definition.

dma8epax

As dma8epa, but using the new US EPA definition of the daily 8-hour window from 7 h local time to 23 h local time.

dma8epax_strict

As dma8epax, but additionally, a diurnal 8-hour maximum value is only saved if at least 13 out of the 17 8-hour averages are valid. This is the official dma8epax definition.

dma8eu

As dma8epa, but using the EU definition of the daily 8-hour window starting from 17 h of the previous day.
When the aggregation period is “seasonal”, “summer”, or “annual”, the 26th highest daily 8-hour maximum of the aggregation period will be computed.

dma8eu_strict

As dma8eu, but additionally, a diurnal 8-hour maximum value is only saved if at least 18 out of the 24 8-hour averages are valid. This is the official dma8eu definition.

drmdmax1h

Maximum of the 3-months running mean of daily maximum 1-hour mixing ratios during the aggregation period.

m7_avg

Daytime mean values (9-16h).

max

Maximum in the aggregation period.

max1h

Daily maximum hourly value.

mean

Average value in the aggregation period.

median

Median value in the aggregation period.

min

Minimum in the aggregation period.

nighttime_avg

As daytime_avg, but for the daily interval from 20:00h to 07:59h.

nvgt050

Number of days with exceedance of the dma8epax value above 50 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt060

Number of days with exceedance of the dma8epax value above 60 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt070

Number of days with exceedance of the dma8epax value above 70 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt080

Number of days with exceedance of the dma8epax value above 80 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt090

Number of days with exceedance of the daily max1h_values above 90 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt100

Number of days with exceedance of the daily max1h_values above 100 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgt120

Number of days with exceedance of the daily max1h_values above 120 ppb. The value is marked as missing if less than 75% of days contain valid data.

nvgtall

nvgt050+nvgt060+nvgt080+nvgt090+nvgt100+nvgt120.

p05

Fifth-percentile of hourly values in the aggregation period.

p10

As p05, but for the 10th-percentile.

p25

As p05, but for the 25th-percentile.

p75

As p05, but for the 75th-percentile.

p90

As p05, but for the 90th-percentile.

p95

As p05, but for the 95th-percentile.

p98

As p05, but for the 98th-percentile.

p99

As p05, but for the 99th-percentile.

percentiles1

p25+p50+p75.

percentiles2

p05+p10+p25+p50+p75+p90+p95 (+p98+p99 if aggregation period is “summer” or “annual”).

somo10

Sum of excess of daily maximum 8-h means (EU Airbase standard with relaxed criterion: dma8eu) over the cut-off of 10 ppb, i.e. 20 µg/m3 calculated for all days in the aggregation period. SOMO10 will be set to missing if less than 75% of days are available. The quantity will be weighted by the number of theoretical days over the number of available days.

somo10_strict

As somo10, but using dma8eu_strict for data capture.

somo35

As somo10, but accumulating ozone values above 35 ppb.

somo35_strict

As somo10_strict, but accumulating ozone values above 35 ppb.

stddev

Standard deviation in the aggregation period.

w126

Daily W126 index is accumulated using hourly values for the 12-h period from 08:00h until 19:59h. W126 = SUM(w_i * C_i) with weight w_i = 1/[1 + M * exp(-A * C_i / 1000)], where M = 4403, A = 126, and C_i is the hourly average O3 mixing ratio in units of ppb. If there are fewer than 9 valid hourly values in the 12-hour window, the daily value is considered missing. When data capture in the daily 12-h window is 75% or greater, the sum is scaled by (ntotal/nvalid) to compensate for the missing hours.
Seasonal, summer, or annual statistics are calculated as the sum over the daily W126 values. Results are marked as missing if less than 75% of daily values are valid.

w126_24h

As w126, but using all 24 hours of a day.

w90

Daily maximum W90 5-h Experimental Exposure Index:

EI = SUM(w_i * C_i) with weight w_i = 1/[1 + M * exp(-A * C_i / 1000)], where M = 1400, A = 90, and where C_i is the hourly average O3 mixing ratio in units of ppb (Lefohn et al., 2010). For each day, 24 W90 indices are computed as 5-hour sums, requiring that at least 4 of the 5 hours are valid data (75%). If a sample consists of only 4 data points, a fifth value shall be constructed from averaging the 4 valid mixing ratios.
For aggregation periods “month”, “season”, “summer”, or “annual”, the 4th highest W90 value is computed, but only if at least 75% of days in this period have valid W90 values.
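To make the daily definitions of aot40, w126 and dma8epa in the table above more concrete, here is a minimal Python sketch. It follows the textual descriptions only and is not the code used by the service. The function names, the use of None for missing hours, and the input layout (12 daytime values for aot40 and w126, 31 values covering hours 0 to 23 plus the following 7 hours for dma8epa) are assumptions made for this example.

```python
import math


def _scaled_sum(contributions, n_total=12, min_fraction=0.75):
    """Sum the valid contributions and scale by n_total / n_valid.

    Returns None when fewer than min_fraction of the n_total hours are valid.
    """
    valid = [c for c in contributions if c is not None]
    if len(valid) < min_fraction * n_total:
        return None
    return sum(valid) * n_total / len(valid)


def daily_aot40(daytime_ppb):
    """AOT40 for one day from the 12 hourly values of 08:00h to 19:59h (ppb)."""
    excess = [None if c is None else max(c - 40.0, 0.0) for c in daytime_ppb]
    return _scaled_sum(excess)


def daily_w126(daytime_ppb, m=4403.0, a=126.0):
    """W126 for one day: sum of w_i * C_i with the sigmoidal weight w_i."""
    weighted = [
        None if c is None else c / (1.0 + m * math.exp(-a * c / 1000.0))
        for c in daytime_ppb
    ]
    return _scaled_sum(weighted)


def daily_dma8epa(hourly_ppb):
    """Daily maximum 8-hour average (relaxed criterion).

    hourly_ppb: 31 values, hours 0 to 23 of the day plus the next 7 hours,
    so that every window "hour plus the following 7 hours" is complete.
    """
    best = None
    for start in range(24):
        window = [c for c in hourly_ppb[start:start + 8] if c is not None]
        if len(window) >= 6:  # at least 75% of the 8 hours
            average = sum(window) / len(window)
            best = average if best is None else max(best, average)
    return best  # None if no 8-hour average is valid
```

The shared helper _scaled_sum reflects that aot40 and w126 use the same 9-of-12 hours rule and the same (ntotal/nvalid) scaling; the strict variants and the period aggregations are left out to keep the sketch short.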