Frequently asked questions about the TOAR Database (user)

The landing page of the TOAR Database Infrastructure is accessible at https://toar-data.fz-juelich.de.

Which data are available in TOAR DB?

The TOAR database primarily collects ground-level ozone observations from around the world. In TOAR-II there are also increased efforts to include ozone precursor observations and meteorological data to allow for trend attributions and other scientific analyses. All data in the TOAR database are harmonized and quality controlled, but the level of quality control differs (see questions on data curation and data quality). Only data from research-grade instruments is accepted. The current status of data in the TOAR database can be queried with the following URL: https://toar-data.fz-juelich.de/api/v2/database_statistics/.

Where does the data come from?

TOAR-II collects data from 18 large air quality monitoring networks and from many individual data providers. These datasets are augmented with extra metadata which is obtained through processing of various earth observation datasets (see chapter 4.2.4 (Station characterisation through geospatial data) of guide https://toar-data-dev.fz-juelich.de/documentation/TOAR_UG_Vol03_Database.pdf). Furthermore, meteorological information is added to the air pollutant time series through extraction from gridded ECMWF reanalysis version 5 (ERA5). Detailed information on the data sources can be found in chapter 2 (TOAR Data Sources) of guide https://toar-data-dev.fz-juelich.de/documentation/TOAR_TG_Vol02_Data_Processing.pdf.

Which metadata are available in TOAR DB?

The TOAR database version 2 has an extensive metadata schema which includes a lot of information about the measurement location (station) and the measurement itself (timeseries). In addition, we generate globally uniform metadata through a set of queries to our geospatial data set point extraction and aggregation service (GEO-PEAS). A detailed description of all metadata can be found in https://esde.pages.jsc.fz-juelich.de/toar-data/toardb_fastapi/docs/toardb_fastapi.html#stationmetaglobal.

Where do I find the metadata catalogue?

The API for this is under development. Currently all metadata is listed at https://esde.pages.jsc.fz-juelich.de/toar-data/toardb_fastapi/docs/toardb_fastapi.html.
You can retrieve the ontology of TOAR data as XML from https://toar-data-dev.fz-juelich.de/api/v2/ontology, or you can view it online as an OWL document at https://toar-data-dev.fz-juelich.de/documentation/ontologies/v1.0/.
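
Once the ontology XML has been downloaded, it can be inspected with standard XML tooling. The following is a minimal sketch, assuming the document is serialized as RDF/XML with owl:Class elements carrying an rdf:about attribute. The sample document, its example.org identifiers, and the helper name class_iris are hypothetical and only illustrate the approach; the real ontology may be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard namespace IRIs for OWL and RDF.
OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical stand-in for the downloaded ontology (not actual TOAR content).
SAMPLE_ONTOLOGY = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Station"/>
  <owl:Class rdf:about="http://example.org/onto#Timeseries"/>
</rdf:RDF>"""


def class_iris(owl_xml):
    """Return the rdf:about identifiers of all owl:Class elements in the document."""
    root = ET.fromstring(owl_xml)
    about = f"{{{RDF}}}about"
    return [
        element.get(about)
        for element in root.iter(f"{{{OWL}}}Class")
        if element.get(about) is not None
    ]


for iri in class_iris(SAMPLE_ONTOLOGY):
    print(iri)
```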

How do I retrieve data?

Currently, the only way to obtain data from the TOAR database is to use the REST API at https://toar-data.fz-juelich.de/api/v2. This web page contains the documentation of the REST API. We are developing a graphical user interface and hope to make a first version of this available by October 2022. Please note that the REST API is being developed further so that more powerful and convenient queries should become available over time.
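
As an illustration, a request to the REST API could be sketched in Python with the standard library only. The helper names endpoint_url and fetch_json and the opener parameter are hypothetical conveniences, not part of the TOAR API. The trailing slash follows the example URLs given in this FAQ, and the sketch assumes the API returns JSON by default.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://toar-data.fz-juelich.de/api/v2"


def endpoint_url(resource, base_url=BASE_URL):
    """Join the API base URL and a resource name such as 'database_statistics'."""
    return f"{base_url.rstrip('/')}/{resource.strip('/')}/"


def fetch_json(url, opener=urlopen, timeout=30):
    """Download `url` and decode the response body as JSON.

    `opener` defaults to urllib's urlopen; it is a parameter only so that
    the function can be exercised without network access.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(endpoint_url("database_statistics")) would then request the database statistics document and return it as a Python object.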

What can I search for?

Currently, you can list data such as stationmeta, timeseries, or variables via the API.
You can filter on any attribute by requiring it to equal a given value or one of several comma-separated values.
Example:
https://toar-data.fz-juelich.de/api/v2/stationmeta/?country=DE,NL
finds all stations in Germany and The Netherlands.
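
Such query URLs can also be assembled programmatically. The sketch below is a hypothetical helper (search_url is not part of the TOAR API) that turns keyword filters into a query string and joins several values with commas, as in the example above. Only the country attribute is taken from this FAQ; other attribute names would have to be looked up in the REST API documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://toar-data.fz-juelich.de/api/v2"


def search_url(resource, **filters):
    """Build a query URL; list or tuple values become comma-separated lists."""
    params = {
        key: ",".join(map(str, value)) if isinstance(value, (list, tuple)) else str(value)
        for key, value in filters.items()
    }
    # Keep commas unescaped so the URL reads like the example (country=DE,NL).
    query = urlencode(params, safe=",")
    url = f"{BASE_URL}/{resource}/"
    return f"{url}?{query}" if query else url


print(search_url("stationmeta", country=["DE", "NL"]))
```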

How do I interpret the retrieved data?

Retrieved data can easily be processed with a Python script. Since JSON is a standardized, human-readable format, you can of course interpret the data with any tool of your choice.
CSV files can likewise be processed with your favourite tool.
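
A minimal sketch of such a Python script is shown below. It parses a JSON response and writes selected fields as CSV. The sample payload and its field names (id, name, country) are invented for illustration and do not reflect the actual response layout, which is described in the REST API documentation.

```python
import csv
import io
import json

# Hypothetical payload; the field names are illustrative only.
SAMPLE_JSON = """[
  {"id": 1, "name": "Station A", "country": "DE"},
  {"id": 2, "name": "Station B", "country": "NL"}
]"""


def records_to_csv(records, columns):
    """Write the chosen columns of a list of JSON objects as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = json.loads(SAMPLE_JSON)
print(records_to_csv(records, ["id", "country"]))
```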

What is the resolution of the data?

The TOAR database stores timeseries data in hourly resolution or finer. Most data have been collected as hourly data, but the database can also work with half-hourly, 15-minute or 10-minute data.
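
If an analysis needs a uniform hourly series from finer-resolution data, one option is to average all samples that fall within the same hour. The sketch below shows this with the Python standard library. It is an illustration only and not necessarily how the TOAR database aggregates data; the function name hourly_means and the input format (pairs of ISO 8601 timestamp and value) are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def hourly_means(samples):
    """Average (timestamp, value) pairs per clock hour.

    `samples` is an iterable of (ISO 8601 timestamp string, number) pairs.
    Each sample is assigned to the hour in which its timestamp falls.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[hour].append(value)
    return {hour.isoformat(): fmean(values) for hour, values in sorted(buckets.items())}
```

Whether a sample should count towards the hour that starts or the hour that ends at its timestamp depends on the time-stamping convention of the data, so this choice should be checked against the time series metadata.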

How long will TOAR data remain available?

The data is stored permanently and will be available for at least 10 years.

Is the data historic or near-real time?

The focus of TOAR is on the analysis of long-term trends, and the TOAR database therefore primarily contains “historic” data. We are beginning to implement near-real-time data streams for at least some of the major air quality data providers, but this has low priority at present.

Has the data been curated?

Yes. Data curation starts with a manual inspection to check whether the data can be processed at all and whether they adhere to our metadata definitions and data format description (see https://toar-data.fz-juelich.de/documentation/TOAR_TG_Vol02_Data_Processing.pdf for details). The subsequent automated processing workflow contains several checks and quality tests, including statistical tests to detect large discrepancies from expected values and frequency distributions. Once a dataset passes these tests, it is inserted into the database, where providers and the TOAR data centre team can visualize and inspect the data again. Other curation steps include the harmonization of metadata information and the augmentation of metadata through processed earth observation information (see question on metadata).

What is the quality of the data?

TOAR only accepts data from research-grade instruments and relies on quality control exerted by the data provider (monitoring agency, scientific institution or other). Nevertheless, data processing errors and other factors can lead to errors in the data that is stored in the database. We try to identify such errors through an automated quality control tool and, in some cases, through manual inspection. Furthermore, preliminary analyses of TOAR data for the scientific papers produced in TOAR-II will identify data errors, and we have implemented a feedback function so that users can alert us to obvious or likely data errors. It is impossible to guarantee the correctness of all TOAR data, but we take data quality seriously and do our best to achieve the maximum possible quality of the data in the TOAR database. Since the focus of TOAR is on tropospheric ozone, the data quality of ozone is likely better than that of ozone precursor species. The quality of the meteorological data from ERA5 can be assessed through the ERA5 validation report (https://confluence.ecmwf.int/display/CKB/ERA5).

How has the data been processed?

This process is detailed in https://toar-data.fz-juelich.de/documentation/TOAR_TG_Vol02_Data_Processing.pdf

Can I search for a certain quality level?

Yes. The data flagging scheme of the TOAR database version 2 makes it possible to distinguish between quality flags set by the data provider and data quality flags assigned by our automated quality control tool or through visual inspection by the TOAR data centre team. Details on how to specify the desired data quality level can be found in https://toar-data.fz-juelich.de/documentation/TOAR_UG_Vol03_Database.pdf, section 5.

Under which conditions may I access the data?

Access to TOAR-II data is open and free. Please see question on data use for the data use conditions.

Under which conditions may I use the data?

All TOAR-II data are provided without restrictions under a CC-BY 4.0 license. This license requires that you acknowledge the data source. To facilitate this, each response to a TOAR data query via the REST API contains a citation and acknowledgement metadata element. It is your responsibility as data user to make sure that proper acknowledgements are given. See also data use policy.

How do I get access to TOAR DB?

Currently, the TOAR database version 2 can only be accessed via the REST API at https://toar-data.fz-juelich.de/api/v2. No registration or login is required. For documentation of the REST API, see https://toar-data.fz-juelich.de/documentation/TOAR_UG_Vol03_Database.pdf

How is the data delivered, instantly or later on as a batch?

At present, all TOAR data queries are processed instantly and the results are returned immediately (more complex queries may take several seconds). As we add more sophisticated analysis options, we may have to implement a data warehouse scheme where your query will be stored and queued and you will be informed by email when the results are available.

Which interfaces are available?

Currently, the TOAR database version 2 only provides a Representational State Transfer (REST) Application Programming Interface (API). Details on the interface are provided at https://toar-data.fz-juelich.de/api/v2 and https://toar-data.fz-juelich.de/documentation/TOAR_UG_Vol03_Database.pdf. A graphical user interface is in preparation.

How can I cite the TOAR DB?

The TOAR database should be cited as Schröder et al.; TOAR Data Infrastructure; https://doi.org/10.34730/4d9a287dec0b42f1aa6d244de8f19eb3

If you use individual data series or only a small set of data series, the original data sources should be cited. A recommended citation is provided with the metadata when data are downloaded.

What about acknowledging the original data provider?

Each response to a TOAR data query via the REST API contains a citation and acknowledgement metadata element. It is your responsibility as data user to make sure that proper acknowledgements are given.
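
To make sure these elements are not overlooked, a script can collect them from each response. The sketch below searches a decoded JSON response for such entries at any nesting depth. The key names citation and acknowledgement, the sample response, and the helper name find_keys are assumptions made for illustration; the exact element names and their position in the response should be checked against the REST API documentation.

```python
def find_keys(document, wanted):
    """Collect (key, value) pairs stored under any of the `wanted` keys,
    at any nesting depth of a decoded JSON document."""
    found = []
    if isinstance(document, dict):
        for key, value in document.items():
            if key in wanted:
                found.append((key, value))
            found.extend(find_keys(value, wanted))
    elif isinstance(document, list):
        for item in document:
            found.extend(find_keys(item, wanted))
    return found


# Hypothetical response layout, for illustration only.
sample_response = {
    "metadata": {
        "citation": "Example citation text",
        "acknowledgement": "Example acknowledgement text",
    },
    "data": [],
}

for key, value in find_keys(sample_response, {"citation", "acknowledgement"}):
    print(f"{key}: {value}")
```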

Are the FAIR principles applied to the TOAR data infrastructure?

Yes. We are proud to have built one of the FAIRest data infrastructures for atmospheric data in the world.

Is there a GUI available for accessing the TOAR database?

A GUI is under development. A first version is planned to be available in October 2022.

Where can I get further support?

For further questions please send an email to support@toar-data.org.