.. _data-quality:

************
Data Quality
************

All data and metadata in the TOAR database have been subject to some quality checks. Nevertheless, no quality control is perfect, and you may well identify errors, inconsistencies or “weird looking” data if you dig deep enough. Most of the data kept in the TOAR database originate from quality-controlled repositories, which are maintained by professional data managers. Other data come from providers with fewer resources or potentially less knowledge about the many complex facets of providing FAIR [#f10]_ data. Finally, there are data sources which provide “preliminary” data in near real-time; such data obviously cannot be checked by trained human experts before they are posted.

The TOAR database has been designed with the primary objective of supporting the Tropospheric Ozone Assessment Report. Our focus therefore lies on providing the data which are most useful for scientific analyses of global air quality and which reflect our best knowledge about global air pollutant concentrations. Due to the data curation procedures described below, the data you obtain from the TOAR database may not always be completely identical to data from the same measurements which you might get from the original data providers. Therefore, TOAR data are not suitable for legal purposes, such as the initiation of lawsuits because of non-attainment of air quality standards.

The TOAR data centre developed a largely automated workflow to process new data and add them to the TOAR database (see :doc:`processing-workflow`). One step in this workflow is the execution of automated scripts for checking the metadata which describe a measurement site and each individual time series. There is also an automated quality control tool, which performs some basic statistical tests on new data to ensure that at least gross errors are captured and that no “garbage” enters the database. We are continuously working to improve this quality control tool and plan to add more sophisticated tests in the future. As part of our responsibilities in the TOAR assessment, we will double-check as much data as we can and perform several manual checks through database queries and visualisations at the time when the phase II assessment is prepared. As a TOAR database user, you can help us by keeping an eye on the data you download and by informing us about any data or metadata issues you encounter when using the data from the TOAR database. We will try our best to follow your leads and inform the original data providers about any issues that can be confirmed.

During the first phase of TOAR, a semi-quantitative analysis was performed to determine the fraction of erroneous and questionable data among all ground-level ozone time series which are stored in the TOAR database (see [#f1]_). In general, it was found that over 95 % of all data points can be regarded as “trustworthy” in the sense that they exhibit “typical” behaviour of ozone time series and show no obvious anomalies. The creation of animated maps and trend plots of the TOAR data confirmed that the vast majority of data “fit together” nicely, which means that errors in the aggregated ozone statistics are likely smaller than 5 parts per billion and trend estimates should be “reasonably accurate” [#f11]_. As the TOAR database allows downloads of hourly values including the data quality flags, you can always re-assess the quality of the data you obtain from us. You can also re-run our automated quality control tool, which is available from https://gitlab.jsc.fz-juelich.de/esde/toar-public/toarqc.

.. _section-curation:

--------------------------
Data and Metadata Curation
--------------------------

Data quality is a complex topic and there are many different views about what constitutes “good quality data”. With respect to the metadata describing stations and time series, we aim to achieve the best possible consistency through the use of controlled vocabulary (see https://esde.pages.jsc.fz-juelich.de/toar-data/toardb_fastapi/docs/toardb_fastapi.html#controlled-vocabulary) on the one hand, and by performing some algorithmic tests on the other hand. For example, we compare reported station altitudes with the altitude returned from a high-resolution digital elevation model at the given latitude and longitude coordinates. A warning is raised if the results differ too much. The development of such algorithmic tests is ongoing and will be documented at a later stage.
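
The altitude comparison mentioned above could look roughly like the following sketch. It is an illustration only: the function name, the tolerance of 100 m and the assumption that the elevation model value has already been looked up are ours and do not describe the actual TOAR implementation.

.. code-block:: python

   from typing import Optional

   # Hypothetical tolerance; the text only states that a warning is
   # raised if the two altitudes "differ too much".
   ALTITUDE_TOLERANCE_M = 100.0


   def check_station_altitude(reported_m: float, dem_m: float,
                              tolerance_m: float = ALTITUDE_TOLERANCE_M) -> Optional[str]:
       """Return a warning text if reported and model altitude disagree."""
       difference = abs(reported_m - dem_m)
       if difference > tolerance_m:
           return (f"altitude mismatch: reported {reported_m:.0f} m, "
                   f"elevation model {dem_m:.0f} m (difference {difference:.0f} m)")
       return None  # altitudes agree within the tolerance


   # A station reported at 500 m where the elevation model gives 1200 m
   warning = check_station_altitude(500.0, 1200.0)

Returning a message instead of raising an exception mirrors the idea that such a test produces a warning for later inspection and does not stop the processing.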


The quality of the actual data values can never be assessed with full certainty, but experience and statistical methods can at least provide good clues. In the current version of our automated quality control tests, we check the data ranges and test for outliers as well as unrealistically long periods of constant values and significant step changes. Thresholds for these tests have been developed based on sample data which have been determined to be of high quality due to 

   (i) trust in the data providers, and 
   
   (ii) visual inspection of the time series and various descriptive statistics.
   
The automated quality control tool does not delete any data, but instead changes the data quality flag (see :numref:`section-quality-flags`). Any such changes applied to the data are recorded and made accessible through the time series’ “change log”.
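
To illustrate the kind of tests described above, here is a minimal sketch of a range test, a test for long runs of constant values and a step-change test. All thresholds and function names are hypothetical placeholders and not those of the TOAR quality control tool; the outlier test is left out for brevity. In line with the principle stated above, the sketch only marks suspicious points and leaves the data values untouched.

.. code-block:: python

   from typing import List, Sequence

   # All thresholds are hypothetical placeholders, not the values used by TOAR.
   LOWER_LIMIT = 0.0
   UPPER_LIMIT = 500.0
   MAX_CONSTANT_RUN = 4
   MAX_STEP = 50.0


   def range_test(values: Sequence[float]) -> List[bool]:
       """Mark values outside the plausible data range."""
       return [not (LOWER_LIMIT <= v <= UPPER_LIMIT) for v in values]


   def constant_run_test(values: Sequence[float]) -> List[bool]:
       """Mark all members of runs of identical values longer than MAX_CONSTANT_RUN."""
       marked = [False] * len(values)
       start = 0
       for i in range(1, len(values) + 1):
           # A run ends at the end of the series or when the value changes.
           if i == len(values) or values[i] != values[start]:
               if i - start > MAX_CONSTANT_RUN:
                   marked[start:i] = [True] * (i - start)
               start = i
       return marked


   def step_test(values: Sequence[float]) -> List[bool]:
       """Mark values that differ from their predecessor by more than MAX_STEP."""
       marked = [False] * len(values)
       for i in range(1, len(values)):
           if abs(values[i] - values[i - 1]) > MAX_STEP:
               marked[i] = True
       return marked


   def suspicious_points(values: Sequence[float]) -> List[bool]:
       """Combine the tests; the input values themselves are never modified."""
       tests = (range_test(values), constant_run_test(values), step_test(values))
       return [any(hits) for hits in zip(*tests)]


   hourly = [30.0, 31.0, 31.0, 31.0, 31.0, 31.0, 95.0, 33.0, -5.0]
   marked = suspicious_points(hourly)

In a real workflow, the marked positions would then be translated into changed quality flags and recorded in the change log.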

There is some debate in the scientific community of environmental observers and database managers about their respective roles, rights and duties in data curation. As a general guiding principle, it is often stated that only the first-hand data providers are allowed to make changes to their data and metadata, because they are the only ones who have full insight into the measurement conditions. On the other hand, many modern data collection efforts place more responsibility on the data curators in the data centres, because only there is it possible to assess different data sets with common standards and to apply additional tests, which involve comparisons with neighbouring sites or with numerical model data. Best practice suggests that the results from such tests are communicated back to the data providers, who are then charged with correcting the data and re-sending them to the data centre. In practice, we have found that it is often more efficient to suggest specific corrections to the data providers and ask for their approval, because this means less work for them. In rare cases, the TOAR data centre may also modify data values without the approval of providers; for example, if the data come from a large monitoring network and there are no direct communication channels with the providers, or if we are convinced that data are erroneous but the data provider does not react to our inquiries. Such changes will only be applied if the correction is obvious. A typical example is a unit conversion, which may be necessary if the metadata in the submitted file header are inconsistent with the data values. In any case, we will document all of these changes and make this information available to you.
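
As an illustration of such an obvious correction, the following sketch converts ozone mass concentrations to mixing ratios and produces a record of the change. The reference conditions (20 °C and 1013.25 hPa), the function names and the layout of the log entry are assumptions made for this example; they do not describe the actual TOAR procedure or change log format.

.. code-block:: python

   R_GAS = 8.314462618   # molar gas constant in J mol-1 K-1
   M_OZONE = 47.998      # molar mass of ozone in g mol-1


   def ozone_ugm3_to_ppb(conc_ugm3, temperature_k=293.15, pressure_pa=101325.0):
       """Convert an ozone mass concentration (ug m-3) to a mixing ratio (ppb).

       The default temperature and pressure are assumed reference conditions.
       """
       # Molar volume of an ideal gas in litres per mole
       molar_volume_l = R_GAS * temperature_k / pressure_pa * 1000.0
       return conc_ugm3 * molar_volume_l / M_OZONE


   def convert_with_log(values_ugm3, series_id):
       """Convert a series and return it together with a hypothetical log entry."""
       converted = [ozone_ugm3_to_ppb(v) for v in values_ugm3]
       log_entry = {
           "series_id": series_id,
           "change": "unit conversion",
           "description": "converted ozone values from ug m-3 to ppb",
           "number_of_values": len(converted),
       }
       return converted, log_entry


   converted, entry = convert_with_log([100.0, 0.0], series_id=42)

Keeping the conversion and the log entry in one function makes it harder to apply a change without documenting it.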

.. _section-quality-flags:

------------------
Data Quality Flags
------------------

As described above, the quality of TOAR data is documented via so-called data quality flags. There are numerous flagging schemes in use around the world with varying levels of detail. Some of the datasets which we receive for inclusion in the TOAR database provide quality information with their data, others do not.

We define four possible status code ranges to indicate whether a given data value is appropriate for use or not. In addition, code values of 100 and above can be used for aggregated queries (:numref:`table-dq-status-code-ranges`).


.. _table-dq-status-code-ranges:

.. csv-table:: status code range for data quality
   :header: "**Status code range**", "**Data quality**"
   :class: longtable
   :file: csv/quality_flags_code_range.md
   :widths: 30 70
   :delim: |


Normally, you will be interested in “OK” data only, which means that you can filter data with quality flag < 10. However, in this case it is easier to request ‘AllOK’ data (flag value 100, see :numref:`table-agg-data-quality-flags`).
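
If you have downloaded values together with their specific quality flags, the filter described above can be applied as in the following sketch. The record layout and the sample values are made up for illustration; only the criterion ``flag < 10`` is taken from the text.

.. code-block:: python

   from typing import List, Tuple

   # Hypothetical record layout: (timestamp, value, quality flag)
   Record = Tuple[str, float, int]


   def select_ok(records: List[Record]) -> List[Record]:
       """Keep only records whose quality flag lies in the OK range (< 10)."""
       return [record for record in records if record[2] < 10]


   sample = [
       ("2020-01-01T00:00", 31.2, 7),
       ("2020-01-01T01:00", 30.8, 16),
       ("2020-01-01T02:00", 412.0, 26),
       ("2020-01-01T03:00", 29.5, 7),
   ]
   ok_records = select_ok(sample)
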

As mentioned above, all data are subjected to some automated tests before inclusion in the TOAR database. These tests can only lower the level of confidence in the data, but never change data that were labelled as questionable or erroneous by the data provider into OK values.
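
The rule that automated tests can only lower the confidence in a value can be expressed as taking the worse of two assessments. The sketch below assumes a simple ordering OK < questionable < erroneous; the enumeration values are internal to the example and are not TOAR flag values.

.. code-block:: python

   from enum import IntEnum


   class Status(IntEnum):
       """Ordered quality status; a larger number means lower confidence."""
       OK = 0
       QUESTIONABLE = 1
       ERRONEOUS = 2


   def combine_status(provider: Status, qc_result: Status) -> Status:
       """Return the worse of the provider status and the automated QC result."""
       return max(provider, qc_result)


   # A value labelled questionable by the provider stays questionable
   # even if it passes all automated tests.
   combined = combine_status(Status.QUESTIONABLE, Status.OK)
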

The second aspect that might be relevant for assessing the data quality is whether the data have been validated by the provider or not. While in the first phase of TOAR the database only accepted validated data, the expansion to previously uncovered world regions with the help of OpenAQ necessitated the inclusion of real-time data, which are never thoroughly validated, although they might have passed some automated quality control checks.

To facilitate the selection of data with a specific quality status, we defined two sets of quality flags. The first set consists of aggregate flags, which allow you to easily select data according to their status as OK, questionable, or erroneous, and to distinguish between validated and preliminary data if you wish to do so (:numref:`table-agg-data-quality-flags`). The second set of flags preserves the information of the original quality assessment by the provider as well as any possible modification introduced through our automated quality control procedures (:numref:`table-specific-flag-values`). These more detailed flag values are the values that are actually stored in the database. You can use both flag sets in the REST interface.

.. _table-agg-data-quality-flags:

.. csv-table:: aggregated data quality flags of the TOAR database [#f12]_
   :header: "**Flag value**", "**Flag name**", "**Description**", "**Combination of original flag values (Table 5.3)**"
   :class: longtable
   :file: csv/combined_quality_flags.md
   :widths: 5 10 25 15
   :delim: |


.. _table-specific-flag-values:

.. csv-table:: the specific flag values defined in the TOAR database
   :header: "**Flag value**", "**Flag name**", "**Description**"
   :class: longtable
   :file: csv/single_quality_flags.md
   :widths: 5 15 30
   :delim: |


The following two tables summarise how flag values may be modified as a result of the automated quality control tests which are run during data ingestion or as part of a data inspection.

.. _table-possible-flaging-states-validated:

.. csv-table:: possible flagging states of **validated** data depending on the data quality status offered by the data provider and the result of our automated QC tests
   :class: longtable
   :file: csv/validated_quality_flags.md
   :widths: 25 25 25 25
   :delim: |


.. _table-possible-flaging-states-preliminary:

.. csv-table:: possible flagging states of **preliminary** data depending on the data quality status offered by the data provider and the result of our automated QC tests
   :class: longtable
   :file: csv/preliminary_quality_flags.md
   :widths: 20 20 20 20 20
   :delim: |


In some situations of real-time data processing, the only automated test that can be run is a crude range test (for example, if many values from different stations at one specific time step are inserted). This situation does not qualify as a full QC test. Therefore, values are only flagged as erroneous (26, 27, or 24 depending on the provider flag) or as not checked (7, 16, 28).


.. rubric:: Footnotes

.. [#f1] TOAR V1 is described in Schultz, M. G. et al. (2017) Tropospheric Ozone Assessment Report: Database and Metrics Data of Global Surface Ozone Observations, Elem Sci Anth, 5, p.58. DOI: http://doi.org/10.1525/elementa.244

.. [#f10] Findable, Accessible, Interoperable and Re-usable. For details see https://www.force11.org/group/fairgroup/fairprinciples and the TOAR data FAIRness assessment in :numref:`fair-data` below.

.. [#f11] In the second phase of TOAR, a dedicated statistics working group will explore more quantitative ways of assessing the accuracy and robustness of ozone trends.

.. [#f12] These flags allow for convenient selection of data with the most relevant quality criteria, i.e. OK, questionable, or erroneous on the one hand and validated or preliminary on the other hand. The flags are composites of more specific flag values which are listed in :numref:`table-specific-flag-values`.