Automatic processing of atmospheric CO2 and CH4 mole fractions at the ICOS Atmosphere Thematic Centre

The Integrated Carbon Observation System Atmosphere Thematic Centre (ICOS ATC) automatically processes atmospheric greenhouse gas mole fractions from data provided by the sites of the ICOS network. Daily transferred raw data files are automatically processed and archived. Data are stored in the ICOS atmospheric database, the backbone of the system, which has been developed with an emphasis on the traceability of the data processing. Many data products, updated daily, explore the data from different angles to support the quality control of the dataset performed by the principal operators in charge of the instruments. The automatic processing includes calibration and water vapor corrections as described in the paper. The mole fractions calculated in near-real time (NRT) are automatically re-evaluated as soon as a new instrument calibration is processed or when the station supervisors perform quality control. By analyzing data from 11 sites, we determined that the average calibration corrections are equal to 1.7 ± 0.3 µmol mol−1 for CO2 and 2.8 ± 3 nmol mol−1 for CH4. These biases are important to correct to avoid artificial gradients between stations that could lead to errors in flux estimates when using atmospheric inversion techniques. We also calculated that the average drift between two successive calibrations separated by 15 days amounts to ±0.05 µmol mol−1 and ±0.7 nmol mol−1 for CO2 and CH4, respectively. Outliers are generally due to errors in the instrument configuration and can be readily detected thanks to the data products provided by the ATC.


Introduction
Rising greenhouse gas (GHG) concentration in the atmosphere is a major source of forcing in the current changing climate (Intergovernmental Panel on Climate Change, 2013). Worldwide measurement systems are being implemented (Andrews et al., 2014; Deng et al., 2014; Deutscher et al., 2014; Dils et al., 2014; Fang et al., 2014; Frankenberg et al., 2015; Houweling et al., 2014; Ramonet et al., 2010) to both monitor and understand these increasing concentrations. In Europe, the Integrated Carbon Observation System (ICOS), an international research infrastructure for precise in situ measurements, is under construction. ICOS is a distributed infrastructure composed of three integrated networks measuring GHG in the atmosphere, over the ocean and at the ecosystem level. Each network is coordinated by a thematic center that performs, among other things, centralized data processing. Further processing takes place in the ICOS Carbon Portal where, for example, 2-D GHG flux maps are computed using the ICOS atmospheric station time series. One of the key focuses of ICOS is to provide standardized and automated high-precision measurements, which is achieved through the use of measurement protocols and standardized instrumentation. The implementation of ICOS included a preparatory phase (2008-2013, EU FP7 project reference 211574) with a demonstration experiment, later called "extended demo experiment" in the period between the end of the preparatory phase and the formal start of ICOS as a legal entity at the end of 2015. In total, 11 sites have been participating in the atmospheric network during this demonstration experiment and its extension. The data center of the ICOS Atmosphere Thematic Centre (ATC), located at the Laboratoire des Sciences du Climat et de l'Environnement (LSCE, France), began to automatically process atmospheric GHG mole fractions in 2009. The centralized data processing aims to reduce inter-laboratory differences and facilitate the production of a coherent dataset in near-real time (NRT). The NRT processing chain was built on the expertise gained during previous European projects including CARBOEUROPE, Infrastructure for Measurements of the European Carbon Cycle (IMECC) and Global Earth Observation and MONitoring (GEOMON). NRT is defined here as processing on a daily basis.
NRT data production is more demanding but brings several benefits. In terms of station management, it allows station principal operators and investigators to get fast feedback on the data; it improves reactivity in case of disruption in the data flow and thus limits data gaps. NRT data are also useful for campaign-based measurement setups. They allow us, for example, to adjust the campaign setup and observation plan or to place more emphasis on a specific phenomenon. On a more scientific level, NRT data allow for early-warning monitoring systems, for example, in the case of extreme GHG events (e.g., drought, high-pollution event). NRT is a necessity to perform data assimilation for operational systems (e.g., Monitoring Atmospheric Composition & Climate, MACC) in which NRT data are either used as a diagnostic or ingested in assimilation mode to improve operational forecasting (http://www.gmes-atmosphere.eu/d/services/gac/verif/ghg/icos).
NRT data are, however, less precise than so-called consolidated data. In ICOS, consolidated data are expected to be produced on a 6-month basis. They contain additional data treatment steps ensuring increased precision and confidence in the dataset. These steps include potential correction due to drift in the reference scales used to make the measurements and "manual visual" inspection of the data to screen for potential problems that are difficult to detect automatically. The estimation of time-varying uncertainties, which is essential information for an optimal use of the data, is still under development and therefore not addressed in the framework of this study.
To further increase confidence and trust, ICOS is building an efficient scheme to ensure traceability of the data. Persistent identifiers will be attached to the data for both proper acknowledgment and citation. ICOS atmospheric data are traceable to the Global Atmosphere Watch (WMO/GAW) international reference scales for GHG, and the history of data processing steps is archived. This allows full traceability and transparency of the consolidated dataset, which will be the basis for elaborated products and services.
This article describes the computing facility dedicated to the ICOS ATC at LSCE and the different steps of the automatic processing of CO2 and CH4 mole fractions, including the automatic quality control of the raw data and the corrections due to water vapor interference and calibration (WMO scale). Most of the processing protocols and parameters are illustrated with a few examples from instruments currently providing raw data to the ICOS ATC as part of the ICOS extended demonstration experiment. Because the paper is focused only on CO2 and CH4, only analyzers deployed in the monitoring network that measure these gases have been considered. To date, for these species, only cavity ring-down spectroscopy (CRDS) analyzers commercialized by the Picarro company meet the ICOS requirements, but other instruments may be added in the future.

Server organization and data archive at ICOS ATC
The instrumental raw data are transferred at least once a day from the monitoring sites to an ATC server using the secure file transfer protocol (SFTP). The files are first archived, and the data are automatically processed by the ICOS database. Three dedicated servers (Fig. 1) are installed and maintained at ICOS ATC to fulfil automatic data collection from measurement stations, processing and distribution to users.
Data collection (icos-ssh server): ICOS network stations upload raw data from the instruments to the ATC on a daily basis. This upload can be managed by upload software developed at ATC. All collected data are centralized on the icos-ssh server and upon receipt are copied to a dedicated server for processing and archival. Data are kept on the icos-ssh server for 1 month after their upload. A duplicate archive of the raw data will be made at the ICOS Carbon Portal. Currently, the amount of data uploaded is on the order of 6.5 MB day−1 per CO2/CH4 in situ analyzer, corresponding to a total of ∼170 MB a day for 26 stations processed daily by ATC. Note that the files transferred every day to the ATC are not the high-resolution absorption spectra used to retrieve mole fractions (Crosson, 2008). The raw data files of the trace gas analyzers currently processed at ATC contain CO2/CH4 information already in geophysical units. It is foreseen that the full spectra files will be archived at the ICOS site on specific hard drives for further post-analysis. The amount of data to archive would then be approximately 230 MB day−1, 780 MB day−1 and 1.3 GB day−1, respectively, for models ESP100/G1301, G2301 and G2401 of CRDS Picarro analyzers.
Figure caption fragment: We consider three types of data: "in situ" corresponding to ambient air, "target" when a cylinder filled with a reference gas is measured and "calibration" when calibration cylinders are measured.
Data processing (icos-data server): Upon reception, data are processed. The processing is performed on the icos-data server, a dedicated internal (inaccessible from outside the ATC) server at ATC that also hosts the ICOS atmospheric database. The icos-ssh server (accessible from outside the ATC) also hosts the QA/QC applications developed at ATC, used by principal investigators (PIs) and authorized persons to carry out the measurement control.
Data distribution (icos-web server): The distribution of data and data products is handled by the icos-web server. This server hosts the ATC website and uses an open-source content management system framework (Drupal). For security purposes, only read-only access is allowed to some partitions on the icos-data server. Access to the ICOS atmospheric database hosted on icos-data from icos-web is prohibited.
Traceability of the data downloads and long-term archival, which are not described here, are being implemented in collaboration with the Carbon Portal of ICOS, which is hosted and operated at Lund University in Sweden (https://www.icos-cp.eu).
Processing: automatic filtering of raw data
Specific processing chains have been developed for each type of trace gas analyzer, but the general framework remains the same. Here, we describe the processing chain and associated parameters defined for the treatment of continuous measurements of CO2 and CH4 atmospheric mole fractions. Similar chains have also been developed for measurements of other ICOS parameters such as meteorological variables or radon but are not described in detail in this article. Figure 2 gives an overview of the different steps of the CO2 and CH4 data processing. One analyzer routinely measures three types of air samples: ambient air, air from target tanks and air from calibration tanks. The target tanks, also called "surveillance tanks", are used as a quality control tool. Their mole fractions are known (prescribed by the ICOS Central Analytical Laboratories (CAL) located in Germany, which is in charge of providing the calibration gases needed by the atmospheric stations), and they are processed similarly to the ambient air. Consequently, the temporal variations of the target gas measurements can be used to estimate time-varying uncertainties (Yver Kwok et al., 2015). It should be noted, however, that the target gases do not pass through the whole air inlet, and a possible bias due to contamination in the inlet upstream of the target gas connection is not considered. As recommended by WMO, two target tanks, with a significant range in the mole fractions of the measured species, are required at ICOS stations (WMO, 2012). Short-term target gases are analyzed at least once per day, whereas long-term target gases are measured only once every 2 to 4 weeks (after each calibration sequence). This configuration allows for both frequent measurements using one target gas and the possibility to keep the other target gas over a long period (10-20 years). The system also handles so-called inter-comparison (ICP) tanks, which correspond to cylinders analyzed as part of a comparison exercise like the round-robin set up by WMO/GAW or by the Integrated non-CO2 Greenhouse Gas Observing System (InGOS) European project (Manning et al., 2009; WMO, 2012). The ICP gases are processed similarly to target gases. The processing of the different types of gas follows the same general scheme: data control, correction, filtering and time aggregation (Fig. 2).
For traceability and transparency of the data processing, each rejection of data is associated with a flag. For this purpose, an internal cumulative flag has been defined, which is associated with the different steps of the processing. The steps and the flag will be described in the following paragraphs. Because ways and conditions to automatically validate raw data may differ from one instrument model to another, the list of internal flags is instrument dependent. Although these flags are important for the traceability of the process, they are inconvenient for the majority of the data users, who request a simple and unambiguous way to separate the valid and invalid data. For this reason, we have defined another flag scheme named "user flag", as described in Table 1. It is instrument independent and allows easy differentiation of the data that have been validated/invalidated either through NRT data processing or after the requested inspection of the data by an expert. This flagging scheme is completed with a third type of flag named "descriptive flag", which allows the PI to provide codified reasons for invalidating data or useful information for validating data. For each data point, there is an automatic descriptive flag and a manual descriptive flag. The manual flag is set by the expert via a graphical quality control application, and the automatic flag is set during the automatic processing of the data. Both flags use the same list of possible values. The flags are set only on raw data. The flag information on raw data is carried to the aggregated data (1 min or hourly averaged or injection). A description of the "descriptive flag" can be found in Table 2.
Table note (flags): Beyond the validity status of the data, each set of flags conveys additional meaning. Automatic quality control flags imply that no expert has manually inspected the data yet, whereas manual quality control flags imply that an expert has manually inspected the data. The backward propagation of manual quality control flags implies that an expert has performed manual inspection of the corresponding aggregate data but not the data directly.
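A cumulative internal flag of this kind can be pictured as a bit field in which each processing step sets its own bit, so that several rejection reasons can coexist on one data point. The following is only a minimal sketch of that idea: the flag names, the bit values and the helper functions are hypothetical and do not reproduce the ATC internal codes or the user flag values of Table 1.

```python
from enum import IntFlag


class InternalFlag(IntFlag):
    """Hypothetical cumulative flag: one bit per rejection reason."""
    NONE = 0
    CAVITY_PRESSURE = 1
    CAVITY_TEMPERATURE = 2
    OUTLET_VALVE = 4
    STABILIZATION = 8
    MANUAL_REJECTION = 16


def add_reason(flag, reason):
    """Accumulate a rejection reason without losing earlier ones."""
    return flag | reason


def is_valid(flag):
    """A data point is valid only if no rejection bit is set."""
    return flag == InternalFlag.NONE


def propagate_backward(raw_flags, reason=InternalFlag.MANUAL_REJECTION):
    """Copy a manual rejection set on an aggregate to all contributing raw data."""
    return [add_reason(f, reason) for f in raw_flags]
```

In this sketch, a simple valid/invalid status for data users is derived from the cumulative flag with `is_valid`, while the full bit pattern stays available for traceability.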

System configuration
The objective of ICOS is to develop a standardized European monitoring network for greenhouse gases with centralized data processing. Technical discussions about the measurement protocols have been organized during the ICOS preparatory phase through seven working groups. This process has resulted in the first version of the ICOS Atmospheric Station Specifications (ICOS, 2015). Because the monitoring stations have specific local constraints, the processing chains were required to be parameterizable to handle some of the station specificities. A dedicated application, called ATCConfig, has been developed to allow the station PIs to configure the stations of which they are in charge. This application enables the following key points to be described in detail: contact persons and institutes in charge of the station and instruments; geographic coordinates, postal address and description of the monitoring station including the different measurement setups with plumbing schemes; instrument description: category, model, firmware, location to trace instrument movements (e.g., for repair) and various related metadata; calibration/target tanks: model, tank inspection date, valve and regulator description, filling date and mole fraction values; description of the sampling line connections and tank connections to the instruments; description of the measurement sequences (in situ air, calibration and target gases; see Table 3); definition of the measurement processing parameters (control, correction and data filtering; see Table 4).
Each registered instrument is assigned a unique identifier used to reference it (preceded by "no." in this article). A key aspect of the designed system is to ensure a high level of traceability, which leads us to keep track of the history of all configurations provided by station PIs.
Regarding the configuration of the measurement processing, we consider three types of sequences: calibration, ambient/target and inter-comparison. Table 3 provides the list of parameters for each of the three sequences, with the Mace Head station (identified by the three-letter code MHD) configuration as an example. The station PIs must configure what is measured (tanks or in situ air), in which order and for how long. Minimum requirements (e.g., at least three calibration tanks and two target gases) are prescribed by an ICOS Atmospheric Station Specifications document.
The full list of parameters to be set up by station PIs for the operation of in situ CO2/CH4 analyzers is shown in Table 4, with the example of the Mace Head set of values for instrument no. 41. The means by which those parameters are used in the automatic processing of the raw measurements of CO2 and CH4 mole fractions is described in the following paragraphs.

Control based on analyzer ancillary data
The first step of the processing consists of the evaluation of instrumental parameters (e.g., temperature, pressure, flow rate). In the case of the CO2/CH4 analyzers currently used in the ICOS network, each raw data point is scanned for three parameters: the cavity pressure, the cavity temperature and the outlet valve opening. These ancillary data are provided by the analyzer at the same time resolution as the raw CO2 and CH4 data. Consequently, for each single data point, the values of the parameters are checked against a valid interval or threshold. An example of the range of variability allowed for those parameters, for instrument no. 41 at Mace Head, is provided in Table 4. The valid intervals and thresholds are instrument and location dependent at this point, but discussions are ongoing between the scientists in charge of the instruments to evaluate the possibility to standardize these criteria for a given instrument model. This decision depends on whether the setup of the station has an influence on the instrument performance. For each GHG data measurement, all selected parameters are tested against their valid interval or threshold. If at least one parameter fails, the GHG data are flagged as invalid. Each failure is traced in the internal cumulative flag (Table 5).
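The per-point control described above amounts to testing each ancillary parameter against an interval or a one-sided threshold and recording every failure. A minimal sketch is given below; the function name, the parameter keys and the numerical limits are illustrative assumptions and are not the values of Table 4.

```python
def check_ancillary(point, limits):
    """Return the list of ancillary parameters that fail their check.

    point  : dict of parameter name -> measured value
    limits : dict of parameter name -> (low, high) valid interval;
             use None for an open bound (simple threshold).
    """
    failures = []
    for name, (low, high) in limits.items():
        value = point[name]
        if (low is not None and value < low) or (high is not None and value > high):
            failures.append(name)
    return failures


# Hypothetical limits for illustration only, NOT the values of Table 4.
EXAMPLE_LIMITS = {
    "cavity_pressure": (139.9, 140.1),
    "cavity_temperature": (44.98, 45.02),
    "outlet_valve": (20000.0, None),
}
```

A data point would be treated as invalid as soon as the returned list is non-empty, and each listed name could be mapped to a bit of the cumulative flag.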
Table 6 shows all internal flags that have been attributed to three analyzers continuously measuring the CH4 mole fractions during 2014. From this list, it appears that raw data may be rejected for a combination of reasons. For example, during the stabilization period following the switch from one gas to another, the cavity pressure and temperature may also be out of the assigned validity range. Overall, for an instrument working without major failure, as is the case for the instruments in Table 6, the major cause of data rejection corresponds to the flushing time needed to stabilize the measurement after a change in the type of gas to analyze (e.g., from ambient air to target gas). Typically for a surface site with a single sampling level, the amount of data rejected for stabilization is on the order of 1 to 2 % of the continuous raw data. For a multiple sampling level site, such as the Observatoire Pérenne de l'Environnement (identified by the three-letter code OPE) high tower in France, this percentage of rejected data can increase to 16 % (Table 6) because of the frequent changes from one sampling level to another.

Table fragment (internal flags; columns as extracted: flag; meaning; associated value):
MaxDeltaDurationTank; the time interval between two successive calibration tanks is too large; 1 min
NbTank; the number of tanks for the calibration is below the minimum required; 3 tanks
TankMinDuration; the measurement duration for a tank is below the configured minimum; 10 % of the defined duration for the given type of tank (target or calibration)
TankMaxDuration; the measurement duration for a tank is above the configured maximum; 10 % of the defined duration for the given type of tank (target or calibration). The calibration is not rejected, but a warning email is sent to notify the PI that more gas than expected is used up.
NbCycle; the number of cycles for a tank measurement during calibration is below the minimum required; 2 cycles
SequenceCompleteness; the calibration sequence is incomplete; see calibration sequence definition
Quality control; manual rejection flag set up by the station PI; -
Backwards quality control; propagation of a manual flag set up by the station PI on an aggregated value (e.g., the hourly mean) to all data used for averaging (e.g., the 1 min means and raw data)

Control of the stabilization periods
When the instrument switches between sample types or sampling levels, some residual gas remains in the common tubing and valves. For a given duration (called the stabilization period) after such switches, the data are flagged as invalid to avoid considering residual or mixed gas for further processing. The stabilization period duration depends on the flow rate, the volume of the analyzer cell, and the volume of the sampling line where continuous flushing is impossible. Consequently, the duration of the stabilization, given in minutes, is instrument and site dependent. Different values for the flushing time can also be set for in situ measurements and tank (calibration and target gas) measurements.
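The flagging of stabilization periods can be sketched as a single pass over the time-ordered data that restarts a timer at every change of sample. This is an illustration under stated assumptions: the function and argument names are hypothetical, times are expressed in minutes, and the first point of the series is treated as a switch.

```python
def flag_stabilization(times, sample_ids, stabilization_min):
    """Return a list of booleans, True where the point is in a stabilization period.

    times             : measurement times in minutes (ascending)
    sample_ids        : identifier of the gas or level measured at each time
    stabilization_min : dict sample_id -> stabilization duration in minutes
    """
    flagged = []
    switch_time = None
    previous = None
    for t, sid in zip(times, sample_ids):
        if sid != previous:          # a switch of sample type or sampling level
            switch_time = t
            previous = sid
        flagged.append(t - switch_time < stabilization_min[sid])
    return flagged
```

Keeping the duration in a per-sample dictionary reflects the possibility of setting different flushing times for in situ and tank measurements.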
An example of the stabilization of CO2 and CH4 mole fractions is provided in Fig. 3, showing a synthesis of the calibration gas measurements at the Amsterdam Island station (identified by the three-letter code AMS, instrument no. 111). At this station, four calibration gases are analyzed four times for 30 min every 30 days. The CO2 and CH4 mole fractions are averaged every minute, and we calculate the differences with the last minute of each target injection. On average, stabilization (±0.05 µmol mol−1 for CO2 and ±1 nmol mol−1 for CH4) is reached after 2 to 4 min. When looking at measurements of short-term and long-term target gases from several sites (Fig. 4), one can see that stabilization is very often reached within 4-6 min, but more time may be needed for the long-term target. The difference can be explained by the fact that the long-term target is used only once a month, and the associated pressure regulator and lines must be flushed for a little while before being stabilized.
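The stabilization time of one injection can be estimated from its 1 min means by comparing each minute with the last minute and looking for the point from which the difference stays within a tolerance. The sketch below is one possible reading of that procedure; the function name is hypothetical, and requiring the difference to remain within tolerance until the end of the injection is a design choice made here for robustness, not necessarily the criterion used for Fig. 3.

```python
def minutes_to_stabilize(minute_means, tolerance):
    """First minute (1-based) from which all differences to the last-minute
    mean stay within +/- tolerance; None if the series is empty."""
    if not minute_means:
        return None
    reference = minute_means[-1]
    stable_from = len(minute_means)
    # Walk backwards from the end while the series stays within tolerance.
    for i in range(len(minute_means) - 1, -1, -1):
        if abs(minute_means[i] - reference) <= tolerance:
            stable_from = i + 1
        else:
            break
    return stable_from
```

With a tolerance of 0.05 µmol mol−1 for CO2, for instance, an injection whose third minute still differs by 0.06 µmol mol−1 from the last minute would be considered stable from the fourth minute.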
Figure 3. CO2 (above) and CH4 (below) mole fraction differences between each minute and the last minute of the target gas measurement period (30 min in this case) at the Amsterdam Island station (identified by the three-letter code AMS). The differences are averaged for all target gas measurements from 6 August to 6 November 2014. The number of injections or samplings during this period is provided for each of the four target gases on the right. The minutes provided on the right of the graph for each gas correspond to the minute when the difference decreases below the horizontal dashed lines chosen as half the WMO-recommended compatibility for northern hemispheric sites (±0.05 µmol mol−1 for CO2 and ±1 nmol mol−1 for CH4).
The second step of the processing consists of correcting the data (Fig. 2) for several artifacts. Corrections are applied only to the raw data that have been flagged as valid during the first step (see Sect. 3). This step is common to all types of gas (ambient, target, calibration), but the list of applicable corrections differs. There can be 0 to n correction(s), and the order in which they are applied is important. For each type of correction, there is a correction function defined, and the parameterization of this function is dependent on the instrument, location, species and type of gas. The values of all the intermediate corrections are stored for traceability, but if a filter applied on an intermediate corrected value fails, raw data are flagged as invalid and will not be used to compute the associated aggregated values.
For CO2 and CH4 measurements, all types of samples (ambient air, target and calibration) are corrected for humidity effects; ambient air and target gases are additionally corrected with the calibration equation, whereas the calibration gases are not.

Water vapor correction
To achieve the WMO/GAW compatibility goals for observations of CO2 and CH4 mole fractions in dry air, it is required when using gas chromatography or nondispersive infrared spectroscopy to dry the air sample prior to analysis to a dew point of no more than −50 °C (WMO, 2012). The emergence of new instruments using infrared absorption at specific spectral lines selected to minimize the interference between CO2/CH4 and water vapor has enabled precise measurements in humid air. This technology, including CRDS or cavity-enhanced absorption spectroscopy, has been evaluated in both laboratory and field conditions by several research groups (Chen et al., 2010; Rella et al., 2013). Those studies have demonstrated that it is possible to precisely correct the effects of water vapor dilution and pressure broadening for CO2 and CH4. An empirical quadratic correction has been established by Chen et al. (2010) for CRDS Picarro analyzers and confirmed by other laboratory experiments. All the Picarro CO2/CH4 analyzers use the same manufacturer's built-in correction coefficients, defined by Chen et al. (2010), as described by Rella et al. (2013):
$$\frac{(\mathrm{CO_2})_{\mathrm{wet}}}{(\mathrm{CO_2})_{\mathrm{dry}}} = 1 + a\,H + b\,H^{2}, \qquad \frac{(\mathrm{CH_4})_{\mathrm{wet}}}{(\mathrm{CH_4})_{\mathrm{dry}}} = 1 + c\,H + d\,H^{2},$$
where (CO2)wet and (CH4)wet are the mole fractions measured in wet air, H the reported H2O mole fraction, (CO2)dry and (CH4)dry the mole fractions in dry air, and a, b, c and d the correction coefficients of Chen et al. (2010). However, this generic manufacturer's water correction does not provide the optimum result as the pressure broadening effect induced by water vapor is specific to each instrument. In order to improve the water correction, the ICOS strategy is not to use the dry air mole fractions reported by the Picarro but to use the mole fractions measured without water vapor correction and then apply a post-processing water correction with specific coefficients for each instrument. The determination of specific coefficients for one instrument requires laboratory experiments to be performed as described by Rella et al. (2013). Such experiments are now performed systematically for each ICOS instrument at the ATC ICOS Metrology Laboratory. A technical paper describing these tests and associated results is in preparation. In the ICOS data processing, the water vapor correction is applied in the same way to all analyzed samples (calibration and target gases, ambient air).
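A post-processing water correction of this kind can be sketched as a single function, assuming the quadratic form wet/dry = 1 + aH + bH², with H the reported H2O mole fraction in percent. The function name and the coefficient values below are placeholders of a plausible magnitude chosen for illustration; they are not the instrument-specific ICOS coefficients, which must come from the laboratory tests.

```python
def dry_mole_fraction(wet, h2o, a, b):
    """Convert a wet-air mole fraction to dry air with a quadratic
    water correction: wet / dry = 1 + a*H + b*H**2."""
    return wet / (1.0 + a * h2o + b * h2o ** 2)


# Placeholder coefficients for CO2 (H2O in %), for illustration only;
# instrument-specific values must come from laboratory experiments.
A_CO2, B_CO2 = -0.012, -2.7e-4
```

With these placeholder values, a wet-air CO2 reading near 395 µmol mol−1 at 1 % H2O would be raised by a few µmol mol−1, which is the order of magnitude of the corrections reported for undried air.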
Figure 5 shows a comparison of the water corrections applied to CO2 and CH4 measurements on two instruments running in parallel at the Mace Head station. One instrument (G1301 model, no. 41) directly measures the wet air, whereas for the other one (G2301 model, no. 54) the air is first dried with a cryogenic dryer using a "cold trap" immersed in an ethanol bath cooled to −50 °C. The H2O measurements decrease from approximately 1 % (wet air) to less than 0.01 % (dry air). The mean water vapor corrections applied in February 2014 for the instrument measuring the ambient air without any drying are 4.6 ± 0.7 µmol mol−1 and 17.8 ± 2.8 nmol mol−1, respectively, for CO2 and CH4 (Fig. 6). The same corrections applied to the instrument measuring dry air are 0.04 ± 0.01 µmol mol−1 and 0.16 ± 0.05 nmol mol−1, respectively, for CO2 and CH4. Overall, over the 15-day period shown in Fig. 6, the differences between the dry mole fractions measured by the two instruments (no. 41 minus no. 54) at the Mace Head station are +0.015 ± 0.03 µmol mol−1 and −0.41 ± 0.3 nmol mol−1, respectively, for CO2 and CH4.
We have made the same calculations for the differences between the CO2 and CH4 mole fractions before and after the water correction for 11 instruments used at monitoring stations in 2014. Statistics of the comparisons of hourly means over the year are summarized in Fig. 7. The water vapor corrections shown in Fig. 7 correspond to the difference between data with and without the H2O correction (amount of water vapor correction) and not to a measurement or correction bias. These corrections are needed to convert humid air mole fractions into dry air mole fractions. However, any error in the water vapor correction would introduce a bias in the resulting dry air mole fractions, whose amplitude would depend on the H2O concentration. The determination of a specific correction for each instrument by the ATC will minimize the bias associated with humid air measurements. Conversely, drying the air (e.g., using a Nafion membrane) may also cause a measurement bias by contamination of the sampled air. The evaluation of these biases is underway at the ATC and will be published separately. Several instruments are operated with a dryer system, and the water vapor corrections are consequently close to zero, as shown for the Mace Head station (for instrument no. 54). The instruments operated at Amsterdam Island, Biscarrosse, Lamto, the Observatoire Pérenne de l'Environnement and Puy de Dôme were measuring dry air, whereas the Trainou instrument was successively operated in the two configurations (wet and dry) in 2014. For the other instruments, the annual average water vapor corrections range from 4 to 12 µmol mol−1 for CO2 and from 18 to 40 nmol mol−1 for CH4, depending on the mean water vapor content. For example, the lowest corrections are observed at the Pic du Midi station (identified by the three-letter code PDM), which is a high-altitude station (2877 m) with drier air compared to low-elevation stations. The statistics of the Trainou station (identified by the three-letter code TRN) instrument no. 108 are intermediate between the dry and wet instruments because this instrument was operated in both situations in 2014.

Calibration correction
All CO2 and CH4 measurements that are intended to be added to the international monitoring networks database must be calibrated relative to the WMO mole fraction scale for gas mole fractions in dry air maintained by WMO/GAW Central Calibration Laboratories (CCL). The current scales used for CO2 and CH4 are "WMO CO2 X2007" (http://www.esrl.noaa.gov/gmd/ccl/co2_scale.html) and "WMO CH4 X2004" (http://www.esrl.noaa.gov/gmd/ccl/ch4_scale.html). Updates of the WMO scales will be taken into account by the CAL, and the time series will be recalculated by the ATC. As explained previously (see Sect. 3.1), the station PIs are in charge of the configuration of the calibrations performed at their site (number of calibration tanks, frequency of calibrations and duration of the gas injections).
A calibration episode is called a "calibration sequence". When n working standards (calibration tanks) are measured in a row, the succession of tanks in a defined order is called a cycle. During a calibration episode, the cycle is repeated several times, and the calibration sequence is defined as m times the repetition of the unitary cycle element (Fig. 8).
For each tank and each cycle, 1 min mole fraction means are calculated, and the injection mean is derived from the average of all minute means over the entire sampling period (excluding the stabilization period). For each tank, the mole fraction means are then averaged over all m cycles. These values are plotted against the tank's standard concentration attributed by the calibration laboratory, and the calibration equation is determined by linear least-squares fitting.
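The aggregation and fit described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming that the measured tank means are regressed on the assigned values (measured = slope × assigned + intercept) and that the relation is then inverted to correct a measurement; the function names and the direction of the regression are assumptions made for the example.

```python
def tank_means(cycle_means_by_tank):
    """Average the injection means of each tank over all cycles."""
    return {tank: sum(v) / len(v) for tank, v in cycle_means_by_tank.items()}


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(measured, slope, intercept):
    """Invert measured = slope * assigned + intercept."""
    return (measured - intercept) / slope
```

Here `fit_line` would be called with the assigned tank values as `x` and the averaged measured values as `y`, and `calibrate` would then be applied to ambient air and target gas data.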
Because the calibration correction is essential for the final in situ or target data value determination, the calibration data are filtered through a set of specific controls to determine whether all expected data are present and the quality is sufficient for use in the computation of the calibration equation (see below). All controls made on the calibration sequences are instrument, location and species dependent. If there are enough valid data, the calibration is accepted and the calibration equation is determined. The equation coefficients are stored in the database, making them available for the calibration of the other types of samples (ambient air and target gases).
The controls applied to the calibration data are currently the following:
1. The expected number of cycles with their associated number of calibration standards is checked along with the minimum duration of the tank injection. If the calibration data do not correspond to the defined calibration sequence, the calibration is not taken into account.
2. The standard deviation of mole fraction 1 min means must be below a specified threshold.
3. The standard deviation of mole fraction injection means must be below a specified threshold.
4. A stabilization period given as a number of cycles can be applied.
5. The number of valid calibration injections (or cycle means) for each working standard, after applying the cycle stabilization, if any, must be equal to or greater than a minimum.
6. The number of valid working standard mole fraction means available over the entire calibration sequence for the computation of the calibration equation must be equal to or greater than a minimum.
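As an illustration, the six controls listed above could be chained as in the following sketch. The data layout, the configuration keys and the function name are hypothetical, and the exact order and semantics of the tests in the ATC processing may differ. Each cycle is represented as a dictionary mapping a tank identifier to a list of (1 min mean, 1 min standard deviation) pairs.

```python
from statistics import mean, stdev


def check_calibration(cycles, cfg):
    """Return {tank: sequence mean} if the calibration is accepted, else None.

    cycles: list of cycles; each cycle is {tank_id: [(minute_mean, minute_std), ...]}.
    cfg:    hypothetical per-instrument, per-species configuration.
    """
    # Control 1: expected number of cycles, tanks and minimum injection duration.
    if len(cycles) != cfg["n_cycles"]:
        return None
    for cycle in cycles:
        if set(cycle) != set(cfg["tanks"]):
            return None
        if any(len(minutes) < cfg["min_minutes"] for minutes in cycle.values()):
            return None

    tank_means = {}
    for tank in cfg["tanks"]:
        valid = []
        # Control 4: the first cycles can be discarded as stabilization.
        for cycle in cycles[cfg["stab_cycles"]:]:
            # Control 2: keep only 1 min means with a low enough standard deviation.
            kept = [m for m, s in cycle[tank] if s < cfg["max_minute_std"]]
            # Control 3: the spread of the kept 1 min means within the
            # injection must stay below its own threshold.
            if len(kept) >= 2 and stdev(kept) < cfg["max_injection_std"]:
                valid.append(mean(kept))
        # Control 5: enough valid injections for this working standard.
        if len(valid) >= cfg["min_injections"]:
            tank_means[tank] = mean(valid)

    # Control 6: enough working standards left to compute the equation.
    if len(tank_means) < cfg["min_tanks"]:
        return None
    return tank_means
```

Returning the per-tank means only when every control passes mirrors the rule that a calibration which does not match its defined sequence is not taken into account.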
An example of calibration for instrument no. 41 at the Mace Head station on 10 December 2014 is shown in Figs. 8 and 9. The set of parameters defined by the PI for this instrument is given in Tables 3 and 4. Four calibration tanks are used, and each is analyzed four times (cycles) for 20 min per injection, including 15 min dedicated to the flushing of the inlet lines and analyzer cell (stabilization time). Overall, the calibration lasts for 320 min (4 tanks × 4 cycles × 20 min). Figure 8 shows the different steps of the calibration process, from analyzing the raw data and aggregating to the minute to calculating the cycle and calibration sequence averages. A fitting function (see Fig. 9) is then applied to the results of the calibration to define the coefficients of the correction, which will be applied to in situ air and target gas measurements to ensure the data are compatible with the WMO reference scales. Similar to the analysis of the water vapor corrections, we have summarized the calibration corrections applied to 11 instruments in 2014 (Fig. 7). All stations are calibrated with standard gases, which are themselves measured against the international WMO scales. The correction applied to the raw data depends on the pre-set calibration parameters of the CRDS analyzers, which correspond in this study to the factory settings. The mean CO2 correction applied to the 11 instruments is 1.7 ± 0.3 µmol mol−1, and its variability over a 1-year period, expressed as the mean standard deviation, is 0.07 µmol mol−1. Calibration corrections calculated for CH4 mole fractions have a mean of 2.8 ± 3 nmol mol−1 over the 11 sites and a yearly standard deviation of 0.7 nmol mol−1 on average. Even if the corrections are quite homogeneous from instrument to instrument and over the course of a year, these values demonstrate the need for regular calibrations with standard references to comply with the WMO compatibility goals.
The data are corrected with the closest calibration equation in time existing before the data. As soon as there are calibration episodes before and after the considered data, the correction is made with a linear interpolation of the enclosing calibration equations. It is important to note that NRT data provided after 24 h will be automatically modified after a few weeks once the next calibration is available to estimate the temporal drift of the analyzer. If no calibration equation is available within a period of 180 days to correct the data, the data are flagged as incorrect, and the explanation is added to the internal cumulative flag.
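The selection of the calibration equation in time can be sketched as below. Several points are assumptions made for this illustration only: the equations are interpolated through their coefficients (a, b), times are expressed in days, the 180-day limit is applied only when a single calibration is available on one side of the data, and the function name is hypothetical.

```python
from bisect import bisect_right

MAX_AGE_DAYS = 180  # validity period of a single calibration, from the text


def coefficients_at(t, calibrations):
    """Return the (a, b) coefficients to use at time t, or None.

    calibrations: list of (time_in_days, a, b) tuples sorted by time.
    None means that the data should be flagged as incorrect.
    """
    times = [c[0] for c in calibrations]
    i = bisect_right(times, t)
    before = calibrations[i - 1] if i > 0 else None
    after = calibrations[i] if i < len(calibrations) else None

    if before is not None and after is not None:
        # Validated mode: linear interpolation between enclosing calibrations.
        w = (t - before[0]) / (after[0] - before[0])
        return (before[1] + w * (after[1] - before[1]),
                before[2] + w * (after[2] - before[2]))

    # NRT mode (or start of a record): only one calibration is available.
    nearest = before if before is not None else after
    if nearest is not None and abs(t - nearest[0]) <= MAX_AGE_DAYS:
        return (nearest[1], nearest[2])
    return None
```

In this sketch the NRT value and the validated value of the same measurement differ only because the set of available calibrations changes between two calls, which is the drift correction discussed in the following paragraphs.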
We have analyzed, for 11 monitoring stations, the differences in the CO2 and CH4 mole fractions processed in near-real time with the same dataset after calibration drift correction and manual validation by the PI. A posteriori verification of the NRT dataset is important to qualify this specific product, which is increasingly requested by users. Understanding the reasons for differences between NRT and validated datasets will also help improve the automatic processing of the measurements. Figure 10 shows the differences for the hourly means. The most evident feature of the differences for all sites is the linear drift correction between two calibration sequences (≈ 2 to 4 weeks). At the Amsterdam Island station we see a reverse slope for a short period (2 weeks) in early July 2014, with the drift changing towards a smaller bias with time. This is due to a revision of the calibration performed on 1 July, after the correction of an erroneous injection of one calibration gas. In most cases (95 %), the differences are within ±0.06 µmol mol−1 for CO2 and ±0.75 nmol mol−1 for CH4. The statistics of the validated minus NRT mole fractions are shown for each site in Fig. 11. It is worth noting that for most of the stations, the median differences are less than or equal to zero. Only three instruments show a positive median difference for CO2 (Lamto station, identified by the three-letter code LTO, with instrument no. 192; PDM, no. 222; Ivittuut station, identified by the three-letter code IVI, with instrument no. 93) and one for CH4 (IVI, no. 93). This means that almost all instruments have a tendency to drift positively; consequently, when an NRT dataset is revised after a few days or weeks with the new calibration sequence, its value is slightly decreased. This tendency for a positive drift for CH4 measurements by CRDS analyzers was also noticed by Yver-Kwok et al. (2015).
In addition to the data corrections due to instrumental drift, we also detect in Fig. 10 some isolated events that present a different profile of variability, and there are also a few outliers. For example, not visible in this figure (out of scale) is a 5-day period (10-15 July) at the Mace Head station (no. 41) with very high differences between NRT and validated mole fractions: up to −25 µmol mol−1 for CO2 and −250 nmol mol−1 for CH4. This event corresponds to the installation of a new calibration scale at the Mace Head station, with erroneous values of the standard gases entered into the database. Consequently, the mole fractions calculated in NRT were wrong, and a few days were required to identify the problem and reprocess the dataset. Another example is the relatively constant differences observed at the Finokalia station (identified by the three-letter code FKL) from 5 to 20 June 2014: +0.09 µmol mol−1 and +1.4 nmol mol−1 for CO2 and CH4, respectively (Fig. 10, right). This event corresponds to an error in the first calibration performed at the installation of the station. The calibration episode was later rejected, and the subsequent calibration was therefore the only one used to correct the raw values, as explained previously. This issue may be difficult to detect immediately upon the start of a monitoring site because we lack references for evaluation.
The zoom into June 2014 (Fig. 10, right) also shows small oscillations in the CO2 differences at the Lamto station. This feature is related to the strong diurnal cycle observed at this tropical site (typically 50 ppm). Since the correction applied to the data depends on concentration, the differences between NRT and validated data also display a diurnal cycle. We also observe for some periods a relatively high random variability of the mole fraction differences for the Trainou station instrument (no. 108). This is due to the leakage of one valve that is used to evacuate the liquid water from a water trap set up inside a refrigerator. This problem caused contamination for a few minutes. These contaminated values were used in the NRT data processing, whereas they were excluded after the quality control of the measurements performed by the station PI, which explains the differences between the two datasets. This example shows the importance of the expert examination; it is very hard to completely automate the quality control, and the PI may have additional information at hand to help define the status of the data. However, when invalidating data the PI has to provide codified reasons (the list of such reasons, called "descriptive flag", can be found in Table 2).

Example rows of a semicolon-separated output data file (site OPE, 6 April 2015):
OPE;2015;04;06;05;00;00;2015.26084475;420.584;0.571;15;U;91;10;44402;;
OPE;2015;04;06;06;00;00;2015.26095890;419.648;0.906;16;U;91;10;44402;I,F-1;
OPE;2015;04;06;07;00;00;2015.26107306;415.025;0.787;16;U;91;10;44402;I,F-1;
OPE;2015;04;06;08;00;00;2015.26118721;410.413;0.914;16;U;91;10;44402;I,F-1;
...

Data time aggregation and associated metadata
Further processing consists of aggregating the data in time. The 1 min, hourly and daily means are computed for in situ data. The 1 min means and injection means are computed for tank data (calibration and target gases). As recommended by the World Data Centre for Greenhouse Gases (WDCGG; WMO, 2012), we calculate the means using data from the nearest time aggregation level and not always using the raw data. This implies that raw data are used to calculate 1 min averages, which are then used to calculate hourly averages and so on. For each single averaged data point, we provide the number of data used to compute the average and the standard deviation. The measurement time associated with an average dataset corresponds to the beginning of the averaging period (e.g., the hourly means at 13:00 are calculated from the 1 min means from 13:00 to 13:59), which is also in line with the recommendation of WDCGG (WMO, 2012). The times provided to the users are always universal time. The time difference between local time and universal time is provided in the metadata of the station.
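The hierarchical aggregation can be illustrated with the short sketch below, in which each level is computed from the level immediately beneath it and every mean is time-stamped at the beginning of its averaging period. The function names are hypothetical, and the use of the sample standard deviation (left undefined when a single value is available) is an assumption of this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def aggregate(points, truncate):
    """Average (time, value) pairs per period.

    truncate maps a datetime to the start of its averaging period.
    Returns a sorted list of (period_start, mean, std, n).
    """
    bins = defaultdict(list)
    for t, value in points:
        bins[truncate(t)].append(value)
    result = []
    for start in sorted(bins):
        values = bins[start]
        std = stdev(values) if len(values) > 1 else None
        result.append((start, mean(values), std, len(values)))
    return result


def to_minute(t):
    return t.replace(second=0, microsecond=0)


def to_hour(t):
    return t.replace(minute=0, second=0, microsecond=0)


# Synthetic raw data in universal time (values are illustrative only).
raw = [
    (datetime(2015, 4, 6, 13, 0, 0), 400.0),
    (datetime(2015, 4, 6, 13, 0, 30), 402.0),
    (datetime(2015, 4, 6, 13, 59, 10), 404.0),
    (datetime(2015, 4, 6, 14, 0, 5), 410.0),
]
minute_means = aggregate(raw, to_minute)
# Hourly means are built from the 1 min means, not from the raw data.
hourly_means = aggregate([(t, m) for t, m, _, _ in minute_means], to_hour)
```

In this synthetic case the 13:00 hourly mean is the average of two 1 min means, which differs from the average of the three raw values of that hour; the number of data attached to each mean therefore counts the values of the level below.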
Different data output formats can be provided to fit user needs. The files provided to the users always include the following information for each average mole fraction in dry air: time/date of the measurement, site and instrument identifiers, number of data and standard deviation, user flag and an internal identifier tracing all processing parameters (an example can be found in Table 7). In addition, the header of the file provides metadata including the station coordinates, the measurement calibration scale, the name of a contact person and the institute in charge of the monitoring program. More information (raw data, internal flags, etc.) is available upon request to the ATC data center.

Conclusion and perspectives
The provision of atmospheric GHG mole fractions in NRT is useful for early detection of anomalies, whether they are instrumental or geophysical, and for data assimilation schemes. As part of the construction of the ICOS ATC data center, we have developed a framework for fast delivery (24 h) of the atmospheric greenhouse gases dataset. The setup of the hardware and software needed for data collection, data processing, configuration of measurements and quality control of the time series has been carried out over the past years in close collaboration with experimentalists in charge of running stations during the demonstrator phase of ICOS. In the last few years, we moved from a situation in which each European station was performing its own data processing to the ICOS configuration with a central database and a set of software codes processing the raw data transferred from all ICOS sites daily. This configuration ensures better inter-comparison of the data. By analyzing data from 11 sites, we determined that the average calibration corrections applied in the data processing by the ATC equal 1.7 ± 0.3 µmol mol−1 for CO2 and 2.8 ± 3 nmol mol−1 for CH4. These biases are important to correct to avoid artificial gradients between stations that could lead to errors in flux estimates when using atmospheric inversion techniques. Masarie et al. (2011) showed that a 1 µmol mol−1 bias at a measurement tower in Wisconsin induced a response in terms of fluxes of 68 TgC year−1 when using the CarbonTracker inversion system (Peters et al., 2007). This flux represents approximately 10 % of the estimated North American annual terrestrial uptake.
We have also evaluated that the average drift between two calibrations separated by 15 days amounts to ±0.05 µmol mol−1 and ±0.7 nmol mol−1 for CO2 and CH4, respectively. Outliers may occur, which are generally associated with an error in the metadata information provided by the station PI (e.g., error in the attributed value of the calibration gas).
ICOS aims to maintain very high-precision measurements with a high level of data recovery, traceability and fast delivery. Rapid access to processed data and their associated metadata, as well as a catalogue of data products updated daily, is intended to facilitate the verification of the measurements. In 2013, 17.8 GB of data files and data products were viewed by users on the ICOS ATC website (https://icos-atc.lsce.ipsl.fr), which corresponds to more than 17 000 hits and more than 380 000 pages viewed. Traceability of the downloads, long-term archival and data policies, which are beyond the scope of this paper, are being designed in collaboration with the Carbon Portal of ICOS.
Thus far, the NRT dataset has been provided to the participants of the ICOS Preparatory Phase and the following projects: InGOS (http://ingos-atm.lsce.ipsl.fr/), ICOS-INWIRE (http://www.icos-inwire.lsce.ipsl.fr/) and MACC-III/COPERNICUS (http://www.copernicus-atmosphere.eu/d/summary/macc/gac/verif/ghg/icos/). The format of the files provided to the users was adapted to their needs, and the identifier which allows for the traceability of the measurements is part of the compulsory information. The MACC-III project is using the CO2 data in NRT to evaluate their assimilation and forecasting system developed at the European Centre for Medium-Range Weather Forecasts (Agustí-Panareda et al., 2014). In another study, a CH4 inversion was performed to test the ability of the European network of atmospheric observations to detect the leakage of an offshore oil platform at Elgin Field, North Sea (Berchet et al., 2013).
The continuous enhancement of automatic processing is important, and new developments are in progress. This includes the evaluation of spike detection algorithms that would allow the automatic identification of data significantly influenced by local processes. Another perspective is to interface the database with the electronic logbooks of the station operations (maintenance, troubleshooting, etc.), as a support for the quality control of the time series. One important issue is the estimation of time-varying uncertainties based on regular measurements of the target gases, comparison of in situ and flask measurements and analysis of specific tests. Evaluation of algorithms to estimate random and systematic errors was performed by the InGOS and ICOS-INWIRE European projects, and we have started to transfer some of them into the ICOS data processing. Within the ICOS project, research actions are ongoing for a better assessment of the calibration strategy and the water vapor correction, and their associated uncertainties. The outputs of these studies will be implemented later in the data processing to improve the current data corrections and uncertainty estimates.

Data availability
The NRT data of the ICOS atmospheric stations processed as described in the paper will be freely accessible from the ATC website in the near future. Raw data are available upon request to the authors.

Figure 1. Schematic view of ICOS ATC network infrastructure.

Figure 2. Automatic data processing of CO2 and CH4 data at ICOS ATC. We consider three types of data: "in situ" corresponding to ambient air, "target" when a cylinder filled with a reference gas is measured and "calibration" when calibration cylinders are measured.

Figure 4. Average stabilization times (in minutes) estimated to have a difference from the last minute of the target gas measurement of less than ±0.05 µmol mol−1 for CO2 (in red) and ±1 nmol mol−1 for CH4 (in blue). The time is calculated for several instruments indicated on the x axis; the left side of the figure shows short-term target gas measurements, whereas the right side shows the long-term target measurements, which are less frequent.

Figure 5. CO2 (above), CH4 (middle) and H2O (below) mole fractions observed at Mace Head in February 2014 with two CRDS analyzers. Left: analyzer Picarro model G1301 (no. 41) measuring wet air. Right: analyzer Picarro model G2301 (no. 54) measuring dry air. For CO2 and CH4 plots, the blue dashed lines correspond to the raw data, the gray lines correspond to the raw data corrected for water vapor and the thick black line corresponds to the calibrated mole fractions in dry air.

Figure 6. Water vapor corrections (dashed lines) and WMO calibration corrections (thick lines) applied to CO2 (red) and CH4 (blue) mole fractions for two CRDS analyzers used at Mace Head station (above: no. 41 measuring wet air; below: no. 54 measuring dry air).

Figure 7. Synthesis of the water vapor (above) and calibration (below) corrections applied to 11 instruments in 2014 for hourly mean CO2 (left) and CH4 (right) mole fractions. The length of the box represents the interquartile range, the horizontal line represents the median and the low and high whiskers show the 10th and 90th percentiles, respectively. Numbers below the box plots give the maximum and minimum corrections. It should be noted that the calibration corrections depend on the calibration settings of the analyzers.

Figure 8. Details of a CO2 calibration performed at Mace Head station (instrument no. 41) on 10 December 2014. (a) Raw CO2 data measured for four calibration tanks analyzed four times in 20 min. Gray points show the rejected values during the stabilization period (i.e., flushing period). Values indicated on the right give the tank ID and their attributed mole fractions on the WMO X2007 scale. (b) Same as (a) for CO2 mole fraction differences between measured values and attributed WMO values. (c) Same as (b) for 1 min averages. Gray crosses show rejected values due to a standard deviation higher than the threshold value (vertical bar on the left). (d) Cycle (squares) and calibration sequence averages (dashed lines and values on the right). The first cycle is rejected as a stabilization period.

Figure 9. Linear fit of the CO2 calibration detailed in Fig. 8. Coefficients a and b of the fit are shown in bold characters. The lower plot shows CO2 residuals from the linear regression.

Figure 10. CO2 (above) and CH4 (below) mole fraction differences between the validated and the near-real-time values at 11 stations in 2014 (left), and at three stations (Finokalia, FKL; Lamto, LTO; Puy de Dôme, PUY) in June 2014 (right). Most of the differences correspond to the drift between two calibrations, which cannot be considered in real time. Each point corresponds to an hourly average.

Figure 11. Statistics of the validated minus NRT differences of hourly mean CO2 (left) and CH4 (right) mole fractions. Each of the 11 box-and-whisker plots describes the differences for one monitoring station in 2014. The length of the box represents the interquartile range, the horizontal line represents the median and the low and high whiskers show the first and ninth deciles, respectively. The numbers below the box plots give the maximum and minimum differences.

Table 1. List of user flags. The user flag is instrument independent.

Table 2. List of descriptive flags. The descriptive flag is instrument independent and is picked from a predefined list. The flag is case sensitive. Multiple flags (i.e., letters) can be set simultaneously on a single value. There is a list to be used for invalid data and one to be used for valid data.

Table 3. Definition of a measurement sequence. As an example, we show the configuration for the instrument installed at Mace Head station (identified by the three-letter code MHD), Ireland.

Table 4. List of the parameters used for the automatic processing of CO2 and CH4 mole fractions by CRDS analyzers. The humidity filtering applied to the tank measurements consists of checking the absolute difference between the wet value and the computed dry value against the defined threshold. The parameters are specific to the instrument considered.

Table 5. List of internal flags for instrument type CRDS Picarro model G2301. The example provided in the third column corresponds to the configuration of instrument no. 41 at Mace Head station set up on 14 May 2009.

Table 6. Examples of user and internal flags that were attributed to raw data from CRDS Picarro instruments in 2014. The last two columns provide the number of raw data that have been attributed an internal flag or combination of internal flags and the corresponding percentage in the dataset. Most of the data have no internal flag, indicating that there is no anomaly detected.