Algorithm theoretical baseline for formaldehyde retrievals from S5P TROPOMI and from the QA4ECV project

On board the Copernicus Sentinel-5 Precursor (S5P) platform, the TROPOspheric Monitoring Instrument (TROPOMI) is a double-channel, nadir-viewing grating spectrometer measuring solar back-scattered earthshine radiances in the ultraviolet, visible, near-infrared, and shortwave infrared with global daily coverage. In the ultraviolet range, its spectral resolution and radiometric performance are equivalent to those of its predecessor OMI, but its horizontal resolution at true nadir is improved by an order of magnitude. This paper introduces the formaldehyde (HCHO) tropospheric vertical column retrieval algorithm implemented in the S5P operational processor and comprehensively describes its various retrieval steps. Furthermore, algorithmic improvements developed in the framework of the EU FP7project QA4ECV are described for future updates of the processor. Detailed error estimates are discussed in the light of Copernicus user requirements and needs for validation are highlighted. Finally, verification results based on the application of the algorithm to OMI measurements are presented, demonstrating the performances expected for TROPOMI.


Introduction
Long-term satellite observations of tropospheric formaldehyde (HCHO) 1 are essential to support studies related to air quality and chemistry-climate from the regional to the global scale. Formaldehyde is an intermediate gas in almost all oxidation chains of non-methane volatile organic compounds (NMVOCs), leading eventually to CO 2 (Seinfeld and Pandis, 2006). NMVOCs are, together with NO x , CO, and CH 4 , among the most important precursors of tropospheric ozone. NMVOCs also produce secondary organic aerosols and influence the concentrations of OH, the main tropospheric oxidant (Hartmann et al., 2013). The major HCHO source in the remote atmosphere is CH 4 oxidation. Over the continents, the oxidation of higher NMVOCs emitted from vegetation, fires, traffic, and industrial sources results in important and localized enhancements of the HCHO levels (as illustrated in Fig. 1; Stavrakou et al., 2009a). With its lifetime of the order of a few hours, HCHO concentrations in the boundary layer can be related to the release of short-lived hydrocarbons, which mostly cannot be observed directly from space. Furthermore, HCHO observations provide information on the chemical oxidation pro-I. De Smedt et al.: Algorithm theoretical baseline for formaldehyde retrievals cesses in the atmosphere, including CO chemical production from CH 4 and NMVOCs. The seasonal and interannual variations of the formaldehyde distribution are principally related to temperature changes (controlling vegetation emissions) and fire events, but also to changes in anthropogenic activities (Stavrakou et al., 2009b). For all these reasons, HCHO satellite observations are used in combination with tropospheric chemistry transport models to constrain NMVOC emission inventories in so-called top-down inversion approaches (e.g. Abbot et al., 2003;Palmer et al., 2006;Fu et al., 2007;Millet et al., 2008;Stavrakou et al., 2009aStavrakou et al., , b, 2014Stavrakou et al., , 2015Curci et al., 2010;Barkley et al., 2011: Fortems-Cheiney et al., 2012Marais et al., 2012;Mahajan et al., 2015;Kaiser et al., 2017).
HCHO tropospheric columns have been successively retrieved from GOME on ERS-2 and from SCIAMACHY on ENVISAT, resulting in a continuous dataset covering a period of almost 16 years from 1996 until 2012 (Chance et al., 2000;Palmer et al., 2001;Wittrock et al., 2006;Marbach et al., 2009;De Smedt et al., 2008. Starting in 2007, the measurements made by the three GOME-2 instruments (EUMETSAT Metop A, B, and C) have the potential to extend by more than a decade the successful time series of global formaldehyde morning observations (Vrekoussis et al., 2010;De Smedt et al., 2012;Hewson et al., 2013;Hassinen et al., 2016). Since its launch in 2004, OMI on the NASA AURA platform has been providing complementary HCHO measurements in the early afternoon with daily global coverage and a better spatial resolution than current morning sensors (Kurosu, 2008;Millet et al., 2008;González Abad et al., 2015;De Smedt et al., 2015). On the S-NPP spacecraft, OMPS has also allowed the retrieval of HCHO columns since the end of 2011 (Li et al., 2015;González Abad et al., 2016). TROPOMI aims to continue this time series of early afternoon observations, with daily global coverage, a spectral resolution, and signal-to-noise ratio (SNR) equivalent to OMI, but combined with a spatial resolution improved by an order of magnitude, which potentially offers an unprecedented view of the spatiotemporal variability of NMVOC emissions .
To fully exploit the potential of satellite data, applications relying on tropospheric HCHO observations require high-quality, long-term time series, provided with wellcharacterized errors and averaging kernels, and consistently retrieved from the different sensors. Furthermore, as the HCHO observations are aimed to be used synergistically with other species observations (e.g. with NO 2 for air quality applications), it is essential to homogenize the retrieval methods as well as the external databases as much as possible in order to minimize systematic biases between the observations. The design of the TROPOMI HCHO prototype algorithm, developed at BIRA-IASB, has been driven by the experience developed with formaldehyde retrievals from the series of precursor missions OMI, GOME(-2), and SCIAMACHY. Furthermore, within the Sentinel-5 Precur- sor (S5P) Level 2 Working Group project (L2WG), a strong component of verification has been developed involving independent retrieval algorithms for each operational prototype algorithm. For HCHO, the University of Bremen (IUP-UB) has been responsible of the algorithm verification. An extensive comparison of the processing chains of the prototype (the retrieval algorithm presented in this paper) and verification algorithm has been conducted. In parallel, within the EU FP7-project Quality Assurance for Essential Climate Variables (QA4ECV; Lorente et al., 2017), a detailed stepby-step study has been performed for HCHO and NO 2 differential optical absorption spectroscopy (DOAS) retrievals, including more scientific algorithms (BIRA-IASB, IUP-UB, MPIC, KNMI, and WUR), leading to state-of-the-art European products (http://www.qa4ecv.eu). Those iterative processes led to improvements that have been included in the S5P prototype algorithm or have been proposed as options for future improvements of the operational algorithm.
This paper gives a thorough description of the TROPOMI HCHO algorithm baseline, as implemented at the German Aerospace Center (DLR) in the S5P operational processor UPAS-2 (Universal Processor for UV/VIS Atmospheric Spectrometers). It reflects the S5P HCHO Level 2 (L2) Algorithm Theoretical Basis Document v1.0  and also describes the options to be activated after the S5P launch, as implemented for the QA4ECV OMI HCHO retrieval algorithm (see illustration in Fig. 1).
In Sect. 2, we discuss the product requirements and the expected product performance in terms of precision and trueness and provide a complete description of the retrieval algorithm. In Sect. 3, the uncertainty of the retrieved columns and the error budget is presented. Results from the algorithm verification exercise are given in Sect. 4. The possibilities and needs for future validation of the retrieved HCHO data product can be found in Sect. 5. Conclusions are given in Sect. 6.

Product requirements
In the UV, the sensitivity to HCHO concentrations in the boundary layer is intrinsically limited from space due to the combined effect of Rayleigh and Mie scattering that limits the fraction of radiation scattered back from low altitudes and reflected from the surface to the satellite. In addition, ozone absorption reduces the number of photons that reach the lowest atmospheric layers. Furthermore, the absorption signatures of HCHO are weaker than those of other UVvisible absorbers, such as NO 2 . As a result, the retrieval of formaldehyde from space is sensitive to noise and prone to error. While the precision (or random uncertainty) is mainly driven by the SNR of the recorded spectra, the trueness (or systematic uncertainty) is limited by the current knowledge on the external parameters needed in the different retrieval steps.
The requirements for HCHO retrievals have been identified as part of "Science Requirements Document for TROPOMI" (van Weele et al., 2008), the "GMES Sentinels 4 and 5 Mission Requirements Document" (Langen et al., 2011(Langen et al., , 2017, and the S5P Mission Advisory Group "Re-port of the Review of User Requirements for Sentinels-4/-5" (Bovensmann et al., 2011). The requirements for HCHO are summarized in Table 1. Uncertainty requirements include retrieval and measurement (instrument-related) errors. Absolute requirements (in total column units) relate to background conditions, while percentage values relate to elevated columns.
Three main COPERNICUS environmental themes have been defined as ozone layer (A), air quality (B), and climate (C) with further division into sub-themes. Requirements for HCHO have been specified for a number of these sub-themes (B1: Air Quality Protocol Monitoring; B2: Air Quality Near-Real Time; B3: Air Quality Assessment; and C3: Climate Assessment). With respect to air quality protocol monitoring, which is mostly concerned with trend and variability analysis, the requirements are specified for NMVOC emissions on monthly to annual timescales and on larger regional and country scales (Bovensmann et al., 2011). In the error analysis section, we discuss these requirements and the expected performances of the HCHO retrieval algorithm.

Algorithm description
Figure 2 displays a flow diagram of the L2 HCHO retrieval algorithm implemented in the S5P operational processor. The baseline operation flow scheme is based on the DOAS retrieval method (Platt, 1994;Platt and Stutz, 2008;and references therein). It is identical in concept to the retrieval  (Theys et al., 2017) and very close to that of NO 2 ( . The interdependencies with auxiliary data and other L2 retrievals, such as clouds, aerosols, or surface reflectance, are also represented. Following the diagram in Fig. 2, the processing of S5P Level 1b (L1b) data proceeds as follows: radiance and irradiance spectra are read from the L1b file, along with geolocation data such as pixel coordinates and observation geometry (sun and viewing angles). The relevant absorption crosssection data and characteristics of the instrument are used as input for the determination of the HCHO slant columns (N s ). In parallel to the slant column fit, S5P cloud information and absorbing aerosol index (AAI) data are obtained from the operational chain. Additionally, in order to convert the slant column to a vertical column (N v ), an air mass factor (AMF) that accounts for the average light path through the atmosphere is calculated. For this purpose, several auxiliary data are read from external (operational and static) sources: cloud cover data, topographic information, surface albedo, and the a priori shape of the vertical HCHO profile in the atmosphere. The AMF is computed by combining an a priori formaldehyde vertical profile and altitude-resolved AMFs extracted from a pre-computed look-up table (LUT; also used as a basis for the error calculation and retrieval characterization module). This LUT has been created using the VLIDORT 2.6 radiative transfer model (Spurr et al., 2008a) at a single wavelength representative of the retrieval interval. It is used to compute the total column averaging kernels (Eskes and Boersma, 2003), which provide essential information on the measurement vertical sensitivity and are required for comparison with other types of data.
Background normalization of the slant columns is required in the case of weak absorbers such as formaldehyde. Before converting the slant columns into vertical columns, background values of N s are normalized to compensate for possible systematic offsets (reference sector correction, see below). The tropospheric vertical column end product results therefore from a differential column to which the HCHO background is added due to methane oxidation, estimated using a tropospheric chemistry transport model.
The final tropospheric HCHO vertical column is obtained using the following equation: The main outputs of the algorithm are the slant column density (N s ), the tropospheric vertical column (N v ), the tropospheric air mass factor (M), and the values used for the reference sector correction (N s,0 and N v,0 ). Complementary product information includes the clear-sky air mass factor, the uncertainty on the total column, the averaging kernel, and quality flags. Table 13 in Appendix B gives a non-exhaustive set of data fields that are provided in the L2 data product. A complete description of the L2 data format is given in the S5P HCHO Product User Manual . Algorithmic steps are described in more detail in the next sections, and settings are summarized in Table 2, along with algorithmic improvements developed in the framework of the EU FP7-project QA4ECV and proposed for future TROPOMI processor updates. Figure 3 presents examples of monthly averaged HCHO vertical columns over four NMVOC emission regions, along with the background correction values.

Formaldehyde slant column retrieval
The DOAS method relies on the application of the Beer-Lambert law. The backscattered earthshine spectrum as measured by the satellite spectrometer contains the strong solar Fraunhofer lines and additional fainter features due to interactions taking place in the Earth atmosphere during the incoming and outgoing paths of the radiation. The basic idea of the DOAS method is to separate broad and narrowband spectral structures of the absorption spectra in order to isolate the narrow trace gas absorption features. In practice, the application of the DOAS approach to scattered light observations relies on the following key approximations.
For weak absorbers the exponential function can be linearized and the Beer-Lambert law can be applied to the measured radiance to which a large variety of atmospheric light paths contributes.
The absorption cross sections are assumed to be weakly dependent on temperature and independent of pressure. This allows the expression of light attenuation in terms of the Beer-Lambert law and (together with approximation 1) separating spectroscopic retrievals from radiative transfer calculations by introducing the concept of one effective slant column density for the considered wavelength window.
Broadband variations are approximated by a common low-order polynomial to compensate for the effects of loss and gain from scattering and reflections by clouds and air molecules and/or at the Earth surface.
The DOAS equation is obtained by considering the logarithm of the radiance I (λ) and the irradiance E 0 (λ) (or another reference radiance selected in a remote sector) and including all broadband variations in a polynomial function: where the measured optical depth τ meas s is modelled using a highly structured part τ diff s and a broadband variation τ smooth s . Equation (2) is a linear equation between the logarithm of the measured quantities (I and E 0 ), the slant column densities of all relevant absorbers (N s,j ), and the polynomial coefficients (c p ), at multiple wavelengths. DOAS retrievals consist in solving an overdetermined set of linear equations, which can be done by standard methods of linear leastsquares fit (Platt and Stutz, 2008). The fitting process consists in minimizing the chi-square function, i.e. the weighted sum of squares derived from Eq. (3): where the summation is made over the individual spectral pixels included in the selected wavelength range (k is the number of spectral pixels in the fitting interval). ε i is the statistical uncertainty on the measurement at wavelength λ i . Weighting the residuals by the instrumental errors ε i is optional. When no measurement uncertainties are used (or no error estimates are available), all uncertainties in Eq. (4) are set to ε i = 1, giving all measurement points equal weight in the fit. In order to optimize the fitting procedure, additional structured spectral effects have to be considered carefully such as the Ring effect (Grainger and Ring, 1962). Furthermore, the linearity of Eq. (3) may be broken down by instrumental aspects such as small wavelength shifts between I and E 0 .

Fitting intervals, absorption cross sections, and spectral fitting settings
Despite the relatively large abundance of formaldehyde in the atmosphere (of the order of 10 16 molec cm −2 ) and its well-defined absorption bands, the fitting of HCHO slant columns in earthshine radiances is a challenge because of the low optical density of HCHO compared to other UVvisible absorbers. The typical HCHO optical density is 1 order of magnitude smaller than that of NO 2 and 3 orders of magnitude smaller than that for O 3 (see Fig. 4). Therefore, the detection of HCHO is limited by the SNR of the measured radiance spectra and by possible spectral interferences and misfits due to other molecules absorbing in the same fitting interval, mainly ozone, BrO, and O 4 . In general, the correlation between cross sections decreases when the wavelength interval is extended, but the assumption of a single effective light path defined for the entire wavelength interval may not be fully satisfied, leading to systematic misfit effects that may also introduce biases in the retrieved slant columns. To optimize DOAS retrieval settings, a trade-off has to be found minimizing these effects taking also into consideration the instrumental characteristics. A basic limitation of the classical DOAS technique is the assumption that the atmosphere is optically thin in the wavelength region of interest. At shorter wavelengths, the usable spectral range of DOAS is limited by rapidly increasing Rayleigh scattering and O 3 absorption. The DOAS assumptions start to fail for ozone slant columns larger than 1500 DU . Historically, different wavelength intervals have been selected between 325 and 360 nm for the retrieval of HCHO using previous satellite UV spectrometers (e.g. GOME, Chance et al., 2000;SCIAMACHY, Wittrock et al., 2006;or GOME-2, Vrekoussis et al., 2010). The TEMIS dataset combines HCHO observations from GOME, SCIAMACHY, GOME-2, and OMI measurements retrieved in the same interval (De Smedt et al., 2008, 2012. The NASA operational and PCA OMI algorithm exploit a larger interval (Kurosu, 2008;González Abad et al., 2015;Li et al., 2015). The latest QA4ECV product uses the largest interval, thanks to the good quality of the OMI level 1 spectra. A summary of the different wavelength intervals is provided in Table 3.
As for the TEMIS OMI HCHO product , the TROPOMI L2 HCHO retrieval algorithm includes a two-step DOAS retrieval approach, based on two wavelength intervals.
1. 328.5-359 nm: This interval includes six BrO absorption bands and minimizes the correlation with HCHO, allowing a significant reduction of the retrieved slant column noise. Note that this interval includes part of a strong O 4 absorption band around 360 nm, which may introduce geophysical artefacts of HCHO columns over arid soils or high-altitude regions.
2. 328.5-346 nm: in a second step, HCHO columns are retrieved in a shorter interval, but using the BrO slant column values determined in the first step. This approach allows us to efficiently decorrelate BrO from HCHO absorption while, at the same time, the O 4 -related bias is avoided.
The use of a large fitting interval generally allows for a reduction of the noise on the retrieved slant columns. However, a substantial gain can only be obtained if the L1b spectra are of sufficiently homogeneous quality over the full spectral range. Indeed, experience with past sensors not equipped with polarization scramblers (e.g. GOME(-2) or SCIAMACHY) has shown that this gain can be partly or totally overruled due to the impact of interfering spectral polarization structures . Assuming spectra free of spectral features, the QA4ECV baseline option using one single large interval (fitting interval 1) will be applicable to TROPOMI in order to further improve the 3 × 10 16 molec cm −2 for HCHO, 10 1 9 molec cm − 2 for O 3 , 0.4 × 10 4 3 molec 2 cm −5 for O 2 -O 2 , 10 1 4 molec cm −2 for BrO, and 1 × 10 16 molec cm −2 for NO 2 . A ratio of 8 % has been taken for Raman scattering (Ring effect). High-resolution absorption cross sections of Table 2 have been convolved with the TROPOMI ISFR v1.0 (row 1 is shown in red and row 225 in black; see also Fig. 5). The two fitting intervals (−1 and −2) used to retrieve HCHO slant columns are limited by grey areas. Table 3. Wavelength intervals used in previous formaldehyde retrieval studies (nm). GOME SCIAMACHY GOME-2 OMI Chance et al. (2000) 337.5-359 Wittrock et al. (2006) 334-348 Vrekoussis et al. (2010) 337-353 Hewson et al. (2013) 328.5-346 González Abad et al. (2015); Li et al. (2015) 328. 5-356.5 De Smedt et al. (2008, 2012 328. precision. Results of the retrievals from the two intervals applied to OMI are presented in Fig. 3. In this case, vertical column differences between the two intervals are generally lower than 10 %. They can, however, reach 20 % in wintertime. In both intervals, the absorption cross sections of O 3 at 223 and 243 K, NO 2 , BrO, and O 4 are included in the fit. The correction for the Ring effect, defined as I rrs /I elas , where I rrs and I elas are the intensities for inelastic (rrs: rotational Raman scattering) and elastic scattering processes, is based on the technique published by Chance and Spurr (1997). Furthermore, in order to better cope with the strong ozone absorption at wavelengths shorter than 336 nm, the method of Puķīte et al. (2010) is implemented. In this method, the variation of the ozone slant column over the fitting window is taken into account. On the first order, the method consists in adding two cross sections to the fit: λ_ O 3 and σ _ O 3 2 (Puķīte et al., 2010;De Smedt et al., 2012), using the O 3 cross sections at 223 K (close to the temperature at ozone maximum in the tropics). It allows a much better treatment of optically thick ozone absorption in the retrieval and therefore to reduce the systematic underestimation of the HCHO slant columns by 50 to 80 %, for solar zenith angle (SZA) from 50 to 70 • .
To obtain the optical density (Eq. 2), the baseline option is to use the daily solar irradiance. A more advanced option, implemented in QA4ECV, is to use daily averaged radiances, selected for each detector row, in the equatorial Pacific  offsets present in both spectra; (3) possible row-dependent biases (stripes) are greatly reduced by cancellation of small optical mismatches between radiance and irradiance optical channels; and (4) the sensitivity to instrument degradation affecting radiance measurements is reduced because these effects tend to cancel between the analysed spectra and the references that are used. It must be noted, however, that the last three effects can be mitigated when a solar irradiance is used as reference, by means of a post-processing treatment applied as part of the background correction of the slant columns (see Sect. 2.2.3). The option of using an equatorial radiance as reference will be activated in the operational processor after the launch of TROPOMI, during the commissioning phase of the instrument.

Wavelength calibration and convolution to TROPOMI resolution
The quality of the DOAS fit critically depends on the accuracy of the wavelength alignment between the earthshine radiance spectrum, the reference (solar irradiance) spectrum, and the absorption cross sections. The wavelength registration of the reference spectrum can be fine-tuned to an accuracy of a few hundredths of a nanometre by means of a calibration procedure making use of the solar Fraunhofer lines. To this end, a reference solar atlas E s accurate in wavelength to better than 0.01 nm (Chance and Kurucz, 2010) is degraded to the resolution of the instrument, through convolution by the TROPOMI instrumental slit function (see Fig. 5).
Using a non-linear least-squares approach, the shift ( i ) between the TROPOMI irradiance and the reference solar atlas is determined in a set of equally spaced sub-intervals covering a spectral range large enough to encompass all relevant fitting intervals. The shift is derived according to the following equation: where E s is the reference solar spectrum convolved at the resolution of the TROPOMI instrument and i is the shift in sub-interval i. A polynomial is fitted through the individual points to reconstruct an accurate wavelength calibration (λ) over the complete analysis interval. Note that this approach allows compensation for stretch and shift errors in the original wavelength assignment. In the case of TROPOMI (or OMI), the procedure is complicated by the fact that such calibrations must be performed and stored for each separate spectral field on the CCD detector array. Indeed, due to the imperfect characteristics of the imaging optics, each row of the instrument must be considered as a separate detector for analysis purposes.
In a subsequent step of the processing, the absorption cross sections of the different trace gases must be convolved with the instrumental slit functions. The baseline approach is to use slit functions determined as part of the TROPOMI key data. Slit functions, or Instrument Spectral Response Functions (ISRF), are delivered for each binned spectrum and as a function of the wavelength as illustrated in Fig. 5. Note that an additional feature of the prototype algorithm allows us to dynamically fit for an effective slit function of known line shape (Danckaert et al., 2012). This can be used for verification and monitoring purpose during commissioning and later on during the mission. This option is used for the QA4ECV OMI HCHO product.
More specifically, wavelength calibrations are made for each orbit as follows: -The irradiances (one for each binned row of the CCD) are calibrated in wavelength over the 325-360 nm wavelength range, using five sub-windows.
-The earthshine radiances are first interpolated on the original L1 irradiance grid. The irradiance calibrated wavelength grid is assigned to those interpolated radiance values.
-The absorption cross sections are interpolated (cubic spline interpolation) on the calibrated wavelength grid, prior to the analysis.
-In the case where averaged radiances are used as reference, an additional step must be performed: the cross sections are aligned to the reference spectrum by means of shift and stretch values derived from a least-squares fit of the calibrated irradiance towards the averaged reference radiance.
-During spectral fitting, shift and stretch parameters for the radiance are derived to align each radiance with cross sections and reference spectrum.

Spike removal algorithm
A method to remove individual hot pixels or pixels affected by the South Atlantic Anomaly has been presented for NO 2 retrievals in Richter et al. (2011). Often only a few individual detector pixels are affected, and in these cases it is possible to identify and remove the outliers from the fit. However, as the amplitude of the distortion is usually only of the order of a few percent or less, it cannot always be found in the highly structured spectra themselves. Higher sensitivity for spikes can be achieved by analysing the residual of the fit where the contribution of the Fraunhofer lines, scattering, and absorption is already removed. When the residual for a single pixel exceeds the average residual of all pixels by a chosen threshold ratio (the tolerance factor), the pixel is excluded from the analysis in an iterative process. This procedure is repeated until no further outliers are identified or until the maximum number of iterations is reached (here fixed to three). Tests performed with OMI spectra show that a tolerance factor of 5 improves the HCHO fits. This is especially important to handle the sensitivity of 2-D detector arrays to high energy particles. However, this improvement of the algorithm has a non-negligible impact on the time of processing (×1.8). This option is activated in the QA4ECV algorithm and will be activated in the TROPOMI operational algorithm in the next update of the processor.

Tropospheric air mass factor
In the DOAS approach, an optically thin atmosphere is assumed. The mean optical path of scattered photons can therefore be considered as independent of the wavelength within the relatively small spectral interval selected for the fit. One can therefore define a single effective AMF (M) given by the ratio of the slant to the vertical optical depth of a particular absorber j : In the troposphere, scattering by air molecules, clouds, and aerosols leads to complex light paths and therefore complex altitude-dependent air mass factors. Full multiple scattering calculations are required for the determination of the air mass factors, and the vertical distribution of the absorber has to be assumed a priori. For optically thin absorbers, the formulation of Palmer et al. (2001) is conveniently used. It decouples the height-dependent measurement sensitivity from the vertical profile shape of the species of interest, so that the tropospheric AMF can be expressed as the sum of the altitudedependent AMFs (m l ) weighted by the partial columns (n al ) of the a priori vertical profile in each vertical layer l, from the surface up to the tropopause index (lt): where A s is the surface albedo, p s is the surface pressure, and f c , A cloud , and p cloud are, respectively, the cloud fraction, cloud albedo, and cloud top pressure. The altitude-dependent AMFs represent the sensitivity of the slant column to a change of the partial columns N v,j at a certain level. In a scattering atmosphere, m l depends on the wavelength, the viewing angles, the surface albedo, and the surface pressure, but not on the partial column amounts or the vertical distribution of the considered absorber (optically thin approximation).

LUT of altitude-dependent air mass factors
Generally speaking, m depends on the wavelength, as scattering and absorption processes vary with wavelength. However, in the case of HCHO, the amplitude of the M variation is found to be small (less than 5 % for SZA lower than 70 • ) in the 328.5-346 nm fitting window and a single AMF representative for the entire wavelength interval is used at 340 nm . Figure 6 illustrates the dependency of m with the observation angles, i.e. θ 0 (a), θ (b), and ϕ (c), and with scene conditions like A s (d) and p s for a weakly (e) or highly reflecting surface (f) (symbols in Table 4). The decrease of sensitivity in the boundary layer is more important for large solar zenith angles and wide instrumental viewing zenith angles. The relative azimuth angle does have relatively less impact on the measurement sensitivity (note, however, that aerosols and BRDF effects are not included in those simulations). In the UV, surfaces not covered with snow have an albedo lower than 0.1, while snow and clouds generally present larger albedos. For a weakly reflecting surface, the sensitivity decreases near the ground because photons are mainly scattered, and scattering can take place at varying altitudes. Larger values of the surface albedo increase the fraction of reflected compared to scattered photons, increasing measurement sensitivity to tropospheric absorbers near the  5.25, 5.75, 6.25, 6.75, 7.25, 7.75, 8.25, 8.75, 9.25, 9.75, 10.50, 11.50, 12.50, 13.50, 14.50, 16.00, 18.00, 20.00, 22.00, 24.00, 27.50, 32.50, 37.50, 42.50, 47.50, 55.00, 65.00, 75.00, 85.00, 95.00 z l surface. Over snow or ice also multiple scattering can play an important role further increasing the sensitivity close to the surface. Altitude-dependent AMFs are calculated with the VLI-DORT v2.6 radiative transfer model (Spurr, 2008a), at 340 nm, using an US standard atmosphere, for a number of representative viewing geometries, surface albedos, and surface pressures (used both for ground and cloud surface pressures), and stored in a LUT. Altitude-dependent AMFs are then interpolated within the LUT for each particular observation condition and interpolated vertically on the pressure grid of the a priori profile, defined within the TM5-MP model (Williams et al., 2017). Linear interpolations are performed in cos(θ 0 ), cos(θ ), relative azimuth angle, and surface albedo, while a nearest neighbour interpolation is performed in surface pressure. The parameter values chosen for the LUT are detailed in Table 4. In particular, the grid of surface pressure is very thin near the ground in order to minimize interpolation errors caused by the generally low albedo of ground surfaces. Indeed, as illustrated by Fig. 6e and f, the variation of the altitude-dependent AMFs is more discontinuous with surface elevation (low reflectivity) than with cloud altitude (high reflectivity). Furthermore, the LUT and model pressures are scaled to their respective surface pressures in order to avoid extrapolations outside the LUT range.

Treatment of partly cloudy scenes
The AMF calculations for TROPOMI will use the cloud fraction (f c ), cloud albedo (A cloud ), and cloud pressure (p cloud ) from the S5P operational cloud retrieval, treating clouds as Lambertian reflectors (OCRA/ROCINN-CRB; Loyola et al., 2018). The applied cloud correction is based on the independent pixel approximation (Martin et al., 2002;  al., 2004), in which an inhomogeneous satellite pixel is considered as a linear combination of two independent homogeneous scenes, one completely clear and the other completely cloudy. The intensity measured by the instrument for the entire scene is decomposed into the contributions from the clear-sky and cloudy fractions. Accordingly, for each vertical layer, the altitude-dependent AMF of a partly cloudy scene is a combination of two AMFs, calculated, respectively, for the cloud-free and cloudy fractions of the scene: where m l_clear is the altitude-dependent air mass factor for a completely cloud-free pixel, m l_cloud is the altitudedependent air mass factor for a completely cloudy scene, and the cloud radiance fraction w c is defined as I clear and I cloud are, respectively, the radiance intensities for clear-sky and cloudy scenes whose values are calculated with VLIDORT at 340 nm and stored in LUTs with the same grids as the altitude-dependent air mass factors. m l_clear and I clear are evaluated for a surface albedo A s and a surface pressure p s , while m l_cloud and I cloud are estimated for a cloud albedo A cloud and at the cloud pressure p cloud . Note that the variations of the cloud albedo are directly related to the cloud optical thickness. Strictly speaking, in a Lambertian (reflec-tive) cloud model approach, only thick clouds can be represented (one should keep in mind that still the penetration of photons into the cloud is not covered by the Lambertian model). An effective cloud fraction corresponding to an effective cloud albedo of 0.8 (f eff = f c A c 0.8 ) can be defined in order to transform optically thin clouds into equivalent optically thick clouds of reduced horizontal extent. In such altitude-dependent air mass factor calculations, a single cloud top pressure is assumed within a given viewing scene. For low effective cloud fractions (f eff lower than 10 %), the cloud top pressure retrieval is generally highly unstable and it is therefore reasonable to consider the observation as a clear-sky pixel (i.e. the cloud fraction is set to 0) in order to avoid unnecessary error propagation through the retrievals. This 10 % threshold might be adjusted according to the quality of the cloud product Loyola et al., 2018).
It should be noted that this formulation of the altitudedependent AMF for a partly cloudy pixel implicitly includes a correction for the HCHO column lying below the cloud and therefore not seen by the satellite, the so-called "ghost column". Indeed, the total AMF calculation as expressed by Eqs. (7) and (8) assumes the same a priori vertical profile in both cloudy and clear parts of the pixel and implies an integration of the profile from the top of atmosphere to the ground, for each fraction of the scene. The ghost column information is thus coming from the a priori profiles.
For this reason, observations with cloud fractions f eff larger than 30 % are assigned with a poor quality flag and have to be used with caution.

Aerosols
The presence of aerosol in the observed scene may affect the quality of the retrieval. No explicit treatment of aerosols (absorbing or not) is foreseen in the operational algorithm as there is no general and easy way to treat the aerosols effect on the retrieval. At computing time, the aerosol parameters (extinction profile, single scattering albedo, etc.) are unknown. However, the information on the AAI (Stein Zweers et al., 2016) will be included in the L2 HCHO files as it gives information to the user on the presence of absorbing aerosols and the affected data should be used and interpreted with care.

A priori vertical profile shapes
Formaldehyde concentrations decrease with altitude as a result of the near-surface sources of short-lived NMVOC precursors, the temperature dependence of CH 4 oxidation, and the altitude dependence of photolysis. The profile shape varies according to local NMHC sources, boundary layer depth, photochemical activity, and other factors.
To resolve this variability in the TROPOMI near-real-time (NRT) HCHO product, daily forecasts calculated with the TM5-MP chemical transport model (Huijnen et al., 2010;Williams et al., 2017) will be used to specify the vertical profile shape of the HCHO distribution. TM5-MP will also provide a priori profile shapes for the NO 2 , SO 2 , and CO retrievals. For the QA4ECV OMI products, high-resolution TM5-MP model runs were performed for the period 2004-2016, and the model profiles from this run are used for both HCHO and NO 2 retrievals.
TM5-MP is operated with a spatial resolution of 1 • × 1 • in latitude and longitude and with 34 sigma pressure levels up to 0.1 hPa in the vertical direction. TM5-MP uses 3-hourly meteorological fields from the European Centre for Medium Range Weather Forecast (ECMWF) operational model (ERA-Interim reanalysis data for reprocessing, and the operational archive for real-time applications and forecasts). These fields include global distributions of wind, temperature, surface pressure, humidity, (liquid and ice) water content, and precipitation.
For the calculation of the HCHO air mass factors, the profiles are linearly interpolated in space and time, at pixel centre and local overpass time, through a model time step of 30 min. To reduce the errors associated to topography and the lower spatial resolution of the model compared to the TROPOMI 3.5 × 7 km 2 spatial resolution, the a priori profiles need to be rescaled to effective surface elevation of the satellite pixel. Following Zhou et al. (2009) and Boersma et al (2011), the TM5-MP surface pressure is converted by applying the hypsometric equation and the assumption that the temperature changes linearly with height: where p s,TM5 and T TM5 are the TM5-MP surface pressure and temperature, = 0.0065 K m −1 the lapse rate, z TM5 the TM5-MP terrain height, and z s surface elevation for the satellite ground pixel from a digital elevation map at high resolution. R = 287 J kg −1 K −1 is the gas constant for dry air, and g = 9.8 ms −2 is the gravitational acceleration. The pressure levels for the a priori HCHO profiles are based on the improved surface pressure level p s : p l = a l + b l p s , a l and b l being the constants that effectively define the vertical coordinate (Table 13).
Yearly averaged OMI air mass factors obtained using prior information summarized in Table 5, in particular TM5-MP HCHO profiles, are presented in Fig. 7 in order to give an overview of the tropospheric AMF values and their global regional variations.

Across-track and zonal reference sector correction
Residual latitude-dependent biases in the columns, due to unresolved spectral interferences, are known to remain a limiting factor for the retrieval of weak absorbers such as HCHO.
Retrieved HCHO slant columns can present large offsets depending on minor changes in the fit settings and minor instrumental spectral inaccuracies. Resulting offsets are generally global but also show particular dependencies, mainly with detector row (across track) and with latitude (along track). In the case of a 2-D-detector array such as OMI or TROPOMI, across-track striping can possibly arise due to imperfect calibration and different dead/hot pixel masks for the CCD detector regions. Offset corrections are also meant to handle some effects of the time-dependent degradation of the instrument.
A large part of the resulting systematic HCHO slant column uncertainty is reduced by the application of a background correction, which is based on the assumption that the background HCHO column observed over remote oceanic regions (Pacific Ocean) is only due to methane oxidation. The natural background level of HCHO is well estimated from chemistry model simulations of CH 4 oxidation (N v,0,CTM ). It ranges from 2 to 4 × 10 15 molec cm −2 , depending on the latitude and the season (De Smedt et al., 2008González Abad et al., 2015).
For the HCHO retrieval algorithm, we use a two-step normalization of the slant columns (see Fig. 8 and Table 6).  Table 6). Those mean HCHO values are subtracted from all the slant columns of the same day, as a function of the row. The aim is to reduce possible row-dependent offsets. In the case were solar irradiance are used as reference, those offsets can exceed 2 × 10 16 molec cm −2 (see the first panel of Fig. 8). They are reduced below 10 15 molec cm −2 by this first step or when row averaged radiances are used as reference, as in the QA4ECV algorithm (middle panel of Fig. 8).
-Along track. These two corrections are applied to the global slant columns so that in the reference sector, the mean background corrected slant columns ( N s = N s − N s,0 ) are centred around zero (lower panel of Fig. 8).
To the corrected slant columns, the background HCHO values from a model have to be added. A latitude-dependent polynomial is fitted daily through 5 • latitude bin means of those modelled values in the reference sector. Corresponding values are added to all the columns of the day. Strictly speaking, those background values should be slant columns, derived as the product of AMFs in the reference sector (M 0 ) with HCHO vertical columns from the model (N s,0,CTM = M 0 N v,0,CTM ) (González Abad et al., 2015). However, this option requires the storage of the slant columns, the AMFs, and their errors in a separate database (QA4ECV algorithm and S5P option; see Eq. 11). An approximate solution is to add as background the constant vertical column from the model (N v,0,CTM ), thus neglecting the variability of the M 0 /M ratio. This is the current implementation in the S5P algorithm, which will be updated with Eq. (11) after launch. For NRT purposes, the evaluation in the reference sector is made using a moving time window of 1 week. For offline processing, the reference sector correction can be refined by using daily evaluations. Figure 3 presents some examples of monthly and regionally averaged vertical columns, together with the contribution of N v,0 . It should be realized that this contribution accounts for 20 to 50 % of the vertical columns, as expected from the large contribution of methane oxidation to the total HCHO column .

Uncertainty formulation by uncertainty propagation
The total uncertainty on the HCHO vertical column is composed of many sources of (random and systematic) errors. In part those are related to the measuring instrument, such as errors due to noise or knowledge of the slit function. In a DOAS-type algorithm, those instrumental errors propagate into the uncertainty of the slant columns. Other types of error can be considered as model errors and are related to the representation of the observation physical properties that are not measured. Examples of model errors are errors on the trace gas absorption cross sections, the treatment of clouds and errors of the a priori profiles. Model errors can affect the slant columns, the AMFs, or the applied background corrections. A formulation of the uncertainty can be derived analytically by uncertainty propagation, starting from the equation of the vertical column (Eq. 11), which directly results from the different retrieval steps. As the main algorithm steps are performed independently, they are assumed to be uncorrelated. The total uncertainty on the tropospheric vertical column can be expressed as (Boersma et al., 2004;De Smedt et al., 2008) where σ N,s , σ M , σ N,s,0 , σ M,0 , and σ N,v,0,CTM are, respectively, the uncertainties on the slant column, the air mass factor, the slant column correction, the air mass factor, and the model vertical column in the reference sector (indicated by suffix 0). For each of these categories, the following sections provide more details on the implementation of the uncertainty estimate in the HCHO algorithm. A discussion of the sources of uncertainties and, where possible, their estimated size are presented, as well as their spatial and temporal patterns.
Note that in the current implementation of the operational processor, M 0 = M, and the uncertainty formulation therefore reduces to Complementing this uncertainty propagation analysis, total column averaging kernels (A) based on the formulation of Eskes and Boersma (2003) are estimated. Column averaging kernels provide essential information when comparing measured columns with e.g. model simulations or correlative validation datasets, because they allow removal of the effect of the a priori HCHO profile shape used in the retrieval (see Appendix C; Boersma et al., 2004Boersma et al., , 2016. Section 3.2 presents our current estimates of the precision (random uncertainty) and the trueness (systematic uncertainty) that can be expected for the TROPOMI HCHO vertical columns. They are discussed along with the product requirements (Sect. 2.1).

Errors on the slant columns
Error sources that contribute to the total uncertainty on the slant column originate both from instrument characteristics and from errors in the DOAS slant column fitting procedure itself.
The retrieval noise for individual observations is limited by the SNR of the spectrometer measurements. A good estimate of the random variance of the reflectance (which results from the combined noise of radiance and reference spectra) is given by the reduced χ 2 of the fit, which is defined as the where k is the number of spectral pixels in the fitting interval, n is the number of parameters to fit, and the matrix A(j xk) is formed by the cross sections. For each absorber j , the value σ N,s,j is usually called the slant column error (SCE or σ N,s,rand ).
Equation (16)   columns. The systematic contributions to the slant column errors are empirically estimated from sensitivity tests (see Table 7) and can be viewed as part of the structural uncertainty . However, remaining systematic offsets and zonal biases are greatly reduced by the reference sector correction. All effects summed in quadrature, the various contributions are estimated to account for an additional systematic uncertainty of 20 % of the background-corrected slant column: The total uncertainty on slant columns is then

Errors on air mass factors
The uncertainties on the AMF depend on input parameter uncertainties and on the sensitivity of the AMF to each of them. This contribution is broken down into the squared sum (Boersma et al., 2004, De Smedt et al., 2008: The contribution of each parameter to the total air mass factor error depends on the observation conditions. The air mass factor sensitivities (M = ∂M ∂parameter ), i.e. the air mass factor derivatives with respect to the different input parameters, can be derived for any particular condition of observation using the altitude-dependent AMF LUT and using the model profile shapes (see Fig. 9). In practice, a LUT of AMF sensitivities has been created using coarser grids than the AMF LUT and one parameter describing the shape of the profile: the profile height, i.e. the altitude (pressure) below which resides 75 % of the integrated HCHO profile. ∂M ∂s is approached by ∂M ∂s h , where s h is half of the profile height. Relatively small variations of this parameter have a strong impact on the total air mass factors, because altitude-resolved air mass factors decrease quickly in the lower troposphere, where the HCHO profiles peak (Fig. 6).
The uncertainties σ A,s , σ f,c , σ p,cloud , and σ s,h are typical uncertainties on the surface albedo, cloud fraction, cloud top pressure, and profile shape, respectively. They are estimated from the literature or derived from comparisons with independent data (see Table 8). Together with the sensitivity coefficients, these give the first four contributions on the right of Eq. (19). The fifth term on the right of Eq. (19) represents the uncertainty contribution due to possible errors in the AMF model itself . We estimate this contribution to 20 % of the AMF (see also Sect. 3.2.2).
Estimates of the AMF uncertainties and of their impact on the vertical column uncertainties are listed in Table 8 and represented in Fig. 10. They are based on the application of Eq. (19) to HCHO columns retrieved from OMI measurements. In Eq. (19), the impact of possible correlations between errors on parameters is not considered, such as the  surface albedo and the cloud top pressure. Note also that errors on the solar angles, the viewing angles, and the surface pressure are supposed to be negligible, which is not totally true in practice, since Eq. (10) does not yield the true surface pressure but only a good approximation.

Surface albedo
A reasonable uncertainty on the albedo is 0.02 (Kleipool et al., 2008). This translates to an uncertainty on the AMF using the slope of the AMF as a function of the albedo and can be evaluated for each satellite pixel (Eq. 19). As an illustration, Fig. 9a shows the AMF dependence on the ground albedo for two typical HCHO profile shapes (remote profile I. De Smedt et al.: Algorithm theoretical baseline for formaldehyde retrievals in blue, emission profile in red). At 340 nm, the AMF sensitivity (the slope), is almost constant with albedo, being only slightly higher for low albedo values. As expected, the AMF sensitivity to albedo is higher for an emission profile peaking near the surface than for a background profile more spread in altitude. More substantial errors can be introduced if the real albedo differs considerably from what is expected, for example in the case of the sudden snowfall or ice cover. A snow and ice cover map will therefore be used for flagging such cases.

Clouds and aerosols
An uncertainty on the cloud fraction of 0.05 is considered, while an uncertainty on the cloud top pressure of 50 hPa is taken. Figure 9b shows the AMF variation with cloud altitude. The AMF is very sensitive to the cloud top pressure (the slope is steepest) when the cloud is located below or at the level of the formaldehyde peak. For higher clouds, the sensitivity of the AMF to any change in cloud pressure is very weak. As illustrated in Fig. 9c-e, for which a cloud top pressure of 966, 795, and 540 hPa is, respectively, considered, the sensitivity to the cloud fraction is mostly significant when the cloud lies below the HCHO layer.
The effect of aerosols on the AMFs are not explicitly considered in the HCHO retrieval algorithm. To a large extent, however, the effect of the non-absorbing part of the aerosol extinction is implicitly included in the cloud correction (Boersma et al., 2011). Indeed, in the presence of aerosols, the cloud detection algorithm is expected to overestimate the cloud fraction. Since non-absorbing aerosols and clouds have similar effects on the radiation in the UV-visible range, the omission of aerosols is partly compensated by the overestimation of the cloud fraction, and the resulting error on AMF is small, typically below 15 % (Millet et al., 2006;Boersma et al., 2011;Lin et al., 2014;Castellanos et al., 2015;Chimot et al., 2016). In some cases, however, the effect of clouds and aerosols will be different. For example, when the cloud height is significantly above the aerosol layer, clouds will have a shielding effect while the aerosol amplifies the signal through multiple scattering. This will result in an underestimation of the AMF. Absorbing aerosols have also a different effect on the AMFs, since they tend to decrease the sensitivity to HCHO concentration. In this case, the resulting error on the AMF can be as high as 30 % (Palmer et al., 2001;Martin et al., 2002). This may, for example, affect significantly the derivation of HCHO columns in regions dominated by biomass burning and over heavily industrialized regions. Shielding and reflecting effect can thus occur, depending on the observation, decreasing or increasing the sensitivity to trace gas absorption. It has been shown that uncertainties related to aerosols are reduced by spatiotemporal averaging Lin et al., 2014;Castellanos et al., 2015;Chimot et al., 2016). Furthermore, the applied cloud filtering effectively removes observations with the largest aerosol optical depth. In the HCHO product, observations with an elevated absorbing aerosol index will be flagged, to be used with caution.

Profile shape
This contribution to the total AMF error is the largest when considering monthly averaged observations. This is supported by validation results using Multi-Axis DOAS (MAX-DOAS) profiles measured around Beijing and Wuxi (see De Wang et al., 2016). Taking into account the averaging kernels allows removal of the error related to the a priori profiles from the comparison, when validating the results against other modelled or measured profiles (see Appendix C).

Errors on the reference sector correction
This uncertainty includes contributions from the model background vertical column (see the recent study of Anderson et al., 2017), from the error on the air mass factor in the reference sector, and from the amplitude of the normalization applied to the HCHO columns. As mentioned in Sect. 3.1.1, we consider that σ N,s,0 is taken into account in Eq. (17). The uncertainty on the air mass factor in the reference sector σ M,0 is calculated as in Eq. (19) and saved during the background correction step. Uncertainty on the model background has been estimated as the absolute values of the monthly averaged differences between two different CTM simulations in the reference sector: IMAGES (Stavrakou et al., 2009a) and TM5-MP (Huijnen et al., 2010). The differences range between 0.5 and 1.5 × 10 15 molec cm −2 .

HCHO error estimates and product requirements
This section presents estimates of the precision (random error) and trueness (systematic error) that can be expected for the TROPOMI HCHO vertical columns. These estimates are given in different NMVOC emission regions. Precision and trueness of the HCHO product are discussed against the user requirements.

Precision
When considering individual pixels, the total uncertainty is dominated by the random error on the slant columns. Our simulations and tests on real satellite measurements show that the precision by which the HCHO can be measured is well defined by the instrument signal-to-noise level. For the nominal SNR level (1000), the expected precision of singlepixel measurements is equivalent to the precision obtained with OMI HCHO retrievals , but with a ground pixel size of about 3.5 × 7 km 2 , i.e. 1 order of magnitude smaller in surface. Absolute σ N,s,rand values typically Figure 11. Estimated precision on the TROPOMI HCHO columns, in several NMVOC emission regions and at different spatial and temporal scales (from individual pixels to monthly averages in 20 × 20 km 2 grids). These estimated are based on OMI observations in 2005, using observations with an effective cloud fraction lower than 40 %. range between 7 and 12 × 10 15 molec cm −2 for individual pixels, showing an increase as a function of the surface altitude and of the solar zenith angle. Relative values range between 100 and 300 %, depending on the observation scene. In the case of HCHO retrievals, for individual satellite ground pixels, the random error on the slant columns is the most important source of uncertainty on the total vertical column. It can be reduced by averaging the observations, but of course at the expense of a loss in time and/or spatial resolution.
The precision of the vertical columns provided in the L2 files corresponds to the precision of the slant column divided by the AMF σ N,s,rand = σ N,s,rand M (see Table 13). It is dependent on the AMFs and, therefore, on the observation conditions and on the cloud statistics. Figure 11 shows the vertical column precision that is expected for TROPOMI, based on OMI observations in 2005. Results are shown in several regions and at different spatial and temporal scales (from individual pixels to monthly averaged column in 20 × 20 km 2 grids). The product requirements for HCHO measurements Table 9. Estimated errors on the reference sector correction.

Error source
Uncertainty on HCHO VCD Evaluation method -reference Model background 0.5 to 1.5 × 10 15 molec cm −2 Difference between IMAGES and TM model Amplitude of the column normalization (N s,0 ) 0 to 4 × 10 15 molec cm −2 Sensitivity tests using GOME-2 and OMI data. Figure 13. Correlation (a), slope (b), and offset (c) from a linear regression performed for the common fit settings (see Table 11) for each orbit of OMI test days. A correlation plot for an example orbit is provided in the left panel of Fig. 14. state a precision of 1.3 × 10 15 molec cm −2 . This particular requirement cannot be achieved with individual observations at full spatial resolution. However, as represented in Fig. 11, the requirement can be approached using daily observations at the spatial resolution of 20 × 20 km 2 (close to the OMI resolution) or using monthly averaged columns at the TROPOMI resolution. The precision can be brought below 1 × 10 15 molec cm −2 if a spatial resolution of 20 × 20 km 2 is considered for monthly averaged columns.

Trueness
In this section, we present monthly averaged values of the systematic vertical columns uncertainties estimated for OMI retrievals between 2005 and 2014. The contribution of the AMF uncertainties is the largest contribution to the vertical column systematic uncertainties (see also Table 10). Figure 12 presents the VCD uncertainties due to AMF errors and the five considered contributions, over equatorial Africa and northern China, as example of tropical and mid-latitude sites. The largest contributions are from the a priori profile uncertainty and from the structural uncertainty (taken as 20 % of the AMF). In the case where the satellite averaging kernels are used for comparisons with external HCHO columns, the a priori profile contribution can be removed from the comparison uncertainty budget, leading to a total uncertainty in the range of 25 to 50 %. Table 10 wraps up the estimated relative contributions to the HCHO vertical column uncertainty in the case of monthly averaged columns for typical low and high columns. Considering these estimates of the HCHO column trueness, the requirements for HCHO product (30 %) are achievable in regions of high emissions and for certain times of the year. In any case, observations need to be averaged to reduce random uncertainties at a level comparable or smaller than systematic uncertainties.

Verification
In the framework of the TROPOMI L2 WG and QA4ECV projects, extensive comparisons of the prototype (this paper), the verification (IUP-UB), and alternative scientific algorithms (MPIC, KNMI, WUR) have been conducted. All follow a common DOAS approach. Prototype and verification algorithms have been applied to both synthetic and OMI spectra. Here, we present a selection of OMI results. For a complete description of the verification algorithm as well as results and discussion of the retrievals applied to synthetic spectra, please refer to the TROPOMI verification report (Richter et al., 2015).

Harmonized DOAS fit settings using OMI test data
For this exercise, a common set of DOAS fit parameters has been agreed upon. The goal of the intercomparison of harmonized fit settings was to ensure that the software implementation of the different algorithms behaves as expected in a large  range of realistic measurement scenarios. Another objective was to gain knowledge of the level of agreement or disagreement of results from different groups when using the same settings as well as of the main drivers for differences. Common and simple fit parameters based on the operational and verification algorithm were selected. They are summarized in Table 11. The intercomparison of results using common settings allowed us to identify and fix several issues in the different codes, leading to an overall consolidation of the algorithms. It has been found that minor changes in the fit settings may lead to large offsets (±10 × 10 15 molec cm −2 ) in the HCHO SCDs. However, an excellent level of agreement (±2 × 10 15 molec cm −2 ) between the different retrieval codes was obtained after several iterations of the common settings. The main sources of discrepancies were found to be related to (1) the solar I 0 correction applied on the O 3 cross sections, (2) the intensity offset correction, (3) the details of the wavelength calibration of the radiance and irradiance spectra, and (4) the OMI slit functions and their implementation in the convolution tools .
An overview of the final SCD comparison is shown on Fig. 13 for 6 test days at the beginning and the end of the OMI time series and for a particular OMI orbit on the left panel of Fig. 14. The correlation coefficient, slope, and offset of linear regression fits performed on each comparison orbit are displayed. The correlation of slant columns from BIRA and IUP-UB is extremely high in most cases. It is > 0.998 for all orbits. The slope of the regression line between BIRA and IUP-UB results is close to 1.0. There is a constant offset of less than 1 × 10 15 molec cm −2 . The comparison between MPIC results and the two other algorithms gives somehow  Table 4  Ring effect Ring cross section based on the technique outlined by Chance and Spurr (1997)

Slit function
One slit function per binned spectrum as a function of wavelength (60 OMI ISRF; Dirksen et al., 2006). Polynomial Fifth order Intensity offset correction Linear offset (1 / I 0 ) Reference spectrum I 0 Daily solar irradiance lower correlations, but still larger than 0.98 from the beginning to the end of the OMI lifetime. Final deviations on OMI HCHO SCD when using common settings were found to be of maximum ±2 % (slope) and 2.5 × 10 15 molec cm −2 . When relating the remaining differences in retrieved SCDs using common settings to the slant column errors from the DOAS fit (σ N,s,rand ), it can be concluded that the differences between the results are significantly smaller than the uncertainties (from 10 to 20 % of σ N,s,rand ). Moreover, remaining offsets in SCDs are further reduced by the background correction procedure.

Verification of the operational implementation
A similar intercomparison exercise was performed with the operational algorithm UPAS, developed at DLR, but using the exact settings of the prototype algorithm as detailed in Table 2. An example of resulting correlation fit is shown in the right panel of Fig. 14 for the same OMI orbit as for the comparison with the IUP-UB results. The level of agreement between the prototype and operational results is found to be almost perfect (correlation coefficient of 1, slope of 1.003 and offset of less than 0.2 × 10 15 molec cm −2 ) and thus very satisfactory considering the sensitivity on small implementation changes.

Validation
Independent validation activities are proposed and planned by the S5P Validation Team (Fehr, 2016) and within the ESA S5P Mission Performance Center (MPC). The backbone of the formaldehyde validation is the MAX-DOAS and FTIR networks operated as part of the Network for the Detection of Atmospheric Composition Change (NDACC, http://www.ndsc.ncep.noaa.gov/) complemented by Pandonia (http://pandonia.net/) and national activities. In addition, model datasets will be used for validation as well as independent satellite retrievals. Finally, airborne campaigns are planned to support the formaldehyde and other trace gases validation.

Requirements for validation
To validate the TROPOMI formaldehyde data products, comparisons with independent sources of HCHO measurements are required. This includes comparisons with ground-based measurements, aircraft observations, and satellite datasets from independent sensors and algorithms. Moreover, not only information on the total (tropospheric) HCHO column is needed but also information on its vertical distribution, especially in the lowest 3 km where the bulk of formaldehyde generally resides. In this altitude range, the a priori vertical profile shapes have the largest systematic impact on the satellite column errors. HCHO and aerosol profile measurements are therefore needed.
The diversity of the NMVOC species, lifetimes, and sources (biogenic, biomass burning, or anthropogenic) calls for validation data in a large range of locations worldwide (tropical, temperate and boreal forests, urban and suburban areas). Continuous measurements are needed to obtain good statistics (both for ground-based measurements and for satellite columns) and to capture the seasonal variations. Validation and assessment of consistency with historical satellite datasets require additional information on the HCHO diurnal variation, which depends on the precursor emissions and the local chemical regime.
The main emphasis is on quality assessment of retrieved HCHO column amounts on a global scale and over long time periods. The validation exercise will establish whether HCHO data quality meets the requirements of geophysical research applications like long-term trend monitoring on the global scale, NMVOC source inversion, and research on the budget of tropospheric ozone. In addition, the validation will investigate the consistency between TROPOMI HCHO data and HCHO data records from other satellites. Table 12 summarizes the type of data and measurements that can be used for the validation of the TROPOMI HCHO columns. The advantages and limitations of each technique are discussed. It should be noted that, unlike tropospheric O 3 or NO 2 , the stratospheric contribution to the total HCHO column can be largely neglected, which simplifies the interpretation of both satellite and ground-based measurements. The MAX-DOAS measurement technique has been developed to retrieve stratospheric and tropospheric trace gas total columns and profiles. The most recent generation of MAX-DOAS instruments allows for measurement of aerosols and a number of tropospheric pollutants, such as NO 2 , HCHO, SO 2 , O 4 , and CHOCHO (e.g. Irie et al., 2011). With the development of operational networks such as Pandonia, it is anticipated that many more MAX-DOAS instruments will become available in the near future to extend validation activities in other areas where HCHO emissions are significant. The locations where HCHO measurements are required are reviewed in the next section. Previous comparisons between GOME-2 and OMI HCHO monthly averaged columns with MAX-DOAS measurements recorded by BIRA-IASB in the Beijing city centre and in the suburban site of Xianghe showed that the systematic differences between the satellite and ground-based HCHO columns (about 20 to 40 %) are almost completely explained when taking into account the vertical averaging kernels of the satellite observations Wang et al., 2017), showing the importance of validating the a priori profiles as well.

Reference measurement techniques
HCHO columns can also be retrieved from the ground using FTIR spectrometers. In contrast to MAX-DOAS systems which essentially probe the first 2 km of the atmosphere, FTIR instruments display a strong sensitivity higher up in the free troposphere and are thus complementary to MAX-DOAS (Vigouroux et al., 2009). The deployment of FTIR instruments of relevance for HCHO is mostly taking place within the NDACC network. Within the project NIDFOR-Val (S5P Nitrogen Dioxide and Formaldehyde Validation using NDACC and complementary FTIR and UV-visible networks), the number of FTIR stations providing HCHO time series has been raised from only 4 (Vigouroux et al., 2009;Jones et al., 2009;Viatte et al., 2014;Franco et al., 2015) to 21. These stations are covering a wide range of HCHO concentrations, from clean Arctic or oceanic sites to suburban and urban polluted sites, as well as sites with large biogenic Figure 15. HCHO columns over northern China as observed with GOME (in blue), SCIAMACHY (in black), GOME-2 (in green), and OMI (in red) (De Smedt et al., 2008. emissions such as Porto Velho (Brazil) or Wollongong (Australia).
Although ground-based remote-sensing DOAS and FTIR instruments are naturally best suited for the validation of column measurements from space, in situ instruments can also bring useful information. This type of instrument can only validate surface HCHO concentrations, and therefore additional information on the vertical profile (e.g. from regional modelling) is required to make the link with the satellite retrieved column. However, in situ instruments (where available) have the advantage to be continuously operated for pollution monitoring in populated areas, allowing for extended and long-term comparisons with satellite data (see e.g. Dufour et al., 2009). Although more expensive and with a limited time and space coverage, aircraft campaigns provide unique information on the HCHO vertical distributions (Zhu et al., 2016).

Deployment of validation sites
Sites operating correlative measurements should preferably be deployed at locations where significant NMVOC sources exist. This includes the following: -Tropical forests (Amazonian forest, Africa, Indonesia).
The largest HCHO columns worldwide are observed over these remote areas that are difficult to access. Biogenic and biomass burning emissions are mixed. A complete year is needed to discriminate the various effects on the HCHO retrieval. Clouds tend to have more systematic effects in tropical regions. Aircraft measurements are needed over biomass burning areas.
-Temperate forests (south-eastern US, China, Eastern Europe). In summertime, HCHO columns are dominated by biogenic emissions. Those locations are useful to validate particular a priori assumptions such as model isoprene chemistry and OH oxidation scheme. Measurements are mostly needed from April to September.
-Urban and suburban areas (Asian cities, California, European cities). Anthropogenic NMVOCs are more diverse and have a weaker contribution to the total HCHO column than biogenic NMVOCs. This type of signal is therefore more difficult to validate. Continuous observations at mid-latitudes over a full year are needed, to improve statistics.
For adequate validation, the long-term monitoring should be complemented by dedicated campaigns. Ideally such campaigns should be organized in appropriate locations such as south-eastern US and Alabama, where biogenic NMVOCs and biogenic aerosols are emitted in large quantities during summertime and should include both aircraft and groundbased components.

Satellite-satellite intercomparisons
Satellite-satellite intercomparisons of HCHO columns are generally more straightforward than validation using groundbased correlative measurements. Such comparisons are evaluated in a meaningful statistical sense focusing on global patterns and regional averages, seasonality, scatter of values and consistency between results and reported uncertainties. When intercomparing satellite measurements, special care has to be taken for differences in spatial resolutions, resulting in possible offsets between satellite observations (van der A et al., 2008;De Smedt et al., 2010;Hilboll et al., 2013); differences in overpass times, which hold valuable geophysical information about diurnal cycles in emissions and chemistry Stavrakou et al., 2015); differences in a priori assumptions; differences in the cloud algorithms and cloud correction schemes.
Assessing the consistency between successive satellite sensors is essential to allow for scientific studies making use of the combination of several sensors. For example, trends in NMVOC emissions have been successfully derived from GOME(-2), SCIAMACHY, and OMI measurements (Fig. 15). It is anticipated that TROPOMI and the next GOME-2 instruments -OMPS, GEMS, TEMPO, and the future Sentinels 4 and 5 -will allow to extend these time series.

Conclusions
The retrieval algorithm for the TROPOMI formaldehyde product generation is based on the heritage from algorithms successfully developed for the GOME, SCIA-MACHY, GOME-2, and OMI sensors. A double-interval fitting approach is implemented, following an algorithm baseline demonstrated on the GOME-2 and OMI sensors. The HCHO retrieval algorithm also includes a post-processing across-track reference sector correction to minimize OMItype striping effects, if any. Additional features for future processor updates include the use of a larger fitting interval (if the quality of the recorded spectra allows it), daily earthshine radiance as reference selected in the remote Pacific, spectral outlier screening during the fitting procedure (spike removal algorithm), and a more accurate background correction scheme (as developed for the QA4ECV product). A detailed uncertainty budget is provided for every satellite observation. The precision of the HCHO tropospheric column is expected to come close to the COPERNICUS product requirements in regions of high emissions and, at mid-latitude, for summer (high sun) conditions. The trueness of the vertical columns is also expected to be improved, owing to the use of daily forecasts for the estimation of HCHO vertical profile shapes that will be provided by a new version of the TM5-MP model, running at the spatial resolution of 1 • × 1 • degree in latitude and longitude. The validation of satellite retrievals in the lower troposphere is known to be challenging. Ground-based measurements, where available, often sample the atmosphere at different spatial and temporal scales than the satellite measurements, which leads to ambiguous comparisons. Additional correlative measurements are needed over a variety of regions, in particular in the tropics and at the suburban level in mid-latitudes. These aspects are covered by a number of projects developed in the framework of the TROPOMI validation plan (Fehr, 2016 In addition to the main product results, such as HCHO slant column, tropospheric vertical column, and air mass factor, the L2 data files contain a number of additional ancillary parameters and diagnostic information. Table B1. Selective list of output fields in the TROPOMI HCHO product. Scanline and ground_pixel are, respectively, the number of pixels in an orbit along track and across track. Layer is the number of vertical levels in the averaging kernels and the a priori profiles. Appendix C: Averaging kernel Retrieved satellite quantities always represent a weighted average over all parts of the atmosphere that contribute to the signal observed by the satellite instrument. The DOAS total column retrieval is implicitly dependant on the a priori trace gas profile n a . Radiative transfer calculations account for the sensitivity of the measurement to the HCHO concentrations at all altitudes and these sensitivities are weighted with the assumed a priori profile shape to produce the vertical column. The averaging kernel (A) is proportional to the measurement sensitivity profile and provides the relation between the retrieved column N v and the true tracer profile x (Rodgers, 2000;Rodgers and Connor, 2003):

Symbol
where the profiles are expressed in partial columns (pc). For total column observations of optically thin absorbers DOAS averaging kernels are calculated as follows (Eskes and Boersma, 2003): A(p) = m(p) M , where m(p) is the altituderesolved air mass factor and M is the tropospheric air mass factor. The air mass factor, and therefore the retrieved vertical column, depends on the a priori profile shape, in contrast to the altitude-resolved air mass factor, which describes the sensitivity of the slant column to changes in trace gas concentrations at a given altitude and does not depends on the a priori profile in an optically thin atmosphere. From the definition of A, we have N v,a = A · n pc a and Eq. (21) simplifies to The averaging kernel varies with the observation conditions. In the HCHO retrieval product, A is provided together with the error budget for each individual pixel. The provided HCHO vertical columns can be used in two ways, each with its own associated error (Boersma et al., 2004): 1. For independent study and/or comparison with other independent measurements of total column amounts, the total error related to the column consists of slant column measurement errors, reference sector correction errors, and air mass factor errors. The latter consists of errors related to uncertainties in the assumed profile n a and errors related to the m parameters.
2. For comparisons with chemistry transport models or validation with independent profile measurements, if the averaging kernel information is used, the a priori profile shape error no longer contributes to the total error. Indeed, the relative difference between the retrieved column N v and an independent profile x i is The total AMF M cancels since it appears as the denominator of both N v and A. Because only the total AMF depends on the a priori tracer profile n a , the comparison using the averaging kernel is not influenced by the chosen a priori profile shape. The a priori profile error does not influence the comparison, but it does still influence the error on the retrieved vertical column.