HDO and H2O total column retrievals from TROPOMI shortwave infrared measurements

The Tropospheric Monitoring Instrument (TROPOMI) onboard the European Space Agency Sentinel-5 Precursor mission is scheduled for launch in the last quarter of 2016. As part of its operational processing the mission will provide CH4 and CO total columns using backscattered sunlight in the shortwave infrared band (2.3 μm). By adapting the CO retrieval algorithm, we have developed a non-scattering algorithm to retrieve total column HDO and H2O from the same measurements under clear sky conditions. The isotopologue ratio HDO/H2O is a powerful diagnostic in the efforts to improve our understanding of the hydrological cycle and its role in climate change, as it provides insight into the source and transport history of water vapour, nature's strongest greenhouse gas. Due to the weak reflectivity over water surfaces we need to restrict the retrieval to cloud-free scenes over land. We exploit a novel two-band filter technique, using strong-vs-weak water or methane absorption bands, to pre-filter scenes with medium-to-high level clouds, cirrus or aerosol and to significantly reduce processing time. Scenes with cloud top heights ≲ 1 km or very low fractions of high-level clouds, or scenes with an aerosol layer above a high surface albedo are not filtered out. We use an ensemble of realistic measurement simulations for various conditions to show the efficiency of the cloud filter and to quantify the performance of the retrieval. The single-measurement precision in terms of δD is better than 15–25 ‰ for even the lowest surface albedo (2–4 ‰ for high albedos), while a small bias remains possible of up to ∼ 20 ‰ due to remaining aerosol or up to ∼ 70 ‰ due to remaining cloud contamination. We also present an analysis of the sensitivity towards prior assumptions, which shows that the retrieval has a small but significant sensitivity to the a priori assumption of the atmospheric trace gas profiles. Averaging multiple measurements over time and space, however, will reduce these errors, due to the quasi-random nature of the profile uncertainties. The sensitivity of the retrieval with respect to instrumental parameters within the expected instrument performance is < 3 ‰, which represents only a small contribution to the overall error budget. Spectroscopic uncertainties of the water lines, however, can have a larger and more systematic impact on the performance of the retrieval and warrant further reassessment of the water line parameters. With TROPOMI's high radiometric sensitivity, wide swath (resulting in daily global coverage) and efficient cloud filtering, in combination with a spatial resolution of 7 × 7 km², we will greatly increase the amount of useful data on HDO, H2O and their ratio HDO/H2O. We showcase the overall performance of the retrieval algorithm and cloud filter with an accurate simulation of TROPOMI measurements from a single overpass over parts of the USA and Mexico, based on MODIS satellite data and realistic conditions for the surface, atmosphere and chemistry (including isotopologues). This shows that TROPOMI will pave the way for new studies of the hydrological cycle, both globally and locally, on timescales of mere days and weeks instead of seasons and years and will greatly extend the HDO/H2O datasets from the SCIAMACHY and GOSAT missions.

Atmos. Meas. Tech. Discuss., doi:10.5194/amt-2016-113, 2016. Manuscript under review for journal Atmos. Meas. Tech. Published: 8 April 2016. © Author(s) 2016. CC-BY 3.0 License.

Introduction
Water vapour, being the strongest natural greenhouse gas, plays a vital role in our understanding of climate change. It is part of a positive atmospheric feedback mechanism (Soden et al., 2005; Randall et al., 2007), and it plays a role in the mechanisms of cloud formation, of which the feedback mechanisms are still poorly understood (Boucher et al., 2013). A correct understanding of the many interacting processes that control atmospheric humidity, as well as constraining atmospheric circulation, is crucial for general circulation models (GCMs) to come to accurate climate projections (Jouzel et al., 1987; Yoshimura et al., 2011, 2014; Risi et al., 2012a, b).
Measurements of stable water isotopologues, such as HDO, can be a unique diagnostic with which to improve our knowledge of the hydrological cycle (Dansgaard, 1964; Craig and Gordon, 1965). Different isotopologues have different equilibrium vapour pressures, which lead to a temperature-dependent isotope fractionation whenever phase changes occur. The ratio HDO/H2O of an air parcel is therefore dependent on the source region's location and temperature and the entire transport history of the air parcel, including all evaporation, condensation and mixing events. This makes measurements of the ratio HDO/H2O a valuable benchmark for the evaluation and further development of GCMs and explains why isotopologues have been used for decades in the fields of palaeoclimatology, either using ice cores (Dansgaard et al., 1969; Jouzel et al., 1997) or speleothems (Lee et al., 2012), and hydrology in general (Mook, 2000; Aggarwal et al., 2005).
In the last decade there has been a rise in the application of water isotopologues to the atmospheric component of the hydrological cycle. This is directly related to improved remote-sensing techniques that can accurately measure water vapour isotopologues from ground-based networks, such as the Total Carbon Column Observing Network (TCCON, Wunch et al., 2011) and the Network for Detection of Atmospheric Composition Change (NDACC, formerly the Network for Detection of Stratospheric Change, Kurylo and Solomon, 1990; Schneider et al., 2016), as well as global measurements from space with instruments such as the Interferometric Monitor for Greenhouse gases (IMG, Zakharov et al., 2004), the Thermal Emission Spectrometer (TES, Worden et al., 2007), the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY, Frankenberg et al., 2009; Scheepmaker et al., 2015), the Infrared Atmospheric Sounding Interferometer (IASI, Herbin et al., 2009) and the Greenhouse gases Observing Satellite (GOSAT, Frankenberg et al., 2013; Boesch et al., 2013). These new techniques allow for more frequent and global measurements of the ratio HDO/H2O in water vapour and show the clear potential for furthering our understanding of the atmospheric hydrological cycle through comparisons with GCMs (Frankenberg et al., 2009; Yoshimura et al., 2011; Risi et al., 2012a, b).
Here, we present an algorithm and performance analysis for new measurements of total column HDO and H2O using the TROPOspheric Monitoring Instrument (TROPOMI, Veefkind et al., 2012) on board the European Space Agency (ESA) Sentinel-5 Precursor (S5P) mission, scheduled for launch in Q4 2016. Like SCIAMACHY, TROPOMI will measure HDO and H2O in a 2.3 µm shortwave infrared (SWIR) band of backscattered sunlight, which provides a high sensitivity near the surface. TROPOMI, however, will have a higher spatial resolution with 7 × 7 km² ground pixels, better radiometric performance, a larger swath and shorter revisit time, resulting in daily global coverage and many more measurements over cloud-free land pixels, while also demanding more efficient processing. With TROPOMI we have the opportunity to extend and improve the existing global HDO/H2O time series as well as study spatial and temporal gradients with higher spatial sampling and resolution.
In Sect. 2 we describe how we adapted TROPOMI's CO algorithm to retrieve HDO, H2O and their respective averaging kernels and how we filter for cloudy scenes. We then describe the performance of the algorithm in Sect. 3, as tested on a series of synthetic measurements with systematically varying scattering layers. A sensitivity analysis of the various input parameters is presented in Sect. 4. The performance on a realistic scenario of measurements above North America is presented in Sect. 5. Finally, in Sect. 6 we discuss our results in the context of other studies and we formulate our conclusions.

Retrieval algorithm description
Due to the large difference in atmospheric abundance between HDO and H2O, the measurement sensitivity, reflected in the averaging kernels, is very different for HDO and H2O. This makes the interpretation of their ratio very challenging under conditions of light scattering by clouds. We therefore have to prefilter for the most cloudy conditions, which at the same time reduces processing time. This cloud filter will be described in Sect. 2.3. After cloud filtering we use a non-scattering retrieval algorithm, adapted from the Shortwave Infrared CO Retrieval (SICOR) algorithm, which has already been developed as part of ESA's operational CO algorithm and is described by Landgraf et al. (2016) in this issue. By using this heritage of TROPOMI's CO processing, we benefit from an algorithm optimized for speed, while also leveraging already existing expertise and software. Sections 2.1 and 2.2 describe the specific implementation of the algorithm needed to retrieve both the H2O and HDO total column densities.

Forward model and averaging kernels
Atmos. Meas. Tech., 9, 3921-3937, 2016, www.atmos-meas-tech.net/9/3921/2016/

Throughout this paper we consider all water isotopologues present in the SWIR range, i.e. $\mathrm{H_2^{16}O}$, $\mathrm{H_2^{18}O}$ and $\mathrm{HD^{16}O}$, as separate absorbing species. For readability, however, we will simply write "H2O" when in fact we refer to the main isotopologue $\mathrm{H_2^{16}O}$, and "HDO" when we refer to $\mathrm{HD^{16}O}$. To simulate the SWIR radiance measurement, we employ a non-scattering forward model $F$ that simulates the reflected radiance of the Earth at its spectral sampling point $\lambda_i$ by the spectral convolution of the simulated radiance at the top of the model atmosphere $I_{\mathrm{TOA}}$ with the instrument spectral response function (ISRF) $s_i$:

$$F_i = \int s_i(\lambda)\, I_{\mathrm{TOA}}(\lambda)\, \mathrm{d}\lambda. \tag{1}$$

Here, we assume that sunlight is scattered only at the Earth's surface into the satellite line of sight (LOS) and is attenuated by atmospheric absorption along its path. Using this approximation, the simulated radiance at wavelength $\lambda$ is given by

$$I_{\mathrm{TOA}}(\lambda) = A_s\, \frac{\mu_0}{\pi}\, F_0(\lambda)\, \exp\!\left(-\tilde{\mu}\, \tau_{\mathrm{tot}}(\lambda)\right),$$

where $A_s$ is the Lambertian surface albedo and $\mu_0 = \cos(\theta_0)$ with the solar zenith angle $\theta_0$. For low solar zenith angles, $\mu_0$ is corrected for the sphericity of the Earth according to Kasten and Young (1989). $F_0$ is the solar irradiance inferred from TROPOMI solar measurements (van Deelen et al., 2007; Landgraf et al., 2016) and

$$\tilde{\mu} = \frac{1}{\mu_0} + \frac{1}{\mu_v}$$

indicates the air mass factor with $\mu_v = \cos(\theta_v)$ and viewing zenith angle $\theta_v$. The total optical thickness $\tau_{\mathrm{tot}}$ is given by

$$\tau_{\mathrm{tot}}(\lambda) = \sum_k \int_0^{z_{\mathrm{TOA}}} \rho_k(z)\, \sigma_k(z, \lambda)\, \mathrm{d}z,$$

where $z$ indicates the altitude ranging from the surface $z = 0$ to the top of the model atmosphere $z_{\mathrm{TOA}}$. Index $k$ represents the relevant absorbers, CO and CH4, including all their isotopologues, and H2O, $\mathrm{H_2^{18}O}$ and HDO. $\rho_k(z)$ is the concentration of absorber $k$ at altitude $z$ and $\sigma_k(z, \lambda)$ are the corresponding wavelength- and altitude-dependent absorption cross sections.
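The non-scattering forward model described above can be sketched in a few lines of Python. This is a minimal illustration, not the SICOR implementation: the function names are hypothetical, the trapezoidal altitude integration is a choice made here, and the Gaussian ISRF is an assumption (the real ISRF is an instrument property).

```python
import numpy as np

def total_optical_depth(z, densities, cross_sections):
    """tau_tot(lambda): sum over absorbers of the altitude integral of rho_k * sigma_k."""
    dz = np.diff(z)[:, None]
    tau = 0.0
    for rho, sigma in zip(densities, cross_sections):
        f = rho[:, None] * sigma                      # shape (n_z, n_lambda)
        tau = tau + np.sum(0.5 * (f[1:] + f[:-1]) * dz, axis=0)  # trapezoidal rule
    return tau

def toa_radiance(albedo, sza_deg, vza_deg, f0, tau_tot):
    """Surface-reflected sunlight attenuated along the two-way slant path."""
    mu0 = np.cos(np.radians(sza_deg))
    muv = np.cos(np.radians(vza_deg))
    air_mass = 1.0 / mu0 + 1.0 / muv                  # air mass factor
    return albedo * mu0 / np.pi * f0 * np.exp(-air_mass * tau_tot)

def convolve_isrf(wl_fine, rad_fine, wl_sample, fwhm):
    """Convolve with a normalised Gaussian ISRF (assumed shape, uniform fine grid)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    offsets = wl_fine[None, :] - np.asarray(wl_sample, dtype=float)[:, None]
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return (weights @ rad_fine) / weights.sum(axis=1)
```

Keeping the optical depth, the radiance and the convolution as separate functions mirrors the order in which the text introduces them, so each step can be checked on its own.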
The retrieval relies on a priori concentration profiles for CO and CH4 from the TM5 chemistry model (Krol et al., 2005) and specific humidity profiles from the European Centre for Medium-Range Weather Forecasts (ECMWF, Dee et al., 2011). These profiles are interpolated to the higher resolution of the TROPOMI pixels using a digital elevation model to account for variations in air mass due to orography. Specific humidity is converted into concentration profiles for the absorbers H2O, $\mathrm{H_2^{18}O}$ and HDO using their natural abundance ratios.
In Fig. 1 we show a simulated transmission spectrum for the entire TROPOMI SWIR spectral range (2305-2385 nm). The top panel shows the total transmission, while the lower four panels show the individual transmissions of the main absorbing species (H2O, HDO, CH4 and CO). For the retrieval of the HDO and H2O total column densities we chose the spectral window between 2354.0 and 2380.5 nm (indicated with the grey band), as a trade-off between including the strongest HDO absorption lines and keeping the overlap with the strongest H2O absorption lines minor. The smaller spectral windows in blue (H2O) and red (CH4) indicate the weak and strong absorption bands used for cloud filtering (see below). Although we additionally fit $\mathrm{H_2^{18}O}$ to improve the fit quality of the other species, its absorption lines are weaker than those of HDO (not shown). An accurate retrieval of total column $\mathrm{H_2^{18}O}$ in this spectral range is not yet feasible and therefore not part of the final retrieval product.
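For the following analysis, we define two relative profiles:

$$\rho_k^{\mathrm{rel}}(z) = \frac{\rho_k(z)}{c_k}, \qquad c_k = \int_0^{z_{\mathrm{TOA}}} \rho_k(z)\, \mathrm{d}z,$$

and

$$\rho_{k,k'}^{\mathrm{rel}}(z) = \frac{\rho_{k'}(z)}{\rho_k(z)}, \tag{6}$$

where $\rho_k^{\mathrm{rel}}$ is the relative profile of absorber $k$ with respect to the vertically integrated total column $c_k$ and $\rho_{k,k'}^{\mathrm{rel}}$ is the relative profile of absorber $k'$ with respect to absorber $k$. Assuming that the abundance of trace gas $k$ changes by a scaling of the reference profile $\rho_k^{\mathrm{rel}}$, the derivative of the total optical depth with respect to the trace gas column density is given by

$$\frac{\partial \tau_{\mathrm{tot}}(\lambda)}{\partial c_k} = \int_0^{z_{\mathrm{TOA}}} \rho_k^{\mathrm{rel}}(z)\, \sigma_k(z, \lambda)\, \mathrm{d}z,$$

thus, with the air mass factor $\tilde{\mu} = 1/\mu_0 + 1/\mu_v$,

$$\frac{\partial I_{\mathrm{TOA}}(\lambda)}{\partial c_k} = -\tilde{\mu}\, I_{\mathrm{TOA}}(\lambda) \int_0^{z_{\mathrm{TOA}}} \rho_k^{\mathrm{rel}}(z)\, \sigma_k(z, \lambda)\, \mathrm{d}z.$$

Corresponding expressions hold for the radiance derivative with respect to the trace gas concentration at a certain altitude level. Finally, the derivative of $I_{\mathrm{TOA}}$ with respect to surface albedo $A_s$ is

$$\frac{\partial I_{\mathrm{TOA}}(\lambda)}{\partial A_s} = \frac{\mu_0}{\pi}\, F_0(\lambda)\, \exp\!\left(-\tilde{\mu}\, \tau_{\mathrm{tot}}(\lambda)\right).$$

After the spectral convolution in Eq. (1), we have a linearized forward model:

$$F(x, b) \approx F(x_0, b) + K\,(x - x_0), \tag{11}$$

with the Jacobian $K = \frac{\partial F}{\partial x}(x_0, b)$. Here, we distinguish between the state vector $x$ that comprises in its components the parameters to be retrieved and forward model parameters $b$ describing parameters other than the state vector that influence the measurement. Equation (11) represents a Taylor expansion of the forward model around state vector $x_0$ truncated to first order.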
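The analytical derivatives with respect to the column and the surface albedo can be checked numerically. The sketch below does this for a single synthetic absorber: the exponential profile shape, the single Lorentzian line and all numerical values are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def column_integral(z, f):
    """Trapezoidal altitude integral of f with shape (n_z, n_lambda)."""
    dz = np.diff(z)[:, None]
    return np.sum(0.5 * (f[1:] + f[:-1]) * dz, axis=0)

def radiance(c, albedo, rho_rel, sigma, z, mu0, muv, f0):
    """Non-scattering radiance for one absorber with total column c."""
    air_mass = 1.0 / mu0 + 1.0 / muv
    tau = c * column_integral(z, rho_rel[:, None] * sigma)
    return albedo * mu0 / np.pi * f0 * np.exp(-air_mass * tau)

def radiance_derivatives(c, albedo, rho_rel, sigma, z, mu0, muv, f0):
    """Analytical dI/dc and dI/dA_s for the profile-scaling assumption."""
    air_mass = 1.0 / mu0 + 1.0 / muv
    i_toa = radiance(c, albedo, rho_rel, sigma, z, mu0, muv, f0)
    dtau_dc = column_integral(z, rho_rel[:, None] * sigma)
    return -air_mass * i_toa * dtau_dc, i_toa / albedo

# Synthetic example (illustrative numbers only).
z = np.linspace(0.0, 48.0e3, 97)                          # altitude in m
shape = np.exp(-z / 2.0e3)
rho_rel = shape / column_integral(z, shape[:, None])[0]   # integrates to 1
wl = np.linspace(2354.0, 2380.5, 200)                     # wavelength in nm
line = 2.0e-23 / (1.0 + ((wl - 2367.0) / 0.5) ** 2)       # cross section in cm^2
sigma = np.tile(line, (z.size, 1))
args = (rho_rel, sigma, z, np.cos(np.radians(50.0)), 1.0, 1.0)

c0, a0 = 5.0e22, 0.1                                      # column in cm^-2, albedo
d_dc, d_da = radiance_derivatives(c0, a0, *args)
dc = 1.0e-4 * c0
fd_dc = (radiance(c0 + dc, a0, *args) - radiance(c0 - dc, a0, *args)) / (2.0 * dc)
fd_da = (radiance(c0, a0 + 0.01, *args) - radiance(c0, a0, *args)) / 0.01
```

The column derivative is negative at every wavelength because more absorber always lowers the radiance, while the albedo derivative is exact under finite differences because the model is linear in the albedo.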

Inversion
To determine the column density of water vapour isotopologues from SWIR measurements, we adjust the state vector $x$ to fit the forward model to the measurement vector $y$, with spectral residuals $e_y$, by a least squares fitting approach. The state vector $x$ includes the total columns of CO, CH4, H2O, $\mathrm{H_2^{18}O}$ and HDO, two coefficients to describe the linear spectral dependence of the surface albedo $A_s$, and a spectral shift of the ISRF to adjust the spectral calibration of the TROPOMI instrument per retrieval. We apply the profile scaling approach as described by Borsdorff et al. (2014) employing a Gauss-Newton iteration scheme. The least squares minimization problem is solved per iteration step with the solution

$$x_{n+1} = x_n + G\,\bigl(y - F(x_n, b)\bigr),$$

with the gain matrix

$$G = \left(K^T S_y^{-1} K\right)^{-1} K^T S_y^{-1}$$

and the measurement covariance matrix $S_y$.
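A Gauss-Newton least squares fit with a gain matrix can be illustrated on a toy problem. The sketch below assumes a two-element state vector (a column scaling factor and a surface albedo), a single synthetic absorption line and noise-free data; none of these choices come from the operational algorithm, and the function names are hypothetical.

```python
import numpy as np

def gain_matrix(K, S_y_inv):
    """G = (K^T S_y^-1 K)^-1 K^T S_y^-1, evaluated with a linear solve."""
    return np.linalg.solve(K.T @ S_y_inv @ K, K.T @ S_y_inv)

def gauss_newton(forward, jacobian, y, x0, S_y, max_iter=20, tol=1e-12):
    """Iterate x <- x + G (y - F(x)) until the relative step is below tol."""
    x = np.asarray(x0, dtype=float)
    S_y_inv = np.linalg.inv(S_y)
    for _ in range(max_iter):
        G = gain_matrix(jacobian(x), S_y_inv)
        step = G @ (y - forward(x))
        x = x + step
        if np.max(np.abs(step / x)) < tol:
            break
    return x, G

# Toy forward model: x = [column scaling factor, surface albedo].
wl = np.linspace(2354.0, 2380.5, 100)
tau_ref = 1.0 / (1.0 + ((wl - 2367.0) / 2.0) ** 2)   # reference optical depth
AIR_MASS = 2.5                                       # illustrative value

def forward(x):
    return x[1] * np.exp(-AIR_MASS * x[0] * tau_ref)

def jacobian(x):
    f = forward(x)
    return np.column_stack([-AIR_MASS * tau_ref * f, f / x[1]])

x_true = np.array([1.1, 0.2])
y = forward(x_true)                                  # noise-free synthetic spectrum
S_y = np.diag(np.full(wl.size, 1.0e-6))
x_hat, G = gauss_newton(forward, jacobian, y, x0=[1.0, 0.15], S_y=S_y)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse of $K^T S_y^{-1} K$, which is usually the numerically safer choice.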
After convergence, the column averaging kernel can be calculated in a straightforward manner:

$$A_{k,k'} = g_k\, K_{\rho_{k'}},$$

where $g_k$ is the row vector of the gain matrix $G$ that belongs to the trace gas $k$ and

$$K_{\rho_{k'}} = \frac{\partial F}{\partial \rho_{k'}}$$

is the forward model Jacobian with respect to the trace gas profile $\rho_{k'}$. The height dependence of the profile $\rho_k$ is omitted for a clear presentation. For the full mathematical proof, the reader is referred to Borsdorff et al. (2014). For $k \neq k'$, $A_{k,k'}$ describes the interference of the retrieved column $c_k$ with the real vertical distribution of another trace gas $k'$. For $k = k'$, it is the standard column averaging kernel and we use the simpler notation

$$A_k \equiv A_{k,k}.$$

The relation between our retrieval product $c_{\mathrm{ret},k}$ (the retrieved total column of species $k$) and the true state (concentration profile $\rho_k$) of the atmosphere is given by

$$c_{\mathrm{ret},k} = A_k\, \rho_k + \sum_{k' \neq k} A_{k,k'}\, \rho_{k'} + e_x, \tag{16}$$

where $e_x$ is the error on the retrieved total column due to the forward model and measurement errors $e_y$. In the case of a single trace gas retrieval the interference terms $A_{k,k'}$ do not exist. In such a case, the meaning of the remaining column averaging kernel $A_k$ is related to the proper choice of the reference profile and the effective null space of the regularization (see the discussion in Borsdorff et al., 2014 and Wassmann et al., 2015). If the chosen reference profile is correct, the equation is equal to a geometrical integration of $\rho_k$.
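The column averaging kernel and the interference kernels can be computed directly from a gain matrix and layer Jacobians. The sketch below uses a discrete layer model with random, purely illustrative Jacobians for two absorbers; the dictionary layout and the function name are assumptions, and the reference profiles are expressed as layer fractions of the total column.

```python
import numpy as np

def column_averaging_kernels(K_rho, rho_rel, S_y_inv):
    """Return A[k][kp] = g_k @ K_rho[kp] for a profile-scaling retrieval.

    K_rho[k]   : Jacobian of the spectrum w.r.t. the layer amounts of absorber k
    rho_rel[k] : reference profile of k as layer fractions of its total column
    """
    species = list(K_rho)
    # Scaling the reference profile of k changes the spectrum by K_rho[k] @ rho_rel[k].
    K = np.column_stack([K_rho[k] @ rho_rel[k] for k in species])
    G = np.linalg.solve(K.T @ S_y_inv @ K, K.T @ S_y_inv)
    return {k: {kp: G[i] @ K_rho[kp] for kp in species}
            for i, k in enumerate(species)}

rng = np.random.default_rng(1)
n_spec, n_lay = 50, 12
layer = np.arange(n_lay)
# Hypothetical Jacobians; the H2O sensitivity is made to drop with altitude.
K_rho = {
    "H2O": rng.uniform(0.5, 1.5, (n_spec, n_lay)) * np.exp(-layer / 4.0),
    "HDO": rng.uniform(0.5, 1.5, (n_spec, n_lay)),
}
rho_rel = {k: np.exp(-layer / 2.0) / np.exp(-layer / 2.0).sum() for k in K_rho}
A = column_averaging_kernels(K_rho, rho_rel, np.eye(n_spec))
```

In this linear setting the kernels satisfy two identities that match the text: the column averaging kernel applied to the reference profile of its own species gives exactly 1, and an interference kernel applied to the reference profile of the other species gives 0.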
In the case of multiple trace gas retrievals, we need to assess Eq. (16) in more detail. Using Eq. (6), the above equation can be written as

$$c_{\mathrm{ret},k} = \Bigl(A_k + \sum_{k' \neq k} \rho_{k,k'}^{\mathrm{rel}}\, A_{k,k'}\Bigr)\, \rho_k + e_x, \tag{17}$$

showing that the contribution of the interference kernel $A_{k,k'}$ can be interpreted as an error term for every level of the averaging kernel $A_k$. Because atmospheric humidity can show strong variability (in both time and space), and the variability of HDO is (to first order) strongly correlated to the variability of the main water isotopologue, we are particularly interested in possible interferences between H2O and HDO. We want to be certain that a measured variability in HDO is truly caused by variations in HDO and not a result of the interferences between H2O and HDO. Therefore, we need to test if the interferences are small for the cases $k$ = H2O and $k'$ = HDO and vice versa. Figure 2 shows an example of the column averaging kernels $A_{\mathrm{H_2O}}$ and $A_{\mathrm{HDO}}$, including the interference kernels multiplied with the relative profiles as in Eq. (17). The averaging kernel for H2O ($A_{\mathrm{H_2O}}$, top left panel) shows that the retrieval is only sensitive to H2O in the lower atmosphere. This is a result of strong pressure broadening of the H2O absorption lines (Frankenberg et al., 2009). Since the HDO lines are weaker, the averaging kernel for HDO ($A_{\mathrm{HDO}}$, lower left panel) is more uniform, showing only slightly lower sensitivity at high layers. The interference kernels show that above ∼ 10 km variations of HDO have a minor impact on the retrieval of H2O ($\rho_{\mathrm{H_2O,HDO}}^{\mathrm{rel}} A_{\mathrm{H_2O,HDO}} \approx -0.02$, top right panel in Fig. 2), and variations of H2O have a small impact on the retrieval of HDO ($\rho_{\mathrm{HDO,H_2O}}^{\mathrm{rel}} A_{\mathrm{HDO,H_2O}} \approx -0.04 \pm 0.01$, lower right panel in Fig. 2). Since the column averaging kernels $A_{\mathrm{H_2O}}$ and $A_{\mathrm{HDO}}$ are much larger, and the density profiles $\rho_{\mathrm{H_2O}}$ and $\rho_{\mathrm{HDO}}$ will be very low higher in the atmosphere, the induced errors on the total columns due to this interference are practically negligible.
For a proper error characterization of the retrieval product, we calculate the error covariance matrix $S_x$ by

$$S_x = G\, S_y\, G^T.$$

This allows us to quantify the retrieval noise standard deviation $\sigma_k$ of the individual column densities and a possible correlation between them.
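For data interpretation, it is common to consider the relative abundance of HDO with respect to H2O, $r = c_{\mathrm{ret,HDO}} / c_{\mathrm{ret,H_2O}}$, and to reference the ratio to the Vienna Standard Mean Ocean Water (VSMOW) ratio $R_{\mathrm{VSMOW}}$:

$$\delta D = \left(\frac{r}{R_{\mathrm{VSMOW}}} - 1\right) \times 1000\ \text{‰}, \tag{19}$$

where δD is typically given in units of per mil. The first studies that measured δD in the atmosphere (as mentioned in the introduction) have shown that typical variations in δD over time or space are of the order of 50-100 ‰, which we therefore regard as the required accuracy for a useful product. The diagnostic tools of the individual columns can be used to derive the corresponding quantities for δD. For example, when the HDO-H2O cross-correlation term is neglected, the standard deviation of the retrieval noise is given by

$$\sigma_{\delta D} = 1000\ \text{‰} \times \frac{r}{R_{\mathrm{VSMOW}}} \sqrt{\left(\frac{\sigma_{\mathrm{HDO}}}{c_{\mathrm{ret,HDO}}}\right)^2 + \left(\frac{\sigma_{\mathrm{H_2O}}}{c_{\mathrm{ret,H_2O}}}\right)^2}, \tag{20}$$

and in a similar manner the column averaging kernel with respect to the H2O and HDO abundance can be derived.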
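The δD notation and the propagation of column noise into δD can be made concrete with a short Python sketch. The numerical value of the VSMOW reference ratio used here (3.1153 × 10⁻⁴ for HDO/H2O, roughly twice the VSMOW D/H ratio) is an assumption of this example, as are the function names; the optional covariance argument shows where a cross-correlation term would enter.

```python
import math

# Assumed HDO/H2O reference ratio for VSMOW (not taken from the text above).
R_VSMOW = 3.1153e-4

def delta_d(c_hdo, c_h2o, r_std=R_VSMOW):
    """delta-D in per mil from the HDO and H2O total columns."""
    return (c_hdo / c_h2o / r_std - 1.0) * 1000.0

def sigma_delta_d(c_hdo, c_h2o, sigma_hdo, sigma_h2o, cov=0.0, r_std=R_VSMOW):
    """Linear error propagation of the column noise into delta-D (per mil).

    cov is the HDO-H2O noise covariance; cov=0 neglects the cross-correlation.
    """
    ratio = c_hdo / c_h2o / r_std
    rel_var = ((sigma_hdo / c_hdo) ** 2 + (sigma_h2o / c_h2o) ** 2
               - 2.0 * cov / (c_hdo * c_h2o))
    return 1000.0 * ratio * math.sqrt(rel_var)

# Example: a column ratio 10 % below the reference ratio.
c_h2o = 5.0e22
c_hdo = 0.9 * R_VSMOW * c_h2o
dd = delta_d(c_hdo, c_h2o)
sd = sigma_delta_d(c_hdo, c_h2o, 0.004 * c_hdo, 0.003 * c_h2o)
```

For this example the relative column errors of 0.4 % and 0.3 % combine to a relative ratio error of 0.5 %, which scales to 4.5 ‰ at δD = −100 ‰.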

Two-band cloud filtering
Since the HDO and H2O retrieval algorithm does not account for clouds nor any other scattering layers, we need to filter for clouds to avoid large retrieval inaccuracies. This filtering is achieved using the retrieved columns in a weak and strong absorption band of either CH4 or H2O. The bands used are indicated in Fig. 1. Elevated scattering layers not accounted for in the forward model generally cause a retrieval bias by scattering photons directly into the instrument. This optical path length shortening leads to negative biases in the retrieved total columns. The 2-band cloud filter relies on the fact that a total column measurement using a strong absorption band is more strongly affected by this "shielding bias" than the measurement using a weak absorption band. As a result, the relative difference in the retrieved total column between the weak and strong absorption band can be used to indicate the presence of clouds. Using a set of simulated measurements for varying cloudy conditions (as will be described in Sect. 3.1), we have tested that using a threshold < 6 % for the relative difference in total column CH4 between the weak and strong bands we filter for ground scenes that have a cloud fraction of more than 10-20 % (cloud top height ≥ 1 km).
Scenes with low-level clouds (cloud top heights < 1 km) are not affected by this filter. Since low-level clouds above sea pass the filter, the retrieval allows for measurements over sea above these clouds due to their high albedo. The albedo of the sea surface itself is too low in the shortwave infrared for a meaningful retrieval above cloud-free sea pixels.
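The two-band test reduces to a comparison of two retrieved columns, which the following Python sketch illustrates. The normalisation by the weak-band column and the use of the absolute value are assumptions of this example, as are the function names; only the 6 % threshold for methane is taken from the text.

```python
import numpy as np

def two_band_relative_difference(col_weak, col_strong):
    """Relative difference between weak-band and strong-band total columns."""
    col_weak = np.asarray(col_weak, dtype=float)
    col_strong = np.asarray(col_strong, dtype=float)
    return (col_weak - col_strong) / col_weak

def passes_cloud_filter(col_weak, col_strong, threshold=0.06):
    """True where the scene is kept, i.e. the two bands agree within the threshold."""
    return np.abs(two_band_relative_difference(col_weak, col_strong)) < threshold

# Three hypothetical scenes: clear, slightly contaminated, strongly shielded.
weak = np.array([1.00, 1.00, 1.00])
strong = np.array([0.99, 0.97, 0.85])
keep = passes_cloud_filter(weak, strong)
```

Because the function works on arrays, a whole orbit of scenes can be screened in one call before the more expensive isotopologue retrieval is run.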
As an example, Fig. 3 shows how the relative difference in retrieved methane absorption between the strong and weak absorption bands (shown in red in Fig. 1) changes for the scenarios with clouds (left panel), cirrus (middle panel) and aerosol (right panel). Scenes with the strongest effects on the light path of the observed signal will also show the largest relative difference. We find that with a relative difference in methane absorption < 6 % (indicated with the pink curve) we effectively filter for clouds and cirrus, as well as for low surface albedo scenes affected by aerosol. For example, not affected by the filter are scenes with a cloud top height ≲ 1 km or scenes with a low fraction of higher-level clouds (i.e. everything below or left of the pink curve in the left panel of Fig. 3). A similar performance is achieved with a 2-band water filter, using the weak and strong water bands as shown in blue in Fig. 1 and a threshold for the relative difference in water absorption in these bands of 8 %. In the next sections the impact of this cloud filter on the retrievals of HDO and H2O will be shown. The 2-band cloud filtering will be part of TROPOMI's operational methane preprocessing pipeline (Hu et al., 2016), so synergies with the operational data processing can be used to reduce the processing time significantly, as we have estimated that on average 20 % of all the measured ground scenes will pass the cloud filter above land, and 14 % above sea.

Table 1. Overview of the generic scenarios used for the performance analysis. For the scenarios where the SZA was not variable, it was fixed at 50°. For the clouds scenario the surface albedo was fixed at 0.05.

Scenario | Variable X | Variable Y
Cloud free | surface albedo: 0.03-0.6 | SZA: 0.0-70.0°
Clouds | cloud top height: 1-8 km | cloud fraction: 0.0-1.0
Cirrus | surface albedo: 0.03-0.6 | τ_cir (2300 nm): 0.0-1.0
Aerosol | surface albedo: 0.03-0.6 | AOT (550 nm): 0.0-1.0

Performance analysis for generic scenarios
To assess the performance of the retrieval algorithm, we applied the retrieval to simulated measurements for various generic scenarios. For each scenario, we systematically varied two variables such as surface albedo, solar zenith angle (SZA), cloud parameters (cloud top height, cloud fraction and cloud optical thickness ($\tau_{\mathrm{cld}}$)) and aerosol optical thickness (AOT). An overview of the scenarios is given in Table 1.

Measurement simulations
The measurement simulations for the generic scenarios were created using the S-LINTRAN radiative transfer model (Schepers et al., 2014). The implementation of S-LINTRAN for TROPOMI simulations, including the instrument model, is described in detail in Landgraf et al. (2016), as the same simulations have been used to assess the performance of the CO retrieval algorithm. A summary of the implementation is provided in the following two paragraphs.

Table 2. Microphysical properties of water and ice clouds: $n(r)$ represents the size distribution type, $r_{\mathrm{eff}}$ and $v_{\mathrm{eff}}$ are the effective radius and variance of the size distribution, $m = n - ik$ is the refractive index. The ice cloud size distribution follows a power-law distribution as proposed by Heymsfield and Platt (1984). Columns: water clouds, ice clouds.
The model is a scalar plane-parallel radiative transfer model that fully accounts for multiple elastic light scattering by clouds, cirrus, air molecules and the reflection of light from the Earth's surface. The optical properties of water clouds are calculated using Mie theory with microphysical cloud properties given in Table 2. For ice clouds the ray-tracing model of Hess and Wiegner (1994) and Hess et al. (1998) is employed assuming hexagonal, columnar ice crystals randomly oriented in space. Cirrus and water clouds are described by cloud top and base height, and cloud optical thickness. While cirrus fully cover the observed ground scene, water clouds can show partial cloud coverage by utilizing the independent pixel approximation (Marshak et al., 1995) for the simulation.
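The independent pixel approximation mentioned for partially cloudy scenes amounts to a weighted sum of two one-dimensional simulations. A minimal Python sketch, assuming the clear-sky and fully cloudy spectra are already available as arrays (the function name is hypothetical):

```python
import numpy as np

def independent_pixel_radiance(i_clear, i_cloudy, cloud_fraction):
    """Scene radiance as the cloud-fraction-weighted mix of clear and cloudy spectra."""
    if not 0.0 <= cloud_fraction <= 1.0:
        raise ValueError("cloud_fraction must lie between 0 and 1")
    i_clear = np.asarray(i_clear, dtype=float)
    i_cloudy = np.asarray(i_cloudy, dtype=float)
    return (1.0 - cloud_fraction) * i_clear + cloud_fraction * i_cloudy

# Two spectral points, 25 % cloud cover (illustrative radiances).
mixed = independent_pixel_radiance([1.0, 2.0], [3.0, 4.0], 0.25)
```

The approximation ignores horizontal photon transport between the cloudy and clear parts of the pixel, which is what makes the two simulations independent.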
Measurement noise was superimposed on the radiance spectra using the TROPOMI noise model (Tol et al., 2011). This assumed an observed ground scene of 7 × 7 km² and a telescope aperture of $6 \times 10^{-6}$ m². The resulting signal-to-noise ratio is 120 in the continuum of the spectrum for a dark reference scene (surface albedo $A_s$ = 0.05, viewing zenith angle VZA = 0° and solar zenith angle SZA = 70°).
Figure 3 (partial caption). The cirrus scenario assumed a cloud fraction of 100 % for a layer between 9 and 10 km and a variable surface albedo (x axis) and cirrus optical thickness (y axis). The aerosol scenario assumed a sulphate-type aerosol in the boundary layer between 0 and 2 km, a variable surface albedo (x axis) and aerosol optical thickness (y axis). The pink curve shows the 6 % threshold that will be used for filtering.

The atmospheric model assumed the US standard atmosphere (1976) for the profiles of dry air density, temperature, pressure, water and CO. The CH4 profile is taken from the CAMELOT European background profile scenario (Levelt and Veefkind, 2009), interpolated to the same pressure grid and converted from mixing ratios to densities using the air densities from the US standard atmosphere. We separated the water profile into individual profiles for the three isotopic components with absorption features in the TROPOMI SWIR range: $\mathrm{H_2^{16}O}$, $\mathrm{H_2^{18}O}$ and HDO. First, the water profile was scaled with the VSMOW abundance of the respective species. Additionally, a realistic altitude-dependent depletion of HDO and $\mathrm{H_2^{18}O}$ was assumed. For HDO we assumed a linear decrease from δD = −100 ‰ at the surface to δD = −600 ‰ at 15 km, followed by a linear increase to δD = −400 ‰ at the top of the atmosphere at an altitude of 48 km (Ehhalt, 1974; Ehhalt et al., 2005; Schneider et al., 2010). We further assumed that the concentration of $\mathrm{H_2^{18}O}$ is related to the concentration of HDO according to the empirically determined "global meteoric water line" (Craig, 1961):

$$\delta D = 8\, \delta^{18}\mathrm{O} + 10\ \text{‰},$$

where $\delta^{18}\mathrm{O}$ is defined in the same way as δD (Eq. 19). All the atmospheric profiles used for the measurement simulations are shown in Fig. 4.
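The assumed isotopic depletion profiles are simple enough to write down explicitly. The sketch below builds the piecewise-linear δD profile from the three anchor points given in the text and derives δ18O from the global meteoric water line; the helper that converts a δ value into an isotopologue density, and the reference ratio passed to it in the tests, are assumptions of this example.

```python
import numpy as np

def delta_d_profile(z_km):
    """Piecewise-linear delta-D (per mil): -100 at 0 km, -600 at 15 km, -400 at 48 km."""
    return np.interp(z_km, [0.0, 15.0, 48.0], [-100.0, -600.0, -400.0])

def delta_18o_from_delta_d(delta_d):
    """Invert the global meteoric water line, delta-D = 8 * delta-18O + 10 (per mil)."""
    return (np.asarray(delta_d, dtype=float) - 10.0) / 8.0

def isotopologue_density(rho_main, delta, r_standard):
    """Scale the main-isotopologue density by the standard ratio and the depletion."""
    return rho_main * r_standard * (1.0 + np.asarray(delta, dtype=float) / 1000.0)

z_km = np.array([0.0, 7.5, 15.0, 48.0])
dd = delta_d_profile(z_km)
d18o = delta_18o_from_delta_d(dd)
```

`np.interp` holds the end values constant outside the anchor range, so altitudes above 48 km would simply keep δD = −400 ‰ in this sketch.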
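In the following subsections, we characterize the retrieval performance for the generic scenarios, separately considering the retrieval statistical errors (i.e. the single-measurement noise), $\sigma_{\mathrm{H_2O}}$, $\sigma_{\mathrm{HDO}}$ and $\sigma_{\delta D}$ (neglecting the small HDO-H2O cross-correlation term for $\sigma_{\delta D}$), and the biases in the total columns $c_{\mathrm{ret,H_2O}}$, $c_{\mathrm{ret,HDO}}$ and their ratio δD. The retrieval statistical error estimate for δD is given by Eq. (20). For the bias in δD, which we refer to as "ΔδD", we first determine $\delta D_{\mathrm{retrieval}}$ by removing the noise on the retrieved total columns HDO and $\mathrm{H_2^{16}O}$ using linear error propagation for the particular noise realization. We need to compare $\delta D_{\mathrm{retrieval}}$ with $\delta D_{\mathrm{model}}$, where $\delta D_{\mathrm{model}}$ is δD of the "true" model atmosphere, computed from the total columns $c_{\mathrm{model},k}$ of the model profiles:

$$\delta D_{\mathrm{model}} = \left(\frac{c_{\mathrm{model,HDO}} / c_{\mathrm{model,H_2O}}}{R_{\mathrm{VSMOW}}} - 1\right) \times 1000\ \text{‰}.$$

Finally, the retrieval bias on δD is defined as

$$\Delta\delta D = \delta D_{\mathrm{retrieval}} - \delta D_{\mathrm{model}}.$$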

Cloud-free conditions
In Fig. 5 we show the simulated cloud-free retrieval bias for the total column H2O (left panel), total column HDO (middle panel) and their ratio (ΔδD, right panel) as a function of surface albedo and SZA (no clouds or aerosol present). The figure shows that the retrieval performs very well for the majority of the scenes, with ΔδD less than 0.8 ‰. Only for the lowest surface albedos (0.03-0.05) does ΔδD increase to a few per mil, due to a slightly more negative bias in H2O compared to HDO. The corresponding statistical error estimates are shown in Fig. 6. $\sigma_{\mathrm{H_2O}}$ reaches maximum values of 1.6-2.0 % for the lowest surface albedos and highest SZAs, and $\sigma_{\mathrm{HDO}}$ is about a factor of 2 larger due to the weaker HDO absorption features. Combined, this results in values for $\sigma_{\delta D}$ of the order of 15-25 ‰ for the lowest surface albedos. For high surface albedo regions such as deserts (surface albedo ∼ 0.3 in the SWIR), typical values for $\sigma_{\delta D}$ are 2-4 ‰. This is roughly an order of magnitude better than what is achieved with SCIAMACHY (Scheepmaker et al., 2015).

Clouds and cirrus
As the retrieval algorithm does not account for scattering, any clouds, cirrus and aerosol present in the observed scene will lead to biases in the retrieval of the total columns HDO and H2O. We have tested the performance of the retrieval under cloudy conditions with a scenario assuming a cloud with an optical thickness of $\tau_{\mathrm{cld}}$ = 5, with varying cloud top heights (between 1 and 8 km in steps of 1 km) and varying cloud fractions (between 0.0 and 1.0 in steps of 0.1). The same scenario was used to demonstrate the 2-band cloud filter in the left panel of Fig. 3. Due to differences in their retrieval sensitivities, the observed bias is stronger for H2O than for HDO, leading to significant biases in their ratio, increasing with both cloud fraction and cloud top height (although not shown, we find that ΔδD can reach values > 900 ‰ for clouds above 7 km with 100 % cloud coverage). Similarly, by simulating scenarios with varying surface albedos and a uniform cirrus or aerosol layer with varying optical thickness (see Table 1), we find that this bias increases with the optical thickness of the layer and with lower surface albedos, as both lead to a lower contribution of photons from below the scattering layer reaching the instrument. As described in Sect. 2.3, the 2-band cloud filtering technique will be used to prefilter the scenes most affected by this shielding bias. We find that, after applying the 2-band methane filter to scenes affected by clouds and cirrus, ΔδD ≲ 70 ‰ and $\sigma_{\delta D}$ = 10-20 ‰ (for scenes with a low surface albedo of $A_s$ = 0.05).

Aerosol
An aerosol layer typically has a lower optical thickness than clouds and occurs lower in the atmosphere, leading to a different impact on our non-scattering retrieval. Our aerosol scenario assumes a uniform layer of a sulphate-type aerosol in the boundary layer between 0 and 2 km. Figure 7 shows how this induces a bias in the total columns H2O, HDO and their ratio, as a function of aerosol optical thickness and surface albedo. We see that for very low surface albedos, direct reflection off the aerosol layer leads to path length shortening and a corresponding negative (shielding) bias for the total column H2O. This effect is weaker for HDO due to its more uniform averaging kernel. For higher surface albedos, however, we see that the bias becomes positive, likely due to an increased amount of light scattering in the boundary layer. The contribution of photons from the brighter surface increases and a fraction of these photons undergo multiple scattering events between the aerosol layer and the surface, enhancing the path length. The net effect on δD is that its bias due to aerosol is highest for the lowest surface albedos and highest AOT (right panel in Fig. 7). If we take the 2-band cloud filter into account (the pink curve coming from Fig. 3) to filter the lowest surface albedos affected by aerosol, we are left with ΔδD ≲ 20 ‰ due to boundary layer aerosol with AOT = 1.0 (at 550 nm). The statistical error (not shown) does not depend significantly on AOT, but varies primarily with surface albedo, reaching similar peak values as in the cloud-free scenario ($\sigma_{\delta D}$ ≈ 20 ‰).

Summary of the general performance
In summary, we can conclude that the retrieval performs well under cloud-free conditions. The bias in δD will be less than 2 ‰, even for the lowest surface albedos, and the statistical errors vary from 2-4 ‰ for high albedos to 15-25 ‰ for the lowest albedos. Under conditions with clouds, cirrus or aerosol the retrieval performs less well and we generally find a positive bias in δD. To restrict this bias we need strict filtering against clouds and aerosol by applying the 2-band cloud filter either to methane or water (which additionally leads to a great reduction in the computational effort). Applying a 2-band methane threshold of 6 %, we restrict the bias in δD to less than 70 ‰ for all simulated measurements. Averaging multiple single measurements over time and space will further reduce the statistical error and will improve the accuracy to better than the maximum 70 ‰. This brings the measurements within the minimum requirement to study, e.g. the range of seasonality and the meridional variation, which are of the order of 50-100 ‰. On smaller temporal and spatial scales, such as local daily variability, a higher accuracy is needed, which TROPOMI is able to deliver as long as the conditions are cloud free and only moderately affected by aerosol.

Sensitivity to prior assumptions
Similarly to what was done for the CO TROPOMI retrievals (Landgraf et al., 2016), we have tested the sensitivity of the H2O and HDO total column retrievals to the prior assumptions, including the impact on δD. These so-called forward modelling errors were tested on the cloud-free scenario (with varying SZA and surface albedo) using the same measurement simulation as described in Sect. 3. A perturbation in one of the input assumptions was introduced, after which the retrievals were performed and compared with the default retrievals without the perturbation. The impact of the perturbation is expressed as a systematic error and standard deviation, where we define the systematic error as the mean difference in δD between the perturbed and default retrievals for the 45 scenes with the three lowest surface albedos (0.03, 0.05 and 0.075). The results are summarized in Table 3.
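The definition of the systematic error given above can be sketched in a few lines. The function name and the example values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev

def forward_model_error(delta_d_perturbed, delta_d_default):
    """Mean and sample standard deviation of per-scene delta-D differences (per mil)."""
    if len(delta_d_perturbed) != len(delta_d_default):
        raise ValueError("both retrieval sets must cover the same scenes")
    diffs = [p - d for p, d in zip(delta_d_perturbed, delta_d_default)]
    return mean(diffs), stdev(diffs)

# Hypothetical delta-D values (per mil) for three scenes, not results from this study.
default = [-200.0, -210.0, -190.0]
perturbed = [-193.0, -203.5, -182.5]
systematic, spread = forward_model_error(perturbed, default)
print(f"systematic error: {systematic:+.2f} +/- {spread:.2f} per mil")
```

In the analysis described here the same statistic would be taken over the 45 scenes with the three lowest surface albedos.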
The retrieval uses a priori temperature profiles and surface pressures from the ECMWF. To test the impact of uncertainties in the temperature profile, we have varied this profile by ±0.5 and ±1 K. This primarily affects the retrieved total column H2O, while the total column HDO is not very sensitive to temperature variations. A perturbation of +1 K (−1 K) leads to a decrease (increase) in the retrieved H2O column of 1.8 %, inducing a systematic error in δD of +14 ‰ (−14 ‰). This error is constant for all surface albedos and SZAs and scales linearly with the size of the temperature perturbation.

Table 3. Summary of the sensitivity to the meteorological input and instrument parameters, expressed as the mean difference in δD between the perturbed and default retrievals for the 45 scenes with the three lowest surface albedos (0.03, 0.05 and 0.075).

Prior parameter    Systematic error in δD [‰]
Temp −0.5 K        −6.9 ± 0.74
Temp +0.5 K        +7.0 ± 0.
The atmospheric pressure profile is derived from the surface pressure. To test the impact of inaccuracies in the ECMWF surface pressure, we applied a perturbation of ±1 %. This leads to systematic errors of about 0.5 % in H2O and 0.13 % in HDO (with reversed sign), together inducing errors of about 4.5 ‰ in δD.
The retrieval algorithm requires a reflectance spectrum, acquired by dividing the radiance spectrum measured from the Earth's surface by the irradiance spectrum measured directly from the Sun. Differences in the radiometric offset between these spectra could induce spectral features in the reflectance spectrum, leading to systematic errors. The TROPOMI instrument requirement for the radiometric offset on the radiance is 0.1 % of the continuum level. We have tested the impact of an offset on the radiance of 0.1 and 0.5 % of the maximum value in the retrieval window. However, the retrieval fits for an offset in the reflectance spectra, which partly mitigates the effects of an offset in the radiance or irradiance. The systematic errors due to uncertainties in the radiometric offset are therefore very small (errors in δD less than 0.5 ‰).
For the default retrieval and measurement simulations we have assumed a Gaussian slit function (ISRF) with a full width at half maximum (FWHM) of 0.25 nm. We have tested the impact of perturbing this FWHM by ±1 % and find that the induced systematic errors are strongly dependent on surface albedo and SZA. The largest errors in δD reach ±3 ‰ and are found for high albedos and low SZAs. The mean systematic error for the lowest albedos is 0.36 ± 0.85 ‰.
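As an illustration of the slit function perturbation, the sketch below evaluates an area-normalised Gaussian ISRF for the nominal FWHM of 0.25 nm and for a FWHM broadened by 1 %. The function name and the normalisation are assumptions made for this example; it does not reproduce the instrument model used in the simulations.

```python
import math

def gaussian_isrf(delta_nm, fwhm_nm=0.25):
    """Area-normalised Gaussian slit function at wavelength offset delta_nm (in nm)."""
    # Convert the full width at half maximum to the Gaussian standard deviation.
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * (delta_nm / sigma) ** 2)

nominal_peak = gaussian_isrf(0.0)
broadened_peak = gaussian_isrf(0.0, fwhm_nm=0.25 * 1.01)  # +1 % FWHM perturbation
print(f"peak change: {100.0 * (broadened_peak / nominal_peak - 1.0):+.2f} %")
```

Because the area is conserved, a broader slit function lowers the peak and smooths the absorption lines slightly, which is the kind of spectral change the perturbed retrievals respond to.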
In summary (also see Table 3), we find that the retrieval algorithm is most sensitive to uncertainties in the a priori temperature profiles, followed by the pressure profiles. The sensitivity to uncertainties in the instrument parameters is about an order of magnitude smaller. The uncertainties in the input profiles are expected to be mostly quasi-random in nature, which means their impact on the error in δD will diminish when taking averages in time and space.
More structural systematic errors (i.e. those that will not diminish by averaging) are potentially caused by uncertainties in the water spectroscopy. Recent studies have shown that spectroscopic uncertainties of water can have a large impact on total column retrievals of CO (Galli et al., 2012), CH4 (Frankenberg et al., 2008; Schneising et al., 2009), H2O (Schrijver et al., 2009) and the HDO / H2O ratio (Scheepmaker et al., 2013). As a test of the possible impact of uncertainties in the water line parameters, we have repeated the retrievals of the simulated clear-sky scenario after replacing the line parameters of the water isotopologues. For the simulated spectra the parameters from the high-resolution transmission database were used (HITRAN, Rothman et al., 2009). We then performed the retrievals using the water line parameters from Scheepmaker et al. (2013). Table 4 shows the induced systematic errors for replacing a single isotopologue at a time, and for replacing all modelled water isotopologues simultaneously. It shows that the retrieval of HDO and H2O can be very sensitive to spectroscopic uncertainties, especially for the ratio HDO / H2O, since HDO and H2^16O can show sensitivities with opposite sign, which strengthen each other when taking the ratio (as can be seen from replacing only the H2^16O parameters). The differences in spectroscopy between HITRAN and Scheepmaker et al. (2013) can lead to differences in δD of up to 128 ‰. Although we find that the differences do not depend on surface albedo or SZA, we cannot exclude a dependency on the total amount of water vapour, which might lead to seasonal and latitudinal biases. Similar to the retrieval of CO (Galli et al., 2012), the HDO / H2O retrieval will very likely benefit from a reassessment of the spectroscopic line parameters of water, a study which is currently ongoing (Loos et al., 2015). Regardless of such reassessments, validation studies will be needed to verify spectroscopy and to define corrections that might mitigate spectroscopy-related biases.

Performance analysis for a realistic scenario
To show the capabilities of the TROPOMI H2O and HDO total column retrievals, we have simulated an ensemble of measurements that reflect a realistic scenario as accurately as possible. In Sect. 5.1 we describe the input data and measurement simulations in more detail. In Sect. 5.2 we discuss the results of retrieving the simulated measurements in terms of retrieval bias, precision, and effectiveness of the cloud filtering.

Measurement simulations
The simulated measurement ensemble covers a region over the south-western US and north-eastern Mexico as observed by TROPOMI on 4 August 2009. Figures 8 and 9 show the region in terms of various input fields. This region comprises a clear gradient in the relative abundance of HDO with respect to H2O due to the transport of humid air from coastal regions inland. We have combined data from the MODIS Aqua satellite (clouds, land/water coverage, surface albedo, aerosol) with data from ETOPO5 (elevation), ECMWF (surface pressure, temperature profiles, specific humidity), TM5 (CO and CH4 profiles) and LMDZiso (HDO and H2^18O profiles) to simulate 27405 TROPOMI measurements on a grid of 135 by 203 ground pixels. This ensemble represents roughly what TROPOMI will observe in 5 min with a daily revisiting cycle.
The viewing and solar geometry and ground pixel size were adapted from MODIS Aqua granule 2009216 (19:45 UT), where the MODIS information was spatially resampled on a pixel size of 10 × 10 km^2 at subsatellite point. The pixel distortion towards the outer swath was adopted from the MODIS observation. The surface reflection was estimated from the MODIS MCD43C4 data product at 2105-2155 nm for the same period (Strahler et al., 1999) in combination with the surface elevation from ETOPO5 (NOAA, 1988). Furthermore, the MODIS MYD06 cloud product (Platnick et al., 2015a) was used to estimate cloud cover and cloud top height for the individual TROPOMI ground pixels. Only clouds with top height above 100 m were used. We derived cirrus optical thicknesses from the MODIS cirrus reflectance product employing the algorithm by Dessler and Yang (2003). For all pixels, the cirrus was located between 9 and 10 km. For the aerosol optical thickness (at 550 nm) we used the MODIS MYD08_M3 global monthly mean product (Platnick et al., 2015b), resampled to the above-mentioned granule with a pixel size of 10 × 10 km^2 at subsatellite point.
For some pixels with missing aerosol data the optical thickness was set to 0.1. We assumed three different aerosol types: oceanic (above water), dust (above land), and urban (above all land regions with AOT > 0.23). The corresponding model fields are depicted in Fig. 8. The distribution of atmospheric trace gases was estimated using TM5 chemistry model simulations (Krol et al., 2005), which yield the CO and CH4 abundances. Moreover, we used data from ECMWF (Dee et al., 2011) for the atmospheric pressure, temperature and humidity profiles. For realistic HDO / H2O ratios we derived δD profiles from LMDZiso model simulations (Risi et al., 2010) and for the δ18O profiles we assumed correlation to δD according to the global meteoric water line (also see Eq. (21), Craig, 1961). Figure 9 shows the resulting total columns for the most important species for our ensemble. Based on this input and the TROPOMI instrument model as described in Sect. 3.1, we simulated for each individual pixel the TROPOMI SWIR observations using the S-LINTRAN radiative transfer model.
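The step from δD to δ18O can be sketched as follows. Eq. (21) is not reproduced in this section, so the sketch assumes the commonly quoted form of the global meteoric water line, δD = 8 δ18O + 10 ‰ (Craig, 1961); the function name and default coefficients are illustrative.

```python
def delta_18o_from_delta_d(delta_d_permil, slope=8.0, intercept=10.0):
    """Invert the assumed meteoric water line delta-D = slope * delta-18O + intercept."""
    return (delta_d_permil - intercept) / slope

# Hypothetical delta-D value of -150 per mil at one model level.
print(delta_18o_from_delta_d(-150.0))
```

Applied level by level to a δD profile, this yields a δ18O profile that is fully correlated with δD, which is the assumption stated above.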

Results
Using the simulated measurement ensemble we retrieved the water vapour abundances using the SICOR retrieval algorithm, including the retrieval of the two methane and water bands used for cloud filtering. In Fig. 10 the results are shown in terms of the retrieved bias in total column H2O, HDO and δD. We also show the relative difference in the weak vs. strong water bands that were used for cloud filtering. The cloud filter panel (lower left panel in Fig. 10) shows that the algorithm retrieved some pixels above the Gulf of Mexico, even though these pixels did not contain low-level clouds. In reality such pixels will be removed by prefiltering for very low albedo regions. Once the 2-band cloud filter threshold is applied (keeping only pixels with a relative difference < 8 % using the water bands), practically all ocean pixels are removed, as well as all the land pixels affected by clouds, resulting in 54.5 % of the pixels remaining for further study. Using the two methane bands as a cloud filter with a threshold of 6 % resulted in slightly less strict filtering (60.7 % of the pixels remaining), as certain pixels with low and optically thin clouds in the west above Mexico and in the east above Alabama were not removed (not shown). The few rejected pixels in the centre and south of Texas show that the cloud filter effectively removed high isolated clouds with low optical thickness (cf. the lower two panels in Fig. 8 with Fig. 10), but left pixels with clouds < 1 km intact (as is preferred). The large group of pixels in the north-east of the ensemble were rejected based on the presence of high and optically thick clouds, or the presence of aerosol above low surface albedo regions.
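The thresholding step of the 2-band filter can be sketched as below. This is a minimal illustration and not the SICOR implementation: the function names and pixel values are hypothetical, and the signed comparison follows the wording "relative difference < 8 %" (whether an absolute value is applied in practice is not stated here).

```python
def relative_difference_percent(col_weak, col_strong):
    """(weak - strong) / strong in percent for columns retrieved from the two bands."""
    return 100.0 * (col_weak - col_strong) / col_strong

def passes_two_band_filter(col_weak, col_strong, threshold_percent):
    """Keep a pixel only if the weak-vs-strong difference is below the threshold."""
    return relative_difference_percent(col_weak, col_strong) < threshold_percent

# Hypothetical water columns (weak band, strong band) in arbitrary units.
pixels = [(1.02, 1.00), (1.12, 1.00), (1.07, 1.00)]
kept = [p for p in pixels if passes_two_band_filter(*p, threshold_percent=8.0)]
print(f"{len(kept)} of {len(pixels)} pixels kept")
```

The methane variant would use the columns retrieved from the two methane bands with a threshold of 6 % instead.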
The other three panels in Fig. 10 show the remaining biases in total column H2O, HDO and δD after cloud and ocean filtering. One has to keep in mind, however, that any additional bias due to uncertainties in the prior assumptions (as discussed in Sect. 4) is not shown in these figures. Both H2O and HDO show a positive bias of a few percent above the higher surface albedo regions in the west and a small negative bias over the lower surface albedo regions in the east. Careful inspection of the states of Louisiana and Mississippi shows that even the albedo contrast caused by the Mississippi and Red River basins can be observed in the H2O and HDO bias maps. We also see that the biases are slightly larger for H2O, compared to HDO. The cause for these biases is aerosol, as the aerosol bias shows the same patterns as a function of surface albedo and AOT, and the effect is slightly larger for H2O, as discussed in Sect. 3.4 and shown in Fig. 7. Combined into a ratio, the lower right panel in Fig. 10 shows that the retrieval bias in δD is slightly negative above the highest albedos and increases to a positive bias with a maximum of ∼ 20 ‰ above the lowest albedo regions. Areas at high altitudes usually have a lower humidity and therefore a lower δD compared to areas at lower altitudes. This gradient is visible in the bottom right panel in Fig. 9.
Furthermore, areas at higher altitude generally have a higher surface albedo. Because the retrieval bias in δD is negative for high surface albedos and positive for low surface albedos, the altitude gradient in δD is overestimated by the retrieval. This will likely be the case for all scenarios with a gradient between higher

Figure 11 shows the same maps in terms of single-measurement precision error (1σ). As expected, the dominant factor to determine the precision error is surface albedo. The error of the strong absorbers H2O and CH4 is 0.05-0.15 % for the highest surface albedos, which increases to 0.35 % for the lowest surface albedos. The precision errors of the weak absorber HDO are larger, reaching 0.50 % for the lowest surface albedos. This translates into precision errors in δD of at most 5 ‰ above the lowest albedos.
This realistic scenario demonstrates the capabilities of TROPOMI and the SICOR algorithm to retrieve accurate patterns in total column H2O, HDO and δD above land from a single overpass. After the 2-band cloud filter effectively removed all measurements above water and high clouds, a small bias remains due to aerosol, which correlates with surface albedo. The bias is smaller (and of opposite sign) compared to the temporal or spatial gradients in δD expected for typical science cases (e.g. as observed by SCIAMACHY in Yoshimura et al., 2011; Lee et al., 2012; Risi et al., 2012a; Okazaki et al., 2015). This ensures the ability to detect and study patterns in δD on much smaller timescales and at higher spatial resolution compared to previous satellite missions, but care should be taken when using the data over regions with strong gradients in surface albedo.

Discussion and conclusions
We have presented an algorithm and performance analysis for the retrieval of total column H2O and HDO from TROPOMI measurements onboard the Sentinel-5 Precursor mission. By adapting ESA's operational CO algorithm (Landgraf et al., 2016), we developed a relatively simple approach that is fast but relies on strict filtering for clouds, cirrus and aerosol using a 2-band methane or water retrieval. The ratio HDO / H2O will be a useful scientific product in the fields of hydrology and climate research, with the potential to improve our understanding of the processes controlling atmospheric humidity and transport.
The first studies in this direction that used a similar type of column-averaged satellite product were based on SCIAMACHY data (Frankenberg et al., 2009; Yoshimura et al., 2011; Lee et al., 2012; Risi et al., 2012a, b; Scheepmaker et al., 2015). These studies showed that the typical seasonal or spatial gradients in δD are about 50-100 ‰. The measurement precision and accuracy need to be better than this in order to contribute significantly to science. For SCIAMACHY, this implied either taking monthly averages or binning to a spatial resolution of at least 1° × 1° in order to bring the statistical error down to about 20 ‰ (the single-measurement precision being ∼ 115 ‰, Scheepmaker et al., 2015). The newer GOSAT measurements show an improvement in precision by a factor of about 2, compared to SCIAMACHY (Frankenberg et al., 2013; Boesch et al., 2013). Both SCIAMACHY and GOSAT products show a negative bias of about 30-70 ‰ compared to ground-based Fourier-transform spectroscopy (FTS) networks.
Our analysis has shown that TROPOMI is expected to deliver a much better performance than SCIAMACHY and GOSAT in terms of δD in only a single overpass. The single-measurement noise will be better than 15-25 ‰ for even the lowest surface albedos, while at the same time the spatial resolution of 7 × 7 km^2 is much higher than SCIAMACHY's 120 × 30 km^2 and provides a better coverage than GOSAT's sparse spatial sampling. Even though we still need to filter for clouds, due to this higher spatial resolution TROPOMI will observe many more useful scenes in between clouds compared to SCIAMACHY or GOSAT. This allows for new opportunities to study the hydrological cycle on timescales of mere days or weeks instead of seasons or years, or over longer periods if a high spatial resolution is desired.
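The trade-off between single-measurement precision and averaging can be made concrete with a small sketch. It assumes that the errors of the individual measurements are independent and purely random, so that the error of the mean scales as σ/√N; this idealisation ignores systematic biases, and the function names are hypothetical.

```python
import math

def averaged_sigma(single_sigma, n):
    """Statistical error of the mean of n independent measurements."""
    return single_sigma / math.sqrt(n)

def measurements_needed(single_sigma, target_sigma):
    """Smallest n for which single_sigma / sqrt(n) does not exceed target_sigma."""
    return math.ceil((single_sigma / target_sigma) ** 2)

# Single-measurement precisions quoted in the text (per mil), target of 20 per mil.
print("SCIAMACHY (115 per mil):", measurements_needed(115.0, 20.0))
print("TROPOMI, lowest albedo (25 per mil):", measurements_needed(25.0, 20.0))
```

Under these assumptions a few tens of SCIAMACHY measurements are needed to reach 20 ‰, whereas for TROPOMI one or two measurements suffice even in the low-albedo case.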
Mainly due to the presence of low-level aerosol in the atmosphere, the cloud-filtered TROPOMI measurements of total column HDO and H2O are not expected to be completely bias free. Changes to the light paths of the reflected photons due to any scattering particles remaining after filtering are not accounted for in the retrieval algorithm, and lead to biases of a few percent in total column HDO and H2O, and up to ∼ 20 ‰ in δD, depending on surface albedo, as shown by our simulated scenario of measurements above the USA and Mexico.
After launch and commissioning of the instrument in Q4 2016, validation using ground-based FTS data from the TCCON and NDACC networks is needed to test the performance of the algorithm on real measurements. Thermal infrared products, such as δD from TES and IASI, also provide useful complementary information due to their different sensitivity. Therefore, aircraft validation may also be valuable, as in situ measurements could be useful to address any differences between total column and thermal infrared products. Ideally, the HDO / H2O products from the ground-based networks should first be intercompared, both using the results from the ongoing reassessment of the water spectroscopy (Loos et al., 2015), and for a range of atmospheric conditions and geographical locations. Any possible differences due to either spectroscopy or location (e.g. as found by Scheepmaker et al., 2015) need to be understood before the next generation of HDO and H2O global retrievals from space can be exploited to come to a better understanding of the atmospheric hydrological cycle and the role it plays in our changing climate.

Figure 1. Simulated spectral transmittance in the SWIR spectral range, showing the total transmittance (top panel) as well as the absorption features of the individual species (lower four panels). The simulation was performed assuming a solar zenith angle of 0° and a viewing zenith angle of 40°. The 2354.0-2380.5 nm retrieval window is indicated in grey. The coloured windows highlight the weak and strong absorption bands of H2O (blue) and CH4 (red) used for cloud filtering.

Figure 2. Top left: total column averaging kernel for H2O. Lower left: total column averaging kernel for HDO. Top right: the sensitivity of total column H2O to variations in HDO at different altitudes. Lower right: the sensitivity of total column HDO to variations in H2O at different altitudes.

Figure 3. 2-band CH4 filter results for clouds (left), cirrus (middle) and aerosol (right). Plotted is the relative difference in total column CH4 retrieved from the weak and strong bands: CH4 (weak − strong)/strong (%). The cloud scenario assumed a cloud optical thickness of τ_cld = 5 and a variable cloud top height (x axis) and cloud fraction (y axis). The cirrus scenario assumed a cloud fraction of 100 % for a layer between 9 and 10 km and a variable surface albedo (x axis) and cirrus optical thickness (y axis). The aerosol scenario assumed a sulphate-type aerosol in the boundary layer between 0 and 2 km, a variable surface albedo (x axis) and aerosol optical thickness (y axis). The pink curve shows the 6 % threshold that will be used for filtering.

Figure 4. Atmospheric profiles for the number densities of the absorbers (bottom axis, normalized to the surface value) and temperature (top axis) used as input for the model atmosphere.

Figure 5. Cloud-free retrieval bias as a function of surface albedo and SZA for the total columns of H2O (left, %), HDO (middle, %) and the HDO / H2O ratio (right, ‰).

Figure 6. Cloud-free statistical error estimates (single-measurement noise) as a function of surface albedo and SZA for the total columns of H2O (left, %), HDO (middle, %) and the HDO / H2O ratio (right, ‰).

Figure 7. Retrieval bias for an aerosol layer between 0 and 2 km as a function of surface albedo and AOT for the total columns of H2O (left, %), HDO (middle, %) and the HDO / H2O ratio (right, ‰). The pink curve shows the 6 % methane cloud filter threshold from Fig. 3. Applying that filter would result in filtering of the scenes left of the pink line.

Figure 8. A selection of the input for the realistic scenario simulation. Top left: SWIR surface albedo. Top right: aerosol optical thickness at 550 nm. Bottom left: cloud optical thickness. Bottom right: cloud top height.

Figure 9. Input total columns for the most important absorbing species for the realistic scenario simulation. Top left: H2O. Top right: HDO. Bottom left: CH4. Bottom right: the resulting total column HDO / H2O ratio expressed in δD.

Figure 10. Retrieval biases for water in the realistic scenario simulation. Except for the bottom left panel, the results are cloud filtered using a weak vs. strong water band threshold of 8 %. Top left: H2O bias. Top right: HDO bias. Bottom left: relative difference in the weak vs. strong water bands used for cloud filtering. Bottom right: bias in the derived HDO / H2O ratio.