Atmos. Meas. Tech., 13, 2751–2783, 2020
https://doi.org/10.5194/amt-13-2751-2020

Research article 28 May 2020

# Total column water vapour retrieval from S-5P/TROPOMI in the visible blue spectral range

Christian Borger, Steffen Beirle, Steffen Dörner, Holger Sihler, and Thomas Wagner
• Satellite Remote Sensing Group, Max Planck Institute for Chemistry, Mainz, Germany

Correspondence: Christian Borger (christian.borger@mpic.de) and Thomas Wagner (thomas.wagner@mpic.de)

Abstract

Total column water vapour has been retrieved from TROPOMI measurements in the visible blue spectral range and compared to a variety of different reference data sets for clear-sky conditions during boreal summer and winter. The retrieval consists of the common two-step DOAS approach: first the spectral analysis is performed within a linearized scheme and then the retrieved slant column densities are converted to vertical columns using an iterative scheme for the water vapour a priori profile shape, which is based on an empirical parameterization of the water vapour scale height. Moreover, a modified albedo map was used combining the OMI LER albedo and scaled MODIS albedo map. The use of the alternative albedo is especially important over regions with very low albedo and high probability of clouds like the Amazon region.

The errors of the total column water vapour (TCWV) retrieval have been theoretically estimated considering the contribution of a variety of different uncertainty sources. For observations during clear-sky conditions, over ocean surface, and at low solar zenith angles the error typically is around values of 10 %–20 %, and during cloudy-sky conditions, over land surface, and at high solar zenith angles it reaches values around 20 %–50 %.

In the framework of a validation study the retrieval demonstrates that it can capture the global water vapour distribution well: the retrieved H2O vertical column densities (VCDs) show very good agreement with the reference data sets over ocean for boreal summer and winter, and the modified albedo map substantially improves the retrieval's consistency with the reference data sets, in particular over tropical land masses. However, over land the retrieval underestimates the VCD by about 10 %, particularly during summertime. Our investigations show that this underestimation is likely caused by uncertainties within the surface albedo and the cloud input data: low-level clouds cause an underestimation, but for mid- to high-level clouds good agreement is found. In addition, our investigations indicate that these biases can probably be further reduced by the use of improved cloud input data. For general purposes we recommend only using VCDs with cloud fraction <20 % and AMF >0.1, which represents a good compromise between spatial coverage and retrieval accuracy.

The TCWV retrieval can be easily applied to further satellite sensors (e.g. GOME-2 or OMI) for creating uniform, long-term measurement data sets, which is particularly interesting for climate and trend studies of water vapour.

1 Introduction

Water vapour is the most important natural greenhouse gas in the atmosphere and plays a key role in the atmospheric energy balance via radiative effects and latent heat transport. Due to its high spatiotemporal variability on all atmospheric scales, accurate knowledge of the amount and distribution of water vapour is essential for numerical weather prediction and climate monitoring.

Several in situ and remote sensing measurement techniques have been developed in the past decades, enabling the observation of the water vapour distribution from platforms like radiosondes, balloons, aircraft, and satellites. The particular absorption properties of water vapour allow for the retrieval of the water vapour content via satellites in several different spectral ranges, from the radio, microwave (e.g. AMSU), thermal infrared (e.g. AIRS), and near and shortwave infrared (e.g. MODIS, MERIS, and TROPOMI) to the visible (e.g. GOME, SCIAMACHY, and GOME-2).

The visible spectral range is particularly interesting for the retrieval of total column water vapour (TCWV): in contrast to the microwave range it has a similar sensitivity for the ocean and land surface, allowing for global coverage. Also, it is possible to conduct retrievals under partly-cloudy conditions and, in comparison to the thermal infrared, it has a much higher sensitivity for the near-surface layers. Furthermore, the spectral analysis is straightforward; i.e. no forward model calculations are necessary.

So far TCWV has been retrieved mostly in the visible “red” spectral range because the absorption is strongest there. However, for this spectral range the ocean surface albedo is relatively low, leading to a low sensitivity for the lowermost troposphere, where the highest water vapour concentrations occur. In addition, current and past satellite sensors cannot resolve the fine absorption structure of water vapour in this spectral range, causing non-linear absorption effects (e.g. saturation) which have to be accounted for in post-processing. Thus, it has been suggested to apply retrievals in the “blue” spectral range (around 442 nm), where the absorption is much weaker than in the red, making the retrieval problem quasi-linear. In addition, the ocean surface albedo is much higher, leading to a higher sensitivity for the near-surface layers. The first operational analyses of a similar approach have been performed for measurements of the Ozone Monitoring Instrument (OMI).

In October 2017 the TROPOspheric Monitoring Instrument (TROPOMI) onboard ESA's Sentinel-5 Precursor (S-5P) satellite was launched into a Sun-synchronous polar orbit with an Equator crossing time of 13:30 LT (local time). TROPOMI is a UV-Vis-NIR push-broom spectrometer and consists of 450 detectors/rows covering a swath width of 2600 km. The outstanding property of TROPOMI is that its spectral bands in the visible combine a high signal-to-noise ratio with an unprecedented spatial resolution of 3.5×7.5 km² (and 3.5×5.6 km² since August 2019) at nadir, which allows spectral analyses to be performed with previously unattained accuracy even on small spatial scales.

In this paper we introduce a TCWV retrieval that applies the spectral analysis approach in the blue spectral range outlined above to S-5P/TROPOMI observations. The paper is organized as follows: in Sect. 2 we give an overview of the retrieval describing general retrieval principles and presenting the retrieval setup. In Sect. 3 we present an empirical parameterization of the a priori water vapour profile shape and an iterative scheme making use of the relation between the water vapour profile shape and TCWV. In Sect. 4 we evaluate different input albedo products and in Sect. 5 we perform a detailed uncertainty analysis including a variety of different error sources. In Sect. 6 we present first TCWV results retrieved from TROPOMI measurements and perform a validation study using data sets from satellites, ground-based measurements, and reanalysis models as reference. In Sect. 7 we draw conclusions and summarize the outcomes of our investigations.

2 Retrieval principles

## 2.1 Wavelength calibration and spectral analysis

In a first step the wavelength alignment of the daily measured irradiance is calibrated for each of the 450 TROPOMI detectors/rows via a non-linear least-squares fit in intensity space using the solar spectrum from as reference. Simultaneously, the instrumental spectral response function (ISRF) is approximated assuming an asymmetric Super-Gaussian following the definition of :

Next, we perform a spectral analysis using the differential optical absorption spectroscopy (DOAS) scheme in which the attenuation along the light path is calculated via the Beer–Lambert law in optical depth space:

$\begin{array}{}\text{(2)}& \mathrm{ln}\left(\frac{I}{{I}_{\mathrm{0}}}\right)=\mathit{\tau }\approx -\sum _{i}{\mathit{\sigma }}_{i}\left(\mathit{\lambda }\right)\cdot {\mathrm{SCD}}_{i}+\mathrm{\Phi },\end{array}$

where i denotes the index of a trace gas of interest, σi(λ) its respective absorption cross section, ${\mathrm{SCD}}_{i}={\int }_{s}{c}_{i}\mathrm{d}s$ its concentration integrated along the light path s (the so-called slant column density), and Φ a closure polynomial accounting for Mie and Rayleigh scattering as well as low-frequency contributions.

Table 1 summarizes the fit setup of the retrieval's spectral analysis. The retrieval's fit window ranges from 430 to 450 nm and accounts for molecular absorption by water vapour (HITRAN 2008), NO2 at 220 K, ozone, and the O2–O2 dimer. In order to account for the Ring effect we include two Ring spectra, and for Φ we use a fifth-order polynomial. Furthermore, we include pseudo-absorbers accounting for intensity offset, for shift and stretch effects, and for changes along the orbit in the ISRF parameters w and k in Eq. (1). All molecular absorption cross sections are convolved with the ISRF of the corresponding TROPOMI row/detector determined during the calibration process.

Table 1. DOAS fit settings for the H2O slant column retrievals.

The molecular absorption by water vapour within our fit window is relatively weak, and hence the modelled line lists vary systematically from HITRAN 2008 to HITRAN 2012 and to HITRAN 2016. Thus, the choice of line list is subject to a high degree of uncertainty. A comparison with long-path DOAS observations has shown that HITRAN 2012 underestimates the water vapour concentration by approximately 10 % and that the previous version, HITRAN 2008, agrees better with the reference measurements. Further long-path DOAS measurements taken during the CINDI-2 campaign confirm these findings (see Appendix B for more details). Hence, combining both sets of findings, we conclude that HITRAN 2008 best fits our needs and is superior to the most recent version of the HITRAN line lists (HITRAN 2016).

Due to the high daily data volume of the TROPOMI L1B radiances, the execution of a non-linear fit without high-performance infrastructure is demanding in computation time. For instance, TROPOMI's UVIS Band 4, which covers the spectral range of 400–499 nm, generates about 40 GB every day. Therefore, we implemented a weighted linear least-squares fit for our retrieval, in which the weights are the fractional coverage of the spectral pixel within the fit window (details in Appendix A). This weighting of the outermost pixels of the fit window avoids “jumps” in the set of pixels included in the DOAS fit, as they would occur for a fixed fit window due to the changing pixel-to-wavelength mapping across-track. Thus, across-track “stripes” in the SCDs are avoided. It has been reported that the computational speed of a MATLAB routine increases by 3 orders of magnitude when going from non-linear to linear fits (see Table 3 in that study).
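The weighted linear fit described above can be illustrated with a minimal sketch. It assumes NumPy, uses two made-up absorber spectra instead of real cross sections, and the function name `weighted_linear_doas_fit` is hypothetical; it is not the retrieval code itself, only one way to solve Eq. (2) by weighted linear least squares.

```python
import numpy as np


def weighted_linear_doas_fit(wavelength, optical_depth, cross_sections,
                             weights, poly_order=5):
    """Fit ln(I/I0) ~ -sum_i sigma_i * SCD_i + polynomial (Eq. 2).

    cross_sections has shape (n_pixels, n_gases); weights has shape
    (n_pixels,) and holds the fractional coverage of each spectral pixel.
    Returns (scd, poly_coeffs).
    """
    # Scale the wavelength axis to [-0.5, 0.5] so that the polynomial
    # columns stay well conditioned.
    span = wavelength.max() - wavelength.min()
    x = (wavelength - wavelength.mean()) / span
    polynomial = np.vander(x, poly_order + 1, increasing=True)

    # Design matrix: negative cross sections plus the closure polynomial.
    design = np.hstack([-cross_sections, polynomial])

    # Weighted least squares: multiply each row by sqrt(weight).
    sqrt_w = np.sqrt(weights)
    a = design * sqrt_w[:, None]
    b = optical_depth * sqrt_w

    # Normalize the columns, because cross sections and polynomial terms
    # differ by many orders of magnitude.
    norms = np.linalg.norm(a, axis=0)
    solution = np.linalg.lstsq(a / norms, b, rcond=None)[0] / norms

    n_gases = cross_sections.shape[1]
    return solution[:n_gases], solution[n_gases:]


# Synthetic example with two made-up absorbers (not real cross sections).
wl = np.linspace(430.0, 450.0, 200)
sigma = np.column_stack([1e-26 * (1.0 + np.sin(3.0 * wl)),
                         1e-19 * (1.0 + np.cos(5.0 * wl))])
true_scd = np.array([3e23, 5e15])
tau = -sigma @ true_scd + 0.01 + 0.002 * (wl - 440.0)
w = np.ones_like(wl)
w[0], w[-1] = 0.3, 0.7  # partially covered edge pixels
scd, poly = weighted_linear_doas_fit(wl, tau, sigma, w)
```

Because the problem is linear in the slant columns and polynomial coefficients, a single matrix solve per spectrum replaces the iterations of a non-linear fit.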

Figure 1. Example of a typical spectral analysis of a TROPOMI measurement spectrum (rms: 0.5 ‰, orbit: 6930, −7.41° N, −111.97° E). The black line indicates the fit result for the respective trace gas and the red line indicates the residual spectrum and residual noise for each constituent.

Figure 1 illustrates a typical example of such a spectral analysis of a TROPOMI measurement spectrum in which the absorption structures of water vapour, NO2, and the Ring effect can be well identified and the residual spectrum shows a mainly noisy structure. Figure 2 depicts the distribution of the H2O SCD from one TROPOMI orbit (orbit number 6930) on 13 February 2019. It demonstrates that the TROPOMI retrieval is able to capture the meso- to macro-scale water vapour patterns like convective updrafts in the tropics and atmospheric rivers in the midlatitudes, whereby the small H2O SCD values in the tropics are caused by cloud shielding.

Figure 2. H2O SCD distribution retrieved from TROPOMI measurements (orbit: 6930) on 13 February 2019 during an atmospheric river event at the western US coast.

## 2.2 Vertical column density conversion and Box-AMF simulations

To convert the slant column density to a vertical column density (VCD), we apply the so-called air-mass factor (AMF):

$\begin{array}{}\text{(3)}& \mathrm{VCD}=\frac{\mathrm{SCD}}{\mathrm{AMF}}.\end{array}$

The air-mass factor accounts for the non-trivial effects of the atmospheric radiative transfer and is usually based on radiative transfer model (RTM) simulations. In our case we used the 3D Monte Carlo RTM McArtim and performed simulations at a wavelength of 442 nm for different retrieval scenarios (summarized in Table 2) assuming an aerosol-free atmosphere. These simulations yield a Jacobian vector $\mathbit{J}=\frac{\partial \mathrm{ln}I}{\partial \mathit{\mu }}$ (with the absorption coefficient μ and the simulated intensity I at TOA normalized by the solar spectrum I0) defined at each grid box k. The altitude-dependent AMFs (BAMF) can then be calculated according to the formula

$\begin{array}{}\text{(4)}& {\mathrm{BAMF}}_{k}=-\frac{{\mathbit{J}}_{k}}{I\mathrm{\Delta }h},\end{array}$

with the box thickness Δh. These BAMF profiles have to be combined with the partial vertical columns ck of an a priori water vapour profile:

$\begin{array}{}\text{(5)}& \mathrm{AMF}=\frac{{\sum }_{k}{\mathrm{BAMF}}_{k}\cdot {c}_{k}}{{\sum }_{k}{c}_{k}},\end{array}$

with ${\sum }_{k}{c}_{k}=\mathrm{VCD}$. For the case of a cloud-contaminated pixel we assume that the cloud is a Lambertian reflector with an albedo of 80 % and use the cloud top height as surface altitude input for the AMF. Under the assumption of the independent pixel approximation, the resulting cloud-affected AMF can then be calculated as a linear combination of the AMF for a clear-sky scenario and the AMF for a cloudy-sky scenario weighted by the respective simulated intensities I and the effective cloud fractions ζ as follows:

$\begin{array}{}\text{(6)}& \mathrm{AMF}=\frac{\left(\mathrm{1}-\mathit{\zeta }\right){I}_{\mathrm{clear}}{\mathrm{AMF}}_{\mathrm{clear}}+\mathit{\zeta }\cdot {I}_{\mathrm{cloud}}{\mathrm{AMF}}_{\mathrm{cloud}}}{\left(\mathrm{1}-\mathit{\zeta }\right){I}_{\mathrm{clear}}+\mathit{\zeta }\cdot {I}_{\mathrm{cloud}}}.\end{array}$
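As a minimal sketch of Eqs. (3), (5), and (6), the following plain-Python functions combine box-AMFs with partial columns and mix clear-sky and cloudy-sky AMFs. The function names and all numbers in the example are hypothetical and serve only to illustrate the arithmetic.

```python
def total_amf(box_amfs, partial_columns):
    """Eq. (5): profile-weighted mean of the box air-mass factors."""
    weighted = sum(b * c for b, c in zip(box_amfs, partial_columns))
    return weighted / sum(partial_columns)


def cloud_affected_amf(amf_clear, amf_cloud, i_clear, i_cloud, cloud_fraction):
    """Eq. (6): intensity-weighted mix of clear-sky and cloudy-sky AMFs."""
    w_clear = (1.0 - cloud_fraction) * i_clear
    w_cloud = cloud_fraction * i_cloud
    return (w_clear * amf_clear + w_cloud * amf_cloud) / (w_clear + w_cloud)


def vertical_column(scd, amf):
    """Eq. (3): convert a slant column to a vertical column."""
    return scd / amf


# Made-up three-layer example: most water vapour sits in the lowest layer,
# where the sensitivity (box-AMF) is lowest.
amf_clear = total_amf([1.0, 1.5, 2.0], [6.0, 3.0, 1.0])
# A bright cloud (higher intensity) with a smaller AMF due to shielding.
amf = cloud_affected_amf(amf_clear, 0.5, i_clear=1.0, i_cloud=4.0,
                         cloud_fraction=0.2)
vcd = vertical_column(7.0, amf)
```

In this example the cloudy part contributes as much weight as the clear part although the cloud fraction is only 20 %, because the weighting uses the simulated intensities and the cloud is much brighter than the surface.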

Figure 3 depicts typical examples of BAMF profiles for different clear- and cloudy-sky scenarios. The AMFs for the cloudy-sky scenarios were calculated assuming a surface albedo of 7 % and an effective cloud fraction of 20 %. For the clear-sky scenario (panel a) the sensitivity decreases towards the surface. For the cloudy-sky scenarios (panel b) the BAMF profiles slightly increase towards the (bright) cloud top surface of the respective scenario. Below the cloud, the sensitivity is 0, because the atmosphere is shielded. Since high clouds shield large fractions of the atmosphere and hence also of the water vapour column below the cloud (see the black dashed curve), the AMF has to be corrected correspondingly and thus decreases for increasing cloud top heights.

Table 2. Parameter list and nodes for the BAMF profile simulations.

Figure 3. Examples of typical BAMF profiles for different observation scenarios under clear- (a) and cloudy-sky (b) conditions. For all profiles we assume a solar zenith angle of 0°, a line-of-sight angle of −90°, and a solar relative azimuth angle of 0°. For the clear-sky case the BAMF profile is illustrated for a surface albedo of 7 %. For the cloudy-sky cases the profiles are depicted for cloud top heights of 2 and 8 km and the respective AMFs are calculated assuming a surface albedo of 7 % and an effective cloud fraction of 20 %. The black dashed lines indicate relative water vapour concentrations with a scale height of 2 km.

3 A priori water vapour profile shape

As described in Sect. 2.2 and Eq. (5), knowledge of the a priori water vapour profile shape is necessary for accurate calculations of the AMF from the BAMF profile. However, simply assuming the same a priori profile shape for the whole globe might cause biases because it cannot account for the atmospheric variability of water vapour, such as latitudinal variation, seasonal cycles, or different profile shapes over maritime and continental regions due to different water vapour sources (e.g. evapotranspiration by plants). Also, simply using profiles from numerical weather models is not unproblematic: for instance, it has been reported that the calculated AMF changes strongly depending on which reanalysis model data are used.

In previous work the water vapour profile has been approximated by an exponential decay with altitude:

$\begin{array}{}\text{(7)}& {n}_{\mathrm{v}}\left(z\right)={n}_{\mathrm{0}}{e}^{-z/{H}_{\mathrm{v}}},\end{array}$

where Hv is the scale height of water vapour, which is defined there as

$\begin{array}{}\text{(8)}& {H}_{\mathrm{v}}=\frac{{R}_{\mathrm{v}}{〈T〉}^{\mathrm{2}}}{L〈\mathrm{\Gamma }〉},\end{array}$

where 〈T〉 denotes the mean air temperature within an atmospheric column, 〈Γ〉 the mean lapse rate within the same atmospheric column, Rv the gas constant of water vapour, and L the specific latent heat. However, this definition requires knowledge of the mean air temperature and/or the lapse rate and assumes that the relative humidity is constant with altitude. The former can only be estimated using numerical weather models and the latter is very unlikely to occur in the atmosphere.

Thus, we seek an empirical parameterization of the scale height and focus on its dependency on the H2O VCD and the aforementioned atmospheric variabilities, i.e. dependencies on latitude, seasonal cycle, and surface properties (such as vegetation effects).

We proceed as follows: first, we evaluate how well the method used to calculate the water vapour scale height can reproduce the COSMIC profiles via an AMF comparison. Then we examine how the scale height can be parameterized globally and derive a parameterization over ocean and land separately. Finally, we implement the parameterization in an iterative retrieval scheme and evaluate the new estimates of the H2O VCD.

## 3.1 COSMIC water vapour profiles

For our investigations we use profile data retrieved from measurements of the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) program provided by the Radio Occultation Meteorology Satellite Application Facility (ROMSAF). The COSMIC data are based on the GPS radio occultation (RO) technique, which provides high-resolution vertical profiles of bending angles that can be used to retrieve the atmospheric refractivity. Since the atmospheric refractivity depends on the air pressure, the air temperature, and the water vapour pressure, GPS RO allows for the retrieval of profile information under all weather conditions with a high vertical resolution, ranging from approximately 100 m in the lower troposphere to 1 km in the stratosphere (Anthes, 2011), and an accuracy of around 1 g kg⁻¹ while having an almost uniform global distribution.

The ROMSAF profiles have been retrieved via a 1D-VAR scheme within a reprocessing initiative for creating climate data record (CDR) v1.0. Given the strict product requirements and the validation studies with ERA-Interim and radiosondes, biases associated with using COSMIC should be of secondary order.

We use data retrieved between 2013 and 2016, which amounts to approximately $1.6\times 10^{6}$ profiles.

## 3.2 Calculation of scale height

For the calculation of the scale height we upsample the COSMIC profile to a 100 m grid up to 14 km or, more precisely, only consider profile data below 150 hPa (close to the tropopause height). Then we sum up all the partial columns of the COSMIC profile data from the ground up to a (scale) height Hsum at which the accumulated column exceeds a fraction of $\mathrm{1}-\frac{\mathrm{1}}{e}$ of the H2O VCD:

$\begin{array}{}\text{(9)}& \frac{{\int }_{\mathrm{0}}^{{H}_{\mathrm{sum}}}n\left(z\right)\mathrm{d}z}{{\int }_{\mathrm{0}}^{\mathrm{TOA}}n\left(z\right)\mathrm{d}z}>\mathrm{1}-\frac{\mathrm{1}}{e}\approx \mathrm{63}\phantom{\rule{0.125em}{0ex}}\mathit{\%}.\end{array}$
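The sum method of Eq. (9) can be sketched in a few lines of plain Python. The function name `scale_height_sum` is hypothetical, and the use of the trapezoidal rule for the partial columns is an assumption of this sketch rather than a detail taken from the retrieval.

```python
import math


def scale_height_sum(altitudes_km, concentrations):
    """Return the lowest grid altitude at which the column integrated from
    the ground exceeds a fraction 1 - 1/e of the total column (Eq. 9)."""
    # Partial column of each layer via the trapezoidal rule (an assumption).
    partial = [
        0.5 * (concentrations[k] + concentrations[k + 1])
        * (altitudes_km[k + 1] - altitudes_km[k])
        for k in range(len(altitudes_km) - 1)
    ]
    total = sum(partial)
    threshold = 1.0 - 1.0 / math.e

    cumulative = 0.0
    for k, column in enumerate(partial):
        cumulative += column
        if cumulative / total > threshold:
            return altitudes_km[k + 1]
    return altitudes_km[-1]


# Exponential test profile with a 2 km scale height on a 100 m grid
# up to 14 km.
z = [0.1 * k for k in range(141)]
n = [math.exp(-zk / 2.0) for zk in z]
h_sum = scale_height_sum(z, n)
```

For a purely exponential profile the method returns the prescribed scale height to within the grid spacing, which is the behaviour the synthetic AMF comparison below relies on.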

To evaluate this scale height approach, we performed a synthetic study in which we compared AMFs calculated for the original COSMIC water vapour profile measurements with AMFs for an exponential profile using the corresponding calculated scale height Hsum. For the simulation of the BAMF profiles we assume an albedo of 7 %, which is a representative value for the ocean surface albedo. The solar zenith angle is calculated for the location of the COSMIC profile assuming an hour angle of 90°, and the line-of-sight angles are prescribed as −90°, −70°, and −50°.

Figure 4. 2D histograms comparing synthetic AMFs (calculated via the sum method) for different line-of-sight angles (a: −90°, b: −70°, and c: −50°) assuming clear-sky conditions. The colour depicts the number of observations within one defined bin of the 2D histogram and the red dashed line represents the 1-to-1 diagonal.

Figure 5. Histogram of the relative deviation of the calculated synthetic AMFs between the exponential profile and COSMIC profile for selected latitude bins (0 to 10° N, −30 to −20° N, and −70 to −60° N) assuming clear-sky conditions and nadir-viewing geometry.

The results of the intercomparison are given in Fig. 4. The 2D histograms reveal that the AMFs derived with the exponential profile agree well with the AMFs calculated directly from the COSMIC profiles, indicating that the chosen method can reproduce the shapes of the COSMIC profiles well. This good agreement can also be observed in the histograms of Fig. 5, which illustrate distributions of the relative deviation between the AMFs for selected latitude bins. These distributions are sharply peaked around values of 0 %, indicating that the AMFs from the exponential shape are almost unbiased relative to the reference AMFs. In addition, Fig. S1 in the Supplement shows exemplary profiles for cases of good and bad agreement with the reference AMFs for the same selected latitude bins as in Fig. 5. In general, bad agreement (left column) occurs for profile shapes in which a sharp gradient is observed in the lower troposphere, followed by quasi-constant values with altitude. Such profiles usually occur when a moist boundary layer is topped by a dry free atmosphere. Nevertheless, the maximal absolute relative AMF deviations only have values around 15 %. In contrast, good agreement (right column) is found for profile shapes following an exponential decay with altitude, which indicates a well-mixed troposphere.

The results of the intercomparison for prescribed cloudy-sky conditions and nadir-viewing geometry are illustrated in Fig. S2, in which the panels show histograms of the relative AMF deviation for the same selected latitude bins as in Fig. 5 but for different cloud fraction (10 %, 20 %, and 50 %; left to right column) and cloud top height (1, 2, and 5 km; top to bottom rows) scenarios. For a cloud top height of 1 km the AMFs calculated from the exponential profiles are generally biased negative for all cloud fractions, in particular for the latitude bin of −30 to −20° N. However, for higher clouds the AMFs agree well with the reference AMFs for almost all cloud scenarios except the extreme case with a cloud fraction of 50 % and a cloud top height of 5 km or more.

Alternative methods for calculating the scale height yielded systematic overestimations of the AMF for clear-sky conditions (Fig. C1) and higher scatter within the AMF for cloudy-sky conditions (Fig. S8) in comparison to the sum method, as shown in detail in Appendix C.

## 3.3 Parameterization of scale height

Figure 6 depicts the distribution of the calculated COSMIC scale height Hsum against the COSMIC TCWV for boreal summer over ocean for latitude bins of 10°. The regression fits (solid red lines) are based on orthogonal distance regression (ODR) using the “scipy.odr” package built on ODRPACK. For low latitudes (tropics and subtropics) the scale height shows a high linear correlation with the H2O VCD, with slopes around 0.04 and Pearson correlation coefficients R of 0.7 and above. In contrast, for high latitudes the slope increases up to 0.1, and the scatter also increases distinctly; i.e. the correlation coefficient only reaches values of around 0.3 in the polar regions. This decrease in linear agreement is likely caused by the higher atmospheric variability due to stronger atmospheric dynamics in the midlatitudes. Also, the uncertainty in the COSMIC profile is higher because a drier atmosphere leads to a smaller sensitivity of the COSMIC profile retrieval to water vapour concentrations.
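A straight-line ODR fit of this kind can be sketched with `scipy.odr` as follows. The data are synthetic, the function names and the starting values `beta0` are assumptions of this sketch, and no measurement uncertainties are passed to `RealData`, so both axes are weighted equally.

```python
import numpy as np
from scipy import odr


def linear_model(beta, x):
    """Straight line: scale height = slope * TCWV + intercept."""
    return beta[0] * x + beta[1]


def fit_scale_height(tcwv, scale_height, beta0=(0.05, 1.0)):
    """Orthogonal distance regression of scale height against TCWV."""
    data = odr.RealData(tcwv, scale_height)
    result = odr.ODR(data, odr.Model(linear_model), beta0=list(beta0)).run()
    slope, intercept = result.beta
    pearson_r = np.corrcoef(tcwv, scale_height)[0, 1]
    return slope, intercept, pearson_r


# Synthetic data (made-up values): a noisy linear relation.
rng = np.random.default_rng(0)
tcwv = rng.uniform(5.0, 60.0, 500)
height = 0.04 * tcwv + 1.2 + rng.normal(0.0, 0.1, 500)
slope, intercept, pearson_r = fit_scale_height(tcwv, height)
```

Unlike ordinary least squares, ODR minimizes the distances perpendicular to the fitted line, so the result does not depend on which of the two quantities is treated as the independent variable.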

Figure 6. 2D histograms depicting the relation between calculated scale height and TCWV from COSMIC profiles for boreal summer (June, July, and August) only over ocean summarized in 10° latitude bins. Only latitude bins with a sample size of 1500 data points are illustrated. The colour indicates the relative share of total points within one bin of the histogram and the red line indicates the fit results of the orthogonal distance regression with detailed results in the legend of each subplot. In addition the Pearson correlation coefficient for each data set is given in the title of each subplot.

Figure 7. Same as Fig. 6 but for data over land.

Figure 7 illustrates the same panels as Fig. 6 but for data over land. In general, the scatter is distinctly larger for all latitude bins, resulting in an inferior linear agreement between the H2O VCD and the scale height compared to the data over ocean, especially for deserts and northern polar regions. Fortunately, the surface albedo of these regions is usually high, and thus the AMF is less dependent on the a priori profile shape. In addition, these regions are governed by an arid climate, and thus the retrieved H2O VCDs are expected to be small. Correspondingly, the absolute H2O VCD errors due to uncertainties in the AMF are still relatively small.

In the following we investigate a parameterization of the scale height with respect to H2O VCD, latitude, and season for ocean and land separately. To distinguish between ocean and land surface, we use a land–sea mask derived from GSHHS coastline data.

Figure 8. Summary of the results of the ODR fit between COSMIC scale height and COSMIC TCWV as a function of latitude and month for data over ocean. Panel (a) illustrates the fitted slopes and panel (b) the corresponding fitted intercepts, whereby the coloured points represent the fit results and the lines represent the approximations for α(θ,t) and β(θ,t) for each month.

### 3.3.1 Ocean

The regression line parameters of the ODR fit results between COSMIC TCWV and COSMIC scale height for each latitude bin for each month for data over ocean are illustrated in Fig. 8. The values for the fitted slopes (Fig. 8a) indicate a quadratic dependency with latitude and reveal a seasonal shift towards higher latitudes during July, August, and September. Also, the values for the fitted intercept vary with latitude and season.

Thus, the scale height over ocean Hocean can be approximated as follows:

$\begin{array}{}\text{(10)}& {H}_{\mathrm{ocean}}\left(\mathrm{VCD},\mathit{\theta },t\right)=\mathit{\alpha }\left(\mathit{\theta },t\right)\cdot \mathrm{VCD}+\mathit{\beta }\left(\mathit{\theta },t\right),\end{array}$

with

$\begin{array}{}\text{(11)}& \begin{array}{rl}& \mathit{\alpha }\left(\mathit{\theta },t\right)={a}_{\mathrm{0}}\left(t\right)+{a}_{\mathrm{1}}\left(t\right)\cdot \mathit{\theta }+{a}_{\mathrm{2}}\left(t\right)\cdot {\mathit{\theta }}^{\mathrm{2}},\\ & \mathit{\beta }\left(\mathit{\theta },t\right)={b}_{\mathrm{0}}\left(t\right)+{b}_{\mathrm{1}}\left(t\right)\cdot \mathit{\theta }+{b}_{\mathrm{2}}\left(t\right)\cdot \left|\mathit{\theta }-{\mathit{\theta }}_{\mathrm{0}}\left(t\right)\right|,\end{array}\end{array}$

with the latitude θ and the day of year t. The annual variation of the function parameters ai, bi, and θ0 from Eq. (11) fitted for the monthly data sets (illustrated in Fig. 8) is depicted in Fig. S9. Most function parameters reveal an annual and semi-annual cycle over the year. Hence, these function parameters can be approximated by a superposition of two simple cosine functions with prescribed frequencies:

$\begin{array}{}\text{(12)}& {a}_{i}\left(t\right)={a}_{i\mathrm{1}}\cdot \mathrm{cos}\left({a}_{i\mathrm{2}}+\mathit{\omega }t\right)+{a}_{i\mathrm{3}}\cdot \mathrm{cos}\left({a}_{i\mathrm{4}}+\mathrm{2}\mathit{\omega }t\right)+{a}_{i\mathrm{5}},\end{array}$

with t as the day of year and $\mathit{\omega }=\frac{\mathrm{2}\mathit{\pi }}{\mathrm{365}}$. Such functions have also been fitted and illustrated for the monthly data in Fig. S9 (solid orange lines), whereby we assumed that the day of year representing the month is the first day of the month. For most function parameters the fits coincide well with the data points, and in the cases of suboptimal fit results the annual variation of the data is relatively small, indicating that our choice of parameterization is valid.
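Equations (10) to (12) can be combined into a small evaluation routine, sketched below in plain Python. The fitted coefficients are not listed in this section, so the function takes them as inputs; the names `harmonic` and `scale_height_ocean` and all numbers in the example are hypothetical, and the sketch assumes that each of the seven parameters ai, bi, and θ0 follows the cosine form of Eq. (12).

```python
import math

OMEGA = 2.0 * math.pi / 365.0  # annual angular frequency in rad per day


def harmonic(coeffs, day_of_year):
    """Eq. (12): annual plus semi-annual cosine and a constant offset."""
    c1, c2, c3, c4, c5 = coeffs
    return (c1 * math.cos(c2 + OMEGA * day_of_year)
            + c3 * math.cos(c4 + 2.0 * OMEGA * day_of_year)
            + c5)


def scale_height_ocean(vcd, latitude, day_of_year,
                       a_coeffs, b_coeffs, theta0_coeffs):
    """Eqs. (10) and (11): H_ocean = alpha(theta, t) * VCD + beta(theta, t).

    a_coeffs and b_coeffs each hold three 5-tuples (for a0..a2 and b0..b2);
    theta0_coeffs is a single 5-tuple.
    """
    a0, a1, a2 = (harmonic(c, day_of_year) for c in a_coeffs)
    b0, b1, b2 = (harmonic(c, day_of_year) for c in b_coeffs)
    theta0 = harmonic(theta0_coeffs, day_of_year)

    alpha = a0 + a1 * latitude + a2 * latitude ** 2
    beta = b0 + b1 * latitude + b2 * abs(latitude - theta0)
    return alpha * vcd + beta


# Made-up coefficients without seasonal or latitudinal variation:
# alpha = 0.04 and beta = 1.0 everywhere.
const = lambda value: (0.0, 0.0, 0.0, 0.0, value)
h_ocean = scale_height_ocean(
    vcd=30.0, latitude=10.0, day_of_year=180,
    a_coeffs=[const(0.04), const(0.0), const(0.0)],
    b_coeffs=[const(1.0), const(0.0), const(0.0)],
    theta0_coeffs=const(0.0),
)
```

Seven time-dependent parameters with five coefficients each give the 35 coefficients that have to be fitted for the ocean parameterization.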

Figure 9. 2D histograms of the distribution between the parameterized scale height and the COSMIC scale height over ocean for selected global latitude zones.

Altogether, we have to fit 35 parameters to the complete data set of calculated COSMIC scale heights for the parameterization of the scale height over ocean. The goodness of the parameterization in approximating the scale height is illustrated in Fig. 9 for different latitude zones. For the latitude zones including the tropics (−15 to 15° N) and subtropics (−35 to 35° N) we find a good agreement between the parameterization and the calculated COSMIC scale height, with R² of 0.72 and 0.60 respectively. However, including higher latitudes in the evaluation, i.e. midlatitudes (−60 to 60° N) and polar regions (−90 to 90° N), leads to an increased scatter and a worsening of the parameterization (R² of 0.45 and 0.44 respectively). This inferior agreement is likely caused by the larger atmospheric variability in the midlatitudes (e.g. stronger atmospheric dynamics) as well as an increased uncertainty in the COSMIC water vapour profile measurements due to lower water vapour concentrations.

### 3.3.2 Land

Figure 7 already revealed much larger scatter in the distribution of COSMIC TCWV and COSMIC scale height for data over land, indicating that the water vapour profile shape over land surface is less homogeneous than over ocean, likely due to further heterogeneously distributed water vapour sources, such as evapotranspiration by plants and soil. Thus, the H2O VCD and scale height are likely to be dependent on the amount of vegetation; i.e. high vegetation is associated with high evapotranspiration and high water vapour concentrations near the ground, and thus the scale height should be close to the scale height over ocean. In contrast, low amounts of vegetation are associated with less evapotranspiration and a usually drier atmosphere, indicating that the scale height should be higher than over ocean.

To quantify the amount of vegetation, we use the Normalized Difference Vegetation Index (NDVI), where a value of 1.0 indicates high vegetation and a value around 0.0 indicates low vegetation. As data source for the NDVI, we use data from the MODIS Aqua MYD13C2 Version 6 product and continue as follows: first, we calculate the parameterized scale height Hocean assuming an ocean surface globally. Then we calculate the ratio of the calculated COSMIC scale height over land Hland and the parameterized scale height Hocean.

Figure 10. 2D histograms of the distribution of the ratio Hland/Hocean against the NDVI for different filtered data sets: panel (a) includes all data points, panel (b) includes all points except those with MODIS landcover type 15 (corresponding to deserts), and panel (c) includes all points except landcover types 7 (corresponding to open shrublands) and 15. The red solid line represents the fit result using the Siegel algorithm with details of the fit results in the legends of each panel.

Figure 10 shows the ratio $H_{\mathrm{land}}/H_{\mathrm{ocean}}$ as a function of the NDVI for data sets filtered by different landcover types; the solid red lines represent the robust regression results (summarized in Table 3) using the model from . Panel (a) depicts the distribution for which no filter is applied. Except for low NDVI values, a linear relation between the ratio and NDVI is observable; however, for NDVI values around 0.1 the ratio varies strongly, between 0.7 and 3.0. In panel (b) we use the landcover classification from the MODIS Aqua MCD12C1 Version 6 product to exclude measurements at locations classified as landcover type 15 (corresponding to a desert). With this filter the ratio only varies between 0.7 and 1.5, with a weak dependence on the NDVI. If we additionally exclude locations of landcover type 7 (corresponding to open shrublands; panel c), the fit results of the robust regression change only slightly compared to the first filtered data set.

Table 3. Fit results of the robust regression between the ratio of scale heights $H_{\mathrm{land}}/H_{\mathrm{ocean}}$ and the NDVI for different filtered data sets.

Hence the scale height over land $H_{\mathrm{land}}$ can be approximated as the scale height over ocean $H_{\mathrm{ocean}}$ multiplied by a first-order polynomial of the NDVI:

$\begin{array}{}\text{(13)}& {H}_{\mathrm{land}}={H}_{\mathrm{ocean}}\left(\mathrm{VCD},\mathit{\theta },t\right)\cdot \left({\mathit{\gamma }}_{\mathrm{land}}+{\mathit{\delta }}_{\mathrm{land}}\cdot \mathrm{NDVI}\right),\end{array}$

whereby in the following we use the results for the data set filtered for landcover types 7 and 15 globally. Since regions of landcover types 7 or 15 are usually arid, the retrieved H2O VCD is small, and thus the error due to an inadequate parameterization of the AMF is much smaller than the fit error of the spectral analysis.
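As a minimal sketch of Eq. (13), the land scale height can be computed from an ocean scale height and the NDVI as shown below. The function name and the two coefficient values are hypothetical placeholders chosen only for illustration; they are not the fitted values of Table 3, and the ocean scale height is passed in as a number instead of being evaluated from the ocean parameterization.

```python
def scale_height_land(h_ocean_km, ndvi, gamma_land, delta_land):
    """Eq. (13): H_land = H_ocean * (gamma_land + delta_land * NDVI)."""
    return h_ocean_km * (gamma_land + delta_land * ndvi)


# Placeholder coefficients for illustration only (NOT the values of Table 3).
GAMMA_LAND = 1.2
DELTA_LAND = -0.3

# Same ocean scale height, sparse versus dense vegetation.
h_sparse = scale_height_land(2.0, 0.1, GAMMA_LAND, DELTA_LAND)
h_dense = scale_height_land(2.0, 0.8, GAMMA_LAND, DELTA_LAND)
print(f"sparse vegetation: {h_sparse:.2f} km, dense vegetation: {h_dense:.2f} km")
```

With a negative slope, as assumed here, the sketch reproduces the qualitative behaviour described in Sect. 3.3.2: sparse vegetation gives a larger scale height than dense vegetation.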

## 3.4 Iterative retrieval scheme

For the calculation of the H2O VCD we precomputed AMF look-up tables (LUTs) for water vapour profile shapes with scale heights ranging from 0.5 to 5.0 km. These LUTs are then used within an iterative retrieval scheme based on a fixed-point iteration according to Steffensen's method:

$\begin{array}{}\text{(14)}& \begin{array}{rl}{\mathrm{VCD}}_{i+\mathrm{1}}& ={\mathrm{VCD}}_{i}\\ & -\frac{{\left[f\left({\mathrm{VCD}}_{i}\right)-{\mathrm{VCD}}_{i}\right]}^{\mathrm{2}}}{f\left(f\left({\mathrm{VCD}}_{i}\right)\right)-\mathrm{2}\cdot f\left({\mathrm{VCD}}_{i}\right)+{\mathrm{VCD}}_{i}},\end{array}\end{array}$

where f is a function that calculates the scale height for a given VCD using Eqs. (10) and (13), applies it to the precomputed AMF look-up tables, and from that returns a new VCD. The advantage of Steffensen's method is that it does not need a derivative and is able to determine the fixed point even for the case of a non-contractive function. For the first guess we derive the initial VCD from the SCD using a geometric AMF $\left({\mathrm{AMF}}_{\mathrm{geo}}=\frac{\mathrm{1}}{\mathrm{cos}\left(\mathrm{SZA}\right)}+\frac{\mathrm{1}}{\mathrm{cos}\left(\mathrm{VZA}\right)}\right)$ and stop the iteration as soon as the logarithmic difference between two consecutive results is smaller than 5 % (approximately 1 kg m−2 assuming an average H2O VCD of 20 kg m−2) or after six iteration steps. We also checked other values for the first guess and could confirm that the convergence of the iterative scheme is independent of them.
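The iteration of Eq. (14) with the stopping rule described above can be sketched as follows. This is an illustration under stated assumptions, not the retrieval code: `toy_amf` and `toy_f` are hypothetical stand-ins for the scale height parameterization and the AMF look-up tables, and all numbers are arbitrary.

```python
import math


def geometric_amf(sza_deg, vza_deg):
    """Geometric AMF used for the first guess: 1/cos(SZA) + 1/cos(VZA)."""
    return 1.0 / math.cos(math.radians(sza_deg)) + 1.0 / math.cos(math.radians(vza_deg))


def steffensen_vcd(f, vcd0, log_tol=0.05, max_steps=6):
    """Fixed-point iteration VCD = f(VCD) following Eq. (14).

    Stops when |ln(VCD_new / VCD_old)| < log_tol or after max_steps steps.
    Returns the last VCD and the number of steps taken.
    """
    vcd = vcd0
    for step in range(1, max_steps + 1):
        f1 = f(vcd)
        f2 = f(f1)
        denom = f2 - 2.0 * f1 + vcd
        if denom == 0.0:
            # f(f(x)) - 2 f(x) + x vanishes: numerically at the fixed point.
            return f2, step
        new_vcd = vcd - (f1 - vcd) ** 2 / denom
        if new_vcd <= 0.0:
            raise ValueError("iteration left the physical range (VCD <= 0)")
        if abs(math.log(new_vcd / vcd)) < log_tol:
            return new_vcd, step
        vcd = new_vcd
    return vcd, max_steps


# Hypothetical stand-in for "scale height from VCD, then AMF from the LUT".
def toy_amf(vcd):
    return 2.0 + 0.5 * math.exp(-vcd / 30.0)


SCD = 50.0  # arbitrary slant column in kg m-2 for the example


def toy_f(vcd):
    return SCD / toy_amf(vcd)


first_guess = SCD / geometric_amf(sza_deg=30.0, vza_deg=0.0)
vcd, steps = steffensen_vcd(toy_f, first_guess)
print(f"first guess {first_guess:.2f}, result {vcd:.2f} after {steps} step(s)")
```

Each step needs two evaluations of f and no derivative, which is why the method fits a look-up-table based forward function.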

Figure 11. Comparison of the H2O VCD calculated using a global constant a priori profile shape of 2 km (a) and the iterative scale height method (b) for all-sky conditions. Panel (c) illustrates the water vapour scale height estimated within the retrieval's VCD conversion. All panels show an atmospheric river hitting the eastern Pacific/western US coast on 13 February 2019. Invalid pixels are coloured grey.

Figure 11 compares the H2O VCD distributions obtained with a global constant a priori water vapour profile shape (panel a) with a scale height of 2 km (in accordance with ) and with the iterative scale height approach (panel b) for all-sky conditions (i.e. no cloud filter applied) during an atmospheric river event at the western coast of the US on 13 February 2019. Figure 11c depicts the distribution of the water vapour scale height obtained during the iterative VCD conversion. The water vapour scale height varies strongly along the orbit and differs distinctly from 2 km, causing large deviations between the two approaches, particularly at pixels with high TCWV values and for clouded pixels. However, in contrast to the approach with a constant scale height, the iterative approach still gives reasonable TCWV results and does not exceed 80 kg m−2.

Figure 12. Comparison of the H2O VCD calculated using a global constant a priori profile shape of 2 km (first from left), ERA-5 profiles (second from left), and the iterative scale height method (third from left) for clear-sky (effective cloud fraction <20 %; top row) and cloudy-sky conditions (effective cloud fraction ≤100 %; bottom row) with TCWV from SSMIS f16 (right) for the same scenery as in Fig. 11. Invalid pixels are coloured grey. The solid black lines indicate the edges of the TROPOMI swath.

Figure 12 illustrates the H2O VCD distributions from calculations using constant, ERA-5, and iterative profile shapes for the same scenario for clear-sky (effective cloud fraction CF<20 %, top row) and all-sky (CF≤100 %, bottom row) conditions. For ERA-5 we used the data provided by on an hourly $0.25^{\circ}\times 0.25^{\circ}$ grid and interpolated the model profile data to the TROPOMI pixel centre coordinates. In addition to the TROPOMI H2O VCDs, Fig. 12 also depicts the TCWV distribution from the microwave sensor SSMIS f16, which has a temporal difference of around +2.3 h.

For the clear-sky case (top row) the VCD distributions of all profile approaches are almost identical, although for the constant scale height approach (first panel from the left) very high VCDs (exceeding 80 kg m−2) can be observed at the edges of the cloudy regions in the northern subtropics. For the all-sky case (bottom row) the differences between the approaches are largest in cloudy regions: for instance, in the region of the atmospheric river, the constant and ERA-5 profiles distinctly overestimate the VCD, with values exceeding 80 kg m−2. In contrast, even under these unfavourable observation conditions the iterative approach still gives reasonable VCD values. Furthermore, the iterative approach shows an overall good agreement with the SSMIS observations.

Figure 13. Mean normalized Box-VCD profiles of ERA-5 and the iterative scale height approach for cases of distinctive VCD disagreement within the region of the atmospheric river. Panel (a) illustrates the mean of the selected profiles from the ground up to 15 km and panel (b) the mean of the same profiles, but from the cloud top up to 6 km above the cloud top. The solid lines indicate the mean profiles and the shaded areas the corresponding 1σ standard deviation.

Taking a closer look at the reasons for the deviations of the results retrieved for the ERA-5 profiles, Fig. 13 depicts the mean of the normalized water vapour profiles of ERA-5 and the iterative scale height approach for the AR region (around 30° N). Figure 13a shows the water vapour profile from the ground up to 15 km. In comparison to the iterative approach, ERA-5 is much drier above approximately 2.5 km for these particular cases, indicating that ERA-5 might systematically underestimate the water vapour content above the cloud within the region of the atmospheric river. This finding is further supported by Fig. 13b, which illustrates the normalized water vapour profiles above the cloud top: ERA-5 profiles are close to 0 and show only small variations, whereas the profiles of the iterative approach indicate higher water vapour concentrations along with a much higher variability. One potential reason for the discrepancies of ERA-5 could be missing observational input data for the reanalysis: without observations, the reanalysis model is dominated by its a priori information (e.g. a climatological mean), so that it can deviate systematically from the real atmosphere. However, further investigations of possible ERA-5 biases are beyond the scope of this paper.

4 Evaluation of different surface albedo input data

The surface albedo has a strong impact on the radiative transfer and thus also on the AMF. Hence we investigated the impact of different albedo products on the TCWV retrieval: the OMI monthly (a) mean and (b) minimum Lambertian equivalent reflectance (LER) at 442 nm from and (c) MODIS Aqua blue surface reflectance from the MODIS MYD13C2 Version 6 product. The MODIS reflectance covers a broad spectral window from 459 to 479 nm. Thus, to account for the different spectral windows of the albedo products, we scale the MODIS albedo by a factor of 0.9. This factor was estimated by calculating the ratio between 472 and 442 nm of the OMI yearly minimum LER over parts of Australia where cloud contamination is generally low and hence the OMI LER has reasonably accurate values.

Figure 14. Comparison of the effect of different land albedo input data on the mean H2O VCD for boreal summer 2018 (a: OMI monthly mean LER, b: OMI monthly minimum LER, c: scaled MODIS Aqua blue surface reflectance). Only pixels with an effective cloud fraction smaller than 20 % are included.

Figure 14 illustrates the global mean H2O VCD of boreal summer 2018 for the different albedo input data over land (top row: monthly mean OMI LER, middle row: monthly minimum OMI LER, bottom row: scaled monthly MODIS Aqua blue surface reflectance). In the tropical and subtropical regions the OMI albedos cause a distinctive separation of the VCDs between land and ocean, in particular at the coasts of South America, Africa, and Indonesia. These aforementioned regions are often affected by cloud cover, which might cause the OMI albedo statistics to be unable to filter cloudy cases correctly, so that cloud-contaminated observations are used within the albedo calculations. As a consequence, the values in the OMI albedo are too high and lead to an overestimation of the AMF, which in turn causes an underestimation of the H2O VCD.

In contrast, MODIS pixels have a much higher spatial resolution and MODIS' NIR channels are more sensitive to cloud contamination, yielding a higher sample size and allowing for correct cloud filtering. Hence, the H2O VCD distribution using the MODIS surface reflectance results in a much smoother transition from ocean to land and in general much higher VCD values over land along the Equator. Thus, in the following we use a combination of the MODIS and OMI albedos: the scaled MODIS Aqua blue surface reflectance over land and the monthly minimum OMI albedo over ocean.
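The combined albedo described above can be sketched per pixel as follows. The function and argument names are hypothetical, and the sketch assumes that the land flag and both albedo values have already been sampled at the pixel location; only the factor 0.9 and the land/ocean choice are taken from the text.

```python
# Scaling of the MODIS blue reflectance (459-479 nm) to the 442 nm window.
MODIS_TO_442NM_SCALE = 0.9


def combined_albedo(is_land, modis_blue_reflectance, omi_min_ler):
    """Scaled MODIS reflectance over land, OMI monthly minimum LER over ocean."""
    if is_land:
        return MODIS_TO_442NM_SCALE * modis_blue_reflectance
    return omi_min_ler


# Arbitrary example values for one land and one ocean pixel.
land_albedo = combined_albedo(True, modis_blue_reflectance=0.05, omi_min_ler=0.08)
ocean_albedo = combined_albedo(False, modis_blue_reflectance=0.05, omi_min_ler=0.08)
print(land_albedo, ocean_albedo)
```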

5 Uncertainty estimation

The error budget of the H2O VCD is determined by the propagation of the main error sources of the fitted SCD and the precalculated AMF. Errors in the SCD are mainly caused by random errors, like the photon noise, and systematic errors, e.g. the uncertainty of the absorption cross section, whereas errors in the AMF are mostly systematic with random contributions.

## 5.1 Uncertainties in the slant column density

Table 4 summarizes the different error sources for the H2O SCD and the corresponding estimated uncertainties. As demonstrated in Sect. 2.1 and Appendix B the water vapour absorption cross section varies systematically between the different HITRAN versions. Hence, we assumed that the uncertainty of the water vapour cross section is of the same order of magnitude as the changes between the different cross-section versions, i.e. approximately 10 %. Considering the LP-DOAS comparisons (see Sect. 2.1 and Appendix B) we estimate these errors to be around 5 % for this study.

Table 4. Summary of the different error sources considered in the H2O SCD uncertainty.

Figure 15. Histograms of the standard H2O SCD fit error distribution for small ($\mathrm{SZA}<20^{\circ}$, a, c) and large ($70^{\circ}<\mathrm{SZA}\le 90^{\circ}$, b, d) solar zenith angles for relatively small (<3 %, orange) and high (>15 %, blue) surface albedo values for clear-sky (cloud fraction <5 %, a, b) and cloudy-sky (cloud fraction >20 %, c, d) conditions. The coloured dashed lines represent the median of the respective distributions and their values are given in the legend of each panel.

The retrieval's spectral analysis directly yields the 1σ standard fit error of the H2O SCD, which is usually dominated by noise. For a better understanding of these fit errors, we separated them into data for small/large solar zenith angles ($\mathrm{SZA}<20^{\circ}$ and $70^{\circ}<\mathrm{SZA}\le 90^{\circ}$ respectively), low/high surface albedo (<3 % and >15 % respectively), and clear-/cloudy-sky observation conditions (CF<5 % and CF>20 % respectively). The distributions of the standard and relative fit errors of the spectral analysis are given in Figs. 15 and 16 respectively. The median values in Fig. 15 indicate that the standard errors for high SZA (around $0.3\times 10^{23}$ molec cm$^{-2}$) are twice as high as for small SZA (around $0.15\times 10^{23}$ molec cm$^{-2}$). Under clear-sky conditions the standard error for small surface albedo values is larger than for high surface albedo, but for cloudy conditions it does not depend on the surface albedo.

Figure 16. Histograms of the relative H2O SCD fit error distribution for small ($\mathrm{SZA}<20^{\circ}$, a, c) and large ($70^{\circ}<\mathrm{SZA}\le 90^{\circ}$, b, d) solar zenith angles for relatively small (<3 %, orange) and high (>15 %, blue) surface albedo values for clear-sky (cloud fraction <5 %, a, b) and cloudy-sky (cloud fraction >20 %, c, d) conditions. The coloured dashed lines represent the median of the respective distributions and the solid lines represent the location of maximal probability density (values given in the legend of each panel).

Figure 16 reveals that the relative fit errors for high SZAs are higher than for low SZAs. However, the locations of maximal probability density and the medians also indicate that the distributions are right-skewed, in particular for high SZA scenarios: for these scenarios the relative errors easily exceed values of 100 %. Nevertheless, using the locations of maximal probability density as a rule-of-thumb estimate, relative fit errors have values around 10 % for low SZAs and approximately 30 % for high SZAs.

To estimate errors associated with ISRF biases, we calculated the H2O SCDs using a Gaussian ISRF (instead of an asymmetric Super-Gaussian) for orbit 6930 and compared them to the SCDs from the “standard” retrieval setup for $\mathrm{SZA}<88^{\circ}$. The comparison depicted in Fig. S3 reveals that the SCDs using the Gaussian ISRF are highly correlated with the “standard” SCDs and only differ by approximately 1 %. Compared to the much higher fit errors, errors due to biases in the ISRF are negligible.

## 5.2 Uncertainties in the AMF

The uncertainty in the AMF depends on the uncertainty of its input parameters. Because the parameters of the viewing geometry (i.e. solar zenith angle, line-of-sight angle, and solar relative azimuth angle) are known with high accuracy, the most important uncertainties are uncertainties of the surface albedo, cloud fraction, cloud top height, and water vapour profile shape. In order to estimate the contribution of each input parameter to the overall AMF uncertainty, we define standard scenarios (summarized in Table 5) for which we calculate the AMF from the precalculated LUT. Then we vary the input parameter for each scenario according to its uncertainty assumption listed in Table 6. The uncertainties of the water vapour scale height have been derived from the fit results of the intercomparisons between the measured COSMIC scale height and the parameterized scale height over ocean (see Fig. 9) and land (see Fig. 10).

Table 5. Standard retrieval scenarios for the estimation of AMF error.

Table 6. Summary of different error sources considered in the AMF uncertainty.

Figure 17 depicts box–whisker plots of the relative AMF error due to uncertainties in surface albedo and scale height for the standard clear-sky scenarios of surface albedo, solar zenith angle, and scale height. It reveals that uncertainties in surface albedo and scale height over low vegetation have the strongest impact on the AMF and can cause AMF errors larger than 30 %, in particular for scenarios with low surface albedo or high solar zenith angle. The median AMF errors are typically around 10 %.

Figure 17. Box–whisker plots of the relative AMF errors for clear-sky conditions due to uncertainties within the retrieval's input parameters (blue: surface albedo, orange: scale height over high vegetation, green: scale height over low vegetation, red: scale height over ocean) according to the uncertainty assumptions in Table 6 and simulated for the standard scenarios of the surface albedo, solar zenith angle, and scale height given in Table 5.

Figure 18. Box–whisker plots of the relative AMF errors for cloudy-sky conditions due to uncertainties within the retrieval's input parameters according to the uncertainty assumptions in Table 6 and simulated for the standard scenarios given in Table 5.

Figure 18 illustrates box–whisker plots of the relative AMF error due to uncertainties in surface albedo, scale height, cloud fraction, and cloud top height for all standard scenarios listed in Table 5. In contrast to the clear-sky scenarios, the impact of the surface albedo uncertainties has strongly decreased, but in general the contributions of all AMF errors have increased distinctively. The main source of the AMF errors is still the uncertainty of the scale height over low vegetation, whose median values vary between 20 % and 50 % but can also cause AMF errors larger than 60 %.

Table 6 summarizes the results of the different error sources considered in the AMF uncertainty for clear- and cloudy-sky conditions. For clear-sky conditions one can typically assume a relative AMF error around 10 %–15 % and for cloudy-sky conditions around 10 %–25 %.

## 5.3 Total H2O VCD uncertainty

The total relative H2O VCD uncertainty can be approximated by

$\begin{array}{}\text{(15)}& \frac{\mathrm{\Delta }\mathrm{VCD}}{\mathrm{VCD}}=\sqrt{{\left(\frac{\mathrm{\Delta }\mathrm{AMF}}{\mathrm{AMF}}\right)}^{\mathrm{2}}+{\left(\frac{\mathrm{\Delta }\mathrm{SCD}}{\mathrm{SCD}}\right)}^{\mathrm{2}}}.\end{array}$

With our findings of typical relative AMF and H2O SCD uncertainties, the total relative VCD uncertainty is typically around 10 %–20 % for observations during clear-sky conditions, over ocean surface, and at low solar zenith angles. During partly cloudy-sky conditions, over land surface, and at high solar zenith angles the error reaches values of approximately 20 %–50 %.
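As a worked illustration of Eq. (15), the snippet below combines relative AMF and SCD uncertainties in quadrature. The input pairs are example values picked from the typical ranges quoted in Sects. 5.1 and 5.2 (they are not entries of Tables 4 or 6), and the function name is hypothetical.

```python
import math


def relative_vcd_uncertainty(rel_amf_error, rel_scd_error):
    """Eq. (15): quadrature sum of the relative AMF and SCD uncertainties."""
    return math.hypot(rel_amf_error, rel_scd_error)


# Example: clear sky, low SZA (AMF ~10 %, SCD fit error ~10 %).
clear_low_sza = relative_vcd_uncertainty(0.10, 0.10)
# Example: cloudy sky, high SZA (AMF ~25 %, SCD fit error ~30 %).
cloudy_high_sza = relative_vcd_uncertainty(0.25, 0.30)

print(f"clear sky, low SZA:   {100 * clear_low_sza:.0f} %")
print(f"cloudy sky, high SZA: {100 * cloudy_high_sza:.0f} %")
```

Both example results (about 14 % and 39 %) fall inside the 10 %–20 % and 20 %–50 % ranges stated above.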

6 Validation study

In order to evaluate the retrieval's performance, we conducted a validation study for the time ranges of boreal summer (June, July, and August) 2018 and boreal winter (December, January, and February) 2018/2019 whereby we only include clear-sky observations (i.e. pixels with an effective cloud fraction smaller than 20 %) and ice- and snow-free pixels. To avoid extreme outliers, we only include observations with an AMF>0.1. As reference data for the validation we use TCWV from the Special Sensor Microwave Imager/Sounder (SSMIS), from the reanalysis model ERA-5, and the ground-based GPS network SuomiNet. For the sake of completeness, we also briefly investigate higher cloud fractions at the end of each subsection and provide the results in the Supplement.
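The pixel selection used for the validation can be written as a simple predicate. This is a sketch with hypothetical argument names; it assumes the effective cloud fraction is given as a fraction between 0 and 1 and that a snow/ice flag is available per pixel.

```python
def passes_validation_filter(cloud_fraction, amf, snow_or_ice):
    """Keep clear-sky (CF < 20 %), snow- and ice-free pixels with AMF > 0.1."""
    return cloud_fraction < 0.20 and amf > 0.1 and not snow_or_ice


# Arbitrary example pixels: (cloud fraction, AMF, snow/ice flag).
pixels = [(0.05, 1.2, False), (0.35, 1.0, False), (0.10, 0.05, False), (0.10, 1.1, True)]
kept = [p for p in pixels if passes_validation_filter(*p)]
print(kept)
```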

As cloud input data we use the cloud information (effective cloud fraction at 440 nm and cloud top height) as well as the surface altitude from the TROPOMI L2 NO2 product and as surface albedo input data we use the combination of the modified MODIS and OMI albedo described in Sect. 4. To distinguish between ocean and land surface, we use a land–sea mask derived from GSHHS coastline data, in which we use the pixel centre coordinates for the separation into land and ocean. As the NDVI is not available over lakes, we treat them as ocean.

## 6.1 SSMIS comparison

For the evaluation we use measurements from SSMIS onboard NOAA's f16 and f17 satellites processed by Remote Sensing Systems (RSS) and provided by the NASA Global Hydrology Resource Center on a daily $0.25^{\circ}\times 0.25^{\circ}$ grid. SSMIS can observe the TCWV distribution under all-sky conditions over ocean with an accuracy of around 1 kg m−2. Since SSMIS changes its Equator crossing time (ECT), we only include SSMIS observations whose ECT is within 3 h (and 5 h for f17 respectively) with respect to TROPOMI's ECT of 13:30 LT. For the intercomparison we only include SSMIS measurements that are not affected by rain.

Figure 19. 2D histograms for the comparison between TROPOMI and SSMIS f16 (a, b) and f17 (c, d) for clear-sky conditions (CF<20 %) for boreal summer (a, c) and boreal winter (b, d), where the colour indicates the relative share of total data points. The black dotted line indicates the 1-to-1 diagonal and the red solid line represents the results of the linear regression. The parameters of the linear regression and the coefficient of determination are given in the box in each panel.

Figure 19 depicts the comparison between SSMIS (f16, top row, and f17, bottom row) and TROPOMI for boreal summer (left column) and winter (right column). For f16 (top row) the scatter is distributed closely along the 1-to-1 diagonal (dotted lines) for both seasons and the fitted regression lines (red solid lines) indicate a very good agreement between both data sets, with slopes around 0.96, intercepts around −1.6 kg m−2 for summer and −1.7 kg m−2 for winter, and coefficients of determination of R² = 0.91. For f17 the comparison reveals similar agreement, with slopes around 0.97 and intercepts around −1.5 kg m−2 with R² = 0.89 for both seasons. Overall, considering the differences in collocation time (3 and 5 h for f16 and f17 respectively), the comparison shows that the TROPOMI TCWV retrieval captures the water vapour distribution over ocean well.

Figure 20. 2D histograms of the difference (TROPOMI − SSMIS f16, a, b) and relative difference $\left(\mathrm{TROPOMI}-\mathrm{f}\mathrm{16}\right)/\mathrm{f}\mathrm{16}$ (c, d) as a function of the input cloud top height (CTH) for clear-sky conditions (CF<20 %) for summer (a, c) and winter (b, d). The blue dashed line represents the median over the whole CTH range. The blue dots represent the median within a 1 km CTH bin and the error bars represent their respective 1σ standard deviation.

Figure 21. Same as Fig. 20 but for SSMIS f17.

To investigate the influence of clouds on our retrieval, we plot the difference (top row) and relative difference (bottom row) between TROPOMI and SSMIS as a function of the input cloud top height (CTH) in Figs. 20 and 21 for f16 and f17 respectively. The median over the whole CTH range (blue dashed line) indicates an underestimation of the TROPOMI H2O VCD of approximately 12 %–13 % (2.6 kg m−2). However, the large majority of data points is distributed within the CTH bin between 0 and 1 km, revealing that the underestimation of the TROPOMI TCWV is mainly caused by low clouds. For mid-level clouds the median difference almost cancels out, whereas for high clouds it first increases and then remains almost constant with cloud top height.
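The binned medians shown as blue dots in Figs. 20 and 21 can be computed along the following lines. This is a minimal sketch with a hypothetical function name and synthetic numbers; it is not the evaluation code, and the input lists stand for collocated TROPOMI and reference TCWV values with their input cloud top heights.

```python
from collections import defaultdict
from statistics import median


def median_difference_per_cth_bin(cth_km, tropomi, reference, bin_width_km=1.0):
    """Median of (TROPOMI - reference) in cloud top height bins.

    Returns a dict mapping the lower bin edge (km) to the median difference.
    """
    bins = defaultdict(list)
    for cth, t, r in zip(cth_km, tropomi, reference):
        bins[int(cth // bin_width_km)].append(t - r)
    return {index * bin_width_km: median(diffs) for index, diffs in sorted(bins.items())}


# Synthetic example values in kg m-2 (not measured data).
cth = [0.2, 0.7, 0.9, 1.5, 2.5]
tropomi_tcwv = [18, 20, 25, 30, 22]
reference_tcwv = [20, 23, 27, 30, 21]

binned = median_difference_per_cth_bin(cth, tropomi_tcwv, reference_tcwv)
print(binned)
```

Comparing the per-bin medians with the overall median makes visible whether a global bias is driven by a single, heavily populated bin, as argued above for the 0–1 km bin.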

Further validation results for SSMIS f16 and f17 separated into different cloud fraction and cloud top height bins for July 2018 are given in Figs. S10 and S11 respectively. The results indicate that there is no dependency on cloud fraction but a distinctive dependency on cloud top height: the retrieval underestimates for clouds below 1 km, is in very good agreement for mid-level clouds (1–4 km), and overestimates for higher clouds.

## 6.2 ERA-5 comparison

For the intercomparison between the reanalysis model ERA-5 and TROPOMI, we use ERA-5 TCWV data provided by on a $0.25^{\circ}\times 0.25^{\circ}$ grid. We only take into account values which are within +1 h with respect to the sensing start time of the TROPOMI orbit and separate the data into data over ocean and data over land.

Figure 22. 2D histograms for the comparison between TROPOMI and ERA-5 for data over ocean (a, b) and over land (c, d) for clear-sky conditions (CF<20 %) for boreal summer (a, c) and boreal winter (b, d), where the colour indicates the relative share of total data points. The black dotted line indicates the 1-to-1 diagonal and the red solid line represents the results of the linear regression. The parameters of the linear regression and the coefficient of determination are given in the box in each panel.

The results of the intercomparison are summarized in Fig. 22. Over ocean (top row in Fig. 22) the results are similar to the results from the comparison between TROPOMI and SSMIS: in addition to slopes close to 0.95 and intercepts close to zero, the linear regression yields R² of 94 % for summer and 95 % for winter. Over land the linear regression still yields high values of the coefficient of determination R², but the TROPOMI retrieval generally underestimates the H2O VCD by approximately 12 % during summer (and 7 % during winter). Since the values of the correlation coefficient are still high and the values over ocean coincide very well with the reference data sets, we assume that this underestimation is caused by a systematic uncertainty within the input parameters for our retrieval.

Figure 23. 2D histograms of the difference (TROPOMI − ERA-5, a, b) and relative difference $\left(\mathrm{TROPOMI}-\text{ERA-5}\right)/\text{ERA-5}$ (c, d) as a function of the input cloud top height (CTH) for clear-sky conditions (CF<20 %) for summer (a, c) and winter (b, d) for data over ocean. The blue dashed line represents the median over the whole CTH range. The blue dots represent the median within a 1 km CTH bin and the error bars represent their respective 1σ standard deviation.

The influence of the cloud top height input is illustrated in Fig. 23 for data over ocean. The median is around −1.6 kg m−2 (−7.1 %) and −1.3 kg m−2 (−6.7 %) during summer and winter respectively, whereby similarly to SSMIS, these underestimations are caused by the majority of data points within the 0–1 km CTH bin. For increasing CTH the deviation from the reference increases and leads to an overestimation. For data over land (Fig. 24) the CTH variability is much larger than over ocean; i.e. most data points are now distributed between 0 and 3 km and the median is around values of −1.5 kg m−2 (−10.3 %) and −0.4 kg m−2 (−4.0 %) during summer and winter respectively. Furthermore, low clouds still cause an underestimation, and for mid- to high-level clouds the deviations almost cancel out, but one can also observe an increasing scatter for winter data.

Figure 24. Same as Fig. 23 but for data over land.

All these findings reveal that the combination of albedo uncertainties and uncertainties in the cloud properties (cloud fraction and cloud top height) as well as in the scale height parameterization has a distinctive influence on the AMF. The cloud products from TROPOMI rely on the OMI albedo which, as we have demonstrated in Sect. 4, has several problems over land surface. In addition, the uncertainty of the OMI albedo over land surface is higher than over ocean due to the high spatiotemporal variability of the scenery, and the differences between the monthly minimum and the monthly mean albedo are higher over land than over ocean. Furthermore, the cloud top height is calculated via the cloud top pressure and has to be combined with the surface pressure. Thus, the uncertainty of the cloud top height over land is higher than over ocean, since over ocean the topography is much simpler.

Nevertheless, the complex radiative interactions between albedo and clouds might amplify or cancel out these deviations and thus make it difficult to draw clear conclusions.

As for the SSMIS comparison, further validation results for ERA-5 over ocean and land separated into different cloud fraction and cloud top height bins for July 2018 are given in Figs. S12 and S13.

Similarly to SSMIS, the results over ocean reveal an underestimation for low clouds, an overestimation for high clouds, and almost no dependency on cloud fraction. Over land low clouds still cause an underestimation; however, for cloud top heights above 2 km the retrieval shows very good agreement with ERA-5, indicating that the input cloud top height for our retrieval is too low.

## 6.3 SuomiNet/GPS comparison

For the intercomparison with TCWV from ground-based GPS we use data from the SuomiNet network provided by UCAR. SuomiNet stations are distributed over North and Central America and provide data every 30 min with a typical accuracy of 2 kg m−2. Thus, we only take into account TROPOMI pixels within a distance of 0.1° to the GPS station and within 2 h with respect to the GPS measurement.

Figure 25. Scatter plots of the intercomparisons between TROPOMI and SuomiNet for clear-sky conditions (CF<20 %) for boreal summer (a) and boreal winter (b). The black dashed line indicates the 1-to-1 diagonal and the orange solid line represents the results of the robust regression based on . The parameters of the regression and the coefficient of correlation are given in the box in each panel.

Figure 25 illustrates scatter plots of the intercomparison between TROPOMI and SuomiNet for boreal summer and winter. For both seasons the robust regression indicates an underestimation of around 20 % (i.e. slopes of 0.82 and 0.84) with high Pearson correlation coefficients of 88 %. In order to investigate the influence of clouds on our retrieval, we plot the difference (top row) and the relative difference (bottom row) between TROPOMI and SuomiNet as a function of the input cloud top height (CTH) in Fig. 26. The median over the whole CTH range (blue dashed line) indicates an underestimation of the TROPOMI H2O VCD of approximately 14 % (3.5 kg m−2) during summer and of 8 % (0.8 kg m−2) during winter. However, during summer the median values for each 1 km CTH bin (blue dots) reveal that the underestimation is mainly caused by low clouds, whereas for mid- and high-level clouds the median difference almost cancels out. During winter this pattern is not clearly observable due to much larger scatter, but here too the underestimation in TCWV is mainly caused by low clouds, and the difference is generally within the range of accuracy of the SuomiNet retrieval.

Figure 26. Scatter plot of the difference (TROPOMI − SuomiNet, a, b) and relative difference $\left(\mathrm{TROPOMI}-\mathrm{SuomiNet}\right)/\mathrm{SuomiNet}$ (c, d) as a function of the input cloud top height (CTH) for clear-sky conditions (CF<20 %) for summer (a, c) and winter (b, d). The blue dashed line represents the median over the whole CTH range. The blue dots represent the median within a 1 km CTH bin and the error bars represent their respective 1σ standard deviation.

Figure S14 depicts further validation results separated into different cloud fraction and cloud top height bins for boreal summer 2018. Though the sample size is much smaller, similar results to SSMIS and ERA-5 are obtained: independent of the cloud fraction, low clouds cause an underestimation of around 15 %–20 %, whereas for mid-level clouds the TROPOMI H2O VCDs show much better agreement with the SuomiNet TCWV, and for high clouds TROPOMI overestimates by around 10 %.

7 Summary and conclusions

In this paper, we introduce a total column water vapour retrieval from TROPOMI spectra in the visible blue spectral range using an iterative vertical column conversion scheme and provide a detailed characterization of our retrieved H2O VCD by performing a detailed uncertainty analysis and intercomparisons to reference data sets from the microwave sensor SSMIS, from the reanalysis model ERA-5, and from the ground-based GPS network SuomiNet.

For the iteration scheme we describe the a priori water vapour profile as an exponential decay with a scale height H and develop an empirical parameterization for this scale height. This parameterization is based on COSMIC water vapour profile data and relates the a priori water vapour profile shape to the H2O VCD, the seasonal cycle, the latitude, and the vegetation (via the NDVI). We demonstrate that we can correctly reproduce the scale heights, in particular for data at low latitudes (tropics and subtropics). However, we also observe an increasing scatter if higher latitudes are included in the comparison, likely because of the higher variability in H2O VCD due to midlatitudinal cyclone dynamics and a generally higher uncertainty in the COSMIC profile data for drier atmospheric conditions. Overall, the retrieved profile heights are very reasonable, and we obtain a substantial improvement using the new parameterization compared to the use of a prescribed constant water vapour profile.

For the uncertainty analysis we investigated the impact of several error sources on the H2O SCD and AMF, like clouds, surface albedo, profile shape, and instrument properties. The error estimation reveals that the main SCD uncertainty is the fit error of the spectral analysis and that the main AMF uncertainties are caused by uncertainties in the surface albedo and water vapour profile shape. For the H2O VCD we estimated a typical total relative error of around 10 %–20 % for observations during clear-sky conditions, over ocean surface, and at low solar zenith angles. For observations during cloudy-sky conditions, over land surface, and high solar zenith angles the error reaches values of approximately 20 %–50 %. Thus, the theoretically estimated errors are of the same order of magnitude as the deviations found during the retrieval's evaluation. However, uncertainties in the absorption cross section of water vapour are a further systematic error source that can additionally contribute up to 10 %. Based on the LP-DOAS comparisons we estimate these errors to be around 5 % for this study, so that they are negligible compared to the other error sources.

In the validation study we demonstrate that for clear-sky conditions the retrieved TROPOMI H2O VCDs over ocean are in very good agreement with the reference data sets and can correctly capture the global water vapour distribution. Over land the TROPOMI retrieval can reproduce the TCWV distribution; however, we also observe a distinctive underestimation of around 10 %, in particular during boreal summer.

Nevertheless, these underestimations might be caused by the uncertainties of the external input data for the retrieval: for instance, the OMI LERs from Kleipool et al. (2008) are too high over tropical land masses, likely due to incorrect cloud filtering, which causes too high AMFs and thus too low H2O VCDs. Although we tried to overcome this issue by using a surface reflectance product from MODIS Aqua, the cloud products from the TROPOMI L2 NO2 product still rely on the OMI LER to calculate the effective cloud fraction and cloud top height and thus also have a large uncertainty. The intercomparisons to the reference data sets show that these uncertainties in the cloud products have a substantial impact on the H2O VCD: our investigations reveal that the input cloud top height is probably too low, which in turn leads to higher AMFs and consequently to an underestimation in TCWV. However, one has to consider that the radiative properties of the cloud and albedo products interact at a high degree of complexity, so that a clear explanation or suggestion on how to overcome these issues is beyond the scope of this paper. Because of all these uncertainties we recommend for general purposes to only use VCDs with an effective cloud fraction <20 % and AMF>0.1, which represents a good compromise between spatial coverage and retrieval accuracy.
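The recommended selection can be applied as a simple per-pixel mask. The snippet below is a minimal sketch, assuming that the effective cloud fraction is stored as a fraction between 0 and 1; the function name `passes_filter`, the dictionary keys, and the pixel values are hypothetical.

```python
def passes_filter(cloud_fraction, amf, cf_max=0.20, amf_min=0.1):
    """Return True if a pixel satisfies CF < 20 % and AMF > 0.1."""
    return cloud_fraction < cf_max and amf > amf_min


# Illustrative pixels only (not data from the study); VCD in kg m-2.
pixels = [
    {"vcd": 32.1, "cf": 0.05, "amf": 1.40},  # clear sky, kept
    {"vcd": 18.4, "cf": 0.35, "amf": 1.10},  # too cloudy, rejected
    {"vcd": 25.0, "cf": 0.10, "amf": 0.05},  # AMF too small, rejected
]

selected = [p for p in pixels if passes_filter(p["cf"], p["amf"])]
```

Both thresholds are strict inequalities, so a pixel with a cloud fraction of exactly 20 % is rejected in this sketch.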

Overall, the successful application of the TCWV retrieval in the visible blue spectral range to TROPOMI measurements is very promising for further investigations, including the application to further satellite sensors such as OMI, SCIAMACHY, and GOME-1/2 or the upcoming Sentinel-4 and Sentinel-5 instruments and the extension of the retrieval to measurements contaminated by higher cloud fractions. As the retrieval allows for fast processing of large data sets, investigations of long-term trends using a TCWV data set of merged time series of different satellite sensors are readily possible. However, since these data sets have to be uniform, they require consistent input data across the different satellite sensors, in particular for cloud products.

Appendix A: Weighted linear least-squares fit for spectral analysis

To handle the high daily data volume of TROPOMI and to avoid “jumps” of pixels included in the fit window, we implemented a weighted linear least-squares fit for the DOAS analysis. The weights W are the fractional coverage of the pixel within the fit window (see also Fig. A1):

$\begin{array}{}\text{(A1)}& W\left(\mathit{\lambda }\right)=\left\{\begin{array}{ll}\mathrm{1}-\frac{\left|\mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{low}}\right|}{\mathrm{\Delta }\mathit{\lambda }}& \frac{\left|\mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{low}}\right|}{\mathrm{\Delta }\mathit{\lambda }}<\mathrm{1}\wedge \mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{low}}<\mathrm{0},\\ \mathrm{1}& {\mathit{\lambda }}_{\mathrm{low}}<\mathit{\lambda }<{\mathit{\lambda }}_{\mathrm{up}},\\ \mathrm{1}-\frac{\left|\mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{up}}\right|}{\mathrm{\Delta }\mathit{\lambda }}& \frac{\left|\mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{up}}\right|}{\mathrm{\Delta }\mathit{\lambda }}<\mathrm{1}\wedge \mathit{\lambda }-{\mathit{\lambda }}_{\mathrm{up}}>\mathrm{0},\\ \mathrm{0}& \mathrm{else},\end{array}\right.\end{array}$

with λlow and λup the lower and upper boundaries of the fit window and Δλ the average wavelength increment within the fit window. The elements of the weight vector w are then given as ${w}_{i}=\sqrt{W\left({\mathit{\lambda }}_{i}\right)}$. Hence Eq. (2) can be solved by simple linear algebra:

$\begin{array}{c}{\mathbf{y}}^{\prime }={\mathbf{M}}^{\prime }\mathbf{x},\\ \hat{\mathbf{x}}={\left({{\mathbf{M}}^{\prime }}^{T}{\mathbf{M}}^{\prime }\right)}^{-\mathrm{1}}{{\mathbf{M}}^{\prime }}^{T}{\mathbf{y}}^{\prime },\\ \mathbf{S}={\left({{\mathbf{M}}^{\prime }}^{T}{\mathbf{M}}^{\prime }\right)}^{-\mathrm{1}}{\mathit{\chi }}^{\mathrm{2}},\\ {\mathit{\beta }}_{i}=\sqrt{{S}_{ii}},\end{array}$

with the solution of the linear problem $\hat{\mathbf{x}}$ containing the SCDs, ${\mathbf{y}}^{\prime }=\mathrm{diag}\left(\mathbf{w}\right)\mathbf{y}$ the weighted measurement spectrum, ${\mathbf{M}}^{\prime }=\mathrm{diag}\left(\mathbf{w}\right)\mathbf{M}$ the weighted absorption structures to fit, βi the estimated 1σ fit error of the results for each fitted parameter, and χ2 the reduced chi square.

Figure A1. Schematic illustration of the weights used during the retrieval's spectral analysis.
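The weighting and the linear solution described in this appendix can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the retrieval code itself: the function names, the synthetic design matrix, and the wavelength grid are hypothetical; the fit window is treated as a closed interval; and the degrees of freedom of the reduced chi square are taken as the number of pixels with non-zero weight minus the number of fitted parameters, which is only one possible convention.

```python
import numpy as np


def window_weights(lam, lam_low, lam_up):
    """Fractional-coverage weights W(lambda) in the spirit of Eq. (A1)."""
    lam = np.asarray(lam, dtype=float)
    inside = (lam >= lam_low) & (lam <= lam_up)  # closed interval: a choice of this sketch
    dlam = np.mean(np.diff(lam[inside]))         # average wavelength increment
    below = (lam < lam_low) & (lam_low - lam < dlam)
    above = (lam > lam_up) & (lam - lam_up < dlam)
    W = np.zeros_like(lam)
    W[inside] = 1.0
    W[below] = 1.0 - (lam_low - lam[below]) / dlam
    W[above] = 1.0 - (lam[above] - lam_up) / dlam
    return W


def weighted_lsq(M, y, W):
    """Solve y' = M' x with M' = diag(w) M, y' = diag(w) y, w = sqrt(W)."""
    w = np.sqrt(W)
    Mw = w[:, None] * M
    yw = w * y
    cov_unscaled = np.linalg.inv(Mw.T @ Mw)
    x_hat = cov_unscaled @ Mw.T @ yw
    resid = yw - Mw @ x_hat
    # Degrees of freedom: pixels with non-zero weight minus fitted parameters
    # (one possible convention, assumed here).
    dof = np.count_nonzero(W) - M.shape[1]
    chi2 = resid @ resid / dof
    S = cov_unscaled * chi2
    return x_hat, np.sqrt(np.diag(S))


# Synthetic, noise-free demonstration (all numbers are illustrative).
lam = np.linspace(429.0, 451.0, 45)              # 0.5 nm sampling
W = window_weights(lam, lam_low=430.2, lam_up=449.8)
M = np.column_stack([np.sin(lam), np.ones_like(lam), lam - 440.0])
x_true = np.array([0.3, 1.0, -0.02])
y = M @ x_true
x_hat, beta = weighted_lsq(M, y, W)
```

Because the pixels just outside the window receive a weight between 0 and 1 instead of being switched on or off, a small spectral shift changes their contribution smoothly, which is the motivation given above for avoiding "jumps".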

Appendix B: Evaluation of the water vapour absorption cross section

Figure B1 depicts intercomparisons between LP-DOAS and meteorological measurements of water vapour volume mixing ratios (WVMRs) at different altitudes (10, 40, and 200 m) at the CESAR Tower for daytime and nighttime during the Cabauw Intercomparison of Nitrogen Dioxide Measuring Instruments 2 (CINDI-2) campaign. The results of the regression methods indicate that for every altitude the LP-DOAS underestimates WVMRs by around 17 % during day and 11 % during night. These findings independently confirm the results of further LP-DOAS measurements taken at the Cape Verde Atmospheric Observatory, for which Lampel et al. (2015) observed an underestimation of around 8 % when using the water vapour line lists from HITRAN 2012. However, when using the water vapour line lists from HITRAN 2008, Lampel et al. (2015) observe an excellent agreement with the reference meteorological measurements at the observatory (see Table 8 in their paper).

Figure B2 compares the absorption cross sections of the different HITRAN versions. For the high-resolution cross section (panel a), the differences between the versions are hardly visible; however, after the convolution with the TROPOMI ISRF (panel b), distinctive differences in the peak absorption are clearly visible: in comparison to HITRAN 2008, the absorption peak of HITRAN 2012 is approximately 7 %–9 % higher, and the absorption peak of HITRAN 2016 is approximately 7 %–9 % lower.

Figure B1. Scatter plots of water vapour volume mixing ratios (WVMRs) derived from LP-DOAS measurements and meteorological measurements at different altitudes (10, 40, and 200 m) at the CESAR Tower for day and night during the CINDI-2 campaign. Water vapour absorption cross sections have been calculated from the HITRAN 2012 line list. The dashed red line represents the 1-to-1 diagonal, the solid blue line the results from the robust regression (Siegel, 1982), and the solid orange line the results from the weighted linear regression.

Figure B2. Comparison of the water vapour absorption cross section derived from different HITRAN versions (2008, 2012, and 2016) for a temperature of 296 K. Panel (a) depicts the high-resolution cross sections and the difference between HITRAN 2008 and HITRAN 2012. Panel (b) depicts the same cross sections but convolved with a typical TROPOMI Super-Gaussian ISRF (values from Beirle et al., 2017).
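The convolution step from panel (a) to panel (b) can be illustrated with a short NumPy sketch. It assumes a super-Gaussian line shape of the form exp(−|x/w|^k) on a uniform wavelength grid; the parameters `w` and `k`, the kernel half-width, and the synthetic absorption line are illustrative placeholders and not the TROPOMI values used in the study.

```python
import numpy as np


def super_gaussian(x, w, k):
    """Super-Gaussian line shape exp(-|x/w|^k), normalised to unit sum."""
    s = np.exp(-np.abs(x / w) ** k)
    return s / s.sum()


def convolve_isrf(sigma_highres, step_nm, w, k, half_width_nm=2.0):
    """Convolve a cross section on a uniform grid with a super-Gaussian ISRF."""
    n = int(round(half_width_nm / step_nm))
    x = np.arange(-n, n + 1) * step_nm        # symmetric kernel grid
    kernel = super_gaussian(x, w, k)
    return np.convolve(sigma_highres, kernel, mode="same")


# Synthetic narrow absorption line (arbitrary units, illustrative only).
lam = np.linspace(440.0, 450.0, 1001)         # 0.01 nm grid
sigma = np.exp(-0.5 * ((lam - 445.0) / 0.02) ** 2)
sigma_conv = convolve_isrf(sigma, step_nm=0.01, w=0.3, k=2.5)
```

The convolution lowers and broadens the peak while preserving the integrated absorption, so that differences between line lists which are hardly visible at high resolution can show up as differences in the peak absorption at instrument resolution.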

Combining these findings with the shortcomings of HITRAN 2016 indicated by and the observational evidence from the LP-DOAS measurements, we conclude that it is most adequate to use the water vapour line list from HITRAN 2008.

Appendix C: Evaluation of methods for calculating water vapour scale height

The water vapour scale height can be calculated in different ways. Here, we compare two different approaches: the first method is the calculation of the scale height via a weighted non-linear fit:

$\begin{array}{c}\min \sum _{i}\frac{{\left({y}_{i}-f\left({z}_{i},{n}_{\mathrm{0}},{H}_{\mathrm{nl}}\right)\right)}^{\mathrm{2}}}{{\mathit{\sigma }}_{i}^{\mathrm{2}}},\\ f\left(z,{n}_{\mathrm{0}},{H}_{\mathrm{nl}}\right)={n}_{\mathrm{0}}{e}^{-\frac{z}{{H}_{\mathrm{nl}}}},\end{array}$

where yi are the COSMIC profile data points, $f\left(z,{n}_{\mathrm{0}},{H}_{\mathrm{nl}}\right)$ is the exponential function approximating the profile, and σi is the inverse of the layer thickness at the observation yi. The second method consists of summing up all the partial columns of the COSMIC profile data until a defined threshold is reached, which in our case is 63 % of the H2O VCD:

$\begin{array}{}\text{(C1)}& \frac{{\int }_{\mathrm{0}}^{{H}_{\mathrm{sum}}}n\left(z\right)\mathrm{d}z}{{\int }_{\mathrm{0}}^{\mathrm{TOA}}n\left(z\right)\mathrm{d}z}>\mathrm{1}-\frac{\mathrm{1}}{e}\approx \mathrm{63}\,\%.\end{array}$
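The sum method of Eq. (C1) can be sketched as follows. This is a minimal illustration, assuming a profile given as mean number densities on layers with known boundaries and linear interpolation of the cumulative column between those boundaries; the function name `scale_height_sum` and the synthetic exponential profile are hypothetical.

```python
import numpy as np


def scale_height_sum(z_edges_km, n_layer):
    """Altitude below which a fraction 1 - 1/e of the column is found.

    z_edges_km: layer boundaries in km (ascending, length N + 1)
    n_layer:    mean number density in each layer (length N)
    """
    z_edges_km = np.asarray(z_edges_km, dtype=float)
    partial = np.asarray(n_layer, dtype=float) * np.diff(z_edges_km)
    cum_frac = np.concatenate(([0.0], np.cumsum(partial))) / partial.sum()
    # Linear interpolation between layer boundaries (an assumption of this sketch).
    return float(np.interp(1.0 - 1.0 / np.e, cum_frac, z_edges_km))


# Synthetic exponential profile with a known scale height of 2 km.
H_true = 2.0
z_edges = np.linspace(0.0, 40.0, 401)          # 0.1 km layers up to 40 km
# Layer-mean density of n0 * exp(-z / H) with n0 = 1 (analytic layer average).
n_layer = H_true * (np.exp(-z_edges[:-1] / H_true)
                    - np.exp(-z_edges[1:] / H_true)) / np.diff(z_edges)
H_sum = scale_height_sum(z_edges, n_layer)
```

For an ideal exponential profile the sum method returns the scale height of the exponential, which is why the threshold of 1 − 1/e is a natural choice; for real profiles it yields an effective scale height without assuming a functional form.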

Figure S4 depicts the mean profile shapes calculated using both methods as well as the mean profile shape of the COSMIC data for different latitude bins for the year 2013, for which the sample size is largest. Further goodness-of-fit statistics are given in Fig. S5 (bias), Fig. S6 (mean absolute error), and Fig. S7 (standard deviation). In general, the profile shapes of both methods agree well with the COSMIC measurements; however, Figs. S5 and S6 also reveal that the largest deviations occur in the lowermost troposphere, in particular for the southern polar regions. Nevertheless, the profiles of standard deviations in Fig. S7 also demonstrate that both methods capture the vertical and temporal variations in the water vapour profile shape well and that these variations are within the same range as the variation of the COSMIC profile data.

Figure C1. Histogram of the relative deviation of the calculated synthetic AMFs between the sum method (blue)/non-linear fit (orange) and the COSMIC profile for selected latitude bins (0 to 10° N, −30 to −20° N, and −70 to −60° N) assuming clear-sky conditions and nadir-viewing geometry.

Figure C1 depicts histograms of the relative AMF deviation for both methods for selected latitude regions assuming nadir-viewing geometry and clear-sky conditions (like in Sect. 3.2 and Fig. 5). The peaks of the histograms for the sum method are close to the 0 % line, indicating very good agreement with the AMFs calculated from the COSMIC profiles. In contrast, the histograms for the non-linear fit peak at values around 2 % and show a broader distribution than the histograms of the sum method, thus revealing an inferior agreement with the reference AMFs. For cloudy-sky conditions (see Fig. S8), both methods are biased to smaller AMF values (deviations of around −5 %) for a cloud top height of 1 km, but for higher clouds both methods show similarly good agreement with the reference AMFs. However, the variance in the AMFs for the sum method is much smaller than in the AMFs for the non-linear fit.

In summary, the sum method is to be preferred because it provides more consistent results for clear-sky and cloudy-sky scenarios than the non-linear fit.

Data availability

The TROPOMI TCWV data presented here are available upon request.

Author contributions

CB performed all calculations for this work and prepared the manuscript together with SB and TW and in collaboration with all the coauthors. SB developed the concept of the linearized retrieval scheme and CB and SB implemented most of the retrieval code. SD helped with the McArtim calculations and HS helped with the tessellation of the TROPOMI H2O VCD orbit data to a regular grid. TW supervised this study.

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

We would like to thank ESA and the S-5P/TROPOMI level 1 and level 2 teams for their great work in initiating and realizing TROPOMI and for providing the respective data sets. We also thank NASA for providing MODIS and SSMI data and ECMWF for providing reanalysis data. Furthermore, we acknowledge UCAR and ROMSAF for providing SuomiNet and COSMIC data. We also would like to thank Stefan Schmitt and Johannes Lampel from the Institute of Environmental Physics at the University of Heidelberg for performing the analysis of the LP-DOAS measurements during CINDI-2 and for providing the WVMR results in a very useful format.

Financial support

The article processing charges for this open-access publication were covered by the Max Planck Society.

Review statement

This paper was edited by Ben Veihelmann and reviewed by Ruediger Lang and one anonymous referee.

References

Anthes, R. A.: Exploring Earth's atmosphere with radio occultation: contributions to weather, climate and space weather, Atmos. Meas. Tech., 4, 1077–1103, https://doi.org/10.5194/amt-4-1077-2011, 2011. a

Anthes, R. A., Bernhardt, P. A., Chen, Y., Cucurull, L., Dymond, K. F., Ector, D., Healy, S. B., Ho, S.-P., Hunt, D. C., Kuo, Y.-H., Liu, H., Manning, K., McCormick, C., Meehan, T. K., Randel, W. J., Rocken, C., Schreiner, W. S., Sokolovskiy, S. V., Syndergaard, S., Thompson, D. C., Trenberth, K. E., Wee, T.-K., Yen, N. L., and Zeng, Z.: The COSMIC/FORMOSAT-3 mission: Early results, B. Am. Meteorol. Soc., 89, 313–334, 2008. a

Beirle, S., Sihler, H., and Wagner, T.: Linearisation of the effects of spectral shift and stretch in DOAS analysis, Atmos. Meas. Tech., 6, 661–675, https://doi.org/10.5194/amt-6-661-2013, 2013. a, b, c

Beirle, S., Lampel, J., Lerot, C., Sihler, H., and Wagner, T.: Parameterizing the instrumental spectral response function and its changes by a super-Gaussian and its derivatives, Atmos. Meas. Tech., 10, 581–598, https://doi.org/10.5194/amt-10-581-2017, 2017. a, b, c, d

Bennartz, R. and Fischer, J.: Retrieval of columnar water vapour over land from backscattered solar radiation using the Medium Resolution Imaging Spectrometer, Remote Sens. Environ., 78, 274–283, 2001. a

Boggs, P. T., Byrd, R. H., Rogers, J. E., and Schnabel, R. B.: User's reference guide for odrpack version 2.01: Software for weighted orthogonal distance regression, US Department of Commerce, National Institute of Standards and Technology, available at: https://docs.scipy.org/doc/external/odrpack_guide.pdf (last access: 19 May 2020), 1992. a

Copernicus Climate Change Service: ERA5 hourly data on pressure levels from 1979 to present, Copernicus Climate Change Service Climate Data Store (CDS), https://doi.org/10.24381/cds.bd0915c6, 2018a. a

Copernicus Climate Change Service: ERA5 hourly data on single levels from 1979 to present, Copernicus Climate Change Service Climate Data Store (CDS), https://doi.org/10.24381/cds.adbb2d47, 2018b. a

Deutschmann, T., Beirle, S., Frieß, U., Grzegorski, M., Kern, C., Kritten, L., Platt, U., Prados-Román, C., Puķīte, J., Wagner, T., Werner, B., and Pfeilsticker, K.: The Monte Carlo atmospheric radiative transfer model McArtim: Introduction and validation of Jacobians and 3D features, J. Quant. Spectrosc. Ra., 112, 1119–1137, 2011. a

Didan, K., Munoz, A. B., Solano, R., and Huete, A.: MODIS vegetation index user's guide (MOD13 series), Tech. rep., Vegetation Index and Phenology Lab, https://doi.org/10.5067/MODIS/MYD13C2.006, 2015. a, b

Duan, J., Bevis, M., Fang, P., Bock, Y., Chiswell, S., Businger, S., Rocken, C., Solheim, F., van Hove, T., Ware, R., McClusky, S., Herring, T. A., and King, R. W.: GPS Meteorology: Direct Estimation of the Absolute Value of Precipitable Water, J. Appl. Meteorol., 35, 830–838, https://doi.org/10.1175/1520-0450(1996)035<0830:GMDEOT>2.0.CO;2, 1996. a

Fang, P., Bevis, M., Bock, Y., Gutman, S., and Wolfe, D.: GPS meteorology: Reducing systematic errors in geodetic estimates for zenith delay, Geophys. Res. Lett., 25, 3583–3586, https://doi.org/10.1029/98GL02755, 1998. a

Gao, B.-C. and Kaufman, Y. J.: Water vapor retrievals using Moderate Resolution Imaging Spectroradiometer (MODIS) near-infrared channels, J. Geophys. Res.-Atmos., 108, 4389, https://doi.org/10.1029/2002JD003023, 2003. a

Gordon, I., Rothman, L., Hill, C., Kochanov, R., Tan, Y., Bernath, P., Birk, M., Boudon, V., Campargue, A., Chance, K., Drouin, B., Flaud, J.-M., Gamache, R., Hodges, J., Jacquemart, D., Perevalov, V., Perrin, A., Shine, K., Smith, M.-A., Tennyson, J., Toon, G., Tran, H., Tyuterev, V., Barbe, A., Császár, A., Devi, V., Furtenbacher, T., Harrison, J., Hartmann, J.-M., Jolly, A., Johnson, T., Karman, T., Kleiner, I., Kyuberis, A., Loos, J., Lyulin, O., Massie, S., Mikhailenko, S., Moazzen-Ahmadi, N., Müller, H., Naumenko, O., Nikitin, A., Polyansky, O., Rey, M., Rotger, M., Sharpe, S., Sung, K., Starikova, E., Tashkun, S., Auwera, J. V., Wagner, G., Wilzewski, J., Wcisło, P., Yu, S., and Zak, E.: The HITRAN2016 molecular spectroscopic database, J. Quant. Spectrosc. Ra., 203, 3–69, https://doi.org/10.1016/j.jqsrt.2017.06.038, 2017. a

Grossi, M., Valks, P., Loyola, D., Aberle, B., Slijkhuis, S., Wagner, T., Beirle, S., and Lang, R.: Total column water vapour measurements from GOME-2 MetOp-A and MetOp-B, Atmos. Meas. Tech., 8, 1111–1133, https://doi.org/10.5194/amt-8-1111-2015, 2015. a

Hajj, G., Kursinski, E., Romans, L., Bertiger, W., and Leroy, S.: A technical description of atmospheric sounding by GPS occultation, J. Atmos. Sol.-Terr. Phy., 64, 451–469, https://doi.org/10.1016/S1364-6826(01)00114-6, 2002. a

Heise, S., Wickert, J., Beyerle, G., Schmidt, T., and Reigber, C.: Global monitoring of tropospheric water vapor with GPS radio occultation aboard CHAMP, Adv. Space Res., 37, 2222–2227, https://doi.org/10.1016/j.asr.2005.06.066, 2006. a

Held, I. M. and Soden, B. J.: Water Vapor Feedback and Global Warming, Annu. Rev. Energ. Env., 25, 441–475, https://doi.org/10.1146/annurev.energy.25.1.441, 2000. a

Ho, S.-P., Kuo, Y.-H., Schreiner, W., and Zhou, X.: Using SI-traceable global positioning system radio occultation measurements for climate monitoring, B. Am. Meteorol. Soc., 91, S36–S37, 2010a. a

Ho, S.-P., Zhou, X., Kuo, Y.-H., Hunt, D., and Wang, J.-H.: Global Evaluation of Radiosonde Water Vapor Systematic Biases using GPS Radio Occultation from COSMIC and ECMWF Analysis, Remote Sens., 2, 1320–1330, https://doi.org/10.3390/rs2051320, 2010b. a

Kleipool, Q., Dobber, M., de Haan, J., and Levelt, P.: Earth surface reflectance climatology from 3 years of OMI data, J. Geophys. Res.-Atmos., 113, D18308, https://doi.org/10.1029/2008JD010290, 2008. a, b, c

Kursinski, E., Hajj, G., Schofield, J., Linfield, R., and Hardy, K. R.: Observing Earth's atmosphere with radio occultation measurements using the Global Positioning System, J. Geophys. Res.-Atmos., 102, 23429–23465, 1997. a, b

Kurucz, R. L., Furenlid, I., Brault, J., and Testerman, L.: Solar flux atlas from 296 to 1300 nm, National Solar Observatory Atlas, US. National Solar Observatory, Sunspot, NM, USA, 1984. a

Lampel, J., Pöhler, D., Tschritter, J., Frieß, U., and Platt, U.: On the relative absorption strengths of water vapour in the blue wavelength range, Atmos. Meas. Tech., 8, 4329–4346, https://doi.org/10.5194/amt-8-4329-2015, 2015. a, b, c, d, e

Lang, R., Casadio, S., Maurellis, A. N., and Lawrence, M. G.: Evaluation of the GOME Water Vapor Climatology 1995–2002, J. Geophys. Res.-Atmos., 112, https://doi.org/10.1029/2006JD008246, 2007. a

Levelt, P. F., van den Oord, G. H., Dobber, M. R., Malkki, A., Visser, H., de Vries, J., Stammes, P., Lundell, J. O., and Saari, H.: The ozone monitoring instrument, IEEE T. Geosci. Remote, 44, 1093–1101, 2006. a

Mears, C. A., Wang, J., Smith, D., and Wentz, F. J.: Intercomparison of total precipitable water measurements made by satellite-borne microwave radiometers and ground-based GPS instruments, J. Geophys. Res.-Atmos., 120, 2492–2504, 2015. a

Nielsen, J., Gleisner, H., and Lauritsen, K.: Validation Report: Reprocessed Level 2B and 2C 1D-Var products, Tech. Rep. SAF/ROM/DMI/REP/1DVAR/001, ROMSAF, product version 1.0, available at: https://www.romsaf.org/product_documents.php (last access: 19 May 2020), 2018. a

Noël, S., Buchwitz, M., Bovensmann, H., Hoogen, R., and Burrows, J. P.: Atmospheric water vapor amounts retrieved from GOME satellite data, Geophys. Res. Lett., 26, 1841–1844, https://doi.org/10.1029/1999GL900437, 1999. a

Noël, S., Buchwitz, M., and Burrows, J. P.: First retrieval of global water vapour column amounts from SCIAMACHY measurements, Atmos. Chem. Phys., 4, 111–125, https://doi.org/10.5194/acp-4-111-2004, 2004. a

Platt, U. and Stutz, J.: Differential Optical Absorption Spectroscopy: Principles and Applications, Physics of Earth and Space Environments, Springer Berlin Heidelberg, https://doi.org/10.1007/978-3-540-75776-4, 2008. a

Rosenkranz, P. W.: Retrieval of temperature and moisture profiles from AMSU-A and AMSU-B measurements, IEEE T. Geosci. Remote, 39, 2429–2435, 2001. a

Rothman, L., Gordon, I., Barbe, A., Benner, D., Bernath, P., Birk, M., Boudon, V., Brown, L., Campargue, A., Champion, J.-P., Chance, K., Coudert, L., Dana, V., Devi, V., Fally, S., Flaud, J.-M., Gamache, R., Goldman, A., Jacquemart, D., Kleiner, I., Lacome, N., Lafferty, W., Mandin, J.-Y., Massie, S., Mikhailenko, S., Miller, C., Moazzen-Ahmadi, N., Naumenko, O., Nikitin, A., Orphal, J., Perevalov, V., Perrin, A., Predoi-Cross, A., Rinsland, C., Rotger, M., Šimečková, M., Smith, M., Sung, K., Tashkun, S., Tennyson, J., Toth, R., Vandaele, A., and Vander Auwera, J.: The HITRAN 2008 molecular spectroscopic database, J. Quant. Spectrosc. Ra., 110, 533–572, 2009. a, b

Rothman, L., Gordon, I., Babikov, Y., Barbe, A., Benner, D. C., Bernath, P., Birk, M., Bizzocchi, L., Boudon, V., Brown, L., Campargue, A., Chance, K., Cohen, E., Coudert, L., Devi, V., Drouin, B., Fayt, A., Flaud, J.-M., Gamache, R., Harrison, J., Hartmann, J.-M., Hill, C., Hodges, J., Jacquemart, D., Jolly, A., Lamouroux, J., Roy, R. L., Li, G., Long, D., Lyulin, O., Mackie, C., Massie, S., Mikhailenko, S., Müller, H., Naumenko, O., Nikitin, A., Orphal, J., Perevalov, V., Perrin, A., Polovtseva, E., Richard, C., Smith, M., Starikova, E., Sung, K., Tashkun, S., Tennyson, J., Toon, G., Tyuterev, V., and Wagner, G.: The HITRAN2012 molecular spectroscopic database, J. Quant. Spectrosc. Ra., 130, 4–50, https://doi.org/10.1016/j.jqsrt.2013.07.002, 2013. a

Rozemeijer, N. and Kleipool, Q.: S5P Mission Performance Centre Level 1b Readme, Tech. Rep. S5P-MPC-KNMI-PRF-L1B, product version V01.00.00, available at: https://sentinel.esa.int/documents/247904/3541451/Sentinel-5P-Level-1b-Product-Readme-File (last access: 19 May 2020), 2019. a

Schneider, A., Borsdorff, T., aan de Brugh, J., Aemisegger, F., Feist, D. G., Kivi, R., Hase, F., Schneider, M., and Landgraf, J.: First data set of H2O∕HDO columns from the Tropospheric Monitoring Instrument (TROPOMI), Atmos. Meas. Tech., 13, 85–100, https://doi.org/10.5194/amt-13-85-2020, 2020. a

Serdyuchenko, A., Gorshelev, V., Weber, M., Chehade, W., and Burrows, J. P.: High spectral resolution ozone absorption cross-sections – Part 2: Temperature dependence, Atmos. Meas. Tech., 7, 625–636, https://doi.org/10.5194/amt-7-625-2014, 2014. a, b

Siegel, A. F.: Robust regression using repeated medians, Biometrika, 69, 242–244, https://doi.org/10.1093/biomet/69.1.242, 1982. a, b, c

Smith, E. K. and Weintraub, S.: The Constants in the Equation for Atmospheric Refractive Index at Radio Frequencies, P. IRE, 41, 1035–1037, https://doi.org/10.1109/JRPROC.1953.274297, 1953. a

Steffensen, J. F.: Remarks on iteration, Scand. Actuar. J., 1933, 64–72, https://doi.org/10.1080/03461238.1933.10419209, 1933. a

Sulla-Menashe, D., Gray, J. M., Abercrombie, S. P., and Friedl, M. A.: Hierarchical mapping of annual global land cover 2001 to present: The MODIS Collection 6 Land Cover product, Remote Sens. Environ., 222, 183–194, https://doi.org/10.1016/j.rse.2018.12.013, 2019. a

Susskind, J., Barnet, C. D., and Blaisdell, J. M.: Retrieval of atmospheric and surface parameters from AIRS/AMSU/HSB data in the presence of clouds, IEEE T. Geosci. Remote, 41, 390–409, 2003. a

Thalman, R. and Volkamer, R.: Temperature dependent absorption cross-sections of O2−O2 collision pairs between 340 and 630 nm and at atmospherically relevant pressure, Phys. Chem. Chem. Phys., 15, 15371–15381, https://doi.org/10.1039/C3CP50968K, 2013. a, b

Tilstra, L. G., Tuinder, O. N. E., Wang, P., and Stammes, P.: Surface reflectivity climatologies from UV to NIR determined from Earth observations by GOME-2 and SCIAMACHY, J. Geophys. Res.-Atmos., 122, 4084–4111, https://doi.org/10.1002/2016JD025940, 2017. a

Vandaele, A., Hermans, C., Simon, P., Carleer, M., Colin, R., Fally, S., Merienne, M., Jenouvrier, A., and Coquart, B.: Measurements of the NO2 absorption cross-section from 42 000 cm−1 to 10 000 cm−1 (238–1000 nm) at 220 K and 294 K, J. Quant. Spectrosc. Ra., 59, 171–184, https://doi.org/10.1016/S0022-4073(97)00168-4, 1998. a, b

Van Geffen, J., Boersma, K., Eskes, H., Maasakkers, J., and Veefkind, J.: TROPOMI ATBD of the total and tropospheric NO2 data products, Tech. Rep. S5P-KNMI-L2-0005-RP, Royal Netherlands Meteorological Institute, available at: https://sentinel.esa.int/documents/247904/2476257/Sentinel-5P-TROPOMI-ATBD-NO2-data-products (last access: 19 May 2020), 2019. a

Veefkind, J., Aben, I., McMullan, K., Förster, H., de Vries, J., Otter, G., Claas, J., Eskes, H., de Haan, J., Kleipool, Q., van Weele, M., Hasekamp, O., Hoogeveen, R., Landgraf, J., Snel, R., Tol, P., Ingmann, P., Voors, R., Kruizinga, B., Vink, R., Visser, H., and Levelt, P.: TROPOMI on the ESA Sentinel-5 Precursor: A GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications, Remote Sens. Environ., 120, 70–83, https://doi.org/10.1016/j.rse.2011.09.027, 2012. a

Veefkind, J. P., de Haan, J. F., Sneep, M., and Levelt, P. F.: Improvements to the OMI O2−O2 operational cloud algorithm and comparisons with ground-based radar–lidar observations, Atmos. Meas. Tech., 9, 6035–6049, https://doi.org/10.5194/amt-9-6035-2016, 2016. a

Wagner, T., Heland, J., Zöger, M., and Platt, U.: A fast H2O total column density product from GOME – Validation with in-situ aircraft measurements, Atmos. Chem. Phys., 3, 651–663, https://doi.org/10.5194/acp-3-651-2003, 2003. a

Wagner, T., Beirle, S., and Deutschmann, T.: Three-dimensional simulation of the Ring effect in observations of scattered sun light using Monte Carlo radiative transfer models, Atmos. Meas. Tech., 2, 113–124, https://doi.org/10.5194/amt-2-113-2009, 2009. a

Wagner, T., Beirle, S., Sihler, H., and Mies, K.: A feasibility study for the retrieval of the total column precipitable water vapour from satellite observations in the blue spectral range, Atmos. Meas. Tech., 6, 2593–2605, https://doi.org/10.5194/amt-6-2593-2013, 2013. a, b

Wang, H., Souri, A. H., González Abad, G., Liu, X., and Chance, K.: Ozone Monitoring Instrument (OMI) Total Column Water Vapor version 4 validation and applications, Atmos. Meas. Tech., 12, 5183–5199, https://doi.org/10.5194/amt-12-5183-2019, 2019. a, b, c, d

Ware, R. H., Fulker, D. W., Stein, S. A., Anderson, D. N., Avery, S. K., Clark, R. D., Droegemeier, K. K., Kuettner, J. P., Minster, J. B., and Sorooshian, S.: SuomiNet: A real-time national GPS network for atmospheric research and education, B. Am. Meteorol. Soc., 81, 677–694, 2000. a

Weaver, C. and Ramanathan, V.: Deductions from a simple climate model: Factors governing surface temperature and atmospheric thermal structure, J. Geophys. Res.-Atmos., 100, 11585–11591, 1995. a, b

Wendland, W. and Steinbach, O.: Analysis: Integral- und Differentialrechnung, gewöhnliche Differentialgleichungen, komplexe Funktionentheorie, Lehrbuch: Mathematik, Vieweg+Teubner Verlag, https://doi.org/10.1007/978-3-322-82962-7, 2005. a, b

Wentz, F. J.: A well-calibrated ocean algorithm for special sensor microwave/imager, J. Geophys. Res.-Oceans, 102, 8703–8718, 1997. a

Wessel, P. and Smith, W. H. F.: A global, self-consistent, hierarchical, high-resolution shoreline database, J. Geophys. Res.-Sol. Ea., 101, 8741–8743, https://doi.org/10.1029/96JB00104, 1996. a, b