HDO/H2O ratio retrievals from GOSAT

We report a new shortwave infrared (SWIR) retrieval of the column-averaged HDO/H2O ratio from the Japanese Greenhouse Gases Observing Satellite (GOSAT). From synthetic simulation studies, we estimate that the inferred δD values will typically have random errors between 20‰ (desert surface and 30° solar zenith angle) and 120‰ (conifer surface and 60° solar zenith angle). We find that the retrieval has a small but significant sensitivity to the presence of cirrus clouds, the HDO a priori profile shape and atmospheric temperature, which has the potential to introduce some regional-scale biases in the retrieval. From comparisons to ground-based column observations from the Total Carbon Column Observing Network (TCCON), we find differences between δD from GOSAT and TCCON of around −30‰ for northern hemispheric sites, increasing up to −70‰ for Australian sites. The bias for the Australian sites is significantly reduced when tightening the spatial co-location criteria, which shows that spatial averaging contributes to the observed differences over Australia. The GOSAT retrievals allow mapping the global distribution of δD and its variations with season, and we find in our global GOSAT retrievals the expected strong latitudinal gradients with significant enhancements over the tropics. The comparisons to the ground-based TCCON network and the results of the global retrieval are very encouraging, and they show that δD retrieved from GOSAT should be a useful product that can be used to complement datasets from thermal-infrared sounders and ground-based networks and to extend the δD dataset from SWIR retrievals established from the recently ended SCIAMACHY mission.


Introduction
Water vapour is the most important greenhouse gas, and an accurate representation of the water cycle and its associated feedback mechanisms is crucial for reliable climate model predictions (Soden et al., 2005). The water cycle is a complex system involving many different competing processes. Thus it is important not only that climate models manage to reproduce the tropospheric water vapour concentrations but also that the individual processes are correctly represented (Risi et al., 2012). The isotopic composition of water vapour, i.e. the partitioning between H2^16O and the heavier HDO (or H2^18O), changes during phase changes due to condensation and evaporation. In addition, kinetic and equilibrium fractionations yield different HDO/H2O ratios. Thus, the history of water vapour in an air parcel is imprinted in the HDO/H2O ratio, and measurements of this ratio can contribute to improving our understanding of the processes involved in the water cycle and allow critical testing of the water cycle representation in climate models.
The HDO/H2O ratio is measured in situ by a number of surface stations (e.g. the Global Network for Isotopes in Precipitation, GNIP) and from aircraft (e.g. Ehhalt et al., 2005), but these measurements are sparse. In addition, remotely sensed HDO/H2O observations are now available from a number of satellite sensors and from networks of ground-based Fourier transform spectrometers (Schneider et al., 2006, 2010). Global tropospheric measurements of the HDO/H2O ratio have been inferred from observations of the Interferometric Monitor for Greenhouse Gases (IMG) on the ADEOS satellite between August 1996 and June 1997 (Herbin et al., 2007) and more recently from measurements by the Tropospheric Emission Spectrometer (TES) onboard Aura (Worden et al., 2006, 2007) and the Infrared Atmospheric Sounding Interferometer (IASI) on MetOp (Schneider and Hase, 2011; Herbin et al., 2009). These satellite sensors measure radiances in the thermal infrared emitted from the surface and the atmosphere, and their peak sensitivity is typically in the free troposphere. Retrievals of the HDO/H2O ratio are also available from reflected and scattered sunlight in the shortwave infrared (SWIR) measured by SCIAMACHY onboard Envisat (Frankenberg et al., 2009). SCIAMACHY retrievals represent column-averaged HDO/H2O ratio observations with high sensitivity to the boundary layer, where the largest fraction of the water column resides. However, the Envisat mission has recently ended due to a loss of communication. An overview of the various observational datasets of the HDO/H2O ratio that are currently available is given in Risi et al. (2012).
H. Boesch et al.: HDO/H2O ratio retrievals from GOSAT
The Greenhouse Gases Observing Satellite (GOSAT), launched by the Japanese Space Agency in 2009, provides spectrally resolved radiance measurements in the SWIR, which offer the potential for the retrieval of the column-averaged HDO/H2O ratio and for expanding the SWIR dataset of the HDO/H2O ratio established by SCIAMACHY. GOSAT is equipped with two instruments, the Thermal And Near Infrared Sensor for Carbon Observations-Fourier Transform Spectrometer (TANSO-FTS) and the Cloud and Aerosol Imager (TANSO-CAI). The TANSO-FTS sensor measures radiance spectra in three SWIR bands at 12 900-13 200 cm−1, 5800-6400 cm−1 and 4800-5200 cm−1, and in one thermal infrared (TIR) band at 700-1800 cm−1, with spectral resolutions between 0.257 and 0.367 cm−1 (Kuze et al., 2009). TANSO-FTS nominally performs a cross-track scanning pattern with an instantaneous field of view (IFOV) of 15.8 mrad, equivalent to 10.5 km diameter projected on to the Earth's surface. Until August 2010, the standard mode consisted of five cross-track points separated by 158 km. This has since been changed to three points to reduce pointing errors caused by microvibrations, which are most extreme at the largest off-nadir pointing angles (Crisp et al., 2012). Additionally, TANSO-FTS can measure in sunglint mode within 20° of the subsolar latitude and in a specific observation mode that provides targeted observations for validation.
In this manuscript, we first describe our HDO/H2O retrieval from GOSAT observations and present a series of retrieval sensitivity tests (Sect. 2). In Sect. 3, a comparison of the GOSAT retrievals to ground-based observations is discussed, and in Sect. 4 global retrievals of the HDO/H2O ratio are described, followed by a conclusion in Sect. 5.

Retrieval description
Bands 2 and 3 of GOSAT cover a useful spectral range from 5800-6400 cm−1 and 4800-5200 cm−1. Thus, the spectral window around 2.35 µm (4255 cm−1) used for the SCIAMACHY HDO/H2O retrieval as described in Frankenberg et al. (2009) is not covered by GOSAT. However, in both bands, we can find numerous HDO absorption lines in conjunction with H2O and CO2 lines and in the case of band 2 also CH4 lines (see Fig. 1). There are also a significant number of strong solar lines in these spectral ranges. In band 3, especially for wavenumbers larger than 4930 cm−1, we find strong absorption lines of HDO, but this spectral region also contains very strong overlapping absorption lines of H2O and CO2 that will make the HDO retrieval very difficult. Consequently, our HDO retrieval uses band 2, which includes only moderately strong H2O and CO2 absorptions. However, the HDO lines in band 2 are also weak, which will lead to a low precision of the retrieval.
We have selected a spectral window between 6439 cm−1 and 6464 cm−1 for the retrieval, which includes roughly 140 spectral points (Fig. 2). In this spectral range, HDO lines are relatively free from interference by H2O and strong solar lines. Extending the spectral range up to 1.568 µm to include additional HDO lines was found to lead to H2O interferences in the HDO retrieval.
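Because the text switches between wavenumbers (cm−1) and wavelengths (µm), a small conversion helper can be useful when checking window limits. The following is only a minimal sketch; the function names are hypothetical and are not part of the retrieval code described here.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    # 1 cm = 1e4 micrometres, so lambda[um] = 1e4 / nu[cm^-1]
    return 1.0e4 / wavenumber_cm1


def wavelength_um_to_wavenumber(wavelength_um: float) -> float:
    """Convert a wavelength in micrometres to a wavenumber in cm^-1."""
    return 1.0e4 / wavelength_um


# Fit window limits quoted in the text, expressed as wavelengths (sorted ascending)
fit_window_cm1 = (6439.0, 6464.0)
fit_window_um = tuple(sorted(wavenumber_to_wavelength_um(v) for v in fit_window_cm1))

# Wavelength limit quoted for the extended window, expressed as a wavenumber
extended_limit_cm1 = wavelength_um_to_wavenumber(1.568)
```

The same helper reproduces the pairing quoted for the SCIAMACHY window, where 4255 cm−1 corresponds to about 2.35 µm.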
The HDO retrieval uses the University of Leicester Full Physics (UoL-FP) retrieval algorithm, which has already been used for CH4 and CO2 retrievals from GOSAT spectra (Parker et al., 2011; Cogan et al., 2012). The retrieval algorithm uses an iterative retrieval scheme based on Bayesian optimal estimation to retrieve a set of atmospheric/surface/instrument parameters, referred to as the state vector, from measured spectral radiances as described in O'Dell et al. (2012), Boesch et al. (2011) and Connor et al. (2008). The forward model, used to relate the state vector to the measured radiances, includes the LIDORT radiative transfer model combined with a fast 2-orders-of-scattering vector radiative transfer code (Natraj and Spurr, 2007). The low-streams interpolation functionality of the code (O'Dell, 2010), which accelerates the radiative transfer component, is not used here.
The algorithm retrieves only multiplicative scaling factors for the 20-level H2O and HDO volume mixing ratio (VMR) profiles, the surface albedo and its spectral slope, and a spectral shift. The a priori water vapour profile was obtained from European Centre for Medium-Range Weather Forecasts (ECMWF) operational analysis data interpolated to the location and time of each GOSAT sounding, with corrections for altitude. The a priori HDO VMR profile is inferred from the H2O profile by using the HDO/H2O ratio according to SMOW (Standard Mean Ocean Water). The temperature profile and surface pressure have also been obtained from ECMWF. The albedo a priori was inferred from the reflectance of the measured radiance at continuum wavelengths of the band. The spectral dispersion a priori was calculated by comparing the measured radiance to the position of a known solar line at 12 985.163 cm−1. For the radiative transfer calculation, we have also taken into account CO2 and CH4 absorptions with VMR profiles taken from our CO2 and CH4 retrievals. In addition, we have included boundary layer aerosols with an aerosol optical depth of 0.05 at 0.760 µm.
Fig. 1. Overview of absorbers present in the spectral range of GOSAT band 2 (left panel) and band 3 (right panel). The spectra are simulated for a SZA of 30° and a mid-latitude summer atmosphere. The spectra are shown on the spectral grid of the radiative transfer calculation, and they are not degraded to the spectral resolution of GOSAT.
The a priori covariance matrix uses a standard deviation of 0.32 for the HDO and H2O scaling factors. Note that no correlation is assumed between the HDO and H2O retrievals. All other retrieved parameters (albedo and its slope, spectral shift) are practically unconstrained. With this choice, the scaling factors for HDO and H2O are almost unconstrained, which is acceptable because a single scaling parameter per species can be retrieved well. However, the assumption of an a priori profile shape for H2O and HDO will impose a "hard" constraint on the retrieval, which can introduce a bias in the retrieval as will be shown in the next section.
The retrieval uses the spectroscopic parameters from the Total Carbon Column Observing Network (TCCON) line lists (version 20120409) for HDO, H2O and CH4 and from v3.2 of the OCO line lists for CO2 (Crisp et al., 2012). The H2O line list is derived from a mixture of HITRAN 2008 (Rothman et al., 2009) and Toth (2005). The lines providing the best spectral fit of each option are chosen. The June 2009 HITRAN update is not included in the wavelength range in which we are interested, but there are updates included from Jenouvrier et al. (2007), as well as hundreds of empirically determined H2O lines observed in humid (Darwin, Australia) spectra from ground-based observations.
The ratio of the retrieved columns R = HDO/H2O is converted into the δD notation, which gives the deviation from SMOW in per mil: δD = (R/R_SMOW − 1) × 1000. Since the a priori profiles of H2O and HDO already represent SMOW, we can directly use the ratio of the retrieved scaling factors instead of R/R_SMOW.
Fig. 2. As Fig. 1 but for a small spectral range. The grey bar indicates the fit window used for the HDO/H2O retrieval.
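The conversion to δD described above can be sketched in a few lines. This is an illustrative example only; the function names are hypothetical, and the second function relies on the assumption stated in the text that the a priori profiles represent SMOW, so that the ratio of the retrieved scaling factors equals R/R_SMOW.

```python
def delta_d_permil(ratio: float, ratio_smow: float = 1.0) -> float:
    """Return deltaD in per mil: (R / R_SMOW - 1) * 1000."""
    return (ratio / ratio_smow - 1.0) * 1000.0


def delta_d_from_scaling(hdo_scale: float, h2o_scale: float) -> float:
    """deltaD from the retrieved profile scaling factors.

    Assumes the a priori HDO profile was built from the H2O profile
    with the SMOW ratio, so hdo_scale / h2o_scale = R / R_SMOW.
    """
    if h2o_scale <= 0.0:
        raise ValueError("H2O scaling factor must be positive")
    return delta_d_permil(hdo_scale / h2o_scale)
```

For example, a retrieved HDO scaling factor of 0.85 together with an H2O scaling factor of 1.0 corresponds to a δD of −150 ‰.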

Retrieval sensitivity tests
We have carried out a series of simulations to test the sensitivity of the HDO/H2O retrieval to retrieval assumptions. To this end, GOSAT spectra have been simulated for two solar zenith angles, SZAs (30° and 60°), three surface types (conifer, desert and 5 % albedo) and two atmospheric profiles (summer and winter at Orleans, France) using the forward model of the retrieval algorithm. For the measurement noise we have used a constant value of 2.3 × 10−9 W cm−2 sr−1 (cm−1)−1, which is a reasonable approximation for the GOSAT measurement noise. The simulated spectra have then been retrieved with the retrieval algorithm described above. The inputs for the simulations and the retrievals are identical except that we have (1) perturbed the temperature profile by adding 5 K to all levels below 700 hPa; (2) perturbed surface pressure by 5 hPa; (3) included boundary layer aerosol with an AOD of 0.15; (4) included a cirrus cloud at 8 km with an optical depth of 0.1 at 0.760 µm; (5) perturbed the HDO/H2O ratio by multiplying the HDO a priori VMRs by 0.5 above 550 hPa, by 0.7 between 950 hPa and 550 hPa and by 1 below 950 hPa; and (6) perturbed the H2O and HDO a priori profiles simultaneously by multiplying the HDO and H2O VMRs by 1.5 below 850 hPa. Note that no noise has been added to the simulated radiances themselves so that the difference between the true and the retrieved HDO/H2O ratio directly gives the error associated with each of the six perturbations described above.
The retrieval algorithm also calculates an estimate for the random error, given by the a posteriori error of the retrieved HDO and H2O columns, from which we derive the random error for δD. The error estimates for the HDO and H2O retrievals range from 1.8-5.6 % and 0.65-1.9 % for desert, 4.0-11.5 % and 1.4-4.0 % for conifer and 16.1-27.6 % and 6.3-16.1 % for the 5 % albedo surface. The resulting error estimates for δD are 20-60 ‰ for desert, 40-120 ‰ for conifer and 170-320 ‰ for the 5 % albedo surface. [...] for the 5 % albedo surface. Note that the retrieval is a profile scaling retrieval where an a priori profile is scaled, and thus the maximum value for the degrees of freedom is one. Retrievals over very low surface albedo will not be useful owing to the very large random errors, but even over brighter surfaces some averaging will be necessary to reduce the random component of the errors. Figure 3 gives the normalized column averaging kernel for the HDO and H2O retrievals for the three surface types and the two solar zenith angles (see Connor et al., 2008, for a definition of the column averaging kernel). For the bright surfaces conifer and desert, the kernels for HDO and H2O are similar below ∼ 700 hPa with values close to unity. The kernel values for the 5 % albedo surface are much lower due to the very low signal-to-noise ratio and thus low information content.
Fig. 4. The top panel gives errors in the HDO (left) and H2O (right) retrieval for the sensitivity tests for temperature (Temp), surface pressure (Press), aerosol (Aero), cirrus cloud (Cirrus), the a priori HDO/H2O ratio (Ratio) and the shape of the H2O and HDO a priori profiles (Profile). The bottom panel gives the error in δD in ‰ (left) and relative to the estimated random error (right). The colour denotes the surface type (blue = 5 % albedo; green = conifer; red = desert), the symbol the SZA (circle = 30°; triangle = 60°), and open symbols are for the winter atmosphere and filled symbols for the summer atmosphere.
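The text does not spell out how the random error for δD is derived from the HDO and H2O column errors. One plausible reading, shown here purely as a sketch, is standard propagation of uncorrelated relative errors through the ratio R = HDO/H2O. Both the independence of the two errors and the function name are assumptions made for illustration.

```python
import math


def delta_d_random_error(hdo_rel_err_pct: float,
                         h2o_rel_err_pct: float,
                         delta_d: float = 0.0) -> float:
    """Random error of deltaD in per mil from relative column errors in percent.

    Assumes uncorrelated HDO and H2O errors, so the relative error of the
    ratio is the quadrature sum of both. Since deltaD = (R/R_SMOW - 1) * 1000,
    the absolute error is (1000 + deltaD) times that relative error.
    """
    rel_err_ratio = math.hypot(hdo_rel_err_pct, h2o_rel_err_pct) / 100.0
    return (1000.0 + delta_d) * rel_err_ratio


# Lower and upper ends of the desert range quoted in the text
desert_low = delta_d_random_error(1.8, 0.65)
desert_high = delta_d_random_error(5.6, 1.9)
```

Under these assumptions, and with δD set to 0 ‰, the quoted column errors give values close to the δD error ranges stated above, for example roughly 19 ‰ and 59 ‰ for the two ends of the desert range.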
The non-diagonal elements of the averaging kernel matrix describe the cross-dependences of the HDO retrieval on the prior information of H2O and vice versa. As expected, we find that the influence of the a priori information is to a large degree independent between HDO and H2O. This is because of the diagonal choice of the a priori covariance matrix and because the weighting function matrix is fairly block-diagonal; i.e. HDO and H2O lines are only weakly overlapping.
The results of the sensitivity tests are shown in Fig. 4. We find that the temperature perturbation has little effect on the HDO retrievals, but it introduces errors of 4-7 % in the H2O retrievals, leading to relatively large errors in δD of 50-100 ‰ depending on surface albedo and atmospheric profile. Especially for brighter surfaces this error can considerably exceed the random error. Surface pressure and aerosol perturbations introduce errors of a few percent in the HDO and H2O retrievals, typically with the same sign for HDO and H2O. Thus using the ratio of HDO and H2O columns will efficiently reduce these errors, and we find only minor errors of a few ‰ in δD. Only for the low albedo case do aerosol-related errors increase up to 15 ‰ for a SZA of 30° and up to 30 ‰ for a SZA of 60° due to the increased contribution of scattered light to the observed signal. For all cases, these errors represent only a small fraction of the random error.
Fig. 3. [...] the non-diagonal elements of the averaging kernel matrix (right). The solid lines are for conifer, dashed lines for desert and the dotted line for 5 % albedo. The thick line is for a SZA of 30° and the thin line for 60°.
In the case of cirrus clouds, we find a large difference between the dark surface with errors of −11 % for HDO and −21 % for H2O and the brighter surfaces with errors of up to +7 % for HDO and +6 % for H2O. Again, we find that the errors in δD are reduced due to the ratioing, with values of 60-120 ‰ for the dark surface and 10-20 ‰ for the brighter surfaces. Overall, these errors can be significant when compared to the random error, and they can be of the same magnitude for bright surfaces.
The effect of perturbing the HDO/H2O a priori ratio is most noticeable for low surface albedo with errors of up to 18 % in the HDO retrieval, which leads to errors in δD of 80-180 ‰ depending primarily on the atmospheric profile. For the brighter surfaces, the effect is smaller and the errors can be positive or negative with values of a few percent and subsequent errors in δD varying between −12 and 25 ‰, again depending mostly on the atmospheric profile. For most scenarios, these errors are significant when compared to the random errors.
For the bright surfaces conifer and desert, changing the a priori HDO and H2O profiles leads to relatively similar errors in the HDO and H2O columns with errors of 7-11 % for HDO and 11-14 % for H2O. For the 5 % albedo surface, errors in HDO and H2O differ significantly with values of −10 to +3 % for HDO and 7-10 % for H2O. This leads to moderate errors in δD of 30-60 ‰ for the bright surfaces and much larger errors of up to 190 ‰ for dark surfaces.
Although this sensitivity study does not represent a complete characterization of the retrieval, it shows that some of the assumptions, especially for temperature, cirrus clouds, the a priori HDO/H2O ratio and the a priori H2O and HDO profiles, have the potential to introduce significant errors, especially for lower surface albedo. The 5 K temperature perturbation applied to our a priori profile is larger than the expected uncertainty in temperatures taken from meteorological analysis. In addition, we expect that such temperature uncertainties will be primarily of random nature, but some regional, systematic effects might well be possible. To some extent this is also true for the H2O a priori profiles, which are also taken from meteorological analysis. It is well known that some regions (e.g. the tropics) show a relatively persistent cirrus cloud coverage for most of the year (Sassen et al., 2008), and thus errors introduced by cirrus clouds will likely lead to some regional biases in the retrieved δD. In the Earth's atmosphere, the HDO profile is expected to be close to a Rayleigh distillation curve, which will result in a quicker decrease of the VMRs with altitude compared to a fractionation according to SMOW (Joussaume et al., 1984; Ehhalt, 1974). Thus the errors introduced by the a priori HDO/H2O ratio can lead to regional biases depending on surface albedo, SZA and atmospheric conditions. Furthermore, it can also be expected that there is significant coupling between cirrus clouds, temperature, aerosol and profile effects, and the errors due to the a priori profile might further increase for larger aerosol loads or in the presence of cirrus clouds.

Setup of GOSAT retrievals
The HDO/H2O retrieval uses GOSAT Level 1B files (050050C, 080080C, 100100C, 110110C and 130130C) acquired via the GOSAT User Interface Gateway. We calculate the noise from the standard deviation of the out-of-band signal and approximate the measured radiance by taking the average of the polarized intensities. The spectra are corrected for radiometric degradation. A new version of the Level 1B data (versions 141141C, 150150C and 150151C) has become available recently with significant improvements to the calibration. However, the main changes are in band 1 so that the impact on our HDO/H2O retrieval, which relies primarily on band 2, should be small.
The retrieval procedure consists of multiple steps. First, we select spectra over land with a solar zenith angle less than 70° and a signal-to-noise ratio higher than 50 in each band. Retrievals over the ocean from sunglint observations by GOSAT or for scenes with low clouds are not included in this study. We also remove measurements that are saturated, show large pointing errors or are over terrain with highly variable topography (see Cogan et al., 2012, for details). We then remove cloudy scenes using a cloud detection method based on the O2 A band as described in Parker et al. (2011).
The HDO/H2O ratio is then retrieved from all remaining spectra with the retrieval procedure described above. To all converged retrievals, we then apply a basic quality screening that selects only retrievals with zero non-converging iteration steps and with a χ2 of the fit residual between 0.5 and 2. In addition, we apply a much stricter quality filter that screens for soundings with a χ2 of the fit residual between 0.7 and 1.5, a retrieved H2O scaling factor between 0.7 and 1.2 and an a posteriori error of the HDO scaling factor smaller than 0.1.
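The two screening levels described above can be written as simple predicates. The following is a minimal sketch under stated assumptions: the `Sounding` record and its field names are hypothetical, the interval bounds are treated as inclusive, and the strict filter is applied on top of the basic one. None of these details are specified in the text.

```python
from dataclasses import dataclass


@dataclass
class Sounding:
    """Hypothetical per-sounding retrieval diagnostics (field names are illustrative)."""
    converged: bool
    n_nonconverging_steps: int
    chi2: float
    h2o_scale: float
    hdo_scale_error: float


def passes_basic_filter(s: Sounding) -> bool:
    """Basic screening: converged, no non-converging steps, chi2 in [0.5, 2]."""
    return s.converged and s.n_nonconverging_steps == 0 and 0.5 <= s.chi2 <= 2.0


def passes_strict_filter(s: Sounding) -> bool:
    """Stricter screening on chi2, the H2O scaling factor and the HDO error."""
    return (
        passes_basic_filter(s)
        and 0.7 <= s.chi2 <= 1.5
        and 0.7 <= s.h2o_scale <= 1.2
        and s.hdo_scale_error < 0.1
    )
```

A list of soundings could then be screened with, for example, `[s for s in soundings if passes_strict_filter(s)]`.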
We have also carried out a spectral fit in the spectral range of 1.93-1.94 µm, where the H2O absorption is strongly saturated and the observed radiances are highly sensitive to the presence of scattering material in the upper troposphere; these radiances are therefore useful for the detection of cirrus clouds (Yoshida et al., 2011). We fit an additive intensity offset to match the modelled spectrum to the measured spectrum, and we use this fit parameter together with the χ2 as additional screening parameters for cirrus clouds.

Comparison to TCCON
We have retrieved all GOSAT soundings for overpasses over sites of the Total Carbon Column Observing Network (TCCON) at Ny-Ålesund/Spitsbergen, Bialystok/Poland, Bremen/Germany, Orleans/France, Wollongong/Australia and Darwin/Australia between April 2009 and June 2011. These sites cover a wide range of latitudes from the tropics to high latitudes and thus capture different parts of the global water cycle. Initially we chose a spatial co-location criterion of ±5° around a TCCON site and a temporal criterion of ±3 h around the GOSAT overpass time over a TCCON site.
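The initial co-location criteria can be illustrated with a short check. This is only a sketch: it interprets the ±5° criterion as a latitude/longitude box around the site, which is an assumption, and the function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def is_colocated(snd_lat: float, snd_lon: float, snd_time: datetime,
                 site_lat: float, site_lon: float, tccon_time: datetime,
                 max_deg: float = 5.0, max_hours: float = 3.0) -> bool:
    """True if a GOSAT sounding matches a TCCON observation in space and time.

    Assumes a +/- max_deg box in latitude and longitude around the site and
    a +/- max_hours window between the two measurement times.
    """
    dlat = abs(snd_lat - site_lat)
    # Wrap the longitude difference into [-180, 180) to handle the date line
    dlon = abs((snd_lon - site_lon + 180.0) % 360.0 - 180.0)
    dt = abs(snd_time - tccon_time)
    return dlat <= max_deg and dlon <= max_deg and dt <= timedelta(hours=max_hours)
```

The tighter criteria given in kilometres would need a great-circle distance instead of the box test used in this sketch.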
TCCON is a global network of ground-based, high-resolution Fourier transform spectrometers recording direct solar spectra in the near-infrared spectral region. HDO and H2O columns are retrieved independently by scaling a priori VMR profiles. The fit uses 15 spectral windows for H2O and 6 for HDO (Table 1). The H2O VMR profiles (and temperature and pressure) are taken from NCEP reanalysis profiles interpolated to local noon. The H2O profile above 300 hPa is extrapolated up to the tropopause altitude, and then an altitude-dependent profile is used in the stratosphere, ranging from approximately 3 to 4 µmol mol−1 at the tropopause to 7 µmol mol−1 at the stratopause. The HDO a priori profile is inferred from the H2O VMR profile with an additional H2O-dependent fractionation assuming a δD of −40 ‰ at 1 % H2O VMR decreasing to around −600 ‰. The spectroscopic line list used for the TCCON retrievals is the same as the one used for the GOSAT retrievals, except for CO2 where small differences are possible, but this should have little effect on the HDO/H2O retrievals.
Examples of the spectral fit for the GOSAT retrieval and the TCCON retrieval are given in Figs. 5 and 6. The TCCON spectral fit shows some spectral structures in the residual of the fit, mostly associated with solar lines and strong H2O lines. It can also be seen that most of the HDO information comes from the four fit windows between 4053 and 4213 cm−1, which are noisy and where HDO lines overlap with H2O lines. The spectral structures in the residual of the GOSAT fit are less pronounced due to the lower spectral resolution of GOSAT, but some spectral structures in the fit residual are visible that point to potential spectroscopic deficiencies in this spectral range.
The column averaging kernels of the TCCON retrievals of H2O and HDO are very similar for both species, and they show an almost constant sensitivity throughout the atmosphere (Fig. 7). In the free troposphere, the averaging kernels for H2O differ between the TCCON and GOSAT (Fig. 3) retrievals. However, since the bulk of the water column resides in the boundary layer, the contribution of the free troposphere to the total column of H2O is small and thus so is the impact of the difference in vertical sensitivities to H2O in the free troposphere.
The comparison of GOSAT and TCCON retrievals for the six TCCON sites is shown in Fig. 8 (single soundings and daily means) and Fig. 9 (seasonal cycle). A summary is given in Table 2. Using the ±5° spatial co-location criterion, we obtain between 1030 and 2236 cloud-free soundings per site that pass the basic filter. This number reduces by 20-50 % when applying the stricter quality filter. The exception is Ny-Ålesund, where we find a much lower number of soundings due to its high-latitude and island location, and there are no temporally coincident observations. As shown in Fig. 8, the GOSAT and TCCON retrievals show a reasonable agreement and both datasets show similar seasonal variations with a maximum in local summer and a minimum in winter. As expected, GOSAT retrievals show much larger scatter than the TCCON observations. This scatter is significantly reduced when applying the quality filter (grey and red symbols in Fig. 8). The single sounding standard deviation is between 40 and 100 ‰, which is in agreement with our expectations from the a posteriori error estimates. We find that δD retrieved from GOSAT is typically lower than that from TCCON. For the northern hemispheric sites, the mean bias inferred from the coincident observations ranges from −23.63 ± 3.45 ‰ to −34.09 ± 6.03 ‰. For the two Australian sites, the magnitude of the mean bias is significantly larger, by as much as 35 ‰. Note that there are no coincident soundings for Ny-Ålesund, and we have not inferred a bias for this site. A very similar picture can be seen in Fig. 9, which shows the seasonal cycle observed by GOSAT (red line) and TCCON (green line) for the six sites with clearly larger differences between GOSAT and TCCON retrievals for the Australian sites. A systematic underestimation of δD from total column retrievals from space-based SCIAMACHY observations compared to TCCON has also been found by Risi et al. (2012) with values ranging from 30 to 87 ‰. The correlation between GOSAT and TCCON soundings using the daily mean values is shown in Fig. 10 (left panel). The mean bias for the whole dataset over all TCCON sites except Ny-Ålesund is around −44 ‰ with a standard deviation of a similar value. Overall, we find a reasonable correlation between the GOSAT and TCCON retrievals with a correlation coefficient of 0.52 but with GOSAT showing a [...]
The spatial co-location criterion of ±5° is very large, and it is likely that spatial gradients are present within this area for some of the TCCON sites. To investigate the effect of spatial averaging, we have also used spatial co-location criteria of ±100 km, ±70 km and ±50 km. Reducing the criterion to ±100 km reduces the number of soundings to 5-10 % of the original number, which is further reduced by 20-75 % for ±50 km. The locations of GOSAT soundings for the six TCCON sites for the different spatial co-location criteria are shown in Fig. 11. For the small spatial co-location criteria of ±100 km, ±70 km or ±50 km, only a few different locations of GOSAT soundings will be included due to the sampling pattern of GOSAT. With the exception of Ny-Ålesund, even for the smallest co-location criterion of ±50 km we still have at least 2-3 locations for each site.
Table 2. Overview of the GOSAT-TCCON comparisons for six TCCON sites (Ny-Ålesund, Bialystok, Bremen, Orleans, Darwin and Wollongong). The following data are given for each of the six TCCON sites, from left to right: the number of cloud-free soundings after basic filtering N_s and after full quality filtering N_s (filter); the number of filtered, coincident soundings N_s^c (filter); the mean bias in δD and standard error; and the standard deviation inferred from the coincident soundings N_s^c (filter). All values are given for 4 different spatial co-location criteria. Note that for Ny-Ålesund no coincident soundings are found.
The standard error tends to increase by a factor of 2 to 3 when reducing the co-location criterion to ±100 km due to the large reduction in the number of data points. For the Bialystok and Bremen sites, there is little change in the inferred mean bias and the standard error does not change much when reducing the criterion from ±100 to ±50 km, which suggests that the estimated bias is relatively robust. For Orleans, a rather large variation in the mean bias together with a large increase in the standard error is observed. The standard error becomes significantly larger than the value of the mean bias so that the inferred mean bias becomes very uncertain due to large scatter in the small dataset. For Darwin and Wollongong, a clear decrease in the mean bias is found when decreasing the co-location criterion, and the standard error shows little change between the ±100 km and ±50 km criteria. The values for the mean bias become as low as 30-35 ‰, which is much closer to the biases observed for the northern hemispheric sites. For the smallest criterion of ±50 km, the mean bias for Wollongong increases again, but at the same time the standard error doubles. Similarly, we find that for a smaller co-location criterion the GOSAT retrievals tend to better agree with the TCCON observations and to better reproduce the seasonal cycle observed by TCCON, especially for the Australian sites (Fig. 9). However, the smaller criteria also lead to data gaps and more scatter.
Finally, we have tested the effect of the additional cirrus filter described in Sect. 2.3. The results are given in Table 3 for the ±5° co-location criterion. This additional filter reduces the number of soundings by 1/4-1/2, but it does not lead to improvements in the mean bias. However, for Darwin and Wollongong there is some improvement in the single sounding standard deviation. Overall, the additional cirrus filter does not appear beneficial for the retrieval results, but the TCCON sites are not located in regions with very large cirrus coverage.
Fig. 9. Seasonal cycle in δD retrieved from GOSAT over six TCCON sites. The figure shows the cloud-free, quality-filtered GOSAT retrievals for the time period between April 2009 and June 2011 averaged according to the month. The red line is for a spatial co-location criterion of ±5° and the blue line for ±100 km. The error bars represent the standard errors. For Darwin, we also show the retrievals for a co-location criterion of ±50 km (cyan line). TCCON retrievals are shown in green. The bars at the bottom show the number of data points per month (numbers have been multiplied by 10 for the 100 km co-location criterion).
Fig. 11. [...] Grey points indicate cloud-free soundings, and red points show the quality-filtered retrievals for a spatial co-location criterion of ±5°. Blue points are for a co-location criterion of ±100 km and cyan points for ±50 km. The triangle gives the location of the TCCON site.
The TCCON and GOSAT retrievals use different spectral windows and different a priori profiles for HDO. In addition, TCCON instruments and GOSAT have different spectral resolutions, which can lead to differences in the effect of spectroscopic errors. We have also carried out TCCON retrievals at Orleans and Wollongong in the GOSAT retrieval window with the original TCCON resolution and when degrading the TCCON spectra to GOSAT resolution. We find that the H2O columns inferred from TCCON retrievals will be ∼2.5 % lower and the HDO columns ∼2.5 % higher when using the same spectral window as is used for the GOSAT retrievals. This results in an increase in δD of roughly 50 ‰ with little dependence on SZA or the H2O column (see Fig. 12 for an example TCCON fit in the GOSAT fit window). Thus, the difference observed in δD between TCCON and GOSAT would further increase when the same fit window is used. The comparison of the averaging kernels of the TCCON retrievals when using the original TCCON fit windows or when using the GOSAT fit window is shown in Fig. 7. When using the GOSAT fit window, we find that the kernels from the ground-based instrument will closely resemble the GOSAT kernels shown in Fig. 3. Degrading the spectral resolution of the TCCON retrievals when using the spectral window of the GOSAT retrievals increases the δD by less than 10 ‰, primarily due to increased H2O columns. Using an HDO a priori profile according to SMOW for the TCCON retrievals (as is done in the GOSAT retrievals) changes the δD by about 10 ‰.
… and strong enhancements of δD over the convective region in the tropics. Applying the quality filter leads to a clear reduction in the scatter and removes outliers, but it also has a significant effect on the latitudinal gradient, which is much stronger in the filtered data.
Overall, after averaging the data over a month and over 5° × 5°, we obtain almost continuous coverage and the values for δD show a smooth behaviour with few outliers. The few remaining potential outliers are often found for bins with large values of the standard deviation and a small number of soundings. For clear regions such as deserts, we obtain up to 60 data points per bin, and for most other regions we find around 10-20 data points. The standard deviation of the GOSAT δD values is typically around 50 to 80 ‰, and only for some high-latitude bins does it reach or exceed 100 ‰, which also means that the error for the mean value is only around 10-20 ‰ or smaller.
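The step from the scatter within a bin to the error of the bin mean is the usual standard error of the mean, σ/√N. The short sketch below illustrates that arithmetic under the assumption that the soundings in a bin are independent; the function name `standard_error_of_mean` and the example values are chosen here for illustration only.

```python
import math


def standard_error_of_mean(std_dev_permil, n_soundings):
    """Standard error of a bin mean in permil, assuming independent soundings."""
    if n_soundings < 1:
        raise ValueError("need at least one sounding")
    return std_dev_permil / math.sqrt(n_soundings)


# Illustrative combinations of single-bin scatter (permil) and sounding count
for sigma, n in [(50.0, 20), (80.0, 20), (80.0, 60)]:
    err = standard_error_of_mean(sigma, n)
    print(f"sigma = {sigma:.0f} permil, N = {n}: error of mean = {err:.1f} permil")
```

For a scatter of 50 to 80 ‰ and 20 soundings per bin, this gives roughly 11 to 18 ‰, in line with the range quoted above. Correlated errors between soundings would make the true uncertainty larger than this estimate.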
The quality-filtered δD data from GOSAT for the different seasons are shown in Fig. 14. In winter and to a minor extent in spring and autumn, there is a loss of coverage for mid- to high latitudes due to our SZA cutoff and since we exclude scenes with low signal-to-noise ratio, e.g. snow surface. A clear seasonal variation in δD can be observed over many regions between local winter and summer that is most pronounced over the Sahara desert as well as over the USA or India. The seasonal cycle for the five regions of northern Australia, southern Africa, India, western USA and Sahara is shown in more detail in Fig. 15. The seasonal variation between the southern and northern hemispheric sites is roughly shifted by 6 months with peak-to-peak variations of around 50 ‰ for northern Australia, southern Africa, India and western USA. The most pronounced seasonal variations can be found over the Sahara desert with a peak-to-peak variation of 100 ‰, which is similar to observations of Frankenberg et al. (2009) based on SCIAMACHY retrievals.

Conclusions
We have developed a new retrieval for δD in water vapour from shortwave infrared spectra acquired by the GOSAT satellite instruments. δD is inferred from the ratio of HDO and H2O columns retrieved from a spectral window around 1.55 µm. The HDO lines are very weak and the single sounding precision of δD is only around 20 to 120 ‰ for most land surfaces so that averaging will be necessary to reduce the random errors. From a series of retrieval sensitivity tests, we find that the retrieval of the HDO/H2O ratio has little sensitivity to parameters such as surface pressure and aerosols in the boundary layer. As expected, the effect of both parameters is strongly reduced in the ratio of the HDO and H2O columns, which are retrieved in the same spectral range and show only weak absorptions. However, we find a significant sensitivity to atmospheric temperature, the presence of cirrus clouds and the shape of the HDO a priori profile. This will lead to additional scatter in the δD retrievals, and there is a risk of regional-scale biases. A more comprehensive error characterization appears necessary, especially since we expect that there could be a significant coupling between effects such as a priori HDO profile and cirrus clouds.
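As a reminder of how δD follows from the two retrieved columns, the sketch below evaluates the standard δ notation, δD = (R/R_std − 1) × 1000 ‰ with R = HDO/H2O. It is an illustration only: the function name `delta_d`, the example column values, and the reference ratio R_std = 3.1152 × 10⁻⁴ (twice a commonly quoted VSMOW D/H ratio) are assumptions made here, and a retrieval whose spectroscopic database already scales HDO by natural abundance would use a different reference.

```python
R_VSMOW = 3.1152e-4  # assumed HDO/H2O reference ratio (2 x VSMOW D/H)


def delta_d(hdo_column, h2o_column, r_standard=R_VSMOW):
    """Return delta-D in permil from HDO and H2O total columns (same units)."""
    if h2o_column <= 0:
        raise ValueError("H2O column must be positive")
    ratio = hdo_column / h2o_column
    return (ratio / r_standard - 1.0) * 1000.0


# Hypothetical columns (molecules cm^-2) chosen to give delta-D = -150 permil
h2o = 5.0e22
hdo = 0.85 * R_VSMOW * h2o
base = delta_d(hdo, h2o)

# A common scaling of both columns cancels in the ratio
scaled = delta_d(1.02 * hdo, 1.02 * h2o)

# Opposite changes do not cancel: H2O 2.5 % lower, HDO 2.5 % higher
shifted = delta_d(1.025 * hdo, 0.975 * h2o)

print(f"base: {base:.1f} permil, common scaling: {scaled:.1f} permil, "
      f"opposite changes: {shifted:.1f} permil")
```

The example shows why errors that affect both columns in the same way largely drop out of δD, while opposite changes of 2.5 % in the two columns shift δD by roughly 45 to 50 ‰ depending on the starting value, which is comparable to the shift reported for the TCCON retrievals in the GOSAT fit window.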
To test the performance of our GOSAT δD retrievals, we have compared δD from GOSAT to those inferred from the ground-based TCCON network. The TCCON HDO retrievals are not calibrated so that this represents only a consistency check and not a strict validation as both datasets could suffer from similar spectroscopic biases. In the framework of the MUSICA project, ground-based column (and profile) retrievals of δD from mid-infrared observations by NDACC (Network for the Detection of Atmospheric Composition Change) instruments are available at a well-documented quality, which will be well suited to expand upon the presented GOSAT-TCCON comparisons. A first comparison between δD from NDACC, TCCON and GOSAT for three sites is shown in Fig. 16. The δD values from TCCON and NDACC are in good agreement, but δD from TCCON tends to be larger than from NDACC. This difference between the TCCON and NDACC retrievals is largest for the southern hemispheric site Wollongong, and the GOSAT retrievals agree better with NDACC retrievals than with TCCON retrievals for this site.
In general, we find that the GOSAT δD retrievals reproduce the TCCON retrievals reasonably well, but with a typical bias of around 30 ‰ for northern hemispheric sites and a larger bias for Australian sites. However, reducing the spatial co-location criteria leads to a significant reduction in the bias for the Australian sites and brings them into better agreement with the northern hemispheric sites. The mean difference between TCCON and GOSAT retrievals will be even larger when using the same spectral windows for both retrievals. We could not identify the reason for this difference between the ground-based TCCON and the space-based GOSAT retrievals. However, a similar finding has been reported by Risi et al. (2012) between SCIAMACHY, TES and TCCON observations.
Our global retrievals show that GOSAT can observe global variations of δD with good coverage and relatively little scatter after some averaging (monthly and 5° × 5°). We find large latitudinal gradients with strong enhancements over the tropics and a large seasonal cycle over the Sahara desert or the northern mid- to high latitudes in broad agreement with previous studies using other satellite sensors (Frankenberg et al., 2009; Risi et al., 2012).
The spectral coverage of the GOSAT instrument is not ideal for the retrieval of HDO, and the precision of the inferred δD is low. However, the results from our comparisons to the ground-based TCCON network and the results of the global retrievals are very encouraging, and they show that δD retrieved from GOSAT should be a useful product that can be used to complement datasets from thermal-infrared sounders and ground-based networks and to extend the δD dataset from SWIR retrievals established from the recently ended SCIAMACHY mission.