When computing climatological averages of atmospheric trace-gas mixing ratios
obtained from satellite-based measurements, sampling biases arise if data
coverage is not uniform in space and time. Homogeneous spatiotemporal
coverage is essentially impossible to achieve. Solar occultation
measurements, by virtue of satellite orbit and the requirement of direct
observation of the sun through the atmosphere, result in particularly sparse
spatial coverage. In this proof-of-concept study, a method is presented to
adjust for such sampling biases when calculating climatological means. The
method is demonstrated using carbonyl sulfide (OCS) measurements at 16 km
altitude from the ACE-FTS (Atmospheric Chemistry Experiment Fourier Transform
Spectrometer). At this altitude, OCS mixing ratios show a steep gradient
between the poles and Equator. ACE-FTS measurements, which are provided as
vertically resolved profiles, and integrated stratospheric OCS columns are
used in this study. The bias adjustment procedure requires no additional
information other than the satellite data product itself. In particular, the
method does not rely on atmospheric models with potentially unreliable
transport or chemistry parameterizations, and the results can be used
uncompromised to test and validate such models. It is expected to be
generally applicable when constructing climatologies of long-lived tracers
from sparsely and heterogeneously sampled satellite measurements. In the
first step of the adjustment procedure, a regression model is used to fit a
2-D surface to all available ACE-FTS OCS measurements as a function of
day-of-year and latitude. The regression model fit is used to calculate an
adjustment factor that is then used to adjust each measurement individually.
The mean of the adjusted measurement points of a chosen latitude range and
season is then used as the bias-free climatological value. When applying the
adjustment factor to seasonal averages in 30

Creating climatologies of atmospheric trace-gas concentrations from satellite-based measurements is usually done by collecting available observations into latitudinal and monthly or seasonal bins and calculating the respective averages (e.g., Jones et al., 2012 and Koo et al., 2017, who compiled comprehensive trace-gas climatologies from Atmospheric Chemistry Experiment Fourier Transform Spectrometer, ACE-FTS, observations). For such methods, an evenly distributed coverage with no significant measurement gaps is desirable to avoid introducing sampling biases when calculating climatological means. Satellite-based instruments, however, perform measurements only on distinct orbits, leaving spatiotemporal measurement gaps. This inhomogeneous sampling in space and time can introduce significant biases when calculating climatological averages (Aghedo et al., 2011; Toohey et al., 2013) if they are calculated in the traditional way. The magnitude of the sampling bias depends on the frequency spectrum of the spatial and temporal structure to be averaged. The bias can become particularly large when analyzing data from solar occultation instruments that typically provide two measurements per orbit, leading to sparse and spatially structured data coverage. The annual solar occultation sampling pattern of ACE-FTS is shown in Fig. 1a.
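
The traditional binning approach described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: the function name `binned_climatology`, the 30° band edges used in the usage note and the DJF/MAM/JJA/SON season definition are assumptions made for the example.

```python
import numpy as np

# Season index for calendar months 1..12: 0=DJF, 1=MAM, 2=JJA, 3=SON
SEASON_OF_MONTH = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 0])


def binned_climatology(lat, month, value, lat_edges):
    """Plain (unadjusted) mean of all observations per latitude band and season.

    Returns an array of shape (n_bands, 4); bins without data are NaN.
    """
    lat = np.asarray(lat, dtype=float)
    month = np.asarray(month, dtype=int)
    value = np.asarray(value, dtype=float)
    lat_edges = np.asarray(lat_edges, dtype=float)

    season = SEASON_OF_MONTH[month - 1]
    # Using only the inner edges maps every latitude to a band 0..n_bands-1
    band = np.digitize(lat, lat_edges[1:-1])
    n_bands = len(lat_edges) - 1

    clim = np.full((n_bands, 4), np.nan)
    for b in range(n_bands):
        for s in range(4):
            sel = (band == b) & (season == s)
            if sel.any():
                clim[b, s] = value[sel].mean()
    return clim
```

Called with, for example, `lat_edges = [-90, -60, -30, 0, 30, 60, 90]`, each entry of the result is the mean of whatever observations happen to fall into that bin. If those observations cluster at one edge of the bin in latitude or time, the mean inherits that clustering, which is the sampling bias discussed here.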

Schematic illustration of how the sampling bias is estimated and adjusted for
OCS mean mixing ratio at 16 km altitude in any chosen time–latitude bin. Two
examples are discussed in more detail in the text and are indicated by the
red and black boxes.

Recent studies (Aghedo et al., 2011; Sofieva et al., 2014; Toohey et al.,
2013; Millán et al., 2016) have investigated the effects of sampling
biases for various satellite data products. Toohey et al. (2013) quantified
the sampling bias for a number of satellites measuring ozone and water vapor.
Depending on the trace gas, pressure level and latitude, they frequently
found sampling biases as high as 20 % and, in some cases, biases as high
as 40 % in regions with steep spatial and/or temporal gradients, such as
in the vicinity of the polar vortex in both hemispheres. In an effort to
quantify long-term trends in stratospheric ozone between 60

Here, we present a novel approach to adjust measurements to mitigate spatiotemporal sampling biases in climatological averages of carbonyl sulfide (OCS) measured by the solar occultation instrument ACE-FTS. The method does not employ dynamical or chemical atmospheric models (e.g., chemistry transport models, CTMs) that may reflect inaccurate or incomplete understanding of the underlying processes. This approach thus allows the uncompromised application of the adjusted data product to test and validate such models.

The approach is suitable for measurements with a seasonal cycle that is smooth enough to be represented by a low-order Fourier series expansion. Motivated by efforts to quantify the stratospheric burden of OCS from ACE-FTS observations (Kloss, 2017), we use OCS measurements from ACE-FTS. We introduce these measurements in Sect. 2, together with OCS measurements from Envisat–MIPAS that will be used to evaluate our method. Section 3 describes in detail the method developed to estimate and adjust for spatiotemporal sampling biases, which is then evaluated using the much denser and more homogeneous MIPAS data set in Sect. 4. Limitations of the method and its applicability to other tracers and regimes are discussed in Sect. 5.

ACE-FTS is an infrared solar occultation spectrometer on the Canadian
satellite SCISAT and has been delivering data since 2004 (Bernath
et al., 2005). It measures in the spectral region from 750 to 4400 cm

Here, we use version 3.6 ACE-FTS OCS volume mixing ratio measurements between
February 2004 and September 2016 (Boone et al., 2005; Boone, 2013), retrieved
from microwindows in the range 2036 to
2056 cm

When calculating climatological means of atmospheric trace-gas mixing ratios
at a given altitude, missing data over large parts of a region of interest
do not automatically prohibit climatological averaging: an average can
theoretically be created from a single data point, even though it may not
be very representative of the true mean over the chosen spatiotemporal
regime. In contrast, when calculating the stratospheric OCS burden over a
particular latitude band and season, data coverage is critical irrespective
of sampling bias because data have to be gridded and added up rather than
being averaged. In our study, partial OCS columns are accumulated into
1

The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) is a
mid-infrared spectrometer on board the ESA (European Space Agency) satellite
ENVISAT. It is a limb-sounding instrument analyzing the spectral radiance
emitted by atmospheric trace gases. From its sun-synchronous polar orbit,
MIPAS measures vertical profiles of multiple trace gases, including OCS. From
2002 to 2012 MIPAS operated in the spectral region between 685 and
2410 cm

Adjusting for spatiotemporal sampling biases requires some description of
the gap-free field. The field could be obtained, for example, from
CTM output, or, as mentioned above, from a satellite
data set providing higher spatial and temporal sampling. In this study, we
use the sparse data themselves to create a gap-free OCS field through the
application of a regression model fit. The regression model is used to fit a
continuous, smooth 2-D (time and latitude) surface either to OCS mixing
ratios at a given altitude or to fields of OCS partial columns. In a general
form with OCS represented by

A total of 12.5 years of ACE-FTS OCS mixing ratios at 16 km altitude are
used by the regression model to obtain the 15 fit coefficients (see Fig. 1a). A different set of fit coefficients is obtained from the regression
model when it is fitted to the stratospheric partial columns. Note that
because the regression model provides a value for any arbitrary latitude and
day of the year, it meets the “continuous” requirement for

This Fourier–Legendre fit reflects only the variability in the data with latitude and season that recurs every year. Using the entire 12.5-year data record for the fit yields the most robust result for this purpose. Any additional variability in the spatiotemporal pattern, such as single events, trends, the impact of El Niño or the quasi-biennial oscillation, is conserved; i.e., it will not be removed by the sampling bias correction. Such removal might occur if the fit were applied to each year individually.

The estimated regression fits for OCS mixing ratios at a given altitude or OCS partial columns describe the climatological and global state of OCS valid for the 12.5 years of available ACE-FTS observations. The coefficients for the regression fit are calculated by minimizing the sum of the squared differences between the original data (here the ACE-FTS observations) and the complete regression fit. This step in the regression is optimized by minimizing the differences simultaneously with respect to all coefficients used for the Fourier and Legendre expansions.
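
A simultaneous least-squares fit of this kind can be sketched as below. The exact basis used in this study is not reproduced here, so the sketch makes its own assumptions: one annual harmonic (constant, cosine, sine) multiplied by Legendre polynomials of degree 0 to 4 in the sine of latitude, which happens to give 3 × 5 = 15 coefficients. The function names and the 365.25-day period are likewise illustrative.

```python
import numpy as np
from numpy.polynomial import legendre


def design_matrix(doy, lat, n_harm=1, n_leg=4):
    """Product basis: Fourier terms in day of year x Legendre terms in latitude."""
    doy = np.atleast_1d(np.asarray(doy, dtype=float))
    lat = np.atleast_1d(np.asarray(lat, dtype=float))

    phase = 2.0 * np.pi * doy / 365.25
    fourier = [np.ones_like(phase)]
    for k in range(1, n_harm + 1):
        fourier += [np.cos(k * phase), np.sin(k * phase)]
    fourier = np.stack(fourier, axis=1)            # (n, 2 * n_harm + 1)

    # Assumption: Legendre polynomials evaluated at sin(latitude), in [-1, 1]
    leg = legendre.legvander(np.sin(np.deg2rad(lat)), n_leg)   # (n, n_leg + 1)

    # Every Fourier term multiplied by every Legendre term
    return (fourier[:, :, None] * leg[:, None, :]).reshape(len(doy), -1)


def fit_surface(doy, lat, ocs, **basis):
    """Minimize the sum of squared residuals with respect to all coefficients at once."""
    A = design_matrix(doy, lat, **basis)
    coeffs = np.linalg.lstsq(A, np.asarray(ocs, dtype=float), rcond=None)[0]
    return coeffs


def evaluate_surface(coeffs, doy, lat, **basis):
    """Gap-free field: the fitted surface at arbitrary day of year and latitude."""
    return design_matrix(doy, lat, **basis) @ coeffs
```

Because all measurements from all years enter one design matrix, the fit can only capture the part of the variability that repeats annually; `evaluate_surface` then returns a value for any day of year and latitude, including those never sampled.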

The regression model fit, together with its uncertainties, is therefore the best representation of the ACE measurements given the information provided (the original measurements and the chosen numbers of Fourier and Legendre expansion terms). As a result of the fitting process, each fit coefficient has an associated uncertainty. Estimating the effects of these coefficient uncertainties on the determined sampling biases (see Sect. 3) would require bootstrapping techniques to create many different realizations of the determined OCS climatologies.
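
One way such a bootstrap could look is sketched below. It is not part of the analysis presented here. The sketch assumes that the regression is linear in its coefficients, so that it can be written with a design matrix `A`, and it resamples individual measurements with replacement, which ignores any correlation between neighboring occultations. The function name and defaults are hypothetical.

```python
import numpy as np


def bootstrap_coefficients(A, y, n_boot=200, seed=0):
    """Refit a linear least-squares model on resampled data.

    A: design matrix, shape (n_obs, n_coeffs); y: measurements, shape (n_obs,).
    Returns an array of shape (n_boot, n_coeffs), one coefficient set per resample.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n_obs = len(y)

    samples = np.empty((n_boot, A.shape[1]))
    for i in range(n_boot):
        idx = rng.integers(0, n_obs, size=n_obs)   # draw with replacement
        samples[i] = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0]
    return samples
```

Each row of the result defines one realization of the fitted surface. Repeating the bias adjustment for every row and taking the spread of the resulting climatological means would give an estimate of how the coefficient uncertainties propagate.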

Using the gap-free field as described in Sect. 2.3 (

Figure 1a shows the OCS mixing ratio values from
12.5 years of ACE-FTS observations as a function of latitude and
time of year. The small year-to-year shifts in the latitudinal coverage of
ACE-FTS cause small offsets between the traces for individual years seen in
Fig. 1a. The red and black boxes in Fig. 1
indicate the selected time and latitude frames used to demonstrate the
application of this method. The boxes were chosen as examples for the
highest (red box) and lowest (black box) ACE-FTS latitude coverage. The
climatological mean OCS pattern, represented as the regression model fit to
the 12.5 years of ACE-FTS measurements, as a function of latitude and
season, is shown in Fig. 1b. Figure 2 shows the
same for the OCS stratospheric columns. Values for

Stratospheric OCS column values in kg km

ACE-FTS data (

As seen in Fig. 1a and shown in Barkley et al. (2008), OCS mixing ratios at a specific altitude (here 16 km) decrease with increasing latitude. The stratospheric partial column distribution, shown in Fig. 2, is quite different. Because both pressure and OCS mixing ratios rapidly decrease with height above the tropopause, the major fraction of the stratospheric OCS column resides in the few kilometers just above the tropopause, and thus the significant decrease in tropopause height with latitude leads to lower partial columns in the tropics and higher values closer to the poles. For the same reason, the annual cycle and day-to-day variability of the dynamical tropopause, rather than the annual cycle in OCS mixing ratios, largely control the temporal variability of the stratospheric OCS partial columns. The stratospheric OCS partial column field is therefore more variable than the mixing ratio distribution shown in Fig. 1a, which potentially confounds the adjustment procedure.
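
The dependence of the partial column on tropopause height can be illustrated with a simple hydrostatic integration. The sketch below is not the column integration procedure used in this study. It assumes hydrostatic balance, a constant gravitational acceleration, and molar masses of 60.07 g/mol for OCS and 28.97 g/mol for dry air. The function name is hypothetical.

```python
import numpy as np

M_OCS = 60.07e-3   # kg/mol
M_AIR = 28.97e-3   # kg/mol, dry air
G = 9.81           # m/s^2, assumed constant with height


def partial_column_kg_per_km2(pressure_pa, vmr, p_tropopause_pa):
    """OCS mass per unit area above the tropopause, in kg km^-2.

    pressure_pa: profile pressure levels in Pa; vmr: OCS mole fraction
    (mol/mol) on those levels; levels below the tropopause are discarded.
    """
    p = np.asarray(pressure_pa, dtype=float)
    x = np.asarray(vmr, dtype=float)

    keep = p <= p_tropopause_pa
    p, x = p[keep], x[keep]
    order = np.argsort(p)[::-1]        # decreasing pressure, i.e. upward
    p, x = p[order], x[order]

    # Trapezoidal integral of vmr over pressure; hydrostatic: dm = dp / g
    layer = 0.5 * (x[:-1] + x[1:]) * (p[:-1] - p[1:])
    return (M_OCS / M_AIR) / G * layer.sum() * 1.0e6   # kg m^-2 -> kg km^-2
```

Since each layer is weighted by its pressure thickness, the layers just above the tropopause dominate the sum, and moving the tropopause to a higher pressure (lower altitude) adds a comparatively large amount of mass to the stratospheric column.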

Comparison of the distributions and resulting mean and standard
deviation values of measured (green) OCS and the adjusted measurements
using Eq. (2) (blue) for the same time–latitude bins indicated by the
black

The equivalent to Fig. 1a with the MIPAS OCS data set from 2008
to 2011

Figure 3 shows the frequency distribution of
ACE-FTS OCS measurements at 16 km from 2004 to 2016 for the two chosen
latitude bands and time regions. The green histograms show the distribution
of the original measurements and the blue histograms show the distribution
of the adjusted measurements using Eq. (2). Here, all individual
measurements are adjusted for biases in the zonal seasonal mean. The shifts
in the mean values and contraction of the standard deviations provide useful
summary metrics of the effects of the applied spatiotemporal sampling bias
adjustments. The distribution of all 12 years of data between 60 and 90

MIPAS data distribution for DJF 2009–2010, 60 to
90

The histograms in Fig. 3b show the data
distribution for the red box in Fig. 1, i.e.,
between 30 and 60

To assess whether the methodology quantitatively adjusts the sampling bias, a validation against an independent data set was performed and is described in the following section.
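
The adjustment of individual measurements can be sketched as follows. The functional form here is an assumption made for illustration and should not be read as Eq. (2): each measurement is multiplied by the ratio of the fitted field averaged over the whole time–latitude bin to the fitted field at the measurement's own time and latitude. The cosine-of-latitude area weighting and the function names are likewise hypothetical.

```python
import numpy as np


def gap_free_bin_mean(surface, doys, lats):
    """Area-weighted mean of surface(doy, lat) over a regular grid in the bin.

    surface: callable returning the fitted field for arrays of day of year
    and latitude (degrees); doys, lats: 1-D grids spanning the bin.
    """
    dd, ll = np.meshgrid(doys, lats, indexing="ij")
    field = surface(dd, ll)
    weight = np.cos(np.deg2rad(ll))          # assumed area weighting
    return float((field * weight).sum() / weight.sum())


def adjust_measurements(values, fit_at_obs, fit_bin_mean):
    """Scale each measurement by (bin-mean fit) / (fit at the observation)."""
    values = np.asarray(values, dtype=float)
    fit_at_obs = np.asarray(fit_at_obs, dtype=float)
    return values * (fit_bin_mean / fit_at_obs)
```

With this form, a measurement taken where the fitted field lies above the bin mean is scaled down and one taken where it lies below is scaled up, while the deviation of each measurement from the fit is retained as a relative anomaly. The mean of the adjusted values then serves as the climatological value for the bin.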

Comparison of the unadjusted (blue) and adjusted (red) seasonal
ACE-FTS OCS stratospheric columns and seasonal averaged OCS mixing ratio from
15.5 to 16.5 km altitude between 60 to 90

Close-up of the regression model

To quantify the sampling bias arising from the sparse ACE-FTS sampling for a
chosen latitude–time box, the OCS data product from the MIPAS instrument,
with its much denser data coverage, is used. Because of the dense sampling
pattern and almost complete latitude coverage (down to 88

For the best possible quantitative evaluation, the spatiotemporal box in
the ACE-FTS measurements with the lowest ACE-FTS coverage (Fig. 4b) and
the highest observed sampling bias is chosen: December 2009–February
2010, 60 to 90

Using the chosen spatiotemporal box (black box in
Fig. 1), we show in
Fig. 5 histograms of the relative frequency
distributions of all MIPAS OCS mixing ratios at 16 km observed between
60 and 90

To investigate the scientific relevance and applicability of the proposed sampling bias adjustment, climatologies for the seasonal stratospheric OCS columns and OCS mixing ratios at 16 km altitude are calculated with and without sampling bias adjustments.

Due to the satellite orbit, ACE-FTS does not measure in the latitude ranges
85–90

The largest difference between the seasonal mean calculated using original
OCS measurements and the seasonal mean calculated using the adjusted OCS
measurements occurs in the latitude band 60–90

As described in Sect. 2.1, the procedure for the OCS stratospheric column
integration already reduces the sampling bias by extrapolating OCS data into
empty latitude bands. As a consequence, the sampling bias adjustment for the
stratospheric burden is lower than for the mixing ratios. In this particular
case (Fig. 6), there is a marginal impact on the amplitude of the seasonal
cycle as the adjustment most significantly reduces the austral summer OCS
maximum at 16 km in virtually all years. No significant trends are apparent
in either the original or adjusted data (

In this study, we present a method to adjust the spatiotemporal sampling
bias in climatologies calculated from sparsely sampled satellite
observations without requiring additional observational evidence beyond the
data set used. Because the method is based exclusively on
observations and is independent of atmospheric model parameterizations,
the resulting sampling-bias-adjusted climatologies are well suited
to testing and improving such models. Generally, the method can be
applied to any atmospheric compound or property of which the variability
follows defined seasonal and latitudinal patterns and can therefore be
sufficiently well described using a regression model approach. The method
has been shown to quantitatively adjust the sampling bias in seasonal
30

ACE-FTS, with its solar occultation viewing geometry and therefore sparse and heterogeneous sampling pattern, is particularly prone to sampling biases when calculating climatologies (Toohey et al., 2013). OCS, with its atmospheric variability in the stratosphere and upper troposphere limited to large spatial (100s of km) and temporal (i.e., seasonal) scales (Barkley et al., 2008), provides an ideal tracer to investigate and demonstrate the sampling bias adjustment method. Note that the method would not work in the presented form (i.e., with a relatively simple regression model that is reasonably well determined by the data) for an OCS data product reflecting the lower tropospheric and boundary layer variability, with complex regional patterns and to some extent distinct day–night differences, such as the Infrared Atmospheric Sounding Interferometer (IASI) tropospheric OCS product described by Vincent and Dudhia (2017).

In the stratosphere, and often in the upper troposphere–lower stratosphere (UTLS), many long-lived trace
gases (e.g.,

The IMK/IAA-generated (Institute of Meteorology and Climate Research/Instituto de Astrofísica de Andalucía) MIPAS
data used in this study are available for registered users at

CK and MvH designed the research and analyzed and interpreted the results. GEB had the original idea upon which the research presented in this paper is based. MH and KAW provided data and data analysis. MR, JU, BH, SK and GEB contributed ideas and were involved in discussion. All authors contributed to writing the paper.

The authors declare that they have no conflict of interest.

Measurements used in this study are from the ACE-FTS instrument and MIPAS together with the dynamical tropopause data from ECMWF. The Atmospheric Chemistry Experiment (ACE), also known as SCISAT, is a Canadian-led mission mainly supported by the Canadian Space Agency and the Natural Sciences and Engineering Research Council of Canada. MIPAS spectra used for deriving OCS vertical profiles at the Karlsruhe Institute of Technology have been provided by the European Space Agency. Corinna Kloss has been supported by the graduate school of Forschungszentrum Jülich HITEC (Helmholtz Interdisciplinary Doctoral Training in Energy and Climate Research) and ANR-17-CE01-0015 (TTL-Xing). Marc von Hobe was supported by the German Federal Ministry of Education and Research through the project ROMIC-SPITFIRE (BMBF-FKZ: 01LG1205C). The authors thank Kage Nesbit, Ben Lewis and Christian Rolf for their programming contribution. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

This paper was edited by Justus Notholt and reviewed by two anonymous referees.