Relative drifts and biases between six ozone limb satellite measurements from the last decade

Abstract. As part of the European Space Agency's (ESA) climate change initiative, high vertical resolution ozone profiles from three instruments aboard ESA's Envisat (GOMOS, MIPAS, SCIAMACHY) and from three instruments on ESA's third party missions (OSIRIS, SMR, ACE-FTS) are to be combined in order to create an essential climate variable data record for the last decade. A prerequisite for combining the data is the examination of differences and drifts between the data sets. In this paper, we present a detailed analysis of ozone profile differences based on pairwise collocated measurements, including the evolution of the differences with time. Such a diagnosis helps to identify strengths and weaknesses of each data set that may vary in time and introduce uncertainties in long-term trend estimates. The analysis reveals that the relative drift between the sensors is not statistically significant for most pairs of instruments. The relative drift values can be used to estimate the added uncertainty in physical trends. The added drift uncertainty is estimated at about 3 % decade−1 (1σ). Larger differences and larger variability in the differences are found in the lowermost stratosphere (below 20 km) and in the mesosphere.


Introduction
Ozone as the main absorber in the UV wavelength region is one of the crucial atmospheric trace gases which has been investigated extensively in the past 40 years due to its role as a protecting shield against UV radiation that is harmful for living species. Different observation techniques have been used to extract the ozone signal from the troposphere to the mesosphere (Hassler et al., 2014).
Because the lifetime of a single space instrument is limited, long-term ozone studies require measurements from different instruments to be merged into a coherent climate data record. To include the best observations in the merged data and to use them optimally, information about biases and drifts between the data sets is needed. Similar merging activities are performed by GOZCARDS (Global OZone Chemistry And Related trace gas Data records for the Stratosphere) for SAGE I, SAGE II, ACE-FTS, and MLS-Aura, and by combinations of SAGE II and GOMOS and of SAGE II and OSIRIS (Bourassa et al., 2014). This paper deals with the intercomparison of six limb ozone data sets in the framework of the ESA (European Space Agency) climate change initiative (O3 CCI) and is part of the ongoing merging activities (see Bhartia et al. (2011), SI2N special issue, and papers therein for an overview).
One important aspect of this work is that the intercomparisons are carried out for each possible sensor pair. A linear regression model has been applied in order to determine the differences and drifts between all pairs of instruments. The differences and drifts can be used to estimate drift-corrected trends of the merged pairs and overall merged product.
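To illustrate what "each possible sensor pair" implies, the short Python sketch below enumerates the unordered pairs that can be formed from the six instruments. The variable names are illustrative only and are not taken from the analysis code of this study.

```python
from itertools import combinations

# The six limb/occultation instruments compared in this paper.
INSTRUMENTS = ["GOMOS", "MIPAS", "SCIAMACHY", "OSIRIS", "SMR", "ACE-FTS"]

# Unordered pairs: each pair is listed once, although either member
# can serve as the "reference" (which only flips the sign of the difference).
PAIRS = list(combinations(INSTRUMENTS, 2))

if __name__ == "__main__":
    print(len(PAIRS))  # 6 * 5 / 2 = 15 pairs
    for comparison, reference in PAIRS:
        print(comparison, "vs", reference)
```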
The paper is organized as follows. In Sect. 2 we briefly describe the instruments and their performance. In Sect. 3 basic formulae and definitions for the pairwise comparisons are summarized. In Sect. 4 an overview of the time series from the intercomparisons with SCIAMACHY is provided. In Sect. 5 results from the regression model for the combination of all sensors are discussed and compared with other similar intercomparison and validation works. A summary of the main results and concluding remarks are given at the end.

Instruments
The six instruments used for the comparison in this work are carried by three different satellites. Three atmospheric chemistry experiments (GOMOS, MIPAS, and SCIAMACHY) were onboard the Envisat satellite, which operated from 2002 to 2012. It flew in a sun-synchronous orbit at an altitude of 780 km, leading to an orbital period of ≈ 100 min and 14 orbits per day. OSIRIS and SMR aboard Odin have been taking measurements since 2001 and are still operating. Odin circles the Earth in a polar, sun-synchronous, near-terminator orbit with an inclination of 97.8° at an altitude of 600 km. ACE-FTS has been providing measurements since 2004 on SCISAT, which has a circular orbit with an inclination of 74° at an altitude of 650 km.
All instruments are briefly described in the following subsections. Table 1 gives an overview of the time period used for the intercomparison, local time of the measurements, vertical resolution, precision and other instrument-specific information. More details on the instruments, their performance, and validation can be found in Hassler et al. (2014).

GOMOS (Global Ozone Monitoring by Occultation of Stars)
GOMOS is the stellar occultation instrument onboard the Envisat satellite that exploits the absorption and scattering of stellar light at ultraviolet (UV), visible, and near-infrared wavelengths to retrieve vertical profiles of ozone, NO2, NO3, O2, H2O, and aerosol extinction (Bertaux et al., 2010). Ozone number density profiles are retrieved from measurements by the UV-Vis spectrometer in the altitude range ≈ 10-100 km. The vertical resolution of GOMOS ozone profiles is 2 km below 30 km and 3 km above 40 km, with a linear transition in between. The estimated uncertainty of the retrieved ozone profiles is 0.5-5 %. In this paper GOMOS ozone profiles processed with IPF 6.0 are used.
The SCIAMACHY ozone profiles retrieved with version V2.5 are used in this study (Rozanov et al., 2001). The algorithm, validation, and error analysis are described in Sonkaew et al. (2009), Mieruch et al. (2012), and Rahpoe et al. (2013), respectively.
Table 1. Overview of data sets used (adapted from Sofieva et al., 2013). If necessary, the profiles were converted to volume mixing ratio (vmr) and interpolated to a 1 km vertical grid.
OSIRIS on Odin (Murtagh et al., 2002; Llewellyn et al., 2004) measures ozone number density profiles in limb mode from 10 to 70 km with a vertical resolution of 1-3 km. The measurement is performed in the optical spectral range of 280-800 nm with a resolution of 1 nm. In this work the OSIRIS ozone data V5.01 have been used (Adams et al., 2014).

SMR on Odin
The second instrument on the Odin satellite is SMR (Submillimeter and Millimeter Radiometer), which uses heterodyne radiometers to measure thermal emission in the frequency range of 486-581 GHz. Atmospheric species measured in the frequency bands at 501.8 and 544.6 GHz are ClO, HNO3, N2O, and O3 (Urban et al., 2005). For this study we use the SMR ozone data version 2.1 processed at the Chalmers University of Technology, Gothenburg, Sweden. The optimal estimation method (OEM) scheme is used to retrieve the ozone vmr from the O3 line at 501.8 GHz.

ACE-FTS on SCISAT
The solar occultation instrument ACE-FTS (Atmospheric Chemistry Experiment Fourier Transform Spectrometer) onboard the Canadian satellite mission SCISAT was launched on 12 August 2003 (Bernath et al., 1999). It measures high-resolution (0.02 cm−1) spectra between 750 and 4400 cm−1 (2.2-13 µm). The vertical resolution of the profiles is 3-4 km with a sampling of 1.5-6 km. More than 30 trace gases, temperature, and pressure are retrieved by ACE-FTS using a modified global fit approach based on the Levenberg-Marquardt non-linear least-squares method. In this study we use the ACE-FTS ozone profiles version 3.0 retrieved at the University of Waterloo (Boone et al., 2013).

Methodology and definitions
Ozone volume mixing ratios on a common fixed altitude grid with 1 km spacing are used in this study. All profiles have been converted, regridded, and interpolated, if necessary, from native ozone profiles using pressure and temperature either from meteorological analyses or retrieved using the same instrument (see Table 1). The screening and filtering of the data sets was performed as follows:
- SCIAMACHY: only cloud-free profiles are used;
- GOMOS: no screening is performed by us;
- OSIRIS: outliers are screened out for negative ozone values and ozone volume mixing ratio (vmr) > 15 ppmv;
- MIPAS: screening for zero visualization values (Viz O3 = 0) and diagonal elements of averaging kernels AK diag < 0.03, as recommended by the data providers;
- ACE-FTS: screening if ozone values were negative and errors were larger than 100 %, as recommended by the data providers;
- SMR: screening for poor-quality data sets with the flag set to zero, e.g. quality = 0, as recommended by the data providers.
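Two of the screening rules listed above can be sketched in a few lines of Python. This is a minimal illustration using the thresholds quoted in the text; the function and argument names are hypothetical and do not correspond to any official data-product reader. Screened values are replaced by NaN so that the profile keeps its length on the 1 km grid.

```python
import math

def screen_osiris(vmr_ppmv):
    """Mask OSIRIS-style outliers: negative ozone or vmr > 15 ppmv."""
    return [math.nan if (v < 0.0 or v > 15.0) else v for v in vmr_ppmv]

def screen_mipas(vmr_ppmv, viz, ak_diag):
    """Mask MIPAS-style levels with Viz O3 = 0 or averaging-kernel diagonal < 0.03.

    `viz` and `ak_diag` are per-level lists; the names are assumptions made
    for this sketch.
    """
    return [
        math.nan if (flag == 0 or ak < 0.03) else v
        for v, flag, ak in zip(vmr_ppmv, viz, ak_diag)
    ]

if __name__ == "__main__":
    print(screen_osiris([-0.1, 5.0, 16.0]))
    print(screen_mipas([1.0, 2.0, 3.0], [1, 0, 1], [0.5, 0.5, 0.01]))
```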
In our analyses, we use collocated measurements for each pair of instruments. The collocation criteria depend on the sampling and coverage of the satellite pair and are chosen such that a sufficient number of profile pairs is achieved. Specific collocation criteria and the total number of collocations are listed in Tables 2 and 3, respectively. The sensitivity to the collocation criteria has been tested with 5 and 12 h for the MIPAS and OSIRIS pair. No major differences have been observed in the stratosphere when the collocation criteria are varied in this case.
The relative difference (δ) is calculated for collocated single profile pairs for a given month, altitude, and latitude bin (5°, 15°, and 30°) as follows:
$$\delta_i(z) = 100 \times \frac{x_i^c(z) - x_i^r(z)}{\left[X^c(z) + X^r(z)\right]/2}.$$
The mean relative difference (Δ) is the monthly mean of the δ's at altitude z:
$$\Delta(z) = \frac{1}{N(z)} \sum_{i=1}^{N(z)} \delta_i(z),$$
where $x_i^c$ and $x_i^r$ correspond to the collocated single ozone profiles of the comparison instrument (c) and the "reference" instrument (r), with $X^c$ and $X^r$ as the monthly mean averages of $x_i^c$ and $x_i^r$, respectively. N(z) is the number of available pairs at altitude z for a given month and latitude bin. The standard deviation of Δ is calculated as follows:
$$\sigma_\Delta(z) = \sqrt{\frac{1}{N(z)-1} \sum_{i=1}^{N(z)} \left[\delta_i(z) - \Delta(z)\right]^2}.$$
In addition to the relative difference, we also applied a linear regression to the monthly mean relative difference time series for each altitude and latitude bin. The mean relative difference between two instruments is not necessarily constant but can vary with time. We analyse this time dependence by using a multilinear regression model:
$$\Delta(t,z) = \alpha(z)\,(t - t^*) + \beta(z) + \sum_{j=1}^{2}\left[\kappa_j(z)\sin\left(\frac{2\pi t}{T_j}\right) + \nu_j(z)\cos\left(\frac{2\pi t}{T_j}\right)\right] + R(t,z),$$
where Δ(t, z) is the monthly mean relative difference time series for each altitude and latitude bin, t* is the reference time, T_1 = 12 and T_2 = 6 months are the periods of the harmonic terms, and R(t, z) is the noise term. The slope α(z) is the "pairwise relative drift" and β(z) is the "pairwise relative bias" derived from the regression function.
The plain term "bias" is avoided here, since the comparison is not based on one reference sensor; rather, each sensor is used as a reference in turn. The terms "pairwise relative bias" and "pairwise relative drift" between two instruments are more appropriate and are hereafter referred to as "relative bias" and "relative drift", denoted by the Greek symbols β(z) and α(z), respectively. Non-linearity effects are not accounted for here.
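The relative difference statistics described in this section can be sketched as follows. The sketch assumes that each single-pair difference is normalised by the mean of the two monthly means X^c and X^r and that the standard deviation uses the N − 1 convention; both are assumptions of this illustration. Function names are hypothetical.

```python
import math

def relative_differences(x_c, x_r):
    """Per-pair relative differences (%) at one altitude, for one month and bin.

    x_c, x_r: collocated ozone values of the comparison and reference
    instrument. The normalisation by the mean of the two monthly means
    is an assumption of this sketch.
    """
    n = len(x_c)
    mean_c = sum(x_c) / n          # X^c
    mean_r = sum(x_r) / n          # X^r
    denom = 0.5 * (mean_c + mean_r)
    return [100.0 * (c - r) / denom for c, r in zip(x_c, x_r)]

def mean_and_std(deltas):
    """Monthly mean relative difference and its sample standard deviation."""
    n = len(deltas)
    mean = sum(deltas) / n
    var = sum((d - mean) ** 2 for d in deltas) / (n - 1)
    return mean, math.sqrt(var)

if __name__ == "__main__":
    deltas = relative_differences([5.5, 4.5], [5.0, 5.0])
    print(deltas, mean_and_std(deltas))
```

With this normalisation the monthly mean of the δ's equals the relative difference of the two monthly means, so the result does not depend on which profiles happen to have small ozone values.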
The parameters α(z) and β(z) are derived using a multivariate linear regression that accounts for autocorrelation. The noise term R(t, z) is assumed to be an autoregressive process of order one, AR(1). We used the methods described in Weatherhead et al. (1998) and Gebhard et al. (2014) to derive the autocorrelation, the white noise, and the uncertainties σα and σβ for each pair of instruments. Only time series with more than 36 months of data are used for the analysis. For the periodic variation, periods of 6 and 12 months have been considered, with corresponding harmonic functions and parameters κ(z) and ν(z). No proxies of the quasi-biennial oscillation or other natural variability have been considered because natural effects are assumed to cancel out when differences are calculated. Since MIPAS RR (reduced resolution) profiles are only available from January 2005 onwards, February 2005 was used as the reference time t*; in other words, the relative bias β(z) is the observed bias at time t*.
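A regression of this kind can be sketched in plain Python as below. This is a simplified illustration, not the implementation used in this study: it fits a bias, a drift, and 12- and 6-month harmonics by ordinary least squares, estimates the lag-1 autocorrelation φ of the residuals, and inflates the ordinary least-squares drift uncertainty by the factor sqrt((1 + φ)/(1 − φ)), a common approximation for AR(1) noise. All function names are hypothetical, and time is given in months relative to t*.

```python
import math

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x

def design_row(t):
    """Regressors at time t (months since t*): bias, drift, 12- and 6-month harmonics."""
    row = [1.0, t]
    for period in (12.0, 6.0):
        w = 2.0 * math.pi * t / period
        row += [math.sin(w), math.cos(w)]
    return row

def fit_drift(t_months, delta):
    """Return (beta, alpha, sigma_alpha); alpha is in % per month if delta is in %."""
    rows = [design_row(t) for t in t_months]
    n, p = len(rows), len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, delta)) for i in range(p)]
    coef = solve(ata, aty)
    resid = [y - sum(c * x for c, x in zip(coef, r)) for r, y in zip(rows, delta)]
    # Lag-1 autocorrelation of the residuals, used as the AR(1) coefficient.
    phi = sum(a * b for a, b in zip(resid[:-1], resid[1:])) / sum(e * e for e in resid)
    # OLS variance of the drift: residual variance times [(A^T A)^-1]_{11}.
    s2 = sum(e * e for e in resid) / (n - p)
    unit = [0.0] * p
    unit[1] = 1.0
    var_alpha = s2 * solve(ata, unit)[1]
    # Approximate inflation of the uncertainty for AR(1) noise.
    inflation = math.sqrt((1.0 + phi) / (1.0 - phi)) if abs(phi) < 1.0 else math.inf
    return coef[0], coef[1], math.sqrt(var_alpha) * inflation
```

Multiplying the returned drift and its uncertainty by 120 converts them from % per month to % per decade. The time series must contain some residual scatter, otherwise the autocorrelation estimate is undefined.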

Relative difference time series
In this part, only a brief example of mean relative difference time series is presented with SCIAMACHY as the reference instrument.
In Sect. 5 the results from the regression analyses (relative bias and relative drifts) with each sensor as reference instrument are discussed. We could have chosen any instrument, as we consider none of the instruments an absolute reference. SCIAMACHY is the only data set under investigation from a dense sampler covering the full Envisat observation period. Further details for all possible pair combinations from the 5° latitude bin analyses can be viewed as contour plots of β(z) and α(z) in the Supplement.
The monthly mean relative difference time series of all CCI limb data with respect to SCIAMACHY for different latitude bands are presented in Figs. 1-3.
In the Arctic (70-60° N, Fig. 1) most of the data sets agree with SCIAMACHY to within ±10 % at all altitudes between 25 and 40 km. The best agreement with SCIAMACHY is found at 25 km for most instruments. Above 30 km, MIPAS shows a pronounced seasonal cycle compared to SCIAMACHY. SCIAMACHY tends to be lower than the other instruments at 30 km.
At northern mid-latitudes (50-40° N, Fig. 2) the best agreement with SCIAMACHY is at 30 km and below. At 30 km and above, SCIAMACHY is lower than ACE-FTS and MIPAS; at 40 km, SCIAMACHY is in agreement with MIPAS, but higher than the other data sets by up to 10 %.
In the tropics (0-10° N, Fig. 3), at 25 km, SCIAMACHY is lower than most of the other instruments, but all instruments agree to within ±5 % with SCIAMACHY. It is apparent that SMR data are quite noisy at this altitude. At 30 km, agreement is similar, except that SCIAMACHY shows a consistent positive bias of about +10 %. At 35 km, SMR shows a negative bias of about −5 to −10 % with respect to SCIAMACHY. [...] difference time series, thus, are a consequence of the different sampling statistics.

Intercomparison results and discussion
In order to get an overall picture of the pairwise comparisons with each instrument as a reference sensor, the vertical distribution of β (relative bias) is drawn in Fig. 4 for 30° S-30° N at altitudes between 20 km and 50 km in 5 km steps. Each colour identifies the reference sensor. The position of each symbol marks the value of the corresponding comparison sensor relative to the reference sensor. This compact representation gives a detailed view of the performance of each sensor.
In the lowermost stratosphere (LS) the β range is large for most of the instruments. The smallest β range for most of the reference sensors is observed at 25 km, where it is within ±5 %. Only MIPAS and GOMOS have a slightly larger absolute β with respect to SMR. At 30 km, the β range is within ±5 % except for SCIAMACHY as the reference sensor, which shows a positive β with respect to four comparison sensors. Between 35 and 50 km, the β range increases for each sensor and shows different behaviour. Four different groups can be identified between 25 and 50 km. The classification into groups is mainly determined by the vertical behaviour of the β range. If all comparison sensors show a positive relative bias with respect to the reference sensor, then we classify the reference sensor as having a negative relative bias (β) range. Between 25 and 50 km for the latitude band of 30° S-30° N (Fig. 4a), Group I consists of OSIRIS (balanced β range), Group II includes GOMOS (low negative β range), Group III includes MIPAS and SCIAMACHY (positive β range), and Group IV is SMR (systematic negative β range).
The balanced β range means that differences to that instrument may be positive or negative without favouring any sign.
For Group II, which consists of GOMOS, the absolute β values are not larger than ±15 %. GOMOS shows similarity with OSIRIS at 25 and 30 km.
In Group III (MIPAS, SCIAMACHY) the β range is mainly positive with respect to the other sensors. Above 40 km, SCIAMACHY shows the largest β value with respect to SMR of up to 20 % at 45 km (see Fig. 4b). The values are statistically significant for the majority of the comparison sensors.
Group IV consists of SMR with a negative β range with respect to all comparison sensors.
Because of the low sampling of ACE-FTS in the tropics, there are only two comparison sensors available, and therefore no general statement about the behaviour of ACE-FTS is possible. We observe a balanced β range (Group I) behaviour at 30, 40, and 45 km and a slightly positive β range (Group III) at other altitudes.
From this plot we can conclude that at 25 km most of the groups show similar behaviour in sign and β range to within ±10 %. The highest variability is observed below 20 km (> ±20 %). Between 25 and 45 km, the sign and range of β depend on the reference sensor, with four distinct groups as discussed before. Looking at Fig. 4b, one can conclude that all β values that are larger than ±10 % are statistically significant.
At northern middle latitudes (30-60° N), SCIAMACHY changed its behaviour between 25 and 35 km (Fig. 5a). All other sensors show similar behaviour as in the tropics.
The main difference to the tropics is seen at 20 km. Here, all sensors present lower variability (< ±12 %) than in the tropics and show balanced behaviour with the exception of SMR.
In the southern middle latitudes (30-60° S) (Fig. 6a) the relative bias range resembles the behaviour of the tropical band. ACE-FTS, on the other hand, performs as in the northern middle latitudes. The variability below 20 km is, however, smaller (< ±20 %) in comparison to the tropics.
There is no clear group behaviour for the relative drift α in the tropics, 30° S-30° N (see Fig. 7). Hereafter, we consider a drift estimate α to be statistically significant if it lies outside the ±2σα uncertainty interval (non-shaded values in Fig. 7b). At 40 km the relative drift between OSIRIS and SMR is up to ±18 % decade−1 but is statistically non-significant.
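The significance criterion stated above amounts to a simple threshold test on the drift and its uncertainty. A minimal Python sketch is given below; the function name is hypothetical and the numerical values in the usage example are invented for illustration, not results from this study.

```python
def is_significant(alpha, sigma_alpha, k=2.0):
    """True if the drift lies outside the +/- k*sigma interval around zero."""
    return abs(alpha) > k * sigma_alpha

if __name__ == "__main__":
    # Illustrative values in % per decade (not taken from the figures).
    print(is_significant(5.0, 2.0))    # |5| > 4: significant
    print(is_significant(18.0, 10.0))  # |18| < 20: large but not significant
```

The second case shows why a large drift can still be non-significant: the test depends on the ratio of the drift to its uncertainty, not on the drift alone.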
A significant α value is observed for a few pairs at different altitudes. SMR shows significant values with respect to three instruments at 45 km and at 20 km. SCIAMACHY shows significant values with respect to two instruments at 20 km (α < ±20 % decade−1) and OSIRIS with two instruments at 35 km (α < ±5 % decade−1). For most of the comparisons, no systematic significant relative drift is observed in this latitude band.
At 30-60° N (Fig. 8a) the range of α values is larger than in the tropics. In particular, the instrument pairs MIPAS/SCIAMACHY and SMR/ACE-FTS show α > ±10 % decade−1. However, these values are non-significant, as can be seen in Fig. 8b, with the exception of SMR/ACE-FTS at 35 and 50 km. SCIAMACHY shows significant values with respect to OSIRIS between 20 and 40 km (α < ±5 % decade−1) and OSIRIS with two instruments at 25 km (α < ±10 % decade−1).
In southern mid-latitudes (30-60° S) the α values are largest below 25 km and smallest between 30 and 40 km (Fig. 9a), but they are mostly not statistically significant (Fig. 9b). ACE-FTS shows significant values with respect to three instruments at 25 km with α < ±15 % decade−1. SCIAMACHY shows significant values with respect to two instruments at 50 km (α < ±10 % decade−1). The total number of significant values is lowest in this latitude band.
Only a few statistically significant relative drift values are observed. Generally, 90 % of the pairs show non-significant relative drifts in these three latitude bands at the described altitudes. Since the majority of the pairs presented show no significant relative drift, we can conclude that merging of the data sets from these six instruments is possible. A drift analysis such as the one carried out here can help to identify outliers, which could then be drift-corrected. In our case all instruments show mostly statistically insignificant drifts with respect to each other. In the middle stratosphere the drifts are generally below ±6 % decade−1 (2σ), but they can be higher in the upper stratosphere and above, and in the lowermost stratosphere below about 20 km. When merging the data by simply taking averages from all sensors, as done in the last WMO (World Meteorological Organization) ozone assessment (WMO Assessment, 2014), an additional uncertainty of about 3 % decade−1 (1σ) should be added to the physical trend uncertainty derived from the linear trend regression to obtain a more realistic estimate of the overall uncertainty.
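How the additional drift uncertainty is combined with the regression uncertainty is not specified in the text. One common choice, assumed here purely for illustration, is to add independent 1σ uncertainties in quadrature. The Python sketch below uses a hypothetical function name and an invented regression uncertainty.

```python
import math

def total_trend_uncertainty(sigma_regression, sigma_drift=3.0):
    """Combine two independent 1-sigma uncertainties (% per decade) in quadrature.

    Quadrature addition is an assumption of this sketch; 3.0 is the drift
    uncertainty quoted in the text.
    """
    return math.hypot(sigma_regression, sigma_drift)

if __name__ == "__main__":
    # Illustrative regression uncertainty of 4 % per decade (invented value).
    print(total_trend_uncertainty(4.0))  # 5.0
```

Under this assumption a trend whose regression uncertainty is much smaller than 3 % decade−1 would have its total uncertainty dominated by the drift term.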

Impact of local time and diurnal variation
The difference in local time of measurement can have an impact on the differences in the collocated ozone profiles in the upper stratosphere (above 40 km) (Sakazaki et al., 2013; Parrish et al., 2014). Following Studer et al. (2013), the diurnal variation has the largest impact above 50 km, with differences between night-time and daytime of up to ±20 %. This might explain the variability observed in the relative biases at 50 km but cannot explain the significant relative biases observed at altitudes below 50 km, where the differences in local time are expected to have an impact of less than ±5 % on the differences in ozone. We conclude that the variability observed in the biases is intrinsic and instrument-dependent and not caused by the differences in local time.

Comparison to other validation results
Our results can be compared with other validation works, as discussed in the following. Eckert et al. (2014) performed a detailed drift analysis of MIPAS V5 220 to derive a drift-corrected trend for the MIPAS ozone time series. We give an overview of their results for the drifts between MIPAS and OSIRIS and between MIPAS and ACE-FTS. For the drifts between MIPAS and OSIRIS, they found mostly negative, statistically insignificant drifts in the upper stratosphere, with negative statistically significant values in the northern middle latitudes. The drift signs are in agreement with ours if we compare the MIPAS-OSIRIS drifts in the Southern Hemisphere (light blue squares in Fig. 9). They find statistically insignificant positive drift values in the latitude bin of 30-40° S between 40 and 48 km. We observe a positive drift at 40 and 50 km that agrees qualitatively with their results. The drifts are on the order of 2-5 % decade−1, going up to ±10 % decade−1 at lower altitudes, and are insignificant, in agreement with their findings. In the northern middle latitudes the drifts do not agree with our results. In our case the drifts are non-significant and positive, whereas in their results the drifts are negative and significant (see Fig. 8b). The discrepancy may arise from the different time periods, i.e. 2005-2010 in our case and 2002-2010 in their case, and from neglecting the quasi-biennial oscillation in our drift analysis.
For the comparison between MIPAS and ACE-FTS, the signs of the drifts are consistent with our results for the southern middle latitudes, 30-60° S. Both papers observe non-significant drifts in this latitude band. On the other hand, at 25 km we see a significant drift between ACE-FTS and MIPAS which is not observed by Eckert et al. (2014). In the northern middle latitudes, 30-60° N, the dominating sign of the drifts is negative in our case, in agreement with their results above 22 km. Below this altitude we still observe a negative drift, in contrast to their findings. Adams et al. (2014) analysed the differences between OSIRIS V5.07 and GOMOS V6 ozone profiles. In their comparison, mean relative difference values for the tropical band are lower than 5 % between 20 and 40 km. At 40 km OSIRIS is lower than GOMOS by about 10 % in the tropics. In our case, the comparison between OSIRIS 5.01 and GOMOS V6 shows a similar mean relative difference and shape, especially regarding the sign and values of the mean relative difference between 20 and 40 km. The update of the ozone data led to a reduction of the mean relative difference (compared to GOMOS V5) between the two instruments in this specific region. The comparison of the relative drift is in general agreement, except at 45 km, where they observe a significant negative drift and we only see a positive non-significant drift between OSIRIS and GOMOS. Significant drift values are only observed in the northern middle latitudes, 30-60° N, below 25 km. Gebhard et al. (2014) performed individual trend analyses of three sensors (MLS, SCIAMACHY, and OSIRIS). The results show a significant trend in the SCIAMACHY, MLS, and OSIRIS data at 35 km in the tropical latitude band of 20° S-20° N. Their results are consistent with our findings of a significant relative drift between OSIRIS and SCIAMACHY at 35 km.
The methods applied differ between the studies: we used the mean relative differences, whereas the drifts given by Eckert et al. (2014) are based on absolute rather than relative differences, and Adams et al. (2014) provide drifts using a robust method based on median values. Other validation works are based on a few pairs, mainly from the perspective of a single comparison sensor. A caveat to all methods (including ours) is that non-linearity effects in biases and drifts can have an impact on the final derived parameters.

Conclusions
Comparisons of ozone limb/occultation profiles between six independent instruments from three platforms, i.e. Envisat, Odin, and SCISAT, have been performed. The pairwise comparison using collocated data has been used to establish the mean relative differences between 15 pairs of instruments. Monthly mean relative difference time series have been analysed by applying a linear regression model to the differences. The two regression parameters of the linear model, the slope α (relative drift) and the intercept β (relative bias) for the reference time of February 2005, have been calculated for different altitudes and latitude bands. Between 25 and 50 km β is within ±22 % (in the majority of cases below ±10 %). Large variability in the lowermost stratosphere below 20 km is observed for all pairs in the tropics. This can be explained by retrieval problems for the sensors due to low signal-to-noise ratios, larger natural variability, and the impact of clouds and aerosols.
Overall, for the tropics between 25 and 50 km, β can be sorted into four groups of reference sensors: OSIRIS (balanced β range), GOMOS (low negative β range), MIPAS and SCIAMACHY (positive β range), and SMR (systematic negative β range).
The relative drifts between the various instruments can be quite large at some altitudes, but because of the short data record (about 10 years), they are mostly statistically insignificant.
Since 90 % of the pairs presented show no significant relative drift, we can conclude that merging of the data sets from these six instruments is possible.
The evaluation of relative biases and relative drifts between sensor pairs is valuable for understanding the differences between the sensors and between the derived trends, and it can be used to estimate the added uncertainty in physical trends arising from the drift. The added drift uncertainty is estimated at about 3 % decade−1 (1σ).
The Supplement related to this article is available online at doi:10.5194/amt-8-4369-2015-supplement.