Ozone ProfilE Retrieval Algorithm (OPERA) for nadir-looking satellite instruments in the UV–VIS

For the retrieval of the vertical distribution of ozone in the atmosphere the Ozone ProfilE Retrieval Algorithm (OPERA) has been further developed. The new version (1.26) of OPERA is capable of retrieving ozone profiles from UV–VIS observations of most nadir-looking satellite instruments like GOME, SCIAMACHY, OMI and GOME-2. The setup of OPERA is described and results are presented for GOME and GOME-2 observations. The retrieved ozone profiles are globally compared to ozone sondes for the years 1997 and 2008. Relative differences between GOME/GOME-2 and ozone sondes are within the limits as specified by the user requirements from the Climate Change Initiative (CCI) programme of ESA (20 % in the troposphere, 15 % in the stratosphere). To demonstrate the performance of the algorithm under extreme circumstances, the 2009 Antarctic ozone hole season was investigated in more detail using GOME-2 ozone profiles and lidar data, which showed an unusual persistence of the vortex over the Río Gallegos observing station (51° S, 69.3° W). By applying OPERA to multiple instruments, a time series of ozone profiles from 1996 to 2013 from a single robust algorithm can be created.


Introduction
Ozone is an important trace gas in the Earth's atmosphere. Whereas ozone in the stratosphere is essential to protect life from harmful UV radiation, ozone in the troposphere is considered to be a pollutant. At the same time ozone is a climate-forcing gas, and is therefore listed as one of the essential climate variables (ECV) by GCOS WMO (http://gcos.wmo.int, see e.g. 2010). Vertical information on the distribution of ozone is required for the study of climate change, numerical weather forecasts, air quality and UV index.
The most accurate method to measure the vertical ozone concentration is by means of balloon-borne ozone sondes, but these have two drawbacks. First, they only reach as high as about 30 km. Second, it is impossible to obtain global coverage using sondes. These problems can be partly overcome by using satellite-based measurements. In 1957 the first algorithm was described for calculating the energy in the incident radiation at a satellite-based detector measuring backscattered solar light (Singer and Wentworth, 1957). A few years later Twomey (1961) showed how to actually retrieve the ozone concentration from the incident radiation at the detector.
The first satellite instrument designed to measure the vertical distribution of ozone was the backscatter ultraviolet (BUV) spectrometer instrument on NIMBUS 4, which was launched in 1970. It was followed by the solar backscatter ultraviolet (SBUV) on NIMBUS 7 in 1978 and the SBUV/2 family aboard the NOAA satellites from 1985 onwards. A complete description of the retrieval algorithm for the (S)BUV instruments can be found in Bhartia et al. (1996).
In April 1995 the Global Ozone Monitoring Experiment (GOME) instrument was launched aboard the second European Remote Sensing satellite (ERS-2). GOME was the first of a new series of instruments with an increased wavelength range and higher spectral resolution with respect to the (S)BUV instruments. Other instruments followed, e.g. the SCanning Imaging Absorption spectroMeter for Atmospheric CartograpHY (SCIAMACHY; see Bovensmann et al., 1999), which was launched aboard ENVISAT in 2002; the Ozone Monitoring Instrument (OMI; see Levelt et al., 2006), launched in 2004 aboard Aura; and GOME-2 (Callies et al., 2000), launched in 2006 aboard the first of EUMETSAT's Metop series.
The development of the Ozone ProfilE Retrieval Algorithm (OPERA) started as a retrieval algorithm for GOME data (van der A et al., 2002). In this version, the forward radiative transfer model (RTM) MODTRAN (Anderson et al., 1995; Berk et al., 1989) was used. Ozone cross sections were derived from the high-resolution transmission molecular database 1996 (HITRAN96). The Ring effect was accounted for, but polarisation was neglected. The a priori information was taken from the Fortuin and Kelder climatology (Fortuin and Kelder, 1998). Clouds were modelled by assuming a higher surface albedo.
The OPERA version (1.03) used in the ozone profile retrieval algorithm review paper by Meijer et al. (2006) included improvements to the wavelength calibration, polarisation sensitivity correction and degradation correction. The MODTRAN radiative transfer model was replaced by the LIDORT-A RTM (van Oss and Spurr, 2002). Cloud properties were calculated using the Fast Retrieval Scheme for Clouds from the Oxygen A band (FRESCO; Koelemeijer et al., 2001). Mijling et al. (2010) studied the convergence statistics of OPERA (v. 1.0.9) for GOME in order to improve the profile retrieval. They identified certain geographical regions where OPERA has problems in converging, such as the South Atlantic Anomaly region and above deserts. The effects of input data, such as ozone cross sections and climatology, on the retrieval were also investigated. It was found that by applying these adaptations, the number of non-convergent retrievals was reduced from 10.7 to 2.1 %, and the mean number of iteration steps from 5.1 to 3.8.
In this article, we will describe, for the first time, OPERA version 1.26 applied to the retrieval of GOME and GOME-2 profiles. A different version of OPERA has been used operationally since 2007 within the O3MSAF of EUMETSAT (http://o3msaf.fmi.fi/index.html) for GOME-2 profile retrieval which has been validated using ozone sondes, lidar and microwave instruments (Delcloo and Kins, 2009). That version performs well under challenging circumstances such as the Antarctic ozone hole (van Peet et al., 2009). The OPERA version described here is not limited to GOME-2, however, but is also applicable to GOME and the retrieval of SCIAMACHY and OMI data is under development. Because OPERA can be applied to different instruments, it is used in the development of an algorithm to produce a 15-year-long time series of ozone profiles from GOME, SCIAMACHY, GOME-2 and OMI within the ozone project of ESA's Climate Change Initiative (CCI) programme (http://www.esa-ozone-cci.org/). Within this project, a comparison is made (Keppens, 2013) between OPERA and the retrieval scheme developed at the Rutherford Appleton Laboratory (Miles, 2013).
In Sect. 2 we give a description of GOME and GOME-2. In Sect. 3 we give a short overview of the theoretical background of OPERA and the changes with respect to other versions. In Sect. 4 we will show the results for an intercomparison of GOME and GOME-2 retrievals with ozone sondes. Finally, in Sect. 5 we will show how well OPERA is capable of capturing the dynamics of the Antarctic ozone hole during the 2009 season.

GOME
In April 1995 the Global Ozone Monitoring Experiment (GOME) was launched aboard the second European Remote Sensing satellite (ERS-2). One of the major changes with respect to the (S)BUV instruments was the wavelength range and the higher spectral resolution. Retrieval algorithms based on optimal estimation (see, for example, Rodgers, 2000) for GOME were developed by, for example, Munro et al. (1998), Hasekamp and Landgraf (2001), van der A et al. (2002) and Liu et al. (2005). No official ESA ozone profile product exists for GOME, but a comprehensive intercomparison of different GOME retrieval algorithms was done by Meijer et al. (2006). GOME is a nadir-viewing instrument that measures the backscattered radiation from the atmosphere between 240 and 790 nm at a resolution of 0.2-2.4 nm. GOME uses a scanning mirror with a period of 4.5 s in the forward scan direction and 1.5 s in the backward scan direction.
Because OPERA uses the part of the spectrum between 265 and 330 nm, only parts of GOME channels 1 (237 to 307 nm) and 2 (312 to 406 nm) are used. In order to achieve a sufficient signal-to-noise ratio, part of channel 1 (channel 1a) is read out every 12 s (two forward and two backward scans), while the other part of channel 1 (channel 1b) and channel 2 are read out every 1.5 s. Table 3 gives the relative measurement noise as reported in the level 1 data for a few selected wavelengths. More information on how the different channels are combined is given in Sect. 4.2.

GOME-2
The successor of GOME was GOME-2 (Callies et al., 2000), launched in 2006 aboard the first satellite in EUMETSAT's Metop satellite series. The experience gained in the operation of GOME led to a significant number of changes, but the overall concept remained the same. GOME-2 measures backscattered solar light from the Earth's atmosphere between 250 and 790 nm in four channels with a relatively high spectral resolution (0.2–0.4 nm). GOME-2 uses a scanning mirror similar to GOME; a forward scan takes 4.5 s and the backward scan takes 1.5 s. In the normal mode, a forward scan corresponds to 40 km × 1920 km, which yields an almost global daily coverage. Channel 1a has an integration time of 1.5 s, corresponding to three ground pixels in a forward scan with a size of 40 km × 640 km. Bands 1b/2b have an integration time of 0.1875 s, corresponding to 24 ground pixels in a forward scan with a size of 40 km × 80 km. Table 3 gives the relative measurement noise as reported in the level 1 data for a few selected wavelengths. More information on how the different channels are combined is given in Sect. 4.3.

Table 1. Configurable parameters of OPERA.
Parameter | Options | Setting used
radiative transfer model | LIDORT-A (van Oss and Spurr, 2002); LABOS (used in the operational OMI retrieval algorithm; see e.g. Kroon et al., 2011) | LIDORT-A (see Sect. 3.2.5)
number of streams in the RTM | LIDORT-A: four or six streams; LABOS: multiple of 2 | six
Raman scattering | on or off | off
window bands | variable wavelength windows to use in the retrieval; can be set independent from the instrument channels |
pressure grid | configurable levels which can be adapted "on the fly" to match surface pressure and cloud-top pressure | see Table 2
O3 cross section | temperature parameterised cross sections by Bass and Paur (1985), or by Brion et al. (1993), Brion et al. (1998), Daumont et al. (1992) and Malicet et al. (1995); the polynomial expansion can be based on four or five temperatures |

Retrieval theory
The retrieval theory and notation used is based on Rodgers (2000). The state of the atmosphere can be represented by the state vector $\mathbf{x}$, which, in version 1.26 of OPERA, consists of the layers of the ozone profile, the albedo (see Sect. 3.2.3) and an additive offset (see Sect. 3.2.7). The measurement vector is given by $\mathbf{y}$. The relation between $\mathbf{x}$ and $\mathbf{y}$ is given by $\mathbf{y} = F(\mathbf{x})$, where $F$ is the forward model. This problem is generally underconstrained. Following the maximum a posteriori approach (Rodgers, 2000), the solution to $\mathbf{y} = F(\mathbf{x})$ is given by

$$\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{A}\,(\mathbf{x}_t - \mathbf{x}_a), \qquad (1)$$

$$\hat{\mathbf{S}} = (\mathbf{I} - \mathbf{A})\,\mathbf{S}_a, \qquad (2)$$

$$\mathbf{A} = \mathbf{S}_a \mathbf{K}^T \left(\mathbf{K}\mathbf{S}_a\mathbf{K}^T + \mathbf{S}_\epsilon\right)^{-1} \mathbf{K}, \qquad (3)$$

where $\hat{\mathbf{x}}$ is the retrieved state vector, $\mathbf{x}_a$ is the a priori, $\mathbf{A}$ is the averaging kernel, $\mathbf{x}_t$ is the "true" state of the atmosphere, $\hat{\mathbf{S}}$ is the retrieved covariance matrix, $\mathbf{I}$ is the identity matrix, $\mathbf{S}_a$ is the a priori covariance matrix, $\mathbf{K}$ is the weighting function matrix and $\mathbf{S}_\epsilon$ is the measurement covariance matrix.

In OPERA, the measurement is the ratio of the radiance over the irradiance. The radiance and irradiance (and the errors) are taken from the level 1 data and used to calculate the measurement error according to error propagation theory. $\mathbf{S}_\epsilon$ is a diagonal matrix, with the measurement errors squared on the diagonal. The averaging kernel can also be written as $\mathbf{A} = \partial\hat{\mathbf{x}}/\partial\mathbf{x}_t$ and gives the sensitivity of the retrieval to the true state of the atmosphere. The trace of $\mathbf{A}$ gives the degrees of freedom for the signal (DFS). When the DFS is high, the retrieval has learned more from the measurement than in the case of a low DFS, when most of the information in the retrieval will depend on the a priori. The total DFS can be regarded as the total number of independent pieces of information in the retrieved profile. The rows of $\mathbf{A}$ indicate how the true profile is smoothed out over the layers in the retrieval and are therefore also called smoothing functions. Ideally, the smoothing functions peak at the corresponding level and the half-width is a measure for the vertical resolution of the retrieval.
The covariance matrices include information on the uncertainty of x. The diagonal elements are the variances of the corresponding elements in the retrieved profile. The off-diagonal elements give the correlations between layers.
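The quantities introduced in this section can be illustrated with a small linear example. The sketch below assumes the standard optimal-estimation expressions from Rodgers (2000) for the gain matrix and the averaging kernel; the Jacobian, covariances and profiles are made-up toy numbers (not OPERA values), and the function name `retrieval_diagnostics` is hypothetical.

```python
import numpy as np

def retrieval_diagnostics(K, S_a, S_eps):
    """Return gain matrix G, averaging kernel A, retrieved covariance and DFS."""
    # Gain matrix: G = S_a K^T (K S_a K^T + S_eps)^-1
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    A = G @ K                                    # averaging kernel A = G K
    S_hat = (np.eye(S_a.shape[0]) - A) @ S_a     # retrieved covariance (I - A) S_a
    dfs = np.trace(A)                            # degrees of freedom for signal
    return G, A, S_hat, dfs

# Toy problem (illustrative numbers only): 3 layers, 4 measurements
K = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.5],
              [0.1, 0.5, 1.0],
              [0.3, 0.3, 0.3]])
S_a = np.diag([4.0, 4.0, 4.0])        # a priori variances on the diagonal
S_eps = np.diag([0.01] * 4)           # squared measurement errors on the diagonal

G, A, S_hat, dfs = retrieval_diagnostics(K, S_a, S_eps)

x_a = np.array([10.0, 20.0, 5.0])     # a priori profile
x_t = np.array([12.0, 18.0, 6.0])     # assumed "true" profile
x_hat = x_a + A @ (x_t - x_a)         # noise-free retrieved profile
```

In this toy case the DFS lies between 0 and the number of layers, and every diagonal element of the retrieved covariance is smaller than the corresponding a priori variance, which is the behaviour described in the text.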

Configuration
The Ozone ProfilE Retrieval Algorithm (OPERA) has many configurable parameters. The most important ones are listed in Table 1 and their settings are explained in more detail in the following sections.

Retrieval grid
The vertical resolution of retrieved nadir ozone profiles ranges between 7 and 15 km, depending on altitude, solar zenith angle and albedo (Liu et al., 2005; Meijer et al., 2006). A vertical resolution of 10 km or worse is achieved in the troposphere and upper stratosphere (≥ 40 km), while values of 7 km have been reported for the middle stratosphere (at about 25 km). The Nyquist criterion states that, in order to resolve a given vertical scale, the signal should be sampled at least twice within that scale.
Another way to decide on the thickness of the retrieval layers is to check the DFS as a function of altitude. If the DFS remains constant when the altitude increases, the layers in that altitude range do not add information to the profile and can therefore be combined.

Fig. 1. The cumulative DFS for a GOME observation on 26 May 1997 (blue) and for GOME-2 on 4 April 2008 (red) over Europe. The lines marked with crosses are the DFS for a high-resolution, 40-layer retrieval grid, while the lines marked with dots are the DFS for a retrieval on the 16-layer grid (see Table 2). The green line represents the same observation from GOME-2, but is retrieved without the additive offset. The horizontal dashed line is the thermal tropopause.
In Fig. 1, examples of the DFS of both a GOME and a GOME-2 observation over Europe are plotted as a function of altitude. The light-blue and red lines give the DFS for a high-resolution, 40-layer retrieval grid. The dark-blue and red lines give the same retrievals on the reduced 16-layer retrieval grid. Both low in the troposphere and high in the stratosphere, the DFS does not increase with height, which is an indication that these layers do not add information to the retrieved profile.
Above 60 km, the retrieved partial columns are practically zero, and therefore there appears to be hardly any reason to retrieve ozone above 60 km. However, for radiation balance in the radiative transfer model, the retrieval grid has been extended up to 80 km (0.01 hPa).
The retrieval grid used here consists of 16 layers; an example for the DFS is given by the red line in Fig. 1. The altitudes of the layer boundaries are given in Table 2. The grid has two layers each 6 km thick from the surface up to 12 km; between 12 and 60 km the layers are 4 km thick, while above 60 km, two layers of 12 km each have been added for radiation balance in the radiative transfer model.

Ozone cross section
Several cross-section databases can be selected for use in OPERA. For OPERA version 1.26 the temperature parameterised cross sections of Brion, Daumont and Malicet have been used (Brion et al., 1993, 1998; Daumont et al., 1992; Malicet et al., 1995). Using the pressure grid defined in Table 2, ERA-Interim temperature profiles from the European Centre for Medium-Range Weather Forecasts (ECMWF; see Dee et al., 2011; Dragani, 2011) provide the temperature information for the ozone cross sections.

Clouds and surface albedo
For GOME and GOME-2, OPERA uses the FRESCO algorithm (Wang et al., 2008) to calculate the cloud-top pressure, cloud fraction and cloud albedo. FRESCO uses the surface albedo database by Koelemeijer et al. (2003), and the same values are used in OPERA. OPERA calculates two spectra: one for a completely cloudy case and one for a completely cloud-free case. The resulting spectrum is the average of these two, weighted by the cloud fraction. During the optimal estimation, either the surface albedo or the cloud albedo is included in the state vector and the other is held constant. The cloud fraction determines which option is used: if the cloud fraction is less than 0.2 (this value is configurable) the surface albedo is fitted and the cloud albedo is held constant. For cloud fractions larger than 0.2 the cloud albedo is fitted and the surface albedo is constant. By fitting an effective cloud fraction, the presence of aerosols is partly taken into account in the cloud retrieval. The error made with this procedure is smaller than when taking a (random) guess at the unknown aerosol distribution (confirmed by Boersma et al., 2004, for GOME NO 2 retrievals). If snow/ice is detected, only a cloud-free retrieval is done and the surface albedo is fitted.
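The cloud treatment described above can be sketched in a few lines. This is an illustration only, not OPERA code: the function names are hypothetical, the spectra are toy values, and the behaviour at a cloud fraction of exactly 0.2 (assigned to the cloud-albedo branch here) is an assumption, since the text only specifies "less than" and "larger than" 0.2.

```python
def combined_spectrum(clear, cloudy, cloud_fraction):
    """Average of the cloud-free and fully cloudy spectra, weighted by cloud fraction."""
    return [(1.0 - cloud_fraction) * c + cloud_fraction * d
            for c, d in zip(clear, cloudy)]

def fitted_albedo(cloud_fraction, snow_ice=False, threshold=0.2):
    """Return which albedo enters the state vector: 'surface' or 'cloud'."""
    if snow_ice:
        # Over snow/ice only a cloud-free retrieval is done; surface albedo is fitted.
        return "surface"
    # Below the (configurable) threshold the surface albedo is fitted,
    # otherwise the cloud albedo. Equality goes to 'cloud' (assumption).
    return "surface" if cloud_fraction < threshold else "cloud"

# Toy reflectance spectra at two wavelengths (illustrative numbers only)
clear = [0.1, 0.2]
cloudy = [0.5, 0.6]
spectrum = combined_spectrum(clear, cloudy, cloud_fraction=0.25)
choice = fitted_albedo(0.25)
```

Whichever albedo is not selected would be held constant during the optimal estimation, as described in the text.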

Climatology
OPERA can use three different ozone climatologies as an a priori profile. These are the Fortuin and Kelder climatology (Fortuin and Kelder, 1998); the TOMS climatology (Bhartia and Wellemeyer, 2002); and the McPeters, Labow and Logan climatology (McPeters et al., 2007, MLL hereafter). Mijling et al. (2010) investigated the effect of these climatologies on the average number of iterations needed for convergence. The Fortuin and Kelder climatology is based on data from 1980 to 1991, which does not completely capture the Antarctic ozone depletion. The TOMS climatology requires an estimate of the total ozone column as an extra parameter in addition to latitude and time. It also requires an estimate of the error in the profile, which is not provided with the climatology. The MLL climatology was selected for the ozone profile retrievals in OPERA since it is more recent than the Fortuin and Kelder climatology and does not need estimates of the total column and error.
In an optimal estimation procedure, the full a priori covariance matrix is needed instead of only the error on the a priori profile. The MLL climatology does not include information on the covariance matrix, which therefore has to be constructed. For OPERA, this is done with an exponential decrease in pressure (see, for example, Hoogen et al., 1999; Meijer et al., 2006). The off-diagonal elements of the a priori covariance matrix ($\mathbf{S}_a$) depend on the diagonal elements as

$$S_a(i,j) = \sqrt{S_a(i,i)\,S_a(j,j)}\;\exp\!\left(-\frac{\left|\log_{10} P(i) - \log_{10} P(j)\right|}{l}\right),$$

where $i$ and $j$ are used to iterate over the layers of the a priori profile, $S_a(i,i)$ are the variances taken from the climatology and $P(i)$ is the pressure. The variable $l$ is the correlation length, which in OPERA is expressed in pressure decades and set to 0.3 (approximately 5 km).
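A minimal sketch of how such a covariance matrix could be built is given below. It assumes the commonly used form in which the correlation between two layers falls off exponentially with their separation in log10 pressure (consistent with a correlation length expressed in pressure decades); the function name and the example variances and pressures are hypothetical.

```python
import math

def apriori_covariance(variances, pressures, corr_length=0.3):
    """Build a full a priori covariance matrix from layer variances.

    variances   : a priori variances per layer (diagonal of S_a)
    pressures   : layer pressures in hPa (any consistent unit works)
    corr_length : correlation length in pressure decades
    """
    n = len(variances)
    S_a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Separation of the two layers in pressure decades
            dlogp = abs(math.log10(pressures[i] / pressures[j]))
            S_a[i][j] = (math.sqrt(variances[i] * variances[j])
                         * math.exp(-dlogp / corr_length))
    return S_a

# Two layers one pressure decade apart (illustrative numbers only)
S_a = apriori_covariance([4.0, 9.0], [1000.0, 100.0])
```

With a correlation length of 0.3 decades, layers one full decade apart are only weakly correlated (a factor exp(-1/0.3), roughly 0.04), while the diagonal reproduces the input variances.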

Radiative transfer
OPERA can use two radiative transfer models, LABOS and LIDORT-A. The LABOS radiative transfer model was recently developed at the Royal Netherlands Meteorological Institute and is used for OMI profile retrievals (Kroon et al., 2011). Included in LABOS are an approximate treatment of rotational Raman scattering and a pseudo-spherical correction for direct sunlight. The assumption that the atmospheric layers are homogeneous holds only for multiple scattering. For single scattering, the atmospheric layers can be inhomogeneous. Further, weighting functions are calculated for specific altitudes in the atmosphere, namely at the interfaces between atmospheric layers and not for the atmospheric layers themselves.
LIDORT-A is an analytical solution of the radiative transfer equations, designed to be fast and accurate (van Oss and Spurr, 2002). While LABOS runs on any even number of streams, LIDORT-A runs only on either four or six streams. For a six-stream retrieval, however, LABOS takes longer than LIDORT-A. It should be noted that for the best results LABOS should run on at least eight streams, which would take even longer.
Both RTMs have the option to include a full treatment of rotational Raman scattering, which increases the processing time by a factor of 2. The effect on the retrieved profiles is small, and therefore it has been decided not to activate the rotational Raman scattering in the retrieval in favour of speed.
The radiative transfer model LIDORT-A (van Oss and Spurr, 2002) is used to calculate the radiance at the top of the model atmosphere because it is faster than LABOS. In addition to the model atmosphere an initial ozone profile and geometrical parameters such as (solar) viewing angles should be provided to the RTM. Additional atmospheric data can be provided in the form of trace gas and aerosol databases.

South Atlantic Anomaly
The South Atlantic Anomaly (SAA) is the region of Earth where satellite orbits pass through the inner Van Allen radiation belt. The high-energy particles contained in the belt can cause spikes and noise in the measurements. The effects are especially notable in the short-wavelength end of the spectrum, where the signal levels are low.
In the version 1.26 of OPERA, an SAA filter is implemented which is a slightly adapted version of the filter described by Mijling et al. (2010), in which, starting at a reference wavelength of 290 nm and progressing towards shorter wavelengths, a measurement is discarded when the reflectance is more than the reflectance of the previous accepted wavelength plus 3 times the reflectance error. In addition to that filter, wavelengths with a reflectance lower than 85 % of the previous accepted wavelength are now discarded.
Using the filter adds successful retrievals in a region where otherwise no successful retrievals would be done. No special flags are raised to indicate whether the retrieval comes from the SAA region.
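The filter logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the OPERA implementation: the function name is hypothetical, the measurement at the reference wavelength is taken as the first accepted point, the "reflectance error" is taken to be that of the candidate measurement, and wavelengths longer than the reference are passed through unfiltered.

```python
def saa_filter(wavelength, reflectance, error,
               ref_wavelength=290.0, n_sigma=3.0, min_ratio=0.85):
    """Return the sorted indices of the measurements that pass the filter."""
    # Candidates at or below the reference wavelength, ordered from the
    # reference wavelength towards shorter wavelengths.
    below = sorted((k for k, w in enumerate(wavelength) if w <= ref_wavelength),
                   key=lambda k: wavelength[k], reverse=True)
    # Longer wavelengths are not filtered (assumption).
    keep = [k for k, w in enumerate(wavelength) if w > ref_wavelength]
    if below:
        prev = below[0]            # first accepted point (assumption)
        keep.append(prev)
        for k in below[1:]:
            upper = reflectance[prev] + n_sigma * error[k]   # spike test
            lower = min_ratio * reflectance[prev]            # drop-out test
            if lower <= reflectance[k] <= upper:
                keep.append(k)
                prev = k           # becomes the new "previous accepted" point
    return sorted(keep)

# Toy spectrum with one spike (287 nm) and one drop-out (286 nm)
wl = [286.0, 287.0, 288.0, 289.0, 290.0, 295.0]
refl = [0.050, 0.200, 0.097, 0.099, 0.100, 0.300]
err = [0.002] * 6
accepted = saa_filter(wl, refl, err)
```

In this toy case the spike and the drop-out are rejected and the remaining four measurements are kept.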

Calibration
GOME-2 suffers from degradation of the detector in much the same way as GOME and SCIAMACHY. The throughput of the detector is changing, most notably in the short-wavelength end of the spectrum. Because the light paths for the Earth and solar radiance are different, the instrument degradation does not cancel out in the radiance/irradiance ratio. For GOME corrections are supplied along with the level 1 data, but for GOME-2 no such data are supplied with the level 1 data.
As a result of the degradation of the detector, the modelled radiance by the RTM for a given "true" profile is on average lower than the measured radiance for wavelengths smaller than 300 nm. In order to correct for both degradation and the detector's calibration, an offset is included for band 1 in the forward model to increase the photon count. This "additive offset" is added to the state vector and fitted in the optimal estimation procedure.
With the addition of the wavelength-independent additive offset (AO), the Sun-normalised radiance (SNR) is given by

$$\mathrm{SNR}(\lambda) = \frac{E(\lambda) + \mathrm{AO}}{I_0(\lambda)},$$

with $E$ the simulated earth radiance, $I_0$ the solar irradiance and $\lambda$ the wavelength. It is assumed that the wavelength is calibrated properly in the level 1 data, and no other checks are performed in OPERA.

Convergence
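Optimal estimation is an iterative process, so a convergence criterion has to be set in order to prevent the algorithm from iterating indefinitely. The next step in the iteration of the state vector is given by Eq. (5.10) in Rodgers (2000):

$$\mathbf{x}_{i+1} = \mathbf{x}_a + \mathbf{S}_a \mathbf{K}_i^T \left(\mathbf{K}_i \mathbf{S}_a \mathbf{K}_i^T + \mathbf{S}_\epsilon\right)^{-1} \left[\mathbf{y} - F(\mathbf{x}_i) + \mathbf{K}_i(\mathbf{x}_i - \mathbf{x}_a)\right],$$

where $\mathbf{K}_i$ is the Jacobian evaluated at $\mathbf{x}_i$ and $\mathbf{S}_\epsilon$ is the measurement covariance matrix. The covariance matrix of the solution is calculated according to Eq. (2), and the gain matrix ($\mathbf{G}$) according to Eq. (5.15) in Rodgers (2000), using the same Jacobian ($\mathbf{K}_i$) as in the final iteration step. The gain matrix and Jacobian are used to calculate the averaging kernel matrix according to $\mathbf{A} = \mathbf{G}\mathbf{K}$.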
In OPERA version 1.26, the convergence criterion (calculated according to Eq. 5.29 in Rodgers, 2000) is based on the magnitude of the state vector update, and convergence has been reached when the relative change in the state vector is less than 2 %. A maximum of 10 iterations has been set before the retrieval is flagged as not converged. Since the average number of iterations is between 3.5 and 4.5, an upper limit of 10 iterations will only stop a small fraction of the retrievals. Out-of-bounds retrieval values and too high χ 2 values produce additional error flags.
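The iteration and stopping rule can be illustrated with a toy problem. The sketch below is not OPERA code: the forward model `forward` (a simple exponential attenuation) is a stand-in for the radiative transfer model, all numbers are made up, and the convergence test (scaled state update d² below 2 % of the number of state elements) is one possible reading of the criterion described above.

```python
import numpy as np

def forward(x, C):
    """Toy forward model F(x) = exp(-C x) and its Jacobian K = dF/dx."""
    y = np.exp(-C @ x)
    K = -y[:, None] * C
    return y, K

def retrieve(y, x_a, S_a, S_eps, C, max_iter=10, tol=0.02):
    """Iterative maximum a posteriori retrieval; returns (x, S_hat, n_iter, converged)."""
    n = x_a.size
    x = x_a.copy()
    S_hat = S_a.copy()
    for it in range(1, max_iter + 1):
        F, K = forward(x, C)
        # Gain matrix for the current linearisation point
        G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
        x_new = x_a + G @ (y - F + K @ (x - x_a))
        S_hat = (np.eye(n) - G @ K) @ S_a
        # Size of the update, scaled by the retrieved covariance
        dx = x_new - x
        d2 = dx @ np.linalg.solve(S_hat, dx)
        x = x_new
        if d2 < tol * n:
            return x, S_hat, it, True
    return x, S_hat, max_iter, False   # flagged as not converged

# Toy set-up: 2 state elements, 4 measurements (illustrative numbers only)
C = np.array([[0.5, 0.1], [0.3, 0.3], [0.1, 0.5], [0.2, 0.4]])
x_true = np.array([1.0, 2.0])
x_a = np.array([1.2, 1.8])
S_a = np.diag([0.25, 0.25])
S_eps = np.diag([1e-4] * 4)
y, _ = forward(x_true, C)              # noise-free synthetic measurement

x_hat, S_hat, n_iter, converged = retrieve(y, x_a, S_a, S_eps, C)
```

For this mildly non-linear toy problem the loop stops well before the limit of 10 iterations, and the retrieved state ends up close to the assumed true state.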

Methodology
Only converged ozone profile retrievals with solar zenith angle less than 80° have been used for a short validation study. The profiles produced by OPERA are compared to ECC-type ozone sondes (models Z and 6) that were obtained from the World Ozone and Ultraviolet Radiation Data Centre (WOUDC, 2011).
To be accepted for the validation, the sonde station should be inside the pixel footprint of the satellite instrument. The sondes are required to reach a minimum altitude of 10 hPa, and the time difference between sonde launch and satellite overpass should not be more than 2 h. When multiple collocations occur, only the collocation with the sonde that is closest in time to the satellite overpass is used. Therefore, each retrieval is validated against one sonde profile. GOME profiles have been validated against sondes from 1997, while GOME-2 profiles have been validated against sondes from 2008. After applying the collocation criteria described above, 190 sondes from 25 stations worldwide (ranging from 1 to 48 sondes per station) were used for the validation of the GOME ozone retrievals, and 26 sonde stations with 564 sondes (ranging from 1 to 97 sondes per station) were used for the validation of GOME-2 profiles. The locations for the sonde stations that are used in the validation are given in Fig. 2.
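The collocation criteria above can be expressed as a small selection routine. This is a sketch with hypothetical data structures: each sonde is a dictionary whose keys (`launch`, `burst_hpa`, `in_footprint`, `id`) are invented for the illustration, and the footprint test is assumed to have been done elsewhere.

```python
from datetime import datetime, timedelta

def select_sonde(overpass, sondes, max_dt=timedelta(hours=2), max_burst_hpa=10.0):
    """Return the accepted sonde closest in time to the overpass, or None."""
    accepted = [
        s for s in sondes
        if s["in_footprint"]                        # station inside the pixel footprint
        and s["burst_hpa"] <= max_burst_hpa         # sonde reached at least 10 hPa
        and abs(s["launch"] - overpass) <= max_dt   # launched within 2 h of overpass
    ]
    if not accepted:
        return None
    # Of multiple collocations, keep only the one closest in time
    return min(accepted, key=lambda s: abs(s["launch"] - overpass))

overpass = datetime(1997, 5, 26, 10, 30)
sondes = [
    {"id": "A", "launch": datetime(1997, 5, 26, 9, 0), "burst_hpa": 8.0, "in_footprint": True},
    {"id": "B", "launch": datetime(1997, 5, 26, 10, 0), "burst_hpa": 25.0, "in_footprint": True},
    {"id": "C", "launch": datetime(1997, 5, 26, 11, 15), "burst_hpa": 6.0, "in_footprint": True},
    {"id": "D", "launch": datetime(1997, 5, 26, 10, 20), "burst_hpa": 7.0, "in_footprint": False},
]
best = select_sonde(overpass, sondes)
```

Here sonde B is rejected because it burst too low, D because its station lies outside the footprint, and C is preferred over A because it is closer in time.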
The ozone profiles from sondes that are collocated with satellite measurements are interpolated to the pressure grid used in the ozone profile retrieval and converted to DU layer⁻¹. Above the sonde burst level, the interpolated sonde profile is extended with the retrieval a priori partial columns. The interpolated and extended sonde profile ($\mathbf{x}$) is then convolved with the averaging kernel ($\mathbf{A}$) and the a priori profile ($\mathbf{x}_a$) according to Eq. (1), with $\mathbf{x}_t$ replaced by the sonde profile $\mathbf{x}$. The resulting $\hat{\mathbf{x}}$ is the smoothed sonde profile as it would have been observed by the satellite instrument. This smoothed sonde profile is compared with the actual collocated satellite measurement. This procedure is followed for each sonde station separately, but also for three zonal regions: the Southern Hemisphere (−90 to −30° latitude), the tropics (−30 to 30° latitude) and the Northern Hemisphere (30 to 90° latitude).
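The smoothing step can be sketched as follows, assuming the sonde has already been interpolated to the retrieval grid and that only the ozone-layer block of the averaging kernel is used. The function name, the `burst_index` argument (index of the first layer above the burst level) and the numbers are hypothetical.

```python
def smooth_sonde(sonde, apriori, A, burst_index):
    """Extend a sonde profile with the a priori and apply the averaging kernel.

    sonde       : sonde partial columns on the retrieval grid (DU per layer);
                  values from burst_index upwards are ignored
    apriori     : a priori partial columns on the same grid
    A           : averaging kernel (list of rows) for the ozone layers
    burst_index : index of the first layer above the sonde burst level
    """
    n = len(apriori)
    # Above the burst level the sonde is replaced by the a priori
    x = [sonde[k] if k < burst_index else apriori[k] for k in range(n)]
    diff = [x[k] - apriori[k] for k in range(n)]
    # x_smoothed = x_a + A (x - x_a)
    return [apriori[i] + sum(A[i][k] * diff[k] for k in range(n))
            for i in range(n)]

apriori = [12.0, 18.0, 5.0]
sonde = [10.0, 20.0, 0.0]          # top layer is above the burst level
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [[0.0] * 3 for _ in range(3)]

perfect = smooth_sonde(sonde, apriori, identity, burst_index=2)
no_info = smooth_sonde(sonde, apriori, zeros, burst_index=2)
```

The two limiting cases show the intent: with an identity averaging kernel the smoothed profile equals the extended sonde profile, while with a zero averaging kernel it falls back to the a priori.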

GOME
For the validation of GOME we used all ozone sondes for 1997 from the WOUDC database that fulfil the collocation criteria described above. The different integration times for channel 1a and the channels 1b and 2 result in different ground pixel sizes. One measurement from channel 1a covers an area at the surface of about 100 km × 960 km, and one forward scan measurement from channel 1b or 2 covers an area of 40 km × 320 km. During one channel 1a integration time, the forward scans from channel 1b and 2 are read out six times. Each of these six channel 1b and 2 spectra is combined with the same overlapping channel 1a spectrum. The ground pixel size for the ozone profiles is therefore equal to the channel 1b and 2 ground pixel size.

Table 3. Relative measurement noise in the level 1 data.
λ (nm) | 260 | 280 | 300 | 320 | 340
GOME | 5 % | 5 % | 1 % | < 1 % | < 1 %
GOME-2 | 25 % | 25 % | 5 % | < 1 % | < 1 %

Fig. 2. Locations of the sonde stations used in the validation.

Table 4 gives an overview of the validation results for GOME for the Southern Hemisphere (SH), the tropics (TR) and the Northern Hemisphere (NH). The global averages are given in the last column. On the first row the DFS are given for the GOME retrievals that collocate with the sonde measurements. The DFS is lowest in the tropics, indicating that more information in the profile is coming from the a priori. The number of iterations ("n_iter") needed for the retrieval to reach convergence is slightly higher in the tropics than for the other two regions.
The differences in DFS and number of iterations might be affected by the number of sondes used (the row with "n_sonde" in Table 4) for the validation. For the Southern Hemisphere and the tropics, far fewer sondes are available for the validation than for the Northern Hemisphere. The results in the global column are therefore biased towards the Northern Hemisphere results.
The final two rows in Table 4 give the total number of GOME pixels that were retrieved ("n_pix") and the percentage of converged pixels ("%"). The percentage of converged pixels is significantly lower for the Southern Hemisphere than for the tropics or the Northern Hemisphere. From Fig. 2 it can be seen that the Southern Hemisphere is represented by three stations only, one of them being on the Antarctic continent. Since OPERA performs only a cloud-free retrieval over snow and ice, using an effective scene albedo, it has difficulties in discerning snow- and ice-covered surfaces from middle- and high-level clouds. This might be a reason why the percentage of converged retrievals is lower for the Southern Hemisphere.

Figure 3 gives mean relative differences of the collocations between sondes and GOME. The Southern Hemisphere, tropics and Northern Hemisphere are indicated by the blue, red and green lines respectively (solid lines are the retrieved values, and the dashed lines are the a priori). The error bars indicate the 95 % confidence interval around the means. For most of the altitude range, the retrievals perform better than the a priori compared with sondes.
The vertical dashed lines are accuracy levels for the troposphere and stratosphere defined in the user requirements of the ozone project of the ESA CCI programme (http://www.esa-ozone-cci.org/). For the short-term variability, an accuracy of 20 % is required in the troposphere, while a 15 % accuracy is required in the stratosphere. The GOME retrievals are well within the required accuracy levels for the whole height range covered by the ozone sondes. The slight deviation at the top of the atmosphere is not significant since only one or two sondes reach this altitude.

If the true profile (taken as the sonde profile here) is close to the a priori, Eq. (1) shows that the retrieved profile is also close to the a priori. Another aspect of the retrieval is that the a priori uncertainty is reduced according to Eq. (2). Figure 4 gives the mean of the relative error differences between the retrieval and the a priori. For the Northern and Southern Hemisphere, the mean relative error difference decreases from about −10 % near the surface to about −85 % at the top of the atmosphere. The tropics behave somewhat differently, starting at −40 % near the surface, increasing to about −15 % near 200 hPa and decreasing to −65 % near the top of the atmosphere. The mean relative error difference is smaller than zero for all latitude bands and for all altitudes, indicating that the retrieval performs as expected in reducing the a priori error.
Averaging kernels for the same pixel that was used to construct the DFS profiles for GOME in Fig. 1 are plotted in Figs. 5a and 5b. The averaging kernel values at the nominal retrieval altitudes for the 40-layer retrieval are smaller than for the 16-layer retrieval. If the averaging kernel diagonal elements for the 40-layer retrieval are summed between the pressure levels of the 16-layer retrieval, the value is comparable to the corresponding diagonal element from the 16-layer retrieval.
In addition to the 16 ozone layers, there are two more state vector elements: the albedo (see Sect. 3.2.3) and the additive offset (see Sect. 3.2.7). Due to the selection of surface or cloud albedo in the state vector, the albedo distribution shows two peaks at 0.08 and 0.8 respectively. These values match the average albedo values for the surface and clouds and are observed in all zonal regions in all months.

Fig. 5a.
Averaging kernels for the 40-layer GOME retrieval over Europe that was also used in Fig. 1. The circles give the nominal altitude for the retrieval. The averaging kernels corresponding to the albedo and the additive offset have not been plotted.

Fig. 5b.
Averaging kernels for the 16-layer GOME retrieval over Europe that was also used for the blue line in Fig. 1. The circles give the nominal altitude for the retrieval. The averaging kernels corresponding to the albedo and the additive offset have not been plotted.
In the GOME level 1 data the instrument degradation is taken into account in the correction data supplied with the level 1 data. Therefore, the additive offset is stable and rather low: the global 1997 mean is 0.3 × 10⁹ photons with a standard deviation of 0.2 × 10⁹ photons.

Fig. 6.
Blue grid: the average of eight spectra from channels 1b/2b, the result combined with the corresponding channel 1a spectrum. Yellow grid: separate combination of each channel 1b/2b spectrum with the overlapping channel 1a spectrum. Red grid: channel 1a spectrum and one 1b/2b spectrum from one forward scan combined with the next forward scan. Green grid: channel 1a spectrum and two 1b/2b spectra from one forward scan combined with the next three forward scans.

GOME-2
Horizontal correlation lengths of ozone in the atmosphere are 350 to 400 km in the lower stratosphere and 100 to 150 km in the middle and upper troposphere (Sparling et al., 2006). Using a pixel footprint that is much smaller than the correlation length leads to oversampling and higher computational cost. Therefore a compromise must be found between the different correlation lengths, the pixel size used in the retrieval and the computational cost.
There are three options to combine GOME-2 channel 1a spectra with channels 1b and 2b. The first option is to average the channels 1b and 2b spectra (0.1875 s integration time) until the total integration time is equal to the channel 1a integration time (1.5 s). The resulting spectrum can be combined with the channel 1a spectrum resulting in a ground pixel size of 40 km × 640 km (blue pixels in Fig. 6).
The second option is to combine each of the channel 1b/2b spectra within the channel 1a integration time with the channel 1a spectrum. This will result in eight ground pixels with a size of 40 km × 80 km (yellow pixels in Fig. 6).
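The first two options can be illustrated with a minimal sketch. It assumes that each spectrum is a plain list of radiances and that "combining" means joining the channel 1a spectrum with the channel 1b/2b spectrum; the function names and data layout are hypothetical and are not taken from OPERA.

```python
# Illustrative sketch only: names and data layout are assumptions, not OPERA code.
T_CHANNEL_1A = 1.5       # s, channel 1a integration time (from the text)
T_CHANNEL_1B2B = 0.1875  # s, channel 1b/2b integration time (from the text)

# Number of channel 1b/2b spectra recorded during one channel 1a integration.
N_COADD = round(T_CHANNEL_1A / T_CHANNEL_1B2B)


def average_spectra(spectra):
    """Element-wise mean of a list of equally long spectra."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def combine_option_1(spec_1a, spectra_1b2b):
    """Option 1: one large pixel, 1a joined with the mean 1b/2b spectrum."""
    return [spec_1a + average_spectra(spectra_1b2b)]


def combine_option_2(spec_1a, spectra_1b2b):
    """Option 2: one small pixel per 1b/2b spectrum, each reusing the same 1a."""
    return [spec_1a + spec for spec in spectra_1b2b]
```

With the integration times quoted above, `N_COADD` is 8, which is consistent with eight 80 km wide pixels adding up to the 640 km wide pixel of the first option.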
The third option, called ATCT co-adding (along track, cross track), is different from the two options above in that it combines spectra from different forward scans, including channel 1a spectra. In Fig. 6, two different combinations are illustrated. The red borders give the ground pixel size when the channel 1b/2b spectra and the overlapping channel 1a spectrum in a forward scan are combined with the spectra from channel 1a and 1b/2b in the next forward scan. This results in ground pixels of approximately 80 km × 80 km. The green borders show the ground pixel size for a combination of two consecutive channel 1b/2b spectra with the overlapping channel 1a spectrum from a forward scan with the corresponding channel 1a and 1b/2b from the next three scan lines. This results in ground pixel sizes of approximately 160 km × 160 km.
Figure 7a-c show a comparison between the different methods of combining the measurements described above. In Fig. 7a, the pixel size is approximately 40 km × 640 km, which is much larger than the correlation length in the upper troposphere in one direction. As a consequence, the details visible in Fig. 7b (pixel size 40 km × 80 km) are smoothed out. Processing all data at the same high resolution as in the middle plot is not feasible due to the high computational cost. Therefore, we combine two GOME-2 pixels cross track and four along track as in Fig. 7c (pixel size 160 km × 160 km), i.e. the green pixels in Fig. 6. At this resolution, the details from Fig. 7b are still visible and not completely smoothed out like in Fig. 7a.

Fig. 7a.
The partial ozone columns (DU) in the second layer of a retrieval (6 to 12 km) over Europe for the blue pixels that were illustrated in Fig. 6.

Fig. 7b.
The partial ozone columns (DU) in the second layer of a retrieval (6 to 12 km) over Europe for the yellow pixels that were illustrated in Fig. 6.

Fig. 7c.
The partial ozone columns (DU) in the second layer of a retrieval (6 to 12 km) over Europe for the green pixels that were illustrated in Fig. 6.

Fig. 8.
Mean of the relative differences per latitude band for GOME-2 retrievals. Error bars indicate the 95 % confidence interval around the mean. The blue line gives the result for the Southern Hemisphere (SH), red for the tropics (Tr) and green for the Northern Hemisphere (NH) (solid for the retrieval, dashed for the a priori). The vertical dashed lines are accuracy levels for the troposphere and stratosphere defined in the ozone project of the ESA CCI programme.

Fig. 9a.
Mean of the relative differences for GOME-2 retrievals in the Northern Hemisphere. A is the mean of (apri-sonde)/apri, B is the mean of (apri-sonde_ak)/apri, C is the mean of (sat-sonde)/sat and D is the mean of (sat-sonde_ak)/sat, where "sat" is the retrieved profile, "apri" is the a priori profile, "sonde" is the sonde profile on the retrieval grid and "sonde_ak" is the sonde profile convolved with the averaging kernel. The differences with sonde_ak are also used in Fig. 8. The numbers on the left side of the plot indicate the number of collocations between GOME-2 and sondes for that layer.

Fig. 9b.
Root mean square (RMS) of the absolute differences for GOME-2 retrievals in the Northern Hemisphere. A is the RMS of apri-sonde, B is the RMS of apri-sonde_ak, C is the RMS of sat-sonde and D is the RMS of sat-sonde_ak, where "sat" is the retrieved profile, "apri" is the a priori profile, "sonde" is the sonde profile on the retrieval grid and "sonde_ak" is the sonde profile convolved with the averaging kernel. The numbers on the left side of the plot indicate the number of collocations between GOME-2 and sondes for that layer.

Fig. 10.
Mean of the relative error differences per latitude band for GOME-2 retrievals and a priori. The blue line gives the result for the Southern Hemisphere (SH), red for the tropics (Tr) and green for the Northern Hemisphere (NH).

Table 5.
GOME-2 validation statistics for retrievals done on the green pixels in Fig. 6. Variables are the same as in Table 4.

For the GOME-2 validation we used all available ozone sondes for 2008 from the WOUDC database complying with the collocation criteria explained in Sect. 4.1. The sonde locations are shown in Fig. 2. Table 5 shows the validation data for GOME-2 in the same format as in Table 4. Although the differences in GOME-2 DFS between the Southern Hemisphere, tropics and Northern Hemisphere are similar to those of GOME, the absolute values for GOME-2 are lower than for GOME. This is caused by the different signal-to-noise ratios of the instruments. A smaller signal-to-noise ratio results in less information from the measurements and more information from the a priori. Table 6 gives the dependence of the DFS on the measurement noise. The DFS decreases with increasing measurement noise, which is the expected behaviour based on Eq. (3). It is assumed that the measurement errors are uncorrelated, so the measurement covariance matrix is a diagonal matrix. When a correlation between the measurements is introduced by setting the elements above and below the diagonal of the covariance matrix to 0.01 and 0.10 of the diagonal elements respectively, the mean DFS drops by 0.3 and 3 %.

Table 6.
GOME-2 DFS dependence on level 1 measurement error multiplied by "Factor". The values for factors 0 and ∞ are derived from Eq. (3) assuming that S is a diagonal matrix.

The number of iterations is lower for GOME-2 than for GOME. If the error in the measurement is large, then the retrieval will remain close to the a priori and fewer iterations are needed before convergence is reached. Therefore it is probable that the lower DFS and number of iterations of GOME-2 with respect to GOME are caused by the same underlying mechanism.
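The dependence of the DFS on the measurement noise can be illustrated with a small sketch. It assumes that the DFS is the trace of the averaging kernel in the standard optimal estimation framework, in which case the DFS equals the sum of λ²/(1 + λ²) over the singular values λ of the Jacobian scaled by the measurement and a priori covariances. Multiplying the measurement error by a factor f divides every λ by f. The singular values below are hypothetical and do not correspond to any GOME-2 retrieval.

```python
import math


def dfs(singular_values, noise_factor=1.0):
    """DFS as sum(lam^2 / (f^2 + lam^2)) for a noise scaling factor f.

    singular_values: singular values of the pre-whitened Jacobian
    (assumed strictly positive), evaluated for noise_factor = 1.
    """
    if math.isinf(noise_factor):
        return 0.0  # infinite noise: everything comes from the a priori
    f2 = noise_factor ** 2
    return sum(lam ** 2 / (f2 + lam ** 2) for lam in singular_values)


# Hypothetical singular values, chosen for illustration only.
LAMBDAS = [30.0, 10.0, 3.0, 1.0, 0.3, 0.1]

for factor in (0.0, 0.5, 1.0, 2.0, math.inf):
    print(factor, round(dfs(LAMBDAS, factor), 2))
```

In this simplified picture the DFS falls monotonically as the factor grows: a factor of 0 gives one degree of freedom per non-zero singular value, and an infinite factor gives zero.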
The number of sondes used in the validation is larger for GOME-2 than for GOME, especially in the Southern and Northern Hemisphere. The number of retrieved pixels is much larger, due to the higher spatial resolution of GOME-2.
The percentage of converged retrievals for GOME-2 with respect to GOME is higher in the Southern Hemisphere but lower in the tropics. The higher convergence in the Southern Hemisphere might be a consequence of the increased number of sonde stations for the validation of GOME-2 (six) with respect to GOME (three). There are more stations outside Antarctica, and consequently fewer problems with snow and ice. On the other hand, it is unclear why the percentage of converged retrievals for the tropics is lower for GOME-2 than for GOME. Figure 8 gives the mean relative differences for the validation of GOME-2. The retrieved values are similar to GOME, except for the second layer between 6 and 12 km. Here, GOME-2 significantly underestimates the sonde measurements in the Northern Hemisphere. In the tropics, the retrieved values for GOME-2 show a deviation comparable to that of GOME, but the bias is larger than for the a priori. The Southern and Northern Hemisphere show in general a better agreement up to 35 km between retrievals and sondes than between a priori and sondes.
In Fig. 9a, a more detailed example for the mean relative differences in the Northern Hemisphere is given. Both the a priori and the retrieved profile were compared to the sonde profile and the sonde profile convolved with the averaging kernel. The differences with non-convolved sonde profiles are similar to the differences with the convolved sonde profiles. With the exception of the second layer of the retrieval, both perform better than the a priori. Note that the number of sondes above 10 hPa rapidly decreases.
In order to see how much of the actual variation is captured by the retrieval, the root-mean-square (RMS) differences are calculated and plotted in Fig. 9b. The retrieval captures more of the actual variation than the a priori, both for the sonde profiles and sonde profiles convolved with the averaging kernel.

Fig. 11a.
Averaging kernels for the 40-layer GOME-2 retrieval over Europe that was also used in Fig. 1. The circles give the nominal altitude for the retrieval. The averaging kernels corresponding to the albedo and the additive offset have not been plotted.

Fig. 11b.
Averaging kernels for the 16-layer GOME-2 retrieval over Europe that was also used in Fig. 1. The circles give the nominal altitude for the retrieval. The averaging kernels corresponding to the albedo and the additive offset have not been plotted.

The mean relative errors of the retrieved profile and the a priori (see Fig. 10) are somewhat smaller for GOME-2 than for GOME. All three latitude bands start with relatively small error differences of the order of −5 to −10 % near the surface and decrease to about −65 % near the top of the atmosphere. Averaging kernels for the same pixel that was used to construct the DFS profiles for GOME-2 in Fig. 1 are plotted in Figs. 11a and 11b.
The albedo state vector element for GOME-2 is very similar to GOME, but the additive offset is different in two aspects. The global mean additive offset for 2008 is larger than for GOME (1997): 1.1 × 10⁹ photons with a standard deviation of 0.5 × 10⁹ photons, because no calibration data have been supplied along with the GOME-2 level 1 data. The tropical region shows a bimodal distribution with peaks at 1.1 × 10⁹ and 1.7 × 10⁹ photons. The second peak is caused by two stations that are close to the South Atlantic Anomaly (SAA) and which are used for the validation of GOME-2 (see Fig. 2). Since these two stations provided no data for 1997, they have not been used for the validation of GOME and the second peak is not observed in the GOME data. The additive offset for GOME-2 shows an increase from January until December 2008, with a maximum in June. This increase in additive offset is caused by the increased degradation of GOME-2. Figure 12 gives a global map of the additive offset for 2 years (2007–2008) of GOME-2 data. Note that global coverage is not achieved, because retrievals were only done over areas where ozone sondes were available. It is clear that the SAA has a significantly higher mean additive offset than the rest of the Earth. Therefore the SAA has been treated as a separate region. Figure 13 shows the time series of the additive offset for the NH, Tr, SH and the SAA. All regions show an increasing trend for the additive offset, with the SAA being significantly higher.

Fig. 15.
The mean of the differences between GOME-2 and the lidar at Río Gallegos (DU layer⁻¹) for the retrieval (blue) and the a priori (red). The solid line is the mean, and the dashed lines are the ±1 standard deviations. The first number in the column on the left side is the number of collocations between GOME-2 and the lidar and the second number is the mean number of lidar layers averaged for that layer during interpolation.

Fig. 16a.
The mean of the absolute differences for collocations that occurred inside of, or close to, the vortex. The retrieval is plotted in blue and the a priori in red. The solid line is the mean, and the dashed lines are the ±1 standard deviations. The first number in the column on the left side is the number of collocations between GOME-2 and the lidar, and the second number is the mean number of lidar layers averaged for that layer during interpolation.

Fig. 16b.
The mean of the absolute differences for collocations that occurred outside of the vortex boundary. The retrieval is plotted in blue and the a priori in red. The solid line is the mean, and the dashed lines are the ±1 standard deviations. The first number in the column on the left side is the number of collocations between GOME-2 and the lidar and the second number is the mean number of lidar layers averaged for that layer during interpolation.

As described in Sect. 3.2.7, GOME level 1 data are corrected for the instrument degradation, and therefore GOME does not show a trend in the additive offset. Since the same OPERA settings have been used for both GOME and GOME-2, the trend is most likely caused by instrument degradation.
The same GOME-2 data that were used in Figs. 1 and 11b were retrieved again without the additive offset. The green line in Fig. 1 shows the DFS profile, which is virtually the same as the retrieval with additive offset until an altitude of about 2 hPa (45 km). This is the same altitude above which the contribution of the true state to the retrieval starts to decrease. In the region above this altitude, the retrieval without additive offset gains about one third of a DFS compared to the retrieval including the additive offset. Both retrievals level off above 0.3 hPa (60 km), indicating that no more information is present above that altitude. The averaging kernels for the retrieval without additive offset are very similar to the kernels of the retrieval with additive offset (see Fig. 11b).
The additive offset has the largest effect in the region above 2 hPa, corresponding to the wavelength range of band 1. The validation results do not change significantly, but the global number of retrieved pixels that pass all quality criteria increases by 5.3 % when the additive offset is taken into account. The mean of the relative differences between the run with and the run without the additive offset is shown in Fig. 14.
Below 45 km, the retrieval is not very sensitive to the additive offset. The maximum difference is 2 %, with a standard deviation of the same order of magnitude. Above 45 km, however, the difference increases to 25-30 %, with a standard deviation of 20 %.
Recent studies (e.g. Kyrölä et al., 2013; Gebhardt et al., 2014) show that the ozone trend over the last 20 years is of the order of a few percent per decade at altitudes over 20 km. Above 45 km, the observed trends are much smaller than the observed differences between the retrievals with and without the additive offset. For this altitude range it is possible that the trend will be (partly) masked by the additive offset. Below 45 km, trends larger than 2 % will not be masked by the additive offset.

OPERA applied to the 2009 Antarctic ozone hole
In this section, we demonstrate the retrieval results by studying the Antarctic ozone hole in September, October, November and December 2009 as observed with GOME-2. For a period of three weeks in November 2009, the ozone hole showed an unusual persistence over the southern mid-latitude observing station in Río Gallegos (51° S, 69.3° W). During this period the a priori will be far from the true state of the atmosphere, which will be a challenge for OPERA. The lidar measurements made during the 2009 ozone hole season at this station (Wolfram et al., 2012) will be compared to GOME-2 ozone profile retrievals.
Van Peet et al. (2009) showed that GOME-2 is capable of studying the ozone hole dynamics in both space and time using ozone sondes from Neumayer Station. Using the lidar measurements from the Río Gallegos site enables us to extend the altitude range over which the GOME-2 measurements during ozone hole conditions can be validated. The ozone profiles are retrieved using the settings described in this article.
Note that Neumayer Station (70.65° S, 8.26° W) is located closer to the South Pole than the Río Gallegos observing station. As a consequence, the a priori for Neumayer Station will include vortex conditions, while the a priori for the Río Gallegos station will not. The vortex was present over Río Gallegos for a few consecutive weeks during November 2009 (de Laat et al., 2010). This is an interesting opportunity to study the performance of OPERA in situations where the a priori is very different from the actual ozone profile.
For the 2009 Antarctic ozone hole season we retrieved all GOME-2 data south of 45° S, and compared the GOME-2 retrievals to the lidar measurements from the Río Gallegos observing station. Due to the long integration times of the lidar (2.5 to 6 h), we selected those GOME-2 measurements that were closest in time to the centre of the integration time. The lidar operates at night, and time differences between the lidar and GOME-2 measurements vary between 6 and 11.5 h.
To make sure that the lidar and GOME-2 measure the same air mass, the assimilated total ozone columns from SCIAMACHY for both lidar measurement time and GOME-2 overpass time were compared. Measurements were not used if the difference was larger than 15 DU. The assimilated total ozone columns were produced by the TM3DAM model (Eskes et al., 2003), and the overpass data for Río Gallegos are freely available at www.temis.nl.
It is required for the lidar station to be within the GOME-2 pixel footprint, just as in the sonde validation. There are 25 lidar measurements available for the 2009 ozone hole season, and after applying the above collocation criteria, 18 were used for the validation.
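The collocation criteria can be summarised in a short sketch. The 15 DU threshold, the footprint requirement and the choice of the overpass closest to the centre of the lidar integration window come from the text. The data classes, field names and the order in which the criteria are applied are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_COLUMN_DIFF_DU = 15.0  # maximum total ozone difference (from the text)


@dataclass
class Overpass:
    time_h: float            # GOME-2 overpass time, hours since a common epoch
    total_column_du: float   # assimilated total ozone at overpass time
    contains_station: bool   # True if the station lies inside the footprint


@dataclass
class LidarProfile:
    start_h: float           # start of the lidar integration window
    end_h: float             # end of the lidar integration window
    total_column_du: float   # assimilated total ozone at lidar time


def collocate(lidar: LidarProfile, overpasses: List[Overpass]) -> Optional[Overpass]:
    """Return the matching GOME-2 overpass, or None if no criterion is met."""
    centre = 0.5 * (lidar.start_h + lidar.end_h)
    # Keep only pixels whose footprint contains the lidar station.
    candidates = [o for o in overpasses if o.contains_station]
    if not candidates:
        return None
    # Take the overpass closest in time to the centre of the integration.
    best = min(candidates, key=lambda o: abs(o.time_h - centre))
    # Reject the pair if the air mass changed too much between both times.
    if abs(best.total_column_du - lidar.total_column_du) > MAX_COLUMN_DIFF_DU:
        return None
    return best
```

Applying the air-mass test only to the closest overpass, as done here, is one possible reading of the procedure; testing every candidate first would be another.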
The lidar profiles were interpolated to partial columns on the same pressure grid that was used for the GOME-2 retrievals. Below 15 km and above 45 km (the lidar altitude range) the a priori partial columns were used to extend the lidar profile to cover the full GOME-2 retrieval range. The resulting lidar profiles were inserted into Eq. (1) as x_t and convolved with the averaging kernels. The mean differences with the GOME-2 profiles are shown in Fig. 15.
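The extension and convolution steps can be sketched as follows, assuming that Eq. (1) has the usual optimal estimation form x_ak = x_a + A (x_t − x_a), where x_a is the a priori profile and A the averaging kernel matrix. The function names and the use of `None` for layers outside the lidar range are illustrative choices.

```python
def extend_with_apriori(lidar_columns, apriori_columns):
    """Fill layers without lidar data (marked None) with a priori columns."""
    return [a if x is None else x
            for x, a in zip(lidar_columns, apriori_columns)]


def convolve_with_averaging_kernel(x_true, x_apriori, kernel):
    """Return x_a + A (x_t - x_a) for profiles given as lists of partial columns.

    kernel is a list of rows, one row per retrieval layer.
    """
    diff = [t - a for t, a in zip(x_true, x_apriori)]
    return [a + sum(k * d for k, d in zip(row, diff))
            for a, row in zip(x_apriori, kernel)]


# Toy three-layer example with made-up numbers (DU per layer).
apriori = [10.0, 40.0, 20.0]
lidar = [None, 30.0, None]  # lidar only covers the middle layer
kernel = [[0.2, 0.1, 0.0],
          [0.1, 0.6, 0.1],
          [0.0, 0.1, 0.3]]

x_t = extend_with_apriori(lidar, apriori)
x_ak = convolve_with_averaging_kernel(x_t, apriori, kernel)
```

In layers where the extended profile equals the a priori, the difference x_t − x_a is zero, so those layers contribute nothing to the convolved profile.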
Between 100 and 20 hPa the absolute difference is positive, while above 20 hPa it becomes negative. These deviations are larger than the theoretical error of the difference, and thus the bias is significant, but since it is only a few DU and because it changes from positive to negative, the effect on the total column will be small. Between 100 and 20 hPa the retrieval performs better than the a priori, while above 20 hPa the a priori is somewhat closer to the lidar measurements than the retrieval.
As shown by Wolfram et al. (2012), the vortex passes over Río Gallegos a couple of times during the 2009 ozone hole season. The observations were grouped by their location being inside or outside the vortex to investigate whether the biases observed in Fig. 15 were affected by the vortex. The position of the vortex boundary was determined using the methodology described by Nash et al. (1996), applied to the 430 K potential temperature level from the ERA-Interim data (Dee et al., 2011; Dragani, 2011).
For 8 of the 18 collocations, the lidar at Río Gallegos was inside or close to the vortex; for the other 10 it was outside the vortex. The mean relative differences are plotted in Fig. 16a and b. There is little difference between these plots and the plot showing the mean of all differences (see Fig. 15). This is an indication that GOME-2 performs similarly inside and outside of the vortex.
However, the a priori behaves very differently when the position of the vortex with respect to Río Gallegos is taken into account. When Río Gallegos is inside of the vortex (Fig. 16a), the a priori is far from the lidar measurements and shows a larger uncertainty compared to measurements made outside the vortex (Fig. 16b). This difference is caused by the climatology, which at the latitude of Río Gallegos (51° S, 69.3° W) is not representative of the polar air present inside the vortex.
To investigate the temporal evolution of the vortex over Río Gallegos, all GOME-2 daily data were gridded onto a 1° × 1° grid, and a time series of these daily fields over the location of Río Gallegos is shown in Fig. 17.
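The gridding method itself is not detailed here. A minimal sketch of one simple possibility is given below: each retrieval is assigned to the 1° × 1° cell that contains its pixel centre, and all values in a cell are averaged. The binning rule and the function names are assumptions and may differ from the procedure actually used.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, cell_deg=1.0):
    """Index of the grid cell containing (lat, lon), by its lower-left corner."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def grid_daily_mean(pixels, cell_deg=1.0):
    """Average all values whose pixel centre falls in the same grid cell.

    pixels: iterable of (lat, lon, value), with south and west negative.
    Returns a dict mapping cell index to the mean value in that cell.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in pixels:
        entry = sums[cell_of(lat, lon, cell_deg)]
        entry[0] += value
        entry[1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}
```

A time series over a station then follows from looking up the station's cell, for example `cell_of(-51.0, -69.3)`, in the gridded field of each day.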
The plot shows three episodes of stratospheric ozone depletion over Río Gallegos, indicated by the arrows at the top of the plot. At the end of September and the start of October, the vortex passes over Río Gallegos twice, but also rapidly disappears. Starting from the second week of November, a prolonged period is visible in which the vortex remains stationary over Río Gallegos. The three ozone-depleted periods are most visible in the two layers with maximum ozone concentration between 20 and 28 km. In the layers directly above and below this region, ozone depletion is also visible, but it does not always coincide with the depletion between 20 and 28 km due to the dynamics of the vortex. At the end of the ozone hole season in December, a slow recovery of the ozone concentration is visible between 20 and 28 km.
In Fig. 18a the location of the vortex is plotted for 26 September 2009, when the vortex passed Río Gallegos for the first time. Figure 18b shows the location of the vortex for 13 November 2009 at the start of the three-week stationary period.

Conclusions
The Ozone ProfilE Retrieval Algorithm (OPERA) version 1.26 is described for the first time. OPERA can be applied to measurements from nadir-looking satellite instruments in the UV-VIS spectral region such as GOME and GOME-2. In this paper, profiles are retrieved on a 16-layer pressure grid using the cross sections from Brion et al. (1993, 1998), Daumont et al. (1992), and Malicet et al. (1995), a priori information from the McPeters, Labow and Logan climatology (McPeters et al., 2007), and the LIDORT-A radiative transfer model (van Oss and Spurr, 2002).
Ozone profiles from GOME and GOME-2 have been validated against ozone sondes from the World Ozone and Ultraviolet Radiation Data Centre (WOUDC, 2011). For GOME the ozone sondes from 1997 were used and for GOME-2 the ozone sondes from 2008. Validation results show that the mean deviations between sondes and satellite instruments are within the accuracy levels (20 % in the troposphere, 15 % in the stratosphere) defined in the user requirements of the ozone project of the ESA CCI programme (http://www.esa-ozone-cci.org/). The only exception is the layer between 6 and 12 km for GOME-2 between 30 and 90° N, which shows a mean deviation of approximately 30 %. The cause for this deviation is not yet known.
The Antarctic ozone hole season 2009 was investigated in more detail using the lidar measurements from the Río Gallegos observing station (51° S, 69.3° W). In November 2009, the vortex remained stationary over this station for three weeks, posing a challenge to the retrieval because the a priori does not include ozone depletion at this latitude and will be far from the true state of the atmosphere.
Below the 20 hPa level GOME-2 overestimates the ozone concentration compared to the lidar measurements by a few DU per layer. Between 20 and 1 hPa the situation is reversed and GOME-2 underestimates the ozone concentration, also by a few DU per layer compared to the lidar. Using all GOME-2 profiles over the Río Gallegos station, a time series of GOME-2 ozone profiles was constructed. This time series enables the study of highly variable ozone concentrations caused by the passage of the Antarctic polar vortex. Three notable ozone depletion episodes over Río Gallegos were observed: two short ones at the end of September and the start of October, and a third that started around the second week of November and lasted for three weeks. A closer inspection of the location of the vortex edge with respect to Río Gallegos showed that the station was inside the vortex for most of this period.
For the first time a single ozone profile retrieval algorithm can be applied to multiple nadir-looking UV-VIS instruments such as GOME and GOME-2. Therefore, OPERA is being used for the development of an algorithm that will be used to create a consistent multi-sensor time series of ozone profiles. Such a time series is important for the study of climate change.