Improvements to the OMI O2-O2 operational cloud algorithm and comparisons with ground-based radar-lidar observations

The OMI (Ozone Monitoring Instrument on board NASA’s Earth Observing System (EOS) Aura satellite) OMCLDO2 cloud product supports trace gas retrievals of, for example, ozone and nitrogen dioxide. The OMCLDO2 algorithm derives the effective cloud fraction and effective cloud pressure using a DOAS (differential optical absorption spectroscopy) fit of the O2-O2 absorption feature around 477 nm. A new version of the OMI OMCLDO2 cloud product is presented that contains several improvements, of which the introduction of a temperature correction on the O2-O2 slant columns and the updated look-up tables have the largest impact. Whereas the differences in the effective cloud fraction are on average limited to 0.01, the differences in the effective cloud pressure can be up to 200 hPa, especially at cloud fractions below 0.3. As expected, the temperature correction depends on latitude and season. The updated look-up tables have a systematic effect on the cloud pressure at low cloud fractions. The improvements at low cloud fractions are very important for the retrieval of trace gases in the lower troposphere, for example for nitrogen dioxide and formaldehyde. The cloud pressure retrievals of the improved algorithm are compared with ground-based radar–lidar observations for three sites at mid-latitudes. For low clouds that have a limited vertical extent the comparison yields good agreement. For higher clouds, which are vertically extensive and often contain several layers, the satellite retrievals give a lower cloud height. For high clouds, mixed results are obtained.


Introduction
The Ozone Monitoring Instrument (OMI) is an imaging spectrometer developed by the Netherlands and Finland that was launched in 2004 on board the NASA Earth Observing System (EOS) Aura satellite (Levelt et al., 2006). OMI has a continuous spectral coverage from 270 to 500 nm, with a resolution of approximately 0.5 nm. The primary data products from OMI are concentrations of trace gases, including ozone, nitrogen dioxide and formaldehyde. The trace gas retrieval algorithms rely on information on cloud properties for each ground pixel. Such information is important because clouds have a significant impact on the photon path, which strongly affects the information on trace gases contained in the satellite observations. Clouds and aerosols play a double role: they shield the atmosphere below them, thus reducing the sensitivity to the trace gases in these layers, while increasing the sensitivity to layers above the clouds. In tropospheric trace gas retrievals of e.g. NO2, the sensitivity of the measurement to the trace gas concentration as a function of altitude is described by the air mass factor (e.g. Boersma et al., 2011). To compute the altitude-dependent air mass factor, information is needed on the cloud fraction and the cloud altitude (or pressure). Boersma et al. (2004) give a conservative estimate of 35-60 % for the total uncertainty in the tropospheric air mass factor for NO2. Uncertainties in the cloud parameters are amongst the leading errors in this estimate. Improving the retrieval of the cloud parameters will thus lead to a significant improvement in the tropospheric trace gas retrievals.
The O2-O2 cloud product (OMCLDO2) provides information on the cloud fraction and cloud pressure for each OMI observation. The OMCLDO2 product has been designed to support the trace gas retrieval algorithms and is therefore driven by what these algorithms need for cloud information. The trace gas retrieval algorithms use the independent pixel approximation (Zuidema and Evans, 1998) in combination with a Lambertian cloud model. The clouds are represented as opaque Lambertian reflectors with a fixed albedo of 0.8 (Stammes et al., 2008). To be consistent with the trace gas retrievals, the OMCLDO2 product uses the same cloud model. The previous OMCLDO2 algorithm has been described by Acarreta et al. (2004). Because the amount of information on clouds in the OMI spectral range is limited, the algorithm derives an effective cloud fraction and an effective cloud pressure, instead of physical parameters. The cloud fraction and cloud pressure are derived from the continuum radiance and the depth of the O2-O2 absorption feature around 477 nm. The algorithm does not distinguish between clouds and aerosols. Cloud-free conditions with thick aerosol layers will be represented by small cloud fractions. Similarly, thin clouds, for instance cirrus, will also be represented by a small cloud fraction. The retrieval requires knowledge of the surface reflectance and the surface altitude, which are obtained from static look-up tables (LUTs). Validation studies (Sneep et al., 2008) have shown that the effective cloud fraction compares well with effective cloud fractions derived from the cloud optical thickness observed by MODIS (Moderate Resolution Imaging Spectroradiometer) and that the derived cloud pressure corresponds to a level somewhere near the middle of the clouds. This sensitivity to the middle of the clouds differs significantly from observations in the thermal infrared, which are very sensitive to the actual cloud top pressure.
Published by Copernicus Publications on behalf of the European Geosciences Union.
The OMCLDO2 retrieval is similar to the FRESCO (Fast REtrieval Scheme for Clouds from the Oxygen A-band) algorithm (Wang et al., 2008), with the difference that it is based on the O2-O2 collision-induced absorption rather than O2 absorption lines. The reason for using O2-O2 is that the OMI spectral range does not cover the oxygen absorption bands. An important consideration when using the oxygen dimer is that its absorption scales with the oxygen density squared, which makes it increasingly sensitive towards the lower altitudes in the atmosphere. Besides the OMCLDO2 algorithm, there is also an OMI product based on the information from rotational Raman scattering (Joiner et al., 2012; Joiner and Vassilkov, 2006). It has been demonstrated that this product is also sensitive to scattering within the cloud layers; the retrieved quantity has been referred to as the optical centroid pressure.
This paper describes version 2.0 of the OMCLDO2 product. Compared to the current operational version 1.2.3, version 2.0 contains the following improvements and extensions:
1. A temperature correction is implemented, which is needed because of the density-squared dependence of the O2-O2 absorption.
2. Besides the independent pixel approximation, a second cloud model is implemented, which represents the scene as a Lambertian surface at a certain pressure level. The retrieved parameters are the scene albedo and scene pressure.
3. The look-up tables that are used to derive the cloud fraction and pressure have a higher number of nodes, especially for the surface albedo and the surface altitude.
4. A method has been implemented to remove outliers from the spectral fitting.
5. The resolution of the surface altitude look-up table is brought in line with the average OMI spatial resolution.
6. The gas absorption cross sections are made consistent with the OMI NO2 retrieval algorithm (van Geffen et al., 2015).
This paper is organized as follows: in Sect. 2 we describe the OMCLDO2 algorithm, focusing on the improvements that have been introduced in this version. In Sect. 3 we discuss the differences in the retrieval results of the new versus the previous algorithm. In Sect. 4 we present comparisons of the OMI-derived cloud pressures to ground-based radar-lidar observations.

Algorithm
The OMCLDO2 retrieval consists of two main steps. First, a DOAS (differential optical absorption spectroscopy) fit is performed in the spectral region between 460 and 490 nm to derive the O2-O2 slant column amount $N_{\mathrm{s,O_2O_2}}$ and the continuum reflectance $R_c$. In the second step, these parameters are converted into cloud fraction $c_f$, cloud pressure $p_\mathrm{cld}$, scene albedo $A_\mathrm{scn}$ and scene pressure $p_\mathrm{scn}$.

DOAS fit
The DOAS fit is performed on the Earth's reflectance. OMI measures the Earth's radiance and, once per day, the solar irradiance. The wavelength grids of the Earth radiance and solar irradiance differ because of the Doppler shift of the solar irradiance, the temperature variations of the OMI optical bench over an orbit and the non-homogeneous filling of the instrument slit for partly cloudy scenes (Voors et al., 2006). For each ground pixel, the irradiance ($F$) is interpolated onto the spectral grid of the radiance ($I$) (see van Geffen et al., 2015), and the reflectance is calculated as

$$R(\lambda) = \frac{\pi I(\lambda)}{\cos\theta_0 \, F(\lambda)},$$

where $\lambda$ is the wavelength and $\theta_0$ is the solar zenith angle. Next, the following equation (Eq. 1) is used for the DOAS fit: [...] Bogumil et al. (2000) at 220 K. $I_R(\lambda)$ was calculated using as input the expressions given in Chance and Spurr (1997), numbers given in Burrows et al. (1996) and the high-resolution solar irradiance provided by Dobber et al. (2008). The Raman spectrum was then calculated by convolving the solar spectrum with the rotational Raman lines and the OMI slit function and dividing by the convolved solar spectrum. We solve Eq. (1) using a modified Levenberg-Marquardt method, using the combined errors for the radiance and irradiance as weights. The fit parameters are the slant columns $N_{\mathrm{s,O_2O_2}}$ and $N_{\mathrm{s,O_3}}$, $c_R$, and the coefficients for the polynomial $P(\lambda)$. In addition, the diagnostics of the fit are obtained, including the residuals and error estimates for all fit parameters.

The residuals are analysed for possible outliers. Such outliers may be caused by high-energy particles hitting the detector or by varying dark current. Although all the information in the OMI Level 1B product is used to remove bad spectral pixels, some bad pixels may remain. For outlier detection several methods have been used (e.g. Richter et al., 2011), which are mostly based on Gaussian statistics, i.e. on the mean and standard deviation of the residual. Because particle hits will cause only increases in detected radiance and because the mean and standard deviation themselves are strongly affected by outliers, we selected the so-called box-plot method for outlier detection (http://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm). This method determines lower and upper values based on the 25th and 75th percentile of a distribution. If the lower quartile is Q1 and the upper quartile is Q3, then the difference (Q3 − Q1) is called the interquartile range, or IQ. We define outliers as those values smaller than Q1 − 1.5 IQ or larger than Q3 + 1.5 IQ. After removal of the outliers, we redo the fitting of the spectrum to provide the final fit parameters. We have noted that the outlier removal is not stable: continuing to iterate and applying the outlier removal procedure each time will often result in more and more spectral pixels being removed. We therefore iterate only once, thus removing the largest outliers.
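The box-plot criterion can be illustrated with a short Python sketch. It is only a minimal illustration of the Q1 − 1.5 IQ and Q3 + 1.5 IQ fences described above: the function names, the linear-interpolation convention used for the quartiles and the residual values are assumptions made for this example, not part of the operational code.

```python
def quartiles(values):
    """Return (Q1, Q3) using linear interpolation between order statistics.

    The interpolation convention is an assumption; other quartile
    definitions give slightly different fences.
    """
    data = sorted(values)

    def percentile(fraction):
        position = fraction * (len(data) - 1)
        lower = int(position)
        upper = min(lower + 1, len(data) - 1)
        weight = position - lower
        return data[lower] * (1.0 - weight) + data[upper] * weight

    return percentile(0.25), percentile(0.75)


def boxplot_outliers(residuals, factor=1.5):
    """Return indices of residuals outside [Q1 - factor*IQ, Q3 + factor*IQ]."""
    q1, q3 = quartiles(residuals)
    iq = q3 - q1  # interquartile range
    low, high = q1 - factor * iq, q3 + factor * iq
    return [i for i, r in enumerate(residuals) if r < low or r > high]


# Made-up fit residuals with one spike, e.g. from a particle hit.
residuals = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 5.0, -0.05]
flagged = boxplot_outliers(residuals)  # indices of spectral pixels to drop
```

In line with the single iteration described in the text, such a function would be called once on the residuals of the first fit, after which the fit is repeated without the flagged spectral pixels.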

Radiative transfer modelling
For the conversion of the DOAS fit parameters into cloud fraction and pressure, and scene albedo and scene pressure, we use radiative transfer modelling. In the new version of the OMCLDO2 algorithm we use two cloud models in the radiative transfer modelling: the independent pixel approximation (IPA) (see e.g. Zuidema and Evans, 1998) and the Lambertian equivalent reflectance (LER) model. The IPA reflectance at the top of the atmosphere is the weighted average of the clear and cloudy part. In our implementation of IPA, we calculate the cloudy part by treating the cloud as an opaque Lambertian reflector. For the LER method, we model the scene by assuming a Lambertian surface that covers the entire pixel. It is noted that the clouds and the ground surface in the IPA model are both treated as opaque Lambertian reflectors. Therefore, the name LER may be somewhat confusing, but it is used for consistency with the existing literature. For each ground pixel, both the IPA and LER methods are applied. The original version of the OMCLDO2 algorithm applied only the IPA method (Acarreta et al., 2004).
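For the IPA method, the effective cloud fraction $c_f$ can be calculated as

$$c_f = \frac{R(\lambda) - R_\mathrm{clr}(\lambda, A_\mathrm{sfc}, p_\mathrm{sfc})}{R_\mathrm{cld}(\lambda, A_\mathrm{cld}, p_\mathrm{cld}) - R_\mathrm{clr}(\lambda, A_\mathrm{sfc}, p_\mathrm{sfc})}, \tag{2}$$

where $R(\lambda)$ is the top-of-atmosphere reflectance at wavelength $\lambda$; $R_\mathrm{clr}(\lambda, A_\mathrm{sfc}, p_\mathrm{sfc})$ is the reflectance of the clear part, which is a function of the wavelength, the albedo of the surface $A_\mathrm{sfc}$ and the surface pressure $p_\mathrm{sfc}$; and $R_\mathrm{cld}(\lambda, A_\mathrm{cld}, p_\mathrm{cld})$ is the reflectance of the cloudy part, which is a function of the wavelength, the cloud albedo $A_\mathrm{cld}$ and the pressure of the cloud layer $p_\mathrm{cld}$.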
To retrieve cloud parameters, such as the effective cloud fraction (Eq. 2), knowledge is required of the surface reflectance and surface pressure. We represent the clouds as Lambertian reflectors with an albedo of 0.8. Different studies have found that this is an optimal choice for the purpose of cloud corrections in trace gas retrieval schemes (see Stammes et al., 2008, and references therein). By using a large value for the cloud albedo, optically thin clouds that cover the entire ground pixel will be represented as a Lambertian reflector that covers only a small part of the pixel. Thus, the cloud-free part will implicitly model the transmission of light through the cloud, which is otherwise absent in the Lambertian cloud model.
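As a minimal sketch of the independent pixel approximation, the top-of-atmosphere reflectance can be written as the weighted average $R = c_f R_\mathrm{cld} + (1 - c_f) R_\mathrm{clr}$ and solved for $c_f$. The function name and the reflectance values below are invented for illustration; in the actual retrieval the clear and cloudy reflectances come from radiative transfer calculations.

```python
def effective_cloud_fraction(r_toa, r_clear, r_cloud):
    """Solve R = c_f * R_cld + (1 - c_f) * R_clr for the cloud fraction c_f."""
    if r_cloud == r_clear:
        # Happens when the surface is as bright as the assumed cloud.
        raise ValueError("cloud fraction is undetermined when R_cld equals R_clr")
    return (r_toa - r_clear) / (r_cloud - r_clear)


# Illustrative (made-up) reflectances, not values from the paper.
r_clear, r_cloud = 0.10, 0.80
thin_cloud_scene = 0.24  # e.g. a thin cloud covering the whole ground pixel
c_f = effective_cloud_fraction(thin_cloud_scene, r_clear, r_cloud)
```

With these numbers the scene is represented by an effective cloud fraction of about 0.2, which illustrates how an optically thin cloud covering the full pixel maps onto a bright Lambertian reflector covering only part of it.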
Figure 1. The top-of-atmosphere reflectance is computed as the weighted average of the cloudy and clear parts, using the effective cloud fraction $c_f$ for the weighting. The IPA method requires knowledge of $A_\mathrm{sfc}$, $A_\mathrm{cld}$ and $p_\mathrm{sfc}$. In the LER model the ground pixel is modelled as a Lambertian surface with an albedo $A_\mathrm{scn}$ at a pressure level $p_\mathrm{scn}$. Note that the hatched areas below the opaque Lambertian reflectors indicate that these regions do not contribute in the radiative transfer calculations.
For both the IPA and LER model, we use the same set of forward model simulations of the reflectance between 460 and 490 nm; see Table 1. These simulations are performed for a mid-latitude summer standard atmosphere. The correction for different temperature profiles is discussed later on in this section. On the simulated reflectance the same DOAS fit is performed as for the measured OMI spectra (Eq. 1). For all the nodes listed in Table 1, we obtain the O2-O2 slant column as well as the continuum reflectance at 475 nm. The continuum is computed by evaluating the polynomial $P(\lambda)$ for this wavelength.

Look-up table inversion
The radiative transfer modelling described above provides the O2-O2 slant column and the continuum reflectance as a function $f$ of the Sun-satellite geometry and the cloud model parameters:

$$N_{\mathrm{s,O_2O_2}} = f_1(\theta_0, \theta, \varphi, c_f, p_\mathrm{cld}, A_\mathrm{sfc}, p_\mathrm{sfc}), \qquad R_c = f_2(\theta_0, \theta, \varphi, c_f, p_\mathrm{cld}, A_\mathrm{sfc}, p_\mathrm{sfc}), \tag{3}$$

where $R_c$ is the continuum reflectance, $\theta$ is the viewing zenith angle, $\varphi$ is the difference between the solar and viewing azimuth angles, $c_f$ is the cloud fraction, $p_\mathrm{cld}$ is the cloud pressure, $A_\mathrm{sfc}$ is the surface albedo and $p_\mathrm{sfc}$ is the surface pressure. It is noted that setting the cloud fraction to 0 yields the forward model for the LER model and that the functions $f_1$ and $f_2$ may not be fully independent. Instead of having the O2-O2 slant column and continuum reflectance as a function of the cloud and surface parameters (Eq. 3), the retrieval requires the functions $g$ and $h$ that describe the IPA and the LER model parameters as a function of the O2-O2 slant column and continuum reflectance:

$$\begin{aligned}
c_f &= g_1(\theta_0, \theta, \varphi, N_{\mathrm{s,O_2O_2}}, R_c, A_\mathrm{sfc}, p_\mathrm{sfc}), & p_\mathrm{cld} &= g_2(\theta_0, \theta, \varphi, N_{\mathrm{s,O_2O_2}}, R_c, A_\mathrm{sfc}, p_\mathrm{sfc}), \\
A_\mathrm{scn} &= h_1(\theta_0, \theta, \varphi, N_{\mathrm{s,O_2O_2}}, R_c), & p_\mathrm{scn} &= h_2(\theta_0, \theta, \varphi, N_{\mathrm{s,O_2O_2}}, R_c),
\end{aligned} \tag{4}$$

where $A_\mathrm{scn}$ is the scene albedo and $p_\mathrm{scn}$ is the scene pressure. Therefore, we invert the tables to give look-up tables with O2-O2 slant column and continuum reflectance as nodes. This conversion process involves interpolation and extrapolation, for which we use linear radial basis functions (Jones et al., 2001). The inversion is illustrated in Fig. 2. Because the simulated spectra cover a very wide range of conditions, it is unlikely that the extrapolations in this inversion procedure have a large effect on the final result. For example, as can be seen in the lower panel of Fig. 2, the results for the effective cloud pressure around a reflectance of 0.15 and $0.4 \times 10^{-44}$ molec$^2$ cm$^{-5}$ show strange patterns due to extrapolation. However, these combinations of continuum reflectance and O2-O2 slant columns will never occur for real atmospheric conditions, and therefore these parts of the tables are never reached.
The final results of the inversion procedure are LUTs for the cloud fraction, cloud pressure, scene albedo and scene pressure on the nodes listed in Table 2. In the retrieval algorithm linear interpolation is applied in all dimensions, except for the solar zenith angle, for which spline interpolation is applied. This is implemented because of the non-linear behaviour at large solar zenith angles.
The previous version of the OMCLDO2 algorithm also made use of inverted LUTs. However, they were not calculated using radial basis functions but were based on ad hoc fits of the continuum reflectance and O2-O2 slant column versus the cloud pressure and cloud fraction. Also, the number of nodes for low cloud fractions and low albedos was significantly lower in the previous version.
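The table inversion with linear radial basis functions can be sketched as follows. This is a self-contained pure-Python illustration of the technique, using the kernel $\varphi(r) = r$ without a polynomial term; the toy forward model, the normalised units and all function names are assumptions made for the example and do not reproduce the operational look-up tables.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            raise ValueError("singular system; nodes must be distinct")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (a[row][n] - tail) / a[row][row]
    return x


def fit_linear_rbf(points, values):
    """Return s(x) = sum_j w_j * |x - x_j| that passes through all data points."""
    kernel = [[math.dist(p, q) for q in points] for p in points]
    weights = solve_linear_system(kernel, values)

    def interpolant(x):
        return sum(w * math.dist(x, q) for w, q in zip(weights, points))

    return interpolant


# Toy forward model in normalised units (hypothetical, for illustration only):
# it maps (c_f, p_cld) to a slant column n_s and a continuum reflectance r_c.
nodes, cloud_fractions = [], []
for c_f in (0.2, 0.4, 0.6, 0.8, 1.0):
    for p_cld in (300.0, 600.0, 900.0):
        r_c = 0.05 + 0.75 * c_f
        n_s = (1.0 - c_f) + c_f * (p_cld / 1000.0) ** 2
        nodes.append((n_s, r_c))
        cloud_fractions.append(c_f)

# Inverted table: cloud fraction as a function of (n_s, r_c).
cloud_fraction_lut = fit_linear_rbf(nodes, cloud_fractions)
```

The interpolant reproduces the simulated nodes and can then be evaluated on a regular grid of slant column and continuum reflectance to build the inverted table. The toy grid starts at a cloud fraction of 0.2 because at zero cloud fraction all cloud pressures map onto the same point, which makes the inversion ill-posed.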

Temperature correction
As will be described in this section, the slant column amount of O2-O2 depends on the temperature profile. This is not caused by a temperature dependence of the O2-O2 absorption cross section but is due to the nature of the dimers, of which the abundance scales with the pressure squared instead of being linear with pressure. Because this effect turns out to be significant, we have developed a temperature correction. This correction allows the use of the LUTs described above, which have been derived for a single pressure-temperature profile. By applying the temperature correction, the O2-O2 slant columns are scaled to the values for the reference temperature profile that has been used to construct the LUTs.
To understand the temperature effect on the O2-O2 slant columns, we write the reflectance as [...] It represents the relative reduction in the reflectance when a unit amount of absorption is added to the atmosphere in a thin layer located between $z$ and $z + \mathrm{d}z$. The volume absorption coefficient is given by [...] In hydrostatic equilibrium, the integral over the altitude can be replaced by an integral over the pressure, using $\mathrm{d}p/\mathrm{d}z = -\rho(z) g$, where $\rho(z)$ is the density of air. By expressing the density of air as $\rho(z) = M p / (R_g T(z))$, where $M$ is the mean molecular mass of dry air and $R_g$ is the gas constant, Eq. (5) becomes [...] Finally, we can express the number density of air as $n(z) = p(z) / (k_B T(z))$, where $k_B$ is Boltzmann's constant, and we assume a mixing ratio of oxygen of 21 %. Substituting this in Eq. (7) gives [...] which shows that the reflectance and hence the slant column of O2-O2 change when the temperature profile changes. It is noted that this is due to the density-squared nature of the absorption of O2-O2. For "normal" absorbers (no collision complex) the slant column is independent of the temperature profile, assuming that the absorption cross section is independent of temperature and pressure.
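The chain of substitutions in this paragraph can be sketched as a short derivation. It is a reconstruction under stated assumptions, not a reproduction of the numbered equations: the exponential form of the reflectance, the symbols $R_0$ (reflectance without O2-O2 absorption), $m$ (altitude-resolved air mass factor), $\sigma_{\mathrm{O_2O_2}}$ and $n_{\mathrm{O_2}}$, and a constant gravitational acceleration $g$ are all assumed here for illustration.

```latex
\begin{align}
R(\lambda) &\approx R_0(\lambda)\,
  \exp\!\left[-\int_0^\infty m(z,\lambda)\,k_\mathrm{abs}(z,\lambda)\,\mathrm{d}z\right], \\
k_\mathrm{abs}(z,\lambda) &= \sigma_{\mathrm{O_2O_2}}(\lambda)\,n_{\mathrm{O_2}}^2(z),
  \qquad n_{\mathrm{O_2}}(z) = 0.21\,n(z) = 0.21\,\frac{p(z)}{k_B T(z)}, \\
\mathrm{d}z &= -\frac{\mathrm{d}p}{\rho(z)\,g} = -\frac{R_g\,T}{M g\,p}\,\mathrm{d}p, \\
\int_0^\infty m\,k_\mathrm{abs}\,\mathrm{d}z
  &= \frac{0.21^2\,\sigma_{\mathrm{O_2O_2}}(\lambda)\,R_g}{k_B^2\,M\,g}
     \int_0^{p_\mathrm{sfc}} m(p,\lambda)\,\frac{p}{T(p)}\,\mathrm{d}p .
\end{align}
```

Under these assumptions the temperature enters only through the factor $p/T(p)$ in the integrand, so a colder profile gives a larger O2-O2 absorption for the same pressure levels. For an absorber whose number density is proportional to $n$ rather than $n^2$, the factors of $T$ cancel and the integral becomes independent of the temperature profile.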
In order to investigate the magnitude of the bias that is introduced if the temperature dependence is ignored, simulations of the retrieval were performed. In the retrieval the mid-latitude summer profile is used, while for the simulations either a mid-latitude winter profile or a subarctic winter profile is used. The bias was calculated for different true pressure levels of the cloud and for different cloud fractions. Figure 3 shows that the maximum bias in the retrieved cloud pressure ranges from less than 50 hPa at large cloud fractions to 200 hPa at very small cloud fractions. As discussed in the Introduction, clouds can have a shielding or an enhancing effect on the sensitivity of satellite measurements of trace gases. Tropospheric trace gas retrievals are commonly limited to ground pixels with an effective cloud fraction below approximately 0.2-0.3, for which the cloud-free reflectance dominates the scene. Figure 3 shows that for these cases the bias in the cloud pressure due to the temperature effect is very large. Such biases could change the effect of the clouds as assumed in the trace gas retrieval, from shielding to enhancing, or vice versa, and have a significant effect on the retrieved trace gas column.
The OMCLDO2 retrieval is based on a LUT approach, and generating LUTs for many different temperature profiles is not feasible. Therefore, we introduce a correction factor $\gamma$ that translates the measured slant column into the slant column for the reference pressure-temperature profile. Using Eq. (8), we can compute $\gamma$ as [...] where $R$ is the reflectance at a representative wavelength in the fit window; $p_\mathrm{sfc}$ is the surface pressure and $p_\mathrm{cld}$ the cloud pressure; and the subscripts clr and cld refer to the clear part and the cloudy part of the pixel, respectively. To implement the temperature correction factor, new look-up tables for the O2-O2 air mass factors $m(p, \lambda)$ and the corresponding reflectance for a wavelength in the middle of the fit window have been generated. In the retrieval algorithm, the temperature correction is applied in an iterative manner because the cloud fraction and pressure must be known to compute $\gamma$. As a default, we use three iterations to compute $\gamma$.
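A strongly simplified version of such a correction factor can be sketched in Python. It assumes, from the hydrostatic and ideal-gas relations given above, that the O2-O2 absorption scales with the pressure integral of $m(p)\,p/T(p)$, and it treats a single column only: the reflectance-weighted combination of the clear and cloudy parts of the pixel is deliberately left out. The function names, the profiles and the constant air mass factor are invented for this example.

```python
def pressure_weighted_integral(pressures, temperatures, amf):
    """Trapezoidal integral of m(p) * p / T(p) over pressure."""
    total = 0.0
    for i in range(len(pressures) - 1):
        f0 = amf[i] * pressures[i] / temperatures[i]
        f1 = amf[i + 1] * pressures[i + 1] / temperatures[i + 1]
        total += 0.5 * (f0 + f1) * (pressures[i + 1] - pressures[i])
    return total


def temperature_correction_factor(pressures, t_actual, t_reference, amf):
    """Ratio of the reference to the actual absorption integral (single column)."""
    reference = pressure_weighted_integral(pressures, t_reference, amf)
    actual = pressure_weighted_integral(pressures, t_actual, amf)
    return reference / actual


pressures = [200.0, 400.0, 600.0, 800.0, 1000.0]   # hPa, from top to surface
t_reference = [220.0, 245.0, 265.0, 280.0, 290.0]  # K, made-up reference profile
t_warm = [t * 1.05 for t in t_reference]           # uniformly 5 % warmer profile
amf = [2.0, 2.0, 2.0, 2.0, 2.0]                    # made-up constant air mass factor

gamma_same = temperature_correction_factor(pressures, t_reference, t_reference, amf)
gamma_warm = temperature_correction_factor(pressures, t_warm, t_reference, amf)
```

In this sketch a profile identical to the reference gives a factor of 1, and a profile that is warmer than the reference gives a factor larger than 1, so that multiplying the measured slant column by the factor scales it up to the value expected for the reference profile.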

Input data
OMCLDO2 version 2 uses the following input data. For the absorption cross sections for O2-O2, ozone and optionally NO2, as well as for the Raman radiance spectrum, we use the spectra described in van Geffen et al. (2015). For the surface reflectance, the OMI-derived monthly mean Lambertian equivalent reflectance database described in Kleipool et al. (2008), extended to 5 years of OMI data, is used. For the temperature profiles needed for the temperature correction, we use a monthly mean climatology four times per day (00:00, 06:00, 12:00 and 18:00 UTC), derived from the National Centers for Environmental Prediction (NCEP) reanalysis data for the period 2005-2014. Using actual temperatures would be somewhat more accurate than using a climatology. However, for practical reasons related to the operational data processing facility, we have decided to use a temperature climatology.
For detecting snow and sea-ice coverage, the Near-real-time Ice and Snow Extent (NISE) product (Nolin et al., 1998) is used.

Impact of algorithm updates
In this section we first compare the OMCLDO2 version 2 with version 1.2.3 for 1 day of data. Next, the impacts of each of the improvements are discussed separately. The impacts of the improvements are summarized in Table 3.
Figure 4 shows the OMCLDO2 retrieval results for 14 May 2005. This day has been selected arbitrarily from the OMI data record. Note that we have also analysed other days, which show consistent results. Figure 4a [...] Over the high latitudes in the Northern Hemisphere, considerable positive and negative differences occur. These occur over snow and ice, where the retrieval algorithm has problems distinguishing the clouds from the highly reflective surface. Under such conditions, the accuracy of the retrieved effective cloud fraction will be very low. Due to the assumed cloud albedo of 0.8, the cloud fraction will become undetermined when the surface albedo is close to this value.
The differences in effective cloud pressure are shown in Fig. 4d. Version 2 shows higher cloud pressures in the tropics and sub-tropics, and lower cloud pressures at mid- and high latitudes. As discussed below, this zonally dependent effect is caused by the temperature correction introduced in version 2. Especially in the tropics, the differences in the cloud pressures are largest in regions with low cloud fractions. Overall, the uncertainty in the cloud pressure retrievals is a strong function of the effective cloud fraction. For small cloud fractions, the effect of the cloud on the top-of-atmosphere reflectance is very small, resulting in large uncertainties in the retrieved cloud pressure. In the limit of cloud-free conditions, the cloud pressure becomes undetermined. For large cloud fractions, the clouds dominate the reflectance, and the cloud pressure can be determined with high precision. This is illustrated in Fig. 5, which shows the precision of the effective cloud pressure retrievals as a function of the effective cloud fraction. The precision is calculated by the propagation of the DOAS fit errors of the O2-O2 slant columns and of the continuum reflectance. For cloud fractions below 0.1 the average precision is larger than 20 hPa with a very large spread, whereas for cloud fractions above 0.9 the precision is less than 10 hPa with a much smaller spread. It is noted that other error sources, for example in the assumed surface albedo, will also have a much stronger impact at low effective cloud fractions.
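One way to propagate the DOAS fit errors to a cloud pressure precision is first-order error propagation through the retrieval function. The sketch below assumes uncorrelated errors in the slant column and the continuum reflectance and uses central finite differences for the sensitivities; both choices, as well as the function names and the linear toy retrieval standing in for the look-up table, are assumptions made for illustration.

```python
import math


def propagate_precision(retrieve, n_s, r_c, sigma_n_s, sigma_r_c, rel_step=1e-4):
    """First-order error propagation through retrieve(n_s, r_c).

    Assumes uncorrelated errors in the slant column and continuum reflectance.
    """
    dn = rel_step * abs(n_s) or rel_step  # finite-difference step sizes
    dr = rel_step * abs(r_c) or rel_step
    d_dn = (retrieve(n_s + dn, r_c) - retrieve(n_s - dn, r_c)) / (2.0 * dn)
    d_dr = (retrieve(n_s, r_c + dr) - retrieve(n_s, r_c - dr)) / (2.0 * dr)
    return math.sqrt((d_dn * sigma_n_s) ** 2 + (d_dr * sigma_r_c) ** 2)


def toy_cloud_pressure(n_s, r_c):
    """Hypothetical stand-in for the look-up table: linear in both inputs."""
    return 300.0 + 500.0 * n_s - 100.0 * r_c


# Normalised, made-up inputs and fit errors.
sigma_p = propagate_precision(toy_cloud_pressure, 1.0, 0.4,
                              sigma_n_s=0.02, sigma_r_c=0.01)
```

With a real look-up table the sensitivity to the slant column grows strongly towards small cloud fractions, which is the behaviour described for the precision in the text.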

Temperature correction
The correction for the temperature dependence is described above. Based on a temperature climatology, a correction factor is computed and applied to the O2-O2 slant columns.
Figure 4g shows the temperature correction factor for the OMI observations on 14 May 2005. Because the temperature correction factor is computed relative to the mid-latitude summer atmosphere, it is larger than 1 in the tropics and smaller at the higher latitudes. On top of this general behaviour there is spatial structure related to cloud structures, especially when the clouds are at high altitudes and have significant optical thickness. The effect of clouds on the temperature correction factor is described in Eq. (10). For high and thick clouds, the temperature correction is in most cases closer to 1, indicating that the largest differences between the climatological temperature and the mid-latitude summer atmosphere occur at the lowest altitudes.
To test the impact of the temperature correction factor on the effective cloud fraction and pressure, we produced datasets with and without the temperature correction applied for 2 days of OMI data in different seasons (14 May and 15 November 2005). While the impact on the cloud fraction is negligible, the impact on the cloud pressure can be significant. Figure 6 shows the difference between the retrievals without and with the correction applied, as a function of the effective cloud fraction. The impact of the correction on the cloud pressure increases towards smaller cloud fractions. Depending on whether the correction factor is smaller or larger than 1, the impact on the cloud pressure can be either positive or negative. For cloud fractions below 0.2, the impact of the temperature correction can be as large as −100 to 150 hPa, whereas for cloud fractions larger than 0.2 the impact is in the range −20 to 40 hPa. For the higher latitudes ($\gamma < 1$) the clouds are at lower pressures (higher altitude) when the temperature correction is applied, whereas in the tropics and subtropics the effects are reversed.
Figure 6 can be compared to Fig. 3, which is based on retrieval simulations. Although Fig. 6 shows the difference with and without the temperature corrections, and Fig. 3 shows the difference with the simulated truth, the behaviour and magnitude of the bias are very similar. It is noted that for Fig. 3 only temperature profiles have been used which are colder in the troposphere than the reference mid-latitude summer atmosphere. Therefore, Fig. 3 shows only positive biases, whereas in the tropics and sub-tropics Fig. 6 also shows negative values.

Look-up tables
To test the impact of the LUTs that are used to derive the effective cloud fraction and effective cloud pressure, we produced datasets using the version 2 algorithm with the new and the old LUTs. The cloud fraction with the new LUTs is about 0.01 larger than with the old version, except over snow and ice regions, where the cloud fraction with the new LUTs is in most cases significantly smaller. Because the cloud fraction is highly uncertain over snow- and ice-covered regions, where the algorithm is not able to distinguish clouds from highly reflective surfaces, this impact is not unexpected.
The effect of the new LUTs on the effective cloud pressure is shown in Fig. 7c. This figure shows the difference in the cloud pressure (old minus new) as a function of the effective cloud fraction. The differences become significant at cloud fractions smaller than 0.25, where the difference shows an oscillating behaviour. At a $c_f$ of approximately 0.125 a minimum is reached, and at smaller cloud fractions the mean difference reverses sign and increases towards lower $c_f$. To investigate the nature of this behaviour, Fig. 7a and b show the distribution of the retrieved cloud pressures as a function of cloud fraction for the old and new LUT datasets. From these figures it is clear that the origin of the oscillating behaviour of the difference is in the retrievals with the old LUTs. Figure 7a shows that with the old LUTs the cloud pressure increases strongly towards lower cloud fractions, for which we have no physical explanation. The results with the new LUTs (Fig. 7b) do not show this. We attribute the large improvements with the new LUTs to the larger number of radiative transfer calculations on which they are based (see Table 1), as well as the improved interpolation scheme that was used to produce them.
Figure 7a and b also show that the effective cloud pressure for the largest $c_f$ bin is significantly larger. A further inspection showed that this is caused by retrievals over snow- and ice-covered regions, for which the cloud pressure retrievals are highly uncertain. For such cases the scene albedo and pressure provided by version 2 of the algorithm can be used.

Outlier removal
The outlier removal procedure that was introduced in version 2 of the algorithm removes spectral pixels from the DOAS fit after evaluation of the fitting residuals. Outliers can have different behaviour: they can be transient, e.g. occurring at a spectral pixel for only a few ground pixels, or they can occur systematically for certain spectral pixels. When outliers are detected, they are removed from the data, which will decrease the number of wavelengths used in the DOAS fit. Figure 4h shows the number of wavelengths used in the fit for 14 May 2005. The most prominent feature is the reduced values over South America caused by the South Atlantic Anomaly (SAA). In this region the number of high-energy particles hitting the OMI detectors is significantly increased (Dobber et al., 2006), resulting in spikes in the data.
It is noted that the Level 0-1B processor also flags transient pixels, so Fig. 4h is the result of the Level 1B flags in combination with the outlier removal procedure. In addition to the SAA, Fig. 4h also shows stripes in the along-track direction, as well as features related to geophysical conditions (for example higher values for Australia and India).
The impact of the outlier removal procedure was tested by running the algorithm with and without the procedure switched on for 14 May 2005. The differences in the retrieved effective cloud fraction are negligible, whereas the impact on the effective cloud pressure depends on the cloud fraction. The mean difference is not significant, but the standard deviation of the difference varies from 16 hPa for $c_f < 0.2$ to 3 hPa for $c_f > 0.8$.
We also inspected the root-mean-square error (RMSE) of the DOAS fit as a fit quality indicator. Although the difference in RMSE with and without the outlier removal did not differ significantly from 0, the distribution is skewed towards larger RMSE values when the outlier removal is switched off. This indicates that the outlier removal procedure improves the fit for cases with a high RMSE.

Digital elevation model (DEM)
Version 2 of the algorithm uses a DEM with a resolution of approximately 20 km, which is closer to the spatial resolution of OMI than the 3 km resolution DEM used in previous versions. The 20 km resolution DEM is constructed from the Global Multi-resolution Terrain Elevation Data 2010 (Danielson and Gesch, 2011).
The impact of the new DEM will be largest in mountainous terrain. Figure 8 illustrates the effect on the retrieved effective cloud pressures over Europe for 14 May 2005. This is the same day as shown in Fig. 4. Figure 8a shows that significant impacts of the new DEM are restricted to the main mountain ranges. The difference between using the old and new DEM can be both positive and negative. The impact increases towards the lower cloud fractions, when more signal comes from the surface and an accurate knowledge of the surface altitude becomes more important. Figure 8b shows that for most pixels the impact is smaller than ±50 hPa.

Cross sections
In the new version of the algorithm, the absorption cross sections and the Raman radiance spectrum have been updated. The impact of this change was tested by running the algorithm with the old and the new cross sections. The impact on the cloud fraction was negligible. Using the new cross sections increased the effective cloud pressures by 23 ± 23 hPa. The difference in the root-mean-square error of the DOAS fit was not significant; that is, the new cross sections did not significantly reduce the residuals of the DOAS fit.

Scene albedo and scene pressure
As described in the algorithm section, the scene albedo and scene pressure are derived for each ground pixel. The most important application of these parameters is over bright surfaces such as snow and ice, where the surface albedo becomes close to the assumed cloud albedo of 0.8 and no meaningful cloud fraction and pressure can be derived. Figure 9 shows a comparison of the retrieved scene pressure with the surface pressure derived from the DEM, assuming a sea level pressure of 1013 hPa. The figure shows very good agreement between the retrieved scene pressure and the DEM over Greenland. This figure presents the comparison for OMI cross-track pixel 20, but other cross-track pixels show similar results. It demonstrates the capabilities of the scene pressure for bright surfaces. It is also an indirect validation of the retrieved O2-O2 slant columns. A correction of the O2-O2 slant columns, as is sometimes used in ground-based DOAS measurements (for a discussion see Spinei et al., 2015), is clearly not necessary for the OMCLDO2 retrievals.
Over dark scenes, such as oceans under conditions with low cloud fractions, the scene pressure is less well understood. For some areas over the ocean the retrieved scene pressure is significantly larger than the sea level pressure. For scene albedos of less than 5 %, about 3 % of the scene pressures exceed 1050 hPa, and 50 % exceed 1013 hPa. We note that scene pressures larger than 1013 hPa are the result of extrapolation and should therefore be used with great caution. For dark scenes we recommend using the cloud fraction and cloud pressure, taking into account that there will be a large uncertainty in the cloud pressure in these cases (see Fig. 5).
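As a rough illustration of the comparison in Fig. 9, the sketch below converts a DEM surface altitude to a surface pressure and subtracts it from a retrieved scene pressure. The text only states that a sea level pressure of 1013 hPa is assumed; the exponential pressure profile and the 8 km scale height are assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np

SEA_LEVEL_PRESSURE_HPA = 1013.0  # sea level pressure stated in the text
SCALE_HEIGHT_M = 8000.0          # assumption: exponential atmosphere, 8 km scale height


def surface_pressure_hpa(altitude_m):
    """Surface pressure (hPa) for a DEM altitude (m), assuming an exponential profile."""
    altitude_m = np.asarray(altitude_m, dtype=float)
    return SEA_LEVEL_PRESSURE_HPA * np.exp(-altitude_m / SCALE_HEIGHT_M)


def scene_minus_surface_hpa(scene_pressure_hpa, altitude_m):
    """Difference between retrieved scene pressure and DEM-derived surface pressure (hPa)."""
    scene_pressure_hpa = np.asarray(scene_pressure_hpa, dtype=float)
    return scene_pressure_hpa - surface_pressure_hpa(altitude_m)
```

Over bright surfaces such as the Greenland ice sheet, values of `scene_minus_surface_hpa` close to zero would correspond to the agreement described above.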

Comparison with ground-based radar-lidar observations
The changes made in version 2 of the OMCLDO2 algorithm have a stronger impact on the cloud pressure retrieval than on the cloud fraction retrieval. Therefore, we focus in this section on comparisons of the cloud pressure retrievals with correlative data. Because of the use of the IPA cloud model (Fig. 1), it is not straightforward to compare the retrieved cloud pressure to profile information on cloud parameters. We compare the OMI retrievals with ground-based radar data, for which the sensitivity to cloud droplet size is very different: the OMI retrievals are sensitive to the optical extinction, which scales with droplet size to the 2nd power, whereas the radar reflectivity scales with droplet size to the 6th power. Thus, when using these radar data, it is not possible to compare the same quantity, which is required in a validation study. Rather than conducting a validation study, we focus on explaining the differences between the OMI retrievals and the radar-lidar data, given their different sensitivities. This comparison uses an approach similar to that used for comparing Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY (SCIAMACHY) cloud products with radar-lidar data (Wang and Stammes, 2014). We present comparisons for three sites: Cabauw, the Netherlands; Lindenberg, Germany; and the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) site, USA, for the period January to June 2005. These datasets were selected because of the continuous data availability for these sites in the Cloudnet (Illingworth et al., 2007) database. Cloudnet is a network of stations for the continuous evaluation of cloud and aerosol profiles.

Cloudnet data
We use the Cloudnet Level 2 classification product (Illingworth et al., 2007), which is based on the combination of radar and lidar observations and is available approximately every 30 s. This product classifies each vertical layer as 1 of 11 classes, which distinguish ice and water clouds, precipitation, aerosols, insects, clear sky and combinations thereof. We attribute a value of 1 to layers that are classified as cloudy (classes 1-7) and 0 to layers identified as non-cloudy. For profiles containing at least one cloudy layer, we compute the cloud mid-height as the average of the altitudes of the cloudy layers. Next we average all the profiles in the time window of ±30 min around an OMI overpass. We also compute the average and standard deviation of the cloud mid-height over this time window and determine whether the average cloud profile is single-layer or multi-layer.
Note that this procedure for computing the cloud mid-height does not take the optical thickness of the layers into account: optically thick and optically thin layers are weighted equally, whereas the O2-O2 cloud pressure retrieval is much more sensitive to the thick layers. Weighting with the optical thickness (or, even better, with the sensitivity of the O2-O2 cloud algorithm) would make the comparison much more direct. Unfortunately, information on the full optical thickness profile is not available from the Cloudnet data. Alternatively, we could use the radar reflectivity as a weighting parameter. However, the radar reflectivity is very sensitive to cloud particle size and is therefore also not a good representation of the cloud extinction in the visible. We therefore decided to use the simple weighting described above.
Further filtering of the Cloudnet data was done using the following criteria: the standard deviation of the cloud mid-height should not exceed 1.5 km, to avoid cases with large temporal variability during the OMI overpass; at least one layer in the profile should be cloudy during at least 50 % of the time averaging window.
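The Cloudnet processing described in this section can be sketched as follows. This is a minimal illustration, not the operational code: the array layout, function names and dictionary keys are hypothetical, the input is assumed to contain only the profiles within ±30 min of the overpass, and the single-layer versus multi-layer determination is not covered.

```python
import numpy as np

CLOUDY_CLASSES = np.arange(1, 8)  # classes 1-7 are treated as cloudy, as in the text


def cloud_mid_heights(classification, altitude_km):
    """Per-profile cloud mid-height: mean altitude of the cloudy layers.

    classification: (n_time, n_layer) integer class per layer
    altitude_km:    (n_layer,) layer altitudes
    Profiles without any cloudy layer get NaN.
    """
    cloudy = np.isin(classification, CLOUDY_CLASSES)
    n_cloudy = cloudy.sum(axis=1)
    mid = np.full(classification.shape[0], np.nan)
    has_cloud = n_cloudy > 0
    mid[has_cloud] = (cloudy[has_cloud] * altitude_km).sum(axis=1) / n_cloudy[has_cloud]
    return mid, cloudy


def overpass_summary(classification, altitude_km, max_std_km=1.5, min_occurrence=0.5):
    """Average the profiles of one overpass window and apply the two filter criteria."""
    mid, cloudy = cloud_mid_heights(classification, altitude_km)
    if np.all(np.isnan(mid)):
        return None  # no cloudy profile in the window
    occurrence = cloudy.mean(axis=0)  # fraction of the window each layer is cloudy
    std_mid = float(np.nanstd(mid))
    accepted = std_mid <= max_std_km and occurrence.max() >= min_occurrence
    return {
        "mean_mid_km": float(np.nanmean(mid)),
        "std_mid_km": std_mid,
        "occurrence": occurrence,
        "accepted": bool(accepted),
    }
```

The `occurrence` array corresponds to a vertically resolved cloud occurrence over the averaging window, and `accepted` combines the 1.5 km variability criterion with the 50 % occurrence criterion.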

OMI collocated data
For the OMI cloud data, we average all the ground pixels whose centre is within 30 km of the ground station. For these pixels we determine the mean and standard deviation of the cloud fraction and pressure. We convert the cloud pressure to altitude using a scale height of 8 km. We filter the OMI data using the following criteria: the effective cloud fraction should exceed 0.2, because the cloud pressure for low cloud fractions has a large uncertainty; the standard deviation of the effective cloud pressure, expressed as altitude, should not exceed 1.5 km, to exclude cases with large horizontal variability.
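A minimal sketch of this collocation is given below. Only the 30 km radius, the 8 km scale height and the two filter thresholds come from the text; the reference pressure of 1013 hPa, the haversine distance, the per-pixel conversion to altitude before averaging, and all function and key names are assumptions for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0
SCALE_HEIGHT_KM = 8.0      # scale height stated in the text
P_REFERENCE_HPA = 1013.0   # assumption: reference pressure at zero altitude


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km (inputs in degrees)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def pressure_to_altitude_km(p_hpa):
    """Convert pressure to altitude with a fixed scale height."""
    return -SCALE_HEIGHT_KM * np.log(np.asarray(p_hpa, dtype=float) / P_REFERENCE_HPA)


def collocate(pix_lat, pix_lon, cloud_fraction, cloud_pressure_hpa,
              site_lat, site_lon, max_dist_km=30.0, min_cf=0.2, max_std_km=1.5):
    """Average the pixels centred within max_dist_km of the site and apply the filters."""
    pix_lat, pix_lon = np.asarray(pix_lat, dtype=float), np.asarray(pix_lon, dtype=float)
    near = great_circle_km(pix_lat, pix_lon, site_lat, site_lon) <= max_dist_km
    if not near.any():
        return None  # no pixel centre close enough to the station
    cf = np.asarray(cloud_fraction, dtype=float)[near]
    height = pressure_to_altitude_km(np.asarray(cloud_pressure_hpa, dtype=float)[near])
    mean_cf, std_height = float(cf.mean()), float(height.std())
    return {
        "n_pixels": int(near.sum()),
        "mean_cf": mean_cf,
        "mean_height_km": float(height.mean()),
        "std_height_km": std_height,
        "accepted": bool(mean_cf > min_cf and std_height <= max_std_km),
    }
```

Converting each pixel to altitude before averaging makes the 1.5 km variability criterion directly applicable; averaging the pressures first and converting afterwards would be an equally plausible reading of the text.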

Results
Figure 10 shows a comparison between the Cloudnet data and the OMI effective cloud pressure for the collocations over Cabauw for the period January to June 2005. The cases presented in this figure are ordered by increasing mid-height of the ground-based data. The following regimes can be distinguished in this dataset:
1. Cases 1-50: these are low-level clouds with limited vertical extent. The OMI effective cloud height and the ground station mid-height are in good agreement.
2. Cases 51-129: according to Cloudnet, the majority of these cases consist of vertically extended, and often multi-layered, clouds. For these cases the OMI effective cloud height is generally lower than the ground station mid-height.
3. Cases 130-135: these cases have high clouds with limited vertical extent. The OMI effective cloud height compares well, except for the outlier of case 131. However, the number of collocations in this regime is small.
It is noted that the boundaries of these three regimes are not hard.
Figure 10 shows that for single-layer clouds with a limited vertical extent the O2-O2 effective cloud height and the Cloudnet-derived mid-height are in agreement. This shows that the OMI-derived product is capable of retrieving cloud heights ranging from low clouds to high clouds. For vertically extended clouds, the OMI-derived cloud heights are generally lower than the radar-lidar-derived heights. A plausible explanation for this difference is that in these cases there are thin high clouds overlying thicker low-level water clouds. Whereas the radar-lidar mid-height weights all cloudy layers equally, the O2-O2 cloud height is more sensitive to the optically thick layers at lower altitude.
When we include not only Cabauw but also Lindenberg and the ARM-SGP site, we get a similar picture. Figure 11 shows a comparison for all these sites for the period January-June 2005, where the single- and multi-layer cloud cases are distinguished. Good correlation is observed for cloud heights of 0-2.5 km, where single-layer clouds dominate. In the region between 2.5 and approximately 8 km multi-layer clouds dominate and the O2-O2 cloud height is lower than the radar-lidar cloud mid-height. Above 8 km we find both good agreement and very large differences, although the number of points is very limited. As we are interested in the average comparisons, we did not investigate individual cases where large differences occurred.
The comparison with the Cloudnet data was repeated for the old version of the OMCLDO2 algorithm. The results were very similar to those presented in Figs. 10 and 11. This is expected because for effective cloud fractions larger than 0.2 the difference between the old and the new algorithm is not very large. Moreover, the difference between the two algorithm versions is smaller than the difference between either version and the ground-based data, because of the different sensitivity of ground-based versus satellite observations and because of representation errors in both space and time.

Conclusions
We present a new version of the OMI OMCLDO2 Level 2 cloud product. This product is an important input for several of the operational OMI Level 2 algorithms. The new version contains six major improvements:
1. the correction for the temperature sensitivity of the DOAS fit;
2. improved look-up tables for computing the effective cloud fraction and effective cloud pressure;
3. retrieval of the scene pressure and scene albedo for every ground pixel, using the Lambertian equivalent reflectance model;
4. an outlier removal procedure in the DOAS fit;
5. updated gas absorption cross sections;
6. introduction of a DEM with a resolution similar to that of the OMI ground pixels.
We show that the impact of these changes on the retrieved effective cloud fraction is less than 0.01 for most ground pixels. The impact on the effective cloud pressure is larger: especially for cloud fractions below approximately 0.3, the differences compared to the previous operational version can be as large as 200 hPa. These differences are mainly caused by the temperature correction and the introduction of the new look-up tables. Due to the temperature correction, the differences depend on latitude and season: the updated algorithm gives higher cloud pressures at higher latitudes and lower pressures in the tropics and subtropics. The new look-up tables were also found to give better results at low cloud fractions.
Cloud pressure retrievals have been compared to ground-based radar-lidar observations at Cabauw, Lindenberg and the ARM-SGP site. For low clouds, up to approximately 2.5 km, the satellite retrievals and ground-based results compare favourably. For clouds in the range between 2.5 and approximately 8 km the ground-based observations indicate many multi-layer and vertically extensive clouds. For these clouds the satellite-retrieved cloud heights are generally lower, probably because the algorithm is more sensitive to the optically thick low-level clouds. For high clouds (> 8 km) mixed results are found. The differences with the radar-lidar data can be explained by the different sensitivity of the radar-lidar observations versus the satellite observations.
We conclude that the new version of the OMCLDO2 product is a significant improvement over the previous versions, especially for the cloud pressure at cloud fractions smaller than approximately 0.3. This is very important for cloud corrections in retrievals of gases like nitrogen dioxide, sulphur dioxide and formaldehyde, which are very sensitive to the cloud pressure.
After reprocessing of the entire OMI data record, the stability of the product should be investigated, and the scene pressure and scene albedo should be validated.

Figure 2. Example of a slice of the effective cloud fraction LUT (top panel) and effective cloud pressure LUT (bottom panel), showing the LUT value as a function of the continuum reflectance R_c and the O2-O2 slant column N_s,O2-O2. The background colours show the values in the LUT derived from interpolation and extrapolation of the DOAS fit results, which are shown as the colour-filled symbols. The other LUT nodes are fixed to the following values: solar zenith angle of 44.2°, viewing zenith angle of 21.2°, relative azimuth angle of 0.0°, surface albedo of 0.05 and surface altitude of 0 m.
where T(p) is the actual temperature profile taken and T_ref(p) is the temperature profile used in the creation of the look-up tables. In the case of partial cloud cover and weak absorption

Figure 3. Bias in the retrieved pressure (p_retr − p_true) in hPa when a mid-latitude summer temperature profile is used in the retrieval, whereas a mid-latitude winter profile (mlw) or a subarctic winter profile (saw) is used in the simulation. The results are plotted as a function of the cloud fraction and for different pressure levels of the cloud used in the simulation. The surface albedo is fixed at 0.05, the cloud albedo is 0.80, the solar zenith angle is 60° and the viewing direction is nadir.
and b show the effective cloud fraction and the effective cloud pressure. Figure 4c and d show the difference between versions 2 and 1.2.3. For areas with low effective cloud fractions, the effective cloud fraction is approximately 0.01 larger in version 2.

Figure 4. Results from the OMCLDO2 version 2 algorithm for 14 May 2005: (a) effective cloud fraction, (b)

Figure 5. Box-and-whisker plot of the precision of the effective cloud pressure as a function of the effective cloud fraction for 14 May 2005.

Figure 6. Difference in the effective cloud pressure due to the temperature correction (without correction minus with correction) plotted as a function of the effective cloud fraction. The colours of the symbols indicate the temperature correction factor γ.

Figure 7. Box-and-whisker plots of the effective cloud pressure as a function of the effective cloud fraction.

Figure 8. Difference in the effective cloud pressure (old DEM minus new DEM) for effective cloud fractions exceeding 0.1 over Europe for 14 May 2005. Left panel: map of the differences over Europe; right panel: histogram of the differences over Europe on a logarithmic scale.

Figure 9. Top panel: map of the position of the ground pixel centres. Bottom panel: comparison of the retrieved scene pressure and the surface pressure derived from the DEM, plotted as a function of longitude.

Figure 10. The effective cloud altitude retrieved from OMI (red), compared to radar-lidar cloud information for Cabauw, the Netherlands (blue). The grey background is the vertically resolved cloud occurrence derived from the radar-lidar data for the period ±30 min around the OMI overpass. The cases are ordered according to the ground station cloud mid-height.

Figure 11. The retrieved effective cloud altitude from OMI, plotted as a function of the radar-lidar-derived cloud altitude. Closed symbols are for single-layer clouds; open symbols are for multi-layer clouds.
where P(λ) is a polynomial of the first order, N_s,O2-O2 the slant column of O2-O2, σ_O2-O2(λ) the O2-O2 cross section convolved with the OMI slit function, N_s,O3 the slant column of O3, σ_O3(λ) the O3 cross section convolved with the OMI slit function, I_R(λ) a synthetic radiance Raman spectrum convolved with the OMI slit function and c_R a scale parameter for the amount of Raman scattering. For the reference cross sections for O2-O2 we use Thalman and Volkamer (2013) at 293 K, and for O3 we use
Atmos. Meas. Tech., 9, 6035-6049, 2016, www.atmos-meas-tech.net/9/6035/2016/

Table 1. Nodes for the radiative transfer calculations used for the OMCLDO2 algorithm v2.0 and v1.2.3. Where the nodes are the same between versions, the table cells are merged. Note that cloud fractions smaller than 0 and larger than 1 are included to enlarge the parameter space.

Table 2. Nodes for the continuum reflectance and the O2-O2 slant column, for the cloud fraction/pressure and scene albedo/scene pressure look-up tables. The solar zenith angle, viewing zenith angle, relative azimuth angle, surface albedo and surface/cloud pressure nodes are the same as given in Table 1.
where R_0(λ) is the reflectance if absorption by O2-O2 is ignored, z_0 is the altitude of a Lambertian cloud or the Earth surface, TOA is the top of the atmosphere, m(z, λ) is the altitude-resolved air mass factor, which is weakly wavelength-dependent, n_O2(z) is the number density of oxygen and σ_O2-O2(z, λ) is the absorption cross section of O2-O2. The altitude-resolved air mass factor m(z, λ) can be expressed as

Table 3. Impact of the improvements on the effective cloud fraction and effective cloud pressure retrievals.