Interactive comment on “Absolute calibration of the colour index and O4 absorption derived from Multi-AXis (MAX-)DOAS measurements and their application to a standardised cloud classification algorithm” by Thomas

The paper gives two main results: 1) new calibration methods for absolute calibration of colour index (CI) and O4 absorption and 2) location-independent threshold values for an earlier developed cloud-classification scheme. The calibration methods developed in this paper together with the new threshold values and the adapted colour index are an important step towards a uniform cloud classification scheme for DOAS measurements of scattered sunlight, as the authors state correctly. The paper is well-written and

Abstract. A method is developed for the calibration of the colour index (CI) and the O4 absorption derived from differential optical absorption spectroscopy (DOAS) measurements of scattered sunlight. The method is based on the comparison of measurements and radiative transfer simulations for well-defined atmospheric conditions and viewing geometries. Calibrated measurements of the CI and the O4 absorption are important for the detection and classification of clouds from MAX-DOAS observations. Such information is needed for the identification and correction of the cloud influence on Multi AXis (MAX-)DOAS profile inversion results, but might also be of interest on its own, e.g. for meteorological applications. The calibration algorithm was successfully applied to measurements at two locations: Cabauw in the Netherlands and Wuxi in China. We used CI and O4 observations calibrated by the new method as input for our recently developed cloud classification scheme and adapted the corresponding threshold values accordingly.
For the observations at Cabauw, good agreement is found with the results of the original algorithm. Together with the calibration procedure of the CI and O4 absorption, the cloud classification scheme, which has been tuned to specific locations/conditions so far, can now be applied consistently to MAX-DOAS measurements at different locations. In addition to the new threshold values, further improvements were introduced to the cloud classification algorithm, namely a better description of the SZA (solar zenith angle) dependence of the threshold values and a new set of wavelengths for the determination of the CI. We also indicate specific areas for future research to further improve the cloud classification scheme.

Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
Multi AXis differential optical absorption spectroscopy (MAX-DOAS) measurements are a widely used remote sensing technique for the measurement of atmospheric trace gases and aerosols (e.g. Hönninger and Platt, 2002; Wittrock et al., 2004; Hönninger et al., 2004; Heckel et al., 2005; Frieß et al., 2006; Irie et al., 2008; Clémer et al., 2010; Li et al., 2010; Wagner et al., 2011; Ma et al., 2013; Hendrick et al., 2014; Wang et al., 2014, 2015; Vlemmix et al., 2015). MAX-DOAS measurements can be strongly affected by clouds (Wagner et al., 2004, 2011, 2014; Gielen et al., 2014; Wang et al., 2015). Thus cloud-contaminated measurements have to be flagged, excluded from further processing or corrected for the effects of clouds. Different algorithms for the identification and classification of clouds based on MAX-DOAS measurements have recently been developed. They are based on several quantities derived from the measured spectra (Wagner et al., 2014; Gielen et al., 2014; Wang et al., 2015). These quantities include the following.
a. A so-called colour index (CI, see e.g. Sarkissian et al., 1991, 1994), which is defined as the intensity ratio for two selected wavelengths. In this study we define the CI as the ratio of the intensity at the shorter wavelength to the intensity at the longer wavelength:
CI = I(λ_short) / I(λ_long). (1)
b. The measured radiance at a selected wavelength. Here it should be noted that (MAX-)DOAS instruments are usually not radiometrically calibrated. Thus we use the term "radiance" here in a broader sense for the measured signal as well, e.g. expressed as counts per second.
c. The absorption of the oxygen dimer O4 (Greenblatt et al., 1990).
d. The strength of the so-called Ring effect (the filling-in of solar Fraunhofer lines by rotational Raman scattering, see Grainger and Ring, 1962; Wagner et al., 2014).
It was shown by Gielen et al. (2014) and Wagner et al. (2014) that the CI is very sensitive to the presence of clouds. It is thus well suited for their detection, especially because for zenith observations, clouds always lead to a decrease of the CI compared to clear-sky conditions (if the CI is defined with the intensity at the shorter wavelength divided by the intensity at the longer wavelength). In contrast, the other quantities mentioned above can be either increased or decreased in the presence of clouds depending on the cloud properties, wavelength and viewing geometry. Because of the unique dependence of the CI on the occurrence of clouds, the CI is usually used as the primary quantity for the detection of clouds. From the other quantities, especially from the radiance and the absorption of the oxygen dimer O4, important additional information on cloud properties can be derived (e.g. the presence of optically thick clouds or fog, see Wagner et al., 2014; Gielen et al., 2014; Wang et al., 2015). Since Ring effect measurements do not provide significant extra information, and because the quantitative analysis of the Ring effect is rather complicated, the Ring effect is not further considered here.
The identification and classification of clouds is usually based on the comparison of the measured quantities with their thresholds. These threshold values can, for example, be derived from measurements on clear days. Another, more universal, method is the determination of the threshold values from radiative transfer simulations. However, since MAX-DOAS instruments are usually not radiometrically calibrated, a direct quantitative comparison of measured and simulated quantities is not possible, which hampers the direct application of threshold values derived from radiative transfer simulations. To overcome this limitation, in this study we develop calibration procedures for the CI and the O4 absorption and apply them to MAX-DOAS observations.
The proposed CI calibration comprises the determination of a proportionality constant, which converts the measured values into well-defined quantities (i.e. radiance ratios for the selected wavelengths). Similar suggestions for the calibration of the CI were already presented by Gielen et al. (2014) and Wagner et al. (2014).
For the O4 measurements, the calibration comprises the determination and correction of an additional offset (the O4 absorption of the Fraunhofer reference spectrum, FRS) as in Wagner et al. (2014). Already in Wagner et al. (2014) the measured CI and O4 were calibrated based on selected clear-sky measurements. In contrast, here we develop standardised calibration algorithms for the CI and the O4 absorption, which can be applied to other MAX-DOAS measurements in a consistent way.

Update of the cloud classification scheme
After applying the new calibration algorithms to the measurements, the calibrated CI and the O4 absorption differ slightly from the calibrated values of the original classification scheme. Thus the threshold values of the cloud classification scheme have to be adapted accordingly. In addition to these changes, further improvements to the original classification scheme (Wagner et al., 2014) are introduced, namely a better description of the SZA dependence of the threshold values and a new wavelength pair (330 and 390 nm) for the calculation of the CI. The new shorter wavelength (330 nm) is less affected by the atmospheric ozone absorption than the original choice (320 nm). The new longer wavelength (390 nm) has the advantage that it is covered by typical UV MAX-DOAS instruments (while 440 nm is often not). The variability of the surface albedo for 390 nm is also smaller than for 440 nm.
One major aim of this study is to provide a universal cloud classification scheme for MAX-DOAS measurements based on the new calibration procedures for the CI and the O 4 absorption and the updated threshold values.
The calibration procedures for the CI and the O4 absorption are described in the first part of our paper (Sects. 2 and 3). In Sect. 4, we apply both new calibrations to the measurements used for the development of the original cloud classification algorithm (Wagner et al., 2014), determine new threshold values and compare the results of the new and original algorithms.
In Sect. 5 particular problems and areas for future improvements of the classification scheme are discussed. Section 6 presents conclusions and outlook.
2 Calibration of the CI
DOAS instruments are usually not radiometrically calibrated. Thus measured radiances and CI derived from MAX-DOAS or zenith sky DOAS measurements cannot be directly compared to the results from radiative transfer simulations. In this study (as in Wagner et al., 2014), we use the Monte-Carlo radiative transfer model McArtim (Deutschmann et al., 2011) for the simulations of radiances, CI and O4 absorption. The settings of the radiative transfer simulations used in this study are described in Sect. 2.2 of Wagner et al. (2014).
The CI derived from the measurements (CI_meas) can be converted to the calibrated CI (CI_cal) by multiplication with a proportionality constant β:
CI_cal = β · CI_meas. (2)
β can be determined by comparison of measured and simulated CI under well-defined conditions (see e.g. Wagner et al., 2014; Gielen et al., 2014; Wang et al., 2015). Here it should be noted that instrumental problems (e.g. wrong offset or dark current correction or a non-linear response of the detector) might cause an additional offset between the measured and simulated CI. However, except for very low signals (e.g. at high SZA) or cases with strong oversaturation of the detector, these offsets are very small and are ignored in our calibration procedure. Moreover, oversaturated spectra can easily be identified by increased residuals of the spectral analysis.
Wagner et al. (2014) used measurements during a clear morning with constant aerosol optical depth (AOD, derived from a sun photometer) for the calibration of the CI, radiance, O4 absorption and Ring effect. Gielen et al. (2014) applied a more universal approach by considering CI values over extended periods of time. They compared cluster points for minima and maxima of the CI to results of radiative transfer simulations. In our study we basically follow their approach, but we also apply two important modifications.
a. We only consider the minimum CI. As shown by Gielen et al. (2014), the maximum CI varies strongly with changing AOD, especially for low AOD. Thus the comparison of measured and simulated maximum CI depends critically on the AOD during the considered period, which is usually unknown. In contrast, the minimum CI depends only slightly on the specific atmospheric properties and measurement conditions (for details see below).
b. We do not use static threshold values, but consider the SZA dependence (of the minimum CI).
We also propose using the wavelength pair of 330 and 390 nm for the calculation of the CI (see the discussion in the introduction). In the following, the original CI (based on the wavelength pair 320 and 440 nm, see Wagner et al., 2014) is indicated by CI_orig and the new CI by CI_new. In Fig. 1 simulation results for CI_orig and CI_new are shown for different aerosol and cloud conditions. For both CI two features are obvious.
a. As already shown by Gielen et al. (2014), the maximum CI depends strongly on the AOD. This finding confirms that the maximum CI is not well suited for the calibration of the CI.
b. Small CI are found for cloudy cases, but interestingly the minimum values do not always occur for the largest cloud optical depths. This finding is probably caused by multiple scattering inside the clouds, which also increases the probability of additional Rayleigh scattering. Depending on the chosen wavelengths and SZA, the minimum CI is found for cloud optical depths between 3 and 12 (but the CI for the different cloud optical depths varies only slightly). Here it should also be noted that the CI for cloudy conditions is almost independent of cloud height.
It should be noted that these simulations are performed not for an elevation angle of exactly 90° but for 85°, the elevation angle of the zenith observations during the CINDI campaign. In Fig. 2 the measured CI (Piters et al., 2012) are compared to simulation results (for aerosol-free conditions, low aerosol load and the minimum CI for cloudy conditions, see Fig. 1). The minimum values are derived from a polynomial fit to the simulated minimum CI for different cloud optical depths as shown in Fig. 1. The polynomial expressions as well as the tabulated values of the minimum CI are provided in Tables 2 and A1 (in the Appendix). Different y axes are used for the measured (left) and simulated (right) CI. The maximum values of both axes were chosen according to the absolute radiance calibration for the respective wavelengths presented in Wagner et al. (2015). For CI_orig and CI_new most measurements fall into the area between the simulated minimum CI and those for an AOD of 0.1 (similar findings were presented by Gielen et al., 2014). Interestingly, about 20 % of all measurements are slightly lower than the simulated minimum values. This finding cannot be explained by the effect of measurement noise on the CI, which is very small (1 % for the SZA range considered here). Instead, the low CI values are probably caused by 3-D effects of broken clouds, which are not considered in our simulations. If, for example, the side of a cloud is illuminated by the direct solar beam, the composition of the light which enters the cloud might change compared to horizontally homogeneous clouds. The relative fraction of the diffuse sky radiation (which is bluish) compared to that of the direct solar beam might decrease, because the cloud side is illuminated by only part of the downwelling diffuse sky radiation. This effect would lead to a decrease of the CI.
However, it should be noted that even the lowest measured CI are still close to the simulated minimum values, indicating that the overall dependence of the CI is well represented by the model simulations (the detailed investigation of these 3-D effects should be the topic of future studies). The results in Fig. 2 indicate that the minimum CI obtained from measurements (or better an accumulation point, see below) over a period of several weeks are well suited for the calibration of the measured CI.

Calibration procedure
We propose a calibration procedure consisting of three steps.
c. The normalised CI of the accumulation point is determined by fitting a Gaussian curve to the frequency distribution after the clear-sky data were removed (data with normalised CI larger than 0.59 and 0.93 for CI_orig and CI_new respectively). Note that the derived values are almost independent of the chosen clipping value. The uncertainty of the scaling factor derived from the Gaussian fit is < 1 %. In order to account for possible temporal variation of the instrument sensitivity, we applied the method separately to the measurements during the first and second half of the campaign and found deviations < 2 %. This value probably represents a more realistic uncertainty for the measurements used in our study.
For CI_orig β is found to be 2.04 ± 0.04, and for CI_new it is 1.16 ± 0.02. The derived proportionality constants agree well with those calculated from the absolute radiance calibration presented in Wagner et al. (2015): 2.00 for CI_orig and 1.19 for CI_new.
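The accumulation-point idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the normalised CI is the measured CI divided by the simulated minimum CI at the same SZA, and that β is the inverse of the peak position (which is consistent with the quoted clipping values and β values). The Gaussian fit is replaced here by a parabola through the log-counts of the three bins around the histogram maximum, which is equivalent only for a Gaussian-shaped peak. All function names, the bin width and the synthetic data are hypothetical.

```python
import math
import random


def peak_of_distribution(values, bin_width=0.02):
    """Estimate the accumulation point (mode) of `values`.

    Builds a histogram and refines the modal bin with a parabola through the
    log-counts of the modal bin and its two neighbours (a local Gaussian fit).
    """
    lo = min(values)
    n_bins = int((max(values) - lo) / bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - lo) / bin_width)] += 1
    k = max(range(n_bins), key=counts.__getitem__)
    centre = lo + (k + 0.5) * bin_width
    if 0 < k < n_bins - 1 and counts[k - 1] > 0 and counts[k + 1] > 0:
        a, b, c = (math.log(counts[j]) for j in (k - 1, k, k + 1))
        denom = a - 2.0 * b + c
        if denom < 0:  # proper maximum: shift centre to the parabola vertex
            centre += 0.5 * (a - c) / denom * bin_width
    return centre


def ci_scaling_factor(ci_meas, ci_min_sim, clip):
    """Return beta such that CI_cal = beta * CI_meas (assumed procedure).

    ci_meas    : measured (uncalibrated) zenith CI values
    ci_min_sim : simulated minimum CI at the SZA of each measurement
    clip       : normalised CI above this value is treated as clear sky
    """
    normalised = [m / s for m, s in zip(ci_meas, ci_min_sim)]
    cloudy = [x for x in normalised if x <= clip]
    return 1.0 / peak_of_distribution(cloudy)


# Synthetic demonstration data (hypothetical): an instrument with beta = 2.
random.seed(0)
true_beta = 2.0
sim_min = [0.5] * 3000
cloudy_meas = [0.5 / true_beta * random.gauss(1.0, 0.06) for _ in range(2000)]
clear_meas = [0.5 / true_beta * random.uniform(1.4, 2.4) for _ in range(1000)]
beta = ci_scaling_factor(cloudy_meas + clear_meas, sim_min, clip=0.59)
print(f"recovered beta = {beta:.2f}")
```

With real data, `ci_min_sim` would come from the tabulated or polynomial minimum CI evaluated at each measurement's SZA, and a full Gaussian fit would also yield the fit uncertainty.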
In Fig. 4 results for MAX-DOAS observations at Wuxi (China) for the new CI are also shown (Wang et al., 2015). Measurements over a period from 1 January to 31 December 2012 were used. As for the Cabauw measurements, a clear peak is found, indicating that the method works in a similar way for completely different locations and measurement conditions. The derived proportionality constant is different from that for the Cabauw measurements, which is caused by the different (wavelength-dependent) sensitivities of the instruments. Here it should be noted that differences of the aerosol properties at both locations could also contribute to the differences, but the effect of aerosols on the CI in the presence of clouds is typically below 2 %.
It should be noted that for measurements at locations with very low or very high cloud probability a larger time period than for our method might be needed to obtain a sufficient number of both cloudy and cloud-free measurements. In extreme cases, an accumulation point might even exist for CI representing clear-sky conditions (if the AOD also stays constant over an extended time period). In such cases, clear-sky measurements might be identified by visual inspection and be removed before the frequency distribution is calculated.

Calibration of the O4 absorption of the Fraunhofer reference spectrum
The O4 calibration is also performed by comparing the measurements to model simulations for specific atmospheric properties and measurement conditions. From the spectral analysis, the O4 slant column density (SCD) is derived, which represents the integrated O4 concentration along the atmospheric light path. Since the FRS also contains atmospheric O4 absorptions, the result of the spectral analysis eventually represents the difference of the O4 SCDs of the measurement and the FRS, which is usually referred to as differential SCD or DSCD.
Both the O4 SCD and DSCD can be converted to the corresponding O4 air mass factor (AMF) or O4 differential air mass factor (DAMF):
AMF = SCD / VCD, (3)
DAMF = DSCD / VCD. (4)
The VCD represents the vertical column density, the vertically integrated concentration, which can be calculated from vertical profiles of temperature and pressure. Note that, in contrast to other trace gases, the SCD and VCD of O4 are expressed relative to the square of the O2 concentration (see Greenblatt et al., 1990). For the measurements during the CINDI campaign, the O4 VCD is determined as 1.41 × 10^43 molecules^2 cm^-5 (Wagner et al., 2014).
To obtain the total O4 AMF of the measurement, the O4 AMF of the FRS (AMF_FRS) has to be added:
AMF = DAMF + AMF_FRS. (5)
The determination of AMF_FRS constitutes the calibration of the O4 measurements:
AMF_cal,i = DAMF_i + AMF_FRS. (6)
Here DAMF_i indicates the differential O4 AMF derived from the spectral analysis of an individual measurement. AMF_cal,i indicates the corresponding calibrated O4 AMF.
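The conversion from a fitted O4 DSCD to a calibrated O4 AMF is simple arithmetic: divide by the VCD to obtain the DAMF, then add the AMF of the reference spectrum. The sketch below uses the CINDI O4 VCD quoted above and, as a default, the AMF_FRS value of 1.78 that is derived later in this section. The function names are hypothetical, and both constants are specific to this data set.

```python
O4_VCD = 1.41e43   # molecules^2 cm^-5, CINDI value quoted in the text
AMF_FRS = 1.78     # O4 AMF of the Fraunhofer reference spectrum (this data set)


def o4_damf(dscd, vcd=O4_VCD):
    """Differential O4 AMF: DAMF = DSCD / VCD."""
    return dscd / vcd


def o4_amf_calibrated(dscd, amf_frs=AMF_FRS, vcd=O4_VCD):
    """Calibrated O4 AMF: AMF_cal = DAMF + AMF_FRS."""
    return o4_damf(dscd, vcd) + amf_frs


# Example: a DSCD equal to one vertical column gives DAMF = 1.
example_amf = o4_amf_calibrated(1.41e43)
print(f"calibrated O4 AMF = {example_amf:.2f}")
```

For another instrument or site, both the VCD (from local temperature and pressure profiles) and AMF_FRS (from the calibration described here) would have to be re-derived.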
In the following we describe how AMF_FRS can be determined. Figure 5
The first finding indicates that clear-sky observations (around SZA of 36°) can in principle be used for the calibration of the O4 measurements, even if the exact AOD is not known. The second finding indicates that cloudy measurements (as identified by CI) should be removed before the calibration is performed. Figure 6 presents a comparison of the measured O4 DAMFs and simulated O4 AMFs. Note that the y axes were shifted by 1.78, the final value derived for the O4 AMF of the FRS (see below). In the top panel all measurements are shown and in the bottom panel only measurements for clear-sky conditions are shown (the cloud filtering was performed using the CI as described in Wagner et al., 2014). Interestingly, not only for the cloud-filtered measurements, but for all measurements the minimum O4 DAMFs are well
b. After applying the cloud filter, the number of measurements decreased by about a factor of 3. The rather large SZA range ensures that a useful number of measurements is still available for the comparison with the simulation results.
In Fig. 8 the frequency distribution of the normalised O4 DAMF is shown for all observations (blue) or only clear-sky observations (red). The maximum values and uncertainties are determined in the same way as for the CI (Sect. 2). For both cases (all measurements or clear-sky measurements) a value for the O4 AMF of the FRS (AMF_FRS) of 1.78 is derived. The uncertainties are ±0.09 and ±0.08 for all and clear-sky observations respectively. Here it should be noted that the rather large uncertainties are caused by the low number of observations and can probably be reduced if measurements for longer periods with more clear days are analysed. In the original version of our algorithm (Wagner et al., 2014) AMF_FRS was determined based on selected clear-sky days with similar AOD. In spite of the different procedures, a very similar value for AMF_FRS (1.75) was found.
In Fig. 8 results for MAX-DOAS observations at Wuxi (China) are also shown (Wang et al., 2015). Measurements over a period from 1 January to 31 December 2012 were used. As for the Cabauw measurements, a clear peak is found, indicating that the method works in a similar way for completely different locations and measurement conditions.

Figure 9. Flow chart of the updated cloud classification scheme. The basic structure is the same as in the original scheme (Fig. 14 in Wagner et al., 2014). However, except for the decision criterion for fog (decision no. 7), all other criteria were changed compared to the original scheme. The individual changes are summarised in Table 1. For decision no. 6, no universal recommendation can be given, because the spread of the non-zenith angles depends in particular on the relative azimuth angle. Thus it is omitted in the updated classification scheme.

The basic scheme of the original classification algorithm, however, is not changed. For most of the individual decision steps the normalisation procedures and the definition of the threshold values for the involved quantities were adapted. An overview of the new classification scheme and the applied changes is provided in Fig. 9 and Table 1. Only for the identification of fog are exactly the same criteria as in the original algorithm still used. For the other quantities the new thresholds were chosen to best match those of the original scheme. The individual changes are summarised in Table 1 and are discussed in detail in the next subsections. At the end of this chapter (Sect. 4.6) the results of the new scheme are compared to the results of the original scheme.

New threshold values for the CI
In the original cloud classification scheme the measured CI was first normalised (divided) by a SZA-dependent clear-sky reference value (simulated for an AOD of 0.3). The normalisation was applied to correct the strong SZA dependence of the CI (see Fig. 1). Then a constant threshold (independent of the SZA) was used to discriminate clear from cloudy observations. However, it turned out that the simple normalisation was not sufficient for large SZA (> 60°). Thus we decided to use a SZA-dependent threshold in the new version of our cloud classification algorithm (but we do not apply the normalisation of the measured CI values anymore). As threshold values, we use simulation results for AODs of 0.75 for 440 nm (CI_orig) and 0.85 for 390 nm (CI_new) respectively. The AOD value of 0.75 at 440 nm was chosen to achieve consistency between the new and the original classification scheme (for small SZA). The corresponding AOD values at the other wavelengths (including those for the calculation of CI_new) were derived from the AOD at 440 nm using a typical Ångström exponent of unity.
In Fig. 10 the calibrated CI_orig and CI_new for 24 June 2009 are compared to simulated CI for different aerosol and cloud properties. Note that the CI for AOD of 0.85 at 390 nm represents the SZA-dependent threshold value, see Tables 1 and A1. During the morning the measured CI are similar to the simulation results (blue lines) for the AOD obtained from the simultaneous AERONET measurements. Around noon, the AOD increases and some clouds also appear. As a consequence the measured CI decreases and it even falls slightly below the simulated minimum values several times. After about 15:00 the clouds disappear, but the CI stays at low levels because of the increased AOD in the afternoon.
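The step from an AOD of 0.75 at 440 nm to 0.85 at 390 nm follows from the Ångström power law with an exponent of 1, and the SZA-dependent threshold test then reduces to evaluating a polynomial in SZA. The sketch below illustrates both. The polynomial coefficients are purely hypothetical placeholders (the real ones are given in the paper's tables and are not reproduced here), and the function names are illustrative.

```python
def aod_at(wavelength_nm, aod_ref, ref_nm=440.0, angstrom=1.0):
    """Angstrom power law: AOD(l) = AOD(ref) * (l / ref) ** (-angstrom)."""
    return aod_ref * (wavelength_nm / ref_nm) ** (-angstrom)


def polyval(coeffs, x):
    """Horner evaluation; coeffs are ordered from highest to lowest power."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# HYPOTHETICAL coefficients, for illustration only: threshold = 1 - 1e-4 * SZA^2.
EXAMPLE_THRESHOLD_COEFFS = [-1.0e-4, 0.0, 1.0]


def ci_below_threshold(ci_cal, sza_deg, coeffs=EXAMPLE_THRESHOLD_COEFFS):
    """True if the calibrated zenith CI lies below the SZA-dependent threshold."""
    return ci_cal < polyval(coeffs, sza_deg)


aod_390 = aod_at(390.0, 0.75)            # 0.75 * 440 / 390
print(f"AOD at 390 nm = {aod_390:.3f}")  # rounds to the 0.85 quoted in the text
print(ci_below_threshold(0.60, 50.0), ci_below_threshold(0.90, 50.0))
```

A CI below the threshold corresponds to conditions cloudier or more aerosol-laden than the reference scenario; the further decision steps of the scheme are needed to tell these apart.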
Polynomial expressions describing the SZA-dependent threshold values for zenith viewing direction are provided in Table 1; tabulated values are provided in Table A1 in the Appendix.

New threshold values for the temporal smoothness indicator (TSI)
In our original algorithm a so-called temporal smoothness indicator (TSI) is used, which is derived from the temporal variability of the CI (Eq. 7 in Wagner et al., 2014). It is used to identify rapid variations of the sky conditions, e.g. related to broken clouds. In our original study, the time differences between the individual measurements were explicitly considered for the calculation of the TSI. In fact, the TSI was defined as the discrete approximation of the second derivative in time. The TSI was also normalised (divided) by the clear-sky reference value and a constant threshold was applied to discriminate measurements with high TSI from measurements with low TSI (indicating a smooth temporal variation of the CI). In the new version two important changes are applied. First, the time difference between the individual measurements is not considered anymore for the calculation of the TSI. The motivation for this change is the fact that the TSI (as defined in Eq. 7 in Wagner et al., 2014) depends strongly on the time differences between individual measurements and thus on the individual instrument properties and measurement protocols. For practical use in MAX-DOAS inversions it is, however, sufficient to know whether the sky condition has changed during the time interval of a typical elevation sequence (independent of the actual duration of this elevation sequence). Thus, in the updated version of the cloud classification scheme, we use the following (simpler) definition of the TSI:
Here CI_n indicates the CI in zenith direction of the nth elevation sequence. Second, the TSI is not normalised by the clear-sky CI reference values, but instead a SZA-dependent threshold is used. The advantage of this approach is that the threshold values can be calculated based on well-defined atmospheric scenarios. Here we suggest using the difference of the CI for a clear-sky scenario with moderate aerosol load (AOD of 0.2) and the minimum CI for cloudy conditions (see Fig. 10). The AOD of 0.2 is assumed for the upper wavelength and the AOD for the lower wavelength is calculated assuming an Ångström exponent of 1. The threshold value is calculated from the CI for both scenarios:
TSI_threshold(SZA) = α · CI_diff(SZA).
Here TSI_threshold(SZA) represents the threshold value and CI_diff(SZA) is the difference of the simulated CI for clear and cloudy conditions. We chose the proportionality constant α such that for SZA around 50° the threshold value for the new version of the algorithm matches that of the original version. The best agreement is found for α = 0.06 (see Fig. 11). The polynomial expression for CI_diff(SZA) for the exact zenith view is provided in
have to be multiplied by α = 0.06 before they can be used as threshold value for the new TSI.
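A short sketch may help to make the TSI test concrete. Since the TSI formula itself is not reproduced in the text above, the code assumes one plausible form that matches the description (a discrete second derivative of the zenith CI without time differences): TSI_n = |(CI_{n-1} + CI_{n+1}) / 2 - CI_n|. This form is an assumption, as are the function names and the example CI values; only α = 0.06 and the definition of the threshold as α times the clear-minus-cloudy CI difference are taken from the text.

```python
ALPHA = 0.06  # proportionality constant quoted in the text


def tsi(ci):
    """Assumed TSI for each interior elevation sequence n.

    TSI_n = |(CI_{n-1} + CI_{n+1}) / 2 - CI_n|, where `ci` holds the
    zenith CI of consecutive elevation sequences.
    """
    return [abs(0.5 * (ci[n - 1] + ci[n + 1]) - ci[n])
            for n in range(1, len(ci) - 1)]


def tsi_threshold(ci_clear, ci_cloud_min, alpha=ALPHA):
    """Threshold = alpha * CI_diff, with CI_diff = clear-sky CI - minimum cloudy CI."""
    return alpha * (ci_clear - ci_cloud_min)


# Hypothetical simulated CI values at one SZA (not taken from the paper).
threshold = tsi_threshold(ci_clear=1.2, ci_cloud_min=0.7)

smooth = tsi([1.00, 1.01, 1.02])             # slowly varying sky
broken = tsi([1.0, 1.0, 0.6, 1.0, 1.0])      # one sequence hit by a cloud
print([t > threshold for t in smooth])
print([t > threshold for t in broken])
```

In practice the threshold would be evaluated per measurement from the SZA-dependent polynomial for CI_diff rather than from two fixed numbers.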
In this study, we do not provide a set of thresholds for the TSI in non-zenith viewing directions, because they also depend on the azimuth angle, which is different for different instruments (and seasons).

Threshold values for the spread of the CI for different elevation angles
The spread of the CI for the different elevation angles approaches zero in the presence of clouds (Wagner et al., 2014), which makes it a useful quantity for the distinction between situations with enhanced aerosol loads or clouds. In addition, the spread of the CI can be used to identify clouds at low SZA, when identification by the absolute value of the CI fails (see Wang et al., 2015). The spread of the CI is calculated as the difference between the maximum and minimum CI for all elevation angles of an individual elevation sequence. Wagner et al. (2014) performed simulations of the spread of the CI for selected relative azimuth angles only. Here we extended these calculations to cover all relevant combinations of relative azimuth angles and SZA (see Fig. 12). Based on these results the following can be concluded.
a. No clear SZA dependence is found. Thus a simple normalisation as a function of the SZA (as in Wagner et al., 2014) is in general not appropriate, and in the new version of the classification scheme no normalisation as a function of the SZA is performed. We keep the original threshold value of 0.14 (see Table 1), because according to Fig. 12 it still seems to be a good compromise to discriminate clouds and aerosols. Here it should be noted that for individual measurements (with a specific relation between the SZA and the relative azimuth angle) a monotonic relationship between the spread of the CI and the SZA might occur. In such cases a SZA-dependent threshold might still be useful.
b. Clouds and aerosols (with the same optical depth) have a very similar effect on the spread of the CI (the differences are mainly a result of the different wavelength dependence of cloud and aerosol scattering). Thus in many cases, it is difficult to clearly discriminate between both types of atmospheric scatterers based on the spread of the CI.
c. Cases with optical depths above about 6 can be clearly identified based on the used threshold of 0.14. This finding makes the spread of the CI an important indicator for the presence of clouds in situations (e.g. at small SZA) in which they cannot be identified based on the zenith CI itself (Wang et al., 2015).
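The spread test described above can be sketched as follows. The threshold of 0.14 is the value retained in the text; the function names, the dictionary layout (elevation angle in degrees mapped to calibrated CI) and the example values are hypothetical.

```python
SPREAD_CI_THRESHOLD = 0.14  # threshold value kept from the original scheme


def ci_spread(ci_by_elevation):
    """Spread = max CI - min CI over all elevation angles of one sequence."""
    values = list(ci_by_elevation.values())
    return max(values) - min(values)


def low_ci_spread(ci_by_elevation, threshold=SPREAD_CI_THRESHOLD):
    """True if the spread is below the threshold (large optical depth)."""
    return ci_spread(ci_by_elevation) < threshold


# Hypothetical calibrated CI values for two elevation sequences.
overcast_sequence = {1: 0.62, 10: 0.61, 30: 0.60, 85: 0.58}
clear_sequence = {1: 0.70, 10: 0.85, 30: 0.95, 85: 1.20}
print(low_ci_spread(overcast_sequence), low_ci_spread(clear_sequence))
```

As noted in point b, a small spread alone does not separate clouds from aerosols of the same optical depth, so this flag is meant to be combined with the other decision steps.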

New threshold for the O4 AMF in zenith direction
In contrast to the original version of the algorithm, the SZA dependence of the measured O4 AMFs is not corrected by subtracting the corresponding clear-sky reference values in the new algorithm. Instead, the measured O4 AMFs are compared to the wavelength-dependent clear-sky reference value, to which a constant offset is added (representing the effect of multiple scattering inside optically thick clouds). In contrast to the original classification scheme, the clear-sky reference value is calculated for an AOD of 0.2 (instead of 0.3) to be consistent with the AOD measured by the AERONET sun photometer on 24 June 2009, which was chosen as a clear-sky reference. Because of the new O4 calibration and the new clear-sky reference values, a slightly different offset (0.85) compared to the original version (0.74) was chosen in order to bring the results of the new algorithm into close agreement with those of the original algorithm. A comparison of the measured (calibrated) O4 AMF with the threshold value for a day with occasional optically thick clouds is shown in Fig. 13.
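In code, this decision step amounts to comparing the calibrated zenith O4 AMF with the clear-sky reference plus the constant offset. The sketch below uses the offset of 0.85 from the text; the function name and the example AMF values are hypothetical, and the clear-sky reference would in practice come from the simulated (SZA-dependent) values for an AOD of 0.2.

```python
O4_OFFSET = 0.85  # offset added to the clear-sky reference (new scheme)


def optically_thick_cloud(o4_amf_cal, o4_amf_clear_ref, offset=O4_OFFSET):
    """True if the calibrated zenith O4 AMF exceeds reference + offset."""
    return o4_amf_cal > o4_amf_clear_ref + offset


# Hypothetical values: clear-sky reference AMF of 2.2 at the given SZA.
print(optically_thick_cloud(3.2, 2.2))  # 3.2 > 3.05
print(optically_thick_cloud(2.5, 2.2))  # 2.5 < 3.05
```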

Threshold for the spread of the O4 AMF
The calculation of the spread of the O4 AMF for the different elevation angles is not affected by the new calibration, since the same offset value is added to the O4 DAMF of all elevation angles. Thus, the same threshold value as in the original version (0.37) is still used.
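The invariance argument can be checked numerically: adding the same AMF_FRS to every elevation angle shifts the maximum and the minimum by the same amount, so their difference is unchanged. The function name and the DAMF values below are hypothetical; only the offset of 1.78 and the threshold of 0.37 are taken from the text.

```python
SPREAD_O4_THRESHOLD = 0.37  # unchanged from the original scheme


def o4_spread(damf_values, amf_frs=0.0):
    """Spread of the O4 AMF over one elevation sequence.

    `amf_frs` is added to every DAMF, as in the calibration, before the
    maximum and minimum are compared.
    """
    amfs = [d + amf_frs for d in damf_values]
    return max(amfs) - min(amfs)


damfs = [0.4, 0.9, 1.6]                      # hypothetical DAMFs of one sequence
spread_uncal = o4_spread(damfs)              # computed from DAMFs directly
spread_cal = o4_spread(damfs, amf_frs=1.78)  # computed from calibrated AMFs
print(abs(spread_uncal - spread_cal) < 1e-9)
```

The resulting spread would then be compared with `SPREAD_O4_THRESHOLD` in the corresponding decision step of the scheme.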

Comparison of the results of the original and the new classification schemes
We applied the updated classification scheme (using either CI_orig or CI_new) to the same data set as in Wagner et al. (2014). A summary of the comparison results is shown in Fig. 14. In general, good agreement between the original and new results is found, but especially at high SZA (> 70°) substantial deviations for particular classification results also occur. These differences are mainly caused by the different treatment of the normalisation and the SZA dependence of the thresholds, in particular those for the absolute value of the CI (step no. 1 in Fig. 9). Due to this change, far fewer measurements for SZA > 70° are assigned to the class "clear sky with low aerosol" than in the original classification scheme. This change, however, seems to be reasonable, since for the new algorithms the relative fraction of the class "clear sky with low aerosol" has become more similar for low and high SZA. The updated threshold for the CI also leads to a considerable shift of cases from the class "cloud holes" to the class "broken clouds". This change also seems to be reasonable, because the results of the new algorithm for both classes have become more similar for low and high SZA, especially for the new CI.

Threshold values for observations in exact zenith direction (elevation angle of 90°)
Since the zenith observations were not performed in exact zenith direction (but instead at an elevation angle of 85°) for our MAX-DOAS measurements during the CINDI campaign, the question arises as to whether the threshold values can also be used for observations in exact zenith direction. From the comparison of simulated CI for elevation angles of 85 and 90° (Fig. 15), we conclude that our findings can be directly transferred to observations at 90°. The same conclusions hold for the quantities derived from the CI, the TSI and the spread of the CI (see also Figs. 12 and A2 in the Appendix). In Fig. 16 the clear-sky reference values of the O 4 AMF for elevation angles of 85 and 90° are compared. Almost identical values are found, indicating that the reference value for 85° can also be used for measurements at exact zenith view. Polynomial expressions for all threshold values for exact zenith direction are provided in Table 2; tabulated values are provided in Table A1 in the Appendix.
5 Further improvements of the classification scheme

In this section possible extensions and improvements of the calibration procedure and classification scheme are discussed.

Effect of instrumental degradation for long-term measurements
Especially for long-term measurements, instrumental degradation can become an important issue, because the results of the CI, O 4 absorption (and radiance) might systematically change over time. Wang et al. (2015) presented a method to quantify the effect of instrument degradation using time series of the derived quantities. They also suggested a degradation correction for the observed CI and radiance. Unfortunately, the effect of instrumental changes for the O 4 absorption (in particular the change of the instrument's resolution) can be very strong, and these influences cannot usually be corrected well. In such cases, the O 4 absorption cannot be used for the detection of optically thick clouds. Thus, for long-term observations the detection of optically thick clouds should probably be based on observations of the radiance. An approach for an indirect calibration of the radiance will be proposed in Sect. 5.2.

Estimation of a SZA-dependent threshold for the radiance
Optically thick clouds can be identified using the O 4 absorption or the measured radiance (Wagner et al., 2014). Especially for long-term measurements, the effect of instrumental degradation on the radiance is usually much weaker than for the retrieved O 4 absorption (see e.g. Wang et al., 2015). However, as mentioned before, the calibration of the radiance requires more effort than the calibration of the CI and O 4 absorption. In particular, measurements for days with constant and well-known AOD are required (Wagner et al., 2015). Thus, the updated version of the cloud classification scheme does not use the measured radiance for the detection of optically thick clouds, because such clouds can also be clearly identified by the O 4 absorption observed in zenith direction. Nevertheless, especially for long-term observations, the use of O 4 observations for the detection of optically thick clouds might be strongly affected by instrumental degradation (Wang et al., 2015). For such cases it might still be useful to identify optically thick clouds based on the measured radiance. Thus, in this subsection we propose a simple method for the determination of a threshold value which can be applied to the uncalibrated radiance. It is based on the comparison of measured radiances for optically thick and thin clouds as determined from the O 4 absorption. In Fig. 17 all radiance observations for optically thin clouds (identified by the O 4 absorption) are indicated by red dots and measurements for optically thick clouds by blue dots. In spite of some outliers, the transition between thin and thick clouds (as a function of the SZA) can be clearly identified. Moreover, the threshold value used in the original version of the algorithm (indicated by the black line) fits well to the transition between the red and blue points. This finding indicates the possibility of determining the threshold value for the radiance without performing an explicit radiance calibration of the instrument via the relationship with the observed O 4 absorption in zenith direction. This method could be applied for periods in which the effect of instrumental changes on the O 4 absorption is negligible. The derived radiance calibration could then be used for the entire period of the MAX-DOAS measurements.
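One conceivable way to turn this idea into numbers is sketched below. The text does not prescribe a specific estimator, so everything here is an assumption made for illustration: the function name, the SZA bin width of 10° and the choice of the geometric mean of the two median radiances as the transition value are all hypothetical.

```python
from statistics import median


def radiance_threshold_by_sza(samples, bin_width=10.0):
    """Estimate a radiance threshold per SZA bin.

    samples: iterable of (sza_deg, radiance, is_thick) tuples, where
    is_thick is the O4-based classification of the same measurement.
    Returns {bin_start_deg: threshold}. Bins lacking either thick or
    thin cases are skipped.
    """
    bins = {}
    for sza, radiance, is_thick in samples:
        start = (sza // bin_width) * bin_width
        thick, thin = bins.setdefault(start, ([], []))
        (thick if is_thick else thin).append(radiance)

    thresholds = {}
    for start, (thick, thin) in sorted(bins.items()):
        if thick and thin:
            # Assumed estimator: geometric mean of the two medians,
            # which is insensitive to a few outliers in either group.
            thresholds[start] = (median(thick) * median(thin)) ** 0.5
    return thresholds


# Made-up uncalibrated radiances (counts per second), illustration only.
samples = [
    (35.0, 100.0, True), (36.0, 120.0, True),
    (37.0, 900.0, False), (38.0, 1100.0, False),
    (72.0, 20.0, True),  # bin without thin-cloud cases, hence skipped
]
thresholds = radiance_threshold_by_sza(samples)
```

Such a table of thresholds would have to be derived from a period in which the O 4 absorption is not yet affected by instrumental changes.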

Observations at low latitudes
The results in Fig. 15 indicate a potential problem for measurements at low SZA (< 30°). For such viewing geometries, the difference of the CI for clear and cloudy observations decreases, and the CI for cloudy situations can even become larger than the threshold values for the detection of high aerosol loads or clouds. Thus the identification of cloudy measurements for low SZA becomes increasingly uncertain or eventually even impossible based on the absolute value of the CI. For such situations, it is recommended to identify clouds by the spread of the CI as proposed by Wang et al. (2015).

Possible ways of distinguishing between the effects of clouds and aerosols with similar optical depths
As discussed in Sect. 4.3, from measurements of the absolute value of the CI alone it is difficult to distinguish between aerosols and clouds if they have the same optical thickness (especially for optical thicknesses between about 1 and 6). This is especially important for the presence of continuous clouds, because they cannot be detected by enhanced values of the temporal smoothness indicator. We recommend making use of the O 4 observations to distinguish between aerosols and continuous clouds. We suggest the following two approaches. These possibilities should be investigated in further studies.

Observations at high latitudes and over bright surfaces
The application of the cloud classification scheme at high latitudes is subject to two specific problems. First, measurements at small SZA are rare. This problem mainly affects the calibration of the O 4 absorption. Second, the surface will be covered by snow and ice during large parts of the year. Here it should be noted that, also at midlatitudes, the surface might be covered by ice and snow during part of the year. Increased values of the surface albedo strongly affect the atmospheric radiative transport and thus probably also the proposed calibration approaches and derived threshold values.
Figure 18 shows simulated CI and O 4 AMFs in zenith direction for high surface albedo (0.8). Interestingly, the CI is hardly affected by the increase of the surface albedo: compared to the results for low surface albedo (Fig. 15, bottom), the CI values are shifted towards slightly higher values by an almost constant value, indicating the effect of increased multiple scattering, which also leads to more Rayleigh scattering events of the observed photons. These results indicate that for measurements over snow- and ice-covered surfaces a similar calibration approach for the CI as for measurements over low albedo can be applied.
In contrast, the O 4 AMFs (Fig. 18, bottom) are strongly affected by the increased surface albedo. Compared to the results for low surface albedo, systematically higher values are found. Moreover, for SZA < 80°, the O 4 AMF depends strongly on the AOD. We made additional simulations for different values of the surface albedo between zero and unity (see Fig. A3 in the Appendix). Interestingly, for all values a specific SZA exists for which the O 4 AMFs become independent of the AOD. The dependence of this SZA on the surface albedo is almost linear (see Fig. 19).
From these findings we conclude that the calibration method developed for measurements above small surface albedo has to be modified before it can be applied to measurements over high surface albedo. Clear-sky measurements at large SZA can probably be used for the comparison of measured and simulated O 4 AMF. Fortunately, this possibility fits well to the fact that at high latitudes most measurements are performed at moderate to high SZA. These simulation results indicate that for measurements over bright surfaces modified calibration approaches and threshold values would have to be used. These modifications need to be tested in more detail. Here it should also be considered that the surface albedo often changes rapidly, especially in spring and autumn. Thus methods for the detection (and quantification) of changes of the surface albedo based on the MAX-DOAS observations should also be developed.

Conclusions and outlook
We developed methods for the calibration of the colour index (CI) and the O 4 absorption derived from MAX-DOAS measurements of scattered sunlight, which are an important step towards a universal cloud classification scheme for MAX-DOAS observations. Both calibration methods are based on the comparison of measurements and radiative transport simulations for well-defined atmospheric conditions (e.g. clear or cloudy conditions) and limited SZA ranges. It should be noted that, except for the determination of the spread of the CI and the O 4 absorption (see Fig. 9), the algorithm can also be applied to "traditional" zenith sky DOAS instruments.
For the calibration of the CI, observations under cloudy conditions are used, for which minimum values of the CI are found (if the CI is defined as the ratio of the intensity at the short wavelength divided by the intensity at the long wavelength). The result of the calibration procedure is a proportionality constant, which is applied to the measured CI.
For the calibration of the O 4 absorption, observations under clear-sky conditions and for a limited SZA range are used. As a result of the calibration procedure a constant offset is determined (the O 4 absorption of the Fraunhofer reference spectrum), which is added to the measured O 4 absorption. We successfully applied both calibration methods to measurements at two locations: Cabauw in the Netherlands and Wuxi in China.
In the second part of our study we applied the cloud classification algorithm described in Wagner et al. (2014) to the calibrated CI and O 4 absorptions and adapted the original threshold values accordingly. Together with the calibration method, the new set of threshold values can be used in a consistent way for any MAX-DOAS measurement, thus constituting a universal method for cloud classification. In addition to the new threshold values, the updated version of the cloud classification includes further important improvements.
a. We used a new wavelength pair (330 nm/390 nm) for the CI. Compared to the CI (320 nm/440 nm) used in our original study (Wagner et al., 2014) this choice has two advantages: the change of the lower wavelength to 330 nm largely minimises the impact of the ozone absorption on the CI. The change of the upper wavelength to 390 nm ensures that the new CI can be calculated for almost all UV (MAX-)DOAS instruments (which often do not cover 440 nm).
b. The new threshold values better describe the SZA dependence. They are obtained from radiative transfer simulations for well-defined atmospheric scenarios. This aspect is important, since it ensures that threshold values for possible modified CI or additional cloud-sensitive quantities could be determined in a consistent way (based on the same atmospheric scenarios).
c. No radiance measurements are used in the new version, because the absolute calibration of the measured radiance spectra is more complicated than that of the CI and the O 4 absorption. Fortunately, the omission of the radiance measurements has no large impact on the classification results, because the radiance was only used for the detection of optically thick clouds, which can also be identified from the O 4 absorption.
We compared the results of the updated cloud classification scheme with those from the original version and found generally good agreement. The comparison results indicate that the updated classification scheme yields more consistent results for high SZA (> 70°) than the original classification scheme.
It should be noted that our cloud classification algorithm is optimised for MAX-DOAS measurements at midlatitudes. For measurements at high and low latitudes specific problems occur: at low latitudes, many measurements are performed for small SZA, for which the CI becomes indistinguishable for clear and cloudy conditions. As suggested by Wang et al. (2015) this problem can be partly overcome by the use of the spread of the CI for the identification of clouds. At high latitudes, measurements at small SZA are rare, which can lead to problems for the application of the calibration methods, especially for the O 4 absorption. In addition, the frequently increased surface albedo due to snow and ice strongly affects the atmospheric radiative transfer and thus the prerequisites of our calibration methods. However, sensitivity studies suggest that for such conditions modified versions of the calibration methods and cloud classification scheme can still be applied.
Another problem with the new version of the algorithm is that, especially for long-term observations, the derived O 4 absorptions might be strongly affected by temporal changes of the instrument properties. Thus, the identification of optically thick clouds might be impossible for such measurements. As a possible solution for this limitation we propose a new indirect determination of the threshold value for the uncalibrated radiance, which is based on the O 4 measurements for periods which are not affected by instrumental changes. Optically thick clouds could then be identified based on the uncalibrated radiances, which are usually less affected by instrument degradation (and can be better corrected for instrumental changes than the O 4 absorption).
Finally we identify important research areas, which should be addressed in future studies in order to further improve the cloud classification scheme. These areas include a more sophisticated use of CI from individual elevation angles (instead of simply using the spread of the CI), modifications for the cloud classification algorithm for situations with high surface albedo, an improved discrimination of clouds and aerosols based on O 4 absorptions as well as the investigation of 3-D cloud effects on the CI.

Data availability
The raw data can be obtained on request from the authors.

Figure 1. Simulated colour indices for an elevation angle of 85° (top: CI original = 320 nm/440 nm; middle: CI new = 330 nm/390 nm) for different aerosol and cloud optical depths. For the aerosol cases (green lines) the OD represents the value at 390 nm (Ångström exponent = 1); for the cloud cases (OD ≥ 2) the same optical depth is assumed for both wavelengths (Ångström exponent = 0). The aerosol layer is between the surface and 1 km; the cloud layer is between 1 and 2 km. Aerosol properties are described by a Henyey-Greenstein model with an asymmetry parameter of 0.68 and a single scattering albedo of 0.95. Bottom: SZA for a day (26 June 2009) in the middle of the campaign.

Figure 3. Normalised CI during CINDI for both wavelength pairs. The normalisation is performed by dividing the measured CI by the respective simulated minimum values. For SZA < 60° (indicated by the red vertical lines) the minima of the normalised CI are almost independent of the SZA. The measurements are from the period 12 June to 15 July 2009.

Figure 4. Frequency distribution of the normalised CI for SZA < 60° (for bins of 0.02). The blue and magenta curves represent the results for Cabauw (1527 measurements); the black curve represents the results for Wuxi (2440 measurements).
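The CI calibration behind Figs. 3 and 4 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the data are made up, and it is assumed that the most populated bin of the frequency distribution marks the cloudy minimum, so that the proportionality constant is the inverse of the peak position. The bin width of 0.02 and the limit SZA < 60° are taken from the captions.

```python
import math
from collections import Counter


def histogram_peak(values, bin_width):
    """Centre of the most populated bin of width bin_width."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    peak_bin, _ = counts.most_common(1)[0]
    return (peak_bin + 0.5) * bin_width


def ci_scaling_factor(measurements, sza_max=60.0, bin_width=0.02):
    """Proportionality constant for the measured CI.

    measurements: (sza_deg, ci_measured, ci_simulated_minimum) tuples.
    The measured CI is normalised by the simulated minimum; the peak of
    the resulting frequency distribution is assumed to be the cloudy case.
    """
    normalised = [ci / ci_min
                  for sza, ci, ci_min in measurements if sza < sza_max]
    return 1.0 / histogram_peak(normalised, bin_width)


# Made-up data: three cloudy cases, one clear case, one case at high SZA.
data = [
    (30.0, 0.405, 0.5), (40.0, 0.407, 0.5), (50.0, 0.406, 0.5),
    (45.0, 0.600, 0.5), (75.0, 0.300, 0.5),
]
factor = ci_scaling_factor(data)
calibrated_ci = [ci * factor for _, ci, _ in data]
```

For the made-up data the peak lies in the bin centred at 0.81, so the measured CI is multiplied by 1/0.81.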

Figure 5. Simulated O 4 AMFs (360 nm) for different aerosol and cloud scenarios. The violet and red lines indicate results for minimum and maximum AOD of 0 and 3 respectively. For clear-sky conditions almost the same O 4 AMFs are obtained around SZA of 36° (indicated by the blue arrows), independent of the assumed AOD. Depending on the cloud OD and the cloud height, clouds can either increase or decrease the O 4 AMF compared to clear-sky conditions (Wagner et al., 2011). The simulations are performed for a day (26 June 2009) in the middle of the campaign.

Figure 5 presents simulated O 4 AMFs for (near) zenith view for different cloud-free (coloured lines) and cloudy conditions (black lines). From these simulation results, two important findings can be deduced.
a. Around SZA of 36° (indicated by the blue arrows) the O 4 AMFs for the different aerosol scenarios are almost the same.
b. For cloudy scenarios the O 4 AMF can be either decreased or increased compared to cloud-free scenarios: for low and optically thick clouds the O 4 AMFs are enhanced, while for high and optically thin clouds the O 4 AMFs are decreased (Wagner et al., 2011).

4 Update of the cloud classification scheme using the newly calibrated MAX-DOAS CI and O 4 data

Wagner et al. (2014) developed a scheme for the classification of cloud and aerosol conditions based on MAX-DOAS measurements. In this section we introduce an updated version of this classification scheme. Compared to the original version, four major changes are applied.
a. The threshold values of the original scheme have been adapted to the newly calibrated CI and O 4 data.
b. The new threshold values are based on well-defined atmospheric conditions. Thus the procedures for the determination of the threshold values can be applied to any measurements at different locations and seasons. The new threshold values also better represent the SZA dependencies.
c. The threshold values are also provided for the new CI.
d. The determination of optically thick clouds is only based on O 4 measurements (without making use of the radiance measurements).

Figure 10. Comparison of measured normalised CI for 24 June 2009 with simulated CI for different scenarios. Top: original CI for 320 nm/440 nm; bottom: new CI for 330 nm/390 nm. The AOD values in the legends correspond to the wavelengths 440 and 390 nm respectively. The blue curves represent CI for the AOD measured by the AERONET instrument on the morning of 24 June 2009. The red curves represent CI for the threshold values used in the new classification scheme. The minimum values represent cloudy conditions (see Fig. 1). The simulations are made for a day (26 June 2009) in the middle of the campaign.

Figure 11. (a) Measured normalised CI for 24 June 2009. (b) Normalised TSI according to the original algorithm. Here a constant threshold value was used. (c, d) New TSI as used in this study without normalisation and explicit consideration of the time (Eq. 7). The threshold values for the new TSI depend on the SZA.

Figure 12. Spread of the CI (top: CI orig for 320 nm/440 nm; bottom: CI new for 330 nm/390 nm) between the different elevation angles for different aerosol (left) and cloud (right) scenarios. The spread is calculated as the difference between the maximum and minimum CI for a given combination of SZA and relative azimuth angle. The different lines of the same colour represent simulations for different relative azimuth angles (0, 30, 60, 90, 120, 150, 180°). The simulations are made assuming an elevation angle of 85° for zenith view. Results for exact zenith view are almost identical (see Fig. A2 in the Appendix).

Figure 14. Comparison of the results of the original and updated classification scheme. The black numbers indicate the comparison results between the original algorithm and the updated algorithm using CI orig (1) and between the original algorithm and the updated algorithm using CI new (2). Blue and red colours represent cases which were assigned or not assigned to the respective class by the original algorithm. Bars with full colours indicate agreement between the old and new algorithm; hatched bars indicate disagreement between the old and new algorithm.

Figure 15. Comparison of simulated CI (top: CI orig : 320 nm/440 nm; bottom: CI new : 330 nm/390 nm) for elevation angles of 85° (blue lines) and 90° (red lines). The different symbols represent different atmospheric scenarios. The black line represents results for a cloud optical depth of 10.
a. If clouds are not included in the forward model for the aerosol profile inversion of MAX-DOAS retrievals, continuous clouds might simply be identified by the bad convergence of the forward model (large differences between the measured O 4 absorptions and the corresponding model results).
b. More sophisticated versions of the forward model might even include a simple parameterisation of clouds (e.g. based on homogeneous clouds with different optical depths and altitudes). Homogeneous clouds might then simply be identified by the retrieved layer height (if it is larger than the typical aerosol layer height).

Figure 19. SZA for which the O 4 absorption becomes independent of the aerosol optical depth, as a function of the surface albedo (see also individual simulation results in Fig. A3 in the Appendix).

Figure A3 .
Figure A2. Same results as in Fig. 12, but for zenith observations at exactly 90° elevation.
are introduced.
a. The determination of the threshold values is based on well-defined atmospheric scenarios.
b. The new thresholds better account for solar zenith angle (SZA) dependencies.
c. A new set of wavelengths is used for the CI: the old wavelength pair (320 nm/440 nm, see Wagner et al., 2014) is replaced by 330 nm/390 nm. The new wavelength pair has several advantages: the new shorter wavelength ( °, but for 85°, because of the specific conditions of the measurements used here (Wag-

AMFs for AOD of 0.2 were subtracted. This normalisation procedure is applied to remove the SZA dependence. The choice of the simulations for AOD of 0.2 is somewhat arbitrary, but the exact choice has only a very small effect on the normalisation results. Polynomial expressions and tabulated values of the O 4 AMF for AOD = 0.2 (clear-sky reference value) are provided in Tables 2 and A1.

For the calibration of the O 4 DAMF, measurements for SZA between 30 and 50° were chosen for two reasons.
a. For different aerosol layer heights, the SZA for which the O 4 AMF for different AOD become similar varies slightly between about 30° (for layer heights of 500 m) and 50° (for layer heights of 2000 m), see Fig. A1 in the Appendix. Since the aerosol layer height is usually unknown and can vary with time, we chose a SZA range which covers typical aerosol layer heights.

Figure 6. Comparison of measured O 4 DAMFs (right axis) with simulated O 4 AMFs (left axis) for different AOD. In the top panel all observations are shown and in the bottom panel only clear-sky observations are shown. The rectangles indicate the SZA ranges (30-50°) used for the calibration of the O 4 measurements. The right y axis is shifted compared to the left y axis by the O 4 AMF of the FRS (−1.78). The measurements are from the period 12 June to 15 July 2009; the simulations are performed for a day (26 June 2009) in the middle of the campaign (aerosol layer height: 1000 m, surface albedo: 5 %).

Figure 7. Normalised O 4 DAMF for all measurements (blue) or measurements under clear-sky conditions (red). The normalisation is performed by subtracting the simulated O 4 AMFs for AOD of 0.2. The rectangles indicate the SZA ranges used for the calibration of the O 4 DAMF. The measurements are from the period 12 June to 15 July 2009.

Atmos. Meas. Tech., 9, 4803-4823, 2016, www.atmos-meas-tech.net/9/4803/2016/. T. Wagner et al.: Absolute calibration of the colour index and O 4 absorption

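[Flow chart of the cloud classification scheme (cf. Fig. 9). Decision steps: CI Z,meas < CI ref ? CI the same for all elevation angles? Rapid temporal variation of CI Z ? (in contrast to smooth temporal variation of CI Z and spread of CI for different elevation angles). Resulting classes: clear sky with low aerosol, clear sky with high aerosol, clear sky between clouds, broken clouds, continuous clouds.]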
Figure 8. Frequency distribution (for bins of 0.05) of the normalised O 4 DAMF for SZA between 30 and 50°. The blue and red curves represent observations at Cabauw for all sky conditions (896 measurements) and clear-sky conditions (302 measurements) respectively. For both selections the frequency maximum is found for −1.78. The black curve represents the frequency distribution for clear-sky observations at Wuxi (790 measurements).

completely different locations and measurement conditions. The derived value for the O 4 AMF of the FRS is almost the same as for the Cabauw measurements as both FRS were recorded at similar AOD and SZA. Finally, two important aspects should be mentioned.
a. For long-term measurements, it might be necessary to use different FRS for different parts of the whole time series. In such cases the calibration procedure has to be applied for each selected FRS.
b. The O 4 VCD depends on atmospheric temperature and pressure. Thus it varies with time. Depending on the weather conditions and season, such changes can exceed 10 %. The variation of the O 4 VCD leads to a similar variation of the measured O 4 DAMFs. In addition, the temperature dependence of the O 4 cross section probably further increases this variability. So far, these effects are not explicitly considered in most studies, and here we also assumed the O 4 VCD to be constant. Thus part of the scatter of the measured O 4 AMFs in Figs. 7 and 8 might be caused by the variation of the O 4 VCD and temperature dependence of the O 4 cross section. However, the current version of the algorithm is only slightly affected by the corresponding uncertainties of the derived O 4 DAMFs, because they are used only for the identification of optically thick clouds and fog. Future studies might take the effects discussed above into account when retrieving the O 4 DAMFs from the measured spectra.
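The determination of the O 4 AMF of the FRS from the frequency distribution can be sketched in a few lines. This is an illustration rather than the authors' code: the function name and the data are hypothetical, and the position of the frequency maximum is approximated by the centre of the most populated bin. The SZA range of 30 to 50° and the bin width of 0.05 follow the caption of Fig. 8.

```python
import math
from collections import Counter


def frs_o4_amf(measurements, sza_range=(30.0, 50.0), bin_width=0.05):
    """Estimate the O4 AMF of the Fraunhofer reference spectrum (FRS).

    measurements: (sza_deg, o4_damf_measured, o4_amf_simulated) tuples,
    with the simulated AMF taken for the clear-sky reference AOD of 0.2.
    The normalised DAMF (measured minus simulated) peaks at minus the
    O4 AMF of the FRS, so the sign of the peak position is flipped.
    """
    lo, hi = sza_range
    diffs = [damf - amf_sim
             for sza, damf, amf_sim in measurements if lo <= sza <= hi]
    if not diffs:
        raise ValueError("no measurements in the selected SZA range")
    counts = Counter(math.floor(d / bin_width) for d in diffs)
    peak_bin, _ = counts.most_common(1)[0]
    return -(peak_bin + 0.5) * bin_width


# Made-up data: three clear-sky cases, one cloudy case, one at high SZA.
data = [
    (35.0, 0.33, 2.10), (40.0, 0.36, 2.13), (45.0, 0.40, 2.17),
    (42.0, 1.00, 2.15), (70.0, 0.50, 3.00),
]
offset = frs_o4_amf(data)

# The calibrated O4 AMF is the measured DAMF plus the derived offset.
calibrated = [damf + offset for _, damf, _ in data]
```

For the made-up data the offset is 1.775 (the centre of the bin containing −1.77); for the Cabauw measurements the text reports a frequency maximum at −1.78. As noted in aspect a above, the procedure would have to be repeated for each FRS used in a long time series.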

Table 1. Comparison of the normalisation procedures and threshold values for the individual decisions of the cloud classification scheme of the original algorithm and the new algorithms (for CI orig and CI new ). In the column "algorithm steps" it is indicated in which decisions the selected quantity is involved (see Fig. 9).

Table 1, and tabulated values are provided in Table A1 in the Appendix. Note that the provided values

Figure 17. Measured radiance at 360 nm (in units of counts s⁻¹) for near-zenith observations during the Cabauw campaign. The blue/red dots indicate measurements which were classified as under optically thick/thin clouds, based on the calibrated O 4 absorption using the updated thresholds. The black curve represents the threshold value for the radiance used in the original version of the algorithm. The measurements are from the period 12 June to 15 July 2009.

Table 2. Polynomial expressions for the different SZA-dependent clear-sky reference and threshold values.