Quantification and parametrization of non-linearity effects by higher-order sensitivity terms in scattered light differential optical absorption spectroscopy

We address the application of differential optical absorption spectroscopy (DOAS) of scattered light observations in the presence of strong absorbers (in particular ozone), for which the absorption optical depth is a non-linear function of the trace gas concentration. This is the case because Beer–Lambert law generally does not hold for scattered light measurements due to many light paths contributing to the measurement. While in many cases linear approximation can be made, for scenarios with strong absorptions non-linear effects cannot always be neglected. This is especially the case for observation geometries, for which the light contributing to the measurement is crossing the atmosphere under spatially well-separated paths differing strongly in length and location, like in limb geometry. In these cases, often full retrieval algorithms are applied to address the non-linearities, requiring iterative forward modelling of absorption spectra involving time-consuming wavelength-bywavelength radiative transfer modelling. In this study, we propose to describe the non-linear effects by additional sensitivity parameters that can be used e.g. to build up a lookup table. Together with widely used box air mass factors (effective light paths) describing the linear response to the increase in the trace gas amount, the higher-order sensitivity parameters eliminate the need for repeating the radiative transfer modelling when modifying the absorption scenario even in the presence of a strong absorption background. While the higher-order absorption structures can be described as separate fit parameters in the spectral analysis (so-called DOAS fit), in practice their quantitative evaluation requires good measurement quality (typically better than that available from current measurements). Therefore, we introduce an iterative retrieval algorithm correcting for the higher-order absorption structures not yet considered in the DOAS fit as well as the absorption dependence on temperature and scattering processes.

Abstract.We address the application of differential optical absorption spectroscopy (DOAS) of scattered light observations in the presence of strong absorbers (in particular ozone), for which the absorption optical depth is a non-linear function of the trace gas concentration.This is the case because Beer-Lambert law generally does not hold for scattered light measurements due to many light paths contributing to the measurement.While in many cases linear approximation can be made, for scenarios with strong absorptions non-linear effects cannot always be neglected.This is especially the case for observation geometries, for which the light contributing to the measurement is crossing the atmosphere under spatially well-separated paths differing strongly in length and location, like in limb geometry.In these cases, often full retrieval algorithms are applied to address the non-linearities, requiring iterative forward modelling of absorption spectra involving time-consuming wavelength-bywavelength radiative transfer modelling.In this study, we propose to describe the non-linear effects by additional sensitivity parameters that can be used e.g. to build up a lookup table.Together with widely used box air mass factors (effective light paths) describing the linear response to the increase in the trace gas amount, the higher-order sensitivity parameters eliminate the need for repeating the radiative transfer modelling when modifying the absorption scenario even in the presence of a strong absorption background.While the higher-order absorption structures can be described as separate fit parameters in the spectral analysis (so-called DOAS fit), in practice their quantitative evaluation requires good measurement quality (typically better than that available from current measurements).Therefore, we introduce an iterative retrieval algorithm correcting for the higher-order absorption structures not yet considered in the DOAS fit as well as the absorption dependence on temperature and scattering processes.

Introduction
Differential optical absorption spectroscopy (DOAS) (Platt and Stutz, 2008) is a well-established remote sensing method to retrieve trace gas abundances in the atmosphere, either from direct or scattered light observations (e.g.Perner et al., 1976;Platt and Perner, 1980;Mount et al., 1987;Solomon et al., 1987;Wahner et al., 1990;Burrows et al., 1999;Platt and Stutz, 2008;Wagner et al., 2008).Based on Beer-Lambert law, it assumes a linear relation between the trace gas concentration (i.e.number density) and the optical depth (OD, logarithm of the intensity; in this paper we specifically define OD as the logarithm of the Sun normalized radiance).
The assumption of linearity, however, generally does not hold for scattered light observations since light takes various paths before observed by the instrument (Marquard et al., 2000).The longer light paths will become more unlikely than the shorter ones due to the higher absorption probability with increasing concentration.Therefore, the OD has a complex, non-linear relation with respect to the concentration of atmospheric absorbers, with higher non-linearity for strong absorbers (in particular ozone).Since the scattering probability due to molecules and aerosols changes with wavelength, this relation also varies with wavelength.Also interferences between different absorbers and especially the impact of strong on weak absorbers need to be accounted for to not under-or Published by Copernicus Publications on behalf of the European Geosciences Union.
overestimate their amount (Puk ¸īte et al., 2010).Due to the strong ozone absorption, retrievals in the UV and in the midvisible (green/yellow) range can be substantially affected.The non-linearity effect is especially strong for observation geometries where the light contributing to the measurement is crossing the atmosphere (in particular the regions where the trace gas of interest is located) under spatially largely distributed paths differing strongly in length and location, like observations of stratospheric trace gases in limb geometry or at high solar zenith angles (SZAs).However, besides observations in these extreme viewing geometries, also nadir observations of tropospheric trace gases might be substantially affected by the non-linearity, especially if the tropospheric trace gas absorption is strong and the probability of multiple scattering is high (e.g. with increased aerosol loads present).Usually, iterative approaches like weighting function DOAS (Buchwitz et al., 2000;Coldewey-Egbers et al., 2004, 2005;Frankenberg et al., 2005) or the so-called full retrieval approach, developed especially for limb measurements (e.g.Rozanov et al., 2005Rozanov et al., , 2011)), requiring extensive radiative transfer model (RTM) calculations wavelength by wavelength in the spectral region of interest, are employed to provide an exact treatment of non-linearity.Besides that, linear retrieval methods for scattered light observations exist, e.g.air mass factor (AMF) modified DOAS (Diebel et al., 1995;Richter, 1997) or the Taylor series approach (Puk ¸īte et al., 2010) that considers the non-linearity by parametrizing it by the fit components.In particular the Taylor series approach, which parametrizes the non-linearity by including first-order Taylor series terms on absorption and wavelength as additional fit components, benefits from being more time efficient because RTM calculations, similarly as for the standard DOAS, are necessary only for the spatial evaluation (trace gas profile or vertical column retrieval) and can be limited to one wavelength.However, to be not only more time efficient (by performing RTM calculations at only one or a limited number of wavelengths) but also more independent from a priori constraints, one might wish to quantitatively characterize the influence on the retrieval caused by strong absorption.One possibility would be to introduce additional functionals quantifying the non-linearity effects, besides the already wellknown conventional box AMFs.The box AMFs quantify the linear influence of the trace gas absorption coefficient on the logarithm of the measured intensity; they describe the effective light paths through a given atmospheric region (also called box or voxel) normalized by the box spatial extent.Here, we parametrize the contribution to OD by non-linearity effects by introducing higher-order box AMFs or higherorder effective light paths.The mth-order effective light path of a set of given m boxes is defined as the weighted mean product of the individual light paths through the m boxes.When normalized by the product of the spatial extents of the considered boxes, the mth-order box AMF is obtained.The higher-order effective light paths can help to approxi-mate the forward model in an optimum way: first, compared to the standard DOAS approach and even to the Taylor series approach (Puk ¸īte et al., 2010), non-linearities will be much better taken into account; second, with respect to both standard DOAS and Taylor series approach, the impact of a priori constraints is minimized; third, compared to the exact formulation of the forward model, it is much more time efficient because no full RTM employment for forward model calculations is needed.We limit most of the descriptions to the second-and third-order effective light paths due to their practical relevance since the effect of even higher-order contribution to OD is practically negligible.
The consideration of the higher-order contributions allows us to implement a two-step algorithm (separating the spectral retrieval part from the spatial retrieval part) even for strong absorption scenarios (similarly as in Puk ¸īte et al., 2010).In addition, the quantitative consideration of these contributions allows us to implement an iterative Gauss-Newton-like retrieval scheme iteratively correcting for the higher-order OD terms not yet considered in the DOAS fit.The concept of higher-order effective light paths allows us to perform the necessary forward modelling quickly (i.e.without employment of RTM calculations).During the iterative process the absorption dependence on temperature and scattering processes are considered as well.
The article is structured as follows: in Sect. 2 we introduce an approximation for the radiative transfer equation considering higher-order OD terms while Sect. 3 generalizes DOAS considering these terms.Section 4 introduces an approach to approximate the effective light paths of any order as function of wavelength and trace gas absorption, which is practical to quickly calculate absorption spectra and the first-order effective light paths.Section 5 discusses the application of the higher-order OD terms in the retrieval algorithms.First, the application of Taylor series DOAS (Puk ¸īte et al., 2010) is studied, and the method is extended by considering the higher-order OD terms of different trace gases.In addition, an iterative two-step approach is described and investigated in sensitivity studies.Section 6 investigates the application of the new retrieval scheme for the retrieval of BrO vertical concentration profiles from satellite limb measurements by SCIAMACHY (Bovensmann et al., 1999) and compares the retrieved profiles with correlative balloon measurements (Dorf et al., 2006(Dorf et al., , 2008;;Rozanov et al., 2011).Finally, Sect.7 draws some conclusions.

Characterization of absorption effects for scattered light observations
The intensity (in our study the Sun normalized spectral radiance) as observed by a detector depends on the paths the photons from the Sun took in the atmosphere until reaching the detector, each with a certain probability to be observed by the detector.The intensity is basically the mean of the probability of the light paths after entering the atmosphere per element of the area, element of time, and element of wavelength to enter an element of the detector's aperture solid angle.The path integral formulation of the radiative transfer equation (compare Marquard et al., 2000) in a discrete summation form is expressed as follows: The weight w T i expresses the probability of such an individual light path i to reach the element of the detector's aperture solid angle.The weight depends on the path's geometry and the optical properties along it (e.g. a path with many scattering events or with high trace gas absorption along it will have low weight).For the sake of simplicity, the limit expression and the summation boundaries along i will be skipped in the following.
One can modify the weight w T i , describing it as a product of the weight without absorption w i and the absorption probability along the path i (this is possible because Beer-Lambert law holds along one single path).The equation becomes where l ij describes the path length of the ith trajectory through the box j (if the box is not crossed by a path, the length for the path there is 0).The summation is performed over all boxes belonging to the atmosphere.To correspond to reality, each box has to have an infinitesimally small 3-D volume.β kj is the absorption coefficient of absorber k in box j , i.e.
where c kj and σ kj are number density and absorption cross section of the absorber respectively.Since the Beer-Lambert law does not hold for a multiple light path ensemble, only its approximation can be applied for scattered light observations.Standard DOAS applications use its approximation up to the first order.However, for strong absorption scenarios the first-order approximation is usually not applicable, which motivates us to investigate the use of higher-order terms.
Expanding the OD τ (i.e. the logarithm of the intensity of Eq. 2) in a Taylor series (similarly as it is done by Deutschmann, 2014) with respect to the absorption coefficients β kj up to the third order (at zero absorption background), one obtains Here, τ 0 , referred to in the following as the scattering term, is the contribution to the OD that can be explained by scattering processes only because the weights w i do not contain any absorption.Because of its broadband variation with wavelength, it can be approximated by a polynomial in DOAS.It equals the logarithm of the intensity I 0 which would be obtained for the atmosphere without any absorbers present (note that I 0 in this notation is not the solar irradiance).The rest, i.e. the difference between the observed OD τ and the scattering term, is the OD caused by absorption.τ 1 , τ 2 , and τ 3 approximate the absorption OD of the first-, second-, and third-order contributions to the observed absorption OD by k absorbers present in the atmosphere.We refer to them in the following as first-, second-, and third-order OD terms respectively.The first-order OD terms can be also called linear OD terms while the second-and third-order (and even larger-order) OD terms are generally referred to as higherorder OD terms in the following.The terms of a respective order are always a sum of the OD contributions of different trace gases.The absorbers k, K, and k index all trace gases belonging to the same atmospheric scenario, and also j , J , and J index all boxes belonging to the same scenario spanning the whole atmosphere; therefore the summation order, performed over all trace gases and all boxes in the atmosphere, is arbitrary.While the linear OD term is a sum of the first-order or linear kth absorber OD contributions, the higher-order OD terms contain contributions arising from the same trace gas (if k = K for the second order, or k = K = k for the third-order OD terms, we call these contributions selfcorrelative OD terms of absorber k) and from the combination of different trace gases (if k = K for the second order, or any k, K, and k is not equal to another one, we call these contributions cross-correlative OD terms e.g. of absorbers k and K).
L j is the effective light path (of the first order) through box j describing the sensitivity of the measurement to a certain atmospheric volume, in this case for an atmosphere without absorbers: (5) A sketch of light paths (with corresponding weights) crossing two boxes (layers) in limb geometry for the illustration of the variables in Eqs. ( 5) and ( 7).w 1 . ..w 4 indicate weights of various paths photons may take travelling from the Sun and reaching the detector.j and J represent two arbitrarily selected boxes (layers); l 1...4,j ...J are the lengths of individual paths through the boxes.The red arrows indicate that the path is obtained by summation of the two segments through the box.In the case the path is not crossing any box, its length is 0, as indicated.
Normalized by the box dimensions (usually by the height) a dimensionless functional is obtained: the well-known box AMF (e.g.Wagner et al., 2007), However, the higher-order OD terms of Eq. ( 4) introduce the new sensitivity functionals: L 2j J and L 3j J J .The secondorder effective light paths are Figure 1 illustrates the terms used in Eqs. ( 5) and ( 7) assuming a finite-sized 1-D layered atmosphere in limb geometry (as it is used in the studies later in the paper).
L 2j J (Eq. 7) are weighted products of the individual light paths through two boxes characterizing how dependent the sensitivity is between two different places in the atmosphere; i.e. they account accounts for all those light paths which cross both selected boxes.If, however, both or any of the boxes are not crossed, the individual contribution of the light path is 0. In the case of complete correlation, which occurs if only one light path exists (like for direct light observations), the second-order effective light paths are equal to the product of the corresponding first-order effective light paths: In this case the contributions to the observed intensity by the second-order terms like in Eq. (4) (and even higher-order terms) become 0 and the DOAS equation becomes linear.In Sect.3.4.3 the properties of second-order contribution are investigated in more detail.
The third-order effective light paths are They are weighted products of the individual light paths through three boxes and quantify the dependence of the sensitivity between three locations in atmosphere; i.e. they account for all light paths which cross the given three boxes.
In general, any order effective light paths can be defined, if the Taylor series expansion of Eq. ( 2) is extended to higher orders.The mth-order term for any M boxes is Note that similar to the effective light paths L j , the higherorder effective light paths are equal for any trace gas considered in a certain atmospheric scenario.
In analogy with Eq. ( 6), also higher-order box AMFs can be defined when normalizing effective light paths by the extension (height) of the involved boxes: 3 Generalization of DOAS for scattered light observations In the next subsections the validity of different approximations (linear or higher-order terms) in DOAS is assessed.

Linear approximation in the standard DOAS
In order to get a DOAS-like equation (as in Platt and Stutz, 2008) for scattered light observations, one basically needs to consider the Taylor series expansion of the logarithm of the intensity in Eq. ( 4) up to the first order as a function of wavelength within a certain wavelength interval (fit window): In the following we skip explicitly mentioning the dependence of wavelength for better readability.
The wavelength dependence of the scattering term τ 0 is usually approximated by a low-order polynomial, i.e.
where λ is wavelength, P indicates the polynomial order, and a p is the coefficient of the different polynomial orders obtained by the fit.
τ 1 in Eq. ( 12) approximates the absorption OD.Considering wavelength-dependent absorption cross sections but neglecting their spatial variation (assuming them being constant with temperature and pressure), the linear OD term τ 1 can be rewritten: with the slant column density (SCD) S k in the DOAS literature referred to as the concentration of absorber k integrated along the effective light path.
Since the Taylor series expansion is performed at the absorber concentration of 0, the SCD in the formula represents the effective light path for an atmosphere without absorption.Here it is important to note that for strong absorption scenarios the retrieved SCDs as well as ODs are systematically underestimated by this linear model derived at a zero absorption background, because longer light paths contributing to the measurement become less probable.This leads to the fact that a fit by a linear model will distribute the higherorder absorption structures (with a mostly negative contribution by the second-order terms, see also the discussion later in Sect.3.4.3) to the existing fit parameters (polynomial and cross-section terms).This effect can be addressed in a simple way by calculating effective light paths, including a background absorption scenario (like it is done in Puk ¸īte et al., 2006;Kühl et al., 2008), when inverting SCDs into spatial concentration distributions.This is, however, only useful for minor absorbers because the relationship between the OD and their concentration is still almost linear.Applied to the spatial interpretation only, this simple approach, however, still does not consider spectral effects, i.e. systematic biases in the fitted SCDs due to ignoring higher-order absorption structures.

Taylor series approach
In order to correct for both the systematic underestimation of the SCDs and to minimize the systematic biases due to higher-order absorption structures, the Taylor series approach in Puk ¸īte et al. ( 2010) suggested considering the firstorder dependency of the SCD in Eq. ( 12) on absorption and scattering as Thus the OD terms for absorber k are given as whereS k,λ , describing the first-order variation with wavelength, and S k,σ , providing the first-order effect on trace gas absorption, are directly obtained by the DOAS fit in addition to S k,0 , parametrizing the constant part of the SCD.In this way the linear dependence between the logarithm of the intensity and the absorption parameters is kept and the retrieval problem can be solved by the usual least squares technique.Puk ¸īte et al. ( 2010), however, have not yet provided a quantitative interpretation of these terms in the framework of radiative transfer theory (see next subsection).
This modification for a strong absorber (ozone) in the UV largely decreased the fit residual structures.However, at the same time a systematic bias for weak absorbers still remained (see Figs. 12 and 13 therein) because the method does not account for cross-correlative terms (see also Sect. 5.1 where we evaluate the method in sensitivity studies).
Besides the significant improvements, the method relays on a priori considerations when performing radiative transfer simulations for the spatial evaluation.Effective light paths are evaluated at a background a priori scenario at an empirically selected wavelength (where the discrepancy with the true profile as determined in sensitivity studies is minimum).
Another possibility to address the problem of scattered light observations with a linear method is the so-called AMF modified DOAS (Richter, 1997;Diebel et al., 1995), where the product of a wavelength-dependent AMF and cross section is used instead of a simple cross section.This method, however, relies on the approximate knowledge or assumption of atmospheric properties, e.g.trace gas and aerosol profiles, for AMF calculation which needs to be performed at all wavelengths used for the spectral analysis.

Higher-order DOAS
Instead of limiting the DOAS fit to the linear approximation (standard DOAS) or a qualitative consideration of the variation of the SCD due to scattering and second-order selfcorrelative absorption effects (Sect.3.2, Puk ¸īte et al., 2010), all higher-order terms of Eq. (4) can generally be considered in a quantitative manner.
Like for the approximation of the wavelength dependence of τ 0 in the linear case (Eq.13), the wavelength dependence of the effective light paths can also be approximated by polynomial functions in Eq. ( 4), in particular because the weights w i in Eqs. ( 5), (7), and (8) are broadband functions of wavelength since they consider only scattering: J. Puk ¸īte and T. Wagner: Quantification of non-linearity in DOAS Here λ p , their products with the cross sections, as well as their product with the products of the cross sections are fit components describing the wavelength dependence of the OD due to scattering.The retrieved quantities from the fit are the polynomial coefficients a p , SCD coefficients S pk that parametrize the wavelength dependence due to the scattering, as well as the higher-order SCD coefficients S pkK and S pkKk .
Considering only the τ 0 and τ 1 terms with P k = 0 the equation corresponding to standard DOAS is obtained (compare Eqs. 12-15).Considering also terms with P k = 1 (at least for strong absorbers) and including also τ 2 terms with k = K (for strong absorbers) and P kK = 0 the approximation by the Taylor series approach (compare Eq. 17) is obtained.
The retrieved SCD coefficients are quantitatively associated with the trace gas concentration by j L j c kj = P k p=0 λ p S pk . ( The equation allows to investigate the relationship between the SCD and the spatial distribution of an absorber at any wavelength.Additionally, it is worth noting that the SCDs correspond to an atmosphere without absorption, so theoretically it is possible to calculate the effective light paths L j for such an atmosphere.This eliminates the need for a priori profile assumptions in the calculation of effective light paths and avoids possible subsequent iterative steps.For the second-and the third-order SCD coefficients the relations can be written in a similar manner: and From these two equations it can be basically concluded that additional information for the spatial evaluation can be derived by fitting higher-order terms, because different-order effective light paths have different spatial distributions, as discussed later.Note, however, that in this paper the content of this additional information is not explicitly investigated because (a) a non-linear inversion algorithm would be needed and (b) the high correlations between the derived SCDs of different orders, especially when they correspond to the same trace gas.For accurate fit results, in general a very good measurement quality (e.g.signal-to-noise ratio) would be needed and, depending on the trace gases of interest and the measurement quality, fit components with a significant contribution to the fit results should be considered.

Terms for second-order approximation
In order to account for the effects caused by strong absorption scenarios, let us consider the higher-order Taylor series terms of the expansion in Eq. ( 4).The considerations are generally valid for combinations between any absorbers, both strong or weak.However, the impact of combinations between weak absorbers only is expected to be negligible.
The second-order Taylor series term for any two absorbers X and Y can be split in three terms: two for the respective absorbers (important for strong absorbers only) and one for an interdependence between the absorbers (can be important also for combinations between a strong and a weak absorber).

Self-correlative term
Assuming that the cross section is invariant in space (i.e.neglecting its variation due to temperature and pressure), the single trace gas term for any of the trace gases is Including the squared cross section in the DOAS fit (Puk ¸īte et al., 2010), the second-order contribution of the absorber is accounted for.

Cross-correlative term
The second-order Taylor series expansion term for the interdependence between two absorbers X and Y is (24)   Note the similar structure with Eq. ( 23).Since the term consists of the product of the absorption cross sections of two absorbers, it will be important if absorption structures of two absorbers overlap.However, it is left open how much this term contributes to the SCDs of different trace gases; i.e. it describes the contribution to both trace gases in a similar way.It seems clear that the error on the SCD fitted by the linear DOAS model will be largest for weak absorbers since the ratio between the OD associated with the cross-correlative term and the OD attributed to the trace gas will be larger for a weak absorber: As an example, the Taylor series terms of different orders for a typical scenario with ozone (strong absorber, vertical column ∼ 460 DU), NO 2 , and BrO (weak absorbers) characteristic for subarctic latitudes in March (see Table 4.1 in Puk ¸īte et al., 2010, for more details) are calculated by applying the effective light paths of different orders.The wavelength-dependent effective light paths for the atmosphere without absorption are derived from trajectory ensembles generated by McArtim (Deutschmann et al., 2011;Deutschmann, 2014) (see Appendix A for calculation de-tails) at distinct wavelengths and approximated (see Sect. 4.1 for details) by a broadband function for wavelengths in between.The obtained Taylor series terms for UV and visible (VIS) spectral ranges are plotted in Fig. 2a and c, for a limb viewing geometry, tangent height (TH) of 19.83 km, and solar zenith and azimuth angles of 75 and 60 • respectively.First of all, non-negligible values of the ozone self-correlative term and the inter-correlative terms between ozone and the minor absorbers with respect to their first-order terms can be seen.For ozone, also the third-order self-correlative term shows rather high values.The importance of these and other higher-order terms should be evaluated also taking into account the detection limit of a particular detector.The consideration of the higher-order effects for measurements in limb geometry is most important at wavelength regions with strong ozone absorption bands for an instrument able to detect ODs below 10 −2 to 10 −3 .
The lines in Fig. 2b and d, show the sum of the absorption OD terms up to a certain order, while the dots indicate the true total absorption OD (logarithmic ratio between intensities simulated by the RTM without and with absorption).The overestimation of the absorption is clearly seen if only the first-order (linear) terms are considered.Second-and third-  order terms improve the characterization converging towards the truth.

Second-order effective light paths
The second-order effective light paths (Eq.7), as well as higher-order terms, similarly to the classical first-order effective light paths (Eq.5), depend on the specific composition of light paths for the observation geometry and are the same for any trace gas; i.e. the effective light paths only quantify the effect to the observed OD an absorber will have if it is "added" to the atmosphere.
Let us analyse the difference between the second-order effective light paths and the products of the corresponding (first-order) effective light paths (as in Eqs. 4,22,23,or 24) in more detail.This difference can in principle be used as a measure for the complexity of the observation geometry.Indeed, this difference in mathematical terms is the covariance between light paths through the two boxes.For the same box (if j = J ), the difference (i.e. the variance) will always be ≥ 0 because the mean of the squared light paths is larger than the squared mean of the light paths if their lengths and/or weights are not equal.For different boxes (j = J ) the difference can be both positive or negative but not smaller than the negative product of the effective light paths.A negative difference will appear when the two boxes are more likely to be crossed by different than by similar light paths (e.g.different parallel light paths from the Sun towards the line of sight of the instrument).In practice, the difference will be larger if boxes j and J are located close to each other (with maximum if j = J ) and will decrease and even become negative with increasing distance between the boxes j and J .Negative values are more characteristic for 2-D or 3-D simulations of the atmospheric radiative transfer where it is more likely that different light paths cross a different set of boxes; for the 1-D atmosphere considered here mostly positive values occur.
For illustration, first-order effective light paths and their product between different boxes are compared with secondorder effective light paths in Fig. 3.The same scenario as in Fig. 2 is used.Similar to first-order effective light paths, as well as for the second-order effective light paths, the highest sensitivity is found around the tangent point.Interestingly there are also the largest differences between the secondorder effective light paths and the products of the first-order effective light paths are observed.This finding is caused by the increased scattering probability around the tangent point, which causes photons to change the direction.This also means that higher-order contributions to the OD increase for absorbers located around the tangent point.The different sensitivity distributions of the different-order terms could in principle allow us to obtain additional information about the spatial distribution of the absorber.However, this possibility is not further exploited in this study.

Third-order contribution
Continuing analysing the Taylor series expansion of Eq. ( 1) one can realize that the sensitivity at a certain region can depend not just on the sensitivity at another region but also on a third (Eq.4), fourth, etc. region (there can be also higherorder self-correlation terms for the same region as well).A quantitative description of the third-order terms as being products of cross sections is provided in Appendix B; also these third-order products of cross sections can be included in the DOAS fit as fit parameters.
From Fig. 2a and c one can see that, for the chosen scenario, even the third-order self-correlative ozone term might be important enough to be considered in the fit.It can be seen however that the third-order terms generally show a smaller contribution than the second-order terms.Since the measurement geometry along with the radiative transfer introduces strong limits for the possible trajectories, it is likely that higher-order contributions decrease rapidly with increasing order.Besides the Taylor series nature that diminishes the importance of even higher terms according to 1/n!, it is less likely that the light path that had crossed some atmospheric box/boxes will cross another one in a completely different way, because the probability of scattering events per trajectory decreases rapidly.The cases where many scattering events are still probable (e.g.highly polluted regions with high aerosol load or fog) would be worth investigating in a separate study.

Limitation of the classical absorption optical depth definition
In this section we discuss the applicability of fundamental concepts of DOAS for scattered light observations.We show that these fundamental concepts are not valid in a strict sense.
While the basic consequence of these findings is a possible bias of the results of the retrieval algorithms, in practice these aspects are specifically relevant only for scenarios with strong non-linearities due to absorption.For most of the current DOAS applications these considerations are not relevant since they are performed in the (almost) linear range.
The fundamental assumption by the standard DOAS is that it is possible to distinguish between absorption features of different trace gases; i.e. the total atmospheric absorption τ T (being the logarithm of the ratio of the intensity without and with absorption) can be written as the sum of the ODs of the individual absorbers: In order to improve the fit results sometimes differential cross sections are applied to better differentiate between narrow and broadband spectral features, but it is always assumed that each trace gas has its own "fingerprint" completely originating from the contribution of this absorber.
Looking at the higher-order terms discussed before, it becomes clear that this assumption (Eq.26) does not hold for scattered light observations; i.e. there are "fingerprints" that cannot be attributed to one trace gas alone but are products of several absorbers.
It can in particular be shown that the definitions for optical quantities (OD, SCD, and AMF) used in DOAS are affected.In the following subsections we discuss two commonly used definitions: the classical definition and the light path integral definition.They are introduced in the following subsections.

Classical definition
The classical definition is used by e.g.Perliski and Solomon (1993) and defines OD as the logarithmic ratio of intensities with and without an absorber of interest.
The OD τ c according to this definition (c stays for "classical") for absorber X in terms of Eq. ( 2) is expressed as where I −X means intensity without absorber X and β Xj is the absorption coefficient of the selected absorber.One might wish to investigate the contribution of each absorber to τ T in Eq. ( 26) separately.However, looking at Eq. ( 26) one can immediately realize that for two absorbers X and Y : .
This means that the OD of two absorbers (or any number of absorbers) is not equal to the sum of the individual ODs of these absorbers, since the cross-correlation between different absorbers is not considered.Consequently also the definitions of the AMF (Perliski and Solomon, 1993), and SCDs, are fundamentally inaccurate (V X is the vertical column density (VCD) of absorber X).Hence not only standard DOAS but also "extended" applications like the AMF modified DOAS (Richter, 1997) are affected.Figure 4, top panel, shows the difference between the sum of the ODs of the considered absorbers in the UV, each calculated by the classical formula, and the OD calculated for the whole absorption for the limb observation scenario with TH = 19.83km and absorptions of ozone, NO 2 , and BrO at distinct wavelengths.The rest OD is compared with the sum of the inter-correlative terms estimated by Eqs. ( 24), (B2), and (B3) which is approximated by the approach in Sect.4.1 for all wavelengths within the spectral region similarly as for Fig. 2.Not accounting for the inter-correlation between the different absorbers represents the approach of simply adding the individual ODs obtained by the classical definition.While perfect agreement is found between the rest OD and the sum of the inter-correlative terms of up to the third order, already the sum of the second-order terms gives good agreement for most wavelengths.While the difference is shown for wavelengths where the full radiative transfer simulations are performed, the wavelength-dependent effective light paths are approximated for all wavelengths in the spectral interval similarly as for Fig. 1 (see Sect. 4.1 for details).The bottom panel shows the corresponding results for an absorption scenario containing ozone and NO 2 in VIS.Here almost perfect agreement is found already by the second-order absorption contribution alone.
The result in Fig. 4 is confirmed when the OD according to the classical definition is expressed including higher-order terms.The exact intensities with and without absorbers in Eq. ( 27) are replaced with the derivation according to Eq. ( 4).Considering terms up to the second order, the OD is or, assuming spatially constant cross sections, The classical OD thus consists of the absorption terms attributed to the absorber X (the first two terms on the right side of the equation) and the cross-correlative terms between X and all remaining absorbers.Summing the ODs for the individual absorbers, the cross-correlative terms are considered twice as much as in the OD calculation for the whole absorption together while the self-correlative terms are considered correctly.

Light path integral (LPI) definition
Alternatively, according to the LPI definition, the SCD is defined as concentration integrated along the effective light path (e.g.Wagner et al., 2007).In the discrete notation it can be written as with L T the effective light path obtained for the given absorption scenario: The corresponding definition of the OD would be the integration of the absorption coefficient along the effective light path: Please note that τ lX = τ cX (as defined in Eq. 27), indicating that even for a scenario with only one absorber present in the atmosphere (where the classical definition in Eq. 27 is valid), LPI definition (Eq.37) is not accurate.Also for the LPI definition the optical quantities can be expressed in terms of the Taylor series expansion.Expanding effective light paths in Eq. ( 36) in Taylor series on absorption (for more details see Sect.4.2) up to the second order, the SCD in Eq. ( 35) becomes Please notice that the summation over k here also contains the self-correlative term, i.e.X ∈ k.In a similar way the OD can be approximated: While the application of the LPI definition has been limited for retrievals of weak absorbers (Marquard et al., 2000), Eq. ( 39) allows us to quantify the disagreement for strong absorbers which arises due to overestimating the selfcorrelative term by 2 times in comparison to the classical definition (Eq.27, see its approximation in Eqs.32 and 33).The disagreement in principle can now be addressed in a quantitative way.The cross-correlative terms in the LPI definition are considered in a similar way as in the classical formula.

Taylor series definition
In summary, both the classical and LPI definitions of the OD are not applicable to describe the total OD as the sum of the ODs of individual absorbers.In order to obtain an agreement between the sum of the approximations of the ODs τ k of individual absorbers and the Taylor series approximation of the total OD in the form of Eq. ( 26), one can use arbitrary factors a X,k instead to assign part of a cross-correlative OD to absorber X and factors a k,X to assign another part of it to absorbers k, so that for any k = X and with a X,k=X = 0.5.The OD definition for absorber X can then be written as Therefore, the approximation of the OD in Eq. ( 41) can be used to define more appropriate quantities for SCD and AMF in Eqs. ( 30) and ( 31) for applications e.g. in forward modelling for profile retrievals (e.g.satellite limb or DOAS observations) like in AMF modified DOAS.The selection of factors a X,k is arbitrary and could depend on empirical considerations like the correlation properties between different fit parameters; e.g. for an absorber with more pronounced differential structures one could assign a larger factor, to account for the fact that cross-correlative structures can become assigned to the strong absorber in the fit.In other words, the (self-correlative) absorption structure would largely correlate with the cross-correlative structure.

Variability of effective light paths on absorption and on wavelength
The effective light paths defined earlier (Eqs.5, 7, 9, 10) are calculated at a single wavelength and at no absorption while the DOAS fit is performed over a selected wavelength range, over which the effective light paths change due to both variability of the trace gas absorption and the atmospheric scattering properties.Therefore it can be worth it to parametrize the different effective light paths to account for such variations in the retrieval process without the need to perform RTM simulations for every wavelength within the fit window if atmospheric trace gas composition changes.

Wavelength dependence due to scattering
If the Taylor series expansion of Eq. ( 2) is made and effective light paths are calculated at zero absorption, the wavelength dependence of the weights w i is caused only by changes of the scattering probability.So basically, light paths and their weighs w i as in Eq. ( 2) are different at different wavelengths.
Since the variation is broadband, one can approximate it by a broadband feature.For example, one can simulate effective light paths (box AMFs) of different orders at selected wavelengths by RTM and interpolate them or fit the variation by a low-order polynomial or some other characteristic function.For illustration, box AMFs as function on wavelength are shown in Fig. 5, where RTM simulations are performed at five different wavelengths and approximated afterwards.

Dependence on absorption
By performing the Taylor series expansion at a zero absorption background, effective light paths at an arbitrary absorption background can be expressed as a function of higherorder effective light paths (simulated for the zero background scenario) and the trace gas distributions.Assuming spatially constant cross sections, the first-order effective light path (as in Eq. 5) for box J considering terms up to the third order can be written as Here, trace gas concentrations c k and c K , with k and K belonging to the same set, are arbitrary background distributions of trace gases.
Such a parametrization is practical, since it can be used for any wavelength chosen for the procedure described in Sect.4.1 and for any trace gas cross section as well as for any trace gas profile.This allows us to adjust the effective light paths with small calculation effort, which can be useful for iterative Gauss-Newton-like retrievals, for which a linearization of the forward model is used and is updated after each iteration.An iterative algorithm can be implemented without the need to redo (much slower) RTM simulations.With small modification (see next subsection) the spatial variability of cross sections can also be accounted for.Figure 6a shows an example for first-order effective light paths calculated for a certain absorption background from the effective light paths obtained for an atmosphere without absorption at 545 nm.Good agreement with the effective light paths explicitly simulated for the given absorption is obtained.Already if only the terms up to the second order are considered, the discrepancies are hardly visible by eye; the largest discrepancies are located around the tangent point where most non-linearities come from (compare Fig. 3d).The absolute contributions of the different-order terms are shown in panel b.Panels c and d show the relative differences between the approximated and explicitly simulated effective light paths for the absorption background at different wavelengths in the mid-visible spectral range (520-570 nm).The largest discrepancies occur for altitudes below the TH, which usually only have a small effect on the measured signal.The largest contribution to the measurement is from and above the TH.At TH they exceed 2-3 % only at 570 nm where the absorption OD is extremely strong (around 0.9, see Fig. 2b); above the TH the discrepancy strongly decreases and converges towards 0 a few kilometres above the TH.

Spatial variability of cross sections
If the true cross-section σ T kj at a certain box j differs from the originally considered cross-section σ k , the impact on the absorption will be the ratio of both R kj = σ T kj /σ k .In this way one can apply modifications by scaling the concentration of absorber k by the cross-section ratio in the respective formulas.For example, the SCD in Eq. ( 15) can be rewritten in order to account for the spatial variability of the cross section and/or the cross-section variation on wavelength: In a similar way, i.e. by considering the cross-section ratio, one can also include modifications in all other formulas containing products of cross section and concentration.strong and weak absorbers can be considered by adding the cross-section product of both in the fit to improve it.Although easy to implement, several reasons complicate the use of these cross-correlative terms for a non-iterative retrieval algorithm.They include (1) considerations about the optical quantity definitions (they are now clarified in Sect.3.6); (2) the spectral structures of these terms showing spectral features similar to the cross sections of the absorbers involved -thus, they provide the potential for high correlations between the different terms included in the DOAS fit; (3) these terms are a product of contributions of different absorbers and their interpretation (e.g. for the retrieval of spatial distributions) requires a non-linear algorithm.
The OD for two absorbers considering terms up to the second order can be written as Standard DOAS accounts only for the terms 1 and 2. Taylor series DOAS as discussed in Puk ¸īte et al. ( 2010) also added term 3 (X ≡ O 3 ).(Terms considering the wavelength dependence caused by scattering are also added but are not shown here.)Term 4 can be neglected for weak absorbers, while term 5 shows a rather strong impact if it is ignored (see e.g.Fig. 12, bottom panel, in Appendix C in Puk ¸īte et al., 2010).However, adding it, assuming it as fully contributing to the OD of the weak absorber, leads to errors by overestimating the impact of the cross-correlative terms (Sect.3.6).The effect is not negligible with respect to the OD of the weak absorber (see Eq. 25).If, however, the correlative term is added to the fit but not assumed to contribute to the OD of the weak absorber, the fit coefficient of term 2 of the weak absorber is basically an approximation for the SCD, which would be measured if the strong absorber X were not present in the atmosphere.So one could in principle retrieve profile information from the retrieved SCD by performing forward modelling for an atmosphere with no absorption.In practice, however, this first-order term has a rather high fit error because of the correlation with the inter-correlative term and consequently the profile retrieval will give rather poor results (the retrieval error is high).
As an example, a DOAS evaluation for a simulated limb spectrum at TH = 19.83km by applying the Taylor series approach is shown in Fig. 7 for the wavelength range 518-570 nm, where the ozone absorption is especially strong.The fit window in the green spectral region was selected for demonstration purposes primarily because of the strong O 3 absorption.However, the NO 2 absorption is in general relatively well structured in this spectral range and has been investigated before (see e.g.Bauer, 2012).For limb measurements this wavelength could be beneficial due to the decreased Rayleigh scattering probability (in comparison to the blue spectral region where the NO 2 retrieval usually is performed), thus causing an increased sensitivity to the lower www.atmos-meas-tech.net/9/2147/2016/Atmos.Meas.Tech., 9, 2147-2177, 2016 NO 2 First-order Taylor series term w.r.t.NO 2 absorption Ozone and NO 2 Second-order cross-correlative term between ozone and NO 2 absorption Figure 8.(a) Results for the same evaluation as in Fig. 7, but wherein the fitted ODs are summed according to their attribution to certain trace gases or scattering in Table 1 in comparison with a calculation from the simulated terms at zero absorption (formulae in the right column in the table).(b) The same as (a) but the retrieval is performed with random Gaussian noise added to the simulated spectra.The noise leads to an increase of the fit residual by 10 times when compared to the fit in Fig. 7.The results are shown for 10 different random noise spectra.
stratosphere.The fit parameters are listed in Table 1.To improve the condition of the forward model matrix, the polynomial parameters are orthogonalized.The same is done for the ozone parameters; they are all labelled as O 3 in the figure.
Ten million light path trajectories are used to simulate the spectrum, reducing the noise below 2 × 10 −4 and allowing a good distinction also for the second-order cross-correlative term between ozone and NO 2 .Figure 8a compares the fitted ODs summed according to their attribution to certain trace gases or scattering in Table 1 with the corresponding values calculated according to the formulas in the right column of the table.Good agreement is obtained.It can be seen that the OD of the second-order cross-correlative term between ozone and NO 2 is approximately 10 % of the OD of the firstorder NO 2 term -a potential error source if neglected.Addi-tionally, the effect of spectral noise on the retrieval is investigated by adding random Gaussian noise to the simulated spectrum.The noise level was selected in a way that the fit residual increased ∼ 10 times (i.e. to 2 × 10 −3 ) in comparison to the residual for the original simulation.Figure 8b, however, reveals even higher variation (up to 5 × 10 −3 ) in the fitted NO 2 and cross-correlative terms.The variation can be explained by the large covariance between both terms as shown in Fig. 9 where the statistical properties (mean and standard deviation) of the fitted NO 2 ODs are plotted in comparison with the calculated OD from the effective light paths.
For the statistical analysis, the spectrum with 1000 different noise spectra added is evaluated.It can be seen that ( 1) the random noise does not introduce obvious systematic error structures in the fit result and (2) the standard deviations , second-order cross-correlative term between ozone and NO 2 (middle), and the sum of both terms (bottom).For a statistical evaluation, the analysis is performed for the simulated limb spectrum at TH = 19.83km with 1000 different random Gaussian noise spectra added (similar as for Fig. 8).Additionally, the respective terms as calculated from the simulated effective light paths are included for comparison.
of the two fit terms are very similar in magnitude, but the standard deviation of their sum is reduced almost 3 times.In practice, therefore, it might be necessary to utilize both terms together for further retrieval steps (e.g.spatial evaluation).
To illustrate the consequences if the cross-correlative term between ozone NO 2 is not considered, Fig. 10 compares the resulting NO 2 OD when the cross-correlative term is either included or excluded from the fit with the simulated OD according to the classical formula (Eq.27).Also the respective second-order approximation for the classical definition calculated from the effective light paths, i.e. the sum of the firstorder NO 2 and the cross-correlative term between ozone and NO 2 (compare to Eq. 33), are plotted.Again a good agreement between the simulated classical OD and the approximated OD can be seen (around 1 %).It is also found that the agreement with the evaluation, where both the first-order NO 2 and the cross-correlative terms are included, is varying only slowly with wavelength and is less than 2 %.In addition to the higher-order effects, the variation might also be caused by broadband (i.e.scattering) features, the RTM behaviour (e.g.Monte Carlo (random) noise in the simulated spectrum), as well as numerical effects in the fitting algorithm.If however the first-order NO 2 term is left alone in the fit, the relative difference has a similar shape as the ozone cross section with the amplitude 1 order of magnitude larger reaching 15 %.

Suggested algorithms
Due to the high cross-correlation between the different-order terms, the retrieval quality, among others, depends on the way this correlation is accounted for.There are several possibilities, including the following.
Figure 10.Results of the Taylor series DOAS evaluation either with or without the cross-correlative ozone and NO 2 term included in comparison with the NO 2 OD simulated according to the classical definition (Eq.27) and calculated from the simulated effective light paths (up to the second order).Top: absolute values of the simulated, calculated, and retrieved OD for NO 2 ; bottom: relative difference with respect to the classical definition.
1.A quasi-full retrieval approach: the absorption OD is calculated for each considered spectral point according to the higher-order Taylor series expansion, and the forward model is linearized at a certain a priori absorption scenario first.Then an iterative least squares fitting technique is applied, improving the knowledge about the trace gas distributions and adjusting the forward model according to the improved knowledge using the effective light paths of different orders.This method must be the most accurate but also the most time consuming as the calculations are performed for every spectral point and the inversion is performed for all spectral points and measurement geometries together.However, it will be still faster than the classical full retrieval approach (like e.g.Rozanov et al., 2011) because no new full RTM calculations are necessary for each iterative step.
2. A quasi-AMF modified DOAS: other than the method by Richter (1997) where full RTM simulations are performed for every wavelength, here AMFs are calculated utilizing pre-calculated effective light paths of different orders.Also the limitation of the classical AMF definition for a multiple absorber scenario can be considered, e.g. by applying the OD definition according to Eq. ( 41) when calculating AMFs e.g. with Eq. ( 30).AMFs for the retrieved trace gases are calculated for an a priori profile and are applied in a DOAS fit for each measured spectrum separately.Then the trace gas profile is retrieved and the procedure is repeated, adjusting the AMFs e.g. with respect to the spatial variation of the cross sections.This method is also quite accurate since it considers effects between different absorbers in the approximation of the AMFs.Similar to the previous method, calculations (in this case of the AMF) for ev-ery spectral point are required, but the spatial retrieval is done separately from the spectral analysis (which is used to retrieve the SCDs, the product of the fitted VCD and AMF), so the inversion requires less computational resources.
3. A two-step approach: in the first step, a DOAS fit is performed including all relevant higher-order terms, especially the inter-correlative terms.In the second step (spatial retrieval) a retrieval procedure is performed at a selected wavelength (e.g. a wavelength in the centre of the spectral region or where the differential absorption structures are largest or the absorption effects smallest; with a compromise between all these aspects).Further, there are two possibilities: (a) to use only the first-order terms which describe the absorption as there would be no influence of absorption by the strong absorber (the absorption is sorted out by the cross-correlative term) or (b) to perform a non-linear spatial retrieval including the cross-correlative terms of the absorber of interest and the strong absorber.The possible correlation between the different-order terms of the same absorber, however, needs to be approached iteratively and at the end a balance between (a) and (b) should be found.This method would be faster than the first two, because the effective light path calculations are limited to one or few spectral points.At the same time, the algorithm would be less accurate due to the correlations between the first-order and second-order terms in the DOAS fit, or, in other words, it requires better measurement quality.

An iterative two-step approach
While the implementation of the algorithms mentioned in Sect.5.2 is not a subject of this study, for the following sections a fourth retrieval approach is suggested and implemented.Here the effective light paths up to the second order are applied to retrieve vertical profiles.The principle is similar to the two-step approach in Puk ¸īte et al. (2010).First, a DOAS fit is performed to the measured spectra; second, the spatial information is obtained by inverting the fit results to vertical profiles.The substantial change here is the implementation of an iterative process where the retrieved profiles of the relevant trace gases are used to recalculate the absorption spectra.In the subsequent iterations, the measured spectrum is corrected by the calculated spectra, the DOAS fit is performed for the difference, and the retrieved profile is adjusted by eventually arising corrections (a Gauss-Newton approach is implemented).

Spectral evaluation
First, a DOAS fit is performed for the simulated spectra similarly as it is done by the non-iterative algorithm described in Puk ¸īte et al. (2010).Afterwards, the obtained SCDs S f parameters (f means fitted) are used to calculate the OD con-tribution for absorber X: where σ iX are the fit components of trace gas X.For weak absorbers they are just the corresponding cross sections of the trace gases (i = 1).For ozone, the additional Taylor series terms are also considered as in Puk ¸īte et al. ( 2010).The calculation of the OD by Eq. ( 45) is performed at the selected wavelength at which the spatial evaluation is later performed.
A wavelength of 342 nm (located between strong ozone absorption bands while close to the most pronounced BrO absorption band) is used for the UV fit window of 337-357 nm.
For the visible fit window (519-570 nm), the selected wavelength is 545 nm.The trace gases X considered in the calculation and in the profile retrieval (see next subsection) are BrO, NO 2 and O 3 , and NO 2 and O 3 for the respective fit windows.

Spatial evaluation
Effective light paths of the first-and second-order precalculated for zero absorption are used to calculate first-order effective light paths for the profiles retrieved in the previous iteration (for the initial iteration for the a priori data) at the selected wavelength according to Eq. ( 42).In order to allow the use for the selected wavelength, they are approximated by a fit with broadband wavelength function (see Sect. 4.1).
The profile is retrieved according to the Gauss-Newton optimal estimation scheme (Rodgers, 2000): starting with the a priori information as a first guess, i.e. c Xi = c Xa (suffixes a, i, and i + 1 indicate quantities corresponding to a priori, existing retrieved knowledge, and updated knowledge during the iteration respectively).τ X is the difference between τ f X and τ Xa for the retrieval of the first guess.τ Xa , which is applied only for the initial inversion, is the absorption OD calculated according to the approximation of the classical definition (Eq.32) for the a priori scenario at the selected wavelength.The elements of the weighting function matrix K Xi are the products of the firstorder effective light paths and the cross sections.Weighting functions and ODs are used instead of effective light paths and SCDs in order to enable the possibility to account for the temperature dependence of the cross sections (important when the retrieval is applied on real measurements).E Xe is the error covariance matrix with the DOAS fit variances as its diagonal elements; the off-diagonal elements are 0. E Xa is the a priori covariance matrix.The applied retrieval settings for the sensitivity studies are summarized in Table 2.A priori information A priori profiles as plotted in Fig. 13 (dashed lines), constant a priori uncertainty at all altitudes (for ozone: 10 × 10 12 ; NO 2 : 1 × 10 9 ; BrO: 2 × 10 7 molecules cm −3 ); smoothing constraint: correlation length of 3 km introduced in the a priori covariance matrix Same as in the UV (for ozone and NO 2 )

Iteration process
In further iterations, absorption spectra based on the firstand second-order absorption terms according to Eq. ( 4) are calculated from the profiles retrieved in the previous iteration for all wavelengths within the fit window considering the broadband approximation and possible temperature dependencies of the cross sections.Then the DOAS evaluation is performed again for each measured spectrum corrected by subtracting the calculated absorption spectrum (referred to as corrected spectrum in the following).The spatial retrieval is performed again, and the difference τ in Eq. ( 46) is now the newly obtained τ f X .Since the retrieval is converging fast, a fixed small number of iterations (2-3) is selected.

Sensitivity studies of the retrieved profiles
We apply the iterative two-step algorithm to absorption spectra simulated at different THs to investigate its ability to reproduce the true profiles used for the simulated spectra and compare it with the linear algorithm where (1) no iteration for spectral correction is applied and (2) a linear forward relationship between the a priori and OD is assumed.The simulation is performed for the same scenario (limb observation in the subarctic atmosphere in spring) as for previous studies; i.e. while the spatial retrieval is performed similarly as the initial step of the iterative algorithm applying Eq. ( 46), τ Xa (as introduced in Sect.5.3.2) is calculated according to the light path integral definition in Eq. ( 39).

Comparison of the iterative two-step approach with the linear two-step approach
Figures 11 and 12 illustrate the effectiveness of the iterative two-step approach for considering spectral features of the involved absorbers in both the UV and VIS spectral range.In both cases the ODs of the minor absorbers (as derived from the fit to the corrected spectra) are reduced to 0 (for the given simulation precision) when the iterative approach is applied.It means that the forward model considers all spectral features correlated with the fitted terms of these trace gases.Here especially the wavelength dependence and cross-correlation between different absorbers is accounted for.Some remaining absorption can still be seen for ozone comparable with the magnitude of the contribution of its third-order terms (compare with Fig. 2).Also an almost complete removal of systematic residual structures is observed for the UV fit window; the rest is most likely attributed to higher-order effects or simulation errors.For the VIS spectral range, where the residual is dominated by simulation noise (although the same number of photon trajectories is simulated by RTM as in the UV, less scattering events are obtained that reduces the statistics), small changes can be observed especially at longer wavelengths with higher ozone absorption.
The obtained vertical concentration profiles are plotted together with the true profiles in Fig. 13 for ozone, NO 2 and BrO (odd plots from left to right respectively).For the linear retrieval, two results are shown: one, for which the effective light paths were simulated for the true scenario (blue lines), and another, for which the effective light paths are calculated for the a priori profiles from the first-and second-order absorption terms simulated at zero absorption background (green lines).For the iterative two-step retrieval, results for the initial retrieval step (red lines) and the final results after iterative process (solid magenta) are shown.For this retrieval, additional results are shown for which the cross-correlative terms are skipped in the calculation of the correction spectra (dashed magenta line), as discussed in detail in Sect.5.4.2.The top panel shows results for the retrieval in the UV and the bottom for the retrieval in the visible.To minimize possible effects due to spatial under-sampling and a priori constraints, true profiles smoothed by the averaging kernels (according to Connor et al., 1994) are also included in the figure.Even plots show the relative difference between the retrieved profiles and the smoothed true profile.It should be noted that the differences to the true (non-smoothed) profiles could in principle be caused by a contribution of the a priori profiles to the retrieved profiles (if the effect of the a priori on the retrieval is large).However, here the measurement response (i.e. the sum of the rows of the averaging kernels) is close to unity above the altitude of ∼ 12-16 km (depending on involved trace gases), indicating that this effect is small.Hence the differences mentioned above are mainly caused by the smoothing alone.The ozone profile retrieved in the UV (see two plots on the top, left) shows an overestimation of less than 4 % for all retrieval methods, indicating that there is only a small non-linearity effect on the retrieval of the ozone profile itself.This finding can be explained by the fact that (a) the spatial retrieval is performed at a wavelength outside the strong ozone absorption band, (b) the nonlinearities are largely accounted for by the fitted Taylor series ozone terms, and (c) by a good representation of OD at the selected wavelength by the fitted ozone OD.Interestingly, the iterative retrievals (even without the necessity for iterations) show good agreement (better than 1 %) for altitudes between 17 and 33 km, while the linear retrievals show discrepancies of up to 2 % for practically all altitudes.For altitudes below 15 km, however, the iterative retrieval shows larger discrep- .Odd plots (from left for ozone, NO 2 , and BrO): vertical concentration profiles retrieved by different approaches (blue: linear approach with effective light paths simulated for the true profiles; green: same but with effective light paths calculated for the a priori profiles from the first-and second-order absorption terms simulated at zero absorption background; red: profile retrieved at the initial step of the iterative algorithm; magenta solid line: final profile obtained by the iterative approach; magenta dotted line: final profile but without consideration of the cross-correlative terms between different absorbers) in comparison with the true profile (black dotted line: original profile; black solid line: original profile smoothed by averaging kernels).The a priori is shown as a dashed line.Even plots: relative difference to the true profiles smoothed by the averaging kernels.Top: results for the UV; bottom: results for the visible.
ancies of ∼ 3 %.The application of effective light paths simulated for the true profiles or approximated for the a priori profiles does not give significantly different results: the results agree within ∼ 1 % range with respect to each other.
In the VIS spectral range, much larger ozone absorption occurs, increasing gradually with wavelength.Thus it is not possible to select a wavelength with low ozone absorption for the spatial evaluation.Therefore an overestimation of the retrieved profiles (see the two plots on the left in the bottom panel) by the linear retrieval of up to 16 % is observed because the effective light path integral formalism for the OD used in the retrieval set-up overestimates the higher-order contributions to the absorption OD (see Sect. 3.6).At the same time, in the initial step of the iterative approach the classical OD definition (Eq.32) is used.Thus, higher-order self-correlative terms are correctly approximated in τ Xa in Eq. ( 46) (while the contribution of the cross-correlative terms with weak absorbers can be ignored).Good agreement (within 1 %) is achieved above the ozone peak between 20 and 28 km altitude.Below and above this altitude range the concentration is under-and overestimated respectively by up to 4 %.The iteration process leads to an improvement also for these altitudes, reaching an overall agreement within 1 %.
For weak absorbers like NO 2 , the higher-order selfcorrelative terms are negligible, explaining that almost no difference between the linear retrieval approach and the initial step of the iterative approach (see upper third and fourth plots) is found (because both OD definitions consider crosscorrelative terms similarly).The difference with respect to the truth of up to 25 % for altitudes just below the ozone peak in the UV could be caused by the incorrect accounting for the cross-correlative effect with ozone (either caused by the lower ozone a priori profile in comparison with the truth or the incorrect OD definition; see discussion in Sect.3.6).Since, however, the retrieval utilizing the effective light paths simulated for the true profiles agrees with the retrieval utilizing the effective light paths approximated for the a priori profiles, the discrepancy is not caused by the difference between the true and a priori ozone profile.The most likely reason for the disagreement is the broadband variation of the NO 2 OD with wavelength (due to scattering), since other than for ozone, no variation of the NO 2 OD due to dependence of scattering with wavelength is assumed in the DOAS fit.In addition, in contrast to BrO, the NO 2 absorption features are distributed rather equally along the fit window, while the selected wavelength for the spatial retrieval is more at the beginning of the fit window.The profile obtained after the iteration process, however, shows good agreement (better than ∼ 1 %) with the truth for altitudes above 16 km.Deviations below this altitude for the retrieved NO 2 profiles are caused by the a priori profile according to our investigation.The results for VIS (bottom, middle plots) reveal disagreement with the truth for the linear retrievals when the effective light paths are calculated for the a priori profiles.While above the ozone profile peak an overestimation of up to 6 % is found, below From left: ozone, NO 2 , and BrO: relative differences between the trace gas profiles retrieved at different wavelengths by different approaches with respect to the true profile smoothed by averaging kernels.Rows from top to bottom: retrieval in the UV by the linear approach (with effective light paths calculated for the a priori), retrieval in the UV by the iterative approach, retrieval in the visible by the linear approach (with effective light paths calculated for the a priori), and retrieval in the visible by the iterative approach.
the ozone profile peak an underestimation of up to 8 % occurs.Much better agreement with the truth is obtained for the linear retrieval with effective light paths calculated for the true profiles and the iterative retrieval (better than 2-3 % for altitudes above 20 km).Below this altitude, an overestimation of up to 10 % for the linear and 8 % for the iterative retrieval is obtained, which does not depend on the a priori and is potentially caused by the cross-correlation of the fit parameters or even higher-order Taylor series terms not considered in the retrieval.Due to the almost perfect agreement between the iterative retrieval and the linear retrieval (when the effective light paths are calculated for the true profiles) the assumption of an influence of possibly incorrect crosscorrelative terms in the OD definition can be excluded.Similarly to in the UV, the NO 2 profiles below 14 km are affected by the a priori.The BrO retrieval in the UV (right two plots on the top) shows similar findings as the NO 2 retrieval in this spectral range: an agreement between the linear retrievals and the initial step of the iterative retrieval is found, but with a smaller discrepancy to the truth.This finding is explained by the selection of the wavelength for the spatial evaluation close to the strongest BrO absorption band: the BrO profiles are slightly underestimated by up to 4 % for altitudes up to 30 km.The iterative retrieval improves the agreement to 2-3 % for altitudes below 25 km, while at higher altitudes the agreement is similar to the linear approaches.

Effect of cross-correlative terms between different trace gases
As stressed in Sect.3.4.2and demonstrated by the application of the Taylor series approach (Sect.5.1) the consideration of cross-correlative higher-order terms is especially important for the retrieval of the minor absorbers.We investigate their importance in the application of the iterative algorithm by performing an iterative retrieval without considering the cross-correlative terms in the calculation of the correction spectra.The resulting profiles (see magenta lines in Fig. 13) confirm the aforementioned arguments and results: substantial discrepancies with respect to the true profiles are observed for both minor absorbers (BrO and NO 2 ) while they are negligible for ozone.The discrepancies are largest for the NO 2 results in the VIS spectral range (more than 20 % around 20 km altitude) due to the larger ozone absorption, while in the UV the difference to the truth does not exceed 5 %.

Dependence on the wavelength selection for the spatial retrieval
We investigate the stability of the iterative retrieval if the spatial retrieval is performed at different wavelengths.Figure 14 displays the relative difference with respect to the true profiles when the spatial evaluation step of the retrieval is performed at different wavelengths within the fit window.While for the linear retrieval the discrepancies strongly depend on the selected wavelength, the iterative retrieval shows similar results for all wavelengths.For the linear retrieval, the discrepancies for the ozone profile are larger for wavelengths at the strongest ozone absorption bands in the UV as well as smaller at shorter wavelengths in the visible where smaller ozone absorptions occur.For NO 2 , the agreement is better at longer wavelengths in the UV but even then some discrepancies are left.In the visible, the profile is overestimated at higher altitudes (above 20-25 km depending on wavelength) increasing with wavelength, while it is underestimated at lower altitudes.For BrO, however, good agreement for the linear retrieval is obtained for the lower wavelength part of fit window (where the strongest BrO absorption peak occurs) but close to the ozone absorption bands.In short, the application of the linear approach requires detailed sensitivity studies to find the best retrieval wavelength and the agreement may even depend on the involved fit parameters.At the same time, the iterative approach shows similar performance at all wavelengths for all involved trace gases.

Sensitivity to the fit window selection
Figure 15 compares the retrieved profiles with the true profiles for the UV fit window extended towards shorter wavelengths (332-357 nm) including in addition two stronger ozone bands and one BrO absorption band.Compared to the findings for the standard fit window (Fig. 13, upper panel) similar but larger discrepancies with respect to the true profiles are obtained both for the linear retrieval and the iterative retrieval.For ozone they reach 5-8 % below 17 km altitude for all the retrieval approaches, while for BrO the underestimation increases for the linear retrievals at all altitudes being now 3-6 % below 30 km.For the iterative retrieval, the retrieved BrO profile varies more around the true profile being slightly more underestimated at low altitudes (by 4 % at 15 km) and overestimated by 2 % at around 21 km.For NO 2 , the profiles retrieved by the linear approach show a shift towards lower altitudes, while for the iterative retrieval approach an underestimation around 20 km altitude is observed (which is not present for the standard fit window).The agreement, however, improves (differences are less than 0.5-1 %) for all trace gases compared to the agreement obtained for the standard fit window if the third-order absorption effects of ozone are included in the DOAS fit (not shown here due to almost one to one matching with Fig. 13, upper panel).We also found out that to obtain this good agreement it is only necessary to include the third-order absorption effects in the DOAS fit, but it is not necessary to consider the thirdorder terms in the calculation of the effective light paths for the spatial evaluation and the calculation for the correction spectra in the iterative process.

Effect of including the Taylor series terms in the fit
For strongly non-linear problems, and if the a priori is too far away from the true profiles, the results during the iterative retrieval might converge towards profiles that differ from the truth.Applying the standard DOAS fit (i.e.skipping higherorder terms), much larger discrepancies are found even after the iterative process (Figs.16 and 17 for the UV and VIS respectively) when compared with the retrieval when the higher-order terms of ozone are included (Figs. 11 and 12).The residual values are of the magnitude of the third-order ozone absorption OD (compare with Fig. 2), indicating the impact from the higher-order spectral features: these features (1) are not considered by the standard DOAS fit so well as by ozone Taylor series terms and (2) consequently the retrieved profiles become systematically biased, which affects the calculated correction spectra.The retrieved profiles from different approaches are shown in Fig. 18.While the significantly disagreeing results of the linear retrievals are corrected to a large extent by the iterative approach, the uncertainties in the UV are by a factor of 2-4 larger compared with the retrieval including the additional ozone terms.The iterative retrieval in the VIS spectral range, which is characterized by much stronger absolute but smaller differential ozone absorption, also results in larger uncertainties (by a factor 2-4) above the ozone profile peak, while the ozone concentration is disagreeing by 7 % and the NO 2 concentration by more that 30 % at 20 km altitude.

Application to SCIAMACHY measurements
In the following we apply the two-step iterative algorithm introduced in Sect.5.3 for the retrieval of trace gas profiles from SCIAMACHY limb measurements.We investigate the response to different retrieval settings (like the crosssection dependence on temperature or the consideration of the cross-correlative terms).We also compare the retrieved profiles with correlated balloon measurements provided by Dorf et al. (2006Dorf et al. ( , 2008) ) and Rozanov et al. (2011).Note that the real measurements and retrieval parameters are subject to substantial errors including a relatively small signal-to-noise ratio and other instrumental problems, as well as uncertainties in the cross sections, temperature profiles, spectral calibration, and consideration of the Ring effect.For the comparison of the profiles from different instruments, the selection of different cross sections, fit windows, trajectory modelling, and photochemical correction for balloon measurements are also additional error sources (Kühl et al., 2008;Puk ¸īte et al., 2010;Rozanov et al., 2011).

Instrument description
The SCIAMACHY instrument (Burrows et al., 1995;Bovensmann et al., 1999) was flown on the ENVISAT satellite operated in a near-polar Sun synchronous orbit with an inclination from the equatorial plane of ∼ 98.5 • and probed the atmosphere at the day side of Earth in alternating sequences of nadir and limb measurements from August 2002 to April 2012.Limb scans in one scanning sequence were  (Serdyuchenko et al., 2013) at 193, 213, and 233 K (I 0 corrected), NO 2 (Bogumil et al., 2003) at 223 K, and BrO (Fleischmann et al., 2004) at 223 K; Taylor series terms for O 3 (product of cross section at 213 K and wavelength and squared cross section) and NO 2 (product of cross section and wavelength), functions to account for polarization (eta, zeta), offset (1/I 0 , λ/I 0 ), Ring, shift  0, 335.6, 338.0, 344.2, 357.0, 363.4, 380.0, 391.1, 399.9, 419.8, 435.1, 450.2, 519.9, 545.0, 570.2 Approximation of the effective light path variation on wavelength Fit with Rayleigh cross sections (Bates, 1984); their square and offset as fit parameters Forward modelling, initial retrieval iteration step Calculation of ODs of individual absorbers at spatial retrieval wavelength from a priori profiles according to the second-order approximation of the classical definition (Eq.32); first-order effective light paths for a priori scenario are obtained from first-and second-order effective light paths simulated for atmosphere without absorption.Cross-section dependence on temperature is accounted for by using temperature-dependent cross sections (Serdyuchenko et al., 2013 (O 3 ), Bogumil et al., 2003 (NO 2 ), andFleischmann et al., 2004 (BrO) are used) Forward modelling, following iterations Same as above but the calculation is performed for trace gas profiles obtained in the previous iteration; also correction spectra are calculated Wavelength at which the spatial retrieval is performed (nm)

342
Retrieved trace gases, altitude range (km) 10-38 (BrO, NO 2 ), 10-41 (ozone) Altitude grid resolution (km) 1 Regularization settings Optimal estimation scheme (Rodgers, 2000) with constant a priori uncertainty at all altitudes (for ozone: 10 × 10 12 ; NO 2 : 1 × 10 9 ; BrO: 2 × 10 7 molecules cm −3 ); smoothing constraint: correlation length of 3 km introduced in the a priori covariance matrix performed with approximately 3.3 km elevation steps at the tangent point in flight direction.The cross-track swath was 960 km at the tangent point and consisted of up to 4 pixels for the UV/VIS spectral range.The field of view was 0.045 • in elevation and 1.8 • in azimuth.This corresponds to approximately 2.5 km in vertical direction and 110 km in horizontal direction at the tangent point, respectively.Measurements were performed in the UV/VIS/NIR spectral range from 240 to 2380 nm with a spectral resolution of approximately 0.25 to 0.55 nm in the UV/VIS range.More instrumental details can be found in Bovensmann et al. (1999).

Retrieval settings
The retrieval of vertical profiles from SCIAMACHY measurements is performed in a similar way as for the studies with simulated spectra above.The retrieval scheme is based on the linear two-step retrieval algorithm developed in our group (Puk ¸īte et al., 2006(Puk ¸īte et al., , 2010;;Kühl et al., 2008) but with the addition of the iterative scheme.In order to account for real measurement conditions and to address instrumental effects, additional fit parameters for the spectral analysis are introduced in addition to those in the sensitivity studies: Ring effect, polarization, intensity offset, shift, and temperature dependence of cross sections are accounted for.The algorithm parameters as applied for the SCIAMACHY measurements are summarized in Table 3.Besides the iterative re- Logarithm of all eta, zeta functions from SCIAMACHY key data are orthogonalized and two functions with the largest eigenvalues are included in the analysis One eta and one zeta function from SCIAMACHY key data are included according to Kühl et al. (2008) Spectral corrections (shift) Spectral shift is fitted according to approach in Beirle et al. (2013) to allow shift between analysed spectra and the reference (spectra is adjusted by the fitted shift and iterated to account for non-linearity in shift) Non-linear method is used as in Kühl et al. (2008); see e.g.Stutz and Platt (1996)  2. the measured spectra are gridded to a TH grid of 3 km; 3. spectra at the same TH (across track) are averaged; 4. a Sun spectrum (A0 spectrum in the SCIAMACHY data set) is used as a reference spectrum to improve the signal to noise ratio in the fit -in order to correct for possible offsets in the retrieved quantities caused by unexplained spectral structures in the Sun spectrum, the absorber ODs are corrected (by subtraction) by the ODs retrieved at high TH; 5. the spectral shift is linearized and the shift between the reference and the analysed spectra is fitted as additional fit parameter according to the method in Beirle et al. (2013) (during the iterative process, the wavelength grid is adjusted to account for non-linearities in the shift); 6. to account for the spectrally varying polarization sensitivity of the SCIAMACHY instrument eta and zeta spectra are orthogonalized and the spectra corresponding to the largest eigenvalues are considered in the fit.
The settings for the spatial evaluation and the calculation of the correction spectra are similar to for the sensitivity studies, but in addition cross sections at all available temperatures are considered for the calculation of correction spectra and effective light paths.
The most important modifications in the spatial retrieval step include the following (see Table 4 for more details): 1.The vertical profiles are now obtained on a 1 km retrieval grid.
2. Instead of the SCD, the OD is used for the inversion allowing the consideration of the temperature dependence of the cross sections.Although the modifications are discussed in the scope of the iterative algorithm, the retrieval with the linear approach used in the comparison plots in the next subsection also includes all respective modifications while omitting the iterative process.Consequently the results are not the same as presented in Puk ¸īte et al. ( 2010) or Rozanov et al. (2011).

Evidence of the higher-order absorption structures in measured spectra
Before discussing the retrieved profiles by the new iterative algorithm (next subsection) we provide evidence for higher-order absorption structures in the measured spectrum.
Figure 19 shows a difference between the fit residuals for a fit without including higher-order ozone terms (standard DOAS) and with including them (initial fit of the iterative approach).The comparison is performed for a measurement  2010).The fit is performed for a SCIAMACHY measurement with the same observation geometry as for the sensitivity studies.Please note that a similar pattern is obtained for the simulated spectrum in the upper left plot in Fig. 16 illustrating the residual for the initial DOAS fit where the higher-order terms are omitted.
with the same observation geometry as used for the simulation studies (compare e.g.Fig. 16).Because all other residual structures are removed by the subtraction, a similar pattern as in the upper left plot in Fig. 16 (initial fit without higher-order terms added to the fit) can be seen.The neglected higherorder structures can be seen.

Results for BrO
We retrieve vertical concentration profiles by the iterative algorithm and the linear algorithm and analyse the effect of considering the temperature dependence of the cross sections or the second-order cross-correlative terms.We compare the vertical profiles retrieved from SCIAMACHY measurements with collocated LPMA/DOAS balloon observations of direct sunlight (Dorf et al., 2006(Dorf et al., , 2008;;Puk ¸īte et al., 2010;Rozanov et al., 2011).For a detailed description of the balloon DOAS instrument, the retrieval algorithm, and the collocation criteria, please refer to these publications.In Figs.20 and 21, results are shown for the available balloon measurements for Kiruna (67.9The SCIAMACHY retrieval results from the linear algorithm (green line), the iterative approach (red line), the iterative approach without considering the temperature dependence of the cross sections (i.e.assuming a constant temperature of 223 K in the forward modelling) (blue line), or the second-order cross-correlative terms (cyan line) are included in the plot.To not overburden the plots we do not include profiles showing the temperature effect for the linear retrieval.This effect is caused by the temperature dependence of the cross sections on the calculation of the higherorder effective light paths, but this effect is negligible.Also for minor absorbers, the results of the linear algorithm are practically identical with the result of the first iteration of the iterative algorithm, and therefore they are not plotted.An agreement between SCIAMACHY and the balloon profiles (within the error bars) can be seen for most of the profiles and altitudes, which fulfil the collocation criterion (indicated by the yellow shading).The profiles obtained by the linear retrieval are systematically lower than presented in the retrieval comparison by Rozanov et al. (2011) where they were found systematically too high in comparison to retrievals by other groups.This decrease compared to the previous version can largely be explained by the use of another wavelength (342 nm) for the spatial retrieval.Also the different treatment of the Fraunhofer reference might contribute to the change (we found a systematic bias caused by TH reference spectrum applied in the previous algorithm).The profiles from the iterative retrieval without yet considering the temperature dependence of the cross sections shows an increase compared to the linear retrieval.The difference is increasing with altitude (from around 0 % below 20 km to 10 % around 30 km).The tendency is comparable to the findings from the simulated spectra where a positive offset is also obtained for the altitude range around 20 km between the linear and iterative retrieval.However, for the real measurements the difference is smaller by 5 %.The consideration of the temperature dependence of the cross sections leads to lower values in comparison to the retrieval assuming a constant temperature.A seasonal variation is found similar to the findings by Rozanov et al. (2011).They investigated the effect of the temperature-dependent BrO cross sections and found a typical effect of 10 % with a strong seasonal variation with altitude.Keeping in mind that also the temperature dependence of the ozone cross section in the current study is considered, a larger effect (up to 15-20 %) for high latitudes in spring is obtained, caused by both the temperaturedependent BrO and ozone absorption features.Exclusion of the cross-correlative terms in the correction spectra (which is subtracted from the measured spectra in the upcoming iterations of the iterative retrieval) shows only an effect of a few percent, which is in accord with the simulation studies.

Conclusions
For scattered light observations the Beer-Lambert law generally does not hold due to multiple light paths contributing to the measurement.In the presence of strong absorption and a strong wavelength dependence of the light scattering probability (characteristic for complex viewing geometries like satellite limb measurements) the commonly used linear approximation between the trace gas concentrations and the logarithm of the measured intensity does not hold anymore.The non-linearity is especially strong when large wavelength intervals are used in the spectral analysis.Therefore algorithms based on the DOAS technique are especially affected.In essence, basic quantities like the OD or the SCD cannot be unequivocally defined anymore.Even with modifications, such as including additional fit terms accounting for the effects of strong absorbers, this problem cannot be solved completely because of remaining cross-correlative structures between weak and strong absorbers or remaining structures due Comparison of SCIAMACHY BrO profiles retrieved by the linear approach (green), iterative approach (red), iterative approach without considering the temperature dependence of the cross sections (blue), and iterative approach without considering the cross-correlative terms in the calculation of the correction spectra (cyan) with collocated balloon observations.In order to match the SZA of the SCIAMACHY measurement the balloon results are photochemically corrected (Dorf et al., 2006(Dorf et al., , 2008)).Grey areas illustrate the uncertainty of the balloon results (1σ ).The altitude range where the sounded air masses of both instruments match is indicated by a yellow shading.www.atmos-meas-tech.net/9/2147/2016/ to the broadband variability on wavelength.While we show that, in principle, the higher-order cross-correlative structures can be considered in the DOAS fit as additional fit parameters, to obtain meaningful results a good signal-to-noise ratio is required to minimize the cross-correlation between the fitted parameters.
To avoid the need to apply a full retrieval algorithm requiring time-consuming, iterative wavelength-by-wavelength forward modelling of absorption spectra, we introduce an iterative two-step retrieval algorithm separating the spectroscopic and spatial retrieval steps.From the profiles retrieved in a previous step, corrections for the measured spectra are calculated.After they are applied to the measured spectra, the next iteration is performed.Usually, convergence is found after 2-3 iterations.The correction spectra are calculated based on the a priori profiles or the profiles derived in the respective iteration.The calculation is based on RTM calculations at only a few selected wavelengths for an atmosphere with zero absorption, which strongly reduces the calculation time.Also higher-order absorption effects and the temperature dependence of the cross sections is considered.The quantitative definition of higher-order absorption effects can be obtained by expanding the radiative transfer equation into higher-order Taylor series.Higher-order effective light paths are calculated as weighted mean products of the light paths through several boxes.By approximating the wavelength dependence of the first-and higher-order effective light paths derived from pre-calculated RTM simulations at several wavelengths by a broadband function, we calculate absorption spectra and first-order effective light paths for any atmospheric trace gas composition and for any wavelength within the fit range, thus eliminating the need for RTM simulations during the retrieval process.
We find a strong influence by second-order absorption structures for retrievals from measurements in limb geometry, which often exceed the contributions by minor absorbers (ODs ∼ 10 −3 -10 −1 ).Even third-order terms show potential significance: for atmospheric scenarios typical for Arctic spring (with the stratospheric ozone concentration larger than the yearly average), we found an OD for the third-order ozone absorption term of up to 10 −2 at the absorption peak at ∼ 333 nm and exceeding 10 −4 even at ∼ 344 nm, while also in the visible (at 550 nm) it is ∼ 10 −3 .Also cross-correlative terms between ozone and other trace gases can have significant ODs and thus have the potential to cause retrieval errors if not considered by retrieval.Since higher-order contributions to the OD structures are proportional to products of the trace gas concentrations, these terms will increase or decrease proportionally to changes in the respective products; e.g.second-order self-correlative terms are proportional to the squared concentration.
By applying the iterative two-step algorithm for simulated measurements, the residual structures caused by absorption are reduced: sensitivity studies reveal good agreement within 1 % for retrieved ozone profiles with respect to the truth.This good agreement is obtained for most of the altitudes for retrievals both in the UV and the VIS spectral range even if the effective light paths up to just the second-order are considered.The agreement practically does not depend on the selected wavelength for the spatial evaluation.Also for the minor absorbers NO 2 and BrO the retrieved profiles are wavelength independent with an agreement within a few (1-3) percent for altitudes above 17-20 km.At the same time, linear retrieval algorithms (without an iterative process) without complete correction for higher-order effects show partly significant discrepancies and the retrieved values are strongly dependent on the wavelength selected for the spatial evaluation.Good agreement is also found between the retrieval results when the second-order approximation for the effective light paths or their direct calculation by RTM are used.
The pre-calculation of higher-order sensitivity parameters by RTM allows to enhance the performance speed of iterative retrieval algorithms.The consideration of only second-order absorption terms in the simulation of the correction spectra in the iterative two-step approach accounts for most of the non-linear absorption effects, reducing the systematic residual structures to the order of ∼ 10 −4 for scenarios even with strong ozone absorption.An agreement within a few percent for simulated trace gas profiles typical for SCIAMACHY limb geometry is obtained.Also the application of the iterative two-step algorithm on real SCIAMACHY measurements confirms the findings of the sensitivity studies based on synthetic spectra and shows good agreement with the collocated balloon measurements.The algorithm profits from the additional possibility to correct the temperature dependence of the cross sections.The improved time efficiency when applying the pre-calculated higher-order effective light paths instead of online RTM could help for applications for future instruments with better quality and higher data amount.Additionally, it is not necessary to consider the trace gas profile variability in the lookup tables like in applications relying on weighting function DOAS (as in e.g.Buchwitz et al., 2000;Coldewey-Egbers et al., 2004) because the algorithm is fully flexible to derive the trace gas absorption from the effective light paths.Together with the flexibility for the wavelength selection, the method is, in principle, easily implementable for different fit windows, observation geometries, and instruments.
Although our studies are limited to the limb geometry which is in many aspects the most sophisticated one, MAX-DOAS and even nadir geometry under extreme conditions (high SZA and/or high pollution) can benefit from the approach, but this needs to be investigated in separate studies.The obtainable improvement depends on the precision at which one is measuring (uncertainties of the spectral retrieval) and performing the retrieval (e.g.uncertainty of the radiative transfer simulations).The third-order cross-correlative term between two trace gases X and Y is Since there are three combinations of XY Y , the summation is performed three times, which leads to the corresponding expression of the sensitivity part.
The third-order cross-correlative term between three trace gases X, Y , and Z is There are six combinations of XY Z; the summation therefore is performed three times, which leads to the corresponding expression of the sensitivity part.
It can be seen that the expressions are third-order products of cross sections and these cross-section products can in principle be considered in a DOAS fit.

Figure 2 .
Figure 2. (a) Contribution (absolute values) to the OD by different Taylor series terms in the UV spectral region with three absorbers (ozone, NO 2 , and BrO) for a simulated scenario characteristic for SCIAMACHY limb observations at high northern latitudes in March (TH = 19.83km).All OD terms up to the second-order and the largest third-order terms are shown.(b) Simulated (true) absorption OD in comparison with the OD calculated from the effective light paths considering absorption up to a certain order.(c, d) Same as (a) and (b) but in the visible spectral range with two absorbers (ozone and NO 2 ).

2154J.)
Figure 3. (a) Conventional (first-order) effective light paths at 545 nm calculated for the same scenario as in Fig. 2. (b) Product of the firstorder effective light paths at two different altitudes.(c) Second-order effective light paths as function of two different altitudes.(d) Difference between the second-order effective light paths and the products of the first-order effective light paths.

Figure 4 .
Figure4.Difference between the sum of the ODs of the considered absorbers, each calculated by the classical formula, and the OD calculated for the whole absorption (red crosses), in comparison with the sum of the inter-correlative terms (blue line: sum of second-order terms only; green line: sum of second-and third-order terms).Top panel: the result in the UV for the limb observation scenario with TH = 19.83km and absorption scenario containing ozone, NO 2 and BrO.Bottom panel: same but in visible range and for two absorbers, ozone and NO 2 .

Figure 5 .
Figure5.Effective light paths for limb geometry (TH = 18 km, 1 km layer from 18 to 19 km) modelled by RTM at five distinct wavelengths and approximated afterwards by a fit with a Rayleigh cross section, its square plus an offset.

Figure 6 .
Figure 6.(a) Effective light paths at 545 nm explicitly simulated by RTM for an absorption background (true) in comparison with firstorder effective light paths approximated for this background from the effective light paths simulated for an atmosphere without absorption.Calculations considering effective light paths up to either second or third order are performed.(b) Contributions from terms of different orders to the effective light paths in (a).(c ,d) Relative differences between the approximated effective light paths considering up to second-or third-order terms respectively and the explicitly simulated effective light paths at the absorption background for different wavelengths in the mid-visible spectral ranges.

5Figure 7 .
Figure 7. DOAS evaluation of a simulated limb spectrum at TH = 19.83km.The simulated spectrum is shown on top left, the fit residual on top right.The remaining plots show fit results for the individual OD terms listed inTable 1, third column.

Figure 9 .
Figure9.Mean and standard deviation of the fitted OD of the firstorder NO 2 term (top), second-order cross-correlative term between ozone and NO 2 (middle), and the sum of both terms (bottom).For a statistical evaluation, the analysis is performed for the simulated limb spectrum at TH = 19.83km with 1000 different random Gaussian noise spectra added (similar as for Fig.8).Additionally, the respective terms as calculated from the simulated effective light paths are included for comparison.

Figure 11 .
Figure 11.Residual and fitted ODs of the absorbers obtained by the iterative two-step approach in the UV.The spectra are simulated for the same scenario (limb observation in subarctic atmosphere in spring) as for previous figures.Left panel: initial DOAS fit for a spectrum at TH = 19.83km; right panel: fit for the corrected spectrum (original spectrum corrected by subtracting the spectrum calculated from profiles retrieved in the last iteration).

Figure 12 .
Figure 12.Same as Fig. 11 but for the visible fit window.
Figure13.Odd plots (from left for ozone, NO 2 , and BrO): vertical concentration profiles retrieved by different approaches (blue: linear approach with effective light paths simulated for the true profiles; green: same but with effective light paths calculated for the a priori profiles from the first-and second-order absorption terms simulated at zero absorption background; red: profile retrieved at the initial step of the iterative algorithm; magenta solid line: final profile obtained by the iterative approach; magenta dotted line: final profile but without consideration of the cross-correlative terms between different absorbers) in comparison with the true profile (black dotted line: original profile; black solid line: original profile smoothed by averaging kernels).The a priori is shown as a dashed line.Even plots: relative difference to the true profiles smoothed by the averaging kernels.Top: results for the UV; bottom: results for the visible.
Figure14.From left: ozone, NO 2 , and BrO: relative differences between the trace gas profiles retrieved at different wavelengths by different approaches with respect to the true profile smoothed by averaging kernels.Rows from top to bottom: retrieval in the UV by the linear approach (with effective light paths calculated for the a priori), retrieval in the UV by the iterative approach, retrieval in the visible by the linear approach (with effective light paths calculated for the a priori), and retrieval in the visible by the iterative approach.

Figure 15 .Figure 16 .
Figure 15.Same as upper panel of Fig. 13 but for an extended UV fit window (332-357 nm) including an additional BrO absorption band at shorter wavelengths.

Figure 17 .
Figure 17.Same as Fig. 16 but for the visible fit window.

Figure 18 .
Figure18.Same as Fig.13but for the application of standard DOAS.The final profile but without consideration of the cross-correlative terms between different absorbers (magenta dotted line in the original figure) is not shown here.

3.
Weighting functions (i.e. the product of the effective light paths and the cross sections) are calculated at 342 nm (outside the strong ozone absorption band) while in Puk ¸īte et al. (2010) box AMFs at 344.2 nm (at an ozone absorption band but providing good agreement according to sensitivity studies) are used.4.Effective light paths are approximated by a broadbandfunction from the first-and second-order effective light paths obtained at zero absorption and at selected wavelengths.The absorption contribution is approximated up to the second order, while in Puk ¸īte et al. (2010) box AMFs are calculated at a given wavelength and for a priori absorption scenario.
Figure20.Comparison of SCIAMACHY BrO profiles retrieved by the linear approach (green), iterative approach (red), iterative approach without considering the temperature dependence of the cross sections (blue), and iterative approach without considering the cross-correlative terms in the calculation of the correction spectra (cyan) with collocated balloon observations.In order to match the SZA of the SCIAMACHY measurement the balloon results are photochemically corrected(Dorf et al., 2006(Dorf et al., , 2008)).Grey areas illustrate the uncertainty of the balloon results (1σ ).The altitude range where the sounded air masses of both instruments match is indicated by a yellow shading.

Figure 21 .
Figure 21.Same as Fig. 20 but showing the relative differences for the SCIAMACHY results with respect to the balloon measurements.

Table 1 .
Fit parameters included in the DOAS evaluation of synthetic spectra in the spectral range 520-570 nm.

Table 2 .
Settings of the iterative two-step algorithm for simulation studies.

Table 3 .
Settings of the iterative two-step algorithm for the application to SCIAMACHY limb observations in the UV.

Table 4 .
Comparison of the current retrieval algorithm implementation for BrO with the algorithm in Puk ¸īte et al. (2010).