Multi-axis differential optical absorption spectroscopy
(MAX-DOAS) is a widely used measurement technique for the detection of a
variety of atmospheric trace gases. Using inverse modelling, the observation
of trace gas column densities along different lines of sight enables the
retrieval of aerosol and trace gas vertical profiles in the atmospheric
boundary layer. In this study, the
ability of eight profile retrieval algorithms to reconstruct vertical
profiles is assessed on the basis of synthetic measurements. Five of the
algorithms are based on the optimal estimation method, two on parametrised
approaches, and one on an analytical approach that does not involve any
radiative transfer modelling. The synthetic measurements consist of the
median of simulated slant column densities of

The planetary boundary layer (PBL) is the part of the atmosphere that is in
direct contact with the terrestrial biosphere. Its chemical composition is
determined by anthropogenic and natural emissions. Monitoring of both
chemical composition and aerosol content of the PBL is crucial for the
understanding of the chemical and physical processes and the spatio-temporal
evolution of PBL composition. A versatile tool for the monitoring of
atmospheric trace gases and aerosol content of the PBL is the well-known
multi-axis differential optical absorption spectroscopy (MAX-DOAS)

Algorithms for the retrieval of vertical profiles from MAX-DOAS measurements
can be separated into those that retrieve vertical profiles on a finite
vertical grid (usually with layers of 50–200 m in thickness) using the
optimal estimation method (OEM)

Testing the performance of algorithms for the retrieval of the atmospheric
state using remote-sensing measurements on the basis of synthetic data is a
method that has been widely used in the scientific community. In particular,
numerous synthetic studies that investigated the performance of MAX-DOAS
retrieval algorithms were published in the past

The paper is structured as follows: Sect.

In general, the retrieval of the atmospheric state (or the state of any
physical system) by remote sensing is based on the observation of a finite
number of quantities that represent the components of the measurement
vector

To overcome these problems, MAX-DOAS retrieval algorithms make use of two
different approaches. Retrieval algorithms using the well-known optimal
estimation method (OEM) are based on a Bayesian approach

MAX-DOAS retrieval algorithms participating in this study

The a priori constraints

The degrees of freedom for signal (DFS)

Parametrised retrieval algorithms do not explicitly introduce a priori
constraints but overcome the problem that the state vector is poorly
constrained by the measurements by representing the state vector as

and the best estimate of the parameters

OEM algorithms have the advantages that the approach, based on well-established Bayesian statistics, is mathematically rigorous and that important parameters, such as the retrieval covariance matrix (separated into smoothing and noise covariance), AVK, and information content, can be readily derived. Parametrised algorithms are usually faster than OEM algorithms since the small number of parameters allows for the use of precalculated LUTs, whereas OEM algorithms, with a larger number of state vector elements, usually perform radiative transfer calculations online. The calculation of the Jacobian matrix (weighting function), which is required by OEM algorithms for the minimisation of the cost function, can be quite time-consuming, especially for non-linear problems, such as the aerosol profile retrieval. Parametrised algorithms have the disadvantage that describing the state vector with a limited number of parameters restricts its possible representations to a subspace of the state vector space. Conversely, OEM algorithms tend to be biased towards the a priori, in particular in regions where the sensitivity to the atmospheric state is low.
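The diagnostics named above (retrieval covariance split into smoothing and noise parts, AVK, and DFS) can be illustrated with a minimal sketch of a single linear optimal-estimation step. This uses the standard textbook formulation and is not the implementation of any of the participating algorithms. The function name `linear_oem`, the toy Jacobian, and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def linear_oem(K, y, x_a, S_a, S_e):
    """Single linear optimal-estimation step with standard diagnostics.

    K: Jacobian (weighting functions), y: measurement vector,
    x_a: a priori state, S_a / S_e: a priori / measurement covariance.
    """
    S_e_inv = np.linalg.inv(S_e)
    S_a_inv = np.linalg.inv(S_a)
    # Retrieval (posterior) covariance
    S_hat = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)
    # Gain matrix: sensitivity of the retrieved state to the measurements
    G = S_hat @ K.T @ S_e_inv
    x_hat = x_a + G @ (y - K @ x_a)
    # Averaging kernel matrix (AVK) and degrees of freedom for signal (DFS)
    A = G @ K
    dfs = float(np.trace(A))
    # Split of the retrieval covariance into smoothing and noise parts
    I = np.eye(len(x_a))
    S_smooth = (A - I) @ S_a @ (A - I).T
    S_noise = G @ S_e @ G.T
    return x_hat, A, dfs, S_hat, S_smooth, S_noise

# Toy problem (hypothetical numbers): 5 layers, 8 viewing directions
rng = np.random.default_rng(0)
n_layers, n_meas = 5, 8
K = rng.uniform(0.1, 1.0, size=(n_meas, n_layers))
x_true = np.array([1.0, 0.8, 0.5, 0.2, 0.1])
x_a = np.full(n_layers, 0.5)
S_a = np.diag(np.full(n_layers, 0.5 ** 2))
S_e = np.diag(np.full(n_meas, 0.01 ** 2))
y = K @ x_true  # noise-free synthetic measurement

x_hat, A, dfs, S_hat, S_smooth, S_noise = linear_oem(K, y, x_a, S_a, S_e)
print("DFS:", round(dfs, 2))
```

In this linear setting the trace of the AVK lies between zero and the number of layers, and rows of the AVK with small diagonal values mark layers where the result stays close to the a priori.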

The overall strategy for the comparison of aerosol and trace gas vertical
profiles from MAX-DOAS measurements within this study is depicted in
Fig.

The first step of the intercomparison exercise consists of a comparison of the forward models of the individual retrieval
algorithms. Simulations of SCDs of HCHO,

The medians of the ensemble of dSCDs of HCHO,

The comparison of profiles

Finally, a comparison of the numerical performance of the individual retrieval algorithms is performed (Sect.

Flow diagram depicting the strategy for the retrieval algorithm intercomparison.

This section briefly describes the eight MAX-DOAS retrieval algorithms whose
main features are listed in Table

The Belgian Profile (bePRO) OEM inversion algorithm was developed at the
Royal Belgian Institute for Space Aeronomy (BIRA-IASB) by

The standard retrieval vertical grid used in bePRO is the following: 10 layers of 200 m in thickness starting from the altitude of the station, followed by two layers of 500 m in thickness and one layer of 1 km in thickness. This grid can be modified if needed.
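As a small illustration, the standard grid described above can be constructed as follows. The function name and the default station altitude of 0 m are assumptions; only the layer thicknesses are taken from the text.

```python
import numpy as np

def bepro_standard_grid(station_altitude_m=0.0):
    """Layer edges, centres and thicknesses (in m) of the grid described above."""
    # 10 layers of 200 m, two layers of 500 m, one layer of 1 km
    thicknesses = np.array([200.0] * 10 + [500.0] * 2 + [1000.0])
    edges = station_altitude_m + np.concatenate(([0.0], np.cumsum(thicknesses)))
    centres = 0.5 * (edges[:-1] + edges[1:])
    return edges, centres, thicknesses

edges, centres, thicknesses = bepro_standard_grid()
print(len(thicknesses), "layers, top of grid at", edges[-1], "m")
```

With these thicknesses the grid has 13 layers and extends 4 km above the station.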

A bePRO retrieval (aerosol or trace gas) is flagged as valid if the three
following criteria are fulfilled: (1) the root mean square (RMS) of the
difference between measured and simulated dSCDs

The Bremen Optimal estimation REtrieval for Aerosols and trace gaseS (BOREAS)
OEM algorithm

The BOREAS aerosol retrieval is based on the minimisation of the difference
in

The profile retrieval is calculated on SCIATRAN's main grid, which can be set by the user. The grid describes homogeneous layers centred on the grid points, with the exception of the uppermost and lowermost layers, which are only half a grid step thick. BOREAS retrieval results are routinely calculated on equidistant grid levels from the surface up to the maximum retrieval height. For each level, the retrieved value is the average of the corresponding layer; e.g. for the 200 m level, the layer covers the altitude range from 100 to 300 m. The two exceptions are the boundary levels, 0 and 4000 m, which cover only half the altitude range of the other layers (0–100 and 3900–4000 m). Since the grid in this intercomparison study was defined at the layer centres, BOREAS results were interpolated onto these altitudes. As a consequence of this interpolation, the lowest value in each submitted profile is an interpolation between the retrieved surface value and the 200 m result.
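The relation between the BOREAS levels and the layer centres of the common grid can be sketched as follows. This is an illustration only: it assumes a 200 m level spacing up to 4000 m, linear interpolation, and common-grid centres at 100, 300, ..., 3900 m. The function name and the test profile are hypothetical.

```python
import numpy as np

# BOREAS-style levels: 0, 200, ..., 4000 m (21 levels)
levels_m = np.arange(0.0, 4001.0, 200.0)

# Layer represented by each level: +/- 100 m around it, clipped at the
# boundaries, so that the 0 m and 4000 m layers are only 100 m thick
layer_lower_m = np.clip(levels_m - 100.0, 0.0, 4000.0)
layer_upper_m = np.clip(levels_m + 100.0, 0.0, 4000.0)

# Assumed layer centres of the common intercomparison grid
centres_m = np.arange(100.0, 4000.0, 200.0)

def to_layer_centres(values_on_levels):
    """Linearly interpolate values given on the levels onto the layer centres."""
    return np.interp(centres_m, levels_m, values_on_levels)

# Hypothetical exponential profile with a scale height of 1 km
profile_on_levels = np.exp(-levels_m / 1000.0)
profile_on_centres = to_layer_centres(profile_on_levels)
```

Under these assumptions the lowest interpolated value is simply the mean of the surface value and the 200 m value, which is why a sharp surface maximum is smoothed in the submitted profiles.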

Four different quality filters were applied. A profile is flagged as invalid
if (1) the retrieved vertical column density is negative, (2) the
profile contains more than 10 negative values, (3) the RMS between simulated
and measured dSCD is larger than

The Heidelberg Profile Retrieval Algorithm (HEIPRO) is an updated version of
the algorithm already described in detail in

HEIPRO is based on OEM and retrieves the most probable state vector by
minimising the cost function given by Eq. (

No filtering of the HEIPRO data has been performed, and all profiles are flagged as valid.

The Mexican MAX-DOAS Fit (MMF) algorithm

MMF flagging for this study was based on the mean of the ratio of the absolute value of the difference between measured and simulated dSCD and the dSCD measurement error. The limit for flagging scans as invalid was 10.
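A minimal sketch of such a flagging criterion is given below. The function name, the argument names, and the choice to treat a value exactly equal to the limit as valid are assumptions; only the definition of the ratio and the limit of 10 come from the description above.

```python
import numpy as np

def mmf_scan_is_valid(dscd_measured, dscd_simulated, dscd_error, limit=10.0):
    """Return True if the mean error-normalised dSCD mismatch does not exceed the limit."""
    measured = np.asarray(dscd_measured, dtype=float)
    simulated = np.asarray(dscd_simulated, dtype=float)
    error = np.asarray(dscd_error, dtype=float)
    # |measured - simulated| in units of the dSCD measurement error
    ratio = np.abs(measured - simulated) / error
    return bool(np.mean(ratio) <= limit)

# Hypothetical scan with three elevation angles (arbitrary units)
good = mmf_scan_is_valid([10.0, 8.0, 5.0], [10.5, 7.8, 5.1], [0.5, 0.5, 0.5])
bad = mmf_scan_is_valid([10.0, 8.0, 5.0], [20.0, 15.0, 12.0], [0.5, 0.5, 0.5])
print(good, bad)
```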

The OEM-based profile inversion algorithm of aerosol extinction and trace gas
concentration (PriAM) developed by the Anhui Institute of Optics and Fine
Mechanics, Chinese Academy of Sciences (AIOFM, CAS), in cooperation with Max
Planck Institute for Chemistry (MPIC), is introduced in

The Mainz Profile Algorithm (MAPA) developed at the MPIC is a two-step
algorithm based on a parametrised approach. First, the aerosol profile is
retrieved based on

Previous versions of the parameter-based profile inversions, as described
e.g. in

MAPA uses RTM parameters for the LUT generation that slightly differ from
those prescribed within this study (see Sect.

The MAPA flagging scheme is as follows. For each elevation sequence, MAPA
determines the parameter combinations yielding the best match of modelled and
measured dSCDs, no matter how good this best match actually is. Thus,
flagging is required in order to evaluate which MAPA results should be
considered as meaningful and which not. MAPA provides a two-stage flagging
scheme: moderate exceedance of the thresholds results in a warning, while
large deviations raise an error. Flagging is based on different criteria. (1) The level of agreement between forward model and measurement compared to
the dSCD error

Flag criteria and thresholds have been developed and optimised based on both
the synthetic dSCDs presented in this study and real measurements during the
Second Cabauw Intercomparison of Nitrogen Dioxide Measuring Instruments
(CINDI-2). For details on the MAPA flagging scheme and a discussion on the
impact of the a priori thresholds see

The MAX-DOAS retrieval KNMI (MARK) developed at the Royal Netherlands Meteorological
Institute (KNMI) is described in

MARK data are flagged as invalid if the variability in the AOT or the trace gas column within an ensemble is larger than 15 % of the value itself.
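A criterion of this kind could be sketched as follows. The text does not specify how the variability is measured or which value it is compared with, so this sketch assumes the standard deviation of the ensemble and the ensemble mean, respectively. The function name is hypothetical.

```python
import numpy as np

def mark_ensemble_is_valid(ensemble_values, max_relative_variability=0.15):
    """Return True if the ensemble spread is at most 15 % of the ensemble mean.

    Assumption: 'variability' is taken as the standard deviation and
    'the value itself' as the ensemble mean.
    """
    values = np.asarray(ensemble_values, dtype=float)
    spread = np.std(values)
    reference = abs(np.mean(values))
    return bool(spread <= max_relative_variability * reference)

# Hypothetical AOT ensembles
stable = mark_ensemble_is_valid([0.30, 0.31, 0.29])
unstable = mark_ensemble_is_valid([0.10, 0.30, 0.50])
print(stable, unstable)
```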

The National Aeronautics and Space Administration (NASA) real-time algorithm
was developed as a quick look algorithm that relies on the fact that
atmospheric scattering strongly affects DOAS-measured

The maximum number of vertical layers is equal to the number of elevation angles. Profiles are considered invalid if fewer than four measurements are used in the profile calculation, with all of the synthetic data analysed here satisfying this test. Within this study, an exponential profile decreasing to 0.01 % of the last altitude layer extinction coefficient at 4 km was added for consistency with the other algorithms. The resulting profile was then linearly interpolated on the common grid (200 m up to 4 km).

The trace gas profile retrieval does not rely on the aerosol retrieval. The
trace gas VCD

Near-surface trace gas VMRs

The rest of the profile VMR is calculated using

A first important step for the comparison of retrieval algorithms is the
assessment of their capability to realistically simulate the underlying
physical processes using appropriate forward models, which are in this case
atmospheric radiative transfer models. This section describes the forward
model parameters and atmospheric scenarios for the modelling of

The model atmosphere for the forward calculations consists of 67 layers, with a resolution of 100 m at altitudes between the surface and 4 km and a coarser resolution above. Note that the retrieval of extinction and trace gas profiles is performed on a coarser grid with 200 m resolution in the lowermost 4 km. The choice of a finer grid for the forward modelling than for the inverse modelling allows for the investigation of the impact of sub-grid trace gas and aerosol variabilities on the retrieved profiles. Whenever possible, the forward models assume a constant concentration within each layer.
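To illustrate the relation between the two grids, the sketch below averages a profile given on 100 m layers onto 200 m layers in the lowermost 4 km. It assumes constant values within each layer and a plain thickness-weighted mean. The function name and the example profile are hypothetical and not part of the study's code.

```python
import numpy as np

FINE_DZ_M = 100.0    # forward-model layer thickness below 4 km
COARSE_DZ_M = 200.0  # retrieval layer thickness below 4 km

def average_to_retrieval_grid(profile_fine, factor=2):
    """Average consecutive groups of `factor` equally thick fine layers."""
    profile_fine = np.asarray(profile_fine, dtype=float)
    if profile_fine.size % factor != 0:
        raise ValueError("number of fine layers must be a multiple of factor")
    return profile_fine.reshape(-1, factor).mean(axis=1)

# Hypothetical sub-grid feature: a single 100 m layer at 200-300 m
fine = np.zeros(40)
fine[2] = 1.0
coarse = average_to_retrieval_grid(fine)

# The vertical column is conserved by the averaging
column_fine = fine.sum() * FINE_DZ_M
column_coarse = coarse.sum() * COARSE_DZ_M
print(coarse[:3], column_fine, column_coarse)
```

The 100 m feature appears at half its amplitude on the 200 m grid, which is the kind of sub-grid effect the finer forward-model grid makes it possible to study.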

Trace gas number concentration

The trace gas concentration and aerosol extinction profile scenarios for the
forward modelling of HCHO,

Description of the trace gas profiles shown in Fig.

The viewing geometry for the forward model simulations is specified by the EA

Description of the aerosol extinction profiles shown in Fig.

Viewing geometry for the model calculations. SCDs are simulated for each combination of EA, SZAs, and RAA.

Further RTM parameters specified for both forward and inverse modelling are
listed in Table

RTM parameters for the radiative transfer modelling.

Assumed measurement errors of the

In total, all combinations of viewing geometries, aerosol profiles, and trace
gas profiles yield 990

In this section, the ability of the forward models to realistically simulate
trace gas SCDs is assessed based on a comparison of the individual
simulations with the median of the SCDs from the different RTMs. The median
SCDs also serve as a reference dataset and provide the synthetic measurements
for the profile retrieval comparison presented in
Sect.

Correlation of SCDs of

Slope, intercept, regression coefficient, and root-mean-square difference (RMS) of the
correlation between the SCDs from the individual forward models and the median SCDs from all models.
RMS and intercept values are in units of

In summary, the SCDs simulated by the different radiative transfer models
agree well under all conditions, as within previous RTM intercomparisons

This section presents the results of the aerosol and trace gas retrieval
algorithm intercomparison. Section

The 360 nm aerosol AVKs. Each subplot shows the mean AVK over all SZAs and RAA for a specific aerosol scenario (columns) and retrieval algorithm (rows). Filled circles indicate the nominal altitude of the corresponding AVK plotted in the same colour. Colour-shaded areas (barely visible in most cases) indicate the standard deviation of the AVKs, i.e. the variation for different SZAs and RAA. Also shown are the DFS. AVKs stem from retrievals with noisy measurements (v1n).

Based on the simulated SCDs from the atmospheric scenarios described in
Sect.

Two reference datasets were created: dataset v1 contains the median dSCDs
without any noise, and dataset v1n contains the median dSCDs with a noise
component consisting of the sum of (1) normally distributed noise with a
standard deviation according to the errors listed in Table

Same as Fig.

Same as Fig.

Each participant performed retrievals with settings being as close as
possible to the prescribed settings described below. The results based on
noise-free and noisy dSCDs are labelled “v1” and “v1n”, respectively. Each
measurement vector consists of

Same as Fig.

In contrast to the output of OEM algorithms, which directly retrieve trace
gas and aerosol profiles on this vertical grid, profiles from the
parametrised algorithms are interpolated onto the prescribed grid. The OEM
algorithms use aerosol and trace gas a priori profiles exponentially
decreasing with altitude with a scale height of 1 km, with a priori vertical
columns for each species as listed in Table

Optionally, each participant can define criteria for the validity of the retrieved profiles, and can provide corresponding Boolean validity flags for each profile.

A posteriori versus measured dSCDs of

Slope, intercept, regression coefficient, and RMS of the dSCD correlation. Each of the circular
symbols for the filtered data represents a pie chart that quantifies the fraction of data flagged as
valid. RMS and intercept values are in units of

Figures

The shapes of the AVKs from the different models have a high degree of
similarity, except for BOREAS aerosols, where the AVKs show a much smaller
information content than all other algorithms and indicate that there is very
little height sensitivity for aerosol profiles from the BOREAS algorithm. The
respective BOREAS aerosol vertical profiles are, however, in good agreement
with the results from the other algorithms (see Sect.

Apart from the fog scenario (AER8), there is only a moderate dependency of
vertical resolution and information content on the aerosol content of the
atmosphere. Interestingly, the information content for trace gases, in
particular

Comparison of retrieved (red solid line, version v1n) and true (green solid line) vertical profiles of aerosol extinction at 360 nm for each aerosol scenario (columns) and algorithm (rows). The retrieved profiles are the medians for all SZA–RAA combinations. The (25 %–75 %) and (5 %–95 %) percentiles are shown as grey areas and whiskers, respectively. The a priori profile is shown as the blue line (OEM algorithms only).

The variability of the AVKs with the position of the Sun (shown as shaded
areas in Figs.

An important indicator for the level of convergence of the retrieval, and
subsequently the accuracy of the retrieved profile, is the agreement between
the measurement vector

Same as Fig.

The algorithms show significant differences in the level of convergence.
BOREAS, HEIPRO, MMF, and MAPA show good agreement between measured and
modelled dSCDs for all scenarios with slopes and Pearson correlation
coefficients close to unity. The same holds true for MARK except for the HCHO
retrieval, where poor convergence is achieved for the TG9 scenario, but this
has only a small effect on the regression parameters (see Fig.

Comparison of retrieved (red solid line, version v1n) and true (green solid line) vertical
profiles of HCHO for each trace gas scenario (columns) and algorithm (rows). In addition to the median
profiles for all aerosol scenarios, SZAs and SAAs shown in red, the median concentration profile for
each aerosol scenario is shown as coloured symbols as denoted in the legend. The RMS (true – retrieved
extinction) is shown in units of

In this section, the overall ability of the retrieval algorithms to reproduce
the true atmospheric aerosol and trace gas profiles is discussed. Figures

Same as Fig.

Slope, intercept, regression coefficient, and RMS for the correlation between true and retrieved
aerosol extinction as well as HCHO and

Most algorithms are capable of realistically retrieving the shape of the true
aerosol extinction profiles for moderate conditions (AER0 … AER7), with
slopes close to unity and Pearson regression coefficients for the correlation
between true and retrieved extinction of

Box–whisker plots of the retrieved AOT at 360

The 200 m thick fog layer (AER8) is very well reproduced by the parametrised
algorithms MAPA and MARK, and to a lesser extent also by the OEM algorithms
MMF, HEIPRO, and PriAM, which retrieve the logarithm of the aerosol extinction
profile and are therefore capable of retrieving a wider range of aerosol
extinction values than bePRO, which operates in linear space and therefore
is subject to a stronger bias towards the a priori. The cloud above 1 km
(AER9) is very well retrieved by MMF. MAPA and MARK capture the cloud bottom
altitude but overestimate the extinction above the cloud. The bePRO, BOREAS,
HEIPRO, and PriAM algorithms retrieve only a small enhancement of the aerosol extinction
(

The retrieved extinction profiles for scenario AER10, which consists of a
cloud above 5 km altitude and no aerosols elsewhere, are very similar to the
cloud- and aerosol-free scenario AER0, although the RMS difference between
true and retrieved profiles is higher for AER10 than for AER0 in some cases.
As already discussed in Sect.

Box–whisker plots of the retrieved HCHO

As can be seen from the width of the 50 % and 90 % confidence intervals
(shaded areas and error bars in Figs.

Note that for MAPA the spread of the retrieved profiles is far smaller when only filtered results are considered (see Figs. S1–S4), indicating that
the MAPA filter is successfully removing outliers. This requires, however,
excluding 17 % and 37 % of the aerosol profiles at 360 and 477 nm,
respectively, as well as 47 % of the HCHO profiles and 44 % of the

Slope, intercept, regression coefficient, and RMS for the correlation between true and retrieved AOT
as well as HCHO and

Box–whisker plots of the retrieved surface aerosol extinction at 360 nm

As a result of a poor convergence of modelled and measured dSCDs (see Sect.

PriAM and NASA underestimate the aerosol extinction at 477 nm with slopes of
only 0.72 and 0.63, respectively. Furthermore, PriAM falsely retrieves a
non-existent uplifted aerosol layer around 1 km in altitude with a peak
extinction

The retrieval is generally more stable for trace gases (Figs.

With the exception of

The MAPA, MARK, and NASA algorithms retrieve profiles close to zero for the trace-gas-free atmosphere (TG0), whereas the OEM algorithms either exhibit a slight bias towards the a priori (HEIPRO, MMF, PriAM) or oscillate around zero (bePRO, BOREAS) for this scenario. Sensitivity studies based on the MMF algorithm have shown that these oscillations are suppressed if the logarithm of the profile is retrieved, as is the case for HEIPRO, MMF, and PriAM. This representation also prevents the retrieval of negative values, which occur for bePRO and BOREAS, in particular for TG0 and TG4.

Except for bePRO

Same as Fig.

Slope, intercept, regression coefficient, and RMS for the correlation between true and retrieved
aerosol extinction as well as HCHO and

In this section, the ability of the retrieval algorithms to retrieve aerosol
and trace gas total columns is discussed. Box–whisker plots comparing
retrieved and true AOT at 360 and 477 nm are shown in Fig.

Time in seconds for the retrieval of a single profile on a single CPU core for each participant, separated by the target species as denoted in the legend.

The total column of both trace gases and aerosols is retrieved accurately by
most algorithms. Except for foggy conditions (AER8), there is little
dependency of the accuracy of the retrieved trace gas VCD on the aerosol
profile. As expected from the limited sensitivity to high altitudes, the
total columns from OEM algorithms tend to be biased towards the a priori. For
the aerosol and trace gas free scenarios AER0 and TG0, a positive bias of
(0.02–0.06) for AOT and (0.2–0.35)

The parametrised algorithms MAPA and MARK accurately retrieve the total
column in most cases. Exceptions are the AER6 and AER7 scenarios (1 km box
profile and uplifted profile), where both algorithms show a positive bias.
Both MAPA and MARK show a significant scatter of the AOT for AER0, AER1,
AER2, AER7, and AER10 that, in the case of MAPA, is reduced by filtering out a
significant fraction of the data (

In this section, the agreement between true and retrieved surface extinction
and surface concentration (i.e. the values in the lowermost layer of the
respective profiles with a thickness of 200 m) is discussed. As for the
total column discussed in the previous section, Figs.

Surface aerosol extinction and trace gas profiles are generally well
reproduced. For bePRO retrievals in the visible, however, a negative slope
and no significant correlation (

For the prescribed profile scenarios, retrieved aerosol surface extinction
(mean regression coefficient from all algorithms

In order to assess the numerical performance of the different retrieval
algorithms, the duration for a single profile retrieval was reported by each
participant. For multiprocessor systems, the total time has been multiplied
by the number of processor cores used for the retrieval. It is important to
note that the retrievals were performed individually by each participant,
using computers of differing performance. A more accurate comparison would
require running all algorithms on the same computer, which is outside the scope
of this study. The results of the benchmark test are shown in Fig.

The OEM algorithms, which rely on online radiative transfer calculations, require between 4 s (MMF) and 23 s (PriAM) for the retrieval of a single trace gas profile. The duration for the retrieval of an aerosol profile ranges between 6 s for MMF and more than 3.5 min for BOREAS. The large range of computational effort for aerosols probably results from the different approaches for the calculation of the weighting functions. The BOREAS aerosol retrieval relies on radiative transfer simulations at several wavelengths, resulting in the lowest computational performance. It is followed by HEIPRO, whose aerosol weighting function calculation is based on the finite difference method, leading to about 1 min for an aerosol retrieval. MMF and bePRO are significantly faster since they rely on analytically calculated aerosol weighting functions.

The parametrised algorithms MAPA and MARK show significant differences in computational performance, although both rely on LUTs for the weighting functions. MARK requires 13 and 24 s for aerosol and trace gas retrievals, respectively, while MAPA aerosol and trace gas retrievals are executed within 3 and 2 s, respectively.

The NASA algorithm, which does not rely on radiative transfer modelling but on an analytical approach, is outstanding in terms of computational performance. The retrieval of a single aerosol or trace gas profile requires less than 5 ms, and is thus almost 3 orders of magnitude faster than the second fastest algorithm, MAPA.
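The stated speed-up can be checked with a short calculation based only on the timings quoted in this section. The variable names are illustrative, and the 5 ms value is the upper bound given for the NASA algorithm, so the resulting factors are lower limits.

```python
import math

nasa_upper_bound_s = 5e-3                         # "less than 5 ms"
mapa_s = {"aerosol": 3.0, "trace gas": 2.0}       # MAPA timings quoted above

speedup_orders = {}
for species, duration_s in mapa_s.items():
    factor = duration_s / nasa_upper_bound_s
    speedup_orders[species] = math.log10(factor)
    print(f"{species}: at least {factor:.0f}x faster "
          f"({speedup_orders[species]:.1f} orders of magnitude)")
```

Both values fall between 2.5 and 3 orders of magnitude, consistent with the statement above.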

Eight different algorithms for the retrieval of aerosol and trace gas
vertical profiles from MAX-DOAS measurements have been compared under a large
variety of atmospheric conditions by using synthetic measurements. Both OEM
and parametrised algorithms, and also the analytic approach by NASA, show
equally good performance in terms of the reproduction of the true atmospheric
state, with a typical accuracy (in terms of RMS difference between true and
retrieved state) of (0.08–0.25) km

There are only a few exceptions to this high level of agreement between the
retrieved atmospheric state from the different algorithms. As a result of a
lack of convergence between true and modelled slant column densities, bePRO
profiles are subject to a high degree of instability in the visible
wavelength range for the aerosol scenarios AER0, AER8, AER9, and AER10 and to
a lesser extent also AER4. Up to 25 % of the bePRO profiles need to be
discarded in order to achieve an accuracy similar to the other algorithms.
However, bePRO performs well when convergence is reached. The synthetic data
used for the study are not necessarily representative of real measurements,
especially in terms of dSCD errors, and sensitivity tests performed by increasing
the dSCD errors but also previous publications

Aerosol AVKs from BOREAS differ from those of other OEM retrievals as
additional regularisation is applied. They can therefore not be compared to
those from the other retrievals and also do not comply with Eq. (

OEM algorithms tend to produce profiles biased towards the a priori, in particular at high altitudes where the sensitivity to the atmospheric state is small. OEM algorithms retrieving the logarithm of the target parameters show a higher degree of stability with fewer oscillations than those operating in linear space. Parametrised algorithms do not suffer from this disadvantage, but their results can be widely spread when the sensitivity to the atmospheric state is low, which is particularly the case at high altitude or above layers with high extinction. However, despite these conceptual differences, the overall accuracy of OEM and parametrised algorithms is very similar.

Based on an analytical approach without using RTM calculations, the NASA
algorithm is by far the fastest, with the retrieval of a single profile
requiring less than 5

In summary, it can be concluded that, with only a few exceptions, the
algorithms presented here are capable of realistically retrieving aerosol and
trace gas profiles in the lowermost

As a result of this study, the MMF and the MAPA algorithms, both showing the best
performance in terms of reconstruction of the atmospheric state and
computational speed, were selected as profile algorithms for the FRM

A detailed comparison of vertical profiles of trace gases and aerosols from
MAX-DOAS field measurements performed during the CINDI-2
(

The bePRO and HEIPRO algorithms are available from the authors upon request (Francois Hendrick, francois.hendrick@aeronomie.be, and Udo Frieß, udo.friess@iup.uni-heidelberg.de).

The reference database of synthetic dSCDs is available on
the FRM

The supplement related to this article is available online at:

The intercomparison strategy and retrieval settings were developed by FH and UF. Forward modelling of dSCDs and retrieval of vertical profiles was performed by UF, SB, LAB, TB, MMF, FH, AP, AR, MvR, ES, TV, TW, and YW. The visualisation of the data was performed by UF and J-LT. UF performed the statistical analysis of the data. VVR developed the SCIATRAN radiative transfer model.

The authors declare that they have no conflict of interest.

The funding of this study by the FRM

This paper was edited by Andreas Hofzumahaus and reviewed by three anonymous referees.