Introduction
In this study, the consistency of differential optical absorption
spectroscopy (DOAS) retrievals of tropospheric nitrogen dioxide (NO2) from
ground-based scattered light observations is evaluated. NO2 is released
into the atmosphere predominantly in the form of NO as a result of combustion
processes at high temperatures. Through reaction of NO with ozone (O3),
NO2 is rapidly produced and (during the day) back-converted into NO by
photolysis. Therefore, nitrogen oxides are often discussed in terms of NOx
(NO + NO2 = NOx). NO2 is a key species in the formation of tropospheric
ozone and a prominent pollutant in the troposphere, causing (together with
aerosols) the typical brownish colour of polluted air. In addition, it is
harmful for lung tissue and a powerful oxidant. In the lower troposphere, the
lifetime of NO2 is short (several hours) due to reaction with OH and
photodissociation; thus it is mostly found close to its sources, making
it a good tracer of local pollution. Anthropogenic sources such as
burning fossil fuel in industry, power generation and traffic as well as
biogenic sources including bush and forest fires contribute to the
tropospheric NOx loading. In addition, NO2 is released from soil microbial
processes and lightning events . As a result, high NO2
amounts are mostly observed above industrialized and urban areas, traffic
routes and over bush fires.
Using its characteristic absorption bands in the UV and visible spectral
range, NO2 has been successfully measured for many years using the DOAS
technique , both from the ground and
from space e.g.. In addition, airborne and
car-based measurements e.g., have
been taken to close the gap between ground-based observations providing
continuous temporal, but poor spatial resolution and satellite measurements
which offer global observations, but only up to one measurement per day above
each location. Using ground-based multi-axis (MAX)-DOAS measurements at
different elevation angles, a more accurate vertical column (VC) can be
retrieved with information on the vertical distribution of NO2 and other
trace gases in the troposphere e.g..
As result of the viewing geometry, ground-based MAX-DOAS observations are
most sensitive to the lowest layers of the troposphere. Here they provide
high sensitivity and low relative measurement errors. Averaging over longer
integration times can further reduce statistical noise. Another advantage is
that MAX-DOAS stations can be operated automatically and usually require
little maintenance. MAX-DOAS measurements were therefore taken in remote
regions for the investigation of background concentrations and in many
locations for the validation of satellite observations
e.g..
NO2 slant columns and fit residual root mean square (rms) obtained from the IUPB
retrieval code for the intercomparison day used in this study. Different elevation angles are
colour coded. The fit settings correspond to v1 settings as described in Sect.
(Table ).
(a) Example of the fitted NO2 differential optical density in 2∘ elevation angle
at 14:40 UT corresponding to a slant column of ≈ 7E16 molec cm-2 (compare Fig. ).
The solid line is the NO2 differential optical density (= differential cross section, i.e. the cross section
after subtraction of a fitted polynomial, multiplied by the retrieved slant column) and the dashed line is the
solid line plus the fit residual. The NO2 differential optical density is much larger than the residual,
which is explicitly shown in (b).
There currently exists a variety of different instruments and retrieval codes
designed to perform and analyse MAX-DOAS measurements. While the basic
approaches are similar, differences exist which can potentially lead to
inconsistencies in measurements from different instruments and retrievals. As
this is an important limitation for the use of MAX-DOAS measurements in a
global observing system, the Network for the Detection of Atmospheric
Composition Change NDACC (formerly known as NDSC) has organized several
intercomparison campaigns aimed at comparing instruments, retrievals and
uncertainties of a wide range of DOAS instruments. The first of these
campaigns were focused entirely on zenith-sky observations for stratospheric
absorbers but later also
included other viewing directions . The Cabauw
Intercomparison of Nitrogen Dioxide measuring Instruments (CINDI) campaign in
2009 in Cabauw, the Netherlands was the first to also have a focus on
MAX-DOAS observations of tropospheric species and has recently been followed by the CINDI-2 campaign carried
out in 2016, also in Cabauw. However, these intercomparison exercises
concentrated mostly on results originating from different retrieval codes and
instruments, and separation of instrument and retrieval effects was not
easily possible. In some cases, synthetic spectra were used to intercompare
retrieval algorithms, and while such tests can highlight differences between
retrieval approaches, they give no insight into the way different retrieval
codes deal with instrumental effects such as intensity offsets, resolution
changes or spectral drifts.
The present study was carried out in the framework of the European FP7 project
QA4ECV (Quality Assurance for Essential Climate Variables), which aims at
providing quality assurance for satellite-derived ECVs such as NO2, HCHO
and CO by characterizing the uncertainty budgets through uncertainty analysis
and error propagation but also by validation with external data sets. In this
context, ground-based MAX-DOAS measurements can play an important role, and
harmonization of the retrieval approaches and quality assurance for the
reference measurements are needed. The results of the analysis in
the QA4ECV project are expected to contribute to the harmonization of data
from MAX-DOAS instruments, in particular for the ongoing integration of such
measurements in the NDACC network.
The work reported here overcomes limitations from previous studies by using
real measurements originating from a single instrument. This facilitates the
study of the agreement between different retrieval codes on real data without
instrumental biases between results from different groups. The
intercomparison was carried out on spectra recorded by the University of Bremen
(IUPB) instrument during the Multi-Axis DOAS Comparison campaign for Aerosols
and Trace gases (MAD-CAT) carried out in Mainz, Germany, in summer 2013
(http://joseba.mpch-mainz.mpg.de/mad_cat.htm). Data were distributed to
17 international groups working on ground-based DOAS applications. Each group
analysed these spectra using their own DOAS retrieval code but prescribed fit
settings for fitting window, cross sections, polynomial and offset
correction. Resulting slant columns from all groups were then compared to
IUPB results chosen as the arbitrary reference, evaluating the level of
agreement and systematic differences and investigating their algorithmic
origins. With this set-up, nearly all sources of disagreement were removed,
and only those differences between retrieval codes were investigated, which
are not usually reported.
This intercomparison exercise concentrates on 1 day (18 June 2013) of
the MAD-CAT campaign, which had the best weather and viewing conditions. As an
example, Fig. shows NO2 slant columns (left) and fit
rms residual (right) retrieved with the IUPB software. It can clearly be seen that
NO2 slant columns measured at different elevation angles are separated as
a result of differences in the light path. It should be mentioned that NO2
slant columns are relatively large as a result of anthropogenic pollution at
the densely populated and industrialized measurement location and thus the
findings of this study correspond to polluted urban environments. However,
these are normally the ones of interest for NO2 observations.
Figure also shows that the rms of the fit residuals (in
the following simply denoted rms) separates with elevation angles as well.
In addition, the shape of the rms in small elevation angles is very similar
to that of NO2 slant columns, indicating that predominantly NO2-related
effects such as the wavelength dependence of the NO2 AMF (which was not
included in the fit shown here) limit the fit quality (note that this study aims
not at finding best NO2 fit settings but studying disagreements
originating from different retrieval codes). This is supported by the
observation that the shape of NO2 slant columns is not seen in the rms for
larger elevation angles as effects related to the NO2 absorption are
smaller and other effects limiting the fit quality dominate. In addition, the
rms is very large in the early morning for small elevations, which is
discussed in detail later, and was found to result from pointing towards sunrise. However,
in general NO2 is retrieved very accurately as Fig. demonstrates, showing the NO2 optical density and fit
residual for a measurement in 2∘ elevation angle (this is a typical
example fit).
The paper is structured in the following way:
Sect. provides details about the MAD-CAT campaign,
measurements and the NO2 intercomparison exercise as well as participating
retrieval codes. Intercomparison results between groups are presented in
Sect. . Sensitivity studies of non-harmonized retrieval
aspects are described in Sect. , attempting to reproduce observed differences between groups and
evaluating the impact on NO2 slant columns and DOAS fit quality. The paper ends with conclusions and
recommendations for better harmonization of ground-based MAX-DOAS
measurements.
Measurements
DOAS technique
The DOAS technique is based on Beer–Lambert law, which describes the
attenuation of light passing through a medium. Here, it is applied to
measurements of scattered sunlight. The spectral attenuation caused by
scattering is smooth in wavelength (e.g. λ-4-dependence for
Rayleigh scattering) while molecular absorption often has structured spectra.
In DOAS, the total spectral attenuation is therefore split into a
high-frequency part consisting of the (high-frequency components of) trace
gas absorptions and a low-frequency part that accounts for elastic
scattering,
which is described by a low-order polynomial, also compensating for intensity
changes, e.g. caused by clouds. In addition, the effect of inelastic
scattering, which is predominantly due to rotational Raman scattering, known
as the Ring effect is accounted for by a
pseudo cross section e.g.. Similarly, intensity
offsets mostly resulting from stray light within the spectrometer are
accounted for by pseudo cross sections σoff (see
Sect. for more details). In total, the optical depth
τ is approximated by
τ=lnI0I=∑iσi⋅SCi+σRing⋅SCRing+σoff⋅SCoff+∑papλp+r,
where the first sum is over all i absorbers having cross section
σi, the polynomial degree is p, and the residual term r contains
the remaining (uncompensated) optical depth. An important quantity used
within this study to identify and evaluate differences between DOAS retrieval
codes is the root mean square of the fit residual r (for simplicity denoted
rms in the context of this study), which is a measure of the fit quality.
In Eq. (), known as the DOAS equation, all
quantities with the exception of ap depend on wavelength λ. For
tropospheric absorbers, the spectrum I is normally taken at small elevation
angles above the horizon where the tropospheric light path is large. The
reference spectrum I0 is normally a zenith spectrum either at small solar
zenith angle (SZA), which is called a noon reference fit in the following, or
close in time to the measured spectrum I, in the following called
a sequential reference fit.
The DOAS equation is usually an overdetermined problem (τ is measured
at more wavelengths λ than unknowns exist in
Eq. ) and solved by means of a least-squares fit (see
also Sect. ), i.e. minimizing the residual term. The
resulting fit factors are the polynomial coefficients ap and the so-called
slant columns SCi, which are the quantities of interest. The slant column
is the integrated concentration ρi of absorber i along the effective
light path s (for simplicity we use the SC also for the fit coefficients of
the Ring and offset spectra):
SCi=∫ρids.
Note that this is a simplification as normally an ensemble of different light
paths contribute to the measurement. In extreme cases, the absorption optical
depth can become a non-linear function of the trace gas concentration.
However, this is usually only of importance for satellite limb measurements
and is discussed in detail, for example, in . A comprehensive
discussion of the DOAS technique can be found, for example, in
.
The MAD-CAT campaign
The Multi-Axis DOAS Comparison campaign for Aerosols and Trace gases
(MAD-CAT) was carried out in Mainz, Germany, in summer 2013. During the
intensive phase of the campaign (7 June to 6 July 2013), 11 groups deployed
MAX-DOAS instruments on the roof of the MAX-Planck Institute for Chemistry at
the Mainz University campus. Being located in the densely populated
Rhine-Main region, the observations are dominated by anthropogenic pollution,
predominantly by NO2 (in the visible). The main azimuthal viewing
direction was 51∘ (from the north), pointing roughly in the direction of
the city of Frankfurt ≈ 30 km away. Series of vertical scans
comprising elevation angles of 0, 1, 2, 3, 4, 5, 6, 8, 10 and 30∘ were
performed. In each direction, single measurements of varying exposure time
depending on illumination were integrated for 20 s (for the IUPB instrument,
other instruments used different integration times). Between vertical scans,
multiple zenith measurements were taken (see Fig.
for an example of resulting NO2 slant columns). Detailed information about
MAD-CAT can be also found at
http://joseba.mpch-mainz.mpg.de/mad_cat.htm and the first campaign data
focusing on range-resolved distributions of NO2 measured at different
wavelengths were published by . Furthermore,
demonstrated the presence of vibrational Raman
scattering on N2 molecules in spectra measured during MAD-CAT. In
addition, publications based on MAD-CAT data are in preparation focusing on
HCHO , HONO and CHOCHO
.
The IUPB MAX-DOAS instrument
The IUPB MAX-DOAS instrument deployed in Mainz during the MAD-CAT campaign is
a two-channel CCD-spectrometer system measuring in the UV and visible. Within this exercise, only data from the visible spectrometer
are used as NO2 is best retrieved in this spectral range. The spectrometer
is an ANDOR Shamrock 303i, covering a spectral range from 399 to 536 nm at a
resolution of ≈ 0.7 nm. The spectrometer was actively temperature
stabilized to 35 ∘C. Spectra were recorded with a charge-coupled device (CCD) of the ANDOR iDUS
420 type with 1024 × 255 pixels (26 × 26 µm each), leading to a spectral
sampling of 7–8 pixels nm-1.
Light was collected by a telescope unit mounted on a commercial ENEO VPT-501
pan-tilt head allowing pointing in any viewing direction. Photons entering
the telescope through a fused silica window were focused by a lens on an
optical fibre bundle. The instrument's field of view (FOV) was
≈1.2∘. The Y-shaped optical fibre bundle (length 20 m)
connecting the telescope to both spectrometers consists of 2 × 38 = 76 single
fibres, minimizing polarization effects. A video camera inside the telescope
housing takes snapshots of every recorded spectrum for scene documentation,
and a mercury-cadmium (HgCd) line lamp allows for wavelength calibration
measurements. Dark current and slit function measurements are taken every
night. The same instrumental set-up has been used in previous campaigns, e.g.
CINDI and TransBrom .
Intercomparison exercise
Summary of fit settings used for the NO2 intercomparison. These fit settings were agreed on during the
MAD-CAT campaign and can be found also at http://joseba.mpch-mainz.mpg.de/mad_analysis.htm.
Fit
Reference
Window
Cross sections
Intensity offset
Polynomial
v1
noon
425–490
1, 2, 4, 5, 6, 7
constant (0th order)
5 (6 coefs)
v1a
sequential
425–490
1, 2, 4, 5, 6, 7
constant (0th order)
5 (6 coefs)
v2
noon
411–445
1, 3, 4, 5, 6, 7
constant (0th order)
4 (5 coefs)
v2a
sequential
411–445
1, 3, 4, 5, 6, 7
constant (0th order)
4 (5 coefs)
Cross sections
1
NO2 at 298 K , I0-correction using 1 × 1017 molec cm-2
2
NO2 at 220 K orthogonalized to 298 K within 425–490 nm
3
NO2 at 220 K orthogonalized to 298 K within 411–445 nm
4
O3 at 223 K
5
O4 Hermans et al., unpublished, http://spectrolab.aeronomie.be/o2.htm
6
H2O, HITEMP
7
Ring, NDSC2003
Spectra recorded by the IUPB MAX-DOAS instrument on 18 June 2013 during
MAD-CAT were distributed to partners. It is worth mentioning that this
intercomparison exercise was not restricted to groups participating in
MAD-CAT, as common observations were provided. 18 June 2013 was selected as
having good viewing and weather conditions. However, temperature
stabilization of the IUPB spectrometer was problematic due to unusually hot
weather which, due to a lack of air conditioning on that day, led to
the instrument overheating. As a result, spectral stability was not as
good as usual, providing the opportunity to investigate how different
retrieval codes deal with potential spectral shifts during the day. The
provided data comprised observations at several elevation angles in the main
azimuthal viewing direction of 51∘ (see Fig. for
an example). In addition, common trace gas cross sections as well as DOAS fit
settings were distributed as summarized in Table , which
are the common fit settings agreed on during the MAD-CAT campaign. All
groups then performed DOAS fits using these prescribed settings on IUPB
spectra using their own retrieval software. The influence of different
retrievals was then analysed, focusing on the impact on NO2 columns.
A brief description of retrieval codes is given in the following.
The participating groups (with abbreviations used within this study) and
retrieval codes they used are summarized in Table .
Summary of participating institutes (with abbreviation) and retrieval codes used by each group.
Abbr.
Retrieval(s)
Institute
IUPB
NLIN
Institute for Environmental Physics, University of Bremen, Germany
AUTH
QDOAS
Aristotle University of Thessaloniki, Greece
BIRA
QDOAS
Belgian Institute for Space Aeronomy, Brussels, Belgium
JAMSTEC
QDOAS
Japan Agency for Marine-Earth Science and Technology, Japan
Toronto
QDOAS
Department of Physics, University of Toronto, Ontario, Canada
IUPHD
DOASIS
Institute of Environmental Physics, University of Heidelberg, Germany
Boulder
QDOAS
University of Colorado, Boulder, USA
KNMI
KMDOAS
Royal Netherlands Meteorological Institute, De Bilt, The Netherlands
INTA
LANA
National Institute for Aerospace technology, Madrid, Spain
MPIC
MDOAS,
Max Planck Institute for Chemistry, Mainz, Germany
WinDOAS
CSIC
QDOAS
Department of Atmospheric Chemistry and Climate,
Institute of Physical Chemistry Rocasolano, CSIC, Madrid, Spain
NIWA
STRATO
National Institute of Water and Atmospheric Research, Lauder, New Zealand
IAP
RS.DOAS
A. M. Obukhov Institute of Atmospheric Physics, Russian Academy of Sciences, Moscow, Russia
BSU
WinDOAS
Belarusian State University, Minsk, Belarus
USTC
QDOAS
University of Science and Technology, Hefei, China
UNAM
QDOAS
National Autonomous University of Mexico, Mexico
NUST
QDOAS
Institute of environmental sciences and engineering (IESE),
National University of Sciences and Technology, Islamabad, Pakistan
NLIN
NLIN was originally developed at IUPB for the analysis of
ground-based measurements but over time extended and also used for airborne
and satellite data sets. The DOAS matrix is inverted using a singular value
decomposition (SVD) which is comprised in an iterative non-linear fit for
calibration of the wavelength axis using a Marquardt–Levenberg fit.
QDOAS
QDOAS is the multi-platform (Windows, Unix/Linux and
Mac) successor of WinDOAS. A coupled linear/non-linear least squares (NLLS)
algorithm (Marquardt-Levenberg fitting and SVD decomposition) is used to
solve the DOAS equation including a wavelength calibration module. During
this operation, the slit function can be characterized in addition to the
wavelength registration of the measured spectra. QDOAS is widely used within
the DOAS community and has been applied in different versions for the present
exercise: AUTH (version 2.109.3), JAMSTEC (version 2.109), University of
Toronto (version 2.109), CSIC (QDOAS windows 2.110.1 beta 20151123), BIRA
(version 2.110.1), USTC (version 2.109.4), UNAM (version 2.109.3), NUST
(version 2.111), CU Boulder (version 2.110). These various versions do not
differ in the core implementation of the DOAS analysis, but can feature
slightly different capabilities in terms of data handling or specific
retrieval functionalities (see
http://uv-vis.aeronomie.be/software/QDOAS/LastChanges.php).
LANA
LANA is a two-step iterative algorithm developed at INTA and used since 1994.
In the first step, cross sections and spectra are positioned and in the
second step the linear equations system is solved using a Gauss–Jordan
procedure. For this exercise, the intensity offset correction was based on
the reference spectrum I0.
MDOAS
MDOAS is a Matlab DOAS code developed at MPIC, but calibration and
convolution have been performed in WinDOAS. For sequential reference fits,
two versions, MPIC_MDa and MPIC_MDb, are included in
this exercise, which used a different treatment of the Ring spectrum.
KMDOAS
KMDOAS was recently (2013–2014) developed at KNMI and verified using QDOAS.
It is written in Python using standard modules (matplotlib, numpy, scipy,
pandas).
DOASIS
DOASIS has been used by IUPHD in its version 3.2.4595.39926. A detailed
explanation can be found in . In the version submitted here,
an additional shift of cross sections with respect to the optical depth was allowed
and a saturation correction was implemented.
STRATO
The NIWA-Strato package was originally (1980s) used for processing
zenith DOAS spectra, but extended as needed to handle MAX-DOAS measurements.
The fitting code uses least squares or optional SVD fitting inside iterations
applying shift, stretch and (optional) offset, for minimum residual. In
contrast to other groups, an internal equidistant wavelength grid is applied
using the average interpixel spacing.
RS.DOAS
RS.DOAS
was developed by IAP. The DOAS inversion is performed by LU decomposition.
The wavelength calibration uses several subwindows, and for each of them
a non-linear shift and stretch fit including all trace gases (resulting from
the linear DOAS fit) is applied. From a set of shifts obtained for the
subwindows a 2nd-order polynomial approximation of the shift dependency on
pixel number is constructed. For convoluting cross sections the QDOAS
software (offline version 2.0 at 5 March 2012) was used.
WinDOAS
WinDOAS is the precursor of QDOAS. It has been used by BSU
as well as MPIC for this intercomparison exercise (the MPIC submission is
denoted MPIC_WD, the reference has been selected by hand), but
only for noon reference fits.
Intercomparison results
Noon reference, 425–490 nm fit window (v1 fit parameters)
Results from v1 fit settings in 2∘ elevation angle as a function of time. (a, b) Absolute NO2 slant
column and rms differences, (c, d) relative NO2 slant column and rms differences,
(e) NO2 slant columns, (f) rms, (g, h) fitted spectral shift (h is a zoom-in
of g without INTA and KNMI
lines, but including the reanalysed INTA2 and NIWA2 lines). Panels (a)–(d) are differences with respect to IUPB, panels (e)–(h) are
absolute results and IUPB is explicitly shown in black. Triangular markers indicate groups using QDOAS,
circular markers indicate groups using independent software, and diamonds indicate groups that corrected
faults in their code during the course of this study and submitted a reanalysed data set.
Linear regression results (slope, intercept, correlation) for different elevation angles (with respect to IUPB) for fit
settings v1 (noon reference). Groups using QDOAS (in different versions) are indicated by a star (*).
Differences between groups for the 425–490 nm fit using a noon reference (v1
fit settings; see Table ) are shown in Fig. for individual measurements at an elevation angle of 2∘.
Small elevation angles above the horizon are associated with long
tropospheric light paths and are therefore important for the detection of
tropospheric absorbers. Differences shown in Fig. are
relative to IUPB results. For the objective of identifying retrieval-code-specific effects, the use of a single retrieval code as a reference seems
advantageous in comparison to using the mean of all retrieval codes which
would average over all retrieval-specific features. Note that this does not
exclude IUPB from the intercomparison as problems with the IUPB retrieval would
be easily detected, leading to the same systematic patterns in all lines
shown in Fig. .
Absolute differences (institute-IUPB) and relative differences (absolute
difference/IUPB) of NO2 slant columns and fit rms are shown in
Fig. a–d. In general, NO2 slant column differences are
in the range of ±2–3 ×1015 molec cm-2 or < 2 %. This is
about a factor of 2–3 larger than NO2 slant column errors, which are
typically <1×1015 molec cm-2 or < 0.6 % for 2∘
elevation. A clearly enhanced disagreement is observed for the first data
point in the INTA time series as well as for MPIC_WD. The latter
could be linked to the reference spectrum, which in this case is not the one
with the smallest sun zenith angle (SZA), while the outlier in the INTA
time series was identified to arise from different implementations of the
intensity offset correction (see Sect. ). Note that
these NO2 differences for individual measurements are much smaller than
the variability (diurnal cycle) of NO2 and thus almost invisible in
Fig. e where no differences but absolute NO2 slant
columns from each groups (including IUPB in black) are plotted.
Interestingly, many groups show a smooth behaviour (constant offset) in
absolute NO2 differences (Fig. a), which is mostly an
effect of the choice of the reference (see Sect. ),
while relative differences (Fig. c) reflect the shape of
NO2 slant columns in Fig. e (smaller slant columns lead
to larger relative differences and vice versa). However, some groups show a
smooth line, not for absolute differences but for relative differences. Thus, two types
of disagreements are observed, (1) constant in absolute differences and (2) constant in
relative differences. These two types are linked to differences in retrieval
codes, which are investigated in detail in Sect. .
Absolute rms differences (with respect to IUPB) in Fig. b show the
same shape as NO2 slant columns. This is because at small elevations the
rms itself reflects the shape of the NO2, which was already demonstrated
and discussed in Fig. . A better measure for the
identification of differences between retrieval codes is thus the relative
rms disagreement shown in Fig. d. Interestingly, the first
data point for INTA, showing a large disagreement with IUPB and other groups
in NO2 slant columns is prominent in absolute rms differences as well, but
not in relative rms differences. The reason is that the rms in the morning is
very large (Fig. f, compare also to Fig. ) and thus decreases the relative difference. Remarkably, relative
rms differences are found up to 80–100 %, which is substantially more than
NO2 slant column differences (only a few percent).
The IUPB spectra provided were wavelength precalibrated using nightly HgCd
line lamp measurements as explained in Sect. . However,
the DOAS fit quality can be improved (rms reduced) by applying a
postcalibration. In addition, the night-time calibration can change during
the day as a result of temperature drifts. This is accounted for by DOAS
retrieval codes in terms of a non-linear shift fit (for a more detailed
discussion see Sect. ). Figure g and h (being a zoomed-in image of g) show the reported shift resulting
from the wavelength calibration in participating retrieval codes. Note that
absolute values for the shift are shown here and the IUPB result is
explicitly included. For an ideal spectrometer without drifts, only a very
small shift would be expected and it would be caused by a non-commutivity of convolution and
DOAS polynomial, which is known as the tilt effect and typically in
the order of less than 1–2 pm, depending on the instrument resolution
. The shift retrieved here is
larger than that and driven by the overheating of the system on this day. In
general, time series of shifts shown in Fig. g and h agree
well in shape. A small offset is observed for NIWA, who successfully found
and corrected a fault in their retrieval code (related to the intensity
offset calculation) in the course of this exercise. The DOAS fit was then
repeated using the corrected retrieval code and the result, called NIWA2, is
included in Fig. . Apparently, NIWA2 shows a much better
comparison than other groups (most obvious in the fitted spectral shift) and
performs better because the rms in Fig. d is much smaller than
before. Similarly, INTA fixed a problem in their retrieval code related to
the wavelength calibration. The reanalysed INTA data are denoted INTA2 in
Fig. . The INTA2 shift is comparable to other groups,
while large disagreements were seen before, and the rms is largely reduced in
comparison to the first INTA submission. Notably, almost no differences
between INTA and INTA2 are observed in terms of NO2 slant columns; in
particular the first data point is still outlying. This indicates (1) that
the wavelength calibration has a large impact on the fit rms but only a minor
impact on the NO2 slant columns, and (2) that the disagreement of the
first data point is not related to the wavelength calibration. Despite INTA
and NIWA, only KNMI is fitting a rather different shift (and provides the
shift only in 0.01 nm resolution which leads to the displayed discrete
steps). This is caused by different definitions of the shift (see
Sect. ): KNMI is fitting the shift of the
optical depth relative to the cross sections while the other groups are
fitting the shift of I relative to the reference spectrum I0. As
a result, the shift of KNMI and other groups cannot be expected to match.
The agreement between groups, including other elevation angles, has been
quantified by linear regression analyses to correlation plots (of each group
relative to IUPB results, not shown). Resulting slopes, offsets and
correlation coefficients are summarized in Fig. and
colour-coded for different elevation angles. As expected, the correlation
coefficient is > 99.98 % with the only exception of the INTA (and INTA2)
2∘ elevation which is predominantly caused by the outlier already seen
in Fig. . The slope ranges between 0.985 and 1.01, the
offset between -4 and 2.5 ×1015 molec cm-2. Apart from USTC and
INTA, no large separation of the slope and offset with elevation angle is
observed. An important observation is that groups using the same retrieval
code (QDOAS) do not necessarily show the same systematic behaviour in
Fig. , implying that the influence of remaining fit
parameters that are different from the harmonized general settings in
Table is still larger than the effect of the specific
retrieval software used. For example, the best agreement in terms of slope,
offset and correlation coefficient is found between IUPB, AUTH, IAP and CU
Boulder, which all use different retrieval codes.
Other fit parameters
Results from v1a fit settings (sequential reference spectrum) on 2∘ elevation angle
data as a function of time. (a, b) Absolute NO2 slant column and rms differences, (c, d) relative NO2 slant
column and rms differences. All differences are relative to IUPB. Triangular
markers indicate groups using QDOAS, circular markers indicate groups using independent software,
and diamonds indicate groups that corrected faults in their code during the course of this study and submitted a
reanalysed data set.
The agreement between groups was investigated in the same way when using
a sequential reference (denoted v1a fit setting; see
Table ) instead of a noon reference spectrum (v1 fit). A
sequential reference is often preferred if tropospheric absorbers are of
interest as stratospheric effects are removed to a large extent.
Figure is similar to Fig. , but some
groups are missing here (e.g. IAP) as they only provided noon reference fits.
The range of disagreements for individual NO2 slant columns is up to
≈ 8 % and therefore larger than NO2 disagreements using a noon
reference (v1 fit). In addition, neither absolute nor relative differences
between groups are smooth (Fig. a and c). The reason is
that, in contrast to the noon reference, a different (zenith) I0 spectrum
is used for every scan and thus the impact of details of the implementation
of the reference changes from scan to scan. Different implementations of the
reference spectrum are further investigated in Sect. and were found to be the major reason for NO2 disagreements
between groups. The outlier (first data point) seen for v1 (noon) fit
settings is present for v1a as well but it is not prominent here as
fluctuations for the above-mentioned reason are of same magnitude.
In terms of rms, differences between groups are comparable to v1 results and
as large as ≈ 80 %.
Compared to the noon reference fit, correlations from linear regressions (not
shown) are smaller, especially for the 30∘ elevation angle, mainly
resulting from the larger disagreements as explained above (but also partly
expected as slant columns are smaller using a sequential reference).
Correlations are still > 99.2 % for 30∘ elevation and even
> 99.8 % for smaller elevations.
In addition to the 425–490 nm fit (v1 and v1a), a smaller fit window used
within the MAD-CAT campaign was intercompared (v2 and v2a using a spectral
range of 411–445 nm; see Table ). Results of these fits
are not explicitly shown as mainly confirming observations and findings
above. Typical correlation coefficients (as well as offsets and slopes) are
summarized in Table for all fits.
A very small tendency of better agreement between groups if using the larger
fit window is seen. Although this should not be overinterpreted, it could be
caused by more information being present in the large fit window and
therefore more accurate results (fewer statistical fluctuations).
Correlations, slopes and offsets from linear regressions on NO2 slant columns between each group and IUPB
(neglecting the INTA 2∘ outlier).
Fit
Correlation
Slope
Offset
(%)
(1 × 1015 molec cm-2)
v1
> 99.98
0.985 to 1.01
-4 to 3
v1a
> 99.2
0.96 to 1.01
-1.5 to 1
v2
> 99.94
0.985 to 1.005
-4 to 3
v2a
> 99.2
0.96 to 1.01
-2 to 1
Sensitivity studies of non-harmonized retrieval aspects
The observed differences between the harmonized MAD-CAT retrieval results
prompted the team to systematically investigate effects of the non-harmonized
aspects of the retrievals, with the aim to derive a key set of best practice
recommendations. A survey suggested five potential sources of differences in
the DOAS fit results: (1) the selection/calculation of the reference spectrum,
predominantly for sequential reference fits, (2) treatment of the slit
function, (3) the intensity offset correction, (4) differences in the numerical
calculation of the DOAS fit (linear fit) and (5) differences in the wavelength
calibration (non-linear fit). In the following, tests using the same
retrieval code (IUPB) have been performed to characterize the impact of each
of the above-mentioned systematic differences.
Effect of the reference spectrum
Tests performed to study the influence of different reference spectra.
Test
Reference (fit setting)
Remarks
TR0
noon (v1)
first spectrum of smallest SZA
TR1
noon (v1)
second spectrum of smallest SZA
TR2
noon (v1)
first for morning, second for afternoon
TR3
noon (v1)
average of both
TR4
sequential (v1a)
closest zenith in time
TR5
sequential (v1a)
closest before the scan
TR6
sequential (v1a)
closest after the scan
TR7
sequential (v1a)
average of before and after
TR8
sequential (v1a)
interpolation of before and after to time of measurement
Different test results (see Table ) in 2∘ elevation angle as a function
of time. Top panel shows absolute NO2 SC differences, middle shows relative SC differences, bottom shows relative rms differences.
Left is for noon reference (differences are with respect to IUPB v1 fit results), right is for sequential references
(differences are with respect to IUPB v1a fit results).
In the IUPB spectra provided to intercomparison partners, two different
zenith spectra at the same smallest SZA were reported, which is of course
non-physical. Actually, the second zenith spectrum is the one with the
smallest SZA, but for rounding reasons (the SZA was a 4-digit number in the
input data provided), both spectra had the same SZA. Consequently, for the
noon reference fits v1 and v2 within this intercomparison exercise, four
options exist: (1) taking the first zenith spectrum of smallest SZA, (2) taking
the second one, (3) taking the first for the morning and the second for the
afternoon (i.e.
always taking the closest in time) and (4) taking the average of both
spectra. Similarly, different options exist for calculating the sequential
references for fits v1a and v2a: (1) always taking the zenith spectrum closest
in time, (2) always taking the last zenith spectrum before the actual
measurement, (3) always taking the next zenith spectrum after the measurement,
(4) taking the average of the two (before and after) and (5) interpolating the
two zenith spectra to the time of the actual measurement.
All different options for noon and sequential reference fits were evaluated
using the IUPB retrieval code NLIN (Table ).
Figure shows the resulting absolute and relative slant
column differences (top and middle) as well as relative rms differences
(bottom) for noon reference (left) and sequential reference (right) with respect to
v1 or v1a fit results.
Taking another noon reference spectrum results in a constant offset in
absolute NO2 differences (Fig. , top left). Test TR0
(using the first spectrum) yields the same results as the IUPB v1 fit from
Sect. , which uses the first zenith spectrum as well. In
contrast, using the second zenith spectrum as a reference (TR1) results in
a constant offset of 1.5 × 1015 molec cm-2 (0.5–2.5 % in relative differences,
depending on the actual NO2 slant column), because the second zenith
spectrum apparently had a smaller NO2 content. The change of the NO2
content could be related to changes of the atmospheric NO2 amount or the
atmospheric light path. In terms of rms, TR1 is up to 3 % larger
(Fig. , bottom left). The main reason for this is probably
the larger NO2 slant columns as associated effects like the
wavelength dependence of the NO2 slant column were not
compensated in this intercomparison exercise as discussed in Sect. .
Test TR2 yields results which are identical to TR0 am and TR1 pm values. This
is not seen for any groups in Fig. above; i.e. this
option is apparently not present in any retrieval code. Not surprisingly, TR3
(averaging both zenith spectra) yields results which are between TR0 and TR2.
In contrast to noon reference tests, sequential references show no smooth
behaviour, neither for absolute nor for relative differences
(Fig. , right). The reason is that for each vertical
scanning sequence (from which only the 2∘ elevation is shown here and
in Fig. ) another reference spectrum is used and
consequently reference-related differences between groups are also changing
from scan to scan. Note that almost no difference can be seen in
Fig. between TR4 and TR5 as the 2∘ elevation angle
is shown here and consequently the closest zenith measurement in time is
normally the one before the scan as IUPB measurements proceed from low to
large elevations. TR8 (interpolation to the measurement time) resembles the
sequential reference treatment normally implemented in the IUPB code; thus the TR8 line is zero. For TR6 and TR7, absolute and relative differences
(up to 8 %) are remarkably similar to observed differences between groups
using v1a fit settings, both in shape and in absolute values (compare to
Fig. ).
To conclude, the exact treatment of reference spectra is the major reason for
observed NO2 discrepancies between groups in Sect. ,
causing differences of up to 8 %. Unfortunately, no clear recommendation can
be derived from relative rms differences in Fig. (bottom
right) as all lines scatter around zero. However, the relative difference in
rms can be as large as 6 % and in general, using a single zenith reference
spectrum before or after the scan tends to produce larger rms. However, while
the reference treatment explains the majority of NO2 disagreements, it
cannot explain the large rms differences (up to 100 %) between groups.
Slit function treatment
Different treatments of the slit function.
Test
Sub. offset
Geom. centring
Cut-off value
Fit line parameters
TS0
no
yes
0.0
no
TS1
no
no
0.001
no
TS2
yes
no
0.001
no
TS3
no
yes
0.001
no
TS4
yes
yes
0.001
no
TS5
no
no
0.0
no
TS6
no
no
0.001
Gaussian
TS7
no
no
0.001
Gaussian (μ=0)
TS8
no
no
0.001
Gaussian (μ=0) fitted on 0.01 nm grid
TS9
no
no
0.0
Gaussian
TS10
no
no
0.0
Gaussian (μ=0)
TS11
no
no
0.0
Gaussian (μ=0) fitted on 0.01 nm grid
Examples of different slit function treatments. Blue is the measured slit function (from HgCd line at ≈ 480 nm).
Red is the fit of a Gaussian shape with μ=0. Light green is the Akima interpolation of the red line to a finer 0.01 nm grid.
Dark green is the interpolation first (linearly) to 0.01 nm and fit of a Gaussian shape afterwards.
NO2 slant columns and rms differences with respect to IUPB v1 fit results in 2∘ elevation
angle for different treatments of the slit function.
The slit function distributed to intercomparison participants originated from
HgCd line lamp measurements made in the night before 18 June 2013. It was
preprocessed in terms of subtraction of the (also measured) dark signal,
centred and provided on an equidistant 0.1 nm grid.
While some groups used this slit function as is, other retrieval codes
include a further online processing of the slit function, e.g. by
fitting line parameters. In order to evaluate resulting differences, trace
gas cross sections have been convolved offline (before the fit) using
different treatments of the slit function as summarized in
Table . Again, the IUPB retrieval code NLIN has been
used then performing the DOAS fit.
Examples of different treatments of the slit function are shown in
Fig. . The original slit function is displayed in
blue. If fitting a Gaussian shape to it, the maximum is not exactly centred
around zero, presumably caused by a small asymmetry. The slit function after
fitting a Gaussian shape and forcing centring (i.e. μ=0) is shown in
red (as used in TS7 and TS10). Furthermore, performing a discrete convolution
of cross sections requires the same wavelength sampling; i.e. the slit
function has to be interpolated to the cross section grid, which was 0.01 nm.
Here, an Akima interpolation has been applied (green line). However, if the
original slit function is first linearly interpolated to the required 0.01 nm
grid and then a Gaussian shape is fitted, a slightly different result is
obtained, shown in dark green (as used in TS8 and TS11). Note that often not
a discrete convolution but a (faster) convolution using Fourier
transformation is implemented in DOAS retrieval codes (see below).
Figure shows resulting absolute and relative
differences in NO2 slant columns (top and middle) as well as relative
differences of the fit rms (bottom) with respect to the IUPB v1 fit (without
I0-correction as this was not applied to the slit function test fits). The
reference IUPB v1 fit uses an online convolution of cross sections using FFT
and a further geometrical centring of the slit function is applied.
No difference between the reference fit, TS0 and TS5 is observed, neither for
NO2 nor for the rms. In TS0, the same slit function treatment is applied
as in the reference fit, but using discrete convolution instead of FFT,
meaning that no difference arises due to the method of convolution. In
addition, TS5 and TS0 are identical tests except that TS0 applies an explicit
geometrical centring (i.e. centring the area) of the slit function. As this
has no visible effect on NO2 slant columns and rms, the centring during
preprocessing was already sufficient (or the non-linear shift fit was
compensating for small deficits in the slit function centring).
The largest impact on NO2 (up to -1.3 %) is seen for TS2 and TS4 which
both subtract the smallest value from the slit function (i.e. forcing the
slit function to be zero for the smallest value measured). This is usually
not performed in DOAS fits and only advisable if instrumental stray light is
a large problem or if the dark signal drifts which is normally not the case
in state-of-the-art CCD detectors that are cooled and temperature-stabilized
using Peltier elements.
In contrast, the largest effect in terms of rms (up to 6.5 %) occurs not for
TS2 and TS4, but for all tests using a fitted Gaussian. Interestingly, the
Gaussian tests (TS6–TS11) show almost no difference among them in terms of
rms in the morning, but split up towards the evening. This might have to do
with decreasing NO2 slant columns in the afternoon or changes of the slit
function during the day. In terms of NO2, all tests using Gaussian slit
functions yield smaller NO2 slant columns of 0.2–0.7 %.
In addition to the basic Gaussian line parameters fitted here, some retrieval
codes offer the possibility of fitting more sophisticated line shape
parameters while taking into account potential asymmetry much better (the slit function in
Fig. shows indeed slight asymmetry). However, the
exact implementations differ and were therefore not reproduced in the tests
performed here, which can then be regarded as two extreme scenarios,
namely (1) using the (original, slightly asymmetric) slit function as it is,
or (2) fitting basic line parameters (a more sophisticated fit taking into
account the slit function's slight asymmetry would clearly lead to a result
in between these scenarios).
An important finding is that all performed tests lead to constant offsets in
relative (and not absolute) NO2 differences. In addition, all tests yield
larger rms than the reference fit using the measured slit function.
Intensity offset correction
Tests performed for different implementations of the intensity offset correction (and I0 correction).
A 0th-order offset correction means applying a constant term only, a 1st-order correction means applying a constant and slope term.
Test
Offset
Offset approach
I0-correction
order
TI0
0th
1/I
yes
TI1
0th
1/I0
yes
TI2
0th
Eq. () using I
yes
TI3
0th
Eq. () using I0
yes
TI4
None
No offset correction
yes
TI5
1st
1/I, λ/I
yes
TI6
0th
1/I, additionally Ring ×λ
yes
TI7
0th
1/I (simple approach)
no
Absolute (top) and relative (middle) NO2 slant column differences and relative rms differences
(bottom) with respect to IUPB v1 fit results in 2∘ elevation angle resulting from different implementations of
the intensity offset correction (and I0 correction).
Photons may hit the CCD detector at locations not corresponding to their
wavelength (e.g. through scattering on mirrors, surfaces etc. inside the
spectrometer) which produces an intensity offset, also called stray light. In
addition, other effects such as changes in the dark current can lead to
intensity offsets and the vibrational Raman scattering (VRS) is known to
produce spectral effects that are very similar to intensity offsets
. In the DOAS fit, pseudo cross
sections are usually included in order to compensate for intensity offsets. If the
measured spectrum I is superimposed by a constant intensity C (which is
the most simple assumption), the optical depth reads
-τ=lnI+CI0=lnII0+ln1+CI≈lnII0+CI,
with the Taylor expansion ln(1+x)≈x. Thus, in a first approximation
the intensity offset causes an additive term of optical depth that is
proportional to 1/I, which is often used as a pseudo absorber (and showing
large similarities to the Ring cross section as this compensates a filling-in
of Fraunhofer lines).
However, often more sophisticated approaches are used. For example, the IUPB
retrieval code NLIN allows either the simple implementation of
σoff=1/I or
σoff=lnI+C⋅ImaxI,
omitting the Taylor expansion in Eq. () and
superimposing I by a certain constant C of the maximum intensity within
the fit interval. In addition, higher correction terms are also sometimes
used, assuming that, not only does a constant superimposes the spectrum, but
a contribution also changes with wavelength (in which case often λ/I is
included in the DOAS fit). Furthermore, sometimes the offset is not included
in the linear DOAS fit, but fitted non-linearly (this is not included in tests
performed here).
Different implementations summarized in Table were
tested in order to evaluate the influence of the intensity offset correction.
Figure shows resulting absolute and relative
differences of NO2 slant columns and rms values. Again, the reference for
these differences is the IUPB v1 (noon) fit.
No difference in NO2 or rms is seen between TI0 and the v1 fit as both
fits use the same simple approach of σoff=1/I. In contrast,
TI1 uses 1/I0. In terms of NO2 differences, the TI1 line slightly follows
the shape of total fit rms and NO2 slant columns (compare to
Fig. ). Interestingly, while the disagreement of TI1 with respect to
the reference fit is on average ≈2 % for NO2, the first data
point is clearly off by almost 10 % for NO2 and 60 % for rms. This agrees
perfectly with the observed outlier in the INTA analysis (compare to
Fig. ); i.e. the reason for this disagreement could be
identified as a different offset implementation (which has been verified by
INTA). The SZA and the sun azimuth angle (SAA) of this measurement are
≈ 90 and 58∘ (from the north), respectively, while the
instrument's elevation angle is 2∘ and the azimuthal viewing direction
51∘; i.e. the instrument was pointing close to the rising sun.
Enhanced stray light in the spectrometer (caused by the large contribution of
photons at longer wavelengths while observing the red sky during sunrise)
seems plausible and using 1/I is a better choice for compensation. It was
verified (not shown) that the fit coefficient of the offset (also called the
offset slant column) is particularly large not only in this but also
in adjacent measurements and a colour index indicated that these spectra are
indeed more reddish. However, these measurements could also be affected by
direct sunlight in the instrument, which is known to increase rms (e.g. due
to polarization issues). Figure f demonstrates that the
respective measurement is affected by a very large rms (potentially caused by
a combination of direct light and stray light).
TI2 and TI3 are more sophisticated approaches
(Eq. ) based on either I or I0. However,
the resulting lines largely follow TI0 and TI1 (simple approaches).
Unexpectedly, both TI2 and TI3 lead to larger rms values for the sunrise
measurement in the morning where the simple approach performs better.
The total effect of using an intensity offset compensation is evaluated by
TI4 which includes no offset correction. The resulting effect on NO2 is
almost the same as TI1 and TI3 (based on I0) leading to the clear
recommendation of using an offset compensation based on I instead of I0,
which is supported by largely increased rms values of TI4
(Fig. bottom).
An intensity offset correction of 1st-order (i.e. a term varying linearly
with λ in addition to a constant term) was tested in TI5, which is in
practice often used not only for intensity offsets but also for compensation
of the wavelength-dependence of the Ring slant column. The resulting NO2
differences with respect to the reference fit (or TL0) are small with the exception of
the first data point that is slightly off. In terms of rms, TI5 leads to
improvements of ≈ 20 % in the morning. Interestingly, the TI5 rms line
shows some similarities to the IUPHD-IUPB line in Fig. d.
However, no 1st-order offset was included in the IUPHD fit; i.e. the offset
implementation is not causing the observed similar shape in the morning (and
the reason remains unclear). This is supported by increasing IUPHD rms values
in the evening in Fig. d, which are not present in TI5.
Fit TI6 includes again only a 0th-order intensity offset, but a pseudo cross
section accounting for the wavelength-dependence of the Ring slant column was
added (the Ring cross section was multiplied by λ and orthogonalized
against the original cross section). The resulting rms is indeed almost
identical to TI5 (using a 1st-order intensity offset) while the NO2 is
identical to TI0; i.e. the outlying first data point which is still partially
present in TI5 disappeared in TI6. Thus, using a pseudo-cross section for
compensation of the Ring wavelength-dependence seems preferable compared to
using a 1st-order intensity offset correction.
I0-correction
In addition to the intensity offset tests, the effect of inclusion of an
I0-correction was evaluated in TI7 (which is identical to TI0 except for
the I0-correction; see Table ). The respective
line is shown in addition to the intensity offset investigations in
Fig. . The I0-effect addresses the problem that the
limited instrument's resolution can cause an incomplete removal of Fraunhofer
structures in the vicinity of strong narrow-banded absorption bands
.
Only a very small constant offset in relative NO2 slant column differences
is obtained in TI7 (≈ 0.25 %, which is almost invisible in
Fig. , middle). In terms of rms, exclusion of the
I0-correction leads to an increased rms of up to 20 %, which is comparable to
different treatments of the intensity offset correction. Thus, including
an I0-correction is recommended in polluted environments such as the
MAD-CAT site. It should be noted that the first data point is not an outlier
in TI7; i.e. it is not sensitive to the I0-correction.
Numerical methods (linear DOAS inversion)
Different methods tested for solving the linear DOAS equation Ax=b.
Test
Retrieval
Spectral
Method
Remarks
code
grid
Reference
NLIN
I0
Pseudo-inverse of A using SVD
following
TL0
Python
I0
Pseudo-inverse of A using SVD
TL1
Python
I0
Solving quadratic ATAx=ATb
using LU decomposition
TL2
Python
I0
Invert ATA using LU decomp.
and multiply with ATb
TL3
Python
0.01 nm
same as TL0
linear interpolation
to 0.01 nm
TL4
Python
0.01 nm
same as TL0
cubic spline interpolation
to 0.01 nm
The DOAS equation (Eq. ) is a linear inverse
problem
Ax=b,
with the vector x (size n) containing the n trace gas slant
columns and polynomial coefficients of interest, the vector b=ln(I0I) (size m) containing the measured optical depths at m
wavelengths and the m×n DOAS matrix A with columns
consisting of absorption cross sections and polynomial terms (1, λ,
λ2, etc.) for the m wavelengths.
Different numerical methods exist to solve Eq. () for
x. As A is non-square, no inverse exists. However, most
retrieval codes calculate a pseudo-inverse A-1 (almost)
fulfilling A-1A=I (identity matrix) using a
singular value decomposition (SVD) and obtain the slant columns of interest
by x=A-1b. This method is frequently recommended
for solving overdetermined linear inverse problems in terms of least squares
see e.g..
However, after multiplying Eq. () with AT, the
matrix ATA is quadratic and can be decomposed into an upper and a
lower triangular matrix, L and U. The linear inverse
problem
ATAx=LUx=ATb
can then be solved by a forward substitution, obtaining y from
Ly=ATb, and a backward substitution, obtaining
x from Ux=y.
Furthermore, ATA can also be directly inverted, normally by LU
decomposition as well. The vector of slant columns is then obtained by
x=(ATA)-1ATb. As the inversion normally takes
many more computational steps, this method is known to be subject to
round-off errors and is therefore not recommended .
The influence of these different numerical methods on resulting slant columns
could not be easily tested with the IUPB retrieval code and was thus
evaluated in a Python script solving the DOAS equation using the same input
(spectra, cross sections). Python (which is a well-established programming
language in scientific computing) provides numerous different routines within
its numpy and scipy packages (based on different
subroutines from the LAPACK package,
http://www.netlib.org/lapack/) that were tested for solving the DOAS
equation. All performed tests are summarized in
Table . Again, differences of NO2 slant columns
and fit rms have been calculated with respect to the IUPB v1 fit results
(i.e. using NLIN). In order to restrict differences to the influence of
numerical approaches only, the same slit function treatment as the IUPB
retrieval code NLIN was applied in the Python script and the same (noon)
reference spectrum was used. In addition, no further wavelength calibration,
i.e. no non-linear shift and squeeze fit, was performed (neither in the Python
script nor in the NLIN reference fit used here) and no I0-correction was
included. The test results are shown in Fig. , again
for the 2∘ elevation angle.
Results of test TL0 appear to be identical to the reference fit of the IUPB
retrieval code, both using a SVD for inversion of the DOAS matrix. However,
very small differences exist between TL0 and the reference fits (too small to
be seen on the scale of Fig. ), which are < 0.006 %
for NO2 slant columns and < 0.07 % for rms. These tiny disagreements can
be attributed to numerical differences in programming languages.
TL1 and TL2 yield identical NO2 slant columns (both are using an LU
decomposition) which differ from SVD results by up to 0.7 %. However, the rms
from TL2 was found to be an order of magnitude larger compared to the other
tests and is therefore not shown in Fig. (bottom).
This is most likely due to problems mentioned above; i.e. this finding is in
agreement with common recommendations in textbooks. Interestingly, both TL1
(and TL2) NO2 and rms lines (which are differences with respect to IUPB) are
similar in shape to the total fit rms (compare to Fig. f).
Thus, when the rms increases, SVD inversion and LU decomposition lead to
larger disagreements, both in rms and NO2. As the SVD yields smaller rms
values, it seems to be preferable, although the obtained improvement is only
about 2.5 %.
Numerical differences may be obtained when performing the linear DOAS fit
(Eq. ) on another wavelength grid. Changes of the grid
potentially arise from the wavelength calibration. Some retrieval codes (e.g.
NIWA) also use an internal, equidistant wavelength grid. To test the effect
of changes in the wavelength grid, the TL0 fit was repeated on an equidistant
0.01 nm grid (i.e. I, I0 and cross sections were interpolated to
0.01 nm before solving Eq. ). TL3 and TL4 are identical
to TL0, but a linear interpolation was applied in TL3, while TL4 uses a cubic
spline interpolation to 0.01 nm. Apparently, this results in a constant
offset in relative NO2 differences, which is seen most clearly in the TL3
line in Fig. (middle). The resulting constant shift
is ≈0.4 % for TL3, but only ≈ 0.02 % for TL4 meaning that
the type of interpolation to the equidistant grid is of importance and the
spline interpolation (not surprisingly) seems to resemble the spectrum better
than a linear interpolation. However, using different wavelength grids might,
for example, explain some of the observed differences between IUPB and NIWA,
which were found to be constant in relative differences as well. In terms of
rms, the computation on an equidistant 0.01 nm grid using linear
interpolation behaves even a bit better on average (up to 1 %). However, no
recommendation can be drawn from this, as discussed above.
Absolute (top) and relative (middle) NO2 slant column differences and relative rms
differences (bottom) with respect to IUPB v1 results (only linear fit) in 2∘ elevation angle resulting
from different numerical methods solving the linear DOAS equation.
Non-linear wavelength calibration
Different wavelength calibration approaches evaluated.
Test
Retrieval code
Shift I0 to
Shift
Remarks
code
Fraunhofer atlas
I to I0
Reference
NLIN_D
yes
yes
alternating scheme, v1 fit settings
TW0
NLIN_D
no
no
linear DOAS fit only
TW1
NLIN_D
no
yes
TW2
NLIN_D
yes
no
TW3
Python
yes
yes
same a TW0
TW4
Python
yes
yes
same a TW3, but linear interpolation
TW5
Python
yes
yes
same a TW3, all trace gases
included in Fraunhofer I0 fit
As mentioned in Sect. , the spectra provided from the
IUPB instrument were precalibrated using nightly HgCd line lamp
measurements, which provide accuracies better than 0.1 nm. However, usually a
postcalibration is included in DOAS retrieval codes in order to increase the
fit quality (reducing rms). This wavelength calibration is implemented in
different ways in participating retrieval codes. Most groups calibrate the
reference spectrum I0 to a high-resolution Fraunhofer atlas, apply the
resulting calibration to all measured spectra and allow in addition a shift
and squeeze between I and I0 in order to compensate spectral shifts of
the spectrometer during the day, e.g. caused by temperature changes (KNMI
uses a slightly different definition of the shift as discussed below). This
non-linear shift and squeeze fit of the wavelength axis is mostly implemented
in an iterative scheme together with the linear DOAS fit on ln(I0/I).
However, codes differ, for example in whether trace gases are included in the
shift and squeeze fit of I0 to the high-resolution Fraunhofer atlas.
Sometimes also a higher-order calibration is allowed or several subwindows
are used in order to differently characterize different parts of the spectra.
Table summarizes tests performed to investigate
the impact of different wavelength calibration approaches using the IUPB
retrieval code NLIN. In addition, some tests were performed with the Python
script form Sect. , which has therefore been extended
to perform the non-linear shift fit as not all tests could be easily
implemented in the comprehensive NLIN software. Note that, in contrast to the
shift, the squeeze has been excluded from the intercomparison as it was found
to always be 1.0 for measurements shown here. Also, QDOAS-specific
implementations were not tested here.
Figure shows the resulting impact on NO2 and rms
as well as the fitted shift between I and I0 (not present in all tests).
NO2 and rms are again differences relative to the IUPB v1 fit results
(without I0-correction as this was not implemented in the Python routine).
As in Fig. , the shift in
Fig. d is no difference, and for comparison the
shift from the reference fit is shown explicitly in black. The (fixed) shift
retrieved from the non-linear fit of I0 to the Fraunhofer atlas is
-0.035 nm. This is roughly a factor of 10 larger than the fitted shift
between I and I0 shown in Fig. d. It is
interesting to note that the shift is not zero around noon (time of the
reference spectrum), indicating correlations between shift fit and other
effects (predominantly intensity offset correction) and also indicating the
presence of the tilt effect .
Absolute (a) and relative (b) NO2 slant columns differences and relative rms differences
(c) in 2∘ elevation angle resulting from different wavelength calibration approaches
(Table ). Corresponding shifts between I and I0 resulting from
non-linear fits are shown in (d).
The most extreme test is TW0, which excludes both the calibration of I0 to
the Fraunhofer atlas as well as the shift between I and I0. When
omitting both calibration steps, the rms is largely enhanced by up to
80 %,
peaking in the morning and decreasing towards noon with a second smaller
maximum around 12:00 UT. Interestingly, a very similar shape is seen in the rms
differences of KNMI, NUST, UNAM, USTC and INTA
(but not in INTA2 where a fault in the wavelength registration module was corrected) in Fig. d.
Although all of these groups are performing a wavelength calibration, TW0
indicates that differences in the calibration procedure are causing most of
the disagreements between groups in terms of rms. This is in contrast to
NO2 where changes of only 0.4 % are obtained from TW0.
TW1 still excludes the absolute calibration to the Fraunhofer atlas, but
includes the shift fit between I and I0. As seen before, the impact on
NO2 is very small (≈ 0.4 %), but absolute NO2 differences of
TW1 reflect the shape of total rms and NO2 slant columns (compare to
Fig. ); i.e. the relative differences are smooth in shape.
The rms is similar in shape to TW0, but the morning maximum is slightly later
at 07:00 UT; the noon maximum is around 11:00 UT and a small maximum in the evening
occurs at 18:00 UT. This shape is similar to the relative rms of NUST, UNAM and
NIWA in Fig. , but the absolute numbers are different.
However, this behaviour indicates that differences in the fitted Fraunhofer
shift are partially responsible for observed differences between these and
other groups. Interestingly, the fitted shift between I and I0 of TW1 in
Fig. d is very similar to corresponding values of
the reference fit, which is because the missing Fraunhofer shift is
a different effect than the shift between I and I0.
In contrast to TW1, TW2 includes the Fraunhofer shift fit, but excludes the
shift between I and I0. As this is the larger effect (-0.035 nm compared
to only ≈ 0.004 nm), the rms is much smaller than in TW1 with a single
maximum (up to 50 %) in the early morning at 06:00 UT. The rms time series shape
is similar to the KNMI line in Fig. . The reason is a
different definition of the shift in the KNMI retrieval: while in most
retrievals I is shifted relative to I0, KNMI calculates the optical
depth τ=ln(I0/I) without any shifts but then allows a shift of all
cross sections relative to τ. This is in first order compensating the
effect of the Fraunhofer shift but neglecting potential shifts between I
and I0 (in this case Fraunhofer lines would not cancel out completely in
the optical depth τ). As a result, the KNMI approach is similar (but not
identical) to TW2. The fit quality following the KNMI approach is expected to
be better using a sequential reference as the temperature drift of the
spectrometer is much smaller then. It has to be mentioned that the change of
NO2 in TW2 is small (0.3 %) compared to the change in fit rms (50 %).
TW3 uses the same wavelength calibration treatment as the IUPB reference fit,
but was run in another programming code (Python) to evaluate how much
difference is caused by use of another programming code and numerical issues.
The fitted shift is mostly identical to the reference fit except for the
early morning and late evening when the NO2 also shows very slight
differences. The largest disagreement of NO2 is 0.2 % for the first
measurement of the day that was affected by large stray light effects (and
potentially direct light) and thus most likely indicating cross-correlations
between shift fit and intensity offset correction. The resulting rms is
almost identical to the reference fit (Fig. c).
Summary of performed sensitivity tests and associated impacts on NO2 slant columns and rms.
Reason for disagreement
ΔNO2 ( %)
Δrms ( %)
Remarks
Reference treatment (noon)
2.5
3
Produces constant absolute NO2 SC offsets.
Reference treatment (seq.)
8
6
Slit function treatment
1.3
6.5
Produces constant relative NO2 SC offsets.
Intensity offset correction
2 (typically)
20 (typically)
10 (outlier)
60 (outlier)
I0 correction
0.25
20
Produces constant relative NO2 SC offsets.
Numerical methods
0.3 (0.7)
2.5
Produces constant relative NO2 SC offsets.
(for linear DOAS fit)
Disagreements increase with rms.
Wavelength calibration
0.4
up to 80
(non-linear shift fit)
TW4 is the same as TW3 but the spline interpolation (calculating I at
spectral points of I0 during the non-linear shift fit) is replaced by a
simple linear interpolation. The fitted shift is changed slightly and NO2
differences are of almost the same order than when not performing any shift
fit at all, demonstrating that the shift fit has a negligible impact on
NO2. In contrast, it has a large impact on rms, where the marginally
different methods between TW3 and TW4 produce the same rms while the effect
of excluding the shift completely leads to largely enhanced rms.
In the Fraunhofer calibration of the reference fit using NLIN (non-linear
shift fit of I0) as well as in TW3 all trace gas absorptions are omitted;
i.e. an iterative scheme between shift fit and DOAS fit comprising only a
polynomial of order 4 is applied. In contrast, all trace gas absorptions are
included in the Fraunhofer calibration in TW5. As a result, the rms of the
DOAS fit between the Fraunhofer spectrum and I0 is reduced by a factor of 2
and the non-linearly fitted shift is -0.031 nm instead of -0.035 nm in the
reference fit. However, this only has a marginal influence on NO2, rms and
fitted shift between I and I0 in Fig. and
consequently explains none of the observed differences between groups.
Notably, all shifts (even TW1) in Fig. d show the
same general shapes that are retrieved by most groups in Fig. . Only shifts of KNMI clearly differ, for the reasons mentioned
above. Rms shapes in Fig. c suggest that
differences in the wavelength calibration are the major reason for observed
rms disagreements between groups in Fig. .
Summary and conclusions
An intercomparison of DOAS retrieval codes using measured spectra from the
same instrument during the MAD-CAT campaign and harmonized fit settings was
performed. Excellent agreement was found between different DOAS fit
algorithms from 17 international groups. In some of the retrieval codes,
faults were identified and corrected in the course of the study, leading to
even better agreements. For noon reference fits, the correlation in terms of
NO2 slant columns was found to be larger (> 99.98 %) than for sequential
references (> 99.2 %), which is caused by different implementations of the
sequential reference.
Despite the excellent overall correlation, for individual measurements in
low elevations differences of up to 8 % in resulting NO2 slant columns
were observed, which is up to 2–3 times larger than corresponding typical
NO2 slant column errors. In terms of fit rms, large differences of up to
≈ 100 % were found.
Interestingly, groups using the same retrieval code (QDOAS groups) do not
always produce results showing the same systematic behaviour, which is the
result of different options users select – other than the harmonization settings agreed on. This has some impact for the interpretation of other studies
as the extent of prescribed fit settings in this study is comparable to
intercomparison campaigns like CINDI or CINDI-2. Consequently, groups
participating in those campaigns will provide intrinsic differences in their
results due to non-harmonized (detailed) settings even if using the same
retrieval code, which has to be expected in the same range as observed here.
Comprehensive sensitivity studies systematically investigating effects of the
non-harmonized retrieval aspects were performed in order to (1) attribute
observed differences between groups to certain sources and (2) evaluate the
impact of each of these sources on NO2 slant columns and fit rms. For this
purpose, five causes were identified in a survey of participating retrieval
codes. Typical impacts on NO2 slant columns and rms are summarized in
Table .
In general, the wavelength calibration and the intensity offset correction
were found to produce the majority of observed rms differences, but have
a negligible impact on NO2 slant columns (< 0.4 % or < 2 % except
for the first measurement of the day affected by stray light and possibly
direct light in the telescope). In contrast, the reference selection explains
the majority of observed NO2 slant column differences between groups while
having a minor impact on the rms. Thus, if harmonization of NO2 slant
columns is of interest, the reference treatment needs to be harmonized
(otherwise in small elevation angles differences of up to 8 % – and even more
for larger elevation angles due to decreasing absolute slant columns – have
to be expected predominantly in the case of using a sequential reference)
while for rms reduction/harmonization, the offset intensity correction and
the wavelength calibration need to be harmonized.
In terms of NO2, two types of disagreements have been observed, which are
(1) constant in absolute differences or (2) constant in relative differences. The latter
was found to arise from the numerical approach used for solving the DOAS
equation as well as the treatment of the slit function while the choice of
the reference spectrum causes absolute differences.
The best practices aiming at improvement of the fit quality and harmonization
between MAX-DOAS retrievals derived from this study are as follows.
Reference treatment: using averaged or interpolated sequential reference spectra matches the atmospheric conditions at
the measurement time better and was found to produce slightly smaller rms.
Slit function: using a measured slit function gave a better performance than fitting line parameters in the data set used here.
However, slit function measurements then have to be taken regularly (e.g. daily to monitor possible instrument changes).
Intensity offset: an approach based on I instead of I0 is recommended. Surprisingly, the simple approach performs
better for measurements pointing close to sunrise but this could be just a coincidence in this data set. The inclusion of an additional
Ring spectrum multiplied by wavelength is preferred over adding a linear term to the offset as this mostly compensates the
wavelength-dependence of the Ring slant column.
Numerical approaches: using an SVD provides more stability and produces slightly smaller rms than LU decomposition.
Wavelength calibration: although HgCd line lamp calibration measurements lead to absolute accuracies (in this case) of
≈ 0.03 nm, a Fraunhofer shift fit reduces the rms by up to 40–50 %. For the I0 fit with respect to the Fraunhofer
spectrum, the inclusion of all trace gases showed no advantage over the inclusion of only a polynomial. Temperature instabilities
of the spectrometer produced shifts of I relative to I0 changing over the day. Compensation of this effect within DOAS
retrieval codes further improves the rms by up to 40–50 % when using a noon reference spectrum.