Interactive comment on “Improved OSIRIS NO2 retrieval algorithm: Description and validation”

1. The second sentence in the Abstract becomes clear only after reading the paper. It could be left as is if no better formulation comes to mind.
2. Page 2, line 17: OSIRIS is already defined on page 1, line 17.
3. Page 9, line 16: Why use 17.5 km?
4. Page 12, lines 17-21: Why include a climatology only for November and the Southern Hemisphere? What data has been used for this climatology, I presume it


Introduction
Nitrogen oxides, such as NO and NO2, are the reactive nitrogen-containing species in the middle atmosphere and are produced mainly from the breakdown of nitrous oxide in the stratosphere (Crutzen, 1971). Oxides of nitrogen dominate ozone loss in the middle stratosphere, whereas in the lower stratosphere they react with oxides of chlorine and bromine, such as ClO and BrO, to reduce the halogen-catalyzed destruction of ozone (Salawitch et al., 2005; Wennberg et al., 1994).
The partitioning between NO and NO2 depends on several factors such as the local ozone concentration and the photolysis frequency of NO2. Reactive nitrogen is chemically converted at night to N2O5 and, upon hydrolysis, can be further sequestered into unreactive "reservoir" species such as HNO3. NO2 increases steadily during daylight hours due to the UV photolysis of N2O5 (e.g. Wetzel et al., 2012).
The photochemistry of NO2, which is particularly rapid near the day-night terminator, leads to horizontal gradients within the field of view, particularly for limb sounders such as OSIRIS (Optical Spectrograph and Infrared Imager System) (Llewellyn et al., 2004) on the Odin satellite or for solar occultation instruments operating in either the UV-visible or mid-infrared (e.g. Kerzenmacher et al., 2008). Besides OSIRIS, other space-borne limb scattering instruments that have measured NO2 vertical profiles include Scanning Imaging Absorption spectrometer for Atmospheric Chartography (SCIAMACHY) (Bovensmann et al., 1999), Solar Mesosphere Explorer (Mount et al., 1984), and Stratospheric Aerosol and Gas Experiment (SAGE) III (Rault et al., 2004; Rault, 2004).
Table 1 (partial).
Reference: Haley and Brohede (2007) | Bourassa et al. (2011) | This work
[row label missing]: Vandaele et al. (1998) | Burrows et al. (1998) | Vandaele et al. (1998)
Interpolation of fitted cross section to effective T: No | Yes, using local ECMWF temperature at each TH | Yes, using local ECMWF temperature at each TH
The OSIRIS operational NO2 retrieval algorithm was developed and validated by Haley and Brohede (2007) and the current version is 3.0. The work of Kerzenmacher et al. (2008) is the only other publication comparing version 3.0 OSIRIS NO2 data to correlative profile measurements. Earlier versions (e.g. 2.x; Haley et al., 2004) were more thoroughly validated, for example by Brohede et al. (2007). The pseudo-spherical forward model used in the operational OSIRIS NO2 retrieval algorithm is less accurate than the SaskTran spherical radiative transfer model (RTM) (Bourassa et al., 2008; Zawada et al., 2015) currently used as a forward model in the operational retrieval algorithm for OSIRIS ozone and aerosol extinction data products. Recently, Bourassa et al. (2011) developed an alternative NO2 algorithm which relied on four wavelengths covering a single NO2 absorption band. This was later modified to 13 wavelengths covering three adjacent NO2 bands (438-450 nm) and used to process the entire OSIRIS data record; it is referred to here as the "fast" OSIRIS NO2 product. While the spectral information content is reduced relative to the operational OSIRIS algorithm and the algorithm described below (see Table 1), the "fast" algorithm shares two key elements with the one described herein: 1. The forward model is the successive-orders-of-scattering version of SaskTran.
In this work, we provide a detailed description of the new retrieval algorithm whose heritage is the "fast" algorithm (Bourassa et al., 2011) as well as the algorithm developed in a series of papers (Sioris et al., 2003, 2004, 2007). The current algorithm was developed to demonstrate that improved accuracy is possible through the combination of a better forward model and better forward model inputs than those used by Sioris et al. (2007) and additional wavelengths longer than those selected by Sioris et al. (2003). The main focus is on the lower stratosphere (and upper troposphere).
We then compare NO2 profiles retrieved from OSIRIS observations to balloon-borne NO2 profile measurements during the decade (2002-2012) when dozens of balloon-borne limb measurements were performed. Balloon profiles are chosen for the validation for many reasons, of which the most important is the expected accuracy of these data. The accuracy is due to two main factors: very high signal-to-noise ratio afforded by the luxury of long exposure times (which can be traded for higher vertical and/or spectral resolution) and superior altitude determination, which for balloon-borne limb geometry is due to the sensor being an order of magnitude closer to the tangent point than for satellite limb sounders, thereby reducing the impact of imperfect viewing angle knowledge. Profile measurements from balloons are preferable to those from satellite for the purpose of validation because of their high vertical resolution, generally matching or exceeding the ∼ 2 km vertical resolution of OSIRIS NO2 (e.g. Sioris et al., 2003) (see below). Balloon measurements exploit a greater diversity of methods as in situ techniques are used in addition to remote sensing. Furthermore, one balloon-borne remote sensing technique relies on occultation during balloon ascent/descent, which is not possible for satellite instruments; it provides very accurate altitude registration, potentially finer vertical resolution than limb geometry for instruments observing the full solar disk (see Sect. 2.3), and smaller errors due to the neglect of diurnal NO2 gradients.

Algorithm description and settings
The algorithm is a classic two-step approach of spectral fitting of optical depth spectra with absorption cross sections to determine slant column densities (SCDs), followed by vertical fitting to invert the SCD profile (e.g. Ogawa et al., 1981). This two-step approach is used by many groups (e.g. Pommereau, 1982; Ferlemann et al., 1998; Renard et al., 2000) including some of the balloon remote sensing teams providing data used in this study. The profile is retrieved by iteratively updating it with the multiplicative algebraic reconstruction technique (MART; see below) such that the simulated NO2 SCDs agree with the observed ones. A generally appropriate diagram of the algorithm described here appears in the work of Haley et al. (2004).
The first step of the data analysis, namely the spectral fitting, involves a reference spectrum as in Eq. (1) of Sioris et al. (2003). The reference spectrum is the co-addition of spectra at tangent heights (THs) in the 50-70 km range (Sioris et al., 2003). Spectral fitting refers to a multiple linear regression including the following basis functions: a fourth-order closure polynomial, which is justified based on an adjusted R² test, and temperature-dependent NO2 (Vandaele et al., 1998) and O3 (Serdyuchenkov et al., 2014) absolute absorption cross sections interpolated to the temperature (T) of the tangent layer using the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis and convolved with a Gaussian to OSIRIS spectral resolution. Water vapour absorption is neglected despite maximal absorptions of > 0.1 % in the fitting window (434.8-476.7 nm) in the upper troposphere, as it is not spectrally correlated with NO2 absorption over this window and would greatly increase the forward modelling computational burden. SaskTran treats water vapour as a line-by-line absorber assuming a Voigt line shape and spectroscopic parameters from the HITRAN 2008 database (Rothman et al., 2009).
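The regression described above can be sketched as an ordinary least-squares fit. This is a minimal illustration only: the function name `fit_slant_columns`, the column scaling, and the synthetic cross sections (smooth sinusoids, not the Vandaele or ozone laboratory data) are assumptions made for the example, and temperature interpolation and instrument convolution are omitted.

```python
import numpy as np

def fit_slant_columns(wavelength_nm, optical_depth, cross_sections, poly_order=4):
    """Fit optical depth with absorber cross sections plus a closure polynomial.

    Returns the slant column density of each absorber and its standard error.
    """
    wavelength_nm = np.asarray(wavelength_nm, dtype=float)
    optical_depth = np.asarray(optical_depth, dtype=float)
    cross_sections = np.atleast_2d(np.asarray(cross_sections, dtype=float))
    # Centre and scale wavelength so the polynomial columns are well conditioned.
    x = (wavelength_nm - wavelength_nm.mean()) / np.ptp(wavelength_nm)
    poly = np.vander(x, poly_order + 1, increasing=True)  # 1, x, ..., x^4
    design = np.column_stack([cross_sections.T, poly])
    # Scale every column to unit maximum; cross sections are ~1e-19 cm^2.
    scale = np.abs(design).max(axis=0)
    design_s = design / scale
    coeffs_s, *_ = np.linalg.lstsq(design_s, optical_depth, rcond=None)
    residual = optical_depth - design_s @ coeffs_s
    dof = optical_depth.size - design_s.shape[1]
    cov_s = (residual @ residual / dof) * np.linalg.inv(design_s.T @ design_s)
    n = cross_sections.shape[0]
    scd = coeffs_s[:n] / scale[:n]
    scd_err = np.sqrt(np.diag(cov_s)[:n]) / scale[:n]
    return scd, scd_err

# Synthetic, noise-free example on a 107-pixel grid spanning the fitting window.
wl = np.linspace(434.8, 476.7, 107)
sigma_no2 = 5e-19 * (1 + 0.3 * np.sin(2 * np.pi * wl / 3.0))  # made-up shape
sigma_o3 = 1e-21 * (1 + 0.5 * np.cos(2 * np.pi * wl / 7.0))   # made-up shape
true_scd = np.array([3e16, 5e19])  # molec cm^-2, illustrative values
xs = (wl - wl.mean()) / np.ptp(wl)
tau = true_scd[0] * sigma_no2 + true_scd[1] * sigma_o3 + 0.2 + 0.1 * xs - 0.05 * xs**2
scd, scd_err = fit_slant_columns(wl, tau, [sigma_no2, sigma_o3])
```

In a real retrieval the same routine would be applied identically to observed and simulated normalized radiances, which is the property the text relies on.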
The spectral fitting is exactly the same for observed and simulated normalized radiances. For example, the actual OSIRIS wavelengths and tangent heights are inputs into the simulation and any spectral pixels which are rejected due to radiation hits or detector saturation at the Level 1 processing stage (i.e. observed radiances) are also omitted from the fit of the SaskTran-simulated radiances. The longest wavelength in the fitting window is extended to 476.7 nm, raising the number of spectral pixels to 107, thereby increasing the spectrally integrated signal-to-noise ratio and the penetration of the lower atmosphere relative to both existing OSIRIS NO2 algorithms mentioned above. Both of these benefits of an extended fitting window have been demonstrated for SCIAMACHY limb scattering (Sioris et al., 2004), but OSIRIS has a glass filter that prevents the detection of higher orders of light reflected off the grating, which was positioned such that the 477-530 nm region is not usable (Warshaw et al., 1996). Thus the 434.8-476.7 nm window is used here and spectral fitting residuals are shown in Fig. 1. Using a discontinuous fitting window that includes wavelengths greater than 530 nm is not beneficial. The NO2 SCD uncertainties are improved relative to those obtained by Sioris et al. (2003), who used a 434.8-449.0 nm fitting window and the "tilt" pseudo-absorber. The improved NO2 number density precision is illustrated below. The "tilt" basis function is now excluded from the spectral fitting. We also tested an alternative approach of fitting an NO2 spectral weighting function to the normalized radiances, as is used to fit the ozone absorption signal in nadir reflectance spectra (Coldewey-Egbers et al., 2005), and the spectral fits did not improve significantly.
The retrieval upper altitude limit is defined by the lowest TH in a limb scan for which the NO2 SCD error is < 100 % for all THs below. This can be as high as 49 km but is typically ∼ 40 km and typically a few kilometres higher than the upper altitude of the "fast" NO2 product. The lower limit of the retrieval is often determined by cloud tops as Odin generally scans the limb into the upper troposphere; cloud-contaminated observations are excluded as they can lead to biases and poor retrieval convergence. Cloud top detection is particularly important for MART since spectra from successive THs immediately below a given altitude are used in a weighted fashion to retrieve the NO2 number density at that altitude. We assume, for OSIRIS geometry (scattering angles of 90 ± 30°), that an observed scene near a cloud top sampled by the ∼ 1 km tall instantaneous field of view has larger limb radiance at ∼ 810 nm than if it were cloud free. A scene is deemed to be cloudy if the ∼ 810 nm radiance scale height is < 2.4139. This threshold is lowered from an overly stringent value of 3.84 (Sioris et al., 2007). The new threshold is chosen to allow NO2 retrievals (and validation) to extend below 2-3-month-old volcanic aerosol layers due to Sarychev Peak and Nabro, two of the eruptions during the OSIRIS mission which led to the largest stratospheric aerosol optical depth.
Algebraic reconstruction techniques have been used to recover the vertical (and along-track) distribution of atmospheric constituents for over 4 decades (Thomas and Donahue, 1972; Fesen and Hays, 1982). Chahine's (1968) relaxation method, used by Sioris et al. (2007) for the retrieval of OSIRIS NO2 vertical profiles, is a variant of MART in which only the tangent layer is used to retrieve the local number density and is also tested (see Sect. 4). The MART retrieval in this work uses a 0.6 : 0.3 : 0.1 weighting following the OSIRIS aerosol extinction retrieval (see Eq. 8 of Bourassa et al., 2007). Exactly 15 retrieval iterations are used, which appears to be adequate for most cases and, with MART, does not lead to overfitting. The radiance is summed over five orders of scattering, which is sufficient for non-cloudy cases (Sioris et al., 2004) and also leads to ≤ 6 % errors over the retrieval range for a case with a solar zenith angle (SZA) of 75° with an optically thick ice cloud occupying two altitude levels within the boundary layer. The NO2 profile is retrieved on a 1 km altitude grid, although the NO2 vertical resolution remains ∼ 2 km, a consequence of the ∼ 2 km vertical sampling provided by Odin. Above the retrieval range, NO2 profiles are scaled every iteration using a Chahine-like update based on the highest TH within the retrieval range. Below the retrieval range, the NO2 profile is assumed to have a constant number density down to the ground, equal to the number density at the lowest retrieved altitude. The air number density profile is from ECMWF. The retrieval uses OSIRIS-retrieved ozone and aerosol extinction profiles (version 5.07, v5.07 hereafter), as well as the 675 nm scene albedo (Bourassa et al., 2007) to provide more realistic forward model inputs into SaskTran.
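A weighted multiplicative update of this kind can be sketched as follows. This is a toy under stated assumptions, not the operational code: the forward model is a straight-line path-length matrix standing in for SaskTran, the 0.6 : 0.3 : 0.1 weights are assumed to apply to the tangent height at a given altitude and the two tangent heights immediately below it (renormalized at the bottom of the grid), and the names `path_length_matrix` and `mart_retrieve` are hypothetical.

```python
import numpy as np

def path_length_matrix(alt_km, earth_radius_km=6371.0):
    """Straight-line path lengths (cm) through 1-D shells for rays tangent
    at the bottom of each shell. Row i is a tangent height, column j a shell."""
    alt_km = np.asarray(alt_km, dtype=float)
    dz = alt_km[1] - alt_km[0]
    lower = earth_radius_km + alt_km
    upper = lower + dz
    n = alt_km.size
    paths = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):  # a ray only crosses shells at or above its tangent
            paths[i, j] = 2.0 * (np.sqrt(upper[j] ** 2 - lower[i] ** 2)
                                 - np.sqrt(lower[j] ** 2 - lower[i] ** 2))
    return paths * 1.0e5  # km -> cm

def mart_retrieve(y_obs, forward, x0, weights=(0.6, 0.3, 0.1), n_iter=15):
    """Multiplicative update: each altitude is scaled by a weighted mean of
    observed/modelled SCD ratios at its own tangent height and those below."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        ratio = y_obs / forward(x)
        update = np.empty_like(x)
        for j in range(x.size):
            rows = [j - k for k in range(len(weights)) if j - k >= 0]
            w = np.array(weights[:len(rows)])
            update[j] = np.dot(w / w.sum(), ratio[rows])
        x *= update
    return x

# Toy example on a 1 km grid from 10 to 40 km.
alt = np.arange(10.0, 41.0, 1.0)
K = path_length_matrix(alt)
forward = lambda x: K @ x
x_true = 2.0e9 * np.exp(-0.5 * ((alt - 30.0) / 6.0) ** 2) + 1.0e8  # molec cm^-3
y_obs = forward(x_true)
x_scaled = mart_retrieve(y_obs, forward, 2.0 * x_true)            # scaled first guess
x_flat = mart_retrieve(y_obs, forward, np.full(alt.size, 1.0e9))  # flat first guess
misfit_flat = np.max(np.abs(forward(x_flat) / y_obs - 1.0))
```

Setting `weights=(1.0,)` reduces the update to a Chahine-style relaxation in which only the tangent layer is used. A first guess that differs from the truth by a constant factor is corrected in a single iteration because all SCD ratios are then equal; how quickly other first guesses converge depends on the forward model, so `misfit_flat` is computed here without a claimed value.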
The NO2 retrieval uncertainty is obtained by perturbation as described by Sioris et al. (2010), with the NO2 SCD standard errors and an altitude-independent NO2 number density perturbation serving as inputs. Profile retrieval including the error calculation using a forward model that neglects diurnal gradients (see Sect. 2.2) takes ∼ 5 min on a desktop computer with eight 3.4 GHz processors.
McLinden et al. (2006) developed the capability to account for diurnal chemical gradients in a pseudo-spherical RTM. The high spatial resolution capability which is required for modelling (horizontal) diurnal gradients of NO2 within the fully spherical SaskTran RTM is described by Zawada et al. (2015). This capability comes at the price of a large increase in computing time (see also Sect. 3). SaskTran is now capable of modelling radiation fields with the atmosphere varying along the line of sight (LOS) as well as along the incoming solar beam (referred to as "2-D mode" and "3-D mode", respectively, hereafter). The 1-D forward model and associated retrieval (Table 1) completely neglect diurnal gradients in NO2. In 2-D mode, the atmosphere consists of sectors along the LOS (with 1° angular resolution) but the diurnal variation of NO2 along the incoming solar beam is not considered. In 3-D mode, the atmosphere essentially consists of stacked triangular prisms of increasing horizontal extent with increasing distance from the tangent point as illustrated by Zawada et al. (2015) and diurnal gradients are simulated for any light path from its point of entry into the atmosphere to its exit.
To provide realistic diurnal gradients in the atmosphere of the forward model, SaskTran has been linked to the PRATMO stratospheric gas-phase photochemical box model (McLinden et al., 2000). PRATMO is configured to converge to 0.5 % between the start and end of each 1-day run. The default number of time steps per day is 35. The latitudinal and vertical resolutions are 2.5° and ∼ 2 km, respectively. The atmosphere is aerosol free for photolysis frequency calculations. More details are available in McLinden et al. (2006) and references therein. The accuracy of PRATMO at various altitudes, latitudes, seasons, and SZAs is in evidence in the work of Brohede et al. (2008).
The diurnal variation within the SaskTran atmosphere is then simulated by scaling the NO2 number density in each sector by the ratio of NO2 number density in that sector relative to the tangent sector. NO2 number density is linearly interpolated from the SZAs in PRATMO to the SZAs of the SaskTran sectors based on the cosine of the SZA. The sectors and the vertical grid in SaskTran are defined independently of the PRATMO latitudinal and vertical grid.
Note that the typical horizontal sampling of OSIRIS limb scans of ∼ 5° is too sparse to allow for a tomographic retrieval (Hultgren et al., 2013). A two-dimensional tomographic retrieval of NO2 from a special set of SCIAMACHY limb scattering measurements with finer horizontal sampling was performed by Puķīte et al. (2008). Their tomographic retrieval provides an alternative approach to account for the diurnal gradient of NO2 in the orbital plane (i.e. along the LOS).
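The sector scaling just described can be sketched for a single altitude as below. The function name `sector_scale_factors` and all numbers are made up for illustration; the only elements taken from the text are the linear interpolation in cos(SZA) and the normalization to the tangent sector.

```python
import numpy as np

def sector_scale_factors(model_sza_deg, model_no2, sector_sza_deg, tangent_sza_deg):
    """Ratio of box-model NO2 in each sector to that in the tangent sector,
    with NO2 interpolated linearly in the cosine of the solar zenith angle."""
    cos_model = np.cos(np.radians(np.asarray(model_sza_deg, dtype=float)))
    no2 = np.asarray(model_no2, dtype=float)
    order = np.argsort(cos_model)  # np.interp requires increasing abscissae

    def no2_at(sza_deg):
        return np.interp(np.cos(np.radians(sza_deg)), cos_model[order], no2[order])

    return no2_at(np.asarray(sector_sza_deg, dtype=float)) / no2_at(tangent_sza_deg)

# Illustrative box-model output at one altitude (values are not from PRATMO).
factors = sector_scale_factors(
    model_sza_deg=[80.0, 84.0, 88.0, 92.0],
    model_no2=[1.0e9, 1.1e9, 1.3e9, 1.8e9],  # molec cm^-3
    sector_sza_deg=[84.0, 86.6, 89.0],
    tangent_sza_deg=86.6,
)
```

Multiplying the tangent-sector number density by `factors` gives the density used in each sector; by construction the tangent sector itself has a factor of one.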

Validation approach and datasets
Even though the OSIRIS profiles retrieved with the 2-D and 3-D mode of SaskTran account, to varying degrees, for the diurnal gradients expected for the OSIRIS viewing geometry, photochemical modelling is also required to scale all of the OSIRIS NO2 profiles to the local time of the balloon measurement. The PRATMO box model is also used for this purpose.
Coincidence criteria are within 1000 km (Brohede et al., 2007) and on the same calendar day (using UT). Only daytime correlative measurements are considered. The closest spatial coincidence is used if located within a 1000 km range. Comparisons between OSIRIS NO2 and balloon correlative data are performed only down to the lower altitude limit of each profile retrieved with the v3.0 algorithm in an effort to keep the number of balloon-coincident altitudes the same between that algorithm and the algorithm debuting here. The upper limit for validation is determined in all cases by the balloon float altitude.
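The spatial and temporal coincidence test can be sketched as below. The record layout (dictionaries with `lat`, `lon`, `time`, and `id` keys), the function names, and the example coordinates are assumptions for illustration; the daytime filter mentioned above is not included.

```python
import math
from datetime import datetime, timezone

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * radius_km * math.asin(math.sqrt(a))

def closest_coincidence(balloon, scans, max_km=1000.0):
    """Return (distance_km, scan) for the nearest scan on the same UT calendar
    day within max_km, or None if no scan qualifies."""
    best = None
    for scan in scans:
        if scan["time"].date() != balloon["time"].date():
            continue  # different UT calendar day
        d = great_circle_km(balloon["lat"], balloon["lon"], scan["lat"], scan["lon"])
        if d <= max_km and (best is None or d < best[0]):
            best = (d, scan)
    return best

# Hypothetical balloon flight near Kiruna and three hypothetical limb scans.
utc = timezone.utc
balloon = {"lat": 67.9, "lon": 21.1, "time": datetime(2003, 3, 4, 12, 0, tzinfo=utc)}
scans = [
    {"id": "a", "lat": 68.5, "lon": 25.0, "time": datetime(2003, 3, 4, 6, 10, tzinfo=utc)},
    {"id": "b", "lat": 60.0, "lon": 21.1, "time": datetime(2003, 3, 4, 17, 50, tzinfo=utc)},
    {"id": "c", "lat": 68.0, "lon": 21.0, "time": datetime(2003, 3, 5, 6, 5, tzinfo=utc)},
]
match = closest_coincidence(balloon, scans)
```

Scan "c" is the nearest in space but falls on the following UT day, so it is rejected before the distance comparison.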
The best opportunities for validation of a denoxified profile come on 4 and 16 March 2003 in Kiruna, Sweden, when the NO2 number density measured by LPMA and SAOZ, respectively, did not exceed 10^9 molec cm^-3 at any altitude. These peak number densities are the lowest in the validation dataset. Note that OSIRIS does not measure in the northern polar region until late February due to a lack of sunlight at ∼ 06:00 and 18:00. NO2 profiles at southern high latitudes have not been measured by balloon-borne instruments during the OSIRIS mission.
Averaging kernels are not taken into account since the vertical resolution is similar between the NO2 profiles from most of the selected validation instruments and from the three OSIRIS algorithms, namely the operational v3.0, "fast", and the algorithm described herein. Results from the current algorithm will be treated separately for each of the forward model modes (1-D, 2-D, and 3-D) described in Sect. 2.2.
In Sect. 3, the 1-D profiles used in the validation are those processed by a network of computers (using a SZA cutoff of < 90°). The entire OSIRIS data record has now been processed using the 1-D algorithm and is currently available (University of Saskatchewan, 2017). One time-saving approximation is used to process the entire record: multiple scattering (MS) is only calculated at 21 wavelengths corresponding to the peaks and troughs of the NO2 absorption cross section across the spectral fitting window rather than the entire set of 107 wavelengths. Radiance ratios between multiple scattering and single scattering simulations at these 21 wavelengths are linearly interpolated to the remaining 86 wavelengths, where they are used to scale the single-scattering radiances. The radiance error due to the MS approximation is typically < 0.05 % at all wavelengths and all THs used in the NO2 retrieval.
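The MS approximation amounts to interpolating a smooth ratio rather than the structured radiance itself. The sketch below assumes hypothetical array names and uses evenly spaced subset wavelengths for simplicity, whereas the text selects the peaks and troughs of the NO2 cross section; the radiances are synthetic.

```python
import numpy as np

def ms_approximation(wavelength_nm, ss_radiance, ms_index, ms_radiance_subset):
    """Approximate the multiple-scattering radiance at every wavelength from
    single-scattering radiances and MS/SS ratios known at a wavelength subset."""
    ratio_subset = ms_radiance_subset / ss_radiance[ms_index]
    ratio_all = np.interp(wavelength_nm, wavelength_nm[ms_index], ratio_subset)
    return ss_radiance * ratio_all

# Synthetic example: 107 pixels, MS computed at 21 of them.
wl = np.linspace(434.8, 476.7, 107)
ss = 1.0e13 * (1.0 + 0.05 * np.sin(2 * np.pi * wl / 3.0))  # structured SS radiance
true_ratio = 1.5 + 0.002 * (wl - wl[0])                     # smooth MS/SS ratio
ms_true = ss * true_ratio
idx = np.round(np.linspace(0, 106, 21)).astype(int)         # includes both end pixels
ms_approx = ms_approximation(wl, ss, idx, ms_true[idx])
max_rel_error = np.max(np.abs(ms_approx / ms_true - 1.0))
```

Because the ratio in this synthetic case is linear in wavelength, the interpolation reproduces it to rounding error; with real radiances the residual depends on how smooth the MS/SS ratio is between the selected wavelengths.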

Results
One difference in the method described above (Sect. 2) compared to previous OSIRIS NO2 algorithms (e.g. Haley and Brohede, 2007; Bourassa et al., 2011) is the use of spectral information at longer wavelengths, which allows the NO2 absorption optical depth to be more precisely quantified. Two sources of random error in retrieved NO2 are shot noise, a consequence of the finite number of electrons generated in the detector by impinging limb-scattered photons, as well as radiation hits by energetic particles (e.g. protons) that are not filtered in the Level 1 data. The extended fitting window tends to reduce the susceptibility of the retrieved NO2 to these noise sources. Using the 1-D forward model described above, the 33 OSIRIS limb scans used in validation are processed with the extended fitting window and the one used in the operational algorithm (Haley and Brohede, 2007). The standard deviation in retrieved NO2 number density for each set (i.e. fitting window) is calculated over the common altitude range for each of the 33 scans. Natural variability of NO2 is identical since the same limb scans and altitude ranges are used, so any difference in the standard deviation is due solely to the different spectral fitting windows. Figure 2 (left panel) shows the slight but systematic reduction in NO2 variability at all altitudes with the extended fitting window. Note that the NO2 profiles retrieved from the two fitting windows do not have a significant bias (±1 standard error) between themselves, which points to the self-consistency of NO2 spectral cross section between the two windows.
Other benefits of the extended fitting window are that the NO2 retrieval uncertainty is improved by ∼ 20 % and the upper altitude limit of the retrieval moves slightly higher, as shown in the right panel of Fig. 2 for a sample case. For this tropical case, only at the NO2 number density minimum at 15.5 km is the current retrieval unable to measure NO2 with uncertainty of < 100 %, whereas with the fitting window of the operational algorithm the measurements are below the lower detection limit between 11.5 and 17.5 km and at 7.5 km (Fig. 2). The significant upper tropospheric NO2 enhancement at ∼ 10 km is also observed by the mini-SAOZ (not shown).
Figure 3 shows the impact of the different forward model modes: the 1-D retrieval overestimates NO2 by 29 % relative to the one using the 3-D forward model mode at 17.5 km. Note that relative diurnal gradients are expected to be largest below 18 km. The overestimate for this sample case is due to the larger NO2 concentrations that are present on the far side of the limb (yet neglected by a 1-D model) for a SZA (86.6°) that is small enough to allow significant far side contribution to the radiance. The increasing NO2 gradient toward the far side of the limb occurs because of the decreasing photolysis frequency of NO2 with increasing SZA. The retrieval using the 2-D forward model underestimates slightly since the slightly lower NO2 concentrations along the incoming solar beam are not included in that model version. The 3-D forward model clearly reduces the bias vs. SAOZ relative to the 1-D forward model at altitudes below 18 km. This improved precision is demonstrated below for a large ensemble of profiles by contrasting the correlation of the retrievals using the 1-D and 2-D forward model modes and the standard error of the biases relative to coincident balloon data. The slightly lower NO2 number density at 20.5 km retrieved with the 1-D forward model (Fig. 3) is probably a minor oscillation due to an overestimation immediately above at 22.5 km where a secondary NO2 number density peak is observed by OSIRIS.
Figure 2. Left: comparison of the standard deviation of the NO2 number density profiles retrieved with the new algorithm using the default window (435-477 nm, green) vs. the fitting window used in the OSIRIS v3.0 operational algorithm (Haley and Brohede, 2007; blue). Right: comparison of a single NO2 profile retrieved from scan 20 of orbit 60346 with the algorithm described above using the default window (435-477 nm) vs. the fitting window used in the OSIRIS v3.0 operational algorithm. The retrieval uncertainty for each profile is bound by a shaded area of matching colour.
In order to compare the various OSIRIS NO2 products to the balloon correlative data, all profiles are linearly interpolated onto a 1 km grid (12.0-39.0 km) commonly used in the balloon data (see Supplement). The "fast" product tends to be limited in its vertical range at both extremes of the profile, relative not only to the other OSIRIS NO2 products but also to the coincident balloon profiles, and the "fast" retrieval is not available for some coincidences (Fig. 4). The sample size is ≥ 20 between 13 and 31 km for all OSIRIS NO2 products and thus, hereafter, the validation discussion focusses on this altitude range.
The v3.0 operational OSIRIS NO2 has proven to be of high quality in the 15-42 km range from previous satellite intercomparisons (e.g. Haley and Brohede, 2007) and the same can also be inferred for the upper troposphere from the ability to detect small lightning-generated NO2 enhancements with the expected latitudinal and longitudinal distribution (e.g. Sioris et al., 2007). Thus, in Fig. 5, we examine the standard error of the coincident profiles for OSIRIS and balloon data to compare the variability of each dataset, similar to Kerzenmacher et al. (2008). The upper altitude limit of the data obtained from balloons launched by CNES (Centre National d'Études Spatiales) is 31 ± 2 km (Table 2). This excludes MkIV and SAOZ-BrO. From 31 down to 28 km, the balloon-borne NO2 profiles show more scatter than any of the OSIRIS data products. This is likely due to the need to assume the NO2 vertical distribution above float altitude for the remote sensors, which is particularly problematic in the tropics where the NO2 profile typically peaks at a higher altitude than for the extratropics. A second factor is the smaller absorption signal for the remotely sensing balloon instruments due to the short path lengths when observing altitudes near float. Below 16 km, the balloon data clearly exhibit less scatter than OSIRIS, thereby quantitatively supporting the choice of balloon validation data for reasons discussed in Sect. 1. Kerzenmacher et al. (2008) show that this is not true for NO2 at ∼ 15 km measured by the ACE (Atmospheric Chemistry Experiment) satellite instruments relative to OSIRIS v3.0 NO2. Among the three OSIRIS products, "fast" exhibits the largest scatter at all altitudes, whereas the 2-D algorithm described here offers noticeably less scatter between 15 and 20 km as compared to the v3.0 product. The 3-D algorithm takes 1.5 h to retrieve a profile on the computer specified above and thus the entire set of balloon-coincident OSIRIS limb scans was not processed. The 2-D algorithm takes half of the processing time of the 3-D algorithm.
To determine whether OSIRIS NO2 is biased relative to the balloon data, we studied medians and means of individual profile differences over all coincidences (as a function of altitude). These results are shown in Figs. 6 and 7. It is clear that all of the OSIRIS NO2 algorithms have a statistically significant bias near the NO2 peak (typically ∼ 30 km). This overestimate near the peak is similar to the overestimate by OSIRIS v3.0 relative to ACE Fourier transform spectrometer and MAESTRO (Measurements of Aerosol Extinction in the Stratosphere and Troposphere Retrieved by Occultation), which coincided with the NO2 peak altitude. Local biases were +17 and 14 %, respectively (Kerzenmacher et al., 2008). Haley and Brohede (2007) found no such overestimate at the peak vs. SAGE III and POAM (Polar Ozone and Aerosol Measurement) III, but a similar positive bias near ∼ 28 km vs. HALOE (Halogen Occultation Experiment).
The "fast" product has the largest overestimate at the peak, averaging ∼ 20 % with a sharp gradient in its bias, such that, at 18 km, there is a statistically significant ∼ 20 % underestimate. The retrieved profiles using the 2-D retrieval described above have a bias profile shape similar to that of "fast" (Figs. 6, 7) but of smaller amplitude, with a ∼ +10 % typical median bias at the number density peak for the 2-D retrieval and a ∼ 10 % underestimation at 18 km, which is statistically insignificant in terms of the median bias (Fig. 6). The bias of the v3.0 product is similar to the alternative OSIRIS NO2 products above 20 km, but between 15 and 17 km there is a significant positive bias using both central tendency statistics (Figs. 6, 7). In contrast to "fast" and v3.0 products, there are no altitudes in the lower stratosphere (below 24 km) with statistically significant average and median biases for the 1-D and 2-D products. This conclusion is not sensitive to the use of the MS approximation used for the 1-D product (see Sect. 2.3) (not shown). While Figs. 6 and 7 address the systematic errors of the various OSIRIS products, it is also important to consider the precision of these data since a product whose biases in individual profiles average to zero could still fail to adequately capture the variability. This is potentially a greater concern for the v3.0 product since it relies slightly on a priori NO2 and vertically smooths the retrieved NO2 when measurement precision is lacking. However, Haley and Brohede (2007) and Haley et al. (2004) show that these are very minor concerns as the measurement response and vertical resolution do not deteriorate significantly down to an altitude of 12 km. Figure 8 shows that the v3.0 product captures the variability of the balloon measurements down to ∼ 22 km as well as the 2-D and 1-D products and better than "fast" NO2, but then has larger scatter below 20 km than the 2-D and "fast" products. The "fast" retrieval benefits from more accurate forward modelling than the v3.0 algorithm. The reduced scatter for the "fast" NO2 product in the lower stratosphere is unlikely to be a difference between inversion approaches (MART vs. the optimal estimation approach used in the v3.0 algorithm) as discussed above. The new product also may have an edge over the v3.0 product by virtue of using SaskTran as its forward model. For the 2-D mode, there is also the specific benefit of the forward model accounting for diurnal gradients in NO2. This can be seen clearly between 13 and 16 km where profiles retrieved with the 2-D forward model mode tend to be more precise than those retrieved using the 1-D mode due to this built-in photochemical modelling capability. This confirms the better precision of the 2-D retrieval suggested by Fig. 3 and is only expected for large SZAs (McLinden et al., 2006). Note that errors due to the neglect of diurnally varying chemical gradients can alternate in sign depending on the viewing geometry (McLinden et al., 2006; Brohede et al., 2007), so mean or median biases may not be significantly affected by switching from the 1-D to the 2-D RTM mode (Figs. 6, 7), even though the standard error of the individual biases is reduced (Fig. 8).
Correlation can be used as an alternative statistic to verify whether an OSIRIS product captures the variability observed by coincident balloon data. It is different from the standard error statistic used in Fig. 8 since multiplicative or additive biases do not affect the correlation but do affect the standard error of the individual biases. The correlation is calculated over the 46 coincident balloon profiles at each available altitude (Fig. 4). Figure 9 shows a general decrease in correlation between OSIRIS and coincident balloon data with decreasing altitude. There is higher correlation with balloon NO2 data for the 2-D mode retrieval over the 1-D mode retrieval at the lowest altitudes, consistent with Fig. 3. The correlation of the v3.0 product is comparable with the 1-D and 2-D mode products down to the lowest validated altitudes, whereas the "fast" product has generally lower correlations above 22 km than the OSIRIS NO2 products relying on spectral fitting.
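The distinction between the two statistics can be illustrated numerically for the multiplicative case. The data below are synthetic and deterministic, the helper names are hypothetical, and the differences are absolute rather than relative, which is an assumption of this sketch.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two samples."""
    return float(np.corrcoef(a, b)[0, 1])

def standard_error_of_differences(sat, ref):
    """Standard error of the individual (satellite minus reference) differences."""
    d = np.asarray(sat, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.std(d, ddof=1) / np.sqrt(d.size))

# 46 synthetic coincidences at one altitude (molec cm^-3).
balloon = np.linspace(0.5e9, 2.0e9, 46)
satellite = balloon + 5.0e7 * np.sin(np.arange(46))  # balloon plus deterministic scatter
satellite_biased = 1.2 * satellite                   # add a 20 % multiplicative bias

r_plain = pearson_r(satellite, balloon)
r_biased = pearson_r(satellite_biased, balloon)
se_plain = standard_error_of_differences(satellite, balloon)
se_biased = standard_error_of_differences(satellite_biased, balloon)
```

Here `r_plain` and `r_biased` agree to rounding error, while `se_biased` exceeds `se_plain` because the scaled differences inherit part of the natural variability of the profile.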

Discussion
Next, we review and discuss the sensitivity of the retrieval to forward model parameters. There is a slight sensitivity of the retrieved NO2 to changes in aerosol extinction. Previous sensitivity studies (Sioris et al., 2003; Haley et al., 2004) are consistent with the current findings. The scene albedo in the visible can vary from < 0.03 for calm ocean to almost unity for fresh snow, although the NO2 retrieval is, by design, insensitive to surface albedo by virtue of the high-TH normalization of the radiance spectra (Sioris et al., 2003). Use of the retrieved scene albedo (instead of the default value of 0.3) and of the v5.07 aerosol extinction profile further reduces these sensitivities. Finally, use of OSIRIS-observed (v5.07) ozone instead of the default ozone climatology in SaskTran (McPeters et al., 2007) is expected to have a minor impact on retrieved NO2 via errors in modelling the atmospheric extinction based on Sioris et al. (2007). With a method that involves simulating spectra at high THs for the purpose of normalizing radiances simulated for THs within the retrieval range, the forward model must accurately compute the radiance over a large range of THs. The NO2 retrieval error due to the pseudo-spherical approximation is expected to have its largest impact at low altitudes because of the vertical gradient in pseudo-spherical RTM errors (Griffioen and Oikarinen, 2000). At the top of the retrieval range (40 km), the error due to this approximation is cancelled by the use of the high altitude reference.
Figure 10. Climatological map of zonal mean NO 2 volume mixing ratio (ppb) with 10° latitudinal binning and 1 km vertical binning centred at altitudes between 9.5 and 39.5 km for November (2001-2014, top), January (2002-2014, middle), and March (2002-2014, bottom).
While not included in Figs. 5-9, Chahine's relaxation method tends to produce sharper extrema in the retrieved NO 2 profile than MART, and these extrema tend to be slightly displaced vertically from those in the balloon data. This may stem from the coincidence criterion of 1000 km in distance, but it may also relate to the ∼ 2 km vertical sampling of OSIRIS and the 1 km altitude grid used for the retrieval. If the vertical grid of the retrieval is finer than the vertical sampling, a vertically narrow layer at 19 km, for example, would be retrieved as peaking at 18 km when the available radiance spectra are measured at THs of 18 and 20 km. The comparison of MART and Chahine inversion approaches is worth revisiting with the 1 km vertical sampling offered by the Ozone Mapping and Profiler Suite (Jaross et al., 2014).
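The sampling argument above can be illustrated with a toy calculation. The sketch below is not the MART or Chahine inversion used in this work: it is a simple onion-peeling solution with linear interpolation between sampled tangent heights, and every number in it (Earth radius, background density, layer amplitude, altitude range) is a hypothetical choice made for illustration. It uses only straight-line limb geometry through 1 km shells, with tangent heights every 2 km.

```python
import math

R_EARTH = 6371.0  # km; assumed spherical Earth, no refraction


def path_length(tangent_km, shell_km, half=0.5):
    """Line-of-sight length (km) through a 1 km shell centred at shell_km
    for a ray whose tangent height is tangent_km."""
    rt = R_EARTH + tangent_km
    hi = R_EARTH + shell_km + half
    if hi <= rt:
        return 0.0  # shell lies entirely below the tangent point
    lo = max(R_EARTH + shell_km - half, rt)
    return 2.0 * (math.sqrt(hi * hi - rt * rt) - math.sqrt(lo * lo - rt * rt))


grid = list(range(16, 24))    # 1 km retrieval grid (km), hypothetical range
tangents = [16, 18, 20, 22]   # 2 km tangent-height sampling (km)

# Hypothetical truth: uniform background plus a narrow layer at 19 km.
truth = {z: 1.0 for z in grid}
truth[19] += 5.0


def simulate(profile):
    """Slant column at each tangent height for a profile on the 1 km grid."""
    return [sum(path_length(t, z) * profile[z] for z in grid) for t in tangents]


def expand(coarse):
    """Map values at sampled tangent heights onto the 1 km grid:
    linear interpolation in between, constant above the top tangent height."""
    fine = {}
    for z in grid:
        if z in coarse:
            fine[z] = coarse[z]
        elif z > tangents[-1]:
            fine[z] = coarse[tangents[-1]]
        else:
            fine[z] = 0.5 * (coarse[z - 1] + coarse[z + 1])
    return fine


measured = simulate(truth)

# Onion peeling from the top down; the forward model is linear in each unknown,
# and unknowns below a tangent height do not affect that line of sight.
coarse = {t: 0.0 for t in tangents}
for i in reversed(range(len(tangents))):
    t = tangents[i]
    coarse[t] = 0.0
    base = simulate(expand(coarse))[i]
    coarse[t] = 1.0
    slope = simulate(expand(coarse))[i] - base
    coarse[t] = (measured[i] - base) / slope

retrieved = expand(coarse)
peak_true = max(grid, key=lambda z: truth[z])
peak_retrieved = max(grid, key=lambda z: retrieved[z])
print("true peak (km):", peak_true, "| retrieved peak (km):", peak_retrieved)
```

In this toy setup, the layer contributes only to the line of sight with an 18 km tangent height, so the solution places the enhancement at 18 km and the interpolated value at 19 km is smaller. Real retrievals differ in many respects, so this should be read only as a geometric illustration of the displacement.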
In order to rigorously validate this new retrieval algorithm in the upper troposphere, tropospheric chemistry must be added to the photochemical model used to scale the OSIRIS observations to balloon local time. The reaction of NO 2 with the hydroxyl radical to form HNO 3 drives the diurnal variation of tropospheric NO 2 more than N 2 O 5 photolysis (Boersma et al., 2009). Accurate knowledge of the seasonal variation of the OH concentration is required for modelling the diurnal variation of upper tropospheric NO 2 .

Conclusions
Profiles of NO 2 retrieved from OSIRIS have been improved in terms of reduced bias and scatter vs. the current v3.0 operational algorithm in the lower stratosphere (e.g. 15 km) as determined using highly accurate balloon measurements as truth. The median bias is within ∼ ±10 % between 14 and 37 km.
The benefits of spectral fitting and extending the fitting window to longer wavelengths are evident at the highest altitudes where the photoelectron shot noise tends to be large relative to the NO 2 absorption signal, as well as at the lowest altitudes where the retrieved profile shape is largely driven by small differences in NO 2 SCDs obtained from spectra at adjacent tangent heights. Algorithms that exploit the richness of the NO 2 absorption spectrum are shown to better capture the mid-stratospheric variability of this key constituent. The use of a fully spherical forward model is an important advantage, particularly at lower altitudes: although the pseudo-spherical RTM errors are largest at the top of the retrieval range, the largest NO 2 retrieval errors due to the use of a pseudo-spherical RTM occur at low altitudes because a high tangent height reference is used in the retrieval.
A model capable of representing diurnal gradients in NO 2 also helps to improve the precision of the retrieved number density profile, particularly at the lowest altitudes where the horizontal gradients in NO 2 are sharpest and where the radiance is predominantly from the near side of the limb.

Figure 1. Measured and fitted differential optical depth (DOD) as a function of tangent height ("Alt") in the spectral window used for NO 2 retrieval. The residual is calculated as measured DOD minus fitted DOD.

Figure 3. NO 2 profile retrieved from scan 44 of orbit 16011 using the 1-D, 2-D, and 3-D retrievals, all converted to sunset (tangent point SZA = 90°), the local time of coincident SAOZ measurements at Bauru (22.35° S, 49.03° W). OSIRIS is on the day side with SZA = 86.6° (p.m.) and an azimuth difference angle of 99° such that the far side of the limb is closer to the terminator.

Figure 4. Sample size of coincident balloon data vs. altitude and as a function of retrieval algorithm.Sample size vs. height for 2-D mode is identical to that for 1-D mode.

Figure 5. Relative standard error of coincident NO 2 profiles. Note that OSIRIS profiles have been scaled to various local times of the balloon measurements, which adds random error to the OSIRIS profiles due to random variations between the photochemical model atmosphere and the true atmosphere.

Figure 6.Median of individual biases vs. balloon data.The error bar shows ±1 standard error of the median NO 2 bias profile.The median and standard error of the individual biases are converted to relative quantities by dividing by the corresponding median OSIRIS NO 2 profile (scaled to balloon local time).The relative median bias profiles for the other three algorithms are shown in grey for comparison.

Figure 8.Standard error of the individual biases relative to balloon correlative data for different OSIRIS NO 2 products.The curves here correspond to the half-widths of the error bars in Figs. 6 and 7.

Figure 9. Correlation between various OSIRIS NO 2 products and the balloon validation dataset.

Table 1. Main features of OSIRIS NO 2 algorithms compared in this work.

Table 2.
(Jégou et al., 2013)) data based on OSIRIS coincidences. The minimum relative uncertainty is reported on the native vertical grid. Vertical resolution is quoted or calculated for an altitude of 16 km. The vertical resolution is provided by Butz (2006) for LPMA and DOAS, Weidner et al. (2005) for mini-DOAS, available at http://mark4sun.jpl.nasa.gov/m4data.html for MkIV, and calculated for limb occultation for the SAOZ-type instruments, including SALOMON-N2 (Jégou et al., 2013), assuming vertical resolution equal to the vertical extent of the solar disk at the tangent point. Uncertainties are for a 1 km vertical grid, except for SPIRALE (0.005 km) and mini-DOAS (2 km).