Validation of ACE-FTS version 3.5 NOy species profiles using correlative satellite measurements

The ACE-FTS (Atmospheric Chemistry Experiment – Fourier Transform Spectrometer) instrument on the Canadian SCISAT satellite, which has been in operation for over 12 years, has the capability of deriving stratospheric profiles of many of the NOy (N + NO + NO2 + NO3 + 2 × N2O5 + HNO3 + HNO4 + ClONO2 + BrONO2) species. Version 2.2 of ACE-FTS NO, NO2, HNO3, N2O5, and ClONO2 has previously been validated, and this study compares the most recent version (v3.5) of these five ACE-FTS products to spatially and temporally coincident measurements from other satellite instruments – GOMOS, HALOE, MAESTRO, MIPAS, MLS, OSIRIS, POAM III, SAGE III, SCIAMACHY, SMILES, and SMR. For each ACE-FTS measurement, a photochemical box model was used to simulate the diurnal variations of the NOy species and the ACE-FTS measurements were scaled to the local times of the coincident measurements. The comparisons for all five species show good agreement with correlative satellite measurements. For NO in the altitude range of 25–50 km, ACE-FTS typically agrees with correlative data to within −10 %. Instrument-averaged mean relative differences are approximately −10 % at 30–40 km for NO2, within ±7 % at 8–30 km for HNO3, better than −7 % at 21–34 km for local morning N2O5, and better than −8 % at 21–34 km for ClONO2. Where possible, the variations in the mean differences due to changes in the comparison local time and latitude are also discussed.
Published by Copernicus Publications on behalf of the European Geosciences Union. P. E. Sheese et al.: Validation of ACE-FTS version 3.5 NOy species profiles.


Introduction
Currently, the only way to get global observational coverage of the Earth's atmosphere is with satellite-based observations. In addition, no single instrument can give us the full picture. Several instruments are needed in order to give us full global, vertical, and temporal coverage. Understanding biases between instruments is thus critical to understanding the true state of the atmosphere.
NOy is the complete set of reactive nitrogen species. Its concentration is calculated as [NOy] = [N] + [NO] + [NO2] + [NO3] + 2 × [N2O5] + [HNO3] + [HNO4] + [ClONO2] + [BrONO2]. The abundances of NOy as well as the partitioning and interactions of its components are important to understand because they play a significant role in ozone chemistry. The main source of the NOy species in the stratosphere is through oxidation of N2O. NO and NO2 can also descend from the lower thermosphere, where they are mainly produced via energetic particle precipitation, into the upper stratosphere during the polar winter (Randall et al., 1998, 2009; Funke et al., 2005a). A detailed description of stratospheric NOy photochemistry is given, for example, by Brasseur and Solomon (2005) and a summary for the species validated in this study is given below.
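As a minimal sketch of the NOy bookkeeping defined in the abstract, the following sums species volume mixing ratios with N2O5 counted twice. The function name and the example values are hypothetical and chosen only for illustration; treating missing species as zero is a simplification, not part of the study's method.

```python
# Stoichiometric weights: N2O5 carries two nitrogen atoms, so it counts twice.
NOY_WEIGHTS = {
    "N": 1, "NO": 1, "NO2": 1, "NO3": 1, "N2O5": 2,
    "HNO3": 1, "HNO4": 1, "ClONO2": 1, "BrONO2": 1,
}


def total_noy(vmr):
    """Sum the weighted VMRs (any consistent unit) of the species in `vmr`.

    Species absent from `vmr` contribute nothing (illustrative simplification).
    """
    return sum(NOY_WEIGHTS[species] * value for species, value in vmr.items())


# Hypothetical values in ppbv, for illustration only.
example = {"NO": 2.0, "NO2": 4.0, "HNO3": 5.0, "N2O5": 1.0, "ClONO2": 0.5}
print(total_noy(example))  # 2 + 4 + 5 + 2*1 + 0.5 = 13.5
```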
The main source of NO in the stratosphere is through dissociation of N2O via reactions with excited O(¹D) atoms, and the majority of stratospheric N2O originates from surface soil and ocean emissions. The predominant destruction mechanism of stratospheric N2O is photolysis, producing O(¹D) in the process: N2O + hν → N2 + O(¹D). NO is also produced through reactions of O2 with atomic nitrogen, which can be produced by dissociation of N2 by cosmic rays. Cosmic rays can be a nontrivial source of NO in the polar regions. Stratospheric NO2 is produced through the reaction of NO with O3, as well as with ClO, BrO, HO2, and CH3O2. NO2 is destroyed through reactions with atomic oxygen and through photolysis, and both processes produce NO: NO2 + O → NO + O2 and NO2 + hν (λ < 405 nm) → NO + O(³P).
The main source of HNO3 is through the three-body reaction NO2 + OH + M → HNO3 + M, where M is an air molecule. The main sinks are through photolysis and through destruction via reactions with OH: HNO3 + hν → OH + NO2 and HNO3 + OH → H2O + NO3. HNO3 is also produced on the surface of ice (H2O) particles, water droplets, nitric acid ice, and sulfate aerosols through the heterogeneous reaction N2O5 + H2O → 2 HNO3.
N2O5 is produced mainly at night, when there is an abundance of NO3, through the three-body reaction NO2 + NO3 + M → N2O5 + M. The main sinks of N2O5 are through photolysis and through collisions: N2O5 + hν → NO2 + NO3 and N2O5 + M → NO2 + NO3 + M.
The main source of ClONO2 is through the three-body reaction ClO + NO2 + M → ClONO2 + M, and the main sink is through photolysis, ClONO2 + hν (λ ≤ 320 nm) → Cl + NO3 (or ClO + NO2).
Concentrations of these NOy species can have large diurnal variations because the reactions governing their production and destruction depend on sunlight. To account for diurnal variations, calculations made using the "Pratmo" photochemical box model (McLinden et al., 2002) are used to scale local times between the two instruments. This model was used by Kerzenmacher et al. (2008) [...] Tables 1 and 2, respectively. Section 2 outlines the instruments and the data sets used in this study. Section 3 describes the methodology as well as the Pratmo photochemical box model. The results of the comparisons with ACE-FTS, with and without the use of the photochemical box model, are detailed in Sect. 4. A summary and discussion of the results is given in Sect. 5.
Table 2. Reported retrieval uncertainties for the data sets used in this study. The listed ACE-FTS values represent mean statistical fitting errors. The values given are in the altitude range of 20-60 km for NO, 20-40 km for NO2, 15-30 km for HNO3, 20-40 km for N2O5, and 17-38 km for ClONO2. The MIPAS IMK-IAA uncertainties were obtained from the respective validation studies discussed in Sect. 2.2.2. Note that these are the uncertainties reported as "systematic" and "random" uncertainties and are not all necessarily at the same confidence level.
The ACE-FTS instrument is a solar occultation, high-resolution (0.02 cm⁻¹) spectrometer operating between 750 and 4400 cm⁻¹. It was launched in August 2003 into a high-inclination orbit of 74° near an altitude of 650 km, and ACE-FTS has been providing volume mixing ratio (VMR) profiles of over 30 atmospheric trace gases and of over 20 isotopologue species since February 2004. During either sunset or sunrise, ACE-FTS makes a measurement approximately every 2 s between ∼ 5 and 150 km with a vertical sampling between ∼ 2 and 6 km, depending on the orbital geometry. The vertical extent of the instrument field of view is ∼ 3-4 km at the tangent point.
The trace species VMR retrieval is a nonlinear, least-squares, global-fitting technique that fits the observed spectra in given spectral microwindows (dependent on the retrieved species) to forward modelled spectra. Modelled spectra use line strengths and widths from HITRAN 2004 (Rothman et al., 2005) (with various updates, as detailed by Boone et al., 2013) and use the derived temperature and pressure profiles determined by fitting CO2 lines in the observed spectra. The main updates in v3.5 (compared to v2.2) are improved sets of microwindows for the majority of species, along with an increase in the number of interfering species in their retrievals; improved temperature/pressure retrievals resulting in a reduction of profiles exhibiting unrealistic temperature oscillations; and the inclusion of trace species COCl2, COClF, H2CO, CH3OH, and HCFC-141b and the exclusion of ClO.
The ACE-FTS v3.5 NO retrieval uses 39 microwindows between 1649.3 and 1977.6 cm⁻¹. The main interfering species within the NO microwindows is O3, but spectral features of CO2 and H2O isotopologues and COF2 are also present. The retrieval has a lower altitude limit of 6 km and an upper altitude limit of 107 km. ACE-FTS v2.2 NO was validated by Kerzenmacher et al. (2008), and there were two known issues with the v2.2 results (still present in the v3.5 NO results). At altitudes below ∼ 20 km, NO VMRs suffer from a significant negative bias that causes many unphysical negative results. This is most likely due to strong diurnal variation along the line of sight that is not taken into account in the NO retrievals. Also, in polar winter around 35-50 km, where the NO VMR profile has a large vertical gradient, during times of increased downwelling, NO VMRs can exhibit large negative spikes. Kerzenmacher et al. (2008) found that, on average, ACE-FTS v2.2 NO agreed with coincident HALOE data on the order of 8 % within the altitude range of 22-64 km and exhibited a positive bias of ∼ 10 % from 93 to 110 km, and that the uncertainties were too large for statistically significant comparisons in the 64-93 km region.
The v3.5 NO2 retrieval uses 40 microwindows between 1204.4 and 2950.9 cm⁻¹. The majority of microwindows added since v2.2 were chosen because of their information content with respect to the spectrally interfering isotopologues of CH4 and H2O. Between 7 and 20 km, CO2 and OCS also significantly interfere with the NO2 lines. The retrieval has a lower altitude limit of 7 km and an upper altitude limit of 52 km. ACE-FTS v2.2 NO2 was validated by Kerzenmacher et al. (2008), who concluded that ACE-FTS NO2 typically exhibited a ∼ 15 % low bias with coincident satellite data near the peak (∼ 35 km) and on average was within 20 % in the altitude range of approximately 20-40 km.
The v3.5 HNO3 retrieval uses 41 microwindows between 865.5 and 1977.6 cm⁻¹. Interfering species include CCl2F2, H2O, CO2, OCS, and O3. The retrieval has a lower altitude limit of 5 km and an upper altitude limit of 62 km. ACE-FTS v2.2 HNO3 was validated by Wolff et al. (2008), who found that the ACE-FTS data and all coincident satellite data agreed to within 20 % in the altitude range of 18-35 km.
The v3.5 N2O5 retrieval, with altitude limits of 8 and 45 km, has only one spectral window, 30.0 cm⁻¹ wide and centred at 1244.0 cm⁻¹. Interfering species include O3 and isotopologues of H2O, CO2, CH4, and N2O. ACE-FTS v2.2 update N2O5 profiles (herein v2.2) were compared with MIPAS IMK-IAA N2O5 profiles by Wolff et al. (2008), who used climatological results from a chemical transport model to calculate diurnal scaling factors in order to match the local times of the two instruments. Without the use of diurnal scaling, Wolff et al. (2008) found that ACE-FTS v2.2 N2O5 typically exhibited a low bias on the order of 30-50 %, whereas with diurnal scaling ACE-FTS typically exhibited a ∼ 10-35 % low bias.
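The diurnal scaling mentioned above can be sketched as follows. The text states only that model output is used to match the local times of two instruments; the ratio form used here (modelled VMR at the target local time divided by modelled VMR at the measurement local time, applied level by level) is an assumption made for illustration, and the function and argument names are hypothetical.

```python
def diurnal_scale(vmr_measured, model_at_measurement_lt, model_at_target_lt):
    """Scale a measured VMR profile to another local time.

    Assumption: the scaling factor at each level is the ratio of modelled
    VMRs at the target and measurement local times.
    """
    scaled = []
    for v, m_meas, m_targ in zip(vmr_measured,
                                 model_at_measurement_lt,
                                 model_at_target_lt):
        if m_meas == 0:
            scaled.append(float("nan"))  # ratio undefined at this level
        else:
            scaled.append(v * m_targ / m_meas)
    return scaled


# Hypothetical two-level profile (arbitrary units).
print(diurnal_scale([1.0, 2.0], [0.5, 1.0], [1.0, 1.5]))  # [2.0, 3.0]
```

Applying the factor multiplicatively keeps the measured profile shape tied to the observation while the model supplies only the relative change between the two local times.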
The v3.5 ClONO2 retrieval uses five microwindows between 780.2 and 2672.7 cm⁻¹. Interfering species include N2O, CH4, O3, HNO3, and isotopologues of N2O, CO2, H2O, and CH4. The retrieval has a lower altitude limit of 10 km and an upper altitude limit of 41 km at high latitudes and 36 km near the equator. ACE-FTS v2.2 ClONO2 was compared to co-located MIPAS IMK-IAA data by Wolff et al. (2008), who used diurnal scaling factors to match the local times of the two instruments. With the use of diurnal scaling, Wolff et al. (2008) showed that ACE-FTS v2.2 and MIPAS IMK-IAA ClONO2 values typically differed by less than 1 % between 16 and 24 km. Above the peak (∼ 25 km), ACE-FTS exhibited a positive bias with respect to MIPAS of up to 20 % near 33 km.
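The retrieval altitude limits quoted in the preceding paragraphs can be collected into a small lookup, sketched below. The dictionary and helper names are hypothetical, and treating the limits as inclusive is an assumption; only the numerical limits come from the text.

```python
# (lower_km, upper_km) limits quoted in the text for ACE-FTS v3.5 retrievals.
ALTITUDE_LIMITS_KM = {
    "NO": (6, 107),
    "NO2": (7, 52),
    "HNO3": (5, 62),
    "N2O5": (8, 45),
    "ClONO2": (10, 41),  # upper limit at high latitudes; 36 km near the equator
}


def within_limits(species, altitude_km, upper_override_km=None):
    """Return True if `altitude_km` lies inside the retrieval range.

    `upper_override_km` lets the caller supply a latitude-dependent upper
    limit (e.g. 36 km for ClONO2 near the equator).
    """
    lower, upper = ALTITUDE_LIMITS_KM[species]
    if upper_override_km is not None:
        upper = upper_override_km
    return lower <= altitude_km <= upper


print(within_limits("ClONO2", 38))                        # True
print(within_limits("ClONO2", 38, upper_override_km=36))  # False
```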
It should be noted that ACE-FTS also derives VMR profiles of HNO4; however, because HNO4 does not contribute substantially to the overall NOy budget and due to a lack of multiple correlative satellite data sets with which to validate, it is not included in this study. All ACE-FTS data used in this study were screened for physically unrealistic outliers using the recommended quality flags version 1.1, as described by Sheese et al. (2015). Any profile known to be affected by instrument or processing errors (flag values of 7) or any profile containing a data point determined to be an extreme outlier (flag value in the range of 4-6) was excluded from the analysis.
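The profile-level screening rule described above can be sketched as a simple predicate. The function name is hypothetical, and flag values other than 4-7 are simply treated as acceptable here because the text does not describe them.

```python
def keep_profile(flags):
    """Return True if no level carries a rejecting quality flag.

    Rejecting flags, per the text: 7 (instrument or processing error)
    and 4-6 (extreme outlier). Other values are not interpreted here.
    """
    for flag in flags:
        if flag == 7 or 4 <= flag <= 6:
            return False
    return True


print(keep_profile([0, 0, 1]))  # True
print(keep_profile([0, 5, 0]))  # False: one extreme outlier rejects the profile
```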

MAESTRO
The MAESTRO instrument consists of two spectrographs designed to cover the spectral range 210-1025 nm with a 1.5-2.5 nm spectral resolution, which observe direct solar radiation occulted by the Earth's atmosphere. The MAESTRO solar occultation measurements are used to retrieve profiles of O3, NO2, H2O, aerosol extinction, and other various atmospheric properties. The instrument has been providing measurements since February 2004. The NO2 retrieval algorithm, described by McElroy et al. (2007), uses a two-step process. The spectral fitting of apparent optical depth spectra is used to derive slant column densities, assuming temperature-independent NO2 and O3 absorption cross sections from Burrows et al. (1998, 1999). Then an iterative Chahine inversion technique (Chahine, 1968) is used to retrieve NO2 VMR profiles from the slant column values. The spectral fitting algorithm is performed over a spectral range of 420-750 nm, and NO2 profiles are retrieved in an altitude range of ∼ 5-52 km, with a vertical resolution on the order of 1-2 km.
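To illustrate the second step, the sketch below shows a generic Chahine-type relaxation, in which each layer value is multiplied by the ratio of observed to modelled slant column for the ray whose tangent point lies in that layer. This is not the MAESTRO implementation: the path-length matrix `K`, the three-layer geometry, and all numbers are hypothetical.

```python
def chahine_invert(K, y_obs, x0, n_iter=200):
    """Chahine-type relaxation: x_i <- x_i * y_obs_i / y_calc_i.

    K[i][j] is the (hypothetical) path length of ray i through layer j;
    ray i is associated with layer i, its tangent layer.
    """
    x = list(x0)
    n = len(x)
    for _ in range(n_iter):
        # Forward model: slant column along each ray for the current profile.
        y_calc = [sum(K[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Relax every layer by the observed-to-modelled ratio of its own ray.
        x = [x[i] * y_obs[i] / y_calc[i] for i in range(n)]
    return x


# Upper-triangular geometry: a ray only crosses its tangent layer and above.
K = [[2.0, 1.0, 1.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 2.0]]
x_true = [1.0, 2.0, 3.0]
y_obs = [sum(K[i][j] * x_true[j] for j in range(3)) for i in range(3)]

x_ret = chahine_invert(K, y_obs, x0=[1.0, 1.0, 1.0])
print([round(v, 6) for v in x_ret])  # close to x_true for this noise-free case
```

The multiplicative update keeps a positive first guess positive, which is one reason relaxation schemes of this kind are used for occultation profiles.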
Version 1.2 of the NO2 data (used in this study) was validated by Kerzenmacher et al. (2008), who found that between 25 and 40 km, when comparing to correlative satellite measurements, diurnally scaled MAESTRO NO2 tends to exhibit a bias within −20 and +10 %. In the same altitude region, scaled MAESTRO NO2 also tends to exhibit a high bias of 0-50 % when compared to correlative ground- and balloon-based measurements. The poorer comparison with ground-based instruments was attributed to not accounting for diurnal variations along the MAESTRO line of sight in the NO2 retrieval algorithm. It should be noted that this issue would similarly affect ACE-FTS NO2 retrievals.
The ACE-FTS outlier detection method described by Sheese et al. (2015) was used to detect physically unrealistic outliers in the MAESTRO NO2 data set. Any profile that was found to contain such an outlier was rejected prior to any comparisons. This method was ineffective at removing many of the outliers below 19 km. Therefore, at altitude levels below 19 km, NO2 VMR values greater than 3 ppb were screened out. At all altitude levels, any values with a corresponding fractional error of 1 or greater were also removed. Only data between February 2004 and September 2010 were used in the analysis.
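The two point-wise screens described above (the 3 ppb threshold below 19 km and the fractional-error threshold) can be sketched as follows. The function name and the sample values are hypothetical; the profile-level outlier detection and the date restriction are not reproduced here.

```python
def screen_maestro_no2(alt_km, vmr_ppb, frac_err):
    """Return (altitude, vmr) pairs that pass the two point-wise screens."""
    kept = []
    for z, v, e in zip(alt_km, vmr_ppb, frac_err):
        if z < 19 and v > 3:   # large low-altitude value, screened out
            continue
        if e >= 1:             # fractional error of 1 or greater
            continue
        kept.append((z, v))
    return kept


# Hypothetical sample points.
print(screen_maestro_no2([15, 17, 25, 30],
                         [4.0, 1.0, 5.0, 6.0],
                         [0.2, 0.3, 1.0, 0.1]))
# [(17, 1.0), (30, 6.0)]
```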

Instruments on Envisat
In March 2002, the European Space Agency (ESA) launched the Envisat satellite (Fischer et al., 2008) into a polar, sun-synchronous orbit near 800 km, with an ascending node of 22:00 LT (local time). On board the Envisat satellite were a number of atmospheric sounding instruments, including the limb sounders GOMOS, MIPAS, and SCIAMACHY, which are described in following sections. Ground control lost communication with the satellite in early April 2012, thus ending all observations from the Envisat instruments.

GOMOS
The GOMOS instrument (Kyrölä et al., 2004) on the Envisat satellite employed a grating spectrometer that observed the attenuation of stellar emission, from the ultraviolet (UV) to the near-infrared, through the limb of the Earth's atmosphere. The stellar occultation technique was employed to retrieve vertical profiles of nighttime O3, NO2, NO3, H2O, OClO, BrO, O2, and aerosol extinction nominally between altitudes of 5 and 150 km, using three different bands within the spectral range of 248-954 nm. GOMOS was capable of obtaining hundreds of occultations each day with a vertical sampling typically between 0.4 and 1.7 km. GOMOS measurements span from March 2002 to April 2012.
The NO2 retrieval algorithm is described by Kyrölä et al. (2010) and makes use of a Tikhonov-type regularization (Tikhonov, 1963), which leads to a retrieval vertical resolution of 4 km. Version 6 of the GOMOS NO2 data set is used in this study. Version 5 of the NO2 retrievals was validated by Verronen et al. (2009), who compared the GOMOS profiles to nighttime MIPAS ESA NO2 data (described below). It was found that in the low to midlatitudes, between approximately 25 and 60 km, GOMOS NO2 tended to exhibit a positive bias with respect to MIPAS on the order of 0-25 %. In the high latitudes, the two data sets agreed within 35 % at altitudes above ∼ 45 km where nighttime NO2 VMR was at a maximum. However, at lower altitudes (in the high latitude regions) the bias reached up to 65 %, which was greater than the combined systematic errors. Since the ACE-FTS NO2 profiles only extend up to 52 km, GOMOS comparisons have been limited to between 60° S and 60° N.
Only GOMOS profiles where the local solar zenith angle is greater than 97° at altitudes below 50 km and greater than 110° at altitudes below 100 km were used in the analysis. In order to eliminate the presence of extreme outliers, any GOMOS NO2 profile that contained an absolute VMR value greater than 0.5 ppm in the altitude range of 0-52 km was also rejected; in the limited latitude region this rejected less than 1 % of the GOMOS profiles.
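A minimal sketch of the GOMOS latitude restriction and extreme-outlier rejection described above is given below. The function name and the sample profiles are hypothetical, and the solar zenith angle criteria are deliberately left out of this sketch.

```python
def keep_gomos_profile(lat_deg, alt_km, vmr_ppm):
    """Return True if a profile passes the latitude and outlier screens.

    Latitude window: 60 S to 60 N. Outlier rule: reject the whole profile
    if any |VMR| exceeds 0.5 ppm between 0 and 52 km.
    """
    if abs(lat_deg) > 60:
        return False
    for z, v in zip(alt_km, vmr_ppm):
        if 0 <= z <= 52 and abs(v) > 0.5:
            return False
    return True


# Hypothetical profiles.
print(keep_gomos_profile(45.0, [30, 40], [0.002, 0.008]))  # True
print(keep_gomos_profile(45.0, [30, 40], [0.002, -0.7]))   # False
```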

MIPAS
The MIPAS instrument (Fischer and Oelhaf, 1996; Fischer et al., 2008) on the Envisat satellite was a limb-viewing Fourier transform spectrometer that observed atmospheric emissions. The spectrometer had five spectral bands in the range of 685-2410 cm⁻¹ and scanned the Earth's limb between altitudes from approximately 6 to 70 km in nominal mode and up to 170 km in special modes. The MIPAS vertical field of view was 3 km and the instrument had a vertical sampling that ranged from 1.5 to 5 km, depending on the altitude. Prior to 2005, MIPAS operated at its full spectral resolution of 0.025 cm⁻¹, with a sampling time of 4.5 s. In 2004, an anomaly occurred in the interferometer mirror slide mechanism and it was determined that the spectral resolution needed to be downgraded to 0.0625 cm⁻¹ with a consequent reduction of the sampling time to 1.8 s, exploited to allow for a finer vertical sampling. In order to avoid any discontinuities that may arise from switching the observation mode, only MIPAS measurements from the period of January 2005 to April 2012 were used in this study.
Two different MIPAS level 2 products, based on two different retrieval algorithms, were used in this study: the first is from the ESA and the second is the result of a collaboration between the Institute of Meteorology and Climate Research at the Karlsruhe Institute of Technology and the Instituto de Astrofísica de Andalucía (IMK-IAA). The ESA algorithm that produces version 6 of the level 2 retrievals (used in this study) is described by Raspollini et al. (2013). It is a least-squares, global-fitting technique, using the regularized Levenberg-Marquardt method (Hanke, 1997), which fits spectra in species-dependent microwindows to a forward model. A parameter setting has been chosen that leaves results largely independent from the initial guess profiles. The forward model assumes horizontal homogeneity and local thermodynamic equilibrium at all altitudes. An a posteriori regularization, using a self-adapting regularization constraint, is then applied to the retrieved profile (Ceccherini, 2005; Ceccherini et al., 2007).
The IMK-IAA algorithm is described by von Clarmann et al. and Funke et al. (2014), and the most recent version of the level 2 data (used in this study) is version 5. The IMK-IAA algorithm uses an iterative variant of Tikhonov regularization (Tikhonov, 1963) on species-dependent sets of microwindows. This inversion technique is implemented to constrain the shape of the resulting profile without pushing the values towards an a priori profile. The retrieval is performed on a 1 km grid, and the altitude-dependent strength of the smoothing constraint was chosen in order to optimize vertical resolution in the upper troposphere to lower mesosphere while still minimizing artificial oscillations in the retrieved profile. The NO and NO2 retrievals are performed in log(VMR) space, and the forward model allows for horizontal variation in temperature. In the forward model, NO and NO2 line-of-sight variations are considered and a line-of-sight NOx gradient is retrieved concurrently. Further, the forward model can allow for deviations from local thermodynamic equilibrium (LTE), which mainly affects mesospheric retrievals, and LTE is assumed for all NOy species except NO and NO2. The NOy microwindows are chosen, in part, in order to reduce non-LTE effects.
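The idea of constraining profile shape without pulling values towards an a priori can be illustrated with a first-difference Tikhonov constraint. The sketch below is a generic, single-step example and not the IMK-IAA processor: the forward model is taken to be the identity, the regularization strength `lam` is constant with altitude, and all names and numbers are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def tikhonov_smooth(y, lam):
    """Minimise |x - y|^2 + lam * |L x|^2, L being the first-difference operator.

    The normal equations are (I + lam * L^T L) x = y.
    """
    n = len(y)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for i in range(n - 1):      # accumulate lam * L^T L (tridiagonal)
        A[i][i] += lam
        A[i + 1][i + 1] += lam
        A[i][i + 1] -= lam
        A[i + 1][i] -= lam
    return solve(A, y)


def roughness(x):
    """Sum of squared differences between adjacent levels."""
    return sum((b - a) ** 2 for a, b in zip(x, x[1:]))


y = [1.0, 3.0, 1.0, 3.0, 1.0]   # hypothetical oscillating profile
x = tikhonov_smooth(y, lam=1.0)
print(roughness(x) < roughness(y))  # True: oscillations are damped
```

Because the constraint acts only on differences between adjacent levels, it damps oscillations while leaving the profile mean unchanged in this example, which is the sense in which no a priori value is imposed.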
MIPAS IMK-IAA NOx retrievals (only in the original resolution mode) were compared to HALOE measurements by Funke et al. (2005b). It was found that the two NOx data sets typically agreed within 20 % between 25 and 50 km. Wetzel et al. (2007) found that, in the mid-stratosphere, MIPAS ESA version 4.6 NO2, diurnally scaled using data from a 1-D photochemical model, agreed best with balloon-borne measurements, with biases typically better than 10 %. In similar comparisons with correlative satellite-based solar occultation measurements, the MIPAS ESA profiles typically agreed within 10-30 %. Wang et al. (2007a, b) assessed the quality of the MIPAS IMK-IAA version 3 and MIPAS ESA version 4.6 HNO3 data sets, respectively. Comparing MIPAS ESA HNO3 with correlative data sets from ground-based and balloon-borne instruments, both Wang et al. (2007a, b) studies determined that relative differences were typically better than 10 %. In their comparisons with ACE-FTS v2.2 HNO3, relative differences in the lower to mid-stratosphere were on the order of 5-15 %.
MIPAS IMK-IAA ClONO2 profiles were validated by Höpfner et al. (2007), who showed that the MIPAS data set agreed well with correlative balloon and airborne data sets, typically to better than 10 %. Höpfner et al. (2007) also compared the MIPAS IMK-IAA profiles to ACE-FTS v2.2 ClONO2 using diurnal correction factors obtained from a chemical transport model. The diurnally corrected MIPAS data and ACE-FTS typically agreed within 10 % at altitudes between 15 and 27 km. However, above 27 km, the ACE-FTS exhibited a ∼ 20 % low bias with the diurnally corrected MIPAS data and a ∼ 20 % high bias with the uncorrected data.
Neither the MIPAS ESA nor IMK-IAA N2O5 data set has been the focus of a MIPAS validation study, but MIPAS ESA N2O5 and ClONO2 data were compared with the balloon-based MIPAS-B instrument by Wetzel et al. (2013). It was found that N2O5 concentrations typically agree within ±40 % and ClONO2 concentrations typically within ±30 %. Also, the IMK-IAA N2O5 data set was used in the ACE-FTS v2.2 N2O5 validation study of Wolff et al. (2008), the results of which are summarized in Sect. 2.1.1.
All MIPAS vertical resolutions, listed in Table 1, were calculated as the full-width, half-maximum of the retrieval averaging kernels. MIPAS IMK-IAA data were used only where the corresponding averaging kernel diagonal values were greater than 0.03.
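The vertical-resolution definition and the averaging-kernel threshold above can be sketched as follows. The half-maximum crossings are located by linear interpolation on the retrieval grid, which is an implementation choice made here, and a single dominant peak is assumed; the function names and the triangular test kernel are hypothetical.

```python
def fwhm(z, kernel):
    """Full width at half maximum of one averaging-kernel row.

    Assumes one dominant peak; crossings are linearly interpolated.
    """
    peak = max(kernel)
    half = peak / 2.0
    ipk = kernel.index(peak)

    i = ipk                      # walk down to the lower crossing
    while i > 0 and kernel[i - 1] >= half:
        i -= 1
    if i == 0:
        z_lo = z[0]              # kernel never drops below half maximum
    else:
        frac = (half - kernel[i - 1]) / (kernel[i] - kernel[i - 1])
        z_lo = z[i - 1] + frac * (z[i] - z[i - 1])

    j = ipk                      # walk up to the upper crossing
    while j < len(z) - 1 and kernel[j + 1] >= half:
        j += 1
    if j == len(z) - 1:
        z_hi = z[-1]
    else:
        frac = (kernel[j] - half) / (kernel[j] - kernel[j + 1])
        z_hi = z[j] + frac * (z[j + 1] - z[j])
    return z_hi - z_lo


def usable_levels(ak_diagonal, threshold=0.03):
    """Flag levels whose averaging-kernel diagonal exceeds the threshold."""
    return [a > threshold for a in ak_diagonal]


# Hypothetical triangular kernel on a 1 km grid, peak at 10 km, base 8 km wide.
z = [float(k) for k in range(21)]
kernel = [max(0.0, 1.0 - abs(zk - 10.0) / 4.0) for zk in z]
print(fwhm(z, kernel))  # 4.0
```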

SCIAMACHY
The SCIAMACHY instrument (Burrows et al., 1995; Bovensmann et al., 1999) was an eight-channel grating spectrometer that observed the Earth's atmosphere in the wide spectral range of 240-2400 nm, using three different viewing geometries: limb viewing of scattered sunlight, solar occultation, and nadir viewing. The NO2 data used in this study are the profiles retrieved from limb-viewing observations in the channel that observed in the spectral window of 394-620 nm (spectral channel 3). The instrument scanned the Earth's limb from the surface up to 100 km with a 2.5 km vertical field of view and a ∼ 3 km vertical sampling. The NO2 retrieval algorithm, summarized by Bauer et al. (2012), uses limb-scattered radiances measured from 420 to 470 nm and solves the inverse problem using the DOAS technique and Tikhonov regularization (Tikhonov, 1963). In each profile, the spectra are normalized by the limb radiances nearest 43 km. The regularization matrix smooths the retrievals using an empirically determined height-dependent smoothing parameter, chosen in order to minimize physically unrealistic oscillations in profiles while maximizing vertical resolution. The retrieval makes use of a forward model that takes into account absorption by O3 (simultaneously retrieved) and O2-O2 and uses pressure and temperature profiles from the European Centre for Medium-Range Weather Forecasts (ECMWF). The NO2 and O3 absorption cross sections were obtained from Bogumil et al. (1999). The algorithm retrieves NO2 profiles between 10 and 40 km with a typical vertical resolution of 3-5 km, degrading to ∼ 10 km at the upper and lower retrieval altitude limits.
This study used v3.1 of the SCIAMACHY level 2 NO2 profiles, which was validated by Bauer et al. (2012). The NO2 profiles were compared to correlative satellite measurements that were diurnally scaled to the SCIAMACHY local times. It was found that in the altitude range of 25-35 km SCIAMACHY NO2 tends to exhibit a 2 % low bias with respect to HALOE v19 profiles and tends to exhibit a 5 % high bias with respect to ACE-FTS v2.2 profiles.
Only SCIAMACHY data below 40 km with a retrieval response greater than 0.8 were used in the analysis.

HALOE on the Upper Atmosphere Research Satellite (UARS)
The HALOE instrument (Russell et al., 1993), on the UARS, was a solar occultation instrument that provided observations of the Earth's limb between October 1991 and November 2005. The UARS precessing orbit allowed for HALOE measurements to observe all latitudes between 80° S and 80° N approximately every 36 days. Profiles of O3, HCl, HF, CH4, H2O, NO, NO2 concentrations, temperature, and aerosols were derived from observations within four radiometric channels and four radiometric/gas-filter correlation channels. The HALOE NO measurements use a gas-filter correlation method with a spectral filter band pass near 1900 cm⁻¹ and are virtually insensitive to interfering absorbers. The NO2 measurements are made using a broadband radiometric channel centred near 1600 cm⁻¹ and the effects of interfering species O2, H2O, and CH4 are accounted for in the retrieval. The interfering species N2O is not accounted for, although the effect on NO2 is very small. Retrievals of NO and NO2 profiles use a modified onion peel approach and account for aerosol extinction and interfering attenuation. The NO retrievals have a vertical resolution of 4 km at altitudes below ∼ 60 km (degrading to 7 km at higher altitudes), and the NO2 retrievals have a vertical resolution of 2 km. The HALOE version 17 NO and NO2 data were validated by Gordley et al. (1996). This study uses HALOE version 19 NO and NO2, which have very small differences relative to v17 (James M. Russell III, Hampton University, personal communication, December 2015). Gordley et al. (1996) found that above 25 km HALOE v17 NO tended to agree with correlative satellite and balloon-based measurements within 15 %, but with a maximum low bias reaching 35 %. Also, above 25 km HALOE v17 NO2 agreed with correlative satellite, balloon, and ground-based measurements to within 15 %.

POAM III on SPOT 4
The POAM III instrument (Lucke et al., 1999) was a nine-channel photometer that viewed the Earth's limb in solar occultation. POAM III, on board the Satellite Pour l'Observation de la Terre (SPOT) 4 satellite, was launched in March 1998 into a sun-synchronous orbit with a descending node of 10:30 LT, at an altitude of ∼ 830 km and a 98.7° inclination. Designed to measure atmospheric profiles of O3, H2O, NO2, and aerosol extinction, POAM III observed the limb at tangent heights between cloud-top and 60 km in nine different narrow passbands in the near-UV to near-infrared spectral region, with a total spectral range from 354 to 1018 nm. POAM III started taking measurements in April 1998, and measurements stopped in December 2005 due to instrument failure. NO2 profiles were retrieved between 20 and 45 km from differential measurements in the 439.6 nm (NO2 "on") and 442.2 nm (NO2 "off") channels, both with a full-width, half-maximum passband of 2.1 nm. The vertical resolution of retrieved NO2 was ∼ 1.5 km from 25 to 35 km, increasing to nearly 3 km at 20 km and > 7 km at 45 km. The retrieval algorithm is described in detail by Lumpe et al. (2002). The algorithm inverts slant column densities to vertical profiles using the Newtonian optimal estimation technique (Rodgers, 2008) for all target species. The forward model assumes horizontal homogeneity. Randall et al. (2002) validated POAM III version 3.0 NO2 measurements through comparisons with data from multiple instruments. They found no evidence for any systematic bias below 35 km; e.g. differences with respect to HALOE were within approximately ±0.2 ppbv (∼ 6 %). Relative to HALOE, POAM III NO2 mixing ratios were shown to be higher by up to 0.7 ppbv (∼ 17 %) from 35 to 42 km; about 5 % of that bias was attributed to an error in HALOE retrievals, but no explanation for the remaining 12 % was identified. Although the version 4 NO2 data (used in this study) have not been the focus of a validation study, they were used by Kerzenmacher et al. (2008) in comparison with ACE NO2. It was shown that above 25 km, POAM III typically agreed within ±6 % with respect to ACE-FTS v2.2 and within ±8 % with respect to MAESTRO v1.2.

SAGE III on Meteor 3M
The SAGE III instrument (SAGE III ATBD Team, 2002a) was a solar and lunar occultation atmospheric sounder on board the Russian Meteor 3M satellite, which was launched in December 2001 and was operational until March 2006. Meteor 3M was launched into a 1020 km altitude, sun-synchronous orbit with a descending node of ∼ 09:00 LT. In solar occultation mode, SAGE III was designed to retrieve vertical profiles of O3, NO2, H2O, and aerosol extinction (plus NO3 and OClO in lunar mode) throughout the stratosphere from observations in the near-UV to near-infrared spectral region. The instrument consisted of a grating spectrometer that observed in the spectral range of 280-1040 nm and an InGaAs infrared detector that observed in a band pass between 1530 and 1560 nm.
The SAGE III NO2 retrieval algorithm is detailed by SAGE III ATBD Team (2002b). The algorithm first uses a multiple linear regression technique to derive slant column densities for both O3 and NO2 simultaneously from calculated slant column optical depths. The O3 and NO2 region wavelength-dependent optical depths are derived from observations in two spectral channels spanning 433-450 and 563-622 nm. The NO2 column densities are inverted into vertical density profiles (on a 0.5 km grid between 0 and 100 km with a vertical resolution of 1-2 km) using a modified Chahine technique (Chahine, 1968), assuming horizontal homogeneity.
There has not yet been a rigorous SAGE III NO2 validation study. Kar et al. (2007) found that SAGE III NO2 version 3 data (used in this study) typically exhibited a high bias (within ∼ 10-15 %) above 25 km with respect to v1.2 MAESTRO data. Similarly, Kerzenmacher et al. (2008) found that the SAGE III v3 data also tended to exhibit a high bias (typically within ∼ 10 %) with respect to v2.2 ACE-FTS data. These results are consistent with Polyakov et al. (2005), who reported that their SAGE III NO2 product, derived using the Newtonian iterative optimal estimation technique, was systematically lower than the SAGE III operational product.
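The regression step can be illustrated with a two-absorber least-squares fit, in which the optical depth at each wavelength is modelled as the sum of each cross section times its slant column. This is a simplified sketch rather than the operational SAGE III algorithm: other contributions to the optical depth are ignored, and the function name, cross sections, and columns are hypothetical values in arbitrary units.

```python
def fit_two_absorbers(tau, sigma_a, sigma_b):
    """Least-squares fit of tau[k] = sigma_a[k] * Na + sigma_b[k] * Nb.

    Solves the 2x2 normal equations directly and returns (Na, Nb).
    """
    saa = sum(a * a for a in sigma_a)
    sbb = sum(b * b for b in sigma_b)
    sab = sum(a * b for a, b in zip(sigma_a, sigma_b))
    sat = sum(a * t for a, t in zip(sigma_a, tau))
    sbt = sum(b * t for b, t in zip(sigma_b, tau))
    det = saa * sbb - sab * sab
    na = (sat * sbb - sbt * sab) / det
    nb = (sbt * saa - sat * sab) / det
    return na, nb


# Hypothetical cross sections at four wavelengths (arbitrary units).
sigma_o3 = [1.0, 2.0, 4.0, 3.0]
sigma_no2 = [5.0, 4.0, 1.0, 2.0]
n_o3_true, n_no2_true = 3.0, 0.5
tau = [so * n_o3_true + sn * n_no2_true
       for so, sn in zip(sigma_o3, sigma_no2)]

n_o3, n_no2 = fit_two_absorbers(tau, sigma_o3, sigma_no2)
print(round(n_o3, 6), round(n_no2, 6))  # 3.0 0.5
```

Fitting both absorbers at once matters because their cross sections overlap in wavelength; fitting one while ignoring the other would bias the retrieved column.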

MLS on Aura
The MLS instrument (Waters et al., 2006) aboard the Aura satellite observes atmospheric thermal emission in the Earth's limb. It was launched into a sun-synchronous orbit at an altitude of ∼ 700 km and with an ascending node of 13:45 LT. The MLS consists of seven radiometers measuring in the spectral range of 118 GHz to 2.5 THz, and the spectra are used to retrieve atmospheric profiles of temper-ature, geopotential height, and concentrations of over 15 atmospheric trace species and cloud ice on a pressure vertical grid.
HNO 3 retrieved from MLS is scientifically useful between pressure limits of 215 and 1.5 hPa. In the lower altitude range, at pressures of 22 hPa or greater, the 240 GHz radiometer measurements are used and result in a HNO 3 vertical resolution on the order of 3-4 km; at higher altitudes, at pressures of 15 hPa or less, the HNO 3 retrievals use measurements from the 190 GHz radiometer and have a vertical resolution of 4-6 km. In both pressure regimes, HNO 3 level 2 v3.3/3.4 profiles (Livesey et al., , 2013 use a Newtonian optimal estimation technique (Rodgers, 2008), with a forward model that assumes horizontal homogeneity and uses absorption cross sections from the JPL Spectral Line Catalogue (Pickett et al., 1998).
Version 2.2 HNO 3 was validated by Santee et al. (2007), where the MLS data were compared to multiple data sets retrieved from ground-based, balloon-borne, aircraft, and satellite platforms. It was found that the MLS HNO 3 profiles were scientifically useful within the altitude range of approximately 10-40 km and that throughout the stratosphere MLS HNO 3 tended to exhibit a low bias on the order of 10-30 %. That low bias was largely eliminated in version 3.3 (Livesey et al., 2013).
All MLS measurements with corresponding negative precision values, indicating poor retrieval response, have not been included in the analyses, nor have any profiles determined to contain cloud contamination. However, the adverse effects on MLS v3 HNO 3 due to clouds were substantially mitigated in the most recent version, v4.2 (Livesey et al., 2015). The altitude-dependent vertical resolution was assumed to be constant for all retrievals and was calculated as the full-width, half-maximum of the mean averaging kernels.
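As an illustration of the resolution estimate described above, the full width at half maximum of a mean averaging-kernel row could be computed along the following lines. This is a minimal sketch, not the processing code used in this study: the function name `fwhm_km` is hypothetical, the kernel is assumed to have a single positive peak, and the half-maximum crossings are located by linear interpolation between grid levels.

```python
def fwhm_km(alt_km, kernel_row):
    """Full width at half maximum (km) of one averaging-kernel row.

    Hypothetical helper: assumes a single positive peak and uses linear
    interpolation between grid levels to locate the half-maximum crossings.
    Returns None if the kernel never drops to half maximum on one side.
    """
    peak = max(range(len(kernel_row)), key=lambda i: kernel_row[i])
    half = 0.5 * kernel_row[peak]

    def crossing(step):
        i = peak
        # Walk away from the peak while the next level is still above half maximum.
        while 0 <= i + step < len(kernel_row) and kernel_row[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(kernel_row):
            return None
        frac = (kernel_row[i] - half) / (kernel_row[i] - kernel_row[j])
        return alt_km[i] + frac * (alt_km[j] - alt_km[i])

    lower, upper = crossing(-1), crossing(+1)
    if lower is None or upper is None:
        return None
    return upper - lower
```

For a triangular kernel sampled on a 1 km grid with values 0, 0.5, 1, 0.5, 0, this sketch returns a width of 2 km.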

The Odin satellite
Odin is a Swedish/Canadian/Finnish/French satellite (Murtagh et al., 2002) that was launched in February 2001. It was launched into a sun-synchronous orbit at an altitude of ∼ 600 km, with an ascending/descending node of 06:00/18:00 LT. Aboard the Odin satellite are two Earth observing instruments, OSIRIS (Llewellyn et al., 2004) and SMR (Frisk et al., 2003).

OSIRIS
The optical spectrograph of the OSIRIS instrument operates in the spectral range of 280-810 nm, with ∼ 1 nm spectral resolution, and observes Rayleigh and Mie scattered sunlight in the Earth's limb between altitudes of ∼ 7 and 110 km with a vertical field of view of approximately 1 km. The NO 2 retrievals, described by Haley and Brohede (2007), use the DOAS technique to calculate NO 2 slant columns. These are calculated in the spectral window of 435-451 nm and between altitudes of 10 and 46 km, with the OSIRIS 46-60 km averaged radiances as the reference spectrum. The slant columns are then inverted into density profiles using the optimal estimation technique (Rodgers, 2008), using LIMBTRAN (Griffioen and Oikarinen, 2000) for the forward model. The NO 2 retrievals have a vertical resolution of approximately 2 km at all altitudes.
Version 3 of the data set (used in this study) was validated by Brohede et al. (2007a), who found that OSIRIS NO 2 typically agrees with correlative satellite and balloon-borne data sets within 20 % between 25 and 35 km for all seasons and latitudes. Between 35 and 45 km, the agreement was within 30 %, with smaller absolute systematic differences for comparisons in the high latitudes than for those nearer the equator.

SMR
SMR observes thermal emission in the Earth's limb using four tunable receivers in the spectral range of 486-581 GHz and a millimetre-wave receiver near 119 GHz. The HNO 3 profile retrieval algorithm (Urban et al., 2005) uses observations in a 1 GHz band centred at 544.6 GHz and is based on the Newtonian Levenberg-Marquardt optimal estimation technique (Rodgers, 2008). The forward model used is that of the MOLIERE-5 forward/inversion model (Urban et al., 2004). HNO 3 is retrieved at altitudes above 18 km, with vertical resolutions on the order of 2-3 km. As discussed by Urban et al. (2009), the SMR HNO 3 data exhibit a ∼ 1-1.5 km vertical bias. Therefore, in this study the version 2.1 HNO 3 data were offset upwards by 1.5 km prior to any analysis. Urban et al. (2009) showed that the SMR HNO 3 climatology exhibits reasonably good agreement with UARS/MLS climatology from measurements taken between 1991 and 1998. Wolff et al. (2008) showed that SMR HNO 3 profiles exhibit a ∼ 20 % high bias with respect to ACE-FTS v2.2 HNO 3 at altitudes below 30 km, and exhibit systematic differences within ±20 % between 30 and 35 km.
Only profiles that had retrieval response values greater than 0.75 were used in the analysis. Due to a level 2 processing error that affected SMR data for May 2009 and onwards, only SMR data before May 2009 were used in this study.
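The SMR screening steps described above (altitude offset, retrieval response threshold, and date cut-off) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function and constant names are hypothetical, the response threshold is applied level by level (the text does not specify whether it is applied per level or per profile), and "before May 2009" is interpreted as strictly earlier than 1 May 2009.

```python
from datetime import date

# Values taken from the text; names are hypothetical.
SMR_ALTITUDE_OFFSET_KM = 1.5      # upward shift applied to v2.1 HNO3 profiles
MIN_RETRIEVAL_RESPONSE = 0.75     # levels at or below this response are rejected
PROCESSING_CUTOFF = date(2009, 5, 1)


def screen_smr_profile(obs_date, alt_km, hno3_vmr, response):
    """Return a list of (shifted altitude, VMR) pairs, or None if the profile is rejected."""
    if obs_date >= PROCESSING_CUTOFF:
        return None  # affected by the level 2 processing error
    return [
        (z + SMR_ALTITUDE_OFFSET_KM, x)
        for z, x, resp in zip(alt_km, hno3_vmr, response)
        if resp > MIN_RETRIEVAL_RESPONSE  # assumption: applied per altitude level
    ]
```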

SMILES on the Japanese Experiment Module (JEM) on the International Space Station (ISS)
The SMILES instrument (Kikuchi et al., 2010) was an atmospheric limb sounder that operated on ISS/JEM between October 2009 and April 2010. SMILES measured atmospheric thermal emissions in three bands within the spectral region of 624-650 GHz. The ISS orbits the Earth at an altitude of ∼ 375 km with an inclination of 52°. In order to observe northern high latitudes, the SMILES line of sight was angled 45° from the ISS orbital plane, giving SMILES a nominal latitudinal coverage of 38° S to 66° N. The angle of the line of sight was occasionally shifted to give a latitudinal coverage of 66° S to 38° N. SMILES scanned the Earth's limb between tangent heights of 10 and 60 km with a vertical resolution on the order of 3.5-4 km, and the local time coverage was such that it took 2 months to sample an entire diurnal cycle. The SMILES operational retrieval algorithm, detailed by Takahashi et al. (2010), makes use of the optimal estimation technique combined with the Levenberg-Marquardt method, with a forward model that accounts for instrument attributes, single-ray temperature brightness, and absorption cross sections from the JPL Spectral Line Catalogue (Pickett et al., 1998). The resulting HNO 3 data, derived from observations in the two spectral bands covering 624.32-625.52 (band A) and 649.12-650.32 (band C) GHz, have a typical vertical resolution on the order of 5-9 km.
No studies focusing specifically on SMILES-derived HNO 3 have previously been published, mainly because the line parameters used in the forward model are theoretical, rather than laboratory, values. This study uses version 2.4 of the level 2 SMILES data from the operational processor. Only level 2 SMILES data derived from band C measurements were used in the analysis, as the HNO 3 retrievals from band A have been found to typically converge to a priori values (Makato Suzuki, personal communication, 30 October 2015). Only data with corresponding precision values greater than 0 (indicating reasonable measurement response values) were used in the analysis.

Methodology
In this section, when discussing comparisons between the ACE-FTS data set and the correlative data sets from other instruments, the term INST will be used to refer in general to one of the other instruments' data sets. Prior to analysis, all profiles (from every data set) have been linearly interpolated onto the ACE-FTS 1 km grid. In cases where an ACE-FTS profile was coincident with multiple profiles within an INST data set, only the profile measured closest in time to the ACE-FTS occultation was used.
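The two preparation steps above, linear interpolation onto the ACE-FTS 1 km grid and selection of the INST profile closest in time, could be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, altitudes are assumed to be strictly increasing, and levels outside a profile's altitude range are returned as `None` rather than extrapolated (an assumption, since the text does not say how edges are handled).

```python
from bisect import bisect_right


def interp_to_grid(alt_km, values, grid_km):
    """Linearly interpolate a profile onto grid_km; None outside the profile's range."""
    out = []
    for h in grid_km:
        if h < alt_km[0] or h > alt_km[-1]:
            out.append(None)  # no extrapolation (assumption)
            continue
        j = min(bisect_right(alt_km, h), len(alt_km) - 1)
        i = j - 1
        frac = (h - alt_km[i]) / (alt_km[j] - alt_km[i])
        out.append(values[i] + frac * (values[j] - values[i]))
    return out


def closest_in_time(ace_time_h, inst_times_h):
    """Index of the INST profile measured closest in time to the ACE-FTS occultation."""
    return min(range(len(inst_times_h)),
               key=lambda k: abs(inst_times_h[k] - ace_time_h))
```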
In order to keep the level of vertical smoothing consistent between data sets, vertical resolution matching was carried out on coincident profiles where the INST vertical resolutions are finer than 3 km or coarser than 4 km (the range of the ACE-FTS vertical resolution). The profile with the finer vertical resolution, $X_f$, was smoothed by taking a weighted average of the profile at each altitude level. The weight used was a normalized Gaussian centred at the altitude level:

$$X_s(h) = \sum_z G(h, z)\, X_f(z),$$

where $X_s$ is the smoothed profile, $h$ is the altitude on the ACE-FTS 1 km grid, $z$ is altitude, and $G(h, z)$ is the normalized Gaussian distribution (centred at $h$, with $\sum_z G(h, z) = 1$) whose width is set by $v_s$. Here $v_s$ is the square root of the difference between the squared coarser vertical resolution, $v_c$, of profile $X_c$ and the squared vertical resolution, $v_f$, of $X_f$ (in order to avoid over-smoothing):

$$v_s = \sqrt{v_c^2 - v_f^2}.$$

This method of using a Gaussian as an approximation of an averaging kernel is used in place of applying the averaging kernels directly because averaging kernels are not always available for all data sets. In fact, the ACE-FTS data sets do not include corresponding averaging kernels. One drawback of this approach is that any distortion of the profiles due to asymmetric averaging kernels (especially for retrievals performed in log(VMR) space) remains unaccounted for. However, as discussed in the Appendix, vertical smoothing in the altitude regions where the ACE-FTS retrievals have been validated typically only affects average relative differences on the order of 1 % or less.
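The resolution matching described above can be illustrated with a short sketch. It is not the code used in this study: the function name is hypothetical, and the width of the Gaussian is interpreted here as a full width at half maximum, which is an assumption (the conversion to a standard deviation would be dropped if the width were instead meant as a standard deviation).

```python
import math


def smooth_finer_profile(alt_km, x_fine, res_fine_km, res_coarse_km):
    """Smooth the finer-resolution profile with a normalized Gaussian at each level."""
    if res_coarse_km <= res_fine_km:
        raise ValueError("coarse resolution must exceed fine resolution")
    # Width chosen so that the smoothed profile is not over-smoothed.
    v_s = math.sqrt(res_coarse_km ** 2 - res_fine_km ** 2)
    # Assumption: v_s is a FWHM, converted here to a standard deviation.
    sigma = v_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smoothed = []
    for h in alt_km:
        weights = [math.exp(-0.5 * ((z - h) / sigma) ** 2) for z in alt_km]
        norm = sum(weights)  # normalization so that the weights sum to 1
        smoothed.append(sum(w * x for w, x in zip(weights, x_fine)) / norm)
    return smoothed
```

Because the weights are normalized at every level, a vertically constant profile is returned unchanged, while narrow features are broadened and reduced in amplitude.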
For all of the species analyzed, three main diagnostics have been calculated at each altitude: correlation, mean relative difference, and standard deviation of relative differences. In all comparisons, differences are with respect to ACE-FTS v3.5 data. In the following definitions, $X$ will represent ACE-FTS values at a given height, and $Y$ will represent the corresponding INST values. The correlation coefficient, $r$, for comparisons between ACE-FTS and the other individual correlative data sets is determined at each height in the usual way:

$$r = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sigma_X\, \sigma_Y},$$

where $n$ is the number of co-located measurements and $\sigma$ refers to the standard deviation over the co-located measurements. The means of the relative differences are calculated at each altitude as the mean of the absolute differences (relative to ACE-FTS) divided by the mean of both the ACE-FTS and INST values:

$$\bar{\delta}_{\mathrm{rel}} = 100\,\% \times \frac{\frac{1}{n} \sum_{i=1}^{n} \left(X_i - Y_i\right)}{\frac{1}{2n} \sum_{i=1}^{n} \left(X_i + Y_i\right)}. \tag{5}$$

The overall mean is used as the denominator because the ACE-FTS retrievals, along with certain INST retrievals, allow for negative concentrations (which are included in the analysis so as to not bias the respective means); negative values can cause unrealistically large percent differences if the average of two compared values is near zero. The relative difference calculated as per Eq. (5) can also have unrealistically large values when the overall mean is near zero (if one of the ACE-FTS or INST averages is negative); however, this is much less likely than when using the standard calculation of the percent difference. Similarly, the standard deviation of the relative differences is calculated at each height as the standard deviation of the absolute differences (relative to ACE-FTS) divided by the overall mean of the ACE-FTS and INST values.

When comparing ACE-FTS data to multiple instruments it is desirable to calculate an overall average of each of the diagnostic values. A simple mean of the values is not useful, as it does not take into account the quality of the INST data sets used in the comparisons.
Therefore, a weighted average is calculated, using the inverse of the squared standard error of the relative means ($\sigma_s^{-2} = (\sigma/\sqrt{n})^{-2}$) as the weight. Using $\sigma_s$ assumes that all data sets exhibit similar natural variability. In certain regions, it is possible for comparisons to have unreasonable standard errors with data set values approximately equal to the a priori. Unfortunately, not all data sets include retrieval response and, therefore, at each height the weights are calculated as the INST inverse-squared standard error multiplied by the INST correlation coefficient, i.e.

$$w_{\mathrm{INST}} = r_{\mathrm{INST}}\, \sigma_{s,\mathrm{INST}}^{-2}.$$
For rare cases where there is anti-correlation between ACE-FTS and INST (r INST < 0), the weights are set to zero. These weights are used to calculate the weighted-average ACE-FTS correlation coefficients, mean differences, and standard deviations of the relative differences. All recommended status, quality, and convergence flags have been applied to all data sets where such flags have been made available (as described in Sect. 2).
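The diagnostics and the instrument weighting described in this section could be computed, at a single altitude level, along the lines of the sketch below. It is illustrative only: the function names are hypothetical, the difference is taken as ACE-FTS minus INST (consistent with the sign of the biases quoted later, but an assumption here), and sample standard deviations with n − 1 in the denominator are assumed.

```python
import math


def _mean(v):
    return sum(v) / len(v)


def _sd(v):
    m = _mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))


def diagnostics(x_ace, y_inst):
    """Return (r, mean relative difference in %, SD of relative differences in %, n)."""
    n = len(x_ace)
    mx, my = _mean(x_ace), _mean(y_inst)
    r = sum((x - mx) * (y - my) for x, y in zip(x_ace, y_inst)) / (
        (n - 1) * _sd(x_ace) * _sd(y_inst))
    diffs = [x - y for x, y in zip(x_ace, y_inst)]  # assumption: ACE-FTS minus INST
    overall_mean = 0.5 * (mx + my)                  # mean of all ACE-FTS and INST values
    return r, 100.0 * _mean(diffs) / overall_mean, 100.0 * _sd(diffs) / overall_mean, n


def weighted_average(values, sds, ns, rs):
    """Average one diagnostic over instruments with weights r / (sd / sqrt(n))**2."""
    weights = []
    for sd, n, r in zip(sds, ns, rs):
        standard_error_sq = sd ** 2 / n
        weights.append(r / standard_error_sq if r > 0 else 0.0)  # zero if anti-correlated
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * v for w, v in zip(weights, values)) / total
```

In this sketch an anti-correlated instrument contributes nothing to the average, so a two-instrument average in which one instrument has r < 0 reduces to the value from the other instrument.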

Diurnal scaling
For each pair of coincident profiles, the ACE-FTS profile was scaled to the local time of the other instrument's profile. This was done by using a photochemical box model in order to determine altitude-dependent diurnal-scale factors for each ACE-FTS NO y profile. Similar approaches have been used before in other studies, e.g. Bracher et al. (2005), Wetzel et al. (2007), Brohede et al. (2007a), and Wolff et al. (2008).
The University of California Irvine photochemical box model (Prather, 1997; McLinden et al., 2002), also known as Pratmo, simulates the diurnal cycle of nitrogen and chlorine species, including NO, NO 2 , HNO 3 , N 2 O 5 , and ClONO 2 . It was used by Brohede et al. (2007b) in producing NO 2 climatologies from OSIRIS measurements, by Kerzenmacher et al. (2008) in the validation of ACE-FTS v2.2 NO 2 and by Bauer et al. (2012) in the validation of SCIAMACHY v3.1 NO 2 . In the simulation of the diurnal cycle for an ACE-FTS profile, the model is constrained using the corresponding ACE-FTS temperature, pressure, and O 3 profiles. The model takes into account altitude, latitude, and day of year, using NO y and N 2 O climatologies from a 3-D chemical transport model (Olsen et al., 2001), Cl y and Br y climatologies (as described by Brohede et al., 2007a), and climatological SAGE II background aerosol data. All photochemical reaction rates were obtained from Sander et al. (2003). Updated reaction coefficients have more recently been suggested for Reactions (R6) and (R10) by Burkholder et al. (2015). Since HNO 3 does not have a significant diurnal variation, excluding the updated coefficients for Reaction (R6) is unlikely to affect the results of this study; however, excluding updates to the coefficients for Reaction (R10) may add uncertainty to the comparisons of N 2 O 5 that use diurnal scaling. Latitude- and longitude-dependent albedo values from the Medium Resolution Imaging Spectrometer (MERIS) 412 nm albedo climatology (Popp et al., 2011) were also used as input into the model. The MERIS albedo climatology data were obtained from http://www.temis.nl/surface/meris_bsa.html. Figure 1 shows the mean altitude-dependent diurnal variations for NO, NO 2 , HNO 3 , N 2 O 5 , and ClONO 2 , as calculated by Pratmo using all ACE-FTS v3.5 data. The variation values shown at a given altitude are the mean percent deviations from the mean concentration at that altitude.
The output of the Pratmo model for a given profile is the variation of the concentration of the given species on the given day of year (from midnight to midnight the next day). At each altitude, the diurnal-scale factor value, $s_{\mathrm{diurnal}}$, is calculated as

$$s_{\mathrm{diurnal}} = \frac{X_{\mathrm{mod}}(\mathrm{LT}_{\mathrm{INST}})}{X_{\mathrm{mod}}(\mathrm{LT}_{\mathrm{ACE}})},$$

where $X$ is the species concentration, LT is the local time, and the mod, ACE, and INST subscripts refer to the model, ACE-FTS, and the compared instrument values, respectively. The ACE-FTS concentration values can then be scaled to the compared instrument local time using

$$X_{\mathrm{ACE}}(\mathrm{LT}_{\mathrm{INST}}) = s_{\mathrm{diurnal}}\, X_{\mathrm{ACE}}(\mathrm{LT}_{\mathrm{ACE}}).$$

As discussed by Brohede et al. (2007b) and Kerzenmacher et al. (2008), for NO 2 , the uncertainties due to the diurnal-scale factor profiles are typically less than 20 % in the lower and upper stratosphere and typically less than 10 % in the middle stratosphere. Uncertainties are expected to be of the same order or less for the other NO y species. For a small fraction of ACE-FTS occultations, the photochemical model failed to produce results. Therefore, in the following section, comparisons between scaled and non-scaled results between ACE-FTS and each INST may not always contain exactly the same number of coincident pairs.
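The diurnal scaling at a single altitude can be illustrated with the sketch below. It does not reproduce Pratmo: the modelled diurnal cycle is simply assumed to be available as concentrations tabulated against local time from 0 to 24 h, the function names are hypothetical, and linear interpolation in local time between model output times is an assumption.

```python
from bisect import bisect_right


def model_value_at(model_lt_h, model_conc, lt_h):
    """Linearly interpolate the modelled diurnal cycle to local time lt_h (hours)."""
    j = min(bisect_right(model_lt_h, lt_h), len(model_lt_h) - 1)
    i = j - 1
    frac = (lt_h - model_lt_h[i]) / (model_lt_h[j] - model_lt_h[i])
    return model_conc[i] + frac * (model_conc[j] - model_conc[i])


def diurnal_scale_factor(model_lt_h, model_conc, lt_ace_h, lt_inst_h):
    """Ratio of the modelled concentration at the INST local time to that at the ACE-FTS local time."""
    return (model_value_at(model_lt_h, model_conc, lt_inst_h)
            / model_value_at(model_lt_h, model_conc, lt_ace_h))


def scale_to_inst_local_time(x_ace, s_diurnal):
    """Scale an ACE-FTS concentration to the local time of the compared instrument."""
    return s_diurnal * x_ace
```

For example, if the modelled concentration at the INST local time is half of that at the ACE-FTS local time, the scale factor is 0.5 and the ACE-FTS value is halved before the comparison.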

Direct comparisons of ACE-FTS versions 2.2 and 3.5
Direct comparisons between v3.5 and v2.2 of the ACE-FTS NO y species are shown in Fig. 2. From left to right in each panel, Fig. 2 shows the v3.5 and the v2.2 mean profiles, the correlation coefficient profiles, the mean of the relative differences (v3.5−v2.2 divided by the mean v2.2 profile), and the standard deviation of the relative differences. Figure 2a-e show results for NO, NO 2 , HNO 3 , N 2 O 5 , and ClONO 2 . For NO, it can be seen that up to 60 km the two versions are highly correlated, with a correlation coefficient of nearly 1 at most altitudes, dropping to 0.92 at the lowest altitude level. Between altitudes of 25 and 43 km, the relative differences are better than 2 % with standard deviations less than 10 %. At higher altitudes, up to 60 km, v3.5 NO concentrations are ∼ 5 % lower with standard deviations on the order of 30 %. This difference can be considered an improvement, as Kerzenmacher et al. (2008) showed that near 60 km ACE-FTS v2.2 NO had a positive bias on the order of 10-15 %. Below 22 km, the differences are much worse; however, this is in a region where the NO retrievals are often negative, and below 17 km the mean NO profile of both versions is negative and NO concentrations are over an order of magnitude smaller than above 22 km.
In the altitude region of 17-37 km, v2.2 and v3.5 NO 2 retrievals are very similar. The correlation coefficients are all near 1, relative differences are within 2 % and standard deviations are better than 5 %. From 37 to 47 km, v3.5 NO 2 reaches a maximum difference of −8 % with a standard deviation of 15 %. Above 37 km, where there is only a weak NO 2 signal, the standard deviations of the relative means and the correlation coefficients get worse, reaching 137 % and 0.7, respectively. Below 17 km, where NO 2 VMR values are significantly lower, v3.5 exhibits lower VMRs than v2.2, with differences reaching −12 %.
For HNO 3 , correlation coefficients are greater than 0.95 at altitudes of 10 km and higher. Between 10 and 23 km, v3.5 HNO 3 tends to exhibit differences between −1 and 5 % with standard deviations on the order of 4-14 %. Between 23 and 37 km, v3.5 HNO 3 exhibits 4-8 % higher VMRs with standard deviations of 4-13 %. Below 10 km, where v3.5 HNO 3 VMR values are lower, the comparison results get much worse with decreasing altitude and at 6 km the correlation coefficient is 0.42, the mean of the relative differences is −53 %, and the standard deviation of the relative differences is 130 %.
The v3.5 N 2 O 5 data exhibit a positive difference that is within 5 % between 22 and 37 km and within 15 % at all altitudes above 17 km. Above 20 km, correlation coefficients are better than 0.95 and the standard deviations of the relative means are between 15 and 44 %. Below 20 km, the comparison results get worse with decreasing altitude, as the N 2 O 5 concentration decreases.
ClONO 2 correlation coefficients are all greater than 0.95 in the altitude region of 15-29 km and greater than 0.8 between 13 and 32 km. Both the v3.5 and v2.2 ClONO 2 mean profiles peak between 26 and 27 km, but the v3.5 peak exhibits a positive difference of 1 ± 6 % and is vertically narrower, with v3.5 exhibiting lower VMRs with differences of 12 ± 23 % at 18 km and 11 ± 27 % at 33 km. The lower v3.5 VMRs above the peak would improve on the v2.2 high bias of ∼ 20 % reported by Wolff et al. (2008); however, the lower v3.5 VMRs below the peak would worsen the reported ±1 % bias.

Figure 2. From left to right the panels show the mean concentration profiles (red solid for v2.2, black solid for v3.5) with corresponding 1σ (red dashed for v2.2, black dashed for v3.5) in parts per billion volume (ppbv), correlation coefficient profiles, the mean of the percent differences (v3.5−v2.2 divided by the mean v2.2 profile), and standard deviation of the percent differences. Dashed lines in the correlation (at 0.8) and relative difference plots (at −10, 0, and 10 %) are provided for visual clarity.

Satellite instrument comparisons
Throughout the discussion of the results, when it is remarked that there are "better" comparison results, what is meant (unless explicitly stated otherwise) is that the correlation coefficients are higher while the standard deviations of the relative differences are lower. Conversely, by "worse" comparison results, it is meant that the correlation coefficients are lower and that standard deviations of the relative differences are higher. When discussing the coincidence criteria for each species, the "optimal" criteria are those that allow for a significant number of coincident profiles (minimum number of 10), but loosening the criteria would generally worsen the comparison results and tightening the criteria would not significantly affect the comparison results. When the "bias" and the "standard deviation" between two data sets are mentioned, unless stated otherwise, these refer to the mean of the relative differences and the standard deviation of the relative differences, respectively. In the following figures, plots of relative differences include error bars that represent the standard error of the mean of the relative differences (shown every 5 km; error bars that are less than ∼ 1 % may not be visible); thick solid black lines represent the weighted-average profiles for comparisons that have been diurnally scaled, and the thick dashed black lines represent the weighted-average profiles for comparisons that have not been scaled. Table 3 gives the maximum number of coincident profiles between ACE-FTS and the respective instruments using the optimized coincidence criteria. It should be noted that the number of coincidences is typically not constant in altitude due to screening of the data sets using metrics (e.g. retrieval response, quality flags) that are not always constant in altitude.
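A coincidence test of the kind referred to above, combining a spatial and a temporal criterion, could be sketched as follows. This is illustrative only: the function names and dictionary keys are hypothetical, a spherical Earth with a mean radius of 6371 km is assumed, and the haversine formula is used for the great-circle distance (the text does not state how distances were computed).

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_coincident(ace, inst, max_km, max_hours):
    """True if the two measurements satisfy both the spatial and temporal criteria."""
    dist = great_circle_km(ace["lat"], ace["lon"], inst["lat"], inst["lon"])
    dt = abs(ace["time_h"] - inst["time_h"])
    return dist <= max_km and dt <= max_hours
```

With this sketch, two measurements separated by one degree of latitude (about 111 km) and 2 h would be coincident under criteria of 500 km and 3 h, but not under criteria of 100 km and 3 h.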

Comparisons of NO
Note that only HALOE and MIPAS IMK-IAA are being compared with ACE-FTS (MIPAS ESA does not have an NO data product). Figure 3 shows the mean NO VMR profiles for coincident ACE-FTS and HALOE profiles and coincident ACE-FTS and MIPAS IMK-IAA profiles. Since ACE-FTS and HALOE are both solar occultation instruments and only overlapped between 2004 and 2005, there are not many coincident measurements. As such, the spatial coincidence criterion was kept somewhat lax, within 500 km, in order to ensure a statistically significant number of coincidences. It was found that a temporal coincidence criterion of within 3 h also led to a statistically significant number of coincidences (47 profiles). Comparisons with a less stringent criterion (greater than 3 h) led to a larger number of coincidences but significantly reduced the correlation and increased the standard deviation of the relative differences between the data sets. Using a tighter spatial criterion, e.g. within 350 km, also yields a significant number of coincidences but does not significantly improve the comparison results. At all altitudes, with any temporal coincidence criterion, it was found that using the diurnal scaling factors did not greatly improve the HALOE comparison results. This is likely due to the fact that both ACE-FTS and HALOE are solar occultation instruments, and hence measurements at a common geographic location do not differ greatly in local time. Figure 4a shows the ACE-FTS and HALOE NO comparison results, with and without diurnal scaling. The two data sets are only strongly correlated in the altitude region of approximately 25-55 km. In this region, the relative difference shows that ACE-FTS NO tends to exhibit a low bias of less than 10 %, with standard deviations on the order of 10 % with respect to HALOE. Of the 47 coincident profiles 41 are local sunset occultations, and the remaining 6 are local sunrise occultations. 
Due to the lack of sunrise measurements, it was not possible to determine whether or not there is a significant bias between the sunrise and sunset (or similarly local morning and local evening) NO profiles. It was found that the temporal and spatial coincidence criteria that optimized the comparison results for diurnally scaled ACE-FTS and MIPAS IMK-IAA NO profiles were within 3 h and within 100 km. Similar to the HALOE comparisons, using the diurnal scaling factors did not greatly improve the comparison results at most altitudes. However, for temporal differences larger than 3 h, using the diurnal scaling factors worsened comparison results at all altitudes. Figure 4b shows the ACE-FTS and MIPAS IMK-IAA NO comparison results with and without diurnal scaling, using coincidence criteria of within 3 h and 100 km. Throughout the middle stratosphere, the diurnal scaling generally increased the correlation coefficients by ∼ 0.05 and lowered the standard deviations by ∼ 3 %. Relatively strong correlation is seen above 25 km, where ACE-FTS exhibits a negative bias of between −10 and −22 % between 25 and 35 km and an approximate −5 % bias between 40 and 50 km. The lowest standard deviations are observed in the 30-50 km region, on the order of 35-50 %. The higher standard deviations (relative to comparisons with HALOE, Fig. 4a) reflect the higher variance within the MIPAS IMK-IAA NO data set. Below 25 km, the relative differences get more negative with decreasing altitude, becoming more negative than −100 % below 21 km. An ACE-FTS NO low bias with respect to non-solar occultation instruments, on the order of ∼ 10-40 %, is expected in this region due to not accounting for diurnal variations of NO along the line of sight (Brohede et al., 2007a). Figure 5a shows NO comparison results for data separated by local time using all available MIPAS IMK-IAA data. It can be seen that there is an apparent significant local time bias in the ACE-FTS−MIPAS IMK-IAA comparison results.
Between 19 and 52 km, the correlation coefficients are better for local evening (PM) comparisons than for local morning (AM) comparisons by up to 0.4, and at all altitudes the evening comparisons exhibit lower standard deviations by ∼ 15-50 %. This leads to improved relative differences when only using the evening data in the 25-34 and 52-60 km ranges. However, due to the orbital geometries and the MIPAS IMK-IAA retrieval sensitivity to NO, the only coincident PM data are during November-January in the Southern Hemisphere (SH) and May-July in the Northern Hemisphere (NH), hereafter referred to as "summer" months. Figure 5b shows NO comparison results between ACE-FTS and MIPAS IMK-IAA for data separated by local time and using only the summer months (both NH and SH). It can be seen that correcting for this seasonal bias greatly improves the AM comparison results, as there is less NO variation in the polar summer regions than in the winter. At most altitudes the summer PM comparisons still tend to exhibit better correlation than the AM, but the summer AM and PM standard deviation profiles are rather similar, with values of ∼ 100 % near 18 km, then decreasing with altitude to ∼ 15-20 % near 45 km, and from there increasing with altitude. Between 22 and 52 km, the summer AM and PM relative difference profiles are also quite similar. ACE-FTS exhibits a negative bias with respect to MIPAS IMK-IAA of approximately −100 to −10 % between 22 and 27 km. Above 27 km, up to ∼ 50 km, ACE-FTS NO is typically systematically lower than MIPAS IMK-IAA by 0-10 %. Above 52 km, the summer PM results (correlation coefficients and standard deviations) are typically better than the AM; the PM relative differences are between 0 and +7 %, and the AM relative differences decrease with altitude from 0 to −32 % between 53 and 60 km.

Comparisons of NO 2
From Fig. 6 it is apparent that in the comparisons with all other NO 2 data sets, the diurnally scaled ACE-FTS profiles have a low bias near the NO 2 peak, ∼ 33 km. It can be seen in Fig. 7a that using coincidence criteria of within 350 km and within 4 h without any diurnal scaling leads to relatively poor agreement between ACE-FTS and many instruments in the middle to upper stratosphere. Near the NO 2 peak, without diurnal scaling mean relative differences between ACE-FTS and INST data range from −38 to +2 %, with standard deviations that reach up to ∼ 50 %. With diurnal scaling (Fig. 7b)  with weighted-average standard deviations within 18-43 %. Within these altitudes, most comparisons typically yielded correlation coefficients that were greater than 0.8, the exception being GOMOS which measures at nighttime. The weighted-average correlation coefficients are better than 0.8 between 15 and 40 km and better than 0.9 between 17 and 35 km. Below 25 km, an ACE-FTS NO 2 positive bias is expected with respect to instruments that do not use the solar occultation viewing geometry due to not accounting for diurnal variations in NO 2 along the line of sight in the forward model. In solar occultation viewing geometry, not accounting for this diurnal effect is expected to lead to a ∼ 10-40 % positive bias (Brohede et al., 2007a). It can be seen from Fig. 7b that, below 25 km, ACE-FTS does have a positive bias on the order of 5-40 % with respect to MIPAS IMK-IAA, OSIRIS, and SCIAMACHY. As well, below 22 km, ACE-FTS exhibits a positive bias with respect to HALOE, which is a solar occultation instrument but accounts for the diurnal effect in the NO 2 retrieval algorithm.
Diurnal scaling has less of an effect on comparisons with the solar occultation instruments (HALOE, POAM III, SAGE III) than on those with other viewing geometries, as there is less of a difference in measurement local times, and diurnal scaling has no effect on the ACE-FTS comparisons with MAESTRO as measurements are co-located (although they do have differing vertical and horizontal resolutions). In order to determine biases in the comparisons due to local time or hemispheric coverage, comparisons were made in the 20-40 km region where the NO 2 peak is well sampled and the majority of instruments have sufficient coverage. For local time differences, GOMOS data have been excluded, as they only contain local evening data, and the solar occultation instruments have been excluded as they tend to only have a significant number of coincidences in either local morning or local evening. For hemispheric differences only HALOE data were excluded, as the vast majority of HALOE data are from the NH.
As can be seen in Fig. 8, at all altitudes within the 20-40 km range, the weighted-average results are generally better for the evening comparisons than for the morning comparisons. The weighted-average standard deviations are better by up to 18 % and the correlation coefficients are better by ∼ 0.05 in the evening comparisons. The evening average relative differences are ∼ 0 % near 20 km, reach −11 % near 35 km, and 6 % near 40 km. For the morning results, by contrast, average relative differences are more negative above 35 km (reaching −13 %) and more positive below 30 km (up to +40 %). The better evening results are likely due to differences in the diurnal variation along the line of sight between sunrise and sunset observations. For sunrise (local morning) observations, ACE-FTS samples a region of the atmosphere that has yet to be sunlit long enough for NO 2 to be in equilibrium. For sunset (local evening), however, the entire sampled area should be relatively stable. As can be seen in Fig. 9, there were no major differences in the weighted-average results between the NH comparisons and the SH comparisons. The only significant difference in the weighted-average relative differences is below 25 km, with the SH exhibiting larger values by up to 7 %.

Figure 8. Comparisons of ACE-FTS NO 2 profiles with correlative data sets using coincidence criteria of within 4 h and 350 km: (a) comparisons for local morning and (b) local evening. Note that GOMOS and the solar occultation instruments have been excluded. Error bars in the relative difference profiles represent the standard error of the mean (values less than ∼ 1 % may not be visible).

Comparisons of HNO 3
Due to the relatively weak diurnal variation of HNO 3 in the stratosphere, using the photochemical box model did not improve the HNO 3 comparison results at any altitude level. In addition, a lax temporal coincidence criterion of within 6 h was used, as tightening the criterion did not significantly im-  Figure 9. Comparisons of ACE-FTS NO 2 profiles with correlative data sets using coincidence criteria of within 4 h and 350 km: (a) northern hemispheric data and (b) southern hemispheric data. Note that HALOE has been excluded. Error bars in the relative difference profiles represent the standard error of the mean (values less than ∼ 1 % may not be visible). prove comparison results. As such, it was possible to use a spatial coincidence criterion of within 100 km, which optimized the comparison results. Figure 10 shows the mean coincident ACE-FTS and INST HNO 3 profiles along with the 1σ measurement variation. There is typically good agreement between ACE-FTS and the other instruments, and HNO 3 comparison results are shown in Fig. 11. Near the HNO 3 peak, ∼ 20-25 km, there is excellent agreement, with weighted-average relative differences within −1 %, correlation coefficients of ∼ 0.97, and standard deviations of ∼ 8 %.
The weighted-average correlation coefficients are greater than 0.5 for altitudes of 7-40 km and greater than 0.9 for altitudes of 12-31 km. Between 9 and 38 km the weighted-average standard deviations are below 50 %, reaching a minimum of 7 % near 24 km. The weighted-average relative differences are within ±6 % between 9 and 29 km. Above 30 km, the average relative differences increase with altitude to 37 % at 40 km; however, at that altitude only the MIPAS IMK-IAA comparisons exhibit standard deviations below 50 %, and the ACE-FTS−MIPAS IMK-IAA relative difference at 40 km is on the order of 20 %. Below 30 km, the ACE-FTS−MIPAS ESA relative differences are within ±10 %, and the ACE-FTS−MLS differences are typically on the order of −5 to 10 %. Between 20 and 38 km ACE-FTS typically exhibits a high bias with respect to SMILES, which is on the order of 1 % near 20 km and increases to 55 % near 34 km. With respect to SMR, ACE-FTS exhibits a negative bias on the order of −9 to −13 % between 25 and 30 km. This is an improvement from the ∼ 20 % low bias exhibited in the ACE-FTS v2.2 and SMR v2.0 comparisons reported by Wolff et al. (2008). Mean relative differences between ACE-FTS and SMR are also within ±10 % between 30 and 35 km.

Figure 11. Same as top panel of Fig. 7 (comparisons with no diurnal scaling) except for HNO3, with coincidence criteria of within 6 h and 100 km. Error bars in the relative difference profiles represent the standard error of the mean (values less than ∼ 1 % may not be visible).

There were no major local time biases found in the HNO3 comparisons. In the altitude range 12-28 km, weighted-average mean relative differences were within ±4 % for the local morning comparisons, whereas local evening comparisons yielded weighted-average mean relative differences within ±7 % (not shown). Between 16 and 38 km, there was no significant hemispheric bias found in the HNO3 comparisons, with SMILES data excluded (due to asymmetric hemispheric coverage).
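The quantities quoted throughout these comparisons (mean relative difference, standard deviation of the relative differences, standard error of the mean, correlation coefficient, and an average weighted across instruments) can be sketched as follows for a single altitude level. This is only an illustration under stated assumptions: the relative difference is taken here as 100 × (ACE − INST) / INST for each coincident pair, and the weights passed to `weighted_average` are left to the caller, since neither the exact normalization nor the weighting scheme is specified in this section.

```python
import math

def comparison_stats(ace, inst):
    """Statistics at one altitude level for paired VMR values.

    ace, inst: equal-length sequences of coincident VMRs (at least two pairs).
    Relative differences are in percent, assuming 100 * (ACE - INST) / INST.
    """
    n = len(ace)
    rel = [100.0 * (a - b) / b for a, b in zip(ace, inst)]
    mean_rel = sum(rel) / n
    # Sample standard deviation of the relative differences
    std_rel = math.sqrt(sum((r - mean_rel) ** 2 for r in rel) / (n - 1))
    # Standard error of the mean (the error bars described in the captions)
    sem = std_rel / math.sqrt(n)
    # Pearson correlation coefficient between the two data sets
    mean_a, mean_b = sum(ace) / n, sum(inst) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ace, inst))
    var_a = sum((a - mean_a) ** 2 for a in ace)
    var_b = sum((b - mean_b) ** 2 for b in inst)
    corr = cov / math.sqrt(var_a * var_b)
    return {"mean_rel": mean_rel, "std_rel": std_rel, "sem": sem, "corr": corr, "n": n}

def weighted_average(values, weights):
    """Weighted mean of per-instrument values (weights are an assumption)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

One possible choice, again an assumption, would be to weight each instrument's result by its number of coincident profiles, so that sparsely sampled comparisons do not dominate the average.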

Comparisons of N2O5
Before showing the N2O5 validation results, it should be noted that a significant difference was found between local morning and local evening MIPAS (both ESA and IMK-IAA) N2O5 comparisons with ACE-FTS: the evening comparisons exhibited much worse agreement than the morning comparisons. Figure 12a shows results for comparisons between local evening diurnally scaled ACE-FTS and MIPAS profiles using coincidence criteria of within 3 h and within 100 km. Near 20-25 km, the relative differences are on the order of ±10 % with standard deviations of ∼ 50-80 % and correlation coefficients of ∼ 0.65-0.75. Outside of this region, however, the agreement is poorer, with weak correlation, standard deviations greater than 100 %, and relative differences beyond ±100 %. In order to highlight that this poor agreement is not an issue with differences due to diurnal variation, Fig. 12b shows comparisons using non-scaled ACE-FTS profiles and with a much tighter temporal coincidence of within 20 min (and within 200 km). In comparing to both MIPAS data products in this case, there are large systematic differences from ACE-FTS. The MIPAS ESA differences range from approximately −60 to 200 % and the MIPAS IMK-IAA differences range from approximately −130 to 200 %. The poor agreement in the evening is mostly due to the low signal-to-noise ratio in the ACE-FTS measurements, which results from the lower N2O5 concentrations at sunset than at sunrise.

Figures 13 and 14 show the results of the morning comparisons. At coincidence criteria of within 3 h and 100 km, with diurnal scaling, ACE-FTS and MIPAS tend to agree best in the altitude range of 22-34 km. In this region, weighted-average correlation coefficients are better than 0.8, weighted-average standard deviations are between 16 and 40 %, and weighted-average mean relative differences are typically better than −7 %.
Above 34 km, ACE-FTS exhibits a positive bias that is within 10 % up to 38 km and increases with altitude, up to 33 % at 43 km. This positive bias in the upper altitudes is not reduced when tighter temporal coincidence criteria are chosen (down to within 20 min) and exists both with and without diurnal scaling. Also shown in Fig. 14 are the weighted-average comparison results for non-scaled ACE-FTS profiles. It can be seen that using the photochemical box model does improve the comparison results, especially in the 23-38 km region, where it leads to an improvement to the average standard deviations on the order of 5 %. Diurnal scaling also reduces the positive bias above 33 km by up to 16 %.
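The diurnal scaling referred to above can be sketched as a ratio of modelled values at two local times. This is a minimal illustration, not the study's implementation: the photochemical box model itself is not reproduced, the modelled diurnal cycle is assumed to be available as a hypothetical list of `(local_time_h, vmr)` pairs at one altitude, and linear interpolation between model output times is an assumption.

```python
def interp_cycle(cycle, t):
    """Linearly interpolate a modelled VMR at local time t (hours).

    cycle: list of (local_time_h, vmr) pairs sorted by time and covering t.
    """
    for (t0, v0), (t1, v1) in zip(cycle, cycle[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("local time outside modelled cycle")

def diurnal_scale(ace_vmr, cycle, t_ace, t_inst):
    """Scale an ACE-FTS VMR from its local time to the correlative local time.

    The scale factor is the ratio of the modelled VMR at the correlative
    instrument's local time to the modelled VMR at the ACE-FTS local time.
    """
    factor = interp_cycle(cycle, t_inst) / interp_cycle(cycle, t_ace)
    return ace_vmr * factor
```

With a toy cycle such as `[(6.0, 2.0), (8.0, 1.0), (10.0, 0.5)]` (made-up values), a species whose modelled abundance halves between 06:00 and 08:00 local time would have its sunrise ACE-FTS value halved before comparison with an 08:00 measurement.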
Although there was very poor agreement between local evening ACE-FTS and local evening MIPAS N2O5 profiles, comparisons between diurnally scaled morning ACE-FTS and evening MIPAS N2O5 profiles yield much better agreement. This indicates that the poor agreement seen in the evening data is most likely due to the high level of noise in the evening ACE-FTS N2O5 data and is unlikely to be an issue with the MIPAS data. Figure 15 shows the weighted-average results (scaled) from Fig. 14, along with comparison results between diurnally scaled morning ACE-FTS and evening MIPAS (both ESA and IMK-IAA) profiles using coincidence criteria of within 12 h and within 100 km. Between 22 and 37 km, the morning/evening weighted-average correlation coefficients are greater than 0.8 and the standard deviations are less than 50 %. In this altitude range, the weighted-average relative differences are better than 10 %.

Comparisons of ClONO2

Figures 16 and 17 show the results of the ACE-FTS and MIPAS (both ESA and IMK-IAA) ClONO2 comparisons. Comparisons were found to be optimized at coincidence criteria of within 4 h and 100 km. With diurnal scaling, ACE-FTS and MIPAS tend to agree best in the altitude range of 17-34 km. In this region, weighted-average correlation coefficients are better than 0.7, weighted-average standard deviations are between 13 and 32 %, and weighted-average mean relative differences tend to exhibit a negative bias within −1 and −10 %, except at the lower altitudes where the low bias reaches −20 % near 17 km. Also shown in Fig. 17 are the weighted-average comparison results for non-scaled ACE-FTS profiles. It can be seen that using the photochemical box model does improve the comparison results, especially above 26 km, where it leads to an improvement to the average correlation coefficients by up to 0.15 and to the average standard deviations by up to 4 %.

Similar to the case for the MIPAS IMK-IAA NO comparisons in Sect. 4.2.1, separating the coincident MIPAS ClONO2 data into morning and evening subsets seasonally biases the data. Due to the orbital geometries and the MIPAS retrievals' sensitivity to ClONO2, there is typically only coincident evening data between February and April in the NH and between August and October in the SH (henceforth referred to as "spring" months). In examining the differences between spring morning and evening comparison results, shown in Fig. 18, between 17 and 36 km there are no major differences in the weighted-average relative difference profiles. In the 13-23 km region, where the comparison results are more consistent for the evening results, both the morning and evening results tend to exhibit a −10 % bias. Above 25 km, the morning comparisons exhibit an increase in correlation coefficient values and a decrease in standard deviations of the relative differences.

Table 4. Summary of validated ACE-FTS NOy systematic differences for two different cases. Case 1: region where the weighted-average correlation coefficient profile is greater than 0.5 and the weighted-average standard deviation of the relative differences profile is less than 100 %. Case 2: region where the weighted-average correlation coefficient profile is greater than 0.8 and the weighted-average standard deviation of the relative differences profile is less than 50 %. Results are for comparisons using all data and the species-dependent optimized coincidence criteria (given in the text).

Table 4 summarizes the average systematic differences between ACE-FTS and the data sets for all other instruments in the regions where the ACE-FTS data have been validated and where there is typically a strong correlation and reasonable standard deviations.
The column outlining the systematic differences where average correlation coefficients are better than 0.8 and average standard deviations are typically below 50 % could also be used to determine recommended altitude limits for the different ACE-FTS data sets (with the exception of NO, which was only examined below 60 km, the top altitude of the photochemical model).
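The two cases defined for Table 4 amount to thresholding the weighted-average correlation and standard deviation profiles. The sketch below shows one way such altitude limits could be extracted. The function name and inputs are hypothetical, and returning the longest contiguous run of qualifying levels is an assumption rather than the procedure documented in the paper.

```python
def valid_altitude_range(alts, corr, std, min_corr=0.8, max_std=50.0):
    """Return (lowest, highest) altitude of the longest contiguous run of
    levels with corr > min_corr and std < max_std, or None if no level
    qualifies.

    alts, corr, std: equal-length sequences ordered by altitude; std in percent.
    Defaults correspond to Case 2; use min_corr=0.5, max_std=100.0 for Case 1.
    """
    best = None
    start = None
    last = len(alts) - 1
    for i, (r, s) in enumerate(zip(corr, std)):
        ok = r > min_corr and s < max_std
        if ok and start is None:
            start = i  # a new qualifying run begins here
        if start is not None and (not ok or i == last):
            end = i if ok else i - 1  # run ends at this or the previous level
            if best is None or end - start > best[1] - best[0]:
                best = (start, end)
            start = None
    return None if best is None else (alts[best[0]], alts[best[1]])
```

Applied per species to the weighted-average profiles, the Case 2 call would give candidate recommended altitude limits of the kind discussed above.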

In general there is good agreement between ACE-FTS and HALOE NO, but as mentioned above, the diurnal scaling factors did not help improve the comparison results. Comparisons indicated that ACE-FTS has a negative bias on the order of 0 to −10 % in the altitude region of 28-48 km. This is a slight improvement on the ACE-FTS v2.2 NO profiles, which Kerzenmacher et al. (2008) found to have a ∼ 8 % bias with HALOE in this region.
ACE-FTS and MIPAS IMK-IAA comparisons suggest that ACE-FTS NO has a negative bias at all altitudes below 60 km, and between 40 and 60 km this bias is approximately −5 %. Below 25 km, the bias becomes more negative with decreasing altitude from −15 % to beyond −100 %, and 10-40 % of this bias is expected to be due to diurnal variations along the ACE-FTS line of sight. Comparisons using only summer data yield similar results. Both summer morning and summer evening comparisons yield negative relative differences at all altitudes, with values more negative than 50 % below ∼ 23 km and above ∼ 50 km, and within −10 and 0 % in the 32-50 km region.
ACE-FTS v3.5 NO2 profiles have a clear systematic negative bias with respect to all other instruments at and around the NO2 peak, ∼ 32 km. With diurnal scaling, this negative bias near the peak is ∼ −10 % for evening comparisons (which typically yield better results than morning comparisons) and ∼ −12 % for morning comparisons. This bias is likely in part due to errors in the characterization of the ACE-FTS instrumental line shape in v3.5 (Boone et al., 2013), but the complete source of this bias is the subject of ongoing investigations. Better evening comparison results than morning results are likely attributable to sunrise observations sampling a region of the atmosphere where NO2 concentrations are not yet in daytime equilibrium. Below 25 km, ACE-FTS tends to exhibit a 5-40 % positive bias with respect to non-solar-occultation instruments and HALOE. This bias is expected due to diurnal variation of NO2 along the ACE-FTS line of sight that is not accounted for in the forward model. No major differences were found between NH comparisons and SH comparisons, but below 25 km the average relative differences were on the order of 8 % in the SH, and on the order of 15 % in the NH. These results are an improvement over the findings of Kerzenmacher et al. (2008), who found that ACE-FTS v2.2 NO2 had a ∼ 15 % low bias near the peak and between 20 and 40 km agreed with correlative data sets to within 40 %.
HNO3 comparisons near 35 km show that ACE-FTS has a positive bias that on average is ∼ 20 %. Within the 8-30 km range ACE-FTS and correlative data sets on average are within ±7 %, and around the HNO3 peak (∼ 20-26 km) on average ACE-FTS is within ±1 % of the other measurements. These results suggest an improvement from ACE-FTS v2.2 comparisons by Wolff et al. (2008), who found that ACE-FTS was typically within ±20 % of correlative satellite data sets. No major biases in the HNO3 comparisons were found due to measurement local time or hemispheric coverage.

Above 35 km, morning ACE-FTS N2O5 has a positive bias with respect to MIPAS ESA and IMK-IAA, which reaches 33 % near 42 km. This bias is not an artifact of diurnal mismatch as it still exists when comparing profiles using a temporal coincidence criterion on the order of 20 min (not shown). At these higher altitudes, where the VMR is decreasing with altitude, it is difficult to accurately derive N2O5 concentrations given the broad, unstructured N2O5 absorption spectrum. Between 22 and 35 km, ACE-FTS tends to exhibit a negative bias, on average better than −7 %.
Evening ACE-FTS N2O5 profiles show very poor agreement with evening MIPAS measurements regardless of diurnal scaling, coincidence criteria, and hemisphere. The coincident ACE-FTS measurements are always evening sunset measurements, made when N2O5 is at its least abundant (roughly an order of magnitude less than morning concentrations) and therefore when the ACE-FTS N2O5 retrievals suffer from the weakest absorption signals for the molecule. The evening MIPAS retrievals are most likely not equally affected by the low abundance of N2O5, as they compare reasonably well with morning ACE-FTS profiles that have been diurnally scaled to match the MIPAS local times. Further investigation into the poorer quality of the ACE-FTS evening N2O5 data is needed.
In the 14-35 km region ACE-FTS ClONO2 exhibits a negative bias with respect to the MIPAS data sets. From 14 to 24 km, the ACE-FTS bias is on average better than −20 %, and in the 21-35 km region better than −8 %. Differences in morning and evening ACE-FTS−MIPAS comparison results are examined for the spring months. Major differences are only exhibited above 25 km, where the comparison results are typically better for the morning results. In the 25-33 km range, spring morning relative differences on average are −3 % and the spring evening relative differences on average are +2 %. Below ∼ 25 km, these results are slightly worse than those of Wolff et al. (2008), who found that below ∼ 25 km ACE-FTS v2.2 ClONO2 data were typically within 1 % of MIPAS IMK-IAA data. At higher altitudes, however, ACE-FTS v2.2 exhibited a positive bias of up to 20 % near 33 km, and therefore above the VMR peak the v3.5 ClONO2 has improved.

Data availability
The ACE-FTS Level 2 data used in this study can be obtained via the ACE-FTS website (registration required), http://www.ace.uwaterloo.ca (ACE-FTS, 2016), or upon request from the corresponding author (kaley.walker@utoronto.ca). The GOMOS data can be obtained via https://earth.esa.int/web/guest/data-access (registration required) (ESA, 2016a). The HALOE data can be obtained via http://haloe.gats-inc.com/download/index.php (HALOE, 2016). The MIPAS ESA data can be obtained via https://earth.esa.int/web/guest/data-access (registration required) (ESA, 2016b). The MIPAS IMK-IAA data can be obtained via https://www.imk-asf.kit.edu/english/308.php (registration required) (KIT, 2016). The MLS data are publicly available via http://disc.sci.gsfc.nasa.gov/Aura/data-holdings/MLS/index.shtml (registration required) (GES DISC, 2016). The OSIRIS data can be obtained via http://odin-osiris.usask.ca (registration required) (University of Saskatchewan, 2016). The POAM III data can be obtained via https://eosweb.larc.nasa.gov/project/poam3/poam3_table (registration required) (NASA, 2016a). The SAGE III data can be obtained via https://eosweb.larc.nasa.gov/project/sage3/sage3_table (registration required) (NASA, 2016b). The SCIAMACHY data can be obtained via http://www.iup.uni-bremen.de/scia-arc/ (registration required) (IUP, 2016). The SMILES data can be obtained via https://www.darts.isas.jaxa.jp/iss/smiles/ (DARTS, 2016). The SMR data can be obtained via http://odin.rss.chalmers.se (registration required) (Odin/SMR, 2016).

Figure A1. Weighted-average relative difference profiles for vertically smoothed (solid and dashed) and non-smoothed (dot-dashed) NO2, HNO3, and ClONO2 data. The only profiles that do not include diurnal scaling are those for HNO3. Horizontal dotted lines indicate altitude limits within which the ACE-FTS comparisons yield average correlation coefficients greater than 0.8 and average standard deviations below 50 %.
Figure A1 shows that, away from the upper and lower altitude limits, where retrieval errors are typically largest, the vertical smoothing has little to no effect on the weighted-average relative differences. At altitudes where the weighted-average correlation coefficients are greater than 0.8 and the weighted-average standard deviations are less than 50 % (altitude limits indicated by horizontal dotted lines in Fig. A1), the largest effect on the relative differences is in the NO2 comparisons near 17 km. In this region the difference between the smoothed and non-smoothed relative difference is less than 9 %, and this difference is mainly due to coarser vertical resolution values for SCIAMACHY, and to a lesser extent MIPAS IMK-IAA, retrievals in this region. Otherwise, within the altitude limits mentioned above, differences between smoothed and non-smoothed relative differences are typically less than 1 %, as the vertical resolutions of most of the retrievals are on the same order.
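The effect of vertical smoothing discussed above can be explored with a simple sketch. The smoothing operator actually used for Fig. A1 is not described in this section, so the Gaussian weighting below, with a full width at half maximum set to the coarser instrument's vertical resolution, is purely an assumption, as are the function and parameter names.

```python
import math

def smooth_profile(alts, vmr, fwhm_km):
    """Gaussian-weighted smoothing of a profile onto its own altitude grid.

    alts: altitudes in km; vmr: values on that grid;
    fwhm_km: assumed vertical resolution of the coarser instrument.
    """
    # Convert full width at half maximum to a Gaussian standard deviation
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smoothed = []
    for z0 in alts:
        weights = [math.exp(-0.5 * ((z - z0) / sigma) ** 2) for z in alts]
        # Normalize by the weight sum so a constant profile is unchanged
        smoothed.append(sum(w * v for w, v in zip(weights, vmr)) / sum(weights))
    return smoothed
```

A smooth profile is nearly unchanged by such an operator, whereas sharp vertical structure is damped, which is consistent with the smoothing mattering most where the vertical resolutions of the instruments differ.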