Accuracy, precision, and temperature dependence of Pandora total ozone measurements estimated from a comparison with the Brewer triad in Toronto

This study evaluates the performance of the recently developed Pandora spectrometer by comparing it with the Brewer reference triad. This triad was established by Environment and Climate Change Canada (ECCC) in the 1980s and is used to calibrate Brewer instruments around the world, ensuring high-quality total column ozone (TCO) measurements. To reduce stray light, the double Brewer instrument was introduced in 1992, and a new reference triad of double Brewers is also operational at Toronto. Since 2013, ECCC has deployed two Pandora spectrometers co-located with the old and new Brewer triads, making it possible to study the performance of three generations of ozone-monitoring instruments. The statistical analysis of TCO records from these instruments indicates that the random uncertainty for the Brewer is below 0.6 %, while that for the Pandora is below 0.4 %. However, there is a 1 % seasonal difference and a 3 % bias between the standard Pandora and Brewer TCO data, which is related to the temperature dependence and difference in ozone cross sections. A statistical model was developed to remove this seasonal difference and bias. It was based on daily temperature profiles from the European Centre for Medium-Range Weather Forecasts ERA-Interim data over Toronto and TCO from the Brewer reference triads. When the statistical model was used to correct Pandora data, the seasonal difference was reduced to 0.25 % and the bias was reduced to 0.04 %. Pandora instruments were also found to have low air mass dependence up to 81.6° solar zenith angle, comparable to double Brewer instruments.


Introduction
Routine total column ozone (TCO) measurements started in the 1920s with the Dobson instrument (Dobson, 1968). During the International Geophysical Year, 1957, the worldwide Dobson ozone-monitoring network was formed. Stratospheric ozone has been an important scientific topic since the 1970s and became a matter of intense interest with the discovery and subsequent studies of the Antarctic ozone hole (Farman et al., 1985; Solomon et al., 1986; Stolarski et al., 1986) and depletion on the global scale (Stolarski et al., 1991; Ramaswamy et al., 1992). To improve the accuracy and to automate the TCO measurements, the Brewer spectrophotometer was developed in the early 1980s (Kerr et al., 1980, 1988). In 1988, the Brewer was designated (in addition to the Dobson) as the World Meteorological Organization (WMO) Global Atmosphere Watch (GAW) standard for total column ozone measurement. By 2014, there were more than 220 Brewer instruments installed around the world, with most in operation today. To maintain the measurement stability and characterize each individual Brewer, field instruments need to be regularly calibrated against the travelling standard reference instrument. The travelling standard itself is calibrated against the set of three Brewer instruments (serial numbers 8, 14, and 15) operated by Environment and Climate Change Canada (ECCC), located in Toronto, and known as the Brewer reference triad (BrT) (Fioletov et al., 2005). Due to the well-known stray-light issue in the UV region (Bais et al., 1996; Fioletov et al., 2000), the MkIII Brewer (double Brewer) was introduced in 1992. The double Brewer has two spectrometers in series, significantly improving UV response and measuring global UV spectral irradiance, O₃, SO₂, and aerosol optical depth.
The double Brewer instruments also have a set of three instruments (serial numbers 145, 187, and 191) co-located with BrT to form the Brewer reference triad double (BrT-D). Individual Brewer instruments of the BrT and BrT-D are independently calibrated at Mauna Loa, Hawaii, every 2-6 years (Fioletov et al., 2005).
The Pandora system was developed at NASA's Goddard Space Flight Center and first deployed in the field in 2006. Pandora instruments are based on a commercial spectrometer with stability and stray-light characteristics that make them suitable candidates for both direct-sun and zenith-sky measurements of total column ozone and other trace gases (Herman et al., 2009; Tzortziou et al., 2012). Pandora instruments have been tested and deployed in multiple scientific measurement campaigns around the world. These include the Cabauw Intercomparison Campaign of Nitrogen Dioxide measuring Instruments (CINDI) in the Netherlands in 2009 (Roscoe et al., 2010) and four NASA DISCOVER-AQ campaigns since 2011 (Tzortziou et al., 2012). The Pandora instruments have been used for validation of satellite ozone (Tzortziou et al., 2012) and NO₂ (Herman et al., 2009; Tzortziou et al., 2012) measurements. By 2015, several long-term Pandora sites had been established in the United States and worldwide (including Austria, Canada, the Canary Islands, Finland, and New Zealand). In 2013, two Pandora instruments (serial numbers 103 and 104) were deployed at Toronto, co-located with BrT and BrT-D on the roof of the ECCC Downsview building (43.782° N, 79.47° W).
The instrument random uncertainties of BrT were analysed by Kerr et al. (1996) and Fioletov et al. (2005) using similar methods. These methods both require knowledge of the extraterrestrial calibration (ETC) values, the ozone absorption coefficients, and the Rayleigh scattering coefficients for each instrument. Fioletov et al. (2005) reported that the random uncertainties of individual observations from the BrT are within ±1 % in about 90 % of all measurements. This work takes a different approach, using a statistical variable estimation method to determine the random uncertainties for BrT, BrT-D, and the two Pandora instruments together. The variable estimation method follows the work of Fioletov et al. (2006) to estimate the random uncertainties with the assumption that there is no multiplicative bias between Pandoras and Brewers. Details of the method are provided in Sect. 3.1. Since the instrument random uncertainties for BrT were last reported 10 years ago using data to 2004 (Fioletov et al., 2005), this work provides a new assessment of the performance of both the BrT and BrT-D in recent years, along with a comparison between coincident Brewer and Pandora measurements.
The Pandora ozone retrievals are more sensitive to stratospheric temperatures. In Herman et al. (2015), the temperature dependence for Pandora no. 34 (0.333 % K −1 ) was determined by applying retrievals at a series of different ozone temperatures from 215 to 240 K for the ozone cross sections and then obtaining a linear fit to the percent change. As the small Brewer temperature dependence is known, we use coincident measurements from the BrT and BrT-D to determine the temperature dependence factors for Pandora nos. 103 and 104, and then apply the correction to remove the difference between Pandora and Brewer instruments.
2 Instruments and datasets

2.1 Pandora

The Pandora spectrometer system uses a temperature-stabilized (1 °C) symmetric Czerny-Turner system with a 50-micron entrance slit and 1200 lines mm⁻¹ grating. Unlike the Brewer instruments, which only measure intensities at selected wavelengths, the Pandora instruments, with a 2048 × 64 back-thinned Hamamatsu CCD detector, record spectra from 280 to 530 nm at 0.6 nm resolution (Herman et al., 2015). The spectra are analysed using the differential optical absorption spectroscopy (DOAS) technique (Noxon, 1975; Platt, 1994; Platt and Stutz, 2008; Solomon et al., 1987), in which absorption cross sections for multiple atmospheric absorbers (including ozone, NO₂, SO₂, HCHO, and BrO) are fitted to the spectra (Tzortziou et al., 2012). The Daumont, Brion, and Malicet (DBM) (Daumont et al., 1992; Brion et al., 1993, 1998) ozone cross section at an effective temperature of 225 K is used in the Pandora retrievals (Herman et al., 2015). Additional information on Pandora calibrations and operation can be found in Herman et al. (2015).
Two commercial Pandoras (nos. 103 and 104) were used in this study with no modifications to operational and processing algorithms. Following Tzortziou et al. (2012), the Pandora ozone dataset is filtered to remove data for which the normalized root mean square (RMS) of weighted spectral fitting residuals is greater than 0.05, and the Pandora-calculated standard uncertainty (Tzortziou et al., 2012) in TCO is greater than 2 DU.

Brewer
The Brewer instruments use a holographic grating in combination with a slit mask to select six channels in the UV (303.2, 306.3, 310.1, 313.5, 316.8, and 320 nm) to be detected by a photomultiplier. The first and second wavelengths are used for internal calibration and measuring SO₂ respectively. The four longer wavelengths are used for the ozone retrieval. The total column of ozone is calculated by analysing the relative intensities at these different wavelengths using the Bass and Paur (1985) ozone cross sections at a fixed effective temperature of 228.3 K (Kerr, 2002).
Most of the instruments in the BrT (nos. 8, 14, and 15) and BrT-D (nos. 145, 187, and 191) have been in operation since the Pandora instruments were deployed. However, there are a few measurement gaps for some of the Brewers. For example, Brewers nos. 14 and 15 were recalibrated at Mauna Loa, Hawaii, in October 2013, and Brewer no. 145 was in Spain in March 2014 for an intercomparison. We also had to exclude some periods due to instrument malfunction and repairs. The coincident measurement periods for the instruments are shown in Table 1. The data from Brewer and Pandora instruments are both time-binned (3 min) for the comparison. Following the work of Tzortziou et al. (2012), the Brewer dataset is filtered to remove data with calculated standard uncertainty in TCO greater than 2 DU. In addition, the Brewer dataset is filtered for clouds by removing data for which the logarithm of the signal at 320 nm is less than the mean value minus 2 standard deviations (4 % of data were removed with this filter).
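The two preprocessing steps described above (3 min time-binning and the cloud filter on the 320 nm signal) can be sketched as follows. This is a minimal illustration and not the processing code used in the study: the function names, the use of seconds as the time unit, and the choice of the sample standard deviation are all assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def bin_3min(times_s, tco, width_s=180):
    """Average TCO values that fall in the same fixed-width time bin.

    times_s are measurement times in seconds (assumed unit).
    Returns {bin index: mean TCO in that bin}.
    """
    bins = defaultdict(list)
    for t, value in zip(times_s, tco):
        bins[int(t // width_s)].append(value)
    return {k: mean(v) for k, v in sorted(bins.items())}


def cloud_filter(signal_320, tco):
    """Keep only points whose log signal at 320 nm is >= mean - 2 SD."""
    logs = [math.log(s) for s in signal_320]
    threshold = mean(logs) - 2 * stdev(logs)
    return [value for lg, value in zip(logs, tco) if lg >= threshold]
```

Binning both instruments onto the same 3 min grid is what makes their measurements directly comparable as coincident pairs.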

OMI
The Ozone Monitoring Instrument (OMI) is a nadir-viewing, near-UV-Vis spectrometer aboard NASA's Earth Observing System (EOS) Aura satellite (launched in July 2004). The OMI instrument measures the solar radiation backscattered by the Earth's atmosphere and surface between 270 and 500 nm with a spectral resolution of about 0.5 nm. The OMI TCO data are retrieved using both the Total Ozone Mapping Spectrometer (TOMS) technique (developed by NASA (Bhartia and Wellemeyer, 2002) and based on a retrieval using four wavelengths at 313, 318, 331, and 360 nm) and the DOAS technique (developed by KNMI (Kroon et al., 2008) and based on the spectrum measured in the wavelength range 331.1-336.6 nm). The OMI TCO validation done by Balis et al. (2007) shows a globally averaged agreement of better than 1 % for OMI-TOMS data and better than 2 % for OMI-DOAS data in comparison with Brewer and Dobson measurements.
The OMI TCO products used in the present study are the Level-3 Aura/OMI daily global TCO gridded product (OMTO3e) retrieved by the enhanced TOMS Version 8 algorithm (Balis et al., 2007). The OMTO3e data (Bhartia, 2012) are generated by the NASA OMI science team by selecting the best pixel (shortest path length) data from the good-quality Level-2 TCO orbital swath data (for example, L2 observations with SZA < 70°; details can be found in Bhartia, 2012) that fall in the 0.25° × 0.25° global grids. The OMTO3e data that come from the grid point over the ground-based site are used in this work to validate our correction method for Pandora TCO data.
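Picking the grid point over the ground-based site amounts to finding the 0.25° × 0.25° cell that contains the site coordinates. The sketch below shows one way to do this; it assumes, as an illustration only, that cells are indexed from 90° S and 180° W in 0.25° steps, and the function names are hypothetical. The actual file layout should be checked against the product documentation.

```python
import math


def omi_grid_index(lat_deg, lon_deg, step=0.25):
    """Return (row, col) of the grid cell containing the site.

    Assumed layout: row 0 starts at 90 S, col 0 starts at 180 W.
    """
    row = math.floor((lat_deg + 90.0) / step)
    col = math.floor((lon_deg + 180.0) / step)
    return row, col


def cell_centre(row, col, step=0.25):
    """Latitude and longitude of the centre of cell (row, col)."""
    return (-90.0 + (row + 0.5) * step, -180.0 + (col + 0.5) * step)


# Site coordinates quoted in the text: 43.782 N, 79.47 W
toronto_cell = omi_grid_index(43.782, -79.47)
```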

ECMWF ERA-Interim data
In this work, the ozone-weighted effective temperature was used to assess the temperature sensitivity of Pandora ozone retrievals. Temperature and ozone profiles were extracted from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim data for 2013-2015 (Dee et al., 2011) with 0.5° × 0.5° spatial resolution on 37 standard pressure levels, available from http://apps.ecmwf.int/datasets/. The ozone-weighted effective temperature (T_eff) is calculated from daily ozone and temperature profiles (at 18:00 UTC) over Toronto as a weighted mean of the level temperatures, where w_eff is the weighting function, T_i is the temperature, n_i is the ozone number density, MMR_i is the ozone mass mixing ratio, and p_i is the pressure at pressure level i. In this work, profile data on ECMWF standard pressure levels from no. 6 to no. 30 (10-800 mbar) were used to decrease the noise from variable surface temperatures.
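A minimal sketch of an ozone-weighted effective temperature is given below. The exact weighting used in the study is not reproduced here. The sketch assumes that the ozone number density at each level is proportional to MMR_i · p_i / T_i (ideal gas law) and that the weights are the number densities normalized to sum to one. The function name is hypothetical.

```python
def effective_temperature(temps_k, mmr, pressures):
    """Ozone-weighted mean temperature over the supplied pressure levels.

    Assumption: n_i is proportional to MMR_i * p_i / T_i, so the physical
    constants cancel once the weights are normalized.
    The caller passes only the levels to be used (for example the
    10-800 mbar range mentioned in the text).
    """
    density = [m * p / t for m, p, t in zip(mmr, pressures, temps_k)]
    total = sum(density)
    return sum(n / total * t for n, t in zip(density, temps_k))
```

Because the weights are normalized, only the shape of the ozone profile matters, not its absolute amount.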
3 Statistical uncertainty estimation

Figure 1 shows the time series of the total column ozone datasets used in this work. The seasonal cycles of TCO from the ground-based and satellite instruments track each other well, and the high-frequency daily variations from all ground-based instruments are consistent. By comparing the same quantity retrieved from different remote sensing instruments, we can characterize the differences between them, which are a combination of random uncertainties and systematic bias. Theoretically, information about the random uncertainties can be derived from the measurements themselves (Grubbs, 1948; Toohey and Strong, 2007). The following method for doing this is described in Fioletov et al. (2006) and briefly explained below.

Method
We define the two types of measured TCO (denoted as $M_B$ and $M_P$, for Brewer and Pandora respectively) as simple linear functions of the true TCO value ($X$) and instrument random uncertainties ($\delta_B$ and $\delta_P$), and assume that there is no multiplicative or additive bias between Pandora and Brewer, giving

$$M_B = X + \delta_B, \qquad M_P = X + \delta_P.$$

If we assume that the instrument random uncertainties are independent of the measured TCO, the variance of $M$ is the sum of the variances of $X$ (around the mean of the dataset) and $\delta$:

$$\sigma^2_{M_B} = \sigma^2_X + \sigma^2_{\delta_B}, \qquad \sigma^2_{M_P} = \sigma^2_X + \sigma^2_{\delta_P}.$$

If the difference between Pandora and Brewer does not depend on $X$ (no multiplicative bias), and the random uncertainties of the two instruments are not correlated, then the variance of the difference is equal to the sum of the variances of the random uncertainties:

$$\sigma^2_{M_B - M_P} = \sigma^2_{\delta_B} + \sigma^2_{\delta_P}.$$

Since we have the measured TCO and the difference between the Pandora and Brewer datasets, the variances of the TCO and of the instrument random uncertainties can be solved for:

$$\sigma^2_X = \tfrac{1}{2}\left(\sigma^2_{M_B} + \sigma^2_{M_P} - \sigma^2_{M_B - M_P}\right), \quad \sigma^2_{\delta_B} = \tfrac{1}{2}\left(\sigma^2_{M_B} - \sigma^2_{M_P} + \sigma^2_{M_B - M_P}\right), \quad \sigma^2_{\delta_P} = \tfrac{1}{2}\left(\sigma^2_{M_P} - \sigma^2_{M_B} + \sigma^2_{M_B - M_P}\right). \tag{6}$$

Equation (6) can be used to estimate the standard deviation (SD) of instrument random uncertainties ($\sigma_{\delta_B}$ and $\sigma_{\delta_P}$) and the SD of ozone variability ($\sigma_X$). We do not actually know the variances $\sigma^2_{M_i}$ and $\sigma^2_{M_B - M_P}$; we can only estimate them, with some uncertainty, from the available measurements. It can be shown that the uncertainties in the $\sigma^2_X$, $\sigma^2_{\delta_B}$, and $\sigma^2_{\delta_P}$ estimates depend on the sum of all three variances ($\sigma^2_{M_B}$, $\sigma^2_{M_P}$, and $\sigma^2_{M_B - M_P}$) and can be high even if the estimated variance itself is low (but one or more of the variances $\sigma^2_{M_B}$, $\sigma^2_{M_P}$, and $\sigma^2_{M_B - M_P}$ are high). The estimates are thus only as accurate as the least accurate of these parameters. The variance estimates can be improved by increasing the number of data points or by reducing the variance of $X$ by removing some of the daily variability.
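The variance decomposition described above can be sketched in a few lines. The sketch assumes coincident, equally long Brewer and Pandora series and uses the sample variance; the function name is hypothetical. It returns the three estimates that follow from the two measured variances and the variance of the difference.

```python
from statistics import variance


def estimate_variances(m_b, m_p):
    """Estimate var(X), var(delta_B), var(delta_P) from coincident series.

    Uses var(M_B) = var(X) + var(delta_B), var(M_P) = var(X) + var(delta_P)
    and var(M_B - M_P) = var(delta_B) + var(delta_P).
    """
    var_b = variance(m_b)
    var_p = variance(m_p)
    var_diff = variance([b - p for b, p in zip(m_b, m_p)])
    var_x = (var_b + var_p - var_diff) / 2
    var_delta_b = (var_b - var_p + var_diff) / 2
    var_delta_p = (var_p - var_b + var_diff) / 2
    return var_x, var_delta_b, var_delta_p
```

With very few points the individual estimates can come out too low or even negative, which is the small-sample behaviour discussed in the results.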
To remove the variability in $X$, the residual ozone here is defined as the difference between the high-frequency TCO and the low-frequency TCO measured by an instrument:

$$dM = M_{\text{high-f}} - M_{\text{low-f}}. \tag{7}$$

For example, the Brewer residual ozone could be the Brewer TCO measurements minus the Brewer ozone daily mean for that day, whereas the corresponding Pandora residual ozone would be the Pandora TCO measurements minus the Pandora ozone daily mean. By subtracting the low-frequency signal, we remove most of the ozone variability. In addition, as proposed in Fioletov et al. (2005), to improve the removal of the bias, we can use the following statistical model to calculate the low-frequency signal:

$$M_{\text{low-f}} = A_B I_B + A_P I_P + B\,(t - t_0) + C\,(t - t_0)^2, \tag{8}$$

where $t$ is the time of the measurement and $t_0$ is the time of local solar noon. $I_B$ is an indicator function for the Brewer instrument; it is set to 1 if the TCO is measured by the Brewer and to 0 otherwise. $I_P$ is the corresponding indicator function for the Pandora. The coefficients $A_B$, $A_P$, $B$, and $C$ are estimated by the least-squares method for each day (that is, the calculated low-frequency signals for Brewer and Pandora share the same $B$ and $C$ terms, but each has its own offset, $A_B$ or $A_P$). In the following, we will refer to the residual ozone calculated by subtracting the daily mean value as residual type 1 and that obtained by subtracting this second-order function as residual type 2. The present work is focused on evaluating the high-quality TCO data. Thus, to avoid the stray-light effect, in the statistical uncertainty estimation we only use Pandora and Brewer data with ozone air mass factor (AMF) less than 3 (see Sect. 4 for more details about the stray-light effect).
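The two residual types can be sketched as below for a single day of data. This is an illustration under stated assumptions rather than the study's code. The low-frequency model is taken to be a quadratic in (t − t_0) with shared linear and quadratic coefficients and one offset per instrument, fitted by ordinary least squares through the normal equations. All function names are hypothetical.

```python
from statistics import mean


def residual_type1(tco):
    """Type 1: subtract the instrument's own daily mean."""
    daily_mean = mean(tco)
    return [value - daily_mean for value in tco]


def _design_row(t, is_brewer, t0):
    # Columns: I_B, I_P, (t - t0), (t - t0)^2
    dt = t - t0
    return [1.0 if is_brewer else 0.0, 0.0 if is_brewer else 1.0, dt, dt * dt]


def _solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_low_frequency(times, tco, is_brewer, t0):
    """Least-squares estimate of [A_B, A_P, B, C] for one day of merged data."""
    rows = [_design_row(t, ib, t0) for t, ib in zip(times, is_brewer)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, tco)) for i in range(4)]
    return _solve(xtx, xty)


def residual_type2(times, tco, is_brewer, t0):
    """Type 2: subtract the fitted second-order low-frequency signal."""
    coef = fit_low_frequency(times, tco, is_brewer, t0)
    rows = [_design_row(t, ib, t0) for t, ib in zip(times, is_brewer)]
    return [y - sum(c * x for c, x in zip(coef, row)) for y, row in zip(tco, rows)]
```

Fitting both instruments in one regression is what forces the shared diurnal shape while still allowing each instrument its own offset.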

Results
In this work, we calculate two different types of residual ozone (see Eq. 7) as defined in Sect. 3.1 and then use them to calculate the instrument random uncertainty with the statistical variable estimation method (Eq. 6; more details can be found in Fioletov et al., 2006). For example, we use Eqs. (7) and (8) to calculate the type 2 residuals for both Brewer and Pandora (dM_b-res2 and dM_p-res2), and then calculate their difference (dM_b-res2 − dM_p-res2). Next, we calculate their variance values σ²(dM_b-res2), σ²(dM_p-res2), and σ²(dM_b-res2 − dM_p-res2). Those variance terms are used in Eq. (6) to estimate the random uncertainties. The residual types and relevant terminologies are summarized in Table 2.

Table 2. Residual types and relevant terminology.
M_high-f: high-frequency TCO measurements, averaged in 3 min bins
M_low-f (daily mean): low-frequency TCO, calculated as the daily mean TCO
M_low-f (2nd order function): low-frequency TCO, calculated using the second-order function (Eq. 8)

Figure 2. Estimated random uncertainties: for the Brewer instruments using (a) residual ozone type 1 and (b) residual ozone type 2; for the Pandora instruments using (c) residual ozone type 1 and (d) residual ozone type 2. The black squares indicate data from Pandora no. 103, and the red triangles indicate data from Pandora no. 104. The error bars show the 95 % confidence bounds.

Figure 2 shows the estimated random uncertainties for the Brewer instruments obtained using the two types of residual ozone data (Fig. 2a for residual type 1, Fig. 2b for type 2). For example, in Fig. 2a, the estimated random uncertainty for Brewer no. 8 using Pandora no. 103 data (residual type 1, derived from M_P103) is shown as a black square in the column for Brewer no. 8, while its estimated random uncertainty using Pandora no. 104 data (residual type 1, derived from M_P104) is shown as a red triangle in the same column. Figure 2 demonstrates that type 1 (Fig. 2a) and type 2 (Fig. 2b) residual ozone data provide comparable results and confirm that Brewer instruments have random uncertainties of 1-2 DU. Figure 2 also shows the estimated random uncertainties for the Pandora instruments using the two types of residual ozone data (Fig. 2c for residual type 1, Fig. 2d for type 2). For example, in Fig. 2c, the estimated random uncertainty for Pandora no. 103 using Brewer no. 8 data is shown as a black square in the column of Brewer no. 8, while its estimated random uncertainties using other Brewer data are shown in the respective Brewer columns. Figure 2 demonstrates that the Pandora instruments have estimated random uncertainties less than 1.5 DU. Slight differences in the estimated Pandora random uncertainties were found using different Brewer instruments. This is due to the sample size; when the sample size is large (> 1200 coincident points; see Table 1), the Pandora random uncertainties estimated from different instruments are more consistent. For example, in Fig. 2c, one of the estimated random uncertainties for Pandora no. 103 (black square in Brewer no. 187 column) is below 0.5 DU. This result is undesirable (the value is ∼ 0.5 DU lower than the other values) but not unusual. Dunn (2009) describes this issue in detail and points out that the low (even negative in some cases) variance estimate is due to small sample size. In general, Dunn (2009) concludes that, even with the correct model, the comparisons and estimation of precision are only viable with large sample sizes. Figure 3c shows that the low variance was indeed from the smallest sample size (608 coincident points for Pandora no. 103 vs. Brewer no. 187 and 397 for Pandora no. 104 vs. Brewer no. 187). In addition, when using the data from the same pair of Brewer and Pandora instruments, the estimated random uncertainty for Pandora is consistently lower than that for Brewer by ∼ 0.5 DU. Fioletov et al. (2006) estimated natural ozone variability (σ_X) using Eq. (6).
However, because we are using the residual ozone instead of the TCO in the statistical analysis, the σ X calculated from our method is not the estimated natural ozone variability but the estimated residual ozone variability for the measurement period. It can be used to characterize the difference between residual types 1 and 2. Figure 3a shows the estimated residual ozone variability using residual type 1 data, while Fig. 3b shows the variability using residual type 2. Figure 3a and b demonstrate that residual type 1 data have larger variability than type 2 data, indicating that using the daily mean value as the low-frequency signal did not fully remove the natural ozone variability. Ideally, the random uncertainty estimate should only contain random noise caused by the instrument and no natural ozone variation. Scatter plots of Brewer vs. Pandora residual ozone (Fig. 4) illustrate the same results. Figure 4 shows that the correlation coefficients for residual type 1 (R = 0.813 for Brewer no. 8 vs. Pandora no. 103, see Fig. 4a; 0.909 for Brewer no. 8 vs. Pandora no. 104, see Fig. 4b) are higher than the ones for residual type 2 (0.333 for Brewer no. 8 vs. Pandora no. 103, see Fig. 4c; 0.688 for Brewer no. 8 vs. Pandora no. 104, see Fig. 4d). The low correlation coefficients for ozone residual type 2 data indicate that the ozone variability has been largely removed from Pandora and Brewer data. Thus when we use residual ozone type 2, even with relatively small sample size, the estimated uncertainties for Pandoras are still consistent with those obtained from comparisons with other Brewers having larger sample sizes (see Fig. 2c and d, Brewer no. 187 column).
To summarize, we tested two different methods for calculating residual ozone and applied them in the statistical uncertainty estimation. The comparison of two residuals helps us understand more details about the variable estimation method. Although using the daily mean value as a low-frequency signal (as in the residual type 1 calculation) has some shortcomings, it is more straightforward than using the complex second-order statistical model (Eq. 8). By showing the consistency of results from both type 1 and 2 in Fig. 2, we validated the use of the second-order statistical model (Eq. 8) and proved some of the advantages when using type 2. For example, the residual type 2 could work with smaller data size than the residual type 1 (without making the estimated variance unrealistic, too low, or even negative). In general, Fig. 2 demonstrates that the Pandora TCO data have ∼ 0.5 DU smaller estimated random uncertainties than the Brewer TCO data. The mean estimated random uncertainties for BrT and BrT-D are in the range of 1-2 DU (∼ 0.6 %). The mean estimated random uncertainties for Pandora nos. 103 and 104 are in the range of 0.5-1.5 DU (∼ 0.4 %). These results confirm the quality of the TCO data, with all eight instruments meeting the GAW requirement for a precision better than 1 % to measure ozone (WMO, 2014).

Method
When comparing Pandora and Brewer TCO data, we can see a clear seasonal structure and a bias in the difference and ratio. Figure 5a shows the time series of Brewer no. 14-Pandora no. 103 TCO difference; the seasonal amplitude is 3-4 DU, and the mean bias is 10.81 DU. Figure 5b (which uses the corrected data) will be discussed in Sect. 4.2. The locally weighted scatter plot smoothing (LOWESS(x)) fit (the dashed line) is based on local least-squares fitting applied to a specified x fraction of the data (Cleveland and Devlin, 1988). The bias between Pandora and Brewer TCO is mainly due to the fact that both retrievals depend on the choice of ozone absorption cross section (Scarnato et al., 2009;Herman et al., 2015). The Brewer TCO in this work was retrieved using the standard Brewer network operational ozone cross section (Bass and Paur, 1985), while the Pandora TCO was retrieved using the standard Pandora network operational ozone cross section (the DBM ozone cross section). Redondas et al. (2014) reported that changing the Brewer operational ozone cross section from Bass and Paur (1985) to that of Daumont et al. (1992) (DBM) will change the calculated TCO by −3.2 %. In addition to the offset caused by the use of different ozone cross sections, the seasonal difference between Pandora and Brewer TCO data is due to their differing temperature dependence, which varies from instrument to instrument because of the differences in ozone retrieval algorithm and instrument design. Moreover, even for the same type of instrument, the temperature sensitivity can be different due to imperfections in the wavelength settings and slit function for each individual instrument. We will study these differences (offset and temperature effect) by using the standard TCO products from Pandora and Brewer instruments.
In this work, we use ECMWF ERA-Interim ozone and temperature profiles to calculate daily ozone effective temperature (described in Sect. 2.4). Then we use the following simple linear regression model to find the temperature dependence factor for Pandora instruments:

$$\frac{M_B}{M_P} = a\,(T_{\mathrm{eff}} - 225) + b, \tag{9}$$

where $a$ is the temperature dependence factor for Pandora, $b$ is the (systematic) multiplicative bias between Pandora and Brewer, and 225 refers to the effective temperature of 225 K for the ozone cross sections used in the Pandora retrievals. Here, $M_B$ and $M_P$ are TCO daily means measured by the Brewer and Pandora respectively. To increase the number of coincident data points, the $M_B$ dataset is formed by merging all measurements from the six Brewers (see Table 1). A successfully merged $M_B$ data point has coincident measurements from at least two Brewers, to avoid domination by a single instrument. The coincident time period of the $M_B$ and $M_{P103}$ datasets is from October 2013 to December 2015 with 272 coincident days (points). Figure 6 shows the linear regression results for Pandoras nos. 103 and 104. We found the "relative temperature dependence factor" (RTDF) for Pandora no. 103 to be 0.247 ± 0.013 % K⁻¹ (from the term $a$ in Eq. 9), with a 2.2 ± 0.1 % multiplicative bias (from the term $b$ in Eq. 9). Although Pandora no. 104 only has measurements from January to April 2014 (53 coincident days), the linear regression still results in a similar temperature dependence factor (0.255 ± 0.040 % K⁻¹) and the same bias as Pandora no. 103. The correlation coefficients for those two linear regressions are 0.91 and 0.89 respectively. We applied the Pandora temperature dependence factors to the Pandora TCO to remove its bias and seasonal difference relative to Brewer TCO data. Similar to the correction function used in Herman et al. (2015) for Pandora no. 34, we used the following function to correct Pandora TCO data:

$$M_{\mathrm{corr}} = M_P\,\left[a\,(T_{\mathrm{eff}} - 225) + b\right], \tag{10}$$

where $M_{\mathrm{corr}}$ is the corrected Pandora TCO, and other terms are as defined for Eq. (9).
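The regression step can be sketched as an ordinary least-squares fit of the daily Brewer-to-Pandora TCO ratio against (T_eff − 225 K). The sketch assumes this ratio form of the regression model; the function name and the synthetic numbers in the usage line are illustrative only and are not data from the study.

```python
from statistics import mean


def fit_rtdf(m_b, m_p, t_eff, t_ref=225.0):
    """Fit ratio = a * (T_eff - t_ref) + b and return (a, b).

    m_b, m_p: daily-mean Brewer and Pandora TCO on coincident days.
    a is a fractional change per kelvin (multiply by 100 for % per K).
    """
    x = [t - t_ref for t in t_eff]
    y = [b / p for b, p in zip(m_b, m_p)]
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a = sxy / sxx
    return a, y_mean - a * x_mean


# Synthetic example: Pandora fixed at 300 DU, Brewer built from a = 0.00247, b = 1.022
_t = [220.0, 225.0, 230.0]
_mp = [300.0, 300.0, 300.0]
_mb = [p * (0.00247 * (t - 225.0) + 1.022) for p, t in zip(_mp, _t)]
a_fit, b_fit = fit_rtdf(_mb, _mp, _t)
```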
For the Pandora no. 103 dataset, this becomes

$$M_{\mathrm{corr}} = M_{P103}\,\left[0.00247\,(T_{\mathrm{eff}} - 225) + 1.022\right], \tag{11}$$

where $M_{P103}$ is the TCO data from Pandora no. 103. The temperature dependence factor (0.247 ± 0.013 % K⁻¹) and the multiplicative bias (1.022) are found in Fig. 6. The same regression model and method give a 0.255 ± 0.040 % K⁻¹ temperature dependence factor with a 2 % multiplicative bias for Pandora no. 104, and hence

$$M_{\mathrm{corr}} = M_{P104}\,\left[0.00255\,(T_{\mathrm{eff}} - 225) + 1.022\right], \tag{12}$$

where $M_{P104}$ is the Pandora no. 104 TCO. For comparison, Herman et al. (2015) derived the correction function for Pandora no. 34 as

$$M_{\mathrm{corr}} = M_{P34}\,\left[1 + 0.00333\,(T_{\mathrm{eff}} - 225)\right], \tag{13}$$

where the 0.00333 (0.333 % K⁻¹) is the temperature dependence factor for Pandora no. 34. Note that this value was determined by applying retrievals using ozone cross sections from 215 to 240 K and then obtaining a linear fit to the percent change (Herman et al., 2015). In this work, however, the factors for Pandora nos. 103 and 104 were found by statistical analysis (comparison) of the Pandora and Brewer TCO datasets. Thus our temperature dependence factor combines the temperature sensitivity from both Pandora and Brewer instruments, and describes the relative temperature sensitivity between the Pandora and Brewer standard TCO products. We call it a "relative temperature dependence factor" (RTDF), while that from Herman et al. (2015) is an absolute temperature dependence factor (ATDF). Although the RTDF is a non-linear combination of the ATDFs from both Pandora and Brewer (note that the Pandora used an ozone cross section at an effective temperature of 225 K, while the Brewer used that at 228.3 K), we can still make a simple linear estimation of the RTDF from reported ATDFs. In fact, the reported ATDF for Pandora no. 34 (0.333 % K⁻¹; Herman et al., 2015) minus the reported ATDF for Brewer nos. 8 and 14 (0.07 and 0.094 % K⁻¹; Kerr et al., 1988; Kerr, 2002) gives relative numbers (0.26 and 0.24 % K⁻¹) that are close to our model-calculated RTDF (∼ 0.25 % K⁻¹). In our correction functions (Eqs. 11-12), we have a constant $b$ term of 1.022 with an uncertainty of 0.001, which indicates a multiplicative bias of ∼ 2 % (not caused by the temperature effect) between the Pandora and Brewer instruments due to their different selection of ozone cross sections.
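As a worked illustration, the sketch below applies a correction of the form M_corr = M_P · [a · (T_eff − 225) + b] using the coefficients quoted for Pandora no. 103 (a = 0.00247 K⁻¹, b = 1.022), and repeats the simple linear RTDF estimate from the reported ATDFs. The functional form is assumed from the description in the text, and the function names are hypothetical.

```python
def correct_pandora(m_p, t_eff, a=0.00247, b=1.022, t_ref=225.0):
    """Temperature- and bias-corrected Pandora TCO (defaults: Pandora no. 103)."""
    return m_p * (a * (t_eff - t_ref) + b)


def linear_rtdf_estimate(atdf_pandora, atdf_brewer):
    """Simple linear RTDF estimate: Pandora ATDF minus Brewer ATDF (% per K)."""
    return atdf_pandora - atdf_brewer


# At T_eff = 225 K only the multiplicative bias acts: 300 DU -> about 306.6 DU
corrected = correct_pandora(300.0, 225.0)

# Reported ATDFs: Pandora no. 34 (0.333), Brewer no. 8 (0.07), Brewer no. 14 (0.094)
rtdf_vs_brewer8 = round(linear_rtdf_estimate(0.333, 0.07), 2)
rtdf_vs_brewer14 = round(linear_rtdf_estimate(0.333, 0.094), 2)
```

Both linear estimates land within about 0.01 % K⁻¹ of the fitted RTDF of roughly 0.25 % K⁻¹.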
Merging data from all six Brewers could lead to variation of the Brewer temperature dependence, so we performed sensitivity tests on the dataset. Table 3 summarizes the tests; the combined Brewer data are merged from all available Brewer data during the data period indicated in the table. Figure 7 shows the RTDFs, multiplicative bias, correlation coefficient, and number of data points for the 13 sensitivity tests. Tests 1 and 2 are the results adapted from Fig. 6. The RTDFs (see Fig. 7a) are in the range of 0.24-0.29 % K⁻¹, and the multiplicative biases (see Fig. 7b) are in the range of 1.7-2.5 %. The correlation coefficients (see Fig. 7c) for most tests are above 0.8. In general, the RTDFs found for the Pandora instruments are stable when derived from combined Brewer data or reliable individual Brewer data. For this 2-year data period, the derived RTDFs from BrT-D instruments are lower (0.241-0.246 % K⁻¹) than the ones from BrT instruments (0.262-0.290 % K⁻¹). However, with the large uncertainties on the estimated RTDFs and the bias, we could not conclude whether this is due to the different instrument designs or a sampling issue.

Table 3. Sensitivity tests: Pandora instrument, Brewer dataset, data period, and RTDF (% K⁻¹).
Test 1: Pandora no. 103; combined Brewer (nos. 8, 14, 15, 145, 187, 191); Oct 2013-Dec 2015; 0.247 ± 0.013
Test 2: Pandora no. 104; combined Brewer (nos. 8, 14, 15, 187, 191); Jan 2014-Apr 2014; 0.255 ± 0.040
Test 3: Pandora no. 103; combined Brewer (nos. 8, 14, 15, 187, 191); Jan 2014-Apr 2014; 0.261 ± 0.027
Test 4: Pandora no. 103; combined Brewer (nos. 8, 14, 15, 187, 191); Oct 2013-Aug 2014; 0.255 ± 0.020
Test 5: Pandora no. 103; combined Brewer (nos. 8, 14, 145, 191 …

Pandora TCO correction
As an example, Fig. 5 shows the time series of Brewer no. 14-Pandora no. 103 TCO differences, before and after applying the Pandora correction (Eq. 11). A clear seasonal signal is seen due to the variation of T_eff before we apply the temperature dependence correction (see Fig. 5a). Figure 8 shows scatter plots of Pandora no. 103 versus Brewer no. 14 TCO. In Fig. 8a, the linear regression (green line, weighted accounting for uncertainties from both measurements; York et al., 2004) between Pandora no. 103 and Brewer no. 14 gives a slope of 1.023, an offset of −18.486 DU, and strong correlation (R = 0.9954). Forcing the intercept to 0 gives a slope of 0.969, indicating −3.1 % mean bias. This is consistent with the work of Redondas et al. (2014), which showed that changing the Brewer ozone cross section from Bass and Paur to DBM changed the Brewer TCO by −3.2 %. The colour coding of the scatter points makes it evident that this non-ideal slope and offset are related to T_eff. After applying the correction, the seasonal Brewer-Pandora difference disappears as seen in Fig. 5b, and the linear regression (green line) gives a slope of 1.008, an offset of −2.678 DU, and an improved correlation (R = 0.9982) (see Fig. 8b). Linear fitting with zero intercept gives a slope of 1.001, indicating that the correction improves the mean bias between Pandora and Brewer TCO from −3.1 to 0.1 %.

Figure 8. Scatter plots of Pandora no. 103 vs. Brewer no. 14 TCO, colour-coded by ozone effective temperature: (a) before applying the correction and (b) after applying the correction. The red line is a simple linear fit, the green line is the linear fit weighted by the calculated standard uncertainty from Pandora and Brewer TCO data, the blue line is the linear fit with intercept set to 0, and the black line is the one-to-one line.

Figure 9. Effective ozone temperature: (a) T_eff calculated using ECMWF ERA-Interim data (18:00 UTC over Toronto) and NASA climatology data (monthly mean for 40-50° N), and (b) the difference between these two.
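The zero-intercept slopes and the mean biases derived from them can be reproduced with a least-squares fit through the origin. The sketch below uses an unweighted fit, which is an assumption: the text does not state whether the zero-intercept fit was weighted. The function names are hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = k * x (no intercept, unweighted)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def mean_bias_percent(slope):
    """Mean bias implied by a zero-intercept slope, in percent."""
    return (slope - 1.0) * 100.0


# A slope of 0.969 (Pandora vs. Brewer) corresponds to a mean bias of about -3.1 %
bias_before = mean_bias_percent(0.969)
# A slope of 1.001 after the correction corresponds to about +0.1 %
bias_after = mean_bias_percent(1.001)
```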
To calculate the effective temperature, we use daily temperature and ozone profiles from ECMWF ERA-Interim data at 18:00 UTC for Toronto, whereas Herman et al. (2015) used monthly averaged temperature and ozone climatology data (interpolating the climatological ozone profile to the observed TCO in order to capture day-to-day variability; see ftp://toms.gsfc.nasa.gov/pub/ML_climatology) for latitudes of 30-40 and 40-50° N to form an average suitable for Boulder (40° N). To understand the difference due to the selection of T_eff, we adapted the climatology data used in Herman et al. (2015) and used the data from 40 to 50° N to calculate effective ozone temperature for Toronto (44° N). Figure 9 shows the comparison between the ECMWF daily T_eff and the NASA monthly climatology T_eff. A sudden cooling event happened at Toronto on 29-30 January 2014, for which the difference between the daily and monthly T_eff was −10 K. Figure 10 shows the time series of TCO difference (combined Brewer minus Pandora no. 103) before and after applying the temperature dependence correction using both the monthly climatology T_eff and daily T_eff. Because the monthly climatology T_eff does not reflect the low temperature during those two days, the correction function (see Eq. 11) overcompensated for the temperature effect (the minimum delta ozone value on 29 January changed from −8 DU in Fig. 10a to −14 DU in Fig. 10b). The low-temperature event was captured by the daily T_eff; thus the compensation from the temperature effect was reasonably small when using ECMWF daily T_eff (the minimum value was −7 DU; see Fig. 10c). In general, the ECMWF daily T_eff can better capture some ozone variation events that are associated with rapid temperature changes.
Figure 10. Time series of combined Brewer-Pandora no. 103 TCO difference colour-coded by ozone effective temperature: (a) before applying the temperature dependence correction, (b) after applying the correction using NASA monthly climatology T_eff, and (c) after applying the correction using ECMWF ERA-Interim daily T_eff. The sudden cooling event on 29-30 January 2014 is marked by a black box. The dashed lines are LOWESS(0.5) fits.
Figure 11 shows time series of the monthly average TCO difference in percentage before and after applying the temperature dependence correction for eight pairs of instruments (six individual Brewers vs. Pandora no. 103, combined Brewer vs. Pandora no. 103, and combined Brewer vs. Pandora no. 104). Figure 11a shows that both Pandora nos. 103 and 104 have similar offsets relative to the Brewers before applying the correction to Pandora data. In addition, the seasonal variations are consistent when comparing Pandora no. 103 to six individual Brewers (see Fig. 11a). After applying the TCO corrections (Fig. 11b), the seasonal differences decreased from ±1.02 to ±0.25 % for Pandora no. 103 and from ±0.40 to ±0.25 % for Pandora no. 104, as did the offset, which decreased from 2.92 to −0.04 % for Pandora no. 103 and from 2.11 to −0.01 % for Pandora no. 104. The 1σ uncertainty in Fig. 11b shows that, statistically, the corrected Pandora datasets have no significant seasonal differences or offsets compared to the Brewer datasets.
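For reference, the effective ozone temperature is commonly defined as the ozone-weighted mean of the atmospheric temperature profile. The Python sketch below assumes that definition, with ozone supplied as partial columns per layer. The function name and the four-layer profile are hypothetical and are not taken from ERA-Interim or the NASA climatology.

```python
def effective_temperature(temps_k, ozone_partial_columns_du):
    """Ozone-weighted mean temperature (K) of a layered profile.

    temps_k: mean temperature of each layer (K)
    ozone_partial_columns_du: ozone partial column of each layer (DU)
    """
    if len(temps_k) != len(ozone_partial_columns_du):
        raise ValueError("profiles must have the same number of layers")
    total_ozone = sum(ozone_partial_columns_du)
    if total_ozone <= 0:
        raise ValueError("total ozone must be positive")
    weighted = sum(t * o for t, o in zip(temps_k, ozone_partial_columns_du))
    return weighted / total_ozone


# Hypothetical four-layer profile (values invented for illustration).
temps = [220.0, 215.0, 225.0, 240.0]    # K
ozone = [30.0, 120.0, 130.0, 40.0]      # DU per layer, 320 DU in total

t_eff = effective_temperature(temps, ozone)

# A uniform 10 K cooling of the profile lowers T_eff by the same 10 K.
t_eff_cold = effective_temperature([t - 10.0 for t in temps], ozone)
print(f"T_eff = {t_eff:.2f} K, after cooling = {t_eff_cold:.2f} K")
```

Because the weights come from the ozone profile itself, a daily profile can follow a short cooling event that a monthly climatology would smooth out.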

Comparison with OMI
To further validate the temperature dependence correction for the Pandora data, we used OMI ozone data (version OMTO3e). Pandora data are averaged within ±10 min of OMI overpass times. In Fig. 12, scatter plots of OMI vs. Pandora TCO are shown in panels a and b; OMI vs. corrected Pandora TCO (using Eqs. 11 and 12 with the correction functions found from our statistical model) is shown in panels c and d; and OMI vs. corrected Pandora TCO (using Eq. 13 with the correction function from Herman et al., 2015) is shown in panels e and f. All the Pandora TCO corrections shown in Fig. 12 used the same T_eff calculated with the ECMWF ERA-Interim daily ozone data. Figure 12a and c show that, after applying the TCO correction (Eq. 11) to Pandora no. 103, the slope of the linear regression improved from 0.987 to 0.990, the offset improved from 14.84 to −3.59 DU, the correlation coefficient improved from 0.987 to 0.991, and the mean bias between OMI and Pandora improved from 3.1 to 0.02 %. Similar improvement is seen in the comparison between Pandora no. 104 and OMI (see Fig. 12b and d), although the size of the coincident measurement dataset is smaller, with the mean bias improving from 1.5 to −0.6 %. In addition, Fig. 12e and f show that, by using the correction function from Herman et al. (2015), the comparisons also improve, although a 1.9 % (1.4 %) bias remains for Pandora no. 103 (no. 104) (indicated by the slope of the linear fit with the intercept forced to 0; see the green lines in Fig. 12). Note that the ATDF in Herman et al. (2015) is only 0.08 % K⁻¹ higher than our RTDF.
Figure 13a and b show the monthly mean time series of the OMI-Pandora TCO percentage difference, before and after applying the three correction functions. All three correction models reduced the difference between Pandora and OMI. Our relative correction model (Eqs. 11 and 12) reduces the seasonal difference (indicated by the δ of the percentage monthly delta ozone) between Pandora no. 103 and OMI from ±1.68 to ±1.00 %, with the mean bias decreasing from 2.65 to −0.19 % (the mean of the percentage monthly delta ozone). Pandora no. 104 has a similar improvement. The absolute correction model (Eq. 13) reduces the seasonal difference between Pandora no. 103 and OMI to 0.87 %, with the mean bias decreased to 1.71 %. The reduction in the mean bias between Pandora and OMI is better for the relative correction model. This result (−0.19 ± 1.00 % mean bias) is consistent with Balis et al. (2007), who showed that the global average difference between OMI-TOMS and Brewer instruments is within 0.6 %, and that the difference in the 40-50° N band (Toronto is at 44° N) is close to 0 (see their Fig. 1). Balis et al. (2007) reported that the time series of globally averaged differences between OMI-TOMS and Brewer instruments shows almost no annual variation, and the OMI-TOMS data theoretically have no temperature dependence (McPeters and Labow, 1996; Bhartia and Wellemeyer, 2002). By using our relative correction, the corrected Pandora TCO should have similar performance to the Brewer TCO. Figure 13c shows the difference between the absolute correction method and the relative correction method. Although both methods removed some of the seasonal signal (reduced from 1.68 to 1.00 % for the relative correction and to 0.87 % for the absolute correction), Fig. 13c shows that a weak residual seasonal signal (0.39 %) remains between these two methods.
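The two summary statistics used above (the mean bias and the seasonal difference of the monthly percentage delta ozone) can be illustrated with a short Python sketch. It assumes that the percentage difference is (reference − Pandora) / reference × 100 and that the seasonal difference is the sample standard deviation of the monthly means. Both choices, the function name, and the sample records are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean, stdev


def monthly_bias_and_spread(records):
    """records: iterable of (month_key, reference_tco, pandora_tco).

    Returns (mean of monthly mean percentage differences,
             sample standard deviation of those monthly means).
    """
    by_month = defaultdict(list)
    for month, ref, pan in records:
        by_month[month].append((ref - pan) / ref * 100.0)
    monthly_means = [mean(v) for _, v in sorted(by_month.items())]
    if len(monthly_means) < 2:
        raise ValueError("need at least two months to estimate a spread")
    return mean(monthly_means), stdev(monthly_means)


# Invented coincident measurements (DU); keys are (year, month).
sample = [
    ((2014, 1), 400.0, 390.0),   # 2.5 %
    ((2014, 1), 400.0, 394.0),   # 1.5 %
    ((2014, 2), 380.0, 361.0),   # 5.0 %
    ((2014, 3), 350.0, 346.5),   # 1.0 %
]

bias_pct, spread_pct = monthly_bias_and_spread(sample)
print(f"mean bias = {bias_pct:.2f} %, seasonal spread = ±{spread_pct:.2f} %")
```

Averaging within each month first keeps months with many coincident measurements from dominating the overall bias.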

Stray-light effect
It is well known that direct-sun UV spectrometers are affected by stray light when the solar zenith angle (SZA) is too large. In general, when the ozone AMF is larger than 3 (SZA > 70°), the retrieved TCO will show an unrealistic decrease with increasing SZA (thus this effect is also known as the air mass dependence effect). Stray light from longer wavelengths results in overestimation of the UV signal at short wavelengths and makes the measured UV signal in that part of the spectrum less sensitive to TCO. The double Brewer spectrometer, introduced in 1992, uses two spectrometers in series to reduce the stray light (Bais et al., 1996; Wardle et al., 1996; Fioletov et al., 2000). The BrT-D has the advantage of a very low internal stray-light fraction (10⁻⁷, stray-light signal divided by total signal) compared to the BrT (10⁻⁵) in the 300-330 nm spectral range (Fioletov et al., 2000; Tzortziou et al., 2012). For Pandora instruments, a UV340 filter is used to remove most of the stray light that originates from wavelengths longer than 380 nm (Herman et al., 2015). A typical UV340 filter has a small leakage (5 %) at ∼720 nm, which misses the detector and hits the internal baffles. Further stray-light correction is done by subtracting the signal of pixels corresponding to 280 to 285 nm (which contain almost zero direct illumination) from the rest of the spectrum. However, a very small (but unknown) amount of this stray light may scatter onto the detector (Herman et al., 2015). Tzortziou et al. (2012) tested the stray-light effect for Pandora no. 34 and Brewer no. 171 and concluded that the Pandora stray-light fraction (∼10⁻⁵) was comparable to that of the single Brewer. Pandora ozone retrievals are accurate up to a slant column between 1400 and 1500 DU or 70 and 80° SZA, depending on the TCO amount (Herman et al., 2015).
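The pixel-based stray-light subtraction described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the stray-light level is estimated as the mean signal of the 280-285 nm pixels; the function name, the averaging choice, and the sample spectrum are hypothetical and do not reproduce the operational Pandora processing.

```python
def subtract_stray_light(wavelengths_nm, counts, lo_nm=280.0, hi_nm=285.0):
    """Subtract the mean signal of the lo_nm..hi_nm pixels from a spectrum.

    Returns (corrected_counts, estimated_stray_light_level).
    """
    reference = [c for w, c in zip(wavelengths_nm, counts) if lo_nm <= w <= hi_nm]
    if not reference:
        raise ValueError("no pixels found in the stray-light reference window")
    level = sum(reference) / len(reference)
    return [c - level for c in counts], level


# Invented spectrum: three reference pixels plus three signal pixels.
wavelengths = [280.0, 282.5, 285.0, 305.0, 310.0, 320.0]   # nm
raw_counts = [12.0, 10.0, 8.0, 510.0, 1010.0, 5010.0]

corrected, stray_level = subtract_stray_light(wavelengths, raw_counts)
print(f"stray-light level = {stray_level:.1f} counts")
print("corrected counts:", corrected)
```

A single offset of this kind removes only the spectrally flat part of the stray light, which is consistent with the remark above that a small unknown amount may still reach the detector.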
In this work, to assess the air mass dependence, we compared Brewer TCO to the corrected Pandora TCO data. Figure 14 shows an example of the Brewer / Pandora ratio as a function of ozone AMF (reported value in Brewer data) before and after applying the TCO correction (Eq. 11), with the data points grouped by effective temperature. Before applying the correction (Fig. 14a), the linear fits show consistently low (−0.1 to 0.5 %) relative AMF dependence between Brewer and Pandora (defined as the slope of the linear fit) for each T_eff group. However, the linear fit to the whole dataset (all effective temperatures, black line) shows that the relative AMF dependence is −0.007. Figure 14b shows that the correction changed the slope of the black line to −0.001; removing the temperature effect for the Pandora dataset thus reduces the relative AMF dependence from −0.7 to −0.1 %. To characterize only the air mass dependence, we therefore removed the temperature dependence effect from the Pandora dataset.
Figure 14. Brewer no. 14 / Pandora no. 103 TCO ratio vs. ozone air mass factor: (a) before and (b) after applying the Pandora temperature dependence correction. The points are grouped by effective temperature (from 215 to 240 K, in 5 K bins), and the linear fits for each group are colour-coded. The black line and linear fit are for the whole dataset.
Figure 15. Percentage difference between Pandoras (nos. 103 and 104) and Brewers (grouped as BrT and BrT-D) as a function of ozone air mass factor. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, and the whiskers extend to the most extreme data points not considered outliers.
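The contrast between the per-group slopes and the whole-dataset slope can be reproduced with a small synthetic example. The Python sketch below computes the relative AMF dependence as an ordinary least-squares slope (in percent per unit AMF) within 5 K effective-temperature bins and for all points together. The function names and the data are assumptions chosen so that the cold points sit at larger AMF, as would be expected for winter observations; they are not measurements.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def amf_dependence_by_teff(amf, ratio, teff, bin_width=5.0):
    """Relative AMF dependence (% per unit AMF) for each T_eff bin."""
    groups = defaultdict(lambda: ([], []))
    for a, r, t in zip(amf, ratio, teff):
        lower_edge = bin_width * (t // bin_width)
        groups[lower_edge][0].append(a)
        groups[lower_edge][1].append(r)
    return {edge: 100.0 * ols_slope(xs, ys)
            for edge, (xs, ys) in sorted(groups.items()) if len(xs) >= 2}


# Synthetic data: the ratio depends only on T_eff, not on AMF.
amf = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0]
ratio = [1.03, 1.03, 1.03, 1.02, 1.02, 1.02]     # Brewer / Pandora
teff = [232.0, 232.0, 232.0, 217.0, 217.0, 217.0]  # K

per_bin = amf_dependence_by_teff(amf, ratio, teff)
overall = 100.0 * ols_slope(amf, ratio)
print("per-bin slopes (%):", per_bin)
print(f"whole-dataset slope = {overall:.2f} %")
```

In this constructed case every bin has essentially zero slope, yet the combined fit shows an apparent AMF dependence of about −0.6 %, purely because T_eff and AMF are correlated.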
To show how the different instrument designs affect the stray-light performance, we merged the six Brewer datasets into two groups (BrT and BrT-D) to compare with the corrected Pandora data. Figure 15 shows the (Brewer − Pandora) / Brewer percentage difference as a function of ozone AMF. In Sects. 3 and 4, the TCO data with ozone AMF > 3 were discarded. The purpose of this filter was to ensure that only the best direct-sun measurements (with low air mass dependence) from both instruments were used. However, to study the instrument performance for large AMFs, and also to characterize the performance of Brewer and Pandora instruments, we changed the AMF threshold from 3 to 6. Figure 15 indicates that Pandora, BrT, and BrT-D instruments have similar air mass dependence for ozone AMF < 3 (∼71° SZA), consistent with the result reported by Tzortziou et al. (2012). Pandora and BrT-D have similar AMF dependence up to ozone AMF of 5.5-6 (80.6-81.6° SZA), but Pandora and BrT diverge above AMF of 3-4 (71-76° SZA). In general, these results indicate the Pandora and BrT-D instruments have very good stray-light control.
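The correspondence between ozone AMF and SZA quoted above can be checked with the common single-layer geometric approximation. The Python sketch below assumes a thin ozone layer at 22 km altitude and a mean Earth radius of 6371 km. The formula, both constants, and the function name are assumptions for illustration, since the text only gives the paired AMF and SZA values.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumed mean Earth radius
OZONE_LAYER_HEIGHT_KM = 22.0  # assumed effective ozone layer altitude


def ozone_amf(sza_deg, layer_km=OZONE_LAYER_HEIGHT_KM, radius_km=EARTH_RADIUS_KM):
    """Geometric ozone air mass factor for a thin layer at layer_km altitude."""
    if not 0.0 <= sza_deg < 90.0:
        raise ValueError("SZA must be in [0, 90) degrees")
    # Sine of the local zenith angle of the solar beam at the ozone layer.
    sin_local = radius_km / (radius_km + layer_km) * math.sin(math.radians(sza_deg))
    return 1.0 / math.sqrt(1.0 - sin_local ** 2)


for sza in (71.0, 76.0, 80.6, 81.6):
    print(f"SZA = {sza:4.1f} deg -> ozone AMF = {ozone_amf(sza):.2f}")
```

Under these assumptions, 71° SZA gives an AMF close to 3, and 80.6-81.6° SZA gives an AMF of roughly 5.5-6, in line with the values given in the text.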

Conclusions
The instrument random uncertainty, TCO temperature dependence, and ozone air mass dependence have been determined using two Pandora and six Brewer instruments. In general, Pandora and Brewer instruments both have very low random uncertainty (< 2 DU) in the total column ozone measurements, with that for the Pandora being ∼0.5 DU lower than that for the Brewer. This indicates that Pandora instruments could provide more precise measurements than the Brewer for the study of small-scale (temporal and magnitude) atmospheric changes. This work confirms the quality of the TCO data, with all eight instruments meeting the GAW requirement for a precision better than 1 % (WMO, 2014); however, the Brewer instruments have smaller ozone temperature dependence than the Pandoras.
By using the ECMWF ERA-Interim and Brewer ozone data in the statistical method, we successfully corrected the Pandora TCO to decrease its temperature dependence. We found relative temperature dependence factors of 0.247 % K⁻¹ for Pandora no. 103 and 0.255 % K⁻¹ for Pandora no. 104 against the Brewer instruments. This relative temperature dependence factor is comparable to the absolute temperature dependence factors previously found for Pandora (0.333 % K⁻¹, by applying retrievals with different ozone cross sections; Herman et al., 2015) and Brewers (0.07-0.094 % K⁻¹; Kerr et al., 1988; Kerr, 2002). In addition, a 2 % multiplicative bias was found between the Pandora and Brewer standard TCO products, which is due to the different ozone cross sections used in the retrievals. After applying the corrections, the annual seasonal difference between Pandora and Brewer instruments decreased from ±1.02 to ±0.25 %, and the mean bias decreased from 2.92 to 0.04 %. In addition to using model ozone data (ECMWF ERA-Interim for our case) to calculate the effective ozone temperature, it could also be estimated from Brewer or Pandora measurements (Kerr, 2002; Tiefengraber et al., 2016), albeit at a cost of decreased TCO measurement precision. An effective ozone temperature algorithm is under development for the Pandora. The future operational Pandora ozone retrieval algorithm will use this derived effective ozone temperature to minimize the temperature dependence of the ozone product (Tiefengraber et al., 2016).
This study confirmed that the Pandora and Brewer TCO data have negligible air mass dependence when the ozone AMF < 3. The Pandora and BrT instruments have similar air mass dependence (relative air mass dependence < ±0.1 %) up to 71° SZA (AMF < 3); the Pandora and BrT-D instruments have very good stray-light control, and their AMF dependence is comparably low up to 81.6° SZA (within 1 % up to AMF = 5.5 and within 1.5 % up to AMF = 6).

Data availability
Data from the BrT, BrT-D, and Pandora instruments are available through Environment and Climate Change Canada (contact Vitali Fioletov, vitali.fioletov@canada.ca). The final version of the Brewer data is (or will be) available from the World Ozone and UV Data Centre (doi:10.14287/10000001). OMTO3e data are available from the GES DISC: doi:10.5067/Aura/OMI/DATA3002 (GES DISC, 2004). Any additional data may be obtained from Xiaoyi Zhao (xizhao@atmosp.physics.utoronto.ca).