CALIPSO Lidar Calibration at 532 nm: Version 4 Nighttime Algorithm.

Data products from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) on board Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) were recently updated following the implementation of new (version 4) calibration algorithms for all of the level 1 attenuated backscatter measurements. In this work we present the motivation for and the implementation of the version 4 nighttime 532 nm parallel channel calibration. The nighttime 532 nm calibration is the most fundamental calibration of CALIOP data, since all of CALIOP's other radiometric calibration procedures - i.e., the 532 nm daytime calibration and the 1064 nm calibrations during both nighttime and daytime - depend either directly or indirectly on the 532 nm nighttime calibration. The accuracy of the 532 nm nighttime calibration has been significantly improved by raising the molecular normalization altitude from 30-34 km to 36-39 km to substantially reduce stratospheric aerosol contamination. Due to the greatly reduced molecular number density and consequently reduced signal-to-noise ratio (SNR) at these higher altitudes, the signal is now averaged over a larger number of samples using data from multiple adjacent granules. As well, an enhanced strategy for filtering the radiation-induced noise from high energy particles was adopted. Further, the meteorological model used in the earlier versions has been replaced by the improved MERRA-2 model. An aerosol scattering ratio of 1.01 ± 0.01 is now explicitly used for the calibration altitude. These modifications lead to globally revised calibration coefficients which are, on average, 2-3% lower than in previous data releases. Further, the new calibration procedure is shown to eliminate biases at high altitudes that were present in earlier versions and consequently leads to an improved representation of stratospheric aerosols. Validation results using airborne lidar measurements are also presented. 
Biases relative to collocated measurements acquired by the Langley Research Center (LaRC) airborne High Spectral Resolution Lidar (HSRL) are reduced from 3.6% ± 2.2% in the version 3 data set to 1.6% ± 2.4% in the version 4 release.


Introduction
Published by Copernicus Publications on behalf of the European Geosciences Union.
The Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite was launched on 28 April 2006, with a payload of three Earth-observing instruments: the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), an elastic backscatter lidar; a wide field-of-view camera; and an imaging infrared radiometer (Garnier et al., 2017). CALIOP produces a data set of vertically resolved cloud and aerosol properties as an integral part of NASA's Afternoon (A-Train) constellation. CALIOP's unique measurements have been widely adopted in a broad range of scientific studies and have greatly advanced our knowledge in the areas of aerosol emission and transport processes, Earth's radiative energy budget and atmospheric heating profiles, numerical weather forecasting, regional and global climate studies, and ocean biomass studies (Winker et al., 2010a; Solomon et al., 2011; Vernier et al., 2011; Yu et al., 2015; Santer et al., 2015; Tan et al., 2016; Behrenfeld et al., 2017). The fidelity of these new scientific results depends crucially on the calibration of the CALIOP lidar (Powell et al., 2009; hereafter P09). The lidar transmits pulses of linearly polarized laser light at 532 and 1064 nm. The CALIOP receiver measures the attenuated backscatter from molecules and particles in the atmosphere, including both parallel and perpendicular components at 532 nm and total backscatter at 1064 nm. The detector channels are sampled at a rate of 10 MHz, and the digitized signals are converted to 532 nm total backscatter, 532 nm perpendicular backscatter and 1064 nm total backscatter; they are reported in the Level 1 data products. These measurements are calibrated using the nighttime observations acquired by the 532 nm parallel channel at stratospheric altitudes, where aerosols and clouds have been assumed to be either absent or well characterized and where almost all of the backscattered light is from molecules. 
Assuming a molecular-only atmosphere, accurate estimates of the expected laser backscatter are computed from an atmospheric assimilation model provided by the Global Modeling and Assimilation Office (GMAO). This is the first and most important step in the CALIOP data processing, as the daytime backscatter measurements at 532 nm as well as the daytime and nighttime measurements at 1064 nm are all subsequently calibrated relative to the 532 nm nighttime calibration. The version 4 (V4) updates to the calibration algorithms for 532 nm daytime signals and 1064 nm signals are described in two companion papers: Getzewich et al. (2018) and Vaughan et al. (2018), respectively. Calibration of the 532 nm polarization gain ratio is performed using onboard calibration hardware, described in P09, and has not been altered in V4. These calibrated attenuated backscatter data at 532 and 1064 nm constitute Level 1 in the CALIPSO data processing hierarchy and are used for all Level 2 analyses, including layer detection, cloud-aerosol discrimination, determination of cloud ice-water phase, aerosol subtyping, and retrievals of particulate extinction and backscatter profiles (Vaughan et al., 2009; Liu et al., 2009; Hu et al., 2009; Omar et al., 2009; Young and Vaughan, 2009).
The CALIOP 532 nm nighttime calibration uses the well-established molecular normalization technique, wherein a scalar-valued calibration coefficient is calculated to achieve the best match between the signals measured over a designated calibration range and the expected signals derived from a molecular scattering model (Russell et al., 1979; P09). For the initial release of the CALIOP data products the calibration region was fixed between 30 and 34 km, where it remained for all versions of CALIOP data up to version 3.40. However, fairly early in the mission lifetime, a study by Vernier et al. (2009) showed conclusively that the aerosol loading in the 30-34 km calibration region was non-negligible and varied in both time and space. In this paper we report the results of a new calibration procedure for the nighttime 532 nm data, first implemented in version 4.00 of the CALIOP Level 1 data, which was publicly released in April 2014. In November 2016, the version 4.00 data release was updated to version 4.10 (Vaughan et al., 2016), which uses an improved digital elevation map and replaces the GMAO's Forward Processing Instrument Teams (FP-IT) product with the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), as the source of meteorological data (Gelaro et al., 2017). Henceforth, we will refer to both version 4.00 and version 4.10 as V4, as they use exactly the same calibration algorithm, and to all version 3 data as V3. In the new V4 algorithm, the molecular normalization is applied between 36 and 39 km, where particulates are thought to be nearly absent. However, this altitude regime is near the upper limit of the CALIOP measurement range and thus has the attendant problem of significantly lower signal-to-noise ratio (SNR), necessitating substantially more averaging of the data. 
Consequently, one of the design constraints imposed on the new algorithm is that the relative uncertainty in the calibration coefficient from random errors should be of the same magnitude as in V3 (< 2 %). In this work we present an in-depth description of this new calibration strategy and provide examples documenting the improvements in the new version as a result of these changes. In particular, we repeat the validation study conducted earlier using extensive collocated measurements acquired by the Langley Research Center (LaRC) airborne High Spectral Resolution Lidar (HSRL) (Rogers et al., 2011), which shows that the bias in the CALIOP attenuated backscatter coefficients is reduced from 3.6 % ± 2.2 % in the V3 data set to 1.6 % ± 2.4 % in the V4 release. This paper describes the comprehensive updates in V4 of the CALIOP 532 nm nighttime calibration strategy described in P09. Many of the procedures and analyses described therein are still used in V4, and many of the details given in P09 are still applicable to the V4 calibration discussion. However, while these areas of continuity will be specifically identified in this paper, the detailed discussions given in P09 will not be repeated here. Instead the focus will be on describing those modifications that are unique to the V4 532 nm nighttime algorithm and on demonstrating the improved accuracy of the new calibration coefficients. The initial decision to calibrate the CALIOP nighttime 532 nm channel signals by molecular normalization at 30-34 km was dictated by the need to have sufficient molecular backscatter to provide a robust SNR (required to be at least 50 when data are averaged over 5 km vertically and 1500 km horizontally), as well as low or negligible contamination from stratospheric aerosol loading (Hostetler et al., 2006;Hunt et al., 2009;P09). The SNR requirement was easily satisfied: even after 11 years on orbit, the SNR in the 30-34 km calibration region remains comfortably above 70. 
However, the amount of aerosol loading subsequently proved problematic. Assessing the biases introduced by aerosol contamination is one of the primary tasks of the CALIPSO project's ongoing validation campaign. Given the degree of accuracy desired, validation of the CALIOP Level 1 data has always been a challenging task. Beginning early in the CALIPSO mission, extensive efforts were expended to use the European Aerosol Research Lidar Network (EARLINET) of ground-based lidars to evaluate the CALIOP Level 1 data. Using the coincident measurements (within 100 km and 2 h) from the Raman lidars operating at these stations and making use of the extinction profiles from these upward-looking Raman lidars, a CALIPSO-like attenuated backscatter profile was constructed, which was then compared with the corresponding CALIOP attenuated backscatter profiles. Using this strategy, several studies found a general underestimate in the CALIOP attenuated backscatter values in the free troposphere under clear-sky conditions (Mona et al., 2009;Mamouri et al., 2009;Pappalardo et al., 2010). While these studies pointed towards a possible issue with CALIOP calibration, there are significant issues involved in using ground-based lidars to validate satellite lidars, especially with regards to spatial and temporal matching. Gimmestad et al. (2017) pointed out that an inherent difficulty in validating CALIOP observations is the need to average over large distances along-track to sufficiently reduce the random noise in the CALIOP measurements. A more rigorous evaluation of the CALIOP calibration was possible using airborne LaRC HSRL underflights beginning early in the CALIPSO mission, using internally calibrated data from the HSRL 532 nm channel. From the early HSRL campaigns, P09 reported an underestimate of ∼ 5 % in the mean nighttime calibration and attributed this bias to the presence of stratospheric aerosols in the calibration region. Using data from many more underflights, Rogers et al. 
(2011) found an underestimate of the total attenuated backscatter measured by CALIOP of 2.7 % ± 2.1 % for nighttime data.
The aerosol contamination issue confounding the CALIOP calibration was clearly elucidated by Vernier et al. (2009), who analyzed the time sequence of attenuated scattering ratios ($R'$), defined as the ratio of the measured attenuated backscatter coefficients and the attenuated backscatter coefficients calculated from a purely molecular model:

$$R'(z) = \frac{\left[\beta_m(z) + \beta_p(z)\right] T_m^2(z)\, T_{O_3}^2(z)\, T_p^2(z)}{\beta_m(z)\, T_m^2(z)\, T_{O_3}^2(z)}. \qquad (1)$$

In this expression, backscatter coefficients are represented by $\beta_x$; two-way transmittances are represented by $T_x^2$; and the subscripts m, O$_3$ and p indicate, respectively, contributions from molecules, ozone and particulates (i.e., clouds and aerosols). The expression in the numerator defines the measured CALIOP 532 nm attenuated backscatter coefficients; the quantities in the denominator are derived from model data. At sufficiently high altitudes, where the aerosol optical depths are negligible, $T_p^2(z) \approx 1$, and in these regions the attenuated scattering ratios provide a good proxy for the true scattering ratios (i.e., $R(z) = (\beta_m(z) + \beta_p(z))/\beta_m(z)$). Vernier et al. (2009) calculated R from CALIOP 532 nm measurements over the tropics and showed anomalously low values (R < 1) above 34 km, as well as in the lower stratosphere. Since molecular normalization at 30-34 km implies R should be unity or larger at these altitudes, this finding of non-physical low biases in the CALIOP data strongly suggested flaws of some sort in the CALIOP calibration procedure. In an attempt to eliminate these biases, Vernier et al. (2009) assumed that the 36-39 km altitude region was aerosol-free and renormalized the CALIOP data set using the original R values calculated in this region. Figure 1, reproduced from Vernier et al. (2009), shows the latitude-time cross section of their adjusted calibration constant, which can be interpreted as the R that would have been measured at 30-34 km if the data had been calibrated in the 36-39 km region. 
As can be seen, only minor adjustments to the CALIOP V3 calibration are required at the midlatitudes during this time period, but adjustments of 6-12 % are necessary in the tropics. A similar problem was noted by P09, who found a persistent dip in the tropics in clear-air attenuated scattering ratios (< 1) between 8 and 12 km. This too suggested deficiencies in the original CALIOP calibration procedures.
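The relationship between the attenuated and true scattering ratios discussed above can be illustrated with a minimal Python sketch. The function names and all numerical values are hypothetical and chosen only for illustration; they are not taken from CALIOP data or from the operational code.

```python
def attenuated_scattering_ratio(beta_att_measured, beta_m, t2_m, t2_o3):
    """Measured attenuated backscatter divided by the molecular-model value."""
    model = beta_m * t2_m * t2_o3
    if model <= 0.0:
        raise ValueError("model attenuated backscatter must be positive")
    return beta_att_measured / model


def true_scattering_ratio(beta_m, beta_p):
    """True scattering ratio, (beta_m + beta_p) / beta_m."""
    return (beta_m + beta_p) / beta_m


# Illustrative values only (backscatter in km^-1 sr^-1).
beta_m = 1.0e-5              # molecular backscatter from a model
beta_p = 6.0e-7              # particulate backscatter (6 % of molecular)
t2_m, t2_o3 = 0.99, 0.98     # two-way molecular and ozone transmittances
t2_p = 1.0                   # negligible aerosol optical depth above this level

# What the lidar would measure under these assumptions.
measured = (beta_m + beta_p) * t2_m * t2_o3 * t2_p

r_att = attenuated_scattering_ratio(measured, beta_m, t2_m, t2_o3)
r_true = true_scattering_ratio(beta_m, beta_p)
print(f"attenuated ratio = {r_att:.4f}, true ratio = {r_true:.4f}")
```

With a particulate two-way transmittance of 1, the two ratios coincide, which is why the attenuated ratio serves as a proxy for the true ratio at high altitudes.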
As the mission progressed and understanding of data quality improved, it was realized that the calibration altitude could be raised to 36-39 km without compromising the quality of the data products. In order to estimate the scattering ratios expected at the increased CALIOP V4 calibration altitudes, we examined the available stratospheric measurements from other satellites. The most extensive and accurate measurements of stratospheric aerosols have come from the Stratospheric Aerosol and Gas Experiment II (SAGE II) instrument. SAGE II provided extinction coefficient profiles in the stratosphere using the solar occultation technique from 1984 through 2005 (Mauldin et al., 1985; Thoma- Damadeo et al., 2013). Between 1991 and 1996, the stratosphere was loaded with volcanic aerosols from the Pinatubo eruption, and no meaningful data are available for that period. Stratospheric aerosol information is also available from the Global Ozone Monitoring by Occultation of Stars (GOMOS) instrument, which provided data up to 2012 (Kyrölä et al., 2010). GOMOS also employs the occultation technique but observes stars rather than the Sun. Figure 2a shows the zonally averaged scattering ratio (R) at 30-34 km and 36-39 km from both SAGE II (version 7) and GOMOS (version GOPR_6_0) for the time period 2002-2005 at 532 nm. For GOMOS, the aerosol extinctions at 500 nm were converted to R at 532 nm using a stratospheric aerosol lidar ratio of 50 sr and an Ångström exponent of 1.5. A lidar ratio of 50 sr is typically used for quiet non-volcanic ("background") conditions in the stratosphere (e.g., Kremser et al., 2016), while the value of the Ångström exponent was adopted from the balloon measurements of Jager and Deshler (2002). A similar process was used to convert the SAGE II extinction data at 525 nm to scattering ratios at 532 nm. Both instruments show significant aerosol scattering ratios of 1.06-1.08 at 30-34 km in the tropics, decreasing to ∼ 1.02 in the polar regions. 
On the other hand, at 36-39 km R decreases to ∼ 1.00-1.02. GOMOS shows a low bias compared to SAGE II in both altitude ranges, with a scattering ratio of ∼ 1.01 at 36-39 km. Figure 2b shows R in these altitude ranges for 2006-2009 from GOMOS, during the first years of CALIPSO operation. SAGE II data are not available during this period, but as can be seen, the R values at 36-39 km from GOMOS are lower than those during 2002-2005, with a maximum of about 1.01 in the tropics. Assuming SAGE II data to be the reference standard for stratospheric aerosol measurements, and given the uniform underestimate of R from GOMOS as compared to SAGE II (from panel a), it is reasonable to assume a global value of 1.01 ± 0.01 for R at 36-39 km for the period of the CALIPSO mission. This value of R was therefore adopted in the CALIOP V4 algorithm to characterize the aerosol concentration at the new calibration altitude range of 36-39 km.
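The conversion from occultation extinction to a 532 nm scattering ratio can be sketched as follows. This is a minimal illustration that assumes the usual power-law (Ångström) wavelength scaling of extinction and the definition of the lidar ratio as extinction divided by backscatter; the text does not spell out the exact formula used, and the function name and input values are hypothetical.

```python
def scattering_ratio_from_extinction(alpha_ref, wavelength_ref_nm, beta_m_532,
                                     lidar_ratio_sr=50.0, angstrom=1.5):
    """Estimate the 532 nm scattering ratio from an aerosol extinction value.

    alpha_ref          aerosol extinction at the reference wavelength (km^-1)
    wavelength_ref_nm  reference wavelength (e.g., 500 nm or 525 nm)
    beta_m_532         molecular backscatter at 532 nm from a model (km^-1 sr^-1)
    """
    # Assumed power-law scaling of extinction with wavelength.
    alpha_532 = alpha_ref * (532.0 / wavelength_ref_nm) ** (-angstrom)
    # Lidar ratio = extinction / backscatter, so backscatter = extinction / S.
    beta_p_532 = alpha_532 / lidar_ratio_sr
    # Scattering ratio R = (beta_m + beta_p) / beta_m.
    return 1.0 + beta_p_532 / beta_m_532


# Illustrative inputs only; not measured values.
alpha_500 = 1.0e-5   # km^-1
beta_m = 4.0e-6      # km^-1 sr^-1
r_532 = scattering_ratio_from_extinction(alpha_500, 500.0, beta_m)
print(f"R(532 nm) = {r_532:.4f}")
```

Because the wavelength shift from 500 or 525 nm to 532 nm is small, the result is only weakly sensitive to the Ångström exponent; under these assumptions the lidar ratio has the larger influence.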

CALIOP 532 nm nighttime calibration method
Atmos. Meas. Tech., 11, 1459-1479, 2018, www.atmos-meas-tech.net/11/1459/2018/
As described in Sects. 2 and 3a of P09, the CALIOP nighttime 532 nm calibration coefficients are derived from the range-corrected, gain- and energy-normalized signals, $X(z)$:

$$X(z) = \frac{r^2\, S(z)}{E_0\, G_A}, \qquad (2)$$

where $S(z)$ is the measured backscatter signal in the 532 nm parallel channel; $r$ is the range, in kilometers, from the lidar to altitude $z$; $E_0$ is the laser pulse energy continuously measured on the platform; and $G_A$ is the electronic amplifier gain adjusted for night and day operation. The calibration coefficients, $C$ (in km$^3$ sr J$^{-1}$ count), are derived by normalizing $X(z)$ to the expected backscatter signals computed from an atmospheric scattering model at some calibration altitude $z_c$:

$$C = \frac{X(z_c)}{R(z_c)\, \beta_m(z_c)\, T_m^2(z_c)\, T_{O_3}^2(z_c)}. \qquad (3)$$

In this equation, $R(z_c)$ is the expected scattering ratio that would be measured in the 532 nm parallel channel at the calibration altitude ($z_c$); $\beta_m(z)$ is the molecular backscatter coefficient in the 532 nm parallel channel; and $T_m^2(z)$ and $T_{O_3}^2(z)$ are the two-way transmittances due to molecular scattering and ozone absorption, given by, respectively,

$$T_m^2(z) = \exp\left(-2 \int_0^{r(z)} \sigma_m(r')\, \mathrm{d}r'\right) \qquad (4)$$

and

$$T_{O_3}^2(z) = \exp\left(-2 \int_0^{r(z)} \alpha_{O_3}(r')\, \mathrm{d}r'\right), \qquad (5)$$

where $\sigma_m$ is the molecular extinction coefficient and $\alpha_{O_3}$ is the ozone absorption coefficient; the molecular and ozone quantities are computed from molecular model data obtained from NASA's GMAO. Accurate calibration of the CALIOP nighttime 532 nm data depends crucially upon this model. Previous versions of the CALIOP data products were generated using the Goddard Earth Observing System Model, Version 5 (GEOS-5) near-real-time analyses, which are created by GMAO for use by NASA satellite instrument teams. These meteorological fields were continually updated with assimilation system improvements and new data inputs. Therefore, successive versions of GMAO data products were used for different time periods during the CALIOP data record. CALIOP versions 3.01 and 3.02 used GEOS 5.2 data. Versions 3.30 and 3.40 used the FP-IT near-real-time assimilation products (GEOS versions 5.9.1 and 5.12.4). 
The initial release of the CALIOP V4 data products (version 4.00) used the FP-IT product built with GEOS 5.9.1. The current V4 release (version 4.10) uses the MERRA-2 reanalysis product (Molod et al., 2015; Gelaro et al., 2017), which has enhanced physics, including a new gravity wave drag parameterization that is capable of producing a quasi-biennial oscillation (QBO), and spans the entire CALIOP data record, from April 2006 to the present. MERRA-2 is thought to provide more accurate modeled meteorological fields because it assimilates temperature and ozone profiles retrieved from the Aura Microwave Limb Sounder (MLS) (Gelaro et al., 2017). MERRA-2 also ingests data from new observing systems and has enhanced quality control of conventional sounding data, such as radiosondes (Gelaro et al., 2017). As an example, comparison of CALIOP V3 (created using GEOS-5.2) and V4 (using MERRA-2) data for July 2010 in the calibration region for both V3 and V4, i.e., between 30 and 40 km (including all latitudes), indicates that the fractional difference (V4−V3) / V3 in molecular density varies from zero to about 1.5 %, with a mean difference of ∼ 0.7 %. The molecular backscatter coefficients between the two models will differ by the same amount. Fractional difference in ozone density (or absorption) varies from about −10 to 5 % with a mean difference of ∼ −1.7 %. The resulting total two-way transmittance changes between GEOS-5.2 and MERRA-2 vary from about −0.01 to 0.03 % with a mean difference of ∼ 0.003 %. These values can vary somewhat with latitude and season. In previous versions of the CALIOP Level 1 data, the aerosol scattering ratio in the 30-34 km calibration region was assumed to be 1; in effect, aerosol loading was assumed to make a negligible contribution to the calculated calibration coefficients. As demonstrated by Vernier et al. (2009), and as anticipated in Hostetler et al. (2006) and P09, this assumption is not valid.
In the V4 analyses, the aerosol scattering ratio at altitudes between 36 and 39 km is assumed to be 1.01±0.01, irrespective of latitude.
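The molecular normalization described in this section can be sketched in a few lines of Python. The sketch assumes the range-corrected, gain- and energy-normalized signal has the form X = r² S / (E0 G_A) and that the calibration coefficient is X divided by the modeled attenuated backscatter scaled by the assumed scattering ratio. Function names and input values are hypothetical, and the operational algorithm (see P09) includes averaging and filtering steps not shown here.

```python
def normalized_signal(signal_counts, range_km, pulse_energy_j, amp_gain):
    """Range-corrected, gain- and energy-normalized signal X."""
    return range_km ** 2 * signal_counts / (pulse_energy_j * amp_gain)


def calibration_coefficient(x_zc, beta_m_zc, t2_m_zc, t2_o3_zc, r_zc=1.01):
    """Calibration coefficient at the calibration altitude z_c.

    r_zc is the assumed scattering ratio: 1.01 for the V4 36-39 km region,
    and 1.0 reproduces the aerosol-free assumption of earlier versions.
    """
    return x_zc / (r_zc * beta_m_zc * t2_m_zc * t2_o3_zc)


# Illustrative inputs only; not real instrument values.
x = normalized_signal(signal_counts=2.0, range_km=10.0,
                      pulse_energy_j=0.5, amp_gain=4.0)
c_aerosol_free = calibration_coefficient(x, 1.0e-5, 0.99, 0.98, r_zc=1.0)
c_v4_style = calibration_coefficient(x, 1.0e-5, 0.99, 0.98, r_zc=1.01)

# Relative change caused solely by the assumed scattering ratio.
print(f"relative change = {100.0 * (c_v4_style / c_aerosol_free - 1.0):.2f} %")
```

For identical inputs, assuming R = 1.01 instead of 1 lowers the coefficient by about 1 %. The ± 0.01 uncertainty in R maps to a relative uncertainty of roughly the same size in the coefficient, since the coefficient is inversely proportional to R.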

The V4 averaging scheme
High-spatial-resolution estimates of the 532 nm nighttime calibration coefficients are generated using profiles that are averaged horizontally over each CALIPSO payload data acquisition cycle (PDAC). A PDAC specifies the minimum time interval over which each of the three CALIPSO instruments can collect an integer number of measurements. During each PDAC, CALIOP acquires backscatter data from 165 laser pulses, which translates into an along-track horizontal distance of ∼ 55 km. Equation (3) is applied to each vertical range bin within the averaged profile, and from these calculations an estimate of the mean and standard deviation (SD) of the calibration coefficient is determined for each PDAC.
To reduce uncertainties in the final estimates, the calibration coefficients obtained over individual PDACs are further averaged over some fixed spatial extent. In V3 this averaging was done by computing running averages over 27 consecutive PDACs, covering a distance of 1485 km, representing about 8 % of the typical along-track distance for the nighttime segment of the orbit. Calibration coefficients and uncertainty estimates for each laser pulse are then derived by interpolating this time series of smoothed calibration coefficients. Complete mathematical details are given in Sect. 3b of P09.
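The V3-style along-track smoothing can be illustrated with a short Python sketch. It assumes an unweighted, centered running mean in which invalid PDACs (marked `None`) are skipped and windows are truncated at the ends of the series; these choices and the names used are assumptions made for illustration, and the exact formulation is the one given in Sect. 3b of P09.

```python
PDAC_LENGTH_KM = 55.0    # approximate along-track extent of one PDAC
V3_WINDOW_PDACS = 27     # number of consecutive PDACs averaged in V3


def running_mean(values, window):
    """Centered running mean that ignores None entries.

    Windows are truncated at the start and end of the series
    (an assumption of this sketch).
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        valid = [v for v in values[lo:hi] if v is not None]
        smoothed.append(sum(valid) / len(valid) if valid else None)
    return smoothed


# Along-track distance covered by the V3 averaging window.
window_km = V3_WINDOW_PDACS * PDAC_LENGTH_KM
print(f"V3 window length = {window_km:.0f} km")

# Hypothetical per-PDAC coefficients with one invalid sample.
coefficients = [1.0, 1.2, None, 0.8, 1.0]
print(running_mean(coefficients, 3))
```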
The quality of the calibration coefficients computed in this manner depends critically on the SNR of the backscattered signal in the calibration region. Based on long-term monitoring of CALIOP's instrument performance, 532 nm parallel channel data measured within the 30-34 km calibration region used in V3 and averaged over 27 consecutive PDACs have an SNR of ∼ 75-80. Figure 3 shows 12 measured profiles of CALIOP SNR as a function of altitude. These profiles are constructed from data acquired from 2009 to 2012, covering all seasons and a wide range of latitudes. The profiles are normalized to have an SNR of 1 at 32 km (i.e., at the mid-point of the V3 calibration region). The relative SNR at 37.5 km (i.e., at the mid-point of the V4 calibration region) is ∼ 0.65; thus if the averaging procedure used in V3 were to be retained in the V4 calibration region of 36-39 km, the expected SNR of the underlying measurements would drop significantly, to ∼ 52 (for an SNR of 80 at 32 km). In other words, while raising the calibration region to reduce aerosol contamination provides a substantial decrease in calibration bias errors, random errors can increase by an even larger amount. Because the overall increase in calibration uncertainty introduced by this drop in SNR is unacceptable within the context of the CALIOP Level 2 retrievals (Young et al., 2013), a new averaging scheme was required for the V4 processing.
Simulations indicate that achieving the V3 SNR at the V4 calibration altitudes within a single orbit would require that the along-track averaging distance be increased to at least 4710 km or 86 PDACs. As this distance represents approximately one quarter of the total along-track distance covered during a nighttime orbit segment, doing this would risk smearing out legitimate time-varying changes in the calibration coefficients. An example of these effects is seen in Fig. 4, which illustrates the thermal beam steering effects that occur near the night-to-day terminator. Because these thermally induced calibration variations are highly consistent from orbit to orbit during both daytime and nighttime (Powell et al., 2008), and because longitudinal variations in molecular number density are negligible, an alternative averaging scheme was devised. For V4, high-resolution calibration samples are averaged using a two-dimensional time-space sliding window that extends across-track for 11 consecutive orbits and along-track for 11 consecutive PDACs within each orbit (i.e., 121 PDACs in all, covering a total along-track distance of 11 × 605 km = 6655 km). An assessment of multiple years of data verifies that both the instrument and the platform are sufficiently stable to permit averaging over multiple consecutive orbits. The data-averaging procedure runs autonomously during periods of continuous instrument operation. Averaging restarts are initiated for on-orbit instrument tuning events such as boresight alignments and etalon scans, and after any data acquisition interruptions (e.g., due to unfavorable space weather) that extend for more than 24 h. Although more complex to implement than the V3 approach, the V4 averaging strategy has some important advantages, most notably in high-noise regions, where higher-SNR data from adjacent orbits effectively replace the low-SNR samples that would otherwise be used.
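A minimal Python sketch of the two-dimensional averaging window is given below. It assumes the per-PDAC coefficients are stored in a grid indexed by orbit and by PDAC position within the orbit, that the window is centered on the target sample, and that invalid samples (marked `None`) are simply skipped. The centering, the edge handling and the names are assumptions made for illustration, and averaging restarts are not modeled.

```python
def window_mean_2d(grid, orbit, pdac, n_orbits=11, n_pdacs=11):
    """Mean of valid samples in an n_orbits x n_pdacs window.

    grid[o][p] holds the coefficient for PDAC p of orbit o, or None if
    that PDAC was rejected. Returns (mean, number of samples used).
    """
    half_o, half_p = n_orbits // 2, n_pdacs // 2
    samples = []
    for o in range(max(0, orbit - half_o), min(len(grid), orbit + half_o + 1)):
        row = grid[o]
        for p in range(max(0, pdac - half_p), min(len(row), pdac + half_p + 1)):
            if row[p] is not None:
                samples.append(row[p])
    if not samples:
        return None, 0
    return sum(samples) / len(samples), len(samples)


# Hypothetical grid: 20 orbits x 20 PDACs of identical coefficients,
# with one rejected sample at the window center.
grid = [[2.0] * 20 for _ in range(20)]
grid[10][10] = None

mean, n_used = window_mean_2d(grid, orbit=10, pdac=10)
print(f"mean = {mean:.3f} from {n_used} of 121 samples")

# Total along-track distance represented by a full window (55 km per PDAC).
print(f"{11 * 11 * 55} km")
```

In this arrangement a rejected or noisy sample in one orbit is compensated by samples at the same along-track position in neighboring orbits, which is the advantage noted above for high-noise regions.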

Rejecting outliers in the calibration region
Prior to averaging, the lidar signal profiles are carefully filtered in a three-step process in order to eliminate the large noise spikes that can be encountered in the calibration region.
Figure 4 (caption, partial): The marked drop-off beginning just after 15 000 km is attributed to thermal beam steering caused by warming as the satellite first enters the sunlit portion of the orbits. For nighttime, the orbit starts in the north, and the starting point of the along-track distance is at the day-night terminator.
These noise spikes are especially frequent over an extended area covering the continent of South America and adjoining South Atlantic Ocean known as the South Atlantic Anomaly (SAA), where high-energy charged particles from the Sun and cosmic rays trapped in the Van Allen belts come down to relatively low altitudes (Noel et al., 2014; Domingos et al., 2017). Because the CALIOP photomultipliers (PMTs) are not shielded against cosmic radiation, when these charged particles strike the PMT dynode chain they can generate large noise excursions (i.e., "spikes") that appear at arbitrary altitudes throughout the measured profiles. In the first step of the noise rejection process, an adaptive spike filter, outlined in Sect. 3d of P09, is used to remove the outliers from each of the 11 signal profiles (i.e., X(z), averaged to 5 km horizontally and 300 m vertically) measured within a 55 km (165 shot) PDAC. Low and high rejection thresholds are determined based on the expected molecular signal and the uncertainties from the random noise in the measurement process. Further details are given in Sect. 3d of P09. In order to accommodate the generally lower signals at the raised calibration altitudes in the new V4 scheme, the low and high uncertainty threshold values were adjusted so as to eliminate not more than about 0.15 % of the data at both low and high ends of the signal distribution at all latitudes. At least one sample in each range bin in the calibration region for any PDAC is required. Otherwise, the calibration coefficient and its uncertainty for this PDAC are labeled as invalid and excluded from further calibration processing. This is different from V3, where for each failed PDAC, a historical estimate of the calibration coefficient (daily average of all valid calibration coefficients from the previous day) was used (see P09 for details). 
As in V3, the valid data in the segments remaining from the first step are further filtered in a second step that removes additional large signal excursions using an estimated noise-to-signal ratio (NSR). The NSR is defined as the SD divided by the mean value of all the valid signals within each PDAC, and the calculated NSR is compared against an empirically derived threshold value. If the NSR value estimated from the valid signal profiles is less than the predefined threshold, then a mean "calibration-ready" profile is constructed from the valid signals. For V4, this step necessitated some careful consideration, particularly at high latitudes in both hemispheres. This is because the molecular number densities (and thus the backscattered signals) drop sharply at high latitudes in local winter. This low signal, coupled with the high incidence of radiation-induced noise (from high-energy particles) at these latitudes, often leads to anomalously high NSR values in the V4 calibration region. Applying the same NSR thresholds that were used in V3 would preferentially eliminate the low-signal/high-noise data at the new calibration altitudes of 36-39 km at these locations and times, leading to high biases in the signal data used for calculating the calibration coefficient, which in turn leads to unrealistically high calibration coefficients in these regions. These high calibration coefficients subsequently yield anomalously low attenuated scattering ratios (< 1) in the calibration region and below. In the V3 calibration region, where the SNR was considerably higher, the NSR values are better behaved, and thus the effect is much less pronounced. Figure 5 shows the NSR thresholds used in V4 as a function of the granule elapsed time (Fig. 5a) and laser footprint latitude (Fig. 5b). Granule elapsed time (in seconds) is referenced to the time at the beginning of a particular orbit segment. 
For nighttime orbits, the granule elapsed time begins in the northern latitudes at the location of the day-to-night terminator and ends in the southern latitudes where the satellite reaches the night-to-day terminator. The threshold values represent the median NSR plus 5 times the median absolute deviation (MAD) for all Level 1 data acquired during 2007-2012. The NSR thresholds are seen to vary from month to month to accommodate seasonal and latitudinal variations in atmospheric density.
The largest seasonal differences occur in the southern polar latitudes, with highest NSR thresholds in local winter, when the densities in the calibration region are lowest. The choice of this particular set of NSR filters was dictated by the requirement that the filter should minimize the difference in mean calibration coefficients over the SAA region and the non-SAA region within the same latitude band. This choice also ensured that at least 85 % of samples (for the test data sets that were used) at all latitudes were retained after filtering for a robust estimation of the calibration coefficient. As mentioned above, the data set used for testing the filters encompassed the years 2007 through 2012, thus including more than 90 % of the data available at that time.
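The NSR definition and the "median plus 5 times MAD" threshold rule described above can be sketched as follows. The function names are illustrative, the use of the sample SD (`ddof=1`) is an assumption, and the grouping of the 2007-2012 data by month and by granule elapsed time that produces the curves in Fig. 5 is omitted.

```python
import numpy as np

def nsr(valid_signals):
    """Noise-to-signal ratio: SD divided by the mean of the valid signals."""
    x = np.asarray(valid_signals, dtype=float)
    # ddof=1 (sample SD) is an assumption; the text only says "SD".
    return x.std(ddof=1) / x.mean()

def nsr_threshold(nsr_samples, n_mad=5.0):
    """Median NSR plus n_mad times the median absolute deviation (MAD)."""
    s = np.asarray(nsr_samples, dtype=float)
    med = np.median(s)
    mad = np.median(np.abs(s - med))
    return med + n_mad * mad
```

A PDAC would then pass the second filtering step when `nsr(...)` for its valid signals is below the threshold for the corresponding month and location.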
As an example, Fig. 6b shows the NSR at 36-39 km as a function of the granule elapsed time for a single granule from July 2010. NSR values remain quite uniform at ∼ 1.5-2.0 until about 1500 s. However, as the molecular number density (averaged over 36-39 km) dips over high southern latitudes, the NSR increases sharply and becomes extremely variable, with large values corresponding to low signal levels. The constant threshold of 3.31 (dashed line in blue) which would have been used by V3 eliminates a substantial fraction of samples at these latitudes. The revised latitudinally variant threshold in V4 (red line) now includes many more of these samples, rejecting only the extreme outliers, and accounts for the high NSR which occurs seasonally at these high latitudes.
In the third and final noise rejection step, an adaptive filter similar to that used in the first step is applied to the mean of the calibration-ready profile. If the mean profile passes this test, then it is used for calculation of the calibration coefficient using Eq. (3). The basic calibration algorithm over a single PDAC with the new spike filter, as mentioned above, is similar in both V3 and V4. Further details with examples of the actual filtering and the mathematical basis for computation of the calibration coefficient are available in P09.
An estimate of the efficiency of the three-step noise rejection algorithm described above may be obtained from the calibration success rate, which is just the ratio of the number of successful calibrations to the number of attempted calibrations within a specified area. Figure 7a and b show the mean single PDAC calibration success rate as a percentage of the calibration opportunities for the month of July 2010 for V3 (Fig. 7a) and V4 (Fig. 7b). Both versions have broadly similar calibration success rates over the globe, with somewhat more noise in V4, as expected due to reduced SNR from the higher calibration region. Over most of the globe, the success rate is over 90 % in both versions. However, substantially lower success rates (in blue) occur over the SAA, where the adaptive filter removes a significant number of PDACs. The minimum value of the success rate within the SAA region reaches zero. In V3, historical calibration coefficient estimates (daily average from the previous day) were used whenever a PDAC would fail any of the three filtering steps, and these historical values were included in all subsequent averaging operations (see P09 for details). The success rate also falls over Antarctica, with the V4 calibration success rate being somewhat lower than in V3. This phenomenon once again indicates the harsh radiation environment over this area, which affects the SNR particularly at higher altitudes. Figure 7c shows the spatial distribution of the difference in success rates between the two versions. Note that there are a few pixels over Antarctica where the V3 success rate was higher than V4. This is due to the different and improved noise-filtering scheme in V4. The multi-granule averaging scheme described in Sect. 2.3 is specifically designed to counterbalance the lower single PDAC success rates seen in the V4 data.
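As a small illustration of the success-rate metric mapped in Fig. 7, the sketch below computes the percentage of successful calibrations per grid cell. The function name and the gridded count arrays are hypothetical; cells with no calibration opportunities are returned as NaN rather than zero so that they are not confused with cells where every attempt failed.

```python
import numpy as np

def success_rate_percent(n_success, n_attempted):
    """Per-cell calibration success rate in percent (NaN where no attempts)."""
    n_success = np.asarray(n_success, dtype=float)
    n_attempted = np.asarray(n_attempted, dtype=float)
    rate = np.full(n_attempted.shape, np.nan)
    # Divide only where at least one calibration was attempted.
    np.divide(100.0 * n_success, n_attempted, out=rate, where=n_attempted > 0)
    return rate
```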

Calculating profiles of attenuated backscatter coefficients
Calculating the calibration coefficients and applying them to the measured profile data is a two-stage process. As described above, the first stage extracts filtered and averaged parallel channel calibration coefficients and uncertainty estimates for each PDAC in all nighttime granules. This procedure uses a two-dimensional sliding window that extends along-track for 11 PDACs and across-track for 11 contiguous nighttime granules. The results obtained from these relatively coarse spatial resolution calibration calculations are stored in a MySQL database. The second calibration stage applies the calibration coefficients to the measured data, resulting in the profiles of calibrated attenuated backscatter coefficients that are reported in the CALIOP L1 data products. For each granule, time histories of the calibration coefficients and their associated uncertainties are retrieved from the database. These data are linearly interpolated with respect to granule elapsed time, t_g, for each laser shot along the nighttime orbital track. The interpolated parallel channel calibration coefficients, C_∥(t_g), are then applied to each parallel channel signal profile, X_∥(z, t_g), as defined in Eq. (2), to obtain the profile of parallel channel calibrated attenuated backscatter coefficients (in km⁻¹ sr⁻¹):
$$\beta'_{\parallel}(z, t_g) = \frac{X_{\parallel}(z, t_g)}{C_{\parallel}(t_g)}.$$
The perpendicular channel signal profiles, X_⊥(z, t_g), are then calibrated using
$$\beta'_{\perp}(z, t_g) = \frac{X_{\perp}(z, t_g)}{C_{\perp}(t_g)},$$
where the perpendicular channel calibration coefficient, C_⊥(t_g), is the product of C_∥(t_g) and the polarization gain ratio (PGR) (P09, Eqs. 8-10). The independently calculated PGR quantifies the electronic gain and responsivity differences between the two channels. For each laser pulse, the CALIOP L1 data products report the parallel channel calibration coefficient and its corresponding uncertainty. The PGR and its uncertainty are also reported for each laser pulse, and thus the perpendicular channel calibration coefficient and its uncertainty are readily derived.
Profiles of the perpendicular channel attenuated backscatter coefficients are also recorded. However, instead of parallel channel attenuated backscatter coefficients, the CALIOP L1 products report the total attenuated backscatter coefficient profiles (in km⁻¹ sr⁻¹), which are simply the sum of the parallel and perpendicular channel contributions. Note that we have explicitly used C_∥ to denote the parallel channel calibration coefficient in this section to distinguish it from the perpendicular channel calibration coefficient; otherwise C and C_∥ have been used interchangeably.
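The second calibration stage can be sketched in a few lines. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the attenuated backscatter is taken to be the signal divided by the calibration coefficient, and `numpy.interp` stands in for the linear interpolation in granule elapsed time (it holds the end values constant outside the range of stored times, which may differ from the operational handling).

```python
import numpy as np

def calibrate_profiles(t_shot, t_cal, c_par_cal, x_par, x_perp, pgr):
    """Apply interpolated calibration coefficients to signal profiles.

    t_shot:    (n_shots,) granule elapsed time of each laser shot (s)
    t_cal:     (n_cal,) elapsed times of the stored coefficients, increasing
    c_par_cal: (n_cal,) stored parallel channel calibration coefficients
    x_par, x_perp: (n_shots, n_bins) parallel / perpendicular signal profiles
    pgr:       scalar or (n_shots,) polarization gain ratio
    """
    # Linear interpolation of C_par to the time of every laser shot.
    c_par = np.interp(t_shot, t_cal, c_par_cal)
    # Perpendicular coefficient is the product of C_par and the PGR.
    c_perp = c_par * np.asarray(pgr, dtype=float)
    beta_par = x_par / c_par[:, None]
    beta_perp = x_perp / c_perp[:, None]
    # The L1 product reports the total (parallel + perpendicular) profile.
    return beta_par, beta_perp, beta_par + beta_perp
```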
3 Assessment of CALIOP V4 calibration
Figure 8 shows the time series of the V3 and V4 calibration coefficients from 2006 through 2016. The granule average values of the coefficients have been smoothed over 10 consecutive granules. Overall, there is a decrease of ∼ 3 % from V3 to V4. Over the short term, sharp upward revisions in calibration mostly correspond to boresight alignment optimizations (marked B in Fig. 8) and etalon temperature tuning procedures (marked E in Fig. 8). These procedures take place periodically and lead to an increase in signal and a corresponding increase in calibration coefficient.
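The smoothing over 10 consecutive granules can be pictured as a simple running mean. The sketch below is an assumption about the form of the smoothing (an unweighted boxcar that drops the incomplete windows at both ends); the text does not specify these details.

```python
import numpy as np

def running_mean(values, window=10):
    """Unweighted running mean over `window` consecutive granule averages."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that are fully covered by data.
    return np.convolve(values, kernel, mode="valid")
```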
Apart from these, there were two significant one-time events that took place. First, the laser off-nadir pointing angle was changed from 0.3 to 3.0° in November 2007 (marked N in Fig. 8). Second, CALIPSO's primary laser started showing signs of degradation, and in March 2009 it was replaced by the backup laser, marked L in Fig. 8 (Winker et al., 2010b). The longer-term downward trends in the calibration coefficients are most likely due to the slow degradation of receiver components as the instrument ages. The relatively rapid decay in C over the first year of the mission is attributed to a persistently increasing wavelength mismatch between the laser transmitter and the etalon in the receiver (largely corrected by the initial retuning of the etalon in March 2008), compounded by boresight misalignment.
3.1 Overall differences between V3 and V4 calibration
Figure 9 shows the spatial distribution of the calibration coefficients for V3 and V4 for the month of October 2010. Several obvious artifacts can be seen in the V3 map. In particular, the band of high values between the Equator and about 50° N indicates the calibration biases resulting from aerosol contamination at 30-34 km. Further, the V3 calibration coefficients clearly (and wrongly) demarcate the SAA region, and individual orbital tracks are readily apparent, spuriously suggesting large orbit-to-orbit variations.
In contrast, the V4 map is much smoother and shows no indication of any latitudinally varying aerosol contamination. Similarly, the boundaries of the SAA are no longer visible, as the averaging procedure effectively compensates for the low-sampling issues over the noisy regions. The lower values of the calibration coefficient over Antarctica are due to thermal beam steering effects in the instrument that occur as the satellite first enters the sunlit portion of the orbits when approaching the night-to-day terminator (e.g., as seen in Fig. 4). Figure 10a shows the zonal mean distribution of the fractional change in the 532 nm nighttime calibration coefficients from V3 to V4 for the months of January, April, July and October 2010, representing the four seasons. The V4 calibration coefficients, obtained from measurements at 36-39 km, decrease by 2-3 % on average as compared to the V3 calibration coefficients, derived at 30-34 km. This behavior is expected because of the negligibly low aerosol contamination at 36-39 km, as shown in Fig. 2. Seasonal and interannual variations in the calibration differences occur as the aerosol loading at 30-34 km responds to the stratospheric dynamics. One important criterion for improving the calibration in V4 was to retain the same level of the estimated relative random uncertainty in the calibration coefficient. Figure 10b shows the zonal mean relative uncertainty in the calibration coefficient in V4 for the 4 months corresponding to Fig. 10a. Overall, the mean random uncertainty is less than ∼ 2 %, with higher values over the SAA region and near the poles (particularly in July and October over Antarctica) due to the radiation-induced noise in the measurements in these regions. This is on the same order of uncertainty as in V3.
Atmos. Meas. Tech., 11, 1459-1479, 2018 www.atmos-meas-tech.net/11/1459/2018/
We note, however, that there was a bug in the V3 code that caused the uncertainties reported in the L1 data products to be underestimated by a factor of 3 or more. For this reason, Fig. 10b plots only the V4 uncertainties, not the differences between V3 and V4 that are shown in Fig. 10a. One of the important signatures indicating suboptimal performance of the V3 532 nm nighttime calibration was a characteristic dip in R calculated for "clear-air" conditions in the tropics over an 8-12 km region (P09). R values less than unity are not expected under these conditions and essentially imply the existence of aerosols in the V3 calibration region. We note that in this context clear air is not required to be pristine and aerosol-free. Instead, the 8-12 km clear-air samples likely contain tenuous particulate loading at levels that lie below the layer detection threshold of CALIOP but that will still show up as elevated scattering ratios with R values in excess of the pristine clear-air R of 1.0. Figure 11 shows the clear-air R computed between 8 and 12 km for V3 (Fig. 11a) and V4 (Fig. 11b) for October 2010. Each point in this scatterplot represents a 200 km segment along the orbit which has been determined to be clear air (i.e., no cloud or aerosol layers) using the corresponding V3 and V4 Level 2 cloud and aerosol products. The red curves show median values within 2° latitude bins. Note that polar stratospheric clouds (PSCs) were additionally cleared for this plot using the currently available version (V1.0) of the CALIOP PSC product (Pitts et al., 2009), which is still based on the CALIOP V3 Level 2 data. As can be seen in Fig. 11, the strong dip in the tropics to median R < 1 that is seen in V3 data no longer appears in V4, where the median R is consistently above ∼ 1.03. This along with the general meridional uniformity of clear-air R indicates a significantly improved calibration in V4 of CALIOP data.
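The median curves in Fig. 11 are described as medians of the clear-air R values within 2° latitude bins. A minimal sketch of that binning is shown below; the function name, the bin edges running from 90° S to 90° N, and the convention of assigning a value at exactly 90° N to the last bin are assumptions made for illustration.

```python
import numpy as np

def binned_median(lat, r, bin_width=2.0):
    """Median of r in fixed-width latitude bins (NaN for empty bins)."""
    lat = np.asarray(lat, dtype=float)
    r = np.asarray(r, dtype=float)
    edges = np.arange(-90.0, 90.0 + bin_width, bin_width)
    # digitize returns 1-based bin numbers; clip keeps lat = 90 in the last bin.
    idx = np.clip(np.digitize(lat, edges) - 1, 0, len(edges) - 2)
    centers = 0.5 * (edges[:-1] + edges[1:])
    medians = np.full(centers.shape, np.nan)
    for i in np.unique(idx):
        medians[i] = np.median(r[idx == i])
    return centers, medians
```

Using the median rather than the mean keeps the curve insensitive to the occasional noisy 200 km segment.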
The V3 calibration altitude range of 30-34 km presents a useful region for V4 calibration assessment, since R was essentially forced to unity in this region in V3 and should now be different (higher) in V4. Figure 12 shows the zonal mean distribution of R averaged over 30-34 km calculated from V4 Level 1 data for January, April, July and October 2009, again representing the four seasons. The R values at 30-34 km in V4 represent an increase of between ∼ 3 and ∼ 10 % in all cases, with significant seasonal variations. V4 is now consistent with the aerosol loading and its seasonal variation at these altitudes from SAGE II and GOMOS, as seen in Fig. 2, and thus represents a significant improvement over V3. The high tropical values of R in January and April, peaking at ∼ 1.10, may be related to interannual variations in stratospheric dynamics (see Sect. 3.3 below), as was also seen in Fig. 1.
Figure 10. The fractional change from V3 to V4, (V4−V3) / V3, in the zonally averaged 532 nm calibration coefficient for 4 months in 2010 (a) and the zonally averaged relative uncertainty (ΔC532 / C532) in the V4 calibration coefficient for the same months (b).

Effects of instrumental changes on version 4 calibration
As indicated in Fig. 8, several instrument configuration changes have taken place in the CALIOP lidar since the beginning of the mission. Each of these changes results in corresponding changes in the calibration coefficient. A good metric for evaluating the calibration procedure is to ensure that these changes in calibration leave R unaffected. In this section we assess this aspect of the V4 calibration.

Laser switch
As previously mentioned, the CALIPSO payload includes both a primary laser and a backup laser. At launch, each was housed in a hermetically sealed canister filled with dry air and pressurized to 1 atm. CALIOP data production began in June 2006 using the primary laser, which was known pre-launch to have a slow leak in the canister. Over time, as the pressure decreased, the primary laser started showing anomalous behaviors resulting from coronal discharge at low pressures. As a result, the primary laser was turned off on 16 February 2009. The backup laser was subsequently activated on 12 March 2009 and has been continuously operating since then. This is the largest configuration change in the mission so far, and it led to a concomitantly large change in the calibration coefficients. This change is illustrated in Fig. 13, where panel (a) shows the zonal mean calibration coefficients for the 2 weeks immediately before (1-14 February) and immediately after (18-31 March) the laser switch, and panel (b) shows the zonal mean R values computed for the same two time periods. While the calibration coefficients are seen to be quite different, the zonal mean R values agree quite well. As there were no volcanic eruptions or other meteorological events that perturbed the distribution of stratospheric aerosols during this time period, this close R agreement is exactly what should be expected. This clearly demonstrates that the calibration algorithm correctly and automatically adapts to significant changes in instrument configuration without affecting the quality of the science data.

Off-nadir test
Another significant instrument event took place in November 2007, when the pointing angle of the lidar was changed from 0.3 to 3.0° in order to minimize the effects of specular reflections from horizontally oriented crystals in ice clouds (Noel and Chepfer, 2010). An advance test of this change was carried out between 22 August and 6 September 2007, when the pointing angle was held at 3° and then changed back to 0.3° pending the final change in November 2007. Figure 14a shows the normalized calibration coefficients before the test (4-20 August 2007), during the test (22 August-6 September 2007) and after the test (8-24 September 2007). Although not as large as the change resulting from the laser switch, significant changes in the 3° off-nadir calibration coefficients can still be discerned among the curves. Note that the calibration coefficients do not exactly revert back to the pre-test values and are somewhat lower. This is because this test took place when the primary laser was still operational and, as seen in Fig. 8, the calibration coefficients were steadily decreasing during this period. However, despite this, the zonal mean R values (Fig. 14b) at 30-34 km are all essentially coincident, thus again testifying to the robustness of the calibration algorithm.

Boresight alignment
The alignment between the CALIOP transmitter and receiver is maintained using a boresight mechanism to adjust the laser pointing direction relative to the receiver field of view to maximize the return signal. Boresight alignment is checked and adjusted periodically. The boresight alignment that took place on 7 December 2009 resulted in an unusually large adjustment to the previously computed pointing direction. Figure 15a shows zonally averaged calibration coefficients before (21 November-6 December 2009) and after (8-23 December 2009) this boresight alignment. The calibration coefficients changed significantly in response to the event. However, as can be seen in Fig. 15b, changes in stratospheric R are largely negligible and are not correlated with the changes in the calibration coefficients. At a couple of locations, the R curves show significant deviations, which could be due to some real variations in aerosol loading or noise in the data.

Representation of stratospheric aerosol
As demonstrated above, the new calibration coefficients in V4 lead to a generally upward revision of the Level 1 attenuated backscatter coefficients by 3-6 % or more, depending upon location and season. In particular, Fig. 12 indicates that variations in aerosol loading at stratospheric altitudes are robustly captured in the V4 data. This is illustrated further in Fig. 16, which shows the zonally averaged height-latitude cross sections of R in November 2007 and May 2009 for both V3 and V4. In both months, distinct structures can be observed in the V4 data in the stratospheric regions between 20 and 30 km in the tropics which are likely linked to the QBO of lower stratospheric winds between about 20 and 35 km. In November 2007, a dominant westerly shear prevailed in the stratosphere (monthly mean zonal wind at Singapore at 10 hPa = 18 m s⁻¹), leading to a characteristic double-horn structure in the tropical stratospheric aerosol distribution (Trepte and Hitchman, 1992). In the V3 map (Fig. 16a) this structure can be seen only partially, while it is much more prominent and clear in the V4 map (Fig. 16b).
On the other hand, a dominant easterly shear prevailed in the stratosphere in May 2009 (monthly mean zonal wind at Singapore at 10 hPa = −34.2 m s⁻¹), during which aerosol lofting is expected to take place in the tropics and lateral transport is inhibited (Trepte and Hitchman, 1992). The aerosol lofting is not seen in the V3 map (Fig. 16c) but is quite clearly observed in the V4 map (Fig. 16d). This illustrates the potential for V4 CALIOP data to provide important and robust information on stratospheric aerosol. A CALIOP stratospheric aerosol product is currently under development which exploits the improved V4 calibration.

Validation of V4 calibration: comparisons with HSRL measurements
The airborne HSRL developed at NASA LaRC (Hair et al., 2008) has been used throughout the CALIPSO mission to validate the CALIOP lidar calibration through an ongoing series of coincident underflights (Rogers et al., 2011). At 532 nm, the HSRL uses an internal calibration technique that avoids the aerosol contamination issues at calibration altitudes encountered by spaceborne lidars and thus can deliver highly accurate measurements (to within ∼ 1 %) of attenuated backscatter coefficients (Rogers et al., 2011). […] June 2014 were used for comparison with the coincident CALIOP measurements in clear-air conditions. For comparison with CALIOP, the total attenuated backscatter measured by the HSRL must first be corrected for the molecular and ozone attenuation between the HSRL flight altitude (typically ∼ 8-9 km above mean sea level) and the CALIOP altitude. These corrections are made using the same atmospheric model data used in deriving the CALIOP calibration coefficients. Following the protocol described in Rogers et al. (2011), the V4 CALIOP vertical feature mask (VFM) is used to exclude all profiles in which layers are detected above the HSRL aircraft altitude. Upon completion of this procedure, averaged attenuated backscatter profiles are created for both sets of measurements. The amount of horizontal averaging performed for the comparisons varies from flight to flight, and it depends upon the temporal-spatial collocation of the CALIPSO and the HSRL data sets. The vertical extent of the regions used in the comparisons also varies, depending on the geometric depth of the clear-air segments within the averaged profiles. Fractional difference profiles between HSRL and CALIOP are then calculated using
$$\text{difference}(r) = \frac{\beta_{\mathrm{HSRL}}(r) - \beta_{\mathrm{CALIOP}}(r)}{\beta_{\mathrm{HSRL}}(r)} \times 100\,\%,$$
where β_HSRL(r) is the mean of the coincident total attenuated backscatter from the HSRL at range r, referenced to the CALIOP altitude grid, and β_CALIOP(r) is the corresponding mean of total attenuated backscatter from CALIOP at range r.
For further details of the comparison methodology, the reader is referred to Rogers et al. (2011). A single difference value was estimated for each HSRL coincident underflight by averaging over the horizontal and vertical dimensions of the clear-air region. Figure 17 shows the mean biases between HSRL and CALIOP using all clear-air data from each individual underflight as a function of mean latitude for both the V3 (filled diamonds) and V4 (open circles) data sets. Most of the flights took place at the northern midlatitudes between 30 and 40° N (Fig. 17). Although the comparison covers only a limited latitude range, no obvious latitude dependence can be discerned. In general, the low bias of the CALIOP attenuated backscatter coefficients was more pronounced in V3 and has now decreased in V4, which shows a more uniform distribution of points about the zero difference line. Most of the differences from the individual flights have decreased significantly, with the exception of a few outliers. Rogers et al. (2011) pointed out a slight seasonal effect in the V3 biases with somewhat higher bias during the summer months, which might be related to enhanced stratospheric aerosol loading. The improved calibration in V4 has now generally reduced the differences during the summer months. The mean bias between the two instruments for V4 calibration using data from all the flights is 1.6 % ± 2.4 % and has decreased from 3.6 % ± 2.2 % in V3. When computing these aggregate means and SDs, the sample counts from each flight are used as weights that are applied to the per-flight means and SDs.
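The aggregation of per-flight statistics can be sketched as below. This is one literal reading of the description, namely that both the per-flight means and the per-flight SDs are averaged with the flight sample counts as weights; the exact operational formula is not given in the text, and the function name is hypothetical.

```python
import numpy as np

def aggregate_bias(flight_means, flight_sds, sample_counts):
    """Count-weighted averages of per-flight mean biases and SDs (percent)."""
    w = np.asarray(sample_counts, dtype=float)
    mean_bias = np.average(np.asarray(flight_means, dtype=float), weights=w)
    # Assumption: SDs are combined as a weighted average, not pooled variance.
    mean_sd = np.average(np.asarray(flight_sds, dtype=float), weights=w)
    return mean_bias, mean_sd
```

Weighting by sample count keeps short flights with few clear-air samples from dominating the aggregate value.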
Note that we expect the CALIOP attenuated backscatter coefficients to be slightly lower than those from HSRL, as we cannot correct the HSRL data for the attenuation from undetected aerosols (or clouds) that occurs between the CALIPSO satellite and the HSRL aircraft altitudes. The stratospheric aerosol optical depth (SAOD) at 525 nm in the tropics (20° S-20° N) between the tropopause and 40 km has been declining steadily in the post-Pinatubo period, reaching very low values of ∼ 0.003 in 2001-2002 (Kremser et al., 2016). Subsequently SAOD rose slowly because of inputs from moderate-size volcanic eruptions, leading to a value of about 0.005 on average between 2006 and 2012 (Kremser et al., 2016). Assuming then a background SAOD of 0.005 at 532 nm, the failure to correct for this attenuation would account for about 1 % of the 1.6 % mean bias estimated using V4 CALIOP data.
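The "about 1 %" figure follows from the two-way transmittance through a layer of optical depth τ, T² = exp(−2τ). The short check below evaluates the resulting signal loss for the assumed background SAOD of 0.005; the function name is illustrative.

```python
import math

def two_way_attenuation_percent(optical_depth):
    """Percent signal loss implied by the two-way transmittance exp(-2*tau)."""
    return 100.0 * (1.0 - math.exp(-2.0 * optical_depth))

# Background stratospheric aerosol optical depth assumed in the text.
loss = two_way_attenuation_percent(0.005)
print(f"two-way attenuation: {loss:.3f} %")
```

The result is just under 1 %, consistent with the share of the 1.6 % mean bias attributed above to uncorrected stratospheric aerosol attenuation.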
We note too that the V3 values reported here differ slightly from those given in Rogers et al. (2011). There are two reasons for this. First, the number of HSRL flights available for comparison has increased since 2011, and hence the sample size in the new study is somewhat larger. Second, and perhaps more important, a bug discovered in the analysis code used for the original study led to slight underestimates of the bias calculations. Further details of this bug and its remediation are given in Appendix A.

Conclusions
The 532 nm nighttime calibration is the fundamental quantity from which all other CALIOP calibration coefficients are derived, and thus it is the most important element in ensuring the robustness and overall quality of the CALIOP data products. The V4 algorithm incorporates two major changes that markedly improve the accuracy and reliability of the 532 nm nighttime calibration. First, the calibration altitude range for the nighttime parallel channel has been raised from 30-34 to 36-39 km, resulting in significantly reduced contamination from stratospheric aerosols (now at about the 1 % level) for the molecular normalization procedure. Second, a new two-dimensional averaging scheme that harvests data both along an orbit track and across multiple adjacent orbit tracks ensures that the random error in the calibration coefficients is at or below the levels reported in the V3 data products. Among other important changes are an improved noise-filtering scheme, the adoption of MERRA-2 as the meteorological model and the explicit accounting for the presence of residual aerosol in the calibration region. We have presented the salient features of the new calibration procedure and highlighted the many improvements in the V4 data arising from this new calibration. The inconsistencies in the V3 data owing to the previous calibration scheme have largely been resolved. The relative uncertainties from random noise in the V4 calibration are of the same magnitude as they were in V3, and the V4 calibration procedure is shown to correctly adjust to compensate for periodic instrument changes such as boresight alignments. The new calibration also improves the representation of stratospheric aerosols that will be exploited in future versions of the CALIOP data products. Importantly, validation of the V4 nighttime calibration coefficients using the coincident HSRL measurements at northern midlatitudes indicates an agreement to within ∼ 1.6 % ± 2.4 %, reduced from 3.6 % ± 2.2 % in V3, indicating a robust enhancement in calibration accuracy. Overall, a significant improvement in CALIOP primary calibration has been achieved in V4 which will result in corresponding improvements in the downstream Level 1 and Level 2 CALIOP products. In particular, the attenuated backscatter values increase by about 2-3 % on average, which enables increased detection of tenuous layers by the Level 2 algorithm, particularly in the stratosphere. The improvements in stratospheric aerosol retrievals will be invaluable for cross-validation of the stratospheric aerosol products from other instruments, such as the Stratospheric Aerosol and Gas Experiment III on International Space Station (SAGE III-ISS), and are expected to lead to a better understanding of climate-related issues.
Appendix A: Updated values of HSRL-CALIOP bias from Rogers et al. (2011)
When replicating the analyses of the collocated CALIOP-HSRL data set for this paper, an error was discovered in the code used to estimate the overlying two-way transmittance differences between the two sets of measurements. This error led to a small bias in the results reported in Rogers et al. (2011). We thus report here updated values for the V3 data set calculated using the corrected code. Table A1 shows the mean and the SD of the mean computed from a data set of column-averaged biases for each HSRL flight. Note that the uncorrected V3 values do not exactly match those of Rogers et al. (2011) due to slight variations in the code and flight data used. A difference of ∼ 1.3 % (corrected−uncorrected) in the mean bias is found, which represents an underestimation of the bias reported previously.
However, the results shown in this study still show a significant improvement in the calibration scheme for the V4 CALIOP data.