Interactive comment on “Information Content and Sensitivity of the 3β + 2α Lidar Measurement System for Aerosol Microphysical Retrievals”

Abstract. There is considerable interest in retrieving profiles of aerosol effective radius, total number concentration, and complex refractive index from lidar measurements of extinction and backscatter at several wavelengths. The combination of three backscatter channels plus two extinction channels (3β + 2α) is particularly important since it is believed to be the minimum configuration necessary for the retrieval of aerosol microphysical properties and because the technological readiness of lidar systems permits this configuration on both an airborne and future spaceborne instrument. The second-generation NASA Langley airborne High Spectral Resolution Lidar (HSRL-2) has been making 3β + 2α measurements since 2012. The planned NASA Aerosol/Clouds/Ecosystems (ACE) satellite mission also recommends the 3β + 2α combination. Here we develop a deeper understanding of the information content and sensitivities of the 3β + 2α system in terms of aerosol microphysical parameters of interest. We use a retrieval-free methodology to determine the basic sensitivities of the measurements independent of retrieval assumptions and constraints. We calculate information content and uncertainty metrics using tools borrowed from the optimal estimation methodology based on Bayes' theorem, using a simplified forward model look-up table, with no explicit inversion. The forward model is simplified to represent spherical particles, monomodal log-normal size distributions, and wavelength-independent refractive indices. Since we only use the forward model with no retrieval, the given simplified aerosol scenario is applicable as a best case for all existing retrievals in the absence of additional constraints. Retrieval-dependent errors due to mismatch between retrieval assumptions and true atmospheric aerosols are not included in this sensitivity study, and neither are retrieval errors that may be introduced in the inversion process. 
The choice of a simplified model adds clarity to the understanding of the uncertainties in such retrievals, since it allows for separately assessing the sensitivities and uncertainties of the measurements alone that cannot be corrected by any potential or theoretical improvements to retrieval methodology but must instead be addressed by adding information content. The sensitivity metrics allow for identifying (1) information content of the measurements vs. a priori information; (2) error bars on the retrieved parameters; and (3) potential sources of cross-talk or "compensating" errors wherein different retrieval parameters are not independently captured by the measurements. The results suggest that the 3β + 2α measurement system is underdetermined with respect to the full suite of microphysical parameters considered in this study and that additional information is required, in the form of additional coincident measurements (e.g., sun-photometer or polarimeter) or a priori retrieval constraints. A specific recommendation is given for addressing cross-talk between effective radius and total number concentration.


Introduction
Aerosol effects on global and regional climate and human health depend on aerosol amount, vertical distribution, and proximity to clouds, as well as the composition, size and absorption properties of the aerosol. The NASA Aerosol/Clouds/Ecosystems (ACE) mission (https://acemission.gsfc.nasa.gov/index.html) recommended by NRC's 2007 Earth Decadal Study (National Research Council, 2007) is currently in pre-formulation stage and aims to produce a comprehensive set of vertically and horizontally resolved aerosol properties as a function of time and location. This dataset will be used to constrain aerosol transport models and model estimates of globally averaged direct aerosol radiative forcing, not just at the top of the atmosphere but also near the surface and within the atmosphere. The mission therefore addresses the quantification of (1) aerosol sources, sinks and transport, (2) direct aerosol forcing, and (3) aerosol-cloud interactions (ACE Science Working Group, 2010).
To achieve these goals, ACE is planned to include a multi-wavelength high spectral resolution lidar (HSRL) and a multi-wavelength, multi-angle imaging polarimeter from which vertically resolved aerosol microphysical retrievals will be made. While passive polarimeter measurements can provide accurate retrievals of column averaged microphysical properties (Dubovik et al., 2011; Hasekamp et al., 2011), only lidar measurements can provide the vertical resolution required. The combination of three backscatter and two extinction wavelengths (3β + 2α) for the lidar is considered to be the minimum number of channels required for an aerosol microphysical retrieval (Bockmann et al., 2005; Veselovskii et al., 2002) based on a heritage of aerosol microphysical retrievals from ground-based Raman measurements of varying wavelength combinations (e.g., Müller et al., 1999; Bockmann, 2001; Donovan and Carswell, 1997). Accordingly, the ACE plan calls for an HSRL to measure the aerosol backscatter coefficient at 355, 532, and 1064 nm and the aerosol extinction coefficient at 355 and 532 nm. This combination is frequently referred to as "3β + 2α" lidar. The NASA Langley airborne HSRL-2 is one prototype for the ACE lidar.
There exist various aerosol microphysics retrievals based on 3β + 2α lidar measurements (e.g., Bockmann, 2001; Veselovskii et al., 2002, 2012; Chemyakin et al., 2014; Müller et al., 1999). In general, these retrievals are performed for each vertically resolved altitude level (grid-point) in the lidar profile, on a single set of three backscatter and two extinction measurements at a time, with each altitude level being treated independently.
The inversion with regularization retrieval (Müller et al., 1999; Veselovskii et al., 2002) is the standard algorithm used for 3β + 2α retrievals. Mie theory kernels link the lidar optical measurements of aerosol backscatter and extinction coefficients with aerosol size distributions, which are represented as a combination of five to eight triangular basis functions.
The size distribution is retrieved using inversion with regularization for a given complex refractive index and set of minimum and maximum particle sizes (integration limits). The integration limits and complex refractive index are then varied over a range of values that are typically found for aerosols, for example between 30 nm and 8 µm for the integration limits, between 1.325 and 1.8 for the real part of the refractive index, and between 0 and 0.1 for the imaginary refractive index (the limits of the search space vary somewhat for different authors). Specific solutions (sets of values for the size distribution parameterization and complex refractive index) are selected based on limiting the amount of discrepancy between the measurements and the backscatter and extinction coefficients reproduced from the Mie solutions. A few hundred individual solutions with different integration limits and refractive indices are then averaged together to provide the mean value and error bars for the final solution. The process of averaging multiple solutions together adds stability to the retrieval (Veselovskii et al., 2002). The inversion with regularization retrieval was demonstrated with airborne HSRL-2 measurements by Müller et al. (2014).
The linear estimation method (Veselovskii et al., 2012) solves for the particle size distribution represented as a linear combination of the measurement kernels. Only the total integrated number concentration is retrieved rather than the full size distribution. The refractive index is retrieved by iteration, solving the equation for an assumed refractive index and minimizing the resulting systematic error. The systematic error to be minimized is estimated by using only four of the measurements to attempt to reproduce the fifth and repeating for all five measurements. Like the inversion with regularization technique, the final solution is an average of a family of individual solutions.
The arrange and average method (Chemyakin et al., 2014) is a simplified version of the 3β + 2α retrieval which is particularly helpful for experimental work in understanding retrieval behavior (Chemyakin et al., 2016). This methodology makes use of a pre-computed look-up table (LUT), simplifying the exploration of the full space of possible solutions. The LUT used in the present study and by Chemyakin et al. (2014) has only monomodal log-normal size distributions. Since the complex refractive index is also included in the LUT and therefore treated identically to the size distribution parameters in this retrieval, all parameters are retrieved simultaneously and the relationships between retrieval parameters are more transparent. Solutions are selected from the LUT that match the optical measurements to within a small discrepancy.
While it has been demonstrated that 3β + 2α lidar measurements can yield accurate retrievals of aerosol microphysical parameters that agree with in situ measurements of effective radius and total integrated number concentration (Müller et al., 2014), it is also understood that this retrieval is underdetermined (Veselovskii et al., 2002; Bockmann et al., 2005; Pérez-Ramírez et al., 2013; Chemyakin et al., 2016). Therefore, characterization of the aerosol microphysical state parameters requires additional information or constraints beyond the lidar measurements. In general, constraints can take various forms, including smoothing, regularization, a priori state values, or limits on the ranges of values that the retrieved parameters can take (Rodgers, 2000, chapter 10).
Previous studies of the 3β + 2α lidar retrieval system also point out the difficulty of retrieving the complex refractive index in particular (Veselovskii et al., 2012; Müller et al., 2014; Pérez-Ramírez et al., 2013). The 3β + 2α retrievals represent the relationship between the measured optical properties and the particle size distribution as Fredholm integral equations of the first kind, with known limits of integration and known complex refractive index. Consequently, the complex refractive index is generally assumed based on context, or else varied in a separate minimization process (Müller et al., 1999; Veselovskii et al., 2012), which makes the retrieval performance and sensitivities complicated to assess.
These challenges have been acknowledged and addressed in the existing retrievals, but there is still relatively little published discussion about the true sensitivities of the 3β + 2α lidar measurements with respect to aerosol microphysical parameters of interest, or about what those sensitivities imply for the need for additional information content in the retrievals. We wish to rigorously and quantitatively deepen our understanding of the information content of the retrieval system by determining how much information in the retrieval stems from the lidar measurements themselves and, conversely, which information is provided only by constraints or a priori information. The results of this sensitivity study will also clarify how other measurements (e.g., polarimeter or sun-photometer measurements) may significantly add to the information content. This study therefore supports ongoing work to implement a full combined active + passive (lidar + polarimeter) vertically resolved aerosol retrieval and to understand the retrieval limitations in situations where only lidar data are available (i.e., night side of the orbit or gaps in broken cloud systems). These studies will also help refine measurement requirements and determine retrieval uncertainties for ACE or other future measurement systems.
An ideal framework for a study of retrieval sensitivity and information content is optimal estimation (OE). OE, based on Bayesian statistics, is a formalized framework for combining measurements, measurement errors, external information, and constraints. Thoroughly described by Rodgers (2000), it provides a number of key tools for characterizing the sensitivities and information content of a retrieval system. For example, Knobelspiesse et al. (2012) use the Shannon information content and the propagated retrieval errors to characterize the capabilities of a multi-angle, multi-wavelength polarimeter for aerosol microphysics retrieval. Xu and Wang (2015) analyze the information content of AERONET measurements with respect to aerosol microphysics retrievals using the propagated retrieval errors and degrees of freedom (DOF) of the signal. Veselovskii et al. (2005) also discuss an assessment of the information content and retrieval uncertainties of the 3β + 2α lidar measurements using an eigenvalue analysis based on work by Twomey (1977), which, like the OE framework, allows for an assessment of the information content in a way that is mostly independent of any retrieval methodology.
In other words, the diagnostics for sensitivities and information content in the OE framework do not depend on completing a retrieval. Rather, they depend only on retrieval inputs: the forward model, measurement uncertainties, and the a priori constraints. Therefore, although the lidar retrieval algorithms described above are not OE algorithms, these tools can nevertheless be usefully applied to this problem to provide implementation-independent best-case sensitivity metrics. Unlike a perturbation method, the strategy of performing the sensitivity study using only the forward model allows for mapping out the entire state space relatively quickly, without the need for time-consuming retrievals. In addition, since the OE method is a matrix method, the measurement covariance matrix is handled as a single object, taking into account measurement errors in all channels simultaneously, without requiring simplifying assumptions such as an additive property (Pérez-Ramírez et al., 2013). Finally, the OE method provides a formalized means of representing the retrieval constraints, a critical part of an underdetermined retrieval like this, but one which is not well represented using a perturbation sensitivity study or the eigenvalue approach of Veselovskii et al. (2005) and Twomey (1977).
In this study, we use an LUT approach to simplify the forward model and set the stage for a retrieval-independent study of sensitivity and information content of the 3β + 2α lidar measurement system with respect to a small set of aerosol microphysical parameters. While the simplifications necessarily ignore some errors that would occur in a generic real-world aerosol situation, this strategy provides a transparent and rigorous view of the basic sensitivities for this retrieval problem that is applicable to any retrieval with the same measurement inputs, as long as the retrieval assumptions are no more restrictive than those consistent with the very simplified aerosol under consideration. Note that retrievals may also include additional errors that depend on the method used to converge to a solution, which, again, are not included in this assessment.
In Sect. 2 we describe the overall methodology for our sensitivity study and in Sect. 3 we describe the specific cases used for illustration in this paper. In Sect. 4 we give a brief demonstration of the sensitivity of the 3β + 2α lidar measurement system to the microphysical aerosol properties (the state parameters). Then in Sects. 5 and 6 we delve into specific metrics provided by the OE toolset: the DOF of the signal (Sect. 5) and the propagated state errors (Sect. 6). In Sect. 7 we expand the discussion of the propagated state errors by discussing the sensitivity to different levels of measurement uncertainty. Section 8 revisits the propagated state covariance matrix with a new focus on the correlation terms. Section 9 discusses correlation in additional detail in terms of cross-talk between state parameters and gives a recommendation for resolving some of the ambiguity in 3β + 2α retrievals. Section 10 provides a brief look at the effect of using volume distribution kernels. Section 11 provides a summary and outlook.

S. P. Burton et al.: Information content of lidar aerosol microphysical retrievals

Methodology
With this study, we wish to develop a deeper understanding of the information content and sensitivities of the 3β + 2α measurement system in terms of aerosol microphysical parameters of interest, namely the complex refractive index, total number concentration, and a parameterization of the size distribution. The retrieval methodologies for this inversion system tend to be fairly complicated, particularly due to the difficulty in solving for the complex refractive index. For this study, we aim to determine the basic sensitivities common to all 3β + 2α lidar retrievals using a methodology that is independent of any retrieval. We accomplish this by calculating the information content and uncertainty metrics using only a forward model, with no explicit inversion.
The measurements for these retrievals are bulk aerosol extinction and backscatter coefficients measured by an HSRL or Raman lidar system. They are related to the particle size distribution and complex refractive index of the volume of aerosols by this general relationship:

$$g_{i,\lambda} = \int_{r_\mathrm{min}}^{r_\mathrm{max}} K_i(r, \lambda, m)\, f(r)\, \mathrm{d}r + \varepsilon, \tag{1}$$

where $g_{i,\lambda}$ represents a lidar measurement of either backscatter or extinction coefficient at wavelength $\lambda$. The function $f(r)$ represents the aerosol size distribution, which is a function of $r$, the particle size, and is integrated between the minimum and maximum particle sizes $r_\mathrm{min}$ and $r_\mathrm{max}$. $K_i$ represents the extinction and backscatter measurement kernels, which are dependent on particle size, wavelength, and the complex refractive index, $m$. The measurements also include some measurement error, $\varepsilon$.
Equation (1) is of the following general form:

$$\mathbf{y} = \mathbf{F}(\mathbf{x}) + \boldsymbol{\varepsilon}, \tag{2}$$

in which $\mathbf{F}$, the forward function, relates the vector of state parameters, $\mathbf{x}$, to the vector of measurements $\mathbf{y}$. Comparing Eq. (2) with Eq. (1), the vector of measurements $\mathbf{y}$ in Eq. (2) is composed of $g_{i,\lambda}$, the five lidar measurements of backscatter and extinction. The state vector $\mathbf{x}$ in Eq. (2) comprises the complex refractive index $m$ and variables describing the size distribution $f(r)$.
If the forward model is linear or can be linearized, then Eq. (2) can be written as the following matrix equation:

$$\mathbf{y} = \mathbf{J}\mathbf{x} + \boldsymbol{\varepsilon}, \tag{3}$$

where $\mathbf{J}$, the Jacobian matrix, relates the state vector $\mathbf{x}$ to the measurement vector $\mathbf{y}$. Rodgers (2000) describes the generalized inverse problem, the OE methodology for solving it, and also a number of useful diagnostics for assessing the information content and retrieval errors. Although the existing lidar aerosol microphysical retrievals solve the generalized inverse problem in various ways not limited to OE, the metrics described by Rodgers (2000) are useful for the retrieval-free information assessment in this project. These include the scalar DOF metric and the state error covariance matrix, propagated from the measurement errors. To calculate these metrics, it is necessary to have the weighting function matrix or Jacobian matrix, $\mathbf{J}$, whose elements are the partial derivatives of the forward model elements with respect to the state vector elements.
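As an illustration of how such diagnostics follow from the Jacobian alone, the sketch below uses the standard linear-Gaussian expressions from Rodgers (2000): the propagated state covariance $\hat{\mathbf{S}} = (\mathbf{J}^T \mathbf{S}_\varepsilon^{-1} \mathbf{J} + \mathbf{S}_a^{-1})^{-1}$, the averaging kernel $\mathbf{A} = \hat{\mathbf{S}}\,\mathbf{J}^T \mathbf{S}_\varepsilon^{-1} \mathbf{J}$, and the DOF as the trace of $\mathbf{A}$. The function name `oe_diagnostics` and the toy 3 × 2 Jacobian are assumptions made for this example only; they are not the matrices used in this study.

```python
import numpy as np

def oe_diagnostics(J, S_eps, S_a):
    """Linear-Gaussian OE diagnostics (Rodgers, 2000); no retrieval is performed.

    J      : (n_meas, n_state) Jacobian
    S_eps  : (n_meas, n_meas) measurement error covariance
    S_a    : (n_state, n_state) a priori covariance
    Returns the propagated state covariance, the averaging kernel, and the DOF.
    """
    fisher = J.T @ np.linalg.inv(S_eps) @ J              # J^T S_eps^-1 J
    S_hat = np.linalg.inv(fisher + np.linalg.inv(S_a))   # propagated state covariance
    A = S_hat @ fisher                                   # averaging kernel
    return S_hat, A, np.trace(A)

# Toy problem (hypothetical numbers): three measurements, two state variables.
J = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
S_eps = np.diag([0.1**2] * 3)   # uncorrelated measurement errors
S_a = np.diag([1.0, 1.0])       # uncorrelated prior
S_hat, A, dof = oe_diagnostics(J, S_eps, S_a)
print(dof)
```

A DOF close to the number of state variables indicates that the measurements, rather than the prior, determine the state; off-diagonal terms of `S_hat` indicate correlated (compensating) errors between state variables.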
To generate a Jacobian matrix for the purpose of the sensitivity study, we first simplify the problem by assuming single scattering processes from spherical particles, monomodal log-normal size distributions, and wavelength-independent refractive indices. The assumption of wavelength-independent refractive indices has been used in all 3β + 2α lidar aerosol microphysical inversions to date (Müller et al., 1999; Veselovskii et al., 2002; Bockmann et al., 2005; Chemyakin et al., 2014) and is necessitated in part by lack of knowledge of the wavelength dependence of the complex refractive index for real aerosols. However, some types of aerosols may have a complex refractive index with significant spectral dependence (Veselovskii et al., 2016).
The assumption of monomodal log-normal size distributions is used by the arrange and average algorithm (Chemyakin et al., 2014) but not the inversion with regularization algorithm (Müller et al., 1999), the hybrid regularization method (Bockmann et al., 2005), or the linear estimation method (Veselovskii et al., 2012). The retrievals which do not make this assumption can retrieve more general size distribution shapes of which the monomodal log-normal can be seen as a special case. Similarly, the assumption of spherical particles is generally found in these retrievals due to limitations in the accuracy of non-spherical models for lidar measurements, but some retrieval studies (Veselovskii et al., 2010, 2016) have allowed limited retrievals for non-spherical particles with more generalized assumptions about shape. Our forward model adopts the most restrictive assumptions used by any of these retrievals; therefore, it has the fewest unknown state parameters. That is, we are characterizing the retrieval of aerosols that conform perfectly to the most restrictive forward model assumptions. The same set of measurements would have less information content with respect to a forward model with more unknown state parameters. Additionally, mismatch between retrieval assumptions and true atmospheric aerosols will generate errors which are not assessed by this analysis and which will be retrieval dependent. Sensitivity studies to assess the measurement content with respect to more complex aerosol scenarios (specifically bimodal size distributions) are part of our ongoing work.
Consistent with the assumption of spherical particles and single scattering processes, we use Mie kernels, which are calculated with code from Bohren and Huffman (1983). The size distributions are represented as monomodal log-normal size distributions characterized by the total number concentration, $N$; median radius, $r_\mathrm{med}$; and geometric standard deviation, $s$. The mode width is the natural logarithm of $s$.
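For concreteness, a monomodal log-normal number distribution with these three parameters can be written in the usual form $\mathrm{d}N/\mathrm{d}\ln r = \frac{N}{\sqrt{2\pi}\,\ln s}\exp\!\left[-\frac{(\ln r - \ln r_\mathrm{med})^2}{2\ln^2 s}\right]$. The sketch below evaluates this form and checks numerically that it integrates to $N$ over radii from 1 nm to 50 µm. The function names are hypothetical, radii are assumed to be in µm, and the parameter values are those quoted for Case 1.

```python
import math

def lognormal_dn_dlnr(r, n_total, r_med, s):
    """dN/dln(r) of a monomodal log-normal distribution (r and r_med in the same units)."""
    ln_s = math.log(s)                      # mode width
    z = math.log(r / r_med) / ln_s
    return n_total / (math.sqrt(2.0 * math.pi) * ln_s) * math.exp(-0.5 * z * z)

def total_number(n_total, r_med, s, r_min=1e-3, r_max=50.0, n_steps=4000):
    """Trapezoid-rule integral of dN/dln(r) over ln(r), from r_min to r_max (µm)."""
    x0, x1 = math.log(r_min), math.log(r_max)
    dx = (x1 - x0) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        weight = 0.5 if k in (0, n_steps) else 1.0   # half weight at the end points
        total += weight * lognormal_dn_dlnr(math.exp(x0 + k * dx), n_total, r_med, s)
    return total * dx

# Case 1 values: N = 1100 cm^-3, r_med = 0.12 µm, s = 1.48
n_check = total_number(1100.0, 0.12, 1.48)
print(n_check)
```

Integrating in $\ln r$ rather than $r$ keeps the grid evenly spaced across the several decades of particle size spanned by the integration limits.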
In all, five state parameters are used in this study: the median radius and geometric standard deviation of the monomodal log-normal size distribution, the total number concentration, and the complex refractive index (real and imaginary parts). From these, the extinction and 180° backscatter are calculated from Eq. (1) at the wavelengths measured by the 3β + 2α lidar system, which are 355 and 532 nm for extinction and 355, 532, and 1064 nm for backscatter. The integrals are performed for values of $r$ from 1 nm to 50 µm. The state parameters and the output extinction and backscatter values are saved in the form of an LUT for a wide range of state variable values meant to conservatively include realistic aerosol states. The original LUT was developed by Chemyakin et al. (2014), who describe it in more detail. The version of the LUT used for this study includes median radii from 17 to 605 nm; geometric standard deviations from 1.425 to 2.625; real refractive indices from 1.37 to 1.75; and imaginary refractive indices including 0 plus increments from 0.00025 to 0.10175. For the purpose of this study, we have also included total number concentration as a fifth dimension. The range of total number concentration values in the modified LUT is 1 to 40 000 cm⁻³.
For our purposes, the Jacobian matrix is calculated from the LUT using finite differences, using the increments of the LUT itself. The use of finite differences amounts to an assumption that the increments are small enough that the derivatives are locally linear. Testing with both smaller and larger increments confirms that the derivatives are insensitive to the size of the increments from about one-tenth the size of the increments used to at least about 5 times the size used. However, the derivatives and associated retrieval sensitivities are not constant across the entire state space. Therefore, the Jacobians and the metrics describing information content and error propagation have been calculated for several specific realistic cases and also over multiple continuous slices of the hypercube defined by the five state variables, to develop a sense of how these metrics vary over the state space.
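The finite-difference step can be sketched as follows. This is a minimal illustration that assumes a central difference over neighboring LUT grid points (the text does not specify whether central or one-sided differences are used) and substitutes a toy two-variable "LUT" with analytic entries for the Mie-based table. The function name `lut_jacobian` and the toy forward function are hypothetical.

```python
import numpy as np

def lut_jacobian(lut, axes, idx):
    """Central finite-difference Jacobian at an interior LUT grid point.

    lut  : array of shape (n_1, ..., n_k, n_meas), forward-model outputs on the grid
    axes : list of k 1-D arrays with the grid values of each state variable
    idx  : tuple of k interior grid indices at which to evaluate the derivatives
    """
    n_meas = lut.shape[-1]
    J = np.empty((n_meas, len(axes)))
    for k, grid in enumerate(axes):
        up, down = list(idx), list(idx)
        up[k] += 1                       # neighbor one LUT increment above
        down[k] -= 1                     # neighbor one LUT increment below
        step = grid[idx[k] + 1] - grid[idx[k] - 1]
        J[:, k] = (lut[tuple(up)] - lut[tuple(down)]) / step
    return J

# Toy table (not the Mie LUT): y1 = a*b, y2 = a + b^2 on a regular grid.
a = np.linspace(1.0, 2.0, 11)
b = np.linspace(0.0, 1.0, 11)
A_grid, B_grid = np.meshgrid(a, b, indexing="ij")
toy_lut = np.stack([A_grid * B_grid, A_grid + B_grid**2], axis=-1)

J_toy = lut_jacobian(toy_lut, [a, b], (5, 5))   # evaluated at a = 1.5, b = 0.5
print(J_toy)
```

Repeating the call over a range of grid indices gives the variation of the Jacobian, and of any metric derived from it, across a slice of the state space.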
Although the published aerosol microphysical retrievals referenced in the introduction solve the inverse problem in various ways, the LUT can be thought of as a generalized realization of the forward function, given the simplifications described above. Since the calculation of the sensitivity and error metrics (Rodgers, 2000) depends on the forward function but not on any explicit retrieval, the LUT can be used to assess the 3β + 2α measurement sensitivities with respect to aerosol microphysical retrievals, independent of any particular retrieval strategy, not just the arrange and average retrieval for which the LUT was developed.
Besides the Jacobian matrix, the sensitivity calculations also require the measurement covariance matrix, which depends on the observation system. We use a simple but realistic matrix to describe the measurement errors for this study, modeling the uncertainties as constant, normally distributed relative values with standard deviation of nominally 20 % for the extinction coefficients and 5 % for the backscatter coefficients, and with no correlations between the uncertainties in each channel. Zero or near-zero correlation for the uncertainties between channels is realistic for lidar, for which uncertainties are primarily from random processes (e.g., shot noise) and channel-specific systematic sources (e.g., uncertainty in the filter transmittance). The uncertainty levels used in this study are chosen as realistic targets for a space-based lidar system, based on existing HSRL-2 technology (Hair et al., 2008; Burton et al., 2015). Later in this study (Sect. 7), we explore a few other benchmark values of measurement uncertainties. In reality, uncertainty will not be constant for all aerosol scenarios, but for the purpose of this study, a few benchmark values are sufficient to explore the sensitivities.
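A diagonal measurement covariance matrix of this kind can be assembled as in the sketch below. The channel ordering and all signal values except the 532 nm extinction of 0.092 km⁻¹ (quoted for Case 1) are placeholders chosen for illustration, and the function name is hypothetical.

```python
import numpy as np

def measurement_covariance(y, rel_sigma):
    """Diagonal covariance from relative 1-sigma uncertainties; no inter-channel correlation."""
    sigma = np.asarray(rel_sigma) * np.asarray(y)   # absolute standard deviations
    return np.diag(sigma**2)

# Assumed channel order: beta_355, beta_532, beta_1064, alpha_355, alpha_532.
# Only alpha_532 = 0.092 km^-1 comes from the text; the other values are illustrative.
y = np.array([0.004, 0.002, 0.0008, 0.15, 0.092])
rel_sigma = [0.05, 0.05, 0.05, 0.20, 0.20]          # 5 % backscatter, 20 % extinction
S_eps = measurement_covariance(y, rel_sigma)
print(np.sqrt(np.diag(S_eps)))
```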
The third input needed for these calculations is the a priori covariance matrix. This matrix represents the uncertainty of the prior knowledge of the state. The diagonal terms represent the variance and are chosen so that the standard deviation is represented as one half of the full range in the LUT for each state variable. The off-diagonal terms represent the correlation or covariance between state variables; we assume zero correlation in the a priori. These large prior variances and zero correlations are an intentionally conservative choice. For an actual retrieval, prior information about aerosol type and real aerosol variability would typically be used to decrease these prior variance terms, which can certainly decrease the uncertainty in the final result. Likewise, if it were known a priori that the state variables were correlated, this could also be used to decrease the uncertainty in the final result. However, since our aim is primarily to assess the information content of the measurements themselves, we use conservative prior variance and covariance values for the sensitivity study. We recognize that the state variables are not normally distributed in reality, although the OE formalism makes the assumption that they are (and that the measurement errors likewise are normally distributed). A more advanced strategy would be to use the Markov Chain Monte Carlo method (Posselt and Mace, 2014), which allows for generalized error distributions. However, for this initial study, we use the more straightforward OE method and partially compensate by choosing conservatively large prior variance values.
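Using the LUT ranges quoted earlier for the five state variables, the prior covariance described here can be sketched as follows. The dictionary keys and variable ordering are assumptions made for the example.

```python
import numpy as np

# Full LUT range of each state variable, as quoted in the text (ordering is an assumption).
lut_ranges = {
    "r_med_nm": (17.0, 605.0),     # median radius, nm
    "s":        (1.425, 2.625),    # geometric standard deviation
    "m_real":   (1.37, 1.75),      # real refractive index
    "m_imag":   (0.0, 0.10175),    # imaginary refractive index
    "N_cm3":    (1.0, 40000.0),    # total number concentration, cm^-3
}

# Prior standard deviation = one half of the full LUT range; zero prior correlation.
sigma_a = np.array([0.5 * (hi - lo) for lo, hi in lut_ranges.values()])
S_a = np.diag(sigma_a**2)
print(dict(zip(lut_ranges, sigma_a)))
```

Because the state variables differ by orders of magnitude in scale, it can be convenient to compare propagated uncertainties to these prior standard deviations as ratios rather than in absolute units.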

Case definitions
In describing the calculation of the metrics, we will illustrate the procedures and interpretation using five particular sets of values in the state space, which we collectively call "the reference cases." The values of the state variables for the five reference cases are given in Table 1, as well as values of effective radius, effective variance, single scattering albedo (SSA), and lidar ratio, which are calculated from the state variables. For a log-normal distribution, the effective radius, $r_\mathrm{eff}$, and effective variance, $v_\mathrm{eff}$, can be expressed as analytical functions of $r_\mathrm{med}$ and $s$:

$$r_\mathrm{eff} = r_\mathrm{med} \exp\!\left(\tfrac{5}{2} \ln^2 s\right),$$

$$v_\mathrm{eff} = \exp\!\left(\ln^2 s\right) - 1.$$

The SSA is calculated from the state variables using Mie theory.
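As a quick numerical check, the standard log-normal relations $r_\mathrm{eff} = r_\mathrm{med}\exp(\tfrac{5}{2}\ln^2 s)$ and $v_\mathrm{eff} = \exp(\ln^2 s) - 1$ can be evaluated for the Case 1 values quoted below (median radius 0.12 µm, geometric standard deviation 1.48). The function names are hypothetical; small differences from the tabulated values are expected because the quoted inputs are rounded.

```python
import math

def effective_radius(r_med, s):
    """Effective radius of a log-normal distribution (same units as r_med)."""
    return r_med * math.exp(2.5 * math.log(s) ** 2)

def effective_variance(s):
    """Effective variance of a log-normal distribution (dimensionless)."""
    return math.exp(math.log(s) ** 2) - 1.0

# Case 1: r_med = 0.12 µm, s = 1.48 (text quotes r_eff = 0.17 µm, v_eff = 0.16)
r_eff = effective_radius(0.12, 1.48)
v_eff = effective_variance(1.48)
print(round(r_eff, 3), round(v_eff, 3))
```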
The first of the reference cases has been constructed to approximately reflect a real measurement scenario encountered during the DOE TCAP (Two-Column Aerosol Project) field mission by HSRL-2 (Berg et al., 2015; Müller et al., 2014); the parameters model a plume of urban outflow. The complex refractive index for this constructed case is 1.47 − 0.00325i, corresponding to a very weakly absorbing aerosol with SSA value of 0.98 at 532 nm. The aerosol is composed of accumulation-mode particles; the constructed monomodal size distribution has effective radius of 0.17 µm and effective variance of 0.16, or median radius 0.12 µm and geometric standard deviation of 1.48. The total number concentration is moderate, with a value of 1100 cm⁻³.
For the other four reference cases, we vary the state variables in sets.Cases 2 and 3 have the same complex refractive index as Case 1, but different size distributions.For Case 2, the effective radius and effective variance are somewhat larger at 0.24 µm and 0.23, respectively (median radius = 0.15 µm and geometric standard deviation = 1.58).Like Case 1, this size distribution is considered fine mode.For Case 3, the effective radius and effective variance are 1.60 µm and 1.27, respectively (median radius is 0.20 µm and geometric standard deviation is 2.48).The total number concentration is 50 cm −3 .The total number concentration is much lower than cases 1 and 2, but the larger particles are more scattering and therefore the signal levels are comparable.For comparison, the 532 nm extinction value is 0.092 km −1 for Case 1 and 0.084 km −1 for Case 3.This larger size distribution approximately reflects a coarse-mode marine aerosol, although the complex refractive index is not necessarily appropriate for marine aerosol.Since we will be using these cases to understand the dependencies of the retrieval sensitivities on the state space, we choose to vary the state variables relating to the size distribution separately from those relating to the complex refractive index.
Case 4 has a size distribution equal to Case 1, but larger real and imaginary refractive index values of 1.61 and 0.03, respectively.For this size distribution, this complex refractive index corresponds to a 532 nm SSA value of 0.89.This can be thought of as similar to a biomass burning plume.
Case 5 is similar to Case 4 in everything except total number concentration.Now the total number concentration has been increased dramatically to 20 000 cm −3 , approximating as a very intense smoke plume.
These five cases will be used for illustrating the results of the sensitivity analysis, starting in Sect. 5.

Dependence of lidar intensive variables on aerosol microphysical parameters
First, to build an intuition of the information content encoded within the 3β + 2α dataset, we briefly examine the dependence of some of the lidar intensive variables on the effective radius (Eq. 6) and complex refractive index. We use Mie modeling of spherical particles and use the simplified assumption of a monomodal log-normal size distribution (as discussed in Sect. 2) for this exercise.
[Figure caption fragment: The middle panel shows the dependence on real refractive index along the x axis, parameterized by effective radius. The right panel shows the dependence on the imaginary refractive index (x axis) for four values of the real refractive index; in this case, median radius is held fixed at 0.12 µm and geometric standard deviation is held fixed at 1.48 (values also from Case 1 in Table 1).]
Recall that aerosol intensive variables are those that do not scale with the amount of aerosol loading. Of the five state variables, total number concentration is an extensive variable while the other four (real and imaginary parts of the refractive index, median radius, and geometric standard deviation) are intensive variables. Aerosol extinction and backscatter coefficients, the direct measurements of a lidar using the HSRL or Raman techniques, are extensive variables; ratios of these basic measurements are intensive variables. Burton et al. (2012) show that intensive variables such as the lidar ratio (extinction-to-backscatter ratio at a given wavelength) and backscatter color ratio (i.e., ratio of backscatter at two different wavelengths) encode information about the type of aerosol present in broad categories, i.e., marine vs. smoke vs. urban pollution. It is also known that the extinction Ångström exponent is sensitive to the particle size distribution (e.g., Schuster et al., 2006; Ångström, 1929; Kaufman et al., 1994).
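The distinction between extensive and intensive variables can be illustrated with a few lines of code. The sketch below forms the lidar ratio, extinction Ångström exponent, and backscatter color ratio from a 3β + 2α measurement set and shows that they are unchanged when all measurements are scaled by a common factor, as happens when only the total number concentration changes. Apart from the 532 nm extinction of 0.092 km⁻¹ quoted for Case 1, all numerical values and all function and key names are hypothetical and chosen only for illustration.

```python
import math

def intensive_variables(m):
    """Form intensive ratios from extinction (alpha, km^-1) and
    backscatter (beta, km^-1 sr^-1) at 355, 532, and 1064 nm."""
    return {
        # Lidar ratio: extinction-to-backscatter ratio (sr).
        "lidar_ratio_355": m["alpha355"] / m["beta355"],
        "lidar_ratio_532": m["alpha532"] / m["beta532"],
        # Extinction Angstrom exponent for the 355/532 nm pair.
        "angstrom_ext_355_532": -math.log(m["alpha355"] / m["alpha532"])
                                / math.log(355.0 / 532.0),
        # Backscatter color ratio for the 532/1064 nm pair.
        "color_ratio_532_1064": m["beta532"] / m["beta1064"],
    }

# Hypothetical measurement set (only alpha532 is taken from the text).
measurements = {"alpha355": 0.150, "alpha532": 0.092,
                "beta355": 0.0025, "beta532": 0.0016, "beta1064": 0.0006}

# Tripling the aerosol loading scales every extensive measurement by 3.
scaled = {key: 3.0 * value for key, value in measurements.items()}

base = intensive_variables(measurements)
tripled = intensive_variables(scaled)
for key in base:
    print(f"{key}: {base[key]:.3f} (base) vs {tripled[key]:.3f} (3x loading)")
```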
Figure 1 (left panel) illustrates the monotonic dependence of extinction Ångström exponent (355/532 nm) on the effective radius, for monomodal log-normal size distributions. The sensitivity of this parameter to either the real or imaginary part of the refractive index is smaller, as demonstrated by shallower slopes in the middle and right panels of Fig. 1. Figure 2, in contrast, illustrates the dependencies of the lidar ratio (at 532 and 355 nm), which is the ratio of the extinction to backscatter and is also the inverse of the product of the aerosol 180° phase function and the SSA. The dependence on effective radius is non-monotonic, and there is a complicated dependence on the real refractive index. For effective radius larger than about 0.1 µm, there is significant sensitivity to the real refractive index. There is a monotonic relationship with the imaginary refractive index, with greater sensitivity (steeper slopes) for the 355 nm lidar ratio compared to the 532 nm lidar ratio, reflecting the relationship between lidar ratio and absorption. The lidar ratio increases as the imaginary part of the refractive index increases and (for large enough sizes) as the real part of the refractive index decreases. In Fig. 3 the backscatter color ratio (532/1064 nm) is shown to vary in a complicated way with the real and imaginary refractive indices and with the effective radius, with differently shaped curves compared to the dependence of the extinction Ångström exponent and lidar ratios. Total number concentration is of course not reflected in any of the intensive parameters, but by definition the extensive parameters (backscatter and extinction) are linearly related to N. While this simple sensitivity check illustrates that changes in the aerosol microphysical parameters are reflected in the measurements, it is not sufficient to determine if the measurements are enough to retrieve all five state parameters. For that, we must turn to more quantitative tools.
[Figure caption fragment: The left graph shows the dependence on median radius and geometric standard deviation, with the complex refractive index held fixed at 1.47-0.00325i and the total number concentration held fixed at 1001 cm⁻³. The right graph shows the dependence on the complex refractive index (RRI is real refractive index and IRI is imaginary refractive index) with the total number concentration held fixed at 1001 cm⁻³, the median radius = 0.115 µm, and the geometric standard deviation = 1.475. Dependence on total number concentration is very slight and is not illustrated here.]

Degrees of freedom and averaging kernel matrix
The retrieval problem as specified above consists of five direct aerosol measurements (two extinction and three backscatter measurements) from a lidar system at a single level in the atmosphere and five state vector elements (three describing the number and size distribution and two to specify the complex refractive index). We would like to know if the five measurements are sufficient to determine the five unknowns; in other words, whether the inverse system is fully determined, overdetermined, or underdetermined, and by how much. Rodgers (2000) describes a useful metric to quantify the number of pieces of independent information in the measurement, the DOF for the signal, d_s. It is defined as the trace of the matrix J̃, which is known as either the prewhitening matrix (Rodgers, 2000) or the error-normalized Jacobian matrix (Xu and Wang, 2015). This matrix is defined in terms of the Jacobian matrix, J, the measurement error covariance matrix, S_ε, and the a priori covariance matrix, S_a:
$$ \tilde{J} = S_{\varepsilon}^{-1/2} \, J \, S_{a}^{1/2} $$
Since the error-normalized Jacobian matrix is weighted by the prior covariance in the numerator and the measurement error in the denominator, elements greater than unity indicate where variability in the true state exceeds measurement noise. The trace of the matrix, d_s, therefore indicates the number of independent pieces of information about the state contained in the measurements. For a fully determined retrieval system, the DOF would be equal to the number of state parameters.
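To make the DOF metric concrete, the sketch below computes the signal degrees of freedom for a toy problem. It assumes diagonal covariance matrices and follows the standard optimal-estimation formulation in Rodgers (2000), in which d_s is the sum of λ²/(1 + λ²) over the singular values λ of the error-normalized Jacobian, which is equivalent to the trace of the averaging kernel matrix. This is an assumption about how the calculation is carried out, not a reproduction of the authors' code; the Jacobian, the normalized prior standard deviations, and all names are hypothetical.

```python
import numpy as np

def error_normalized_jacobian(J, sigma_eps, sigma_a):
    """S_eps^(-1/2) J S_a^(1/2) for diagonal covariances: row i is divided
    by the measurement std. dev., column j multiplied by the prior std. dev."""
    return J * sigma_a[np.newaxis, :] / sigma_eps[:, np.newaxis]

def signal_dof(J, sigma_eps, sigma_a):
    """Degrees of freedom for signal from the singular values of the
    error-normalized Jacobian (Rodgers, 2000)."""
    lam = np.linalg.svd(error_normalized_jacobian(J, sigma_eps, sigma_a),
                        compute_uv=False)
    return float(np.sum(lam**2 / (1.0 + lam**2)))

# Hypothetical Jacobian: rows are 3 backscatter + 2 extinction measurements
# (in relative units), columns are 5 normalized state variables.
J_demo = np.array([[0.9, 0.3, 1.0, 0.4, 0.20],
                   [0.8, 0.4, 1.0, 0.5, 0.10],
                   [0.5, 0.6, 1.0, 0.3, 0.05],
                   [0.7, 0.2, 1.0, 0.2, 0.02],
                   [0.6, 0.3, 1.0, 0.2, 0.01]])
sigma_eps_demo = np.array([0.05, 0.05, 0.05, 0.20, 0.20])  # 5 % and 20 % errors
sigma_a_demo = np.ones(5)                                   # normalized priors

print("signal DOF:", signal_dof(J_demo, sigma_eps_demo, sigma_a_demo))
```

Each singular value much larger than 1 contributes nearly one full degree of freedom, while singular values below 1 contribute less than one half, so d_s always lies between zero and the number of state variables.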
For the first reference case, Case 1 (see Table 1 for description), the signal DOF, d_s, is determined using this method to be 4.5. The implication of this calculation is that of the five pieces of information required to specify the state, 4.5 of them are provided by the measurement signal.
The quoted d_s is only applicable to one particular value of the state vector. In general, the information content is regime-dependent (dependent on the state). For the other cases in Table 1, d_s is 4.6 for the slightly larger fine-mode case, 3.9 for the coarse-mode case, 4.2 for the absorbing case, and 3.8 for the case with large total number concentration. Figure 4 provides a more detailed look at the regime dependence for two orthogonal "slices" through the 5-D state space, illustrating that values of approximately 4 are typical of most of the space, except for the smallest particle radii, where the signal DOF decreases closer to 3.
Signal DOF values less than 5 mean that some of the information in the five retrieved parameters is not provided directly by the measurements and will be "filled in" by a priori information or other constraints in a retrieval. A value of d_s less than 5 is not surprising, because it is already well understood that this problem is underdetermined (Veselovskii et al., 2002; Bockmann et al., 2005; Pérez-Ramírez et al., 2013). In general for this system, we find that approximately four independent pieces of information are provided by the measurements, with slight regime dependence.

Propagated state uncertainties
While the signal DOF is a useful metric that indicates the number of independent pieces of information in the measurements with respect to the state, the a posteriori (i.e., propagated) state error covariance matrix is more useful both for indicating how the retrieval errors are propagated from the measurement errors and also for assessing how the underdeterminedness affects specific state variables. The state error covariance matrix, Ŝ, is propagated from the measurement error covariance matrix S_ε and a priori covariance matrix S_a using the Jacobian matrix, J, by
$$ \hat{S} = \left( J^{T} S_{\varepsilon}^{-1} J + S_{a}^{-1} \right)^{-1} $$
Table 2 shows Ŝ for the first reference case as an example. The diagonal elements in the covariance matrix are the variance terms, and their square roots are the standard deviations. These standard deviations, which we will also call the propagated uncertainties, are shown in Table 3 for the five reference cases. Table 3 also shows the prior uncertainty from the a priori covariance matrix. Comparing the propagated uncertainty with the prior uncertainty shows how the measurements constrain the retrieval beyond the prior knowledge. For the size distribution parameters, the assigned prior standard deviations are 0.3 for the median radius, 0.6 for the geometric standard deviation, and 20 000 for the total number concentration. In each of the reference cases, the propagated uncertainty values from Table 3 for these three variables represent a significant reduction in the standard deviation, by 40-87 % for the median radius, 17-84 % for the geometric standard deviation, and 31-99 % for the total number concentration. The measurements also reduce the prior standard deviation of the RRI significantly, by 26-79 % from the prior standard deviation of 0.19. For IRI, there is a reduction of 52-90 % from the prior standard deviation of 0.05. So, the measurements constrain knowledge of all of the state variables beyond the prior knowledge.
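The propagation described here can be sketched in a few lines using the standard optimal-estimation expression for the a posteriori covariance (Rodgers, 2000), Ŝ = (Jᵀ S_ε⁻¹ J + S_a⁻¹)⁻¹. The example below uses a two-variable toy state with the prior standard deviations quoted for the median radius (0.3) and geometric standard deviation (0.6); the Jacobian, the three-measurement setup, and the function names are hypothetical and serve only to show how the fractional reduction from prior to propagated uncertainty is obtained.

```python
import numpy as np

def posterior_covariance(J, S_eps, S_a):
    """A posteriori state error covariance for a linearized forward model."""
    return np.linalg.inv(J.T @ np.linalg.inv(S_eps) @ J + np.linalg.inv(S_a))

def uncertainty_reduction(S_hat, S_a):
    """Fractional reduction of each standard deviation relative to the prior."""
    return 1.0 - np.sqrt(np.diag(S_hat)) / np.sqrt(np.diag(S_a))

# Hypothetical Jacobian: 3 measurements (relative units) x 2 state variables.
J_demo = np.array([[1.2, 0.3],
                   [0.9, 0.5],
                   [0.4, 0.8]])
S_eps_demo = np.diag([0.05**2, 0.05**2, 0.20**2])  # 5 %, 5 %, 20 % errors
S_a_demo = np.diag([0.3**2, 0.6**2])               # prior variances, no correlation

S_hat_demo = posterior_covariance(J_demo, S_eps_demo, S_a_demo)
print("propagated std. dev.:", np.sqrt(np.diag(S_hat_demo)))
print("reduction from prior:", uncertainty_reduction(S_hat_demo, S_a_demo))
```

With a zero Jacobian the expression returns the prior covariance unchanged, which is the limiting case in which the measurements carry no information about the state.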
Since the prior covariance matrix was defined rather conservatively in this study, the reduction from the prior uncertainty may be less useful than comparing to uncertainty values defined in terms of a desired goal. Part of the motivation of this study is to determine the extent to which a 3β + 2α lidar system can meet the requirements outlined in the ACE satellite white paper (https://acemission.gsfc.nasa.gov/documents/Draft_ACE_Report2010.pdf, last access: 22 October 2015). These draft ACE requirements, shown in Table 3, in some cases specify retrieval precisions defined with respect to a vertically resolved profile with resolution of 1.5 km in the free troposphere and 500 m in the boundary layer. These include the total number concentration with a retrieval precision (one standard deviation) to within 100 % (relative) and SSA to within 0.02 (absolute). Other ACE draft requirements are specified for column-equivalent values. These include RRI to within 0.02 (absolute), effective radius to within 10 % (relative), and effective variance to within 50 % (relative). The ACE satellite is planned to include both a multi-wavelength lidar and a multi-wavelength, multi-angle polarimeter. The requirements reflect the expectation that both instruments will be used in a combined retrieval, but this measurement configuration is outside the scope of the current sensitivity study. Some of the ACE requirements are stated in terms of the effective radius, effective variance, and SSA, quantities that are not part of the nominal set of state variables described above; however, they are directly related to the state variables and can be derived from them. In general, if a secondary variable, z, can be expressed as some function of the state variables, x, then the random error of the secondary variable, σ_z, can be calculated using the state error covariance matrix Ŝ and the partial derivatives of the secondary variable with respect to the state variables:
$$ \sigma_{z}^{2} = \sum_{i}\sum_{j} \frac{\partial z}{\partial x_{i}} \, \frac{\partial z}{\partial x_{j}} \, \hat{S}_{ij} \tag{11} $$
For our purpose, the variable z can represent the effective radius, effective variance, or SSA.
The effective radius and effective variance can be calculated for a monomodal log-normal size distribution using Eqs. (6) and (7). The functional dependence of SSA, which is the ratio of the scattering efficiency to the total extinction efficiency, can be obtained using Mie theory. Then the propagated uncertainties for these quantities can be obtained using Eq. (11) with partial derivatives that are calculated either analytically or using finite differencing on the output of the Mie code from Bohren and Huffman (1983). The propagated uncertainties for the effective radius, effective variance, and SSA are also shown in Table 3.
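The propagation of the state error covariance to a derived quantity can be sketched as follows. The example treats the effective radius of a log-normal distribution as the secondary variable z, estimates its partial derivatives by central finite differences, and evaluates the usual first-order variance formula, the sum over i and j of (∂z/∂x_i)(∂z/∂x_j)Ŝ_ij. The 2 × 2 covariance matrix and the function names are hypothetical; a full calculation would include all five state variables and, for SSA, a Mie code in place of the analytic function.

```python
import math

def gradient_fd(f, x, rel_step=1e-6):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for i, xi in enumerate(x):
        h = rel_step * max(abs(xi), 1.0)
        x_plus, x_minus = list(x), list(x)
        x_plus[i] = xi + h
        x_minus[i] = xi - h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad

def propagated_sigma(f, x, S_hat):
    """First-order standard deviation of z = f(x) given state covariance S_hat."""
    g = gradient_fd(f, x)
    n = len(x)
    variance = sum(g[i] * S_hat[i][j] * g[j] for i in range(n) for j in range(n))
    return math.sqrt(variance)

def effective_radius(x):
    """Effective radius of a log-normal distribution; x = [r_med, s]."""
    r_med, s = x
    return r_med * math.exp(2.5 * math.log(s) ** 2)

x_demo = [0.12, 1.48]                 # rounded Case 1 median radius (um) and s
S_hat_demo = [[4.0e-4, -1.5e-3],      # hypothetical propagated covariance of
              [-1.5e-3, 1.0e-2]]      # (r_med, s), including a negative cross-term

sigma_reff = propagated_sigma(effective_radius, x_demo, S_hat_demo)
print(f"sigma(r_eff) = {sigma_reff:.4f} um "
      f"({100.0 * sigma_reff / effective_radius(x_demo):.0f} % relative)")
```

Because the cross-term enters the sum, a negative correlation between median radius and geometric standard deviation can make the effective radius better constrained than either state variable would suggest on its own.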
As with the signal DOF, the propagated errors for the state vector elements and for effective radius, effective variance, and SSA are regime dependent, varying over different parts of the state space. In the Supplement, there are figures similar to Fig. S4 but which show five state variables as well as the derived variables, effective radius, effective variance, and SSA. These illustrate the ease with which the sensitivity metrics can be calculated for the whole state space, but since some of the states represented in these slices may not be particularly realistic, it can be hard to interpret the results. Therefore, the five reference cases in Table 1 were designed to provide a focus for understanding the regime dependence more easily.
Recall that the differences between cases 1, 2, and 3 are related to the size distribution. The size distribution for Case 3 is a coarse mode with a larger particle size, larger geometric standard deviation, and smaller total number concentration than cases 1 and 2. Compared to Case 1 or 2, Case 3 has larger propagated relative uncertainty of the effective radius, 50 % uncertainty compared to 23 and 29 %, and also of total number concentration, 122 % compared to 98 and 103 %, mostly due to the increase in the geometric standard deviation. For the most part, we found increasing relative uncertainties for the size distribution parameters for increasing geometric standard deviations (with some exceptions, as can be seen in the figures in the Supplement). However, compared to Case 1, Case 3 has smaller uncertainties on the complex refractive index and SSA, although the complex refractive index did not change between cases.
Case 4 has the same size distribution as Case 1, but the complex refractive index corresponds to a more absorbing aerosol. There are only minor differences in the size distribution uncertainties, but the uncertainties on the complex refractive index and SSA increase, suggesting less sensitivity in the retrieval to the complex refractive index of absorbing aerosols.
Case 5 is identical to Case 4 except that it has a very large total number concentration. Although such a large total number concentration in a real-world measurement scenario would mean greatly increased signal-to-noise ratio (SNR), the measurement errors for this study are defined as constant percentages, so the SNR effect is not reflected in this study. Instead, total number concentration behaves essentially as a scaling variable in the retrieval, and therefore most of the propagated uncertainties are very similar for Case 5 compared to Case 4. The exception is the uncertainty in the total number concentration itself, which decreases from 94 to 68 %.
Comparing the propagated uncertainties to the ACE requirements, note that ACE calls for an uncertainty on the column total number concentration of 100 %. The uncertainties in Table 3 show that the 3β + 2α retrieval already meets this requirement even for vertically resolved measurement levels in the absorbing aerosol cases and meets it or is very close to meeting it in the non-absorbing fine-mode cases. The coarse-mode case has the largest total number concentration uncertainty, 122 %, but is still reasonably close to the column uncertainty target. Note that a requirement on the column uncertainty is less restrictive than a requirement on vertically resolved measurement levels. This study focuses only on the sensitivities of single-level retrievals, but full profile retrievals are also possible (Kolgotin and Müller, 2008), which optimize the use of simultaneous information content from multiple related vertical levels. If an aerosol layer extends across multiple measurement levels, then a profile retrieval which combines measurements from multiple levels within the column would include proportionately more measurement information content (since the noise in the measurements is mostly uncorrelated and the aerosol properties are correlated), and so the uncertainty would be reduced, compared to a single-level retrieval. In the future, we will perform sensitivity studies for such a retrieval system.
The uncertainties on the vertically resolved effective variance are 36 and 41 % for the absorbing cases, which already meet the proposed ACE column requirement of 50 %. The non-absorbing fine-mode and coarse-mode cases have effective variance uncertainties of 61 to 68 %, not very much larger than the ACE column requirement.
The requirement of 10 % column uncertainty for the effective radius is not met for any of the five illustrated cases on a vertically resolved basis; the propagated uncertainties are 2 to 3 times larger for the three fine-mode cases and 5 times larger for the coarse-mode case. A factor of 2 or 3 may be recoverable by a profile retrieval which uses multiple vertically resolved measurement levels simultaneously, assuming the aerosol properties are correlated across several levels.
The propagated uncertainty on the real refractive index is 2 to 7 times the proposed ACE column requirement, in this case smallest for the coarse-mode case and worst for the two absorbing fine-mode cases.
The proposed ACE requirement for SSA is 0.02 on a vertically resolved basis. The propagated uncertainties for all four cases are 3 to 5 times larger than this proposed requirement, which may be sufficient for distinguishing extreme cases such as intense biomass burning plumes and also may be reducible to some extent by a profile retrieval.

Performance assessment for varying measurement errors
Note that the ACE requirements are not necessarily finalized and the values quoted here are draft requirements. Similarly, the instrument performance used for the results described above is only approximate, based on a best-guess estimate of realistic targets for a space-based lidar system that uses the technology of the airborne HSRL-2. Since the motivation for this study is to determine what retrieval performance is possible from a lidar-only microphysical retrieval, it is useful to briefly explore the retrieval uncertainties as a function of instrument performance. Table 4 accordingly shows the propagated uncertainties, using reference Case 1, for three different instrument configurations with different measurement uncertainties for backscatter and extinction. The first measurement configuration assumes that the uncertainties are larger than previously described, 10 % for aerosol backscatter and 30 % for aerosol extinction. The second of the three configurations in Table 4 is a repetition from Table 3, with uncertainties of 5 and 20 % on aerosol backscatter and extinction, respectively. The third theoretical instrument configuration is more ambitious, with assumed uncertainty of 5 % on aerosol backscatter and 10 % on aerosol extinction. Comparing the first and second scenarios, when the measurement uncertainties are allowed to increase as described, the retrieval uncertainties increase by 20-50 %. Comparing the second and third scenarios, when instead the extinction measurement uncertainty is decreased by half, the retrieval uncertainties all decrease by approximately 30-40 %. In the third scenario, the draft ACE requirement for vertically resolved total number concentration is met; the requirement for column effective variance is met even on a vertically resolved basis, and the vertically resolved effective radius uncertainty is less than twice the column requirement. However, the real refractive index and SSA uncertainties are still large compared to the ACE draft requirements. This level of precision and accuracy may be difficult to achieve with a satellite lidar.
Recall that the proposed ACE system consists of both a multi-wavelength HSRL and a multi-wavelength, multi-angle polarimeter. The current sensitivity study addresses only the lidar. A combined retrieval with both lidar and polarimeter will certainly have higher information content, particularly pertaining to aerosol absorption, and a better chance of meeting all of the draft ACE requirements. To quantitatively assess the information content of this more complicated system, a full column retrieval using a combined lidar-plus-polarimeter forward model would be required, which is outside of the scope of the current paper.
Based on the current study, it seems likely that a 3β+ 2α lidar-only system with measurement errors similar to those studied here will have trouble retrieving SSA to the target level of uncertainty and that additional information content must be provided, such as from coincident passive (sunphotometer or polarimeter) measurements at more wavelengths or, when additional measurements are not available, then from a priori constraints.

Correlation matrix
Besides the diagonal variance elements, the state error covariance matrix includes off-diagonal terms that describe the interaction between pairs of state variables in the retrieval. Prior similar sensitivity studies for other systems do not explicitly address the off-diagonal terms of the propagated matrix (Xu and Wang, 2015; Knobelspiesse et al., 2012), but these terms give critical information about retrieval performance. To illustrate, Tables 5 and 6 give the state error correlation matrix for Case 1 and Case 2, respectively. These can be easily converted from the state error covariance matrices, like the one given in Table 2 for Case 1. The correlation matrices show that there is some correlation between all pairs of variables, with the highest correlations between the real and imaginary parts of the complex refractive index and between the total number concentration and median radius. The correlations have a complicated regime dependence, illustrated in Figs. 5, S9, and S10. Although cases 1 and 2 vary only a small amount in median radius and geometric standard deviation, there is a significant increase for Case 2 in the magnitude of the correlations between size distribution variables. For Case 2, the correlation is −0.97 between the median radius and total number concentration and also between the median radius and geometric standard deviation.
[Figure caption: The a posteriori correlation between retrieved total number concentration and median radius is here shown as a 2-D slice through the five-variable state space. The complex refractive index is held fixed at 1.470-0.00325i, the total number concentration is held fixed at 1001 cm⁻³, and the dependence on median radius and geometric standard deviation is depicted. Symbols show the values of median radius and geometric standard deviation for cases 1 (circle), 2 (square), and 3 (triangle), which also have the same complex refractive index as the illustrated slice.]
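The conversion from a state error covariance matrix to a correlation matrix is a simple normalization of each element by the product of the corresponding standard deviations. A minimal sketch is given below; the 2 × 2 covariance values are hypothetical and are not taken from Table 2.

```python
import math

def covariance_to_correlation(S):
    """Normalize a covariance matrix (list of lists) to a correlation matrix:
    C[i][j] = S[i][j] / (sqrt(S[i][i]) * sqrt(S[j][j]))."""
    n = len(S)
    sd = [math.sqrt(S[i][i]) for i in range(n)]
    return [[S[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Hypothetical propagated covariance for two state variables.
S_hat_example = [[4.0e-4, -1.5e-3],
                 [-1.5e-3, 1.0e-2]]

for row in covariance_to_correlation(S_hat_example):
    print(["%.2f" % value for value in row])
```

The diagonal of the result is unity by construction, and off-diagonal magnitudes approaching 1 flag pairs of variables whose errors can compensate for each other.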
High-magnitude correlations between the retrieved variables indicate the potential for cross-talk between these parameters. Cross-talk can cause additional error in the retrieved parameters that is not reflected in the variance terms, due to non-unique solutions which have compensating errors. In an ideal case with no cross-talk, the forward model evaluated at the true state would produce output equal to the measurements (ignoring measurement error), while the forward model evaluated using an incorrect state vector should produce output that does not agree with measurements. However, in the case of compensating errors or cross-talk, an incorrect solution may also reproduce the measurements if, for example, an error in the median radius that tends to produce larger backscatter and extinction values were compensated by an error in the total number concentration that tends to produce smaller values. Such compensating errors make it impossible for the measurements to distinguish between the true state and the erroneous state.
The cause of the cross-talk can broadly be described as a lack of sensitivity in the measurements. The cross-talk between total number concentration and median radius occurs because particles significantly smaller than the shortest measured wavelength (355 nm) contribute little to observed optical properties. Therefore, the measurements can be insensitive to the difference between large numbers of very small particles and smaller numbers of larger particles. This problem and a partial remediation are examined in more detail in Sect. 9. The cross-talk between the real and imaginary index of refraction is related to a relative lack of sensitivity to absorption in the lidar measurements. Probably the best remedy for this latter problem is to incorporate additional information content into the retrieval, preferably in the form of additional coincident measurements, as from a polarimeter on the same platform.

Cross-talk between size parameter and total number concentration
Taking measurement error into account, there are always multiple solutions that reproduce the measurements to within the measurement error. This is not a concern when the solutions are clustered around the true solution, but it can be a significant issue in the case of cross-talk or compensating errors as discussed above. Figures 6 and 7 show histograms of the number of solution states in the gridded LUT that reproduce the backscatter and extinction values of Case 1 to within the prescribed error bars (5 % for backscatter coefficient and 20 % for extinction coefficient). Figure 6 shows the total number concentration and Fig. 7 shows the median radius. Note that although the peaks of the histograms do not exactly match the specified values for Case 1 (indicated by dashed lines), the solutions are clustered around those values. In contrast, Figs. 8 and 9 illustrate the set of solutions from the gridded LUT that match Case 2 within the measurement error bars (shown in red). Figure 8 shows that this set of solutions covers an enormous range in total number concentration. The range of total number concentration for these solutions is much larger than indicated by the propagated standard deviation shown in Table 6. Cases 1 and 2 have similar propagated standard deviations; the problem with Case 2 is only evident in the near-unity correlation value between total number concentration and median radius, shown in Table 6. The very high correlation indicates that the solutions with very large total number concentrations are those solutions that also have very small median radii. Very small particles contribute little to extinction or backscatter at the lidar wavelengths. So, large numbers of very small particles can be included in the retrieved solution without significantly affecting the agreement with the measurements; therefore, the measurements by themselves are not sufficient to determine if these very small particles are actually present.
This situation emphasizes the value of examining the cross-terms of the propagated error matrix.The regime dependence of this situation is complex and the problem can be detected only by studying the correlation matrix or else by examining the distribution of solutions for a given retrieval.
A resolution of the cross-talk can be achieved by adding an additional constraint on either the total number concentration or the radius using a priori information. For example, it is probably unrealistic to allow total number concentrations up to 40 000 cm⁻³. However, it is not clear how one would determine a realistic upper bound on the total number concentration. We argue that a better solution is to constrain the radius of the particles. After all, the limitation of the lidar measurements is a lack of sensitivity to particles much smaller than the smallest lidar wavelength; it is not a limitation on the sensitivity to large total number concentrations. To approximate such a constraint, we repeated the retrieval for cases 1 and 2 using a limited version of the LUT, where some solutions are disallowed depending on the size of particles in the size distribution. The blue histograms in Figs. 6-9 illustrate the solutions for which 80 % or more of the particles in the size distribution are larger than 50 nm radius. As seen in the blue histograms, limiting the retrieval to larger particles improves the cross-talk problem for Case 2, and the solutions are now much better constrained around the truth solution.
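A size constraint of this kind can be evaluated analytically for a monomodal log-normal number distribution, since the fraction of particles with radius above a cutoff follows from the log-normal cumulative distribution function. The sketch below is one possible way to screen LUT entries under that assumption; the function names and the second (rejected) example state are hypothetical, while the 50 nm cutoff and the 80 % threshold are those stated in the text.

```python
import math

def fraction_larger_than(r_cut, r_med, s):
    """Number fraction of a log-normal distribution (median radius r_med,
    geometric std. dev. s) with radius greater than r_cut (same units)."""
    z = math.log(r_cut / r_med) / (math.sqrt(2.0) * math.log(s))
    return 0.5 * (1.0 - math.erf(z))

def passes_size_constraint(r_med, s, r_cut=0.05, min_fraction=0.80):
    """True if at least min_fraction of particles exceed r_cut (micrometers)."""
    return fraction_larger_than(r_cut, r_med, s) >= min_fraction

# Rounded Case 1 size distribution from the text, and a hypothetical
# small-particle state of the kind the constraint is meant to exclude.
for label, r_med, s in [("Case 1", 0.12, 1.48), ("small-particle state", 0.03, 1.50)]:
    frac = fraction_larger_than(0.05, r_med, s)
    print(f"{label}: {100.0 * frac:.1f} % of particles above 50 nm, "
          f"allowed = {passes_size_constraint(r_med, s)}")
```

Expressing the constraint as a number fraction rather than a hard lower bound on median radius lets broad distributions with a modest small-particle tail remain in the allowed set.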
A radius of 50 nm is proposed for the cutoff value based on the sensitivity of the lidar measurements and on the naturally occurring lower bound of the atmospheric aerosol accumulation mode.Typically aerosol size distributions are described in terms of three or four size modes, depending on whether one is examining the number-or mass-based size distribution (Seinfeld and Pandis, 2006).Most particles, on a number basis, exist in the ultrafine diameter size range of a few nanometers up to a few hundred nanometers, with two distinct modes: the nucleation mode (D < 10 nm) and the Aitken mode (10 nm < D < 300 nm).Nucleation-mode particles are fresh aerosols created via gas-to-particle nucleation, while the Aitken mode encompasses directly emitted particles and particles that have grown via coagulation or gas-to-particle condensation.Meanwhile, on a mass basis, the aerosol size distribution is dominated by two larger modes (with the ultrafine particles contributing almost negligible mass): the accumulation mode (100 nm < D < 2.5 µm; consisting of direct particle emissions, coagulation of smaller particles, and gasto-particle condensation of sulfates, nitrates, and organics) and the coarse mode (2.5 µm < D < 50 µm; consisting of particles formed via mechanical processes such as wind-blown dust or sea salt).Atmospheric photo-oxidation and cloud processing can also affect these modes and cause both the number and mass size distributions to shift toward larger sizes, as is often seen for cloud processed marine aerosol (Hoppel et al., 1986).
Clearly, if an Aitken or nucleation mode with large number concentration does exist, limiting the size range of the retrieval introduces the possibility of bias in total number concentration.However, it is important to realize that even if it is known from external sources (such as in situ measurements) that an observation is occurring in a region of significant new particle production, lowering the cutoff radius will not resolve the systematic error in the retrieval, since the measurements cannot distinguish between large numbers of very small particles and smaller numbers of larger particles.Therefore, we think it is sensible to limit the particle size in the retrievals to reflect the measurement sensitivity to larger sized particles.This strategy also has the benefits of making the constraint explicit and leading to a clear and understandable interpretation of the results.In this case, the retrieval should not be described as a retrieval of total aerosol number concentration but rather as a retrieval of accumulation-mode and coarse-mode aerosols, more accurately reflecting the retrieval sensitivities.
This strategy has heritage in existing retrievals. In inversion with regularization (Müller et al., 1999), the underdetermination of the retrieval is addressed by putting strong constraints on the window of particle sizes that are considered, effectively limiting the minimum particle radius to 50 nm (Veselovskii et al., 2002). However, in that retrieval the limit varies from case to case and even from one solution to another within the set of solutions that are averaged for a particular retrieval, with the minimum radius being anything between 50 and 500 nm. Since we argue that the need for a minimum particle radius cutoff is related to the limited sensitivity of the measurements to very small particles, we believe that a single cutoff would be more consistent with our understanding of the retrieval sensitivities. In any case, it is important to recognize that the size cutoff amounts to prior information supplementing the information content of the measurements; explicitly describing the prior information is essential to understanding and evaluating retrieval systems and their products.
To investigate the potential for bias associated with the particle size cutoff, it is useful to examine how much of an effect the Aitken mode would have on the backscatter and extinction measurements. For this exercise, we start with a retrieval case similar to an actual measurement described by Müller et al. (2014) from the NASA Langley HSRL-2 on 17 July 2012, from TCAP, and then add on a simulated mode with particle radius of 15 nm (diameter = 30 nm) and varying number concentration. For the purpose of this exercise, we limit the simulated Aitken mode to a narrow mode width (s = 1.48). For the complex refractive index of the simulated Aitken mode, we used values given by Costabile et al. (2013).
Figure 10 shows the result of this numerical experiment and demonstrates that even for 40 000 cm⁻³ of simulated Aitken particles, the maximum effect on the measurements is less than 2 % for the 355 nm backscatter measurement and less for other wavelengths and for the extinction (i.e., compared to the actual backscatter and extinction measurements for the TCAP case). Two percent is not a significant effect on the measurements. Given that it is significantly smaller than the assumed measurement errors for this sensitivity study, it is fair to say that the measurements are not sensitive to this mode. (For airborne HSRL-2 measurements with 5-minute averaging such as were used by Müller et al. (2014), the effect on backscatter is about the same size as the random errors and significantly smaller than the extinction random errors.) For number concentrations larger than 40 000 cm⁻³, the effect of the simulated Aitken mode is of course larger due to the linear dependence of backscatter and extinction coefficients on total number concentration. For these particles, it would require a number concentration of approximately 10⁶ cm⁻³ to have a significant impact on the measurements. Examples of measured Aitken and nucleation-mode number concentrations include values of about 13 000 cm⁻³ for each of the two modes from a case of new particle formation in an urban environment described by Cheung et al. (2013) and a maximum of 50 000 cm⁻³ of particles with radii less than 5 nm for a case of new particle formation in an agricultural region (Mozurkewich et al., 2004). For this latter case, since the particles are much smaller, the effect on the backscatter and extinction is smaller than that of the 40 000 cm⁻³ of 15 nm radius particles simulated above, so it seems reasonable to suggest that number concentrations of particles in this size range would rarely be large enough to significantly affect the lidar measurements.
As the particle radius gets larger, the sensitivity of the measurements to these aerosols increases. Figure 11 shows the effect as a fraction of backscatter and extinction (again using the measurements from the TCAP case on 17 July 2012 as a reference) of 1000 cm⁻³ of particles of varying median radius. At about 50 nm median radius, the approximate boundary between the Aitken and accumulation modes, the effect is a few percent to 10 % of the backscatter and extinction, which is on the order of the measurement uncertainty. For larger particles in the accumulation mode, the effect is a significant portion of the measurements, reflecting that the measurements have good sensitivity to the accumulation mode. This suggests that a 50 nm radius is a reasonable cutoff to use in retrievals, representing the approximate boundary where the measurements have reasonable sensitivity. Of course, the true sensitivity of the measurements depends on the number concentration, but since N is unknown, a constant cutoff is a good strategy.
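Because backscatter and extinction coefficients depend linearly on total number concentration, the concentrations quoted for the simulated Aitken mode can be related by simple proportional scaling. The sketch below treats the 2 % effect at 40 000 cm⁻³ as an upper bound (the text states "less than 2 %") and assumes a hypothetical threshold of 50 % of the measured backscatter for a "significant" impact. The threshold and the function names are assumptions made for illustration only.

```python
REFERENCE_CONCENTRATION = 40_000.0  # cm^-3, simulated Aitken-mode number concentration
REFERENCE_EFFECT = 0.02             # upper bound on fractional effect (355 nm backscatter)

def fractional_effect(concentration):
    """Fractional effect on the measurement, scaled linearly with number concentration."""
    return REFERENCE_EFFECT * concentration / REFERENCE_CONCENTRATION

def concentration_for_effect(target_effect):
    """Number concentration (cm^-3) at which the scaled effect reaches target_effect."""
    return REFERENCE_CONCENTRATION * target_effect / REFERENCE_EFFECT

# Hypothetical threshold: an effect equal to 50 % of the measured backscatter.
needed = concentration_for_effect(0.5)
print(f"Concentration for a 50 % effect: {needed:.3g} cm^-3")
```

Under these assumptions the scaling is consistent in order of magnitude with the concentration of approximately 10⁶ cm⁻³ mentioned in the text.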
It is worth pointing out that although it is true that lidar measurements lack sensitivity to particles much smaller than the smallest wavelength, they do not lack sensitivity to particles much larger than the longest wavelength, as is sometimes stated. For instance, it is not true that "pollens cannot be observed with lidar systems" (Bockmann et al., 2005). See Fig. 12 for an illustration of lidar measurements simulated by Mie modeling for very large particles. At these large particle sizes, a forward model for the lidar based only on the single scattering Mie calculations is no longer applicable, but this simple illustration serves to show that the backscatter and extinction coefficients are much larger, not smaller, than the benchmark observations of the lidar. The scattering efficiency of large particles is significant even at wavelengths much smaller than the particle size and so the effect of laser light scattering from large particles is easily seen using lidar. However, since the particle size dependence of the lidar measurements is not monotonic at large particle sizes and the single scattering forward model is no longer applicable, microphysical retrievals of particle properties are challenged at large particle sizes. See Gasteiger and Freudenthaler (2014) for a further discussion of retrieval of large particle size from multi-wavelength lidar.
Table 7. Like Table 3, but using total volume concentration instead of total number concentration as a state variable, this table shows propagated uncertainties (standard deviations) for state variables and selected additional variables derived from the state variables, shown for the reference cases described in Table 1. The propagated uncertainties (Eq. 9) depend on assumed measurement errors of 5 % for backscatter and 20 % for extinction and depend on a priori covariance as described in the text. The assumed a priori uncertainties are listed for comparison.

… 2004) that performing the retrieval with higher-order kernels may reduce the retrieved uncertainties. It is straightforward to use the volume size distribution instead of the number size distribution for f(r) in Eq. (1) as long as the kernels are also represented in terms of volume concentrations. The analysis presented above can be repeated using the total volume concentration rather than total number concentration as one of the five state variables, and the sensitivity analysis can be repeated to assess the impact of switching kernels on the information content of the measurements, due to a redefinition of the state space and concomitant change in the null space (the portion of the state variable space that cannot be assessed using the measurements). Table 7 shows the propagated uncertainties for the five state variables after making this change, for the reference cases. Note that the differences between Tables 7 and 3 are mostly insignificant except for Case no. 3, the coarse-mode case. This is also reflected in Fig. 13, which shows decreased correlation (cross-talk) between the total volume concentration and the median radius, compared to the number-versus-radius correlation shown in Fig. 5, but only in the upper right quadrant, which corresponds to the largest effective radii. In summary, the change to the higher-order kernel improves the measurement sensitivities for the case of large particles. It does not solve the problem of high correlation between the number concentration and the median radius for the fine-mode cases as discussed in Sect. 9. Note, as before, that any additional errors or instabilities that are part of the retrieval will not be included here, and it is possible that there are other considerations in specific retrievals that might favor the use of volume kernels over number kernels, such as how the kernel functions are integrated using orthogonal basis functions, as discussed by Veselovskii et al. (2004).
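For the monomodal log-normal distribution assumed in this study, total volume concentration follows from total number concentration, median radius, and geometric standard deviation through the standard log-normal moment relations. The sketch below illustrates that conversion and the related effective radius. It is not code from the study; the function names are assumptions, and the example inputs are the slice values quoted in the figure captions.

```python
import math

def volume_concentration(number_conc, median_radius, geo_std_dev):
    """Total volume concentration of a log-normal mode.
    number_conc in cm^-3 and median_radius in um give um^3 cm^-3."""
    ln_s_sq = math.log(geo_std_dev) ** 2
    # Third moment of the log-normal radius distribution
    mean_r_cubed = median_radius ** 3 * math.exp(4.5 * ln_s_sq)
    return number_conc * (4.0 / 3.0) * math.pi * mean_r_cubed

def effective_radius(median_radius, geo_std_dev):
    """Effective radius (ratio of third to second moment) of a log-normal mode."""
    return median_radius * math.exp(2.5 * math.log(geo_std_dev) ** 2)

v = volume_concentration(1001.0, 0.115, 1.475)
r_eff = effective_radius(0.115, 1.475)
print(f"V = {v:.2f} um^3 cm^-3, r_eff = {r_eff:.3f} um")
```

Because the volume depends on the cube of the radius, replacing number with volume as a state variable changes how errors in median radius couple to the concentration variable, which is one way to picture the redefinition of the state space described above.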

Summary and discussion
There is considerable interest in retrievals of aerosol size distribution parameters and absorption properties using multiwavelength HSRL or Raman lidar. While there have been successful 3β + 2α retrievals of some particle properties (Müller et al., 2014; Veselovskii et al., 2016), there is also well-justified concern that these retrievals are somewhat underdetermined. In this study we have taken a rigorous look at the information content of single-height-level 3β + 2α lidar measurements with respect to the microphysical parameters of interest, using implementation-independent tools from the field of OE, which allows for combining measurements, measurement errors, and constraints within a single coherent framework. By avoiding a retrieval and using the forward model only (along with reasonable measurement uncertainties and a conservative a priori covariance matrix) we isolate the sensitivities of the measurements themselves for a best-case aerosol scenario, a monomodal log-normal distribution of spherical particles with spectrally independent complex refractive index. The choice of a simplified model adds clarity to the understanding of the uncertainties in retrievals, since it allows for separately assessing the sensitivities and uncertainties of the measurements alone that cannot be corrected by any potential or theoretical improvements to retrieval methodology but must instead be addressed by adding information content. Future work will be performed using less-simplified models. For example, expanding to a bimodal retrieval is straightforward. Equation (1) can be expanded by simply adding the modes together. Then the Jacobian in Eq. (5) becomes a non-square matrix, with more state variables than measurement variables; however, the following equations, Eqs. (8) and (9), do not require a square matrix and therefore the sensitivity metrics can be calculated straightforwardly. Nevertheless, even with a more complex aerosol model, there will be additional retrieval-dependent uncertainties that are related to mismatch between the assumptions and the real-world aerosols and also to retrieval methodology such as inversion technique. These uncertainties are in addition to the uncertainties discussed in this study.
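For reference, the sensitivity metrics of optimal estimation can be written in standard notation as follows. The symbols are assumptions in that standard notation and are not guaranteed to match the symbols or numbering of Eqs. (8) and (9) in this paper. With Jacobian K (m measurements by n state variables), measurement error covariance S_ε (m × m), and a priori covariance S_a (n × n):

```latex
\hat{\mathbf{S}} = \left( \mathbf{K}^{T} \mathbf{S}_{\epsilon}^{-1} \mathbf{K} + \mathbf{S}_{a}^{-1} \right)^{-1},
\qquad
\mathbf{A} = \hat{\mathbf{S}}\, \mathbf{K}^{T} \mathbf{S}_{\epsilon}^{-1} \mathbf{K},
\qquad
d_{s} = \operatorname{tr}(\mathbf{A}).
```

In this form the product K^T S_ε^{-1} K is always n × n regardless of m, so adding state variables for a second mode enlarges the matrices without requiring K to be square.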
In contrast, actual retrievals generally benefit from using various constraints and a priori information that reduce the retrieval errors.A priori knowledge is intentionally minimized in this study to focus on the measurement sensitivities, but in general it will improve retrieval performance from this basic level.
We find that the five 3β + 2α lidar measurements provide approximately four independent pieces of information to describe the aerosol microphysical state space, with only slight regime dependence. Using reasonable lidar measurement uncertainties, the retrieval uncertainties are closest to the proposed ACE satellite precision requirements for the size distribution parameters, particularly the total number concentration, and worst for the complex refractive index, and provide a reduction of the uncertainty from the conservative a priori values for all five variables. We find that, for some parts of the state space, the total number concentration and particle median radius can be affected by cross-talk, which increases the true uncertainty beyond the propagated standard deviation; this is related to the limited sensitivity of the lidar measurements to particle radii smaller than about 50 nm. We recommend limiting the radii in the retrieval to a range where the measurements have greater sensitivity, to address the high correlation between total number concentration and the particle median radius.
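Cross-talk between two retrieved variables can be read from the off-diagonal elements of the state correlation matrix, which is obtained by normalizing the propagated covariance matrix by the standard deviations. The following minimal sketch shows that normalization; the function name and the 2 × 2 covariance values are made up for illustration and are not results from this study.

```python
import math

def correlation_from_covariance(cov):
    """Convert a square covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]

# Made-up 2x2 covariance for two state variables (illustrative values only).
example_cov = [[4.0, -3.0],
               [-3.0, 9.0]]
example_corr = correlation_from_covariance(example_cov)
print(example_corr)
```

An off-diagonal correlation close to +1 or -1 indicates that errors in the two variables compensate each other, so the individual standard deviations understate the joint uncertainty.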
In general, information about the state vector that is not provided by the measurements comes from assumptions, constraints, or other a priori information. Smoothing and regularization are examples of retrieval constraints, as is the idea of limiting the minimum particle radius. Retrieval constraints and assumptions can also be hidden or difficult to characterize. For specific retrieval methodologies, we would like to emphasize the importance of explicitly describing any prior information and constraints that affect retrieval results.
In this sensitivity study, only very conservative constraints were used in order to pinpoint the sensitivity of the measurements. To achieve better performance with a retrieval, three strategies can be adopted either singly or in combination:
1. Add a priori information that constrains the retrieval using known information about the observed aerosol.
2. Reduce the measurement uncertainty.
3. Add additional measurements to the system.
One method to assign a priori covariance information is to use aerosol classification from the lidar intensive parameters (Burton et al., 2012) to infer what type of aerosol is present and then assign prior variances for the state parameters that are specific to that aerosol type. It has been demonstrated that the lidar intensive parameters from an HSRL have sufficient information content to categorize aerosol into broad categories. Assigning a priori values based on these categories additionally requires representative information about the microphysical properties of aerosols in each category from in situ measurements or from modeling.
Reducing the measurement uncertainty involves either designing the observing system to stricter requirements (to the extent practical) or reworking the retrieval problem to make more optimal use of the measurement information. For example, a simultaneous profile retrieval that uses the 3β + 2α lidar information from the whole column with appropriate constraints on the correlations between levels is likely to have somewhat improved information content compared to the baseline uncertainties for the level-by-level retrieval system discussed in this work.
Finally, measurement information content can be increased by adding more measurements to the system, for example by combining coincident lidar plus polarimeter measurements from the same platform. This combination is expected to add significantly more information content and reduce the need for constraints or a priori information. Research is ongoing into each of the three retrieval strategies described above: aerosol-type-specific prior covariance matrices, profile retrievals, and combined lidar plus polarimeter retrievals. Additional sensitivity studies for these scenarios will be performed in the future.
The Supplement related to this article is available online at doi:10.5194/amt-9-5555-2016-supplement.

Figure 3. Like Fig. 1 but for the backscatter color ratio (which is the ratio of the aerosol backscatter coefficient at 532 nm divided by the aerosol backscatter coefficient at 1064 nm).

Figure 4. The degrees of freedom (DOF) of the signal, d_s, is shown color-coded, as orthogonal 2-D slices through the five-variable state space. The left graph shows the dependence on median radius and geometric standard deviation, with the complex refractive index held fixed at 1.47 − 0.00325i and the total number concentration held fixed at 1001 cm⁻³. The right graph shows the dependence on the complex refractive index (RRI is real refractive index and IRI is imaginary refractive index) with the total number concentration held fixed at 1001 cm⁻³, the median radius = 0.115 µm, and the geometric standard deviation = 1.475. Dependence on total number concentration is very slight and is not illustrated here.
Figure 5. The a posteriori correlation between retrieved total number concentration and median radius is here shown as a 2-D slice through the five-variable state space. The complex refractive index is held fixed at 1.470 − 0.00325i, the total number concentration is held fixed at 1001 cm⁻³, and the dependence on median radius and geometric standard deviation is depicted. Symbols show the values of median radius and geometric standard deviation for cases 1 (circle), 2 (square), and 3 (triangle), which also have the same complex refractive index as the illustrated slice.

Figure 6. Histograms showing the total number concentration value for all solutions in the gridded LUT (i.e., without interpolation) that match the backscatter and extinction coefficients of Case 1 within measurement errors of 5 % for backscatter and 20 % for extinction. The total number concentration value for Case 1 is marked with a dashed line. Red histogram bars show solutions from the full gridded LUT. Blue histogram bars show solutions from the modified LUT, which excludes size distributions that have a significant contribution from particles smaller than 50 nm radius.

Figure 7. Histograms showing the median radius for all solutions in the gridded LUT that match the backscatter and extinction coefficients of Case 1 within measurement errors of 5 % for backscatter and 20 % for extinction. The dashed line indicates the median radius for Case 1. Red and blue are as in Fig. 6.

Figure 9. Like Fig. 7 but for Case 2. The inset box shows the blue histograms (reduced solution set) with an expanded y axis scale for better readability.

Figure 10. The effect on the backscatter coefficient (solid lines) and extinction coefficient (dashed lines) of a narrow Aitken mode with median radius = 15 nm, s = 1.48, and varying number concentration, expressed as a fraction of the backscatter and extinction measured in a typical observation by HSRL-2 on 17 July 2012 during the TCAP campaign.

Figure 11. The effect on the backscatter coefficient (solid lines) and extinction coefficient (dashed lines) of 1000 cm⁻³ particles in a narrow mode (s = 1.48) with varying median radius, expressed as a fraction of the backscatter and extinction measured in a typical observation by HSRL-2 on 17 July 2012 during the TCAP campaign. The dotted line indicates the 50 nm particle radius cutoff discussed in the text.

Figure 12. The effect on the backscatter (red line) and extinction (blue line) coefficients at 532 nm of a narrow mode (geometric standard deviation 1.1) of large particles (radius varies along x axis) with number concentration 36 cm⁻³ and complex refractive index 1.57 − 0.0037i, using the same single scattering Mie modeling as before. The backscatter and extinction coefficients of the modeled coarse mode are in this case much larger than the benchmark values measured by HSRL-2 on 17 July 2012 during the TCAP campaign. The y axis is expressed as the coarse-mode backscatter or extinction divided by the benchmark measurement.
Figure 13. The a posteriori correlation between retrieved total number concentration and median radius is shown here as a function of median radius and geometric standard deviation, with the complex refractive index held fixed at 1.470 − 0.00325i and the total number concentration held fixed at 1001 cm⁻³. Similar to Fig. 5, but here number concentration is replaced by volume concentration as one of the five independent state variables. Significant differences compared to Fig. 5 can be seen for large effective radii, the upper right quadrant of the figure. Symbols are as in Fig. 5.

Table 1. State variables and selected derived variables for five constructed reference cases.

Table 2. Propagated state error covariance matrix for the first reference case, assuming measurement errors of 5 % for backscatter and 20 % for extinction and a priori covariance as described in Sect. 6 and Table 3.

Table 3. Propagated uncertainties (standard deviations) for state variables and selected additional variables derived from the state variables, shown for the reference cases described in Table 1. The uncertainties are shown as absolute values for all variables, with relative uncertainty in parentheses for the size distribution variables. The propagated uncertainties (Eq. 9) depend on assumed measurement errors of 5 % for backscatter and 20 % for extinction and depend on a priori covariance as described in the text. The assumed a priori uncertainty and the requirements described in the ACE white paper (also 1 standard deviation) are listed for comparison.

Table 4. Propagated uncertainties for Case 1, expressed as absolute values and percentage (for the size distribution parameters) for three different theoretical instrument configurations with different backscatter and extinction uncertainties. The last column repeats the draft requirements from the ACE white paper as in Table 3 for reference.

Table 5. State correlation matrix derived from the covariance matrix shown in Table 2, showing the correlations between retrieved variables for Case 1, assuming measurement errors of 5 % for backscatter and 20 % for extinction and a priori uncertainties from Table 3.

Table 6. Correlation matrix of the retrieved variables for Case 2, assuming the same measurement errors of 5 % for backscatter and 20 % for extinction and a priori uncertainties listed in Table 3.

Table 7 .
Like Table