Laboratory and In-flight Evaluation of a Cloud Droplet Probe ( CDP )

Please quantify the seven droplets sizes generated in the lab; otherwise it is unclear what you mean with “For the smallest diameters” (lines 5-6 of abstract) and with “For all larger diameters” (line 7). In the revised abstract we are explicit about the drop sizes tested. The revised abstract also provides quantification of the specific tests fkor different droplet sizes.


Introduction
In-situ cloud studies often utilize measurements from forward scattering optical particle counters (OPCs) to provide size and concentration information about cloud hydrometeors up to a few 10's of microns in diameter.The Particle Measuring Systems' Forward Scattering Spectrometer Probe (FSSP) and the Droplet Measurement Technologies' (DMT) Cloud Droplet Probe (CDP) are forward scattering OPCs used to measure hydrometeors of 1 -50 µm in diameter.These 25 instruments use an open-path laser and measure the intensity of scattered light from transiting particles and relate that to particle size utilizing Mie-Lorenz theory and assuming something about the particles (typically liquid and spherical).The instruments output measurements as cumulative binned counts of droplet diameter.Some instruments including certain versions of the CDP, Fast CDP (FCDP; SPEC, Inc.), and FAST-FSSP are also capable of providing the sizes and interarrival Several sources may contribute to OPC sizing and counting errors that in turn propagate through to higher moments (Dye and Baumgardner, 1984;Baumgardner et al., 1985;Cooper, 1988;Baumgardner and Spowart, 1990;Brenguier et al., 1998;5 McFarquhar et al., 2017;Wendisch and Brenguier, 2013).Sizing error can also result in artificial broadening of hydrometeor size distributions which can mistakenly be attributed to distribution-modifying cloud processes (Baumgardner et al., 1990).
The non-linear relationship between droplet diameter and the intensity of light scattered by a droplet limits the resolution of size bins (Pinnick et al., 1981).The CDP has a default bin width of 2 µm for diameters larger than 14 µm, which can result in as great as 15% uncertainty in diameter (Nagel et al., 2007).Mie resonance, which is more pronounced for the CDP's 10 unimodal laser, also introduces sizing uncertainty for droplet diameters smaller than 14 µm (Knollenberg, 1976;Nagel et al., 2007).
Laser intensity and droplet scattering angles vary based on droplet transit location (Dye and Baumgardner, 1985;Brenguier et al., 1998).Therefore, droplets are only counted and sized if they pass through the qualified sample area; an elliptical 15 region within the depth of field where laser intensity and droplet scattering angles are relatively homogenous.Nonetheless, laser intensity and scattering angles are somewhat variable even within the qualified sample area, resulting in counting and sizing error that is dependent on droplet transit location (Brenguier et al., 1998;Wendisch et al., 1996).Instrument-specific misalignment of optical components can increase spatial variability in counting and sizing accuracy (Lance et al., 2010).The cross-sectional area of the depth of field (DOF) is included in calculations of sample volume, such that errors in DOF will 20 propagate as a scaling bias in concentration (Wendisch et al., 1996).
Coincidence error is a concentration-dependent phenomenon that occurs when multiple droplets are simultaneously within the sensitive area of an OPC's laser.Coincidence can affect sizing and counting accuracy but errors can be difficult to characterize because they depend on many factors including particle concentration, particle size, the location that particles 25 transit the laser, and instrument optical design (Baumgardner et al., 1985;Cooper, 1988;Brenguier, 1988).
FSSP electronic limitations require an 'electronic delay sequence' for a period after particle detection.Particles passing through the qualified sample area during the delay sequence are not detected resulting in undercounting, or 'dead time losses', which requires algorithmic corrections to FSSP-measured concentration (Baumgardner et al., 1985;Brenguier et al., 30 1998;Baumgardner and Spowart, 1990;Brenguier et al., 1998).Newer OPCs including the FAST-FSSP, CDP, and FCDP feature faster electronics that negate dead time loss errors (Brenguier et al., 1998).Forward scattering OPC measurements require that several assumptions be made about cloud particles.OPC techniques assume that measured particles are primarily composed of water and therefore have refractive indexes equal to that of pure water (Pinnick et al., 1981).Particles must also be small enough (less than ~50 µm diameter) to follow Mie-Lorenz scattering theory for the laser wavelength of an instrument.Particle shape affects scattering behaviour, so it is assumed that liquid hydrometeors are spherical (Nagel et al., 2007).Several researchers have used the FSSP to study ice hydrometeors but 5 such measurements are subject to uncertainty imposed by the variability in ice particle shape (Gardiner and Hallett, 1985;Field et al., 2003).
In mixed and ice phase cloud, ice particles are prone to shattering on contact with OPC structures.If passed through the sample area, ice fragments can be erroneously identified as natural particles leading to errors in counting and sizing and an 10 artificial bimodality in hydrometeor distributions (Gardiner and Hallett, 1985;Korolev and Isaac, 2005).FSSP measurements can be greatly affected by shattering artifacts because the probe's laser is housed in a cylindrical shroud (Heymsfield, 2007;McFarquhar et al., 2007).The CDP features an open-path laser that is passed between two arms that are often outfitted with anti shattering tips.As a result, particle shattering introduces negligible uncertainty in CDP measurements, as demonstrated in work by Lance et al. (2010) and Khanal et al. (2018).15 The FSSP and CDP are often calibrated by passing glass microbeads or polystyrene spheres through the sample area.These methods have crude control of calibration media placement and concentration such that they are only capable of testing OPC sizing response.Because these methods have limited control of particle concentration, coincidence can compromise calibrations (Wendisch et al., 1996).Furthermore, the refractive indices for glass and polystyrene differ from that of water 20 requiring a correction be applied to calibration measurements (Nagel et al., 2007).Wendisch et al. (1996), Korolev et al. (1991), Nagel et al. (2007), and Lance et al. (2010Lance et al. ( , 2012) ) developed droplet generating calibration systems that can produce and precisely place a mono-disperse stream of droplets of a known size/frequency at discrete locations within an instrument's sample area.These systems can test locationally-dependent sizing/counting accuracy at specific locations throughout an instrument's sample area and in turn provide measurements of sample area dimensions . 25 This work uses a water droplet generating system to quantify CDP errors in counting and sizing resulting from variations in droplet scattering angles and laser intensity and misalignment of optical components.Seven droplet generator experiments with droplets of 9 -46 µm in diameter provide data for detailed evaluations of CDP performance at locations throughout the sampling area of the probe.This work is similar to the earlier work reported by Lance et al. (2010), but utilizes a wider range 30 of droplet sizes over the entire qualified sample area and a much higher resolution of measurements across that area.
Estimates of how errors in sizing and counting affect higher order moments are provided.Comparisons of in-situ CDPderived liquid water content (LWC) and bulk LWC measurements from a hotwire device provide an additional means of evaluating probe performance.

CDP Operating Principles
The CDP features two forward-protruding arms; one houses a 568 nm laser diode and the other contains a series of collecting optics and photodetectors.As the probe is flown through cloud, some droplets transit the laser and scatter energy.The collecting optics capture energy scattered by droplets in an ~12° arc, remove photons in the innermost ~4°, and focus the 5 remaining energy onto a beam splitter.The beam splitter divides the laser energy and passes it to a sizer photodetector that is covered by an 800 µm diameter pinhole mask and a qualifier photodetector that is masked by a rectangular slit.Responses from the two photodetectors are converted to digital counts ranging from 1 -4095 counts.Sizer responses are used to estimate droplet diameter through Mie-Lorenz theory.The qualifier's rectangular mask is designed to reduce the collection angles of the detector so that responses are maximized when droplets pass through the qualified sample area.A droplet is 10 considered to be within the qualified sample area (or a "qualified droplet") if the signal from the qualifier is greater than onehalf of the signal from the sizer.The CDP employs a dynamic sizer signal threshold in order to minimize false counting events resulting from impinging solar radiation or other sources of noise.This is accomplished by considering all sizer responses within a 10 Hz period that result in less than 512 digital counts.A noise band is defined as the region that contains at least 75% of responses with less than 512 counts.Sizing/counting events are rejected if sizer response is less than the 15 determined noise band (Lance et al., 2010;Droplet Measurement Technologies, 2014).
Standard coincidence occurs when multiple droplets are simultaneously within the qualified sample area of a CDP.OPCs are designed to count/measure a single particle at a time so standard coincidence results in undercounting and can also lead to oversizing due to the additional light scattered by coincident hydrometeors (Baumgardner et al., 1985;Cooper, 1988).20 Because the qualified sample area of the CDP laser is relatively small (on the order of 0.3 mm 2 ), standard coincidence is only expected to affect measurements in regions of high droplet concentration (Lance et al., 2010).
Originally, the sizer detector was unmasked, meaning that it was sensitive to light scattered by droplets transiting through a region surrounding the qualified sample area, called the extended sample area.Droplets passing through the extended sample 25 area cause insignificant qualifier detector responses so they are not counted or sized.A specialized form of coincidence, called extended coincidence, occurs when droplets are simultaneously within the qualified and extended sample areas (Lance et al., 2010 and2012).Coincident droplets within the extended sample area scatter additional light that can in turn result in oversizing of qualified droplets.Extended coincidence can also lead to undercounting if sizer response exceeds a threshold value.Lance et al. (2010) used a droplet generating calibration system to measure the qualified and extended sample areas 30 (SA Q and SA E ) using droplets with 12 and 22 µm diameters.The researchers found that SA E can be much larger than SA Q (20.1 mm 2 vs. 0.3 mm 2 ) resulting in errors from extended coincidence up to 60% oversizing and 50% undercounting in concentrations as low as 400 cm -3 (Lance et al., 2010).Results from Lance et al.'s 2010 study motivated the addition of an 800 µm diameter sizer pinhole mask that decreases the size of the extended sample area to ~2.7 mm 2 , thus reducing the occurrence of extended coincidence (Lance et al., 2012).It was concluded that extended coincidence introduces negligible uncertainty in droplet concentrations less than 650 cm -3 for CDPs featuring the sizer mask modification.oversized by 2 µm at the centre of the qualified sample area and that sizing accuracy for 12 and 22 µm droplets is dependent upon where droplets transit the qualified sample area.Droplets were undersized by as much as 74% in certain sample 10 locations and oversized by as much as 12% in others (Lance et al., 2010).It was found that on average, 12 and 22 µm droplets were counted to within 95% accuracy.Counting error is more severe at the edges of the qualified sample area as a result of photodetector signal noise (Lance et al., 2010).

University of Wyoming droplet generating system
The University of Wyoming (UW) Atmospheric Science Department developed a droplet generating calibration system very 15 similar to the system built by Lance et al. (2010Lance et al. ( , 2012) ) which is based on work by Korolev et al. (1991), Wendisch et al. (1996), andNagel et al. (2007).A detailed explanation of the design and operation of a droplet generating system can be found in Lance et al. (2010).The systems built by the UW team and Lance et al. (2010Lance et al. ( , 2012) ) use a piezoelectric print head to produce a mono-disperse stream of water droplets inside a glass flow tube.The flow tube contains a sheath flow that is accelerated in a tapered exit region, which by extension, accelerates and focuses suspended droplets into a precise stream.20 The accelerated droplet stream is then passed through a CDP's sample area at discrete locations.Two-axis positioning stages are used to control the point of droplet injection and provide the coordinates of injection locations.The UW droplet generator incorporates computerized stages, instead of the manual positioners used by Lance et al. (2010), in order to automate and expedite the testing procedure.

25
A high speed metrology camera outfitted with a 10X microscope objective provides an independent measurement of droplet diameter using the glare technique as described by Korolev et al. (1991), Wendisch et al. (1996), Nagel et al. (2007), and Lance et al. (2010).As droplets pass through the laser of the CDP, the left and right sides of the droplet are illuminated as a result of reflection and refraction.The metrology camera images these illuminated regions (glares) which appear as two parallel lines when using an exposure time of 1/1000 sec.Estimates of droplet diameter are obtained by considering the pixel 30 separation of glares, a pixel to distance conversion, and a formula that accounts for the angle of the camera objective relative to the laser (Wendisch et al. (1996) and Korolev et al. (1991).Using this technique, the UW system is capable of determining individual droplet diameters to within ±0.355 µm.
The metrology camera can also be used to estimate droplet velocity by capturing images with exposures on the order of 1/150,000 -1/300,000 second.Shorter exposure times produce glare images with well-defined start and end points.Droplet 5 velocity can be estimated by considering glare length, a pixel to distance conversion, and exposure time.The longitudinal position of glares can also be used to evaluate droplet placement precision.
A number of validation tests were performed to ensure that the UW droplet generator can produce droplets of consistent diameter for the amount of time required to conduct a test (~4 hours), precisely place droplets at discreet locations within the 10 sample area, and eject droplets at suitable velocities.Seven droplet generator tests that produced droplet diameters of 9 -46 µm are used to evaluate accuracy and consistency in droplet diameter.During the course of each test, glare images were captured once every second and a random sample of 80 images were analysed to provide distributions of true droplet diameter (D true ).Table 1 shows that standard deviation of D true is less than 0.7 µm for all seven tests.It also shows that all but one test produce 95 th -5 th percentile range of D true less than the 2 µm bin width of the CDP (for droplets larger than 14 µm).15 Two tests were conducted that used the deviation of glare position to validate droplet placement precision.To confirm that placement precision is similar along orthogonal axes, glare images of 32 µm droplets were captured for 1 hour with the metrology camera placed at 124.9º incident to the CDP laser and an additional hour at 214.9º incident.Glare position for a random sample of 50 images from each camera angle show that droplet deviation is similar along orthogonal axes.The 20 absolute deviation of glares is 5.7 µm along both axes and standard deviations are 1.5 and 1.7 µm for the 124.9º and 214.9º camera angles respectively.A separate experiment tested long term placement precision by analysing 80 random glare images captured over the course of a four-hour test.Droplet position for the sample has an absolute range of 11.4 µm and a 95 th -5 th percentile range of 9.3 µm.Approximately 8% of droplets were placed beyond 10 µm.

25
Droplet ejection velocity is validated by capturing images using exposure times of 1/150,000 -1/300,000 sec.It was found that when droplets were created and accelerated in a 13 l min -1 sheath flow, 40 µm diameter droplets cross the CDP laser at ~32 m s -1 .This velocity is only about 30% of typical University of Wyoming King Air research airspeeds but is greater than the minimum operational airspeed of the CDP (10 m s -1 ).  9, 17, 24, 29, 34, 38, and 46   The time required to complete a full test was in some cases as long as 5 hours (see Table 1).Stability of the droplet generator system over this time depends, in part, on the size of droplets being produced.Smaller droplets tend to result in reduced system stability and therefore required shorter test periods.For the five tests using droplets 24 µm and larger, the dwell time at each sample location was 2 seconds.This resulted in 500 droplets passing through the CDP sample area at each location.
For these same tests, a 10 µm by 10 µm grid of sample locations covered the entire test area, corresponding to 2700 discrete 15 sample locations across the approximately 0.27 mm 2 qualified sample area of the CDP.For the test using 17 µm droplets, the dwell time at each location was reduced by a factor of 2 and the grid resolution remained the same.The system was less stable when producing 9 µm droplets and required test times of less than 2 hours to ensure consistent droplet sizes and placement throughout the experiment.For this test, dwell time was further reduced such that 200 drops were placed at each location and the resolution of the grid was reduced to 30 µm by 20 µm, resulting in roughly 450 discrete locations across the 20 qualified CDP sample area.

CDP Sizing
CDP measurements for all droplets detected during a given test were used to produce a distribution of droplet diameters for that test.Droplet distributions were computed using number counts from each of the CDP's 30 pre-determined size bins.Bin widths are 1 µm for diameters less than 14 µm, and 2 µm for diameters greater than 14 µm.For each bin, we considered the 25 geometric mean diameter, hereafter referred to as D CDP .Also, for each test, 80 randomly selected droplet glares were analysed to determine a distribution of actual droplet diameters, D true .These droplets, when binned according to CDP size bins, resulted in a distribution of droplets, D true *. and another 30% were placed in the 8.5 µm bin.Nearly 90% of the randomly selected droplets were determined to have actual diameters between 8 and 10 µm, suggesting that the CDP undersized droplets in this range by about 1 to 2 µm.Table 2 shows that the absolute difference between mean D CDP and mean D true * was 1.3 µm.
Tests using 17 and 24 µm diameter droplets resulted in a better match between D CDP and D true *.For each test, the medians 5 and modes of D CDP and D true * were in the same bin and more than 95% of the droplets were contained in the same two bins.
However, the breadth of the distribution measured by the CDP was slightly larger than the actual distribution for the 17 µm test.And, for the 24 µm test, the distribution measured by the CDP was skewed to smaller sizes.For both tests, absolute differences between the means of the distributions were less than 1 µm (Table 2).10 For the 29, 34, 38, and 46 µm diameter tests, a steady trend of oversizing with increasing droplet diameter is apparent when comparing the normalized histograms of D CDP and D true * (Fig 1).In all cases, the mode diameter from CDP measurements was one bin larger than the true diameter mode.For the largest droplet test, 46 µm, 55% of the droplet diameters from the CDP fell in the 48 to 50 µm bin and another 10% fell in each of the 44 to 46 and 46 to 48 µm bins.More than 95% of the actual diameters were split roughly equally between the 44 to 46 and 46 to 48 µm bins.15 Skewing of the CDP-measured distribution to smaller sizes occurred for all tests using droplets 24 µm and larger.Further, the breadth of the CDP-measured distribution increased with increasing droplet diameter.This is perhaps more apparent from the data in the last column in Table 2.Here we compare the difference between the 95 th and 5 th percentile range for D true * and D CDP .The difference increased significantly for larger diameter tests.Interestingly, even though the difference in 20 both the mode and median diameters of the CDP-measured distributions (compared to D true *) were larger for these larger diameters, the absolute difference between the means of the distributions were quite small, roughly 0.1 to 0.2 µm.This is because, with the measured distributions skewed to smaller diameters, comparisons of mean diameters appeared to compare more favourably.

25
By matching the measured response of the CDP to the expected Mie scattering curve, it is possible to investigate whether the errors in sizing observed from the droplet generator tests may be accounted for by limitations due to Mie resonances or by uncertainty in scattering angle collection.The CDP's nominal collection angles are 4° to 12°.However, optical misalignment along with the physical dimensions of the probe's depth of field may lead to uncertainties up to 1° (Baumgardner et al., 2017).Figure 2 shows that the Mie response curve matches reasonably well with the CDP threshold counts that are used to 30 sort droplets into discrete size bins.Two ranges of scattering angles are considered and both show similar behaviour.In fact, regardless of which range of angles are considered, the error in sizing is expected to be, on average, nearly the same.Errors in drop sizing for individual drops, however, will vary depending on collection angles.by more than about 2 µm.However, smaller droplets less than about 20 µm in diameter may easily be mis-sized by more than 2 µm, accounting for as much as ±20% error in sizing (Baumgardner et al., 2017).
Results from the droplet generator tests overlaid on Figure 2 provide additional insight into CDP response.The mean and 5 th 10 to 95 th percentile range of D true illustrates that the droplets being produced nearly all fell within one size bin of the CDP for any given test.The corresponding MIE resonance curves over those same size ranges generally fluctuate over a range of A/D counts that correspond to threshold values of up to 2 to 3 size bins.This can be seen by examining the Mie response (4° -12°) for the test producing 29 µm drops.Over the range of droplet sizes produced, some locations of the Mie curve fall just below the threshold box (for 28-30 µm bin), while others fall slightly above and still other fall inside the box.15 The skewing of CDP-measured distributions to smaller sizes is also apparent by examining the CDP response compared to the threshold curves.For each test, the mean value of A/D counts (Figure 2) lies either within or very near the appropriate threshold box for that droplet diameter.However, the median value of A/D counts exceeds the threshold box for that droplet diameter for all tests using droplet diameters 29 µm and greater.This suggests that the calibration of the CDP is based upon 20 mean diameter of drops rather than the median or mode diameter.While this may be appropriate, because of the unnatural skewing to smaller sizes, it does have implications on calculations of higher order moments.The severe undersizing of a small sample of drops for these same tests cannot be explained based on Mie resonance or collection angle considerations.magnitude of the sizing difference laterally across the beam increased with increasing droplet size.For the 24 µm test, the sizing difference was only about 2 µm across the beam, but for the 46 µm test, the sizing difference was nearly 6 µm across the beam.Also, for each of these five tests, a region near the detector showed significant undersizing of droplets that also increased in magnitude with increasing droplet size.For the 46 µm test, droplets were undersized by as much as 30 µm.This region accounts for the skewing to smaller sizes of the distributions discussed earlier in this section.5 Columns three and four in Table 2 provide information about how sizing differences for each test impact higher moments of the droplet size distribution.For the 9 µm test, the volume-weighted mean diameter (VMD) measured by the CDP was 1.1 µm small compared to that computed from the D true * distribution, resulting in a 36.7%underestimate in LWC.For the 17 and 24 µm tests, the absolute difference between the actual and measured VMD was less than 0.25 µm and resulted in a roughly 10 8% overestimate and 2% underestimate in LWC for these droplets, respectively.For tests using droplets larger than 24 µm, the CDP oversized VMD from 1 to 1.5 µm, resulting in overestimates of LWC of 2.4 to 11%.Readers should note that errors in sizing by a given amount will have a much more significant impact on LWC for smaller droplets.However, for real measurements in cloud, it is often the larger droplets that carry the majority of the liquid mass.Therefore, these middle and larger sizes from 20 µm and greater are expected to have the greatest impact on LWC estimates from the CDP.15

Counting accuracy and qualified sample area measurements
Counting accuracy is evaluated by comparing CDP-recorded counts to the actual number of droplets based on print head ejection frequency and dwell time at each sample location.For all tests, droplets are counted to within 98% accuracy in ~95% of the sample locations.Experiments indicate that all sizes of droplets are undercounted around the perimeter of the qualified sample area, presumably as a result of sizer and qualifier signal noise (Lance et al., 2010).Figure 4 shows 20 locationally-dependent counting accuracy for 46 µm droplets where purple areas correspond to locations where the CDP recorded 10 -50% of actual counts, blue show locations where 50 -90% of actual counts were recorded, and green denotes where at least 90% actual counts were recorded.Only 46 µm droplets were overcounted, specifically in two isolated regions where droplets were overcounted by as much as 100%.The regions are located just left of the area where 46 µm droplets were significantly undersized (see Fig. 3g and discussion earlier).Overcounting in these regions contributes to less than 1% 25 overall count error because they occupy less than 1% of total SA Q .
Figure 5 shows SA Q calculated by summing the individual areas of sample locations that received a certain percentage of actual counts.SA Q is calculated three times for each test by constraining which sample locations are considered to those that received at least 10, 50, and 90% actual counts (SA Q _ 10% , SA Q _ 50% , SA Q _ 90% ).Evaluating SA Q using this count threshold 30 method provides uncertainty ranges of SA Q and accounts for the fact that ~8% of droplets were placed beyond sample area bounds.The mean value of SA Q_50% considering all tests is 0.269 mm 2 ; compared to a value of 0.30 mm 2 provided by the manufacturer.SA Q_50% varies 0.03 mm 2 across the range of droplet diameters tested.It is smallest for 9 and 17 µm droplets, reaches a maximum of 0.28 mm 2 for 24 µm droplets, and then decreases to 0.27 mm 2 for 46 µm droplets.The range of SA Q_10% to SA Q_90% is smallest for the largest droplets, most likely because detector noise is less of a consideration for larger droplets that scatter relatively more light and hence provide a greater detector response.The test using 9 µm droplets shows 5 the greatest difference between SA Q_10% and SA Q_90% , but it should be noted that SA Q variability is likely exaggerated by the course spatial resolution used for that experiment.
For calculations of number concentration and higher moments, SA Q can be provided by either using a fixed value equal to the mean for all droplet sizes (solid red line in Figure 5) or by using a variable value based on a second degree polynomial fit 10 (blue curve in Figure 5).To explore the impact of employing a fixed vs. variable SA Q , three Poissonian droplet distributions with means of 10, 25, and 35 µm are prescribed.The concentration of each distribution equals 100 cm -3 when calculated with a fixed SA Q_50% of 0.27 mm 2 .Table 3 illustrates how using a fixed vs. variable SA Q affects concentration and LWC.It shows that choice of SA Q type most affects concentration and LWC for the distribution with a 10 µm mean diameter.Using a variable SA Q results in 6% greater concentration and ~4% greater LWC.For distributions with greater mean diameters, the 15 choice of using a fixed or variable SA Q results in less than 3% difference in concentration and LWC.
It seems best to calculate higher moments using a fixed SA Q of 0.27 mm 2 , given that the choice of SA Q type has relatively little impact on concentration or LWC.Furthermore, the second-degree polynomial fit used to model variable SA Q does not completely capture variations in SA Q for droplets with diameters between 20 to 30 µm and requires extrapolation of SA Q for 20 droplets with mean diameter less than 9 µm or greater than 46 µm.

Comparisons of liquid water content from the CDP and Nevzorov probes
In-situ data collected by aircraft are used to further investigate uncertainty in CDP measurements by comparing LWC derived from CDP measurements to bulk LWC measured by a Nevzorov hotwire probe.Comparisons of in-situ LWC measurements provide an independent evaluation of CDP performance and an indication of how error in real-world CDP 25 measurements compares to laboratory droplet generator results.

The University of Wyoming King Air
The University of Wyoming King Air (UWKA) is a Beechcraft Super King Air modified to carry a variety of atmospheric in-situ and remote sensors capable of collecting information about atmospheric thermodynamics, dynamics, and cloud particle properties (Wang et al., 2012).In the following, we utilize measurements from two field campaigns conducted in 30 USA (French et al., 2018).The majority of clouds sampled in both PACMICE and SNOWIE were mixed phase.5

Constant temperature hotwire probes
The UWKA carries both a DMT LWC-100 and a deep-cone Nevzorov constant temperature hotwire probe.Both provide estimates of bulk cloud water content utilizing changes in current supplied to heated elements that are exposed to impacts of cloud particles (King et al., 1978;Baumgardner et al., 2017).Element temperature is maintained near 100 C such that impinging particles will vaporize transferring energy from the element through the effects of sensible and latent heating.10 Control circuitry maintains element temperature by altering the power supplied using element resistance as a proxy for temperature.Measurements of water content are obtained by relating the power required to maintain element temperature as particles are vaporized to the sensible and latent heat capacities of water, and element surface area (King et al., 1978;Korolev et al., 1998).

15
Convective losses due to moist airflow over the sensor also transfer energy from collector elements and can be quite large at aircraft flight speeds (King et al., 1978;McFarquhar et. al, 2017).The Nevzorov probe features reference elements that are positioned on the devices' trailing edge such that they are aerodynamically shielded from particle impact (Korolev et al., 1998;Strapp et al., 2003).Energy losses from the reference elements are then assumed to arise solely due to convective considerations and thus the total power delivered to the reference elements can be used to estimate the convective heat losses 20 from the sensing (collector) elements.The relationship between collector and reference element convective losses depends on airspeed and density (Korolev et al., 1998, Abel et al., 2014).Data collected during clear air calibration manoeuvres are used to compute the ratio of collector to reference power and determine how the ratio varies with airspeed and density.The manoeuvres are typically flown at several flight levels over a range of airspeeds.Any inaccuracy in the estimate of convective heat losses in the collector sensor based on power delivered to the reference sensor results in baseline drift of the 25 Nevzorov-derived LWC (LWC NEV ) measurement (Abel et al., 2014).For the data used herein, the effectiveness of the Nevzorov data processing method was evaluated using ~60,000 out-of-cloud points.LWC NEV residual (i.e.departure from zero when not in-cloud) was used to determine uncertainty in baseline LWC.LWC NEV baseline uncertainty is estimated to be no greater than 0.05 g m -3 (the 95 th -5 th percentile range of residual LWC) and minimum detectable LWC NEV is +0.02 g m -3 (95 th percentile residual LWC).30 The Nevzorov is capable of measuring both LWC and total condensed water content (TWC) using two collector elements with different geometrical designs (Korolev et al., 1998).Estimates of ice water content (IWC) can then be obtained by differencing the two measurements.The LWC element is in the shape of a thin rod designed to only evaporate liquid particles.Ice particles shatter on impact with the sensor and are swept away before significant melting or evaporation can occur (Korolev et al., 1998).The TWC collector has a 'deep inverted cone' shape designed to capture both liquid and ice particles (Korolev et al, 2013).Korolev et al. (1998) showed that in mixed phase conditions, interactions between the LWC collector and ice particles can result in LWC overestimation on the order of 12% IWC. 5 In some conditions, collection efficiency may be significantly less than unity resulting in underestimation of LWC NEV .
Because airflow diverges in the vicinity of the LWC collector, LWC NEV may be underestimated by as much as 30% in droplet populations with VMD less than 8 µm since particles with insignificant mass are unable to cross the divergent streamlines and impact collector elements (Korolev et al., 1998).Collection efficiency also departs from unity for droplets 10 with VMD greater than 30 µm because larger droplets tend to splatter on impact leading to incomplete evaporation (Schwarzenboeck et al., 2009).

Dataset overview
Data were used from 29 research flights from both PACMICE and SNOWIE.Measurements from both probes were filtered to 1 Hz.Here we select only those data points in which both LWC NEV and LWC CDP exceeded a threshold value of 0.05 g m -3 .15 To minimize uncertainty due to presence of ice hydrometeors in CDP and Nevzorov LWC measurements, the IWC from the Nevzorov was used to select periods of liquid phase only penetrations from PACMICE and SNOWIE missions.However, IWC estimates from the Nevzorov may be affected by as-of-yet uncharacterized sources of uncertainty such that one cannot conclude the dataset used here is completely devoid of mixed-phase penetrations.Nonetheless, uncertainty in LWC resulting from the presence of ice is expected to minimally impact results.LWC NEV is subject to overestimation of less than 12% 20 IWC, which is often small compared to LWC in mixed phase cloud.It has been also been demonstrated that the CDP is minimally effected by ice shattering artifacts (Lance et al., 2010, Khanal et al., 2018).
The resultant data subset used in the comparison contains 17,917 1 Hz in-cloud points.Droplet concentrations encountered during SNOWIE were uncharacteristically low for continental clouds.Mean droplet concentration for the dataset is 113.6 25 cm -3 with 50% of data points having concentration less than 50 cm -3 .Consequently, droplets were relatively large; with an average VMD of 22.2 µm and 1 st and 3 rd quartiles of 16.7 and 27.7 µm.Nearly all measurements were taken in supercooled conditions, the environmental temperature range for the 5 th and 95 th percentile is -18.7 and -1.3 °C.

In-situ results
For each 1 Hz data point, measured spectra from the CDP were used to compute the total droplet concentration and the VMD 30 of the spectra.The data were first subdivided based on droplet concentration and then further divided based on VMD.LWC CDP is greater than LWC NEV ) for four different ranges of droplet concentration.The points represent mean percent LWC difference calculated using a best fit line for all points in 5 µm wide bins (5-10 µm, 10-15 µm, 15-20 µm, etc.); error bars are root mean squared error.Green hatched areas are estimates of expected percent LWC difference using droplet generator results computed for errors in CDP sizing and counting for 10% and 90% count thresholds and errors considering best estimates of Nevzorov collection efficiency as a function of VMD (from Korolev et al., 1998;Strapp et al., 2003;5 Schwarzenboeck et al., 2009).Plots are shown for 4 ranges of total droplet concentration.The mean percent LWC difference and the number of data points (n) are shown in the bottom-right corner of each plot.
For all VMDs larger than 10 µm, LWC CDP exceeded LWC NEV by as much as 40%.For VMDs less than 10 µm, LWC CDP was less than LWC NEV by 5 -10% for those cases in which total droplet concentrations were less than about 400 cm -3 .The 10 general trend, for all droplet concentrations, suggests increasing LWC CDP (compared to LWC NEV ) for increasing VMD.
However, the mean difference in LWC, across all VMDs, does not indicate any specific trend when considering different ranges of total droplet concentration.
Estimates of percent LWC difference expected based on results from droplet generator tests and Nevzorov collection 15 efficiency estimates predict that LWC CDP should be at most 11% greater than LWC NEV .However, when all of the data in this study are considered, the mean percent difference is 19.6%.Two striking features of the data show that the percent difference for large VMD, greater than about 25 -30 µm, is considerably larger than expected.And, for droplet concentrations greater than 400 cm -3 , the percent difference is significantly larger than expected for all VMDs.The larger than predicted difference between LWC CDP and LWC NEV is unlikely to be a result of coincidence error.The UWKA CDP 20 features a sizer pinhole mask modification such that it is expected to be relatively unaffected by coincidence in concentrations less than 600 cm -3 (Lance et al., 2012).Figure 6d shows that mean percent LWC difference for data with concentration of 400 -1600 cm -3 is not significantly different than mean values for much smaller total droplet concentrations.On the other hand, for this concentration range, percent LWC difference is significantly larger for smaller VMDs when compared to similar VMDs for lesser droplet concentrations, suggesting that those CDP measurements may 25 indeed be impacted by coincidence for these higher concentrations.Regardless, these data account for less than 4% of all points and suggest coincidence is unlikely to account for differences across all ranges of concentration.
The droplet generator tests used a near constant droplet velocity that was ~30% of typical UWKA airspeeds.They provide no information about how CDP sizing/counting accuracy and SA Q may vary with airspeed.Some of the discrepancy between 30 estimated and actual percent LWC difference could be a result of a change in CDP performance at typical aircraft flight speeds which could result from limitations in photodetector response (Dye and Baumgardner, 1984).However, in order for airspeed-dependent errors in sizing and/or SA Q to account for the discrepancies shown, increased flight speeds would need to result in an increase in sizing (and hence photodetector output for the sizer signal) and/or an increase in SA Q , both of which are unlikely outcomes; one might expect the opposite behaviour.On the other hand, overcounting could increase with increasing particle velocity if photodetector response limitations result in more significant signal noise.But it seems unlikely that such considerations could cause overcounting on the order of 5 -20% given that only 46 µm droplets were overcounted (by less than 1%) during droplet generator tests.

5
It is possible that the discrepancy between estimated and actual percent LWC difference could be a result of a change in counting/sizing behaviour for droplets passing through the qualified sample area region where droplets are severely undersized (blue areas in the rightmost 10% of the beam maps shown in Fig. 3).Sizer responses are characteristically within the noise band range (less than 512 digital counts) for droplets transiting these regions; thus, severely undersized droplets could be rejected during 'real-world' operation.If LWC CDP error estimates (as described in section 3.2) are recalculated 10 excluding these regions where droplets are severely then the resultant oversizing throughout the rest of the sensitive sample area could result in as much as 17% overestimation of LWC CDP (effectively shifting upward the hatched green areas in Figure 6 for large VMD).
Error in Nevzorov measurements could also contribute to the discrepancy between LWC CDP and LWC NEV .Instrument icing 15 was a common issue during SNOWIE.The 0.05 g m -3 threshold applied to LWC CDP and LWC NEV was used to exclude measurements taken when one (or both) of the instruments was (were) completely unresponsive.In the case of ice accumulation on the Nevzorov sensing element, build-up of rime ice near (or over) the LWC element often results in significant baseline drift along with an accompanying reduction in sensitivity to liquid water (due to changes in airflow and shielding of the sensing element).Such situations would result in an underestimation of LWC by the Nevzorov and could 20 explain some of the differences shown in Figure 6.However, examination of baselines prior to and after exiting clouds suggests this is not a large problem for the cases examined.Regardless, nearly all measurements were obtained in supercooled conditions so the data used in this study were not able to be further subdivided to investigate differences in regions where temperature greater than 0 °C would exclude possibility of icing.

Summary 25
A droplet generating calibration system was used to test the sizing and counting performance and provide measurements of the qualified sample area of the UWKA CDP using seven discrete droplet diameters ranging from 9 -46 µm.Experiments reveal that droplet sizing accuracy varies depending on where droplets transit the sample area and the size of the droplets.
Errors in sizing for the majority of droplets across the size ranges tested can be accounted for by the amplitude of Mie resonances on the response curve.The Mie resonances often result in an artificial broadening of the distribution by 1 -2 30 bins.How much broadening occurs depends on droplet size and the actual range of collection angles for the probe.This finding confirms results of earlier studies (Rosenberg et al., 2012;Baumgardner et al., 2017).Droplets with nominal The tests also reveal that for droplets 24 µm and larger, nearly all droplets passing through 10% of the qualified sample area (that portion closest to the detector) are undersized, by as much as 30 µm, depending on the droplet diameter.However, 15 droplets passing through much of the rest of the sample area are oversized.The locationally-dependent nature of sizing accuracy results in artificial spectral broadening of droplet size distributions, which is most pronounced for droplets with diameters 34 µm and larger.Although droplets are oversized by 2 -4 µm in most locations within the qualified sample area, the resulting errors in higher order moments such as mean diameter, VMD, and LWC, are mostly offset by undersizing of droplets throughout the rest of the sample area.This has implications for how sizing should be calibrated for the CDP.For 20 example, matching distribution modes when performing calibrations will result in an underestimation for higher order moments because distributions are artificially skewed.Conversely, calibrations that match mean droplet diameter will result in an over estimation of the diameter of the droplet distribution mode in real clouds.
Droplets were counted to within 98% accuracy over roughly 95% of sample locations.Only the largest droplets tested, 46 25 µm, indicated any significant over counting.This occurred in two regions bordering the area where these same droplets were significantly undersized.However, these regions account for less than 1% of the total qualified sample area so they introduce less than 1% overall count error.All sizes of droplets are undercounted around the perimeter of the qualified sample area and this must be considered when defining SA Q for higher moment calculations.SA Q 50% varies only 0.03 mm 2 depending on droplet diameter and thus the use of a mean of SA Q 50% for all droplet sizes (0.27 mm 2 ) is warranted.SA Q for the CDP used 30 in this work and the CDPs examined by Lance et al. (2010Lance et al. ( , 2012) ) agree to within 10%.
Comparisons of in-situ LWC measurements from the CDP and a Nevzorov hotwire probe provide another means of evaluating CDP performance.In-situ comparisons show that, on average, LWC CDP is greater by about 20% whereas droplet  Table 1.Res.
[     ).Fixed SA Q 50% concentration is not shown because it equals 100 cm -3 for all distributions.Uncertainty is equal to 1/2 the range of each parameter when calculated with SA Q 10% and SA Q 90% .
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.times of individual particles.Cloud particle sizes and counts are used to construct size distributions and calculate higher distribution moments.

5
Lance et al.'s (2010) droplet generator work also tested CDP sizing and counting accuracy throughout the qualified sample area (at a spatial resolution of 200 x 20 µm) using 12 and 22 µm droplets.Ten additional tests investigated sizing accuracy at the centre of the qualified sample area using droplet diameters of 8 -35 µm.It was shown that droplets are systematically Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License. 4 Results of Droplet Generator Tests on the CDP 4.1 Experimental design To quantify uncertainty in CDP measurements of droplet counting and sizing, seven tests using nominal droplet diameters of µm provided measurements over most of the size range detectable by the CDP.For each test, droplets were injected at fixed locations through the qualified sample area of the CDP.Droplets were injected at a frequency 5 of 200 Hz for 9 µm droplets and 250 Hz for all other sizes.Following a dwell time at a given location, the position of the droplet injector relative to the CDP sample area was moved a small distance.The tests proceeded in this fashion, injecting droplets throughout the entire qualified sample area of the CDP.The start and end times at each location were recorded.Post-test, 5 Hz data from the CDP were synchronized to match droplet location and CDP measurements. 10

Figure 1
Figure1shows distributions of normalized frequency for D CDP and D true * for each of the seven tests.In general, the mode 30 diameter of the distribution based on sizing from the CDP (D CDP ) was within one to two size bins (1 to 4 µm) of the D true * mode.For the test using 9 µm droplets, more than 50% of the droplets detected by the CDP were placed in the 7.5 µm bin Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.The shaded region in Figure 2 illustrates the range of threshold counts that the CDP uses to determine the size bin for an individual drop.Regions where the Mie curve(s) lie within the shaded regions are locations where a drop will be sized 'correctly'.If the Mie curve is above the shaded region, the drop will be oversized; below the shaded region it will be undersized.The amplitude of the Mie resonances and the locations of the peaks and valleys depend on droplet diameter and vary with collection angles.Generally, the amplitude of the Mie resonances increases with increasing drop size, however, so 5 does the 'steepness' of the curve.Therefore larger droplets, 40 to 50 µm in diameter, should not be undersized or oversized

Figure 3
Figure 3 illustrates sizing from the CDP across the entire qualified sample area.For each sample location, the mean of D CDP 25 is compared to the mean of D true *.A positive difference (warm colours) indicates oversizing by the CDP, negative values (cool colours) indicate undersizing.For the 9 µm diameter tests, undersizing of droplets by 1 to 2 µm was found throughout most of the CDP sample area.Droplets passing through the centre of the beam and laterally towards the top experienced somewhat less undersizing than in other regions.No regions indicated oversizing of droplets.For the 17 µm tests, droplets throughout much of the sample area were sized correctly.Only a small region, laterally towards the top of the beam and 30 towards the detector, were droplets oversized, on average by about 1 µm.
late 2016 and early 2017.The Precipitation and Cloud Measurements for Instrument Characterization and Evaluation Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.(PACMICE) campaign began in August 2016 and lasted until May 2017, with flights over eastern Wyoming and western Nebraska, USA.It focused on collecting cloud and precipitation measurements in precipitating stratiform and convective systems primarily in the shoulder seasons.The Seeded and Natural Orographic Wintertime clouds -the Idaho Experiment (SNOWIE) occurred during January -March 2017, and focused on wintertime orographic clouds in southwestern Idaho, Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2017-479Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 30 January 2018 c Author(s) 2018.CC BY 4.0 License.diameter of 9 µm are undersized by 1 µm or less for roughly 33% of the droplets sampled and are undersized by 1 -4 µm for the remaining 66% of droplets.Errors in droplet sizing for 9 µm droplets do not depend strongly on where droplets transited the sample area.The errors in sizing for these smallest droplets are likely related to the amplitude of Mie resonances compared to the relatively shallow slope of the Mie function.Droplets with diameters of 17 and 24 µm are sized to within 2 µm of the true droplet diameter for nearly all droplets sampled (>90%), but there appears to be a small lateral dependence 5 within the sample area on errors in sizing, such that droplets passing through the top half of the sample area are sized larger than those transiting through the bottom half.Tests for droplets with diameters 29, 34, 38, and 46 µm reveal more significant oversizing, by as much as 2 -4 µm, with an even stronger lateral on sizing error.Droplet generator experiments performed by Lance al. (2010) using 12 and 22 µm droplets reveal a similar lateral gradient in sizing accuracy.The researchers attributed this behaviour to a misalignment of the qualifier detector mask, however this consistent behaviour 10 across different probes might indicate a problem with the optical design.Similar tests on two other CDPs using the University of Wyoming Droplet Generator system revealed similar lateral dependencies.

5 D
Droplet generator test characteristics including the number of droplets injected at each sample location, longitudinal and latitudinal resolution, test duration, mean droplet diameter from glares (mean D true ), and the 95 th to 5 th percentile range of D true .true statistics are from 80 randomly-selected glare images.
in several distribution parameters when calculated using CDP-recorded droplet diameter (D CDP ) vs. diameter from glares rounded to the geometric mean of CDP size bins (D true *).A positive difference (or positive percent difference) indicates that calculations using D CDP result in a larger value than D true *.Percent LWC difference is calculated by 5 comparing the integrated 3 rd moment of normalized D CDP distributions vs. normalized D true * distributions.

Fig. 2 .
Fig. 2. Mie response scaled to CDP A/D counts computed for 4°-12° collection angles (red) and 5°-13° collection angles overlaid on the CDP A/D threshold (shaded blue) that is used to bin individual drops.Black dots show mean droplet diameter (D true ) for the 7 5