Evaluation of two Vaisala RS 92 radiosonde solar radiative dry bias correction algorithms



Introduction
Water vapor (WV) is an important driver of weather and climate phenomena. Numerous studies have focused on modeling processes associated with water vapor and evaluating and improving water vapor observations (e.g., Ferrare et al., 1995, 2006; Revercomb et al., 2003; Suortti et al., 2008; Krämer et al., 2009; Moradi et al., 2013a, b). Accurate measurements of water vapor are especially crucial in the upper troposphere; although very little water vapor is present in this part of the atmosphere (e.g., Ferrare et al., 2004), processes such as cirrus cloud formation and maintenance (Liou, 1986) and maintenance of stratospheric water vapor (e.g., Jensen et al., 1996a, b; Hartmann et al., 2001) require very accurate knowledge of the upper-tropospheric water vapor budget. Our understanding of dynamic, thermodynamic, and radiative processes, and even cloud water vapor budget, is impacted by the quality of water vapor measurements (Starr and Cox, 1985; Guichard et al., 2000; Wang and Zhang, 2008).
Published by Copernicus Publications on behalf of the European Geosciences Union.

A. M. Dzambo et al.: Comparing radiosonde humidity correction algorithms
Vaisala RS92 radiosondes have been launched by research and operational centers for over a decade and, compared to most ground and space-based instruments, provide very high (∼10 m) vertical resolution. The RS92 radiosonde utilizes two thin-film capacitive elements to measure water vapor, wherein the capacitance measured by the radiosonde is proportional to the number of water vapor molecules that are in contact with the sensor. The resulting relative humidity (RH) measurement is taken as a function of this capacitance and the air temperature, which is measured by a separate thin capacitive wire sensor. While in flight, one of the RH sensors measures WV while the other RH sensor is artificially warmed to prevent ice buildup on the sensor; this process alternates between sensors. Unlike its predecessors (such as the RS80 radiosonde), the RH sensor is not shielded from solar radiation. If the RH sensor is warmer than the ambient environment due to solar heating, then the measured RH (as computed by Vaisala's DigiCORA® software) will be lower than its actual value. Many correction algorithms have been developed (e.g., Vömel et al., 2007b; Cady-Pereira et al., 2008; Yoneyama et al., 2008; Miloshevich et al., 2009; Wang et al., 2013) to correct for this solar radiative dry bias (SRDB). Nearly all of the aforementioned algorithms correct RH as a function of pressure, solar elevation (zenith) angle, and/or RH itself.
Two of the most widely used correction algorithms come from the work of Wang et al. (2013) and Miloshevich et al. (2009); for brevity, these will be referred to as WANG and MILO hereafter. WANG used Global Climate Observing System (GCOS) Reference Upper-Air Network (GRUAN) data (Seidel et al., 2009; Dirksen et al., 2014) to develop and test their RS92 correction algorithm. This physically based correction is expressed in terms of the following quantities: T is the sonde-measured air temperature, hf is a heating factor (set to 13), cf is a correction factor (set to 0.4 below 500 hPa and 0.6 above 500 hPa) that accounts for both clear skies and cloud cover, and T_CORR_RSN is a temperature correction given by Vaisala (http://www.vaisala.com/en/products/soundingsystemsandradiosondes/soundingdatacontinuity/RS92DataContinuity/Pages/revisedsolarradiationcorrectiontableRSN2010.aspx). Note that T_CORR_RSN accounts for pressure and solar zenith angle.
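To make the roles of hf, cf, and T_CORR_RSN concrete, the following is a minimal sketch, not the WANG implementation itself. It assumes that the correction rescales the measured RH by the ratio of saturation vapor pressures at the estimated sensor temperature (T + hf × cf × T_CORR_RSN) and at the air temperature T. The Magnus-type saturation vapor pressure formula, the reading of "below 500 hPa" as pressures greater than 500 hPa, and all function names are assumptions made for illustration.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Magnus-type approximation over liquid water (assumed here, not taken from the paper)."""
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))

def correction_factor(pressure_hpa):
    """cf from the text: 0.4 below the 500 hPa level, 0.6 above it.
    Assumption: 'below' means lower altitude, i.e. pressure greater than 500 hPa."""
    return 0.4 if pressure_hpa > 500.0 else 0.6

def srdb_corrected_rh(rh_percent, temp_c, pressure_hpa, t_corr_rsn_k, hf=13.0):
    """Hypothetical form of the correction: scale RH by e_s(T + dT) / e_s(T),
    where dT = hf * cf * T_CORR_RSN is the assumed solar heating of the RH sensor."""
    delta_t = hf * correction_factor(pressure_hpa) * t_corr_rsn_k
    ratio = (saturation_vapor_pressure_hpa(temp_c + delta_t)
             / saturation_vapor_pressure_hpa(temp_c))
    return rh_percent * ratio

# Example: 50 % RH at -40 degC and 300 hPa with a 0.2 K Vaisala temperature correction.
corrected = srdb_corrected_rh(50.0, -40.0, 300.0, 0.2)
```

Under these assumptions a warmer sensor always yields a moistening correction, which matches the sign of the dry bias described above.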
The MILO correction was developed using cryogenic frost-point hygrometer (CFH), microwave radiometer (MWR), and reference humidity probes during the 2006 Water Vapor Validation Experiment Satellite/Sondes (WAVES) campaign (Vömel et al., 2007a). MILO consists of an empirically developed correction expressed in terms of G(P, RH), an empirically derived function given as a "look-up" table of coefficients in Miloshevich et al. (2009), and RH_TLAG, the original RH data corrected for time lag. The MILO correction also includes a correction based on solar zenith angle (Eq. 4), which is applied to Eq. (3): solar radiation error (SRE) is dependent on solar altitude angle (α) and expressed as a fraction of the SRE at 66°, which represents the mean solar zenith angle for the daytime CFH/RS92 soundings during WAVES (Miloshevich et al., 2009). A comparison of these two correction algorithms in a typical atmospheric sounding is given in Fig. 1.
In 2011, Vaisala upgraded its DigiCORA® software to version 3.64, which included its own time-lag and SRDB correction algorithm. Although the details of this algorithm are not freely available to the public, it is possible to deactivate the time-lag and SRDB corrections during configuration of the sonde. We note that for results shown later in this study, the RS92 RH data are not corrected for time-lag error because the average change in RH between time-lag corrected and non-time-lag corrected data is almost always around 0 % and at most around 2 % for 25 hPa bins (results not shown). This study focuses on RS92 radiosondes collected before this change to the DigiCORA software was made.
We evaluate the WANG and MILO SRDB corrections at sites maintained by the Department of Energy's (DOE) Atmospheric Radiation Measurement (ARM) program (Ackerman and Stokes, 2003; Mather and Voyles, 2013), at which numerous instruments are deployed that will aid in this evaluation. We use data from the ARM sites in the Southern Great Plains (SGP) in Lamont, OK, USA, North Slope Alaska (NSA) in Barrow, AK, USA, and the tropical western Pacific (TWP) on Nauru Island, Republic of Nauru (Stokes and Schwartz, 1994). We also use ARM data collected during a 3-month experiment at a 5300 m m.s.l. site at Cerro Toco (CJC) in northern Chile (Turner and Mlawer, 2010). Utilizing several distinct climate locations ensures a more accurate and in-depth analysis of the two correction algorithms.

Comparing the correction algorithms directly
The two correction algorithms were applied to RS92 data launched at the SGP, NSA, TWP, and CJC sites. These data spanned all months of the year. The mean change in water vapor mixing ratio as a function of height (relative to the original radiosonde measurement) for each site is shown in Fig. 2. The largest difference between the two correction algorithms is in the middle and upper troposphere above 7 km, where the MILO algorithm moistens the original radiosonde much more than the WANG correction; the difference between MILO and WANG approaches a factor of 1.8 by 14 km. Given the sensitivity of the outgoing long-wave radiation to changes in upper-tropospheric water vapor (e.g., Ferrare et al., 2004), understanding which of these corrections is more appropriate is very important. However, a close inspection of Fig. 2 also shows that the WANG correction moistens the radiosonde slightly more than the MILO correction in the lowest 2 km for the moister tropical and midlatitude sites.
We compare the precipitable water vapor (PWV) values derived from integrating the moisture profiles from the original and corrected radiosonde profiles with those retrieved from the ARM two-channel MWRs using the so-called "MWRRET" algorithm (Turner et al., 2007). ARM has used the MWR-retrieved PWV as a "standard" for correcting for first-order radiosonde biases (Turner et al., 2003; Cady-Pereira et al., 2008), calibrating its Raman lidar (Turner and Goldsmith, 1999), and evaluating infrared radiative transfer models (e.g., Turner et al., 2004).
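As an illustration of how PWV can be derived by integrating a radiosonde moisture profile, the sketch below applies the standard hydrostatic relation PWV = (1 / (ρ_w g)) ∫ q dp with a trapezoidal rule. The function name, the input layout (pressure in hPa ordered from the surface upward, mixing ratio in g/kg), and the constants are assumptions for illustration; the integration actually used for the ARM data may differ in detail.

```python
def precipitable_water_cm(pressure_hpa, mixing_ratio_g_per_kg):
    """Trapezoidal integration of specific humidity over pressure.
    Assumes the profile is ordered from the surface upward (decreasing pressure)."""
    gravity = 9.80665        # m s^-2
    rho_water = 1000.0       # kg m^-3
    column_mass = 0.0        # kg of water vapor per m^2
    for i in range(len(pressure_hpa) - 1):
        w0 = mixing_ratio_g_per_kg[i] / 1000.0
        w1 = mixing_ratio_g_per_kg[i + 1] / 1000.0
        # Convert mixing ratio to specific humidity.
        q0 = w0 / (1.0 + w0)
        q1 = w1 / (1.0 + w1)
        dp_pa = (pressure_hpa[i] - pressure_hpa[i + 1]) * 100.0
        column_mass += 0.5 * (q0 + q1) * dp_pa / gravity
    # Depth of liquid water in meters, converted to centimeters.
    return column_mass / rho_water * 100.0

# Example: a 100 hPa thick layer with a constant 10 g/kg mixing ratio.
layer_pwv = precipitable_water_cm([1000.0, 900.0], [10.0, 10.0])
```

Applying the same function to the original and the corrected profiles gives the PWV values that would then be compared against the MWR retrieval.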
The comparisons of the radiosonde PWV values with those from the MWR (Fig. 3) show that the original uncorrected radiosondes have a dry bias that increases as the PWV increases. Table 1 summarizes the median and standard deviations; in an effort to remove outliers, values that were below/above the 5th/95th percentile were removed before computing the PWV biases. Figure 3a1 shows that the mean PWV from the original radiosondes at SGP is approximately 0.35 cm drier than the MWR-retrieved value in the 4.25-4.75 cm bin; however, the WANG-corrected radiosonde, while moister than the original radiosonde, still has a slight dry bias of 0.10 cm relative to the MWR in this bin (Fig. 3a3). The magnitude of the PWV bias generally increases when more PWV is present in the atmosphere. Both the WANG and MILO corrections increase the sonde's derived PWV and result in much better agreement with the MWR. This result is consistent with the findings in Yu et al. (2015), where MWR retrievals of PWV and PWV derived from WANG-corrected RH data were found to be within the uncertainty of the MWR instrument (which is ∼0.07 cm; Turner et al., 2007).
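The outlier removal and bias statistics described above can be sketched as follows. This is an illustrative reading only: it assumes the 5th/95th percentile trimming is applied to the sonde minus MWR differences (the text does not say whether the trimming acts on the differences or on the PWV values), and the percentile helper uses simple linear interpolation. All names are hypothetical.

```python
import statistics

def percentile(values, pct):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

def trimmed_bias_stats(sonde_pwv, mwr_pwv):
    """Median and standard deviation of sonde-minus-MWR PWV after dropping
    differences outside the 5th-95th percentile range (assumed trimming target)."""
    diffs = [s - m for s, m in zip(sonde_pwv, mwr_pwv)]
    p05 = percentile(diffs, 5)
    p95 = percentile(diffs, 95)
    kept = [d for d in diffs if p05 <= d <= p95]
    return statistics.median(kept), statistics.stdev(kept)

# Example: nineteen sondes 0.1 cm drier than the MWR plus one gross outlier.
sonde = [-0.1] * 19 + [-5.0]
mwr = [0.0] * 20
median_bias, spread = trimmed_bias_stats(sonde, mwr)
```

In this toy example the single gross outlier falls below the 5th percentile and is excluded before the statistics are computed.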
The PWV results (Fig. 3, Table 1), especially when we consider all three sites (SGP, NSA, and TWP), demonstrate that both algorithms greatly improve the accuracy of the PWV relative to the MWR but do not distinguish which of the two corrections may be better. WANG's drier correction (relative to MILO) in the upper troposphere is slightly offset by its wetter correction near the surface and thus yields similar PWV values. A close inspection of Table 1, however, suggests that the MILO correction seems to add more PWV compared to WANG in the tropics, whereas WANG adds more PWV in drier climates such as SGP and NSA. Regardless of the climate, PWV is mainly contained in the lowest 1-2 km of the atmosphere; thus corrected RH in the middle and upper troposphere influences the results shown here very little. To evaluate the accuracy of the two SRDB corrections as a function of height, we first considered comparing the corrected radiosondes with water vapor measurements made by the ARM Raman lidars (Goldsmith et al., 1998; Ferrare et al., 2006) at the SGP and TWP/Darwin sites. Unfortunately, during the daytime the Raman lidar observations are limited to altitudes below 5 km and thus unable to provide any insight into the accuracy of the two corrections in the upper troposphere.
Instead we use two radiance closure experiments to evaluate the two corrections in the upper troposphere: one downwelling experiment and one upwelling experiment. Radiance closure experiments have been used in prior studies to validate sonde-derived brightness temperature (T_B) measurements (e.g., Turner et al., 2003; Soden et al., 2004; Mattioli et al., 2008; Kottayil et al., 2012; Moradi et al., 2013a, b) and offer another method for detecting systematic biases in radiosonde RH measurements. In each experiment, a radiative transfer model is used to transform the original RH data, along with the WANG- and MILO-corrected RH data, into simulated brightness temperatures. The model-derived T_B data are directly compared to an appropriate reference spectral radiance measurement, which will be described more thoroughly in the respective experiment sections. Statistical significance (for p = 0.05) is computed, where appropriate, to show the significance of the difference between WANG, MILO, and the original data.

Downwelling experiment
The ARM program conducted the second phase of the Radiative Heating in Underexplored Bands Campaign (RHUBC-II) at CJC from August through October 2009 (Turner and Mlawer, 2010). The CJC site is located approximately 5.3 km above sea level in the Atacama Desert; this site can be considered a mid-tropospheric site due to its altitude and water vapor conditions. Also, during RHUBC-II, clear-sky and dry conditions occurred frequently, making the site optimal for studying the accuracy of upper-tropospheric water vapor measurements.
Our reference instrument is the G-band water vapor radiometer profiler (GVRP). The GVRP measures downwelling radiation in 15 channels at 170.0, 171.0, 172.0, ..., 182.0, 183.0, and 183.31 GHz. Cimini et al. (2009) showed that the GVRP (in that paper, referred to as "MP-183") agreed within uncertainty with two other collocated 183 GHz radiometers during RHUBC-I, which was held at the NSA site in February-March 2007. The lower frequency channels (e.g., below 178 GHz) are more sensitive to the total PWV, while the higher frequency channels are more sensitive to middle/upper-tropospheric water vapor (Fig. 4; Cimini et al., 2009). The GVRP has an uncertainty of 1.5 K for T_B measurements (Cadeddu, 2010; Cadeddu et al., 2013).
The corrected and uncorrected RH data from the 144 RS92 radiosondes launched during RHUBC-II were used as input into version 4.1 of the MonoRTM radiative transfer model (Payne et al., 2008, 2011; Clough et al., 2005) to compute monochromatic downwelling radiance at high spectral resolution (10 MHz) from 168 to 185 GHz. Since the Cerro Toco site almost always has clear skies, the model was run to compute clear-sky radiances (methodology for identifying cases with environmental inhomogeneity or clouds is described in the next paragraph). These computed clear-sky monochromatic spectra were convolved with the GVRP's instrument response function to calculate brightness temperatures corresponding to each GVRP channel. These model-derived radiances, which were converted to T_B, were directly compared to the T_B measurements made by the GVRP.
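The two post-processing steps named above, weighting a monochromatic spectrum by an instrument response function and converting radiance to brightness temperature, can be sketched as follows. This is an illustrative sketch only: it assumes radiance in W m^-2 sr^-1 Hz^-1, a response function sampled on the same frequency grid as the spectrum, and T_B defined by inverting the Planck function. MonoRTM and the GVRP processing may use different conventions, and the function names are hypothetical.

```python
import math

PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e8   # m s^-1
BOLTZMANN = 1.380649e-23     # J K^-1

def planck_radiance(freq_hz, temp_k):
    """Blackbody spectral radiance per unit frequency."""
    x = PLANCK * freq_hz / (BOLTZMANN * temp_k)
    return 2.0 * PLANCK * freq_hz ** 3 / LIGHT_SPEED ** 2 / math.expm1(x)

def brightness_temperature(freq_hz, radiance):
    """Invert the Planck function to get the equivalent blackbody temperature."""
    arg = 2.0 * PLANCK * freq_hz ** 3 / (LIGHT_SPEED ** 2 * radiance)
    return PLANCK * freq_hz / BOLTZMANN / math.log1p(arg)

def channel_radiance(mono_radiance, response):
    """Response-weighted mean of a monochromatic spectrum (both on the same grid)."""
    return sum(r * w for r, w in zip(mono_radiance, response)) / sum(response)

# Example: three spectral points near 183.31 GHz with a triangular response.
freqs = [183.30e9, 183.31e9, 183.32e9]
spectrum = [planck_radiance(f, t) for f, t in zip(freqs, [249.0, 250.0, 251.0])]
tb_channel = brightness_temperature(183.31e9, channel_radiance(spectrum, [0.5, 1.0, 0.5]))
```

The resulting channel T_B is the quantity that would be differenced against the radiometer observation to form an observed minus computed residual.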
To reduce the complexity of the analysis, we restricted our comparisons to clear-sky conditions only. To identify cloudy-sky conditions as well as inhomogeneous environments (i.e., when there was a horizontal gradient in water vapor across the RHUBC-II site), the standard deviation of the GVRP T_B measurements at 174 GHz over a 30 min window centered at the radiosonde launch time at both 30 and 150° was computed. When the standard deviation at either angle (where 90° corresponds to zenith) was more than 2.25 K, the sky conditions were not considered uniform and the sonde was removed from subsequent analysis. This additional screening also accounts for inhomogeneity created by localized mountain-scale circulations and a thermally driven circulation across the Cerro Toco site (Marín et al., 2013).

The comparison of the MonoRTM T_B calculations using the MILO- and WANG-corrected radiosondes as input demonstrated a different spectral character based upon the PWV in the profile. For the moistest 30 % of the CJC radiosondes (i.e., where the PWV > 0.57 mm, where the maximum PWV observed at CJC was 1.20 mm), the MILO-computed T_B was typically larger than the WANG-computed values at all GVRP frequencies (Fig. 5, green spectra), which implies that the MILO-corrected radiosondes are moister over the entire profile. However, for the driest 30 % of the CJC radiosondes (i.e., PWV < 0.37 mm), the T_B values computed using the WANG-corrected profiles are larger than the MILO-computed radiance for frequencies below 182 GHz (Fig. 5, orange spectra). This suggests that the WANG-corrected radiosondes are moister than the MILO-corrected data, especially in the lowest several kilometers of the atmosphere. Most importantly, this analysis suggests that the significant differences in how the two correction algorithms behave at different PWV amounts can be used with GVRP spectral observations to evaluate both algorithms.
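The homogeneity screening described above (standard deviation of the 174 GHz T_B over a 30 min window at two elevation angles, with a 2.25 K limit) can be sketched as below. The data layout (times in minutes, one T_B series per angle) and the function names are assumptions made for illustration.

```python
import statistics

def window_samples(times_min, values, launch_min, half_width_min=15.0):
    """Select samples within a window centered on the launch time (30 min total)."""
    return [v for t, v in zip(times_min, values)
            if abs(t - launch_min) <= half_width_min]

def is_homogeneous(tb_30deg, tb_150deg, max_std_k=2.25):
    """Keep the sonde only if the T_B standard deviation at both angles
    does not exceed the threshold."""
    return (statistics.stdev(tb_30deg) <= max_std_k
            and statistics.stdev(tb_150deg) <= max_std_k)

# Example: a steady scene at 30 degrees but a strongly varying one at 150 degrees.
times = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 60.0]
tb_a = [100.0, 100.2, 100.1, 100.3, 100.2, 100.1, 100.0, 120.0]
tb_b = [100.0, 104.0, 108.0, 112.0, 108.0, 104.0, 100.0, 100.0]
keep = is_homogeneous(window_samples(times, tb_a, 15.0),
                      window_samples(times, tb_b, 15.0))
```

Here the sample at minute 60 lies outside the window and is ignored, and the case is rejected because the variability at one of the two angles exceeds the limit.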
The median observed minus computed brightness temperature spectra for the WANG- and MILO-corrected radiosondes are shown in Fig. 6; these data are also divided into the 30 % moistest and 30 % driest profiles, each of which has 26 cases. Table 2 summarizes the median biases for the 30 % moistest profiles and 30 % driest profiles with standard deviations. For the median of the driest cases, the MonoRTM-derived T_B calculations for both correction algorithms are approximately 1-4 K warmer than the GVRP observations for frequencies between 170 and 178 GHz, increasing to over 13 K warmer than the GVRP at the center of the water vapor absorption line at 183.3 GHz. This suggests that both correction algorithms actually worsen the MonoRTM-derived T_B measurements (compared to T_B measurements derived from the original RH data) in the most extreme of dry cases seen in the CJC data set. Interestingly, the MonoRTM calculations that used the original uncorrected radiosondes provide a much better agreement with the GVRP observations for these very dry cases. Furthermore, the application of the two correction algorithms increases the scatter between the GVRP and MonoRTM-computed T_B at 183.0 and 183.31 GHz relative to the original uncorrected radiosonde (Table 2), suggesting that neither algorithm adds skill at the very low PWV amounts seen in this category of cases. Given the extremely low RH values of ∼10 % characteristic of the CJC site (Fig. 7), the precision of the RH measurement itself (0.5 %) propagates an additional error as high as 0.5 % in the resultant WANG/MILO corrections at the CJC site (result not shown). This adds an additional residual error to the otherwise bias-corrected MonoRTM-computed T_B values.

Table 2. A summary of the median T_B biases between the MonoRTM-derived T_B and GVRP T_B measurements using original radiosonde RH data and WANG/MILO-corrected radiosonde RH data. Data are represented as a median bias with ± 1 standard deviation.

Figure 5. Downwelling brightness temperature differences between MonoRTM calculations using the WANG- and MILO-corrected RH profile as input. Data are sorted by the moistest 30 % and driest 30 % of all profiles in the CJC data set (green and orange, respectively). The thick black lines are the mean spectral residual for the two subsets of data.

A much different story, however, is seen in the 30 % moistest profiles. The mean T_B bias between the GVRP observations and the MonoRTM calculations using both the WANG- and MILO-corrected input data from this moist subset is much smaller than for the 30 % driest profiles. The WANG/MILO MonoRTM calculations also yield slightly moist-biased results compared to the original RH MonoRTM calculations, which are dry biased (Fig. 6). The good agreement between the observed and computed spectra for frequencies less than 177 GHz suggests that both algorithms have the PWV correct, as these channels have relatively constant weighting functions with height. At 183.0 and 183.31 GHz, the MonoRTM-derived T_B calculations for the WANG calculation are warm biased by 0.42 and 0.33 K, respectively, whereas the T_B calculations using the MILO-corrected radiosondes are warm biased by approximately 1.8 K. While these results seem to indicate that WANG-corrected radiosondes are in better agreement with the GVRP observations, this result is not statistically significant. Interestingly, the scatter in the GVRP minus MonoRTM residuals at these two frequencies is very similar between the calculations that used the original RH profile and either of the two corrected RH profiles (Table 2). The moist 30 % cases in this analysis, when compared to other distinct climatological locations (Fig. 8), are considerably drier when compared to a tropical location (e.g., the ARM TWP Nauru site).
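The partition into the moistest 30 % and driest 30 % of profiles used in this analysis can be sketched as a simple rank-based split on PWV. The rounding rule for the subset size and the function name are assumptions for illustration; the text states only that each subset contains 26 cases.

```python
def partition_by_pwv(pwv_values, fraction=0.30):
    """Return index lists of the driest and moistest fractions of the cases,
    ranked by PWV. The subset size is rounded to the nearest integer (assumed)."""
    order = sorted(range(len(pwv_values)), key=lambda i: pwv_values[i])
    count = int(round(fraction * len(pwv_values)))
    driest = order[:count]
    moistest = order[len(order) - count:]
    return driest, moistest

# Example with ten hypothetical PWV values in mm.
pwv = [0.5, 0.2, 0.9, 0.3, 1.1, 0.4, 0.7, 0.6, 0.8, 1.0]
dry_idx, moist_idx = partition_by_pwv(pwv)
# The PWV thresholds implied by the split:
dry_threshold = max(pwv[i] for i in dry_idx)
moist_threshold = min(pwv[i] for i in moist_idx)
```

The median residual spectra for each subset are then computed only over the indices returned for that subset.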
As a consistency check for the T_B residuals (computed as observed minus computed) derived from original, WANG- and MILO-corrected RH data, a one-sided Student t test is performed on the 30 % partitioned moist and dry cases for all 15 MonoRTM frequencies (results not shown here). For the moistest and driest 30 % of cases, WANG- and MILO-corrected RHs are statistically significantly different (at the p = 0.05 level) from the original RH data. A one-sided Student t test between WANG and MILO for the moistest or driest 30 % of cases, however, reveals no statistical significance at any frequency. Despite the noted difference in biases from Fig. 6, we cannot reasonably conclude that one correction algorithm is better than the other. Hence, a second experiment is needed to further deduce differences between the WANG and MILO corrections.
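For readers who want to reproduce this kind of check, the sketch below computes a pooled-variance two-sample Student t statistic and compares it with a one-sided critical value supplied by the caller (for example, from a t table at p = 0.05). The choice of the pooled two-sample form, the direction of the one-sided test, and the function names are assumptions; the text does not specify these details.

```python
import math
import statistics

def pooled_t_statistic(sample_a, sample_b):
    """Two-sample Student t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_value = mean_diff / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t_value, dof

def one_sided_significant(sample_a, sample_b, t_critical):
    """True if the mean of sample_a exceeds that of sample_b at the level
    implied by the caller-supplied critical value."""
    t_value, _ = pooled_t_statistic(sample_a, sample_b)
    return t_value > t_critical

# Example with two small hypothetical residual samples (K).
t_value, dof = pooled_t_statistic([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

With 26 cases in each subset the test has 50 degrees of freedom, for which the one-sided critical value at p = 0.05 is approximately 1.68.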

Upwelling experiment
The downwelling radiance closure experiment demonstrated that both WANG- and MILO-corrected RH data are improved over the original RH data only for the moister cases at CJC. However, while the CJC site is representative of a mid-tropospheric site in terms of altitude and pressure, its very dry climate resulted in water vapor amounts (as indicated by the integrated water vapor (IWV) histograms in Fig. 8) that are significantly drier than those found at other ARM sites. Thus, downwelling radiance closure studies at the other sites would prove difficult because lower-tropospheric water vapor is much higher, meaning the downwelling radiance would have little sensitivity to change in upper-tropospheric humidity. The one-sided Student t test results further suggest little variation between the correction algorithms despite the fact they correct differently in the upper troposphere.
However, upwelling spectral infrared radiance observations are very sensitive to the vertical distribution of water vapor. The SGP site experiences a wide range of weather phenomena throughout the year, which results in a wide range of upper-tropospheric IWV throughout the year (Fig. 8, green line). During the cold season, upper-tropospheric IWV at the ARM SGP site is representative of that measured at the ARM's NSA (Barrow) site (Fig. 8, blue line), whereas during the warm season at the ARM SGP site the upper-tropospheric IWV is representative of a tropical location (e.g., the ARM's TWP sites; see Fig. 8, orange line). For this reason, radiosonde data from the SGP site are chosen for the upwelling radiance closure exercise.
We used the infrared radiance observations made by the Atmospheric Infrared Sounder (AIRS; Aumann and Pagano, 1994). Launched into a sun-synchronous polar orbit on 4 May 2002 aboard NASA's Aqua satellite (Parkinson, 2003), this instrument has provided extensive insight into a host of weather and climate-related phenomena (e.g., Chahine et al., 2006; Shu and Wu, 2009; Shimada and Minobe, 2011). The high spectral resolution of the AIRS, with 2378 channels, provides a wealth of information for our study. Its data have been extensively compared with data from infrared spectrometers flown on aircraft (e.g., Tobin et al., 2006), demonstrating excellent calibration accuracy and stability. One caveat to using the AIRS, like any sun-synchronous polar-orbiting satellite, is the temporal resolution of the data: although approximately 12.5 years of AIRS data are available, surface locations near the poles will have more measurements than surface locations in the midlatitudes or near the equator. The ARM SGP site launches radiosondes around 18:00 UTC every day, which is about 2 to 3 h before the AIRS overpass time (i.e., around 20:00 to 21:00 UTC). For this experiment, AIRS T_B and radiosonde data from a 5-year period from January 2005 through December 2009 were used.
Upwelling infrared radiation is highly sensitive to changes in water vapor, so we needed to ascertain if the PWV changed appreciably between the sonde launch and AIRS overpass. Clouds must also be filtered from the data set, because measured upwelling radiation is very sensitive to changes in cloud properties. The development or advection of clouds at the time of the radiosonde launch or AIRS overpass can obscure the atmosphere below the cloud-top height. To minimize these impacts, we included data only
1. where the AIRS overpass occurred within 135 min of the radiosonde launch;
2. during cloud-free scenes, as discerned by the AIRS and radiosonde observations (methodology explained in the following paragraphs);
3. when the MWR PWV did not change by more than 5 % between the time of the radiosonde launch and AIRS overpass.
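The three selection criteria listed above can be sketched as a single predicate. This is an illustrative sketch: the cloud-free decision is passed in as a flag because it is determined separately, the PWV change is assumed to be measured relative to the PWV at launch time, and the function and parameter names are hypothetical.

```python
def keep_pair(launch_min, overpass_min, pwv_launch_cm, pwv_overpass_cm,
              cloud_free, max_gap_min=135.0, max_pwv_change=0.05):
    """Apply the time-gap, cloud-free, and PWV-stability criteria to one
    radiosonde/AIRS overpass pair. Times are minutes on a common clock."""
    close_in_time = abs(overpass_min - launch_min) <= max_gap_min
    # Relative PWV change, assumed to be referenced to the launch-time value.
    pwv_change = abs(pwv_overpass_cm - pwv_launch_cm) / pwv_launch_cm
    return close_in_time and cloud_free and pwv_change <= max_pwv_change

# Example: overpass 120 min after launch, PWV rising from 2.00 to 2.05 cm, clear scene.
accepted = keep_pair(0.0, 120.0, 2.00, 2.05, cloud_free=True)
```

A pair is retained only if all three conditions hold, so failing any single criterion removes the case from the analysis.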
In short, only data during completely cloud-free conditions are examined. This is especially necessary because both the WANG and MILO correction algorithms are intended for use mainly in clear-sky conditions. The 5 % threshold was determined through a sensitivity study using two standard atmospheres (summer and winter). Additional screenings were implemented to account for the effects of cloud cover within this time window. The AIRS provides radiance measurements in a "footprint", which is a 3 × 3 set of pixels. Data were chosen such that the center pixel was the measurement closest to the SGP site. At 938 cm^-1 the atmosphere is transparent to nearly all gases except for water vapor, thereby making this channel very sensitive to surface temperature in clear conditions. The standard deviation of the T_B values obtained from the 938 cm^-1 channel radiances (T_B,938 hereafter) was computed for all nine pixels and thresholds were determined based on all available footprints (Table 3). To account for seasonal variability in the T_B,938 measurements, thresholds are determined on a monthly basis: T_B,938 measurements in all pixels (for a clear-sky scene) result in a small standard deviation (generally less than 2 K).
For comparison, previous AIRS validation studies at this channel over the ocean (e.g., Hagan and Minnett, 2003) demonstrated that the AIRS radiometric uncertainty is approximately 1 %, which is about 0.5 K at 300 K for 938 cm^-1. Tobin et al. (2006) later demonstrated that the root mean square error of brightness temperature and water vapor measurements over the ocean approached the theoretical expectations of clear-sky conditions. Even in clear-sky data, some variability in T_B,938 measurements occurs as a result of local differences in surface temperature across the swath of the footprint. To account for these deviations in surface temperature while keeping the error to within ∼6 % or ∼3 K, we defined a clear-sky threshold equal to twice the 25th percentile of the T_B,938 standard deviation for that month (Table 3). The factor of 2 ensures that enough cases make it into the analysis while staying under 3 K for any season, which accounts for the prescribed natural variability in T_B,938. High T_B standard deviations are primarily a signature of partly or mostly cloudy skies, since cloud tops are almost always colder than the surface.
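The monthly clear-sky threshold described above (twice the 25th percentile of the nine-pixel T_B,938 standard deviations for that month) can be sketched as follows. The percentile interpolation rule, the use of the sample standard deviation, and the function names are assumptions made for illustration.

```python
import statistics

def percentile(values, pct):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

def clear_sky_threshold(monthly_footprint_stds, factor=2.0):
    """Threshold for one month: factor times the 25th percentile of the
    footprint T_B,938 standard deviations observed in that month."""
    return factor * percentile(monthly_footprint_stds, 25)

def footprint_is_clear(tb_938_pixels, threshold_k):
    """A 3 x 3 footprint passes if its T_B,938 standard deviation is within the threshold."""
    return statistics.stdev(tb_938_pixels) <= threshold_k

# Example: hypothetical footprint standard deviations (K) collected for one month.
threshold = clear_sky_threshold([0.5, 1.0, 1.5, 2.0, 2.5])
uniform_scene = [290.0, 290.5, 291.0, 290.2, 290.8, 290.4, 290.6, 290.1, 290.9]
```

In this toy month the 25th percentile is 1.0 K, so the threshold is 2.0 K and the nearly uniform footprint passes.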
Stratiform cloud decks are also accounted for: low T_B,938 standard deviations but lower than average T_B,938 values (relative to the mean for that month) signify a cloud deck and therefore are also screened from the data. Subvisible cirrus clouds, which affect the radiance budget but are too optically thin to be easily identified in the AIRS observations, were identified using the radiosonde RH data. Any original RH profile that has an RH_ICE measurement greater than 90 % anywhere in the column is removed. Using all of the above criteria to account for cloud coverage and environmental homogeneity, 96 cases pass these screenings.
The line-by-line radiative transfer model LBLRTM (Alvarado et al., 2013; Clough et al., 2005; Turner et al., 2004), which shares the same physical basis as the MonoRTM used in the downwelling experiment, is used to compute upwelling infrared radiance from the original and corrected RH data. The LBLRTM computes very high-resolution radiance data; in order to match the 2378 AIRS channels, the monochromatic LBLRTM output is convolved with the AIRS instrument spectral response function for each of the 2378 AIRS channels. The atmosphere is generally opaque in the spectral region between approximately 1300 and 2000 cm^-1 at the SGP site due to absorption by water vapor. Our analysis focused on the radiative closure in this spectral region, using only AIRS channels where the transmission of the atmosphere was 0. By restricting our analysis to this set of channels, uncertainties associated with the emission of the earth's surface were avoided.
For each radiosonde/AIRS overpass pair, the upwelling T_B was computed using the LBLRTM along the viewing angle of the AIRS instrument, and the observed minus computed T_B differences were assigned to different altitudes. We attributed the T_B(λ) difference to the altitude where the weighting function for that wavelength (λ) had its maximum value. The weighting functions as a function of height, W(z), were computed from β(z), the gaseous absorption coefficient, and τ(z), the cumulative optical depth from the AIRS sensor to height z; the wavelength dependence is implied. In the 1300-2000 cm^-1 spectral region, water vapor is the primary gaseous absorber. Weighting functions "peak" at various heights depending on the respective channel's sensitivity to water vapor and the shape of the water vapor profile. For midlatitude atmospheres, weighting functions for the different spectral channels generally peak between 5 and 12 km depending on the water vapor profile (which determines the optical depth profile) and the temperature profile. AIRS channels where the weighting function peaks above 2 km and below the tropopause are considered valid for this study. If a peak fell within a 1 km altitude range (e.g., 5-6, 6-7 km), the observed minus computed T_B residual for that channel was binned in this height range.

Similar to the downwelling experiment, mean residuals are computed according to the 30 % moistest and 30 % driest cases, which corresponded to IWV thresholds (for all radiosondes having valid measurements between 525 and 200 hPa) of above 0.96 mm and below 0.37 mm, respectively. Median brightness temperature biases between the AIRS and un/corrected RH data (Fig. 9) reveal an average correction for any given layer of approximately 0.2 to 0.4 K, depending on the correction. Below 5 km, T_B computations using WANG-corrected RH are less biased than T_B computations using MILO-corrected RH (a result consistent with Fig. 2). Above 5 km, MILO-corrected RH results in model-computed T_B that is less biased than WANG, but both WANG- and MILO-corrected RHs result in T_B computations that are statistically significantly different from T_B model computations using original RH as input (for all altitude levels). When comparing WANG- and MILO-corrected T_B residuals against one another, the corrections become statistically significantly different (at p = 0.05) from one another above the 5-6 km height bin. Also, MILO-corrected T_B residuals are less biased than WANG-corrected T_B residuals except at the 12-13 km height bin. We reasonably conclude that MILO-corrected RH for all cases performs better than WANG-corrected RH; however, we feel it is necessary to partition the cases by upper-tropospheric IWV in order to further deduce differences between the WANG and MILO RH correction algorithms.
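The assignment of a channel residual to a height bin through the peak of its weighting function can be sketched as below. This sketch assumes the commonly used form W(z) = β(z) exp(−τ(z)), with τ(z) obtained by integrating β from the sensor (taken as the top of the profile) down to height z; the exact expressions used in the study are not reproduced here, and the function names and the trapezoidal integration are assumptions.

```python
import math

def weighting_function(heights_km, beta_per_km):
    """W(z) = beta(z) * exp(-tau(z)), with tau accumulated downward from the top
    of the profile. Heights must be in ascending order."""
    n = len(heights_km)
    tau = [0.0] * n
    for i in range(n - 2, -1, -1):
        dz = heights_km[i + 1] - heights_km[i]
        tau[i] = tau[i + 1] + 0.5 * (beta_per_km[i] + beta_per_km[i + 1]) * dz
    return [b * math.exp(-t) for b, t in zip(beta_per_km, tau)]

def peak_height_bin(heights_km, weights):
    """Return the 1 km bin (lower, upper) containing the weighting-function peak."""
    peak_index = max(range(len(weights)), key=lambda i: weights[i])
    lower = math.floor(heights_km[peak_index])
    return (lower, lower + 1)

# Example: absorption decaying exponentially with a 2 km scale height (hypothetical).
heights = [0.1 * i for i in range(201)]
beta = [8.0 * math.exp(-z / 2.0) for z in heights]
bin_for_channel = peak_height_bin(heights, weighting_function(heights, beta))
```

For an exponentially decaying absorber the peak falls where the optical depth from the sensor equals one, which in this example is near 5.5 km, so the residual for this hypothetical channel would be placed in the 5-6 km bin.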
When evaluating the driest 30 % of data and moistest 30 % of data in Fig. 9, brightness temperature biases between the AIRS and un/corrected RH data (Fig. 10) are corrected, on average, by 0.2 to 0.5 K for the driest cases and 0.3 to 0.4 K for the moistest cases, depending on the correction algorithm that was used. Table 4 summarizes the median biases for the driest and moistest cases with standard deviations.

Table 4. A summary of the brightness temperature biases between the AIRS and the LBLRTM-derived data over the SGP site using un/corrected RH data as input as a function of height, where the height for each spectral residual was determined as the height where the weighting function for that profile peaks. The driest 30 % and moistest 30 % of the data correspond to upper-tropospheric IWV thresholds of less than 0.37 mm and greater than 0.96 mm, respectively.

[Figure caption; beginning not recovered] ..., where the residual in a spectral channel was assigned to a particular height (in 1 km intervals) based upon where the weighting function for that channel peaked with altitude (using the original RH profile). Error bars represent the 25th/75th percentile of brightness temperature residuals.
Aside from the 12-13 km layer for WANG and the 6-7, 10-13 km height bins for MILO, the correction algorithms remain slightly dry biased. This result is consistent with the findings in Fig. 2: since MILO generally adds more WV in the middle and upper troposphere, it follows that MILO corrects more than WANG in these driest cases (though no more than about 0.2 K) and appears to be better. The moist cases, however, result in T_B residuals closer to the observed AIRS T_B, with MILO-corrected T_B residuals being less biased than WANG-corrected T_B residuals at every height bin except the 12-13 km height bin. Again, these results are consistent with Fig. 2: MILO corrects more than WANG (as much as 0.10 to 0.15 K more), which is only possible in the presence of increased WV in the middle and upper troposphere. It should be noted that many more observations (i.e., usable channels resulting from the weighting function analysis) are available for the moist case category (especially above the 5-6 km height bin). In the drier profiles, the opacity of the atmosphere due to water vapor absorption decreases and thus more AIRS channels are eliminated from the analysis because the channel is sensitive to surface emission, thereby making fewer measurements available. The number of measurements (i.e., number of brightness temperature measurements between 1300 and 2000 cm^-1 from the partitioned cases) per height bin for the driest 30 % and moistest 30 % of data is also given in Table 4.

[Figure caption; beginning not recovered] ... 9, but where the residuals are for the moistest 30 % and driest 30 % of the water vapor profiles. The median values shown in this plot, along with the standard deviations, are given in greater detail in Table 4.
Table 4 shows that both the WANG and MILO corrections yield slightly smaller standard deviations than the original measurements at nearly every height bin. MILO, in most cases, has slightly lower standard deviations than WANG.
We also computed statistical significance among the T_B residuals for the original, WANG-corrected, and MILO-corrected T_B data (for the 30 % moistest and driest cases). Again, both the WANG- and MILO-corrected T_B are significantly different from the T_B derived from the original RH data at all altitudes. When coupled with the fact that T_B residuals from the correction algorithms are much less biased than T_B residuals using the original RH data, we can conclude that WANG- or MILO-corrected RH is much improved over the original RH measurements. For the driest 30 % of cases, the WANG and MILO corrections are significantly different from each other (at the p = 0.05 level) at and above the 9-10 km bin. For the moistest 30 % of cases, WANG- and MILO-corrected T_B become significantly different from one another at and above the 7-8 km bin. In both cases, MILO is less biased than WANG above the stated altitude bins (except the 12-13 km bin); therefore, we can also conclude that MILO-corrected RH is more representative of upper-tropospheric RH than WANG-corrected RH.
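As an illustration of how such a comparison between two sets of T_B residuals could be made at the p = 0.05 level, the sketch below uses a two-sided permutation test on the difference of medians. The specific test used in this study is not stated here, so the choice of a permutation test, the function name, the number of permutations, and the sample values are all assumptions made for the example.

```python
import random
from statistics import median


def permutation_p_value(sample_a, sample_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(median(sample_a) - median(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical residuals (K) for one height bin; not values from this study.
residuals_a = [-0.9, -1.0, -1.1, -0.95, -1.05, -0.85, -1.15, -0.9]
residuals_b = [-0.1, -0.2, 0.0, -0.15, -0.05, -0.25, 0.05, -0.1]
differs_at_5_percent = permutation_p_value(residuals_a, residuals_b) < 0.05
```

A permutation test makes no assumption about the shape of the residual distribution, which is convenient when the number of usable channels per height bin is small.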
The dry thresholds are the same (0.37 mm) for the upwelling and downwelling experiments; however, in the upwelling experiment each correction algorithm reduced the T_B bias, which was not the case for the driest 30 % of results from the downwelling experiment. At this time, we cannot conclude why the results for the respective subsets of data differ. The moist threshold is higher for the upwelling experiment than for the downwelling experiment (0.96 vs. 0.57 mm), likely because water vapor can more easily reach the upper troposphere at the SGP through phenomena such as deep convection, while at CJC a range of processes keeps the troposphere relatively dry (Rutllant Costa, 1977). Figures 7 and 8 corroborate this idea, since the CJC site has lower RH and IWV, respectively, than the SGP site. With the exception of the 12-13 km bin, T_B residuals (Fig. 10) computed from MILO-corrected RH are less biased than T_B residuals computed from WANG-corrected RH but remain slightly dry biased. Despite the limitations of the upwelling experiment, and given the statistically significant difference between MILO- and WANG-corrected RH, the results from this experiment suggest that MILO-corrected RH is more representative of clear-sky RH in the upper troposphere than WANG-corrected RH, and that both corrections represent improvements over uncorrected sondes.

Conclusion
Both the WANG and MILO corrections significantly improve the original Vaisala RS92 RH data, as demonstrated in an analysis of PWV at multiple sites, yielding approximately the same improvement in PWV relative to the MWR-retrieved value.However, the two algorithms differ in their corrections as a function of height due to their different methodologies.
Given this difference, radiative closure experiments were performed to determine whether one of the two corrections was better than the other. Comparing radiative transfer calculations that use the WANG- and MILO-corrected radiosondes, an analysis of downwelling measurements at the 183.00 and 183.31 GHz channels of the CJC GVRP indicated that the WANG median T_B calculation was not statistically different from the MILO median T_B calculation for the moist cases that are more typical of the upper troposphere in midlatitude atmospheres. Also, both corrections significantly improved the T_B bias for the moist cases: the original median T_B calculation was ∼ 10 K too warm (implying the original sonde was too dry) at 183.00 and 183.31 GHz. However, radiosondes in the very dry category, corresponding to upper-tropospheric conditions not typically found in midlatitude or tropical locations, were made significantly too moist by both corrections, yielding much poorer agreement with the GVRP than the original uncorrected radiosonde profile. We find WANG- and MILO-corrected RH to be statistically better than the original RH for the moist cases; however, WANG- and MILO-corrected RHs are not statistically different when tested against one another.
The upwelling experiment using AIRS measurements revealed additional differences between WANG and MILO, likely owing to the strong seasonal dependence of upper-tropospheric IWV at the SGP site. The driest cases show that WANG is slightly less biased than MILO below 5 km, which is likely because WANG corrects more than MILO in the lower troposphere. Otherwise, MILO is less biased than WANG in nearly every other scenario, as indicated by the partitioning of radiances by height using weighting functions. Both the WANG and MILO corrections result in T_B computations that are significantly different from T_B computations derived from the original RH, a result consistent with the downwelling experiment. We find, however, that MILO is statistically different from WANG above 8 km in the moistest 30 % of cases and above 10 km in the driest 30 % of cases. We conclude that MILO offers a more realistic representation of upper-tropospheric RH than WANG because of the lower T_B bias at nearly all altitudes coupled with the statistically significant difference between MILO and WANG.
The outcome of the upwelling radiance closure experiment suggests that the correction factor "cf" used to scale the temperature correction in WANG may be too low. However, the intent of this correction factor is to account for both clear and cloudy conditions, and although WANG offers much better agreement than the original RH measurements, our results indicate that WANG seemingly under-corrects for solar radiative dry bias. This also likely explains (from the upwelling experiment) why WANG is statistically different from MILO in the upper troposphere. Given the ease of use of the WANG correction, we suggest that the "cf" be computed separately for clear and cloudy skies. This change, however, may be complicated by the fact that cloud extinction varies significantly between high ice clouds and low-altitude liquid clouds; considering the large variability in the microphysical properties between these two types of clouds, the adjusted "cf" would at minimum need to be a function of altitude and water phase. If this adjustment could be made, the WANG correction would become more robust and would be applicable to a wider range of applications. Regardless, our results demonstrate the utility of both correction algorithms across a wide range of climatic regimes, where MILO is especially effective in the upper troposphere for clear-sky conditions.

Figure 1. A comparison of the WANG- and MILO-corrected RH profiles (left plot; red and green, respectively) compared to the original RH profile (black). The light blue line represents the saturation RH with respect to ice. The right plot shows the difference between the original RH profile and the WANG/MILO RH profiles (red/green), respectively. This example is the 18Z sounding for the SGP site on 15 June 2006.

Figure 2. The mean relative increase in the water vapor mixing ratio caused by the two correction algorithms for RS92 radiosondes launched at the SGP, NSA, TWP, and CJC sites (left) and the standard deviation (right) as a function of height. The MILO- (WANG-) corrected data are shown with dotted (solid) lines. The number of comparisons for each site is shown in the figure. NSA results are only shown up to the mean tropopause height (10 km). The inset plot on the main figure is the mean relative increase in the water vapor mixing ratio caused by the two correction algorithms, but only from 0 to 4 km.

Figure 3. A comparison between the PWV derived from the original radiosonde data (top), WANG-corrected (middle), and MILO-corrected (bottom) radiosonde data with the PWV derived from the collocated MWR at the SGP site (panels a1, a2, and a3), NSA site (panels b1, b2, and b3), and TWP Darwin site (panels c1, c2, and c3). The solid black line superimposed on the data denotes the mean values for each PWV bin, and the vertical lines represent the standard deviations.

Figure 4. The water vapor Jacobian computed for mean conditions at Cerro Toco (surface altitude is 5.3 km m.s.l.) at the GVRP frequencies. The PWV for this case was 1.1 mm.

Figure 6. Median MonoRTM minus GVRP spectral residuals, where the MonoRTM was driven by WANG- and MILO-corrected radiosondes (red/green and blue/brown, respectively) and uncorrected radiosondes (gray lines). These median residuals were computed for the moistest and driest 30 % of the CJC radiosondes, as shown in Fig. 4.

Figure 7. Median (uncorrected) RH profiles for four ARM sites. RH is grouped in 25 hPa bins (starting at 1000 hPa), and the median is computed from that bin. There are 142 soundings for the CJC site, 2500 soundings across the annual cycle for the SGP and TWP (Nauru) sites, and 1712 soundings for the NSA site.

Figure 8. Distributions of upper-tropospheric integrated water vapor (IWV) from 530 to 200 hPa for four ARM sites, each with distinct climates. The mean surface pressure at the CJC site is 530 hPa, while 200 hPa is the approximate height of the tropopause.
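For reference, a layer-integrated water vapor amount of this kind can be estimated from a sounding as IWV = (1/g) ∫ q dp between the two pressure levels. The sketch below is a minimal trapezoidal version under stated assumptions: the function name is hypothetical, the humidity input is taken to be specific humidity in kg/kg, standard gravity is used, and levels are not interpolated to the exact layer edges. It is not the procedure used to produce the figure.

```python
G = 9.80665  # standard gravity in m s^-2 (assumed constant with height)


def upper_tropospheric_iwv_mm(pressure_hpa, specific_humidity,
                              p_bottom=530.0, p_top=200.0):
    """Trapezoidal IWV in kg m^-2 between p_bottom and p_top.

    Levels outside [p_top, p_bottom] are ignored; no interpolation to the
    exact layer edges is attempted in this sketch.
    """
    levels = sorted(
        (p, q) for p, q in zip(pressure_hpa, specific_humidity)
        if p_top <= p <= p_bottom
    )
    total = 0.0
    for (p_lo, q_lo), (p_hi, q_hi) in zip(levels, levels[1:]):
        total += 0.5 * (q_lo + q_hi) * (p_hi - p_lo) * 100.0  # hPa to Pa
    return total / G
```

Because 1 kg m^-2 of water corresponds to a liquid depth of 1 mm, the returned value can be compared directly with thresholds expressed in mm.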

Figure 9. The median LBLRTM minus AIRS brightness temperature difference (residual) as a function of height (for all data), where the residual in a spectral channel was assigned to a particular height (in 1 km intervals) based upon where the weighting function for that channel peaked with altitude (using the original RH profile). Error bars represent the 25th/75th percentile of brightness temperature residuals.

Figure 10. Same as in Fig. 9, but where the residuals are for the moistest 30 % and driest 30 % of the water vapor profiles. The median values shown in this plot, along with the standard deviations, are given in greater detail in Table 4.

Table 3. A summary of the monthly brightness temperature thresholds used to screen cloudy-sky scenes from the AIRS data.