Investigation of ground-based microwave radiometer calibration techniques at 530 hPa

Ground-based microwave radiometers (MWR) are increasingly used for remote sensing of atmospheric temperature and humidity profiles as well as path-integrated cloud liquid water content. The calibration accuracy of the state-of-the-art MWR HATPRO-G2 (Humidity And Temperature Profiler – Generation 2) was investigated during the second phase of the Radiative Heating in Underexplored Bands Campaign (RHUBC-II) in northern Chile (5320 m above mean sea level, 530 hPa), conducted by the Atmospheric Radiation Measurement (ARM) program between August and October 2009. This study assesses the quality of the two frequently used liquid nitrogen and tipping curve calibrations by performing a detailed error propagation study, which exploits the unique atmospheric conditions of RHUBC-II. Both methods are known to have open issues concerning systematic offsets and calibration repeatability. For the tipping curve calibration an uncertainty of ±0.1 to ±0.2 K (K-band) and ±0.6 to ±0.7 K (V-band) is found. The uncertainty in the tipping curve calibration is mainly due to atmospheric inhomogeneities and the assumed air mass correction for the Earth curvature. For the liquid nitrogen calibration the estimated uncertainty of ±0.3 to ±1.6 K is dominated by the uncertainty of the reflectivity of the liquid nitrogen target. A direct comparison between the two calibration techniques shows that for six of the nine channels that can be calibrated with both methods, they agree within the assessed uncertainties. For the other three channels the unexplained discrepancy is below 0.5 K. Systematic offsets that may explain why the two methods do not agree within their estimated uncertainties are discussed.


Introduction
Passive remote sensing instruments are widely used to retrieve atmospheric state variables. Measurements along the absorption lines of atmospheric gases can provide vertically resolved profiles of the gas concentration and/or temperature. Therefore, a realistic simulation of atmospheric absorption and emission is essential for the development of retrievals for atmospheric state variables. As radiative transfer models employ absorption line parameters that are obtained from laboratory measurements, field measurements are needed for independent evaluation. The aim of the Radiative Heating in Underexplored Bands Campaign (RHUBC-II) (Turner and Mlawer, 2010), which is part of the Atmospheric Radiation Measurement (ARM) program (Stokes and Schwartz, 1994), is to evaluate and improve existing absorption models. In addition to frequent radiosonde profiles of temperature, pressure and humidity, RHUBC-II data offer a unique set of accurate radiation measurements of the middle-to-upper troposphere. This paper concentrates on the microwave spectral region, which is widely used by space-borne and ground-based radiometers. For RHUBC-II this spectral range is covered by the passive microwave radiometer HATPRO-G2 (Humidity And Temperature Profiler - Generation 2). The radiometer has seven K-band channels along the rotational water vapor line at 22 GHz to retrieve information about the atmospheric water vapor and liquid water content. Another seven V-band channels along the oxygen absorption complex at 60 GHz are used to retrieve temperature profiles.
G. Maschwitz et al.: Investigation of ground-based microwave radiometer calibration techniques at 530 hPa
Ground-based temperature profiling using V-band frequencies was first suggested by Westwater (1965) and is now performed operationally at several sites worldwide (Güldner and Spänkuch, 2001). However, compared to radiosonde profiles, temperature profiles derived from radiometric measurements can show biases of up to 1 K (Liljegren, 2002; Löhnert and Maier, 2012). It is still unknown whether these originate from uncertainties in the oxygen absorption line parameters or from inaccuracies in the radiometer's absolute calibration. On the one hand, Cadeddu et al. (2007) evaluate oxygen absorption characteristics around 60 GHz by radiometer measurements and test different sets of absorption coefficients. They show that different sets of oxygen absorption line parameters lead to retrieved temperature profiles that may differ by more than 2 K. On the other hand, the accuracy of the commonly used liquid nitrogen (LN2) calibration is found to be 1-2 K (Liljegren, 2002). Similar issues are present in the K-band for the remote sensing of water vapor (Hewison, 2007). This makes clear that calibration uncertainties and absorption model uncertainties are of the same magnitude. Consequently, microwave radiometer calibration techniques need to be improved and well characterized before measurements may be used to evaluate current absorption models. Thus, the main goal of this paper is to evaluate two commonly used independent microwave radiometer calibration methods by carrying out a detailed error propagation that realistically considers multiple error sources.
The LN2 calibration method is compared to the tipping curve method (Han and Westwater, 2000). Generally, all microwave channels can be calibrated using liquid nitrogen as a cold absolute standard. However, a successful tipping curve calibration requires an atmosphere that is not optically thick at the frequency in question. At sea level only K-band channels are transparent enough to be calibrated by this method. Fortunately, for the low pressure conditions at the RHUBC-II site, the tipping curve method can also be applied to the two most transparent V-band channels, allowing a first-time independent evaluation of the LN2 calibration in the V-band. This paper is organized as follows: in Sect. 2 we briefly describe the RHUBC-II measurement campaign and the instrument that this study is based on. In Sect. 3 the theory behind the calibration methods and the instrument used in this study are described, and Sect. 4 performs the actual calibration assessment. First, an error propagation study of the LN2 calibration procedure is carried out in order to characterize the effects of different assumptions on the overall uncertainty. Specifically, these are the uncertainties attributed to the cold and hot target temperatures and emissivities, as well as to the detector non-linearity. Secondly, a similar procedure is carried out for the tipping curve calibration, where uncertainties such as pointing error, horizontal homogeneity and Earth curvature effects are discussed. Section 5 summarizes and compares the results of the two methods and Sect. 6 recommends steps to be taken in order to guarantee an optimal calibration result.

Measurement campaign
The primary focus of RHUBC-II was to characterize and improve the accuracy of gas absorption models using high spectral resolution radiance observations in spectral regions that are normally opaque at lower altitudes due to strong water vapor absorption and high air pressure (Turner and Mlawer, 2010). The measurement site was set up at 5320 m a.m.s.l. (above mean sea level) on the Cerro Toco next to the Chajnantor plateau (22.96° S, 67.77° W). The plateau is part of the Atacama Desert in northern Chile, which is one of the driest regions on Earth. Here, the amount of Precipitable Water Vapor (PWV) ranges from 0.1 to 5.0 mm (Rutllant, 1977).
RHUBC-II was conducted within the dry season between August and October 2009. The ARM program set up a Self-Kontained Instrument Platform (SKIP) providing infrastructure and power supply for a variety of different instruments. Several radiometric instruments provide measurements across almost the entire infrared spectrum (Turner et al., 2012). The passive microwave radiometer HATPRO-G2 (Rose et al., 2005) took the data for this analysis and is described in detail below. The instrumentation was completed by an automatic weather station (AWS) and atmospheric profiles from 128 Vaisala RS92-K radiosondes launched during the campaign. During RHUBC-II PWV ranged from 0.2 to 1.5 mm with an average value of 0.6 ± 0.3 mm. The synoptic station at the site recorded extraordinarily constant surface conditions: a daily mean temperature of 266.3 ± 2.8 K, an air pressure of 532.2 ± 2 hPa, and a relative humidity of 23 ± 15 %.
The HATPRO-G2 is a 14-channel total-power microwave radiometer measuring within the K- and the V-band, allowing high temporal resolution retrievals of PWV, Liquid Water Path (LWP), humidity and temperature profiles from brightness temperature measurements. It was manufactured by Radiometer Physics GmbH and is operated by the University of Cologne. In the K-band, atmospheric radiation is measured at seven channels along the wing of the water vapor absorption line at 22.24 GHz. Another seven V-band channels measure along the oxygen absorption complex centered around 60 GHz (Fig. 1). The receivers of each frequency band are designed as filter banks in order to acquire measurements at each frequency channel simultaneously. The channels have been designed with well characterized band-pass filters allowing precise frequency allocation of the signal. The eleven most transparent channels (K-band and V-band up to 54.94 GHz) have a half-power bandwidth of 230 MHz. For the three remaining channel frequencies the spectral gradient is low, so wider bandwidths of 600, 1000 and 2000 MHz provide higher precision at equal integration times. A parabolic mirror with a diameter of 250 mm gives an antenna half-power beam width of 3.3-3.7° for the channels along the water vapor line and of 2.5-2.7° along the oxygen absorption complex. During RHUBC-II the radiometer was operated in a continuous scanning mode. It measured at symmetric elevation angles in a fixed azimuthal plane (70°/250°) at elevations of 90.0, 45.0, 30.0, 15.0, 9.6 and 4.8°. With an integration time of 1 s each scan takes about 30 s. In this study, these scans are used to apply the tipping curve calibration technique to the stored voltages of the nine low-opacity radiometer channels between 22.24 and 52.28 GHz.
Fig. 1. Highly spectrally resolved brightness temperature (T_b, top panels) and opacity (τ, bottom panels), calculated by the Rosenkranz'98 absorption model (Rosenkranz, 1998), for the spectral range covered by HATPRO-G2. Spectral features are the water vapor absorption line at 22.24 GHz (left panels) and the oxygen absorption complex at 60 GHz (right panels). At 530 hPa, pressure broadening is reduced and single transitions can be recognized in the spectral wing of the oxygen absorption complex (lower lines). The upward-looking spectrum is plotted for a very dry (PWV = 0.3 mm) RHUBC-II radiosonde profile from 13 September 2009. For comparison the spectrum is calculated for sea-level conditions using a subtropical standard atmosphere (higher T_b and τ values). HATPRO-G2 channels are illustrated by the width of band-pass filters (gray). Note that band-pass filters of the three most opaque channels overlap.

Radiometer calibration
Calibrating the radiometer means determining the relation between the detected voltage (U_det) and the received power. The relation is derived by measuring against black body targets with a well-known physical temperature T_phys. The spectral radiance B_ν that is emitted by the black body target is given by Planck's law:
$$B_\nu(T_\mathrm{phys}) = \frac{2 h \nu^3}{c^2} \, \frac{1}{\exp\left(h\nu / (k_b T_\mathrm{phys})\right) - 1} \tag{1}$$
in W m^-2 sr^-1 Hz^-1, with the Boltzmann constant k_b, Planck's constant h and the speed of light c. Calibration procedures based on Eq. (1) provide measurements expressed in Planck equivalent brightness temperatures
$$T_b^\mathrm{PL} = B_\nu^{-1}(I_\nu) = \frac{h\nu}{k_b} \left[\ln\left(\frac{2 h \nu^3}{c^2 I_\nu} + 1\right)\right]^{-1} \tag{2}$$
with the received spectral radiance I_ν. In the following, given temperature values are always Planck equivalent. The relation between detected voltages and measured brightness temperatures T_b is expressed by the following set of calibration parameters:
$$U_\mathrm{det} = g \, T_\mathrm{sys}^{\alpha} \tag{3}$$
with the detector gain g, the total system noise temperature T_sys = T_R + T_b (being the sum of the receiver noise temperature T_R and measured T_b) and the non-linearity parameter α (Radiometer Physics GmbH, 2011). At the beginning of every deployment period, the complete set of calibration parameters for HATPRO-G2 is determined by an LN2 calibration. Additionally, during operation noise diode calibrations (every 30 min) and hot load calibrations (every 5 min) are performed. For details on the procedures' input and output parameters see Table 1. In the following the different procedures are introduced.
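The conversion between physical temperature, spectral radiance and Planck equivalent brightness temperature can be illustrated with a minimal Python sketch. The function names are chosen here for illustration and are not taken from the instrument software; the constants are the SI values.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 299792458.0      # speed of light, m/s

def planck_radiance(nu_hz, temp_k):
    """Black body spectral radiance B_nu(T) in W m^-2 sr^-1 Hz^-1."""
    return 2.0 * H * nu_hz**3 / C**2 / math.expm1(H * nu_hz / (K_B * temp_k))

def planck_brightness_temperature(nu_hz, radiance):
    """Invert Planck's law: Planck equivalent brightness temperature in K."""
    return H * nu_hz / K_B / math.log1p(2.0 * H * nu_hz**3 / (C**2 * radiance))

# Round trip at the 22.24 GHz channel for a target at 77.36 K
nu = 22.24e9
radiance = planck_radiance(nu, 77.36)
print(planck_brightness_temperature(nu, radiance))
```

Using `expm1` and `log1p` keeps the round trip numerically stable at microwave frequencies, where hν/(k_b T) is much smaller than one.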

Liquid nitrogen calibration
HATPRO-G2 uses an LN2 target that is mounted alongside the radiometer for calibration. The calibration target is observed from above using a reflector that is tilted by 45°. The load is filled with a microwave absorber to guarantee the target's black body properties. The target temperature T_C is the actual boiling temperature of LN2. The hot load is an internal black body at ambient temperature T_H, where venting ensures a homogeneous temperature distribution. In order to take the system non-linearity α into account, a 4-point calibration scheme is used (Fig. 2), which means that the radiometer measures at four calibration points to derive the calibration parameters g, T_R and α (Eq. 3) (Radiometer Physics GmbH, 2011). The determination of α is realized by noise injection from an internal noise diode (T_N). After integrating on the hot and cold target for 30 s each, the cycle is repeated with additionally injected noise. T_R, g, T_N and α are then determined by solving the following system of equations:
$$U_C = g \, (T_R + T_C)^{\alpha} \tag{4}$$
$$U_H = g \, (T_R + T_H)^{\alpha} \tag{5}$$
$$U_{CN} = g \, (T_R + T_C + T_N)^{\alpha} \tag{6}$$
$$U_{HN} = g \, (T_R + T_H + T_N)^{\alpha} \tag{7}$$
with U_C, U_H, U_CN and U_HN denoting radiometer voltages measured on the cold, ambient, cold-plus-noise and ambient-plus-noise references, respectively. Instead of using Planck spectral radiances B_ν(T) within these calibration equations, Eq. (1) is expanded in terms of (hν/k_b T) and truncated after the first term. For HATPRO-G2's frequency range of 22 GHz < f < 58 GHz and temperature range of T_C < T_phys < T_H the truncation error of this approximation is negligible. Furthermore, Eqs. (4)-(7) are valid within HATPRO-G2's frequency range as long as the non-linearity α is close to one. This condition is fulfilled by the HATPRO-G2 receivers. As a consequence, the targets' physical temperatures can directly be inserted into Eqs. (4)-(7). The derived calibration parameters and the resulting brightness temperatures agree with the black body equivalent brightness temperatures T_b^PL (Eq. 2).
After a successful calibration, T_N from a well burned-in noise diode is stable enough to serve as a secondary calibration standard during operation. During RHUBC-II only a single LN2 calibration was performed, after radiometer setup on 11 August 2009. Therefore, in Sect. 4.1.5, we give a repeatability analysis of the calibration procedure using LN2 calibrations performed with the same radiometer at the Jülich Research Center.
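A 4-point scheme of this kind can be sketched numerically as follows. The sketch assumes the detector model U = g·(T_R + T)^α for all four calibration points; the bisection solver, the bracket for 1/α and all numbers in the usage example are illustrative choices, not the manufacturer's implementation.

```python
def solve_four_point(u_c, u_h, u_cn, u_hn, t_c, t_h):
    """Return (g, T_R, T_N, alpha) for the assumed model U = g * T_sys**alpha."""
    def mismatch(x):
        # With x = 1/alpha the voltages become linear in temperature, so the
        # hot-cold difference must be the same with and without injected noise.
        return (u_hn**x - u_cn**x) - (u_h**x - u_c**x)

    lo, hi = 0.5, 2.0  # assumed bracket for 1/alpha
    if mismatch(lo) * mismatch(hi) > 0:
        raise ValueError("no sign change inside the bracket")
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0:
            hi = mid
        else:
            lo = mid
    x = 0.5 * (lo + hi)
    slope = (u_h**x - u_c**x) / (t_h - t_c)  # equals g**(1/alpha)
    t_r = u_c**x / slope - t_c
    t_n = (u_cn**x - u_c**x) / slope
    alpha = 1.0 / x
    return slope**alpha, t_r, t_n, alpha

# Synthetic example with hypothetical instrument parameters
g0, tr0, tn0, a0 = 0.005, 300.0, 250.0, 0.98
tc, th = 73.0, 300.0
volt = lambda t: g0 * (tr0 + t) ** a0
print(solve_four_point(volt(tc), volt(th), volt(tc + tn0), volt(th + tn0), tc, th))
```

Reducing the four equations to a one-dimensional root search for 1/α avoids a general non-linear solver; the remaining three parameters then follow in closed form.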

Tipping curve calibration
An alternative to the LN2 calibration is the so-called tipping curve calibration, which can be used to calibrate low-opacity radiometer channels. The general idea is to replace the LN2 target by the cold clear sky. Han and Westwater (2000) discuss this method in detail and specify an absolute calibration accuracy of better than 0.5 K for K-band channels. In principle, the method uses opacity-air mass pairs from elevation scans. T_b measurements are recalibrated in an iterative process, where the opacity τ is calculated from the following equation:
$$\tau = \ln\left(\frac{B(T_\mathrm{mr}) - B(T_\mathrm{back})}{B(T_\mathrm{mr}) - B(T_b)}\right) \tag{8}$$
with Planck spectral radiances B (Eq. 1). T_back (= 2.73 K) denotes the cosmic background radiation and T_mr is the mean radiative temperature of the atmosphere, which is calculated independently (Sect. 4.2.1). The tipping curve calibration encompasses a large brightness temperature range and Eq. (8) is highly non-linear. Therefore, it is necessary to perform the tipping curve calibration in the power domain (B(T_b)) and convert the power values to black body equivalent T_b by the inverse of Eq. (1) afterwards. Initial T_b values in Eq. (8) are set by using a prior calibration or an educated guess. Assuming clear sky conditions and a homogeneously stratified and non-opaque atmosphere, the opacity scales linearly with the air mass for low optical depths along the slant path. The slope of this linear relation is the zenith opacity τ^zen. Zenith T_b^zen and τ^zen are connected by
$$B(T_b^\mathrm{zen}) = B(T_\mathrm{back}) \, e^{-\tau^\mathrm{zen}} + B(T_\mathrm{mr}) \left(1 - e^{-\tau^\mathrm{zen}}\right) \tag{9}$$
B(T_b^zen) provides an updated cold reference at zenith. This procedure is repeated iteratively and converges to optimal calibration parameters g and T_R (Eq. 3), assuming that α remains unchanged. For a perfect calibration the regression line would pass through the origin of the opacity-air mass diagram. However, deviations can result from horizontal atmospheric inhomogeneities within the scanned volume. Quality thresholds that guarantee the goodness of the fit are used to filter out tipping curve calibrations under inhomogeneous conditions.
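One regression step of such a tipping curve can be sketched in Python as follows. This is a simplified illustration, not the manufacturer's routine: it assumes a plane-parallel air mass of 1/sin(elevation) without Earth curvature correction, performs a single fit rather than the full iteration on g and T_R, and uses function names chosen here.

```python
import math

H, K_B, C = 6.62607015e-34, 1.380649e-23, 299792458.0

def planck(nu, t):
    """Black body spectral radiance (power domain)."""
    return 2.0 * H * nu**3 / C**2 / math.expm1(H * nu / (K_B * t))

def inverse_planck(nu, b):
    """Planck equivalent brightness temperature for radiance b."""
    return H * nu / K_B / math.log1p(2.0 * H * nu**3 / (C**2 * b))

def opacity(nu, t_b, t_mr, t_back=2.73):
    """Slant-path opacity from a brightness temperature."""
    return math.log((planck(nu, t_mr) - planck(nu, t_back))
                    / (planck(nu, t_mr) - planck(nu, t_b)))

def tipping_curve(nu, elevations_deg, t_bs, t_mr, t_back=2.73):
    """Fit opacity against air mass; return (tau_zen, intercept, T_b_zen)."""
    air_mass = [1.0 / math.sin(math.radians(e)) for e in elevations_deg]
    tau = [opacity(nu, tb, t_mr, t_back) for tb in t_bs]
    n = len(air_mass)
    mean_a, mean_t = sum(air_mass) / n, sum(tau) / n
    slope = (sum((a - mean_a) * (t - mean_t) for a, t in zip(air_mass, tau))
             / sum((a - mean_a) ** 2 for a in air_mass))
    intercept = mean_t - slope * mean_a  # close to zero for a good calibration
    b_zen = (planck(nu, t_back) * math.exp(-slope)
             + planck(nu, t_mr) * (1.0 - math.exp(-slope)))
    return slope, intercept, inverse_planck(nu, b_zen)
```

In a full calibration the returned zenith brightness temperature would serve as the updated cold reference, and the fit would be repeated until the calibration parameters stop changing. The intercept can be used as a simple quality measure for horizontal homogeneity.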
Tipping curves are a standard calibration tool that is commonly used to calibrate K-band channels of ground-based radiometers. During RHUBC-II the manufacturer's internal procedure performed a tipping curve calibration of the K-band channels every 6 h. For this, the radiometer scanned towards an azimuth of 50.0° N at the following elevation angles: 30.0, 33.3, 38.4, 45.6, 56.4 and 90°. Throughout the campaign, 97 of 269 tipping curves passed the manufacturer's internal quality checks.
V-band channels are generally not included in tipping curve routines, because at sea level the opacity of these channels is too high. However, at 530 hPa, HATPRO-G2 channels at 51.26 and 52.28 GHz are transparent enough to apply the tipping curve calibration. Zenith T_b is below 40 and 60 K, respectively (Fig. 1), well below the boiling temperature of LN2. For the analysis in this study the tipping curve calibration is not only applied to the K-band channels but also to the first two V-band channels (Sect. 4). Quality thresholds for the horizontal atmospheric homogeneity can be varied (cf. Sect. 4.2.5) and possible uncertainty sources can be investigated separately. The algorithm used within this analysis has been verified with K-band results from the manufacturer's internal calibration procedure.

Noise diode calibration
T_N is determined by every HATPRO-G2 LN2 and tipping curve calibration (Sects. 3.1 and 3.2). T_N is stable enough to be used as a secondary calibration standard for several months (cf. Sect. 4.1.5). The stability of g for the V-band channels is guaranteed by injecting noise periodically, with a frequency of 10 Hz, while the radiometer is pointing to an arbitrary scene (T_scene). T_scene can be extracted by using the ratio of the detected signal with and without noise.
Using T_R from a previous hot load calibration (cf. Sect. 3.4) gives a corrected detector gain g. Updating g at such a high rate is not necessary for the K-band channels, because they use gallium arsenide (GaAs) technology, which is more stable than the indium phosphide (InP) technology used in the V-band. Therefore, in the K-band a noise calibration is performed every 30 min to recalibrate T_R. The noise is injected while pointing to the ambient temperature target, whereby T_R can be updated.
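The noise switching principle can be illustrated with a short Python sketch. It assumes the detector model U = g·T_sys^α, under which the ratio of voltages with and without injected noise is independent of the gain; the function names and numbers are hypothetical and not part of the instrument software.

```python
def scene_temperature(u_off, u_on, t_n, t_r, alpha=1.0):
    """Scene temperature from voltages without (u_off) and with (u_on) injected noise.

    Assumes u_off = g*(T_R + T_scene)**alpha and u_on = g*(T_R + T_scene + T_N)**alpha,
    so that the gain g cancels in the ratio.
    """
    ratio = (u_on / u_off) ** (1.0 / alpha)
    return t_n / (ratio - 1.0) - t_r

def gain_from_scene(u_off, t_scene, t_r, alpha=1.0):
    """Detector gain once the scene temperature and T_R are known."""
    return u_off / (t_r + t_scene) ** alpha

# Hypothetical receiver: g = 0.004, T_R = 400 K, T_N = 300 K, alpha = 0.97
g, t_r, t_n, alpha, t_scene = 0.004, 400.0, 300.0, 0.97, 45.0
u_off = g * (t_r + t_scene) ** alpha
u_on = g * (t_r + t_scene + t_n) ** alpha
print(scene_temperature(u_off, u_on, t_n, t_r, alpha))
print(gain_from_scene(u_off, t_scene, t_r, alpha))
```

Because the gain cancels in the voltage ratio, fast gain fluctuations do not bias the retrieved scene temperature as long as T_N and T_R remain stable between hot load calibrations.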

Hot load calibration
This calibration type uses the ambient hot load target as a reference. The hot load temperature T_H is known from a precision in situ measurement within the target itself. HATPRO-G2 uses a second sensor to check for malfunction of the primary sensor. Generally, temperature gradients within the target may lead to a T_H that is not representative of the emitted radiance (McGrath and Hewison, 2001). For HATPRO-G2 such temperature gradients are sufficiently reduced (< 0.2 K, Sect. 4.1.4), because the airflow from the internal blower covers the internal target. In order to correct for significant changes in HATPRO-G2's detector gain that occur on time scales longer than 5 min, the hot load target is viewed every 5 min with an integration time of 4 s. For the V-band channels, g is known from continuous noise switching (Sect. 3.3) and thus T_N is used to update T_R during every hot load target calibration.

Liquid nitrogen calibration
Biases of 1-2 K are found when comparing simulated and measured T_b for radiometer channels along the oxygen absorption complex at 60 GHz (Liljegren, 2002; Löhnert and Maier, 2012). However, the question remains whether these biases originate from uncertainties in the LN2 calibration or in the oxygen absorption model. This bias would be too large for an absolute evaluation of absorption models because absorption model uncertainties are of the same order (Hewison and Cimini, 2006). Therefore, the absolute accuracy of HATPRO-G2 measurements is assessed by propagating uncertainties of the calibration parameters through the LN2 calibration procedure. The contribution of each source of uncertainty is estimated by separately varying the individual input parameters for U_det (Eqs. 4-7). The total system noise temperature T_sys, when pointing to the LN2 target, is modeled as
$$T_\mathrm{sys} = (1 - r) \, T_\mathrm{tar} + r \, T_\mathrm{cont} + T_R + T_N \tag{10}$$
with the signal emitted by the target itself (T_tar), a contaminating component T_cont, which is reflected into the signal beam, the receiver noise T_R, and T_N, which is injected by an internal noise diode. r is the reflectivity of the calibration target:
$$r = \left(\frac{n - 1}{n + 1}\right)^2 \tag{11}$$
with n being the real part of the target's refractive index. r would be zero in case of a perfect black body target. For further information on reflectivities of radiometer calibration targets refer to Randa et al. (2005). T_N and T_R are determined by the calibration procedure (Eqs. 4-7). The uncertainty of T_N and T_R results from error sources associated with T_tar, T_cont and r. The associated error sources are analyzed in the following. An important error source of a calibration that uses an external LN2 target is the formation of liquid water on the radome due to condensation. This effect is not discussed here, because HATPRO-G2 uses a combined dew blower/heater system, which effectively prevents condensation during measurements and LN2 calibrations.
Fig. 3. (a) … (Span, 2000), crosses: calibration points T_C and T_H. (b) Impact on the derived T_b from differences in the assumed temperature of the cold calibration target (T_C). Lines: LN2 calibration simulations using the different corrections for the 23.04 GHz HATPRO-G2 channel, X: cold and hot calibration point.

Boiling point correction
When calibrating against a liquid nitrogen target, the LN2 boiling point serves as reference. At standard pressure (p_0 = 1013.25 hPa) this is T_0 = 77.36 K. However, the boiling point depends on the atmospheric pressure p and has to be corrected accordingly. For HATPRO-G2 the corrected boiling point is calculated from the surface pressure supplied by the internal sensor. At 530 hPa the correction amounts to several kelvin. Any inaccuracy in the applied correction impacts T_b, especially for scenes close to or below the cold calibration temperature. However, different instrument manufacturers use different formulas for the pressure-dependent correction (Fig. 3), here denoted T_C^RPG and T_C^RAD (with p in hPa), which refer to Radiometer Physics GmbH (2011) and Radiometrics (2007), respectively. The different formulations might be explained by the fact that the pressure dependency is often calculated for high technical pressures in the laboratory. At standard pressure the discrepancy between these formulations is still mostly negligible; with decreasing pressure, however, it increases and reaches 1.1 K at 530 hPa. In Fig. 3 the impact of the different boiling point corrections on the calibration is shown. The effect itself is not frequency-dependent, but the impact on measured T_b increases with decreasing channel opacity. It can be seen that moving away from the hot target increases the cold target's influence. At the LN2 boiling point the discrepancy between the different corrections already exceeds 1 K. The proper pressure-dependent correction for the LN2 boiling point can be calculated from the Clausius-Clapeyron equation (Eq. 14), with the boiling point of liquid nitrogen T and the ideal gas constant R. Apart from the pressure, Eq. (14) only depends on the heat of vaporization H_vap, which is constant for an ideal gas. Based on Eq. (14), a consistent formulation (Eq. 15) is now used. Using the atmospheric pressure during the LN 2 calibration at RHUBC-II of p = 534.7 hPa, Eq. (15) (Span, 2000).
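How a Clausius-Clapeyron based boiling point correction behaves can be sketched in Python. The sketch uses the integrated form with a constant heat of vaporization; the value H_vap ≈ 5.57 kJ/mol is a textbook figure assumed here, not a number taken from this study, so the result only approximates the correction that the paper actually applies.

```python
import math

R_GAS = 8.314462618       # ideal gas constant, J mol^-1 K^-1
T0, P0 = 77.36, 1013.25   # LN2 boiling point (K) at standard pressure (hPa)
H_VAP = 5.57e3            # assumed heat of vaporization of N2, J mol^-1

def ln2_boiling_point(p_hpa, h_vap=H_VAP):
    """Boiling point from the integrated Clausius-Clapeyron relation.

    ln(p / p0) = -(H_vap / R) * (1/T - 1/T0), with H_vap taken as constant.
    """
    return 1.0 / (1.0 / T0 - R_GAS / h_vap * math.log(p_hpa / P0))

# Pressure during the RHUBC-II LN2 calibration
print(ln2_boiling_point(534.7))
```

With these assumptions the boiling point at 534.7 hPa comes out roughly 5 K below the standard-pressure value, in line with the statement that the correction amounts to several kelvin at the RHUBC-II site.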

Refractive index of liquid nitrogen
The refractive index of the LN2 target, n_LN2, determines the reflectivity r_LN2 of its surface (Eq. 11). n_LN2 is derived from laboratory measurements. Reesor et al. (1975) determine n_LN2 = 1.2 for a frequency range of 18 to 26 GHz. The results are in agreement with several other experiments at frequencies between 0.5 GHz (Hosking, 1993) and 130 GHz (Vinogradov et al., 1967). Therefore, n_LN2 = 1.2, which corresponds to a reflectivity of r_LN2 = 0.82 % (Table 2), is assumed for all HATPRO-G2 channels. The contaminating signal T_cont (Eq. 10) is assumed to originate from the receiver (T_rec = 305 K) and to be reflected back at the LN2 surface. As T_rec is much higher than the LN2 target temperature, even a small reflectivity results in a reflective component that has to be added to the cold target temperature. In this case, a reflective component of T_refl = 1.9 ± 0.6 K is calculated when an n_LN2 uncertainty of ±0.03 is assumed (Benson et al., 1983). The resulting T_b uncertainty reaches 1.4 K for T_b = 5 K, which is the minimum measured T_b within the K-band during RHUBC-II (Fig. 4). The effect decreases linearly with increasing T_b. For the most opaque channels in the V-band the uncertainty reduces to 0.1-0.2 K. Finally, it disappears at the hot calibration point.
Additionally, the LN2 surface is exposed to the environment and is thus deformed by capillary waves and boiling bubbles. On the one hand, this can lead to frequency-dependent interferences resulting in variations in the reflectivity r_LN2 of the target (Shitov et al., 2011), because the deformation length scale is in the range of the received signal wavelength. These variations occur on time scales well below the integration time on the LN2 target of 30 s. Therefore, they are assumed to compensate each other over the integration period. On the other hand, air bubbles on the LN2 surface reduce the density of the interacting surface. This effect will not be compensated by averaging and may improve the cold target's black body characteristics because r_LN2 is reduced. However, it is difficult to quantify this effect. Assuming that the density of the interacting LN2-air volume is reduced by 1 % due to air bubbles would lead to a T_C reduced by 0.2 K.
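The size of the reflective component can be checked with a small Python sketch. It assumes normal-incidence Fresnel reflection at the LN2 surface and a reflected contribution of r·(T_rec − T_tar); the cold target temperature of 73 K is a hypothetical value for a low-pressure site and is not taken from the text.

```python
def fresnel_reflectivity(n):
    """Power reflectivity at normal incidence for real refractive index n."""
    return ((n - 1.0) / (n + 1.0)) ** 2

def reflected_component(n, t_cont, t_tar):
    """Extra brightness temperature seen on the target due to reflection."""
    return fresnel_reflectivity(n) * (t_cont - t_tar)

T_REC = 305.0  # receiver temperature from the text, K
T_TAR = 73.0   # assumed LN2 temperature at low pressure, K (hypothetical)

for n in (1.17, 1.20, 1.23):  # nominal refractive index +/- 0.03
    print(n, fresnel_reflectivity(n), reflected_component(n, T_REC, T_TAR))
```

For n = 1.2 the sketch gives a reflectivity of about 0.8 % and a reflective component close to 1.9 K, and varying n by ±0.03 changes the component by roughly half a kelvin in either direction.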

Resonance effect
Continuous observations of the LN2 evaporating from the cold load were carried out at the Jülich Research Center. The observations were conducted with the same radiometer that was previously deployed at RHUBC-II. While the level of the LN2 declines, its distance s to the receiver increases and the resonance condition changes. The effect on T_tar (Eq. 10) can be interpreted as a small perturbation res(s). Averaged over time, res(s) is zero (⟨res(s(t))⟩ = 0). Via r_res these standing waves contribute to the reflective component. Note that r_res is unknown, because the reflectivity on the receiver end is not known. The corresponding amplitudes and phases of the observed signal are frequency-dependent and affect the uncertainty of the calibration point. The maximum uncertainty is estimated to be twice the amplitude of the oscillation observed at each channel, because the integration time within the LN2 calibration is small compared to the oscillation periods. K-band channels show oscillation amplitudes of 0.1 to 0.6 K. In the V-band the noise level is generally higher, which makes oscillations more difficult to detect and may reduce the apparent amplitudes, which are 0.1-0.3 K. For both receiver bands, oscillations in the band's center show higher amplitudes. Most probably this is caused by the isolator, which is located directly after the feed horn. The isolator is designed to cover the large bandwidth of the whole receiver. Its performance might slightly decrease towards the band edges. Figure 4 shows the impact of the resonances on calibrated T_b for 23.04 GHz. This channel has the largest uncertainty: for RHUBC-II measurements with T_b below 10 K the maximum uncertainty range reaches 1.2 K. The impact of the resonances can be diminished by integrating on the cold target for exactly one oscillation period. However, the periods are channel-dependent. Therefore, a more practical solution is to determine the calibration parameters from repeated calibrations while the LN2 evaporates. The remaining uncertainty is due to the fact that only a limited number of repetitions is possible before the LN2 has evaporated. In this study, the maximum uncertainty from resonances is assumed, because for RHUBC-II only a single cold load integration was performed.
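Why the integration time matters can be illustrated with a Python sketch that averages an idealized sinusoidal perturbation over different windows. The sinusoidal shape and the oscillation period of 600 s are assumptions made here for illustration; only the 0.6 K amplitude and the 30 s integration time come from the text.

```python
import math

def mean_perturbation(amplitude, period_s, t_start_s, t_int_s, steps=10000):
    """Average of amplitude*sin(2*pi*t/period) over [t_start, t_start + t_int]."""
    dt = t_int_s / steps
    total = 0.0
    for i in range(steps):
        t = t_start_s + (i + 0.5) * dt  # midpoint rule
        total += amplitude * math.sin(2.0 * math.pi * t / period_s)
    return total / steps

AMPLITUDE = 0.6   # K, largest K-band amplitude quoted in the text
PERIOD = 600.0    # s, hypothetical oscillation period

# Integrating over one full period removes the perturbation ...
print(mean_perturbation(AMPLITUDE, PERIOD, 37.0, PERIOD))
# ... whereas a 30 s window near a crest retains almost the full amplitude.
print(mean_perturbation(AMPLITUDE, PERIOD, 135.0, 30.0))
```

Since a short integration can fall anywhere between a crest and a trough, the possible offsets span twice the amplitude, which corresponds to the maximum uncertainty range assumed above.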

Non-linearity parameter
Concerning the non-linearity parameter α there are two aspects to be analyzed: (1) the effect of α on the calibration characteristic (T_b = f(U_det)) and (2) its variability. The first point is addressed by simulating the LN2 calibration with a classical 2-point calibration scheme. This scheme is a simplification of the 4-point scheme: it assumes a linear relation between U_det and measured T_b (α = 1) and no additional noise is injected. The impact on the calibration characteristic is given in Fig. 5. Apart from numerical effects, the impact vanishes at the two calibration points T_C and T_H. The impact on the calibration is slightly higher for V-band channels because α is generally smaller than for the K-band channels. Nevertheless, the effect depends on T_b: the maximum difference of 0.3 K is reached at the lower end of the measured T_b range. Therefore, K-band channels are most affected. In any case, considering the detector non-linearity noticeably improves the calibration. The variability in α is investigated using 27 LN2 calibrations that were performed with HATPRO-G2 at the Jülich Research Center, Germany, between July 2010 and November 2011. α cannot be determined independently, because it characterizes the non-linearity of the whole radiometer system. Within the considered time period, no significant drift of α is found. Therefore, the standard deviation of α over all 27 calibrations reflects the random uncertainty of the α determination, which is 0.1-0.2 % of the mean α value of each channel. It can be concluded that α is solely a frequency-dependent instrument property. For the K-band channels the variability is largest at 22.24 GHz and decreases with frequency. Figure 5 shows the impact on calibrated T_b when the original α is varied within the range of uncertainty. Again, the effect is largest for small measured T_b. Combined with the higher variability of α, K-band channels are most affected. In any case, the effect does not exceed ±0.04 K (Table 5) and is therefore negligible.
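The difference between a linear 2-point calibration and a non-linear detector characteristic can be sketched in Python. The detector model U = g·(T_R + T_b)^α and all parameter values below are assumptions for illustration, so the resulting numbers are not the ones shown in Fig. 5.

```python
def voltage(t_b, g, t_r, alpha):
    """Assumed detector characteristic U = g * (T_R + T_b)**alpha."""
    return g * (t_r + t_b) ** alpha

def two_point_tb(u, u_c, u_h, t_c, t_h):
    """Brightness temperature from a linear calibration through both targets."""
    return t_c + (u - u_c) * (t_h - t_c) / (u_h - u_c)

def linearity_error(t_b, g, t_r, alpha, t_c, t_h):
    """Error made by the 2-point scheme for a scene at true temperature t_b."""
    u_c = voltage(t_c, g, t_r, alpha)
    u_h = voltage(t_h, g, t_r, alpha)
    return two_point_tb(voltage(t_b, g, t_r, alpha), u_c, u_h, t_c, t_h) - t_b

# Hypothetical receiver: g = 1, T_R = 300 K, alpha = 0.98, targets at 73 K and 300 K
for t_b in (5.0, 73.0, 150.0, 300.0):
    print(t_b, linearity_error(t_b, 1.0, 300.0, 0.98, 73.0, 300.0))
```

The error is zero at both calibration points by construction and grows when extrapolating to scenes colder than the LN2 target, which is the situation for the transparent channels at the RHUBC-II site.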

Hot load temperature
All calibration types (Table 1)  by assuming a maximum uncertainty of ±0.2 K for the in situ temperature measurement. This value is reasonable as it reflects the maximum deviation between the two temperature sensors within the ambient target during the campaign.
For the three saturated V-band channels the resulting T_b uncertainty lies between ±0.2 and ±0.3 K. For smaller T_b values the impact of the uncertainty of the in situ measurement drops linearly. All other channels are affected by approximately ±0.1 K.

Repeatability and validity
The repeatability is the capability of a calibration to reproduce the calibration parameters on time scales with negligible instrument drift. It is particularly important to assess the repeatability, because in this study a single LN 2 calibration is compared to a number of tipping curve calibrations. We address this issue by analyzing 11 LN 2 calibrations that were performed within about two hours, using the same radiometer that was deployed at RHUBC-II. The measurements were conducted at the Jülich Research Center on 10 November 2011. It is assumed that the repeatability is characterized by the stability of the noise diode temperature T N , which is determined with every calibration. Using T N from the 11 calibrations allows to map the uncertainty within T N into the range of measured brightness temperatures. For these simulations the detector voltages, the target temperatures, and the other three calibration parameters in a 4-point calibration scheme (the system noise temperature T R , detector gain g and the detector non-linearity parameter α) are kept constant while only T N is varied. The impact on measured T b is demonstrated for the HATPRO-G2 channels at 51.26 GHz (Fig. 6). For this channel the standard deviation of T N is σ (T N ) = 2.0 K. Variations in T N affect T b measurements below the hot calibration point T H . The repeatability can be expressed by the standard deviation σ (T b ). σ (T b ) does not only depend on the variation of σ (T N ), but also on the distance between T N and T H , and the T b value itself. The lower T b , the larger the impact of T N variations. For typical T b values at 51.26 GHz under RHUBC-II conditions (T b ≈ 38.4 K) σ (T b ) is 0.4 K. The results are summarized for all channels in Fig. 6. It shows that the repeatability σ (T b ) ranges between 0.2 and 0.4 K for the non-opaque channels. 
For the opaque channels above 54 GHz, the impact of T N on T b is negligibly small, because the calibration is dominated by the ambient target temperature. It is assumed that the uncertainty is mainly caused by noise and small variations from calibration to calibration in the level of LN 2 in the cold calibration load, which changes the resonance condition for standing waves between the receiver and the LN 2 surface.
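The growth of σ (T b ) with distance from the hot calibration point can be illustrated with a deliberately simplified model. The sketch below assumes a linear detector (α = 1) whose gain is derived from the hot target observed with and without noise injection; this is not the 4-point scheme used in this study, and the function names as well as the values of T H , T N and σ (T N ) are hypothetical.

```python
def tb_from_voltages(u_scene, u_hot, u_hot_noise, t_hot, t_noise):
    """Scene T_b (K) for a linear detector (alpha = 1, an assumption)."""
    gain = (u_hot_noise - u_hot) / t_noise  # volts per kelvin
    return t_hot - (u_hot - u_scene) / gain


def sigma_tb(tb, t_hot, t_noise, sigma_tn):
    """First-order propagation: |dT_b/dT_N| = |T_b - T_H| / T_N."""
    return abs(tb - t_hot) / t_noise * sigma_tn


# Hypothetical values, not taken from the paper.
T_HOT, T_NOISE, SIGMA_TN = 290.0, 500.0, 2.0
for tb in (290.0, 200.0, 40.0):
    s = sigma_tb(tb, T_HOT, T_NOISE, SIGMA_TN)
    print(f"T_b = {tb:5.1f} K -> sigma(T_b) = {s:.2f} K")
```

In this simplified picture the sensitivity vanishes at T b = T H and grows linearly as T b decreases, which is qualitatively the behavior described above.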
The second point we would like to address here is the validity period of the LN 2 calibration. This aspect is rather important for the V-band channels which cannot be calibrated by the tipping curve calibration. Measurements of these channels depend on the noise diode temperature derived from a liquid nitrogen calibration. Our paper includes a comparison of the liquid nitrogen calibration and the tipping curve calibration. Results of the liquid nitrogen calibration, performed on 11 August 2009, are compared to tipping curve results from 16 August 2009. This comparison is only possible because the instrument's drift over this time period is negligible.
Like the repeatability, the drift is recorded by the variation of T N . We have analyzed 27 liquid nitrogen calibrations performed with HATPRO-G2 at the Jülich Research Center between June 2010 and May 2012. The time period between subsequent calibrations ranges from less than one hour to several months. Still, a trend analysis of these calibrations reveals significant trends of T N . Depending on the channel, T N changes by +0.006 to +0.010 K per day in the K-band and by +0.054 to +0.072 K per day in the V-band (Table 3). The trends were calculated for a significance level of 0.05. The calculated trends are statistically significant for all channels apart from the channel at 31.40 GHz. The resulting confidence intervals vary between ±0.003 K and ±0.010 K per day (Table 3). On the basis of the calculated trends, T N changes by less than +0.1 K in the K-band and +0.4 K in the V-band between 11 and 16 August 2009. The impact on T b can be derived similarly to the repeatability analysis discussed above. We use the first calibration on 10 November 2011 performed in Jülich as reference. When this calibration is simulated with a T N modified by the five-day drift, T b is affected by less than 0.06 ± 0.01 K for all channels. The given uncertainty results from the maximum uncertainty of the calculated trends. We can conclude that the T N drift is negligible for the analysis in this study (Table 3).
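The five-day drift quoted above follows directly from the trend ranges. The short sketch below reproduces this arithmetic under the assumption of a strictly linear drift; the variable names are illustrative.

```python
DAYS = 5  # 11 to 16 August 2009

# Trend ranges of T_N in K per day, as quoted in the text.
trends_k_per_day = {"K-band": (0.006, 0.010), "V-band": (0.054, 0.072)}

drift = {band: (lo * DAYS, hi * DAYS)
         for band, (lo, hi) in trends_k_per_day.items()}

for band, (lo, hi) in drift.items():
    print(f"{band}: T_N drift of {lo:.2f} to {hi:.2f} K over {DAYS} days")
```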

Tipping curve calibration
The tipping curve calibration results and the associated uncertainties are investigated by a procedure that uses the elevation scans in the 70° N/250° N azimuthal plane, which were continuously performed during RHUBC-II. The results from these elevation scans cannot be used to determine the detector non-linearity α and the noise diode temperature T N , because no noise is injected during the scans. Nevertheless, calibrated zenith T b values, which can be derived from each single scan, do not depend on these parameters. Measurements at elevations 90.0, 45.0, 30.0 and 19.2° are considered in the calibration procedure used for this analysis. Expressed in terms of air masses, they correspond to 1.0, 1.4, 2.0 and 3.0, respectively. This is an advantage compared to the method used by Han and Westwater (2000), who combine zenith measurements with only one additional measurement at an air mass that varies between 1.5 and 4. Fitting more than two opacity-air mass pairs allows the user to set quality thresholds that reduce the impact of atmospheric inhomogeneities on the calibration results (Sect. 4.2.5). From each scan passing the quality thresholds, zenith T b is calculated from Eqs. (1), (8) and (9). Calibrated zenith T b could then be used as a cold calibration point T C to derive a T b value for the scene. The impacts of different possible sources of uncertainty are examined in the following. Furthermore, the most appropriate quality thresholds are determined.
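The principle of deriving a zenith T b from opacity-air mass pairs can be sketched as follows. Eqs. (1), (8) and (9) are not reproduced here, so the sketch uses the common single-layer relation between T b and opacity and a least-squares line through the origin. These choices, the function names, the background temperature of 2.73 K (without Planck correction) and the synthetic inputs are assumptions for illustration and may differ from the procedure actually implemented.

```python
import math

T_BG = 2.73  # cosmic background brightness temperature in K (assumed)


def opacity(tb, t_mr, t_bg=T_BG):
    """Slant opacity from T_b for a single-layer atmosphere."""
    return math.log((t_mr - t_bg) / (t_mr - tb))


def brightness(tau, t_mr, t_bg=T_BG):
    """Inverse of opacity(): T_b for a given opacity."""
    return t_bg * math.exp(-tau) + t_mr * (1.0 - math.exp(-tau))


def zenith_tb_from_scan(air_masses, tbs, t_mr):
    """Zenith T_b from the slope of tau versus air mass (line through origin)."""
    taus = [opacity(tb, t_mr) for tb in tbs]
    tau_zenith = (sum(a * t for a, t in zip(air_masses, taus))
                  / sum(a * a for a in air_masses))
    return brightness(tau_zenith, t_mr)


# Synthetic, horizontally stratified scan (hypothetical numbers).
scan_a = [1.0, 1.4, 2.0, 3.0]
scan_tb = [brightness(0.02 * a, 250.0) for a in scan_a]
print(f"zenith T_b = {zenith_tb_from_scan(scan_a, scan_tb, 250.0):.2f} K")
```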

Mean radiative temperature
The frequency-dependent mean radiative temperature of the atmosphere, T mr , is a required parameter for the calibration. If a radiosonde profile exists, it can be calculated using a radiative transfer model to compute the absorption coefficient and optical depth:
$$T_{mr} = \frac{\int_0^\infty T(s)\,\beta(s)\,e^{-\tau(0,s)}\,\mathrm{d}s}{1 - e^{-\tau(0,\infty)}} \qquad (17)$$
with the absorption coefficient β(s), the physical temperature profile along the slant path, T (s), and the total opacity τ (0, ∞) along the slant path. T mr must be determined prior to the calibration and cannot be determined from the radiometer itself. One possibility is to use a climatological T mr derived from radiative transfer calculations with a climatology of atmospheric profiles. However, the tipping curve procedure is more accurate when the ambient surface temperature T surf is used as predictor for T mr in a linear regression scheme (Han and Westwater, 2000). Therefore, τ is calculated from RHUBC-II radiosonde profiles using the Rosenkranz'98 absorption model (Rosenkranz, 1998). Then, the predictand T mr is calculated for the nine non-opaque HATPRO-G2 channels from 22.24 to 52.28 GHz and different elevation angles from Eq. (17). The surface temperature, taken from the campaign's AWS, is averaged over a time period from 5 min before to 1 h after the launch of each radiosonde. Finally, linear regression coefficients are calculated from the obtained T mr -T surf pairs. For 22.24 GHz the RMSE (Root Mean Square Error) is 3.2 K at three air masses. The RMSE decreases towards higher frequencies and reaches 1.1 K/1.2 K at 51.26 GHz/52.28 GHz at three air masses. The regression allows us to estimate T mr from the continuous ambient surface temperature measurements throughout the campaign. Figure 7 shows the sensitivity of calibrated zenith T b to uncertainties in T mr . An uncertainty in T mr of one RMSE affects T b in the V-band by ≈ 0.1 K. In the K-band it has virtually no impact.
Atmos. Meas. Tech., 6, 2641-2658, 2013, www.atmos-meas-tech.net/6/2641/2013/
In order to exclude the effect of biases within the radiative transfer calculations, these are compared to results from the Liebe'93 (Liebe et al., 1993) and the Liebe'87 (Liebe, 1987) models. At three air masses the different models result in a T mr that agrees within 0.1 K (Table 5). T b results are not affected by such small differences. We can therefore conclude that T mr can be derived from surface temperatures with sufficient accuracy for the tipping curve calibration.
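A layer-wise evaluation of T mr (the temperature profile weighted by absorption and transmission, normalized by one minus the total transmission) and the subsequent regression on T surf can be sketched as below. The discretization into homogeneous layers, the function names and all numbers are assumptions for illustration; in practice the absorption coefficients would come from a radiative transfer model.

```python
import math


def mean_radiative_temperature(temps, betas, path_lengths):
    """T_mr for homogeneous layers ordered from the antenna upward.

    temps in K, betas in 1/km, path_lengths (along the slant path) in km.
    """
    tau_below = 0.0
    numerator = 0.0
    for t, beta, ds in zip(temps, betas, path_lengths):
        tau_layer = beta * ds
        # Layer emission, attenuated by the opacity between layer and antenna.
        numerator += t * math.exp(-tau_below) * (1.0 - math.exp(-tau_layer))
        tau_below += tau_layer
    return numerator / (1.0 - math.exp(-tau_below))


def fit_line(xs, ys):
    """Ordinary least squares: returns (intercept, slope)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical three-layer profile.
t_mr = mean_radiative_temperature([270.0, 260.0, 250.0], [0.01] * 3, [1.0] * 3)
print(f"T_mr = {t_mr:.1f} K")
```

For an isothermal atmosphere this expression returns the atmospheric temperature itself, which is a convenient sanity check.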

Air mass correction
The tipping curve procedure uses scene observations to derive opacities at different air mass values. For a plane-parallel non-refracting atmosphere the air mass depends only on the elevation angle θ. In that case, the relative air mass is given by a 0 = 1/sin(θ ). However, for real observations the relative air mass differs from a 0 when observing off zenith. On the one hand, the air mass is reduced due to the Earth curvature. On the other hand, atmospheric refraction increases the air mass. The effects from refraction and curvature increase when approaching the horizon and affect the tipping curve results. Even though the curvature effect is predominant, refraction is not negligible. Here, both effects are considered by a spherical ray-tracing algorithm that uses refractive index profiles derived from RHUBC-II radiosondes. The refractive index n is calculated at each level using a bulk formula: The ray-tracing algorithm calculates the slant path lengths for each height layer. The slant path opacity τ (θ) for the given radiosonde profile is then integrated over the full path length.
Finally, the exact relative air mass value is given by the ratio of slant path opacity to zenith opacity, $a(\theta) = \tau(\theta)/\tau(90^\circ)$. For each tipping curve scan, a is calculated by using the closest available radiosonde profile. The usage of a instead of a 0 does not notably affect the tipping curve results for K-band channels under RHUBC-II conditions. For the two V-band channels, the impact is higher: zenith T b results change by ΔT b = +0.2 K. The uncertainty of the ray-tracing algorithm is assessed by the variation between all four radiosondes of the analyzed day. It is found that the uncertainty does not have any effect on the tipping curve results. Using the air mass correction suggested by Han and Westwater (2000) leads to an underestimation of T b results by 0.1 K at 51.26 and 52.28 GHz, while the K-band is not affected.
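The size of the curvature effect can be estimated with a purely geometric sketch. It compares the plane-parallel air mass a 0 = 1/sin(θ) with the air mass in a spherical, non-refracting atmosphere whose absorber density decays exponentially with height. Refraction is neglected, so this is not the ray-tracing algorithm described above; the scale height, the integration limits and the function names are assumptions, while the site altitude follows the campaign description.

```python
import math

R_EARTH_KM = 6371.0


def air_mass_plane_parallel(elev_deg):
    return 1.0 / math.sin(math.radians(elev_deg))


def air_mass_spherical(elev_deg, scale_height_km=2.0, site_alt_km=5.32,
                       top_km=60.0, dh_km=0.01):
    """Relative air mass with Earth curvature, without refraction."""
    r0 = R_EARTH_KM + site_alt_km
    cos_el = math.cos(math.radians(elev_deg))
    slant = vertical = 0.0
    n_layers = round(top_km / dh_km)
    for i in range(n_layers):
        h = (i + 0.5) * dh_km
        weight = math.exp(-h / scale_height_km) * dh_km
        # Slant path per unit height in a thin spherical shell at radius r0 + h.
        slant += weight / math.sqrt(1.0 - (r0 * cos_el / (r0 + h)) ** 2)
        vertical += weight
    return slant / vertical


for elev in (90.0, 45.0, 30.0):
    print(f"{elev:4.1f} deg: a0 = {air_mass_plane_parallel(elev):.3f}, "
          f"spherical = {air_mass_spherical(elev):.3f}")
```

The spherical value is always at or below a 0 , and the difference grows towards the horizon, consistent with the reduction of the air mass by the curvature effect.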

Beam width correction
The tipping curve calibration assumes that T b measurements are given at distinct nominal air masses. However, microwave radiometers receive the incoming radiation over an aperture that is defined by the antenna pattern (Han and Westwater, 2000). For HATPRO-G2 side lobe suppression is better than −30 dB. Numerical simulations of the beam pattern reveal that 99.5 % of the signal is received within an angular range of two half-power beam widths (HPBW). It is concluded that contribution from outside this range can be neglected.
Still, the finite HPBW affects the effective air mass value because T b does not scale linearly with the elevation angle. For transparent radiometer channels this results in an overestimated T b because a larger part of the signal is received from directions below the nominal air mass value. This effect on measured T b increases with air mass. In order to derive T b measurements for the nominal air mass values, the specified HPBW is used (Rose et al., 2005). The beam is modeled as a Gaussian-shaped lobe, with the doubled HPBW being 7° for the K-band and 5° for the V-band channels. It is resolved by equally distributed and weighted radiative transfer calculations around the nominal elevation angle with a resolution of 0.1° (air mass correction is included). The corrections are applied to all T b measurements within tipping curve scans on 16 August 2009. For air masses between one and three, the correction is below 0.1 K for the K-band channels due to the small opacities under the low water vapor conditions during RHUBC-II. At 51.26 and 52.28 GHz, the correction is 0.1 and 0.2 K at two air masses, and +0.5 and −0.5 K at three air masses, respectively. Zenith T b values from the tipping curve calibrations of these two channels are affected by 0.5 K when using air masses between one and three. Han and Westwater (2000) derive a theoretical T b correction from a Gaussian-shaped antenna pattern and an effective scale height. Compared to the numerical calculations with radiosonde profiles used here, that approach leads to a much smaller beam width correction: the impact on calibrated T b would be less than 0.1 K at three air masses for all channels.
Figure caption (fragment): (el = elevation). Crosses: mean HATPRO-G2 measurements during radiosonde ascents (5 min before and 1 h after each launch). Radiative transfer calculations: corrected air mass values, the antenna beam width and exact band-pass filters are considered.
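The sign of the beam width effect for a transparent channel can be reproduced with a small numerical sketch. It weights a simple T b model with a Gaussian lobe over a range of two HPBW in steps of 0.1°. The single-layer, plane-parallel T b model, the zenith opacity, T mr and the function names are assumptions for illustration; the study itself uses radiative transfer calculations with radiosonde profiles.

```python
import math

T_BG = 2.73  # cosmic background in K (assumed, no Planck correction)


def tb_model(elev_deg, tau_zenith, t_mr):
    """Single-layer T_b with plane-parallel air mass (a simplification)."""
    tau = tau_zenith / math.sin(math.radians(elev_deg))
    return T_BG * math.exp(-tau) + t_mr * (1.0 - math.exp(-tau))


def tb_beam_averaged(elev_deg, tau_zenith, t_mr, hpbw_deg, step_deg=0.1):
    """Gaussian-weighted T_b over +/- one HPBW around the nominal elevation."""
    sigma = hpbw_deg / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    n = int(round(hpbw_deg / step_deg))
    num = den = 0.0
    for i in range(-n, n + 1):
        offset = i * step_deg
        weight = math.exp(-0.5 * (offset / sigma) ** 2)
        num += weight * tb_model(elev_deg + offset, tau_zenith, t_mr)
        den += weight
    return num / den


# Hypothetical transparent channel, HPBW of 3.5 deg (doubled HPBW of 7 deg).
for elev in (30.0, 19.47):
    diff = tb_beam_averaged(elev, 0.05, 250.0, 3.5) - tb_model(elev, 0.05, 250.0)
    print(f"elevation {elev:5.2f} deg: beam-averaged minus nominal = {diff:+.3f} K")
```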

Pointing error
The tipping curve procedure is very sensitive to uncertainties in the radiometer elevation pointing. A mispointing of 1° can easily lead to a calibration error of several Kelvin (Han and Westwater, 2000). As mentioned before, the radiometer continuously performed elevation scans along the (70° N/250° N) azimuthal plane. The V-band channels at 51.26 and 52.28 GHz show almost constant T b differences between the two scanning directions that reach 2-3 K at four air masses. Model simulations show that a large part of these differences is systematic and can be explained by a tilt of the instrument by 0.2° (Fig. 8). Note that the simulations consider the beam width and air masses in a spherical and refracting atmosphere (Sect. 4.2.2). Han and Westwater (2000) point out that tilts can be balanced by averaging measurements at symmetric elevation angles prior to the tipping curve procedure. Nevertheless, the random effect from atmospheric inhomogeneities cannot be compensated by this method. The effect of inhomogeneities on T b increases with air mass as water vapor variations close to the surface are observed. This is reflected in the stronger variance of up to 0.7 K at four air masses when successive scans are compared (Fig. 8). This suggests that, depending on the measurement site and weather conditions, only measurements down to three air masses should be used for the tipping curve calibration. Furthermore, the considered air mass range may additionally be constrained by site-dependent obstacles.
For the RHUBC-II deployment, the detected instrument tilt is not compensated by averaging, because the degree of homogeneity differs significantly towards both scanning directions. For the K-band channels, T b measurements towards both azimuthal directions deviate from each other by up to 2 K at four air masses. The difference can be explained by drier air masses over the hillside of Cerro Toco (70° N) compared to the open plateau (250° N). Consequently, in the K-band the number of succeeding successful scans towards both directions is reduced significantly. Therefore, scans towards both directions are separately used after correcting the elevation angles by 0.2°. Finally, it is assumed that the correction results in a residual pointing uncertainty of 0.05°. This uncertainty has no effect on the K-band channel results, though it increases with channel opacity and results in a ±0.1 K uncertainty of calibrated T b in the V-band (Table 5).
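How an instrument tilt shows up in scans towards opposite directions, and why averaging symmetric angles largely removes it in a homogeneous atmosphere, can be illustrated with the sketch below. The single-layer, plane-parallel T b model, the zenith opacity, T mr and the function names are assumptions; beam width, curvature and refraction are ignored here.

```python
import math

T_BG = 2.73  # cosmic background in K (assumed, no Planck correction)


def tb_model(elev_deg, tau_zenith, t_mr):
    """Single-layer T_b with plane-parallel air mass (a simplification)."""
    tau = tau_zenith / math.sin(math.radians(elev_deg))
    return T_BG * math.exp(-tau) + t_mr * (1.0 - math.exp(-tau))


def tilt_signature(elev_deg, tilt_deg, tau_zenith, t_mr):
    """Return (T_b difference between both sides, residual of their mean)."""
    low_side = tb_model(elev_deg - tilt_deg, tau_zenith, t_mr)
    high_side = tb_model(elev_deg + tilt_deg, tau_zenith, t_mr)
    difference = low_side - high_side
    residual = 0.5 * (low_side + high_side) - tb_model(elev_deg, tau_zenith, t_mr)
    return difference, residual


# Hypothetical V-band-like opacity at about four air masses, tilt of 0.2 deg.
diff, resid = tilt_signature(14.48, 0.2, 0.1, 250.0)
print(f"side-to-side difference = {diff:.2f} K, residual after averaging = {resid:.3f} K")
```

The averaging only helps when both directions see the same atmosphere, which is the condition that is not met at the RHUBC-II site.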

Quality control and results
Five days after the original LN 2 calibration, the second operational day during RHUBC-II (16 August 2009) offered ideal clear sky conditions for analyzing continuous tipping curve calibrations. In addition to the four available radio soundings, detector voltages and T b data from HATPRO-G2 were taken. The quality of a tipping curve scan is estimated by two criteria, which determine if the atmosphere is horizontally stratified to the required degree. Two measures are used to determine the goodness of the linear fit between measured opacities and air masses: the linear correlation coefficient corr(τ, a) and χ 2 , which is sensitive to deviations of τ from the linear relation to a. We are aware that these statistical quantities are not well defined for the small number of opacity-air mass pairs used. Nevertheless, they can be used to index the quality of the fit. χ 2 is defined in a relative way. This study indicates that a relative threshold of χ 2 max = 1 × 10 −5 is most appropriate. Figure 9 illustrates the influence of different correlation thresholds on the tipping curve results. Naturally, raising the correlation threshold corr min reduces the number of successful scans while increasing the calibration quality. The quality threshold for χ 2 is a sensible supplement, because even scans with very high correlation coefficients fail. Generally, V-band channels show higher correlations than K-band channels, because the temperature field seems much more homogeneous than the atmospheric water vapor field. Different correlation thresholds have been tested and it is found that the manufacturer's default threshold (corr min = 0.9995) is adequate for all calibrated channels. Higher values are not beneficial, because the standard deviation of all tipping curves on 16 August 2009 is not changed significantly. The number of successful tipping curves depends on the channel frequency: while it is small for the three K-band channels next to the line center (49-69), there are 120 and 216 successful tipping curves for the V-band channels at 51.26 and 52.28 GHz, respectively (Table 4). Again, this indicates the high inhomogeneity of atmospheric water vapor, which is most pronounced close to the line center at 22.235 GHz. Figure 10 shows time series of zenith T b differences between successful tipping curve calibrations and the original LN 2 calibration for selected channels. In Table 4 the results are summarized for all calibrated channels. The assumed quality thresholds result in a good repeatability of the tipping curve calibration throughout the day. The standard deviation of T b over all scans is 0.1-0.2 K for the K-band channels. For the V-band channels at 51.26 and 52.28 GHz it is 0.5 and 0.3 K. It is assumed that this variability is induced by random processes such as atmospheric turbulence and radiometer noise. Therefore, it can be used to estimate the random uncertainty of tipping curve calibrations. Twice the standard deviation gives the maximum uncertainty range of measured T b from a single tipping curve calibration (Table 5). One possibility to eliminate the random variability is to take the daily mean of the tipping curve results. A similar approach is taken by Liljegren (1999), who introduces a rolling dataset of calibration parameters to the calibration procedure of ARM radiometers.
Table 5. Assessed total calibration uncertainties for RHUBC-II in Kelvin. For the LN 2 calibration, the total uncertainty results from uncertainties of the refractive index of the LN 2 surface (n LN 2 ), from resonances between the receiver and the LN 2 target (res), from uncertainties in the in-situ hot load measurement (hot), and from the detector non-linearity (α). The uncertainties are assessed for the mean T b values measured at each channel on 16 August 2009. For the tipping curve calibration the total uncertainty results from the derivation of the mean radiative temperature (T mr ), from the beam pointing (poi), and from atmospheric inhomogeneities (atm).
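The correlation criterion and the statistics derived from the accepted scans can be sketched as follows. Only the correlation threshold corr min = 0.9995 is taken from the text; the χ 2 criterion is left out because its definition is not reproduced here, and the function names and sample values are hypothetical.

```python
import math
import statistics

CORR_MIN = 0.9995  # manufacturer's default threshold quoted in the text


def pearson(xs, ys):
    """Linear correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def accept_scan(air_masses, taus, corr_min=CORR_MIN):
    """True if the opacity-air mass pairs are sufficiently linear."""
    return pearson(air_masses, taus) >= corr_min


def summarize(zenith_tbs):
    """Daily mean and twice the sample standard deviation of zenith T_b."""
    return statistics.fmean(zenith_tbs), 2.0 * statistics.stdev(zenith_tbs)


air = [1.0, 1.4, 2.0, 3.0]
print(accept_scan(air, [0.020, 0.028, 0.040, 0.060]))  # stratified scan
print(accept_scan(air, [0.020, 0.035, 0.038, 0.060]))  # disturbed scan
```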

Summary and discussion
Two different calibration techniques for a state-of-the-art ground-based microwave radiometer (MWR) have been assessed on the basis of RHUBC-II data. The most common error sources that need to be considered when specifying the accuracy of the brightness temperature measurements are discussed. Table 5 gives a summary of the different uncertainty contributions. Although the uncertainties discussed with respect to the LN 2 calibration are partially instrument-specific, we note that RPG-HATPROs account for a major part of existing ground-based MWRs, and thus this analysis is valid for a significant part of the worldwide available systems. The analysis of the tipping curve calibration is generally applicable to all ground-based MWRs.

Liquid nitrogen calibration
For the LN 2 calibration, a systematic error at 530 hPa was identified and corrected for by applying an improved formulation of the pressure-dependent boiling point of LN 2 . After this correction, the LN 2 calibration is mainly influenced by the temperature uncertainties of the cold and hot calibration targets. Here, the main contributions arise from the uncertainty of the reflective component that superposes the signal from the LN 2 target. The reflective component depends on the refractive index of the liquid nitrogen surface n LN 2 and the strength of the standing wave that builds up between the LN 2 surface and the receiver. This leads to T b uncertainties, because n LN 2 is not exactly known and the development of the standing wave depends on the level of evaporating LN 2 in the cold load target. Resonances primarily affect the K-band channels, because here the largest amplitudes are observed. Additionally, both effects are frequency-dependent: the impact on calibrated T b increases with decreasing T b . Consequently, at RHUBC-II, K-band channels, with measured T b below 10 K, are most affected. Approaching the hot calibration point, uncertainties in the temperature measurement of the internal ambient target gain influence. It is shown that a non-zero reflectivity of the internal target can influence the derived T b values. However, the effect is difficult to quantify because there is no information available on the uncertainty of the hot load reflectivity.
Figure caption (fragment): (Table 5), +: tipping curve results, red: estimated uncertainty range associated with the tipping curve calibration (Table 5), green: range of zenith T b differences from radiative transfer calculations with three different absorption models: Liebe'87 (Liebe, 1987), Liebe'93 (Liebe et al., 1993) and Rosenkranz'98 (Rosenkranz, 1998). V-band calculations for < 56 GHz have been convolved with the real band-pass filter shapes measured with a resolution of 10 MHz.
In general, for different channels the overall uncertainty is between ±0.3 and ±1.6 K (Table 5). Furthermore, the LN 2 calibrations benefit from considering the detector non-linearity parameter α within the 4-point calibration scheme. The 2-point calibration scheme, which neglects α, can result in a bias of close to 0.2 K in the scene T b of low-opacity channels.
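The influence of the refractive index on the cold calibration point can be illustrated with the Fresnel reflectivity at normal incidence. In the sketch below, the value n = 1.2 with a spread of ±0.03, the boiling point at standard pressure (which is higher than at 530 hPa) and the temperature of the reflected signal are assumed illustrative values, not the parameters of this study; standing waves are not modeled, and the function names are hypothetical.

```python
def fresnel_reflectivity(n):
    """Power reflectivity at normal incidence between air (n = 1) and a medium n."""
    return ((n - 1.0) / (n + 1.0)) ** 2


def effective_cold_temperature(t_boil, t_reflected, n):
    """Target emission plus the fraction of an external signal reflected at the surface."""
    r = fresnel_reflectivity(n)
    return (1.0 - r) * t_boil + r * t_reflected


T_BOIL = 77.36       # K, LN2 boiling point at standard pressure
T_REFLECTED = 300.0  # K, hypothetical temperature of the reflected signal

for n in (1.17, 1.20, 1.23):
    t_c = effective_cold_temperature(T_BOIL, T_REFLECTED, n)
    print(f"n = {n:.2f}: r = {fresnel_reflectivity(n):.4f}, T_C = {t_c:.2f} K")
```

The spread of T C across the assumed range of n indicates why an uncertain refractive index translates directly into an uncertainty of the cold calibration point.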

Tipping curve calibration
Tipping curve results are very sensitive to the exact radiometer pointing. Therefore, scans on both sides of the radiometer were used to diagnose the instrument's tilt. The pointing is improved by compensating the instrument's tilt with a residual uncertainty of about 0.05°. The residual effect is negligible in the K-band, and even in the V-band T b is only affected by 0.1 K. A further source of uncertainty is the derivation of the mean radiative temperature T mr . The uncertainty of T mr affects T b by up to ±0.3 K in the V-band. Finally, the uncertainty of successful tipping curve calibrations is dominated by atmospheric inhomogeneities that are not completely excluded by the quality thresholds (Sect. 4.2.5). The inhomogeneities lead to uncertainties up to ±0.2 K in the K-band and up to ±0.7 K in the V-band. The contribution from atmospheric inhomogeneities can be reduced by averaging the results of several tipping curve calibrations instead of following the current practice of using a single tipping curve for an extended measurement period.
Compared to Han and Westwater (2000), more realistic air mass and beam width corrections are applied, using observations from one to three air masses. In comparison to a plane-parallel non-refracting atmosphere, the air mass correction used here, which considers the Earth's curvature and refraction (Rüeger, 2002), raises calibrated zenith T b by 0.2 K at 51.26 GHz and 0.1 K at 52.28 GHz. Neglecting the antenna beam width would increase zenith T b measurements at RHUBC-II by up to 0.2 K.
In total, all effects add up to a zenith T b uncertainty up to ±0.2 K in the K-band and up to ±0.7 K in the V-band. It is important to mention that these estimations depend on the atmospheric conditions at RHUBC-II. The assessed uncertainties may deviate for locations with higher opacities.

Comparative assessment
After assessing the uncertainties of the two calibration procedures, both are compared on the basis of calibrated zenith T b on 16 August 2009. The average tipping curve results are compared to measurements based on the original LN 2 calibration five days before (Fig. 11). As shown in Sect. 4.1.5, the comparison is still possible, because the instrument drift over this period is negligible. Zenith T b from tipping curves are generally smaller than T b derived from the LN 2 calibration. Table 4 shows that for the chosen quality thresholds (corr min = 0.9995, χ 2 max = 1 × 10 −5 ) the difference ΔT b = T b (TIP) − T b (LN 2 ) reaches the maximum of −1.3 K. ΔT b does not seem to depend on the number of averaged tipping curves, because the repeatability of tipping curve results throughout the day is very good (< ±0.2 K, Table 4). For the two V-band channels the difference is −1.8 K at 51.26 GHz and −2.0 K at 52.28 GHz. Additionally, Fig. 11 shows the results of the improved boiling point correction (Eq. 15). This correction significantly reduces the bias between the two calibration techniques to −0.4 K at 51.26 GHz and −0.7 K at 52.28 GHz (cf. Figs. 10 and 11). Apart from the channel at 27.84 GHz, the tipping curve calibration in the K-band generally provides smaller T b results than the LN 2 calibration. Still, the total uncertainty ranges of both calibration methods overlap for the two V-band channels and for most K-band channels (Fig. 11) when the updated boiling point correction is included. Only at 22.24 GHz and 27.84 GHz are they slightly outside the overlap range (−0.2 and −0.4 K, respectively). Nevertheless, the three lowest K-band channels show a bias, which systematically increases towards the line center. This raises the question whether an atmospheric water vapor signal has contaminated the LN 2 calibration via a reflective component (Sect. 4.1.2). Such an external atmospheric signal would reduce the reflective component and therefore calibrated T b . An additional frequency-dependent effect results from the antenna's beam width, which increases with decreasing frequency. Therefore, the low-frequency channels could be influenced by an external signal that originates from objects at ambient temperature or from the atmosphere.
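The statement that two uncertainty ranges overlap amounts to a simple comparison of the T b difference with the sum of both uncertainties. The sketch below applies this check to the V-band differences quoted above, combined with the lower bounds of the uncertainty ranges given for the tipping curve calibration (±0.6 K) and the LN 2 calibration (±0.3 K). The pairing of these bounds with individual channels and the function names are assumptions for illustration; the channel-specific values are those of Table 5.

```python
def ranges_overlap(delta_tb, u_first, u_second):
    """True if two symmetric uncertainty ranges around both T_b values overlap."""
    return abs(delta_tb) <= u_first + u_second


def unexplained_discrepancy(delta_tb, u_first, u_second):
    """Part of |delta_tb| that is not covered by the combined uncertainties."""
    return max(0.0, abs(delta_tb) - (u_first + u_second))


# Differences after the boiling point correction (from the text), in K.
for freq_ghz, delta in ((51.26, -0.4), (52.28, -0.7)):
    ok = ranges_overlap(delta, 0.6, 0.3)
    print(f"{freq_ghz} GHz: delta T_b = {delta:+.1f} K, ranges overlap: {ok}")
```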
Finally, the question which technique gives results that are closer to the truth cannot be answered. However, radiative transfer calculations for the K-band channels and the results from the tipping curve calibration agree very well: zenith T b measurements that are calibrated by the tipping curve calibration only deviate by 0.3-0.8 K from simulated T b values from RHUBC-II radiosonde profiles for all K-band channels (Fig. 11). Therefore, we recommend using the tipping curve calibration for the K-band channels whenever possible (Table 5). This is especially true for low-opacity conditions, where the uncertainties of tipping curve results are small and the results show a very good repeatability (Fig. 10). In contrast, the LN 2 calibration suffers from large uncertainties at the cold calibration point. The influence of standing waves can be reduced by averaging calibration results. However, the T b uncertainty resulting from an uncertain refractive index of LN 2 remains.
For the two V-band channels the uncertainty of the tipping curve is much larger than in the K-band, because most effects that impact the uncertainties grow with channel opacity. In the V-band, both calibration techniques are in good agreement when the LN 2 calibration is used with the improved boiling point correction. Also simulations for 51.26 and 52.28 GHz agree well with the tipping curve calibration results, though for the other non-opaque V-band channels below 56 GHz, simulated T b are systematically smaller than T b values from the improved LN 2 calibration. The difference reaches almost 3 K at maximum. Still, T b uncertainties for non-opaque channels are smaller than for the LN 2 calibration, which suffers from resonances and an uncertain refractive index of LN 2 in the V-band, too. Of course, the LN 2 calibration results can be tuned to the tipping curve results by adapting n LN 2 . However, when doing this, each channel provides a significantly different n LN 2 value. Furthermore, this excludes an independent comparison of both calibration techniques.

Outlook
Summarizing, a larger number of detailed uncertainty assessments are needed to explain the differences between the LN 2 calibration and the tipping curve calibration. Upcoming measurement deployments should be accompanied by an accuracy analysis of the two calibration methods. Particularly, it would be valuable to repeat the sensitivity study for radiometer measurements under different atmospheric conditions, because RHUBC-II covers only the extreme case of dry and relatively warm conditions at high altitudes. Parts of the estimated calibration uncertainties might scale with T mr (1 − exp(−τ )) of the observed atmosphere, as suggested by one of the reviewers of this study. Other parts depend on the instrument type and the calibration procedure. At this point future deployments will help estimate how general the assessed calibration uncertainties are.
An important aspect concerning the LN 2 calibration is to verify its reproducibility by repeated calibrations. Additionally, frequent elevation scans help to improve the calibration accuracy by the tipping curve method. Using both methods simultaneously gives the opportunity to detect possible calibration problems.
Finally, when the calibration accuracy is properly assessed, the resulting T b measurements can be used to validate gas absorption models. In contrast to the K-band, where different absorption models show a good agreement, in the V-band T b calculations from different models deviate by up to 2 K. Any significant deviation from the measurement can then be assigned to the model spectroscopy, i.e., the absorption parameters. Here, RHUBC-II data give the opportunity to validate gas absorption models in a very dry middle-to-upper troposphere.