Enhancing the consistency of spaceborne and ground-based radar comparisons by using quality filters

Coinciding monsoon and typhoon seasons in the Philippines cause torrential rainfall, and associated hazards such as flooding and landslides. While early warning systems require accurate radar-based rainfall estimates, low-density rain gauge networks in the Philippines make it challenging to monitor the calibration of the ground-based radars (GRs). As an alternative, we explore the potential of spaceborne radar (SR) observations from the Ku-band precipitation radars on board the TRMM and GPM 5 satellites as a reference to quantify the calibration bias of an S-band GR in the Philippines. To this end, the 3D volume-matching algorithm proposed by Schwaller and Morris (2009) is implemented and applied to five years (2012-2016) of observations. We further extend the procedure by a framework to take into account the data quality of each ground radar bin. Through these methods, we are able to assign a quality index to each matching SR-GR volume, and thus compute the GR calibration bias as a quality-weighted average of reflectivity differences in any sample of matching GR-SR volumes. We exemplify the idea of 10 quality-weighted averaging by using the beam blockage fraction as a basis of a quality index. As a result, we can increase the consistency of SR and GR observations, and thus the precision of calibration bias estimates. The remaining scatter between GR and SR reflectivity, as well as the variability of bias estimates between overpass events indicate, however, that other error sources are not yet fully addressed. Still, our study provides a framework to introduce any other quality variables that are considered relevant in a specific context. The code that implements our analysis is based on the open source software library 15 wradlib, and is, together with the data, publicly available to monitor radar calibration, or to scrutinize long series of archived radar data back to December 1997, when TRMM became operational. Copyright statement.


Introduction
Weather radars are essential tools in providing high quality information about precipitation with high spatial and temporal resolution in three dimensions.However, several uncertainties deteriorate the accuracy of rainfall products, with calibration contributing the most amount (Houze Jr et al., 2004), while also varying in time (Wang and Wolff, 2009).While adjusting (Abon et al., 2016;Heistermann et al., 2013a).The approach, however, is limited to within ±35 degrees latitude for TRMM and ±65 degrees latitude for GPM.

Ground Radar
The Philippine Atmospheric, Geophysical, and Astronomical Services Administration (PAGASA) maintains a nationwide network of ten weather radars, eight of which are single-polarization S-band radars and two are dual-polarization C-band radars.Subic radar, which covers the greater Metropolitan Manila area, has the most extensive set of archived data.The radar coverage includes areas that receive some of the highest mean annual rainfall in the country.
The Subic radar sits on top of a hill at 532 m.a.s.l. in the municipality of Bataan, near the border to Zambales (location: 14.82 • N, 120.36 • E) (see Figure 1).To its south stands Mt.Natib (1253 m.a.s.l.) and to its north runs the Zambales Mountain Range (highest peak stands at 2037 m.a.s.l.).To the west is the Redondo Peninsula in the southern part of the Zambales province, where some mountains are also situated.Almost half of the coverage of the Subic radar is water, with Manila Bay to its southeast and the West Philippine Sea to the west.Technical specifications of the radar are summarized in Table 1.Data from April 2012 to December 2016 were obtained from PAGASA.Throughout the five years the scan strategy remained the same, except for 2015 when it was limited to only three elevation angles per volume due to hardware issues.The standard scanning strategy was re-implemented in 2016.

Space-Borne Precipitation Radar
Precipitation radar data were gathered from TRMM 2A23 and 2A25 version 7 products for overpass events intersecting with the Subic ground radar coverage between 1 June 2012 to 30 September 2014, and GPM 2AKu version 6 products from 1 June 2014 to 31 December 2016, covering the monsoon season of June to August which overlaps with the typhoon season of June to November.On that basis, we collected 225, 378, and 295 TRMM overpasses for 2012, 2013, and 2014, respectively. For GPM, there were 178, 224, and 211 for 2014, 2015, and 2016.The data were downloaded from NASA's Precipitation Processing System (PPS) through the STORM web interface (https://storm.pps.eosdis.nasa.gov/storm/).The parameters of TRMM/GPM extracted for the analysis are the same as Warren et al. (2017;their Table 3).

Partial Beam Blocking
In an ideal situation, SR and GR should have the same measurements for the same volume of the atmosphere, as they are measuring the same target.However, observational differences may arise due to different view geometries, different operating frequencies, different environmental conditions of each instrument, and different processes along the propagation path of the beam.As pointed out before, we focus on beam blockage as an index of GR data quality.In regions of complex topography, ground radars are typically affected by the effects of beam blockage, induced by the interaction of the beam with the terrain surface or other solid obstacles along the beam propagation path, resulting into a weakening or even loss of the signal.
To quantify that process within the Subic radar coverage, a beam blockage map is generated following the algorithm proposed by Bech et al. (2003).It assesses the extent of occultation using a digital elevation model (DEM).The DEM used in this study is 5 from the Shuttle Radar Topography Mission (SRTM) data, with 1-arc-second (approximately 30-meter) resolution.The DEM was resampled to the coordinates of the radar bin centroids, using spline interpolation, in order to match the polar resolution of the radar data (500 m in range and 1 • in azimuth, extending to a maximum range of 120 km from the radar site; see Figure 1).
A beam blockage map is generated for all available elevation angles.
The function wradlib.qual.beam_block_frac()which is an implementation of this algorithm was used to calcu-10 late the beam blockage fraction for each bin and each antenna pointing angle.The function wradlib.qual.cum_beam_block_frac(was used to compute the cumulative beam blockage along each ray.A cumulative beam blockage fraction (BBF) of 1.0 corresponds to full occlusion, and a value of 0.0 to perfect visibility..5, 2.4, 3.4, 4.3, 5.3, 6.2, 7.5, 8.7, 10, 12, 14, 16.7, 19.5 (0.0, 1.0, 2.0) The quality index based on beam blockage fraction is then computed as follows, following the equation from Zhang et al. (2011): With this index, all the values with near perfect visibility have the highest quality, while those below 50% visibility have zero quality. 5

SR-GR Volume Matching
For each SR gate, several meta-data parameters were extracted, such as the corresponding ray's bright band height and width, gate coordinates in three dimensions (longitude and latitude of each ray's Earth intercept and range gate index), time of overpass, precipitation type (stratiform, convective, or other), and rain indicators (rain certain or no-rain).The parallax-corrected altitude (above mean sea level) and horizontal location (with respect to the GR) of each gate were determined as outlined in 10 the appendix of Warren et al. (2017).From the bright band height/width and the altitude of each SR gate, the bright band membership of each gate was calculated by grouping all rays in an overpass and computing the mean brightband height and width, using the wradlib.qual.get_bb_ratiofunction.A ratio value of less than zero indicates that the gate is below bright band, greater than one indicates that the gate is above the bright band, and a value between zero and one means that the gate is within the bright band.The gates below and above the brightband were considered in the comparison.To ensure that there are sufficient bins with actual rain included in the comparison, overpasses with less than 100 gates flagged as rain certain were discarded.
For each SR overpass, the GR sweep with the scan time closest to the overpass time within a 5-min window was selected.
Both the SR and GR data were then geo-referenced into a common azimuthal equidistant projection centered on the location of the ground radar.
In order to minimize systematic differences in comparing the SR and GR reflectivites caused by the different measuring frequencies, the SR reflectivities are converted from Ku to S Band (using the function wrl.trafo.KuBandToS) following the formula from Cao et al. (2013): where the a i are the coefficients for dry snow and dry hail, rain, and in between at varying melting stages (Table 1 of Cao et al. (2013)).
The actual volume matching algorithm closely follows the work of Schwaller and Morris (2011), where SR reflectivity is spatially and temporally matched with GR reflectivity without interpolation.The general concept is highlighted by Figure 2: Each matching sample consists of bins from only one SR ray and one GR sweep.From the SR ray, those bins were selected that intersect with the vertical extent of a specific GR sweep at the SR ray location (as a result of GR beam width and distance from radar).From each GR sweep, those bins were selected that intersect with the horizontal footprint of the SR ray at the corre-  Minimum GR reflectivity considered 0 dBZ sponding altitude.The SR and GR reflectivity of each matched volume was computed as the average reflectivity of the intersecting SR and GR bins, where the averaging is done in linear units (mm 6 /m 3 ).The entire workflow of this volume matching procedure is based on wradlib functions and available at https://github.com/wradlib/radargpm-beamblockage.
The nominal minimum sensitivity of both TRMM PR and GPM KuPR is 18 dBZ, so only values above this level were considered in the calculation of average SR reflectivity in the matched volume.In addition, the fraction of SR gates within a matched volume above that threshold was also recorded.On the other hand, all GR bins are included in the calculation of average GR reflectivity, after setting the bins with reflectivities below 0.0 dBZ to 0.0 dBZ, as also suggested by Schwaller and Morris (2011).The filtering criteria applied in the workflow are summarized in Table 2.

Estimating GR calibration bias
Beam blockage and the corresponding GR quality maps were computed for each GR bin (cf.Section 2.2.)For each matched SR-GR volume, the data quality was then based on the minimum quality of the GR bins in that volume.
To analyze the effect of data quality on the estimation of GR calibration bias, we compared two estimation approaches: a simple mean bias that does not take into account beam blockage, and a weighted mean bias that considers the quality value of each sample as weights.The corresponding standard deviation and weighted standard deviation were calculated as well.The overall process is summarized in Figure 3.This way, we provide an overview of the variability of our bias estimates over time.
3 Results and Discussion

Beam Blockage Map
Figure 4 shows the beam blockage map for the two lowest elevation angles of each scanning strategy.Figure 4a and c   As expected, the degree of beam blockage decreases with increasing antenna elevation, yielding the most pronounced beam blockage at 0.0 • .Each blocked sector can be explained by the topography (see Figure 1), with the Zambales Mountain Range causing blockage in the northern sector, Mt.Natib in the southern sector, and the Redondo peninsula mountains in the western sector.The Sierra Madre mountains also cause some partial beam blocking at the far east, and a narrow partial blocking northeast of the station where Mount Arayat is located.

5
As the elevation angle increases, the beam blockage becomes less pronounced or even disappears.Substantial blockage persists, however, for the higher elevation angles in the northern and southern sectors.

SR-GR Matching
There were 183 TRMM and 103 GPM overpasses that overlapped with the 120 km Subic radar range from 2012 to 2016.
However these numbers decreased to 74 (for TRMM) and 40 (for GPM) after applying the selection criteria listed in Table 2.
In order to get a better idea about the overall workflow, we first exemplify the results for two specific overpass events -one for TRMM, and one for GPM.

Case 1: 08 November 2013
For the TRMM overpass event on November 8, 2013, the top row of Figure 5 shows SR (a) and GR (b) reflectivity as well as the resulting differences (c) for matching samples at an elevation angle of 0.5 • .Each circle in the plots represents a matched volume.A corresponding map of Q BBF is shown in (d) while (e) shows a scatter plot of GR versus SR reflectivities, with points coloured according to their Q BBF .The reflectivity difference map and scatter plot indicate significant variability with differences of up to -10 dB.Large differences can be observed at the edges of the southern sector affected by beam blockage (cf.also Figure 4).Major parts of that sector did not receive any signal due to total beam blockage.At the edges, however, partial beam blockage caused substantially lower GR reflectivity values.As expected, large negative differences of Z GR -Z SR are characterized by low quality.
Consequently, the estimate of the calibration bias substantially depends on the consideration of partial beam blockage (or quality).Ignoring quality (simple mean) yields a bias estimate of -1.9 dB while the quality-weighted average yields a bias estimate of -1.2 dB.Accordingly, the standard deviation is reduced from 3.4 to 2.6 dB, indicating an increased precision of the bias estimate.
This case demonstrates how partial beam blockage affects the estimation of GR calibration bias.It is important to note that beam blockage only becomes an issue once the reflectivity observed by the ground radar exceeds the detection limit.As a consequence, total beam blockage does not interfere with the bias estimation.
The effect becomes even more obvious for the next elevation angle.Figure 6 is equivalent to Figure 5, but for an elevation angle of 1.5 • : As the sector of total beam blockage shrinks at that elevation, the impact of partial beam blockage on the estimation of GR calibration bias increases.That might be considered counterintuitive, as one might expect the effect of beam blockage to disappear at higher elevations.For an antenna elevation of 1.5 • , however, ignoring beam blockage yields a bias estimate of -2.1 dB (simple mean), while the quality-weighted average yields a bias of -1.4 dB.At the same time, considering quality substantially reduces the standard deviation from 3.4 dB to 2.1 dB.

Case 2: 01 October 2015
The second case confirms the findings in the previous section for a GPM overpass on October 1, 2015.That overpass captured an event in the northern and eastern part of the radar coverage where partial beam blockage is dominant, as well as a small part of the southern sector with partial and total beam blockage.Figure 7 shows the results of that overpass in analogy to the previous figures, for an antenna elevation of 0.0 degree.The figure shows a dramatic impact of partial beam blockage, with a

Calibration bias over time
Finally, we applied both the simple and the quality-weighted mean bias estimation to each of the TRMM and GPM overpasses from 2012 to 2016 that met the criteria specified in Section 2.3 Table 2.As pointed out in Section 2.3, the matching procedure itself is carried out per GR sweep, ı.e.separately for each antenna elevation angle.
As a result, we obtain a time series of bias estimates for GR calibration, as shown in Figure 8.In this figure, the calibration vations.In the upper panel (a), each marker represents the quality-weighted mean bias for a specific SR overpass (circles for GPM, triangles for TRMM).The center panel (b) highlights the differences between the quality-weighted and the simple mean approach, ı.e. it quantifies the effect of taking into account GR data quality (ı.e.partial beam blockage).The bottom panel (c) shows the differences between the quality-weighted standard deviation and the simple standard deviation of differences, illustrating how taking into account GR quality affects the precision of the bias estimates.
The time series provide several important insights: (1) Short term variability: There is a strong variability of the estimated calibration bias between overpasses (Figure 8a).
That variability appears to have a strong random component, and it is clearly not a desirable property: typically, we would not expect changes in calibration bias to occur at the observed frequency and amplitude.Yet, it is beyond the scope of this study to disentangle the sources of this variability.We have to assume that the fluctuations are a cumulative result of various effects that might include e.g.hardware instability, fluctuations in the refractive index of the atmosphere, dynamic changes in the atmosphere in the time interval between SR overpass and GR sweep, non-meteorological echoes, residual errors in the geometric intersection of the volume samples, and other random error components, as noted in other studies (Anagnostou et al., 2001;Kim et al., 2014;Schwaller and Morris, 2011;Wang and Wolff, 2009;Warren et al., 2017).
(2) Change of bias over time: Despite the variability of bias estimates between the individual overpass events, the time series still provides us with a clear signal: The bias estimates appear to fluctuate around an average value that appears to be quite persistent over the duration of the corresponding wet seasons of the different years, so over intervals of several months.
Considering the average calibration bias over the different wet seasons (horizontal lines in Figure 8a), we can clearly observe changes in calibration bias over time.The bias was most pronounced in 2012 and 2013, with average bias estimates around -4.1 dB for 2012 and -2.5 dB for 2013.For 2014, the absolute calibration bias was much smaller, at a level of 1.4 dB while for 2015 and 2016, the situation improved further, with an average bias of 0.03 dB in 2015 and 0.6 dB in 2016.It is important to note that these values were computed as the average bias across all matched volumes and not as the average of biases across overpasses, as the overpasses have different sample sizes.We have to assume that a fundamental issue with regard to calibration maintenance was addressed between 2013 and 2014, although we could not confirm that with the radar operator.
(3) Effect of quality weighting on bias estimation: Figure 8b and c together illustrate the benefit of taking into account GR data quality (i.e.beam blockage) when we estimate GR calibration bias.It is important to understand that only the center and the bottom panel together prove that benefit: It does not come as a surprise that the quality-weighted mean bias is consistently higher than the simple mean bias (Figure 8b) since we give less weight to those GR bins affected by partial beam blockage.
But only Figure 8c shows that the quality weighted bias estimates are also consistently more precise: in the vast majority of overpasses the quality weighted standard deviation is substantially smaller than the simple standard deviation.That result is also consistent with the case study result shown above.However, when the matched sample size is already small, further filtering of samples increases the standard deviation.
(4) GPM and TRMM radars are consistent: In 2014, both TRMM and GPM overpasses are available.That period of overlap shows that the GR calibration bias estimates that are based on both TRMM and GPM observations can be considered as homogeneous.Using TRMM data, the average calibration bias for all 2014 overpasses amounts to 1.6 ± 1.3 dB, while using the GPM overpasses yields a bias of 1.8 ± 1.5 dB.That makes us confident that the substantial year-to-year changes of our bias estimates are based on changes in GR calibration, not based on differences between TRMM and GPM sensors.

Conclusions
In 2011, Schwaller and Morris presented a new technique to match spaceborne radar (SR) and ground-based radar (GR) reflectivity observations, with the aim to determine the GR calibration bias.Our study extends that technique by a coherent approach to take into account the quality of the ground radar observations: Each GR bin was assigned a quality index between 0 and 1, which was again used to assign a quality value to each matched volume of SR and GR observations.For any sample of matched volumes (e.g.all matched volumes of one overpass, or a combination of multiple overpasses), the calibration bias can then be computed as a quality-weighted average of the differences between GR and SR reflectivity in all samples.
We exemplified that approach by inferring a GR data quality index based on the beam blockage fraction, and we demonstrated the added value for both TRMM and GPM overpasses over the 120 km range of the Subic S-band radar in the Philippines for a five year period.
Although the variability of the calibration bias estimates between overpasses is high, we could show that taking into account partial beam blockage leads to more consistent and more precise estimates of GR calibration bias.Analyzing five years of archived data from the Subic S-band radar (2012-2016), we could also demonstrate that the calibration standard of the Subic radar substantially improved over the years, from bias levels around -4.1 dB to bias levels around 0.6 dB.
Case studies for specific overpass events also showed that the necessity to account for partial beam blockage might even increase for higher antenna elevations.That applies when sectors with total beam blockage (in which no valid matched volumes are retrieved at all) turn into sectors with partial beam blockage at higher elevation angles.
Considering the scatter between SR and GR reflectivity in the matched volumes of one overpass (see case studies), as well as the variability of bias estimates between satellite overpasses (see time series), it is obvious that we do not yet account for various sources of systematic and random sources of observational errors.Also the simulation of beam blockage itself might still be prone to various errors (e.g. from geometric inaccuracies, variability of atmospheric refractivity, interpolation effects, errors in the DEM, or limited DEM resolution).Nevertheless, the idea of the quality-weighted estimation of calibration bias presents a consistent framework that allows for the integration of any quality variables that are considered important in a specific environment or setting.For C-band radars, e.g.considering path-integrated attenuation would be vital.The framework could also be extended by explicitly assigning a quality index to SR observations, too.In the context of this study, that was implicitly implemented by filtering the SR data e.g. based on bright band membership.
Despite the fact that there is still ample room for improvement, our tool that combines SR-GR volume matching and quality-weighted bias estimation is readily available for application or further scrutiny.In fact, our analysis is the first of its kind that is entirely based on open source software, and thus fully transparent, reproducible and adjustable (see also Heistermann et al. (2015)).Therefore this study, for the first time, demonstrates the utilization of wradlib functions that have just recently been implemented to support the volume matching procedure and the simulation of partial beam block-

Figure 1 .
Figure 1.(a) A map of the Philippines showing the region of study and (b) the 120km coverage of the Subic radar (location marked with red diamond) with the SRTM Digital Elevation Model of the surrounding area.

Figure 2 .
Figure 2. Diagram illustrating the geometric intersection.Left panel shows a single SR beam intersection ground radar beams of two different elevation angles.From Schwaller and Morris (2011) ©American Meteorological Society.Used with permission.
Figure4shows the beam blockage map for the two lowest elevation angles of each scanning strategy.Figure4a and care for 0.0 • and 1.0 • , which are the two lowest elevation angles in 2015, while Figure 4b and d are for 0.5 • and 1.5 • , which are the two lowest elevation angles for the rest of the dataset.

Figure 3 .
Figure 3. Flowchart describing the processing steps to calculate the mean bias and the weighted mean bias between ground radar data and satellite radar data.The results of each step are shown in Section 3 Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-101Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 10 April 2018 c Author(s) 2018.CC BY 4.0 License.

Figure 5 .
Figure 5. PPI of volume-matched samples from 8 November 2013 at 0.5 • elevation angle of (a) SR reflectivity,(b) GR Reflectivity, (c) difference between GR and SR reflectivities, and (d) QBBF .(e) Scatter plot of ZGR vs ZP R where each point is colored based on the data quality (QBBF ).The solid line in (a)-(d) is the edge of the SR swath, the other edge lies outside the figure.The dotted line denotes the central axis of the swath.The solid circles demarcate the 15 km and 115 km ranges from the radar.

Figure 6 .
Figure 6.Same as in Figure 5 but for 1.5 • elevation angle

Figure 7 .
Figure 7.As in Figure 5 but for the overpass 01 October 2015

Figure 8 .
Figure 8.(a) Time series of the weighted mean bias (∆Z * ) from 2012 to 2016.Triangle markers represent comparison with TRMM overpasses while circle markers are comparisons with GPM overpasses.Symbols are colored according to the number of volume-matched sample pairs on a logarithmic scale: light gray = 10-99, medium gray = 100-999, and black = 1000+.Solid and dashed horizontal lines represent the weighted average of all individual matched points within the year for TRMM and GPM, respectively.(b) The difference between the weighted mean biases (∆Z * ) and the simple mean biases (∆Z).(c) The standard deviation of the weighted mean bias values minus the standard deviation of the simple mean bias values.
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-101Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 10 April 2018 c Author(s) 2018.CC BY 4.0 License.age.We also make available the complete workflow that put together the different steps of the processing chain, together with the underlying ground and space-borne radar data.Both code and results can be accessed at the following repository https://github.com/wradlib/radargpm-beamblockageupon the publication of this manuscript.Through these open-source resources, our methodology provides both research institutions and weather services with a valuable tool that can be applied to monitor the radar calibration, and -perhaps more importantly -to quantify the calibration bias for long time series of archived radar observations, basically beginning with the availability of TRMM radar observations in December 1997.Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-101Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 10 April 2018 c Author(s) 2018.CC BY 4.0 License.

Table 1 .
Characteristics of the Subic radar and its volume scan strategy.The numbers in parentheses correspond to scans in 2015, where the scanning strategy was different due to hardware issues.

Table 2 .
Filtering criteria for the matching workflow.