Retrieval algorithm for rainfall mapping from microwave links in a cellular communication network

Microwave links in commercial cellular communication networks hold a promise for areal rainfall monitoring and could complement rainfall estimates from ground-based weather radars, rain gauges, and satellites. It has been shown that country-wide (≈ 35 500 km2) 15 min rainfall maps can be derived from the signal attenuations of approximately 2400 microwave links in such a network. Here we give a detailed description of the employed rainfall retrieval algorithm. Moreover, the documented, modular, and user-friendly code (a package in the scripting language “R”) is made available, including a 2-day data set of approximately 2600 commercial microwave links from the Netherlands. The purpose of this paper is to promote rainfall mapping utilising microwave links from cellular communication networks as an alternative or complementary means for continental-scale rainfall monitoring.


Introduction
Accurate rainfall observations with high spatial and temporal resolution are needed for hydrological applications, agriculture, meteorology, weather forecasting, and climate monitoring.However, there is a lack of accurate rainfall information for the majority of the land surface of the earth, notably from ground-based weather radars (Heistermann et al., 2013).Moreover, the number of reporting rain gauges is dramatically declining in Europe, South America, and Africa.Lorenz and Kunstmann (2012) report a decline of approximately 50 % in the period 1989-2006 for GPCC, version 5.0.Satellites are often the only source of rainfall informa-tion.Despite their increasing coverage and spatio-temporal resolution, measurement errors and sampling uncertainties limit the stand-alone applicability of satellite rainfall products (e.g.Sorooshian et al., 2000;Joyce et al., 2004;Roebeling and Holleman, 2009;Kidd and Huffman, 2011;Hou et al., 2014).This calls for alternative and complementary sources of rainfall information.Since 2006 various studies have shown that microwave links from operational cellular communication networks may be used for rainfall monitoring for various networks and climates (e.g.Messer et al., 2006;Leijnse et al., 2007a;Zinevich et al., 2009;Overeem et al., 2011Overeem et al., , 2013;;Chwala et al., 2012;Rayitsfeld et al., 2012;Bianchi et al., 2013;Doumounia et al., 2014).The ability to observe other types of precipitation, such as snow, is limited however.
A link is defined as a radio connection from one telephone tower to another telephone tower, whereas a link path describes the path between two telephone towers.Many links are full duplex, i.e. those links measure in two directions over the same link path, in which case we have (according to our definition) two links but only one link path.The basic principle of rainfall estimation using microwave links is as follows.Rainfall attenuates the electromagnetic signals transmitted from the directional antenna of one telephone tower to another (Fig. 1).The received power at one end of a microwave link as a function of time is stored operationally by communication companies to monitor the quality of their networks.From the decrease in received signal level with respect to the reference signal level, being representative of dry weather, the rainfall-induced path-integrated attenuation can be calculated.This can be converted to an average rainfall intensity over the path of a link.Rainfall estimates from networks of individual links could in turn potentially be employed to create near real-time rainfall maps.This is particularly interesting for (developing) countries where few surface rainfall observations are available.For instance, (Gosset et al., 2016), who report on a Rain Cell Africa workshop held in Burkina Faso in March 2015, clearly demonstrate the relevance and interest to accelerate the uptake of this new measurement technique on the African continent.Despite the sympathy from scientists and representatives of meteorological services and telecommunication companies, to date no userfriendly computer code for microwave link data processing and rainfall mapping has been made publicly available.That is the motivation of this paper.
Based on a 12-day data set Overeem et al. (2013) have shown that country-wide (≈ 35 500 km 2 ) 15 min rainfall maps can be derived from received signal powers of microwave links in a cellular communication network.Hence, further upscaling this novel source of rainfall information is the logical next step toward continental-scale rainfall monitoring.The underlying rainfall retrieval algorithm is briefly described in Overeem et al. (2013) and largely based on Overeem et al. (2011).The purpose of this paper is to provide a detailed description of the algorithm employed by Overeem et al. (2013) and the corresponding computer code, which is needed for a successful implementation by potential users.Moreover, sensitivity analyses are performed with respect to two threshold values of the wet-dry classification method and the outlier filter threshold value.Finally, the transferability of the code to other networks and climates is extensively discussed.
The code is freely provided as R package "RAINLINK" on GitHub 1 .It contains a working example to compute linkbased 15 min rainfall maps for the entire surface area of the Netherlands for 40 h from real microwave link data as used in Overeem et al. (2013).This is a working example using actual data from an extensive network of commercial microwave links, for the first time in the scientific literature, which will allow users to test their own algorithms and compare their results with ours.Note that link data are utilised in a stand-alone fashion to obtain rainfall maps; i.e. data from rain gauges, weather radars, or satellites are not combined with the link data.
The basic theory of rainfall estimation employing microwave links has already been described, e.g. in Messer et al. (2006), Leijnse et al. (2007a), andOvereem et al. (2011).The core of the rainfall retrieval algorithm consists of the (path-averaged) rainfall intensity R (mm h −1 ) being estimated from microwave (path-averaged) specific attenuation k (dB km −1 ) using a power-law R-k relation (Atlas and Ulbrich, 1977;Olsen et al., 1978): (1) Coefficients a (mm h −1 dB −b km b ) and exponents b (−) depend mainly on link frequency.The rainfall retrieval algorithm consists of the following steps: (1) preprocessing of link data; (2) wet-dry classification; (3) reference signal determination; (4) removal of outliers due to malfunctioning links; (5) correction of received signal powers; and (6) computation of mean path-averaged rainfall intensities.Below attention is given to the main retrieval issues associated with link-based rainfall estimation.
Received signal powers occasionally decrease during nonrainy periods, resulting in non-zero rainfall estimates, e.g.caused by reflection of the beam or dew formation on the antennas (see Upton et al. (2005) for an overview).A reliable classification of wet and dry periods is needed to prevent this rainfall overestimation.Different classification methods have been proposed, some of which can be applied to received powers or signal attenuations when they are sampled at very high frequencies (often for research purposes), e.g.every 6 or 30 s (Schleiss and Berne, 2010) or even at 20 Hz.For instance, Chwala et al. (2012) present a spectral time series analysis, and Wang et al. (2012) Markov switching models.Here the so-called "nearby link approach" is employed (Overeem et al., 2011(Overeem et al., , 2013, termed "link , termed "link approach" in these papers), which was derived for application to minimum received powers over a time interval (15 min in Overeem et al., 2011Overeem et al., , 2013)).A common operational sampling strategy for commercial microwave links is to obtain a minimum and maximum received power per 15 min interval (Messer et al., 2006;Overeem et al., 2011Overeem et al., , 2013)).Hence, methods designed for frequently sampled attenuation data cannot be applied to data from such operational networks.However, the nearby link approach cannot be applied if spatial link densities are too low.Messer and Sendik (2015) provide more information on different methods of wet-dry classification and reference signal determination.Sometimes powers are sampled every second (Doumounia et al., 2014), every minute (Rayitsfeld et al., 2012), or only once or twice every 15 min (Leijnse et al., 2007a).Other approaches to identify rainy and non-rainy spells make use of auxiliary sources but are not considered in this paper.For instance, radar data have been utilised in the "radar approach" (Overeem et al., 2011), and geostationary satellite data in the "satellite approach" (Van het Schip et al., 2016).An advantage of the nearby link approach is that it solely employs link data; i.e. it does not depend on auxiliary data.
Attenuation due to wet antennas gives rise to overestimation of rainfall and needs to be compensated for (Kharadly and Ross, 2001;Minda and Nakamura, 2005;Leijnse et al., 2007aLeijnse et al., , b, 2008;;Schleiss et al., 2013).The sampling strategy is also an important error source, e.g. one sample every 15 min leads to large deviations due to unresolved rainfall variability (Leijnse et al., 2008).
This paper is organised as follows.First a description of the required microwave link data is given.Next, the rainfall retrieval algorithm and the interpolation methodology are described.The results section illustrates the different steps of the rainfall retrieval algorithm including rainfall mapping.Also sensitivity analyses of parameters of the algorithm are provided, as well as a comparison between the performance of two interpolation methods.Next, the algorithm and its applicability to other networks and regions are discussed.Finally, conclusions are provided.

Microwave link data: characteristics and preparations
In order to compute path-averaged rainfall intensities, received signal powers were obtained from Nokia microwave links in one of the national cellular communication networks in the Netherlands, operated by T-Mobile NL.The minimum and maximum received powers over 15 min intervals were provided, based on 10 Hz sampling.The transmitted power was almost constant.Here the data have a resolution of 1 dB, and the majority of these Nokia links used vertically polarised signals.The data format required by the code is given in Appendix A. Data from the working example were obtained from 9 September, 08:00 UTC, to 11 September, 08:00 UTC (2011), to estimate rain (Overeem et al., 2016a).Figure 1 shows the locations of the links which can be used to estimate rain-fall (on average 2473 links and 1527 link paths over all 160 time intervals of 15 min).Data from another day are used to illustrate the rainfall retrieval algorithm, but these data are not released with this paper.In addition, a 12-day validation data set, which includes the data from the working example, is used for sensitivity analyses of parameters of the rainfall retrieval algorithm.This data set is from June, August, and September 2011.Overeem et al. (2013) and Rios Gaona et al. (2015) provide more information on the characteristics of microwave links from this 12-day data set.All link data are from an independent validation data set; i.e. they have not been used to calibrate the rainfall retrieval algorithm.

Gauge-adjusted radar rainfall depths
Overeem et al. ( 2013) use a gauge-adjusted radar data set with a spatial resolution of approximately 0.9 km2 and a temporal resolution of 5 min to calibrate the microwave link rainfall retrieval algorithm.Here this radar data set, from another period than the calibration period, is utilised to validate linkbased rainfall maps.More information on the derivation of this data set can be found in Overeem et al. (2009aOvereem et al. ( , b, 2011)).The data set is freely available at the climate4impact portal (Overeem et al., 2016b).

Methodology
This section describes the entire processing chain from received signal powers to rainfall maps.The code is provided via GitHub as the R 2 package called "RAINLINK" (version 1.11)3 , which is distributed under the terms of GNU General Public License version 3 or later.Table 1 gives an overview of the (sub)functions needed for rainfall retrieval and mapping.Table 2 provides an overview of variables used in (sub)functions in the rainfall retrieval algorithm.Table 3 shows the parameters used in the rainfall retrieval algorithm and their default values.First, preprocessing of link data is performed (Appendix B).

Classification of wet and dry periods: the nearby link approach
In order to define wet and dry periods, it is assumed that rain is correlated in space and hence that several links in a given area should experience a joint decrease in received signal level in the case of rain.A time interval is labelled as wet if at least half of the links in the vicinity (default radius is 15 km, but this can be modified to better match other time intervals) of the selected link experience such a decrease.A detailed description of this classification algorithm can be found in Appendix C.  b This subfunction can also be used as a function to extract (interpolated) rainfall depths from a data frame of (interpolated) rainfall values for supplied latitude and longitude.
Note that this processing step is optional.The user can also decide not to apply a wet-dry classification, which may be the only option in areas with low spatial link densities.

Determination of reference signal level
The performed classification of rainy and non-rainy time intervals serves two purposes: (1) it allows for determining an accurate reference signal level or base level, which needs to be representative of dry weather; (2) it prevents non-zero rainfall estimates during dry weather.
The reference signal level P ref is computed for each link and time interval separately from the minimum and maximum received signal powers (dBm), P min and P max respectively: 1. P = P min +P max 2 (in dBm) is computed for each time interval classified as dry in the previous 24 h (including the present time interval); 2. P ref is the median of P over all dry time intervals.If the number of dry time intervals represents less than 2.5 h over the previous 24 h, P ref , and hence the rainfall intensity, is not available and so not computed.
If no wet-dry classification has been applied, the reference level is determined over all time intervals in the previous 24 h.The periods of 2.5 and 24 h are the default values and can be modified.

Filter to remove outliers
Malfunctioning links can cause outliers in rainfall retrievals.These outliers can be removed by using a filter that is based on the assumption that rainfall is correlated in space.The filter discards a time interval of a link for which the cumulative difference between its specific attenuation and that of the surrounding links (i.e.within a default radius of 15 km) over the previous 24 h (default value; including the present time interval) becomes lower than the outlier filter thresh- old (dB km −1 h −1 ).This criterion is applied to specific attenuation derived from uncorrected minimum received power (Overeem et al., 2013).Imagine that the default value of 32.5 dB km −1 h −1 is uniformly distributed over all time intervals in a 24 h period.This implies a maximum specific attenuation of approximately 1.35 dB km −1 (32.5 dB km −1 h −1 divided by 24 h) per time interval.This corresponds to a daily rain accumulation of approximately 120 mm for a 38.9 GHz link and 750 mm for the least sensitive, 13 GHz, link.Hence, a time interval of a chosen link will only be discarded if the rainfall amounts during the previous 24 h period are substantial.It is therefore highly unlikely that this filter would discard real rain.
The value of the cumulative difference, F , is computed as follows: where t is the time interval, t = 0 being the present time interval for which F needs to be computed, and t is the time interval in hours (0.25 h in the working example).A link is not used to estimate rainfall if F < F t .Note that P S L , median( P L ), and F are computed in the nearby link approach and are based on the minimum received powers, the superscript S referring to the selected link for which rainfall is to be computed.Running the outlier filter is optional.

Correction of received powers
Subsequently, corrected minimum (P C min ) and maximum (P C max ) received powers are computed for each time interval.In case of no wet-dry classification the time interval of P min is always considered wet.c To some extent also on polarisation, temperature, drop shape, and canting angle distribution (this has not been taken into account).
Here values have been computed from one data set of measured drop size distributions (p.65 in Leijnse, 2007c).

Computation of path-averaged rainfall intensities
Here the path-averaged rainfall intensities are computed from the corrected minimum and maximum received signal powers.The minimum and maximum rain-induced attenuation are calculated for each link and time interval using (5) Next, the minimum and maximum path-averaged rainfall intensities are computed: with H the Heaviside function (if the argument of H is smaller than 0, H = 0; else H = 1).A a is meant to correct for attenuation due to wet antennas (dB) and assumed to be constant, e.g.independent of rain rate and frequency.The coefficients a (mm h −1 dB −b km b ) and b (−), provided in a file on GitHub, are valid for vertically polarised signals (Fig. 2), which will usually be employed for microwave links.Utilising these coefficients for horizontally polarised signals will generally only produce small errors in the retrieved rainfall estimates.The values recommended by the International Telecommunication Union (ITU, 2005), meant for computing specific attenuation for given rain rates and for worldwide application, are also plotted.Differences up to 10 % are found for the value of the exponent b from ITU (dashed and dotted lines) compared to that obtained from drop-size distribution data from the Netherlands (solid lines).
For the link frequencies employed in this study (between 12.8 and 40.0 GHz) the value of the exponent b is close to 1 (Fig. 2, right).(Berne and Uijlenhoet, 2007), (Leijnse et al., 2008(Leijnse et al., , 2010)), and (Overeem et al., 2011) show that this nearlinearity only leads to small errors in rainfall estimates.
The assumed temporal sampling strategy only provides a minimum and maximum received power over a given time interval.The goal is to obtain a reliable mean path-averaged rainfall intensity over this time interval.This is achieved by computing the mean path-averaged rainfall intensity as a weighted average: where α is a coefficient that determines the relative contributions of the minimum ( R min ) and maximum ( R max ) path-averaged rainfall intensity (mm h −1 ) during a time interval.The values for A a (2.3 dB) and α (0.33) have been taken from Overeem et al. (2013), who use a 12-day calibration data set from June and July 2011 (which has not been used for validation).They compare daily link-based rainfall depths with gauge-adjusted radar retrievals to calibrate the rainfall retrieval algorithm.

Rainfall maps
Path-averaged rainfall intensities from microwave links are spatially interpolated to obtain rainfall maps.The user can choose to apply ordinary kriging (OK) employing a spherical variogram model (Overeem et al., 2013;Rios Gaona et al., 2015) or to apply inverse distance weighted (IDW) interpolation on link rainfall data.OK and IDW are well suited for dealing with heterogeneously distributed data locations.OK requires a variogram model, but unfortunately it is impossible to robustly estimate such variograms for each time interval separately due to the sparsity of rainfall.Hence, a more robust procedure is followed.The parameter values can be supplied by the user or they can be computed as follows.The sill and range of an isotropic spherical variogram model have been expressed as a function of day of year (DOY) and duration (1-24 h) using a 30-year rain gauge data set from the Netherlands (Van de Beek et al., 2012).This data set does not overlap with the link data set.If needed the relationships can be extrapolated to time intervals shorter than 1 h.The nugget is set equal to 0.1 times the sill.Note that these equations and the optimal values of their coefficients have been found to be useful for the Dutch climate.Hence, they may need adjustment for other climatic settings.See Appendix D for a detailed description of the interpolation algorithm.Figure 3 shows an example of the interpolation procedure for a given 15 min time interval.The left panel shows the locations of the microwave links, where the colour denotes the rainfall depth.Next, these path-averaged rainfall depths are assigned to the middle of the link, i.e. considered as point measurements (centre panel).This is done to simplify the interpolation procedure.The right panel shows the corresponding interpolated rainfall map (OK).Interpolated rainfall maps can be visualised by using the functions provided in the package.This is described in more detail in Appendix E.

Illustration of steps in rainfall retrieval algorithm
Here the steps in the rainfall retrieval are illustrated.Starting points are the minimum and maximum received powers over a given time interval (15 min in this case), as shown in Fig. 4a for one link during 24 h.As expected, there is a strong negative correlation between the minimum received powers and the path-averaged gauge-adjusted radar rainfall intensities, considered to be the ground truth.Next, the corrected received powers and the reference level are shown (Fig. 4b).
In this particular example the received powers are hardly corrected.A decrease around 08:30 UTC, probably not related to rainfall, is corrected for.In contrast, a rain-induced decrease just before 17:00 UTC is removed.The mean pathaveraged link rainfall intensities (Fig. 4d) and the cumulative link rainfall depths (Fig. 4e) correspond well with the gaugeadjusted radar-based values.In total approximately 60 mm was observed, resulting from two convective events with tens of millimetres in a couple of hours.

Rainfall mapping
Link rainfall maps are compared to gauge-adjusted radar rainfall maps in Figs. 5 and 6 for 15 min and daily rainfall accumulations respectively.These are obtained using ordinary kriging with a climatological spherical variogram model.The left panels show the link-based rainfall maps with the locations of the employed links for the nearby link approach with outlier filter.The radar rainfall maps are given in the right panel.The cellular communication network is able to detect the rainfall patterns for the 15 min interval, although devi-ations are found with respect to radars combined with rain gauges.Note that some large areas do not have link data.The daily rainfall maps are from 10 September 2011, 08:00 UTC, to 11 September 2011, 08:00 UTC, and reveal link-based rainfall depths larger than 26.0 mm.These local high rainfall depths occurred in a few hours, i.e. convective rainfall.
In general, the link network is able to correctly determine the spatial rainfall patterns.These examples demonstrate a successful application of microwave links to estimate rainfall.Figure 7 illustrates the capability of the rainfall mapping functions to produce a (local) rainfall map of high graphical quality.

Sensitivity analyses
For all sensitivity analyses the same values for A a (2.3 dB) and α (0.33) are employed, i.e. as obtained for the nearby link approach with outlier filter using the default parameter values (Table 3).Validations are performed on 15 min pathaveraged link rainfall depths or link rainfall maps from the 12-day validation data set (Overeem et al., 2013).No threshold values regarding the minimum rainfall depths are applied in the comparisons.Metrics are computed for residuals, i.e. the link minus the gauge-adjusted radar rainfall depths.The sensitivity analyses are based on 12 rainy days from the summer.Such analyses may yield different results for other seasons and rainfall types.

Nearby link approach wet-dry classification
A sensitivity analysis is performed for the two threshold values of the wet-dry classification, where median( P ) is varied from −5 to 0 dB with a step size of 1 dB (because the resolution of the employed data set is 1 dB), and median( P L ) is varied from −2 to 0 dB km −1 , with a step size of 0.1 dB km −1 .Although these threshold values could   potentially become positive, this will hardly happen or only small positive values will be obtained.If both threshold values are 0 dB (km −1 ) this can therefore be considered representative for the situation without wet-dry classification.
The 2-D contour plots in Fig. 8 display relative mean error (%; black lines) and coefficient of variation (CV) or squared correlation coefficient (ρ 2 ) (colours) in pathaveraged 15 min rainfall depths as a function of median( P ) and median( P L ).The default threshold values are indicated by the black dot.Given the resolution of 1 dB, the default value for median( P ), −1.4 dB, implies that median( P ) should be lower than −1 dB (hence the black dot is plotted at 1 dB).In order to focus on the performance of the wet-dry classification with respect to the situation without wet-dry classification, the outlier filter has not been applied here.A wet-dry classification generally leads to a clear improvement in terms of ρ 2 and CV upon a situation without wet-dry classification.
The contour plots show that application of median( P L ) is not necessary and that the best results are obtained for median( P ) values around −4 dB, i.e. much lower than the default value.Note that the default values of A a and α have been used, where an outlier filter was applied.This may be the reason for the variability in relative mean errors.Finding optimal values of A a and α for every combination of median( P ) and median( P L ) was considered too computationally demanding and could have compensated for errors in wet-dry classification for each combination of both threshold values.
The nearby link approach with outlier filter and default settings for the parameter values (Table 3), gives a CV of the residuals of 3.84, a squared correlation coefficient ρ 2 of 0.54, and a relative bias in the mean of 10.5 % compared to the reference: gauge-adjusted, path-averaged radar rainfall depths.Removing step 8 (Appendix C) from the nearby link approach does hardly change these validation results.There- fore, this step can now also be discarded by supplying a function argument.In this step the previous two time intervals and the next time interval are classified as wet if the present time interval is rainy and has more than 2 dB of attenuation for the link under consideration.

Outlier filter
Performance of link rainfall estimates clearly deteriorates if the outlier filter is not applied.The CV of the residuals is much larger, 6.03, the ρ 2 is much lower, 0.35, and the relative bias in the mean becomes much larger, 24.2 % (compare to previous paragraph).The latter may be related to the fact that A a and α have been calibrated on a data set where the outlier filter has been applied.Now the performance for a wide range of threshold values of the outlier filter is investigated for path-averaged 15 min rainfall depths.For the default threshold value of −32.5 dB km −1 h −1 , denoted by the grey line, good results are obtained in terms of ρ 2 and CV (Fig. 9).Results improve for increasing threshold values at the expense of a severe decline in the amount of link data.The best choice for a threshold value is somewhat arbitrary, i.e. a range of values gives good results, while still having many link data.The relative mean error decreases from more than +20 % to less than −15 % from F t = −300 to 0 dB km −1 h −1 .If for each value of F t the coefficients A a and α would have been calibrated, the relative mean error would be expected to be more constant.Apparent are the jumps in ρ 2 and CV for certain values of F t .It seems that these are related to some very large "rainfall depths", which point to malfunctioning links.Note that F is the cumulative difference between a link's specific attenuation and that of the surrounding links.Hence, such jumps for strongly negative values of F t are likely not caused by rain or (sources of) errors affecting a whole region.

Performance of ordinary kriging versus inverse distance weighted interpolation
Here the 12-day data set of 15 min path-averaged link rainfall depths is employed to obtain interpolated 15 min rainwww.atmos-meas-tech.net/9/2425/2016/fall maps (0.9 km 2 ), which are aggregated to daily rainfall maps (0.9 km 2 ).The nearby link approach with outlier filter and default parameter values has been applied.These maps are computed for two interpolation methodologies: (1) OK with a spherical climatological variogram model, i.e. the default method; (2) IDW interpolation with an inverse distance weighting power of 2.0.First, the 15 min rainfall maps are studied.The following metric values are obtained for IDW: relative mean error of 8.3 %, CV of 3.30, and a ρ 2 of 0.41.OK performs better, since the relative mean error is clearly lower (1.5 %), CV is similar (3.29), and ρ 2 is clearly higher (0.48).Despite the high spatial resolution of the obtained rainfall maps (0.9 km 2 ) compared to the link density, still reasonable results are obtained in terms of ρ 2 , but values of CV are high.
Next, the daily rainfall maps are investigated.The following metric values are obtained for IDW: relative mean error of 8.2 %, CV of 0.51, and a ρ 2 of 0.70.OK performs somewhat better, since the relative mean error is clearly lower (1.4 %), CV is slightly larger (0.54), and ρ 2 is somewhat higher (0.73).Hence, the improvement of OK with respect to IDW is particularly evident at the 15 min scale.Note that the results for the daily timescale are much better than those for the 15 min timescale.The performance of IDW could become better if another value of the inverse distance weighting power would be selected.

Computation time
Using one processor (Intel i7; Linux operating system) the entire processing up to and including the interpolation takes around 10 min, i.e. to obtain 15 min link-based rainfall maps for 40 h based on data from on average 2381 links and 2473 links in total (representing 1527 unique link paths).Not applying a wet-dry classification leads to a decrease in computation time of 1 min.Most time is consumed by the OK interpolation (8 min).The IDW interpolation only takes 2 min (for an inverse distance weighting power of 2.0).A similar amount of time would be needed in case multiple processors would be used to run multiple periods, each processor running its own period.In order to reduce computational time it could also be worthwhile to divide a region into subregions, particularly to speed up the wet-dry classification.Another option would be to transfer the algorithm to a high-level programming language.

Transferability of code
The developed rainfall retrieval algorithm or its parameter values will likely need adaptation for (optimal) rainfall estimation for other networks and climates around the world.Nevertheless, many networks have similar characteristics as those in the Netherlands.A 15 min sampling strategy, either instantaneous or minimum and maximum values, is common and has been used in several other networks (e.g.Messer et al., 2006;Leijnse et al., 2007a;Messer and Sendik, 2015).
We advocate a pragmatic approach to first apply this algorithm to data from those networks with minimum and maximum received signal levels and assess the quality of the derived rainfall maps.This is certainly relevant for areas (e.g.developing countries) where few reference rainfall data are available to calibrate the rainfall retrieval algorithm.Suboptimal parameter values or interpolation methods may still provide meaningful rainfall estimates for other networks or climates.
As a next step the parameters of the algorithm could be adapted to local conditions, for instance based on recommendations from the International Telecommunication Union (ITU, 2005).Then it could be decided whether further modifications of the algorithm would be needed or not.For poorly gauged regions we suggest optimising the algorithm, including interpolation methodology, employing data from a region with a similar climate and network, for which sufficient ground-truth rainfall data are available.
Coefficients and thresholds have been optimised for 15 min intervals; hence their values may not be appropriate for other time intervals.Their values can be easily changed by modifying function arguments.
Even if network characteristics are very different, e.g. in terms of sampling strategy, large parts of our code could still be used to develop algorithms suited for the specific needs of these networks.Although the rainfall retrieval algorithm contains several empirical parameters, the developed methods are not merely statistical in nature.Their general principles hold for other networks as well.For instance, the principle of the wet-dry classification, where data from surrounding links are used to distinguish between wet and dry periods, makes use of the general fact that rainfall is correlated in space, although decorrelation distances can vary between different climates.
In case of very different sampling strategies, e.g.mean or instantaneous received signal powers for shorter or longer time intervals, the code should be modified.Although many networks have constant transmitted powers (Messer et al., 2006;Leijnse et al., 2007a;Chwala et al., 2012;Rayitsfeld et al., 2012;Bianchi et al., 2013), other networks may not operate with constant transmitted powers.For links using Automatic Transmit Power Control (ATPC) the transmitted power can become higher in case of a reduced received power at the end of the link, to compensate for large losses along the link path.In this case the transmitted power also needs to be known and should be taken into account in the rainfall retrieval algorithm (Schleiss and Berne, 2010;Doumounia et al., 2014).Note that a sampling strategy where minimum and maximum transmitted and received powers are available over a chosen time interval may result in additional errors in case of ATPC.This is because the timing of minimum received and transmitted powers does not necessarily coincide, which also holds for the maximum received and transmitted powers.
Ideally, the employed values for a and b should match the polarisation of the link.If information on the latter is available, the code could be easily adapted to incorporate this.

Parameter values
Table 3 gives an overview of the parameters of the rainfall retrieval algorithm, their default values, and the factors influencing them.This can help to assess which parameters will change for other regions and networks.Some parameters are already modelled as function of frequency, others do not seem to be sensitive to frequency, whereas again others should ideally be optimised for different frequencies.

Wet-dry classification with nearby link approach
Some of the signal fluctuations that occur in dry weather may also occur during rainy periods.The algorithm does not correct for such errors.Furthermore, nearby links may suffer simultaneously from signal fluctuations not caused by rainfall, resulting in a poor performance of the nearby link approach.
The parameters concerning the wet-dry classification have been optimised using data from another period and are based on received signal level data stored at 0.1 dB resolution (Overeem et al., 2011).These values have been applied in Overeem et al. (2013) and in the current manuscript to an independent data set from another brand of links with slightly different antenna covers and a coarser 1 dB power resolution.The sensitivity analysis shows that the parameter values from Overeem et al. (2011) are suboptimal but give a clear improvement in link-based rainfall estimates compared to the case without wet-dry classification.
The chosen radius depends on the spatial correlation of rainfall and is, hence, not frequency dependent.Because networks are designed in such a way that P will be similar for different microwave frequencies, lower frequencies are utilised for longer link paths and vice versa.Lower frequencies experience less rainfall-induced specific attenuation compared to higher frequencies.Hence, median ( P ) is nearly independent of frequency.
Ideally, a sensitivity study should be carried out to find optimal threshold values for the wet-dry classification in case of other networks and climates.This will be computationally expensive in case of large data sets.
It is to be expected that the nearby link approach will also work for other temporal resolutions than 15 min.This may require a smaller radius and a higher link density for higher temporal resolutions and may allow a larger radius and lower link density for lower temporal resolutions.For data sampled at very high frequencies, i.e. 1 s, the nearby link approach may become too computationally expensive.This could be circumvented by first averaging signal attenuations over longer durations before applying this approach.

Reference level determination
Note that a default 24 h period is considered for determining the reference level and for one step in the nearby link approach.Such a relatively long period is chosen, among other reasons, to increase the probability to obtain at least 2.5 h of dry periods, which helps to determine the reference level more accurately.Note that both periods can be modified.

Outlier filter
The filter to remove outliers deals with specific attenuation; i.e. it does not explicitly take into account frequency.It is also suitable for other time interval lengths.Nevertheless, its threshold value is very high, which makes it unlikely that actual rainfall is filtered out accidentally, irrespective of the employed frequency.The outliers are likely caused by malfunctioning links.Perhaps melting precipitation also plays a role.Hence, it makes sense to apply a frequency-independent threshold value.Note that the threshold value may need to be optimised for other networks and climates.The sensitivity analysis on a 12-day data set shows that the chosen default value of F t is suitable but a range of values shows a similar performance.

Wet antenna attenuation and sampling strategy
The wet antenna attenuation correction A a will not only compensate for wet antennas.The value of A a found in the calibration will also be influenced by other errors.Moreover, in case of rain along the link path no, one, or both antennas can be wet, whereas this correction is always applied, and A a itself has also been optimised on data where the number of wet antennas can vary.In addition, the correction does not depend on rainfall intensity.Hence, applying A a should be seen as a pragmatic approach towards correcting for wet antennas.
The coefficient α is employed to obtain mean pathaveraged rainfall intensities from minimum and maximum path-averaged rainfall intensities and is expected to depend on the time variability of rainfall.Note that the value of α presented here is based on data available at 15 min time intervals.It is expected that this value will be different for disparate time intervals.Messer et al. (2006) use the known distribution of rainfall intensities at the point scale to weigh minimum and maximum received signal levels.
The optimised values of A a and α are representative of summer months in the Netherlands and appear to be relatively close to each other for different frequency classes.Application to data from other months will generally only lead to a small decrease in performance (not shown).Hence, application of the existing values of A a and α to other rainfall types can still give reasonably good rainfall estimates.Nevertheless, it is advised to recalibrate A a and α in case of other networks or climates.This may be achieved by comparing link-based rainfall estimates with high-quality (gauge-adjusted) weather radar data, which may provide full coverage over the network, as in Overeem et al. (2013).Alternatively, path-averaged rainfall intensities may be estimated from rain gauge data.This may require a dedicated research experiment (Minda and Nakamura, 2005;Wang et al., 2012), preferably with several rain gauges along the link path.Overeem et al. (2011) utilise a research link with high temporal resolution for testing the proposed rainfall retrieval algorithm.By applying the rainfall retrieval algorithm, mean path-averaged rainfall intensities can be derived.These can be compared to the corresponding true values obtained from the received signal powers sampled at 10 Hz.Hence, the same instrument, a microwave link, is used to assess the ability of the retrieval algorithm to deal with the sampling strategy.Since the same instrument is used representativeness errors are not present.The probability distribution of minimum and maximum rainfall intensities could be studied in order to obtain more reliable mean rainfall intensities.Moreover, such an experiment may also help to assess attenuation due to wet antennas caused by dew or rain, its dependence on rainfall intensity or antenna cover type, and the time it takes for antennas to dry following a rain event.Preferably a rain gauge or disdrometer should be available near the antenna.A research link is particularly useful when the microwave frequency, path length, and antenna cover are representative of the links in a cellular communication network.Finally, see, for instance, Leijnse et al. (2007bLeijnse et al. ( , 2008) ) and Schleiss et al. (2013) for other wet antenna attenuation correction methods.

Interpolation methodology
The employed interpolation methodology, ordinary kriging, may not be specifically suited for other regions with different rainfall climatologies.The methodology developed in Van de Beek et al. (2012) could be optimised for other climates.This requires long rainfall time series.The assumed stationarity and isotropy will often be violated (Schuurmans et al., 2007).However, violation of assumptions does not automatically imply that the interpolation method is not useful.Further, the path-averaged link rainfall intensities are assumed to be point measurements.Hence, it is recommended to improve the interpolation methodology, e.g. by treating the rainfall values as line observations instead of as point observations, which is expected to have the largest impact at local scales for areas with high link densities or at areas with long links.Using data from the same 12 days as employed in Overeem et al. (2013), Rios Gaona et al. (2015) find that link rainfall retrieval errors themselves are the source of error that contributes most to the overall uncertainty in rainfall maps from a commercial link network.Errors due to mapping, i.e. interpolation methodology and link density, play a minor, albeit non-negligible role for the same network as utilised in this study.Hence, despite the limitations of the interpolation methodology its usefulness has been confirmed (Overeem et al., 2013).Further, Rios Gaona et al. (2015) study the performance of simulated link rainfall maps as a function of link density.They show that even for low spatial link densities reasonable results can still be obtained.
It may be interesting to apply a tomographic approach in order to obtain link-based rainfall maps (Zinevich et al., 2008;Cuccoli et al., 2013).Such an approach can potentially reconstruct the two-dimensional distribution of rainfall from a set of one-dimensional transmission data from many different (nearby or even intersecting) paths (Giuli et al., 1991).Hence, it would take the line character of the attenuation measurements into account, in contrast to the current approach where a line measurement is assigned to a point at the middle of the link path.
The estimated sill and range will become less accurate by extrapolating to time intervals shorter than 1 h, such as 15 min.Villarini et al. (2008) use rain gauge data from England to quantify spatial correlation for short time intervals, e.g. 1 and 15 min.These kind of studies can be useful to improve the interpolation of link-based rainfall intensities.

R-k relationship
The relationship between path-averaged rainfall intensity and path-averaged specific attenuation is commonly employed in other studies.The provided parameter values of a and b are available for frequencies ranging from 1 to 100 GHz.The value of the exponent b is close to 1 for the frequencies employed in this study, which range from 12.8 to 40.0 GHz.Frequencies between 37.0 and 40.0 GHz are denoted by the grey-shaded area in Fig. 2, which contains 81 % of the links from the working example, having values of b very close to 1 (right).For other frequencies the corresponding value of b often deviates more from 1 (Fig. 2, right).High rainfall variability along the link path will lead to overestimation for b < 1 and underestimation for b > 1 (Leijnse et al., 2008;Uijlenhoet et al., 2011).
Although exponents are often not exactly equal to 1, they are much closer to 1 than the ones typically used in radar reflectivity-rain rate relations.For example, (Overeem et al., 2011) assessed the influence of spatial variability on the link path for these two frequencies.They show that the underor overestimation will generally be small.In the tropics this problem will be more pronounced because of the high spatial rainfall variability.Particularly for long links operating at low frequencies (e.g. 7 GHz), this may lead to overestimation.Also note that (Doumounia et al., 2014) obtain quite accurate rainfall estimates using a 7 GHz microwave link with a length of 29 km in Burkina Faso, having a tropical climate.Previous work has demonstrated that this error is limited for temperate climates such as experienced in the Netherlands (Leijnse et al., 2008(Leijnse et al., , 2010)).Moreover, 82 % of the links in our network have a length shorter than 5 km, even 97 % of the links are shorter than 10 km.The average link length is only around 3 km.

Conclusions
It has been shown in several studies that microwave links from cellular communication networks can be used to retrieve rainfall information.For instance, country-wide (≈ 35 500 km 2 ) 15 min rainfall maps can be obtained from received signal powers of microwave links (Overeem et al., 2013).In this paper a detailed description is given of the algorithm of Overeem et al. (2013).The accompanying code is made publicly available as the R package RAINLINK via GitHub under the condition of version 3 or later of the GNU General Public License.The modular programming facilitates users to adapt the code to their specific network and climate conditions or to only employ one or some functions of RAINLINK.We hope that RAINLINK will promote the application of rainfall monitoring using microwave links in poorly gauged regions around the world.
We invite researchers to contribute to RAINLINK to make the code more generally applicable to data from different networks and climates.Ideally, the code should be tested on data sets containing all seasons, for varying networks and regions.Such an endeavour worldwide is currently difficult to achieve.It would require an enormous effort and it would also require data sharing among researchers, which is still not that easy to accomplish due to confidentiality requirements often imposed by telecommunication companies.
One may wonder whether the technology is bound to disappear due to the introduction of fibre optical cable networks.For instance, for the provided link data set the majority of the links does not exist anymore due to network renewal and deployment of underground fibre optical cable networks.Whereas the Netherlands is at the forefront internationally concerning deployment of underground fibre optical cable networks for telecommunication between base stations, the country is expected to still have several thousands of links (from cellular telecommunication companies and others) in 2025.For other countries and continents the uptake of fibre optics will be significantly slower, lagging behind at least 5-20 years.Moreover, construction of fibre optical cable networks may not be feasible or economically viable in many mountainous or rural areas around the world.Therefore, we expect this type of cellular communication infrastructure to still be around for several decades worldwide (Ralph Koppelaar, T-Mobile NL, personal communication, 2016).Why not attempt to use this existing infrastructure as a complementary source of rainfall information, in particular in those areas around the world with very few rain gauges, let alone weather radars?
Although rain gauges, radars, and satellites have been specifically designed to measure rainfall, all of these instruments face their own challenges.It is well known that radar rainfall estimates generally deteriorate for longer ranges from the radar.Geostationary satellite observations have a time resolution of typically 15 min but are often very indirect (e.g.estimates through cloud physical properties; Roe-beling and Holleman, 2009).Low-Earth Orbit satellites usually have long revisit times.Despite (new) satellite missions, microwave link data can still become important for ground validation of or merging with satellite rainfall products.For instance, the IMERG product of the new GPM mission provides gridded rainfall products every 30 min covering 60 • N-60 • S with a spatial resolution of 0.1 • (Hou et al., 2014;Rios Gaona et al., 2016).This is certainly a major step forward with respect to TRMM, but one has to recognise that the rainfall retrieval algorithm heavily relies on temporal interpolation and, depending on the product, additional data sources, such as rain gauges, since the actual satellite revisit time is typically several hours.Moreover, links measure rainfall close to the ground, which is not the case for weather radar and satellites, and at spatio-temporal scales relevant for meteorology and hydrology (typically 1 s-15 min; 0.1-20 km).Even if rain gauges are present, the number of links will often be an order of magnitude larger than the number of rain gauges in a region.The larger spatial density of links has been demonstrated in our previous work to compensate for their lower accuracy with respect to rain gauges.Hence, rainfall information from cellular telecommunication networks is promising for hazardous weather warning, flood forecasting, food production, drought monitoring, etc.Finally, although it is indeed difficult to obtain transmitted and received signal level data from telecommunication companies, researchers have managed to obtain data for a limited, but expanding, number of countries (namely, Australia, Brazil, Burkina Faso, Czech Republic, France, Germany, Israel, Kenya, Pakistan, Sweden, Switzerland, and the Netherlands).
To conclude, we feel that merging of rainfall data from different sources (if available) will often yield the best rainfall estimates.For instance, satellite data could be used for wet-dry classification to prevent non-zero link-based rainfall estimates during dry periods (Van het Schip et al., 2016).We believe that the main potential for rainfall estimation using microwave links is found in areas with few surface rainfall observations.In addition, development of a merged linksatellite rainfall product seems an interesting opportunity.9.All time intervals that have not been classified as wet are classified as dry for the link selected in step 1.
10. Repeat these steps for all other links.
Note that a radius of 10 km is used in Overeem et al. (2011), whereas a default radius of 15 km is used here and in Overeem et al. (2013), which allows estimating rainfall in areas with lower spatial link densities.The 15 km radius is representative of the decorrelation distance of convective rainfall in the Netherlands in case of a time interval of 15 min.For the often occurring stratiform rainfall this distance will be (much) longer.
Step 4 has been slightly altered with respect to Overeem et al. (2011).
The threshold values in steps 7 and 8 have been obtained from Overeem et al. (2011), who optimise them by visual comparison with the gauge-adjusted radar data set of pathaveraged rainfall intensities employing data from 2009 (NEC links with 0.1 dB power resolution).
Step 8 was used because rainfall is generally correlated in time, and sometimes very local, so that it does not occur at surrounding links.
If the algorithm would be rerun for another period, for instance one time interval later, this can result in different rainfall estimates compared to the preceding run.This is due to step 8 of the wet-dry classification methodology, which may also classify the previous 30 min as rainy.In order to obtain the same rainfall estimates for different runs, the algorithm would need to be slightly modified or one should wait 30 min before applying the algorithm.Another option is to not apply step 8, which can be supplied as function argument.The rainfall retrieval algorithm is suitable for real-time application, for which the reclassification of previous time intervals is of no consequence, because rainfall intensity of present time interval is of interest.

Appendix D: Interpolation methodology
Here an extensive description of the interpolation algorithm is given, which is carried out by calling the function "Interpolation".It performs the following steps.
1. Convert the supplied interpolation grid with coordinates (longitude and latitude; two columns) to an azimuthal equidistant cartesian coordinate system.In the example the interpolation grid is in WGS84 (degrees).A radar grid is used, where the coordinate of each grid cell represents the middle of the radar pixel.The coordinates of the supplied link data are also converted to an azimuthal equidistant cartesian coordinate system (easting and northing; km).
2. Select the mean path-averaged link rainfall intensities for each time interval.
Then it calls the subfunction "IntpPathToPoint", which does the following: 1. Compute the coordinates belonging to the middle of the links.
2. Determine the unique coordinates of the middle of the links.
3. Calculate the average rainfall intensity for each unique set of coordinates.This implies that data from fullduplex links are averaged.If another link happens to have the same middle of the link path, its rainfall intensity is taken into account in the averaging.
Next, three different interpolation methodologies are available, which can be chosen by supplying a function argument: 1. Inverse distance weighted interpolation on link rainfall data (subfunction "IDW").The inverse distance weighting power should be supplied as function argument.
2. Ordinary kriging with spherical variogram model.Its parameter value nugget, sill, and range can be defined by the user as function arguments.
where r is the range (m), C is the partial sill (mm 2 h −2 ), C 0 is the nugget (mm 2 h −2 ), DOY is day of year, and D is the duration (h), i.e. the time interval of the rainfall intensities which are to be interpolated.The nugget is basically the semi-variance at zero distance, which can be interpreted as very-fine scale variability or as measurement uncertainty.
The sill is the variance at very large distances, and the range is the distance at which the variance does not increase any more (this is equivalent to the distance beyond which the field is completely decorrelated, i.e. ρ(r) = 0).The spherical variogram takes the following form: where h is the distance (m).

Figure 1 .
Figure 1.Left: illustration of a telephone tower.The electromagnetic signals transmitted from the directional antenna of one cellular communication tower to another are attenuated by rainfall.Right: map of the Netherlands with locations of the employed link paths (1527) from the cellular communication network for 9 September, 16:00 UTC-11 September, 08:00 UTC (2011).These were part of one network of one of the three providers in the Netherlands.The white circles show the locations of the two weather radars operated by the Royal Netherlands Meteorological Institute (KNMI).
rainfall b Coefficient of R-k power law a (mm h −1 dB −b km b ) 3.4-25.0Drop size distribution, frequency c Exponent of R-k power law b (−) 0.81-1.06Drop size distribution, frequency c a Here A a is fixed.b Here α is fixed.

Figure 2 .
Figure 2. Values of coefficients in the relationship to convert specific attenuation to rainfall intensity for frequencies ranging from 6 to 50 GHz.The grey-shaded area denotes the 37.0-40.0GHz range.Note the logarithmic vertical scale in the left figure.Here values have been computed from one data set of measured drop size distributions (p.65 inLeijnse (2007c); solid lines).The values recommended by the International Telecommunication Union(ITU, 2005), meant for computing specific attenuation for given rain rates and for worldwide application, are also plotted (dashed and dotted lines).

Figure 3 .
Figure 3.The figure illustrates how path-averaged link rainfall depths are interpolated to link rainfall maps.First, path observations (left panel) are assigned to points (centre panel).Next, ordinary kriging is applied to obtain rainfall maps (right panel).For time interval 20:30-20:45 UTC on 10 September 2011.

Figure 4 .
Figure 4. From received signal powers to cumulative rainfall depths (one day, one link): minimum and maximum received powers and path-averaged, gauge-adjusted radar rainfall intensities (a); corrected minimum and maximum received powers, reference signal levels, and path-averaged, gauge-adjusted radar rainfall intensities (b); minimum and maximum path-averaged link rainfall intensities (c); mean pathaveraged link and gauge-adjusted rainfall intensities (d); cumulative path-averaged link and gauge-adjusted radar rainfall depths (e); map with location of microwave link in the city centre of Amsterdam, the Netherlands (f).Period is 30 August, 08:00 UTC-31 August, 08:00 UTC (2012).

Figure 5 .Figure 6 .
Figure 5. Fifteen-minute rainfall maps from 10 September 2011, 20:30-20:45 UTC, for links only (left) and radars plus gauges (right) for the Netherlands.Spatial resolution is approximately 0.9 km 2 .Values below 0.1 mm are not shown.The lines denote the locations of the employed microwave links (left).This figure is part of the working example.

Figure 7 .
Figure 7. Illustration of plotting capability of R visualisation function "RainMapsLinksTimeStep" in package RAINLINK: 15 min rainfall map from 10 September 2011, for links only for Amsterdam, the Netherlands.Spatial resolution is approximately 0.9 km 2 .Values below 0.2 mm are not shown.The lines denote the locations of the employed microwave links.The number at the red cross is the rainfall depth at that location, of which the name is provided in the title caption.This figure is part of the working example.

Figure 8 .Figure 9 .
Figure 8. Sensitivity analyses of the threshold values for the wet-dry classification.The 2-D contour plots display relative mean error (%; black lines) and, in colours, CV (left) or ρ 2 (right) in path-averaged 15 min rainfall depths as a function of median( P ) and median( P L ).The default threshold values are indicated by the black dot.No outlier filter has been applied.
3. Ordinary kriging with spherical variogram model withclimatological parameter values based on a 30-year rain gauge data set.These are computed for the DOY as obtained from the input data frame with microwave link data, thus taking into account seasonality in spatial rainfall correlation.The subfunction "ClimVarParam" computes these parameter values.For the last methodology the spherical variogram parameters can be computed as follows (power-law scaling in cosine function parameter;Van de Beek et al., 2012):

Table 1 .
Overview of functions needed for processing link data from received signal powers to rainfall maps.Italicised (sub)functions are optional.A choice has to be made between bold subfunctions.The script "Run.R" can be employed to determine which (sub)functions are being applied.

Table 2 .
Most important variables used in the (sub)functions of the rainfall retrieval algorithm.−1 dB −b km b Coefficient of R-k power law • (km) Latitude (or northing) of end of microwave link

Table 3 .
Values of the parameters used in the rainfall retrieval algorithm.All these parameter values can be modified.The configuration file "Config.R" can be utilised to load all parameter values.