Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

Hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behavior of different hydrometeor types. Namely, the results of classification largely rely upon the quality of scattering simulations. The unsupervised approach, in turn, lacks the constraints related to the hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches in a way that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of hydrometeor classes. Aside from comparing, a routine alters the content of clusters by encouraging further statistical clustering in case of non-identification. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are finally employed in operational labeling of different hydrometeors. The method has been applied to three C-band datasets, each acquired by a different operational radar from the MeteoSwiss Rad4Alp network, as well as to an X-band dataset acquired by a research radar. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, with a particular emphasis on hail detection performance.

As the most widespread approach in hydrometeor classification, fuzzy logic classification methods have been the subject of several validation campaigns. One of the most extensive, the Joint Polarization Experiment (Ryzhkov et al., 2005), demonstrated in particular improved hail detection capabilities using ground measurements.
A Bayesian approach proposed by Marzano et al. (2010) is another representative supervised approach, where each simulated class is characterized by its center and covariance matrix, whereas the labeling of the observations is done by means of Bayesian inference (the maximum a posteriori rule).
The most obvious limitation of this dominant class of methods is the strong dependence of the classification decision on the quality of the polarimetric signatures that are supposedly known a priori.
A different approach, based on the unsupervised concept and pioneered by Grazioli et al. (2015), tends to avoid using a priori known or presumed polarimetric signatures. The focus is rather on exploiting the radar observations with the aim of clustering a set of diverse polarimetric measurements into distant clusters which are to be labeled as different hydrometeor types. In the mentioned method the separation is achieved through Agglomerative Hierarchical Clustering (AHC), by simultaneously introducing spatial texture information, whereas the labeling of the obtained clusters is done once, manually, by taking into account both radar and non-radar information.
Atmos. Meas. Tech. Discuss., doi:10.5194/amt-2016-105, 2016. Manuscript under review for journal Atmos. Meas. Tech. Published: 30 March 2016. © Author(s) 2016. CC-BY 3.0 License.
As was the case with the unsupervised approach introduced above, the idea behind the semi-supervised approach we propose here is to avoid relying heavily on presumed polarimetric properties of hydrometeors, though not entirely. Namely, the intention was to: allow for the "glimpse" of the presumed hydrometeor microphysical properties through a constrained clustering and, simultaneously, automate the labeling of the obtained clusters (influence of the supervised concept); make the classification decision criteria conform to the data specificities, particularly potential imperfections of the radar measurements (influence of the unsupervised concept); ensure the operational potential of the method, i.e. keep the implementation simple enough for real-time operation. This is achieved in a somewhat different way with respect to the state-of-the-art semi-supervised approach proposed by Bechini and Chandrasekar (2015), by combining two classical data processing tools: k-medoids clustering and the Kolmogorov-Smirnov test. The "glimpse" of the presumed microphysics comes through the state-of-the-art assumptions (Dolan and Rutledge, 2009; Dolan et al., 2013), appropriately modified using scattering simulations, whereas the influence of the technical specificities of data is taken into account by comparatively working on datasets acquired by three MeteoSwiss Rad4Alp (Germann et al., 2015) C-band operational radars (Albis, Monte Lema and Plaine Morte) and one X-band radar (MXPol) belonging to EPFL. As a result, we obtain for each of the considered radars a set of centroids in a multi-dimensional space, formed by four polarimetric parameters and a liquid/melting/ice phase sigmoidal indicator. These are later used to classify observed precipitation by simply applying a Euclidean distance criterion, which makes the method very suitable for operational use.
A qualitative and quantitative validation is performed through comparison with the appropriate supervised and unsupervised routines, as well as by involving some external information (rain gauge measurements and an operational hail product).
The article is organized as follows: in Section 2 we introduce the employed statistical methods. Section 3 contains a detailed description of the proposed method, along with some auxiliary analyses. Further on, in Section 4 we illustrate some results of the classification applied to C- and X-band datasets, simultaneously validating them through appropriate comparisons with independent measurements. Finally, Section 5 concludes the article through a discussion and provides some perspectives.

2 Background on employed statistical methods
The proposed semi-supervised algorithm mainly relies on two statistical tools, elaborated in the following subsections for non-expert readers: the unsupervised k-medoids clustering and the Kolmogorov-Smirnov statistical test. These two methods serve as the link between the polarimetric radar measurements and the hydrometeor scattering hypotheses.

Unsupervised clustering
As would be the case with k-means (Lloyd, 1982), the employed k-medoids algorithm (Kaufman and Rousseeuw, 2009) is used to partition the multivariate observation vectors $(x_1, x_2, \ldots, x_n)$ into $k$ subsets or clusters $(S_1, S_2, \ldots, S_k)$, in a way that the subsets minimize $D$, the sum of distances between the observations and the centroid $\mu_i$ of their subset:

$$D = \sum_{i=1}^{k} \sum_{x_j \in S_i} d(x_j, \mu_i) \qquad (1)$$

The distance $d$ can vary from the squared Euclidean norm $\|\cdot\|_2^2$ for k-means and the $\ell_1$ norm for the original k-medoids algorithm (Kaufman and Rousseeuw, 1987), to the standardized Euclidean distance that we have adopted for our approach:

$$d(x_j, \mu_i) = \sqrt{\sum_{l} \left( \frac{x_{j,l} - \mu_{i,l}}{\sigma_{S_i,l}} \right)^2} \qquad (2)$$

which is normalized with respect to the standard deviation of the subset ($\sigma_{S_i}$), with $l$ indexing the components of the observation vector. It is an iterative algorithm, where centroids are recalculated after each iteration, during which the composition of the subsets changes. Once the composition becomes stationary, the algorithm has converged. Unlike in k-means, where a centroid does not necessarily belong to the dataset, the centroid of a subset in the k-medoids algorithm, named medoid, is always a member of the set. This makes k-medoids more robust to the presence of outliers, particularly when partitioning smaller sets of observations. The implementation of the method depends on the size of the observation sample, following the criteria of the default MathWorks (2015) version: for small samples (up to 3000 observations), we employ the Partitioning Around Medoids (PAM) algorithm (Kaufman and Rousseeuw, 2009), which minimizes $D$ by swapping between medoids and non-medoids; for large samples (from 3000 to 10000 observations), an algorithm proposed in Park and Jun (2009) is used, in which the minimization of $D$ is achieved as in the case of k-means, by choosing the medoid closest to the hypothetical corresponding k-means centroid; for very large samples, only a random selection of a cluster's samples is considered in recalculating medoids.
As indicated in the introduction, the vector $x$ has five dimensions in our case: four polarimetric parameters and a liquid/melting/ice phase sigmoidal indicator. Different distributions are characterized by different kurtosis (e.g. $Z_H$ usually has far more negative kurtosis than $K_{dp}$), hence the need to standardize (normalize) the Euclidean distance by dividing it by the standard deviation of the considered variable.
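To make the clustering step more concrete, the following is a minimal sketch of an alternating (k-means-like) k-medoids iteration with a standardized Euclidean distance. It is not the MathWorks implementation referred to above: the function names are hypothetical, the initialization is a simple farthest-first rule chosen here for determinism, and the standard deviation is computed once over the whole dataset rather than per subset, which is a simplifying assumption.

```python
import math

def column_std(points):
    """Per-dimension standard deviation of a list of equal-length vectors."""
    n, dims = len(points), len(points[0])
    stds = []
    for d in range(dims):
        mean = sum(p[d] for p in points) / n
        var = sum((p[d] - mean) ** 2 for p in points) / n
        stds.append(math.sqrt(var) or 1.0)  # guard against zero spread
    return stds

def std_euclidean(a, b, stds):
    """Euclidean distance with each component divided by its standard deviation."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, stds)))

def assign(points, medoids, stds):
    """Index of the closest medoid for every point."""
    return [min(range(len(medoids)),
                key=lambda c: std_euclidean(p, points[medoids[c]], stds))
            for p in points]

def k_medoids(points, k, max_iter=100):
    # Assumption: global standard deviation, not the per-subset one of the paper.
    stds = column_std(points)
    # Assumption: deterministic farthest-first initialization.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(std_euclidean(points[i], points[m], stds)
                              for m in medoids)))
    for _ in range(max_iter):
        labels = assign(points, medoids, stds)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                updated.append(medoids[c])
                continue
            # The new medoid is the member minimizing the summed distance
            # to all other members of its cluster.
            updated.append(min(
                members,
                key=lambda i: sum(std_euclidean(points[i], points[j], stds)
                                  for j in members)))
        if updated == medoids:  # composition is stationary: converged
            break
        medoids = updated
    return medoids, assign(points, medoids, stds)

toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_medoids, toy_labels = k_medoids(toy_points, 2)
```

In this toy example the medoids are always members of the dataset, which is the property distinguishing k-medoids from k-means in the description above.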

Kolmogorov-Smirnov test
The two-sample Kolmogorov-Smirnov (KS) test is a non-parametric hypothesis test which tells us whether two samples can be characterized by the same probability distribution, whereas the one-sample version determines whether the sample is distributed according to a particular distribution (Kolmogorov, 1933; Smirnov, 1948).
The test itself is based on the comparison between the empirical cumulative distribution functions ($F$) of two samples, the test statistic $D_{KS}$ being the supremum of the set of their distances:

$$D_{KS} = \sup_{x} \left| F_1(x) - F_2(x) \right| \qquad (3)$$

with the absolute value making the test two-tailed. The decision on accepting the $H_0$ hypothesis, which assumes that the two samples are drawn from the same probability distribution, is taken either by comparing the test statistic with the critical value or by comparing the p-value with the test significance $\alpha$ (type I error). Through the dependence of the critical value and the p-value on the number of samples, related to the test power $1 - \beta$, the test decision depends as well on $\beta$ (type II error). A type I error is the false rejection of a true $H_0$ hypothesis, while a type II error is the failure to reject a false $H_0$ hypothesis.

In our case, the two samples are the values of the observed parameter $x_j$ (one of the five considered parameters) and the expected values of the same parameter (derived from the employed membership functions), as will be elaborated in the following section. The decision is based on comparing the value of the test statistic with the critical value as determined by Pearson and Hartley (1972).
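As an illustration of the two-sample test, the sketch below computes the KS statistic as the largest gap between two empirical cumulative distribution functions and compares it with a critical value. The critical value here uses the common large-sample approximation, which is an assumption made for brevity; the paper relies on the tabulated values of Pearson and Hartley (1972). Function names are hypothetical.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over the pooled values."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a + b)):
        # Advance both empirical CDFs up to and including x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the critical value (assumption:
    used here instead of tabulated values)."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

def same_distribution(sample_a, sample_b, alpha=0.05):
    """True when H0 (same distribution) is not rejected at level alpha."""
    d = ks_statistic(sample_a, sample_b)
    return d <= ks_critical_value(len(sample_a), len(sample_b), alpha)
```

Note that the critical value shrinks as the sample sizes grow, which is how the number of samples enters the test decision.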

The process starts with the selection of representative observations, aiming to get a dataset which contains all hydrometeor types we should potentially be able to detect and identify: crystals (CR), aggregates (AG), light rain ( hail, which is present in two classes. At X band, we did not manage to observe any vertically aligned ice while collecting representative observations and thus, we had to omit the VI class. Nevertheless, this selection of classes is not mandatory, because the proposed approach can be used with any set of hydrometeor classes.
[ Figure 2 about here.]

3.1 Data preparation
At C band, representative observations are selected by carefully sampling eight days of radar measurements for the Albis and Plaine Morte radars and four days for the Monte Lema radar, involving several stratiform and convective precipitation events (from all four seasons). The reason behind the smaller initial set for the Monte Lema radar (four days) is the higher regional frequency of hail storms and our desire to keep the proportions of different hydrometeors similar for all radars. All three considered operational radars have the same scanning pattern, covering the entire azimuth with 20 elevations (from −0.2° to 40°) in 5 minutes, resulting in 288 full scans per day. Clutter-contaminated pixels, as flagged by the slightly adjusted operational clutter removal routine (Germann and Joss, 2004), as well as the pixels below the noise level threshold, are removed. In addition, the sampling is restricted to the elevations from 3.5° to 11°, as well as to the range between 3 km and 40 km. The lower elevation boundary was chosen to avoid any potential residual clutter, while the upper one, combined with the selected range, restricts the considered altitude to below 7.5 km, sufficient to sample all types of precipitating hydrometeors. The selection itself is a sort of constrained random sampling. In an effort to encourage the diversity of the present hydrometeor types, we aim to obtain distributions of parameters (particularly $Z_H$ and the relative altitude with respect to the 0 °C isotherm) that are as platykurtic as possible, in the following ranges:
1. $Z_H$: −10 to 60 dBZ,
2. $Z_{DR}$: −1.5 to 5 dB,
3. $K_{dp}$: −0.5 to 5 deg/km,
4. $\rho_{hv}$: 0.7 to 1,
5. Ind: −1 to 1.
The fifth parameter is introduced to better distinguish classes in the liquid and ice phases that have similar polarimetric signatures, but without directly introducing the information about temperature. It is not directly observed by radar (but can be deduced from radar measurements in stratiform precipitation by identifying the melting layer). It is ultimately a quasi-balanced ternary system indicating the liquid, melting and ice phases, obtained by applying a sigmoid transform (Eq. 4) in order to decrease its influence on the discrimination between different hydrometeor types inside the liquid or ice phase. In this transform, $\Delta H$ is the relative altitude with respect to the 0 °C isotherm, the centering parameter $m$ is set to zero and the slope parameter $b$ is either very low (blue curve in Fig. 3) or very large (red curve). The former is applied in the centroids derivation, while the latter is used in the assignment. The rationale behind the less steep slope applied to the representative observations is to preserve a sort of continuity, for the purpose of coherent statistical testing of five continuous distributions.
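The effect of the slope parameter can be illustrated with a small sketch. The exact expression of the transform is not reproduced here, so the form below, a logistic function rescaled to the interval from −1 to 1 (consistent with the range listed for Ind), is an assumption, as are the sign convention, the use of kilometres for $\Delta H$ and the two slope values.

```python
import math

def phase_indicator(delta_h_km, b, m=0.0):
    """Assumed sigmoid indicator: 2 / (1 + exp(-b * (dH - m))) - 1,
    written with tanh, which is algebraically identical and avoids overflow."""
    return math.tanh(0.5 * b * (delta_h_km - m))

GENTLE_SLOPE = 0.5   # hypothetical value for centroids derivation
STEEP_SLOPE = 50.0   # hypothetical value for pixel assignment

# One kilometre above the 0 degC isotherm:
gentle = phase_indicator(1.0, GENTLE_SLOPE)  # stays well inside (-1, 1)
steep = phase_indicator(1.0, STEEP_SLOPE)    # saturates close to 1
```

With the gentle slope the indicator remains a continuous variable suitable for distribution testing, whereas the steep slope turns it into a nearly ternary flag (values near −1, 0 at the isotherm, and near 1).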
[ Figure 3 about here.]

The set of representative observations at X band is the one used in illustrating the unsupervised approach proposed by Grazioli et al. (2015). Given the transportability of the employed radar, these datasets were collected at two different locations (Davos in Switzerland and Ardèche in France), at elevation angles ranging from 3.5° to 10°.
The sizes of the derived representative datasets, along with the most relevant information about the considered radars, are given in Table 1.
[ Table 1 about here.]

Instead of using the specific differential phase derived as a product of the operational Rad4Alp radar network, and in order to avoid any outliers, this parameter was estimated specifically for the presented study by rigorously employing a multi-step approach (Vulpiani et al., 2012), reinforced by median filtering. As for the rest, the method conforms to the current state of the operational network. Attenuation and differential attenuation were corrected in the entire volume using the ZPHI method of Testud et al. (2000), while noise in correlation was corrected according to the standard operational procedures. Given the considerable efforts invested in the automatic calibration and monitoring of the network, we are confident that the probability of radar errors is significantly reduced (Germann et al., 2015).
The information concerning the altitude of the 0 °C isotherm has been collected from the COSMO model (Baldauf et al., 2011), by relying on the 0 °C isotherm product (centroids derivation) or by applying the standard atmosphere lapse rate in the troposphere (6.5 °C/km) to the temperature profiles (pixel assignment). The exceptions are stratiform events observed with the X-band radar, for which the melting layer is detected using a polarimetric radar based method (Wolfensberger et al., 2015).

Centroids derivation
The method itself is conceived as an iteration inside an iteration. In this section we intend to provide a detailed description, starting from the "internal loop" and going towards the external one, the latter resulting in a final set of centroids for the considered radar.
As can be seen in Fig. 2, the "internal loop" is the very core of the proposed method. It starts with an initial, entirely unsupervised clustering of the representative dataset. This is done by means of the k-medoids clustering algorithm, which divides the initial set into $N$ distant sets, using the standardized Euclidean distance as a criterion. The value of $N$ is set to nine, which corresponds to the number of hydrometeor classes we eventually seek (see section 3), though a different value does not simulations based on double layer T-matrix method (Mishchenko et al., 1996). The comparison itself is performed using the Kolmogorov-Smirnov test (see section 2.2), by comparing separately each of the five considered parameters. Then, the five obtained test statistics are combined in a weighted arithmetic sum, with weights $w_i = 1$ for $i = 1, \ldots, 4$ and $w_5 = 0.75$, the last being part of the endeavor to decrease the impact of non-radar variables. The resultant test statistic is finally compared with the threshold defined by the chosen test significance ($\alpha$) and the number of samples (related to $\beta$), following Pearson and Hartley (1972).
Clusters which satisfy the $H_0$ hypothesis exit the iteration as labeled observations, while the rest proceed to additional clustering, this time into only two sets. This clustering procedure is identical to the initial one, i.e. entirely unsupervised. The potential benefit of including the information about dissimilarity arising from the Kolmogorov-Smirnov test, namely the value of $x_j$ for which the distance $D_{KS}$ is maximal (Eq. 3), was investigated, but it turned out that such a constrained clustering would in fact not be beneficial: both the identification rate and the credibility of the obtained classes were in favor of an unconstrained clustering. The newly obtained sets then undergo identification separately. The loop for a cluster which fails to be identified ends when the number of iterations exceeds $i_{max}$, which is empirically set to at least 10, or when the size of the cluster falls below $n_{min}$, which is identical to the number of samples, a parameter that varies in the external loop. All the labeled clusters are merged, according to the assigned label, into nine classes, which are characterized by a set of nine centroids in the five-dimensional space. Unlabeled clusters, on the other hand, are assumed to be hydrometeor mixtures and are therefore not analyzed further in this phase of our research. Their proportion is minimized by considering a fairly proximate range (up to 40 km) in selecting representative observations.
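The control flow of the internal loop can be summarized with the following skeleton. It is only a sketch of the identify-or-split logic described above: `identify` stands for the KS-based comparison with the reference classes and `split` for the two-way k-medoids clustering, both passed in as hypothetical callables, and the toy stand-ins at the bottom exist purely to make the example executable.

```python
def label_clusters(cluster, identify, split, n_min, i_max=10, depth=0):
    """Return (labeled, unlabeled): labeled is a list of (label, cluster)
    pairs, unlabeled a list of clusters that could not be identified."""
    label = identify(cluster)
    if label is not None:
        return [(label, cluster)], []       # H0 satisfied: exit as labeled
    if depth >= i_max or len(cluster) < n_min:
        return [], [cluster]                # give up: presumed mixture
    labeled, unlabeled = [], []
    for part in split(cluster):             # further clustering into two sets
        if part:
            lab, unl = label_clusters(part, identify, split,
                                      n_min, i_max, depth + 1)
            labeled.extend(lab)
            unlabeled.extend(unl)
    return labeled, unlabeled

# Toy stand-ins (hypothetical): a cluster is "identified" when all of its
# values share a sign, and splitting halves the sorted values.
def toy_identify(values):
    if all(v < 0 for v in values):
        return "negative"
    if all(v > 0 for v in values):
        return "positive"
    return None

def toy_split(values):
    ordered = sorted(values)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]

toy_labeled, toy_unlabeled = label_clusters(
    [3, -1, 2, -3, 1, -2], toy_identify, toy_split, n_min=2)
```

In the full method, the labeled clusters returned by such a routine would subsequently be merged by label before the class centroids are computed.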
As was the case with the clustering method, the comparison method has also been challenged, by implementing in parallel Student's t-test (Snedecor and Cochran, 1989) and the Wilcoxon rank sum test (Gibbons and Chakraborti, 2011). The former focuses on the equality of mean values under the assumption of equal variances of normally distributed samples, while the latter examines the equality of medians without any additional constraints. The identification with both of these alternative tests is somewhat faster than with the KS test (Fig. 4), which can be explained by the fact that the KS test relies on the entire probability density function (all moments), unlike the studied alternatives, which consider only first-order statistics. However, the composition of the obtained classes does not vary significantly, leading us to the decision to keep the KS test due to its less restrictive assumptions.
The classes obtained from each of these 30 external iterations are first used to estimate multi-dimensional probability density functions for each of the detected hydrometeor types (Fig. 5). This is done by means of a Kernel Density Estimator (KDE) (Ihler and Mandel, 2003; Parzen, 1962). The resulting polarimetric descriptors have the potential to be used further as non-parametric membership functions in a fuzzy logic classification algorithm (Wen et al., 2015, 2016). In this paper, however, their role is restricted to the qualitative description of the obtained classes.
[ Figure 5 about here.]

The proposed classification requires a final set of centroids, composed of the medians of the centroids obtained in the external loop (Fig. 6). However, before defining a final centroid for a given hydrometeor class, we check the dispersion of the centroids obtained in the considered five-dimensional space. This is done by calculating the interquartile coefficient of dispersion, ranging from 0 to 1, for each polarimetric parameter $j$ and each hydrometeor class $i$. If the overall value of the coefficient (averaged over all five parameters), which is inversely proportional to the share of a given hydrometeor class in the representative set of observations, exceeds the empirically determined threshold (0.5), the corresponding class is not considered.
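The final-centroid step can be sketched as follows. The dispersion measure used here is the textbook quartile coefficient of dispersion, (Q3 − Q1)/(Q3 + Q1), which stays between 0 and 1 only for non-negative values; whether the paper uses exactly this form is an assumption, and the function names and toy numbers are hypothetical.

```python
import statistics

def quartile_dispersion(values):
    """Textbook quartile coefficient of dispersion (assumed form)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)

def final_centroid(centroids, threshold=0.5):
    """centroids: one vector per external iteration, for a single class.
    Returns the component-wise median, or None when the centroids are
    too dispersed for the class to be retained."""
    columns = list(zip(*centroids))
    mean_dispersion = statistics.mean(
        [quartile_dispersion(col) for col in columns])
    if mean_dispersion > threshold:
        return None
    return [statistics.median(col) for col in columns]

# Hypothetical two-dimensional centroids from five external iterations.
stable = final_centroid([[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]])
scattered = final_centroid([[1], [1], [1], [100], [100]])
```

Using the median rather than the mean keeps the final centroid insensitive to a few outlying external iterations, which is in line with the robustness argument made for k-medoids.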
[ Figure 6 about here.]

Generally speaking, the differences between the centroids characterizing different C-band radars do not appear to be very significant (Fig. 6), and consequently do not considerably alter the outcome of the classification (Fig. 7). However, as we can observe in the latter figure, misclassification is still possible (e.g. AG vs. RP or WS vs. RN). Therefore, it appears that the idea of adapting the classification criteria to the particularities of a radar is relevant. This is especially justified in the case of the $\rho_{hv}$ parameter, whose noise correction is likely more susceptible to minor dissimilarities existing between different operational radars. The comparison with the centroids derived from unprocessed data (before attenuation and noise corrections) illustrates the rather significant influence the post-processing can have on the hydrometeor classification.

Pixels assignment
Given the skewness and the leptokurticity characterizing the distributions of $K_{dp}$ and $\rho_{hv}$, these two parameters are transformed prior to the assignment, the transformation of $\rho_{hv}$ being logarithmic.
The fifth parameter is scaled by means of a significantly stricter sigmoid transformation (Eq. 4). In this way, in the pixel assignment, the external parameter literally plays the role of an ice/liquid phase indicator (Fig. 3). The classification itself is performed by determining the Euclidean distance, in the five-dimensional space, of any observed precipitation pixel with respect to the nine defined centroids (Fig. 6). The distance between the $k$th observed pixel and the $j$th centroid is a weighted Euclidean distance, with weights $w_i = 1$ for $i = 1, \ldots, 3$, $w_4 = 0.75$ and $w_5 \leq 0.5$. The impact of the $\rho_{hv}$ parameter is slightly decreased due to the highly probable residual noise in the correlation between channels, whose influence is emphasized by the logarithmic transformation. Out of the nine distances obtained for each pixel, it is the minimal one which determines its hydrometeor label.
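The assignment rule amounts to a weighted nearest-centroid search, sketched below. The weights follow the values quoted above (with $w_5$ set to its upper bound of 0.5), but where exactly the weights enter the distance is an assumption, the inputs are assumed to be already transformed and scaled, and the two centroids are invented numbers that do not come from the paper.

```python
import math

# Weights from the text: w1..w3 = 1, w4 = 0.75, w5 <= 0.5 (0.5 chosen here).
WEIGHTS = (1.0, 1.0, 1.0, 0.75, 0.5)

def weighted_distance(pixel, centroid, weights=WEIGHTS):
    """Weighted Euclidean distance; applying the weights to the squared
    differences is an assumption of this sketch."""
    return math.sqrt(sum(w * (x - c) ** 2
                         for w, x, c in zip(weights, pixel, centroid)))

def classify(pixel, centroids, weights=WEIGHTS):
    """Label of the closest centroid; centroids maps label -> 5-vector."""
    return min(centroids,
               key=lambda lab: weighted_distance(pixel, centroids[lab], weights))

# Hypothetical, already-scaled centroids for two classes.
toy_centroids = {
    "RN": [0.50, 0.30, 0.20, 0.90, -1.0],
    "AG": [0.40, 0.10, 0.00, 0.95, 1.0],
}
toy_label = classify([0.45, 0.20, 0.10, 0.93, 0.9], toy_centroids)
```

Because each pixel only requires one distance per centroid, the cost of the assignment grows linearly with the number of pixels, which supports the claim of operational suitability.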

The choice of the employed distance was challenged by a comparative analysis with the standardized Euclidean distance (including standard deviation) and the Mahalanobis distance (including a covariance estimate; Mahalanobis, 1936). By relying on the same qualitative and quantitative criteria we use in the validation, and by additionally taking into account the simplicity required for operational purposes, we have retained the original choice.
Finally, the classification map is transformed into its probabilistic version (Fig. 8d), by sorting the distances individually for each of the detected hydrometeor types. In this way, pixels being the most distant with respect to their centroid receive a lower probability of occurrence, which should point either towards a potential misidentification or towards a likely hydrometeor mixture.

Results and validation
The algorithm with the derived sets of centroids for the four considered radars has been applied to a number of characteristic events observed by these radars. This is used to illustrate, and to a degree validate, the prospects of the proposed classification.

X-band
At X band the focus is on the comparison with the centroids obtained with the corresponding unsupervised method (Grazioli et al., 2015), which was in fact defined using the same dataset. Classification based on fuzzy logic was considered as well (modeled upon the work of Dolan and Rutledge, 2009). The example of the comparison illustrated in Fig. 8 is representative of the ensemble of results obtained while treating MXPol data.
[ Figure 8 about here.]

The results at X band match to a significant degree those obtained by employing the centroids derived in an unsupervised manner (Fig. 8).
The main difference can be spotted by closely observing the ice phase classes (e.g. crystals, aggregated and rimed ice particles) as well as the wet snow (melting) layer. However, this similarity does not imply that, aside from the automated centroids derivation, the supervised routine makes no real contribution to the proposed semi-supervised one, i.e. that the constraints introduced in clustering do not improve the decision process. Namely, the centroids derived in an unsupervised manner could in a way be taken as a reference for the following reasons: they are derived using a computationally expensive, fairly sophisticated clustering method (AHC); the information about the texture is explicitly introduced; the identification is performed through human expertise, using complementary data when possible.
Here, we arrive at very similar results: by using the simpler k-medoids clustering method; without considering the texture information at all; with the identification being performed automatically, using modifiable theoretical assumptions at the input; and while deriving more classes.
Therefore, the comparison with the unsupervised approach, based on the same representative dataset, could be considered a sort of validation, especially if we take into account the comprehensive validation of ice particles supported by a two-dimensional video disdrometer (Grazioli et al., 2014).
The comparison with the output of a fuzzy logic algorithm which uses membership functions similar to the ones employed in constraining our clustering was quantified using the spatial homogeneity feature derived from the co-occurrence matrix (Haralick et al., 1973), with $i, j$ being the position indices and $p, q$ the pixel values (in our case the number of a label, from 1 to 9). It is actually a measure of the diagonality of the co-occurrence matrix. As illustrated in Table 2, there is a substantial increase in spatial homogeneity with respect to fuzzy logic and a very small decrease with respect to the unsupervised approach, which nevertheless contains spatial information.
Finally, we performed the matching analysis by applying our classification, based on the MXPol centroids, to two X-band radars pointing toward the same volume (Fig. 9). As can be seen in the normalized matching matrix (results averaged over the entire day of acquisitions), aside from a slight confusion between crystals and dry snow (different levels of aggregation), the correspondence appears to be very good.
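The spatial homogeneity feature used above for the comparison with fuzzy logic can be sketched on a small label map. The weighting 1/(1 + |p − q|) is one common variant of the homogeneity feature and is an assumption here, as are the single horizontal pixel offset and the zero-based labels; the exact expression used in the paper is not reproduced.

```python
def cooccurrence(labels, n_classes, offset=(0, 1)):
    """Normalized co-occurrence matrix of label pairs (p, q) found at the
    given (row, column) offset in a 2-D label map."""
    dr, dc = offset
    rows, cols = len(labels), len(labels[0])
    counts = [[0] * n_classes for _ in range(n_classes)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[labels[r][c]][labels[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def homogeneity(matrix):
    """Mass close to the diagonal scores high (assumed weighting)."""
    return sum(v / (1 + abs(p - q))
               for p, row in enumerate(matrix)
               for q, v in enumerate(row))

uniform_map = [[0, 0], [0, 0]]    # one class everywhere
checker_map = [[0, 1], [1, 0]]    # neighbours always differ
h_uniform = homogeneity(cooccurrence(uniform_map, 2))
h_checker = homogeneity(cooccurrence(checker_map, 2))
```

A perfectly uniform classification map concentrates all co-occurrences on the diagonal and reaches the maximum value, while a noisy, speckled map spreads them off the diagonal and scores lower.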

C-band
The results of the classification at C band have been compared to the 5-minute operational hail detection product, Probability of Hail (POH) (Nisi et al., 2016). Though also based on radar measurements, this algorithm can, due to its entirely different concept, be considered a quasi-independent reference. Namely, the derived probability of hail is proportional to the difference between the 45 dBZ echo top height and the 0 °C isotherm height (Witt et al., 1998; Foote et al., 2005). Differences below 1.65 km indicate no hail, while those above 5.5 km mean 100% hail probability.
[ Figure 10 about here.]

The division of hail into ice hail and high-density graupel on one side and melting hail on the other (Ryzhkov et al., 2013) allowed us to bypass the obstacle of the fifth parameter (though lower weighted) and properly identify the convective core of the storm, which can be observed in Fig. 11. The same figure illustrates the potential of properly detecting the presence of vertically aligned ice, related to the reported lightning.
[ Figure 11 about here.]

An additional comparison with the corresponding fuzzy logic routine concerns liquid precipitation (Fig. 12). Namely, hourly averaged rain gauge measurements at two MeteoSwiss stations (in the vicinity of the Monte Lema radar) are compared to the rain vs. light rain output of the semi-supervised and supervised classifications. Although drawing a border line between light rain and rain is indeed somewhat debatable, by observing the ground measurements one can perceive a larger plausibility of the results obtained with the method we propose. An interesting peculiarity is that the detected "other" class at the Stabio station corresponds to melting hail, whose presence at the considered location is confirmed by the MeteoSwiss POH archive.
[ Figure 12 about here.]

5 Conclusions and future perspectives
In this paper, we propose a novel semi-supervised method for hydrometeor classification from polarimetric radar data. The idea is to combine the principal advantages of both supervised and unsupervised approaches, while keeping the potential operational implementation reasonably simple. This is achieved through the statistical clustering of representative observations of the considered polarimetric radar. It includes the implicitly introduced constraints, provided by the state-of-the-art assumptions which are appropriately modified using scattering simulations, and enforced by the Kolmogorov-Smirnov statistical test. The on three operational C-band MeteoSwiss radars and a research X-band radar. The comparative analysis with the standard supervised and unsupervised approaches was done in order to properly position the proposed method, stating its benefits and limitations. The meaningfulness of the hydrometeor identification was evaluated using ground truth measurements and well-established MeteoSwiss operational products.
For the moment, the reference observations are mostly generated using appropriately adjusted state-of-the-art membership functions. The idea is to replace this input in the future with potentially statistically richer information, such as the EM properties of hydrometeors which have recently been appearing in the literature, determined by employing the Method of Moments (MOM) (Mirkovic et al., 2015) or the Invariant Imbedding T-Matrix Method (Pelissier et al., 2015). This way, we would fully exploit the potential of the Kolmogorov-Smirnov test, limited in the current implementation by the imposed probability density distributions of most of the reference observations. Due to the non-availability of ground-truth data for the ice phase particles, we had to rely on the analogy with the validated unsupervised method. However, with the envisaged campaigns involving a Multi-Angle Snowflake Camera (MASC, Garrett et al. 2012), data acquired at ground level above the 0 °C isotherm will be used to improve the discrimination between aggregates and rimed particles. Finally, the biggest challenge in front of us is dealing with the pixels suspected to be hydrometeor mixtures. Therefore, the plan is to go further in range and deal with larger radar sampling volumes, either through their decomposition or through defining a new set of mixed classes for far ranges.

Appendix A: Employed clustering constraints
The basis of the membership functions employed to generate the reference polarimetric observations by means of inverse transform sampling has been adopted from Dolan et al. (2013) and Dolan and Rutledge (2009), for C and X band respectively. They have the form of a bell-shaped function, with $x$ being a polarimetric parameter and $m$, $a$ and $b$ being respectively the mean, width and slope of the function, provided in Tables 3 and 4. A less rigorous slope criterion was used at X band for the identification rate comparative analysis (Fig. 4).
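The generation of reference observations from a membership function can be sketched as follows. The generalized bell form used here, 1 / (1 + |(x − m)/a|^(2b)), is a commonly used expression and is an assumption, since the exact formula is not reproduced above; the membership function is treated as an unnormalized density and sampled on a discrete grid, and all parameter values are hypothetical.

```python
import bisect
import random

def bell(x, m, a, b):
    """Generalized bell-shaped membership function (assumed form)."""
    return 1.0 / (1.0 + abs((x - m) / a) ** (2.0 * b))

def sample_from_membership(m, a, b, lo, hi, n, grid=2001, seed=0):
    """Inverse transform sampling on a regular grid over [lo, hi]:
    accumulate the membership values into an unnormalized CDF, then map
    uniform random numbers back to grid points."""
    rng = random.Random(seed)
    step = (hi - lo) / (grid - 1)
    xs = [lo + i * step for i in range(grid)]
    cdf, total = [], 0.0
    for x in xs:
        total += bell(x, m, a, b)
        cdf.append(total)
    samples = []
    for _ in range(n):
        u = rng.random() * total
        idx = bisect.bisect_left(cdf, u)   # first grid point with CDF >= u
        samples.append(xs[min(idx, grid - 1)])
    return samples

# Hypothetical reflectivity-like parameter: centre 30, half-width 10, slope 5.
toy_samples = sample_from_membership(30.0, 10.0, 5.0, 0.0, 60.0, n=2000)
toy_mean = sum(toy_samples) / len(toy_samples)
```

A larger slope $b$ makes the bell closer to a rectangular window, so the generated reference sample approaches a uniform distribution over the interval from $m − a$ to $m + a$.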
The wet snow class for X band, as well as the ice and melting hail classes for both considered frequency bands, were defined using scattering simulations with the double-layer T-matrix method (Mishchenko et al., 1996), as indicated in Section 3. In addition, a number of other parameters from the original membership functions have been altered to fit the specific purpose these clustering constraints have in the framework of our approach.
The fifth parameter, a liquid/melting/ice indicator, was generated using a trapezoid function whose parameters $v_1$, $v_2$, $v_3$ and $v_4$ are provided in Table 5. An important remark is that in the implementation of the fuzzy logic approach, for the purpose of a coherent comparison, instead of the original bell-shaped temperature membership functions, we employed the relative altitude as provided by Grazioli et al. (2015).
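For completeness, a trapezoid membership function with four breakpoints can be sketched as below. The piecewise-linear form and the ordering $v_1 \leq v_2 \leq v_3 \leq v_4$ are assumptions based on the usual definition of a trapezoid function, and the numerical breakpoints in the example are hypothetical rather than the values of Table 5.

```python
def trapezoid(x, v1, v2, v3, v4):
    """Trapezoid membership: 0 outside (v1, v4), 1 on [v2, v3],
    linear ramps in between (assumes v1 <= v2 <= v3 <= v4)."""
    if x <= v1 or x >= v4:
        return 0.0
    if x < v2:
        return (x - v1) / (v2 - v1)   # rising edge
    if x <= v3:
        return 1.0                    # plateau
    return (v4 - x) / (v4 - v3)       # falling edge

# Hypothetical breakpoints on a relative-altitude axis.
rising = trapezoid(0.5, 0.0, 1.0, 2.0, 3.0)
plateau = trapezoid(1.5, 0.0, 1.0, 2.0, 3.0)
falling = trapezoid(2.5, 0.0, 1.0, 2.0, 3.0)
```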