Interactive comment on “Self-Nowcast Model of extreme precipitation events for operational meteorology” by G. B. França


Anonymous Referee #1
Received and published: 11 November 2015

This paper is interesting and quite well written. It addresses a topic of great importance in nowcasting research. I would reconsider the paper for publication if major revisions are made, especially regarding the major concerns described below.
Major concerns

Dividing the total dataset into only a training set and a validation set does not guarantee the reliability of the final results. Because of this, and because of the large number of hidden neurons allowed in the network structure, it is very probable that overfitting arises in your investigation, giving outputs whose goodness is overestimated. The standard way of dealing with this problem is to use a training, a validation and a test set, stopping the iterative training cycles when the error begins to increase on the validation set.
Only a procedure like this can guarantee that the test data are not overfitted. Thus, consider this procedure! You will probably see that the optimal number of hidden neurons also decreases. In particular, the class-frequency statistics of the validation and test sets should be the same as those of the total dataset.
Furthermore, an alternative procedure, very useful for small datasets, has recently been developed. See, for instance, Pasini and Modugno (2013), Atmospheric Science Letters 14, 301-305; Pasini (2015), Journal of Thoracic Disease 7, 953-960. The latter paper also treats the overfitting problem in terms of training, validation and test sets. I suggest also applying the so-called generalized leave-one-out procedure described in these papers, or at least citing them as a reference to another useful procedure for avoiding overfitting.
Authors response: We had already done that, but it was not clearly described in the manuscript; i.e., the training and test datasets are separated from the validation one. Please see sections 3.2, 3.2.2, 3.3 (i) and 3.3 (ii) (underlined text).
Furthermore, there is no explanation of how you choose the optimal number of hidden neurons. Empirical choice? Please specify.
Authors response: Please see below (underlined text) the detailed explanation of the number of neurons (sections 3.3 (i) and 3.3 (ii)) and of how the algorithm reaches the optimum (section 3.4).

Again, the structure of the networks used has not been sufficiently specified. For instance, have you considered nonlinear transfer functions at hidden neurons and linear ones at the output?
Authors response: Yes, we have assumed what you describe, i.e., σ and h are linear and nonlinear transfer functions between the neural network layers, respectively (see the second paragraph of section 3 below).
Please give more details on probabilistic neural networks, too. Readers may not be familiar with them.
Authors response: It was done; see the second paragraph of section 3 below.
Finally, did you adopt an objective method for pruning the inputs? Do you know that the presence of collinear inputs brings no new information and could decrease the network performance? A pruning based on linear and nonlinear correlations (through the so-called correlation ratio) would be welcome. See, for instance, Pasini and Ameli (2003), Geophysical Research Letters 30, 7, 1386, where this problem is addressed.
Authors response: The above questions are answered in the last paragraph (underlined text) of section 3.2.1.
Minor comments

P. 10640, rows 1-7. Several other references should be given for neural network applications to environmental studies. Refer to S. Haupt et al. (2009), Artificial intelligence methods in the environmental sciences, Springer; W. Hsieh (2009), Machine learning methods in the environmental sciences, Cambridge.
Authors response: The references were included in the new manuscript.
P. 10640, rows 9-14. You talk about three simple tasks but describe just two of them.
Authors response: It was done.
Authors response: It was done.
English is not up to standards. Please ask a native-speaker colleague for help.
Authors response: The manuscript was revised throughout by a native English speaker (see the attached editorial certificate).

Methodology and algorithm description
Meteorologists have limited windows of time in which to integrate all available data and generate a nowcast, as stated by Mueller et al. (2003). Therefore, the idea is to create an automated nowcast model in which a neural network algorithm is used for data fusion, similar to the work performed by Cornman et al. (1998) for detecting and extrapolating weather fronts. At present, applications of neural networks are found in numerous fields of science, such as modelling, time series investigations, and image pattern recognition, owing to their capability to learn from input data (Haykin, 2002). Normally, the stages of a neural network are denoted by a global function (Equation 1), as described by Bishop (2006), for example:

[Equation (1) appears here in the revised manuscript]

where x_i and y_k are the input and output, respectively; the superscripts (1) and (2) and W_ji, W_kj represent the input layer, the hidden layer and the connection weights (which must be determined) between the input and hidden layers and between the hidden and output layers, respectively; D and M are the number of inputs and the number of neurons in the internal layer, respectively; and σ and h are linear and nonlinear transfer functions between the neural network layers, respectively. Thus, the output obtained via Equation 1 depends crucially on the values of the weights, which are worked out from a set of inputs and outputs much as in multiple linear regression; however, instead of minimizing the distance as in nonlinear regression, neural networks attempt to minimize a cost function.

Given that the SIE forecast problem requires a categorical output, it was decided to use probabilistic neural networks, initially proposed by Specht (1990, 1991), which are based on radial-basis functions (RBF). An RBF network consists of three layers: the input layer; the second (or hidden) layer, which applies a nonlinear transformation, denoted h and here a Gaussian function, from the input space to the hidden space; and the third (output) layer, which is linear (σ) and provides the network response. Further details about neural networks and their applications may be found in Pasini et al. (2001), Haykin (2002), Pasero and Moniaci (2004), Bremnes and Michaelide (2005), Bishop (2006), Haupt et al. (2009) and Hsieh (2009).

Figure 3 depicts a general flowchart for the proposed automated nowcasting model. It has four major steps: (1) data processing; (2) definitions of input and output variables; (3) training and testing; and (4) validation. These steps are described below.

3.1 Step 1 - Data processing:
All datasets were sorted chronologically, and their statistical consistency was checked, resulting in 63,320 h of meteorological records. Based on weather conditions reported by METAR, each meteorological record was classified into one of two classes, "0" and "1", representing the nonexistence of important weather conditions (low impact on flight flow) and the existence of significant atmospheric instability (or SIE, as previously defined) for flights in the TA of Rio de Janeiro, respectively. Table 1 shows all weather conditions reported in terms of METAR code and their classification per class.
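To make the probabilistic neural network mentioned earlier in this section more concrete, the following is a minimal sketch of a Parzen-window classifier in the spirit of Specht (1990): a Gaussian pattern layer, a per-class summation layer and a decision layer. It is an illustration only, not the ANM implementation; the function name `pnn_predict`, the smoothing width `sigma` and the toy data are assumptions.

```python
import numpy as np


def pnn_predict(x_train, y_train, x_new, sigma=1.0):
    """Classify rows of x_new with a Gaussian-kernel PNN (illustrative sketch)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train)
    x_new = np.atleast_2d(np.asarray(x_new, dtype=float))
    classes = np.unique(y_train)

    # Pattern layer: one Gaussian unit per training record.
    diff = x_new[:, None, :] - x_train[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Summation layer: mean activation of the units belonging to each class.
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )

    # Decision layer: pick the class with the largest estimated density.
    return classes[scores.argmax(axis=1)]


if __name__ == "__main__":
    # Hypothetical two-feature records: class 0 near the origin, class 1 near (5, 5).
    x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    y = [0, 0, 1, 1]
    print(pnn_predict(x, y, [[0.2, 0.3], [5.1, 5.4]]))
```

In this formulation the only free parameter is `sigma`; every training record acts as a hidden unit, which is why the number of hidden neurons matters for overfitting.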

3.2 Step 2 - Input and output definition:
ANM data fusion is based on a neural network, which must be sequentially trained, tested and subsequently validated to forecast the presence or absence of SIEs. This sequence corresponds to the learning process of a neural network. The input and output variables play an important role in ANM data fusion and must be defined beforehand.

Input variables
These variables are the predictors of the ANM and indicate the atmospheric stages of SIEs in the study area; the ANM uses them during its learning process. A meteorological record is composed of primary variables, extracted from METAR, TEMP, and RR, and derived variables, calculated from the primary ones. The purpose of the ANM is to nowcast SIEs and other weather conditions; therefore, all inputs (or predictors) should thermodynamically represent the presence or absence of SIEs, which are embedded in the meteorological records used to train, test and validate the ANM. The ANM should be able to classify or forecast weather conditions of classes "0" and "1", and its performance is evaluated by cross-validation against observations, as presented later. The criterion for selecting input (primary and derived) variables is based on a conceptual model of how the atmosphere works, particularly during SIE occurrences, which have typical atmospheric patterns. Several input variables are used, for example, atmospheric instability indices: the K-index, K = (T850 - T500) + Td850 - (T700 - Td700), where Tz and Tdz represent the temperature and dew point, respectively, in degrees Celsius, and z is the given atmospheric pressure level in hPa; Total Totals, TT = T850 + Td850 - 2T500; the Lapse Rate, LR = 1000(T500 - T700)/(GPH500 - GPH700), where GPH denotes the geopotential height; and others defined in columns three and four of Table 1. Initially, many inputs were generated. However, for neural network training, it is necessary to adopt a method to prune collinear inputs, which bring no new information and could thus reduce network performance. Pasini and Ameli (2003) investigated heuristic pruning methods. Here, autocorrelation was selected and applied to remove collinearity among the inputs. Twelve variables then remained: eight primary and four derived, listed in columns three and four of Table 1, respectively.
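As an illustration of how the derived instability indices could be computed from sounding values, here is a short sketch. It uses the standard K-index definition (with the 700 hPa dew point), the Total Totals index and the lapse rate formula given in the text. The function names and the sample sounding values are hypothetical, and geopotential heights are assumed to be in metres so that LR is in degrees Celsius per kilometre.

```python
def k_index(t850, td850, t700, td700, t500):
    """K = (T850 - T500) + Td850 - (T700 - Td700), temperatures in deg C."""
    return (t850 - t500) + td850 - (t700 - td700)


def total_totals(t850, td850, t500):
    """TT = T850 + Td850 - 2 * T500, temperatures in deg C."""
    return t850 + td850 - 2.0 * t500


def lapse_rate(t500, t700, gph500, gph700):
    """LR = 1000 * (T500 - T700) / (GPH500 - GPH700); heights assumed in m."""
    return 1000.0 * (t500 - t700) / (gph500 - gph700)


if __name__ == "__main__":
    # Hypothetical sounding, for illustration only.
    sounding = dict(t850=20.0, td850=15.0, t700=8.0, td700=2.0, t500=-10.0)
    print(k_index(**sounding))
    print(total_totals(sounding["t850"], sounding["td850"], sounding["t500"]))
    print(lapse_rate(sounding["t500"], sounding["t700"], 5800.0, 3100.0))
```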

Output variables
The output is defined as the weather conditions reported in METAR codes and divided into two classes, "0" and "1", which represent the absence and presence of SIEs, respectively, as shown in Table 2. In other words, classes 0 and 1 indicate the nonexistence and existence of significant instability (i.e., weather conditions with METAR code T, TL, TRW-, TRW or TRW+) in the TA of Rio de Janeiro, respectively.
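The class assignment described above can be sketched as a simple lookup. The set of class-1 codes is taken from the text; treating every other reported code as class 0, and receiving the weather condition as a single string token, are assumptions made for illustration (the full mapping is in Table 2).

```python
# Class-1 (SIE) weather codes as listed in the text.
SIE_CODES = {"T", "TL", "TRW-", "TRW", "TRW+"}


def classify_record(weather_code):
    """Return 1 if the reported code indicates an SIE, otherwise 0."""
    return 1 if weather_code.strip().upper() in SIE_CODES else 0


if __name__ == "__main__":
    # The last two entries are hypothetical non-SIE reports.
    for code in ["TRW+", "T", "RW", ""]:
        print(repr(code), classify_record(code))
```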
Following Pasini (2015), and to avoid overfitting during the learning process of the neural network (step 3), the meteorological records were divided into three subsets: training, testing and validation. Figure 4 (a) shows the initial training and testing datasets, representing 70% of the original records (44,324), with the remaining 30% (18,996) reserved for validation, as shown in Figure 4 (b).

The number of internal neurons (previously defined as M) of the probabilistic neural network is determined here with the cascade-correlation algorithm suggested by Fahlman and Lebiere (1990). Figure 2 shows an example of a cascade forward network with five inputs and one output. Training and testing are performed in an iterative cycle consisting of a loop of two phases, which are executed using a specific dataset (initially the one in Figure 4 (a), which may be artificially modified until the optimal dataset is reached, as described in step 4) and a constant number of inputs (D = 12). The two phases are as follows:
i) Training starts with a minimal internal layer (only one neuron) of the neural network (represented generally by Equation 1) and automatically adds new hidden neurons one at a time, in each round, finally resulting in a multilayer structure with the input connections frozen (represented by squares in Figure 2).
ii) The resulting neural network is applied to the test dataset, and the error is calculated. There are then two options: first, return to (i) if the test error has not increased from the previous round and the number of neurons in the internal layers is less than 150; or second, go to step 4, which means that the final (possibly optimal) neural network configuration (or ANM) has been obtained.
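The two-phase loop above can be sketched as follows. This shows only the stopping logic, not the cascade-correlation training itself: `test_error_for` is a hypothetical callable that trains a network with a given number of hidden neurons and returns its error on the test dataset. Returning the last configuration before the test error rose is also an assumption, since the text does not say which of the two final networks is kept.

```python
def grow_hidden_layer(test_error_for, max_neurons=150):
    """Add hidden neurons one at a time until the test error rises.

    test_error_for(m) is assumed to train a network with m hidden neurons
    and return its error on the test dataset.
    """
    m = 1  # phase (i) starts with a single hidden neuron
    prev_error = test_error_for(m)
    while m < max_neurons:
        candidate_error = test_error_for(m + 1)  # phase (i): add one neuron
        if candidate_error > prev_error:  # phase (ii): test error increased
            break
        m += 1
        prev_error = candidate_error
    return m, prev_error


if __name__ == "__main__":
    # Synthetic error curve with a minimum at four neurons (illustration only).
    print(grow_hidden_layer(lambda m: (m - 4) ** 2 + 1))
```

The cap of 150 neurons mirrors the limit stated in phase (ii); with a monotonically decreasing error curve the loop stops at that cap.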

Step 4 - Validation:
This step compares the SIE forecasts (output) of the ANM with the true observations, which are assumed to satisfy at least one of two conditions: a) weather conditions (class 1 of Table 2) reported by METAR or SPECI (corresponding to the validation dataset in Figure 4 (b)); and/or b) lightning reported inside a 50-km radius centred at Galeão airport during a 1-h period. The lightning data are included in the validation because the weather conditions reported in METAR or SPECI represent an observation by the meteorologist at an instant of time and depend on the meteorologist's observation skills; therefore, they sometimes do not correctly represent an entire one-hour period, which is the minimum time interval for an ANM forecast. The lightning data, in contrast, are generated continuously during the entire ANM forecast time and beyond the METAR observation. The lightning data also allow the ANM forecast verification to be extended to encompass the entire flight terminal area of Rio de Janeiro. Moreover, it is assumed in this work that the presence of lightning is related to SIEs. Therefore, these two conditions permit a better ANM validation, which is accomplished via a two-dimensional contingency table.

Five categorical statistics are calculated to verify the frequency of correct and incorrect forecast values: 1) proportion correct (PC), which shows the frequency of ANM forecasts that were correct (a perfect score equals one); 2) BIAS, which represents the ratio between the frequency of ANM-estimated events and the frequency of observed events (a perfect score equals one); 3) probability of detection (POD), which represents the fraction of observed events that were correctly forecast (hits) and varies from zero to one, where one indicates a perfect forecast; 4) false-alarm ratio (FAR), which indicates the fraction of ANM-predicted SIEs that did not occur (a perfect score equals zero); and 5) threat score (TS), which indicates how well the ANM forecasts correspond to the observed SIEs (a perfect score equals one). In particular, the TS is relatively sensitive to the climatology of the studied event, tending to produce poorer scores for rare events, such as an SIE. Therefore, the model is considered optimal when it produces SIE nowcasts with scores as near perfect as possible for the five statistics described (Wilks, 2006).
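The five statistics can be computed from the 2x2 contingency table with the standard definitions, where a = hits, b = false alarms, c = misses and d = correct negatives. The sketch below is illustrative; the function name and the sample forecast and observation series are hypothetical, and an observation of 1 is assumed to mean that a class-1 METAR/SPECI report and/or lightning within 50 km occurred in that hour.

```python
def contingency_scores(forecast, observed):
    """Return PC, BIAS, POD, FAR and TS for binary forecasts and observations."""
    pairs = list(zip(forecast, observed))
    a = sum(1 for f, o in pairs if f == 1 and o == 1)  # hits
    b = sum(1 for f, o in pairs if f == 1 and o == 0)  # false alarms
    c = sum(1 for f, o in pairs if f == 0 and o == 1)  # misses
    d = sum(1 for f, o in pairs if f == 0 and o == 0)  # correct negatives

    def ratio(num, den):
        # Undefined scores (zero denominator) are returned as NaN.
        return num / den if den else float("nan")

    return {
        "PC": ratio(a + d, a + b + c + d),
        "BIAS": ratio(a + b, a + c),
        "POD": ratio(a, a + c),
        "FAR": ratio(b, a + b),
        "TS": ratio(a, a + b + c),
    }


if __name__ == "__main__":
    # Hypothetical hourly series, for illustration only.
    forecast = [1, 1, 0, 0, 1, 0]
    observed = [1, 0, 0, 1, 1, 0]
    print(contingency_scores(forecast, observed))
```

Because TS ignores correct negatives, it is not inflated by the many hours without an SIE, which is why it tends to be lower than PC for rare events.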
Finally, if the validation results of the ANM do not indicate satisfactory performance, the normal procedure is to rearrange the representativeness of target class 1 in the training data (i.e., to modify the training/testing dataset), then return to step 3 and repeat step 4 in Figure 3. Otherwise, the optimal model has been reached. The ANM training strategy and results are discussed in the next section.

Authors final comment: Considering the referees' comments and suggestions, the manuscript was revised as in the attached file (PDF).

3 - Neural Network Training and Testing