Atmospheric Measurement Techniques (Atmos. Meas. Tech., AMT), 10, 2009–2019, 2017
ISSN 1867-8548, Copernicus Publications, Göttingen, Germany
DOI: 10.5194/amt-10-2009-2017

Satellite-based high-resolution mapping of rainfall over southern Africa

Hanna Meyer (hanna.meyer@geo.uni-marburg.de), Johannes Drönner, Thomas Nauss (https://orcid.org/0000-0003-3422-0960)

Affiliations:
Environmental Informatics, Faculty of Geography, Philipps-University Marburg, Deutschhausstr. 10, 35037 Marburg, Germany
Database Research Group, Faculty of Mathematics and Informatics, Philipps-University Marburg, Hans-Meerwein-Str. 6, 35032 Marburg, Germany

Correspondence: Hanna Meyer (hanna.meyer@geo.uni-marburg.de)

Published: 6 June 2017 (volume 10, issue 6, pages 2009–2019). Manuscript history dates: 1 February 2017, 6 February 2017, 21 April 2017, 9 May 2017.

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/3.0/
This article is available from https://amt.copernicus.org/articles/10/2009/2017/amt-10-2009-2017.html
The full text article is available as a PDF file from https://amt.copernicus.org/articles/10/2009/2017/amt-10-2009-2017.pdf
Spatially explicit mapping of rainfall is needed in southern
Africa for eco-climatological studies and nowcasting, but accurate estimation
remains challenging. This study presents a method to estimate hourly
rainfall based on data from the Meteosat Second Generation (MSG) Spinning
Enhanced Visible and Infrared Imager (SEVIRI). Rainfall measurements from
about 350 weather stations from 2010–2014 served as ground truth for
calibration and validation. SEVIRI and weather station data were used to
train neural networks that allowed the estimation of rainfall area and
rainfall quantities over all times of the day. The results revealed that 60 % of recorded rainfall events were correctly classified by the model
(probability of detection, POD). However, the false alarm ratio (FAR) was
high (0.80), leading to a Heidke skill score (HSS) of 0.18. Hourly
rainfall quantities were estimated with an average hourly correlation of ρ=0.33 and a root mean square error (RMSE) of 0.72 mm h-1. The correlation increased
with temporal aggregation to 0.52 (daily), 0.67 (weekly) and 0.71 (monthly).
The main weakness was the overestimation of rainfall events. The model
results were compared to the Integrated Multi-satellitE Retrievals for GPM
(IMERG) of the Global Precipitation Measurement (GPM) mission. Despite being
a comparably simple approach, the presented MSG-based rainfall retrieval
outperformed GPM IMERG in terms of rainfall area detection: GPM IMERG
had a considerably lower POD. The HSS was not significantly different
compared to the MSG-based retrieval due to a lower FAR of GPM IMERG. There
were no further significant differences between the MSG-based retrieval and
GPM IMERG in terms of correlation with the observed rainfall quantities. The
MSG-based retrieval, however, provides rainfall at a higher spatial resolution.
Though estimating rainfall from satellite data remains challenging, especially
at high temporal resolutions, this study showed promising results towards
improved spatio-temporal estimates of rainfall over southern Africa.
Introduction
The dynamics of rainfall play an important role in southern
Africa, especially in the arid and semi-arid areas where farming is the main
source of income and the quality of the pastures mainly depends on water availability.
Accurate nowcasting of rainfall at high temporal and
spatial resolutions is therefore of interest for the farmers in southern
Africa and would help them to assess the carrying capacity of their land. It
is of further importance as a baseline product for a variety of environmental
research studies as rainfall is a key variable for many ecological and
hydrological processes.
Rain gauges are still considered the most accurate way to measure
rainfall. Southern Africa features a network of rain gauges operated by the
weather services of the individual countries as well as by a variety of
research projects. However, the network does not feature a sufficient density
to capture spatially highly variable rainfall dynamics. To obtain spatially
explicit data, ground-based radar networks are well established to measure
rainfall in other parts of the world (e.g. RADOLAN in Germany).
A radar network covering the entire region of southern
Africa, however, is currently not available and the existing radar-based
rainfall estimates in South Africa are still subject to many
uncertainties. Satellite-based monitoring of rainfall is
therefore an obvious alternative.
A number of global satellite-derived products have been developed in the last
decades (e.g. TRMM, CMORPH, PERSIANN). Since 2014, the latest
product from the Global Precipitation Measurement (GPM) mission, a
successor of the Tropical Rainfall Measuring Mission (TRMM), has provided the
most recent global estimates of precipitation at high spatial and temporal
resolutions. It might be expected that the GPM products would feature a high
degree of accuracy since the TRMM-3B42 product has been identified as the
most accurate retrieval, at least for eastern Africa.
In addition to global rainfall retrievals, a number of regionally adapted
retrievals were developed in the last decades.
Earlier work presented a
methodology to estimate rainfall from optical Meteosat Second Generation
(MSG) Spinning Enhanced Visible and InfraRed Imager (SEVIRI) data for
Germany. In this approach, machine learning algorithms were used to relate
the spectral properties of MSG to reliable radar data as a ground truth.
Though the retrieval showed promising results, such spatially comprehensive
ground truth data are lacking for southern Africa. An adaptation of the
retrieval technique to southern Africa hence requires a model training that
relies on sparse weather station data as a ground truth.
This study aims to test the suitability of an MSG and artificial neural
network-based rainfall retrieval, which is regionally trained using rain gauge
data to provide spatially explicit estimates of rainfall areas and rainfall
quantities for southern Africa. The suitability of the model is assessed by
validation with independent weather station data and comparison to the
Integrated Multi-satellitE Retrievals for GPM (IMERG) product.
Methods
The methodology comprises the preprocessing of satellite and rain gauge
data, model tuning and training, validation, spatial estimation of rainfall
and the comparison to GPM IMERG (Fig. ).
Flow chart of the methodology applied in this study.
Study area
The area of investigation comprises South Africa, Lesotho and Swaziland,
Namibia, Botswana and Zimbabwe, as well as parts of Mozambique (Fig. ). Average annual rainfall in southern Africa roughly
follows an aridity gradient from the dry west to the more humid east. With
the exceptions of some coastal regions in South Africa, most rain falls
during the summer months. In the coastal areas of South Africa, frontal
systems cause light rain that may last over several days. The majority of
interior areas are dominated by local and short-term convective heavy showers
mostly with thunder in the afternoon or evening hours. Rain from synoptic
systems lasting up to several days also occurs. Snow and hail only contribute
a negligible amount to the overall precipitation totals. The interannual
variability of rainfall is high for the arid areas. For a detailed
description of southern African rainfall characteristics see
and .
Map of the average annual precipitation sums in the study area as
estimated by WorldClim. Points show the locations of the
weather stations that were used as ground truth data in this study. Automatic
rainfall stations (ARS) and automatic weather stations (AWS) are operated by
the South African Weather Service (SAWS). Further stations are operated by
SASSCAL WeatherNet as well as by the IDESSA project.
Data and preprocessing
Station data
Rainfall data for 2010 to 2014 were obtained from the South African Weather
Service (SAWS). The data were recorded at 229 automatic rainfall stations and
91 automatic weather stations (Fig. ). They were
complemented by 22 stations from SASSCAL WeatherNet
(www.sasscalweathernet.org/) located in southern Namibia and Botswana. For
2014, data from an additional 15 stations in South Africa operated by the
IDESSA project (An Integrative Decision Support System for Sustainable
Rangeland Management in southern African Savannas, www.idessa.org/) were
available. The data passed general provider-dependent quality checks before
they were used in this study. For SASSCAL, these checks included filtering data outside common value
ranges and situational checks for consistency with related parameters (e.g.
air humidity). SAWS inspected rainfall values > 10 mm
within 5 min and deleted those values if they were unreliable. Data from all
providers were then included in an on-demand processing database system,
where they were automatically cross-checked for
reliability by filtering out values < 0 and > 500 mm of rainfall per hour.
All station data that provided subhourly information were aggregated to a
temporal resolution of 1 h within the database. Though the station data are
not randomly distributed in the model domain, they cover the entire aridity
gradient, from sites with very low (< 200 mm) precipitation to sites in
areas with highest (∼ 1500 mm) yearly precipitation sums.
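The plausibility filter and the hourly aggregation described above can be sketched as follows. This is a minimal illustration and not the database implementation used in the study: the function name `aggregate_to_hourly`, the tuple layout of the records, the labelling of each hour by its start time and the rule that a single negative reading invalidates the whole hour are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_to_hourly(records, min_mm=0.0, max_mm=500.0):
    """Aggregate subhourly gauge readings to hourly sums and filter them.

    records: iterable of (station_id, timestamp, rain_mm) tuples (assumed layout).
    Returns {(station_id, hour_start): hourly_sum_mm} for plausible hours only.
    """
    sums = defaultdict(float)
    rejected = set()
    for station_id, timestamp, rain_mm in records:
        # Label each reading with the start of the hour it falls in (assumption).
        key = (station_id, timestamp.replace(minute=0, second=0, microsecond=0))
        if rain_mm < min_mm:
            # Assumption: one negative reading makes the whole hour unreliable.
            rejected.add(key)
            continue
        sums[key] += rain_mm
    # Keep only hours without negative readings and with sums up to 500 mm.
    return {
        key: total
        for key, total in sums.items()
        if key not in rejected and total <= max_mm
    }
```

Summing before applying the upper limit matches the limit being stated in millimetres per hour; a different order of operations is equally conceivable.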
Satellite data
MSG SEVIRI scans the full disk every
15 min with a spatial resolution of 3×3 km at subsatellite
point (3.5×3.5 km in southern Africa). Reflected and emitted
radiances are measured by 12 channels, three channels at visible (VIS) and
very near-infrared wavelengths (NIR, between 0.6 and 1.6 µm), eight
channels ranging from near-infrared to thermal infrared wavelengths (IR,
between 3.9 and 14 µm) and one high-resolution VIS channel with a spatial
resolution of 1×1 km, which was not considered in this study.
The rainfall retrieval technique presented here works under the assumption
that VIS, NIR and IR channels of MSG SEVIRI provide proxies for microphysical
cloud properties, which are, in turn, related to rainfall. VIS and NIR
channels have been shown to be related to cloud optical depth
and cloud water path,
where the NIR channel is further related to cloud particle size.
The IR channels have been shown to provide information
about the cloud top temperature, which was used as a proxy for cloud height.
The cloud droplet effective radius as well as the liquid
water path at night was approximated using IR differences.
MSG SEVIRI Level 1.5 data were converted to radiance
values and brightness temperatures
using a processing scheme based on a custom
raster processing extension of the eXtensible and fleXible Java library (see
https://github.com/umr-dbs/xxl), which enables parallel raster processing on
CPUs and GPUs using OpenCL.
Cloud mask
A cloud mask was used to exclude all pixels that were not cloudy in the
respective SEVIRI scenes. For 2010 to 2012, the CM SAF CMa cloud mask product
was applied. Because the CM SAF CMa
cloud mask data set was limited to the years 2004 to 2012 at the time of this study, we
used the cloud mask information of the CLAAS-2 data record
for the years 2013 and 2014. CLAAS-2 is the second edition
of the SEVIRI-based cloud property data record provided by the EUMETSAT
Satellite Application Facility on Climate Monitoring (CM SAF).
All pixels that were
classified as cloud contaminated or cloud filled were interpreted as cloudy.
Pixels that were classified as cloud-free were excluded from further
analysis.
Model strategies for rainfall estimation
General model framework
The modelling methodology follows an earlier study that used the spectral channels of MSG SEVIRI to train a
random forest model able to spatially estimate rainfall areas and
rainfall rates over Germany. Building on that study, subsequent work showed
that neural networks outperform the initially used random forest algorithm.
In these previous studies on the rainfall retrieval, the radar-based RADOLAN
product was used as ground truth to train the model. The
high data quality and spatially explicit information allowed the model to be
optimised without too much confusion caused by uncertainties in the training
data. However, the goal was a retrieval that can be applied to areas
where spatially explicit rainfall data are not available, as is the
case in southern Africa.
Training and test data sets
Cloud masked MSG data from 2010 to 2014 were extracted at the locations of
the weather stations. To match the temporal resolution of all available rain
gauge data, the extracted data were aggregated to hourly values. This was
done by taking the median value of the four scenes available every hour.
The hourly values for a station were only used for further analysis
if all four scenes of that hour were masked as cloudy. The
extracted and aggregated MSG data were then matched with the corresponding
rain gauge information under consideration of the time shift between MSG data
(UTC) and rain gauge data (UTC+2).
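The hourly matching rule can be illustrated with a short sketch. It assumes that the four 15 min scene values of one channel and their cloud flags are already extracted for one station and one hour; the function names `hourly_predictor` and `gauge_time_to_utc` are hypothetical and not taken from the study.

```python
from datetime import datetime, timedelta
from statistics import median


def hourly_predictor(scene_values, cloudy_flags):
    """Median of the four 15 min scene values of one hour.

    Returns None unless all four scenes are flagged as cloudy, so that
    partly cloud-free hours are excluded from further analysis.
    """
    if len(scene_values) != 4 or len(cloudy_flags) != 4:
        return None
    if not all(cloudy_flags):
        return None
    return median(scene_values)


def gauge_time_to_utc(local_time, utc_offset_hours=2):
    """Shift a rain gauge timestamp recorded in UTC+2 to UTC."""
    return local_time - timedelta(hours=utc_offset_hours)
```

With four values the median is the mean of the two central values, which makes the hourly predictor robust against a single outlying scene.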
The spectral channels, the channel differences ΔT6.2–10.8, ΔT7.3–12.1, ΔT8.7–10.8, ΔT10.8–12.1,
ΔT3.9–7.3 and ΔT3.9–10.8, and the sun zenith angle were used as
predictor variables during the daytime, in accordance with
previous studies on MSG-based delineation of cloud properties (see
Sect. ). Thus, the predictor variables contain the SEVIRI
channels as well as channel combinations. Although this partially duplicates
information, the channel combinations can highlight patterns that might
not be apparent in the individual channels. As additional potential
predictors, different cloud texture parameters were tested in earlier work, which
showed that the chosen spectral channels and differences are sufficient
as predictors.
Since neural networks require that the predictor variables are standardised,
all predictors were centred and scaled by dividing the values of the
mean-centred variables by their standard deviations. Since the VIS and NIR
channels of MSG are not available at night-time, the data set was split
into a daytime data set (data points with a solar zenith angle < 70°) and a night-time data set (data points with a solar zenith angle
> 70°), which were considered in separate models. Though two
different models might lead to rough transitions between daytime and
night-time estimates, accurate estimates were the priority of this study,
leading to the decision to use separate models according to data availability.
The response variables (rainfall yes/no and rainfall quantities) were taken
from the rain gauge measurements.
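The standardisation and the day/night split can be sketched as below. The sketch assumes the sample standard deviation (divisor n - 1) and a per-sample dictionary with a `zenith` entry in degrees; both are illustrative choices. The text does not state how a solar zenith angle of exactly 70° is handled, so assigning it to the night-time set here is an assumption.

```python
from statistics import mean, stdev


def standardise(values):
    """Centre a predictor on its mean and scale it by its standard deviation.

    Returns the scaled values together with the centre and scale, so the same
    transformation can be re-applied to new data.
    """
    centre = mean(values)
    scale = stdev(values)  # sample standard deviation (assumption: n - 1)
    return [(v - centre) / scale for v in values], centre, scale


def split_day_night(samples, threshold_deg=70.0):
    """Split samples into daytime (< 70 deg) and night-time sets."""
    day = [s for s in samples if s["zenith"] < threshold_deg]
    # Assumption: exactly 70 deg is treated as night-time.
    night = [s for s in samples if s["zenith"] >= threshold_deg]
    return day, night
```

Returning the centre and scale matters in practice because validation and prediction data have to be transformed with the statistics of the training data.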
The years 2010 to 2012 were used for model training. The year 2013 was used
for validation. The retrieval process consisted of two steps: (i) the
identification of precipitating cloud areas and (ii) the assignment of
rainfall quantities. All 2010 to 2012 data from the rain gauges that were
masked as cloudy by the cloud mask products were used for training the
rainfall area model. All recorded rainfall events were used for training the
rainfall quantities model. The resulting training data set comprised 917 774
(daytime) and 1 409 072 (night-time) samples for the rainfall area training and
69 703 (daytime) and 129 325 (night-time) samples for training of rainfall
quantities from 26 243 individual MSG scenes.
Tuning and model training
A single-hidden-layer feed-forward neural network was applied as a machine
learning algorithm. The spectral channels of MSG SEVIRI as well as the
channel differences served as input nodes (predictor variables). The neural
network was then applied to learn the relations between this spectral
information and rainfall areas or rainfall quantities, respectively. In this
context, a sophisticated preselection of input variables is not required, as
the network is able to deal with correlated and even uninformative predictors
unless their number is very high, which was not the case
in this study. For the technical realisation, all steps of the model training
were performed using the R environment for statistical computing.
The neural network implementation from the “nnet” package
in R was used in conjunction with the
“caret” package, which provides enhanced functionalities for
model training, estimation and validation.
Neural networks require two hyperparameters to be tuned to avoid under- or
overfitting of the data: the number of neurons in the hidden layer, as well
as the weight decay. The neurons in the hidden layer represent non-linear
combinations of the input data and their number influences the performance of
the model. Weight decay penalises large weights and
controls the generalisation of the outcome. The number of
neurons as well as the weight decay were tuned using a stratified 10-fold
cross-validation. For this purpose, the training samples were randomly partitioned into
10 equally sized folds with respect to the distribution of the response
variable (i.e. raining cloud pixels, rainfall rate). Every fold is therefore a
subset (1/10) of the training samples and has the same distribution of the
response variable as the total set of training samples. Models were then
fitted by repeatedly leaving out one of the folds. The performance of a model
was then determined by making predictions on the held back fold. The performance
metrics from the held back iterations were averaged to the overall model
performance for the respective set of tuning values. For the rainfall area
classification models, the distance to a “perfect” model, based on receiver
operating characteristics (ROC) analysis, was used as the decisive performance metric.
For the rainfall quantity regression models, the root mean square error
(RMSE) was used. The number of hidden units was tuned for each value between
two and the number of predictor variables. Weight decay was tuned between 0
and 0.1 with increments of 0.02. To train rainfall
areas, the threshold that separates rainy from non-rainy clouds according to
the estimated probabilities was used as an additional tuning parameter. The optimal
threshold was expected to be considerably smaller than 0.5 since the number
of non-rainy samples was higher than the number of rainy samples. Therefore,
the range of tested thresholds was 0 to 0.1 with increments of 0.01, and 0.4
to 1 with increments of 0.1. See for further details of the
threshold tuning methodology.
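The tuning grid and the threshold selection can be sketched as follows. The study used R with the “nnet” and “caret” packages; this Python sketch only illustrates the logic. It assumes that the distance to a “perfect” model is the Euclidean distance to the ROC point with sensitivity 1 and specificity 1, and that a pixel is classified as rainy when its estimated probability is greater than or equal to the threshold. All function names are hypothetical.

```python
from math import sqrt


def tuning_grid(n_predictors):
    """Candidate values for hidden units, weight decay and probability threshold."""
    sizes = list(range(2, n_predictors + 1))
    decays = [round(0.02 * i, 2) for i in range(6)]             # 0, 0.02, ..., 0.1
    thresholds = [round(0.01 * i, 2) for i in range(11)]        # 0, 0.01, ..., 0.1
    thresholds += [round(0.4 + 0.1 * i, 1) for i in range(7)]   # 0.4, 0.5, ..., 1
    return sizes, decays, thresholds


def roc_distance(tp, fp, fn, tn):
    """Distance to the ROC point (sensitivity = 1, specificity = 1); assumed metric."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt((1 - sensitivity) ** 2 + (1 - specificity) ** 2)


def confusion(predicted, observed):
    """Count true/false positives and negatives for boolean sequences."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    return tp, fp, fn, tn


def best_threshold(probabilities, observed, thresholds):
    """Pick the threshold with the smallest ROC distance (first one on ties).

    Requires at least one rainy and one non-rainy observation.
    """
    def distance(threshold):
        predicted = [p >= threshold for p in probabilities]
        return roc_distance(*confusion(predicted, observed))

    return min(thresholds, key=distance)
```

In a full tuning run, the threshold search would be repeated inside every cross-validation fold for each combination of hidden units and weight decay.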
The optimal values for the hyperparameters that were revealed in the tuning
study (Table ) were adopted for the final model fitting. In
this step, the model is fit to all training data using the optimal
hyperparameters.
Optimal hyperparameters for the individual models revealed during
the tuning study and applied in the final model fitting.
Model | Number of neurons | Weight decay | Threshold
Rainfall areas at daytime | 5 | 0.05 | 0.07
Rainfall areas at night-time | 5 | 0.07 | 0.01
Rainfall quantities at daytime | 5 | 0.05 | -
Rainfall quantities at night-time | 5 | 0.05 | -

Spatial estimations of rainfall
Final models were applied to all hourly MSG SEVIRI scenes from 2010–2014 for
the southern African extent to obtain spatio-temporal estimates of rainfall.
For this purpose, the cloudy areas of a scene were first classified into rainy or
non-rainy using the respective model. The rainfall quantities were then
estimated for the estimated rainfall areas. To ensure consistency within one
scene, the choice of the model being applied (either the daytime or night-time
model) was made according to the mean solar zenith angle of the respective
scene. If the mean solar zenith angle was < 70°, rainfall for the
entire scene was estimated using the daytime model. For scenes with a mean
solar zenith angle > 70°, the night-time model was applied.
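The two-step estimation for a single scene can be sketched as follows. The trained models are represented by hypothetical callables (`area_model` returns a rain probability, `quantity_model` returns a rainfall quantity), and the pixel dictionaries are an assumed data layout. Averaging the solar zenith angle over all pixels of the scene and treating a mean of exactly 70° as night-time are also assumptions.

```python
def estimate_scene(pixels, models, threshold_deg=70.0):
    """Apply the rainfall area model, then the quantity model, to one scene.

    pixels: list of dicts with keys 'cloudy', 'zenith' and 'predictors'.
    models: {'day': (area_model, quantity_model, prob_threshold),
             'night': (area_model, quantity_model, prob_threshold)}.
    Returns the chosen period and one estimate per pixel
    (None = cloud-free, 0.0 = cloudy but classified as non-rainy).
    """
    mean_zenith = sum(p["zenith"] for p in pixels) / len(pixels)
    # One model per scene keeps the estimates consistent within the scene.
    period = "day" if mean_zenith < threshold_deg else "night"
    area_model, quantity_model, prob_threshold = models[period]

    estimates = []
    for pixel in pixels:
        if not pixel["cloudy"]:
            estimates.append(None)
        elif area_model(pixel["predictors"]) >= prob_threshold:
            estimates.append(quantity_model(pixel["predictors"]))
        else:
            estimates.append(0.0)
    return period, estimates
```

The quantity model is only evaluated for pixels that the area model classified as rainy, which mirrors the order of the two steps in the text.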
Validation
Model estimates and weather station records from the entire year 2013 were
used as independent data for model validation. For the validation of
estimated rainfall areas, all pixels at the location of the weather stations
that were classified as cloudy by the cloud mask product were considered.
For this purpose, the information from the weather stations on whether it was
raining or not was compared to the model estimate for the respective MSG
pixel. The validation data contained 403 211 samples during the daytime and 565 415
samples at night-time. Average hourly probability of detection (POD),
probability of false detection (POFD), false alarm ratio (FAR) and Heidke
skill score (HSS) were calculated as validation metrics. The POD gives the
percentage of rain pixels that the model correctly identified as rain (Tables , ). POFD gives the proportion of
non-rain pixels that the model incorrectly classified as rain. The FAR gives
the proportion of estimated rain where no rain is observed. The HSS also
accounts for chance agreement and gives the proportion of correct
classifications (both rain pixels and non-rain pixels) after eliminating
expected chance agreement.
Confusion matrix as a baseline for the calculation of the verification
scores used for the validation of the rainfall area estimates.
Categorical metrics for validation of rainfall area estimates.
Metric | Formula | Range | Optimal value
Probability of detection | POD = TP / (TP + FN) | 0–1 | 1
Probability of false detection | POFD = FP / (FP + TN) | 0–1 | 0
False alarm ratio | FAR = FP / (TP + FP) | 0–1 | 0
Heidke skill score | HSS = (TP × TN − FP × FN) / {[(TP + FN) × (FN + TN) + (TP + FP) × (FP + TN)] / 2} | −∞–1 | 1
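The categorical metrics of the table above translate directly into code. The sketch below takes the four confusion matrix counts (TP, FP, FN, TN) and returns the scores; the function name and the choice to return NaN when a denominator is zero (for example in an hour without any observed rain) are assumptions for illustration.

```python
def _ratio(numerator, denominator):
    """Divide, returning NaN when the score is undefined."""
    return numerator / denominator if denominator else float("nan")


def categorical_scores(tp, fp, fn, tn):
    """POD, POFD, FAR and HSS from confusion matrix counts."""
    expected = ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)) / 2
    return {
        "POD": _ratio(tp, tp + fn),    # share of observed rain that was detected
        "POFD": _ratio(fp, fp + tn),   # share of observed non-rain flagged as rain
        "FAR": _ratio(fp, tp + fp),    # share of estimated rain that was not observed
        "HSS": _ratio(tp * tn - fp * fn, expected),
    }
```

Because rain is rare, a low POFD and a high FAR can occur together: a small fraction of the many non-rainy pixels can still outnumber the correctly detected rainy pixels.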
To evaluate the ability of the model to estimate rainfall quantities, the
correlation between the measured and the estimated hourly rainfall was
calculated using Spearman's rank correlation (rho) to account for the
non-normal distribution of the data. RMSE was also calculated. All cloudy
data points (including non-rainy data points) were used for the validation of
rainfall quantities. The rainfall quantities were further aggregated to
daily, weekly and monthly rainfall sums to assess the performance of the
model on different temporal scales.
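The two quantity metrics can be sketched without external libraries. Spearman's rho is computed here as the Pearson correlation of average ranks, which handles the many tied values (for example zero rainfall) that occur in hourly data; the function names are hypothetical and the sketch is not the R code used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(observed, estimated):
    """Spearman's rank correlation between observed and estimated rainfall."""
    return pearson(average_ranks(observed), average_ranks(estimated))


def rmse(observed, estimated):
    """Root mean square error in the unit of the input (e.g. mm per hour)."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return sqrt(sum(squared) / len(squared))
```

For the aggregated scales, the same functions would be applied to daily, weekly or monthly sums instead of hourly values.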
Comparison to GPM
The results of the presented rainfall retrieval were compared to the rainfall
estimates of the GPM mission. GPM, as a successor of TRMM, consists of an international network of satellites
designed for worldwide high-resolution precipitation estimates.
GPM provides data from March 2014
onwards. The GPM IMERG product estimates rainfall by combining all available
passive-microwave estimates as well as microwave-calibrated infrared
satellite estimates and data from rain gauges. GPM IMERG is available with
latencies of 6 h, 18 h and 4 months.
In this study, the 4-month latency (final) product with 30 min temporal
and 0.1° spatial resolution (∼ 10 km × 10 km) was used.
Due to the different data availability of GPM IMERG, MSG
as well as weather station data, the comparison was conducted for the
overlapping time period from late March 2014 to August 2014. GPM was aggregated
from 30 min to 1 h to match the temporal resolution of the MSG-based
estimates. Both products were validated using the weather station data as a
reference. The performance metrics were compared between the MSG product and
the GPM product on an hourly basis.
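The text does not state how the two half-hourly IMERG fields of an hour were combined. One plausible reading, assuming that the half-hourly values are precipitation rates in mm h-1, is that each rate applies for half an hour, so the hourly sum is the mean of the two rates. The function below is a sketch under that assumption only.

```python
def half_hourly_rates_to_hourly_sum(rates_mm_per_h):
    """Hourly rainfall sum (mm) from the two half-hourly rates (mm per hour).

    Assumption: each rate is valid for 0.5 h, so each contributes rate * 0.5.
    """
    if len(rates_mm_per_h) != 2:
        raise ValueError("expected exactly two half-hourly values per hour")
    return sum(rate * 0.5 for rate in rates_mm_per_h)
```

If the half-hourly values were accumulations in millimetres instead, the two values would simply be added.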
Results
Model performance
On average, 60 % of the rainfall observations were correctly identified as
rainy by the model, with a high number of scenes having much higher PODs (Fig. ). The POFD was low (18 % on average) but the estimates
featured a high FAR of 0.80. The average HSS per scene was 0.18. The POD was
highest for high measured rainfall quantities and decreased for lower
rainfall quantities (Fig. ). FAR was highest
for low predicted rainfall quantities and decreased for higher predicted
quantities.
The average hourly RMSE was 0.72 mm h-1 (Fig. ).
In particular, data points with low or medium measured rainfall could be
estimated with low RMSE (Fig. ). The RMSE was
higher for high measured rainfall. The correlation indicated by Spearman's rho
had an hourly average of 0.33. The performance of modelled rainfall quantities
increased with the aggregation level (Fig. ). The average
correlation increased from ρ=0.33 (hourly) to 0.52 on a daily, 0.67 on a
weekly and 0.71 on a monthly basis. An overestimation of rainfall was
observed, especially when aggregated to monthly totals. An example of temporally
aggregated rainfall estimates for 2013 is shown in Fig. .
Validation of estimated rainfall areas for 2013 on an hourly basis.
Each of the data points is the average performance of 1 h. The data are
visualised as a “vioplot”, in which a box plot is complemented by the kernel density
of the data, shown as grey areas at the sides of the box plot.
Comparison of POD for different hourly measured rainfall quantities
as well as FAR for different predicted rainfall quantities. RMSE was compared
for different measured rainfall quantities. All data points from 2013 were
used for the calculation of the statistics. Thresholds for the three rainfall
classes were set according to the first and third quartiles of the measured
hourly rainfall quantities.
Validation of estimated rainfall quantities for 2013 on an hourly
basis. Each of the data points is the average performance of 1 h. See
Fig. for further information on the figure style.
Validation of estimated rainfall quantities for 2013 at (a) hourly
resolution and on the different aggregation levels: (b) daily, (c) weekly,
(d) monthly. Each of the data points represents a station at the respective level of
temporal aggregation. Rho represents the average correlation for each time
step of the respective aggregation level. For an easy visual interpretation,
the data are presented via hexagon binning, in which the number of data points
falling in each hexagon is depicted by colour.
Monthly precipitation sums in millimetres from the year 2013 as estimated by
this study.
Comparison to GPM
For the period March–August 2014, the MSG-based rainfall retrieval
showed a higher POD (0.57) than GPM IMERG (0.28), which
considerably underestimated rainfall events (Fig. ). In
contrast, GPM IMERG had a lower FAR (0.70) than the MSG-based model (0.81).
However, the FAR was high for both retrievals. The average HSS was the same
for both retrievals (0.17), but the median HSS for GPM IMERG was 0, which was
considerably lower than that of the MSG-based retrieval (0.10). Concerning the
rainfall quantities, neither the correlation to measured rainfall nor the
RMSE showed significant differences between the retrievals (Fig. ). The average rho was 0.36 for the MSG-based retrieval and
0.34 for GPM IMERG. The average RMSE was 0.88 for the MSG-based retrieval and
0.85 for GPM IMERG.
Figure gives an example of the differences between the MSG-based retrieval and GPM IMERG for 24 April 2014 12:00 UTC when severe floods
occurred in the Eastern Cape province of South Africa. The colour composite
of the corresponding MSG scene shows that clouds had a high optical depth in
this area. The pattern is reflected in the estimates of the MSG-based
retrieval that estimated rainfall for the areas with high values of optical
depth. This was partly confirmed by the weather station data. However,
rainfall was also estimated for areas where weather stations did not record
any rainfall. In contrast, GPM IMERG showed an underestimation of rainfall
areas, but still captured the high rainfall quantities that were recorded by
the weather stations. The summary statistics for this hour are a POD of 0.75
for the MSG-based retrieval and 0.19 for GPM IMERG. FAR was 0.65 and HSS 0.34
for the MSG-based retrieval compared to a FAR of 0.89 and a HSS of 0.08 for
GPM IMERG. The correlation between estimated and observed rainfall was 0.39
for the MSG-based retrieval and -0.06 for GPM IMERG.
Comparison of the performance of the MSG-based retrieval and GPM
IMERG for rainfall area delineation between March and August 2014. Each of
the data points is the average performance of 1 h. See Fig. for further
information on the figure style.
Comparison of the performance of the MSG-based retrieval and GPM
IMERG for hourly rainfall quantities between March and August 2014. Each of
the data points is the average performance of 1 h. See Fig. for further
information on the figure style.
Sample satellite scene from 24 April 2014 10:00 UTC represented as a
VIS0.8-IR3.9-IR10.8 false-colour composite
in which cloud optical depth is indicated by red colouration, cloud particle
sizes and phases by green and the brightness temperature by blue.
The rainfall estimates for this scene (estimated using the daytime model) are
shown as well as the corresponding GPM IMERG product. Observed rainfall is
depicted where weather station data were available. For visualisation
purposes, the spatial extent of the stations was increased. White background
in the colour composite as well as in the MSG-based retrieval and the GPM
IMERG product represents no data due to missing clouds. In addition, the white
background in the representation of the observed rainfall is due to the
absence of weather stations.
Discussion
The presented monthly maps reflect the general spatial and temporal rainfall
patterns of southern Africa. They also reflect
the annual characteristics of the year 2013, for example the heavy rainfall
events over southern Mozambique and the Limpopo River basin in mid-January.
The validation of the rainfall retrievals showed promising results but also
highlights the difficulties of optical satellite-based rainfall estimates.
The strength of the retrieval in terms of rainfall areas classification was a
high POD for heavy rainfall events. The rainfall quantities for the heavy
rainfall events were, however, underestimated in most cases. The major
problem of the model was the overestimation of rainfall events leading to an
overestimation of rainfall quantities. However, false alarms in the retrieval
were generally predicted with low rainfall quantities. In this context, it is
of note that in view of the scene-based validation strategy, FAR can easily
increase in dry conditions when there are just a few false alarms in the
estimates and no rainfall was observed by any station. However, the FAR was
still high for hours with a considerable number of rainfall events. This
might be partly explainable by spatial displacement due to parallax shifts.
Though the shift is generally below 1 pixel in this region, even minor shifts
can affect model training as well as the estimates. For future enhancement of
the rainfall retrieval, a correction of the parallax shift
would be appropriate. Differences in spatial and temporal
scale are also an important issue, especially since a majority of rainfall
events in southern Africa are of small spatial and temporal extent. The
aggregation to an hour, as well as the assumption that the weather station
observation is representative for the entire pixel, is also problematic,
though essential. The issue of scale especially affects the
coarser-resolution GPM IMERG data, where a several-kilometre-sized pixel is validated by a
single point measurement. Besides the issue of scale and spatial
displacement, the retrieval technique depends on the quality of the rain
gauge observations. Although the data were quality checked, common problems
associated with rain gauge measurements, e.g. wind drift or evaporation,
remain; they lead to errors in the ground truth data and affect model training and
validation. Also, due to different installation
dates of the individual weather stations as well as the natural challenge of
maintaining weather stations in remote areas, no gapless data set could be
compiled. Therefore, different sensor and data-provider-dependent calibration
techniques and gaps in the time series of the data, as well as the general
problems associated with rain gauge measurements, might lead to
inconsistencies and uncertainties. However, no reliable alternatives are
available and rain gauge measurements are still considered the most reliable
source of rainfall data.
The retrieval technique relied on the cloud mask for an initial selection of
relevant data points used for model training, validation and the final
spatio-temporal estimates. Therefore, it cannot be excluded that some data
points were falsely excluded from the analysis as they were falsely masked as
being not cloudy but rainfall was measured on the ground. However, we assume
that rainy clouds are easy to capture by common cloud masking algorithms and
that the resulting bias is therefore comparably small.
Despite the errors and uncertainties associated with the presented rainfall
retrieval, the combination of MSG data and neural networks is a promising
approach. The model presented in this study outperformed the GPM IMERG
product in terms of rainfall area detection. GPM IMERG considerably
underestimated rainfall events. This behaviour is partly explainable by scale
because GPM IMERG has a coarser resolution of 0.1°. This makes local
processes difficult to capture, which is a disadvantage considering that
small-scale convective showers in particular contribute to rainfall sums in
southern Africa. In terms of rainfall quantities, GPM IMERG
and the presented retrieval did not show significant differences in
correlation. The sample spatial comparison showed that GPM IMERG produces more
differentiated rainfall estimates, while the MSG-based retrieval tends to
estimate the mean distribution.
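Rainfall area detection is scored in this study with categorical measures such as POD, FAR and HSS. As a minimal sketch, the standard textbook definitions of these scores can be computed from the four cells of a 2×2 contingency table as shown below. This is not the code used in the study; the function name and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Return POD, FAR and HSS from the cells of a 2x2 contingency table."""
    n = hits + false_alarms + misses + correct_negatives
    observed_yes = hits + misses          # hours with observed rainfall
    predicted_yes = hits + false_alarms   # hours with estimated rainfall
    if n == 0 or observed_yes == 0 or predicted_yes == 0:
        raise ValueError("scores are undefined for this contingency table")

    pod = hits / observed_yes             # probability of detection
    far = false_alarms / predicted_yes    # false alarm ratio

    # Number of correct estimates expected by chance
    expected_correct = (
        observed_yes * predicted_yes
        + (n - observed_yes) * (n - predicted_yes)
    ) / n
    if n == expected_correct:
        raise ValueError("HSS is undefined for this contingency table")
    hss = (hits + correct_negatives - expected_correct) / (n - expected_correct)

    return {"POD": pod, "FAR": far, "HSS": hss}


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study
    scores = categorical_scores(hits=30, false_alarms=10, misses=20, correct_negatives=40)
    for name, value in scores.items():
        print(f"{name} = {value:.2f}")
    # prints: POD = 0.60, FAR = 0.25, HSS = 0.40
```

Because FAR only considers the pixels estimated as rainy, it can be high even when POD is reasonable, which is the pattern described for the MSG-based retrieval.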
The presented MSG-based retrieval is an easy-to-use method and allows for
time series at a relatively high spatial resolution. Beyond the promising
results in comparison with GPM IMERG, the daily estimates of the MSG-based
retrieval are at least comparable to other products incorporated in the IPWG
validation study. A detailed comparison cannot currently be given
since the validation data and strategy were not identical. The authors intend
to incorporate the presented retrieval scheme into the IPWG validation study
for future assessment.
Conclusions
The rainfall retrieval technique developed in this study provides hourly
rainfall estimates at high spatial resolution based on the spectral
properties of MSG SEVIRI data and neural networks. The retrieval showed
promising results in terms of rainfall area detection and estimation of
rainfall quantities. However, the results also showed that the estimation of
rainfall remains challenging. The main weakness of the presented retrieval
was the overestimation of rainfall occurrence. Nevertheless, the retrieval could
compete with the GPM IMERG product in terms of rainfall quantity and performed
better in rainfall area detection.
High-resolution spatial data sets of rainfall are in demand in a variety of
research disciplines. The developed MSG-based rainfall retrieval is able to
deliver time series from the launch of MSG SEVIRI onward, and an
operational implementation for near-real-time rainfall estimates is intended. It can
therefore serve as a valuable data source wherever high-resolution rainfall data for
southern Africa are needed. For example, the rainfall estimates will serve as an important
parameter within the IDESSA (An Integrative Decision Support System for
Sustainable Rangeland Management in Southern African Savannas) project, which
aims to implement an integrative monitoring and decision support system for
the sustainable management of different savanna types. The hourly and
aggregated rainfall quantity estimations are available from the authors on
request.
The data developed in this study are available from the authors on request.
H. Meyer and T. Nauss designed the study. J. Drönner preprocessed the satellite data. H. Meyer
developed the model code, performed the data analysis and prepared the manuscript with contributions from both co-authors.
The authors declare that they have no conflict of
interest.
Acknowledgements
This work was financially supported by the Federal Ministry of Education and
Research (BMBF) within the IDESSA project (grant no. 01LL1301) which is part
of the SPACES programme (Science Partnership for the Assessment of Complex
Earth System processes). We are grateful to the South African Weather Service
for providing us with their rainfall data for South Africa and to SASSCAL
WeatherNet for rainfall data from Namibia and Botswana. The cloud masking was
done by using Level-2 data of the CLAAS-2 data record provided by the
EUMETSAT Satellite Application Facility on Climate Monitoring (CM SAF). The
GPM IMERG V3 data were provided by the NASA/Goddard Space Flight Center's
Mesoscale Atmospheric Processes Laboratory and Precipitation Processing
System (PPS), which develop and compute the GPM IMERG V3 as a contribution to
project GPM, and archived at the NASA GES DISC.
Edited by: G. Vulpiani
Reviewed by: two anonymous referees
References
Aminou, D. M. A., Jacquet, B., and Pasternak, F.: Characteristics of the
Meteosat Second Generation (MSG) radiometer/imager: SEVIRI, in:
Proceedings of SPIE: Sensors, Systems, and Next-Generation Satellites, 3221,
19–31, 1997.
Bartels, H., Weigl, E., Reich, T., Lang, P., Wagner, A., Kohler, O., Gerlach,
N., and MeteoSolutions GmbH: Projekt RADOLAN – Routineverfahren zur
Online-Aneichung der Radarniederschlagsdaten mit Hilfe von automatischen
Bodenniederschlagsstationen (Ombrometer), Deutscher Wetterdienst, Offenbach,
2004.
Benas, N., Finkensieper, S., Stengel, M., van Zadelhoff, G.-J., Hanschmann, T., Hollmann, R., and Meirink, J. F.:
The MSG-SEVIRI based cloud property data record CLAAS-2, Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2017-9, in review,
2017.
Cattani, E., Merino, A., and Levizzani, V.: Evaluation of Monthly
Satellite-Derived Precipitation Products over East Africa, J.
Hydrometeorol., 17, 2555–2573, 2016.
EUMETSAT: High Rate SEVIRI Level 1.5 Image Data – MSG – 0 degree,
http://navigator.eumetsat.int/discovery/Start/DirectSearch/DetailResult.do?f(r0)=EO:EUM:DAT:MSG:HRSEVIRI
(last access: 13 July 2015),
2010.
EUMETSAT: The Conversion from Effective Radiances to Equivalent
Brightness Temperatures, 2012a.
EUMETSAT: Conversion from radiances to reflectances for SEVIRI warm
channels, 2012b.
Feidas, H. and Giannakos, A.: Classifying convective and stratiform rain
using multispectral infrared Meteosat Second Generation satellite data,
Theor. Appl. Climatol., 108, 613–630, 2012.
Finkensieper, S., Meirink, J.-F., van Zadelhoff, G.-J., Hanschmann, T., Benas,
N., Stengel, M., Fuchs, P., Hollmann, R., and Werscheck, M.: CLAAS-2: CM
SAF CLoud property dAtAset using SEVIRI – Edn. 2, Tech. rep.,
Satellite Application Facility on Climate Monitoring, 2016.
Fynn, R. and O'Connor, T.: Effect of stocking rate and rainfall on rangeland
dynamics and cattle performance in a semi-arid savanna, South Africa, J.
Appl. Ecol., 37, 491–507, 2000.
Giannakos, A. and Feidas, H.: Classification of convective and stratiform
rain based on the spectral and textural features of Meteosat Second
Generation infrared data, Theor. Appl. Climatol., 113, 495–510, 2013.
Hamann, U., Walther, A., Baum, B., Bennartz, R., Bugliaro, L., Derrien, M., Francis, P. N., Heidinger, A., Joro, S., Kniffka, A.,
Le Gléau, H., Lockhoff, M., Lutz, H.-J., Meirink, J. F., Minnis, P., Palikonda, R., Roebeling, R., Thoss, A., Platnick, S.,
Watts, P., and Wind, G.: Remote sensing of cloud top pressure/height from SEVIRI: analysis of ten current retrieval algorithms,
Atmos. Meas. Tech., 7, 2839–2867, https://doi.org/10.5194/amt-7-2839-2014, 2014.
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G., and Jarvis, A.:
Very high resolution interpolated climate surfaces for global land areas,
Int. J. Climatol., 25, 1965–1978, 2005.
Hou, A. Y., Kakar, R. K., Neeck, S., Azarbarzin, A., Kummerow, C. D., Kojima,
M., Oki, R., Nakamura, K., and Iguchi, T.: The Global Precipitation
Measurement Mission, B. Am. Meteorol. Soc., 95, 701–722, 2014.
Huffman, G., Bolvin, D., Braithwaite, D., Hsu, K., Joyce, R., and Xie, P.:
GPM L3 IMERG Late Half Hourly 0.1 degree x 0.1 degree
Precipitation V03, Greenbelt, MD, Goddard Earth Sciences Data
and Information Services Center (GES DISC), https://doi.org/10.5067/GPM/IMERG/HH/3B,
2014.
IPWG: IPWG South African Validation,
http://rsmc.weathersa.co.za/IPWG/ipwgsa_qlooks.html
(last access: 1 February 2017), 2016.
Kaptué, A. T., Hanan, N. P., Prihodko, L., and Ramirez, J. A.: Spatial and
temporal characteristics of rainfall in Africa: Summary statistics for
temporal downscaling, Water Resour. Res., 51, 2668–2679, 2015.
Kidd, C. and Huffman, G.: Global precipitation measurement, Meteorol. Appl.,
18, 334–353, 2011.
Kidd, C., Bauer, P., Turk, J., Huffman, G. J., Joyce, R., Hsu, K.-L., and
Braithwaite, D.: Intercomparison of High-Resolution Precipitation
Products over Northwest Europe, J. Hydrometeor., 13, 67–83, 2011.
Kniffka, A., Stengel, M., and Hollmann, R.: SEVIRI cloud mask dataset –
Edition 1 – 15 minutes resolution, Satellite Application Facility on
Climate Monitoring, EUMETSAT Satellite Application Facility on Climate
Monitoring (CM SAF), 2014.
Krogh, A. and Hertz, J. A.: A Simple Weight Decay Can Improve
Generalization, in: Advances in Neural Information Processing Systems 4,
Morgan Kaufmann, 950–957, 1992.
Kruger, A. C. (Ed.): Climate of South Africa, Precipitation, Report No. WS47,
South African Weather Service, Pretoria, South Africa, 2007.
Kuhn, M.: caret: Classification and Regression Training,
https://CRAN.R-project.org/package=caret
(last access: 1 February 2017), R package version
6.0-68, 2016.
Kuhn, M. and Johnson, K.: Applied Predictive Modeling, chap. 7.1 Neural
Networks, Springer, New York, 1st Edn., 141–145, 2013.
Kühnlein, M., Appelhans, T., Thies, B., and Nauss, T.: Precipitation
Estimates from MSG SEVIRI Daytime, Nighttime, and Twilight Data
with Random Forests, J. Appl. Meteor. Climatol., 53, 2457–2480,
2014a.
Kühnlein, M., Appelhans, T., Thies, B., and Nauss, T.: Improving the
accuracy of rainfall rates from optical satellite sensors with machine
learning – A random forests-based approach applied to MSG SEVIRI,
Remote Sens. Environ., 141, 129–143, 2014b.
Levizzani, V., Amorati, R., and Meneguzzo, F.: A Review of
Satellite-based Rainfall Estimation Methods, Tech. rep., European
Commission Project MUSIC Report (EVK1-CT-2000-00058), 2002.
Manhique, A. J., Reason, C. J. C., Silinto, B., Zucula, J., Raiva, I., Congolo,
F., and Mavume, A. F.: Extreme rainfall and floods in southern Africa in
January 2013 and associated circulation patterns, Nat. Hazards, 77,
679–691, 2015.
Merk, C., Cermak, J., and Bendix, J.: Retrieval of optical and microphysical
cloud properties from Meteosat SEVIRI data at night – a feasibility
study based on radiative transfer calculations, Remote Sens. Lett., 2,
357–366, 2011.
Meyer, H., Kühnlein, M., Appelhans, T., and Nauss, T.: Comparison of four
machine learning algorithms for their applicability in satellite-based
optical rainfall retrievals, Atmos. Res., 169, Part B, 424–433, 2016.
Meyer, H., Kühnlein, M., Reudenbach, C., and Nauss, T.: Revealing the
potential of spectral and textural predictor variables in a neural
network-based rainfall retrieval technique, Remote Sens. Lett., 8,
647–656, 2017.
Panchal, G., Ganatra, A., Kosta, Y. P., and Panchal, D.: Behaviour Analysis
of Multilayer Perceptrons with Multiple Hidden Neurons and Hidden
Layers, International Journal of Computer Theory and Engineering, 3, 332–337, 2011.
Prigent, C.: Precipitation retrieval from space: An overview, C. R.
Geosci., 342, 380–389, 2010.
R Core Team: R: A Language and Environment for Statistical
Computing, R Foundation for Statistical Computing, Vienna, Austria,
https://www.R-project.org/
(last access: 1 February 2017), 2016.
Ripley, B. and Venables, W.: nnet: Feed-Forward Neural Networks and
Multinomial Log-Linear Models,
http://CRAN.R-project.org/package=nnet
(last access: 1 February 2017), R package version
7.3-12, 2016.
Roebeling, R. A., Feijt, A. J., and Stammes, P.: Cloud property retrievals
for climate monitoring: Implications of differences between Spinning
Enhanced Visible and Infrared Imager (SEVIRI) on METEOSAT-8 and
Advanced Very High Resolution Radiometer (AVHRR) on NOAA-17, J.
Geophys. Res.-Atmos., 111, D20210, https://doi.org/10.1029/2005JD006990,
2006.
Rosenfeld, D. and Lensky, I. M.: Satellite-Based Insights into
Precipitation Formation Processes in Continental and Maritime
Convective Clouds, B. Am. Meteorol. Soc., 79, 2457–2476, 1998.
Skofronick-Jackson, G., Petersen, W. A., Berg, W., Kidd, C., Stocker, E. F.,
Kirschbaum, D. B., Kakar, R., Braun, S. A., Huffman, G. J., Iguchi, T.,
Kirstetter, P. E., Kummerow, C., Meneghini, R., Oki, R., Olson, W. S.,
Takayabu, Y. N., Furukawa, K., and Wilheit, T.: The Global
Precipitation Measurement (GPM) Mission for Science and Society,
B. Am. Meteorol. Soc., https://doi.org/10.1175/BAMS-D-15-00306.1, 2017.
Stengel, M., Kniffka, A., Meirink, J. F., Lockhoff, M., Tan, J., and Hollmann, R.: CLAAS: the CM SAF cloud property
data set using SEVIRI, Atmos. Chem. Phys., 14, 4297–4311, https://doi.org/10.5194/acp-14-4297-2014, 2014.
Thies, B. and Bendix, J.: Satellite based remote sensing of weather and
climate: recent achievements and future perspectives, Meteorol. Appl., 18,
262–295, 2011.
Venables, W. N. and Ripley, B. D.: Modern Applied Statistics with S,
Springer, New York, 4th Edn., 2002.
Vicente, G. A., Davenport, J. C., and Scofield, R. A.: The role of orographic
and parallax corrections on real time high resolution satellite rainfall rate
distribution, Int. J. Remote Sens., 23, 221–230, 2002.
Wöllauer, S., Forteva, S., and Nauss, T.: On demand processing of
climate station sensor data, in: EGU General Assembly Conference Abstracts,
vol. 17 of EGU General Assembly Conference Abstracts, p. 5231, 2015.