
Journal of Hydrology 525 (2015) 138–151
Contents lists available at ScienceDirect
Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol
Entropy theory based multi-criteria resampling of rain gauge networks
for hydrological modelling – A case study of humid area in southern
China
Hongliang Xu a,b, Chong-Yu Xu a,c,⇑, Nils Roar Sælthun a, Youpeng Xu b, Bin Zhou d,e, Hua Chen c
a Department of Geosciences, University of Oslo, PO Box 1047, Blindern, 0316 Oslo, Norway
b Department of Land Resources and Tourism, Nanjing University, 22 Hankou Road, Nanjing, Jiangsu 210093, PR China
c Department of Hydrology and Water Resources, Wuhan University, PR China
d Department of Chemistry, University of Oslo, Oslo, Norway
e Tianjin Academy of Environmental Sciences, 17 Fukang Road, Tianjin, PR China
Article info
Article history:
Received 14 May 2014
Received in revised form 1 March 2015
Accepted 7 March 2015
Available online 25 March 2015
This manuscript was handled by
Konstantine P. Georgakakos, Editor-in-Chief,
with the assistance of Emmanouil N.
Anagnostou, Associate Editor
Keywords:
Entropy
Mutual information
Multi-criteria
Xinanjiang Model
SWAT Model
Xiangjiang River Basin
Summary
Rain gauge networks are used to provide estimates of area average, spatial variability and point rainfalls
at catchment scale and provide the most important input for hydrological models. Therefore, it is desired
to design the optimal rain gauge networks with a minimal number of rain gauges to provide reliable data
with both areal mean values and spatial–temporal variability. Based on a dense rain gauge network of
185 rain gauges in Xiangjiang River Basin, southern China, this study used an entropy theory based
multi-criteria method, which simultaneously considers the information derived from rainfall series, minimizes the bias of areal mean rainfall and minimizes the information overlap among gauges, to
resample the rain gauge networks with different gauge densities. The optimal networks were examined
using two hydrological models: The lumped Xinanjiang Model and the distributed SWAT Model. The
results indicate that the performances of the lumped model using different optimal networks are stable
while the performances of the distributed model keep on improving as the number of rain gauges
increases. The results reveal that the entropy theory based multi-criteria strategy provides an optimal
design of rain gauge network which is of vital importance in regional hydrological study and water
resources management.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Accurate rainfall estimation is an important and challenging
task, and a good spatial distribution of rain gauges is a vital factor
in providing reliable areal rainfall. Modern rainfall networks
established to monitor hydrological features should provide the
necessary and real-time information for purposes such as water
resources management, reservoir operation and ﬂood forecast
and control (Chen et al., 2008; Shaﬁei et al., 2014). Direct measurement of rainfall can only be achieved by rain gauges, which provide
a basis for characterizing the temporal and spatial variations of
rainfall (Cheng et al., 2008). However, even if rain gauges are
capable of providing real-time rainfall information at very ﬁne
temporal resolution with the help of automatic rainfall recording
equipment, it is still difﬁcult to characterize the spatial variation
⇑ Corresponding author at: Department of Geosciences, University of Oslo, PO
Box 1047, Blindern, 0316 Oslo, Norway.
E-mail address: [email protected] (C.-Y. Xu).
http://dx.doi.org/10.1016/j.jhydrol.2015.03.034
0022-1694/© 2015 Elsevier B.V. All rights reserved.
of rainfall without a well-designed rain gauge network in the
catchment.
A well designed rain gauge network with proper densities and
distributions is essential to provide valid precipitation information
reﬂecting the spatial–temporal features in a catchment. However,
most river basins of the world are poorly gauged or ungauged,
and most rain gauge networks applied for hydrological purposes
are largely inadequate according to the most dilute density
requirements of the World Meteorological Organization (WMO).
The WMO recommends certain rain gauge densities for different
types of basins: for example, 500 km² per gauge
is recommended in flat regions of temperate zones, while 25 km²
per gauge is recommended for small mountainous islands with
irregular precipitation (WMO, 1994). Moreover, many non-hydrological factors considerably impact the rain gauge network design,
e.g. accessibility, cost and ease of maintenance, topographical
aspects, etc. However, many studies have noted a marked decline
in the amount of hydrometric data being collected in many parts
of the world (Perks et al., 1996; Stokstad, 1999; Kizza et al.,
2009). The decline of hydrometric gauges exists not only in developing countries, but even in developed countries, e.g. the U.S.
Geological Survey (USGS) network underwent some significant
reductions in the mid-1990s (Mason and York, 1997; Pyrce, 2004).
Meanwhile, with technological advancement, the widespread
application of satellite rainfall products has further caused deterioration of rain gauge networks in some cases (Ali et al., 2003;
Visessri and McIntyre, 2012). This decline in hydrometric gauges
means that scientists and water resources engineers are less able
to monitor water supplies, predict droughts, and forecast ﬂoods
than they were 30 years ago (Stokstad, 1999).
Rain gauge network design involves analysis of the quantity and
location of stations necessary for fulﬁlling the required accuracy
(Bras, 1990) and meeting the objectives of information provided
by the network as efﬁciently and economically as possible
(Hackett, 1966). It is therefore desirable to design a minimum density of a rain-gauge network for a required level of service in a
given catchment.
Entropy theory has been applied as a useful tool for understanding the characteristics of precipitation, which is governed by complex
factors, and thereby for helping to design rain gauge networks. Maruyama
et al. (2005) assessed the global potential water resources availability by using two different measures of entropy, the so called
Intensity Entropy (IE) and Apportionment Entropy (AE) and
11,260 rain gauges. Tapiador (2007) analysed the global satellite
based monthly precipitation database (NOAA Climate Prediction
Centre Merged Analysis of Precipitation data) from 1979 to 2001
using a direct maximum entropy spectral analysis method and
found that several cycles other than the annual or seasonal cycles
affect the rainfall distribution of many areas, in particular western
Europe. Liu et al. (2013) studied the large-scale spatial rainfall
distribution in the Pearl River basin of China from 1959 to 2009
using the information entropy theory and the fuzzy cluster analysis. The study area was then classiﬁed into 10 zones with their
unique temporal and spatial distribution characteristics to meet
the increasing demand of domestic and industrial usage of water
resources. Sang (2013) investigated the spatial and temporal variability of daily precipitation and precipitation extrema in the
Yangtze River Delta (YRD) during 1958–2007 by using the discrete
wavelet entropy method and indicated that the daily precipitation
variability in YRD is determined by the comprehensive impacts of
atmospheric circulation, urbanizations, and the Taihu Lake, while
the variability of precipitation extrema is mainly determined by
natural atmospheric circulation.
Based on the understanding of precipitation patterns, various
approaches using optimal selection of rainfall gauges have been
applied in designing rain gauge network to yield higher precision
of rainfall estimation with minimum cost. Pardo-Igúzquiza
(1998) presented an optimal network design for the estimation
of areal mean rainfall events by using simulated annealing method,
which demonstrated that the simulated annealing algorithm of
random search for optimal location of rain gauges takes into consideration the estimation accuracy and economic cost simultaneously. Patra (2001) applied a statistical theory for rain gauge
network design. The study applied the coefﬁcient of variation
and the acceptable percentage of error range to estimate the optimal number of rain gauges. St-Hilaire et al. (2003) evaluated the
impact of meteorological network density on the estimation of
basin precipitation and runoff in ﬁve drainage basins in Mauricie
watershed in Quebec, Canada by using Kriging method to estimate
the spatial distribution and variance of rainfall. Dong et al. (2005)
used variance reduction analysis method to ﬁnd the appropriate
quantity and location of rain gauges in Qingjiang River basin,
China for ﬂow simulation. The study demonstrated that both cross
correlation coefﬁcient and modelling performance increased
hyperbolically and level off after more than ﬁve rain gauges were
included in the network for the study area. Anctil et al. (2006)
applied random selection of rain gauges to produce subsets of the rain gauge network to optimize the mean daily
areal rainfall series in the Bas-en-Basset watershed, southern France,
and used a genetic algorithm to orient the rain gauge combinatorial problem towards improved forecasting performance.
Segond et al. (2007) investigated the relationship between spatial rainfall and runoff production by using the rain gauge networks
of various densities and radar data in Lee catchment, UK. The study
concluded that the dominant effect on hydrological modelling is
the spatial variability of the rainfall estimated by different rain
gauge networks and radar data. Bárdossy and Das (2008) studied
the inﬂuence of the spatial resolution of rainfall input on the model
calibration and application by varying the distribution of the rain
gauge network via External Drift Kriging method (EDK) in southwest of Germany. The study pointed out that the overall performance of the model worsened dramatically with reduction of
rain gauges, while there is no signiﬁcant improvement of the
model after the number of rain gauges passed a certain threshold.
Chen et al. (2008) applied Kriging and entropy-based algorithm to
design rain gauge network which contains the minimum number
of rain gauges and optimum spatial distribution in Taiwan
Province, China. The study found that the saturation of rainfall
information can be used to add or remove the rain gauge stations
in order to determine the optimum spatial distribution and the
minimum number of rain gauges in the network. Yoo et al.
(2008) compared the applications of mixed and continuous distribution functions to the theory of entropy for the evaluation of
rain gauge networks in the Choongju Dam basin, Korea. Due to
the small wet probability and the high coincidence of daily rainfall
between rain gauge stations, the study found that the optimal
number of rain gauges estimated by the mixed distribution function was much smaller, but still reasonable, than that estimated
by applying the continuous distribution function. Wei et al.
(2014) investigated the spatiotemporal scaling effect on rainfall
network design relocated by calculating the maximum joint
entropy of rainfall in 1992–2012 for 1-, 3-, and 5-km grids in
Taiwan Province, China. The study found that a smaller number and a
lower percentage of required stations reached stable joint entropy,
which provides key reference points for adjusting the network to capture
more accurate information and minimize redundancy.
In many studies, rain-gauge networks are designed to provide
good estimation for areal rainfall and for ﬂood modelling and prediction (e.g. Nour et al., 2006; Segond et al., 2007; Barca et al.,
2008; Volkmann et al., 2010; Chebbi et al., 2011; etc.). Tsintikidis
et al. (2002) demonstrated that even when lumped models are
used for ﬂood forecasting, a proper gauge network can signiﬁcantly
improve the results. Because summer flash rainfall exhibits particularly high spatiotemporal variability and produces severe,
quick, and sharply peaked ﬂash ﬂooding (Desilets et al., 2008),
the monitoring of summer ﬂash rainfall represents the most difﬁcult and important challenge for a rain gauge network designed for
ﬂood prediction. Volkmann et al. (2010) designed sparse rain gauge
networks in semiarid catchments with complex terrain to predict
flash floods. The study showed that the multi-criteria strategy
which provided a robust design in diluting the rain gauge network
could be implemented in designing sparse but accurate rain
gauge networks in the semiarid catchments similar to the one
studied.
Precipitation gauge network structure is not only dependent on
the station density; station location also plays an important role in
determining whether information is gained properly. Gupta et al.
(2002) and Yatheendradas et al. (2008) pointed out that mountain areas with rapidly changing patterns of precipitation are
poorly monitored, which makes it difficult to produce accurate
hydrological forecasts with sufficient lead time. Therefore, the
design of hydrological measurement networks has received considerable attention in research settings.
Rain gauge network optimization can be taken as the process
of ﬁnding the locations of a limited number of rain gauges
which provide sufﬁcient rainfall information of both the spatial
distribution and the areal mean precipitation. The main objectives of this paper are to: (1) understand and quantify
the variability of the precipitation at catchment scale using
the Shannon’s Entropy and Mutual Information method; (2)
design and evaluate a new entropy theory based multi-criteria
strategy for identifying the best locations for installation of rain
gauges based on the existing dense rain gauge network; and (3)
evaluate the impact of the different rain gauge networks
on hydrological simulation by using lumped and distributed
hydrological models.
2. Material and methods
2.1. Study area and data
Xiangjiang River is a tributary of the Yangtze River located between 24°30′–29°30′ N and 110°30′–114° E in central-south China, with a total river length of 856 km (Fig. 1). In the 94,660 km² catchment area, the terrain is ladder-like, with high mountains in the headwater region and low plains downstream. The monsoon climate is the main factor bringing rainfall to the catchment from the Pacific Ocean. Nearly two thirds of the 1600 mm annual precipitation occurs in the rainy season (from April to September). The mean annual temperature and mean annual potential evapotranspiration are 17 °C and 1000 mm, respectively. The mean annual discharge at Xiangtan gauge is 72.2 billion m³, and the seasonal variation of runoff follows that of the precipitation, i.e., about 70% of the flooding events occur during the rainy season.
The precipitation dataset used in this study consists of 185 rain gauges (Fig. 1) and covers the period from 1st January 1991 to 31st December 2005. The China Meteorological Data Sharing Service System (http://cdc.cma.gov.cn) provided the meteorological data, including daily maximum and minimum air temperature, precipitation, wind speed, solar radiation and average daily humidity. The spatial data of the basin, including elevation (the 90 m resolution DEM was downloaded from the Shuttle Radar Topography Mission), a 1 km resolution soil dataset with texture, depth and drainage attributes (provided by the project Watershed EUTROphication management in China through system oriented process modelling of Pressures, Impacts and Abatement actions) and a land use map (consisting of five classes, namely forest, agriculture, grass, urban and water, interpreted from Landsat satellite images), were prepared as inputs to the SWAT Model (Soil and Water Assessment Tool).
2.2. Methodology
To achieve the objectives of optimizing rain gauge networks of various gauge densities and investigating the impact of rain gauge location on hydrological modelling, the long-term rainfall (from 1991 to 2005) is considered in the analysis. The rainfall from the 185 gauges and the areal mean rainfall computed by using the Thiessen Polygon algorithm with all 185 gauges over the study area were assumed to represent the "true" point and areal mean precipitation, respectively. In the following three subsections, the basic theory of Shannon's Entropy and Mutual Information, the strategy for rain gauge network optimization, and the hydrological models used in the study are described.
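The Thiessen-weighted areal mean used as the "true" reference can be illustrated with a minimal sketch. It approximates the Thiessen polygons by assigning each cell of a regular grid to its nearest gauge, so each gauge's weight is the share of cells it captures. The function names, the rectangular grid and the use of plain Euclidean distance are assumptions made for illustration; the study itself works on the real basin boundary.

```python
import numpy as np

def thiessen_weights(gauge_xy, grid_xy):
    """Approximate Thiessen weights as the fraction of grid cells nearest to each gauge.

    gauge_xy: (n_gauges, 2) coordinates; grid_xy: (n_cells, 2) cell centres
    covering the catchment (hypothetical inputs).
    """
    # Squared distance between every grid cell and every gauge
    d2 = ((grid_xy[:, None, :] - gauge_xy[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(gauge_xy))
    return counts / counts.sum()

def areal_mean(rain, weights):
    """rain: (n_days, n_gauges) daily depths; returns the weighted areal mean per day."""
    return np.asarray(rain, dtype=float) @ weights
```

With a finer grid the weights converge towards the polygon area fractions; a sampled network would reuse the same two functions with only its own gauges.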
Fig. 1. Study area: Xiangjiang River Basin.
2.2.1. Shannon’s entropy and Mutual Information
The entropy of a system is commonly described as a measure of
the inherent disorder within the system. Shannon’s Entropy (H)
(1949), also named as Information Entropy, is a measure of the
uncertainty in a random variable (Ihara, 1993), which quantiﬁes
the expected value of the information contained in a message
(i.e. the speciﬁc realization of the random variable) (Bush, 2010).
Information Entropy is the average unpredictability in a random
variable, which is equivalent to its information content (Strange
et al., 2005). The basic hypotheses of the entropy are: for a discrete
random variable X with possible values {x1, x2, . . ., xn} and probability mass function P(X), the amount of information, I(P), is a real
nonnegative measure, additive and a continuous function of
probability p, then (Lin, 1991; Chen et al., 2008):
\[
\begin{cases}
I(P(x_i)) \ge 0 \\
I(P(x_1)P(x_2)) = I(P(x_1)) + I(P(x_2))
\end{cases}
\tag{1}
\]
For any discrete probability distribution, Shannon's entropy is denoted as (Yao, 2003; Bhattacharyya and Sanyal, 2012):
\[
H(X) = E[I(X)] = E[-\log_b(P(X))] = \sum_i P(x_i) I(x_i) = -\sum_i P(x_i) \log_b P(x_i)
\tag{2}
\]
where E is the expected value operator, I is the information content of X, I(X) = −logb(P(X)) is the self-information of a random variable (i.e. a measure of the information content associated with the outcome of a random variable), b is the base of the logarithm with common values of 2, Euler's number e, and 10, and the unit of entropy is bit for b = 2, nat for b = e, and dit (or digit) for b = 10. P(xi) is the probability mass function of outcome xi. In the case of P(xi) = 0 for some i, the value of the corresponding summand 0 logb(0) is taken to be 0, which is consistent with the well-known limit:
\[
\lim_{p \to 0^{+}} p \log_b(p) = 0
\tag{3}
\]
An important property of entropy is that it is maximized when the system is in the highest possible state of disorder. For a system with a finite number of possible states, the entropy is maximized when all probabilities are equal, i.e. P(x) = 1/n and Hmax(X) = logb(n) (Borwein et al., 2014). Mathematically, the amount of information contained in the random variable X is inversely associated with the probability of the occurrence of xi. The principle of maximum entropy is a general method for estimating probability distributions from random variable X which can be used to obtain unbiased probability assessments (Guiasu and Shenitzer, 1985). The over-riding principle in maximum entropy is a generalization of the classical principle of indifference, which means that the distribution of X should be as uniform as possible if nothing is known about X except that it belongs to a certain class (Smith and Grandy (Eds.) 1985; Gull, 1989).
Mutual Information measures the amount of information that can be obtained about one random variable by observing another (i.e. it is a quantitative measurement of the mutual dependence of two random variables) (Steuer et al., 2001). In our case the two random variables are two rain gauges X with values {x1, x2, . . ., xn} and Y with values {y1, y2, . . ., yn}, respectively. The precipitation information of rain gauges X and Y may overlap. Corresponding to Eq. (2), the joint entropy of the two rain gauges is (Krstanovic and Singh, 1992):
\[
H(X,Y) = E[-\log_2(P(X,Y))] = -\sum_{y \in Y} \sum_{x \in X} p_{xy} \log_2 p_{xy}
\tag{4}
\]
where p(x,y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y respectively. The entropies of these distributions are related to each other and it can be proved that (Zhang and Yeung, 1998):
\[
\begin{cases}
H(X,Y) \ge \max[H(X), H(Y)] \\
H(X,Y) \le H(X) + H(Y)
\end{cases}
\tag{5}
\]
When the precipitation information is recorded in rain gauge X, the uncertainty of the rain gauge Y can be exhibited by the conditional entropy. The conditional probability of rain gauge X under the impact of rain gauge Y can be denoted as:
\[
p(X|Y) = p_{x|y} = \frac{p_{xy}}{p_y}
\tag{6}
\]
Therefore
\[
\begin{aligned}
H(X,Y) &= -\sum_{x \in X} \sum_{y \in Y} p_{xy} \log_2 p_{xy} \\
&= -\sum_{x \in X} \sum_{y \in Y} \left[ p(x|y)\,p(y) \log_2 \big( p(x|y)\,p(y) \big) \right] \\
&= -\sum_{y \in Y} p(y) \left[ \sum_{x \in X} p(x|y) \log_2 p(x|y) \right] - \sum_{y \in Y} p(y) \log_2 (p(y)) \left[ \sum_{x \in X} p(x|y) \right]
\end{aligned}
\tag{7}
\]
The joint entropy, which measures the uncertainty associated with two random variables X and Y, is related to the conditional entropy by Eq. (8) (Cover and Thomas, 2012):
\[
H(Y|X) = H(X,Y) - H(X)
\tag{8}
\]
where H(Y|X) is the conditional entropy of event Y given event X. There are no uncertainties in terms of the conditional entropy of identical variables. Hence, the value of the conditional entropy shown below is set to zero:
\[
H(x|x) = 0
\tag{9}
\]
Unlike Pearson's correlation, which characterizes linear dependence, Mutual Information is completely general (Pethel and Hahs, 2014). The amount of mutual (overlapped) information of the two rain gauges can be estimated by applying the transferable information computation, which is similar to using the rain gauge X to forecast the information of rain gauge Y. The reduction in information about one variable (i.e. rain gauge X) due to the knowledge of the other (i.e. rain gauge Y) is (Chen et al., 2008):
\[
I(X,Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}
\tag{10}
\]
where I(X,Y) is the Mutual Information of variables (rain gauges) X and Y, and zero dependence occurs if and only if p(x,y) = p(x)p(y) (i.e. X and Y are independent and log2(p(x,y)/(p(x)p(y))) = log2 1 = 0); otherwise I(X,Y) is a positive quantity, and it can be proved that:
\[
\begin{cases}
I(X,Y) = I(Y,X) \\
I(X,Y) \le H(X) \\
I(Y,X) \le H(Y)
\end{cases}
\tag{11}
\]
The deduction of Eqs. (8) and (10) is given in the Appendix A.
For a detailed description and understanding of Information
Entropy and Mutual Information, please refer to William (2007)
and MacKay (2003).
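As an illustration of Eqs. (2), (4) and (10), the following minimal sketch estimates entropy and Mutual Information in bits from two discretized rainfall series by counting class frequencies. How the rainfall depths are binned is not specified in this section, so the `discretize` helper and its `bin_edges` argument are assumptions, as are all function names.

```python
import numpy as np

def shannon_entropy(labels):
    """H(X) in bits for a 1-D sequence of discrete class labels (Eq. 2 with b = 2)."""
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    p = counts / counts.sum()
    # Only observed classes are counted, so the 0*log(0) case never arises
    return float(-(p * np.log2(p)).sum())

def joint_entropy(x, y):
    """H(X, Y) in bits from paired class labels (Eq. 4)."""
    pairs = np.stack([np.asarray(x), np.asarray(y)], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(x, y):
    """I(X, Y) = H(X) + H(Y) - H(X, Y) (Eq. 10)."""
    return shannon_entropy(x) + shannon_entropy(y) - joint_entropy(x, y)

def discretize(rain_mm, bin_edges):
    """Map rainfall depths to class indices; the bin edges are a user choice (assumption)."""
    return np.digitize(np.asarray(rain_mm, dtype=float), bin_edges)
```

For two gauges with identical class series the Mutual Information equals the entropy of either gauge, and for independent series it is zero, in line with Eq. (11).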
2.2.2. Rain gauge network optimization
To find the optimal rain gauge network for different numbers of rain gauges, various scenarios of different rain gauge densities were built. Nine broad scenarios comprising 5–75% of the total rain gauges (Table 1) were obtained for analysis. For each scenario, the best and good rain gauge networks are selected from 5000 different network configurations by using the Monte Carlo method from the total of 185 gauges, following the steps below:
Step 1: Compute the Shannon's Entropy (H) of each rain gauge and find the rain gauge with the maximum entropy, Hmax. The remaining 184 rain gauges (excluding the rain gauge with Hmax) can be considered as a 184-dimensional dataset R184, and the given number of rain gauges randomly selected from R184 are considered as sub-datasets Rd ∈ R184 (d = 8, 18, 27, ... etc.). Finding all the possible combinations for a given gauge number (d) is neither necessary nor feasible (e.g. there are approximately 2.792 × 10^13 different combinations of 8 gauges out of 184 gauges), so we adopt the Monte Carlo stochastic selection method to compose the feasible rain gauge network set Θ′, which includes 5000 different combinations for each given gauge number (d). The rain gauge of maximum entropy is then added to each rain gauge network in Θ′ to compose the rain gauge network set Θ.
Step 2: To find the "best" network configuration for a given number of rain gauges, we applied a multi-criteria algorithm based on values computed for three objective functions (OFs) in dataset Θ.
\[
\begin{cases}
F_1(\theta) = \bar{I} = \dfrac{\sum_{i=1}^{d-1} \sum_{j=i+1}^{d} I(X_i, X_j)}{C_d^2} \\[2ex]
F_2(\theta) = \mathrm{PBIAS} = \dfrac{\sum_{t=1}^{n} |x_t - p_t|}{\sum_{t=1}^{n} p_t} \\[2ex]
F_3(\theta) = \mathrm{NSC} = 1 - \dfrac{\sum_{t=1}^{n} (x_t - p_t)^2}{\sum_{t=1}^{n} (p_t - \bar{p})^2}
\end{cases}
\tag{12}
\]
where Xi and Xj are a pair of rain gauges derived from Θ, and C_d^2 is the combinatorial number equal to d(d − 1)/2. pt is the "true" areal mean precipitation (computed by using all the 185 rain gauges in the catchment) at time interval t, xt is the sampled areal mean precipitation from a given network configuration at that time interval, and n is the number of 1 day time intervals analysed. The overbar (as in x̄ and p̄) indicates the average of the measure (pt) over all n time intervals considered.
The first OF (F1(θ)) is the arithmetic mean of the Mutual Information computed for all pairwise combinations of the rain gauges (Xi and Xj) from Θ, which represents the rainfall information "overlapped" among gauges. The second OF (F2(θ)) is PBIAS, which measures errors in the total rainfall volume input to the catchment, computed as the percent bias of the absolute error. Considering the temporal dynamics of the discharge generation process, an accurate estimation of rainfall volumes alone cannot provide sufficient information to accurately predict the discharge volume and hydrograph shape. Therefore, the third OF (F3(θ)), the Nash–Sutcliffe Coefficient (NSC), is also used, which determines the relative magnitude of the residual variance ("noise") compared to the measured data variance ("information") (Nash and Sutcliffe, 1970).
Simultaneous optimization of the non-commensurable information provided by the three OFs (i.e. F1(θ), F2(θ) and F3(θ)) helps to extract relevant information from a single signal for resampling the optimal rain gauge networks applied in hydrological modelling (Gupta et al., 1998). Note that any number of relevant OFs could be considered for the multi-criteria purpose, but the selected OFs should not be highly correlated, so that they evaluate various aspects of the difference between calculated and "true" precipitation in the catchment.
Step 3: To find rain gauge networks that optimize all OFs simultaneously from Θ, formulate the multi-criteria network design problem (function (12)) as (Gupta et al., 1998):
\[
\underset{\theta \in \Theta}{\operatorname{optim}}\; F(\theta) = [F_1(\theta), F_2(\theta), \ldots, F_m(\theta)]
\tag{13}
\]
where θ stands for the possible gauge combinations within the rain gauge network set Θ and m stands for the number of criteria used (in this study, m = 3). In general, this multi-criteria optimization problem has a set of solutions rather than a unique solution that simultaneously optimizes each criterion. Therefore, it is necessary to make use of a Pareto set of solutions, which has the property that moving from one solution to another results in the improvement of at least one criterion while causing deterioration in at least one other (Gupta et al., 2003; Jayawardena, 2014). As Volkmann et al. (2010) stated, "This Pareto set defines the minimum uncertainty in network selection that can be achieved without stating a subjective relative preference for minimizing one specific component of F(θ) at the expense of another".
However, in practice, rain gauge network design needs a unique solution rather than a set of Pareto solutions, but it is not a simple task to objectively assess the goodness of the possible solutions in the Pareto set and identify the best solution before validation in hydrological models. A restricted set of Pareto solutions which allows for subjectively identifying an appropriate single network can be named "compromise solutions", i.e. the Pareto solutions which correspond to a more balanced trade-off between the three OFs and consequently optimize all the OFs simultaneously (low Ī and PBIAS and high NSC). That is to say, the Pareto solution representing the most satisfactory compromise among the three OFs is the solution for which no OF can be optimized any further without the slight improvement of one OF coming at the expense of a strong deterioration of at least one other objective (in practice, a slight deterioration of only one function is acceptable to achieve the improvement of the other two functions). Therefore, the appropriate final "best" rain gauge network can be identified from the compromise solutions that fulfil the multiple-criteria requirements (i.e. in a simple manner in this study, the best network is the network identified by the three criteria indices in the compromise solutions by (1) giving three best values; or (2) giving at least one best value while the other values are only slightly worse than the best values; or (3) all three indices not achieving their best values but being only slightly worse than the best values). In some cases, however, the value of an OF is reasonable but may not be optimal in a single-criterion sense. After selecting the best rain gauge network, all solutions close to the ideal values of the three criteria (i.e. Ī = min(Ī), PBIAS = 0 and NSC = 1) and in proximity to the best solution may be chosen as "good" rain gauge networks (denoted as Θg) with respect to their compromise between the three OFs (in the sense of the 3-D space built by the three criteria, the good networks should simultaneously fulfil the conditions of being (1) close to the ideal values of the three criteria, and (2) close to the best network). To find the good rain gauge networks, first, a compromise set of solutions (denoted as Θc) should be identified from the highly compromised part of the Pareto set, which excludes the solutions that only correspond to single-criterion optimizations (the Pareto set is denoted as Θp, and Θg ⊂ Θc ⊂ Θp ⊂ Θ). Then, the good rain gauge networks can be found in proximity to the best rain gauge network solution. The proximity is defined as solutions in a region of three-criteria space with vertices located at (1) Ī = min(Ī), PBIAS = 0 and NSC = 1 and (2) the xth percentile of Fi(θ) with the other two OFs equal to their best values (e.g. the xth percentile of Ī, PBIAS = 0 and NSC = 1). The value x is selected by iteratively increasing from the 90th percentile until the resulting region contains 10 solutions or fewer. The best and the good rain gauge networks are identified by using the entire rainfall data (from 1st January 1991 to 31st December 2005).
Table 1
The number of rain gauges in different percentage of selection.
Percentage of rain gauges    5%   10%   15%   20%   25%   30%   40%   50%   75%   100%
Number of rain gauges         9    19    28    37    46    56    74    93   139    185
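Steps 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a precomputed pairwise Mutual Information matrix is assumed to be available, PBIAS is computed as the plain ratio in Eq. (12), and the Pareto filter expects every column to be minimized, so NSC would be passed as 1 − NSC.

```python
import numpy as np

def sample_networks(n_gauges, d, n_samples, anchor, rng):
    """Monte Carlo step: draw d gauges at random (excluding the max-entropy
    gauge `anchor`) and add the anchor to every sampled network."""
    others = np.array([g for g in range(n_gauges) if g != anchor])
    return [np.append(rng.choice(others, size=d, replace=False), anchor)
            for _ in range(n_samples)]

def mean_pairwise_mi(mi_matrix, gauges):
    """F1: mean Mutual Information over all gauge pairs in one network."""
    sub = mi_matrix[np.ix_(gauges, gauges)]
    upper = np.triu_indices(len(gauges), k=1)
    return float(sub[upper].mean())

def pbias(x, p):
    """F2: summed absolute error relative to the total 'true' areal rainfall."""
    return float(np.abs(x - p).sum() / p.sum())

def nsc(x, p):
    """F3: Nash-Sutcliffe coefficient of sampled vs 'true' areal rainfall."""
    return float(1.0 - ((x - p) ** 2).sum() / ((p - p.mean()) ** 2).sum())

def pareto_mask(scores):
    """scores: (n_networks, m) array, every column to be minimized.
    Returns True for networks that no other network dominates."""
    keep = np.ones(scores.shape[0], dtype=bool)
    for i in range(scores.shape[0]):
        no_worse = (scores <= scores[i]).all(axis=1)
        better = (scores < scores[i]).any(axis=1)
        keep[i] = not (no_worse & better).any()
    return keep
```

A score table for the filter could be assembled as `np.column_stack([f1, f2, 1 - f3])`; selecting the "best" and "good" networks from the resulting Pareto set remains the subjective compromise step described above.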
2.2.3. Evaluation of rainfall estimated by optimal rain gauge
networks–variance reduction due to the increase in the number of rain
gauges
Variance in the rainfall time series provides one estimate of the
variability of the rainfall at a location or of a region (Dong et al.,
2005). The variance of areal mean rainfall is expected to reduce if
the number of rain gauges for calculating the areal rainfall is
increased. The algorithm adopted from Yevjevich (1972) provides
a good measure of the relationship between the variance of station
measurement and that of areal mean rainfall.
Under the hypothesis that a rain gauge network of a basin is
composed of n gauges with record length N, and the rainfall
recorded in the area is spatially ergodic and homogeneous, the
variance of the areal mean rainfall can be expressed as
(Yevjevich, 1972; Dong et al., 2005):
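\[
s^2 = \frac{\overline{s_j^2}}{n} \left[ 1 + \bar{r}\,(n - 1) \right]
\tag{14}
\]
where
\[
\overline{s_j^2} = \frac{1}{n} \sum_{j=1}^{n} s_j^2
\tag{15}
\]
\[
s_j^2 = \left[ \sum_{i=1}^{N} (x_{ij} - \bar{x}_j)^2 \right] / N
\tag{16}
\]
\[
\bar{x}_j = \left( \sum_{i=1}^{N} x_{ij} \right) / N
\tag{17}
\]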
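A minimal numerical sketch of Eq. (14) is given below. It assumes that the correlation term is the mean of the pairwise Pearson correlation coefficients between gauge series, which is an interpretation rather than something stated in this excerpt, and the function name is hypothetical. It needs at least two gauges with non-constant records.

```python
import numpy as np

def areal_mean_variance(rain):
    """Variance of the areal mean rainfall following Eq. (14).

    rain: (N, n) array with N records for each of n gauges.
    """
    rain = np.asarray(rain, dtype=float)
    n = rain.shape[1]
    s2_j = rain.var(axis=0)                 # Eq. (16): per-gauge variance, divided by N
    mean_s2 = s2_j.mean()                   # Eq. (15): mean of the gauge variances
    corr = np.corrcoef(rain, rowvar=False)  # assumed: Pearson correlation between gauges
    upper = np.triu_indices(n, k=1)
    r_bar = corr[upper].mean()              # mean over the n(n-1)/2 distinct gauge pairs
    return float(mean_s2 / n * (1.0 + r_bar * (n - 1)))
```

With perfectly correlated gauges the result stays at the mean gauge variance, and with uncorrelated gauges it drops to the mean gauge variance divided by n, which is the variance reduction effect the section describes.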
the Xinanjiang Model has been widely applied in humid and
semi-humid areas of China and abroad (Zhao et al., 1995; Singh,
1995; Zhang et al., 2012). Based on the concept of runoff formation
on repletion of storage, the main application of the model is for
hydrological forecasting (Zhao, 1992; Li et al., 2009), meanwhile,
the model also demonstrated great potential in application of
water resources assessment, catchment management, hydrological
design, water quality accounting and impact study of climate
change and land use change (e.g. Yao et al., 2009; Yuan et al.,
2012; Zhang et al., 2012; Xu et al., 2013).
SWAT (Soil and Water Assessment Tool) Model is a physicallybased continuous, long-term, distributed-parameter model
designed to predict the effects of land management practices on
the hydrology, sediment, and contaminant transport in agricultural
watersheds under varying soils, land use, and management conditions (Arnold et al., 1998). SWAT is based on the concept of
Hydrologic Response Units (HRUs), which are portions of a subbasin that possesses unique land use, management, and soil attributes. The runoff from each HRU is calculated separately based on
weather, soil properties, topography, vegetation, and land management and then summed to determine the total value from the
subbasin. The hydrologic cycle simulated by SWAT is based on
the water balance equation (Neitsch et al., 2002):
SW t ¼ SW 0 þ
t
X
ðRday Q surf Ea W seep Q gw Þ
ð19Þ
i¼1
where SWt is the ﬁnal soil water content, SW0 is the initial soil water
content on day i, t is the time (days), Rday is the amount of precipitation on day i, Qsurf is the amount of surface runoff on day i, Ea
is the amount of evapotranspiration on day i, Wseep is the amount
of water entering the vadose zone from the soil proﬁle on day i,
and Qgw is the amount of return ﬂow on day i. For a detailed description and explanation of the SWAT Model, please refer to Neitsch
et al. (2011).
The hydrological models are calibrated from 1st January 1991 to
31st December 1999 and validated from 1st January 2000 to 31st
December 2005. The performances of hydrological simulation is
evaluated by using Relative Mean Error (RE) and Nash–Sutcliffe
efﬁciency coefﬁcient (NSC) (e.g., Li et al., 2014).
i¼1
r ¼
!
n1 X
n
X
2
r ij =nðn 1Þ
ð18Þ
j¼1 i¼jþ1
where
s2j
is the mean of the gauge variance;
3. Results
3.1. Optimal networks with different number of rain gauges
s2j
is the variance of the
jth rain gauge; xij is the rainfall data recorded at the ith time point
and the jth rain gauge; xj is the mean of the jth rain gauge; rij is the
sample product–moment correlation coefﬁcient between rainfall
series of gauges i and j; and r is the arithmetic mean of the correlation coefﬁcients of all bi-combinations of the rain gauges.
According to Eq. (14), the variance of areal mean rainfall is expected
to decrease hyperbolically with an increasing number of rain gauges
n (more details can be seen in Rodriguez-Iturbe and Mejía, 1974;
Booij, 2002).
2.3. Hydrological models
To investigate the impact of rain gauge density and spatial location on hydrological modelling performances, and to test the utility
of the entropy theory based multi-criteria resampling algorithm,
two different types of hydrological models are applied in the study.
Xinanjiang Model is a lumped conceptual rainfall-runoff hydrological model developed in 1973 (Zhao et al., 1980). Since then,
3.1.1. The distribution of Information Entropy in the catchment
Daily precipitation records from all 185 rain gauges of Xiangjiang River Basin are used as benchmark data in designing the optimal rainfall network. Fig. 2 shows the distribution of Information Entropy over the study area, calculated from the daily precipitation of the 185 rain gauges from 1st January 1991 to 31st December 2005 and interpolated using the Ordinary Kriging method. It is seen that the spatial distribution of the Information Entropy of rainfall is non-homogeneous, with larger values in the south and east than in the north and west of the basin. Referring to the DEM map in Fig. 1, it is seen that the high values of Information Entropy (>4.6 bit) are mainly distributed in the mountain area on the south-east border of the catchment (the rain gauges located outside the basin are not considered in the study because of the limited availability of precipitation data). This is due to the high elevation of the area, which blocks the transport of vapour carried by the monsoon from the Pacific Ocean and thereby produces a very high volume of rainfall; meanwhile, the complex topography also causes considerable precipitation variation in this area.
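
The entropy values mapped in Fig. 2 are computed per gauge from a discretised rainfall series. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `information_entropy`, the equal-width binning and the 1 mm bin width are all assumptions made here for illustration, and the synthetic gamma-distributed series only stands in for real gauge data.

```python
import numpy as np

def information_entropy(series, bin_width=1.0):
    """Shannon entropy (in bits) of a rainfall series.

    Assumption: values are discretised into equal-width classes of
    `bin_width` mm before the class probabilities are estimated.
    """
    series = np.asarray(series, dtype=float)
    # Map every value to the index of its class.
    classes = np.floor(series / bin_width).astype(int)
    # Relative frequency of each occupied class.
    _, counts = np.unique(classes, return_counts=True)
    p = counts / counts.sum()
    # H = -sum(p * log2 p); empty classes never appear, so log2 is safe.
    return float(-(p * np.log2(p)).sum())

# Hypothetical daily series with 5479 values (the number of days in 1991-2005).
rng = np.random.default_rng(0)
daily_rain = rng.gamma(shape=0.4, scale=12.0, size=5479)
print(information_entropy(daily_rain))
```

The result depends on the bin width, so entropy values are only comparable between gauges when the same discretisation is applied to every series.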
Fig. 2. The distribution of Information Entropy in Xiangjiang River Basin.
3.1.2. The Pareto front of best and good rain gauge networks
As expected, a clear trend of improvement in PBIAS, NSC and I is found with the increasing number of rain gauges (Fig. 3). When nine rain gauges are included in the networks, the maximum/minimum values of the 5000 combinations of the networks are 0.59/0.09, 0.99/0.44 and 0.542/0.434 bit for PBIAS, NSC and I respectively; the values of PBIAS and NSC then decrease/increase gradually to approach the theoretical ideal values (PBIAS = 0 and NSC = 1) as the number of rain gauges increases. On the other hand, the maximum values of NSC show nearly no difference among the networks with different numbers of rain gauges, and the maximum and minimum values of I decrease and increase progressively to approach 0.467 bit, the value calculated from all 185 rain gauges in the catchment. It is also seen that the maximum/minimum values of PBIAS decrease nearly linearly with the increase of gauge numbers in the networks, while the maximum/minimum values of NSC and I show no considerable differences once more than 74 rain gauges are selected in the network.

Fig. 3. The maximum and minimum values of multi-criteria results for all combinations (5000 combinations for each given rain gauge number) of networks with different numbers of rain gauges. (Axes: PBIAS / NSC and Mutual Information (bit) against the number of rain gauges: 9, 19, 28, 37, 46, 56, 74, 93, 139 and 185.)

For illustrative purposes, Fig. 4 shows the results of the entropy theory based multi-criteria rain gauge network optimization algorithm for networks composed of nine, 19, 46 and 93 rain gauges. As the number of rain gauges in the network increases (moving from Fig. 4(a)–(d)), the network Pareto solutions, including the best network and good networks, move closer in performance to the network including all 185 rain gauges (i.e. PBIAS = 0, NSC = 1 and I = 0.467 bit). The borders (proximity) of the Pareto front of the best and good rain gauge networks (green lines) become gradually shorter and move towards the location of the theoretically perfect network in the three-dimensional coordinate system (i.e. three-criterion space). In addition, it is also seen that the best and good rain gauge networks increasingly tend to concentrate in the corner located in the diagonal position relative to the theoretically perfect solution (with respect to each combination of two OFs) as the number of rain gauges increases. This result shows that the algorithm adopted is an informed method to identify the best and good rain gauge networks.

Fig. 4. Three dimensional projections of objective function space for the multi-criteria optimization of rain gauge networks of (a) 9 rain gauges; (b) 19 rain gauges; (c) 46 rain gauges, and (d) 93 rain gauges (the red star indicates the best network, the blue dots indicate the good networks and the green lines are the Pareto fronts). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
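
The Pareto fronts discussed above consist of candidate networks that no other candidate beats on all three criteria at once. The sketch below shows one generic way to extract such a non-dominated set; it is an illustration under stated assumptions rather than the procedure used in the study. The helper names `network_costs` and `pareto_mask` and the four candidate score triples are hypothetical, and the three criteria are converted to costs by assuming that I and |PBIAS| are minimised while NSC is maximised.

```python
import numpy as np

def network_costs(mi, pbias, nsc):
    """Stack the three criteria as costs to minimise.

    Assumption: mutual information I and |PBIAS| are minimised and
    NSC is maximised, so NSC enters as 1 - NSC.
    """
    mi = np.asarray(mi, dtype=float)
    pbias = np.asarray(pbias, dtype=float)
    nsc = np.asarray(nsc, dtype=float)
    return np.column_stack([mi, np.abs(pbias), 1.0 - nsc])

def pareto_mask(costs):
    """Boolean mask marking the non-dominated rows of `costs`."""
    costs = np.asarray(costs, dtype=float)
    mask = np.ones(costs.shape[0], dtype=bool)
    for k in range(costs.shape[0]):
        # Another row dominates row k if it is no worse in every
        # criterion and strictly better in at least one.
        no_worse = np.all(costs <= costs[k], axis=1)
        better = np.any(costs < costs[k], axis=1)
        if np.any(no_worse & better):
            mask[k] = False
    return mask

# Four hypothetical candidate networks (values are illustrative only).
mi = [0.45, 0.46, 0.44, 0.50]
pbias = [0.10, 0.20, 0.30, 0.05]
nsc = [0.95, 0.90, 0.92, 0.99]
front = np.flatnonzero(pareto_mask(network_costs(mi, pbias, nsc)))
print(front)
```

The pairwise loop costs O(m²) comparisons for m candidates, which is acceptable for a few thousand combinations per gauge number; a sort-based filter would be preferable for much larger candidate sets.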
3.1.3. Rainfall variation evaluated by different optimal rain gauge
networks
The areal annual mean rainfall and variance computed from 1991 to 2005 for the best and good rain gauge networks with different numbers of rain gauges are shown in Fig. 5. It is seen that the areal mean rainfalls estimated by the best and good networks of different rain gauge configurations vary around 1602 mm (i.e. the areal annual mean rainfall calculated from all 185 rain gauges in the catchment, Fig. 5(a)), and the relative errors are all less than 5% compared with the areal annual mean rainfall estimated by the network containing 185 gauges. Moreover, there are no significant differences between the areal annual mean rainfalls computed by different gauge configurations according to Student's t-test at the 5% significance level. Similarly, the effects of the number of rain gauges of optimal networks on the variance of the areal mean rainfall series are shown in Fig. 5(b). The optimal networks with sparse gauges are expected to yield a variance similar to that of the optimal networks with dense gauges, which would demonstrate that there are no considerable differences between the optimal networks with different numbers of rain gauges. It is seen that the variance of the best networks computed from Eq. (14) decreases slightly from 61.5 mm²/day² to 59 mm²/day² as the rain gauge number increases from 9 to 185. After a certain threshold (best network with 74 gauges), the variance levels off at a final value of 59 mm²/day², which implies that the variance of the areal mean rainfall series remains stable when the rain gauge number n is greater than a certain threshold. In addition, the variation range of the variance estimated by good networks is less than 6 mm²/day² and narrows gradually once more than 46 rain gauges are included in the good networks.
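
The variance values quoted above follow from Eq. (14), which combines the mean gauge variance with the mean inter-gauge correlation. A minimal sketch of that calculation is given below, assuming the rainfall data are held in an array with one row per time step and one column per gauge (at least two gauges). The function name `areal_mean_variance` and the synthetic data are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def areal_mean_variance(rain):
    """Variance of the areal mean rainfall following Eq. (14).

    rain: array of shape (N, n) with N time steps and n >= 2 gauges.
    """
    rain = np.asarray(rain, dtype=float)
    n = rain.shape[1]
    # Eqs. (16)-(17): variance of each gauge, dividing by N.
    gauge_var = rain.var(axis=0)
    # Eq. (15): mean of the gauge variances.
    mean_gauge_var = gauge_var.mean()
    # Eq. (18): mean correlation over all gauge pairs (upper triangle).
    corr = np.corrcoef(rain, rowvar=False)
    mean_corr = corr[np.triu_indices(n, k=1)].mean()
    # Eq. (14).
    return mean_gauge_var / n * (1.0 + mean_corr * (n - 1))

# Synthetic example: a shared regional signal plus independent gauge noise.
rng = np.random.default_rng(1)
regional = rng.gamma(shape=0.5, scale=8.0, size=(2000, 1))
noise = rng.normal(scale=2.0, size=(2000, 20))
rain = regional + noise
for n in (2, 5, 10, 20):
    print(n, areal_mean_variance(rain[:, :n]))
```

When the gauges are strongly correlated the bracketed term in Eq. (14) approaches n, so adding gauges reduces the variance only slightly, which is consistent with the levelling-off described above.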
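Fig. 5. Effect of the number of rain gauges of optimized networks on the mean and variance of areal mean rainfall. (Panel (a): annual areal mean rainfall (mm); panel (b): variance (mm²/day²); both plotted for the best network and the good networks against the number of rain gauges.)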
3.1.4. The distribution of rain gauges of the best and good network in
the catchment
For illustrative purposes, Fig. 6 gives an example of the geographical distribution of the optimal networks of nine, 19 and 37 rain gauges (the unique best and two good networks are shown for each). It is seen that the values of the three criteria indices (i.e. I, NSC and PBIAS) are similar in the best and good networks with a given number of rain gauges (e.g. the values of I/NSC/PBIAS are 0.459/0.953/0.186, 0.458/0.949/0.186 and 0.455/0.945/0.195 for the best and good networks with nine rain gauges shown in Fig. 6(a)–(c) respectively). Two characteristics of the geographical locations can be detected from the combinations that are sparsely distributed in the catchment: (1) a strong effect of the geographical location: rain gauges in the best and good networks are mainly located in the upper and middle reaches of the main stream and tributary streams, and (2) referring to Figs. 1 and 2, the rain gauges located in the mountain areas in the south and west parts of the basin, which have higher values of Information Entropy, play an important role in designing the optimal networks.

Fig. 6. The distribution of optimized rain gauge networks with sparse gauges. (a): best network of 9 gauges, (b) and (c): good networks of 9 gauges; (d): best network of 19 gauges, (e) and (f): good networks of 19 gauges; (g): best network of 37 gauges, (h) and (i): good networks of 37 gauges.

3.2. Evaluation of rainfall estimates for hydrological simulation
3.2.1. Comparison of simulation results in Xinanjiang Model and SWAT
Model
The rainfall estimation errors will be reflected in hydrological modelling performances. To test the suitability of the Information Entropy theory based multi-criteria rain gauge network optimization algorithm and to study the impact of rain gauge density and spatial location on hydrological modelling, Figs. 7 and 8 show the simulation results of the lumped Xinanjiang Model and the distributed SWAT Model at Xiangtan gauge, respectively.

Fig. 7. Simulation results of Xinanjiang Model. (a) and (c) are calibration period; (b) and (d) are validation period.

Fig. 8. Simulation results of SWAT Model. (a) and (c) are calibration period; (b) and (d) are validation period.

In the case of the lumped Xinanjiang Model, it is seen that there are no considerable differences between the models' results based on various optimal rain gauge networks in terms of Relative Error (RE) and Nash–Sutcliffe efficiency coefficient (NSC).
In the calibration period, RE (Fig. 7(a)) and NSC (Fig. 7(c)) are 0.3% and 0.92 respectively in the model using the best network of nine rain gauges. As the number of rain gauges in the best and good networks increases, the simulation results show that the values of RE lie around zero, and all |RE| values are less than 2%. Moreover, a gradually increasing trend can be observed in the values of NSC. When the models use the best network with more than 28 rain gauges, the values of NSC show nearly no differences and are approximately equal to 0.95. Considering the models based on good networks, it is seen that the variation ranges of RE and NSC of the simulation results narrow progressively with the increasing number of rain gauges.

Fig. 9. Simulated hydrographs of 1999 from hydrological models based on optimized rain gauge networks: (a), (b) and (c) are the Xinanjiang Model based on 9, 37 and 93 gauges; (d), (e) and (f) are the SWAT Model based on 9, 37 and 93 gauges.

In the validation period, similar change patterns can be observed for RE (Fig. 7(b)) and NSC (Fig. 7(d)) as in the calibration period, but the values of RE and NSC are slightly larger and lower respectively and the variation ranges are wider. The |RE| values are less than 3% for all models' simulations based on best networks, and the values of NSC show nearly no differences and are approximately equal to 0.93 when the models use the best networks including more than 19 rain gauges. However, the variation range of RE is −8% to 3% and narrows to ±4% only once more than 74 rain gauges are included in the good networks in hydrological modelling. Furthermore, the variation range of NSC is 0.89–0.92 when simulated using good networks including nine gauges and narrows gradually once 37 or more gauges are included in the good networks in modelling.
In case of the distributed SWAT Model, both the density and
distribution of rain gauges affect the model performances considerably compared with the lumped Xinanjiang Model. Similar
to the rainfall estimates, the uncertainty in runoff simulation is
strongly reduced by increasing the rain gauge numbers in the optimal networks in the distributed SWAT Model.
In the calibration period, the RE values between simulated and observed discharge for the models based on best networks range between 5% and 7% (Fig. 8(a)). For the good networks the range of RE is between 6% and 9%; it narrows gradually with the increasing number of rain gauges, especially once 56 or more rain gauges are included. Fig. 8(c) shows a clear improvement trend of NSC from 0.66 (nine rain gauges) to 0.91 (185 rain gauges). However, although the NSC of model simulations based on good networks increases, its variation range shows only a slight narrowing trend (e.g. the variation ranges of NSC are 0.61–0.72, 0.73–0.8 and 0.86–0.9 for the 9, 37 and 139 rain gauges in the model results based on good networks, respectively).
In the validation period, the models using various best networks overestimate the volume of discharge (except the models using the best networks including 74 and 139 rain gauges) and all RE values are located between 1% and 9% (Fig. 8(b)). The variation range of RE of the models using various good networks does not show a clear narrowing trend, but it is progressively more centred on zero as the number of rain gauges increases (e.g. the variation ranges of RE are 3–14%, 3–8% and −5% to 4% for the good networks of nine, 37 and 139 rain gauges respectively), which suggests that the mean error decreases in discharge simulations using the optimal networks that include more rain gauges. The values of NSC increase clearly with the increasing number of rain gauges included in the optimal networks (Fig. 8(d)). In models using best networks, the NSC increases progressively from 0.57 (nine rain gauges) to 0.86 (185 rain gauges); meanwhile, the variation range of NSC narrows gradually with the increasing number of rain gauges in the good networks used in hydrological modelling (e.g. the variation ranges of NSC are 0.43–0.78, 0.58–0.79 and 0.85–0.88 for the models based on good networks with nine, 37 and 139 rain gauges respectively), and once more than 93 rain gauges are used in simulation, the differences in NSC between the upper and lower limits are less than 0.1. If the subjectively decided acceptance domain of the model performance is |RE| < 5% and NSC > 0.8, it is seen that at least one fourth (i.e. 46 rain gauges) of the rain gauges should be included in the network when using the SWAT Model for discharge simulation in the catchment.
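
The acceptance domain above (|RE| < 5% and NSC > 0.8) can be checked directly from a pair of observed and simulated discharge series. The sketch below is illustrative only: the function names and the short discharge series are hypothetical, RE is assumed to be the relative difference between total simulated and total observed volume in percent, and NSC is taken in its usual Nash–Sutcliffe form.

```python
import numpy as np

def relative_error(obs, sim):
    """RE in percent (assumed definition: relative volume difference)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(100.0 * (sim.sum() - obs.sum()) / obs.sum())

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared errors
    to the squared deviations of the observations from their mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def is_acceptable(obs, sim, re_limit=5.0, nsc_limit=0.8):
    """True when |RE| < re_limit (percent) and NSC > nsc_limit."""
    return bool(abs(relative_error(obs, sim)) < re_limit
                and nash_sutcliffe(obs, sim) > nsc_limit)

# Hypothetical daily discharge values (m3/s) for illustration.
observed = np.array([120.0, 340.0, 910.0, 560.0, 230.0, 150.0])
simulated = np.array([130.0, 320.0, 880.0, 590.0, 240.0, 140.0])
print(relative_error(observed, simulated),
      nash_sutcliffe(observed, simulated),
      is_acceptable(observed, simulated))
```

Both limits are exposed as parameters because the text describes the acceptance domain as subjectively decided.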
As Pareto optimality does not imply maximum utility (Friedland, 2009), it is seen that the hydrologic models based on “best” rain gauge networks do not always give the best model performances compared with the “good” networks, for three reasons: (1) both the rainfall-runoff transformation processes in reality and the modelling processes are highly non-linear, (2) runoff generation and routing mechanisms are model dependent and cannot easily be generalized (for example, as shown in the paper, the lumped model and the distributed model respond differently to the density and distribution of rain gauges in a network), and (3) the criteria used to evaluate the performance of rain gauge networks and hydrological models are also different. However, good and similar performances can be achieved with the “best” and “good” networks using the lumped Xinanjiang Model with a lower number of rain gauges, and for the distributed SWAT Model good performances can be achieved with the “best” and “good” networks that contain more than a certain number of gauges.
In summary, it is seen that there is no considerable difference in
model performances for the lumped Xinanjiang Model using various gauge conﬁgurations of best and good networks. However, as
the distributed SWAT Model uses only the single rain gauge nearest to the sub-basin’s centroid as rainfall input for each sub-basin
(Galván et al., 2014), it is much more sensitive to the number of
gauges and their spatial distribution compared to a lumped model.
When using low density gauge networks, the gauge assigned to a
particular sub-basin may be quite far from its centroid.
Consequently, there may be large errors between the rainfall used
by the SWAT model for this sub-basin and the ‘‘true’’ areal rainfall.
This will result in large errors in discharge simulation for the
affected sub-basins. Moreover, in an extreme, idealised situation, good simulation results can be achieved in the lumped Xinanjiang Model using a network that includes only one well-located rain gauge which can perfectly represent the areal mean rainfall, whereas the distributed SWAT Model requires a minimum number of properly located rain gauges to generate acceptable simulation results.
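
The sensitivity described above comes from the rule that each sub-basin takes its rainfall from the single gauge nearest to its centroid. The following sketch illustrates that assignment rule only; it is not SWAT code. The function name `nearest_gauge`, the planar coordinates in kilometres and the example layouts are assumptions made for illustration.

```python
import numpy as np

def nearest_gauge(centroids, gauges):
    """For each sub-basin centroid, return the index of the closest gauge
    and the distance to it.

    Assumption: coordinates are planar (e.g. projected, in km), so the
    Euclidean distance is appropriate.
    """
    centroids = np.asarray(centroids, dtype=float)  # shape (m, 2)
    gauges = np.asarray(gauges, dtype=float)        # shape (n, 2)
    # Pairwise distances between every centroid and every gauge.
    diff = centroids[:, None, :] - gauges[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))         # shape (m, n)
    idx = dist.argmin(axis=1)
    return idx, dist[np.arange(centroids.shape[0]), idx]

# Three hypothetical sub-basin centroids along a valley (km).
centroids = [(0.0, 0.0), (12.0, 0.0), (20.0, 0.0)]

# Sparse network: a single gauge serves every sub-basin.
idx_sparse, dist_sparse = nearest_gauge(centroids, [(1.0, 0.0)])

# Denser network: a second gauge shortens the assignment distances.
idx_dense, dist_dense = nearest_gauge(centroids, [(1.0, 0.0), (19.0, 0.0)])
print(dist_sparse, dist_dense)
```

In the sparse layout the third sub-basin is served by a gauge 19 km from its centroid, whereas the denser layout brings that distance down to 1 km, which mirrors the argument that low-density networks can feed a sub-basin with rainfall that is unrepresentative of its true areal value.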
3.2.2. Comparison of hydrographs in Xinanjiang Model and SWAT
Model
For illustrative purposes, an example of hydrographs is generated using the calibrated Xinanjiang Model and SWAT Model (Fig. 9) with optimal networks of nine, 37 and 93 rain gauges for the period of 1st January 1999–31st December 1999 at Xiangtan gauge. It is seen that: (1) the major temporal characteristics of the dynamic process of discharge can be captured in both hydrological models by using the optimal networks with different rain gauge numbers; (2) the ranges of the simulated hydrographs from the good networks narrow gradually with the increasing number of rain gauges in the catchment; and (3) compared with the lumped Xinanjiang Model (Fig. 9(a)–(c)), the hydrographs derived from the distributed SWAT Model (Fig. 9(d)–(f)) are more likely to overestimate or underestimate the peak flows, and the ranges of the simulated hydrographs are also wider.
4. Conclusions and outlook
This paper designs an entropy theory based multi-criteria rain gauge resampling method to investigate the influences of the optimal gauge networks with various rain gauge densities and gauge locations on the performance of lumped and distributed hydrological models. Several aspects of the results in the study reveal that the rain gauge networks selected by this method are robust and optimal. It is concluded from the study that:
1. There is no signiﬁcant difference between the annual areal
mean rainfall estimated by the benchmark network (185
gauges) and the optimal rain gauge networks with different
gauge densities, although the spatial distribution patterns
change with the number of stations and the location of the stations. This is the main reason why the performance of the
lumped Xinanjiang Model does not change much with the number of rain gauges used but the distributed SWAT model does.
Meanwhile, the effect of an increase in the number of rain
gauges in optimal networks on the variance reduction of mean
areal precipitation is not obvious.
2. In the case of optimal rain gauge networks (best and good), it is seen that most of the rain gauges are distributed in the upper and middle reaches of the main stream and tributaries or in the mountain areas.
3. For the best and good rain gauge networks, the lumped Xinanjiang Model gives small relative errors (all |RE| < 2%) and high values of the Nash–Sutcliffe efficiency coefficient (all NSC > 0.92) in the calibration period, while in the validation period the relative errors (−8% < RE < 3%) and the values of NSC (>0.89) are slightly larger and lower respectively when using the good networks with only nine rain gauges in simulation.
4. In general, the performance of the distributed SWAT Model is somewhat lower than that of the lumped Xinanjiang Model, and larger variability can also be observed in simulated runoff using optimal networks with different rain gauge densities. The relative errors are 6% to 9% and 7% to 14% for the calibration and validation periods respectively. However, a clear improving trend can be observed in the values of NSC as more rain gauges are included in the optimal networks.
It should be noted that (1) the multi-criteria objective functions
can be changed depending on different design objectives and
hydrological models to provide improved discharge simulations.
(2) Following the idea of Chen et al. (2008), we select the unique
rain gauge with maximum Information Entropy in the ﬁrst step in
designing the optimal rain gauge networks; however, the x rain
gauges which simultaneously fulﬁl the conditions of: (i) having
the values of Information Entropy larger than a certain threshold
and (ii) the arithmetic mean Mutual Information is below a certain
threshold can be considered in the ﬁrst step in rain gauge network
design in the future studies.
(3) In this study region, a high-density network of good quality rain gauges is available, which can be used as “true” or benchmark precipitation. However, in many catchments good quality, high-density rain gauge data may not be available; satellite-based precipitation data and other global rainfall datasets with high spatial–temporal resolution have been used by some researchers as an alternative “true” precipitation (Krajewski and Smith, 2002; Germann et al., 2006; Germann et al., 2009; Adjei et al., 2015; Al-Mukhtar et al., 2014; Castro et al., 2015; Kang and Merwade, 2014; etc.).
(4) As discussed in Section 3.2.1, the “best” rain gauge networks do not always give the best performances of hydrological models due to the nonlinear behaviour of rainfall-runoff transformation processes and modelling processes, differences in runoff generation and routing mechanisms among the models, and differences in the criteria used to evaluate the performance of rain gauge networks and hydrological models. However, this method is time saving and guarantees that only small differences arise between hydrological simulations using the optimal networks and those using the densest benchmark network.
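
The alternative first step suggested in point (2), selecting several starting gauges that combine high Information Entropy with low mean Mutual Information, could be sketched as follows. This is a hypothetical illustration of a proposal the paper leaves to future studies: the function name `first_step_candidates`, both threshold values and the small example matrix are assumptions, and the mean Mutual Information of a gauge is taken here as the average over its pairings with all other gauges.

```python
import numpy as np

def first_step_candidates(entropy, mutual_info, h_min, i_max):
    """Indices of gauges with entropy above `h_min` and mean mutual
    information with all other gauges below `i_max` (both in bits)."""
    entropy = np.asarray(entropy, dtype=float)   # shape (n,)
    mi = np.asarray(mutual_info, dtype=float)    # shape (n, n), symmetric
    n = entropy.size
    # Ignore the diagonal (a gauge paired with itself).
    off_diag = ~np.eye(n, dtype=bool)
    mean_mi = np.where(off_diag, mi, 0.0).sum(axis=1) / (n - 1)
    return np.flatnonzero((entropy > h_min) & (mean_mi < i_max))

# Three hypothetical gauges (values are illustrative only).
entropy = [4.7, 4.2, 4.8]
mutual_info = [[0.0, 0.5, 0.3],
               [0.5, 0.0, 0.6],
               [0.3, 0.6, 0.0]]
selected = first_step_candidates(entropy, mutual_info, h_min=4.6, i_max=0.43)
print(selected)
```

With these numbers only the first gauge passes both conditions: the third has the highest entropy but shares too much information with its neighbours.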
Appendix A
References
1. The deduction of Eq. (8)
HðYjXÞ ¼
X
"
#
X
X
½pðxÞHðYjX ¼ xÞ ¼ pðxÞð ðpðyjxÞlog 2 pðyjxÞÞÞ
x2X
x2X
y2Y
XX
X
½pðx; yÞlog 2 pðyjxÞ ¼ ½pðx; yÞlog 2 pðyjxÞ
¼
x2X y2Y
x2X;y2Y
X pðx; yÞ
¼
pðx; yÞlog 2
pðxÞ
x2X;y2Y
¼
X pðxÞ
pðx; yÞlog 2
pðx; yÞ
x2X;y2Y
(
¼
X
)
½pðx; yÞlog 2 pðx; yÞ
(
þ
x2X;y2Y
X
)
½pðx; yÞlog 2 pðxÞ
x2X;y2Y
(
)
X
½pðxÞlog 2 pðxÞ ¼ HðX; YÞ HðXÞ
¼ fHðX; YÞg þ
x2X
2. The deduction
of Eq. (10)
X pðx; yÞ
pðx; yÞlog2
pðxÞpðyÞ
x2X;y2Y
(
) (
)
X
X
½pðx;yÞlog2 pðx;yÞ ½pðx; yÞlog2 ðpðxÞpðyÞÞ
¼
IðX; YÞ ¼
x2X;y2Y
(
¼
X
x2X;y2Y
)
½pðx;yÞlog2 ðpðxÞpðyjxÞÞ
x2X;y2Y
(
(
¼
)
X
½pðx; yÞlog2 ðpðxÞpðyÞÞ
x2X;y2Y
X
x2X;y2Y
(
)
½pðx; yÞlog2 pðyjxÞ
x2X;y2Y
X
X
½pðx;yÞlog2 pðxÞ þ
X
½pðx; yÞlog2 pðxÞ þ
x2X;y2Y
)
½pðx; yÞlog2 pðyÞ
x2X;y2Y
( "
!
#
)
X X
X
¼
pðx; yÞ log2 pðxÞ þ
½pðx; yÞlog2 pðyjxÞ
x2X
( "
X
y2Y
x2X;y2Y
!#
"
!#)
X
X
X
log2 pðxÞ
pðx; yÞ þ
log2 pðyÞ
pðx;yÞ
x2X
y2Y
y2Y
x2X
(
)
X
X
¼
½pðxÞlog2 pðxÞ þ
½ðpðxÞpðyjxÞÞlog2 pðyjxÞ
x2X
x2X;y2Y
(
)
X
X
½ðlog2 pðxÞÞpðxÞ þ
½ðlog2 pðyÞÞpðyÞ
x2X
y2Y
(
"
#)
X
X
X
½pðxÞlog2 pðxÞ þ
pðxÞ ðpðyjxÞlog2 pðyjxÞÞ
¼
x2X
x2X
y2Y
f½HðYÞ þ ½HðXÞg
(
)
X
X
½pðxÞlog2 pðxÞ þ
½pðxÞðHðY jX ¼ xÞÞ
¼
x2X
x2X
f½HðYÞ þ ½HðXÞg
¼ f½HðXÞ þ ½HðYjXÞg f½HðYÞ þ ½HðXÞg
¼ HðYÞ HðYjXÞ
Adjei, K.A., Ren, L.L., Appiah-Adjei, M.K., Odai, S.N., 2015. Application of satellite-derived rainfall for hydrological modelling in the data-scarce Black Volta transboundary basin. Hydrology Research, in press. doi: http://dx.doi.org/10.2166/nh.2014.111.
Ali, A., Lebel, T., Amani, A., 2003. Invariance in the spatial structure of Sahelian rain
ﬁelds at climatological scales. J. Hydrometeorol. 4, 996–1011.
Al-Mukhtar, M., Dunger, V., Merkel, B., 2014. Evaluation of the climate generator
model CLIGEN for rainfall data simulation in Bautzen catchment area, Germany.
Hydrol. Res. 45 (4–5), 615–630. http://dx.doi.org/10.2166/nh.2013.073.
Anctil, F., Lauzon, N., Andréassian, V., Oudin, L., Perrin, C., 2006. Improvement of
rainfall-runoff forecasts through mean areal rainfall optimization. J. Hydrol.
328, 717–725.
Arnold, J.G., Srinivasan, R., Muttiah, R.S., Williams, J.R., 1998. Large area hydrologic
modeling and assessment Part I: model development. J. Am. Water Resour.
Assoc. 34, 73–89.
Barca, E., Passarella, G., Uricchio, V., 2008. Optimal extension of the rain gauge
monitoring network of the Apulian Regional Consortium for Crop Protection.
Environ. Monitor. Assess. 145 (1–3), 375–386.
Bárdossy, A., Das, T., 2008. Inﬂuence of rainfall observation network on model
calibration and application. Hydrol. Earth Syst. Sci. Discuss. 12, 77–89.
Bhattacharyya, S., Sanyal, G., 2012. A robust image steganography using DWT
difference modulation (DWTDM). Int. J. Comput. Network Inform. Security 7,
27–40.
Booij, M.J., 2002. Extreme daily precipitation in Western Europe with climate
change at appropriate spatial scales. Int. J. Climatol. 22, 69–85.
Borwein, J., Howlett, P., Piantadosi, J., 2014. Modelling and simulation of seasonal
rainfall using the principle of maximum entropy. Entropy 16 (2), 747–769.
Bras, R.L., 1990. Hydrology: An Introduction to Hydrologic Science. Addison-Wesley
Publishing Company.
Bush, S.F., 2010. Nanoscale Communication Networks. Artech House.
Castro, L.M., Salas, M., Fernández, B., 2015. Evaluation of TRMM Multi-satellite
precipitation analysis (TMPA) in a mountainous region of the central Andes
range with a Mediterranean Climate. Hydrology Research, in press. http://dx.doi.org/10.2166/nh.2013.096.
Chebbi, A., Bargaoui, Z.K., Cunha, M.D.C., 2011. Optimal extension of rain gauge
monitoring network for rainfall intensity and erosivity index interpolation. J.
Hydrol. Eng. 16 (8), 665–676.
Chen, Y.C., Wei, C., Yeh, H.C., 2008. Rainfall network design using Kriging and
entropy. Hydrol. Process. 22, 340–346.
Cheng, K.S., Lin, Y.C., Liou, J.J., 2008. Rain gauge network evaluation and
augmentation using geostatistics. Hydrol. Process. 22, 2554–2564.
Cover, T.M., Thomas, J.A., 2012. Elements of Information Theory. John Wiley & Sons.
Desilets, S.L., Ferré, T.P., Ekwurzel, B., 2008. Flash ﬂood dynamics and composition
in a semiarid mountain watershed. Water Resour. Res. 44, W12436.
Dong, X., Dohmen-Janssen, C.M., Booij, M.J., 2005. Appropriate spatial sampling of
rainfall for flow simulation. Hydrol. Sci. J. 50, 279–297.
Friedland, J. (Ed.), 2009. Doing Well and Good: The Human Face of the New
Capitalism. Information Age Publishing Inc.
Galván, L., Olías, M., Izquierdo, T., Cerón, J.C., Fernández de Villarán, R., 2014.
Rainfall estimation in SWAT: an alternative method to simulate orographic
precipitation. J. Hydrol. 509, 257–265.
Germann, U., Galli, G., Boscacci, M., Bolliger, M., 2006. Radar precipitation measurement
in a mountainous region. Quarterly J. Royal Meteorol. Soc. 132, 1669–1692.
Germann, U., Berenguer, M., Sempere-Torres, D., Zappa, M., 2009. REAL—ensemble
radar precipitation estimation for hydrology in a mountainous region. Quart. J.
Royal Meteorol. Soc. 135, 445–456.
Guiasu, S., Shenitzer, A., 1985. The principle of maximum entropy. Math. Intell. 7,
42–48.
Gull, S.F., 1989. Developments In Maximum Entropy Data Analysis. Maximum
Entropy and Bayesian Methods. Springer, Netherlands, pp. 53–71.
Gupta, H.V., Sorooshian, S., Yapo, P.O., 1998. Toward improved calibration of
hydrologic models: Multiple and noncommensurable measures of information.
Water Resour. Res. 34, 751–763.
Gupta, H.V., Sorooshian, S., Gao, X., Imam, B., Hsu, K., Bastidas, L., Li, J., Mahani, S.,
2002. The challenge of predicting ﬂash ﬂoods from thunderstorm rainfall.
Philos. Trans. Royal Soc. Lond. Ser. A: Math., Phys. Eng. Sci. 360, 1363–1371.
Gupta, H.V., Sorooshian, S., Hogue, T.S., Boyle, D.P., 2003. Advances in automatic
calibration of watershed models. Water Sci. Appl. 6, 9–28.
Hackett, O.M., 1966. National water data program. J. Am. Water Works Assoc. 58,
786–792.
Ihara, S., 1993. Information Theory for Continuous Systems. World Scientiﬁc,
Singapore.
Jayawardena, A.W., 2014. Environmental and Hydrological Systems Modelling. CRC
Press.
Kang, K., Merwade, V., 2014. The effect of spatially uniform and non-uniform
precipitation bias correction methods on improving NEXRAD rainfall accuracy
for distributed hydrologic modeling. Hydrol. Res. 45 (1), 23–42. http://dx.doi.org/10.2166/nh.2013.194.
Kizza, M., Rodhe, A., Xu, C.-Y., Ntale, H.K., Halldin, S., 2009. Temporal rainfall
variability in the Lake Victoria Basin in East Africa during the Twentieth
Century. Theor. Appl. Climatol. 98, 119–135.
Krajewski, W.F., Smith, J.A., 2002. Radar hydrology: rainfall estimation. Adv. Water
Resour. 25, 1387–1394.
Krstanovic, P.F., Singh, V.P., 1992. Evaluation of rainfall networks using entropy: I.
Theoretical development. Water Resour. Manage. 6, 279–293.
Li, H., Zhang, Y., Chiew, F.H.S., Xu, S., 2009. Predicting runoff in ungauged
catchments by using Xinanjiang Model with MODIS leaf area index. J. Hydrol.
370, 155–162.
Li, H., Beldring, S., Xu, C.-Y., 2014. Implementation and testing of routing algorithms
in the distributed HBV model for mountainous catchments. Hydrol. Res. 45 (3),
322–333.
Lin, J., 1991. Divergence measures based on the Shannon entropy. Inform. Theory,
IEEE Trans. 37 (1), 145–151.
Liu, B., Chen, X., Lian, Y., Wu, L., 2013. Entropy-based assessment and zoning of
rainfall distribution. J. Hydrol. 490, 32–40.
MacKay, D.J., 2003. Information Theory, Inference and Learning Algorithms.
Cambridge University Press.
Maruyama, T., Kawachi, T., Singh, V.P., 2005. Entropy-based assessment and
clustering of potential water resources availability. J. Hydrol. 309 (1), 104–113.
Mason, R.R., York, T.H., 1997. Streamﬂow information for the nation. U.S. Geological
Survey.
Nash, J., Sutcliffe, J., 1970. River flow forecasting through conceptual models part I—
A discussion of principles. J. Hydrol. 10, 282–290.
Neitsch, S.L., Arnold, J.G., Kiniry, J.R., Williams, J.R., King, K.W., 2002. Soil and Water
Assessment Tool Theoretical Documentation, Version 2000. Texas, USA.
Neitsch, S.L., Arnold, J.G., Kiniry, J.R., Williams, J.R., King, K.W., 2011. Soil and Water
Assessment Tool Theoretical Documentation, Version 2009. Texas, USA.
Nour, M., Smit, D., El-Din, M., 2006. Geostatistical mapping of precipitation:
implications for rain gauge network design. Water Sci. Technol. 53 (10),
101–110.
Pardo-Igúzquiza, E., 1998. Optimal selection of number and location of rainfall
gauges for areal rainfall estimation using geostatistics and simulated annealing.
J. Hydrol. 210, 206–220.
Patra, K.C., 2001. Hydrology and Water Resources Engineering. Alpha Science
International Limited.
Perks, A., Winkler, T., Stewart, B., 1996. The Adequacy of Hydrological Networks: A
Global Assessment. Secretariat of the World Meteorological Organization.
Pethel, S.D., Hahs, D.W., 2014. Exact test of independence using mutual information.
Entropy 16 (5), 2839–2849.
Pyrce, R.S., 2004. Review and Analysis of Stream Gauge Networks for the Ontario
Stream Gauge Rehabilitation Project. Watershed Science Centre, Trent
University, Peterborough, Ontario, Canada.
Rodriguez-Iturbe, I., Mejía, J.M., 1974. On the transformation from point rainfall to
areal rainfall. Water Resour. Res. 10, 729–735.
Sang, Y.F., 2013. Wavelet entropy-based investigation into the daily precipitation
variability in the Yangtze River Delta, China, with rapid urbanizations. Theor.
Appl. Climatol. 111 (3–4), 361–370.
Segond, M.L., Wheater, H.S., Onof, C., 2007. The significance of spatial rainfall
representation for flood runoff estimation: a numerical evaluation based on the
Lee catchment, UK. J. Hydrol. 347 (1), 116–131.
Shafiei, M., Ghahraman, B., Saghafian, B., Pande, S., Gharari, S., Davary, K., 2014.
Assessment of rain-gauge networks using a probabilistic GIS based approach.
Hydrol. Res. 45 (4–5), 551–562.
Shannon, C.E., 1949. Communication theory of secrecy systems. Bell Syst. Tech. J. 28, 656–715.
Singh, V.P., 1995. Computer Models of Watershed Hydrology. Water Resources
Publications. ISBN 0-918334-91-8.
Smith, C.R., Grandy, W., Jr. (Eds.), 1985. Maximum-Entropy and Bayesian Methods
In Inverse Problems, vol. 14. Springer.
Steuer, R., Kurths, J., Daub, C.O., Weise, J., Selbig, J., 2001. The mutual information:
detecting and evaluating dependencies between variables. Bioinformatics 18,
S231–S240.
St-Hilaire, A., Ouarda, T.B., Lachance, M., Bobée, B., Gaudet, J., Gignac, C., 2003.
Assessment of the impact of meteorological network density on the
estimation of basin precipitation and runoff: a case study. Hydrol. Process. 17,
3561–3580.
Stokstad, E., 1999. Scarcity of rain, stream gages threatens forecasts. Science 285,
1199–1200.
Strange, B.A., Duggins, A., Penny, W., Dolan, R.J., Friston, K.J., 2005. Information
theory, novelty and hippocampal responses: unpredicted or unpredictable?
Neural Networks 18, 225–230.
Tapiador, F.J., 2007. A maximum entropy analysis of global monthly series of rainfall
from merged satellite data. Int. J. Rem. Sens. 28 (6), 1113–1121.
Tsintikidis, D., Georgakakos, K.P., Sperfslage, J.A., Smith, D.E., Carpenter, T.M., 2002.
Precipitation uncertainty and rain gauge network design within Folsom Lake
watershed. J. Hydrol. Eng. 7, 175–184.
Visessri, S., McIntyre, N., 2012. Comparison between the TRMM Product and Rainfall
Interpolation for Prediction in Ungauged Catchments. In: 2012 International
Congress on Environmental Modelling and Software Managing Resources of a
Limited Planet, Sixth Biennial Meeting, Leipzig, Germany.
Volkmann, T.H., Lyon, S.W., Gupta, H.V., Troch, P.A., 2010. Multicriteria design of
rain gauge networks for flash flood prediction in semiarid catchments with
complex terrain. Water Resour. Res. 46, W11554.
Wei, C., Yeh, H.C., Chen, Y.C., 2014. Spatiotemporal scaling effect on rainfall network
design using entropy. Entropy 16 (8), 4626–4647.
William, H., 2007. Numerical Recipes 3rd edition: The Art of Scientific Computing.
Cambridge University Press.
WMO (World Meteorological Organization), 1994. Guide to Hydrological Practices.
168. WMO: Geneva, Switzerland.
Xu, H., Xu, C.Y., Chen, H., Zhang, Z.X., Li, L., 2013. Assessing the influence of rain
gauge density and distribution on hydrological model performance in a humid
region of China. J. Hydrol. 505, 1–12.
Yao, Y.Y., 2003. Information-theoretic measures for knowledge discovery and data
mining. Entropy Measures, Maximum Entropy Principle and Emerging
Applications, pp. 115–136.
Yao, C., Li, Z., Bao, H., Yu, Z., 2009. Application of a developed Grid-Xinanjiang Model to
Chinese watersheds for flood forecasting purpose. J. Hydrol. Eng. 14, 923–934.
Yatheendradas, S., Wagener, T., Gupta, H., Unkrich, C., Goodrich, D., Schaffner, M.,
Stewart, A., 2008. Understanding uncertainty in distributed flash flood
forecasting for semiarid regions. Water Resour. Res. 44, W05S19.
Yevjevich, V., 1972. Probability and Statistics in Hydrology. Water Resources
Publications, Fort Collins, Colorado, USA.
Yoo, C., Jung, K., Lee, J., 2008. Evaluation of rain gauge network using entropy
theory: comparison of mixed and continuous distribution function applications.
J. Hydrol. Eng. 13 (4), 226–235.
Yuan, F., Ren, L.L., Yu, Z.B., Zhu, Y.H., Xu, J., Fang, X.Q., 2012. Potential natural vegetation
dynamics driven by future long-term climate change and its hydrological impacts
in the Hanjiang River basin, China. Hydrol. Res. 43, 73–90.
Zhang, Z., Yeung, R.W., 1998. On characterization of entropy function via information
inequalities. IEEE Trans. Inform. Theory 44 (4), 1440–1452.
Zhang, D.R., Zhang, L.R., Guan, Y.Q., Chen, X., Chen, X.F., 2012. Sensitivity analysis of
Xinanjiang rainfall–runoff model parameters: a case study in Lianghui, Zhejiang
province, China. Hydrol. Res. 43, 123–134.
Zhao, R.J., 1992. The Xinanjiang Model applied in China. J. Hydrol. 135, 371–381.
Zhao, R.J., Zhuang, Y.L., Fang, L.R., Liu, X.R., Zhang, Q.S., 1980. The Xinanjiang
Model. Hydrological Forecasting Proceedings Oxford Symposium. IASH 129,
351–356.
Zhao, R.J., Liu, X., Singh, V.P., 1995. The Xinanjiang Model. Computer Models of
Watershed Hydrology. Water Resources Publications, pp. 215–232.