
INT. J. GEOGRAPHICAL INFORMATION SYSTEMS, 1996, VOL. 10, NO. 5, 605-627
Research Article
The geography of parameter space: an investigation of spatial
non-stationarity
A. STEWART FOTHERINGHAM, MARTIN CHARLTON
NE.RRL/Department of Geography, University of Newcastle upon Tyne,
Newcastle upon Tyne, NE1 7RU, England, U.K.
and CHRIS BRUNSDON
NE.RRL/Department of Town and Country Planning, University of Newcastle
upon Tyne, Newcastle upon Tyne, NE1 7RU, England, U.K.
(Received 27 September 1994; Accepted 3 March 1995)
1. Introduction
In line with the current information revolution, geographers increasingly make
a distinction between the analysis of spatial data and spatial data analysis. While
both types of analysis investigate data having geographical co-ordinates, the former
effectively ignores the geographical component and treats the data as if they were
aspatial (inter alia, Flowerdew and Green 1994), whereas the latter makes use of the
geographical component to explore explicitly spatial aspects of the data (inter alia,
Anselin 1988, Besag 1986). It is this latter characteristic which makes spatial data
analysis such a rich field for exploratory investigation. To clarify the distinction,
suppose one is interested in the relation between an attribute Y and a set of attributes
X1, X2, ..., XN where the data all have spatial co-ordinates. A frequently used
procedure to examine relations in such a situation is to regress Y on the set of X
variables using the full data set. That is, observations from all the spatial units are
used to produce a 'global' set of parameter estimates that relate the variable Y to
each of the X's. In following such a procedure, a great deal of spatial information is
lost in that the relations being examined may in fact exhibit significant spatial
variation which is not uncovered in the estimation of the global parameter estimates.
This variation is referred to as spatial non-stationarity.
In this paper, we propose an exploratory method of spatial data analysis which
allows spatial variations in relations to be examined visually. It is useful to note at
this stage that exploratory analysis is of two types: pre-confirmatory, which focusses
on data accuracy, and post-confirmatory, which focusses on model accuracy.
However, neither type of exploratory analysis is exclusively data- or model-orientated.
Pre-confirmatory exploratory analysis can be used to suggest relations to be included
within a modelling framework and post-confirmatory analysis can be useful in
highlighting unusual aspects of data. The exploratory analysis described here is
post-confirmatory and we are concerned with examining differences between a 'global'
calibration of a model and many different 'local' calibrations. In particular, we are
interested in exploring spatial variations in model calibration results. We would
argue that this method has the following advantages over more conventional analysis:
0269-3798/96 $12.00 © 1996 Taylor & Francis Ltd
1. It allows greater insights into the nature and accuracy of the data being
examined;
2. It provides a more detailed understanding of the nature of relations and their
variation over space;
3. It demonstrates the possible naiveté of conventional approaches to data
analysis that often ignore spatial non-stationarity; and
4. It allows a more detailed comparison of the relative performances of different
types of analysis or different models.
2. Visualisation of spatial non-stationarity
Consider a region divided into R rows and C columns in which a typical analysis
is undertaken so that R by C observations are used to produce a single set of
parameter estimates (the analysis of spatial data or global approach). An alternative
technique involves placing a moving window of dimensions r by c (where r < R and
c < C) over every cell of the matrix for which the window can be centred without
any part of it lying outside the region. For every placement of the moving window
the relation between Y and each of the X's is estimated so that a set of parameter
estimates for each relation is obtained (see figure 1). These sets of parameter estimates
have spatial co-ordinates and can therefore be mapped to show how the relation
between Y and a particular X varies over space (a spatial data analysis or local
approach). The distinction between the terms global and local as used here is similar
to that between the analysis of spatial data and spatial data analysis. Obviously a
spatial goodness-of-fit statistic and other diagnostics can be mapped in the same
way. The key outputs then are two- and three-dimensional surfaces of parameter
estimates, their t-values and an R-squared statistic. Two-dimensional plots are also
presented to compare the relative performance of global and local calibration
methods.
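The moving window procedure described above can be sketched in a few lines of Python. The sketch is illustrative only: the function name `moving_window_ols`, the array layout and the use of NumPy's least-squares solver are assumptions made for this example and do not describe the software used in the study.

```python
import numpy as np

def moving_window_ols(y, xs, r=7, c=7):
    """Fit y = a + sum_j b_j * x_j by OLS inside every r-by-c window.

    y  : 2-D array of shape (R, C), the dependent variable raster.
    xs : 3-D array of shape (p, R, C), one raster per independent variable.
    Returns an array of shape (p + 1, R - r + 1, C - c + 1) holding the
    intercept (index 0) and the p slope estimates for each window position.
    """
    p, R, C = xs.shape
    out = np.full((p + 1, R - r + 1, C - c + 1), np.nan)
    for i in range(R - r + 1):            # top row of the window
        for j in range(C - c + 1):        # left column of the window
            yw = y[i:i + r, j:j + c].ravel()
            xw = xs[:, i:i + r, j:j + c].reshape(p, -1).T
            design = np.column_stack([np.ones(yw.size), xw])
            coef, *_ = np.linalg.lstsq(design, yw, rcond=None)
            out[:, i, j] = coef
    return out
```

Each layer of the returned array is one parameter surface: element (i, j) holds the estimate obtained from the window whose top-left cell is (i, j), so for odd r and c it refers to the cell at the centre of that window.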
It is noted at the outset that the detection of spatial non-stationarity depends on
a subjective impression of the roughness of a three-dimensional surface of parameter
estimates although work is currently in progress to produce a more objective
comparative procedure. Clearly, if spatial non-stationarity does not exist, the parameter
Figure 1. Moving window parameter estimation.
surface depicting spatial variations in relations will be smooth and increasing spatial
non-stationarity will be indicated by increasing divergence from this smooth surface.
It is further noted that a square window is being employed and that other shapes
of windows could be used. Perhaps the most intuitive window shape would be
circular so that cells are included that are within a pre-specified distance from the
centre of the window. However, a square window was chosen as being more
representative of regions that are often defined for data analysis. It is more likely, for example,
that a square or rectangular subset of the data is used to calibrate a model; rarely
are study regions circular. As an example, the global region which forms the basis
of this analysis, and which is described below, is rectangular.
The technique proposed is similar to other developments in spatial data analysis
such as those related to the modifiable areal unit problem by, inter alia, Fotheringham
and Wong (1991) where the sensitivity of analytical results to variations in the spatial
units for which data are observed is examined. Both studies utilise GIS to examine
spatial aspects of the sensitivity of model performance. However, whereas the
investigations of the modifiable areal unit problem keep the study area boundary fixed and
examine the performance of an analytical method when the study space is divided
in different ways, the technique described here in effect varies the study region so
that different subsets of the data are used to calibrate a model. The technique is also
part of a growing field of analytical techniques that provide spatial disaggregations
of more traditional global statistics. Other examples include the development of a
localised spatial association statistic by Getis and Ord (1991), the generation of
variogram clouds and pocket plots by Cressie (1991), and the spatial expansion of
parameter estimates by Jones and Casetti (1992) and by Fotheringham and Pitts
(1995).
3. An example using OLS and bi-square regression
The basic aim of this example is to examine the presence of spatial
non-stationarity in the calibration of a regression model. The model attempts to connect
people and the environment by investigating the relations between population density
and a series of attributes of the physical landscape, namely: elevation, land cover
and soil type. The data on these variables are all taken from an area in north-east
Scotland and are described more fully below.
3.1. Data
The data refer to an area 38.45 km by 51.2 km in the north-east of Scotland
which is described in detail in Aspinall and Veitch (1993) and consist of a population
density measure as the dependent variable, and ground elevation, land cover, and
soil type as the independent variables. The object of this example is to provide a
demonstration of the visualisation of spatial non-stationarity rather than to model
the socio-physical processes affecting the settlement pattern in the study area, but in
the interests of having a reasonably plausible model on which to work, it might be
expected that the distribution of population is related to land cover, elevation, and
(perhaps more speculatively) the underlying soil type. The models are based on a
1 km raster.
Counts of residential population for each census enumeration district in the study
area were extracted from the 1991 Census of Population together with the
co-ordinates of the centroid of each enumeration district. The count data were
aggregated to the 1 km raster on an 'all or nothing' basis; the raster is relatively
crude by comparison with that used by Martin (1989) in allocating centroid-based
census data to a raster. The density for a given grid square l is calculated as:
$$\mathrm{POPDEN}_l = \sum_i \left( \mathrm{population}_i : 1 > \sqrt{(x_i - x_l)^2 + (y_i - y_l)^2} \right) \qquad (1)$$
where (x_l, y_l) is the co-ordinate of the lower left-hand corner of grid square l.
The elevation raster was resampled from a 1:250 000 scale digital terrain model,
originally sampled every 100 m. The land cover data were extracted from a 50 m
raster coverage, captured at 1:50 000; the data were reclassed into 'woodland',
'grassland' and 'other', each being expressed as a percentage of the land area in a
1 km output raster. The soils data were extracted from 100 m raster data, captured
at 1:250 000 scale and reclassed into 'podsols' and 'other soil types'; in the output
raster, the percentage of the land area that is composed of podsols is recorded.
3.2. OLS regression
The above data were standardised by transforming them into Z-scores and these
were used to calibrate the following model:
$$\mathrm{POPDEN} = \alpha + \beta_1\,\mathrm{EL} + \beta_2\,\mathrm{WOOD} + \beta_3\,\mathrm{GRASS} + \beta_4\,\mathrm{PODS} \qquad (2)$$
such that global estimates of the four slope parameters as well as a global
goodness-of-fit indicator, the R-squared statistic, were obtained. The globally fitted model is:
$$\mathrm{POPDEN} = 0 - \underset{(16.4)}{0.43}\,\mathrm{EL} - \underset{(2.2)}{0.05}\,\mathrm{WOOD} - \underset{(4.1)}{0.11}\,\mathrm{GRASS} + \underset{(4.8)}{0.10}\,\mathrm{PODS} \qquad (3)$$
with an R-squared value of 0.26. The values in parentheses are the absolute values
of the t-statistics associated with each parameter estimate.
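The calibration steps just described (Z-score standardisation, least squares estimation, absolute t-statistics and R-squared) can be sketched as follows. This is a minimal illustration on the assumption that the data are held as NumPy arrays; the function names `zscore` and `ols_summary` are hypothetical and the sketch is not the code used to produce equation (3).

```python
import numpy as np

def zscore(a):
    """Standardise each column (or a 1-D vector) to mean 0 and unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def ols_summary(y, X):
    """OLS with an intercept.

    y : 1-D array of n observations; X : 2-D array of shape (n, p).
    Returns (coefficients, absolute t-statistics, R-squared), with the
    intercept in position 0 of the first two outputs.
    """
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (n - p - 1)             # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)  # covariance of estimates
    t_abs = np.abs(coef / np.sqrt(np.diag(cov)))
    centred = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centred @ centred)
    return coef, t_abs, r2
```

Because both sides of the regression are standardised, the fitted intercept is zero up to rounding error, which is consistent with the zero constant in equation (3).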
The results suggest that although there remains a considerable amount of variance
to be explained, there do seem to be some significant relations between population
density and features of the physical landscape. Population density decreases
significantly as elevation rises, as the proportion of both woodland and grassland rises, and
increases significantly in areas where podsols are found. A not unreasonable
interpretation of these results would be that in this part of Scotland, settlements are
found primarily in low-lying areas which are cleared of woodland and grassland and
which have fairly rich podsolic soils conducive to agriculture. We explore these
interpretations in more detail below.
3.3. Bi-square regression
One of the major problems with ordinary least squares regression is its sensitivity
to outlying observations as shown in figure 2 and as described by many authors
(inter alia, Unwin and Wrigley 1987, Wrigley 1983). Although a general trend is
followed by most observations, there is one exception that unduly influences the
position of the regression line.
Here the difficulty can be overcome by manual intervention, simply by excluding
the outlying case from the regression analysis. However, this is only possible when
the data can be visualised. If the regression model had three or more independent
variables, outliers would have to be identified in Euclidean space of at least four
dimensions. For this reason, some automated controlling for outliers needs to be
adopted.
In its crudest form, this can be thought of as zero-weighting the offending
observations. These can be identified in a two-stage process:
Figure 2. The effect of outliers on least squares regression. (Legend: original fitted line; line distorted by outlier; data points. Vertical axis: Y-value.)
1. Fit the model on all the data and compute the residuals.
2. Determine outliers by looking at the values of the residuals. Re-fit the model
excluding these.
This seems a reasonable approach, but suffers from two main setbacks. Firstly, the
threshold of residual size beyond which observations are deemed to be outliers is
not specified. Secondly, the weighting makes a quantum leap from full weighting to
absolute exclusion of the observation at this threshold. Thus, the weighting in the
second stage can be thought of as a step function of r, the size of residual, and k,
the threshold.
In the bi-weighted approach to outlier resistant regression, the cut-off is less
drastic (Coleman et al. 1980). Weighting of the observations gradually falls off with
increasing r, and finally becomes zero smoothly at the cut-off point:
$$w(r) = \begin{cases} \left(1 - (r/k)^2\right)^2 & \text{if } r < k \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$
This is illustrated in figure 3. The question still arises as to the value of k although
a good experimental value for k has been found to be five times the median absolute
value of the residuals. Finally, although the methodology described so far suggests
only two iterations (i.e., fit regression, re-weight and fit again), the bi-square method
can be iterated several times until the estimates of the regression coefficients converge.
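The weighting function of equation (4) and the iterated re-weighting can be sketched as below. The sketch assumes k is recomputed at each iteration as five times the median absolute residual, as suggested in the text; the function names, the convergence tolerance and the iteration limit are assumptions made for illustration, and this is not the Coleman et al. (1980) routine itself.

```python
import numpy as np

def bisquare_weights(resid, k):
    """Equation (4): smooth down-weighting that reaches zero at |r| = k."""
    r = np.abs(resid)
    w = (1.0 - (r / k) ** 2) ** 2
    return np.where(r < k, w, 0.0)

def bisquare_regression(y, X, max_iter=50, tol=1e-8):
    """Iteratively re-weighted least squares with bi-square weights.

    y : 1-D array of n observations; X : 2-D array of shape (n, p).
    Returns the coefficient vector with the intercept in position 0.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # stage 1: plain OLS
    for _ in range(max_iter):
        resid = y - design @ coef
        k = 5.0 * np.median(np.abs(resid))   # threshold suggested in the text
        if k == 0.0:                         # at least half the points fit exactly
            break
        sw = np.sqrt(bisquare_weights(resid, k))
        # Weighted least squares: scale rows by the square root of the weights.
        new, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new - coef)) < tol
        coef = new
        if converged:
            break
    return coef
```

With a single gross outlier added to otherwise collinear points, the outlier receives zero weight after the first pass and the re-fitted line returns to the trend followed by the remaining observations, which is the behaviour figure 2 motivates.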
In this instance, the bi-square methodology is applied to the moving window
regression analysis alongside the ordinary least squares methodology. Due to the
relatively small size of the regression windows (49 pixels), the effect of an outlying
observation could be quite prominent on the estimates of the regression parameters.
It is intended that this method will identify any outlying pixels and reduce any
distortion that these may cause. In the case when the results from the two analyses
do not differ drastically, this will at least reassure us that the ordinary least squares
methodology was not affected by outliers.
Figure 3. The weighting function in bi-square regression.
The result of calibrating the global bi-square regression model with standardised
data is:
$$\mathrm{POPDEN} = -0.08 - 0.37\,\mathrm{EL} - 0.03\,\mathrm{WOOD} - 0.07\,\mathrm{GRASS} + 0.03\,\mathrm{PODS} \qquad (5)$$
which is similar to the OLS version although there are differences caused by the
reduction in the influence of extreme values. It is not possible to report values of
t-statistics for the parameter estimates: the standard errors reported in the
bi-square calibration are unreliable because the weights are dependent on the
observations and are therefore random variables. The R-squared value is unreliable for
the same reason and should not be compared with that from the OLS regression.
4. Localised results
4.1. Localised OLS results
To obtain localised results, a 7 by 7 window was placed over each pixel which
provides 49 data points for the model calibration at each pixel location. This was
the smallest square window with which we felt comfortable as a basis for analysis.
The data for each set of 49 pixels are locally standardised prior to model calibration.
As described above, the typical output of this technique for each of the four slope
parameters in the model is a three-dimensional surface as shown in figure 4. A
two-dimensional representation is also shown as an alternative view of the data. Clearly
there are variations in the parameter estimate over the region although it is not clear
how important these variations might be. In an attempt to overcome this problem
we have chosen to depict surfaces of the parameter estimates divided by their
standard errors, t-statistic surfaces, which show more clearly the variations in results
that can be obtained by using different parts of the data to calibrate the model.
These t-statistic surfaces are shown in figures 5-8 for the parameters associated with
the variables elevation, woodland, grassland, and podsols, respectively. The
interpretation of these figures is that each point on the surface indicates the t-statistic associated
with the parameter estimate that would be obtained if a 7 by 7 matrix of cells around
that point constituted the sample from which the estimate was derived.
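A local t-statistic surface of the kind described above can be sketched as follows. The sketch assumes NumPy arrays and a critical value of about 2 for the 95 per cent level with 49 observations; the function names `local_t_surface` and `classify` are hypothetical. As a simplification, a window in which any variable is constant is left entirely missing, whereas the surfaces reported here may omit only the affected parameter.

```python
import numpy as np

def local_t_surface(y, xs, size=7):
    """t-statistics of the slope estimates in every size-by-size window.

    y : (R, C) raster; xs : (p, R, C) rasters of independent variables.
    Data are standardised within each window before fitting. Windows in
    which y or any x is constant are left as NaN (no local estimate).
    """
    p, R, C = xs.shape
    n = size * size
    t_surf = np.full((p, R - size + 1, C - size + 1), np.nan)
    for i in range(R - size + 1):
        for j in range(C - size + 1):
            yw = y[i:i + size, j:j + size].ravel()
            xw = xs[:, i:i + size, j:j + size].reshape(p, -1).T
            if yw.max() == yw.min() or (xw.max(axis=0) == xw.min(axis=0)).any():
                continue
            yz = (yw - yw.mean()) / yw.std(ddof=1)
            xz = (xw - xw.mean(axis=0)) / xw.std(axis=0, ddof=1)
            design = np.column_stack([np.ones(n), xz])
            coef, *_ = np.linalg.lstsq(design, yz, rcond=None)
            resid = yz - design @ coef
            sigma2 = resid @ resid / (n - p - 1)
            cov = sigma2 * np.linalg.pinv(design.T @ design)
            t_surf[:, i, j] = coef[1:] / np.sqrt(np.diag(cov)[1:])
    return t_surf

def classify(t_surf, t_crit=2.0):
    """+1 significantly positive, -1 significantly negative, 0 otherwise."""
    cls = np.zeros_like(t_surf)
    cls[t_surf > t_crit] = 1.0
    cls[t_surf < -t_crit] = -1.0
    cls[np.isnan(t_surf)] = np.nan        # keep missing windows missing
    return cls
```

The three classes returned by `classify` correspond to the three shadings used when the t-statistic surfaces are mapped: significantly positive, not significantly different from zero, and significantly negative.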
As an example, consider the interpretation of figure 5 which depicts the degree
of non-stationarity for the parameter associated with elevation. The regions of upland
depicted in white are those areas for which the local parameter estimate is significantly
Figure 4. OLS model elevation parameter.
positive. The regions shaded in light-grey, the hill sides, are where the parameter
estimate is not significantly different from zero, and the valley bottoms shaded in
dark-grey depict areas where the parameter estimate is significantly negative. So
whereas the global regression result suggests a significant negative relation between
population density and elevation, there are a number of localised areas where
Figure 5. OLS model elevation t-statistic.
population density increases significantly as elevation increases. The two-dimensional
map, onto which the local road pattern has been added, suggests that the areas
where a significant positive relation between population density and elevation exists
are close to roads which in this area tend to hug lower ground. This could suggest
an aversion to very low-lying ground which may be difficult to cultivate or which
may be prone to flooding. Whatever the explanation, the results are striking in their
Figure 6. OLS model woodland t-statistic.
depiction of the extent of spatial non-stationarity which is completely hidden in the
traditional modelling approach.
Because the vertical exaggeration is held constant throughout, figure 6 depicts a
similar, although slightly less dramatic, picture for the woodland parameter. Use of
the global data set would lead to the conclusion that there is a significant negative
relation between population density and the amount of woodland although most of
Figure 7. OLS model grassland t-statistic.
the local data sets would lead one to conclude that there is no significant relation.
This apparent discrepancy is probably caused by the large number of degrees of
freedom in the former although it should be noted that the global estimate is just
significant at the 95 per cent confidence level. There are a number of local data sets
which would lead to the conclusion that there is a significant positive relation. The
pixels for which no results are reported on this surface are those for which the data
Figure 8. OLS model podsols t-statistic.
did not allow the local estimation of the parameter; this was because the variable
was constant within the window.
The grassland estimates in figure 7 again suggest that one would reach the
conclusion of no significant relation using many of the local data sets, in contrast
to the globally significant negative relation. However, there are regions where the
local data lead one to conclude the relation is significantly positive and others that
lead one to conclude it is significantly negative. It needs more familiarity with the
local geography to understand why the relations should show this degree of variation.
Figure 8 is dominated by a large area of missing values where the variable was
zero (that is, the soils were not podsolic) for all 49 pixels within the window. On the
remainder of the surface there are substantial areas where the local data lead to the
conclusion of a significant negative relation in contrast to the globally reported
positive one.
The goodness-of-fit, measured by the R-squared statistic, of the local model is
depicted in figure 9 which uses a continuous grey scale to show spatial variations in
the degree to which the model fits the data. There are clearly identifiable hills and
valleys in this surface. The model performance ranges from explaining only 1 per
cent of the variance of the population density data in some parts of the region to
explaining over 80 per cent in others. The globally fitted model explains 26 per cent
of the variance. Again, the large spatial variation in goodness-of-fit suggests that
spatial non-stationarity exists and that there are interesting fits to the data in some
parts of the region, prompting questions regarding both the data and the model
which need detailed local knowledge, perhaps from subsequent fieldwork, to answer.
The surface therefore provides a good example of the use of this technique in
encouraging a more detailed consideration of the data and the relations being
examined than would happen in the conventional, global type of analysis currently
the modus operandi of most research.
4.2. Localised bi-square results
Figures 10-13 show the t-statistic surfaces produced by the bi-square calibration
method. Given that this calibration method is designed to reduce the effects of
outliers in the data, there will obviously be some differences between these surfaces
and those produced by OLS. For reasons discussed earlier, any significance testing
based on the bi-square technique is strictly exploratory. However, the differences are
not particularly great and the bi-square surfaces depict a similar degree of spatial
non-stationarity to the OLS ones and so will not be discussed further.
The R-squared surface shown in figure 14 again is similar to that derived from
the OLS calibration method although the absolute values cannot be directly
compared to those produced by OLS because of the difference in the two calibration
methods.
The similarity of the two sets of results usefully demonstrates that the OLS
regression is not being unduly biased by the presence of outliers. The methodology
also shows how the comparison can be used to assess relative model performance
because if the results of the localised calibration were quite different, then it may be
possible to make statements about which model is better able to capture spatial
variations in relations.
4.3. A comparison of global, local OLS and local bi-square results
A final set of results depicts the relative performances of the three estimation
techniques in replicating the observed population density surface. In figure 15, three
sets of scatterplots show the relations between the observed data on the horizontal
axis and the global predicted values (figure 15(a)), the local OLS predicted values
(figure 15(b)) and the local bi-square predicted values (figure 15(c)). It is clear, and
only to be expected, that the local regressions outperform the global one in replicating
the data since the former allow the mean of the predicted values to vary across the
Figure 9. OLS model goodness-of-fit.
study area. Perhaps more interesting is the extent to which the global regression
results fail to replicate the spatial pattern of the original data as shown in figure 16.
Population density in this part of NE Scotland shows a marked trend with highest
values towards the south-east and lowest values towards the north-west. The two
locally derived estimated surfaces both succeed in picking up this trend, because of
the moving mean, but the global regression model performs poorly and almost
misses the trend entirely.
Figure 10. Bi-square model elevation t-statistic.
5. Discussion
This technique for the exploration and visualisation of spatial non-stationarity
can be used on any spatial data set; however, the number of spatial units should be
large enough to ensure that those units excluded from the analysis, due to the
windowing, form a reasonably small percentage of the total. The technique is easier
Figure 11. Bi-square model woodland t-statistic.
Figure 12. Bi-square model grassland t-statistic.
to apply to raster data due to the uniform nature of the recording units but there is
no conceptual reason to restrict its use to this type of data. The technique could be
applied to irregular polygons although some care would have to be taken in defining
the window and this is currently under investigation.
Given that a key output from this technique is a set of three-dimensional surfaces
depicting, inter alia, spatial variations in a set of parameter estimates, judging the
Figure 13. Bi-square model podsols t-statistic.
success of the technique is heavily visual and subjective. Consider, for example, a
situation in which there was negligible spatial non-stationarity in a relation depicted
by a parameter estimated through regression. The three-dimensional surface would
then be a flat plain. Increasing deviations from this plain, as hills and valleys and
other features evolve, indicate increasing degrees of spatial non-stationarity.
Consequently, if the result from this technique is a relatively flat surface, then one
Figure 14. Bi-square model goodness-of-fit.
can infer that spatial non-stationarity is negligible which increases confidence in the
global result as being a reasonable representation of a more general relation.
However, if the output is a highly convoluted surface, then clearly the use of a global
estimate is suspect.
The technique should be used for any modelling application where there is a
suspicion of spatial non-stationarity. Although we have used a relatively simple
Figure 15. Comparison of fitted and actual density values: (a) global predictions, (b) local OLS predictions, (c) local bi-square predictions. Horizontal axis: actual density.
Figure 16. Spatial comparison of fitted and actual values: (a) population density, (b) global prediction, (c) local predictions (OLS), (d) local predictions (bi-square).
linear model, with the justification that this is representative of the type of model
used in many applications, the technique we describe can be used whenever it is
useful to compare a global model with a series of local models. For instance, the
model in question could just as easily be a non-parametric regression model (Hastie
and Tibshirani 1990) which could be examined with a global data set and with many
different subsets of the data to allow comparisons similar to those made above. We
would argue that any globally-fitted model may miss important spatial variations
in relations and that this technique will provide greater insights into the data being
used for calibration as well as the nature of the relations being measured.
Given the emphasis on visualisation and the need to reference spatial co-ordinates,
the technique should be applied through a GIS. It is not necessary to conduct the
model calibrations within a GIS (and in the example described above the calibrations
were undertaken outside a GIS) although the spatial referencing and graphics
capabilities of a GIS make it almost a necessity in interpreting results. Hearnshaw and
Unwin (1994), and particularly the chapters by Bracken (1994) and Gatrell (1994)
therein, provide many other examples of the use of the visualisation capabilities of
a GIS for exploratory spatial analysis.
6. Reservations
We have the following reservations regarding the technique although we feel
these are not sufficient to outweigh the benefits:
1. The main output is a set of three-dimensional surfaces which may be difficult
to interpret. In some places, we use two-dimensional surfaces to depict
information although the parameter results are best seen in three dimensions to get
a sense of the variations within the area. Of course, any three-dimensional
plot is subject to the vertical exaggeration used in the construction of the
surface.
2. Although all the original data are used in the construction of the
three-dimensional surfaces, not all the original cells of the region can be depicted
because the outlying cells have to be 'stripped off' to provide consistency in
the windowing. For example, if a 7 by 7 window is used on a raster image
500 by 500, the dimensions of the resulting three-dimensional surface would
be 494 by 494. In most cases this will not present a problem unless the extreme
margins of the region are of interest or the study area is small.
3. It is much easier to apply the technique to raster data where the cell size is
uniform. Where cells are of unequal size, the definition of the window will be
more difficult and the numbers of data points within the window may vary.
However, neither of these problems should prohibit the use of the technique.
4. To avoid the problem of multiple hypothesis testing, the surfaces of t-values
should be used primarily as exploratory tools and care should be taken in
drawing inferences about the parameter estimates. The surfaces indicate those
parts of the study area where particular conclusions would be reached
regarding a relation if local data were used in the calibration of the model. The
surfaces therefore prompt some interesting questions about the data and the
relations being examined that would not otherwise get asked.
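The arithmetic behind the second reservation can be checked with a short sketch. The function names are hypothetical and are introduced only for illustration; the calculation assumes a rectangular raster and a window that must lie wholly inside it.

```python
def local_surface_shape(rows, cols, r, c):
    """Number of admissible window centres along each axis of the raster."""
    if r > rows or c > cols:
        raise ValueError("window larger than the region")
    return rows - r + 1, cols - c + 1

def fraction_excluded(rows, cols, r, c):
    """Share of the original cells that receive no local estimate."""
    out_rows, out_cols = local_surface_shape(rows, cols, r, c)
    return 1.0 - (out_rows * out_cols) / (rows * cols)

print(local_surface_shape(500, 500, 7, 7))            # (494, 494)
print(round(fraction_excluded(500, 500, 7, 7), 4))    # 0.0239
```

For the 500 by 500 example, about 2.4 per cent of the cells are stripped off; the share grows quickly as the raster becomes smaller relative to the window.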
7. Extensions
In this research we have adopted a conservative posture by moving one step at
a time. To this point we have explored non-stationarity in a very common model
format, linear regression, which is available in many GISs. However, it is a model
that has several potential problems and given that one of the aims of this
technique is to allow more detailed examination of any globally applied model, as
well as to investigate spatial non-stationarity and data accuracy, an obvious extension
would be to use the technique to evaluate modelling approaches designed explicitly
to capture spatial effects. We intend to explore the capabilities of spatial lag and
spatial error models in dealing with spatial non-stationarity. Would the parameter
surfaces shown above be smoother if the estimates were drawn from either of these
models capturing aspects of spatial dependency?
A further extension would be to apply the technique to a set of irregular polygons
so that the optimal windowing technique for such a surface could be explored. In
such an application it is likely that the window would be irregular and this could
lead to a spatially adaptive windowing technique whereby windows of different
magnitudes were allowed around each point in order to maximise some criterion,
perhaps related to goodness-of-fit. It would also be interesting to examine the
modifiable areal unit problem in the context of spatial non-stationarity: to what
extent would the results change as the moving window changed in size? Would
windows of different sizes identify non-stationarity at different scales? Presumably
the surface will become smoother as window size increases and there may well be
some optimal size of windowing that allows spatial trends in relations to be most
clearly seen.
A final extension of the results would be to produce a more objective method of
deciding when spatial non-stationarity is a serious problem. Clearly a
three-dimensional parameter surface that is flat is indicative of no spatial non-stationarity
and increasing deviations from this plain, as hills and valleys and other features
evolve, indicate increasing degrees of spatial non-stationarity. It would be useful to
have an objective means of deciding whether the degree of spatial non-stationarity
invalidated the use of a global model. Such a test may be possible based on explained
variances and would be similar to the F-test used in ordinary regression.
8. Summary
The classical use of regression with spatial data is to calibrate a model, spatial
or aspatial, to produce a global set of parameter estimates which are implicitly
assumed to represent relations that are constant across the space from which the
data are drawn. This paper provides a relatively simple technique for examining this
assumption and in the example above the results indicate that the assumption is
questionable. Given the results provided here, for example, how much faith could
one place on any extrapolation of the global results beyond this particular set of
polygons in north-east Scotland?
The methodology of depicting spatial non-stationarity visually is part of a general
trend towards exploratory analysis particularly in the use of spatial data.
Geographers have long been embarrassed by their data because of the difficulties
arising from properties such as spatial dependency and spatial non-stationarity.
These problems are increasingly seen as the outcomes of data which are rich
in information and which provide the catalyst for increasingly imaginative types
of analysis. It is our hope that this and subsequent studies will advance the
understanding of the behaviour of models in the context of spatial data analysis.
References
ANSELIN, L., 1988, Spatial Econometrics: Methods and Models (Dordrecht: Kluwer Academic).
ASPINALL, R. J., and VEITCH, N., 1993, Habitat mapping from satellite imagery and wildlife
survey data using a Bayesian modelling procedure in a GIS. Photogrammetric
Engineering and Remote Sensing, 59, 537-543.
BESAG, J. E., 1986, On the statistical analysis of dirty pictures. Journal of the Royal Statistical
Society B, 48, 259-279.
BRACKEN, I., 1994, Towards improved visualization of socio-economic data. In Visualization
in GIS, edited by H. Hearnshaw and D. J. Unwin (Chichester: Wiley), pp. 76-84.
COLEMAN, D., HOLLAND, P., KADEN, N., KLEMA, V., and PETERS, S. C., 1980, A system of
subroutines for iteratively re-weighted least-squares computations. ACM Transactions
on Mathematical Software, 6, 327-336.
CRESSIE, N. A. C., 1991, Statistics for Spatial Data (New York: Wiley).
FLOWERDEW, R., and GREEN, M., 1994, Areal interpolation and types of data. In Spatial
analysis and GIS, edited by A. S. Fotheringham and P. A. Rogerson (London: Taylor &
Francis), pp. 121-145.
FOTHERINGHAM, A. S., and PITTS, T., 1995, Directional variation in distance-decay.
Environment and Planning A, 27, 715-729.
FOTHERINGHAM, A. S., and WONG, D. W., 1991, The modifiable areal unit problem and
multivariate analysis. Environment and Planning A, 23, 1025-1044.
GATRELL, A., 1994, Density estimation and the visualization of point patterns. In Visualization
in GIS, edited by H. Hearnshaw and D. J. Unwin (Chichester: Wiley), pp. 65-78.
GETIS, A., and ORD, J. K., 1991, The analysis of spatial association by use of distance statistics.
Geographical Analysis, 23, 189-206.
HASTIE, T. J., and TIBSHIRANI, R. J., 1990, Generalized Additive Models (London: Chapman
& Hall).
HEARNSHAW, H., and UNWIN, D. J., 1994, Visualization in GIS (Chichester: Wiley).
JONES, J. P. III, and CASETTI, E., 1992, Applications of the Expansion Method (London:
Routledge).
MARTIN, D., 1989, Mapping population data from zone centroid locations. Transactions of
the Institute of British Geographers, 14, 90-97.
UNWIN, D. J., and WRIGLEY, N., 1987, Control point distribution in trend surface modelling
revisited: an application of the concept of leverage. Transactions of the Institute of
British Geographers, 12, 147-160.
WRIGLEY, N., 1983, Quantitative methods: on data and diagnostics. Progress in Human
Geography, 7, 567-577.