Spatial Analysis What is it?

Spatial Analysis
What is it?
• “…the purpose of geographic inquiry is to
examine relationships between geographic
features collectively and to use the relationships to
describe the real-world phenomena that map
features represent.” (Clarke 2001, 182).
• One Definition: the quantitative procedures
employed in the study of the spatial arrangement
of features (points, lines, polygons and surfaces)
Geographic Information Analysis
• “Geographic information analysis
is…concerned with investigating the
patterns that arise as a result of processes
that may be operating in space” (p. 3).
• “Techniques [that] enable the
representation, description, measurement,
comparison, and generation of spatial
patterns…”
How Do We Represent the World
(in Map or Digital Form?)
• Raster – Vector
• A Higher Level of Abstraction? (p. 5)
• Objects and Fields
– The key distinction (according to your authors)
– A slightly different conceptualization
• How do we choose the “best”
representation(s)?
Spatial Analysis:
What is it?
• What types of relationships exist between
geographic features, and how do we express
them?
• Properties of spatial features and/or
relationships between them: size,
distribution, pattern, contiguity,
neighborhood, shape, scale, orientation
3 Fundamental Questions
Regarding Spatial Relationships
• How can two (or more) spatial distributions be
compared with each other?
• How can variations in geographic properties over
a single area or data set be described and/or
analyzed?
• How can we use what we have learned from an
analysis(es) to predict future spatial distributions?
Spatial Analysis can cover the spectrum implied by these
questions!
What role does GIS play in
Spatial Analysis?
• GIS is a tool with unique capabilities:
– Can handle geographically-referenced data
– Spatial/attribute data entry/update capabilities
– Data conversion functions
– Storage and organization of a variety of spatial and
attribute data
– Manipulation of spatial and attribute data (encompasses
many different operations)
– Presentation/display capabilities
– Spatial analysis tools (many tools may be used in
combination)
Do you remember the 5
functional elements of a GIS?
• Data acquisition
• Preprocessing
• Database Management
• Manipulation/Analysis
• Final product output
These elements are all part of the spatial analysis “equation”
(and a GIS professional’s knowledge base).
Our framework this semester for
discussing GIS operations/procedures
that are useful for spatial analysis…
• The Pitfalls and Potential of Spatial Data
• Maps as Outcomes of Processes
• Point Pattern Analysis
• Describing and Analyzing Fields
• Statistical Analysis of Fields/Spatial Interpolation
• Map Overlay Concepts and Procedures
• Spatial Modeling
• Network Analysis
How can we characterize Spatial
Analysis (what skills does it require)?
• Spatial analysis is an artistic and a scientific
endeavor (what does this mean?)
– It requires knowledge of the problem and/or question to
be answered
– It requires knowledge about the data (how it was
collected, organized, coded, etc.)
– It requires knowledge of GIS capabilities
– It may require knowledge of statistical techniques
– It requires envisioning the results of any
operation…and the combination of any operations
– It is not completely objective, in fact some argue that it
is completely subjective
– Many times there is more than one way to derive
information that answers a question
Are Spatial Data “Special,” and if
They Are, Why?
• Spatial Data are Special…
– Why?
– How?
– What are the implications?
• Pitfalls
• Potential
• Why “They” Need “Us”
The “Pitfalls” of Spatial Data
• Most spatial samples are not random!!
• This situation/problem is known as spatial
autocorrelation
– The earth’s surface is not an isotropic plane
– Positive autocorrelation, negative autocorrelation, zero
autocorrelation
• “…Describing the autocorrelation structure, is of
primary importance in spatial analysis.” (p. 29)
– First order, and second order spatial variation
The “Pitfalls” of Spatial Data
• The Modifiable Areal Unit Problem
– “…aggregation units used are arbitrary with
respect to the phenomena under investigation”
– “If spatial units…were specified differently, we
might observe very different patterns…” (p. 30)
• The Ecological Fallacy
– Rampant in media reporting
The “Pitfalls” of Spatial Data
• Scale Issues
– Examples
• Nonuniformity of Space and Edge Effects
– Space is not uniform
– Edge Effects?
The Potential of Spatial Data
• Quantification of Spatial Relationships
– How? What kind of relationships matter?
• Summarizing spatial relationships
– How?
Spatial data are the building
blocks of any spatial analysis
• Spatial data structures:
– Raster: geographically-referenced matrix of
uniform size cells…advantages and
disadvantages
– Vector: features on the earth’s surface are
represented as geographically-referenced vector
objects (points, lines, polygons)…advantages
and disadvantages
Representation of
vector spatial objects
• Hierarchical nature of objects (points, lines,
polygons)
– Points: different types
• Entity, label, area, node
– Lines:
• Line, arc, link, etc.
– Polygons:
• Area, polygon, complex polygon
Basic elements of
spatial information required
to undertake spatial analysis
• Location
– X,Y coordinate or locational reference
• Attribute data
– Describing the (aspatial) characteristics of
locations
• Topology
– Describing the spatial relationships between
spatial features
Measurement of Location:
GIS Issues
• A GIS suitable for spatial analysis must
have the necessary functions dealing with
coordinate systems
– What are these functions?
• What coordinate systems do we normally
see or work with in a GIS…and what are
their characteristics?
Measurement of Location:
GIS Issues
• Basic measurement of spatial features:
– Points are defined by x,y coordinates
– Lines are represented by an ordered sequence of
points…they can be “decomposed” into sections of
straight line segments
– The distance between two points on a Cartesian plane is
derived through Euclidean distance…the length of a
line segment is the sum total of the Euclidean distances
of all segments that compose it (p. 105 Chou)
– The area of any feature represented as a polygon can be
computed by constructing a trapezoid from every line
segment delineating the polygon…then systematically
aggregating the trapezoid areas (both positive and
negative) (p. 106 Chou)
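The measurements above can be sketched in a few lines of Python. This is a minimal illustration, assuming planar (projected) coordinates and a simple, non-self-intersecting polygon; the function names are hypothetical and not taken from Chou or from any GIS package.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points on a Cartesian plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def line_length(vertices):
    """Length of a line: the sum of the lengths of its straight segments."""
    return sum(euclidean_distance(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_area(ring):
    """Area of a simple polygon from signed trapezoid areas.

    Each boundary segment (x1, y1)-(x2, y2) forms a trapezoid with the
    x-axis whose signed area is (x2 - x1) * (y1 + y2) / 2. Positive and
    negative trapezoids cancel outside the polygon.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += (x2 - x1) * (y1 + y2) / 2.0
    return abs(total)  # the sign only reflects vertex order

square = [(0, 0), (4, 0), (4, 3), (0, 3)]  # hypothetical 4 x 3 rectangle
print(euclidean_distance((0, 0), (3, 4)))
print(line_length([(0, 0), (3, 4), (3, 10)]))
print(polygon_area(square))
```

Taking the absolute value at the end makes the result independent of whether the vertices are listed clockwise or counterclockwise.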
Attribute Data Measurement
• Categories: Nominal and Ordinal data
• Numeric: Interval and ratio data
• Measures of Central Tendency (mode,
median, mean) and Dispersion (variance,
standard deviation)
• Must be cognizant of spatial units and
geographic sampling techniques
Topology: What kinds of spatial
relationships between spatial features?
• Adjacency: Which polygons are adjacent to
which? Often used in the spatial analysis of
areal data.
• Containment: Which spatial features are
contained within which? Can be used for
selection or perhaps geocoding.
• Connectivity: Which line segments are
connected? Often used for network analysis.
The Arc-node Data Model: a method
of expressing vector topology
• Used for ARC/INFO coverages (we will use
this as our example)…a proprietary ESRI
vector spatial data structure
• Topological data is stored in “attribute
tables”: point attribute tables (PATs), arc
attribute tables (AATs), polygon attribute
tables (PATs)…what is contained in these
tables?
Sample Attribute Tables
• Arc Attribute Tables (AATs) - contain the
following data fields: arc-ID, Length, F-node, T-node, L-poly, R-poly
• Polygon Attribute Tables (PATs) – contain the
following data fields: poly-ID, perimeter, area
• Point Attribute Tables (PATs) – the same fields as
above, but zero perimeter and area
** These tables store the topological data needed to
quantify the spatial relationships between features
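To see how these tables carry topology, here is a small sketch that derives polygon adjacency and node connectivity from AAT-style records. The rows, IDs, and function names are made up for illustration, and treating polygon 0 as the outside polygon is an assumption of this sketch; it does not read actual coverage files.

```python
from collections import defaultdict

# Hypothetical AAT rows: (arc_id, f_node, t_node, l_poly, r_poly).
# Polygon 0 stands for the area outside all polygons (an assumption).
aat = [
    (1, 1, 2, 0, 101),
    (2, 2, 3, 0, 102),
    (3, 2, 4, 101, 102),
    (4, 4, 1, 0, 101),
    (5, 3, 4, 0, 102),
]

def polygon_adjacency(rows, outside=0):
    """Adjacency: two polygons are neighbors if some arc has one on each side."""
    adj = defaultdict(set)
    for _arc, _f, _t, left, right in rows:
        if left != right and outside not in (left, right):
            adj[left].add(right)
            adj[right].add(left)
    return {poly: sorted(neighbors) for poly, neighbors in adj.items()}

def arcs_at_node(rows, node):
    """Connectivity: arcs that start (F-node) or end (T-node) at a node."""
    return sorted(arc for arc, f, t, _l, _r in rows if node in (f, t))

print(polygon_adjacency(aat))
print(arcs_at_node(aat, 2))
```

Arc 3 has polygon 101 on its left and 102 on its right, so the two polygons are adjacent without any geometric comparison being needed.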
Spatial Data Formats
• Spatial data formats are the product of the private
sector working to create data files that allow users
to:
– Create maps
– Manipulate spatial data
– Perform spatial analysis
• Example ESRI spatial data formats (files):
shapefiles, coverages, GRIDs, geodatabases,
TINs, Routes
3 Major vector-based
datasets used in ArcGIS:
Shapefiles, Coverages, Geodatabases
• ESRI Shapefiles:
– Spatial data is stored in binary files
– Attribute data is stored in dBase tables
– Contain one simple feature class
– No topology is developed for spatial features
– Types of shapefiles: point, line, polygon and
multi-point
3 Major vector-based
datasets used in ArcGIS:
Shapefiles, Coverages, Geodatabases
• ESRI ARC/INFO Coverages:
– Spatial data is stored in binary files
– Topological and attribute tables are stored in INFO
tables
– Contain topological features classes that define line or
polygon topology
– Topology is “built” for lines and polygons - lines: arcs,
nodes and routes; polygons: arcs, label points,
polygons, regions
– Primary coverage feature classes are: point, arc,
polygon, and node; secondary: tic, link, annotation;
compound: region, route
ARC/INFO Coverages
• ARC coverage files: defined by header files, index
files, ARC, PAL, LAB, CNT, PRJ, LOG, TOL
• ARC: arc definitions and vertices; PAL: contains
polygon definitions; LAB: contains label point
records; CNT: contains polygon centroid
information; PRJ: contains projection information;
TOL: contains the tolerance values to use when
processing a polygon coverage
ESRI GRID file
• ESRI’s proprietary raster file structure
– Readable in ArcGIS without any extensions
– The Spatial Analyst extension needed to
perform analysis on these files
• Follow conventions we have learned about:
– Uniform raster cell size
– Single value per cell
– Continuous data (including null values)
“Special” Spatial Data Structures:
TINs and Routes
• Triangulated Irregular Networks (TINs):
sample points are connected to form
triangles, with the relief inside each
represented as a plane or facet
– VIPs (Very Important Points)
– Delaunay Triangulation
– 3-dimensional surface description
– ArcGIS can generate these through the 3-D
Analyst extension
“Special” Spatial Data Structures:
TINs and Routes
• Routes are spatial data structures generated to
represent linear features
– Used when the definition of linear features does not
meet the needs of a network-based application
– Dynamic segmentation procedure
• New line segments are defined…
• Based on the location of “events”
• Measurements of offsets on segments
– Network Analyst and ARC/INFO
3 Major vector-based
datasets used in ArcGIS:
Shapefiles, Coverages, Geodatabases
• ESRI Geodatabase
– All spatial, topological, and attribute data is stored in
tables in a relational database
– A feature dataset in a geodatabase can contain simple or
topological feature classes
– Many feature classes can be associated with a
topological role within the geodatabase
– User-defined associations can be created between
features in different feature classes
– Types of feature classes: point, line, polygon,
annotation, simple junction, complex junction, simple
edge, complex edge
The Geodatabase Data Model: “…a
better way to associate behavior with
[spatial] features was needed”
• An object-oriented data model: data “objects” can
have rules, relationships, topology
• Facilitates the creation of “smart features” that are
more complex than generic points, lines, or
polygons
• All data is stored in a relational database (as
opposed to separate spatial and attribute data)
Centralized management of data
• Geodatabases organize data into a hierarchy of
data objects: object classes, feature classes, feature
datasets
– Object class: a table in a geodatabase that stores nonspatial data
– Feature class: a collection of features with the same
type of geometry and the same attributes
– Feature dataset: a collection of feature classes that have
the same spatial reference system
• “Simple” feature classes can exist either within or outside a
feature dataset; topological feature classes must be contained
within a feature dataset
Maps as Outcomes of Processes
• [Spatial] patterns provide clues to a
possible causal [spatial] process(es)
– “…Usefulness of maps…remains in their
inherent ability to suggest patterns in the
phenomena they represent.” p. 52
– Conceptualizing spatial analysis as processes
and patterns
Types of Processes: Spatial Processes
and their Possible Realizations
• Could the pattern we observe have been generated by this
particular process?
• Deterministic processes:
– Processes whose outcome can be predicted exactly from
knowledge of initial conditions
– Many times can be mathematically described
– Outcome always the same
• Stochastic processes:
– Processes whose outcome is subject to variation that cannot be
given precisely by a mathematical formula
– Introduction of a random (stochastic) element to model the range
of potential solutions
– “Chance process with well-defined mechanisms” p. 58
Predicting Patterns:
Expected Results
• Assumptions
– Example: independent random process (IRP) (or complete spatial
randomness (CSR))
– Math used to predict frequency distribution under assumed
randomness
• Observed vs. expected
• What is this assumption called in the scientific method?
• Real World – usually not characterized by spatial
randomness
– First-order effects: the earth is not an isotropic plane, and therefore
some areas will be more attractive of phenomena than others
– Second-order effects: the assumption that events are independent
of each other is not realistic…i.e. the location of events will
influence the location of other events
Point Pattern Analysis
• The spatial properties of the entire set of
points is analyzed (rather than individual
points)
• Requirements/Assumptions according to
O’Sullivan and Unwin (pp.78-79)?
• Descriptive statistics for point distributions
– Frequency; density; geometric center; spatial
dispersion; spatial arrangement
Point Pattern Analysis
• Thinking about point patterns…
– How can we describe and analyze them
• The geographical properties of a point pattern are
characterized (described) by geometric center and
dispersion
– Geometric (mean) center = mean x,y coordinates;
dispersion = standard distance of x and y distribution
– Geometric (mean) center is not a reliable measure of
central tendency when either the x or y standard
distance is large
• What are these measures useful for?
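A minimal sketch of the two descriptive measures, assuming planar coordinates and unweighted points. Standard distance is computed here as the square root of the mean squared distance from the mean center; the function names are hypothetical.

```python
import math

def mean_center(points):
    """Geometric (mean) center: the mean of x and the mean of y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Dispersion around the mean center (spatial analog of standard deviation)."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

pts = [(0, 0), (2, 0), (2, 2), (0, 2)]  # hypothetical point layer
print(mean_center(pts))
print(standard_distance(pts))
```

Comparing the mean centers and standard distances of two point layers (or of one layer at two dates) is one simple way to describe a shift or spread in a distribution.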
Point Pattern Analysis
• Density-based and distance-based measures
– i.e. Point Density and Point Separation
• Density: ratio of frequency to area…intensity of a
pattern
– depending on distribution within a defined study area
may be misleading (pp. 81-82)
• Quadrat Count Methods
– Census or random methods
– Issues?
Density-based measures
• Quadrat Analysis – based on the frequency of
occurrence of points within quadrat units
– Requires overlaying quadrats onto a layer of point
features
– Once quadrats are overlaid onto the point layer,
frequencies of points per quadrat can be counted
– All quadrats are classified according to observed
frequency of points
– Null hypothesis: point features are randomly distributed
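The counting step can be sketched as follows, assuming a census of square quadrats laid over a rectangular study area. The variance/mean ratio (VMR) is added here as one commonly used summary; it is not named on the slide. Under a random (Poisson) pattern the VMR is close to 1, values well above 1 suggest clustering, and values well below 1 suggest regular spacing.

```python
from collections import Counter

def quadrat_counts(points, xmin, ymin, cell, ncols, nrows):
    """Number of points falling in each square quadrat, in row-major order.

    Points outside the grid (including those exactly on the upper or
    right edge) are ignored in this simple sketch.
    """
    counts = Counter()
    for x, y in points:
        col = int((x - xmin) // cell)
        row = int((y - ymin) // cell)
        if 0 <= col < ncols and 0 <= row < nrows:
            counts[(row, col)] += 1
    return [counts[(r, c)] for r in range(nrows) for c in range(ncols)]

def variance_mean_ratio(counts):
    """Sample variance of the quadrat counts divided by their mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((k - mean) ** 2 for k in counts) / (n - 1)
    return variance / mean

pts = [(0.5, 0.5), (0.6, 0.4), (0.7, 0.2), (1.5, 1.5)]  # hypothetical points
counts = quadrat_counts(pts, xmin=0, ymin=0, cell=1.0, ncols=2, nrows=2)
print(counts)
print(variance_mean_ratio(counts))
```

Both the quadrat size and the grid origin affect the counts, which is one of the issues raised for quadrat methods.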
Density-based Measures
• Kernel Density Estimation
– A pattern has a density at any location…
– Continuous densities for defined “kernels” to
create a continuous surface
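A minimal sketch of kernel density estimation, assuming a Gaussian kernel and a single fixed bandwidth (both are choices made for this example; GIS packages often use other kernel shapes). The result is an intensity in points per unit area at any location.

```python
import math

def kernel_density(location, points, bandwidth):
    """Intensity estimate at one location using a 2-D Gaussian kernel."""
    x0, y0 = location
    h2 = bandwidth ** 2
    total = 0.0
    for x, y in points:
        d2 = (x - x0) ** 2 + (y - y0) ** 2
        total += math.exp(-d2 / (2.0 * h2))  # nearer points contribute more
    return total / (2.0 * math.pi * h2)

def density_surface(points, bandwidth, xs, ys):
    """Evaluate the estimate on a grid of x and y positions (rows follow ys)."""
    return [[kernel_density((x, y), points, bandwidth) for x in xs] for y in ys]

surface = density_surface([(1, 1)], bandwidth=1.0, xs=[0, 1, 2], ys=[0, 1, 2])
print(surface[1][1])
```

A larger bandwidth gives a smoother surface; a smaller one preserves local peaks but can look spiky.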
Distance-based
Point Pattern Measures
• The Logic of Distance Measures
– Can be described using types (categories):
• Clustered – points are concentrated in one or more
groups/areas
• Uniform – points are regularly spaced with
relatively large interpoint distance
• Random – neither the clustered nor the uniform pattern
is prevalent
Measuring Spatial Arrangement
• Nearest Neighbor Analysis (Index)
– Measures the degree of spatial dispersion in a point
distribution based on minimizing interpoint distances
– Logic: in general the average distance between points in
a clustered pattern is less than in a uniform pattern
– Logic: a random pattern is associated with an avg.
interpoint distance larger than a clustered pattern but
smaller than a uniform pattern
– The “nearest neighbor” for each point feature must be
determined, and the interpoint distance is computed
Measuring Spatial Arrangement
• Nearest Neighbor Analysis (Index), cont’d
– Observed average nearest neighbor distance (Ad) compared
to the expected average nearest neighbor distance (Ed) assuming
complete spatial randomness [CSR]: Ed = 0.5 × sqrt(A/n), where A is
the study area and n is the number of points
– NNI = Ad/Ed p.100
– NNI range: 0 to 2.1491…where 0 indicates perfectly
clustered and 2.1491 indicates perfectly uniform
(values close to 1 indicate a random pattern)
– To test the statistical significance of an NNI value, a
computed z value can be compared to a critical value
(1.96)
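The index and its z value can be sketched as below. This is an illustration under stated assumptions: planar coordinates, no edge correction, and the standard error 0.26136 / sqrt(n²/A) commonly used with the nearest neighbor statistic under CSR (this constant is not given on the slide). The function name is hypothetical.

```python
import math

def nearest_neighbor_index(points, area):
    """Return (NNI, z) for a point pattern within a study area of given size."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n               # Ad
    expected = 0.5 * math.sqrt(area / n)      # Ed under CSR
    std_error = 0.26136 / math.sqrt(n * n / area)
    z = (observed - expected) / std_error
    return observed / expected, z

grid_pts = [(0, 0), (1, 0), (0, 1), (1, 1)]  # hypothetical, regularly spaced
nni, z = nearest_neighbor_index(grid_pts, area=4.0)
print(nni, z)
```

An NNI above 1 with |z| greater than 1.96 points toward a uniform pattern; an NNI below 1 with |z| greater than 1.96 points toward clustering. The chosen study area strongly affects the result.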
Measuring Spatial Arrangement
• Nearest Neighbor Analysis: Pros and Cons
– Pros: relatively simple; easy to compute;
straightforward logic
– Cons: is not sensitive to complex patterns
unless extended to include more than just
nearest neighbors
The Concept of Spatial
Autocorrelation
• Spatial Autocorrelation: measures the extent to
which the occurrence of one feature is influenced
by the distribution of similar features in the
adjacent area
– Why is this idea important in the context of
“classical” statistical analysis?
• Captures some aspects of point spatial distribution
not reported by NNI or quadrat analysis
• Spatial autocorrelation is characterized as positive
(the existence of one feature tends to attract
similar features) or negative (the existence of one
feature tends to deter the location of similar
features)
Types of Area Objects
• Natural Areas vs. Command Regions
– Who cares?
• Issues with Command Regions?
• Raster
– Pros and Cons?
• Planar-enforced areas…
– GIS-context?
Geometric Properties of Areas
• Area
– How is it calculated?
• Shape
– Comparison of a polygon to a known shape
• Spatial pattern
– Contact numbers
– Fragmentation (FRAGSTATS)
Spatial Autocorrelation
• Most common spatial autocorrelation statistic is
Moran’s I coefficient
– Similar to a traditional correlation coefficient
– The I coefficient for the most part ranges between –1
and +1; larger negative values indicate a scattered
pattern…positive values indicate a clustered pattern
• Also Geary’s C (Geary’s Ratio)
– The C coefficient tends to range between 0 and 2;
values approaching 0 imply similar values of a variable
tend to cluster (positive spatial autocorrelation)…values
approaching 2 indicate that dissimilar values tend to
cluster
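Both statistics can be sketched from a list of attribute values and a spatial weights matrix. The example assumes binary contiguity weights (1 for neighbors, 0 otherwise) for four areas arranged in a row; the data and function names are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I: cross-products of deviations from the mean for neighbors."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

def gearys_c(values, weights):
    """Geary's C: squared differences between neighboring values."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2.0 * s0)) * num / den

# Four areas in a row: each area neighbors the one next to it
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = [1, 1, 5, 5]     # similar values sit side by side
alternating = [1, 5, 1, 5]   # dissimilar values sit side by side
print(morans_i(clustered, w), gearys_c(clustered, w))
print(morans_i(alternating, w), gearys_c(alternating, w))
```

The clustered arrangement gives a positive I and a C below 1; the alternating arrangement gives a negative I and a C above 1, matching the interpretation on the slide.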
Spatial Autocorrelation
• Joins Count approach
– Logic?
The Concept of Fields
• “…phenomena are continuously variable and
measurable across space.” (p. 210)
– Scalar fields: All locations are represented by a
value…one value per unit
– Vector fields: values are not independent of
coordinates (magnitude and direction)
Describing and Analyzing Fields
• Two steps in the recording and storage
process of fields (p. 213):
– Sampling the “real” surface
• The input data
– Interpolation to derive a continuous surface
representation
• Types of fields and how they are derived
Sampling the earth’s surface
• Issues to consider:
– The methodology used to obtain the sample
• How would we find out?
– The spatial resolution of the sample
– In many cases, we may be “stuck” with scalar
field sample data…Why?
Continuous Surface Description
• Types of Fields (pp. 214-220):
– Point Systems
• Grid sampling (raster), surface specific, surface
random
– Triangulated Irregular Networks (TINs)
– Contours
– Mathematical Functions
• Data may need to be processed further to
derive “usable” fields…interpolation
Continuous Surface Description: The
Raster Data Structure
• A cell (grid) data structure
• Row, Column coordinates (all positive
values)
• Uniform cell size
• Every cell is assigned a value
– Numeric (integer or floating point)
– Categorical (usually in effect integer)
Continuous Surface Description: The
Raster Data Structure
• Cell Value Assignment:
– Centroid Method
– Predominant Type
– Most Important Type
– Hierarchical
• ** In many cases, the data you are working
with may already have cell values assigned
Example Continuous Surface
Description: DEMs
• Digital Elevation Models (DEMs): a sample
of elevation data for a study area
represented as evenly-spaced points or
raster cells
– Data from DEMs are often used in land surface
analysis, as they are free and their data quality can be
ascertained
– In most GIS packages, DEMs are converted to
a raster format prior to analysis
Derived Measures on Surfaces:
Raster Data Processing
• Local Operations: raster layer is processed
on a cell-by-cell basis
– Single layer
– Multiple layer (raster overlay)
– Examples…
Derived Measures on Surfaces:
Raster Data Analysis
• Neighborhood Spatial Operations: cell data is
processed based on a focal cell and its neighboring
cells
– Neighboring cells become part of an operation based on a
distance and/or directional relationship to the focus cell
– Focus cell is usually assigned a value based on the values
of neighboring cells
– Common neighborhoods: 3x3 “window”; circle;
– Operations: sum, mean, standard deviation, minimum,
maximum
– Examples…
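One example of a neighborhood operation is a focal mean with a 3x3 window, sketched below on a raster stored as a list of rows. Shrinking the window at the raster edge is a choice made for this sketch; other tools may pad the edge or assign no-data instead.

```python
def focal_mean(grid):
    """3x3 focal mean: each output cell is the mean of its window.

    At the edges the window shrinks to the cells that exist.
    """
    nrows, ncols = len(grid), len(grid[0])
    out = [[0.0] * ncols for _ in range(nrows)]
    for r in range(nrows):
        for c in range(ncols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(nrows, r + 2))
                      for cc in range(max(0, c - 1), min(ncols, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out

raster = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]  # hypothetical cell values
smoothed = focal_mean(raster)
print(smoothed[1][1], smoothed[0][0])
```

Swapping `sum(window) / len(window)` for `min(window)`, `max(window)`, or a standard deviation gives the other focal operations listed above.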
Derived Measures on Surfaces:
Raster Data Processing
• Zonal Operations: apply to groups of cells that
belong to the same “zone” or have a common
value
– Single layer: geometry of zones (perimeter, area,
centroid, etc.)
– Multiple layers (overlay): one layer defines the zones,
the other defines variables values…summary statistics
are calculated by zone (mean, standard deviation, area,
min, max.)
– Examples…
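A multiple-layer zonal operation can be sketched as follows: one raster defines the zones, the other holds the values, and a summary statistic is computed per zone. Both rasters are assumed to share the same extent and cell size; the data are hypothetical.

```python
from collections import defaultdict

def zonal_mean(zones, values):
    """Mean of the value raster within each zone of the zone raster."""
    groups = defaultdict(list)
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            groups[zone].append(value)
    return {zone: sum(cells) / len(cells) for zone, cells in groups.items()}

zones = [[1, 1, 2],
         [1, 2, 2]]       # e.g. land-use class per cell
values = [[10, 20, 30],
          [30, 50, 40]]   # e.g. elevation per cell
print(zonal_mean(zones, values))
```

Cells in the same zone do not need to be contiguous; membership is decided by the zone value alone.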
Derived Measures on Surfaces:
Raster Data Analysis
• Global (Distance Measure) Operations: the output
value of each cell is calculated based on spatial
relationship to a “source” cell
– Distance measurement in a raster layer is based on
nodes and links
• Node = centroid
• Link = lateral (1 cell) or diagonal (1.4142 cells) connections
to adjacent cells
– Euclidean, Physical (buffer), and Cost Distance
Measurement
Derived Measures on Surfaces:
Surface Analysis
• Involves analyzing phenomena that are 3-dimensional…the 3rd dimension can be
represented as a “z-coordinate” (in addition
to x,y coordinates)
• The z-coordinate (or value) can represent
almost anything, although it is most often
employed to model topography
Derived Measures on Surfaces:
Surface Analysis
• Data Types for Surface Analysis
– Irregularly-spaced point features
– Regularly-spaced cells in a raster layer (for
example, DEMs)
– Vector contour lines
– Triangulated Irregular Networks (TINs)
Derived Measures on Surfaces:
Surface Analysis
• Triangulated Irregular Networks (TINs):
approximate a 3-dimensional surface using a
series of non-overlapping triangles
– Based on an irregular distribution of points that have
x,y, and z coordinates
– Sample points are used to generate triangles using
either the VIP or max z-tolerance algorithm
– Triangles are generated using rules of Delaunay
Triangulation…all nodes are connected to their nearest
neighbors, and triangles are as equi-angular as possible
– Triangles have area and angles associated with them
Derived Measures on Surfaces:
Surface Analysis
• Slope and Aspect: Calculated by determining the
amount and direction of tilt of a cell’s normal
vector
• Surface Curvature: Used to determine if the
surface at a cell location is upwardly convex or
concave
• Viewshed Analysis: Determining what areas are
visible and not visible from a vantage point
• Watershed Analysis: Watershed delineation and
drainage characterization based on elevation data
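Slope and aspect for an interior raster cell can be sketched with central differences on the four adjacent cells. This is a simplified stand-in, not the specific algorithm any GIS package uses (many use a weighted 3x3 window). It assumes row 0 is the northern edge, square cells, and elevation in the same units as the cell size.

```python
import math

def slope_aspect(dem, r, c, cell_size):
    """Slope (degrees) and aspect (compass bearing of the downslope direction)."""
    dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)  # west to east
    dz_dy = (dem[r - 1][c] - dem[r + 1][c]) / (2.0 * cell_size)  # south to north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    # Downslope points opposite the gradient; bearing is clockwise from north
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Hypothetical DEM rising 10 m per 10 m cell toward the east
dem = [[0, 10, 20],
       [0, 10, 20],
       [0, 10, 20]]
print(slope_aspect(dem, 1, 1, cell_size=10.0))
```

The example surface climbs eastward at 45 degrees, so it faces west (aspect 270). Flat cells have no meaningful aspect and would need special handling.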
Spatial Interpolation
• Control points are points with known
values…it is best if there is “good
coverage” of control points (how often does
this happen?)
• Assumptions:
– 1. The surface of the Z variable is continuous
– 2. The Z variable is spatially dependent
Types of Spatial Interpolation
• Global vs. Local
– The difference is the number of control points used
• Exact vs. Inexact
– How control point values are used and “re-estimated”
• Deterministic vs. Stochastic
– Assessment of prediction errors (with estimated
variances)
Simple Spatial
Interpolation Techniques
• Local Methods: The z value of an unknown
point location is estimated from known
local point neighbor locations
• Interpolation procedures are used when we
have discontinuous datasets and we want
(or need) to process them into spatially
continuous datasets
“Simple” Deterministic Spatial
Interpolation Techniques
• Usually used to derive field datasets for
further processing:
– Inverse Distance Weighted Spatial Average
– Proximity polygons
– Local Spatial Averaging
– Other Methods
Statistical Spatial Interpolation
• A process of using locations with known
data values to estimate values at other
locations.
– Global (Statistical) Methods: Use all available
data (control points) to perform estimation
• A “statistical surface” is constructed by
interpolating unknown values from known
values
Spatial Interpolation
• Global (Statistical) Methods: The z value of an
unknown point location is estimated from all known
point data
– Polynomial Trend Surface Analysis (Inexact,
Deterministic): approximates points with known values
with a polynomial equation
– The equation is used as an “interpolator” to estimate
values at other points
– Computed by the least squares method and a “goodness
of fit” can be computed for each control point
$Z_{x,y} = b_0 + b_1 x + b_2 y$
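A first-order (planar) trend surface of the form z = b0 + b1·x + b2·y can be fitted by least squares as sketched below. The sketch builds and solves the normal equations directly so that it needs no external libraries; the function names and control points are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [a[i][:] + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [vr - factor * vc for vr, vc in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_trend_surface(controls):
    """Least-squares coefficients (b0, b1, b2) from (x, y, z) control points."""
    ata = [[0.0] * 3 for _ in range(3)]
    atz = [0.0] * 3
    for x, y, z in controls:
        row = (1.0, x, y)
        for i in range(3):
            atz[i] += row[i] * z
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, atz)

# Control points lying exactly on z = 2 + 0.5x - 1.5y
controls = [(0, 0, 2.0), (1, 0, 2.5), (0, 1, 0.5), (1, 1, 1.0), (2, 1, 1.5)]
b0, b1, b2 = fit_trend_surface(controls)
print(b0, b1, b2)
```

With real data the surface will not pass through the control points (it is an inexact interpolator); the residual at each point is the observed z minus the value predicted by the fitted equation.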
Spatial Interpolation
• Local Methods
– Inverse Distance Weighted (Exact,
Deterministic): enforces that the estimated
value of a point is influenced more by nearby
known points than those farther away
• All predicted values are within the range of the
maximum and minimum values in the distribution
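A minimal sketch of inverse distance weighting, assuming all control points are used and a power of 2 (both are parameters a user would normally choose). The function name and data are hypothetical.

```python
import math

def idw(target, controls, power=2.0):
    """Estimate z at target (x, y) from (x, y, z) control points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, z in controls:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # exact interpolator: a control point keeps its value
        w = 1.0 / d ** power  # nearby points receive larger weights
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total

controls = [(0, 0, 10.0), (2, 0, 20.0)]
print(idw((1, 0), controls))
print(idw((0, 0), controls))
```

Because the estimate is a weighted average with positive weights, it can never fall outside the minimum and maximum of the control values.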
Spatial Interpolation
• Local Methods
– Splines (Exact, Deterministic): create a surface
that passes through the control points and has
the least possible change in slope at all points
(minimum curvature surface)
Spatial Interpolation
• Local Methods
– Kriging (Exact, Stochastic): a geostatistical
method for spatial interpolation where the mean
is estimated from the best linear unbiased
estimator or best linear weighted moving
average
• Assumes that the spatial variation of an attribute is
neither totally random nor totally deterministic (a
correlated component, a drift, a random error term)
How do we Accomplish Spatial
Interpolation in ArcGIS?
• Geostatistical Analyst:
– An ArcGIS extension that provides tools to
perform statistically-based spatial interpolation
• Exploratory Data Analysis
• Calculation and Modeling of Surface Properties
(Structural Analysis)
• Surface Prediction and Assessment of Results
Knowing the Unknowable:
The Statistics of Fields
• Statistical spatial interpolation techniques…why
are they necessary or advantageous? (p. 246-247)
– Control point data has error and varies over time…we
are not going to obtain an exact fit from deterministic
methods
– If we have sample datasets, we have data pertaining to
the spatial distribution of phenomena that can be used
in spatial interpolation
– We try to “fit” a mathematical model or function to the
semivariogram (Gaussian, linear, spherical, circular,
exponential) to be used as an interpolator
“Geostatistical” Spatial Interpolation
• Kriging: assumes that surface variation can be
represented by 3 factors:
– The residual of local fluctuation…the level of spatial
correlation locally estimated from a polynomial
function
– The drift of regional tendency…representing a spatial
“trend”
– A random error estimate
– There are different variations of kriging, based on the
presence or absence of a “drift” factor and the
interpretation
Spatial Interpolation
• Types of Kriging:
– Ordinary:
• the drift component is excluded
• Focus on the degree of spatial dependence among sampled known
points (semivariance)
• Semivariance: $\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \left( z(x_i) - z(x_i + h) \right)^2$
• Semivariance values are plotted on a semivariogram where the
semivariance is recorded on the Y-axis and the distance between
known points on the X-axis (nugget, range, sill)
• The semivariogram is fitted to a mathematical model (spherical,
circular, exponential, linear, Gaussian)
• Equation for estimating Z: $Z_0 = \sum_{i=1}^{s} Z_i W_i$
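The empirical semivariance for one lag distance can be sketched as below: half the mean squared difference between all pairs of sample values separated by approximately that lag. Binning pairs with a distance tolerance is an assumption of this sketch, and direction is ignored (isotropic case). Fitting a model and solving for kriging weights are not shown.

```python
import math

def semivariance(samples, lag, tolerance):
    """Semivariance for pairs of (x, y, z) samples about `lag` apart."""
    squared_diffs = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            x1, y1, z1 = samples[i]
            x2, y2, z2 = samples[j]
            d = math.hypot(x2 - x1, y2 - y1)
            if abs(d - lag) <= tolerance:
                squared_diffs.append((z1 - z2) ** 2)
    if not squared_diffs:
        return None  # no pairs at this lag
    return sum(squared_diffs) / (2.0 * len(squared_diffs))

# Hypothetical samples along a transect, one unit apart
samples = [(0, 0, 1.0), (1, 0, 3.0), (2, 0, 2.0), (3, 0, 6.0)]
for h in (1, 2, 3):
    print(h, semivariance(samples, lag=h, tolerance=0.1))
```

Plotting these values against lag distance gives the points of the semivariogram to which a model is then fitted.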
Spatial Interpolation
• Types of Kriging:
– Universal Kriging: assumes that the spatial
variation in z values has a “drift” or “trend” in
addition to the spatial correlation between
known points
– Co-Kriging: Can be used to improve spatial
predictions by incorporating secondary
variables, provided they are spatially correlated
with the primary variable
Semivariogram
Covariance
Co-Kriging using multiple variables
Concept of Cross-correlation
Isotropic vs. Anisotropic Interpolation Techniques
Single Layer Operations
• We might consider these operations the “simplest”
form of spatial analysis; although this might not
always be true
• Single layer (horizontal) operations: procedures
that apply to only one data layer at a time
– We are conceptualizing things in this way to simplify
our understanding of what analysis operations do…not
because this is really how we utilize the operations
• Operations that apply to a single feature type
– Does this change with the geodatabase?
Single Layer Operations
• Feature Identification and Selection
– Identify, Select Feature, Attribute Query
• Feature Classification
– What type of distribution, how do we
determine? Uniform (equal interval, equal
frequency); Normal (standard deviation);
Multiple Cluster (natural breaks)
Single Layer Operations
• Feature Manipulation
– Boundary Operations
• That ArcView can perform: Clip, Dissolve,
Append?
• That ArcView cannot perform (ARC/INFO
required): Erase, Update, Split, Mapjoin, Eliminate
– Proximity Analysis
• ArcView: Buffer
• ArcView cannot: Thiessen polygons
Map Overlay
(Multiple Layer) Operations
• “…arguably, the most important feature of
any GIS is its ability to combine spatial
datasets…” (p. 285)
• 10 Possible types of Map Overlay
Map Overlay Operations
• Polygon Overlay operations
– Simplest Form: Boolean Overlay (Sieve mapping)
• 4 Steps (pp. 288-302):
– Determining the Inputs
– Getting the Data
– Getting the Spatial Data into the Same Coordinate
System
– Overlaying the Maps
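The logic of the final overlay step in Boolean (sieve) mapping can be sketched with rasters of True/False values that already share a coordinate system, extent, and cell size. This is a simplification: polygon overlay in a vector GIS also has to intersect geometries, which is not shown. Layer names and data are hypothetical.

```python
def boolean_overlay(*layers):
    """Cell-by-cell logical AND: a cell passes only if every layer is True."""
    return [[all(cells) for cells in zip(*rows)] for rows in zip(*layers)]

gentle_slope = [[True, True],
                [False, True]]
good_soil = [[True, False],
             [False, True]]
suitable = boolean_overlay(gentle_slope, good_soil)
print(suitable)
```

Each input layer acts as a sieve that removes unsuitable locations; only cells passing every criterion remain.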
Map Overlay Operations
• Overlay Operations (in ArcGIS)
– Union
– Intersect
– Identity
• Results?
Erase (Coverage)
Identity Overlay
Intersect Overlay
Symmetrical Difference
Union Overlay
Update Overlay
Spatial Modeling
• According to Chou (1997), a Spatial Model:
– 1. Analyzes phenomena by identifying
explanatory variables that are significant to the
distribution of the phenomenon and providing
information about the relative weight of each
variable
– 2. Is useful for predicting the probable impact
of a potential change in “control” factors
(independent variables)
Spatial Modeling:
Thinking About Models
• Models can be:
– Descriptive or Prescriptive
– Deterministic or Stochastic
– Static or Dynamic
– Deductive or Inductive
Spatial Modeling
• General Types of (Spatial) Models
– Descriptive: characterization of the distribution of
spatial phenomena
– Explanatory: deal with the variables impacting the
distribution of a phenomena
– Predictive: once explanatory variables are identified,
predictive models can be constructed
– Normative: models that provide optimal solutions to
problems with quantifiable objective functions and
constraints
Spatial Modeling
• More specific types of spatial models:
– Binary models (descriptive): use logical expressions to identify or
select map features that do or do not meet certain criteria…How?
– Index models (descriptive): use index values calculated for
variables to produce a ranked spatial surface…How?
• Weighted Linear Combination Model
– Regression models (explanatory or predictive): a dependent
variable is related or explained by independent variables in an
equation…How?
• Linear and logistic regression
– Process (explanatory or predictive): integrate existing knowledge
about environmental processes into a set of relationships and
equations for quantifying those processes…How?
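An index model built as a weighted linear combination can be sketched as follows. It assumes each criterion raster has already been standardized to a common 0 to 1 scale and that the weights express relative importance; the layers, weights, and function name are hypothetical.

```python
def weighted_linear_combination(layers, weights):
    """Cell-by-cell weighted sum of standardized criterion rasters.

    Weights are rescaled to sum to 1 so the index stays on the same
    scale as the inputs.
    """
    total = float(sum(weights))
    norm = [w / total for w in weights]
    nrows, ncols = len(layers[0]), len(layers[0][0])
    return [[sum(w * layer[r][c] for w, layer in zip(norm, layers))
             for c in range(ncols)]
            for r in range(nrows)]

slope_score = [[1.0, 0.5],
               [0.0, 1.0]]
soil_score = [[0.0, 0.5],
              [1.0, 1.0]]
# Slope is treated as three times as important as soil in this example
index = weighted_linear_combination([slope_score, soil_score], [3, 1])
print(index)
```

Unlike a binary model, the output is a ranked surface, so locations can be compared rather than simply accepted or rejected.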
Spatial Modeling
• Steps in the Modeling Process
– Define the goals of the model
– Break down the model into elements
– Implementation and calibration of the model
– Model validation
• Sometimes difficult or not feasible
The Role of GIS in Spatial Modeling
• How can GIS enable spatial modeling?
– GIS is a tool that can integrate a myriad of data sources
– GIS can incorporate raster and/or vector data into
modeling schemes
– Modeling may take place within a GIS, or require
linking to other computer programs
• Loose coupling
• Tight coupling
• Embedded System
Spatial Modeling
• Important Issues in Conducting Spatial Analysis:
– Delineation of geographic units of analysis
• How do you choose geographic units of analysis so that spatial
analyses are valid?
– Identification of structural and spatial factors that
impact spatial analysis
• Structural – impact site
• Spatial – impact situation (absolute and relative location,
neighborhood effects)
Stormwater modeling
project logic
• Based on TR-55
– First issued by the US Soil Conservation Service (SCS) in 1975, today the
Natural Resources Conservation Service (NRCS)
– Presents simplified procedures for addressing
stormwater during initial overland flow (runoff,
peak discharge, hydrographs, and storage
volumes for detention ponds)
Stormwater modeling
project logic
• TR-55
– Stormwater runoff calculation based on the Runoff Curve Number (CN) method
– CN: an empirically derived number, determined by hydrologic soil group, cover type,
treatment, hydrologic condition, and antecedent
runoff condition
– Also: percent impervious surface
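The curve number method can be sketched as a short function. It uses the standard SCS runoff equation, Q = (P - Ia)^2 / (P - Ia + S) with S = 1000/CN - 10 and Ia = 0.2S (depths in inches). The function name and the example rainfall and CN values are hypothetical, and the sketch does not include any adjustment for percent impervious surface.

```python
def runoff_depth(rain_in, cn, ia_ratio=0.2):
    """Runoff depth Q (inches) from rainfall P (inches), SCS curve number method."""
    if not 0 < cn <= 100:
        raise ValueError("CN must be greater than 0 and at most 100")
    s = 1000.0 / cn - 10.0   # potential maximum retention after runoff begins
    ia = ia_ratio * s        # initial abstraction (losses before runoff starts)
    if rain_in <= ia:
        return 0.0           # all rainfall is absorbed; no runoff
    return (rain_in - ia) ** 2 / (rain_in - ia + s)


# Hypothetical storm: 4 inches of rain on a watershed with CN = 80.
print(round(runoff_depth(4.0, 80), 2))
```

A higher CN means less retention and therefore more runoff for the same storm; at CN = 100 all rainfall becomes runoff.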
Network Analysis
• Network analysis: the spatial analysis of linear
(line) features
– Your text distinguishes between several different types
of lines
• Network analysis involves 2 types of problems:
– analyzing structure (connectivity pattern) of networks
– analyzing movement (flow) over the network system
• Network analysis is often a major part of subfields
that are related to transportation: transportation
geography, transportation planning, civil
engineering, etc.
Linear Regression Models: Logic
and Assumptions
• Assumptions (predicted vs. actual values):
– Errors have the expected mean value of zero
– Errors are independent of each other
– Correlations among independent variables
should not be high
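Two of these assumptions can be checked with simple arithmetic: the mean of the errors (actual minus predicted) and the correlation between a pair of independent variables. The data below are made up for illustration, and the helper names are hypothetical; a full check of error independence would need more than this.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )


# Made-up actual and predicted values of the dependent variable.
actual = [2.0, 4.1, 5.9, 8.2, 9.8]
predicted = [2.1, 4.0, 6.0, 8.0, 9.9]
residuals = [a - p for a, p in zip(actual, predicted)]

# Made-up independent variables; a high |r| would signal collinearity.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

print("mean error:", round(mean(residuals), 6))
print("r(x1, x2):", round(pearson(x1, x2), 3))
```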
Network Analysis
• Concepts:
– Network
– Line segment(s)/Links
– Nodes (and vertices)
– Impedance
– Topology
– Dynamic Segmentation
Network Analysis:
Network Structure
• Evaluation of Network Structure:
– γ (Gamma) Index: the ratio of the actual number of links
to the maximum possible number of links, 3(n − 2)
(n = # of nodes)…ranges between 0 and 1
– α (Alpha) Index: the ratio of the actual number of
circuits to the maximum number of circuits
(c/(2n − 5))…evaluation in terms of the number
of ways to get from one node to another
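The two ratios on this slide (commonly called the gamma and alpha indices) are simple enough to compute by hand or in a few lines. The sketch below assumes a connected planar network, for which the number of circuits is links - nodes + 1; the function names and the example counts are hypothetical.

```python
def gamma_index(links, nodes):
    """Actual links divided by the maximum possible links, 3(n - 2)."""
    if nodes < 3:
        raise ValueError("need at least 3 nodes")
    return links / (3 * (nodes - 2))


def alpha_index(links, nodes):
    """Actual circuits divided by the maximum possible circuits, 2n - 5."""
    if nodes < 3:
        raise ValueError("need at least 3 nodes")
    circuits = links - nodes + 1  # assumes one connected network
    return circuits / (2 * nodes - 5)


# Hypothetical network with 5 nodes and 6 links.
print(round(gamma_index(6, 5), 3), round(alpha_index(6, 5), 3))
```

Both values approach 1 as the network approaches its maximum connectivity; a tree-like network with no circuits has an alpha index of 0.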

Network Analysis:
Network Structure
• Network Diameter: the maximum number of steps
required to move from any node to any other node
using shortest possible routes over a connected
network
• Network Connectivity: an evaluation of nodal
connectivity over a network based on direct and
indirect connections (expressed through the
construction of matrices c1, c2, c3)
Network Analysis:
Network Structure
• Network Accessibility: can be evaluated based on
nodes or the entire network…the accessibility
matrix is often called the T matrix
– T matrix is the sum of all connectivity matrices up to
the level equal to the network diameter (i.e. c3 or c4)
– Logically this makes sense if you are trying to evaluate
total connectivity of a node or the entire network
– How do we read the matrix?
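The connectivity matrices and the T matrix can be built by repeated matrix multiplication. The sketch below assumes a small hypothetical network of four nodes joined in a line (0-1-2-3) and takes the diameter to be the first power at which every pair of distinct nodes is linked in the running sum.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def accessibility(c1):
    """Return (diameter, T) where T = C1 + C2 + ... up to the diameter."""
    n = len(c1)
    power = c1                       # current connectivity matrix C_k
    total = [row[:] for row in c1]   # running sum of C_1..C_k
    diameter = 1
    # Keep adding higher-order matrices until every node pair is connected.
    while any(total[i][j] == 0 for i in range(n) for j in range(n) if i != j):
        if diameter >= n - 1:
            raise ValueError("network is not connected")
        power = matmul(power, c1)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        diameter += 1
    return diameter, total


# C1: direct connections for the hypothetical line network 0-1-2-3.
c1 = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
diameter, t_matrix = accessibility(c1)
row_sums = [sum(row) for row in t_matrix]
print(diameter, row_sums)
```

One way to read the result: each row sum counts the direct and indirect connections available from that node, so the interior nodes of the line score higher than the two end nodes.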
Network Analysis:
Network Structure
• Network Structure in a Valued Graph
– The previously discussed measures of network structure
are based on either counting links and/or nodes….what
element are we missing with these?
– Q. What is a valued graph? A. A matrix is constructed
in which every link (line segment) in a network is
coded with an impedance measure (such as what?)
• An often-used type of valued graph is the minimal spanning
tree…satisfies 3 criteria:
• Can a GIS construct a minimal spanning tree?
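A minimal spanning tree can be built from a valued graph with a few lines of code. The sketch below uses Kruskal's algorithm, which is one common choice and not necessarily the method your text or any particular GIS uses; the link list and impedance values are hypothetical.

```python
def minimal_spanning_tree(n_nodes, links):
    """links: iterable of (impedance, node_u, node_v). Returns the chosen links."""
    parent = list(range(n_nodes))  # union-find structure to detect circuits

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    tree = []
    # Consider links from lowest to highest impedance.
    for impedance, u, v in sorted(links):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:       # adding this link does not close a circuit
            parent[root_u] = root_v
            tree.append((impedance, u, v))
    return tree


# Hypothetical valued graph: (impedance, from node, to node).
links = [(4, 0, 1), (1, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
tree = minimal_spanning_tree(4, links)
print(tree, sum(w for w, _, _ in tree))
```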
Network Analysis:
Normative Models of Network Flow
• Normative models are those that are designed to
determine a best or optimal solution based on
specific criteria
• Simple Shortest Path Algorithm:
– Involves finding the “path” or route with the minimum
cumulative impedance between nodes on a network
– Requires an impedance matrix (such as a valued graph)
and a set of iterative procedures:
• GIS must know which nodes are connected to which…multi-step evaluation of connectivity and least cumulative impedance
(distance, time, cost, etc.)
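The iterative search for the least cumulative impedance can be sketched with Dijkstra's algorithm. This is one widely used shortest path method, chosen here for illustration; the slides do not say which algorithm a given GIS implements. The network and impedance values are hypothetical.

```python
import heapq


def shortest_path(graph, origin, destination):
    """graph: {node: {neighbor: impedance}}. Returns (path, total impedance)."""
    best = {origin: 0}        # least cumulative impedance found so far
    previous = {}             # node -> predecessor on the best route
    queue = [(0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue          # stale queue entry; a cheaper route was found
        for neighbor, impedance in graph[node].items():
            new_cost = cost + impedance
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]


# Hypothetical valued graph; impedance could be distance, time, or cost.
graph = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}
path, cost = shortest_path(graph, "A", "D")
print(path, cost)
```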
Network Analysis:
Normative Models of Network Flow
• The Traveling Salesman Problem:
– 2 “constraints” – 1) the salesman must stop at each
location once 2) the salesman must return to the origin
of travel (there can be variations)
– The objective is to determine the path or route that the
“salesman” can take to minimize the total impedance
value of the trip
– Often a heuristic method is used…beginning with an
initial random tour, a series of locally optimal solutions
is run by swapping stops that cause a reduction in
cumulative impedance (an iterative procedure is also
described in your book on pp. 236-244).
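The swapping heuristic described above can be sketched as follows. It starts from a random tour and keeps any swap of two stops that lowers the total impedance, stopping when no swap helps. This is a simple local search, not the procedure from your book, and it does not guarantee the optimal tour on larger problems. The stop coordinates are hypothetical and impedance is taken to be straight-line distance.

```python
import math
import random


def tour_length(tour, cost):
    """Total impedance of the closed tour, including the return to the origin."""
    return sum(cost[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def swap_heuristic(cost, origin=0, seed=0):
    stops = [s for s in range(len(cost)) if s != origin]
    random.Random(seed).shuffle(stops)   # initial random tour
    tour = [origin] + stops
    best = tour_length(tour, cost)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:]
                candidate[i], candidate[j] = candidate[j], candidate[i]
                length = tour_length(candidate, cost)
                if length < best - 1e-12:   # keep only swaps that reduce impedance
                    tour, best, improved = candidate, length, True
    return tour, best


# Hypothetical stops at the corners of a unit square.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
cost = [[math.hypot(ax - bx, ay - by) for bx, by in points] for ax, ay in points]
tour, length = swap_heuristic(cost)
print(tour, length)
```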
Network Analysis:
Normative Models of Network Flow
• Various Types of Network Problems:
– Shortest Path Analysis (Best Route)
• Simple shortest path
• Traveling Salesman
• Closest Facility
– Allocation (Define Service Area)
– Location-Allocation: solves problems matching supply
and demand by using sets of objectives and constraints
• P-median, max covering, max equity
Network Analysis:
Normative Models of Network Flow
• Dynamic Segmentation Data Model: The ability to
derive the locations of events in relation to linear
features dynamically…not reliant upon the
existing topology of a network
• Models linear features using routes and events…
– Routes: represent dynamic linear features
– Events: phenomena that occur at locations along line
segments
• Dynamic segmentation is used to operationalize
network analysis in ArcInfo/ArcGIS
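The core idea of locating an event by its measure along a route, without splitting the underlying line, can be sketched in a few lines. The function name, route coordinates, and measure are hypothetical, and this is only an illustration of the concept, not how ArcInfo/ArcGIS stores or processes routes.

```python
import math


def locate_event(route, measure):
    """Return the (x, y) position at distance `measure` along a polyline route."""
    if measure < 0:
        raise ValueError("measure must not be negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        segment = math.hypot(x2 - x1, y2 - y1)
        if segment > 0 and travelled + segment >= measure:
            t = (measure - travelled) / segment   # fraction along this segment
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += segment
    raise ValueError("measure is beyond the end of the route")


# Hypothetical route of total length 7; the event sits at measure 5.
route = [(0, 0), (3, 0), (3, 4)]
print(locate_event(route, 5.0))
```

Because the event is stored only as a route identifier plus a measure, it can be moved or updated without editing the line geometry.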
Spatial Interpolation
(Figure: two plots with X, Y, and Z axes)
Ordinary Kriging Comparison
• With Anisotropy
– Mean = .01694
– RMS = 2.862
– Avg. Stan. Error = 3.441
– Mean Stan. = .004232
– RMS Stan. = .8324
• Without Anisotropy
– Mean = .0002331
– RMS = 2.857
– Avg. Stan. Error = 3.424
– Mean Stan. = .0006747
– RMS Stan. = .8347
Universal Kriging Comparison
• With Anisotropy
– Mean = .04253
– RMS = 2.595
– Avg. Stan. Error = 2.354
– Mean Stan. = .01806
– RMS Stan. = 1.102
• Without Anisotropy
– Mean = .0001592
– RMS = 3.054
– Avg. Stan. Error = .8181
– Mean Stan. = .001031
– RMS Stan. = 3.731
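The five statistics compared on these slides look like the usual kriging cross-validation summary. The sketch below shows how such numbers might be computed, assuming the labels mean: mean error, root-mean-square error, average standard error (taken here as the root of the mean squared kriging standard error, one common definition), mean standardized error, and RMS standardized error. The function name and the sample values are hypothetical.

```python
import math


def cross_validation_stats(measured, predicted, std_errors):
    """Summarize prediction errors from a leave-one-out cross-validation."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "mean": sum(errors) / n,
        "rms": math.sqrt(sum(e * e for e in errors) / n),
        "avg_std_error": math.sqrt(sum(s * s for s in std_errors) / n),
        "mean_standardized": sum(standardized) / n,
        "rms_standardized": math.sqrt(sum(z * z for z in standardized) / n),
    }


# Made-up values at three sample points.
stats = cross_validation_stats(
    measured=[1.0, 2.0, 3.0],
    predicted=[2.0, 2.0, 2.0],
    std_errors=[1.0, 1.0, 1.0],
)
print(stats)
```

Under these definitions a good model has mean and mean standardized errors near 0, a small RMS, an average standard error close to the RMS, and an RMS standardized error close to 1.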
Regression Equations
• TWOYR = -3.538 + 0.06031 * AVGCURV
+ 0.03331 * PERCIMPV
• TENYR = -4.156 + 0.07806 * AVGCURV
+ 0.04368 * PERCIMPV
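The two equations can be applied directly to new inputs. In the sketch below the coefficients come from the slide, while the function names and the example input values are hypothetical. AVGCURV and PERCIMPV are assumed to stand for average curve number and percent impervious surface, and the units of the predicted values are not stated in the slides.

```python
def two_year(avgcurv, percimpv):
    """TWOYR predicted from the regression equation on the slide."""
    return -3.538 + 0.06031 * avgcurv + 0.03331 * percimpv


def ten_year(avgcurv, percimpv):
    """TENYR predicted from the regression equation on the slide."""
    return -4.156 + 0.07806 * avgcurv + 0.04368 * percimpv


# Hypothetical watershed: average curve number 80, 20 percent impervious.
print(round(two_year(80, 20), 3), round(ten_year(80, 20), 3))
```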