EXPLORING ACROSS-SCALE RELATIONSHIPS IN SPATIALLY AGGREGATED DATA: INFORMING THE MODIFIABLE AREAL UNIT PROBLEM

Jonathan K. Nelson and Cynthia A. Brewer
Penn State Department of Geography

INTRODUCTION

Socioeconomic and health analysts commonly rely on areally aggregated data, in part because government regulations on confidentiality prohibit data release at the individual level. Analytical results from areally aggregated data, however, are sensitive to the modifiable areal unit problem (MAUP) (Openshaw 1984, Geo Books). Levels of aggregation, as well as the arbitrary and modifiable sizes, shapes, and arrangements of zones affect the validity and reliability of findings from analyses of areally aggregated data. MAUP, long acknowledged, remains unresolved (Root 2012, AAG 102:5; Manley 2014, H Reg Sci). We present an exploratory spatial data analytical approach to understand the scalar effects of MAUP.

VARIABLES, SCALES & RATIONALE

We demonstrate our statistical-visual approach to exploring the scalar effects of MAUP using two variables and three spatial scales. Figure 1 cartographically depicts both datasets at all three scales. Classification breaks are held constant to visualize the effects of scale on data variability across the three aggregation levels. Our aim is to advance the understanding of the (in)stability of these data across the three scales.

Variables
1. Median Household Income (2010), PA
2. Cancer Diagnosis Rates (2005-09), NY

Why these Variables?
● commonly analyzed in aggregate form
● vary differently across space
● spatially adjacent geographies for effective cartographic comparison

Spatial Scales
1. Block Group
2. Tract
3. County

Why these Spatial Scales?
● well-defined; complete
● statistically uniform
● reliable data
● common in literature
● clean, nesting relationship

Figure 1: PA Median Income and NY Cancer Diagnosis Rates at the (a) block group level, (b) tract level, and (c) county level.

METHODS

1. Data Processing

To achieve a better understanding of the stability of these data across the three scales, we first spatially joined values from upper-level aggregates to nested lower-level aggregates (fig. 2). The spatial joins resulted in: tract-level data with county values appended and block group-level data with county and tract values appended.

Figure 2a: Schematic diagram of spatial join process (left). County values are appended to nested tracts and block groups. Tract values are appended to nested block groups.

STATE         COUNTY               COUNTY INCOME  TRACT              TRACT INCOME  BLOCK GROUP    BLOCK GROUP INCOME
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.01  $79,000       Block Group 1  $79,000
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.03  $59,135       Block Group 1  $40,795
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.03  $59,135       Block Group 2  $51,574
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.03  $59,135       Block Group 3  $106,658
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.03  $59,135       Block Group 4  $107,727
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.04  $62,589       Block Group 1  $64,643
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.04  $62,589       Block Group 2  $72,431
Pennsylvania  Philadelphia County  $36,251        Census Tract 8.04  $62,589       Block Group 3  $48,750
Pennsylvania  Philadelphia County  $36,251        Census Tract 9.01  $39,265       Block Group 1  $53,897
Pennsylvania  Philadelphia County  $36,251        Census Tract 9.01  $39,265       Block Group 2  $25,810

Figure 2b: Sample of records underlying PA block group topology post spatial join processing (right). All block groups are within four different census tracts in Philadelphia county. Because of this there is one median income value for the county, four different values for each of the tracts, and individual values for the ten block groups. Note just how different some of these values are considering they are spatially related across scale. In the row highlighted in red, we see a county value of ~$36,000, a tract value of ~$59,000, and a block group value of over $100,000.

2. Bivariate LISA Analysis

Next, we performed bivariate local indicators of spatial association (LISA) analyses (Anselin 1995, Geog Analy 27:2). LISA is a local form of spatial autocorrelation, which decomposes the global Moran's I statistic into individual local indicators of spatial autocorrelation. Bivariate LISA analysis allows values of one variable to be regressed on neighboring values of a different variable. Here, we apply LISA to quantify across-scale autocorrelation. As such, values of the lower-level aggregates were standardized in standard deviation units with a mean of zero and variance of one, and regressed on standardized neighboring values of the appended upper-level aggregate scales.

● How do median household income values at one scale correlate with nearby median household income values of a different scale?
● How do cancer diagnosis rates at one scale correlate with nearby cancer diagnosis rates of a different scale?

VISUALIZATION & INTERPRETATION

The LISA analyses provided both global and local insights on the spatial autocorrelation of median household income and cancer diagnosis rates across scale. However, we needed strategies for visualizing the results to better understand the spatial patterns of autocorrelation and significance. Statistical output from the six bivariate LISA analyses was visually transformed into trivariate Moran's I scatterplots (fig. 3), and bivariate choropleth maps (fig. 4 and 5).

Figure 3: Trivariate Moran's I scatterplots for across-scale relationships of PA median income and NY cancer diagnosis rates.
[Six scatterplot panels. Axes: Standardized Values Lower Level Aggregate; Spatially Lagged Values Upper Level Aggregate. Color: LISA Indices, Similar to Dissimilar.]

PANEL                                                    MORAN'S I  N      P
PA Median Income: Tract against County                   0.55       3217   <0.01
PA Median Income: Block Group against County             0.51       9738   <0.01
PA Median Income: Block Group against Tract              0.69       9738   =0.01
NY Cancer Diagnosis Rates: Tract against County          0.41       4901   <0.01
NY Cancer Diagnosis Rates: Block Group against County    0.31       15247  <0.01
NY Cancer Diagnosis Rates: Block Group against Tract     0.46       15247  <0.01

The Moran's I scatterplots (fig. 3) plot standardized median income values and cancer diagnosis rates of lower-level aggregates against spatially lagged values and rates of upper-level aggregates. Color represents LISA indices and visually reinforces areas in the plots that convey similar (green), dissimilar (purple), and random (brown) across-scale relationships.

Figure 4: Bivariate choropleth maps of across-scale relationships of PA median income.

Figure 5: Bivariate choropleth maps of across-scale relationships of NY cancer rates.

The maps in a, b, and c of figures 4 and 5 integrate type of spatial autocorrelation with level of significance. Hue conveys type of across-scale spatial autocorrelation: purple denotes negative autocorrelation; brown denotes little or no autocorrelation; and green denotes positive autocorrelation. The varying lightness in hue represents the level of significance for a given type of spatial autocorrelation, with darker shades indicating greater significance.

The maps in d, e, and f of figures 4 and 5 depict standardized median income values and cancer diagnosis rates of lower-level aggregates against spatially lagged values and rates of upper-level aggregates. Shades of gray depict similarities between levels of areal aggregation. Shades of pink and green represent areas of across-scale discordance in median income and cancer rates.

FINDINGS

● Positive GLOBAL spatial autocorrelation for all relationships
● Tract-block group relationships most similar
● County-block group relationships most dissimilar
● Tract-county and block group-county relationships differ in a similar way
● Cancer diagnosis rates possess weaker signal
● Pockets of LOCAL across-scale instability for all relationships
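
As a rough illustration of the across-scale bivariate analysis described in the methods (lower-level values standardized and related to spatially lagged, standardized upper-level values), the Python sketch below computes a global bivariate Moran's I and its local components with NumPy. It is not the authors' workflow: the toy tract and county values, the chain-shaped neighbor structure, the key-based lookup standing in for the spatial join, and all function names are assumptions made for illustration, and no significance testing is included.

```python
import numpy as np


def standardize(values):
    """Return z-scores (mean 0, variance 1, population standard deviation)."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()


def row_standardize(weights):
    """Scale each row of a neighbor matrix to sum to 1 (empty rows stay 0)."""
    w = np.asarray(weights, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)


def bivariate_moran(lower, upper, weights):
    """Global and local bivariate Moran's I of lower-level vs. lagged upper-level values."""
    z_lower = standardize(lower)                      # lower-level aggregate, z-scored
    z_upper = standardize(upper)                      # appended upper-level values, z-scored
    lag_upper = row_standardize(weights) @ z_upper    # mean of neighbors' upper-level z-scores
    local_i = z_lower * lag_upper                     # one local indicator per unit
    global_i = local_i.mean()                         # global statistic is the average local value
    return global_i, local_i, lag_upper


# Hypothetical toy data (not from the poster): six tracts nested in two counties.
tract_county = ["A", "A", "A", "B", "B", "B"]
tract_value = [30.0, 35.0, 40.0, 80.0, 90.0, 100.0]
county_value = {"A": 36.0, "B": 85.0}

# Stand-in for the spatial join: append each tract's parent county value by key.
appended_county = [county_value[c] for c in tract_county]

# Assumed neighbor structure: tracts arranged in a chain, each adjacent to the next.
n = len(tract_value)
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

global_i, local_i, lag_upper = bivariate_moran(tract_value, appended_county, W)
print(f"global bivariate Moran's I: {global_i:.2f}")
print("local indicators:", np.round(local_i, 2))
```

With row-standardized weights and at least one neighbor per unit, the global value equals the slope of a line fitted through the scatterplot of standardized lower-level values against lagged upper-level values. Positive local products mark units whose value is similar to the upper-level values around them, and negative products mark dissimilar ones.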