Using the NAACCR Geocoder to Perform In-House
Geocoding of the Alaska Cancer Registry Database
David O’Brien, PhD, GISP
Data Analyst
Alaska Cancer Registry
Section of Chronic Disease Prevention & Health Promotion
Alaska Department of Health and Social Services
Presentation Topics







Geocoding natural resource data vs. people
Alaska Cancer Registry & cancer data reporting laws
Cancer basics
Why do people get cancer?
Why geocode cancer data?
Cancer cluster studies
Geocoding the Alaska Cancer Registry database
Geocoding Natural Resource Data




Most geocoding in Alaska related to natural
resources
Oil wells, gas wells, mines, gravel pits
Commercial fishing, lumber, aquaculture,
mariculture, fish hatcheries
Harbor seal haulouts, sea lion haulouts, salmon
streams, sea bird colonies, bald eagle nests
How Were These
Natural Resources Geocoded?



Field notes & maps transcribed to final paper maps;
paper maps published alone, in reports, or in
atlases; published maps digitized
GPS coordinates for some features (bald eagle
nests, sea bird colonies)
Sometimes problems with data layers digitized in
the wrong datum or arbitrary coordinate system
So How Do You Geocode People?




People are geocoded differently from natural
resources
Data from cancer patients obtained from
confidential medical records
Types of location data: Where they live, where
they work, what hospital or doctor they went to
Cancer registries concentrate on location of
residence at time of diagnosis
About the Alaska Cancer Registry



Consists of a database of all cancer cases diagnosed
and treated in Alaska since 1996, plus Alaska
residents diagnosed or treated in other states
AK Department of Health & Social Services
Funded by the National Program of Cancer
Registries (NPCR) through the US Centers for
Disease Control and Prevention (CDC)
U.S. Cancer Reporting Laws


Public Law 102-515 (Oct 1992): Cancer Registries
Amendment Act: “each form of invasive cancer
with the exception of basal cell and squamous
cell carcinoma of the skin and each form of in situ
cancer except for carcinoma in situ of the cervix
uteri”
Benign Brain Tumor Cancer Registries
Amendment Act (Oct 2002): Cases diagnosed on
or after 1/1/2004
Alaska Cancer Reporting Laws
7 AAC 27.011 (Jan 1996): Established reporting
requirements for our statewide cancer registry
 Benign brain reporting added to 7 AAC 27.011
Jan 2004

Who is Required to Report to ACR?








Hospitals
Physicians
Independent pathology laboratories
Outpatient centers
Home health agencies
Hospices
Nursing homes
Intermediate care facilities
Purpose of ACR

To identify all reportable cases of cancer in Alaska in
order to provide information on the burden of
cancer, types of cancer, and changing patterns of
cancer among residents of our State.
Goals of ACR




Obtain complete, accurate and timely data on all
newly diagnosed cancer cases
Monitor trends for unusual patterns
Identify areas of the state with unusually high
incidence or mortality rates
Provide data to:
AK Comprehensive Cancer Control Program
 AK Breast & Cervical Cancer Health Check Program

Some Basic Information about Cancer




Cancer is not one disease
Cancer is more common than most people realize
About 50% of men and 33% of women will develop
cancer
About 24% of all deaths nation-wide are from
cancer
Highest Ranked Cancers for AK

Incidence:
 Breast, Prostate, Lung, Colorectal, Bladder (56%)
Mortality:
 Lung, Colorectal, Breast, Pancreas, Prostate (55%)
 Lung alone is 29% of all cancer deaths
Why Do People Get Cancer?

Personal behavior
 Smoking & chewing tobacco
 Excessive alcohol consumption (esp. in smokers)
 Obesity / lack of exercise
 Type of diet (high in red meats and processed meats,
low in fruits & vegetables)
Why Do People Get Cancer?

Workplace exposure
 Example: asbestos, certain industrial chemicals
Genetics / family history
Unknown factors
Why Do People Get Cancer?
Other Risk Factors:




Previous exposure to ionizing radiation for medical
treatment of chest (breast & lung cancers) or head and
neck (thyroid cancer)
Infection from Human Papilloma Virus (cervical, anal, &
mouth cancers)
Stomach infection from H. pylori bacteria (stomach &
pancreatic cancers)
Certain illnesses: Type 2 diabetes & inflammatory bowel
disease (colorectal cancer); Hepatitis B & C (liver cancer)
Why Do People Get Cancer?
Other Risk Factors:



Long-term use of oral contraceptives or hormone
replacement therapy, esp. estrogen plus progesterone
(breast cancer)
Radon gas exposure in the home & secondhand smoke
exposure (lung cancer in non-smokers)
Excessive sun exposure and use of tanning beds,
especially at a young age (skin cancer)
Why Do People Get Cancer?

Very rarely due to exposure to pollution found in
the environment, although this tends to be the
public's greatest concern
Why Geocode Cancer Patients?


Incidence case counts and rates reported by
borough/census area
Mortality case counts and rates reported by
borough/census area
[Map: Lung Cancer Incidence Rates by Borough/Census Area, 1996-2004]
Legend, rate per 100,000: 102.8 to 218 (5); 92.6 to 102.8 (1); 82.4 to 92.6 (5); 72.2 to 82.4 (5); 28 to 72.2 (9)
[Map: Prostate Cancer Incidence Rates by Borough/Census Area, 1996-2004]
Legend, rate per 100,000: 220 to 270 (2); 170 to 220 (6); 120 to 170 (8); 70 to 120 (2); 20 to 70 (4)
[Map: Number of In Situ & Malignant Breast Cancer Cases, AK Residents, 1996-2005]
Legend, # of cases: 248 to 1,839 (5); 49 to 247 (5); 33 to 48 (5); 16 to 32 (6); 2 to 15 (6)
Why Geocode Cancer Patients?


Determine census tract for each cancer case
Census tracts can be linked with socio-economic
data collected by the US Census Bureau
 Researchers can conduct quality of care studies
Geocoded cancer data are necessary to conduct cancer
studies for the general public and communities
with cancer concerns
Why Are There Cancer Concerns?

General Public
 Personal experience with cancer
 Neighbors getting cancer
 Something in the water?

Communities
 Usually some sort of past event as a trigger
 Oil or chemical spill, pollution from industrial
or military facility
Responding to the General Public




Phone call: Listen, ask questions, be understanding,
the caller is usually upset
Concern is usually due to neighborhood aging
Extended phone interview if caller not satisfied
Review information and determine level of
investigation
Responding to the General Public

Provide copy of Cancer and the Environment,
by the National Cancer Institute
Responding to Communities



Usually referred to us by AK DHSS’s Section of
Epidemiology or CDC’s Agency for Toxic Substances
and Disease Registry (ATSDR)
Proceed with a full cancer study if there is a
triggering event of concern
If no triggering event, start with a smaller study as
a “screening” measure
Past Community Cancer Studies



Chemical spill at refinery in North Pole
Pollution from old military facilities
near Savoonga and Gambell on
St. Lawrence Island
Pollution from old military facilities near Yakutat
Cancer Study
Addresses These Questions:


Are the number and types of cancers
reported in the community unusual (cases & deaths)?
Is the number of observed cancers greater than the
number of expected cancers (cases & deaths)?
First Study:
Number and Types of Cancers




Alaska Cancer Registry for reported cancer cases
Bureau of Vital Statistics’ State Mortality Database
for reported cancer deaths
Cancers per year
Compare most common cancers to
top ranked cancers in the state
Second Study:
Observed vs. Expected Cancers




Determine the number of reported cancers for the
community or for an affected census tract in the
community
Calculate the number of expected cancers
Is observed greater than expected?
If so, is the difference statistically significant?
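
The observed-versus-expected comparison can be sketched as follows. This is a minimal illustration, not necessarily the method ACR uses: it assumes the expected count comes from applying reference (for example, statewide) age-specific rates to the community's population, and it assumes a one-sided exact Poisson test for significance. The function names and all input numbers are hypothetical.

```python
import math

def expected_cases(rates_per_100k, person_years):
    """Expected count: sum over age groups of reference rate x community person-years."""
    return sum(r / 100_000 * py for r, py in zip(rates_per_100k, person_years))

def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X follows a Poisson distribution with mean `expected`."""
    below = sum(math.exp(-expected) * expected**k / math.factorial(k)
                for k in range(observed))
    return 1.0 - below

# Hypothetical inputs, for illustration only
reference_rates = [10.0, 50.0, 200.0]    # cases per 100,000 person-years, by age group
community_py = [20_000, 10_000, 5_000]   # community person-years, same age groups
observed = 25                            # cancers reported for the community

expected = expected_cases(reference_rates, community_py)
ratio = observed / expected              # observed-to-expected ratio
p_value = poisson_upper_tail(observed, expected)

print(f"expected={expected:.1f} ratio={ratio:.2f} p={p_value:.3f}")
```

A ratio above 1 answers "is observed greater than expected?", and the p-value addresses whether the difference is statistically significant at a chosen threshold.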
Past ACR Geocoding Practices




Traditional geocoding was not done
Only about 10 towns in Alaska have more than one
census tract
Used a table of census tract numbers and town
names to populate the database
Census tracts only 34.3% complete
Preparing for Geocoding



Create special “Address at Geocoding” Fields
Reduce number of records w/ unknown addresses
Export data and put into format acceptable to the
geocoding software
“Address at Geocoding” Fields





Stores address that was used for geocoding
Initially the same as the patient address
Over time, patient address might get updated
For future geocoding, 2 addresses are compared
Geocode record if:
 Address at Geocoding is blank
 The 2 addresses are different
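
The re-geocoding rule above can be sketched as a small function. The function names are hypothetical, and the normalization step (ignoring case and extra spaces) is an added assumption rather than something stated in the slides.

```python
def normalize(address):
    """Upper-case and collapse whitespace so cosmetic differences are ignored (assumption)."""
    return " ".join((address or "").upper().split())

def needs_geocoding(patient_address, address_at_geocoding):
    """Geocode if Address at Geocoding is blank or differs from the patient address."""
    if not normalize(address_at_geocoding):
        return True
    return normalize(patient_address) != normalize(address_at_geocoding)

print(needs_geocoding("123 Main St", ""))              # blank: geocode
print(needs_geocoding("456 Oak Ave", "123 Main St"))   # changed: geocode
print(needs_geocoding("123 Main St", "123 main  st"))  # same address: skip
```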
Reduce Unknown Addresses

Identify records w/ unknown addresses
 14,679
records (36.3%)
 PO Box, rural route, pouch, general delivery, unknown
 P.O. Box,
PO Box, P O Box, P. O. Box, PO Bx, POB, Box, BX
 HC, PS, SR, RT, RR, Route
 Unk, Unknown, General Delivery, No Address, Bad Address,
Not Deliverable

Link unknown address records with source records
 260
records resolved
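
One way to flag the address variants listed above is a regular expression anchored at the start of the street field. This is a sketch under assumptions: the pattern covers only the variants named on the slide, it treats a blank street as unknown, and `is_unknown_address` is a hypothetical name.

```python
import re

# Matches only when the street field STARTS with one of the listed variants.
UNKNOWN_ADDRESS = re.compile(
    r"""^\s*(
        P\.?\s*O\.?\s*(BOX|BX)\b
      | POB\b
      | (BOX|BX)\b
      | (HC|PS|SR|RT|RR|ROUTE|RURAL\s+ROUTE|POUCH)\b
      | (UNK|UNKNOWN|GENERAL\s+DELIVERY|NO\s+ADDRESS|BAD\s+ADDRESS|NOT\s+DELIVERABLE)\b
    )""",
    re.IGNORECASE | re.VERBOSE,
)

def is_unknown_address(street):
    """True if the street field is blank or starts with a non-geocodable pattern."""
    if not street or not street.strip():
        return True  # assumption: a blank street counts as unknown
    return UNKNOWN_ADDRESS.match(street) is not None

for s in ["P.O. Box 123", "PO Bx 5", "POB 12", "RR 2 Box 5",
          "General Delivery", "123 Main St", "Boxwood Ln"]:
    print(s, is_unknown_address(s))
```

Because the pattern is anchored at the start, a street address that merely contains "Box" later in the line is left alone; whether that is the right trade-off depends on how the registry's addresses were entered.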
Reduce Unknown Addresses

Link unknown records with the Alaska Permanent
Fund Dividend (PFD) Application database
 578
records resolved from mailing address
 4023 records resolved from physical address
Reduce Unknown Addresses

Manually review remaining non-numeric addresses
 Sorted all addresses in reverse-alpha order and
manually reviewed addresses starting with letters
 Apartment buildings
 Mobile home parks
 Long-term care facilities
 Correctional facilities
 25 records resolved
Reduce Unknown Addresses



Total of 40,389 records
Started with 14,679 unknown addresses (36.3%)
Ended with 9,793 unknown addresses (24.2%)
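
The totals above follow from the counts resolved at each step. A short check of the arithmetic, using only the figures reported on the preceding slides (the variable names are illustrative):

```python
total_records = 40_389
unknown_start = 14_679

# Records resolved at each step, as reported on the preceding slides
resolved = {
    "source records": 260,
    "PFD mailing address": 578,
    "PFD physical address": 4_023,
    "manual review": 25,
}

unknown_end = unknown_start - sum(resolved.values())
pct_start = round(100 * unknown_start / total_records, 1)
pct_end = round(100 * unknown_end / total_records, 1)

print(unknown_end, pct_start, pct_end)  # 9793 36.3 24.2
```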
Format Data for Geocoding


Exported data from database
Prepared a geocoding data file with specific fields
 Year of Diagnosis
 Unique ID: combination of Patient ID and Tumor Number
 Address at diagnosis -- Street
 Address at diagnosis -- City
 Address at diagnosis -- State
 Address at diagnosis -- Postal Code
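
A geocoding input file with these fields could be assembled along the following lines. This is a sketch only: the column names, the hyphen used to join Patient ID and Tumor Number, the CSV format, and the sample record are all assumptions, not the layout the geocoder requires.

```python
import csv
import io

# Hypothetical column names; the geocoder's required layout may differ.
FIELDS = ["YearOfDiagnosis", "UniqueID", "Street", "City", "State", "PostalCode"]

def make_unique_id(patient_id, tumor_number):
    """Combine Patient ID and Tumor Number into one key (separator is an assumption)."""
    return f"{patient_id}-{tumor_number}"

def write_geocoding_file(records, out):
    """Write one row per tumor record with the fields listed above."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for rec in records:
        writer.writerow({
            "YearOfDiagnosis": rec["year"],
            "UniqueID": make_unique_id(rec["patient_id"], rec["tumor_number"]),
            "Street": rec["street"],
            "City": rec["city"],
            "State": rec["state"],
            "PostalCode": rec["zip"],
        })

# Made-up example record
sample = [{"year": 2005, "patient_id": "00012345", "tumor_number": "01",
           "street": "123 Main St", "city": "Anchorage", "state": "AK", "zip": "99501"}]
buf = io.StringIO()
write_geocoding_file(sample, buf)
print(buf.getvalue())
```

Carrying the unique ID through the file is what allows the geocoder output to be linked back to the registry database afterwards.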
Interpreting the Geocoder Output


Don’t just assume all the output is correct!
Verify all records have been geocoded
 Problem with geocoder timing out?
Review “GIS Coordinate Quality” codes
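
Checking that every submitted record came back can be done by comparing unique IDs between the input and output files. A minimal sketch, with a hypothetical function name and made-up IDs:

```python
def find_missing_ids(input_ids, output_ids):
    """Return unique IDs that were submitted but are absent from the geocoder output."""
    return sorted(set(input_ids) - set(output_ids))

submitted = ["0001-01", "0002-01", "0002-02", "0003-01"]
returned = ["0001-01", "0002-02"]

missing = find_missing_ids(submitted, returned)
print(missing)  # records to resubmit, e.g. after a timeout
```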
GIS Coordinate Quality


Manually research records geocoded to the state or
borough centroid
Review records geocoded to city centroid
 OK for PO Boxes
 Not OK for street addresses
 Census tract assignment for cities with multiple census
tracts needs to be recoded as 999999

Re-geocode corrected addresses
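
The review rules above can be written as a small decision function. The match-level labels used here ("state_centroid", "borough_centroid", "city_centroid") are hypothetical stand-ins, not the actual NAACCR GIS Coordinate Quality code values, and the function name is illustrative.

```python
def review_action(match_level, is_po_box, city_has_multiple_tracts):
    """Decide what to do with a geocoded record, following the review rules above."""
    if match_level in ("state_centroid", "borough_centroid"):
        return "manual research"
    if match_level == "city_centroid":
        if not is_po_box:
            # A street address should match better than the city centroid
            return "correct address and re-geocode"
        if city_has_multiple_tracts:
            # The tract cannot be determined from a PO Box in a multi-tract city
            return "accept, recode census tract to 999999"
        return "accept"
    return "accept"

print(review_action("state_centroid", False, False))
print(review_action("city_centroid", False, True))
print(review_action("city_centroid", True, True))
print(review_action("city_centroid", True, False))
```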
Geocoding Results

Certain towns were not recognized by the geocoder
and were geocoded to the state centroid
 Kongiganak, Auke Bay, Nikiski, Ward Cove, Funter Bay,
Cube Cove
 Ketchikan with zip code 99950
 Manually assigned to the city centroid and the
corresponding census tract

No cases assigned to a borough centroid
Geocoding Results

160 numerical street addresses geocoded to city
centroid
 Manually researched, corrected in database, re-geocoded
All cases geocoded to city centroid in
Anchorage, Fairbanks, Wasilla, Juneau, Kodiak
were PO Boxes or UNKNOWN
 Zip codes are specific to a PO Box and do not have a Zip
Code Tabulation Area (ZCTA)
 Manually changed census tract to 999999
Geocoding Results






Used unique ID field to link back to the cancer registry
database
Updated geocode-related data fields
1996-2010 cancer data geocoded with Census 2000
data (35,315 records)
2011+ cancer data geocoded with Census 2010 data
(3,302 records)
Successfully geocoded 38,617 of 40,436 records to
known census tracts
Census tract completeness increased from
34.3% to 95.5%
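
The census vintage rule and the completeness figure above can be checked with a few lines. The function name is hypothetical; the year cutoffs and record counts are the ones reported on this slide.

```python
def census_vintage(diagnosis_year):
    """Census 2000 geography for diagnoses through 2010, Census 2010 for 2011 onward."""
    return 2000 if diagnosis_year <= 2010 else 2010

geocoded_census_2000 = 35_315   # 1996-2010 records
geocoded_census_2010 = 3_302    # 2011+ records
total_records = 40_436

geocoded = geocoded_census_2000 + geocoded_census_2010
completeness = round(100 * geocoded / total_records, 1)

print(census_vintage(2004), census_vintage(2012))  # 2000 2010
print(geocoded, completeness)                      # 38617 95.5
```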
Acknowledgements
This presentation was supported by the cooperative
agreement number U58/DP003856-01 (DP-1205-01)
from The Centers for Disease Control and
Prevention. Its contents are solely the responsibility of
the authors and do not necessarily represent the
official views of CDC.