HOW TO USE STATISTICS FOR LIBRARY DECISION-MAKING
Diana Very

June 27, 2011
2
MBA for Librarians: Statistics
This program will demystify statistical concepts and skills
and illustrate their library applications. The instructor will
show how data can and should influence all areas of library
operations. Learn about studies, tools and resources to
assist you in comparing your data with that from other
institutions. Create information, knowledge and stories
from numerical and qualitative data to enhance decision
making.
The goal: Manage smarter.
Led by Diana Very.
3
Diana Very
Director of LSTA, Statistics, & Research
Funding provided for this presentation
by IMLS through the LSTA program grant
4
How many have never
used statistics?
5
Has anyone received one of these?
100%
6
Statistics tell a story
• What
• Where
• When
• How
• Why
7
What is a Statistic?
• A statistic is a quantity that is calculated from a sample
of data. It is used to give information about unknown
values in the corresponding population. For example, the
average of the data in a sample is used to give
information about the overall average in the population
from which that sample was drawn.
• It is possible to draw more than one sample from the
same population and the value of a statistic will in general
vary from sample to sample. For example, the average
value in a sample is a statistic. The average values in
more than one sample, drawn from the same population,
will not necessarily be equal.
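A minimal sketch of the point above, assuming two hand-picked samples (they are illustrative, not drawn by any real sampling procedure) from the eight-value population used in the worked example in this presentation. Each sample mean is a statistic, and the two differ from each other and from the population mean.

```python
from statistics import mean

# Population: the eight values from the worked example in this presentation.
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Two hypothetical samples of four values each, chosen by hand for illustration.
sample_a = [2, 4, 4, 4]
sample_b = [5, 5, 7, 9]

population_mean = mean(population)
sample_a_mean = mean(sample_a)
sample_b_mean = mean(sample_b)

print("Population mean:", population_mean)
print("Sample A mean:  ", sample_a_mean)
print("Sample B mean:  ", sample_b_mean)
```

In practice a sample would be drawn at random, and the sample means would usually fall closer to the population mean than these deliberately lopsided samples do.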
8
Definitions
• Mean
• Median
• Mode
• Percentage Change: %Δ = ((New − Old) / Old) × 100
• Range
• Sample
• Standard Deviation
• Target Population
(Children at a Children’s Program)
• Trend
• Variance
• Correlation
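Several of these definitions can be sketched in a few lines of Python. The attendance figures are hypothetical, the helper name `percentage_change` is an assumption made for this illustration, and the eight data values are the ones used in the worked example in this presentation.

```python
from statistics import median, mode

def percentage_change(old, new):
    """Percentage change from old to new: ((new - old) / old) * 100."""
    return (new - old) / old * 100

# Hypothetical program attendance for two consecutive years.
attendance_last_year = 400
attendance_this_year = 500
change = percentage_change(attendance_last_year, attendance_this_year)

# Median, mode, and range for a small data set.
values = [2, 4, 4, 4, 5, 5, 7, 9]
middle_value = median(values)          # middle of the sorted values
most_common = mode(values)             # value that occurs most often
value_range = max(values) - min(values)

print(f"Percentage change: {change:.1f}%")
print("Median:", middle_value, "Mode:", most_common, "Range:", value_range)
```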
9
Example of Population, Mean, Variance,
Standard Deviation
• Consider a population consisting of the following eight values:
2,4,4,4,5,5,7,9
• These eight data points have the mean (average) of 5:
• (2+4+4+4+5+5+7+9)/8 = 5
• To calculate the population standard deviation, first compute the difference
of each data point from the mean, and square the result of each:
• (2-5)² = (-3)² = 9
• (4-5)² = (-1)² = 1
• (4-5)² = (-1)² = 1
• (4-5)² = (-1)² = 1
• (5-5)² = 0² = 0
• (5-5)² = 0² = 0
• (7-5)² = 2² = 4
• (9-5)² = 4² = 16
• Next compute the average of these values, and take the square root:
(9+1+1+1+0+0+4+16)/8 = 32/8 = 4 = variance
square root of 4 is 2 = standard deviation
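The same calculation can be checked in Python. This is a sketch that repeats the steps by hand and then compares them with the standard library's population functions (`pvariance` and `pstdev` divide by N, as the slide does).

```python
from statistics import mean, pvariance, pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean of the eight values.
mu = mean(values)

# Step 2: squared difference of each value from the mean.
squared_diffs = [(x - mu) ** 2 for x in values]

# Step 3: average the squared differences (variance), then take the square root.
variance_by_hand = sum(squared_diffs) / len(values)
std_dev_by_hand = variance_by_hand ** 0.5

print("Mean:", mu)
print("Variance:", variance_by_hand, "library result:", pvariance(values))
print("Standard deviation:", std_dev_by_hand, "library result:", pstdev(values))
```

When the data are a sample rather than the whole population, `statistics.variance` and `statistics.stdev` divide by N - 1 instead and give slightly larger results.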
10
Example of Normal Curve
• This quantity is the population standard deviation; it is equal
to the square root of the variance.
• A slightly more complicated real-life example: the average
height for adult men in the United States is about 70", with a
standard deviation of around 3". This means that most men
(about 68%, assuming a normal distribution) have a height
within 3" of the mean (67"–73") — one standard deviation —
and almost all men (about 95%) have a height within 6" of the
mean (64"–76") — two standard deviations. If the standard
deviation were zero, then all men would be exactly 70" tall. If
the standard deviation were 20", then men would have much
more variable heights, with a typical range of about 50"–90".
Three standard deviations account for 99.7% of the sample
population being studied, assuming the distribution is normal
(bell-shaped).
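The 68%, 95%, and 99.7% figures can be reproduced with Python's `statistics.NormalDist`. This is a sketch that assumes heights follow an exactly normal distribution with the mean of 70" and standard deviation of 3" quoted above; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

MEAN_HEIGHT = 70   # inches, from the example above
STD_DEV = 3        # inches, from the example above

heights = NormalDist(mu=MEAN_HEIGHT, sigma=STD_DEV)

def share_within(k):
    """Share of the distribution within k standard deviations of the mean."""
    low = MEAN_HEIGHT - k * STD_DEV
    high = MEAN_HEIGHT + k * STD_DEV
    return heights.cdf(high) - heights.cdf(low)

for k in (1, 2, 3):
    low = MEAN_HEIGHT - k * STD_DEV
    high = MEAN_HEIGHT + k * STD_DEV
    print(f'Within {k} standard deviation(s), {low}"-{high}": {share_within(k):.1%}')
```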
11
Normal Curve or Bell Curve
12
Multivariate Analysis
Not as scary as it sounds
Involves observation and analysis of more than one
statistical variable at a time. In design and analysis, the
technique is used to perform trade studies across multiple
dimensions while taking into account the effects of all
variables on the responses of interest.
Example:
During a production process, a number of different
measurements such as the tensile strength, brittleness,
diameter, etc. are taken on the same unit. Collectively such
data are viewed as multivariate data.
13
Pearson Correlation
The Pearson correlation coefficient r measures the strength
of linear dependence between two variables X and Y.
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{SS_X \, SS_Y}} $$
where SS_X = Σ(X − X̄)² and SS_Y = Σ(Y − Ȳ)² are the sums of
squared deviations from the means.
It returns values between +1 and −1 inclusive.
+1 implies a perfect positive linear relationship: Y increases
as X increases.
0 implies that there is no linear correlation between the
variables.
−1 implies a perfect negative linear relationship: Y decreases
as X increases.
For −1 and +1, a linear equation exists that describes the
relationship between X and Y perfectly.
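A minimal sketch of the formula in Python, with a hypothetical helper named `pearson_r`. The data are the Total Circulation and Library Visits figures for 2010 back to 2004 from the Public Library Use Determinants table in this presentation. For these two columns the result comes out close to 0.922, one of the correlation values reported with that table. The pairing is inferred from the arithmetic, not stated in the source.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: sum of cross-products divided by sqrt(SS_x * SS_y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Yearly figures, 2010 back to 2004, from the table in this presentation.
circulation = [47155895, 47811748, 43663621, 40816175, 40735627, 41155342, 40269048]
visits = [39392010, 40852165, 36979778, 35703912, 31952301, 31557896, 31285987]

r = pearson_r(circulation, visits)
print(f"r (circulation vs. visits) = {r:.3f}")
```

With only seven yearly data points, a coefficient like this describes the pattern in the data but is weak evidence about cause and effect.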
14
Public Library Use Determinants
Year    Population  Total Collection  Total Circulation  Library Visits  Reference Transactions  Public Hours  Total Staff
2010    10,069,700        18,379,134         47,155,895      39,392,010               9,513,049       882,799        2,995
2009     9,446,298        17,856,414         47,811,748      40,852,165               8,734,545       907,316        3,105
2008     9,319,532        17,646,302         43,663,621      36,979,778               7,994,164       905,630        3,109
2007     9,098,140        17,056,943         40,816,175      35,703,912               8,275,923       896,848        3,018
2006     8,789,529        16,496,624         40,735,627      31,952,301               8,547,509       887,400        3,038
2005     8,650,046        16,041,499         41,155,342      31,557,896               8,571,452       868,892        2,796
2004     8,510,563        16,040,938         40,269,048      31,285,987               8,076,037       871,401        2,821
Mean     9,126,258        17,073,979         43,086,779      35,389,150               8,530,383       888,612        2,983
Median   9,098,140        17,056,943         41,155,342      35,703,912               8,547,509       887,400        3,018
Mode          #N/A              #N/A               #N/A            #N/A                    #N/A          #N/A         #N/A

Measure                 Variance                 Standard Deviation
Population              541,635,247,329          540,046.06
Total Collection        1,581,668,337,779        922,858.37
Total Circulation       19,057,194,894,718       3,203,367.99
Library Visits          28,421,398,103,038       3,912,010.96
Reference Transactions  485,335,050,482.728      511,208.62
Public Hours            441,888,716              15,425.31
Total Staff             29,827                   126.73

Correlation: 0.922, 0.715, 0.490
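The summary rows for one column can be recomputed as a check. This sketch uses the Population figures for 2004 through 2010 and assumes the spreadsheet used the sample standard deviation (dividing by N - 1), which is what `statistics.stdev` computes. Every value in the column is different, so there is no single most common value, which is presumably why the Mode row shows #N/A.

```python
from statistics import mean, median, multimode, stdev

# Population column, 2010 back to 2004.
population = [10069700, 9446298, 9319532, 9098140, 8789529, 8650046, 8510563]

pop_mean = mean(population)
pop_median = median(population)
pop_std_dev = stdev(population)      # sample standard deviation (N - 1)
pop_modes = multimode(population)    # every value ties when nothing repeats

print(f"Mean: {pop_mean:,.0f}")
print(f"Median: {pop_median:,}")
print(f"Sample standard deviation: {pop_std_dev:,.2f}")
print("Distinct values tied for most common:", len(pop_modes))
```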
15
Public Library Use Determinants
by Diana Very, 4/1/2011
• This hypothesis was based on an assumption that library users
only used the libraries for new books and best sellers that are
provided when the budget is available to buy them. The library
materials budget was used as the independent variable
assuming that the circulation was dependent on the amount
available in the budget. The Pearson correlation coefficient,
which measures the extent to which these variables are linearly
related, came out at r = -0.502. That is a moderate negative
relationship, so the data do not support the assumption that
circulation rises with the materials budget.
• Circulation is driven by library visits, which suggests that
marketing to more of the library service population would do
more to increase circulation and use of library materials than
spending more money on new materials. This study may be helpful
when deciding between the marketing budget and the materials
budget.
16
Where to get statistics?
• Statistics are everywhere. The statistics that you want to
use will depend on what decision you want to make from
them.
• Some questions that come up for libraries
• Who are our customers?
• Can we bring in more users?
17
Public Library Statistical Survey - IMLS
http://harvester.census.gov/imls/publib.asp
Public Libraries in the United States:
Fiscal Year 2008
Release Date: June 2010
Revised Date: January 2011
http://harvester.census.gov/imls/pubs/pls/index.asp
18
Academic Library Statistical Survey
• Academic Libraries: 2008 First Look
• http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010348
• National Center for Education Statistics
• FY2008 edition provides stats on 3,827
academic libraries
• Circulations
• Public Service Hours
• Gate Count
• Collection Numbers & Types
• Staff
19
Public School Library/Media Center
• http://nces.ed.gov/pubsearch/getpubcats.asp?sid=041#
• Several reports are available at this site, but only to 2000.
• Federal Libraries and Media Centers reports are also
available, but not up to date.
• Digest of Education Statistics, 2010
• http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2011015
• Contains up-to-date stats for education from kindergarten
through graduate school.
20
Public Library Data Service
Statistical Report
• The survey for 2010 data is the 23rd edition of the annual
survey.
• This report is created from a survey sent to 9,272 valid
U.S. and Canadian libraries through web contacts.
• 1,105 responded to the questionnaire.
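When comparing against survey data like this, it helps to know how much of the surveyed group actually answered. A quick sketch of the response rate implied by the two figures above:

```python
libraries_surveyed = 9272   # valid U.S. and Canadian libraries contacted
libraries_responded = 1105  # libraries that returned the questionnaire

response_rate = libraries_responded / libraries_surveyed * 100
print(f"Response rate: {response_rate:.1f}%")
```

Roughly one library in eight responded, so the report describes the responding libraries and may not represent all public libraries.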
21
Census Data
• Home page of data sets and instructions
• http://www.census.gov/acs/www/
• American Community Survey – Provides demographics
such as population number, races, housing, education,
etc., for states, counties, and municipalities
• http://www.census.gov/acs/www/
• Guidance for data users – provides instruction on using
the data and finding the correct data set
• http://www.census.gov/acs/www/guidance_for_data_users/guidance_main/
22
Other Examples of Library Stats
• 2011 State of America’s Libraries — ALA Releases
Annual Report :
http://ala.org/ala/newspresscenter/mediapresscenter/americaslibraries2011/index.cfm
• Library Research Service – Colorado Library Stats
http://www.lrs.org/pub_stats.php
• Current Look at Georgia Public Libraries FY 2010
http://www.georgialibraries.org/lib/publiclibinfo/
23
Where to find comparative statistics
• This depends on what type of comparisons you want to
make
• In Georgia, the library directors want to compare their
system with others in Georgia
• In DeKalb County, Georgia, the library branch managers
want to compare their branches to other libraries within
the county system.
• The Public Library Survey from IMLS provides data for
states to compare their state data with other states.
• Peer-to-peer comparisons;
• I’m ok just so I’m better than you…Oh, my!
24
Compare this year with last year
• Use a trend analysis (compares different years of same
statistic) for staff motivation, accountability reports,
marketing and promotional activities.
• Stats to Use:
• Circulation
• Visits
• Program attendance
• Genre Circulation
• Library Cards
• Try per capita calculations
• Library Cards per capita
• Program attendance per capita
• Identify the % not participating.
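A sketch of a trend analysis and a per capita calculation in Python, using the Total Circulation and Population columns from the statistics table in this presentation. The variable and function names are illustrative choices, not part of any library system.

```python
years = [2004, 2005, 2006, 2007, 2008, 2009, 2010]

circulation = {
    2004: 40269048, 2005: 41155342, 2006: 40735627, 2007: 40816175,
    2008: 43663621, 2009: 47811748, 2010: 47155895,
}
population = {
    2004: 8510563, 2005: 8650046, 2006: 8789529, 2007: 9098140,
    2008: 9319532, 2009: 9446298, 2010: 10069700,
}

def percentage_change(old, new):
    """Percentage change from old to new: ((new - old) / old) * 100."""
    return (new - old) / old * 100

# Trend: change in circulation from each year to the next.
yearly_change = {
    year: percentage_change(circulation[year - 1], circulation[year])
    for year in years[1:]
}

# Per capita: circulation divided by the population served that year.
circulation_per_capita = {
    year: circulation[year] / population[year] for year in years
}

for year in years[1:]:
    print(f"{year}: {yearly_change[year]:+.2f}% change, "
          f"{circulation_per_capita[year]:.2f} circulations per capita")
```

Per capita figures separate real changes in use from changes that only reflect a growing or shrinking service population.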
25
A table of usable statistics
Year    Population  Total Collection  Total Circulation  Library Visits  Reference Transactions  Public Hours  Total Staff
2010    10,069,700        18,379,134         47,155,895      39,392,010               9,513,049       882,799        2,995
2009     9,446,298        17,856,414         47,811,748      40,852,165               8,734,545       907,316        3,105
2008     9,319,532        17,646,302         43,663,621      36,979,778               7,994,164       905,630        3,109
2007     9,098,140        17,056,943         40,816,175      35,703,912               8,275,923       896,848        3,018
2006     8,789,529        16,496,624         40,735,627      31,952,301               8,547,509       887,400        3,038
2005     8,650,046        16,041,499         41,155,342      31,557,896               8,571,452       868,892        2,796
2004     8,510,563        16,040,938         40,269,048      31,285,987               8,076,037       871,401        2,821
26
How to make the decision – Step 1
1. What’s the situation?
4% budget cut
27
How to make the decision – Step 2
2. Decision tree
• Reduce staff – already skeleton staff
• Furlough staff – not fair to staff
• Reduce hours – possibility
• Reduce library collection budget – cut last
year to nearly nothing
• Reduce outreach services – agreements
already in place
28
How to make the decision – Step 3
3. Justify how to reduce hours
• Check into patterns of library use
• Check into staff efficiencies
• Check into circulation and reference
use
• Check into website hits and WIFI
traffic
29
Group Work
Name a statistic or set of statistics
that will answer:
• Patterns of library use
• Staff efficiencies
• Circulation and Reference use
• Website and WIFI traffic
30
If you want to know about
patterns of library use,
you would collect what
type of data?
Reference contacts by hour
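A sketch of how reference contacts by hour could be tallied to find the busiest time. The contact log is entirely hypothetical: each entry is the hour of day (24-hour clock) at which one contact was recorded.

```python
from collections import Counter

# Hypothetical log: the hour of day for each reference contact in one week.
contact_hours = [10, 11, 11, 14, 15, 15, 15, 16, 18, 15, 11]

contacts_by_hour = Counter(contact_hours)
busiest_hour, busiest_count = contacts_by_hour.most_common(1)[0]

for hour in sorted(contacts_by_hour):
    print(f"{hour:02d}:00  {'#' * contacts_by_hour[hour]}")
print(f"Busiest hour: {busiest_hour:02d}:00 with {busiest_count} contacts")
```

Hours with few or no contacts are the natural candidates when deciding which hours to reduce.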
31
If you want to know about
staff efficiencies, you
would collect what type of
data?
Circulations per FTE
32
If you want to know about
circulation and
reference use, you would
collect what
type of data?
Collection turnover rate (circulation/collection)
Hint hint
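Both ratios can be computed from the 2010 row of the statistics table in this presentation. This sketch treats the Total Staff figure as a stand-in for FTE, which is an assumption; the table does not say whether staff are counted as full-time equivalents.

```python
# 2010 figures from the statistics table.
total_circulation = 47155895
total_collection = 18379134
total_staff = 2995  # assumed here to approximate FTE

# Staff efficiency: circulations handled per staff member.
circulation_per_staff = total_circulation / total_staff

# Collection turnover rate: average number of times each item circulated.
turnover_rate = total_circulation / total_collection

print(f"Circulations per staff member: {circulation_per_staff:,.0f}")
print(f"Collection turnover rate: {turnover_rate:.2f}")
```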
33
If you want to know about
website hits and WIFI
traffic, you would collect
what type
of data?
34
Create a Logic Model
Logic Model Template
Project Title
Grant Period
Total Cost
Project Description
Resources
In order to accomplish our set of activities, we will
need the following:
• Name of resources
• State, federal, or other funding source
Activities/Methods
In order to address our problem we will conduct the
following activities:
• Name of activities
Outputs
We expect that these activities will produce the
following evidence of service delivery:
• Number of items
Outcomes
We expect changes in attitudes, behaviors, knowledge,
and skills resulting from this project:
• Increased number
• Percentage increase
Impacts
Organizational, community, or procedural level changes
resulting from this project:
• Increased number
• Percentage increase
Other Results
• Anecdotal information
• Exemplary reason
35
Example Using the Logic Model
Logic Model
Project Title: Ourtown Summer Library Program
Grant Period: 4/1/11 - 9/15/11
Total Cost: $1,000 per library system
Project Description
Ourtown public library and school library will work together to bring a library activity day
every Wednesday afternoon to the local mall's center staging area. Teenagers will be
hired as mentors, and activities will involve reading and writing stories about animals.
Resources
In order to accomplish our set of activities, we will
need the following:
• Grant - LSTA funding
• Staff
• Volunteers
Activities/Methods
In order to address our problem we will conduct the
following activities:
• Publicity
• Craft activities
• Reading stories
• Computer training
Outputs
We expect that, once completed or underway, these
activities will produce the following evidence of
service delivery:
• Number of patrons served
• Number of computer classes
• Number of activities
• Number of days of activities
Outcomes
We expect changes in attitudes, behaviors, knowledge,
and skills resulting from this project:
• New patrons
• Patrons comfortable with computer use
• Library skills increased
• Library behavioral problems decreased because of
project attendance
Impacts
Organizational, community, or procedural level changes
resulting from this project:
• Increased attendance
Other Results
Anecdotal information, exemplary reason:
• Increased family services
• Improved library services
36
Pictures often say more than words
37
Tell the 2011 Clifford Presentation Story
1,518 participants
15 programs
@ a Cost of
62
Per Participant
http://animoto.com/play/jSIexmvTn8wimFaCnajcbw
38
Use Statistics to Make Informed Decisions
• When is the busiest time at the library?
• Do I need more staff to keep up?
• Do we really need a new building or can we rearrange the
current facility?
• Should we arrange the fiction by genre or alphabetically
by author?
• Why is our teen collection not being used? Old collection?
Hidden in the middle of the picture books? Teens don’t
know about our books?
• Would the community support the program if they knew
the benefits?
39
Thank you!!
Contact information:
Diana Very
Georgia Public Library Service
1800 Century Place, Ste. 150
Atlanta, GA 30345
404-235-7156
[email protected]
40
References
• Standard Deviation from Wikipedia retrieved on 5/12/2011
from
http://en.wikipedia.org/wiki/Standard_deviation
• PLA - Public Library Data Service Statistical
Report. 2010. Presented by the PLA/ALA,
ordering information found at
http://pla.org/ala/mgrps/divs/pla/plapublications/pldsstatreport/index.cfm
• IMLS - Public Libraries in the United States:
Fiscal Year 2008. Only available on-line at
http://harvester.census.gov/imls/pubs/pls/pub_detail.asp?id=130
41
References, cont.
• Smith, Mark. 1996. Collecting and using public library
statistics: A how-to-do-it manual for librarians, Number
56. Neal-Schuman Publishers, Inc.
• 2011 State of America’s Libraries — ALA Releases Annual
Report :
http://ala.org/ala/newspresscenter/mediapresscenter/americaslibraries2011/index.cfm
• Library Research Service - Colorado Statistics
http://www.lrs.org/pub_stats.php
• Multivariate Analysis Concepts, retrieved from
http://support.sas.com/publishing/pubcat/chaps/56903.pdf
42
References, cont. 2
• Multivariate analysis, retrieved from Wikipedia,
http://en.wikipedia.org/wiki/Multivariate_analysis
• Very, Diana. 2011. Public Library Use Determinants, p 13,
e-mail for copy [email protected].