Statistical Package for the
Social Sciences (SPSS)
IBM SPSS Statistics 19.0
Yupaporn Siribut
Objectives
 to provide training in the use of a powerful software package that relieves students of computational drudgery
 to help you understand the concepts and techniques of statistical analysis
 to provide practice exercises in SPSS
The research process
Contents
Session I: Introduction
1. The usefulness of SPSS/PASW
2. What do we need to prepare?
3. Introduction to descriptive statistics
4. Exploring data with graphs
Contents (cont.)
Session II: Practice exercises
1. Doing basic statistics in SPSS
2. Doing regression in SPSS
3. Interpreting the results
Session I: An Overview
Statistical Package for the Social Sciences (SPSS) software was rebranded in 2009 as Predictive Analytics SoftWare (PASW); from version 19 it is released as IBM SPSS Statistics
Statistical software used by commercial, government, and academic organizations around the world to solve business and research problems
Quickly and easily discover new insights from data, test hypotheses, and build powerful predictive models
Even if you have little or no statistical or mathematical background, PASW Statistics will show you how to generate statistical support and decision-making information quickly and easily
Usefulness of SPSS
SPSS/PASW provides the following:
 Descriptive statistics (mean, median, mode, standard deviation, range)
 Discrete probability distributions (binomial, Poisson, geometric, hypergeometric)
 Continuous probability distributions (normal, t, chi-square, F)
 Correlation (rank correlation, Pearson’s correlation)
 Linear regression (simple and multiple linear regression)
 Logistic regression
 Market research
Applied research
• Factors influencing the adoption of OVF: logistic regression
• Factors influencing the extent of OVF by individual farm households: linear regression
• t-tests on individual measures assessed attitudinal differences between participants and non-participants in each group
• A simple linear regression model can be designed to analyze factors influencing the adoption of land management
How is SPSS output presented?
Figure 1 Daily calorie intake (kcal/capita/day) compared with the MDER (1,850 kcal) across lowland, upland and highland ecosystems.
The research process
What do we need to prepare?
1. Preparing a codebook
Preparing the codebook involves deciding how to go about:
 defining and labeling each of the variables
 assigning numbers to each of the possible responses
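A minimal sketch of these two decisions, written in Python rather than SPSS. The variable names (`sex`, `educ`), labels and numeric codes are hypothetical and chosen only for illustration.

```python
# Hypothetical codebook: each variable gets a label and a numeric code
# for every possible response (names and codes are illustrative only).
CODEBOOK = {
    "sex": {
        "label": "Sex of respondent",
        "codes": {1: "male", 2: "female"},
    },
    "educ": {
        "label": "Highest level of education",
        "codes": {1: "primary", 2: "secondary", 3: "tertiary"},
    },
}


def encode(variable, response):
    """Return the numeric code assigned to a response in the codebook."""
    for code, text in CODEBOOK[variable]["codes"].items():
        if text == response:
            return code
    raise ValueError(f"{response!r} is not a listed response for {variable!r}")


def decode(variable, code):
    """Return the response text that a numeric code stands for."""
    return CODEBOOK[variable]["codes"][code]


print(encode("educ", "secondary"))
print(decode("sex", 2))
```

Writing the codebook down before data entry means every number typed into the data file can be traced back to exactly one response.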
2. Creating a data file
Preparing a data file involves three key steps:
 Step 1. Check and modify, where necessary, the options that SPSS uses to display the data and the output that is produced
 Step 2. Set up the structure of the data file by ‘defining’ the variables
 Step 3. Enter the data, that is, the values obtained from each participant or respondent for each variable (“data entry”)
3. Data entry
A First Look at SPSS Statistics 19
When you start SPSS for the first time, it presents a screen similar to Fig 2.
Let's all take a look at the program.
Data editor for entering data
3.1 What to measure?
Things to think about before entering data
Variables
a) Independent and dependent variables
 Independent variable: predictor variable
 Dependent variable: outcome variable
b) Levels of measurement
 The relationship between what is being measured and the numbers that represent it is known as the level of measurement.
 Variables can be split into categorical and continuous, and within these types there are different levels of measurement
Categorical (entities are divided into distinct categories):
 Binary variable: There are only two categories (e.g. dead or alive)
 Nominal variable: There are more than two categories (e.g. whether someone is an omnivore, vegetarian, vegan, or fruitarian)
 Ordinal variable: The same as a nominal variable but the categories have a logical order (e.g. whether people got a fail, a pass, a merit or a distinction in their exam)
Continuous (entities get a distinct score):
 Interval variable: Equal intervals on the variable represent equal differences in the property being measured (e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15)
 Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also make sense, which requires a meaningful zero point (e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8)
4. Screen for errors
Common sources of error are:
 missing data coded as “999”
 'not applicable' or 'blank' coded as “0”
 typing errors on data entry
 column shift
 “made up” (fabricated) data
 coding errors
 measurement and interview error
Detection
Most errors can be detected using three procedures:
 Descriptive statistics (e.g. a standard deviation higher than the mean value)
 Scatter plots
 Histograms
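The descriptive-statistics check can be sketched in a few lines. This is an illustrative Python sketch, not SPSS syntax. It assumes missing values were entered as 999; the function names and the scores are made up.

```python
import statistics

MISSING_CODE = 999  # assumption: missing data were entered as 999


def recode_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-data code with None so it is not treated as a score."""
    return [None if v == missing_code else v for v in values]


def describe(values):
    """Mean and sample standard deviation of the non-missing values."""
    valid = [v for v in values if v is not None]
    return statistics.mean(valid), statistics.stdev(valid)


def looks_suspicious(values):
    """Flag a variable whose standard deviation is higher than its mean."""
    mean, sd = describe(values)
    return sd > mean


raw_scores = [12, 15, 999, 14, 13, 999, 16]  # made-up scores
print(looks_suspicious(raw_scores))                  # 999 still counted as a score
print(looks_suspicious(recode_missing(raw_scores)))  # 999 recoded to missing
```

The rule is only a rough screen: some genuine variables have a standard deviation larger than their mean, so a flagged variable should be inspected rather than corrected automatically.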
SPSS output: scatter plot
SPSS output: histogram
 Histogram
 Look at the tails of the distribution. Are there
data points sitting on their own, out on the
extremes?
 If so, these are potential outliers. If the scores
drop away in a reasonably even slope, there is
probably not too much to worry about.
Correction
 There are slightly different ways to deal with errors in DEPENDENT and INDEPENDENT variables.
 Dependent variables
• When there are a minimal number of errors, the values are generally recoded to "missing".
• Take a look at the values, then recode the variable
 Independent variables
• Set the error values to the data set mean or the group mean
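A minimal Python sketch of the two corrections, assuming an error can be recognised by a simple rule. The rule used here (ages outside 0 to 120), the function names and the data are hypothetical.

```python
import statistics


def recode_to_missing(values, is_error):
    """Dependent variable: replace erroneous values with None (missing)."""
    return [None if is_error(v) else v for v in values]


def replace_with_mean(values, is_error):
    """Independent variable: replace erroneous values with the mean of the valid ones."""
    valid = [v for v in values if not is_error(v)]
    mean = statistics.mean(valid)
    return [mean if is_error(v) else v for v in values]


def out_of_range(v):
    """Hypothetical error rule: ages outside 0-120 are treated as errors."""
    return not 0 <= v <= 120


ages = [30, 40, 350, 50]  # made-up data with one typing error (350)
print(recode_to_missing(ages, out_of_range))
print(replace_with_mean(ages, out_of_range))
```

To use the group mean instead of the data set mean, pass only the values of the relevant group to `replace_with_mean`.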
5. Exploring Data
 a) Descriptive statistics
Descriptive statistics are used to:
 describe the characteristics of your sample in the method section of your report
 check your variables for any violation of the assumptions underlying the statistical techniques that you will use to address your research questions
 address specific research questions
The different types of descriptive statistics (Mooi and Sarstedt, 2011)
Descriptive statistics
 Frequencies command
 The Frequencies command allows you to obtain a full range of descriptive statistics, including measures of central tendency, percentile values, dispersion and distribution
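To make the list of statistics concrete, here is a small Python sketch that computes the same kinds of summaries for one variable. It is not SPSS and does not reproduce its output; the function name and the scores are made up, the quartiles use the default method of Python's `statistics.quantiles`, which may differ slightly from the SPSS percentile method, and distribution measures (skewness, kurtosis) are left out.

```python
import statistics


def frequency_summary(values):
    """Summarise one variable: central tendency, percentiles, dispersion."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # 25th, 50th, 75th percentiles
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "percentiles": {25: q1, 50: q2, 75: q3},
        "std_dev": statistics.stdev(values),  # sample standard deviation
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


scores = [2, 3, 3, 4, 5, 5, 5, 6, 7]  # made-up scores
for name, value in frequency_summary(scores).items():
    print(name, value)
```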
SPSS output: Frequencies command
5. Exploring Data
 Statistical tests
 t-test
 ANOVA
 correlation
Correlation
Pearson correlation or Spearman correlation is used when you want to explore the strength of the relationship between two continuous variables.
This gives you an indication of both the direction (positive or negative) and the strength of the relationship.
Spearman's rank-order correlation is the alternative for ordinal (ranked) data or when the assumptions of Pearson correlation are not met.
Example of research question:
 Is there a relationship between the amount of control people have over their internal states and their levels of perceived stress? Do people with high levels of perceived control experience lower levels of perceived stress?
 Variables: total perceived stress (tpstress) and total PCOISS, the Perceived Control of Internal States Scale (tpcoiss)
Interpretation
In the example given here, the Pearson correlation coefficient (–.58) is negative, indicating a negative correlation between perceived control and stress: the more control people feel they have, the less stress they experience.
The Pearson correlation is –.581, which when squared indicates 33.76 per cent shared variance. Perceived control helps to explain nearly 34 per cent of the variance in respondents’ scores on the Perceived Stress Scale.
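The shared-variance figure is simply the squared correlation expressed as a percentage. The Python sketch below checks that arithmetic and shows how a Pearson coefficient is computed from raw scores. The function names and the `control` and `stress` scores are made up for illustration; they are not the data behind the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def shared_variance_percent(r):
    """Coefficient of determination (r squared) expressed as a percentage."""
    return r ** 2 * 100


# The coefficient reported in the example: squaring removes the sign
print(round(shared_variance_percent(-0.581), 2))

# Made-up scores: higher control goes with lower stress, so r is negative
control = [10, 20, 30, 40, 50]
stress = [48, 41, 35, 22, 18]
print(round(pearson_r(control, stress), 3))
```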
The results of the above example using Pearson correlation could be
presented in a research report as follows.
t-test
T-tests are used when you have only two groups (e.g. males/females) or two time points (e.g. pre-intervention, post-intervention)
The rationale of the t-test is to test for significant differences in the means of two samples; in SPSS the t-tests are found under Analyze > Compare Means
There are two types:
 Independent-samples t-test, used when you want to compare the mean scores of two different groups of people or conditions
 Paired-samples t-test, used when you want to compare the mean scores for the same group of people on two different occasions, or when you have matched pairs.
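The difference between the two types shows up in how the t value is calculated. The Python sketch below uses the standard textbook formulas; it is not SPSS output, the function names and data are made up, the independent-samples version assumes equal variances, and no p-value is computed.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """t statistic for two different groups, assuming equal variances (pooled)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, n_a + n_b - 2  # t value and degrees of freedom


def paired_t(time_1, time_2):
    """t statistic for the same people measured on two occasions."""
    diffs = [b - a for a, b in zip(time_1, time_2)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_error, n - 1


# Made-up scores for two different groups
print(independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))

# Made-up scores for the same four people before and after an intervention
print(paired_t([10, 12, 14, 16], [12, 13, 17, 18]))
```

The paired version works on the difference scores of each person, which is why it needs the two lists to be in the same order.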
Example of research question:
 Is there a significant difference in the mean self-esteem scores for males and females?
What you need: two variables:
 one categorical, independent variable (e.g. males/females)
 one continuous, dependent variable (e.g. self-esteem scores)
SPSS output
Are the N values for males and females correct?
If your Sig. value for Levene’s test is larger than .05 (e.g. .07, .10), you should use the first line in the table, which refers to Equal variances assumed.
If the significance level of Levene’s test is p = .05 or less (e.g. .01, .001), the variances for the two groups (males/females) are not the same. Your data therefore violate the assumption of equal variance, and you should use the second line in the table, which refers to Equal variances not assumed.
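The decision rule for Levene's test can be written as a one-step check. This is a Python sketch with a hypothetical function name; the two strings are the row labels of the SPSS independent-samples t-test table, the second row being the one used when equal variances cannot be assumed.

```python
def line_to_read(levene_sig, alpha=0.05):
    """Pick the row of the t-test table based on the Sig. value of Levene's test."""
    if levene_sig > alpha:
        return "Equal variances assumed"
    return "Equal variances not assumed"


for sig in (0.07, 0.10, 0.05, 0.01):
    print(sig, "->", line_to_read(sig))
```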
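ANOVA
One-way ANOVA compares the mean scores of more than two groups on one independent variable
Example of research question with two independent variables, which calls for a two-way ANOVA: What is the impact of age and gender on optimism? Does gender moderate the relationship between age and optimism?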
Contents
Session II: Practice exercises
1. Doing basic statistics in SPSS
2. Doing regression in SPSS
3. Interpreting the results
Practice exercises
Part 1: Getting started
Part 2: Preparing the data file
Part 3: Preliminary analyses
References
 Carver, R. H., & Nash, J. G. (2011). Doing data analysis
with SPSS version 18.0. Boston, MA: Brooks/Cole
Cengage Learning.
 Mooi, E., & Sarstedt, M. (2011). A concise guide to
market research: The process, data, and methods using
IBM SPSS statistics. Berlin: Springer.
 Pallant, J. (2010). SPSS survival manual. Maidenhead:
McGraw Hill.