Advanced Quantitative Methods

William L. Holzemer, RN, Ph.D., FAAN
Professor, School of Nursing
University of California, San Francisco
Objectives
• Develop your definition of nursing science
• Use the Outcomes Model to think about
your area(s) of interest
• Review quantitative methods
• Think about how we build knowledge to
improve health and nursing practice.
2
Assignments
• PhD Students – individual assignments
• MS Students – group assignment
– Mini-literature review
• Outcomes Model
• Substruction
• Synthesis Tables
• Summary
3
Nursing = Nursing Science?
Definition of Nursing
American Nurses Association:
“Nursing is the assessment, diagnoses,
and treatment of human responses”
4
Definition of Nursing
Japan Nurses Association
“Nursing is defined as to assist the
individual and the group, sick or well, to
maintain, promote and restore health.”
5
Definition of Nursing
International Council of Nurses
“Nursing encompasses autonomous and
collaborative care of individuals of all ages,
families, groups and communities, sick or
well and in all settings. Nursing includes the
promotion of health, prevention of illness,
and the care of ill, disabled and dying
people. Advocacy, promotion of a safe
environment, research, participation in
shaping health policy and in patient and
health systems management, and education
are also key nursing roles.”
6
Common Elements:
Definitions of Nursing
• Person (individual, family, community)
• Health (Wellness & Illness)
• Environment
• Nursing (care, interventions, treatments)
7
Nursing Science
The body of knowledge that supports
evidence-based practice
8
Nursing Science Uses Various Research
Methodologies
Qualitative
Understanding
Interview/observation
Discovering frameworks
Textual (words)
Theory generating
Quality of informant more
important than sample size
Rigor
Subjective
Intuitive
Embedded knowledge
Quantitative
Prediction
Survey/questionnaires
Existing frameworks
Numerical
Theory testing (RCTs)
Sample size core issue in
reliability of data
Rigor
Objective
Public
9
Types of Research Methods:
(all have rules of evidence!)
Qualitative
• Grounded theory
• Ethnography
• Critical feminist theory
• Phenomenology
• Content Analysis
• Models of analysis: fidelity to text or words of interviewees
Quantitative
• Non-Experimental or Descriptive
• Experimental or Randomized Controlled Trials
• Models of analysis: Parametric vs. nonparametric
10
Outcomes Model for Health Care Research
(Holzemer, 1994)
Inputs (1970’s) → Processes (1980’s) → Outcomes (1990’s)
Client
Provider
Setting
11
Outcomes Model
• Heuristic
• Systems model (inputs are outputs,
outputs become inputs)
• Relates to Donabedian’s work on quality of
care (Structure, Process, and Outcome
Standards)
12
Outcomes Model: Nursing Process
Inputs
Client
Provider
Processes 
Problem
Outcomes
Outcome
Intervention
Setting
13
Outcomes Model for Health Care Research
Inputs (Covariate, confounding variable) → Processes (Independent Variable) → Outcomes (Outcome Variable)
Client
– Inputs: Age, gender, SES, Ethnicity, Severity of Illness
– Processes: Self-care, Adherence, Family care
– Outcomes: Quality of Life, Pain control, Pt. satisfaction, Pt. falls
Provider
– Inputs: Age, gender, SES, Education, Experience, Certification, Perc. Autonomy
– Processes: Interventions, Care, Talking, touch, time, Vigilance, communication
– Outcomes: Quality of Work life, Turnover, Errors, Satisfaction
Setting
– Inputs: Resources, Philosophy, Staffing levels
– Processes: Actual staffing ratios
– Outcomes: Mortality, Morbidity, Cost
14
Outcomes Model: Your assignment
(Think about a project or program of research)
Inputs
z
Processes 
x
Outcomes
y
Client
Provider
Setting
15
Where Should We Find Evidence-Based Practice Guidelines?
• Clinical practice guidelines
• Nursing Standards/ Procedural Manuals
• Great demand, low level of delivery (Great
demand, growing level of delivery)
• Knowledge base from research literature
16
Types of Evidence:
How do we know what we know?
• Clinical expertise
• Intuition
• Stories
• Preferences, values, beliefs, & rights
• Descriptive/quasi-experimental studies
• Randomized clinical (controlled) trials
(RCTs) - the gold standard
17
Summary: Introduction to Research
• Think about nursing research – nursing science
• Outcomes Model designed to put boundaries
around your area of study and expertise (very
difficult challenge in nursing!)
• Variable identification
• Understanding rigor – correct methods for any
type of research design
• Enhance enjoyment in reading research articles
• Understand the challenge of the words so easily
used, “evidence-based practice.”
18
Some Challenges:
• Think about developing your definition of nursing
science.
• Use the Outcomes Model to help you think about
your program of research.
• Enhance your understanding of rigor in all types
of research designs.
• Increase your enjoyment of reading research
articles.
• Understand the complexities of “evidence-based
practice.”
19
When thinking about your research
problem:
• Is it significant?
• Are you really interested in it?
• Is it novel?
• Is it an important area?
– High cost, high risk?
• Can it be studied?
• Is it relevant to clinical practice?
20
Where do ideas come from?
• Literature reviews
• Newspaper stories
• Being a research assistant
• Mentors/teachers
• Fellow students
• Patients
• Clinical experience
• Experts in the field
Build your area of expertise from multiple sources.
21
Uses of Substruction
• Critique a published study
• Plan a new study
22
Substruction
• A strategy to help you understand the
theory and methods (operational
system) in a research study
• Applies to empirical, quantitative
research studies
• The word “substruction” is not in the
dictionary. It has an inductive meaning,
constructing, and a deductive meaning,
deconstructing
• Heuristic
23
Substruction
Theory (Theoretical system): Construct → Concept
Methods (Operational System): Measures → Scaling/Data analysis
Direction from Construct down to Scaling: Deductive (qualitative)
Direction from Scaling up to Construct: (quantitative) Inductive
24
Substruction:
Building Blocks or Statements of Relationships
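Construct: Pain → Concept: Intensity → Measure: 10 cm scale
Construct: quality of life → Concept: functional status → Measure: mobility scale
Statements linking the two chains: axiom (construct to construct), proposition (concept to concept), hypothesis (measure to measure)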
25
Statements of Relationships
Construct:
Postulate:
Statement of
relationship
between a
construct and
concepts
Pain consists of three
concepts
Concepts:
Intensity
Location
Duration
26
Substruction:
Research Design Perspective
Focus of Study (RCT?)
Co-variates Z
Severity of illness
for risk adjustment
(analysis of covariance)
Independent Variable X
treatment
how measured?
Dependent Variable Y
27
Substruction: Theoretical System,
an example
Pain Intervention Study
Post Surgical
Patient Severity of
illness
age
gender
Pain Management
Intervention
Patient communication
Standing PRN orders
Non pharmacological tx
Pain Control
Length of stay
Patient Satisfaction
28
Substruction: Operational
System
Pain Intensity
Instrument:
VAS 10 cm scale
(low to high pain)
Functional Status
Instrument:1-5 Likert
scale, 1=low & 5=high
function
Scale: continuous or
discrete?
Scale: continuous or
discrete?
29
Scaling
Discrete: non-parametric (Chi square)
• Nominal
gender
• Ordinal
low, medium, high income
Continuous: parametric (t or F tests)
• Interval
Likert scale, 1-5
functionality
• Ratio
money, age, blood
pressure
30
Issues
• What is the conceptual basis of the study?
• What are the major concepts and their
relationships?
• Are the proposed relationships among the
constructs and concepts logical and defensible?
• How are the concepts measured? valid?
reliable?
• What is the level of scaling and does it relate to
the appropriate statistical or data analytical
plan?
• Is there logical consistency between the
theoretical system and the operational system?
31
Is there a relationship between touch and
pain control, accounting for initial amount
of post-operative pain? rx,y.z
Inputs (Z) → Processes (X) → Outcomes (Y)
Client: Z = Post-operative pain; Y = Pain control
Provider: X = Therapeutic Touch vs. NL care
Setting
32
Literature Review
• We review the literature in order to
understand the theoretical and operational
systems relevant to our area of interest.
• What is known about the constructs and
concepts in our area of interest?
• What theories are proposed that link our
variables of interest?
33
Literature Review
• What is known?
• What is not known?
• Resources
– The Cochrane Library
– Library Data Bases
• PubMed
• CINAHL
34
Literature Review:
How to combine, synthesize, and demonstrate
direction?
Topic
Study 1
Study 2
Study 3
35
Literature Review
Topic
Study 1
Study 2
Study 3
36
Table 1. Outline of study variables related to
your topic
Studies | Covariates (Z) | Interventions: Independent variable (X) | Outcomes: Dependent variable (Y)
Smith (1999) | | |
Jones (2003) | | |
Etc.
37
Table 2. Threats to validity of research
studies related to topic
Author (year) | Type of Design | Diagram | Statistical Conclusion Validity | Construct Validity of Cause & Effect | Internal Validity | External Validity
Smith (1999) | RCT | O X1 O / O X2 O / O O | | | | n/a
Jones (2003) | | | | | |
38
Table 3. Instruments
Studies | Instrument | # items | Validity | Reliability | Utility
Smith (1999) | McGill Pain Questionnaire | | | |
Jones (2003) | | | | |
39
Table 4. Power analysis for
literature review on topic.
Studies | Sample Size | Alpha | Power | Effect Size
Smith (1999) | 32 – exp, 40 – cont | 0.05 | 0.60 | Est. at medium
Jones (2003) | | | |
40
Literature Synthesis
• Synthesis - what we know and do not
know
• Strengths – rigor, types of design,
instruments?
• Weaknesses –lack of rigor, no RCTs,
poorly developed instruments
• Future needs – what is the next step?
41
Research Designs
42
Research Design: Qualitative
• Ethnography
• Phenomenology
• Hermeneutics
• Grounded Theory
• Historical
• Case Study
• Narrative
43
Rigor in Qualitative Research
• Dependability
• Credibility
• Transferability
• Confirmability
44
Types of Quantitative Research
Designs
• We will focus on RIGOR:
– Experimental
– Non-experimental
45
X,Y, Z notation
• Z = covariate
• Severity of illness
• X = independent variable (interventions)
• Self-care symptom management
• Y = dependent variable (outcome)
• Quality of life
46
Types of Quantitative Research
Designs
– Descriptive
X? Y? Z?
• What is X, Y, and Z?
– Correlational
rxy.z
• Is there a relationship between X and Y?
– Causal
ΔX → ΔY?
• Does a change in X cause a change in Y?
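The notation rxy.z above denotes a partial correlation: the relationship between X and Y with the covariate Z held constant. A minimal sketch of the standard first-order formula follows; the function name and the three input correlations are illustrative assumptions, not values from any study.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation r_xy.z (X with Y, controlling for Z)."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations among X (intervention), Y (outcome), Z (covariate)
r_xy_z = partial_correlation(r_xy=0.50, r_xz=0.30, r_yz=0.40)
print(round(r_xy_z, 3))
```

When Z is unrelated to both X and Y, the partial correlation equals the simple correlation rxy.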
47
Rigor in Quantitative Research
• Theoretical Grounding: Axioms & postulates
– substruction-validity of hypothesized
relationships
• Design validity (internal & external) of
research design; Instrument validity and
reliability
• Statistical assumptions met (scaling, normal
curve, linear relationship, etc.)
(Note: Polit & Beck: reliability, validity,
generalizability, objectivity)
48
Literature Review → Study Aims → Study Question → Study Hypothesis
49
Aim, Question, and Hypothesis
• Study Aim: To explore if it is possible to reduce
patient falls for elderly in nursing homes.
• Study Question: Does putting a “sitter” in a
patient room reduce the incidence of falls?
• Study Hypothesis:
Null: H0: There is no difference between patients
who have a “sitter” and those who do not in the
incidence of falls.
50
Experimental Designs
51
Definition: Experimental Design
1. There is an intervention that is controlled
or delivered
2. There is an experimental and control
group
3. There is random assignment to groups
52
Classic Experimental Design
R   O1exp   X   O2exp
R   O1con       O2con
    (pretest)   (posttest)
O=observation
1 = pretest or time one; 2 = posttest or time two
X = intervention
R = random assignment to groups
53
Classic Experimental Design
R   O1exp   X   O2exp
R   O1con       O2con
    (pretest)   (posttest)
The RCT is the Gold Standard for
Evidence-Based Practice
54
Randomization
1. Random assignment to groups
(internal validity issue) – equals Z
variables in both groups
2. Random selection from population to
sample (external validity issue) –
equals Z variables in the sample that
are true for the population
55
Goal:
Statement of Causal Relationship
56
Conditions Required to Make a
Causal Statement: X causes Y
1. X precedes Y
2. X and Y are correlated
3. Everything else controlled or
eliminated. No Z variables impacting
outcome.
4. We never prove something, we
gather evidence that supports our
claim.
57
Controlling Z variables:
1. Minimize threats to internal
validity
2. Limit sample (e.g. under 35
years only) to control variation
3. Statistical manipulation
(ANCOVA)
4. Random assignment to groups
58
Dimensions of Research Designs:
Groups & Time
O1exp   X   O2exp
O1con       O2con
Rows: Groups (n=2, experimental & control)
Columns: Time (n=2) (repeated measures)
59
Dimensions of Research Designs:
Groups & Time
Groups = between factors
Time = within factors
60
Types of Designs
• O - descriptive, one time
• O1 O2 O3 - descriptive, cohort (repeated
measures)
• O1 X O2 (not an experimental design!) - pre-post-test
61
Types of Designs
• O1 X O2
  O1   O2
RCT randomized controlled trial
62
Types of Designs
• O1 O2 O3 X O4 O5 O6
  O1 O2 O3   O4 O5 O6
• O1 X O2 Xno O3 X O4 Xno O5
(repeated measures vs. time series designs)
63
Types of Design
R   O1   X1   O2
R   O1   X2   O2
R   O1        O2
# of groups? ___
# points in time? ___
64
Types of Designs
Post-test only design:
X O2
O2
What is the biggest threat to this
post-test only design?
65
Types of Research Design
• Experimental (true)
• Quasi-Experimental (quasi)
– No random assignment to groups
66
Design Validity
– Statistical conclusion validity
– Construct validity of Cause & Effect (X
& Y)
– Internal validity
– External validity
67
Design Validity
• Statistical Conclusion Validity
rxy?
– Type I error (alpha 0.05)
– Type II error (Beta) Power = 1-Beta,
inadequate power, i.e. low sample size
– Reliability of measures
Can you trust the statistical findings?
68
Design Validity
• Construct Validity of Putative Cause &
Effect (X  Y?)
– Theoretical basis linking constructs and
concepts (substruction)
– Outcomes sensitive to nursing care
– Link intervention with outcome theoretically
Is there any theoretical rationale for why X and
Y should be related?
69
Design Validity
Internal Validity
– Threat of history (intervening event)
– Threat of maturation (developmental change)
– Threat of testing (instrument causes an effect)
– Threat of instrumentation (reliability of measure)
– Threat of mortality (subject drop out)
– Threat of selection bias (poor selection of
subjects)
Are any Z variables causing the observed
changes in Y?
70
Design Validity
External Validity
– Threat of low generalizability to people,
places, & time
– Can we generalize to others?
71
Building Knowledge
• Goal is to have confidence in our
descriptive, correlational, and causal data.
• Rigor means to follow the required
techniques and strategies for increasing
our trust and confidence in the research
findings.
72
Sampling
[Sample selection, not assignment]
73
Terms
• Population
- All possible subjects
• Sample
-A subset of subjects
• Element
- One subject
74
What do we sample?
• People (e.g. subjects)
• Places (e.g. hospitals,
units, cities)
• Time (e.g. season, am
vs. pm shift )
75
Sampling: What do we do?
• Random Assignment
- is designed to equalize the “Z” variables in the experimental and control groups
• Random Selection
- is designed to distribute the “Z” variables in the sample as they exist in the population
76
Types of Probability Sampling
Probability
Simple random sampling –using a random
table of numbers
Stratified random sampling –divide or stratify by
gender and sample within group
Systematic random sampling –take every 10th
name
Cluster sampling – select units (clusters) in
order to access patients or nurses
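The first three probability sampling strategies above can be sketched in a few lines. This is an illustrative sketch only: the sampling frame, the helper names, and the fixed seed are assumptions made for the example, and Python's pseudo-random generator stands in for a random table of numbers.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n elements, each with an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Take every `step`-th element, beginning at index `start`."""
    return population[start::step]

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Divide the frame into strata, then sample randomly within each one."""
    rng = random.Random(seed)
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical sampling frame: 100 subjects, alternating gender
frame = [{"id": i, "gender": "F" if i % 2 == 0 else "M"} for i in range(100)]

srs = simple_random_sample(frame, 10)
every_10th = systematic_sample(frame, step=10)
by_gender = stratified_sample(frame, lambda s: s["gender"], n_per_stratum=5)
```

Note that the systematic sample here contains only women, because the frame alternates by gender and the step is even. A periodic pattern in the list is the classic weakness of taking every 10th name.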
77
Types of Non-probability sampling
• Convenience – first patients to walk in the
door
• Purposive –patients living with an illness
• Quota – equal numbers of men & women
• (volunteers)
• (convenience)
78
Types of Samples
Homogeneous: subjects are similar, all
females, all between the ages of 21-35
Heterogeneous: subjects are diverse, wide
age range, all types of cancer patients
79
Sampling Error
Population (n=1000)
Mean Age:
36.5 years
Samples (n=50)
Mean Age:
34.6 yrs
37.1 yrs
36.4 yrs.
80
How to control sampling error?
• Use random selection of subjects
• Use random assignment of subjects to
groups
• Estimate required sample size using
power analysis to ensure adequate power
• Overestimate required sample size to
account for sample mortality (drop out)
81
Sample Size and Sampling Error
[Figure: Sampling Error (small to large) plotted against Sample Size (small to large)]
82
Sample Size Calculations
• Type of design
• Accessibility of participants
• Statistical tests planned
• Review of the literature
• Cost (time and money)
83
Strategies for Estimating Sample Size
• Ratio of subjects to variables in
correlational analysis. 3:1 up to 30:1
subjects to variables. 30 item
questionnaire requires 90 to 900 subjects.
• Chi square – can’t work if less than 5
subjects per cell
84
Power Analysis
Power - commonly set at 0.80
Alpha - commonly set at 0.05 or 0.01
Effect Size - based upon pilot studies or literature
review; small, medium, large
Sample Size - # subjects required to ensure
adequate power
Power is a function of alpha, effect size, and
sample size.
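Because power, alpha, effect size, and sample size are linked, fixing any three determines the fourth. The sketch below solves for sample size under stated assumptions: two independent groups compared on their means, a two-sided test, a standardized effect size (difference in means divided by SD), and the normal approximation. Dedicated power programs use the t distribution and give slightly larger numbers. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per group for a two-group comparison of means."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for a two-sided test
    z_power = z.inv_cdf(power)            # value corresponding to the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)

medium = n_per_group(0.5)                 # standardized effect of 0.5
large = n_per_group(0.8)                  # standardized effect of 0.8
stricter = n_per_group(0.5, alpha=0.01)   # same effect, more conservative alpha
print(medium, large, stricter)
```

A larger expected effect lowers the required sample size, while a stricter alpha or a higher power target raises it.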
85
Power Analysis Programs
• SPSS Package
• nQuery Adviser Release 4.0 (most
recent?)
http://www.statsolusa.com
86
Power
• Power is the ability to detect a difference
between mean scores, or the magnitude of
a correlation.
• If you do not have enough power in a
study, it does not matter how big the effect
size, i.e. how successful your intervention,
you can not statistically detect the effect.
• Many studies are under powered.
87
Effect Size
• Effect size can be thought of as how big a
difference the intervention made.
• Statistical significance and clinical
significance are often not the same thing
88
Effect Size
• Small (correlations around 0.20)
– Requires larger sample size
• Medium (correlations around 0.40)
– Requires medium sample size
• Large (correlations around 0.60)
– Requires smaller sample size
89
Effect Size
$$\text{Effect Size} = \frac{\text{Mean}_{exp} - \text{Mean}_{con}}{SD_{e\&c}}$$
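A minimal sketch of this effect size calculation, assuming that the SD of the experimental and control groups means the pooled standard deviation (the usual choice for a standardized mean difference). The function name and the scores are made up for illustration.

```python
import math
from statistics import mean, stdev

def standardized_effect_size(experimental, control):
    """(Mean_exp - Mean_con) divided by the pooled SD of the two groups."""
    n_e, n_c = len(experimental), len(control)
    pooled_variance = (
        (n_e - 1) * stdev(experimental) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_variance)

# Hypothetical outcome scores
exp_scores = [6, 7, 8, 9, 10]
con_scores = [4, 5, 6, 7, 8]
effect = standardized_effect_size(exp_scores, con_scores)
print(round(effect, 2))
```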
90
Eta Squared (η2)
• In ANOVA, it is the proportion of
dependent variable (Y) explained.
• Estimate of Effect Size
• Similar to R2 in multiple regression
analysis.
91
alpha
• alpha relates to hypothesis testing and how
often you are willing to make a mistake in
drawing a conclusion
• alpha is the probability of a Type I error – saying
that the intervention worked when in fact the
observed effect is just due to chance
• alpha of 0.01 is more conservative than 0.05
and therefore, harder to detect differences
92
Hypothesis Testing:
Is it true or false?
• Null hypothesis: H0
– Mean (experimental) = Mean (control)
• Alternative hypothesis: H1
– Mean (experimental) =/= Mean (control)
93
Hypothesis Testing and Power
Goal: Reject H0
DECISION | REALITY: Null H0 True (Mc = Me) | REALITY: Null H0 False (Mc =/= Me)
Reject H0 | Type I Error | Power (1-Beta)
Accept H0 | Correct Decision | Type II Error (Beta)
94
Quiz:
• If sample size goes up, what happens to power?
• If alpha goes from .05 to .01, what happens to
required sample size?
• If power falls from .80 to .60, what type of error
is most likely to occur?
• If effect size is estimated based upon the
literature as large, what effect does this have on
the required sample size?
95
Sample Loss in RCT
N=243
Randomization: N=118 and N=122
1 month: N=105 and N=110
6 months: N=91 and N=89
96
Measurement
“If it exists, it can be measured”
R. Cronbach
97
What we measure:
• Knowledge, Attitudes, Behaviors
(KAB)
• Physiological variables
• Symptoms
• Skills
• Costs
98
Classical Measurement Theory:
Measurement:
Reliability
Observation = Truth (fact) +/- Error
Validity
99
Type of Measures
• Standardized – evidence as follows:
1. Systematically developed
2. Evidence for instrument validity
3. Evidence for instrument reliability
4. Evidence for instrument utility – time,
scoring, costs, sensitive to change over time
• Non-standardized
100
Types of Measurement Error
• Systematic - can work to minimize
systematic error due to poor instructions,
poor reliability of measures, etc.
• Random - can do nothing about this,
always present, we never measure
anything perfectly, there is always some
error.
101
Validity
Question: Does the instrument
measure what it is supposed to
measure?
• Theory-related validity
– Face validity
– Content validity
– Construct validity
• Criterion-related validity
– Concurrent validity
– Predictive validity
102
Theory-related Validity
• Face validity
– participant believability
• Content validity (observable)
– Blue print
– Skills list
• Construct validity
(unobservable)
– Group differences
– Changes over time
– Correlations/factor analysis
103
Criterion-related Validity
• Concurrent
– Measure two variables and correlate
them to demonstrate that measure 1 is
measuring the same thing as measure 2
–same point in time.
• Predictive
– Measure two variables, one now and
one in the future, correlate them to
demonstrate that measure 1 is
predictive of measure 2, something in
the future.
104
Reminder:
• Design Validity
Does the research
design allow the
investigator to answer
their hypothesis?
(Threats of internal
and external validity)
• Instrument Validity
Does the instrument
measure what it is
supposed to
measure?
105
Instrument Reliability
Question: can you trust the data?
• Stability – change over time
• Consistency – within item agreement
• Rater reliability – rater agreement
106
Instrument Reliability
• Test-retest reliability (stability)
– Pearson product moment correlations
• Cronbach’s alpha (consistency) – one point in
time, measures inter-item correlations, or
agreements.
• Rater reliability (correct for chance agreement)
– Inter-rater reliability Cohen’s kappa
– Intra-rater reliability Scott’s pi
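To show what correcting for chance agreement means, here is a minimal sketch of Cohen's kappa for two raters assigning categories to the same subjects. The ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Observed agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's category proportions
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten patients by two nurses
rater_a = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "yes", "no", "no", "no", "yes", "yes", "no", "yes", "no"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

The raters agree on 8 of 10 patients (80%), but half of that agreement would be expected by chance, so kappa is lower than the raw percentage.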
107
Cronbach’s alpha
$$\alpha = \frac{n}{n-1}\left(1 - \frac{\sum_{1}^{n} SD_{items}^{2}}{SD^{2}}\right)$$
$$SD = \sqrt{\frac{\sum_{1}^{n}(X - m)^{2}}{n-1}}$$
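A minimal sketch of the Cronbach's alpha calculation, assuming the data are arranged as one list of scores per item, with respondents in the same order in every list. The function name and the responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, holding each respondent's score on that item."""
    k = len(item_scores)                                     # number of items
    sum_item_variances = sum(variance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]   # total score per respondent
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical responses: 3 items answered by 5 respondents
items = [
    [3, 4, 4, 5, 2],
    [3, 5, 4, 4, 2],
    [2, 4, 5, 5, 3],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Alpha rises as the items vary together: when every item orders the respondents identically, the variance of the total score is large relative to the sum of the item variances.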
108
Cronbach alpha Reliability Estimates:
• > 0.90
– Excellent reliability, required for decision-making at the individual level.
• 0.80
– Good reliability, required for decision-making
at the group level.
• 0.70
– Adequate reliability, close to unacceptable as
too much error in the data. Why?
109
Internal Consistency: Cronbach’s alpha
Person A: Internally consistent
Person B: Internally inconsistent
[Figure: responses of Person A and Person B to four items on a four-point scale (All the time, Much of the time, A little of the time, Rarely; scored 4 to 1)]
110
Error in Reliability Estimates
“Error = 1 – (Reliability Estimate)^2”
If alpha = 0.90, 1 – (0.90)^2
1 – 0.81 = 0.19 error
If alpha = 0.70, 1 – (0.70)^2
1 – 0.49 = 0.51 error
If alpha = 0.70, it is the 50:50 point
of error vs. true value
111
Reliability Values
• Range: 0 to 1
• No negative signs like
correlations
• Cohen’s kappa and Scott’s pi
are always lower, i.e. 0.50,
0.60
112
Utility
Things you would like to know about an
instrument.
• Time to complete (subject fatigue)?
• Is it obtrusive to participants?
• Number of items (power analysis)?
• Cultural, gender, ethnic
appropriateness?
• Instructions for scoring?
• Normative data available?
113
Reporting on Instruments
• Concept(s) being measured
• Length of instrument or number of
items
• Response format (Likert scale, etc.)
• Evidence of validity
• Evidence of reliability
• Evidence of utility
114
Quiz:
• Can a scale be valid and not reliable?
• Can a scale be reliable and not valid?
115
Scale Development
• Generate items from focus groups/interviews
• Scaling decisions capture variation
• Face validity - check with experts and
participants
• Standardize scale (evidence for validity,
reliability, & utility)
• Estimate correlates of concept
• Explore sensitivity to change over time
116
Translation
• Forward translation (A to B)
• Backward translation (B to A)
• Conceptual equivalency across
cultures
• Use of slang, idioms, etc.
117
Data Analysis
118
Data Analysis: Why?
• Capture variability (variance) – how the
scores vary across persons
• Parsimony – data reduction technique,
how to describe many data points in
simple numbers
• Discover meaning and relationships
• Explore potential biases in data (sampling)
• Test hypotheses
119
Where to begin:
• After data is collected, we begin a long
process of data entry & cleaning
• Data entry requires a code book be
developed for the statistical program you
plan to use, such as SPSS.
• Data codebooks allow you to give your
variables names, values, and labels.
120
Data Entry & Cleaning
• Data entry is a BIG source of error in data
• Double data entry is one strategy
• Cleaning data means looking for values outside
the valid ranges, e.g. an age of 154 is probably a
typo.
• We examine frequencies, high score, low
scores, outliers, etc.
121
Coding Variables
Capture data in its most continuous form possible.
Age: 35 years - get the actual value
vs.
Check one: _<25
_ 25-35
_ 36-45
_ >45
122
Dichotomous Variables
Do not do this:
1 = Male
2= Female
Do this!
1 = male
0 = female
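Why? With 0/1 coding the values can be added: the sum is the number of males and the mean is the proportion of males.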
123
Dummy Coding
Ethnicity
1 = Black; 2 = White; 3 = Hispanic
N-1 or 3-1 = 2 variables
Black: 1 = Black; 0 = White and Hispanic
White: 1 = White; 0 = Black and Hispanic
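The N-1 rule above can be sketched as follows: each category except one reference category gets its own 0/1 variable, and the reference category is the one coded 0 on all of them. The function name and sample data are illustrative assumptions; Hispanic serves as the reference category, matching the slide.

```python
def dummy_code(values, reference):
    """Create one 0/1 variable per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return [{cat: int(v == cat) for cat in categories} for v in values]

# Hypothetical data for four subjects
ethnicity = ["Black", "White", "Hispanic", "White"]
coded = dummy_code(ethnicity, reference="Hispanic")
for original, row in zip(ethnicity, coded):
    print(original, row)
```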
124
Missing Data
• SPSS assigns a dot “.” to missing data
• SPSS often gives you a choice of
pairwise or listwise deletion for missing
values.
Mean Substitution: give the variable the
average score for the group, e.g. age,
adds no variation to the data set.
125
Missing Data
Pairwise: a subject is dropped only from the
particular correlation that uses the missing value; best choice to conserve power
Listwise: removes the subject from the whole analysis if any value is missing; required in
repeated measures designs.
126
Measures:
• Central Tendency
• Relationships
• Effects
127
Measures of Central Tendency
• Mean – arithmetic average score
• Standard deviation (SD) – how the scores
cluster around the mean
• Range – high and low score.
(Example: M = 36.4 years
SD= 4.2
Range: 22-45)
128
Formulas
$$\text{Mean} = \frac{\sum_{1}^{n} X}{N}$$
$$SD = \sqrt{\frac{\sum_{1}^{n}(X - m)^{2}}{n-1}}$$
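The two formulas can be checked by hand with a short sketch. The ages are invented for illustration; the n-1 denominator in the SD is the degrees-of-freedom correction for a sample.

```python
import math

def mean(scores):
    """Arithmetic average: sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

def sample_sd(scores):
    """Square root of the summed squared deviations from the mean, divided by n-1."""
    m = mean(scores)
    return math.sqrt(sum((x - m) ** 2 for x in scores) / (len(scores) - 1))

# Hypothetical ages of five subjects
ages = [30, 34, 36, 38, 42]
print(mean(ages), round(sample_sd(ages), 2))
```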
129
Measures of Central Tendency
• Mean – arithmetic average
• Median – score which divides the
distribution in half (50% above and 50%
below)
• Mode – the most frequently occurring
value
When does the mean=median=mode?
130
Normal Curve: very robust!
[Figure: normal curve centered on the mean (M) with -2, -1, +1, and +2 SD marked; labeled areas: 34%, 34%, 2.5%, 2.5%]
131
Normal Curves
132
Normal Curve
(Mean=Median=Mode)
Frequency
50% 50%
Mean
Median
Mode
133
Non-Normal Curves
[Figure: non-normal distributions, axes labeled X-Axis and Y-Axis]
134
Scaling
• Discrete (qualitative)
– Nominal
– Ordinal
– Non-parametric (no assumptions required; Chi square)
• Continuous (quantitative)
– Interval
– Ratio
– Parametric (assumes the normal curve, e.g. t and F tests)
135
Degrees of Freedom
• Statistical correction so one does not over
estimate
136
Degrees of Freedom for ball 1?
137
Degrees of Freedom for ball 2?
138
Degrees of Freedom for ball 3?
139
Degrees of Freedom
• Sample size (n-1)
• Number of groups (k-1)
• Number of points in time (l-1)
140
Relationships or Associations
141
Measures of Association: Correlations
• Range: -1 to 1
• Dimensions:
– Strength (0-1)
– Direction (+ or -)
• Definition: a change in X results in a
predictable change in Y; shared variation
or variance.
142
Correlations
• Sample specific (each sample is a subset
of the population)
• Unstable
• Dependent upon sample size
• Everything is statistically significant with a
very large sample size; may not be
clinically significant.
• Expresses relation not a causal statement
143
Types of Correlations
• Pearson product moment r
– continuous by continuous variable
• Phi correlation
– discrete by discrete variable (Chi square)
• Rho rank order correlation
– discrete ranks by ranks
• Point-biserial
– discrete by continuous variable
• Eta Squared
144
Estimate the value of the correlation
[Figure: three scatterplots of Y-Axis against X-Axis, each labeled r = ?]
145
Variance
Area under the curve = SD2
Variance
146
Shared variance r2
If r = 0.80, r2 = 0.64
64%
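A minimal sketch of computing a Pearson correlation and its shared variance. The five data pairs are made up and were chosen so that r comes out at 0.80, matching the figure on this slide.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two continuous variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical scores for five subjects
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
shared_variance = r ** 2   # proportion of variance X and Y have in common
print(round(r, 2), round(shared_variance, 2))
```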
147
Shared variance r2
If r = 1,
If r = 0,
100%
0%
148
Types of Data Analyses
Descriptive
X? Y? Z?
Measures of central tendency
Correlational
rx,y?
Is there a relationship between X and Y?
Measures of relationships (correlations)
Causal
ΔX → ΔY?
• Does a change in X cause a change in Y?
Testing group differences (t or F tests)
149
Testing Effects of Interventions
150
Testing Group Differences
• t tests
• F tests (Analysis of Variance or ANOVA)
(t tests are F tests with two groups)
151
Types of tests of group differences
• Between groups
– (unpaired)
• Within groups
– (paired or repeated measures; if two groups it
is also test-retest)
– requires identified subjects
152
Classic Experimental Design
O1exp X
O2exp
O1con
O2con

R

(pretest)
(posttest)
Group: Between Factor
Time: Within Factor
153
Tests of Significance
O1 X O2
O1   O2
[Figure: the comparisons among these four observations, numbered 1 to 4]
154
Testing Group Differences
$$F \text{ (or } t) = \frac{\text{Between Variance}}{\text{Within Variance}}$$
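The ratio above can be sketched for a one-way design with independent groups. This is an illustrative calculation of the F statistic only; it does not look up a p value, and the three groups of scores are invented.

```python
def one_way_f(*groups):
    """F = between-groups variance (mean square) / within-groups variance (mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)        # degrees of freedom: groups - 1
    ms_within = ss_within / (n_total - k)    # degrees of freedom: subjects - groups
    return ms_between / ms_within

# Hypothetical scores: one control group and two experimental groups
control = [4, 5, 6, 7, 8]
experimental_1 = [6, 7, 8, 9, 10]
experimental_2 = [5, 6, 7, 8, 9]
f_value = one_way_f(control, experimental_1, experimental_2)
print(f_value)
```

F grows when the group means spread apart (between variance) relative to how much scores scatter inside each group (within variance).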
155
Examining Variance
Between
Variance
Within
Variance
Mc
Me
156
Examining Variance:
No difference between the means
Mc Me
157
Examining Variance:
Big difference between means
Mc
Me
158
Examining Variance: Three groups
Mc
Me2
Me1
159
Types of Designs
O 1 O2 O3
change within group over time, repeated
measures design
160
Types of Designs
O1e X O2e
O1c   O2c
change within group from O1e to O2e
change between groups O2e and O2c
161
How to analyze this design?
• O1e O2e O3e X O4e O5e O6e
O1c O2c O3c O4c O5c O6c
• Two group repeated measures analysis
of variance.
• One between factor (group) and one
within factor (time) with six levels.
162
Post-test only design
• X O2e
O2c
Unpaired t test
Null hypothesis:
H0: O2e = O2c
Alternative directional hypothesis:
H1: O2e > O2c
163
• Standard Deviation
– how scores vary around a mean
• Standard Error of the Mean
– how mean scores vary around a population
mean
164
Standard Error of the Mean:
SD of the sample means
Population (n=1000)
Mean Age: 36.5 years
Samples (n=50)
Mean Age | SD
34.6 yrs | 3.4
37.1 yrs | 3.8
36.4 yrs | 4.1
165
Conceptual:
$$t = \frac{\text{Mean}_{E} - \text{Mean}_{C}}{\text{standard error of the mean}}$$
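A minimal sketch of the unpaired t statistic for the conceptual formula above. It assumes equal variances in the two groups, so the denominator is the standard error of the difference between means built from the pooled variance. The scores are hypothetical and no p value is computed.

```python
import math
from statistics import mean, variance

def unpaired_t(experimental, control):
    """Difference between group means divided by its standard error (pooled variance)."""
    n_e, n_c = len(experimental), len(control)
    pooled_variance = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_e + 1 / n_c))
    return (mean(experimental) - mean(control)) / standard_error

# Hypothetical post-test scores
experimental = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
t_value = unpaired_t(experimental, control)
degrees_of_freedom = len(experimental) + len(control) - 2
print(t_value, degrees_of_freedom)
```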
166
Assumptions of ANOVA
• Normal distribution
• Independence of measures
• Continuous scaling
• Linear relationship between variables
167
3 X 2 ANOVA
R   O1exp   X1   O2exp
R   O1exp   X2   O2exp
R   O1con        O2con
One between factor: group (3 levels)
One within factor: time (2 levels)
168
Omnibus F Test
R   O1exp   X1   O2exp
R   O1exp   X2   O2exp
R   O1con        O2con
F test group: Is there a difference among the three
groups?
F test time: Is there a difference between time 1 and 2?
If yes to either question, where is the difference?
Interaction: Group by Time
169
Post-hoc comparisons
R   O1exp1   X1   O2exp1
R   O1exp2   X2   O2exp2
R   O1con         O2con
Types: Scheffé, Tukey – control for degrees of freedom in different
ways; compares all possible two way comparisons
H0: O2exp1 = O2exp2 = O2con If you reject Null, or F test is
significant, then you can look for two-way differences.
(O2exp1= O2exp2?) or (O2exp2= O2con?) or (O2exp1 = O2con?)
170
Tests of Significance
Design | Non-parametric | Parametric
Two groups, paired | Wilcoxon signed-rank | Paired t test
Two groups, unpaired | Mann-Whitney U | Unpaired t test
More than two groups, repeated measures | Friedman test | Repeated measures ANOVA
More than two groups, independent groups | Kruskal-Wallis | ANOVA
171
Galloping alpha
• Danger in conducting multiple t tests or doing item-level analysis on surveys
• alpha = probability of rejecting the Null hypothesis when it is true
• alpha 0.05 divided by number of tests, distributes
alpha over tests
• If conducting 10 t tests, alpha at 0.005 per test
(0.05/10=0.005)
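The sketch below shows why alpha "gallops" and what dividing it across tests does. It assumes the tests are independent; the function names are illustrative.

```python
def familywise_error(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n independent tests."""
    return 1 - (1 - per_test_alpha) ** n_tests

def adjusted_alpha(overall_alpha: float, n_tests: int) -> float:
    """Distribute the overall alpha evenly over the tests."""
    return overall_alpha / n_tests

uncorrected = familywise_error(0.05, 10)      # ten t tests, each at 0.05
per_test = adjusted_alpha(0.05, 10)           # 0.05 / 10
corrected = familywise_error(per_test, 10)    # overall risk after the correction
print(round(uncorrected, 3), per_test, round(corrected, 3))
```

With ten uncorrected tests the chance of at least one false positive is about 40%; after dividing alpha by ten it stays just under 5%.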
172
ANOVA
• ANOVA – analysis of variance
• ANCOVA – analysis of co-variance,
includes Z variable(s)
• MANOVA – multivariate analysis of
variance (more than one dependent
variable)
• MANCOVA – multivariate analysis of
co-variance, includes Z variable(s).
173
Multiple Regression Analysis
Correlational technique
– Unstable values
– Sample specific
– Reliability of measures very
important
– Requires large sample size
– Easy to get significance with large
sample size
174
Multiple Regression Analysis
Attempts to make causal statements of
relationship
Y = X1+X2+X3
Y = dependent variable (health status)
X1-3 = predictors or independent variables
Health Status = Age + Gender + Smoking
175
Multiple Regression Questions:
• What is the contribution of age, gender, and
smoking to health status?
• How much of the variation in health status is
accounted for by variation in age, gender, and
smoking?
176
Multiple Regression Analysis
• Creates a correlation matrix.
• Selects the most highly correlated independent
variable with the dependent variable first.
• Extracts the variance in Y accounted for by that X
variable.
• Repeats the process (iterative) until no more of
the variance in Y is statistically explained by the
addition of another X variable.
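The iterative procedure above can be sketched as a forward selection loop. This is a simplified illustration, not the algorithm of any particular statistics package: the data are invented, and a fixed minimum gain in R squared (`min_gain`, a hypothetical parameter) stands in for the statistical significance test that real software applies at each step.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(xs, y):
    """Share of the variance in y explained by a least-squares fit on xs."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in x] for x in xs]  # intercept first
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(predictors, y, min_gain=0.01):
    """Enter predictors one at a time while R^2 rises by at least min_gain."""
    selected, history, current = [], [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {
            name: r_squared([predictors[s] for s in selected + [name]], y)
            for name in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] - current < min_gain:
            break  # no remaining X adds enough explained variance
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
        history.append((best, current))
    return history


# Hypothetical data for eight participants (values invented for illustration).
predictors = {
    "age": [30, 45, 28, 60, 52, 41, 66, 58],
    "gender": [0, 1, 0, 1, 0, 1, 0, 1],
    "smoking": [1, 2, 3, 4, 5, 6, 7, 8],
}
health_status = [9, 8, 8, 6, 5, 4, 2, 1]

history = forward_stepwise(predictors, health_status)
for name, r2 in history:
    print(f"entered {name}: cumulative R^2 = {r2:.3f}")
```

In the first pass no variable has been entered yet, so the choice reduces to the X with the highest squared correlation with Y, as the slide describes.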
177
Health Status = Age + Gender + Smoking

                     Health Status (Y)   Age (X1)      Gender (X2)   Smoking (X3)
                     r                   r     r2      r     r2      r     r2
Health Status (Y)    1                   0.25  6%      0.04  0%      0.40  16%
Age (X1)                                 1             0.11  1%      0.05  0%
Gender (X2)                                            1             0.20  4%
Smoking (X3)                                                         1
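The r2 percentages in the matrix are simply each correlation squared and rounded to a whole percent. A short check, using the r values from the slide:

```python
# r values copied from the slide's correlation matrix.
correlations = {
    ("Health Status", "Age"): 0.25,
    ("Health Status", "Gender"): 0.04,
    ("Health Status", "Smoking"): 0.40,
    ("Age", "Gender"): 0.11,
    ("Age", "Smoking"): 0.05,
    ("Gender", "Smoking"): 0.20,
}

# Shared variance between two variables is r squared, shown here as a percent.
shared_variance_pct = {
    pair: round(100 * r ** 2) for pair, r in correlations.items()
}

for (a, b), pct in shared_variance_pct.items():
    print(f"{a} x {b}: r = {correlations[(a, b)]:.2f}, r2 = {pct}%")
```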
178
Multiple Regression: Shared Variance
[Venn diagram: Health Status overlapped by Age, Smoking, and Gender.
Labels: Smoking 40%, Age 25%, Gender 4%]
179
Multiple Regression
• Correlation results in an r
• Multiple regression results in an R2
• R squared is the total amount of the
variance in Y that is explained by the
predictors, after removing the overlap among
the predictors.
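Why the overlap must be removed can be shown with two correlated predictors. The sketch below uses invented scores and the standard two-predictor formula for R squared, which subtracts the part of the explained variance that X1 and X2 share; the variable names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores (invented); x1 and x2 are deliberately correlated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1, 3, 2, 5, 4, 6, 8, 7]

r_y1 = pearson_r(x1, y)
r_y2 = pearson_r(x2, y)
r_12 = pearson_r(x1, x2)

# Naive total: adds the two squared correlations, counting the overlap twice.
r2_sum = r_y1 ** 2 + r_y2 ** 2

# R squared for two predictors, with the overlap between x1 and x2 removed.
r2_multiple = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

print(f"sum of separate r2 values: {r2_sum:.3f}")
print(f"R2 from both predictors:   {r2_multiple:.3f}")
```

With these scores the separate r2 values add up to more than 100%, which is impossible for explained variance; R2 stays below 1 because the shared portion is counted once.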
180
Multiple Regression
Types
• Step-wise = the variable with the highest
correlation is entered first (the computer
makes the decision); theory building
• Hierarchical = you choose the order of entry
(forced entry); theory testing
181
Multiple Regression
• Allows one to cluster variables into Blocks.
• Block 1: Demographic variables
– (age, gender, SES)
• Block 2: Psychological Well-Being
– (depression, social support)
• Block 3: Severity of Illness
– (CD4 count, AIDS dx, viral load, OIs)
• Block 4: Treatment or control
– 1= treatment and 0 = control
182
Regression Analysis
• Multiple regression: one Y, multiple Xs.
• Logistic regression: Y is dichotomous
(Y = disease or no disease); popular in
epidemiology; yields odds ratios, used as a
measure of risk, not explained variance
• Canonical variate analysis: multiple Y and
multiple X variables: Y1 + Y2 + Y3 = X1 + X2 + X3,
e.g., linking physiological variables with
psychosocial variables.
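The difference between an odds ratio and a risk ratio is easiest to see on a 2x2 table. The counts below are invented for illustration. For a logistic regression with a single 0/1 predictor, the fitted slope equals the natural log of the odds ratio computed this way.

```python
import math

# Hypothetical 2x2 table (counts invented for illustration).
#            (disease, no disease)
exposed = (30, 70)
unexposed = (10, 90)

# Odds = cases / non-cases within each group.
odds_exposed = exposed[0] / exposed[1]
odds_unexposed = unexposed[0] / unexposed[1]
odds_ratio = odds_exposed / odds_unexposed

# Risk = cases / everyone within each group.
risk_exposed = exposed[0] / sum(exposed)
risk_unexposed = unexposed[0] / sum(unexposed)
risk_ratio = risk_exposed / risk_unexposed

# Slope a logistic regression would report for this one binary predictor.
log_odds_coefficient = math.log(odds_ratio)

print(f"odds ratio: {odds_ratio:.2f}")
print(f"risk ratio: {risk_ratio:.2f}")
print(f"logistic slope (log odds ratio): {log_odds_coefficient:.2f}")
```

The two ratios diverge here because the outcome is common in the exposed group; they are close only when the outcome is rare.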
183
Multivariate Regression Models:
• Path Analysis and now Structural Equation
Modeling
• Software program: AMOS
• Measurement model is combined with predictive
model
• Keeps the multicollinearity of variables in the
picture (they are correlated!)
• Allows for mediating and moderating variables
(direct and indirect effects).
184
Multiple Dependent & Independent
Path Analysis Modeling
Relationships are based upon
the literature review and then
potentially explored, discovered,
tested, or validated in a study
[Path diagram with nodes: Age, Gender, Cognitive Ability, Social Support,
Severity of illness, Adherence to diet, Diabetic Control]
185
Structural Equation Modeling
[Figure: growth model diagram. Muscle ache and Fatigue are each measured
at Month 0, Month 1, Month 3, and Month 6; each series has its own
Intercept and Slope.]
186
Factor Analysis
• Exploration of instrument construct validity
• Correlational technique
• Requires only one administration of an
instrument
• Data reduction technique
• A statistical procedure that requires artistic
skills
187
Conceptual Types of Factor Analysis
• Exploratory – see what is in the data
set
• Confirmatory – see if you can
replicate the reported structure.
188
Factor Analysis
• Principal Components –
(principal factor or principal axes)
189
Correlation Matrix of Scale Items:
Which items are related?
          Item 1   Item 2   Item 3   Item 4
Item 1    1        0.80     0.30     0.25
Item 2             1        0.40     0.25
Item 3                      1        0.70
Item 4                               1
190
Factor Analysis:
An iterative process
Factor extraction
191
Factor Analysis
              Factor I   Factor II   Factor III   Communality
Item 1        0.80       0.20        -0.30        0.77
Item 2        0.75       0.30         0.01        0.65
Item 3        0.30       0.80         0.05        0.73
Item 4        0.25       0.75         0.20        0.67
Eigenvalue    2.10       2.05         0.56
% var         34%        30%          10%
192
Definitions:
• Communality: square an item's loadings on
each factor and sum across the factors
(one value per ITEM)
• Eigenvalue: square the item loadings down
each factor's column and sum (one value per
FACTOR)
• Labeling Factors: figments of the author's
imagination. Items 1 & 2 = Factor I; Items
3 & 4 = Factor II.
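These definitions can be applied directly to the four-item loading table. The sketch below copies the loadings from the slide and computes row and column sums of squared loadings. It reports only what the four listed items produce, so the column sums need not match the eigenvalue row printed on the slide.

```python
# Loadings copied from the slide's table: rows are items, columns are factors.
loadings = {
    "Item 1": [0.80, 0.20, -0.30],
    "Item 2": [0.75, 0.30, 0.01],
    "Item 3": [0.30, 0.80, 0.05],
    "Item 4": [0.25, 0.75, 0.20],
}
n_factors = 3

# Communality: sum of squared loadings across each item's row.
communality = {
    item: sum(value ** 2 for value in row) for item, row in loadings.items()
}

# Sum of squared loadings down each factor's column.
column_ss = [
    sum(row[f] ** 2 for row in loadings.values()) for f in range(n_factors)
]

# Percent of total item variance per factor, treating each item as
# standardized (variance 1), so the total equals the number of items.
pct_variance = [100 * ss / len(loadings) for ss in column_ss]

for item, h2 in communality.items():
    print(f"{item}: communality = {h2:.4f}")
for f, (ss, pct) in enumerate(zip(column_ss, pct_variance), start=1):
    print(f"Factor {f}: sum of squared loadings = {ss:.4f} ({pct:.1f}%)")
```

Items 1 and 2 load mainly on Factor I and Items 3 and 4 on Factor II, which is the pattern the labeling bullet refers to.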
193
Factor Rotation
Factors are mathematically rotated depending
upon the perspective of the author.
• Orthogonal – right angles, low inter-factor
correlations, creates more independence of
factors, good for multiple regression analysis,
may not reflect the actual data well. (varimax)
• Oblique – different types; lets factors
correlate with each other to the degree they
actually do correlate; some like this and
believe it better reflects the actual data;
harder to use in multiple regression because
of the multicollinearity. (oblimax)
194
Summary: Data Analysis
• Measures of Central Tendency
• Measures of Relationships
• Testing Group Differences
• Correlational
• Multiple regression as a predictive
(causal) technique.
• Factor analysis as a scale
development, construct validity
technique
195
Ethical Guidelines for Nursing
Research
Vulnerability – a power relationship
between health care provider and
patient, family, or client.
Vulnerable participants in research
require more protection from harm.
196
Ethical Principles that Guide Research
• Beneficence – doing good
• Non-maleficence – doing no harm
• Fidelity – creating trust
• Justice – being fair
• Veracity – telling the truth
• Confidentiality – protecting or safeguarding
participants' identifying information
197
Ethical Principles that Guide Research
Confidential
– names kept guarded
vs.
Anonymous
– no identifiers
198
Best Wishes