How to prioritise recontacts
Dan Hedlin
Statistics Sweden
&
Stockholm University
Situation
• Suspect errors
• Possible to make another observation of a unit
• Cheaper to re-observe all items for a selected unit
than the same number of items for several units
• Must prioritise units
• Able to predict true value
• Issue: how to select observations for recontact
EESW 2009 Stockholm
Solution: selective editing
• Widely used
• Large body of experience
• Meets requirements of quality enhancement and reduced response burden
• Although general problem, applications only in surveys and, to some extent, auditing
[Figure: Form 1 and Form 2; labels: Item 1, Item 2, Sum of items 1 and 2, Item 3]
[Figure: Form 1 and Form 2; labels: Item 1, Item 2, Sum of items 1 and 2, Item 3. Item score 1 and item score 3 for form 2 are combined into the ’unit score’ for form 2]
Item score
• Common item score for item j in record k:
  $\gamma_{kj} = w_k \, |\tilde{y}_{kj} - z_{kj}|$
• $w_k$: design weight
• $\tilde{y}_{kj}$: predicted value
• $z_{kj}$: reported value
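The item score above can be sketched in a few lines of Python. This is only an illustration: the function name `item_scores` and all the numbers are hypothetical, and the score is taken to be the design weight times the absolute difference between predicted and reported value.

```python
import numpy as np

def item_scores(w, y_pred, z):
    """Item scores w_k * |y_pred_kj - z_kj| for n records (rows) and p items (columns)."""
    w = np.asarray(w, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    z = np.asarray(z, dtype=float)
    # Broadcast the record-level weight across that record's items.
    return w[:, None] * np.abs(y_pred - z)

# Hypothetical data: 3 records, 2 items each.
w = [10.0, 2.0, 5.0]                                    # design weights
y_pred = [[100.0, 50.0], [200.0, 80.0], [30.0, 10.0]]   # predicted values
z = [[100.0, 55.0], [2000.0, 80.0], [28.0, 10.0]]       # reported values

scores = item_scores(w, y_pred, z)
print(scores)
```

In this made-up example the second record gets by far the largest score on item 1, because its reported value is ten times the prediction even though its weight is small.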
Item score
• Common item score for item j in record k:
  $\gamma_{kj} = w_k \, |\tilde{y}_{kj} - z_{kj}|$
Motivations
• related to point estimate: difference between ‘true’ point estimate and ‘incorrect’ point estimate
• or $|\tilde{y}_{kj} - z_{kj}|$ as an estimate of square root of measurement variance
Unit score
• What function of the item scores to form a unit score?
• Let an item score be denoted by $\gamma_{kj}$
• … and a unit score by $g(\gamma_k)$
Common unit score functions
In the editing literature:
• Sum function: $\sum_{j=1}^{p} \gamma_{kj}$
• Euclidean score: $\sqrt{\sum_{j=1}^{p} \gamma_{kj}^{2}}$
• Max function: $\max_{j} \gamma_{kj}$
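A small sketch of the three unit score functions, to show that they can rank the same forms differently. The function names and the item scores are hypothetical; each row holds the non-negative item scores of one form.

```python
import numpy as np

def sum_score(gamma):
    """Sum of the item scores of each form."""
    return gamma.sum(axis=1)

def euclidean_score(gamma):
    """Square root of the sum of squared item scores of each form."""
    return np.sqrt((gamma ** 2).sum(axis=1))

def max_score(gamma):
    """Largest item score of each form."""
    return gamma.max(axis=1)

# Hypothetical item scores: form A has one large score, form B three moderate ones.
gamma = np.array([[6.0, 0.0, 0.0],    # form A
                  [3.0, 3.0, 3.0]])   # form B

print(sum_score(gamma))        # B ranks above A
print(euclidean_score(gamma))  # A ranks above B
print(max_score(gamma))        # A ranks above B
```

The sum function favours the form with several moderate item scores, while the Euclidean and max functions favour the form with a single large one.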
Outstanding issues in
selective editing
1. Item score
2. Prediction of true value
3. Unit score
4. Nonresponse issues
5. Selection design (of forms)
6. Estimation of ‘total error’
Focus on choice of unit score
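Unit scores unified by…
• Minkowski’s distance
  $g(\gamma_k; \lambda) = \left( \sum_{j=1}^{p} \gamma_{kj}^{\lambda} \right)^{1/\lambda}, \quad \lambda \ge 1$
• Sum function if $\lambda = 1$
• Euclidean if $\lambda = 2$
• Maximum function if $\lambda \to$ infinity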
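The unification can be sketched as a single function with the Minkowski exponent as its only parameter. The names `unit_score`, `recontact_order` and `lam` are hypothetical, as are the item scores; the ranking step simply sorts forms by decreasing unit score and is one possible way to prioritise recontacts, not a prescribed one.

```python
import numpy as np

def unit_score(gamma, lam):
    """Minkowski unit score of each row of non-negative item scores.

    lam = 1 gives the sum function, lam = 2 the Euclidean score,
    and lam = np.inf the max function.
    """
    gamma = np.asarray(gamma, dtype=float)
    if lam < 1:
        raise ValueError("lam must be at least 1")
    if np.isinf(lam):
        return gamma.max(axis=1)
    return (gamma ** lam).sum(axis=1) ** (1.0 / lam)

def recontact_order(gamma, lam):
    """Indices of forms sorted from highest to lowest unit score."""
    return np.argsort(-unit_score(gamma, lam), kind="stable")

# Hypothetical item scores for three forms with three items each.
gamma = [[6.0, 0.0, 0.0],
         [3.0, 3.0, 3.0],
         [1.0, 1.0, 0.0]]

for lam in (1, 2, np.inf):
    print(lam, unit_score(gamma, lam), recontact_order(gamma, lam))
```

With these numbers the second form is first in line when the exponent is 1, and the first form is first in line when the exponent is 2 or infinite.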
Parametrised by $\lambda$
Advantages
• Unified unit score simplifies presentation and software implementation
• Gives structure: orders the feasible choices
  …from smallest: $\lambda = 1$
  …to largest: $\lambda \to$ infinity
Three editing situations
1. Large errors remain in data, such as unit
errors
2. No large errors, but may be bias due to
many small errors in the same direction
3. Little bias, but may be many errors
Situation 3. Can show that if…
• Variance of error is $(\tilde{y}_{kj} - z_{kj})^{2}$
• Item score is $\gamma_{kj} = w_k \, |\tilde{y}_{kj} - z_{kj}|$
Then Euclidean unit score optimal
• The Euclidean unit score will minimise the sum of the variances of the remaining error in estimates of the total
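A brief worked step may help to see why the Euclidean score fits these two conditions. It is a sketch only: $E_k$ is a label introduced here for the Euclidean unit score of record k, and the reading of each term as a variance contribution to the estimated totals assumes that errors are independent across records, which the slide does not state.

```latex
\[
E_k^{2}
  = \sum_{j=1}^{p} \gamma_{kj}^{2}
  = \sum_{j=1}^{p} w_k^{2} \, (\tilde{y}_{kj} - z_{kj})^{2}
\]
```

Each term on the right is the error variance of item j in record k scaled by the squared design weight, so under the stated assumption $E_k^{2}$ is the amount by which recontacting record k would reduce the summed variance over the p estimated totals. Ranking records by $E_k$ therefore removes the largest contributions first.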
• However, $\tilde{y}_{kj}$ only prediction of true value
• Can show that max function more robust to this prediction error than Euclidean function
Situation 2. Can show that if…
• Bias contribution is $\tilde{y}_{kj} - z_{kj}$
Then no unit score optimal
• The sum unit score will minimise the sum of the absolute values of the bias contributions, but not the bias
However, sum unit score may be useful anyway in Situation 2
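The difference between reducing the sum of absolute bias contributions and reducing the bias itself can be illustrated with a toy example. Everything here is hypothetical: one item, four units with made-up signed bias contributions, and the assumption that a recontact removes a unit's error completely. With a single item the sum unit score is just the absolute contribution.

```python
import numpy as np

# Hypothetical signed bias contributions of four units for one item.
contrib = np.array([5.0, -5.0, 3.0, 3.0])

def remaining(contrib, recontacted):
    """Sum of absolute contributions and net bias left after recontacting some units."""
    keep = np.ones(len(contrib), dtype=bool)
    keep[list(recontacted)] = False
    left = contrib[keep]
    return np.abs(left).sum(), left.sum()

# Recontact the two units with the largest absolute contribution.
top_two = np.argsort(-np.abs(contrib), kind="stable")[:2]
abs_after, bias_after = remaining(contrib, top_two)

# Alternative: recontact the two units whose errors point the same way.
abs_alt, bias_alt = remaining(contrib, [2, 3])

print("before:", np.abs(contrib).sum(), contrib.sum())
print("largest scores recontacted:", abs_after, bias_after)
print("same-direction units recontacted:", abs_alt, bias_alt)
```

Recontacting the two largest scores brings the sum of absolute contributions down the most, yet the net bias is unchanged because those two errors cancelled each other to begin with. This is why the nature of the bias has to be analysed separately.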
Conclusion
• The Euclidean and the max unit scores are
good choices in Situation 3
• In Situation 2 one has to analyse the nature
of the bias