EXPERIMENTAL VALIDATION OF A RISK ASSESSMENT METHOD

EELCO VRIEZEKOLK, SANDRO ETALLE, ROEL WIERINGA
OVERVIEW
THE PAPER, AND THIS PRESENTATION
Contribution: an approach to testing reliability of a method
Step 1: Identify and mitigate causes of variation
Step 2: Validation of effectiveness of mitigations
Step 3: Conduct experiment
Step 4: Analyse the results
Motivation: we are developing a risk assessment method,
for telecommunication service availability risks.
Results:
- the approach, description and test
- specific lessons for improving our risk assessment
- expert judgements (such as ours) can have low reliability
Experimental Validation of a Risk Assessment Method, REFSQ2015
24-3-2015
RELIABILITY OF A METHOD
WHAT IS IT, WHY IS IT RELEVANT
A method is like a recipe, like an instrument.
If you use it as instructed, it will yield desired results.
Feasibility is desirable. Effort, skillset.
Validity is desirable. Sometimes difficult to ascertain.
Reliability is desirable.
- Repeatability
- Robustness
- Consistency
- Reproducibility (test–test)
- Stability (test–retest)
Source: Krippendorff, K. (2004) Content analysis: an introduction to its methodology, Sage Publications.
To test reliability: apply the method multiple times, compare the results.
DESIGN OF EXPERIMENTS TO TEST RELIABILITY
TEST THROUGH MULTIPLE APPLICATIONS, COMPARE RESULTS
- Unreliable method → high variation in results
- Low variation in results → reliable method
- High variation in results → unreliable method?
Causes of variation:
- Internal
  1.  the method itself
- Contextual
  2.  the subjects applying the method
  3.  the case to which the method is applied
  4.  the environment in which the method is applied.
[Figure: diagram with the elements Environment, Subjects, Case, Method and Results]
Control contextual causes, then argue that observed variation must be due to causes
internal to the method.
STEP 1: IDENTIFY AND MITIGATE CAUSES OF VARIATION
AND STEP 2: VALIDATION OF MITIGATIONS
a.  Subjects applying the method: understand, be capable, be motivated.
- misapplication and misunderstanding
- lack of experience or expert knowledge
- sufficiently motivated
Use of students is not necessarily a threat to validity.
b.  Case to which the method is applied.
- avoid ambiguity
- avoid contentious cases and interobserver disagreement
c.  Environment during application.
- time, resources
- physical comfort
STEP 4: ANALYSE RESULTS
MEASURE INTER-RATER RELIABILITY (IRR)
IRR = 1 – (Observed disagreement / Expected disagreement)
- Scale: nominal (categorical), ordinal, interval, ratio scales.
- Complete vs. partially incomplete (omitted items).
Measure                 Raters   Scale     Missing data
Cohen’s κ, Scott’s π    2        nominal   no
Fleiss’ κ               ≥2       nominal   no
Spearman’s ρ            2        ordinal   no
Krippendorff’s α        ≥2       any       yes
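To make the IRR formula concrete, the following is a minimal sketch of Krippendorff’s α for nominal and ordinal scales with missing data, following the standard coincidence-matrix formulation. It is not the code used in the paper. The function name, the `scale` parameter, the use of `None` for an omitted item, and the example ratings are all assumptions made for illustration; ordinal classes are assumed to be encoded as comparable integers.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha(ratings, scale="ordinal"):
    """Krippendorff's alpha. ratings: one list per rater, one entry per item;
    None marks an omitted (abstained) rating. Hypothetical helper, a sketch."""
    if scale not in ("nominal", "ordinal"):
        raise ValueError("scale must be 'nominal' or 'ordinal'")

    # Coincidence counts: every ordered pair of values given to the same item
    # by two different raters, weighted by 1 / (ratings for that item - 1).
    coincidences = Counter()
    for item_values in zip(*ratings):
        present = [v for v in item_values if v is not None]
        if len(present) < 2:
            continue  # items rated by fewer than two raters cannot be compared
        weight = 1.0 / (len(present) - 1)
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += weight

    # Marginal totals per value, and the total number of pairable ratings.
    totals = Counter()
    for (a, _b), count in coincidences.items():
        totals[a] += count
    values = sorted(totals)
    n = sum(totals.values())

    def distance(a, b):
        """Squared difference between two values on the chosen scale."""
        if scale == "nominal":
            return 0.0 if a == b else 1.0
        lo, hi = sorted((a, b))
        between = sum(totals[v] for v in values if lo <= v <= hi)
        return (between - (totals[lo] + totals[hi]) / 2.0) ** 2

    observed = sum(c * distance(a, b) for (a, b), c in coincidences.items())
    expected = sum(totals[a] * totals[b] * distance(a, b)
                   for a in values for b in values)
    if expected == 0:
        raise ValueError("alpha is undefined: the ratings show no variation")
    # 1 - observed disagreement / expected disagreement
    return 1.0 - (n - 1) * observed / expected


if __name__ == "__main__":
    # Illustrative ratings only (three raters, five items), not experiment data.
    team_a = [1, 2, 3, None, 2]
    team_b = [1, 3, 3, 2, None]
    team_c = [2, 2, 3, 2, 2]
    print(krippendorff_alpha([team_a, team_b, team_c], scale="ordinal"))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance. Items with fewer than two ratings are skipped, which is how the sketch tolerates abstentions.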
OUR EXPERIMENT
VALIDATING RELIABILITY OF OUR RISK ASSESSMENT METHOD
Risks to an organization, availability of telecommunication services.
- Hazards on technical components; existing services.
- Based on diagrams of telecommunication services.
- Expert judgement: lack of information (infrastructure, risk scenarios)
for each component:
    for each vulnerability:
        assess likelihood of incident (ordinal / abstain),
        assess impact of incident (ordinal / abstain).
Six teams of three students, each team making 138 frequency and 138
impact assessments.
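The nested assessment loop above can be sketched as follows. This is an illustration under assumptions, not the method's actual tooling: the function names, the dictionary layout, the `ABSTAIN` marker, and the vulnerability names are hypothetical. The second helper lines up the assessments of several teams item by item, which is the shape a reliability measure that tolerates omitted items would need.

```python
ABSTAIN = None  # hypothetical marker for "team abstained from judging"


def collect_assessments(vulnerabilities_by_component, assess):
    """Run the nested loop: one likelihood and one impact per vulnerability.

    assess(component, vulnerability) returns (likelihood, impact), each an
    ordinal class or ABSTAIN.
    """
    assessments = {}
    for component, vulnerabilities in vulnerabilities_by_component.items():
        for vulnerability in vulnerabilities:
            likelihood, impact = assess(component, vulnerability)
            assessments[(component, vulnerability)] = {
                "likelihood": likelihood,
                "impact": impact,
            }
    return assessments


def ratings_matrix(team_assessments, aspect):
    """Align several teams' assessments: one row per team, one column per item."""
    items = sorted(set().union(*team_assessments))
    rows = [[team.get(item, {}).get(aspect, ABSTAIN) for item in items]
            for team in team_assessments]
    return items, rows


if __name__ == "__main__":
    # Component names follow the slide's diagram; vulnerabilities are made up.
    model = {
        "mail server": ["power failure", "configuration error"],
        "ethernet cable a": ["breakage"],
    }
    team_1 = collect_assessments(model, lambda c, v: (2, 3))
    team_2 = collect_assessments(
        model, lambda c, v: (ABSTAIN, 2) if c == "mail server" else (1, 3))
    items, likelihoods = ratings_matrix([team_1, team_2], "likelihood")
    for item, column in zip(items, zip(*likelihoods)):
        print(item, column)
```

Keeping abstentions as an explicit marker, rather than dropping the item, preserves the information that a team looked at the item and declined to judge it.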
[Figure: example diagram of telecommunication services. Labels include employee 1, employee 2, external contact, carton designer, graphic artist; desktop, laptop, work stations a and b; desk cables a to d, ethernet cables a to d; relay server, ext. mail server (MX 1, MX 2); DMZ, Department 1, Department 2, Server LAN; Firewall ext, Firewall DMZ, Firewall int; File server, Mail server, DHCP server, DNS server; internet, fiber optic.]
CONCLUSIONS
EXPERT JUDGEMENT, LOW RELIABILITY
Approach to testing reliability expected to be more generally applicable.
Our experiment’s results had low reliability (high variation).
- partially explained by contextual causes (not fully mitigated)
- found one internal cause for variation, leading to improvements to our method.
- our method has to rely on expert judgements
  –  Knowledge on architecture & risk scenarios is incomplete.
Implications for expert-based methods:
- Experts retain a responsibility.
- (Risk assessment) methods must support justification.