Heuristic Evaluation CS 160, Spring 2002 Professor James Landay February 25, 2002

Interface Hall of Shame or Fame?
• Standard MS calculator on all Win95/98/NT/2000/XP

Interface Hall of Shame!
• What is the empty button above MC for?
• Can’t resize
• Blue for numbers!
  • goes against all we know
  • hard to focus on
  • combined with red -> eye strain

Outline
• Review of speech UI design
• Discount usability engineering
• Heuristic evaluation overview
• Administrivia
• Heuristics
• How to perform a HE
• HE vs. user testing
• How well does HE work?

Review of Speech UI Design
• Speech UIs allow
  • more natural computer access
  • computer use in more situations (e.g., hands free)
• Speech UIs are hard to get to work well because of
  • lack of visible state
  • the tax on working memory
  • recognition problems
  • natural language understanding, which is also a hard problem
• Multimodal UIs are
  • a combination of two or more natural input modalities
  • e.g., speech & pen, speech & gesture, etc.
• Multimodal UIs address some of the problems by
  • helping disambiguate ambiguous inputs
  • helping w/ corrections

Iterative Design
• Design: task analysis, contextual inquiry, scenarios, sketching
• Prototype: low-fi (paper, DENIM, SUEDE)
• Evaluate: low-fi testing, … today HE

Discount Usability Engineering (?)
• Cheap
  • no special labs or equipment needed
  • the more careful you are, the better it gets
• Fast
  • on the order of 1 day to apply
  • standard usability testing may take a week
• Easy to use
  • can be taught in 2-4 hours

Examples
• Walkthroughs
  • put yourself in the shoes of a user
  • like a code walkthrough
• Low-fi prototyping
• Action analysis
  • GOMS (add times to formal action analysis)
• On-line, remote usability tests
• Heuristic evaluation

Heuristic Evaluation
• Developed by Jakob Nielsen
• Helps find usability problems in a UI design
• Small set (3-5) of evaluators examine UI
  • independently check for compliance with usability principles (“heuristics”)
  • different evaluators will find different problems
  • evaluators only communicate afterwards
  • findings are then aggregated
• Can perform on working UI or on sketches
  • I’ve found in this class it works better on working UI

Why Multiple Evaluators?
• No single evaluator finds every problem
• Good evaluators find both easy & hard ones

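The aggregation step can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the evaluator names, the problem labels, and the `aggregate_findings` helper are all made up. It shows that merging independent lists yields more distinct problems than any one evaluator reported.

```python
from collections import defaultdict

def aggregate_findings(findings_by_evaluator):
    """Merge per-evaluator problem lists into one list of distinct problems.

    findings_by_evaluator maps an evaluator name to the set of problem
    labels that evaluator reported (hypothetical labels, for illustration).
    Returns a dict: problem label -> sorted list of evaluators who found it.
    """
    found_by = defaultdict(list)
    for evaluator, problems in findings_by_evaluator.items():
        for problem in problems:
            found_by[problem].append(evaluator)
    return {p: sorted(names) for p, names in found_by.items()}

# Made-up data: three evaluators who worked independently.
reports = {
    "evaluator_a": {"empty button above MC", "window cannot be resized"},
    "evaluator_b": {"window cannot be resized", "blue digits hard to read"},
    "evaluator_c": {"blue digits hard to read"},
}

merged = aggregate_findings(reports)
for problem in sorted(merged):
    print(f"{problem}: found by {len(merged[problem])} of {len(reports)}")
```

In this toy data no evaluator reports all three problems, but the merged list contains all of them.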
Heuristic Evaluation Process
• Evaluators go through UI several times
  • inspect various dialogue elements
  • compare with list of usability principles
  • consider other principles/results that come to mind
• Usability principles
  • Nielsen’s “heuristics”
  • supplementary list of category-specific heuristics
    • from competitive analysis & user testing of existing products
• Use violations to redesign/fix problems

Heuristics (original)
• H1-1: Simple & natural dialog
• H1-2: Speak the users’ language
• H1-3: Minimize users’ memory load
• H1-4: Consistency
• H1-5: Feedback
• H1-6: Clearly marked exits
• H1-7: Shortcuts
• H1-8: Precise & constructive error messages
• H1-9: Prevent errors
• H1-10: Help and documentation

Heuristics (revised set)
• H2-1: Visibility of system status
  • keep users informed about what is going on
  • example: pay attention to response time
    • 0.1 sec: no special indicators needed, why?
    • 1.0 sec: user tends to lose track of data
    • 10 sec: max. duration if user is to stay focused on action
    • for longer delays, use percent-done progress bars
[Figure: status message "searching database for matches"]

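The response-time guidance can be read as a small decision rule. The sketch below is one possible encoding, not something the slide prescribes. The function name is hypothetical, and the "busy indicator" used between 0.1 and 10 seconds is an assumption, since the slide does not name a widget for that range.

```python
# Thresholds in seconds, taken from the slide.
INSTANT_LIMIT = 0.1   # below this, no special indicator is needed
FOCUS_LIMIT = 10.0    # longest delay during which a user stays focused

def feedback_for_delay(expected_seconds):
    """Pick a status indicator from the expected delay of an operation."""
    if expected_seconds <= INSTANT_LIMIT:
        return "none"
    if expected_seconds <= FOCUS_LIMIT:
        # Assumption: some lightweight "working" cue, e.g. a busy cursor.
        return "busy indicator"
    # Longer delays: the slide recommends a percent-done progress bar.
    return "percent-done progress bar"

for delay in (0.05, 3, 30):
    print(f"{delay} sec -> {feedback_for_delay(delay)}")
```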
Heuristics (cont.)
• H2-2: Match between system & real world
  • speak the users’ language
  • follow real world conventions
• Bad example: Mac desktop
  • Dragging disk to trash
    • should delete it, not eject it

Heuristics (cont.)
• H2-3: User control & freedom
  • “exits” for mistaken choices, undo, redo
  • don’t force down fixed paths
    • like that BART machine…
• Wizards
  • must respond to Q before going to next
  • for infrequent tasks
    • (e.g., modem config.)
  • not for common tasks
  • good for beginners
    • have 2 versions (WinZip)

Heuristics (cont.)
• H2-4: Consistency & standards

Heuristics (cont.)
• H2-5: Error prevention
• H2-6: Recognition rather than recall
  • make objects, actions, options, & directions visible or easily retrievable
• MS Web Pub. Wiz.
  • Before dialing
    • asks for id & password
  • When connecting
    • asks again for id & pw

Heuristics (cont.)
• H2-7: Flexibility and efficiency of use
  • accelerators for experts (e.g., gestures, kb shortcuts)
  • allow users to tailor frequent actions (e.g., macros)
[Figure: Edit menu with Cut, Copy, Paste]

Heuristics (cont.)
• H2-8: Aesthetic and minimalist design
  • no irrelevant information in dialogues

Heuristics (cont.)
• H2-9: Help users recognize, diagnose, and recover from errors
  • error messages in plain language
  • precisely indicate the problem
  • constructively suggest a solution

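As a rough illustration of the three properties listed for H2-9, the sketch below builds one message that states the problem in plain language, gives the precise numbers, and suggests a way out. The scenario (a full disk), the function name, and the wording are invented for this example.

```python
def describe_save_error(filename, needed_kb, free_kb):
    """Build an error message that is plain, precise, and constructive."""
    shortfall = needed_kb - free_kb
    return (
        # Plain language: no error codes or jargon.
        f'Could not save "{filename}" because the disk is full. '
        # Precise: say exactly what is wrong.
        f"The file needs {needed_kb} KB but only {free_kb} KB is free. "
        # Constructive: suggest a solution.
        f"Free at least {shortfall} KB or save to another disk."
    )

print(describe_save_error("report.doc", 480, 120))
```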
Heuristics (cont.)
• H2-10: Help and documentation
  • easy to search
  • focused on the user’s task
  • list concrete steps to carry out
  • not too large

Administrivia
• Please turn in the homework
  • it must also be online
  • Wai-ling will help get you info on how to move to our Web server by next week
• We will be webcasting by next Monday
• Other questions?

Phases of Heuristic Evaluation
1) Pre-evaluation training
  • give evaluators needed domain knowledge and information on the scenario
2) Evaluation
  • individuals evaluate and then aggregate results
3) Severity rating
  • determine how severe each problem is (priority)
  • can do this first individually & then as a group
4) Debriefing
  • discuss the outcome with design team

How to Perform Evaluation
• At least two passes for each evaluator
  • first to get feel for flow and scope of system
  • second to focus on specific elements
• If system is walk-up-and-use or evaluators are domain experts, no assistance needed
  • otherwise might supply evaluators with scenarios
• Each evaluator produces list of problems
  • explain why with reference to heuristic or other information
  • be specific and list each problem separately

Examples
• Can’t copy info from one window to another
  • violates “Minimize the users’ memory load” (H1-3)
  • fix: allow copying
• Typography uses mix of upper/lower case formats and fonts
  • violates “Consistency and standards” (H2-4)
  • slows users down
  • probably wouldn’t be found by user testing
  • fix: pick a single format for entire interface

How to Perform Evaluation
• Why separate listings for each violation?
  • risk of repeating problematic aspect
  • may not be possible to fix all problems
• Where problems may be found
  • single location in UI
  • two or more locations that need to be compared
  • problem with overall structure of UI
  • something that is missing
    • hard w/ paper prototypes so work extra hard on those
    • note: sometimes features are implied by design docs and just haven’t been “implemented” – relax on those

Severity Rating
• Used to allocate resources to fix problems
• Estimates of need for more usability efforts
• Combination of
  • frequency
  • impact
  • persistence (one time or repeating)
• Should be calculated after all evals. are in
• Should be done independently by all judges

Severity Ratings (cont.)
0 - don’t agree that this is a usability problem
1 - cosmetic problem
2 - minor usability problem
3 - major usability problem; important to fix
4 - usability catastrophe; imperative to fix

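One way to turn independent ratings on the 0-4 scale into a priority list is to average them per problem and sort worst-first. Averaging is an assumption made for this sketch, since the slides do not say how the judges' ratings are combined. The problems and scores below are made up.

```python
from statistics import mean

# The 0-4 scale from the slide.
SEVERITY_LABELS = {
    0: "don't agree that this is a usability problem",
    1: "cosmetic problem",
    2: "minor usability problem",
    3: "major usability problem; important to fix",
    4: "usability catastrophe; imperative to fix",
}

def rank_by_severity(ratings_by_problem):
    """Average each problem's independent ratings and sort worst-first."""
    for problem, ratings in ratings_by_problem.items():
        if any(r not in SEVERITY_LABELS for r in ratings):
            raise ValueError(f"rating outside 0-4 for {problem!r}")
    averaged = {p: mean(r) for p, r in ratings_by_problem.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical ratings from three judges who rated independently.
ratings = {
    '"Save" vs. "Write file" wording': [3, 3, 2],
    "mixed fonts and case": [1, 2, 1],
    "can't copy between windows": [4, 3, 4],
}

ranked = rank_by_severity(ratings)
for problem, score in ranked:
    print(f"{score:.1f}  {problem}")
```

A group discussion could then adjust the averaged scores, in line with rating "first individually & then as a group".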
Debriefing
• Conduct with evaluators, observers, and development team members
• Discuss general characteristics of UI
• Suggest potential improvements to address major usability problems
• Dev. team rates how hard things are to fix
• Make it a brainstorming session
  • little criticism until end of session

Severity Ratings Example
1. [H1-4 Consistency] [Severity 3][Fix 0]
The interface used the string "Save" on the first screen for saving the user's file, but used the string "Write file" on the second screen. Users may be confused by this different terminology for the same function.

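The bracketed header in the example suggests a simple record per problem. The sketch below is a hypothetical structure, not a format defined in the lecture. The field names are invented, and reading "Fix" as the development team's estimate of how hard the fix is (per the Debriefing slide) is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    number: int
    heuristic: str    # e.g. "H1-4 Consistency"
    severity: int     # 0-4, from the severity scale
    fix_effort: int   # the "Fix" rating; its meaning and scale are assumed
    description: str

    def header(self):
        """Format the header line the way the slide's example does."""
        return (f"{self.number}. [{self.heuristic}] "
                f"[Severity {self.severity}][Fix {self.fix_effort}]")

finding = Finding(
    number=1,
    heuristic="H1-4 Consistency",
    severity=3,
    fix_effort=0,
    description='"Save" on the first screen, "Write file" on the second.',
)
print(finding.header())
print(finding.description)
```

Keeping one record per problem also makes it easy to list each violation separately, as the evaluation slides ask.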
HE vs. User Testing
• HE is much faster
  • 1-2 hours each evaluator vs. days-weeks
• HE doesn’t require interpreting user’s actions
• User testing is far more accurate (by def.)
  • takes into account actual users and tasks
  • HE may miss problems & find “false positives”
• Good to alternate between HE & user testing
  • find different problems
  • don’t waste participants

Results of Using HE
• Discount: benefit-cost ratio of 48 [Nielsen94]
  • cost was $10,500 for benefit of $500,000
  • value of each problem ~$15K (Nielsen & Landauer)
  • how might we calculate this value?
    • in-house -> productivity; open market -> sales
    • customer calls to your customer service center
• Correlation between severity & finding w/ HE

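The quoted ratio follows directly from the two dollar figures on the slide. The snippet below just redoes that arithmetic; the variable names are arbitrary.

```python
cost = 10_500      # dollars, from the slide
benefit = 500_000  # dollars, from the slide

ratio = benefit / cost
print(f"benefit-cost ratio: {ratio:.1f} (about {round(ratio)})")
```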
Results of Using HE (cont.)
• Single evaluator achieves poor results
  • only finds 35% of usability problems
• 5 evaluators find ~75% of usability problems
  • why not more evaluators? 10? 20?
    • adding evaluators costs more
    • many evaluators won’t find many more problems

Decreasing Returns
[Two graphs: problems found; benefits / cost]
• Caveat: graphs for a specific example

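A simple way to see why returns decrease is to assume each evaluator independently finds the same fraction of the problems, so n evaluators find 1 - (1 - rate)^n of them. This is the form usually attributed to Nielsen & Landauer, but treating evaluators as independent and using 0.35 as the rate are assumptions of this sketch, not results from the lecture.

```python
def proportion_found(evaluators, single_rate):
    """Share of problems found if each evaluator independently finds
    a fraction `single_rate` of them: 1 - (1 - rate) ** n."""
    return 1 - (1 - single_rate) ** evaluators

previous = 0.0
for n in (1, 3, 5, 10, 20):
    found = proportion_found(n, 0.35)
    # The gain over the previous row shrinks as evaluators are added.
    print(f"{n:2d} evaluators: {found:.0%} (gain {found - previous:.0%})")
    previous = found
```

Under this idealized model five evaluators would find roughly 88% of the problems, more than the ~75% quoted earlier. That gap suggests the independence assumption is optimistic, because real evaluators tend to overlap on the same problems. Either way, each added evaluator contributes less than the one before.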
Simple HE Homework
• Given a poorly designed web page
• Find at least 15 distinct heuristic violations
  • number the violations on the diagram
  • list violation with number on another sheet
  • give a solution to fix
• Individual assignment
• Due next Monday in class

Summary
• Heuristic evaluation is a discount method
• Have evaluators go through the UI twice
• Ask them to see if it complies with heuristics
  • note where it doesn’t and say why
• Combine the findings from 3 to 5 evaluators
• Have evaluators independently rate severity
• Discuss problems with design team
• Alternate with user testing

Next Time
• Web Design #1
  • Heuristics hard to design from -> Web Design Patterns
• Read
  • L & R 4.3-4.4
  • Nielsen HE chapter (read 5 links under "Jakob Nielsen's Online Writings on Heuristic Evaluation")
  • Chapter 2 from VDLH
  • Pattern Group C – Creating a Powerful Home Page