i247: Information Visualization and Presentation
Marti Hearst
Data Types and Graph Types

Outline
• The Roles and Stages of Visualization (briefly)
• Data Models and Types of Data
• Which Kinds of Graphs for Which Types of Data?
• Class Exercise
The Roles and Stages of
Visualization
What Visualization Can Do (Ware)
• Allows comprehension of huge amounts of data
• Allows perception of emergent properties
• Enables problems with the data to stand out
• Facilitates understanding at both large and small
scales; patterns linking local features
• Facilitates hypothesis formation
What Visualization Can Do (Tufte ’83)
• Show the data
• Induce the viewer to think about the data
• Avoid distorting what the data have to say
• Present many numbers in a small space
• Make large data sets coherent
• Encourage the eye to compare different pieces
of data
• Reveal the data at several levels of detail,
from overview to fine structure
• Serve a clear purpose:
– Description, exploration, tabulation, or decoration
• Be closely integrated with the statistical and
verbal descriptions of a data set
Stages of Visualization (Ware)
• Collection and storage of data
• Preprocessing to transform data into
something understandable
• Hardware and graphics algorithms for
producing an image on the screen
• Human perceptual and cognitive system.
• (I think he’s missing a stage … Design of the
visualization.)
Put it Into Questions
• What are our goals?
• What questions do we want to answer?
• What kind of data might we collect?
• How might we convey the information
associated with this data?
Visualization Components
• Human Abilities
– Visual perception
– Cognition
– Motor skills
Imply
• Design Principles
– Visual display
– Interaction
Inform design
• Frameworks
– Data types
– Tasks
Constrain design
• Techniques
– Graphs & plots
– Maps
– Trees & Networks
– Volumes & Vectors
– …
• Design Process
– Iterative design
– Design studies
– Evaluation
From Melanie Tory
Data Models and Types of Data
Basic Elements of a Data Model
• A data model represents some aspect of the
world
• Data models consist of these basic elements:
– objects
– values (also called attributes)
– relations
Adapted from Stone & Zellweger
Basic Elements: Objects
• Objects are items of interest
– people, plants, cars, films, etc…
• Objects allow you to define and reason about
a domain
– ecosystem: ponds, streams, woodlands, mountains,
plants, animals, etc.
Adapted from Stone & Zellweger
Basic Elements: Values
• Values (or attributes) are properties of objects
• Two major types
– quantitative
– categorical
• Appropriate visualizations often depend upon
the type of the data values
Adapted from Stone & Zellweger
Basic Elements: Relations
• Relations relate two or more objects
– leaves are part of a plant
– a department consists of employees
• Ecosystem
– connections between streams and lakes
– predator/prey network of what eats what
– …
Adapted from Stone & Zellweger
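The three basic elements (objects, values, relations) can be sketched as a small data structure. This is only an illustration in Python, not a model taken from Stone & Zellweger: the class names `DataObject` and `Relation`, the helper functions, and every attribute value below are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """An object: an item of interest in the domain."""
    name: str
    values: dict = field(default_factory=dict)  # attribute name -> value


@dataclass
class Relation:
    """A relation: a named link between two or more objects."""
    kind: str
    members: tuple  # names of the related objects, in order


# Hypothetical ecosystem objects; the attribute values are invented.
objects = {
    "heron": DataObject("heron", {"kind": "animal", "mass_kg": 2.4}),
    "frog": DataObject("frog", {"kind": "animal", "mass_kg": 0.03}),
    "stream": DataObject("stream", {"kind": "water body", "length_m": 1200.0}),
    "lake": DataObject("lake", {"kind": "water body", "area_m2": 85000.0}),
}

relations = [
    Relation("eats", ("heron", "frog")),          # predator/prey network
    Relation("flows_into", ("stream", "lake")),   # stream-to-lake connection
]


def related_to(name, kind):
    """Names of the objects that `name` points to through relation `kind`."""
    return [r.members[1] for r in relations
            if r.kind == kind and r.members[0] == name]


def is_quantitative(value):
    """Treat numbers as quantitative values, everything else as categorical."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)
```

For instance, `related_to("heron", "eats")` looks up the prey of the heron, and `is_quantitative(objects["frog"].values["mass_kg"])` distinguishes a quantitative value from a categorical one such as `"kind"`.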
Types of Data (Ware)
• Entities
• Relationships
• Attributes of Entities or Relationships
– Nominal / Ordinal / Interval / Ratio (Stevens ’46)
– Categorical / Integer / Real
• Operations Considered as Data
– Mathematical
– Merging lists
– Transforming data, etc.
– Metadata (derived data)
Types of Data (Few)
• Quantitative (allows arithmetic operations)
• Categorical (group, identify & organize; no arithmetic)
– Nominal
– Ordinal
– Interval
– Hierarchical
Adapted from Stone & Zellweger
Types of Data
• Quantitative (allows arithmetic operations)
– 123, 29.56, …
• Categorical (group, identify & organize; no arithmetic)
– Nominal (name only, no ordering)
e.g. Direction: North, East, South, West
– Ordinal (ordered, not measurable)
e.g. First, second, third …; Hot, warm, cold
– Interval (starts out as quantitative, but is made categorical by
subdividing into ordered ranges)
e.g. Time: Jan, Feb, Mar; 0-999, 1000-4999, 5000-9999, 10000-19999, …
– Hierarchical (successive inclusion)
e.g. Region: Continent > Country > State > City; Animal > Mammal > Horse
Adapted from Stone & Zellweger
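To make the interval case concrete, here is a minimal Python sketch of turning a quantitative value into one of the ordered ranges listed on the slide. The names `to_interval`, `LOWER_BOUNDS`, `LABELS`, and `UPPER_LIMIT` are hypothetical, and because the slide's list of ranges continues with "…", the sketch simply rejects values past the last range shown rather than guessing further bins.

```python
from bisect import bisect_right

# Lower bounds and labels of the ordered ranges shown on the slide.
LOWER_BOUNDS = [0, 1000, 5000, 10000]
LABELS = ["0-999", "1000-4999", "5000-9999", "10000-19999"]
UPPER_LIMIT = 20000  # exclusive; the slide's list continues beyond this


def to_interval(value):
    """Map a quantitative value to its interval category (an ordered range)."""
    if not 0 <= value < UPPER_LIMIT:
        raise ValueError(f"{value} is outside the ranges shown on the slide")
    # bisect_right counts how many lower bounds are <= value;
    # the last of those bounds starts the range that contains the value.
    return LABELS[bisect_right(LOWER_BOUNDS, value) - 1]


def sort_intervals(labels):
    """Sort interval labels by range order, not alphabetically."""
    return sorted(labels, key=LABELS.index)
```

After binning, arithmetic on the labels no longer makes sense, but their order still does, which is why `sort_intervals` ranks by position in `LABELS` instead of comparing the strings.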
Which Types of Graphs for
Which Kinds of Data?
Quantitative Against Categorical
From Few, "Quantitative vs. Categorical Data: A Difference
Worth Knowing", DM Review Magazine, April 2005
Quantitative against Quantitative
From Few, "Quantitative vs. Categorical Data: A Difference
Worth Knowing", DM Review Magazine, April 2005
Questions to ask when creating a graph
• Is a graph needed?
– Yes, if illustrating relationships among measurements
• What information is being conveyed?
– What is most important?
– Start by writing a title
Questions to ask when creating a graph
• What data is needed to answer specific questions?
– Overview? Relationships?
– Grice’s maxims
• combine relevant information together
• don’t show extraneous information
• Who is your audience?
What Format to Use?
• Bertin has a notion of efficiency
• Tufte says “show the data”
• Let’s start with familiar graph types
– line graphs
– bar charts
– scatter plots
– layer graphs
• When to use each?
Anatomy of a Graph
(Kosslyn 89)
• Framework
– sets the stage
– kinds of measurements, scale, ...
• Content
– marks
– point symbols, lines, areas, bars, …
• Labels
– title, axes, tic marks, ...
When to use which type?
• Line graph
– x-axis requires quantitative variable
– differences among contiguous values
– familiar/conventional ordering among ordinals
• Bar graph
– comparison of relative point values
• Scatter plot
– convey overall impression of relationship between
two variables
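These rules of thumb can be sketched as a small lookup. This is a simplified Python illustration, not a complete rule set from the lecture: the function name `suggest_graph`, the three goal labels, the assumption that the plotted measure is quantitative, and the choice to return `None` when no rule applies are all made up for the example.

```python
QUANTITATIVE = "quantitative"
ORDINAL = "ordinal"
NOMINAL = "nominal"


def suggest_graph(x_type, y_type, goal):
    """Suggest a graph type from the data types and the goal.

    goal is one of:
      "trend"        - differences among contiguous values
      "compare"      - comparison of relative point values
      "relationship" - overall impression of how two variables relate
    Returns None when none of the slide's rules applies.
    """
    if y_type != QUANTITATIVE:
        return None  # assumption: the plotted measure is quantitative
    if goal == "relationship" and x_type == QUANTITATIVE:
        return "scatter plot"
    if goal == "trend" and x_type in (QUANTITATIVE, ORDINAL):
        # A line needs a quantitative x-axis, or ordinals with a
        # familiar/conventional ordering.
        return "line graph"
    if goal == "compare":
        return "bar graph"
    return None
```

A nominal x-axis with a "trend" goal returns `None` on purpose: connecting unordered categories with a line would suggest contiguity that the data do not have.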
What to put on the x axis?
• Independent vs. Dependent variables
– we often measure one quantitative variable against
another
– the value of one changes in relation to the other
– the dependent variable changes relative to the
independent one
– the independent variable acts as a “measuring stick”
• Independent usually goes on the x (horizontal)
axis
Independent vs. Dependent
• Independent vs. Dependent variables
– heat in degrees against time
– sales against season
– tax revenue against city
• What happens when there is more than one
independent variable?
– Choose one for the x axis, and another as a variation
in the mark (color, shape)
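The idea of putting one independent variable on the x axis and encoding a second one in the mark can be sketched as follows. This is a hedged Python illustration that only prepares the data for plotting: the function name `split_into_series`, the mark names, and the sales records are all hypothetical, and the numbers are invented.

```python
from collections import defaultdict
from itertools import cycle

# Hypothetical records: (season, region, sales). The numbers are invented.
records = [
    ("spring", "north", 120),
    ("summer", "north", 150),
    ("spring", "south", 90),
    ("summer", "south", 160),
]


def split_into_series(rows, marks=("circle", "square", "triangle")):
    """Group rows by the second independent variable and give each group a mark.

    Each row is (x_value, group_value, y_value): x_value goes on the x axis,
    group_value picks the mark (shape or color), y_value is the dependent one.
    """
    series = defaultdict(list)
    for x_value, group_value, y_value in rows:
        series[group_value].append((x_value, y_value))
    # Reuse marks in order if there are more groups than marks.
    mark_for = dict(zip(series, cycle(marks)))
    return {group: {"mark": mark_for[group], "points": points}
            for group, points in series.items()}


series = split_into_series(records)
```

Here season is the x-axis variable and region varies the mark, so each region becomes its own series drawn over the same seasons.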
Few on How to Show Information
• The best way to show a single value?
– Use a textual representation.
– Why?
• How to draw attention to a number?
Few on How to Show Information
• What are tables good for?
– Data lookup
– Hierarchical relationships
Class Exercise
How to Combine Data Types?
• Class Exercise:
– Using data about autos from the 1970s
– Each person gets a column of data
• First, identify the data type
• Then, stand up
• Then, repeat the following several times:
– Walk up to someone else. If they have a
different column than you do, discuss whether
and how you should plot your two columns.
» If yes, what question are you answering?
» If no, why not?
• Then, repeat this, but with groups of three
people.