Seminar: Algorithms for Large Social Networks in Theory and Practice
Yaroslav Akhremtsev · Peter Sanders · Christian Schulz · Darren Strash
Institute of Theoretical Informatics – Algorithmics
KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association
www.kit.edu
Introductions
Please introduce yourself, by stating:
Your name
What year you are in (+ Bachelor’s or Master’s)
Who you work with, and what you work on (if applicable)
If you want: Why social networks?
Yaroslav Akhremtsev
PhD researcher with Peter Sanders
Search data structures, some external memory algorithms
Christian Schulz
2nd year postdoc with Peter Sanders
Expert in engineering efficient graph partitioning algorithms
Motivation
Make money
Advertising
Product recommendation
Understand our behavior
Sociologists: How do communities form? How do we identify communities? Who is “important”?
Military: Who is a terrorist? Where will an attack happen next?
Emergency responders: How do we communicate, and how can we collaborate better?
Algorithmics
Tons of publicly available data
Interesting open problems
Exploiting the structure of real-world data
Social Networks
Facebook, Google+, Twitter
Communication (e-mail, commenting on blog)
Contact (meetings, sex, war)
Citations
Items purchased together (+ people who made similar
purchases)
Working together (co-authors, actors)
Criminal/Terrorist networks
Political corruption
What do we want to compute?
Graph Features
Subgraphs: triangles, cliques, cycles, paths
Community overlap
Clusters / partitions
Graph Measures
Correlation coefficient
Centrality
Diameter
Largest clique / independent set / vertex cover
Models
Detect and predict patterns (brokerage)
Theory and Practice
Types of articles in this area...
Theory: Algorithm, Application Problem
Experiments: Experiments, Algorithm, Application Problem
Applications: Understanding, Algorithm, Application Problem
This Seminar
End-to-end (synthesis): Understanding, Experiments, Algorithm, Application Problem
Assigned a topic problem, with an
interesting “seed” paper.
Theory paper? Further discuss
applications, experiments, and what we
learn from applying the algorithm.
Experimental paper? Further discuss
theory, applications, understanding.
Applications? Further discuss theory,
experiments, and what we learn from
applying the algorithm.
Course Outline
Presentations
To be made with Ipe [http://ipe7.sourceforge.net/]
5-minute teaser presentation
45-minute full presentation
Comprehensive write-up
To be written with the LaTeX template
12–15 pages
Supervisor
Work with a supervisor, who will help guide you with
presentations and write-up.
Grade Breakdown: Presentations
5-minute teaser presentation - 10%
Motivation & persuasiveness
Clarity of presentation
Use of illustrative images
45-minute full presentation (40min + 5min for questions) - 50%
Same as teaser presentation, plus:
Incorporating feedback from teaser presentation
Cohesiveness
Showing clear understanding of methods
Discussion of theory and applications
Ability to answer questions
Grade Breakdown: Write-up
Write-up - 40%
Grammar
Clear structure
Analysis of competitors and applications
Show clear understanding of method
Critical analysis of techniques
Guidelines: Presentations
Plan to talk for about 2 minutes per slide
Avoid discussing basics; use the majority of your time to discuss the unique properties of your subject
Use bullet points, not paragraphs
Avoid long theorems; keep it simple
Use helpful images!
Proofread your slides
Unsure about something? Talk to your supervisor
It’s ok to be nervous. Keep calm and carry on
Guidelines: Write-up
Structure:
Use the template
Provide a succinct but complete abstract
Why is this topic so important, and what makes the
paper great/unique?
Complete bibliography with BibTeX
Use search tools such as Google Scholar to find related
work
Thoroughly discuss related work
Discuss the end-to-end impact of your topic
Class Schedule
Today  Choose articles, set up first meeting with supervisor
24.04  Template & Ipe introduction + assign additional articles as needed
15.05  5-minute teaser presentations
12.06  2 presentations (TBD)
19.06  2 presentations (TBD)
26.06  2 presentations (TBD)
10.07  Write-ups due
Dynamic subgraph counting
A Dynamic Data Structure for Counting Subgraphs in Sparse
Graphs
Zdeněk Dvořák and Vojtěch Tůma
Considers graphs with bounded expansion
Maintaining counts during graph updates takes polylogarithmic
time
Theory
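To make "maintaining counts during graph updates" concrete, here is a minimal sketch that keeps a triangle count up to date under edge insertions and deletions. It is a naive baseline, not the data structure from the paper: each update costs time proportional to the endpoint degrees rather than polylogarithmic time, and the class name `TriangleCounter` is a hypothetical choice for this illustration.

```python
from collections import defaultdict


class TriangleCounter:
    """Naive dynamic triangle counter (illustrative baseline only)."""

    def __init__(self):
        self.adj = defaultdict(set)  # vertex -> set of neighbors
        self.triangles = 0           # current number of triangles

    def insert(self, u, v):
        """Insert undirected edge {u, v}; ignore loops and duplicates."""
        if u == v or v in self.adj[u]:
            return
        # Every common neighbor of u and v closes exactly one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        """Delete undirected edge {u, v} if it is present."""
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbor of u and v loses exactly one triangle.
        self.triangles -= len(self.adj[u] & self.adj[v])
```

Usage: call `insert` and `delete` as the graph changes and read `triangles` at any time; no recount from scratch is needed.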
Triangle listing
Triangle Listing Algorithms: Back from the Diversion
by Mark Ortmann and Ulrik Brandes
Fits existing algorithms into a common methodology
Almost all theoretical algorithms match the best algorithm
Experimental analysis on large sparse graphs
Theory Exp
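As an illustration of what a triangle listing algorithm looks like, the sketch below uses the well-known degree-ordering idea: orient every edge from the lower-ranked to the higher-ranked endpoint, then intersect out-neighborhoods. It is one representative technique, not necessarily the formulation used in the paper, and the function name `list_triangles` is an assumption made for this example.

```python
def list_triangles(edges):
    """Return each triangle of an undirected graph once, as a sorted tuple."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Rank vertices by (degree, id); low-degree vertices come first.
    ordered = sorted(adj, key=lambda x: (len(adj[x]), x))
    rank = {x: i for i, x in enumerate(ordered)}

    # Keep only edges pointing from lower rank to higher rank.
    out = {x: {y for y in adj[x] if rank[y] > rank[x]} for x in adj}

    triangles = []
    for u in adj:
        for v in out[u]:
            # A common out-neighbor w closes the triangle u, v, w.
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return triangles
```

Because every triangle is reported only from its lowest-ranked vertex, no triangle appears twice, and high-degree vertices end up with small out-neighborhoods, which is what makes this ordering attractive on sparse graphs.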
Clique finding
Finding Maximal Cliques in Massive Networks by H*-graph
J. Cheng, Y. Ke, A. Fu, J. Xu Yu, and L. Zhu
External memory algorithm
Used to list all cliques in graphs with up to 10 million nodes
Experiments Th
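For orientation, here is a small in-memory sketch of maximal clique enumeration using the classic Bron–Kerbosch recursion with pivoting. This is not the external memory H*-graph algorithm of the paper; it only shows the problem being solved. The graph is assumed to fit in memory as a dictionary mapping each vertex to a set of neighbors, and `maximal_cliques` is a hypothetical name.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph {vertex: set(neighbors)}."""

    def expand(r, p, x):
        # r: current clique, p: candidates to extend r, x: already processed vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex covering the most candidates to cut branching.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())
```

The recursion depth is bounded by the size of the largest clique, so this works for small examples; graphs with millions of nodes need the kind of engineering the paper describes.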
Diameter
An Exact Algorithm for Diameters of Large Real Directed
Graphs
Takuya Akiba, Yoichi Iwata, and Yuki Kawata
First algorithm to compute diameter exactly for large sparse
graphs.
Experiments
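As a point of comparison, the textbook way to compute the diameter exactly is one breadth-first search per vertex, sketched below. This is the naive baseline, not the algorithm from the paper, and it is too slow for large graphs. The sketch assumes an unweighted directed graph given as a dictionary of successor lists, and it takes the diameter to be the largest finite shortest-path distance; both function names are hypothetical.

```python
from collections import deque


def bfs_eccentricity(adj, source):
    """Largest finite BFS distance from source in a directed, unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    farthest = 0
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                farthest = max(farthest, dist[w])
                queue.append(w)
    return farthest


def diameter(adj):
    """Exact diameter by running one BFS from every vertex (O(n * m) time)."""
    return max((bfs_eccentricity(adj, s) for s in adj), default=0)
```

The quadratic cost of this baseline is what makes an exact algorithm that scales to large sparse graphs interesting.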
Shortest path queries
Dynamic and Historical Shortest-Path Distance Queries on
Large Evolving Networks by Pruned Landmark Labeling
Takuya Akiba, Yoichi Iwata, and Yuichi Yoshida
A new algorithm for reporting the shortest distance between
points in dynamic graphs
Significantly faster than previous methods
Experiments
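The following is a minimal sketch of static pruned landmark labeling for an undirected, unweighted graph, the basic technique that the paper's title refers to. It leaves out the dynamic and historical extensions that are the paper's contribution, and the names `build_labels` and `distance` are assumptions made for this illustration. Each vertex stores distances to a few "hub" vertices, and a query takes the minimum over hubs shared by both endpoints.

```python
from collections import deque


def distance(labels, u, v):
    """Distance query: minimum over hubs that appear in both labels."""
    lu, lv = labels[u], labels[v]
    return min((d + lv[h] for h, d in lu.items() if h in lv),
               default=float("inf"))


def build_labels(adj):
    """Build hub labels with one pruned BFS per vertex, high degree first."""
    labels = {v: {} for v in adj}  # labels[v][hub] = distance from v to hub
    for root in sorted(adj, key=lambda v: -len(adj[v])):
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            # Prune: earlier hubs already certify a distance this short.
            if distance(labels, root, u) <= dist[u]:
                continue
            labels[u][root] = dist[u]
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
    return labels
```

Usage: build once with `labels = build_labels(adj)`, then answer each query with `distance(labels, u, v)`; unreachable pairs return infinity.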
Personalized PageRank
Computing Personalized PageRank Quickly by Exploiting
Graph Structures
Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken-ichi
Kawarabayashi
Apply a graph decomposition to achieve a faster algorithm for
large sparse graphs
[Figure: [0.98, 1.12, ..., 9.8] + (image) = Score/Ranking]
Experiments
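For readers new to the measure, personalized PageRank can be computed by plain power iteration, as sketched below. This is the standard baseline and does not use the graph decomposition from the paper. The restart probability `alpha`, the iteration count, and the choice to send the mass of dangling vertices back to the source are all assumptions of this sketch, and every neighbor is assumed to appear as a key of `adj`.

```python
def personalized_pagerank(adj, source, alpha=0.15, iterations=100):
    """Power iteration for personalized PageRank with restart at `source`.

    adj maps every vertex to a list of out-neighbors.
    """
    rank = {v: 0.0 for v in adj}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {v: 0.0 for v in adj}
        for u in adj:
            mass = (1.0 - alpha) * rank[u]
            if adj[u]:
                # Spread the walking mass evenly over the out-neighbors.
                share = mass / len(adj[u])
                for w in adj[u]:
                    nxt[w] += share
            else:
                # Dangling vertex: return its mass to the source.
                nxt[source] += mass
        nxt[source] += alpha  # restart mass
        rank = nxt
    return rank
```

Sorting the vertices by the returned scores gives a ranking relative to the chosen source vertex.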
Partner prediction
Romantic Partnerships and the Dispersion of Social Ties: A
Network Analysis of Relationship Status on Facebook
Lars Backstrom and Jon Kleinberg
Introduces a new graph measure: dispersion
Use measure to determine which Facebook friend is a
romantic partner
Application
Understanding
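A short sketch of how dispersion could be computed may help. It follows the absolute dispersion as I understand it from the paper: for a link u-v, count the pairs of common neighbors that are not adjacent and share no neighbor inside u's ego network other than u and v. Treat the details as an assumption to check against the paper. The function name and the dictionary-of-sets graph representation are choices made for this illustration.

```python
from itertools import combinations


def dispersion(adj, u, v):
    """Absolute dispersion of the link u-v (graph given as {vertex: set(neighbors)})."""
    ego = adj[u] | {u}         # vertices of u's ego network
    common = adj[u] & adj[v]   # mutual friends of u and v
    total = 0
    for s, t in combinations(common, 2):
        if t in adj[s]:
            continue  # directly linked pairs do not count
        # Shared neighbors of s and t inside the ego network, excluding u and v.
        shared = (adj[s] & adj[t] & ego) - {u, v}
        if not shared:
            total += 1
    return total
```

A high value means the mutual friends of u and v are spread over otherwise unconnected parts of u's network, which is the signal the paper uses to pick out a romantic partner.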
Articles
1. Dynamic subgraph counting (Darren) [Theory]
2. Triangle listing (Yaroslav) [Theory Exp]
3. Clique finding (Darren) [Experiments Th]
4. Diameter (Darren) [Experiments]
5. Shortest path queries (Yaroslav) [Experiments]
6. Personalized PageRank (Darren) [Experiments]
7. Partner prediction (Christian) [Application, Understanding]
Contact Info:
Darren: [email protected]
Yaroslav: [email protected]
Christian: [email protected]