How to do things with Tweets Miles Osborne August 10, 2010

How to do things with Tweets
Miles Osborne
School of Informatics University of Edinburgh
@milesosborne
August 10, 2010
Sun CEO resignation
Today’s my last day at Sun. I’ll miss it. Seems only fitting to end
on a #haiku. Financial crisis/Stalled too many customers/CEO no
more
Woman sued for $50k after tweeting
The tweet referred to the condition of her rented apartment
Haiti Earthquake
Twitter
Characteristics:
◮ > 55 million posts / day
◮ > 100 million registered users
Potentially huge impact for research
Data
◮ We crawl approx. 1.5 million tweets per day
◮ Crawl started April 2009
◮ c. 400 million tweets so far
Publicly released 100 M Tweets:
http://demeter.inf.ed.ac.uk/
Downloaded 2200 times (June 2010 to now)
Task 1: How can we find ‘interesting’ events?
Twitter contains a stream of news events
◮ Traditional ‘old media’ co-released stories
◮ Citizen journalists reporting on novel events
Find events in the massive stream of posts
Miles Osborne
How to do things with Tweets
9
Finding all new events
Sasa Petrovic and Victor Lavrenko:
◮ For each new tweet, compare it to all previously seen tweets
◮ If that tweet looks novel, assign it to a new thread
◮ Announce the fastest-growing threads as corresponding to new events
We have time-efficient algorithms which make this feasible
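The three steps on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words representation, the cosine similarity measure, the 0.5 threshold, and the names `assign_threads` and `largest_threads` are all assumptions made for illustration. It compares each tweet against every earlier one (quadratic time), which is exactly the cost the time-efficient algorithms are meant to avoid, and it ranks threads by size as a crude stand-in for growth rate.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep simple word-like tokens (hashtags and mentions included).
    return re.findall(r"[a-z0-9#@']+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_threads(tweets, threshold=0.5):
    """Return one thread id per tweet (threshold is an illustrative assumption)."""
    seen = []        # (vector, thread id) for every tweet processed so far
    thread_ids = []
    next_id = 0
    for text in tweets:
        vec = Counter(tokenize(text))
        best_sim, best_thread = 0.0, None
        # Naive step: compare against all previously seen tweets.
        for other_vec, tid in seen:
            sim = cosine(vec, other_vec)
            if sim > best_sim:
                best_sim, best_thread = sim, tid
        if best_thread is None or best_sim < threshold:
            tid = next_id          # the tweet looks novel: start a new thread
            next_id += 1
        else:
            tid = best_thread      # otherwise join its nearest neighbour's thread
        seen.append((vec, tid))
        thread_ids.append(tid)
    return thread_ids


def largest_threads(thread_ids, top=3):
    # Thread size as a rough proxy for "fastest growing".
    return Counter(thread_ids).most_common(top)
```

A real system would also need to bound the number of stored tweets and measure growth over a time window rather than over the whole stream.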
Finding all new events
iPad-related tweets per minute (Source: NY Times)
Finding all new events
Data:
◮ Use Twitter streaming API
  ◮ 1.5 Million tweets / hour, 24/7.
◮ . . . and data gathered over six months
  ◮ 163.5 Million tweets
Finding all new events
Fastest Growing Threads (June 2009):
# users   First tweet
7814      TMZ reporting michael jackson has had a heart attack. We r checking it out. And pulliing video to use if confirmed
7579      RIP Patrick Swayze...
3277      Walter Cronkite is dead.
2526      we lost Ted Kennedy :(
1879      RT BULLETIN – STEVE MCNAIR HAS DIED.
1511      David Carradine (Bill in "Kill Bill") found hung in Bangkok hotel.
1458      Just heard Sir Bobby Robson has died. RIP.
1426      I just upgraded to 2.0 - The professional Twitter client. Please RT!
Task 2: Translating Tweets
Language               Tweets
English                59 660 690
Brazilian Portuguese    7 986 562
Japanese                7 134 916
Spanish                 6 244 053
Indonesian              3 475 389
Dutch                   3 150 534
German                  2 216 601
Malaysian               1 624 710
Italian                   240 035
Portuguese                169 643
Roughly 60% of Tweets are in English
Twitter Translation
Early Haiti Earthquake-related Tweets in our crawl:
Time      User           Tweet
22:24:43  moisesfaponte  Terremoto en haiti 7.3 posible tsunami en el caribe fuente cnn hace 1 min
22:23:57  clausantander  RT @jorr2006: temblor 7 grados en haiti http://earthquake.usgs.gov/earthquakes/recenteqsww/Quakes/
22:17:43  justinholtweb  reading the USGS and Nat Weather Service NOT expecting Tsunami on east coast after haiti earthquake. good
Earthquake struck 21:53 UTC
Twitter Translation
Laura Jehl:
◮ Massive variability in style
◮ German – English
◮ Large differences from training data (European Parliamentary proceedings)
Twitter Translation
Sample translations:
http://twitpic.com/10o3oo - my b-day... drink from the bottle of bacardi
http://twitpic.com/10o3op - hello:)
later everyone will be an ipad want http://goo.gl/fb/aniv
@simoulah pppl i am drunk.
i owned more no feeling in my left leg.. i made it here never get out of it alive.
Task 3: Predicting the Stock Market using Tweets
Michael Sebastian Aurelio Wolfram
◮ Tweets encode the wisdom of the crowd
◮ . . . contain far more information than is possible for people to comprehend
Given tweets and a history of prices, predict stock movements
◮ Early results show improvements over the baseline (moving average)
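The slide names a moving average as the baseline but does not define it. The sketch below shows one common reading, under stated assumptions: the next price is forecast as the mean of the last `window` prices, and the predicted movement is the sign of that forecast relative to the latest price. The function names, the default window of 3, and the directional-accuracy measure are illustrative choices, not details from the talk.

```python
def moving_average_forecast(prices, window=3):
    """Forecast the next price as the mean of the last `window` prices."""
    if window <= 0 or len(prices) < window:
        raise ValueError("need at least `window` prices and window > 0")
    recent = prices[-window:]
    return sum(recent) / window


def predicted_direction(prices, window=3):
    """Return +1 (up), -1 (down) or 0 (flat) for the next movement."""
    forecast = moving_average_forecast(prices, window)
    last = prices[-1]
    return (forecast > last) - (forecast < last)


def directional_accuracy(prices, window=3):
    """Fraction of steps where the baseline predicts the actual direction."""
    hits = total = 0
    for t in range(window, len(prices)):
        predicted = predicted_direction(prices[:t], window)
        actual = (prices[t] > prices[t - 1]) - (prices[t] < prices[t - 1])
        hits += predicted == actual
        total += 1
    return hits / total if total else 0.0
```

Any tweet-based predictor would be scored the same way on the same price history, so that its accuracy can be compared directly against this baseline.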
What is next?
◮ Real-time search over massive streams
  ◮ Help people find information and interesting people
  ◮ Understand how streams behave
  ◮ Novel algorithms to make this possible
◮ Relating old and new media
  ◮ Influence, lag
◮ Mobile and geolocation