A Structural Analysis of Social Media Networks

The International Centre for Security Analysis
The Policy Institute at King’s
King’s College London
A Reference Guide for Analysts & Policymakers
James Barge
Mick Endsor
The International Centre for Security Analysis
The International Centre for Security Analysis (ICSA) is a research unit within the Policy
Institute at King’s College London. We carry out multidisciplinary academic and policy
research on international security issues. Our research and training currently concentrate
on three main areas:
1. The theory and practice of open source intelligence (OSINT) and social media
intelligence (SOCMINT);
2. The non-proliferation of weapons of mass destruction (WMD), nuclear safeguards,
nuclear security and nuclear terrorism;
3. Regional security issues in North Africa, the Middle East, East Asia and South
America.
ICSA aims to be the global academic centre of excellence for open source and social media
intelligence. We provide training courses in OSINT and SOCMINT as well as bespoke courses
for clients on a range of other topics including: financial open source intelligence (FOSINT),
research methodologies, radicalisation and regional security issues.
ICSA also runs ad hoc seminars relating to the centre’s key research interests in OSINT,
SOCMINT, nuclear non-proliferation and regional security issues. We are currently running
the Grand Strategy Seminar Series analysing the texts of great historical thinkers as they
apply to contemporary issues of strategy.
Contact Us:

Website: you can find out more information about the work ICSA carries out on our
website: http://www.kcl.ac.uk/sspp/policy-institute/icsa/index.aspx

Blog: you can read our blog here: http://blogs.kcl.ac.uk/icsa/

Twitter: you can follow us here: https://twitter.com/icsa_kings
Executive Summary
In order to develop a useful and efficient approach to social media intelligence, policymakers and analysts must develop a methodological approach. This document analyses a
selection of platforms across the spectrum of social media and provides broad categories to
understand their various features and what information analysts can collect from them. The
recommendations of this report show that an effective approach to social media
intelligence can only be derived from a nuanced understanding of social networks; a
dedication to research and development; and an ability to remain on the cutting edge of
developing trends.
The purpose of this report is to provide a framework for analysts to improve their understanding of
social media. In addition, it is aimed at the often neglected component of intelligence
analysis: the managers who task analysts and the policy-makers who consume and act upon
intelligence products.
The key conclusions drawn from this report are:
1. Social media must be analysed within the context of an intelligence tasking and
information requirements.
2. Funding research into improved social media analytics technologies should be a key
consideration for policy-makers.
3. Analysts should track developments in social media networks, technologies and user
experiences.
4. Social networks are adapted by users for their own purposes; social network designers
attempt to predict user innovations.
5. Ethical and legal due process should be implemented for analysts on social media at
every stage of the intelligence cycle.
6. Organisations should decide how best to create and organise social media analysts
within existing teams.
This report aims to provide an analysis of social media characterised by breadth and detail.
However, the analysis conducted does have some limitations. The lack of foreign language
resources has meant that important foreign language social networks have been excluded
(although references are made to them throughout the report). In the absence of a
well-developed and rigorous body of academic and scholarly research surrounding the still nascent
discipline of social media intelligence, much of the research cited in this report fails to place
social media within a specific intelligence context. This report therefore takes this research
and contextualises its key findings within the outlined framework to provide an
understanding of its potential implications within the discipline of intelligence analysis.
Contents
Introduction
1. Representational Features
2. Community Features
3. Interactional Features
4. Privacy/Accessibility Features
5. Infrastructural Features
Conclusions & Recommendations
Glossary of Technical Terms
Works Cited
List of Figures
Figure 1. The Conversation Prism
Figure 2. Daily Active Facebook Users by Country/Region.
Figure 3. Millions of Teens Have Abandoned Facebook Since 2011.
Figure 4. Percentage of UK Internet users who use Twitter as of February 2013, by age group.
Figure 5. Distribution of Twitter users worldwide from 2012 to 2018.
Figure 6. Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by generation.
Figure 7. Regional distribution of Instagram traffic in the last three months as of April 2014, by country.
Figure 8. The 24 most active subreddits.
Figure 9. Reddit’s most engaged countries by average page views per visit.
Figure 10. Prediction Accuracy of Dichotomous traits by examining likes on Facebook.
Figure 11. Prediction accuracy of private traits by examining likes on Facebook.
Figure 12. Twitter ‘tribes’.
Figure 13. A “new” retweet.
Figure 14. A “manual” retweet.
Figure 15. The hoax tweets posted on the Associated Press official Twitter account.
Figure 16. TweetCred ratings displayed on the Reuters Top News official Twitter account.
Figure 17. An example of a user utilising the #richkidsofintstagram hashtag on Twitter.
Figure 18. The EXIF data for the photo that revealed John McAfee’s location.
Figure 19. Social media mobile usage stats.
Figure 20. Instagram’s interface.
Figure 21. Reddit’s front page.
Figure 22. Reddit’s comment interface.
Introduction
Since its emergence as a mass medium of interaction in the early 2000s, social media has
quickly become a huge source of valuable information for researchers from all backgrounds.
People increasingly spend more and more of their lives on platforms such as Facebook,
Twitter, Instagram and Reddit (the four platforms analysed in this report). Through social
media, the web has become a place where users represent themselves, interact in
thousands of different ways and constantly produce and consume information.
From an intelligence perspective, social media has the potential to be incredibly useful. If a
user Tweets in reaction to a political news story, it can provide information about their
political beliefs. If members of the Islamic State of Iraq and al-Sham (ISIS) run recruitment
campaigns through Instagram or Twitter accounts, they may carelessly reveal, and indeed
have revealed, the location of ordnance, training camps and other useful intelligence. The
more our lives become connected to the Internet, the more useful information is available.
Social media intelligence, or SOCMINT as it has been termed,1 is the newest member of the
intelligence family, emerging out of open source intelligence (OSINT).2 SOCMINT deals
specifically with intelligence that is a product of social media data and information.
However, although the potential benefits of social media are often extolled, intelligence
analysts and professionals often criticise the absence of a strategy, doctrine or best practice
to determine how to best extract, analyse and utilise SOCMINT. The changing nature of the
social media landscape means that a dynamic approach is needed: analysts must be able to
adapt to new platforms, updates to current sites and changing social media culture amongst
users. This means that there is no single set approach for dealing with social media; it must
be a custom approach that is defined by context and fundamental questions: On what platform
is the interaction taking place? What is the nature of the interaction? What ultimately do I
want to find out?
The purpose of this report is to provide a guide to help intelligence analysts better
understand social media. It is not intended as an exhaustive account of every type of
information that can be directly known or inferred. Instead, it attempts to give a
representative overview of some of the most important features from an intelligence
standpoint. The methodology that has been chosen splits features of social media into five
categories: representational features, community features, interactional features, privacy
features and infrastructural features.
1 Omand, Sir David, Bartlett, Jamie, Miller, Carl, ‘Introducing Social Media Intelligence (SOCMINT)’, Intelligence
and National Security, 2012
<http://www.academia.edu/1990345/Introducing_Social_Media_Intelligence_SOCMINT_> [accessed 29
September 2014].
2 There is no consensus on whether SOCMINT is a separate discipline or a sub-discipline of OSINT.
Certain SOCMINT practitioners may choose to access social media information that is considered private,
which would not fall within the remit of OSINT.
Representational Features
• How users create and maintain versions of their identity online, for example how a
detailed Facebook profile would affect the content of individual posts.
Community Features
• How social media communities differ, how to classify social media communities and
the relationships between real-world groups and their online counterparts.
Interactional Features
• Investigates how and why users utilise different communicative features across
platforms (such as retweets, likes etc.).
Privacy Features
• The privacy aspects of social media networks, including perceived vs actual privacy;
metadata; and the impact of the Snowden revelations on user behaviour.
Infrastructural Features
• Elements of the formal structure of social media platforms; third party software;
ranking algorithms and other crucial, unseen and often poorly understood features.
The social media platforms that have been looked at specifically for this report are
Facebook, Twitter, Instagram and Reddit. Whilst this is a distinctly western perspective,
these networks provide a reasonable overview of the different types of social media
networks. This report aims to provide a framework with which alternative social media
platforms can be analysed and examined. If an analyst is presented with a previously
unseen platform, they will be able to use the features discussed in this report as a
starting point for understanding what information and insights can be gained from the new
platform.

• Facebook represents a ‘classic’ form of social media that requires a detailed profile
and seeks to help users maintain a network of real-world relationships.
• Twitter is the archetypal ‘microblogging’ platform, where a user’s online activity is
limited to short, periodic posts.
• Instagram is a minimal, mobile and visual photo sharing platform. These three
factors are increasingly important in emerging social media platforms like Snapchat.
• Reddit primarily functions as a link aggregator and message board. It currently
stands as one of the largest and most complex communities on the Internet. It is a
clear example of how the structure of a website influences the behaviour of its
users, which is the basis for one of the main points of this report.
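The five feature categories introduced above can be treated as a simple checklist when an analyst first meets an unfamiliar platform. The sketch below is only an illustration of that idea: the category names come from this report, but the guiding questions, the `PlatformAssessment` class and the platform name are hypothetical and not part of any established tool.

```python
from dataclasses import dataclass, field

# Category names follow this report; the questions are illustrative
# paraphrases, not an official checklist.
FEATURE_CATEGORIES = {
    "representational": "How do users create and maintain an identity?",
    "community": "What communities exist, and how do they map to real-world groups?",
    "interactional": "Which communicative features are used, and why?",
    "privacy": "What is private in practice, and what metadata is exposed?",
    "infrastructural": "Which algorithms and third-party tools shape what is seen?",
}


@dataclass
class PlatformAssessment:
    """Hypothetical record of an analyst's notes on one platform."""
    platform: str
    notes: dict = field(default_factory=dict)

    def record(self, category: str, note: str) -> None:
        """Attach a note to one of the five categories."""
        if category not in FEATURE_CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.notes.setdefault(category, []).append(note)

    def outstanding(self) -> list:
        """Categories that still have no notes, in report order."""
        return [c for c in FEATURE_CATEGORIES if c not in self.notes]


# "NewPlatform" is a placeholder, not a real service.
assessment = PlatformAssessment("NewPlatform")
assessment.record("representational", "Example note: profile fields are optional.")
print(assessment.outstanding())
```

Listing the categories that remain unexamined is a deliberately modest design: it prompts the questions this report raises without implying that any category can be assessed automatically.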
This report is designed not just to be read from front to back; it is equally a reference guide,
which can be navigated by “ctrl-clicking” the key policy recommendations included within
the table of contents at the beginning of each chapter.
Figure 1. The Conversation Prism. A graphic displaying the range of websites across the spectrum of social media.3
3 Brian Solis and JESS3, ‘The Conversation Prism’, www.conversationprism.com, 2014 [accessed 29 September
2014].
Facebook
Employees: 7,185
Launched: February 2004
Monthly active users: 1.32 billion
Mobile monthly active users: 1.07 billion
Total number of minutes spent on Facebook each month: 640,000,000
Average time spent on Facebook per visit: 18 minutes
Total number of Facebook pages: 54,200,000
Number of Languages supported: 70
Number of fake profiles: 81,000,000
Average number of Friends per user: 130
Every 20 minutes: 1 million links shared, 2 million friends requested and 3 million
messages sent.
Overview:
Facebook is a social networking site started by students at Harvard University as a means to
improve communication between students. Facebook gradually expanded to support
students from other universities in the United States and Canada. By 2006 Facebook was
globally available.
Facebook primarily functions as a place for people to connect with their real life friends.
Users create detailed profiles, which include information about their work and education,
location, contact information, family and relationships, favourite quotations, sexual
orientation, political and religious views. However, Facebook does not demand that users
share all this information; the decision is left to the discretion of the user. Users can interact
on Facebook in a number of ways. Privately, users can send each other instant messages.
Publicly, users can post on each other’s walls (with text, photos, videos or external links) and
comment on wall posts. Users can ‘like’ posts and comments and share each other’s
content.
Facebook has an increasingly popular community of groups, which are based around
interests ranging from interesting Wikipedia articles to vintage trainers. These should not be
confused with Facebook ‘Fan Pages’, which allow users to connect with brands and people
they are interested in, in a similar fashion to following a user on Twitter or Instagram.
Figure 2 – Daily Active Facebook Users by Country/Region: UK 24m; Asia 228m; US & Canada 152m; Europe 206m.4
Figure 3 – Millions of Teens Have Abandoned Facebook since 2011.5
4 ‘Daily Active Facebook Users by Country/Region’, The International Centre for Security Analysis and Facebook,
<www.facebook.com>, 2014 [accessed 29 September 2014].
5 ‘Millions of Teens Have Abandoned Facebook Since 2011’, Statista,
<http://www.statista.com/chart/1789/facebook-s-teenager-problem/>, [accessed 29 September 2014].
Twitter
Launched: July 2006
Employees: 3,300
Monthly active users: 271 million
Tweets sent per day: 500 million
Number of Tweets per second: 9,100
Percentage of users active on mobile: 78%
Percentage of accounts outside US: 77%
Number of languages supported: 35+
Percentage of users characterised as ‘lurkers’ (people who watch but don’t contribute): 40%
Annual net income: 645.32 million
Overview:
Twitter is a social networking service that allows users to read and send 140-character
microblogs, or ‘Tweets’. Users post Tweets to their profile, which can be viewed by anyone
with or without a Twitter account. Users ‘follow’ each other on Twitter, which enables them
to view Tweets on their home tab. However, following is not reciprocal (as becoming friends
is on Facebook); it is possible for a user to follow another and for that user not to follow
them back. This has allowed Twitter to become the preferred platform for celebrities,
brands and organisations to communicate with fans and followers.
Twitter’s defining feature is the 140-character limit that is applied to each Tweet. The
thinking behind this is that it forces users to condense what they want to say, which makes
Tweets more digestible for followers. In contrast, the traditional blogging network format
enables posts to be any given length.
Twitter is probably the most useful SOCMINT tool at the analyst’s disposal. There are a
number of reasons for this. Twitter is a fast, responsive and public medium that groups
posts together by hashtag, user, location and date. It has also become the go-to tool for
spreading news and information fast, playing a pivotal role in events such as the Arab Spring
and the Islamic State of Iraq and al-Sham’s (ISIS’s) recruitment drive and propagandising.
Twitter also enables access to its streaming application programming interfaces (APIs), with
different levels granted for different contracts and prices.6
6 However, full ‘firehose’ access is only currently being granted to Sysomos, Yandex and Dataminr.
Figure 4 - Percentage of UK Internet users who use Twitter as of February 2013, by age group.7
Figure 5 - Distribution of Twitter users worldwide 2012 to 2018 [forecast] by region.8
7 ‘Percentage of UK Internet users who use Twitter as of February 2013, by age group’, Statista,
<http://www.statista.com/statistics/257429/share-of-uk-internet-users-who-use-twitter-by-age-group>
[accessed 29 September 2014].
8 ‘Distribution of Twitter users worldwide from 2012 to 2018’, Statista,
<http://www.statista.com/statistics/303684/regional-twitter-user-distribution/> [accessed 29 September
2014].
Instagram
Launched: October 2010
Amount paid by Facebook for purchase of Instagram: $715 million (+ stock options)
Monthly active users: 200 million
Daily active users: 75 million
Number of photos shared (as of 26/3/14): 20 billion
Percentage of internet users that use Instagram: 13%
Percentage of US teens and millennials (14-34) that use Instagram: 34%
Most liked photo on Instagram: A photo posted by Kim Kardashian of her wedding to
Kanye West9
Overview:
Instagram is a photo and video sharing service that is primarily mobile-based. Users can take
square-format photos, add a range of filters (sepia, cross-processed, black and white etc.)
and share them to Instagram, Facebook, Twitter, Tumblr, Flickr or Foursquare. In 2012
Instagram was acquired by Facebook. To date there has been minimal integration of the two
platforms, with Facebook ‘committed to building and growing Instagram independently.’10
Instagram introduced the “explore” tab in 2012. This feature presents the user with a
refreshable selection of 21 photos that Instagram believes they would be interested in. This
has since become a popular way for people to engage with new content and presents a
challenge for marketers trying to ‘game’ the explore page.
There are a number of factors that make Instagram distinctive and noteworthy. Amongst
these is the fact that it is one of the first genuinely popular social networks to come from a
mobile application. Instagram has been resistant to expanding support to a browser-based
application, only facilitating this in 2013 with a service that simply mimics the mobile
version with no additional benefits. Instagram has expanded on Twitter’s use of hashtags as
its primary method to categorise posts and form the basis for communities. Hashtags are
probably more important on Instagram than on any other platform: in addition to
their usual function of categorising posts, they are also the basis for a great number of
communities, whose members tag posts with hashtags to signify their place within the group.
9 http://instagram.com/p/ogSSO6uS9C/ [accessed 29 September 2014].
10 Rusli, Evelyn M., ‘Facebook buys Instagram for $1 Billion’, The New York Times, April 2012,
<http://dealbook.nytimes.com/2012/04/09/facebook-buys-instagram-for-1billion/?_php=true&_type=blogs&_r=0> [accessed 29 September 2014].
Figure 6 – Growth of Instagram usage worldwide from 4th quarter 2013 to 1st quarter 2014 by generation.11
Figure 7 – Regional distribution of Instagram traffic in the last three months as of April 2014, by country.12
11 ‘Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by generation’, Statista,
<http://www.statista.com/statistics/307026/growth-of-instagram-usage-worldwide/> [accessed 29 September
2014].
12 ‘Regional distribution of Instagram traffic in the last three months as of April 2014, by country’, Statista,
<http://www.statista.com/statistics/272933/distribution-of-instagram-traffic-by-country/> [accessed 29
September 2014].
Reddit
Founded: June 2005
Employees: 51
Unique monthly visitors: 114.5 million
Proportion of U.S. online adults that visit Reddit: 6%
Largest demographic: 18-29 year old males.
Number of subreddits: 476,720 (5,400 active13)
Top 10 subreddits: r/announcements, r/funny, r/pics, r/AskReddit, r/todayilearned,
r/worldnews, r/blog, r/science, r/IAmA, r/videos.
Number of monthly Reddit pageviews: 5.2 billion
Overview:
Designed as a social news site based around aggregation algorithms, Reddit allows users to
create anonymous accounts and post two kinds of content: text posts and links. The Reddit
community votes on posts with either “upvotes” or “downvotes”; the ratio of upvotes to
downvotes combined with the ranking algorithm decides on the order and visibility of posts.
Users are given “karma” for the number of upvotes they receive, which is reduced by
downvotes. Karma has little practical use but can increase a user’s social capital within
communities as well as bestowing some small measurable benefits, such as allowing users
to post more frequently. Reddit is divided up into “subreddits”, which are individual forums
within Reddit that are grouped by a similar interest or topic. Users can subscribe and
unsubscribe to whichever subreddits they wish.
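The voting mechanics described above can be sketched in a few lines. This is a deliberately simplified illustration, not Reddit's actual ranking algorithm, which is not specified in this report: the `Post` class, the ordering rule (net score, then upvote ratio) and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical post record; field names are illustrative."""
    title: str
    author: str
    upvotes: int
    downvotes: int

    @property
    def net_score(self) -> int:
        # Upvotes reduced by downvotes.
        return self.upvotes - self.downvotes

    @property
    def upvote_ratio(self) -> float:
        total = self.upvotes + self.downvotes
        return self.upvotes / total if total else 0.0


def rank_posts(posts):
    """Toy ordering: highest net score first, upvote ratio as tie-breaker."""
    return sorted(posts, key=lambda p: (p.net_score, p.upvote_ratio), reverse=True)


def karma_by_author(posts):
    """Sum each author's net score across their posts."""
    karma = {}
    for post in posts:
        karma[post.author] = karma.get(post.author, 0) + post.net_score
    return karma


# Fictional sample data.
posts = [
    Post("Link A", "alice", upvotes=120, downvotes=30),
    Post("Text B", "bob", upvotes=45, downvotes=5),
    Post("Link C", "alice", upvotes=10, downvotes=40),
]
ranked_titles = [p.title for p in rank_posts(posts)]
karma = karma_by_author(posts)
print(ranked_titles, karma)
```

Even in this toy form, the sketch shows why heavily downvoted content sinks out of view and why a single unpopular post reduces an author's accumulated karma.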
This report will focus largely on the features of Reddit that relate to its status as an online
community base. As opposed to the other forms of social media discussed here, Reddit has
a strong sense of self-identity, which defines the community on Reddit in relation to the rest
of the Internet. This has caused rivalries between Reddit and other broad communities
online including 4chan, Tumblr and 9Gag. Reddit’s social element is not designed as
something to build and maintain relationships with others, but as a centre for discussion
over the content that is posted. The culture on Reddit is one that values long and detailed
contributions as well as certain brands of humour, memes and in-jokes. In addition, the vote
system means that users’ content is only visible if it receives democratic approval from the
Reddit community; this means that users attempt (probably above all else) to pander to the
Reddit community’s ideology and attitudes.
13 Where active is defined as subreddits that had at least 5 posts or comments in the past day.
Figure 8 - 24 most active subreddits. This represents how Reddit posts would be shared if the 24 most
popular subreddits were the only existing subreddits.14
Figure 9. Reddit’s most engaged countries by average page views per visit.15,16
14 Olson, Randal, ‘Most Active Subreddits’, 2013, <http://www.randalolson.com/> [accessed 29 September
2014].
15 ‘Most addicted/engaged countries by avg. pageviews per visit.’, Reddit Blog, 2011,
<http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29
September 2014].
16 ‘Which Cities & Countries Have the Most reddit Addicts?’, Reddit Blog, 2011,
<http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29
September 2014].
1. Representational Features
Information found in the profiles of users determines facts about them; studying their
periodic updates may help determine complex psychological traits.
It is unclear whether users of social media believe that their online presence is an accurate
representation of themselves.
When users interact as part of a group, they often tend to behave in accordance with group
norms, rather than personal attitudes.
Analysts should be aware of other factors that affect self-representation on social media.
Information found in the profiles of users determines facts about them; studying
their periodic updates may help determine complex psychological traits.
Firstly, it must be established that it is all but impossible to be active on social media and to
maintain full anonymity. Why is this? If we take the definition of anonymity given by
Lapidot-Lefler and Barak as something which points to the ‘unidentifiability aspect... rather
than namelessness’ of a person, then it becomes clearer.17 Anonymity is not simply narrowly
defined by unknown names but rather concerns the broader inability to identify individuals.
With this distinction in mind, it is clear that any contributions users make to social media
(filling in facts in a profile, offering opinions in tweets or adding hashtags to Instagram
photos) can increase identifiability and decrease anonymity. For example, there are certain
psychological facts about a person that can be known from a single tweet, albeit these
“facts” may often be little more than observing that “X is the sort of person that is not
averse to posting a single tweet on Twitter”. However, this does provide some information
about an individual that can contribute to the identification of that person. Crucially,
monitoring an individual’s public activity on social media networks over time may offer
much greater insights for the analyst and undermine the anonymity of the individual. In
sum, any information a person provides within the context of social media gives some
information about that person and can be used as a tool to identify them. This is crucial in
understanding the different sorts of information that the analyst can collect or infer across
the spectrum of social media.
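The point that each contribution increases identifiability can be illustrated with a small sketch. It is not drawn from the cited research: the population records, the attribute names and the `candidates` helper are entirely fictional and serve only to show how the set of people consistent with what has been observed shrinks as observations accumulate.

```python
# Entirely fictional records: each dict is one hypothetical user.
population = [
    {"id": 1, "city": "London", "language": "en", "interest": "cycling"},
    {"id": 2, "city": "London", "language": "en", "interest": "chess"},
    {"id": 3, "city": "Leeds", "language": "en", "interest": "cycling"},
    {"id": 4, "city": "London", "language": "fr", "interest": "cycling"},
    {"id": 5, "city": "Paris", "language": "fr", "interest": "chess"},
]


def candidates(records, observed):
    """Return the records consistent with every observed attribute."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in observed.items())
    ]


# Each observation stands for something inferred from a post or profile.
observations = [("city", "London"), ("language", "en"), ("interest", "cycling")]

observed = {}
sizes = [len(population)]
for key, value in observations:
    observed[key] = value
    sizes.append(len(candidates(population, observed)))

print(sizes)  # [5, 3, 2, 1]
```

No single attribute here identifies anyone, yet three weak observations together leave one candidate, which is the sense in which anonymity concerns unidentifiability rather than namelessness.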
We can now look at the various ways users represent themselves within social media, and
how these representations can be analysed to gain actionable information.
Broadly, across the spectrum of social media, we can split information about users into two
categories:
1. Users fill out information about themselves in the creation of their social media
profiles.
2. Users contribute information about themselves by updating their social media
periodically.
The first of these is an explicit and conscious contribution to self-representation online.
Users are presented with an interface, which asks them to fill out specific facts about
themselves. We see this most extensively on Facebook (as well as online dating sites), which
asks users to give extensive psychological, historical and geographical details. These include
information about our political views, sexual orientation and things that we “like” as well as
details of our current and previous employment, pre-Facebook past and family members.18
For SOCMINT analysts, the profile details of individuals can be extremely useful. However,
these details can be influenced heavily by users’ awareness that, when they are filling
out information in their profile, they are giving an online representation of themselves. In
general, once we are aware that information we are providing will be used as a
representation of ourselves, we have a substantial motivation to manipulate this
information to give a representation that is more in line with how we want to be seen,
rather than how we actually are. It is possible that if someone is
explicitly asked to describe their political views, they may give a less accurate account of
their genuine opinions than could be inferred from observing their political discussions
over time.
17 Lapidot-Lefler, Noam and Barak, Azy, ‘Effects of anonymity, invisibility, and lack of eye-contact on toxic
online disinhibition’, Computers in Human Behaviour, Vol. 28, No.2, 2012, pp. 434-443.
This can be contrasted with the second way in which users contribute information about
themselves: periodic social media updates in the form of Facebook statuses, wall posts,
tweets, Instagram photos, and Reddit posts and replies. When users are contributing to
social media in these myriad ways, it is not solely in the context of creating an idealised
representation of their identity online, although this is a potentially substantial motivation.
Rather, they are also defined by their content: sharing a video, writing a message to a friend
etc. Social media sites such as Twitter and Instagram have very limited requirements for
their profiles; however, they place a stronger emphasis on constant updates than Facebook,
VKontakte (Facebook’s Russian equivalent) and dating sites.
The contrasts between a profile-based social media interface and an update-based one are
important. The latter encourages users to contribute information which could be more
useful for SOCMINT analysts. Social media updates are generally time and location specific;
they implicitly demonstrate opinions and communicate conversationally with other users.
Furthermore, research has shown that the sorts of facts contained in detailed profiles can
also be found out (to some extent) by studying specific types of updates. In the case study
below, researchers analysed the Facebook likes of approximately 58,000 volunteers in an
attempt to analyse the type of information that can be extracted from Facebook likes and
the inferences that could be made from this data.19
18 We are asked to retroactively fill out life events starting from our birth and including important events like
weddings, graduations, moving house etc.
19 Kosinski, Michal, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable
from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United
States of America, Vol. 110, No. 15, pp. 5802-5805.
Figure 10 - Prediction accuracy of dichotomous traits by examining likes on Facebook
expressed by the area under curve (AUC).20 The study used a sample of 58,466
volunteers in the United States using the myPersonality Facebook application. There
was an average of 170 likes per person.
Figure 10 shows that the researchers were able to predict dichotomous traits with a high
degree of accuracy based purely on Facebook likes. The most accurate predictions were
made for Caucasian versus African American (95%) and gender (93%). However, the model
was less successful for other variables such as parents together at 21 (60%) and uses drugs
(65%). It is important to note that, because this aspect of the study focused on
dichotomous variables, random guessing would average a 50% success rate. However, for all
of the variables above, the model using Facebook likes had better predictive success than
random guessing.
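Because Figure 10 expresses accuracy as the area under curve (AUC), it may help to see how that measure behaves. AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so an uninformative model scores 0.5. The sketch below computes it by direct pairwise comparison; the scores and labels are invented for illustration and have no connection to the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive case (label 1) outscores a negative
    case (label 0); tied scores count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = has the trait, 0 = does not.
labels = [1, 1, 1, 0, 0, 0]
informative = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]   # mostly ranks positives higher
uninformative = [0.5] * 6                      # same score for everyone

auc_informative = auc(informative, labels)
auc_uninformative = auc(uninformative, labels)
print(round(auc_informative, 3), auc_uninformative)
```

On this reading, a value such as 0.60 for "parents together at 21" is only modestly better than the 0.5 baseline, whereas 0.95 indicates that positives are almost always ranked above negatives.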
Clearly, the character of updates on social media can take different forms, revealing
different things about users. On Twitter, tweets are up to 140 characters long and usually
demonstrate an opinion, view or piece of information, as well as sometimes communicating
with others via hashtags. Instagram has image-based updates that tend to inform us about
the location of a user and what they are doing.
20 Kosinski, Michal, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable
from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United
States of America, Vol. 110, No. 15, pp. 5802-5805.
Perhaps most usefully for SOCMINT analysts, a number of
studies have demonstrated how personality traits can be predicted with ‘reasonable
precision’ by studying social media updates.21 Initial research in the field has focussed
largely on the “big 5” personality traits: agreeableness, conscientiousness, extraversion,
neuroticism and openness.22,23 More recently, however, there have been some interesting
attempts to ascertain “The Dark Triad” personality traits by conducting a linguistic
analysis of tweets. The Dark Triad personality traits are psychopathy, Machiavellianism and
narcissism and ‘all focus, to varying degrees on social malevolence, self-promotion,
emotional coldness, duplicity and aggressiveness.’24 Whilst the prediction accuracy
of these studies is as yet generally poor (see figure 11 below), they are hopefully a precursor to
more successful attempts. Analysts should be aware of developments in this field; the value
of a reliable model of personality analysis cannot be overstated.
As figure 11 shows, the predictive accuracy of Facebook likes was much worse for
deeper psychological traits than for the more basic dichotomous variables examined in the same
study (see figure 10 above). As the authors
note, psychological traits are examples of “latent traits” which cannot be directly measured.
Approximate, albeit imperfect, measurement was provided by responses to questionnaires.
In general, the study showed that Facebook likes were a poor predictor of latent traits, with
prediction accuracies generally half the questionnaires’ test-retest reliabilities.
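The comparison described above can be sketched as follows: the prediction accuracy for a latent trait is the Pearson correlation between predicted and questionnaire scores, which is then read against the questionnaire's test-retest reliability. All numbers below, including the reliability value, are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical questionnaire scores and model predictions for five users.
questionnaire = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 1.0, 4.0, 3.0, 5.0]

# Assumed test-retest reliability of the questionnaire (not from the study).
test_retest_reliability = 0.9

accuracy = pearson(questionnaire, predicted)
print(accuracy)
print(accuracy / test_retest_reliability)  # share of the achievable ceiling
```

Reading the correlation as a fraction of the reliability matters because the questionnaire itself is an imperfect measure, so no model can be expected to exceed that ceiling.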
This reveals the difficulties of using one specific type of social media interaction to predict
deeper psychological traits. More importantly, Facebook likes may be
an inherently poor predictor of traits, as they provide only limited information. Analysing
actual user posts, such as Facebook posts or tweets, may prove to be a more
effective predictor for analysts interested in gauging latent traits of individuals or groups. It
is interesting to note that dating websites have almost entirely circumvented this problem
by asking users to supply deeper psychological information when they sign up. Indeed,
many of these dating sites explicitly sell their prospects for successfully finding an individual
with a match based on pairing individuals with similar psychological profiles.
21
Bai, Shoutian, Zhu, Tingshao and Cheng, Li, ‘Big-Five Personality Prediction Based on User Behaviors at Social
Network Sites’, eprint arXiv:1204.4809, 2010.
22
Ibid.
23
Kosinski, Michael, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable
from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United
States of America, Vol. 110, No. 15, pp. 5802-5805.
24
Paulhus, D. L. and Williams, K., ‘The Dark Triad of personality: Narcissism, Machiavellianism and Psychopathy’,
Journal of Research in Personality, Vol. 36, 2002, pp. 556-563.
Figure 11 - Prediction accuracy of private traits by examining likes on Facebook.25
Predictions expressed by the Pearson correlation coefficient between predicted and
actual attribute values at the P < 0.001 level. The transparent bars indicate the
baseline accuracy of the questionnaire expressed as test-retest reliability.
As well as containing information that can suggest personality traits, social media updates
provide factual information that is location and time specific, interactional and
contextualised. Additionally, updates are generally less likely to be considered by users to be pieces of
information that are being used to represent their identities. Often they are perceived by
users to be more ephemeral than information given in a profile. Updates are usually in
response to an external stimulus: a reaction to a news story, a reply to a friend etc. This
contributes to potentially less consideration on the part of the user and an increased
probability of useful information being revealed, to the benefit of the SOCMINT analyst.
Researchers have also noted that ‘while users may be careful about the content they post to
Twitter, the words they use may reveal more about their personalities than they would
wish.’26 This indicates that it is linguistic analysis of data that can prove the most insightful,
25
Kosinski, Michael, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable
from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United
States of America, Vol. 110, No. 15, pp. 5802-5805.
26
Sumner, Chris et al., ‘Predicting Dark Triad Personality Traits from Twitter usage and a linguistic analysis of
Tweets’, Proceedings of the IEEE 11th International Conference on Machine Learning and Applications ICMLA
2012, 2012.
rather than analysis of the content of tweets. However, this will entirely depend on the
context of whatever update we are looking at. A succinct and clearly expressed opinion on
social media is likely to be less linguistically useful than a lengthy, vague comment, which
will have more linguistically interesting features. Additionally, the specific social media
medium will have a significant impact on this. Twitter has a 140 character limit on tweets,
meaning that users will have to adapt their natural linguistic style to fit their message into
tweets. By contrast, Reddit communities encourage posts and replies to be lengthy,
interesting and watertight. This seems to have the opposite effect on users, with the
language used becoming specialised, verbose and complicated. In both these cases it is
generally not the user’s natural language style that is being represented, but a version of it
mediated through the implicit and explicit restrictions imposed by the platform and its
users.27
So as we have seen, there are two broad categories that we can place social media data
into. The first is data gained from the profiles of users, and the second is information gained
from the updates of users. The first is primarily useful for gathering any facts that are
included in profiles, things like location, age, marital status and sometimes sexual
orientation and political views.28 The second can also be used to gather this kind of
information, but only through inference. Inferential information from social media
updates is likely to be more useful when determining more subjective personality traits.
However, this may require tools to aid the analyst or a sophisticated understanding of the
individual in question and the social media platform used to express opinions. To analogise,
the first is the sort of information a psychotherapist might ask a patient directly about and
the second is the sort that might be inferred whilst a patient discussed their thoughts and
feelings. Both have use for the analyst; but as always it depends entirely on the purpose of
the intelligence collection and the analytical context.
It is unclear whether users of social media believe that their online presence is an
accurate representation of themselves
There is a second-order consideration to be made concerning the factors discussed in the
previous section. The different ways in which users represent themselves on social media
must be viewed alongside social media users’ own perception of their self-representation.
The important question to ask is: to what extent does this particular social media interface
give users the impression that their online avatar is an accurate representation of them? On
social media where users believe their online profile is an accurate self-representation, we
would (prima facie) expect them to behave in a way that genuinely displays their real
thoughts and attitudes.
27
Although, as will become a running theme in this paper, there is always something interesting to be learned
when users are forced to adapt to limitations. It all depends on the intelligence aims of the SOCMINT analyst.
28
Although it has been pointed out that updates can also indicate some of this information (see Figure 10).
Perhaps surprisingly, there is a general absence of relevant academic studies that deal with
the accuracy of self-representation in social media. Instead, the majority of existing research
focusses on the impact of social media on user self-esteem. Given this gap, we propose that
further research into user perception of self-representation in social media would be
beneficial. Studies into this area would provide further evidence to support some of the
claims made in this section. They would also provide some useful insight into some of the more
specific, contextual aspects of representation in social media. An investigation into how
identity and representation are affected by age, sex or location, for example, could be
extremely useful for the social media analyst.
When users interact as part of a group, they often tend to behave in accordance
with group norms, rather than personal attitudes
Self-representation on social media networks is also often strongly linked to communities.
Across the spectrum of social networks we see substantial differences in the sorts of
communities that form within them. Sites like Reddit and Facebook have significant
infrastructural support for groups built into their website architecture. Reddit is entirely
formed of “subreddits” - forums that are dedicated to discussion of particular topics such as
a football team, political party, ideology or funny videos - almost every conceivable interest
and subculture is covered by a subreddit. Similarly, Facebook has increasingly become a
medium by which people connect over topics in a similar way, with a much more supportive
“group” function than was previously seen on the site. As you would expect, where site
infrastructure has stronger support for groups, we see increased overall strength in the
groups themselves. They tend to have a more robust sense of identity as well as sometimes
being insular and hostile to outsiders, with the use of in-jokes, memes and circular
references to reinforce group identity.29
There has been some substantial psychological research into changes in user behaviour
when interacting as part of an online group or community. Specifically, the Social Identity
model of Deindividuation Effects (SIDE model) was developed by researchers seeking to
describe social effects of computer-mediated communication (CMC)30 and has more
recently been applied to user activities on social media sites that have a strong community
or group presence.31
For our purposes we can see the SIDE model as something which explains why people on
the Internet behave differently when they are interacting within a group. Specifically, SIDE
points to a correlation between strength of group identity and the likelihood of an individual’s
actions demonstrating the perceived attitudes of the group, rather than their own individual
beliefs and attitudes.
29
A bizarre example of this is the subreddit: www.reddit.com/r/montageparodies (discretion advised).
30
Chan, Michael, ‘The Impact of Email on Collective Action: A field application of the SIDE model’, New Media
and Society, Vol. 12, No. 8, 2010, pp. 1313-1330.
31
Suler, John, ‘The Online Disinhibition Effect’, Cyberspace and Behaviour, Vol. 7, No.3, 2004, pp. 321-326.
Central to SIDE’s perspective is the idea that a significant portion of an individual’s self-concept is formed in terms of social categories and group membership. These social
categories often bring with them a set of norms and attitudes which differ from those of any given
individual within the group. SIDE’s proposal is that an individual’s behaviour largely depends
on whether personal identity or social identity is salient at a particular time. So, applying the
SIDE model, when group identity is more salient than personal identity, individuals will act
and represent themselves in ways that reflect the group identity rather than their own.
Crucially, SIDE picks out two specific features of social media interaction that amplify the
salience of social identity over personal identity: physical isolation and visual anonymity.
As previously mentioned, we see a large swing towards group identity
on social media networks that have a strong sense of group identity, Reddit for instance.
Reddit’s age and popularity have created a situation wherein there is a very clear concept of
what it is to be a “Redditor”. Traits generally included in the self-perceived redditor identity are
things such as being a skeptic, atheist, scientifically-minded, cultured, intellectual, liberal,
gamer, liking cats, white, and pro-legalisation of drugs.32 We cannot say the same thing
about Facebook and Twitter; there is not such a strong concept of the archetypal
“Facebooker” or “Tweeter” as there is of a “Redditor”.33 As well as this, Reddit’s division
into subreddits means that each subreddit also has its own sub-identity. For instance,
frequent posters on /r/atheism have a strong sense of self-identity that differs from that on
/r/bassguitar, but both still conform somewhat to the umbrella “redditor” identity.
Social media’s tendency to engineer conditions that produce these sorts of complex,
compound identities presents a challenge for the analyst. Examining an individual’s activity
across different groups on social media will likely present variations of that individual
mediated through different social groups. SOCMINT analysts must therefore be aware of
contextualised self-identity and understand how this affects the information collected.
Given that social media can present these sorts of complex, compound social
identities, a careful assessment of the relevant social media groups is necessary. If an
analyst is tasked with analysing a specific group’s members, community and interactions on
social media, an awareness of complex compound social identities is critical. In addition, this
will help analysts to avoid incorrect attributions of beliefs, values and opinions to particular
individuals, when they really represent the group identity.
So, from the analyst’s perspective, we should be cautious when examining behaviour of
individuals within the context of a strong group environment. As research has shown, the
32
From reading the Reddit post: “what would the stereotypical redditor hate about you?”
<http://www.reddit.com/r/AskReddit/comments/1vzeru/what_would_the_stereotypical_redditor_hate_about/?sort=top> [accessed 29 September 2014].
33
A possible explanation of this is the fact that communities on Reddit are generally formed only online and do
not reflect networks and relationships in the real world, as Facebook and Twitter have a higher tendency to.
This will be explored in more detail in the following section.
behaviour of individuals can change greatly within these contexts to increasingly reflect the
perceived attitudes of the group. Of course, once we have made the necessary allowance
for these biases, the data can be of use in other ways; for instance, it can help us define
exactly what the norms, attitudes and beliefs of the group are.
Analysts should be aware of other factors that affect self-representation on social
media
The above section describes how self-representation in social media is affected by
interaction within groups. An exhaustive list of every other way representation can be
affected would be the subject of a much longer report or research designed specifically to
investigate the fluid nature of self-representation on social media. However, notable
examples include users using social media as a professional advertisement (a sort of
dynamic online resume such as that seen on LinkedIn); or users creating profiles on social
media specifically to improve their relationship with certain groups or individuals.
A relevant and interesting example of the latter is the case of Aymenn Jawad Al-Tamimi. Al-Tamimi is a rising terrorism analyst who from mid-2013 until recently (Summer 2014)
adopted a “Jihadi-Persona”34 on Twitter in order to garner information about ISIS.35 He has
become an accepted public authority on ISIS,36 gaining citations in major news outlets such
as the New York Times, Wall Street Journal and Washington Post as well as initially being
invited to contribute to the popular investigative website Bellingcat. Al-Tamimi’s methods
included becoming very close with some ISIS supporters, referring to them as “akhi”
(brother) and expressing distress upon hearing they were killed in conflict.
Al-Tamimi has maintained on his blog that this persona was only used to gain the best
quality intelligence and that he is in no way an ISIS supporter himself.37 On the other hand,
organisations such as Business Insider have referenced individuals who have disputed this
claim.38 Either way, one version of Al-Tamimi’s Twitter account is giving a false impression:
either he is being his true self when showing ISIS sympathies, or when he is providing
information that is harmful to ISIS to western news outlets.
In more general terms, Al-Tamimi’s methods have been criticised for being unethical,
notwithstanding his possible ISIS allegiance which he has emphatically denied. The ethical
34
Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, July 22 2014
<http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29 September 2014].
35
He boasts on his blog that ISIS contacts enabled him to identify the Moroccan ex-Guantanamo Bay inmate
Mohammed Mizouz in Syria.
36
Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’,
Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September
2014].
37
Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, July 22 2014
<http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29 September 2014].
38
Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’,
Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September
2014].
position on the creation of false profiles, or “sockpuppets”, on social media for research
purposes is a hazy area. Certainly, major governmental organisations will not publicly
condone the use of false profiles as the public backlash alone would be potentially politically
damaging. This report does not condone the use of false profiles or personas for research
purposes; however it does intend to highlight the need for further study into the ethical
implications of creating sockpuppets. We should remember that some of the information
provided by Al-Tamimi has proved incredibly informative and useful in understanding
jihadism in Syria, Iraq and the wider Middle East.
Approaching the problem from a different perspective, analysts at the UK think-tank Demos
have also pointed out that the creation of false profiles poses a challenge for analysts to
overcome.39 They cite the extraordinary case of the “Syrian-American Lesbian” Abdalla Arraf
al-Omari who allegedly ran the blog “A Gay Girl in Damascus”. It later transpired that the
blog was run by a PhD candidate at Edinburgh University. The motivations for creating
sockpuppets are obvious. They allow users to interact with people they otherwise wouldn’t;
pass on false information or misinformation; and access protected websites. It is reasonable, therefore,
for Demos to suggest that ‘any core aspect of any SOCMINT capability will be the ability,
both analyst and automated-led, to weed out false and misleading information’.40 To
expand on this point, developing an automated method for locating and identifying false profiles on
social media seems an incredibly difficult task, especially considering the limited
success that current techniques in credibility analysis have achieved. However, any
developments in this field would be of great benefit to the SOCMINT community of
researchers, analysts and intelligence professionals.
Finally, it is important to note that it is not just individual analysts who have had to address
ethical challenges in conducting this type of research, known as netnography. In 2011, US
Central Command (CENTCOM) awarded a $2.76m contract to Ntrepid for the creation of an
‘online persona management service’.41 This relatively benign description disguises the fact
that Ntrepid was contracted to provide 50 user licenses each allocated with 10 false
personas that would enable US personnel to influence online conversations and advance US
interests. Not only is there an ethical dimension to this case, there is also the analytical
challenge it poses for outside analysts seeking to understand social media communities that
may have been infiltrated by governments, militaries or other official agencies.
39
Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’,
Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September
2014].
40
Ibid.
41
‘PsyOps and Socialbots’, Infosec Institute, <http://resources.infosecinstitute.com/psyops-and-socialbots/>
[accessed 29 September 2014].
2. Community Features
The importance of understanding communities on social networks
Some social networks create communities that are reflected in reality, others create virtual communities that exist only online
Analysts should be aware of open source and paid-for network visualisation tools
Understanding the evolution of groups on social networks is a necessary skill for SOCMINT professionals
Instagram blocks the use of many hashtags, undermining community structures
Community lifespan on Instagram and Twitter is often related to external events
The importance of understanding communities on social networks
The social aspect of social media implies individuals interacting, forming relationships,
creating and joining networks and evolving in reference to each other. This section focuses
on the creation and evolution of communities as a result of this social element of social
media. “Community” in the context of the Internet doesn’t lend itself to an easy definition;
must a community be self-aware? Must it be supported by a platform? Must it share some
common interest? To date, none of these questions have been answered in detail. This is
fundamentally because the structural aspects of the internet, and especially social media,
are so varied, that communities form sharing no apparent similarities. It should be said,
however, that a minimal uncontroversial condition for a community is that some of its
members should interact based on shared norms, online cultures or other salient features.
The importance for SOCMINT analysts to have a strong, developed understanding of
communities online cannot be stressed enough. Not only is understanding an individual’s
place within a community the key to understanding them, but looking at communities as
more than the sum of their individuals provides insight that cannot be found elsewhere.
Analysts must be aware of how communities form on social networks, how they utilise
features of social networks to maintain and grow their communities, and how these
communities relate to groups in other places, on and off the web. This section intends to
give an overview of these issues, highlighting important insights into social communities.
There is potential for significant cross-over between this section and the previous
discussion in Representational Features that is concerned with how individual identity can
be subsumed by group identity. We have chosen to discuss these features in the previous
section, although they could equally be talked about here. This demonstrates that social
networks and their constituent features cannot be analysed in isolation but must be
approached coherently and methodically by the analyst.
Some social networks create communities that are reflected in reality, others create
virtual communities that exist only online
The first truly popular social network “Friends Reunited” was designed to reunite old friends
who had fallen out of contact, that is, to augment real-world relationships. Similarly,
Facebook (originally intended as a networking tool for students at Harvard) was also
designed to augment real world friendships, provide a platform to communicate and share
media. On the opposite end of the spectrum, sites like Reddit and massive multiplayer
online role-playing games (MMORPGs) such as Second Life, have minimal integration with
real-world friendships. Their community features are focused on creating and maintaining
online relationships. Most sites, however (including Facebook in its current form), occupy a
place somewhere between the two ends of this spectrum. Users on social media such as Twitter, Instagram and LinkedIn tend
to communicate with their real-world friends as well as meeting new people online.
It is in the interest of SOCMINT researchers to analyse and understand the relationship
between a user’s online relationships and real world community. A methodology for
determining this would include looking at any geolocation information shared between
users, investigating interaction history and examining any mutual connections they may
have using network analysis.42 Benefits gained from understanding this information can be
extensive. As well as being a good starting point to understanding an individual’s network in
general, analysing this real-online distinction can provide information about the breadth of
influence a user has, details about their location and the sources of their information.
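One of the steps above, examining mutual connections, can be sketched with plain Python sets. The account names and connection data are hypothetical, and the Jaccard overlap used here is only one of several possible measures rather than a method prescribed by this report.

```python
def mutual_connections(network, a, b):
    """Accounts connected to both a and b."""
    return network.get(a, set()) & network.get(b, set())


def jaccard_overlap(network, a, b):
    """Share of the two users' combined connections that they have in common.

    Each user is excluded from the other's list so that a direct link
    between a and b does not distort the overlap.
    """
    connections_a = network.get(a, set()) - {b}
    connections_b = network.get(b, set()) - {a}
    union = connections_a | connections_b
    if not union:
        return 0.0
    return len(connections_a & connections_b) / len(union)


# Hypothetical connection lists keyed by account name.
network = {
    "user_a": {"user_b", "user_c", "user_d"},
    "user_b": {"user_a", "user_c", "user_d", "user_e"},
    "user_f": {"user_g"},
}

print(sorted(mutual_connections(network, "user_a", "user_b")))
print(jaccard_overlap(network, "user_a", "user_b"))
print(jaccard_overlap(network, "user_a", "user_f"))
```

A high overlap suggests two accounts sit in the same social circle, which may (but need not) reflect a real-world relationship; it is best read alongside geolocation and interaction history rather than on its own.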
Analysts should be aware of open source and paid-for network visualisation tools.
Social network visualisation applications are an increasingly valuable part of the SOCMINT
analyst’s tool-kit. They allow analysts to visualise a network of individuals or groups within
social networks. There are a wide number of free and paid-for tools available, which will be utilised by various organisations depending on available resources and
requirements. Open source examples include Gephi43 and Cytoscape.44 Popular paid-for
solutions include software developed by Silobreaker,45 Palantir46 and IBM.47
Figure 12. A stylised network visualisation of Facebook friends using Gephi visualisation software.48
42
Note that well-cultivated online relationships can often seem indistinguishable from real-world ones; sometimes
some deep research is required here.
43
‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September 2014].
44
‘Cytoscape’, <http://www.cytoscape.org/> [accessed 29 September 2014].
45
‘Silobreaker’, <http://www.silobreaker.com/network-2> [accessed 29 September 2014].
46
‘Palantir’, <https://www.palantir.com/> [accessed 29 September 2014].
47
‘i2 National Security and Defense Intelligence’, <http://www-03.ibm.com/software/products/en/nationalsecurity-defense-intelligence> [accessed 29 September 2014].
48
GrandJean, Martin, ‘Analyser graphiquement son réseau facebook’, MartinGrandJean, 17 March 2013,
<http://www.martingrandjean.ch/analyser-graphiquement-reseau-facebook/> [accessed 29 September 2014].
Understanding the evolution of groups on social networks is a necessary skill for
SOCMINT professionals
‘Unfortunately, doing analysis on giant unstructured digital social networks 49 turns out to be
one of the big challenges of social science research.’50
The above quote, an admission from frustrated PhD student Sebastian Benthall, highlights
the unstructured nature of online communities. Whilst it is not immediately clear exactly
what specifically the writer is referring to here as “unstructured”, we can see that
communities on social media can be unstructured in a number of ways. Firstly, there may be
an absence of any formal support for groups to form on a particular platform. Twitter and
Instagram, for instance, lack any sort of “group” function of the kind we see in Facebook
(groups) and Reddit (subreddits). Secondly, he could be referring to the resistance of
Internet-based groups to falling into any particular category. The quote above was actually in
response to Benthall’s discovery of (what has become known as) “Weird Twitter”. Know
Your Meme defines weird Twitter as the following:
…a loosely connected group of Twitter users who are known to experiment with
spelling, punctuation and format for humor or poetry. The style of writing can be
considered surrealist by participants in the group, with subject matter ranging
from creating absurd scenarios to attempting to describe abstract feelings by
choosing words for their “verbal aesthetic appeal.” However, many of the
accounts are grouped together by the same desire to reinterpret the “realness”
of life in ways people do not always get to experience.51
It’s clear that Weird Twitter resists any conventional description; there are many, many
users who interact as part of the community who do not have any of the characteristics
described above whatsoever. Additionally, Weird Twitter members in general reject the
moniker, believing that it belittles them and brings under one name what
they see to be many different communities. Whilst this is only one particular example of a
group on Twitter, it is also somewhat prototypical.
Weird Twitter is an example of a “group” on a platform that doesn’t support groups whilst
simultaneously resisting classification of being called a group itself.52 However, certain other
groups utilise hashtags (#) as a tool to identify themselves as group members, and to flag up
their communication. An interesting niche example is #bcsm, an initialism for Breast Cancer
49
Read: online communities chapter.
50
Benthall, Sebastian, ‘“Weird Twitter” art experiment method notes and observations’, Digifesto, 18 October
2012, <http://digifesto.com/2012/10/18/weird-twitter-art-experiment-method-notes-and-observations/>
[accessed 29 September 2014].
51
‘Weird Twitter’, Know Your Meme, <http://knowyourmeme.com/memes/weird-twitter> [accessed 29
September 2014].
52
Although (with what could be used as an argument against its existence as a community), many members of
the Weird Twitter community gladly accept the title, for example
<https://twitter.com/BevisSimpson/status/509887014326784000> [accessed 29 September 2014].
Social Media. The hashtag started as a means for sufferers of breast cancer to communicate
on Twitter and has since expanded to become the basis for a dedicated website53 and even
a YouTube channel.54 What is most interesting here is that #bcsm comes from use on a
platform that doesn’t have dedicated group infrastructure support, but provided the basis
for the creation of a website that does. Furthermore, as Dan Munro points out, it is a
reversal of the usual protocol of starting with a website and adding a Twitter handle to
increase visibility. In this instance the website has come as a result of the hashtag, not the
other way around.55 Whilst there have always been breast cancer support groups, the use of
a hashtag has allowed one to develop that simultaneously functions as an extension of
previous cancer support communities and a new community in itself. Other examples of
communities based around hashtags include the #richkidsofinstagram hashtag (see the
discussion of hashtags in interactional features below).
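Because hashtags act as self-applied group labels, a first pass at finding such communities is to group accounts by the hashtags they use. The sketch below does this with a simple regular expression; the sample tweets and account names are invented, and the pattern is a simplification rather than Twitter's own hashtag rules.

```python
import re
from collections import defaultdict

# Simplified pattern: "#" followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def users_by_hashtag(tweets):
    """Map each hashtag (lower-cased) to the set of accounts that used it.

    tweets is an iterable of (account_name, tweet_text) pairs.
    """
    groups = defaultdict(set)
    for user, text in tweets:
        for tag in HASHTAG.findall(text):
            groups[tag.lower()].add(user)
    return dict(groups)


# Invented sample data for illustration only.
tweets = [
    ("account_1", "Tonight's chat starts at 9pm #bcsm"),
    ("account_2", "Thanks for the support everyone #BCSM"),
    ("account_3", "New boat day #richkidsofinstagram"),
]

groups = users_by_hashtag(tweets)
print(sorted(groups["bcsm"]))
print(sorted(groups["richkidsofinstagram"]))
```

Lower-casing the tag treats #bcsm and #BCSM as the same label, which matches how users tend to apply hashtags in practice.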
So, on one end of the spectrum we see communities that are so varied and fragmented that
they resist the name of “community” altogether. At the other end we can observe
communities that are strongly unified and self-identifying, even when there is no formal site
infrastructure to support them. Perhaps as a response to some of these difficulties,
researchers at Royal Holloway and Princeton have approached communities on Twitter in a
different way, choosing to classify groups by similar language.56
The researchers studied word usage in a weighted network of approximately 189,000 nodes
(corresponding to users) from a sample of 250,000 Twitter users. By grouping sets of tweets
together by common key-word use, researchers were able to identify groups and
communities with surprising precision. For example “pln”, “edtech” and “edublogs” are
words used by the community “teachers who often talk about technology”. The largest
identified group was African Americans using the words “N**ga”, “poppin” and “chillin”.
Researchers on the project described interesting findings, such as the fact that groups have
regional accents, in that they commonly misspell words in the same way. Justin Bieber fans
have collectively developed a habit of adding “ee” as a suffix to words, like “please” to form
“pleasee”.57 With sufficient data, the researchers claim they would be able to predict
community membership with 80% accuracy.58
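The idea of grouping users by shared vocabulary can be illustrated with a minimal sketch. This is not the method used by the researchers cited above; it is a simple word-overlap count written for illustration. The "teachers_tech" marker words are taken from the text, while the second community and its words are invented, and the function names are hypothetical.

```python
from collections import Counter

# Marker-word lists per community. Only the "teachers_tech" words come from
# the report; "example_sports" is invented purely for illustration.
COMMUNITY_WORDS = {
    "teachers_tech": {"pln", "edtech", "edublogs"},
    "example_sports": {"matchday", "fixture", "offside"},
}


def tokenize(text):
    """Lower-case a tweet and split it into simple word tokens."""
    return [w.strip(".,!?:;\"'()#@") for w in text.lower().split()]


def guess_community(tweets):
    """Return the community whose marker words appear most often, or None."""
    counts = Counter()
    for tweet in tweets:
        for word in tokenize(tweet):
            for name, markers in COMMUNITY_WORDS.items():
                if word in markers:
                    counts[name] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


sample = [
    "Sharing my #edtech reading list with my PLN",
    "New post on edublogs today",
]
print(guess_community(sample))
```

A real classifier would need weighting for how distinctive each word is, and would face the limitations noted in this section, such as users borrowing the vocabulary of groups they do not belong to.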
53
‘Welcome to the BCSM Community’, #BCSM, <www.bcsmcommunity.org> [accessed 29 September 2014].
54
‘The BCSM Community’, #BCSM, <https://www.youtube.com/user/BCSMCommunity> [accessed 29
September 2014].
55
Munro, Dan, ‘Twitter Community #BCSM Expands Online to Broaden Patient Engagement’, Forbes, 31 March
2013, <http://www.forbes.com/sites/danmunro/2013/03/31/twitter-community-bcsm-expands-online-to-broaden-patient-engagement/> [accessed 29 September 2014].
56
Bryden, John and Funk, Sebastian ‘Word usage mirrors community structure in the online social network
Twitter’, Vol. 2, No. 3, 2013, <http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014].
57
‘Twitter users forming tribes with own language, tweet analysis shows’, Guardian Data Blog,
<http://www.theguardian.com/news/datablog/2013/mar/15/twitter-users-tribes-language-analysis-tweets>
[accessed 29 September 2014].
58
Ibid.
Figure 13. Twitter users grouped into “tribes” annotated with words typically used by each group. The top
word is the most significant within that community. Circles refer to communities with the size proportional to
the number of users. The width of lines between circles represents the number of messages sent between
communities. The colours of the loops represent the proportion of messages that are from users within that
group, yellow being 0 and red 1, and their size indicates the overall number of messages.59
Such research may prove very useful, for instance in distinguishing between
different Islamist groups on Twitter by their linguistic subtleties. Analysts should be aware of
advances in this field, as well as considering conducting additional research to explore its
possible benefits. However, they should also be aware of its limitations including problems
arising from analysing foreign languages and the possibility that individuals who do not
identify themselves within a particular group nevertheless appropriate its language.
59
Bryden, John and Funk, Sebastian ‘Word usage mirrors community structure in the online social network
Twitter’, Vol. 2, No. 3, 2013, <http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014].
Instagram blocks the use of many hashtags, undermining community structures
Groups on Instagram use hashtags to identify with each other. However, Instagram has a
policy of blocking many hashtags from appearing in search functions or via the API.
Instagram has justifiably blocked hashtags in the name of protecting users, such as
#proanorexia and hashtags associated with pornography. However, Instagram also blocked
hashtags such as #iphone and #photography while allowing #passport and #license,
providing identity thieves with a raft of new victims. The Data Pack has compiled a non-exhaustive list of banned hashtags60 and developed a banned hashtag search tool.61
Communities on Instagram that have had their hashtags blocked have developed a
methodology for creating a new hashtag, side-stepping the difficult task of manually
communicating the new hashtag to ‘members’. When the hashtag is banned, users will
simply duplicate the last letter. So for the banned hashtag #junkiesofig (where heroin users
share images), users simply added an extra ‘g’ making it #junkiesofigg. This hashtag was also
banned, so currently users operate with the hashtag #junkiesofiggg. The ingenious
consequence of only adding extra letters to a hashtag is that when a user searches for the
original (banned) one, the newer version appears as a result of the user-appropriated
suggested search function. This means that users who may not be aware that a new hashtag
has become the norm will be automatically alerted to it when searching for the old hashtag.
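The letter-duplication convention described above is regular enough to sketch in code. The following is a minimal illustration, assuming that successors are always formed by repeating the final character; the function names are hypothetical and nothing here reflects how Instagram's search actually works.

```python
def successor_hashtags(banned, generations=3):
    """Return variants formed by repeatedly duplicating the last character."""
    tag = banned
    variants = []
    for _ in range(generations):
        tag = tag + tag[-1]
        variants.append(tag)
    return variants


def is_successor(candidate, banned):
    """True if candidate is the banned tag plus extra copies of its last letter."""
    if candidate == banned or not candidate.startswith(banned):
        return False
    extra = candidate[len(banned):]
    return set(extra) == {banned[-1]}


print(successor_hashtags("#junkiesofig", generations=2))
print(is_successor("#junkiesofiggg", "#junkiesofig"))
```

An analyst tracking a banned hashtag could use a check of this kind to flag likely replacement hashtags, although communities may of course adopt other conventions.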
Community lifespan on Instagram and Twitter is often related to external events
Hashtags enjoy a complex symbiosis with external real-world events. Events can create
hashtags (#bringbackourgirls); or destroy them (the September 2014 leak of celebrity
photos may have broken the #ALSicebucketchallenge trend).62 Real-world events can also
be created, amplified and sustained by hashtags, most notably the hashtag
#occupywallstreet. #occupywallstreet provided the basis for an online community which
had the explicit aim to spill over into the real world. However, #occupywallstreet quickly
grew beyond the physical occupation of Wall Street and became a tagline for global anti-capitalist movements and sentiments. The online community was united under the
#occupywallstreet hashtag whilst the genuine occupation of Wall Street continued.
However, when the occupation finished, members of the community generally ceased using
the hashtag. This had the effect of diminishing the online community based on its usage.
Although other replacement hashtags did emerge, such as #wearethe99%, they failed to
mobilise a comparatively strong community base. We have no reason to suppose that the
activists online expressed any desire to break the online community up; but the breaking of
a fragile connection between real-world events and a hashtag had this effect.
60
‘The Banned #Hashtags of Instagram’, The Data Pack, 26 August 2013, <http://thedatapack.com/banned-hashtags-instagram/#comment-6156> [accessed 29 September 2014].
61
‘Banned Hashtag Search’, The Data Pack, <http://thedatapack.com/tools/blocked-hashtag-search/>
[accessed 29 September 2014].
62
Foster, Michael, ‘Two things the Fappening Teaches Marketers’, All Voices,
<http://www.allvoices.com/article/100000692> [accessed 29 September 2014].
3. Interactional Features
Many different types of interaction take place on social media: some features are designed
to facilitate interaction, others are appropriated by users for this purpose
Retweets, shares and regrams: a case study
There are important differences between “manual” and “new” retweets.
Interactional features are adapted by users for complex conversational functions.
Retweets can create a “Rumour effect”
Analysts should actively monitor developments in Social Media analytics, for example:
credibility analysis on Twitter
Analysts should keep track of developments in automated approaches to social media
analysis
Retweets are similar but not identical to Facebook “shares” and Instagram “regrams”.
The variety in use of hashtags across different social media platforms should be considered
in detail.
Favourites, Likes, and Upvotes - users express approval differently across platforms.
Social media analysis and the observer’s paradox
Many different types of interaction take place on social media: some features are
designed to facilitate interaction, others are appropriated by users for this purpose
Interactional features include anything that allows users to contact each other, share
information or become linked in some way; this is more broadly defined than
“conversation”.
Facebook
- Likes - Displays token approval of comments and posts.
- Comments - Allows users to respond to posts with text, images or links.
- Private messages - Allow users to communicate privately with each other.
- Wall posts - Allow users to leave posts, links, photos and videos on each other’s
  profiles. They are displayed chronologically on a “Timeline”.
- Tagging - Users can ‘tag’ each other in posts by writing “@name”; this alerts the
  users that they have been tagged and hyperlinks to their profile.
- Sharing – Allows users to re-post posts and links to their profile, crediting the
  original poster.
- Friend Requests – Allows users to become ‘friends’ with each other and have
  reciprocal access to each other’s profiles.
Twitter
- Retweets – Users can repost a tweet from another user.
- @user tagging/syntax – Users can link to another user profile in a tweet, alerting
  that user that they have been “tagged”.
- Favourites – Allows users to add tweets to a favourites list, which increases the
  visibility of the tweet.
- Private messages – Allows users to communicate privately through Twitter.
Instagram
- Likes - Displays token approval of comments and posts.
- Private messages - Allow users to communicate privately with each other.
- Comments - Allows users to respond to posts with text, images or links.
- Regrams (third party/manual feature) – Users can repost each other’s images, by
  use of a third-party client or manually.
- Hashtags – Words or phrases preceded by the # sign; multiple uses on social media
  but generally used to increase searchability of posts and to group posts together.
Reddit
- Replies (ad infinitum) – Users on Reddit can reply to posts, reply to replies and so
  on ad infinitum.
- Private messages – Reddit users can communicate privately using the inbox
  feature.
- Reddit Gold/Tips (third party) – Reddit Gold allows users to purchase premium
  membership for each other. Tips are enabled by third party ‘bots’ and allow
  users to donate cryptocurrencies such as Bitcoin, Dogecoin and Litecoin to each
  other.
- Upvotes/downvotes – Allow users to ‘vote’ for each other’s posts, increasing their
  visibility in accordance with Reddit’s ranking algorithm.
Retweets, shares and regrams: a case study
A prototypical example of an interactional feature is Twitter’s “retweet” function. Retweets
allow users to repost another user’s tweet, crediting the original in the process. Retweeting
on Twitter was originally not built into the structure of the site. When
users wanted to repost something another user had tweeted, they would write “RT @(user)
(content of original tweet)”. In 2009 Twitter added retweets to their interface; users are
given the option underneath each tweet to retweet it. This standardised the format for
retweeting and meant it was no longer possible to edit the original message in a retweet
(something users had often done because of the 140 character limit). Despite this standardisation, however,
retweets are used in many different ways by Twitter users. The most divisive issue is
whether or not a retweet constitutes an endorsement of the content of the tweet or the
individual tweeting it. This is best highlighted by the fact that many Twitter users (journalists
and high profile individuals especially) explicitly state in their biographies that “retweet ≠
endorsement” (or some similar variant).
There is a feeling amongst more active Twitter users that retweets are for passing on
information and should not implicitly be interpreted to contain an opinion from the
retweeter.63 However, the fact that users feel the need to express that a retweet is not an
endorsement indicates that many other users do perceive them as such. Indeed there have
even been some cases where the perceived endorsement of tweets has proved very
problematic for Twitter users. A 19-year-old was suspended from his job as a councillor after
retweeting a tweet that endorsed female genital mutilation. He defended himself by
maintaining that this was just to ‘raise awareness’ of the issue.64 This divide in opinion can
be problematic for analysts attempting to attribute the content of retweets to users.
63
‘Is a Retweet an Endorsement?’, Think Differently, 19 December 2012,
<http://thinkdifferently.ca/differently/is-a-retweet-an-endorsement/> [accessed 29 September 2014].
64
‘Resignation calls over councillor’s pineapple retweets’, BBC News,
<http://www.bbc.co.uk/news/uk-england-stoke-staffordshire-14709241> [accessed 29 September 2014].
Analysts collecting information about users through retweets must be aware of the
subtleties of retweets and judge whether a retweet constitutes an endorsement in each specific
context.
What we can say about retweets in general, though, is that they represent a user’s desire
to share the information with their followers. Therefore, although that particular user may
not endorse the information contained in the original tweet, they do think that it is
information that is worth sharing. This in itself has an intrinsic value for the social media
analyst independent of the debate on the function and illustrative nature of retweets.
There are important differences between “manual” and “new” retweets.
It is worth pointing out here that there is an important difference between retweets
facilitated through the retweet function and manual retweets (also known as classic or
traditional retweets). Twitter users have argued convincingly that Twitter’s introduction of
the retweet function decreased the “social” aspect of social media.65 Because of this, many
veteran Twitter users still prefer to use the manual method when retweeting. Analysts
should be aware of the differences between manual and function-based retweets.
Differences include:
- Manual retweets allow the retweeter to edit the original tweet or add their own
  comment, increasing the conversational nature of the retweet.
- Manual retweets display the retweeter’s avatar, not the original tweeter’s avatar.
- Manual retweets have their own individual URL and so are searchable; new retweets do
  not.
- New retweets do not increase your visibility (they will not increase your likelihood of
  appearing in a “suggested” list).
- New retweets will not contribute to the popularity of a hashtag.
- New retweets will not enter you into a conversation with the users mentioned in the
  tweet.
Figure 14. A “new” retweet.66
65
‘Retweet the old fashioned way, using “classic” or “traditional” retweets only’, Ray’s 2.0, 3 September 2013,
<http://rays20.blogspot.co.uk/2010/06/traditional-retweet-tr-key-to.html> [accessed 29 September 2014].
66
The author’s twitter posts from https://twitter.com/
Figure 15. A “manual” retweet.67
What is clear is that manual retweets involve the retweeter
to a much greater degree. A manual retweet allows the original to be edited or commented on, displays
the retweeter’s avatar and creates a unique URL attributed to their account. For the analyst,
this means that manual retweets are potentially a richer source of information about
individuals than their newer counterparts.
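Because manual retweets follow the informal “RT @user” convention, they can be picked out of ordinary tweet text. The sketch below is an illustration only: it assumes the retweeter writes “RT @user” (optionally followed by a colon) before the quoted content, and the pattern and function names are hypothetical rather than any Twitter-defined format.

```python
import re

# Assumed convention: optional leading comment, then "RT @user", then the
# reposted text. Real-world manual retweets vary more than this.
MANUAL_RT = re.compile(
    r"^(?P<comment>.*?)\bRT\s+@(?P<user>\w+):?\s*(?P<original>.*)$",
    re.DOTALL,
)


def parse_manual_retweet(text):
    """Split a manual retweet into leading comment, credited user and remainder.

    Returns None when the text does not follow the "RT @user" convention.
    """
    match = MANUAL_RT.match(text)
    if match is None:
        return None
    return {
        "comment": match.group("comment").strip(),
        "user": match.group("user"),
        "original": match.group("original").strip(),
    }


print(parse_manual_retweet("So true! RT @alice: retweets are not endorsements"))
print(parse_manual_retweet("Just an ordinary tweet"))
```

Note that a comment appended after the quoted text cannot be separated from the original by pattern matching alone, which is one reason manual retweets need to be read in context.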
An additional problem faced by analysts confronted with manual retweets is the potential
for the retweeter to distort the information, edit it or present it out of context, changing the
meaning. As Boyd, Golder and Lotan point out, even if the content of the tweet is not
altered, taking a tweet out of context “can give it a life of its own”.68 A fictional example of
this is the difference between these two tweets:
1. @Twitteruser1: My girlfriend broke up with me by email...
2. (a follower) RT @Twitteruser1 “My girlfriend broke up with me by email...” OUCH!
The retweeter’s addition here presents the information contained in the tweet
in a very different light to the original, transforming it from a rather sad confession into a
joke. Awareness of these sorts of “broken telephone” problems is crucial for the analyst.
Indeed, this seemingly trivial fictional example demonstrates how easily and rapidly
messages and information can be distorted on social media. Users may appropriate, edit
and disseminate a single update or multiple updates on social media for their own ends that
may be far removed from the purpose and intention of the original poster. From the
analyst’s perspective, this increases the “costs” of collecting, verifying and analysing
information extracted from social media. The most obvious of these costs is time.
67
Twitter, <www.twitter.com>
68
Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational Aspects of Retweeting
on Twitter’, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010,
<http://www.danah.org/papers/TweetTweetRetweet.pdf> [accessed 29 September 2014].
Interactional features are adapted by users for complex conversational functions.
Boyd, Golder and Lotan have also conducted a survey which aims to give a non-exhaustive
list of reasons people retweet.69 The responses included the following:
- To amplify or spread tweets to an audience.
- To entertain or inform a specific audience, or as an act of curation.
- To comment on someone’s tweet by retweeting and adding new content, often to
  begin a conversation.
- To make one’s presence as a listener visible.
- To publicly agree with someone.
- To validate others’ thoughts.
- As an act of friendship, loyalty or homage by drawing attention, sometimes via a
  retweet request.
- To recognise or refer to less popular people or less visible content.
- For self-gain, either to gain followers or reciprocity from more visible participants.
From the perspective of the retweeter, we can divide these motives into two categories:
1. Users retweet to engage with their audience or a specific target audience;
2. Users retweet to engage with the original tweeter.
The first of these categories fits in with the intended function of retweets: to
spread the information contained within the retweet. However, this only accounts for one of
the functions appearing in the survey conducted. The second category accounts for the
majority of the functions appearing in results of the survey. These contain a number of ways
that retweets are adapted as a feature for facilitating conversation. When people wish to
engage with each other in the offline world, they have a fairly limited selection of options
available, most of which are quite direct: calling someone’s name, sending them a letter,
introducing oneself etc. Online, features such as retweets, likes, favourites and shares have
been adapted to facilitate a complex range of subtler, indirect ways to address people.
Particularly interesting alternate functions of retweets, especially from an intelligence
perspective, include: indicating friendship, demonstrating loyalty and displaying an act of
homage. Users who want other users to realise that they are influenced by them; agree with
them; or share similar opinions use retweets in this way to express these feelings indirectly.
The indirectness of a retweet is important because it encourages users to use them in this
69
Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational Aspects of Retweeting
on Twitter’, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010,
<http://www.danah.org/papers/TweetTweetRetweet.pdf> [accessed 29 September 2014].
way without the feeling of social embarrassment; after all a user can always claim that the
retweet was intended only for sharing purposes. It represents a socially low-risk option for
engaging with users you might otherwise not.
The adaptation of features on social media for conversational purposes is not confined to
retweets; it is common across all forms of social media. The effect this has on user
experience of these platforms is one of heightened connectivity and addressability. Users
are more likely to engage with people they haven’t met or previously interacted with. They
are given more opportunities to start conversations, join existing discussions and create new
ones. So when considering the range of possible interactions that can be made on social
media, users may choose many different options.
For the analyst, this means that when tracking the interpersonal relationships of individuals
and their specific interactions, it is not enough to focus exclusively on verbal
communication: comments, replies, wall posts. Analysts must assess whether use of these
non-verbal features is also playing an important role. This requires a sophisticated
understanding of social media communication and interaction.
Retweets can create a “Rumour effect”
The 140 character limit on Twitter forces users to condense whatever message they are trying
to convey. Whilst this is a useful and important part of Twitter, it can disproportionately
advantage catchy, eye-grabbing headlines over ones that might contain the most factually
correct or important information. In addition, these sorts of tweets can spread extremely
quickly because of the ease of retweeting. Whilst in most cases misinformation of this kind
can be fairly innocuous (quickly dispelled rumours of celebrity deaths, for example), certain
examples have almost been disastrous.
As Figure 16 shows, in April 2013 the Associated Press (AP) Twitter account posted a tweet
reading “Breaking: Two Explosions in the White House and Barack
Obama is injured”. The news quickly spread, and whilst the AP quickly deleted its account
and notified Twitter via alternate AP accounts that it was the result of a hack (claimed by
the pro-Assad Syrian Electronic Army), the damage had already been done. The Dow
Jones plunged more than 140 points and the S & P briefly lost $136.5 billion (which was
recovered within 15 minutes). Interestingly, the problem was not entirely caused by traders
reading tweets and acting in response, but also by automatic algorithms which read headlines and
create automatic orders. These algorithms were deceived not just by a false headline from a
reputable news source but also by the information echo of thousands of retweets that
increased the perceived relevance of the story.
Figure 16. The hoax tweets posted on the Associated Press official Twitter account.70
70
Domm, Patti, ‘False Rumor of Explosion at White House Causes Stocks to Briefly Plunge; AP Confirms its
Twitter Feed Was Hacked’, CNBC, <http://www.cnbc.com/id/100646197#> [accessed 29 September 2014].
Analysts should actively monitor developments in Social Media analytics, for
example: credibility analysis on Twitter
There have been a number of proposed solutions to the problems created by the spread of
misinformation on Twitter. Most notably there have been interesting recent developments
by the Indraprastha Institute of Information Technology, where researchers have developed a browser
extension named “TweetCred”,71 which purports to display a credibility rating out of 7 for all
Tweets visible on a user’s timeline.
Figure 17. TweetCred ratings displayed on the Reuters Top News official Twitter account. As a very credible
news source we would expect to see 7/7 ratings for all Reuters tweets. TweetCred has been accurate for the
top two; however, the bottom tweet (a retweet from Reuters Business) has been given a very low 2/7 rating.
The reason for this is unclear but it is probably at least partly because it does not contain a hyperlink.72
71
Gupta, Aditi, et al., ‘TweetCred: A real-time Web-based System for Assessing Credibility of Content on
Twitter’, Indraprastha Institute of Information Technology, 2014,
<http://chato.cl/papers/gupta_kumaraguru_castillo_meier_2014_tweetcred.pdf> [accessed 29 September
2014].
72
Twitter, <www.twitter.com> [accessed 29 September 2014].
TweetCred aims to provide users with additional information about Tweets,
embedded within the interface. It uses a set of 45 features to determine the credibility
of tweets. Some of these features, which are combined into a single credibility score, are:
• Tweet meta data: Including number of seconds since the tweet,
  source of tweet and geocoordinates.
• Tweet content features: Including number of characters, words,
  URLs, hashtags, unique characters, presence of stock symbol,
  happy/sad smiley, colon symbol etc.
• User based features: Including number of followers, friends, time
  since last tweet.
• Network features: Including number of retweets, mentions, replies,
  whether the tweet is a reply or a retweet.
• Linguistic features: Including presence of swear words, negative or
  positive emotion words, pronouns.
• External resource features: Web of Trust (WOT) score for the URL;
  ratio of likes:dislikes for a YouTube video.
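To make the notion of “tweet content features” more concrete, the following minimal sketch computes a handful of the features named in the list above from the text of a single tweet. It is an illustration under stated assumptions, not TweetCred’s implementation: the function name, the feature names and the simple patterns used to find URLs and hashtags are all hypothetical, and no credibility score is computed.

```python
import re

# Simple illustrative patterns; real tweets need more careful handling.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def content_features(text):
    """Compute a few content-based features from the text of one tweet."""
    return {
        "num_characters": len(text),
        "num_words": len(text.split()),
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_unique_characters": len(set(text)),
        "has_colon": ":" in text,
    }


tweet = "Breaking: markets fall #stocks http://example.com/a"
for name, value in content_features(tweet).items():
    print(name, value)
```

In a full system, features of this kind would be combined with user, network and linguistic features and passed to a trained model; that step is deliberately left out here.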
After the initial tests run by TweetCred, the developers report that users agreed with 43% of
the ratings given, with an additional 25% expressing minimal disagreement. These
underwhelming figures are an indication that there is still much work to be done in the field
of credibility evaluation. Figure 17 above demonstrates an obvious flaw with the TweetCred
algorithm; it shows how two tweets from one very reputable news source (Reuters) can be
given wildly different credibility readings. The first two tweets displayed are from the
Reuters Top News Twitter account and the third is a retweet from Reuters Business. It
seems that the only difference between the third tweet and the first two is that the third
contains a hashtag but no hyperlink. This should not be a reason to give a credibility rating
that is five points lower and reflects a significant weakness in the TweetCred methodology.
On the other hand, as one of the first pieces of pioneering software in the field, TweetCred
does seem promising. Certainly, a more accurate and developed version could be incredibly
useful for analysts to contextualise the relevance, accuracy and reliability of questionable or
previously unknown sources of information.
Indeed, assessing the credibility of information posted on Twitter is one of the most
significant challenges facing analysts seeking to extract useful information from social media
as part of an all-source intelligence product. Many of the same OSINT techniques developed
to assess the credibility of online sources can be applied by SOCMINT analysts. However, the
volume of information on social media; the diversity of commentators; and the brevity of
posts pose new challenges to analysts that software such as TweetCred explicitly aims to
address. Future developments in this field should be monitored carefully.
Analysts should keep track of developments in automated approaches to social
media analysis
Even experimental software such as TweetCred highlights the potential benefits that
automated approaches to social media analysis can offer. This report has also discussed
methods to track communities on social networks and methods that aim to extract
information by analysing Facebook likes or the words used in Tweets. SOCMINT’s status as a
recent discipline and the youth of social media in general mean that we can expect many
more advances of this nature in the future. In addition, because social media is constantly
growing and changing, technology that seeks to analyse it will have to adapt alongside it.
This presents a challenge for SOCMINT analysts and policy-makers. There must be a method
in place to keep track of advances in SOCMINT analysis methods and technologies in order to
remain on the cutting edge. Organisations must decide whether it is viable to task an analyst
to keep track of these developments and how much time should be allocated for this. It
should be noted that methods and technologies that are applicable to SOCMINT will not
often be billed as SOCMINT technologies; they may have been designed for an
alternate purpose but be adaptable for innovative SOCMINT analysis.
Retweets are similar but not identical to Facebook “shares” and Instagram
“regrams”.
Retweets on Twitter are analogous in many ways to the “share” function on Facebook and
“regrams” on Instagram. Sharing on Facebook does provide a space for users to submit their
own comment to their repost but still faces the same problems as retweets in terms of
endorsement. Unfortunately for the analyst, there is no culture on Facebook of stating
whether or not a “share” is intended as an endorsement, so this can be difficult
for analysts to establish (although the comment itself may be an indicator).
The variety in use of hashtags across different social media platforms should be
considered in detail.
Hashtags are words or phrases that are preceded by the pound sign (#). They are generally
used by social media users to provide extra information to a post.73 They allow posts to be
grouped together with posts that share the same hashtags, as well as to track trends and
enhance search functions. The overwhelming popularity of hashtags on social media is
largely restricted to Twitter and Instagram, although they are becoming increasingly popular
on Facebook after it rolled out support in 2013. On Twitter, popular hashtags appear in the
“trending” section74 and on Instagram users can group posts together by a hashtag search
function (users can also do this on Twitter).
73
Gunawardena, Nipun, et al. ‘Instagram Hashtag Sentiment Analysis’, University of Utah, 2013,
<http://www.eng.utah.edu/~cs5350/ucml2013/3-3p.pdf> [accessed 29 September 2014].
74
This can either be based on a user’s location or tailored to their interests.
Gunawardena et al. have suggested that the use of hashtags as described in the previous
paragraph warrants their classification as “metadata” of social media posts.75 Whilst this
view certainly seems to illustrate the function of hashtags some of the time, it does not do
justice to their range of use, especially on platforms such as Instagram. Hashtags are
increasingly becoming the “stars” of social media networks, often forming the main content
of posts, with the body of text providing scaffolding for use of a hashtag.
The use of hashtags on Instagram works perfectly in tandem with Instagram’s use as a
platform for self-promotion. Instagram’s almost exclusive focus on posting and sharing
photos makes it an ideal tool for this and has allowed it to develop as the primary method
on the Internet that individuals use to ‘display economic, social and cultural capital’.76 To
this end, hashtags can be used to associate oneself with a particular lifestyle or social
group. Take, for example, the #richkidsofinstagram/#rkoi hashtag, which has become
incredibly popular amongst affluent teenagers and young adults.
Figure 18. An example of a user utilising the #richkidsofinstagram hashtag on Instagram.77
Above is a typical example of the #rkoi hashtag being used to affirm economic and social
capital. In the caption we can only see a brief comment: ‘Line up!!!’ with the majority of the
content being the 28 hashtags that are used alongside the photo. Most of these hashtags
serve the dual purpose of increasing the visibility of the post on Instagram and
75
Gunawardena, Nipun, et al. ‘Instagram Hashtag Sentiment Analysis’, University of Utah, 2013.
76
Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through Instagram’, Princeton
University Senior Thesis, 2013.
77
Instagram, <www.instagram.com> [accessed 29 September 2014].
simultaneously affirming steven_wong_91’s personal brand as an affluent youngster partial
to fast cars. Additionally, steven_wong_91’s legitimacy to use the #rkoi hashtag (he has a
photo that warrants it) places him within a community of other Instagram users, which
grants additional social capital. Analysts should not just be aware of how users use hashtags
to increase their social capital on an individual level, but also how their use fits into a wider
social context. To use the previous example, if it were the case that #rkoi was not popular
and that steven_wong_91 was the only person using the hashtag, then it would not carry
the same sort of social capital that it does, given its popularity. Whilst it would still carry the
everyday connotations of being a rich, young social media user, the fact that #rkoi is well
known means that it carries additional implications in terms of community identity.
The hashtags used in the post shown in the image above also serve to increase the visibility
of the post. This is probably the most common reason people use hashtags on Instagram.
Instagram’s “search by hashtag” feature allows posts sharing the same hashtag to be
grouped together. This is also a function on Twitter and many hashtags are created with this
in mind. Twitter, however, also has a strong emphasis on hashtag trends, which collate the
most popular hashtags.
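For analysts who collect captions programmatically, the grouping of posts by shared hashtag described above can be approximated in a few lines. The following is a minimal sketch in Python, not Instagram’s or Twitter’s actual implementation: the regular expression is a simplified assumption about what counts as a hashtag, and the post IDs and captions are invented sample data.

```python
import re
from collections import defaultdict

# Simplified assumption: a hashtag is '#' followed by letters, digits or
# underscores. Real platforms apply more elaborate rules.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return the lower-cased hashtags in a caption, in order, without duplicates."""
    tags = []
    for tag in HASHTAG_PATTERN.findall(caption):
        tag = tag.lower()
        if tag not in tags:
            tags.append(tag)
    return tags


def group_by_hashtag(posts):
    """Map each hashtag to the list of post IDs whose caption contains it."""
    index = defaultdict(list)
    for post_id, caption in posts.items():
        for tag in extract_hashtags(caption):
            index[tag].append(post_id)
    return dict(index)


# Hypothetical sample data, invented for this sketch.
posts = {
    "p1": "Line up!!! #rkoi #richkidsofinstagram #cars",
    "p2": "Weekend drive #cars #RKOI",
    "p3": "No tags here",
}
hashtag_index = group_by_hashtag(posts)
```

Lower-casing the tags is a design choice that treats #RKOI and #rkoi as the same community marker; an analyst interested in stylistic variation may prefer to keep the original casing.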
Hashtag trends can either be “tailored to you” or location based. Tailored trends are
mediated by an algorithm that takes into account who a Twitter user follows. Location
based trends are trends that are specific to an area (it is possible to choose “worldwide” as
an option here). Through this feature Twitter has encouraged a culture that emphasises
Tweets as ephemeral; they are designed to be read soon after they have been posted. With
this in mind, hashtags are generally used by Twitter users to increase visibility of Tweets on
a temporary basis, and to contribute to current trends. As well as this, Tweets are
searchable by the words contained in the Tweets, but Instagram posts are not searchable by
the content of their photos or the captions provided. This means that Instagram users must
use hashtags to describe the actual content of their photos, whereas Twitter users can use
them to both provide extra information and increase visibility.
Instagram photos tend to be more personal in content than tweets. Tweets can be in
reference to anything whereas the subject of photos is something within the user’s vicinity,
so is more likely to bear some strong relation to them. Because of this, Instagram hashtags
tend to contain more information about a user’s immediate environment than Tweets. This
is also because, on Twitter, if a user wishes to communicate their emotional state or
perceptions of their environment they have the option to do this via the main body of text.
This is not an option on Instagram as the main body is an image (although, perhaps it may
be implicit in the image or included in the caption).
Whilst we see some similarities in hashtag use across both platforms, hashtags on
Instagram have a much wider range of uses that often includes interesting personal, social
and cultural information about users. Users on both platforms use hashtags to identify with
certain communities and social groups, yet on Twitter these groups are generally more
ephemeral and are often connected to real world events (such as the Ice Bucket Challenge,
or the Occupy movement). On Instagram groups tend to be permanent as well as
sometimes being quite obscure (for example the group based around the #junkiesofig
hashtag). This illustrates the point for analysts that the subtle differences arising from the
context of features on social media must be understood and accounted for in order to
develop a fruitful understanding.
Favourites, Likes, and Upvotes - users express approval differently across platforms.
Social media platforms all allow users to express approval of other users’ posts. On
Facebook and Instagram users can “like” other users’ posts and comment; on Twitter users
can “favourite” tweets; and on Reddit users upvote comments they approve of (and downvote
comments they disapprove of).
When collecting information about user attitudes and opinions on social media, it should go
without saying that favourites, likes and upvotes are an important resource. Ostensibly, it
makes sense to have the general assumption that if somebody uses one of these features,
they are expressing a positive reaction to something that has been posted. Broadly this is
correct. However, as with hashtags, these features are not analogous across all platforms.
On Facebook, likes are generally used in three ways:
1. Likes are most often used when responding to a personal post from a friend, to
express that they like what they have posted (perhaps news of a new baby, or an
amusing joke).
2. Likes are also used to express agreement with something, for example an opinion
article or a “page” on a specific topic.
3. Less commonly, likes are used by users to indicate that something is worthy of attention,
but not necessarily that they approve of it (it is approval of the posting of the content,
not the content itself), for instance a tragic news story. (Note: some users are afraid
of having likes misinterpreted and will refrain from exercising them in this way).
On Instagram likes are simpler; they are used almost exclusively to indicate approval or
enjoyment of the photo that has been posted.
Because of the popularity of retweeting to express approval or agreement on Twitter, the
favourite function is used in a subtler way. Users often favourite tweets when they do not
believe the tweet warrants a retweet, or that they do not think that it needs to be shared
further, perhaps because they do not want it to be visible on their profile (although it is
possible to view what other users favourite via the ‘activity’ tab). As well as this, the
favourite icon is designed and used as a bookmark button for Tweets. For some users this is
the limit of its use; other users favourite some tweets for approval and others for
bookmarking purposes. Analysts should be aware of this subtle distinction on Twitter which
quite often varies from user to user. They should be wary of drawing concrete conclusions
from the fact that a user “favourites” a tweet where the usual connotations of the word
favourite do not necessarily apply.
Social media analysis and the observer’s paradox
In social sciences the “observer’s paradox” refers to a phenomenon that occurs when events
in an experiment are affected by the presence of the observer. The “paradox” comes about
because it is impossible to conduct experiments without observation, but if observation
takes place then the experimental data can be corrupted.
Social media research can suffer chronically from this problem. When users are aware or have
the impression that they are subject to observation we see a phenomenon known as
‘reactivity’,78 where users alter their behaviour in response to being watched. This could
specifically take place in a situation where a researcher or SOCMINT analyst felt it was wise
to reveal their true identity and intentions to an individual or group that was being
examined. The examined individuals would then adapt their behaviour in response to the
presence of an observer, possibly obscuring useful information they might otherwise offer.
Social media analysts acting as passive or active observers of networks, communities and
individuals across various platforms also raises issues relating to the discipline of
netnography, the study of individual and group behaviour online. The example of Aymenn
Jawad Al-Tamimi (see analysts and self-representation) demonstrates the potential pitfalls
of actively engaging with individuals or communities of interest online. It is imperative that
policy-makers develop clear guidelines for analysts searching social media for information
and crucially, for when they seek to interact with users to gain valuable insights.
Organisations will have to develop their own standards independently although these may
evolve into a consensus best practice.
78
Heppner, Paul P, Wampold, Bruce E and Kivlighan, Jr. Dennis M, ‘Research Design in Counselling (Research
Statistics & Program Evaluation)’, Cengage Learning, 3rd Edition, 2007.
4. Privacy/Accessibility Features
Introduction
Much of the most useful information from an intelligence standpoint comes in the gap between perceived privacy and actual privacy
Facebook users can often become confused about the status of their privacy settings
Changes in Twitter and Instagram privacy settings can reveal useful information that was originally posted when private
Users are often unaware of the information provided by their metadata
Reddit users often disregard their digital footprint, providing an important source of information for analysts
Analysts should be aware of the potential ethical problem with accessing information intended as private
Analysts should consider which groups and demographics are likely to change their behaviour in light of increased awareness of surveillance
Introduction
Privacy features on social media networks are naturally perceived as the biggest restriction
to SOCMINT analysts. The reasons for this are obvious; it is not always possible for SOCMINT
analysts to access protected content. However, from the relationship between public and
private content emerges some useful information that would be impossible to know in a
completely public Internet.
Much of the most useful information from an intelligence standpoint comes in the
gap between perceived privacy and actual privacy
Users on social media have a greater incentive to adopt very different personas when
interacting publicly as opposed to privately. Whilst occupying their public persona, we can
assume that users will act in ways that they perceive to be publicly suitable.79 Likewise we
can expect to see possible differences in behaviour, self-representation and interaction
when users occupy their private persona. The crucial difference between the two of these is
that researchers and SOCMINT analysts do not necessarily have access to information that is
shared privately, so it is very difficult to conceptualise the distinction between these two
versions of a person. As the analyst’s perception of an individual on social media will almost
always be based solely on his or her public persona, it is difficult to see how they might
behave differently when the settings are turned private.
What is also clear is that much of the most useful information from the analyst’s perspective
is going to be the sort of information that is shared privately. Depending on the analyst’s
aims, access to private information could be extremely useful. If we want to understand
how an individual interacts with their close friends for instance, private data could be
invaluable. Similarly, defence and security analysts would benefit from access to private
information but in many cases such access would not be feasible because of privacy, ethical,
legal or operational concerns.
Whilst the authors of this report do not condone breaching or subverting user privacy in any
way, it should be recognised that there are certain aspects of social media that provide the
analyst the possibility to view information publicly that was posted under some impression
of privacy. The structure of different social networks means that there is often a gap
between perceived privacy and genuine privacy, where analysts may find the opportunity to
see users interacting with their private personas within the public sphere.
Facebook users can often become confused about the status of their privacy settings
A common criticism of Facebook is the complexity of the site’s privacy settings. Currently
users are granted a lot of control over who can see what content they post. Each individual
wall post, photo album or profile section can be given its own privacy setting, with the
79
That said, there is some evidence to suggest that many users do tend to reveal information publicly that
they would not deem to be suitable.
choice ranging from completely public to viewable only by the user themselves. Whilst this
depth of control can be useful for users, it also brings about confusion. Many Facebook
users have little understanding of what privacy settings they are using for which posts. A
common mistake is for users to set the privacy of one post to “public”, automatically
changing the privacy of future posts as well. From then on users may continue to post for a
long time under the impression of privacy before realising their error. So it is common to
find lots of information on social media that was posted under the impression of privacy.
Notwithstanding the ethical implications involved in accessing information made public by
mistake, this sort of information can be useful from an intelligence standpoint. It enables
SOCMINT analysts to view information that would otherwise be impossible for them to
access. As this information is delivered under the impression of privacy it will often contain
information that could be more useful to an analyst than information that has
deliberately been made open to the public.
The possibility of identifying information that has been erroneously posted publicly under
the assumption of privacy poses a challenge for analysts. How is it possible to tell if a user
has made this sort of mistake? One possible way would be for analysts to pay attention to
when posts have been set to public for a reason (the user is promoting something that is
specifically addressed to a wider audience, for instance) and to see if the following posts
remain public, with no similar reason. Of course, this method cannot make a conclusive
assessment. Linguistic analysis of posts may also be employed to discern any subtle
differences between knowingly public posts and perceived private posts. This will require a
detailed and nuanced understanding of the individual in addition to software designed to
aid the analyst.
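The manual check described above could be supported by a simple script. The sketch below is illustrative only: it assumes the analyst has already recorded, for each post in chronological order, whether it is public and whether there is an evident reason for it to be public (the Post class and its field names are hypothetical). It flags public posts that follow a deliberately public post made after a period of private posting. Its output is only a prompt for closer inspection, not a conclusive assessment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    """One post in a user's timeline (hypothetical structure for this sketch)."""
    post_id: str
    is_public: bool
    has_public_reason: bool  # analyst's own judgement, recorded manually


def flag_possibly_unintended_public(posts: List[Post]) -> List[str]:
    """Return IDs of public posts that may have been intended as private.

    Posts must be supplied in chronological order.
    """
    flagged = []
    last_was_private = False
    in_suspect_run = False
    for post in posts:
        if not post.is_public:
            # A private post ends any suspect run.
            last_was_private = True
            in_suspect_run = False
        elif post.has_public_reason:
            # A deliberately public post right after private posting
            # marks the point where the setting may have stuck.
            if last_was_private:
                in_suspect_run = True
            last_was_private = False
        else:
            # Public with no evident reason: suspicious only inside a run.
            if in_suspect_run:
                flagged.append(post.post_id)
            last_was_private = False
    return flagged


# Invented example timeline.
timeline = [
    Post("a", is_public=False, has_public_reason=False),
    Post("b", is_public=True, has_public_reason=True),   # e.g. promoting an event
    Post("c", is_public=True, has_public_reason=False),
    Post("d", is_public=True, has_public_reason=False),
    Post("e", is_public=False, has_public_reason=False),
    Post("f", is_public=True, has_public_reason=False),
]
flagged = flag_possibly_unintended_public(timeline)  # ["c", "d"]
```

Post "f" is not flagged because it does not follow a deliberately public post; whether such isolated public posts deserve attention is a judgement the heuristic leaves to the analyst.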
Changes in Twitter and Instagram privacy settings can reveal useful information
that was originally posted when private.
Twitter has much simpler privacy settings. Users can either choose to have their tweets
public or protected. Public tweets are viewable to everyone, even people not logged into a
Twitter account. Protected tweets can only be viewed by users that have been approved to
follow a user and they cannot be retweeted. Additionally, protected tweets will not appear
in any search, and users whom the account holder has not approved will neither be notified of
@replies sent to them nor be able to see them.
However, unlike Facebook, Twitter does not allow users individual control of each tweet.
This means that all tweets must be set to private or public at any one time; it is not possible
to set one group of tweets to public and another to private. This being the case, many users
will spend a period of time tweeting privately and then decide to change to public tweets
from that point on, changing both their future tweets and their previously private tweets to
public.
There is then a situation where a user has posted a number of tweets as private
(presumably occupying their private persona) but then subsequently changed their privacy
settings and revealed these tweets to be public. This change has been made without editing
the previous tweets to fit in line with the user’s public profile. Whilst it is possible for users
to go back and delete tweets, and some surely will do this, there will be others who will not.
The quality of this revealed information (from an intelligence perspective) will be relevant
only in the specific context of whatever aims the analyst has, as well as the context of each
individual user. It should not be automatically assumed, for instance, that there will be a
difference between every individual’s private tweets and public tweets, although this may
be true in some instances. Similarly, just because a user chooses to make their tweets
private does not mean that they will change the content of subsequently public tweets.
Users are often unaware of the information provided by their metadata
Metadata is data about data. For our purposes it can be understood as additional
information that is included in social media posts and can include things such as the
location, time and interaction of posts. Some metadata is available to view directly on posts
themselves; other information is only accessible through the application’s API.
Metadata’s potential use for intelligence analysts cannot be overstated; the metadata of
posts is often vastly more useful than the posts themselves, primarily because users often
disregard the information provided by metadata when posting on social media. The Edward
Snowden revelations have often centred on the controversial collection of metadata by US
(and other) intelligence agencies.
In 2012 Vice magazine published an article boasting that their journalists had recently met
up with John McAfee whilst he was on the run in Central America, after being accused of
murdering his neighbour.80 Vice uploaded a photo (see below) which contained location
metadata about where the photo was taken (the metadata also revealed the iPhone model
used, the lens, exposure setting, time and date). These location coordinates pointed to a
location in Guatemala, which law enforcement agents used to track down and arrest
McAfee. Whilst it is possible that the metadata was left in the photo for a reason, it seems
much more likely that this was a genuine mistake on the part of a Vice employee. It is also an
interesting oversight given John McAfee’s notoriously stringent online security regimen.81
This case illustrates the point above that even those sophisticated users who have an
immediate and explicit interest not to reveal their location and a heightened awareness of
online security, can be careless with their metadata.
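Exif location data of the kind involved in this case is typically stored as degrees, minutes and seconds together with a hemisphere reference (N, S, E or W), whereas mapping tools generally expect signed decimal degrees. The following sketch shows the conversion; the function name is our own and the coordinates are arbitrary illustrative values, not those from the Vice photo.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    South and West are returned as negative values, following the usual
    convention for signed decimal coordinates.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError("ref must be one of 'N', 'S', 'E', 'W'")
    decimal = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        decimal = -decimal
    return decimal


# Arbitrary illustrative values (not taken from any real photograph).
latitude = dms_to_decimal(10, 30, 0, "N")   # 10.5
longitude = dms_to_decimal(20, 15, 0, "W")  # -20.25
```

Reading the raw values out of an image file requires an Exif parser, which is outside the scope of this sketch; online viewers such as the one used for Figure 19 display them directly.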
80
VICE Staff, ‘We are with John McAfee Right Now, Suckers’, 3 December 2012,
<http://www.vice.com/en_uk/read/we-are-with-john-mcafee-right-now-suckers> [accessed 29 September
2014].
81
‘Why John McAfee Is Paranoid about Mobile’, Dark Reading, 19 August 2014,
<http://www.darkreading.com/informationweek-home/why-john-mcafee-is-paranoid-about-mobile-/a/did/1298090> [accessed 29 September 2014].
Figure 19. The exchangeable image file format (Exif) data for the photo that revealed John McAfee’s location.82
It is very common for Twitter users to add locations to their tweets inadvertently. The
setting is not located with each individual tweet, but in the “security and privacy” section of
settings. Whilst it is as simple as unchecking a box to prevent location information being
added to your tweets, many users are unaware this setting exists. There is also a button that
allows users to delete the location information from all of their tweets. Similarly, on
Facebook and Instagram, location is automatically added to updates and must be manually
disabled. On Twitter, those who can access Twitter’s API can access the following metadata:
- Name
- Location
- Biography information
- Account creation date
- Username & Identifier
- Tweet’s location, date and time zone
- Tweet’s unique ID and ID of tweet replied to
- Contributor IDs
- Follower, following and favourite count
- Verification status - users with significant social stature, at risk from fake imitation
profiles, can apply to have their account “verified” with a tick next to their name.
82
‘Jeffrey’s Exif Viewer’, <http://regex.info/exif.cgi> [accessed 29 September 2014].
Reddit users often disregard their digital footprint, providing an important source
of information for analysts
As discussed previously in this report (see group interaction and self-representation),
Reddit’s segmentation into different subreddits generates a situation where Reddit users
will adopt different personas when operating within different subreddits. This seems to
have an interesting effect on Reddit users, insofar as they tend to behave in a way on
individual subreddits that disregards their activity on others. While there is no formal
research evidence to support this, one need only spend five minutes browsing through
subreddits to see glaring inconsistencies amongst posts as well as the disclosure of highly
revealing information.
Analysts should be aware of the potential ethical problem with accessing
information intended as private
In 2012, a U.S. judge forced Twitter to provide Tweets to the district attorney, arguing that
Tweets posted publicly have ‘no reasonable expectation of privacy’.83 Whilst this is
convincing (it is very clear that when users tweet, that information is accessible to anyone),
there is a sense that this sort of tacit consent is not acceptable when we consider the
potential use of our data. To put it another way, just because we post information
somewhere for everyone to see does not mean that we expect that data to be investigated
outside the usual context of the Twitter experience.
Whilst this point has some weight on its own, it is amplified when we see it in the context of
viewing data which was originally intended by users as something which was private. If a
user decides that they would like to have their posts on Twitter made public, this is unlikely
to include a consideration of the implications of making all their previously posted tweets
public as well. Indeed users may simply be unaware of this implication of their decision to
make their posts on Twitter public.
Analysts should consider which groups and demographics are likely to change their
behaviour in light of increased awareness of surveillance
In the post-Snowden digital environment, we should expect to see greater concerns over
privacy and surveillance amongst the general population. It seems likely that the more
aware users are that their online activities can be observed and analysed, the more they will
change their behaviour to limit this. Users do this by becoming more wary about
the kind of information they share and paying closer attention to privacy settings. One
extreme example is pointed out in a report by Recorded Future, which observed an ‘increased
pace of innovation’ in Al-Qaeda’s encryption technology.84 Whilst Al-Qaeda’s motivations to
83
Fitzpatrick, Alex, ‘Judge: Public Tweets Have No “Reasonable Expectation of Privacy”’, Mashable, 3 July 2012,
<http://mashable.com/2012/07/03/twitter-privacy/> [accessed 29 September 2014].
84
‘How Al Qaeda Uses Encryption Post-Snowden (Part 1)’, Recorded Future, 8 May 2014,
<https://www.recordedfuture.com/al-qaeda-encryption-technology-part-1/> [accessed 29 September 2014].
evade surveillance are obvious, the sentiments of other groups and demographics are likely
to be similar. This imposes additional collection, verification and analysis costs on the
analyst.
Despite this logical presumption, it is clear that not all demographics have altered their
behaviour online, or become more concerned about invasions of their privacy. A study has
recently found that young adults in Gothenburg were not concerned about their privacy in
light of the Snowden leaks and did not change their behaviour.85 This was predominantly
because these young adults assumed governments were already collecting this information
and therefore had already accepted this fact and modified their online behaviour
accordingly. Of course this only points to one limited demographic and can certainly not be
considered representative. What is likely is that groups with the most to hide, such as
Al-Qaeda, are also the most likely to adapt their behaviour in response to revelations regarding
online surveillance.
Analysts should track legal and regulatory developments
The relative youth of SOCMINT as an intelligence discipline means that there are sparse
legal and ethical guidelines that seek to determine the behaviour of analysts in the online
space. The ethical issues surrounding SOCMINT mirror the complexity of privacy on social
media but hold an important place in the public consciousness of online issues. In a
post-Snowden paradigm it is in the interest of the analyst to be strict on any public perception of
privacy violation. Whilst the legal repercussions may be minimal, the public outcry can be
much more harmful. A solution to this would be a solid set of legal guidelines that seek to
govern the behaviour of analysts. The Regulation of Investigatory Powers Act (RIPA) should,
in theory, provide such a legal framework in the UK. However, RIPA was passed into law in
2000 and therefore predates the social media boom. Demos have advocated that RIPA be
updated in order to cover challenges presented by SOCMINT.86 RIPA is arguably unable to
deal with the SOCMINT environment because it presents a unique problem that does not fit
into the existing legislation. It is in the interest of SOCMINT analysts to both advocate for
and remain up to date with any developments that provide a legal and ethical framework
for SOCMINT.
85
Hochman, Nadav, and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local through social
media.’, First Monday, Vol. 18, No. 7, 2013
<http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29 September 2014].
86
Bartlett, Jamie, Miller, Carl, Crump, Jeremy, Middleton, Lynne, ‘Policing in an Information Age’, Demos,
March 2013, <http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365>
pp. 1-42, [accessed 29 September 2014].
5. Infrastructural Features
Whether a user is updating social media via mobile or desktop affects the character of the information
Instagram’s interface allows it to be used as a personal tool of documentation – a Google Earth seen through the eyes of social media users
Reddit’s vote based ranking system means that users pander for votes rather than speaking their mind
Analysts should be aware of third-party clients and enhancements utilised by social media users
Understanding timeline algorithms is the key to understanding user experience on social media
Whether a user is updating social media via mobile or desktop affects the character
of the information
Mobile use of social media has long since eclipsed desktop use. Many newer applications
are created with no desktop counterparts at all (Snapchat) whilst others have introduced
very minimal desktop interfaces that are only used by a tiny minority of users (Instagram).
Services like Facebook, Reddit and Tumblr started off on desktop but have increasingly
become more mobile based.
Figure 20. How Mobile are Social Networks? % of time spent on social networks in the United States, by platform.87
Social media on the mobile platform can take on an entirely different character to its
desktop based counterpart. This is because mobile social media bears a much stronger
relationship to the real-world physical environment in which it’s being used. Desktop
computing tends to be done in a fairly uninteresting environment, in the bedroom, office, or
workplace for example. Laptop computing can be more interesting in terms of locations,
including: in a coffee shop, hotel, or when travelling. Mobile computing cannot be
pigeonholed in the same way; it is used anywhere and everywhere with a 3G/4G or Wi-Fi
connection. This ability for users to access social media within a changing physical
87
‘How Mobile are Social Networks? % of time spent on social networks in the United States, by platform’,
Statista, <http://www.statista.com/chart/2091/mobile-usage-of-social-networks/> [accessed 29 September
2014].
environment is coupled with increasingly faster internet speeds and mobile devices.
Additionally, social media use within real-world social settings has become increasingly
acceptable.
Analysts have begun to understand an increasingly visible trend towards the proliferation of
social media updates that reference real world events local to the social media user. The
mobility of social media has taken it out of the bedroom and the office and increased its
immediacy and presence in relation to real world actions and events. Twitter is often given
as the archetypal example of how real-world events can be monitored and reported in
real time by socially connected users. Instagram also offers the opportunity to photograph
incidents of interest, add hashtags to increase visibility and upload them for all to see.
Crucially, analysts will expect to see a further decrease in the time interval between
real-world events occurring and the reporting of them on social media. Advanced methods in
mobile communications will allow Twitter, Facebook and Instagram feeds to be increasingly
responsive to real world events. This is especially useful to analysts working in disaster or
crisis response.
Instagram’s interface allows it to be used as a personal tool of documentation – a
Google Earth seen through the eyes of social media users
In a fascinating paper detailing how visualisations of social media data can provide social
and cultural insights, Hochman and Manovich88 discuss how Instagram’s interface allows it to
be compared to “planetary documentation tools” like Google Earth and Bing Maps.
Instagram timestamps are not given as a specific date, but rather as a dynamic timespan:
photos are “2 days ago” rather than “30/09/14”. This shifts the timestamps from an
objective description to something that is entirely relative to the user. Photos are made
“atemporal”. However, exact and concrete geographical location is emphasised by
Instagram. Users can either tag their photos by venue or add them to a personal “photomap”.
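The difference between an absolute date and a relative timespan can be illustrated with a short sketch. The units, thresholds and wording below are assumptions chosen for illustration and are not taken from Instagram’s own software.

```python
from datetime import datetime

# Units from largest to smallest, with their length in seconds (assumed set).
UNITS = [
    ("week", 7 * 24 * 3600),
    ("day", 24 * 3600),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
]


def relative_timestamp(posted_at, now):
    """Describe how long ago a post was made, relative to the viewer's 'now'."""
    elapsed = int((now - posted_at).total_seconds())
    if elapsed < 0:
        raise ValueError("posted_at must not be later than now")
    for name, size in UNITS:
        if elapsed >= size:
            count = elapsed // size
            suffix = "" if count == 1 else "s"
            return f"{count} {name}{suffix} ago"
    return "just now"


posted = datetime(2014, 9, 28, 12, 0)
viewed = datetime(2014, 9, 30, 12, 0)
label = relative_timestamp(posted, viewed)  # "2 days ago"
```

The same photo yields a different label each time it is viewed, which is the sense in which the timestamp is relative to the user rather than an objective description.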
88
Hochman, Nadav and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local through social
media.’, First Monday, Vol. 18, No. 7, 2013
<http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29 September 2014].
Figure 21. Left to right: Instagram’s timeline, filters page, photo map.89
This definite geography and temporal-relativity is used in conjunction with “filters” that can
be added to Instagram photos to give a variety of nostalgic effects. Each one of these filters
(sepia, black and white, cross processed, etc.) is designed to evoke a different feeling in the
photographs. With this feature photos are taken with filters in mind and studying the
different feelings and impressions created by Instagram features can provide useful
information about the mind-set of the user. Filters broadly communicate a feeling of
authenticity or nostalgia and stamp the images as personal to the individual user,
in the way a one-off original print would be. Filters are an attempt to rebel against the cold
and bland nature of digital images, to give a feeling of uniqueness to infinitely replicable
images by adding synthetic imperfections that, ironically, can be applied an infinite number
of times.
The third feature that completes Instagram’s status as a subjective tool of documentation is
the instant sharing function, which increases a post’s visibility within a wider social media
context. Photos on Instagram are generally not shared only with the user’s personal network:
unless an account is protected, photos can be searched by user, hashtag and location.
Individual photos on Instagram are designed to fit into a wider collaborative project, filling
the world with a visual documentation effort. Thus Instagram ‘resists the time and place
presented by larger impersonal corporate efforts.’90 Moreover, Instagram’s inherently
mobile nature and its ubiquitous presence in the pockets of its millions of users leads to a
89 Instagram official screenshots.
90 Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through Instagram.’ Princeton
University Senior Thesis, 2013.
daily upload of masses of potentially useful information for the analyst to collate, sift and
analyse.
It is unclear how far this affects the everyday experience of Instagram users; certainly
some will be more aware of it than others. What SOCMINT analysts should take away from this is
twofold. Firstly, Instagram photos can be used as an alternative geographic source to
Google Maps, one that provides a view of the world mediated through user experience. This
has an obvious role as a tool for geolocation. Secondly, the more users become aware that
their Instagram photos are being used for this purpose, the more subjective and self-aware
their documentation will become, fundamentally altering its contribution to information collection and the
formation of an all-source intelligence product.
Note: It is also possible to create a Google Earth and Instagram hybrid by combining the two
and viewing Instagram photos through the Google Maps interface.91
Reddit’s vote based ranking system means that users pander for votes rather than
speaking their mind
As mentioned previously in this report, Reddit uses a vote based ranking system to decide
the visibility of posts on the front page, individual subreddits and comments within posts.
Users can “upvote” posts to increase visibility and “downvote” to decrease it. If a post
receives enough downvotes it will be hidden altogether.
Figure 22. Reddit’s front page. Upvote/downvote scores are visible on the left of the thumbnails.
91 Some attempts to do this can be seen here: http://www.gramfeed.com/ http://instahood.meteor.com/
http://www.shots24.com/ and
http://instaearth.me/#/stephaniedurant/photos/800672074709271886_240181984 [accessed 29 September
2014].
Reddit’s guideline page, Reddiquette, gives a series of instructions on how users should conduct
their voting: do not downvote something just because you don’t
like it; do not mass downvote someone else’s posts; do not moderate a story based on your
opinion of its source; and do not upvote or downvote based on the person who posted it.92 Whilst these
rules are generally followed by users voting on threads/posts (see above), when users vote
on comments they tend to abide by much less objective guidelines.
Figure 23. Reddit’s comment interface, viewable by clicking on the “comments” link underneath a post/thread
The very nature of comments on internet forums means that they will be loaded with (to
name a few) divisive opinions, questionable facts and ad hominem replies. The upshot is
that Reddit’s comment voting system descends into a popularity contest, with
redditors pandering to popular “reddit” opinions in order to receive upvotes.93 By making
the correct pop-culture references, in-jokes and Reddit tropes, users can sail to the top of
the comment pile. In addition to this, by far the majority of downvotes in comments on
Reddit are from users that simply disagree with the expressed opinion.
The effect this has on Reddit from an intelligence perspective is mixed. If we would like
Reddit to be a forum that gives analysts a means to gauge user opinions about
certain issues, the voting system obscures those opinions. Upvotes are not just given when comments
‘contribute to discussion’,94 but in response to a variety of factors. Downvotes, however, are
a little more uniform, generally being given out en masse when a user posts something that
is rude or abrupt, or something that conflicts with Reddit’s general outlook (see above). That
said, this cannot be taken as a rule and the specific instances are almost always contextual
92 Reddiquette, http://www.reddit.com/wiki/reddiquette [accessed 29 September 2014].
93 See above: Representation/Identity features for details on how certain sets of beliefs and attitudes come to
define the archetype “Redditor”.
94 Reddiquette, http://www.reddit.com/wiki/reddiquette [accessed 29 September 2014].
to the particular subreddit under discussion. This again imposes additional costs on the
analyst, an important characteristic of social media as an intelligence discipline. An
appreciation of this is leading many organisations, from private companies to intelligence
agencies, to develop dedicated SOCMINT teams although this is still a nascent phenomenon.
Additionally, because of Reddit’s comment sorting algorithms,95 comments that are posted
earlier are much more likely to be visible and receive votes than comments from those that
join the thread later. This places another arbitrary constraint on Reddit’s ranking system.
The result of all this is that Reddit’s voting system (especially upvotes) should
not be trusted as a reliable barometer of user opinion. Whilst it
may sometimes be the case that Reddit users adopt a reasonable approach, there are too
many possible obfuscating factors that cannot be reasonably screened or adjusted for
in any final analysis.
Analysts should be aware of third-party clients and enhancements utilised by social
media users
Third-party social media clients utilise the original platform’s API and provide a new medium
through which users can access the application’s services. Usually these services aim to give an
enhanced user experience; to give more control over a user’s social media; to optimise an
application for mobile or desktop; or to tailor an application for a specific purpose
(marketing, professional etc.). Different social media sites have different rules and
guidelines for third-party developers, which the analyst must be aware of. However, most
networks endorse third-party development, as successful third-party apps lead to successful
parent products, often through adaptation of the network or through the acquisition and
integration of the third-party company.96
Third-party use of social networks mediates use in any number of ways, depending on the
specifics of the software. For analysts this is important: many of the factors of social media
and their SOCMINT implications depend on an assumption that the default interfaces
of social media are being used. When the social media experience changes through a third-party
application, this may affect what sort of information can be known by the analyst.
Understanding timeline algorithms is the key to understanding user experience on
social media
A central feature of almost every mainstream social media service is the
timeline/newsfeed/front page.97 Taking different names depending on the service, this
95 Comments in Reddit threads can be sorted by the following criteria: best (comments that are predicted to
have a very good upvote/downvote ratio), top (posts with the best overall ratio), new, hot (new posts that are
getting a good ratio), controversial (posts with lots of up and down votes), old.
96 For instance Facebook have offered grants between $25,000 and $250,000 for developers.
97 For convenience I will refer generically to this feature as a “timeline” for this section but it is not to suggest
that this is a general term.
feature performs a similar function across the board of social media: it is the place where
the feeds from a user’s subscriptions (friends, liked pages, joined subreddits, pages followed
on Twitter) are collated into one stream of information. This is the place where users of social
media spend the majority of their time. It dictates what updates they receive from friends or
brands, which news stories are visible and which posts from which subreddits they see.
The timeline’s place as the centrepiece of social media means that its effect on the user
experience should not be underestimated. A developed understanding of it is therefore essential for
SOCMINT analysts. To develop one, we must look in detail at the differences between
timelines across social media. Platforms use a number of different ways of “sorting” content,
ranging from the all-inclusive reverse chronological timeline of Twitter
and Instagram, to the vote-based system seen on Reddit and Digg and the confidential
algorithm that decides what is visible on Facebook.
To begin with, the simplest ranking mechanism for a timeline is the all-inclusive, reverse
chronological method employed by sites such as Twitter and Instagram (and to some extent,
Reddit). This is exactly what it sounds like: all Tweets on Twitter and photos on Instagram
are visible on the timeline, with the most recent tweets and photos being visible at the top.
No content whatsoever is obscured from the user in this way. So, as far as user experience
goes, we can be sure that if a user is following a page, person or brand, then they will be
receiving visible updates from that page.
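The all-inclusive, reverse chronological method can be sketched in a few lines of Python. This is a minimal illustration only, not any platform's actual implementation: the `Post` record, the `build_timeline` function and the sample accounts are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    """Hypothetical record for a single Tweet or photo."""
    author: str
    text: str
    posted_at: datetime


def build_timeline(posts, following):
    """Keep every post from followed accounts and show the newest first.

    Nothing is filtered out or re-weighted: visibility depends only on
    whom the user follows and when the post was made.
    """
    visible = [p for p in posts if p.author in following]
    return sorted(visible, key=lambda p: p.posted_at, reverse=True)


posts = [
    Post("alice", "first", datetime(2014, 9, 1, 9, 0)),
    Post("bob", "second", datetime(2014, 9, 1, 10, 0)),
    Post("carol", "not followed", datetime(2014, 9, 1, 11, 0)),
    Post("alice", "third", datetime(2014, 9, 1, 12, 0)),
]
timeline = build_timeline(posts, following={"alice", "bob"})
print([p.text for p in timeline])
```

Because the only inputs are the follow list and the timestamps, an analyst who knows both can reconstruct what a user was shown, which is not possible for the curated feeds discussed next.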
At the time of writing98 there has recently been a lot of media attention99 concerning comments
made by Anthony Noto, Twitter’s financial chief, who said that some of the most relevant
Tweets can be buried at the bottom of a user’s feed, suggesting that a more curated feed
might be in the pipeline for Twitter. Indeed, Twitter has been introducing some
features of this sort into the timeline for some time now. We can now see some Tweets
that accounts we follow have favourited or replied to (although Twitter
representatives have stressed that these appear only when there are no new Tweets to show but the user is
refreshing his or her timeline). Twitter’s traditional timeline is considered an important part
of the Twitter experience for many users and is thought to contribute to its democratic and
egalitarian ideology. For this reason, suggestions of a filtered timeline have been met with
strong opposition by Twitter’s loyal community. Instagram is not showing any inclination to
move in a similar direction, but given the fact that Facebook paid almost $1 billion for the
98 Early September 2014.
99 Hern, Alex, ‘End of the timeline? Twitter hints at move to Facebook-style curation’, The Guardian, 4
September 2014, <http://www.theguardian.com/technology/2014/sep/04/twitter-facebook-style-curatedfeed-anthony-noto?CMP=twt_gu> [accessed 29 September 2014].
company (and has hinted at the possibility previously100) then it appears likely that at some
point in the future we will see something like embedded adverts in Instagram’s feed.101
Facebook’s algorithm is the subject of a thousand technology blog posts and provides the
basis of an entire industry of social media marketers who make a living out of trying to game
it. It is incredibly complex and involves thousands of factors in determining where stories
appear on a timeline. The algorithm is also secret, so it is not worth attempting to
understand its precise mathematics. That said, we know that Facebook still considers the
general basis for the now defunct EdgeRank algorithm as important.102,103 It makes more
sense to understand the Facebook ranking algorithm in terms of its ideology. Lars Backstrom,
a Facebook engineer, has said that Facebook’s ‘main goal is to create the best personalised
newspaper for all of its users.’104 This means that Facebook wants to strike the perfect
balance between showing updates from friends that users care about and updates from
brands and pages that they’re interested in. Facebook quantifies this by trying to maximise
the number of interactions with the content displayed.
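Although the current algorithm is secret, the three EdgeRank factors named in this report (Affinity, Weight and Time Decay) can be illustrated with a toy calculation. The sketch below assumes that the factors are multiplied together for each Edge and summed per story, which is how EdgeRank is commonly described; the `1 / (1 + age)` decay curve and all of the numbers are invented purely for illustration and do not come from Facebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical interaction (like, comment, post) attached to a story."""
    affinity: float   # how close the viewer is to whoever created the Edge
    weight: float     # how much this type of interaction counts
    age_hours: float  # time since the Edge was created


def time_decay(age_hours):
    """Assumed decay curve: newer Edges count for more."""
    return 1.0 / (1.0 + age_hours)


def edge_score(edge):
    """Affinity x Weight x Time Decay for a single Edge."""
    return edge.affinity * edge.weight * time_decay(edge.age_hours)


def rank_stories(stories):
    """Order story names by the summed score of their Edges, highest first."""
    scores = {name: sum(edge_score(e) for e in edges)
              for name, edges in stories.items()}
    return sorted(scores, key=scores.get, reverse=True)


stories = {
    "close_friend_photo": [Edge(0.9, 2.0, 1.0), Edge(0.9, 1.0, 0.0)],
    "brand_post": [Edge(0.2, 2.0, 0.0)],
    "old_friend_status": [Edge(0.9, 1.0, 9.0)],
}
ranking = rank_stories(stories)
print(ranking)
```

Even in this simplified form, the sketch shows why an analyst cannot assume that a followed page's updates were seen: a fresh post from a low-affinity brand can outrank an older update from a close friend, and anything scoring low enough may never be displayed.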
Facebook’s algorithm has been heavily criticised for hiding content and creating an “echo
chamber” where users only see content they are already interested in, rather than being
introduced to new content. This then reinforces pre-existing connections and prevents users
from being introduced to new content they would otherwise be interested in. Tim Herrera
at The Washington Post spent 6 hours scrolling through his newsfeed and was only shown a
fraction of the content that was posted by pages and people he was connected with, even
to the point where content was being repeated.105 Similarly, Mat Honan wrote an article in
Wired in which he described what happened to his Facebook news feed when he liked every
single piece of information presented to him by Facebook.106 Honan found that his
100 Borow, James, ‘How Facebook is Already Profiting from Instagram’, Ad Age, 8 August 2013,
<http://adage.com/article/digitalnext/facebook-profiting-instagram/243515/> [accessed 29 September 2014].
101 This is similar to the strategy Snapchat has taken in order to increase revenue. Some of the messages
Snapchat users receive are now adverts. It seems that embedding adverts into the normal user experience of
these sites is becoming the norm, rather than having ads located in a sidebar or as pop-ups. Whilst Facebook
does have adverts in the sidebar, it has also started introducing them into the regular newsfeed.
102 McGee, Matt, ‘EdgeRank is Dead: Facebook’s News Feed Algorithm Now Has Close to 100k Weight Factors’,
Marketing Land, <http://marketingland.com/edgerank-is-dead-facebooks-news-feed-algorithm-now-hasclose-to-100k-weight-factors-55908> [accessed 29 September 2014].
103 EdgeRank is a combination of three factors, Affinity, Weight and Time Decay, which decide where an
Edge appears on the News Feed. Edges aren’t just posts on Facebook; they include anything at all that happens
on the site, including likes, comments etc. The algorithm looks at all the Edges connected to a user and ranks
them based on their importance to the user; objects with the highest EdgeRank will be sent to the top.
104 King, Rachel, ‘Facebook engineers explain News Feed ranking algorithms; more changes soon’, ZDNet, 6
August 2013, <http://www.zdnet.com/facebook-engineers-explain-news-feed-ranking-algorithms-morechanges-soon-7000018996/> [accessed 29 September 2014].
105 Herrera, Tim, ‘What Facebook doesn’t show you’, The Washington Post, 18 August 2014,
<http://www.washingtonpost.com/news/the-intersect/wp/2014/08/18/what-facebook-doesnt-showyou/?Post+generic=%3Ftid%3Dsm_twitter_washingtonpost> [accessed 29 September 2014].
106 Honan, Mat, ‘I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me’, Wired, 11
August 2014, <http://www.wired.com/2014/08/i-liked-everything-i-saw-on-facebook-for-two-days-hereswhat-it-did-to-me/> [accessed 9 November 2014].
Facebook feed drifted rapidly to the American political right before becoming increasingly
polarised between sentiments on the extremes of the political left and right in the US.
Interestingly, Reddit offers customised ranking systems to its users. Reddit provides six
different optional ranking algorithms that users can choose to sort posts and comments by.
The default setting for posts is ‘hot’ and the default for comments and replies is ‘best’.
Below is an explanation of the six optional ranking algorithms that users can select on
Reddit.
- Best – The newest ranking algorithm, introduced in a Reddit blog post written by Randall
  Munroe of xkcd.107 Best aims to make a prediction on the quality of the post based on its
  current score. It estimates what sort of score the post would receive if everyone had seen
  it. Posts with the best estimated ratio appear higher up the feed.
- Hot – Is based on the rate of upvotes to downvotes. Posts that are currently
  receiving a lot of upvotes and comparatively few downvotes will appear nearer the
  top of the feed.
- New – Ranks the newest posts first.
- Top – Is simply upvotes vs downvotes. Posts with the best ratio appear highest up
  the feed. This has been highly criticised because it is massively biased towards content that
  is posted early. If a reply gets posted 5 minutes after the story has been created, it
  will be much more visible than a (much better) reply that is posted a few days later.
  The higher something is listed, the more visibility it has and the better chance it has
  of receiving a lot of upvotes.
- Controversial – Posts that have a ratio of upvotes to downvotes that is close to
  50/50 will appear higher up in the feed.
- Old – Ranks the oldest posts first.
These different ranking systems alter the way in which information is presented on Reddit
and provide the analyst with the opportunity to approach conversations from six different
perspectives.
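To make the contrast between these orderings concrete, the sketch below sorts the same four hypothetical comments in five of the ways listed above. It is an illustration under stated assumptions, not Reddit's production code. ‘Best’ is modelled as the lower bound of a Wilson score confidence interval, the statistic described in the Reddit blog post cited above, with the confidence level `z=1.96` chosen arbitrarily here. ‘Top’ is modelled as net score, and ‘controversial’ as closeness to a 50/50 split with ties broken by vote volume; both are simplifying assumptions. ‘Hot’ is omitted because its formula is not described in enough detail in this report.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    """Hypothetical comment with its vote counts and age."""
    label: str
    ups: int
    downs: int
    age_minutes: int


def wilson_lower_bound(ups, downs, z=1.96):
    """Lower bound of the Wilson score interval for the upvote share.

    It answers: given the votes seen so far, how low could the 'true'
    upvote share plausibly be? Few votes give a cautious (low) estimate.
    """
    n = ups + downs
    if n == 0:
        return 0.0
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


def balance(c):
    """1.0 means a perfect 50/50 split; 0.0 means entirely one-sided."""
    if c.ups == 0 or c.downs == 0:
        return 0.0
    return min(c.ups, c.downs) / max(c.ups, c.downs)


# Each sort key is arranged so that a larger value means "shown higher".
SORTS = {
    "best": lambda c: wilson_lower_bound(c.ups, c.downs),
    "top": lambda c: c.ups - c.downs,
    "new": lambda c: -c.age_minutes,
    "old": lambda c: c.age_minutes,
    "controversial": lambda c: (balance(c), c.ups + c.downs),
}


def sort_comments(comments, mode):
    """Return comment labels in display order for the chosen sort mode."""
    ordered = sorted(comments, key=SORTS[mode], reverse=True)
    return [c.label for c in ordered]


comments = [
    Comment("early_popular", ups=500, downs=100, age_minutes=600),
    Comment("late_good", ups=40, downs=1, age_minutes=30),
    Comment("divisive", ups=200, downs=190, age_minutes=300),
    Comment("single_vote", ups=1, downs=0, age_minutes=5),
]
for mode in SORTS:
    print(mode, sort_comments(comments, mode))
```

In this sample, the early comment with the largest net score leads under ‘top’, while ‘best’ promotes the later comment with a stronger upvote share and ‘controversial’ surfaces the evenly split one. An analyst reading the same thread under different sort settings may therefore come away with different impressions of what the community thinks.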
107 ‘reddit’s new comment sorting system’, Reddit Blog, 15 October 2009,
<http://www.redditblog.com/2009/10/reddits-new-comment-sorting-system.html> [accessed 29 September
2014].
Conclusions & Recommendations
Recommendation 1: Social networks and their internal features should always be
examined contextually. The role of a social network is defined by the situation
within which it is being used.
This report provides a framework with which to analyse alternate social networks. It also
highlights the extent to which social networks must be examined within their specific
context. The example of hashtag use across Twitter and Instagram illustrates this. Whilst we
see some similarities in hashtag use across both platforms, within specific contexts hashtags
are used in very different ways. A similar trend can be seen in many features across social
media. This does not mean that generalisations cannot be made, but aims to show that it is
often the details of social media use that can provide the most insightful information.
Recommendation 2: Organisations should develop ethical and legal due process for
analysts’ behaviour on social media.
There is, as yet, no definite legal procedure designed to regulate the behaviour of
analysts on social media.108 This is perhaps for want of any ethical consensus. It is in the
interest of organisations to develop a stringent ethical code of conduct for analysts. Ethical
concerns should be seen from the perspective of the social media user as well as that of
appropriate legal and regulatory bodies. That is to say, organisations should be aware of the
public repercussions that can result from unethical social media intelligence practice as well
as any additional ethical concerns. In addition, many of the information gathering
techniques necessary for social media intelligence involve analysts interacting directly with
users of social media. This being the case, there must be a set of guidelines to govern this
behaviour. These may include imperatives such as the necessity for analysts to reveal their
identity and intentions when interacting with users on social media.
Recommendation 3: Organisations that value SOCMINT should consider funding
research into areas that will benefit SOCMINT. These may include: user
representation on social media and sentiment/credibility analysis.
SOCMINT’s relative youth as an intelligence discipline means that much of the most
beneficial innovation and research has yet to be conducted. Organisations must consider
the cost/benefit of either: a) allocating resources to fund external research projects; or b)
allocating funds and assigning personnel to research in-house. The potential benefits from
gaining an advantage in areas such as automated sentiment and credibility analysis depend
108 Demos have advocated an application of the Regulation of Investigatory Powers Act (RIPA) as a legal
framework for the gathering and use of social media data. Bartlett, Jamie, Miller, Carl, Crump, Jeremy,
Middleton, Lynne, ‘Policing in an Information Age’, Demos, March 2013,
<http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365> pp. 1-42,
[accessed 29 September 2014].
on the success of the research, but could prove immensely useful to intelligence
organisations. In addition, research focusing on a psychological or linguistic approach to
social media analysis may have fewer obvious short-term benefits but could eventually
transform the field, providing information and insights previously thought to be
unattainable.
Recommendation 4: Organisations must decide whether to implement a dedicated
SOCMINT team or to embed SOCMINT specialists within existing analytical teams.
As of yet, there is no accepted protocol for structuring SOCMINT within existing intelligence
infrastructure. Whether it makes more sense to implement separate or embedded
SOCMINT teams is primarily dependent on the relationship between SOCMINT and the
other types of intelligence research conducted by a particular organisation. If, for instance, a
team is working on geolocation and utilising various different methods, then it seems that
embedded SOCMINT geolocation specialists would be the right choice. However, if the role
of SOCMINT is to provide a separate perspective on a particular issue, perhaps Twitter as a
counterpoint to traditional news media, then a separate SOCMINT team makes more sense.
Recommendation 5: Resources permitting, organisations should consider tasking
an analyst to track advances in social media and social media analysis technology.
Given the rapidly changing landscape of social media, it is imperative for any SOCMINT
organisation to be up to date with the social media status quo. For instance, during the
writing of this report, Instagram have publicly announced plans to integrate advertisements
into their newsfeed. Algorithms that govern timelines, search results and recommendations
are constantly being tweaked. As well as changes to current popular social networks, the
massive industry of tech start-ups is churning out endless newer alternatives to the
traditional options. Sometimes these may be adaptations of existing social media,109 or
entirely new concepts.110 SOCMINT analysts must be aware of any relevant new
developments in social media in order to exploit them efficiently for intelligence purposes.
Inability to do this is one of the primary problems preventing SOCMINT from being a
successful intelligence discipline.
Similarly (as mentioned above) social media analysis technology is developing at a fast rate,
with many organisations choosing to purchase tools externally rather than develop their
own bespoke alternatives. Remaining up-to-date with advances in social media analysis
tools is essential to maintaining an edge in SOCMINT. Whilst many well publicised
developments may be easy to keep track of, others will require some significant research.
The combination of keeping track of developments in social media itself and the tools used
to analyse it warrants consideration by organisations, particularly relating to the issue of
whether it is worth dedicating an analyst for this specific purpose.
109 ‘Medium’ (https://medium.com/) is similar to Twitter but focuses on longer, more detailed posts.
110 ‘Learnist’ (https://learni.st/explore) is a social network based around sharing educational information.
Glossary of Technical Terms
4chan – An English-language forum made up of a community of anonymous users. 4chan’s
most popular forum “random” or “/b/” has been described as a place where ‘people try to
shock, entertain and coax free porn from each other.’111
9Gag – A platform where users share images, videos, links and vote on the quality of
content. Comparable (in theory) to Reddit.
Application Programming Interface (API) – Specifies a software component in terms of its
operations, their inputs and outputs and underlying types. Provides the means by which
third party applications can access the data on social media.
Astroturfing – The masking of the sponsors of a message or organisation to make it appear
as though it originates from and/or is supported by grassroots participants.
Avatar – A term broadly used to describe a person’s online representation of themselves; or
specifically used to describe the profile picture employed by a user.
Caption – The text accompanying a photo (e.g. on Instagram).
Emoticon – Small text-embedded images or text itself that is used to convey feelings or
emotions on social media (e.g. ;) :P :’().
Exchangeable image file format (Exif) – A standard that specifies the formats for images,
sound and ancillary tags for systems handling image and sound files.
Favourites – A function on Twitter that allows users to express approval of posts and/or
collate them for later viewing.
Friends Reunited – The first social networking site to become popular in Britain, focused on
connecting users with friends they had fallen out of contact with.
Geotag – Metadata (see below) attached to a post on social media that provides details
about where the user was when posting. Or metadata that is attached to a piece of media
(video/photo) containing location details.
Hashtag – A pound sign: # affixed to words on social media. For example: #example. Used
on Twitter, Instagram and Facebook to increase visibility of posts/contribute to
‘trends’/other functions discussed throughout this report.
Likes – A feature of Instagram and Facebook that allows users to express approval of posts.
111 Douglas, Nick, ‘What The Hell Are 4chan, ED, Something Awful, And “b”?’ Gawker, 2008,
<http://web.archive.org/web/20080724081826/http://gawker.com/346385/what-the-hell-are-4chan-edsomething-awful-and-b> [accessed 29 September 2014].
LinkedIn – A social network designed for working professionals.
Meme – A term coined by Richard Dawkins to designate an idea, behaviour or style that
spreads throughout a culture. The term is more commonly used to describe image macros
online.112
Metadata – ‘Data about data’, information about the content of a piece of data. Metadata
on a Tweet contains the user’s follower count amongst other things. Metadata on a photo
may contain its size, the camera used, the location the photo was taken (see: Geotag above)
or other information.
MMORPGs – Massively Multiplayer Online Role-Playing Games. A term used to describe video
games where many users interact in an online world.
Netnography – A branch of ethnography which seeks to analyse the behaviour of individuals
and communities online; it originally adapted market research techniques to provide
insights.
Ranking Algorithm – The algorithm that determines where certain information appears in a
timeline/newsfeed/search engine result.
Reddit Gold – Reddit users can donate gold to each other in order to give each other
premium membership. Reddit receives the money.
Regram – A trend on Instagram where users repost each other’s images; it is not a feature
supported by the site but has emerged from user appropriation of the network’s features.
SIDE Model – The psychological model that seeks to explain why users behave differently
when interacting as part of a community; it has some interesting applications for social
media analysts.
Sock Puppets – A term given to fake profiles on social media, usually ones that have been
created for deceptive purposes.
Subreddit – A forum on Reddit dedicated to discussion over a particular topic.
Trending – Popular words, phrases or hashtags on social media.
Tumblr – A popular blogging platform, especially amongst teenagers and young adults.
Upvotes/Downvotes – Enable users on Reddit to vote on posts.
VKontakte (VK) – The second largest social network in Europe after Facebook, similar to
Facebook in purpose and design. VK is primarily Russian-speaking, although it is available in
several languages.
112 <http://wac.450f.edgecastcdn.net/80450F/thefw.com/files/2012/05/most-interesting-man-meme.jpg>
Works Cited
Books, Journals and Articles
Bai, Shoutian, Zhu, Tingshao and Cheng, Li, ‘Big-Five Personality Prediction Based on User
Behaviors at Social Network Sites’, eprint arXiv:1204.4809, 2010
Bartlett, Jamie, Miller, Carl, Crump, Jeremy, Middleton, Lynne, ‘Policing in an Information
Age’, Demos, March 2013, pp. 1-42,
<http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365>,
[accessed 29 September 2014]
Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational
Aspects of Retweeting on Twitter’, Proceedings of the 43rd Hawaii International Conference
on System Sciences, 2010, <http://www.danah.org/papers/TweetTweetRetweet.pdf>
[accessed 29 September 2014]
Bryden, John and Funk, Sebastian, ‘Word usage mirrors community structure in the online
social network Twitter’, EPJ Data Science, Vol. 2, No. 3, 2013,
<http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014]
Chan, Michael, ‘The Impact of Email on Collective Action: A field application of the SIDE
model’, New Media and Society, 2010.
Gunawardena, Nipun, et al. ‘Instagram Hashtag Sentiment Analysis’, University of Utah,
2013, <http://www.eng.utah.edu/~cs5350/ucml2013/3-3p.pdf> [accessed 29 September 2014]
Gupta, Aditi et al., ‘TweetCred: A real-time Web-based System for Assessing Credibility of
Content on Twitter’, Indraprastha Institute of Information Technology, 2014.
Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through
Instagram.’ Princeton University Senior Thesis, 2013
Heppner, Paul P, Wampold, Bruce E and Kivlighan, Jr. Dennis M, ‘Research Design in
Counselling (Research Statistics & Program Evaluation)’, Cengage Learning 3rd Edition, 2007.
Hochman, Nadav, and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local
through social media’, First Monday, Vol. 18, No. 7, 2013
<http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29
September 2014].
Kosinski, Michal, Stillwell, David and Graepel, Thore (2013), ‘Private Traits and attributes are
predictable from digital records of human behaviour’, Proceedings of the National Academy
of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.
Lapidot-Lefler, Noam and Barak, Azy, ‘Effects of anonymity, invisibility, and lack of eye-contact
on toxic online disinhibition’, Computers in Human Behaviour, Vol. 28, No.2, 2012,
pp. 434-443
Omand, Sir David, Bartlett Jamie and Miller, Carl, ‘Introducing Social Media Intelligence
(SOCMINT)’, Intelligence and International Security, 2012,
<http://www.academia.edu/1990345/Introducing_Social_Media_Intelligence_SOCMINT_>
[accessed 29 September 2014]
Paulhus, D L, and Williams, K, ‘The Dark Triad of personality: Narcissism, Machiavellianism
and Psychopathy’, Journal of Research in Personality, Vol. 36, 2002, pp. 556-563
Suler, John, ‘The Online Disinhibition Effect’, CyberPsychology & Behavior, Vol. 7, No.3, 2004,
pp. 321-326.
Sumner, Chris et al., ‘Predicting Dark Triad Personality Traits from Twitter usage and a
linguistic analysis of Tweets’, Proceedings of the IEEE 11th International Conference on
Machine Learning and Applications ICMLA 2012, 2012.
Graphs
‘Daily Active Facebook Users by Country/Region’, The International Centre for Security
Analysis and Facebook, <www.facebook.com>, 2014 [accessed 29 September 2014]
‘Distribution of Twitter users worldwide from 2012 to 2018’, Statista,
<http://www.statista.com/statistics/303684/regional-twitter-user-distribution/> [accessed
29 September 2014]
‘Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by
generation’, Statista, <http://www.statista.com/statistics/307026/growth-of-instagramusage-worldwide/> [accessed 29 September 2014]
‘How Mobile are Social Networks? % of time spent on social networks in the United States,
by platform’, Statista, <http://www.statista.com/chart/2091/mobile-usage-of-socialnetworks/> [accessed 29 September 2014]
‘Millions of Teens Have Abandoned Facebook Since 2011’, Statista,
<http://www.statista.com/chart/1789/facebook-s-teenager-problem/> [accessed 29
September 2014]
‘Most addicted/engaged countries by avg. pageviews per visit.’, Reddit Blog, 2011,
<http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html>
[accessed 29 September 2014]
Olson, Randal, ‘Most Active Subreddits’, 2013, <http://www.randalolson.com/> [accessed
29 September 2014]
‘Percentage of UK Internet users who use Twitter as of February 2013, by age group’,
Statista,
<http://www.statista.com/statistics/257429/share-of-uk-internet-users-who-use-twitter-by-age-group> [accessed 29 September 2014]
‘Regional distribution of Instagram traffic in the last three months as of April 2014, by
country’, Statista, <http://www.statista.com/statistics/272933/distribution-of-instagram-traffic-by-country/> [accessed 29 September 2014]
Solis, Brian and JESS3, ‘The Conversation Prism’, 2014, <www.conversationprism.com>
[accessed 29 September 2014]
‘Which Cities & Countries Have the Most reddit Addicts?’, Reddit Blog, 2011,
<http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html>
[accessed 29 September 2014]
News Articles and Blog Posts
Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, 22
July 2014, <http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29
September 2014]
‘Banned Hashtag Search’, The Data Pack, <http://thedatapack.com/tools/blocked-hashtag-search/> [accessed 29 September 2014]
Benthall, Sebastian, ‘“Weird Twitter” art experiment method notes and observations’,
Digifesto, 18 October 2012, <http://digifesto.com/2012/10/18/weird-twitter-art-experiment-method-notes-and-observations/> [accessed 29 September 2014]
Borow, James, ‘How Facebook is Already Profiting from Instagram’, Ad Age, 8 August 2013,
<http://adage.com/article/digitalnext/facebook-profiting-instagram/243515/> [accessed 29
September 2014]
Domm, Patti, ‘False Rumor of Explosion at White House Causes Stocks to Briefly Plunge; AP
Confirms its Twitter Feed Was Hacked’, CNBC, <http://www.cnbc.com/id/100646197#>
[accessed 29 September 2014]
Douglas, Nick, ‘What The Hell Are 4chan, ED, Something Awful, And “b”?’ Gawker, 2008,
<http://web.archive.org/web/20080724081826/http://gawker.com/346385/what-the-hell-are-4chan-ed-something-awful-and-b> [accessed 29 September 2014]
Fitzpatrick, Alex, ‘Judge: Public Tweets Have No “Reasonable Expectation of Privacy”’,
Mashable, 3 July 2012, <http://mashable.com/2012/07/03/twitter-privacy/> [accessed 29
September 2014]
Foster, Michael, ‘Two things the Fappening Teaches Marketers’, All Voices,
<http://www.allvoices.com/article/100000692> [accessed 29 September 2014]
GrandJean, Martin, ‘Analyser graphiquement son réseau facebook’, MartinGrandJean, 17
March 2013, <http://www.martingrandjean.ch/analyser-graphiquement-reseau-facebook/> [accessed 29 September 2014]
Hern, Alex, ‘End of the timeline? Twitter hints at move to Facebook-style curation’, The
Guardian, 4 September 2014,
<http://www.theguardian.com/technology/2014/sep/04/twitter-facebook-style-curated-feed-anthony-noto?CMP=twt_gu> [accessed 29 September 2014]
Herrera, Tim, ‘What Facebook doesn’t show you’, The Washington Post, 18 August 2014,
<http://www.washingtonpost.com/news/the-intersect/wp/2014/08/18/what-facebook-doesnt-show-you/?Post+generic=%3Ftid%3Dsm_twitter_washingtonpost> [accessed 29
September 2014]
Honan, Mat, ‘I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me’,
Wired, 11 August 2014, <http://www.wired.com/2014/08/i-liked-everything-i-saw-on-facebook-for-two-days-heres-what-it-did-to-me/> [accessed 9 November 2014]
‘How Al Qaeda Uses Encryption Post-Snowden (Part 1)’, Recorded Future, 8 May 2014,
<https://www.recordedfuture.com/al-qaeda-encryption-technology-part-1/> [accessed 29
September 2014]
‘Is a Retweet an Endorsement?’, Think Differently, 19 December 2012,
<http://thinkdifferently.ca/differently/is-a-retweet-an-endorsement/> [accessed 29
September 2014]
King, Rachel, ‘Facebook engineers explain News Feed ranking algorithms; more changes
soon’, ZDNet, 6 August 2013, <http://www.zdnet.com/facebook-engineers-explain-news-feed-ranking-algorithms-more-changes-soon-7000018996/> [accessed 29 September 2014]
McGee, Matt, ‘EdgeRank is Dead: Facebook’s News Feed Algorithm Now Has Close to 100k
Weight Factors’, Marketing Land, <http://marketingland.com/edgerank-is-dead-facebooks-news-feed-algorithm-now-has-close-to-100k-weight-factors-55908> [accessed 29
September 2014]
Munro, Dan, ‘Twitter Community #BCSM Expands Online to Broaden Patient Engagement’,
Forbes, 31 March 2013, <http://www.forbes.com/sites/danmunro/2013/03/31/twitter-community-bcsm-expands-online-to-broaden-patient-engagement/> [accessed 29
September 2014]
‘PsyOps and Socialbots’, Infosec Institute, <http://resources.infosecinstitute.com/psyops-and-socialbots/> [accessed 29 September 2014]
‘Reddiquette’, Reddit, <http://www.reddit.com/wiki/reddiquette> [accessed 29 September 2014]
‘reddit’s new comment sorting system’, Reddit Blog, 15 October 2009,
<http://www.redditblog.com/2009/10/reddits-new-comment-sorting-system.html>
[accessed 29 September 2014]
‘Resignation calls over councillor’s pineapple retweets’, BBC News,
<http://www.bbc.co.uk/news/uk-england-stoke-staffordshire-14709241> [accessed 29
September 2014]
‘Retweet the old fashioned way, using “classic” or “traditional” retweets only’, Ray’s 2.0, 3
September 2013, <http://rays20.blogspot.co.uk/2010/06/traditional-retweet-tr-key-to.html> [accessed 29 September 2014]
Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His
Subjects’, Business Insider, 22 July 2014, <http://www.businessinsider.com/tamimi-2014-7>
[accessed 29 September 2014]
Rusli, Evelyn M., ‘Facebook buys Instagram for $1 Billion’, The New York Times, 9 April 2012,
<http://dealbook.nytimes.com/2012/04/09/facebook-buys-instagram-for-1-billion/?_php=true&_type=blogs&_r=0> [accessed 29 September 2014]
‘The Banned #Hashtags of Instagram’, The Data Pack, 26 August 2013,
<http://thedatapack.com/banned-hashtags-instagram/#comment-6156> [accessed 29
September 2014]
‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September
2014]
‘Twitter users forming tribes with own language, tweet analysis shows’, Guardian Data Blog,
<http://www.theguardian.com/news/datablog/2013/mar/15/twitter-users-tribes-language-analysis-tweets> [accessed 29 September 2014]
VICE Staff, ‘We are with John McAfee Right Now, Suckers’, 3 December 2012,
<http://www.vice.com/en_uk/read/we-are-with-john-mcafee-right-now-suckers> [accessed
29 September 2014]
‘Weird Twitter’, Know Your Meme, <http://knowyourmeme.com/memes/weird-twitter>
[accessed 29 September 2014]
‘Why John McAfee Is Paranoid about Mobile’, Dark Reading, 19 August 2014,
<http://www.darkreading.com/informationweek-home/why-john-mcafee-is-paranoid-about-mobile-/a/d-id/1298090> [accessed 29 September 2014]
Social Networks and Other Sites
‘Cytoscape’, <http://www.cytoscape.org/> [accessed 29 September 2014]
‘i2 National Security and Defense Intelligence’, <http://www-03.ibm.com/software/products/en/national-security-defense-intelligence> [accessed 29
September 2014]
‘Instagram’, <www.instagram.com> [accessed 29 September 2014]
‘Jeffrey’s Exif Viewer’, <http://regex.info/exif.cgi> [accessed 29 September 2014]
‘Learnist’, <https://learni.st/explore> [accessed 29 September 2014]
‘Medium’, <https://medium.com/> [accessed 29 September 2014]
‘Palantir’, <https://www.palantir.com/> [accessed 29 September 2014]
‘Silobreaker’, <http://www.silobreaker.com/network-2> [accessed 29 September 2014]
‘The BCSM Community’, #BCSM, <https://www.youtube.com/user/BCSMCommunity>
[accessed 29 September 2014]
‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September
2014]
‘Twitter’, <www.twitter.com> [accessed 29 September 2014]
‘Welcome to the BCSM Community’, #BCSM, <www.bcsmcommunity.org> [accessed 29
September 2014]