The International Centre for Security Analysis
The Policy Institute at King’s
King’s College London

A Structural Analysis of Social Media Networks
A Reference Guide for Analysts & Policymakers

James Barge
Mick Endsor

The International Centre for Security Analysis

The International Centre for Security Analysis (ICSA) is a research unit within the Policy Institute at King’s College London. We carry out multidisciplinary academic and policy research on international security issues. Our research and training currently concentrate on three main areas:

1. The theory and practice of open source intelligence (OSINT) and social media intelligence (SOCMINT);
2. The non-proliferation of weapons of mass destruction (WMD), nuclear safeguards, nuclear security and nuclear terrorism;
3. Regional security issues in North Africa, the Middle East, East Asia and South America.

ICSA aims to be the global academic centre of excellence for open source and social media intelligence. We provide training courses in OSINT and SOCMINT as well as bespoke courses for clients on a range of other topics including: financial open source intelligence (FOSINT), research methodologies, radicalisation and regional security issues.

ICSA also runs ad hoc seminars relating to the centre’s key research interests in OSINT, SOCMINT, nuclear non-proliferation and regional security issues. We are currently running the Grand Strategy Seminar Series analysing the texts of great historical thinkers as they apply to contemporary issues of strategy.

Contact Us:
Website: you can find out more information about the work ICSA carries out on our website: http://www.kcl.ac.uk/sspp/policy-institute/icsa/index.aspx
Blog: you can read our blog here: http://blogs.kcl.ac.uk/icsa/
Twitter: you can follow us here: https://twitter.com/icsa_kings

Executive Summary

In order to develop a useful and efficient approach to social media intelligence, policymakers and analysts must develop a methodological approach.
This document analyses a selection of platforms across the spectrum of social media and provides broad categories to understand their various features and what information analysts can collect from them. The recommendations of this report show that an effective approach to social media intelligence can only be derived from a nuanced understanding of social networks; a dedication to research and development; and an ability to remain on the cutting edge of developing trends.

The purpose of this report is to provide a framework for analysts to improve their understanding of social media. In addition, it is aimed at the often neglected component of intelligence analysis: the managers who task analysts and the policy-makers who consume and act upon intelligence products.

The key conclusions drawn from this report are:

1. Social media must be analysed within the context of an intelligence tasking and information requirements.
2. Funding research into improved social media analytics technologies should be a key consideration for policy-makers.
3. Analysts should track developments in social media networks, technologies and user experiences.
4. Social networks are adapted by users for their own purposes; social network designers attempt to predict user innovations.
5. Ethical and legal due process should be implemented for analysts on social media at every stage of the intelligence cycle.
6. Organisations should decide how best to create and organise social media analysts within existing teams.

This report aims to provide an analysis of social media characterised by breadth and detail. However, the analysis conducted does have some limitations. The lack of foreign language resources has meant that important foreign language social networks have been excluded (although references are made to them throughout the report).
In the absence of a well-developed and rigorous body of academic and scholarly research surrounding the still nascent discipline of social media intelligence, much of the research cited in this report fails to place social media within a specific intelligence context. This report therefore takes this research and contextualises its key findings within the outlined framework to provide an understanding of its potential implications within the discipline of intelligence analysis.

Contents

Introduction
1. Representational Features
2. Community Features
3. Interactional Features
4. Privacy/Accessibility Features
5. Infrastructural Features
Conclusions & Recommendations
Glossary of Technical Terms
Works Cited

List of Figures

Figure 1. The Conversation Prism.
Figure 2. Daily Active Facebook Users by Country/Region.
Figure 3. Millions of Teens Have Abandoned Facebook Since 2011.
Figure 4. Percentage of UK Internet users who use Twitter as of February 2013, by age group.
Figure 5. Distribution of Twitter users worldwide from 2012 to 2018.
Figure 6. Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by generation.
Figure 7. Regional distribution of Instagram traffic in the last three months as of April 2014, by country.
Figure 8. The 24 most active subreddits.
Figure 9. Reddit’s most engaged countries by average page views per visit.
Figure 10. Prediction accuracy of dichotomous traits by examining likes on Facebook.
Figure 11. Prediction accuracy of private traits by examining likes on Facebook.
Figure 12. Twitter ‘tribes’.
Figure 13. A “new” retweet.
Figure 14. A “manual” retweet.
Figure 15. The hoax tweets posted on the Associated Press official Twitter account.
Figure 16. TweetCred ratings displayed on the Reuters Top News official Twitter account.
Figure 17. An example of a user utilising the #richkidsofinstagram hashtag on Twitter.
Figure 18. The EXIF data for the photo that revealed John McAfee’s location.
Figure 19. Social media mobile usage stats.
Figure 20. Instagram’s interface.
Figure 21. Reddit’s front page.
Figure 22. Reddit’s comment interface.

Introduction

Since its emergence as a mass medium of interaction in the early 2000s, social media has quickly become a huge source of valuable information for researchers from all backgrounds. People spend more and more of their lives on platforms such as Facebook, Twitter, Instagram and Reddit (the four platforms analysed in this report). Through social media, the web has become a place where users represent themselves, interact in thousands of different ways and constantly produce and consume information.

From an intelligence perspective, social media has the potential to be incredibly useful. If a user Tweets in reaction to a political news story, it can provide information about their political beliefs.
If members of the Islamic State of Iraq and al-Sham (ISIS) run recruitment campaigns through Instagram or Twitter accounts, they may carelessly reveal, and indeed have revealed, the location of ordnance, training camps and other useful intelligence. The more our lives become connected to the Internet, the more useful information is available.

Social media intelligence, or SOCMINT as it has been termed,1 is the newest member of the intelligence family, emerging out of open source intelligence (OSINT).2 SOCMINT deals specifically with intelligence that is a product of social media data and information. However, although the potential benefits of social media are often extolled, intelligence analysts and professionals often criticise the absence of a strategy, doctrine or best practice to determine how best to extract, analyse and utilise SOCMINT.

The changing nature of the social media landscape means that a dynamic approach is needed: analysts must be able to adapt to new platforms, to updates to current sites and to changing social media culture amongst users. This means that there is no single set approach for dealing with social media; it must be a custom approach that is defined by context and by fundamental questions: On what platform is the interaction taking place? What is the nature of the interaction? What ultimately do I want to find out?

The purpose of this report is to provide a guide to help intelligence analysts better understand social media. It is not intended as an exhaustive account of every type of information that can be directly known or inferred. Instead, it attempts to give a representative overview of some of the most important features from an intelligence standpoint. The methodology that has been chosen splits features of social media into five categories: representational features, community features, interactional features, privacy features and infrastructural features.
1 Omand, Sir David, Bartlett, Jamie, Miller, Carl, ‘Introducing Social Media Intelligence (SOCMINT)’, Intelligence and International Security, 2012 <http://www.academia.edu/1990345/Introducing_Social_Media_Intelligence_SOCMINT_> [accessed 29 September 2014].
2 There is no consensus on whether or not SOCMINT is considered a separate or sub-discipline of OSINT. Certain SOCMINT practitioners may choose to access social media information that is considered private, which would not fall within the remit of OSINT.

Representational Features
• How users create and maintain versions of their identity online, for example how a detailed Facebook profile would affect the content of individual posts.

Community Features
• How social media communities differ, how to classify social media communities and the relationships between real-world groups and their online counterparts.

Interactional Features
• How and why users utilise different communicative features across platforms (such as retweets, likes etc.).

Privacy Features
• The privacy aspects of social media networks, including perceived vs actual privacy; metadata; and the impact of the Snowden revelations on user behaviour.

Infrastructural Features
• Elements of the formal structure of social media platforms; third party software; ranking algorithms and other crucial, unseen and often poorly understood features.

The social media platforms that have been looked at specifically for this report are Facebook, Twitter, Instagram and Reddit. Whilst this is a distinctly western perspective, these networks provide a reasonable overview of the different types of social media networks. This report aims to provide a framework with which alternative social media platforms can be analysed and examined.
If an analyst is presented with a new, previously unseen platform, they will be able to use the features discussed in this report as a starting point for understanding what information and insights can be gained from it.

Facebook represents a ‘classic’ form of social media that requires a detailed profile and seeks to help users maintain a network of real-world relationships. Twitter is the archetypal ‘microblogging’ platform, where a user’s online activity is limited to short, periodic posts. Instagram is a minimal, mobile and visual photo sharing platform. These three factors are increasingly important in emerging social media platforms like Snapchat. Reddit primarily functions as a link aggregator and message board. It currently stands as one of the largest and most complex communities on the Internet. It is the perfect example of how the structure of a website influences the behaviour of its users, which is the basis for one of the main points of this report.

This report is designed not just to be read from front to back; it is equally a reference guide, which can be navigated by “ctrl-clicking” the key policy recommendations included within the table of contents at the beginning of each chapter.

Figure 1. The Conversation Prism. A graphic displaying the range of websites across the spectrum of social media.3

3 Brian Solis and JESS3, ‘The Conversation Prism’, www.conversationprism.com, 2014 [accessed 29 September 2014].

Facebook

Employees: 7,185
Launched: February 2004
Monthly active users: 1.32 billion
Mobile monthly active users: 1.07 billion
Total number of minutes spent on Facebook each month: 640,000,000
Average time spent on Facebook per visit: 18 minutes
Total number of Facebook pages: 54,200,000
Number of languages supported: 70
Number of fake profiles: 81,000,000
Average number of Friends per user: 130
Every 20 minutes: 1 million links shared, 2 million friends requested and 3 million messages sent.
Overview: Facebook is a social networking site started by students at Harvard University as a means to improve communication between students. Facebook gradually expanded to support students from other universities in the United States and Canada. By 2006 Facebook was globally available.

Facebook primarily functions as a place for people to connect with their real life friends. Users create detailed profiles, which include information about their work and education, location, contact information, family and relationships, favourite quotations, sexual orientation, and political and religious views. However, Facebook does not demand that users share all this information; the decision is left to the discretion of the user.

Users can interact on Facebook in a number of ways. Privately, users can send each other instant messages. Publicly, users can post on each other’s walls (with text, photos, videos or external links) and comment on wall posts. Users can ‘like’ posts and comments and share each other’s content.

Facebook has an increasingly popular community of groups, which are based around interests ranging from interesting Wikipedia articles to vintage trainers. These should not be confused with Facebook ‘Fan Pages’, which allow users to connect with brands and people they are interested in, in a similar fashion to following a user on Twitter or Instagram.

Figure 2 – Daily Active Facebook Users by Country/Region.4
UK: 24m
Asia: 228m
US & Canada: 152m
Europe: 206m

Figure 3 – Millions of Teens Have Abandoned Facebook since 2011.5

4 ‘Daily Active Facebook Users by Country/Region’, The International Centre for Security Analysis and Facebook, <www.facebook.com>, 2014 [accessed 29 September 2014].
5 ‘Millions of Teens Have Abandoned Facebook Since 2011’, Statista, <http://www.statista.com/chart/1789/facebook-s-teenager-problem/>, [accessed 29 September 2014].
Twitter

Launched: July 2006
Employees: 3,300
Monthly active users: 271 million
Tweets sent per day: 500 million
Number of Tweets per second: 9,100
Percentage of users active on mobile: 78%
Percentage of accounts outside US: 77%
Number of languages supported: 35+
Percentage of users characterised as ‘lurkers’ (people who watch but don’t contribute): 40%
Annual net income: 645.32 million

Overview: Twitter is a social networking service that allows users to read and send 140-character microblogs, or ‘Tweets’. Users post Tweets to their profile, which can be viewed by anyone with or without a Twitter account. Users ‘follow’ each other on Twitter, which enables them to view Tweets on their home tab. However, following is not reciprocal (as becoming friends is on Facebook): it is possible for a user to follow another and for that user not to follow them back. This has allowed Twitter to become the preferred platform for celebrities, brands and organisations to communicate with fans and followers.

Twitter’s defining feature is the 140-character limit that is applied to each Tweet. The thinking behind this is that it forces users to condense what they want to say, which makes Tweets more digestible for followers. In contrast, the traditional blogging format allows posts to be of any length.

Twitter is probably the most useful SOCMINT tool at the analyst’s disposal. There are a number of reasons for this. Twitter is a fast, responsive and public medium that groups posts together by hashtag, user, location and date. It has also become the go-to tool for spreading news and information fast, playing a pivotal role in events such as the Arab Spring and the Islamic State of Iraq and al-Sham’s (ISIS’s) recruitment drive and propagandising.
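As an illustration of the grouping by hashtag, user, location and date described above, the following is a minimal sketch of how an analyst might index a small set of already-collected posts. The record layout (`id`, `user`, `location`, `date`, `text`), the function name `index_posts` and the sample posts are all assumptions invented for this example; the sketch does not call Twitter’s API and does not reflect how Twitter itself stores or groups Tweets.

```python
import re
from collections import defaultdict

# Simple pattern for hashtags: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")

# Hypothetical sample records, invented purely for illustration.
SAMPLE_POSTS = [
    {"id": 1, "user": "alice", "location": "London", "date": "2014-09-01",
     "text": "Polls open today #Election #vote"},
    {"id": 2, "user": "bob", "location": "Leeds", "date": "2014-09-01",
     "text": "Traffic is terrible this morning"},
    {"id": 3, "user": "alice", "location": "London", "date": "2014-09-02",
     "text": "Results are in #election"},
]


def index_posts(posts):
    """Group post ids by hashtag, user, location and date."""
    index = {
        "hashtag": defaultdict(list),
        "user": defaultdict(list),
        "location": defaultdict(list),
        "date": defaultdict(list),
    }
    for post in posts:
        # Lower-case and de-duplicate hashtags so '#Election' and
        # '#election' fall into the same group.
        tags = sorted({tag.lower() for tag in HASHTAG_RE.findall(post["text"])})
        for tag in tags:
            index["hashtag"][tag].append(post["id"])
        index["user"][post["user"]].append(post["id"])
        index["location"][post["location"]].append(post["id"])
        index["date"][post["date"]].append(post["id"])
    return index


if __name__ == "__main__":
    idx = index_posts(SAMPLE_POSTS)
    for key in ("hashtag", "user", "location", "date"):
        print(key, dict(idx[key]))
```

An index of this kind lets the same collection be read in several ways, for example every post under one hashtag or every post by one user on a given date, which is the property that makes this sort of public, structured data convenient to analyse.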
Twitter also enables access to its streaming application programming interfaces (APIs), with different levels granted for different contracts and prices.6

6 However, full ‘firehose’ access is currently only being granted to Sysomos, Yandex and Dataminr.

Figure 4 - Percentage of UK Internet users who use Twitter as of February 2013, by age group.7

Figure 5 - Distribution of Twitter users worldwide 2012 to 2018 [forecast] by region.8

7 ‘Percentage of UK Internet users who use Twitter as of February 2013, by age group’, Statista, <http://www.statista.com/statistics/257429/share-of-uk-internet-users-who-use-twitter-by-age-group> [accessed 29 September 2014].
8 ‘Distribution of Twitter users worldwide from 2012 to 2018’, Statista, <http://www.statista.com/statistics/303684/regional-twitter-user-distribution/> [accessed 29 September 2014].

Instagram

Launched: October 2010
Amount paid by Facebook for purchase of Instagram: $715 million (+ stock options)
Monthly active users: 200 million
Daily active users: 75 million
Number of photos shared (as of 26/3/14): 20 billion
Percentage of internet users that use Instagram: 13%
Percentage of US teens and millennials (14-34) that use Instagram: 34%
Most liked photo on Instagram: A photo posted by Kim Kardashian of her wedding to Kanye West9

Overview: Instagram is a photo and video sharing service that is primarily mobile-based. Users can take square-format photos, add a range of filters (sepia, cross-processed, black and white etc.) and share them to Instagram, Facebook, Twitter, Tumblr, Flickr or Foursquare. In 2012 Instagram was acquired by Facebook. To date there has been minimal integration of the two platforms, with Facebook ‘committed to building and growing Instagram independently.’10

Instagram introduced the “explore” tab in 2012. This feature presents the user with a refreshable selection of 21 photos that Instagram believes they would be interested in.
This has since become a popular way for people to engage with new content and presents a challenge for marketers trying to ‘game’ the explore page.

There are a number of factors that make Instagram distinctive and noteworthy. Amongst these is the fact that it is one of the first genuinely popular social networks to come from a mobile application. Instagram has been resistant to expanding support to a browser-based application, only facilitating this in 2013 with a service that simply mimics the mobile version with no additional benefits.

Instagram has expanded on Twitter’s use of hashtags as its primary method to categorise posts and form the basis for communities. Hashtags are probably more important on Instagram than on any other platform: as well as providing their usual function of categorising posts, they are also the basis for a great number of communities, whose members tag posts with hashtags to signify their place within the group.

9 http://instagram.com/p/ogSSO6uS9C/ [accessed 29 September 2014].
10 Rusli, M. Evelyn, ‘Facebook buys Instagram for $1 Billion’, The New York Times, April 2012, <http://dealbook.nytimes.com/2012/04/09/facebook-buys-instagram-for-1billion/?_php=true&_type=blogs&_r=0> [accessed 29 September 2014].

Figure 6 – Growth of Instagram usage worldwide from 4th quarter 2013 to 1st quarter 2014 by generation.11

Figure 7 – Regional distribution of Instagram traffic in the last three months as of April 2014, by country.12

11 ‘Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by generation’, Statista, <http://www.statista.com/statistics/307026/growth-of-instagram-usage-worldwide/> [accessed 29 September 2014].
12 ‘Regional distribution of Instagram traffic in the last three months as of April 2014, by country’, Statista, <http://www.statista.com/statistics/272933/distribution-of-instagram-traffic-by-country/> [accessed 29 September 2014].
Reddit

Founded: June 2005
Employees: 51
Unique monthly visitors: 114.5 million
Proportion of U.S. online adults that visit Reddit: 6%
Largest demographic: 18-29 year old males
Number of subreddits: 476,720 (5,400 active13)
Top 10 subreddits: r/announcements, r/funny, r/pics, r/AskReddit, r/todayilearned, r/worldnews, r/blog, r/science, r/IAmA, r/videos
Number of monthly Reddit pageviews: 5.2 billion

Overview: Designed as a social news site based around aggregation algorithms, Reddit allows users to create anonymous accounts and post two kinds of content: text posts and links. The Reddit community votes on posts with either “upvotes” or “downvotes”; the ratio of upvotes to downvotes, combined with the ranking algorithm, decides the order and visibility of posts. Users are given “karma” for the number of upvotes they receive, which is reduced by downvotes. Karma has little practical use, but it can increase a user’s social capital within communities as well as bestowing some small measurable benefits, such as allowing users to post more frequently.

Reddit is divided up into “subreddits”, which are individual forums within Reddit that are grouped by a similar interest or topic. Users can subscribe to and unsubscribe from whichever subreddits they wish.

This report will focus largely on the features of Reddit that relate to its status as an online community base. As opposed to the other forms of social media discussed here, Reddit has a strong sense of self-identity, which defines the community on Reddit in relation to the rest of the Internet. This has caused rivalries between Reddit and other broad communities online including 4chan, Tumblr and 9Gag. Reddit’s social element is not designed as something to build and maintain relationships with others, but as a centre for discussion of the content that is posted. The culture on Reddit is one that values long and detailed contributions as well as certain brands of humour, memes and in-jokes.
In addition, the vote system means that users’ content is only visible if it receives democratic approval from the Reddit community. This means that users attempt (probably above all else) to pander to the Reddit community’s ideology and attitudes.

13 Where active is defined as subreddits that had at least 5 posts or comments in the past day.

Figure 8 - 24 most active subreddits.14 This represents how Reddit posts would be shared if the 24 most popular subreddits were the only existing subreddits.

Figure 9. Reddit’s most engaged countries by average page views per visit.15, 16

14 Olson, Randal, ‘Most Active Subreddits’, 2013, <http://www.randalolson.com/> [accessed 29 September 2014].
15 ‘Most addicted/engaged countries by avg. pageviews per visit.’, Reddit Blog, 2011, <http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29 September 2014].
16 ‘Which Cities & Countries Have the Most reddit Addicts?’, Reddit Blog, 2011, <http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29 September 2014].

1. Representational Features

Information found in the profiles of users determines facts about them; studying their periodic updates may help determine complex psychological traits.
It is unclear whether users of social media believe that their online presence is an accurate representation of themselves.
When users interact as part of a group, they often tend to behave in accordance with group norms, rather than personal attitudes.
Analysts should be aware of other factors that affect self-representation on social media.
Information found in the profiles of users determines facts about them; studying their periodic updates may help determine complex psychological traits.

Firstly, it must be established that it is all but impossible to be active on social media and to maintain full anonymity. Why is this? If we take the definition of anonymity given by Lapidot-Lefler and Barak as something which points to the ‘unidentifiability aspect... rather than namelessness’ of a person, then it becomes clearer.17 Anonymity is not simply narrowly defined by unknown names but rather concerns the broader inability to identify individuals. With this distinction in mind, it is clear that any contributions users make to social media (filling in facts in their profile, offering opinions in tweets or adding hashtags to Instagram photos) can increase identifiability and decrease anonymity.

For example, there are certain psychological facts about a person that can be known from a single tweet, although these “facts” may often be little more than observing that “X is the sort of person that is not averse to posting a single tweet on Twitter”. However, this does provide some information about an individual that can contribute to the identification of that person. Crucially, monitoring an individual’s public activity on social media networks over time may offer much greater insights for the analyst and undermine the anonymity of the individual. In sum, any information a person provides within the context of social media gives some information about that person and can be used as a tool to identify them. This is crucial in understanding the different sorts of information that the analyst can collect or infer across the spectrum of social media.

We can now look at the various ways users represent themselves within social media, and how these representations can be analysed to gain actionable information. Broadly, across the spectrum of social media, we can split information about users into two categories:

1. Users fill out information about themselves in the creation of their social media profiles.
2. Users contribute information about themselves by updating their social media periodically.

The first of these is an explicit and conscious contribution to self-representation online. Users are presented with an interface which asks them to fill out specific facts about themselves. We see this most extensively on Facebook (as well as online dating sites), which asks users to give extensive psychological, historical and geographical details. These include information about our political views, sexual orientation and things that we “like” as well as details of our current and previous employment, pre-Facebook past and family members.18

For SOCMINT analysts, the profile details of individuals can be extremely useful. However, these details can be influenced heavily by users’ awareness that, when they fill out information in their profile, they are giving an online representation of themselves. In general, once we are aware that information we are providing will be used as a representation of ourselves, we have a substantial motivation to manipulate this information to give a representation that is more in line with how we want to be seen, rather than how we actually are. For example, someone who is explicitly asked to describe their political views may give a less accurate account of their genuine opinions than could be inferred from observing their political discussions over time.

17 Lapidot-Lefler, Noam and Barak, Azy, ‘Effects of anonymity, invisibility, and lack of eye-contact on toxic online disinhibition’, Computers in Human Behaviour, Vol. 28, No.2, 2012, pp. 434-443.

This can be contrasted with the second way in which users contribute information about themselves: periodic social media updates in the form of Facebook statuses, wall posts, tweets, Instagram photos, and Reddit posts and replies.
When users are contributing to social media in these myriad ways, it is not solely in the context of creating an idealised representation of their identity online, although this is a potentially substantial motivation. Rather, their contributions are also defined by their content: sharing a video, writing a message to a friend etc. Social media sites such as Twitter and Instagram have very limited requirements for their profiles; however, they place a stronger emphasis on constant updates than Facebook, VKontakte (Facebook’s Russian equivalent) and dating sites.

The contrast between a profile-based social media interface and an update-based one is important. The latter encourages users to contribute information which could be more useful for SOCMINT analysts. Social media updates are generally time and location specific; they implicitly demonstrate opinions and communicate conversationally with other users. Furthermore, research has shown that the sorts of facts contained in detailed profiles can also be found out (to some extent) by studying specific types of updates. In the case study below, researchers analysed the Facebook likes of approximately 58,000 volunteers in an attempt to analyse the type of information that can be extracted from Facebook likes and the inferences that could be made from this data.19

18 We are asked to retroactively fill out life events starting from our birth and including important events like weddings, graduations, moving house etc.
19 Kosinski, Michael, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.

Figure 10 - Prediction accuracy of dichotomous traits by examining likes on Facebook, expressed by the area under curve (AUC).20 The study used a sample of 58,466 volunteers in the United States using the myPersonality Facebook application.
There was an average of 170 likes per person.

Figure 10 shows that the researchers were able to predict dichotomous traits with a high degree of accuracy based purely on Facebook likes. The most accurate predictions were made for Caucasian versus African American (95%) and gender (93%). However, the model was less successful for other variables such as parents together at 21 (60%) and uses drugs (65%). Because this aspect of the study focused on dichotomous variables, random guessing would average a 50% success rate. For all of the variables above, the model using Facebook likes performed better than random guessing.

20 Kosinski, Michael, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.

Clearly, the character of updates on social media can take different forms, revealing different things about users. On Twitter, tweets are up to 140 characters long and usually demonstrate an opinion, view or piece of information, as well as sometimes communicating with others via hashtags. Instagram has image-based updates that tend to inform us about the location of a user and what they are doing.

Perhaps most interestingly for SOCMINT analysts, a number of studies have demonstrated how personality traits can be predicted with ‘reasonable precision’ by studying social media updates.21 Initial research in the field has focussed largely on the “big 5” personality traits: agreeableness, conscientiousness, extraversion, neuroticism and openness.22,23 However, more recently there have been some interesting attempts to ascertain “The Dark Triad” personality traits by conducting a linguistic analysis of tweets.
The Dark Triad personality traits are psychopathy, Machiavellianism and narcissism and ‘all focus, to varying degrees on social malevolence, self-promotion, emotional coldness, duplicity and aggressiveness.’24 Whilst the prediction accuracy for these studies is as yet generally poor (see figure 11 below), they are hopefully a precursor to more successful attempts. Analysts should be aware of developments in this field; the value of a reliable model of personality analysis cannot be overstated.

As figure 11 shows, the predictive accuracy of Facebook likes was much worse for deeper psychological traits than for the more basic dichotomous variables from the same study (see figure 10 above). As the authors note, psychological traits are examples of “latent traits” which cannot be directly measured. Approximate, albeit imperfect, measurement was provided by responses to questionnaires. In general, the study showed that Facebook likes were a poor predictor of latent traits, with prediction accuracies generally around half the questionnaires’ test-retest reliabilities. This reveals the difficulties of using one specific type of social media interaction to predict deeper psychological traits. More importantly, Facebook likes are perhaps an inherently poor predictor of such traits, as they provide only limited information. Analysing actual user posts, such as Facebook posts or tweets, may prove to be a more effective approach for analysts interested in gauging latent traits of individuals or groups.

It is interesting to note that dating websites have almost entirely circumvented this problem by asking users to supply deeper psychological information when they sign up. Indeed, many of these dating sites explicitly sell their prospects of successfully finding a match on the basis of pairing individuals with similar psychological profiles.
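The two accuracy measures reported in Figures 10 and 11, the area under the curve (AUC) for dichotomous traits and the Pearson correlation coefficient for latent traits, can be illustrated with a minimal sketch. The function names and all of the numbers below are illustrative assumptions; they are not data or code from the study. The sketch is only meant to show why an AUC of 0.5 corresponds to random guessing and how a correlation between predicted and questionnaire scores is obtained.

```python
from math import sqrt


def auc(labels, scores):
    """Area under the ROC curve for a dichotomous trait.

    Computed as the share of (positive, negative) pairs in which the
    positive case received the higher score; ties count as half a win.
    A value of 0.5 is what random guessing would achieve on average.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(predicted, actual):
    """Pearson correlation between predicted and measured trait scores."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in predicted))
    sd_a = sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (sd_p * sd_a)


# Hypothetical model outputs for four users (1 = trait present).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.1]
print("AUC:", auc(toy_labels, toy_scores))

# Hypothetical predicted versus questionnaire scores for a latent trait.
print("r:", pearson_r([2.0, 3.5, 3.0, 4.5], [2.5, 3.0, 3.5, 4.0]))
```

Under this reading, a dichotomous trait with an AUC close to 1 is one where the model almost always ranks a true case above a false one, whereas a latent trait is judged by how closely predicted scores track questionnaire scores, with the questionnaire's own test-retest reliability acting as the practical ceiling.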
21 Bai, Shoutian, Zhu, Tingshao and Cheng, Li, ‘Big-Five Personality Prediction Based on User Behaviors at Social Network Sites’, eprint arXiv:1204.4809, 2010.
22 Ibid.
23 Kosinski, Michal, Stillwell, David and Graepel, Thore (2013), ‘Private traits and attributes are predictable from digital records of human behavior’, Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.
24 Paulhus, D. L. and Williams, K., ‘The Dark Triad of personality: Narcissism, Machiavellianism and Psychopathy’, Journal of Research in Personality, Vol. 36, 2002, pp. 556-563.

Figure 11 - Prediction accuracy of private traits by examining likes on Facebook.25 Predictions expressed by the Pearson correlation coefficient between predicted and actual attribute values at the P < 0.001 level. The transparent bars indicate the baseline accuracy of the questionnaire expressed as test-retest reliability.

As well as containing information that can suggest personality traits, social media updates provide factual information that is location and time specific, interactional and contextualised. Additionally, users are generally less likely to regard updates as pieces of information that represent their identities. Often they are perceived by users to be more ephemeral than information given in a profile. Updates are usually made in response to an external stimulus: a reaction to a news story, a reply to a friend etc. This leads to potentially less consideration on the part of the user and an increased probability of useful information being revealed, to the benefit of the SOCMINT analyst.
Researchers have also noted that ‘while users may be careful about the content they post to Twitter, the words they use may reveal more about their personalities than they would wish.’26 This indicates that it is linguistic analysis of data that can prove the most insightful, rather than analysis of the content of tweets. However, this will depend entirely on the context of the update we are looking at. A succinct and clearly expressed opinion on social media is likely to be less linguistically useful than a lengthy, vague comment, which will have more linguistically interesting features. The specific social media medium will also have a significant impact. Twitter has a 140 character limit on tweets, meaning that users have to adapt their natural linguistic style to fit their message into tweets. By contrast, Reddit communities encourage posts and replies to be lengthy, interesting and watertight. This seems to have the opposite effect on users, with the language used becoming specialised, verbose and complicated. In both these cases it is generally not the user’s natural language style that is being represented, but a version of it mediated through the implicit and explicit restrictions imposed by the platform and its users.27

25 Kosinski, Michal, Stillwell, David and Graepel, Thore (2013), ‘Private traits and attributes are predictable from digital records of human behavior’, Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.
26 Sumner, Chris et al., ‘Predicting Dark Triad Personality Traits from Twitter usage and a linguistic analysis of Tweets’, Proceedings of the IEEE 11th International Conference on Machine Learning and Applications (ICMLA 2012), 2012.

So, as we have seen, there are two broad categories into which we can place social media data. The first is data gained from the profiles of users, and the second is information gained from the updates of users.
The first is primarily useful for gathering any facts that are included in profiles: things like location, age, marital status and sometimes sexual orientation and political views.28 The second can also be used to gather this kind of information, but only through inference. Inferential information from social media updates is likely to be more useful when determining more subjective personality traits. However, this may require tools to aid the analyst or a sophisticated understanding of the individual in question and the social media platform used to express opinions. To analogise, the first is the sort of information a psychotherapist might ask a patient about directly, and the second is the sort that might be inferred whilst a patient discusses their thoughts and feelings. Both are of use to the analyst; but, as always, it depends entirely on the purpose of the intelligence collection and the analytical context.

It is unclear whether users of social media believe that their online presence is an accurate representation of themselves

There is a second-order consideration to be made concerning the factors discussed in the previous section. The different ways in which users represent themselves on social media must be viewed alongside social media users’ own perception of their self-representation. The important question to ask is: to what extent does this particular social media interface give users the impression that their online avatar is an accurate representation of them? On social media where users believe their online profile is an accurate self-representation, we would (prima facie) expect them to behave in a way that genuinely displays their real thoughts and attitudes.

27 Although, as will become a running theme in this paper, there is always something interesting to be learned when users are forced to adapt to limitations. It all depends on the intelligence aims of the SOCMINT analyst.
28 Although it has been pointed out that updates can also indicate some of this information (see Figure 10).

Perhaps surprisingly, there is a general absence of relevant academic studies that deal with the accuracy of self-representation in social media. Instead, the majority of existing research focusses on the impact of social media on user self-esteem. Given this gap, we propose that further research into user perception of self-representation in social media would be beneficial. Studies in this area would provide further evidence to support some of the claims made in this section. They would also provide useful insight into some of the more specific, contextual aspects of representation in social media. An investigation into how identity and representation are affected by age, sex or location, for example, could be extremely useful for the social media analyst.

When users interact as part of a group, they often tend to behave in accordance with group norms, rather than personal attitudes

Self-representation on social media networks is also often strongly linked to communities. Across the spectrum of social networks we see substantial differences in the sorts of communities that form within them. Sites like Reddit and Facebook have significant infrastructural support for groups built into their website architecture. Reddit is entirely formed of “subreddits”: forums that are dedicated to discussion of particular topics such as a football team, political party, ideology or funny videos. Almost every conceivable interest and subculture is covered by a subreddit. Similarly, Facebook has increasingly become a medium by which people connect over topics in a similar way, with a much more supportive “group” function than was previously seen on the site. As you would expect, where site infrastructure has stronger support for groups, we see increased overall strength in the groups themselves.
They tend to have a more robust sense of identity as well as sometimes being insular and hostile to outsiders, with the use of in-jokes, memes and circular references to reinforce group identity.29

There has been some substantial psychological research into changes in user behaviour when interacting as part of an online group or community. Specifically, the Social Identity model of Deindividuation Effects (SIDE model) was developed by researchers seeking to describe the social effects of computer-mediated communication (CMC)30 and has more recently been applied to user activities on social media sites that have a strong community or group presence.31 For our purposes we can see the SIDE model as something which explains why people on the Internet behave differently when they are interacting within a group. Specifically, SIDE points to a correlation between the strength of group identity and the likelihood of an individual’s actions demonstrating the perceived attitudes of the group, rather than their own individual beliefs and attitudes.

29 A bizarre example of this is the subreddit: www.reddit.com/r/montageparodies (discretion advised).
30 Chan, Michael, ‘The Impact of Email on Collective Action: A field application of the SIDE model’, New Media and Society, Vol. 12, No. 8, 2010, pp. 1313-1330.
31 Suler, John, ‘The Online Disinhibition Effect’, CyberPsychology & Behavior, Vol. 7, No. 3, 2004, pp. 321-326.

Central to SIDE’s perspective is the idea that a significant portion of an individual’s self-concept is formed in terms of social categories and group membership. These social categories often bring with them a set of norms and attitudes which differ from those of any given individual within the group. SIDE’s proposal is that an individual’s behaviour largely depends on whether personal identity or social identity is salient at a particular time.
So, applying the SIDE model, when group identity is more salient than personal identity, individuals will act and represent themselves in ways that reflect the group identity rather than their own. Crucially, SIDE picks out two specific features of social media interaction that amplify the saliency of social identity over personal identity: physical isolation and visual anonymity.

As previously mentioned, the largest swing towards group identity is seen on social media networks that have a strong sense of group identity, Reddit for instance. Reddit’s age and popularity have created a situation wherein there is a very clear concept of what it is to be a “Redditor”. Traits generally included within the self-perceived redditor identity are things such as being a skeptic, atheist, scientifically-minded, cultured, intellectual, liberal, gamer, liking cats, white, and pro-legalisation of drugs.32 We cannot say the same thing about Facebook and Twitter; there is not such a strong concept of the archetypal “Facebooker” or “Tweeter” as there is of a “Redditor”.33 As well as this, Reddit’s division into subreddits means that each subreddit also has its own sub-identity. For instance, frequent posters on /r/atheism have a strong sense of self-identity that differs from that on /r/bassguitar, but both still conform somewhat to the umbrella “redditor” identity.

Social media’s tendency to engineer conditions that produce these sorts of complex, compound identities presents a challenge for the analyst. Examining an individual’s activity across different groups on social media will likely present variations of that individual mediated through different social groups. SOCMINT analysts must therefore be aware of contextualised self-identity and understand how this affects the information collected. Given that social media can present these sorts of complex, compound social identities, a careful assessment of the relevant social media groups is necessary.
If an analyst is tasked with analysing a specific group’s members, community and interactions on social media, an awareness of complex compound social identities is critical. In addition, this will help analysts to avoid incorrectly attributing beliefs, values and opinions to particular individuals when they really represent the group identity.

So, from the analyst’s perspective, we should be cautious when examining the behaviour of individuals within the context of a strong group environment. As research has shown, the behaviour of individuals can change greatly within these contexts to increasingly reflect the perceived attitudes of the group. Of course, once we have made the necessary allowance for these biases, the data can be of use in other ways; for instance, it can help us define exactly what the norms, attitudes and beliefs of the group are.

32 From reading the Reddit post: “what would the stereotypical redditor hate about you?” <http://www.reddit.com/r/AskReddit/comments/1vzeru/what_would_the_stereotypical_redditor_hate_about/?sort=top> [accessed 29 September 2014].
33 A possible explanation of this is the fact that communities on Reddit are generally formed only online and do not reflect networks and relationships in the real world, as those on Facebook and Twitter have a higher tendency to. This will be explored in more detail in the following section.

Analysts should be aware of other factors that affect self-representation on social media

The above section describes how self-representation in social media is affected by interaction within groups. An exhaustive list of every other way representation can be affected would be the subject of a much longer report or research designed specifically to investigate the fluid nature of self-representation on social media.
However, notable examples include users using social media as a professional advertisement (a sort of dynamic online resume such as that seen on LinkedIn), or users creating profiles on social media specifically to improve their relationship with certain groups or individuals.

A relevant and interesting example of the latter is the case of Aymenn Jawad Al-Tamimi. Al-Tamimi is a rising terrorism analyst who, from mid-2013 until summer 2014, adopted a “Jihadi-Persona”34 on Twitter in order to garner information about ISIS.35 He has become an accepted public authority on ISIS,36 gaining citations in major news outlets such as the New York Times, Wall Street Journal and Washington Post as well as initially being invited to contribute to the popular investigative website Bellingcat. Al-Tamimi’s methods included becoming very close with some ISIS supporters, referring to them as “akhi” (brother) and expressing distress upon hearing they were killed in conflict. Al-Tamimi has maintained on his blog that this persona was only used to gain the best quality intelligence and that he is in no way an ISIS supporter himself.37 On the other hand, organisations such as Business Insider have referenced individuals who have disputed this claim.38 Either way, one version of Al-Tamimi’s Twitter presence gives a false impression: he is being his true self either when showing ISIS sympathies or when providing western news outlets with information that is harmful to ISIS.

34 Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, July 22 2014 <http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29 September 2014].
35 He boasts on his blog that ISIS contacts enabled him to identify the Moroccan ex-Guantanamo Bay inmate Mohammed Mizouz in Syria.

In more general terms, Al-Tamimi’s methods have been criticised for being unethical, notwithstanding his possible ISIS allegiance which he has emphatically denied. The ethical
position on the creation of false profiles, or “sockpuppets”, on social media for research purposes is a hazy area. Certainly major governmental organisations will not publicly condone the use of false profiles, as the public backlash alone would be potentially politically damaging. This report does not condone the use of false profiles or personas for research purposes; however, it does intend to highlight the need for further study into the ethical implications of creating sockpuppets. We should remember that some of the information provided by Al-Tamimi has proved incredibly informative and useful in understanding jihadism in Syria, Iraq and the wider Middle East.

36 Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’, Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September 2014].
37 Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, July 22 2014 <http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29 September 2014].
38 Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’, Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September 2014].

Approaching the problem from a different perspective, analysts at the UK think-tank Demos have also pointed out that the creation of false profiles poses a challenge for analysts to overcome.39 They cite the extraordinary case of the “Syrian-American Lesbian” Amina Abdallah Arraf al-Omari, who allegedly ran the blog “A Gay Girl in Damascus”. It later transpired that the blog was run by a PhD candidate at Edinburgh University. The motivations for creating sockpuppets are obvious. They allow users to interact with people they otherwise wouldn’t; pass on false or misleading information; and access protected websites.
It is reasonable, therefore, for Demos to suggest that ‘any core aspect of any SOCMINT capability will be the ability, both analyst and automated-led, to weed out false and misleading information’.40 To expand on this point, developing an automated method for locating and identifying false profiles on social media seems an incredibly difficult task, especially considering the limited success that current techniques in credibility analysis have achieved. However, any developments in this field would be of great benefit to the SOCMINT community of researchers, analysts and intelligence professionals.

Finally, it is important to note that it is not just individual analysts who have had to address ethical challenges in conducting this type of research, known as netnography. In 2011, US Central Command (CENTCOM) awarded a $2.76m contract to Ntrepid for the creation of an ‘online persona management service’.41 This relatively benign description disguises the fact that Ntrepid was contracted to provide 50 user licences, each allocated 10 false personas, that would enable US personnel to influence online conversations and advance US interests. Not only is there an ethical dimension to this case, there is also the analytical challenge it poses for outside analysts seeking to understand social media communities that may have been infiltrated by governments, militaries or other official agencies.

39 Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’, Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September 2014].
40 Ibid.
41 ‘PsyOps and Socialbots’, Infosec Institute, <http://resources.infosecinstitute.com/psyops-and-socialbots/> [accessed 29 September 2014].

2. Community Features

The importance of understanding communities on social networks
Some social networks create communities that are reflected in reality, others create virtual communities that exist only online
Analysts should be aware of open source and paid-for network visualisation tools
Understanding the evolution of groups on social networks is a necessary skill for SOCMINT professionals
Instagram blocks the use of many hashtags, undermining community structures
Community lifespan on Instagram and Twitter is often related to external events

The importance of understanding communities on social networks

The social aspect of social media implies individuals interacting, forming relationships, creating and joining networks and evolving in reference to each other. This section focuses on the creation and evolution of communities as a result of this social element of social media. “Community” in the context of the Internet doesn’t lend itself to an easy definition: must a community be self-aware? Must it be supported by a platform? Must it share some common interest? To date, none of these questions has been answered in detail. This is fundamentally because the structural aspects of the internet, and especially social media, are so varied that the communities which form may share no apparent similarities. It should be said, however, that a minimal, uncontroversial condition for a community is that some of its members should interact based on shared norms, online cultures or other salient features.

The importance for SOCMINT analysts of having a strong, developed understanding of communities online cannot be stressed enough.
Not only is understanding an individual’s place within a community the key to understanding them, but looking at communities as more than the sum of their individuals provides insight that cannot be found elsewhere. Analysts must be aware of how communities form on social networks, how they utilise features of social networks to maintain and grow their communities, and how these communities relate to groups in other places, on and off the web. This section intends to give an overview of these issues, highlighting important insights into social communities.

There is potentially some significant cross-over between this section and the previous discussion in Representational Features of how individual identity can be subsumed by group identity. We have chosen to discuss these features in the previous section, although they could equally be talked about here. This demonstrates that social networks and their constituent features cannot be analysed in isolation but must be approached coherently and methodically by the analyst.

Some social networks create communities that are reflected in reality, others create virtual communities that exist only online

The first truly popular social network, “Friends Reunited”, was designed to reunite old friends who had fallen out of contact, that is, to augment real-world relationships. Similarly, Facebook (originally intended as a networking tool for students at Harvard) was also designed to augment real-world friendships and to provide a platform to communicate and share media. At the opposite end of the spectrum, sites like Reddit and massive multiplayer online role-playing games (MMORPGs) such as Second Life have minimal integration with real-world friendships. Their community features are focused on creating and maintaining online relationships. Most sites, however (including Facebook in its current form), occupy a place somewhere along this spectrum.
Users on social media such as Twitter, Instagram and LinkedIn tend to communicate with their real-world friends as well as meeting new people online.

It is in the interest of SOCMINT researchers to analyse and understand the relationship between a user’s online relationships and real-world community. A methodology for determining this would include looking at any geolocation information shared between users, investigating interaction history and examining any mutual connections they may have using network analysis.42 The benefits gained from understanding this information can be extensive. As well as being a good starting point for understanding an individual’s network in general, analysing this real-online distinction can provide information about the breadth of influence a user has, details about their location and the sources of their information.

Analysts should be aware of open source and paid-for network visualisation tools.

Social network visualisation applications are an increasingly valuable part of the SOCMINT analyst’s tool-kit. They allow analysts to visualise a network of individuals or groups within social networks. There is a wide range of free and paid-for tools available, which will be utilised by various organisations depending on available resources and requirements. Open source examples include Gephi43 and Cytoscape.44 Popular paid-for solutions include software developed by Silobreaker,45 Palantir46 and IBM.47

Figure 12. A stylised network visualisation of Facebook friends using Gephi visualisation software.48

42 Note, well-cultivated online relationships can often seem indistinguishable from real-world ones; sometimes some deep research is required here.
43 ‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September 2014].
44 ‘Cytoscape’, <http://www.cytoscape.org/> [accessed 29 September 2014].
45 ‘Silobreaker’, <http://www.silobreaker.com/network-2> [accessed 29 September 2014].
46 ‘Palantir’, <https://www.palantir.com/> [accessed 29 September 2014].
47 ‘i2 National Security and Defense Intelligence’, <http://www-03.ibm.com/software/products/en/nationalsecurity-defense-intelligence> [accessed 29 September 2014].
48 GrandJean, Martin, ‘Analyser graphiquement son réseau facebook’, MartinGrandJean, 17 March 2013, <http://www.martingrandjean.ch/analyser-graphiquement-reseau-facebook/> [accessed 29 September 2014].

Understanding the evolution of groups on social networks is a necessary skill for SOCMINT professionals

‘Unfortunately, doing analysis on giant unstructured digital social networks49 turns out to be one of the big challenges of social science research.’50

The above quote, an admission from frustrated PhD student Sebastian Benthall, highlights the unstructured nature of online communities. Whilst it is not immediately clear exactly what the writer is referring to here as “unstructured”, we can see that communities on social media can be unstructured in a number of ways. Firstly, there may be an absence of any formal support for groups to form on a particular platform. Twitter and Instagram, for instance, lack any sort of “group” function of the kind we see in Facebook (groups) and Reddit (subreddits). Secondly, he could be referring to the resistance of Internet-based groups to falling into any particular category. The quote above was actually in response to Benthall’s discovery of what has become known as “Weird Twitter”. Know Your Meme defines Weird Twitter as the following:

…a loosely connected group of Twitter users who are known to experiment with spelling, punctuation and format for humor or poetry.
The style of writing can be considered surrealist by participants in the group, with subject matter ranging from creating absurd scenarios to attempting to describe abstract feelings by choosing words for their “verbal aesthetic appeal.” However, many of the accounts are grouped together by the same desire to reinterpret the “realness” of life in ways people do not always get to experience.51

It is clear that Weird Twitter resists any conventional description; there are many, many users who interact as part of the community who do not have any of the characteristics described above whatsoever. Additionally, Weird Twitter members in general reject the moniker, believing that it belittles them and brings under one name what they see as many different communities. Whilst this is only one particular example of a group on Twitter, it is also somewhat prototypical. Weird Twitter is an example of a “group” on a platform that doesn’t support groups, whilst simultaneously resisting classification as a group itself.52

49 Read: online communities chapter.
50 Benthall, Sebastian, ‘“Weird Twitter” art experiment method notes and observations’, Digifesto, 18 October 2012, <http://digifesto.com/2012/10/18/weird-twitter-art-experiment-method-notes-and-observations/> [accessed 29 September 2014].
51 ‘Weird Twitter’, Know Your Meme, <http://knowyourmeme.com/memes/weird-twitter> [accessed 29 September 2014].
52 Although (with what could be used as an argument against its existence as a community), many members of the Weird Twitter community gladly accept the title, for example <https://twitter.com/BevisSimpson/status/509887014326784000> [accessed 29 September 2014].

However, certain other groups utilise hashtags (#) as a tool to identify themselves as group members, and to flag up their communication. An interesting niche example is #bcsm, an initialism for Breast Cancer Social Media.
The hashtag started as a means for sufferers of breast cancer to communicate on Twitter and has since expanded to become the basis for a dedicated website53 and even a YouTube channel.54 What is most interesting here is that #bcsm comes from use on a platform that doesn’t have dedicated group infrastructure support, but provided the basis for the creation of a website that does. Furthermore, as Dan Munro points out, it is a reversal of the usual protocol of starting with a website and adding a Twitter handle to increase visibility. In this instance the website has come as a result of the hashtag, not the other way around.55 Whilst there have always been breast cancer support groups, the use of a hashtag has allowed one to develop that simultaneously functions as an extension of previous cancer support communities and as a new community in itself. Other examples of communities based around hashtags include the #richkidsofinstagram hashtag (see the discussion of hashtags in interactional features below).

So, at one end of the spectrum we see communities that are so varied and fragmented that they resist the name of “community” altogether. At the other end we can observe communities that are strongly unified and self-identifying, even when there is no formal site infrastructure to support them.

Perhaps as a response to some of these difficulties, researchers at Royal Holloway and Princeton have approached communities on Twitter in a different way, choosing to classify groups by similar language.56 The researchers studied word usage in a weighted network of approximately 189,000 nodes (corresponding to users) from a sample of 250,000 Twitter users. By grouping sets of tweets together by common key-word use, researchers were able to identify groups and communities with surprising precision. For example “pln”, “edtech” and “edublogs” are words used by the community “teachers who often talk about technology”.
The largest identified group was African Americans using the words “N**ga”, “poppin” and “chillin”. Researchers on the project described interesting findings, such as the fact that groups have something like regional accents, in that they commonly misspell words in the same way. Justin Bieber fans have collectively developed a habit of adding “ee” as a suffix to words, turning “please” into “pleasee”.57 With sufficient data, the researchers claim they would be able to predict community membership with 80% accuracy.58

53 ‘Welcome to the BCSM Community’, #BCSM, <www.bcsmcommunity.org> [accessed 29 September 2014].
54 ‘The BCSM Community’, #BCSM, <https://www.youtube.com/user/BCSMCommunity> [accessed 29 September 2014].
55 Munro, Dan, ‘Twitter Community #BCSM Expands Online to Broaden Patient Engagement’, Forbes, 31 March 2013, <http://www.forbes.com/sites/danmunro/2013/03/31/twitter-community-bcsm-expands-online-tobroaden-patient-engagement/> [accessed 29 September 2014].
56 Bryden, John and Funk, Sebastian, ‘Word usage mirrors community structure in the online social network Twitter’, EPJ Data Science, Vol. 2, No. 3, 2013, <http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014].
57 ‘Twitter users forming tribes with own language, tweet analysis shows’, Guardian Data Blog, <http://www.theguardian.com/news/datablog/2013/mar/15/twitter-users-tribes-language-analysis-tweets> [accessed 29 September 2014].
58 Ibid.

Figure 13. Twitter users grouped into “tribes” annotated with words typically used by each group. The top word is the most significant within that community. Circles refer to communities, with the size proportional to the number of users. The width of lines between circles represents the number of messages sent between communities. The colours of the loops represent the proportion of messages that are from users within that group (yellow being 0 and red 1) and their size indicates the overall number of messages.59
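The idea of assigning users to communities by the words they use can be illustrated with a deliberately simplified sketch. It is not the method used by the researchers, who worked with a weighted network of users; it only counts how often a user's tweets contain marker words for each community. The community labels, the word “bieber” and all function names are assumptions made for illustration; only the teacher-community words and “pleasee” come from the text above.

```python
from collections import Counter

# Hypothetical marker-word lists. The "teachers_tech" words and "pleasee"
# are taken from the report; "bieber" and the labels are illustrative.
COMMUNITY_WORDS = {
    "teachers_tech": {"pln", "edtech", "edublogs"},
    "bieber_fans": {"pleasee", "bieber"},
}


def tokenize(text):
    """Lower-case a tweet and split it into simple word tokens."""
    return [word.strip(".,!?\"'#@") for word in text.lower().split()]


def score_user(tweets, vocab=COMMUNITY_WORDS):
    """Count how many of a user's tokens appear in each community's word list."""
    counts = Counter()
    for tweet in tweets:
        for token in tokenize(tweet):
            for community, words in vocab.items():
                if token in words:
                    counts[community] += 1
    return counts


def predict_community(tweets, vocab=COMMUNITY_WORDS):
    """Return the best-scoring community, or None if no marker words appear."""
    counts = score_user(tweets, vocab)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A sketch of this kind also makes the limitation noted in the report concrete: any user who borrows a community's vocabulary would be scored as a member, whether or not they identify with it.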
Such research may prove to be very useful, for instance in distinguishing between different Islamist groups on Twitter using linguistic subtleties. Analysts should be aware of advances in this field, and should consider conducting additional research to explore its possible benefits. However, they should also be aware of its limitations, including problems arising from analysing foreign languages and the possibility that individuals who do not identify themselves within a particular group nevertheless appropriate its language.

59 Bryden, John and Funk, Sebastian, ‘Word usage mirrors community structure in the online social network Twitter’, EPJ Data Science, Vol. 2, No. 3, 2013, <http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014].

Instagram blocks the use of many hashtags, undermining community structures

Groups on Instagram use hashtags to identify with each other. However, Instagram has a policy of blocking many hashtags from appearing in search functions or via the API. Instagram has justifiably blocked some hashtags in the name of protecting users, such as #proanorexia and hashtags associated with pornography. However, Instagram also blocked hashtags such as #iphone and #photography while allowing #passport and #license, providing identity thieves with a raft of new victims. The Data Pack has compiled a non-exhaustive list of banned hashtags60 and developed a banned hashtag search tool.61

Communities on Instagram that have had their hashtags blocked have developed a method for creating a new hashtag that side-steps the difficult task of manually communicating the new hashtag to ‘members’. When a hashtag is banned, users simply duplicate its last letter. So for the banned hashtag #junkiesofig (where heroin users share images), users added an extra ‘g’, making it #junkiesofigg. This hashtag was also banned, so currently users operate with the hashtag #junkiesofiggg.
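The renaming convention is simple enough to express in a few lines, which may help analysts enumerate likely successors of a banned hashtag. The following is a minimal sketch; the function names are hypothetical and the set of banned hashtags would have to be supplied by the analyst.

```python
def next_variant(hashtag):
    """Return the successor hashtag formed by repeating the final character."""
    if len(hashtag) < 2 or not hashtag.startswith("#"):
        raise ValueError("expected a hashtag such as '#junkiesofig'")
    return hashtag + hashtag[-1]


def variants(hashtag, count):
    """List the first `count` successors of a hashtag, in order."""
    result = []
    current = hashtag
    for _ in range(count):
        current = next_variant(current)
        result.append(current)
    return result


def first_unbanned(hashtag, banned):
    """Walk the successor chain until a hashtag not in `banned` is found."""
    current = hashtag
    while current in banned:
        current = next_variant(current)
    return current
```

Because every successor begins with the original hashtag, any search that matches on prefixes would return the newer variants when queried with the original.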
The ingenious consequence of only adding extra letters to a hashtag is that when a user searches for the original (banned) one, the newer version appears as a result of the user-appropriated suggested search function. This means that users who may not be aware that a new hashtag has become the norm will be automatically alerted to it when searching for the old hashtag.

Community lifespan on Instagram and Twitter is often related to external events

Hashtags enjoy a complex symbiosis with external real-world events. Events can create hashtags (#bringbackourgirls) or destroy them (the September 2014 leak of celebrity photos may have broken the #ALSicebucketchallenge trend).62 Real-world events can also be created, amplified and sustained by hashtags, most notably the hashtag #occupywallstreet. #occupywallstreet provided the basis for an online community which had the explicit aim to spill over into the real world. However, #occupywallstreet quickly grew beyond the physical occupation of Wall Street and became a tagline for global anti-capitalist movements and sentiments. The online community was united under the #occupywallstreet hashtag whilst the genuine occupation of Wall Street continued. However, when the occupation finished, members of the community generally ceased using the hashtag. This had the effect of diminishing the online community based on its usage. Although other replacement hashtags did emerge, such as #wearethe99%, they failed to mobilise a comparatively strong community base. We have no reason to suppose that the activists online expressed any desire to break the online community up; but the breaking of a fragile connection between real-world events and a hashtag had this effect.

60 ‘The Banned #Hashtags of Instagram’, The Data Pack, 26 August 2013, <http://thedatapack.com/bannedhashtags-instagram/#comment-6156> [accessed 29 September 2014].
61 ‘Banned Hashtag Search’, The Data Pack, <http://thedatapack.com/tools/blocked-hashtag-search/> [accessed 29 September 2014].
62 Foster, Michael, ‘Two things the Fappening Teaches Marketers’, All Voices, <http://www.allvoices.com/article/100000692> [accessed 29 September 2014].

3. Interactional Features

- Many different types of interaction take place on social media, some features are designed to facilitate interaction, others are appropriated by users for this purpose
- Retweets, shares and regrams: a case study
- There are important differences between “manual” and “new” retweets.
- Interactional features are adapted by users for complex conversational functions.
- Retweets can create a “Rumour effect”
- Analysts should actively monitor developments in Social Media analytics, for example: credibility analysis on Twitter
- Analysts should keep track of developments in automated approaches to social media analysis
- Retweets are similar but not identical to Facebook “shares” and Instagram “regrams”.
- The variety in use of hashtags across different social media platforms should be considered in detail.
- Favourites, Likes, and Upvotes - users express approval differently across platforms.
- Social media analysis and the observer’s paradox

Many different types of interaction take place on social media, some features are designed to facilitate interaction, others are appropriated by users for this purpose

Interactional features include anything that allows users to contact each other, share information or become linked in some way; this is more broadly defined than “conversation”.

Facebook
- Likes - Display token approval of comments and posts.
- Comments - Allow users to respond to posts with text, images or links.
- Private messages - Allow users to communicate privately with each other.
- Wall posts - Allow users to leave posts, links, photos and videos on each other’s profiles. They are displayed chronologically on a “Timeline”.
- Tagging - Users can ‘tag’ each other in posts by writing “@name”; this alerts the users that they have been tagged and hyperlinks to their profile.
- Sharing - Allows users to re-post posts and links to their profile, crediting the original poster.
- Friend Requests - Allow users to become ‘friends’ with each other and have reciprocal access to each other’s profiles.

Twitter
- Retweets - Users can repost a tweet from another user.
- @user tagging/syntax - Users can link to another user’s profile in a tweet, alerting that user that they have been “tagged”.
- Favourites - Allow users to add tweets to a favourites list, which increases the visibility of the tweet.
- Private messages - Allow users to communicate privately through Twitter.

Instagram
- Likes - Display token approval of comments and posts.
- Private messages - Allow users to communicate privately with each other.
- Comments - Allow users to respond to posts with text, images or links.
- Regrams (third party/manual feature) - Users can repost each other’s images, by use of a third-party client or manually.
- Hashtags - Words or phrases preceded by the # sign; multiple uses on social media but generally used to increase searchability of posts and to group posts together.
Reddit
- Replies (ad infinitum) - Users on Reddit can reply to posts, reply to replies and so on ad infinitum.
- Private messages - Reddit users can communicate privately using the inbox feature.
- Reddit Gold/Tips (third party) - Reddit Gold allows users to purchase premium membership for each other. Tips are enabled by third party ‘bots’ and allow users to donate cryptocurrencies such as Bitcoin, Dogecoin and Litecoin to each other.
- Upvotes/downvotes - Allow users to ‘vote’ for each other’s posts, increasing their visibility in accordance with Reddit’s ranking algorithm.

Retweets, shares and regrams: a case study

A prototypical example of an interactional feature is Twitter’s “retweet” function. Retweets allow users to repost another user’s tweet, crediting the original in the process. Retweeting on Twitter was originally not built into the structure of the site. When users wanted to repost something another user had tweeted, they would write “RT @(user) (content of original tweet)”. In 2009 Twitter added retweets to its interface; users are given the option underneath each tweet to retweet it. This standardised the format for retweeting and meant it was no longer possible to edit the original message in a retweet (something users had often done because of the 140 character limit).

Despite the regulation of the format, however, retweets are used in many different ways by Twitter users. The most divisive issue is whether or not a retweet constitutes an endorsement of the content of the tweet or the individual tweeting it. This is best highlighted by the fact that many Twitter users (journalists and high profile individuals especially) explicitly state in their biographies that “retweet ≠ endorsement” (or some similar variant).
There is a feeling amongst more active Twitter users that retweets are for passing on information and should not implicitly be interpreted to contain an opinion from the retweeter.63 However, the fact that users feel the need to state that a retweet is not an endorsement indicates that many other users do perceive them as such. Indeed, there have even been some cases where the perceived endorsement of tweets has proved very problematic for Twitter users. A 19 year old was suspended from his job as a councillor after retweeting a tweet that endorsed female genital mutilation. He defended himself by maintaining that this was just to ‘raise awareness’ of the issue.64 This divide in opinion can be problematic for analysts attempting to attribute the content of retweets to users.

Analysts collecting information about users through retweets must be aware of the subtleties of retweets and judge whether a given retweet constitutes an endorsement in each specific context. What we can say about retweets in general, though, is that they represent a desire on the part of a user to share the information with their followers. Therefore, although that particular user may not endorse the information contained in the original tweet, they do think that it is information that is worth sharing. This in itself has an intrinsic value for the social media analyst, independent of the debate on the function and illustrative nature of retweets.

63 ‘Is a Retweet an Endorsement?’, Think Differently, 19 December 2012, <http://thinkdifferently.ca/differently/is-a-retweet-an-endorsement/> [accessed 29 September 2014].
64 ‘Resignation calls over councillor’s pineapple retweets’, BBC News, <http://www.bbc.co.uk/news/uk-england-stoke-staffordshire-14709241> [accessed 29 September 2014].

There are important differences between “manual” and “new” retweets.
It is worth pointing out here that there is an important difference between retweets facilitated through the retweet function and manual retweets (also known as classic or traditional retweets). Twitter users have argued convincingly that Twitter’s introduction of the retweet function decreased the “social” aspect of social media.65 Because of this, many veteran Twitter users still prefer to use the manual method when retweeting. Analysts should be aware of the differences between manual and function-based retweets. Differences include:

- Manual retweets allow the retweeter to edit the original tweet or add their own comment, increasing the conversational nature of the retweet.
- Manual retweets display the retweeter’s avatar, not the original tweeter’s avatar.
- Manual retweets have their own individual URL so are searchable; new retweets do not.
- New retweets do not increase your visibility (they will not increase your likelihood of appearing in a “suggested” list).
- New retweets will not contribute to the popularity of a hashtag.
- New retweets will not enter you into a conversation with the users mentioned in the tweet.

Figure 14. A “new” retweet.66

Figure 15. A “manual” retweet.67

65 ‘Retweet the old fashioned way, using “classic” or “traditional” retweets only’, Ray’s 2.0, 3 September 2013, <http://rays20.blogspot.co.uk/2010/06/traditional-retweet-tr-key-to.html> [accessed 29 September 2014].
66 The author’s Twitter posts from https://twitter.com/

What is clear is that manual retweets involve the retweeter themselves to a much greater degree. A manual retweet allows the original to be edited or commented on, displays the retweeter’s avatar and creates a unique URL attributed to their account. For the analyst, this means that manual retweets can be a potentially richer source of information about individuals than their newer counterparts.
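Because manual retweets are ordinary tweets that follow the “RT @user” convention, they can be picked out of collected text with simple pattern matching. The sketch below is illustrative only: the regular expression is an assumption about how the convention is typically written, and the function and field names are hypothetical. It treats any text before “RT” as the retweeter's own comment; a comment appended after the quoted text cannot be separated without access to the original tweet.

```python
import re

# Assumed shape of a manual retweet: optional leading comment, then
# "RT @user", an optional colon, and the quoted text.
MANUAL_RT = re.compile(
    r"^(?P<comment>.*?)\bRT\s+@(?P<user>\w+):?\s*(?P<text>.*)$",
    re.DOTALL,
)


def parse_manual_retweet(tweet):
    """Split a manual retweet into comment, original user and quoted text.

    Returns None when the tweet does not follow the "RT @user" convention.
    """
    match = MANUAL_RT.match(tweet)
    if match is None:
        return None
    return {
        "comment": match.group("comment").strip(),
        "original_user": match.group("user"),
        "quoted_text": match.group("text").strip(),
    }
```

Retweets made through Twitter's built-in function would not be found this way, since they do not add “RT @user” text written by the retweeter.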
An additional problem faced by analysts confronted with manual retweets is the potential for the retweeter to distort the information, edit it or present it out of context, changing the meaning. As Boyd, Golder and Lotan point out, even if the content of the tweet is not altered, taking a tweet out of context “can give it a life of its own”.68 A fictional example of this is the difference between these two tweets:

1. @Twitteruser1: My girlfriend broke up with me by email...
2. (a follower) RT @Twitteruser1 “My girlfriend broke up with me by email...” OUCH!

The retweeter’s addition presents the information contained in the tweet in a very different light to the original, transforming it from a rather sad confession into a joke. Awareness of these sorts of “broken telephone” problems is crucial for the analyst. Indeed, this seemingly trivial fictional example demonstrates how easily and rapidly messages and information can be distorted on social media. Users may appropriate, edit and disseminate a single update or multiple updates on social media for their own ends, which may be far removed from the purpose and intention of the original poster. From the analyst’s perspective, this increases the “costs” of collecting, verifying and analysing information extracted from social media. The most obvious of these costs is time.

67 Twitter, <www.twitter.com>
68 Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter’, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010, <http://www.danah.org/papers/TweetTweetRetweet.pdf> [accessed 29 September 2014].

Interactional features are adapted by users for complex conversational functions.

Boyd, Golder and Lotan have also conducted a survey which aims to give a non-exhaustive list of reasons people retweet.69 The responses included the following:

- To amplify or spread tweets to an audience.
- To entertain or inform a specific audience, or as an act of curation.
- To comment on someone’s tweet by retweeting and adding new content, often to begin a conversation.
- To make one’s presence as a listener visible.
- To publicly agree with someone.
- To validate others’ thoughts.
- As an act of friendship, loyalty or homage by drawing attention, sometimes via a retweet request.
- To recognise or refer to less popular people or less visible content.
- For self-gain, either to gain followers or reciprocity from more visible participants.

From the perspective of the retweeter, we can divide these motives into two categories:

1. Users retweet to engage with their audience or a specific target audience;
2. Users retweet to engage with the original tweeter.

The first of these categories fits in with the intended function of retweets: the intention to spread the information contained within the retweet. However, this only accounts for one of the functions appearing in the survey. The second category accounts for the majority of the functions appearing in the results of the survey. These cover a number of ways in which retweets are adapted as a feature for facilitating conversation.

When people wish to engage with each other in the offline world, they have a fairly limited selection of options available, and these are mostly quite direct: calling someone’s name, sending them a letter, introducing oneself and so on. Online, features such as retweets, likes, favourites and shares have been adapted to facilitate a complex range of subtler, indirect ways to address people.

Particularly interesting alternate functions of retweets, especially from an intelligence perspective, include indicating friendship, demonstrating loyalty and displaying an act of homage. Users who want other users to realise that they are influenced by them, agree with them or share similar opinions use retweets in this way to express these feelings indirectly.
The indirectness of a retweet is important because it encourages users to use them in this way without the risk of social embarrassment; after all, a user can always claim that the retweet was intended only for sharing purposes. It represents a socially low-risk option for engaging with users you might otherwise not engage with.

Users adapting features on social media for conversational purposes is not just seen in retweets; it is common across all forms of social media. The effect this has on user experience of these platforms is one of heightened connectivity and addressability. Users are more likely to engage with people they have not met or previously interacted with. They are given more opportunities to start conversations, join existing discussions and create new ones.

So when considering the range of possible interactions that can be made on social media, users may choose many different options. For the analyst, this means that when tracking the interpersonal relationships of individuals and their specific interactions, it is not enough to focus exclusively on verbal communication: comments, replies, wall posts. Analysts must assess whether use of these non-verbal features is also playing an important role. This requires a sophisticated understanding of social media communication and interaction.

69 Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter’, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010, <http://www.danah.org/papers/TweetTweetRetweet.pdf> [accessed 29 September 2014].

Retweets can create a “Rumour effect”

The 140 character limit on Twitter forces users to condense whatever message they are trying to convey. Whilst this is a useful and important part of Twitter, it can disproportionately advantage catchy, eye-grabbing headlines over ones that might contain the most factually correct or important information.
In addition, these sorts of tweets can spread extremely quickly because of the ease of retweeting. Whilst in most cases misinformation of this kind is fairly innocuous (for example, quickly dispelled rumours of celebrity deaths), certain examples have been almost disastrous. As Figure 16 shows, in April 2013 the Associated Press (AP) Twitter account posted a tweet reading “Breaking: Two Explosions in the White House and Barack Obama is injured”. The news quickly spread, and whilst the AP quickly deleted its account and notified Twitter via alternate AP accounts that it was the result of a hack (claimed by the pro-Assad regime Syrian Electronic Army), the damage had already been done. The Dow Jones plunged more than 140 points and the S&P briefly lost $136.5 billion (which was recovered within 15 minutes). Interestingly, the problem was not entirely caused by traders reading tweets and acting in response, but also by automatic algorithms which read headlines and create automatic orders. These algorithms were deceived not just by a false headline from a reputable news source but also by the information echo of thousands of retweets that increased the perceived relevance of the story.

Figure 16. The hoax tweets posted on the Associated Press official Twitter account.70

70 Domm, Patti, ‘False Rumor of Explosion at White House Causes Stocks to Briefly Plunge; AP Confirms its Twitter Feed Was Hacked’, CNBC, <http://www.cnbc.com/id/100646197#> [accessed 29 September 2014].

Analysts should actively monitor developments in Social Media analytics, for example: credibility analysis on Twitter

There have been a number of proposed solutions to the problems created by the spread of misinformation on Twitter. Most notably, there have been interesting recent developments at the Indraprastha Institute of Information Technology, where researchers have developed a browser extension named “TweetCred”,71 which purports to display a credibility rating out of 7 for all Tweets visible on a user’s timeline.
Figure 17. TweetCred ratings displayed on the Reuters Top News official Twitter account. As a very credible news source we would expect to see 7/7 ratings for all Reuters tweets. TweetCred has been accurate for the top two; however, the bottom tweet (a retweet from Reuters Business) has been given a very low 2/7 rating.72 The reason for this is unclear but it is probably at least partly because it does not contain a hyperlink.

71 Gupta, Aditi, et al., ‘TweetCred: A real-time Web-based System for Assessing Credibility of Content on Twitter’, Indraprastha Institute of Information Technology, 2014, <http://chato.cl/papers/gupta_kumaraguru_castillo_meier_2014_tweetcred.pdf> [accessed 29 September 2014].
72 Twitter, <www.twitter.com> [accessed 29 September 2014].

TweetCred aims to provide users with additional information about Tweets, embedded within the interface. It uses 45 features to determine the credibility of tweets. Some of these features are:

• Tweet meta data: Including number of seconds since the tweet, source of tweet and geocoordinates.
• Tweet content features: Including number of characters, words, URLs, hashtags, unique characters, presence of stock symbol, happy/sad smiley, colon symbol etc.
• User based features: Including number of followers, friends, time since last tweet.
• Network features: Including number of retweets, mentions, replies, whether the tweet is a reply or a retweet.
• Linguistic features: Including presence of swear words, negative or positive emotion words, pronouns.
• External resource features: Web of Trust (WOT) score for the URL; ratio of likes:dislikes for a YouTube video.

After the initial tests run by TweetCred, the developers report that users agreed with 43% of the ratings given, with an additional 25% expressing minimal disagreement. These underwhelming figures are an indication that there is still much work to be done in the field of credibility evaluation.
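To give a sense of what “tweet content features” look like in practice, the following sketch computes a handful of the features named in the list above from the text of a tweet. It is not TweetCred's implementation: the regular expressions, the function name and the feature names are assumptions chosen for illustration, and no credibility score is calculated.

```python
import re

# Simple, assumed patterns for URLs and hashtags in tweet text.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def content_features(text):
    """Compute a few illustrative content features for a single tweet."""
    return {
        "num_characters": len(text),
        "num_words": len(text.split()),
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_unique_characters": len(set(text)),
    }
```

A classifier that weights a feature such as `num_urls` heavily would rate an otherwise identical tweet lower simply because it lacks a hyperlink, which is one plausible reading of the inconsistency shown in Figure 17.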
Figure 17 above demonstrates an obvious flaw with the TweetCred algorithm; it shows how tweets from one very reputable news source (Reuters) can be given wildly different credibility ratings. The first two tweets displayed are from the Reuters Top News Twitter account and the third is a retweet from Reuters Business. It seems that the only difference between the third tweet and the first two is that the third contains a hashtag but no hyperlink. This should not be a reason to give a credibility rating that is five points lower, and it reflects a significant weakness in the TweetCred methodology.

On the other hand, as one of the first pieces of pioneering software in the field, TweetCred does seem promising. Certainly, a more accurate and developed version could be incredibly useful for analysts to contextualise the relevance, accuracy and reliability of questionable or previously unknown sources of information. Indeed, assessing the credibility of information posted on Twitter is one of the most significant challenges facing analysts seeking to extract useful information from social media as part of an all-source intelligence product. Many of the same OSINT techniques developed to assess the credibility of online sources can be applied by SOCMINT analysts. However, the volume of information on social media, the diversity of commentators and the brevity of posts pose new challenges to analysts that software such as TweetCred explicitly aims to address. Future developments in this field should be monitored carefully.

Analysts should keep track of developments in automated approaches to social media analysis

Even experimental software such as TweetCred highlights the potential benefits that automated approaches to social media analysis can offer. This report has also discussed methods to track communities on social networks and methods that aim to extract information by analysing Facebook likes or the words used in Tweets.
SOCMINT’s status as a recent discipline and the youth of social media in general mean that we can expect many more advances of this nature in the future. In addition, because social media is constantly growing and changing, technology that seeks to analyse it will have to adapt alongside it. This presents a challenge for SOCMINT analysts and policy-makers. There must be a method in place to keep track of advances in SOCMINT analysis methods and technologies in order to remain on the cutting edge. Organisations must decide whether it is viable to task an analyst to keep track of these developments and how much time should be allocated for this. It should be noted that methods and technologies that are applicable to SOCMINT will not often be billed as SOCMINT technologies; often they may have been designed for an alternate purpose but can be adapted for innovative SOCMINT analysis.

Retweets are similar but not identical to Facebook “shares” and Instagram “regrams”.

Retweets on Twitter are analogous in many ways to the “share” function on Facebook and “regrams” on Instagram. Sharing on Facebook does provide a space for users to add their own comment to their repost, but still faces the same problems as retweets in terms of endorsement. Unfortunately for the analyst, there is no culture on Facebook of stating whether or not you believe a “share” to be an endorsement, so this can be difficult for analysts to establish (although the comment itself may be an indicator).

The variety in use of hashtags across different social media platforms should be considered in detail.

Hashtags are words or phrases that are preceded by the pound sign (#). They are generally used by social media users to provide extra information to a post.73 They allow posts to be grouped together with posts that share the same hashtags, as well as to track trends and enhance search functions.
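The grouping and trend-tracking role of hashtags can be sketched in a few lines. The example below is a minimal illustration over a list of post texts held in memory; the function names are hypothetical, hashtags are matched with a simple assumed pattern and compared case-insensitively, and no platform API is involved.

```python
import re
from collections import Counter, defaultdict

# Assumed pattern: "#" followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in a post, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def group_by_hashtag(posts):
    """Map each hashtag to the indices of the posts that use it."""
    groups = defaultdict(list)
    for index, text in enumerate(posts):
        for tag in set(extract_hashtags(text)):
            groups[tag].append(index)
    return dict(groups)


def top_hashtags(posts, n=3):
    """Rank hashtags by the number of posts in which they appear."""
    counts = Counter(
        tag for text in posts for tag in set(extract_hashtags(text))
    )
    return counts.most_common(n)
```

Counting each hashtag once per post, rather than once per occurrence, is a design choice made here so that a single post repeating a hashtag does not inflate its apparent popularity.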
The overwhelming popularity of hashtags on social media is largely restricted to Twitter and Instagram, although they are becoming increasingly popular on Facebook after it rolled out support in 2013. On Twitter, popular hashtags appear in the “trending” section74 and on Instagram users can group posts together by a hashtag search function (users can also do this on Twitter).

73 Gunawardena, Nipun, et al., ‘Instagram Hashtag Sentiment Analysis’, University of Utah, 2013, <http://www.eng.utah.edu/~cs5350/ucml2013/3-3p.pdf> [accessed 29 September 2014].
74 This can either be based on a user’s location or tailored to their interests.

Gunawardena et al. have suggested that the use of hashtags as described in the previous paragraph warrants their classification as “metadata” of social media posts.75 Whilst this view certainly seems to illustrate the function of hashtags some of the time, it does not do justice to their range of use, especially on platforms such as Instagram. Hashtags are increasingly becoming the “stars” of social media networks, often forming the main content of posts, with the body of text providing scaffolding for use of a hashtag.

The use of hashtags on Instagram works perfectly in tandem with Instagram’s use as a platform for self-promotion. Instagram’s almost exclusive focus on posting and sharing photos makes it an ideal tool for this and has allowed it to develop as the primary method on the Internet that individuals use to ‘display economic, social and cultural capital’.76 To this end, hashtags can be used to associate oneself with a particular lifestyle or social group. Take, for example, the #richkidsofinstagram/#rkoi hashtag, which has become incredibly popular amongst affluent teenagers and young adults.

Figure 18. An example of a user utilising the #richkidsofinstagram hashtag on Instagram.77

Above is a typical example of the #rkoi hashtag being used to affirm economic and social capital.
In the caption we can only see a brief comment: ‘Line up!!!’, with the majority of the content being the 28 hashtags that are used alongside the photo. Most of these hashtags serve the dual purpose of increasing the visibility of the post on Instagram and simultaneously affirming steven_wong_91’s personal brand as an affluent youngster partial to fast cars. Additionally, steven_wong_91’s legitimacy to use the #rkoi hashtag (he has a photo that warrants it) places him within a community of other Instagram users, which grants additional social capital.

75 Gunawardena, Nipun, et al., ‘Instagram Hashtag Sentiment Analysis’, University of Utah, 2013.
76 Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through Instagram’, Princeton University Senior Thesis, 2013.
77 Instagram, <www.instagram.com> [accessed 29 September 2014].

Analysts should not just be aware of how users use hashtags to increase their social capital on an individual level, but also of how their use fits into a wider social context. To use the previous example, if #rkoi were not popular and steven_wong_91 were the only person using the hashtag, then it would not carry the same sort of social capital that it does, given its popularity. Whilst it would still carry the everyday connotations of being a rich, young social media user, the fact that #rkoi is well known means that it carries additional implications in terms of community identity.

The hashtags used in the post shown in the image above also serve to increase the visibility of the post. This is probably the most common reason people use hashtags on Instagram. Instagram’s “search by hashtag” feature allows posts sharing the same hashtag to be grouped together. This is also a function on Twitter and many hashtags are created with this in mind. Twitter, however, also has a strong emphasis on hashtag trends, which collate the most popular hashtags.
Hashtag trends can either be “tailored to you” or location based. Tailored trends are mediated by an algorithm that takes into account who a Twitter user follows. Location based trends are specific to an area (it is possible to choose “worldwide” as an option here). Through this feature Twitter has encouraged a culture that emphasises Tweets as ephemeral; they are designed to be read soon after they have been posted. With this in mind, hashtags are generally used by Twitter users to increase the visibility of Tweets on a temporary basis, and to contribute to current trends.

As well as this, Tweets are searchable by the words they contain, but Instagram posts are not searchable by the content of their photos or the captions provided. This means that Instagram users must use hashtags to describe the actual content of their photos, whereas Twitter users can use them both to provide extra information and to increase visibility.

Instagram photos tend to be more personal in content than Tweets. Tweets can refer to anything, whereas the subject of a photo is something within the user’s vicinity, so it is more likely to bear some strong relation to them. Because of this, Instagram hashtags tend to contain more information about a user’s immediate environment than Tweets. This is also because, on Twitter, a user who wishes to communicate their emotional state or perceptions of their environment has the option to do this via the main body of text. This is not an option on Instagram as the main body is an image (although it may be implicit in the image or included in the caption).

Whilst we see some similarities in hashtag use across both platforms, hashtags on Instagram have a much wider range of uses that often includes interesting personal, social and cultural information about users.
Users on both platforms use hashtags to identify with certain communities and social groups, yet on Twitter these groups are generally more ephemeral and are often connected to real world events (such as the Ice Bucket Challenge, or the Occupy movement). On Instagram groups tend to be permanent as well as sometimes being quite obscure (for example the group based around the #junkiesofig hashtag). This illustrates the point for analysts that the subtle differences arising from the context of features on social media must be understood and accounted for in order to develop a fruitful understanding.

Favourites, Likes, and Upvotes - users express approval differently across platforms.

Social media platforms all allow users to express approval of other users’ posts. On Facebook and Instagram users can “like” other users’ posts and comment, on Twitter users can “favourite” and on Reddit users upvote comments they approve of (and downvote comments they disapprove of).

When collecting information about user attitudes and opinions on social media, favourites, likes and upvotes are an important resource. Ostensibly, it makes sense to assume that if somebody uses one of these features, they are expressing a positive reaction to something that has been posted. Broadly this is correct. However, as with hashtags, these features are not analogous across all platforms.

On Facebook, likes are generally used in three ways:

1. Likes are most often used when responding to a personal post from a friend, to express that the user likes what the friend has posted (perhaps news of a new baby, or an amusing joke).
2. Likes are also used to express agreement with something, for example an opinion article or a “page” on a specific topic.
3. Less commonly, likes are used by users to indicate that something is worthy of attention, but not necessarily that they approve of it (it is approval of the posting of the content, not the content itself).
For instance, a tragic news story. (Note: some users are afraid of having likes misinterpreted and will refrain from using them in this way.)

On Instagram likes are simpler; they are used almost exclusively to indicate approval or enjoyment of the photo that has been posted.

Because of the popularity of retweeting to express approval or agreement on Twitter, the favourite function is used in a subtler way. Users often favourite tweets when they do not believe the tweet warrants a retweet, or when they do not think that it needs to be shared further, perhaps because they do not want it to be visible on their profile (although it is possible to view what other users favourite via the ‘activity’ tab). As well as this, the favourite icon is designed and used as a bookmark button for Tweets. For some users this is the limit of its use; other users favourite some tweets for approval and others for bookmarking purposes. Analysts should be aware of this subtle distinction on Twitter, which quite often varies from user to user. They should be wary of drawing concrete conclusions from the fact that a user “favourites” a tweet, where the usual connotations of the word favourite do not necessarily apply.

Social media analysis and the observer’s paradox

In the social sciences the “observer’s paradox” refers to a phenomenon that occurs when events in an experiment are affected by the presence of the observer. The “paradox” comes about because it is impossible to conduct experiments without observation, but if observation takes place then the experimental data can be corrupted. Social media research can suffer chronically from this problem. When users are aware or have the impression that they are subject to observation we see a phenomenon known as ‘reactivity’,78 where users alter their behaviour in response to being watched.
This could specifically take place in a situation where a researcher or SOCMINT analyst felt it was wise to reveal their true identity and intentions to an individual or group that was being examined. The examined individuals would then adapt their behaviour in response to the presence of an observer, possibly obscuring useful information they might otherwise offer.

The role of social media analysts as passive or active observers of networks, communities and individuals across various platforms also raises issues relating to the discipline of netnography, the study of individual and group behaviour online. The example of Aymenn Jawad Al-Tamimi (see analysts and self-representation) demonstrates the potential pitfalls of actively engaging with individuals or communities of interest online. It is imperative that policy-makers develop clear guidelines for analysts searching social media for information and, crucially, for when they seek to interact with users to gain valuable insights. Organisations will have to develop their own standards independently, although these may evolve into a consensus best practice.

78 Heppner, Paul P, Wampold, Bruce E and Kivlighan, Jr. Dennis M, ‘Research Design in Counselling (Research Statistics & Program Evaluation)’, Cengage Learning, 3rd Edition, 2007.

4. Privacy/Accessibility Features

Introduction
Much of the most useful information from an intelligence standpoint comes in the gap between perceived privacy and actual privacy
Facebook users can often become confused about the status of their privacy settings
Changes in Twitter and Instagram privacy settings can reveal useful information that was originally posted when private.
Users are often unaware of the information provided by their metadata
Reddit users often disregard their digital footprint, providing an important source of information for analysts
Analysts should be aware of the potential ethical problem with accessing information intended as private
Analysts should consider which groups and demographics are likely to change their behaviour in light of increased awareness of surveillance

Introduction

Privacy features on social media networks are naturally perceived as the biggest restriction on SOCMINT analysts. The reason for this is obvious: it is not always possible for SOCMINT analysts to access protected content. However, the relationship between public and private content yields some useful information that would be impossible to obtain on a completely public Internet.

Much of the most useful information from an intelligence standpoint comes in the gap between perceived privacy and actual privacy

Users on social media have a strong incentive to adopt very different personas when interacting publicly as opposed to privately. Whilst occupying their public persona, we can assume that users will act in ways that they perceive to be publicly suitable.79 Likewise we can expect to see possible differences in behaviour, self-representation and interaction when users occupy their private persona. The crucial difference between the two is that researchers and SOCMINT analysts do not necessarily have access to information that is shared privately, so it is very difficult to conceptualise the distinction between these two versions of a person.
As the analyst’s perception of an individual on social media will almost always be based solely on his or her public persona, it is difficult to see how they might behave differently when the settings are turned private. What is also clear is that much of the most useful information from the analyst’s perspective is going to be the sort of information that is shared privately. Depending on the analyst’s aims, access to private information could be extremely useful. If we want to understand how an individual interacts with their close friends, for instance, private data could be invaluable. Similarly, defence and security analysts would benefit from access to private information, but in many cases such access would not be feasible because of privacy, ethical, legal or operational concerns.

Whilst the authors of this report do not condone breaching or subverting user privacy in any way, it should be recognised that certain aspects of social media give the analyst the possibility of viewing, publicly, information that was posted under some impression of privacy. The structure of different social networks means that there is often a gap between perceived privacy and genuine privacy, where analysts may find the opportunity to see users interacting with their private personas within the public sphere.

79 That said, there is some evidence to suggest that many users do tend to reveal information publicly that they would not deem to be suitable.

Facebook users can often become confused about the status of their privacy settings

A common criticism of Facebook is the complexity of the site’s privacy settings. Currently users are granted a lot of control over who can see what content they post. Each individual wall post, photo album or profile section can be given its own privacy setting, with the choice ranging from completely public to viewable only by the user themselves. Whilst this depth of control can be useful for users, it also brings about confusion.
Many Facebook users have little understanding of which privacy settings they are using for which posts. A common mistake is for users to set the privacy of one post to “public”, automatically changing the privacy of future posts as well. From then on users may continue to post for a long time under the impression of privacy before realising their error. It is therefore common to find a great deal of information on social media that was posted under the impression of privacy.

Notwithstanding the ethical implications involved in accessing information made public by mistake, this sort of information can be useful from an intelligence standpoint. It enables SOCMINT analysts to view information that would otherwise be impossible for them to access. As this information is delivered under the impression of privacy, it will often be more useful to an analyst than information that has deliberately been made open to the public.

The possibility of identifying information that has been erroneously posted publicly under the assumption of privacy poses a challenge for analysts. How is it possible to tell if a user has made this sort of mistake? One possible way would be for analysts to note when a post has been set to public for a reason (the user is promoting something that is specifically addressed to a wider audience, for instance) and to see if the following posts remain public with no similar reason. Of course, this method cannot provide a conclusive assessment. Linguistic analysis of posts may also be employed to discern any subtle differences between knowingly public posts and posts perceived to be private. This will require a detailed and nuanced understanding of the individual, in addition to software designed to aid the analyst.

Changes in Twitter and Instagram privacy settings can reveal useful information that was originally posted when private.

Twitter has much simpler privacy settings.
Users can either choose to have their tweets public or protected. Public tweets are viewable by everyone, even people not logged into a Twitter account. Protected tweets can only be viewed by users who have been approved to follow the account, and they cannot be retweeted. Additionally, protected tweets will not appear in any search, and users who have not been approved will not be notified of, or be able to see, @replies sent to them. However, unlike Facebook, Twitter does not allow users individual control of each tweet. This means that all tweets must be set to either private or public at any one time; it is not possible to set one group of tweets to public and another to private. This being the case, many users will spend a period of time tweeting privately and then decide to change to public tweets from that point on, making both their future tweets and their previously private tweets public.

There is then a situation where a user has posted a number of tweets as private (presumably occupying their private persona) but has subsequently changed their privacy settings and made these tweets public. This change has been made without editing the previous tweets to fit in line with the user’s public profile. Whilst it is possible for users to go back and delete tweets, and some surely will do this, there will be others who will not.

The quality of this revealed information (from an intelligence perspective) will be relevant only in the specific context of whatever aims the analyst has, as well as the context of each individual user. It should not be automatically assumed, for instance, that there will be a difference between every individual’s private tweets and public tweets, although this may be true in some instances. Similarly, just because a user chooses to make their tweets private does not mean that they will change the content of subsequently public tweets.

Users are often unaware of the information provided by their metadata

Metadata is data about data.
For our purposes it can be understood as additional information that is included in social media posts, such as the location and time of a post and the interactions with it. Some metadata is available to view directly on posts themselves; other information is only accessible through the application’s API. Metadata’s potential use for intelligence analysts cannot be overstated; the metadata of posts is often vastly more useful than the posts themselves, primarily because users often disregard the information provided by metadata when posting on social media. The Edward Snowden revelations have often centred on the controversial collection of metadata by US (and other) intelligence agencies.

In 2012 Vice magazine published an article boasting that their journalists had recently met up with John McAfee whilst he was on the run in Central America, after being accused of murdering his neighbour.80 Vice uploaded a photo (see below) which contained location metadata about where the photo was taken (the metadata also revealed the iPhone model used, the lens, exposure setting, time and date). These coordinates pointed to a location in Guatemala and were used by law enforcement agents to track down and arrest McAfee. Whilst it is possible that the metadata was left in the photo for a reason, it seems much more likely that this was a genuine mistake on the part of a Vice employee. It is also an interesting oversight given John McAfee’s notoriously stringent online security regimen.81 This case illustrates the point above that even sophisticated users who have an immediate and explicit interest in not revealing their location, and a heightened awareness of online security, can be careless with their metadata.

80 VICE Staff, ‘We are with John McAfee Right Now, Suckers’, 3 December 2012, <http://www.vice.com/en_uk/read/we-are-with-john-mcafee-right-now-suckers> [accessed 29 September 2014].
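To make concrete how location metadata of the kind described in the Vice example becomes a point on a map, the following is a minimal sketch. Exif GPS data records latitude and longitude as degrees, minutes and seconds together with a hemisphere reference, under the tag names `GPSLatitude`, `GPSLatitudeRef`, `GPSLongitude` and `GPSLongitudeRef`. The dictionary layout, the function names and the sample values below are assumptions made for illustration; the values are invented and are not the coordinates from the Vice photo. Reading the tags out of an image file would require an Exif parsing tool and is not shown.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) and a hemisphere letter to decimal degrees."""
    hemisphere = ref.upper()
    if hemisphere not in {"N", "S", "E", "W"}:
        raise ValueError("hemisphere reference must be one of N, S, E, W")
    degrees, minutes, seconds = (float(value) for value in dms)
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern latitudes and western longitudes are negative by convention.
    return -decimal if hemisphere in {"S", "W"} else decimal


def exif_gps_to_lat_lon(gps):
    """Return (latitude, longitude) in decimal degrees from a dict of GPS tags.

    Assumption: `gps` is a plain dict keyed by Exif GPS tag names, already
    extracted from the image by some other tool.
    """
    latitude = dms_to_decimal(gps["GPSLatitude"], gps["GPSLatitudeRef"])
    longitude = dms_to_decimal(gps["GPSLongitude"], gps["GPSLongitudeRef"])
    return latitude, longitude


# Invented values for illustration only.
SAMPLE_GPS = {
    "GPSLatitude": (10, 30, 0),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (20, 15, 0),
    "GPSLongitudeRef": "W",
}

if __name__ == "__main__":
    print(exif_gps_to_lat_lon(SAMPLE_GPS))
```

The resulting pair of decimal degrees is the form most mapping tools accept, which is why a single un-stripped photo can be enough to place its subject.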
81 ‘Why John McAfee Is Paranoid about Mobile’, Dark Reading, 19 August 2014, <http://www.darkreading.com/informationweek-home/why-john-mcafee-is-paranoid-about-mobile-/a/did/1298090> [accessed 29 September 2014].

Figure 19. The exchangeable image file format (Exif) data for the photo that revealed John McAfee’s location.82

It is very common for Twitter users to add locations to their tweets inadvertently. The setting is not located with each individual tweet, but in the “security and privacy” section of settings. Whilst it is as simple as unchecking a box to prevent location information being added to your tweets, many users are unaware this setting exists. There is also a button that allows users to delete the location information from all of their tweets. Similarly, on Facebook and Instagram, location is automatically added to updates and must be manually disabled.

On Twitter, those who can access Twitter’s API can access the following metadata:

- Name
- Location
- Biography information
- Account creation date
- Username & Identifier
- Tweet’s location, date and time zone
- Tweet’s unique ID and ID of tweet replied to
- Contributor IDs
- Follower, following and favourite count
- Verification status - users with significant social stature, at risk from fake imitation profiles, can apply to have their account “verified” with a tick next to their name.

82 ‘Jeffrey’s Exif Viewer’, <http://regex.info/exif.cgi> [accessed 29 September 2014].

Reddit users often disregard their digital footprint, providing an important source of information for analysts

As discussed previously in this report (see group interaction and self-representation), Reddit’s segmentation into different subreddits generates a situation where Reddit users will adopt different personas when operating within different subreddits. This seems to have an interesting effect on Reddit users, insofar as they tend to behave on individual subreddits in a way that disregards their activity on others.
While there is no formal research evidence to support this, one need only spend five minutes browsing through subreddits to see glaring inconsistencies amongst posts as well as the disclosure of highly revealing information.

Analysts should be aware of the potential ethical problem with accessing information intended as private

In 2012, a U.S. judge forced Twitter to provide Tweets to the district attorney, arguing that Tweets posted publicly have ‘no reasonable expectation of privacy’.83 Whilst this is convincing (it is very clear that when users tweet, that information is accessible to anyone), there is a sense that this sort of tacit consent is not acceptable when we consider the potential use of our data. To put it another way, just because we post information somewhere for everyone to see does not mean that we expect that data to be investigated outside the usual context of the Twitter experience.

Whilst this point has some weight on its own, it is amplified when we see it in the context of viewing data which was originally intended by users to be private. If a user decides that they would like to have their posts on Twitter made public, this is unlikely to include a consideration of the implications of making all their previously posted tweets public as well. Indeed users may simply be unaware of this implication of their decision to make their posts on Twitter public.

Analysts should consider which groups and demographics are likely to change their behaviour in light of increased awareness of surveillance

In the post-Snowden digital environment, we should expect to see greater concerns over privacy and surveillance amongst the general population. It seems likely that the more aware users are that their online activities can be observed and analysed, the more they will change their behaviour to limit this. Users are doing this by becoming more wary about the kind of information they share and paying closer attention to privacy settings.
One extreme example is pointed out in a report by Recorded Future, which observed an ‘increased pace of innovation’ in Al-Qaeda’s encryption technology.84 Whilst Al-Qaeda’s motivations to evade surveillance are obvious, the sentiments of other groups and demographics are likely to be similar. This imposes additional collection, verification and analysis costs on the analyst.

83 Fitzpatrick, Alex, ‘Judge: Public Tweets Have No “Reasonable Expectation of Privacy”’, Mashable, 3 July 2012, <http://mashable.com/2012/07/03/twitter-privacy/> [accessed 29 September 2014].
84 ‘How Al Qaeda Uses Encryption Post-Snowden (Part 1)’, Recorded Future, 8 May 2014, <https://www.recordedfuture.com/al-qaeda-encryption-technology-part-1/> [accessed 29 September 2014].

Despite this logical presumption, it is clear that not all demographics have altered their behaviour online, or become more concerned about invasions of their privacy. A study has recently found that young adults in Gothenburg were not concerned about their privacy in light of the Snowden leaks and did not change their behaviour.85 This was predominantly because these young adults assumed governments were already collecting this information, and had therefore already accepted this fact and modified their online behaviour accordingly. Of course this only points to one limited demographic and certainly cannot be considered representative. What is likely is that groups with the most to hide, such as Al-Qaeda, are also the most likely to adapt their behaviour in response to revelations regarding online surveillance.

Analysts should track legal and regulatory developments

The relative youth of SOCMINT as an intelligence discipline means that there are sparse legal and ethical guidelines that seek to determine the behaviour of analysts in the online space. The ethical issues surrounding SOCMINT mirror the complexity of privacy on social media but hold an important place in the public consciousness of online issues.
In a post-Snowden paradigm it is in the interest of the analyst to be strict about avoiding any public perception of privacy violation. Whilst the legal repercussions may be minimal, the public outcry can be much more harmful. A solution to this would be a solid set of legal guidelines that seek to govern the behaviour of analysts. The Regulation of Investigatory Powers Act (RIPA) should, in theory, provide such a legal framework in the UK. However, RIPA was passed into law in 2000 and therefore predates the social media boom. Demos have advocated that RIPA be updated in order to cover the challenges presented by SOCMINT.86 RIPA is arguably unable to deal with the SOCMINT environment because SOCMINT presents a unique problem that does not fit into the existing legislation. It is in the interest of SOCMINT analysts both to advocate for and to remain up to date with any developments that provide a legal and ethical framework for SOCMINT.

85 Hochman, Nadav, and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local through social media’, First Monday, Vol. 18, No. 7, 2013, <http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29 September 2014].
86 Bartlett, Jamie, Miller, Carl, Crump, Jeremy, Middleton, Lynne, ‘Policing in an Information Age’, Demos, March 2013, <http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365> pp. 1-42, [accessed 29 September 2014].

5. Infrastructural Features

Whether a user is updating social media via mobile or desktop affects the character of the information
Instagram’s interface allows it to be used as a personal tool of documentation – a Google Earth seen through the eyes of social media users
Reddit’s vote based ranking system means that users pander for votes rather than speaking their mind
Analysts should be aware of third-party clients and enhancements utilised by social media users
Understanding timeline algorithms is the key to understanding user experience on social media

Whether a user is updating social media via mobile or desktop affects the character of the information

Mobile use of social media has long since eclipsed desktop use. Many newer applications are created with no desktop counterparts at all (Snapchat) whilst others have introduced very minimal desktop interfaces that are only used by a tiny minority of users (Instagram). Services like Facebook, Reddit and Tumblr started off on desktop but have increasingly become more mobile based.

Figure 20. How Mobile are Social Networks? % of time spent on social networks in the United States, by platform87

Social media on the mobile platform can take on an entirely different character to its desktop based counterpart. This is because mobile social media bears a much stronger relationship to the real-world physical environment in which it is being used. Desktop computing tends to be done in a fairly uninteresting environment: in the bedroom, office, or workplace, for example. Laptop computing can be more interesting in terms of location, such as in a coffee shop or hotel, or when travelling. Mobile computing cannot be pigeonholed in the same way; it is used anywhere and everywhere with a 3G/4G or Wi-Fi connection.
This ability for users to access social media within a changing physical environment is coupled with increasingly fast internet speeds and mobile devices. Additionally, social media use within real-world social settings has become increasingly acceptable.

87 ‘How Mobile are Social Networks? % of time spent on social networks in the United States, by platform’, Statista, <http://www.statista.com/chart/2091/mobile-usage-of-social-networks/> [accessed 29 September 2014].

Analysts have begun to observe an increasingly visible trend towards the proliferation of social media updates that reference real world events local to the social media user. The mobility of social media has taken it out of the bedroom and the office and increased its immediacy and presence in relation to real world actions and events. Twitter is often given as the archetypal example of how real-world events can be monitored and reported in real time by socially connected users. Instagram also offers the opportunity to photograph incidents of interest, add hashtags to increase visibility and upload them for all to see.

Crucially, analysts can expect to see a further decrease in the time interval between real-world events occurring and the reporting of them on social media. Advances in mobile communications will allow Twitter, Facebook and Instagram feeds to be increasingly responsive to real world events. This is especially useful to analysts working in disaster or crisis response.

Instagram’s interface allows it to be used as a personal tool of documentation – a Google Earth seen through the eyes of social media users

In a fascinating paper detailing how visualisations of social media data can provide social and cultural insights, Hochman and Manovich88 discuss how Instagram’s interface allows it to be compared to “planetary documentation tools” like Google Earth and Bing Maps.
Instagram timestamps are not given as a specific date, but rather as a dynamic timespan: photos are “2 days ago” rather than “30/09/14”. This shifts the timestamps from an objective description to something that is entirely relative to the user. Photos are made “atemporal”. However, exact and concrete geographical location is emphasised by Instagram. Users can either tag their photos by venue or add them to a personal “photomap”.

88 Hochman, Nadav and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local through social media’, First Monday, Vol. 18, No. 7, 2013, <http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29 September 2014].

Figure 21. Left to right: Instagram’s timeline, filters page, photo map.89

This definite geography and temporal relativity is used in conjunction with “filters” that can be added to Instagram photos to give a variety of nostalgic effects. Each one of these filters (sepia, black and white, cross processed, etc.) is designed to evoke a different feeling in the photographs. With this feature photos are taken with filters in mind, and studying the different feelings and impressions created by Instagram features can provide useful information about the mind-set of the user. Filters broadly communicate a feeling of authenticity or nostalgia and stamp the images as personal to the individual user, in the way a one-off original print would be. Filters are an attempt to rebel against the cold and bland nature of digital images, to give a feeling of uniqueness to infinitely replicable images by adding synthetic imperfections that, ironically, can be applied an infinite number of times.

The third feature that completes Instagram’s status as a subjective tool of documentation is the instant sharing function that increases a post’s visibility within a wider social media context.
Photos on Instagram are generally not shared only with a user’s personal network; unless an account is protected, photos can be searched by user, hashtag and location. Individual photos on Instagram are designed to fit into a wider collaborative project, filling the world with a visual documentation effort. Thus Instagram ‘resists the time and place presented by larger impersonal corporate efforts.’90 Moreover, Instagram’s inherently mobile nature and its ubiquitous presence in the pockets of its millions of users lead to a daily upload of masses of potentially useful information for the analyst to collate, sift and analyse.

89 Instagram official screenshots.
90 Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through Instagram.’ Princeton University Senior Thesis, 2013.

It is unclear how far this affects the everyday experience of Instagram users; certainly some will be more aware than others. What SOCMINT analysts should take away from this is twofold. Firstly, Instagram photos can be used as an alternative geographic source to Google Maps, one which provides a view of the world mediated through user experience. This has an obvious role as a tool for geolocation. Secondly, the more users become aware that their Instagram photos are being used for this purpose, the more subjective and self-aware their posting will be, fundamentally altering its contribution to information collection and the formation of an all-source intelligence product.

Note: It is also possible to create a Google Earth and Instagram hybrid by combining the two and viewing Instagram photos through the Google Maps interface.91

Reddit’s vote based ranking system means that users pander for votes rather than speaking their mind

As mentioned previously in this report, Reddit uses a vote based ranking system to decide the visibility of posts on the front page and in individual subreddits, and of comments within posts.
Users can “upvote” posts to increase visibility and “downvote” to decrease it. If a post receives enough downvotes it will be hidden altogether.

Figure 22. Reddit’s front page. Upvote/downvote scores are visible on the left of the thumbnails.

91 Some attempts to do this can be seen here: http://www.gramfeed.com/ http://instahood.meteor.com/ http://www.shots24.com/ and http://instaearth.me/#/stephaniedurant/photos/800672074709271886_240181984 [accessed 29 September 2014].

Reddit’s guideline page, Reddiquette, gives a series of instructions on how users should conduct their voting: do not downvote something just because you don’t like it; do not mass downvote someone else’s posts; do not moderate a story based on your opinion of its source; and do not upvote or downvote based on the person who posted it.92 Whilst these rules are generally followed by users voting on threads/posts (see above), when users vote on comments they tend to abide by much less objective guidelines.

Figure 23. Reddit’s comment interface, viewable by clicking on the “comments” link underneath a post/thread.

The very nature of comments on internet forums means that they will be loaded with (to name a few) divisive opinions, questionable facts and ad hominem replies. The upshot is that Reddit’s comment voting system descends into a popularity contest, with redditors pandering to popular “reddit” opinions in order to receive upvotes.93 By making the correct pop-culture references, in-jokes and Reddit tropes, users can sail to the top of the comment pile. In addition, by far the majority of downvotes in comments on Reddit are from users that simply disagree with the expressed opinion.

The effect this has on Reddit from an intelligence perspective is mixed. If we would like Reddit to be a forum that gives analysts a means to gauge user opinions about certain issues, the voting system obscures this.
Upvotes are not just given when comments ‘contribute to discussion’,94 but in response to a variety of factors. Downvotes, however, are a little more uniform, generally being given out en masse when a user posts something that is rude or abrupt, or something that conflicts with Reddit’s general outlook (see above). That said, this cannot be taken as a rule and the specific instances are almost always contextual to the particular subreddit under discussion. This again imposes additional costs on the analyst, an important characteristic of social media as an intelligence discipline. An appreciation of this is leading many organisations, from private companies to intelligence agencies, to develop dedicated SOCMINT teams, although this is still a nascent phenomenon.

92 Reddiquette, http://www.reddit.com/wiki/reddiquette [accessed 29 September 2014].
93 See above: Representation/Identity features for details on how certain sets of beliefs and attitudes come to define the archetype “Redditor”.
94 Reddiquette, http://www.reddit.com/wiki/reddiquette [accessed 29 September 2014].

Additionally, because of Reddit’s comment sorting algorithms,95 comments that are posted earlier are much more likely to be visible and receive votes than comments from those that join the thread later. This places another arbitrary constraint on Reddit’s ranking system.

The result of all this is a situation where Reddit’s voting system (especially upvotes) should not be treated as a reliable barometer of user opinion. Whilst it may sometimes be the case that Reddit users adopt a reasonable approach, there are too many possible obfuscating factors that cannot be reasonably screened or adjusted for in any final analysis.

Analysts should be aware of third-party clients and enhancements utilised by social media users

Third-party social media clients utilise the original platform’s API and provide a new medium through which users can access the application’s services.
Usually these services aim to give an enhanced user experience; to give more control over a user’s social media; to optimise an application for mobile or desktop; or to tailor an application for a specific purpose (marketing, professional etc.). Different social media sites have different rules and guidelines for third-party developers, which the analyst must be aware of. However, most networks endorse third-party development, as successful third-party apps lead to successful parent products, often through adaptation of the network or through the acquisition and integration of the third-party company.96

Third-party use of social networks mediates use in any number of ways, depending on the specifics of the software. For analysts this is important: many of the factors of social media and their SOCMINT implications depend on an assumption that the default interfaces of social media are being used. When the social media experience changes through a third-party application, this may affect what sort of information can be known by the analyst.

Understanding timeline algorithms is the key to understanding user experience on social media

A central feature of almost every mainstream social media service is the timeline/newsfeed/front page.97 Taking different names depending on the service, this feature performs a similar function across the board of social media: it is the place where the feeds from a user’s subscriptions (friends, liked pages, joined subreddits, pages followed on Twitter) are collated into one stream of information. This is the place where users of social media spend the majority of their time; it dictates what updates they receive from friends or brands, which news stories are visible and which posts from which subreddits appear. The timeline’s place as the centrepiece of social media means that its effect on the user experience should not be underestimated. A developed understanding is therefore essential for SOCMINT analysts.

95 Comments in Reddit threads can be sorted by the following criteria: best (comments that are predicted to have a very good upvote/downvote ratio), top (posts with the best overall ratio), new, hot (new posts that are getting a good ratio), controversial (posts with lots of up and down votes), old.
96 For instance, Facebook has offered grants between $25,000 and $250,000 for developers.
97 For convenience I will refer generically to this feature as a “timeline” for this section, but this is not to suggest that it is a general term.

To develop this understanding, we must look in detail at the differences between timelines across social media. There are a number of different ways of “sorting” content that platforms use, ranging from the all-inclusive reverse chronological timeline of Twitter and Instagram, to the vote-based system seen on Reddit and Digg and the confidential algorithm that decides what is visible on Facebook.

To begin with, the simplest ranking mechanism for a timeline is the all-inclusive, reverse chronological method employed by sites such as Twitter and Instagram (and to some extent, Reddit). This is exactly what it sounds like: all Tweets on Twitter and photos on Instagram are visible on the timeline, with the most recent Tweets and photos at the top. No content whatsoever is obscured from the user in this way. So, as far as user experience goes, we can be sure that if a user is following a page, person or brand, then they will be receiving visible updates from that page.

At the time of writing98 there has recently been a lot of media attention99 concerning comments made by Anthony Noto, Twitter’s financial chief, who said that some of the most relevant Tweets can be buried at the bottom of a user’s feed, suggesting that a more curated feed might be in the pipeline for Twitter.
Indeed, Twitter has been introducing some of these sorts of features into the timeline for some time now. We can now see some Tweets that the accounts we follow have favourited or replied to (although it has been stressed by Twitter representatives that these appear only when there are no new Tweets to show but the user is refreshing his or her timeline). Twitter’s traditional timeline is considered an important part of the Twitter experience for many users and is thought to contribute to its democratic and egalitarian ideology. For this reason, suggestions of a filtered timeline have been met with strong opposition by Twitter’s loyal community. Instagram is not showing any inclination of moving in a similar direction, but given the fact that Facebook paid almost $1 billion for the company (and has hinted at the possibility previously100) it appears likely that at some point in the future we will see something like embedded adverts in Instagram’s feed.101

98 Early September 2014.
99 Hern, Alex, ‘End of the timeline? Twitter hints at move to Facebook-style curation’, The Guardian, 4 September 2014, <http://www.theguardian.com/technology/2014/sep/04/twitter-facebook-style-curatedfeed-anthony-noto?CMP=twt_gu> [accessed 29 September 2014].

Facebook’s algorithm is the subject of a thousand technology blog posts and provides the basis of an entire industry of social media marketers who make a living out of trying to game it. It is incredibly complex and involves thousands of factors in determining where stories appear on a timeline. The algorithm is also secret, so it is not worth attempting to understand its precise mathematics. That said, we know that Facebook still considers the general basis for the now defunct EdgeRank algorithm as important.102,103 It makes more sense to understand the Facebook ranking algorithm in terms of its ideology.
Lars Backstrom, a Facebook engineer, has said that Facebook’s ‘main goal is to create the best personalised newspaper for all of its users.’104 This means that Facebook wants to strike the perfect balance between showing updates from friends that users care about and updates from brands and pages that they are interested in. Facebook quantifies this by trying to maximise the number of interactions with the content displayed.

Facebook’s algorithm has been heavily criticised for hiding content and creating an “echo chamber” where users only see content they are already interested in, rather than being introduced to new content. This reinforces pre-existing connections and prevents users from being introduced to new content they would otherwise be interested in. Tim Herrera at The Washington Post spent 6 hours scrolling through his newsfeed and was only shown a fraction of the content that was posted by pages and people he was connected with, even to the point where content was being replicated.105 Similarly, Mat Honan wrote an article in Wired in which he described what happened to his Facebook news feed when he liked every single piece of information presented to him by Facebook.106 Honan found that his Facebook feed drifted rapidly to the American political right before becoming increasingly polarised between sentiments on the extremes of the political left and right in the US.

100 Borow, James, ‘How Facebook is Already Profiting from Instagram’, Ad Age, 8 August 2013, <http://adage.com/article/digitalnext/facebook-profiting-instagram/243515/> [accessed 29 September 2014].
101 This is similar to the strategy Snapchat has taken in order to increase revenue. Some of the messages Snapchat users receive are now adverts. It seems that embedding adverts into the normal user experience of these sites is becoming the norm, rather than having ads located in a sidebar or as pop-ups. Whilst Facebook does have adverts in the sidebar, it has also started introducing them into the regular newsfeed.
102 McGee, Matt, ‘EdgeRank is Dead: Facebook’s News Feed Algorithm Now Has Close to 100k Weight Factors’, Marketing Land, <http://marketingland.com/edgerank-is-dead-facebooks-news-feed-algorithm-now-hasclose-to-100k-weight-factors-55908> [accessed 29 September 2014].
103 EdgeRank is a combination of three factors, Affinity, Weight and Time Decay, which decide where an Edge appears on the News Feed. Edges aren’t just posts on Facebook; they include anything at all that happens on the site, including likes, comments etc. The algorithm looks at all the Edges connected to a user and ranks them based on their importance to the user; objects with the highest EdgeRank are sent to the top.
104 King, Rachel, ‘Facebook engineers explain News Feed ranking algorithms; more changes soon’, ZDNet, 6 August 2013, <http://www.zdnet.com/facebook-engineers-explain-news-feed-ranking-algorithms-morechanges-soon-7000018996/> [accessed 29 September 2014].
105 Herrera, Tim, ‘What Facebook doesn’t show you’, The Washington Post, 18 August 2014, <http://www.washingtonpost.com/news/the-intersect/wp/2014/08/18/what-facebook-doesnt-showyou/?Post+generic=%3Ftid%3Dsm_twitter_washingtonpost> [accessed 29 September 2014].
106 Honan, Mat, ‘I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me’, Wired, 8 November 2014, <http://www.wired.com/2014/08/i-liked-everything-i-saw-on-facebook-for-two-days-hereswhat-it-did-to-me/> [accessed 9 November 2014].

Interestingly, Reddit offers customised ranking systems to its users. Reddit provides six different optional ranking algorithms that users can choose to sort posts and comments by. The default setting for posts is ‘top’ and the default for comments and replies is ‘best’.
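To make the practical difference between these sort orders concrete, the following is a minimal Python sketch of how the same set of comments can be reordered under different ranking rules. It is an illustration under stated assumptions, not Reddit’s actual code. The `Comment` class, the function names and the sample figures are all hypothetical. ‘Best’ is approximated here by the lower bound of a Wilson score confidence interval, the statistical approach described in the Reddit blog post on comment sorting cited in this report; the 95% confidence level is an assumption. ‘Top’ is assumed to be the net score (upvotes minus downvotes), and the ‘controversial’ formula is an invented stand-in that simply rewards many votes split close to 50/50. ‘Hot’ is omitted because it depends on time-decay details not covered here.

```python
import math
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical record of a comment's votes and posting time."""
    text: str
    ups: int
    downs: int
    created: float  # seconds since the thread was opened


def wilson_lower_bound(ups: int, downs: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumption).
    A comment with few votes gets a cautious, lower estimate.
    """
    n = ups + downs
    if n == 0:
        return 0.0
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


def controversy(ups: int, downs: int) -> float:
    """Illustrative stand-in: many votes, split close to 50/50, scores high."""
    if ups == 0 or downs == 0:
        return 0.0
    return (ups + downs) * min(ups, downs) / max(ups, downs)


def sort_comments(comments: list[Comment], order: str) -> list[Comment]:
    """Return the comments ranked under one of the sketched sort orders."""
    keys = {
        "best": lambda c: wilson_lower_bound(c.ups, c.downs),
        "top": lambda c: c.ups - c.downs,  # assumed: net score
        "new": lambda c: c.created,
        "old": lambda c: -c.created,
        "controversial": lambda c: controversy(c.ups, c.downs),
    }
    return sorted(comments, key=keys[order], reverse=True)


if __name__ == "__main__":
    thread = [
        Comment("early", ups=100, downs=40, created=0.0),
        Comment("late", ups=10, downs=0, created=1000.0),
        Comment("split", ups=50, downs=50, created=500.0),
    ]
    for order in ("best", "top", "new", "old", "controversial"):
        print(order, [c.text for c in sort_comments(thread, order)])
```

Under these assumptions, an early comment with a large but mixed vote count leads the ‘top’ ordering, whereas a later comment with fewer but unanimous upvotes can lead the ‘best’ ordering. For the analyst, the point is that the apparent prominence of an opinion depends on which sort order was in use when the thread was viewed.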
Below is an explanation of the six optional ranking algorithms that users can select on Reddit.

- Best – The newest ranking algorithm, written by xkcd.107 Best aims to predict the quality of the post based on its current score. It estimates what sort of score the post would receive if everyone had seen it. Posts with the best estimated ratio appear higher up the feed.
- Hot – Based on the rate of upvotes to downvotes. Posts that are currently receiving a lot of upvotes and comparatively few downvotes will appear nearer the top of the feed.
- New – Ranks the newest posts first.
- Top – Simply upvotes versus downvotes. Posts with the best ratio appear highest up the feed. This has been highly criticised because it is massively biased towards content that is posted early. If a reply is posted 5 minutes after the story has been created, it will be much more visible than a (much better) reply that is posted a few days later. The higher something is listed, the more visibility it has and the better chance it has of receiving a lot of upvotes.
- Controversial – Posts that have a ratio of upvotes to downvotes that is close to 50/50 will appear higher up in the feed.
- Old – Ranks the oldest posts first.

These different ranking systems alter the way in which information is presented on Reddit and provide the analyst with the opportunity to approach conversations from six different perspectives.

107 ‘reddits new comment sorting system’, Reddit Blog, 15 October 2009, <http://www.redditblog.com/2009/10/reddits-new-comment-sorting-system.html> [accessed 29 September 2014].

Conclusions & Recommendations

Recommendation 1: Social networks and their internal features should always be examined contextually. The role of a social network is defined by the situation within which it is being used.

This report provides a framework with which to analyse alternate social networks.
It also highlights the extent to which social networks must be examined within their specific context. The example of hashtag use across Twitter and Instagram illustrates this. Whilst we see some similarities in hashtag use across both platforms, within specific contexts hashtags are used in very different ways. A similar trend can be seen in many features across social media. This does not mean that generalisations cannot be made, but it shows that it is often the details of social media use that provide the most insightful information.

Recommendation 2: Organisations should develop ethical and legal due process for analysts’ behaviour on social media.

There is, as yet, no definite legal procedure designed to regulate the behaviour of analysts on social media.108 This perhaps reflects the absence of any ethical consensus. It is in the interest of organisations to develop a stringent ethical code of conduct for analysts. Ethical concerns should be seen from the perspective of the social media user as well as that of appropriate legal and regulatory bodies. That is to say, organisations should be aware of the public repercussions that can result from unethical social media intelligence practice as well as any additional ethical concerns. In addition, many of the information gathering techniques necessary for social media intelligence involve analysts interacting directly with users of social media. This being the case, there must be a set of guidelines to govern this behaviour. These may include imperatives such as the necessity for analysts to reveal their identity and intentions when interacting with users on social media.

Recommendation 3: Organisations that value SOCMINT should consider funding research into areas that will benefit SOCMINT. These may include: user representation on social media and sentiment/credibility analysis.
SOCMINT’s relative youth as an intelligence discipline means that much of the most beneficial innovation and research has yet to be conducted. Organisations must consider the cost/benefit of either: a) allocating resources to fund external research projects; or b) allocating funds and assigning personnel to research in-house. The potential benefits from gaining an advantage in areas such as automated sentiment and credibility analysis depend on the success of the research, but could prove immensely useful to intelligence organisations. In addition, research focusing on a psychological or linguistic approach to social media analysis may have fewer obvious short-term benefits but could eventually transform the field, providing information and insights previously thought to be unattainable.

108 Demos have advocated an application of the Regulation of Investigatory Powers Act (RIPA) as a legal framework for the gathering and use of social media data. Bartlett, Jamie, Miller, Carl, Crump, Jeremy, Middleton, Lynne, ‘Policing in an Information Age’, Demos, March 2013, <http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365> pp. 1-42, [accessed 29 September 2014].

Recommendation 4: Organisations must decide whether to implement a dedicated SOCMINT team or to embed SOCMINT specialists within existing analytical teams.

As yet, there is no accepted protocol for structuring SOCMINT within existing intelligence infrastructure. Whether it makes more sense to implement separate or embedded SOCMINT teams is primarily dependent on the relationship between SOCMINT and the other types of intelligence research conducted by a particular organisation. If, for instance, a team is working on geolocation and utilising various different methods, then it seems that embedded SOCMINT geolocation specialists would be the right choice.
However, if the role of SOCMINT is to provide a separate perspective on a particular issue, perhaps Twitter as a counterpoint to traditional news media, then a separate SOCMINT team makes more sense.

Recommendation 5: Resources permitting, organisations should consider tasking an analyst to track advances in social media and social media analysis technology.

Given the rapidly changing landscape of social media, it is imperative for any SOCMINT organisation to be up to date with the social media status quo. For instance, during the writing of this report, Instagram publicly announced plans to integrate advertisements into its newsfeed. Algorithms that govern timelines, search results and recommendations are constantly being tweaked. As well as changes to current popular social networks, the massive industry of tech start-ups is churning out endless newer alternatives to the traditional options. Sometimes these may be adaptations of existing social media,109 or entirely new concepts.110 SOCMINT analysts must be aware of any relevant new developments in social media in order to exploit them efficiently for intelligence purposes. Inability to do this is one of the primary problems preventing SOCMINT from being a successful intelligence discipline.

Similarly (as mentioned above), social media analysis technology is developing at a fast rate, with many organisations choosing to purchase tools externally rather than develop their own bespoke alternatives. Remaining up to date with advances in social media analysis tools is essential to maintaining an edge in SOCMINT. Whilst many well publicised developments may be easy to keep track of, others will require some significant research. The combination of keeping track of developments in social media itself and in the tools used to analyse it warrants consideration by organisations, particularly relating to the issue of whether it is worth dedicating an analyst to this specific purpose.
109 ‘Medium’ (https://medium.com/) is similar to Twitter but focuses on longer, more detailed posts.
110 ‘Learnist’ (https://learni.st/explore) is a social network based around sharing educational information.

Glossary of Technical Terms

4chan – An English-language forum made up of a community of anonymous users. 4chan’s most popular forum “random” or “/b/” has been described as a place where ‘people try to shock, entertain and coax free porn from each other.’111

111 Douglas, Nick, ‘What The Hell Are 4chan, ED, Something Awful, And “b”?’ Gawker, 2008, <http://web.archive.org/web/20080724081826/http://gawker.com/346385/what-the-hell-are-4chan-edsomething-awful-and-b> [accessed 29 September 2014].

9Gag – A platform where users share images, videos, links and vote on the quality of content. Comparable (in theory) to Reddit.

Application Programming Interface (API) – Specifies a software component in terms of its operations, their inputs and outputs and underlying types. Provides the means by which third-party applications can access the data on social media.

Astroturfing – The masking of the sponsors of a message or organisation to make it appear as though it originates from and/or is supported by grassroots participants.

Avatar – A term broadly used to describe a person’s online representation of themselves; or specifically used to describe the profile picture employed by a user.

Caption – The text accompanying a photo (e.g. on Instagram).

Emoticon – Small text-embedded images or text itself that is used to convey feelings or emotions on social media (e.g. ;) :P :’().

Exchangeable image file format (Exif) – A standard that specifies the formats for images, sound and ancillary tags for systems handling image and sound files.

Favourites – A function on Twitter that allows users to express approval of posts and/or collate them for later viewing.

Friends Reunited – The first social networking site to become popular in Britain, focused on connecting users with friends they had fallen out of contact with.

Geotag – Metadata (see below) attached to a post on social media that provides details about where the user was when posting, or metadata attached to a piece of media (video/photo) containing location details.

Hashtag – A hash (pound) sign, #, prefixed to words on social media. For example: #example. Used on Twitter, Instagram and Facebook to increase visibility of posts, contribute to ‘trends’ and perform other functions discussed throughout this report.

Likes – A feature of Instagram and Facebook that allows users to express approval of posts.

LinkedIn – A social network designed for working professionals.

Meme – A term coined by Richard Dawkins to designate an idea, behaviour or style that spreads throughout a culture. The term is more commonly used to describe image macros online.112

112 <http://wac.450f.edgecastcdn.net/80450F/thefw.com/files/2012/05/most-interesting-man-meme.jpg>

Metadata – ‘Data about data’, information about the content of a piece of data. Metadata on a Tweet contains that user’s follower count amongst other things. Metadata on a photo may contain its size, the camera used, the location the photo was taken (see: Geotag above) or other information.

MMORPGs – Massively Multiplayer Online Role-Playing Games. A term used to describe video games where many users interact in an online world.

Netnography – A branch of ethnography which seeks to analyse the behaviour of individuals and communities online; it originally adapted market research techniques to provide insights.

Ranking Algorithm – The algorithm that determines where certain information appears in a timeline/newsfeed/search engine result.

Reddit Gold – Reddit users can donate gold to each other in order to give each other premium membership. Reddit receives the money.

Regram – A trend on Instagram where users repost each other’s images; it is not a feature supported by the site but has emerged from user appropriation of the network’s features.
SIDE Model – The psychological model that seeks to explain why users behave differently when interacting as part of a community; it has some interesting applications for social media analysts.

Sock Puppets – A term given to fake profiles on social media, usually ones that have been created for deceptive purposes.

Subreddit – A forum on Reddit dedicated to discussion of a particular topic.

Trending – Popular words, phrases or hashtags on social media.

Tumblr – A popular blogging platform, especially amongst teenagers and young adults.

Upvotes/Downvotes – Enable users on Reddit to vote on posts.

VKontakte (VK) – The second largest social network in Europe after Facebook, similar to Facebook in purpose and design. VK is primarily Russian-speaking, although it is available in several languages.

Works Cited

Books, Journals and Articles

Bai, Shoutian, Zhu, Tingshao and Cheng, Li, ‘Big-Five Personality Prediction Based on User Behaviors at Social Network Sites’, eprint arXiv:1204.4809, 2010

Bartlett, Jamie, Miller, Carl, Crump, Jeremy, Middleton, Lynne, ‘Policing in an Information Age’, Demos, March 2013, pp. 1-42, <http://www.demos.co.uk/files/DEMOS_Policing_in_an_Information_Age_v1.pdf?1364295365>, [accessed 29 September 2014]

Boyd, Danah, Golder, Scott and Lotan, Gilad, ‘Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter’, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010, <http://www.danah.org/papers/TweetTweetRetweet.pdf> [accessed 29 September 2014]

Bryden, John and Funk, Sebastian, ‘Word usage mirrors community structure in the online social network Twitter’, EPJ Data Science, Vol. 2, No. 3, 2013, <http://www.epjdatascience.com/content/2/1/3> [accessed 29 September 2014]

Chan, Michael, ‘The Impact of Email on Collective Action: A field application of the SIDE model’, New Media and Society, 2010.

Gunawardena, Nipun, et al. ‘Instagram Hashtag Sentiment Analysis’, University of Utah, 2013, <http://www.eng.utah.edu/~cs5350/ucml2013/3-3p.pdf> [accessed 29 September 2014]

Gupta, Aditi et al., ‘TweetCred: A real-time Web-based System for Assessing Credibility of Content on Twitter’, Indraprastha Institute of Information Technology, 2014.

Gyorffy, Rachele, ‘#NoFilter: Exploring Self Promotion and Identity Creation through Instagram.’ Princeton University Senior Thesis, 2013

Heppner, Paul P, Wampold, Bruce E and Kivlighan, Jr. Dennis M, ‘Research Design in Counselling (Research Statistics & Program Evaluation)’, Cengage Learning 3rd Edition, 2007.

Hochman, Nadav, and Manovich, Lev, ‘Zooming into an Instagram City: Reading the local through social media’, First Monday, Vol. 18, No. 7, 2013 <http://firstmonday.org/ojs/index.php/fm/article/view/4711/3698> [accessed 29 September 2014].

Kosinski, Michael, Stillwell, David and Graepel, Thor (2013), ‘Private Traits and attributes are predictable from digital records of human behaviour’, Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 15, pp. 5802-5805.

Lapidot-Lefler, Noam and Barak, Azy, ‘Effects of anonymity, invisibility, and lack of eye-contact on toxic online disinhibition’, Computers in Human Behaviour, Vol. 28, No. 2, 2012, pp. 434-443

Omand, Sir David, Bartlett, Jamie and Miller, Carl, ‘Introducing Social Media Intelligence (SOCMINT)’, Intelligence and International Security, 2012, <http://www.academia.edu/1990345/Introducing_Social_Media_Intelligence_SOCMINT_> [accessed 29 September 2014]

Paulhus, L, and Williams, K, ‘The Dark Triad of personality: Narcissism, Machiavellianism and Psychopathy’, Journal of Research in Personality, Vol. 36, 2002, pp. 556-563

Suler, John, ‘The Online Disinhibition Effect’, Cyberspace and Behaviour, Vol. 7, No. 3, 2004, pp. 321-326.
Summer, Chris et al., ‘Predicting Dark Triad Personality Traits from Twitter usage and a linguistic analysis of Tweets’, Proceedings of the IEEE 11th International Conference on Machine Learning and Applications ICMLA 2012, 2012.

Graphs

‘Daily Active Facebook Users by Country/Region’, The International Centre for Security Analysis and Facebook, <www.facebook.com>, 2014 [accessed 29 September 2014]

‘Distribution of Twitter users worldwide from 2012 to 2018’, Statista, <http://www.statista.com/statistics/303684/regional-twitter-user-distribution/> [accessed 29 September 2014]

‘Growth of Instagram users worldwide from 4th quarter 2013 to 1st quarter 2014, by generation’, Statista, <http://www.statista.com/statistics/307026/growth-of-instagramusage-worldwide/> [accessed 29 September 2014]

‘How Mobile are Social Networks? % of time spent on social networks in the United States, by platform’, Statista, <http://www.statista.com/chart/2091/mobile-usage-of-socialnetworks/> [accessed 29 September 2014]

‘Millions of Teens Have Abandoned Facebook Since 2011’, Statista, <http://www.statista.com/chart/1789/facebook-s-teenager-problem/>, [accessed 29 September 2014]

‘Most addicted/engaged countries by avg. pageviews per visit.’, Reddit Blog, 2011, <http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29 September 2014]

Olson, Randal, ‘Most Active Subreddits’, 2013, <http://www.randalolson.com/> [accessed 29 September 2014]

‘Percentage of UK Internet users who use Twitter as of February 2013, by age group’, Statista, <http://www.statista.com/statistics/257429/share-of-uk-internet-users-who-use-twitter-byage-group> [accessed 29 September 2014]

‘Regional distribution of Instagram traffic in the last three months as of April 2014, by country’, Statista, <http://www.statista.com/statistics/272933/distribution-of-instagramtraffic-by-country/> [accessed 29 September 2014]

Solis, Brian and JESS3, ‘The Conversation Prism’, 2014, <www.conversationprism.com> [accessed 29 September 2014]

‘Which Cities & Countries Have the Most reddit Addicts?’, Reddit Blog, 2011, <http://www.redditblog.com/2011/06/which-cities-countries-have-most-reddit.html> [accessed 29 September 2014]

News Articles and Blog Posts

Al-Tamimi, Aymenn Jawad, ‘Reflections on Methods’, Aymenn Jawad Al-Tamimi’s Blog, July 22 2014 <http://www.aymennjawad.org/2014/07/reflections-on-methods> [accessed 29 September 2014]

‘Banned Hashtag Search’, The Data Pack, <http://thedatapack.com/tools/blocked-hashtagsearch/> [accessed 29 September 2014]

Benthall, Sebastian, ‘“Weird Twitter” art experiment method notes and observations’, Digifesto, 18 October 2012, <http://digifesto.com/2012/10/18/weird-twitter-artexperiment-method-notes-and-observations/> [accessed 29 September 2014]

Borow, James, ‘How Facebook is Already Profiting from Instagram’, Ad Age, 8 August 2013, <http://adage.com/article/digitalnext/facebook-profiting-instagram/243515/> [accessed 29 September 2014]

Domm, Patti, ‘False Rumor of Explosion at White House Causes Stocks to Briefly Plunge; AP Confirms its Twitter Feed Was Hacked’, CNBC, <http://www.cnbc.com/id/100646197#> [accessed 29 September 2014]

Douglas, Nick, ‘What The Hell Are 4chan, ED, Something Awful, And “b”?’ Gawker, 2008, <http://web.archive.org/web/20080724081826/http://gawker.com/346385/what-the-hellare-4chan-ed-something-awful-and-b> [accessed 29 September 2014]

Fitzpatrick, Alex, ‘Judge: Public Tweets Have No “Reasonable Expectation of Privacy”’, Mashable, 3 July 2012, <http://mashable.com/2012/07/03/twitter-privacy/> [accessed 29 September 2014]

Foster, Michael, ‘Two things the Fappening Teaches Marketers’, All Voices, <http://www.allvoices.com/article/100000692> [accessed 29 September 2014]

GrandJean, Martin, ‘Analyser graphiquement son réseau facebook’, MartinGrandJean, 17 March 2013, <http://www.martingrandjean.ch/analyser-graphiquement-reseaufacebook/> [accessed 29 September 2014]

Hern, Alex, ‘End of the timeline? Twitter hints at move to Facebook-style curation’, The Guardian, 4 September 2014, <http://www.theguardian.com/technology/2014/sep/04/twitter-facebook-style-curatedfeed-anthony-noto?CMP=twt_gu> [accessed 29 September 2014]

Herrera, Tim, ‘What Facebook doesn’t show you’, The Washington Post, 18 August 2014, <http://www.washingtonpost.com/news/the-intersect/wp/2014/08/18/what-facebookdoesnt-show-you/?Post+generic=%3Ftid%3Dsm_twitter_washingtonpost> [accessed 29 September 2014]

Honan, Mat, ‘I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me’, Wired, 8 November 2014, <http://www.wired.com/2014/08/i-liked-everything-i-saw-onfacebook-for-two-days-heres-what-it-did-to-me/> [accessed 9 November 2014].
‘How Al Qaeda Uses Encryption Post-Snowden (Part 1), Recorded Future, 8 May 2014, <https://www.recordedfuture.com/al-qaeda-encryption-technology-part-1/> [accessed 29 September 2014] ‘Is a Retweet an Endorsement?’, Think Differently, 19 December 2012, <http://thinkdifferently.ca/differently/is-a-retweet-an-endorsement/> [accessed 29 September 2014] King, Rachel, ‘Facebook engineers explain News Feed ranking algorithms; more changes soon’, ZDNet, 6 August 2013, <http://www.zdnet.com/facebook-engineers-explain-newsfeed-ranking-algorithms-more-changes-soon-7000018996/ > [accessed 29 September 2014] McGee, Matt, ‘EdgeRank is Dead: Facebook’s News Feed Algorithm Now Has Close to 100k Weight Factors’, Marketing Land, <http://marketingland.com/edgerank-is-dead-facebooksnews-feed-algorithm-now-has-close-to-100k-weight-factors-55908> [accessed 29 September 2014] Munro, Dan, ‘Twitter Community #BCSM Expands Online to Broaden Patient Engagement’, Forbes, 31 March 2013, <http://www.forbes.com/sites/danmunro/2013/03/31/twittercommunity-bcsm-expands-online-to-broaden-patient-engagement/> [accessed 29 September 2014] ‘PsyOps and Socialbots’, Infosec Institute, <http://resources.infosecinstitute.com/psyopsand-socialbots/> [accessed 29 September 2014]. 
68 Reddiquette, http://www.reddit.com/wiki/reddiquette [accessed 29 September 2014] ‘reddits new comment sorting system’, Reddit Blog, 15 October 2009, <http://www.redditblog.com/2009/10/reddits-new-comment-sorting-system.html> [accessed 29 September 2014] ‘Resignation calls over councillor’s pineapple retweets’, BBC News, <http://www.bbc.co.uk/news/uk-england-stoke-staffordshire-14709241> [accessed 29 September 2014] ‘Retweet the old fashioned way, using “classic” or “traditional” retweets only’, Ray’s 2.0, 3 September 2013, <http://rays20.blogspot.co.uk/2010/06/traditional-retweet-tr-keyto.html> [accessed 29 September 2014] Rosen, Armin, ‘The Remarkable Story of a Rising Terrorism Analyst Who Got Too Close to His Subjects’, Business Insider, July 22 2014, <http://www.businessinsider.com/tamimi-2014-7> [accessed 29 September 2014 Rusli, M. Evelyn, ‘Facebook buys Instagram for $1 Billion’, The New York Times, April 2012, <http://dealbook.nytimes.com/2012/04/09/facebook-buys-instagram-for-1billion/?_php=true&_type=blogs&_r=0> [accessed 29 September 2014] ‘The Banned #Hashtags of Instagram’, The Data Pack, 26 August 2013, <http://thedatapack.com/banned-hashtags-instagram/#comment-6156> [accessed 29 September 2014] ‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September 2014] ‘Twitter users forming tribes with own language, tweet analysis shows’, Guardian Data Blog, <http://www.theguardian.com/news/datablog/2013/mar/15/twitter-users-tribes-languageanalysis-tweets> [accessed 29 September 2014] VICE Staff, ‘We are with John McAfee Right Now, Suckers’, 3 December 2012, <http://www.vice.com/en_uk/read/we-are-with-john-mcafee-right-now-suckers> [accessed 29 September 2014] ‘Weird Twitter’, Know Your Meme, <http://knowyourmeme.com/memes/weird-twitter> [accessed 29 September 2014] ‘Why John McAfee Is Paranoid about Mobile’, Dark Reading, 19 August 2014, 
<http://www.darkreading.com/informationweek-home/why-john-mcafee-is-paranoidabout-mobile-/a/d-id/1298090> [accessed 29 September 2014] 69 Social Networks and Other Sites ‘Cytoscape’, <http://www.cytoscape.org/> [accessed 29 September 2014] i2 National Security and Defense Intelligence’, <http://www03.ibm.com/software/products/en/national-security-defense-intelligence> [accessed 29 September 2014] ‘Instagram’, <www.instagram.com> [accessed 29 September 2014] ‘Jeffrey’s Exif Viewer’, <http://regex.info/exif.cgi> [accessed 29 September 2014] ‘Learnist’ (https://learni.st/explore) [accessed 29 September 2014] ‘Medium’ (https://medium.com/) [accessed 29 September 2014] ‘Palantir’, <https://www.palantir.com/> [accessed 29 September 2014] ‘Silobreaker’. <http://www.silobreaker.com/network-2> [accessed 29 September 2014] ‘The BCSM Community’, #BCSM, <https://www.youtube.com/user/BCSMCommunity> [accessed 29 September 2014] ‘The Open Graph Viz Platform’, Gephi, <http://gephi.github.io/> [accessed 29 September 2014] ‘Twitter’, <www.twitter.com> [accessed 29 September 2014] Welcome to the BCSM Community’, #BCSM, <www.bcsmcommunity.org> [accessed 29 September 2014] 70