Mingo Sanchez
Professor Eric Gaze and Professor Jen Jack Gieseking
Data Driven Societies – INTD 2420
14 May 2015

Under the Umbrella: What Data Visualizations Reveal about the Hong Kong Pro-Democracy Movement

Introduction

In the past decade, technology has given rise to a societal shift unlike any other before it. While technology is always evolving, the ubiquity of computers, smart devices, and social media in recent years has fundamentally altered the ways in which we operate and communicate. Previously, our social networks were limited primarily to those around us in physical space. With the rise of the Internet, however, we now have the ability to reach nearly anyone on the entire planet in minutes. With an entire world of connections at our disposal via websites like Facebook, Reddit, and Twitter, new social phenomena we would not have dreamed of ten years ago are now commonplace. Twitter in particular has been a catalyst behind many movements ranging from small-scale protests to full-blown revolutions all across the globe.

When I heard about the so-called “Umbrella Revolution” that began to gain momentum in Hong Kong last September, I knew that the protests had the potential to grow exponentially due to extensive media coverage and presence on social media both domestically (i.e., in Hong Kong) and internationally. While past research has been done on how political movements spread on social media websites like Twitter, I was particularly interested in studying the different types of people present in these online networks. I had previously heard about how Twitter can be used by political figures, media outlets, and even self-proclaimed “hashtivists” (short for hashtag activists) to reach a large audience. As such, I was curious about whether and how these different types of Twitter users would play a role in the discussions taking place on Twitter during large-scale political movements like the Hong Kong protests.
Ultimately, I chose to study two different but related hashtags on Twitter: #UmbrellaMovement and #UmbrellaRevolution.

Background and Literature Review

There have been several prominent revolutions sparked by social media platforms like Twitter in the past several years. That being said, the social and political atmospheres in these countries were and are all vastly different from the unique culture in Hong Kong. As such, I felt that there were two important components to consider when trying to understand the current movement in Hong Kong: the sociopolitical climate of Hong Kong, as well as the general nature of political movements on Twitter.

Because the unrest in Hong Kong started before protests took to Twitter, I began my research by learning more about Hong Kong and its relationship with mainland China. In 1841, Hong Kong was taken from China and became a British dependency. It was not until 1997 that Hong Kong was returned to Chinese control. Because of the strong British influence in Hong Kong, it was decided that Hong Kong would only be returned to China with certain stipulations about the territory’s relative autonomy. These agreements formed Hong Kong’s Basic Law, also known as “one country, two systems.”¹ Under the Basic Law, Hong Kong’s citizens have several rights not granted to mainland Chinese citizens, including the right to assembly and the right to develop a democracy. Former top-ranking Chinese official Lu Ping was even quoted as saying, “How Hong Kong develops its democracy in the future is completely within the sphere of the autonomy of Hong Kong. The central government will not interfere.”² These promises of autonomy helped to mitigate the unease of Hong Kong citizens, although recent developments have upset this balance.

¹ Euan McKirdy, “‘One Country, Two Systems’: How Hong Kong Remains Distinct from China,” CNN, September 30, 2014.
Last year, resentment of Beijing officials in Hong Kong began to rise after the government announced it would not allow Hong Kong residents to have full control over electing a chief executive in 2017.³ Instead of having the free elections they were promised, Hong Kong citizens were told that a nominating committee would be formed to select two or three candidates, each of whom must “have the endorsement of more than half of all the members of the nominating committee.”⁴ Protesters, composed mainly of student activists and members of the older Occupy Central movement, believe that CCP officials in Beijing plan to use this committee to screen political candidates in Hong Kong.⁵ It is important to note that not all Hong Kong citizens support the protesters. A study conducted by the School of Journalism and Communication at the Chinese University of Hong Kong found that in December of 2014, 33.9 percent of subjects interviewed supported the pro-democracy protesters, while 42.3 percent were opposed to the protests.⁶

After doing background research about Hong Kong and its pro-democracy protests, I shifted my attention to Twitter and social networks. Twitter is a fascinating social media platform because of the way in which it facilitates communication: unlike websites like Facebook and MySpace, on which social connections are bidirectional, Twitter has users “follow” other users, a one-way relationship.

² McKirdy, “‘One Country, Two Systems,’” CNN, September 30, 2014.
³ Rishi Iyengar, “Hong Kong’s Umbrella Revolutionaries are Slowly Coming Back to the Streets,” Time, April 14, 2015.
⁴ Yang Yi, “NPC Decides on Nominating Committee for HKSAR Chief Executive Selection,” Xinhuanet, August 31, 2014.
⁵ “Hong Kong’s Democracy Debate,” BBC, October 7, 2014.
⁶ School of Journalism and Communication, the Chinese University of Hong Kong, “Public Opinion & Political Development in Hong Kong Survey Results,” December 18, 2014, 10.
Whenever a user tweets a message, all of his or her followers are able to see those tweets and reply to or retweet them. Because relationships on Twitter are directed, users’ networks are more akin to audiences than groups of friends. Additionally, the immediacy of Twitter allows for ideas to spread rapidly. In his paper “The Data-Driven Society,” Alex Pentland stresses that maximizing idea flow is essential in any social network.⁷

Unsurprisingly, the unique qualities of Twitter as a medium have given rise to entirely new types of movements. As Marshall McLuhan says in his book Understanding Media: The Extensions of Man, “the personal and social consequences of any medium – that is, of any extension of ourselves – result from the new scale that is introduced into our affairs by each extension of ourselves, or by any new technology.”⁸ In the case of Twitter, citizen journalism and digital activism have become much more prevalent as a result of the social network. Dhiraj Murthy explains in his article “Twitter: Microphone for the Masses?” that the user base on Twitter is sufficiently large to allow for the nearly immediate spread of ideas across a large population.⁹ As a result, news organizations are quickly adopting Twitter as a new means of reaching consumers, and citizen journalists are becoming more and more prominent.

The Arab Spring that took place in the Middle East in 2011 was concrete proof of the power of Twitter in allowing people to communicate in order to bring about political change. In 2011, several researchers at the University of Washington found that social media platforms, especially Twitter, Facebook, and YouTube, were critical in allowing for conversations that led to political uprisings and subsequent calls for democracy in both Tunisia and Egypt.¹⁰ Another similar study published in 2013 found that protesters used Twitter extensively during the revolution in Egypt and the civil war in Syria. Tweets concerning these events were primarily in two languages: English and Arabic.¹¹ In both of these uprisings, the groups of English and Arabic users were largely disconnected from one another.¹² Furthermore, in both movements there appeared to be groups of domestic “elite” users tweeting news in English, supporting Murthy’s view that Twitter is primarily used as a platform for speaking to an audience.¹³ The prevalence of a small number of influential users in the networks during the Egyptian and Syrian uprisings is also consistent with the idea in network theory that non-random networks contain central nodes or hubs that “dominate the structure of all networks in which they are present.”¹⁴ For my research, I was particularly interested in studying influential users tweeting about the Hong Kong pro-democracy protests.

⁷ Alex Pentland, “The Data-Driven Society,” Scientific American 309, no. 4 (2013): 82.
⁸ Marshall McLuhan, “The Medium Is the Message,” in Understanding Media: The Extensions of Man (New York: McGraw-Hill, 1964), 1.
⁹ Dhiraj Murthy, “Twitter: Microphone for the Masses?” Media, Culture & Society 33, no. 5 (2011): 786.
¹⁰ Philip Howard et al., “Opening Closed Regimes: What Was the Role of Social Media During the Arab Spring?” PITPI, working paper, January 2011, 23.
¹¹ Axel Bruns, Tim Highfield, and Jean Burgess, “The Arab Spring and Social Media Audiences: English and Arabic Twitter Users and Their Networks,” American Behavioral Scientist 57, no. 7 (2013): 889–894.
¹² Ibid., 889, 892.
¹³ Ibid., 885–886.
¹⁴ Albert-László Barabási, Linked: How Everything is Connected to Everything Else and What it Means for Business, Science, and Everyday Life (New York: Basic Books, 2014), 64.

Methods and Hypothesis

In order to collect tweets containing #UmbrellaMovement and #UmbrellaRevolution, I used the TAGS Google Sheet template created by Martin Hawksey.¹⁵ TAGS allowed me to collect several thousand tweets every day from February 2, 2015 to March 6, 2015. Because the tweets were not necessarily posted on the same day they were collected using TAGS, my data set actually ended up containing tweets from January 29, 2015 to March 6, 2015. Although TAGS could not access the entire Twitter data set, I was able to collect over 18,000 unique tweets without having to request special access to the Twitter API. I collected tweets at approximately the same time every day (10:00 EST) in order to minimize the number of duplicate tweets collected.

After collecting over a month’s worth of Twitter data, I removed duplicate tweets using Microsoft Excel. In order to make the data easier to process, I sorted my complete data set by date and created a separate data set containing only tweets with geolocation data. I excluded tweets that supposedly were tweeted at the intersection of the equator and the Prime Meridian (i.e., 0 degrees latitude and longitude) because the locations of these tweets were almost certainly masked.

I used several tools to analyze my data set in different ways: Wordle and Voyant-Tools were useful for performing text analysis on my Twitter data set; CartoDB allowed me to map my geolocated tweets and perform spatial analysis on my data set; and Gephi was essential for viewing the structure of the network of users in my data set and performing social network analysis. All of these tools are free to use and available online.

¹⁵ Martin Hawksey, “#TAGS,” accessed May 13, 2015.

I had four major questions before collecting my data: (1) Where are people tweeting from? (2) What are people tweeting about? (3) Who is tweeting? (4) What patterns emerge from my data set? Due to the extensive media coverage in the United States about the Hong Kong protests, I expected to see two primary sources of tweets in my geolocated data set: the United States and Hong Kong.
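The cleaning steps described in the methods above (removing duplicates, sorting by date, and setting aside geolocated tweets while excluding those at 0 degrees latitude and longitude) were done in Microsoft Excel. Purely as an illustration, the same logic might be sketched in Python as follows. The field names `tweet_id`, `created_at`, `lat`, and `lon` are hypothetical and are not the actual TAGS column names, and the sample rows are invented.

```python
def clean_tweets(rows):
    """Deduplicate tweets, sort them by date, and split off geolocated ones.

    `rows` is a list of dicts with the hypothetical keys
    'tweet_id', 'created_at' (ISO 8601 string), 'lat', and 'lon'.
    """
    seen = set()
    unique = []
    for row in rows:
        if row["tweet_id"] in seen:
            continue  # the same tweet was collected on more than one day
        seen.add(row["tweet_id"])
        unique.append(row)

    # ISO 8601 timestamps sort chronologically when compared as strings.
    unique.sort(key=lambda row: row["created_at"])

    geolocated = []
    for row in unique:
        if row["lat"] == "" or row["lon"] == "":
            continue  # no geolocation data attached to this tweet
        lat, lon = float(row["lat"]), float(row["lon"])
        if lat == 0.0 and lon == 0.0:
            continue  # (0, 0) is treated as a masked location
        geolocated.append(row)
    return unique, geolocated


# Invented sample rows, for illustration only.
sample = [
    {"tweet_id": "2", "created_at": "2015-02-01T10:00:00", "lat": "22.28", "lon": "114.16"},
    {"tweet_id": "1", "created_at": "2015-01-29T09:00:00", "lat": "", "lon": ""},
    {"tweet_id": "2", "created_at": "2015-02-01T10:00:00", "lat": "22.28", "lon": "114.16"},
    {"tweet_id": "3", "created_at": "2015-02-03T12:00:00", "lat": "0", "lon": "0"},
]
unique, geolocated = clean_tweets(sample)
```

In this sketch, deduplication keys on a tweet identifier; a spreadsheet workflow could equally key on the full row, which is an assumption either way.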
Because Twitter is banned in mainland China, I did not expect to see any geolocated tweets from areas in China other than Hong Kong. I also expected there would not be many people tweeting negatively about the protests using #UmbrellaMovement and #UmbrellaRevolution due to the pro-democracy nature of the hashtags. As such, I predicted that most of the frequently used terms in my data set would be hashtags related to #UmbrellaMovement and #UmbrellaRevolution and other terms related to Hong Kong and democracy.

As for the types of users I expected to see in my data set, I predicted that there would be many relatively uninfluential users and several major hubs. Based on what past researchers observed in the Middle East, I expected these hubs to primarily be of two types: media sources and activists. I did not expect there to be a difference in tweeting activity between these two types of influential users; it seemed that protesters would likely tweet more during important events, as would citizen journalists and news organizations.

Analysis and Discussion

Before analyzing my Twitter data set, I first wanted to have an idea of how groups of people with varying levels of support for the protests were composed. Using RStudio, I created the visualization below:

Figure 1. R graph showing breakdowns of different levels of support for protests by age. Data collected by the School of Journalism and Communication at the Chinese University of Hong Kong, 2014.

This visualization suggests that while people who support the protests are of all different ages, those opposed to the protests are largely people aged 40 and over. In some respects, these results are unsurprising; it makes sense that people who have a certain level of stability (e.g., those who have a job and family) would not want to challenge the government. On the other hand, many of these older people probably lived in Hong Kong before it was returned to Chinese control.
In this sense, it seems strange that people who grew up with democracy would not want to stand up for their right to vote. This data set complements my Twitter data set in that it shows that not all people involved in the discussion about the Hong Kong protests are in favor of them.

To answer my first research question about where people were tweeting from, I made the following cluster map using CartoDB. I should note that only 0.66 percent of my tweets were geolocated, so the map below might not be an entirely representative sample of my data set.

Figure 2. Cluster map showing geolocated tweets containing the hashtags #UmbrellaMovement or #UmbrellaRevolution. Data from Twitter. Made using CartoDB.

As expected, the largest cluster of tweets is in Hong Kong, where the pro-democracy movement is taking place. There are several isolated tweets along the East Coast of the United States, but they are not close enough in proximity to one another to form a cluster. The only other cluster of tweets, in fact, is centered in London. This may be because of Britain’s close ties to Hong Kong, as well as the fact that there have been protests in London about the situation in Hong Kong.¹⁶

In order to understand what people were tweeting about in my data set, I decided to make a word cloud visualization using Wordle. After filtering out words I deemed to be irrelevant to my analysis – including words such as is, and, the, as well as my hashtags themselves, #UmbrellaMovement and #UmbrellaRevolution – I uploaded the resulting text to Wordle to create the following visualization:

Figure 3. Word cloud of tweets containing the hashtags #UmbrellaMovement or #UmbrellaRevolution. Data from Twitter. Made using Wordle.

As I had predicted, many of the most frequently used terms are related to Hong Kong, democracy, and the protests. These include #OccupyHK, #HK, #OccupyCentral, and #democracy (Wordle removes hashtag symbols).
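A word cloud sizes each term by how often it occurs once the filtered words are removed. As a rough illustration of that counting step (not of Wordle itself, whose internals are not described here), a minimal Python sketch might look like the following. The stopword list contains only the words named above, the tokenizing pattern is a simplifying assumption, and the two sample tweets are invented.

```python
import re
from collections import Counter

# Only the filtered words named in the text; a real list would be longer.
STOPWORDS = {"is", "and", "the", "#umbrellamovement", "#umbrellarevolution"}


def term_counts(tweets, stopwords=STOPWORDS):
    """Count lowercase terms across tweets, skipping the filtered words."""
    counts = Counter()
    for text in tweets:
        # A token is a run of word characters, optionally led by # or @.
        # This is a simplification: URLs, for example, would be split up.
        for token in re.findall(r"[#@]?\w+", text.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts


# Invented sample tweets, for illustration only.
sample_tweets = [
    "RT @hk928umbrella: The #UmbrellaMovement is back #OccupyHK",
    "#UmbrellaRevolution and #OccupyHK #democracy",
]
counts = term_counts(sample_tweets)
top_term = counts.most_common(1)
```

Note that "rt" is deliberately not in the stopword list, so retweet markers are counted like any other term.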
The large size of #Newsbit suggests many of the tweets in my data set came from news organizations or citizen journalists. Because I was curious about the type of conversations that were happening in my data set as well as the content, I did not remove the word RT from my data set. The fact that RT dominates this word cloud shows that people tweeting about the Hong Kong protests are sharing other people’s ideas very frequently. Several Twitter usernames also appear in the word cloud, most notably @hk928umbrella. Although I had not yet done an analysis of the users in my network when I made this word cloud, I expected this node to be the center of my Twitter data set due to the relatively large size of @hk928umbrella in the word cloud.

Below is a visualization showing the influence of users in my data set, as determined by betweenness centrality:

Figure 4. Gephi visualization of influential users tweeting using the hashtags #UmbrellaMovement or #UmbrellaRevolution. Data from Twitter. Tweets converted to Gephi-ready format using Deen Freelon’s t2g.py script.¹⁷

Unsurprisingly, @hk928umbrella is by far the most central node in my data set. In fact, @hk928umbrella was so influential that I needed to adjust the scaling of nodes so that other nodes would be visible. Because of the scaling changes I made, the relative sizes of the other influential users compared to @hk928umbrella are somewhat misleading: none of the other nodes come even close to being as central to the network as @hk928umbrella. Several of the other influential users in the network, such as @2legit2trip and @PRHacks, also appeared in the word cloud.

¹⁶ “Hong Kong Protest Outside Chinese Embassy in London,” BBC, October 1, 2014.
¹⁷ Deen Freelon, “T2G: Convert (all) Twitter Mentions to Gephi Format,” DFreelon.org, accessed May 14, 2015.
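Betweenness centrality scores a node by how many shortest paths between other pairs of nodes pass through it, which is why a single hub can dwarf every other account. The sketch below is a small, self-contained Python illustration of that idea for a directed, unweighted network, using breadth-first search to count shortest paths and then accumulating each node's share (the approach usually attributed to Brandes). It is not Gephi's implementation, it returns unnormalized scores, and the account names in the toy network are invented.

```python
from collections import deque


def betweenness(edges):
    """Unnormalized betweenness centrality for a directed, unweighted graph.

    `edges` is an iterable of (source, target) pairs.
    """
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())

    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        dist = {s: 0}
        paths = {s: 1}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:  # first time w is reached
                    dist[w] = dist[v] + 1
                    paths[w] = 0
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, crediting each predecessor
        # with its share of the shortest paths that run through it.
        dep = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1.0 + dep[w])
            if w != s:
                score[w] += dep[w]
    return score


# Invented toy network: two accounts point to a hub, which points to two others.
toy_edges = [
    ("user_a", "hub_account"),
    ("user_b", "hub_account"),
    ("hub_account", "user_c"),
    ("hub_account", "user_d"),
]
scores = betweenness(toy_edges)
```

In the toy network every shortest path between the outer accounts runs through `hub_account`, so it receives the entire score while the other four nodes receive none.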
The only large node that appears to be somewhat separated from the others is @rightnowio_feed, which appears in the pink subgroup of users to the left of the main network.

After identifying the most influential users in my network, I visited each of their Twitter pages to see what types of users they were. Interestingly, nearly all of the most central users were protesters from Hong Kong. There were two primary exceptions: @hk928umbrella, which is an account operated by a group of volunteers to share news about the Hong Kong protests, and @rightnowio_feed, which is an international online news organization. The latter account was the only influential user in my data set that was not from Hong Kong or specifically related to the protests.

In order to compare the two types of influential users in my network – activists and news organizations – I compared the relative frequencies of tweets of four users, two of each type, over a period of one week. Below is a graph of these relative frequencies:

Figure 5. Line graph of tweet patterns of influential users between 1/29/15 and 2/4/15. Red lines show patterns of protesters, while blue lines represent news organizations. Data from Twitter. Made using Microsoft Excel.

It is important to note that @hk928umbrella tweeted many thousands of times, compared to several dozen times for each of the other three accounts. This is why I graphed relative frequencies of tweeting instead of raw tweet counts. I chose to study the week of January 29th to February 4th because on February 1st, activists held their largest demonstration in months.¹⁸ While I had expected the tweeting patterns of news organizations and protesters to be somewhat similar, I was surprised to see that there was a much larger spike in the tweeting of activists than the tweeting of news organizations during major events.
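Graphing relative rather than raw frequencies puts an account with thousands of tweets on the same scale as one with a few dozen. The text does not spell out the exact normalization used in Excel; one plausible reading, assumed here, is each day's tweet count divided by that account's total for the week. The Python sketch below illustrates that assumption with invented counts for two hypothetical accounts.

```python
from collections import Counter


def relative_daily_frequency(tweet_dates, days):
    """Return each day's share of an account's tweets over the given days.

    `tweet_dates` holds one date string per tweet; `days` lists the dates
    to report, in order.
    """
    counts = Counter(tweet_dates)
    total = sum(counts[day] for day in days)
    if total == 0:
        return {day: 0.0 for day in days}  # account did not tweet at all
    return {day: counts[day] / total for day in days}


# Invented counts for two hypothetical accounts, for illustration only.
days = ["2015-01-31", "2015-02-01", "2015-02-02"]
activist_dates = ["2015-01-31"] + ["2015-02-01"] * 8 + ["2015-02-02"]
news_dates = ["2015-01-31"] * 1000 + ["2015-02-01"] * 1000 + ["2015-02-02"] * 1000

activist_freq = relative_daily_frequency(activist_dates, days)
news_freq = relative_daily_frequency(news_dates, days)
```

With these made-up inputs, the ten-tweet account shows a pronounced one-day spike while the three-thousand-tweet account stays flat, even though the second account tweets far more in absolute terms.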
Overall, the tweeting patterns of news organizations were relatively consistent, whereas protesters tended to tweet much more during large demonstrations and events.

Together, these visualizations reveal several things about the people discussing the pro-democracy protests in Hong Kong. First and foremost, the group of people involved in the conversation is not homogeneous. Rather, there are people of a wide variety of ages in many areas of the world participating in the discussion about the protests. As Figure 1 shows, people in Hong Kong have varying levels of support for the protesters, with older Hong Kong citizens being less likely to support the protests and younger citizens being more likely to support them. Figure 2, although it only displays about 0.66 percent of the total data set, shows that the discussion about the Hong Kong pro-democracy protests is not limited to Hong Kong, although it is certainly centered there. This is not at all surprising considering the protests themselves are taking place in Hong Kong. The second largest cluster of tweets is in London, perhaps due to activist activity in the city and the close relationship between Hong Kong and Britain. Although the other tweets in the geolocated data set do not form clusters, they nevertheless show that people all over the world are tweeting using #UmbrellaMovement and #UmbrellaRevolution. While there are many people discussing the Hong Kong protests on Twitter, Figure 4 shows that several incredibly influential users dominate the discussion.

In addition to showing that the conversation about the protests is widespread and diverse – although several users are far more influential on Twitter than others – these visualizations provide key insights into the nature of the discussion.

¹⁸ Lauren Hilgers, “Hong Kong’s Umbrella Revolution Isn’t Over Yet,” New York Times Magazine, February 22, 2015.
Figure 3 shows that much of the conversation on Twitter about the Hong Kong protests consists of retweets. This indicates that the sharing of information and ideas is vital to the conversation about the Hong Kong protests on Twitter. Furthermore, the most frequently used words in the data set suggest that people discussing the Hong Kong protests use Twitter in two primary ways: (1) for planning demonstrations and (2) for sharing news about the protests. Figure 4 shows how these conversations take place: ideas are shared by one of several particularly influential users. The tweets of these users reach nearly all other users in the network, allowing ideas to be shared quickly with a large number of people.

Perhaps the most interesting finding of my research is illustrated in Figure 5. Not only are there several different types of influential users in the network; there are also unique patterns of tweeting for the different types of hubs. This is not at all what I had expected to see; I had initially assumed that regardless of the type of user, people would post more during important events and less during periods of relative unimportance. What I instead found was that the two particularly influential news organizations in my data set had a relatively constant rate of tweeting. In direct contrast to this, activists tended to post many more tweets during important demonstrations – such as the one on February 1st – with relatively few tweets in between. Although I was only able to identify two of these tweeting “signatures” in my data set because there were only two types of hubs in my network, it seems likely that there would be similar patterns for other types of users in different contexts. I would be fascinated to see if distinct tweeting archetypes could be identified for different types of users. This may be an interesting direction for future research.
Conclusion

The Hong Kong protests have been a fascinating example of how political movements can take on an entirely different form through the use of technology. Whereas demonstrations in the past were largely limited to isolated regions or countries, new media platforms like Twitter have allowed these conversations to take place on a global scale. Furthermore, the structure of Twitter is uniquely suitable to directing messages towards large audiences almost instantaneously. The nature of communication on Twitter allows both activists and news organizations or citizen journalists to spread their message with relative ease. As such, it is unsurprising that the most central users in the Twitter discussion about the Hong Kong protests are news organizations and protesters. These two types of users appear to have distinct tweeting signatures. That we might be able to identify types of people based on how they tweet is an exciting prospect not just for the study of political movements on Twitter, but also for social media research in general.

Works Cited

Barabási, Albert-László. Linked: How Everything is Connected to Everything Else and What it Means for Business, Science, and Everyday Life. New York: Basic Books, 2014.

Bruns, Axel, Tim Highfield, and Jean Burgess. “The Arab Spring and Social Media Audiences: English and Arabic Twitter Users and Their Networks.” American Behavioral Scientist 57, no. 7 (2013): 871–898.

Freelon, Deen. “T2G: Convert (all) Twitter Mentions to Gephi Format.” DFreelon.org, accessed May 14, 2015. http://dfreelon.org/2013/05/14/t2g-convert-all-twitter-mentions-to-gephiformat.

Hawksey, Martin. “#TAGS.” Accessed May 13, 2015. https://tags.hawksey.info.

Hilgers, Lauren. “Hong Kong’s Umbrella Revolution Isn’t Over Yet.” New York Times Magazine, February 22, 2015. http://www.nytimes.com/2015/02/22/magazine/hongkongs-umbrella-revolution-isnt-over-yet.html.

“Hong Kong Protest Outside Chinese Embassy in London.” BBC, October 1, 2014. http://www.bbc.com/news/uk-29452299.

“Hong Kong’s Democracy Debate.” BBC, October 7, 2014. http://www.bbc.com/news/worldasia-china-27921954.

Howard, Philip, Aiden Duffy, Deen Freelon, Muzammil Hussain, Will Mari, and Marwa Mazaid. “Opening Closed Regimes: What Was the Role of Social Media During the Arab Spring?” PITPI. Working paper, January 2011. http://pitpi.org/wpcontent/uploads/2013/02/2011_Howard-Duffy-Freelon-Hussain-Mari-Mazaid_pITPI.pdf.

Iyengar, Rishi. “Hong Kong’s Umbrella Revolutionaries are Slowly Coming Back to the Streets.” Time, April 14, 2015. http://time.com/3814943/occupy-hong-kong-chinaumbrella-revolution-democracy.

McKirdy, Euan. “‘One Country, Two Systems’: How Hong Kong Remains Distinct from China.” CNN, September 30, 2014. http://www.cnn.com/2014/09/29/world/asia/hongkong-protest-backgrounder.

McLuhan, Marshall. “The Medium Is the Message.” In Understanding Media: The Extensions of Man, 1–11. New York: McGraw-Hill, 1964.

Murthy, Dhiraj. “Twitter: Microphone for the Masses?” Media, Culture & Society 33, no. 5 (2011): 779–789.

Pentland, Alex. “The Data-Driven Society.” Scientific American 309, no. 4 (2013): 78–83.

School of Journalism and Communication, the Chinese University of Hong Kong. “Public Opinion & Political Development in Hong Kong Survey Results.” December 18, 2014. http://www.com.cuhk.edu.hk/ccpos/images/news/TaskForce_PressRelease_141218_English.pdf.

Yi, Yang. “NPC Decides on Nominating Committee for HKSAR Chief Executive Selection.” Xinhuanet, August 31, 2014. http://news.xinhuanet.com/english/china/201408/31/c_133609213.htm.