1% of Twitter pt. II - 1% of ISIS

Team Members

Rasha Abdulla, Claudio Coletta, Carolin Gerlitz, Stefania Guerra, Christoph Lutz, Bernhard Rieder, Antonin Segault, Steven Talbot, Rebekah Tromble

Introduction

This research was conducted during the Digital Methods Initiative’s 2014 Summer School as part of the project “1% of Twitter pt. II - The return of the Geo”, coordinated by Carolin Gerlitz and Bernhard Rieder.

We began with a general interest in the way the current crisis in Iraq is being represented and discussed in multiple languages on Twitter. On 9 June 2014 an organization commonly called the Islamic State in Iraq and Syria, or ISIS, captured Mosul, Iraq’s second-largest city. Since then, ISIS and various other groups have been battling for control of different parts of the country.

ISIS itself has been quite active on Twitter. The organization created an application called Dawn (or the Dawn of Glad Tidings) that allows it to send tweets from the accounts of the application's users [1, 2]. This application was removed from the Android app store on 19 June 2014 [3].

Research Questions

Our initial research question was: Do those tweeting in different languages frame the Iraqi crisis in different ways? However, as we describe in the “Findings” section below, deeper analysis of the data revealed that ISIS’s own use of Twitter created significant problems for interpreting the data. This ultimately shifted our focus to analysing ISIS’s effect on the broader Twitter “discourse”.

Methodology

We worked with the 1% sample of Twitter collected between 15 June and 23 June 2014 using the TCAT tool (Borra & Rieder 2014; Gerlitz & Rieder 2013). However, for practical reasons we conducted our analysis on just one day in the 1% dataset, 19 June 2014. We chose this day because there had been significant international news coverage of events related to a major Iraqi oilfield, control over which rapidly changed hands several times, and we therefore expected to find a relatively high volume of tweets in multiple languages on that date.

We ran a query on two basic terms--Iraq and ISIS--but in the multiple languages that were represented by the members of our group (English, Arabic, Italian, German, and French) and using multiple variations of the name of ISIS itself. (It is also called ISIL [4], for example.) Devising the query posed a particular challenge for several reasons. The first was the difficulty of managing Arabic transliterations into Latin script. We ultimately devised a rather long list of Latin options for one of the Arabic names for ISIS, Da’ash, but our query is almost certainly not exhaustive.

The second challenge was related to the fact that multiple names are used for ISIS in Arabic. A bit of research on the organization revealed that Da’ish is actually a derogatory term used by those who oppose ISIS, while the organization and its supporters tend to refer to it alternatively as al-Dawlah or al-Dawlah al-Islamiyah (“the State” or “the Islamic State”). Unfortunately, these last two monikers are such common words in Arabic that their inclusion would return an overwhelming number of irrelevant tweets.

Finally, we also faced issues concerning the limitations of queries run in TCAT. Quotation marks cannot be used in queries to request exact matches, and simply querying “isis” (without quotation marks) returned every tweet containing common terms such as “crisis” and hashtags such as “#thisisme”. We ultimately decided to use brackets with a space placed before “isis” ([ isis]). This ensures that “isis” is not preceded by additional letters but, unfortunately, it also filters out all tweets that begin with “isis”. The full text of the query we ran is as follows:

#isis OR [ isis] OR #isil OR isil OR #eiil OR [ eiil] OR داعش OR Da'ash OR Daesh OR Da2sh OR Da2ish OR Da'ish OR Daish OR Da3esh OR Da3sh OR Iraq OR Irak
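
The matching trade-off described above can be illustrated with a minimal Python sketch. It does not reproduce TCAT's query engine: it simply assumes that a bare term behaves like a case-insensitive substring match and that [ isis] behaves like a substring match on " isis". The sample tweets and the helper names substring_match and word_match are hypothetical, and the whole-word regex is shown only as a point of comparison, not as something TCAT offers.

```python
import re

# Hypothetical sample tweets, invented for illustration only.
tweets = [
    "The crisis in Iraq deepens",
    "New selfie #thisisme",
    "Reports say isis captured Mosul",
    "ISIS claims control of the refinery",
]

def substring_match(term: str, text: str) -> bool:
    """Case-insensitive substring test (assumed behaviour of a bare query term)."""
    return term.lower() in text.lower()

def word_match(term: str, text: str) -> bool:
    """Whole-word test via a regex; an alternative for comparison, not a TCAT feature."""
    pattern = rf"(?<!\w){re.escape(term)}(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None

# Bare "isis": also matches "crisis" and "#thisisme".
bare = [substring_match("isis", t) for t in tweets]
# " isis" with a leading space: drops the false positives, but also the tweet starting with "ISIS".
spaced = [substring_match(" isis", t) for t in tweets]
# Whole-word match: keeps both genuine mentions.
whole = [word_match("isis", t) for t in tweets]

for text, b, s, w in zip(tweets, bare, spaced, whole):
    print(f"{text!r}: bare={b} spaced={s} whole={w}")
```

Under these assumptions, the space-prefixed form removes the false positives at the cost of one genuine mention, which is the compromise accepted in the query above.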

The study relied mainly on the analysis modules provided by the TCAT tool, including statistics modules such as user and hashtag frequencies, which we used to identify the main tendencies in the dataset. Given our interest in the (potentially multiple) framing(s) of the Iraq crisis, we then focused on the network of hashtag co-occurrences, exported from TCAT and processed with Gephi.
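
To make the notion of a hashtag co-occurrence network concrete, here is a minimal Python sketch of how such a network could be derived from raw tweet text. It is not TCAT's implementation; the function names, the hashtag regex, and the sample tweets are assumptions made for illustration. Two hashtags are linked whenever they appear in the same tweet, and the edge weight counts how many tweets contain both.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified hashtag pattern (an assumption; Twitter's own tokenisation is more involved).
HASHTAG_RE = re.compile(r"#(\w+)")

def extract_hashtags(text):
    """Return the sorted set of lower-cased hashtags found in one tweet."""
    return sorted({tag.lower() for tag in HASHTAG_RE.findall(text)})

def cohashtag_edges(tweets):
    """Count how often each unordered pair of hashtags appears in the same tweet."""
    edges = Counter()
    for text in tweets:
        for pair in combinations(extract_hashtags(text), 2):
            edges[pair] += 1
    return edges

# Hypothetical tweets, invented for illustration only.
tweets = [
    "Fighting continues #Iraq #Mosul",
    "Update #iraq #mosul #news",
    "No hashtags here",
]
edges = cohashtag_edges(tweets)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

A weighted edge list of this kind (source, target, weight) is the sort of structure that a network tool such as Gephi can lay out and cluster.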

Findings

TCAT returned a set of 10,404 tweets, with 9,653 unique users and 3,795 tweets with hashtags. The average number of tweets per user was 1.08, with a maximum of 18. Figure 1 provides an overview of these and other basic descriptive statistics.

PercentTwitter_2014_ISIS_fig1.png
Figure 1

The most frequent hashtags (those appearing at least 20 times in the dataset) are almost exclusively English- and Arabic-language hashtags. (Only one of the top hashtags, “#Irak”, was in a language other than English or Arabic.) Figure 2 displays the frequency of these hashtags. The content of these hashtags can be grouped into five logical categories: location, actors, news, “statements”, and miscellaneous. (See Figure 3.)

PercentTwitter_2014_ISIS_fig2.png
Figure 2

PercentTwitter_2014_ISIS_fig3.png
Figure 3

These groupings correspond relatively neatly to clusters identified in the co-hashtag analysis produced by TCAT. Figure 4 shows the co-hashtag network, visualized with Gephi. It reveals the hashtag #Iraq as the central node. Closest to this node we find a dense, highly connected cluster of location names (especially countries and cities), primarily in Arabic. At the periphery, we find another well-connected cluster, the news cluster. It includes hashtags such as “Euronews”, “SMS”, and “Breaking”.

Interestingly, the hashtag for “ISIS” is in between different clusters and somewhat isolated. However, it groups with some other closely related terms: #Isil, #No2isis, etc.

PercentTwitter_2014_ISIS_fig4.png
Figure 4

At first glance it appeared as if the co-occurrence network might point to different framings of the Iraqi crisis based on the languages in which users were tweeting. As noted, the largest, densest, and most central cluster was formed primarily by Arabic location name hashtags, while most of the English-language hashtags are located in different clusters or are isolates. Initially we thought this might indicate that Arabic tweets were focusing on the Iraqi crisis as a regional event, while those tweeting in English developed more diffuse framings that were less concerned with the regional implications of the crisis.

However, as we began to dig deeper into the data--exploring several clusters, specific hashtags and users--we began to discover that the use of many of these hashtags was not as it seemed. Instead, as described in the following sections, we quickly realized that the "Twitterverse" around the Iraqi crisis has been significantly impacted--in fact, we can probably say manipulated--by ISIS itself.

The “news cluster”

In the co-hashtag network visualisation, we identified a small, highly connected cluster of news-related hashtags (#world, #breaking, #lemonde, #news). When browsing the tweets using these hashtags, we noticed that some of them used unusual combinations, such as pairing #lemonde and #fox, which refer to two different media groups. We also discovered that every URL in these tweets pointed to an article on the same website, mojahedin.org. We therefore suspected that part of the news cluster was produced by these deceptive tweets. Consider the following example tweets:

#Iran Tire workers in #Tehran staging protest rally http://t.co/o5FNHYvG8s #iraq #LONDON #Belgium #FOX #Euronews #sydney #Syria #world

Al Jazeera: Iraq-#Syria border passages controlled by armed men http://t.co/FqovJnTJ1V #oman #columbus #BreakingNews #Columbia #FOX

Turkey evacuates consulate in Basra Iraq http://t.co/RNscZ00HRr #News #Breaking #usa #AlJazeera #FOX #sydney #sms #politics #Euronews

We then processed the hashtag-user network to analyse user activity around this cluster. As shown in Figure 5, just five highly active users (including the three most active users in our dataset) were responsible for almost all of these tweets. These users all shared numerous links to the mojahedin.org website and used many hashtags. Their profiles identify them as Iranians who oppose the current Iranian and Iraqi governments and support ISIS and the Syrian revolution.

PercentTwitter_2014_ISIS_fig5.png
Figure 5
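
The kind of check behind the hashtag-user analysis can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TCAT or Gephi procedure: the records, the user names, and the helpers cluster_activity and top_share are all hypothetical. The idea is to count, per user, the tweets that touch the news cluster and then ask what share of those tweets comes from the most active accounts.

```python
from collections import Counter

# Hypothetical (user, hashtags) records, invented for illustration only.
records = [
    ("user_a", {"fox", "euronews", "iraq"}),
    ("user_a", {"breaking", "news"}),
    ("user_b", {"lemonde", "fox"}),
    ("user_c", {"iraq"}),
]
# Hashtags treated as the "news cluster" in this sketch.
news_cluster = {"fox", "euronews", "breaking", "news", "lemonde", "world"}

def cluster_activity(records, cluster):
    """Count, per user, the tweets containing at least one hashtag from the cluster."""
    counts = Counter()
    for user, hashtags in records:
        if hashtags & cluster:
            counts[user] += 1
    return counts

def top_share(counts, n):
    """Fraction of cluster tweets written by the n most active users."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(n)) / total

activity = cluster_activity(records, news_cluster)
share = top_share(activity, 1)
print(dict(activity), round(share, 2))
```

A share close to 1 for a very small n would point to the pattern reported here, where a handful of accounts produce nearly the whole cluster.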

#No2ISIS

The hashtag #No2ISIS occurred 59 times according to the TCAT hashtag frequency query, but manual inspection showed that only 57 results contained #No2ISIS in the text of the tweet. The discrepancy was due to two users who included #No2ISIS in their profile descriptions. Of the 57 tweets with #No2ISIS, six (10.5%) used the hashtag to promote pro-ISIS content. These users used pro-ISIS language in their tweets, referring to ISIS as "The State," and one user even posted a video mourning a child whose death was attributed to a NATO bombing. The majority of the content was a recirculation of a single picture advocating co-operation between Sunni and Shia people.

“Iraq Liberated” (in Arabic)

A total of 229 tweets in Arabic used the Arabic word for “liberated”; most of them centered on “Iraq_is_liberated” (in Arabic). They were posted by 227 distinct users. Links were used in 59 (25.8%) of those tweets. Many of the tweets were repeated, sometimes more than 20 times per tweet, each time by a different user, which strongly suggests bot activity. Also interesting is the fact that the repeated tweets included a time code in the text of the tweet itself, so the “identical tweet frequency” analysis did not catch them and only caught their retweets. In short, very little original content seems to have been produced, and all of it was pro-ISIS.
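
One way to surface such near-identical tweets is to strip the time code before comparing texts. The Python sketch below illustrates the idea only: the time-code pattern (HH:MM or HH:MM:SS) is an assumption, since the exact format used in these tweets is not documented here, and the sample tweets and the helpers normalise and duplicate_groups are hypothetical.

```python
import re

# Assumed time-code shape (HH:MM or HH:MM:SS); the real format in the dataset may differ.
TIME_CODE_RE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")

def normalise(text):
    """Remove assumed time codes, collapse whitespace, and lower-case the text."""
    without_time = TIME_CODE_RE.sub("", text)
    return " ".join(without_time.split()).lower()

def duplicate_groups(tweets, min_users=2):
    """Map each normalised text to its number of distinct posters, keeping only
    texts posted by at least min_users different users."""
    users_by_text = {}
    for user, text in tweets:
        users_by_text.setdefault(normalise(text), set()).add(user)
    return {text: len(users) for text, users in users_by_text.items()
            if len(users) >= min_users}

# Hypothetical tweets, invented for illustration only.
tweets = [
    ("user_1", "Example slogan 14:02 #example_tag"),
    ("user_2", "Example slogan 14:07 #example_tag"),
    ("user_3", "Example slogan 09:15:30 #example_tag"),
    ("user_4", "A different message"),
]
groups = duplicate_groups(tweets)
print(groups)
```

An exact-match frequency count would treat the first three tweets as distinct; after normalisation they collapse into a single text posted by three different users.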

Of the 227 accounts that posted the tweets, 92 had fewer than 100 followers; 85 had between 100 and 1,000 followers; 49 had between 1,001 and 5,300 followers; and only one had more than 24,000 followers. That one popular account has since been suspended. In terms of language, just over half (51.5%) were registered as English-language accounts and 45% as Arabic-language accounts.

#Iraqi_revolution (in Arabic)

Another pro-ISIS hashtag was “Iraqi_revolution” (in Arabic), which produced 99 tweets by 93 distinct users. The same pattern of user information was detected. Interestingly, however, some of the tweets were retweets from the account @IRAQIRevolution, which has over 50,000 followers, although that user does not seem to have any original tweets in the selection (meaning that the original tweets did not show up in the dataset). We are not sure whether this is because the original user did not use the hashtag “Iraqi_revolution”, which would then have been added later by the retweeter, or because the TCAT tool has difficulty handling tweets that mix Arabic characters, Latin characters, and special symbols (such as “.” or a single quote).

@Iraq_news01

Iraq_news01 was the most mentioned user in our query results, with 830 mentions. The user was identified as a pro-ISIS actor, as indicated by their use of “The State” and their posting of pro-ISIS content. Iraq_news01 had 68.3 thousand followers and claimed to provide “coverage around the clock or the liberation of Iraq ... Information Office of the Sunni resistance.” Interestingly, a sample of 100 of the 830 Twitter user profiles showed that 43% of those who mentioned Iraq_news01 had been suspended by Twitter. This may indicate a proclivity among these users to install and use the ISIS Dawn app; however, more analysis is needed.

Conclusions

Hashtags do not automatically convey meaning, but they can guide us in identifying some trends. (To be improved)

Bibliography

E. Borra and B. Rieder (2014) "Programmed method: developing a toolset for capturing and analyzing tweets", Aslib Journal of Information Management, Vol. 66, No. 3, pp. 262-278.

C. Gerlitz and B. Rieder (2013) "Mining One Percent of Twitter: Collections, Baselines, Sampling", M/C Journal, Vol. 16, No. 2.

-- AntoninSegault - 27 Jun 2014
Topic revision: r3 - 15 Jul 2014, asegault