Profiling Bolsobots Networks. A quali-quanti approach to repurpose Instagram grammars
Janna Joceli Omena, Thais Lobo, Giulia Tucci, Francisco W. Kerche, Malu Paschoal, Xiaolu JI, André Rodarte, Eleanor Griffiths, Lorenzo Zaffaroni, Gaia Amadori and Shilan Huang
Project facilitators: Janna Joceli Omena and Thais Lobo
Final slide deck: https://bit.ly/profiling-bolsobots-networks
- Overall, the categories and topics that emerged from the qualitative analysis of bolsobots networks are well aligned with the publics and values generally associated with Brazilian conservative political agendas, and especially with Bolsonaro's agenda and supporting publics: family-religion, agribusiness, security, anti-communism etc;
- Some categories and topics, however, diverge from the characteristics of regular pro-Bolsonaro activists and may indicate strategies (successful or not) used by bot farm administrators to increase the perception of widespread support. Examples are a cluster of profiles self-declaring a location in a region where Bolsonaro's political support is lower, profiles including an “out of date” campaign slogan from the 2018 presidential race, and profiles with no clear political affiliation;
- In the following network, accounts related to Family-Religious and Campaigning topics behaved in the most unusual way, due to a disproportion between follower and following users and a suspicious number of posts;
- Discourses enacted by bots networks related to Campaign/Call To Action topics indicated the use of external platforms, such as messaging apps, to coordinate action on Instagram;
- Bolsobots networks offer not only support to Bolsonaro but also to his political allies and supportive institutions (e.g. media channels);
- Discrete bots, composed of ghosts and filler accounts, were central in the bolsobots network and present in every topic-cluster, albeit in smaller numbers;
- Identified bots from a specific theme cluster (e.g., anti-leftist cluster) are presumably from the same bot farms, which combined high capacity and low capacity bots. Different bots are likely to play different roles in the group and mixed capacity bot families are more cost-effective;
- A type of humanised and sophisticated bot cluster, the "auntie bots", adopted the Brazilian stereotype of an older, religious, conservative woman and the online behaviours associated with it. Adopting specific Brazilian stereotypes and their associated online behaviours makes bots more difficult to detect. Using 'baby boomers' on social media specifically exploits general perceptions of the 'offline' nature of this user group, which makes it very hard to verify user authenticity.
- Bolso bot farms take, misuse, and decontextualise content from genuine users to attempt to legitimise their political views.
Jair Bolsonaro's election and presidency have been tied to accounts of disinformation campaigns led by supporters, political allies and government members (see Santini, Salles, Tucci, 2021). Several actions have been taken by social media platforms to block the content and inauthentic behaviour of these supporting networks. These accounts, sometimes in a coordinated action, generate massive content that has been labelled by fact-checking agencies as exaggerated, offensive or fake. The main topics of these posts vary from themes such as the measures to curb the coronavirus pandemic to the legitimacy of the electronic vote. Mapping the activities of these networks is particularly important in the coming months ahead of the 2022 Brazilian Presidential election.
Taking into account this context, this project follows, maps and analyses Bolsobots networks on Instagram aiming at profiling fully or partially automated accounts that support the Brazilian president Jair Bolsonaro. By doing this, we expect to understand better and situate the agency of political bots in the context of pre-presidential elections in Brazil.
The project follows previous research about Instagram bots (see Santini, Salles, Tucci, 2021; Omena 2019; DMI project report 2019; DMI project report 2020). Two premises, based on this previous knowledge, guided the work: firstly, it was assumed that bots networks can be traced by mapping who they are following; secondly, it was considered that Instagram bots operate in discrete modes, mimicking real people, using private accounts and also no profile picture (ghost accounts).
- Social bots are powerful tools that support large-scale advertising strategies, subtly manipulate public opinion, direct or redirect attention, generate value etc (see also Santini, Salles, Tucci, 2021);
- Bots networks can be traced by mapping who they are following (DMI project report, 2020);
- Bots are part of a diverse ecology. They can operate as:
- --> Discrete bots: look like real people; exist to like, comment and follow — and unfollow — other accounts (DMI project report, 2019);
- --> Ghost accounts: always present, also taken as bridging nodes (Omena 2019, DMI project report 2020);
- --> Private accounts: serve to create discrete bots (DMI project report, 2019)
- Tracing following networks through Bolsonaro's non-official accounts can provide insights on different actions and behaviours that generate dubious content (Omena, 2019).
2. Initial Data Sets
#Query design: account list-making
The following steps describe how we defined the seed list of bolsobots accounts (fully or partially automated):
Query design based on Instagram recommendations of accounts that support Jair Bolsonaro government;
Query Instagram search engine for keywords such as “bolso” or “bolsonaro” or "mito", using a desktop computer and mobile phone;
Select profiles by checking a high number of followers and/or publications;
Look into selected profiles for recommendations to other similar accounts;
Scrape following accounts of seed profiles
#Data collection & strategies during the Summer School
We started by inserting 35 botted accounts (the seed list) into PhantomBuster to scrape a list of following accounts. From the resulting 26,900 unique Instagram accounts, we used another PhantomBuster module to extract the bio information. However, during the Summer School, we managed to extract the bio information of only 7,500 accounts. This reduction was mainly due to Instagram Graph API limitations, which restricted the number of profiles we had access to.
3. Research Questions
Main research questions:
What can the analysis of bolsobots following networks tell about the agency of political bots in Brazil?
How can repurposing the Instagram profile info and pictures of bolsobots following networks assist the understanding of both bot agencies and topics of interest?
Specific research questions and list of analysis:
What can Instagram features tell about bolsobots ecologies?
- Analysis based on an overview of key grammars (emojis, URLs, hashtags, geographical distribution, account type) relying on close reading and content analysis.
Which themes emerge by clustering bot accounts (e.g. by using profile pictures and descriptions)?
What are the strategies of the bot ecologies? Who matters to Bolsobots supportive accounts?
- Analysis focuses on issue alignments & behaviours and is advanced by:
visual content analysis with ImageSorter (image grouping by colour & repetition; qualitative approach)
textual analysis with machine learning and a computer vision network approach;
visual network analyses (following accounts: monopartite & bipartite graphs)
4. Methodology: Account analysis & Following networks
This project takes advantage of Instagram grammatisation to conduct an empirical study about bolsobots following networks. It repurposes profile info and the following accounts' data (as in the table below) to map and analyse fully and partially automated accounts supporting Jair Bolsonaro. Ordinary users’ profile info is also considered in the proposed analysis since bot accounts can follow both bots and non-bots. We follow a quali-quanti approach, proposing three specific analyses and methods, as described below. In the next sections, we provide detailed methodological guidance for each analytical task.
Profile and following accounts data
- The profile description analysis with basic visualisations, machine learning and computer vision networks.
- The profile image analysis with ImageSorter and guided by image grouping by colour or repetition (see Omena et al, 2019; Omena, 2019).
- The analysis of the following bot ecologies using visual network analysis (Venturini et al., 2019; Jacomy, 2021). To verify ghost accounts within the following network, we took advantage of Instagram’s image URL syntax and the image id the platform automatically provides to accounts without a profile picture.
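The ghost-account check mentioned in the last item can be sketched as follows. This is a minimal illustration, not the project's actual script: the report does not state the image id Instagram assigns to accounts without a profile picture, so `DEFAULT_AVATAR_ID` is a hypothetical placeholder, and the function names are our own.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical placeholder: the real id of Instagram's default avatar
# is not given in this report and must be filled in by the analyst.
DEFAULT_AVATAR_ID = "DEFAULT_AVATAR_ID_PLACEHOLDER"


def image_file_name(url):
    """Return the file name at the end of an image URL, ignoring the query string."""
    return posixpath.basename(urlparse(url).path)


def is_ghost(profile_pic_url, default_id=DEFAULT_AVATAR_ID):
    """Flag an account as a ghost when its picture file name contains the default-avatar id."""
    return default_id in image_file_name(profile_pic_url)
```

The same `image_file_name` helper could also support the navigation from image files back to the spreadsheet rows, since the file name is the shared key between the two.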
4.1 Profile description analysis with basic visualisations, machine learning and computer vision networks
As previously mentioned, and due to a limitation imposed by Instagram Graph API rate limits, we were able to extract the profile information from 7,500 public accounts out of a total of 26,900 unique Instagram accounts. Starting from this dataset of 7,500 Instagram profiles, the investigation focused only on those containing a non-blank bio: in total, 3,650 unique profiles entered the data analysis. The main aim was to profile pro-Bolsonaro bots starting from a textual analysis conducted with a two-fold approach, through digital methods and machine learning (R software).
Basic visualisations (a digital methods close reading)
To analyze Instagram bios' text with a digital methods close reading, we first imported the data scraped from the Instagram API into the TextAnalysis tool to get emoji statistics. The emoji frequency rank was entered into a spreadsheet and the 20 most frequent emojis were qualitatively analysed by their contextual meaning. In total, 110 unique emojis had a frequency equal to or higher than 20. Using the find function on the Google spreadsheet column containing the bio text, it was possible to explore iteratively the meaning of each unique emoji in this dataset rather than the meaning that is generally tied to these emojis.
By this emergent coding approach (Bounegru et al., 2017), nine categories were derived from the emoji use of the studied actors:
- Bolsonaro, concerning emojis used in support of the federal government, patriotic slogans and the presidential campaign in 2022;
- Family & faith, related to emojis used interchangeably to celebrate religion and family values;
- Digital influence, tied to call to action phrases and community building in business-driven accounts;
- Location, concerning declared places of origin and/or current residence of users;
- Sports, related to professionals of the field and/or supporters of recreational activities;
- Agrobusiness, concerning professionals working in agriculture and/or supporters of the practice;
- Security, tied to professionals working in security roles and/or supporters of the practice;
- Professional, concerning users' field of work not encompassed by the aforementioned categories; and
- Various, emojis with no meaningful message, aimed at enhancing the design style of the bio text.
This data was then visualized with a RAWGraphs circle packing graph.
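The emoji frequency ranking that feeds this coding step can be approximated with a short sketch. It is not the TextAnalysis tool used in the project: it assumes that a character counts as an emoji when it falls in a few common Unicode blocks, so flags (pairs of regional indicators) and joined sequences are counted per code point rather than as single symbols.

```python
from collections import Counter


def is_emoji_char(ch):
    """Rough test: is this character in a commonly used emoji block? (an approximation)"""
    cp = ord(ch)
    return (
        0x1F300 <= cp <= 0x1FAFF      # pictographs, emoticons, transport, supplemental symbols
        or 0x2600 <= cp <= 0x27BF     # miscellaneous symbols and dingbats
        or 0x1F1E6 <= cp <= 0x1F1FF   # regional indicators (flag halves)
    )


def emoji_counts(bios):
    """Count emoji characters across a list of bio strings."""
    counts = Counter()
    for bio in bios:
        counts.update(ch for ch in bio if is_emoji_char(ch))
    return counts
```

Calling `emoji_counts(bios).most_common(20)` would give a ranked list comparable to the one that was analysed qualitatively.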
Secondly, we imported the data scraped from the Instagram API into the DMI Harvester tool to extract URLs and URL hosts from the bios' text. The data was entered into a spreadsheet and categorised by host type. The URLs were categorised into eight labels for host type: website, social media, messaging app, governmental website, business website, database repository website, crowdfunding website, and unavailable website. This data was then visualized with a RAWGraphs alluvial graph.
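As an illustration of the host extraction step, the sketch below pulls URLs out of a bio and reduces them to their hosts. It is a simplified stand-in for the DMI Harvester, assuming that URLs carry an explicit `http://` or `https://` scheme; links written without a scheme would be missed. The `HOST_LABELS` mapping is purely illustrative, since in the project the host types were assigned manually.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://\S+")

# Illustrative only: the project categorised hosts by hand in a spreadsheet.
HOST_LABELS = {
    "t.me": "messaging app",
    "wa.me": "messaging app",
    "instagram.com": "social media",
}


def extract_hosts(bio):
    """Return the lower-cased hosts of all URLs found in a bio, without a leading 'www.'."""
    hosts = []
    for url in URL_PATTERN.findall(bio):
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        hosts.append(host)
    return hosts


def label_host(host):
    """Look up a host type, falling back to the generic 'website' label."""
    return HOST_LABELS.get(host, "website")
```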
Thirdly, we imported the data scraped from the Instagram API into RStudio (RStudio Team 2020). We used the R packages tm (Feinerer et al., 2020), wordcloud (Fellows, 2018), and RColorBrewer (Neuwirth, 2014) to generate the word cloud. Before displaying Instagram bios as a word cloud, we kept only alphanumeric characters and converted them to lower case. Portuguese stop words were removed using the R package stopwords (Benoit et al., 2021). We also used tidytext (Silge, J., Robinson D., 2016) and tidyverse (Wickham, 2021) to analyse bio information with a “bag of words” method (one word per line) and to quantify hashtags and mentions. We ranked hashtags and mentions by frequency and exported the results. After manual classification, the RAWGraphs tool (Mauri, M., Elli, T., Caviglia, G., Uboldi, G., & Azzi, M., 2017) was used to plot a treemap of hashtags and a circle pack of mentions.
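The hashtag and mention ranking can be illustrated with a small sketch. The project performed this step in R with tidytext; the Python version below is only an equivalent in spirit, and the regular expressions are our assumption about what counts as a hashtag or a mention (for instance, the mention pattern would also match the domain part of an email address).

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@([A-Za-z0-9._]+)")


def count_tokens(bios, pattern):
    """Count case-insensitive matches of a pattern (hashtags or mentions) across bios."""
    counts = Counter()
    for bio in bios:
        counts.update(match.lower() for match in pattern.findall(bio))
    return counts
```

`count_tokens(bios, HASHTAG).most_common()` returns the frequency ranking that would then be exported for manual classification.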
To analyze Instagram profiles with machine learning, after cleaning and lemmatizing bio data, the most common words were identified with reference to their frequency in the corpus. To detect the presence of certain themes, we proceeded with topic modelling. Specifically, we trained a Latent Dirichlet allocation (LDA) algorithm (Bail, 2014) using a subsample of 396 unique profiles. This helped us to understand the main topics and how they could be divided. In this phase, the maximum number of topics (k value) was set to four, in order to get a general understanding and to avoid over-fitting the categories. Then we used the same procedure on the final dataset of 3,650 unique profiles. In this second step, eight topics seemed to accurately describe the range of discourses articulated through users’ bios. Using the top 10 words for each topic, labels were created to make sense of the algorithmic clustering.
Two datasets were created based on the topics: the first considers only the topic with the maximum percentage of correlation for each document (i.e. bio), and the second uses a 15% correlation threshold (set according to the quartile distribution of the topics’ correlation). With the second method, it was possible to consider the affiliation of one text to more than one topic, using the strength of each relation as the key metric.
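The two ways of deriving datasets from the topic model can be made concrete with a short sketch. It assumes that each bio comes with a list of per-topic weights summing to roughly 1 (as an LDA model would output); the function names are ours, and treating the 15% threshold as inclusive is an assumption.

```python
def dominant_topic(weights):
    """Index of the topic with the highest weight (first dataset: one topic per bio)."""
    return max(range(len(weights)), key=lambda k: weights[k])


def topics_above(weights, threshold=0.15):
    """Indices of all topics at or above the threshold (second dataset: several topics per bio)."""
    return [k for k, w in enumerate(weights) if w >= threshold]


def topic_edges(doc_topics, threshold=0.15):
    """Build (bio_id, topic_index, weight) edges for a bipartite bio-topic network."""
    return [
        (doc_id, k, weights[k])
        for doc_id, weights in doc_topics.items()
        for k in topics_above(weights, threshold)
    ]
```

For a bio with weights `[0.05, 0.50, 0.20, 0.25]`, the first dataset keeps only topic 1, while the second links the bio to topics 1, 2 and 3.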
With the first dataset, we were able to develop a quantitative analysis aimed at describing each topic. In detail we:
- Checked the number of documents related to each topic to assess its impact.
- Traced the regional codes (DDD) from the phone numbers available for each profile and then plotted them on a geographical map in order to depict an overview of the location of these profiles.
- Adopted the ratio between followers and following accounts, weighting that to the post count, to outline what could be considered “unusual” behaviour (according to a purpose-built bot probability boolean variable) and to check which topic could be potentially responsible for more bots.
The second dataset provided a bipartite network of correlation between documents (i.e. bios) and topics. By doing so, we were able to understand more general clusters of discourse and to identify the profiles with closer connections to each. The network was constituted by two types of nodes, topics and images, which were then analysed in depth to understand whether the distribution of topics was associated with a shared aesthetic (recurrent symbols, colours, personalities etc).
Finally, we used Google Vision API to examine the circulation of profile pictures, using full matching images, for the purpose of checking the presence of recurrent and common sources for any profile group.
Together, these approaches developed through digital methods and machine learning generated results that were consistently aligned, providing a more robust profile of the dataset.
4.2 Profile image analysis: image grouping by colour & repetition
Using a quanti-quali approach to digital methods, we carried out a visual, content, and discourse analysis. The data was first clustered using a quantitative approach with the help of an image grouping software (ImageSorter), gathering a total of 33,768 profile images. Then, we selected the cases through a more qualitative process. The process is explained step by step below (see also the summary board on Image 8):
- The dataset was composed of 33,768 Instagram profile images, corresponding to the accounts followed by our 35 bolsobots entry list. We downloaded all the images using DownThemAll. Here, we purposely allowed for the possible repetition of accounts in the following list, thus providing another level of visual analysis, that of profile image repetition, besides grouping the images by colour.
Image Sorter Interface with all 33,768 profile images, sorted by colour
- After image downloading, the images were entered in the ImageSorter Software and clustered by colours or profile image repetition.
ImageSorter Interface with zoom to the Green cluster
- We started the visual analysis by dividing our sample into colour instances and zooming in to investigate some visual patterns.
Image Sorter Interface with zoom to the ‘Patriotic Eye’ Cluster
- Using a semiotic approach, we were able to identify additional thematic clusters by analyzing the aesthetic and symbolic similarities between profile pictures.
Screenshots from a couple of profiles were analysed within the ‘Patriotic Eye’ Cluster.
- We randomly selected four to six accounts from each aesthetic and symbolic cluster for an even closer look. By clicking on these selected profiles in ImageSorter, we matched the image source with the Instagram profile in the original dataset spreadsheet. Crucial here was the use of image file names (e.g. 36860353_700310680311360_6085368849069244416_n.jpg) to facilitate navigation from ImageSorter to the spreadsheet and from the latter to the Instagram interface.
- Then we started a discourse analysis of each Instagram profile selected. The points assessed were: user name, profile picture, bio description, posts counts, posts contents, and some unusual comments.
Table 1: Bot categorizations by profile behaviour and message strategies
- As a final step, we have analyzed and categorized the accounts following the categories on our codebook (see Table 1).
It is noteworthy that the analytical and categorization processes were, to some extent, handcrafted work. The categories emerged from the fieldwork (during ImageSorter software navigation and the closer look at the Instagram profiles) and were informed and inspired by previous works on the matter, for instance, the Oxford Internet Institute report on industrialized disinformation (Bradshaw, S., Bailey, H., & Howard, P., 2021).
Summary board of Profile image analysis
4.3 Following bot ecologies with visual network analysis
To analyse the following bot ecologies we built two networks based on a digital methods recipe. The first is a bipartite graph (where nodes are seeds/bolsobots and their respective following accounts) that permits multiple appearances of the same account, making it possible to detect the presence and location of ghost accounts, whether the accounts are verified, and Bolsonaro’s supportive accounts according to different topics of interest (e.g. using username patterns, as shown in the Gephi data laboratory screenshot below). The second, a monopartite graph (one type of node: usernames), helped us to identify the role of the ghost accounts within the network. The interpretation of the graphs was based on visual network analysis (Venturini et al., 2019; Jacomy, 2021), also accounting for the relational nature of the data and the narrative affordances of the different zones of the network, e.g. centre, mid-zone and periphery (Omena & Amaral, 2019).
On the left, an example of a username pattern is used to create node attributes expressing different supportive discourses to praise the current government and president of Brazil. On the right, bolsobots bipartite following networks, highlighting the different zones contemplated in our analysis.
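To make the bipartite construction concrete, the following sketch turns per-seed following lists into an edge list and finds the accounts followed by more than one seed. It is a simplified illustration under our own naming; the project built and explored its networks in Gephi, and the input format (a mapping from seed to followed usernames) is an assumption.

```python
from collections import defaultdict


def bipartite_edges(following):
    """Turn {seed: [followed accounts]} into (seed, account) edges of a bipartite graph."""
    return [(seed, account) for seed, accounts in following.items() for account in accounts]


def shared_accounts(following, min_seeds=2):
    """Return accounts followed by at least `min_seeds` seeds, with the seeds that follow them."""
    followed_by = defaultdict(set)
    for seed, accounts in following.items():
        for account in accounts:
            followed_by[account].add(seed)
    return {acc: seeds for acc, seeds in followed_by.items() if len(seeds) >= min_seeds}
```

Accounts returned by `shared_accounts` are the ones that tend to sit towards the centre of such a network, because several seeds pull them together.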
5.1. Key grammars in profile description
The Instagram bio’s text close reading pointed to some interesting findings regarding the ecologies of bot networks.
The emoji analysis showed that the most frequent symbols were used in the context of supporting Bolsonaro (28.6%), followed by messages praising the values of family and faith (27%) and phrases supporting digital business (16.3%).
Overall, the categories that emerged from the analysis of the contextual meaning of emojis are well aligned with the publics and values generally associated with Brazilian conservative political agendas, and especially with Bolsonaro's agenda and supporting publics. The interchangeable use of the same emojis to represent both family and religious values interestingly echoes the advance of the conservative agenda in the country. Political representatives from this strand have been resorting to discourses that merge religion, especially neo-Pentecostal evangelical churches, and the upbringing of the family, mainly a heteronormative one.
Emojis frequency in bios' texts by contextual meaning
It is also interesting to note that the location cluster is composed of palm tree, wave and cactus emojis, evoking the landscape of the Northeastern part of Brazil, where Bolsonaro usually has less political support in comparison to left-wing politicians. The flags of the United States and Israel, both seen as allies by Bolsonaro's administration, also appear.
In the professional cluster, emojis linked to musicians of gospel and country music stand out, which relates to Bolsonaro's supporting publics. In the security cluster, the use of emojis representing skulls and knives relates to the symbol of BOPE, the police tactical unit of the Military Police of Rio de Janeiro State. BOPE's symbol derives from the slogan "knife in the skull" ("faca na caveira" in Portuguese), which became popularly associated with violent police operations in favelas.
The URLs analysis pointed out that the bios' texts included 65 links. Websites, in general, were the most common type of URLs and concerned pages of digital influencers, pentecostal churches (e.g. Bola de Neve Church), and political movements (e.g. Nas Ruas and Vem Pro Quartel). There was also a significant presence of messaging apps such as Telegram and Whatsapp, which may indicate that the accounts are using external platforms to coordinate action on Instagram.
URLs frequency in bios' texts by host categories
The analysis of word frequency in the Instagram bios of the bolsobots following network accounts showed Bolsonaro's 2018 electoral campaign slogan at the centre of the word cloud. In Portuguese: “Brasil acima de tudo! Deus acima de todos!” (in English: “Brazil above everything! God above everyone!”).
Word frequency in Instagram’s bio by the bolsobots following network accounts.
Note: The size and the colour of the words indicate the frequency of use. This figure was plotted using the R package wordcloud.
The finding shows that Bolsonaro's cyber troops have been spreading this slogan on Instagram. Santini, Salles, Tucci, & Estrella (2021) have found evidence of Twitter bots spreading the same campaign slogan on social media since 2016.
Bolsonaro’s presidential campaign billboard in a Brazilian city during the 2018’s campaign. Source: Veja, 2018.
To understand the context of the accounts’ Instagram bio, we extracted hashtags from the texts. We found 473 unique hashtags used 769 times. We coded manually every hashtag that appeared more than four times in the following categories: campaign, family, ideology, location, religious, and other.
Hashtag categories extracted from accounts’ bio description. 473 unique hashtags were used 769 times.
More than 50% of the categorized hashtags were campaign hashtags supporting Bolsonaro (in pink). Regarding the electoral campaign context, the #bolsonaro2022 shows that the 2022 presidential election campaign is a relevant topic for these accounts.
Bolsobots following network Instagram accounts’ bio hashtags Treemap
The #bolsonaro2018 suggests ‘out of date' Instagram bios, since this hashtag relates to an electoral campaign conducted three years before the data collection date. This can be an indication of inauthentic behaviour. The use of Twitter accounts that were automated specifically for an electoral campaign and then abandoned has already been reported in the literature (Santini, Salles, & Tucci, 2021).
In a similar process, we extracted 1,532 mentions to 975 unique Instagram accounts. We coded manually every mention that appeared at least four times in the following categories: politician, political party, political movement, traditional media, alternative media, other people, soccer team, and others. We created a circle pack visualization where mentions' circles were coloured by category and sized by frequency.
Bolsobots following network Instagram accounts’ bio mentions
The most mentioned profiles belong to alt-right politicians (light green), including Jair Bolsonaro and his sons (Carlos Bolsonaro, Eduardo Bolsonaro, and Flavio Bolsonaro). Jair Bolsonaro’s wife, Michelle Bolsonaro, was also frequently mentioned. Conservative media outlets, both traditional and alternative, appear among the top mentions. The result points to strong support for Bolsonaro and his political allies by bolsobots following networks on Instagram.
5.2. Issue alignments & Behaviours
#Aesthetic correlations between images and topics
Adopting a text-driven analysis using the LDA algorithm, eight topics emerged:
Top 10 words for each topic created through the LDA algorithm
Note: All words were translated from Portuguese
- Call to action: aimed at activating and involving the public, it is closer to the grammatization of Instagram. Common actions in that sense are invitations to follow a certain channel, add it on messaging apps (WhatsApp, Telegram), click on a certain link and so on;
- Profession: self-description topic, where the profile user refers to their profession;
- Bolsonaro: discourse about Brazil’s president Jair Bolsonaro (without mentioning the party), mostly showing respect and admiration;
- Family-religion: mainly focused on religious aspects and kinship relations (father, mother, married etc.);
- Promoting: similar to the profession as it promotes the profile and also describes it;
- Campaign: a strong emphasis on Bolsonaro’s campaign slogans, especially “Brazil above everything, God above everyone”;
- Conservative: mainly focused on right-wing, pro-gun and Christian topics, which are also closely clustered;
- Anti-establishment: based on the idea of renewing aspects as a whole.
At first, we assessed the impact and the size of each topic, adopting the number of profiles belonging to each of them as a metric. As shown below, the most recurrent topics in the sample are the Conservative and Family-Religious. This indicates the main way through which the profiles introduce themselves.
Topic frequency in Instagram bios' profiles
Note: Size of each topic based on the number of profiles that are mostly connected to it
Secondly, in order to locate the accounts, we traced the regional codes (DDD) from the phone numbers available for each profile and then plotted them on the map reported below. It is important to state, however, that it is not mandatory to have a phone number attached to the profile, so only 10.5% of profiles (834) were actually taken into account.
Discourse’s geographical map by regional codes (DDD)
Note: the intensity of blue is proportional to the number of profiles detected in that region
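The regional code extraction behind this map can be sketched as follows. The sketch assumes that phone numbers are Brazilian and written either with or without the +55 country code, so that a number has 10 or 11 digits once the country code is removed; the function names are ours and numbers in other formats are simply skipped.

```python
import re
from collections import Counter


def ddd_from_phone(phone):
    """Return the two-digit regional code (DDD) of a Brazilian phone number, or None."""
    digits = re.sub(r"\D", "", phone or "")
    # Drop the country code only when the length shows it is present (12 or 13 digits).
    if digits.startswith("55") and len(digits) in (12, 13):
        digits = digits[2:]
    # DDD + 8-digit landline or 9-digit mobile number.
    if len(digits) in (10, 11):
        return digits[:2]
    return None


def ddd_counts(phones):
    """Count profiles per regional code, ignoring numbers that cannot be parsed."""
    return Counter(ddd for ddd in map(ddd_from_phone, phones) if ddd is not None)
```

The length check matters because 55 is itself a valid regional code, so stripping every leading "55" would misplace some profiles.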
It should be noted that the geographical distribution does not match Brazilian demographics: São Paulo, the country’s most populated region, is far from the first position, and even Rio de Janeiro, the second biggest region, has quite low numbers. This may be explained by the ideological affiliation of the most represented states, located in the South region of Brazil, especially Rio Grande do Sul. The state showed strong support for Bolsonaro in the election, although it did not have the highest voting percentage in the country.
It is possible to see in the graph below that the main source of posts in the region had a conservative connotation.
Topic’s geographical distribution by regional codes (DDD)
Note: Size of each topic for each region
Starting with this general overview, we decided to set a 15% correlation threshold (instead of the maximum percentage illustrated above) to consider the affiliation of one text to more than one topic. This way we were able to construct a bipartite network that links each topic to the images related to it. The advantage of this method is that it enables a more relational approach towards the content in each profile bio, showing similarities and shared visual codes to evaluate the existence of a common aesthetic among topics.
The network is radially oriented, with the Call To Action topic-cluster dominating the central part of the network and acting as a bridge to other clusters. As the accounts located in this cluster take advantage of Instagram grammatization to invite users to join certain channels, such as WhatsApp and Telegram, or to click on certain links, it can be said that Instagram itself creates a vernacular that is central to most bolsobots network profiles. In short, there is a clear effort by these fake accounts to mimic real-life behaviour on the platform.
On the left part of the network, the Campaign and Bolsonaro clusters are tightly related. On the other side of the network, Promoting, Professional and Anti-Establishment are close to each other. This shows a two-way division: on one side, open support for Bolsonaro; on the other, a more self-oriented profile. Between both sides, Conservative and Family-Religious are bridges between self-oriented bios and other groups that show open support for Bolsonaro. This shows that these conservative and religious vernaculars are mobilized both when promoting oneself and when promoting the president. This division suggests the appeal and impact of family and conservative rhetoric, both for promoting the president and for showing alignment with his points of view.
Bipartite network of topics and profile images
When taking a closer look at each topic individually, the main difference is that profiles associated with Bolsonaro and campaigning tend to have more symbols as profile pictures, whereas Family-Religious, Conservative and Call To Action tend to have more faces. When considering bot activity, the presence of faces does not mean that they are indeed “real people”.
To have a closer look at the images themselves, we used the Google search engine to look at the profiles individually and see where these images could be found on the web. Many images were not found, and some were even found on Russian websites. This difficulty in finding the images was a good indication of bot activity. However, the number of images was so great that looking at each one individually would not have been achievable in such a short time.
Results of the reverse image search of one of the profile pictures in the dataset
Moving on from the reverse image search, we used Google Vision API to locate websites with full matching images in order to get a better understanding of the origin of these pictures (see Omena, Pillipets, Gobbo & Chao, 2021).
It was possible to create another bipartite network of images and the websites where these images were found online. As expected, a stock-photography cluster, mainly represented by Shutterstock, emerged. It was composed mainly of face images (or selfies), which might suggest the presence of bots that use “non-professional” photos to emulate the everyday aesthetics adopted by the majority of users on this platform. By the same token, images in the social media and other clusters could indicate catfishing, that is, profile pictures of existing users reused in a new context. However, there is no certainty that these profiles are indeed bot accounts.
Ego-network of stock-photo images with Google Vision API
Note: On the top left, the original network created by the full matching images of Google Vision API (the Bolsobots profile image circulation network), on red, the part that is being shown.
Finally, as we captured bot activity within this supporting ecosystem, we took further quantitative measurements in order to differentiate bot accounts from regular people. Specifically, we adopted the ratio between following and follower counts, weighed against the post count, to outline what could be considered “unusual” behaviour. This unusual behaviour was detected through a boolean bot probability variable, measured according to three parameters: (1) A following/follower ratio between 0.8 and 1.2, considering that it is quite common for bots to have a reciprocal following/follower ratio when inserted in a network of other bots. (2) A following/follower ratio greater than 5.0, considering that even though most users tend to follow more people than follow them, such a great disparity is quite unusual. The opposite, on the other hand, does not hold; otherwise, all influencers would be considered bot accounts. (3) Finally, having too many posts can be an indication of bot activity: we assumed 2,000 feed posts as a threshold, especially if the account has a low number of followers.
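The three parameters can be written down as a small function. This is a minimal sketch rather than the exact procedure used in the project: it assumes that the three parameters are combined with a logical OR, it does not model the extra weight given to accounts with few followers, and the handling of accounts with zero followers is our own choice. The function name and arguments are hypothetical.

```python
def is_unusual(followers: int, following: int, posts: int) -> bool:
    """Flag an account as 'unusual' when any of the three heuristics applies.

    Assumption: the heuristics are OR-combined; the source text does not
    state how they were merged into a single boolean.
    """
    if followers > 0:
        ratio = following / followers
    elif following > 0:
        ratio = float("inf")  # follows others but has no followers at all
    else:
        ratio = 0.0  # empty account: not flagged on the ratio criteria

    reciprocal = 0.8 <= ratio <= 1.2   # (1) follow-for-follow pattern
    over_following = ratio > 5.0       # (2) follows far more than is followed
    heavy_poster = posts > 2000        # (3) more than 2,000 feed posts
    return reciprocal or over_following or heavy_poster


# Hypothetical accounts: (followers, following, posts)
examples = {
    "reciprocal": (1000, 1000, 10),
    "over_following": (100, 600, 10),
    "influencer_like": (50000, 300, 50),
    "heavy_poster": (50000, 300, 2500),
}
flags = {name: is_unusual(*values) for name, values in examples.items()}
print(flags)
```

Note that an influencer-like profile (many followers, few followed accounts) is deliberately left unflagged, which mirrors the remark that the inverse of parameter (2) does not indicate bot activity.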
As shown in the image below, the Family-Religious topic seems to behave in the most unusual way, together with the Campaign topic, due to a disproportion between follower and following counts and a suspicious number of posts, especially for the latter.
Bot mapping according to following/ follower ratio and to post count. In blue, we considered unusual behaviour on the platform that could indicate bot activity.
#Categorisation & behaviour profiling through image analysis
Using the dataset of 33,768 Instagram profile images, we captured emerging categories (see Table 1) to identify and profile pro-bolsobots. Through a combination of ImageSorter navigation and close reading of profiles, the categorization not only depicts the characteristics of bot profiles but also offers clues for bot-hunting in a methodological sense.
More specifically, we made use of an affordance of ImageSorter (the ability to group images by hue and bring similar images together) to search for bot-like profiles. Once all images are loaded and grouped in ImageSorter, it is easy to navigate and discover clusters of repeated images, where the meaning of a repetition depends on how the dataset was constructed. In our profiling work, we categorized the image repetition in ImageSorter into four kinds: followed repetition, identical usage repetition, related aesthetic repetition, and related symbology repetition. These are all indicators of potential bot accounts, although with different implications.
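To give a rough idea of what grouping images by hue involves, the sketch below assigns an image to a coarse colour bucket based on the hue of its average pixel. It is not ImageSorter's algorithm, which we did not inspect; the bucket boundaries, the saturation cut-off and the function name are all assumptions made for illustration, and an image is represented simply as a list of RGB tuples.

```python
import colorsys


def color_bucket(pixels):
    """Assign an image, given as a list of (r, g, b) tuples in 0-255,
    to a coarse colour bucket based on the hue of its mean colour."""
    if not pixels:
        raise ValueError("image has no pixels")
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n / 255
    g = sum(p[1] for p in pixels) / n / 255
    b = sum(p[2] for p in pixels) / n / 255
    hue, saturation, _value = colorsys.rgb_to_hsv(r, g, b)

    if saturation < 0.15:  # assumed cut-off for near-greyscale images
        return "black and white"
    degrees = hue * 360
    if degrees < 15 or degrees >= 345:
        return "red"
    if degrees < 45:
        return "orange"
    if degrees < 75:
        return "yellow"
    if degrees < 165:
        return "green"
    return "other"


# Tiny hypothetical "images" of two pixels each
print(color_bucket([(0, 255, 0), (10, 240, 10)]))
print(color_bucket([(128, 128, 128), (120, 120, 120)]))
```

Averaging pixels is a crude summary (a half-red, half-green image averages to a yellowish tone), so a sketch like this is only useful for a first pass before the visual navigation described above.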
Repeated Image Clusters of bolsobots following network
In the case of followed repetition, the profile image of one unique account appears multiple times in both the dataset and ImageSorter, which means that the account is followed by more than one seed bot. Identical usage repetition, by contrast, occurs when the same profile image is used by different accounts, pointing to organized bot behaviors. Related aesthetic and symbology repetition follow similar rules, though the images used by the group are not exactly the same.
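The first two kinds of repetition can be told apart programmatically once each profile image has an identifier. The sketch below assumes a hypothetical table of (seed account, followed account, image hash) rows, where the image hash stands for any fingerprint that is equal for identical pictures; the related aesthetic and symbology kinds require visual judgement and are not covered.

```python
from collections import defaultdict

# Hypothetical rows: (seed account, followed account, profile image hash)
rows = [
    ("seed_a", "acct_1", "hash_x"),
    ("seed_b", "acct_1", "hash_x"),  # acct_1 is followed by two seeds
    ("seed_a", "acct_2", "hash_y"),
    ("seed_b", "acct_3", "hash_y"),  # acct_2 and acct_3 share one picture
    ("seed_a", "acct_4", "hash_z"),
]


def find_repetitions(rows):
    """Separate followed repetition from identical usage repetition."""
    seeds_by_account = defaultdict(set)
    accounts_by_image = defaultdict(set)
    for seed, account, image_hash in rows:
        seeds_by_account[account].add(seed)
        accounts_by_image[image_hash].add(account)

    # Followed repetition: one account, several seeds following it.
    followed = {a: s for a, s in seeds_by_account.items() if len(s) > 1}
    # Identical usage repetition: one image, several distinct accounts.
    identical = {h: a for h, a in accounts_by_image.items() if len(a) > 1}
    return followed, identical


followed, identical = find_repetitions(rows)
print(followed)
print(identical)
```

Keeping the two dictionaries separate matters because the same image appearing twice in the sorted view can mean either case, and only identical usage repetition points to coordinated reuse of a picture.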
Focusing on repetition clusters then enables us to select certain images and take a closer look at the original accounts. Here we further evaluate the profile page and categorize the accounts into humanized sophisticated bot-like accounts, cheap bot-like accounts, and discrete bots. They can also be divided into bots of varied interaction capacity, measured by a mixture of humanistic features including story highlights, generated content, follower-following ratio, replies to comments, etc. (Bradshaw, Bailey & Howard, 2021).
We also found that different colors are connected with specific themes and symbols in profiles. For example, green profile images are more likely to be used by nationalists, since green is one of the main colors of the Brazilian flag.
RawGraph Alluvial Chart of Categorization in Colors & Themes & Profiles Pics
Similar patterns could be found in other color clusters. As mentioned above, green and yellow images are most favored by patriotic or pro-party accounts, active politicians and political influencers; in black and white clusters, we discovered discrete bots, personalized accounts and political information hubs; red clusters are filled with anti-communist, anti-feminist, religious, and parody profiles; while religious symbols and political faces also appear frequently in orange clusters.
Clusters and themes of profile images belonging to bolsobots following networks
Visual Analysis Case Studies: Bot Stories
After identifying the theme clusters, we randomly selected some profiles to carry out a discourse analysis, considering mainly the textual and image data available in each profile. This allowed us to better assess the characteristics of the accounts: for instance, whether an account displayed more human-like or bot-like behaviour, what the main strategies used in post messages were, and whether there was any odd behaviour in the comments section. To illustrate this analytical phase, we present some interesting cases below.
>>Hammer & Sickle Cluster
The Hammer and Sickle cluster had an overarching theme of anti-leftist rhetoric which likened left-wing thinking to communism. Its messaging strategies worked to attack the opposition and/or mount smear campaigns against the left. The type of repetition within this bot cluster is described as ‘related aesthetic repetition’, because the higher capacity bots share the same structure and theme but are differentiated by minor aesthetic differences (such as the profile images, bios, and posted content).
The account types combined humanised, sophisticated bots with cheaper, discrete bots. As observed by Confessore, Dance et al. in The Follower Factory, ‘high-quality bots are usually delivered to customers first, followed by millions of cheaper, low-quality bots, like sawdust mixed in with grated Parmesan’ (Confessore, Dance et al., 2018). It is likely that similar techniques are applied by the bot farm from which the Hammer and Sickle cluster originated because it is cost-effective and allows different bots to play different roles within the group.
The anti-feminist cluster used anti-leftist rhetoric and misused content from authentic profiles to attack the opposition and mount smear campaigns. The account types combined humanised and sophisticated bots that interacted with other users with cheap, discrete bots. The type of repetition was related aesthetic repetition, repeating the same structural motifs with minor individual embellishments.
The following high capacity bot interacts with users while impersonating a child, using a child’s TikTok videos. Other users question the authenticity of the profile.
‘Auntie Bots’ are a humanised and sophisticated bot cluster (with a minority of discrete bots). Their messaging strategy revolves around non-political content and/or pro-government/pro-party propaganda. The cluster is highly individualised and uses symbology repetition of religious symbols, hearts and flowers.
The ‘Auntie bot’ cluster adopts the Brazilian stereotype of an older, religious, and conservative female and their associated online behaviours, which is used as a form of disguise. This is a progression from stereotypical associations of bots as ‘pretty girls’ to a strategised adoption of contextually relevant socio-digital identities. Impersonating ‘baby boomers’ online in particular also exploits the generalised perception of the clunky and ‘offline’ nature of boomers on social media. This makes such bots even more difficult to detect.
#Behaviour categories connections
We also found differences between account types in message strategies as well as in the type of profile picture repetition. As shown in the graph (see below), discrete bots have no profile image and no content, while cheap bot-like accounts vary across the three message strategy categories and also across the types of repetition (within this category, only symbology repetition was not observed).
In contrast, most humanized and sophisticated accounts use related aesthetic, symbology or followed repetition on profile pictures. They are more likely to post messages that express a viewpoint, such as pro-government or pro-party propaganda, and to attack the opposition or mount smear campaigns. These elaborate messages and profile pictures in turn indicate that these accounts are more sophisticated, and they increase the effort required to distinguish between bots and real-person accounts.
Nevertheless, a small number of humanized accounts with related symbology repetition had no political content and were the hardest to identify. These accounts were the ones that most often used the strategy of impersonating a social stereotype. We can hypothesize that these accounts are mostly used to follow, support and comment on other institutional or official political accounts.
Correlation between account types, message strategies and types of repetitions
#Bolsobots following networks
Bolsobots bipartite following network: who matters to pro-bolsobots accounts?
The following questions guided our analysis: who matters to Bolsobots supportive accounts? Who are the bridging accounts? Where are the ghost accounts located in the network?
When observing the following network, we were able to use all 33,771 accounts because there was no need for more in-depth information. In overview, at the centre of the bolsobots bipartite following network we see relevant actors of Brazilian politics: Bolsonaro and his family, government ministers, alt-right politicians and even the Federal Police of Brazil. In the mid-zones, there are clusters of bolsobots conservative accounts but also Trump-related and alt-right bot accounts. We also detected a cluster of accounts representing the unsuccessful political party that Bolsonaro tried to create, called "Alliance for Brazil", and another constituted by Brazilian celebrities and a TV channel network (SBT). In the bridging peripheral zones, there are clusters of accounts with ordinary usernames and also supportive accounts. We suspect that the accounts with ordinary usernames may correspond to the discrete bots previously detected, but further analysis is required to confirm our hypothesis.
Bolsobots bipartite following network
Note: 29,693 Instagram accounts as nodes. 37,771 edges as the act of following other accounts. The maximum number of following accounts per account: 7,500. On the left, the mid-zones of the network emphasize clusters of Bolsonaro supportive accounts. On the right, clusters of ordinary users and Bolsonaro supportive accounts bridging our seeds (some of the bolsobots accounts used as entry points to data collection).
In a more detailed perspective, to identify where verified accounts are located in the network, we colored the nodes: verified accounts - red, unverified profiles - light blue, and seed accounts - dark blue. We found relevant Brazilian political entities in the center of the network. They are highly connected within the network.
Zoom at the centre of Bolsobots bipartite following network
Note: Nodes were colored by account type (verified accounts - red, unverified - light blue, seed accounts - dark blue). Edges take the color of their target node. Relevant Brazilian political entities are highlighted in yellow. This graph was generated using the Gephi Force Atlas 2 layout.
The hyperconnected entities at the center of the Bolsonaro supportive network are alt-right politicians (@jairmessiasbolsonaro, @biakicis, @carlosjordy, @carlosbolsonaro, @nikolasferreiradm); official Brazilian government pages (@governodobrasil, @policiafederal); Brazilian government ministers (@marceloqueiroga, @genheleno, @damaresalvesoficial, @fabiofaria.br); and Bolsonaro’s wife (@michelebolsonaro) and ex-wife (@rogeriabolsonaro).
In the middle of the network we identified a cluster of Celebrities and TV.
Zoom at the celebrities and TV cluster and correspondent Instagram profile prints
Note: Prints of entities related to the Brazilian television network SBT (@marianagodo, @silviaabravanel, @teletonoficial). Silvia Abravanel is the daughter of the SBT owner, Mariana Godoy is a former SBT anchor (now a TV Record anchor), and Teleton is an SBT broadcast that raises money for children's rehabilitation centers. The original graph was generated using the Gephi Force Atlas 2 layout.
The TV and Celebrities cluster indicates people related to SBT (Sistema Brasileiro de Televisão, in English: Brazilian Television System), a Brazilian free-to-air television network. SBT was launched by Silvio Santos Abravanel after he acquired a TV concession in 1981. Silvio Santos is a declared Bolsonaro supporter who has been criticized for conflating the State with the federal government after calling Bolsonaro his boss (VEJA, 2021). Santos is the father of Silvia Abravanel (see Figure) and the father-in-law of Fabio Faria, the Brazilian Minister of Communications (Correio Braziliense, 2020), whose profile appears in the center of the bolsobots following network.
Silvio Santos (SBT owner) and Bolsonaro in a Brazilian government official ceremony. In the picture, from left to right: General Mourão (vice-president), Edir Macedo (evangelical Bishop and TV Record owner), Jair Bolsonaro, Michele Bolsonaro (first lady), and Silvio Santos (SBT owner). Source: O Tempo (2019)
The role of ghost accounts (or accounts using the default Instagram profile picture)
In the bipartite network, the ghost accounts populate the different zones of the network, being followed exclusively by a seed account or serving as bridging accounts (periphery clusters), while occupying the central and mid-zones of the network (when the ghost accounts are followed by multiple seed accounts). As our previous analysis demonstrates, ghost accounts point to a bot typology created to integrate the bolsobots ecology in a discrete mode. Here we see that, although they do not exist in a large quantity, the ghost accounts occupy a central position in the network. As we can see in the monopartite network, they are shared central targets to pro-bolsobots which makes us wonder why ghost accounts are always present and associated with bots.
Bolsobots bipartite following network
Note: At the top, the bolsobots bipartite following network highlights the presence of ghost accounts within the network. At the bottom, a monopartite graph with unique usernames identifying the central position and importance of the ghost accounts
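The distinction between ghost accounts followed exclusively by one seed (periphery) and those followed by several seeds (central and mid-zones) can be read directly off the edge list. The following sketch assumes a hypothetical list of (seed, followed account) edges and a hypothetical set of accounts already identified as ghosts, for example by their default profile picture.

```python
from collections import defaultdict

# Hypothetical edges of the bipartite following network: (seed, followed account)
edges = [
    ("seed_a", "ghost_1"),
    ("seed_b", "ghost_1"),
    ("seed_c", "ghost_1"),
    ("seed_a", "ghost_2"),
    ("seed_b", "user_9"),
    ("seed_c", "user_9"),
]
# Hypothetical set of accounts using the default Instagram profile picture
ghosts = {"ghost_1", "ghost_2"}


def split_ghosts(edges, ghosts):
    """Split ghost accounts by how many distinct seeds follow them."""
    seeds_following = defaultdict(set)
    for seed, target in edges:
        seeds_following[target].add(seed)

    exclusive = sorted(g for g in ghosts if len(seeds_following[g]) == 1)
    shared = sorted(g for g in ghosts if len(seeds_following[g]) > 1)
    return exclusive, shared


exclusive, shared = split_ghosts(edges, ghosts)
print("followed by a single seed:", exclusive)
print("followed by multiple seeds:", shared)
```

In a force-directed layout, the shared ghosts are the ones pulled towards the centre because several seeds point at them, which is why counting distinct followers per ghost is a reasonable first check before reading the graph visually.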
An example of a ghost account located at the network periphery is shown below. The account name is perfilinativo.desativado (in English: inactiveprofile.deactivated). We conducted an Instagram search for accounts named "perfilinativo" (inactiveprofile) and the result can be seen below (right): Instagram returns a considerable number of ghost accounts whose names are variations of the keyword "perfilinativo".
On the left, a print of an Instagram ghost account, @perfilinativo.desativado (in English: inactiveprofile.deactivated). On the right, the results of the search for "perfilinativo" accounts.
Mapping connections through following networks helped us to characterise bolsobots strategies, such as the creation of multiple accounts using the same username pattern, where the pattern indicates topics of interest related to Bolsonaro (e.g. the conservative right, as we see below). This technique may assist further research on bolsobots based on username patterns.
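A simple way to surface such username patterns is to reduce each username to a stem and group accounts that share it. The sketch below is one possible heuristic, not the procedure used in this project: the stem is taken to be the first alphabetic token before any dot, underscore or digit, and apart from perfilinativo.desativado all usernames in the sample are hypothetical.

```python
import re
from collections import defaultdict


def username_stem(username: str) -> str:
    """Return the first alphabetic token of a username (assumed heuristic)."""
    tokens = [t for t in re.split(r"[._\d]+", username.lower()) if t]
    return tokens[0] if tokens else ""


def group_by_stem(usernames, min_size=2):
    """Group usernames by stem, keeping only stems shared by several accounts."""
    groups = defaultdict(list)
    for name in usernames:
        groups[username_stem(name)].append(name)
    return {
        stem: sorted(names)
        for stem, names in groups.items()
        if stem and len(names) >= min_size
    }


# Sample list; only the first username comes from the dataset described above.
sample = [
    "perfilinativo.desativado",
    "perfilinativo_01",
    "perfilinativo2021",
    "direitaconservadora.br",
    "direitaconservadora_22",
    "maria.silva",
]
patterns = group_by_stem(sample)
for stem, names in patterns.items():
    print(stem, names)
```

The heuristic will also group unrelated accounts that happen to share a common first name, so the resulting groups are candidates for close reading rather than evidence of coordination in themselves.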
Although very diverse, the ecology of bolsobots profiles is situated in specific contexts, pointing to "real life" Bolsonaro's supporting publics: religious and moral conservative (Kalil, 2019), pro-guns (Phillips, 2021), anti-feminist and anti-communist (Kalil, 2019), nationalist (Tamaki & Fuks, 2020).
This is well represented by the size and the geographical distribution of the topics identified from the profile bios: the biggest one is indeed the Conservative discourse, concentrated especially in the Southern regions of the country, which are traditionally more conservative, followed by Family-Religious and Campaign. This reflects the different articulations of the rhetoric underlying Bolsonaro's political communications.
Also, considering the bot probability measure, accounts in the Family-Religious and Campaign topics show unusual behaviour, suggesting bot activity. Finally, these clusters seem to share a common aesthetic made of symbols and national colors.
The impact of the platform on the general analysis can’t be put aside. The vernacular used specifically on Instagram is a bridging point across all thematic clusters in the analysis. This supports the assumption that platforms are not neutral fields of dispute, but have a profound impact on any content that they contain.
Moreover, it can be argued that bots may be used to simulate audiences. The dataset showed contrasting demographics when comparing the profile creator’s phone number with the emojis used in profile names. In the first case, most accounts were created in the southern part of Brazil, where Bolsonaro has more support; in the second, emojis represented users' origins as the northeastern part of the country, where he tends to be less popular.
This project concludes that bolsobots operate in two ways: through i) accounts that generate content and gather substantial numbers of followers, which are supported by ii) a diverse range of discrete bots representing different discourses supportive of the Brazilian president Jair Bolsonaro. In general, we found that bolsobots following networks use Instagram grammars very strategically to support and praise Bolsonaro’s campaign discourses, from highlighting his political slogan (Brazil above everything! God above everyone!) to targeting specific conservative agendas by creating fake supportive publics. In detail, we detected a few strategies associated with bolsobots that can help to build awareness of their agency in the Brazilian context, contributing to scholarly research in related topics.
- Bolsobots use emojis and hashtags as forces of expression:
- emoji appropriation explicitly indicates the publics and values generally associated with Brazilian conservative political agendas (e.g. family, religion and faith), also evoking the landscape of the Northeastern part of Brazil, where Bolsonaro usually has less political support in comparison to left-wing politicians.
- hashtag appropriation resonates with Bolsonaro’s election campaign topics (e.g. #bolsonaro2022, #fechadocombolsonaro), Christianity-related topics and a combination of neoliberal, conservative and pro-gun debates (e.g. #economialiberal, #armamentista). The use of non-current hashtags (e.g. #bolsonaro2018) points to bot accounts created to support the 2018 electoral campaign.
- Bolsobots use stock-image pictures or fake user pictures to look like “real people” who support Bolsonaro’s campaign and conservative discourses. In addition, they take advantage of hashtags, emojis and links in their profile descriptions, calling on other Instagram users to join or follow other Bolsonaro-supportive channels, such as Telegram and WhatsApp.
- Bolsobots hijack social media users’ content to feed their Instagram accounts and also to produce non-political content. They also attack (or mock) the opposition by adopting anti-leftist vernaculars.
- When associated with the Bolsonaro and Campaigning topics, bolsobots tend to use symbols as profile pictures, whereas we see more faces in topics related to Family-Religious, Conservative or Call To Action. Those three are bridging actors between open support for Bolsonaro and self-oriented content, showing a “sweet-spot” for discrete bots.
- Bolsobots make use of symbols in their profile pictures to express different forms of alignment with Bolsonaro’s campaign discourses, such as conservatism and nationalism (e.g. Brazilian flag) and anti-leftism or anti-communism (e.g. hammer and sickle).
- Discrete bolsobots accounts stand for:
- Instagram accounts with no image profile, the so-called ghost accounts
- humanised and sophisticated bots adopting the Brazilian stereotype of an older, religious, and conservative female and their associated online behaviours
- an identical usage repetition, in which different bots accounts use the same profile picture.
Therefore, this project has shown that the challenge of discovering if an account is a bot or not can be handled by a variety of methods, from unusual metrics to profile-image sorting and in-depth analysis. This multi-method approach is useful not only to spot bots, but to understand their behavior and in which context they have more or less impact.
Bail, C. A. (2014). The cultural environment: Measuring culture with big data. Theory and Society 43, 3-4, 465-482.
Benoit, K., Muhr, D., & Watanabe, K. (2021). stopwords: Multilingual Stopword Lists (Version 2.2). Retrieved from https://CRAN.R-project.org/package=stopwords
Bounegru, L., Gray, J., Venturini, T., & Mauri, M. (2017). A Field Guide to Fake News: A Collection of Recipes for Those Who Love to Cook with Digital Methods.
Correio Braziliense. (2020, June 12). Saiba quem é Fábio Faria, novo ministro e genro de Silvio Santos. Retrieved July 30, 2021, from Correio Braziliense website: https://www.correiobraziliense.com.br/app/noticia/politica/2020/06/12/interna_politica,863197/saiba-quem-e-fabio-faria-novo-ministro-e-genro-de-silvio-santos.shtml
Confessore, N., Dance, G. J. X., et al. (2018). The Follower Factory. The New York Times. [online] 27 Jan. Available at: https://www.nytimes.com/interactive/2018/01/27/technology/social-media-bots.html [Accessed 15 Jul. 2021].
Bradshaw, S., Bailey, H., & Howard, P. (2021). Industrialized Disinformation: 2020 Global Inventory of Organized Social Media Manipulation. Working Paper 2021, Project on Computational Propaganda, Oxford, UK.
Jacomy, M. (2021). Situating visual network analysis (Doctoral dissertation). Retrieved from https://reticular.hypotheses.org/1879
Feinerer, I., & Hornik, K. (2020). tm: Text Mining Package (Version 0.7-8). Retrieved from https://CRAN.R-project.org/package=tm
Fellows, I. (2018). wordcloud: Word Clouds (Version 2.6). Retrieved from https://CRAN.R-project.org/package=wordcloud
Kalil, I. (2019). Who are Jair Bolsonaro’s voters and what they believe. Center for Urban Ethnography. https://doi.org/10.13140/RG.2.2.35662.41289
Mauri, M., Elli, T., Caviglia, G., Uboldi, G., & Azzi, M. (2017). RAWGraphs. RAWGraphs. Presented at the Proceedings of the 12th Biannual Conference on Italian SIGCHI, New York, NY, USA. Retrieved from https://rawgraphs.io/about/
Neuwirth, E. (2014). RColorBrewer: ColorBrewer Palettes (Version 1.1-2). Retrieved from https://CRAN.R-project.org/package=RColorBrewer
Omena, J.J. & Amaral, I. (2019). Sistema de leitura de redes digitais multiplataforma. In: Métodos Digitais: Teoria-Prática-Crítica, edited by Janna Joceli Omena. Lisboa: ICNOVA. ISBN: 978‐972‐9347‐34‐4
Omena, J. J. (2019). Reading Digital Networks: Climate Emergency, Bolsonaro & Bot Image Circulation by Vision API. The social platforms [research blog]. Retrieved from https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/
Omena, Chao, Pilipets et al. (2019). Bots and the black market of social media engagement. Digital Methods Summer School 2019. DOI: 10.13140/RG.2.2.30518.52804 Retrieved from https://wiki.digitalmethods.net/Dmi/SummerSchool2019Botsandtheblackmarket
Phillips, T. (2021, February 15). Anger as Bolsonaro moves to make guns easier to access: “A threat to democracy.” Retrieved July 30, 2021, from The Guardian website: http://www.theguardian.com/world/2021/feb/15/jair-bolsonaro-brazil-guns-easier-to-acquire
RStudio Team. (2020). RStudio. Boston, MA. Retrieved from http://www.rstudio.com/
Santini, R. M., Salles, D., & Tucci, G. (2021). When Machine Behavior Targets Future Voters: The Use of Social Bots to Test Narratives for Political Campaigns in Brazil. International Journal of Communication, 15(0), 1220–1223.
Santini, R. M., Salles, D., Tucci, G., & Estrella, C. (2021). A militância forjada dos bots: A campanha municipal de 2016 como laboratório eleitoral. Lumina, 15(1), 124–142. https://doi.org/10.34019/1981-4070.2021.v15.29086
Tamaki, E. R., & Fuks, M. (2020). POPULISM IN BRAZIL’S 2018 GENERAL ELECTIONS: AN ANALYSIS OF BOLSONARO’S CAMPAIGN SPEECHES. Lua Nova: Revista de Cultura e Política, (109), 103–127. https://doi.org/10.1590/0102-103127/109
VEJA. (2020, 12). Em nota interna, Silvio Santos chama Bolsonaro de “meu patrão.” Retrieved July 30, 2021, from VEJA SÃO PAULO website: https://vejasp.abril.com.br/cultura-lazer/silvio-santos-bolsonaro-meu-patrao/
Venturini, T., Jacomy, M., & Jensen, P. (2019). What Do We See When We Look at Networks. An Introduction to Visual Network Analysis and Force-Directed Layouts. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3378438
Weltevrede, Lindquist et al. (2020). Good Enough Publics. Retrieved from https://wiki.digitalmethods.net/Dmi/SummerSchool2020GoodEnoughPublics
Wickham, H. (2021). tidyverse: Easily Install and Load the “Tidyverse” (Version 1.3.1). Retrieved from https://CRAN.R-project.org/package=tidyverse
Group members affiliation & task division
Janna Joceli Omena (JJO), Center for Advanced Internet Studies, Germany;
Thais Lobo (TL), King's College London, UK;
Giulia Tucci (GT), Federal University of Rio de Janeiro, Brazil;
Francisco W. Kerche (FWK), Federal University of Rio de Janeiro, Brazil;
Malu Paschoal (MP), University of Trento;
Xiaolu JI (XJ), Tsinghua University;
André Rodarte (AR), University of Cambridge;
Eleanor Griffiths (EG), Goldsmiths College, University of London;
Lorenzo Zaffaroni (LZ), Catholic University of Milan;
Gaia Amadori (GA), Catholic University of Milan;
Shilan Huang (SH), University of Amsterdam.
Project pitch: JJO & TL
Key grammars (an overview): GT, FWK & TL
Profile description analysis: FWK, GA, LZ & JJO
Profile picture analysis: MP, XJ, AR, EG & SH
Visual network analysis: GT & JJO
Thanks to Andrea Medina and Antonella Autuori for their valuable feedback to improve the final visualizations and also for helping us with the alluvial diagram