Investigating the Information Landscape of the Russia-Ukraine Conflict on YouTube. Comparisons of the emerged content from distinct queries on YouTube surrounding the political conflict

Team members: Anasthasya Mathilda, Danique Vijlbrief, Harneet Bahal, Li Jiang, Tianyi Li

Summary of Key Findings

This paper observes a disparity in the information that YouTube surfaces about the Russia-Ukraine conflict, depending on the keywords a user searches for. The query ‘Ukraine War’ returns mostly general news coverage, whereas the query ‘Ukraine Military Operation’ returns more subjective coverage of the conflict.

1. Introduction

On the 24th of February 2022, Russia started an unprovoked invasion of Ukraine. At the time of writing, Russian troops are still occupying Ukrainian territories and actively attacking both Ukrainian soldiers and civilians. This war has left Ukraine devastated, while Russia, under Vladimir Putin’s policy, isolates itself further from the west.

While western media widely condemn the Russian invasion and refer to the Russia-Ukraine conflict as a ‘war’, Russia avoids this term and instead refers to it as a ‘special military operation’ (McMahon, 2022; Troianovski & Safronova, 2022; Wesolowski, 2022). Russia creates its own narrative about the war in Ukraine and actively spreads this propaganda through state media (Mozur, Satariano, & Krolik, 2022). Moreover, on the 4th of March, Putin signed a censorship law that threatens imprisonment for any journalist who deviates from the Kremlin’s portrayal of the conflict in Ukraine (McMahon, 2022). The conflict has thus reinforced the opposition between western news media and Russian propaganda news media.

Naturally, people search for news items to get more information about the subject. One place where people search for this information is YouTube. With more than two billion users, “YouTube is one of the most dominant sources of online information” (Li et al., 2020, p. 1). YouTube’s dominant role in providing online information comes with an editorial responsibility to ensure that this information is, at least to some extent, based on facts. When it comes to the Russia-Ukraine conflict, this means that YouTube is expected to react in some way to misleading propaganda from the Kremlin. Indeed, YouTube took on the role of a dissident platform in this conflict when it took a political stand and deleted 70,000 videos and 9,000 channels that described the conflict as a ‘liberation mission’ (Milmo, 2022). In this way, YouTube has actively worked to moderate misinformation about the conflict.

In this context, it is interesting to investigate how YouTube mediates the Russia-Ukraine conflict and what the information landscape around this conflict looks like on YouTube. This research is based on two keyword queries that represent the two sides of the conflict: ‘Ukraine War’ and ‘Ukraine Military Operation’. These keyword queries are used when collecting datasets with YouTube Data Tools (Rieder, 2015). A quantitative analysis of these datasets is then performed to answer the research question.

2. Initial Data Sets

For this research, we looked at a total of 9 datasets, of which 1 was the starting point, 6 were analysed in detail, and 2 yielded insubstantial results. All datasets that were observed and used during this research process can be found in this linked folder.

Our starting point was a Video List dataset for the query ‘Ukraine War’, which our project leader extracted with the Video List Module of YouTube Data Tools (Rieder, 2015) on two separate occasions (December 2022 and January 2023). Comparing the two extractions revealed discrepancies in the number of videos published per week. This finding, combined with YouTube’s removal of videos, led us to obtain the following datasets from YouTube Data Tools (Rieder, 2015):

1. From the Video List Module, based on the search queries ‘Ukraine War’ (1 dataset) and ‘Ukraine Military Operation’ (1 dataset), with iterations set to 1, a timeframe of 1 February 2022 to 31 December 2022, the option to run the search for each day of the timeframe enabled, and results ranked by date. These datasets were the operational groundwork for answering our main research question and its sub-questions, enabling us to observe trends over time, publishing behaviours, and the activities of the actors (i.e. channels).

2. From the Video Network Module, based on the search queries ‘Ukraine War’ (2 datasets: video network and channel network) and ‘Ukraine Military Operation’ (2 datasets: video network and channel network), with iterations set to 1, a timeframe of 1 February 2022 to 31 December 2022, results ranked by relevance, and a crawl depth of 0. Each query generated both a video network and a channel network. The video network let us approximate what a user searching for each query might encounter, and it shows which videos are co-watched in the same session. We initially also obtained the same datasets with a crawl depth of 1; however, after processing, the results were not substantial and are therefore not used in the findings section of this report.

3. From the Channel Info Module, using seeds of channel IDs from the most active channels (based on the numbers of videos published, in accordance with the Video List datasets) from both the ‘Ukraine War’ and ‘Ukraine Military Operation’ datasets. These datasets were used for further investigation into the top 20 channels of each query as we noticed distinct categories of actors between the two queries. We were then able to observe the channel age difference and countries of origin between the active actors in each query.
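The selection of seed channels described in item 3 can be sketched as a simple count of videos per channel. This is only an illustration under stated assumptions: the rows and the field name `channel_id` are hypothetical and do not reproduce the actual column names or contents of the Video List export.

```python
from collections import Counter

# Hypothetical rows standing in for a Video List export; the field name
# "channel_id" is an assumption, not the tool's documented column name.
videos = [
    {"video_id": "v1", "channel_id": "chan_a"},
    {"video_id": "v2", "channel_id": "chan_b"},
    {"video_id": "v3", "channel_id": "chan_a"},
    {"video_id": "v4", "channel_id": "chan_c"},
    {"video_id": "v5", "channel_id": "chan_a"},
    {"video_id": "v6", "channel_id": "chan_b"},
]


def top_channels(rows, n=20):
    """Return the n channel IDs with the most videos, most active first."""
    counts = Counter(row["channel_id"] for row in rows)
    return [channel_id for channel_id, _ in counts.most_common(n)]


# With the toy data, the two most active channels are chan_a and chan_b.
seeds = top_channels(videos, n=2)
print(seeds)
```

The resulting list of IDs would then serve as the seeds for the Channel Info Module.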

3. Research Questions

The main question that guides this research is: What does the information landscape in the Russia-Ukraine conflict look like on YouTube?

In order to be able to answer this main question, three smaller sub-questions are answered. They are as follows:

1. Which kind of content circulates in the information landscape around the Russia-Ukraine conflict?

2. Who are the actors contributing to this information landscape?

3. Are there temporal dynamics to both the contents and the actors?

For the first sub-question, we hypothesise that most videos for both queries fall in the category News & Politics, since the Russia-Ukraine conflict is a political conflict that news media report on daily. However, this category could be less present for ‘Ukraine Military Operation’ due to YouTube’s removal of channels and videos that spread misinformation about the conflict. For the second sub-question, we expect that the actors contributing to the information landscape on YouTube are mainly news media that report on the conflict regularly. Again, this might be different for ‘Ukraine Military Operation’ for the same reason. Our hypothesis for the third sub-question is that the kind of content and the actors could change in ways that match key events in the conflict.

4. Methodology

Firstly, based on the research question and its sub-questions, we decided to compare two queries with possibly opposing views and connotations, and to obtain a dataset for each so that they could be compared and contrasted in parallel. In choosing the queries, we must acknowledge that no researcher in our sub-group was fluent in, or a native speaker of, Ukrainian or Russian. We therefore had limited linguistic and cultural context for choosing query terms in the native languages, which might have shed a better light on the landscape of the conflict on YouTube. Consequently, and as mentioned in earlier sections, we decided to compare and contrast the terms ‘Ukraine War’ and ‘Ukraine Military Operation’ for this research.

We chose ‘Ukraine War’ as a query because it represents the general global news narrative of this conflict. We chose ‘Ukraine Military Operation’ as the other query because our preliminary research on news articles and reports showed that Russia avoided and even banned calling this conflict a ‘war’, and instead referred to it as a ‘liberation operation’ or ‘military operation’ (McMahon, 2022; Troianovski & Safronova, 2022; Wesolowski, 2022). These two queries with distinct connotations were therefore chosen in the hope of showing us both ‘sides’ of this conflict and how it is mediated, portrayed, or moderated on YouTube.

As mentioned earlier, all datasets for this research were obtained through YouTube Data Tools (Rieder, 2015). To recap, the datasets used in this research were: Video List from both the ‘Ukraine War’ and ‘Ukraine Military Operation’ queries; Video Network from both queries; and Channel Info for the top 20 channels (the most actively posting, based on the number of videos published) from both queries.

For the analyses of the Video List and Channel Info datasets, Google Sheets was used as the main platform to process the data. In particular, the Pivot Table and Chart functions were used to process and visualise monthly posting trends, category share, the temporal dynamics of the category share, channel activity rank based on the number of published videos, channel country of origin, and a comparison of channel age share between the two queries. As for the Video Network datasets, the .gdf files obtained from YouTube Data Tools were processed in Gephi using the ForceAtlas2 layout algorithm. Network statistics (average degree and modularity) were also computed to detect clusters within the network.
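The pivot-table step for category share per month can be approximated in a few lines of Python. The sketch below is not the procedure run in Google Sheets; it only mirrors the same grouping logic, and the rows and the field names `published_at` and `category` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical rows; "published_at" and "category" are assumed field names.
videos = [
    {"published_at": "2022-02-24T08:00:00Z", "category": "News & Politics"},
    {"published_at": "2022-02-25T10:30:00Z", "category": "News & Politics"},
    {"published_at": "2022-02-26T12:00:00Z", "category": "People & Blogs"},
    {"published_at": "2022-03-01T09:00:00Z", "category": "People & Blogs"},
    {"published_at": "2022-03-02T09:00:00Z", "category": "News & Politics"},
]


def category_share_by_month(rows):
    """Map each YYYY-MM month to {category: percentage of that month's videos}."""
    counts = defaultdict(Counter)
    for row in rows:
        month = row["published_at"][:7]  # ISO 8601 timestamps start with YYYY-MM
        counts[month][row["category"]] += 1
    shares = {}
    for month, counter in counts.items():
        total = sum(counter.values())
        shares[month] = {
            category: round(100 * n / total, 1) for category, n in counter.items()
        }
    return shares


shares = category_share_by_month(videos)
for month in sorted(shares):
    print(month, shares[month])
```

Grouping by month first and normalising within each month is what makes the shares comparable over time even when the number of videos per month differs.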

5. Findings

Video List Module Findings

Video Quantity

There is a substantial difference in the number of videos that YouTube Data Tools (Rieder, 2015) returned for each of our queries. The ‘Ukraine War’ query generated 6,578 videos, whereas the ‘Ukraine Military Operation’ query generated 2,667 videos, so the first query returned roughly 2.5 times as many videos as the second. The gap could be a result of YouTube’s removal of channels and videos that were spreading pro-Russia propaganda, although the data alone cannot confirm this.

Video Length

The duration of the videos for each query was also analysed. Videos from the ‘Ukraine War’ query range from 0 seconds to 3,599 seconds, with an average of 666 seconds, or approximately 11 minutes. Each video with a reported duration of 0 seconds is a YouTube live stream that started at the onset of the conflict and is still ongoing.

Furthermore, the duration of the videos from the ‘Ukraine Military Operation’ query ranges from 2 seconds to 3,567 seconds, with an average of 370 seconds, or approximately 6 minutes.
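The conversions from seconds to minutes quoted above can be checked with a short helper. This is a minimal sketch; the function name `to_minutes_seconds` is introduced here for illustration only.

```python
def to_minutes_seconds(total_seconds):
    """Split a duration in seconds into whole minutes and remaining seconds."""
    return divmod(total_seconds, 60)


# Average durations reported for the two queries.
print(to_minutes_seconds(666))   # (11, 6)  -> about 11 minutes
print(to_minutes_seconds(370))   # (6, 10)  -> about 6 minutes

# Longest videos reported for the two queries.
print(to_minutes_seconds(3599))  # (59, 59)
print(to_minutes_seconds(3567))  # (59, 27)
```

In both datasets the longest video is just under one hour.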

Video Categories

An analysis of the overall video categories was conducted for each of the search queries. Figure 1 displays the overall dataset generated for the ‘Ukraine War’ query and indicates that 73.7% of videos belong to the News and Politics category. Additionally, the People and Blogs category accounts for 8.4% of the videos in the ‘Ukraine War’ dataset. Given the nature of the query, the People and Blogs category is deemed important, as it could signify vlog-like reporting, citizen journalism or coverage by political commentators.

Figure 1. Video categories for the ‘Ukraine War’ dataset.

In contrast, Figure 2 displays the video categories for the dataset obtained from the ‘Ukraine Military Operation’ query. The most prominent category is People and Blogs at 39.5%, followed by News and Politics at 38.1%. This suggests that, for this query, the conflict has been covered to a roughly equal extent by news media outlets and by individuals or political commentators.

Figure 2. Video categories for the ‘Ukraine Military Operation’ dataset.

Moreover, an analysis of the temporal dynamics was conducted for the video categories of each query. Figure 3 displays the category share per month for the ‘Ukraine War’ dataset and shows a steady dominance of the News and Politics category over time. Additionally, an increase in the People and Blogs category can be observed, which may denote an increase in live coverage, commentaries or citizen journalism.

Figure 3. Category share per month for the ‘Ukraine War’ dataset.

Furthermore, the category share per month for ‘Ukraine Military Operation’ can be observed in Figure 4. While the majority of the videos were categorised under News and Politics at the beginning of the conflict, a gradual change can be observed over time. For a brief period, the People and Blogs category dominated the category share, which could be attributed to YouTube’s removal of thousands of channels and videos that were spreading misleading pro-Russia propaganda. After this period, however, the two primary categories appear to contribute equally to the information landscape.

Figure 4. Category share per month for the ‘Ukraine Military Operation’ dataset.

Actor Analysis

To ascertain the most prominent contributors to the information landscape of the Russia-Ukraine conflict, we analysed the channels that uploaded content pertaining to our two queries. Table 1 shows a difference in the top channels for the queries ‘Ukraine War’ and ‘Ukraine Military Operation’: the first query appears to be dominated by official news outlets, and the second by individuals.

Table 1. A comparison of the publishing actors for ‘Ukraine War’ and ‘Ukraine Military Operation’.

Furthermore, an analysis of the channel age for the top 20 channels of both queries was also conducted. The results indicate that the channels for ‘Ukraine War’ tend to be older, which is consistent with the top channels being official news outlets. The channels for ‘Ukraine Military Operation’, by contrast, are comparatively younger, with 40% of them having started in 2022.

Figure 5. Channel age share for the top 20 channels in the ‘Ukraine War’ query.

Figure 6. Channel age share for the top 20 channels in the ‘Ukraine Military Operation’ query.
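The channel age share shown in Figures 5 and 6 amounts to counting channels per creation year and dividing by the number of channels. The sketch below illustrates this under clearly stated assumptions: apart from the 8 channels created in 2022 (40% of 20, as reported for ‘Ukraine Military Operation’), the creation years are invented placeholders and do not reproduce the Channel Info data.

```python
from collections import Counter

# Hypothetical creation years for 20 channels. Only the count for 2022
# (8 of 20, i.e. 40%) reflects the reported finding; the other years are
# placeholders, not values from the Channel Info datasets.
creation_years = [2022] * 8 + [2021] * 3 + [2019] * 4 + [2014] * 5


def year_share(years):
    """Return {year: percentage of channels created in that year}."""
    counts = Counter(years)
    total = len(years)
    return {year: 100 * n / total for year, n in counts.items()}


age_share = year_share(creation_years)
print(age_share[2022])  # 40.0
```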

Video Network Findings

The video networks for both queries, ‘Ukraine War’ and ‘Ukraine Military Operation’, were analysed to discover how the information landscape takes shape on YouTube. Comparing the two networks revealed a large cluster for the ‘Ukraine War’ query featuring video titles like ‘Savage War Crimes’ and ‘The KGB and Why It Is Feared’, along with titles featuring dictators like Hitler and Kim Jong-Il. Video titles expressing this sentiment are largely absent from the video network for the ‘Ukraine Military Operation’ query.