Web3 - The influence of the paid and organic keywords on the structuring of the new narrative of Web 3.0 through Google search engine
Anestis Amanatidis, Cemal Tahir Şanlı, Elif Buse Doyuran, Fanni Kovács, Ignacio Guerrero Martínez, Zofia Karolak.
Summary of key findings
This research focused on finding patterns in the clustering of keywords around the major actors that dominate the ad market and the search results, in order to understand how Web 3.0 actors position themselves in these topical domains via association with keywords. We found that companies and advertisers frame Web3 in strongly technologically deterministic terms: they emphasise the technologies and services related to the topic while establishing no apparent connection to its ideology.
The term Web3 is used more and more, both in academia and in everyday conversation. Web3 is the idea of a new iteration of the World Wide Web based on cryptocurrencies, blockchains, decentralized finance and token-based economies (Voshmgir 2020). The term was coined in 2014 by Gavin Wood; however, it has attracted strong interest only in the past year, notably from venture capitalists, big tech companies and aficionados of the crypto world. Because the phenomenon is fairly recent, it naturally raises numerous questions and concerns, as well as curiosity about its potential. Because Web3 is presented as the possible future of the internet, it is not only interesting but also important to ask who exactly is in charge of envisioning that future, and consequently that of finance, and on which exact terms it is being built.
For this reason, we decided to study how Google Ads define the narrative on Web3 through the use of related key terms. The aim was to understand which actors (or groups of actors) are leading the conversation surrounding Web3 and to see which imaginaries of decentralised technologies are being prioritised in the current narrative. We were also interested in seeing who entered ad-bidding on Google over time, which could show how cryptocurrencies and NFTs have become mainstream in everyday conversations. For the purpose of this study, we opted for a qualitative approach using Ahrefs, Answer The Public and Google Trends to gather data, and Gephi to visualize our findings. Each tool aided a different step of the project, and together they allowed us to find patterns in the clustering of keywords around the major actors that dominate the ad market and the search results. This in turn allowed us to understand how Web3 actors position themselves in these topical domains via association with keywords.
2. Research Questions
After getting acquainted with the features of Ahrefs, we formulated feasible research questions. Our main question was whether we could find a pattern in the clustering of keywords around the major actors that dominate the ad market and the search results, in order to understand how Web 3.0 actors position themselves in these topical domains via association with keywords. Moreover, we aimed to discover what narratives these actors are creating around the themes connected to Web 3.0, and how they thereby alter users' understanding of it. We therefore formed the following sub-questions:
- How are different domains bidding for/buying a set of keywords around Web3 and related topics, i.e. crypto, NFTs, blockchain, DeFi and metaverse?
- What other organic keywords lead to the pages that receive the most traffic when users are searching for these same topics?
3. Methodology and initial datasets
We wanted to study the narratives around Web3 through the lens of Google Ads, and in the initial phase of our study we discovered that the dynamics around the concept differ between ad-buyers and the general public. We therefore opted to use no existing dataset and instead compiled two separate datasets of our own on Ahrefs, to ensure that we were studying the most up-to-date and relevant data for the purposes of our project: one for paid terms (keywords) and one for the terms most frequently associated with the subject at hand, i.e. Web3, in organic searches by the public.
Ahrefs is a tool designed for SEO practices that is especially used for link building. It gets data from sanctioned Google APIs and has its own crawler for feeding a backlinks database. It is able to scrape Google’s SERPs (Organic and Paid results), thereby in a way functioning like an archive of Google.
First, however, to get an overview of the general understanding of web 3.0, we used Google Trends, a service that allowed us to track the popularity of certain searches among users. When each term was analysed as a ‘topic’, Google Trends displayed the queries around that keyword. By combining those results with the ones from Answer the Public, we were able to assess which topics come together in the idea of web 3.0 from the perspective of the general public.
In order to answer the research questions and conduct feasible research, it was necessary to narrow down the scope of the analysis. Using the service answerthepublic.com, we extracted the 5 most popular topics searched for on Google in association with the query ‘web 3.0’: blockchain, cryptocurrency, defi, NFTs and metaverse. At a later stage, while preparing a protocol for our analysis, we noticed interesting differences between paid and organic search. Within our subgroup, we therefore decided to split further into two teams. One was responsible for analysing the network of keywords that domains paid for, whereas the other focused on the keywords used for the topics in question in Google’s organic search. Even though the two protocols differed, both used Ahrefs as the source of data, with the location set to the US.
The first analysis focused on data retrieved from the ‘related terms’ section available in Ahrefs after searching for each aforementioned topic. The related keywords were then sorted by the features ‘all’, ‘top 100’, ‘top ads’ and ‘bottom ads’. Before exporting, the dataset was additionally sorted by descending CPC (cost per click), as we wanted to discover which keywords ad-buyers attached the most importance to in connection with the concepts at hand, and therefore paid the most money for.
This way we obtained a CSV file with 20 columns describing various characteristics of each term. For our research, however, the file was reduced to two columns: the keyword and the domain that was bidding for it. Because the initial file exported from Ahrefs listed advertisers as URLs, we used the DMI Harvester tool (function ‘only return host names’) to transform them into domain names, which enabled a clearer visualisation at a later stage.
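The URL-to-host step can also be approximated outside the DMI Harvester. The following is a minimal sketch and not the Harvester's implementation: it assumes a two-column layout of keyword and advertiser URL, the sample rows and the helper name `to_host` are hypothetical, and stripping a leading "www." is a choice made here for readability of node labels.

```python
from urllib.parse import urlsplit

def to_host(url: str) -> str:
    """Return the lower-cased host name of a URL, without a leading 'www.'."""
    # urlsplit only fills in the host when a scheme or '//' is present,
    # so bare entries such as 'example.com/page' get a '//' prefix first.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

# Hypothetical (keyword, advertiser URL) rows standing in for the Ahrefs export.
rows = [
    ("buy nft", "https://www.example-market.com/nft/start?ref=ad"),
    ("crypto wallet", "example-wallet.io/download"),
]

# Keyword-domain pairs of the kind later used as network edges.
edges = [(keyword, to_host(url)) for keyword, url in rows]
print(edges)
```

Reducing every URL to its host means that several landing pages of one advertiser collapse into a single node, which is what makes the paid-keyword network legible.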
For the study of organic keywords associated with the concepts in question, we adopted a protocol that differed from that of the paid-keyword sub-study in meaningful ways. As we were examining the search practices of the general public, we decided to include variations of each concept to allow for different ways of expressing it and thus retrieve more thorough results. In practice, that meant extending our keyword set as follows:
- web3, web 3.0
- blockchain, block chain, blockchains
- cryptocurrency, crypto, cryptocurrencies, bitcoin
- decentralized finance, defi, de-fi
- nfts, nft, non-fungible tokens, nonfungible tokens
- metaverse, meta-verse
Subsequently, we again used the Keyword Explorer section of Ahrefs and entered each of the 6 keyword sets separately. Since we were interested in which keywords sit at the centre of most organic traffic, we then filtered the data according to “traffic share”. At that point, some of the domains most extensively affiliated with the keywords turned out to be items such as “cnbc.com” or “ibm.com”, which in that form provide no clear context or indication as to the source or target of the keywords. We therefore decided that looking at the specific pages within those domains would yield more accurate results.
For all pages with a traffic share of at least 1% (together totalling around 90% for each keyword set), we extracted the affiliated organic keywords into separate CSV files, which we later combined into one larger CSV file per keyword set. After repeating the same process for all 6 keyword sets, we brought the data of these combined CSV files together in a master data table. To get cleaner results, we then removed items with no real traffic (traffic less than 1). We also removed data points with missing or corrupt values, as well as columns containing additional information that Gephi would not process. In the end, we arrived at a collective dataset of more than 40k valid data points (rows), each designating a keyword - URL pairing.
The visualisations for both datasets were carried out in the same way. The CSV files were imported into the Gephi visualisation tool as “edges tables”, and data was “copied from ID to label” in order to display the URL names properly. We selected the “directed” graph type, applied the Force Atlas 2 layout and partitioned the nodes by modularity class to achieve an easily discernible visual separation between clusters. This allowed us to identify various clusters within both networks: the domains bidding for the paid keywords, and the domains visible in the organic search results.
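The merging and cleaning steps described above could be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the column names `keyword`, `url` and `traffic` and the sample rows are hypothetical and do not reproduce the Ahrefs export schema.

```python
import csv
import io

# Hypothetical per-page exports; column names are assumptions, not the Ahrefs schema.
export_a = (
    "keyword,url,traffic\n"
    "what is web3,https://example.org/web3,120\n"
    "web3 meaning,https://example.org/web3,0.4\n"
)
export_b = (
    "keyword,url,traffic\n"
    "nft explained,https://example.org/nft,85\n"
    ",https://example.org/nft,30\n"
    "nft price,https://example.org/nft,n/a\n"
)

def load_rows(text):
    """Parse one CSV export into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def is_valid(row):
    """Keep rows that have a keyword, a URL and a numeric traffic value of at least 1."""
    if not row["keyword"] or not row["url"]:
        return False
    try:
        return float(row["traffic"]) >= 1
    except (TypeError, ValueError):
        # Missing or corrupt traffic values are treated as invalid rows.
        return False

# Combine the separate exports, then drop low-traffic and broken rows.
combined = load_rows(export_a) + load_rows(export_b)
pairs = [(row["keyword"], row["url"]) for row in combined if is_valid(row)]
print(pairs)
```

Only the keyword and URL columns survive the last step, mirroring the removal of columns that Gephi would not process.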
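As an illustration of the import format, the sketch below writes keyword-domain pairs as an edge table with `Source` and `Target` columns, the column names Gephi's spreadsheet import looks for in an edges table. The sample pairs are hypothetical, and both the keyword-to-domain direction and the removal of duplicate pairings are assumptions of this sketch rather than documented steps of the project.

```python
import csv
import io

# Hypothetical keyword-domain pairs; real pairs would come from the cleaned datasets.
pairs = [
    ("buy nft", "example-market.com"),
    ("nft marketplace", "example-market.com"),
    ("crypto wallet", "example-wallet.io"),
    ("buy nft", "example-market.com"),  # duplicate pairing
]

def to_edge_table(pairs):
    """Serialise unique pairs as CSV text with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    # dict.fromkeys drops duplicates while preserving first-seen order.
    for source, target in dict.fromkeys(pairs):
        writer.writerow([source, target])
    return buffer.getvalue()

edge_csv = to_edge_table(pairs)
print(edge_csv)
```

Saved to a file, this text can be loaded as an edges table; the graph type, layout and modularity partition are then chosen inside Gephi itself.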
The data gathered throughout the research process allowed us to draw various conclusions regarding our initial hypotheses. In order to interpret our findings properly and concisely, we split them into two sections. In the first part, we looked at the findings from the Gephi visualisations of the paid search results. This gave us a better sense of the war on ad-bidding from the advertisers’ side. After that, we created visualisations through the same process in Gephi for the organic search results, which in turn painted a picture of the users’ perspective. The juxtaposition of the two sets of search results then helped us to draw the conclusions presented in the discussion section.
Within the paid search results, we did not find any paid keywords that connect the topics of blockchain, cryptocurrency, defi, NFTs and metaverse to the general web 3.0 term. We therefore decided to focus on the potential network formed by the related terms between the topics themselves, which is presented in the visualisation below.
The domains paying for keywords associated with these subjects were found to be advertisers focused on selling and buying technologies and artifacts. For example, the only domain found to be buying terms related to the metaverse was Facebook. This is very likely connected to the company’s name change in late October 2021. It is nevertheless striking that Facebook appears to be the only domain bidding for anything related to this topic, and it would be worth analysing whether the company is building some sort of monopoly around it.
On the other hand, the graphic representation above shows that the domains buying keywords around the NFT topic also bid for multiple terms related to e-commerce domains such as shopify.com and the Gucci online store, creating a strong connection between the two clusters. Similarly, we observed a link between keywords related to blockchain and cryptocurrency. These two clusters of terms seem to be connected by educational domains that bid on both topics and focus on providing content that helps users learn how the two technologies work. Additionally, many of the domains buying keywords around cryptocurrency offer trading services.
As for the organic search results, there were numerous notable findings that could be observed through the two graphs created on Gephi. To start with, expectedly, there were seven main groups identified, as defined per our keywords (crypto and bitcoin were part of the same keyword set in our query design). The main groups were identified as the clusters visible on the visualization:
Six of these main groups were interconnected, albeit with differing intensities and volumes. The exception was Metaverse, which seemed to have a negligible connection to the overall picture. The “Metaverse” group was focused in large part on Facebook-related results, following the company’s name change to “Meta”.
One of the most notable things we noticed is that each cluster of keywords also contained numerous sub-clusters, which in turn provided the data necessary for the analysis. In the first visualisation we can see a logical flow of clusters going from blockchain to crypto and then to bitcoin.
In the blockchain cluster, the biggest websites with the most traffic were those linked to information seeking, explaining blockchain as a concept. There is also a sizable sub-cluster around logging in to the crypto.com site, apparently used by people who are already familiar with and investing in the crypto world, which is itself a child of blockchain technology.
The bitcoin cluster contained sub-clusters focusing on price and investment-related questions, which were therefore more informational in nature. These were often closely related to the crypto cluster, within which we also identified some historical or comparative questions as well as informational ones.
As for the NFT cluster, the main sub-clusters were informational, explanatory, ‘encyclopaedic’ ones such as “NFT explained”. More separate was a “How to create NFTs” sub-cluster, which is closely related to another sub-cluster around NFT markets, for those already familiar with NFT trading. These two sub-clusters sat close to each other but somewhat apart from the sub-clusters of the learning crowd, with fewer connections to them.
For the De-Fi cluster, the biggest sub-clusters were also information-seeking ones, alongside some smaller ones relating to De-Fi regulations as well as concerns and open-ended questions over whether De-Fi is a scam. Another smaller sub-cluster concerned De-Fi applications, referring to the concrete real-life uses of De-Fi.
Finally, the Web3 cluster was interesting because it was very sparsely connected to the other main clusters: it stood apart, and very little connection could be observed. One sub-cluster focused on trying to understand what Web3 represents for the future of the Internet. Another displayed a comparison with web 2.0, and a third focused on Web3 applications, relating to more real-life uses. It should also be noted that there was no clear concentration in the centre of the Web3 cluster and no clear sub-formation, as the central area was scattered and unfocused.
When we set out on our project, we sought to see how exactly the concepts under the grander idea of web 3.0 interrelate and interconnect, as well as how they are entangled with the broader discussions of the greater web 3.0 “revolution”, with all its disruptive technologies that can potentially re-invent many aspects of the way we live our daily lives. We expected to find web 3.0 “at the heart” of the network, both literally and figuratively, meaning that any discussion of any of the web 3.0 novelties would have clear connections to the “root” concept.
In this light, it was rather surprising to observe that the more “tangible” notions within the web 3.0 ecosystem, such as blockchain, cryptocurrencies and NFTs, which have found venues of application in real life to varying degrees, are quite clearly defined in themselves yet quite separate from any discussion and narrative around web 3.0 itself as an overarching concept. They display fragmentation rather than integration into one topical sphere of Web3. This suggests that web 3.0 has not yet managed to become part of the conversation for many, and that most internet users today are more interested and invested in the concrete applications of the heralded new age (Henderson 2021) than in conceptual or philosophical considerations. It also implies that “web 3.0” has not yet acquired commercial value for advertisers as a keyword.
The only clear exception to this separation is defi, whose cluster does show a connection to the web 3.0 cluster. Defi stands for decentralized finance, brought about in large part by the decentralizing effect that cryptocurrencies introduce: there is no policing or gatekeeping by the traditional players (governments, banks, regulatory bodies), and all individual users share transactions that are both secure and anonymous on a giant common ledger (‘Blockchain Explained: What Is Blockchain? | Euromoney Learning’ n.d.). The kinship between these two clusters, as opposed to the others, could be due to both notions being fairly conceptual in nature, representing grander ideas rather than any specific, smaller real-world application.
A closer look at some of the clusters yields interesting observations. The fact that the biggest concentrations are around “x explained” sources and pages, not only for blockchain but also for bitcoin, NFTs, defi and web 3.0, shows that these are still fairly new concepts for most people. Most internet users are at the stage where they are trying to figure out what these concepts mean and what they could mean for the days ahead.
In the organic keywords network, a clear line of succession is visible with regard to blockchain technology: the blockchain cluster branches into (“gives birth to”) the crypto cluster, which in turn extends into bitcoin, the first and still the greatest example of the cryptocurrencies.
Within the bitcoin cluster, one sizable sub-cluster revolves around questions such as “how much was bitcoin when it first started?”. These queries possibly represent more financially literate individuals who already have an investment portfolio, probably dealing in stocks or other more conventional instruments. Having heard so much about bitcoin and cryptocurrencies in general, they are likely beginning to consider joining the “hype” themselves. They are therefore trying to assess the trajectory of the asset with comparative, analytical questions about where it stands today compared to its inception, in order to understand whether it is a viable investment option.
The metaverse cluster is fairly small, suggesting that not many people are aware of or interested in the metaverse for the time being. In its current form, it is heavily dominated by queries about Facebook. While surprising at first glance, this is evidently a result of the tech giant’s name change in October 2021 (Isaac 2021). By changing the semantics of its name, the company is apparently seeking to create a strong bond between itself and web 3.0. First, the keyword itself, metaverse, is associated with the next step in the development of the web, and Facebook has thereby marked its participation in the process. Second, with the new name, the company introduced a slightly new business model by strongly concentrating its practices and services on the popular emerging technology of the next stage of the web: virtual life, land and reality. In this way, Facebook, as an influential actor on the technological stage, defines the future through this innovation.
Comparing the two networks, a striking conclusion is that while the clusters of organic keywords pertaining to different sub-categories are clearly distinct, the paid search terms display a much more meshed form. That is to say, paid search terms appear less heterogeneous than those of organic search. It appears that commercial players are even buying much more commonplace, generic keywords of established e-commerce notions and names (e.g. Shopify.com) that are only remotely related, if at all, to web 3.0 concepts. They do so in an attempt to hook average users, usher them in and get them interested in, and eventually investing in, the tradable web 3.0 products or services the respective merchant deals in, be it cryptocurrencies, NFTs or something else. This could be read as a discrepancy between conventional commercial practices of “market segmentation” and the complexities of the consumer landscapes of digital search.
Through our research, we looked at how different domains structure a specific narrative around Web3 through the deliberate use of paid and organic search keywords in the Google search engine. We found that companies frame web 3.0 in strongly technologically deterministic terms, emphasizing the technologies and services related to the topic while establishing no apparent connection to its ideology.
Having conducted the analysis, we believe that Facebook, now known as Meta, and its monopoly around the metaverse and related terms merit further research. Such research could help to observe the company’s influence over the future development of the web and to assess its consequences and potential harms. In this way, academics can gain a deeper insight into society’s dependence on the Internet as a source of knowledge and as a tool for understanding various events and ideas.