The study used a dataset of tweets from Twitter (now X) related to anti-Ukrainian narratives. The dataset comprised all 16,700 tweets containing the chosen hashtags that were collected from January 25 to February 22, 2023. The Thick Big Data method, together with latent Dirichlet allocation (LDA) and natural language processing (NLP) techniques, was used to identify recurring themes and two- and three-word sequences (bigrams and trigrams), enabling an in-depth understanding of the community’s discourse.
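As an illustration of how two- and three-word sequences can be counted, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the function names `tokenize` and `ngram_counts` and the two sample strings are hypothetical, and a real analysis would likely add stop-word removal and lemmatization for Polish.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and return its word tokens.

    URLs and @mentions are dropped; the '#' sign is discarded, so a
    hashtag is kept as an ordinary token.
    """
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"\w+", text)


def ngram_counts(texts, n):
    """Count all n-word sequences across a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list yields the n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Invented placeholder strings, not tweets from the dataset.
sample_tweets = [
    "To nie nasza wojna #ToNieNaszaWojna",
    "Powtarzam: to nie nasza wojna!",
]

bigrams = ngram_counts(sample_tweets, 2)
trigrams = ngram_counts(sample_tweets, 3)
print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

Counting on whole-tweet token lists keeps n-grams from spanning two different tweets, which matters when the goal is to surface recurring phrases rather than accidental adjacencies.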
To probe the proliferation of anti-Ukrainian narratives on Twitter in Poland, we combined several strategies. We searched for hashtags such as #StopUkrainizacjiPolski (#Stop the Ukrainianization of Poland), #ToNieNaszaWojna (#It isn’t our war), and #NiedlaWojny (#No to war), and ran keyword searches including “Ukry” (Ukrainians), “Ukraińcy” (Ukrainians), “ukraińscy faszyści” (Ukrainian fascists), “Wołyń” (Volhynia), “bandera” (Bandera), “banderowcy” (banderists), “Wielka Polska” (Great Poland), “UPA” (The Ukrainian Insurgent Army), “Ukropolin” (Ukrapolish) and “fuck Ukraine.” We also used the snowball method to identify interconnected accounts, broadening our analytical scope. We relied on a Python scraper for data collection because it allowed us to gather the full spectrum of data beyond the limits of the API available at the time. The scraper code was based on the snscrape script.
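The hashtag and keyword lists above can also be applied locally, for example to check which terms a collected tweet actually contains. The sketch below is an assumption about how such a check might look, not the scraper the study used; `build_pattern` and `matched_terms` are hypothetical names, and the example strings are invented.

```python
import re

# Search terms as listed in the text.
HASHTAGS = ["#StopUkrainizacjiPolski", "#ToNieNaszaWojna", "#NiedlaWojny"]
KEYWORDS = [
    "Ukry", "Ukraińcy", "ukraińscy faszyści", "Wołyń", "bandera",
    "banderowcy", "Wielka Polska", "UPA", "Ukropolin", "fuck Ukraine",
]


def build_pattern(terms):
    """Compile one case-insensitive pattern matching any term as a whole unit."""
    # Longest terms first, so multi-word phrases are tried before shorter ones.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    # The lookarounds stop short terms from matching inside longer words.
    return re.compile(
        r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE
    )


PATTERN = build_pattern(HASHTAGS + KEYWORDS)


def matched_terms(text):
    """Return the sorted, lowercased set of search terms found in a text."""
    return sorted({m.group(0).lower() for m in PATTERN.finditer(text)})


# Invented placeholder strings, not tweets from the dataset.
print(matched_terms("To #tonienaszawojna, UPA i Wołyń"))
print(matched_terms("Upadek rządu"))
```

The word-boundary lookarounds prevent a short term such as “UPA” from matching inside an unrelated word like “upadek”. The trade-off is that exact matching misses inflected Polish forms (for example “banderowców”), so a fuller treatment would need stemming or lemmatization.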
For the subsequent analysis of the extracted tweets, we calculated engagement and selected the most engaging tweets for more nuanced qualitative research (Philipp, 2014).
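The text does not specify how engagement was calculated. Purely as an assumption, the sketch below treats engagement as the sum of likes, retweets, replies, and quotes and keeps the top-ranked tweets; the `Tweet` class, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical record for one collected tweet."""
    tweet_id: str
    text: str
    likes: int = 0
    retweets: int = 0
    replies: int = 0
    quotes: int = 0


def engagement(tweet):
    """Assumed engagement score: the unweighted sum of all interactions."""
    return tweet.likes + tweet.retweets + tweet.replies + tweet.quotes


def top_engaging(tweets, k):
    """Return the k tweets with the highest assumed engagement score."""
    return sorted(tweets, key=engagement, reverse=True)[:k]


# Invented placeholder records, not data from the study.
sample = [
    Tweet("a", "...", likes=3, retweets=1),
    Tweet("b", "...", likes=10, replies=4),
    Tweet("c", "...", quotes=2),
]

top = top_engaging(sample, 2)
print([t.tweet_id for t in top])
```

An unweighted sum is the simplest choice. Weighting retweets more heavily, or normalizing by follower count, would produce a different shortlist for qualitative reading.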
The sources of this data are cited in the article, and readers may access them by contacting the lead author. The data contains content that incites violence and hatred, which carries a risk of unethical use.