Introduction
Over the last several years, Russian trolls have attempted to spread propaganda and fake news on social media. Their purpose is to spread political ideas among the general population, both nationally and internationally.
On this website, a large number of these Russian tweets have been analyzed and visualized. We look into how the trolls operate and organize themselves, trying to find patterns in the madness. Such patterns might help us detect these Russian trolls at an earlier stage.
This data story consists of four pages, excluding this start page. You can navigate by clicking the following links. We recommend reading them in the following order:
Throughout this data story we do not explain how we found things out or how we coded them. If you want a full understanding of the process behind this page, or just want to view the code, please take a look at our notebook. There we also go through our line of thought step by step.
Additional information about the pages
1. General statistics.
This page shows several visualizations of basic statistics. How often did the trolls tweet? At what time of day? Were they more active during the election? What is the highest number of followers a troll has had? These questions, among others, will be answered.
2. Finding users with bot-like patterns.
This page shows the methods we used to decide which users to categorize as likely bots. This may be useful for better understanding how to detect the trolls.
3. Are the trolls interacting with each other?
This page shows how the Twitter trolls interact with each other, and on what scale.
4. How you may be able to detect some of the trolls.
This page gives you insight into whether it is possible to detect some of the Russian Twitter trolls, and provides statistics on how the users are divided into different classifications.
Technical information
If you want to check all our work for this project, take a look at our GitHub repo.
Our repo for the website can be found here.
We used two different datasets for this project. One was created by two researchers at Clemson University and contains some extra features, for example a classification of the accounts. The dataset can be downloaded and read about here.
The other dataset was released by Twitter this autumn. It can be found here.
Thanks for your time!