Data Science

Exploring Political Sentiments from Twitter Data

By Godwin Murithi

Twitter is currently one of the most widely used social networks — It is a real-time microblogging platform, publicly launched in July 2006. Twitter has one of the biggest data sets in the world. It is much different from Facebook from the aspect that Twitter is real time. 

Twitter datasets are awesome troves of information and provide great insights. It provides a platform where businesses and public persons engage with their audiences, important messages are shared in near-real time and, as the recent events show, even politics is done. Today, Twitter boasts of 450M monthly users globally.

Due to the rich data in the pooled tweets, governments, businesses and researchers have understood the power of twitter data. 

For instance, think of a way to predict the outcome of an election in a country. Given the fast-paced and concise format of messages shared, it is a nice tool that could be used to gather signals and sentiments that could affect the outcome of an election. These sentiments could be used to understand the mood of the public and their perceptions towards different candidates.

In recent studies, researchers have used twitter data to successfully understand, reach, predict and map violence hotspots during sports events

In business, Twitter helps brands to understand, track, and benchmark the conversations and perceptions surrounding the brand. In terms of customer engagement it helps directly engage with customers to quickly answer questions, resolve their issues, and provide exceptional service.

Understanding political sentiments

In this series, we’ll look into the process for extracting, transforming and understanding Twitter data relevant to the Kenya 2022 general election. To access the codes refer to this Github repository 

To access tweets we developed scripts in Python 3.7, Tweepy  and a couple of other libraries. Tweepy is an open-source python package that provides a way for developers to communicate with the Twitter API. 

Twitter levies a rate limit on the number of requests made to the Twitter API. To be precise, 900 requests/15 minutes are allowed; Twitter feeds anything above that an error

But first, you will need to obtain the necessary credentials from Twitter. These credentials from Twitter are used to instantiate the API. 

Mapping geographic distribution of tweets

To carry out spatial analysis on the data, it is necessary to add a location component to the individual tweets. This helps in mapping the distribution of tweets as well as classified sentiments. To achieve this, you can use the google maps API or Nominating API to geocode. 

Usually the google maps API has a limit of 2500 addresses per day. Nominatim is preferable since it is open and free. Additionally, there is no limit to the number of addresses you can geocode in a day. 

Due to the high number of tweets that you can scrap it is advisable to automate the process. The Tidygeocoder package in R which uses the Nominatim API provides an easy and effective approach to automate. 

Preliminary results

Below is a map of the distribution of tweets related to the 2022 Kenya general election from May to August of 2022.

 

Next steps

Through this, I was able to extract over a million tweets related to the 2022 Kenya general Election. These tweets were cleaned,  geocoded and mapped. From the geocoded tweets it was possible to determine the number of tweets per county, the dominant candidates in each county as well as the dominant candidates in the country from the sentiments of twitter users.

Next we are going to use machine learning to explore and classify the tweets based on their polarity. From the results we will map electoral related  violence hotspots and validate using data from the just concluded election.

Related Posts

Geo-coding Addresses Using QGIS

QGIS may be used to visualize and map addresses if you have a CSV file with addresses in it. This article provides a detailed tutorial with step-by-step directions…

Perfoming Spatial Analysis

Most data and measurements can be linked to specific locations and hence plotted on a map. You can tell what is present and where it is by using…

How Location Intelligence Is Powering Today’s Most Disruptive Products

Airbnb, DoorDash, Airbnb — what core element do these companies have in common? Yes, all three are headquartered in San Francisco and have been darlings of the startup…

AI and Deep Learning: What is there for the Geospatial Industry?

In recent years, Deep Learning has emerged as the leading approach to developing Artificial Intelligence (AI), enabling machines to perceive and understand the world. This advancement has had…

IMAGE CLASSIFICATION IN REMOTE SENSING

By Godwin Murithi Remote sensing is a technology that has been used for decades to collect information about the Earth’s surface using sensors mounted on satellites, airplanes, or drones….

Geo-Marketing: How Location Intelligence Can Drive Business Growth

By Godwin Murithi In today’s commercial world, location has become a well-known term among marketers and business owners seeking to capture the attention of local consumers. This is due…

Leave a Reply

Your email address will not be published. Required fields are marked *