Hello, Guys,
In this tutorial, I will guide you on how to perform sentiment analysis on textual data fetched directly from Twitter about a particular matter using tweepy and textblob.
Sentiment analysis is a process of analyzing emotion associated with textual data using natural language processing and machine learning techniques.
If you're new to sentiment analysis in Python, I would recommend going through emotion detection from text first before proceeding with this tutorial.
We are going to build a Python command-line tool/script that performs sentiment analysis on Twitter for a specified topic.
You just enter a topic of interest, and the script will search Twitter, scrape related tweets, perform sentiment analysis on them, and then print a summary of the analysis.
To follow this tutorial you need the following:
You need to sign up for a Twitter Developer Account and then apply for access to the API keys.
Once you sign up for a developer account and apply for the Twitter API, it might take anywhere from a few hours to a few days to get approved.
After being approved, go to the Keys and Tokens page of your app and copy your API key and API secret key.
The easiest way to install the latest version of tweepy from PyPI is by using pip:
$ pip install tweepy
You can also use Git to clone the repository from GitHub to install the latest development version:
$ git clone https://github.com/tweepy/tweepy.git
$ cd tweepy
$ pip install .
Next, install textblob and download the corpora it needs:
$ pip install -U textblob
$ python -m textblob.download_corpora
Now that everything is installed, let's get our hands dirty by coding our tool from scratch.
First of all, I have separated the project into two files: one holding the API keys and the other holding the code for the script.
.
├── API_KEYS.py
└── app.py
0 directories, 2 files
API_KEYS.py looks as shown below. Replace the values of api_key and api_secret_key with the credentials you received from Twitter.
api_key = 'your api key'
api_secret_key = 'your api secret key'
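If you would rather not keep credentials in a source file, one alternative is to read them from environment variables. The sketch below is only an illustration: the variable names TWITTER_API_KEY and TWITTER_API_SECRET_KEY and the helper load_keys are my own assumptions, not something Twitter or tweepy requires.

```python
import os


def load_keys(env=None):
    """Return (api_key, api_secret_key) read from environment variables.

    The variable names are hypothetical; pick whatever suits your setup.
    """
    env = os.environ if env is None else env
    try:
        return env["TWITTER_API_KEY"], env["TWITTER_API_SECRET_KEY"]
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise RuntimeError(f"Missing environment variable: {missing}") from None
```

With this approach, API_KEYS.py could contain the function plus the line api_key, api_secret_key = load_keys(), and the rest of the script would stay unchanged.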
Now let's start coding our script.
from tweepy import API, OAuthHandler
from textblob import TextBlob
from API_KEYS import api_key, api_secret_key
Before we can fetch tweets from Twitter, we have to authenticate our app using the API key and API secret key.
To authenticate, we will use OAuthHandler as shown below:
authentication = OAuthHandler(api_key, api_secret_key)
api = API(authentication)
To fetch tweets about a particular topic, call the search method on the authenticated API object, as shown below. (In tweepy 4.0 and later, this method is named search_tweets.)
public_tweets = api.search(Topic)
public_tweets is an iterable of tweet objects, but to perform sentiment analysis we only need the tweet text.
To access the text of each tweet, use the text property on the tweet object, as shown in the example below.
from tweepy import API, OAuthHandler
from textblob import TextBlob
from API_KEYS import api_key, api_secret_key
authentication = OAuthHandler(api_key, api_secret_key)
api = API(authentication)
corona_tweets = api.search('corona virus')
for tweet in corona_tweets:
    text = tweet.text
    print(text)
When you run the above script, it will produce output similar to what is shown below.
$ python example.py
......
RT @amyklobuchar: So on Frontier Airlines you now have to pay an extra fee to keep yourself safe from corona virus. As I said today at the...
There are so many disturbing news these days ON TOP OF CORONA VIRUS. It just sinks my heart 😟 We all need therapy.
RT @ug_chelsea: Corona virus symptoms basically are the same feelings you get when your wife is checking your phone
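Notice that some tweets in the output are cut off with an ellipsis. As far as I know, tweepy exposes the complete text in a full_text attribute when the search is called with tweet_mode='extended', and for retweets the complete text lives on retweeted_status. The helper below is a sketch built on that assumption. get_text is a hypothetical name, and the SimpleNamespace objects only stand in for real tweet objects so the example runs without network access.

```python
from types import SimpleNamespace


def get_text(tweet):
    """Return the longest available text for a tweet-like object."""
    # For retweets, prefer the original tweet, which is not truncated.
    original = getattr(tweet, "retweeted_status", None)
    if original is not None:
        tweet = original
    # full_text exists only in extended mode; fall back to text otherwise.
    full_text = getattr(tweet, "full_text", None)
    return full_text if full_text is not None else tweet.text


# Stand-ins for tweet objects (not real tweepy objects).
plain = SimpleNamespace(text="short tweet")
extended = SimpleNamespace(full_text="the complete tweet text")
retweet = SimpleNamespace(
    full_text="RT @someone: the complete...",
    retweeted_status=SimpleNamespace(full_text="the complete original text"),
)

for tweet in (plain, extended, retweet):
    print(get_text(tweet))
```

Note that for a retweet this returns the original tweet's text, so the "RT @user:" prefix is dropped.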
Now let's use TextBlob to perform sentiment analysis on those tweets to check whether they are positive or negative.
polarity = TextBlob(Text).sentiment.polarity
If the polarity is less than 0 it's negative
If the polarity is greater than 0 it's positive
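The two rules above leave out a polarity of exactly 0, which TextBlob typically returns when it finds nothing sentiment-bearing in the text (polarity is a float between -1.0 and 1.0). A minimal sketch of a three-way labelling follows. label_polarity is a hypothetical helper and takes the score as a plain number, so it runs without TextBlob installed.

```python
def label_polarity(polarity):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if polarity < 0:
        return "negative"
    if polarity > 0:
        return "positive"
    return "neutral"


# A few sample scores, standing in for TextBlob(text).sentiment.polarity.
for score in (-0.6, 0.0, 0.35):
    print(score, label_polarity(score))
```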
I then combined what we just learned into the script below, adding a clean_tweets function to remove hashtags from tweets.
from tweepy import API, OAuthHandler
from textblob import TextBlob
from API_KEYS import api_key, api_secret_key
def clean_tweets(tweet):
    # Drop every word that starts with '#' (hashtags).
    tweet_words = str(tweet).split(' ')
    clean_words = [word for word in tweet_words if not word.startswith('#')]
    return ' '.join(clean_words)

def analyze(Topic):
    positive_tweets, negative_tweets = [], []
    authentication = OAuthHandler(api_key, api_secret_key)
    api = API(authentication)
    public_tweets = api.search(Topic, count=10)
    cleaned_tweets = [clean_tweets(tweet.text) for tweet in public_tweets]
    for tweet in cleaned_tweets:
        tweet_polarity = TextBlob(tweet).sentiment.polarity
        if tweet_polarity < 0:
            negative_tweets.append(tweet)
            continue
        # Tweets with a polarity of 0 (neutral) are counted as positive here.
        positive_tweets.append(tweet)
    return positive_tweets, negative_tweets

positive, negative = analyze('Magufuli')
print(positive, '\n\n', negative)
print(len(positive), ' VS ', len(negative))
To analyze a different topic, change the argument passed to the analyze function to the topic of your interest.
You can also specify the number of tweets to fetch from Twitter by changing the count parameter.
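Since the goal is a command-line tool, the topic and count could also come from the command line rather than from edits to the source. Below is a sketch using argparse from the standard library. The argument names topic and --count and the build_parser helper are my own choices, not part of the original script.

```python
import argparse


def build_parser():
    """Create a parser for the topic and the number of tweets to fetch."""
    parser = argparse.ArgumentParser(
        description="Sentiment analysis of tweets about a topic"
    )
    parser.add_argument("topic", help="topic to search for on Twitter")
    parser.add_argument(
        "--count", type=int, default=10, help="number of tweets to fetch"
    )
    return parser


# Parsing an explicit list here keeps the example runnable anywhere;
# call parse_args() with no arguments to read the real command line.
args = build_parser().parse_args(["Magufuli", "--count", "20"])
print(args.topic, args.count)
```

To use this in app.py, analyze would need to accept a count argument and pass it through to the search call, for example analyze(args.topic, args.count).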
When you run app.py, it will produce results similar to what is shown below.
$ python app.py
.....................
['@o_abuga Obvious, the test kits the results are doubtful!! Magufuli said it']
9 VS 1
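The clean_tweets function above only drops hashtags, so retweet markers, mentions, and links still reach TextBlob. A regex-based variant is sketched below. The name clean_tweet_text and the exact patterns are assumptions for illustration and will not cover every edge case.

```python
import re

# Patterns are deliberately simple; real tweets have more edge cases.
RETWEET_PREFIX = re.compile(r"^RT\s+@\w+:\s*")
NOISE = re.compile(r"https?://\S+|[@#]\w+")


def clean_tweet_text(text):
    """Strip the retweet prefix, links, mentions and hashtags from a tweet."""
    text = RETWEET_PREFIX.sub("", text)
    text = NOISE.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


print(clean_tweet_text("RT @user: Stay safe #corona https://example.com @friend"))
```

Whether removing mentions and hashtags helps is a judgment call, since a hashtag can itself carry sentiment. It is worth comparing results with and without the extra cleaning.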
🎉🎉🎉 Congratulations, you have just completed a tutorial on Twitter sentiment analysis using Python. You should be proud of yourself. Tweet now to share this good news with your fellow developers.
The full code for this article can be found on my GitHub :-)
Previously published at https://kalebujordan.com/twitter-sentiment-analysis/