In a world of hot takes, Twitter is just one of many virtual environments where we can analyze roughly 6,000 tweets per second to better understand how the world feels about a given topic. It’s 2017, so naturally we’re going to gauge the world’s sentiment toward Donald Trump. Then we’ll visualize it.
The folks at Initial State, a platform for building data visualizations, made this a reality, and it’s awesome. Combining Twitter’s API, PubNub, IBM Watson and Initial State, they were able to build a live Twitter dashboard that streams the tweets in realtime, analyzes and gauges the sentiment of each one, and publishes the results to the visualization in a number of different ways.
And even better, there’s a full tutorial on how to build it. With that, read on!
This tutorial was written by Rachel Gregory, Customer Developer Lead at Initial State. She writes a ton of kickass tutorials; check them out.
This project makes heavy use of PubNub BLOCKS, which provides a serverless environment for executing functions on your data in motion (in this case, executing functions on each tweet that comes from the Twitter API via PubNub).
If you want to check out the source code in its entirety, it’s available here. Now, onto the tutorial!
The key to this entire project is PubNub. Their slogan, “Realtime Apps Made Simple”, is true. They provide over 70 SDKs and have plenty of pre-connected services in the form of BLOCKS. These blocks are basically bits of code that process your data before or after it’s sent to its destination — no server needed!
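To give you a feel for the model, here’s a minimal sketch of a “Before publish or fire” event handler (this is just the general shape, not the actual Watson block; the added field is an arbitrary example):

export default (request) => {
  // request.message is the in-flight payload; a block can read or modify it
  request.message.tagged_at = new Date().toISOString();
  return request.ok(); // let the (modified) message continue to subscribers
};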
We will be using both the IBM Watson Sentiment Analysis BLOCK and the Initial State BLOCK. We will also be using PubNub’s live Twitter stream.
Note: PubNub’s Twitter stream only pulls in 50 messages a second (“only” 😂) compared to the 6,000 that actually come in. I chose to use it instead of the Twitter API because I’m already throttling the number of messages I send to IBM Watson’s sentiment analysis to just 1,000/day to stay within their free trial guidelines (I was hitting this cap after about an hour of running the script unregulated). If you’re interested in sifting through all of the tweets every second because you have a less popular keyword, or if you’re paying for IBM’s services, check out the last section of this post, “Part 7 — The Unlimited Stream”.
First things first, you need to create a PubNub account. You can register here.
Once you’re in and on the main page, scroll down to create an app. Apps are basically treated like projects, so each one has unique keys and blocks. Enter the name you want for your app (I named mine “Current Emotion in US”) and click “Create New App”. Click on your new app!
You will be taken to the “Key Info” page for your app. We’ll need these keys later! Click on the Key box (it should be named “Free”) to open the page where you can enable features for that key set. Under the “Application add-ons” section, click the slider next to “PubNub Blocks” to turn them on.
Click on the “Blocks” tab on the left-hand side of the page. This is where our Watson and Initial State code is going to go.
Create a block and name it — I just called mine “Sentiment”. You should only have one key option because we didn’t create any extra. Click “Create”.
Click on your new block to be taken to its event handlers. Since you haven’t created any yet, you will be prompted to.
Name this one “IBM Watson” and name the channel “sentiment-analysis”. Select “Before publish or fire” as the type.
You’ll be taken to a page with a code box and debug console. We are going to replace all of the code inside of the box with code for the IBM Watson: Sentiment and Context Analysis block.
You can copy this code from the IBMWatson.js file.
Click the save icon to keep your changes. We’re going to add our Watson API key later.
Click the “+” next to the IBM Watson tab. We can now create the Initial State event handler. I named it “Initial State” with the channel name “initial-state-streamer”. Select “After publish or fire” as the type.
Replace this handler’s code with the code for the Initial State block. You can copy this code from the InitialState.js file.
Click the save icon to keep your changes. We’re going to add our Initial State access key later.
Let’s set up our IBM Bluemix account so we can use their sentiment analyzer!
IBM Bluemix provides us access to the Watson APIs — most importantly, the sentiment analysis. You can use it for free for 30 days, too!
All good services start with a sign-up, so register at this link: https://console.ng.bluemix.net/registration/
After a successful registration, you’ll have to confirm your account via email. Once back on the main page, navigate to Products & Services -> Launch Bluemix.
You’ll be prompted to name your “organization” and “space” — I just named mine “Twitter Stream” and “dev”.
Next we need to provision the AlchemyAPI. Click on “Apps” or “Catalog” and either search for “AlchemyAPI” or look under Services -> Watson.
Clicking on the AlchemyAPI service will take you to a page describing it that also allows you to change the service and credentials names. If you scroll down you can see the pricing plans — “Free” should already be selected.
Click “Create” to add the service!
You should be taken to a new service page. Click on the “Service Credentials” tab — you should see one set of credentials already created for you.
Select the “View Credentials” drop-down to see your “apiKey”. Copy-paste this key to line 13 in your IBM Watson PubNub block.
Note: IBM Bluemix just announced the retirement of the AlchemyAPI service. They recommend using the Natural Language Understanding service, also under Watson, instead. This might change the IBM Sentiment Analysis block inside of PubNub — I’m sure they’ll update it soon!
Go to the IBM Sentiment Analysis block inside of your PubNub app. Make sure that the block is Running and that you added your API key to the code.
Copy-paste the following into the “Test Payload” box:
{ "session_id": 1, "text": "I am happy!"}
Click “Publish”. If you see the following message in the debug console, then you successfully published to Watson and received a sentiment score!
Now to get our Initial State access key!
We want to stream all of our sensor data to a cloud service and have that service turn our data into a nice interactive dashboard that updates in realtime. We are going to use Initial State.
Go to https://www.initialstate.com/app#/register/ and create a new account.
Your data streams are associated with your Initial State account via the Access Key parameter. If you go to your Initial State account in your web browser, click on your username in the top right, then go to “my account”, you will find your access key at the bottom of the page under “Streaming Access Keys”. It’s the long series of letters and numbers.
Every time you create a data stream, that access key will direct that data stream to your account (so don’t share your key with anyone).
Copy-paste this key into line 27 of the Initial State block in PubNub.
Go to the Initial State block inside of your PubNub app. Make sure that the block is Running and that you added your access key to the code.
Copy-paste the following into the “Test Payload” box:
{ "events": [ { "key": "temperature", "value": 16 } ], "bucketKey": "pubnubTest"}
Click “Publish”. If you see the following message in the debug console, then you successfully published to Initial State!
Now let’s go check out our message in a dashboard.
Go back to your Initial State account in your web browser. We only sent our bucket key, so we need to manually create a bucket using the bucket shelf.
At the top of your bucket shelf, click the +cloud icon to create a new streaming bucket. Name your bucket and check the “Configure Endpoint Keys” box to specify the bucket key.
This must match the key you just sent through PubNub, so enter “pubnubTest”.
Click the “Create” button at the bottom and the new bucket should be listed at the top of your shelf.
Click on this bucket and you’ll see a tile that says “temperature” with a value of 16!
Time to get everything working together.
Now that we’ve got the pieces working, it’s time to build our masterpiece.
But first we want to make sure we can actually consume PubNub’s Twitter stream, so let’s get to it!
Twitter does have their own API, but PubNub has made consuming it easy by providing us with a public subscription key. As I noted in Part 2, PubNub’s Twitter stream only pulls in 50 messages a second, but we’re throttling the number of messages sent to IBM Watson’s sentiment analysis to just 1,000/day anyways.
If you’re interested in sifting through all of the tweets every second because you have a less popular keyword, or if you’re paying for IBM’s services, check out the last section: BONUS — Part 7: The Unlimited Stream.
I initially wrote this project in Python because that’s the language I’m most familiar with, but I’ve gone ahead and added Node.js versions of all the scripts too! These could easily be adapted into another language thanks to PubNub’s many SDKs.
To use PubNub on your own machine (or perhaps a Raspberry Pi?), you need to install it using the command line:
pip install 'pubnub>=4.0.8'
You can see other installation methods here.
Note: You may need to place “sudo” before the command if you receive a permissions error.
To see if the Twitter feed will work, just run the twitterTest.py script! You can copy it to your computer by saving it to a folder you usually run code from, or by creating a file with nano twitterTest.py and pasting in the code.
Run the script:
python twitterTest.py
Give it a second or two to actually find a tweet that matches our criteria on line 32 (['Trump', 'trump', 'POTUS', 'potus']), and you should start seeing messages print out:
@grassosteve @POTUS big time, which will lead to the other
Here we go Trump care conservative Republicans are not happy but from past told them to shut up and vote the way he tells them to so funny
#NativeNationsRise #yourlastterm @realDonaldTrump https://t.co/o2a4d7g3wT
45 says we should “invest in women’s health,” but has only nominated anti-choice #SCOTUS . #StopGorsuch #TrumpLies #TheResistance
If that works, we can move on to the master script!
To use PubNub on your own machine, you need to install it using the command line:
npm install pubnub
You can see other installation methods here.
Note: You may need to place “sudo” before the command if you receive a permissions error.
To see if the Twitter feed will work, just run the twitterTest.js script! You can copy it to your computer by saving it to a folder you usually run code from, or by creating a file with nano twitterTest.js and pasting in the code.
Run the script:
node twitterTest.js
Give it a second or two to actually find a tweet that matches our criteria on line 16 (Trump, trump, or any capitalization of POTUS), and you should start seeing messages print out:
@grassosteve @POTUS big time, which will lead to the other
Here we go Trump care conservative Republicans are not happy but from past told them to shut up and vote the way he tells them to so funny
#NativeNationsRise #yourlastterm @realDonaldTrump https://t.co/o2a4d7g3wT
45 says we should “invest in women’s health,” but has only nominated anti-choice #SCOTUS . #StopGorsuch #TrumpLies #TheResistance
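For reference, the heart of twitterTest.js is just a subscribe-and-filter loop. Here’s a minimal sketch; the subscribe key and channel are the public demo values from PubNub’s Twitter stream page (double-check them there), and the keyword check is simplified:

const PubNub = require('pubnub');

// Public, subscribe-only demo key for PubNub's live Twitter stream.
const twitterStream = new PubNub({
  subscribeKey: 'sub-c-78806dd4-42a6-11e4-aed8-02ee2ddab7fe'
});

const keywords = ['Trump', 'trump', 'POTUS', 'potus'];

twitterStream.addListener({
  message: (event) => {
    const text = event.message.text || '';
    // Print only the tweets that mention one of our keywords.
    if (keywords.some((kw) => text.indexOf(kw) !== -1)) {
      console.log(text);
    }
  }
});

twitterStream.subscribe({ channels: ['pubnub-twitter'] });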
If that works, we can move on to the master script!
The final script is called pubnubStream.py.
The most important thing you need to do before you try to run this script is add your PubNub keys:
Lines 1–5 import all of our modules.
Lines 7–20 are where we configure PubNub with both the public Twitter key and the private keys associated with our account.
Lines 23–33 handle the results of any publish call and will tell us if a publish was successful.
Lines 36–71 handle the subscription to the Twitter stream. If you want to change what keywords are looked for, you can do so on line 40 (‘Donald’ was omitted because I kept picking up tweets about McDonald’s). Starting at line 64, we actually handle the tweets — first searching for any of the keywords and then publishing the message to our sentiment-analysis channel (which is associated with the IBM Watson block).
Lines 74–134 handle the subscription to the sentiment analysis output. On line 78 we specify our Initial State bucket key (“pubnubtrump”). We will need this to create our bucket in Initial State! Lines 99–134 handle the sentiment analysis response. We build the payload to send to our Initial State block here (using the merge function on lines 137–145) and then publish it.
Lines 149–155 are where we configure our PubNub channel subscriptions and tell them which callback class to use.
Note: On line 67 we are sleeping for 90 seconds before publishing to our sentiment analysis block. This line is present because the free tier for IBM Bluemix appears to cap users at 1,000 calls a day. Only sending every minute and a half ensures that we never reach that cap. If you want near-real-time action, just remove the sleep.
Run the script with:
python pubnubStream.py
You should see the payload start printing out and output in your PubNub block debug console!
The final script is called pubnubStream.js.
The most important thing you need to do before you try to run this script is add your PubNub keys:
Line 1 imports the PubNub module.
Lines 3–12 are where we configure PubNub with both the public Twitter key and the private keys associated with our account.
Lines 17–40 handle the subscription to the Twitter stream. If you want to change what keywords are looked for, you can do so on line 25 (‘Donald’ was omitted because I kept picking up tweets about McDonald’s). Starting at line 25, we actually handle the tweets — first searching for any of the keywords and then publishing the message to our sentiment-analysis channel (which is associated with the IBM Watson block).
Lines 42–88 handle the subscription to the sentiment analysis output. On line 48 we specify our Initial State bucket key (“pubnubtrump”). We will need this to create our bucket in Initial State! Lines 51–85 handle the sentiment analysis response. We build the payload to send to our Initial State block here and then publish it.
Lines 91–99 are where we configure our PubNub channel subscriptions.
Note: The free tier for IBM Bluemix appears to cap users at 1,000 calls a day. Running this script on the Trump and POTUS keywords hit that limit in 1–2 hours. You may want to implement some sort of delay or counter to keep from reaching the daily limit quickly.
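If you want the shape of the script before opening the file, here’s a condensed sketch of the Node version, with a simple 90-second throttle standing in for the Python script’s sleep. The public Twitter stream key is the demo value from PubNub’s page, and the sentiment field names on the result are illustrative — check the Watson block’s debug output for the real keys:

const PubNub = require('pubnub');

// Public demo key for PubNub's live Twitter stream (subscribe-only).
const twitterStream = new PubNub({
  subscribeKey: 'sub-c-78806dd4-42a6-11e4-aed8-02ee2ddab7fe'
});

// Your own key set, from the "Key Info" page of your PubNub app.
const myPubNub = new PubNub({
  publishKey: 'YOUR_PUBLISH_KEY',
  subscribeKey: 'YOUR_SUBSCRIBE_KEY'
});

const keywords = ['Trump', 'trump', 'POTUS', 'potus'];
let lastSent = 0; // simple throttle so we stay under Watson's daily cap

// 1) Forward matching tweets to the channel the Watson block is bound to.
twitterStream.addListener({
  message: (event) => {
    const text = event.message.text || '';
    const matches = keywords.some((kw) => text.indexOf(kw) !== -1);
    if (matches && Date.now() - lastSent > 90 * 1000) {
      lastSent = Date.now();
      myPubNub.publish({ channel: 'sentiment-analysis', message: { text: text } });
    }
  }
});
twitterStream.subscribe({ channels: ['pubnub-twitter'] });

// 2) Receive the sentiment-augmented message back and repackage it as
//    Initial State events (field names here are hypothetical).
myPubNub.addListener({
  message: (event) => {
    if (event.channel !== 'sentiment-analysis') { return; }
    const result = event.message;
    myPubNub.publish({
      channel: 'initial-state-streamer',
      message: {
        bucketKey: 'pubnubtrump',
        events: [
          { key: 'tweet', value: result.text },
          { key: 'score', value: result.score } // hypothetical key name
        ]
      }
    });
  }
});
myPubNub.subscribe({ channels: ['sentiment-analysis'] });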
Run the script with
node pubnubStream.js
You should see the payload start printing out and output in your PubNub block debug console!
Time to check the tweets out in a dashboard.
Time to look at our dashboard! We need to create a bucket with the bucket key from our script like we did in Part 3.
So click the +cloud icon at the top of your bucket shelf to create a new streaming bucket. Name your bucket and check the “Configure Endpoint Keys” box to specify the bucket key.
This must match the key you just sent through PubNub, so enter “pubnubtrump”. I named my bucket “Trump Twitter 🐦”.
Click the “Create” button at the bottom and the new bucket should be listed at the top of your shelf.
Click on that bucket and you should see Tiles pop up for Tweet, Positive Level, Neutral Level, Negative Level, and Score.
If you want to make your dashboard both more informative and attractive, keep reading!
You can find support articles on what you can do in Tiles, but I’m going to walk through the main things I changed with links to the relevant support article:
And now you have a beautiful dashboard that updates in realtime! It’s also interactive — mouse over the different tiles to see signal names and values:
Note: Using Real-Time Expressions requires either a Personal or Professional plan from Initial State.
Want to add maps to your dashboard that show which areas of the world are the most positive/negative/neutral?
Just run the pubnubStreamLocation.py script or the pubnubStreamLocation.js script instead!
But first you have to get a Google API key and install the geolocation module based on Google Maps.
From the geolocation python page:
Copy-paste this key into line 8.
Python — Install the geolocation module:
pip install geolocation-python
Node — Install the geolocation module:
npm install @google/maps
Note: If you receive a permissions error, add “sudo” to the front of the command
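As a quick sanity check that your Google API key works, you can run a one-off geocoding lookup. Here’s a sketch using the @google/maps Node client (the address is just an example stand-in for a tweet’s location string):

const googleMapsClient = require('@google/maps').createClient({
  key: 'YOUR_GOOGLE_API_KEY'
});

// Resolve a free-text location (like a Twitter profile location) to coordinates.
googleMapsClient.geocode({ address: 'Nashville, TN' }, (err, response) => {
  if (!err && response.json.results.length > 0) {
    const loc = response.json.results[0].geometry.location;
    console.log(loc.lat, loc.lng);
  }
});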
To get location data from Twitter and then stream it based on the associated tweet’s score, I just added 2 lines to the IBM Watson block inside of my PubNub app. You can copy that block code here.
Be sure to put your IBM API key on line 13!
Add your PubNub subscribe and publish keys to this script too!
The Map tiles should automatically pop up inside of your Initial State bucket.
If you’re looking to sift through every single tweet that’s coming in, you’ll need to consume the Twitter API directly. You can do this using the pubnubStreamUnlimited.py script or the pubnubStreamUnlimited.js script.
If you’re planning on analyzing more than 1,000 tweets a day, you’ll also have to upgrade your IBM Bluemix plan. The current AlchemyAPI plan looks like this:
As I mentioned earlier, IBM will soon be switching sentiment analysis over from the AlchemyAPI to Natural Language Understanding. The pricing structure is slightly different:
Once you’ve figured out if you need to upgrade your IBM Bluemix service or not, you can get started using the Twitter API!
To get your Twitter API Credentials you will need to:
Copy-paste these keys into their respective spots on lines 11–14 in the Python script or lines 7–10 in the Node script.
Python — Tweepy is an awesome library for accessing the Twitter API through Python. To install it run:
pip install tweepy
Node — You can install Twitter for Node.js with:
npm install twitter
Note: If you receive a permissions error, add “sudo” to the front of the command
Add your PubNub subscribe and publish keys to this script too!
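To give you a feel for what the unlimited scripts are doing, here’s a sketch of the Node side: the twitter package’s statuses/filter stream delivers every tweet matching the track list (not a sample), and each one gets forwarded straight into the sentiment-analysis channel. All credentials below are placeholders:

const Twitter = require('twitter');
const PubNub = require('pubnub');

const client = new Twitter({
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token_key: 'YOUR_ACCESS_TOKEN_KEY',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
});

const pubnub = new PubNub({
  publishKey: 'YOUR_PUBLISH_KEY',
  subscribeKey: 'YOUR_SUBSCRIBE_KEY'
});

// statuses/filter returns every tweet matching the track list, unthrottled.
client.stream('statuses/filter', { track: 'Trump,POTUS' }, (stream) => {
  stream.on('data', (tweet) => {
    // Hand each tweet to the Watson block's channel for scoring.
    pubnub.publish({ channel: 'sentiment-analysis', message: { text: tweet.text } });
  });
  stream.on('error', (error) => console.error(error));
});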
Now you should be all set to ingest the flood of data! I must say that I was super impressed watching PubNub successfully send messages to Watson, give me the response, and then send that data to Initial State even though I was absolutely hammering it. I mean, I hit my 1,000-call daily cap in less than 5 minutes!
Boom! 💣 You’ve done it! I’ve seen a ton of incredible and innovative realtime tutorials over the years, and this one is definitely one of my favorites! With PubNub BLOCKS and the vast number of partners, the sky’s the limit.
If you like this tutorial, please ❤️ it! Rachel worked incredibly hard, and it definitely paid off!