
Part 3, Section 2:
Real-Time Data Harvesting.

Python and Twitter.


What is Twitter?

If you already know about Twitter, skip this and the next section. Else: Twitter is a social network where an individual is able to post small snippets of text (140 characters or less) called tweets, which usually deal with small events happening in their life, interesting things they've found online, or whatever else one can fit in 140 characters. Surprisingly, Twitter has been used to do substantial things like tracking diseases and assembling rioters; this has often been done by searching for tweets containing certain keywords and analyzing what comes back.

A hashtag, #, on Twitter is generally given at the end of a tweet and (theoretically) relates a tweet to some larger trend or topic. Hence, a tweet like \[\mbox{fever, runny nose, sore throat, and I still have to}\] \[\mbox{go into work :( #unfair #sick #flu #worksucks}\] tells us that this person is a bit sick but they still must go into work. The related hashtags are "unfair", "sick", "flu", "worksucks" of which the middle two would be useful in tracking a disease, for example.

An @ on Twitter is used to reply to or talk about a person. For example, if there was a user called "i_hate_twitter" (there is no such user; I checked!) then in order to talk about them or reply to one of their tweets, one might write something like: \[@\mbox{i_hate_twitter why do you hate twitter so much?}\] \[\mbox{#h8r #twitter #somuchhate}\] One may also retweet a posting. This is just sharing a tweet that someone else has made.
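To make the two conventions above concrete, here is a minimal sketch that pulls hashtags and @-mentions out of the two example tweets using only Python's standard library. The function names `parse_tweet` and `has_any_hashtag` are made up for illustration, and the patterns are deliberately rough; they are not how Twitter itself parses tweets.

```python
import re

# Rough patterns: a '#' or '@' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def parse_tweet(text):
    """Return the hashtags and mentions found in a tweet's text."""
    return {
        "hashtags": HASHTAG.findall(text),
        "mentions": MENTION.findall(text),
    }


def has_any_hashtag(text, keywords):
    """True if the tweet carries at least one of the given hashtags."""
    tags = {tag.lower() for tag in HASHTAG.findall(text)}
    return bool(tags & {keyword.lower() for keyword in keywords})


sick_tweet = ("fever, runny nose, sore throat, and I still have to "
              "go into work :( #unfair #sick #flu #worksucks")
reply_tweet = ("@i_hate_twitter why do you hate twitter so much? "
               "#h8r #twitter #somuchhate")

print(parse_tweet(sick_tweet))
print(parse_tweet(reply_tweet))
print(has_any_hashtag(sick_tweet, ["sick", "flu"]))
```

Filtering on a small set of hashtags like this is a toy version of the keyword searching mentioned earlier; real disease tracking would need far more care.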

As far as social networks go, Twitter is currently one of the largest and most important. Twitter has an API which will allow us to investigate tweets and do some neat analysis with them. As a side-note, we will work with R later, using the marvelous free textbook which will also go over the Twitter API in some detail. We work with Python first only because the reader will most likely be more familiar with Python than R.

Will I need to sign up to Twitter?

For this course, yes; you don't have to tweet, but you will have to have an account on Twitter. If you already have an account, you may just use that. If not, go to Twitter and sign up.

Okay, I have an account; now what?

We'll now make a sample application; this'll allow us to use the Twitter API.

Here are the steps; if there are hints available, click on the step to expand.

Setting Up.

And that's all you need to know about setting up in Twitter.


To work with the Twitter API we'll be using Twython. Depending on what OS you're running, getting Twython ranges from super-easy to slightly-irritating. We've worked with pip before, so if you've still got pip you can simply use the command

pip install twython

and you're good to go. If not, then you can choose to use Python's easy_install. If you're on Windows and these things are mystifying, try this solution. If worst comes to worst, go back and (re)-install pip.

I'll also note that I'll be using version 3.0.0 of Twython, so if you're reading this in the far-future, some commands may be slightly different.
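If you'd like to confirm which version you ended up with, one possible check is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("twython")
if version is None:
    print("Twython does not seem to be installed; try: pip install twython")
else:
    print("Twython version:", version)
```

If the version printed differs from 3.0.0, expect that some of the commands in this section may need small adjustments.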

Tweet with Python.

The point of this section will be to introduce OAuth and demonstrate how we can write a program which will produce a tweet on our account when we run it. Create a new Python program called updatetwitter.py; you can do this in whatever text or Python editor you'd like.

First, let's import Twython and save some variables. The four variables you'll need are the four long sequences of letters and numbers from the OAuth page:

from twython import Twython

consumer_key = "yourkey"
consumer_secret = "yoursecret"
access_token = "youraccesstoken"
access_secret = "youraccess_secret"

where you replace the string "yourkey" with the consumer key for your Twitter application, and likewise for the other three. Twython makes it pretty easy to authenticate and start using the Twitter API.

twitter = Twython(consumer_key, consumer_secret,
access_token, access_secret)

Note that I broke the command into two lines, but you should keep it as one line when you write it. Okay, that was fun. To test whether everything worked, we will attempt to update our status from this program. If this works, perfect; if not, then check for misspellings, make sure you've copied all of the variables correctly, and so forth. If you keep getting an error, google the error and see if others have had this problem; most of the time, there's a solution on a forum somewhere. Using update_status() will update our status with whatever we put inside of it. For example,

twitter.update_status(status="Twython is better than love.")

will tweet that message. Check to make sure it does; if not, then something is wrong. Either this documentation is out of date (let me know!) and there are some updated commands, or you've typed in something incorrect. Look at the error it returns, google it, and see what others have done.
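As an optional variation on updatetwitter.py, the sketch below reads the four OAuth values from environment variables instead of pasting them into the file, and checks the 140-character limit before sending anything. The environment-variable names and the helper names (`load_credentials`, `check_status`, `tweet`) are my own inventions for illustration; only `Twython`, `TwythonError`, and `update_status` come from Twython itself.

```python
import os

# Hypothetical environment-variable names; use whatever names you prefer.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_secret": "TWITTER_ACCESS_SECRET",
}

MAX_TWEET_LENGTH = 140  # the limit described at the start of this section


def load_credentials(env=None):
    """Collect the four OAuth values, complaining about any that are missing."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {key: env[name] for key, name in ENV_NAMES.items()}


def check_status(status):
    """Return the status unchanged, or raise ValueError if it cannot be tweeted."""
    if not status:
        raise ValueError("status is empty")
    if len(status) > MAX_TWEET_LENGTH:
        raise ValueError("status is %d characters; the limit is %d"
                         % (len(status), MAX_TWEET_LENGTH))
    return status


def tweet(status, env=None):
    """Post a status update; return True on success, False if Twitter refuses."""
    # Imported here so the two helpers above can be used without Twython.
    from twython import Twython, TwythonError

    status = check_status(status)
    creds = load_credentials(env)
    twitter = Twython(creds["consumer_key"], creds["consumer_secret"],
                      creds["access_token"], creds["access_secret"])
    try:
        twitter.update_status(status=status)
    except TwythonError as error:
        print("Twitter rejected the request:", error)
        return False
    return True
```

Calling `tweet("Twython is better than love.")` would then behave like the one-line example, provided the four environment variables are set. Keeping the keys out of the source file also makes it harder to share them by accident.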

True Life: Terrible Documentation.

One of the major problems that I've found while browsing through some API wrappers (like Twython) is that if you search for something like "twython tutorial", you'll bring up the same two tutorials over and over and these tutorials will ordinarily tell you how to authenticate yourself and how to update your status, and perhaps how to pull some tweets from the main page — and that's about it. For the beginner this kind of thing is a bit upsetting: not only will you not know how to do more with Twython, you will not know how to go about learning more!

This section will follow the "teach a man to fish" mentality: we will think about something we'd like to do and then learn how we can learn to do this.

For the sake of choosing something arbitrary, let's look at local trends. Currently on Twitter, depending on your location, you can look at "local trends" which are common or "hot" hashtags people are using in your area. For example, right now in New Orleans there is a huge thunderstorm and one of the trending topics is "storm". Some of these are more interesting than others, of course.

So. What do we do? Well, you can try to google a tutorial or a "how do I...", but currently there is no obvious solution on the first few pages of google, except for two pages which give the solution for a previous version of Twython which will not work with our version. So, let's get our hands dirty. I'll try to make this as general as possible so that you can always follow these steps. But clicking on the following things will give the "specific" example with Twython.

Finding Commands in your API Wrapper.

This may have seemed unnecessarily long; and, indeed, it was purposely a bit verbose to point out some of the things that could go wrong and some of the ways we can fix them. We'll be doing some of this function-hunting later but we will be much briefer with it. Nonetheless, as practice, you should look up how to return the user's (your) top few recent tweets with Twython using the process above and print them out nicely. My potential solution is here but don't look at it until you've tried it yourself!
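One generic way to start this kind of function-hunting is to ask Python itself what an object offers. The sketch below is not tied to Twython: the helpers `find_methods` and `first_doc_line` are made up for illustration, and the demonstration runs on the built-in `str` type so that it works without any extra packages.

```python
import inspect


def find_methods(obj, keyword):
    """List public callable attributes of obj whose names contain keyword."""
    keyword = keyword.lower()
    return [
        name for name in dir(obj)
        if keyword in name.lower()
        and not name.startswith("_")
        and callable(getattr(obj, name))
    ]


def first_doc_line(obj, name):
    """Return the first line of an attribute's docstring, if it has one."""
    doc = inspect.getdoc(getattr(obj, name)) or ""
    return doc.splitlines()[0] if doc else "(no docstring)"


# Demonstration on a built-in type.
for name in find_methods(str, "strip"):
    print(name, "-", first_doc_line(str, name))
```

With Twython installed, passing the `Twython` class and a keyword such as "trend" or "timeline" should list whatever matching methods your version provides; from there, `help()` on a promising name and the wrapper's source code are the next places to look.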
