maths with Python 6: Twitter API – Tweepy for social media and networks (with Gephi)

We are back again with another Python tutorial. Do you remember the other ones?

maths with Python

maths with Python 2: Rössler system

maths with Python 3: Diffusion Equation

maths with Python 4: Loading data.

Raspberry Pi 001: Setup and Run first Arduino-Python project.

Raspberry Pi 002: Pi Camera, start up scripts and remote desktop

maths with Python 5: Double Compound Pendulum Chaotic Map

In this one we are going to work with the Twitter API and Python.

First of all, what is an API? In short, an API (Application Programming Interface) is an interface that allows one piece of software to communicate with another piece of software or hardware. In this case, the Twitter API lets your program communicate with “Twitter”, thinking of Twitter as a piece of software.

Second, for which programming languages are there libraries? The answer depends on the developers and changes quite a lot. The Twitter Developers webpage has a list of the available libraries for different languages (that doesn’t mean there are no others; there are possibly thousands made by amateurs, but they are not updated regularly).

We are interested here in the libraries for Python:

  • tweepy maintained by @applepie & more — a Python wrapper for the Twitter API (documentation) (examples)
  • python-twitter maintained by @bear — this library provides a pure Python interface for the Twitter API (documentation)
  • TweetPony by @Mezgrman — A Python library aimed at simplicity and flexibility.
  • Python Twitter Tools by @sixohsix — An extensive Python library for interfacing to the Twitter REST and streaming APIs (v1.0 and v1.1). Also features a command line Twitter client. Supports Python 2.6, 2.7, and 3.3+. (documentation)
  • twitter-gobject by @tchx84 — Allows you to access Twitter’s 1.1 REST API via a set of GObject based objects for easy integration with your GLib2 based code. (examples)
  • TwitterSearch by @crw_koepp — Python-based interface to the 1.1 Search API.
  • twython by @ryanmcgrath — Actively maintained, pure Python wrapper for the Twitter API. Supports both normal and streaming Twitter APIs. Supports all v1.1 endpoints, including dynamic functions so users can make use of endpoints not yet in the library. (docs)
  • TwitterAPI by @boxnumber03 — A REST and Streaming API wrapper that supports python 2.x and python 3.x, TwitterAPI also includes iterators for both API’s that are useful for processing streaming results as well as paged results.
  • Birdy by @sect2k — “a super awesome Twitter API client for Python”

Here we are going to work with Tweepy, but you can try any of the others; they should work in a similar way.


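step 1: Install the Tweepy library into your Python distribution (I usually use Anaconda from Continuum Analytics). Just open a command window and run the following (with current Python distributions, where easy_install is deprecated, use “pip install tweepy” instead):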

easy_install tweepy

It should install the library without any problems.
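To confirm that the installation worked before moving on, a quick check from Python can help. This is only a minimal sketch: the helper name is_installed is made up for illustration, and it relies on the standard importlib module rather than on anything specific to Tweepy.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the library from step 1 is visible to this interpreter.
if is_installed("tweepy"):
    print("tweepy is available")
else:
    print("tweepy is missing; try the install command again")
```

If the second message appears, the library was probably installed into a different Python distribution than the one you are running.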


step 2: Get a Twitter account (You can skip this step if you already have one). I made @Brickinthesky for that.


step 3: Go to the Twitter Developers webpage and get the keys needed to access Twitter through the API with your Twitter account. First you need to sign in with your Twitter account.


Once signed in, you can click on “My applications” to create the key codes.


Since this is the first time you are using the API, there are no apps yet, so just click to create a new one.

Fill in the requested details, and don’t worry too much if you don’t know what to put there. For instance, the form says the website can be changed later.


Hit create and voilà!


See the red arrow? It indicates a field you need to change in order to be able to send and receive data from Twitter. Click on it and change to Read and Write.


Update the settings and click on “API Keys”. There, scroll down to “Create my access token”.


You may need to wait a little bit… and refresh the page. And… here they are: the API key and secret, and the access token and secret.
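These four values work like passwords, so it may be worth keeping them out of the script itself. The sketch below reads them from environment variables instead. The variable names and the helper load_credentials are assumptions made up for this example, not something Twitter or Tweepy requires.

```python
import os

# Hypothetical variable names; pick whatever names suit your setup.
REQUIRED = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

The values returned by load_credentials() could then be used wherever the scripts in the following steps show hard-coded 'xxxx' strings.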


step 4: Get the tweets in your home timeline. Go back to Python and run this code (filling in your own API key, API secret, access token and access token secret) to see your timeline.

Get Timeline:

import tweepy

# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app

api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# You don't need to make any changes below here

# This bit authorises you to ask for information from Twitter

auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

# The api object gives you access to all of the http calls that Twitter accepts

api = tweepy.API(auth)

# Retrieve the last 20 tweets from your timeline
public_tweets = api.home_timeline()

# For each tweet in your timeline, print out the tweet text
for tweet in public_tweets:
    print(tweet.text)

As a result, the tweets in your timeline will be printed in Python.


step 5: Let’s get the tweets from another account. Let’s try Neil deGrasse Tyson, whose Twitter page looks like this right now.


Get his screen name, shown below his picture (“neiltyson”), and put it in this code:

Get ID Timeline

import tweepy

# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app

api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# You don't need to make any changes below here

# This bit authorises you to ask for information from Twitter

auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

# The api object gives you access to all of the http calls that Twitter accepts

api = tweepy.API(auth)

# Now, instead of getting your own tweets we are going to gather another user's tweets.
# Look here for help http://tweepy.readthedocs.org/en/v2.3.0/api.html
tweets = api.user_timeline(screen_name='neiltyson')

for tweet in tweets:
    print(tweet.text)
 

And after running the code… here are deGrasse Tyson’s tweets.
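Each item returned by the timeline calls is an object whose fields are read as attributes, such as the text attribute used above. The sketch below shows the same loop with a little formatting, run on stand-in objects so that it works without any network access. The sample data is invented, and the user.screen_name attribute is assumed to be present on real tweet objects as well.

```python
from types import SimpleNamespace


def format_tweet(tweet):
    """Return '@author: text' on a single line for one tweet-like object."""
    one_line = " ".join(tweet.text.split())  # collapse newlines and repeated spaces
    return "@{}: {}".format(tweet.user.screen_name, one_line)


# Stand-in objects (invented data) mimicking the attributes used here.
author = SimpleNamespace(screen_name="example_user")
sample = [
    SimpleNamespace(text="Hello\nworld", user=author),
    SimpleNamespace(text="Second   tweet", user=author),
]

for tweet in sample:
    print(format_tweet(tweet))
```

With real data, the list called sample would simply be replaced by the tweets returned by the API call.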


step 6: Now let’s publish something on Twitter using Python. I know that the right call is “api.update_status”, but I don’t know how to use it, so let’s look at the API reference.


Hmmm it looks quite simple, just….

Publish status

import tweepy

# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app

api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# You don't need to make any changes below here

# This bit authorises you to ask for information from Twitter

auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

# The api object gives you access to all of the http calls that Twitter accepts

api = tweepy.API(auth)

api.update_status('First status from Python')

and it appears in Twitter.
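Besides the online API reference, Python itself can often tell you how a function is meant to be called. The sketch below uses the standard inspect module. The helper describe and the stand-in function post are made up for illustration. For a wrapped library method the signature may only show generic arguments, so treat the output as a hint rather than as documentation.

```python
import inspect


def describe(func):
    """Return the call signature and the first docstring line of `func`."""
    doc = inspect.getdoc(func) or ""
    first_line = doc.splitlines()[0] if doc else "(no docstring)"
    return "{}{}\n    {}".format(func.__name__, inspect.signature(func), first_line)


def post(status, in_reply_to=None):
    """Hypothetical stand-in for an API call that publishes a status."""


print(describe(post))
```

In an interactive session, calling describe(api.update_status), or simply help(api.update_status), would be the equivalent lookup for the call used in this step.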


 step 7: Generate and upload a picture from Python. For this we are going to create a graph and upload it into Twitter. Simply…

Publish image

import tweepy

# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app

api_key = 'xxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# You don't need to make any changes below here

# This bit authorises you to ask for information from Twitter

auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

# The api object gives you access to all of the http calls that Twitter accepts

api = tweepy.API(auth)

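#########################
import matplotlib.pyplot as plt
import numpy as np

# Start just above 0 so that 1/x is defined everywhere
x = np.linspace(0.01, 2*np.pi, 1000)
y = np.sin(1/x)
plt.plot(x, y)

# Save before show(): once the window is closed the saved figure may be empty
plt.savefig('graph.png')
plt.show()

# Raw string (r'...') so the backslashes in the Windows path are not read as escapes.
# The path must point to the folder where graph.png was saved.
# In newer Tweepy releases this call may have a different name; check the API reference.
api.update_with_media(r'C:\Users\Hector\Documents\Python Scripts\graph.png', 'Uploading a graph!!')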

And this is the result. Isn’t it cool?
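The upload call needs the path of the saved image, and Windows paths are easy to get wrong because of their backslashes. A small sketch of two ways around this follows: a raw string, and a path built from the current working directory. The helper image_path is a made-up name for this example.

```python
import os


def image_path(filename, folder=None):
    """Return an absolute path for `filename` inside `folder` (default: cwd)."""
    folder = os.getcwd() if folder is None else folder
    return os.path.abspath(os.path.join(folder, filename))


# A raw string (r'...') keeps every backslash of a Windows path literal.
windows_folder = r'C:\Users\Hector\Documents\Python Scripts'

# Building the path from the working directory avoids typing it by hand.
print(image_path('graph.png'))
```

Since savefig('graph.png') writes into the current working directory, image_path('graph.png') should point at the same file that was just saved.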


step 8: Networks. Basically we want to see how the followers of somebody are related. To do that we are going to use… the free open-source software for network visualization… Gephi.

Just download and install it.

And now we need to know how to generate files for Gephi using Python.

To do that we are going to create CSV files, which are the easiest ones (see the list of supported files). It seems that something as simple as this works: create a txt file and write inside:

A,B
B,A
C,C
D,E
A,D
D,B,E
F,G,A,B

save it as *.csv
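The same small file can also be written from Python with the csv module, which is a useful warm-up for the follower network later on. This is a sketch under one assumption: that Gephi reads each row as an adjacency list, with the first entry as the source node and the remaining entries as its targets. The file name in the final comment is only an example.

```python
import csv
import io

# The rows from the example above, one list per line of the file.
rows = [
    ["A", "B"], ["B", "A"], ["C", "C"], ["D", "E"],
    ["A", "D"], ["D", "B", "E"], ["F", "G", "A", "B"],
]


def write_rows(rows, handle):
    """Write one comma-separated line per row to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerows(rows)


# Write into memory first, to look at the result without creating a file.
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())

# To produce a file for Gephi instead:
# with open("example.csv", "w", newline="") as handle:
#     write_rows(rows, handle)
```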

Go to Gephi and open it. (Ignore warnings, just hit Ok button). To display the graph… go to Preview, add the names of the nodes and update… and voilà!


Cool!! Now back to Tweepy. We are going to use the Tweepy command “api.followers(user)” to see the followers of a particular user. To save the data into *.csv files we are going to use the csv library (which is part of Python’s standard library, so it doesn’t require installation). The final code borrows an idea used to draw fractals: recursion. Basically you create a function that calls itself (with a “counter” that decreases each time the function calls itself, so it can only call itself a finite number of times). Here is the code:


import csv
import time

import tweepy

# Copy the api key, the api secret, the access token and the access token secret from the relevant page on your Twitter app

api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# You don't need to make any changes below here

# This bit authorises you to ask for information from Twitter
auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)

# The api object gives you access to all of the http calls that Twitter accepts
api = tweepy.API(auth)

# User we want to use as initial node
user = 'HCorteLeo'

# This is the function that takes a node (user), looks for all its followers
# and writes them into the CSV file... and then looks for the followers of each follower...
def fib(n, user, spamwriter):
    if n > 0:
        # There is a limit to the traffic you can have with the API, so you need to wait
        # a few seconds per call or after a few calls it will restrict your traffic
        # for 15 minutes. This parameter can be tweaked
        time.sleep(40)
        users = api.followers(user)
        for follower in users:
            # Each row is one edge of the network: user;follower
            spamwriter.writerow([user, follower.screen_name])
            fib(n-1, follower.screen_name, spamwriter)

# n defines the level of recursion
n = 3

# This creates the csv file; each new entry goes on a new line
# (Python 3: text mode with newline=''; in Python 2 use open(..., 'wb') instead)
with open(user+'net4.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    fib(n, user, spamwriter)
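The depth-limited recursion used in this script can be tried without touching the API (and without waiting 40 seconds per call). The sketch below runs the same idea on a small in-memory dictionary. The follower lists are invented, and collect_edges and get_followers are hypothetical names for this example.

```python
# Made-up follower lists standing in for what the API would return.
FOLLOWERS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": [],
    "dave": ["alice"],
}


def collect_edges(user, depth, get_followers, edges=None):
    """Return (user, follower) pairs found within `depth` levels of `user`."""
    if edges is None:
        edges = []
    if depth > 0:  # the counter that stops the recursion
        for follower in get_followers(user):
            edges.append((user, follower))
            collect_edges(follower, depth - 1, get_followers, edges)
    return edges


edges = collect_edges("alice", 2, lambda name: FOLLOWERS.get(name, []))
for source, target in edges:
    print(source + ";" + target)
```

Note that "dave" is followed back by "alice", so the network contains a loop. The depth counter is what guarantees the function still stops.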

This will create a *.csv file, which you can open in Notepad; each line holds a user and one of their followers, separated by a semicolon.


And once loaded in Gephi… voilà!!!
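Before loading a large file into Gephi, it can be handy to check from Python what was actually captured. The sketch below counts how many follower lines were written for each user, assuming the 'user;follower' layout produced by the script. The sample data and the name count_followers are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_followers(handle):
    """Count how many 'user;follower' lines each user appears in as the source."""
    counts = Counter()
    for row in csv.reader(handle, delimiter=";"):
        if len(row) == 2:  # skip blank or malformed lines
            counts[row[0]] += 1
    return counts


# Made-up sample in the same layout as the generated file.
sample = io.StringIO("alice;bob\nalice;carol\nbob;dave\n")
print(count_followers(sample).most_common())
```

For the real file, the StringIO object would be replaced by the opened *.csv file.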


I hope you liked this very long post and that it helps people get used to working with the Twitter API.


9 thoughts on “maths with Python 6: Twitter API – Tweepy for social media and networks (with Gephi)”

  1. Great little tutorial this. I will follow it through with all the code over the weekend.

    Have followed.

  2. Any thoughts on exception handling? The script runs well- but continually crashes given twitter’s rate limit. I’ve played with the sleep function a bit, but the results are relatively sparse.

    1. But even with exception handling the rate limit will be there and I’m not sure, but I think calling the function again, when you are temporarily blocked by Twitter, will reset the waiting time.

      1. Rate limits are reset every 15 minutes. You may get the remaining calls left and when the next window begins by calling api.rate_limit_status(). What you can do is to call as quickly as possible then when you get a rate limit error, sleep until the start of the next window. You may also determine first the start of the next window before calling.

  3. Great. Calling the function again seems to reset the wait time. Have you encountered any problems capturing all followers as we move through the recursive function? For instance, one user has upwards of 1k followers, yet the script captures ~1/10 of those usernames.

    1. The Twitter API returns up to 100 followers per call only. You need to enable paging so that you can get the remaining followers by cursoring to the next pages.

      To do this with tweepy, replace

      users=api.followers(user)

      with

      users=tweepy.Cursor(api.followers, user=user).items()
