Live List of the Most Popular Twitter Clients

I just put up a live list of the most popular Twitter clients. The contents of that page are updated every 60 seconds (the length of time Twitter caches its public timeline).

For some reason I woke up at 7 this morning, unable to get back to sleep, and I was randomly wondering how many Twitter clients were out there, and which were the most popular. So I decided to find out.

After a bit of hacking around, I came up with the following Python script:

    from urllib import urlopen
    from contextlib import closing
    import pickle
    import json

    # Load the running totals from a previous run, or start fresh if the
    # pickle file doesn't exist yet (or can't be read).
    try:
        with open('clientlist.pickle', 'rb') as f:
            clientlist = pickle.load(f)
    except (IOError, EOFError, pickle.PickleError):
        clientlist = {}

    # Fetch the latest batch of public tweets as JSON.
    with closing(urlopen('http://twitter.com/statuses/public_timeline.json')) as f:
        json_str = f.read()

    tweets = json.loads(json_str)

    # Tally the client ("source") each tweet was posted from.
    for tweet in tweets:
        source = tweet['source']
        clientlist[source] = clientlist.get(source, 0) + 1

    # Save the updated totals for the next run.
    with open('clientlist.pickle', 'wb') as f:
        pickle.dump(clientlist, f)

    total = sum(clientlist.itervalues())

    # Write a simple HTML report, sorted by most popular client first.
    with open('clientlist.html', 'w') as f:
        f.write('<html><head><title>Twitter Client list</title></head><body>')
        f.write(''.join(['<h1>Twitter Client Popularity</h1>',
                         '<div style="float: left; margin-right: 20px;">Out of <em>', str(total),
                         '</em> Twitter messages, the following were the clients used.']))
        f.write('<table><thead><tr><th>Client</th><th>Tweet Count</th>'
                '<th>% of total</th></tr></thead><tbody>')

        for (link, count) in sorted(clientlist.iteritems(), key=lambda item: item[1], reverse=True):
            # The client name is an HTML link and may contain non-ASCII
            # characters, so encode it before writing.
            f.write(''.join(['<tr><td>', link.encode('utf-8'), '</td><td>', str(count), '</td><td>',
                             str(round(100.0 * count / total, 2)), '</td></tr>']))

        f.write('</tbody></table></div>')
        f.write('</body></html>')

The script writes out a pickle file with the running count for each client, so multiple runs generate a cumulative result (don't run it more than once every 60 seconds, or you'll count the same tweets more than once).
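If you wanted to guard against double counting more robustly than just trusting the run interval, one option (not something the script above does) is to remember the ids of tweets you've already counted and skip them on later runs. Here's a minimal sketch of that idea as a drop-in replacement for the counting loop, assuming `tweets` and `clientlist` are already loaded as in the script; the `seen_ids.pickle` filename is just made up for the example:

    # Load the set of tweet ids we've already counted (hypothetical file name).
    try:
        with open('seen_ids.pickle', 'rb') as f:
            seen_ids = pickle.load(f)
    except (IOError, EOFError, pickle.PickleError):
        seen_ids = set()

    # Only count tweets we haven't seen before.
    for tweet in tweets:
        if tweet['id'] in seen_ids:
            continue
        source = tweet['source']
        clientlist[source] = clientlist.get(source, 0) + 1
        seen_ids.add(tweet['id'])

    # Remember the ids for next time.
    with open('seen_ids.pickle', 'wb') as f:
        pickle.dump(seen_ids, f)

The seen-ids set grows forever, of course, so in practice you'd want to prune old ids at some point.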

It then writes out an HTML file with a nice sorted list of the most common Twitter clients (well, all the Twitter clients it's seen, sorted by most common first).

I’ve set up a cron job on this server to update the file linked to above every minute, so as time goes on the list will become a more and more accurate representation of what people are using to tweet.
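For anyone curious, the cron entry is nothing special; it's something along these lines, where the directory and script name are placeholders for wherever you saved it:

    # Run once a minute; the script writes its files to the current directory.
    * * * * * cd /path/to/script && python clientlist.py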
