Untiny that url!
There has been some talk about, arguments against, and responses to issues with using rev="canonical" to reference shorter URLs, instead of automatically using TinyURL when posting to sites like Twitter.
I must say that I agree with Ben Ramsey (see “arguments against” above) in suggesting we use rel="alternate shorter" instead.
I also like the idea that Chris Shiflett had of using an HTTP header and a HEAD request, so that you neither have to retrieve the entire requested page nor parse any HTML. I’d stick with Ben’s suggestion, however, and make the header something like “X-Alternate-Shorter:”, rather than “X-Rev-Canonical”. What’s the harm in calling it something that actually makes sense?
The idea of using HTTP HEAD requests to solve the problem inspired me to come up with a more immediate solution to one of the problems introduced by URL shortening services: uncertainty about where a URL leads.
This problem can be solved on the client side, which requires no work on the part of Twitter (meaning this is more likely to be put into use sooner).
Since most URL shortening services use an HTTP redirect to do their job, all it takes is a HEAD request to the tiny URL in question, followed by a look at the “Location:” header in the response to see what the real URL is. In fact, in most cases you don’t even need a HEAD request: most URL shortening services return no body with the redirect anyway, so an ordinary request that does not follow the redirect costs about the same.
Read on for more information and implementations of an untinyurl function in various languages.
There’s actually already a site online that offers the service of un-shortening URLs for you at UnTinyURL.com, but I wouldn’t suggest using that in any sort of automated system, and it’s of limited usefulness since you don’t really want to have to go to this site just to see what site you’re about to go to. Most people will just click a link, even if it means they might get RickRolled.
For those comfortable with the command line, a simple curl call can give you the same basic info:
$ curl -I http://tinyurl.com/c8f5bz
HTTP/1.1 301 Moved Permanently
X-Powered-By: PHP/5.2.9
Location: http://probablyprogramming.com/2009/04/11/untiny-that-url/
Content-type: text/html
Date: Sun, 12 Apr 2009 01:26:08 GMT
Server: TinyURL/1.6
Toss in a grep and an awk, and you get your URL in a single line, perfect if you’re handling shortened URLs in a shell script for some reason:
$ curl -s -I http://tinyurl.com/c8f5bz | grep Location | awk '{print $2}'
http://probablyprogramming.com/2009/04/11/untiny-that-url/
Here’s untinyurl in Python:
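import httplib
import urlparse

def untinyurl(tinyurl):
    url = urlparse.urlsplit(tinyurl)
    # Rebuild the request target (path, query, fragment) without scheme or host
    req = urlparse.urlunsplit(('', '', url.path, url.query, url.fragment))
    con = httplib.HTTPConnection(url.netloc)
    try:
        con.request('HEAD', req)
    except:
        # Could not connect or send the request
        return None
    response = con.getresponse()
    # httplib never follows redirects, so this is the redirect target (or None)
    return response.getheader('Location', None)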
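The listing above targets Python 2, where the modules are called httplib and urlparse. As a rough sketch of the same idea in Python 3 (not the author's code), the version below uses http.client and urllib.parse. The timeout parameter and the HTTPS handling are my own additions, and it leaves out the URL fragment because fragments are not sent to the server. Like the original, it returns None when there is no Location header or the request fails.

```python
import http.client
from urllib.parse import urlsplit


def untinyurl(tinyurl, timeout=10):
    """Return the Location header for tinyurl, or None if there is none."""
    parts = urlsplit(tinyurl)
    # Pick the connection class from the scheme (assumption: default to HTTP).
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    else:
        conn_class = http.client.HTTPConnection
    # The request target is the path plus the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = conn_class(parts.netloc, timeout=timeout)
    try:
        # http.client never follows redirects, so one HEAD request is enough.
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.getheader("Location")
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()
```

Calling untinyurl("http://tinyurl.com/c8f5bz") would make a single HEAD request and hand back whatever Location header the service sends, if any.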
And here’s a version in PHP. It’s a bit longer and uglier than the Python version because I’m using the low-level fsockopen function to do my HTTP request rather than using cURL or the HTTP extension. I did this because every PHP install will have fsockopen, whereas not every install will have cURL or the HTTP extension.
<?php
function untinyurl($tinyurl) {
    $url = parse_url($tinyurl);
    $host = $url['host'];
    $port = isset($url['port']) ? $url['port'] : 80;
    $query = isset($url['query']) ? '?' . $url['query'] : '';
    $fragment = isset($url['fragment']) ? '#' . $url['fragment'] : '';

    $sock = @fsockopen($host, $port);
    if (!$sock) return $tinyurl;

    $url = $url['path'] . $query . $fragment;
    $request = "HEAD {$url} HTTP/1.0\r\nHost: {$host}\r\nConnection: Close\r\n\r\n";
    fwrite($sock, $request);

    $response = '';
    while (!feof($sock)) {
        $response .= fgets($sock, 128);
    }

    $lines = explode("\r\n", $response);
    foreach ($lines as $line) {
        if (strpos(strtolower($line), 'location:') === 0) {
            list(, $location) = explode(':', $line, 2);
            return ltrim($location);
        }
    }
    return $tinyurl;
}
I’m not too familiar with Ruby, but after poking around for a little bit, I came up with this Ruby version. Holy crap, Ruby, that was easy and short!
require 'net/http'
require 'uri'

def untinyurl(tinyurl)
  Net::HTTP.get_response(URI.parse(tinyurl))['location'] or tinyurl
rescue
  tinyurl
end
And one more, an Erlang implementation. Be sure you call “inets:start()” before calling this function.
-module(untinyurl).
-export([untinyurl/1]).

untinyurl(TinyUrl) ->
    case http:request(head, {TinyUrl, []}, [{autoredirect, false}], []) of
        {ok, {_Status, Headers, _Body}} ->
            proplists:get_value("location", Headers, TinyUrl);
        _ ->
            TinyUrl
    end.
Interesting how the Erlang and Ruby implementations look pretty similar.
I’ve made the source code available at GitHub. If you would like to contribute an untinyurl implementation in another language or have a bug-fix or suggestion for an improvement of one of the implementations I have so far, either email me, send me a pull request on GitHub, or post a comment here.

April 12th, 2009 at 4:04 am
You don’t need grep when you use awk the way it is supposed to be used.
curl -s -I http://tinyurl.com/c8f5bz | awk '/Location:/ {print $2}'
is enough. You can also use
lynx -head -dump $URL
instead of curl.
April 12th, 2009 at 4:35 am
I haven’t used awk that much. I knew there was a way to do that, I just didn’t want to bother looking it up
April 12th, 2009 at 8:51 am
Jonathan Rockway talked about this the other day as well.
http://blog.jrock.us/articles/Unshortening URLs with Modern Perl.pod
Here is how you might write your untinyurl function in (not so modern) Perl.
use LWP::UserAgent;

sub untinyurl {
    my $url = shift;
    my $res = LWP::UserAgent->new->request(HTTP::Request->new(HEAD => $url));
    return $res->previous ? $res->previous->header('location') : $url;
}
April 12th, 2009 at 3:29 pm
Not idiomatic Python. The first line in httplib docs says:
“This module defines classes which implement the client side of the HTTP and HTTPS protocols. It is normally not used directly — the module urllib uses it to handle URLs that use HTTP and HTTPS.”
With urllib, the following will suffice:
A sample interaction is as follows:
April 12th, 2009 at 4:22 pm
@Dan Farina
I know it’s not idiomatic Python, and I even came up with that same shorter solution (with a try: except: to catch invalid URLs), but the problem with that version is that it actually does two HTTP requests. I couldn’t find a way to make it only do the first request.
If somebody knows how to make urllib or urllib2 not follow redirects, then I will change it. Until then, this is the shortest version I could come up with that didn’t involve doing some crazy stuff with subclassing urllib2.HTTPRedirectHandler.
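For reference, here is roughly what the subclassing route mentioned above looks like in Python 3, where urllib2 became urllib.request. This is only a sketch under my own assumptions: the names NoRedirect and untinyurl_urllib are made up for illustration, and it relies on the documented behaviour that a redirect_request method returning None makes the opener raise HTTPError instead of following the redirect. The Location header can then be read from the exception, so only one request is made.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to build a follow-up request;
        # the opener then raises HTTPError for the 3xx response.
        return None


def untinyurl_urllib(tinyurl, timeout=10):
    """Return the redirect target of tinyurl, or tinyurl itself if none."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        request = urllib.request.Request(tinyurl, method="HEAD")
        with opener.open(request, timeout=timeout):
            return tinyurl  # no redirect happened
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return err.headers.get("Location", tinyurl)
        return tinyurl
    except (urllib.error.URLError, ValueError):
        # Unreachable host or malformed URL
        return tinyurl
```

Whether this counts as less crazy than talking to the HTTP connection directly is a matter of taste; the connection-level version is shorter.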
April 12th, 2009 at 5:49 pm
And a Common Lisp version, using Edi Weitz’s Drakma HTTP client:
(require 'drakma)

(lambda (url)
  (multiple-value-bind (body return headers)
      (drakma:http-request url :method :head :redirect nil)
    (or (cdr (assoc :LOCATION headers)) url)))
April 13th, 2009 at 9:21 am
You can perform a HEAD request using Ruby’s Net::HTTP like this: http://gist.github.com/94463
April 14th, 2009 at 12:26 am
For the PHP version, there is the get_headers() function.
April 14th, 2009 at 1:49 am
@Alexander: That’s a good point. I didn’t even notice that function existed. It seems it just does a GET request, sadly, but it’s still better than having all the code in my version above.
April 14th, 2009 at 4:46 pm
Perhaps I’m missing something but how do you propose someone would utilize one of these tools to “untiny” a url in a non-intrusive way; wouldn’t most people continue to just click on the URL even if they risk being Rick Rolled?
Would going to the command line really be any less a pain in the ass than just clicking on the link and finding out where it goes?
It seems like a more inline approach is what is needed (a greasemonkey script or something similar) that will show the expanded URL within the browser.
April 14th, 2009 at 6:23 pm
@Bill: The idea is that people should use these functions in their Twitter clients or apps that pull in tweets for one reason or another. I didn’t really intend that people would run a script in bash just to find out what URL something goes to.
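To make that concrete, here is a small Python sketch of how a client might expand the links in a tweet before displaying it. Everything in it is hypothetical: expand_links and URL_PATTERN are names I made up, and resolve stands for any function that takes a short URL and returns the long one (or None), such as one of the untinyurl implementations from the post.

```python
import re

# Deliberately simple pattern: a URL runs until whitespace, a quote or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def expand_links(text, resolve):
    """Replace each URL in text with resolve(url), keeping it if unresolved."""

    def replace(match):
        raw = match.group(0)
        # Punctuation right after a URL usually belongs to the sentence.
        url = raw.rstrip(".,;:!?)")
        trailing = raw[len(url):]
        return (resolve(url) or url) + trailing

    return URL_PATTERN.sub(replace, text)


if __name__ == "__main__":
    # Stand-in resolver backed by a dict, so the example runs offline.
    known = {
        "http://tinyurl.com/c8f5bz":
            "http://probablyprogramming.com/2009/04/11/untiny-that-url/",
    }
    print(expand_links("Worth a read: http://tinyurl.com/c8f5bz.", known.get))
```

A real client would probably cache the results and resolve links in the background, so that showing a tweet never waits on a network request.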
April 17th, 2009 at 5:15 pm
Hi
You can try http://untiny.com (http://untiny.me). It has been around for a while and supports over 75 URL shortening services. It also has about 10 add-ons (based on the API, http://untiny.com/api) for Firefox, Mac, Linux …: http://untiny.com/extra
It’s in Arabic for the time being, but it will hopefully be available in English in a few days.
You can check the English version while I’m translating it :=) at http://alzaid.ws/labs/untiny-en
Thanks
April 22nd, 2009 at 4:10 am
G’day,
Did you know that “rel” is a space-separated list, so “alternate shorter” is interpreted as “alternate” and “shorter”? “alternate” refers to the content itself, not the link (it’s used for, e.g., Atom, PDF and text versions of the same content), and as for “shorter”, what’s shorter? The content?
If you want to avoid any confusion (including the rel=short[_- ]?ur[il] mess) then rel=shortlink is the answer you’re looking for.
Sam
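To illustrate the point about rel being a space-separated list, here is a short Python sketch that looks for a link element whose rel tokens include "shortlink". The names ShortlinkFinder and find_shortlink are made up for this example, and it only uses the standard library's html.parser. Note how rel="alternate shorter" is split into two separate tokens and so does not match.

```python
from html.parser import HTMLParser


class ShortlinkFinder(HTMLParser):
    """Remember the href of the first <link> whose rel includes 'shortlink'."""

    def __init__(self):
        super().__init__()
        self.shortlink = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.shortlink is not None:
            return
        attributes = dict(attrs)
        # rel is a space-separated list of tokens, compared case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "shortlink" in rel_tokens:
            self.shortlink = attributes.get("href")


def find_shortlink(html):
    """Return the shortlink URL advertised in html, or None."""
    finder = ShortlinkFinder()
    finder.feed(html)
    return finder.shortlink
```

A client that finds such a link could use it directly instead of calling a URL shortening service.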