Monday, April 20, 2015

Long URLs

In a world of 140 characters, space is at a premium (even for the longest tweet ever). It has become very common to shorten URLs when embedding links.

There are a lot of URL shortening services available, including branded ones such as t.co (Twitter), goo.gl (Google), nyti.ms (New York Times), and youtu.be (YouTube). You might not know it, but Factor even includes one in the wee-url web application.

You could use something like the LongURL service to resolve short URLs back to the long URL they point to, but I thought it would be more fun to show how to use Factor to do it!

By default, our http.client automatically follows redirects until a configurable maximum is exceeded. We need requests that do not follow redirects, and we use HEAD so that only the HTTP headers are retrieved, not the full contents:

: http-head-no-redirects ( url -- response data )
    <head-request> 0 >>redirects http-request* ;

We use symbols to configure a maximum number of redirects (defaulting to 5) and to store the current number of redirects.

SYMBOL: max-redirects
5 max-redirects set-global

SYMBOL: redirects

We want a word that takes a URL and retrieves the next URL, if redirected. If we exceed our maximum number of redirects, it should throw an error.

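: next-url ( url -- next-url redirected? )
    redirects inc  ! count this hop (an unset variable counts as 0)
    redirects get max-redirects get <= [
        dup http-head-no-redirects drop  ! stack: url response
        dup redirect? [
            nip "location" header t      ! redirected: take the Location header
        ] [ drop f ] if                  ! not redirected: keep the url
    ] [ too-many-redirects ] if ;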
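
To see the two things next-url looks at without touching the network, we can build a response by hand. This is only a sketch: the 301 status and the URL are made-up example values, and I am assuming that redirect? amounts to a 3xx status check, so the range test below stands in for it:

```factor
USING: accessors http kernel math.order prettyprint ;
IN: scratchpad

! A hand-built response that looks like a permanent redirect.
<response>
    301 >>code
    "http://factorcode.org/" "location" set-header

! First quotation: is the status code in the 3xx range?
! Second quotation: where does the Location header point?
[ code>> 300 399 between? . ]
[ "location" header . ] bi
```

This prints t followed by "http://factorcode.org/", which is the pair of values next-url would leave on the stack for such a response (in the opposite order).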

To find the "long URL", just loop until we are no longer redirected. Wrapping the loop in with-scope gives each call its own redirects counter:

: long-url ( short-url -- long-url )
    [ [ next-url ] loop ] with-scope ;

To see it work, we can try it out with a short URL that I just made:

IN: scratchpad "http://bit.ly/1J0vm1x" long-url .
"http://factorcode.org/"

Neat!

This code is available on my GitHub.
