Building a URL Shortener on Jekyll and S3

Home, Bangkok, Thailand, 2019-05-24 22:45 +0700

#webdev #cloud #aws

 
Photo by Hello I'm Nik

For my blog and for my CV I have a number of shortened URLs. For example, http://u.bren.cc/github points to my GitHub profile. Until now I’ve used a very nice open-source solution called YOURLS, which I Dockerized and ran on my home server. YOURLS provides a great admin UI for creating URLs and a nice reporting dashboard that shows hit counts and geocodes requesters.

As I’m revamping my website, I decided to take another look at my URL shortening solution and, in particular, to move it off my home network and into a more sustainable and robust location. One option would be to move YOURLS into a small instance on AWS. Another option would be to build or find a URL shortener that runs on Lambda, which would be an ideal platform for this requirement.

After ruminating on it over a coffee for a bit, I decided to take a different and much simpler approach: use Jekyll to create a static site consisting of redirector pages only, and deploy this to S3 the same way I’ve just done with my actual website. I’ll lose the reporting dashboard for now, but I will have a complete log of all traffic captured in S3, and I can most likely find a reporting tool to parse those logs later.

The basic concept is that there will be pages rendered by Jekyll for each short code, and each page will do a simple redirect using the <meta> tag. The redirection HTML will be provided by a Jekyll layout so that each redirector page will only need to set a page property specifying the target URL.

I’ve open-sourced the solution described below on GitHub so you can use that as a quickstart if you want to do something similar. Get it from here.

Here’s a quick rundown of how I built this.

First Cut

  • Created a new git repo for the shortener project. If I put this site in the same repo as my main website, commits for either would cause a build of both which would be wasteful in terms of build time and artifact storage on my CI platform.
  • In that new repo created a new site called “site”:
jekyll new site
  • Deleted the default _posts directory as well as the 404.html and example “about” page.
  • Replaced index.md with index.html containing the following code so that hitting the root gives a blank page:
<html>
<head>
    <title>u.bren.cc</title>
</head>
<body>
</body>
</html>
  • Created the _layouts directory and added a layout file called redirect.html with the following code:
---
---
<!doctype html>
<html>
<head>
    <meta http-equiv="refresh" content="0; url={{page.target}}">
</head>
</html>
  • Added a test page in the root called github.html with the following content:
---
layout: redirect
target: https://github.com/brendonmatheson
---

This works, so when I hit http://u.bren.cc/github.html it redirects on to my GitHub profile. Creating a new short URL is pretty simple too:

  • Add a page to the root named with the code plus .html
  • Specify its layout as redirect and set the target URL
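To confirm that a generated page really redirects where it should, the meta refresh target can be read back out of the built HTML. The following is a minimal Python sketch and not part of the original project. The helper name extract_refresh_target is hypothetical, and it assumes only the redirect layout shown above.

```python
from html.parser import HTMLParser


class _RefreshFinder(HTMLParser):
    """Remembers the URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=https://example.com".
        content = attr.get("content") or ""
        _, _, rest = content.partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip()


def extract_refresh_target(html):
    """Return the redirect URL found in the page, or None if there is none."""
    finder = _RefreshFinder()
    finder.feed(html)
    return finder.target
```

Feeding it the contents of a built page such as _site/github.html should return the target set in that page's front matter, which makes it easy to check every redirector after a build.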

It works but the .html suffix is a problem. This can be easily cleaned up using Jekyll’s permalink feature by adding the following to _config.yml:

permalink: /:title

Now Jekyll maps /github instead of /github.html to the page when run via jekyll serve - but it still actually generates github.html which will require the .html extension to be specified when served from S3. To work around this I’ve set the permalink to:

permalink: /:title/index.html

This means that each redirector gets its own directory with an index.html file containing the redirection HTML from the layout. As we will set the index document in S3’s static site config to be “index.html”, hitting the naked path will resolve to the index.html and redirect as we want.
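To make the difference between the two permalink settings concrete, here is a small Python model of how an S3 website endpoint picks an object for a request path. It is a simplified sketch based on my reading of the S3 static hosting documentation, not real AWS code. The function name resolve and the sets of keys are made up for illustration.

```python
def resolve(keys, path, index_document="index.html"):
    """Simplified model of S3 website routing.

    keys is the set of object keys in the bucket. Returns a tuple of
    (status, value): the served key for 200, the redirect location
    for 302, or None for 404.
    """
    key = path.lstrip("/")

    # Root or directory-style request: serve the index document inside it.
    if key == "" or key.endswith("/"):
        index_key = key + index_document
        return (200, index_key) if index_key in keys else (404, None)

    # An exact object match wins.
    if key in keys:
        return (200, key)

    # No object, but an index document exists under that prefix:
    # S3 answers with a redirect to the trailing-slash form.
    if f"{key}/{index_document}" in keys:
        return (302, f"/{key}/")

    return (404, None)


# Output of "permalink: /:title" (a flat github.html file)
flat = {"index.html", "github.html"}
# Output of "permalink: /:title/index.html"
nested = {"index.html", "github/index.html"}

print(resolve(flat, "/github"))     # no object named "github" exists
print(resolve(nested, "/github"))   # redirected to /github/
print(resolve(nested, "/github/"))  # serves github/index.html
```

Under this model the naked path costs one extra hop (a redirect to the trailing-slash URL) before the meta refresh page is served, which is harmless for a shortener.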

Using Collections

Jekyll has a Collections concept which is a way to treat pages as semi-structured data objects - almost like a simple read-only document database. We can improve the solution using collections as follows:

  • Define a collection called “redirects” in _config.yml:
collections:
  redirects:
    output: true
  • Move the redirection page github.html into a new _redirects/ directory which is where Jekyll expects to find it
  • Move the global permalink setting into the definition of the collection:
collections:
  redirects:
    output: true
    permalink: /:title/index.html
  • Add defaults to _config.yml and specify that the layout for items in the redirects collection should be “redirect”:
defaults:
  - scope:
      type: redirects
    values:
      layout: redirect
  • Finally, drop the layout property from the redirect page _redirects/github.html, leaving it with the following content:
---
target: https://github.com/brendonmatheson
---

After restarting Jekyll, the site functions as before but with the new benefits of:

  • Redirect pages are all kept together in _redirects/ rather than being in the root and mixed with other files
  • No need to specify the layout property for every redirector page.
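With the collection in place, adding a short URL comes down to writing one tiny front-matter file, so it is easy to script. The helper below is a hypothetical Python sketch and is not part of the published project. The name write_redirect and the validation rules are assumptions, and it produces the same file format as _redirects/github.html above.

```python
import re
from pathlib import Path


def write_redirect(site_dir, code, target):
    """Create _redirects/<code>.html under site_dir and return its path."""
    # Assumed rule: keep short codes to characters that are safe in a URL path.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", code):
        raise ValueError(f"unsupported short code: {code!r}")
    # Whitespace in the target would break the unquoted YAML value.
    if not target or re.search(r"\s", target):
        raise ValueError(f"unsupported target URL: {target!r}")

    redirects = Path(site_dir) / "_redirects"
    redirects.mkdir(parents=True, exist_ok=True)

    page = redirects / f"{code}.html"
    if page.exists():
        # Refuse to silently repoint an existing short URL.
        raise FileExistsError(f"short code already in use: {code}")

    page.write_text(f"---\ntarget: {target}\n---\n", encoding="utf-8")
    return page
```

For example, write_redirect("site", "github", "https://github.com/brendonmatheson") would create site/_redirects/github.html, ready to be committed.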

Wrapping Up

I’ve deployed this solution to S3 with a configuration similar to the one I created for my blog site. Whereas for my site I have a manual Promote action to publish to production, for the URL shortener all commits to master are auto-published.

Adding new short URLs is now a matter of committing a new text file, which I can do from my dev workstation or in Termux on my mobile dev tablet.