Taking Apart URL Shorteners

Ever get a "shortened" URL (bit.ly, tinyurl.com or whatever) and stress about "clicking that link"?  Or worse yet, have that "Oh No" moment after you just clicked it?  Or possibly trip over such a link during IR and have to investigate it?  Is there a way to look at where the link goes without a sandbox and a packet sniffer (or Fiddler or Burp or similar)?

This may be old news to some of you, but it's really disturbing how many people, even security folks, will follow a shortened link.  It's enough of a problem that "de-fanging" links is a standard feature in many mail filter / anti-spam products.
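For anyone who hasn't seen de-fanging in action, here is a minimal sketch of the idea. The function name `defang` is made up for illustration, and the `hxxp` / `[.]` substitutions are just one common convention; products differ on the details.

```shell
# defang: rewrite a URL so that it is no longer clickable.
# Hypothetical helper; uses one common convention (hxxp and [.]).
defang() {
  # Swap a leading "http" for "hxxp", then bracket every dot.
  printf '%s\n' "$1" | sed -e 's/^http/hxxp/' -e 's/\./[.]/g'
}
```

For example, `defang https://isc.sans.edu/` prints `hxxps://isc[.]sans[.]edu/`.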

Sure, you could go to an online service like https://getlinkinfo.com, but you don't know who's running those, or how they unshorten the link - you don't want them to actually navigate to the target site (which is what a browser does, or curl when told to follow redirects with -L) - more on this later.  For me, I wanted a CLI script that would take a short URL and return the original link - I might want to run that result through something else (a reputation filter or VirusTotal for instance).  Let's take a closer look at how we can do that.

Luckily, most of these shorteners are very simple.  Let's look at what's behind a bit.ly request using curl:

curl -k -v -I https://bit.ly/3ABvcy5
*   Trying 67.199.248.11:443...
* Connected to bit.ly (67.199.248.11) port 443 (#0)
* schannel: disabled automatic use of client certificate
* ALPN: offers http/1.1
* ALPN: server accepted http/1.1
> HEAD /3ABvcy5 HTTP/1.1
> Host: bit.ly
> User-Agent: curl/7.83.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Server: nginx
Server: nginx
< Date: Thu, 25 Aug 2022 12:40:36 GMT
Date: Thu, 25 Aug 2022 12:40:36 GMT
< Content-Type: text/html; charset=utf-8
Content-Type: text/html; charset=utf-8
< Content-Length: 108
Content-Length: 108
< Cache-Control: private, max-age=90
Cache-Control: private, max-age=90
< Location: https://isc.sans.edu/
Location: https://isc.sans.edu/
< Via: 1.1 google
Via: 1.1 google
< Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

<
* Connection #0 to host bit.ly left intact

Why so many arguments in the curl command?
-k   skip certificate verification on the target, without this there's a ton of hoops to jump through to have curl work with https
-v   verbose, since we're investigating here
-I   send a HEAD request instead of a GET, so that we only get the shortener's headers back.  curl doesn't follow the redirect unless we add -L, so we never actually navigate to the target link

Why don't we want to follow the link?  Even though using curl gives us some decent control over what happens to the returned data (i.e., it won't be detonating in the browser), actually hitting the target means our potential adversary may now know we're investigating, or they might think we've actually browsed to the link.  Either way, you don't want to tip your hand to the adversary until it's time to do so.

Looking at the returned data, we see our target "unshortened" link twice, once in curl's verbose output (the line starting with "<") and once in the headers that -I prints:

curl -k -v -I https://bit.ly/3ABvcy5 2>&1 | grep ://
< Location: https://isc.sans.edu/
Location: https://isc.sans.edu/

Wait, what's all that new stuff?
2>&1 sends STDERR to STDOUT, so that we actually get all that verbose output into an output stream we can work with.
grep :// looks for whatever URL protocol might have been in the original URL.  We could have used "grep -i https" in this case, but http: or ftp: or tel: or whatever other protocol would have all failed in that case.

As we dig further into this, you'll see that mailto: links don't have those two slashes, so we'll have to use a different approach as we go forward.
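To illustrate the point about the slashes, here is a small sketch that identifies a URL's scheme by cutting at the first colon instead of looking for "://". The helper name `scheme_of` is hypothetical and is not part of the script built in this post.

```shell
# scheme_of: print the scheme of a URL by cutting at the first colon, so it
# works for "mailto:" and "tel:" as well as "https://" (hypothetical helper).
scheme_of() {
  case "$1" in
    *:*) printf '%s\n' "${1%%:*}" ;;   # text before the first ":"
    *)   return 1 ;;                   # no colon, so no scheme to report
  esac
}
```

With this, `scheme_of mailto:rob@coherentsecurity.com` prints `mailto`, which a grep for "://" would have missed.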

Looking at several shorteners (bit.ly, rb.gy, short.io etc), all of the ones I've looked at so far return the "< Location:" header.  This makes a CLI "unshortener" fairly simple to write, starting with the way we constructed that last set of commands.
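A Location header normally arrives with a 3xx status, as in the "301 Moved Permanently" response above. As a rough sketch, assuming the same `curl -v -I ... 2>&1` output is piped in, the status can be checked before trusting the Location line. The helper names `redirect_code` and `is_redirect` are invented for this example.

```shell
# redirect_code: read `curl -v -I` output on stdin and print the HTTP status
# code from the first "< HTTP/..." response line (hypothetical helper name).
redirect_code() {
  # Strip carriage returns, then take the third field of the status line,
  # e.g. "< HTTP/1.1 301 Moved Permanently" gives 301.
  tr -d '\r' | awk '/^< HTTP\// { print $3; exit }'
}

# is_redirect: succeed only when that status code is in the 3xx range.
is_redirect() {
  case "$(redirect_code)" in
    3[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}
```

Something like `curl -k -v -I https://bit.ly/3ABvcy5 2>&1 | is_redirect && echo redirect` would then flag shorteners that answer with anything other than a redirect.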

This is the final script (Windows version, since it uses %1 instead of $1):

curl -k -v -I %1 2>&1 | grep -i "< location" | cut -d " " -f 3


This takes our first curl call and keeps only the "< Location" line (case-insensitive, since some services send the header in lower case).  It then cuts that line into fields using the space character as a delimiter, and returns the third field, which is the target URL.
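For Linux or macOS, the same pipeline could be wrapped in a shell function using $1. This is a sketch rather than a tested port: the split into `unshort` and `extract_location` and the extra `tr -d '\r'` step (to drop the carriage return that ends each HTTP header line) are additions that are not in the one-liner above.

```shell
#!/bin/sh
# unshort: print the target of a shortened URL without visiting the target.
# Assumes curl, grep, tr and cut are on the PATH.
unshort() {
  # -k skip certificate checks, -v verbose, -I HEAD request only;
  # 2>&1 folds curl's verbose output (stderr) into the pipe.
  curl -k -v -I "$1" 2>&1 | extract_location
}

# extract_location: pull the redirect target out of curl's verbose output.
extract_location() {
  # Keep the "< Location" line (any case), strip the trailing carriage
  # return, and return the third space-separated field.
  grep -i "^< location" | tr -d '\r' | cut -d " " -f 3
}
```

After sourcing this, `unshort https://bit.ly/3ABvcy5` should behave like the Windows version.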

Let's take a look at how this script works using various services:

> unshort  https://bit.ly/3ABvcy5
https://isc.sans.edu/

> unshort  https://4vnx.short.gy/BuB4TW
https://isc.sans.edu/

> unshort  https://tinyurl.com/bdhf48p4
http://isc.sans.edu

> unshort  https://rb.gy/rickpn
http://isc.sans.edu/

This also works for email (mailto:) links and links to phone numbers:

> unshort https://tinyurl.com/3tuudmuv
mailto:rob@coherentsecurity.com

> unshort https://tinyurl.com/4zkd52jt
tel://2725035

This even works for the Twitter link shortener (t.co):

> unshort https://t.co/0BACDYaBmU
https://www.youtube.com/watch?v=dQw4w9WgXcQ

(you should really check out that youtube video)

If you find a link shortener service where this doesn't work, please let us know in the comment section.  I'm happy to update this script if needed, I'm finding it pretty useful - if you use it too, share what you can in the comments!

... and be sure to check that youtube link  ;-)

===============
Rob VandenBrink
rob@coherentsecurity.com

ISC Handler
Aug 25th 2022
