535. Encode and Decode TinyURL - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Hash Maps - Used to store bidirectional mappings between long and short URLs for O(1) lookup
  • String Manipulation - Required for parsing URLs, extracting indices, and constructing short URL codes
  • System Design Basics - Understanding trade-offs between different encoding strategies (sequential IDs vs random codes)

1. List

Intuition

The simplest way to create a URL shortener is to assign each long URL a unique integer ID based on its position in a list. The short URL is just the base URL plus this index. When decoding, we extract the index from the short URL and retrieve the original URL from the list. This approach is straightforward but the short URL length grows with the number of stored URLs.

Algorithm

  1. Maintain a list urls to store all encoded long URLs.
  2. encode(longUrl):
    • Append longUrl to the list.
    • Return "http://tinyurl.com/" followed by the index (list length - 1).
  3. decode(shortUrl):
    • Extract the index from the end of the short URL.
    • Return urls[index].
class Codec:

    def __init__(self):
        self.urls = []

    def encode(self, longUrl: str) -> str:
        self.urls.append(longUrl)
        return 'http://tinyurl.com/' + str(len(self.urls) - 1)

    def decode(self, shortUrl: str) -> str:
        return self.urls[int(shortUrl.split('/')[-1])]

Time & Space Complexity

  • Time complexity: O(1)O(1) for encode()encode() and decode()decode().
  • Space complexity: O(nm)O(n * m)

Where nn is the number of longUrlslongUrls, mm is the average length of the URLs.


2. Hash Map - I

Intuition

Instead of using a list with implicit indices, we use a hash map with an explicit incrementing ID. This offers more flexibility since we can use the ID as a key and don't rely on list indexing. Each new URL gets assigned the current ID, which then increments. The hash map allows constant-time lookup when decoding.

Algorithm

  1. Maintain a hash map url_map and an integer id starting at 0.
  2. encode(longUrl):
    • Store url_map[id] = longUrl.
    • Generate the short URL using the current id.
    • Increment id and return the short URL.
  3. decode(shortUrl):
    • Parse the ID from the short URL.
    • Return url_map[id].
class Codec:

    def __init__(self):
        self.url_map = {}
        self.id = 0

    def encode(self, longUrl: str) -> str:
        self.url_map[self.id] = longUrl
        short_url = f"http://tinyurl.com/{self.id}"
        self.id += 1
        return short_url

    def decode(self, shortUrl: str) -> str:
        url_id = int(shortUrl.split('/')[-1])
        return self.url_map[url_id]

Time & Space Complexity

  • Time complexity: O(1)O(1) for encode()encode() and decode()decode().
  • Space complexity: O(nm)O(n * m)

Where nn is the number of longUrlslongUrls, mm is the average length of the URLs.


3. Hash Map - II

Intuition

This approach uses two hash maps to create a bidirectional mapping. One maps long URLs to short URLs, and the other maps short URLs back to long URLs. This ensures that encoding the same long URL twice returns the same short URL, avoiding duplicates. The trade-off is using more memory for the extra map.

Algorithm

  1. Maintain two hash maps: encodeMap (longUrl to shortUrl) and decodeMap (shortUrl to longUrl).
  2. encode(longUrl):
    • If longUrl is not in encodeMap, create a new short URL using the current count.
    • Store the mapping in both encodeMap and decodeMap.
    • Return the short URL.
  3. decode(shortUrl):
    • Look up the short URL in decodeMap and return the corresponding long URL.
class Codec:

    def __init__(self):
        self.encodeMap = {}
        self.decodeMap = {}
        self.base = "http://tinyurl.com/"

    def encode(self, longUrl: str) -> str:
        if longUrl not in self.encodeMap:
            shortUrl = self.base + str(len(self.encodeMap) + 1)
            self.encodeMap[longUrl] = shortUrl
            self.decodeMap[shortUrl] = longUrl
        return self.encodeMap[longUrl]

    def decode(self, shortUrl: str) -> str:
        return self.decodeMap[shortUrl]

Time & Space Complexity

  • Time complexity: O(1)O(1) for encode()encode() and decode()decode().
  • Space complexity: O(nm)O(n * m)

Where nn is the number of longUrlslongUrls, mm is the average length of the URLs.


Common Pitfalls

Not Handling Duplicate URLs Consistently

If the same long URL is encoded multiple times, some implementations generate different short URLs each time, wasting storage. Using a bidirectional mapping (approach 3) ensures the same long URL always returns the same short URL, but this trade-off uses more memory.

Incorrect Parsing of the Short URL

When decoding, you must correctly extract the ID from the short URL. Common mistakes include not handling the base URL prefix properly, using the wrong string split method, or forgetting to convert the extracted string to an integer before lookup.

Integer Overflow with Sequential IDs

Using a simple incrementing integer ID works for small scale, but can overflow for very large numbers of URLs. In production systems, consider using larger integer types, UUIDs, or base62 encoding to handle billions of URLs while keeping short URLs compact.