Before attempting this problem, you should be comfortable with hash maps (dictionaries), lists, and basic string manipulation.
The simplest way to create a URL shortener is to assign each long URL a unique integer ID based on its position in a list. The short URL is just the base URL plus this index. When decoding, we extract the index from the short URL and retrieve the original URL from the list. This approach is straightforward, but the short URL length grows with the number of stored URLs.
Initialize a list urls to store all encoded long URLs.

encode(longUrl):
- Append longUrl to the list.
- Return "http://tinyurl.com/" followed by the index of the new entry (list length - 1).

decode(shortUrl):
- Extract the index from shortUrl and return urls[index].

Time complexity: O(1) for encode() and decode(). Space complexity: O(n * m), where n is the number of stored URLs and m is the average length of the URLs.
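A minimal sketch of this list-based version, reusing the Codec class shape and the "http://tinyurl.com/" base from the later snippets (the original gives no code for this approach, so the details here are an assumption):

class Codec:
    def __init__(self):
        # All long URLs in insertion order; the list index doubles as the ID.
        self.urls = []

    def encode(self, longUrl: str) -> str:
        self.urls.append(longUrl)
        # The new entry sits at the last index (list length - 1).
        return f"http://tinyurl.com/{len(self.urls) - 1}"

    def decode(self, shortUrl: str) -> str:
        # The index is the final path segment of the short URL.
        index = int(shortUrl.split('/')[-1])
        return self.urls[index]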
Instead of using a list with implicit indices, we use a hash map keyed by an explicitly incrementing ID. This offers more flexibility, since the ID is just a key and we no longer rely on list indexing. Each new URL is assigned the current ID, which is then incremented. The hash map allows constant-time lookup when decoding.
Initialize a hash map url_map and an integer counter id starting at 0.

encode(longUrl):
- Store url_map[id] = longUrl.
- Build the short URL from "http://tinyurl.com/" and id.
- Increment id and return the short URL.

decode(shortUrl):
- Extract id from shortUrl and return url_map[id].

class Codec:
    def __init__(self):
        self.url_map = {}
        self.id = 0

    def encode(self, longUrl: str) -> str:
        # Map the current id to the long URL, then build the short URL from it.
        self.url_map[self.id] = longUrl
        short_url = f"http://tinyurl.com/{self.id}"
        self.id += 1
        return short_url

    def decode(self, shortUrl: str) -> str:
        # The id is the final path segment of the short URL.
        url_id = int(shortUrl.split('/')[-1])
        return self.url_map[url_id]

Time complexity: O(1) for encode() and decode(). Space complexity: O(n * m), where n is the number of stored URLs and m is the average length of the URLs.
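A quick round trip with this codec; the long URL below is a hypothetical example, and the first assigned id is 0:

codec = Codec()
short = codec.encode("https://example.com/some/very/long/path")  # "http://tinyurl.com/0"
assert codec.decode(short) == "https://example.com/some/very/long/path"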
This approach uses two hash maps to create a bidirectional mapping. One maps long URLs to short URLs, and the other maps short URLs back to long URLs. This ensures that encoding the same long URL twice returns the same short URL, avoiding duplicates. The trade-off is using more memory for the extra map.
Initialize two hash maps: encodeMap (longUrl to shortUrl) and decodeMap (shortUrl to longUrl).

encode(longUrl):
- If longUrl is not in encodeMap, create a new short URL using the current count of stored URLs.
- Store the pair in both encodeMap and decodeMap.
- Return encodeMap[longUrl].

decode(shortUrl):
- Look up shortUrl in decodeMap and return the corresponding long URL.

class Codec:
    def __init__(self):
        self.encodeMap = {}
        self.decodeMap = {}
        self.base = "http://tinyurl.com/"

    def encode(self, longUrl: str) -> str:
        # Only create a new short URL the first time this long URL is seen.
        if longUrl not in self.encodeMap:
            shortUrl = self.base + str(len(self.encodeMap) + 1)
            self.encodeMap[longUrl] = shortUrl
            self.decodeMap[shortUrl] = longUrl
        return self.encodeMap[longUrl]

    def decode(self, shortUrl: str) -> str:
        return self.decodeMap[shortUrl]

Time complexity: O(1) for encode() and decode(). Space complexity: O(n * m), where n is the number of stored URLs and m is the average length of the URLs.
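A short check of the bidirectional behaviour, using a hypothetical long URL:

codec = Codec()
long_url = "https://example.com/articles/url-shortener"
first = codec.encode(long_url)
second = codec.encode(long_url)
assert first == second                  # re-encoding returns the same short URL
assert codec.decode(first) == long_url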
If the same long URL is encoded multiple times, some implementations generate a different short URL each time, wasting storage. Using a bidirectional mapping (approach 3) ensures the same long URL always returns the same short URL; the trade-off is the memory spent on the extra map.
When decoding, you must correctly extract the ID from the short URL. Common mistakes include not handling the base URL prefix properly, using the wrong string split method, or forgetting to convert the extracted string to an integer before lookup.
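For the short URL format used above, a safe extraction takes the last path segment and converts it to an integer before the lookup (the value 42 below is a hypothetical example); on Python 3.9+ the known base prefix can also be stripped directly:

short_url = "http://tinyurl.com/42"
url_id = int(short_url.split('/')[-1])                        # split on '/', keep the last segment
url_id = int(short_url.removeprefix("http://tinyurl.com/"))   # equivalent: strip the base prefix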
Using a simple incrementing integer ID works at small scale, but it can overflow fixed-width integer types for very large numbers of URLs. In production systems, consider using larger integer types, UUIDs, or base62 encoding to handle billions of URLs while keeping short URLs compact.
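As an illustration of base62 encoding (a sketch, not part of the approaches above; the alphabet choice and helper names are assumptions), a numeric ID can be packed into a compact alphanumeric code and recovered on decode:

import string

# 62-character alphabet: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base62(num: int) -> str:
    # Convert a non-negative integer ID into a base62 string.
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, rem = divmod(num, 62)
        digits.append(ALPHABET[rem])
    return ''.join(reversed(digits))

def from_base62(code: str) -> int:
    # Invert to_base62 so the integer ID can be recovered when decoding.
    num = 0
    for ch in code:
        num = num * 62 + ALPHABET.index(ch)
    return num

Six base62 characters already cover 62^6 (about 57 billion) IDs, so the short code stays compact even at billion-URL scale.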