1396. Design Underground System - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Hash Maps - Used to store and retrieve passenger check-in data and route statistics in O(1) time
  • Object-Oriented Design - Understanding how to design a class with multiple methods that share state

1. Two HashMaps

Intuition

We need to track passenger trips and compute average travel times between stations. Each passenger checks in at one station and checks out at another, forming a route. By storing check-in information and aggregating travel times for each route, we can efficiently calculate averages without storing individual trip details.

Algorithm

  1. Use one hashmap to store check-in information: map passenger id to (start station, check-in time).
  2. Use another hashmap to store route statistics: map (start station, end station) to [total time, trip count].
  3. For checkIn: Store the passenger's start station and time in the check-in map.
  4. For checkOut: Retrieve the passenger's check-in data, calculate the trip duration, and update the route map by adding the duration to the total time and incrementing the count.
  5. For getAverageTime: Look up the route in the route map and return total time divided by count.
class UndergroundSystem:

    def __init__(self):
        self.checkInMap = {} # id -> (startStation, time)
        self.routeMap = {} # (start, end) -> [totalTime, count]

    def checkIn(self, id: int, startStation: str, t: int) -> None:
        self.checkInMap[id] = (startStation, t)

    def checkOut(self, id: int, endStation: str, t: int) -> None:
        startStation, time = self.checkInMap[id]
        route = (startStation, endStation)
        if route not in self.routeMap:
            self.routeMap[route] = [0, 0]
        self.routeMap[route][0] += t - time
        self.routeMap[route][1] += 1

    def getAverageTime(self, startStation: str, endStation: str) -> float:
        totalTime, count = self.routeMap[(startStation, endStation)]
        return totalTime / count

Time & Space Complexity

  • Time complexity:
    • O(1)O(1) time for initialization.
    • O(1)O(1) time for each checkIn()checkIn() function call.
    • O(m)O(m) time for each checkOut()checkOut() and getAverageTime()getAverageTime() function calls.
  • Space complexity: O(n+N2)O(n + N ^ 2)

Where nn is the number of passengers, NN is the total number of stations, and mm is the average length of station name.


2. Two HashMaps + Hashing

Intuition

The previous approach uses string concatenation to create route keys, which can be slow for long station names. By computing a hash value for each route instead of concatenating strings, we can achieve faster lookups. Using double hashing (two different hash functions) reduces collision probability while maintaining O(1) average lookup time.

Algorithm

  1. Use one hashmap to store check-in information: map passenger id to (start station, check-in time).
  2. Use another hashmap to store route statistics: map route hash to [total time, trip count].
  3. Implement a hash function that combines two polynomial rolling hashes with different bases and moduli to create a unique identifier for each route.
  4. For checkIn: Store the passenger's start station and time.
  5. For checkOut: Retrieve check-in data, compute the route hash, and update the route statistics.
  6. For getAverageTime: Compute the route hash and return total time divided by count from the route map.
class UndergroundSystem:
    MOD1, MOD2 = 768258391, 685683731
    BASE1, BASE2 = 37, 31

    def __init__(self):
        self.checkInMap = {}  # id -> (startStation, time)
        self.routeMap = {}  # hash(route) -> (totalTime, count)

    def getHash(self, s1: str, s2: str) -> int:
        h1, h2, p1, p2 = 0, 0, 1, 1
        for c in s1 + ',' + s2:
            h1 = (h1 + (ord(c) - 96) * p1) % self.MOD1
            h2 = (h2 + (ord(c) - 96) * p2) % self.MOD2
            p1 = (p1 * self.BASE1) % self.MOD1
            p2 = (p2 * self.BASE2) % self.MOD2
        return (h1 << 32) | h2

    def checkIn(self, id: int, startStation: str, t: int) -> None:
        self.checkInMap[id] = (startStation, t)

    def checkOut(self, id: int, endStation: str, t: int) -> None:
        startStation, time = self.checkInMap[id]
        routeHash = self.getHash(startStation, endStation)
        if routeHash not in self.routeMap:
            self.routeMap[routeHash] = [0, 0]
        self.routeMap[routeHash][0] += t - time
        self.routeMap[routeHash][1] += 1

    def getAverageTime(self, startStation: str, endStation: str) -> float:
        routeHash = self.getHash(startStation, endStation)
        totalTime, count = self.routeMap[routeHash]
        return totalTime / count

Time & Space Complexity

  • Time complexity:
    • O(1)O(1) time for initialization.
    • O(1)O(1) time for each checkIn()checkIn() function call.
    • O(m)O(m) time for each checkOut()checkOut() and getAverageTime()getAverageTime() function calls.
  • Space complexity: O(n+N2)O(n + N ^ 2)

Where nn is the number of passengers, NN is the total number of stations, and mm is the average length of station name.


Common Pitfalls

Integer Division Instead of Float Division

When calculating the average time, using integer division will truncate the result. The problem requires a floating-point average, so ensure you cast to float before dividing.

# Wrong - integer division in some languages
return totalTime / count  # May truncate in Java/C++

# Correct - explicit float conversion
return float(totalTime) / count  # Python
return (double) totalTime / count;  // Java/C++

Route Key Collision with Simple Concatenation

If station names can contain the delimiter character used in route keys, you may get false matches. For example, stations "A,B" and "C" would collide with "A" and "B,C" if using comma as delimiter.

# Potential collision issue
route = startStation + endStation  # "AB" + "C" == "A" + "BC"

# Better - use a delimiter unlikely to appear in station names
route = startStation + "," + endStation

# Best - use a tuple as key (in languages that support it)
route = (startStation, endStation)

Not Cleaning Up Check-In Data

After a passenger checks out, the check-in data is no longer needed. While not strictly incorrect, keeping stale check-in entries can waste memory in long-running systems.

def checkOut(self, id: int, endStation: str, t: int) -> None:
    startStation, time = self.checkInMap[id]
    # ... update route statistics ...

    # Optional cleanup to save memory
    del self.checkInMap[id]