One way to serialize an n-ary tree is to record each node along with a unique identifier and its parent's identifier. During serialization, we assign each node an ID, store its value, and record its parent's ID. During deserialization, we first create all nodes, then link children to their parents using the stored parent IDs. This approach explicitly encodes the tree structure through parent references.
Serialization:
DFS, assigning each node a unique identity starting from 1.'N' for the root).Deserialization:
3 characters to extract identity, value, and parent ID.ID and add the node to the parent's children list.class WrappableInt:
def __init__(self, x):
self.value = x
def getValue(self):
return self.value
def increment(self):
self.value += 1
class Codec:
def serialize(self, root: 'Node') -> str:
serializedList = []
self._serializeHelper(root, serializedList, WrappableInt(1), None)
return "".join(serializedList)
def _serializeHelper(self, root, serializedList, identity, parentId):
if not root:
return
# Own identity
serializedList.append(chr(identity.getValue() + 48))
# Actual value
serializedList.append(chr(root.val + 48))
# Parent's identity
serializedList.append(chr(parentId + 48) if parentId else 'N')
parentId = identity.getValue()
for child in root.children:
identity.increment()
self._serializeHelper(child, serializedList, identity, parentId)
def deserialize(self, data: str) -> 'Node':
if not data:
return None
return self._deserializeHelper(data)
def _deserializeHelper(self, data):
nodesAndParents = {}
for i in range(0, len(data), 3):
identity = ord(data[i]) - 48
orgValue = ord(data[i + 1]) - 48
parentId = ord(data[i + 2]) - 48
nodesAndParents[identity] = (parentId, Node(orgValue, []))
for i in range(3, len(data), 3):
# Current node
identity = ord(data[i]) - 48
node = nodesAndParents[identity][1]
# Parent node
parentId = ord(data[i + 2]) - 48
parentNode = nodesAndParents[parentId][1]
# Attach!
parentNode.children.append(node)
return nodesAndParents[ord(data[0]) - 48][1]Time complexity:
Serialization : . For every node, we add 3 different values to the final string and every node is processed exactly once.
Deserialization : . Technically, it is for the first for loop and for the second one. However, constants are ignored in asymptotic complexity analysis. So, the overall time complexity for deserialization is .
Space complexity:
Serialization : . The space occupied by the serialization helper function is through recursion stack and the final string that is produced. Usually, we don't take into consideration the space of the output. However, in this case, the output is something which is not fixed. For all we know, someone might be able to generate a string of size . We don't know! So, the size of the final string is a part of the space complexity here. Overall, the space is .
Deserialization : . The space occupied by the deserialization helper function is through the hash map. For each entry, we have 3 values. Thus, we can say the space is . But again, the constants don't really matter in asymptotic complexity. So, the overall space is .
Where is the number of nodes in the tree.
Instead of storing parent IDs, we can store each node's value followed by its number of children. This gives us enough information to reconstruct the tree during DFS deserialization. When we encounter a node, we know exactly how many children to expect, so we can recursively build each subtree. This approach is more space-efficient since we only need 2 characters per node.
Serialization:
Deserialization:
class WrappableInt:
def __init__(self, x):
self.value = x
def getValue(self):
return self.value
def increment(self):
self.value += 1
class Codec:
def serialize(self, root: 'Node') -> str:
serializedList = []
self._serializeHelper(root, serializedList)
return "".join(serializedList)
def _serializeHelper(self, root, serializedList):
if not root:
return
# Actual value
serializedList.append(chr(root.val + 48))
# Number of children
serializedList.append(chr(len(root.children) + 48))
for child in root.children:
self._serializeHelper(child, serializedList)
def deserialize(self, data: str) -> 'Node':
if not data:
return None
return self._deserializeHelper(data, WrappableInt(0))
def _deserializeHelper(self, data, index):
if index.getValue() == len(data):
return None
# The invariant here is that the "index" always
# points to a node and the value next to it
# represents the number of children it has.
node = Node(ord(data[index.getValue()]) - 48, [])
index.increment()
numChildren = ord(data[index.getValue()]) - 48
for _ in range(numChildren):
index.increment()
node.children.append(self._deserializeHelper(data, index))
return nodeTime complexity:
Serialization : . For every node, we add 2 different values to the final string and every node is processed exactly once.
Deserialization : . For deserialization, we process the entire string, one character at a time and also construct the tree along the way. So, the overall time complexity for deserialization is .
Space complexity:
Serialization : . The space occupied by the serialization helper function is through recursion stack and the final string that is produced. We know the size of the final string to be . So, that is one part of the space complexity. The other part is the one occupied by the recursion stack which is . Overall, the space is .
Deserialization : . For deserialization, the space occupied is by the recursion stack only. We don't use any other intermediate data structures like we did in the previous approach and simply rely on the information in the string and recursion to work it's magic. So, the space complexity would be since this is not a balanced tree of any sort. It's not even binary.
Where is the number of nodes in the tree.
Another approach uses a sentinel character (like '#') to mark the end of a node's children. After serializing a node's value and all its children recursively, we append the sentinel. During deserialization, we read nodes and recursively build children until we hit the sentinel, which signals that all children for the current node have been processed.
Serialization:
Deserialization:
class WrappableInt:
def __init__(self, x):
self.value = x
def getValue(self):
return self.value
def increment(self):
self.value += 1
class Codec:
def serialize(self, root: 'Node') -> str:
serializedList = []
self._serializeHelper(root, serializedList)
return "".join(serializedList)
def _serializeHelper(self, root, serializedList):
if not root:
return
# Actual value
serializedList.append(chr(root.val + 48))
for child in root.children:
self._serializeHelper(child, serializedList)
# Add the sentinel to indicate that all the children
# for the current node have been processed
serializedList.append('#')
def deserialize(self, data: str) -> 'Node':
if not data:
return None
return self._deserializeHelper(data, WrappableInt(0))
def _deserializeHelper(self, data, index):
if index.getValue() == len(data):
return None
node = Node(ord(data[index.getValue()]) - 48, [])
index.increment()
while (data[index.getValue()] != '#'):
node.children.append(self._deserializeHelper(data, index))
# Discard the sentinel. Note that this also moves us
# forward in the input string. So, we don't have the index
# progressing inside the above while loop!
index.increment()
return nodeTime complexity:
Serialization : . For every node, we add 2 different values to the final string and every node is processed exactly once.
Deserialization : . For deserialization, we process the entire string, one character at a time and also construct the tree along the way. So, the overall time complexity for deserialization is .
Space complexity:
Serialization : . The space occupied by the serialization helper function is through recursion stack and the final string that is produced. We know the size of the final string to be . So, that is one part of the space complexity. The other part is the one occupied by the recursion stack which is . Overall, the space is .
Deserialization : . For deserialization, the space occupied is by the recursion stack only. We don't use any other intermediate data structures like we did in the previous approach and simply rely on the information in the string and recursion to work it's magic. So, the overall space complexity would be .
Where is the number of nodes in the tree.
We can serialize the tree level by level using BFS. We use two sentinel markers: '#' to indicate the end of a level, and '$' to indicate switching to a different parent's children on the same level. During deserialization, we track nodes at the current and previous levels, using the sentinels to know when to move to the next level or switch parents.
Serialization:
Deserialization:
class Codec:
def _serializeHelper(self, root, serializedList):
queue = collections.deque()
queue.append(root)
queue.append(None)
while queue:
# Pop a node
node = queue.popleft()
# If this is an "endNode", we need to add another one
# to mark the end of the current level unless this
# was the last level.
if (node == None):
# We add a sentinal value of "#" here
serializedList.append("#")
if queue:
queue.append(None)
elif node == "C":
# Add a sentinal value of "$" here to mark the switch to a
# different parent.
serializedList.append("$")
else:
# Add value of the current node and add all of it's
# children nodes to the queue. Note how we convert
# the integers to their corresponding ASCII counterparts.
serializedList.append(chr(node.val + 48))
for child in node.children:
queue.append(child)
# If this not is NOT the last one on the current level,
# add a childNode as well since we move on to processing
# the next node.
if queue[0] != None:
queue.append("C")
def serialize(self, root: 'Node') -> str:
if not root:
return ""
serializedList = []
self._serializeHelper(root, serializedList)
return "".join(serializedList)
def _deserializeHelper(self, data, rootNode):
# We move one level at a time and at every level, we need access
# to the nodes on the previous level as well so that we can form
# the children arrays properly. Hence two arrays.
prevLevel, currentLevel = collections.deque(), collections.deque()
currentLevel.append(rootNode)
parentNode = rootNode
# Process the characters in the string one at a time.
for i in range (1, len(data)):
if data[i] == "#":
# Special processing for end of level. We need to swap the
# array lists. Here, we simply re-initialize the "currentLevel"
# arraylist rather than clearing it.
prevLevel = currentLevel
currentLevel = collections.deque()
# Since we move one level down, we take the parent as the first
# node on the current level.
parentNode = prevLevel.popleft() if prevLevel else None
else:
if data[i] == "$":
# Special handling for change in parent on the same level
parentNode = prevLevel.popleft() if prevLevel else None
else:
childNode = Node(ord(data[i]) - 48, [])
currentLevel.append(childNode)
parentNode.children.append(childNode)
def deserialize(self, data: str) -> 'Node':
if not data:
return None
rootNode = Node(ord(data[0]) - 48, [])
self._deserializeHelper(data, rootNode)
return rootNodeTime complexity:
Serialization : . For every node, we add 2 different values to the final string and every node is processed exactly once. We add the value of the node itself and we also add the child switch sentinel. Also, for the nodes that end a particular level, we add the level end sentinel.
Deserialization : . For deserialization, we process the entire string, one character at a time and also construct the tree along the way. So, the overall time complexity for deserialization is .
Space complexity:
Serialization : . The space occupied by the serialization helper function is through the queue and the final string that is produced. We know the size of the final string to be . So that is one part of the space complexity. The other part is the one occupied by the queue which is . Overall, the space is .
Deserialization : . For deserialization, the space is mostly occupied by the two lists that we use. The space complexity there is . Note that when we re-initialize a list, the memory that was allocated earlier is deallocated by the garbage collector and it's essentially equal to a single list of size .
Where is the number of nodes in the tree.
Unlike binary trees where each node has exactly two children slots, n-ary trees have variable numbers of children. Failing to encode how many children each node has makes it impossible to correctly reconstruct the tree during deserialization. Each serialization approach must capture this information somehow (explicit count, sentinels, or parent references).
When converting node values to single characters using ASCII offsets (like chr(val + 48)), values outside the valid range cause issues. This approach only works for small, non-negative integers. Larger values or negative numbers require a different encoding strategy with proper delimiters.
When creating nodes during deserialization, forgetting to initialize the children list (or initializing it as null instead of an empty list) causes null pointer exceptions when adding children later. Always initialize the children collection before attempting to add child nodes.
In the parent-child ID approach, mixing up which ID refers to the parent versus the child node leads to a completely incorrect tree structure. The serialized data must clearly distinguish between a node's own identity and its parent's identity.
When using sentinels to mark the end of children (like #), forgetting to increment the index after consuming the sentinel causes the deserializer to get stuck or skip actual node values. The index must advance both when creating nodes and when encountering sentinels.