Trie
The Trie, also known as a prefix tree, is a tree-like data structure used for efficient retrieval of keys with common prefixes. It is commonly used for tasks like searching for words in a dictionary, autocomplete functionality, and storing sets of strings.
In a Trie, each node represents a character, and the edges represent the connections between characters. The root node represents an empty string, and each subsequent node represents a prefix formed by appending a character to the previous node's prefix.
The Trie data structure supports efficient insertion, deletion, and search operations, making it a suitable choice for various string-related problems.
Implementation of Trie in Python
Let's implement the Trie data structure in Python:
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
current = self.root
for char in word:
if char not in current.children:
current.children[char] = TrieNode()
current = current.children[char]
current.is_end_of_word = True
def search(self, word):
current = self.root
for char in word:
if char not in current.children:
return False
current = current.children[char]
return current.is_end_of_word
def startsWith(self, prefix):
current = self.root
for char in prefix:
if char not in current.children:
return False
current = current.children[char]
return True
In the above code, we define two classes: TrieNode
and Trie
. The TrieNode
class represents a single node in the Trie, and the Trie
class is the main Trie data structure.
The TrieNode
class has two properties: children
, which is a dictionary to store the child nodes, and is_end_of_word
, which indicates whether the node represents the end of a word.
The Trie
class has three methods:
insert(word)
: Inserts a word into the Trie by traversing through each character and creating new nodes as needed.search(word)
: Searches for a word in the Trie by traversing through each character and checking if the word exists in the Trie.startsWith(prefix)
: Checks if any word in the Trie starts with the given prefix by traversing through each character and checking if the prefix exists in the Trie.
Now, let's solve a question from LeetCode using the Trie data structure.
Problem: Implement Trie (Prefix Tree)
Implement a Trie with the following methods:
insert(word)
: Inserts a word into the Trie.search(word)
: Returns true if the word is in the Trie.startsWith(prefix)
: Returns true if there is any word in the Trie that starts with the given prefix.
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
current = self.root
for char in word:
if char not in current.children:
current.children[char] = TrieNode()
current = current.children[char]
current.is_end_of_word = True
def search(self, word):
current = self.root
for char in word:
if char not in current.children:
return False
current = current.children[char]
return current.is_end_of_word
def startsWith(self, prefix):
current = self.root
for char in prefix:
if char not in current.children:
return False
current = current.children[char]
return True
# Create a Trie object
trie = Trie()
# Insert words into the Trie
trie.insert("apple")
trie.insert("banana")
trie.insert("orange")
# Search for words in the Trie
print(trie.search("apple")) # Output: True
print(trie.search("orange")) # Output: True
print(trie.search("grape")) # Output: False
# Check if words start with a prefix
print(trie.startsWith("app")) # Output: True
print(trie.startsWith("ora")) # Output: True
print(trie.startsWith("ban")) # Output: False
In the above code, we create a Trie
object and insert three words: "apple", "banana", and "orange". Then, we perform search and startsWith operations to test the functionality of the Trie.
Time Complexity The time complexity of inserting a word into the Trie is O(k), where k is the length of the word. The time complexity of searching for a word or a prefix in the Trie is O(k), where k is the length of the word or prefix.
Space Complexity The space complexity of the Trie depends on the number of unique characters in the input words and the number of words inserted. Let n be the total number of characters and m be the number of words inserted.
- The space complexity of the Trie: O(n)
- Total space complexity: O(n)
Let's consider the following problem as an example:
Problem: Word Search Given an m x n board and a word, find if the word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring. The same letter cell may not be used more than once.
Example:
board = [ ['A', 'B', 'C', 'E'],
['S', 'F', 'C', 'S'],
['A', 'D', 'E', 'E']
]
word = "ABCCED"
Solution:
We can solve this problem using a depth-first search (DFS) algorithm and a Trie data structure. First, we'll implement the Trie data structure, and then we'll use it to solve the Word Search problem.
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
current = self.root
for char in word:
if char not in current.children:
current.children[char] = TrieNode()
current = current.children[char]
current.is_word = True
class Solution:
def exist(self, board, word):
# Build the Trie
trie = Trie()
trie.insert(word)
rows, cols = len(board), len(board[0])
visited = [[False] * cols for _ in range(rows)]
# Helper function for DFS
def dfs(row, col, node):
# Check if the current node represents a complete word
if node.is_word:
return True
# Check for out-of-bounds or visited cells
if row < 0 or row >= rows or col < 0 or col >= cols or visited[row][col]:
return False
# Check if the current character matches the Trie node
char = board[row][col]
if char not in node.children:
return False
# Mark the cell as visited
visited[row][col] = True
# Recursively check the neighbors
for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
if dfs(row + dx, col + dy, node.children[char]):
return True
# Unmark the cell as visited
visited[row][col] = False
return False
# Perform DFS for each cell on the board
for row in range(rows):
for col in range(cols):
if dfs(row, col, trie.root):
return True
return False
In the above code, we define the TrieNode
and Trie
classes, similar to the previous explanation. Then, we define the Solution
class, which contains the exist
method to solve the Word Search problem.
The exist
method performs a depth-first search (DFS) on each cell of the board and checks if a valid word exists starting from that cell. The dfs
helper function is recursively called to explore the neighboring cells and match the characters with the Trie nodes.
Time Complexity:
- Building the Trie: O(L), where L is the total number of characters in the words.
- Performing DFS on the board: O(M * N * 4^L), where M is the number of rows, N is the number of columns, and L is the length of the longest word.
Space Complexity:
- Trie: O(L), where L is the total number of characters in the words.
- Visited array: O(M * N), where M is the number of rows and N is the number of columns.
Overall, the time and space complexity of the solution depend on the size of the input board and the length of the words in the Trie.