Graph Theory: Search and Traversal

0. Graph Traversal

Breadth-First Search (BFS) and Depth-First Search (DFS) are two of the most commonly used graph traversal methods.

The traversal of a graph, whether BFS or DFS, involves two main concepts: visiting a node and exploring a node. Exploration refers to visiting all the children/adjacent nodes.

Among BFS and DFS, Depth-First Search is more intuitive to perform, so let's first explore DFS to set a clear standpoint on what BFS is not.

1. Depth First Search

Depth-First is the process of traversing (visiting and exploring) down the graph until we get to a leaf node or a cycle (re-visiting a node that's already explored). Every time we encounter one of these conditions, we head back to the last parent node (previous level node) and explore an adjacent node (until leaf or cycle) and repeat the process.

In other words: traverse through the tree by visiting all of the children, grandchildren, great-grandchildren (and so on) until the end of a path, only then traverse a level back to start a new path.

Explanation of the above example:

  • Starting with Vertex A - start exploration, say, we go to Node B
  • Node A has two other adjacent vertices, but in DFS, we go depth-first
  • Further exploring the visited vertex B, head to vertex C
  • Cannot further explore Node C as it's a leaf node - hence, Node C is completely explored
  • Head back to its parent (back-track prior level) and explore the next adjacent node, Vertex D
  • Similarly, Vertex E. Now that all adjacent nodes of Vertex B are already explored, head back a level again (Back to Node A)
  • Visit F, explore F; head back to Node A. Visit G, explore G; head back to Node A. DFS is now complete
  • Order of visiting nodes: A, B, C, D, E, F, G

Notice that in scenarios where there are more than one adjacent node, we choose the next node to explore at random, and hence there are several paths to traverse using DFS. Defining specific rules for which node to explore next brings up the topic of new strategies in DFS (In the case for Trees: Pre-order, In-order and Post-order traversal).

1.1. DFS: Detecting Cycles

In the previous example, we understood to back-track when we reach a leaf node. Taking an example with a graph this time to cover the "detecting a cycle" scenario, i.e., visiting a node that was previously visited.

I have highlighted when re-visiting Node G (Slide #6), followed by back-tracking and visiting Node J. Again, this is one particular Depth First Search traversal, but it can be done in many other ways by choosing a different "next" node to visit (at every explore step).

1.2. DFS: Implementation

The core of the solution is to find a way to back-track and head on a different path when encountering two scenarios: reaching a dead-end (leaf node) and reaching an already visited node (cycle).

1.2.1. DFS: Stack

The intuition behind using a stack is that when we reach a dead-end, we want to get to the previously added node (LIFO: Last-In First-Out) and explore other paths. This helps you explore each path deeply before backtracking, done using a stack to go back to the last node.

Easier to understand with visualization:

The key points to notice here are the stack pop operations. On reaching node D, a leaf node, pop() to explore other paths, i.e., Node E. Similarly, Node E is a leaf node, so pop() to head back and explore Node C.

Step Action Stack State Visited Nodes
1 Push A [A] {}
2 Pop A, Push C, B [C, B] {A}
3 Pop B, Push E, D [C, E, D] {A, B}
4 Pop D [C, E] {A, B, D}
5 Pop E [C] {A, B, D, E}
6 Pop C, Push G, F [G, F] {A, B, D, E, C}
7 Pop F [G] {A, B, D, E, C, F}
8 Pop G [] {A, B, D, E, C, F, G}

Note: When visiting a node, add all adjacent nodes to the stack to ensure all possible paths from the current node are explored. This is essential for DFS to correctly traverse the entire graph.

Pseudo Code: Wrapping it all up with 10 lines of code

DFS-Iterative(graph, start):
    let stack be a stack
    let visited be a set
    stack.push(start)
    
    while stack is not empty:
        node = stack.pop()
        if node is not in visited:
            visit(node)
            visited.add(node)
            for each neighbor of node in graph (Optional: reverse order):
                if neighbor is not in visited:
                    stack.push(neighbor)

1.2.2. DFS: Recursion

The recursion solution is quite similar to the above stack solution, where we rely on the call stack as opposed to a user-defined stack.

There's a small difference (in traversal order). In the recursive solution, you handle each node when you see it. Thus, the first node you handle is the first child.

Whereas in an iterative approach, you first insert all the elements into the stack and then handle the head of the stack (which is the last node inserted). Thus, the first node you handle is the last child.

Step Action Call Stack State Visited Nodes
1 Call on A [A] {}
2 Visit A, Call on B [A, B] {A}
3 Visit B, Call on D [A, B, D] {A, B}
4 Visit D, Return from D [A, B] {A, B, D}
5 Call on E [A, B, E] {A, B, D}
6 Visit E, Return from E [A, B] {A, B, D, E}
7 Return from B, Call on C [A, C] {A, B, D, E}
8 Visit C, Call on F [A, C, F] {A, B, D, E, C}
9 Visit F, Return from F [A, C] {A, B, D, E, C, F}
10 Call on G [A, C, G] {A, B, D, E, C, F}
11 Visit G, Return from G [A, C] {A, B, D, E, C, F, G}
12 Return from C [A] {A, B, D, E, C, F, G}
13 Return from A [] {A, B, D, E, C, F, G}

Pseudo Code: now down to 5 lines of code

DFS-Recursive(node, visited):
    if node is not in visited:
        visit(node)
        visited.add(node)
        for each neighbor of node:
            DFS-Recursive(neighbor, visited)

Note: if you want the user-defined stack solution to yield the same result as the recursive solution, you need to add elements to the stack in reverse order. For each node, insert its last child first and its first child last.

2. Breath First Search

Also called Level Order Search. Compared to DFS, exploring in BFS is level-by-level (or in layers); i.e., start with a node, explore an adjacent node (without deep diving till leaf) - repeat until all adjacent nodes are visited; then, choose an adjacent node (child), explore a level down - until its adjacent nodes are also explored; repeat the process.

In other words: traverse through one entire level of children nodes first before moving on to traverse through the grandchildren nodes. Repeat: traverse through an entire level of grandchildren nodes before going on to traverse through great-grandchildren nodes.

Explanation of the above example:

  • Starting with vertex A (Visit A) - start exploration of all adjacent vertices.
  • Explore adjacent nodes in any order, in this case: Node B, followed by F and G.
  • Cannot explore any further, as all adjacent nodes/children are visited.
  • Explore any one of the children, say Node B, and visit all the adjacent nodes of B: E, C, and D (in any order).
  • Again, cannot explore further, as all children are visited.
  • Similar to Node B, explore Node G and F (nothing to explore). BFS is now complete.
  • Order of visiting nodes: A, B, F, G, E, C, D

2.1 BFS: Implementation

Similar to DFS, we need to know if a node is "visited" in order to prevent cycles, i.e., re-visiting a node. Typically, BFS is implemented using a queue (FIFO: First-In First-Out) data structure. I wouldn't necessarily say that it's impossible to solve it with a stack, but it's definitely not conventional and introduces complexity.

Fun Fact: in the worst-case scenario (for Trees), a stack-based BFS performs better than a queue-based BFS. I'll explain more on this in a different post dedicated to Trees.

One important observation in BFS is that we add nodes that we have discovered but not yet visited to the queue, and come back to (visit) them later. With the source node (or root node) in the queue, the process is to visit a node (dequeue), add all the children/adjacent nodes to the queue (enqueue), and repeat the process.

Step Action Queue State Visited Nodes
1 Enqueue A [A] {}
2 Dequeue A, Enqueue B, C [B, C] {A}
3 Dequeue B, Enqueue D, E [C, D, E] {A, B}
4 Dequeue C, Enqueue F, G [D, E, F, G] {A, B, C}
5 Dequeue D [E, F, G] {A, B, C, D}
6 Dequeue E [F, G] {A, B, C, D, E}
7 Dequeue F [G] {A, B, C, D, E, F}
8 Dequeue G [] {A, B, C, D, E, F, G}

The intuition to follow along is: Queues follow the first-in, first-out (FIFO) principle, which means that whatever was enqueued first is the first item that will be read and removed from the queue.

Pseudo Code:

BFS(graph, start):
    let queue be a queue
    let visited be a set
    queue.enqueue(start)
    
    while queue is not empty:
        node = queue.dequeue()
        if node is not in visited:
            visit(node)
            visited.add(node)
            for each neighbor of node in graph:
                if neighbor is not in visited:
                    queue.enqueue(neighbor)

I hate to be the person who uses a tree to explain a graph. Reminds me of the physics class at school, where the lectures and exams are miles apart! So, here is the visualization of BFS for a graph:

In the Breadth-First Search (BFS) for a graph, the same element might be added to the queue multiple times in the presence of cycles (i.e. same nodes can be visited from multiple nodes). However, it will be ignored later based on the visited check. In the above graph BFS visualization, I have skipped adding the same element into the queue and indicated it with arrows (from other node(s)) instead.

This can be prevented by: searching the entire queue (increasing time complexity), using another hashtable to track enqueued nodes (increasing space complexity), or slightly optimized with tail checks.

3. Conclusion

Both Breadth-First Search (BFS) and Depth-First Search (DFS) have a lot of applications and come up way too often when dealing with graphs.

BFS is the first that pops up when finding the shortest path in an unweighted graph. DFS has tons of use-cases—be it computing a graph's minimum spanning tree, detecting cycles in a graph, checking if a graph is bipartite, finding bridges, articulation points, strongly connected components, topologically sorting a graph, and many more. BFS and DFS can often be used interchangeably.

4. References

[1] Wikipedia contributors, "Depth-first search," Wikipedia, The Free Encyclopedia. [Online]. Available: https://en.wikipedia.org/wiki/Depth-first_search.
[2] Wikipedia contributors, "Breadth-first search," Wikipedia, The Free Encyclopedia. [Online]. Available: https://en.wikipedia.org/wiki/Breadth-first_search.
[3] Abdul Bari, "Graph Traversals - BFS & DFS -Breadth First Search and Depth First Search," YouTube. [Online]. Available: https://youtu.be/pcKY4hjDrxk.
[4] Pravin Kumar Sinha, "Stack-based breadth-first search tree traversal," IBM Developer. [Online]. Available: https://developer.ibm.com/articles/au-aix-stack-tree-traversal/.
[5] W. Fiset, "Algorithms repository," GitHub, 2017. [Online]. Available: https://github.com/williamfiset/Algorithms.

Cite this article as: Adesh Nalpet Adimurthy. (Jul 17, 2024). Graph Theory: Search and Traversal. PyBlog. https://www.pyblog.xyz/graph-traversal

#index table of contents