Graph Theory Algorithms
A graph G = (V, E) is a set V of vertices and a set E of edges. Each edge e ∈ E is associated with two vertices u and v from V, and we write e = (u, v). We say that u is adjacent to v, u is incident to v, and u is a neighbor of v.
Graphs are a common abstraction to represent data. Some examples include road networks, where the vertices are cities and there is an edge between any two cities that share a highway; protein interaction networks, where the vertices are proteins and the edges represent interactions between proteins; and social networks, where the nodes are people and the edges represent friends. Sometimes we want to associate a direction with the edges to indicate a one-way relationship. For example, consider a predator-prey network where the vertices are animals and an edge represents that one animal hunts the other. We want to express that the fox hunts the mouse but not the other way around.
A directed graph G = (V, E) is a set V of vertices and set E of edges. Each edge e ∈ E is an ordered pair of vertices from V. We denote the edge from u to v by (u, v) and say that u points to v. This could be useful, for example, in making a graph representation of a road network where some streets were one-way.
A weighted graph G = (V, E, w) is a graph (V, E) with an associated weight function w: E → R. In other words, each edge e has an associated weight w(e). This could be useful in incorporating lengths of roads in a graph representation of a road network.
In an undirected graph, the degree of a node u ∈ V is the number of edges e = (u, v) ∈ E. We will write du to denote the degree of node u.
Common Graph Theory Algorithms
Depth First Search
In depth-first search, we explore a graph by starting at a vertex and then following a path of unexplored vertices away from this vertex as far as possible (and repeating).
Given the explore routine, we have a very simple DFS algorithm. We present it in Algorithm 8. The cost of DFS is O(|V | + |E|). We only call Explore() on a given node once. Thus, we only consider the neighbors of each node once. Therefore, we only look at the edges of a given node once.
Breadth-First Search
With DFS, we went as far along a path away from the starting node as we could. With breadth-first-search (BFS), we instead look at all neighbors first, then neighbors of neighbors, and so on. For BFS, we use a queue data structure.
Minimum Spanning Tree
A spanning tree of an undirected graph G = (V, E) is a tree T = (V, E0 ), where E0 ⊂ E. The minimum spanning tree T = (V, E0, w) of a weighted, connected undirected graph G = (V, E, w) is a spanning tree where the sum of the edge weights of E0 is minimal.
Kruskal’s algorithm
- Sort all the edges in non-decreasing order of their weight.
- Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If the cycle is not formed, include this edge. Else, discard it.
- Repeat step 2 until there are (V-1) edges in the spanning tree.
Prim’s algorithm
- Create a set mstSet that keeps track of vertices already included in MST.
- Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign key value as 0 for the first vertex so that it is picked first.
- While mstSet doesn’t include all vertices
- Pick a vertex u which is not there in mstSet and has a minimum key value.
- Include u to mstSet.
- Update the key value of all adjacent vertices of u. To update the key values, iterate through all adjacent vertices. For every adjacent vertex v, if the weight of edge u-v is less than the previous key value of v, update the key value as the weight of u-v
Kosaraju’s Algorithm
A directed graph is strongly connected if there is a path between all pairs of vertices. A strongly connected component (SCC) of a directed graph is a maximal strongly connected subgraph.
We can find all strongly connected components in O(V+E) time using Kosaraju’s Algorithm
1) Create an empty stack ‘S’ and do a DFS traversal of a graph. In DFS traversal, after calling recursive DFS for adjacent vertices of a vertex, push the vertex to stack. In the above graph, if we start DFS from vertex 0, we get vertices in the stack as 1, 2, 4, 3, 0.
2) Reverse directions of all arcs to obtain the transpose graph.
3) One by one pop a vertex from S while S is not empty. Let the popped vertex be ‘v’. Take v as source and do DFS (call DFSUtil(v)). The DFS starting from v prints strongly connected component of v. In the above example, we process vertices in order 0, 3, 4, 2, 1 (One by one popped from stack)
Hungarian algorithm
This utilizes the following theorem for polynomial runtime complexity (worst case O(n3)) and guaranteed optimality:
If a number is added to or subtracted from all of the entries of any one row or column of a cost matrix, then an optimal assignment for the resulting cost matrix is also an optimal assignment for the original cost matrix.
We reduce our original weight matrix to contain zeros, by using the above theorem. We try to assign tasks to agents such that each agent is doing only one task and the penalty incurred in each case is zero.
Core of the algorithm (assuming square matrix):
- For each row of the matrix, find the smallest element and subtract it from every element in its row.
- Do the same (as in step 1) for all columns.
- Cover all zeros in the matrix using a minimum number of horizontal and vertical lines.
- If the minimum number of covering lines is n, an optimal assignment is possible and we are finished. Else if lines are lesser than n, we haven’t found the optimal assignment and must proceed to step 5.
- Determine the smallest entry not covered by any line. Subtract this entry from each uncovered row, and then add it to each covered column. Return to step 3.
Kahn’s Algorithm of Topological Sorting
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a graph is not possible if the graph is not a DAG.
Algorithm: Steps involved in finding the topological ordering of a DAG:
Step-1: Compute in-degree (number of incoming edges) for each of the vertex present in the DAG and initialize the count of visited nodes as 0.
Step-2: Pick all the vertices with in-degree as 0 and add them into a queue (Enqueue operation)
Step-3: Remove a vertex from the queue (De queue operation) and then.
- Increment count of visited nodes by 1.
- Decrease in-degree by 1 for all its neighboring nodes.
- If the in-degree of neighboring nodes is reduced to zero, then add it to the queue.
Step 4: Repeat Step 3 until the queue is empty.
Graphs can be used to model many types of relations and processes in physical, biological, social, and information systems. Many practical problems can be represented by graphs. Emphasizing their application to real-world systems, the term network is sometimes defined to mean a graph in which attributes (e.g. names) are associated with the vertices and edges, and the subject that expresses and understands the real-world systems as a network is called network science.