Warm-up Coding Interview: Weighted Graph Algorithms

There is an alternate universe of problems for weighted graphs. The edges of road networks are naturally bound to numerical values such as construction cost, traversal time, length, or speed limit. Identifying the shortest path in such graphs proves more complicated than breadth-first search in unweighted graphs, but opens the door to a wide range of applications.

The graph data structure below supports edge-weighted graphs:

[cce_c]
typedef struct {
    edgenode *edges[MAXV+1]; /* adjacency info */
    int degree[MAXV+1];      /* outdegree of each vertex */
    int nvertices;           /* number of vertices in graph */
    int nedges;              /* number of edges in graph */
    int directed;            /* is the graph directed? */
} graph;
[/cce_c]

Each [cci]edgenode[/cci] is a record containing three fields, the first describing the second endpoint of the edge ([cci]y[/cci]), the second enabling us to annotate the edge with a weight ([cci]weight[/cci]), and the third pointing to the next edge in the list ([cci]next[/cci]):

[cce_c]
typedef struct edgenode {
    int y;                 /* adjacent vertex */
    int weight;            /* edge weight */
    struct edgenode *next; /* next edge in list */
} edgenode;
[/cce_c]

We now describe several sophisticated algorithms using this data structure, including minimum spanning trees, shortest paths, and maximum flows. That these optimization problems can be solved efficiently is quite worthy of our respect.

Minimum Spanning Trees

A spanning tree of a graph G = (V, E) is a subset of edges from E forming a tree connecting all vertices of V. For edge-weighted graphs, we are particularly interested in the minimum spanning tree – the spanning tree whose sum of edge weights is as small as possible.

Minimum spanning trees are the answer whenever we need to connect a set of points (representing cities, homes, junctions, or other locations) by the smallest amount of roadway, wire, or pipe. Any tree is the smallest possible connected graph in terms of number of edges, while the minimum spanning tree is the smallest connected graph in terms of edge weight.

A minimum spanning tree minimizes the total length over all possible spanning trees. However, there can be more than one minimum spanning tree in a graph. Indeed, all spanning trees of an unweighted (or equally weighted) graph G are minimum spanning trees, since each contains exactly n-1 equal-weight edges. Such a spanning tree can be found using depth-first or breadth-first search. Finding a minimum spanning tree is more difficult for general weighted graphs, but two different algorithms are presented below. Both demonstrate the optimality of certain greedy heuristics.

Prim’s Algorithm

Prim’s minimum spanning tree algorithm starts from one vertex and grows the rest of the tree one edge at a time until all vertices are included.

Greedy algorithms make the decision of what to do next by selecting the best local option from all available choices without regard to the global structure. Since we seek the tree of minimum weight, the natural greedy algorithm for minimum spanning tree repeatedly selects the smallest weight edge that will enlarge the number of vertices in the tree.


Prim’s algorithm grows the minimum spanning tree in stages, starting from a given vertex. At each iteration, we add one new vertex into the spanning tree. A greedy algorithm suffices for correctness: we always add the lowest-weight edge linking a vertex in the tree to a vertex on the outside. The simplest implementation of this idea would assign each vertex a Boolean variable denoting whether it is already in the tree (the array [cci]intree[/cci] in the code below), and then search all edges at each iteration to find the minimum-weight edge with exactly one [cci]intree[/cci] vertex.

Our implementation is somewhat smarter. It keeps track of the cheapest edge linking every nontree vertex to the tree.

[cce_c]
void prim(graph *g, int start) {
    int i;                /* counter */
    edgenode *p;          /* temporary pointer */
    bool intree[MAXV+1];  /* is the vertex in the tree yet? */
    int distance[MAXV+1]; /* cost of adding to tree */
    int v;                /* current vertex to process */
    int w;                /* candidate next vertex */
    int weight;           /* edge weight */
    int dist;             /* best current distance from start */

    for (i = 1; i <= g->nvertices; i++) {
        intree[i] = FALSE;
        distance[i] = MAXINT;
        parent[i] = -1;
    }

    distance[start] = 0;
    v = start;

    while (intree[v] == FALSE) {
        intree[v] = TRUE;
        p = g->edges[v];
        while (p != NULL) {
            w = p->y;
            weight = p->weight;
            if ((distance[w] > weight) && (intree[w] == FALSE)) {
                distance[w] = weight;
                parent[w] = v;
            }
            p = p->next;
        }

        v = 1;
        dist = MAXINT;
        for (i = 1; i <= g->nvertices; i++)
            if ((intree[i] == FALSE) && (dist > distance[i])) {
                dist = distance[i];
                v = i;
            }
    }
}
[/cce_c]

A naive implementation of Prim’s algorithm makes n iterations, sweeping through all m edges on each iteration – yielding an O(mn) algorithm.

But our implementation avoids the need to test all m edges on each pass. It only considers the ≤ n cheapest known edges represented in the [cci]parent[/cci] array, plus the ≤ n edges out of the new tree vertex v used to update [cci]parent[/cci]. By maintaining a Boolean flag along with each vertex to denote whether it is in the tree, we can test whether the current edge joins a tree vertex with a non-tree vertex in constant time.

The result is an [katex]O(n^2)[/katex] implementation of Prim’s algorithm, and a good illustration of the power of data structures to speed up algorithms. In fact, more sophisticated priority-queue data structures lead to an [katex]O(m + n \lg n)[/katex] implementation, by making it faster to find the minimum-cost edge to expand the tree at each iteration.

The minimum spanning tree itself (or its cost) can be reconstructed in two different ways. The simplest method is to augment this procedure with statements that print the edges as they are found, or total the weight of all selected edges. Alternately, the tree topology is encoded by the [cci]parent[/cci] array, so it plus the original graph describes everything about the minimum spanning tree.

Kruskal’s Algorithm

Kruskal’s algorithm is an alternate approach to finding minimum spanning trees that proves more efficient on sparse graphs. Like Prim’s, Kruskal’s algorithm is greedy. Unlike Prim’s, it does not start with a particular vertex.

Kruskal’s algorithm builds up connected components of vertices, culminating in the minimum spanning tree. Initially, each vertex forms its own separate component in the tree-to-be. The algorithm repeatedly considers the lightest remaining edge and tests whether its two endpoints lie within the same connected component. If so, this edge will be discarded, because adding it would create a cycle in the tree-to-be. If the endpoints are in different components, we insert the edge and merge the two components into one. Since each connected component is always a tree, we need never explicitly test for cycles.

What is the time complexity of Kruskal’s algorithm? Sorting the m edges takes O(m lg m) time. The for loop makes m iterations, each testing the connectivity of two trees plus an edge. In the most simple-minded approach, this can be implemented by breadth-first or depth-first search in a sparse graph with at most n edges and n vertices, yielding an O(mn) algorithm.

However, a faster implementation results if we can implement the component test in better than O(n) time. With the union-find data structure described below, Kruskal’s algorithm runs in O(m lg m) time, which is faster than Prim’s for sparse graphs. Observe again the impact that the right data structure can have when implementing a straightforward algorithm.


The implementation of the main routine follows fairly directly from the pseudocode:

[cce_c]
void kruskal(graph *g) {
    int i;                /* counter */
    set_union s;          /* set union data structure */
    edge_pair e[MAXV+1];  /* array of edges data structure */
    int weight_compare(); /* comparator: orders edges by weight */

    set_union_init(&s, g->nvertices);
    to_edge_array(g, e);

    /* sort edges by increasing cost */
    qsort(e, g->nedges, sizeof(edge_pair), weight_compare);

    for (i = 0; i < g->nedges; i++) {
        if (!same_component(&s, e[i].x, e[i].y)) {
            printf("edge (%d,%d) in MST\n", e[i].x, e[i].y);
            union_sets(&s, e[i].x, e[i].y);
        }
    }
}
[/cce_c]

The Union-Find Data Structure

A set partition is a partitioning of the elements of some universal set (say the integers 1 to n) into a collection of disjoint subsets. Thus, each element must be in exactly one subset. Set partitions naturally arise in graph problems such as connected components (each vertex is in exactly one connected component) and vertex coloring.

The connected components in a graph can be represented as a set partition. For Kruskal’s algorithm to run efficiently, we need a data structure that efficiently supports the following operations:

  • Same component(v1, v2) – Do vertices v1 and v2 occur in the same connected component of the current graph?
  • Merge components(C1, C2) – Merge the given pair of connected components into one component in response to an edge between them.

The two obvious data structures for this task each support only one of these operations efficiently. Explicitly labeling each element with its component number enables the same component test to be performed in constant time, but updating the component numbers after a merger would require linear time. Alternately, we can treat the merge components operation as inserting an edge in a graph, but then we must run a full graph traversal to identify the connected components on demand.

The union-find data structure represents each subset as a “backwards” tree, with pointers from a node to its parent. Each node of this tree contains a set element, and the name of the set is taken from the key at the root. For reasons that will become clear, we will also maintain the number of elements in the subtree rooted at each vertex v:

[cce_c]
typedef struct {
    int p[SET_SIZE+1];    /* parent element */
    int size[SET_SIZE+1]; /* number of elements in subtree i */
    int n;                /* number of elements in set */
} set_union;
[/cce_c]

We implement our desired component operations in terms of two simpler operations, union and find:

  • Find(i) – Find the root of the tree containing element i, by walking up the parent pointers until there is nowhere to go. Return the label of the root.
  • Union(i,j) – Link the root of one of the trees (say the one containing i) to the root of the tree containing the other (say j), so that find(i) now equals find(j).

We seek to minimize the time it takes to execute any sequence of unions and finds. Tree structures can be very unbalanced, so we must limit the height of our trees. Our most obvious means of control is the decision of which of the two component roots becomes the root of the combined component on each union.

To minimize the tree height, it is better to make the smaller tree the subtree of the bigger one. Why? The heights of all the nodes in the root’s subtree stay the same, while the heights of the nodes merged into this tree all increase by one. Thus, merging the smaller tree leaves the height unchanged for the larger set of vertices.


The implementation details are as follows:

[cce_c]
void set_union_init(set_union *s, int n) {
    int i; /* counter */

    for (i = 1; i <= n; i++) {
        s->p[i] = i;
        s->size[i] = 1;
    }
    s->n = n;
}

int find(set_union *s, int x) {
    if (s->p[x] == x)
        return(x);
    else
        return( find(s, s->p[x]) );
}

void union_sets(set_union *s, int s1, int s2) {
    int r1, r2; /* roots of the two sets */

    r1 = find(s, s1);
    r2 = find(s, s2);

    if (r1 == r2) return; /* already in same component */

    if (s->size[r1] >= s->size[r2]) {
        s->size[r1] = s->size[r1] + s->size[r2];
        s->p[r2] = r1;
    } else {
        s->size[r2] = s->size[r1] + s->size[r2];
        s->p[r1] = r2;
    }
}

bool same_component(set_union *s, int s1, int s2) {
    return ( find(s, s1) == find(s, s2) );
}
[/cce_c]


On each union, the tree with fewer nodes becomes the child. But how tall can such a tree get as a function of the number of nodes in it? A node’s height increases only when its tree is merged into a tree at least as large, so the number of nodes in its tree at least doubles with each height increase – meaning the height can never exceed lg n.

Variations on Minimum Spanning Trees

The minimum spanning tree algorithm has several interesting properties that help solve several closely related problems:

  • Maximum Spanning Trees
  • Minimum Product Spanning Trees
  • Minimum Bottleneck Spanning Trees

The minimum spanning tree of a graph is unique if all m edge weights in the graph are distinct.

There are two important variants of a minimum spanning tree that are not solvable with these techniques.

  • Steiner Tree
  • Low-degree Spanning Tree

Shortest Paths

The shortest path from s to t in an unweighted graph can be constructed using a breadth-first search from s. The minimum-link path is recorded in the breadth-first search tree, and it provides the shortest path when all edges have equal weight.

However, BFS does not suffice to find shortest paths in weighted graphs. The shortest weighted path might use a large number of edges, just as the shortest route (timewise) from home to office may involve complicated shortcuts using backroads.

Dijkstra’s Algorithm

Dijkstra’s algorithm is the method of choice for finding shortest paths in an edge and/or vertex-weighted graph. Given a particular start vertex s, it finds the shortest path from s to every other vertex in the graph, including your desired destination t.

Dijkstra’s algorithm proceeds in a series of rounds, where each round establishes the shortest path from s to some new vertex. Specifically, x is the vertex that minimizes [katex]dist(s, v_i) + w(v_i, x)[/katex] over all unfinished [katex]1 \leq i \leq n[/katex], where [katex]w(i, j)[/katex] is the length of the edge from i to j, and [katex]dist(i,j)[/katex] is the length of the shortest path between them.

This suggests a dynamic programming-like strategy. If (s,y) is the lightest edge incident to s, then this implies that dist(s,y) = w(s,y). Once we determine the shortest path to a node x, we check all outgoing edges of x to see whether there is a better path from s to some unknown vertex through x.

The basic idea is very similar to Prim’s algorithm. In each iteration, we add exactly one vertex to the tree of vertices for which we know the shortest path from s. As in Prim’s, we keep track of the best path seen to date for all vertices outside the tree, and insert them in order of increasing cost.

The difference between Dijkstra’s and Prim’s algorithms is how they rate the desirability of each outside vertex. In the minimum spanning tree problem, all we cared about was the weight of the next potential tree edge. In shortest path, we want to include the closest outside vertex (in shortest-path distance) to s. This is a function of both the new edge weight and the distance from s to the tree vertex it is adjacent to.


The pseudocode actually obscures how similar the two algorithms are. In fact, the change is very minor. Below, we give an implementation of Dijkstra’s algorithm based on changing exactly three lines from our Prim’s implementation – one of which is simply the name of the function!

[cce_c]
void dijkstra(graph *g, int start) {
    int i;                /* counter */
    edgenode *p;          /* temporary pointer */
    bool intree[MAXV+1];  /* is the vertex in the tree yet? */
    int distance[MAXV+1]; /* distance from start to each vertex */
    int v;                /* current vertex to process */
    int w;                /* candidate next vertex */
    int weight;           /* edge weight */
    int dist;             /* best current distance from start */

    for (i = 1; i <= g->nvertices; i++) {
        intree[i] = FALSE;
        distance[i] = MAXINT;
        parent[i] = -1;
    }

    distance[start] = 0;
    v = start;

    while (intree[v] == FALSE) {
        intree[v] = TRUE;
        p = g->edges[v];
        while (p != NULL) {
            w = p->y;
            weight = p->weight;
            if (distance[w] > (distance[v] + weight)) { /* CHANGED */
                distance[w] = distance[v] + weight;     /* CHANGED */
                parent[w] = v;
            }
            p = p->next;
        }

        v = 1;
        dist = MAXINT;
        for (i = 1; i <= g->nvertices; i++)
            if ((intree[i] == FALSE) && (dist > distance[i])) {
                dist = distance[i];
                v = i;
            }
    }
}
[/cce_c]

This algorithm finds more than just the shortest path from s to t. It finds the shortest path from s to all other vertices. This defines a shortest-path spanning tree rooted at s. For unweighted graphs, this would be the breadth-first search tree, but in general it provides the shortest path from s to all other vertices.


What is the running time of Dijkstra’s algorithm? As implemented here, the complexity is [katex]O(n^2)[/katex]. This is the same running time as a proper version of Prim’s algorithm; except for the extension condition, it is the same algorithm as Prim’s.

The length of the shortest path from [cci]start[/cci] to a given vertex [cci]t[/cci] is exactly the value of [cci]distance[t][/cci]. To find the actual path, we follow the backward [cci]parent[/cci] pointers from [cci]t[/cci] until we hit [cci]start[/cci], exactly as was done in the [cci]find_path()[/cci] routine.

Dijkstra’s algorithm works correctly only on graphs without negative-cost edges. The reason is that midway through the execution we may encounter an edge with a weight so negative that it changes the cheapest way to get from s to some vertex already in the tree.

Stop and Think: Shortest Path with Node Costs

All-Pairs Shortest Path

Floyd’s algorithm is best employed on an adjacency matrix data structure, which is no extravagance since we must store all n^2 pairwise distances anyway. Our [cci]adjacency_matrix[/cci] type allocates space for the largest possible matrix, and keeps track of how many vertices are in the graph:

[cce_c]typedef struct { int weight[MAXV+1][MAXV+1]; /* adjacency/weight info */ int nvertices; /* number of vertices in graph */} adjacency_matrix;[/cce_c]

The critical issue in an adjacency matrix implementation is how we denote the edges absent from the graph. A common convention for unweighted graphs denotes graph edges by 1 and non-edges by 0. This gives exactly the wrong interpretation if the numbers denote edge weights, for the non-edges get interpreted as a free ride between vertices. Instead, we should initialize each non-edge to [cci]MAXINT[/cci]. This way we can both test whether it is present and automatically ignore it in shortest-path computations, since only real edges will be used, provided [cci]MAXINT[/cci] is greater than the weighted diameter of your graph.

There are several ways to characterize the shortest path between two nodes in a graph. The Floyd-Warshall algorithm starts by numbering the vertices of the graph from 1 to n. We use these numbers not to label the vertices, but to order them. Define [katex]W[i,j]^k[/katex] to be the length of the shortest path from i to j using only vertices numbered from [katex]1,2,\dots,k[/katex] as possible intermediate vertices.

At each iteration, we allow a richer set of possible shortest paths by adding a new vertex as a possible intermediary. Allowing k as a stop helps only if there is a short path that goes through k, so

[katex]W[i,j]^k = \min(W[i,j]^{k-1}, W[i,k]^{k-1} + W[k,j]^{k-1})[/katex]

The correctness of this is somewhat subtle, and I encourage you to convince yourself of it. But there is nothing subtle about how simple the implementation is:

[cce_c]
void floyd(adjacency_matrix *g) {
    int i, j;      /* dimension counters */
    int k;         /* intermediate vertex counter */
    int through_k; /* distance through vertex k */

    for (k = 1; k <= g->nvertices; k++)
        for (i = 1; i <= g->nvertices; i++)
            for (j = 1; j <= g->nvertices; j++) {
                if ((g->weight[i][k] == MAXINT) || (g->weight[k][j] == MAXINT))
                    continue; /* avoid overflow through a non-edge */
                through_k = g->weight[i][k] + g->weight[k][j];
                if (through_k < g->weight[i][j])
                    g->weight[i][j] = through_k;
            }
}
[/cce_c]

The Floyd-Warshall all-pairs shortest path algorithm runs in [katex]O(n^3)[/katex] time, which is asymptotically no better than n calls to Dijkstra’s algorithm. However, the loops are so tight and the program so short that it runs better in practice. It is notable as one of the rare graph algorithms that work better on adjacency matrices than adjacency lists.

The output of Floyd’s algorithm, as it is written, does not enable one to reconstruct the actual shortest path between any given pair of vertices. These paths can be recovered if we retain a parent matrix P recording our choice of the last intermediate vertex used for each vertex pair (x, y). Say this value is k. The shortest path from x to y is the concatenation of the shortest path from x to k with the shortest path from k to y, which can be reconstructed recursively given the matrix P. Note, however, that most all-pairs applications need only the resulting distance matrix. These jobs are what Floyd’s algorithm was designed for.

Transitive Closure

Floyd’s algorithm has another important application, that of computing transitive closure. In analyzing a directed graph, we are often interested in which vertices are reachable from a given node.

The vertices reachable from any single node can be computed using breadth-first or depth-first searches. But the whole batch can be computed using an all-pairs shortest-path computation. If the shortest path from i to j remains [cci]MAXINT[/cci] after running Floyd’s algorithm, you can be sure no directed path exists from i to j.

Network Flow and Bipartite Matching

Edge-weighted graphs can be interpreted as a network of pipes, where the weight of edge (i,j) determines the capacity of the pipe. Capacities can be thought of as a function of the cross-sectional area of the pipe. The network flow problem asks for the maximum amount of flow that can be sent from vertex s to vertex t in a given weighted graph G while respecting the maximum capacities of each pipe.

Bipartite Matching

While the network flow problem is of independent interest, its primary importance lies in solving other important graph problems. A classic example is bipartite matching. A matching in a graph G = (V, E) is a subset of edges [katex]E' \subset E[/katex] such that no two edges of E' share a vertex.

Graph G is bipartite or two-colorable if the vertices can be divided into two sets, L and R, such that all edges in G have one vertex in L and one vertex in R. Many naturally defined graphs are bipartite. Matchings in these graphs have natural interpretations as job assignments or as marriages.

The largest bipartite matching can be readily found using network flow. Create a source node s that is connected to every vertex in L by an edge of weight 1. Create a sink node t and connect it to every vertex in R by an edge of weight 1. Finally, assign each edge in the bipartite graph G a weight of 1. Now, the maximum possible flow from s to t defines the largest matching in G.

Computing Network Flows

Traditional network flow algorithms are based on the idea of augmenting paths: repeatedly finding a path of positive capacity from s to t and adding it to the flow.

The key structure is the residual flow graph, denoted as [katex]R(G, f)[/katex], where G is the input graph and f is the current flow through G. This directed, edge-weighted graph R(G, f) contains the same vertices as G. For each edge (i,j) in G with capacity c(i,j) and flow f(i,j), R(G, f) may contain two edges:

  • An edge (i,j) with weight c(i,j) - f(i,j), if c(i,j) - f(i,j) > 0, and
  • an edge (j,i) with weight f(i,j), if f(i,j) > 0.

A set of edges whose deletion separates s from t (like the two edges incident to t) is called an s-t cut. Clearly, no s to t flow can exceed the weight of the minimum such cut. In fact, a flow equal to the minimum cut is always possible.


[cce_c]
typedef struct edgenode {
    int v;                 /* neighboring vertex */
    int capacity;          /* capacity of edge */
    int flow;              /* flow through edge */
    int residual;          /* residual capacity of edge */
    struct edgenode *next; /* next edge in list */
} edgenode;
[/cce_c]

We use a breadth-first search to look for any path from source to sink that increases the total flow, and use it to augment the total. We terminate with the optimal flow when no such augmenting path exists.

[cce_c]
void netflow(flow_graph *g, int source, int sink) {
    int volume; /* weight of the augmenting path */

    add_residual_edges(g);

    initialize_search(g);
    bfs(g, source);

    volume = path_volume(g, source, sink, parent);

    while (volume > 0) {
        augment_path(g, source, sink, parent, volume);
        initialize_search(g);
        bfs(g, source);
        volume = path_volume(g, source, sink, parent);
    }
}
[/cce_c]

Any augmenting path from source to sink increases the flow, so we can use [cci]bfs[/cci] to find such a path in the appropriate graph. We only consider network edges that have remaining capacity, or in other words, positive residual flow. The predicate below helps [cci]bfs[/cci] distinguish between saturated and unsaturated edges:

[cce_c]
bool valid_edge(edgenode *e) {
    if (e->residual > 0)
        return(TRUE);
    else
        return(FALSE);
}
[/cce_c]

Augmenting a path transfers the maximum possible volume from the residual capacity into positive flow. This amount is limited by the path-edge with the smallest amount of residual capacity, just as the rate at which traffic can flow is limited by the most congested point.

[cce_c]
int path_volume(flow_graph *g, int start, int end, int parents[]) {
    edgenode *e;            /* edge in question */
    edgenode *find_edge();  /* forward declaration */

    if (parents[end] == -1) return(0);

    e = find_edge(g, parents[end], end);

    if (start == parents[end])
        return(e->residual);
    else
        return( min(path_volume(g, start, parents[end], parents), e->residual) );
}

edgenode *find_edge(flow_graph *g, int x, int y) {
    edgenode *p; /* temporary pointer */

    p = g->edges[x];
    while (p != NULL) {
        if (p->v == y) return(p);
        p = p->next;
    }
    return(NULL);
}
[/cce_c]

Sending an additional unit of flow along directed edge (i,j) reduces the residual capacity of edge (i,j) but increases the residual capacity of edge (j, i). Thus, the act of augmenting a path requires modifying both forward and reverse edges for each link on the path.

[cce_c]
void augment_path(flow_graph *g, int start, int end, int parents[], int volume) {
    edgenode *e;           /* edge in question */
    edgenode *find_edge(); /* forward declaration */

    if (start == end) return;

    e = find_edge(g, parents[end], end);
    e->flow += volume;     /* forward edge gains flow */
    e->residual -= volume; /* and loses residual capacity */

    e = find_edge(g, end, parents[end]);
    e->residual += volume; /* reverse edge gains residual capacity */

    augment_path(g, start, parents[end], parents, volume);
}
[/cce_c]

Initializing the flow graph requires creating directed flow edges (i,j) and (j,i) for each network edge e = (i,j). Initial flows are all set to 0. The initial residual flow of (i,j) is set to the capacity of e, while the initial residual flow of (j, i) is set to 0.


The augmenting path algorithm above eventually converges on the optimal solution.

Edmonds and Karp proved that always selecting a shortest unweighted augmenting path guarantees that [katex]O(n^3)[/katex] augmentations suffice for optimization. In fact, the Edmonds-Karp algorithm is what is implemented above, since a breadth-first search from the source is used to find the next augmenting path.

Design Graphs, Not Algorithms

Proper modeling is the key to making effective use of graph algorithms. We have defined several graph properties, and developed algorithms for computing them. All told, about two dozen different graph problems are presented in the catalog. These classical graph problems provide a framework for modeling most applications.

The secret is learning to design graphs, not algorithms. We have already seen a few instances of this idea:

  • The maximum spanning tree can be found by negating the edge weights of the graph and running a minimum spanning tree algorithm on the result.
  • To solve bipartite matching, we constructed a special network flow graph such that the maximum flow corresponds to a maximum cardinality matching.

Stop and Think: The Pink Panther’s Passport to Peril
