# Minimum spanning tree

# Definition

Minimum spanning tree is defined as given graph(G) which a set of vertexes and edges with weights. Find a path which connects all vertexes with minimum total weight of edges. There are two classic algorithms which are designed with greedy algorithm: Prim Algorithm and Kruskal Algorithm.

## Generice MST

```
A = {} // subset of some minimum spanning tree (V, E)
while A is not a minimum spanning tree
find edge(u, v) to that is safe to A
A = A union (u, v)
```

## Safe Edge

Safe edge is a edge that is safe to added into A which out destorying the invarient property that A is subet of some minimum spanning tree. In other words, when you add a edge into A which causes A cannot form MST in the future, the edge is not safe.

### Light Edge is safe

Let G=(V,E) be a connected, undirected graph with a real-valued weight function w defined on E. Let A be a subset of E that is included in some minimum spanning tree for G. Let (S,V-S) be any cut of G such that for any edge (u, v) in A, {u, v} S or {u, v} (V-S). Let (u,v) be a light edge crossing (S,V-S). Then, edge (u,v) is safe for A. (.... It just means finding a minimum weight edge which connects two parts of the graph, when it is cut then no vertexes are shared in two parts)

# Prim Algorithm

This algorithm is based on shortest path algorithm. It builds a set of A in the process and select the light edge connecting A to others.

**Simple version**

```
A = {}
S = {r} // r is the random node you want to start with
Q = V - {r}
while Q is not empty
Find the minimum edge(u, v) so that u in A and v in Q
Add (u, v) in A, add v in S, and delete v from Q
```

**Priortiy queue version**
The min-priority queue is implemented with binary heap.
Use the key of the priority queue to keep track of the minimum distance from A to others.
Note:
Heap-Order: for every node v other than the root,
key(v) >= key(parent(v))

```
Store all V in priority queue Q
for each u in Q
key[u] = infinity
parent[u] = NUL
key[r] = 0
while (Q is not empty)
u = EXTRACT_MIN(Q)
for (each v in adj[u])
// v belongs to others but not the buidling tree
// update its minimum distance if edge is smaller than its expected value before
if (v belongs to Q && key[v] > w(u,v) )
key[v] = w(u, v)
parent[v] = u
```

## Running time analysis

While loop executes V times EXTRACT_MIN(Q) : log V => V log V

The for loop (each v in adj[u]): total executes 2E times decrease key operation : log V => E log V

Overall running time: (V log V + E log V) = E log V // E always > V

# Kruskal Algorithm

Add the edges according to the increasing weight, as long as the operation will not violate the minimum tree property. It uses a disjoint set to keep track the grouping of vertexes, as long as the edge(u, v) where u and v are not in the same group, the edge is safe to be added.

```
A = {} // tree under construction
Sort the edges of E by weights by increasing order
for each v in G
MAKE_SET(v) // every one belongs to its own group initially
for each edge(u, v) in E
if (FIND_SET(u) != FIND_SET(v))
A union (u, v)
UNION(u, v) // they are the same group now
return A
```

## Running time analysis

- sort edges: E log E
- for loop V times: MAKE_SET: log V
- for loop execcutes E times FIND_SET and UNION: log V

Combine 2 & 3 we ge V+E log V => E log V => combine with (1) E log E Overall Running time: E log E

## Disjoint Set Implemented by array

Every child remember its parent index The parent remeber its size (-ve)

```
int Find(int element){
if (nodes[element] < 0)
return element;
else
return nodes[element] = Find(nodes[element]);
}
void UnionSet(int set1, int set2){
nodes[set1] += nodes[set2];
nodes[set2] = set1;
}
int Union(int element1, int element2){
int root1 = Find(element1);
int root2 = Find(element2);
if (root1 == root2) // reject if form a cycle
return 0;
if (nodes[root1] < nodes[root2]) // root1 has more elements than root2
UnionSet(root1, root2);
else
UnionSet(root2, root1);
return 1;
}
```

## Differene between Prim algorihtm and Kruskal algorithm

Prim requires a connected graph, Kruskal can work on unconnected group in which many MST can be formed. Every step in prim is a partial solution.

## Application of MST

MST algorihtm can be also used to build spanning tree based on each side

```
for (int i=0; i<m; i++){
// based on each side, try to build a spanning tree with bigger side
// compare their slimness
for (int j=1; j<=n; j++){
nodes[j] = -1;
}
Edge b = edges[i];
Union(b.u, b.v);
int largest = b.c;
int smallest = b.c;
for (int j=i+1; j<m; j++){
Edge c = edges[j];
Union(c.u, c.v);
}
}
```

k - cluster with mini distance between point = MST - k most expensive edges.