Network Flow Theory.

```
Martin Pei
Section 001
Email: mpei@uwaterloo.ca
Office Hours: Mondays/Tuesdays 4pm-5pm in MC 6492
Mondays/Wednesdays/Fridays 1:30pm-2:20pm
```

9 assignments, due on wednesdays, lowest assignment mark is dropped, submitted via Crowdmark. Assignment 0 is worth a bonus 1%.

Network theory deals with modelling problems using directed graphs, extending CS239. Network flow theory deals with flow problems over these directed graphs. A typical problem: given so-and-so network of roads (directed graph), each with different widths (cost of taking a road), how much traffic can we route from point A to B? Essentially, we're extending linear programming concepts from CO250 to directed graphs.

Real examples of networks: water pipes, electrical grid, and road networks.

Graph theory review: a **graph** is a pair of a set of vertices (arbitrary objects) and a set of edges, which are unordered pairs of vertices. The **degree** of a vertex is the number of edges incident to it, denoted d_G(v). A **walk** is a sequence of vertices in a graph where consecutive vertices are edges in the graph. A **path** is a walk with no repeated vertices or edges. A graph is **connected** if and only if there is a path between any two vertices. A **cycle** is a walk where the last vertex is the same as the first vertex. A **cut induced by a set of vertices S** is the set of edges with one end in S and another end not in S, denoted \delta(S). The size of a cut is denoted d(S) or \abs{\delta(S)}. A tree is a connected acyclic graph.

Theorem: there's an s, t-path in G if and only if for all X \subseteq V(G) such that s \in X, t \notin X, \delta(X) \ne \emptyset - exactly when every subset of vertices including s but not t has a non-empty cut.

Proof:

First we'll prove the forward direction. Assume an s, t-path exists. Let v_0, \ldots, v_k be one of those s, t-paths, let X \subseteq V(G) be an arbitrary subset of graph vertices such that s \in X, t \notin X.

Let i be the smallest index in the s, t-path such that v_i \notin X. Since it's the smallest index, v_{i - 1} must be in X, so v_{i - 1} v_i \in \delta(X), so \delta(X) is non-empty, as required.

Now for the opposite direction. Assume no s, t-path exists. Then s and t are in two different components A and B, since otherwise the path would exist.

Clearly, the cut induced by A must be empty, since if it isn't, then any vertex in the edges of the cut that weren't in A should be in A.

Let T be a tree. Then \abs{E(T)} = \abs{V(T)} - 1. Also, a unique path exists between any two vertices, and adding any edge that doesn't already exist to T will form exactly one cycle, Removing any edge from that resulting cycle gives us a tree again.

Directed graphs (digraphs) are very similar to graphs. A digraph D = \tup{N, A} is a pair of a set of nodes (arbitrary objects) and a set of arcs, which are ordered pairs of nodes. Each node has an **in-degree** (number of arcs that point to it) denoted d(\overline v) or d^-(v), and an **out-degree** (number of args that point from it) denoted d(v) or d^+(v). A diwalk is a sequence of nodes such that consecutive nodes have arcs between them, a dipath is a directed walk with no duplicated nodes or arcs, and a dicycle is a dipath such that the first and last nodes are the same.

Directed graphs can have cycles of length 2, unlike undirected graphs. A directed graph is acyclic if there are no dicycles. We can also talk about cycles in a directed graph, which are simply the cycles if we ignore directions in the graph.

If every node of a digraph has out-degree of at least 1, the digraph must have a directed cycle. Proof: let v_0, \ldots, v_k be the longest dipath in the digraph. Clearly, v_k has out-degree at least 1, so it has an arc pointing to another node u. Clearly, u must be in the dipath, since if it wasn't then the path could be strictly longer, which isn't possible since the dipath is already the longest. So u must be in the dipath, and this forms a directed cycle in the digraph.

Likewise for an undirected graphs, if every vertex has degree at least 2, there must be an undirected cycle. We can prove this in a very similar way.

A digraph is **connected** exactly when its corresponding undirected graph is connected - if every vertex in the underlying graph has a path to every other, ignoring direction. A digraph is **strongly connected** exactly when every node has a directed path to every other node.

For any subset of notes S \subseteq N, the **cut induced by S** is the set \delta(S) = \set{xy \in A \middle| x \in S, y \notin S} - the set of arcs that go outward from S. There's also \delta(\overline S), which is the set of arcs that go inward toward S (the cut induced by the complement).

Just like for an undirected graph, an s, t-dipath exists if and only if every s, t-cut is non-empty. The proof is very similar to the version for undirected graphs.

Suppose we have a network consisting of a digraph, and some objects called **supplies/goods** present at various nodes (nodes are labelled with the net quantity demanded). We want to transport goods from where there's a negative net quantity demanded to where there's a positive net quantity demanded. The quantity we transport along the arcs is the flow. In other words, **a flow is a weighting for each arc in a digraph**, representing how much of a quantity we're transporting along that arc.

The **inflow of a node** is the sum of the weights of the arcs going into the node. The **outflow of a node** the sum of the weights of the arcs leaving the node. Inflow minus outflow is the weight of the vertex - the net quantity demanded.

There are also **costs** associated with letting goods flow through an arc. A **cost** is a edge weighting, just like a flow, but the weight of an edge represents the cost of transporting one unit of a good through that arc.

The goal of the problem is to find a flow such that the net quantity demanded becomes non-positive, such that we minimize the cost of that flow.

New notation: \vec o \in \mb{R}^S, where S is a set, means that o is a vector with \magn{S} elements, each one representing an element s of S. Additionally, we can write the element corresponding to s as o_s.

Formally: suppose we have a digraph D = \tup{N, A}, a node weighting \vec b \in \mb{R}^N (the net quantity demanded of the good at each node in N), and an arc weighting \vec w \in \mb{R}^A (the cost of transporting a unit of the good through each arc in A). A **flow** is an arc weighting \vec x \in \mb{R}^A such that \vec x \ge \vec 0 and \sum_{iv \in A} x_{iv} - \sum_{vj \in A} x_{vj} = b_v for all v \in N (total inflow minus total outflow is equal to b_v for any v \in N). The problem is to minimize \sum_{ij \in A} w_{ij} x_{ij}.

Intuitively, a flow problem on a digraph tries to move positive quantities of goods along arcs from nodes that are negative to nodes that are positive.

If b_v is positive for a node v, then v is a supply node. If it's negative, v is a demand node.

New notation: if N \subseteq S, then o(N) = \sum_{v \in N} o_v. In other words, if we call a vector like a function with a set of indices, we get back the sum of the elements at those indices.

Therefore, \sum_{iv \in A} x_{iv} - \sum_{vj \in A} x_{vj} = b_v$ is equivalent to x(\delta(\overline v)) - x(\delta(v)) = b_v for all v \in N.

Additional assumptions for network flow problems in this course:

- The digraph D is connected.
- The weights are all positive - \vec w \ge \vec 0.
- The total quantity demanded is equal to the total quantity supplied (sum of net demands in 0) - b(N) = 0.

The transshipment problem can be solved simply using linear programming. The general form of the LP is: "minimize \sum_{ij \in A} w_{ij} x_{ij} subject to \sum_{iv \in A} x_{iv} - \sum_{vj \in A} x_{vj} = b_v for all v \in N, \vec x \ge \vec 0" (the values of \vec w and \vec b are fixed). In other words, **we're minimizing the total cost of performing the shipments such that we still manage to make inflow minus outflow equal to the net quantity demanded for each individual node**.

In matrix form, the columns are indexed by arcs, and the rows are indexed by nodes. At a given row a and column ab within the matrix, we put a 0 when there's no such arc, 1 if the digraph contains the arc ba, and -1 if the digraph contains the arc ab.

Formally, an **incidence matrix** for a digraph D = \tup{N, A} is an \abs{N} by \abs{A} matrix M where M_{v, ij} = \begin{cases} -1 &\text{if } v = i \\ 1 &\text{if } v = j \\ 0 &\text{otherwise} \end{cases}. As it turns out, the matrix form of the LP for the transshipment problem will always have a coefficients matrix that is the incidence matrix of the digraph, under the assumptions we make for this course.

Therefore, we can write the LP more simply as "minimize \vec w \cdot \vec x subject to M \vec x = \vec b, \vec x \ge \vec 0", where M is the incidence matrix for the digraph, \vec w is the costs for each arc and \vec b is the net quantity demanded in each vertex.

Interestingly, since in an incidence matrix, every column of M has exactly one 1 and one -1, or is all 0. Therefore, the sum of all the rows in the matrix gives \vec 0. If we sum the A \vec x = \vec b, we get b(N) = 0 - our assumption that the total net quantity demanded is 0.

Consider now the dual for the LP: "minimize \vec b \cdot \vec y subject to M^T \le \vec c for all ij \in A, \vec y free", where M is the incidence matrix for the digraph. We can then use complementary slackness to show that the LP is optimal.

Assignment 1 is out, due on Wednesday.

New notation: M_a is the column in M corresponding to the arc a.

New notation: \overtilde M is M with the last row removed.

Summary of transshipment problem (TP): minimize \vec w^T \vec x subject to M \vec x = \vec b and \vec x \ge \vec 0. Here, \vec w is the costs, \vec b is the demands, and M is the indcidence matrix. The M \vec x = \vec b is called the **flow constraint**, while the \vec x \ge \vec 0 is the **non-negativity constraint**.

To start solving a TP instance with Simplex, we need to find a feasible basis. However, note that the rows of the incidence matrix M are not linearly independent, since when you sum up all the rows, you get \vec 0 - the rank of the n by m incidence matrix is n - 1. ;wip: why can't it be less than n - 1? add proof

To get around this, we ignore one row, so we pretend the matrix is n - 1 by m, and pick n - 1 columns for a basis instead. We usually can do this by inspection, or the auxilary problem method.

In fact, the columns of M that correspond to the edges of any undirected cycle are linearly dependent, and those that do not are linearly independent. Proof: for any undirected arc v_1, \ldots, v_k, v_1 in a digraph, v_i v_{i + 1} is a forward arc, whereas v_{i + 1} v_i is a backward/reverse arc. Let F be the set of forward arcs in the undirected cycle, and R the reverse arcs. Clearly, \sum_{a \in F} M_a - \sum_{a \in R} M_a has the same sum as if the undirected cycle was a dicycle instead. Clearly, the sum of the arcs in a dicycle is 0, since for the dicycle each row will have exactly one -1 and one 1 entry (one incoming arc and one outgoing arc) \sum_{a \in F} M_a - \sum_{a \in R} M_a = 0. We've now found k columns that are linearly dependent, since they sum to 0.

Likewise, a set of linearly dependent columns must contain an undirected cycle (proof left as exercise). So **there exists a set of linearly dependent columns if and only if the arcs corresponding to the columns in that set form an undirected cycle**.

So to find a basis (ignoring one row) of \overtilde M, we need to find a set of n - 1 linearly independent arcs - so by the above proof, a set of n - 1 arcs that don't have any cycles. From graph theory, this must be a spanning tree by definition, since it has no cycles and contains n vertices - **a set of columns of an incidence matrix M is a basis for M if and only if the arcs corresponding to those columns forms a spanning tree of the digraph**. Note that spanning trees ignore arc direction, as if we're working on an undirected graph.

To find a basic solution corresponding to some spanning tree, we can start from the leaves (vertices in the spanning tree of degree 1) and work our way up to the root - a leaf node of value -10 must have 10 flow coming out of it, since it must end up with 0 value in the end, for example. This isn't necessarily a basic feasible solution, because it might not satisfy the non-negativity constraint - some of the flow directions might be different from the arc directions, like if we have a negative valued node and in the spanning tree, it's incident to an inward pointing arc. If we get an infeasible solution, we can just keep trying other spanning trees until we find a feasible one.

Review of simplex method. In the simplex method, at all times we maintain a basic feasible solution for the LP, and a solution for the dual that satisfies all constraints (except the non-negativity constraints), and the complementary slackness conditions. Each step fixes some of the non-negativity constraints, so eventually both dual and primal solutions are satisfied, and by the complementary slackness theorem, both solutions must be optimal.

Recall the transshipment problem LP: "minimize \vec w \cdot \vec x subject to M \vec x = \vec b and \vec x \ge \vec 0", as well as its dual, "maximize \vec b \cdot \vec y subject to -y_a + y_b \le w_{ab} for all ab \in A and \vec y free". The vector \vec y is knows is the **node potential**.

To start solving this using simplex, we start by adding a slack variable to each dual inequality: "maximize \vec b \cdot \vec y subject to -y_a + y_b + \overline{w}_{ab} = w_{ab} for all ab \in A and \vec y free and \vec{\overline{w}} \ge \vec 0". Here, the \overline{w}_{ab} = w_{ab} + y_i - y_j is called the **reduced cost** of the arc ab.

A node potential \vec y is a **feasible node potential** if and only if the non-negativity constraints are satisfied for \overline{w}_{ab} for all ab \in A - when all the slack variables are non-negative. By complementary slackness, if x_{ab} > 0, then \overline{w}_{ab} = 0 in the dual LP.

When we're using the simplex method, we start off with a basic feasible solution. This gives us a spanning tree over the digraph, and we then choose an entering arc and a leaving arc. Adding the entering arc forms a cycle, and removing the leaving arc breaks it again.

This work by Anthony Zhang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright 2013-2017 Anthony Zhang.