Flow
Combinatorial Optimization: Finding Optimum Solution
Breakdown approach
Solution: Understand feasibility
Optimum: Verifying that a solution is optimum
And understanding how to improve one that is not optimum
Finding: actually producing the optimum comes last.
Maximum Flow
Definitions
Tarjan: Data Structures and Network Algorithms
Ford and Fulkerson, Flows in Networks, 1962 (paper 1956)
Ahuja, Magnanti, Orlin Network Flows. Problem: do min-cost.
Problem: in a graph, find a flow that is feasible and has maximum value.
Directed graph, edge capacities \(u(e)\) or \(u(v,w)\). Why not \(c\)? reserved for costs, later.
source \(s\), sink \(t\)
Goal: assign a flow value to each edge:
conservation: for every vertex \(v\), \(\sum_w g(v,w)-g(w,v)=0\), unless \(v=s,t\)
capacity: \(0 \le g(e) \le u(e)\) (flow is feasible/legal) (sometimes lower bound \(l(e)\))
Flow value \(|f| = \sum_w g(s,w)-g(w,s)\) (in gross model).
Alternative: (net flow form)
skew symmetry: \(f(v,w) = -f(w,v)\)
conservation: for every vertex \(v\), \(\sum_w f(v,w)=0\), unless \(v=s,t\)
capacity: \(f(e) \le u(e)\) (flow is feasible/legal) (sometimes lower bound \(l(e)\))
Flow value \(|f| = \sum_w f(s,w)\)
Net flow cleaner for math but involves negative flows and makes lower-bound flows messy. Raw flow may fit intuition/reality better.
Water hose intuition. Also routing commodities, messages under bandwidth constraints, etc. Often “per unit time” flows/capacities.
Maximum flow problem: find flow of maximum value.
Path decomposition (another perspective):
claim: any \(s\)-\(t\) flow can be decomposed into paths/cycles with quantities
proof: induction on number of edges with nonzero flow
if \(s\) has out flow, traverse flow outward from \(s\)
till find an \(s\)-\(t\) path or cycle (why can we? conservation) and kill
if \(t\) has in-flow (won’t any more, but no need to prove) work backwards to \(s\) or cycle and kill
if some other vertex has outflow, find a cycle and kill
corollary: flow into \(t\) equals flow out of \(s\) (global conservation)
Note: every path is a flow (balanced).
Note: decomposition of positive value
i.e. never have paths traveling opposite each other
Note: at most \(m\) paths/cycles (zero out one edge’s flow each time)
Decision question: is there nonzero feasible flow
Suppose yes. consider one.
decompose into paths
proves there is a flow path
Suppose no
then no flow path
consider vertices reachable from \(s\)
they form a cut \((s\in S, t \in T={\overline S})\)
no edge crossing cut
What about max-flow?
Need some upper bound to decide if we are maximal
Consider any flow, and cut \((S,T)\)
decompose into paths
every path leaves the cut (at least) once
So, outgoing cut capacity must exceed flow value
min cut is upper bound for max flow
So if flow equals cut value, we know we have a max flow and a min cut.
Suppose have some flow. How can we send more?
Might be no path in original graph
/|\ / | \ / | \ s / | \ \ | / t \ | / \ | / \|/
Instead need residual graph:
Suppose flow \(f(v,w)\)
set \(u_f(v,w) = u(v,w)-f(v,w)\) (decrease in available capacity)
which means set \(u_f(w,v) = u(w,v)+f(v,w)\) (increase in capacity indicates can remove \((v,w)\) flow
If augmenting path in residual graph, can increase flow
might use residual arcs that are reverse of original arcs
this represents removing flow from that residual arc
so not a real path in original graph
but, augmentation does preserve conservation invariant at each vertex
What if no augmenting path?
Implies zero cut \((S,T)\) in residual graph
Implies every \(S\)-\(T\) edge is saturated
and every \(T\)-\(S\) edge is empty
i.e. cut is saturated
consider path decomposition
each path crosses the cut exactly once
must cross once to reach \(t\)
cannot cross back (and cross again) because \(T\)-\(S\) empty
Thus, capacity of crossing paths equals capacity of crossing edges
i.e. value of cut equals value of flow
but already saw value of any flow cannot exceed value of any cut (because all paths cross cut at least once)
so equality means flow is maximum
We have proven Max-flow Min-cut duality:
Equivalent statements:
\(f\) is max-flow
no augmenting path in \(G_f\)
\(|f| = u(S)\) for some \(S\)
Proof summary:
if augmenting path, can increase \(f\)
if no augmenting path, can find empty residual cut
flow has same value as cut (so is optimal)
conversely, if flow equals cut they must both be optimal
Another cute corollary:
Net flow across any cut (in minus out) equals flow value
True for a path as it must go out once more than in
So true for a sum of paths
Note that max-flow and min-cut are witnesses to each others’ optimality.
Maximum Flow Algorithms
We understand now the optimality conditions for maximum flow well. But how about computing one?
Observe: The Max-Flow Min-Cut theorem can be easily turned into an actual algorithm.
Computing Max Flow with Augmenting Paths
Natural algorithm: Repeatedly find augmenting paths in the residual graph.
By Max-Flow Min-Cut theorem we know that either our flow is already optimal or there is an augmenting path to find.
Each augmenting path computation can be easily done in \(O(m)\) time.
Key question: How many times we might need to find an augmenting path before we reach the max flow?
Consider first the case of integer capacities. (This will be most often the case anyway.)
If capacities are integral then the bottleneck capacity of each augmenting path we find is integer and positive, so at least \(1\).
\(\Rightarrow\) We have at most \(|f|\) augmenting path computations, giving an \(O(m|f|)\) overall bound.
Note that, by Max-Flow Min-Cut Theorem, \[|f|=u(S^*)\leq u(\{s\})\leq nU,\] where \(U:=\max_e u_e\) is the largest edge capacity.
\(\Rightarrow\) The overall running time is \(O(mnU)\).
Note: This running time is polynomial in the values in our input
but not in the size of its binary representation.
(Such algorithms are called pseudo-polynomial.)
When \(U=1\), this is polynomial time algorithm though.
Important corollary: If all capacities are integral then there exists an integral maximum flow too. (Although there could still exist maximum flows that are non-integral.)
When capacities are rational, we will augment by (multiple of) their least common denominator
\(\Rightarrow\) least common denominator never changes
\(\Rightarrow\) running time finite, but likely not even pseudo-polynomial.
When capacities are real numbers, we might not even have termination.
Sidenote: The above algorithm was simple to describe but its dynamics can be quite complex.
Each augmentation constitutes a “greedy” improvement step, but these augmentations might undo/overwrite previous updates, adding flow to an edge and then removing it.
Instructive example to look at: “diamond” graph with edges \((s, v)\), \((s,w)\), \((v,w)\), \((v,t)\), and \((w,t)\), and capacities being \(1000\) on all the edges except \((v,w)\) which has a capacity of \(1\). What will happen if we always choose an augmenting path going through the capacity \(1\) edge? How does it compare to making a “smart” choice of the augmenting path?
Maximum Bottleneck Capacity Augmenting Paths
So far, we assumed that we just find an augmenting path. But maybe there is a “smart” way to choose one?
Natural idea: Always choose an augmenting path \(P\) that has maximum bottleneck capacity \(u_f(P)\).
How to find one? Actually, quite simple. Easy \(O(m\log n)\)-time algorithm: just do binary search over the capacity of “bottlenecking” edge and, for each guess \(u\), check for \(s\)-\(t\) path in \(G_f\) after all the edges with capacity less than \(u\) were removed. By being (much) more sophisticated, one can get \(O(\min\{ m+n \log n, m \log^*n\})\) time.
How much progress each such augmenting path guarantees to make?
Recall: The flow decomposition bound tells us that any flow can be decomposed into at most \(m\) flow paths.
\(\Rightarrow\) If we apply this observation to the maximum (remaining) flow, i.e., the maximum flow in the residual graph \(G_f\) together with an averaging argument, we get that at least one of these paths carries at least \(1/m\)-fraction of the remaining value of the flow. (Note that all such paths have to be augmenting in \(G_f\) by definition.)
\(\Rightarrow\) Each flow augmentation reduces the remaining value of the flow by a factor of \((1-1/m)\).
After \(m\ln |f| +1\) augmentations, the remaining flow would be less than \(1\). So, if capacities are integral, the obtained solution has to be optimal. The overall running time is \[O((m\log n)\cdot m \ln |f|)=O(m^2 \log n \log nU).\]
Note: This is a “truly” polynomial running time, i.e., it is polynomial in the size of the representation of the input numbers. Still, sometimes one can ask for even more: running time that is independent of the input numbers, i.e., depends only on the size of the graph itself. This kind of algorithms are called strongly polynomial (and the ones that are only polynomial in the size of the numbers are called weakly polynomial).
If capacities are rational, the above algorithm still has interesting properties. In particular, if \(U\) will be now the largest denominator/enumerator in the capacities, then by multiplying all capacities by the product of all denominators and applying the algorithm above will give us a running time of \[O((m\log n)\cdot m \ln |f|)=O(m^2 \log n \log nU^m)=O(m^3 \log n \log nU),\] which is still (weakly) polynomial. (Note: we do not actually need to explicitly multiply all capacities here.)
From now on: Unless stated otherwise, we assume that capacities are integral.