\documentclass[12pt]{article}
\usepackage{html}
\latex{\usepackage{fullpage}}
\usepackage{/u/mlittman/papers/tex/newalg}
\usepackage{epsfig}
\newcommand{\ul}{\begin{itemize}}
\newcommand{\li}{\item}
\newcommand{\el}{\end{itemize}}
\renewcommand{\ni}{\noindent}
\newcommand{\href}[2]{\htmladdnormallink{#2}{#1}}
\newcommand{\argmax}{\mathop{\mbox{argmax}}}
\newcommand{\argmin}{\mathop{\mbox{argmin}}}
\begin{document}
\title{Weighting Graphs (18)}
\author{Michael L. Littman}
\date{November 11th, 1997}
\maketitle
\section{ANNOUNCEMENTS}
\ni Exam 2 coming up... November 20th.
\section{NETWORK FLOW}
\subsection{Review}
\ni Recall last time we spoke about bipartite matching. The
fundamental algorithmic idea was that of finding an {\em augmenting
path\/}.
\ni The proof that this leads to a maximum matching depended on the
min-cut max-flow theorem, of which we proved a special case.
\ni We'll now briefly talk about another problem that is solved using
the same basic ideas.
\subsection{Network Flow}
\ni The bipartite matching problem can be generalized quite a bit.
\ni A beautiful generalization is network flow. We'd like to move
water (or packets or trucks or...) from one place to another.
\ul
\li source: A place where water comes from. Source is inexhaustible.
\li sink: A place where water needs to go. Sink is insatiable.
\li nodes: Junctions along the way from source to sink.
\li edges: Carry the water from junction to junction.
\li capacities: The size of the pipes (edges) between junctions.
\li flow: The amount of water moving through each pipe. Flow is
conserved.
\el
\ni Picture...
\subsection{Maximum Flow Problem}
\ni We can set a dial at each node to determine how much of the water
coming into the node will exit through each of the pipes. This
determines a {\em flow\/}.
\ni How do we set these dials to maximize the total flow from source
to sink?
\ni This can be solved efficiently using a generalization of augmenting
paths: $O(|V||E|^2)$. (Even better algorithms are available; it's a
well-studied problem.)
\ni Example flow...
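The augmenting-path generalization mentioned above can be sketched in
Python. This is the shortest-augmenting-path (Edmonds-Karp) variant,
which gives the $O(|V||E|^2)$ bound; the dictionary-of-pairs graph
representation and function names here are my own, not from lecture.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly find a shortest augmenting path by BFS.

    capacity: dict mapping (u, v) -> pipe capacity; missing pairs mean 0.
    Returns the value of a maximum flow from source to sink.
    """
    # Residual capacities start out equal to the original capacities.
    residual = dict(capacity)
    nodes = {u for u, v in capacity} | {v for u, v in capacity}
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:          # no augmenting path: flow is maximum
            return flow
        # Reconstruct the path and find its bottleneck capacity.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u, v] for u, v in path)
        # Push bottleneck units along the path; add reverse residual edges.
        for u, v in path:
            residual[u, v] -= bottleneck
            residual[v, u] = residual.get((v, u), 0) + bottleneck
        flow += bottleneck
```

On the small network below, two units go s--a--t, two go s--b--t, and
one more is routed s--a--b--t, saturating the cut around the source.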
\subsection{Matching as Network Flow}
\ni Matching can be seen as a special case: give every edge unit
capacity, add a source with edges to all $L$ nodes, and a sink with
edges from all $R$ nodes.
\ni The matching algorithm we discussed is just one of the basic
maximum flow algorithms running on this graph.
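The reduction is mechanical enough to write down directly. A minimal
Python sketch, assuming the names `source` and `sink` don't clash with
any vertex names in the bipartite graph:

```python
def matching_to_flow(left, right, edges):
    """Build the unit-capacity flow network whose maximum flow equals
    the size of the maximum matching.

    left, right: the two vertex sets; edges: pairs (u, v), u in left.
    Returns a dict mapping directed edge (u, v) -> capacity.
    """
    capacity = {}
    for u in left:                      # source feeds every left node
        capacity['source', u] = 1
    for u, v in edges:                  # original edges, directed L -> R
        capacity[u, v] = 1
    for v in right:                     # every right node drains to sink
        capacity[v, 'sink'] = 1
    return capacity
```

A maximum flow in the resulting network is integral, and its unit-flow
edges between $L$ and $R$ form a maximum matching.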
\subsection{Linear Programming}
\ni Maximum flow is a cool problem, but I'm told that it's not used
all that often in practice because most real-world problems are more
complicated. Instead, a really big hammer is used: linear
programming (LP).
\ni More general than matching or network flow: $Ax \le b$, maximize
$c^T x$. Here, $A$ is a matrix, $x$ a column vector of variables, $b$
a column vector of constraints, and $c$ a column vector of rewards.
\ni Network-Flow example:
\ul
\li One variable for each edge, constrained to be between zero (no
flow) and the capacity (biggest possible flow).
\li Constraint for each node: sum of value of variables on incoming
edges must equal the sum of value of variables on outgoing edges (flow
conservation).
\li Maximize sum of edges leaving the source (size of the flow).
\el
\ni Small example...
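The three bullets above translate directly into the $(A, b, c)$ data of
an LP. Here is a sketch that builds those arrays for a given network,
writing each equality (flow conservation) as two opposing $\le$ rows;
the function name and representation are mine, not from lecture.

```python
def flow_lp(edges, capacity, source, sink):
    """Encode max flow as an LP:  A x <= b, maximize c^T x.

    edges: list of directed edges (u, v); capacity: dict edge -> cap.
    One variable per edge. Returns (A, b, c) as plain Python lists.
    """
    index = {e: i for i, e in enumerate(edges)}
    n = len(edges)
    A, b = [], []
    # 0 <= x_e <= capacity(e), written as x_e <= cap and -x_e <= 0.
    for e in edges:
        row = [0] * n; row[index[e]] = 1
        A.append(row); b.append(capacity[e])
        row = [0] * n; row[index[e]] = -1
        A.append(row); b.append(0)
    # Flow conservation at every internal node: in - out = 0,
    # expressed as (in - out) <= 0 and -(in - out) <= 0.
    nodes = {u for u, v in edges} | {v for u, v in edges}
    for node in nodes - {source, sink}:
        row = [0] * n
        for (u, v), i in index.items():
            if v == node: row[i] += 1   # incoming edge
            if u == node: row[i] -= 1   # outgoing edge
        A.append(list(row)); b.append(0)
        A.append([-x for x in row]); b.append(0)
    # Objective: total flow on edges leaving the source.
    c = [1 if u == source else 0 for (u, v) in edges]
    return A, b, c
```

For the two-edge network $s \rightarrow a \rightarrow t$ with
capacities 3 and 2, this produces four bound rows, two conservation
rows for $a$, and an objective that counts only the edge out of $s$.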
\subsection{LP Algorithms}
\ni Can solve network flow problems this way, as well as some tricky
generalizations (costs in addition to capacities). Very powerful.
{\em ``Use it for all your optimization needs!''}
\ni Theoretically efficient LP algorithms (ellipsoid, interior point)
are very complex. Practical LP algorithm (simplex) is inefficient in
the worst case. Very good commercial implementations exist!
\ni Never know when this might come in handy.
\section{MINIMUM SPANNING TREES}
\subsection{Water For Everyone}
\ni Let's look at a different problem with some similar structure.
\ni We've still got a source, nodes, and water flowing around. But
now we just want to make sure that everyone (all nodes) gets some water
(or electricity or ...).
\ni For each pair of cities $(u,v)$ that have a pipe between them,
there is a cost $w(u,v)$ for keeping the pipe open.
\ni Choose a set of pipes to keep open such that:
\ul
\li All cities get water.
\li Cost is minimized.
\el
\ni Answer?
\subsection{Mathematical Abstraction}
\ni A {\em spanning tree\/} is a set of edges that connects all the
vertices and contains no undirected cycles (i.e., a tree that touches
every vertex). How do we find one?
\ni The {\em cost} of a spanning tree is the sum of the weights of the
edges included in that spanning tree.
\ni A {\em minimum spanning tree\/} is a spanning tree with the
smallest possible cost.
\subsection{Simple Verification}
\ni In many of the problems we've looked at, once you find the answer,
it is easy to see that it is correct. What are the running times?
\ul
\li sorting
\li building a heap
\li building a binary search tree
\li finding a stable marriage
\li finding a string
\li solving a linear equation
\el
\subsection{Optimization Verification}
\ni Optimization problems are trickier. Algorithms lean on powerful
theorems to know when they're done.
\ul
\li girl-optimal stable marriage
\li maximum bipartite matching
\li maximum flow
\li minimum spanning tree
\el
\ni Weaker alternatives: a syntactic check of the answer's form, or
solving the problem independently and comparing answers.
\ni MST?
\section{FINDING MINIMUM SPANNING TREES}
\subsection{Design Ideas}
\ni Rule \#1 of Good Algorithm Design: If you already have a good
subroutine to solve the problem, use it!
\ni Rule \#2 of Good Algorithm Design: Divide-and-Conquer!
\ni Rule \#3 of Good Algorithm Design: Keep complex information
organized in a good data structure.
\ni Rule \#4 of Good Algorithm Design: Be greedy.
\subsection{Greedy Approach}
\ni Keep adding low-cost edges as long as no cycle is made.
\ni Didn't work for matching. Works here.
\begin{algorithm}{Generic-MST}{G}
T \= \emptyset \\
\begin{FOR}{i \= 1 \TO |V[G]|-1}
\mbox{Define a cut $C$ s.t. $C \cap T = \emptyset$} \\
e^* \= \argmin_{e\in C} w(e) \\
T \= T \cup \{ e^* \}
\end{FOR} \\
\RETURN T
\end{algorithm}
\subsection{Cuts}
\ni A {\em cut\/} is something that separates a graph into two
pieces. Two equivalent ways of defining a cut:
\ul
\li Two sets of nodes that partition $V$ (what's the phrase?)
\li The set of edges that go between two sets of nodes that partition
$V$.
\el
\ni Example...
\subsection{Greedy-Grow Lemma}
\ni Lemma: If $T$ is a subset of {\em some\/} MST, and $C$ is some
cut that doesn't share any edges with $T$, then there is some MST
containing $T$ and the minimum-cost edge $e^*$ in $C$.
\ni Proof: Very standard trick. Let $T^*$ be an MST that contains
$T$. If $e^* \in T^*$, we're done. If not, consider the set of edges
$T^* \cup \{ e^*\} $. It must have a cycle including $e^*$ and some
other edge $e$ (maybe more than one) in $C$. By definition, $w(e^*)
\le w(e)$. Delete $e$ from $T^*$.
\ul
\li Is the resulting set $T^*\cup\{e^*\}-\{e\}$ still a spanning tree?
\li What is its cost compared to $T^*$?
\el
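\ni (One way to see the answers: the swap keeps the edge count at
$|V|-1$ and keeps the graph connected, since the rest of the cycle
provides an alternate route between the endpoints of $e$, so the
result is still a spanning tree. Its cost is
\begin{displaymath}
w(T^* \cup \{e^*\} - \{e\}) = w(T^*) + w(e^*) - w(e) \le w(T^*),
\end{displaymath}
so it is also an MST, and it contains $T \cup \{e^*\}$ because
$e \notin T$: the deleted edge $e$ is in $C$, and $C$ shares no edges
with $T$.)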
\subsection{Correctness of {\sc Generic-MST}}
\ni The correctness of {\sc Generic-MST} follows from the Greedy-Grow
Lemma.
\ul
\li Proof by induction. The initial set $T=\emptyset$ is a subset of
any MST.
\li At each step, {\sc Generic-MST} adds an edge that is necessarily
part of some MST (by the Greedy-Grow Lemma).
\li So when the set $T$ is a spanning tree, it must be an MST (it is a
subset of some MST, itself).
\el
\subsection{Efficient Implementation}
\ni The {\sc Generic-MST} algorithm and the theorem give some guidance
for how to find MSTs simply, but they don't immediately suggest
efficient algorithms. How do we quickly find cuts and the smallest
edges they have?
\ni We'll look at two classic approaches:
\ul
\li Prim: Related to shortest path (in a few lectures).
Pick a point and grow the MST from it. Cut is $T$ vs.\ $V-T$.
\li Kruskal: Related to union-find (next time).
Continually add the smallest edge that doesn't create a cycle. Cut is
defined implicitly by the smallest edge and the disconnected pieces of
$T$.
\el
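As a preview of next time, Kruskal's rule fits in a few lines of
Python once a union-find structure is available to detect cycles.
This is a minimal sketch with a simple path-compressing union-find
(no rank heuristic); the representation is mine, not from lecture.

```python
def kruskal_mst(edges):
    """Kruskal's algorithm: scan edges by increasing weight, keeping an
    edge whenever it joins two previously disconnected pieces of T.

    edges: list of (weight, u, v). Returns the set of MST edges.
    """
    parent = {}
    def find(x):                        # union-find with path compression
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    T = set()
    for w, u, v in sorted(edges):       # edges in order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # edge crosses a cut; no cycle made
            parent[ru] = rv             # merge the two pieces
            T.add((w, u, v))
    return T
```

Each kept edge is the minimum edge crossing the cut between its two
pieces of $T$, so the Greedy-Grow Lemma applies at every step.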
\section{PRIM'S ALGORITHM}
\subsection{Idea}
\ni Very simple idea:
\ul
\li Initially, all vertices are on the far side of the cut.
\li Pick one $r$ to be on near side and check the distance of $r$ to
all its neighbors.
\li From the set of vertices on the far side, pick the one $u$ that is
reachable from the near side over the smallest edge.
\li Move $u$ to the near side. Update the distances of $u$ to its
neighbors on the far side to reflect the edges in the cut.
\li Repeat.
\el
\ni Example...
\subsection{Algorithm}
\begin{algorithm}{Prim-MST}{G,w}
Q \= \emptyset \\
\begin{FOR}{\EACH u \in V[G]}
\mbox{\sc Initialize-Key}(u, \infty) \\
Q \= Q \cup \{ u \}
\end{FOR} \\
r \= \mbox{some vertex of $V[G]$} \\
\mbox{\em edge}[r] \= \mbox{\sc nil} \\
\mbox{\sc Decrease-Key}(r, 0) \\
T \= \emptyset \\
\begin{WHILE}{Q \neq \emptyset}
u \= \mbox{\sc Remove-Smallest-Key}(Q) \\
\begin{IF}{\mbox{\em edge}[u] \neq \mbox{\sc nil}}
T \= T \cup \{\mbox{\em edge}[u]\}
\end{IF} \\
\begin{FOR}{\EACH v \in \mbox{\em Adj}[u]}
\begin{IF}{v \in Q \mbox{ and } w(u,v) < \mbox{\em key}[v]}
\mbox{\sc Decrease-Key}(v, w(u,v)) \\
\mbox{\em edge}[v] \= (u,v)
\end{IF}
\end{FOR}
\end{WHILE}
\end{algorithm}
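The pseudocode above can be sketched in Python with the
standard-library {\tt heapq} module. Since a plain binary heap has no
{\sc Decrease-Key}, this version pushes a fresh entry instead and
skips stale ones on the way out, which is a common and equivalent
trick; the adjacency-list representation is mine, not from lecture.

```python
import heapq

def prim_mst(adj, r):
    """Prim's algorithm with a binary heap in place of Decrease-Key
    (improved keys are pushed as new entries; stale entries are skipped).

    adj: dict vertex -> list of (weight, neighbor); r: root vertex.
    Returns the set of MST edges as (u, v) pairs.
    """
    in_tree = set()
    T = set()
    heap = [(0, r, None)]               # entries: (key, vertex, edge-in)
    while heap:
        key, u, edge = heapq.heappop(heap)
        if u in in_tree:                # stale entry; key improved later
            continue
        in_tree.add(u)                  # move u to the near side
        if edge is not None:            # root has no incoming edge
            T.add(edge)
        for w, v in adj[u]:             # update keys across the cut
            if v not in in_tree:
                heapq.heappush(heap, (w, v, (u, v)))
    return T
```

Each vertex is popped once and each edge pushed at most twice, giving
the $O(|E|\log|V|)$ bound discussed below.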
\subsection{Running Time Analysis}
\ni Three basic operations left to specify:
\ul
\li {\sc Initialize-Key}
\li {\sc Decrease-Key}
\li {\sc Remove-Smallest-Key}
\el
\ni How often do we do each?
% |V|, |E|, |V|
\ni Simple data structure: list of nodes $Q$ with their keys. Running
time per operation? $O(1)$, $O(1)$, $O(|V|)$.
\ni Total running time: $O(|V|^2+|E|) = O(|V|^2)$
\subsection{Better Data Structure}
\ni The simple data structure yields a running time linear in the
input size for dense graphs, $|E|=\Theta(|V|^2)$.
\ni Can do better for sparse graphs.
\ni Heap: Running time per operation? $O(\log |V|)$, $O(\log |V|)$,
$O(\log |V|)$.
\ni Total running time: $O(|V|\log|V|+|E|\log|V|)=O(|E|\log|V|)$
\ni Worse for dense graphs, better for sparse graphs.
\subsection{Even Better Data Structure}
\ni ``Fibonacci heaps'' have a good amortized running time for {\sc
Decrease-Key}: $O(\log |V|)$, $O(1)$, $O(\log |V|)$.
\ni Total running time: $O(|E|+|V|\log|V|)$.
\ni Probably wouldn't really ever want to use this, but it is an
asymptotic improvement.
% \section{KRUSKAL'S ALGORITHM}
%
% zzz Fast Union-Find
\section{HOMEWORK}
\subsection{Homework 9}
\ni Background: Ch. 23.1, 23.4, 23.5, 27.2, 27.3
\ni Due November 13th:
\ul
\li 9.1: CLR Exercise 23.1-3 (pg. 468).
\li 9.2: Extra credit. CLR Exercise 23.1-6 (pg. 468).
\li 9.3: CLR Exercise 23.4-1 (pg. 487).
\li 9.4: CLR Exercise 23.5-1 (pg. 494).
\li 9.5: CLR Exercise 23.5-5 (pg. 494).
\li 9.6: Extra credit. In class, we came up with an alternate
algorithm for strongly connected components. The algorithm went
something like this: Start with a graph. Find a cycle in the graph.
If there is no cycle, we're done. If there is, mark all nodes in the
cycle as being in the same strongly connected component. Now collapse
all those nodes into a single node and repeat. Describe an efficient
implementation of this procedure and give its running time.
\li 9.7: Give an $O(|V|+|E|)$ (linear-time) algorithm for maximal (not
maximum) bipartite matching.
\li 9.8: Give an example bipartite graph with $2k$ left nodes and $2k$
right nodes with a maximal matching of size $k$ and a maximum matching
of size $2k$.
\li 9.9: Extra credit. In the previous question, the ratio of the
size of the maximum matching to the maximal matching is 2:1. Prove
that this is the largest possible ratio. We can conclude from this
that a maximal matching is an approximation (to within a factor of 2)
of the maximum matching.
\li 9.10: The book gives an $O(|V|^3)$ algorithm for network flow. When
applied to bipartite matching, is this better or worse than the
$O(|V||E|)$ algorithm we discussed?
\el
\subsection{Homework 10 (incomplete)}
\ni Background: Ch. 24.1, 24.2.
\ni Due November 20th:
\ul
\li 10.1: Given a graph $G=(V,E)$, let $e$ be the second-smallest edge
in $G$ according to the weight function $w$. Argue that there is
an MST containing $e$.
\li 10.2: Argue that the choice of initial node $r$ in {\sc Prim-MST}
affects the final tree returned, by constructing an example in which
two different choices of $r$ lead to two different trees. Argue that,
ultimately, the choice is immaterial in that it doesn't affect the
cost of the final tree.
\li 10.3: Give tight asymptotic ranges of the number of edges for
which a Fibonacci-heap-based implementation of Prim's algorithm is
asymptotically better than the heap or simple list-based data
structure. Your answer should have the form $|E|=\omega(f(|V|))$,
$|E|=o(g(|V|))$.
% \omega(V), o(V^2)
\el
\end{document}