Seminar in Algorithms (0368.4343.01) Fall 2005/2006

Haim Kaplan and Elad Verbin

Tuesdays 13-15  319 קפלון

General Guidelines

 

SUBJECT:  Markov-Chain Monte-Carlo (MCMC) Methods in Theoretical Computer Science and their uses for combinatorial sampling and counting problems.

 

In the seminar we will present the topic of Markov Chain Monte Carlo (MCMC) Algorithms.

 

Here is an example: Let G be a graph with maximum degree Δ, let k be an integer such that k>Δ+1. You wish to uniformly sample a legal coloring of the graph G with k colors (this is called a k-coloring). It is easy to see that there is at least one such legal coloring, and it is easy to find it. (Note: the problem is not NP-Hard because we are coloring not by the minimal number of colors, but by a number of colors that is guaranteed to suffice). But how can we pick a random legal coloring? We take the (arbitrary) coloring that we found, and we make small local random changes to it repeatedly, while making sure it stays a legal coloring. A process of this form is called a Markov Chain (in this case, the chain is defined on the colorings of the graph). With an appropriate Markov Chain, one can prove that if we run for a sufficiently large number of steps then we will get a coloring from a distribution that is (almost) uniform. The problem is how many steps this actually requires: is it polynomial ? The answer to this question depends, of course, on the Markov Chain Chosen, that is on the type of "local, random changes".

 

The field of MCMC Algorithms is about Markov chains for combinatorial objects and trying to upper bound their so-called mixing time (the time it takes them to "mix" to the uniform distribution, or close to it). There is a variety of techniques to do this, and we will go over some of them.

We will use basic tools from probability theory that will be overviewed in the first three lectures of the seminar.

 

Basides sampling another focus of the field is counting combinatorial objects, for example estimating how many legal k-colorings a graph has. One important way to do this is to use the algorithms for uniform sampling:, together with some additional ideas. We will introduce this in Lecture 7.

 

In the seminar we will concentrate on the application of the MCMC method to three problems: choosing a random k-coloring in a graph (lectures 3-6),  choosing a random independent set in a graph, and choosing a random matching in a bipartite graph (lectures 10-12).. We will also discuss the "counting" variants of these problems. Of notable interest is the counting version of the third problem, which is equivalent to computing the permanent of a 0-1 matrix (lectures 10-12).

 

Powerpoint Presentations

 

Lecture 4a - Sampling Spanning Trees – Michal Moshkovitz

Lecture 4b - Sampling Linear Extensions - Michal Moshkovitz

Lecture 7 -  The relation between Sampling and Counting + Simulated Annealing – Anat Halpern

 

Tentative schedule

 

1. General overview of seminar material, plus some easy examples. Definitions of some of the models; Sketch of relation of sampling and counting. Relation between the CS models and the physical models.

 

2. [7, lecture 2]: Markov Chains – definition and ergodicity; Coupling and the coupling lemma; some sample applications if time allows. Essentially, the student should teach Eric Vigoda's Lecture 2 [7, lecture 2] carefully. Additional resources are the references in Eric Vigoda's Lecture 2; Standard book on Markov Chains; The other surveys.

 

3. [107]: Path Coupling and sampling k-colorings for k=2*Δ+1. See also [7, lecture 4], and most of the other lecture notes and surveys.

 

4. [7, lecture 4] and [115]: Two applications of coupling to design sampling algorithms: Sampling a spanning tree [7, lecture 4], and sampling linear extensions [115].

 

5. [116]: Sampling Independent Sets.

 

6. [110]: Coupling with the stationary distribution. A technique that considerably strengthens path coupling, and allows one to polynomially sample k-colorings for k=1.764*Δ, and sample independent sets somewhat better. Also see [114], which is an improvement of this.

 

7. [104, sections 1,2]: The relation between Sampling and Counting + Simulated Annealing. Additional Resources: The relation between sampling and counting, without the advanced material, is available at [7, lecture 1, section 3]. Also relevant is [8, lecture 8].

 

8. [105,106]: Perfect Sampling (or Exact Sampling): Coupling From The Past (CFTP) [105] and Fill's Algorithm [106], with emphasis on CFTP. Additional Resources: [7, lecture 3], [8, lecture 10], papers cited within [13]; also a variety of papers by Propp and Wilson. Try to google "coupling from the past", for "exact sampling" or for "perfect sampling".

 

9. [112]: The Bounding Chain Method and its application for perfectly sampling k-colorings and independent sets. Additional Resources: Ph. D. Thesis of Mark Huber [113]

 

10. [103]: Approximating the permanent – part I. Presentation of the Canonical Paths Method as used in [103], with proofs of the theorems, and as as possible from [103]. Additional Resources:  See week 7 in Vigoda's Lecture Notes [7] and the references therein; See the Book chapter by Jerrum and Sinclair [1].

 

11. [103]: Approximating the Permanent. – part II. Presenting the rest of [103].

 

12. [104]: More efficiently approximating the Permanent.

 

13. [117]: This talks about the connection of what we have been doing all semester to things that physicists are interested in.

 

14 (if there is sufficient time). [111]: A Non-Markovian Coupling for Randomly Sampling Colorings: This is a state-of-the-art technique to show how to polynomially sample k-colorings for k=(1+epsilon)*Delta, for any epsilon>0. This is a good point to start further research: On simplifying or extending this result.

 

Resources

 

Books and book chapters:

[1] 1996 Book Chapter by Jerrum and Sinclair. Slightly different focus than ours, but very useful. Has good (although somewhat outdated) treatment of Volume Computation, Sampling Matchings, and simulated annealing. Does not treat couplings (at all?)

[2] Book By Aldous and Fill. Not entirely relevant, but could be useful.

 

Survey Articles:

[4] Dana Randall's survey, and slides from the talk.

[5] A. Frieze and E. Vigoda, Survey of Markov Chains for Randomly Sampling Colorings.

[6] Mark Jerrum, Mathematical foundations of the Markov chain Monte Carlo method

 

Lecture Notes:

[7] 2003 course of Eric Vigoda.

[8] 2002 course of Alistair Sinclair.

[11] Course of Mark Jerrum

[12] Notes from Conference in Singapore (some talks there are off-topic for us, but there is, for example, some information on perfect sampling which may be helpful, and perhaps more)

 

Course Homepages and on-line resources:

[13] The Exact Sampling Homepage

[14] Course homepage of Dorit Aharonov

 

 

Papers

 

[103] M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries, JACM, 2004..

[104] I. Bezakova, D. Stefankovic, V. Vazirani, and E. Vigoda.  Accelerating Simulated Annealing for the Permanent and Combinatorial Counting Problems, to appear in SODA 2006.

[105] James G. Propp and David B. Wilson. Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9(1&2):223--252, 1996.

[106] James A. Fill. An interruptible algorithm for perfect sampling via Markov chains. The Annals of Applied Probability, 8(1):131--162, 1998.

[107] Russ Bubley, Martin E. Dyer. Path Coupling: A Technique for Proving Rapid Mixing in Markov Chains. FOCS 1997: 223-231 (available in the library – ask Elad if you can't find it).

[110] T. Hayes and E. Vigoda. Coupling with the Stationary Distribution and Improved Sampling for Colorings and Independent Sets.

[111] T. Hayes and E. Vigoda.  A Non-Markovian Coupling for Randomly Sampling Colorings.

[112] Mark Huber. Exact Sampling and Approximate Counting Techniques.

[113] Mark Huber. "Perfect Sampling Using Bounding Chains" Ph.D. Thesis. Cornell University. 1999.

[114] Frieze and Vera. On randomly colouring locally sparse graphs.

[115] Russ Bubley, Martin E. Dyer.  Faster Random Generation of Linear Extensions.

[116] Eric Vigoda. A note on the Glauber dynamics for sampling independent sets.
[117] Martin Dyer, Alistair Sinclair, Eric Vigoda and Dror Weitz.  Mixing in time and space for lattice spin systems: A combinatorial view..

 

 

Further reading

 

Randomly coloring constant degree graphs

Martin Dyer, Alan Frieze, Thomas P. Hayes and Eric Vigoda

Proc. of  the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2004).

 

Randomly coloring sparse random graphs with fewer colors than the maximum degree

Martin Dyer, Abie Flaxman, Alan Frieze and Eric Vigoda

 

On the sampling problem for H-colorings on the hypercubic lattice

Christian Borgs, Jenifer Chayes, Martin Dyer and Prasad Tetali 

 

Improved bounds for sampling colorings

Eric Vigoda

 

The Glauber dynamics on the colourings of a graph with large girth and maximum degree

M. Molloy

SIAM J. Computing (to appear)

Proves mixing for k=1.4*Δ, which is the best possible using Markovian Couplings. A good research direction might be to try and generalize it to a path-coupling-like method.

 

L. Lovasz, S. Vempala: Simulated annealing in convex bodies and an O*(n4) volume algorithms. (Note that this heavily uses the next paper.) There is another version available as a Tech. Report: http://research.microsoft.com/users/lovasz/vol4-focs-tr.pdf .

 

L. Lovasz, S. Vempala  Hit-and-run from a corner. You also might be interested in the STOC version.

 

S. Vempala: Geometric Random Walks: A Survey..

 

Course of S. Vempala (not very much on our topic, but does have some material that relates to Volume Computations)

Another Course by Vempala, this time on Convex Geometry.