# The probabilistic method / method of conditional probabilities

Here is a brief introduction to the method of conditional probabilities.

If you’re not familiar with the probabilistic method (or randomized rounding), you should first read about it here.

Our interest in this blog is using randomized rounding and the method of conditional probabilities to derive greedy and Lagrangian-relaxation algorithms.

The method of conditional probabilities is a systematic method for converting non-constructive probabilistic existence proofs into efficient deterministic algorithms that explicitly construct the desired object.

Next is a description of the method followed by a simple example.

Raghavan (1988) gives this description:

We first show the existence of a provably good approximate solution using the probabilistic method [1]. [We then] show that the probabilistic existence proof can be converted, in a very precise sense, into a deterministic approximation algorithm. To this end we use an interesting “method of conditional probabilities”… We apply our method to integer programs arising in packing, routing, and maximum multicommodity flow… [2]

(Raghavan is discussing the method in the context of randomized rounding, but it works with the probabilistic method in general.)

To derive a constructive algorithm from a random experiment, one first casts the random experiment as a series of “small” random choices. Then, one modifies the random process, replacing each small random choice with a deterministic choice, chosen so as to keep the current conditional probability of failure (given the choices made so far) below 1.

## Toy example

Here is a toy example to illustrate the principle.

Lemma.

It is possible to flip three coins so that the number of tails is at least 2.

Proof.

If the three coins are flipped randomly, the expected number of tails is 1.5. Thus, there must be some outcome (way of flipping the coins) so that the number of tails is at least 1.5. Since the number of tails is an integer, in such an outcome there are at least 2 tails.

In this example the random experiment consists of flipping three fair coins. The experiment is illustrated by the rooted tree in the diagram. There are eight outcomes, each corresponding to a leaf in the tree. A trial of the random experiment corresponds to taking a random walk from the root (the top node in the tree, where no coins have been flipped) to a leaf. The successful outcomes are those in which at least two coins came up tails. The interior nodes in the tree correspond to partially determined outcomes, where only 0, 1, or 2 of the coins have been flipped so far.

To apply the method of conditional probabilities, one focuses on the “conditional probability of failure, given the choices so far” as the experiment proceeds step by step. In the diagram, each node is labeled with this conditional probability.

(For example, if only the first coin has been flipped, and it comes up tails, that corresponds to the second child of the root. Conditioned on that partial state, the probability of failure is 0.25.)

The method of conditional probabilities replaces the random root-to-leaf walk in the random experiment by a deterministic root-to-leaf walk to a leaf labeled 0.
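That deterministic walk is easy to simulate. Below is a minimal sketch for the three-coin example (function and variable names are illustrative): at each step, the walk moves to the child that minimizes the conditional failure probability, which by the averaging argument is always less than 1.

```python
from itertools import product

# Toy example: flip three coins; "success" means at least 2 tails.
def failure_prob(prefix):
    """Conditional probability of failure (fewer than 2 tails),
    given the flips fixed so far, if the rest are flipped randomly."""
    rest = 3 - len(prefix)
    outcomes = list(product("HT", repeat=rest))
    bad = sum(1 for o in outcomes if (prefix + list(o)).count("T") < 2)
    return bad / len(outcomes)

flips = []
for _ in range(3):
    # Replace the random flip by a deterministic choice: move to a child
    # whose conditional failure probability stays below 1 (take the minimizer).
    flips.append(min("HT", key=lambda c: failure_prob(flips + [c])))
```

At the root the failure probability is 0.5; after the first deterministic step (tails) it drops to 0.25, and the walk ends at a leaf with failure probability 0, i.e. an outcome with at least two tails.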

## The general method

In general, the method replaces each random step by a deterministic step, carefully chosen to inductively maintain the following invariant:

the conditional probability of failure, given the current state, is less than 1.

In this way, it is guaranteed to arrive at a leaf with label 0, that is, a successful outcome.

The invariant holds initially (at the root), because the original proof showed that the (unconditioned) probability of failure is less than 1. The conditional probability at any interior node is the average of the conditional probabilities of its children. The latter property is important because it implies that “any interior node whose conditional probability is less than 1 has at least one child whose conditional probability is less than 1.” Thus, from any interior node, one can always choose some child to walk to so as to maintain the invariant. Since the invariant holds at the end, when the walk arrives at a leaf and all choices have been determined, the outcome reached in this way must be a successful one.

This invariant holds initially, before any choices have been made, because the conditional probability of failure at that time is just the probability of failure in the original random experiment, and we are assuming that we have already shown that the experiment succeeds with positive probability. Suppose the invariant holds just before some step in the random experiment. That is, the conditional probability of failure, given the current state, is less than 1. Then there must be some way of making the next step that leads to a state where the conditional probability is also less than 1. This is simply (and crucially) because the conditional probability for the current state is a weighted average of the conditional probabilities for the possible next states.

Thus, each choice in the random experiment can be made to keep the conditional probability of failure below 1. Once all choices have been made, the conditional probability of failure is determined — it is 1 if failure occurs, and 0 otherwise. Since the invariant ensures it is less than 1, it must be 0, guaranteeing a successful outcome.

There are various ways to make each choice so as to keep the conditional probability of failure below 1. Often, the exact conditional probability of failure is hard to compute. Instead, one keeps an easily computed upper bound on it (a so-called pessimistic estimator) below 1, which suffices. Or, it may suffice to keep the conditional expectation of some other quantity above a certain threshold. All of these approaches have a similar flavor.

## Example: derandomizing the Max-Cut existence proof

Here is how the method of conditional probabilities works for the Max-Cut example in the note on the probabilistic method. That proof shows that, for a randomly chosen cut, the expected number of edges cut is $|E|/2$, where $E$ is the edge set of the graph. Thus, the probability of cutting fewer than $|E|/2$ edges is less than 1.

Let the random variable $Q$ be the number of edges cut. The algorithm will emulate the random experiment, coloring the vertices one by one. To keep the conditional probability of failure (cutting fewer than $|E|/2$ edges) below 1, the algorithm will keep the conditional expectation of $Q$ at or above $|E|/2$. To do this, it will simply color each vertex so as to keep the conditional expectation of $Q$ from decreasing. (Here, the “conditional expectation” refers to the expectation if the remaining vertices were to be colored randomly, even though the algorithm will end up coloring them deterministically.)

So what is the conditional expectation of $Q$? Suppose the first $t$ vertices have been colored, and let $S_t$ denote this state (the colors of the first $t$ vertices). Each edge whose two endpoints are already colored differently will definitely be cut; let $C(S_t)$ denote the number of these edges. Each edge whose two endpoints are already colored the same will definitely not be cut. Each other edge has at least one undetermined endpoint and has a 1/2 chance of being cut; let $U(S_t)$ denote the number of these edges. Thus, the conditional expectation of $Q$ given $S_t$, that is, $E[Q \mid S_t]$, is $C(S_t) + U(S_t)/2$: the number of edges cut so far plus half the number of undetermined edges.
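This conditional expectation (edges already cut, plus half an edge for each edge with an uncolored endpoint) is straightforward to compute from a partial coloring. Here is a minimal sketch; the names are illustrative, and `colors` maps each already-colored vertex to `"red"` or `"blue"`.

```python
# Conditional expectation of the number of cut edges, given a partial
# coloring, if the remaining vertices were colored uniformly at random.
def cond_expected_cut(edges, colors):
    cut = undetermined = 0
    for u, v in edges:
        if u in colors and v in colors:
            cut += colors[u] != colors[v]   # determined: cut or not cut
        else:
            undetermined += 1               # 1/2 chance under random completion
    return cut + undetermined / 2
```

For example, on a triangle with no vertices colored this returns 1.5, half the number of edges, matching the unconditioned expectation.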

To color the $(t+1)$st vertex, say $v$, the algorithm computes the conditional expectation that would result from coloring $v$ red and compares it to the conditional expectation that would result from coloring $v$ blue. One of these two choices gives a conditional expectation at least as large as the current one, $E[Q \mid S_t]$. (This is because the current conditional expectation is the average of the two possible next conditional expectations.) The algorithm colors $v$ whichever way maximizes the resulting conditional expectation, guaranteeing that $E[Q \mid S_{t+1}] \ge E[Q \mid S_t]$.

More concretely, this algorithm reduces to the following: Consider the vertices in any order. When considering a vertex $v$, color it red if, among its already-colored neighbors, more are blue than red. Otherwise, color it blue.
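The greedy rule above can be sketched as follows, assuming a simple undirected graph given by vertex and edge lists (names are illustrative); the final count confirms that at least half the edges are cut.

```python
# Derandomized Max-Cut greedy: color each vertex opposite to the majority
# color among its already-colored neighbors.
def greedy_max_cut(vertices, edges):
    colors = {}
    for v in vertices:
        # other endpoints of edges incident to v
        nbrs = [u for e in edges if v in e for u in e if u != v]
        blue = sum(1 for u in nbrs if colors.get(u) == "blue")
        red = sum(1 for u in nbrs if colors.get(u) == "red")
        colors[v] = "red" if blue > red else "blue"
    cut = sum(1 for u, w in edges if colors[u] != colors[w])
    return colors, cut
```

For instance, on the complete graph on four vertices (6 edges) this cuts 4 edges, at least half, as guaranteed.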

Since the algorithm maintains the invariant $E[Q \mid S_t] \ge |E|/2$, it maintains the invariant that the conditional probability of failure is less than 1. Thus, it is guaranteed to succeed (cut at least half the edges).

## Summary

In general, a probabilistic proof implicitly defines a quantity $Q$ with one of the following properties:

1. If $Q < 1$, then the random experiment succeeds (and the proof shows that $E[Q] < 1$). (or)

2. If $Q \ge q$, for some threshold $q$, then the random experiment succeeds (and the proof shows that $E[Q] \ge q$).

For any probabilistic proof, one can take $Q$ to be an indicator variable taking the value 1 if the experiment fails and 0 otherwise. Then $Q$ has the first property above. But this particular choice of $Q$ is not always easy to work with, and most probabilistic proofs implicitly define quantities that are easier to use.

To apply the method of conditional probabilities, we break the random experiment into a sequence of small random choices, then have the algorithm make each choice deterministically so that the conditional expectation of $Q$, given the steps so far, does not increase (or decrease, as appropriate). This is possible at each step because the conditional expectation before the step is a weighted average of the possible conditional expectations after the step. Making each step in this way maintains the invariant $E[Q \mid \text{current state}] < 1$ (or $E[Q \mid \text{current state}] \ge q$, as appropriate). This invariant implies that the conditional probability of failure stays below 1, thus guaranteeing a successful outcome.

On Wikipedia: