On Sampling Edges Almost Uniformly
[ER17] On Sampling Edges Almost Uniformly, reading note; also the note of the talk given by Professor Talya Eden in workshop WoLA.
Introduction
In fact, the original topic of Professor Eden was “counting subgraphs”. However, Eden let us choose to talk about “counting subgraphs” or “sampline edges”, and finally more people chose the latter.
A nice talk!
Preliminaries
The algorithms we consider access an undirected graph via queries. Our algorithm uses the following queries:
- Vertex query: returns a uniformly random vertex .
- Degree query: given a vertex , returns .
- Neighbor query: given and , return ’s -th neighbor; if , this operation returns fail.
We want to design an algorithm which has low query complexity and can return an edge with probability at least . Moreover, each returned edge is sampled according to a distribution that is pointwise -close to uniform.
To design the algorithm, we need to know the approximate number of edges . And there is an algorithm:
Thm 2.1. ([GR08]) Let be a graph with vertices and edges. There exists an algorithm which uses ˜ query complexity in expectation and outputs an estimate of that with probability at least satisfies .
And another lemma about statistics distance:
Lemma 2.2. Let be a probability distribution over a finite set which satisfies:
Then is pointwise -close to uniform
Warm-up —— In Regular Graph or Bounded Degree Graph
For regular graph, sampling edges is very easy. We just sample a vertex and then sample a neighbor edge of .
For bounded degree graph, let the maximum degree be a constant. We can just make a small change: when we get the edge by sampling the neighbor of , we accept it with probability .
Then the probability we get the specific edge is:
We see that in the case algorithm succeds, the edge returned is uniformly distributed. Then we repeat this algorithm times to get successful probability.
When the graph is a star, then , and the query complexity bacome very large (not sublinear).
General Graph
Defn 4.1. For a given threshold , we say that a vertex is a light vertex if and
otherwise we say that it is a heavy vertex. We say that an edge is a light edge if it originates from a light vertex. A heavy edge is defined analogously. Finally, we denote the sets of light and heavy vertices by and respectively, and the sets of light and heavy edges by and respectively.
The algorithm is very simple:
Lemma 4.2. The procedure performs a constant number of queries and succeeds with probability . In the case where succeeds, the edge returned is uniformly distributed in .
Proof. Almost the same as the bounded degree case.
Lemma 4.3. The procedure performs a constant number of queries and succeeds with probability in . In the case where succeeds, the edge returned is distributed according to a distribution that is pointwise -close to uniform.
Proof. For a vertex , we denote by . is defined similarly. Then we have:
Thus:
And for the edge in line , we have:
On the other hand, implies that:
Thus we have:
Lemma 3.4. If , then Algorithm returns an edge with probability at least t. The distribution induced by the algorithm is pointwise -close to uniform over .
Proof. For a specific loop in line of , we have:
Thus by repeating times, successful probability is at most:
Moreover, by Lemmas 3.1 and 3.2, we have:
Thus, we have:
By Lemma 2.1, we have done.
Finally, if is not known, then we repeat [GR08] algorithm times, and find the median of these results. Chernoff bounds guarantees that is good with probability at least .
Lower Bound
[ER17] also gives a lower bound of the quert complexity that sampling edge almost uniformly, while Professor Eden had no time to discuss in her talk.
TODO
Discussion
The technique we use:
- Rejection Sampling;
- Making things uniform;
- Heavy / light.
Reference
[ER17] Talya Eden and Will Rosenbaum. On Sampling Edges Almost Uniformly. In 1st Symposium on Simplicity in Algorithms (SOSA 2018). Open Access Series in Informatics (OASIcs), Volume 61, pp. 7:1-7:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
[GR08] Oded Goldreich and Dana Ron. Approximating average parameters of graphs. Random Struct. Algorithms, 32(4):473–493, 2008.