Teaching:

- Fal. '13 Math 4221: Stochastic Processes I
- Sp. '13 Math 6221: Advanced Classical Probability
- Fa. '12 Math 4221: Stochastic Processes I
- Sp. '12 Math 3215: Probability & Statistics
- Sp. '12 CS 8803: Discrete Fourier Analysis

Upcoming visits:

- September - December: IMA Year on Discrete Structures
- September 4-6: RANDOM/APPROX 2014
- September 22-26: AIM Square workshop, Palo Alto, CA
- December 9: Penn/Temple probability seminar
- December 11: Indiana University probability seminar

Subsampled Power Iteration: a New Algorithm for Block Models and Planted CSP's, *preprint*,
(with Vitaly Feldman and Santosh Vempala)

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used, significantly faster than both spectral and SDP-based approaches.

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used, significantly faster than both spectral and SDP-based approaches.

On the Complexity of Random Satisfiability Problems with Planted Solutions, *preprint*,
(with Vitaly Feldman and Santosh Vempala)

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient*statistical* algorithm requires n^{r/2} clauses,
where *r* is a parameter that depends on the distribution and can be as high as k. We also describe a statistical algorithm that
matches the sample complexity up to log factors. This gives concrete evidence for the security
of Goldreich's proposed one-way function,
and proves Feige's 3-SAT hypothesis for
statistical algorithms.

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient

On Sharp Thresholds in Random Geometric Graphs, *RANDOM 2014, to appear*, (with Milan Bradonjic)

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

Computer-enabled Metrics of Statistical Significance for Discrete Data, *preprint*, (with Mark Tygert and Rachel Ward)

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

Random k-SAT and the Power of Two Choices, *to appear, Random Structures and Algorithms*.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

Some deficiencies of χ^{2} and classical exact tests of significance, *Applied and Computational Harmonic Analysis*, to appear. (with Mark Tygert and Rachel Ward)

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

On the Resilience of Bipartite Networks, *submitted*, (with Lev Reyzin)

We study the question `Which networks are least susceptible to widespread infection?' asked in (Blume et al) for d-regular graphs. Here we consider half-regular bipartite graphs, a natural model in many applications, and give extremal results, worst-case NP-hardness, and draw a connection with the study of percolation on finite graphs.

We study the question `Which networks are least susceptible to widespread infection?' asked in (Blume et al) for d-regular graphs. Here we consider half-regular bipartite graphs, a natural model in many applications, and give extremal results, worst-case NP-hardness, and draw a connection with the study of percolation on finite graphs.

Large Deviations for the Empirical Distribution in the Branching Random Walk, *submitted*, (with Oren Louidor)

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

The Bohman-Frieze Process Near Criticality (with Mihyun Kang and Joel Spencer), * Random Structures and Algorithms*, 2013.

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

Hardness of Finding Independent Sets In Almost 3-Colorable Graphs, * FOCS * 2010 (with Irit Dinur, Subhash Khot, and Muli Safra)

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

The Forgetfulness of Balls and Bins, * Random Structures and Algorithms*, 2012.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

Computing the confidence levels for a root-mean-square test of goodness-of-fit, * Applied Mathematics and Computation*, 2011 (with Mark Tygert and Rachel Ward)

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.