Fall 2019, Spring 2020

The Georgia Tech ACO Student Seminar is run by students in the ACO program.

The propose of this seminar is to keep students updated with current researches, and to practice presenting skills for those who will give a talk in a conference. Any topic related (but not restricted) to algorithms, combinatorics, and optimization is very welcome. You can share research results, present a work in progress or just share something of general interest. There will also be occasional talks by ACO faculty.

The seminar
currently meets on Fridays from **1pm** to **2pm** at **Skiles 005**, unless noted
below.

Food and drinks will be provided. Thanks to ACO, ARC, and ISYE for their sponsorship.

Subscribe to the ACO Student Seminar mailing list.

Learning Optimal Reserve Price against Non-myopic Bidders

**Abstract: **We consider the problem of learning optimal reserve price in repeated auctions against non- myopic bidders, who may bid strategically in order to gain in future rounds even if the single- round auctions are truthful. Previous algorithms, e.g., empirical pricing, do not provide non- trivial regret rounds in this setting in general. We introduce algorithms that obtain a small regret against non-myopic bidders either when the market is large, i.e., no single bidder appears in more than a small constant fraction of the rounds, or when the bidders are impatient, i.e., they discount future utility by some factor mildly bounded away from one. Our approach carefully controls what information is revealed to each bidder, and builds on techniques from differentially private online learning as well as the recent line of works on jointly differentially private algorithms.

Clustering a Mixture of Gaussians (

Strong self concordance and sampling (

Learning and Testing for Graphical Models

**Abstract: **In this talk we introduce some machine learning problems in the setting of undirected graphical models, also known as spin systems. We take proper colorings as a representative example of a hard-constraint graphical model. The classic problem of sampling a proper coloring uniformly at random of a given graph has been well-studied. Here we consider two inverse problems: Given random colorings of an unknown graph G, can we recover the underlying graph G exactly? If we are also given a candidate graph H, can we tell if G=H? The former problem is known as structure learning in the machine learning field and the latter is called identity testing. We show the complexity of these problems in different range of parameters and compare these results with the corresponding decision and sampling problems. Finally, we give some results of the analogous problems for the Ising model, a typical soft-constraint model. Based on joint work with Ivona Bezakova, Antonio Blanca, Daniel Stefankovic and Eric Vigoda.

Differential Privacy: The Census Algorithm

**Abstract: **For the first time in 2020, the US Census Bureau will apply a differentially private algorithm before publicly releasing decennial census data. Recently, the Bureau publicly released their code and end-to-end tests on the 1940 census data at various privacylevels. We will outline the DP algorithm (which is still being developed) and discuss the accuracy of these end-to-end tests. In particular, we focus on the bias and variance of the reported population counts. Finally, we discuss the choices the Bureau has yet to make that will affect the balance between privacy and accuracy. This talk is based on joint work with Abraham Flaxman.

Graph Algorithms and Offline Data Structures

**Abstract: **Graphs, which in their simplest forms are vertices connected by edges,
are widely used in high performance computing, machine learning and
network science. This talk will use recent progresses on two
well-studied algorithmic problems in static and dynamic graph,
max-flows and dynamic matchings, to discuss a methodology for
designing faster algorithm for large graphs. This approach is
motivated by a fundamental phenomenon in data structures: the
advantages of offline data structures over online ones.

I will start by describing how work on max-flows led to a focus on
finding short paths in residual graphs, and how investigating more
global notions of progress in residual graphs led to a more
sophisticated and general understanding of iterative methods and
preconditioning. I will then discuss a similar phenomenon in dynamic
graphs, where maintaining a large matching seems to require the online
detection of short augmenting paths, but can once again be
circumvented through the offline construction of smaller equivalent
graphs.

Bounds on Ramsey Games via Alterations

**Abstract: **In this talk we introduce a refined alteration approach for constructing \(H\)-free graphs: we show that removing
all edges in \(H\)-copies of the binomial random graph does not significantly change the independence number
(for suitable edge-probabilities); previous alteration approaches of Erdös and Krivelevich remove only a
subset of these edges. We present two applications to online graph Ramsey games of recent interest,
deriving new bounds for Ramsey, Paper, Scissors games and online Ramsey numbers (each time extending recent results of Fox–He–Wigderson and Conlon–Fox–Grinshpun–He).

Based on joint work with Lutz Warnke.

Beyond Submodular Maximization

**Abstract:** In the past decade, the catalog of algorithms available to combinatorial optimizers has been substantially extended to settings which allow submodular objective functions. One significant recent result was a tight (1-1/e)-approximation for maximizing a non-negative monotone submodular function subject to a matroid constraint. These algorithmic developments were happening concurrently with research that found a wealth of new applications for submodular optimization in machine learning, recommender systems, and algorithmic game theory.

The related supermodular maximization models also offer an abundance of applications, but they appeared to be highly intractable even under simple cardinality constraints and even when the function has a nice structure. For example, the densest subgraph problem - suspected to be highly intractable - can be expressed as a monotonic supermodular function which has a particularly nice form. Namely, the objective can be expressed by a quadratic form \(x^T A x\) where \(A\) is a non-negative, symmetric, 0-diagonal matrix. On the other hand, when the entries \(A(u,v)\) form a metric, it has been shown that the associated maximization problems have constant factor approximations. Inspired by this, we introduce a parameterized class of non-negative functions called meta-submodular functions that can be approximately maximized within a constant factor. This class includes metric diversity, monotone submodular and other objectives appearing in the machine learning and optimization literature. A general meta-submodular function is neither submodular nor supermodular and so its multi-linear extension does not have the nice convexity/concavity properties which hold for submodular functions. They do, however, have an intrinsic one-sided smoothness property which is essential for our algorithms. This smoothness property might be of independent interest.

Expander decomposition: applications to dynamic and distributed algorithms

**Abstract: **
Expander decomposition has been a central tool in designing graph algorithms in many fields (including fast centralized algorithms, approximation algorithms and property testing) for decades. Recently, we found that it also gives many impressive applications in dynamic graph algorithms and distributed graph algorithms. In this talk, I will survey these recent results based on expander decomposition, explain the key components for using this technique, and give some toy examples on how to apply these components.

**October 18:** ARC colloquium

High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

**Abstract: **We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of \(d\)-dimensional distributions arising from generalized linear models, we prove that the resulting third-order algorithm produces samples from a distribution that is at most \(\varepsilon\) in Wasserstein distance from the target distribution in \(O(d^{1/3}/ \varepsilon^{2/3})\) steps. This result requires only Lipschitz conditions on the gradient. For general strongly convex potentials with α-th order smoothness, we prove that the mixing time scales as \(O (d^{1/3} / \varepsilon^{2/3} + d^{1/2} / \varepsilon^{1 / (\alpha - 1)} )\). Our high-order Langevin diffusion reduces the problem of log-concave sampling to numerical integration along a fixed deterministic path, which makes it possible for further improvements in high-dimensional MCMC problems. Joint work with Yi-An Ma, Martin J, Wainwright, Peter L. Bartlett and Michael I. Jordan.

Asymptotic normality of the \(r\to p\) norm for random matrices with non-negative entries

**Abstract:** For an \(n\times n\) matrix \(A_n\), the \(r\to p\) operator norm is defined as \(\|A_n\|_{r \to p}= \sup_{\|x\|_r\leq 1 } \|A_n x\|_p\) for \(r,p\geq 1\). The \(r\to p\) operator norm puts a huge number of important quantities of interest in diverse disciplines under a single unified framework. The application of this norm spans a broad spectrum of areas including data-dimensionality reduction in machine learning, finding oblivious routing schemes in transportation network, and matrix condition number estimation.

In this talk, we will consider the \(r\to p\) norm of a class of symmetric random matrices with nonnegative entries, which includes the adjacency matrices of the Erd\H{o}s-R\'enyi random graphs and matrices with sub-Gaussian entries. For \(1< p \leq r < \infty\), we establish the asymptotic normality of the appropriately centered and scaled \(\|A_n\|_{r \to p}\), as \(n\to\infty\). The special case \(r=p=2\), which corresponds to the largest singular value of matrices, was proved in a seminal paper by Füredi and Komlós (1981). Of independent interest, we further obtain a sharp \(\ell_\infty\)-approximation for the maximizer vector. The results also hold for sparse matrices and further the \(\ell_\infty\)-approximation for the maximizer vector also holds for a broad class of deterministic sequence of matrices with certain asymptotic `expansion' properties.

This is based on a joint work with Souvik Dhara (MIT) and Kavita Ramanan (Brown U.).

Faster Width-dependent Algorithm for Mixed Packing and Covering LPs

**Abstract:** In this talk, we provide the details of our faster width-dependent algorithm for mixed packing-covering LPs. Mixed packing-covering LPs are fundamental to combinatorial optimization in computer science and operations research. Our algorithm finds a (1+\epsilon\) approximate solution in time \(O(Nw/ \varepsilon)\), where \(N\) is number of nonzero entries in the constraint matrix, and $w$ is the maximum number of nonzeros in any constraint. This algorithm is faster than Nesterov's smoothing algorithm which requires \(O(N\sqrt{n}w/ \epsilon)\) time, where \(n\) is the dimension of the problem. Our work utilizes the framework of area convexity introduced in [Sherman-FOCS’17] to obtain the best dependence on $\varepsilon$ while breaking the infamous \(\ell_{\infty}\) barrier to eliminate the factor of \(\sqrt{n}\). The current best width-independent algorithm for this problem runs in time \(O(N/\epsilon^2)\) [Young-arXiv-14] and hence has worse running time dependence on \(\varepsilon\). Many real life instances of mixed packing-covering problems exhibit small width and for such cases, our algorithm can report higher precision results when compared to width-independent algorithms. As a special case of our result, we report a \(1+\varepsilon\) approximation algorithm for the densest subgraph problem which runs in time \(O(md/ \varepsilon)\), where \(m\) is the number of edges in the graph and $d$ is the maximum graph degree.

Fast convergence of fictitious play

**Abstract: **Fictitious play is one of the simplest and most natural dynamics for two-player zero-sum games. Originally proposed by Brown in 1949, the fictitious play dynamic has each player simultaneously best-respond to the distribution of historical plays of their opponent. In 1951, Robinson showed that fictitious play converges to the Nash Equilibrium, albeit at an exponentially-slow rate, and in 1959, Karlin conjectured that the true convergence rate of fictitious play after k iterations is \(O(k^{-1/2})\), a rate which is achieved by similar algorithms and is consistent with empirical observations. Somewhat surprisingly, Daskalakis and Pan disproved a version of this conjecture in 2014, showing that an exponentially-slow rate can occur, although their result relied on adversarial tie-breaking. In this talk, we show that Karlin’s conjecture holds if ties are broken lexicographically and the game matrix is diagonal. We also show a matching lower bound under this tie-breaking assumption. This is joint work with Jake Abernethy and Andre Wibisono.

Thresholds versus Fractional Expectation-thresholds

**Abstract:** In this talk we will prove a conjecture of Talagrand, which is a fractional version of the “expectation-threshold” conjecture of Kalai and Kahn. This easily implies various difficult results in probabilistic combinatorics, e.g. thresholds for perfect hypergraph matchings (Johansson-Kahn-Vu) and bounded-degree spanning trees (Montgomery). Our approach builds on recent breakthrough work of Alweiss, Lovett, Wu, and Zhang on the Erdős-Rado “Sunflower Conjecture.”

This is joint work with Keith Frankston, Jeff Kahn, and Bhargav Narayanan.

Fall 2019, Spring 2019, Summer 2019

A numerical analysis approach to convex optimization

**Abstract:** In current convex optimization literature, there are significant gaps
between algorithms that produce high accuracy
\((1+1/poly(n))\)-approximate solutions vs. algorithms that produce
\(O(1)\)-approximate solutions for symmetrized special cases. This gap is
reflected in the differences between interior point methods vs.
(accelerated) gradient descent for regression problems, and between
exact vs. approximate undirected max-flow.

In this talk, I will discuss generalizations of a fundamental building
block in numerical analysis, preconditioned iterative methods, to
convex functions that include p-norms. This leads to algorithms that
converge to high accuracy solutions by crudely solving a sequence of
symmetric residual problems. I will then briefly describe several
recent and ongoing projects, including p-norm regression using \(m^{1/3}\)
linear system solves, \(p\)-norm flow in undirected unweighted graphs in
almost-linear time, and further improvements to the dependence on \(p\) in
the runtime.

Sticky Brownian Rounding and its Applications to Optimization Problems

**Abstract:** We present a new general and simple method for rounding
semi-definite programs, based on Brownian motion. Our approach is inspired by
recent results in algorithmic discrepancy theory. We develop and present tools
for analyzing our new rounding algorithms, utilizing mathematical machinery
from the theory of Brownian motion, complex analysis, and partial differential
equations. We will present our method to several classical problems, including Max-Cut, Max-di-cut and Max-2-SAT, and derive new algorithms that are competitive with the best known results. In particular, we show that the basic algorithm achieves 0.861-approximation for Max-cut and a natural variant of the algorithm achieve 0.878-approximation, matching the famous Goemans-Williamson algorithm upto first three decimal digits.

This is joint work with Abbas-Zadeh, Nikhil Bansal, Guru Guruganesh, Sasho Nikolov and Roy Schwartz.

Opportunities at the Intersection of AI and Society

**Abstract:** Powerful AI systems, which are driven by machine learning, are
increasingly controlling various aspects of modern society: from
social interactions (e.g., Facebook, Twitter, Google, YouTube),
economics (e.g., Uber, Airbnb, Banking), learning (e.g., Wikipedia,
MOOCs), governance (Judgements, Policing, Voting), to autonomous
vehicles and weapons. These systems have a tremendous potential to
change our lives for the better, but, via the ability to mimic and
nudge human behavior, they also have the potential to be
discriminatory, reinforce societal prejudices, and polarize opinions.
Moreover, recent studies have demonstrated that these systems can be
quite brittle and generally lack the required robustness to be
deployed in various civil/military situations. The reason being that
considerations such as fairness, robustness, stability,
explainability, accountability etc. have largely been an afterthought
in the development of AI systems. In this talk, I will discuss the
opportunities that lie ahead in a principled and thoughtful
development of AI systems.

**Bio**

Nisheeth Vishnoi is a Professor of Computer Science at Yale University.
He received a B.Tech in Computer Science and Engineering from IIT
Bombay in 1999 and a Ph.D. in Algorithms, Combinatorics and
Optimization from Georgia Tech in 2004. His research spans several
areas of theoretical computer science: from approximability of NP-hard
problems, to combinatorial, convex and non-convex optimization, to
tackling algorithmic questions involving dynamical systems, stochastic
processes and polynomials. He is also broadly interested in
understanding and addressing some of the key questions that arise in
nature and society from the viewpoint of theoretical computer science.
Here, his current focus is on natural algorithms, emergence of
intelligence, and questions at the interface of AI, ethics, and
society. He was the recipient of the Best Paper Award at FOCS in 2005,
the IBM Research Pat Goldberg Memorial Award in 2006, the Indian
National Science Academy Young Scientist Award in 2011, and the IIT
Bombay Young Alumni Achievers Award in 2016.

Travel Behavior Modeling Using Machine Learning

**Abstract:** The popularity of machine learning is increasingly growing in transportation, with applications ranging from traffic engineering to travel demand forecasting and pavement material modeling, to name just a few. Researchers often find that machine learning achieves higher predictive accuracy compared to traditional methods. However, many machine-learning methods are often viewed as “black-box” models, lacking interpretability for decision making. As a result, increased attention is being devoted to the interpretability of machine-learning results.

In this talk, I introduce the application of machine learning to study travel behavior, covering both mode prediction and behavioral interpretation. I first discuss the key differences between machine learning and logit models in modeling travel mode choice, focusing on model development, evaluation, and interpretation. Next, I apply the existing machine-learning interpretation tools and also propose two new model-agnostic interpretation tools to examine behavioral heterogeneity. Lastly, I show the potential of using machine learning as an exploratory tool to tune the utility functions of logit models.

I illustrate these ideas by examining stated-preference travel survey data for a new mobility-on-demand transit system that integrates fixed-route buses and on-demand shuttles. The results show that the best-performing machine-learning classifier results in higher predictive accuracy than logit models as well as comparable behavioral outputs. In addition, results obtained from model-agnostic interpretation tools show that certain machine-learning models (e.g. boosting trees) can readily account for individual heterogeneity and generate valuable behavioral insights on different population segments. Moreover, I show that interpretable machine learning can be applied to tune the utility functions of logit models (e.g. specifying nonlinearities) and to enhance their model performance. In turn, these findings can be used to inform the design of new mobility services and transportation policies.

**Bio:** Xilei Zhao is a Postdoctoral Fellow in the School of Industrial and Systems Engineering at the Georgia Institute of Technology. Dr. Zhao received her B.E. in Civil Engineering from Southeast University, China, in 2013, her MSEs in Civil Engineering and Applied Mathematics and Statistics from Johns Hopkins University (JHU) in 2016 and 2017, respectively, and her Ph.D. in Civil Engineering from JHU in 2017. She specializes in applying data science (data analytics, machine learning, optimization, and simulation), complex systems modeling, and risk assessment to tackle challenging problems in transportation and resilience.

Local Guarantees in Graph Cuts and Clustering

**Abstract:**
Correlation Clustering is an elegant model that captures fundamental graph cut problems such as Minimum s-t Cut, Multiway Cut, and Multicut, extensively studied in combinatorial optimization.

Here, we are given a graph with edges labeled + or - and the goal is to produce a clustering that agrees with the labels as much as possible: + edges within clusters and - edges across clusters.

The classical approach towards Correlation Clustering (and other graph cut problems) is to optimize a global objective, e.g., minimizing the total number of disagreements or maximizing the total number of agreements.

We depart from this and study local objectives: minimizing the maximum number of disagreements for edges incident on a single node, and the analogous max min agreements objective.

This naturally gives rise to a family of basic min-max graph cut problems.

A prototypical representative is Min-Max s-t Cut: find an s-t cut minimizing the largest number of cut edges incident on any node.

In this talk we will give a short introduction of Correlation Clustering and discuss the following results:

(1) an \(O(\sqrt{n})\)-approximation for the problem of minimizing the maximum total weight of disagreement edges incident on any node (thus providing the first known approximation for the above family of min-max graph cut problems)

(2) a remarkably simple 7-approximation for minimizing local disagreements in complete graphs (improving upon the previous best known approximation of 48)

(3) a \((1/(2+\epsilon))\)-approximation for maximizing the minimum total weight of agreement edges incident on any node, hence improving upon the \((1/(4+\epsilon))\)-approximation that follows from the study of approximate pure Nash equilibria in cut and party affiliation
games.

Join work with Moses Charikar and Neha Gupta.

Combinatorial algorithm for Optimal Design

**Abstract:** In an optimal design problem, we are given a set of linear experiments \(v_1,...,v_n \in R^d\) and \(k \ge d\), and our goal is to select a set or a multiset \(S\) subseteq \([n]\) of size \(k\) such that \(\Phi((\sum_{i \in [n]} v_i v_i^T )^{-1})\) is minimized. When \(\Phi(M) = det(M)^{1/d}\), the problem is known as the D-optimal design problem, and when \(\Phi(M) = tr(M)\), it is known as the A-optimal design problem. One of the most common heuristics used in practice to solve these problems is the local search heuristic, also known as the Fedorov's exchange method. This is due to its simplicity and its empirical performance. However, despite its wide usage no theoretical bound has been proven for this algorithm. In this paper, we bridge this gap and prove approximation guarantees for the local search algorithms for D-optimal design and A-optimal design problems. We show that the local search algorithms are asymptotically optimal when \(\frac{k}{d}\) is large. In addition to this, we also prove similar approximation guarantees for the greedy algorithms for D-optimal design and A-optimal design problems when \(k/d\) is large.

Effects of risk-aversion and diversity of user preferences on network routing

**Abstract:**
In network routing users often tradeoff different objectives in selecting their best route. An example is transportation networks, where due to uncertainty of travel times, drivers may tradeoff the average travel time versus the variance of a route. Or they might tradeoff time and cost, such as the cost paid in tolls.

We wish to understand the effect of two conflicting criteria in route selection, by studying the resulting traffic assignment (equilibrium) in the network. We investigate two perspectives of this topic: (1) How does the equilibrium cost of a risk-averse population compare to that of a risk-neutral population? (i.e., how much longer do we spend in traffic due to being risk-averse) (2) How does the equilibrium cost of a heterogeneous population compare to that of a comparable homogeneous user population?

We provide characterizations to both questions above.

Based on joint work with Richard Cole, Thanasis Lianeas and Nicolas Stier-Moses.

At the end I will mention current work of my research group on algorithms and mechanism design for power systems.

**Biography: **
Evdokia Nikolova is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin, where she is a member of the Wireless Networking & Communications Group. Previously she was an Assistant Professor in Computer Science and Engineering at Texas A&M University. She graduated with a BA in Applied Mathematics with Economics from Harvard University, MS in Mathematics from Cambridge University, U.K. and Ph.D. in Computer Science from MIT.

Nikolova's research aims to improve the design and efficiency of complex systems (such as networks and electronic markets), by integrating stochastic, dynamic and economic analysis. Her recent work examines how human risk aversion transforms traditional computational models and solutions. One of her algorithms has been adapted in the MIT CarTel project for traffic-aware routing. She currently focuses on developing algorithms for risk mitigation in networks, with applications to transportation and energy. She is a recipient of an NSF CAREER award and a Google Faculty Research Award. Her research group has been recognized with a best student paper award and a best paper award runner-up. She currently serves on the editorial board of the journal Mathematics of Operations Research.

Dynamic Connectivity in Constant Parallel Rounds

**Abstract:** We study the dynamic graph connectivity problem in the massively parallel computation model. We give a data structure for maintaining a dynamic undirected graph that handles batches of updates and connectivity queries in constant rounds, as long as the queries fit on a single machine. This assumption corresponds to the gradual buildup of databases over time from sources such as log files and user interactions. Our techniques combine a distributed data structure for Euler Tour (ET) trees, a structural theorem about rapidly contracting graphs via sampling \(n^{\epsilon}\) random neighbors, as well as modifications to sketching based dynamic connectivity data structures.

Joint work with David Durfee, Janardhan Kulkarni, Richard Peng and Xiaorui Sun.

Submodular Maximization with Optimal Approximation, Adaptivity and Query Complexity

**Abstract:** As a generalization of many classic problems in combinatorial optimization, submodular optimization has found a wide range of applications in machine learning (e.g., in feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds where polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a (distributed) algorithm to be efficient, since in many practical applications of submodular optimization the number of function evaluations becomes prohibitively expensive. Motivated by such applications, we study the adaptivity and query complexity of adaptive submodular optimization.

Our main result is a distributed algorithm for maximizing a monotone submodular function with cardinality constraint \(k\) that achieves a \((1-1/e-\varepsilon)\)-approximation in expectation. Furthermore, this algorithm runs in \(O(\log(n))\) adaptive rounds and makes \(O(n)\) calls to the function evaluation oracle in expectation. All three of these guarantees are optimal, and the query complexity is substantially less than in previous works. Finally, to show the generality of our simple algorithm and techniques, we extend our results to the submodular cover problem.

Joint work with Vahab Mirrokni and Morteza Zadimoghaddam (arXiv:1807.07889).

Efficiency of First-Fit Chain Partitioning Finite Partial Orders

**Abstract:** Given a finite partially ordered set (poset) of width \(w\), Dilworth's theorem gives an existence and minimality of a chain partition of size \(w\). First-Fit is an online algorithm for creating a chain partition of a poset. Given a linear ordering of the points of the poset, \(v_1, \cdots, v_n\), First-Fit assigns the point \(v_i\) to the first available chain in the chain partition of the points \(v_1, \cdots, v_{i-1}\). It is a known fact that First-Fit has unbounded performance over the family of finite posets of width 2. We give a complete characterization of the family of finite posets in which First-Fit performs with subexponential efficiency in terms of width. We will also review what is currently known on the family of posets in which First-Fit performs with polynomial efficiency in terms of width. Joint work with Kevin Milans.

The Drift Method and Delay-Optimal Scheduling for Data Center Networks in Heavy Traffic

**Abstract:** Queueing systems are studied in various asymptotic regimes because they are hard to study in general. One popular regime of study is the heavy-traffic regime, when the system is loaded very close to its capacity. Heavy-traffic behavior of queueing systems is traditionally studied using fluid and diffusion limits. In this talk, I will present a recently developed method called the 'Drift Method', which is much simpler, and is based on studying the drift of certain test functions. In addition to exactly characterizing the heavy-traffic behavior, the drift method can be used to obtain lower and upper bounds for all loads. In this talk, I will present the drift method, and its successful application in the context of data center networks to resolve a decade-old conjecture. I will also talk about ongoing work and some open problems.

**Bio:** Siva Theja Maguluri is an Assistant Professor in the School of Industrial and Systems Engineering at Georgia Tech. Before that, he was a Research Staff Member in the Mathematical Sciences Department at IBM T. J. Watson Research Center. He obtained his Ph.D. from the University of Illinois at Urbana-Champaign in Electrical and Computer Engineering where he worked on resource allocation algorithms for cloud computing and wireless networks. Earlier, he received an MS in ECE and an MS in Applied Math from UIUC and a B.Tech in Electrical Engineering from IIT Madras. His research interests include Stochastic Processes, Optimization, Cloud Computing, Data Centers, Resource Allocation and Scheduling Algorithms, Networks, and Game Theory. The current talk is based on a paper that received the best publication in applied probability award, presented by INFORMS Applied probability society.

Learning Combinatorial Structures

**Abstract:** At the heart of most algorithms today there is an optimization engine trying to learn online and provide the best decision, for e.g. rankings of objects, at any time with the partial information observed thus far in time. Often it becomes difficult to find near optimal solutions to many problems due to their inherent combinatorial structure that leads to certain computational bottlenecks. Submodularity is a discrete analogue of convexity and is a key property often exploited in tackling combinatorial optimization problems. In the first part of the talk, we will focus on computational bottlenecks that involve submodular functions: (a) convex function minimization over submodular base polytopes (for e.g. permutahedron) and (b) movement along a line inside submodular base polytopes. We give a conceptually simple and strongly polynomial algorithm Inc-Fix for the former, which is useful in computing Bregman projections in first-order projection-based methods like online mirror descent. For the latter, we will bound the iterations of the discrete Newton method which gives a running time improvement of at least \(n^6\) over the state of the art. This is joint work with Michel Goemans and Patrick Jaillet. In the second part of the talk, we will consider the dual problem of (a), i.e. minimization of composite convex and submodular objectives. We will resolve Bach's conjecture from 2015 about the running time of a popular Kelley's cutting plane variant to minimize these composite objectives. This is joint work with Madeleine Udell and Song Zhou.

The Price of Fair PCA: One Extra Dimension

**Abstract:** We investigate whether the standard dimensionality reduction techniques inadvertently produce data representations with different fidelity for two different populations. We show on several real-world datasets, PCA has higher reconstruction error on population A than B (for example, women versus men or lower versus higher-educated individuals). This can happen even when the dataset has similar number of samples from A and B . This motivates our study of dimensionality reduction techniques which maintain similar fidelity for A as B . We give an efficient algorithm for finding a projection which is nearly-optimal with respect to this measure, and evaluate it on several datasets. This is a joint work with Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

**Abstract: **We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions.

In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tails, and (ii) unknown bounded covariance. Given \(N\) samples on \(R^d\), an \(\epsilon\)-fraction of which may be arbitrarily corrupted, our algorithms run in time \(\tilde{O}(Nd)/poly(\epsilon)\) and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times \(\tilde{\Omega}(N d^2)\).

Our algorithms rely on a natural family of SDPs parameterized by our current guess \(\nu\) for the unknown mean \(\mu^\star\). We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for \(\mu^\star\) -- independent of our current guess \(\nu\) -- or the dual SDP yields a new guess \(\nu'\) whose distance from \(\mu^\star\) is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.

This is a joint work with Ilias Diakonikolas and Rong Ge.

Accelerating the Convergence Rate of Distributed Subgradient Methods with Adaptive Quantization

**Abstract:** In this talk, I will present a popular distributed method, namely, distributed consensus-based gradient (DCG) method, for solving optimal learning problems over a network of agents. Such problems arise in many applications such as, finding optimal parameters over a large dataset distributed among a network of processors or seeking an optimal policy for coverage control problems in robotic networks. The focus is to present our recent results, where we study the performance of DCG when the agents are only allowed to exchange their quantized values due to their finite communication bandwidth. In particular, we develop a novel quantization method, which we refer to as adaptive quantization. The main idea of our approach is to quantize the nodes' estimates based on the progress of the algorithm, which helps to eliminate the quantized errors. Under the adaptive quantization, we then derive the bounds on the convergence rates of the proposed method as a function of the bandwidths and the underlying network topology, for both convex and strongly convex objective functions. Our results suggest that under the adaptive quantization, the rate of convergence of DCG with and without quantization are the same, except for a factor which captures the number of quantization bits. To the best of the authors’ knowledge, the results in this paper are considered better than any existing results for DCG under quantization.

This is based on a joint work with Siva Theja Maguluri and Justin Romberg.

**Bio:** Thinh T. Doan is a TRIAD postdoctoral fellow at Georgia Institute of Technology, joint between the School of Industrial and Systems Engineering and the School of Electrical and Computer Engineering (ECE). He was born in Vietnam, where he got his Bachelor degree in Automatic Control at Hanoi University of Science and Technology in 2008. He obtained his Master and Ph.D. degrees both in ECE from the University of Oklahoma in 2013 and the University of Illinois at Urbana-Champaign in 2018, respectively. His research interests lie at the intersection of control theory, optimization, distributed algorithms, and applied probability, with the main applications in machine learning, reinforcement learning, power networks, and multi-agent systems.

Polynomial Method and Graph Bootstrap Percolation

**Abstract:** We introduce a simple method for proving lower bounds for the size of the smallest percolating set in a certain graph bootstrap process. We apply this method to determine the sizes of the smallest percolating sets in multi- dimensional tori and multidimensional grids (in particular hypercubes). The former answers a question of Morrison and Noel, and the latter provides an alternative and simpler proof for one of their main results. This is based on a joint work with Lianna Hambardzumyan and Hamed Hatami.

Randomness vs Quantumness

**Abstract:** Often the popular press talks about the power of quantum computing coming from its ability to perform several computations simultaneously. We’ve already had a similar capability from probabilistic machines. This talk will explore the relationship between quantum and randomized computation, how they are similar and how they differ, and why quantum can work exponentially faster on some but far from all computational problems. We’ll talk about some open problems in quantum complexity that can help shed light on the similarities and differences between randomness and “quantumness”.

This talk will not assume any previous knowledge of quantum information or quantum computing.

Cancelations in random sums

**Abstract:** Consider a linear combination of independent identically distributed random variables \(X_1, . . . , X_n\) with fixed weights \(a_1, . . . a_n\). If the random variables
are continuous, the sum is almost surely non-zero. However, for discrete random variables an exact cancelation may occur with a positive probability. This
probability depends on the arithmetic nature of the sequence \(a_1, . . . a_n\). We will discuss how to measure the relevant arithmetic properties and how to evaluate the probability of the exact and approximate cancelation.

Sparse random graphs with overlapping community structure

**Abstract:**
In this talk we introduce two different random graph models that produce sparse graphs with overlapping community structure and discuss community detection in each context. The Random Overlapping Community (ROC) model produces a sparse graph by constructing many Erdos Renyi random graphs (communities) on small randomly selected subsets of vertices. By varying the size and density of these communities, ROC graphs can be tuned to exhibit a wide range normalized of closed walk count vectors, including those of hypercubes. This is joint work with Santosh Vempala. In the second half of the talk, we introduce the Community Configuration Model (CCM), a variant of the configuration model in which half-edges are assigned colors and pair according to a matching rule on the colors. The model is a generalization of models in the statistical physics literature and is a natural finite analog for classes of graphexes. We describe a hypothesis testing algorithm that determines whether a graph came from a community configuration model or a traditional configuration model. This is joint work with Christian Borgs, Jennifer Chayes, Souvik Dhara, and Subhabrata Sen.

Fall 2017, Spring 2018

Markov Chains and Emergent Behavior

**Abstract:** Studying random samples drawn from large,
complex sets is one way to begin to learn about typical properties and
behaviors. However, it is important that the samples examined are
random enough: studying samples that are unexpectedly correlated or
drawn from the wrong distribution can produce misleading conclusions.
Sampling processes using Markov chains have been utilized in physics,
chemistry, and computer science, among other fields, but they are often
applied without careful analysis of their reliability. Making sure
widely-used sampling processes produce reliably representative samples
is a main focus of my research, and in this talk I'll touch on two
specific applications from statistical physics and combinatorics.

I'll also discuss work applying these same Markov chain processes used
for sampling in a novel way to address research questions in
programmable matter and swarm robotics, where a main goal is to
understand how simple computational elements can accomplish complicated
system-level goals. In a constrained setting, we've answered this
question by showing that groups of abstract particles executing our
simple processes (which are derived from Markov chains) can provably
accomplish remarkable global objectives. In the long run, one goal is
to understand the minimum computational abilities elements need in
order to exhibit complex global behavior, with an eye towards
developing systems where individual components are as simple as
possible.

This talk includes joint work with Marta Andrés Arroyo, Joshua J.
Daymude, Daniel I. Goldman, David A. Levin, Shengkai Li, Dana Randall,
Andréa Richa, William Savoie, Alexandre Stauffer, and Ross Warkentin.

High-level modelling and solving of combinatorial stochastic programs

**Abstract:** Stochastic programming is concerned with
decision making under uncertainty, seeking an optimal policy with
respect to a set of possible future scenarios. While the value of
Stochastic Programming is obvious to many practitioners, in reality
uncertainty in decision making is oftentimes neglected.

For deterministic optimisation problems, a coherent chain of
modelling and solving exists. Employing standard modelling languages
and solvers for stochastic programs is however difficult. First, they
have (with exceptions) no native support to formulate Stochastic
Programs. Secondly solving stochastic programs with standard solvers
(e.g. MIP solvers) is often computationally intractable.

David will be talking about his research that aims to make
Stochastic Programming more accessible. First, he will be talking about
modelling deterministic and stochastic programs in the Constraint
Programming language MiniZinc
- a modelling paradigm that retains the structure of a problem much
more strongly than MIP formulations. Secondly, he will be talking about
decomposition algorithms he has been working on to solve combinatorial
Stochastic Programs.

**Bio:**

David is a PhD student at Monash University in Melbourne. Before
joining Monash, he completed a Bachelors degree in Systems Engineering
in Switzerland, studied Robotics at University of Bristol and worked in
various engineering roles. As part of his Phd, David is working on a
solver system for stochastic combinatorial optimisation problems that
are modelled in in the Constraint Programming language MiniZinc. In
future, he would like to combine his passion for automation with the
research he is doing in operations research.

Local Differential Privacy for Physical Sensor Data and Sparse Recovery

**Abstract:** Physical sensors (thermal, light, motion,
etc.) are becoming ubiquitous and offer important benefits to society.
However, allowing sensors into our private spaces has resulted in
considerable privacy concerns. Differential privacy has been developed
to help alleviate these privacy concerns. In this talk, we’ll develop
and define a framework for releasing physical data that preserves both
utility and provides privacy. Our notion of closeness of physical data
will be defined via the Earth Mover Distance and we’ll discuss the
implications of this choice. Physical data, such as temperature
distributions, are often only accessible to us via a linear
transformation of the data.

We’ll analyse the implications of our privacy definition for linear
inverse problems, focusing on those that are traditionally considered
to be "ill-conditioned”. We’ll then instantiate our framework with the
heat kernel on graphs and discuss how the privacy parameter relates to
the connectivity of the graph. Our work indicates that it is possible
to produce locally private sensor measurements that both keep the exact
locations of the heat sources private and permit recovery of the
``general geographic vicinity'' of the sources. Joint work with Anna C.
Gilbert.

The Distance Oracle Hierarchy

**Abstract:** A lot of well-studied problems in CS
Theory are about making “sketches” of graphs that occupy much less
space than the graph itself, but where the shortest path distances of
the graph can still be approximately recovered from the sketch. For
example, in the literature on Spanners, we seek a sparse subgraph whose
distance metric approximates that of the original graph. In Emulator
literature, we relax the requirement that the approximating graph is a
subgraph. Most generally, in Distance Oracles, the sketch can be an
arbitrary data structure, so long as it can approximately answer
queries about the pairwise distance between nodes in the original graph.

Research on these objects typically focuses on optimizing the
worst-case tradeoff between the quality of the approximation and the
amount of space that the sketch occupies. In this talk, we will survey
a recent leap in understanding about this tradeoff, overturning the
conventional wisdom on the problem. Specifically, the tradeoff is not
smooth, but rather it follows a new discrete hierarchy in which the
quality of the approximation that can be obtained jumps considerably at
certain predictable size thresholds. The proof is graph-theoretic and
relies on building large families of graphs with large discrepancies in
their metrics.

Fully Dynamic Low-Diameter Decomposition with Applications

**Abstract:** A low-diameter decomposition (LDD) of an
undirected graph G is a partitioning of G into components of bounded
diameter, such that only a small fraction of original edges are between
the components. This decomposition has played instrumental role in the
design of low-stretch spanning tree, spanners, distributed algorithms
etc.

A natural question is whether such a decomposition can be efficiently
maintained/updated as G undergoes insertions/deletions of edges. We
make the first step towards answering this question by designing a
fully-dynamic graph algorithm that maintains an LDD in sub-linear
update time.

It is known that any undirected graph G admits a spanning tree T with
nearly logarithmic average stretch, which can be computed in nearly
linear-time. This tree decomposition underlies many recent progress in
static algorithms for combinatorial and scientific flows. Using our
dynamic LDD algorithm, we present the first non-trivial algorithm that
dynamically maintains a low-stretch spanning tree in \(\tilde{O}(t^2)\)
amortized update time, while achieving \((t + \sqrt{n^{1+o(1)}/t})\)
stretch, for every \(1 \leq t \leq n\).

Joint work with Sebastian Krinninger.

Approximation algorithms for optimal design problems

**Abstract:** The talk is based on https://arxiv.org/abs/1802.08318.

Selling Partially-Ordered Items: Exploring the Space between Single- and Multi-Dimensional Mechanism Design

**Abstract:** Consider the problem of selling items to
a unit-demand buyer. Most work on maximizing seller revenue considers
either a setting that is single dimensional, such as where the items
are identical, or multi-dimensional, where the items are heterogeneous.
With respect to revenue-optimal mechanisms, these settings sit at
extreme ends of a spectrum: from simple and fully characterized
(single-dimensional) to complex and nebulous (multi-dimensional).

In this paper, we identify a setting that sits in between these
extremes. We consider a seller who has three services {A,B,C} for sale
to a single buyer with a value v and an interest G from {A,B,C}, and
there is a known partial ordering over the services. For example,
suppose the seller is selling {internet}, {internet, phone}, and
{internet, cable tv}. A buyer with interest {internet} would be
satisfied by receiving phone or cable tv in addition, but a customer
whose interest is {internet, phone} cannot be satisfied by any other
option. Thus this corresponds to a partial-ordering where {internet}
> {internet, phone} and {internet} > {internet, cable tv}, but
{internet, phone} and {internet, cable tv} are not comparable.

We show formally that partially-ordered items lie in a space of their
own, in between identical and heterogeneous items: there exist
distributions over (value, interest) pairs for three partially-ordered
items such that the menu complexity of the optimal mechanism is
unbounded, yet for all distributions there exists an optimal mechanism
of finite menu complexity. So this setting is vastly more complex than
identical items (where the menu complexity is one), or even
“totally-ordered” items as in the FedEx Problem [FGKK16] (where the
menu complexity is at most seven, for three items), yet drastically
more structured than heterogeneous items (where the menu complexity can
be uncountable [DDT15]). We achieve this result by proving a
characterization of the class of best duals and by giving a primal
recovery algorithm which obtains the optimal mechanism. In addition, we
(1) extend our lower-bound to the Multi-Unit Pricing setting, (2) give
a tighter and deterministic characterization of the optimal mechanism
when the buyer’s distribution satisfies the declining marginal revenue
condition, and (3) prove a master theorem that allows us to reason
about duals instead of distributions.

Joint work with Nikhil Devanur, Raghuvansh Saxena, Ariel Schvartzman, and Matt Weinberg.

Kira will be around all week. Please email her to schedule one-on-one meetings.

Algorithm and Hardness for Linear Elasticity Problems

**Abstract:** In this talk, we study solvers for
geometrically embedded graph structured block linear systems. The
general form of such systems, PSD-Graph-Structured Block Matrices
(PGSBM), arise in scientific computing, linear elasticity, the inner
loop of interior point algorithms for linear programming, and can be
viewed as extensions of graph Laplacians into multiple labels at each
graph vertex. Linear elasticity problems, more commonly referred to as
trusses, describe forces on a geometrically embedded object.

We present an asymptotically faster algorithm for solving linear
systems in well-shaped 3-D trusses. Our algorithm utilizes the
geometric structures to combine nested dissection and support theory,
which are both well studied techniques for solving linear systems. We
decompose a well-shaped 3-D truss into balanced regions with small
boundaries, run Gaussian elimination to eliminate the interior
vertices, and then solve the remaining linear system by preconditioning
with the boundaries.

On the other hand, we prove that the geometric structures are
"necessary" for designing fast solvers. Specifically, solving linear
systems in general trusses is as hard as solving general linear systems
over the real. Furthermore, we give some other PGSBM linear systems for
which fast solvers imply fast solvers for general linear systems.

Based on the joint works with Robert Schwieterman and Rasmus Kyng.

Packing nearly optimal Ramsey R(3,t) Graphs

**Abstract:**
In 1995 Kim famously proved the Ramsey bound \(R(3,t) \ge c t^2/\log
t\) by constructing an \(n\)-vertex graph that is triangle-free and has
independence number at most \(C \sqrt{n \log n}\). We extend this
celebrated result, which is best possible up to the value of the
constants, by approximately decomposing the complete graph \(K_n\) into
a packing of such nearly optimal Ramsey \(R(3,t)\) graphs.

More precisely, for any \(\epsilon>0\) we find an edge-disjoint
collection \((G_i)_i\) of \(n\)-vertex graphs \(G_i \subseteq K_n\)
such that (a) each \(G_i\) is triangle-free and has independence number
at most \(C_\epsilon \sqrt{n \log n}\), and (b) the union of all the
\(G_i\) contains at least \((1-\epsilon)\binom{n}{2}\) edges. Our
algorithmic proof proceeds by sequentially choosing the graphs \(G_i\)
via a semi-random (i.e., Rödl nibble type) variation of the
triangle-free process.

As an application, we prove a conjecture in Ramsey theory by Fox,
Grinshpun, Liebenau, Person, and Szabó (concerning a Ramsey-type
parameter introduced by Burr, Erdös, Lovász in 1976). Namely, denoting
by \(s_r(H)\) the smallest minimum degree of \(r\)-Ramsey minimal
graphs for \(H\), we close the existing logarithmic gap for \(H=K_3\)
and establish that \(s_r(K_3) = \Theta(r^2 \log r)\).

Based on joint work with Lutz Warnke.

A Stochastic Approach to Shortcut Bridging in Programmable Matter

**Abstract: **In a self-organizing particle system, an abstraction of programmable
matter, simple computational elements called particles with limited
memory and communication self-organize to solve system-wide problems of
movement, coordination, and configuration.
In this paper, we consider stochastic, distributed, local, asynchronous
algorithms for 'shortcut bridging', in which particles self-assemble
bridges over gaps that simultaneously balance minimizing the length and
cost of the bridge. Army ants of the genus Eticon
have been observed exhibiting a similar behavior in their foraging
trails, dynamically adjusting their bridges to satisfy an efficiency
tradeoff using local interactions. Using techniques from Markov chain
analysis, we rigorously analyze our algorithm, show
it achieves a near-optimal balance between the competing factors of path
length and bridge cost, and prove that it exhibits a dependence on the
angle of the gap being 'shortcut' similar to that of the ant bridges. We
also present simulation results that qualitatively
compare our algorithm with the army ant bridging behavior. Our work
presents a plausible explanation of how convergence to globally optimal
configurations can be achieved via local interactions by simple
organisms (e.g., ants) with some limited computational
power and access to random bits. The proposed algorithm demonstrates the
robustness of the stochastic approach to algorithms for programmable
matter, as it is a surprisingly simple extension of a stochastic
algorithm for compression.

This is joint work between myself/my professor Andrea Richa at ASU and Sarah Cannon and Prof. Dana Randall here at GaTech.

Determinant-Preserving Sparsification of SDDM Matrices with Applications to Counting and Sampling Spanning Trees

**Abstract:**
We show variants of spectral sparsification routines can preserve the
total spanning tree counts of graphs, which by Kirchhoff's matrix-tree
theorem, is equivalent to determinant of a graph Laplacian minor, or
equivalently, of any SDDM matrix. Our analyses utilizes this
combinatorial connection to bridge between statistical leverage scores
/ effective resistances and the analysis of random graphs by [Janson,
Combinatorics, Probability and Computing `94]. This leads to a routine
that in quadratic time, sparsifies a graph down to about \(n^{1.5}\)
edges in ways that preserve both the determinant and the distribution
of spanning trees (provided the sparsified graph is viewed as a random
object). Extending this algorithm to work with Schur complements and
approximate Choleksy factorizations leads to algorithms for counting
and sampling spanning trees which are nearly optimal for dense graphs.

We give an algorithm that computes a \((1\pm \delta)\) approximation to
the determinant of any SDDM matrix with constant probability in about
\(n^2\delta^{−2}\) time. This is the first routine for graphs that
outperforms general-purpose routines for computing determinants of
arbitrary matrices. We also give an algorithm that generates in about
\(n^2\delta^{−2}\) time a spanning tree of a weighted undirected graph
from a distribution with total variation distance of \(\delta\) from
the w-uniform distribution.

This is joint work with John Peebles, Richard Peng and Anup B. Rao.

The list chromatic number of graphs with small clique number (vedio)

**Abstract:** We prove that every triangle-free graph with maximum
degree \(\Delta\) has list chromatic number at most \((1+o(1))\frac{\Delta}{\ln
\Delta}\). This matches the best-known bound for graphs of girth at least
5. We also provide a new proof that for any \(r\geq4\) every
\(K_r\)-free graph has list-chromatic number at most
\(200r\frac{\Delta\ln\ln\Delta}{\ln\Delta}\).

86 Years of Ramsey R(3,k). (and counting!) (vedio)

**Abstract:**
The search for the asymptotics of the Ramsey function R(3,k) has a long
and fascinating history. It begins in the hill country surrounding
Budapest and winding over the decades through Europe, America, Korea
and Rio de Janiero. We explore it through a CS lens, giving algorithms
that provide the various upper and lower bounds. The arguments are
various more or less sophisticated uses of Erdos Magic and, indeed,
many of the most important advances in the Probabilistic Method have
come from these investigations.