# ACO Student Seminar Fall 2019

## Description

The Georgia Tech ACO Student Seminar is run by students in the ACO program.

The propose of this seminar is to keep students updated with current researches, and to practice presenting skills for those who will give a talk in a conference. Any topic related (but not restricted) to algorithms, combinatorics, and optimization is very welcome. You can share research results, present a work in progress or just share something of general interest. There will also be occasional talks by ACO faculty.

The seminar currently meets on Fridays from 1pm to 2pm at Skiles 005, unless noted below.

Food and drinks will be provided. Thanks to ACO, ARC, and ISYE for their sponsorship.

## Mailing list

Subscribe to the ACO Student Seminar mailing list.

## Contact

If you are interested in giving a talk at the seminar, you can contact any of the organizers: He Guo, He Jia, or Guanyi Wang.

## Scheduled Talks

### Fall 2019

August 30: (Skiles 202) Zongchen Chen, CS, Georgia Tech
Learning and Testing for Graphical Models

Abstract: In this talk we introduce some machine learning problems in the setting of undirected graphical models, also known as spin systems. We take proper colorings as a representative example of a hard-constraint graphical model. The classic problem of sampling a proper coloring uniformly at random of a given graph has been well-studied. Here we consider two inverse problems: Given random colorings of an unknown graph G, can we recover the underlying graph G exactly? If we are also given a candidate graph H, can we tell if G=H? The former problem is known as structure learning in the machine learning field and the latter is called identity testing. We show the complexity of these problems in different range of parameters and compare these results with the corresponding decision and sampling problems. Finally, we give some results of the analogous problems for the Ising model, a typical soft-constraint model. Based on joint work with Ivona Bezakova, Antonio Blanca, Daniel Stefankovic and Eric Vigoda.

September 6: Samantha Petti, CS, Georgia Tech
Differential Privacy: The Census Algorithm

Abstract: For the first time in 2020, the US Census Bureau will apply a differentially private algorithm before publicly releasing decennial census data. Recently, the Bureau publicly released their code and end-to-end tests on the 1940 census data at various privacylevels. We will outline the DP algorithm (which is still being developed) and discuss the accuracy of these end-to-end tests. In particular, we focus on the bias and variance of the reported population counts. Finally, we discuss the choices the Bureau has yet to make that will affect the balance between privacy and accuracy. This talk is based on joint work with Abraham Flaxman.

September 13: (Skiles 202) Dr. Richard Peng, CS, Georgia Tech
Graph Algorithms and Offline Data Structures

Abstract: Graphs, which in their simplest forms are vertices connected by edges, are widely used in high performance computing, machine learning and network science. This talk will use recent progresses on two well-studied algorithmic problems in static and dynamic graph, max-flows and dynamic matchings, to discuss a methodology for designing faster algorithm for large graphs. This approach is motivated by a fundamental phenomenon in data structures: the advantages of offline data structures over online ones.

I will start by describing how work on max-flows led to a focus on finding short paths in residual graphs, and how investigating more global notions of progress in residual graphs led to a more sophisticated and general understanding of iterative methods and preconditioning. I will then discuss a similar phenomenon in dynamic graphs, where maintaining a large matching seems to require the online detection of short augmenting paths, but can once again be circumvented through the offline construction of smaller equivalent graphs.

September 20: He Guo, Math, Georgia Tech
Bounds on Ramsey Games via Alterations

Abstract: In this talk we introduce a refined alteration approach for constructing $$H$$-free graphs: we show that removing all edges in $$H$$-copies of the binomial random graph does not significantly change the independence number (for suitable edge-probabilities); previous alteration approaches of Erdös and Krivelevich remove only a subset of these edges. We present two applications to online graph Ramsey games of recent interest, deriving new bounds for Ramsey, Paper, Scissors games and online Ramsey numbers (each time extending recent results of Fox–He–Wigderson and Conlon–Fox–Grinshpun–He).
Based on joint work with Lutz Warnke.

September 27: Mehrdad Ghadiri, CS, Georgia Tech
Beyond Submodular Maximization

Abstract: In the past decade, the catalog of algorithms available to combinatorial optimizers has been substantially extended to settings which allow submodular objective functions. One significant recent result was a tight (1-1/e)-approximation for maximizing a non-negative monotone submodular function subject to a matroid constraint. These algorithmic developments were happening concurrently with research that found a wealth of new applications for submodular optimization in machine learning, recommender systems, and algorithmic game theory.

The related supermodular maximization models also offer an abundance of applications, but they appeared to be highly intractable even under simple cardinality constraints and even when the function has a nice structure. For example, the densest subgraph problem - suspected to be highly intractable - can be expressed as a monotonic supermodular function which has a particularly nice form. Namely, the objective can be expressed by a quadratic form $$x^T A x$$ where $$A$$ is a non-negative, symmetric, 0-diagonal matrix. On the other hand, when the entries $$A(u,v)$$ form a metric, it has been shown that the associated maximization problems have constant factor approximations. Inspired by this, we introduce a parameterized class of non-negative functions called meta-submodular functions that can be approximately maximized within a constant factor. This class includes metric diversity, monotone submodular and other objectives appearing in the machine learning and optimization literature. A general meta-submodular function is neither submodular nor supermodular and so its multi-linear extension does not have the nice convexity/concavity properties which hold for submodular functions. They do, however, have an intrinsic one-sided smoothness property which is essential for our algorithms. This smoothness property might be of independent interest.

October 4: Dr. Thatchaphol Saranurak, CS, Toyota Technological Institute at Chicago
Expander decomposition: applications to dynamic and distributed algorithms

Abstract: Expander decomposition has been a central tool in designing graph algorithms in many fields (including fast centralized algorithms, approximation algorithms and property testing) for decades. Recently, we found that it also gives many impressive applications in dynamic graph algorithms and distributed graph algorithms. In this talk, I will survey these recent results based on expander decomposition, explain the key components for using this technique, and give some toy examples on how to apply these components.

October 18: ARC colloquium

October 25: Wenlong Mou, EECS, UC Berkeley
High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

Abstract: We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of $$d$$-dimensional distributions arising from generalized linear models, we prove that the resulting third-order algorithm produces samples from a distribution that is at most $$\varepsilon$$ in Wasserstein distance from the target distribution in $$O(d^{1/3}/ \varepsilon^{2/3})$$ steps. This result requires only Lipschitz conditions on the gradient. For general strongly convex potentials with α-th order smoothness, we prove that the mixing time scales as $$O (d^{1/3} / \varepsilon^{2/3} + d^{1/2} / \varepsilon^{1 / (\alpha - 1)} )$$. Our high-order Langevin diffusion reduces the problem of log-concave sampling to numerical integration along a fixed deterministic path, which makes it possible for further improvements in high-dimensional MCMC problems. Joint work with Yi-An Ma, Martin J, Wainwright, Peter L. Bartlett and Michael I. Jordan.

November 1: Reserved by Dr. Debankur Mukherjee, ISYE, Georgia Tech
November 8: Reserved by Dr. Yuhao Yi, CS, Rensselaer Polytechnic Institute
November 15: Reserved by Digvijay Boob, ISyE, Georgia Tech
November 22: Reserved by Kevin Lai, CS, Georgia Tech
November 29: Thanksgiving Break

# ACO Student Seminar Fall 2019, Spring 2019, Summer 2019

### Spring 2019, Summer 2019

January 11: Dr. Richard Peng, CS, Georgia Tech
A numerical analysis approach to convex optimization

Abstract: In current convex optimization literature, there are significant gaps between algorithms that produce high accuracy $$(1+1/poly(n))$$-approximate solutions vs. algorithms that produce $$O(1)$$-approximate solutions for symmetrized special cases. This gap is reflected in the differences between interior point methods vs. (accelerated) gradient descent for regression problems, and between exact vs. approximate undirected max-flow.

In this talk, I will discuss generalizations of a fundamental building block in numerical analysis, preconditioned iterative methods, to convex functions that include p-norms. This leads to algorithms that converge to high accuracy solutions by crudely solving a sequence of symmetric residual problems. I will then briefly describe several recent and ongoing projects, including p-norm regression using $$m^{1/3}$$ linear system solves, $$p$$-norm flow in undirected unweighted graphs in almost-linear time, and further improvements to the dependence on $$p$$ in the runtime.

January 25: Dr. Mohit Singh, ISyE, Georgia Tech
Sticky Brownian Rounding and its Applications to Optimization Problems

Abstract: We present a new general and simple method for rounding semi-definite programs, based on Brownian motion. Our approach is inspired by recent results in algorithmic discrepancy theory. We develop and present tools for analyzing our new rounding algorithms, utilizing mathematical machinery from the theory of Brownian motion, complex analysis, and partial differential equations. We will present our method to several classical problems, including Max-Cut, Max-di-cut and Max-2-SAT, and derive new algorithms that are competitive with the best known results. In particular, we show that the basic algorithm achieves 0.861-approximation for Max-cut and a natural variant of the algorithm achieve 0.878-approximation, matching the famous Goemans-Williamson algorithm upto first three decimal digits.

This is joint work with Abbas-Zadeh, Nikhil Bansal, Guru Guruganesh, Sasho Nikolov and Roy Schwartz.

February 1: (ACO alumni colloquium at Groseclose 402, 1-2pm, followed by a discussion session with the speaker at 2-2:45pm) Dr. Nisheeth Vishnoi, CS, Yale University
Opportunities at the Intersection of AI and Society

Abstract: Powerful AI systems, which are driven by machine learning, are increasingly controlling various aspects of modern society: from social interactions (e.g., Facebook, Twitter, Google, YouTube), economics (e.g., Uber, Airbnb, Banking), learning (e.g., Wikipedia, MOOCs), governance (Judgements, Policing, Voting), to autonomous vehicles and weapons. These systems have a tremendous potential to change our lives for the better, but, via the ability to mimic and nudge human behavior, they also have the potential to be discriminatory, reinforce societal prejudices, and polarize opinions. Moreover, recent studies have demonstrated that these systems can be quite brittle and generally lack the required robustness to be deployed in various civil/military situations. The reason being that considerations such as fairness, robustness, stability, explainability, accountability etc. have largely been an afterthought in the development of AI systems. In this talk, I will discuss the opportunities that lie ahead in a principled and thoughtful development of AI systems.

Bio

Nisheeth Vishnoi is a Professor of Computer Science at Yale University. He received a B.Tech in Computer Science and Engineering from IIT Bombay in 1999 and a Ph.D. in Algorithms, Combinatorics and Optimization from Georgia Tech in 2004. His research spans several areas of theoretical computer science: from approximability of NP-hard problems, to combinatorial, convex and non-convex optimization, to tackling algorithmic questions involving dynamical systems, stochastic processes and polynomials. He is also broadly interested in understanding and addressing some of the key questions that arise in nature and society from the viewpoint of theoretical computer science. Here, his current focus is on natural algorithms, emergence of intelligence, and questions at the interface of AI, ethics, and society. He was the recipient of the Best Paper Award at FOCS in 2005, the IBM Research Pat Goldberg Memorial Award in 2006, the Indian National Science Academy Young Scientist Award in 2011, and the IIT Bombay Young Alumni Achievers Award in 2016.

February 8: Dr. Xilei Zhao, ISyE, Georgia Tech
Travel Behavior Modeling Using Machine Learning

Abstract: The popularity of machine learning is increasingly growing in transportation, with applications ranging from traffic engineering to travel demand forecasting and pavement material modeling, to name just a few. Researchers often find that machine learning achieves higher predictive accuracy compared to traditional methods. However, many machine-learning methods are often viewed as “black-box” models, lacking interpretability for decision making. As a result, increased attention is being devoted to the interpretability of machine-learning results.

In this talk, I introduce the application of machine learning to study travel behavior, covering both mode prediction and behavioral interpretation. I first discuss the key differences between machine learning and logit models in modeling travel mode choice, focusing on model development, evaluation, and interpretation. Next, I apply the existing machine-learning interpretation tools and also propose two new model-agnostic interpretation tools to examine behavioral heterogeneity. Lastly, I show the potential of using machine learning as an exploratory tool to tune the utility functions of logit models.

I illustrate these ideas by examining stated-preference travel survey data for a new mobility-on-demand transit system that integrates fixed-route buses and on-demand shuttles. The results show that the best-performing machine-learning classifier results in higher predictive accuracy than logit models as well as comparable behavioral outputs. In addition, results obtained from model-agnostic interpretation tools show that certain machine-learning models (e.g. boosting trees) can readily account for individual heterogeneity and generate valuable behavioral insights on different population segments. Moreover, I show that interpretable machine learning can be applied to tune the utility functions of logit models (e.g. specifying nonlinearities) and to enhance their model performance. In turn, these findings can be used to inform the design of new mobility services and transportation policies.

Bio: Xilei Zhao is a Postdoctoral Fellow in the School of Industrial and Systems Engineering at the Georgia Institute of Technology. Dr. Zhao received her B.E. in Civil Engineering from Southeast University, China, in 2013, her MSEs in Civil Engineering and Applied Mathematics and Statistics from Johns Hopkins University (JHU) in 2016 and 2017, respectively, and her Ph.D. in Civil Engineering from JHU in 2017. She specializes in applying data science (data analytics, machine learning, optimization, and simulation), complex systems modeling, and risk assessment to tackle challenging problems in transportation and resilience.

March 1: Dr. Roy Schwartz, CS, Technion
Local Guarantees in Graph Cuts and Clustering

Abstract: Correlation Clustering is an elegant model that captures fundamental graph cut problems such as Minimum s-t Cut, Multiway Cut, and Multicut, extensively studied in combinatorial optimization.

Here, we are given a graph with edges labeled + or - and the goal is to produce a clustering that agrees with the labels as much as possible: + edges within clusters and - edges across clusters.

The classical approach towards Correlation Clustering (and other graph cut problems) is to optimize a global objective, e.g., minimizing the total number of disagreements or maximizing the total number of agreements.

We depart from this and study local objectives: minimizing the maximum number of disagreements for edges incident on a single node, and the analogous max min agreements objective.

This naturally gives rise to a family of basic min-max graph cut problems.

A prototypical representative is Min-Max s-t Cut: find an s-t cut minimizing the largest number of cut edges incident on any node.

In this talk we will give a short introduction of Correlation Clustering and discuss the following results:

(1) an $$O(\sqrt{n})$$-approximation for the problem of minimizing the maximum total weight of disagreement edges incident on any node (thus providing the first known approximation for the above family of min-max graph cut problems)

(2) a remarkably simple 7-approximation for minimizing local disagreements in complete graphs (improving upon the previous best known approximation of 48)

(3) a $$(1/(2+\epsilon))$$-approximation for maximizing the minimum total weight of agreement edges incident on any node, hence improving upon the $$(1/(4+\epsilon))$$-approximation that follows from the study of approximate pure Nash equilibria in cut and party affiliation games.

Join work with Moses Charikar and Neha Gupta.

March 22: (Closed, Spring Break)

April 5: Dr. Vivek Madan, ISyE, Georgia Tech
Combinatorial algorithm for Optimal Design

Abstract: In an optimal design problem, we are given a set of linear experiments $$v_1,...,v_n \in R^d$$ and $$k \ge d$$, and our goal is to select a set or a multiset $$S$$ subseteq $$[n]$$ of size $$k$$ such that $$\Phi((\sum_{i \in [n]} v_i v_i^T )^{-1})$$ is minimized. When $$\Phi(M) = det(M)^{1/d}$$, the problem is known as the D-optimal design problem, and when $$\Phi(M) = tr(M)$$, it is known as the A-optimal design problem. One of the most common heuristics used in practice to solve these problems is the local search heuristic, also known as the Fedorov's exchange method. This is due to its simplicity and its empirical performance. However, despite its wide usage no theoretical bound has been proven for this algorithm. In this paper, we bridge this gap and prove approximation guarantees for the local search algorithms for D-optimal design and A-optimal design problems. We show that the local search algorithms are asymptotically optimal when $$\frac{k}{d}$$ is large. In addition to this, we also prove similar approximation guarantees for the greedy algorithms for D-optimal design and A-optimal design problems when $$k/d$$ is large.

May 9 (ISyE Main 228, 11 am, Thursday): Dr. Evdokia Nikolova, ECE, UT Austin
Effects of risk-aversion and diversity of user preferences on network routing

Abstract: In network routing users often tradeoff different objectives in selecting their best route. An example is transportation networks, where due to uncertainty of travel times, drivers may tradeoff the average travel time versus the variance of a route. Or they might tradeoff time and cost, such as the cost paid in tolls.

We wish to understand the effect of two conflicting criteria in route selection, by studying the resulting traffic assignment (equilibrium) in the network. We investigate two perspectives of this topic: (1) How does the equilibrium cost of a risk-averse population compare to that of a risk-neutral population? (i.e., how much longer do we spend in traffic due to being risk-averse) (2) How does the equilibrium cost of a heterogeneous population compare to that of a comparable homogeneous user population?

We provide characterizations to both questions above.

Based on joint work with Richard Cole, Thanasis Lianeas and Nicolas Stier-Moses.

At the end I will mention current work of my research group on algorithms and mechanism design for power systems.

Biography: Evdokia Nikolova is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin, where she is a member of the Wireless Networking & Communications Group. Previously she was an Assistant Professor in Computer Science and Engineering at Texas A&M University. She graduated with a BA in Applied Mathematics with Economics from Harvard University, MS in Mathematics from Cambridge University, U.K. and Ph.D. in Computer Science from MIT.

Nikolova's research aims to improve the design and efficiency of complex systems (such as networks and electronic markets), by integrating stochastic, dynamic and economic analysis. Her recent work examines how human risk aversion transforms traditional computational models and solutions. One of her algorithms has been adapted in the MIT CarTel project for traffic-aware routing. She currently focuses on developing algorithms for risk mitigation in networks, with applications to transportation and energy. She is a recipient of an NSF CAREER award and a Google Faculty Research Award. Her research group has been recognized with a best student paper award and a best paper award runner-up. She currently serves on the editorial board of the journal Mathematics of Operations Research.

### Fall 2018

September 14: Saurabh Sawlani, CS, Georgia Tech
Dynamic Connectivity in Constant Parallel Rounds

Abstract: We study the dynamic graph connectivity problem in the massively parallel computation model. We give a data structure for maintaining a dynamic undirected graph that handles batches of updates and connectivity queries in constant rounds, as long as the queries fit on a single machine. This assumption corresponds to the gradual buildup of databases over time from sources such as log files and user interactions. Our techniques combine a distributed data structure for Euler Tour (ET) trees, a structural theorem about rapidly contracting graphs via sampling $$n^{\epsilon}$$ random neighbors, as well as modifications to sketching based dynamic connectivity data structures.

Joint work with David Durfee, Janardhan Kulkarni, Richard Peng and Xiaorui Sun.

September 21: Matthew Fahrbach, CS, Georgia Tech
Submodular Maximization with Optimal Approximation, Adaptivity and Query Complexity

Abstract: As a generalization of many classic problems in combinatorial optimization, submodular optimization has found a wide range of applications in machine learning (e.g., in feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds where polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a (distributed) algorithm to be efficient, since in many practical applications of submodular optimization the number of function evaluations becomes prohibitively expensive. Motivated by such applications, we study the adaptivity and query complexity of adaptive submodular optimization.

Our main result is a distributed algorithm for maximizing a monotone submodular function with cardinality constraint $$k$$ that achieves a $$(1-1/e-\varepsilon)$$-approximation in expectation. Furthermore, this algorithm runs in $$O(\log(n))$$ adaptive rounds and makes $$O(n)$$ calls to the function evaluation oracle in expectation. All three of these guarantees are optimal, and the query complexity is substantially less than in previous works. Finally, to show the generality of our simple algorithm and techniques, we extend our results to the submodular cover problem.

Joint work with Vahab Mirrokni and Morteza Zadimoghaddam (arXiv:1807.07889).

September 28: (1:15pm) Michael Wigal, Math, Georgia Tech
Efficiency of First-Fit Chain Partitioning Finite Partial Orders

Abstract: Given a finite partially ordered set (poset) of width $$w$$, Dilworth's theorem gives an existence and minimality of a chain partition of size $$w$$. First-Fit is an online algorithm for creating a chain partition of a poset. Given a linear ordering of the points of the poset, $$v_1, \cdots, v_n$$, First-Fit assigns the point $$v_i$$ to the first available chain in the chain partition of the points $$v_1, \cdots, v_{i-1}$$. It is a known fact that First-Fit has unbounded performance over the family of finite posets of width 2. We give a complete characterization of the family of finite posets in which First-Fit performs with subexponential efficiency in terms of width. We will also review what is currently known on the family of posets in which First-Fit performs with polynomial efficiency in terms of width. Joint work with Kevin Milans.

October 5: Dr. Siva Theja Maguluri, ISyE, Georgia Tech
The Drift Method and Delay-Optimal Scheduling for Data Center Networks in Heavy Traffic

Abstract: Queueing systems are studied in various asymptotic regimes because they are hard to study in general. One popular regime of study is the heavy-traffic regime, when the system is loaded very close to its capacity. Heavy-traffic behavior of queueing systems is traditionally studied using fluid and diffusion limits. In this talk, I will present a recently developed method called the 'Drift Method', which is much simpler, and is based on studying the drift of certain test functions. In addition to exactly characterizing the heavy-traffic behavior, the drift method can be used to obtain lower and upper bounds for all loads. In this talk, I will present the drift method, and its successful application in the context of data center networks to resolve a decade-old conjecture. I will also talk about ongoing work and some open problems.

Bio: Siva Theja Maguluri is an Assistant Professor in the School of Industrial and Systems Engineering at Georgia Tech. Before that, he was a Research Staff Member in the Mathematical Sciences Department at IBM T. J. Watson Research Center. He obtained his Ph.D. from the University of Illinois at Urbana-Champaign in Electrical and Computer Engineering where he worked on resource allocation algorithms for cloud computing and wireless networks. Earlier, he received an MS in ECE and an MS in Applied Math from UIUC and a B.Tech in Electrical Engineering from IIT Madras. His research interests include Stochastic Processes, Optimization, Cloud Computing, Data Centers, Resource Allocation and Scheduling Algorithms, Networks, and Game Theory. The current talk is based on a paper that received the best publication in applied probability award, presented by INFORMS Applied probability society.

October 12: Dr. Swati Gupta, ISyE, Georgia Tech
Learning Combinatorial Structures

Abstract: At the heart of most algorithms today there is an optimization engine trying to learn online and provide the best decision, for e.g. rankings of objects, at any time with the partial information observed thus far in time. Often it becomes difficult to find near optimal solutions to many problems due to their inherent combinatorial structure that leads to certain computational bottlenecks. Submodularity is a discrete analogue of convexity and is a key property often exploited in tackling combinatorial optimization problems. In the first part of the talk, we will focus on computational bottlenecks that involve submodular functions: (a) convex function minimization over submodular base polytopes (for e.g. permutahedron) and (b) movement along a line inside submodular base polytopes. We give a conceptually simple and strongly polynomial algorithm Inc-Fix for the former, which is useful in computing Bregman projections in first-order projection-based methods like online mirror descent. For the latter, we will bound the iterations of the discrete Newton method which gives a running time improvement of at least $$n^6$$ over the state of the art. This is joint work with Michel Goemans and Patrick Jaillet. In the second part of the talk, we will consider the dual problem of (a), i.e. minimization of composite convex and submodular objectives. We will resolve Bach's conjecture from 2015 about the running time of a popular Kelley's cutting plane variant to minimize these composite objectives. This is joint work with Madeleine Udell and Song Zhou.

October 19: Samira Samadi, CS, Georgia Tech
The Price of Fair PCA: One Extra Dimension

Abstract: We investigate whether the standard dimensionality reduction techniques inadvertently produce data representations with different fidelity for two different populations. We show on several real-world datasets, PCA has higher reconstruction error on population A than B (for example, women versus men or lower versus higher-educated individuals). This can happen even when the dataset has similar number of samples from A and B . This motivates our study of dimensionality reduction techniques which maintain similar fidelity for A as B . We give an efficient algorithm for finding a projection which is nearly-optimal with respect to this measure, and evaluate it on several datasets. This is a joint work with Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, and Santosh Vempala.

October 26: Dr. Yu Cheng, CS, Duke University
High-Dimensional Robust Mean Estimation in Nearly-Linear Time

Abstract: We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions.

In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tails, and (ii) unknown bounded covariance. Given $$N$$ samples on $$R^d$$, an $$\epsilon$$-fraction of which may be arbitrarily corrupted, our algorithms run in time $$\tilde{O}(Nd)/poly(\epsilon)$$ and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times $$\tilde{\Omega}(N d^2)$$.

Our algorithms rely on a natural family of SDPs parameterized by our current guess $$\nu$$ for the unknown mean $$\mu^\star$$. We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for $$\mu^\star$$ -- independent of our current guess $$\nu$$ -- or the dual SDP yields a new guess $$\nu'$$ whose distance from $$\mu^\star$$ is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.

This is a joint work with Ilias Diakonikolas and Rong Ge.

November 2: Starting at 12:20 pm, 2 talks
(12:20 pm) Dr. Thinh T. Doan, ISyE/ECE, Georgia Tech
Accelerating the Convergence Rate of Distributed Subgradient Methods with Adaptive Quantization

Abstract: In this talk, I will present a popular distributed method, namely, distributed consensus-based gradient (DCG) method, for solving optimal learning problems over a network of agents. Such problems arise in many applications such as, finding optimal parameters over a large dataset distributed among a network of processors or seeking an optimal policy for coverage control problems in robotic networks. The focus is to present our recent results, where we study the performance of DCG when the agents are only allowed to exchange their quantized values due to their finite communication bandwidth. In particular, we develop a novel quantization method, which we refer to as adaptive quantization. The main idea of our approach is to quantize the nodes' estimates based on the progress of the algorithm, which helps to eliminate the quantized errors. Under the adaptive quantization, we then derive the bounds on the convergence rates of the proposed method as a function of the bandwidths and the underlying network topology, for both convex and strongly convex objective functions. Our results suggest that under the adaptive quantization, the rate of convergence of DCG with and without quantization are the same, except for a factor which captures the number of quantization bits. To the best of the authors’ knowledge, the results in this paper are considered better than any existing results for DCG under quantization.

This is based on a joint work with Siva Theja Maguluri and Justin Romberg.

Bio: Thinh T. Doan is a TRIAD postdoctoral fellow at Georgia Institute of Technology, joint between the School of Industrial and Systems Engineering and the School of Electrical and Computer Engineering (ECE). He was born in Vietnam, where he got his Bachelor degree in Automatic Control at Hanoi University of Science and Technology in 2008. He obtained his Master and Ph.D. degrees both in ECE from the University of Oklahoma in 2013 and the University of Illinois at Urbana-Champaign in 2018, respectively. His research interests lie at the intersection of control theory, optimization, distributed algorithms, and applied probability, with the main applications in machine learning, reinforcement learning, power networks, and multi-agent systems.

(1:10 pm) Yingjie Qian, Math, Georgia Tech
Polynomial Method and Graph Bootstrap Percolation

Abstract: We introduce a simple method for proving lower bounds for the size of the smallest percolating set in a certain graph bootstrap process. We apply this method to determine the sizes of the smallest percolating sets in multi- dimensional tori and multidimensional grids (in particular hypercubes). The former answers a question of Morrison and Noel, and the latter provides an alternative and simpler proof for one of their main results. This is based on a joint work with Lianna Hambardzumyan and Hamed Hatami.

November 9: (25-30 minute talk) Dr. Lance Fortnow, CS, Georgia Tech
Randomness vs Quantumness

Abstract: Often the popular press talks about the power of quantum computing coming from its ability to perform several computations simultaneously. We’ve already had a similar capability from probabilistic machines. This talk will explore the relationship between quantum and randomized computation, how they are similar and how they differ, and why quantum can work exponentially faster on some but far from all computational problems. We’ll talk about some open problems in quantum complexity that can help shed light on the similarities and differences between randomness and “quantumness”.

This talk will not assume any previous knowledge of quantum information or quantum computing.

November 16: Dr. Mark Rudelson, Math, University of Michigan
Cancelations in random sums

Abstract: Consider a linear combination of independent identically distributed random variables $$X_1, . . . , X_n$$ with fixed weights $$a_1, . . . a_n$$. If the random variables are continuous, the sum is almost surely non-zero. However, for discrete random variables an exact cancelation may occur with a positive probability. This probability depends on the arithmetic nature of the sequence $$a_1, . . . a_n$$. We will discuss how to measure the relevant arithmetic properties and how to evaluate the probability of the exact and approximate cancelation.

November 23 (Closed, Thanksgiving Break)

November 30: by Samantha Petti, Math, Georgia Tech
Sparse random graphs with overlapping community structure

Abstract: In this talk we introduce two different random graph models that produce sparse graphs with overlapping community structure and discuss community detection in each context. The Random Overlapping Community (ROC) model produces a sparse graph by constructing many Erdos Renyi random graphs (communities) on small randomly selected subsets of vertices. By varying the size and density of these communities, ROC graphs can be tuned to exhibit a wide range normalized of closed walk count vectors, including those of hypercubes. This is joint work with Santosh Vempala. In the second half of the talk, we introduce the Community Configuration Model (CCM), a variant of the configuration model in which half-edges are assigned colors and pair according to a matching rule on the colors. The model is a generalization of models in the statistical physics literature and is a natural finite analog for classes of graphexes. We describe a hypothesis testing algorithm that determines whether a graph came from a community configuration model or a traditional configuration model. This is joint work with Christian Borgs, Jennifer Chayes, Souvik Dhara, and Subhabrata Sen.

## Contact

If you are interested in giving a talk at the seminar, you can contact any of the organizers: He Guo or Guanyi Wang. (Thank Bhuvesh Kumar and Uthaipon (Tao) Tantipongpipat for helping organizing in 2018--2019.)

# ACO Student Seminar Fall 2017, Spring 2018

### Spring 2018

January 19: Sarah Cannon, CS, Georgia Tech
Markov Chains and Emergent Behavior

Abstract: Studying random samples drawn from large, complex sets is one way to begin to learn about typical properties and behaviors. However, it is important that the samples examined are random enough: studying samples that are unexpectedly correlated or drawn from the wrong distribution can produce misleading conclusions. Sampling processes using Markov chains have been utilized in physics, chemistry, and computer science, among other fields, but they are often applied without careful analysis of their reliability. Making sure widely-used sampling processes produce reliably representative samples is a main focus of my research, and in this talk I'll touch on two specific applications from statistical physics and combinatorics.

I'll also discuss work applying these same Markov chain processes used for sampling in a novel way to address research questions in programmable matter and swarm robotics, where a main goal is to understand how simple computational elements can accomplish complicated system-level goals. In a constrained setting, we've answered this question by showing that groups of abstract particles executing our simple processes (which are derived from Markov chains) can provably accomplish remarkable global objectives. In the long run, one goal is to understand the minimum computational abilities elements need in order to exhibit complex global behavior, with an eye towards developing systems where individual components are as simple as possible.

This talk includes joint work with Marta Andrés Arroyo, Joshua J. Daymude, Daniel I. Goldman, David A. Levin, Shengkai Li, Dana Randall, Andréa Richa, William Savoie, Alexandre Stauffer, and Ross Warkentin.

January 26: David Hemmi, CS, Monash University
High-level modelling and solving of combinatorial stochastic programs

Abstract: Stochastic programming is concerned with decision making under uncertainty, seeking an optimal policy with respect to a set of possible future scenarios. While the value of Stochastic Programming is obvious to many practitioners, in reality uncertainty in decision making is oftentimes neglected.

For deterministic optimisation problems, a coherent chain of modelling and solving exists. Employing standard modelling languages and solvers for stochastic programs is however difficult. First, they have (with exceptions) no native support to formulate Stochastic Programs. Secondly solving stochastic programs with standard solvers (e.g. MIP solvers) is often computationally intractable.

David will be talking about his research that aims to make Stochastic Programming more accessible. First, he will be talking about modelling deterministic and stochastic programs in the Constraint Programming language MiniZinc - a modelling paradigm that retains the structure of a problem much more strongly than MIP formulations. Secondly, he will be talking about decomposition algorithms he has been working on to solve combinatorial Stochastic Programs.

Bio:
David is a PhD student at Monash University in Melbourne. Before joining Monash, he completed a Bachelors degree in Systems Engineering in Switzerland, studied Robotics at University of Bristol and worked in various engineering roles. As part of his Phd, David is working on a solver system for stochastic combinatorial optimisation problems that are modelled in in the Constraint Programming language MiniZinc. In future, he would like to combine his passion for automation with the research he is doing in operations research.

February 2: (Skiles 006, 1-2 pm) Audra McMillan, Math, University of Michigan
Local Differential Privacy for Physical Sensor Data and Sparse Recovery

Abstract: Physical sensors (thermal, light, motion, etc.) are becoming ubiquitous and offer important benefits to society. However, allowing sensors into our private spaces has resulted in considerable privacy concerns. Differential privacy has been developed to help alleviate these privacy concerns. In this talk, we’ll develop and define a framework for releasing physical data that preserves both utility and provides privacy. Our notion of closeness of physical data will be defined via the Earth Mover Distance and we’ll discuss the implications of this choice. Physical data, such as temperature distributions, are often only accessible to us via a linear transformation of the data.

We’ll analyse the implications of our privacy definition for linear inverse problems, focusing on those that are traditionally considered to be "ill-conditioned”. We’ll then instantiate our framework with the heat kernel on graphs and discuss how the privacy parameter relates to the connectivity of the graph. Our work indicates that it is possible to produce locally private sensor measurements that both keep the exact locations of the heat sources private and permit recovery of the general geographic vicinity'' of the sources. Joint work with Anna C. Gilbert.

February 9: (Joint with ARC Colloquium) Greg Bodwin, CS, MIT
The Distance Oracle Hierarchy

Abstract: A lot of well-studied problems in CS Theory are about making “sketches” of graphs that occupy much less space than the graph itself, but where the shortest path distances of the graph can still be approximately recovered from the sketch. For example, in the literature on Spanners, we seek a sparse subgraph whose distance metric approximates that of the original graph. In Emulator literature, we relax the requirement that the approximating graph is a subgraph. Most generally, in Distance Oracles, the sketch can be an arbitrary data structure, so long as it can approximately answer queries about the pairwise distance between nodes in the original graph.

Research on these objects typically focuses on optimizing the worst-case tradeoff between the quality of the approximation and the amount of space that the sketch occupies. In this talk, we will survey a recent leap in understanding about this tradeoff, overturning the conventional wisdom on the problem. Specifically, the tradeoff is not smooth, but rather it follows a new discrete hierarchy in which the quality of the approximation that can be obtained jumps considerably at certain predictable size thresholds. The proof is graph-theoretic and relies on building large families of graphs with large discrepancies in their metrics.

March 16: Gramoz Goranci, CS, University of Vienna
Fully Dynamic Low-Diameter Decomposition with Applications

Abstract: A low-diameter decomposition (LDD) of an undirected graph G is a partitioning of G into components of bounded diameter, such that only a small fraction of original edges are between the components. This decomposition has played instrumental role in the design of low-stretch spanning tree, spanners, distributed algorithms etc.

A natural question is whether such a decomposition can be efficiently maintained/updated as G undergoes insertions/deletions of edges. We make the first step towards answering this question by designing a fully-dynamic graph algorithm that maintains an LDD in sub-linear update time.

It is known that any undirected graph G admits a spanning tree T with nearly logarithmic average stretch, which can be computed in nearly linear-time. This tree decomposition underlies many recent progress in static algorithms for combinatorial and scientific flows. Using our dynamic LDD algorithm, we present the first non-trivial algorithm that dynamically maintains a low-stretch spanning tree in $$\tilde{O}(t^2)$$ amortized update time, while achieving $$(t + \sqrt{n^{1+o(1)}/t})$$ stretch, for every $$1 \leq t \leq n$$.

Joint work with Sebastian Krinninger.

March 23: Closed, Spring Break

April 6: Uthaipon (Tao) Tantipongpipat, Georgia Tech
Approximation algorithms for optimal design problems

Abstract: The talk is based on https://arxiv.org/abs/1802.08318.

April 20: Kira Goldner, CSE, University of Washington
Selling Partially-Ordered Items: Exploring the Space between Single- and Multi-Dimensional Mechanism Design

Abstract: Consider the problem of selling items to a unit-demand buyer. Most work on maximizing seller revenue considers either a setting that is single dimensional, such as where the items are identical, or multi-dimensional, where the items are heterogeneous. With respect to revenue-optimal mechanisms, these settings sit at extreme ends of a spectrum: from simple and fully characterized (single-dimensional) to complex and nebulous (multi-dimensional).

In this paper, we identify a setting that sits in between these extremes. We consider a seller who has three services {A,B,C} for sale to a single buyer with a value v and an interest G from {A,B,C}, and there is a known partial ordering over the services. For example, suppose the seller is selling {internet}, {internet, phone}, and {internet, cable tv}. A buyer with interest {internet} would be satisfied by receiving phone or cable tv in addition, but a customer whose interest is {internet, phone} cannot be satisfied by any other option. Thus this corresponds to a partial-ordering where {internet} > {internet, phone} and {internet} > {internet, cable tv}, but {internet, phone} and {internet, cable tv} are not comparable.

We show formally that partially-ordered items lie in a space of their own, in between identical and heterogeneous items: there exist distributions over (value, interest) pairs for three partially-ordered items such that the menu complexity of the optimal mechanism is unbounded, yet for all distributions there exists an optimal mechanism of finite menu complexity. So this setting is vastly more complex than identical items (where the menu complexity is one), or even “totally-ordered” items as in the FedEx Problem [FGKK16] (where the menu complexity is at most seven, for three items), yet drastically more structured than heterogeneous items (where the menu complexity can be uncountable [DDT15]). We achieve this result by proving a characterization of the class of best duals and by giving a primal recovery algorithm which obtains the optimal mechanism. In addition, we (1) extend our lower-bound to the Multi-Unit Pricing setting, (2) give a tighter and deterministic characterization of the optimal mechanism when the buyer’s distribution satisfies the declining marginal revenue condition, and (3) prove a master theorem that allows us to reason about duals instead of distributions.

Joint work with Nikhil Devanur, Raghuvansh Saxena, Ariel Schvartzman, and Matt Weinberg.

Kira will be around all week. Please email her to schedule one-on-one meetings.

### Fall 2017

September 15: Peng Zhang, CS, Georgia Tech
Algorithm and Hardness for Linear Elasticity Problems

Abstract: In this talk, we study solvers for geometrically embedded graph structured block linear systems. The general form of such systems, PSD-Graph-Structured Block Matrices (PGSBM), arise in scientific computing, linear elasticity, the inner loop of interior point algorithms for linear programming, and can be viewed as extensions of graph Laplacians into multiple labels at each graph vertex. Linear elasticity problems, more commonly referred to as trusses, describe forces on a geometrically embedded object.

We present an asymptotically faster algorithm for solving linear systems in well-shaped 3-D trusses. Our algorithm utilizes the geometric structures to combine nested dissection and support theory, which are both well studied techniques for solving linear systems. We decompose a well-shaped 3-D truss into balanced regions with small boundaries, run Gaussian elimination to eliminate the interior vertices, and then solve the remaining linear system by preconditioning with the boundaries.

On the other hand, we prove that the geometric structures are "necessary" for designing fast solvers. Specifically, solving linear systems in general trusses is as hard as solving general linear systems over the real. Furthermore, we give some other PGSBM linear systems for which fast solvers imply fast solvers for general linear systems.

Based on the joint works with Robert Schwieterman and Rasmus Kyng.

September 29: He Guo, Math, Georgia Tech
Packing nearly optimal Ramsey R(3,t) Graphs

Abstract: In 1995 Kim famously proved the Ramsey bound $$R(3,t) \ge c t^2/\log t$$ by constructing an $$n$$-vertex graph that is triangle-free and has independence number at most $$C \sqrt{n \log n}$$. We extend this celebrated result, which is best possible up to the value of the constants, by approximately decomposing the complete graph $$K_n$$ into a packing of such nearly optimal Ramsey $$R(3,t)$$ graphs.

More precisely, for any $$\epsilon>0$$ we find an edge-disjoint collection $$(G_i)_i$$ of $$n$$-vertex graphs $$G_i \subseteq K_n$$ such that (a) each $$G_i$$ is triangle-free and has independence number at most $$C_\epsilon \sqrt{n \log n}$$, and (b) the union of all the $$G_i$$ contains at least $$(1-\epsilon)\binom{n}{2}$$ edges. Our algorithmic proof proceeds by sequentially choosing the graphs $$G_i$$ via a semi-random (i.e., Rödl nibble type) variation of the triangle-free process.

As an application, we prove a conjecture in Ramsey theory by Fox, Grinshpun, Liebenau, Person, and Szabó (concerning a Ramsey-type parameter introduced by Burr, Erdös, Lovász in 1976). Namely, denoting by $$s_r(H)$$ the smallest minimum degree of $$r$$-Ramsey minimal graphs for $$H$$, we close the existing logarithmic gap for $$H=K_3$$ and establish that $$s_r(K_3) = \Theta(r^2 \log r)$$.

Based on joint work with Lutz Warnke.

October 6: Josh Daymude, Arizona State University/GaTech theory lab
A Stochastic Approach to Shortcut Bridging in Programmable Matter

Abstract: In a self-organizing particle system, an abstraction of programmable matter, simple computational elements called particles with limited memory and communication self-organize to solve system-wide problems of movement, coordination, and configuration. In this paper, we consider stochastic, distributed, local, asynchronous algorithms for 'shortcut bridging', in which particles self-assemble bridges over gaps that simultaneously balance minimizing the length and cost of the bridge. Army ants of the genus Eticon have been observed exhibiting a similar behavior in their foraging trails, dynamically adjusting their bridges to satisfy an efficiency tradeoff using local interactions. Using techniques from Markov chain analysis, we rigorously analyze our algorithm, show it achieves a near-optimal balance between the competing factors of path length and bridge cost, and prove that it exhibits a dependence on the angle of the gap being 'shortcut' similar to that of the ant bridges. We also present simulation results that qualitatively compare our algorithm with the army ant bridging behavior. Our work presents a plausible explanation of how convergence to globally optimal configurations can be achieved via local interactions by simple organisms (e.g., ants) with some limited computational power and access to random bits. The proposed algorithm demonstrates the robustness of the stochastic approach to algorithms for programmable matter, as it is a surprisingly simple extension of a stochastic algorithm for compression.

This is joint work between myself/my professor Andrea Richa at ASU and Sarah Cannon and Prof. Dana Randall here at GaTech.

October 13: David Durfee, CS, Georgia Tech
Determinant-Preserving Sparsification of SDDM Matrices with Applications to Counting and Sampling Spanning Trees

Abstract: We show variants of spectral sparsification routines can preserve the total spanning tree counts of graphs, which by Kirchhoff's matrix-tree theorem, is equivalent to determinant of a graph Laplacian minor, or equivalently, of any SDDM matrix. Our analyses utilizes this combinatorial connection to bridge between statistical leverage scores / effective resistances and the analysis of random graphs by [Janson, Combinatorics, Probability and Computing `94]. This leads to a routine that in quadratic time, sparsifies a graph down to about $$n^{1.5}$$ edges in ways that preserve both the determinant and the distribution of spanning trees (provided the sparsified graph is viewed as a random object). Extending this algorithm to work with Schur complements and approximate Choleksy factorizations leads to algorithms for counting and sampling spanning trees which are nearly optimal for dense graphs.

We give an algorithm that computes a $$(1\pm \delta)$$ approximation to the determinant of any SDDM matrix with constant probability in about $$n^2\delta^{−2}$$ time. This is the first routine for graphs that outperforms general-purpose routines for computing determinants of arbitrary matrices. We also give an algorithm that generates in about $$n^2\delta^{−2}$$ time a spanning tree of a weighted undirected graph from a distribution with total variation distance of $$\delta$$ from the w-uniform distribution.

This is joint work with John Peebles, Richard Peng and Anup B. Rao.

October 20: (11am at Skiles 005, ARC and Combinatorics) Dr. Mike Molloy, Department of Computer Science, University of Toronto
The list chromatic number of graphs with small clique number (vedio)

Abstract: We prove that every triangle-free graph with maximum degree $$\Delta$$ has list chromatic number at most $$(1+o(1))\frac{\Delta}{\ln \Delta}$$. This matches the best-known bound for graphs of girth at least 5.  We also provide a new proof  that for any $$r\geq4$$ every $$K_r$$-free graph has list-chromatic number at most $$200r\frac{\Delta\ln\ln\Delta}{\ln\Delta}$$.

November 3: (3pm at Skiles 005, Combinatorics) Dr. Joel Spencer, Department of Computer Science and Department of Mathematics, Courant Institute, New York University
86 Years of Ramsey R(3,k). (and counting!) (vedio)

Abstract: The search for the asymptotics of the Ramsey function R(3,k) has a long and fascinating history. It begins in the hill country surrounding Budapest and winding over the decades through Europe, America, Korea and Rio de Janiero. We explore it through a CS lens, giving algorithms that provide the various upper and lower bounds. The arguments are various more or less sophisticated uses of Erdos Magic and, indeed, many of the most important advances in the Probabilistic Method have come from these investigations.

November 24 (Closed, Thanksgiving Break)

## Contact

If you are interested in giving a talk at the seminar, you can contact any of the organizers: He Guo or Guanyi Wang. (Many thanks to past co-organizers Kevin Lai and Saurabh Sawlani's work in 2017-2018).

## Other Semesters

Spring 2017; Fall 2016; Spring 2016; Fall 2015; Spring 2015; Fall 2014; Previous ones.