Scalable Semidefinite Programming
Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This talk develops new provably correct algorithms for solving large SDP problems by economizing on both the storage and the arithmetic costs. We present two methods: one based on sketching, and the other on complementarity. Numerical evidence shows that these methods are effective for a range of applications, including relaxations of MaxCut, abstract phase retrieval, and quadratic assignment, and can handle SDP instances where the matrix variable has over 10^13 entries.
The first method modifies a standard primal-dual convex method - the conditional-gradient augmented Lagrangian method - to store (and work with) a sketched version of the primal. A low-rank approximation to the primal solution can be recovered from this sketch.
The second method begins with a new approximate complementarity principle: Given an approximate solution to the dual SDP, the primal SDP has an approximate solution whose range is contained in the null space of the dual slack matrix. For weakly constrained SDPs, this null space has very low dimension, so this observation significantly reduces the search space for the primal solution.
This result suggests an algorithmic strategy that can be implemented with minimal storage:
(1) Solve the dual SDP approximately (with any first-order solver);
(2) compress the primal SDP to the null space of the dual slack matrix;
(3) solve the compressed primal SDP
Madeleine Udell is Assistant Professor of Operations Research and Information Engineering and Richard and Sybil Smith Sesquicentennial Fellow at Cornell University. She studies optimization and machine learning for large scale data analysis and control, with applications in marketing, demographic modeling, medical informatics, engineering system design, and automated machine learning.
Her research in optimization centers on detecting and exploiting novel structures in optimization problems, with a particular focus on convex and low-rank problems. These structures lead the way to automatic proofs of optimality, better complexity guarantees, and faster, more memory-efficient algorithms. She has developed a number of open-source libraries for modeling and solving optimization problems, including Convex.jl, one of the top tools in the Julia language for technical computing.
Her research in machine learning centers on methods for imputing missing data in large tabular data sets. Her work on generalized low-rank models (GLRMs) extends principal components analysis (PCA) to embed tabular data sets with heterogeneous (numerical, Boolean, categorical, and ordinal) types into a low dimensional space, providing a coherent framework for compressing, denoising, and imputing missing entries. This research enables novel applications in medical informatics, quantitative finance, marketing, causal inference, and automated machine learning, among others.
At Cornell, Madeleine has advised more than 20 students and postdocs on research projects sponsored by DARPA, NSF, and Capital One. She has developed several new courses in optimization and machine learning, earning the Douglas Whitney ’61 Engineering Teaching Excellence Award in 2018.
Madeleine completed her Ph.D. at Stanford University in Computational & Mathematical Engineering in 2015 under the supervision of Stephen Boyd, and a one-year postdoctoral fellowship at Caltech in the Center for the Mathematics of Information hosted by Professor Joel Tropp. At Stanford, she was awarded an NSF Graduate Fellowship, a Gabilan Graduate Fellowship, and a Gerald J. Lieberman Fellowship, and was selected as the doctoral student member of Stanford's School of Engineering Future Committee to develop a road-map for the future of engineering at Stanford over the next 10–20 years. She received a B.S. degree in Mathematics and Physics, summa cum laude, with honors in mathematics and in physics, from Yale University.