Tensor decompositions for multi-context data analysis

Abstract:

Principal component analysis (PCA) is a universal tool in data analysis. It decomposes the covariance matrix to find directions that explain variance. In many domains, data are now collected across multiple contexts (for example, individuals with different diseases, or cells of different types). One example is the contrastive setting, which has two contexts: a foreground dataset, representing an experimental group, and a background dataset, representing a control group. Factors are sought that explain the foreground after “subtracting off” the background. With more than two contexts, we seek factors that are shared across subsets of contexts. This talk will introduce two new tensor decomposition algorithms for finding such factors, building on PCA and independent component analysis (ICA). First, we extend ICA to the contrastive setting, via a method we call contrastive ICA. Second, we propose a general-purpose algorithm for tensor decomposition, which we call the multi-subspace power method. We will discuss expressivity and identifiability of the approaches, via multi-linear algebra, as well as algorithmic performance, guarantees, and applications to real-world data. Based on joint work with Ada Wang, Aida Maraj, João Pereira and Joe Kileel.
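As a rough illustration of the contrastive setting described in the abstract, the sketch below compares ordinary PCA on a foreground dataset with a covariance-level contrast, taking the leading eigenvector of the foreground covariance minus a multiple of the background covariance. This is a contrastive-PCA-style toy, not the contrastive ICA or multi-subspace power method of the talk, which work with higher-order tensors. The function names, the weight `alpha`, the synthetic data, and the use of a shifted power iteration are all assumptions made for the example.

```python
import random


def covariance(rows):
    """Sample covariance matrix of a list of equal-length data rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(mat, iters=500):
    """Eigenvector for the largest (algebraic) eigenvalue of a symmetric matrix.

    The contrast matrix may be indefinite, so we add a Gershgorin-bound shift
    that makes it positive semidefinite before running power iteration.
    """
    d = len(mat)
    shift = max(sum(abs(x) for x in row) for row in mat)
    v = [1.0 / (i + 1) for i in range(d)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) + shift * v[i]
             for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def contrastive_direction(foreground, background, alpha=1.0):
    """Leading eigenvector of cov(foreground) - alpha * cov(background).

    alpha is a hypothetical tuning weight for how strongly the background
    is subtracted off.
    """
    c_fg, c_bg = covariance(foreground), covariance(background)
    d = len(c_fg)
    diff = [[c_fg[i][j] - alpha * c_bg[i][j] for j in range(d)]
            for i in range(d)]
    return top_eigenvector(diff)


# Synthetic data: both groups vary strongly along axis 0,
# but only the foreground has substantial variance along axis 1.
rng = random.Random(0)
background = [[rng.gauss(0, 3), rng.gauss(0, 0.5)] for _ in range(500)]
foreground = [[rng.gauss(0, 3), rng.gauss(0, 2)] for _ in range(500)]

pca_dir = top_eigenvector(covariance(foreground))
cpca_dir = contrastive_direction(foreground, background)
print("PCA direction on foreground:", pca_dir)
print("Contrastive direction:", cpca_dir)
```

In this toy example, ordinary PCA on the foreground is dominated by the variation shared with the background (axis 0), while the contrast picks out the direction specific to the foreground (axis 1).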

Anna Seigal - Bio

Anna Seigal is an Assistant Professor of Applied Mathematics in the School of Engineering and Applied Sciences at Harvard, and an Affiliate in the Department of Statistics. Her research is in applied algebra. She was previously a Junior Fellow at the Society of Fellows at Harvard University. Before that, she held a Hooke Research Fellowship in the Mathematical Institute at the University of Oxford and a Junior Research Fellowship at The Queen's College. She received her PhD from UC Berkeley in 2019. She is a recipient of the SIAM Richard C. DiPrima Prize and an Alfred P. Sloan Research Fellowship in Mathematics.