Tensor decompositions for multi-context data analysis
Abstract:
Principal component analysis (PCA) is a universal tool in data analysis. It decomposes the covariance matrix to find directions that explain variance. In many domains, data are now collected across multiple contexts (for example, individuals with different diseases, or cells of different types). One example is the contrastive setting, which has two contexts: a foreground dataset, representing an experimental group, and a background dataset, representing a control group. Factors are sought that explain the foreground after “subtracting off” the background. With more than two contexts, we seek factors that are shared across subsets of contexts. This talk will introduce two new tensor decomposition algorithms for finding such factors, building on PCA and independent component analysis (ICA). First, we extend ICA to the contrastive setting, via a method we call contrastive ICA. Second, we propose a general-purpose algorithm for tensor decomposition, which we call the multi-subspace power method. We will discuss expressivity and identifiability of the approaches, via multi-linear algebra, as well as algorithmic performance, guarantees, and applications to real-world data. The talk is based on joint work with Ada Wang, Aida Maraj, João Pereira and Joe Kileel.
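To make the contrastive setting concrete, here is a minimal sketch at the level of covariance matrices only. It is not contrastive ICA or the multi-subspace power method from the talk; it simply shows how "subtracting off" a background covariance can change which direction stands out. The function names, the weight `alpha`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def top_directions(sym_matrix, k):
    """Return the k largest eigenvalues and their eigenvectors (as columns)."""
    eigvals, eigvecs = np.linalg.eigh(sym_matrix)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]

def contrastive_directions(foreground, background, k=1, alpha=1.0):
    """Leading directions of cov(foreground) - alpha * cov(background).

    alpha is a hypothetical contrast weight, chosen here for illustration.
    """
    c_fg = np.cov(foreground, rowvar=False)
    c_bg = np.cov(background, rowvar=False)
    return top_directions(c_fg - alpha * c_bg, k)

# Synthetic data (an assumption): both groups share large variance along
# axis 0, and only the foreground has extra variance along axis 2.
rng = np.random.default_rng(0)
n = 5000
background = rng.normal(size=(n, 3)) * np.array([5.0, 1.0, 1.0])
foreground = rng.normal(size=(n, 3)) * np.array([5.0, 1.0, 3.0])

# Plain PCA on the foreground is dominated by the shared axis 0.
_, pca_dir = top_directions(np.cov(foreground, rowvar=False), 1)

# After subtracting the background covariance, axis 2 dominates instead.
_, con_dir = contrastive_directions(foreground, background, k=1)

print("PCA direction:        ", np.round(np.abs(pca_dir[:, 0]), 2))
print("Contrastive direction:", np.round(np.abs(con_dir[:, 0]), 2))
```

In this toy example the leading PCA direction aligns with the variation common to both groups, while the contrastive direction aligns with the variation specific to the foreground.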
