Wednesdays;   3:00 pm   Seminar Archives
collapse allexpand all
To speak at this seminar, contact Afonso Bandeira.

2012 Collapse/Expand

return to top

Date: February 8
Room: Fine 214
Speaker: Lanhui Wang, PACM
Title: A Fourier-based Approach for Iterative 3D Reconstruction from Cryo-EM Images
Abstract: A major challenge in single particle reconstruction methods using cryo-electron microscopy is to attain a resolution sufficient to interpret fine details in three-dimensional (3D) macromolecular structures. Obtaining high resolution 3D reconstructions is difficult due to unknown orientations and positions of the imaged particles, possible incomplete coverage of the viewing directions, high level of noise in the projection images, and limiting effects of the contrast transfer function of the electron microscope. In this paper, we focus on the 3D reconstruction problem from projection images assuming an existing estimate for their orientations and positions. We propose a fast and accurate Fourier-based Iterative Reconstruction Method (FIRM) that exploits the Toeplitz structure of the operator ${\bf A}^{*}{\bf A}$, where $\bf A$ is the forward projector and ${\bf A}^{*}$ is the back projector. The operator ${\bf A}^{*}{\bf A}$ is equivalent to a convolution with a kernel. The kernel is pre-computed using the non-uniform Fast Fourier Transform and is efficiently applied in each iteration step. The iterations by FIRM are therefore considerably faster than those of traditional iterative algebraic approaches, while maintaining the same accuracy even when the viewing directions are unevenly distributed. The time complexity of FIRM is comparable to the gridding-based direct Fourier inversion method. Moreover, FIRM combines images from different defocus groups simultaneously and can handle a wide range of regularization terms. We provide experimental results on simulated data that demonstrate the speed and accuracy of FIRM in comparison with current methods.

Date: February 15
Room: Fine 214
Speaker: Jane Zhao, PACM
Title: Rotationally Invariant Image Representation for Viewing Angle Classification in Cryo-EM
Abstract: We introduce a new rotationally invariant viewing angle classification method for identifying, among a large number of Cryo-EM projection images, similar views without prior knowledge of the molecule. Our rotationally invariant features are based on the bispectrum. Each image is first expanded in an orthonormal steerable basis to generate expansion coefficients. Rotating an image is equivalent to phase shifting the expansion coefficients. Thus we are able to extend the theory of bispectrum of 1D periodic signals to 2D images. The randomized PCA algorithm is then used to efficiently reduce the dimensionality of the bispectrum coefficients, enabling fast computation of the similarity between any pair of images. The nearest neighbors provide an initial classification of similar viewing angles. In this way, rotational alignment is only performed for images with their nearest neighbors. The initial nearest neighbor classification and alignment are further improved by a new classification method called vector diffusion map.

Date: February 22
Room: Fine 214
Speaker: Nikhil Srivastava, Computer Science, Princeton
Title: Covariance Estimation for Distributions with 2+εMoments
Abstract: We study the minimal sample size N=N(n) that suffices to estimate the covariance matrix of an n-dimensional distribution by the sample covariance matrix in the operator norm, and with an arbitrary fixed accuracy. We establish the optimal bound N = O(n) for every distribution whose k-dimensional marginals have uniformly bounded 2+\epsilon moments outside the sphere of radius O(\sqrt{k}). In the specific case of log-concave distributions, this result provides an alternative approach to the Kannan-Lovasz-Simonovits problem, which was recently solved by Adamczak, Litvak, Pajor and Tomczak-Jaegermann. Moreover, a lower estimate on the covariance matrix holds under a weaker assumption -- uniformly bounded 2+\epsilon moments of one-dimensional marginals. Our argument proceeds by randomizing the spectral sparsification technique of Batson, Spielman and Srivastava. The spectral edges of the sample covariance matrix are controlled via the Stieltjes transform evaluated at carefully chosen random points.

Date: March 7
Room: Fine 214
Speaker: George F. Young, Mechanical & Aerospace Engineering, Princeton University
Title: Optimising Robustness of Consensus to Noise
Abstract: One of the key benefits of consensus in a multi-agent system is that it can improve decision making in the presence of uncertainty. However, the act of communication itself will usually introduce additional noise to the system. For this reason it is important to understand how well a group can reach and maintain consensus in the presence of communication noise and in particular whether it is possible to minimise the disturbances created by this noise. Here we use a measure of group dispersion due to noise, which can be interpreted as the H2 norm of the consensus system, to relate the communication graph between the agents to the group robustness. We explore some fundamental bounds on performance and investigate how to improve robustness when the graph is a tree. Our metric is also used to evaluate different sensing strategies in starling flocks to determine which ones achieve robust consensus in the most efficient manner.

Date: March 14
Room: Fine 214
Speaker: Hau-Tieng Wu, PACM, Princeton University
Title: Manifold adaptive local linear regression and its application to sleep depth estimation
Abstract: We study nonparametric regression with high-dimensional massive dataset, when the predictors lie on an unknown, lower-dimensional geometric object. In particular, we focus on the sleep depth estimation problem, where the dataset is generated by applying Synchrosqueezing transform, a time frequency analysis technique, to the sleep data. When the geometric object is a smooth manifold, our approach to the nonparametric regression is to reduce the dimensionality first and then construct the local linear regression (LLR) directly on the tangent plane approximation to the manifold. Under mild conditions, asymptotic expressions for the conditional mean squared error of the proposed estimator are derived for both the interior and the boundary cases. One implication of these results is that the optimal convergence rate depends only on the intrinsic dimension $d$ of the manifold, but not on the ambient space dimension $p$. Another implication is that the estimator is design adaptive and automatically adapts to the boundary of the unknown manifold. The proposed method has a strong connection with manifold learning and the second implication leads to a new diffusion map framework.

Date: March 28
Room: Fine 214
Speaker: Dimitris Giannankis, Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences
Title: Capturing intermittent and low-frequency variability in high-dimensional time series through nonlinear Laplacian spectral analysis
Abstract: Nonlinear Laplacian spectral analysis (NLSA) is a method for spatiotemporal analysis of high-dimensional data, which represents temporal patterns via orthonormal basis functions on the nonlinear data manifold. Through the use of such basis functions (determined by means of graph Laplace-Beltrami eigenfunction algorithms), NLSA captures intermittency, rare events, and other nonlinear dynamical features which are not accessible through classical linear approaches such as singular spectrum analysis. We discuss applications of NLSA to detection of decadal and intermittent variability in the North Pacific sector of comprehensive climate models, and dimension reduction of a chaotic low-order model of the atmosphere.

Date: April 11
Room: Fine 214
Speaker: Xiuyuan Cheng - PACM Graduate Student
Title: The Spectrum of Random Kernel Matrices
Abstract: This talk is about random matrix problems motivated by the use of kernel method in high-dimensional statistical learning. We will present a mathematical result concerning the limiting spectral density of inner-product kernel matrices built from p-dimensional Gaussian vectors X_1, ... X_n, where both p and n increase to infinity with p/n staying at a constant. The (i,j)-th entry of the random matrix equals f(X_i^TX_j), where f is a real-valued function. As a well-known classical result, the limiting spectral density is the Marcenko-Pastur distribution when f is a linear function. We will show that, for a large class of kernel function f, the limiting spectral density exists and is dictated by a cubic equation involving its Stieltjes transform. The new family of limiting densities includes the Marcenko-Pastur distribution and Wigner’s semi-circle as special cases. We will also talk about a "spiking" model of the kernel matrices where the vector X_i's admit a model of "low-dimensional information corrupted by high-dimensional noise".

Date: April 18
Room: Fine 214
Speaker: Afonso Bandeira - PACM Graduate Student
Title: Cheeger Inequality for the Graph Connection Laplacian
Abstract: The $O(d)$ Synchronization problem consists of estimating a set of unknown orthogonal transformations $O_i$ from noisy measurements of a subset of the pairwise ratios $O_iO_j^{-1}$. We formulate and prove a Cheeger-type inequality that relates a measure of how well it is possible to solve the $O(d)$ synchronization problem with the spectra of an operator, the graph Connection Laplacian. We also show how this inequality provides a worst case performance guarantee for a spectral method to solve this problem (This is joint work with Amit Singer (Princeton) and Daniel Spielman (Yale) ). These guarantees also play a major role on showing the stability of a certain method to solve the problem of reconstruction without phase (this part is joint work with: Boris Alexeev (Princeton), Dustin Mixon (Princeton) and Matthew Fickus (Air Force Inst. Tech.)).

Date: April 19
Room: Joseph Henry Room - Jadwin Hall
Speaker: Michael McCoy, California Institute of Technology
Title: Random geometry meets data analysis: Sharp recovery bounds for convex deconvolution
Abstract: Consider the common situation where observed data consists of the superposition of two signals. Some examples include: an image of the night sky containing both stars and galaxies; a communications message with impulsive noise; and a low rank matrix obscured by sparse corruptions. Deconvolution is the problem of determining the constituent signals from their superposition. A fundamental question is "When is deconvolution possible with a tractable algorithm?" We describe a convex optimization framework for deconvolution, and provide a geometric characterization of success in this framework. This geometric viewpoint, coupled with a natural incoherence model, leads us into the realm of random geometry. A powerful result from spherical integral geometry yields an exact formula for the probability that our program succeeds. This formula leads to simple summary statistics that reveal sharp phase transitions between success and failure, and important theoretical properties of convex regularizers. We apply our results to deconvolving the superposition of sparse vectors in random bases, a stylized robust communications protocol, and determining a low rank matrix corrupted by a matrix that is sparse in a random basis. We show that empirical results closely match our theoretical bounds. Joint work with Joel Tropp

Date: April 24
Room: 214 Fine Hall
Speaker: Ronen Talmon, Yale University
Title: Differential Stochastic Sensing: Intrinsic Modeling of Random Time Series with Applications to Nonlinear Tracking
Abstract: Many natural and artificial high-dimensional data sets are controlled by few lower-dimensional factors or drivers. As a result, the data is often highly structured and does not fill uniformly the high-dimensional space. In this talk, we present a "differential stochastic sensing" framework for inferring the independent controlling factors (or drivers) of high-dimensional time series. This approach provides intrinsic global modeling for noisy observations based on anisotropic diffusion and local dynamical models. The idea is to implicitly solve local differential equations based on local density estimates in a global graph-based mechanism that inverts the observation function and reveals the underlying structure. Moreover, it implicitly recovers the dynamical model of the data. Hence, it provides a foundation for sequential processing that is applied to nonlinear tracking problems. We revisit classical Bayesian filtering methods and discuss their relationship to the proposed approach. In addition, we show that the proposed intrinsic modeling is invariant under different observation schemes and is noise resilient. Hence, it may be applied to a wide variety of applications. In this talk, we demonstrate applications to the processing of financial and neuroscience time series, and biological and medical imaging.

Date: April 25
Room: Fine 214
Speaker: Sewoong Oh, Massachusetts Institute of Technology
Title: Message-passing algorithms for approximate singular vector computation
Abstract: Low-rank matrix approximation is important in many applications for capturing the important aspects of data naturally described in a matrix form. In particular, we are interested in solving an inference problem using the leading singular vectors of a data matrix, which come from crowdsourcing platforms like Mechanical Turk. Crowdsourcing systems, in which numerous tasks are electronically distributed to numerous "information piece-workers", have emerged as an effective paradigm for human-powered solving of large scale problems. Because these low-paid workers can be unreliable, we need to devise schemes to infer the correct answers to these crowdsourcing tasks from possibly incorrect responses from the workers. In this talk, to solve this inference problem, we introduce a new message-passing algorithm and prove that this algorithm is asymptotically optimal through comparison to an oracle that knows the reliability of every worker. This algorithm is inspired by the power iteration method for computing the leading singular vectors, and there is an interesting relation between the fixed point of the message-passing algorithm and the leading singular vector. The extrinsic nature of message-passing allows us prove sharp asymptotic bounds on the performance using density evolution. However, tracking the densities of real-valued messages is an a priori difficult task. We establish that the messages are sub-Gaussian using recursion, and compute an upper bound on the parameters in a closed form.

Date: May 2
Room: Fine 214
Speaker: Onur Ozyesil, Princeton University
Title: Synchronization in Non-compact Groups: SE(k) Case
Abstract: We present new algorithms for the synchronization problem in SE(k), and focus on the "Compactification by Contraction" approach with applications to synthetic data measurements and to the "Sensor Network Localization" problem. We also give a performance analysis, based on random matrix theory, and identify possible generalizations of our approach for the synchronization problem on Cartan motion groups.

Date: May 24
Room: Joseph Henry Room - Jadwin Hall
Speaker: Ewout van den Berg, Stanford University
Title: Exploiting sparsity and low-dimensional structure: Two new applications
Abstract: We present new algorithms for the synchronization problem in SE(k), and focus on the "Compactification by Contraction" approach with applications to synthetic data measurements and to the "Sensor Network Localization" problem. We also give a performance analysis, based on random matrix theory, and identify possible generalizations of our approach for the synchronization problem on Cartan motion groups.

2011 Collapse/Expand

return to top

Date: February 8
Room: The Joseph Henry Room - Jadwin Hall
Speaker: Teng Zhang, University of Minnesota
Title: Modeling Data by Multiple Subspaces: Theory and Algorithms
Abstract: click to view
We study the problem of modeling data by several affine subspaces, which generalizes the common modeling by a single subspace. It arises, for example, in object tracking and structure from motion. One of the simplest methods for such modeling is based on energy minimization, where the energy involves p-th powers of distances of data points from subspaces. We first analyze under certain assumptions (e.g., spherically symmetric outliers) the effectiveness of such energy minimization for recovering all subspaces simultaneously and also recovering the most significant subspace. We reveal the following phase transition in our setting: when p<=1 the underlying subspaces can be recovered by such energy minimization; whereas when p>1 the underlying subspaces are sufficiently far from the global minimizer. Nevertheless, for more general settings (i.e., outliers which are not spherically symmetric) we can point at some disadvantages of the energy minimization strategy. In order to practically solve the problem, we present a simple and fast geometric method for multiple subspaces modeling. It forms a collection of local best fit affine subspaces, where the size of the local neighborhoods is determined automatically by the Peter Jones’ beta numbers. This collection of subspaces can then be further processed in various ways. For example, greedy selection procedure according to an appropriate energy or a spectral method to generate the final model. We demonstrate the state of the art accuracy and speed of the suggested procedure on applications for several applications.

Date: February 10 - 11 a.m. (special time!)
Room: The Joseph Henry Room - Jadwin Hall
Speaker: Yuan Yao, Peking University
Title: Topological and Geometric Methods for Data Analysis
Abstract: click to view
Over the last two decades, the world has witnessed an enormous growth in data sets that are complex, high-dimensional, and massive. Traditional techniques for analyzing data have become inadequate. In this talk we will discuss how some classical mathematics created for completely different purposes (e.g. algebraic and differential topology, combinatorial differential geometry), could nonetheless provide powerful new tools for exploring these modern data sets. In particular we will focus on some novel applications of Hodge Theory, a bridge over topology and geometry, in statistical data analysis which have never been exploited in such a perspective.

Date: April 12
Room: 138 Lewis Library
Speaker: Nicolas Boumal, Université catholique de Louvain (Belgium)
Title: Low-rank matrix completion: optimization on manifolds at work
Abstract: click to view
We consider moderately large matrices (millions or billions of entries) of low rank. We address the problem of recovering such matrices when most of the entries are unknown. Matrix completion finds applications in recommender systems. In this setting, the rows of the matrix may correspond to items and the columns may correspond to users. The known entries are the ratings given by users to some items. The aim is to predict the unobserved ratings. This problem is commonly stated in a constrained optimization framework. We follow an approach that exploits the geometry of the low-rank constraint to recast the problem as an unconstrained optimization problem on the Grassmann manifold. We then apply a superlinear Riemannian trust-region method to solve it. We improve on key aspects of existing methods such as Admira, FPCA, OptSpace, SET and others. I will demonstrate the performance of our algorithm in terms of accuracy and speed on synthetic data, and discuss real data in the conclusions.

Date: April 26
Room: 314 Fine Hall, 3:30 p.m (Special Time and Place!)
Speaker: Dr. Mou-Hsiung (Harry) Chang, Mathematical Sciences Division, U.S. Army Research Office
Title: Asymptotic Behaviors of Quantum Markov Semigroups
Abstract: click to view
In this talk, we discuss some large time behaviors of a quantum Markov semigroup of bounded linear transformations acting on a von Neumann algebra on a complex Hilbert space. Large deviation principle, Lyapunov stability as well as necessary and sufficient conditions for existence of an invariant state are established using integral representation of normal state. The results are directly applicable to open quantum systems.

Date: May 17
Room: 138 Lewis Library
Speaker: Dr. Hiba Abdullah, Fourier Institute
Title: Diffusion process on a flow of Riemannian manifolds
Abstract: click to view
In this talk, we create links between the properties of diffusion of the Riemannian manifold and its geometry. We embedd a family of Riemannian manifolds whose metric is time dependent, into a Hilbert space with its diffusion properties. Namely, via the eigenfunctions of the corresponding laplacian or its heat kernel. We prove that we can construct embeddings via a finite number of eigenfunctions for all families of Riemannian manifolds (M,g(t)) such that g(t) is analytic in t. Then, we construct the fundamental solution P for the non-linear heat equation acting on (M,g(t)), such that the volume (M,g(t)) is constant, and we prove that, under certain constraints, we can embed (M,g(t)) into a Hilbert space via P.

Date: September 20
Room: The Joseph Henry Room - Jadwin Hall
Speaker: Daniel Kaslovsky, Univ. of Colorado - Boulder
Title: Optimal Tangent Plane Recovery from Noisy Manifold Samples
Abstract: click to view
Efficient data processing algorithms may be realized by exploiting the low-dimensional manifold structure often inherent in a data set. We seek efficient parameterizations of such data sets via projection into appropriate manifold tangent planes. Parameterizing a data set thus becomes a problem of estimating local tangent planes from noisy manifold samples. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data, PCA must be applied locally, at a scale small enough such that the manifold is approximately linear, but at a scale large enough such that structure may be discerned from noise. We present our approach that uses the geometry of the data to guide our definition of locality, discovering the optimal balance of this noise-curvature trade-off. Using perturbation theory of eigenspaces, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals the optimal scale for local tangent plane recovery. Applications are discussed, with a focus on image processing.