| Date: February 4 |
| Speaker: John Storey, Lewis-Sigler Institute & Molecular Biology, Princeton University |
| Title: New Methods for Quantitative Genomics |
| Abstract: It is now possible to simultaneously measure thousands of genomic features from a given biological sample, most notably variations in the DNA at hundreds of thousands of locations and RNA transcriptional levels for every known gene. One of the main goals in utilizing this information is to understand the genetic and molecular basis of variation in higher-order traits (such as disease status or a quantitative trait) at the genome-wide scale. I will describe our recent efforts in developing a quantitative framework for tackling this problem, which involves new concepts and methods for experimental design, statistical significance, and causal network modeling. Applications to recent experiments will be discussed. |
return to top
|
| Date: February 11 |
| Speaker: Moses Charikar, Computer Science, Princeton University |
| Title: New Insights into Semidefinite Programming for Combinatorial Optimization |
| Abstract: Beginning with the seminal work of Goemans and Williamson on Max-Cut, semidefinite programming (SDP) has firmly established itself as an important ingredient in the toolkit for designing approximation algorithms for NP-hard problems. Algorithms designed using this approach produce configurations of vectors in high dimensions which are then converted into actual solutions. |
| In recent years, we have made several strides in understanding the power as well as the limitations of of such SDP approaches. New insights into the geometry of these vector configurations have led to breakthroughs for several basic optimization problems. At the same time, a sequence of recent results seems to suggest the tantalizing possibility that, for several optimization problems including Max-Cut, SDP approaches may indeed be the best possible. In this talk, I will present a glimpse of some of this recent excitement around SDP-based methods and explain some of the new insights we have developed about the strengths and weaknesses of this sophisticated tool. |
return to top
|
| Date: February 18 |
| Speaker: David G. Stork, Ricoh Innovations and Stanford University |
| Title: Did the great masters 'cheat' using optics? Computer vision and graphics addresses a bold theory in art history |
| Abstract: In 2001, artist David Hockney and scientist Charles Falco stunned the art world with a controversial theory that, if correct, would profoundly alter our view of the development of image making. They claimed that as early as 1420, Renaissance artists employed optical devices such as concave mirrors to project images onto their canvases, which they then traced or painted over. In this way, the theory attempts to explain the newfound heightened naturalism or "opticality" of painters such as Jan van Eyck, Robert Campin, Hans Holbein the Younger, and many others. |
 |
| This talk will describe the application of rigorous computer image analysis to masterpieces adduced as evidence for this theory. It covers basic geometrical optics of image projection, the analysis of perspective, curved surface reflections, shadows, lighting and color. While there remain some loose ends, such analysis of the paintings, infra-red reflectograms, modern reenactments, internal consistency of the theory, and alternate explanations allows us to judge with high confidence the plausibility of this bold theory. You may never see Renaissance paintings the same way again (http://www.diatrope.com/stork/FAQs.html). |
| |
| |
|
return to top
|
| Date: March 3 |
| Speaker: Brendan Frey, Electrical & Computer Engineering, University of Toronto |
| Title: Closing the optimality gap using affinity propagation |
| Abstract: An important problem in science and engineering is how to find and associate constituent patterns or motifs in large amounts of high-dimensional data. Examples include the identification and modeling of object parts in images, and the detection and association of RNA motifs that regulate tissue-dependent gene splicing in mammals. One approach is to identify a subset of representative data exemplars that are used to summarize and model the data. This is an NP-hard problem that is traditionally solved approximately by randomly choosing an initial subset of data points and then iteratively refining it. I'll describe a method called 'affinity propagation', which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. Affinity propagation is a general-purpose method and has been applied in a variety of areas, including digital communications, genomics, transcriptomics and document analysis. I'll outline open problems and possible future directions of research. |
return to top
|
| Date: March 10 |
| Speaker: Peter Winkler, Mathematics, Dartmouth College |
| Title: Branched Polymers |
| Abstract: A branched polymer is a finite, connected set of non-overlapping unit balls in space. The powerful "dimension reduction" theorem of Brydges and Imbrie permits computation of the volume of the space of branched polymers of size N in dimensions 2 or 3. We will show how these and some related computations can be done using elementary calculus and combinatorics. |
| New results include methods for random generation, asymptotic diameter in 3-space, and a combinatorial proof of the notorious "random flight" problem of Rayleigh and Spitzer. Joint work with Rick Kenyon (Brown). |
return to top
|
| Date: March 24 |
| Speaker: Blaise Aguera y Arcas, Microsoft Live Labs |
| Title: A worldwide web of images |
| Abstract: In this talk we'll explore the emerging potential of computer vision to transform the way we think about the interconnectedness of digital imagery and the Web, and how these relate to our physical environment. We'll begin with an introduction to the foundations of "3D computer vision", a bag of tricks which has been developing steadily for three decades, combining classical photogrammetry with machine vision. We'll then dive specifically into Photosynth, based on a combination of the Photo Tourism project (a collaboration between Microsoft Research and the University of Washington) and Seadragon, a multiresolution networked platform allowing one to play with arbitrarily many arbitrary large visual objects using only constant-time and constant-bandwidth operations. The aim of Photosynth is to allow meaningful 3D navigation within real-world environments reconstructed entirely from the photos. Interesting social dimensions are added to this application when one considers that the source photos can be mined from the existing Web, aggregated from user communities, and actively contributed to and interconnected. We'll end with some preliminary findings about the latent graph structure of Internet photography, and a glimpse of where we're heading next. |
return to top
|
| Date: March 31 |
| Speaker: Joyce McLaughlin, Mathematical Sciences, Rensselaer Polytechnic Institute |
| Title: Mathematical and Computational Challenges in Shear Stiffness Imaging of
Tissue: Can cancerous and benign lesions be distinguished? |
| Abstract: For centuries doctors have palpated tissue to detect abnormalities. We target imaging the stiffness the doctor feels in the palpation exam, including imaging deeper than what can be felt in this exam and distinguishing between benign and cancerous lesions. Current applications include breast and prostate cancer. Current experimentalists with whom we collaborate are: Dr. Richard Ehman, Mayo Clinic; Mathias Fink, ESPCI, Paris; and Dr. Kevin Parker at the University of Rochester. We describe the challenges and opportunities for imaging, including mathematical modeling and algorithmic development, with the data from the individual experiments. |
return to top
|
| Date: CANCELLED - April 7 |
| Speaker: Iven Mareels, Electrical & Electronic Engineering, The University of Melbourne |
| Title: Water Information Networks & Efficiency in Irrigation Systems |
| Abstract: The world's sustainable water supply is heavily used (it is estimated that annually 65% of the available water resources are extracted), and with very poor efficiency (typically less than half the water taken from the environment serves the objective for which it was intended). The UNESCO World Water reports 2003/2005 identify management as one of the main issues to be addressed in order to avoid a water catastrophe. Australia
is in a particularly critical situation, where management has to deal with significant climate change effects. |
| In this lecture we outline a sensor networks and systems engineering approach to underpin the management of an entire water catchment basin. The technology exists to construct a sensor network to monitor at a global scale the water resource and manage in closed loop the resource through the distribution infrastructure using the data derived from the sensor network. The control objective is to deliver water on demand with maximal overall efficiency. Such technology would provide the necessary data to implement a sustainable water policy in an adaptive way, where economic, environmental and social issues are properly taken into account. |
| We review the results from a number of substantial pilot projects in Victoria and New South Whales Australia in which, where Rubicon Systems Australia Pty. Ltd., who commercialise the technology, have realized significant gains in water efficiency in irrigation distribution. We show how this experience may lead to substantial gains in water management overall, and leads to better on farm practices, building further water savings. At present the state government of Victoria is backing this technoly with a 1 billion dollar investment to create significant water savings across the state. We discuss aspects of modeling of water dynamics followed by the control aspects enabled in the present and envisaged hardware upgrades. |
return to top
|
| Date: April 14 |
| Speaker: Eitan Bachmat, Computer Science, Ben-Gurion University and Brandeis University |
| Title: Airplane boarding and space-time geometry |
| Abstract: It is hard to think of a process that is more boring than boarding an airplane. In the hope of relieving, or at least shortening, some of the pain, airlines have devised various boarding strategies such as back-to-front, window to aisle, boarding by zones or even unassigned seating. In the talk we will try to overturn the negative image that airplane boarding has and will try to portray it as a very exciting process which is modeled via space-time (a.k.a Lorentzian) geometry with a touch of random matrix theory. Using the model we will try to figure out what are the better strategies. If time permits, we will use insights from the airplane borading process to suggest an interpretation for Einstein's law of motion in which god plays the ultimate dice game. The talk is entirely self contained. Partly based on joint works with D. Berend, L. Sapir, S. Skiena, M. Elkin and V. Khachaturov. |
return to top
|
| Date: April 28 |
| Speaker: Rob Nowak, Electrical and Computer Engineering, University of Wisconsin-Madison |
| Title: Active and Semi-Supervised Learning Theory |
| Abstract: Science is arguably the pinnacle of human intellectual achievement, yet the scientific discovery process itself remains an art. Human intuition and experience is still the driving force of the high-level discovery process: we determine which hypotheses and theories to entertain, which experiments to conduct, how data should be interpreted, when hypotheses should be abandoned, and so on. Meanwhile machines are limited to low-level tasks such as gathering and processing data. A grand challenge for scientific discovery in the 21st century is to devise machines that directly participate in the high-level discovery process. Towards this grand challenge, we must formally characterize the limits of machine learning. Statistical learning theory is usually based on supervised training, wherein a learning algorithm is presented with a finite set of i.i.d. labeled training examples. However, modern experimental methods often generate incredibly large numbers of unlabeled data for very little expense, while the task of labeling data is often painstaking and costly. Machine learning methods must leverage the abundance of unlabeled data in scientific problem domains. Active learning (AL) and semi-supervised learing (SSL) are two well known approaches to exploit unlabeled data. In both paradigms one has access to a large pool of unlabeled examples, and only a few labeled examples are provided or selected. AL is a sequential feedback process. Unlabeled examples that are predicted to have very informative labels, based on previously gathered labeled and unlabeled data, are selected for labeling. In SSL, labeled examples are randomly provided, without regard to potential informativeness. Today, little is known about theoretical limits of AL and SSL performance. Sparsity and complexity of the underlying data-generating distributions appear to play a central role in the performance of AL and SSL, and this talk will discuss some of the known theoretical results. |
| This work is joint with Rui Castro, Aarti Singh and Jerry Zhu. |
return to top
|