The principled learning of functions from data is at the core of statistics, machine learning and artificial intelligence. In this talk I will focus on Bayesian approaches to learning in the context of inverse problems on unknown domains, and in particular on semi-supervised learning. More specifically, we consider the problem of recovering the function input of a partial differential equation formulated on an unknown domain M, on which a ground-truth Bayesian inverse problem can be defined. We assume access to a discrete domain Mn = {x1, . . . , xn} ⊂ M and to noisy measurements of the output solution at p ≤ n of those points. The discrete domain is then endowed with a graph structure, which is used to create a graph surrogate for the ground-truth Bayesian inverse problem. The first theoretical result that I will present establishes the convergence of the graph posterior distribution associated with the graph Bayesian inverse problem toward its ground-truth counterpart as the number of unlabeled data points tends to infinity. Our analysis relies on the variational description of posterior distributions, the choice of an appropriate topology to compare distributions on different domains, and precise quantitative estimates of the spectrum of graph Laplacians. I will then show that our consistency results have profound algorithmic implications: when they hold, carefully designed graph-based Markov chain Monte Carlo (MCMC) algorithms have a uniform spectral gap, independent of the number of unlabeled data points.
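The construction above can be sketched in code. The following is a minimal illustration, not the method from the talk: it builds an epsilon-neighborhood graph on n points sampled from a circle (standing in for the unknown manifold M), forms the graph Laplacian, defines a Matérn-type graph prior, and samples the graph posterior with a preconditioned Crank-Nicolson (pCN) sampler, a standard graph-based MCMC whose prior-reversible proposal underlies dimension-robust behavior. All numerical choices (n, p, eps, tau, s, sigma, beta) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete domain: n points sampled from an unknown manifold (here, a circle).
n, p = 200, 20                      # n unlabeled points, p noisy labels
theta = rng.uniform(0, 2 * np.pi, n)
X = np.column_stack([np.cos(theta), np.sin(theta)])

# Graph surrogate: epsilon-neighborhood weights and unnormalized graph Laplacian.
eps = 0.5
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-D2 / eps**2) * (D2 < eps**2)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W

# Graph prior: Gaussian N(0, C) with Matern-type covariance C = (tau^2 I + L)^(-s).
tau, s = 1.0, 2.0
evals, evecs = np.linalg.eigh(L)
C_sqrt = evecs @ np.diag((tau**2 + evals) ** (-s / 2)) @ evecs.T

def sample_prior():
    return C_sqrt @ rng.standard_normal(n)

# Noisy labels at p of the n points (illustrative truth and noise level).
truth = np.sin(2 * theta)
idx = rng.choice(n, p, replace=False)
sigma = 0.1
y = truth[idx] + sigma * rng.standard_normal(p)

def misfit(u):
    """Negative log-likelihood Phi(u), up to an additive constant."""
    r = u[idx] - y
    return 0.5 * (r @ r) / sigma**2

# pCN MCMC: prior-reversible proposal; acceptance depends only on the misfit.
def pcn(n_iter=5000, beta=0.3):
    u = sample_prior()
    phi = misfit(u)
    accepted = 0
    for _ in range(n_iter):
        v = np.sqrt(1 - beta**2) * u + beta * sample_prior()
        phi_v = misfit(v)
        if rng.uniform() < np.exp(min(0.0, phi - phi_v)):
            u, phi = v, phi_v
            accepted += 1
    return u, accepted / n_iter

u_post, acc_rate = pcn()
print(f"pCN acceptance rate: {acc_rate:.2f}")
```

The key design point mirrored here is that the pCN acceptance probability involves only the data misfit at the p labeled points, not the prior on all n points, which is why such samplers can remain efficient as the number of unlabeled data points grows.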

This talk is based on works with Zachary Kaplan, Thabo Samakhoana, and Daniel Sanz-Alonso.