Some Mathematical Aspects of Deep Learning and Stochastic Gradient Descent

**Abstract:** This talk concerns several mathematical aspects of deep learning and stochastic gradient descent. The first is why deep neural networks trained with stochastic gradient descent often generalize well. We draw a connection between generalization and the stochastic stability of the stochastic gradient descent dynamics. The second is understanding the training process of stochastic gradient descent. Here, we use simple mathematical examples to explain several key empirical observations, including the edge of stability, the exploration of flat minima, and learning rate decay. Based on joint work with Chao Ma.

**Bio:** Lexing Ying is a professor of mathematics at Stanford University. He received a B.S. from Shanghai Jiaotong University in 1998 and a Ph.D. from New York University in 2004. Before joining Stanford in 2012, he was a postdoc at Caltech and a professor at UT Austin. He received a Sloan Fellowship in 2007, an NSF CAREER Award in 2009, the Fengkang Prize in 2011, and the James H. Wilkinson Prize in 2013. He is an invited speaker at ICM 2022.