We propose a function space approach to Representation Learning [1] and to the analysis of the representation layers in deep learning architectures. We show how to compute a `weak-type' Besov smoothness index that quantifies the geometry of the clustering in the feature space. This approach has already been applied successfully to improve the performance of machine learning algorithms such as Random Forests [2] and tree-based Gradient Boosting [3]. Our experiments demonstrate that in well-known, well-performing trained networks, the Besov smoothness of the training set, measured in the feature-map representation of each hidden layer, increases from layer to layer, which we relate to the `unfolding' of the clustering in the feature space. We also contribute to the understanding of generalization [4] by showing how the Besov smoothness of the representations decreases as we add more mislabeling to the training data. We hope this approach will contribute to the demystification of some aspects of deep learning.
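As a rough illustration of the kind of quantity involved, the following sketch estimates a weak-type smoothness index from the decay rate of sorted wavelet-coefficient norms. It is a minimal, hypothetical implementation: the function name, the input (a precomputed array of coefficient norms from a tree- or layer-based wavelet decomposition), and the normalization of the exponent are assumptions for illustration and may differ from the precise definitions in the cited papers.

```python
import numpy as np

def estimate_smoothness_index(coeff_norms):
    """Estimate a weak-type decay exponent from wavelet-coefficient norms.

    Hypothetical sketch: sorts the norms in decreasing order and fits the
    slope of log|c_(k)| against log k, so that |c_(k)| ~ C * k^(-alpha).
    A faster decay (larger alpha) corresponds to higher smoothness, i.e.
    a better-clustered representation.
    """
    sorted_norms = np.sort(np.abs(np.asarray(coeff_norms)))[::-1]
    k = np.arange(1, len(sorted_norms) + 1)
    mask = sorted_norms > 0  # avoid log(0) for vanishing coefficients
    slope, _ = np.polyfit(np.log(k[mask]), np.log(sorted_norms[mask]), 1)
    return -slope  # alpha: the estimated decay exponent
```

On a synthetic sequence of norms decaying exactly like k^(-1.5), the estimator recovers an exponent of about 1.5; in practice one would compare such estimates across hidden layers of a trained network.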
[1] Y. Bengio, A. Courville and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (2013), 1798-1828.
[2] O. Elisha and S. Dekel, Wavelet decompositions of Random Forests - smoothness analysis, sparse approximation and applications, Journal of Machine Learning Research 17 (2016), 1-38.
[3] S. Dekel, O. Elisha and O. Morgan, Wavelet decomposition of Gradient Boosting, preprint.
[4] C. Zhang, S. Bengio, M. Hardt, B. Recht and O. Vinyals, Understanding deep learning requires rethinking generalization, ICLR 2017 conference proceedings.
Shai serves as Head of AI at WIX and is a visiting associate professor at the School of Mathematical Sciences at Tel-Aviv University. For further information see: https://www.shaidekel.com/