Title: Distributional properties of Bayesian neural networks
Abstract: After a short introduction to Bayesian deep learning (that is, Bayesian neural networks), I will present a recent work on Bayesian neural networks with Gaussian weight priors and a class of ReLU-like nonlinearities. Such neural networks are well known to induce an L2, or weight-decay, regularization. Our results characterize a more intricate regularization effect at the level of the unit activations. Our main result establishes that the induced prior distribution on the units, both before and after activation, becomes increasingly heavy-tailed with the depth of the layer. We show that first-layer units are Gaussian, second-layer units are sub-exponential, and units in deeper layers follow so-called sub-Weibull distributions. This provides new theoretical insight into Bayesian neural networks, which we corroborate with simulation results.
Joint work with Mariia Vladimirova (Inria Grenoble - Rhône-Alpes), Stéphane Girard (Inria Grenoble - Rhône-Alpes) and Jakob Verbeek (Facebook AI)
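A minimal simulation sketch of the kind of effect described in the abstract, assuming fully connected ReLU layers with i.i.d. Gaussian weight priors; the width, depth, input vector, weight scaling, and the excess-kurtosis diagnostic are all illustrative choices and not the setup used in the paper.

```python
# Illustrative sketch (assumed setup): draw random ReLU networks with i.i.d.
# Gaussian weight priors and inspect how the prior distribution of a hidden
# unit changes with depth.
import numpy as np

rng = np.random.default_rng(0)

n_samples = 20000   # number of prior draws (arbitrary choice)
width = 100         # hidden-layer width (arbitrary choice)
depth = 4           # number of hidden layers (arbitrary choice)
x = rng.standard_normal(width)  # a fixed, arbitrary input vector

# Collect the pre-activation value of the first unit in each layer,
# over independent prior draws of all weights (here scaled as N(0, 1/width)).
pre_acts = np.zeros((n_samples, depth))
for s in range(n_samples):
    h = x
    for layer in range(depth):
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        z = W @ h                      # pre-activation of the current layer
        pre_acts[s, layer] = z[0]      # track one unit per layer
        h = np.maximum(z, 0.0)         # ReLU nonlinearity

# A crude heavy-tailedness diagnostic: excess kurtosis per layer.
# A Gaussian unit gives roughly 0; heavier-tailed units give larger values.
for layer in range(depth):
    u = pre_acts[:, layer]
    kurt = np.mean((u - u.mean()) ** 4) / np.var(u) ** 2 - 3.0
    print(f"layer {layer + 1}: excess kurtosis ~ {kurt:.2f}")
```

In this sketch the first-layer unit is exactly Gaussian (a linear combination of Gaussian weights for a fixed input), while the deeper units combine products of weights, which is what drives the progressively heavier tails reported in the abstract.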