Statistics and Computational Methods Seminar Series - Fall 2025
Speaker: Nick Whiteley
Title: Statistical exploration of the Manifold Hypothesis
Joint work with Annie Gray (Alan Turing Institute) and Patrick Rubin-Delanchy (University of Edinburgh)
Abstract:
The Manifold Hypothesis is a widely accepted tenet of Machine Learning which asserts that nominally high-dimensional data are in fact concentrated near a low-dimensional manifold embedded in high-dimensional space. This phenomenon is observed empirically in many real-world situations, has led to the development of a wide range of statistical methods in the last few decades, and has been suggested as a key factor in the success of modern AI technologies. We show that rich manifold structure in data can emerge from a generic and simple statistical model, the Latent Metric Space model, via elementary concepts such as latent variables, correlation and stationarity. We clarify the role of PCA as a method for preprocessing high-dimensional data before the application of nonlinear dimension reduction methods in exploratory data analysis.