Abstract
Deep learning has achieved remarkable success in solving complex problems across diverse domains. Despite its widespread use, the fundamental concept of generalization to unseen data remains poorly understood: a model that generalizes does not memorize (i.e., overfit) the training data, but instead learns the underlying features that represent a broader range of examples. Generalization performance is commonly assessed post hoc via prediction accuracy on test data. Analyzing generalization without test data, however, sheds light on the learning process and on whether the model is capturing the intended features. This commonly involves evaluating the model's complexity through an analysis of its decision boundaries (which delineate different regions of the data space) and its learned parameters (which define the mapping from input data to predictions). Current efforts seek to establish generalization bounds or simple metrics that correlate with a model's ability to generalize. This project instead aims to exploit topological data analysis, more precisely persistent homology, to characterize the intrinsic structures within decision boundaries, trained parameters, and activations that contribute to superior generalization. Understanding this relationship holds significant potential for enhancing model design, interpretability, and resource efficiency, and for providing valuable insights into the behavior and limitations of deep learning, thereby guiding future research directions.
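As a concrete illustration of the proposed approach, the sketch below computes the persistent homology of a point cloud sampled near a trained classifier's decision boundary. This is a minimal sketch only, assuming Python with scikit-learn and the ripser package and a toy two-dimensional dataset; the project itself does not prescribe this toolchain or experimental setup.

import numpy as np
from ripser import ripser
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Toy 2-D data and a small network whose decision boundary we inspect.
X, y = make_moons(n_samples=500, noise=0.1, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
clf.fit(X, y)

# Dense grid over the input domain; keep points where the predicted
# probability is near 0.5, i.e. points close to the decision boundary.
xx, yy = np.meshgrid(np.linspace(-1.5, 2.5, 300), np.linspace(-1.0, 1.5, 300))
grid = np.c_[xx.ravel(), yy.ravel()]
proba = clf.predict_proba(grid)[:, 1]
boundary = grid[np.abs(proba - 0.5) < 0.02]

# Subsample to keep the Vietoris-Rips computation cheap.
rng = np.random.default_rng(0)
boundary = boundary[rng.choice(len(boundary), size=min(400, len(boundary)),
                               replace=False)]

# Persistent homology of the boundary sample: H0 tracks connected
# components, H1 tracks loops. A few long-lived features suggest a simple,
# structured boundary; many short-lived ones suggest a fragmented one.
dgms = ripser(boundary, maxdim=1)["dgms"]
for dim, dgm in enumerate(dgms):
    lifetimes = dgm[:, 1] - dgm[:, 0]
    finite = lifetimes[np.isfinite(lifetimes)]
    longest = finite.max() if finite.size else 0.0
    print(f"H{dim}: {len(dgm)} features, longest finite lifetime = {longest:.3f}")

Summaries of the resulting diagrams (e.g., counts of long-lived features or total persistence) could then be compared against held-out test accuracy across models to probe the hypothesized link between boundary topology and generalization.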