Overview
I completed my Master’s thesis at Aalto University. In it, I develop a category-theoretic framework for representing neural networks and deep learning systems. The idea is to model such systems as parametric spans, a categorical construction that naturally captures the compositional structure of layered networks while keeping track of the parameter space of each layer or group of layers.
Read the Thesis
The full thesis is available through the Aalto University document repository:
Key Ideas
- Parametric spans as a general mathematical object to represent parts of a neural network
- How composition of parametric spans corresponds to layer composition in deep networks
- The role of the parameter space and how gradient-based learning fits into the framework
- Representing the symmetries of a neural network as a natural transformation of its deep learning system
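
To make the second and third points concrete, here is a minimal Python sketch. It is a deliberate simplification and not the construction used in the thesis: instead of full parametric spans it uses plain parametric functions P × X → Y, and the names `ParametricMap`, `then`, `affine`, and `scaled_relu` are hypothetical, chosen only for illustration. It shows one thing: when two layers are composed, the parameter space of the composite is the product of the two parameter spaces.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class ParametricMap:
    """A function f: P x X -> Y (hypothetical stand-in for a parametric span)."""

    apply: Callable[[Any, Any], Any]

    def then(self, other: "ParametricMap") -> "ParametricMap":
        """Compose self: P x X -> Y with other: Q x Y -> Z into (P x Q) x X -> Z."""

        def composed(params: Tuple[Any, Any], x: Any) -> Any:
            p, q = params  # parameters are paired, so each layer keeps its own
            return other.apply(q, self.apply(p, x))

        return ParametricMap(composed)


# Two toy scalar "layers" (assumed examples, not taken from the thesis).
affine = ParametricMap(lambda p, x: p[0] * x + p[1])       # p = (weight, bias)
scaled_relu = ParametricMap(lambda q, x: q * max(x, 0.0))  # q = scale

# The composite network is parameterised by ((weight, bias), scale).
network = affine.then(scaled_relu)
output = network.apply(((2.0, -1.0), 3.0), 1.5)
```

Under these assumptions, a gradient-based learning step would act on the paired parameter `((weight, bias), scale)` as a whole, which is why tracking the parameter space through composition matters.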
Background
The fundamental motivator for my thesis was the 2022 paper "Neural network layers as parametric spans" by Mattia G. Bergomi and Pietro Vertechi. I expanded upon this idea by modelling neural networks, and eventually the deep learning systems around them, as compositions of parametric spans.
I became interested in the abstract properties of neural networks after reading the 2021 paper "Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges" by M. Bronstein et al. I wanted to develop a universal representation that would also contain information about how a given neural network relates to the structure of its input data.