Neural Networks and Deep Learning Systems as Parametric Spans

Overview

I completed my Master’s thesis at Aalto University. In it, I develop a category-theoretic framework for representing neural networks and deep learning systems. The idea is to model such systems as parametric spans, a categorical construction that naturally captures the compositional structure of layered networks while keeping track of the parameter space of each layer or group of layers.

Read the Thesis

The full thesis is available through the Aalto University document repository:

View on Aalto University →

Key Ideas

Parametric spans as a general mathematical object to represent parts of a neural network
How composition of parametric spans corresponds to layer composition in deep networks
The role of the parameter space and how gradient-based learning fits into the framework
Representing the symmetries of a neural network as a natural transformation of its deep learning system
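
As a rough illustration of the second idea, the Python sketch below composes two parametric functions so that the composite carries a pair of parameters, one per layer. This is a deliberate simplification: it uses plain parametric maps rather than the parametric spans defined in the thesis, and the names ParametricMap, then, affine and scaled_relu are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class ParametricMap:
    """A map X -> Y that also depends on a parameter from some space P.

    Simplifying assumption: a plain function (params, x) -> y stands in
    for a parametric span; this is not the thesis's construction.
    """
    apply: Callable[[Any, Any], Any]

    def then(self, other: "ParametricMap") -> "ParametricMap":
        """Compose self with other. The composite's parameter is the pair (p, q)."""
        def composed(params: Tuple[Any, Any], x: Any) -> Any:
            p, q = params
            return other.apply(q, self.apply(p, x))
        return ParametricMap(composed)


# Two hypothetical scalar "layers": an affine map and a scaled ReLU.
affine = ParametricMap(lambda p, x: p[0] * x + p[1])
scaled_relu = ParametricMap(lambda q, y: q * max(y, 0.0))

# The two-layer network has parameter space (weight, bias) x scale.
network = affine.then(scaled_relu)
output = network.apply(((2.0, 1.0), 3.0), 4.0)
print(output)
```

The point of the sketch is only that composing layers also combines their parameter spaces, which is the bookkeeping the framework is meant to handle systematically.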

Background

The main motivation for my thesis was the 2022 paper Neural network layers as parametric spans by Mattia G. Bergomi and Pietro Vertechi. I expanded on this idea by modeling whole neural networks, and eventually the deep learning systems around them, as compositions of parametric spans.

I became interested in the abstract properties of neural networks after reading the 2021 paper Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges by M. Bronstein et al. I wanted to develop a universal representation that would also contain information about a given neural network’s relation to the structure of its input data.