Dr Arnu Pretorius of Stellenbosch University has received the prestigious Thamsanqa W. Kambule Award for the best doctoral dissertation in the computational and statistical sciences at an African university.
Dr Pretorius completed his dissertation, On noise regularised neural networks: initialisation, learning and inference, under the supervision of Professor Steve Kroon from Stellenbosch University’s Computer Science Division and Professor Herman Kamper from the Department of Electrical and Electronic Engineering at Stellenbosch University. His research focuses on the mathematical underpinnings of how neural networks behave when “noise” is introduced into the learning process.
He is also an extraordinary senior lecturer in SU’s Applied Mathematics Division, where he lectures in the MSc programme in Machine Learning and Artificial Intelligence.
Announcing the Award
The award was announced during the Deep Learning Indaba held in Accra, Ghana, from 3 to 9 September 2023.
The award, held in honour of one of South Africa’s foremost mathematicians and teachers, Dr Thamsanqa W. Kambule, recognises excellence in research and writing by doctoral candidates at African universities in any area of computational and statistical sciences. The Deep Learning Indaba was established in 2017 to strengthen machine learning and artificial intelligence in Africa.
“At the 2017 Deep Learning Indaba, I was blown away by the quality of the work presented at the poster session, and the fact that some researchers had work presented at top AI venues such as the International Conference on Machine Learning (ICML). At the time, I didn’t think this was possible.”
Dr Arnu Pretorius
About Dr Pretorius’s Research
Dr Pretorius’s dissertation investigates regularisation techniques in deep learning, specifically methods that introduce noise into neural networks. The frequently used dropout technique serves as a running example (a minimal sketch follows the list below). The study revolves around three core areas of modelling:
- learning;
- initialisation; and
- inference.
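Dropout multiplies each hidden activation by an independent Bernoulli mask during training, zeroing a random subset of units at every step. The NumPy sketch below shows the standard “inverted dropout” formulation; it is a minimal illustration, not code from the dissertation.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob,
    then rescale by 1/keep_prob so the expected activation is unchanged."""
    mask = rng.binomial(1, keep_prob, size=activations.shape)
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 8))               # a batch of hidden activations
h_noisy = dropout(h, keep_prob=0.8, rng=rng)  # the noisy copy used in training
```

At test time the mask is simply dropped; the 1/keep_prob rescaling during training keeps the two regimes consistent in expectation.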
The research first analyses the learning dynamics of denoising autoencoders (DAEs), a type of shallow noise regularised neural network, to understand how noise impacts the learning process.
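A DAE is trained to reconstruct a clean input from a deliberately corrupted copy of it. The shallow linear case below is a hypothetical minimal setup in the spirit of the networks analysed (all sizes and hyperparameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 20))          # "clean" training data
W1 = 0.1 * rng.standard_normal((20, 8))     # encoder weights
W2 = 0.1 * rng.standard_normal((8, 20))     # decoder weights
lr, noise_std = 0.05, 0.5

for step in range(500):
    X_tilde = X + noise_std * rng.standard_normal(X.shape)  # corrupt the input
    H = X_tilde @ W1                        # encode the noisy input
    R = H @ W2 - X                          # error against the CLEAN target
    if step % 100 == 0:
        print(f"step {step}: reconstruction loss {np.mean(R ** 2):.3f}")
    grad_W2 = 2 * H.T @ R / len(X)          # backprop through the decoder ...
    grad_W1 = 2 * X_tilde.T @ (R @ W2.T) / len(X)  # ... and the encoder
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
```

Because the target is the clean input, the injected noise acts as a regulariser rather than something to be memorised, which is what makes the learning dynamics of DAEs an informative test bed.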
It was observed that learning behaviour depends on initialisation, prompting a deeper look at how noise interacts with the initialisation of a deep neural network, specifically its signal propagation dynamics during the forward and backward passes.
The study then explored how noise influences inference in a Bayesian context, focusing primarily on fully connected feedforward neural networks with rectified linear unit (ReLU) activation functions.
The learning dynamics of DAEs were analysed by deriving closed-form solutions to the differential equations that govern learning. For initialisation, mean field theory was used to approximate the distribution of the pre-activations of individual neurons, leading to new initialisation schemes for noise regularised neural networks that ensure stable signal propagation.
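For ReLU networks with dropout, the resulting criticality condition amounts to shrinking the standard He weight variance by the keep probability (using σ_w² = 2p in place of 2), as reported in related work by the same authors; the sketch below checks this variance map empirically and should be read as an illustrative reconstruction, not the dissertation’s own experiment.

```python
import numpy as np

def preact_variances(depth, width, keep_prob, sigma_w2, rng, n=200):
    """Push a random batch through a deep ReLU net with dropout and
    record the pre-activation variance at every layer."""
    x = rng.standard_normal((n, width))
    variances = []
    for _ in range(depth):
        mask = rng.binomial(1, keep_prob, size=x.shape) / keep_prob  # dropout noise
        W = np.sqrt(sigma_w2 / width) * rng.standard_normal((width, width))
        h = (x * mask) @ W                  # noisy pre-activations
        variances.append(h.var())
        x = np.maximum(h, 0.0)              # ReLU
    return variances

rng = np.random.default_rng(0)
p = 0.8
he = preact_variances(50, 500, p, sigma_w2=2.0, rng=rng)       # standard He init
adj = preact_variances(50, 500, p, sigma_w2=2.0 * p, rng=rng)  # noise-adjusted
print(f"layer-50 variance, He init:       {he[-1]:.1f}")   # grows with depth
print(f"layer-50 variance, adjusted init: {adj[-1]:.1f}")  # stays bounded
```

Under standard He initialisation the dropout noise inflates the signal variance by a constant factor per layer, so deep networks saturate or explode; rescaling the weight variance cancels that factor and keeps signals propagating stably.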
Testing the Effects of Initialisation on Training Speed and Generalisation
A large-scale controlled experiment was conducted to scrutinise the impact of initialisation on training speed and generalisation. A comprehensive picture of how noise can influence inference was then built by drawing parallels between randomly initialised deep noise regularised neural networks and Gaussian processes (GPs). This comparison revealed new connections between a specific initialisation of such a network and the behaviour of its corresponding GP.
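The correspondence says that, over random draws of the parameters (and, for noisy networks, the noise masks), the outputs of a wide network at any fixed set of inputs become jointly Gaussian, so the untrained network behaves like a GP. The Monte Carlo check below illustrates this for a single wide noisy ReLU layer; the dissertation derives the corresponding kernels analytically, so this is only an empirical sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, width, n_draws, p = 10, 1000, 2000, 0.8
x = rng.standard_normal(d_in)                 # one fixed input point

outputs = np.empty(n_draws)
for i in range(n_draws):
    # Fresh random weights and a fresh dropout mask for every draw.
    W1 = np.sqrt(2.0 * p / d_in) * rng.standard_normal((d_in, width))
    w2 = np.sqrt(1.0 / width) * rng.standard_normal(width)
    mask = rng.binomial(1, p, size=width) / p
    outputs[i] = (np.maximum(x @ W1, 0.0) * mask) @ w2

# A Gaussian has zero skew and zero excess kurtosis.
z = (outputs - outputs.mean()) / outputs.std()
print(f"skew: {np.mean(z ** 3):+.3f}, excess kurtosis: {np.mean(z ** 4) - 3:+.3f}")
```

As the width grows, both statistics shrink towards zero, which is the GP limit the analysis exploits.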
Building on these findings, the study concluded that noise regularisation can effectively help a model concentrate on the more salient statistical regularities in the training data distribution, thereby facilitating generalisation. However, it highlighted a potential pitfall: noise regularisation can destabilise deep networks that are not appropriately initialised.
Relatedly, it was observed that noise limits the depth to which networks can be trained successfully, and amplifies the constraints and uncertainty in noisy neural network GPs. The study did not stop at identifying problems; it also proposed solutions, introducing a novel technique that employs self-stabilising priors to bolster the robustness and performance of training deep Bayesian neural networks.
Download Dr Pretorius’s full dissertation at https://scholar.sun.ac.za/handle/10019.1/107035?show=full and read more about his award here: http://www.sun.ac.za/english/Lists/news/DispForm.aspx?ID=10191.