Exploring the Use of Contrastive Learning to Perform Audio Classification for TB Screening

Researchers at Stellenbosch University are developing an accessible tuberculosis (TB) screening method for developing countries that analyses cough recordings with machine learning models such as logistic regression or a convolutional neural network. Such a simple test could be conducted at primary healthcare clinics without specialised staff or laboratories, potentially reducing the number of undetected TB cases.

Under the supervision of Professor Thomas Niesler, Minette Farrell aims to expand on the current research by investigating whether contrastive learning can improve the accuracy and robustness of an existing model. This will be achieved by implementing a Siamese network with a contrastive loss function to generate features that can serve as input data for the existing model.
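To make the idea concrete, the following is a minimal sketch of a pairwise contrastive loss of the kind commonly used to train Siamese networks. It is not taken from the thesis: the function name, the margin value, and the assumption that each cough recording has already been mapped to a fixed-length embedding vector are all illustrative.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_label, margin=1.0):
    """Pairwise contrastive loss for a batch of embedding pairs.

    emb_a, emb_b : arrays of shape (batch, dim), the two branches' outputs
                   (hypothetical embeddings of two cough recordings).
    same_label   : array of shape (batch,), 1 if both recordings share a
                   class (e.g. both TB-positive), 0 otherwise.
    margin       : assumed hyperparameter; dissimilar pairs further apart
                   than this contribute no loss.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same_label, dtype=float)

    # Euclidean distance between the two embeddings of each pair
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Similar pairs are pulled together (penalise any distance)
    pull = same * dist ** 2
    # Dissimilar pairs are pushed apart until they exceed the margin
    push = (1.0 - same) * np.maximum(margin - dist, 0.0) ** 2

    return float(np.mean(pull + push))
```

Once such a network is trained, the embedding produced by one branch can be used as the feature vector passed to a downstream classifier, which is the role the article describes for the existing model.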

Figure 1: The flow diagram shows the testing setup.

Background and Purpose of this Research

According to the World Health Organization, about 10 million people fall ill with TB and about 1.5 million die from it each year. The disease is caused by the bacterium Mycobacterium tuberculosis and spreads when people with pulmonary TB cough, sneeze, or spit. Although TB is curable, many patients, particularly in developing nations, go untreated because of limited medical facilities and healthcare staff, a scarcity of treatments, and undiagnosed cases.

Two established tests for TB infection are the Mantoux tuberculin skin test (TST) and the interferon-gamma release assay (IGRA). Before the COVID-19 pandemic, 2.9 million of the 10 million annual TB cases went undiagnosed or unreported, leading to increased transmission and deaths. Missed diagnoses stem from prioritisation issues, patient loads, communication problems, treatment refusal, staff mistreatment, and TB stigma.

This research aims to improve the performance and robustness of the existing model in the presence of external noise. Better empirical performance increases the likelihood that the medical profession will accept the model as an aid in diagnosing TB patients, while greater robustness helps the model perform consistently across varying recording conditions. Farrell explains this as follows:

“This project used different aspects of my degree to address the issue at hand, including using Statistical and Machine Learning techniques and Computer Science skills to build a Neural Network based model. The model receives input in the form of audio signals, uses a Siamese Network to contrast different signals, and generates features. The results show a best-case improvement of 20% and an overall improvement of 10%, indicating that the addition of Contrastive Learning leads to improved performance.”

Research Results

This research aimed to establish whether contrastive learning can be applied to TB classification. The test results indicate that contrastive learning performs well under certain circumstances for this dataset, but that it does not generalise well. Overall, the models trained using the NT-Xent (normalised temperature-scaled cross-entropy) loss performed best. The results also show that a Siamese network can be used to generate features for logistic regression. Taken together, the project shows that there is potential in applying contrastive learning to automatic TB classification.
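For readers unfamiliar with the NT-Xent loss, the sketch below shows one common formulation in NumPy. It is illustrative only and is not the thesis implementation: the function name, the temperature value, and the batch layout (row k of `z_i` and row k of `z_j` are embeddings of a matching pair) are assumptions.

```python
import numpy as np

def nt_xent_loss(z_i, z_j, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    z_i, z_j    : arrays of shape (n, dim); row k of z_i and row k of z_j
                  are assumed to form a positive pair.
    temperature : assumed hyperparameter that scales the similarities.
    """
    z_i = np.asarray(z_i, dtype=float)
    z_j = np.asarray(z_j, dtype=float)
    n = z_i.shape[0]

    # Stack both views and L2-normalise so dot products are cosine similarities
    z = np.concatenate([z_i, z_j], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)

    # Temperature-scaled similarity matrix, with self-similarity excluded
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)

    # Index of the positive partner for each of the 2n rows
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-sum-exp over all other samples in the batch
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    # Cross-entropy of picking the positive partner among all candidates
    log_prob = sim[np.arange(2 * n), pos] - log_denom
    return float(-log_prob.mean())
```

The loss is low when each embedding is more similar to its partner than to every other embedding in the batch, and high when partners are no closer than unrelated samples.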

Download and read the full thesis, including the experimental results.