Strategies to use prior knowledge to improve the performance of Deep Learning: an approach towards trustable Machine Learning systems

PhD thesis on developing strategies for incorporating priors and domain knowledge into the design of Deep Neural Networks.
Author

Jay Paul Morgan

Published

January 2, 2022

Google Scholar · BibTeX

Abstract

Machine Learning (ML) has been a transformative technology in society, automating otherwise difficult tasks such as image recognition and natural language understanding. The performance of Deep Learning (DL), in particular, has improved to the point where it can be applied to automotive vehicles, a situation in which trust is placed in ML systems to operate correctly and safely. Yet, while fundamental ML algorithms can be formally verified for safety without much trouble, the same may not be said for DL. A key problem preventing the trustworthiness of DL is the existence of adversarial examples, where small changes in input result in catastrophic misclassifications, thereby undermining the use of DL models in safety-critical systems.

Using pre-existing knowledge from domain experts has been shown to increase not only the performance but, critically, the resilience of DL models to adversarial examples. This thesis develops four strategies for integrating prior expert knowledge into DL models: feature specialisation, specialised information processing, stimulation of attention mechanisms, and augmentation of training data. Prior knowledge from three scientific domains (Quantum Chemistry, Corpus Linguistics, and Astrophysics) was used as case studies, providing a comprehensive framework for evaluating the strategies' performance given different types of data (i.e., text-based, image-based, and graph-based) and model architectures (e.g., recurrent, graph, and convolutional). For the Quantum Chemistry and Corpus Linguistics case studies, two novel datasets are introduced to facilitate the training of prior-knowledge-informed DL models. Each of the four proposed strategies was tested independently on the case studies to understand its isolated contribution, as well as in combination with the other strategies to evaluate their interaction.

The results show that, combined, the four prior knowledge integration strategies (a) are an effective method of increasing model performance; (b) result in fewer misclassifications caused by misleading features; (c) lead to increased model robustness to adversarial examples; (d) create informative representations, as demonstrated by visualising the learnt representations of prior knowledge; (e) lessen the number of training samples needed to achieve adequate model performance; and (f) lead to better generalisation to problem tasks other than those the model was trained for. These findings show that the prior knowledge integration strategies used here improve the performance of ML models while making them more resilient to adversarial examples, which can lead to more trustworthy ML systems in practice.
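To make the notion of an adversarial example concrete, the sketch below (not taken from the thesis) shows the standard Fast Gradient Sign Method in PyTorch: an epsilon-bounded perturbation of the input, aligned with the gradient of the loss, that can flip a classifier's prediction even though the change is imperceptible. The model, inputs, labels, and epsilon value are hypothetical placeholders.

```python
# Minimal, generic illustration of an adversarial example (FGSM),
# assuming a differentiable classifier `model`, inputs `x` in [0, 1],
# and integer class labels `y`.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Return x plus a small perturbation that increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that most increases the loss, then clip to the valid input range.
    x_adv = x + epsilon * x.grad.sign()
    return torch.clamp(x_adv, 0.0, 1.0).detach()
```

A model whose prediction changes under such a small perturbation is exactly the kind of fragility the prior-knowledge integration strategies in the thesis aim to reduce.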