Smart Medical Imaging

EPSRC Centre for Doctoral Training


2019_001 - Explaining the predictions of deep learning models in cardiology

1st supervisor: Andrew King
2nd supervisor: Bernhard Kainz

The use of deep learning in cardiology is an active research area, and state-of-the-art results have recently been achieved in tasks such as segmentation of the myocardium and blood pool [1] and estimation of volumes from MR [2]. However, whilst deep learning techniques have produced impressive results, a significant problem remains. Often, the techniques that produce the most accurate results lack one feature that is important for the clinical acceptance of new technology: explanatory power. Put simply, most deep learning models are able to make predictions but are not able to explain in human-interpretable terms how a prediction was arrived at. The reason is that the input to such algorithms is often very high-dimensional data, and optimal performance is achieved by using complex, often non-linear combinations of subsets of these data. This reasoning process is opaque, and clinicians are reluctant to base clinical decisions upon recommendations from such “black-box” models.

Producing explanations from deep learning tools is a significant challenge. In the machine learning literature, most attempts at “interpretable machine learning” have focused on one of two approaches:
(1) try to visualise the inside of the black box, e.g. by using “saliency maps”, which show the areas of the input image that were important in making the decision;
(2) train a simpler model, which may be more interpretable to some degree.
Both of these approaches are likely to be inadequate in many medical applications. For example, in cardiology, which is our focus in this project, an “explanation” that will be acceptable to a cardiologist is likely to require information about disease aetiology along with a discussion of concepts such as tissue properties and electrical/mechanical activation patterns.
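
To make approach (1) concrete, the following is a minimal sketch of one simple way a saliency map can be computed: occlusion sensitivity, in which each pixel is blanked out in turn and the resulting drop in the model's score is recorded. This is only one of several saliency techniques (many are gradient-based), and the names `toy_model` and `occlusion_saliency` are hypothetical, introduced purely for illustration. The "model" here is a hand-written stand-in for a trained network, not a real one.

```python
from typing import Callable, List

Image = List[List[float]]


def toy_model(image: Image) -> float:
    """Hypothetical stand-in for a trained network.

    Scores a 4x4 image by the mean intensity of its central 2x2 region.
    """
    centre = [image[r][c] for r in (1, 2) for c in (1, 2)]
    return sum(centre) / len(centre)


def occlusion_saliency(
    model: Callable[[Image], float], image: Image, baseline: float = 0.0
) -> Image:
    """Return a map of score drops caused by occluding each pixel in turn."""
    reference = model(image)
    saliency: Image = []
    for r, row in enumerate(image):
        saliency_row = []
        for c in range(len(row)):
            occluded = [list(x) for x in image]  # copy so the input is untouched
            occluded[r][c] = baseline            # blank out a single pixel
            saliency_row.append(reference - model(occluded))
        saliency.append(saliency_row)
    return saliency


image = [[1.0] * 4 for _ in range(4)]
saliency = occlusion_saliency(toy_model, image)
# Central pixels receive 0.25; all other pixels receive 0.0.
```

The resulting map highlights where the model "looked", but says nothing about disease aetiology or tissue properties, which illustrates why such low-level visualisations alone are unlikely to satisfy a cardiologist.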

The key challenge is to find ways of linking the model’s automated decision with “higher level” human-interpretable concepts. In this project, we will investigate ways of making these links in a specialised domain. One promising area that has recently emerged from the computer vision literature is the investigation of ways of querying the importance of human-interpretable concepts to deep learning models [3]. However, to date, such techniques have only been applied to simple toy problems with quite low-level concepts. It will be intriguing to discover whether such ideas can be extended to higher-level concepts that will be meaningful to cardiologists. Another interesting avenue for exploration will be ways of putting humans (i.e. clinicians) “in the loop” of the training of deep learning models [4]. This type of approach could be used to encourage the deep learning model to learn features that are clinically meaningful, effectively creating a dialogue between clinicians and deep learning models. Further insight could be gained by combining concepts from more traditional artificial intelligence, such as symbolic representations, with deep learning approaches [5]. Here, symbolic representations correspond to human-understandable concepts, and deep learning provides an effective means for learning features that can be mapped to these concepts.
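
As an illustration of what "querying the importance of a concept" can look like, the sketch below follows the general idea of concept activation vectors from reference 3: find a direction in a layer's activation space associated with a concept, then measure for what fraction of inputs the model's output increases when the activation is nudged along that direction. It makes two simplifying assumptions that should be stated loudly: the concept direction is taken as the normalised difference of mean activations (the original method fits a linear classifier instead), and the derivative is estimated by finite differences on a hand-written `toy_head` rather than by backpropagation through a real network. All function names and numbers are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def concept_direction(concept_acts: Sequence[Vector],
                      random_acts: Sequence[Vector]) -> Vector:
    """Simplified concept direction: normalised difference of mean activations.

    Assumption: this replaces the linear classifier used in the original method.
    """
    diff = [a - b for a, b in zip(mean_vector(concept_acts),
                                  mean_vector(random_acts))]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("concept and random activations have identical means")
    return [d / norm for d in diff]


def directional_derivative(head: Callable[[Vector], float], activation: Vector,
                           direction: Vector, eps: float = 1e-4) -> float:
    """Finite-difference estimate of the change in output along `direction`."""
    shifted = [a + eps * d for a, d in zip(activation, direction)]
    return (head(shifted) - head(activation)) / eps


def concept_sensitivity_score(head: Callable[[Vector], float],
                              activations: Sequence[Vector],
                              direction: Vector) -> float:
    """Fraction of inputs whose output increases along the concept direction."""
    positive = sum(
        1 for a in activations if directional_derivative(head, a, direction) > 0
    )
    return positive / len(activations)


def toy_head(activation: Vector) -> float:
    """Hypothetical remainder of the network, mapping activations to a class score."""
    return 2.0 * activation[0] - 0.5 * activation[1]


# Made-up layer activations for concept examples and random examples.
concept_acts = [[1.0, 0.0], [0.9, 0.1]]
random_acts = [[0.0, 0.0], [0.1, 0.1]]
direction = concept_direction(concept_acts, random_acts)

# Made-up activations for inputs of the class being explained.
class_acts = [[0.2, 0.3], [0.5, 0.1], [0.8, 0.9]]
score = concept_sensitivity_score(toy_head, class_acts, direction)
```

A score near 1 suggests that the class score consistently rises with the concept, and a score near 0 that it consistently falls. Whether concepts as rich as tissue properties or activation patterns can be captured as directions in this way is precisely the open question raised above.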

There are many intriguing avenues to explore in this field that are relatively untouched in the medical domain, and the potential for novelty is high. Our ultimate aim is to produce a computer-aided decision-support tool to assist cardiologists in diagnosing heart disease and planning its treatment. The tool would act like a “trusted colleague” or “second reader” that the cardiologist could consult to obtain its opinion on difficult cases, as well as the reasoning behind that opinion. This is a highly ambitious aim, and this project represents the first part of the journey, but if successful the impact could be great.

1. Litjens et al., A Survey on Deep Learning in Medical Image Analysis, CoRR, 2017.
2. Second Annual Data Science Bowl: Transforming How We Diagnose Heart Disease, URL:
3. Kim et al., Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), Proc ICML, 2018.
4. Lage et al., Human-in-the-Loop Interpretability Prior, Proc NIPS, 2018.
5. Garnelo et al., Towards Deep Symbolic Reinforcement Learning, arXiv preprint arXiv:1609.05518, 2016.