Evaluating and interpreting caption prediction for histopathology images

Published in Machine Learning for Healthcare Conference (MLHC), 2020

Recommended citation: Zhang, Renyu, et al. "Evaluating and interpreting caption prediction for histopathology images." Machine Learning for Healthcare Conference. PMLR, 2020. https://proceedings.mlr.press/v126/zhang20b.html

The automatic generation of captions from medical images can provide an efficient way to annotate histopathology images with natural language descriptions. Such large-scale annotation of medical images may help facilitate image retrieval tasks and standardize clinical ontologies. In this work, we focus on developing and methodically evaluating a new caption generation framework for histopathology whole-slide images. We introduce PathCap, a deep learning framework that predicts captions from multi-scale views of whole-slide images. We demonstrate that our framework outperforms a standard baseline caption model on a diverse set of human tissues and provides interpretable contextual cues for understanding predicted captions. Finally, we draw attention to a novel dataset of captioned histopathology images from the Genotype-Tissue Expression (GTEx) project, which offers the machine learning and healthcare community a valuable benchmark for future caption prediction and interpretation methods.