Abstract: We present a study of electronic medical record (EMR) retrieval that emulates situations in which a doctor treats a new patient. Given a query consisting of a new patient's symptoms, the retrieval system returns the set of most relevant records of previously treated patients. However, due to semantic, functional, and treatment synonyms in medical terminology, queries are often incomplete and thus require enhancement. In this paper, we present a topic model that frames symptoms and treatments as separate languages. Our experimental results show that this method improves retrieval performance over several baselines with statistical significance. These baselines include methods used in prior studies as well as state-of-the-art embedding techniques. Finally, we show that our proposed topic model discovers all three types of synonyms to improve medical record retrieval.

Learning Objective 1: After participating in the session, the learner should be able to identify novel query expansion methods that enhance the performance of electronic medical record retrieval, in particular polylingual topic models that model the symptoms and treatments of a medical record separately.


Edward Huang (Presenter)
University of Illinois at Urbana-Champaign

Sheng Wang, University of Illinois at Urbana-Champaign
Doris Lee, University of Illinois at Urbana-Champaign
Runshun Zhang, Guang’anmen Hospital
Baoyan Liu, National Data Center of Traditional Chinese Medicine
Xuezhong Zhou, Beijing Jiaotong University
ChengXiang Zhai, University of Illinois at Urbana-Champaign

Presentation Materials: