Using Electronic Medical Records Search Engine (EMERSE) to Improve Research Involving Rare Cancers.
A. Bhattacharya¹, L. Hoff¹, C. Harter¹, E. Olsen¹, N. Mott¹, T. Hughes², C. Angeles²
¹University of Michigan, Medical School, Ann Arbor, MI, USA
²University of Michigan, Department of Surgery, Ann Arbor, MI, USA

Introduction: The digitization of medical records has allowed researchers to aggregate cohorts of patients for large-scale retrospective studies. While many data points are structured and easily retrievable from electronic health record (EHR) systems, other relevant clinical information is more difficult to retrieve reliably because it is recorded in free-text notes. For cancer patients, the free text of non-synoptic pathology reports and provider notes is a common source of information that may be missed with traditional chart review. Moreover, International Classification of Diseases (ICD) codes within the EHR may be ambiguous, making it difficult to identify the maximum number of patients with these conditions. We sought to explore whether the Electronic Medical Records Search Engine (EMERSE), a highly robust clinical text mining tool, could meaningfully augment the manual chart review process for patients with cutaneous leiomyosarcoma (cLMS), a rare sarcoma of the skin and soft tissue.

Methods: EMERSE allows users to define single-word and multi-word phrases to search over 80 million free-text clinical documents from the EHR of our large tertiary care health system. We used terms such as “leiomyosarcoma” and “cutaneous leiomyosarcoma” along with specifiers like keyword proximity (the distance between terms within a single document) as well as other identifiers (e.g. “dermal” and “subcutaneous”) to generate a list of potential cLMS patients. To assess how many true cLMS patients were identified using EMERSE, we manually reviewed the charts of the listed patients and also obtained the associated ICD-O-3 tumor and primary site classification from the institutional Cancer Registry.
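The keyword-proximity idea described above can be illustrated with a small sketch. This is not EMERSE's implementation or interface; the function name `within_proximity`, the word-based distance measure, and the sample note are all assumptions made for illustration.

```python
import re


def within_proximity(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur within max_distance words
    of each other in the same document (hypothetical helper, not EMERSE)."""
    # Lowercase and split into simple alphanumeric word tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a]
    positions_b = [i for i, w in enumerate(words) if w == term_b]
    # Compare every occurrence of term_a against every occurrence of term_b.
    return any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b)


# Invented example note, not taken from any real record.
note = "Excision of dermal lesion; pathology consistent with leiomyosarcoma."
print(within_proximity(note, "dermal", "leiomyosarcoma", 6))
print(within_proximity(note, "dermal", "leiomyosarcoma", 3))
```

In the sample note the two terms are five words apart, so the first call matches and the second does not. A tighter distance trades sensitivity for fewer incidental co-occurrences.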

Results: Using EMERSE, we identified 311 potential cLMS candidates in the EHR. Through chart review, 140 patients were deemed to have cLMS. However, only 101 of the 140 patients were identified by ICD-O-3 classification in the Cancer Registry. If only traditional ICD codes had been used to query the 311-patient cohort, nearly 28% of the positive cLMS patients found by chart review would have been missed. The remaining 173 false positives captured by EMERSE were mainly due to standardized language from genetic consult notes stating “risk for cutaneous leiomyosarcoma”.
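The "nearly 28%" figure follows directly from the counts reported above. A short worked calculation, with variable names chosen here for illustration:

```python
# Counts as reported in the Results.
candidates = 311           # potential cLMS patients returned by EMERSE
confirmed = 140            # confirmed as cLMS by manual chart review
registry_identified = 101  # of those, also identified by ICD-O-3 in the registry

# Confirmed patients that a registry-only query would not have found.
missed = confirmed - registry_identified
missed_fraction = missed / confirmed

# Share of EMERSE candidates confirmed on chart review.
confirmed_fraction = confirmed / candidates

print(f"missed by ICD-O-3 alone: {missed} ({missed_fraction:.1%})")
print(f"candidates confirmed as cLMS: {confirmed_fraction:.1%}")
```

The calculation gives 39 of 140 confirmed patients, or about 27.9%, missed by registry classification alone.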

Conclusion: This study demonstrates how EMERSE can be used as an efficient and sensitive method for identifying patients with a rare cancer diagnosis directly from free-text clinical notes. Although EMERSE produced a substantial number of false positives, it still detected patients who may otherwise have been missed by traditional methods of querying. Applying more granular search criteria that exclude standardized language may yield even better results. For larger, multi-institutional studies, this type of resource could prove advantageous for cohort research on rare pathologic diagnoses.
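One way such an exclusion could work is sketched below, assuming plain regular expressions rather than any EMERSE feature: the standardized phrase is removed before checking for the target term. The pattern names, the function `mentions_outside_boilerplate`, and the sample notes are hypothetical.

```python
import re

# Standardized phrase quoted in the Results; matched case-insensitively.
BOILERPLATE = re.compile(r"risk\s+for\s+cutaneous\s+leiomyosarcoma", re.IGNORECASE)
TARGET = re.compile(r"leiomyosarcoma", re.IGNORECASE)


def mentions_outside_boilerplate(text):
    """Return True if the target term appears anywhere other than inside
    the standardized phrase (hypothetical helper, not an EMERSE function)."""
    # Blank out the boilerplate first, then look for any remaining mention.
    stripped = BOILERPLATE.sub(" ", text)
    return bool(TARGET.search(stripped))


# Invented example notes, not taken from any real record.
consult_note = "Patient is at risk for cutaneous leiomyosarcoma."
pathology_note = "Biopsy showed cutaneous leiomyosarcoma of the left thigh."

print(mentions_outside_boilerplate(consult_note))
print(mentions_outside_boilerplate(pathology_note))
```

A note containing both the boilerplate and a separate mention is still kept, so the filter only drops documents whose sole match is the standardized phrase.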
