Identification of drug-induced acute kidney injury in free text and in coded electronic health record data
by Rachel Murphy
The electronic health record contains extensive information pertaining to a patient’s hospital stay, captured in a variety of formats. Previous studies have demonstrated that evidence of adverse drug events can be found in these routinely documented data. An adverse drug event is usually defined as any harmful event resulting from drug therapy. In a retrospective single-center cohort study, we examined three types of electronic health record data for evidence of drug-induced acute kidney injury: structured (ICD-10 diagnosis codes), semi-structured (allergy module) and unstructured (free-text clinical notes) data. We found no evidence of drug-induced acute kidney injury in the ICD-10 diagnosis codes, and little evidence in the semi-structured allergy data. Most evidence of drug-induced acute kidney injury was identified in the clinical notes. However, manual review of clinical notes is time-intensive compared with analyzing structured data. Natural language processing (NLP) has the potential to automate this process. To prepare for using NLP, we conducted a scoping review of the NLP pipelines applied to identify adverse drug events in clinical narratives, and of the performance and validation of the methods in these pipelines. Our review showed that the lack of good-quality annotated corpora is the main bottleneck in applying NLP to adverse drug event detection. Therefore, as a follow-up study, we propose to augment the annotation of corpora for NLP with pre-annotation and assisted annotation methods. In this follow-up study, we aim to implement a pre-annotation method and an assisted annotation method and compare them with gold standard annotations with respect to accuracy and effort.