Ontology-based approaches to identify patients with type 2 diabetes mellitus from electronic health records: development and validation

Access & Terms of Use
open access
Copyright: Rahimi Khorzoughi, Alireza
Abstract
Introduction Issues around the data quality (DQ) of patient registers are often raised when a data set is used for clinical or research purposes. An ontology-based approach provides a flexible semantic framework and supports the automation of data extraction from electronic health records (EHRs). This research aimed to assess the flexibility of an ontology-based approach to accurately identify patients with type 2 diabetes mellitus (T2DM) in a clinical database. It also demonstrated the role of an ontology-based approach in assessing the quality of a register.
Method A systematic review was conducted, addressing DQ, the ‘fitness for purpose’ of the data used and ontology-based approaches. Included papers were critically appraised with a ‘context-mechanism-impacts/outcomes’ overlay. Drawing on a literature review, the Australian National Guidelines for type 2 diabetes mellitus, the Systematised Nomenclature of Medicine – Clinical Terms – Australian Release (SNOMED CT-AU) and input from health professionals, a five-stage methodology for DQ ontology (MDQO) was adopted. The methodology consisted of: (1) knowledge acquisition; (2) conceptualisation; (3) semantic modelling; (4) knowledge representation; and (5) validation. Although the MDQO can be used in any validation domain, this thesis validated it in the context of T2DM diagnosis and management. The accuracy of the MDQO was validated through the diabetes mellitus ontology against a manual audit of general practice EHRs. Contingency tables were prepared and the sensitivity and specificity (accuracy) of the model in diagnosing T2DM were determined. The gold standard was the set of T2DM cases found by manual EHR audit in a general practice that kept a diabetes register with complete and current reason for visit information. Accuracy was determined with three attributes – reason for visit, medication and pathology – singly and in combination.
Results The T2DM ontology included six object properties, 15 data properties, 68 concepts and 14 major themes in four main classes: actor, context, mechanism and impact. The validation showed sensitivity and specificity were 100% and 99.88% respectively with reason for visit; 96.55% and 98.97% with medication; and 15.6% with pathology test result. This suggests that medication and pathology test result data were not as complete as reason for visit data for the general practice audited. However, the completeness was adequate for the purpose of this thesis, as confirmed by the very small relative deterioration of accuracy (sensitivity and specificity of 97.67% and 99.18%, respectively) when calculated for the combination of reason for visit, medication and pathology test result.
Discussion Current research shows a lack of comprehensive ontology-based approaches for DQ in chronic disease management, and there are few validation studies comparing ontological and non-ontological approaches to the assessment of clinical DQ. The MDQO developed in this thesis provides a semantically flexible mechanism to capture patients’ data from EHRs. It is also designed to be generalisable and reusable. This T2DM ontology-based model (constructed using the MDQO) is sufficiently accurate to support a semantic approach, using reason for visit, medication and pathology test data from EHRs to define patients with T2DM. The accuracy of the T2DM ontology approach was established with respect to the DQ dimensions. The MDQO helps with the implementation of DQ based on “fitness for use” and hence better utilisation of routinely collected clinical data for research.
Conclusion This thesis contributes an ontology-based methodology for DQ assessment and management in a diabetes context. It provides new insights into the identification and assessment of patients with T2DM from EHR data.
This ontology-based approach can potentially support the assessment of the impact of DQ on a data set in terms of the purpose for which it is used. There is a need for similar ontology-based research in other clinical domains, beyond T2DM, to address DQ in chronic disease management.
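The sensitivity and specificity figures quoted in the abstract are derived from contingency tables that compare the patients flagged by an attribute against the manual-audit gold standard. The sketch below shows one way such a table could be computed; it is illustrative only and is not the thesis's implementation. The patient identifiers, the set contents and the names `Contingency` and `contingency` are hypothetical, and the counts are toy values that do not reproduce the reported results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2x2 contingency table of flagged patients versus the gold standard."""
    tp: int  # flagged and in the gold standard
    fp: int  # flagged but not in the gold standard
    fn: int  # in the gold standard but not flagged
    tn: int  # neither flagged nor in the gold standard

    @property
    def sensitivity(self) -> float:
        # Share of true T2DM cases that the attribute identified.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of non-T2DM patients that the attribute left unflagged.
        return self.tn / (self.tn + self.fp)


def contingency(flagged: set[str], gold: set[str], everyone: set[str]) -> Contingency:
    """Build the table from sets of patient identifiers."""
    return Contingency(
        tp=len(flagged & gold),
        fp=len(flagged - gold),
        fn=len(gold - flagged),
        tn=len(everyone - flagged - gold),
    )


# Hypothetical toy data: ten patients, four of whom have T2DM per manual audit.
everyone = {f"p{i}" for i in range(1, 11)}
gold = {"p1", "p2", "p3", "p4"}
by_reason_for_visit = {"p1", "p2", "p3", "p4"}
by_medication = {"p1", "p2", "p3", "p5"}

rfv = contingency(by_reason_for_visit, gold, everyone)
med = contingency(by_medication, gold, everyone)

print(f"reason for visit: {rfv.sensitivity:.2%} / {rfv.specificity:.2%}")
print(f"medication:       {med.sensitivity:.2%} / {med.specificity:.2%}")
```

The same function can be applied to any single attribute or to a combined set of flagged patients. How the attributes are combined (for example, by union or intersection of the flagged sets) is not stated in the abstract, so that choice is left open here. The sketch also assumes the gold standard contains at least one positive and one negative case; otherwise the ratios are undefined.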
Author(s)
Rahimi Khorzoughi, Alireza
Supervisor(s)
Prof. Liaw, Siaw-Teng
Prof. Ray, Pradeep Kumar
Publication Year
2015
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty