Temporal reasoning with medical data - A review with emphasis on medical natural language processing
Abstract
Temporal information is crucial in electronic medical records and biomedical information systems. Processing temporal information in medical narrative data is a very challenging area. It lies at the intersection of temporal representation and reasoning in artificial intelligence and medical natural language processing. Some fundamental concepts and important issues in relation to temporal representation and reasoning have previously been discussed, mainly in the context of processing structured data in biomedical informatics; however, it is important that these concepts be reexamined in the context of processing narrative data using medical natural language processing. Theoretical and methodological temporal representation and reasoning studies in biomedical informatics can be classified into three main categories: category one applies theories and models from temporal reasoning in artificial intelligence; category two defines frameworks that meet needs from clinical applications; category three resolves issues such as temporal granularity and uncertainty.
Currently, most medical natural language processing systems are not designed with a formal representation of time, and their ability to reason about temporal relations among medical events is limited. Previous work in processing time with clinical narrative data includes processing time in clinical reports, modeling textual temporal expressions in clinical databases, processing time in clinical guidelines, and building time standards for data exchange and integration.
In addition to common problems in medical natural language processing, there are challenges specific to temporal representation and reasoning in medical text, which occur at each level of linguistic structure and analysis. Despite advances in temporal reasoning in biomedical informatics, processing time in medical text deserves more attention. Besides the need for more research on temporal granularity, fuzzy time, temporal contradiction, intermittent events and uncertainty, broad areas for future research include enhancing the ability of current medical natural language processing systems to process temporal information, incorporating medical knowledge into temporal reasoning systems, resolving coreference, integrating narrative data with structured data, and evaluating these systems.
One. Introduction
Temporal information is crucial in electronic medical records and biomedical information systems. Healthcare providers normally record the progress of a disease or a hospital course chronologically in text, and procedures and laboratory tests are stored in databases with time-stamps. The electronic medical record is only significant in a certain temporal context. Retrieval of information often relates to time; for example, "what happened after that operation?" As a fundamental entity, time is intrinsically connected with many medical reasoning tasks. Automatically reasoning about temporal information can help us understand the dynamics of medical phenomena and may potentially improve the quality of patient care.
Temporal information processing in medicine is a task that draws from many fields, including philosophy, artificial intelligence, database management, computational linguistics, and biomedical informatics. During the last two decades, researchers with different backgrounds, perspectives and objectives have attempted to bring together the fundamental methodologies and techniques from these disciplines to conduct research in this challenging area. Several review articles present detailed summarization and analysis of this area. However, research efforts in temporal reasoning in biomedical informatics are predominantly in processing structured data, which is the focus of most of these previous review articles. Recently, Augusto argued that in the medical domain, "Natural Language turns into a very fertile area of research where temporal issues are very important."
Nowadays, the most significant impact of computer technologies in medicine is in processing structured data. In general, structured data is information presented in a standard, predictable form (e.g. defined data types and operations) that is easily processable by a machine. By contrast, unstructured data is information that does not have a data structure. Examples include free text, audio and video. The term "natural language" is used to distinguish languages for human general-purpose communication from computer languages (e.g. a programming language or a formal representation). In the medical domain, natural language commonly appears in patient charts, scientific literature, technical and administrative reports, emails, surveys, presentations, etc. Natural language processing refers to any system that manipulates text or speech. In this paper we focus on medical text processing along with associated applications, such as information extraction and retrieval, and we also limit our scope to the English language.
Processing time in medical text is a very challenging area, which lies at the intersection of temporal representation and reasoning in artificial intelligence and medical natural language processing. In addition to the general difficulties in temporal representation and reasoning and in medical natural language processing, there are many other critical issues that need to be studied in depth in order to handle such information. In natural language, times related to an event are not always stated explicitly. Interpreting this implicit information requires complex linguistic analysis and domain knowledge. In addition, temporal expressions are diverse and often vague (e.g. "during that time", "recently"). Another challenge is to determine whether two expressions refer to the same concept. For example, in the statement "The patient was in her usual state of good health one day prior to admission when she developed running nose, fever, cough and respiratory distress. Her symptoms worsened on the day of admission", "Her symptoms" refers to the set of manifestations listed in the first sentence (running nose, fever, cough and respiratory distress).
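To make the difficulty concrete, the following is a minimal Python sketch of how admission-relative expressions such as those in the example statement might be anchored to calendar dates. It is an illustration only, not a method described in the reviewed literature: the function name `resolve`, the `NUMBER_WORDS` table, the `PRIOR` pattern and the admission date used in the usage note are all hypothetical, and the sketch covers only two phrasings. Vague expressions such as "recently" are deliberately left unresolved, which is where linguistic analysis and domain knowledge would be needed.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately tiny table of number words; real text needs far more coverage.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

# Matches phrases of the form "<n> day(s)/week(s) prior to admission".
PRIOR = re.compile(r"(\w+) (day|week)s? prior to admission", re.IGNORECASE)


def resolve(expression, admission_date):
    """Return a date for an admission-relative expression, or None if unresolved."""
    text = expression.strip().lower()

    # "on the day of admission" maps directly onto the anchor date.
    if "day of admission" in text:
        return admission_date

    match = PRIOR.search(text)
    if match:
        quantity, unit = match.group(1), match.group(2)
        n = NUMBER_WORDS.get(quantity)
        if n is None and quantity.isdigit():
            n = int(quantity)
        if n is None:
            return None  # unknown quantity word
        days = n * (7 if unit == "week" else 1)
        return admission_date - timedelta(days=days)

    # Vague expressions ("recently", "during that time") cannot be anchored this way.
    return None
```

Assuming a hypothetical admission date of 10 March 2020, `resolve("one day prior to admission", date(2020, 3, 10))` yields 9 March 2020, while `resolve("recently", date(2020, 3, 10))` yields `None`. Note that the sketch also assumes the admission date is known from structured data, and it does nothing to establish that "Her symptoms" corefers with the earlier list of manifestations.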
Although many fundamental concepts in relation to temporal reasoning have been extensively discussed in previous publications, they are inadequately addressed in the context of processing medical narrative data. Thus, this article first examines some of these concepts, specifically focusing on basic time-related notions, theories and models in artificial intelligence, with emphasis on medical narrative data.
Then, we present a methodological overview of temporal reasoning in the medical domain. One motivation is our view that these theoretical and methodological studies will in general contribute to processing time in medical text, either in the development of systems that make use of these methods, or in a future direction that integrates processed narrative data with structured data for higher-level temporal reasoning tasks.
In the remaining sections, we emphasize natural language processing and the processing of temporal information in medical narrative data. We start with an overview of previous work on processing time in natural language processing in general. Then, we introduce several important studies on processing temporal information in medical text in the field of medical informatics. We provide a discussion of significant issues and challenges in handling temporal information. Finally, we conclude this article by pointing out several key areas for future research.