Getting to know your own data better: scanning free text using terminologies and FHIR

Dr. André Sander, Head of Technical Development at ID in the DMEA Gold Partner Interview

The software-supported derivation of medical classifications such as ICD or OPS from structured medical documentation is not rocket science. It gets more interesting when terminologies such as SNOMED CT are used to trawl through free text for symptoms, because this allows for more complex analyses, which can help to improve efficiency and the quality of treatment. ID is a company specialising in medical documentation and coding, and at DMEA it is introducing a new product that does exactly this. Dr. André Sander, Head of Technical Development at ID, reveals the first details.

Real-world data analysis is becoming increasingly important in healthcare, a trend that the coronavirus pandemic made even more evident. What kind of information can easily be extracted digitally from clinical documentation, and where is it more difficult?

What can be extracted from the documentation relatively easily are the classifications required by law for accounting purposes, i.e. ICD-10 for diagnoses and OPS for operations and procedures. These have been available for a long time, and we at ID have been offering the relevant digital tools for more than two decades. But of course these are not the only data of interest. Laboratory findings and medications are important for many questions, especially medical ones. Much of this is still documented in free text fields. This applies even more to symptoms, an area where there has been hardly any structured data until now.

What basic options exist for representing this data, and symptoms in particular, in a standardised and thus usable form?

In principle, consistent structuring of the primary data would be one possible approach, but I do not think it is feasible, or if it is, then only selectively. This is especially true of symptoms: every patient is different, and each doctor describes them individually. That is difficult to capture in a structured form. Psychiatry is the extreme example of the limits of structured documentation, but the same applies to other disciplines. The alternative is software that is backed by classifications and terminologies and that understands language, i.e. computational linguistics applied to free text fields. The time for this has increasingly come. One important reason is that the syntactic HL7 standard FHIR more or less provides us with a framework specifying how the data we extract should ultimately be processed. Compared with the situation only a few years ago, this offers major advantages.

At DMEA you are presenting a concept for the standardised representation of clinical symptoms on the basis of free text, along with a corresponding product. How exactly does your free text analysis of symptoms work, and what part do terminologies such as SNOMED CT play in it?

At DMEA we are showing how we analyse all the components of a patient file and make the content available for further evaluation. We want not only to derive specific classifications from free text but also to enable open-ended evaluations. For this reason we cannot rely solely on the classifications used for accounting purposes but have to work with powerful terminologies and semantic networks. For example, we use SNOMED CT, as far as it is already available in German, the Wingert nomenclature, LOINC for laboratory data and Orphacode for rare diseases. We use the terminologies directly to represent the symptoms, and the link to an ontology enables further processing: users can make inferences about diagnoses that have not been ICD-coded or evaluate the quality of treatment. Hypotheses about aetiologies can be generated, for example whether dizziness is correlated with fever or with certain medications. Combining the analysis of free text with terminologies and ontologies creates a flexible analytical instrument that can be applied to a great many questions.
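
The basic idea of scanning free text against a terminology can be illustrated with a minimal Python sketch. It is not ID's method, which relies on computational linguistics rather than plain string matching. The term list, the `DEMO-…` concept identifiers, and the names `TERMINOLOGY`, `Match` and `find_symptoms` are all hypothetical; real SNOMED CT identifiers would come from a licensed release or a terminology server.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-terminology: surface forms mapped to placeholder concept IDs.
# Synonyms share one concept ID, which is what makes later analysis possible.
TERMINOLOGY = {
    "dizziness": "DEMO-001",
    "vertigo": "DEMO-001",
    "fever": "DEMO-002",
    "high temperature": "DEMO-002",
}


@dataclass(frozen=True)
class Match:
    concept_id: str  # placeholder concept identifier
    text: str        # wording as it appears in the note
    start: int       # character offset in the note


def find_symptoms(note: str) -> list[Match]:
    """Return terminology matches found in a free-text note, in reading order."""
    matches = []
    for term, concept_id in TERMINOLOGY.items():
        # Whole-word, case-insensitive match of each known surface form.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, note, flags=re.IGNORECASE):
            matches.append(Match(concept_id, m.group(0), m.start()))
    return sorted(matches, key=lambda match: match.start)


note = "Patient reports vertigo since Monday and a high temperature at night."
found = find_symptoms(note)
concepts = [m.concept_id for m in found]
print(concepts)
```

Because "vertigo" and "dizziness" resolve to the same concept, notes written by different doctors become comparable. A sketch like this ignores negation ("no fever"), inflection and misspellings, which is precisely where linguistic analysis is needed.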

Perhaps we could be more precise here: your customers are the hospitals. Exactly how can they benefit from using such an analytical tool?

The obvious application is research, even though that is not our primary focus. Because patient data can be represented in any of the terminologies, it is possible to find notable correlations suitable for further evaluation. Of greater relevance for many of our customers is evaluating the quality of treatment, for example investigating the causes of exceptionally long hospital stays. Improved efficiency is another important keyword. The tool can be used to compare different patient cohorts for patient-oriented benchmarking. Controlling departments can analyse ICD-10-coded data with other evaluation logics and in this way open up other areas. Support can be provided for evaluations in connection with enquiries by the Medical Service. And then there are the legally required interoperability functions, such as the Medical Information Objects (MIO). Here an increasing number of standardised FHIR structures exist that will enable interoperable content to be communicated in future. Our tool enables facilities to fill these FHIR structures from routine data, thereby saving a great deal of work.
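
What "filling a FHIR structure" from an extracted symptom might look like can be sketched in Python as follows. This is an illustration using only base FHIR R4 `Observation` fields, not a MIO profile and not the product's actual output. The function name `symptom_observation`, the patient identifier and the choice of status `preliminary` for machine-extracted, unreviewed content are assumptions, and the SNOMED CT code shown should be verified against a current release.

```python
import json

SNOMED_SYSTEM = "http://snomed.info/sct"  # FHIR code system URI for SNOMED CT


def symptom_observation(patient_id: str, code: str, display: str, source_text: str) -> dict:
    """Build a minimal FHIR R4 Observation for a symptom found in free text."""
    return {
        "resourceType": "Observation",
        # Assumption: extracted findings stay 'preliminary' until reviewed.
        "status": "preliminary",
        "code": {
            "coding": [{"system": SNOMED_SYSTEM, "code": code, "display": display}],
            # Keep the original wording so the coding remains traceable.
            "text": source_text,
        },
        "subject": {"reference": f"Patient/{patient_id}"},
    }


# Hypothetical patient ID; the code is given for illustration only.
obs = symptom_observation("example-123", "386661006", "Fever", "high temperature at night")
print(json.dumps(obs, indent=2))
```

Keeping the original wording in `code.text` alongside the coding lets a reviewer check each automatically generated entry against the source document.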

Is this still just a concept, or do you already have a product at DMEA for performing all these functions?

This is a standalone product that has been developed in a partnership between ID and DMI, and the first version is being presented at DMEA. The product is called DaWiMed, from the German words for “data, knowledge, medicine”. The core component is our ID terminology server, an ongoing development within the framework of our Medical IT Initiative (MI-I), which has also been scientifically validated. DaWiMed is offered primarily as software-as-a-service and can be implemented with relatively little outlay. The process involves placing all the required electronic and digitised documents in a file directory, which we then analyse and subsequently make available for any desired evaluation. The documents to be analysed are selected on the basis of the Clinical Document Class List, which is an integral component of the DMI solutions. That is also what makes this solution so special: knowing which types of document are being analysed has a considerable influence on the quality of the analysis results.

Could DaWiMed therefore be described as a kind of “terminology server plus”, an application that augments a terminology server with free text analysis and thereby makes the free texts available for medical business intelligence purposes?

One could put it that way, yes. What is important is that the terminology server is also available as a separate product. If the intention is only to translate certain classifications and documentation standards into other formats, the terminology server alone is sufficient, and no free text functions are needed. Take the case of rare diseases, laboratory documents or the cancer registry: in such cases documentation requirements are constantly increasing, sometimes with separate nomenclatures that are either already mandatory or soon will be. Of course, the complete codes have to come from somewhere. A terminology server cannot fully automate this coding and code conversion process, but it can at least go some way towards it.
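
Why code conversion can only be partly automated becomes clear in a small Python sketch. The mapping table, the codes `SRC-…` and `TGT-…`, and the function `translate` are purely hypothetical and stand in for whatever maps a real terminology server provides.

```python
# Hypothetical mapping table from a source classification to a target nomenclature.
CODE_MAP = {
    "SRC-A": ["TGT-1"],           # unambiguous: can be converted automatically
    "SRC-B": ["TGT-2", "TGT-3"],  # one-to-many: a person has to choose
}


def translate(source_code: str) -> tuple[str, list[str]]:
    """Return a status and the candidate target codes for one source code."""
    targets = CODE_MAP.get(source_code, [])
    if len(targets) == 1:
        return "mapped", targets
    if targets:
        return "needs_review", targets
    return "unmapped", []


results = {code: translate(code) for code in ["SRC-A", "SRC-B", "SRC-C"]}
for code, (status, targets) in results.items():
    print(code, status, targets)
```

Only the unambiguous case is converted without human input; ambiguous and missing mappings are flagged rather than guessed.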
