Development of an ontology architecture to ensure semantic interoperability between communication standards in healthcare: Appendix with additional explanations
3. Improvement of interoperability
Practical interoperability (working interoperability) can ultimately be demonstrated only in running systems and applications. This requires (completely) developed interfaces. On the timeline of the process steps, this is the last one:
Until then, there are several ways to enable interoperability in the first place and to improve it. These improvements can be achieved in various ways that complement each other. Besides the primary time axis, the standard used for introducing improvements is essential: the earlier and more precisely standardization is done, the more effective the activities are. Within this frame, the individual activities can be classified.
The following list of improvements is classified accordingly.
During the development of the versions of the v2 family, it became apparent that specifying standards with a word processor [SchoOem2001] is problematic. The first widely known version, 2.1, was delivered in a form that could not even be printed, because the entire formatting was adapted to the American Letter format, which is too wide for the German DIN A4 format. As a result, table-oriented information was wrapped illegibly. In the next version (v2.2), this shortcoming was removed, but there remained enough reason for questions, triggered especially by inconsistencies within the documents. To make things worse, a manually compiled appendix caused confusion. In 1993, these problems led to the founding of the first technical committee within the German user group, to help clarify questions of this kind.
In the course of later developments, both problems have been addressed. The basis is the so-called "comprehensive HL7 database" [OemDud1996, OemSto2000], which contains the relevant information of the standard. The appendix is now generated from the extracted data stored in this database and is thus at least consistent with the actual chapters.
The question remains whether the database matches the standard one hundred percent. Since a relational database cannot be filled with inconsistent information, this must be denied.
When extracting the data for the database from the actual standard, these inconsistencies are identified and resolved. The list compiled in this process is the basis for proposals for the next version. Unfortunately, for compatibility reasons, many of these improvements are not implemented directly, because they address problems whose solution in the current, inconsistent form has already been accepted.
The problem remains that the different committees continue their work on the chapters without being accurately informed about changes in other chapters. A style guide [OemVei2000] can address this problem partially. Directly, it provides clear guidance on which formatting is applied when. Indirectly, these guidelines can also be connected to semantic content (i.e. metadata mapping). It cannot be guaranteed that the editors adhere to these principles, but a semi-automatic process can extract the information (paragraphs, headings, tables) and store it in an intermediate database. The latter then allows for consistency checks, for example whether all used tables have also been defined, or whether lengths are identical. Moreover, semantic content can be verified as well, for example whether fields of data type CNE are only used in combination with HL7-defined tables. All such information can contribute as negative comments to the ballot processes, and thus lead to an improvement of the overall standard.
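Consistency checks of this kind can be sketched as queries against such an intermediate database. The schema, values, and findings below are purely illustrative and do not reflect the actual structure of the HL7 database:

```python
import sqlite3

# Hypothetical intermediate database extracted from the standard documents.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE defined_tables (table_id TEXT PRIMARY KEY);
CREATE TABLE field_defs (field TEXT, datatype TEXT, table_id TEXT);
INSERT INTO defined_tables VALUES ('0001'), ('0004');
INSERT INTO field_defs VALUES
  ('PID-8', 'IS',  '0001'),   -- consistent: table 0001 is defined
  ('PV1-2', 'IS',  '0004'),   -- consistent
  ('ZXX-1', 'CNE', '0999');   -- inconsistent: table 0999 is never defined
""")

# Check 1: every table referenced by a field must also be defined.
missing = [row[0] for row in con.execute("""
    SELECT DISTINCT table_id FROM field_defs
    WHERE table_id NOT IN (SELECT table_id FROM defined_tables)
""")]
print(missing)  # ['0999']

# Check 2: fields of data type CNE must reference an HL7-defined table.
cne_violations = [row[0] for row in con.execute("""
    SELECT field FROM field_defs
    WHERE datatype = 'CNE'
      AND table_id NOT IN (SELECT table_id FROM defined_tables)
""")]
print(cne_violations)  # ['ZXX-1']
```

Each finding of such a query would then become one negative ballot comment.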
3.2. The HL7-Database
The first version of the HL7 database was only used to get HL7 Version 2.1 into a consistent state. At the same time, the interpretation of the standard began in Germany, while in the U.S. version 2.2 appeared.
Both were taken as an opportunity to expand the HL7 database. In the course of further development steps, this database has acquired an impressive list of properties that are used to ensure the consistency of the standards and also for the development of tools and interfaces by large and small manufacturers:
Currently it is being considered whether this database can serve as the basis for future developments of the standards.
The first steps have been initiated to enhance this database further into a multi-standards database that also includes IHE and DICOM besides HL7 v2.x. For this purpose, the corresponding metamodels are being worked out, and the documentation of the standards is being prepared through the establishment of additional style guides. The latter requires intensive cooperation with the various committees and individual editors, which can hardly be achieved.
This extension would also enable cross-standard consistency and integration tests.
Filling the database with values
A significant problem is filling the database with values, i.e. the question of how the many individual data sets can be created consistently. In the course of the work, various approaches were tested and found not useful or functional enough, even though they delivered results. Multiple redesigns were caused not least by the increasing complexity and scope of the standard itself, so that the additional manual effort had to be reduced to a minimum.
Due to the inconsistency in the standard itself at least three steps are necessary:
Step b) is pure hard work: matching the data obtained against the original documents. The real intention behind problematic data must be determined in order to correct the inconsistent data accordingly. This automatically yields a list of errors, which is the basis for comments in the next ballot cycle. The final step, uploading the data into the final database, can be done automatically, as the data is already available in structured form.
The real difficulty is to access the data in the Word documents efficiently. Here, the following variants have emerged:
Table 1: Alternatives to populate the database with values
3.3. Alternatives to fill the database with values
The high expenditure, which challenges even the editors of the particular standards, prompted the search for alternatives. The easiest way is to maintain the standards directly on the basis of a database. This must be combined with additional tools that allow convenient editing. It is important to meet additional requirements:
A first executable approach was presented at the 2001 IHIC in Reading: [SchoOem2001]
3.4. Documentation of the interface
The most important step towards successful interoperability is good documentation of the implemented interface. Unfortunately, it must be noted that most manufacturers do not have such documentation. If they are able to provide written documentation at all, in many cases it is a copy or extraction of the original standard.
In order to achieve clarification in case of failure, an accurate understanding of the system behavior is essential. It is therefore necessary to enrich the documentation with additional information. Last but not least, the customer has a right to complete and correct documentation. One difficulty, however, is that configuration details hidden when "customizing" the interface must be made transparent accordingly.
The following graphic [Sing2003, SinJuu2001] illustrates that an accurate understanding of the interface behavior and appropriate documentation already do very much to prepare the migration from HL7 v2.x to V3.
HL7 version 2 messages require additional information that is implemented relatively "statically". This includes restrictions such as field lengths, the number of repetitions, the interpretation of segment contents, or the presence of pre- and post-conditions for a message. When an interface is used, there are further restrictions such as the binding to specific catalogs, which should not be presented in full but only referenced.
On closer inspection, this information covers the static and dynamic definitions and the various scenarios in the V3 standard.
This documentation problem is addressed by the so-called Proposal #605, which has been deposited in the Proposal Database and is currently, in a repeatedly improved form (V07), prepared for incorporation into version 2.8. The ECCF (Enterprise Conformance and Compliance Framework) of the SAEAF (Services Aware Enterprise Architecture Framework) project takes up this proposal.
3.4.1 Message Profiles
Message profiles form a hierarchy, i.e. one profile builds on another. However, certain rules must be obeyed: in this derivation hierarchy, only restrictions may be added. For example, if an element (e.g. a field) is "required" in a profile, it may not be derived to "optional" or even "not allowed" again. (The possible transitions are described in Chapter 2 of HL7 v2.x.)
Below, two examples of such a hierarchy are listed. In the first, there is a national guideline on which a hospital chain bases its requirements, which are then adapted by a vendor. In the second example, a vendor builds on the national guideline in a generic form that can then be refined for the ultimately targeted application:
These examples can also be combined and nested to any depth.
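The derivation rule can be sketched with a simplified set of usage codes (O = optional, RE = required but may be empty, R = required, X = not allowed); the ordering below is an illustrative reading, the normative transitions are those in Chapter 2 of HL7 v2.x:

```python
# Along the profile hierarchy, the usage of an element may only be
# restricted, never relaxed. Illustrative strictness ordering:
STRICTNESS = {"O": 0, "RE": 1, "R": 2, "X": 2}

def valid_derivation(parent_usage: str, child_usage: str) -> bool:
    """A derived profile may keep or tighten the usage, never loosen it."""
    if parent_usage in ("R", "X"):
        # Required and excluded are terminal: no further change allowed.
        return child_usage == parent_usage
    return STRICTNESS[child_usage] >= STRICTNESS[parent_usage]

print(valid_derivation("O", "R"))   # True:  optional may become required
print(valid_derivation("R", "O"))   # False: required may not become optional
print(valid_derivation("R", "X"))   # False: required may not become excluded
```

A profile validator would apply such a check to every element of a derived profile against its parent.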
3.4.2 Conformance Testing
The goal must ultimately be that two implementations are compatible with each other, so the last level of this presentation can be taken, where two implementations are checked against each other.
To replace the test of two systems in real use (5), an alternative must be sought: (6). This works under certain conditions (on the subject of certification, see further below):
Depending on whether one tests an implementation against a specification (document), a documentation against a specification or two documentations or implementations against each other, one speaks of conformance, compliance or compatibility:
3.4.3 "Documentation Quality Hierarchy"
The above-mentioned Proposal #605 defines several quality criteria, which form a hierarchy building on each other. To get the documentation to a higher level of quality, greater demands must be met. The following are the explanations from Proposal #605:
In the V3 standard, this problem was addressed quite early by the so-called "pubs DB" (database publishing). The editors are asked to enter the contents according to instructions (Facilitator Guide).
The basis for the viability of this concept is the existence of so-called artefact IDs, which are unique identifiers for the various objects (application roles, interactions, message types, ...), enabling unique references. These artefact IDs are generated by an algorithm, so that the various TCs can work independently on refining the specifications. Referencing external objects is therefore also possible.
Shared information such as CMETs (Common Message Element Types) is distributed cyclically to the TCs, so that they all work on the same versions.
Each TC has its own pubs DB. For publication, the databases are merged and processed semi-automatically to create a comprehensive publication.
In the future, this database should be migrated to the so-called MIF files, which are also the basis for this work.
3.6. Document-oriented Specifications
Not all specifications are model-oriented; some are created primarily as text. Examples are the CDA and the "Refinement, Constraint and Localization" specification. Such documents are prepared in XML according to a predefined schema and can thus be subjected to a uniform publication process.
The integration of external objects such as graphics is done by the Artefact IDs.
3.7. v2.x: Component Model for Data Types
Until version 2.4, data types were considered a simple juxtaposition of their components; only the data type itself was considered.
In the text, both in the description of the data type and of the field, there was additional information that had to be assigned specifically to one component. With the so-called component model, which was officially introduced with version 2.5, the assignments described in the text could be made explicit.
In the HL7 database, this component model was introduced as of version 2.1 in order to reflect the existing details correctly. (An example is the data type CM, which envisages different structures depending on usage, which is not necessarily appropriate for semantic interoperability.)
One difficulty is to classify the various properties of elements correctly, as there is no formal metamodel for the v2 versions. The following table tries to reflect this accordingly:
Table 2: Assignment of properties
3.8. v2.x: Implicit Segment Structures
A problem already indicated is the large amount of implicit information content. An implementation of the standards is simplified considerably if this information is also shown explicitly. In the following message structure, the explicit information content is inserted as "single-choice" elements (in gray). This means of expression has definitely been available in the standard since the first HL7 version 2, but has only been mentioned explicitly since v2.3.1. Unfortunately, the various TCs have neither recognized nor used these opportunities:
Table 3: Message Structure "ADT^A24: Patient Link" with explicit information
In the above example, the two segment sequences for the identification of the two patients are identical. A correct assignment of semantics, however, can only be determined by the order/position of the segments in the message. This problem causes additional difficulties in the XML encoding, because the XPath expressions are identical. In the example shown above, this can still be fixed via the position information, but not in other examples.
The gray lines give the implicitly present information an explicit name. In the XML encoding, this solution would place the segments in their own (named) group and thus simplify access to the content. (Introducing these substructures would, however, be incompatible with existing implementations.)
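The XPath ambiguity and the effect of the named groups can be illustrated with a strongly simplified instance; the element names are illustrative, not the normative v2.xml encoding:

```python
import xml.etree.ElementTree as ET

# Without groups, both patients of ADT^A24 appear as PID segments and
# only their position in the message tells them apart:
flat = ET.fromstring(
    "<ADT_A24><PID>patient 1</PID><PV1/><PID>patient 2</PID></ADT_A24>")
print([pid.text for pid in flat.findall("PID")])  # ['patient 1', 'patient 2']

# With the implicit information made explicit as named groups, each
# patient gets its own addressable substructure:
grouped = ET.fromstring(
    "<ADT_A24><PATIENT_1><PID>patient 1</PID><PV1/></PATIENT_1>"
    "<PATIENT_2><PID>patient 2</PID></PATIENT_2></ADT_A24>")
print(grouped.find("PATIENT_2/PID").text)  # 'patient 2'
```

The second form allows an unambiguous path to each patient, at the price of incompatibility with existing implementations.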
3.9. v2.x: Explicit Message Structures/BNF
An equivalent problem to the segment structures is the reusability of complete messages. For example, the approx. 60 ADT messages consist of three basic structures, when ignoring the few query messages:
The basic message varies for most of the ADT messages only slightly in scope (optional structures are omitted) or in the order of the segments (which is not always one hundred percent identical). This reveals a disadvantage of not using a metamodel approach when defining the standards: messages that are similar in principle vary and must be developed separately. Apart from the additional development costs for the manufacturers, the balloting and maintenance effort for the standard itself is increased unnecessarily.
The introduction of "Message Structure Identifiers" with version 2.3.1 helps, in a first step, to identify exactly identical message structures as such.
Note: Unfortunately, this identifier is not taken into account when advancing the standards, so that messages with a previously identical structure may get a new one.
In a second step, these MsgStructIDs should be used to eliminate duplicate message definitions in the standard and thus reduce its size, since each structure is then specified only once.
Note: This proposal leads to a compatibility issue, but it would solve the problem of varying structures permanently, as each structure would then be provided only once.
3.10. v2.x: National Development
National peculiarities (e.g. the insurance card in Germany, the information for DRG reimbursement, the use of ICD-10 (the Americans still use ICD-9 and thus avoid the associated problems), the extended character sets in Russia and Japan, the special handling of names in different countries, etc.) require extensions. For this purpose, the HL7 standard provides the so-called Z-segments and Z-events.
The use of a Z-segment may have two different causes:
In the first case, the existing specifications should be implemented correctly. (But there are also examples that consciously do not take this path: the HL7 v2.4 message profiles provided by the NHS in the UK, or the bedside phone numbers of patients in Austria.)
In the second case, the additions in the Z-segments should be introduced to the standard as an extension proposal. According to the author, every such solution is reason enough for an extension proposal, because if one vendor needs such an extension, a second will certainly need it as well.
3.11. V3: National Development
In version 3, this situation looks different but is similar in principle. See also the sections on CMET (substitution) and localization rules.
3.12. v2.x: XML Encoding
The XML encoding for version 2.x ("HL7.xml") is based in principle on the use of XML structures and elements that follow the standard encoding (ER7). Thus, the structures are implemented compatibly with ER7. However, there are some differences:
On the one hand, implementations based on this (apart from the XML tools) have a certain advantage if the structures are clearly identifiable. On the other hand, enhancements lead to incompatibilities.
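The correspondence between the two encodings can be sketched for a single segment as follows; a real HL7.xml implementation additionally handles components, repetitions, groups from the message structure, and escape sequences:

```python
# Minimal sketch: map the fields of one ER7 segment onto the element
# naming pattern of the XML encoding (SEG, SEG.1, SEG.2, ...).
def er7_segment_to_xml(segment: str, field_sep: str = "|") -> str:
    fields = segment.split(field_sep)
    name = fields[0]                      # segment name, e.g. "PV1"
    parts = [f"<{name}>"]
    for i, value in enumerate(fields[1:], start=1):
        if value:                         # empty fields are simply omitted
            parts.append(f"<{name}.{i}>{value}</{name}.{i}>")
    parts.append(f"</{name}>")
    return "".join(parts)

print(er7_segment_to_xml("PV1|1|I"))
# <PV1><PV1.1>1</PV1.1><PV1.2>I</PV1.2></PV1>
```

The field positions of ER7 thus map one-to-one onto numbered element names, which is what keeps the two encodings structurally compatible.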
3.13. v2.x: Message Profiles
The following hierarchy is used in Germany: based on the original standards, a translation/interpretation has been produced over the past 15 years. This informative, but unofficial, version is the basis for the message profiles, which have also been balloted by the HL7 user group and are thus regarded as normative. On this basis, vendors can then define and implement any additional restrictions:
On this basis, for example, three different profiles for the admission message have been defined. Which one is realized should be specified in the message header:
An important point, finally, is the documentation: each realization of the basic specification is a profile in itself, but quite often it is not seen as such and is therefore not well documented.
Another issue is the HL7 compliance, that is, whether such a profile is defined according to the general requirements or not.
As shown in the above figure, several profiles are defined that build upon each other. Normally, these are clearly identified by corresponding OIDs. However, these OIDs are only sometimes stored in a register, so that they can be verified only with difficulty or not at all.
3.14. v2.x: Compatibility among Versions
According to the standard encoding rules, the v2 versions are conceptually upward compatible. This is restricted in only two points. First, by semantic changes to field definitions: it may happen that information is placed in other fields. Second, with regard to the encoding using the XML syntax: here, the data types of the data elements are used as tag names if they contain more than one component. So if atomic field content is divided into components, an additional layer is introduced into the encoding. The same effect occurs when, in the extension of messages, segments are aggregated into new groups.
The first problem can be solved if the data types are not implemented with their own element names, but relative to the transmitted component:
Table 4: Proposal for XML-Encoding
This increases the effort required for defining the schemas significantly, but it also solves another, as yet unnamed problem: certain data types are further constrained by additional restrictions when they are used within a particular field. Examples: "CE", "CNE" and "CWE".
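A sketch of this naming idea, assuming field-relative element names instead of the data-type names; the field and element names are illustrative:

```python
# Components are named relative to the field that carries them
# (OBX.3.1, OBX.3.2, ...) instead of after the data type (CE.1, CE.2, ...).
# A later reclassification of the field's type (e.g. CE -> CWE) then no
# longer changes the tag names.
def encode_field(field: str, components: list) -> str:
    inner = "".join(
        f"<{field}.{i}>{c}</{field}.{i}>"
        for i, c in enumerate(components, start=1) if c)
    return f"<{field}>{inner}</{field}>"

# The same content, whether the field is defined as CE or later as CWE:
print(encode_field("OBX.3", ["GLU", "Glucose", "LN"]))
# <OBX.3><OBX.3.1>GLU</OBX.3.1><OBX.3.2>Glucose</OBX.3.2><OBX.3.3>LN</OBX.3.3></OBX.3>
```

The price is that the schema must spell out the components for every field separately, rather than referencing one shared data-type definition.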
The second compatibility problem can be solved only if the previously described implicit information content is made explicit. For this problem, it means that all segments would have to be checked to see whether they are eligible for a later grouping with other segments. This approach is of very limited use, because segments from future versions are not yet foreseeable. For this reason, this objective cannot be achieved.
3.15. Character Set
A problem of a very different kind, which only occurs under certain conditions, is the character set used. The various implementations write information from the data set relatively cluelessly, directly into the message instance. This usually results in:
It is expected that the other side uses the same character set and exploits the transmitted data directly. Thus, the associated problems have not been noticed so far:
One point here are the escape sequences: to switch between the various character sets, escape sequences are necessary, but these are not defined for all character sets. In principle, according to table 0211, an appropriate escape sequence must be present for each allowed character set, but this is not the case.
For their preferred character set (ISO IR87), the Japanese use internally defined escape sequences. The current implementations show that this works well.
On 16 March 2010, the InM WG decided not to permit the character sets UTF-16 and UTF-32. These two are basically only alternative representation forms of Unicode and can thus certainly be represented by UTF-8. A fundamental incompatibility between the different applications is thereby excluded. UTF-8, moreover, is designed so that no fundamental problems are caused by the overlap with ASCII. Note for the definition of the delimiters that in this case they can be represented with 7 bits.
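The 7-bit argument can be verified directly: the ER7 delimiters are ASCII characters, which UTF-8 encodes as the identical single bytes, while UTF-16 does not:

```python
# The standard ER7 delimiters: field, component, repetition, escape,
# subcomponent separator.
delimiters = "|^~\\&"

for ch in delimiters:
    assert ord(ch) < 128                              # representable in 7 bit
    assert ch.encode("utf-8") == ch.encode("ascii")   # identical single byte

# In UTF-16, the same character occupies two bytes, so a byte-oriented
# parser scanning for b'|' would break:
print("|".encode("utf-8"))     # b'|'
print("|".encode("utf-16-be")) # b'\x00|'
```

This is why excluding UTF-16/UTF-32 while allowing UTF-8 keeps existing byte-oriented v2 parsers working unchanged.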
3.16. V3: Localisation
The localization of V3 elements occurs, besides the exchange of CMETs (see below), by demotion and promotion of data types. By this is meant that the data types used are replaced by simpler or more generic ones. The document "Refinement, Constraint and Localization" of the HL7 V3 standard regulates this procedure.
Demotion means the simplification of a specified data type. Thus, for example, an interval data type such as "IVL<TS>" can be demoted to the simpler "TS".
The localization of messages and models generally happens at the national level via the exchange of so-called CMETs, i.e. in the models, the corresponding CMETs are replaced region-specifically. Thus, for example, the CMET "patient" could be modified in Germany so that further information from the KV card or the electronic health card (eGK), with insurance data, is included.
3.18. V3: Interversion-Compatibility
The HL7 Development Framework (HDF) provides that, in principle, an algorithmic procedure is responsible for deriving the serialization as an XML structure for the implementation of domain models from their abstract representation based on the RIM. Hence, given the same input parameters, the same result should be achieved.
The important thing here is the so-called ITS, the Implementable Technology Specification, which controls the implementation of the models, in this case to XML. The current ITS uses the names of the classes (see the lectures at IHIC 2009 and GMDS 2009).
A fundamental problem with the existing ITS is that the attributes classCode and moodCode determine the semantics and thus the validation of instances. However, this information is implemented in the XML ITS as attributes, so that a validation via the XML schema is generally not possible:
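The consequence is that such rules need an additional, Schematron-like validation pass outside the XML schema. A minimal sketch with simplified element names and a purely illustrative rule (the actual V3 constraints are defined by the RIM and the vocabulary):

```python
import xml.etree.ElementTree as ET

# classCode/moodCode are XML attributes in the current ITS, and XML Schema
# 1.0 cannot make the allowed content model depend on attribute values, so
# a rule like the one below cannot be expressed in the schema itself.
instance = ET.fromstring(
    '<observation classCode="OBS" moodCode="RQO">'
    '<value>7.2</value></observation>')

def check_mood(elem):
    """Illustrative rule: an observation in request mood (RQO) should not
    already carry a result value."""
    errors = []
    if elem.get("moodCode") == "RQO" and elem.find("value") is not None:
        errors.append("observation in RQO mood must not contain a value")
    return errors

print(check_mood(instance))  # reports one violation
```

Such attribute-dependent rules are exactly what the schema-based validation of the current XML ITS cannot capture.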
3.19. V3: Mixture of different versions
As stated above, version 3 is developed based on the RIM. As explained in this work, the RIM alone is not decisive: there are also the data types and the vocabularies, so that in an implementation a combination of at least these three aspects is relevant.
From 2010 on, this will change, because then the data types will be available both in Release 1.1 and Release 2, as well as alternative ITSs. This will further increase the cost of the various implementations, unless a generic implementation approach was taken from the beginning.
3.20. CDA - Clinical Document Architecture
The "Clinical Document Architecture" was originally developed to exchange arbitrary medical/clinical documents. Due to the migration scenario, besides the metadata the text itself is in the foreground. Every CDA document MUST contain text to preserve human readability, and it does not matter whether this text was created manually or generated by a program.
An important role - and also one of the biggest advantages of CDA - are the different levels:
Recent proposals try to realize a further subdivision. For instance, there could be a level 0 containing only embedded objects, i.e. binary data that cannot be evaluated. (IHE ITI XDS-SD would be such a specification.)
The first release of CDA was published in 1999. At that time, the idea was announced to define additional levels in order to be able to exchange structured data, but this never happened. At that time, SCIPHOX was founded in Germany to establish cross-sectoral communication based on XML.
The second release, published in 2005, is now fully based on version 3. Likewise, its data types were adopted.
The third release is currently in preparation. All unimplemented requirements and additional ideas are being considered, including support for the HL7 V3 data types Release 2.
It is also currently being considered whether an alternative XML schema ("Green CDA") can be defined, which would make implementation easier for vendors.
3.20.3 SDA - Structured Document Architecture
Another development of a very different kind is the realization of the Structured Document Architecture, which increases the abstraction: CDA is then "only" a special version (constraint) of SDA. Another specialization example would be SPL, the Structured Product Labeling.
3.21. Conformance Tests
To improve interoperability, there are three ways of testing:
The first option is already used in the IHE Connect-a-thon. Here, it is checked whether a system in a given communication scenario can process the data another system sends to it. (A comparison of the transmitted messages is mostly omitted due to lack of time.) The latter shows the pragmatic approach of IHE: it is important to demonstrate successful communication between the different systems. Corrections of the implementation are allowed on site. The result of the test is then archived for trade shows and demonstrations. This pragmatic approach also implies that in this way no software is created that can be purchased officially by customers. (It cannot be distinguished whether a vendor tests a very rudimentary prototype or an almost finished product. This could be investigated by comparing the results of past tests, but that is very tedious and does not guarantee the product maturity of the tested software.)
In the second variant, a vendor must submit many (several thousand) messages, a representative sample (several hundred) of which is then checked automatically. This form is favored by AHML [AHML]. The test is geared to generic (syntactic) rules, which are sometimes not without controversy: for example, a simple data type is checked for a second component.
As a test of this nature can be conducted only against constrainable profiles, many error cases remain undetected. On the other hand, such an examination can be provided cost-effectively, because no specifics must be considered.
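A check of this syntactic kind can be sketched as follows; the classification of fields as simple types is illustrative:

```python
# Illustrative rule of the AHML kind: a field declared with a simple
# (non-composite) data type must not contain a component separator.
SIMPLE_FIELDS = {("PID", 1): "SI"}   # e.g. PID-1 "Set ID" is a simple type

def check_segment(segment: str, field_sep="|", comp_sep="^"):
    fields = segment.split(field_sep)
    name = fields[0]
    errors = []
    for i, value in enumerate(fields[1:], start=1):
        if (name, i) in SIMPLE_FIELDS and comp_sep in value:
            errors.append(f"{name}-{i}: simple type must not have components")
    return errors

print(check_segment("PID|1^extra|123"))  # reports a finding for PID-1
print(check_segment("PID|1|123"))        # no findings
```

Such rules are cheap to automate precisely because they only look at the generic syntax, which is also why they leave profile-specific errors undetected.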
The third variant is the so-called certification of interfaces [ZTG]. A vendor has to submit a specification that is a valid restriction of a constrainable message profile. In the first step of this certification, it is checked whether this specification is indeed a valid restriction. In a second step, the vendor must then prove that its own software conforms to this specification.
Because of the (currently) high degree of manual audit work (examination of the specification, configuration of the system, input of data from the given test scenario), this is an expensive variant. However, in this way most of the problems are discovered.
These three variants span an orthogonal space:
It is undisputed that a test such as the IHE Connect-a-thon has practical value. No wonder that approx. 80 vendors with more than 100 systems participate in the Connect-a-thons, with numbers increasing. (However, it should be noted that after several successful attendances, vendors are only conditionally inclined to repeat the tests, so that a decrease is recorded for the older profiles.)
With certification, however, vendors face a dilemma: most interfaces are configurable and can be adapted to different requirements. To meet the requirements for certification and to simplify the test, a number of optional features are configured "away". Since the interface specification is part of the certification process, including a publication of the results, fewer features are published accordingly.
The aim for HL7 must be to achieve a commitment to publish the interface specifications (the so-called conformance statements) in order to declare HL7 conformance. The certification process brings one closer to this goal indirectly.
The basis for the process model for HL7 V3 described above was developed by Kai Heitmann; the author added the conformance claims for implementation and certification.
Last Update: March 22, 2010