Why are hierarchical databases like MUMPS still popular in Healthcare?

For my class on Healthcare IT Infrastructure at GGU (ITM 351) I had to explain to my students why hierarchical databases are not only still popular but also practical, and show their limitations at the same time. It is amazing how little material there is comparing hierarchical and relational models, maybe because M is not used much outside of Healthcare and computer science students only learn about RDBMS. But Healthcare is different. From Meditech to Epic, many EMR systems still use MUMPS (also known as M), and for good reason. Here is an excerpt from my GGU ITM 351 class:

Because hierarchical database models continue to play a very important role in Healthcare IT, we need to review this aspect a little bit more. First, let me explain what a normalized RDBMS is. Following Codd (1971), the pioneer of modern database technology, a database is normalized when it is at least in third normal form. Third normal form is required to prevent update anomalies and data inconsistency issues.

In figure 1 you find a very simple normalized relational database model:

Figure 1 Relational Model

In this example we have three entities (Patient, Encounter, and Procedure) that reflect a core concept of medical records. Every time a patient enters a hospital, a new Encounter is created. One Patient can have multiple Encounters over a period of time, so this is a one-to-many relationship. Each Encounter will require one or more Procedures, so again we have a one-to-many relationship, and each of the Procedures can require one or many orders. These are all parent-child relationships, all one-to-many, and hence perfect for a hierarchical model. But in a relational model it is necessary to normalize to the third normal form (3NF), and in order to do this, the data architect first needs to satisfy the requirements of the first normal form (1NF):

  • No repeating groups within or across columns

That means if a patient can have multiple phone numbers or multiple encounters, we cannot store these in the same table. Multiple phone numbers in one column would create problems when a program needs to update a single phone number, because a query would return the whole content of the field, which would be all of the phone numbers rather than one in particular. Having multiple columns of phone numbers would also violate 1NF, and it is impractical besides: we could have a lot of empty columns (e.g. if a patient has only one number but we anticipate up to three), or, if a patient has more phone numbers than we anticipated, we would not have enough. That is why we create a new table called “Encounter” or “Phone Number”. We use MRN, the primary key of our Patient table, as a foreign key in the Encounter table (that establishes the link between the two tables), and we create an Encounter ID. Encounter ID and MRN together identify Procedures, and we can add as many procedures as we like for each encounter, and as many encounters as we like, without ever violating 1NF.

Second normal form (2NF) requires:

  • All non-key attributes must be fully dependent on the key

The encounter table meets this requirement, because the table has three attributes (MRN, Encounter-ID, and Procedure Code). The non-key attribute “Procedure Code” is fully dependent on the composite key. Why do we need a composite key? One patient can have multiple Encounters, so I could have the same patient (= same MRN) with multiple encounters, and in each of those encounters a different set of procedures.

Third normal form (3NF) requires:

  • No functional dependencies between non-key attributes (that is, no non-key attribute depends on another non-key attribute)

Orders are dependent and specific to each encounter, so we could not have orders and encounters in one table with MRN and Encounter ID as key. So the above simple diagram is a database schema in 3NF.
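The normalized model described above can be sketched in code. What follows is only a minimal illustration using Python's built-in sqlite3 module. Figure 1 is not reproduced here, so the exact table and column names (Phone_Numbers, Encounter_ID, Procedure_Code) and all sample data are assumptions made for this example.

```python
import sqlite3

# Hypothetical schema following the text; names and sample data are illustrative.
SCHEMA = """
CREATE TABLE Patient (
    MRN   TEXT PRIMARY KEY,
    FNAME TEXT NOT NULL,
    LNAME TEXT NOT NULL
);
CREATE TABLE Phone_Numbers (              -- repeating group moved to its own table (1NF)
    MRN          TEXT NOT NULL REFERENCES Patient(MRN),
    Phone_Number TEXT NOT NULL,
    PRIMARY KEY (MRN, Phone_Number)
);
CREATE TABLE Encounter (                  -- one patient, many encounters
    MRN          TEXT NOT NULL REFERENCES Patient(MRN),
    Encounter_ID INTEGER NOT NULL,
    PRIMARY KEY (MRN, Encounter_ID)
);
CREATE TABLE Procedure (                  -- one encounter, many procedures
    MRN            TEXT NOT NULL,
    Encounter_ID   INTEGER NOT NULL,
    Procedure_Code TEXT NOT NULL,
    PRIMARY KEY (MRN, Encounter_ID, Procedure_Code),
    FOREIGN KEY (MRN, Encounter_ID) REFERENCES Encounter(MRN, Encounter_ID)
);
"""


def build_demo_db():
    """Create an in-memory database with one patient, two phones, two encounters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Patient VALUES ('1001', 'Ann', 'Lee')")
    conn.executemany("INSERT INTO Phone_Numbers VALUES (?, ?)",
                     [("1001", "555-0100"), ("1001", "555-0101")])
    conn.executemany("INSERT INTO Encounter VALUES (?, ?)",
                     [("1001", 1), ("1001", 2)])
    conn.executemany("INSERT INTO Procedure VALUES (?, ?, ?)",
                     [("1001", 1, "47.19"), ("1001", 2, "88.01")])
    conn.commit()
    return conn


def phone_numbers(conn, mrn):
    """Return all phone numbers of one patient, one row per number."""
    rows = conn.execute(
        "SELECT Phone_Number FROM Phone_Numbers WHERE MRN = ? ORDER BY Phone_Number",
        (mrn,),
    )
    return [row[0] for row in rows]
```

Because every phone number is its own row, a single number can be updated or deleted without touching the others, and a procedure cannot be stored for an encounter that does not exist.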

Now let us look at the same data organized hierarchically:

Figure 2 Hierarchical Data Model

In a hierarchical data model we have one restriction, which is that we can only model one-to-many relationships. But in the previous normalization exercise we discovered that this applies to all of our relationships. One patient -> many encounters. One encounter -> many orders. And when you think about it, medical records are always organized like this: they are logically hierarchical tree structures, which lend themselves to database models like MUMPS, which, not coincidentally, was developed in Healthcare. The other advantage of hierarchical databases is that they do not have to be in 1NF. I could list multiple encounters and multiple phone numbers in the Patient table and then link from there to the child, so for example encounter 1 links to a table with details about encounter 1, which contains many orders, etc.
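To make the tree structure concrete, here is a minimal sketch in Python using nested dictionaries. It only loosely mirrors the way an M system stores data in hierarchical arrays, and the field names and sample values are invented for illustration.

```python
# Hypothetical patient tree: one patient -> many encounters -> many orders.
patients = {
    "1001": {
        "name": {"first": "Ann", "last": "Lee"},
        "phones": ["555-0100", "555-0101"],  # repeating group, deliberately not in 1NF
        "encounters": {
            1: {"procedures": ["47.19"], "orders": ["CBC", "Abdominal CT"]},
            2: {"procedures": ["88.01"], "orders": ["Wound check"]},
        },
    },
}


def phone_numbers(mrn):
    """Look up phone numbers by walking down one branch; no join is needed."""
    return patients[mrn]["phones"]


def orders_for_encounter(mrn, encounter_id):
    """Follow the parent-child path patient -> encounter -> orders."""
    return patients[mrn]["encounters"][encounter_id]["orders"]
```

Everything that belongs to a patient hangs off that patient's node, so the typical clinical lookup is a short walk down one branch.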

Because simple operations, like looking up a phone number, require costly table joins in a relational model, some database designers purposely violate 3NF and build some redundancy into their database schema for the benefit of efficiency.

What is more efficient – RDBMS or M?

The Codd model of an RDBMS is very elegant, and a great implementation of mathematical set theory, which allows us to relate data in ways that were not necessarily predefined. If I would like to know all patients in the hospital that have had an appendectomy, I could simply formulate a query such as:

SELECT MRN FROM Procedure WHERE Procedure_Code = '47.19';

And if I wanted to know the Name of patients that had had appendectomies, my SQL query would be:

SELECT FNAME, LNAME FROM Patient WHERE MRN IN (SELECT MRN FROM Procedure WHERE Procedure_Code = '47.19');

Relating two tables on a common field like this (in this case MRN) is in effect a JOIN operation, which is great for all kinds of queries. It is the reason why RDBMS are so popular. But at the same time, every time I want to know a simple thing like which phone numbers a patient has, I have to use a join as well:

SELECT Patient.FNAME, Patient.LNAME, Phone_Numbers.Phone_Number FROM Patient JOIN Phone_Numbers ON Patient.MRN = Phone_Numbers.MRN;
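These queries can be tried out against a small in-memory SQLite database. The sketch below writes them in a form SQLite accepts (IN for the subquery, an explicit JOIN ... ON for the phone lookup). The column name Procedure_Code, the pared-down tables, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patient (MRN TEXT PRIMARY KEY, FNAME TEXT, LNAME TEXT);
CREATE TABLE Phone_Numbers (MRN TEXT, Phone_Number TEXT);
CREATE TABLE Procedure (MRN TEXT, Encounter_ID INTEGER, Procedure_Code TEXT);
INSERT INTO Patient VALUES ('1001', 'Ann', 'Lee'), ('1002', 'Bob', 'Ray');
INSERT INTO Phone_Numbers VALUES
    ('1001', '555-0100'), ('1001', '555-0101'), ('1002', '555-0200');
INSERT INTO Procedure VALUES ('1001', 1, '47.19'), ('1002', 1, '88.01');
""")

# Names of all patients with the procedure code used in the text.
appendectomy_names = conn.execute(
    "SELECT FNAME, LNAME FROM Patient "
    "WHERE MRN IN (SELECT MRN FROM Procedure WHERE Procedure_Code = '47.19')"
).fetchall()

# Even a simple phone lookup has to join two tables on MRN.
phone_rows = conn.execute(
    "SELECT Patient.FNAME, Patient.LNAME, Phone_Numbers.Phone_Number "
    "FROM Patient JOIN Phone_Numbers ON Patient.MRN = Phone_Numbers.MRN "
    "WHERE Patient.MRN = '1001' "
    "ORDER BY Phone_Numbers.Phone_Number"
).fetchall()
```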

Now, joins can be computationally very costly. Conceptually, a join first brings all data elements together into an intermediary table and then applies the join condition or constraint (in this case matching MRNs). If a join is carried out this way on a database table with 500,000 rows and another table of 500,000 rows, the intermediary table will have 500,000 x 500,000 = 250 billion entries before the constraints are applied. Contemporary database systems use indexes and smarter join algorithms to avoid most of this work, so it is no longer a big deal, but ten, twenty or forty years ago it certainly was. And still today, RDBMS response time can be an issue if the database schema has not been designed with efficiency in mind. So while a relational model is very elegant and allows all kinds of queries, a hierarchical model is very efficient BECAUSE of redundancies.
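The arithmetic behind this cost can be illustrated with a small sketch. It counts the comparisons a naive join makes when every row of one table is checked against every row of the other. The hash-based variant is not described in the text above. It is included only as a point of comparison, since indexing one side first is a common way to avoid building the full intermediary table. Table contents and function names are made up for the example.

```python
def nested_loop_join(left, right):
    """Naive join: compare every left row with every right row."""
    comparisons = 0
    result = []
    for left_mrn, name in left:
        for right_mrn, phone in right:
            comparisons += 1
            if left_mrn == right_mrn:
                result.append((name, phone))
    return result, comparisons


def hash_join(left, right):
    """Index one side by MRN first, then probe the index once per left row."""
    by_mrn = {}
    for right_mrn, phone in right:
        by_mrn.setdefault(right_mrn, []).append(phone)
    steps = len(right)  # one step per row to build the index
    result = []
    for left_mrn, name in left:
        steps += 1      # one probe per left row
        for phone in by_mrn.get(left_mrn, []):
            result.append((name, phone))
    return result, steps


# 1,000 patients with one phone number each (hypothetical data).
patients = [(mrn, f"Patient {mrn}") for mrn in range(1000)]
phones = [(mrn, f"555-{mrn:04d}") for mrn in range(1000)]

naive_rows, naive_cost = nested_loop_join(patients, phones)
hashed_rows, hashed_cost = hash_join(patients, phones)
```

With 1,000 rows on each side the naive join makes 1,000,000 comparisons, while the indexed variant needs 2,000 steps for the same result. The gap widens quadratically as the tables grow.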

Downside of hierarchical models

It is hard to believe, but it is difficult for a hospital using a hierarchical database (and that is the majority of hospitals in the US) to answer a simple question like “how many patients do you currently have with an H1N1 diagnosis?” Hierarchical databases are built to structure data hierarchically, so if I look up a patient X, I can most certainly find out if he has a diagnosis of “y”. But if I want to know all patients that have a diagnosis of “y”, which would be an unusual query for a regular hospital process but not so unusual for public health purposes, I would have to look up each and every patient tree to retrieve that information.
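The difference between the two kinds of question can be shown with a toy patient tree in Python. The structure, field names, and diagnoses are invented for illustration and do not reflect any particular EMR.

```python
# Hypothetical hierarchical store: patient -> encounters -> diagnoses.
patients = {
    "1001": {"encounters": {1: {"diagnoses": ["H1N1"]}, 2: {"diagnoses": ["Asthma"]}}},
    "1002": {"encounters": {1: {"diagnoses": ["Fracture"]}}},
    "1003": {"encounters": {1: {"diagnoses": ["H1N1", "Pneumonia"]}}},
}


def has_diagnosis(mrn, diagnosis):
    """Per-patient question: only one patient's tree is walked."""
    encounters = patients[mrn]["encounters"].values()
    return any(diagnosis in encounter["diagnoses"] for encounter in encounters)


def patients_with_diagnosis(diagnosis):
    """Population question: every patient tree has to be visited."""
    visited = 0
    matches = []
    for mrn in patients:
        visited += 1
        if has_diagnosis(mrn, diagnosis):
            matches.append(mrn)
    return matches, visited
```

The per-patient lookup touches one branch, while the population query visits all three trees here, and would visit every patient in a real system.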

For interoperability, the standard defined by the Department of Health and Human Services is the Continuity of Care Document (CCD) or the Continuity of Care Record (CCR). Both CCD and CCR are XML schemas, which are built hierarchically. But XML uses metadata tags, so the query for diagnosis Y is a little easier, because I can look for a particular metadata tag and select only entries where that tag is non-empty. Still, the system would have to parse all records and then count the ones that match.
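As a rough sketch of this tag-based approach, the following Python example parses a handful of XML records and counts those that contain a given diagnosis. The record layout is a drastically simplified stand-in invented for this example. It is not the actual CCD or CCR schema.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical records; real CCD/CCR documents are far more elaborate.
RECORDS = [
    "<record><patient mrn='1001'/><diagnosis>H1N1</diagnosis></record>",
    "<record><patient mrn='1002'/><diagnosis></diagnosis></record>",
    "<record><patient mrn='1003'/><diagnosis>H1N1</diagnosis>"
    "<diagnosis>Pneumonia</diagnosis></record>",
]


def count_records_with_diagnosis(xml_documents, diagnosis):
    """Parse every document and count those with a matching diagnosis tag."""
    count = 0
    for document in xml_documents:
        root = ET.fromstring(document)
        # Empty tags have text None, so normalize them to an empty string.
        values = [(element.text or "").strip() for element in root.iter("diagnosis")]
        if diagnosis in values:
            count += 1
    return count
```

The tag makes each record easy to filter, but the loop still has to parse every document before it can count the matches.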

Another common way to overcome the disadvantage of a hierarchical model for non-standard queries is to load the data into data warehouses or data marts. This approach is becoming very common for quality reporting and public health requirements, which are part of the ARRA/HITECH Meaningful Use guidelines and obviously represent a challenge for all the EMR systems based on MUMPS.

Codd, E.F. (1971), “Further Normalization of the Data Base Relational Model.” (Presented at Courant Computer Science Symposia Series 6, “Data Base Systems,” New York City, May 24th-25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.


Building an interoperable Health Information Exchange

The federal government declared in the American Recovery and Reinvestment Act (ARRA) of 2009 [1] its intent to fund the significant investments necessary to build an interconnected health information exchange (HIE) in the U.S., with the goal of quality improvement and cost containment [7]. In recent years there have been many attempts to build regional HIEs, often called RHIOs, but most of these RHIOs failed after they exhausted their initial government funding [2]. Reasons for RHIO failure were both economic, with unsound business plans and monetization models [6], and technical. Learning from the failures of the last decade, a successful approach must both provide a sound financial model for all participating parties and incorporate proven components into a scalable, secure, extensible and standards-oriented architecture. This essay describes at a high level some of the critical components required for building an interoperable HIE.

HIEs allow the exchange of health data between various organizations and thus between different information systems. Given the sensitive nature of health care information, data privacy has to be maintained throughout such a federated information system in compliance with HIPAA, which requires an auditable log of every passive or active data access. In order to fulfill both the regulatory and functional requirements, the following core elements are required:

– Communication adapters that allow data extraction from existing legacy applications such as Electronic Medical Record systems (Providers) or Claims Data Repositories (Payers), usually implemented in a service-oriented architecture (SOA). This is achieved by tagging data elements in feeder systems against a common data standard. A template for a common data architecture is HL7 CDA2 [4]. Through HITSP, ANSI developed specific workflow profiles for common tasks in the provider environment [3]. With peer-to-peer communication, complexity and the number of required interfaces would grow on the order of n², in which n represents the number of information systems connected to an HIE. In the approach of mapping against a common data template, the interface requirement is reduced to n, a significant reduction in complexity. If the HIE is implemented in multiple loci, interfaces can be re-used, further reducing complexity. Nevertheless, given that most current EMR implementations are proprietary and do not adhere to a standardized architecture, tagging data elements in proprietary architectures does represent a substantial technical and financial challenge in the creation of an HIE.

– Master Data Management (MDM) systems that allow identification of unique person profiles across multiple information systems, even in the absence of a single, unique identifier. While some countries do have such identifiers, in the U.S. the use of the social security number is not permitted. However, identifying data belonging to the same person across multiple systems is absolutely crucial for both patient safety and cost containment purposes. Popular systems like Initiate or Quadramed are proprietary in nature and create vendor dependency. With Mural, a generic, open source technology is available, although it lacks healthcare-specific adapters. Since communication adapters are essential for the entire system anyway, as discussed earlier, they could be used to extract person-identifying information and utilize the interpolation capability of the Mural project.

– Record Locator Services track data sources for medical information. The combination of MDM and SOA allows extraction of data related to a specific person from their original record keeping system on demand, when required (ad hoc). In the proposed interoperable HIE the record locator service is implemented in a distributed fashion, thus eliminating a single point of failure. Synchronization of the various distributed record locator services would follow a propagation scheme analogous to that of network routers, which keep routing tables locally without a single point of failure.

– Repositories create data artifacts that are accessible outside of the original record keeping system. This approach is used to create a persistent subset of medical information with emergency information, such as allergies and medications. If the data is constantly updated by trusted sources, it can be used for medical purposes. If it is exclusively or substantially maintained by user input, it is only a consumer directed personal health record (PHR) without clinical application. An interoperable HIE should not contain a repository of all healthcare data, as such an approach would create significant, inherent scalability issues. Every data artifact would have to be constantly checked for accuracy, generating unnecessary information traffic. However, emergency subsets and medical images could and should be kept in a repository in order to achieve high service availability levels with fast response times, while more detailed data is exclusively kept in the original custodial system of record.

– Role management is used to define, across the organizations connected in an HIE, roles that are associated with data access rights, i.e. which types of data can be requested by authorized HIE users, and to what extent. This is an important regulatory requirement, but also a helpful feature to streamline clinical workflow.

– Identity Management (IdM) is used in organizations to create auditable and traceable identities of system users that have certain rights to access or create/update information. It includes access management and single sign-on, but also identity provisioning. While each organization within an HIE might have its own IdM solution, those individual solutions have to be federated in order to allow HIE-wide access and provisioning. Federated systems create a circle of trust, in which access rights and roles migrate with the access request across organizations. Besides the technical implementation, an IdM federation also requires audit logging and role definition across the participating organizations.

– Consent management is an extension to access management that is specific to healthcare privacy concerns. While normally role-based access to information is sufficient, a specific consent management extension implements patient rights to restrict data access further, while propagating and tracking consented access. The new, extended privacy requirements of HIPAA expressed in ARRA 2009 could make consent management mandatory.

– Clinical applications, such as laboratory data viewers, consolidated DICOM viewers, medication records, and clinical decision support systems. While all the aforementioned modules and systems are enablers of an HIE, the clinical applications are the return on investment. From a cost containment point of view, avoidance of redundant procedures is the directly measurable component. Given that imaging procedures, for example, are a very rapidly growing cost factor in health care [5], access to recent imaging can both reduce cost and improve decision making. The same is true for laboratory tests, although the per-procedure savings are smaller by an order of magnitude. Cost savings from redundancy avoidance are a major factor in Walker et al.’s value calculation [8]. Indirect cost savings are achieved by access to medication records, which can reveal medication compliance and help avoid undesired drug-drug interactions. In recent implementations, extending information to citizens has also become a desired feature, be it for prevention or disease management purposes.
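How the components listed above could interact on a single data request is sketched below in Python. This is a toy illustration under stated assumptions, not an implementation of any of the products named: the matching rule (name similarity plus date of birth, threshold 0.85), the role table, the source systems, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical source systems, each with its own local identifiers.
SOURCE_SYSTEMS = {
    "Hospital A EMR": [
        {"local_id": "A-17", "first": "John", "last": "Smith",
         "dob": "1970-01-02", "medications": ["lisinopril"]},
    ],
    "Clinic B EMR": [
        {"local_id": "B-42", "first": "Jon", "last": "Smith",
         "dob": "1970-01-02", "medications": ["metformin"]},
        {"local_id": "B-43", "first": "Maria", "last": "Garcia",
         "dob": "1985-06-30", "medications": ["albuterol"]},
    ],
}

# Hypothetical role definitions shared across the participating organizations.
ROLE_PERMISSIONS = {
    "physician": {"medications", "lab_results", "imaging"},
    "billing_clerk": {"claims"},
}

audit_log = []  # every request is recorded, whether granted or denied


def match_score(a, b):
    """MDM stand-in: fuzzy name similarity combined with an exact birth date check."""
    name_a = f"{a['first']} {a['last']}".lower()
    name_b = f"{b['first']} {b['last']}".lower()
    name_similarity = SequenceMatcher(None, name_a, name_b).ratio()
    dob_match = 1.0 if a["dob"] == b["dob"] else 0.0
    return 0.6 * name_similarity + 0.4 * dob_match


def locate_records(person, threshold=0.85):
    """Record locator stand-in: find matching records in every source system."""
    hits = []
    for system, records in SOURCE_SYSTEMS.items():
        for record in records:
            if match_score(person, record) >= threshold:
                hits.append((system, record))
    return hits


def fetch(user, role, person, data_type, consent_restrictions=frozenset()):
    """Check role and patient consent, log the request, then gather the data."""
    allowed_by_role = data_type in ROLE_PERMISSIONS.get(role, set())
    blocked_by_consent = data_type in consent_restrictions
    granted = allowed_by_role and not blocked_by_consent
    audit_log.append({"user": user, "role": role,
                      "data_type": data_type, "granted": granted})
    if not granted:
        return None
    results = []
    for _system, record in locate_records(person):
        results.extend(record.get(data_type, []))
    return sorted(results)


john = {"first": "John", "last": "Smith", "dob": "1970-01-02"}
medications = fetch("dr_jones", "physician", john, "medications")
```

In this sketch the physician's request pulls medications from both systems even though the patient's first name is spelled differently in each. A billing clerk asking for the same data, or a physician asking for a data type the patient has restricted, gets nothing back, but the request still lands in the audit log.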

It is important to note that working applications for all modules exist, eliminating the need for costly and risky development. However, significant integration effort is required to combine all functional elements into a seamlessly working, secure and scalable information system.

In conclusion, the experience of building RHIOs and HIEs over the past decade has demonstrated the risks and challenges of a complex health data exchange, but it has also yielded components and experience that make it substantially easier today to architect a working HIE. While the technical problem therefore seems manageable, the core issues of existing RHIOs remain financial viability and access to vast amounts of data that are not currently captured electronically. In recent years, payers have begun to address this gap by mining claims data for longitudinal medication and diagnosis information, which is further evidence that both commercial and public payers (such as Medicaid) should be critical stakeholders in any HIE project.
1.    111th Congress of the United States of America. American Recovery and Reinvestment Act of 2009 (ARRA), 2009.
2.    Adler-Milstein, J. and Jha, A. Fledgling firms offer hope on health costs. Harvard Business Review, 86 (3). 26.
3.    American National Standards Institute (ANSI). HITSP – enabling healthcare interoperability. ANSI ed., 2009.
4.    Dolin, R., Alschuler, I., Boyer, S., Beebe, C., Behlen, F., Biron, P. and Shvo, A.S. HL7 clinical document architecture, release 2. Journal of the American Medical Informatics Association, 13 (1). 30.
5.    Levin, D.C. and Rao, V.M. Turf wars in radiology: the overutilization of imaging resulting from self-referral. Journal of the American College of Radiology, 1 (3). 169-172.
6.    Miller, R.H. and Miller, B.S. The Santa Barbara county care data exchange: Lessons learned iHealth reports, California Health Care Foundation, 2007.
7.    Walker, J., Pan, E., Johnston, D., Adler-Milstein, J., Bates, D.W. and Middleton, B. The value of health care information exchange and interoperability. Health Affairs.
8.    Walker, J., Pan, E., Johnston, D., Adler-Milstein, J. et al. The Value Of Health Care Information Exchange And Interoperability. Health Affairs, 24. 10.