Clinical Architecture Healthcare IT Blog


A Primer on the Vocabularies of Meaningful Use – Allergy Vocabulary


PLEASE NOTE: The Meaningful Use Final Rule was released on July 12 and the UNII is no longer listed as the standard for allergy terminology.  In fact, there is NO standard listed for allergy interoperability.  For the record, I do not think that the following blog post, which "aired" on June 30th, influenced the government's decision-making process in any way.  My next post will suggest a significant stimulus for healthcare IT companies with the word 'architecture' in the name... just in case.  In order to preserve history, I am leaving the post as it was.  It still provides a decent overview of UNII for those of you who would like to leverage it.

The vocabulary chosen to represent patient allergies is the FDA Unique Ingredient Identifier or UNII (I guess ‘UII' would be a difficult acronym to use in casual conversation...).

The UNII is part of the Substance Registration System whose purpose is to provide unique identifiers for:

Foods

  • Food substances are specific foods or components of food, regardless of whether the food is in conventional food form or a dietary supplement, such as vitamins, minerals, herbs, or other similar nutritional substances.

Drugs

  • Drug substances include both active and inactive ingredients used in drug products, including those for veterinary purposes.

Biologics

  • Biologic substances include both active and inactive ingredients used in biologics, such as blood products, therapeutic products, vaccines, cellular and gene therapy products, allergenic products, tissues, and certain devices (e.g., enzymes in stabilized solutions).

Devices

  • Device substances include certain components of some devices (e.g. silicon for implants, and chemical reagents for glucose test kits).

Cosmetics

  • Cosmetic substances are components of cosmetic products, such as flavors, fragrances, colorants, vitamins, plant- and animal-derived ingredients, and polymers.

 There is more general information on the UNII here: http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/default.htm

According to the above site, the UNII is:

  • One of the core components of the United States Federal Medication Terminology.
  • Used in the FDA's Structured Product Labeling
  • Used to assist in the generation of the National Library of Medicine's (NLM's) RxNorm.
  • A US government standard for drug ingredient and food allergen identifiers
  • A component of the Environmental Protection Agency's Substance Registry System (future)

The UNII may be found in:

  • NLM's Unified Medical Language System (UMLS)
  • National Cancer Institute's Enterprise Vocabulary Service
  • USP Dictionary of USAN and International Drug Names (future)
  • FDA Data Standards Council website
  • VA National Drug File Reference Terminology (NDF-RT)
  • FDA Inactive Ingredient Query Application

The UNII is provided, rather inconveniently, in Excel format.

There is a multi-worksheet (A-S, T-Z), denormalized, zipped Excel workbook dated 6/25/2010 at the following location:

http://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM217498.zip

The sheets are difficult to work with because they combine the concepts and their synonyms into a single list.  It is also worth noting that, in the data provided, the synonyms do not have unique identifiers.

Sheet Structure

The primary sheets with the UNII codes in them have the following columns:

Preferred substance name

This is the preferred name of the substance

UNII

The unique identifier for the preferred substance name

Substance name

A synonym for the preferred substance name

ITIS TSN

This is not really documented, BUT I believe it is, where applicable, a code representing the USDA Integrated Taxonomic Information System (ITIS) Taxonomic Serial Number (TSN).   This appears to be populated only for food ingredients.

Molecular Formula

This is, you guessed it, the molecular formula.  It seems to only be populated for chemical ingredients.
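Because concepts and synonyms share one list, a first step for most uses is to split them apart. Here is a minimal sketch of that split, assuming a worksheet has already been exported to pipe-delimited text with the column headings described above. The sample rows are made up for illustration (they are not real UNII data), and the synonym numbers are local surrogate keys that I am assigning, not identifiers from the FDA.

```python
import csv
import io

# Made-up rows in the shape of the denormalized sheet (not real UNII data).
SAMPLE = """Preferred substance name|UNII|Substance name
SUBSTANCE A|AAAAAAAAA1|SUBSTANCE A
SUBSTANCE A|AAAAAAAAA1|SUBST-A
SUBSTANCE B|BBBBBBBBB2|SUBSTANCE B
"""


def normalize(rows):
    """Split denormalized rows into a concept table and a synonym table."""
    concepts = {}   # UNII -> preferred substance name
    synonyms = []   # (local synonym id, UNII, synonym text)
    seen = set()    # (UNII, synonym text) pairs already emitted
    for row in rows:
        unii = row["UNII"].strip()
        concepts.setdefault(unii, row["Preferred substance name"].strip())
        name = row["Substance name"].strip()
        if (unii, name) not in seen:
            seen.add((unii, name))
            # The source has no synonym identifiers, so assign a local one.
            synonyms.append((len(synonyms) + 1, unii, name))
    return concepts, synonyms


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="|")
concepts, synonyms = normalize(reader)
```

The local synonym ids are only stable within one load; they would need to be managed separately if you want them to survive the next release of the workbook.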

UNII Sheet

Code structure and design

The UNII code is a ten-character alphanumeric code.  The first nine characters are randomly generated and the tenth is determined by an algorithm (a check digit for you old timers who wrote serial port interfaces...).
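If you are loading UNII codes from outside sources, a cheap shape check catches obvious garbage. This is only a sketch: it tests for ten alphanumeric characters, and it assumes the letters are upper case. It does NOT verify the tenth character, because the check algorithm is not described in this post and I do not want to guess at it. The function name is my own.

```python
import re

# Ten characters, each a digit or an upper-case letter (case is an assumption).
_UNII_SHAPE = re.compile(r"[0-9A-Z]{10}")


def looks_like_unii(code: str) -> bool:
    """Return True if the value has the shape of a UNII code.

    This does not validate the check character in position ten.
    """
    return _UNII_SHAPE.fullmatch(code.strip().upper()) is not None
```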

 The Numbers

There are 16,655 unique UNII concepts in the provided list.

There are 67,715 synonyms, including the preferred names.

What's missing?

UNII Type:             

We know that the scope of the UNIIs covers a number of types of substances.  It would be very useful if there was a way of telling which UNIIs are of which type so that we could filter them.  I may not want to include cosmetics OR biologics in my allergy pick list, for example.

Allergies:              

Most systems that track allergies, medication allergies in particular, allow the user to represent allergies using medication ingredients, common brand names OR allergy classes.  The UNII scope only covers one of these.  How will we use UNII to represent a documented allergy or adverse reaction to 'Nyquil' or 'cephalosporins'?  Also, if you are going to represent allergies, should the list include animals and environmental allergies?

Not a Rant

I don't want to get off on a rant here... but it seems like for some of these meaningful use terminologies, rather than creating a terminology designed to support appropriate interoperability, we looked to see what we already had lying around.  UNII is not an allergy terminology, it is a substance terminology. They are not the same thing.  They are terminology domains that merely overlap.  I know, creating a terminology is hard but, ahem, 19 billion dollars!  This is not a criticism directed at the UNII codes or the people that maintain them.  It looks like a very thorough substance terminology with a fairly simple design, but it will not support allergy interoperability as it should be supported.  Now, we could change UNII terminology to include allergy classes, animals and environmental terms, but that would make it a less wonderful substance terminology then, wouldn't it?  Perhaps a better approach would be to use UNII in our allergy interoperability terminology, in the utopian future, to represent substances (with types please) and we could append the other allergy types (classes, animals, environmental) to save money and reduce the deficit.  I could live with that.

 (I will now climb down from my virtual soap box, so that you can come out from behind your furniture...)

To make up for the non-rant, I am happy to provide a normalized version of the most recent UNII data for your experimentation.  It is provided in a zip file as two pipe ('|') delimited text files with the following structure.

UNII File Format

If you would like to receive this file, contact us and ask for it.  We will email it to you or provide you with access to our FTP server. 

I want to thank Bonnie for reminding me that I should do this post.

I will try to post more frequently.

 

A Primer on the Vocabularies of Meaningful Use - Problem Vocabularies

What seems to be the Problem?

International Classification of Diseases (ICD)

International Classification of Diseases is a publication from the World Health Organization (WHO) and it provides a number of vocabularies for expressing disease concepts. 

The history of the ICD is available here: http://www.who.int/classifications/icd/en/HistoryOfICD.pdf

ICD-9-CM

The International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) is based on the World Health Organization's Ninth Revision, International Classification of Diseases (ICD-9).

ICD-9-CM is the official system of assigning codes to diagnoses and procedures associated with hospital utilization in the United States.  

The structure of ICD-9-CM codes is relatively straightforward.  The code itself is an explicit hierarchy with the primary disease characteristic typically represented by the first part of the code and the secondary characteristics grouped in numeric sequence in the second part of the code.

Figure: ICD9 structure

As the example below shows, you should always treat an ICD-9-CM code as text and never as a number.  A numeric interpretation would be a disaster: 403.1 and 403.10 are different codes.

Below are the ICD-9-CM codes representing ‘hypertensive chronic kidney disease':

403             Hypertensive chronic kidney disease

403.0          Hypertensive chronic kidney disease, malignant

403.00        Hypertensive chronic kidney disease, malignant, with chronic kidney disease stage I through stage IV, or unspecified

403.01        Hypertensive chronic kidney disease, malignant, with chronic kidney disease stage V or end stage renal disease

403.1           Hypertensive chronic kidney disease, benign

403.10        Hypertensive chronic kidney disease, benign, with chronic kidney disease stage I through stage IV, or unspecified

403.11        Hypertensive chronic kidney disease, benign, with chronic kidney disease stage V or end stage renal disease

403.9           Hypertensive chronic kidney disease, unspecified

403.90        Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified

403.91        Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage V or end stage renal disease     
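To make the text-versus-number point concrete, here is a small sketch using codes from the list above. It shows how many distinct codes survive a numeric conversion, and how the hierarchy can be walked by trimming the text of the code. The `parent` helper is my own illustration and only relies on the position of the decimal point.

```python
codes = ["403", "403.0", "403.00", "403.01", "403.1", "403.10", "403.11"]

# Converting to numbers merges codes that differ only in trailing zeros.
as_text = set(codes)
as_numbers = {float(code) for code in codes}


def parent(code):
    """Return the next-broader code by dropping the last character.

    Returns None for a bare category such as '403'.
    """
    if "." not in code:
        return None
    trimmed = code[:-1]
    # '403.' becomes '403' once the last digit after the point is gone.
    return trimmed.rstrip(".")
```

Seven codes go in as text and only four distinct values come out as numbers, which is exactly the kind of silent data loss you want to avoid in a patient record.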

Note: This manner of establishing codes is less than ideal.  A smart code is an identifier that implies meaning through its structure.  Typically this manner of establishing codes becomes fraught with issues as a coding scheme becomes more complex over time.  For example, there is not a very good way to express a disease or procedure in ICD-9 if it belongs in more than one place in the hierarchy (poly-hierarchical) without creating duplicate concepts (which is bad).

There are roughly 22,000 ICD-9-CM codes.

The ‘Home Page' of ICD-9-CM is http://www.cdc.gov/nchs/icd/icd9cm.htm

Wikipedia has a fairly robust ICD-9 list and reference capability here: http://en.wikipedia.org/wiki/List_of_ICD-9_codes.

Like all public standards, it is not provided in a format that makes it easy to use.  The downloads that are available via FTP are human-readable rich text files that are not easy to parse into a typical application-consumable vocabulary file.

There are a number of web sites that provide search and lookup tools for ICD-9, but the only source of free coded ICD-9-CM codes (that I have found) is the UMLS Metathesaurus (also not easy...).

If you want easy and well structured you need to pay...

I recommend Ingenix at the following link: http://www.shopingenix.com/Category/100093/Product/16699/

ICD-10-CM

Like ICD-9-CM, ICD-10-CM is based on the World Health Organization ICD-10 coding system.  ICD-10 is designated to replace ICD-9 and is a more granular terminology (actually more like SNOMED-CT).

The structure of ICD-10-CM is different than ICD-9.  The codes are alphanumeric where the initial alpha code delineates the codes into 22 chapters. 

Below are the ICD-10-CM codes representing ‘hypertensive chronic kidney disease':

I120              Hypertensive chronic kidney disease with stage V chronic kidney disease or end stage renal disease

I129              Hypertensive chronic kidney disease with stage I through stage IV chronic kidney disease, or unspecified chronic kidney disease
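The codes above are shown without a decimal point, which is how they often appear in data files. As far as I know, the usual display convention puts the point after the third character (the three-character category), so I120 is displayed as I12.0. Here is a minimal sketch of that conversion; the function name is my own and the input is assumed to be a single code.

```python
def with_decimal(code: str) -> str:
    """Format an ICD-10-CM code with the point after the three-character category."""
    compact = code.strip().upper().replace(".", "")
    if len(compact) <= 3:
        return compact
    return f"{compact[:3]}.{compact[3:]}"
```

As with ICD-9-CM, keep the code as text throughout; the leading letter alone rules out numeric storage.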

 

There are roughly 68,000 ICD-10-CM codes.

The structure of an ICD-10-CM code is depicted below (thanks to the AHIMA website).

There is a good primer on the differences between ICD-9 and ICD-10 on the AHIMA website here: http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_038084.hcsp?dDocName=bok1_038084

The ‘Home Page' of the ICD-10-CM is http://www.cdc.gov/nchs/icd/icd10cm.htm

Wikipedia has a fairly robust ICD-10-CM list and reference capability here: http://en.wikipedia.org/wiki/ICD-10.

RTF was apparently too easy as ICD-10-CM is published as a PDF file...

ICD-10-CM can also be pulled from the UMLS Metathesaurus and purchased in convenient formats from Ingenix.

SNOMED-CT

SNOMED CT (Systematized Nomenclature of Medicine--Clinical Terms) is a comprehensive clinical terminology, originally created by the College of American Pathologists (CAP) and, as of April 2007, owned, maintained, and distributed by the International Health Terminology Standards Development Organization (IHTSDO).

SNOMED-CT codes do not have a hierarchical code, like the ICD vocabularies.  Rather, SNOMED-CT creates meaningless identifiers and relates them to each other in a directed acyclic graph or DAG (which is where the phrase DAG-gumit! originated... I am pretty sure).   This means that any term in the vocabulary can be related to zero-to-many terms, as long as it cannot end up being its own parent.  The relationships themselves are separate from the SNOMED-CT code.  SNOMED-CT also separates the notion of concepts and descriptions (or concept synonyms).
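To illustrate the DAG idea, here is a toy sketch of a poly-hierarchy that stores the relationships separately from the identifiers and refuses any relationship that would make a concept its own ancestor. The class, its method names and the single-letter identifiers are all hypothetical; this is not how the SNOMED-CT release files are structured.

```python
class ConceptGraph:
    """Toy poly-hierarchy: each concept may have zero-to-many parents."""

    def __init__(self):
        self._parents = {}  # concept id -> set of parent concept ids

    def ancestors(self, concept_id):
        """Return every concept reachable by following parent links."""
        found = set()
        stack = list(self._parents.get(concept_id, ()))
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self._parents.get(current, ()))
        return found

    def add_is_a(self, child, parent):
        """Record 'child is-a parent' unless it would create a cycle."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"{child} would become its own ancestor")
        self._parents.setdefault(child, set()).add(parent)


graph = ConceptGraph()
graph.add_is_a("B", "A")
graph.add_is_a("C", "A")
graph.add_is_a("D", "B")  # D has two parents, which a code hierarchy
graph.add_is_a("D", "C")  # like ICD-9-CM cannot express
```

Because the identifiers carry no meaning, adding the second parent for D does not require a new or duplicate code.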

Below are the SNOMED-CT concepts representing the stages of 'chronic kidney disease':

431855005|disorder|Chronic kidney disease stage 1

431856006|disorder|Chronic kidney disease stage 2

433144002|disorder|Chronic kidney disease stage 3

431857002|disorder|Chronic kidney disease stage 4

433146000|disorder|Chronic kidney disease stage 5

 

There are roughly 68,000 active disorder concepts in SNOMED-CT.

I have created a number of posts (and a few screencasts) on SNOMED-CT so I would first direct you to earlier posts in this blog.

The main NLM page for SNOMED-CT is located here: http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html

The SNOMED-CT user's guide is downloadable here: http://www.ihtsdo.org/fileadmin/user_upload/Docs_01/SNOMED_CT/About_SNOMED_CT/Use_of_SNOMED_CT/SNOMED_CT_User_Guide_20090731.pdf

The main CAP page for SNOMED-CT is located here: http://www.cap.org/apps/cap.portal?_nfpb=true&_pageLabel=snomed_page

The Wikipedia page for SNOMED-CT is located here: http://en.wikipedia.org/wiki/SNOMED_CT

You can download SNOMED-CT release files from the NLM site here: http://www.nlm.nih.gov/research/umls/licensedcontent/snomedctfiles.html

Note:  to download NLM data files, like SNOMED-CT, you need to register and obtain a license from the NLM.  You can do that here http://wwwcf.nlm.nih.gov/umlslicense/snomed/license.cfm

The next post will cover procedure terminologies.

A Primer on the Vocabularies of Meaningful Use - Introduction

I recently had a request to create a post providing a primer on the vocabularies of meaningful use.  Let's start with a review of the vocabularies that are named in the meaningful use criteria described on pages 21 and 22 of the January 13th release of the federal register located here: http://edocket.access.gpo.gov/2010/pdf/e9-31216.pdf.

The "Chosen Ones"

The listed vocabularies and their purpose are as follows:

Terminology                       Stage      Purpose(s)

ICD-9-CM                          Stage 1    Problems
ICD-9-CM                          Stage 1    Procedures
ICD-10-CM                         Stage 2    Problems
ICD-10-PCS                        Stage 2    Procedures
SNOMED-CT                         Stage 1    Problems
SNOMED-CT                         Stage 2    Problems
SNOMED-CT                         Stage 2    Lab Results (Submission Public Health)
CPT-4                             Stage 1    Procedures
CPT-4                             Stage 2    Procedures
Third Party Drug Vocabularies*    Stage 1    Medications
Third Party Drug Vocabularies*    Stage 1    Electronic Prescribing
RxNorm                            Stage 2    Medications
RxNorm                            Stage 2    Electronic Prescribing
UNII                              Stage 2    Medication Allergies
CVX                               Stage 1    Immunization Registries
CVX                               Stage 2    Immunization Registries
LOINC                             Stage 1    Lab Orders (from reference labs)
LOINC                             Stage 1    Lab Results (from reference labs)
LOINC                             Stage 2    Lab Orders (All)
LOINC                             Stage 2    Lab Results (All)
UCUM                              Stage 2    Units of Measure
CDA template                      Stage 2    Vital Signs

* Third Party drug vocabularies that are listed as complete in RxNorm by the NLM

Meaningful Selection

What does it mean that a vocabulary is one of the chosen ones?  My understanding (based on my reading of the federal register) is that the meaningful use criteria require certified EHR technologies to provide patient summaries and interoperate (exchange data) using the listed vocabularies for their defined purposes.  In other words, the vocabulary standards are for interoperability, not native persistence in the EHR application.

It is not reasonable to expect that every hospital / physician's office in the US will migrate their patient data to these standards (and then do it again for 2013).  As a good application architect, your objective is to determine how you will be able to express your client's patient information in the anointed vocabularies.

Where to Learn More

There is a lot I can say about these vocabularies, both their suitability to the tasks that have been so capriciously assigned to them and the challenges associated with working with each of them.  This is not the post for that particular diatribe.  In this post, I will try to give you some high level information and some places to find out more.  So strap on your learning caps and practice your right click 'open in new tab' skills.

The next post will cover the problem vocabularies.

Mapping SNOMED-CT to ICD9 Screencast


I have completed a new screencast that provides a 17-minute overview of how to leverage the ICD9 cross map files in the SNOMED-CT optional download.  Grab a sandwich and MS Access, curl up by the monitor and see how this great resource from the NLM can save you some time and effort mapping SNOMED-CT terms to ICD9.

Follow this link to the resource page to access the screencast.

I am always looking for new ideas for screencasts.  If you have one, email me and let me know.

Have a great week!

SNOMED-CT Essentials and CORE subset screen cast


I had a request to do a screen cast on the new SNOMED-CT CORE subset.  In order to do it justice, I decided to provide an overview of the SNOMED-CT Essentials terminology and the CORE subset together to provide context for people that are new to both.

Follow this link to check it out.  It is also available on the Resources page.

It is about 20 minutes and I tried a new approach.  I hope you find it useful.

Please do not hesitate to contact me if you have any feedback or suggestions for future screen casts.

Basic Interoperability with RxNorm


As we here at Clinical Architecture have been developing our Symedical product, I have had the pleasure of spending some quality time with UMLS Metathesaurus and RxNorm.  As I was going through this journey of discovery, I thought to myself that it might be a good idea to share my findings and experiences with others who might also be looking into using RxNorm and UMLS to enhance or improve their clinical interoperability.

So, to that end, I have created the first in what I hope will be a series of Screen Casts that provide some insight into UMLS and RxNorm.

(Video should work in all browsers but requires the Quicktime plug-in)

Basic Clinical Interoperability with RxNorm - Screencast

Special thanks to Jan Willis and the other folks at NLM for their feedback.

What are the characteristics of a good terminology?


When looking into what makes a good terminology, I would be remiss if I did not mention Dr. James Cimino's ‘Desiderata for Controlled Medical Vocabularies in the 21st Century'.  Dr. Cimino's body of work is very enlightening and this particular publication started me on my personal journey into medical informatics.

The characteristics represented in this post stem from what I learned from Dr. Cimino and other mentors that I have had the privilege of working with directly (you know who you are) as well as  my personal experience, ideas and pragmatic tendencies.

For the purpose of this post, a terminology is a set of terms with identifiers.  The terms collectively are designed to model some facet of a particular domain.  This could be a terminology of medications, ingredients, routes of administration, lab tests, units of measure, species, electronic parts, legumes or even Pokemon.

Basics

There are some basic characteristics that you should always look for in a terminology:

Characteristic #1 - Unique Identifiers

The identifier for a given term should be unique.  The same identifier should never represent two different concepts in the terminology, ever.

Characteristic #2 - Stable Identifiers

The identifier for a given term should be persistent.  Regardless of the status of the term the identifier should stick around and never be re-used to represent another term. (see Characteristic #1 - and YES, I am looking at you National Drug Code...)

Note: Characteristics #1 and #2 are important because identifiers get stored in electronic records and when that data is accessed later, the electronic record is invalid if the identifier is gone or the meaning has changed.

Characteristic #3 - Dumb Identifiers

An identifier itself should not have meaning.  If an identifier is comprised of other identifiers that have been combined, then the composite identifier is inherently unstable.  If the circumstances that related the composite identifiers together in the first place change, the resulting identifier must also change.  For example, drug identifiers with smart numbers based on therapeutic classes become unstable when a class splits or the drug is assigned to another class.  This also can become a problem if part of the composite key outgrows its original bounds and effectively breaks the parse-able nature of the composite keys.  

The term 'dumb number' was created as a counterpoint to 'smart numbers' or numbers with built-in meaning.    I changed number to identifier, because I am not convinced that identifiers need to be numbers. Dr. Cimino has a great name for them: 'non-semantic identifiers'.

Characteristic #4 - Coverage

The terminology should adequately cover the domain it is meant to model.  If the terminology does not have enough terms the consumer will find themselves wanting or worse... free texting.

Well Managed Terms

There are several things that you should look for to determine if the terms are well managed or have junk DNA that can pollute the terminology.

Characteristic #5 - Concept Orientation

This is a notion that is described in Dr. Cimino's Desiderata very well.  It means that "terms must correspond to at least one meaning ("nonvagueness") and no more than one meaning ("nonambiguity"), and that meanings correspond to no more than one term ("nonredundancy")".  In other words, a term should represent a concept and that concept should only be represented once as an active term in the terminology.  If you lose concept orientation, you end up with a pick list where a term is repeated or a term that represents a concept broader than the scope of the terminology.   If the concept orientation is not well managed in a terminology, it will look a mess.  There will be repetition of terms, or worse, terms that don't make sense (for example, an ingredient terminology with the term ‘Powder')
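A crude first check for lost concept orientation is to look for the same term text sitting under more than one identifier. Here is a minimal sketch, assuming the terminology can be read as (identifier, term) pairs; the function name and the sample identifiers are made up. It only catches lexical repeats after trimming and case folding. True redundancy between differently worded synonyms still needs a human reviewer.

```python
from collections import defaultdict


def redundant_terms(terms):
    """Return term text that appears under more than one identifier.

    `terms` is an iterable of (identifier, term text) pairs.
    """
    ids_by_text = defaultdict(set)
    for identifier, text in terms:
        # Fold case and collapse whitespace so trivial variants match.
        key = " ".join(text.lower().split())
        ids_by_text[key].add(identifier)
    return {text: ids for text, ids in ids_by_text.items() if len(ids) > 1}


sample = [("T1", "Aspirin"), ("T2", "aspirin "), ("T3", "Ibuprofen")]
repeats = redundant_terms(sample)
```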

Characteristic #6 - The Controlled Terminology should be controlled.

This states that a terminology should have a focus, and it should stay true to that focus.  All too often the keepers of a terminology may be tempted to introduce a term into the set that is not an appropriate concept for the terminology's domain, but it serves some other purpose.  On Sesame Street there was a segment called 'One of these things (is not like the others)'.  This is the way you feel when working with a vocabulary that is not well controlled.  You come across terms that don't quite fit.   This is especially prevalent in older terminologies and hopefully they have some indicator, or classification, that can help you navigate around them (because it can be hard even for a monster - click the link...).

Characteristic #7 - Consistent Term Structure

The terms themselves should have a consistent structure.  When dealing with a term that describes a granular concept, like a dispensable medication, the lexical components that make up a term should have the same ordinal pattern from term to term.  You should not see, for example, 'ibuprofen 200 mg oral tablet' and later 'warfarin 200 mg tablet oral'.
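One way to spot inconsistent structure is to write the expected pattern down and test every term against it. The sketch below assumes the pattern 'ingredient strength unit route form' from the ibuprofen example. The short lists of routes, forms and units are placeholders of my own; a real check would draw them from the terminology's component vocabularies.

```python
import re

# Placeholder value lists, for illustration only.
ROUTES = ("oral", "topical")
FORMS = ("tablet", "capsule", "cream")

ROUTE = "|".join(ROUTES)
FORM = "|".join(FORMS)

# Expected order: ingredient, strength, unit, route, dose form.
PATTERN = re.compile(
    r"(?P<ingredient>[a-z][a-z ]*?) "
    r"(?P<strength>\d+(?:\.\d+)?) (?P<unit>mg|mcg|g|ml) "
    + f"(?P<route>{ROUTE}) (?P<form>{FORM})"
)


def follows_pattern(term: str) -> bool:
    """Return True if the whole term matches the expected component order."""
    return PATTERN.fullmatch(term.strip().lower()) is not None
```

Terms that fail the check are candidates for review, not automatic errors; the pattern may simply be too narrow.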

Bringing the terminology to life

Making the terminologies more robust, or three dimensional, allows the terminology consumer to leverage the resulting metadata and maximize the utility of the terminology.  The following characteristics help breathe life into a terminology.

Characteristic #8 - The Terminology should have a lifecycle

Terminologies evolve.  New terms are created, existing terms are split, become obsolete or are replaced.  A good terminology gives the terminology consumer information on the status of a term so that they can take action when something happens to it.  This can be as simple as a term status that indicates whether it is active or obsolete, or as complex as replacement pointers that help the terminology consumer decide how to transform the obsolete term they are referencing.   This is especially true if the terminology is stable.  Since an identifier never gets removed from the terminology, the terminology consumer needs to know when it is past its 'sell by' date so it does not continue to get selected and used in an electronic record.
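Here is a minimal sketch of a term status with replacement pointers. The `Term` fields and the `resolve` helper are hypothetical names of my own; real terminologies each have their own way of publishing this information. The sketch assumes every replacement pointer refers to an identifier that still exists, which Characteristic #2 should guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    identifier: str
    name: str
    active: bool = True
    replaced_by: Optional[str] = None  # identifier of the replacement, if any


def resolve(terms, identifier):
    """Follow replacement pointers until an active term is found.

    Returns None if the chain ends at an obsolete term or loops.
    """
    seen = set()
    current = terms[identifier]
    while not current.active:
        if current.replaced_by is None or current.identifier in seen:
            return None
        seen.add(current.identifier)
        current = terms[current.replaced_by]
    return current


terms = {
    "T1": Term("T1", "old name", active=False, replaced_by="T2"),
    "T2": Term("T2", "new name"),
    "T3": Term("T3", "retired, no replacement", active=False),
}
```

Note that the obsolete terms stay in the table. The stored record that references T1 can still be read, but a pick list would only offer T2.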

Characteristic #9 - The Terminology should be part of a well defined domain ontology

If the terminology represents a concept that is comprised of component parts, those parts should be represented by terminologies associated to those terms following the same guidelines listed above.  In other words, if I have a terminology that describes a fully specified lab test, the term is naturally made up of several components, in this case: the analyte name, specimen type, method and result unit (for example) for the test.  Each of the components should be represented by a terminology and my terms should have an associative relationship to those component terms.   This allows the consumer to sift and sort terms and leverage the ontology to get maximum use from the terminology.

Characteristic #10 - Interoperability

A terminology should be associated to a standard interoperable terminology if one is available. When choosing a terminology the consumer needs to have the ability to exchange information with other applications.  Not all domains have an interoperability standard, but terminologies in domains that do, like medications with RxNorm, should participate in those standards appropriately or have links to them.

Characteristic #11 - Extensibility

No terminology can satisfy all the needs of the consumer.  Defining the terminology so that the consumer can extend it lets them add a term temporarily, to bridge the period until the source adds it, or permanently, if the term is very local to the consumer.  This can be accomplished in several ways and could also depend on how the consumer implements the terminology in their solution.

Summary

These are some of the characteristics that make up a good terminology; there may be others that are both generic and domain specific.  I welcome any comments and wish you luck in your personal informatics journey.

Clinical Interoperability - The Antics of Semantics

Semantic interoperability deals with the actual "language" contained in the conversation between applications.  Solving the syntactic interoperability issue by using a standard message format does not mean that the terms used by one application are understood by the other.

Applications Can't See the Bunny

In healthcare, to have a meaningful exchange we require a 'language' that can describe the characteristics of a given situation with as little ambiguity as possible.  The human brain is designed to process and recognize patterns.  It's why clouds look like bunnies and that potato chip looks like Elvis.  This is why we can read "Tummy ache" and translate it on the fly to "abdominal pain" or even "gastric distress".  We are built to process ambiguity.  Software applications do not share our gift for interpretation, at least not yet, so they need another way to accurately and consistently describe the characteristics of a healthcare situation, hence the proliferation of codes.

Codes are unique values that have been assigned to represent a piece of information.  They can be numbers, letters or a combination of numbers and letters.  These codes can come from a content vendor, a standard public domain source or can be home spun codes created and maintained by an institution for its own purposes.

The Problem with Code Sets

It is important to note that there are relatively few code sets that have been created to foster interoperability.  In the beginning, there were not companies that specialized in the creation and management of code sets (also known as vocabularies and terminologies).  Codes were created by the organization to support the unambiguous processing of information within the organization.  As applications evolved on the business side and there was a need to exchange data with trading partners, the need for third party code sets grew.  Eventually, code set vendors (also known as compendia) sprang up and established proprietary code sets that could be licensed and used by both partners.

Healthcare applications have evolved.  Their need to express more complex and critical clinical information has created a more extensive need for stable, granular code sets across several conceptual domains.  Many code sets have evolved to support these uses and some have even evolved to help with interoperability.

Some of the domains and their code sets are as follows:

Laboratory Tests and Observation Code Sets:

General Medical Code Sets:

Medication Code Sets (including Medication allergies):

Public

Commercial (in alphabetic order)

Units of measure:

Procedures:

Specialty Code Set Examples:

I will stop here, but anyone that has seen the Unified Medical Language System (UMLS) list of sources knows that I have just scratched the surface.  There are over 100 terminology sources, and that's not counting the home grown terminologies that are still in use in many institutions for various domains.

Take the Blue Pill

So how do you facilitate semantic interoperability when dealing with such diversity?

There are two answers to this question. 

The easy way is to convince your exchange partner to abandon their code set and adopt yours.  This means that they will have to potentially rewrite their application and convert all of their patient data, but it will certainly make your life easier.  Oh yeah, you could adopt theirs - but that's crazy talk.

The other way is to create a map that facilitates the translation of your codes into theirs and vice versa.  This tends to be the way most people go about it.

Fine, Hand Built Maps

Here we engage our most expensive computing resource, the human being, who dutifully reviews each term in the source and target code sets and creates a map between them.  This can be expensive, depending on the size and complexity of the code set.  It should also be noted that this is not a one-time gig.  Code sets are living data; they grow, shrink and change.  This means that the maps need to be tended to regularly.   The work is also best done by knowledgeable people whom you trust to make the right choices.
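In code, a hand-built map is little more than a lookup table, and the maintenance burden shows up as the difference between the map and the current code set. Here is a minimal sketch with made-up local and partner codes; the function names are my own. It assumes a one-to-one map, which real code sets frequently violate.

```python
# Hypothetical hand-built map from local codes to a partner's codes.
LOCAL_TO_PARTNER = {"LOC-1": "PAR-A", "LOC-2": "PAR-B"}


def invert(code_map):
    """Build the reverse map, refusing maps that are not one-to-one."""
    reverse = {}
    for source, target in code_map.items():
        if target in reverse:
            raise ValueError(
                f"{target} is mapped from both {reverse[target]} and {source}"
            )
        reverse[target] = source
    return reverse


def review_queue(current_source_codes, code_map):
    """Compare the map with the current release of the source code set.

    Returns (codes that still need a map, mapped codes no longer in the set).
    """
    current = set(current_source_codes)
    mapped = set(code_map)
    return sorted(current - mapped), sorted(mapped - current)


PARTNER_TO_LOCAL = invert(LOCAL_TO_PARTNER)
unmapped, retired = review_queue(["LOC-2", "LOC-3"], LOCAL_TO_PARTNER)
```

Running the comparison against every new release of either code set gives the human mapper a work list instead of a surprise.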

Over time, our friends at the National Library of Medicine worked out a way to help those trying to make interoperability with the heavy lifting.

Universal Translator

Unified Medical Language System or UMLS was started at the National Library of Medicine in 1986 as a long term R&D project to explore how they could overcome barriers to effective retrieval of coded information.

The UMLS is comprised of three databases: Metathesaurus, Semantic Network, and the SPECIALIST Lexicon.

  • The Metathesaurus represents a clinical terminology clearinghouse with inherent mechanisms for normalizing terms on a conceptual basis.
  • The Semantic Network is a set of files that attempt to provide a mechanism for the categorization and hierarchical organization of applicable terms.
  • The SPECIALIST Lexicon is a database designed to support natural language processing.

For the purposes of semantic interoperability, we will focus on the Metathesaurus which provides a conceptual backbone for the medical terminologies that participate. 

The Metathesaurus by itself does not solve the semantic interoperability problem, but it does provide the functionality of a thesaurus by relating terms from different sources that have the same or similar conceptual meanings.

Using the Metathesaurus is both simple and complex.  I won't go into the details here, but I will be releasing a whitepaper in the near future designed to help understand it.  In the meantime there are a number of great sources of information starting with the NLM provided documentation, which is actually pretty good.

In a nutshell, the Metathesaurus maps the source codes provided by the creators of the different code sets to unique strings (SUIs), normalized lexical terms (LUIs) and distinct concepts (CUIs).  This information lives in the primary file in the Metathesaurus, MRCONSO.  The name of this file stands for Metathesaurus Relational Concepts and Sources in a quaint 1980's limited filename sorta way.

UMLS MRCONSO
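To make that a little more concrete, here is a rough sketch of reading MRCONSO-style rows and using the shared CUI to walk from one source's code to another's. The column order follows the pipe-delimited MRCONSO.RRF layout as I understand it from the NLM documentation; the sample rows, the source abbreviations (SRC_A, SRC_B) and every identifier in them are invented placeholders, not real UMLS content.

```python
from collections import defaultdict

# Column order of MRCONSO.RRF (pipe-delimited, each row ends with a trailing pipe).
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# Invented rows in MRCONSO layout; none of these identifiers are real.
SAMPLE_ROWS = [
    "C9000001|ENG|P|L9000001|PF|S9000001|Y|A9000001||||SRC_A|PT|A-123|Acetaminophen|0|N||",
    "C9000001|ENG|S|L9000002|PF|S9000002|Y|A9000002||||SRC_B|PT|B-456|Paracetamol|0|N||",
    "C9000002|ENG|P|L9000003|PF|S9000003|Y|A9000003||||SRC_A|PT|A-789|Penicillin|0|N||",
]


def parse_row(line):
    """Split one row into a dict keyed by column name (the trailing empty field is dropped)."""
    return dict(zip(MRCONSO_COLUMNS, line.rstrip("\n").split("|")))


def codes_by_concept(lines):
    """Index rows as {CUI: {source abbreviation: source code}}.

    Simplification: if a source has several codes for one concept, the last one wins.
    """
    index = defaultdict(dict)
    for line in lines:
        row = parse_row(line)
        index[row["CUI"]][row["SAB"]] = row["CODE"]
    return index


def crosswalk(lines, source_sab, source_code, target_sab):
    """Find the target source's code that shares a concept with the given source code."""
    for by_source in codes_by_concept(lines).values():
        if by_source.get(source_sab) == source_code:
            return by_source.get(target_sab)
    return None


if __name__ == "__main__":
    print(crosswalk(SAMPLE_ROWS, "SRC_A", "A-123", "SRC_B"))
```

A real Metathesaurus load has millions of rows, so you would build the index once and keep it, not rebuild it on every lookup as this sketch does.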

The second most used file in the Metathesaurus is MRREL.  This is the file that contains the relationships between concepts that supports the traversal of a source's ontology.
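As a hedged illustration of that traversal, the sketch below reads MRREL-style rows and collects a concept's ancestors within one source. The column order is the MRREL.RRF layout as I understand it, and I am assuming that REL describes the relationship the second concept has to the first (so a PAR row means CUI2 is a parent of CUI1); please verify both against the current NLM documentation. The rows and identifiers are invented.

```python
from collections import defaultdict

# Column order of MRREL.RRF (pipe-delimited, each row ends with a trailing pipe).
MRREL_COLUMNS = [
    "CUI1", "AUI1", "STYPE1", "REL", "CUI2", "AUI2", "STYPE2", "RELA",
    "RUI", "SRUI", "SAB", "SL", "RG", "DIR", "SUPPRESS", "CVF",
]

# Invented rows: C9000003 has parent C9000002, which has parent C9000004.
SAMPLE_ROWS = [
    "C9000003|A9000004|AUI|PAR|C9000002|A9000003|AUI||R9000001||SRC_A|SRC_A|||N||",
    "C9000002|A9000003|AUI|PAR|C9000004|A9000005|AUI||R9000002||SRC_A|SRC_A|||N||",
]


def build_parent_index(lines, sab):
    """Map each concept to its parent concepts, using only PAR rows from one source."""
    parents = defaultdict(set)
    for line in lines:
        row = dict(zip(MRREL_COLUMNS, line.rstrip("\n").split("|")))
        # Assumption: REL=PAR means CUI2 is the parent of CUI1.
        if row["SAB"] == sab and row["REL"] == "PAR":
            parents[row["CUI1"]].add(row["CUI2"])
    return parents


def ancestors(cui, parents):
    """Walk the parent links upward and return every ancestor concept found."""
    seen, stack = set(), [cui]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:  # guards against cycles in the source data
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    index = build_parent_index(SAMPLE_ROWS, "SRC_A")
    print(sorted(ancestors("C9000003", index)))
```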

Also included in the Metathesaurus are files that describe hierarchies, ambiguous terms, co-occurrences and attributes among other details.

For the world of medication terminologies, NLM has created a specialized subset of the Metathesaurus called RxNorm.  Part of the complexity of using the Metathesaurus is the steps involved to extract a subset of sources.  By focusing on medication terms, RxNorm is easier to use and update.  Being focused on medication relationships, RxNorm provides an excellent source of conceptual mapping for the complex and highly utilized medication terminology subset.

Leveraging the Metathesaurus

Whether you are using a custom subset of the Metathesaurus or RxNorm, it is a useful aid in establishing maps between code sets in support of interoperability.  I am not sure I would rely entirely on either as a standalone source for this task.  It is always important to note that the updates in the Metathesaurus can and will lag behind those from the sources.  If you rely entirely on the Metathesaurus, the fidelity of your interoperability is likely to suffer.

This Old Code Set

It is worth noting that if you are dealing with a homegrown code set, you are pretty much on your own.  This makes more work, but remember the language of healthcare is fairly distinct, especially if you know the context of use.  A home grown code value associated with the word ‘acetaminophen' can get me to an established code value for ‘acetaminophen' if I know they both describe an ingredient a patient reported being allergic to.  If you are in a position to move off of a home grown code set, you should check the Metathesaurus source list and ask around for a good public or commercial source.
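Here is a small sketch of that idea: normalize the home grown term, then look it up in an established term list, but only within the same context of use. The context labels and the "STD-" codes are hypothetical stand-ins for whatever established code set you choose, and exact matching after normalization is the simplest possible strategy, not a recommendation.

```python
import re

# Hypothetical established terms keyed by (context of use, normalized term).
ESTABLISHED = {
    ("allergen", "acetaminophen"): "STD-0001",
    ("allergen", "penicillin"): "STD-0002",
}


def normalize(term):
    """Lowercase the term and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def suggest_code(local_term, context):
    """Suggest an established code for a home grown term, or None if nothing matches."""
    return ESTABLISHED.get((context, normalize(local_term)))


if __name__ == "__main__":
    print(suggest_code("  Acetaminophen. ", "allergen"))
    print(suggest_code("Acetaminophen", "lab test"))
```

Anything that comes back as None still needs a person to look at it; the code just takes the easy matches off their plate.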

Resources That Can Help

There are a number of companies and tools that can help with the creation and management of maps.   You can find them by googling ‘clinical interoperability'.  I would also be happy to provide recommendations, so feel free to contact me.

 

Clinical Interoperability - Getting the message across

As mentioned in the first article in this series, two critical parts of Clinical Interoperability are physical and syntactic interoperability.   These two are essentially the first and second of what I consider to be the four laws of interoperability dynamics.

In order for two applications to exchange high fidelity information about a patient, there must be:

1. A physical transport mechanism and the low level protocols that physically move the information from one system to another.

2. An understanding of how the information is packaged for transport by the sender so that the receiver can extract the information.

3. A common terminology (or translation mechanism) for the relevant, codified patient information so that the receiving application can ‘understand' the information once extracted.

4. Equivalent functionality in both applications.

This article focuses on the first two of these.

Physical Interoperability

What is interesting about the physical transport of critical information is that people outside of healthcare probably think that our industry is dominated by electronic data transactions.  I am not sure that is the case.  One example of this is prescriptions.  According to NACDS, of the 3.5 billion prescriptions filled in 2007, only 2.1% were processed via electronic messaging.  Keep in mind that the medication prescription area is one of the most advanced, in terms of electronic messaging, in healthcare.  So, today, when we talk about physical interoperability, we are talking about transport mechanisms that include ‘sneaker-net', faxing, file transfers as well as pure electronic processing.  This works today because there are people that are acting as interoperability adapters.  They are interpreting the data, transforming it and entering it into the receiving system.  This ‘chair to keyboard interface' approach is very inefficient, error prone and, technically, does not qualify as ‘high fidelity'.

When we talk about the physical transport mechanisms in the future, what are we looking for?  The combination of existing networking technology, communication protocols and data exchange software (commercial or home grown) is fairly ubiquitous.  The low adoption is not a reflection of a lack of readiness on the physical side of things.  We have the plumbing to support good clinical interoperability, so what is the problem?  Perhaps we don't have good standards for message syntax.

Syntactic Interoperability

If this were a treatise on healthcare messaging standards, it would be much much longer than an article and written by someone other than me.  Back in the early days of my career (right after we eliminated the dinosaur problem), I was involved in the business of laboratory interfaces.  In the beginning, most interfaces between applications, or applications and devices, were complete custom endeavors.  The two parties would negotiate a format and write their respective parts.  After a while, standards began to emerge.  The first I was involved in was the ASTM standard for laboratory information exchange, which went on to become what is now Health Level 7 (HL7).  It wasn't perfect, but it served our purpose and saved us from reinventing the wheel.  So we adopted it.

Today, there are several standard messaging formats that play a role in healthcare interoperability.  Their scope and maturity are proportional to their drivers and adoption.  For the purposes of this article, I will describe a few of the major formats that have significant adoption, or are in the news.

Note: If your favorite standard is not mentioned, please feel free to throw it in the comments section.  Just make sure that it is not in the standard graveyard.  My rule of thumb is, if three out of four search engine links resolve to a missing page, the standard is likely not a going concern.


Electronic Prescribing

NCPDP SCRIPT Standard

The NCPDP SCRIPT standard is a collection of message formats designed for the express purpose of facilitating electronic prescribing.  It was approved as a national standard in 1997 and became the official standard for pharmacy claims in HIPAA in 2000.  Visit the NCPDP website for more information.


General Healthcare Information Exchange

These standards are designed to provide common formats to support a number of common transactions in a healthcare setting.

Health Level 7 (HL7) Version 2.x

HL7 v2.x is a collection of messaging standards including: Patient Administration, Order Entry, Information Query, Financial Management, Observation Reporting, Medical Records, Scheduling, Patient Referral, Laboratory Automation, Patient Care, Personnel Management, and Application Management to name a few.

This is probably the most widely adopted standard in healthcare.  Version 2 was first introduced in 1990 and was intended to be an '80 percent' solution. 
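If you have never seen a version 2.x message, here is a minimal sketch of pulling a few fields out of one. The sample message is invented, and the code hard-codes the conventional delimiters (carriage return between segments, pipe between fields, caret between components); real messages declare their delimiters in the MSH segment and also use repetition and escape characters that this sketch ignores.

```python
# An invented lab result message; segments are separated by carriage returns.
SAMPLE_MESSAGE = "\r".join([
    r"MSH|^~\&|LAB|HOSP|EHR|CLINIC|200801151230||ORU^R01|MSG00001|P|2.3",
    "PID|1||12345^^^HOSP||DOE^JANE",
    "OBX|1|NM|GLU^Glucose^L||95|mg/dL|70-110|N|||F",
])


def parse_segments(message):
    """Group segments by their three-letter ID; each segment becomes a list of fields."""
    segments = {}
    for line in filter(None, message.split("\r")):
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def observation(obx_fields):
    """Pull the observation code (OBX-3), value (OBX-5) and units (OBX-6) from an OBX segment."""
    return {
        "code": obx_fields[3].split("^")[0],
        "value": obx_fields[5],
        "units": obx_fields[6],
    }


if __name__ == "__main__":
    segs = parse_segments(SAMPLE_MESSAGE)
    family, given = segs["PID"][0][5].split("^")[:2]  # PID-5 is the patient name
    print(given, family, observation(segs["OBX"][0]))
```

Notice that the code in OBX-3 is only useful to the receiver if it knows the code set it came from, which is exactly the terminology problem discussed elsewhere in this series.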

Health Level 7 (HL7) Version 3.0

The goals of HL7 version 3.0 were much more ambitious than those of its predecessor.  They included internationalization, a consistent data model, a more precise standard and freedom from any constraints that would be imposed by compatibility with version 2.x.  The complexity and prescriptive nature of Version 3.0, and the associated costs, have resulted in slow adoption.  Visit the HL7 website for more information on both version 2.x and version 3.0.


Application Synchronization

HL7 Clinical Context Object Workgroup (CCOW - pronounced ‘sea-cow')

The CCOW standard was introduced to provide a standard to support the visual integration of healthcare applications.   Providers are often required to work across multiple applications.  The CCOW standard describes a collection of mechanisms that allows applications to shift focus to the same patient.  Visit the HL7 CCOW Page for more information.


Electronic Health Record Information Exchange

These standards are designed to support the exchange of a patient medical record.  This is different than a transaction.  A transaction is sending information and instructions to facilitate some action being performed.  These formats are intended to share a current snapshot of a patient's clinical context. 

Health Level 7 Clinical Document Architecture (CDA)

The CDA is an XML-based markup standard intended to specify the encoding, structure and semantics of clinical documents for exchange.  CDA is based on the HL7 reference information model (RIM) and is part of the version 3.0 standard. 

Visit the HL7 CDA FAQ for more Information.
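Because a CDA document is plain XML, a general-purpose XML parser is enough to get at the header. The sketch below reads a drastically trimmed, made-up fragment that would not pass schema validation; it assumes the HL7 version 3 namespace (urn:hl7-org:v3), uses a placeholder root value on the patient id, and borrows the LOINC document code commonly used for summary documents as an example value.

```python
import xml.etree.ElementTree as ET

HL7_V3_NS = {"hl7": "urn:hl7-org:v3"}

# A trimmed, invented fragment; a real CDA header has many more required elements.
SAMPLE_CDA = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <code code="34133-9" codeSystem="2.16.840.1.113883.6.1"/>
  <title>Example Summary</title>
  <recordTarget>
    <patientRole>
      <id extension="12345" root="1.2.3.4"/>
    </patientRole>
  </recordTarget>
</ClinicalDocument>"""


def document_summary(xml_text):
    """Extract the title, document type code and patient id from a CDA-style header."""
    root = ET.fromstring(xml_text)
    code = root.find("hl7:code", HL7_V3_NS)
    patient_id = root.find("hl7:recordTarget/hl7:patientRole/hl7:id", HL7_V3_NS)
    return {
        "title": root.findtext("hl7:title", namespaces=HL7_V3_NS),
        "code": code.get("code"),
        "code_system": code.get("codeSystem"),
        "patient_id": patient_id.get("extension"),
    }


if __name__ == "__main__":
    print(document_summary(SAMPLE_CDA))
```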

ASTM Continuity of Care Record (CCR)

The CCR was created by ASTM International, the Massachusetts Medical Society (MMS), HIMSS, the American Academy of Family Physicians (AAFP), the American Academy of Pediatrics (AAP), and other health informatics vendors.  It was designed to contain the most relevant and timely core health information about a patient.  It has been adopted by a number of healthcare application vendors, and most notably by Google's Patient Health Record.  More information can be found at the CCR Resource Site.

HL7/ASTM combined Continuity of Care Document (CCD)

The Continuity of Care Document is an HL7 standard whereby a patient's current clinical context is expressed in the framework of the Clinical Document Architecture. More information on the CCD can be found on the HL7 website under CDA Release 2.


The Standards are Out there

There is likely a message format that can satisfy your need to interoperate syntactically.  The important factors are what your trading partners support and which formats you can implement without re-engineering your entire application.  There are also a number of businesses that provide adapters that can be bolted onto your application to make this easier.

A number of the formats mentioned also have specific terminologies that are ‘supported' by the standard.  Some support several and some require a common terminology.

This brings us to the third law in interoperability dynamics.  Even if you have an established physical transport and agreed upon standard message format, if you are not communicating using terms that the receiver can understand, you will not have a high fidelity exchange of coded information.

There are information sources and strategies for coping with terminology challenges.  Like some medical conditions, there is currently no cure, but there may be a way we can manage it until there is.  That is the subject of the next article.

Why is Clinical Interoperability Important?

 

Now that we have a documented definition for clinical interoperability and its macro components, the next reasonable question is: "Why is clinical interoperability important?"

Before continuing, please consider the following interoperability scale.

This scale represents the potential signal loss when information is exchanged between systems through a computer interface.  The line represents the clarity of the information as the level of interoperability improves.   This is also relevant, as the Certification Commission for Healthcare Information Technology, or CCHIT, has defined interoperability as "the high fidelity exchange of information between an EHR system and other healthcare IT systems."

Isolated systems cannot exchange clinical information about a patient, so the amount of signal loss between them is 100%.

Systems with a translational relationship exchange information, but not in a way that a software application can take advantage of.  Examples are systems that exchange notes or other free text information about a patient.  It requires a human to translate the information before any action can be taken or any decision support applied.

Interoperable systems can exchange information, and a percentage of the information can be put to coded, productive use.  This type of relationship usually has established syntactic interoperability, but less than ideal semantic interoperability.  In other words, it is getting messages but some of the information must be translated as free text for a human to interpret.

Compatible systems are leveraging established syntactic and semantic interoperability mechanisms with a high percentage of useful exchange.  An example of this is two systems exchanging HL7 messages operating on the same terminology.

Integrated systems (as the name implies) operate on a common framework.  An integrated relationship is driven by strict adherence to a common standard and a common or well-correlated set of terminologies.

It is not uncommon for people to say they have an interface between systems, even if the interface is translational and provides limited value to the consuming application.  I offer this scale to provide a vocabulary to further describe the capabilities and utility of an interface in the interoperability dialog and to further refine our question to "Why is High Fidelity Clinical Interoperability Important?"

 

There are two answers to this question.  One explains why interoperability has been important and the other tells why interoperability will become even more important in the near term future.

Interoperability has been important to the Healthcare End users for at least the last decade for the following reasons:

Data Exchange with trading partners

A system's ability to provide high fidelity information to its trading partners during a commercial transaction is imperative to supporting high-volume data exchange, as well as ensuring the accuracy of the information itself.  Some examples of this are: reference laboratory testing, e-prescribing and drug formulary compliance.

Data Exchange between Co-Mingling Applications within a Healthcare Institution

Many institutions have a core application that is responsible for the current patient information.  They may also have additional niche applications, like an Emergency or ICU specific application, that come from another application vendor.  In this situation, the disparate applications are not integrated, but the need to exchange the patient's information is as relevant as if they were.

Data Exchange between Co-Mingling Applications within a Network of related healthcare Institutions.

There has been much consolidation in the healthcare industry as health networks merge and grow through acquisition.  When this happens, it is often the case that the new siblings do not use the same core applications.  In this situation, there is still a need to exchange a patient's information when they move from venue to venue, without the effort and risk associated with manual data entry.

Data Aggregation and Biosurveillance

With the advent of PHRs and RHIOs, there is a desire to move patient information to repositories so that the aggregated, centralized data can be put to use to the benefit of the patient and/or the healthcare industry at large.

All of these and more are reasons why the end users of healthcare IT have been rending their garments and gnashing their teeth.  These are the reasons that solutions interoperability is better now than it was ten years ago, even though it is not yet where it needs to be. 

Healthcare IT is changing. It has been recognized that for the industry to evolve and improve, it must begin movement toward convergence.  The road to convergence is paved with interoperability.

Enter the Certification Commission for Healthcare Information Technology, mentioned earlier, whose mission is "to accelerate the adoption of robust, interoperable health information technology by creating a credible, efficient certification process".

There is, and will continue to be, growing pressure on application vendors to become CCHIT certified.  While CCHIT certification is more than just interoperability, there are 39 requirements statements that relate directly to clinical interoperability in the ambulatory system certification and 22 in the inpatient system certification requirements.

The need-driven interoperability of the last decade was really about the exchange of information between distinct applications, often limited to an operational transaction.  In other words, it was sending an HL7 record to perform some discrete task or it was passing a complete patient context to a system it was tightly coupled to through a specific translational process.

By contrast, the CCHIT driven interoperability is more focused on general/holistic interoperability, requiring an accepted message format and terminology set. 

CCHIT interoperability does not require that I know about my partner.  I should be able to exchange a patient's clinical context with anyone who is CCHIT certified.

Whether you are an HIT application vendor looking to share data with business partners, cooperate with high performance niche applications and support your client's acquisition-related issues, or are concerned about being left behind because you can't put a mark in the CCHIT compliance check box, high fidelity clinical interoperability IS important.

The next question is how to make it happen.  That is the subject of another article.
