PLEASE NOTE: The Meaningful Use Final Rule was released on July 12 and the UNII is no longer listed as the standard for allergy terminology. In fact, there is NO standard listed for allergy interoperability. For the record, I do not think that the following blog post, which "aired" on June 30th, influenced the government's decision making process in any way. My next post will suggest a significant stimulus for healthcare IT companies with the word 'architecture' in the name... just in case. In order to preserve history, I am leaving the post as it was. It still provides a decent overview of UNII for those of you who would like to leverage it.
The vocabulary chosen to represent patient allergies is the FDA Unique Ingredient Identifier or UNII (I guess ‘UII' would be a difficult acronym to use in casual conversation...).
The UNII is part of the Substance Registration System whose purpose is to provide unique identifiers for:
Foods
- Food substances are specific foods or components of food, regardless of whether the food is in conventional food form or a dietary supplement, such as vitamins, minerals, herbs, or other similar nutritional substances.
Drugs
- Drug substances include both active and inactive ingredients used in drug products, including those for veterinary purposes.
Biologics
- Biologic substances include both active and inactive ingredients used in biologics, such as blood products, therapeutic products, vaccines, cellular and gene therapy products, allergenic products, tissues, and certain devices (e.g., enzymes in stabilized solutions).
Devices
- Device substances include certain components of some devices (e.g. silicon for implants, and chemical reagents for glucose test kits).
Cosmetics
- Cosmetic substances are components of cosmetic products, such as flavors, fragrances, colorants, vitamins, plant- and animal-derived ingredients, and polymers.
There is more general information on the UNII here: http://www.fda.gov/ForIndustry/DataStandards/SubstanceRegistrationSystem-UniqueIngredientIdentifierUNII/default.htm
According to the above site, the UNII is:
- One of the core components of the United States Federal Medication Terminology.
- Used in the FDA's Structured Product Labeling
- Used to assist in the generation of the National Library of Medicine's (NLM's) RxNorm.
- A US government standard for drug ingredient and food allergen identifiers
- A component of the Environmental Protection Agency's Substance Registry System (future)
The UNII may be found in:
- NLM's Unified Medical Language System (UMLS)
- National Cancer Institute's Enterprise Vocabulary Service
- USP Dictionary of USAN and International Drug Names (future)
- FDA Data Standards Council website
- VA National Drug File Reference Terminology (NDF-RT)
- FDA Inactive Ingredient Query Application
The UNII is provided, rather inconveniently, in Excel format.
There is a multi-worksheet (A-S, T-Z), denormalized, zipped Excel workbook dated 6/25/2010 at the following location.
http://www.fda.gov/downloads/ForIndustry/DataStandards/StructuredProductLabeling/UCM217498.zip
The sheets are difficult to work with because they combine the concepts and their synonyms into a single list. It is also worth noting that, in the data provided, the synonyms do not have unique identifiers.
Sheet Structure
The primary sheets with the UNII codes in them have the following columns:
| Preferred substance name | This is the preferred name of the substance. |
| UNII | The unique identifier for the preferred substance name. |
| Substance name | A synonym for the preferred substance name. |
| ITIS TSN | This is not really documented, BUT I believe it is, where applicable, a code representing the USDA Integrated Taxonomic Information System (ITIS) Taxonomic Serial Number (TSN). This appears to be populated only for food ingredients. |
| Molecular Formula | This is, you guessed it, the molecular formula. It seems to be populated only for chemical ingredients. |

Code structure and design
The UNII code is a ten character alphanumeric code. The first nine characters are randomly generated and the tenth character is determined by an algorithm (a check digit for you old timers who wrote serial port interfaces...).
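As a minimal sketch of what the code structure lets you check on your own: the function below only confirms that a string has the shape of a UNII (ten alphanumeric characters). The function name is made up for this example, the assumption that letters are upper case is mine, and the check character is deliberately NOT verified because the algorithm is not described in this post.

```python
import re

# Ten characters, each an upper-case letter or a digit.
# Assumption: codes are compared in upper case.
UNII_SHAPE = re.compile(r"[A-Z0-9]{10}")


def looks_like_unii(code: str) -> bool:
    """Return True if the string has the shape of a UNII code.

    This is a format check only. It does not validate the tenth
    (check) character, because that algorithm is not covered here.
    """
    return UNII_SHAPE.fullmatch(code.strip().upper()) is not None


if __name__ == "__main__":
    # Placeholder strings, not real UNII codes.
    for candidate in ["ABCDE12345", "ABC", "ABCDE1234!"]:
        print(candidate, looks_like_unii(candidate))
```

A shape check like this is mostly useful for catching truncated or mangled values when loading the spreadsheet, not for proving that a code exists.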
The Numbers
There are 16,655 unique UNII concepts in the provided list.
There are 67,715 synonyms, including the preferred names.
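Counts like these fall out naturally once the denormalized rows are split into a concept list and a synonym list. The sketch below shows one way to do that in Python, assuming the sheet has been exported to pipe-delimited text using the column names from the table above. The sample rows and codes are made-up placeholders, the surrogate synonym id is my own addition (the source data has none), and this is not necessarily the layout of the normalized files I offer later in this post.

```python
import csv
import io

# Made-up placeholder rows in the denormalized shape described above:
# every row repeats the preferred name and UNII next to one synonym.
SAMPLE = """Preferred substance name|UNII|Substance name
SUBSTANCE A|AAAAAAAAA1|SUBSTANCE A
SUBSTANCE A|AAAAAAAAA1|SUBSTANCE A SYNONYM
SUBSTANCE B|BBBBBBBBB2|SUBSTANCE B
"""


def normalize(rows):
    """Split denormalized rows into (concepts, synonyms).

    concepts: dict of UNII -> preferred name (one entry per concept)
    synonyms: list of (synonym_id, UNII, name); synonym_id is a
              surrogate sequence number assigned here.
    """
    concepts = {}
    synonyms = []
    for row in rows:
        unii = row["UNII"]
        concepts.setdefault(unii, row["Preferred substance name"])
        synonyms.append((len(synonyms) + 1, unii, row["Substance name"]))
    return concepts, synonyms


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="|")
concepts, synonyms = normalize(reader)
print(len(concepts), "concepts,", len(synonyms), "synonyms")
```

Run against the full export, the two lengths should correspond to the concept and synonym counts quoted above.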
What's missing?
UNII Type:
We know that the scope of the UNIIs covers a number of types of substances. It would be very useful if there were a way of telling which UNIIs are of which type so that we could filter them. I may not want to include cosmetics OR biologics in my allergy pick list, for example.
Allergies:
Most systems that track allergies, medication allergies in particular, allow the user to represent allergies using medication ingredients, common brand names OR allergy classes. The UNII scope only covers one of these. How will we use UNII to represent a documented allergy or adverse reaction to ‘Nyquil' or ‘cephalosporins'? Also, if you are going to represent allergies, should the list include animals and environmental allergies?
Not a Rant
I don't want to get off on a rant here... but it seems like for some of these meaningful use terminologies, rather than creating a terminology designed to support appropriate interoperability, we looked to see what we already had lying around. UNII is not an allergy terminology, it is a substance terminology. They are not the same thing. They are terminology domains that merely overlap. I know, creating a terminology is hard but, ahem, 19 billion dollars! This is not a criticism directed at the UNII codes or the people that maintain them. It looks like a very thorough substance terminology with a fairly simple design, but it will not support allergy interoperability as it should be supported. Now, we could change UNII terminology to include allergy classes, animals and environmental terms, but that would make it a less wonderful substance terminology then, wouldn't it? Perhaps a better approach would be to use UNII in our allergy interoperability terminology, in the utopian future, to represent substances (with types please) and we could append the other allergy types (classes, animals, environmental) to save money and reduce the deficit. I could live with that.
(I will now climb down from my virtual soap box, so that you can come out from behind your furniture...)
To make up for the non-rant, I am happy to provide a normalized version of the most recent UNII data for your experimentation. It is provided in a zip file as two pipe-delimited (‘|') text files.

If you would like to receive this file, contact us and ask for it. We will email it to you or provide you with access to our FTP server.
I want to thank Bonnie for reminding me that I should do this post.
I will try to post more frequently.
It's Just a Simple Procedure...
Current Procedural Terminology (CPT)
The Current Procedural Terminology (CPT) code set is maintained by the American Medical Association through the CPT Editorial Panel. The CPT code set accurately describes medical, surgical, and diagnostic services and is designed to communicate uniform information about medical services and procedures among physicians, coders, patients, accreditation organizations, and payers for administrative, financial, and analytical purposes.
All CPT codes are 5 digits. There are roughly 9200 CPT codes today.
Here is an example of some CPT codes:
15788 CHEMICAL PEEL, FACIAL; EPIDERMAL
15789 CHEMICAL PEEL, FACIAL; DERMAL
15792 CHEMICAL PEEL, NONFACIAL; EPIDERMAL
The codes are not hierarchical, but they do fall within defined numeric ranges called sections.
The use of CPT codes requires a license from the AMA.
The main CPT page on the AMA site is located here: http://www.ama-assn.org/ama/pub/physician-resources/solutions-managing-your-practice/coding-billing-insurance/cpt.shtml
The Wikipedia page for CPT is located here: http://en.wikipedia.org/wiki/Current_Procedural_Terminology
There is a good document covering the fundamentals of CPT coding here: http://pmiconline.com/ebook.pdf
CPT codes are in the UMLS metathesaurus, but you are required to have a CPT license to use them.
ICD-10-PCS
The International Classification of Diseases, Tenth Revision, Procedure Coding System (ICD-10-PCS) was developed as a replacement for the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) volume 3, Procedures.
It is important to note that, unlike the diagnosis and procedure subsets in ICD-9, ICD-10-CM and ICD-10-PCS are separate and distinct vocabularies. They are not subsets of a holistic ICD-10. This starts with the fact that the structure of an ICD-10-PCS code is not the same as that of an ICD-10-CM code.

ICD-10-PCS has a seven character alphanumeric code structure. Each character contains up to 34 possible values. Each value represents a specific option for the general character definition (e.g., stomach is one of the values for the body part character). The ten digits 0-9 and the 24 letters A-H, J-N and P-Z may be used in each character. The letters O and I are not used in order to avoid confusion with the digits 0 and 1.
The second through seventh characters mean the same thing within each section, but may mean different things in other sections.
In all sections, the third character specifies the general type of procedure performed (e.g., resection, transfusion, fluoroscopy), while the other characters give additional information such as the body part and approach. In ICD-10-PCS, the term "procedure" refers to the complete specification of the seven characters.
I think the approach here is risky, primarily due to the limit of 34 items within each component of the smart key. Many vocabularies have abandoned smart key systems like this because of the complexity that is introduced when you need to add an item beyond the possible range (like #35…).
The reference manual for the ICD-10-PCS coding system can be downloaded here: https://www.cms.gov/ICD10/Downloads/pcs_refman.pdf
Wikipedia has a fairly robust ICD-10-PCS list and reference capability here: http://en.wikipedia.org/wiki/ICD-10_Procedure_Coding_System
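The seven-character structure described above is easy to check mechanically. Here is a small Python sketch that builds the 34-value alphabet (digits plus letters, minus O and I) and tests whether a string is well formed. The function name and the sample strings are mine, and the check is purely structural: it says nothing about whether a given combination of values is an actual ICD-10-PCS procedure.

```python
import string

# The ten digits plus the 24 letters left once O and I are excluded.
PCS_ALPHABET = set(string.digits) | (set(string.ascii_uppercase) - {"O", "I"})


def is_well_formed_pcs(code: str) -> bool:
    """Structural check only: seven characters, each from the 34-value alphabet."""
    return len(code) == 7 and all(ch in PCS_ALPHABET for ch in code)


if __name__ == "__main__":
    print(len(PCS_ALPHABET))  # 34
    # Illustrative strings chosen to exercise the rules.
    for candidate in ["0DT60ZZ", "0DT6OZZ", "0DT60Z"]:
        print(candidate, is_well_formed_pcs(candidate))
```

The second sample fails because it contains the letter O rather than the digit 0, which is exactly the confusion the alphabet was designed to avoid.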
You can download the latest ICD-10-PCS data files here: http://www.cms.gov/ICD10/13_2010_ICD10PCS.asp#TopOfPage
(The file you want is the ‘2010 Code Descriptions – Long format, Table format’ zip file.)
There are roughly 72,000 ICD-10-PCS codes.
According to the American College of Emergency Physicians (ACEP), some preliminary inpatient hospital testing of ICD-10-PCS has indicated that the new procedure coding system is problematic to learn for both experienced and inexperienced coders. If this is true, it may be that CPT-4, which is also an anointed vocabulary, would be a better choice for procedures.
The next post will be about allergy vocabularies.
What seems to be the Problem?
International Classification of Diseases (ICD)
International Classification of Diseases is a publication from the World Health Organization (WHO) and it provides a number of vocabularies for expressing disease concepts.
The history of the ICD is available here: http://www.who.int/classifications/icd/en/HistoryOfICD.pdf
ICD-9-CM
The International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) is based on the World Health Organization's Ninth Revision, International Classification of Diseases (ICD-9).
ICD-9-CM is the official system of assigning codes to diagnoses and procedures associated with hospital utilization in the United States.
The structure of ICD-9-CM codes is relatively straightforward. The code itself is an explicit hierarchy with the primary disease characteristic typically represented by the first part of the code and the secondary characteristics grouped in numeric sequence in the second part of the code.

As you can see in the example below, you should always treat an ICD-9-CM code as text and not as a number, because a numeric interpretation of the code would be a disaster (403, 403.0 and 403.00 are three different codes).
Below are the ICD-9-CM codes representing ‘hypertensive chronic kidney disease':
403 Hypertensive chronic kidney disease
403.0 Hypertensive chronic kidney disease, malignant
403.00 Hypertensive chronic kidney disease, malignant, with chronic kidney disease stage I through stage IV, or unspecified
403.01 Hypertensive chronic kidney disease, malignant, with chronic kidney disease stage V or end stage renal disease
403.1 Hypertensive chronic kidney disease, benign
403.10 Hypertensive chronic kidney disease, benign, with chronic kidney disease stage I through stage IV, or unspecified
403.11 Hypertensive chronic kidney disease, benign, with chronic kidney disease stage V or end stage renal disease
403.9 Hypertensive chronic kidney disease, unspecified
403.90 Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified
403.91 Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage V or end stage renal disease
Note: This manner of establishing codes is less than ideal. A smart code is an identifier that implies meaning through its structure. Typically this manner of establishing codes becomes fraught with issues as a coding scheme becomes more complex over time. For example, there is not a very good way to express a disease or procedure in ICD-9 if it belongs in more than one place in the hierarchy (poly-hierarchical) without creating duplicate concepts (which is bad).
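To make the "treat it as text" advice concrete, here is a short Python sketch using some of the hypertensive chronic kidney disease codes listed above. It shows how distinct codes collapse when parsed as numbers, and how the parent of a code can be read from the string itself. The `parent_code` helper is my own illustration and assumes the simple prefix pattern visible in this example; it is not an official ICD-9-CM rule.

```python
CODES = ["403", "403.0", "403.00", "403.01", "403.1", "403.10"]

# Parsed as numbers, "403", "403.0" and "403.00" all become 403.0,
# and "403.1" and "403.10" both become 403.1.
as_numbers = {float(code) for code in CODES}
print(len(CODES), "codes as text,", len(as_numbers), "distinct values as numbers")


def parent_code(code: str):
    """Return the parent implied by the code's structure, or None at the top.

    Assumption: the hierarchy follows the prefix pattern seen above,
    e.g. 403.00 -> 403.0 -> 403.
    """
    if "." not in code:
        return None
    head, tail = code.split(".")
    if len(tail) == 1:
        return head
    return head + "." + tail[:-1]


for code in ["403.00", "403.0", "403"]:
    print(code, "->", parent_code(code))
```

Six codes go in and only three numeric values come out, which is the disaster in miniature.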
There are roughly 22,000 ICD-9-CM codes.
The ‘Home Page' of ICD-9-CM is http://www.cdc.gov/nchs/icd/icd9cm.htm
Wikipedia has a fairly robust ICD-9 list and reference capability here: http://en.wikipedia.org/wiki/List_of_ICD-9_codes.
Like all public standards, it is not provided in a format that makes it easy to use. The downloads that are available via FTP are rich text files that are human readable but not easy to parse into a typical application-consumable vocabulary file.
There are a number of web sites that provide search and lookup tools for ICD-9, but the only source of free coded ICD-9-CM codes (that I have found) is the UMLS metathesaurus (also not easy...).
If you want easy and well structured you need to pay...
I recommend Ingenix at the following link: http://www.shopingenix.com/Category/100093/Product/16699/
ICD-10-CM
Like ICD-9-CM, ICD-10-CM is based on the World Health Organization ICD-10 coding system. ICD-10 is designated to replace ICD-9 and is a more granular terminology (actually more like SNOMED-CT).
The structure of ICD-10-CM is different than ICD-9. The codes are alphanumeric where the initial alpha code delineates the codes into 22 chapters.
Below are the ICD-10-CM codes representing ‘hypertensive chronic kidney disease':
I120 Hypertensive chronic kidney disease with stage V chronic kidney disease or end stage renal disease
I129 Hypertensive chronic kidney disease with stage I through stage IV chronic kidney disease, or unspecified chronic kidney disease
There are roughly 68,000 ICD-10-CM codes.
The structure of an ICD-10 code is as depicted below (thanks to the AHIMA website).

There is a good primer on the differences between ICD-9 and ICD-10 on the AHIMA website here: http://library.ahima.org/xpedio/groups/public/documents/ahima/bok1_038084.hcsp?dDocName=bok1_038084
The ‘Home Page' of the ICD-10-CM is http://www.cdc.gov/nchs/icd/icd10cm.htm
Wikipedia has a fairly robust ICD-10-CM list and reference capability here: http://en.wikipedia.org/wiki/ICD-10.
RTF was apparently too easy as ICD-10-CM is published as a PDF file...
ICD-10-CM can also be pulled from the UMLS Metathesaurus and purchased in convenient formats from Ingenix.
SNOMED-CT
SNOMED CT (Systematized Nomenclature of Medicine--Clinical Terms) is a comprehensive clinical terminology, originally created by the College of American Pathologists (CAP) and, as of April 2007, owned, maintained, and distributed by the International Health Terminology Standards Development Organization (IHTSDO).
SNOMED-CT codes do not have a hierarchical code, like the ICD vocabularies. Rather, SNOMED-CT creates meaningless identifiers and relates them to each other in a directed acyclic graph or DAG (which is where the phrase DAG-gumit! originated... I am pretty sure). This means that any term in the vocabulary can be related to zero-to-many terms, as long as it cannot end up being its own parent. The relationships themselves are separate from the SNOMED-CT code. SNOMED-CT also separates the notion of concepts and descriptions (or concept synonyms).
Below are the SNOMED-CT concepts representing ‘chronic kidney disease':
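The "cannot end up being its own parent" rule is what makes the graph acyclic. As a rough sketch in Python (not how SNOMED-CT release tooling actually does it), the function below walks upward from a proposed parent and reports whether adding the relationship would make a term its own ancestor. The identifiers "A" through "D" are placeholders, not SNOMED-CT codes.

```python
def would_create_cycle(parents, child, new_parent):
    """Return True if relating child -> new_parent would make child its own ancestor.

    parents maps each identifier to the set of identifiers it is
    directly related to as a child (zero-to-many parents).
    """
    stack = [new_parent]
    seen = set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return False


# Placeholder graph: B is a child of A; C is a child of both A and B.
PARENTS = {"B": {"A"}, "C": {"A", "B"}}

if __name__ == "__main__":
    print(would_create_cycle(PARENTS, "D", "C"))  # a new leaf under C is fine
    print(would_create_cycle(PARENTS, "A", "C"))  # A is already an ancestor of C
```

Note that C has two parents in the placeholder graph, which is perfectly legal in a DAG and is exactly what the smart-code hierarchies above cannot express.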
431855005|disorder|Chronic kidney disease stage 1
431856006|disorder|Chronic kidney disease stage 2
433144002|disorder|Chronic kidney disease stage 3
431857002|disorder|Chronic kidney disease stage 4
433146000|disorder|Chronic kidney disease stage 5
There are roughly 68,000 active disorder concepts in SNOMED-CT.
I have created a number of posts (and a few screencasts) on SNOMED-CT so I would first direct you to earlier posts in this blog.
The main NLM page for SNOMED-CT is located here: http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html
The SNOMED-CT user's guide is downloadable here: http://www.ihtsdo.org/fileadmin/user_upload/Docs_01/SNOMED_CT/About_SNOMED_CT/Use_of_SNOMED_CT/SNOMED_CT_User_Guide_20090731.pdf
The main CAP page for SNOMED-CT is located here: http://www.cap.org/apps/cap.portal?_nfpb=true&_pageLabel=snomed_page
The Wikipedia page for SNOMED-CT is located here: http://en.wikipedia.org/wiki/SNOMED_CT
You can download SNOMED-CT release files from the NLM site here: http://www.nlm.nih.gov/research/umls/licensedcontent/snomedctfiles.html
Note: to download NLM data files, like SNOMED-CT, you need to register and obtain a license from the NLM. You can do that here http://wwwcf.nlm.nih.gov/umlslicense/snomed/license.cfm
The next post will cover procedure terminologies.
I recently had a request to create a post providing a primer on the vocabularies of meaningful use. Let's start with a review of the vocabularies that are named in the meaningful use criteria described on pages 21 and 22 of the January 13th release of the Federal Register, located here: http://edocket.access.gpo.gov/2010/pdf/e9-31216.pdf
The "Chosen Ones"
The listed vocabularies and their purpose are as follows:
| Terminology | Stage | Purpose(s) |
| ICD-9-CM | Stage 1 | Problems |
| ICD-9-CM | Stage 1 | Procedures |
| ICD-10-CM | Stage 2 | Problems |
| ICD-10-PCS | Stage 2 | Procedures |
| SNOMED-CT | Stage 1 | Problems |
| SNOMED-CT | Stage 2 | Problems |
| SNOMED-CT | Stage 2 | Lab Results (Submission Public Health) |
| CPT-4 | Stage 1 | Procedures |
| CPT-4 | Stage 2 | Procedures |
| Third Party Drug Vocabularies* | Stage 1 | Medications |
| Third Party Drug Vocabularies* | Stage 1 | Electronic Prescribing |
| RxNorm | Stage 2 | Medications |
| RxNorm | Stage 2 | Electronic Prescribing |
| UNII | Stage 2 | Medication Allergies |
| CVX | Stage 1 | Immunization Registries |
| CVX | Stage 2 | Immunization Registries |
| LOINC | Stage 1 | Lab Orders (from Reference labs) |
| LOINC | Stage 1 | Lab Results (from reference labs) |
| LOINC | Stage 2 | Lab Orders (All) |
| LOINC | Stage 2 | Lab Results (All) |
| UCUM | Stage 2 | Units of Measure |
| CDA template | Stage 2 | Vital Signs |
* Third Party drug vocabularies that are listed as complete in RxNorm by the NLM
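If you want to answer questions like "which vocabularies do I need for problems in Stage 1?" in code, the table above can be held as simple (terminology, stage, purpose) rows. The Python sketch below includes only the problem and procedure rows to keep it short, and it assumes each stage in the table pairs with the purpose in the same position. The function name is made up for this example.

```python
# (terminology, stage, purpose) rows taken from the table above.
# Only the Problems and Procedures rows are included in this sketch.
MEANINGFUL_USE = [
    ("ICD-9-CM", 1, "Problems"),
    ("ICD-9-CM", 1, "Procedures"),
    ("ICD-10-CM", 2, "Problems"),
    ("ICD-10-PCS", 2, "Procedures"),
    ("SNOMED-CT", 1, "Problems"),
    ("SNOMED-CT", 2, "Problems"),
    ("CPT-4", 1, "Procedures"),
    ("CPT-4", 2, "Procedures"),
]


def vocabularies_for(purpose: str, stage: int):
    """Return the sorted terminologies listed for a purpose at a given stage."""
    return sorted(
        {term for term, row_stage, row_purpose in MEANINGFUL_USE
         if row_stage == stage and row_purpose == purpose}
    )


if __name__ == "__main__":
    print(vocabularies_for("Problems", 1))    # ['ICD-9-CM', 'SNOMED-CT']
    print(vocabularies_for("Procedures", 2))  # ['CPT-4', 'ICD-10-PCS']
```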
Meaningful Selection
What does it mean that a vocabulary is one of the chosen ones? My understanding (based on my reading of the Federal Register) is that the meaningful use criteria require certified EHR technologies to provide patient summaries and interoperate (exchange data) using the listed vocabularies for their defined purposes. In other words, the vocabulary standards are for interoperability, not native persistence in the EHR application.
It is not reasonable to expect that every hospital / physician's office in the US will migrate their patient data to these standards (and then do it again for 2013). As a good application architect, your objective is to determine how you will be able to express your client's patient information in the anointed vocabularies.
Where to Learn More
There is a lot I can say about these vocabularies, both their suitability to the tasks that have been so capriciously assigned to them and the challenges associated with working with each of them. This is not the post for that particular diatribe. In this post, I will try to give you some high level information and some places to find out more. So strap on your learning caps and practice your right click ‘open in new tab' skills.
The next post will cover the problem vocabularies.

We at Clinical Architecture have created a new suite of software tools with a unique focus on terminology mapping, Symedical. You may be thinking that there are already a number of mapping tools out there and are wondering 'Why Symedical?'. I created a four minute screencast to show you how Symedical is different.
Why Symedical?
Symedical can map more than just drugs and can improve your ability to map beyond the scope of RxNorm.
Contact Clinical Architecture and find out how Symedical can help you interoperate.
When dealing with any terminology domain, to establish a working understanding, you need to get a handle on the anatomy of a term within the domain. For example, if you are looking at a catalog of automobiles you quickly see a pattern that revolves around the vehicle make, model, production year and other characteristics that identify the vehicle to the required level of granularity. Regardless of the domain, the pattern typically breaks down into primary characteristics, secondary characteristics and modifiers. The primary characteristic is the core of information that is absolutely essential to the meaning of the term. In other words, if you began stripping off characteristics, the primary characteristic is where you say ‘when’ so that the term is not rendered ambiguous in the domain. In our automobile example there are, arguably, a couple of primary characteristics: the make and model. If someone asks you what you drive you typically tell them the make and model (unless the model strongly implies the make or you like bragging about the options package…). The model year, edition and options are secondary characteristics that further define the vehicle, and the color and other minutiae could be considered modifiers (unless it is purple).
In the medication domain, the primary characteristic is the list of ingredients, more specifically the list of active ingredients. Active ingredients drive the use of medication concepts and, as with the car example, most people when asked about their medications respond with the active ingredients or the brand name synonym for the active ingredients. For medications, the implied route, dose form and strength are secondary characteristics that are relevant but not always necessary.
The Inactive Ingredients are Inactive… or ARE they?
Active and inactive ingredients typically both live together in the domain of substances (or ingredients). Whether an ingredient is active or inactive is, in most cases, a role that the ingredient plays as opposed to what the ingredient is. This post is mostly about active ingredients, but it is worth a few minutes to talk about inactive ingredients so that, as an implementer, you understand the conceptual differences and the limitations of the notion of an inactive ingredient when you encounter it in the wild.
The difference between an active and inactive ingredient is subtle to the non-pharmacist. Typically the active ingredients are the substances that define the medication, while the inactive ingredients are excipients that are introduced in the manufacturing of the drug product OR ubiquitous essence-of-life ingredients like ‘water’ that do not factor into the medication's function. If you refer to the medication continuum in the previous post, you will note that inactive ingredients do not participate in the abstract or dispensable generalizations. Since inactive ingredients are, for the most part, introduced by the manufacturing process, any attempt to introduce them into higher level generalizations is risky as it can create false alerts and, worse, missed alerts (which is the topic of another post and covered to a small degree in my ‘allergy rule of thumb’ post).
Some may argue that if an inactive ingredient is present in all manufactured forms of a drug you can represent it at a higher level generalization for that particular situation. I would argue that stretching the rules of the composition of a terminology to accommodate a few exceptions is not worth compromising the terminology’s consistency. You need to know that active ingredients are always active ingredients; diverging from that path leads to the scary woods of unintended consequences.
You may encounter what looks like an inactive ingredient in an active ingredient list. This is either: (A) a valid active ingredient in that particular circumstance, (B) introduced because it is clinically relevant and there is no other way for the terminology to deal with this, or (C) it is junk DNA left over from a bygone era. In any case, you must treat it like an active ingredient: avoid eye contact and sudden movements. This is discussed more later in this post.
Let’s talk about active ingredients.
The Ingredient Set
Every valid medication concept (I am looking at YOU medical devices…) has one or many active ingredients that make up its primary characteristic. This may be referred to as an ingredient set, ingredient list, generic drug or the formulation (ingredient set in the medication concept continuum). In fact, almost every drug compendium has a concept that represents this level. This is important as it defines the set of valid active ingredient combinations. Most, if not all, drug concepts in a medication hierarchy point back to this type of concept. These ingredient sets break down into a list of individual ingredients.
Base ingredients
A single ingredient can represent a base ingredient or a variation of a base ingredient. This is significant because a variation of a base ingredient is related to the base ingredient but can have significant differences (which I will not get into here… ask your local pharmacist). To illustrate this, consider the following table of RxNorm ingredients that start with ‘Erythromycin’:
| RXCUI | SAB | TTY | STR |
| 4053 | RXNORM | IN | Erythromycin |
| 4055 | RXNORM | IN | Erythromycin Estolate |
| 4056 | RXNORM | IN | Erythromycin Ethylsuccinate |
| 24346 | RXNORM | IN | Erythromycin Gluceptate |
| 24347 | RXNORM | IN | erythromycin lactobionate |
| 24351 | RXNORM | IN | erythromycin stearate |
| 236847 | RXNORM | IN | ERYTHROMYCIN STINOPRATE |
In this list you can see the base ingredient of ‘Erythromycin’ and the variations (or different salt forms of Erythromycin in this example). In most cases the variations of a base ingredient are clinically equivalent to the base ingredient and add no additional clinical value other than accurately describing the variation of the ingredient in a specific formulation. Some compendia have only base ingredients, some have base and variations and some have defined relationships between the variation and the base.
This information can come into play when processing clinical rules, so you need to be aware of it. For example, a clinical rule may only be attached to the base ingredient, so you need to use the relationship from the variation to the base ingredient to activate the rule.
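As a minimal sketch of that lookup in Python: the RXCUIs below come from the Erythromycin table above, but the variation-to-base mapping, the rule text and the function name are all hypothetical stand-ins for whatever relationships and rule content your compendium actually provides.

```python
# Hypothetical variation -> base relationships, keyed by RXCUI.
# Your compendium may or may not supply these links.
BASE_OF = {
    "4055": "4053",   # Erythromycin Estolate -> Erythromycin
    "4056": "4053",   # Erythromycin Ethylsuccinate -> Erythromycin
    "24346": "4053",  # Erythromycin Gluceptate -> Erythromycin
}

# Hypothetical clinical rule attached only to the base ingredient.
RULES = {
    "4053": ["example rule attached to Erythromycin"],
}


def rules_for(rxcui: str):
    """Collect rules on the ingredient itself, then on its base ingredient (if any)."""
    found = list(RULES.get(rxcui, []))
    base = BASE_OF.get(rxcui)
    if base is not None:
        found.extend(RULES.get(base, []))
    return found


if __name__ == "__main__":
    print(rules_for("4055"))  # the rule is found through the base ingredient
    print(rules_for("612"))   # no rule and no base in this sketch
```

Without the hop through `BASE_OF`, an order for the estolate form would silently miss the rule.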
In some situations an ingredient variation may represent something other than a salt form of the base. Here are some examples from RxNorm of non-salt variations:
| RXCUI | SAB | TTY | STR |
| 352374 | RXNORM | IN | drotrecogin alfa |
| 353106 | RXNORM | IN | drotrecogin alfa (activated), lyophilized |

| RXCUI | SAB | TTY | STR |
| 797550 | RXNORM | IN | Immune Globulin (Human) |
| 617615 | RXNORM | IN | Immune Globulin Subcutaneous (Human) |
In some cases there may be no base ingredient – only variations:
| RXCUI | SAB | TTY | STR |
| 17609 | RXNORM | IN | aluminum acetate |
| 89858 | RXNORM | IN | Aluminum carbonate |
| 17610 | RXNORM | IN | aluminum chlorhydrate |
| 46241 | RXNORM | IN | aluminum chloride |
| 17611 | RXNORM | IN | Aluminum chloride hexahydrate |
| 612 | RXNORM | IN | Aluminum Hydroxide |
| 81948 | RXNORM | IN | Aluminum Hydroxide (Gel), Dried |
| 613 | RXNORM | IN | Aluminum Hydroxide Gel |
| 46242 | RXNORM | IN | aluminum magnesium hydroxide |
| 615 | RXNORM | IN | Aluminum Oxide |
| 17618 | RXNORM | IN | aluminum phosphate |
| 54989 | RXNORM | IN | aluminum potassium sulfate |
| 543375 | RXNORM | IN | Aluminum Sesquichlorohydrate |
| 17621 | RXNORM | IN | aluminum sulfate |
As an implementer, an awareness of the nature of base ingredients and their variations is useful as it can motivate you to look at the data in different ways, both in terms of development and validation.
When is an Ingredient not an Ingredient?
Every now and then you may encounter an ingredient that is present in an ingredient set that is not an ingredient. You will recognize this because under certain situations they will wreak havoc. Sometimes it will be an inactive ingredient, as discussed earlier, and other times it may be a clinical work-around.
An example could be an ingredient set that has ‘water’ and an active ingredient. If the user happens to select that drug (either by picking the ingredient set or the brand name synonym) to represent an allergen, they have unwittingly indicated that the patient is allergic to any ingredient set that includes ‘water’.
Another example of a clinical work-around is an ingredient term that represents a concept like ‘sugar-free’, ‘alcohol-free’ or ‘Preservative-Free’. These were introduced to support firing significant clinical alerts without requiring existing terminology users to re-program their applications. In that respect they are ingenious and likely saved patients’ lives. The unintended consequence of this, as with the water example, is that if an ingredient set with a ‘freeness’ ingredient is used as an allergen it introduces the notion that the patient is allergic to everything else that is ‘sugar-free’. There are not many of these, but if you encounter one you should make sure that you have exceptions in your allergy checking to ignore ‘free-ness’-based hits.
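Here is a rough Python sketch of what such an exception might look like. The exclusion list reuses the terms mentioned above (‘water’ and the ‘free-ness’ terms); the ingredient sets are invented for illustration, and a real implementation would work from your terminology's identifiers rather than from names.

```python
# Terms that can appear in an ingredient set but should never,
# on their own, produce an allergy hit. Assumption: matching by
# lower-cased name is good enough for this illustration.
NOT_REALLY_INGREDIENTS = {"water", "sugar-free", "alcohol-free", "preservative-free"}


def allergy_hits(allergen_set, candidate_set):
    """Return shared ingredients, ignoring water and 'free-ness' terms."""
    allergen = {name.lower() for name in allergen_set}
    candidate = {name.lower() for name in candidate_set}
    return (allergen & candidate) - NOT_REALLY_INGREDIENTS


# Invented ingredient sets, for illustration only.
recorded_allergen = {"Erythromycin", "Sugar-Free"}
unrelated_drug = {"Aluminum Hydroxide", "Sugar-Free"}
related_drug = {"erythromycin", "water"}

if __name__ == "__main__":
    print(allergy_hits(recorded_allergen, unrelated_drug))  # set()
    print(allergy_hits(recorded_allergen, related_drug))    # {'erythromycin'}
```

Without the exclusion, the first check would flag the unrelated drug purely because both sets carry ‘sugar-free’.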
Finally, some ingredient terminologies may include the notion of a route of administration in an ingredient (see the above example of 'immune globulin' in RxNorm). This is less of an issue because the route is not typically represented as a distinct ingredient, so the net result is similar to a variation of a base ingredient. Sometimes in these cases the routed ingredient may be disconnected from the base ingredient for clinical reasons.
Ingredients Drive Medication Terminologies
Every use model for medication terminologies is driven by the ingredients. Take some time, with whatever terminology you have chosen for your implementation, to understand how the ingredients work and how they factor into the decision support modules. Understanding this facet of your medication provider's content will provide significant insight into how everything else works.
In the next post of this series we will cover the secondary characteristics of a medication concept.
The Medication Concept Continuum
In order to understand how a medication concept can be appropriately leveraged, you need to understand its characteristics and which are required to support a particular activity.
For the purposes of this primer, the term ‘medication concept' covers any entity that represents a medication from the ingredient to the physical packaged product. This excludes therapeutic classes, allergy classes and other taxonomies that may be used to group or relate medication concepts. This also excludes a medication order, which is an orchestration of a medication concept with other contextual information (we will talk more about this in a later post...).
The following diagram depicts the Medication Concepts Continuum. It is intended to provide a ‘cheat sheet' that can be referred to throughout the rest of this lesson. The characteristic breakdowns in this diagram are generalizations and, as such, do not represent the actual structure of any existing drug vocabulary vendor. To interpret how a particular vendor's structure fits into this model, please refer to your vocabulary vendor's documentation.

A medication concept falls into one of three generalizations: Abstract, Dispensable or Actual.
The concepts in the Actual generalization represent things that physically exist in the real world. You can actually put your hands on one. A tablet or bottle of tablets, for example, is an actual medication concept.
The concepts in the Dispensable generalization represent things that can be conceptually dispensed and administered to a patient. Another way to think of them is that they represent a completed notion of a medication. In other words, if I have a dispensable concept I have sufficient information to select a specific actual concept off of the shelf.
The concepts in the Abstract generalization (which is most of them) represent primitive characteristic concepts or an incomplete combination of characteristics. This type of concept is typically created to function as a navigation pivot or as an anchor for additional information. An example of a multi-characteristic abstract concept would be ‘warfarin sodium tablet' (ingredient + dose form), which is not sufficient to identify an actual physical entity, but it conveys an idea of an ingredient and whether or not it will affect the patient systemically.
The Da Vinci Code(s)
When you look at the continuum you will notice the central inverted pyramid (not unlike La Pyramide Inversée in front of the Louvre) that represents the continuum of medication concepts. These are the codes that drive order entry, prescribing, alert checking and med-reconciliation.
On the right side there is a collection of other concepts. These other concepts are primitives.
You can think of the primitives as the raw building blocks that, when combined, establish the medication concepts that we find in the continuum. Examples of a primitive would be ‘route of administration', ‘dosage form' and ‘unit of measure'. For obvious reasons, these are critical to the meaning and stability of the medication concepts. We will cover them in more detail in a future post.
When you examine a drug concept it is important to note where it plugs into the continuum. In many ways, that will give you an idea as to what you can actually do with it:
- To dispense a medication you need a dispensable
- To determine if a medication systemically affects the patient you need dose form and/or implied route
- To properly validate the dose you need to know the period of release
- To determine the inactive ingredients in a medication you need to know who manufactured it
- To be able to scan it with a bar code reader you may need to know its packaging details.
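The checklist above can be sketched as a simple lookup that answers "does this concept carry what this activity needs?". This is only an illustration: the activity names and characteristic labels below are made up for the example and do not come from any vendor's terminology.

```python
# Hypothetical labels for activities and characteristics (not from any vendor).
# Each activity lists alternative sets of characteristics; satisfying any one
# alternative is enough (e.g. dose form OR implied route).
REQUIREMENTS = {
    "dispense": [{"dispensable"}],
    "check_systemic_effect": [{"dose_form"}, {"implied_route"}],
    "validate_dose": [{"release_period"}],
    "list_inactive_ingredients": [{"manufacturer"}],
    "scan_bar_code": [{"packaging"}],
}


def supports(activity, characteristics):
    """Return True if the concept's characteristics satisfy the activity."""
    have = set(characteristics)
    return any(needed <= have for needed in REQUIREMENTS[activity])


# 'warfarin sodium tablet' is an abstract concept: ingredient + dose form.
warfarin_tablet = {"ingredient", "dose_form"}
print(supports("check_systemic_effect", warfarin_tablet))  # True
print(supports("dispense", warfarin_tablet))               # False
```

A check like this, run before a concept is handed to a decision support function, is one way to catch a concept from the wrong level of the continuum early.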
You may think it sounds straightforward, but I have seen attempts to use the wrong concept for the wrong purpose and, like trying to make a fruit smoothie with a chipper shredder... it did not end well.
The next topic will be how medication ingredients are typically represented and why it is not always as straightforward as it seems.
Part One - The Medication Domain
As we enter the second decade of the 21st century, we have been given a mandate to evolve our simplistic, episodic and transient patient records into the robust, longitudinal, and precise paragons of technology that have been promised in board meetings, speeches and science fiction movies. This can be accomplished. Like all good architecture, achieving this objective will require an evolution over time that starts with stable foundational concepts that support the goal. There are a number of domains of clinical terminology: problems, procedures, laboratory tests, nursing orders, etc. One of the most pervasive and complicated of these is medications.
The purpose of this series on medication concepts for engineers is to provide an overview of the moving parts of medications, how they exist in terminologies today, how they are used in systems today and how they could be used in the future to the betterment of healthcare IT.
Medication concepts are used throughout applications in healthcare information technology in various ways. They are used to order medications, record allergies, track inventory, manage purchase pricing, identify insurance coverage, transmit prescriptions, trigger alerts and workflow rules, and the list goes on. It should not be a surprise that the ways that medications are represented in the various standard, proprietary and homegrown terminologies have become quite complicated over the years.
How is the Medication Domain Different?
The medication domain is different from other clinical domains in a few ways.
General functional variability and the resulting ‘Fuzziness'
Unlike a laboratory test or condition, for example, a medication concept can be used to represent something the patient is taking, something I want them to take, something they cannot take (an allergy), something I want to bill for, something I have to pay for and something I have in inventory. This usage variability, and the overleveraging of existing terminologies to accommodate it, has led to some drug concepts becoming ‘fuzzy' or indistinct as they have evolved in multiple directions to serve multiple purposes. This ‘fuzziness' results in many of the ‘why is it set up like this?' questions that are encountered when an engineer is introduced to the medication domain for the first time. For example, why do several of the medication terminologies have ‘route' as an attribute of the medication? To my knowledge there are no medications that have a rectum, so why would a medication have a rectal route? Why do so few medication terminologies represent bioavailability, when it can have such a significant impact on identifying the right dose? The answers to many of these questions lie in the origins and design drivers behind those concepts. Medication concepts were used for business transactions before they were used in the clinical setting. If you do not consider this, it is very easy to embrace medication concepts with a number of incorrect assumptions.
Multiple clinical uses and the required attributes
The other difference with the medication domain is the number of clinical decision support functions that can be driven by medication concepts in the delivery of patient care. Using medication concepts you can check for allergies, check for drug interactions, avoid medication contraindications and verify appropriate dosage levels, just to name a few. These different uses place different requirements on the characteristics a medication concept must possess in order to drive them correctly. If you try to drive a particular clinical function with a concept from the wrong level of granularity, the result can range from too much clinical advice to no advice or, worse, bad advice.
Multiple levels of granularity
The medication domain is likely the most widely implemented clinical terminology domain in healthcare today. However, when you take a closer look at these implementations you see that different systems have implemented different levels of medication terminology in different ways. Medication concepts have been modeled as ingredients, classes, drug and route abstractions, dispensable medications and specific drug products. In other words, if two applications have implemented medication allergies there is a fairly good chance that they did not do so in the same way or with the same types of medication concepts.
Multiple third party sources
Medication concepts have been around for a while and were the first concepts used to really drive clinical decision support. As a result, a number of companies have developed stable, updated proprietary medication terminologies and associated clinical content. This is good, as it has given end user systems the freedom to stop managing local terminologies and focus on developing sophisticated applications and on patient care. The downside, however, is that these vendors have each created terminologies that are slightly different. These differences are obvious in some cases and subtle in others. A misplaced assumption, especially with the subtle differences, can mean the difference between getting an allergy hit or not (or getting a thousand extra allergy hits...).
So... what is a medication concept? In the next article we will start to examine the characteristics of medication concepts and how they drive different functionality in healthcare applications.
I have completed a new screencast that provides a 17-minute overview of how to leverage the ICD9 cross map files in the SNOMED-CT optional download. Grab a sandwich and MS Access, curl up by the monitor and see how this great resource from the NLM can save you some time and effort mapping SNOMED-CT terms to ICD9.
Follow this link to the resource page to access the screencast.
I am always looking for new ideas for screencasts. If you have one, email me and let me know.
Have a great week!
For those of you evaluating the use of the SNOMED-CT Core Subset, you need to be aware that the NLM has made some non-trivial changes to the format and content of the subset file in the latest (second) release dated 200908 (July).
If you have developed a load program, as we have, that uses the subset file to identify concepts that are included in the subset, it is likely you will need to modify that program.
Here is a summary of the changes:
Term Changes:
- Nine terms were added and eleven terms were retired from the core subset.
New Terms:
SNOMED_CID | SNOMED_FSN | SNOMED_CONCEPT_STATUS |
208892001 | Closed traumatic dislocation of hip (disorder) | Current |
165468009 | Erythrocyte sedimentation rate (ESR) raised (finding) | Current |
197321007 | Steatosis of liver (disorder) | Current |
40733004 | Infectious disease (disorder) | Current |
165346000 | Laboratory test result abnormal (situation) | Current |
442234001 | Serum cholesterol borderline high (finding) | Current |
442438000 | Influenza due to Influenza A virus (disorder) | Current |
442551007 | Dental caries extending into dentine (disorder) | Current |
4557003 | Preinfarction syndrome (disorder) | Current |
Retired Terms:
SNOMED_CID | SNOMED_FSN | SNOMED_CONCEPT_STATUS |
41006004 | Depression (finding) | Ambiguous |
309158009 | Laboratory finding abnormal (navigational concept) | Current |
371330000 | Fatty liver (disorder) | Duplicate |
131016008 | Increased thyroid stimulating hormone level (finding) | Duplicate |
166829003 | Serum cholesterol borderline (finding) | Ambiguous |
191415002 | Communicable disease (navigational concept) | Current |
78431007 | Influenza due to Influenza virus, type A, human (disorder) | Ambiguous |
416103000 | Elevated erythrocyte sedimentation rate (finding) | Duplicate |
50047001 | Compound dental caries (disorder) | Ambiguous |
63079007 | Closed traumatic dislocation of hip joint (disorder) | Duplicate |
64333001 | Preinfarction angina (disorder) | Duplicate |
File Structure Changes:
June Subset | July Subset | Change |
SNOMED_CID | SNOMED_CID | - |
FSN | SNOMED_FSN | Name Change |
CONCEPT_STATUS | SNOMED_CONCEPT_STATUS | Name Change; now uses Description instead of Code!!! |
UMLS_CUI | UMLS_CUI | - |
OCCURRENCE | OCCURRENCE | - |
USAGE | USAGE | - |
- | FIRST_IN_SUBSET | New Field (YYYYMM) |
IS_RETIRED | IS_RETIRED_FROM_SUBSET | Name Change |
- | LAST_IN_SUBSET | New Field (YYYYMM) |
- | REPLACED_BY | New Field (SNOMED-CT Concept ID) |
New Fields:
New Field | What is it? |
FIRST_IN_SUBSET | This is the issue year and month when the concept first appeared in the subset. |
LAST_IN_SUBSET | This is the issue year and month when the concept last appeared in the subset as a non-retired concept. |
REPLACED_BY | Concept ID of the concept replacing a retired concept. |
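As a rough sketch of how REPLACED_BY might be used, the function below follows replacement links from a retired concept to the concept that currently stands in for it. It assumes you have already loaded the file into a dictionary keyed by SNOMED_CID; the dictionary, the function name and the placeholder IDs ("A", "B", "C") are all made up for illustration, and the actual column header in your copy of the file may differ.

```python
def resolve_current(concept_id, replaced_by):
    """Follow REPLACED_BY links until a concept with no replacement is found.

    replaced_by is a hypothetical dict of SNOMED_CID -> REPLACED_BY value,
    where an empty or missing value means "not replaced".
    """
    seen = set()
    current = concept_id
    while replaced_by.get(current):
        if current in seen:
            # Defensive check: bad data could contain a replacement loop.
            raise ValueError("replacement cycle at concept " + current)
        seen.add(current)
        current = replaced_by[current]
    return current


# Placeholder IDs, not real SNOMED-CT concepts.
links = {"A": "B", "B": "C", "C": ""}
print(resolve_current("A", links))  # C
print(resolve_current("C", links))  # C
```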
OUCH!
If you developed a program that loads the core subset file this update likely broke it.
If you are using a text ODBC/OLEDB driver to load the file the name changes to the columns broke it.
If you are accessing the fields using sequential access and splitting the fields using the pipe delimiter, the insertion of FIRST_IN_SUBSET before the IS_RETIRED field will break your load program.
If you created a function that uses the coded values in the CONCEPT_STATUS field to support your load logic, that is now broken by the switch to the text value. (I don't understand this change at all. It seems to run contrary to the move away from free text. I would change it back...)
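One way to make a load program less brittle against all three kinds of change is to read fields by header name, translate old names to new ones, and normalize the status value. The sketch below is a minimal illustration, not our actual load program. The rename table comes from the file structure changes listed above. The code-to-description mapping assumes the standard SNOMED-CT concept status codes (0, 2 and 4); verify it against your own data. The sample rows are invented.

```python
import csv
import io

# Old (June) column names -> new (July) names, from the table above.
RENAMED = {
    "FSN": "SNOMED_FSN",
    "CONCEPT_STATUS": "SNOMED_CONCEPT_STATUS",
    "IS_RETIRED": "IS_RETIRED_FROM_SUBSET",
}

# Assumption: status codes used in the first release, mapped to the
# descriptions seen in the second release.
STATUS_BY_CODE = {"0": "Current", "2": "Duplicate", "4": "Ambiguous"}


def read_subset(stream):
    """Yield one dict per record, keyed by the new column names."""
    reader = csv.DictReader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    for raw in reader:
        # Drop the empty/None keys produced by trailing pipes or extra fields.
        row = {RENAMED.get(k, k): v for k, v in raw.items() if k}
        status = row.get("SNOMED_CONCEPT_STATUS")
        row["SNOMED_CONCEPT_STATUS"] = STATUS_BY_CODE.get(status, status)
        yield row


# Invented sample data in the old and new layouts.
old = io.StringIO("SNOMED_CID|FSN|CONCEPT_STATUS\n12345|Example (disorder)|0\n")
new = io.StringIO(
    "SNOMED_CID|SNOMED_FSN|SNOMED_CONCEPT_STATUS\n12345|Example (disorder)|Current\n"
)
print(list(read_subset(old)) == list(read_subset(new)))  # True
```

Because fields are looked up by name, an inserted column such as FIRST_IN_SUBSET simply shows up as an extra key instead of shifting everything after it.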
Needless to say, this update was a painful one for the early adopter. But if you have already created logic based on the inaugural release of the core subset data, an early adopter is what you are, and that is not without risks.
Along with the painful changes that left our load program writhing on the ground, clutching its face and yelling "You broke my nose!" are some new useful additions.
The FIRST_IN_SUBSET, LAST_IN_SUBSET and REPLACED_BY_SNOMED_CID are useful lifecycle management fields that will help with the management of term availability.
Patience is a Virtue
If this update frustrated you, I would ask that you focus on the positive and consider that the Core subset is another in a growing line of great, "FREE" work products from our friends at the NLM.
It is also worth noting that as we in the HIT industry leverage SNOMED-CT, RxNorm and LOINC the bar will continue to be raised in terms of update frequency and format stability. From the interactions I have had with the NLM, I expect that they are paying attention and will be responsive as we evolve and leverage them more.
Free Advice
As someone who worked at a commercial content provider, I would encourage the following with respect to all data products.
1.) Do not change field/column names lightly if they are included in the file, as developers will leverage that with a text driver to load the information.
2.) Avoid inserting fields into a record, as some load programs will operate based on field order. If you append new fields to the end of the record you will be less likely to disrupt the load.
3.) Coded fields are better than text fields...always.
Regardless of the constructive criticism...this is good stuff. If we at Clinical Architecture can help you better take advantage of it, give us a call!