By: Charlie Harp
PLEASE NOTE: The Meaningful Use Final Rule was released on July 12 and the UNII is no longer listed as the standard for allergy terminology. In fact, there is NO standard listed for allergy interoperability. For the record, I do not think that the following blog post, which “aired” on June 30th, influenced the government’s decision-making process in any way. My next post will suggest a significant stimulus for healthcare IT companies with the word ‘architecture’ in the name… just in case. In order to preserve history, I am leaving the post as it was. It still provides a decent overview of UNII for those of you who would like to leverage it.
The vocabulary chosen to represent patient allergies is the FDA Unique Ingredient Identifier or UNII (I guess ‘UII’ would be a difficult acronym to use in casual conversation…).
The UNII is part of the Substance Registration System whose purpose is to provide unique identifiers for:
- Food substances are specific foods or components of food, regardless of whether the food is in conventional food form or a dietary supplement, such as vitamins, minerals, herbs, or other similar nutritional substances.
- Drug substances include both active and inactive ingredients used in drug products, including those for veterinary purposes.
- Biologic substances include both active and inactive ingredients used in biologics, such as blood products, therapeutic products, vaccines, cellular and gene therapy products, allergenic products, tissues, and certain devices (e.g., enzymes in stabilized solutions).
- Device substances include certain components of some devices (e.g. silicon for implants, and chemical reagents for glucose test kits).
- Cosmetic substances are components of cosmetic products, such as flavors, fragrances, colorants, vitamins, plant- and animal-derived ingredients, and polymers.
There is more general information on the UNII here.
According to the above site, the UNII is:
- One of the core components of the United States Federal Medication Terminology.
- Used in the FDA’s Structured Product Labeling
- Used to assist in the generation of the National Library of Medicine’s (NLM’s) RxNorm.
- A US government standard for drug ingredient and food allergen identifiers
- A component of the Environmental Protection Agency’s Substance Registry System (future)
The UNII may be found in:
- NLM’s Unified Medical Language System (UMLS)
- National Cancer Institute’s Enterprise Vocabulary Service
- USP Dictionary of USAN and International Drug Names (future)
- FDA Data Standards Council website
- VA National Drug File Reference Terminology (NDF-RT)
- FDA Inactive Ingredient Query Application
The UNII is provided, rather inconveniently, in Excel format.
There is a multi-worksheet (A-S, T-Z), denormalized, zipped Excel workbook dated 6/25/2010 at the following location.
The sheets are difficult to work with because they combine the concepts and their synonyms into a single list. It is also worth noting that, in the data provided, the synonyms do not have unique identifiers.
The primary sheets with the UNII codes in them have the following columns:
|Preferred substance name||This is the preferred name of the substance|
|UNII||The unique identifier for the preferred substance name|
|Substance name||A synonym for the preferred substance name|
|ITIS TSN||This is not really documented, BUT I believe it is, where applicable, a code representing the USDA Integrated Taxonomic Information System (ITIS) Taxonomic Serial Number (TSN). This appears to be populated only for food ingredients.|
|Molecular Formula||This is, you guessed it, the molecular formula. It seems to only be populated for chemical ingredients.|
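To illustrate what normalizing this layout might look like, here is a minimal Python sketch. It assumes each row arrives as a five-value tuple in the column order listed above. The sample rows, the field names, and the `synonym_id` surrogate key are all hypothetical placeholders of mine, not real UNII data or anything the FDA publishes.

```python
# Hypothetical rows in the layout of the denormalized sheet:
# (preferred name, UNII, substance name, ITIS TSN, molecular formula).
# The names and codes are placeholders, not real UNII data.
SAMPLE_ROWS = [
    ("SUBSTANCE A", "AAAAAAAAA1", "SUBSTANCE A", "", "C2H6O"),
    ("SUBSTANCE A", "AAAAAAAAA1", "SUBSTANCE A ALIAS", "", "C2H6O"),
    ("FOOD B", "BBBBBBBBB2", "FOOD B", "12345", ""),
    ("FOOD B", "BBBBBBBBB2", "FOOD B", "12345", ""),  # duplicate row
]


def normalize(rows):
    """Split denormalized rows into a concept list and a synonym list."""
    concepts = {}   # keyed by UNII, one entry per concept
    synonyms = []   # one entry per distinct (UNII, name) pair
    seen = set()
    for preferred, unii, name, tsn, formula in rows:
        # The first row seen for a UNII defines the concept record.
        concepts.setdefault(unii, {
            "unii": unii,
            "preferred_name": preferred,
            "itis_tsn": tsn,
            "molecular_formula": formula,
        })
        if (unii, name) not in seen:
            seen.add((unii, name))
            synonyms.append({
                "synonym_id": len(synonyms) + 1,  # surrogate key we assign
                "unii": unii,
                "name": name,
                "is_preferred": name == preferred,
            })
    return list(concepts.values()), synonyms


concepts, synonyms = normalize(SAMPLE_ROWS)
print(len(concepts), "concepts,", len(synonyms), "synonyms")
```

Giving each synonym its own identifier is a design choice on my part; it makes up for the missing synonym identifiers in the source data, but those numbers would only be stable within one load of the file.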
Code structure and design
The UNII code is a ten-character alphanumeric code. The first nine characters are randomly generated and the tenth is determined by an algorithm (a check digit for you old timers who wrote serial port interfaces…).
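As a small illustration, the shape of the code can be checked in a few lines of Python. This sketch only tests length and character set; it does NOT verify the check character, because the algorithm is not described in this post. The function name and the choice to trim whitespace and upper-case the input are my own assumptions.

```python
import re

# Ten characters, each an upper-case letter or a digit.
UNII_SHAPE = re.compile(r"[A-Z0-9]{10}")


def has_unii_shape(code):
    """Return True if code looks like a UNII (shape only, no check-character test)."""
    cleaned = code.strip().upper()  # assumption: tolerate padding and lower case
    return UNII_SHAPE.fullmatch(cleaned) is not None


print(has_unii_shape("AAAAAAAAA1"))   # placeholder code, not a real UNII
print(has_unii_shape("TOO-SHORT"))
```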
There are 16,655 unique UNII concepts in the provided list.
There are 67,715 synonyms, including the preferred names.
We know that the scope of the UNIIs covers a number of types of substances. It would be very useful if there were a way of telling which UNIIs are of which type so that we could filter them. I may not want to include cosmetics OR biologics in my allergy pick list, for example.
Most systems that track allergies, medication allergies in particular, allow the user to represent allergies using medication ingredients, common brand names OR allergy classes. The UNII scope only covers one of these. How will we use UNII to represent a documented allergy or adverse reaction to ‘Nyquil’ or ‘cephalosporins’? Also, if you are going to represent allergies, should the list include animal and environmental allergies?
Not a Rant
I don’t want to get off on a rant here… but it seems like for some of these meaningful use terminologies, rather than creating a terminology designed to support appropriate interoperability, we looked to see what we already had lying around. UNII is not an allergy terminology, it is a substance terminology. They are not the same thing. They are terminology domains that merely overlap. I know, creating a terminology is hard but, ahem, 19 billion dollars! This is not a criticism directed at the UNII codes or the people that maintain them. It looks like a very thorough substance terminology with a fairly simple design, but it will not support allergy interoperability as it should be supported. Now, we could change UNII terminology to include allergy classes, animals and environmental terms, but that would make it a less wonderful substance terminology then, wouldn’t it? Perhaps a better approach would be to use UNII in our allergy interoperability terminology, in the utopian future, to represent substances (with types please) and we could append the other allergy types (classes, animals, environmental) to save money and reduce the deficit. I could live with that.
(I will now climb down from my virtual soap box, so that you can come out from behind your furniture…)
To make up for the non-rant, I am happy to provide a normalized version of the most recent UNII data for your experimentation. It is provided in a zip file as two pipe-delimited (‘|’) text files with the following structure.
If you would like to receive this file, contact us and ask for it. We will email it to you or provide you with access to our FTP server.
I want to thank Bonnie for reminding me that I should do this post.
I will try to post more frequently.