Episode 7: A History and Analysis of ICD-10

May 18, 2020

On this episode of the Informonster Podcast, Charlie Harp discusses the history of ICD-10. He talks about how it was created and evolved with the healthcare industry, and what he thinks the next iteration could do to improve for the industry going forward.

I’m Charlie Harp, and this is the Informonster Podcast. Now today on the Informonster Podcast, I’m going to tell you a story; a story of ICD-10 C.M., and it starts something like this: A long time ago, in a greenhouse far, far away. The origin of the ICD goes back to the late 1700s. The main driver for early attempts at establishing disease classification schemes was the tracking of mortality statistics. A French physician and botanist, François Boissier de Sauvages de Lacroix, whose very name was a taxonomy in and of itself, was one of the early nosologists who inspired the many who followed in his footsteps. Now, if you’re unfamiliar with the term nosologist, well, nosology is the branch of medical science dealing with the classification of diseases, and nosologists were the people who endeavored to do just that. Now in 1763, Sauvages published his 850-page Nosologia Methodica, which organized the known causes of death and illness at the time into 10 major categories. Three years after his publication, at the age of 89, Sauvages died, and in a twist of irony, I have been unable to find a record of his cause of death; go figure.

Now let’s jump forward to 1860. At the International Statistical Congress held in London, the Florence Nightingale, who was known for being a passionate statistician, advocated for the uniform collection of hospital statistics so that outcomes could be compared by hospital, region, and country. This proposal was adopted, and ultimately resulted in the first model of systematic collection of hospital data for the purposes of tracking causes of death. Later in 1893, the Bertillon Classification of Causes of Death was introduced by Jacques Bertillon, another French physician. This classification system was based on the principle of distinguishing between general diseases and those localized to a particular organ or anatomical site. In 1900, Bertillon’s Classification was adopted by the American Public Health Association, and was rebranded as the International Classification of Causes of Death.

At this time, the ICD contained 161 primary codes, some with modifiers. The code was a number, and the modifier was a letter. For example, 146, “burns”, modifier A, “by fire”, modifier B, “by corrosive substance.” This simple list pattern, that we use every day in Word documents and on websites, was the very first ICD coding scheme. In the document that he published at the time, Bertillon had a section called the prefatory, which is kind of like a foreword, and in this prefatory he had a couple of interesting things to say. So let me tell you the first one. What he said was, “The time is especially suitable for the general adoption of a uniform classification of causes of death, to the end that the mortality data of the coming century may be more thoroughly comparable than at present.” That’s my Bertillon voice, but what’s interesting is that what he was really saying is we need to do this, and we need to do this in a coded way, so that in the future we’ll be smarter than we are today.

And later in the prefatory, Bertillon goes on to say, “The Bertillon Classification is not presented as by any means a perfect system of classification of causes of death. No perfect system has ever been devised, and should there be, the progress of medical science would in time render it obsolete.” Now that statement, right there, is something that many people today don’t completely understand, and that is the concept of semantic drift in healthcare terminology, which means that no healthcare terminology is ever stable as long as we continue to evolve healthcare itself. Things are always shifting and changing. And so I just find it fascinating that when you reflect on all this, it is amazing and terrible to consider that the topics and issues we discuss today are not that different from those discussed over a century ago.

So, at a conference in Paris in 1900, delegates from 26 countries reviewed the newly renamed international list of causes of death and decided that they would meet every 10 years to review and revise it. Now, this thing, ladies and gentlemen, that I’m going through is what we call an origin story, and like every good origin story, you examine the event that created the hero or villain to determine why they chose the path we find them on today. Now for ICD, it started out about death, right? “What was killing these people” was the question it was meant to answer and track. It was also about statistics. It wasn’t about the patient; the patient was already dead. It was about understanding what was killing people so they could do a better job, and make a better future, and understand what was going on. So as time passed through the early 20th century, minor revisions were made to the ICD. In 1946, the sixth revision, ICD-6, was released. And this revision expanded the ICD to include morbidity as well as mortality, and it was renamed accordingly to the International Statistical Classification of Diseases, Injuries, and Causes of Death.

Now, the addition of injuries and diseases also introduced the need to add new modifiers for anatomic location and other things. This revision also introduced the alphabetic index, to accompany the tabular index to aid in finding the appropriate code, as the number of terms had increased significantly. This is the point at which ICD shifted from a focus on death to a focus on people’s problems and death. This allowed people that were tracking statistics retrospectively to evaluate not just what killed people, but also what was hurting and afflicting people. This was especially useful when assessing labor force information, and military capabilities, and things that were affecting your population. Now, the seventh edition was in 1955 and the eighth in 1965. These two revisions were limited mostly to corrections and minor updates, but in 1975, the ninth revision of ICD was introduced; and this is the release we know as ICD-9.

Now, ICD-9 is another significant shift in the nature of the ICD. Prior to ICD-9, the ICD had been used primarily for statistics and retrospective reporting. ICD-9 ushered in a new group of users that needed a code system to document patient information for clinical tracking and billing. This resulted in dueling use cases, one use case with statistical reporting, and the other with patient clinical and administrative management. These two groups disagreed about how they should proceed at the time, based upon the different use cases. And in the end, they decided to compromise and leverage the ICD for these two distinctly different purposes. And this, dear listener, is when our hero became a villain. Now, why would I say this? Well, primarily for dramatic effect, of course. But in addition, I want to highlight that this was the point at which the ICD fell into a trap that happens to almost every standard terminology; almost every terminology.

Hear me out. When you take a terminology designed to solve a specific problem and try to use it to solve another fundamentally different problem, you create an architectural compromise. Now once compromised, the appropriateness of that terminology for either use case becomes fuzzy, and each use case pulls aspects of the architecture to suit its own disparate purposes. After a while, the terminology is so compromised that it becomes unwieldy and may not be suitable for any purpose. Think about the Swiss army knife. It’s great if you need a toothpick, and a fork, and a knife, and a nail file, but it’s not great at any one of those things. It’s a compromise, just so you can have all those things in your pocket at once.

All right, so what happened next? Well, in 1990, yes, 30 years ago, the 10th revision was released: ICD-10. Now, if you think of the sixth edition as puberty for the ICD, where it broadened its focus from mortality to morbidity and mortality, and the ninth edition as the ICD’s coming-of-age story, when it starts to cope with the realities and compromises of being an adult, then you can think of the 10th edition as a midlife crisis. ICD starts asking the question, “who am I really?” And it buys a motorcycle.

Now in its last release, ICD-9 had about 17,500 terms and the international version of ICD-10 has just shy of 18,000 terms. So you might think it’s not that different, right? Well, there were some pretty significant changes in the way they looked at things, but that happened throughout the lifespan of ICD. The main difference is that the version we use in the U.S. is not ICD-10. It’s ICD-10 C.M., where the C.M. stands for Clinical Modification, and the ICD-10 C.M. has over 95,000 terms. Let that sink in. ICD-10 has 18,000 terms. ICD-10 C.M. has 95,000 terms. Now I know what you might be thinking, “Charlie, how do you get from 18,000 to 95,000? That’s one heck of a clinical modification! What happened?” Well, we introduced into ICD-10 C.M. a number of causes of injury, places, activities, and other miscellaneous healthcare codes that weren’t in ICD-10 proper, and in some cases we expanded things because we weren’t satisfied with what they had in ICD-10. In ICD-10, there’s a code called “W56.” The term associated with it is, “Contact with Marine Animal.” So there’s one code. In ICD-10 C.M., this code, this rubric, “W56,” has 91 children delineating what the animal was and what it did. Were you bitten by a sea lion, or were you struck by an orca? It is a veritable cornucopia of Seussian semantics.

In addition to these high-value additions to ICD-10, we also wanted to eliminate the need for any kind of post-coordination and other modifiers that you could add on to an ICD-10 code. And we did this by generating a combinatorial explosion of all predictable modifiers as codes. This means that for every relevant code, each combination of laterality, nature of encounter, healing information, et cetera, generates a new code.
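The combinatorial explosion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base code "X00.00", the laterality digits, and the encounter letters are made up for the example and are not taken from an ICD-10 C.M. release.

```python
from itertools import product

# Hypothetical modifier axes; the real characters and meanings vary by section.
LATERALITY = {"1": "right", "2": "left", "9": "unspecified side"}
ENCOUNTER = {"A": "initial encounter", "D": "subsequent encounter", "S": "sequela"}


def expand(base_code: str, base_term: str) -> dict:
    """Pre-coordinate every laterality/encounter combination into its own code."""
    codes = {}
    for (lat, lat_text), (enc, enc_text) in product(LATERALITY.items(), ENCOUNTER.items()):
        codes[f"{base_code}{lat}{enc}"] = f"{base_term}, {lat_text}, {enc_text}"
    return codes


# "X00.00" is a made-up base code used only for illustration.
expanded = expand("X00.00", "Example injury")
print(len(expanded))  # 3 laterality values x 3 encounter values = 9 codes
```

One base concept with two small modifier axes already yields nine codes; add a third axis and the count multiplies again.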

And just like that 18,000 codes become 95,000 codes. Did I mention that there are 342 ICD-10 C.M. codes related to something bad happening in a watercraft? It makes me glad I live in landlocked Indiana. So we’re all caught up on ICD-10 C.M. from a historical perspective. What else do we need to know? Let’s start with something that might shock you. Are you sitting down? Okay. Brace yourself. ICD-10 C.M. thinks it’s a book. That’s right, ladies and gentlemen, the terminology that drives medical coding and reimbursement in the U.S. thinks that it’s a book. I’ve always said that to understand a terminology, you need to know its history, purpose, and drivers. When you examine ICD-10 C.M., it’s easy to see that it has been, and still is, meant to be a book. Not the kind of book that you’d want to curl up with and read in a Naugahyde lounger or by the fireplace. It is the book you use to find codes, and perhaps kill a large spider when you are not wearing any shoes. You might be asking, “Charlie, how can you say it is a book? What evidence do you have to support this outlandish claim?” Well, I will back up my claim with irrefutable evidence, if you’ll allow me.

First, ICD-10 C.M. is organized into chapters; 22 of them, to be exact. These chapters divide the codes into groupings based on a combination of etiology, body system, and code purpose. These chapters do not have stable identifiers, but are typically correlated to numbers based on the chapter sequence; one, two, three, four… 22. Now each chapter is then divided into sections. These sections logically group the rubrics, these three-digit codes, into a code range. The sections do not have a stable identifier either, but are typically just associated with the code ranges. For example, in chapter 11, there is a section, called Diseases of the Appendix, that covers rubrics from “K35” to “K38.” Now in a true terminology, the organizational hierarchy would have stable identifiers that are unambiguously linked to the rubrics, and in this case, the chapters and sections would be classes and subclasses used to organize and navigate the rubrics they contain. Now let’s see, chapters and sections are artifacts typically found in a… What do you call it? Let’s, uh, oh yeah, a book.

Second, ICD-10 C.M. uses an alphabetic index instead of synonymy to deal with finding something, and these indexes are somewhat convoluted. ICD-10 C.M. has four alpha indexes, to be precise. The alphabetic indexes consist of the following: the Index of Diseases and Injuries, the Index of External Causes of Injury, the Table of Neoplasms, and the Table of Drugs and Chemicals. Now, the idea is that the user, a human, goes to the appropriate alpha index and finds the words they’re looking for, and that listing either directs them to the appropriate code or redirects them to another word in the alpha index.
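To make the point about chapters and sections concrete, here is a minimal Python sketch of what software ends up doing when a section is identified only by its code range. The `Section` class and the `section_for` helper are hypothetical names for this illustration; the one table entry is the Diseases of the Appendix example from the episode, and a real table would need every section filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Section:
    chapter: int  # chapter sequence number, not a stable identifier
    title: str
    first: str    # first rubric in the range, e.g. "K35"
    last: str     # last rubric in the range, e.g. "K38"


# Only this entry comes from the episode; the rest of the table is left out.
SECTIONS = [
    Section(11, "Diseases of the Appendix", "K35", "K38"),
]


def section_for(code: str) -> Optional[Section]:
    """Find the section whose rubric range contains the code's first three characters."""
    rubric = code[:3].upper()
    for section in SECTIONS:
        if section.first <= rubric <= section.last:
            return section
    return None


print(section_for("K35.80"))   # falls inside K35..K38
print(section_for("R10.829"))  # not in this one-row table, so None
```

The lookup works, but the only handle on the section is its title and range. If a later release moved a boundary, nothing in the data would say that it is still the same section, which is what a stable identifier would provide.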

Let me walk you through an example. Let’s say you were looking for the code for abdominalgia. You’d go to the Index of Diseases and Injuries and run your finger down the list until you find it, and next to it, you would see the following, “See: pain, abdominal.” Okay. You would then flip the index to the P’s and run your finger down the list of “pain, abdominal,” and you would see 16 codes in that list. You find the one you want, “pain, abdominal, rebound,” and next to it, you see the following, “See: tenderness, abdominal, rebound.” So you flip the index again to the T’s, and you find what you were looking for, “Tenderness, abdominal, rebound,” and next to it, you see eight codes and you pick the basic one, “R10.829,” and you’re done; easy peasy. In a terminology, you have a concept that is semantically unique, and it would be associated with synonyms, or hypernyms, that allow you to find what you’re looking for, regardless of what you call it, in a single hop. “Alphabetic indexes,” there’s something that you find in the back of a… Wait for it… A book.
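The difference between following "see" redirects and a single-hop synonym lookup can be sketched as follows. This is a toy under stated assumptions: the index holds only the last two entries from the walkthrough above, the code is written here as "R10.829", the third synonym is invented for the example, and `resolve` is a hypothetical helper rather than part of any real coding tool.

```python
# Toy alphabetic index: each entry either redirects ("see") or resolves to a code.
INDEX = {
    "pain, abdominal, rebound": ("see", "tenderness, abdominal, rebound"),
    "tenderness, abdominal, rebound": ("code", "R10.829"),
}


def resolve(term, index, max_hops=10):
    """Follow 'see' redirects until a code is reached; return (code, redirects followed)."""
    for hops in range(max_hops + 1):
        kind, target = index[term]
        if kind == "code":
            return target, hops
        term = target
    raise LookupError(f"gave up after {max_hops} redirects")


# Terminology-style alternative: every name points straight at the concept.
# The last entry is an invented synonym, included only for illustration.
SYNONYMS = {
    "pain, abdominal, rebound": "R10.829",
    "tenderness, abdominal, rebound": "R10.829",
    "rebound tenderness of abdomen": "R10.829",
}

print(resolve("pain, abdominal, rebound", INDEX))  # code plus one redirect
print(SYNONYMS["pain, abdominal, rebound"])        # same code, no redirects
```

The redirect walk needs a hop limit because nothing in a book-style index prevents a cycle; the synonym table has no such failure mode.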

All right, third: In ICD-10 C.M., important relationships and associations are conveyed as unstructured text. What I mean by that is in ICD-10, there are certain codes that have relationships to other codes. They might be designated as a “code also”, or “code additional,” or “code first,” or an “excludes one,” which means, “these two things should not be coded together,” or “excludes two,” which means, “this term doesn’t include these,” and then there’s something called “includes”, which is actually more synonyms. In ICD-10, these relationships are provided as wild-carded text in the chapter, section, or code header associated with the codes in the book. So if you were selecting, for example, drug or chemical induced diabetes mellitus, you’re supposed to code what drug or chemical induced it. This is expressed as a code-first relationship, and in ICD-10 this relationship is expressed as, “Codes beginning with T36 through T65, as long as the fifth or sixth character is a one, a four, or a six.” I’ve been a software developer for over 30 years, despite my youthful appearance, and there is one thing that I’ve learned: Software likes specifics, and while you can write software to interpret the instructions I just read to you, wouldn’t a specific list of terms that you were allowed to pick be better? You know, a simple value set? You know where you don’t find value sets? Ding, ding, ding, you guessed it: in a book.
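Here is a rough Python sketch of what "writing software to interpret the instructions" looks like, and how the result can be frozen into a value set. Everything in it is an assumption for illustration: `matches_rule` is a hypothetical helper, it takes the instruction exactly as read out above (which may not match the wording of the official note, so the allowed characters are a parameter), and the candidate list is a handful of sample strings shaped like ICD-10 C.M. codes that have not been checked against an official release.

```python
import re


def matches_rule(code, allowed=frozenset("146")):
    """Literal reading of the note: category T36-T65, fifth or sixth character in `allowed`."""
    chars = code.replace(".", "").upper()
    match = re.fullmatch(r"T(\d\d)\w*", chars)
    if not match or not 36 <= int(match.group(1)) <= 65:
        return False
    # Characters five and six sit at positions 4 and 5 once the dot is removed.
    return any(c in allowed for c in chars[4:6])


# Sample strings for illustration only, not a verified code list.
CANDIDATES = ["T36.0X1A", "T36.0X5A", "T65.894A", "T66.0X1A", "S00.01XA"]

# The value set is just the explicit list the rule expands to.
value_set = [code for code in CANDIDATES if matches_rule(code)]
print(value_set)
```

Each implementer has to decide what "fifth or sixth character" means for codes where both positions hold digits, and two implementers can reasonably decide differently. A published value set removes that interpretation step.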

So my last piece of evidence is that ICD-10 C.M. is actually a book, a book that you can buy on Amazon, in spiral bound or hardback. I checked on Audible and there is no audiobook edition, but if there is, I’d want it to be narrated by James Earl Jones for obvious reasons.

So originally as part of this podcast, I was going to go into the structure of ICD-10, and talk about its coding scheme and how it’s organized, but then I thought, “Charlie, no one wants to hear a two-hour podcast on ICD-10.” And I can tell, by the relief flowing back to me through your earbuds, that I was right. So instead, I’m going to give you a high-level version, and it goes something like this: within its chapters and sections, ICD-10 is organized into three-byte alphanumeric rubrics. These rubrics can have child concepts that can extend out to up to seven characters in some sections using semi-predictable patterns. The codes under the rubrics follow a repeating pattern that always reminds me of Green Eggs and Ham: not in a box, not with a fox, not in a house, not with a mouse, you get the idea. Because typically they’re generated variations on the things below the rubric, whether it’s laterality or encounter, open or closed, it really is pattern-generated; or at least it reads that way. Some of the terms in ICD-10 are broad and some get pretty specific. There are a number of concatenated terms, like diverticulitis of the large intestine with perforation and abscess without bleeding, and there are some terms that are only available as part of a concatenated term, like macular edema or hepatic coma. Many of the billing codes also require that you document if it is an initial encounter, a subsequent encounter, or sequela to a previous condition, which kind of breaks the fourth wall of clinical documentation and is a good example of something that should be post-coordinated by the software, not selected by a provider.
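The shape described above, a three-character rubric with children running out to as many as seven characters, can be pulled apart with a short Python sketch. The `split_code` helper is hypothetical, the A/D/S letters for the encounter are an assumption about the seventh character that only holds in some sections, and "W56.01XA" is used purely as a sample string.

```python
# Assumed seventh-character meanings; other sections use different letters.
ENCOUNTER = {"A": "initial encounter", "D": "subsequent encounter", "S": "sequela"}


def split_code(code):
    """Split a code into rubric, detail characters, and the encounter it implies (if any)."""
    chars = code.replace(".", "").upper()
    if not 3 <= len(chars) <= 7:
        raise ValueError(f"expected 3 to 7 characters, got {len(chars)}: {code!r}")
    seventh = chars[6] if len(chars) == 7 else None
    return {
        "rubric": chars[:3],    # the three-character category
        "detail": chars[3:6],   # characters four to six, possibly with X placeholders
        "encounter": ENCOUNTER.get(seventh),  # None when absent or not in the table
    }


print(split_code("W56.01XA"))
print(split_code("K35"))
```

Splitting the encounter back out of the code is the reverse of what the episode argues for: software could attach "initial" or "subsequent" from the visit record instead of asking a provider to pick it as part of the diagnosis.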

All this being said, these characteristics make perfect sense if you’re a nosologist trying to classify the clinical situation in a given patient encounter, which stands to reason since that’s what ICD was originally built for. Well, that’s my condensed summary, much less than two hours. Let me know if you want the longer, more scenic version. I can make that happen. Now, what have we learned? We have determined that ICD-10 C.M. is a book of codes. It is meant to statistically classify causes of mortality, and also causes of morbidity. Oh yeah, and drive proper reimbursement… And support clinical decision support… Wait, I almost forgot, quality measures and population health. And it’s heavy enough that you can kill a large spider with it before you suffer from the toxic effect of its venom, which can be found as one of the codes under “T63” if you want to look it up.

Now, the point of this podcast is not to criticize ICD-10. As someone who has worked with healthcare content development for over 20 years, I appreciate and respect how much work goes into creating, maintaining, and evolving content. I have a great deal of respect for the standards and the people who curate them as well. I’m pretty impressed with the history of ICD, all things considered, and can only imagine the benefits that it has brought medical science over the last 250 years. The real issue I have is how we use it and what we use it for. The bottom line is ICD is, first and foremost, a classification system, not a terminology. It is a rigid mono-hierarchy with concatenated terms that make it difficult to document discrete details about the patient. When you look at patient information, as we evolve into a model that is more about understanding what’s happening with the patient and less about how we get paid for it, we need to make a shift from the chicken to the egg, so to speak.

I believe we should document the patient’s clinical situation in higher resolution, closer to what a provider is thinking, and then automatically classify that information into the broader codes that drive reimbursement and statistics. I don’t think we can do this using a 250-year-old classification list. I think we need to create primary clinical documentation using something different, like SNOMED C.T., or perhaps even something else, something more modern, modular, flexible. This is something we’ll need to figure out. We are approaching an important moment. We are just two years away from the international adoption of ICD-11. Now ICD-11 is more modern than ICD-10, but it is still a classification system; a complex classification system that embraces post-coordinated modifiers and will definitely turn our world upside down. All I can suggest is, as we look forward to analytics, artificial intelligence, and machine learning ushering in a bold new era of precision medicine, we need to re-examine what we are building that foundation on. Most people believe that data quality begins with data entry, but that’s not entirely true. The most critical element of data quality is that the codes we use to build that information be fit for purpose. If they’re not, well, in the words of Inigo Montoya, “I don’t think it means what you think it means.” Thank you so much for listening to this podcast. I hope you enjoyed the story of ICD-10. I look forward to any comments, suggestions, or recriminations you’d like to share. I am Charlie Harp and this has been the Informonster Podcast. Thanks for listening. Stay well.

