The Informonster Podcast

Episode 12: Healthcare Terminology Standardization and Normalization

November 17, 2020

In this episode of the Informonster Podcast, Charlie Harp talks about the notions of standardization and normalization, what they mean in the context of healthcare terminologies, and their relative pros and cons.

Hi, I’m Charlie Harp and this is the Informonster Podcast. Now today, on the Informonster Podcast, we’re going to talk about the difference between normalization and standardization, and how those terms are used in healthcare analytics, and some of the confusion those terms create. So let’s start by saying what it is not because if you’re coming from a mathematics background, or you’re coming from a machine learning background, you’re a data scientist, you run into your healthcare terminologist on the street, and they’re talking about standardization versus normalization, the first thing that’s going to happen is you’re going to have a big cognitive dissonance between the two of you. Because in machine learning and mathematics, normalization and standardization are part of something called “feature scaling,” where you’re taking values and you are putting them into a common frame for analytics and comparison. And that’s a valid definition for standardization, where you’re looking at a value within the standard deviation, and normalization, where you’re normalizing data between zero and one so you can do certain types of comparative activities. But when we’re talking about standardization and normalization in the informatics world, with terminology in mind, we mean something a little bit different. So let me talk about that.
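
For readers coming from the data science side, the feature-scaling sense of the two words can be sketched in a few lines of Python. This is only an illustration of the mathematical meaning described above, not something from the episode; the function names and the sample values are made up for the example.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Feature-scaling 'normalization': rescale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score_standardize(values):
    """Feature-scaling 'standardization': express each value in standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical sample values, chosen only for illustration.
sample = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(sample)
standardized = z_score_standardize(sample)
```

Neither function has anything to do with codes or terminologies, which is exactly why the two communities talk past each other.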

So if you think about the ecosystem of a healthcare enterprise, you usually have “n” number of endpoints that are streaming terminology to a central location, or to you. You’re receiving data from “n” number of endpoints. They’re all using different code sets, or at least some set of different code sets, across the collection of them, and you want to do something useful with that data. And to be able to do that, to be able to build rules and graphs and analytics and other things, you need to do that based upon some normative reference terminology. So you have a standard set of terms, or a normal set of terms, that you have defined, and you were building rules against, or you’re doing analytics against. And when all these different terms come in with all these different codes, you need to align those inbound concepts with the things you understand and have based all your rules upon. And that, ladies and gentlemen, is what we mean when we talk about normalization in terminology and healthcare. So I’ve got an inbound code. It’s in a local code system from the hospital. It’s got an integer of 100 as its source code, and the word is “banana.” But when I land that data in my data warehouse, if I want to be able to do my banana analytic on that data, my code for banana is “B75.” So that means when that code comes in, I have to assign it to “B75” for it truly to be recognized as a banana, and that’s normalization.

Normalization, when it comes to healthcare terminology, is typically done with something we call a map, another thing that’s commonly misunderstood because we also use the term “mapping” when we’re talking about message and syntax. But in this case, we’re taking that code of 100 that came in from that site, and we’re saying, “When that comes in, assign it ‘B75,’ because it is in fact a banana.” So normalization allows us to take the data coming from the outside world and normalize it; go from the abnormal to the normal. And I don’t mean the term abnormal pejoratively. I just mean that “normal” is what you expect and what you’re used to, and these other codes are not that.
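
The banana example can be sketched as a simple lookup. This is a minimal illustration, not how any particular product implements it: the `InboundCode` type, the `hospital_a` and `hospital_b` system names, and the choice to key the map on source system plus local code are all assumptions made for the example. Only the local code 100 and the target code “B75” come from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class InboundCode:
    source_system: str  # hypothetical identifier for the sending site
    code: str           # the local code as received
    term: str           # the local display text

# The map: (source system, local code) -> normative reference code.
# Keying on the source system is an assumption; it reflects that the same
# local code may mean different things at different sites.
TERMINOLOGY_MAP = {
    ("hospital_a", "100"): "B75",
}

def normalize(inbound: InboundCode) -> Optional[str]:
    """Return the normative code for an inbound local code, or None if it is unmapped."""
    return TERMINOLOGY_MAP.get((inbound.source_system, inbound.code))

banana = InboundCode("hospital_a", "100", "banana")
unknown = InboundCode("hospital_b", "100", "banana")

print(normalize(banana))   # the mapped normative code
print(normalize(unknown))  # None: no map entry for this site yet
```

Returning `None` for an unmapped code, rather than guessing from the display text, keeps unmapped data visible so a person can decide what it really is.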

So that’s normalization. If normalization is taking something from elsewhere and aligning it to the normative reference concepts that you expect and you’re using, then what is standardization? Well, that’s an interesting question because it kind of depends upon whether you’re talking about normalizing, or whether you’re talking about something else, and I’ll explain what I mean. In the context of normalizing, standardizing is when you normalize something to a standard. So for example, say I’m taking codes, medication codes that are coming from all over my enterprise, and I’m aligning them to a code system that, you know, my pharmacist Stavros cultivated in the basement, and it’s a fantastic code system of medications, but they’re Stavros codes. I am normalizing the medication codes from across the enterprise to the codes created by Stavros. I am normalizing, and my rules know about Stavros’ codes, and so everything works just the way it should. But it is not standard, according to the definition we’re going to work with today. And I’ll talk about scope in a minute, but typically when we talk about standardization, if I were to take those medication codes coming from across the enterprise, and I were to normalize them to RxNorm, which is a national standard, then I have not only normalized those codes, but I have standardized the codes. You see what I did there? The reason why it’s different is because a “standard” implies that it is shareable. It is some kind of universal. I have made it a standard, and therefore I can share it with anybody that has adopted that standard.

And so this is where I’m going to diverge a little bit from some people that are, “Hardcore standardization means this,” in that I would argue that standardization is in the eye of the beholder. So I would say that if I normalize to a national standard, then I have standardized my data at a national level. But if I’m part of a large hospital enterprise and I’ve created a “Harp Hospital,” or “Harp Health System Standard,” then I’ve also standardized to that, but I’ve standardized to a “Harp Hospital” standard, not a national standard. But when most people talk about standardization, you bump into a guy in the street and you know, you mention standardization, you say, “Hey, I’ve standardized all my data.” That person is likely to think that you normalized to a national standard. And that means that you can share your data with him, as long as he understands that national standard, as long as he’s from the same place as you. Because if he’s from Bratislava, they might have a different national standard, and then you are not standardized because then you require a global standard to do that. And that’s why I kind of think standards are about scope because it depends upon where you are. If you were to leave this planet and you were to get to, I don’t know, the Horsehead Nebula, and you were to say, “I standardized to the Earth standard for medication terminology,” they would say, “Well, that’s all well and good, but when are you going to standardize to the galactic standard, like the rest of us?”

So there’s another way of thinking about standardization as well. And this is something that I’ve come across recently when talking to large organizations. Sometimes when people talk about standardization versus normalization, they’re not talking about how you normalize, whether you normalize to something that’s standard or non-standard. They actually mean something much different than that. And that is whether you’re going to normalize, which is let the data that exists from all your endpoints continue to be abnormal or, you know, local, and you’re going to standardize that data when it lands in a central place for you to do aggregation, versus when they say standardize, they’re talking about it more as a function of data governance. And what they’re saying is, instead of letting all of my 50 hospitals all have their own codes for everything, and I will normalize them so I can do things with them when I get the data here, I am going to standardize all of those 50 hospitals so they’re all using the same codes as me. Now, standardization as a data governance function is a slightly different animal because really what you’re doing is you’re establishing a standard, whether you’re adopting a national standard or creating an enterprise standard, and then you’re basically mandating that all of your endpoints adopt that standard in their data dictionaries, thereby eliminating the need to normalize. So the assumption is, instead of everybody speaking their own language, we’re all going to learn the same language and that way I don’t need to translate anything that comes to me. I’ll just understand it. Then when I make decisions or do things, it’s going to be great because we’re all speaking the same language. Now, one of the things I want to say about that is it seems very attractive because normalization is a headache. Anybody that does semantic normalization to do analytics knows that it’s a lifestyle choice. 
It’s not a project you do once and you’re done because terminologies are constantly evolving and things get added, and it’s a lot of work. And you have to dedicate resources to it, and people don’t like doing it. So it’s kind of like mowing the lawn. The lawn keeps growing. You’ve got to get out there. You may not like it. Now, I happen to like mowing my lawn, but not everybody does. But you get out there and somebody has to mow that lawn every week, or it gets out of control. Normalization is the same thing.
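
To make the lawn-mowing point concrete, here is a small sketch of the recurring chore: checking each new batch of inbound codes against the existing map and listing whatever has grown in since the last pass. The function name, the flat `known_map` dictionary, and the sample codes other than 100 and “B75” are hypothetical and exist only for illustration.

```python
def find_unmapped(inbound_codes, known_map):
    """Return the distinct inbound codes that have no map entry, in first-seen order."""
    seen = set()
    unmapped = []
    for code in inbound_codes:
        if code not in known_map and code not in seen:
            seen.add(code)
            unmapped.append(code)
    return unmapped

# Hypothetical data: one code is already mapped, two new ones showed up this week.
known_map = {"100": "B75"}
this_week = ["100", "101", "100", "102", "101"]

backlog = find_unmapped(this_week, known_map)
print(backlog)  # the codes someone still has to review and map
```

Running a check like this on a schedule is the "somebody has to mow that lawn every week" part; deciding what each new code actually means is still a human job.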

So people get lulled by this idea, by this utopia of standardization, whereby they don’t have to do that ever again. But you can think, if we’re going to stick with the lawn mowing analogy, you can almost think of standardization as saying, “Well, I don’t want to mow my lawn. So I’m going to pave it over. I’m gonna pave it over with concrete and I won’t have to mow anymore. And it’s gonna be awesome.” I think that, and it’s not that I’m against standardization, I think that the problem with standardization is that most people do it because they don’t like normalization, and also because people have committed egregious crimes against data science out in the field, where people don’t practice good data governance. And so they create horrible terms that are meaningless, or they create duplicate terms or they reuse codes with different meanings. And when that hits you in the aggregation, the analytics, it really hurts your data quality. And so it’s a combination of, “I don’t want to do the normalization,” the people at the sites are not terminologists. They’re people who have normal hospital jobs and they’re adding codes. “Everything will just be better if we standardize.” Now, I’ve seen people do this successfully or, well, they implemented it, but in my experience, it hasn’t been hugely successful. Now the exception that kind of proves the rule is when you think about drug terminologies. So as you might know, I was at FDB for 10 years, and First DataBank, Medi-Span, Multum, those compendia are an example of a standard. They’re a third-party standard, but basically when you look at drug terminologies across a lot of organizations, they’re all using First DataBank. They’re all using Medi-Span, ideally. They might be using NDC codes, but they’re using something. They’re sending those in to whoever’s using it, and usually they have RxNorm codes on them and everything else. 
As a result, medications, you know, they flow pretty well through our electronic ecosystem, but they’re not perfect. Standardization didn’t solve the problem. There is still usually some normalization required in that process. Well, why is that? If standardization is an informatics utopia, then why do we still encounter issues when we try to standardize?

The reason is the same reason that mapping is never done if you normalize. Most terminologies, most of the meaningful terminologies in healthcare, are not static. They shift constantly. Things get added, things change, things get tweaked. The problem with paving over your yard is nothing grows. And the same is true when you look at a healthcare enterprise. If you mandate from the center, “These are the terms you’re going to use,” the problem you have is the people at the edges are the ones that are actually in the trenches. They’re actually the ones that are having to add things that they got in inventory, or needed to figure things out, or the doctor is yelling at them, “You need to add this.” Now, there are some code systems, some code sets you could standardize from a data governance perspective, if they’re fairly static and they’re not huge. You know, things like gender, things like ethnicity, things like encounter type, a lot of those micro terminologies that we have, you should be able to standardize those and say, “Hey, you’re going to use these.”

The problem is when you start trying to standardize the bigger ticket items, like procedures, charge codes – and I’m not just talking about clinical concepts. I’m talking about business and administrative concepts. With those things that people are interacting with in high volume, that are highly dynamic, what ends up happening is, if you mandate that those come from the center in a standard way, the people at the edges will be stifled. And there’s a decent chance that they will rebel, and I don’t mean pitchforks and torches. They’ll say, “Well, I couldn’t find the code. So I’m not putting in a code. I’m just going to put it in the note. I’m going to hide it away where only a human can find it.” The other thing that’s going to happen is they’re going to get angry. They’re going to get angry because they need to add something, they need to say something, they need to do something, and now they’re interacting with a process to get something added that is stopping them from doing what they need to do. Whereas if they could just go to their local person and say, “Add this to the drug dictionary in the system,” they could do that and move fairly quickly. If now they have to go through some chain of command to get something approved, they’re going to struggle. And what’s going to happen is if you say you’re going to standardize, and you mandate it, and you go out into the field and you say, “This is what will happen,” what I’ve seen occur more often than not is they try to do that, and then the system starts to fray. People start to get stifled. People start to find workarounds. Some people just outright rebel and say, “I don’t care, I’m going to add it,” and then you’ve got this Frankenstein monster of a “standard,” a governance standard approach with normalization on the side, because you are trying to accommodate the people in the trenches that are ultimately the ones responsible for making sure the business is successful. 
I’m not really sure what to call this because, as I encounter this, people say standardization and some people go, “Yes, standardization is good,” because they think you’re normalizing to a standard like SNOMED, or RxNorm, or LOINC. But when you think about this idea of mandating terminology across your organization, I’m not saying you can’t pull it off, you might be able to pull it off, but you have to have an excellent governance process. You have to have systems in place so the turnaround time and the delivery of those enterprise standard concepts to where they’re being consumed is almost zero latency. Which means that a lot of times, when we engage in this kind of terminology business, we elongate the process from a central data governance perspective, and you really have to be agile or the edges will feel stifled, and they’ll start to work around and it’ll all start to unravel.

That’s just my personal experience, when it comes to this kind of thing. I don’t know that it’s a bad idea. I think, like I said earlier, there’s a lot of terminologies where smaller terminologies, like, for example, if you could standardize locations and departments and specialties, things that you have firm control over, you could do that. That business entity type metadata is actually a really good thing, if you can pull it off, then you just got to find out how to deploy that in all the disparate systems that utilize that kind of data. And that’s the other challenge with that kind of standardized governance is that, you know, when you’re dealing with multiple EHRs, the process for maintaining them, and synchronizing them, and feeding them is also a little challenging.

Go into it with your eyes open, make sure you’ve got a very good process, with a tight turnaround to get the data back out into the field, and have a contingency plan because I would almost guarantee that sooner or later you’ll have to start making exceptions. It’s just the nature of healthcare. I’ve seen a lot. If you’ve done this and you’re successful, then I want to talk to you on this podcast because you know something that other people who want to try this need to know and understand, and I’d like to learn how you pulled it off too. If you try to do this kind of approach and it didn’t work, I’d love to talk to you too. Let’s face it, I’d love to talk to anybody on this podcast. So anyways, so that is my spiel on standardization and normalization. I really appreciate you tuning in, and we’ve started to get some more folks listening. We’ve got some exciting things coming down the pike in the future where it’s not just me, but I would also say that if anybody out there has an idea for a podcast, if you’d like me to dig into something and give you Clinical Architecture’s take on it, I would be delighted to make that happen. So once again, this has been Charlie Harp and this has also been the Informonster Podcast. Thank you. Stay well and stay safe.
