CLEARER THINKING

with Spencer Greenberg
the podcast about ideas that matter

Episode 252: Evidence-Based Medicine and its discontents (with Gordon Guyatt)


March 6, 2025

How were decisions made about which treatments to trust before the advent of "evidence-based" medicine? How strong are biological arguments for or against various treatments? When did the Evidence-Based Medicine (EBM) movement begin? How did the EBM movement overcome initial resistance among medical professionals? What resources do doctors have at their disposals to find up-to-date information about treatment efficacies? Why is the pharmaceutical industry allowed such influence over information about treatments? What is the GRADE approach to EBM? What does EBM have to say about the role of patient preferences and values? How bad is being overweight? What are "surrogate" or "substitute" outcomes? How rigorous is the Diagnostic and Statistical Manual of Mental Disorders (DSM)? How often do treatments suffer from a lack of high-quality evidence of an effect versus plenty of high-quality evidence of a lack of an effect? What's the state of evidence about the placebo effect? What are the most exciting current initiatives in EBM?

Gordon Guyatt is a McMaster University Distinguished Professor. His work has focused on Evidence-Based Medicine and promoting high-quality healthcare without financial barriers for all Canadians. His Order of Canada citation acknowledges both contributions. He has been honored as a Canadian Institute of Health Research Researcher of the Year and a member of the Canadian Medical Hall of Fame. Learn more about Gordon and his work at clarityresearch.ca.


SPENCER: Gordon, welcome.

GORDON: Thank you. Glad to be here.

SPENCER: Now, Gordon, your work has been incredibly influential in the evidence-based medicine movement, also known as EBM. But I'm wondering, how were medical decisions made about which treatments to trust before evidence-based medicine?

GORDON: They did it by their own clinical experience, by referring to the clinical experience of their mentors or what people wrote in the literature about their clinical experience, and by physiologic reasoning. They had been taught physiologic reasoning as medical students and residents, and they applied it. What they didn't do is refer to the clinical literature the way we do now.

SPENCER: Now, I imagine with certain kinds of disorders that might work. Okay, let's take a wart: you could try cutting it off, and you could see, "Oh, the wart's gone," or "The wart grew back," or something like that. It's very palpable, and you can see it. But I imagine for many types of disorders, that was probably quite misleading.

GORDON: It turned out to be very misleading. We have stories of, for instance, antiarrhythmic drugs being used after myocardial infarction (heart attacks), because people were having arrhythmias that could be lethal. Eventually, fortunately, as the movement grew, we tested those drugs in randomized trials, and it turned out they killed people rather than saving them. Another prominent example is hormone replacement therapy, where clinicians were encouraging their patients, perimenopausal women, to use hormone replacement therapy in the hope of reducing myocardial infarction when, if anything, it increases it. So we have many instances where benefits that we thought were there weren't, and there were even harms.

SPENCER: Even today, if you listen to popular podcasts that talk about health and medical issues, so many people will use biological arguments. They'll say, "This drug works because it does this thing, binds to this receptor." Or they'll talk about things in the body, cholesterol and so on. I tend to think these are really weak arguments. Not that there's no role for these kinds of biological arguments, but the problem is that I think there are so many examples where we thought the biology worked one way, but it turns out it's super complicated and counterintuitive. Would you agree with me?

GORDON: You are absolutely right. We think we know what's going on, and then we test it, and we find out it's not so. One example is HDL, the good blood lipid. People with higher HDL have fewer heart attacks, fewer strokes, and less cardiovascular death. Yet if you raise the HDL with a drug, it might not work. We had niacin, which raised the HDL, but it didn't work. Let's give a bigger dose and give it for longer. Bigger dose, longer, still didn't work, and then we found out about its side effects. We then had drugs called trapibs, the CETP inhibitors. We gave the first trapib, and it increased cardiovascular deaths. Something's wrong. "Oh, we know it raises aldosterone, a hormone in the body, that can be a problem. Okay, the new trapibs don't raise aldosterone. This has got to work." Turns out, it doesn't work. Very often, just as you are pointing out, the physiologic reasoning makes sense, but things turn out to be more complicated than we knew.

SPENCER: It seems to me that the issue of confusing correlation and causation is a big factor here. Very often in medicine, we find that A is correlated with B, and that raises a hypothesis that maybe it's causal, but the problem is, often it's not causal. I think about vitamin D trials in particular. There have been so many vitamin D studies that found a correlation between vitamin D and all different kinds of health metrics, and almost universally, when we go test them in carefully done randomized controlled trials, we find that giving people vitamin D doesn't lead to the benefit. Maybe it does in a few edge cases, but it seems it almost never does.

GORDON: Take the example that I just gave you of HDL: every study that has ever been done shows that people with higher HDL have fewer heart attacks and less cardiovascular morbidity. However, you now have a number of agents that raise HDL, and in one instance, raising it killed people. Raising HDL with a drug does not have the same effect as naturally higher HDL. A really problematic example of correlation versus causation: if you compare how likely people are to die in the hospital versus out of the hospital, people die in the hospital. Clearly, hospitals are dangerous, and if we only stay away from them, we'll lengthen our lives. The obvious problem is that people in hospitals are different from those who aren't. That becomes obvious to people in this example, but in other situations, it is much less obvious.

SPENCER: That's a great example. And on that point, people have a lot of concern about danger in hospitals due to getting an infection there or other kinds of issues that can happen in hospitals. I don't know how much you've looked at this, but how dangerous is it to be in the hospital, separate from the selection effects (of course, sicker people end up in the hospital)?

GORDON: Things go wrong in hospitals, and there's no doubt people have complications in the hospital. Hopefully, when one is in the hospital, one gets taken care of and avoids those things. But there are risks of things going wrong in hospitals. However, obviously, the situation is: don't go to the hospital if you're not sick and if you're in the hospital and you're feeling better, get out of there. So I would certainly suggest that. But obviously, if you're really sick, there's a net benefit of being in hospitals.

SPENCER: Yeah, that makes a ton of sense. Now, when you started this movement of pushing for evidence-based medicine, I imagine that some people were defensive. If you're a physician and you've been treating things a certain way for 20 years, and someone's coming in and saying, "Well, we need to make sure it really works," I could see that creating a lot of defensiveness. I'm wondering, what was that initial reaction to this movement?

GORDON: Defensiveness is one word, hostility is another. So yes, we were not subtle about it, and in retrospect, that was probably a good thing. We got the attention very quickly, but we were effectively telling people, senior people, authorities, "Sorry, guys, your training didn't include a certain aspect that turns out to be quite important, and the way you've been making decisions has its limitations." Initially, people were not thrilled to get that message, and there was a lot of hostility.

SPENCER: But it seems like the evidence-based medicine movement has been very successful. How did you overcome that? I imagine a lot of people wanted to shut it down.

GORDON: There have been some sociology studies of what was going on, but as it turned out, there was incredibly quick uptake among the leadership, certainly in North America. After the landmark publication appeared in 1992 in JAMA, the world's largest-circulation medical journal, just a couple of years later I received a promotional publication from the American College of Physicians saying, "In this era of evidence-based medicine..." So within two years there was really extraordinary uptake by big swaths of the medical leadership. Although there was a lot of hostility on the ground, the leadership bought in in a big way. Very soon, we were training people in residency programs on the principles of evidence-based medicine. Before long, every textbook that wanted to be credible had to be talking about the evidence-based approach to this, that, or the other thing. So there was hostility on the ground, but among the leadership, at least in terms of nominal support for what we were doing, uptake happened very quickly. JAMA ran a big series right off the bat, our Users' Guides to the Medical Literature, which was picked up by all the residency programs, so there was enough positive endorsement going on that I never felt I had to really address the hostility. Some of my colleagues were pushing back against the critics, "Yes, no, yes, no." I didn't bother with that. I just kept doing what I was doing, and gradually the hostility diminished.

SPENCER: And what year was it when this big paper came out and adoption started to pick up?

GORDON: 1991 was the first time the term EBM appeared in the literature. I was the sole author of that paper, which appeared in a not-very-prominent place. In 1992, JAMA published the article that brought EBM to the world; that was the one that got a lot of attention. In 1993, we started the series of articles called the Users' Guides to the Medical Literature in JAMA, which basically ended up as the curriculum for evidence-based medicine in residency programs. The first series of articles ran between 1993 and 2000.

SPENCER: It's really interesting to think about how if you went to the doctor in the early 90s, you would have such a different experience than if you go now in terms of what they are actually using to make their decisions.

GORDON: Yeah, your experience might be the same. I'm not sure. I don't think we know how much physicians nowadays actually say, "Well, the evidence is such and such," versus just, "Oh, I think you should do such and such." There may not be that much difference in how they present themselves, unless patients start to ask about it. But what did change is what underlies those decisions. Evidence following EBM principles is now in all the authoritative information that clinicians rely on; it underlies every recommendation that they will find when they look for guidelines.

SPENCER: My understanding is that doctors now have available to them these up-to-date evidence databases, essentially, so they can look up what's the latest evidence on a given condition. Is that right?

GORDON: Yes. And here I am highly, highly conflicted, because the resource that is most used around the world is one called UpToDate, and I work with them. I go around the world speaking, and I ask, "How many people use UpToDate?" All around the world, at least in academic audiences, two-thirds of the people will put up their hands. It's an electronic textbook that was never anything other than electronic, has this huge uptake, and professes to use evidence-based principles behind everything it does. I've been working with them for 15 years or so, and my job with them is to help them be as evidence-based as possible. So I am very conflicted when I talk about them, but that is one resource that I, having worked with them for so long, of course endorse. I think they could still do better, but they're doing very well in terms of being evidence-based, and they are extremely popular. And of course, there are other resources: guidelines. Many leading organizations have adopted the principles of trustworthy guidelines, which are essentially evidence-based guidelines. So yes, clinicians can go to sources like UpToDate. One advantage of UpToDate over traditional guidelines is that they get new information into their recommendations within a matter of weeks.

SPENCER: Yeah, it seems in the past, doctors would be trained, but then they'd have to remember their training and it wouldn't be updated unless they went back and got retraining. So it's amazing that they can have the newest studies at their fingertips.

GORDON: The electronic world has made things happen much more quickly. We've also become much smoother at integrating new information with previous information in what we call systematic reviews. We produce these now very quickly, and the electronic world allows it to get out to people very quickly.

SPENCER: The systematic reviews essentially summarize the state of evidence on a given topic, right?

GORDON: You got it exactly. We do it systematically. We have a science of systematic reviews, where we make our questions clear, we do comprehensive searches, we assess the risk of bias, and we summarize the studies according to EBM principles. It's a whole science of systematic reviews that summarize the best evidence for any particular clinical question.

SPENCER: You mentioned that you feel these database systems still have room for improvement. What are some of the future improvements you would like to see in them?

GORDON: One is just consistency. You will find times where they slip: where the evidence has been misinterpreted, or they haven't found the most up-to-date evidence, or they were unduly influenced. Here's one thing they could do better: they still have expert input, and expert input is always necessary, but experts' conflicts of interest can be a problem. So yes, use the evidence. Understand it, make sure it's up to date, make sure it's the best. And when you then ask, "How are we going to use this evidence to produce guidelines?", make sure you have unconflicted people making those decisions.

SPENCER: You mean free of conflicts of interest essentially?

GORDON: Yeah, free of conflict of interest. The most obvious is money from the pharmaceutical industry. But at times, we call it non-financial conflict of interest, sometimes intellectual. Investigators can get very attached to their own work, and that attachment can be even more powerful than conflicts of interest associated with money.

SPENCER: Yeah, the conflicts of interest seem really important. Something I've read about is that medical companies sometimes control training for doctors, offering them trainings in which they might over-promote their own drugs, which is obviously a huge conflict of interest. They also send medical sales representatives to doctors' offices and give them free drugs, things like that. They have friendly people who befriend the doctors and so on. How big a problem do you see this as being?

GORDON: Okay. Well, let's just be clear. We have two different issues. One issue is the people producing the guidelines for the physicians, and that's where EBM comes in. We want those decisions to be evidence-based. Another thing is the effect of marketing, where you will have one message when you go to UpToDate, but a different message from the marketing, which makes no attempt at accuracy. As a matter of fact, if the marketing departments of the pharmaceutical industry were truly evidence-based, they should all be fired, because their job is to make the drug look better than it is. It's not to make an accurate presentation; it's to make the drug look better. So the efforts of the pharmaceutical industry, often successful and often using the sort of marketing strategies that you just mentioned, to make drugs look better than they actually are, are a very different issue.

SPENCER: I've heard that this can come up a lot with side effects as well. My understanding is that early on, for SSRIs, used for depression and anxiety, people viewed them as not having that high a side effect rate. But now, with better evidence, we know that there are actually really high rates of side effects; sexual side effects in particular, I think, can be as high as something like 40 or 50% on some of these drugs. It seems that drug companies may not have been fully forthcoming with the evidence.

GORDON: They're not forthcoming. You're quite right. There's a lot of stories of suppression. But the other thing is, even if they put it out in the publications, if physicians look to material from the pharmaceutical industry rather than trustworthy sources, they will find that, while it may be in the publications, the emphasis on the adverse effects is hugely underplayed, so they get a very different impression than they would if they were reading the summaries and the guidelines from unconflicted sources.

SPENCER: To what extent are their methods just kind of the normal social forces? If a friendly person comes into your office and smiles at you and gives you a free sample, you might feel inclined to try that out with your patients, versus suppression of evidence, where they're actually hiding evidence or reporting their study results in fishy ways that make it hard to interpret.

GORDON: I have never seen a pharmaceutical rep, so my personal experience there is zero. And I'm not aware of any literature that has tried to tease out the relative influence you mention; such problems may be there, but I don't know about it. What you can say for sure is that, as I suggested a minute or two ago, if they gave a truly objective depiction in their marketing materials, then their marketing departments would be doing a bad job. Their job, within the constraints that are put on what you can say, is to make the drugs look as good as possible.

SPENCER: Going back to the evidence-based medicine initiatives, I'm interested to know why people at the top of the food chain bothered to adopt it. We know that there was resistance lower down from individual doctors. I'm sure some doctors were in favor, but it sounds like there was quite a bit of resistance. So what was motivating the people higher up to say, "Yeah, we want this"?

GORDON: Well, here you are asking me to go well beyond my expertise into medical sociology or psychology or politics, which I am not at all an expert on, so I would be speculating as much as anybody else. There's one fellow who, unfortunately, had personal problems that meant he never finished his PhD about evidence-based medicine, and his theory was that at the time EBM came along, organized medicine was in something of a crisis. There was a credibility crisis, and EBM offered the opportunity for something new, exciting, and advanced; medicine was moving forward. That was a way of addressing that credibility crisis. I don't know if it's true or not, but that was this gentleman's theory.

SPENCER: Were people doubting the effectiveness of medicine at the time, to your knowledge?

GORDON: Oh gosh, thinking back on it: when this gentleman was telling me his theories, they sounded very plausible, but I had not experienced Western medicine as being in a crisis at the time. Obviously, somebody more attuned to what was going on in public perceptions might have seen that.

[promo]

SPENCER: One thing I would ask about is how evidence-based medicine works, and I know that the GRADE approach is a part of that. Can you tell us about the GRADE approach?

GORDON: Yeah, so think of the evolution. First of all, we started to do randomized trials. Then we found ways of doing randomized trials better, with things like blinding and making sure we followed patients up. So now we're doing randomized trials well. We then have a number of randomized trials addressing an issue, and we want to summarize them in systematic reviews, which becomes a science of how to summarize the evidence. As I've said: comprehensive search, clear questions, assessing risk of bias, and so on. Then we started having guidelines. If you go back 40 years, they were almost non-existent. From maybe 30 years ago, they started to get more and more prominent. Initially, the guideline movement was what we describe as GOBSAT: good old boys sitting around a table. Then we came up with rules for a science of developing guidelines. The GRADE approach has to do with the science of systematic reviews and the science of guidelines. It basically asks, "How do we know what's true? How do we know what evidence is high quality and what evidence is low quality?" And then it provides a process for going from evidence to recommendations, because one of the ironic principles of evidence-based medicine is that evidence never tells you what to do; it's evidence in the context of people's values and preferences. So GRADE is a system for saying how we do systematic reviews optimally, how we decide what is high-quality and low-quality evidence, and what the process is for going from evidence to recommendations. It does both of those things, and it has been very widely adopted, by over 110 organizations worldwide.

SPENCER: That's really interesting about values and preferences being a key part. Could you give an example of that, how that comes into medicine?

GORDON: Yeah, I'll give one example. Let's say you have atrial fibrillation, an abnormality of the heartbeat, and it turns out your risk of stroke goes up. If your risk of stroke goes up a lot, then you take anticoagulants, and there's a substantial reduction in the risk of stroke. But bleeding goes with it, and serious bleeding can kill you too; even if it doesn't kill you, it's not much fun. Big bleeding goes up substantially. So what's better? Well, if there's a big reduction in stroke, almost every patient will say, "The reduction in stroke is what matters. I'll take the anticoagulants. I'll take the risk of bleeding." But for some people with atrial fibrillation, their risk only goes up a little bit, and so when they take anticoagulants, the magnitude of effect is only a small reduction in stroke. Under those circumstances, some people will say, "I'll accept the bleeding, even though it's only a small reduction in stroke." Other patients will say, "No, thank you. Why should I take this drug when it's going to increase my bleeding and I'm only going to get a small benefit?" So that's one trade-off. I'll tell one more story. I am a specialist in internal medicine, but periodically I'm asked to talk to other audiences, like pediatricians, and I have to come up with a pediatric example. The one I often use is antibiotics for middle ear infections. As it turns out, antibiotics for middle ear infections provide a small reduction in how quickly kids get better from the pain in their ear, in a proportion of the kids, but in some kids the antibiotics cause diarrhea, GI upset, and rashes. So what's better? Pain that gets better a day or two earlier, or the risk of diarrhea and rashes? When I present to audiences, I set this up at the beginning: "Okay, what do you think about giving antibiotics?" Then I give them the evidence, and then I say, "Okay, who thinks now the kids should get antibiotics?"
Some people put up their hands. Then I say, "Who thinks they shouldn't?" and some people put up their hands, depending on how they weigh the benefits and the downsides. And some people don't put up their hands at all, and I say, "Only those of you who didn't put up your hands have actually grasped the principles of evidence-based medicine, which is: it isn't our choice, folks, it's the patient's choice, and in this case, the parents' choice for their kid." Some parents will say, "The kid's screaming; anything that'll make the pain better quickly, I want it. I'll take the risk." Others, if it isn't bothering the kid too much, will say, "Diarrhea, rash? No, thank you." So those are a couple of situations, one from more serious adult medicine, one from a very common kids' problem, that illustrate that evidence doesn't tell you what to do. It's evidence in the context of people's values and preferences, which means that clinicians need to attend to the patient's values and preferences.

SPENCER: It seems to me that this comes up a lot and is even more amplified in end-of-life care, where some people are going to say, "Yeah, give me every single thing you can to keep me alive," and other people are going to say, "You know what, if I'm suffering, I don't want to be alive. If I don't have a good prognosis, I'd rather not be here."

GORDON: You're absolutely right. That's another extremely vivid example where values and preferences differ. The evidence can be that we have, say, a three-month gain in lifespan at the cost of increased suffering. It's a very reasonable decision to say, "Okay, yes, I want those extra three months. I'll suffer a bit, but I want it," versus, "No, thanks. Not for me."

SPENCER: In my own life, I feel that I've had some doctors that seemed to handle these values and preferences really well, and others that didn't handle them well. Sometimes you go to the doctor and they just say, "Okay, take this medicine," and you don't really feel like they give you enough information to understand the trade-offs you're making. Whereas other doctors say, "Well, I could give you this, but here's the downside. If I give you this, you could have this side effect, and you have to trade that off against how much you care about the condition," and so on.

GORDON: No, you're absolutely right. From an EBM point of view, there are better doctors and not so good doctors. If you're on a fee-for-service schedule and your goal is to get the patient out of the office as quickly as possible, maybe you just tell them, "Oh, this is what you should do. Thank you very much. See you later," whereas others will be what we would consider more responsible and take the time to have what we call shared decision-making, the goal of which is to ensure that the decisions are consistent with patient values and preferences.

SPENCER: I feel that to some degree this cuts against doctors as authorities, because obviously their authority is on the medicine. But in my experience, they often communicate in a very confident way, saying, "You have this problem, take this medicine," and it seems to me they must be trained to do that, because this confident way of communicating is so ubiquitous. Maybe it's because it leads to greater compliance, but it does seem to present a problem when you really have to put things back in the patient's hands and ask, "Well, what do you actually want? What do you care about?"

GORDON: Our line in EBM is we are the experts in the evidence; the patient is the expert in their values and preferences. I think you're right about the emphasis on training people to do shared decision-making. It is something that we could be doing a lot better, and training people in that way would address, to some extent, the problems that you've correctly identified.

SPENCER: I love that phrase and the way you said it. I think it encapsulates the concept perfectly. Tell us a little bit more about GRADE. What is GRADE actually producing, and what are some of the factors that go into deciding the quality of evidence?

GORDON: So let's remember GRADE has these two aspects, quality of evidence and then going from evidence to recommendations, but you asked about the quality. To put it very simply, randomized trials start as high-quality evidence in GRADE's categorization of high, moderate, low, and very low. Non-randomized studies start as low-quality evidence.

SPENCER: Could you tell the audience why that is? What's so special about randomization?

GORDON: I'll use two examples to illustrate. I'll come back and say, "Okay, I did a study, and I find out that people die more often in the hospital, and I conclude hospitals are dangerous places. If we only stay away, we'll be much better off." Obviously, everybody thinks it's ridiculous because they say, "But the people in the hospital are much sicker." On the other hand, as soon as you move away from that extreme example, it becomes not so obvious. For example, it turns out that there are good biological reasons why antioxidant vitamins should reduce cancer and cardiovascular disease. Large observational studies, non-randomized studies, have demonstrated that people who took antioxidant vitamins had appreciably lower cardiovascular risk and appreciably lower cancer risk.

SPENCER: So that's just looking at surveys and saying, "Are you taking this vitamin or not?" and then correlating that with health.

GORDON: Or, "Is your diet high in antioxidants?" That's right. The people whose diets were high in antioxidants and who were taking vitamins had less cardiovascular disease and less cancer. It was true: the people who took the vitamins did have lower cardiovascular risk and did have lower cancer risk. Unfortunately, it had nothing to do with the antioxidant vitamins. As it turns out, when the randomized trials were done, there was no benefit at all for either cancer or cardiovascular risk from the antioxidant vitamins. It's like being in the hospital. The people in the hospital are different from the people who are not in the hospital. The people who take antioxidant vitamins are different from the people who don't. Any non-randomized study is plagued by this bias, which we call confounding, which simply means the people who are exposed to the intervention are different from the people who are not exposed. If the intervention appears beneficial, it may not be the intervention at all; it may be the differences between the people who use it and the people who don't. That is the essential reason why randomized trials start as high in our hierarchy, and non-randomized or observational studies start as low. But saying, "All randomized trials are wonderful," would be very naive. What GRADE does is identify five categories of problems. One is risk of bias: randomized trials may not be done well; people may mess up the design, and that's a problem. Second, the results may be inconsistent from study to study, and sometimes we don't know why, and that lowers our certainty. Third, the sample size of the studies may be very small, with wide confidence intervals, as we say; the truth could be very different. With a small sample size, we really do not have precise estimates, and that can be a reason for rating down. Fourth, the patients could be very different from yours. I work as a general internist, a hospital-based general internist.
I deal with a lot of people over the age of 90. I try to base my practice on randomized trials, but do they really apply to those patients over 90? Very few, if any, would have been in the randomized trials. Maybe they apply, maybe not; I'm not so sure. Finally, fifth, we worry about publication bias. So although randomized trials start out as high-certainty evidence, problems in any of those five categories may lower the certainty of the evidence. On the other hand, although observational studies start as low certainty, they can go to very low if they have the same problems, and occasionally observational studies, non-randomized studies, can yield high-certainty evidence. When does that happen? It happens when we have huge treatment effects. Think of dialysis in people who have terminal renal failure, epinephrine in anaphylactic shock, insulin in diabetic ketoacidosis, hip replacements: these are all situations where, appropriately, we never did randomized trials. Why not? Because the effects were quick enough and large enough that you didn't need randomized trials. It was obvious that these treatments were beneficial. So occasionally observational studies, when there's a very large effect, can end up as high-certainty evidence. It's a much more sophisticated system than a simple hierarchy: randomized trials start high and observational studies start low, but there is a very careful accounting of the things that can lower the certainty of randomized trials, and of the things, particularly large effects, that can raise the certainty of observational studies. That's basically how the system works in terms of deciding on the quality of the evidence.
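The confounding problem Guyatt describes can be made concrete with a small simulation. This is only a sketch with made-up numbers: a "vitamin" with zero true effect, taken far more often by health-conscious people who have lower baseline risk anyway. The observational comparison shows a spurious benefit; randomizing who takes the vitamin makes it disappear.

```python
import random

random.seed(0)

def simulate(n=100_000, randomized=False):
    """Return the relative risk (exposed vs. unexposed) for a vitamin
    with ZERO true effect, under confounding or under randomization.
    All rates are illustrative, not real data."""
    exposed_events = exposed_n = 0
    control_events = control_n = 0
    for _ in range(n):
        health_conscious = random.random() < 0.5  # the hidden confounder
        if randomized:
            takes_vitamin = random.random() < 0.5  # coin flip breaks the link
        else:
            # health-conscious people are far more likely to take vitamins
            takes_vitamin = random.random() < (0.8 if health_conscious else 0.2)
        # true disease risk depends ONLY on the confounder, not the vitamin
        risk = 0.05 if health_conscious else 0.15
        event = random.random() < risk
        if takes_vitamin:
            exposed_n += 1
            exposed_events += event
        else:
            control_n += 1
            control_events += event
    return (exposed_events / exposed_n) / (control_events / control_n)

print(simulate(randomized=False))  # well below 1: spurious "benefit"
print(simulate(randomized=True))   # close to 1: no effect once randomized
```

The only thing randomization changes is who ends up in each group, yet it is enough to remove the apparent benefit entirely, which is why GRADE starts randomized trials high and observational studies low.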

SPENCER: I like that point about effect size, that if you have a really strong effect, you don't need those same rigorous controls. To use a really extreme example, imagine someone claiming they have an invisibility potion: if they drink it, they'll turn invisible, and then they drink it and turn invisible right in front of your eyes. Your first thought is, "Okay, maybe they're doing trickery; they've got mirrors, or they're a magician." But if you could rule that out, they'd only have to turn invisible a couple of times to convince you that, "Okay, it really works." The effect being so unlikely to occur if the treatment isn't working sort of allows you to get around some of these requirements.

GORDON: I'll tell you the most extreme example: resuscitation after cardiac arrest. It used to be that when your heart stopped, you were dead, and that was the end of the matter; zero survival. Now we have a resuscitation procedure. Most people still die, mind you, but a few people get through who obviously would have been dead. There's an example where you're 100% sure that if you just left the patient alone, they're dead; you go through a resuscitation procedure, and some come back to life. People also use the example of jumping out of a plane with or without a parachute. Another example I like to give: someday, when someone gives a patient a pill after an amputation and their limb grows back, we will not need randomized trials.

SPENCER: [laughs] That's a great example. I feel that this issue of study quality is so important and so misunderstood, because there seem to be a lot of people who have come to be distrustful of science. Part of that is they see all of these studies claiming all of these things, and then later there's a study claiming the opposite. You can kind of cherry-pick to prove whatever you want using studies. The missing piece there is study quality. If you think of studies as just one kind of thing, then, yeah, it's totally baffling: why are there studies saying X and also not X at the same time? There's a wonderful graph showing that everything has apparently been proven to both cause cancer and cure cancer: all these different studies indicating that coffee causes cancer and cures cancer, bananas cause cancer and cure cancer. But once you realize, "Wait, many studies are low quality; if you don't get them out of the system, you're just going to be overwhelmed with noise and garbage," things make more sense. If you really focus on high-quality studies, then yes, occasionally studies will contradict each other, but much less often, and when they do contradict, it's often for much more explainable reasons, like the populations under study being quite different from each other.

GORDON: Everything you said is absolutely right, except perhaps the "more explainable" part. As it turns out, the failures of the non-randomized studies are easily explainable for the reasons I suggested: the people exposed to the intervention, the treatment, the exposure, are different from the people who weren't. In fact, we know that happens; we know in advance that it happens. So if we're smart, we can take that into account. If all we have is the observational studies and we don't have randomized trials, we have to rely on them. At one stage, it made sense to take antioxidant vitamins: "Oh, it looks like they may reduce cancer and heart disease." But people who understood the situation might say, "I'm not going to bother with the antioxidant vitamins until we really find out whether they work." Those people who waited would have saved themselves some money, and actually some side effects from the antioxidant vitamins, when we found out. The point is to know the limits of the evidence. I'll give you an example. Now there is a big fuss about alcohol being bad. There have been a lot of studies of alcohol for years, and now authorities have said, "Oh, stop drinking." But we do not really know the adverse effects of alcohol, because nobody has done randomized trials and followed people up to see the effects of alcohol. The inferences being made are much too strong. It's not mysterious why: people who drink more and people who drink less are different from one another. If there are differences in outcomes, which are minor and inconsistent across the studies, it may be those differences between the people, if there is any effect at all, that are responsible. We actually do know the problems with these low-quality studies. Randomized trials can be low quality too, if people mess up in implementing them, but in general, we know in advance that observational studies provide lower-quality evidence, and that's the way it is.
If it's all we've got, we should pay some attention to them, but with a healthy dose of skepticism.

SPENCER: Fair point, and on the alcohol consideration in particular, I think we probably do have a lot of evidence that really high levels of alcohol are terrible for you. Would you agree with that?

GORDON: You're absolutely right. If you're an alcoholic, it's bad news, with terrible effects on your health, no question. But for moderate drinking, I would characterize the evidence as low quality, and as for whether cutting it out completely would have good health effects, I'm quite skeptical.

SPENCER: It's such an interesting example, because it's been in the news for 20 years; the news has been running headlines about alcohol all that time. There was some observational evidence that having one drink is actually better than having zero drinks.

GORDON: That's right. So the evidence has evolved, and the evidence has moved toward a suggestion of possible harm, but hugely overplayed, from my point of view.

SPENCER: Clearly, we need to randomize people to have alcohol, right?

GORDON: Yeah, unless you could do something where people would agree to say, "Okay, I'll be teetotal for the next whatever time period," or "I will be my usual moderate drinker," we're never going to know for sure. And that's one of the things people don't like, if you say, "Sorry, we're never going to know for sure," which I think is the case here, because I don't think those randomized trials are feasible. So we have to live with uncertainty. That makes some people uncomfortable, and it's one of the things that EBM accepts: in many situations, we're never going to have better than low-quality evidence. We're never going to be sure.

SPENCER: Yeah, and that's really tough as a patient. If you have some serious problem and all the evidence is low quality, you're fishing in the dark a little bit. You just have to try things.

GORDON: What you have to do, or what the clinician will say, is, "Here's our best guess of the benefits; we're not sure. Here's our best guess of the downsides; we're not sure." Then you have decisions under uncertainty, and some people will be risk averse and some people will not. The decisions we were talking about before involve trading off thrombosis against bleeding, trading off quantity against quality of life, or trading off the duration of the child's pain against the diarrhea and the rashes. What we're talking about now adds an additional component that makes the decision more difficult, which is, "Well, we're not so sure about the bleeding or the stroke reduction. We're not so sure about the duration of the pain or how often the diarrhea or the rashes will occur." That uncertainty adds an additional challenge to the decision-making.

SPENCER: Decisions are hard enough when they trade values against each other, when you know what you're trading off, but now it's probabilistic; you're probabilistically trading off values.

GORDON: Yeah, it makes it more difficult. It makes it more difficult for the physician and for the patient, but that's what we're facing, and we definitely think it's much better to acknowledge the uncertainty than to pretend we know when we don't.

[promo]

SPENCER: Do you think that we have too many low-quality studies, and it would be better if those resources were spent on a smaller number of high-quality studies? Because obviously high-quality studies are often more expensive, sometimes way more expensive. But it seems to me that there's a certain bar below which a study is not even useful. It's so low quality. I don't know about you, but I see a lot of studies that don't meet that bar where I'm like, I would have rather the study not even be run, because the money would be much better used on higher quality stuff.

GORDON: Absolutely. The COVID pandemic was a great example. We had many crummy randomized trials, hundreds of them, I don't know, thousands of them. And then we made huge advances very quickly. We got three drugs that are known to have clear benefits in non-severe COVID and another three classes of drugs in severe COVID. We did the randomized trials very quickly and got answers, which was great, but that was a small number of large, high-quality studies, with huge waste on hundreds, literally hundreds of crummy little studies that did not help.

SPENCER: They probably made things worse, because people read all kinds of things into these. They created conspiracy theories.

GORDON: You're absolutely right. One example: ivermectin. The early studies suggested possible benefits, crummy studies. When better, larger studies were done, no benefit.

SPENCER: And unfortunately, some people, by the time the early studies were done, had already convinced themselves that ivermectin was this incredible wonder drug, and then they couldn't psychologically back out once the better studies showed it didn't work.

GORDON: People find it hard to go back. You're absolutely right.

SPENCER: It's a sad state of affairs. I don't know how much you've looked at this in particular, but obviously people have a lot of concerns about the vaccines. The effect sizes on the vaccines are pretty incredible, from my point of view, if you look at the studies on the effectiveness of the vaccines, but people had a lot of concern about negative side effects. Did you look into that issue in particular?

GORDON: Not in depth, but I'm familiar with it in the sense that the benefits of the vaccines for anybody at high risk absolutely, clearly overwhelm any possible harms that people will experience. People, given their values and preferences, may say, "Oh, I put a much higher value on avoiding the harms than on getting the benefits." But if they understood that the benefits include lowering their death rate to a substantial extent and that the harms are very, very rare, it'd be hard to think of a rational person declining a vaccine once they know that.

SPENCER: It gets a bit more complicated when you're not talking about an 80-year-old with chronic health conditions, but you're talking about an 18-year-old who's perfectly healthy. The chance that the COVID vaccine really saves their life is very, very small.

GORDON: Absolutely right. A huge issue that evidence-based medicine makes a big fuss about is: what is your baseline risk? If your baseline risk of something is very, very low, then the benefit you can get is going to be very, very small. Absolutely, that's why, appropriately, public health officials tell people over 65 it's time to get your vaccines but will not push it on younger people. Why? Because the risk of bad things happening is so much greater in older folks. So absolutely, it's a rational approach: if it's very unlikely that bad things are going to happen to you, don't bother with the testing, don't bother with the vaccination, don't bother with the screening tests. Generally, our screening tests, breast cancer screening in particular, and colon cancer screening, are oversold to people at low risk.

SPENCER: Yeah, that's a really interesting point, because false positives are such an issue. If people have a low base rate of a disease and you test for it, and you spread that testing across society, you're going to find tons of people who apparently have the disease but don't really.
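The false-positive problem Spencer raises is just base-rate arithmetic. Here is a small sketch; the prevalence, sensitivity, and specificity figures are illustrative assumptions, not numbers from the conversation:

```python
# Illustrative numbers only: a disease with 0.5% prevalence,
# screened with a test that is 95% sensitive and 95% specific.
prevalence = 0.005
sensitivity = 0.95
specificity = 0.95

population = 100_000
sick = population * prevalence                 # 500 people
healthy = population - sick                    # 99,500 people

true_positives = sick * sensitivity            # 475
false_positives = healthy * (1 - specificity)  # 4,975

# Probability that a positive result reflects real disease
# (the positive predictive value):
ppv = true_positives / (true_positives + false_positives)
print(f"{ppv:.1%}")  # 8.7%
```

Even with a quite accurate test, fewer than one in ten positives is real when the base rate is this low, which is why screening low-risk populations produces "tons of people that apparently have the disease but don't really."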

GORDON: That's absolutely right. And epidemics are being created by how we define disease. An epidemic is, in part, real, but in part it's because of changing thresholds of what we call diabetes, particularly diabetes in pregnancy. We create disease by setting thresholds that are quite arbitrary: "Okay, shall we call it diabetes when your blood sugar, your hemoglobin A1c, is at level A or at level B?" If you lower the threshold at which you're going to call it diabetes, suddenly you have a lot more diabetes.

SPENCER: That's really interesting. But if you hold the definition constant, surely it has gone up a lot with the rise in obesity, I assume?

GORDON: Yeah, type 2 diabetes, which is the one you get in adulthood, has definitely gone up as a result of lifestyle issues; we have an obesity epidemic, and it's very related to obesity. So yes, holding the definition constant, it has gone up for those reasons.

SPENCER: The obesity issue I find to be so interesting because I think there's a lot of prejudice towards obese people, which I think is extremely unfortunate. At the same time, as far as my reading of the evidence, losing weight when you're obese has really kind of dramatically positive health effects, and so there's this interesting tension where there are people who want to resist the health evidence saying, "No, it's okay to be any weight. You should love yourself at any size," and so on. But at the same time, I think, unfortunately, it is really linked to negative health outcomes.

GORDON: I will use your alcohol analogy, which I think applies here. If you're very obese, that's a huge health problem. If you're modestly obese, not so sure. So for the very obese, it's an unequivocal issue; for the modestly obese, maybe not so sure.

SPENCER: Although my understanding is that with diabetic patients, putting them on diets where they lose even a modest amount of weight seems to cause significant improvements, if I understand the evidence properly.

GORDON: You understand the evidence very well, and it causes their hemoglobin A1c, their blood glucose, to go down. Does that translate into fewer strokes, less cardiovascular disease, less blindness, or fewer kidney problems? That hasn't been shown yet. Quite possibly, but once again, we're relying on biological reasoning. And in terms of biologic reasoning, we have a whole bunch of drugs that lowered blood glucose and did nothing for stroke, heart attacks, cardiovascular death, or kidney problems. Nothing. They lowered the blood glucose really well. In the last few years, we've had a couple of drugs come along, one of which has a trivial effect on blood glucose yet lowers heart failure, cardiovascular death, and total mortality. There is a total disconnect between the extent to which your blood sugar is lowered and the prevention of what you really want to prevent, which is long-term morbidity and mortality. So again, we cannot be so sure. Yes, you're absolutely right: for some people with diabetes, losing weight will bring their blood sugar down, and there is a good chance that over the long term that will prevent the complications, but we don't know for sure. Outside of the diabetes situation, it becomes even more questionable whether, for the modestly obese, losing weight will actually have long-term health benefits.

SPENCER: Well, you touched on such an interesting issue, which is this, I don't know what the technical term for them is, but essentially, proxy variables or intermediate variables.

GORDON: We call them surrogates, surrogate outcomes, or substitute outcomes.

SPENCER: Because it would be awesome if you could, for every study, track people for 10 or 20 years and see how many of them die, and see what the effects on mortality are, or on other outcomes that we intrinsically care about, like strokes and heart attacks, et cetera. But the reality is, that's an insanely expensive thing to do, and also, because those things are really rare, you would need an absolutely massive number of participants in your study to get enough people having heart attacks or dying to measure those outcomes. So it's super appealing to say, "Well, let's just use a surrogate variable that we are pretty sure predicts heart attacks or predicts mortality, so that we don't need to track an insanely large number of people for an insanely long amount of time." But then it raises all these issues: "Okay, but how good is that surrogate? Are we sure that changing the surrogate variable really will lead to a difference in the outcomes we care about?"

GORDON: And the answer, over and over again, is no. And we have examples. Now, one thing you said there I want to point out. You said, "Well, if your risk is really low, then you would need thousands and thousands of people to find out." But as you pointed out earlier, if your risk is really low, your possible benefit is really small; maybe it isn't worth it anyway under those circumstances. When your risk gets higher, a good surrogate would be nice, but over and over again they have failed. I gave the HDL example, where we were sure people with high HDL have lower cardiovascular risk; no doubt, it was demonstrated over and over again. And I described a whole series of studies where we had drugs that raised the HDL and, over and over again, did not do anything for your cardiovascular risk. We have many, many examples of failures of surrogates. So from the EBM perspective, a demonstration of an effect on a surrogate is low-quality evidence about any impact on the patient-important outcomes.

SPENCER: What are the best surrogates that we have that we're really confident that changing the surrogate variable will lead to a difference in outcomes?

GORDON: There are very few. Viral load in HIV. Every time, different drugs, every time we substantially lower viral load, the mortality goes down. That's the only one, the only one that I feel really confident about.

SPENCER: Wow, that's a sad affair that there aren't more, because they would be so incredibly useful if we had really good surrogates, because we could then just study the effects of drugs in the surrogates and come to conclusions so much faster.

GORDON: You are absolutely right. Sadly, we don't.

SPENCER: Another issue that we touched on very briefly, but that I think is worth addressing, is the idea of absolute risk versus relative risk. It's so common that the media will report on some finding saying, "This triples your rate of such-and-such cancer," and it sounds so scary, but then you learn, "Okay, wait, there's like a one-in-a-hundred-thousand or one-in-a-million chance you get that cancer." So what does tripling the rate mean? Okay, from one in a million to three in a million. Well, that cancer is almost certainly not going to kill you either way, so tripling the rate is almost certainly irrelevant to your life.

GORDON: When it's rates that low, the only studies that could have been done to substantiate the claim in the first place would be observational studies, which are low quality to start with. So you've got low-quality studies showing trivial effects. Indeed, a fundamental principle of evidence-based medicine is that relative risks, by themselves, are fundamentally unhelpful in making decisions. Another example I'll give you: you get a 50% relative risk reduction in mortality from a drug. "Oh man, cutting your risk of dying in half. That sounds good." However, if your risk is only 1% and using the drug takes you to half a percent, the absolute risk reduction is one in 200; with lots of side effects, maybe that's not worth it. A 50% relative risk reduction going from a 40% risk of dying to a 20% risk, an absolute reduction of 20 percentage points, is a clear winner. So that 50% relative risk reduction sounds good either way, but in one instance it's trivial and in the other it's very important. You've hit again on one of the fundamental principles of evidence-based medicine: relative effects are, by themselves, basically useless unless you know the baseline risk and know that the magnitude of the harm, if we're talking about an increase, or of the benefit, if we're talking about a decrease, is substantial because of a substantial baseline risk. If it's trivial, no, thank you.
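Gordon's two scenarios can be worked through directly. This sketch uses his numbers, 1% to 0.5% and 40% to 20%; the helper function and its name are just for illustration:

```python
def risk_summary(baseline, treated):
    """Compare relative and absolute views of the same risk change."""
    rrr = (baseline - treated) / baseline  # relative risk reduction
    arr = baseline - treated               # absolute risk reduction
    nnt = 1 / arr                          # number needed to treat
    return rrr, arr, nnt

# Low baseline risk: 1% drops to 0.5%.
rrr, arr, nnt = risk_summary(0.01, 0.005)
print(f"RRR {rrr:.0%}, ARR {arr:.1%}, NNT {nnt:.0f}")  # RRR 50%, ARR 0.5%, NNT 200

# High baseline risk: 40% drops to 20%.
rrr, arr, nnt = risk_summary(0.40, 0.20)
print(f"RRR {rrr:.0%}, ARR {arr:.0%}, NNT {nnt:.0f}")  # RRR 50%, ARR 20%, NNT 5
```

The relative risk reduction is 50% in both cases, but in the first you'd treat 200 people for one to benefit, and in the second only 5, which is exactly why the relative number alone can't drive the decision.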

SPENCER: It's so easy to get the two of them confused as well. With relative risk, you're talking about a percent reduction, and with absolute risk, a percentage-point reduction.

GORDON: Well, they're both percentages. The issue is that relative risk is a change from what the risk was before, in terms of twice as much, three times as much, half as much, or a quarter as much. An analogy people may find helpful is shopping: "Okay, 20% off on a particular item." Well, if the item only costs $1, you've saved 20 cents. "20% off another item," and if that item costs $1,000, you've saved yourself $200. That's an analogy that might be helpful for people.

SPENCER: Now, shifting topics a little bit before we wrap up, I wanted to ask you about some specific areas where evidence-based medicine might play a role. One of them is the DSM, the Diagnostic and Statistical Manual of Mental Disorders. As I understand it, when the DSM was first developed, it was very much developed by committee: a bunch of professionals getting together, with people advocating for their favorite mental disorder to try to get it in there. Over time, I think they've become at least somewhat more evidence-based, but I don't know to what extent. So what are your thoughts?

GORDON: My thoughts are, I don't know much. I am not a psychiatrist, and I think it's still a committee. The one thing I would point out is that, inevitably, these things are very, very value- and preference-laden. What is a disease? There was a time when homosexuality was a disease in the DSM; fortunately, it no longer is. Alcohol use disorder is a disease; drug use disorder is a disease. If you think of it as a disease, you treat it differently than if you consider it a failure of discipline on the part of the individuals who have the problem. So aside from the evidence, there are a lot of value judgments going on in the whole notion of psychiatric illness. I think psychiatry is moving in the right direction, but it's still not great.

SPENCER: Another area where this has come up recently in the news is with Alzheimer's treatments, and in particular the amyloid hypothesis, that amyloids in the brain basically build up, and then this is sort of intrinsically linked to getting Alzheimer's. If you can clear these amyloids out, maybe people will recover. Obviously, Alzheimer's is an incredibly serious disease affecting huge numbers of people, so it's really a big deal. But a lot of this evidence has now been called into question, or some people are arguing, maybe it's just correlational; maybe if you remove the amyloids, it doesn't actually cause much improvement in outcomes.

GORDON: The reduction in amyloids is a surrogate. People with Alzheimer's have more amyloid, therefore if we get rid of the amyloid, it's going to help? Sorry, not necessarily. As far as I can see, none of the anti-Alzheimer's drugs have any sort of impressive effect.

SPENCER: It's so sad because it's such a serious condition.

GORDON: It's horrible. It is. My dad died of Alzheimer's. It's horrible, the most horrible.

SPENCER: And it almost makes you angry at the low quality of evidence in something so serious. You're like, "Come on, we need to get our act together and use high quality evidence. We're talking about people's lives."

GORDON: Well, it's not low quality evidence, actually. For most of the drugs, it's high quality evidence that they do little or nothing.

SPENCER: Okay, fair point.

GORDON: It's quite different.

SPENCER: So it's more about the science itself. Yeah, I guess the scientific process kind of failed us.

GORDON: We haven't got drugs that work. It's not that we haven't tested them; we've tested them, and we find out they're more or less useless. We just haven't come up with the medical breakthrough, something that would work. And you know the diabetes example: from the discovery of insulin way back in the '20s until very recently, we had lots of things that lower your blood sugar, but nothing that lowered your cardiovascular risk. Finally, only in the last decade, and even more recently in terms of uptake, have we got something that, if diabetics take it, actually improves their long-term outcomes.

SPENCER: What is that drug?

GORDON: There are two major ones. I can tell you the classes of the drugs, and there are multiple drugs in each: SGLT2 inhibitors are one class, and GLP-1 agonists are a second class. We now have these two classes of drugs that lower cardiovascular and renal risk. As I said, the GLP-1s have blood sugar reductions similar to other anti-diabetic drugs that have no such benefits, and the SGLT2 inhibitors have weaker glucose-lowering properties. Yet the other drugs don't have benefits for cardiovascular risk, and these two classes do.

SPENCER: The GLP-1s, isn't that the class of drugs, Ozempic and the other kind of NBCD [sic] drugs?

GORDON: The other thing about GLP-1s is that some of them, and it varies across the drugs, have these incredible weight loss properties. So again, these are discoveries. Evidence-based medicine can tell you whether it works or doesn't. It's something that happens before the evidence-based processes in the labs or whatever, where you come up with the treatments that evidence-based medicine can then tell you whether they work or not.

SPENCER: Another question I wonder about for evidence-based medicine is how our understanding of placebo has evolved.

GORDON: I don't know how much it has evolved. We know placebo effects exist, and we know that placebo effects are bigger for some things: injections have bigger placebo effects than pills, and surgery has very large placebo effects. There's been refinement, but the fundamental nature of the placebo phenomenon has been known for a long time.

SPENCER: People have questioned the efficacy of placebos. In fact, I'm going to do a whole episode about this in the future. I think the main argument is that a lot of what is called the placebo effect is not really a placebo effect. If you have a randomized controlled trial, you've got your intervention group and your placebo group, and the placebo group is receiving a sugar pill or some other placebo; but a lot of the improvement you see in the placebo group is not necessarily due to the placebo effect. There are all these other effects that could lead to an improvement in the placebo group, like regression to the mean. So people suggest that maybe the placebo effect is a lot weaker than many people realize.

GORDON: You may know something I don't. I know there's controversy and back and forth, but sometimes there are big placebo effects. I was just involved in studies of a drug for chronic cough, and there were absolutely gigantic placebo effects. Unequivocal, gigantic placebo effects.

SPENCER: But how do you know it's a placebo, rather than, say, reversion to the mean or some other effect?

GORDON: Excellent point, because in observational studies, regression to the mean occurs when people come to you at their sickest. You anticipate, given the ups and downs of the disease, that if you see them at their sickest, they're going to get better; that's regression to the mean. But we have observational studies of these people with chronic cough, and they stay stable over long periods of time. You put them in a randomized trial, and whether they got the drug or the placebo, their coughs get much better, with huge, really quite astounding improvements just from placebo effects. This can create a problem for the company making the drug. One solution is to start everyone on placebo, let them have their placebo effects, and when they stabilize at their new level, then randomize them to the placebo or the new drug. In this case, the placebo effect swamped any possible effect of the drug; it was much bigger than any effect of the drug. I'm sure there are situations where there's a minimal placebo effect, but it's clear that the magnitude varies from condition to condition.

SPENCER: Right. If you're studying something like a cold, which is naturally healing for the vast majority of people, if you give them nothing, they're of course going to feel better, regardless of whether there's a placebo effect. So you've got to be careful. But as you point out, if it's a chronic condition that we know doesn't tend to improve on its own, that makes it a stronger case for being a real placebo.

GORDON: Yeah, for sure. This chronic cough can be really miserable for people. Unfortunately, once you've had it for a while, it tends to hang around.

SPENCER: The thing I want to ask you about is the future of evidence-based medicine. What are the initiatives happening now? Where do you see the biggest room for improvement in what we're doing?

GORDON: One thing is what we have talked about: shared decision-making. You mentioned the doctors who don't really do much shared decision-making. We need to train our clinicians to do shared decision-making, and we need to make it more efficient; we need better decision aids that help the patient and the physician in the clinical encounter to efficiently understand the evidence and make appropriate shared decisions. That's one thing. Another is that during COVID, we had a big investment in randomized trials. Two things about that: number one, there was tremendous waste on small randomized trials, and we need to concentrate our resources on doing the big, well-done trials that yield results and really tell us something. The COVID academic response was very good; whether we can keep that up and extend it is a much bigger question. Another thing, within public health, and this is what you alluded to, is that we cannot pretend we know what's going on when we don't. As you pointed out, that disillusions people; they see everything changing and wonder, "What's going on here?" At the beginning, we didn't know at all how much good the masks and the other personal protective equipment did, and we're still not sure, but if we had said in the first place, "Sorry, we don't know what's going on here," that would have been the honest message; it's not the message that was given out. I think we made a big mistake taking the kids out of school, and again, if we had acknowledged the uncertainty and the downsides, maybe we wouldn't have made those decisions. So another area where we can get a lot better is in our public presentations: acknowledging the uncertainty when there's a lot of uncertainty.

SPENCER: It seems that authorities feel it's better to state something confidently, even if they're uncertain, than to express the uncertainty. I feel that was disastrous with COVID. They would tell us things with confidence, and then those things would be wrong, which completely eroded people's trust.

GORDON: I could not agree with you more, and that's the point I was making. When we are uncertain, we need to acknowledge our uncertainty. You alluded to the fact that people might not like that. People might be uncomfortable with uncertainty, but the strategy of dealing with that discomfort by pretending you know when you don't know is a mistake.

SPENCER: Gordon, thank you so much for coming on. I really love this discussion. I don't know how much you think about this, because your work is very indirect, but you've probably saved an absolutely massive number of people's lives through your work. I just want to thank you on behalf of society.

GORDON: Well, that's very kind of you. Yes, my effects are indirect, as far as that's concerned, but I'd like to think that the work has had the impact that you're suggesting. Thank you very much.

[outro]
