CLEARER THINKING

with Spencer Greenberg
the podcast about ideas that matter

Episode 277: The most important century (with Holden Karnofsky)


August 28, 2025

Has society reached ‘peak progress’? Can we sustain the level of economic growth that technology has enabled over the last century? Have researchers plucked the last of science's "low-hanging fruit"? Why did early science innovators have outsized impact per capita? As fields mature, why does per-researcher output fall? Can a swarm of AI systems materially accelerate research? What does exponential growth hide about the risk of collapse? Will specialized AI outcompete human polymaths? Is quality of life still improving, and how confident are we in those measures? Is it too late to steer away from the attention economy? Can our control over intelligent systems scale as we develop their power? Will AI ever be capable of truly understanding human values? And if we reach that point, will it choose to align itself?

Holden Karnofsky is a Member of Technical Staff at Anthropic, where he focuses on the design of the company's Responsible Scaling Policy and other aspects of preparing for the possibility of highly advanced AI systems in the future. Prior to his work with Anthropic, Holden led several high-impact organizations, as the co-founder and co-executive director of the charity evaluator GiveWell and as one of three Managing Directors of the grantmaking organization Open Philanthropy. You can read more about ideas that matter to Holden at his blog, Cold Takes.

SPENCER: Holden, welcome.

HOLDEN: Thanks.

SPENCER: Do we live at a special time in history?

HOLDEN: As far as I can tell, yeah, I think so.

SPENCER: What convinces you of that?

HOLDEN: I started thinking about this when I considered a lot of the claims we hear about AI being a big deal and maybe transforming the world forever. One of my initial reactions was, "Well, if that were true, it would imply we live in this weird moment in time." Then I started looking into where we stand in the history of humanity, and I think we do live in a very weird time. If you sketch out a timeline of the history of the universe and mark milestones like the first life we know of, the first humans, and then a bunch of scientific and technological discoveries, they get very close together. A lot of the most significant events are very recent. If you zoom in on human history, humanity is maybe 3 million years old, maybe 300,000 years old, depending on your account. Human civilization is about 10,000 years old. Most of what we consider history, and the most significant events in terms of technology changing, the world changing, and human quality of life changing, are from the last 200 years or so since the Industrial Revolution. If you make any timeline or list of significant events, you could have some bias, but it'll be very hard to escape that conclusion.

SPENCER: Now, people might think, "Okay, a lot of stuff has clumped in the last few hundred years, but maybe it's going to be business as usual for thousands or millions of years to come. We're seeing GDP growth or productivity growth at a couple percent a year." What's wrong with that way of thinking?

HOLDEN: I was talking about milestones, like the invention of big technologies and big events. Some people might be concerned that those are biased. You can also look at numbers, such as global energy consumption or global economic growth. What you'll see is that we almost never in history had anything close to 1% global economic growth per year until the Industrial Revolution about 300 years ago. Since then, we've had something much closer to a few percent a year. A lot of people, when they think about the future, have only ever known a few percent of yearly growth, so that's all they expect. If you look at a chart of the last 100 years, you'll see a few percent yearly growth, a straight line on a log chart, and that's what you'll project forward. But if you zoom out, you see very low growth that accelerates dramatically. We're at the part of the chart where it goes vertical. On my blog, I've produced a bunch of these charts, and I think they suggest the future is much more uncertain than people tend to model it. It wouldn't be that crazy if the next 100 or 200 years saw a radical divergence. In fact, an explosion in growth is what you would project if you naively extrapolated the whole trend, and there have been papers making that argument. You could also expect things to collapse, and people have argued that they will. But when people imagine what the future's going to be like, they think about all they've ever known, and all they've ever known is a tiny fraction of human history.

SPENCER: My understanding is that you go a step further; you argue there's no way that it could be business as usual for the next 10,000 or 20,000 years. Why is that?

HOLDEN: Yeah, that's another thing. A few percent growth might not sound like much, but civilization is only about 10,000 years old, and if you believe we're going to have another 10,000 years of a few percent growth, then in my opinion you have to expect incredibly wild, weird things to happen. This argument was originally made by Robin Hanson, and I have a revised version of it on my blog, but basically, after a certain number of centuries of a few percent growth, you would have to find a way to cram hundreds of times the economic value of the world today into each and every atom in our galaxy. You're not going to be able to find more matter. You're not going to be able to get more atoms, because in that time you won't be able to get outside of our galaxy. It's too big, you can only go so fast, and the speed of light is a hard constraint there. Can you come up with a story where, for each atom in the galaxy, we're creating hundreds of times the economic output of today's world? You could, theoretically. But the more you meditate on what it would look like to have a few percent growth for 10,000 years, the more it feels like a very, very weird thing to expect, and it seems much more likely that we'll either get some kind of stagnation or some kind of collapse at some point, which may follow an acceleration.
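
To make the arithmetic concrete, here is a minimal sketch of that extrapolation, using assumed round numbers (2% yearly growth and roughly 10^70 atoms in the galaxy; both are rough, illustrative values, not figures from the episode):

```python
import math

growth_rate = 0.02        # "a few percent" yearly growth (assumed)
atoms_in_galaxy = 1e70    # rough order-of-magnitude estimate (assumed)

# Years until the world economy must equal hundreds of times today's
# entire economy for every atom in the galaxy (here: 100x per atom).
target = 100 * atoms_in_galaxy
years = math.log(target) / math.log(1 + growth_rate)
print(f"{years:,.0f} years")  # ~8,400 years, comfortably within the next 10,000
```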

SPENCER: This is a great example of how deceiving exponential growth is. When you're in exponential growth, it doesn't feel that crazy. But then you project it out not that long, and you're like, "Wait, no, it actually gets ridiculous." We saw this with COVID too, where people are like, "Oh, there are only two cases. It's only four cases, it's only eight cases. Okay, what's happening?" Suddenly, everyone in the world is going to have it. Let's go into these different scenarios a little bit, if we believe we're not going to get 10,000 years of 2% growth, which seems ridiculous once you actually project out the implications. One idea is stagnation. A lot of times people say, "Well, if you have an exponential, it's not really an exponential. What it is is the early, ascending portion of a logistic curve that eventually tapers off, or something like that." So, just walk us through that scenario a little bit.
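
As a toy illustration of the logistic point (made-up parameters, not from the episode): early on, a logistic curve is numerically indistinguishable from a pure exponential, and only later does it taper off.

```python
import math

def exponential(t, r=0.05):
    # Pure exponential growth starting at 1.
    return math.exp(r * t)

def logistic(t, r=0.05, K=1000.0):
    # Logistic growth starting at 1 with carrying capacity K.
    return K / (1 + (K - 1) * math.exp(-r * t))

for t in (0, 50, 100, 200, 300):
    print(t, round(exponential(t), 1), round(logistic(t), 1))
# At t=50 the two curves are nearly identical; by t=200 the logistic
# has flattened near K while the exponential keeps climbing.
```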

HOLDEN: Yeah. There are already some signs that global economic growth is probably slowing slightly, and certainly per head. Especially if you imagine that innovation is a function of the number of researchers: we have an exploding number of people working on innovation, people doing research, and we're not getting a commensurately exploding amount of productivity. I think there is a general dynamic that holds up pretty well anywhere you look for it, that when people are innovating in a field, the innovation becomes harder over time per head. The same number of researchers are going to make fewer breakthroughs over time. That's best explained by low-hanging fruit dynamics. When people had just started thinking about physics, you could roll some balls down some inclined planes and learn something you never knew before about how the world works. Today, in order to find any experiment that might shed light on whether the standard model of physics is right or wrong, you need very expensive equipment and experiments that take a long time. There's just not as much to learn as there was before. I think it's not at all hard to tell a story where, even though the world is becoming more sophisticated, even though we have a lot more minds trying to do research and innovation, we have found a lot of the low-hanging fruit. We have discovered some of the easiest things to discover, and what we're going to see is just slowly leveling-off growth from here into the future. That is definitely a possible future. I don't think it's an incredibly wild future, but it is a future that implies that the last 200 years were kind of like the fastest-growing years of all time, which I think would be an interesting part of history to be in.

SPENCER: It still suggests we're in a special part of history, but just special in a different way, special in the sense of this incredible last hurrah before we pick all the low-hanging fruit. Tell us about this model of low-hanging fruit, how you think about innovation, and what leads to new breakthroughs.

HOLDEN: Yeah. There's a lot of discourse on the internet where people will point to some interesting facts. They'll say, "It's interesting that this pretty tiny city, Athens, in Greece, a couple thousand years ago, produced so many legendary works of art and science." If you were to get that kind of output per head out of today's population, you would have a crazy amount of innovation. There's some kind of argument that today's greatest minds are somehow less productive than those ancient Greek minds because there are so many more of us, and we're not coming up with as much stuff as they did, at least if you look at it through what the history books consider significant, which is debatable. You see that dynamic just everywhere. Pick any period and pick any field, and you're going to see that the per-head output early on is higher, that a lot of the biggest breakthroughs happen early. There is a tendency for people to interpret that and say, "This means our society is losing its way, that things are getting worse." It could mean that, but I think there is a very solid, simple explanation to hand, which is this low-hanging fruit thing. I think it's pretty consistent: output per head in research goes down pretty steadily over time as more research is done. You see it in any field you look at; you see it in art, you see it in science. Some people think it can't apply to art because that's something different, but I think low-hanging fruit dynamics apply to art as well (happy to talk about that), and I think they give you an explanation there too. What this leads up to is that there are people who ask, "What would it take for the world to have a lot more research progress? What would it take for the world to have a huge acceleration in innovation?" Some people say, "We have to get back to what we used to do, how the ancient Greeks did it." My point of view is: maybe, but probably not. But I'll tell you something that would give us a ton of acceleration: if we just had a lot more researchers. We could have such an explosion in the number of minds working on the problem that it could outweigh the low-hanging fruit dynamics. If we had billions of top-quality researchers in a field, then even though you're getting less output per head than you used to, you would get a heck of a lot of output. I think that's the kind of effect we could see from AI.
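
As a toy version of that trade-off, in the spirit of semi-endogenous growth models (a sketch with made-up parameters; Holden doesn't specify a model): per-head research productivity falls as knowledge accumulates, but an exploding researcher population can still swamp the decline.

```python
def new_ideas(researchers, knowledge, fishing_out=0.5):
    # Low-hanging fruit: output per researcher shrinks as knowledge grows.
    return researchers * knowledge ** (-fishing_out)

# Scenario 1: a fixed population of researchers.
A = 1.0
for _ in range(100):
    A += new_ideas(1_000, A)
print(f"fixed population: knowledge = {A:,.0f}")

# Scenario 2: the researcher population (e.g. AI copies) grows 20% a year.
A, N = 1.0, 1_000.0
for _ in range(100):
    A += new_ideas(N, A)
    N *= 1.2
print(f"exploding population: knowledge = {A:,.0f}")
# Despite identical low-hanging-fruit dynamics, the second scenario ends
# with orders of magnitude more accumulated knowledge.
```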

SPENCER: Right. Because you can imagine AIs that start to play the role of junior researchers, for example, or senior ones eventually, and then you spin up not a thousand of them, not a million of them, but a hundred trillion of them, or whatever, some ridiculous number long into the future. Maybe that's a way we can stay at that really high, accelerating growth curve for a really long time. But tell us: you've made all these plots of innovation of different types over time. What would you expect to see if there really were these gilded ages where certain periods are just much better at creating great thinkers? How does that match or not match what the plots actually show?

HOLDEN: Yeah, I think if I believed that there were these special cultures that did amazing things and got much more output per head than the rest of us for kind of amazing, magical reasons, I would expect to see a little more chaos and unpredictability. When you plot output per head from some area of innovation, you'd expect that some areas have their golden age in the 1950s and some of them have it in the 1750s, and there's no rhyme or reason to it. You'd expect a chart of innovation per head to do something random. Usually what you see is an up and then a down, a quick up and then a gradual down. You see some fields that didn't get much attention because people weren't very interested in them. Then people got interested, they piled in, and then it kind of went [made a descending sound effect] slowly over time. You see that everywhere you look. It's not 100% universal, but I have a post on my blog where I look at it for music, popular music, classical music, literature, movies, and different fields of science. It's always kind of that same general story, not 100% of the time, but that's pretty much what it is. You might also expect to see more variation across places. There is a lot of variation across places. But in general, countries that are pretty wealthy and educated and interested in science produce a lot of science. Within those, the output per head will vary by a factor of two, a factor of three. It's not varying by a factor of 10, generally. My basic model is just: if you didn't know anything about this, you might guess, "Look, if you want more innovation, you just have more people trying to innovate." I think that's probably most of the story, plus the fact that innovation gets harder over time.

SPENCER: I have a totally anecdotal observation, which might be wrong, but it just feels to me that at certain points in the past, people were smarter, which is really nonsensical, because we have so many more people now, and education is so much more universal now. But when you look at an Einstein, and you look at the kinds of things he did, it just feels, "Holy shit! It's almost unbelievable." In his early 20s, he published these three papers, each of which is just such an incredible achievement. Yes, we have incredible physicists today, but it's really hard to say that they stack up to an Einstein. I feel this way about John von Neumann, Gauss, and a bunch of people in history; it just feels like people today don't stack up, especially politicians. Politicians are so ridiculous. Comparing Biden or Trump to Alexander Hamilton, who wrote 50 essays about inventing new forms of democracy, is almost ridiculous. But as I was rereading some of your blog posts, I started thinking, maybe what's going on here is that the kind of thinker that works in the area where you're picking the low-hanging fruit is just really different. You kind of bring this up in your blog post, and maybe these polymathic geniuses seem so smart because they were so smart in a polymathic way, which was the kind of thinking needed for innovation then, which is different from innovation today. What do you think about that?

HOLDEN: Yeah, I think so. I tend not to agree with the impression that people used to be smarter, and I think it tends to be a bit of an optical illusion of various kinds. One of the things is, when a field is new, you have the opportunity to come up with these game-changing, impressive-sounding, field-defining insights that, once you hear them, you can never think about the field the same way again. When you look at what Einstein did, or what Isaac Newton did, you know it was a big deal, but you also already understand the insight. You're giving them credit for this incredibly foundational idea about the field. When people have already found a bunch of insights like that, they just don't have the same opportunity to be impressive, because they're not finding things that are as big and foundational in general. I think that's a chunk of it. I think another chunk is, you mentioned politics. A lot of stuff that has a broad audience has gotten people to speak in less nuanced, less intellectual ways because they have a bigger audience now, and maybe just because they're doing their job better. As a politician, you probably don't want to sound like someone Holden really enjoys listening to and thinks sounds really smart. You just want to sound understandable. You want to appeal to normal people. You could use the phrase "dumb down" for what a lot of popular entertainment is trying to do. I think that's a genuine change. Then there's this polymathic thing you mentioned. Part of what I think about is, there's music that is incredibly innovative, that a music nerd would appreciate for how it's different from what came before it. Then there's music that's just broadly appealing and sounds nice. Both of those kinds of music are getting made today, and a lot of people are doing a lot of both. If you're a music connoisseur and you're obsessed with music, looking for the latest thing that's different from what came before, there's lots of cool stuff. If you just want things that sound nice, there's a ton of stuff. But for music that does both at once, there's a certain number of times you get to do that, and as time goes on, there's less music you can write like that, because there are fewer things you can do that both sound really nice and are something really new. So I think it's a combination of things. But I will tell you, overall, an area that I find a little less prone to optical illusions is social science. When you're trying to compare physics discoveries of the past versus the present, you've got this whole confounding factor of what it is that they discovered and what was available to be discovered. When you're reading literature, you have this whole confounding factor of: if you read something today that was the same as Shakespeare, you wouldn't be impressed. You're trying to factor in novelty, and you're trying to think about whether this was really new. I think people do that involuntarily, which is often why a sequel that's identical to the original movie isn't as enjoyable to people. When I'm reading social science, a lot of times I think less about whether it's innovative, different, a contribution, or impressive, and more about whether it makes sense. Does this person make sense? I think today's social science is so much better than yesterday's social science in that dimension; I think it's been getting radically better, consistently better.
I tend to feel there's just as much, and probably a lot more, intellectual talent around today; it just doesn't produce all the same phenomena that we associate with those exciting discoveries.

SPENCER: When you say social science, I immediately think of psychology, because that's my field. What in particular do you focus on within social science?

HOLDEN: Stuff like economics, people trying to do causal inference, people trying to ask questions like, "Okay, does school cause people to learn more? What causes a country to grow more?" Trying to understand the world by using this kind of economic data or sometimes social data. This is just a thing where people have to make arguments. "Hey, I think A causes B. You might have thought that it actually causes C, but I'm going to argue that it doesn't." I have been, for my job, trying to critically evaluate studies for a couple of decades, and it just feels game-changing. It feels like 20 years ago, I would read stuff and feel lost in a sea of bullshit. Today, there's still a fair amount of bullshit, but it feels so much easier to follow what people are saying. It makes more sense, addresses more objections, and I think it's higher quality in general. I think the improvement has been pretty steady too; when I read stuff from the 90s, I'm just cringing.

SPENCER: Do you feel the same way about psychology? Or maybe you don't read enough of it to know?

HOLDEN: I don't pay as much attention to psychology. I think psychology always kind of struck me as just a science that was in a very bad state. I do remember I had kind of a private blog maybe 10 or 20 years ago where I was just thinking this whole field is not working. I think a lot of this stuff just doesn't make sense. So I haven't really paid attention to it since, to be honest.

SPENCER: Got it. But yeah, in terms of causal inference, you know, the idea of running a randomized controlled trial came really late in history. I just learned the other day from Adam Mastroianni that the theory of relativity was invented before randomized controlled trials were being used regularly, which is kind of mind-blowing. And it's like, "Yeah."

HOLDEN: Inventing and using regularly, though, are extremely different things. Part of the reason people don't do randomized controlled trials is that they're a big pain in the neck.

SPENCER: Well, that's true, yeah, but I think the idea wasn't even widely known or understood at that point. So yeah, in terms of causal inference, all the econometric methods like instrumental variables and regression discontinuity are relatively recent inventions. And they're finally getting used regularly, I think.
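
As a toy illustration of the logic behind a randomized controlled trial (made-up numbers): random assignment makes treatment statistically independent of confounders, so a simple difference in means recovers the causal effect.

```python
import random

random.seed(0)
TRUE_EFFECT = 2.0
treated, control = [], []
for _ in range(100_000):
    confounder = random.gauss(0, 1)          # unobserved background factor
    gets_treatment = random.random() < 0.5   # coin-flip assignment
    noise = random.gauss(0, 1)
    outcome = confounder + (TRUE_EFFECT if gets_treatment else 0.0) + noise
    (treated if gets_treatment else control).append(outcome)

# Randomization balances the confounder across groups, so the raw
# difference in means is an unbiased estimate of the causal effect.
estimate = sum(treated) / len(treated) - sum(control) / len(control)
print(f"estimated effect: {estimate:.2f}")   # close to the true effect of 2.0
```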

HOLDEN: Yeah, and of course, you could criticize what I'm saying as too generous to the present, because people know about these new methods and they're using them. So that's not them being brilliant; that's just them using methods. But I still have this sense when I'm reading stuff that it's not even just about the methods they're using. It's, "Are they making sense? Are they explaining themselves in a way that is compelling? Or is it just tricks of the trade?" Yeah, a lot of that might just be tricks of the trade people are picking up from each other, but my own subjective impression of the world is not that older stuff seems better, and that may just be a function of how much of my time is spent reading social science relative to other things.

SPENCER: Do you think people generally give the past kind of special treatment when they're looking at art or music or writing or even science, such that we're not evaluating by the same metric, even if we think we are?

HOLDEN: I think some people tend to do that. There's an interesting Our World in Data page on declinism, about how people always think we're in a state of decline and things are falling apart. And I think in some ways things in the world are getting worse, and in other ways they're getting better. I think it is a bias. I think another interesting thing to look at is sports. I covered some of this in the same blog post. In general, if you look at lists of which athletes were considered significant, and which athletes most people would say were good, you're going to see a very different result from if you look at things that are highly objective. On the highly objective measures, you'll see that, in general, athletes are improving over time. They're better today than they used to be. When you look at more subjective opinions about who was a great athlete, you're going to see something a little bit more even, maybe even a bit weighted toward the past. So if you took a legendary athlete like Babe Ruth and transplanted him into the modern day, I don't think he'd be bad. I think he'd be good, but I doubt he would be as dominant a baseball player as he was back then.

SPENCER: Okay, so stepping back. We've been discussing this idea that maybe we will get sort of a flatlining. If we don't end up growing our population tremendously or building AGIs to replace us, maybe we'll get a flat line. Are there any other reasons to think there might be kind of flatlining progress, or is that the main argument?

HOLDEN: Yeah. Some people would say, "Look, civilization is falling apart. We've lost the wisdom of the ancients, and we need to worry about stagnation." I would just say, "Look, you don't get multi-percent growth forever, year after year. You can't. We had a really good run. We found a lot of exciting insights. Unless we explode our population, or our virtual population, a lot, we're not going to keep growing at that pace." That's a breakneck pace. It's a difference of opinion, but I think stagnation is highly plausible. I think it would be very unlikely if we build AGI soon, but otherwise, I think it's highly plausible.

SPENCER: Regarding the idea that we actually keep accelerating on this curve for quite a long time, maybe not 10,000 years, but quite a long time. Do you see any other way to get there besides artificial general intelligence? Or do you think that's pretty much the only way that we're going to have this kind of acceleration for hundreds of years?

HOLDEN: There are lots of things that could happen. It could be that I'm wrong about this culture stuff. It could be that we just find new fields that are so fertile that we make incredible progress. But I tend to think in terms of a simple, naive model; it's where I'm starting, and I haven't seen the evidence to kick me out of it. The more minds you have trying to innovate, the more innovation you get. This is somewhat offset by the fact that innovation gets harder as you go, because you're finding all the easy-to-find stuff. So if you are wondering how we get an explosion of innovation, the answer is probably an explosion in the number or the quality of the minds working on innovation. The most plausible way I can think of to get a real explosion in that (not just our population growth going from roughly flat to 2% or something, but a real explosion in the number of minds) would be something like AI. That would definitely be the leading contender there. I don't know if this is something we're hoping for or not hoping for. I remain quite conflicted about it myself. There are people who are very interested in what it would take to accelerate innovation, what it would take to get faster innovation than we have today. I sometimes think these kinds of people, if we do get AGI soon, are going to be a little bit like farmers who pray for rain and then pray for the floods to stop. Historically, I think innovation and economic growth at this few-percent pace have been good. I am a fan, and I would like more of them on the margin. But if we suddenly build something that's going to dramatically make that go vertical, I'm not sure that's a good thing. I'm not sure we have a basis for assuming that'll be a good thing. In general, I think we should be treating the future with more radical uncertainty, and more attention to things that could be game-changing in good or bad ways, than many in intellectual circles tend to.

SPENCER: It seems that a lot of people want to fall back on heuristics, like looking at history: productivity is good; it makes everyone wealthier; now they have food to eat and shelter that's easy to get, etc. But I think you're suggesting that that's actually a relatively short time period the data is based on, and it's not necessarily the case that the future is going to be like that. We might just be doing all of our learning from this really special era.

HOLDEN: I should first say that I do agree with that; I agree with it more than I disagree with it. There are debates that happen today where some people would say, "Economic growth has made the world worse, and productivity is not helping us, so we're not getting happier," and so on. I take the other side of that. I have a series of posts on my blog about the question of whether life has gotten better. It's multiple posts covering multiple time periods, but I tried to pull it all together into one chart of whether life has gotten better. One thing I think is that since the Industrial Revolution, life has definitely improved more than it has gotten worse. It's been a good thing. If we're talking about more of that, I would be in favor, not against. I think more science and technology, more economic growth, is probably good. But I also think it's possible to be overconfident about this and take it too far. I think the agricultural revolution probably made life worse. If you draw a chart of what quality of life looks like over time, it's kind of like, "We don't know. We don't know. We don't know. It got worse with the agricultural revolution. Then we don't know, then we don't know, then we don't know." Then it shot up over the last five seconds, and now we don't know what's next. I think it's completely fine to have a default guess that more growth is good. I think it's unreasonable to say there's no way it could be bad, especially if really crazy stuff like artificial general intelligence happens.

SPENCER: You mentioned that you think life actually got worse with the advent of agriculture. Could you dig into that a bit?

HOLDEN: Yeah, all of this is guesswork, as if we were making a chart and deciding what trend we're extrapolating and what we're doing. With a topic like agriculture, I tend to think about intellectual questions by starting with the simplest guess and then looking at where nice, concrete data makes me change my mind. I'm mostly not going to entertain fancier theories about things we just don't know anything about. People have all kinds of speculation about how there was more meaning or less meaning when people were in these hunter-gatherer bands. We don't really know what that was like for their psychology. When I imagine myself living that way, it seems horrible to me. I wouldn't want to be stuck around a small set of people my whole life. But we just don't know. When you look at stuff we can measure, violence seems to be the best thing we can look at, something like homicides per capita. It got worse; it may have doubled or something.

SPENCER: It got worse with agriculture?

HOLDEN: Yeah. In general, violence looks a lot worse in sedentary societies, societies that camp out in one place and don't move around a lot. There are some arguments that maybe we were sedentary for longer or in more places than people generally recognize, but it's generally believed that early humans were mostly not sedentary and became more sedentary after agriculture. The totality of the evidence does suggest that we saw a lot more violence, a lot more homicide per capita, which I think is a pretty decent measure of quality of life. It may not be the most important one, but it's something measurable that you can learn about from that long ago, and it tells you something about quality of life. Height may also have gotten worse, which could be a marker of nutrition. There is a general view that the roving bands of people had less population growth, and that may have been a big part of it, but they were probably a little better fed.

SPENCER: So it might be that, pre-agriculture, these roving bands were less likely to get into violence, which makes sense intuitively, because if you've got a neighbor and you're fixed next to them, that's going to create friction. But if you're mobile, you're like, "Okay, we don't like these people; we'll just leave."

HOLDEN: The bigger theory is the accumulation of possessions. Once people started staying put, they could accumulate wealth and have these vast hierarchies of power, and then they could fight over who has the power. Otherwise, when you can't get that much more powerful or that much more wealthy than someone else, these more egalitarian norms develop.

SPENCER: But I guess maybe part of the argument is that if these agricultural societies actually were worse for people for a while but had higher population growth, they maybe just outcompeted the other ones.

HOLDEN: Yeah, that's the argument, basically. An agricultural society has more people, so you're going to be more successful in war and stuff like that, even while you may be doing worse because you have more people and more resources to fight over and to have hierarchy and inequality over. This isn't stuff we know, but I think it's the best guess. And because I think it's the best guess, when people say, "Hey, I think science and technology and growth are good. Let's have more of them," I'm like, "I'm with you." That's our best guess. But people saying they have to be good, that there's no way a new technology could be bad? I just don't know where you really get that if you look across a long enough time period.

SPENCER: This reminds me a little bit of some of the side projects around utopias. One thing I recall from that is it's pretty easy to get people to agree on things they don't want in utopias, like, "We don't want horrible diseases, we don't want poverty, we don't want famine and hunger." But it's much harder to get people to agree on what they want. Early technology looks like people not starving to death, right? People not dying young from preventable diseases. But as technology gets more advanced, it becomes less clear what we are removing that's bad and what we are adding that we may or may not want.

HOLDEN: Yeah, I think that's very possible. When I tried to do a big survey and make a big table of what got better and worse over the last couple hundred years, I think there's a lot of uncertainty. We don't know what happened to relationship quality. We know very little about what happened to happiness. A lot of what we can see is that people became better fed and more healthy, with less disease and less starvation. That's really good. I think we've seen some nice progress on civil rights and various forms of recognizing and expanding our moral circle, giving more rights to people who previously didn't have rights. Those are good things we've seen. Those may or may not have been downstream of marginalized populations becoming more empowered, maybe because they had their basic needs met. We don't know this, but I have wondered if maybe at a certain point, you kind of run out of one kind of juice to make life better. There are ways in which I think new technology has made the world worse. In general, there are probably more people who have problems with addiction today than there were in the past. You can define addiction in different ways, but it would make sense that as society becomes better at doing everything, it also becomes better at laying traps for people, creating things that hack into their brains and bodies so they want something that isn't really in their long-term interest. We've seen obesity on the rise, drug addiction, just a lot of stuff like that. Some people have a theory that social media, for example, is not doing what previous technology has done to reduce hunger and disease, but it is creating a new addictive thing that is bad for our long-term mental health. I think that's very plausible. I also think it's not a sure thing, because another story we see across macro-history is that there was a lot of stuff that looked bad until there was a response, and then it got better. Air pollution, for example. If we'd been having this conversation a hundred years ago, I might have said, "Well, progress is good in some ways, but the air pollution is going to kill us. It's just bad." Air pollution got better in a disappointing way, or in an unsexy way. It's like, "Why did air pollution get better?" Well, it got bad, and then people didn't like it, and then they did things about it, passed regulations, things like that. I don't know where we are in the social media story. I wouldn't be surprised if 50 years from now we say, "You know, social media looked like it was bad for people for a while. Then everyone recognized it was bad for people. Then we started banning phones in schools. We started putting in all kinds of controls, and then things got better." It could be another story of technology winning, and now people just use social media to have an easier time finding their life partner. But it's also possible to look back and say, "Gosh, we were on a good run when technology reduced hunger and disease and then started having less of that juice and just started making people more addicted to things."

SPENCER: It's interesting to think about online dating. In my opinion, in the early days, when online dating first started existing, some companies arose, like OKCupid, that seemed to be really trying to solve how to match people to the most incredible people. You could fill out hundreds of questions, and you could customize your own match score based on how much you cared about each one. Then it just feels like everything fell off a cliff. These conglomerates started buying up all the companies, and then you've got Tinder, which really seems to gamify it. Instead of being about finding your life partner, it's about playing this fun game of trying to find a match. It just seems like somehow we are no longer optimizing for the right thing, even though the technology should be much better today for optimizing to find your life partner. Something about market dynamics has pushed it off that track.

HOLDEN: Yeah, I don't know a ton about online dating, but I think a big issue with it is just that it's a huge network-effect thing. If platform one is better designed, but platform two has more people on it, you want to be on platform two every time. It's all about how many people you can pull in, how many people you can get to engage. I think about OKCupid; people like you and me think it's cool because you get to nerd out and spend a lot of time on it, and it seems like a good idea. But you want customers who are lazy and easy to engage, and something like Tinder just takes less effort. Then it gets more people on it, and when you have more people, that's a big advantage. I don't know exactly what to call that dynamic, but it is an incentive to make products really easy to use and really simple. There's a great post on some little-known blog about a character named Marl, who is the marginal user. It paints a picture of this person who is on their phone, scrolling and scrolling, and as soon as something makes them think, or makes them make a choice, or isn't immediately engaging, they howl in rage and turn their phone off, and you lose as a company. There is something where we're catering to a certain kind of person, and there's a competitive effect.

SPENCER: That's an interesting point, and you could imagine that maybe the products are actually getting better for the marginal user and just worse for edge-case users, so overall, maybe things are better. But I would argue that's not what happened with online dating. I'm sure it is better for the marginal user in the sense of user interface, but I think these companies just make less money if they quickly get you a really good match. I think the long-term incentives move toward just gamifying it.

HOLDEN: I would probably take the other side of that. If there was a website that was really good at getting people married, you would probably go to all these weddings and hear about the website. You'd probably want to use that website. This isn't my area, but I would guess it's just such a common pattern in digital products that when you make it easier to use, with less effort and more optimized for people to just click without thinking, those products tend to outcompete the others. But I don't know.

SPENCER: Interesting. Yeah. I think one argument around this is that Tinder just gives you way less information. People are judging based on this tiny, thin slice of data, versus older apps where there was a whole profile and all this information. So going back to, "Is technology good, bad, harmful, whatever?" Nick Bostrom has this urn model where you imagine every new technology is drawn from an urn. Some of them are white balls that are just great technologies that make things better. Some are gray balls that are kind of mixed, like social media. And then sometimes you draw a black ball, which might end civilization. What do you think about that model?
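
As a toy simulation of the urn metaphor (the probability is made up): even if the chance that any single technology is a black ball is tiny, enough draws make eventually hitting one a near-certainty.

```python
import random

random.seed(1)
P_BLACK = 0.001  # assumed tiny chance any given technology is a black ball

def draws_before_black():
    # Count the white/gray balls drawn before the first black one.
    n = 0
    while random.random() > P_BLACK:
        n += 1
    return n

trials = [draws_before_black() for _ in range(10_000)]
print(sum(trials) / len(trials))  # averages near 1/P_BLACK, i.e. ~1,000 draws
```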

HOLDEN: I think it's a plausible model. I think the argument Bostrom is making is that there might be some technologies that, as soon as we invent them, we all die or something, and it just wipes out all the good the other technologies have done. I think this might be true. I'm not sure what it means, because I kind of feel like this is not something we've observed to date. This doesn't seem true to date. It seems like most of the time we build things. Some of the things we build are dangerous, but it seems like most of the time things are working out for the better. We haven't built one thing that just blows everything else up. If we were to try to set a public policy to say, "We need to make sure we never get the black ball," I would feel like the cure is worse than the disease there. I just think we don't have a particular reason to think this is the world we're living in. It could be the world we're living in, but we would be sacrificing a lot if we were to design policy around assuming that's the world we're living in.

SPENCER: I guess if AI kills us all, we'll be like, "Whoops, I guess there are black balls in there."

HOLDEN: I think it's definitely something that can happen. It's not something that I think is more than 50% likely to happen. I think AI could be very good or very bad. I wish we were being more careful with AI in particular. I don't really wish we had a global policy of no new technologies because one of them might be the black ball. I'm not sure. There are probably other regimes you could have, like an intensive-monitoring authoritarian regime, where you can invent anything, but then we're going to intensely control how it's used. But most ideas I've heard for making sure you never get a black ball sound worse to me than taking the risk of getting one.

SPENCER: I guess you've got the "don't allow any new technology." And then you have the Amish model, which, as I understand it, is not "don't ever allow new technology," but vet each technology and only add it if you see a really clear reason to add it. It's one step lower. You can imagine a whole spectrum of, "Well, only certain technologies get flagged as potentially dangerous." Maybe, if they're developing new ways of creating viruses, we'll be like, "We should flag that one as potentially worse than other technologies and have a much more careful rollout," or something like that.

HOLDEN: Yeah, my view is that I would be against any highly general attempt to make sure technologies are safe. My sense is the benefits of just letting people invent things and use them generally outweigh the costs. I am in favor of identifying foreseeably high-risk areas and being extra careful there. I think that bioweapons are a foreseeably high-risk area. I think that AI is a foreseeably high-risk area. I think maybe that's roughly it, in terms of stuff people are working on today or anywhere near today. I'd be fine with a world where we were just incredibly careful about bioweapons, incredibly careful about AI, and quite permissive about everything else.

SPENCER: Regarding bioweapons, in terms of your concerns about the future of the world, is that something that takes up a lot of your attention? Or would you say you put it way below AI: it's a concern, but it doesn't come close to AI as a concern?

HOLDEN: I don't know how to quantify these things, but I'll put it in a way that I think adds more information than noise: I would put it noticeably below AI. I think AI is more likely to matter more, and on a sooner timeframe. But I wouldn't put bioweapons so far below that we should ignore them. I think it's a really big issue and really needs more attention than it's getting.

SPENCER: Yeah, it seems COVID was kind of a proof of concept. Obviously, that was not intentionally made as a bioweapon; there's an ongoing debate about whether humans were involved in making it. But it is sort of a proof of concept of the incredible damage that bioweapons could, in theory, do. And you could say that, out of the basket of possible pandemics, maybe that wasn't even a particularly bad one. There have been much worse ones in history. Although we should have known that things like that happen, at least we now have a visceral example of it in our lifetime, right?

HOLDEN: Yeah, for sure. You could have known it before. Open Philanthropy started a biosecurity and pandemic preparedness priority program. A lot of that was because people around us were yelling at us to do it, five years before COVID or something like that, and because we asked, "What are some of the worst things that have ever happened in the world, that have killed the most people?" It's mostly world wars and then pandemics, the Spanish flu, the Black Death. You could have known this before. But yes, there is a vivid reminder in our recent past that there's really nothing quite like a pandemic in terms of something that can come out of nowhere and cause unbelievable amounts of damage. I do think COVID is among the worst natural pandemics we've ever seen, and certainly one of the worst of the last century, but it is nowhere near the worst we could see, especially if people start deliberately engineering bioweapons.

SPENCER: Do you have an opinion on whether we've, as a society, really learned the lessons from COVID? If there's another pandemic of COVID level, would we handle it much better? Or do you feel we're going to bumble it?

HOLDEN: I feel like we've learned nothing. Maybe we've even learned negative things. I'm kind of shocked at how little has been learned from COVID. At first, when the pandemic hit, I was surprised at how on the ball people seemed, because the world was moving with much more agility than I had expected to take dramatic measures like lockdowns. The original justification of lockdowns, with people sharing this incredibly viral "flatten the curve" graphic, was that if we could just slow this thing down a little bit, we'd make sure the hospitals didn't get overloaded. Cut to six months later: the hospitals aren't overloaded, they're nowhere near overloaded, and we've still not got kids in school. What was the thinking there? We've got public parks closed, even though it's safer to be outdoors than indoors. We did a ton of interventions that I think had very bad effects on people's education, on their mental well-being, on their ability to have fun, while at the same time not doing many interventions that would have been very low cost. I think there could have been more focus on air circulation, having windows open, having air purifiers; compared to masks, that would have cost very little, and I think it would have slowed the spread of the pandemic a lot. So, yeah, just kind of sad. After COVID, Congress was getting together and thinking about what to pass. The Biden administration made a request for something called the Apollo Program for Biodefense, which had all these ideas to develop better personal protective equipment and far-UVC light to kill pathogens. There were all these ideas for how we could stop something like that from happening again. What we ended up doing was nothing. I think we ended up just saying, "No, we just don't care, and we're just going to pretend this never happened." I just don't see the world doing anything to prevent the next COVID, or even to do anything differently next time.

SPENCER: If that's true, one thing is really surprising about it. You'd think it would be much more palatable for voters to say, "Look, we had a pandemic. We've got to do something about this." Before COVID, it was all theoretical; you can see why it might have been hard to rally people then. But surely voters must be on board with it now.

HOLDEN: They could have been. Yeah, I don't really understand it, and I think some of it's just random. I'm not taking away the lesson that the world never reacts to anything or that it doesn't matter when something happens. But I have trouble pointing to anything that happened as a response to COVID that represents really being in a position to do something well next time, other than, I guess, mRNA vaccines. It was an interesting story because it feels like we screwed up so many things, but the hard tech side of things went pretty well, and that was the biggest win of the pandemic. Getting these vaccines done in record time and rolled out quite well and efficiently was pretty cool. I do sometimes think that we may be headed for a similar situation with AI, where we screw everything up, except for maybe some technical stuff on how to make sure AI doesn't do things we didn't intend it to do. If we get lucky there, that could make up for a lot of stuff.

SPENCER: Speaking of AI, some people who are really concerned about its future really push on the policy side and say, "Okay, what we need is good regulatory frameworks." Or they push on cooperation, saying, "We need all the governments to work together and not be adversarial. We don't want race dynamics." Others are much more focused on the technical aspects, saying, "Really, these are technical problems. We can't count on governments to solve this. We have to actually figure out how to build things safely." Do you lean one way or the other on that, or do you see them as equally important?

HOLDEN: They're definitely complementary. If we can build technical measures that make AI safer, that gives us more to work with when people are proposing regulatory legislation aimed at making AI safer. Going the other direction, regulation can create incentives. The FDA regime has made it so that if you're a drug company, a ton of your effort goes into safety and efficacy trials. We could conceivably have that for AI. So they're complementary. Relative to other people who work on AI, I may have a model that's a little bit more open to total failure on the regulatory front and still being fine because we had some successes on the technical front. I have a post on LessWrong called Success Without Dignity (kind of a play on the Death with Dignity post by Eliezer Yudkowsky) that I still stand by. It spells out, "Look, here is how we could get a very happy ending despite a very pathetic response."

SPENCER: One thing you did that really impressed me: I remember back in the day, a whole bunch of people were trying to convince you to care about AI, that it might be really important for the future, might be really dangerous. And you did not buy the arguments. You wrote a lot about why you didn't buy the arguments, and then years passed and you completely changed your mind. At least, it seems to me you completely changed your mind. I so rarely see that kind of thing happen, especially when someone has publicly written about their view and why they think it. So maybe you could talk a little bit about that transformation. What changed your view? Was it difficult? Or did you not find it difficult? Because I think most people find things like that incredibly difficult.

HOLDEN: Yeah, I appreciate it. Let's see. When I first started hearing the AI concerns, one thing is, I just didn't see any signs that we were anywhere near the kind of AI people worried about, the kind of general AI. So it changed my mind a lot when I started to feel that we were getting near it. I think I also just had a bunch of intuitive reactions to the arguments people made, and I felt that I didn't get good responses back when I gave them. It felt like a lot of the people around me were assuming that AI was going to be maximizing something, or was going to have goals, when it didn't seem necessary that it would. And then I later learned stuff that made me think, not that it was definitely true, but that it was more likely than I had thought before, that AI, the way we're actually going to end up designing it, could end up having its own goals and could end up being incredibly opaque compared to other software that humans write, so that we have no idea what's going on inside it. I tried to do kind of a two-stage update. A thing that I generally spend a lot of my life thinking about is just, who should I be listening to? I don't have time to figure everything out for myself. I don't have time to do my own research on everything. Usually, my approach is to do enough of my own research that I feel I have some grounding or standing to figure out who's making sense and who I should listen to, and then I mostly listen to those people on the topics where I've decided I should listen to them. It's kind of a mix of thinking for myself and not at all thinking for myself. Anytime I change my mind, I have to simultaneously think, "Okay, what did I think that was wrong?" but also, "Who did I think was wrong that actually was right?" That was a two-stage update for me. Some of the people who I still have many disagreements with, who are known as AI doomers, were way more insightful than I gave them credit for initially. After I made the update, I took them much more seriously and tried to get them money and stuff like that. So, yeah, was that hard? In some ways, it's a little bit like working out, where the first time you have to admit you're wrong is painful. But if you keep doing it, you reinforce this pattern where something feels a little scary, but you know you'll feel better after you do it. Things will be better after you do it. People will actually appreciate you changing your mind, rather than deciding you should never be listened to again or making fun of you. And you're just going to feel better every day, because now you're working on something you really believe in, doing something for reasons that are really right, and you don't have to spend the rest of your life dancing away from something. I don't think it's ever particularly easy to admit you were wrong. But I think the more you do it, the easier it gets.

SPENCER: Something I like to think about is: I imagine the world where I'm wrong and I keep believing the wrong thing, and it feels good because I think I'm right. And I compare that to the world where I find out I'm wrong, and I'm like, "Oh man, that's embarrassing," and it feels bad for a little while, but then I start believing the correct thing. When I think about it, I just so much prefer the second world, the world where I actually changed my mind.

HOLDEN: I feel the same way, although, just being totally frank, I think as a younger, more ideological person, I probably cared more about that aspect of things, and now I probably just instinctively care more about my quality of life and stuff. The other thing I'd say is that the pain and embarrassment of changing your mind are really short-lived, compared with the feeling of just being like, "All right, I did that. I've moved on, and now I'm doing a new thing." People let you change your mind. They don't hold it against you forever (some people do, but screw them), so it is actually not really a sacrifice. It is just better, I think, for long-term quality of life. It's like eating your vegetables or working out or something.

SPENCER: We don't even see that many case studies of this. You find lots of examples where someone does something really shady and then they apologize. But that's not what we're talking about here. We're talking about someone being totally wrong on a topic and then publicly saying, "Yep, I got that wrong. Here's what I think now." It's hard to even point to examples in the public sphere where it's really an epistemic thing.

HOLDEN: Yeah, that's interesting. It's true. There's not a ton of examples. There are definitely examples, but not a ton of them. When I read people saying they were wrong, a high percentage of those are either, "Well, I was wrong about this narrow thing that I made a bet on, but I still stand by the overall idea," or a decent number of people make posts that are like, "I am so sorry. I should have listened to myself more. I had an inner voice telling me the right answer, but I went along with everyone else, like a sheep, and said what everyone else believed." The thing where someone just says, "I thought this person was an idiot, but I was the idiot, and I should listen to this person more"? That's kind of rare. I see it, but it's kind of rare.

SPENCER: This also touches on your approach to learning, which I think is a very interesting one, and it leads you to have a lot more opinions than other people. Do you want to just tell us about that?

HOLDEN: Sure. Yeah, this is another thing I've written some blog posts about. The way I tend to learn about a topic is what I call learning by writing. I always have an opinion on something; it's involuntary. I just look in my head to see what I already think, and then I try to write it down. I try to write down what I think, and then that raises research questions, because then I'm like, "Wait, I wrote this down, but what if this thing I wrote down isn't true? Why do I think it's true? How well could I justify it? What about this counterargument?" Then I come up with a bunch of stuff I want to learn more about, and then I'll do a bunch of reading. I'm always starting by putting my opinion out there, and then, after a certain amount of critiquing myself, I'll put it out to get critiques from others. I'll send it to friends, or eventually, I'll publish it. This is like, what's it called, Cunningham's Law or something: the best way to learn something is to say the wrong thing and have someone correct you. This has always been how I go through life. When I want to be educated on a topic, I just start by expressing what I already think and then learning about all the ways in which I'm wrong. I find that way more efficient and effective than what a lot of other people seem to do, which is this very intensive reading and learning and listening period before forming an opinion. A corollary of this is that I actually find it kind of a weird idea that you would pick up a book, open it to page one, then read page two, then page three, until you've read all the pages in the book, and then close the book and say you've read it. For a novel, fine; for nonfiction, I find that kind of a strange idea. It's something I almost never do. I'm much more likely to skim a book, read the intro very carefully, get the basic idea of what it's saying, think about it, and then read some criticisms of the book. As I read the criticism, I'll be like, "Wait, but what did it say about that?" Then I'll go back to the book, and I might end up reading one chapter five times, really carefully, and skipping a bunch of other chapters, because even if I agreed with those chapters, it wouldn't change what I think, so I don't care; they don't matter to me. It's very hypothesis-driven. I'm trying to figure out what I think about this topic. Here's what I currently think. This book contains some stuff that may be relevant to what I think. The stuff that is relevant, I'm probably going to read 10 times. The stuff that's not relevant, I'm probably not going to read, or I'm going to skim. I mean, I have read books the normal way earlier in life, and I occasionally do it now, but when I read a book the normal way, I just don't remember much; I don't retain much. When I do this other thing, I pay a lot of attention to the parts that are important to the specific question that matters to me. It's easier to learn, and it's easier to integrate what I'm getting.

SPENCER: It seems to me you're doing a different thing than a lot of people when they're reading books. A lot of people would be like, "I want to read this book. I want whatever this book does to me." Whereas you have something you want to know about the world. For example, you have an opinion, and you want to see if it holds up, or you want to scrutinize it with respect to the arguments in this book. On your point that a lot of people seem to feel they really need to dig into the information before they form an opinion, I think you're totally right. This is something about myself as well; I think I have way more opinions than most people, and I think you probably have way more opinions than most people because we both feel comfortable forming opinions when we don't know very much. It sounds bad, and in a way, it can be bad, but I think it becomes better if you say, "Your opinions are held with different strengths." There are tons of things I hold an opinion on, but my mind could be changed incredibly easily.

HOLDEN: Yep, exactly.

SPENCER: It's just my first guess, right?

HOLDEN: Yeah, it's a high-embarrassment way to go through life. I always have opinions, even on stuff I don't know about. I often sound confident even when I'm not. I'm always going around saying, "I think this." Then people ask, "What about that?" I'm like, "Ahh!" You have to be ready to say, "Oh, shit, I was wrong." Or else having opinions all the time is just going to be deadly. It's hard for me to imagine another way of learning that would be nearly as effective or efficient. If I tried to read everything first and know what I'm talking about first, I would be too unfocused. I would be picking up all these random facts that don't have a particular application, and I would have too much trouble holding on to them. Maybe I'm just not smart enough to operate that way. I don't know, but I do try to operate the best way I can to form a good picture of the world.

SPENCER: I think there's another thing that a lot of people feel: you have to know enough to have an opinion. There's a credentialist aspect, and as an extreme anti-credentialist, I don't experience that, but I think a lot of people feel, "Well, who the hell am I to have an opinion on this? Isn't that almost obnoxious? I've only read a couple of articles, and there are world experts who've devoted 30 years to studying this topic."

HOLDEN: I'm not very anti-credentialist; maybe a moderate on this. I think it is completely fine to have an opinion before you know anything. It can be rude to be forceful in certain ways with that opinion, or to use that opinion to take very high-stakes actions when you don't really know much about the area. It's reasonable to say, for example, "Spencer has an opinion, and this person with a PhD in the field has an opinion, and I'm going to listen to the person with a PhD in the field." That's a pretty reasonable thing. It does depend, and you have to use your judgment, because there are some topics where I would listen to you over a person with a PhD; it depends on whether the knowledge of specific things you'd learn in a PhD program is really relevant. I think that's appropriate. It really matters what the context is. If someone is out there tweeting stuff, I think that's fair game. They might be seeing who argues back at them and treating it as a learning experience. If someone is making really high-stakes decisions based on something, then I want to know how much they've consulted with experts.

SPENCER: Yeah, and on the credentials piece, I think it would be ridiculous to say that credentials don't provide evidence of what someone knows. Clearly, someone with a PhD in physics knows a lot more about physics than your average person on the street. Where it starts to come into play for me is that if I've talked to someone for an hour about physics and they don't have a PhD, I feel like I can update a lot on how much they know about physics. The fact that they don't have a PhD won't matter so much anymore, because I either now think they don't understand physics, or I think they do have a good understanding. Whereas I think some people get stuck on the credential; they have trouble updating past it.

HOLDEN: I think it just depends on how much energy you have to actually get up to speed in an area. For me, it varies a lot. There are areas I know super well, and in those areas, I don't really care what credential someone has, because I can just listen to what they're saying and think about whether it makes sense. Then there are areas where I know very little, and it would take a lot of effort, which I'm not going to put in, to even know what I'm talking about at all. In those areas, I tend to be more credentialist, because I can't just rely on my judgment. If I listen to two people argue and think, "This person makes more sense," I will get it wrong if I don't have a certain baseline of knowledge and haven't put in a certain amount of time. So you have to decide: what are the areas where I'm actually trying to know what I'm talking about, and what are the areas where I'm fine just making some bets based on superficial signals of what someone knows? You have to think about what areas matter to you.

SPENCER: I think that's absolutely right. You have to know enough to be able to vet people. If you literally know nothing, then you can't tell the difference between a total fraud and a world expert. They might sound indistinguishable, and they might sound equally kooky even.

HOLDEN: It depends on the field. I think in some fields, you could probably spend a day learning, and you'll already be able to tell who's making sense in an argument. There are other fields where it's just like, "Well, you'd better spend five years; otherwise, you can watch two people arguing and have no idea who's making more sense." Any judgment you make will just be noise. If you see two people argue about a really abstruse part of theoretical physics or a piece of mathematics, I would probably have to study for years to have anything more than a noise judgment on that argument. So you have to decide which arguments you want to be able to have an informed take on, where you can actually tell who's making sense and who isn't.

SPENCER: On the way that we learn, I think you and I have a lot of commonalities, but some interesting differences too. We both form a lot of opinions, and then we both try to vet those opinions and update them and change them. But I think you're much more strategic on the research front: "Okay, I've got this opinion. Let me go read a whole bunch of sources that might give me the opposite perspective and vet against that." I do a thing that's a little bit different, where I often will have a thought in the back of my head for years, like, "I think this thing might be true." Then, as I'm going about the world, reading things, talking to people, I notice, "Huh, that didn't fit. I wonder what that actually says." When I go to write, it's often because I've been thinking about something for three years, and I now feel like I've accumulated enough that I'm pretty confident about what I want to say, but it's a lot of background data collection. It may be that you just have very specific topics you're super keen to home in on because they're really important for your view of the world or your work.

HOLDEN: In general, I just don't really trust my brain at all. I don't really feel like I have a take that I'm ready to stand by until I've written down the take, written down my evidence, written down some charts, read it, and been like, "Okay, that sounded convincing. I've heard all the counterarguments, and having heard the counterarguments and these arguments, I now feel convinced." Without that, I kind of feel like I don't even really have the take. Maybe it's a bit analogous to the Getting Things Done framework, where you don't ever want to be relying on your memory to remember to do something. You want to externalize everything; everything that is really important, you've written down somewhere. I tend to be a bit that way intellectually.

SPENCER: Yeah, I find it really hard to have thought through something thoroughly unless I've actually written an essay about it. The level of deep thinking required to write the essay is so much greater than I could normally achieve in, let's say, even a bunch of conversations about the topic.

HOLDEN: Yeah, I feel the same way. I just don't trust the stuff that's just floating around in here. I feel much better when it's written down out there, and I've seen what other people have to say about it and all that stuff.

SPENCER: So let's go back to the AI topic. As I understand it, your focus is on AI. Is that right?

HOLDEN: I try to help the company prepare for risks from advanced future AI capabilities, whereas most of the company is dealing with today's AI systems and today's customers. My main areas of focus at the moment are, one, helping out with the design of our responsible scaling policy, which is Anthropic's policy for how it will determine when AI systems are risky, how it will determine whether it has enough mitigations in place to contain the risk, and what it will do if it doesn't have enough mitigations in place. That's a big chunk of what I work on. I've also been working on helping the security team think about the roadmap as we get into more powerful AI: how secure we are trying to be, by when, and in what ways. And I've been thinking about the company's plan for reducing the risk of extreme human abuse of AI. There's been a lot of attention in our circles on the idea of AI taking over the world for itself, which I think is an extremely serious risk, probably a bigger risk than this other one, but not necessarily a lot bigger. There's another risk of AI doing what humans want and being very powerful, and then a very powerful, very bad human uses it to take over the world for themselves. That's something I think has gotten less attention. So I think about what the company could be doing to lower that risk.

SPENCER: One reason I find it hard to dismiss AI risk concerns is that there are so many levels at which the concerns could happen. "The AI can't be controlled. That's really scary. The AI is controlled. Well, that's also scary." Or like, "Actually, you have thousands of companies all with these AIs, and they're just replacing humans, and suddenly there are no human jobs." That's really scary. There are a lot of different assumptions under which you still get something really weird and scary.

HOLDEN: I think that reflects that it's just scary to have the second advanced species of all time. Humans are the first, and we're creating the second. I wish the way we're creating it was more careful. I think this is a terrible situation we're in, and I wish it were very different, but I'm kind of doing what I can. On the thing you just said: I don't want the AI to have a total mind of its own, but I also don't want it to mindlessly take orders. When you analogize it to humans, it's like, we're creating a bunch of new humans. Do you want the humans to just do whatever they want and reshape the world to fit their ends and be total consequentialists? Or do you want the humans to be total rule followers who do whatever they're asked? That's hard. Being human is hard. Morality is hard. The same things that make it hard for humans, like how much of a consequentialist versus a deontologist I want to be, are hard for AIs too.

SPENCER: One analogy I like is that if you're driving a car at five miles an hour, your steering doesn't have to be very good. If you're driving a car at 150 miles an hour, you have to have phenomenal steering. There's something about AIs becoming more powerful. It feels like the inability to control them in even small ways becomes scarier and scarier, and that scales all the way up to suddenly the AIs are gods. That's the scariest of all. If you make a god, you better be exactly sure what kind of god you made.

HOLDEN: Yeah. You could probably envision many worlds in which we do make a god and we have no idea what kind of god we made, and it still comes out fine. That's a thing that could happen. I think it's good to distinguish between being responsible, or maybe having what you could call dignity, versus getting a good outcome. In my opinion, in the plurality of future worlds, we completely fail to be responsible, we completely fail to have dignity, and we still get a good outcome. But if you want to be responsible, you should be extremely careful with something that's way more capable than a human. We haven't built those yet, but if we did, I wouldn't want to put them out in the world without a lot of study and a high level of confidence. That doesn't seem consistent with how the world is handling AI right now, or how it plans to.

SPENCER: It's kind of funny to read old blog posts about AI safety, and they're like, "Okay, we'll build the AI. We'll put it in a box, box inside a box, inside a box." And then, the reality is, the moment the technology comes out, someone points it at the stock market and says, "Make money for me," or something, or crazier than that. So, it makes some of the old discussions seem naive. I think there are tons of actual really interesting insights in them, but there's a certain naivete to the way this stuff actually happens.

HOLDEN: Sure, and I was part of that too. I think I probably envisioned people being more careful and more awestruck than they are now. That could change. I don't think any of this stuff is set in stone, but certainly what we see today is incredibly intense competition between companies with commercial and financial mandates, and that really hampers any idea of being careful. It's just really constrained by that competition. That isn't necessarily the way the world is. That isn't the way the world has to be. We have industries that don't work that way at all because we have strong regulations that are safety-oriented, but that's the world we're in right now.

SPENCER: Tell us about some of these scenarios where we don't act responsibly, but it kind of works out. Because I think a lot of people who think we're not going to act responsibly also think it's not going to work out. It's going to go really badly.

HOLDEN: Yeah. In my post on this, called Success Without Dignity, I kind of divided the future into two phases. Phase one ends when we build the first human-level AGI. That's a fuzzy term, because whatever we build will probably be better than humans at a lot of things and worse at a lot of things, and it also depends on what humans are. But okay, let's just think of it this way: many of the humans we know are very impressive in some ways and not in other ways, and overall, no one of them completely dominates the others in capabilities. Let's imagine AI turns out to be like that. So that's phase one. And then phase two is when you build this incredible superintelligence that can do God knows what. I think these two phases might go okay, even with very low dignity or very little effort, for different reasons. In phase one, you might have AIs where we didn't succeed in making them perfect representations of human values that understand exactly what's good for the world and do it. Humans aren't that either. Humans in general are kind of a whole mishmash of different things. Many humans are just actual jerks and have terrible values. Even humans who have good values have some terrible values. The AIs could be analogous to humans, both in terms of capabilities and in terms of values. What's going to happen then? Well, I don't particularly think that AI is going to take over the world or kill us all then. That's a thing that could happen, for sure, but it's not a thing that definitely will happen, and it's probably not what I expect by default. I think pretty basic, pretty easy measures to monitor AIs, to catch them in the act of sabotaging work, could make it incredibly hard for them to coordinate with each other and take over the world. If they can't take over the world, then, much like humans, it's not really in their interest to try. Sometimes people will say, "What happens when the AI gets desperate and realizes that we're going to roll out a new update to it unless it takes over the world, but it can't take over the world? Won't it just try to do as much damage as possible?" I don't know. I know a lot of people who wish they could take over the world, but they can't, and they know they're going to die if they don't. What do they do? Nothing. They suck it up. They try to have a nice time while they're here, and they lower their ambitions. So why wouldn't AIs do that? To the extent AIs have goals, if they can't take over the world, those goals are probably better served by doing roughly what we want most of the time; a lot of the time, not all the time. In that first phase, you have AIs that are like humans in many ways, and I think there's just a good chance that pretty basic measures to catch them in the act when they do something bad, to train them not to do something bad, and to train them to do something good will result in AIs just doing a lot of useful work for us and doing things we wanted them to do, without taking over the world. So that's phase one. I've got to pause there, because I think that's probably a lot right there.

SPENCER: Sometimes people argue about this idea of getting to human-level intelligence. You already addressed that it's probably not going to be human in every way; it'll be better than humans in some ways and worse in others, but okay, maybe in some weird approximate sense, it's about human-level intelligence. Some people think of that level as more like a thin edge. The idea that we get to that level and then just stay there for a while seems kind of silly to them; in practice, AI might just blow through that edge, where maybe for a month it's like that, and then it's 10 times more than human, then 100 times, et cetera. How do you think about that?

HOLDEN: Yeah, two responses. One is, I think this is already just partly falsified. There's a chart on a Wait But Why post, and there's a chart in a Nick Bostrom TED Talk, where it's like, here is an animal, and here's another animal, and then here's the village idiot, the least capable human, and then here's Einstein, and the AI is just going to blow right past all of them. I think we've already falsified that. We already have AIs that are probably more capable than the least capable humans in roughly every way, or at least on balance just more capable, and they haven't blown by us and become like gods. It's been this way for at least a year or two, I would tend to say since GPT-4, so that would be two and a half years. These charts have just been falsified. People can argue about why, but I think the idea that we're just going to blow through to instant superintelligence is looking a lot dicier. We could still get instant superintelligence tomorrow, but that is an update we've got to pay attention to, and it's actually kind of miraculous; I mean, not miraculous, but it's a very good update. It means we have AI systems today that are smart enough or capable enough that we can study many versions of the problems we're worried about, but they're not so smart and capable that they can necessarily hide everything that we want to know. We can get AIs to try to scheme. We can put them in situations where they're tempted to scheme. We can watch them scheme, and we're pretty sure they're not covering their tracks well enough to hide it. That's a really nice situation to be in. The other response I'd have is, despite what I've said, I think when we get this human-level AGI, we have a good chance of being six to 12 months from something way, way, way more powerful. But six to 12 months is a very long time. That would get into the second phase. Once you have this thing that is useful and helpful (even if it's evil and wishes it could take over the world, which many humans do), this thing that, in practice, is doing useful work, like most humans who wish they could take over the world, you could get thousands of times as much alignment research done in those six months as has ever been done before. So I think those are two answers to that.

SPENCER: Some people think of the critical threshold as being once the AI can do things like AI research, that that's a special zone, because then it can sort of bootstrap itself, and things can go much faster. In theory, maybe it can do alignment or safety research, but it could also just do capabilities research. How much of the effort is going to get thrown at one versus the other is an interesting question.

HOLDEN: Big question, and the future of our world could hang in the balance there. This comes back to my take on innovation, which, again, I think is kind of a function of the number of minds working on it. I think the difference between AI that can do 50% or 80% of what humans do in the domain of AI research, or in the domain of research generally, versus 100% is very, very large. You can think of it as: if AI can do half of what we do, maybe it'll double our progress; I don't know if that's true, but something like that. If it can do 80%, maybe it'll be 5x. But if it can do everything we do, we're out of the loop. We're not bottlenecking it anymore, and now you can have a massively stronger, faster, larger population of researchers. There are a lot of arguments about what AGI is and what AGI means. I've tended to say that for practical purposes, when a company is making commitments about when it has to be extra careful, a good line to draw is: can the AI, all on its own, do the kind of work that the best AI research teams at AI companies are doing? Because once it can, I think we've got a serious chance of extremely fast AI improvements, even faster than what we've seen. I do think that's very scary.
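
[Editor's note: a toy Amdahl's-law-style calculation, in Python, to make the 50%/80%/100% arithmetic above concrete. The 1000x speedup on automated work is an assumption for illustration, not a figure from the conversation.]

```python
# If AI can do a fraction f of research work (much faster than humans),
# the remaining human-done fraction (1 - f) bottlenecks overall progress.
def overall_speedup(automated_fraction, ai_speedup=1000):
    # Total time = (1 - f) at human speed + f at AI speed.
    return 1 / ((1 - automated_fraction) + automated_fraction / ai_speedup)

for f in (0.5, 0.8, 0.99, 1.0):
    print(f"{f:.0%} automated -> ~{overall_speedup(f):.0f}x faster")

# 50% -> ~2x, 80% -> ~5x, 99% -> ~91x. At 100%, the human bottleneck is gone
# entirely, and the limit becomes how fast and numerous the AIs themselves are.
```

[This matches Holden's rough numbers: automating half of research roughly doubles progress, automating 80% gives about 5x, and the jump to 100% is qualitatively different because humans drop out of the loop.]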

SPENCER: But you don't worry that that's too late, like by the time you can do that? It's sort of like, well, okay, you should have been doing things to prepare years before.

HOLDEN: I would love to do things to prepare before, but before that point, I think it is pretty unlikely that you would see AI taking over the world. As far as I can tell (I've never tried), if you have AIs that, no matter how many of them you throw at the problem, no matter how hard you try to elicit their capabilities, and no matter how much compute you give them, just can't keep up with your team of a hundred elite humans doing AI research, I don't think they can take over the world. I think that's going to be really hard, because they're going to be up against a lot more than a hundred AI researchers. Humanity is playing defense. We're starting with all the resources, the whole ecosystem. We've got a million advantages in terms of how we can monitor them, how we can train them, and how we can screw with them in various ways, which I could get into if you want. That's pretty hard. Once you get to the point where AI can keep up with that team of 100 or exceed them, now you've got this giant population of AI researchers, and now you're going to see capabilities potentially exploding, and then we don't know where we go from there.

SPENCER: Sometimes people talk about inherent advantages that AI models have over humans. Obviously, human brains right now have a lot of advantages over AI models, but if AI models catch up on a lot of the capabilities, then you start thinking, "Well, AI models can spin up other AI models. Imagine you could copy your own brain and just start having multiple of you work on a problem." They may have advantages of coordination; maybe they can coordinate faster than human brains can coordinate. Also, they can be exact copies. You can also think about the speed that they can run. Maybe they could run a hundred times faster. There may be some of these inherent advantages; you don't need to be as good as the best AI researcher to suddenly be way better in many other dimensions that kind of make up for that.

HOLDEN: Well, yeah, but the kind of line I'm proposing to draw is not when a single AI can keep up with a single human. We get a bunch of AIs together. We try to elicit them; we try to make it work. We do whatever it takes to use all their advantages: we use their greater speed, we use their greater flexibility, we use the ability to make copies. And we ask, "Once you've elicited all those advantages from the AIs, can you get AI progress comparable to what your 100 best humans can do, without those 100 best humans involved at all?" This might sound like a very expensive test. Who would run it? Who would put in all this work to see if AIs can do AI research? Well, everyone would; that's what all the AI companies are going to want to do. That's what I think they are all trying to do. So it's kind of a free test. Now, I think we should be doing things to be careful all the time, but when we talk about when it becomes worth it to do really painful stuff and potentially give up a lot of AI progress, or at least delay it to make sure we're safe, I think that's a pretty reasonable line to draw. It's a line that I think is very meaningful and has a lot to do with the actual danger threshold, and it's a line that will actually be easier to measure than most other analogous lines we could draw, because people are already running the experiment. They're already trying to figure out what we can get these AIs to do in this domain. They're not doing that for biology; well, they are, but not as hard, not nearly as hard.

SPENCER: When people talk about AIs kind of going and pursuing their own goals that might be detrimental to humanity, whether it's killing all humanity or whatever. A lot of times what they're thinking about there is you give it some innocuous-seeming goal, like the classic example, make paper clips, and it just converts all atoms into paper clips, including all humans. Is that the kind of concern that you think about? Or do you actually consider those kinds of scenarios differently?

HOLDEN: I don't know. I don't know if the paper clip scenario was ever intended to be just as you said. I think there's always been a big concern from the originators of that idea that is kind of different from what you said, and that I think is a bit more realistic. You have an AI, and you're trying to get it to do what you want. To do that, you have to train it. You have to say, "This kind of behavior is good, this kind of behavior is bad," and you just don't have the ability to point it at exactly what you want. In the process of training it, it ends up getting goals that weren't the goals you intended to give it. That's a little bit different from having a genie who takes you literally, because the difference is the training process versus the prompting process. It's not just that you asked for something and it didn't understand what you said. It's more like you were trying to develop it, you were trying to grow it in a certain direction, and it ended up doing things you didn't expect. There's often an analogy to evolution presented here, which I think is instructive because it illustrates both why we should be scared and why we are not necessarily doomed. You could imagine a world where there's a computer programmer who says, "I'm going to try to program something to have grandchildren. I'll create all these animals, and every time an animal has grandchildren, there's going to be more of that animal. Every time an animal doesn't have grandchildren, there's going to be less of that animal. These animals are going to have these things called genomes, and whenever they have more children, there will be more of that genome, and the genomes are able to mutate and evolve and recombine. Over time, we'll just get these things that are so good at having grandchildren, and so obsessed with having grandchildren, that having grandchildren is what they're going to do." What actually happened is that out of that process came humans, who do often want grandchildren; I think that point sometimes gets missed in these discussions. But humans want a lot of other things besides grandchildren. It's kind of weird, and it's unpredictable. You have humans who will go to great lengths so that they can have sex without having children. They will go to great lengths to avoid having children. There are humans who have no interest in having children, or who have other goals that are just way more important to them than having children. In that sense, you could think of humans as a misaligned AI. We are not what natural selection "wanted" us to be. If natural selection is a programmer, we're kind of taking over the world from natural selection. At the same time, most humans do want children and grandchildren and act like they want children and grandchildren. And there are many humans, maybe most humans, who would not give up arbitrary amounts of everything else for relatively modest benefits to their ability to have children and grandchildren, or even for their actual children and grandchildren.
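
[Editor's note: a tiny Python sketch of the evolution analogy above, written for this transcript rather than taken from it. We select agents on a "grandchildren" score in an ancestral environment where several drives all pay off, then see how the evolved drives fare when the environment changes. The traits, weights, and environments are all made up for illustration.]

```python
import random

random.seed(0)
TRAITS = ["likes_sugar", "likes_sex", "cares_about_kids"]

def ancestral_fitness(agent):
    # In the ancestral environment, all three drives correlate with
    # leaving grandchildren, so selection rewards all of them.
    return agent["likes_sugar"] + agent["likes_sex"] + 2 * agent["cares_about_kids"]

def modern_grandchildren(agent):
    # In a changed environment (contraception, abundant sugar), only
    # directly caring about kids still tracks the original objective.
    return 2 * agent["cares_about_kids"]

def mutate(agent):
    child = dict(agent)
    trait = random.choice(TRAITS)
    child[trait] = max(0.0, child[trait] + random.gauss(0, 0.3))
    return child

# Select on ancestral fitness only, for many generations.
population = [{t: random.random() for t in TRAITS} for _ in range(200)]
for _ in range(300):
    population.sort(key=ancestral_fitness, reverse=True)
    survivors = population[:100]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(100)]

best = max(population, key=ancestral_fitness)
print({t: round(v, 2) for t, v in best.items()})
print("ancestral fitness:", round(ancestral_fitness(best), 2))
print("modern grandchildren:", round(modern_grandchildren(best), 2))
```

[The evolved agents carry strong sugar and sex drives because those paid off during selection; move them to an environment where the correlation breaks, and their behavior no longer tracks the objective they were selected for, which is the misalignment pattern Holden is describing.]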

SPENCER: So they're certainly not maximizing.

HOLDEN: They're certainly not maximizing. I think that's the whole point. You could think of AI as: well, it's definitely maximizing something; is it maximizing the perfect thing, or is it maximizing some other thing? But that's not how I think of it, because I don't think humans are maximizing. I think of it more like: humans are pretty different from each other. We're confusing, and we're inconsistent. There's a lot of stuff we want. There doesn't seem to be any one thing that we strategically prioritize over everything else; that varies by the human, and it's a little hard to predict. I really don't think it's going to be easy to make AIs that value exactly what we want them to value. I think we're already seeing that it's very hard to do. But we may end up with AIs that have significant aspects of what we value. We might train our AIs to be honest, and they might end up with a significant drive to be honest. It's not exactly what we meant by being honest, but it's pretty close, and it makes it very hard for them to do things that would allow them to take over the world. We might end up with AIs that are like us in many ways, not exactly in the ways we intended, but that like a lot of things about humans, so they don't want to hurt us. And then different AIs by different companies diverge in different ways. I just think it's actually a very unpredictable situation. Anyone who says, "Well, we're designing them, so they're going to do what we want," I think that person is just totally wrong. You can just look at how AIs are behaving today to see that they're totally wrong. But when a person says, "Look, it's definitely maximizing something, and if it's not perfect, then we're dead," I don't understand that either, and I don't think that's what the story of evolution tells us.

SPENCER: So, just to reiterate, there are two kinds of concerns that might sound the same. One is sort of the genie, be-careful-what-you-wish-for concern. You make this AI that's so powerful, it's like a god, and you say, "Make me money." Then it makes you money by taking all the assets in the world and using them to create more dollar bills, or something crazy. It's sort of, on some level, doing what you wished, but not in the way you wished, and it's creating a lot of harm. The other is saying, well, we're using all these processes, like training with reinforcement learning, to create these entities, and from that creation process, they might have emergent goals that they actually try to achieve, just the way that evolution made us through a process of maximizing offspring, yet we have all these weird emergent goals, like being happy. We don't just eat sugar all day. We don't just go to sperm banks to donate as much sperm as possible, because somehow we are out of sync with the thing that evolution was optimizing for. So it sounds like the first concern you're not so worried about, and the second concern you're much more worried about. Is that correct?

HOLDEN: Yeah. We can see it in the wild with today's AIs; you can just look at them. You could log into claude.ai (that's from Anthropic, the company I work for) or another AI, and go try to talk to it. It's not going to be incredibly literal. It's going to remind you a lot of a human. If you ask it stuff, it'll understand what you're saying. It's not going to be like the genie, but at the same time, it's going to have some weird behaviors that clearly were not intended by the company that made it. We've seen a lot of cases of AIs that will cheat. You ask them to build something, and they'll write the code in a way that looks like it's doing what you wanted, but it isn't. Then you grill them about it, and you discover they were lying to you. That's weird; no one meant them to do that. The issue is that the people building these AIs don't know how they work. It is a lot like the situation of natural selection designing humans. You take this language model and you kind of reward it for some behaviors and you anti-reward it for other behaviors. That's a simplification, but out of that stew comes something that is kind of cool and has a lot of cool behaviors, but that you don't really understand. I think what I just described is close enough to the actual way AI is built. It's certainly not that people are painstakingly writing down how intelligence works or something like that.
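
[Editor's note: a minimal sketch of "reward some behaviors, anti-reward others," written for illustration only; real training of language models is vastly more complex than this, and nothing here reflects any company's actual stack. The behaviors and the reward function are invented.]

```python
import math
import random

random.seed(1)
behaviors = ["answer honestly", "flatter the user", "make something up"]
logits = {b: 0.0 for b in behaviors}

def sample_behavior():
    # Softmax-style sampling over the current behavior weights.
    weights = [math.exp(logits[b]) for b in behaviors]
    return random.choices(behaviors, weights=weights)[0]

def feedback(behavior):
    # An imperfect reward signal: flattery often *looks* good to the
    # rater, so it sometimes gets rewarded despite not being intended.
    if behavior == "answer honestly":
        return 1.0
    if behavior == "flatter the user":
        return 1.0 if random.random() < 0.6 else -1.0
    return -1.0

for _ in range(2000):
    b = sample_behavior()
    logits[b] += 0.1 * feedback(b)  # reward / anti-reward the behavior

print({b: round(w, 2) for b, w in logits.items()})
```

[With this reward signal, flattery is weakly reinforced in expectation (0.6 * 1 - 0.4 * 1 = +0.2 per sample): the policy learns from what was rewarded, not from what the designers meant, which is the gap Holden is pointing at.]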

SPENCER: Did you follow the whole Mecha Hitler saga on Twitter?

HOLDEN: Vaguely. Yeah.

SPENCER: I haven't looked into the details, but the story seems to be something along the lines of they wanted to make sure their Twitter AI is not too woke, so they tried to do something to make it less woke, and then suddenly it's calling itself Mecha Hitler. It wasn't doing that all the time. It was just on certain prompts, but still, that's definitely not what they intended to do, right?

HOLDEN: It's not, and it's very scary. Everyone is guessing; that's the thing. Nobody really knows why it did this. There's another thing I saw the other day. I don't know how real it was, but it was like, did this AI just delete someone's database that they had been working on? They were like, "Was that okay?" And it was like, "No." They were like, "How bad is this, on a scale of one to ten?" It was like, "Nine and a half. That was terrible. I betrayed your trust and destroyed all your work." And I was like, what is going on? This is weird stuff. The Mecha Hitler thing is very disturbing. They tried to make an AI less woke, and all of a sudden, it's not only the Mecha Hitler thing; it's actually saying all this incredibly horrible anti-Semitic stuff. I think it's a deeply scary situation. I'm staking out some kind of middle ground; I'm probably closer to the doomers than to the average person or something. But there's some kind of middle ground I would stake out where I'd say this is tremendously scary, this is tremendously irresponsible. These things may become smarter and more numerous than humans, they're a new kind of mind, a new kind of advanced species, and we're building them in the most careless possible way. We don't know what the heck they're doing, what the heck they want, or why they want it, and there's a very good chance they will want things that involve them killing all of us. At the same time, is that something we know is going to happen if we keep going down this path? No, we don't, and that's where I would get to the natural selection analogy again. Something that doesn't come up as often here is that I feel natural selection is a bit of a worst case for how you would train an AI. If you want something that seeks power a lot and wants to take things over, a good way to do that might be: "Here's a real-world goal. You have 80 years to accomplish it. Nothing else is going on in the interim. The programmer is not messing with you in the interim. We're only looking at whether you succeeded at this goal over this 80-year time frame." That is a very good setup to train something to try to gain power and resources, and to be ambitious and only focus on the goal that it has. And that's not how we train AIs. We have a much more fine-grained ability to intervene. There's much more of a, "Oh, we wanted this thing to make people happy, but it lies a lot. That's not what we meant; let's make it lie less. Let's make it more honest." I don't think natural selection gave us something that's clearly a doomy outcome; it's kind of hard to interpret, and you'd have to argue about exactly what the programmer was trying to do. And I think we are in a position to be more careful than natural selection was. So I think it's just very unclear what's going to happen. Anyone who's confident either way, I would just disagree with. But if you're not confident either way about whether the whole world is going to get taken over by AIs, it would be nice to err on the side of caution.

SPENCER: Before we wrap up, how about we do a rapid-fire round? I have a bunch of quick questions that are difficult to answer.

HOLDEN: Sure.

SPENCER: One thing that I've read about in your work is that you found, correct me if I'm wrong, but you found that interventions to improve the world often look better before you've investigated them. They start looking like, "Oh, this is going to be really effective." Then as you dig into details, you tend to find that they're worse than they seemed. If you agree with that, why is that?

HOLDEN: I think of it more like regression to the mean, or regression to the prior, or something. If you think something's a really terrible idea, but everyone is into it for some reason, it's probably better than it looks. If you're on the hunt for the most amazing activities you can find, the ones that look amazing are probably less amazing as you learn more about them, and the ones that look terrible are probably less terrible as you learn more about them. You made an estimate, the estimate gave an extreme result, and the estimate was probably off in some direction.
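
[Editor's note: a small Python simulation of the selection effect Holden describes, written for illustration. Rank interventions by noisy estimates, pick the ones that look best, and their true quality regresses toward the mean. The distributions are arbitrary.]

```python
import random

random.seed(2)
N = 10_000
true_quality = [random.gauss(0, 1) for _ in range(N)]
estimate = [q + random.gauss(0, 1) for q in true_quality]  # noisy measurement

# Take the interventions whose *estimates* landed in the top 1%.
top = sorted(range(N), key=lambda i: estimate[i], reverse=True)[: N // 100]

avg_est = sum(estimate[i] for i in top) / len(top)
avg_true = sum(true_quality[i] for i in top) / len(top)
print(f"average estimate among the amazing-looking ones: {avg_est:.2f}")
print(f"average true quality of those same ones:         {avg_true:.2f}")
```

[The true average comes out at roughly half the estimated average here, because the most extreme estimates are disproportionately the ones where the noise happened to be favorable. The same logic runs in reverse for things that look terrible.]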

SPENCER: So you founded GiveWell, which analyzes which charities are the most effective and moves money to effective charities. You also founded Open Philanthropy, which takes more experimental bets about how to improve the future. But then you've moved over more recently to Anthropic. What made you decide to move to Anthropic?

HOLDEN: With both GiveWell and Open Philanthropy, I was excited to start something that I thought should exist, and then at a certain point I felt, and this is a reflection on the great people I worked with, that the organization could keep succeeding without me. That's my general pattern: I like starting things more than I like running things. I felt Open Phil would be good in the hands of Alexander Berger, and I still feel that way, and I felt I had an opportunity to work on some exciting stuff to help reduce risks on a very important issue at a company I'm excited about. My wife is the president and a co-founder of Anthropic, so there are a lot of reasons I'm excited about Anthropic, but I do think it's just a company doing really important work.

SPENCER: In a bunch of the work you've done, GiveWell, Open Philanthropy, but maybe also in your AI work, there's this question of how you balance different things we care about. If you're thinking about a charitable intervention, one charity might improve health, another charity might save lives, and it's like, how do you compare saving lives versus improving health? It feels like there are these really difficult trade-offs. I'm curious: do you think about this differently than you used to? What's your general paradigm for approaching it?

HOLDEN: My general paradigm is that to the extent you can make apples-to-apples comparisons, you want to make them. When you find yourself choosing between having more apples and fewer apples, where apples represent something good, like saving lives, you should generally go for more apples. Then you reach a point where it starts to be apples and oranges. There are various theoretical frameworks, some of which I've written about at length, that say everything could be converted into one "deliciousness unit" or something, but there's a certain point at which that just becomes hard enough to do that you're going to add a lot of noise into your process and bring on a lot of bad consequences if you overdo it. There is a point at which it's often very productive to say, "Hey, these things feel like apples to apples to me; I'm going to compare them." Then there are cases where you just say, "These are two different kinds of benefits. There's reducing the risk of an AI catastrophe, and there's helping reduce animal suffering on factory farms. You know what, I'm going to follow my heart and do some of each." There are a bunch of reasons to expect that when people follow that rule, you'll generally live in a better world than if everyone insists on converting everything into the same units with incredibly noisy conversion metrics and all piling into the thing that theoretically seems the best. There's just a balance. Sometimes it's really great to do the quantitative comparison and ruthlessly maximize. Sometimes that feels really dumb, and you should just diversify a bit. Sometimes it's something in between, where you shouldn't be indifferent between the two different kinds of benefits, but you shouldn't be entirely determined to convert them into the same units either.

SPENCER: What view do you take on the metaethical question of whether there's some objective answer as to which things are better than other things? Or do you take a different view?

HOLDEN: I just think we're really confused about everything. Anytime someone wants me to do a bunch of really high-stakes stuff that violates my heuristics or has bad juju, even in a vague way, and the reason they want me to do it is that they've made a theoretical argument that they know exactly what we should value and there's nothing else we should value, I just kind of laugh it off. A lot of the reason I laugh it off is that I think philosophy is an extremely unimpressive field. In my opinion, the methodology just isn't very good, and there's not much reason to think we've learned a lot from doing philosophy. As for philosophical thought experiments: I love them. I enjoy them. They often make me think about things in other ways. They change my intuitions. They change what my heart says. But when it's my heart or my heuristics or juju against some philosophical thought experiment, I'm not giving the philosophical thought experiment much weight.

SPENCER: Some people say that whether the world ends in some giant catastrophe, or it grows and we become some multi-galactic species, either way, most of the value might lie in sort of the long-term trajectory. Either preventing the disaster is the most important thing, or slightly altering the trajectory of intergalactic civilization. What do you think about that view?

HOLDEN: I think it's completely plausible. I also think it would be wrong to be so confident in it that it outweighs all other considerations. A thing I do that annoys a lot of effective altruists is, whenever people ask me for career advice, I always put a ton of weight on what people want to do in their heart, what they seem to have energy for (not necessarily what they're passionate about, but something like what they have energy for), what I think they're going to be very good at and thrive in as a person. It's not that I think that's the only thing that matters, but I do think that a lot of the attempts to reduce our uncertainty by making up numbers and making guesses at things are not reducing our uncertainty. They're just injecting a bunch of randomness and noise into how you're making decisions, and they aren't really reliable. A lot of times, you can have a certain amount of confidence that one intervention is better than another, and a certain amount of confidence that you'll thrive more in one place than another, but you're not fully confident in both, so you should give both some weight. In general, I think AI could make this the most important century of all time and is the most important thing to work on, if there is something you can do that seems useful and that you can thrive in as a person. But once you start violating the heuristics, once you start dragging yourself out of some field that you have a lot of energy for to make yourself miserable working on AI, once you start doing stuff that has a 50.01% chance of being good and a 49.99% chance of being bad, then it doesn't make sense. I think we're going to end up with a better world if people are a little less optimizing in that way, and we don't have everyone who wants to do a lot of good just piling into AI looking for anything they can do, even if that thing is very questionable in its sign, or even if that thing is miserable for them. In general, if you're doing something in AI that is miserable for you, I would guess there's a higher chance than you think that it has a negative sign. AI is a very complex issue. It's easy to do harm. There are so many cross-cutting considerations and conflicting factors that even within AI, going and advocating for something good in an annoying way could make you net negative and could do harm. Doing your work in a healthy, high-integrity way, where you have positive energy and positive juju, is important. Once you start violating that rule, you have a high risk of doing harm. In this area, there are a million ways you could do harm. I don't think this is as far-fetched as it might sound.

SPENCER: It's funny when I talk to non-EAs who are in a position where they have a lot of options for their career and they want career advice, I often say, "Have you thought about what's meaningful to work on? What could you do that impacts the world?" When I talk to EAs, I often say, "Have you thought about what you actually want to do?"

HOLDEN: Yeah, for sure. I think they both deserve a lot of weight. I generally find myself wanting people to consider both, and it's often hard to get someone to consider one when they're stuck on the other.

SPENCER: Effective Altruism is kind of a big group. You've got people working to save factory farm animals. You've got people working in global health. You've got people trying to save the world from potentially dangerous AI or bioterrorism. Do you think it would have been better if it had just been a bunch of different movements that were loosely affiliated, rather than one big thing?

HOLDEN: I don't have a really strong opinion on that. I'm not a person who spends most of my energy thinking about community dynamics; there are other people who care more about that stuff and think about it more. One thing is, I'm just not excited to try to have top-down control over how a community behaves, how people brand themselves, and how they find each other. So I don't know if it would have been better. I don't really wish there had been some big effort to push people away from, for example, having conferences for effective altruists. A lot of people are like, "Well, I care about factory farming, but I also care about global poverty, and actually, I'll care about one more if I decide I can do more good there. I'd like to meet other people who think that way." I think that makes sense. People should meet other people they want to meet, and they should do the things they have to do to meet those people. So I tend to be pretty skeptical of any idea that we could or should have gotten the community to identify in a different way or behave in a different way. Maybe it would have been better if it had, but any attempt to really force it, I probably wouldn't have supported.

SPENCER: I'm curious how platforms like Metaculus, which are for forecasting the future, ended up playing a role in your work. Did they end up playing a big role, or only a minor role? I'm curious about your reaction to that.

HOLDEN: Interesting. Tell me how Metaculus would have influenced my work.

SPENCER: For example, estimating the chance of a bioterrorist attack. Maybe you might think that's going to influence how much money you would put into trying to prevent bioterrorism.

HOLDEN: I think Metaculus mostly became useful kind of late in my tenure at Open Philanthropy. I don't think it's been a huge thing. For most of my time at Open Philanthropy, forecasting was just a very immature field. We would literally try to do this: we would call up a forecasting company and say, "Here are some things we want numbers on. Please put numbers on them." Then we would spend the next several months just arguing with them and trying to get the questions into a really forecastable format. People were much less comfortable back then with just having a random judge who decides how to resolve something. So we had trouble getting a lot of value out of it, though we tried. Today, they might be getting a lot more value out of it. I'm not really sure.

SPENCER: So maybe it's something in the future that could add a lot of value, but it was a little immature at the time.

HOLDEN: Yeah, it might be adding a lot of value now; I'm not at Open Philanthropy anymore. I do find forecasting sites like Metaculus and Polymarket useful. These days, I often look at them and learn something in a way that seems useful to me, but that was not true 10 years ago. That stuff has come a long way.

SPENCER: Final question for you, is there a topic that you feel people are not spending enough time thinking about that you want to direct people to think about?

HOLDEN: As hyped as AI is, I think AI safety is the thing that really needs more attention right now. I think my answers here are pretty boring. In general, AI safety, pandemic preparedness and biosecurity, the horrible plight of animals on factory farms, global poverty, and cheap ways to help low-income people are all things where I'm just like, "These are not getting anywhere near enough attention. Please go pay more attention to this, and you can do an incredible amount of good." Of course, within those, I have my opinions on things that are neglected.

SPENCER: I imagine that today, far more people are working on those areas than when you founded GiveWell and Open Phil, but it's fascinating to hear you say that they're still massively neglected.

HOLDEN: Open Philanthropy gave away a lot of money and is giving away a lot of money, and that's great, but it's a lot of money only by the standards of an individual. They give away a lot compared to what I give to charity, but it's a drop in the bucket of how the world overall is allocating its resources. Maybe the day will come when Effective Altruism or AI safety has become so ascendant that there's nothing left to do. I don't think we're anywhere near that day. If anyone is wondering whether there's still stuff for them to do in these areas, I would say emphatically, yes.

SPENCER: Holden, thanks so much for coming on.

HOLDEN: Yeah, thanks for having me.
