December 5, 2024
What interesting things can we learn by studying pre-humans? How many different species of pre-humans were there? Why is there only a single species of human now? If pre-human species wiped each other out for various reasons, why might the ancestors of chimps and bonobos (who are very closely related to humans) have been spared? What roles did language, racism / speciesism, and disease likely play in the shaping of the human evolutionary tree? How is AI development like and unlike human development? What can we learn about AI development from human development and vice versa? What is an "AI firm"? What are some advantages AI firms would have over human companies in addition to intelligence and speed? How can we learn faster and retain knowledge better? Is writing the best way to learn something deeply?
Dwarkesh Patel is the host of the Dwarkesh Podcast. Listen to his podcast, read his writings on Substack, or learn more about him at his website, dwarkeshpatel.com.
JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Dwarkesh Patel about the development of AI relative to human development, the advantages of companies composed exclusively of AI labor, and learning through podcasts and compiling knowledge through conversation.
SPENCER: We're happy to have GiveWell as a sponsor of this episode. Have you ever wondered where your donation could have the most impact? In 2007, a group of donors had that exact question. But when they sought out information from charities to help them answer this question, they instead received cute pictures or unhelpful stories. Their experience led them to create GiveWell, an organization that would provide rigorous, transparent research about the best giving opportunities they've found. GiveWell has now spent over 17 years researching charitable organizations and only directs funding to a few of the highest impact opportunities they've found. Over 100,000 donors have used GiveWell to donate more than $2 billion. Rigorous evidence suggests these donations will save over 200,000 lives and improve the lives of millions more. GiveWell wants as many donors as possible to make informed decisions about high-impact giving. You can find all of their research and recommendations on their site for free. You can make tax-deductible donations to their recommended funds or charities, and GiveWell doesn't take a cut. If you've never used GiveWell to donate, you can have your donation matched up to $100 before the end of the year or as long as matching funds last. To claim your match, go to GiveWell.org and pick Podcast and enter Clearer Thinking with Spencer Greenberg at checkout. Make sure they know you heard about GiveWell from Clearer Thinking with Spencer Greenberg to get your donation matched. Again, that's GiveWell.org to donate or find out more. GiveWell.org to donate or find out more.
SPENCER: Dwarkesh, welcome.
DWARKESH: Thanks for having me, Spencer.
SPENCER: Why have you been so obsessed with pre-humans lately?
DWARKESH: There's a couple of reasons. First of all, I think it is sort of the wildest thing that has ever happened, at least so far. You can go back relatively close to our time — hundreds of thousands of years, or even just 50,000 to 60,000 years ago — and there were half a dozen different human species all around the world. It's basically the most interesting thing that's ever happened in human history. And it turns out that a bunch of things we thought we knew about it are wrong. "Where did it happen? Did it happen in Africa?" It turns out many of the key periods in human evolution happened outside of Africa — we're learning that now. "When did it happen?" It's the most important thing ever, and we don't know the basic facts about it. We're only learning them now; it's very, very interesting stuff.
SPENCER: So yeah, let's set the stage a little bit. So we go back roughly 50,000 years. My understanding is that back at that time, we had the Denisovans, about which very little is known. We have just a tiny bit of information. But if I understand it properly, many Asians living in the world today have about 4% of their DNA from them, which is kind of amazing.
DWARKESH: And Native Americans.
SPENCER: And Native Americans, too. Interesting. And then we had Homo — I'm not going to pronounce this right — Homo floresiensis, and I think even less may be known about them. We know they were very small and lived on an island in Indonesia, but I'm not sure much else is known. But they were another species of humans. Then we had Neanderthals, which are kind of the big other group that we know a lot about. And then, of course, our ancestors, Homo sapiens. And then there might have been other groups, but we at least know that there were those four roughly coexisting. Is that right?
DWARKESH: I believe that there are also two different kinds of Denisovans, and they are two distinct species. But, yeah.
SPENCER: And so then Neanderthals, apparently, looked quite a bit like us. They had larger brow ridges, prominent noses, and a stockier, more muscular build. And so it's really weird to think about this: Imagine you go to the supermarket and there are just three species of humans there. You almost imagine Star Wars, where you walk in and there are humanoids who aren't your species. That could have been a normal occurrence, but it's not what we see. And you do see this in other animals: among mammals, there will often be several fairly similar species — there are many different types of bats, for example. So it's kind of weird that there aren't different types of humans, even though we know there used to be. So that kind of sets the stage a little bit.
DWARKESH: Yeah. And with beetles or something, there are hundreds of thousands of different species, so the fact that we ended up with one here is very interesting. In fact, it even gets into some fuzzy territory: there are obviously different groups of humans alive today, and obviously we're all far more similar to one another than Neanderthals were to modern humans. But it turns out that for people in Eurasia, a significant percentage of their ancestry is from Neanderthals. So one of the points that David Reich made in the episode was that it's not even clear that people in Europe and Asia are simply modern humans — you could just as well describe them as Neanderthals who had waves of admixture with modern humans.
SPENCER: I think what makes the whole story so complicated is that there's constant interbreeding between groups. You can find Neanderthal DNA in people living in Europe, for example. I think it's often you'll find something like 2%. So while these were really distinct species, there was constant interbreeding, making the whole story a big mess.
DWARKESH: Yeah. I'll zoom out and say what's especially interesting about this whole story. You know the story that we wiped out all these other human species, or at least that we did a little bit of interbreeding with them — but the fact that they're about 3% of our genome and not 97% of it, or even 50%, tells you something important about the way we interacted with them. I should give some context for the audience: one of the reasons I got interested in this topic is that I was preparing to interview David Reich, the geneticist of ancient DNA, whose book covers these kinds of topics. The reason I find it so fascinating is that this is not a one-off event. If you look through human history, there's this group of roughly 1,000 to 10,000 humans — something like a tribe, a fairly homogeneous group — from which everybody with ancestry in Eurasia is descended. So many continents' worth of people are descended from this group of 1,000 to 10,000 people who lived 50,000 to 60,000 years ago. And this exact kind of thing keeps happening: you have archipelagos of different tribes, different groups, and one of them figures something out, and because of that discovery, or some change in them, they're able to have a tremendous impact on the future of the species and the future of the planet. They're able to dominate the gene pool. Another example is the Yamnaya, a group of people 5,000 years ago who originated on the steppe, and a lot of the ancestry in Europe, in India, and many other places comes from them. Before that, there were the people from the Near East, the original farmers 12,000 years ago, who first discovered farming and then displaced the hunter-gatherers in Europe. I just find it fascinating — somebody figures something out and displaces everybody else, then some pocket of that group figures something out and displaces everybody else. It happened again and again.
SPENCER: I think another element that deepens the mystery is that chimps and bonobos are still around. We do have common ancestors with chimps and bonobos. In many ways, they resemble us, but they don't resemble us nearly as much as, let's say, Neanderthals. So it's interesting. It wasn't like everything like us got wiped out, but the things that were too similar to us got wiped out. What's your best guess of why? Why was there this kind of genetic bottleneck? It seems like people believe it was something like 10,000 people who kind of took over and spread throughout the world, and then everyone else disappeared.
DWARKESH: Yeah. I found it really intriguing that somewhere between 50,000 and 100,000 years ago, at least according to one line of evidence, the human population in general was pretty low — not just the one tribe that took over everything. I should clarify that I'm recounting my amateur read-through of this research from a couple of weeks of prep before interviewing David Reich. People hypothesize about the cause of that, whether it was the Toba eruption or something else. The question is: of the tens of thousands of humans who were alive at the time, why was it this one small tribe of 1,000 to 10,000 people — and it might not have been literally a tribe, but it was some small group — that came to dominate? I feel like David Reich didn't give me a really satisfying answer. He thinks it was a cultural change of some kind. His explanation was that in most groups, when key elders die or some other tragedy occurs, you basically just lose all the cultural knowledge — there's no process of accumulation — whereas this group presumably developed some cultural technology that allowed it to accumulate a library of artifacts and concepts, which allowed it to be successful. I found that confusing, because we know all humans alive today have language, even though different branches of the human family tree split off more than 200,000 years ago. So language has been around for hundreds of thousands of years; humans with language have been around for a long time. What happened 50,000 years ago, if language is the thing that's so dominant? Anyway, I guess the short version of the answer is: I don't know.
SPENCER: There is some genetic evidence here: if you look at where on the genome you still see Neanderthal DNA, and where there are gaps with none, it suggests that the genes in those gaps were being selected against. And one of those gaps contains genes that are known to be involved in language ability — people with disruptions in those genes have trouble with language. So it's possible that Homo sapiens were significantly better at language than Neanderthals, and that gave them a leg up, although it still doesn't really explain why such a small group became the ancestors of modern people.
DWARKESH: It also doesn't explain why it took so long. I think FOXP2 is the gene people often refer to. I'm assuming there are many more genes responsible for language, but that's one where, if it's disrupted, somebody will have issues. It's actually really interesting because it tells us a lot about intelligence, too. People with a disrupted version of this gene still possess general reasoning power, but they can't form complex grammars and so forth. It is interesting that language and intelligence can be divorced in this way. The FOXP2 gene has been around for hundreds of thousands of years, if not over a million years, and even distant species like parrots have a version of it. What exactly happened 50,000 years ago is not clear, and I'm very interested in what that might have been.
SPENCER: I also wonder if straight-up racism, or what you might call speciesism, might have been a factor. Think about how much even modern humans have struggled with groups that look even a little bit different from them and how horribly they've treated each other, killed each other, and massacred each other. Imagine there was a whole different species, and how easy it would be to kind of other them. I can imagine a world where there was general conflict between these slightly different species, and it would be unlikely that the world would end up with more than one of them surviving long term.
DWARKESH: The story that David Reich lays out in Who We Are and How We Got Here is kind of horrendous, because it's hard to find an example of two different groups encountering each other that doesn't result in some sort of genocide. For example, he says that if you look at Native American ancestry, the only lineage other than the main ancestry that survives in the genetic record is the so-called Population Y. The reason we can find traces of it is that that group was lodged within the Amazon jungle, and since traveling into the Amazon to hunt everybody down is such a big friction, it gave enough time for that group to interbreed into the dominant group rather than getting killed off, as we think happened with other groups. It's interesting that you need some sort of geographic abnormality like that to prevent genocide when two different groups of humans encounter each other. I also wonder, "What was it like for them?" Right now, if you meet somebody of a different race, you have an understanding of what's happening: they're from a different geographic location, and the ancestral environment selected for certain changes, like in melanin, and we understand that now. What did they think was happening? They just ran into a group that looked completely different from them. What do you think was going on there?
SPENCER: I remember hearing a story about a sociologist who would embed with different tribes. One time, he embedded with one particular tribe, and they told him about these terrible people who live on the other side of the mountain — that they're not human, they're actually demons, and they'll murder you if you go over there. At some point, the sociologist does go to the other side of the mountain, and he meets another group of people there who, as far as he can tell, are exactly like the people on the first side. From his perspective, he can observe no differences whatsoever, and yet, just by being on the other side of the mountain — who knows what kind of history they had — they demonize each other. Now imagine adding real, structural differences, where they look different, where maybe their physical characteristics are somewhat different, maybe their mental characteristics are somewhat different. How much more does that enhance the effect?
DWARKESH: Yeah, and we know that there were physical differences, for example, between the ancient Europeans — the descendants of the Anatolian farmers who colonized Europe, let's say, 8,500 years ago — and the Yamnaya, who colonized it 5,000 years ago. The Yamnaya were physically bigger and more imposing; we can tell that from their fossil bone structure. It would just be wild to see this: physically more imposing people, these crazy steppe people with horses. I would be very curious to understand what people thought was happening when this group came through. One interesting thing I learned from that story was that Yersinia pestis, the bacterium behind the Black Plague, was found in the Yamnaya fossil record. The Yamnaya brought the plague with them, and it accompanied the sort of massacre of the Europeans of the time. I can't imagine how confusing and tragic that situation might have been from the perspective of somebody there.
SPENCER: This group that's totally different, or at least perceived as totally different, spreads this unknown disease that's killing everyone.
DWARKESH: Yeah.
SPENCER: We also talked about viruses, so that's another theory. Could there have been some virus, bacteria, something that there were different susceptibilities to, and could that explain why either certain groups died out or just a small number of people survived?
DWARKESH: James Scott — an anthropologist who thinks agriculture was a big mistake and we never should have done it — has a theory about why the barbarians and hunter-gatherers weren't able to resist the agricultural nation-states. It's basically what happened to the Native Americans when the Europeans arrived, and it happened again and again throughout human history: the first agricultural states, which had diseases because they had the conditions that lead to diseases, spread those diseases to people who didn't have immunity and couldn't organize against them. It was surprising to me how many of the diseases we think of as just native to humanity — everything from tuberculosis to cholera to typhus, go down the list — are only about 10,000 years old. These are not ancient things. This is very much when agriculture happened, and then these things developed.
SPENCER: I wonder if that could help explain why they're so bad, why they hurt humans so much. It doesn't seem like the best strategy for a virus long term to hurt its host so badly. You'd think that eventually the optimal virus, from the point of view of evolution, reaches some kind of symbiosis with its host. Eventually, you wouldn't even really know it's there; it would just quietly infect close to 100% of the population.
DWARKESH: Yeah, although malaria is more ancient, and I feel like that's a sort of significant handicap.
[promo]
SPENCER: You also mentioned culture, and I think it's fascinating to think about animals like parrots or even cats. They have some form of communication with each other, for sure — whether it's language depends on how you define language — but they have communication, and they don't build on it; they're sort of starting from scratch. With my cat that recently passed away, I had that kind of language that we developed together. When he sat on my scale, I knew it meant he either wanted to be petted or to play, and then I would try to pet him, and if he dodged, it meant he wanted to play. Language kind of happens automatically in a way, or at least communication does. Yet cats and parrots are kind of starting from scratch every single generation. This seems like perhaps the biggest difference with humans: we don't start from scratch. We build on what's been done before, and it's really hard to compete with building on what's been done before.
DWARKESH: This ties into AI in a very interesting way. I'm sure you've heard the phrase "unhobbling" — I think Leopold coined it in his "Situational Awareness" post — the idea that you can make seemingly minor changes that have a very big impact on how powerful the system is. In the context of AI, it could be that the model gets integrated into your computer, or it can do test-time compute, things like that. You really do see evidence of unhobbling in human history. Whatever happened 50,000 to 60,000 years ago, we know it can't have been a humongous genetic change. It really does seem like there was some cultural change that was the equivalent of an unhobbling, where you retain cultural knowledge. Quintin Pope had an interesting article on LessWrong a little while back, where he made the point that whereas human culture is constantly losing information over time — you constantly have to relearn and re-accumulate it — that's not really a problem for AI, because when you go from one model to another, you can train on the same data. He interestingly used that as an argument against fast takeoff. In humans, there was this big overhang caused by something previous humans lacked, and that overhang is what created room for a fast takeoff in humans; but AIs already have the ability to retain the equivalent of cultural knowledge over time. When you train a new model, you don't just discard the entire dataset. You can keep training on the same data, you can add more data, and so forth.
SPENCER: Is the idea that with AI, because we're already doing this, there's not going to be a sudden surge of new data that takes things to the next level? So if we're going to get a fast takeoff, where AI advances really quickly or becomes superintelligent, it would have to be some other factor driving it. Speaking of AI, on your podcast, the Dwarkesh Podcast, you've interviewed many luminaries in AI, including Mark Zuckerberg, who is pushing forward AI in his own way. I'm wondering what you've changed your mind about from doing all these interviews about AI.
DWARKESH: One important thing you learn is how much conviction each of the different CEOs or lab leaders has in the scaling picture. Sometimes they'll say, "We think this is one of many important things we're working on..." But I feel like Ilya Sutskever was the strongest example, and Dario Amodei is another, of someone who really thinks you just need a couple more years of making the model bigger. If you believe that, obviously it's going to inform your strategy as a company and as an organization and what you make investments in. I think that's a very interesting thing to learn from these interviews.
SPENCER: Are you saying that they each have very strong convictions, but they don't necessarily agree with each other, so they're kind of contradictory?
DWARKESH: Yeah, or just how much they buy the scaling picture, because some of them really think that they are a couple of years out from AGI, where they just think they have to make their current systems bigger. I think Ilya and Dario, for example, believe that if they just make their current systems bigger, eventually, and not that far off, you get AGI.
SPENCER: Dario from Anthropic?
DWARKESH: That's right, yeah. Whereas with somebody like Zuckerberg, it seems like he doesn't necessarily think that. I think he sees it as a technology that's useful now and will get better over time. I don't want to put words in his mouth, but at least my understanding after I talked to him wasn't that he thinks if you just make Llama 3 a hundred or a thousand times bigger, you're going to get AGI.
SPENCER: One thing I find so fascinating about this topic, on a meta level, is that you've got really smart people on every single side of this issue, from "AI is way overhyped" to "AI is going to be transformative, but in the way the internet was transformative" to "AI is literally going to kill everyone on Earth" to "AI is going to create some super authoritarian state" to "eventually we're going to merge with AIs, and it's going to bring in some paradise." Do you have a stance that you take on this? Or do you feel undecided?
DWARKESH: I do feel undecided, but one interesting thing here is that there are a lot of individual opinions where, if you change your mind on one thing, it probably changes your perspective on the whole picture quite a bit. To take one directly: if you thought that an intelligence explosion is not possible — which is to say, if you thought that once you get to AGI, you can't pretty rapidly thereafter arrive at something that is to AGI what humans are to chimpanzees — that significantly changes your picture, not only of what the future looks like, but of what makes sense to do now. In the world where you only ever have AGI, taking fewer precautions and being a little skeptical of the doomers makes a lot more sense. And you have a lot of questions like this, where how you think about whether ASI is possible has a significant impact on your entire worldview.
SPENCER: Yeah, I don't know that that particular question changes all that much for me, because suppose you think we create an artificial general intelligence — I don't know what definition you prefer, but let's say something that can do pretty much any task as well as 99% of humans. Is that a reasonable definition?
DWARKESH: Yep.
SPENCER: So let's suppose you think that's sort of where AI caps out. You can't do much better than that. Well, even if that's true, AIs can coordinate in a way that normal people can't. You could have a million copies of the same AI that work perfectly together because they're literally copies of each other. They could potentially share information with each other much more efficiently than humans can. They could work much faster; imagine an AI running at 10,000 times the speed of a human brain. So even if it can only do 99% of the things that humans can do better than 99% of people, it could do them so much faster that it kind of can displace humans at all kinds of things. So I feel like the world gets just extremely crazy and ridiculous even if you can't get them more intelligent than that.
DWARKESH: I basically think that's right. I don't have big quibbles with that point of view.
SPENCER: Are there any other big cruxes you notice where you think there are things that, if you believe A, then you'll think certain things about the future? If you don't believe A, you'll come to really different conclusions about the future with regard to AI.
DWARKESH: Maybe just going back to this one point: even if you think you will eventually be able to figure out how to make AGI firms — where you have millions of copies cooperating in ways that humans cannot, because they can just share their latent-space representations directly — and we can go down the rabbit hole of what an AI firm would look like, which I think is a very interesting subject — you should probably still expect it to take at least many years for somebody to make a very efficient AI firm. For thousands of years, if not hundreds of thousands, humans have been figuring out how to cooperate with each other, and we do notice that, over time, we learn how to become better at management. We learn how to organize bigger and bigger nation-states, or better and better companies. So it's not unreasonable to expect that kind of learning curve with AI firms as well. In that case, you have many years to figure this out, whereas people like Yudkowsky, who expect that as soon as you get AGI you have to start worrying about the nanobots, are relying on some jump to superhuman intelligence being rapidly imminent.
SPENCER: So you mentioned the AI firm. How would you define an AI firm?
DWARKESH: It's just a firm where all the employees are AIs. And it's in a world, presumably, where the AIs are as skilled as humans, at least.
SPENCER: Then, is this something that you expect to happen? Do you think that at some point in the not too distant future we will see companies essentially with almost entirely AI workforces?
DWARKESH: Yeah. Assuming we get to AGI — and I've been thinking about this a lot recently — I think the advantages that AI firms will have over human firms are so significant that it'll be like the difference between bacteria and eukaryotes. If you look at the history of bacteria, for billions of years they basically don't change; bacteria from a billion years ago are just about as complex as the bacteria we have today. With eukaryotes, you had some unhobbling that allowed them to produce more energy, accumulate more complexity, and so forth, and with that change you get the emergence of something with much greater levels of complexity. This may be too vague, so let me start with what the advantages of an AI firm would be. There's a lot of interesting stuff here. You already mentioned a bunch of these, but the first one you have to mention is that you can copy employees arbitrarily. Imagine you have an AI that is as skilled as Jeff Dean or as skilled as Sundar Pichai. A big advantage is that you can copy those AIs. People are so used to human firms that we don't consider how big a handicap it is that the CEO has such limited bandwidth relative to the size of the firm. Elon, who is running five companies at once, can see only a small fraction of the things happening at Tesla or at SpaceX. Now imagine you can make unlimited copies of Elon — assuming we have some AI as good as Elon, which is what we're talking about here. Not only are these copies as skilled as Elon, but they have all his context and knowledge, because we're getting an exact replica of his weights: all the tacit knowledge about how Tesla works, who the employees are, what the plans are, and so forth. These copies can micromanage every single part of the stack. They can manage all the hundreds of thousands of employees. Personally, they can be the mechanic at a Tesla dealership. They can write every press release, write every pull request. You could ask, "Why is it important to have one entity doing all this?" The big advantage is that you have one mind that can comprehend everything happening in the firm — something with the bandwidth to really understand how different strategies relate to each other. Another element is that you can merge these copies back with each other, whereas humans just have to talk to each other, and there's very little bandwidth there.
SPENCER: It's really interesting how, in a normal company, even if the CEO in some sense controls the company, there are many senses in which they don't. They're not actually doing all the jobs. They're not even seeing people do every job. There are layers of information passing. Do you think that with this idea of an AI firm, information could be transmitted much more efficiently?
DWARKESH: So look, imagine a firm like Google. How much does it see in a day? It sees countless petabytes of information: every single customer service report, every single piece of market analysis and market data, every single new technology that comes out and needs to be mastered. Obviously, the CEO cannot see all these things at once. We do know that CEOs are incredibly valuable — replacing Steve Ballmer with Satya Nadella will, over the long run, have potentially hundreds of billions of dollars, if not trillions, of impact on the market cap. So we know they're valuable, but they're incredibly limited in many important ways. Now imagine Satya Nadella having access to every single customer service request, so he understands what's wrong with the products. Copies of Satya can constantly be spawned and merged back into the main copy, with a kind of surgical precision where you're adding exactly the pieces of knowledge you want the main copy to have. And whenever you spawn a new copy, it has all the context about what's important and all the tacit knowledge. That's one part of the picture of why this is so significant.
SPENCER: Is your expectation that if we get AI firms, they will massively outcompete regular human firms in whatever domain they're able to compete in? Maybe AI can't do everything a human can do, but in areas where they can do what humans can do, and you can have a full company of AIs, they'll just have so many advantages that human firms will go extinct in that arena?
DWARKESH: I think a big thing here is that you can have way more experimentation. I imagine a lot of these AI firms just won't work, and maybe it's a bad idea to have Elon replace every single job at Tesla, but the big advantage here is you can test out different theses, and whatever works, you can replicate it. Suppose some team at Google is really effective, or the company as a whole is really effective. Right now, you simply cannot replicate highly efficient firms, and therefore firms have incredibly short shelf lives, where in a couple of decades, most firms will go bankrupt or be extinguished. They can't replicate their culture over time. They can't replicate their best employees over time. The ability to scale those things, to replicate them with exact fidelity, is such a huge unlock here that I think they will be able to significantly outcompete human firms.
SPENCER: I know in your writing you mention how Warren asked the question, "Why is it that the best firms don't just take over everything?" If you have these really great businesses with really great cultures, and they have all this money and power, why do startups keep coming along and beating them, and why don't the best firms just copy themselves? It seems that, as he concludes, the problem is in the replication step: real-world firms can't just make copies of themselves.
DWARKESH: If you can't replicate yourself, then the real handicap is that the best things can't really be selected for. You can have a very successful firm, and at some point the culture is going to decay, all the best people are going to leave, and then it'll die. You get Skunk Works once, but then it becomes stagnant. The fact that you can find what works and then increase its output — not 2x, not 5x, but 100x, 1,000x — is a huge deal. To give one concrete example, suppose you think there was something special about the initial team that launched SpaceX. For many decades, the American space industry was stagnant, and then over the last decade or two, they brought down the price of getting a kilogram to space by two orders of magnitude. There was probably something special going on with that team. If they were AIs, you could make exact copies of everybody on the team and throw 1,000 copies at 1,000 different hardware verticals and let them take a crack at it. Let one copy take a crack at fusion; let another take a crack at some other part of the puzzle. I think that's one part of the big unlock here.
SPENCER: Part of me wonders whether the idea of a firm or a company is really built around constraints that humans care about but that may not be relevant for AI. For example, let's say you need to raise money — well, you've got to have a company to do it; how else are people going to give it to you? And if you're trying to hire people, you need to have a company to do it. But if you have, say, a million AIs that are copies or near-copies of each other and can communicate perfectly, do you need a structure like that? Or could it be more like a swarm — a million AIs just doing stuff?
DWARKESH: What you just mentioned — the ability to flexibly increase or decrease labor. Right now, if you figure out a great product and you need to rapidly scale it, you just can't million-x the output, except in special cases like SaaS software. With AI labor, you could do that. Then the question is, "If I can just replicate my labor, why would I, as a firm, ever do business with another firm? Why won't I just spin up the same labor force and do the thing myself?" I think the answer has to be something like IP. If you look at something like Tesla's self-driving today, there's a model built on a dataset, and that dataset is incredibly valuable and gives Tesla's model an advantage. You could imagine that across different industries, everybody can spin up the same amount of labor — capital is fungible, compute is fungible, and labor in this world is fungible — but because of these kinds of learning curves, things like IP, things like data, and obviously things like physical property and geography will become much more relevant.
SPENCER: What are your current thoughts on whether, in such a world, you would have one AI firm that would kind of dominate, like super monopoly versus whether you might end up with a duopoly, or just a really large number of these kinds of firms?
DWARKESH: I think that's incredibly hard to predict, but I do think firms will definitely get bigger, because at least some of the diseconomies of scale you have with human firms will go away — the fact that people can't communicate properly, the fact that the CEO is so limited in their ability to observe what is happening. So firms will get bigger. Whether that means there will be one über-firm that figures out the equivalent of what that small tribe figured out 60,000 years ago and dominates everything else — I don't have a strong opinion on that yet. It's a question I'm very interested in and have been trying to think through.
SPENCER: It seems to me that, to some degree, this recapitulates the question of whether one single AI will bootstrap itself up into superintelligence. It's sort of asking the same question at the level of a collection of AIs: if you have a collection of AIs working cooperatively, really integrated, are they able to get a competitive advantage — maybe they can add more sub-AIs faster or better — and then get an exponential gain on the others and eventually overtake everything?
DWARKESH: I think that's a thing that could happen. I don't see a fundamental reason it couldn't, but I don't have a strong opinion on it — I'm just assigning probabilities, especially when it comes to a future thing I barely understand. The fact that you can't rule it out is a serious worry. AI firms do have a unique dynamic that favors bigger firms. Take a company like Google: suppose it has a much bigger market share, and it wants to make its workers better by making a better AI model. It can overtrain its model and amortize the higher compute cost of training over much more inference, because it has, say, millions of AI laborers. A small business is only going to use the model for the equivalent of thousands of employees; it just can't amortize that cost over as many copies. So you do have this unique economy of scale for bigger AI firms, because of this copying dynamic, which might give bigger firms an even bigger advantage.
SPENCER: Because there's this kind of fixed cost of training, and then when you make copies, you're only paying for inference, which might be a lot cheaper. So whoever can afford the biggest fixed cost can make the best copies, and if you're making lots and lots of copies, you get more and more advantage from having more and better training. Is that right?
DWARKESH: That's right. And again, lots of uncertainty in this picture, because presumably you could have foundation model companies that are amortizing the cost over many clients, even if many of the clients themselves are small startups or something. But in the world where they're trained within the firm because they have firm proprietary knowledge or something, this is a dynamic you could see play out.
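To make the amortization arithmetic in this exchange concrete, here is a minimal back-of-the-envelope sketch in Python. All figures (a $1B training run, $10k of inference per worker) are hypothetical, chosen only to show how the fixed training cost per copy shrinks as the number of deployed copies grows; they are not numbers from the conversation.

```python
# Hypothetical back-of-the-envelope sketch of the training-vs-inference
# amortization dynamic discussed above. All numbers are made up.

def cost_per_worker(training_cost: float, num_workers: int,
                    inference_cost_per_worker: float) -> float:
    """Total yearly cost per AI worker once the one-time training cost
    is spread across every copy that gets deployed."""
    return training_cost / num_workers + inference_cost_per_worker

TRAINING_COST = 1_000_000_000   # $1B spent (over)training a stronger model
INFERENCE_COST = 10_000         # $10k of inference per worker per year

for workers in (1_000, 1_000_000):
    per_worker = cost_per_worker(TRAINING_COST, workers, INFERENCE_COST)
    print(f"{workers:>9,} workers -> ${per_worker:,.0f} per worker")

# Output:
#     1,000 workers -> $1,010,000 per worker
# 1,000,000 workers -> $11,000 per worker
```

Under these made-up numbers, the million-worker firm pays roughly 1% of what the thousand-worker firm pays per copy, so it can justify a far larger fixed training spend per model — the economy of scale being described.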
SPENCER: Do you see use cases of, let's say, trillions of dollars of inference with AIs?
DWARKESH: I think the AI firm is the closest example of this, but let me give you another one. We've talked about horizontal scaling, where you just have more and more employees, and I think that is incredibly important: if a firm is functioning well, it can just rapidly scale up its inputs. But here's another dynamic: would it be worth it for Google if Sundar Pichai could spend millions of years thinking through Google's strategy and considering every single hypothetical, every single contingency, but it cost, let's say, a billion dollars to run a million hours of Sundar? I think it'd be very worth it. So you could imagine spending billions and billions of dollars for Satya Nadella to do Monte Carlo tree search over different five-year plans for Microsoft. Another dynamic I find interesting about scaling up the number of workers is that you don't have to train each new worker up again. They will have all the skills you need, and they will also have all the firm-specific knowledge, all the tacit knowledge. If you imagine somebody like Noam Shazeer as an AI, you can make unlimited copies of him, and all the intuitive knowledge that takes years to build up, every copy will immediately have. In fact, since you can make copies, you can amortize much more training than you previously could. You can give them millions of years of education, and that becomes worth it, because every copy you make will have that education.
[promo]
SPENCER: If we think about really large uses of AI inference — what you call industrial-scale AI — it seems like one type is going really deep, as you were describing, where you have an AI consider something for what would be the equivalent of a human thinking for a million years, even though it might take the AI a really short amount of time. Maybe you could think of certain AI use cases in scientific innovation like this: a thousand AIs that act like scientists and think for the equivalent of millions of human years to discover new science. The other direction you can go is broad, which is what you were talking about, where you have millions or billions of AI employees all working together to try to achieve some task. It seems like there's an interesting split there, but in either direction, you could end up with really, really large uses of inference.
DWARKESH: Yeah. To give a historical analogy, one of the people I recently interviewed was the historian Daniel Yergin, who wrote about the history of oil through the 20th century. I found it really interesting that the first use case for oil was lighting. For the first 50 years of oil — from the 1850s or 1860s, when Drake first struck it in Pennsylvania, to the 1910s, when the Model T and the first cars were taking off — the main use for oil was just lighting. In fact, people thought that when the electric light bulb was invented, Standard Oil would go bankrupt. Instead, it rapidly increased in value around that time — or at least its shareholders did, because it got broken up. The point being: before the internal combustion engine became common, you just didn't have a use case that required millions or billions of barrels of oil. You had liquid energy, and what do you do with it? You just don't know yet. I wonder if we're in a dynamic like that with AI. Right now, OpenAI has many billions of dollars of revenue — a couple billion or something — and that's serving chat, search, things like that. In a world where that increases a thousand-fold or more, what's happening? I think these rapidly scalable AI firms — both horizontal and vertical, as you mentioned — are one big guess for how you could end up with hundreds of trillions of dollars of inference. Obviously, output will increase a lot in the AI world too, but a stupendous amount of compute will be spent on serving AI.
SPENCER: Before we wrap up, I wanted to ask you about learning, because you've interviewed a large number of brilliant people for your podcast. I've also done a lot of interviews for my podcast, and you wrote something that I thought was pretty striking: "How do I learn more things? I should write more." Many of the most important questions simply can't be addressed extemporaneously over a podcast. Do you feel like you've learned by doing all these interviews?
DWARKESH: I think I've learned less than I could have. Over the last four years that I've been doing the podcast, I've had the chance to interview experts in many different fields. It's sort of disappointing how little I know about many of these fields. Given that I feel like this is what I've been doing full time, I should know more by now. It's improved a lot recently because of tools like spaced repetition and so forth. But I do notice that for the few subjects I write a blog post about, I have a way higher understanding and retention. One of the reasons is that you really do understand where your gaps are. I think I shared the draft of the AI firms post that we've been talking about with you before, and you made some comments about, "It's not even clear what you're talking about." You put it nicely, but you kind of realize that while you're writing, and that's a huge pedagogical tool. What have you discovered as you've been doing all these interviews, and how do you try to retain and gain insights from what you're doing?
SPENCER: Yeah, it's really interesting. I think I go through a period of learning about a topic, then reflecting on the topic, and then eventually writing about the topic, and that's a similar conclusion to you, that's actually how I feel I learn a lot of the things that I know and learn them really deeply. Maybe I've heard about the topic a bunch of times, and then I go to the phase of reflecting on it. For example, I have an upcoming interview for the podcast with a sociopath, and I also have an upcoming interview for the podcast with a narcissist. Those are two topics that I've been thinking about for years, learning about them in bits and pieces. "What is sociopathy? What is narcissism? What are the traits? What are the misunderstandings about it?" Then I go into this phase where I start writing down my thoughts, and it really starts to cohere. That's when I feel like I begin to actually understand the topic.
DWARKESH: So how long is that process for you?
SPENCER: I feel like often I've been thinking about the thing for, let's say, six months to three years. Then I try to write an essay about it, and it's funny because later on, I'll be talking to someone about the topic, and they'll bring it up, and they won't realize that not only did I spend three years thinking about it, but I actually wrote an essay about it. It sounds like I'm just saying something I came up with on the spot, and it's like, "No, you just don't see all the stuff that goes into having your opinion now." But what I was going to say is that then the podcast itself, when I'm interviewing someone, I do learn little bits and pieces, but really, I think what I learn about is conversations and how conversations work, and the interesting dynamics around that more so than I learn about the sort of object-level topic. I'm hoping my audience learns more than I do about the topic. I'm hoping that by the time I have this conversation, I already know a bunch of the things that we're going to be talking about.
DWARKESH: It's funny because I often think the same thing. One thing I do worry about is how little I remember from conversations, especially the conversations I have on the podcast. I feel like I have total amnesia about them, and that makes me suspect a more general amnesia in myself: if I have a three-hour conversation with somebody on the podcast and barely retain any of it, then every time I go to a dinner and people explain interesting things to me, I'm probably forgetting most of that too. I've probably forgotten a lot of really interesting conversations in person — not recorded — really interesting things I've learned. I don't know if you have that feeling when you're rereading a transcript or rewatching an episode that you already recorded, where you're like, "Damn, there was a lot here that I totally missed."
SPENCER: Yeah, I generally do listen to my own episodes, often to see if I made any stupid remarks or fumbles, but also because I just find it interesting listening to the guests. When you're a podcast host, you're not fully listening to them in the same way; you're paying very close attention, but you're also juggling a bunch of things. So I enjoy hearing my guest again from a more relaxed standpoint. But I think the reality is that in order to really retain information, most people have to hear it multiple times or deeply, deeply care about it. As a mathematician, there are definitely theorems I heard once and never forgot, because I care a lot about certain theorems in math. But for most things, you probably just have to hear them multiple times for them to stick. And you will hear them multiple times if you dig deep into a subject: if you really want to learn the thing, you'll probably have a bunch of sources that say different variants of the same thing, and then it will be ingrained for a long time.
DWARKESH: You just mentioned, "I'm hoping my audience takes away more from these conversations than I do." I know Grant Sanderson from 3Blue1Brown has wrestled with this; he's deeply skeptical of how much you're learning from his videos. He's like, "I'm really trying to get you to go to the math textbooks that inspired this video, more so than to watch the video itself and consider yourself educated." I guess you have a lot of tools that you've built to help your listeners at least engage with your content, but I wonder how much you worry about this.
SPENCER: Such a good question. I think there are different types of knowledge and they require different things. For example: there's knowledge of how to do a thing, like how to cook or how to do martial arts. It's absolutely ridiculous to think that you could learn that without ever doing it. To think that you could do martial arts just by reading a textbook is insane. I think it's similar with math. Once you get to a certain level of proficiency with math, you get to a point where you can read something and then go do it. I can do that sometimes with math; I can read a new theorem and then go use it, if it's close enough to the domains I've studied in the past. But until you get to a certain level of proficiency, that's insane to think about. So that kind of procedural knowledge requires you to go do the thing. Then there are specific facts, and specific facts usually require repetition. Some people are really good at remembering little facts, and maybe on topics you're really interested in, you'll pick up a fact here and there that you'll be able to repeat, but you probably won't remember it for years. Most likely, you'll forget 99% of them. Repetition is really a solution: spaced repetition, flashcards, and so on. But then there's another kind of knowledge, which is a subtler kind of knowledge, something like learning the general flow of a topic. If you listen to a whole bunch of podcasts on a topic, even if you don't remember really specific facts, you will learn a bunch of stuff intuitively that will come out later when you're trying to do something related to that. It's hard to say, "I don't remember exactly what happened in that conversation, but somehow it changed my mental models and intuitions." I'm curious what you think?
DWARKESH: Programming has an analogy for this: it's the equivalent of compiled code, of binaries. You can't see the source code itself, but the effect lingers — its ability to make an impact lingers. I do think some of it is, if you've lost the code, then you'd better hope you've at least got the binary.
SPENCER: It is that third type that I hope podcast listeners get. Not that they get specific facts or procedural knowledge, but that on these topics, it shifts their thinking in some important way, even if they can't pinpoint exactly what it was. It makes them hopefully think more clearly. That's part of my mission. I hope it does that at least to some degree.
DWARKESH: I have a general thing where I'm okay with not being able to assess the impact of the podcast, where there's not a clear impact, because if I look back at valuable intellectual contributions or valuable content in the past... I'll give you an example. Robert Caro writes the biography of Robert Moses, the man who built New York City, and it takes him, I think, seven years to do it. Did he know while he was spending all that time interviewing thousands of people and getting into the nitty-gritty on every single bridge and every single highway that Robert Moses built that this book would change urban governance in America and how people think about community engagement and different issues related to how cities are governed? I'm guessing he didn't think about the expected value or the marginal rabbit hole he had to get into to write the book. But it is one of the most important books in terms of the effect it's had on governance in America. Does that make sense? Just kind of do it and then hope it ends up having some sort of impact?
SPENCER: I think so. And obviously we have to be mindful that it could be that we're just not influencing people positively. But I also think that there's another thing that happens, which is that sometimes a particular thing hits a particular user or audience member at the right moment, and it really is life-changing. One of my proudest moments in my writing was when I got an email from someone who read an article I wrote about how to make really hard decisions. She told me that she read it at 3am after she'd fled her house because her husband was beating her, and it was sort of the thing that she needed to get her over the hump to leave her husband for good. And I was just like, "Yeah, that's obviously exactly what you want to happen." Lots of people could read that article and maybe in some intuitive way it influences them slightly, but for her, it was just the right thing at the right moment. My hope is that you create those serendipitous moments, like you have high-quality conversations, sometimes it hits just the right person at the right time, and it makes a really big difference.
DWARKESH: Especially since people don't think about how big the scale can get here. You're reaching so many people — and you, Spencer, you're reaching all these people too. Even if the impact on any one person is small, across the entire audience it's a ginormous footprint, and people just lose track of the orders of magnitude. If you could see the audience live — if they filled a stadium or something — it would be staggering.
SPENCER: It's really weird to think about podcasts in that way. Because, let's say you give a public lecture and 200 people show up, you have this room full of people listening to you; that feels like a lot of people. They're standing all around you, and then it's like, "Wait a minute. You can get way more than that in a podcast episode." It feels totally different psychologically because they're not there, but it's just like, often they listen for an hour or maybe longer; a lecture or talk might be 20 minutes or 30 minutes. So it's just very strange to think about the scale and how different it is than the usual things we do, like public speaking.
DWARKESH: Yep. To put it into numbers: take one of my recent episodes — the David Reich one, I think, has something like 700,000 views on YouTube, and probably another six figures across all the other platforms. Suppose it eventually ends up with a million total listens. Thinking about that in terms of a live audience, there's not a venue on Earth that could house them all. One thing I've been thinking about recently is that I feel I have some sort of obligation to respond to emails, or at least some emails, and to do these kinds of ops-type things personally. But all the million people who could fill this hypothetical stadium are real people too. Suppose I answered a dozen more emails before that interview but did a couple of hours less prep as a result — that sort of scale insensitivity would just be insane. At least, whenever I'm deep in the ops work, that kind of thought occurs to me.
SPENCER: Yeah, it's a weird thought. And yeah, your show gets more listeners than mine, so you have it at a whole other level. But my contact form on my personal website broke, and I didn't realize it, and then I got 142 messages all at once in one day because we fixed it. And I was like, "Holy shit, overwhelming. Oh my God, all these people want these things from me, asking me for all these things." When I sat down, I was like, "Okay, I'm gonna devote two hours to trying to go through these 142 messages from people." It felt very overwhelming and strange, and if your audience grows, it just becomes a bigger problem every year.
DWARKESH: Yeah, I do wonder if that's the case. If you think about the most famous public intellectuals, one thing you would hope they get access to is that the smartest people will reach out to them and give them helpful tips. An author writes an interesting book, and hopefully they're expecting to get a lot of interesting inbound. Realistically, they will get that, but it'll be so overwhelming, with so much random noise, that their ability to make interesting connections with the outside world really fades. Ironically, I just imagine somebody like Steven Pinker — that guy must have a really frustrating inbox.
SPENCER: I can't even imagine. It's almost like email becomes pointless, or you need a secret email address that only your friends get. I'm certainly not at that point, but I think someone like Steven Pinker must be. When I think about podcasting, another thing I think about is that it can just be a seed that gets you interested in a topic, or makes you realize that something's really exciting to learn about, and then spurs you to go deeper. You're someone who goes extremely deep, because when you're preparing for your podcast, you kind of go all in before you do your interview. So, on a final note, what advice would you have for listeners on how to do this kind of deep research on new topics?
DWARKESH: That's a really good question. I think one big thing is just having patience with yourself as you're going through it. Say you hear about AI and you want to learn more, so you decide to go through the Karpathy lectures on building up GPT-2 from scratch. Don't just rush through them. Have patience with yourself: "I really didn't understand this one line of code, and it's going to take me a couple of days to get through this three-hour lecture." Same thing with books, especially a book that's worth reading and deserves your attention, or a paper worth reading. You would be surprised — or maybe you wouldn't be — but some sources take me days, even when you'd think it's an insignificant number of pages and wonder why it took so long. I have had tutors explain very simple things to me that I still didn't get. Spaced repetition is huge, and I think people don't understand its value until they've had this experience — I'm sure you've had it — where you miss a card that you thought you remembered when you were writing it down. "This is so trivial. Should I even bother writing this down? I guess I will." And then you miss that card; that happens to me a lot. It's really hard to learn, so take your time, use tools like spaced repetition, and have patience with yourself.
SPENCER: That's really good advice. Anyone looking for spaced repetition software: we actually make free spaced repetition software called Thought Saver, so you can check that out. As a mathematician, one thing I got out of my training is that sometimes you bang your head against something for just a ridiculous amount of time. I remember problem sets where I spent four hours on one problem and then thought, "Oh, shit, I have six more problems to do just to hand in on Tuesday." I totally agree with you about having that patience. Some of these topics are just really difficult to learn. You can do it; it just takes time. And you need to not feel like you're an idiot just because it takes time — it takes time for everyone.
DWARKESH: That's actually such a great point. The most I've ever learned in a day has come from those experiences where, ironically, it feels like you spent the entire day on a single problem. On the occasions where I feel like I skimmed through an entire book or vaguely absorbed a lot of different kinds of input, that mostly turns out to be wasted time. It's only when I'm doing the equivalent of what you just described — this one problem is going to take me four hours — that I really learn. It's a very frustrating experience: in the moment, you feel stupid, you feel like, "Why is this taking so long? Why am I not grokking it?" But that is, in fact, the most valuable time.
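As an aside on the spaced-repetition tools mentioned above, here is a minimal sketch in Python of the expanding-interval idea behind them. The growth factor and reset rule are assumptions chosen for illustration only — not the scheduler used by Thought Saver or any particular app.

```python
# Minimal sketch of the expanding-interval idea behind spaced-repetition
# tools. The growth factor and reset rule are invented for illustration;
# real apps use more elaborate schedulers.

from datetime import date, timedelta

def next_interval(interval_days: int, remembered: bool) -> int:
    """Grow the review interval when a card is recalled; reset it when missed."""
    if remembered:
        return max(1, round(interval_days * 2.5))  # hypothetical growth factor
    return 1  # a missed card comes back the next day

interval = 1
review_results = [True, True, False, True, True]  # made-up review history
for i, remembered in enumerate(review_results, start=1):
    interval = next_interval(interval, remembered)
    due = date.today() + timedelta(days=interval)
    print(f"review {i}: {'hit' if remembered else 'miss'}, "
          f"next review in {interval} day(s) ({due})")
```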
SPENCER: Dwarkesh, thanks for coming on.
DWARKESH: This was a lot of fun, Spencer. Great to talk with you.
[outro]
JOSH: A listener asks, "How should one decide on their highest level life goals if they aren't wedded to any particular philosophy, religion, or ideology?"
SPENCER: My preferred way to think about this is to first try to understand your intrinsic values, which are the things you value for their own sake, not as a means to other ends. For example, some people value their own happiness. In fact, most people do. But many people have other values as well. You might intrinsically value deep social connection. You might intrinsically value nature. You might intrinsically value truth or justice, and so on. Once you figure these out, then you can start thinking about crafting life goals that help you get a lot of what you intrinsically value. So if a life goal doesn't help you get lots of what you intrinsically value, then it's sort of not getting you the things you most fundamentally want. And so you could make a list of potential life goals and kind of compare them against your list of values. And that could be one of the considerations.