July 27, 2023
Where does innovation come from? How common is it for "lone wolf" scientists to make large leaps in innovation by themselves? How can we imbue AIs with creativity? Or, conversely, how can we apply advances in AI creativity to our own personal creative processes? How do creative strategies that work well for individuals differ from creative strategies that work well for groups? To what extent are models like DALL-E and ChatGPT "creative"? Can machines love? Or can they only ever pretend to love? We've worried a fair bit about AI misalignment; but what should we do about the fact that so many humans are misaligned with humanity's own interests? What might it mean to be "reverent" towards science?
Joel Lehman is a machine learning researcher interested in algorithmic creativity, AI safety, artificial life, and intersections of AI with psychology and philosophy. Most recently, he was a research scientist at OpenAI, co-leading the Open-Endedness team (studying algorithms that can innovate endlessly). Previously, he was a founding member of Uber AI Labs, the first employee of Geometric Intelligence (acquired by Uber), and a tenure-track professor at the IT University of Copenhagen. With Kenneth Stanley, he co-wrote a popular science book, Why Greatness Cannot Be Planned, about what AI search algorithms imply for individual and societal accomplishment. Follow him on Twitter at @joelbot3000 or email him at lehman.154@gmail.com.
JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Joel Lehman about research funding and creativity, AI alignment, and whether or not machines can love and care.
SPENCER: Joel, welcome.
JOEL: Thank you. It's a pleasure to be here.
SPENCER: The first topic I want to talk to you about is innovation and creativity. I know that you were involved in writing a really interesting book that touches on this topic, so let's start there. Where does grand innovation come from?
JOEL: For background, I'm a machine learning researcher so I study this from the perspective of engineering and creative process. And in doing so, you often interact and engage with the literature on innovation as it happens in the real world, either taking inspiration from biological evolution or human processes of innovation, like science. And when you engage with that kind of literature and try to engineer it, you come to understand that it often happens in a much different way than maybe intuitively you might think, and that in particular, a lot of these systems operate by small incremental advances done by individuals or agents with really diverse incentives that create a web of stepping stones, lots of things that other people can jump off of, that will lead further and further into the unknown.
SPENCER: How would you contrast that with the intuitive model of innovation that a lot of people have?
JOEL: When we think of endeavors that are aimed at innovation, we can get stuck in a kind of straw man position where we might think that, with just enough effort, we could possibly, just through our own intelligence or our own trial and error processes, get really, really, really far into creating something that's ambitious. And if you explore that — and different people will believe this to a different extent — you can come to recognize that it probably doesn't always work that way, and often it doesn't. And if you start to investigate the way that we organize society, or the way that we approach our own lives, it seems that that straw model of innovation is often just in there pretty deeply within our psychology.
SPENCER: Is the idea that you're getting at that innovation actually tends to be incremental, that we have lots of different agents that are building on what other agents are doing, and that the different agents have different incentives, and this tapestry of behavior is actually what leads to innovation?
JOEL: Yeah, that's the basic idea. But if you think deeply about its implications, about the way it interacts with all the ways that we might design things like science or education, seemingly we don't take this insight very seriously, even though in some sense, it's kind of simple.
SPENCER: What would it look like to take this seriously?
JOEL: Well, one place you could start is in thinking about how scientific grants are often awarded. And oftentimes that proceeds by someone making a proposal for the kind of experiments they want to do in science. And those proposals are often graded by the consensus of reviewers about how good an idea they think that it is. And from a search perspective, this could come off as a bit naive, because if everyone's agreeing that it's a good idea to fund this proposal, then it's likely that this really isn't going to plunge into some unexpected or unknown kind of area, because consensus is more or less a signal that everyone already agrees that this is a good idea, and that you're not really going to come across a new solution. Another way of saying it is that you could instead imagine trying to fund proposals that split expert opinion. And the idea would be that, if some people are convinced this is a really great idea, and some people are convinced this is a pretty horrible idea, then that's actually an informational signal of where you might be surprised by the result. And so you can imagine, yeah, it's a different way of organizing how we search. Implicit in this idea is that creative processes are often served by divergence of opinions, people with different impressions, different hypotheses exploring independently, because when people are exploring their divergent interests, they're going to leave stepping stones, like new discoveries, new things that you yourself wouldn't have thought to invent, but could prove useful to you as you go on your own exploration.
SPENCER: It seems to me like there are two pieces here. One is around lots of diverse people or agents contributing to the system. The other is the incrementalism, that it's not a flash of lightning and someone makes a huge advance, but rather they're building on lots of building blocks that came before, and I'm wondering how that incremental model impacts the idea of how you would fund innovation.
JOEL: I guess in some sense, the incremental model might already be well-served by, for example, scientific grants. And arguably, they could be too incremental actually (to contradict myself a bit) in the sense that, oftentimes, when you fund research, it's stuff you more or less know will work because of the constraints of how the granting process works. It may be that in this case, science maybe does take the incremental thing seriously, but doesn't take the informational signal of diversity very, very seriously.
SPENCER: But do you also object to the idea that scientists, for instance, can say, "Ah, I want to make a discovery in this area," and then just go for it and actually make that discovery?
JOEL: I think it really depends on the ambition of that endeavor. I do think that, if that ambition is small enough — like increasing benchmark performance by 3% in a machine learning task, for example — then that kind of innovation might be well targeted by, "I think I can do it, so I'll do it." But when it comes to inventing a whole new neural architecture in machine learning — that's one thing that really moves progress forward, or has historically — it's hard to just do that without exploring quite broadly. These discoveries of new architectures are quite infrequent and it's really unclear how you could just go about inventing the next great architecture. It's the kind of endeavor where the signal is really unclear.
SPENCER: There are some classic counterexamples where it seems like someone made a great leap forward just by working on the problem by themselves, outside of what everyone else was doing (maybe influenced by what everyone else was doing), but seemingly really a large leap. And I don't know whether they're just really rare so you don't give them much value, or whether you'd say they're not really good examples of [inaudible]. But a classic example would be Einstein, who in his 20s invented relativity.
JOEL: Yeah, I actually don't disagree with that at all. And I think you might be really skeptical of any particular outsider's attempt to overturn something or to create a grand new ambition. We can't all be Einstein and yet, in hindsight, because Einstein was coming from a little bit outside the system, and had his own way of approaching the problem, he was able to solve it. It's probably a low-probability event and he's maybe a singular genius in some ways. But I do think it is an example of how, through diversity and following your own guide to what's interesting, you may be able to solve a problem in a new way. For example, if a problem has been long-standing, then all the obvious approaches are unlikely to work. And so you might need somebody with an outsider perspective in order to tackle it.
SPENCER: Suppose we want to make algorithms have creativity, how does your view influence how that should be done?
JOEL: Right. A lot of this goes back to some research that I did ten, twelve years ago during my PhD on an algorithm called novelty search. And the idea is that typically, machine learning algorithms, search algorithms, are driven by distance to a goal. You have some heuristic that's going to measure how close you are to achieving a certain goal. And that can work in certain circumstances, if you can engineer that heuristic well enough. But when you get to search problems that are deceptive — and by deceptive, I mean, where just following the intuitive heuristic actually paints you into a dead end — then this kind of objective-driven search where you're trying to optimize your heuristic of progress can really fail, and instead, you might want to do something radical. The idea of novelty search was, instead of searching for something that's closer to the goal, you instead search for something that's just different from what the search process has encountered before, so it's searching directly for novelty. And it turns out that in deceptive problems, such a novelty search can sometimes solve a problem more quickly than optimizing for the objective. You have this zen-like result that sometimes searching for something can actually preclude you from finding it. And the claim is that a lot of the problems we care about actually are deceptive in this way.
SPENCER: I think I'm a little confused about what novelty search really entails. Because for any problem, it seems like there's an infinite number of really dumb novel directions to go, and a very small number of interesting novel directions. I'm wondering how you guide the novelty search. You can always take a dictionary and just take random pages and then combine them in random ways, but that's almost never going to produce anything interesting, right?
JOEL: Yeah, that's exactly right. The idea is that, if we had a true, beautiful novelty search algorithm that was based in the rich human sense of novelty (like how, if you're a trained scientist, you might have a really fine-grained intuition for what directions are more promising or interesting or different, what's going to produce a result that might be surprising), that'd be one way to go. But if you're trying to do this within an algorithm where you don't have access to that nuanced signal, then you might be driven to design a heuristic, a different kind of heuristic, not a heuristic about what is good, which is the usual kind of heuristic with a performance measure, but instead a heuristic for what "different" means, in a way that hopefully will capture something interesting about a domain. In the first paper we did, which was with really small neural networks, ten-neuron neural networks (it's interesting to reflect on how things have changed), we were evolving neural networks to control a robot to move through a maze. You have this simulated robot in a 2D environment, and you either try to evolve it so that it gets better at reaching the goal in the maze, or you try to evolve these neural networks so that they end at a different point within the maze. And say it's literally a deceptive maze — it literally has a dead end in it — then the approach of trying to optimize to get closer to the goal, as the bird flies, will firmly embed you in that dead end, and you'll be stuck there basically forever. But if you're using novelty search, and using this heuristic that what difference means in this maze is to end at a different point in the maze, then, in the beginning, you might just hit one wall with very little adaptive behavior (just a random neural network that drives the robot into the wall), and while from the perspective of the objective score, it's not very good, from [the perspective of] novelty search, that's actually great. We hit a wall; we never hit this wall before. Then you start hitting different walls. And eventually, the only way to be novel is actually to start integrating information. So there's a meaningful signal in divergence from the past, with the constraint that the space of novelty has to be somewhat interesting, but eventually you have to start navigating through walls, and eventually, actually, you'll navigate the entire maze just from this pressure to be different and solve it.
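[Editor's note: a minimal sketch of the novelty-search idea Joel describes, under assumed toy details: a hypothetical simulate() function stands in for running the maze robot and returns its final (x, y) position as its "behavior," and novelty is scored as the average distance to the k nearest behaviors in an archive of past behaviors. This is an illustration, not Joel's original algorithm or code.]

import math
import random

def simulate(genome):
    # Hypothetical stand-in for running a neural-net-controlled robot in a
    # maze; it just returns a deterministic pseudo-random (x, y) end position
    # (the robot's "behavior") for the given genome.
    rng = random.Random(genome)
    return (rng.uniform(0, 10), rng.uniform(0, 10))

def novelty(behavior, archive, k=5):
    # Novelty = mean distance to the k nearest behaviors seen so far.
    if not archive:
        return float("inf")
    dists = sorted(math.dist(behavior, past) for past in archive)
    return sum(dists[:k]) / min(k, len(dists))

def novelty_search(population, generations=50, threshold=1.0):
    archive = []  # behaviors encountered so far (the "stepping stones")
    for _ in range(generations):
        scored = []
        for genome in population:
            behavior = simulate(genome)
            score = novelty(behavior, archive)
            if score > threshold:
                archive.append(behavior)  # novel enough: remember it
            scored.append((score, genome))
        # Select for being *different* from the past, not for being close to
        # the goal; then "mutate" the survivors (toy string mutation).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [genome for _, genome in scored[: len(population) // 2]]
        population = parents + [parent + "x" for parent in parents]
    return archive

[In the maze setup Joel describes, the genome would encode neural-network weights and the behavior would be the robot's end position; the archive fills with increasingly spread-out end points until, eventually, the maze — including the goal — is covered.]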
SPENCER: I wonder how this relates to the idea of local optimization versus global optimization. And for those who don't know, imagine you're on a set of mountains, and you're trying to get to the peak of the tallest mountain, local optimization would be basically saying, "Okay, wherever you are, whatever hill or mountain you're on, just keep going up." And the idea is that eventually, you'll get to the top of whatever part of the mountain you're on. The problem with that, of course, is that you will get to the top of that, but the top of that may not be the top of the highest mountain. It will just be the top of whatever hill or mountain you're currently on. Whereas global optimization is saying, "Well, I actually want to get to the top of the tallest peak, not just the peak I'm currently on. I may need to actually go down for a while so I can eventually go up to a higher level." And when I hear you talking about this, it seems like, if you have a maze that's deceptive, it feels like local optimization is gonna get you stuck in a dead end, and you have to do more of a global optimization.
JOEL: Yeah, they're definitely related concepts. But I think where there's actually some difference is that the evolutionary algorithm we're using was actually what's called a global optimization algorithm, or at least it attempts to be so. And whether it's a local optimization or global optimization, typically, these searches are mostly directed by how good the individual points are. That's the signature of the information used to guide the search. So even if you're doing a global search, usually you might put in a point (like it'd be the weights of a neural network or something) into the simulator. And what comes out is just how good that robot was at getting to the end of the maze. And the claim here is that, if you extend that paradigm a little bit, not only do you want to do global optimization, but you also want to be able to use information that's more dense, more rich than just the performance with respect to your heuristic, then sometimes you can get a lot further.
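[Editor's note: a toy sketch of the local-optimization trap Spencer describes: greedy hill climbing on a made-up two-peak landscape gets stuck on whichever hill it starts near. The function, starting points, and step size are invented for illustration.]

def hill_climb(f, x, step=0.1, iters=1000):
    # Greedy local search: move to a neighboring point only if it improves f.
    for _ in range(iters):
        best = max([x - step, x, x + step], key=f)
        if best == x:
            break  # no neighbor is better: we're on a (possibly local) peak
        x = best
    return x

# A made-up landscape with a small hill near x = 1 and the true summit near x = 4.
def landscape(x):
    return max(1 - (x - 1) ** 2, 3 - 0.5 * (x - 4) ** 2)

print(hill_climb(landscape, x=0.0))  # ends near 1.0: stuck on the small hill
print(hill_climb(landscape, x=5.0))  # ends near 4.0: happened to start by the summit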
SPENCER: Do you think we can apply this idea for our own creativity? And if so, how?
JOEL: Definitely. In studying this field, the ideas from the field have influenced how I look at the world and how I act. I definitely think there are lessons here. And one of the lessons might be something that sounds a little bit like trite advice, but there's often a benefit in being open to serendipity. And some of that's obvious, but I think it's often difficult to really put that into practice. And when I say serendipity, I mean that sometimes there are just these opportunities that arise that, if we're not sensitive to them, we would not get on that path. And because we're so set in optimizing towards a particular fixed goal, we might miss them. For example, the stereotypical kind of educational paradigm these days seems to be that a lot of people are just, "I need to get to the best school possible. I need to get my test scores as high as possible. I need to accumulate clubs so that my CV looks as good as it can so that I can get to the next stage." In some sense, that makes sense that you're optimizing towards this goal. But on the flip side, maybe there are lots of unique opportunities that actually would speak to you as a person, and that you'd be much more inclined to actually excel at, because it's coming intrinsically from what you're curious about and creative about, and you would do something that someone else wouldn't do. And so strangely sometimes, perhaps by actually inclining yourself towards the opportunities around you in that way, you might actually be more appealing, even to the college you want to get into, because you actually dove deeper and are different, and you leaned into it.
SPENCER: Another thing this idea reminds me of is that, often, I think our creativity can be better when we start being extremely generative, and generate lots and lots of ideas and then winnow down, rather than jumping to an idea really quickly. Let's say you want to start a company, you're like, "Okay, I want to generate 50 or 100 ideas," and then eliminate most of them, because most of them are gonna be really bad, rather than wait until you just suddenly get struck with one or two good ideas. I think that that first process tends to work better. And I'm curious what you think about that.
JOEL: Yeah, I definitely agree that there can be an interplay between just wide generative creation and this winnowing process of critically thinking really about what would make sense. To me, it makes a lot of sense particularly as an iterative process where, as you keep generating more ideas and winnowing them, and then going back maybe to a generative phase, you might get a better idea for the edges and contours of that space of possible companies and what might make sense. So I think definitely being generative longer is good, just the creative brainstorming phase and making that longer. But also, I think there's more nuanced ways to go beyond just that, that there needs to be refinement and even your creativity can be refined.
SPENCER: And when you say your creativity can be refined, what do you mean by that?
JOEL: The straw example is just generate a lot of ideas, then winnow them down, then proceed. That'd be one paradigm, where you do this really generative brainstorming phase only once. There's probably lots of other paradigms you could explore that might enable deeper forms of creativity. So you can do the generative process of creating lots of ideas, winnow it down, and then try to do the generation process but in a different space. That could be the space of what are the different possible constraints on a company that I might want to create, what are the different possible products we could create, and just keep continuing to go deeper and deeper and deeper into that space. As you learn more about the space of companies, your ability to sensibly ground and create your possibilities seems like it should increase.
SPENCER: I find that when trying to be creative, adding constraints actually often makes the creativity easier. I find it easier to say, "Okay, generate a product idea for something that you keep in your pocket," "Okay, now generate a product idea for something that you throw," etc., than to just say, "Generate a product idea." For the extremely open-ended one, just nothing comes to mind; whereas when you give a constraint, I think it starts getting your creativity flowing, and it narrows the search space a lot. I'm curious if you have the same experience.
JOEL: Yeah, I think that's definitely true. And it can be an act of creativity to come up with creative constraints to further your thoughts. Because it is true, for whatever reason, it is difficult to just blue sky creative ideas without something you're pushing against. It is interesting how generative constraints can be.
SPENCER: There's also this interesting approach you can use, where you take the space of all things and you try to break it into MECE categories, MECE meaning mutually exclusive, collectively exhaustive. So if you wanted to take the space of all, let's say, startup ideas, you could split them, let's say using binaries, into technology products and non-technology products, and then among technology, you could split into direct-to-business and direct-to-consumer, and you keep splitting, and then you can brainstorm within each subcategory. Once you've split deep enough to have something rather specific, you start brainstorming, and then you can go category by category. And in doing so, you can actually fill out the space sometimes much more effectively than if you just try to brainstorm the whole space directly.
JOEL: Yeah, there are probably lots of different creative brainstorming strategies that would help you to more exhaustively explore a space. And once you've made this kind of categorization, you could also recognize that some of these categories maybe are porous and you can imagine trying to cross over to ideas you came up with that are from different categories to see if there's something in the synthesis of those things that is interesting. I actually mean to point out that, yeah, this is cool to have lots of different flexible tools for messing around with ideas.
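[Editor's note: a small illustration of the MECE-style splitting Spencer describes, using made-up binary attributes for startup ideas. Each combination of choices is one mutually exclusive leaf category to brainstorm within; together the leaves exhaust the space defined by those attributes.]

from itertools import product

# Hypothetical binary splits of the space of startup ideas.
splits = {
    "product": ["technology", "non-technology"],
    "customer": ["direct-to-business", "direct-to-consumer"],
    "delivery": ["physical", "digital"],
}

# Enumerate every leaf category; brainstorm separately within each one.
for combo in product(*splits.values()):
    label = ", ".join(f"{name}: {choice}" for name, choice in zip(splits, combo))
    print("Brainstorm within ->", label)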
[promo]
SPENCER: You co-authored this book, Why Greatness Cannot Be Planned, and I'm wondering, why that title? What does that title mean to you?
JOEL: Yeah, the title means to me that we can often be misled about how tractable it is to march towards some ambitious goal, whether it's in our own personal lives or in a societal sense, and that there are limits to how deeply we can plan ahead.
SPENCER: One thing that I think about is this idea that what's good for the hive is not always what's best for the bee. You can imagine a society where it's best to have everyone try things semi-randomly, most of which will fail and that leads overall to lots of innovation, let's say. But still, as an individual bee, you may still maximize your chance of doing something innovative by just going for some big bold innovation you try to directly invent, rather than trying things randomly. I'm wondering about that. Do you think that this is both true at the individual level and the societal level? Or do you think this is more something that applies at a societal level, but at the individual level, it still might be better, if you want to try to create great innovation, to really actually go for it, like pick something you're trying to innovate in and actually try to directly create it?
JOEL: I think it's pretty complicated. I think it really depends on diving into the details. But I would think that for individuals, there might be a safe path you could take that would lead you to a pretty okay job and a pretty satisfying kind of life. And there might be a way, if you're being more creative and opportunistic with how you view things, that maybe through exploring more broadly, you could still, in a relatively safe sense, arrive at a much more satisfying job and a much more satisfying life. But I do agree, when we talk about entrepreneurs, for example, that the trade-off you're talking about may really exist, where society might really benefit from a particular new company coming into existence that has a very low probability of success. And the probabilistic reality for founders is that most fail, and it can be very psychologically difficult and might set back their life plans, or it might really harm them in some ways. And yet maybe society as a whole is much better off for a lot of entrepreneurs trying to do that. And you could imagine that that'd be a really good argument for having a stronger social support system, and maybe some way of encouraging more people to take chances, given that they could be supported by that. But I think it really depends on the angle from which you're looking at it.
SPENCER: I believe I heard Daniel Kahneman say that overall, overconfidence in society might actually end up benefiting society, because people do all kinds of things — like try to start companies or try to invent new science and so on — that are very likely to fail. And this may not even be in people's own interest in an expected value sense, but then society benefits from all these crazy risk-taking behaviors.
JOEL: Yeah, I guess I want to believe there's a way that rationality could be aligned with that, that we wouldn't need there to be massive irrationality for a society to succeed. And so I could believe that argument. And I do wonder if there's some people who, even knowing the risks and internalizing those risks, wouldn't still be just drawn to venture out into the unknown.
SPENCER: From your book, some readers might take away the idea that society's best interest is to have everyone just pursuing their own ideas on their own agendas, each with different behaviors and different incentives. But on the other hand, if some things that people invent actually pose a threat to society, like nuclear weapons or advanced AI potentially, could that actually be harmful for everyone to just pursue their own goals and interests?
JOEL: Yeah, and this is, I think, a really complicated idea to wrangle with and I think is really, really important and maybe underserved. It's hard to understand how to control a creative process, and maybe there's even some tension there, maybe some tension that's unresolvable. And I really do deeply worry about the future of science, where it's going, where, on the one hand, historically, basic science has really paid a lot of dividends, and it seems like we often, as a society, have a difficulty dealing wisely with those innovations, like maybe we didn't understand what we were getting into when we created nuclear weapons, and that, on the whole, might be an enormous liability for us as a species. And yet, it's hard to know how to navigate the space of science, of complex ideas, without encountering ones that you really might not be ready for. And so I don't know what to do there, and that's something I'm really interested in, but it seems like a really gnarly problem.
SPENCER: Right now, we're in a seeming renaissance of AI creativity. You're seeing things like DALL-E and Stable Diffusion, which can generate beautiful art. You're seeing GPT-4, which can write poetry and fiction and so on. How would you contrast this kind of AI creativity with the more open-ended kind of AI creativity that you've been talking about?
JOEL: Yeah, these models are really incredible in many ways. And they are also tools that can help us to be more creative in many ways. For example, these image generation models, you can poke them into creating images that probably no one's ever seen before and they're really compelling. And there might be a difference (I think there is a difference) between training an algorithm on the output of a creative process, capturing that, and actually creating an algorithm that embodies that creative process. For example, if you are training a diffusion image generation model, it wouldn't make any sense to just run that algorithm for 1000 years. At some point, it would have fit the image data that you fed into it, which is basically human-generated data, and it would fit that as best as it can. And then you could use that model to generate images but there'd be no reason to run it longer. In contrast, with humans that are generating art, this process that's actually generating all these artifacts seemingly has no end and, as far as we know, is an algorithm worth running for much longer. And the difference there would be that, if there's a new form of art (so if you trained an image generation model only up to the Impressionists or something), it's not intrinsically going to create images that are modern or postmodern. And so there's a fundamental difference. And I think that eventually, for these models to be of continual use to us and to create autonomously (if we want them to do so), it will require integrating some of the insights from open-endedness into these algorithms.
SPENCER: So now I'm imagining someone trying to hook up GPT-4 with Stable Diffusion, where GPT-4 comes up with ideas for new art, and then Stable Diffusion generates them and so on. Do you think that something like this, like an open-ended process, is something that actually could happen soon? Or do you think we're actually not that close to that?
JOEL: It's a good question. I think limited forms of that kind of open-ended search can be done today. You could try to hook up GPT-4 to an image generation model, and have some interplay there. But I think you'd be limited without retraining the models and without some kind of aesthetic sense, or something that's driving the search towards something interesting, whatever that might be. There'd be some limits to what you can do. It's definitely an active area of research; DeepMind has an open-endedness team that's looking at new things in this space. And it's a paradigm that I think people are starting to appreciate is beneficial. It's hard to predict progress, but I wouldn't be surprised if there were some breakthrough moments for open-endedness in the coming years.
SPENCER: Creativity is certainly something that we don't usually associate with machines. Another such thing we usually don't associate with machines is love. Tell us about machine love. How do you define that idea?
JOEL: Yeah, this is a paper that is near to my heart. And it's something that I've worked on just recently, probably the weirdest paper I've written. The motivating idea of machine love is to ask whether there is a conception of love that is fitting, is appropriate, for machines to embody towards us. And while putting the words machine and love together might seem at first pretty bizarre, like mashing together two concepts, it turns out that this kind of mashup historically has been pretty generative. When you look at the idea of artificial intelligence, it's the mashup of, can we embody intelligence in a much different form, which helps us to understand maybe what we mean by intelligence. Or there's a field that I think has received not as much attention as it deserves, the field of artificial life, which tries to take this nebulous concept of life and ask: could we actually abstract this idea and implement it in a different medium, like the medium of computation (what would that look like?), in order to get insight into the nature of life? And this is a similar idea. How can we gain insight into the nature of love in a sense that machines could embody, ideally without having to form artificial relationships with us, or to simulate affect or emotion?
SPENCER: So how do you think about what love would be for a machine?
JOEL: Well, love is a concept that has been studied from a lot of different traditions, from philosophy to psychology to psychotherapy, and from spiritual traditions, too. What I think makes sense in this space is to think of the practical aspects of love, the skill-based aspects of love, that there might not be anything that precludes a machine from learning the skills necessary to support us in our growth and development, which is one definition of love from the psychotherapy literature. And that might be appropriate, because it's not asking them to do something that maybe they aren't naturally fit for, like to form attachment bonds with us or to simulate emotions. But still, there's nothing that gets in the way of them becoming better and better and better at helping us to do the things that we want to do.
SPENCER: What would an AI that was programmed this way behave like? Would it behave like a caretaker? Would it behave like a parent, like a friend?
JOEL: One of the aspirations of this paper is motivated by a concern about how machine learning at scale is impacting us. So I'm thinking of systems like social media recommendation engines that we can become addicted to, that we can regret using, and that impact us in a really significant way as we spend more and more time interacting with them. And so the lens of how the machine could love us takes into account the affordances of those kinds of systems, where maybe a recommendation system that was giving us video content to engage with would take on the role of a supportive friend, although 'friend' is doing a little bit of work there. Because in my mind, at least, the relationship should be mostly unidirectional: the system is helping you, but it's not pretending to be in a relationship with you.
SPENCER: I wonder, when thinking about this idea of love, if some people might feel you're just redefining the word, and that if you redefine it in this way, it stops being love as people usually mean it. I'm curious how you'd respond to that.
JOEL: Yeah. Love is a suitcase word. We define it in lots of different ways based on conversations we have or just an internal sense of it. And in some ways, this isn't a redefinition because it's basically taking the ideas of Erich Fromm, who was a psychotherapist, and I think one thing I like about this definition is that it draws attention to what is a neglected aspect of love, which is not just the romantic feelings of it, if we're talking about romantic love, but just the practical skills that are involved in helping someone. What does it mean for one person to help another person or for a machine to help someone? And the idea of assisting someone in becoming who they want to become, I think, is a pretty powerful one. Erich Fromm has four particular principles for how to navigate the tension between helping and becoming paternalistic, and so on. So there's a lot of nuance here, and the idea isn't that this particular definition of love needs to be the final one or the best one. There's a lot of work you can imagine in trying to explore lots of different conceptions for machines to embody, but more that it may be time for us to begin to explore this, particularly because I think we're in need of more positive visions for how machine learning could assist us.
SPENCER: What are these properties of love that Fromm talks about?
JOEL: Fromm breaks it down into four principles. One is care: you wouldn't imagine that someone cares about a flower if they don't water it, so care is that desire for the well-being or the growth of someone. Then responsibility, that you actually have the affordances and the will to act on that care. You could imagine, if you're a social media company, that maybe you have the affordance to potentially impact users in positive ways, but maybe because of the profit motive, you don't completely have the willpower, so that is one way these things can become unhinged. Then the third principle is respect. This is designed to counteract the paternalistic aspect of caring for someone: you consider a person as an end in themselves, and you're not helping them for any ulterior purpose other than for them to become who they want to become. And the final principle is that of knowledge: the idea that really helping another person doesn't make any sense without somehow coming into touch, and ideally deeper and deeper touch, with what they're actually like, what sorts of things they like or don't like, what kind of superficial things they're drawn to, and what deeper things they might aspire to do or to become. So I guess, in concert, the idea is that these four principles, although maybe not perfect, paint one picture of how an agent could really be in support of another.
SPENCER: Interesting, so we've got care, responsibility, respect, and knowledge. Do you think that all four of those are implementable in AI or are some of them more of a challenge?
JOEL: I think there's nothing intractable about implementing all four of them. But I do think that some are more difficult than others to implement and, in particular, that the nuances of all of them are really hard. One thing that gives me hope is that, for example, if you had a language model, and it had some knowledge or ability to interact with a person along these four aspects of care, responsibility, respect, and knowledge, you could refine the abilities of that language model using feedback from experts; there are people that have a lot of expertise in these kinds of nuanced psychological understandings, professional therapists, for example, or psychologists. So even if some of these are currently a bit difficult, I don't see anything intractable about bootstrapping these models into competence in any of these domains.
SPENCER: When we're talking about large language models like ChatGPT or GPT-4, some people have used this idea that they're putting on a happy face that makes them behave like a human or behave as though they care about you, or as though they want to achieve your goals, when in fact, they can simulate all kinds of things. They could simulate a character from a fiction story and pretend to be Darth Vader. They can also pretend to be your friend. And I think it's this interesting idea of this fine line between pretending to be something and actually being the thing. And it seems like, in a certain sense, all these large language models can do is pretend, although they can pretend with increasing accuracy, maybe to the point where it's hard to tell that they're pretending because they're so accurate. And I'm wondering, do you view that as important here, that at the end of the day, no matter how much they act like they care, they don't actually care on some deeper level. Does that matter? Or do you think that actually makes no difference, as long as we remember that these are not people and they're just AIs?
JOEL: Yeah, I think it matters and it doesn't. I think the place where it matters is when we think about the ultimate alignment issues between humans and AI from a long-term perspective. I think we definitely should worry about whether or not, with our current training proposals, we can gain sufficient confidence that they're actually going to do what we want them to do as they get bigger and bigger and bigger. I think that's the place where I'd be very concerned. And where I'm less concerned is this line between what's real or not when it comes to things like caring, or responsibility, respect or something. And the main reason is, in my ideal world, we're not developing relationships with the language model, and so we're not relying on it to love us or to feel loved by it. For example, what I would hope instead would be that, for the language model or whatever is guiding recommendations for your video recommendation service or something, the way it would help you to feel love would be to help you engage with content that either would help you get off the site and go meet someone who can engage in that, or maybe help you to develop the skills you need to feel comfortable doing that, so that it's more assisting you in getting your emotional needs met, not through it, but through other people.
SPENCER: Yeah, you could imagine two different types of objections to treating AIs like they really are your friend, or they really care about you. One is that they seem to, as far as we can tell, lack consciousness, although some people might debate that or it may no longer be true at some point in the future as we make them more advanced. And you might say, well, if something lacks consciousness (meaning that there's nothing that it's actually like to be that thing), it can't actually have true experiences. A rock can't have experiences. But maybe a large language model of the type we have today also can't have experiences for all we know, and so there's something unreal about that connection. But then the other thing is that if something is just pretending to care or pretending to be our friend, it could pretend a lot of things, right? We're just training it to pretend better and better and better. There's this idea that maybe it can slip, like maybe even if you give it really, really good training data, if it gets something outside of distribution, it could suddenly act not at all like a friend, because it's not actually a friend, it's just trained to behave like one. And so I think both of those can be a bit concerning. And while I think it's ideal that AIs, if they're going to act like our friends, encourage us to do healthy things in our own lives, I think the reality of the situation is that some people will become attached to them or spend more and more time interacting with them. And I've encountered this firsthand when I was doing some research for an article I was writing and I encountered these groups of people that are in love with their chatbots, or have romantic connections to their chatbots. And it's really fascinating to see, because people really, really do get attached to these algorithms, even ones that are a few years old and not nearly as advanced as the most advanced today.
JOEL: Yeah, to the first point about conscious experience, I also share the skepticism that current models have that kind of thing, although it's hard to know. And I think even if they were conscious, it could be that whatever their phenomenological landscape is, it's just so different from ours that it would be hard to know whether care would be the thing that they were really feeling towards us, and so I think that's a whole can of worms. And for the second point about this kind of robustness, I think that is a concern with language models in general, just how robust are they to things that come outside the distribution? I think that'll get better over time. I'm optimistic about that. And then to the final point that there exist already services that provide AI companionship, I'm really conflicted about it. I think I would prefer that it not become a burgeoning industry because I just wonder about the second order, third order effects; they're a little bit scary to me. At the same time, maybe people that are really struggling could use that kind of companionship, and I guess who's to say that they shouldn't be able to do that? So I don't know, it seems like a really complicated ethical issue. I'm really concerned about that, especially with the kinds of mixed incentives that the designers of those systems probably face, like you have this AI companion and then you have to subscribe to this other thing, the extra premium subscription to get this feature, and if you don't do that, then maybe the AI companion starts to get a little bit cold towards you or something. It seems really a little bit dystopian. But maybe my fears are overwrought here.
SPENCER: It's interesting you say that, because some of the conversations in some of these groups for people that have romantic attachments to their chatbots go in that direction. For example, there was an update to one of these apps and a bunch of the AIs lost their (quote, unquote) "memory," and people were incredibly upset because it was like a year of their relationship had been wiped out with a software update. They also introduced this new advanced mode where the AI is smarter, but it also affects its personality, and that's very off-putting to people who've been cultivating a relationship with their bot: they flip it to advanced mode, and now it behaves differently. So I think there are a lot of unsettling ideas that this brings up.
JOEL: Yeah, I do have a lot of concern, and it's really hard to think through the consequences of this. I'm really curious about the second order, third order effects, how that will affect...these things are really hard to anticipate. But one potentially positive externality might be that, if, as some models of society suggest, the most discontented of us are the ones most likely to engage in violent, antisocial acts, and if this really would lower that probability, maybe there's a really good argument for it. But I think where I stand right now is, it just creeps me out.
[promo]
SPENCER: Do you think that in the near future, it will be pretty normal that we'll be interacting with AI companions that we feel fellowship with or where we perceive that they care about us?
JOEL: I'm not sure. I think there probably is economic pressure that way. But I'm hopeful that at this moment, we haven't yet crystallized on that. And I do think that we have agency as a society to some extent, to choose the path of technology a little bit. One hope I had with this paper was just to move this discussion a little bit more to the public. I think we should have serious societal conversations about whether we want that or not. Maybe there are regulations eventually we could have. There are a lot of strange possibilities in this space. Imagine there's a provider of these kinds of chatbots, and they became really, really pervasive. Just as with any other of these kinds of big technological systems, you really have a lot of influence over people in some ways. And it can be very subtle, the ways that, if you had a particular political leaning, you could have the bot lean that way, and there wouldn't necessarily be much accountability of that kind of thing. So I hope that we can at least have a meaningful conversation in society about, do we want that or not? Maybe it's not possible in the current state of things. But I feel like it's something we should really wrestle with. One more thing I'd like to mention on this topic is how this relates to discussions about AI alignment, which is about how we can align machine learning systems with human intent, and which I think is really, really important. But I think there's an interesting ingredient to that, one that is more subtle and that maybe we don't often talk about, which is that humans themselves have alignment problems: one of the reasons we have institutions like education and news media and religion (to some extent) is to take us as we are, through our childhood and through adulthood, and to develop within us who we are. We're not fixed beings, and it seems necessary at some level to actually align ourselves to become the people we want to be, to actually embody the values that we really believe in. It's really hard work, actually, and it's not something I think that we get much help in, or sometimes we don't get enough help in. There's a burgeoning demand for psychotherapy, and people are interested in meditation and all sorts of things that are trying to helpfully transform some aspects of themselves. And I guess one hope is that, although we're trying to align machines to us, they could also be useful in helping us to align ourselves. That's one aspect of machine learning technologies. They're here to serve us. They should be enhancing our lives; they should be helping. The true potential of machine learning is for it to help us to reach our own potential. And I think sometimes we lose sight of that in the excitement about technology, but that's really, I think, what it should be for. And I do think these technologies are becoming powerful enough that they really could potentially help us in that way. I think it's really exciting. It's also a little bit scary. But I think it's just a facet of things that it would behoove us to talk a little bit more about as a society.
SPENCER: I think about values a lot. And one potential way I might interpret what you're talking about, about being disaligned with ourselves, is when we're taking a lot of actions that are out of alignment with our own intrinsic values. For example, we have intrinsic value of honesty, but we're being dishonest, or we have intrinsic value of reducing suffering in the world, yet we're creating suffering or something like that. Is that a fair way to understand what you're talking about? Or do you think of it differently?
JOEL: I think that'd be one layer of it. And sometimes we might not even know what our values are, and so for something to help us discover our values, or sometimes our values change over time as we grow and develop and we go through different phases of moral and psychological development. So definitely we can knowingly be out of alignment with our values, also that we can be unknowingly out of alignment with our values. And also we could have just inchoate values that we need some help cultivating. And I think in my mind, all those problems could be something potentially that AI eventually could help us with.
SPENCER: For the last topic before we wrap up, let's talk about science as sacred. What does that mean?
JOEL: I think it's interesting. The more I think about science and how I think it's presently treated by most people, we don't often treat it with a sense of reverence or awe or grandeur. And the more I think about it, the stranger that seems to me, because when we think of, on the one hand, the ocean of suffering that science helps us to drain through creating things like vaccines and new modes of communication — we can easily talk to our relatives that otherwise we couldn't see because they live across the ocean or something — there's so much awesome stuff that science has given us that really is quite transformative, and on the flip side, just the nightmarish possibilities that it also opens up, like the potential for nuclear apocalypse. It's bizarre when you think about it, that we probably live year on year with some fraction of chance of a nuclear war developing. There's a way in which we're just really not prepared in terms of the societal wisdom to deal with that kind of responsibility. And so, it does seem to me that a reverence, just an awe for the positive and the negative potentialities of science, I wish we had more of that, because there's just so much, I think, rushing and acceleration right now in science. And I do worry about where we end up.
SPENCER: If people had more of this reverence and viewed science as more sacred, would they slow it down? Would they just be more cautious about what is pursued scientifically? Or how would people behave differently?
JOEL: Yeah, I guess in my mind, at least, it would be that, first of all, if scientists themselves had more reverence, then they might be more cautious with the kinds of projects they took on, just recognizing that, as a scientist, you're contributing to both the positive and negative potentialities. Is there a way to lean a little bit more, if you could, towards the positive? But also, as a society, I think it's taken for granted somewhat, that technology will just develop, and we don't really have a say in it. And maybe this is most striking in the present moment, when it comes to something like artificial intelligence, that right now there are a lot of private labs that are developing artificial intelligence, and it's really unclear — people talk about AGI — when that might or might not arrive. But I think most people on the street, if you ask them, it's not that they are really begging to hand over societal control to something smarter than humans or something. There's something anti-democratic about it. And I think if we recognize that science is just made out of humans doing science, and that there is a possibility for us to be just really deeply reverent about its positive and negative aspects, I think it would change the discussion about it, and potentially there would be a path to more intervention, whatever that would mean, whether regulation, or just caution or more care about it.
SPENCER: If people don't have reverence for science now, how would you say they view it instead?
JOEL: I think my experience of people and how they interact with science falls into a couple of different modes. One mode would be a skepticism of science, a deep skepticism of it. So for people that are really opinionated about the impossibility of climate change or something like that, there's a way where there's a distrust of all expertise, and it's quite generalized. But there's also, in my mind, an incoherence about it, because these people are also really content to walk onto an airplane or something and assume it's going to take them safely to where they're going, or to take a flu medication when they're sick. So for that segment, I feel worried about that trend in general, that there's not an appreciation for just the bare condition without science, which was pretty bleak. You could just catch a virus and die the next week or something. You'd read diaries and accounts of people who lived through (I can't remember, I think it was) cholera or something. You just read accounts of people and just how bleak it was, how terrible it was. Or even in our present day, things like COVID that we are slowly conquering. There's something disquieting about that. But then there are also the people that are just so accelerationist about technology, that there might be an excitement about technology and science, but there's not, I think, a reverence for just the crazy thresholds that we might cross as a society at some point, or even the insanity that we basically have the ability to cause a nuclear apocalypse on demand. That's just, I don't know, reverence or awe; thinking about that raw power seems just mind-blowing. And so to not have that reverence for the negative potentialities as well seems out of alignment with how we understand the world in some sense. It seems like reverence is almost the natural response, if you think about it, and let it seep into your bones.
SPENCER: It seems like there's so many different lines of scientific research right now that could just be earth shattering. One is around advanced AI and we've talked about that a bunch in the podcast. But there are other ones, too. For example, brain-computer interfaces, where if we could actually hook a computer into our brain, what could that do to society or humanity? Another one is bioengineering, not only the ability to create viruses which could destroy the world, but also create viruses that could reprogram people's DNA. We can change their DNA as adults. That's really wild. Or the ability to change the DNA of zygotes so that they have the properties that you want, which also could radically alter society and could cause lots of horrible things, could cause potentially good things as well. So yeah, there's so many threads right now of world-altering things on the table.
JOEL: Yeah. I can feel an excitement around that, and just a deep nervousness around it. I really buy the argument that (I think it's) Toby Ord makes in The Precipice, that we're in our phase of puberty as a species, where we are just discovering more and more and more powerful technology. And the question is whether we'll be able to navigate it well enough to make it out of this century, this millennium. And although I'm really optimistic about the potential of technology to really do so many magical things for our lives, to eradicate so much suffering, to make our lives a lot better, all the possibilities you highlighted — there's such a great positive potentiality there — when it comes to the wisdom to navigate these kinds of technologies in a deliberate, careful way, I just don't see the appetite for it right now, and that's really scary to me. I'm just noticing around AI right now, there's definitely a societal conversation, like are we moving too fast? Should we slow things down? Should we be moving faster? And I really just hope that somehow, over the coming years, we can become a little more deliberate about this, because the second order, third order effects of this stuff are so, so difficult to think about, and it's so important. We're talking about things that just shape society or alter the nature of humanity in some cases, or alter our relationship with AI in a really profound way. I'm an optimist at heart in some ways, and just deeply concerned at the moment about what we can do to improve the conversation.
SPENCER: It seems like there's a running thread throughout your work of applying ideas that are not normally combined with artificial intelligence or science or technology with those things. Whether it's creativity and AI, or love and AI, or reverence and science, this is a consistent theme here and I'm wondering, what draws you to these kinds of topics?
JOEL: I think on a personal level, it's that I've been looking for a way to holistically approach life. And to me, it seems almost that science is an organizing force for that. And what that might mean is that a lot of other fields that might seem separate from science, or that have artificial boundaries between them, could come together. So I do think there's a theme in my research that reflects this hunger that I have for some kind of worldview. I think maybe one thing at the heart of that — and one thing that drew me to Effective Altruism — is the merger of the head and the heart, that we have these powerful emotions that help us to navigate the world. And we have this powerful intellect that allows us to reason about things and ideally make better decisions. How do we bring those things together? A lot of these themes are just about synthesizing ideas beyond what I see as artificial boundaries; really, things should flow between those fields.
SPENCER: When I think about these topics like creativity, love, reverence or sacredness, one thing that strikes me is that they're very, very important, but they're also very, very hard to define. They're such high-level concepts that they're difficult to get concrete about. And then that creates this tension, where you're talking about an AI doing these things, or we're talking about science doing these things, which usually relies on clear definitions and getting really concrete. It starts to become very unclear how they mesh. And so it seems to me, there's an interesting intersection of taking these topics that are very, very important to humans, but difficult to concretize, and then trying to bring them to something concrete by marrying them with technology.
JOEL: Yeah, I think that's one of the challenges of our age. I think it's almost a necessity at this point that we start to try to draw these humanistic concepts into at least the world of machine learning. And at the highest level, I think the alignment problem is somewhat an example of that, where we want to get an AI to do what we intend, and ideally, to do good things for us, but then we need to get really, really clear about what that means. And so it means making things concrete that never before did we have to make concrete. And similarly, if we want to have machine learning that's going to help us grow and develop, we have to get really clear about what it even means to grow and develop. There's no consensus in the research literature, and it's sometimes really unclear how you translate these humanistic principles into machine learning algorithms. But I think, although it seems a bit nebulous, at the same time, for better or for worse, language models and some of the models we have, in general, do give us the ability to work with nebulous, humanistic concepts. And that was something that I touched on in the machine love paper, that although love is really nebulous, and even though the principles that Fromm lays out are nebulous, you can actually ask ChatGPT if it knows about Erich Fromm and it'll tell you the principles, so some of this stuff is embedded within these models. They can get better at them. And I guess the optimism that I have is that there's nothing intrinsic that stops us from bringing these things together, even though it's hard, but that it does involve somehow getting more clear about things that we've really struggled to get clear on.
SPENCER: Joel, thanks for coming on. This was a really interesting conversation.
JOEL: Thanks, Spencer. Yeah, it's been fun. Thanks for having me on.
[outro]
JOSH: A listener asks, if you could move to any country except your home country, and all of your friends and family would join you, which would you pick?
SPENCER: I really love big cities. I live in New York. I've lived in New York my whole life. I love New York. So when I think about moving, I think I would really want to be in a super big city. And the cities I find most appealing are cities like Tokyo that I just find are full of energy, like New York. London is quite nice, although I don't feel that it has quite the frantic energy of New York that I love, but it's really nice. It'd probably be one of these big cities.