CLEARER THINKING

with Spencer Greenberg
the podcast about ideas that matter

Episode 269: What do we know about psychology that matters? (with Paul Bloom)


July 4, 2025

In light of the replication crisis, should social scientists try to replicate every major finding in the field's history? Why is human memory so faulty? And since human memory is so faulty, why do we take eyewitness testimony in legal contexts so seriously? How different are people's experiences of the world? What are the various failure modes in social science research? How much progress have the social sciences made implementing reforms and applying more rigorous standards? Why does peer review seem so susceptible to importance hacking? When is observation more important than interpretation, and vice versa? Do the top journals contain the least replicable papers? What value do Freud's ideas still provide today? How useful are neo-Freudian therapeutic methods? Should social scientists run studies on LLMs? Which of Paul's books does ChatGPT like the least?

Paul Bloom is Professor of Psychology at the University of Toronto, and Brooks and Suzanne Ragen Professor Emeritus of Psychology at Yale University. Paul Bloom studies how children and adults make sense of the world, with special focus on pleasure, morality, religion, fiction, and art. He has won numerous awards for his research and teaching. He is past-president of the Society for Philosophy and Psychology, and co-editor of Behavioral and Brain Sciences. He has written for scientific journals such as Nature and Science, and for popular outlets such as The New York Times, The Guardian, The New Yorker, and The Atlantic Monthly. He is the author of seven books, including his most recent, Psych: The Story of the Human Mind. Find more about him at paulbloom.net, or follow his Substack.

JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Paul Bloom about peer review and trends in psychology research.

SPENCER: Paul, welcome back.

PAUL: Spencer, it's good to talk to you again.

SPENCER: Sometimes, when I talk to psychologists these days, they seem to have this sense that we have to start over, we have to recheck everything in psychology because of the replication crisis, and we really can't trust anything. Do you think that's going too far?

PAUL: Yeah, it's an overreaction. I think that there's some stuff that's quite shaky, and I think we should really be cautious before taking traditional findings too seriously. But at the same time, we could go through examples of this. There are traditional bodies of research that have taught us something really interesting about the mind, and we can count on it. There's some really sturdy research out there.

SPENCER: I'd like to focus on that with you today: the things about the mind that we really know, that we figured out, and that are not obvious, things that don't just confirm our intuitions.

PAUL: We could start with memory. A lot of people have a conception of memory — not you, you're sophisticated — but it's a conception of memory as a kind of video recording, a precise recording of the world. And we just, if we work hard enough, we could recover anything we want. It's all in there. Over 100 years of research suggests that that's false, that memory is a reconstructive affair. When I tell you what I did yesterday or last year or five years ago, part of it is based on my actual experience and the traces it left in my brain. And part of it is what I think is a good answer, what I think makes sense, the sort of story I think you want to hear, the sort of story I like to tell. There's a lot of blurring between the truth of memory and the reconstructive aspect. One ramification of this is that some of the things that we totally believe are true about our past aren't; they're false memories. They're false memories, in some cases because somebody implanted them in us, in some cases just because of stories I want to tell. I was once telling somebody at a party a really funny thing that happened to me, and my wife is very kind. It was only when we were driving home that she reminded me it happened to her, not to me.

SPENCER: Yeah, there's some fascinating research in this area. Some of it is on juries: if a witness says they remember something, people take that very seriously and often assume it's much more accurate than it really is. In fact, we know that eyewitness testimony is often not that accurate.

PAUL: You could ask a different question: what psychology can we count on for real-world practical stuff? I know you have some skepticism; I know it's motivating a lot of your activities. Skepticism about past work sounds reasonable to me, but again, some of it is pretty sturdy, and eyewitness testimony is a case where I think psychologists have made the world better. People like Elizabeth Loftus point out that we are really bad at remembering faces, voices, and what has happened. At the same time, as you just said, people are super confident in it, and people take it very seriously. If I said, "Spencer, I saw you a week ago; you broke into the liquor store," somebody who hears me might think that if I said it and pointed you out very dramatically, it can't be wrong. But we can be wrong. We're easily fooled.

SPENCER: Would you say the inaccuracy in memory comes about because of our brains compressing the information, or does it have more to do with when we're retrieving memories, that they're volatile and can change in that process?

PAUL: I think both. I think there are issues of compression; I don't think a physical brain can hold all the information that we receive. I think there are issues of decay; the brain is a physical thing, so, like a banana, it kind of rots over time and you lose stuff. But also, I think there's a filling-in process. Look at standard, boring intro psych studies. You tell somebody a story about someone who goes to a restaurant and orders food, and then they leave, and when you ask them to recount the story later, they'll often say, "Oh, and in the story, the person paid the bill," because that's what people do in restaurants, even if it wasn't in the story. This is not a bad way for the mind to work. For the most part, the plausible details you fill in are true, but it leads to mistakes when what's plausible isn't what actually happened.

SPENCER: Because if you want to store information really efficiently, you would ignore the things that are usually true, because you could just guess them. If I wanted to store a memory, I don't need to remember the fact that people were wearing clothes, unless it's one of those memories, maybe. But usually, you don't need to remember that, because people are wearing clothes, right? So there's no point in remembering it. It seems as if the brain does something like that; many details can just be guessed pretty accurately, so it doesn't bother actually recording them. But then when you recall the memory and imagine it, all those details just get automatically slotted in. The problem is, it happens so subconsciously that we're not even aware the brain filled in some of the details; we weren't actually remembering them.

PAUL: That's exactly right. And it connects to another body of research we think is pretty sturdy: studies of visual perception and visual cognition. One of the cool findings, and this one replicates; you can try it yourself, there are a million YouTube videos that do this, is something called "change blindness," where I'm looking at you and I turn my head and I look back and you've changed your shirt, or you put on a hat, and typically I won't notice. It's a striking demonstration; to somebody who knows what's happening, it's remarkable what people won't notice. But I think the answer is the same as the one you gave for memory, which is, "I don't need to store in memory the shirt you're wearing or whether or not you're wearing a hat, because as I look at you, turn my head, and look back, you're still wearing it. I don't have to encode it," as opposed to a sudden facial expression you make, which I might have to encode, because it will go away. So the brain observes a certain economy in these ways, which leads to mistakes and illusions only when you fool around with it, when you play around with it.

SPENCER: One of my favorite studies on that topic is the one where they have people carry a mirror on the street. One researcher is talking to a participant they've just approached on the street, saying, "Would you participate in the study?" Then two men carrying a large mirror walk between them, blocking the view, and they swap out the researcher for a different researcher who looks vaguely similar and just keeps talking as though nothing has happened. Of course, some people do realize it, but it's amazing that some people actually do not realize there's a different human being in front of them.

PAUL: It's an amazing study. The British magician Derren Brown has a series of street acts where he imitates the study. If I pop to the side and someone else pops up, people might not notice. People will sometimes notice a change in gender or a change in race; if all of a sudden you're talking to me, and then a mirror comes in the way, and then you're talking to a black woman, you will probably notice a difference. My ex-colleague, Ryan Schroeder, ran the world's quickest version of this study. He had a booth set up; people would come up, and he'd say, "Do you want to do a study on visual perception?" They'd say, "Sure," and he'd hand them a consent form. The person would sign it, and he'd say, "Okay, I gotta put this down." He'd duck down to put it away, and then another guy would pop up and continue talking to them. Most of the time, they didn't notice.

SPENCER: That's incredible. But also, I think there are individual differences, because there are people that will remember what shirt you wore a month ago. It blows my mind, because I have to look down to know what shirt I'm wearing right now. It does seem like some people are much more perceptive than others. Would you agree with that?

PAUL: I would agree with that. I'm sure there are individual differences. I'm not sure if that kind of noticing affects whether you're vulnerable to change blindness. It might be that you're really into fashion and notice the clothes people wear, but then, if you turn your head and I've switched from a red checkered shirt to a black shirt, will you notice? I'm not sure it involves the same sort of memory. I'm not saying you're wrong. I just don't know.

SPENCER: It is interesting how different people focus on different things, and I believe there's been interesting research on people with autism who may not notice the same things other people are noticing but might notice other information. For example, they might be less attuned to the expressions on your face, but they might be more aware of the tag on their shirt that is pressing on them.

PAUL: There are studies with kids with autism showing that they have less propensity, for instance, to look at people's eyes, and will look more toward other parts of the face. If I start waving my hand frantically to make a point, you still look at my face, but a kid on the spectrum might just look at my hand. This connects to something that's not an area of research I know much about, but I think it's super cool: studies of conscious experience. There have been more and more demonstrations over the last little while that people see the world in different ways. Some people have synesthesia, where they experience colors as tastes and so on. We all remember the dress from a few years back, where some people saw it as one color and some people saw it as the other. Maybe change blindness and your autism example are further examples of how everyday consciousness differs from person to person.

SPENCER: Yeah, I think people vastly underestimate how different our experiences are, and I think part of the reason for that is that social forces get most people to act similarly to each other. But it doesn't mean that what's going on internally resembles what's going on for other people.

PAUL: A classic thing is pictures in the head. Most people have some visual imagery capacity; they can imagine an apple and so on. Some people say it wasn't until they were 35 that they realized other people weren't just speaking figuratively; other people really have images in their heads, and they themselves have none.

SPENCER: Yeah, right. It's called aphantasia, right?

PAUL: That's right. How is your visual imagery?

SPENCER: Oh, I'm pretty good with visual imagery. Actually, I can imagine every sensory perception in my mind; I can imagine heat, I can imagine my arm going numb. So, if you can imagine an apple, that's like you're simulating sight. But could you simulate sound? Most people can imagine music. Can you simulate touch? Can you simulate itchiness? You can kind of go through every sense.

PAUL: And you're a big simulator.

SPENCER: Well, yeah. I enjoy playing with it, running simulations of different things.

PAUL: Like the taste of horseradish.

SPENCER: Yeah. I got it. What's your limit?

PAUL: I think I have a very limited mental life; I think mostly in language and abstractions. But I can hear in my head Led Zeppelin doing "Stairway to Heaven." It's not that I'm making the sounds; I can just hear it, and stuff like that. But I don't have the full range that you seem to have.

SPENCER: We're actually developing a kind of omnibus test for all the different traits people might have that they never realized make them different from other people. Synesthesia is one, but there are so many other interesting ones. Some people are enraged by chewing sounds; it gives them intense anger.

PAUL: I was reading about that. Maybe you know more about this, but there was a case where somebody became deaf and was then enraged just by watching people chew.

SPENCER: Yeah, I've heard about that. I think it's called misophonia.

PAUL: Yeah. Is there a theory about this?

SPENCER: I don't know if there's any theory about it. Another interesting one is ASMR, where some people get a kind of warm, tingly sensation at certain sounds, especially quiet whispering with a lot of attention. And maybe finger tapping.

PAUL: Yeah, an intersensory thing. My house is being renovated, so I'm staying in an apartment right now, and the sofa is velvet. I find it almost unbearable to touch velvet against my skin, but my wife is totally fine with it. You get these weird sensory things. Going back to autism, a lot of kids and adults with autism have these extreme sensory responses to loud noises or certain textures.

SPENCER: Yeah. So that, to me, is one of the strengths of psychology: it really has illustrated individual differences, especially differences we may not realize we have.

PAUL: So one of your many interests is exploring the areas where psychology fails us, and trying to do good studies to sort of see what's real and what isn't. I've often heard it said that the big failures of psychology are more in social psychology than in other domains. Do you think that's true?

SPENCER: It depends on what you mean by failures. I think if you're just talking about doing exactly the same experiment and getting the same result with faithful reproduction, that might be true. But I think there are different ways that things can fail. For example, in personality research, I think there's a different kind of failure, where you do get the same result when you do the same study, but it might be kind of trivial or not that interesting.

PAUL: Yeah, because you're asking the same question two different ways, for instance.

SPENCER: Exactly. You end up relating a depression diagnosis to someone's answer to a question about how depressed they are, but it turns out the psychiatrist's diagnosis basically came from asking how depressed you are, or things like that.

PAUL: Yeah, a lot of studies get a correlation by just asking the same question twice and hoping nobody notices. My feeling is that work in perceptual psychology and the study of memory tends to be more robust. You can imagine different reasons. I'm not sure what kind of role this plays, but I feel that a lot of social psychology often has a political and social agenda behind it, and that has some distorting power in science. A lot of the stuff with implicit priming and so on has this huge intellectual agenda behind it, and a lot of it is pretty fragile.

SPENCER: What do you see as the intellectual agenda behind it?

PAUL: People are extremely excited by this work. I think there's a specific agenda and a general one. The specific one is that the work has gotten connected with racism and anti-racism: the idea that, for instance, some differences between ethnic groups are caused by factors in the environment, like priming and stereotyping. Stereotype threat is the ultimate example, where maybe low test performance by certain social groups comes about because seeing a test activates a stereotype in their brains, and then they do worse as a result.

SPENCER: Just to clarify those studies: they will, for example, remind someone of their race or gender right before they take a math test, and they show reduced performance when people are reminded of that identity.

PAUL: Yeah, and that, as you know, has had a bit of a checkered history. I think it's very hard to get these effects these days. One could say, to be generous, maybe the world has changed; maybe people are less aware of the stereotypes in our heads. But a less generous way of looking at it is that the findings were never that strong; it was just stuff people enjoyed and wanted to talk about. So that's the specific thing. The general thing is, I think people find it really cool, the idea that the presence of an American flag at a polling booth makes you more likely to vote Republican, that having a hot shower makes you feel less lonely, that holding a heavy resume on a big clipboard makes you think the candidate is more substantial. That's the sort of cool, counterintuitive stuff that gets into the pages of Nature or Science. Studies of visual priming and cognitive illusions and so on aren't as sexy, and the fact that they aren't as sexy, I think, leads to a higher standard of methodological design.

SPENCER: Yeah, I think sexiness is definitely a big problem in science. There's just such a pressure to publish that if you can dress something up and make it look really cool and interesting, you're more likely to get published, especially in this incredibly competitive landscape where so many people want to be in the same journals. The only way I know how to combat that is just more rigor in evaluation. We have this project, Transparent Replications, where we replicate new papers coming out in top psychology journals. We only look at top journals, and we use random selection to choose which papers. So it really is a glimpse at the best of the best that's coming out, and it's all fairly new.

PAUL: I think that's great. But it can't be totally random, right? Because if I do an fMRI study on 40 schizophrenics, you're not going to replicate that.

SPENCER: True, true, yes. We do have criteria in terms of the cost of replicating and our capabilities, but among papers meeting our bar for cost and capabilities, the selection is random. And I'll tell you the really shocking thing; we haven't released this result yet, but it's really shocking to me. If you were to look at research from 15 or 20 years ago in top journals, you would find that if you tried to redo a paper, a lot of the time you wouldn't get the same result. Based on the large studies that have tried to replicate lots of old studies, my best guess is that something like 40 to 50% of those psychology papers wouldn't replicate. We're finding very little of that. In fact, out of our first 12 replications, we found only two that failed to replicate, and both of them, we believe, failed for reasons that are different from why things failed to replicate 15 or 20 years ago. Back then, it was a lot of p-hacking: using questionable statistics, throwing away an outlier here, using small sample sizes, and doing fuzzy stuff that led to false positives. We don't think even those two failures were due to that. We think one was a weird confounding effect, and one was an issue of statistical power. So I would say zero out of 12 were p-hacked. This blows my mind. It really does. I'm curious to hear your reaction to that.

PAUL: It's good news. I've heard people make the claim that the field has not been cleaning up its act, but it sounds, from what you're saying, that if the stuff replicates, we're doing better. That's great news.

SPENCER: Yeah, I think it's great news too. And yet, we find a bunch of these papers that I do not think are great. Now, there are some great ones; I want to make that clear. There's some great research being done. But we do find that a bunch of these papers are not getting very good ratings on our scoring system. We think there's a whole new regime that we have to get into to improve science, and we call it "importance hacking." It's basically where people mislead reviewers into thinking that their result is worthy of publication when it's not. If you were to redo the study, you would get the same result, but the meaning of the result is not what's claimed in the paper. And it has to be done in a subtle enough way that reviewers don't realize it, because if reviewers realized it, they wouldn't publish the paper.

PAUL: Yeah, I'll give you an example of importance hacking I've done recently. I had a paper out (I won't name my co-author, because I'm about to complain about the paper) finding that children think natural foods are better. It's a kind of cool finding; it connects to some interesting ideas about impurity and so on. But the sample was American, and if you title the paper "American children think natural foods are better," it sounds less impressive than a title implying that all children do this. We compromised: in the paper itself, we were very clear to qualify it. But the title importance hacked a little bit, and if you just glanced at it, you'd think it was a much more interesting paper than one about what American children do.

SPENCER: That's really good of you to admit that. I do think everyone feels some pressure to do this in their work because they want to be published; they want to make it seem cool. But where I think it really becomes a problem is when the leap is so big, the amount of importance hacking so large, that it takes a paper from "this should not have been published" to "this gets published in a top journal." And that's what we see, unfortunately, quite a lot of. We've been thinking about what is going on here. How is this stuff getting through? You might think, "Well, in peer review, there are usually going to be three experts reading your paper. Who are we to catch errors they missed? These experts know way more about that specific topic area than we do." I think the difference is that we're remaking the sausage from scratch. We're saying, "Okay, what exactly did they do?" We want to rebuild it. We want to rerun that study from scratch. In the process of rebuilding it, we sometimes have these moments of, "Oh, wow, this is not what it seems to be."

PAUL: I'm not against the principle of what you're saying, but you're making a judgment call. Here's an example. Back in the day when fMRI machines became common, there were a million studies in top journals saying, "This part of the brain lights up when you feel envy. This part of the brain lights up when you look at porn," and so on. This stuff filled the pages of Science and Nature. I always thought, "This is meaningless. Who cares about these findings? Of course it's in the brain. It has to be somewhere in the brain." I would call that extreme importance hacking; they would just say, "This is great, important stuff." But of course, that's a judgment call. Someone who's into neuroanatomy might say, "No, this is profoundly important." This isn't a criticism, but you're making a judgment call to say this finding is important and this one is not.

[promo]

SPENCER: You're right. There has to be judgment at the end of the day. But our claim is: if the reviewers truly understood, they would change their minds. That's what we're trying to assess; we're asking whether the paper holds up to a reasonable standard. Let me give you an example. We had one paper where they plugged a parameter into the wrong place in their statistics, so the equation didn't do what it was supposed to do.

PAUL: That's a flat-out mistake. That's not importance hacking; that's a screw-up.

SPENCER: So that's a more clear-cut example. But it means the result didn't mean what they said it meant; yes, if you ran the wrong statistic the way they did, you'd get the same wrong result. Another example, and this one gets more judgment call-y, but I'm curious what you think about it. One paper claimed that people hold three different views about where wealth comes from: one view is that wealth comes from hard work, another is that it comes from luck, and so on. And there are three policy positions people could support. They had this kind of cool theory that the three views on where wealth comes from are linked to the three policy positions: if you have view one on wealth, you have view one on policy. Cool, nice theory. We reran the study, reran the statistics, got the same result. Beautiful. But we've developed this idea we call the simplest valid analysis, where we ask, "What is the simplest possible valid way to analyze this data?" and then reanalyze it that way. In our determination, the simplest valid method here was simply to look at the correlations between those things, because the way they actually analyzed it in the paper is very, very complicated. I'm a mathematician, I have my PhD in math, and I'm like, "This is a really complicated way to analyze this." So what do we find? One of the policy positions has no relationship to its corresponding view on where wealth comes from. The link appeared only because that policy position has a negative correlation with the other two views.
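[Editor's note: The pattern Spencer describes, where a variable looks linked to an outcome only inside a complicated model and the link comes from negative correlations with other predictors, is what statisticians call a suppression effect. Below is a minimal sketch with simulated data; the variable names and numbers are hypothetical, not the paper's.]

```python
# A minimal sketch of a "suppressor" effect, with simulated (hypothetical) data,
# not the actual paper's: a predictor with ~zero simple correlation with the
# outcome can still pick up a large coefficient in a more complicated model.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
signal = rng.normal(size=n)        # what actually drives the outcome
shared_noise = rng.normal(size=n)

view_a = signal + shared_noise     # genuinely related to the policy outcome
view_b = shared_noise              # the "suppressor": unrelated to the outcome itself
policy = signal + 0.5 * rng.normal(size=n)

# Simplest valid analysis: a plain correlation. It is approximately zero.
print(round(np.corrcoef(view_b, policy)[0, 1], 3))

# A more complicated analysis: multiple regression on both views.
X = np.column_stack([np.ones(n), view_a, view_b])
beta, *_ = np.linalg.lstsq(X, policy, rcond=None)
print(beta.round(2))   # view_b gets a coefficient near -1 despite r ~ 0
```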

PAUL: Yeah, I see what you're saying.

SPENCER: So it doesn't mean what they said, or what they indicated, at least.

PAUL: So, there are two senses of importance hacking, and I misunderstood what you were saying. You're giving me two examples. These are really good examples where people have unintentionally misunderstood what they found. They said they found something interesting, but their results do not support that conclusion.

SPENCER: Yeah, and to be fair, I don't think they're lying. I'm definitely not saying they're lying. But their analysis, I think, did not present the result accurately, because of the way it was done.

PAUL: Yeah. So there's another type of importance hacking that I worry about. I wrote a Substack piece with a title to the effect of "Most Developmental Psychology Isn't Worth Doing."

SPENCER: So, you made a lot of friends.

PAUL: Yeah, that's right. You'd expect I'd have gotten hate mail; actually, no, everybody loves it. I got several emails from people saying, "You showed those atheoretical dumbos that their work is crap. You and I know how to do good science." So nobody thinks I'm talking about them; I didn't give any specific examples. One of the emails was even from somebody I was actually thinking about when I wrote it. But the point is, for a lot of studies saying five-year-olds do this and seven-year-olds do that, my response is, "Who cares?" Without a background theory for why you would expect it to happen at these ages, it's without any interest.

SPENCER: Why is it not interesting? Isn't it useful information to know at what age different capabilities develop?

PAUL: Well, it's not theoretically interesting, typically, unless you think it's the sort of thing that's innate. Take, say, when kids realize that it's really mean to be late when you're supposed to meet somebody. Nobody thinks that's hardwired into the human genome. It's something you pick up, and so at some point kids know it, because you and I know it. So who cares whether it's four, five, six, seven, or eight? You could imagine somebody saying there's a theory of kids' understanding of time that says before age six they have no understanding of time; then, if you find they grasp this at age four, that becomes theoretically interesting. But in the absence of a theory, obviously it's going to come at some age or another. Who cares what age? Yet I would estimate half the papers in our top developmental journals are some version of "at what age do we get this?" And I always think, "Who gives a shit?" Your point about practical value, that's another reason you could do things. For instance, when can kids provide reliable testimony? That seems like a very good question, but most of the studies involve very weird laboratory designs. I have nothing against weird laboratory designs for theoretical purposes, but there are no practical implications here, because it's not as if you could take a study showing that seven-year-olds know something and six-year-olds don't and extend it to your own children; they only know it under a very special way of testing it. In any case, I don't think this whole genre is important. If I were in charge, I wouldn't accept it in journals; I would accept it at conferences and so on. But this is a different sense than what you're saying. You're saying something I think is far more objective, where the papers don't show what they say they show.

SPENCER: I think of it as running the whole gamut, and I started with the examples that are more clear-cut. But I think what you're talking about could also fall under importance hacking. However, people can reasonably disagree about what's important, as you point out. So I don't think the standard should be what some random person thinks is important, but some kind of consensus about what we actually think is important. Where it really bothers me the most is when reviewers literally come away misunderstanding what the result meant, as opposed to understanding it and just thinking it's important, even though maybe you and I would disagree with that. Okay, well, fine; but that's a different debate.

PAUL: There's a problem with reviewers, which extends to journals in general. I'm the editor of a journal, Behavioral and Brain Sciences. We don't publish empirical pieces; we publish theoretical pieces, and we strive to get papers that are interesting and appeal to a broad audience. But there's always a problem. Suppose you submit to our journal a paper on kinship terms and how they work, and I want to know: is this interesting enough for my journal? Who do you send it to for review? You send it to people who study kinship terms. They can tell you a lot about what's wrong with the paper, but they're not going to say, "Who cares about kinship terms?" They find this stuff intensely interesting. To properly review it, and to get around its importance hacking, I have to send it to somebody who doesn't study kinship terms and yet is smart enough to assess whether the work is a contribution. That's a very difficult task.

SPENCER: Well, that's a very interesting point, because if you have a whole subfield where the importance has been inflated, and everyone is sort of in on it, then its members may not be able to evaluate the importance hacking anymore; you have to go adjacent to the field. Let me give you what I think is an interesting borderline case. There's a paper we're looking at now; we haven't finished our evaluation of it. It asks whether people with more history of trauma behave differently in an explore-exploit trade-off. In other words, they're given the choice to either explore more, in which case you don't know what you're going to find, or exploit the best options you've found so far. Interestingly enough, they find a link between having more trauma and doing less exploration. I think that's potentially pretty interesting. Now, how do they actually execute it? With a simple apple-picking game, where your little avatar sees an apple tree, and you have to decide whether to keep taking apples from the tree or move on to the next tree, and you do a bunch of this. You can imagine the really importance-hacked version of this, where they don't mention anywhere that it's an apple-picking game, and the only way to know that is buried in Appendix seven. That would be, I would think, extreme importance hacking. On the other hand, you could have the completely un-importance-hacked version, where right in the title they say, "In an apple-picking explore-exploit simulation game, traumatized people do this." In real life, what we often see is something in between, where somewhere in the paper it's mentioned that it's an apple-picking game. I think you and I would probably agree that it's not clear this apple-picking game generalizes to real life. So, yeah, I'm curious to hear your thoughts on that.
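[Editor's note: For concreteness, here is a toy sketch of an apple-picking (patch-foraging) explore-exploit task of the general kind described above. The function, thresholds, and depletion rate are invented for illustration and are not taken from the actual study.]

```python
# A minimal sketch of a patch-foraging explore/exploit task like the
# apple-picking game described above. All parameters are hypothetical.
import random

def forage(leave_threshold: float, n_steps: int = 200) -> tuple[float, int]:
    """Pick apples for n_steps; moving to a fresh tree costs one step."""
    total_apples = 0.0
    trees_visited = 1
    tree_yield = random.uniform(5, 15)    # current tree's per-pick yield
    for _ in range(n_steps):
        if tree_yield < leave_threshold:  # explore: leave for a new tree
            tree_yield = random.uniform(5, 15)
            trees_visited += 1            # the travel step yields nothing
        else:                             # exploit: keep picking here
            total_apples += tree_yield
            tree_yield *= 0.9             # the tree depletes with each pick
    return total_apples, trees_visited

# A lower leave-threshold means sticking with depleting trees (less exploration),
# the behavioral pattern the paper reportedly links to trauma history.
for threshold in (2.0, 6.0, 10.0):
    random.seed(1)
    apples, trees = forage(threshold)
    print(f"leave_threshold={threshold:>4}: apples={apples:8.1f}, trees={trees}")
```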

PAUL: I agree. There's a genre in social media where somebody has a paper with the title, "Drinking carbonated beverages increases risk of liver cancer," and then they put in a response, "in mice." Often you find these huge findings, and sometimes not even in the title, but it's with mice and with rats. Now you can generalize from mice and rats to humans, both physically and psychologically, but it matters that it was done in a different animal.

SPENCER: I know the Twitter account you're talking about, right? It just retweets the papers.

PAUL: That's right. There's another kind of importance hacking. I think Paige Harden, who wrote this wonderful book, The Genetic Lottery, talks about this. When developmental psychologists aren't doing the bad studies I talked about, they're doing a different kind of bad study: a million studies finding that if parents read a lot to their kids when the kids are young, they grow up to be adults who love books. I've never actually seen that exact study, but I'm sure it would work. The obvious causal reading is that the early experience did it: reading the books when you were young caused you to love books when you're older. But, as Harden points out, there's an obvious genetic component, which is that whatever traits could lead to bookishness (intelligence, curiosity, inquisitiveness, self-control, whatever) are heritable. We already know they're heritable, so the correlation is predicted anyway, and there may be no causal role at all for the early experience. She basically says any study that makes a claim like that and doesn't make some effort to pull apart experience and genes should be consigned to the flames.

SPENCER: In that example, there's all kinds of components. There's even cultural components. Maybe the parents are more educated, and they read to their kids, and then the kid ends up being educated, so they read.

PAUL: Exactly; rich people both read more to their kids and read more as adults, and richness tends to carry through life. Or maybe studious, quiet kids lead their parents to read more to them. These are what we call child effects in the business. There are a lot of studies saying that parents who smack their kids around have kids who tend to be more violent as adults. Not surprising: there's an obvious genetic story, since aggression might just be heritable. But there's also a child effect; maybe parents smack disobedient kids more than quiet, obedient, docile kids, which is not surprising either.

SPENCER: Man, this stuff is so hard. But my view is that it's fine as long as we know what we're finding: "Okay, we found a correlation here. It's just a correlation. We don't know if it's causal." It still could be interesting evidence floating around. So I guess I would say that I'm a fan of papers that just find a bunch of facts about the world, even if they can't necessarily interpret those facts. Maybe they'll help someone develop theories.

PAUL: Maybe I'm less of a fan. There's so much data that could be picked up in the world. I could just monitor kids in a playground, see how active they are in play, then look at them five years later and see their grades in history class. Do a study; run a correlation; find there's a correlation. The problem is there's limited space in journals, limited reviewer time, limited attention from people like you and me. So don't we have to be more selective than just collecting cool data and seeing what happens?

SPENCER: Well, for journal publications, certainly, the work should be weighed against what else could be shown; it doesn't automatically deserve a slot. But I do think there is a basic amount of general fact-finding that is important, even if we don't know the story behind it. In fact, we actually released a million correlations that you can explore, because I think it's useful to have them out there. People can develop hypotheses and quickly check a theory they have, like, "Hmm, is OCD linked to autism? Let me go check the correlation and see whether it might be or not."

PAUL: It reminds me of Aella. Do you know her?

SPENCER: I do, yeah, she's been on the podcast before.

PAUL: Oh, that's right. Okay, excellent. So you know better than I do, but she does these enormous survey studies of kinks and sexual preferences and so on. Some of it is theoretically motivated, but there's this enormous amount of data and correlations, and I agree there's something of value there. Say adults who enjoy military history also enjoy being spanked, or something like that; I just made that up, I have no idea. But you might imagine that you wake up one morning with a theory that makes this prediction, and then you check this body of correlations: "Oh, my God. There it is." And then, of course, you have to do another study, because when you have a million correlations, I can't do the math in my head, but at p < .05, something like 20,000 of them will be significant just by chance.

SPENCER: Yeah. So the way we approach it is actually quite different.

PAUL: It is 50,000. Sorry.

SPENCER: Yeah. For anyone interested, by the way, it's free: you can check it out at personalitymap.io and explore these million correlations. But here's what's interesting: we don't use statistical significance for this. What we do instead is try to limit the confidence interval on each correlation, because some of the correlations have 10,000 data points, and with 10,000 data points you get quite a small confidence interval. We've also done simulations asking: if you're searching for what correlates with OCD, for example, with different numbers of data points, how confident can you be in the correlations you get? It's a different way of thinking about things.
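[Editor's note: A quick sketch of the arithmetic in this exchange: the expected number of chance "hits" among a million null correlations at p < .05, and how the confidence interval for a correlation tightens with sample size. This is illustrative only, not personalitymap.io's actual methodology.]

```python
# Chance "discoveries" among a million null correlations, and CI width vs. n.
import math

print(0.05 * 1_000_000)   # ~50,000 significant by chance alone, as Paul corrects

def corr_ci(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate 95% CI for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)                 # Fisher z of the observed correlation
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

print(corr_ci(0.20, 100))      # n = 100: roughly (0.00, 0.38), very wide
print(corr_ci(0.20, 10_000))   # n = 10,000: roughly (0.18, 0.22), quite tight
```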

PAUL: Interesting.

SPENCER: Coming back to the value of correlations, if you have a causal theory about the world, very often it will make correlational predictions. One nice thing about having correlations out there is that you can say, "Well, I think A causes B," go check the correlations first, and you might be able to refute yourself in 10 minutes.

PAUL: Yeah, I think those databases are really valuable. The problem is that a lot of the studies I'm worried about are the worst of both worlds: they're theoretically unmotivated, and they don't reward you with an enormous body of data you can look at. Rather, it's the performance of 26 kids on some unrealistic task. If you don't have a particular theory guiding your work, at least collect a lot of data that can then serve as a rich resource for someone like you, or someone else, to come back to with an idea.

SPENCER: Yeah, I totally agree with that. And while I appreciate many things about the Open Science movement, one thing I don't like is that it has given some people the impression that you should collect fewer variables or outcomes. The idea is that if you collect a lot of outcomes, maybe you'll do fishy things and end up reporting whatever worked. That was a really big problem: someone runs a study on some new treatment, collects 25 outcomes, one of them works, and they highlight that as the thing that worked, ignoring the fact that it might be, and probably is, just a false positive. What this misses is that there's a difference between doing rigorous science and doing science that only collects a few variables. In fact, you're almost always better off collecting way more variables if it's cheap, but then you have to apply a level of rigor. For example, you might pre-register which hypotheses are your key ones, or adjust your statistics to account for the fact that you're testing lots of hypotheses.
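[Editor's note: A minimal sketch of the kind of adjustment Spencer alludes to: when you test many outcomes, correct the significance threshold rather than collect fewer variables. Bonferroni and Benjamini-Hochberg are two standard options; the p-values below are made up for illustration.]

```python
# Two standard multiple-testing corrections for a study with many outcomes.
def bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    # Controls the chance of even one false positive across all m tests.
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    # Controls the false discovery rate; less strict than Bonferroni.
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            max_rank = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            reject[i] = True
    return reject

# 25 outcomes, one nominally "significant" at p = .03: after correction for
# multiple testing, it no longer counts as a discovery.
ps = [0.03] + [0.2 + 0.03 * k for k in range(24)]
print(sum(bonferroni(ps)), sum(benjamini_hochberg(ps)))   # 0 0
```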

PAUL: I think that's entirely true. There's no problem in collecting more data, and more data is better than less, so long as you have some way to constrain yourself so that you're not doing after-the-fact storytelling. Two questions occurred to me. One: you said that you work off top journals when you do this. That makes sense. I imagine Psychological Science would be one of them.

SPENCER: Yeah. So the journals that we replicate from are PNAS, JPSP, Psychological Science, and we also do behavioral and psychological papers in Nature and Science.

PAUL: You could not have picked better. If I had to start from scratch, I would say those journals, plus whatever you can manage to squeeze into Science and Nature; those are the top papers. There's this lore, which I'm not sure about but which makes sense, that the top journals have the least replicable papers, because people send their sexiest stuff to the top journals, and for reasons we discussed, the sexy stuff tends to be of lower quality. I'll qualify that by saying JPSP is less likely than the others to have this issue, because they publish multiple-experiment papers that are often pretty theoretically motivated. But in general, do you think the big journals publish the worst papers?

SPENCER: It is interesting. I definitely think they publish sexier-sounding results, because they're optimizing for what they think people will be really interested in. But we're not seeing big problems with replicability, as I mentioned: two out of 12 failed to replicate, and the reasons they failed were quite weird and idiosyncratic. So, just based on what I know now, I can't say the top journals are the least replicable. I don't think anyone has good data on that, and it's not as if there are really good measures yet.

PAUL: Maybe it's still true; it would just mean you'd get close to 100 out of 100 if you took a lower-tier journal. But that's nice news for today: the field is back on track, producing replicable results, with a bit of importance hacking in the margins. So are you bullish on psychology?

SPENCER: I think there's this really big problem that happens in the world broadly, and it has nothing to do with academia, which is that if there's an easy, fast, cheap way to do a thing, and there's a really difficult way to do a thing, and they're equally rewarded, then you get a giant proliferation of the easy, fast, cheap way to do the thing. I think you can see this sort of all over the place, and I think that we see this in psychology. It's really, really hard to do really good psychology research. It's damn hard. You have to try to figure out something new about human nature, and you're also competing with the fact that we humans are really good intuitive psychologists in a lot of ways. Yes, we have examples where we're wrong about human psychology, but we understand psychology pretty well as humans. So I think it's really hard. And I think that what happens is there's tremendous pressure to do the fast, cheap, easy thing, which is basically to mislead about what you found and make it sound a lot cooler than it is.

PAUL: I'd frame it a slightly different way, which is that so much of psychology has gone to surveys, to studies you can run on MTurk or Prolific or platforms like that, where you just ask 1,000 people questions on the internet. You don't have to leave your office. Sometimes you can run experiments that way; sometimes you basically just ask people to fill out forms. I feel something got lost, in part for the reason you're talking about: this is an easy way to do things, so people are drawn to it rather than to running experiments in a lab with deception and complicated machinery. Also, one of the things that happened in the replication crisis is that we realized we have to test a lot more subjects to do reliable work. It's not difficult to do a thousand subjects online if you have the money; doing a thousand subjects in a laboratory might take you five years. As a result, I think this has shaped the questions psychologists ask and the studies we do.

SPENCER: That reminds me of something I wanted to ask you about. It seems to me that some of the really great classic studies could have been done on 10 people and still had phenomenal results. If Milgram gets five out of 10 people to shock someone seemingly to death, or Asch gets a handful of people to report that a line that is clearly longer than another line is actually shorter, that's damn impressive. You don't need to run it that many times before you think, "Wow, we figured out something really fascinating here about obedience or about social conformity."

PAUL: You've cleverly chosen the two best social psychology studies, ones that capture really robust findings. For something like the cognitive dissonance studies, the more subtle findings, you may not get it with a small group of people; in fact, you might not even get the same results with a big group of people. But you're right. This is why, if you ask a bunch of psychologists who the greatest psychologist of them all is, I think many would say Daniel Kahneman, up until last month the greatest living psychologist of them all. One of the things about Kahneman and Tversky's research program on human reasoning is that the findings are so obvious. There are studies where they ask you, "What's more frequent, this or that? Does this imply that?" and people make mistakes, cognitive illusions. You don't even need 10 subjects. You can run it on yourself; it's like looking at an optical illusion yourself and getting the wrong answer. "Wow, it's really compelling."

SPENCER: Kahneman and Tversky apparently would do that. They would test the problems on themselves, and if they both got the wrong answer, they knew they had something.

PAUL: Yeah. And I think that stuff is incredibly cool. I have to say, and you could even call this somewhat anti-intellectual, but I tend to distrust findings that need very sophisticated statistics to show that they work. Now, I know nature doesn't always give you big contrasts; it can give you subtle contrasts, so this isn't a principled point. But if you say, "Oh, we have this difference, but I had to test 10,000 subjects, and it's really subtle, and I had to do this fancy statistic you don't understand," I much prefer a study like, as you say, "We tested five people and check this out. It's really big." Now, there you're running up against the fact that we're natural psychologists: if it's so obvious that you only need to test five people to find it, don't we know it already? But not always. Ahead of time, Milgram asked a bunch of people, including clinical psychologists and various experts, "If I brought people into my lab and told them to kill somebody, would they?" He described the study, and they said no; the occasional psychopath would, but basically no. Then Milgram found that over half the people would. That's an excellent example of the perfect study, I think. It's funny, because Zimbardo's study came around the same time, and it's often mentioned in the same breath as Milgram.

SPENCER: Yes, one of the worst studies [laughs].

PAUL: Yeah. I think Zimbardo has made a lot of really important career contributions and is a very serious person. Nonetheless, the Zimbardo study, the Stanford Prison Experiment, might well be the worst study.

SPENCER: For those who don't know, he got a bunch of people, I think students, and assigned some of them to be prison guards and some to be prisoners, and did a so-called experiment (the word experiment in quotes) to test what would happen. Would the people assigned to be prisoners behave differently because they were assigned to be prisoners? Would the people assigned to be prison guards act like prison guards? But the study was just riddled with problems: everything from inconsistent reporting (all kinds of claims were made about it that seem not to be true), to major problems with Zimbardo interfering with the study itself and manipulating it while it was happening, to huge problems with data collection. So I think it's more like performance art than an experiment.

PAUL: Well, back in the day, a lot of those social psychology experiments were like that; some of them would be felonies if they didn't have human subjects approval. In my book, I review this field of psychology, and I have a section on social psychology, and I don't mention the Zimbardo study. It just doesn't make the grade. There's another study; have you heard of the Robbers Cave study?

SPENCER: Yeah, it seems pretty flimsy to me.

PAUL: Yeah. The thing that makes it seem great is that it's like performance art. You can imagine it as an HBO series: he gets kids in a camp, sets them up as two different groups, the Rattlers and the Eagles or something, and they go to war against each other, and then he figures out how to defuse it. It's extremely dramatic and a lot of fun. But what I only found out recently, because there was an exposé of this published, I think, in The Atlantic, is that this wasn't the first time he tried it. The first time, he failed; no matter what he did, the kids got along. In the second study, he had to go to extreme lengths, not described in the paper, to get them to hate each other. So I think that's another one to be skeptical about.

SPENCER: Well, it's funny, because I think it's happened many times that a study popularized an idea; later, people took a more skeptical view, realized the study was low quality, and dismissed the concept. But actually, the concept is true; it's just that the original study didn't really demonstrate it, or didn't demonstrate it well. I think of power posing this way. The study on it was really bad. But the reality is, if you look at the data, doing power poses does slightly improve your mood and your sense of power. I think pretty much all the data shows that.

PAUL: I don't want to pile on power posing, but I think you may be a bit too generous there. You're right that the effect is real, but that part was almost a manipulation check; you'd expect people to feel better when they power pose, because they were induced to do it and the context says it's a good thing. I thought the idea was that it would affect cortisol and have these physiological consequences, and that part didn't hold up.

SPENCER: Right, those didn't replicate. Well, it depends on what you think the purpose of power posing is. If you sell it as, "Hey, you're about to go on stage and you want to feel slightly more powerful; try this, it may actually increase your feeling of power," then fine. If you sell it as having all these life-changing benefits, then, yeah, probably not. But I think there are a bunch of examples of this. There was the study on whether smiling makes you feel happier, the facial feedback hypothesis. The original study said that, indeed, smiling makes you happier; then the study came under scrutiny, and people said, "Oh, no, maybe it doesn't." Now, true, the study doesn't replicate well, but the reality is that smiling does make some people feel a bit happier. That's my view.

PAUL: No, I think that's right. Sometimes studies are faked, or partly faked, or p-hacked, or whatever, because the investigator is very confident there will be an effect; there's a little bit of nudging to make it work. But if the investigator was right, it's a real finding. There was something recently involving people going door to door, talking to people about some political issue, and getting a change in their attitudes as a result. I think it had to do with trans rights, and it turned out the study was faked. Then they did it again, properly, and found that it worked. So, yeah, that happens. It's a strange field in some ways.

SPENCER: This goes back to the fact that we're intuitive psychologists. People are far from perfect in their understanding of psychology, but a lot of people have a pretty good sense of what's true about human psychology, and sometimes you can predict it. Even if you faked your study, you might have actually made a correct prediction.

PAUL: There was this thing going around a little while ago where you get people to predict which studies will replicate and which won't, and people do pretty well. I wonder whether anyone has been doing this with LLMs: give them, say, 20 studies and ask which ones they think will replicate.

SPENCER: They do pretty well, but they don't do so well that it's that useful a tool. I think they typically will be 60 to 70% accurate. I don't know if that's accurate enough to make it that useful.

PAUL: If it was perfectly accurate, it'd be very strange. Why do psychology at all?

SPENCER: Exactly. Going back a little bit: we talked about areas of psychology that could use some improvement, and before that, about some strong areas where we've learned a lot. There are some interesting ones in the middle. So let's talk about Freudianism. Is Freudianism total bullshit? Is it useful? Was it a building block that we've since built on or transcended?

PAUL: Yeah. The way I put it in my book is that a lot of psychologists' attitude toward Freud is like a pharmaceutical company's attitude toward an embarrassing origin: Freud is considered this incredibly embarrassing figure, and we want to show what scientists we are by disowning him. I think the verdict on Freud is complicated. Freud was wrong in just about all of his specific claims: the relationship between homosexuality and paranoid schizophrenia, the effects of the primal scene, where the kid walks in on his mother and father making love, and how that affects the kid. Incredibly bizarre Freudian tales, almost none of which have any support. But I think Freud got the big things right: the idea of an unconscious, of an active unconscious. He was not the first to say it, but he was the first to systematize it at such length. If you go up to political scientists or social psychologists and say, "What do you think of Freud?" they say, "Freud's bullshit." But if you say, "If you want to know why people voted for Trump or voted for Harris, can you just ask them? Wouldn't that solve all your problems?" they would say no, because people might be motivated by forces they are not aware of. You might think you voted for Harris for this or that reason, but you really voted for her for a different reason. We accept that as psychologists. I accept it; I think it's true. I think we often don't know why we do what we do, even for important things, or why we feel what we feel. The Freudian strategy of saying there are unconscious, subterranean mechanisms at work is right. That's where I would put his primary contribution. What do you think?

SPENCER: I wonder, though, where he got those ideas from. To what extent did he originate them? I totally agree with what you're saying; those are great ideas. I just don't know enough about the history to know how much credit he gets. Maybe he does get a lot of credit for that.

PAUL: My namesake, Harold Bloom, has a book, Shakespeare: The Invention of the Human. The idea is that the concept of an unconscious long preceded Freud, so he didn't invent it. And, for an incredibly immodest man who took credit for everything, he was actually honest about it. He said, "Well, I'm not the first to think of these ideas; I systematized them." But Freud gave us an example of what a science of the unconscious might look like, and that gave a lot of insight. He was not the first to point out that dreams might have hidden meaning, but he did more than anybody else to place dreams in the context of a theory of unconscious dynamics. This may be one of the cases where he was mistaken; maybe the story of how dreams work is simpler than Freud's. So he doesn't get credit for discovering the unconscious, but he gets a fair amount of credit for systematizing it and exploring it.

SPENCER: What do you think of the fact that quite a number of therapists today are practicing Neo-Freudian methods like psychodynamic therapy?

PAUL: I was on the Brian Lehrer show a little while ago, and I thought I was there to talk about psychology, but he only wanted me to talk about clinical psychology. I'm not a clinical psychologist; I just have a chapter on it in my book. It's a call-in show, and he asked me what I thought of these therapies, things like Jungian therapy. I said, with great contempt, "Nobody does that anymore. That's just nonsense." Then the calls started to come in: "Really? I do Jungian therapy, and I've been doing it for the last 50 years, and how dare you, you so-called expert." I think people use these methods a lot: Freudian methods, Jungian methods, methods influenced by Freud's followers, like Adler and so on. I think the evidence for the efficacy of these methods in actually diminishing human suffering is very small, but I also think that some people are really good at helping others, and even if they dress it up in a theoretical framework that's unsupported, maybe even silly, they still do great stuff. You could believe that talking to a priest about your problems could really improve your life, even if you reject all of Catholicism. Similarly, talking to a psychoanalyst could really improve your life, even if you think Freud is just crap.

SPENCER: This reminds me of my theory of astrologers. We did a big test of astrology. We started with a silly little test in one of our studies measuring how well personality tests predict things about people: we included people's sun signs, like whether they're a Pisces or an Aries. We found the signs had no predictive power at all for any of 37 outcomes. It's a very nice control group to include in your study to make sure your statistical methods are working. We put that out in the world, and a bunch of astrologers got really mad at us. They said, "That's not fair, because it's not real astrology. That's like baby astrology." We asked, "Okay, what's real astrology?" "Real astrology involves looking at the full astrological chart, and you can learn all kinds of things about people." We said, "Okay, cool, let's test that." We got over 100 astrologers to participate, and we showed them lots of information about people. Then they had to guess which was each person's astrological chart. There were decoys, and then there was the real one, and they were no better than chance. Not only were they no better than chance, they didn't even agree with each other; they had very low rates of agreement. Very bad for astrology. It got me thinking, because I know people who love astrology and totally believe in it. I started thinking, "Maybe they're not using the chart nearly as much as they think. Maybe they're just paying attention to the person, listening carefully, reading the body language, and saying things based on that." The fact that astrologers did not agree at all on which chart belonged to the person made me think maybe the chart is doing 5% or 10% of the work, whereas they perceive it as doing 90% or 100% of the work.

PAUL: That's a nice analogy. It's not a flattering analogy for Freud, but it's not a bad one. Sometimes people need a framework. I'm a Capricorn, so I'm skeptical, but I think what Freud and a lot of psychoanalytic training give you is a series of techniques, a procedure, and a theoretical framework to work within. In therapy, things can work even if they aren't true. You come in with your problems, and the psychoanalyst relates them to your childhood, your relationship with your mother, as one does. Maybe it's just nonsense, but maybe it also reassures you: now there's an explanation, there's a story, you can make sense of it. You feel someone's focused on you. You feel cared for. There are a lot of arguments that the benefits of therapy, putting aside the drugs and all of that, come from what's called the therapeutic alliance: having a smart person who cares about you, wants to help you, and believes you will get better. That makes a big difference. Some studies show that it's the therapists themselves who matter, much more so than the framework they work within.

[promo]

SPENCER: I think that's all true, but something that bothers me is that often we have better methods that we know about. I don't have a problem at all with someone using astrology if they find it beneficial, but I think they could probably get an upgrade to some other method that isn't just based on what I think is pure randomness.

PAUL: Yeah, I think if you came to a therapist with a specific phobia and they started talking about whether you were breastfed, you should run. For some things, cognitive behavioral therapy is the treatment of choice and really does work. I also think the various medications we give for anxiety, depression, and even psychosis are better than the alternative. But some people just go to see a shrink because they're unhappy with their lives and struggling through. An ideal world would have a science behind that, too. I like to think that 100 years from now, people will look back on how we do therapy and be shocked, horrified: they didn't know what they were doing. I'd like to see therapy a hundred years from now be the equivalent of doctors prescribing antibiotics for an infection or putting a splint on a broken bone, the stuff that really works, because we have a science behind it. We know how it works. Until then, absent the science, maybe people just benefit from caring, compassion, and someone listening. Maybe that's better than nothing.

SPENCER: Yeah, I do think we know some things about what works though.

PAUL: What do you think works? Do you think everything is cognitive behavioral therapy? Or do you think of drugs? What do you think?

SPENCER: Yeah. So imagine someone comes in with depression, and the therapist is talking to them and trying to understand the depression. The person says, "Yeah, well, I'm just a worthless sack of shit, and I'm never gonna amount to anything, and I have no value in society. So I avoid all my friends because I'm ashamed; I'm not good enough for them." It's like, "Okay, there's something to be done here." It matters what you do with this person. Just digging into their childhood trauma without giving them any actual methods to change the problems they're facing today, I think, is under-serving them. A cognitive behavioral therapist in that situation is going to help them think about those negative thoughts they're having about themselves and what the impact of those thoughts is. They're going to help them think about their behaviors, like avoiding their friends, and what the impact of that is. I just think that's way more useful than the bare, basic version of just giving empathy. Giving empathy is great, but there's more to be done.

PAUL: I think that's right. But I also think that a lot of people who market themselves as Freudian or neo-Freudian do a lot of the things you're talking about. They talk about maladaptive thoughts. They just talk about your life with you and so on. I agree the emphasis on childhood trauma could be a waste of time. But I also think that, for an intelligent person, CBT can feel a little bit demeaning. Sometimes people really want to talk about their lives, and CBT doesn't do that. They want to talk about their problems and work their way through them, and maybe the neo-Freudian gets closer to that. But I'm not disagreeing. I'm pro-CBT.

SPENCER: Schema Therapy has an interesting strategy for this, which is that, as I understand it, they still use a bunch of the elements of CBT, but they also let you talk about your childhood trauma to try to understand where your schemas come from. Then they can do CBT on that, which I think is actually kind of nice, because I do think it could be interesting to reflect on your childhood and how that might have influenced you today. The main problem with it is that it's always stuck in speculation. You're never going to be sure that the reason you're having this problem today is because of that thing that happened to you when you were five years old. Additionally, even knowing that thing happened when you were five years old doesn't automatically unlock and free you from that behavior today. That's the disconnect, but I do think it can be interesting as part of therapy to explore that.

PAUL: Exactly. It's a real-world case of the sort of confounds we were talking about earlier in developmental psychology. You come in and you're just tremendously anxious, and it turns out that, as a child, you had a kind of crappy childhood in all sorts of ways. Well, maybe the crappy childhood caused your anxiety. But then again, maybe if you'd been raised in the most loving, supportive environment, you'd be exactly the same person, because it might be that your environment played very little role in what happened to you. I gotta say, though, as somebody who's been married for a while: nobody will willingly listen to your dreams. But a shrink, you can tell them your dreams, and they get paid all that money to listen and maybe talk about them, and that's gold.

SPENCER: That's true. There's almost nothing more boring than hearing somebody recount a random dream sequence that seems to have no coherence whatsoever.

PAUL: That's right. But shrinks live for it. You just go, "I had this great dream." Yeah.

SPENCER: The more you believe that dreams have deep significance, the more interesting they are to listen to.

PAUL: So I have to ask you the question I've been asking myself and everybody has been asking each other: what about LLMs? How are LLMs going to affect all of this? Some people I know say we should be running our studies on LLMs and not on people. To me, this is madness.

SPENCER: You mean simulated humans? Yeah, I've seen some studies like this. So they'll give an LLM a prompt, like, "You're a 42-year-old American man who lives in Chicago, and you're a plumber," and then they ask it a bunch of questions and have it simulate the answers. They can even do experiments. They can give it stimuli and have it do things based on that. Then they do this with large samples of these random pseudo people, and then they try to say, "Well, how well does it match human data?"

PAUL: Yeah, yeah. And there are some people who think that LLMs will soon replace human therapists. Right now, if you end up talking about your problems with ChatGPT for long, I think you'll find it disappointing. But, and maybe this is a heretical view, it wouldn't surprise me if two or three years from now we get to a point where that's possible, and you can get an LLM that is as seemingly empathic, engaged, smart, and patient as a human, except it's always there for you, and it's totally focused on you.

SPENCER: And they never forget anything you tell them.

PAUL: They never forget anything. They never confuse you with somebody else. They never go on vacation. They don't go on vacation in August; shrinks always go on vacation in August. And then the question is, how will we react to this? There are studies showing that people rate AI advice and AI responses as more empathic than human advice and responses, unless they know it's an AI; then they downgrade it. But I think as AIs get better, it may become irresistible to think of them as people.

SPENCER: It certainly seems like technology is on its way to do that. I have a friend who uses an AI as her therapist, and what she does is give it a very specific prompt about how she wants it to behave. I don't want to call her out, but it would be something like, "I never want you to do XYZ. I always want you to talk to me as though I'm this way. I always want you to ask me before you give me advice because I don't like advice that's unprompted." Whatever it is, you have to design your own perfect therapist. And then if it does something you don't like, you change the prompt and it doesn't do that anymore.

PAUL: That's a more general problem with AIs, which I've been interested in, which is they will do what you tell them to. In some ways, they're the perfect companion. She now has found the perfect therapist who doesn't do anything that pisses her off or upsets her, but maybe she would benefit from getting advice before being asked. Maybe she would benefit from having somebody disregard those instructions.

SPENCER: Yeah, acting in a way that you never even imagine a therapist would act, but actually could be helpful.

PAUL: Maybe we aren't the best judges of what's best for us. So your therapist ends up wanting to talk a lot about your current problems at work, and you want to tell it more about the dreams, but it says, "Enough of the dreams. Tell me about your current problems at work." And maybe, if it's an AI, you'd say, "AI, override, do what I tell you to," but I don't know. I'm worried about the malleability of these companions and whether we would be better off with a little bit of friction.

SPENCER: I've seen this happen with AI use around critical thinking. I've seen some people use it in a way that I think really enhances their critical thinking, where they say, "Hey, can you fact-check this for me? Can you tell me what I might be wrong about? Can you give me both sides of this argument so I can really understand the different sides?" I think that's great. Or they just have it accelerate their information gathering, and that feeds into their thinking. But then I see other people using it to outsource the core part of the thinking. Instead of them doing the core thinking, now the LLM is doing it for them, and they're just trusting the output. Or, even worse, they're asking it to make arguments in favor of what they're already committed to believing, and then they'll take that, post it as a comment in some argument they're in on Twitter or Facebook, and not even necessarily say it's from an AI.

PAUL: That's an extreme sock puppet.

SPENCER: I've seen a lot of people do this.

PAUL: Whenever I write a Substack post or anything, I give it to Claude to proofread, to tell me about typos and grammatical errors. It's getting better and better at that. But it also says, "What a wonderful article, so insightful, so smart," and I feel great about it. And then sometimes I say, "Give me the best arguments against it," and it's still sort of sucking up. But if you tell it, "Be my harshest critic, be dismissive," finally you get some real pushback from it.

SPENCER: I use custom instructions. They're permanent, so they're always part of the conversation, and they prevent it from complimenting me. I find that good for my soul.

PAUL: I'm impressed. I'm impressed you have forbidden your AI to do so.

SPENCER: I don't forbid it from complimenting me, but I give it a whole bunch of instructions whose net effect is that it never compliments me. For example, I tell it, "Never use extra words that are not necessary."

PAUL: That's good, that's efficient. Are you one of these people who use LLMs in your everyday life?

SPENCER: Oh, constantly, yeah, many times every day, for so many different things, everything from helping me brainstorm to fact-checking me to coming up with arguments for and against to naming things. It's good at generating names, lots of stuff.

PAUL: Yeah, same with me. A while ago, I heard a podcast with Tyler Cowen, who says that instead of using Google, he uses ChatGPT for more and more things. I have found that it's often astonishingly good at some things and bad at others, and you can't trust it for citations, quotations, and that sort of thing. But it's great for a lot of stuff that's not particularly worth doing yourself. We had a recent faculty search where we had to send the administration biographies of the people we had rejected. Nobody does anything with them; they just sit in a file. So you just feed it the CVs, and boom, boom, boom.

SPENCER: For bureaucratic things where nobody's going to read it and it really doesn't matter, but you have to produce some document that meets some specification, it can be quite handy. I really think this technology can accelerate the way we think and make us better thinkers, but you do have to be aware of the pitfalls. One of them is hallucinations. The models are getting better; they make up facts less often. But you really have to have a sense of when you can trust it, and also a sense of how critical the information is. If you just want the gist of something you know nothing about, then even if 5% of it is mistaken, you've still learned a lot; maybe that's fine. But if you're really going to rely on it for a serious decision, you have to ask, "Can I double-check this?"

PAUL: That's right. You should never use these things for something where it's important that you get everything right. But I also find it's very good as an explainer. I'll even say, "Explain it to me like I'm 12 years old," and I ask questions and say, "Give me some quizzes to make sure I understood it," and so on. I think it has the capacity to be a wonderful tutor.

SPENCER: Absolutely. I mean, being able to talk to a piece of information. Instead of just reading a textbook, you can ask the textbook questions, ask it to explain something a different way because you didn't understand the first explanation, ask it to give you an example. That's so powerful. That works with general knowledge, where it already knows a lot, but, more and more, I find myself pasting long documents into it. The prompt sizes are getting bigger and bigger. I can paste in a 20-page document and ask questions about it.

PAUL: Early on, when it first came out, before they had limits, I had a draft of my book Psych, which was apparently made up of 150,000 words, and I just uploaded it and said, "Write me a new chapter." It wrote me a new chapter on, I think, emotions and rationality or something. It was pretty terrible, but the fact that it could do it at all was astonishing.

SPENCER: It's pretty mind-blowing. The technology is moving so fast, it's hard to even talk about these things because every six months there's some major innovation. It always feels like you're on the edge of being outdated.

PAUL: I enjoy writing books, and I'm tempted to write something, but I don't want to write on AI because it takes me two years to write a book, another year for it to come out, and by that time, we'll all be slaves or something to our AI masters. Things will have changed so much.

SPENCER: Yeah, do you think that the book world will be altered? You said, "Give me a new chapter," and it wasn't great, but in two years, three years, at what point can it produce a sensible chapter that's actually pretty readable? At that point, does that change the whole book world dramatically?

PAUL: If AI could get to a point where it could write a good novel, then all bets are off. Can you imagine saying, "I like Ian McEwan, but he writes a book once every three, four years. I don't like waiting, so I just go, please write me an Ian McEwan novel, set it in Dublin, make it in this year," and then, boom, it does it. What will the world be like when you could get that? Horrible for writers.

SPENCER: Horrible for writers, for sure. There's this very strange phenomenon where a bunch of the things that LLMs do right now would have seemed to imply much more. If you were to tell people 10 years ago, "By now, an AI will be able to do X, Y, Z," they'd be like, "Well, then are there famous AI authors? Are there AIs running businesses?" It's not at all clear why it's possible for an LLM to do the amazing things it can do now but not some of these other things.

PAUL: You're right. It's a very uneven profile. I told you I'm an editor, and at times, just for fun (I've never done this for a real decision), I've stuck in a whole manuscript and the reviews and asked it to write me an action letter with a decision, and boom, three seconds later, there it is. Now, I don't always agree with it, but the fact that it can do it at all. And yet there's also so much we don't have. It's mastered the world of abstractions, but I still don't have my robot maid or my robot butler. I don't even have my self-driving car. Somebody from the past would be astonished at where we are now, but also a bit disappointed that other things didn't come with it.

SPENCER: And that's why it gets so strange when you think about writing a whole novel. Is writing a whole novel one of those things where it turns out, "Oh yeah, it can write a whole novel, but it still can't do all these other things"? Or, by the time it can write a full novel that's excellent in a particular style, can it now take over the world? Can it run companies? Can it replace us in everything we do? I think that's a huge open question nobody really knows the answer to.

PAUL: That's a great question. I would think that once it could write a novel, it can do anything: take over companies, take over the world, kill us all, persuade us to do anything. But I think the physical world poses certain surprising challenges, maybe because part of it is robotics and not AI at all. In order to build a robot, you have to build a heavy machine that can stand upright and move around and so on. But certainly, I think once it could write a novel, there are very few things intellectually that would be beyond its capacities.

SPENCER: But wouldn't you have said the same thing 10 years ago, if you'd been thinking about the future and someone said, "When an AI can write an essay on philosophy, surely it will be able to take over the whole world"? Well, now it can write an essay on philosophy, and it's decent, better than your average undergrad's.

PAUL: Yep, yep. Would I have expected us to already be dominated by it? Maybe. I have to say, and I'll be honest about this, the ascent of these AIs has been the biggest intellectual surprise of my life. If you had asked me one week before the release of ChatGPT when we would get something like it, I would have said 10 years; it was science fiction-y stuff. Back then, during that week, people could still ask, "Will we ever get a machine that could pass a Turing test?" Maybe, maybe not, whatever. And then things happened incredibly fast. Now we live in a science fiction world where I can reach for my phone and have a conversation with a super intelligent being, which is amazing. I have 1-800-ChatGPT, which I use as sort of a demo in class. I was just goofing around (this is a very narcissistic story), and I said, "I'm Paul Bloom. Of all my books, what's your least favorite?" I thought it would say, "I don't want to judge," but it said, "Just Babies." "What? Why?" And it said, "It's a wonderful book" (that's the compliment), "but it's kind of derivative and not as imaginative as so many of your other works." And I got in a big fight with it, saying, "You're missing stuff." It's amazing we can do this. I worry that sometimes we're losing our sense of wonder about it, that we are now thrown into the stuff of science fiction. In between being worried it will destroy the world and believing it will bring about paradise, we should just marvel.

SPENCER: It is truly astounding. I would just say to the listener: if you're not finding ways to use it in your life, I think you're really missing out, because there are tremendous ways you can use it for almost anything you want, as long as you treat it as a tool, something you have to actually practice at and understand, not something you can just use perfectly on the first try.

PAUL: Yep, agreed.

SPENCER: I saw someone post on Twitter about how they tried using it for something very simple, like reorganizing something alphabetically, and it made a mistake. Then they're like, "Look at this ridiculous technology. It can't even do this one thing right." I'm like, "Yeah, there are a hundred thousand things you could have asked it to do, and it has done an amazingly good job."

PAUL: Yeah, I'm underwhelmed by those people. I think the hallucinations and mistakes are really worth knowing about; they tell you something interesting about how it works. But then people jump to say, "Oh, it's nothing. It's just crap." It's in some ways like self-driving cars, which occasionally cause accidents and more than once have killed somebody, but if you look at the number of miles they've driven, they do a lot better than people. It's not like people don't screw up, fail to alphabetize, and miss basic things.

SPENCER: I saw a funny tweet, I think from Yudkowsky, about how one sign that your daughter might be an LLM is that she struggles to multiply seven-digit numbers in her head. I'm paraphrasing, but the expectations we have for it are so high. [laughs]

PAUL: Exactly.

SPENCER: Paul, this was such a delight. Thank you so much for coming back on the show.

PAUL: This was a hoot. Let's do it again. I really enjoyed it. Thank you, Spencer.

SPENCER: That's great. Thank you.

[outro]

JOSH: A listener asks: "What do you think the rationalist movement is most wrong about?"

SPENCER: Hmm. Well, it's not necessarily wrong about any one particular belief. But I think that sometimes rationalists can be over-zealous in thinking, "Oh, everyone's totally irrational; therefore there's all this low-hanging fruit where you can just go in and solve a problem." I think that can be a bit naive. Often, when there seems to be low-hanging fruit and everyone seems to be behaving irrationally, there are forces that are not that obvious and make it difficult to pick that low-hanging fruit, and people are not quite as dumb as they seem. While I do think people often succumb to cognitive biases, and I know that I do, for sure, and I do think people often think about things in unreasonable ways, I think that people's natural behavior, even if they can't explain it, is often a bit more rational than it might seem if you're just analyzing it from an armchair. I think rationalists sometimes miss that.
