with Spencer Greenberg
the podcast about ideas that matter

Episode 106: How meanings get words and social sciences get broken (with Literal Banana)

May 26, 2022

How do meanings get words? What is ethnomethodology? Some attempts at defining words are successful; but why do some words seem to become more slippery the more we try to pin down their meanings? What sorts of problems uniquely plague the social sciences? What subtle aspects of the placebo effect are not noticed or easily forgotten by researchers? How can social science researchers clarify and strengthen the meanings of words in their questionnaires? More broadly, what are some of the less-talked-about ways that the social sciences can become more robust and reliable?

Literal Banana is literally a banana who became interested in human social science through trying to live among them. After escaping from a high-tech produce delivery start-up, she now lives among humans and attempts to understand them through their own sciences of themselves. Follow Literal Banana on Twitter at @literalbanana.

JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Literal Banana about the philosophical foundations of social science, challenges around conducting surveys, and employing skepticism when reading papers.

SPENCER: Literal Banana, welcome. I've been following you on Twitter for a while. It's really awesome to have you on.

LITERAL BANANA: Thank you. Thank you for inviting me.

SPENCER: So the topic we're talking about today is one that's very interesting to me, which is essentially about the philosophical foundations of social science. So how do we make progress in this field, but not just from a methodological standpoint, with all the issues coming up with the replication crisis about, you know, should we lower p-value thresholds? And should we change to reporting confidence intervals and focus on effect sizes instead of whether something's p less than 0.05? But there are kind of more conceptual issues of what we are really trying to do in social science. What are the questions we're trying to ask? What makes a question scientific? And are some of these sort of more philosophical issues holding back the field? So I'm really interested to hear what you have to say about this.

LITERAL BANANA: When I first started interacting with humans, I thought you could just look stuff up about humans, you could just look in the psychology literature and learn things. Then the replication crisis happened. And it turned out you couldn't really do that. So I was in this very confusing place of all the stuff I thought I learned about humans turned out maybe not to be true or to be meaningless in some ways. So yeah, I got interested in things like psychology and behavioral economics, the ones that weren't replicating. But it seemed the problem was deeper than people were realizing. So I was noticing things that I've taken as true and feeling kind of embarrassed that they were kind of silly in retrospect and wondering how did I take that seriously. How did I believe that? And so, kind of digging through the foundations of what I believe to see what else is like poisoned?

SPENCER: Yeah, it's very upsetting when you realize that for years and years you've been referencing some study that hasn't replicated. You're like, I'm thinking about my belief system. Do I have to now kind of unwind it? And how do I actually do that through my whole network of beliefs? So, one thing that you are very interested in, and that I'm interested in as well, is the meaning of words, and in particular, you wanted to talk about how meanings get words. So can you tell us what that means, "how meanings get words"?

LITERAL BANANA: So do you have any hobbies that have a specific jargon for them?

SPENCER: Yeah, sure. So I do mixed martial arts, and I do bouldering.

LITERAL BANANA: So I'm actually not very good on the bouldering lingo. But I think it's a great example of where people are using these kinds of jargony words, but they're clearly doing something with them. They're using them to kind of pick out, or hold, a piece of reality in a way that lets you use it better. And that's a place to ground yourself, or to start, when thinking about words, because we're going to get into thinking about really abstract words. My little hobby is spinning yarn, like processing wool and stuff like that. And there's a lot of jargon in that. But it's jargon that helps you see something specific and solve a problem or communicate with people about work.

SPENCER: So what's an example of a yarn-related word?

LITERAL BANANA: So when you're pulling the wool that you're about to spin off of the hackle, you use a diz. It's kind of like a little button-shaped thing with a little hole in it to pull the wool through, dizzing off the roving. So there's a bunch of terms. They're kind of old terms because spinning is an old craft.

SPENCER: It makes it sound like you're just making stuff up, honestly, like you're just saying.

LITERAL BANANA: So the hackle is the sort of spiky comb thing that you comb the wool onto. The diz, as I said, is a little button thing to pull it out. Roving is one preparation of wool before you spin it. So if you're wanting to learn more about spinning, it's a great time to learn because there are so many videos on YouTube, but it helps to know the jargon. It'll help you understand how things work and how to pick things out, and even properties of things, not just objects, but properties like the twist of the yarn, different kinds of plies, like how the yarn is shaped, how it's constructed. You can talk about that in a pretty advanced way, but it helps to have the jargon, and the jargon is always pointing to something in the world. It's not some abstract concept exactly. It's a feature of reality that you can literally point to. I think pointing is an important function of words, and I'm kind of influenced by ethnomethodology. That's a big word. It's a field of social science that is kind of small, but I think it has a really good take on things.

SPENCER: So what is ethnomethodology?

LITERAL BANANA: It is a way of trying to understand contexts, I would say. The ethnomethodological way of studying a group is to try to embed yourself in it and try to figure out how it works. And you're never gonna have a complete account of it. You're looking at it, seeing how people are doing the things they're doing. Instead of starting out with an abstraction and then imposing that onto reality, you're figuring out what methods people are using and what abstractions they're using from the people themselves. This is something that was pioneered by people in the 70s. Garfinkel is probably the most important one. There's a modern guy called Ken Liberman [inaudible, 05:47]. Lucy Suchman. They do really creative things. One of the best studies is on coffee tasters: how people who are professional tasters of coffee, trying to help big coffee buyers buy the right coffee, arrive at a description of the coffee. It's not just very general; it includes transcribing their actual conversations with the intonation of the conversation, like how they're arriving at a note that they're finding in the coffee. And I think the idea of pointing, of words as picking something out so you can share it with someone else, is an important concept.

SPENCER: I guess you could imagine that when people are first learning to do different techniques with yarn, someone invents some new technique or some new device that's helpful. And then other people want to use it. So they have to have some way to refer to it so they can talk about it, and they come up with a term for it. And then you can also imagine some conceptual drift, where maybe what that object is, its exact structure, changes over time, or what that method is changes over time, so that now you can get these weird edge cases like, is that really that technique? Or is it really that object or a different one? But still, it doesn't seem that confusing, because at least we're talking about something in the actual world, right? We can all kind of say, oh, that's the thing we're talking about. And then things can get a lot more abstract when you're talking not about things in the world, but you're using words for ideas. So do you want to talk about that?

LITERAL BANANA: Yeah. And obviously, we have all these abstract words, and they're good for something. We use words like trust, compassion, and risk aversion. Well, that last one's maybe sort of specialized, but these abstract terms are useful for something, just like the specific jargon is useful. They're useful, I think, within conversations. There are examples of jargons that are not exactly about making or doing something, but that are more spiritual, like religious jargon, perhaps pointing out something that is a shared experience between people, but not exactly in the world. So that's kind of an interesting case. To me, I think they're probably still using the jargon to do stuff in the world because they're having social relations, and they're having institutions and maintaining those. But it's a more abstract case. And I think that the psychology and economics use of abstractions tends to do two things. One, it tends to imply a kind of universal meaning for abstractions: that there's some kind of true meaning, not in any particular conversation or language use, but just in general, that we can just know what trust means or what compassion means, or generosity or something like that. And the second mistake I think they make is giving degenerate meanings to words. So trying to measure abstractions, like measuring generosity. We might measure them with a survey or with an economic simulation game like the trust game or the Dictator Game or something. And then those abstract nouns kind of get equivocated on. Are we talking about this methodological property from this experiment? Or are we talking about the term we use all the time, this kind of rich, nuanced term? And I often see in papers that people equivocate between these two meanings: first they're talking about trust as in the difference between the groups in how much they donate in the public goods game or whatever. And then they'll switch to talking about trust in general. So I think there are a couple of ways of using words, I would say wrongly, or at least not helpfully, that differ from the usual case of language, because I think conversation is really good, and language exists for a reason. But I think it can be abused.

SPENCER: So let's use an example. A common thing that psychologists want to talk about is happiness. And I think happiness might be especially prone to this because we all, on some level, know what happiness is, in the sense that we experience happiness. But then as soon as we start talking about happiness, there's kind of a slippage and it's like, well, what do we really mean? Do we mean an evaluation of your life, like life satisfaction, where you ask someone, overall, all things considered, how satisfied are you with your life? Or do we mean moment-to-moment positive experiences, where you ping people at random times and ask, how do you feel right now? Or do we mean stuff around the quality of different attributes that we think make a good life? Like do you have meaningful relationships? Do you have first dates and so on. And so already, as soon as we think of it as an abstract thing, happiness, it starts becoming quite unclear what we're referring to. And then when you go to measure happiness, maybe you use some particular scale, right, like the PERMA scale, or a life satisfaction scale, or whatever. And now we have this additional slippage between this abstract concept and the scale itself. And people want to conflate those and say, well, this group's happier than that group. And really, what they mean is that this group answered these questions and got a higher average score than this other group. And then they're kind of equating that with happiness.

LITERAL BANANA: That's a great example. Yeah. Often these abstract nouns are used to mean "it gave this result on this particular survey." Conscientiousness, for instance, is one of the Big Five personality factors. So when we're looking at abstract nouns measured by surveys, everything that the noun can mean has to be in the questions. And often these questions are extremely vague and subject to a lot of different interpretations. And we'll never know how the people answering the survey interpreted them in their minds. We just have to hope that they were all about the same, that they all kind of meant the same thing. For instance, the conscientiousness questions are often things like, are you a hard worker? Or do you finish things when you start them? And for every single question, you could ask, compared to what? If someone has a lot of hard-working people around them, they might think that they're relatively less conscientious. They might say, oh, I'm not a hard worker compared to all these guys. But if they have lower standards, they might rate themselves higher. As for the idea that you can measure this underlying property of conscientiousness, I'm not sure that's suitable for things like self-report surveys. Not that other-report surveys do much better. They often don't really correlate with the self-report. And even if it's an other-report survey, there's still a person answering, and we don't know what that question meant to them.
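
One rough way to picture the "compared to what?" problem is a toy model in which a respondent rates themselves relative to the people around them. The following is only an illustrative sketch, not a model taken from any study mentioned in the episode; the function `self_rating`, the effort numbers, and the five-point scale are all assumptions made up for the example.

```python
def self_rating(own_effort: float, peer_efforts: list[float], scale_max: int = 5) -> int:
    """Hypothetical respondent answering "Are you a hard worker?" on a 1..scale_max scale.

    Assumption: the respondent judges themselves only by the share of
    peers who put in less effort than they do.
    """
    share_outworked = sum(p < own_effort for p in peer_efforts) / len(peer_efforts)
    return 1 + round(share_outworked * (scale_max - 1))


# Same underlying effort (made-up units), two different reference groups.
effort = 50.0
hard_working_peers = [55.0, 60.0, 65.0, 70.0]
relaxed_peers = [30.0, 35.0, 40.0, 45.0]

rating_among_hard_workers = self_rating(effort, hard_working_peers)
rating_among_relaxed_peers = self_rating(effort, relaxed_peers)

print(rating_among_hard_workers, rating_among_relaxed_peers)
```

Under these assumptions, the same person lands at opposite ends of the scale depending only on who they compare themselves with, which is the kind of hidden interpretation the questionnaire itself cannot reveal.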

SPENCER: Right. By other-report, I assume you mean, like, a friend or family member?

LITERAL BANANA: Oh, yeah. So some kind of an informant report. It seems like you're getting around the survey question, making it objective, but I think you're just passing the buck to this other survey taker. And that's true for instruments that are essentially surveys answered by somebody else. Some of the depression scales are directly a survey, but the doctor's answering for you. So it's extra removed from the person's experience. It's kind of, what do you seem like to this other person? So I don't think that makes it more objective. I think the survey problems are still there.

SPENCER: When I think of what is actually going on when someone answers a survey question, I think of it as this multistage process. First the person reads the question and forms a concept of it in their mind. So say the question is, you know, how organized are you? And then there's some scale like very much, moderately, somewhat, a little, not at all, or something like this, right? So they read the question, they form some concept of it in their mind, then they run some kind of search procedure in their mind, like searching for an answer to "how organized am I?" or whatever concept they converted it into. And then that search procedure produces a result that's some degree of strength, like, oh, I feel like quite a bit, or whatever. And then they have to convert that feeling back into the answer options, right? So whatever that feeling of how organized they are from that query, they then have to map it onto, okay, I'm gonna put that in the "quite a bit" answer option, because it's closest to my feeling. And you can imagine this failing at any of these stages, right? So when someone reads it, they could get the wrong concept from it, right? Maybe if they didn't know what the word organized meant.

LITERAL BANANA: Yeah, there are so many moving parts that all have to be working for the survey to be meaningful. And things like pain scales: to answer how you are from one to 10 on the pain scale, you have to kind of picture the possibilities of human pain. I don't know how well I could do that. And of course, everybody has their own idea of what's a 10 pain, or what's a one pain. It's something that people actually do use, at least in hospitals, to try to figure out how to deliver pain medication, but I think it's a low-fidelity process, where people answer vastly differently, based on what we'll never know, about what subjectively might be the same thing.

SPENCER: I think these are valid critiques of survey methods, but I don't view them as devastating because I think there are still ways that this can be useful. For example, if one person is rating themselves on a one-to-10 pain scale, and they're doing it multiple times a day, you could still say, well, I was at a four and now I'm at a five and then I went down to two or three after I took this medication. And that still seems meaningful. Even though we can't say that that person's pain scale maps onto another person's, and we can't say exactly what a three is versus a four, we can still say, well, this person's rating went up and then it went down, and that actually tells us something.
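
The within-person argument can be sketched with a toy model. This is an illustration under a strong, made-up assumption, namely that each person reports their felt pain shifted by a constant personal offset; the names `reported_pain` and `changes` and all the numbers are hypothetical.

```python
def reported_pain(felt_pain: int, personal_offset: int) -> int:
    """Hypothetical mapping from felt pain to a 1-10 report, clamped to the scale."""
    return max(1, min(10, felt_pain + personal_offset))


def changes(series: list[int]) -> list[int]:
    """Differences between consecutive reports from the same person."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Two people with the same felt pain over time (made-up values),
# but different personal ideas of what the numbers mean.
felt = [5, 6, 3]
reports_a = [reported_pain(f, 2) for f in felt]   # [7, 8, 5]
reports_b = [reported_pain(f, -1) for f in felt]  # [4, 5, 2]

print(reports_a, reports_b)                    # raw reports disagree
print(changes(reports_a), changes(reports_b))  # within-person changes agree
```

In this sketch the raw numbers are not comparable across the two people, while the ups and downs within each person are. That only holds because the offset is assumed constant; if a person's use of the scale shifts after treatment, the change scores would be distorted too.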

LITERAL BANANA: Yeah, it's not nothing, and I don't think that there's never a possibility for surveys to be meaningful. What I advocate is just to be much more suspicious of not just surveys, but a lot of different methods. There's so often some problem with a specific implementation that it makes sense to be very skeptical in general. Different pain studies may have different problems. One thing is that a Likert scale seems to be a lot easier to get a result from than a binary kind of thing. So data that's basically continuous, like on a one-to-10 scale, seems to be easier to get a significant result from than something that's binary. And there's a question of how much information you're encoding in those one-to-seven scales, like political beliefs, or pain or happiness or whatever. There's a risk of noise mining. Every study is going to be different, but I think there are many, many situations where they get a positive result on a survey in terms of change, like you're talking about, change in the scales, that I don't think is real. And it's not that that could never be meaningful, but that there are a lot of potential issues and problems with any specific case. Things like placebo results I'm very suspicious of.

SPENCER: What do you mean by that? Do you mean, when there's no placebo control?

LITERAL BANANA: Placebo control, I think, is a different thing from what's considered the placebo effect. I think a lot of people's conception of the placebo effect is that believing something can work can make it work. And I think you have to distinguish that from other things. One thing to consider is, what is the activity of answering a pain scale, answering a survey, or something like that? I think it's communicative. You're trying to communicate something. If you feel like whatever you had to communicate has been communicated, and maybe you've been treated in some way, then you no longer have the same motivation to express your feelings of pain or feelings of whatever it is.

SPENCER: Right. So there's a danger that when you run a study where you have an intervention, what you're actually changing is not the underlying phenomenon of interest, let's say how much pain someone is in, but rather the way people report that phenomenon. You're causing people to report less pain, rather than to be in less pain.

LITERAL BANANA: Yeah, and one thing I often see is surveys correlating with other surveys. So rather than necessarily measuring the property that they're trying to measure, maybe they're measuring how people answer similar survey questions similarly.

SPENCER: Yeah, I've seen this with things like looking at whether neuroticism, the Big Five trait, can predict anxiety disorders. And the answer is yes. But then when you look more closely, you're like, well, how is neuroticism measured? Some of the questions that are asked on typical neuroticism personality scales actually look a lot like the kinds of things that you might answer if you're taking the GAD-7, which is a measure of generalized anxiety disorder. And so you start saying, well, maybe there's a circularity here, where part of what neuroticism is, is the same thing as reporting how anxious you are.

LITERAL BANANA: That's a great example. Yeah, absolutely. Similarly, things like conscientiousness; that's a particular one that I've looked at a lot. There was one study that found a big effect size for conscientiousness on grades, but it turned out that the way they were formalizing conscientiousness was not the normal way. It wasn't the normal Big Five inventory. It was a few questions they decided were related to conscientiousness, but they were very much about how you did in class, like how well you're able to study in class and stuff like that, and I can see that correlating with grades a little bit. But it doesn't necessarily measure conscientiousness, or at least it's not the full picture of conscientiousness, and they would probably get a much lower correlation with the full inventory.

SPENCER: Yeah, conscientiousness is famously related to job performance as well. But sometimes it can become dangerously close to circularity, where I've seen some conscientiousness scales where they ask questions like, do you work hard? Do you always perform well? And things like this. And you're like, well, it's almost like asking someone how good they are at their job. I mean, it's not exactly that. But it's starting to become very unmysterious why that might be correlated with someone's work performance, because it's almost an aspect of work performance.

LITERAL BANANA: I saw something recently; I think it's still a preprint. It was a pretty big study, I think about 3,000 subjects, on the Big Five and socioeconomic status indicators. One of the things they looked at was conscientiousness, which should predict educational attainment, and it predicted it at 0.04. They managed to get it up to 0.06 with some variance adjustments. They say they were kind of surprised by this, like the scientists were surprised to find this extremely low correlation. You'd never be able to notice that in reality. And they were saying that possibly, in the past, that has been a matter of degrees of freedom sort of [inaudible 19:50] things, that smaller studies have managed to get bigger correlations by doing methodological tricks. Even when they're asking the same thing, I'm not sure that in large populations they necessarily predict each other. But the results kind of look worse and worse.

SPENCER: Yeah, with correlations that small, they're almost never relevant. I think there are some exceptions. Let's say you had a new treatment for a kind of cancer that's untreatable, and it had a correlation of 0.1 with people surviving. That still might be worth it if it was not too expensive, right? Because it's your only chance of survival. Yeah, give me that treatment, right? That's because that's the most serious thing. It's literally life and death. But for almost anything else, yeah, a correlation less than 0.1, you know, basically, it's not meaningful.
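
To get a feel for how small these correlations are, one common yardstick is the squared correlation, the share of variance in one variable that a linear relationship with the other accounts for. The snippet below is just that arithmetic applied to the figures mentioned in the conversation (0.04, 0.06, 0.1), with 0.5 added for contrast; the helper name `variance_explained` is made up for the illustration, and r squared is only one of several ways to judge practical importance.

```python
def variance_explained(r: float) -> float:
    """Share of variance linearly accounted for by a correlation of r (r squared)."""
    return r ** 2


for r in (0.04, 0.06, 0.1, 0.5):
    print(f"r = {r:.2f} -> about {variance_explained(r):.2%} of variance")
```

A correlation of 0.04 corresponds to well under one percent of the variance, which fits the remark that you would never notice it in everyday life. As the cancer example shows, though, a small correlation can still matter when the outcome is serious enough.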

LITERAL BANANA: And I think there's a way in which, even in the studies that have really tiny correlations, the authors tend to tell the story as if they had a bigger correlation, as if it would actually affect something you could observe in reality, in a magnitude that you could observe. And that's generally not the case. It seems misleading to me.

SPENCER: Right, to say, like, this leads to that, as though "leading to that" doesn't mean a correlation of 0.5, right, as though it's some kind of deterministic thing. Right?

LITERAL BANANA: Exactly. And then you'll see, you know, tons of tweets about, oh, this is exactly true, I know this from my own life. And I think that's way more meaningful. Like, if you think something's true because of what you've observed in the world, that probably means something, especially if I trust you. But yeah, I think it's almost like an excuse to tell a story when it's not really appropriate to do so, when it's not merited.

SPENCER: A correlation of 0.005 is essentially suggesting that the story is not true, right? Like, that's actually evidence against the deterministic story, not in favor of it.

LITERAL BANANA: I've seen a methodology recently where they do a Bayesian analysis, often a fancy preregistered Bayesian analysis. And I saw one recently on ego depletion; they're still trying to replicate ego depletion. That's the idea that you have only a certain amount of executive function, and if you have to make a hard decision, you get stupider, or you make worse decisions. It's pretty much a dead theory that has failed to replicate. But there were two attempts recently. The first one was a Bayesian analysis, and they found no effect. But then they went in and did a non-preregistered, sort of "here's what we'd get if we were being naughty" analysis, and they found an effect size of 0.08. And there was another study that wasn't Bayesian, and they found an effect size of 0.1, very similar in size, but they didn't reject it based on their analysis. So I wonder if Bayesian analysis could help with the small effect sizes, to see if there's evidence for the null hypothesis.

SPENCER: It's funny you bring up that example, because I wrote a really long essay on self-control, and I talk about ego depletion. And I basically think that a lot of what's going on there is a philosophical issue. So suppose someone said to you, after you make lots of decisions, you can get exhausted and bored, and that makes you put less effort into future decisions, right? Anyone who's had to plan lots of logistics, or had to throw a wedding, can probably tell you, you know, maybe with the first question about the napkin rings, okay, maybe you're paying attention, but after four hours of answering questions, yeah, you're probably not going to do a very good job at decisions. And I just think that would be totally unsurprising to people. Almost everyone would be like, yeah, of course that happens, right? Whereas let's say the claim is, well, we have this special variable in our minds, which is our level of ability to make these kinds of decisions, and as you make decisions, it goes down, and then we run out of it. Well, that's starting to sound very interesting and cool. But is it really different from that first thing, and how can we tell the difference?

LITERAL BANANA: Yeah, and I think there are two ways you could go. One is, you have an exciting and implausible result, a counterintuitive result, that gets a lot of attention. And those often turn out to be fake. But the sad thing is when you get a really plausible, boring result that still turns out to be fake. It's so believable, it's not positing any kind of counterintuitive situation, but it still turns out to have been developed with the same methods. So it's like, you can't even use the plausibility of the result as a guide to how much you should take it seriously.

SPENCER: Well, I don't know. I mean, in this particular case, I think there just aren't clear enough distinctions about what we're really talking about. It's clear people get tired, it's clear people get bored, it's clear people get hungry if they don't eat for a long time. These are all reasons why decision-making ability and self-control will fall over time, you know, all else equal. So those clearly exist. And then, in my opinion, a lot of the studies just haven't distinguished between all these things carefully enough to really say whether they've measured something that's different from the obvious things. And then when you get these failures to replicate, a lot of times, if you look closely, the failures to replicate are so specific, you're like, I don't even know that I would predict, like, even if you go to [inaudible 24:43]. Is that really the kind of thing that we should expect to happen? In other words, it'll be like, okay, well, you give people one task where they have to find the number of e's in a bunch of lines of text, and then after that, they're given a bunch of puzzles, one of which is impossible, and we see how long they work on that puzzle before giving up, and then it fails to replicate. I mean, this is basically a real study I'm describing that failed to replicate. Okay, what does that really prove? Has that proven there is no ego depletion? In my opinion, it hasn't. Because that is so specific, and it doesn't seem to me obviously that connected to the fundamental concept we care about.

LITERAL BANANA: Yeah, the reason I think the failures to replicate matter is because those are the paradigms that the original claims were based on, the claims that we have isolated this phenomenon. And often, it's not the claims that I object to at all. I don't think that the claims are more or less likely to be true just because it failed to replicate, I mean the claims in general terms that we could get from common sense, like people get tired. I think it's the methodology that we have to be suspicious of. As you say, having someone count how many e's are in a particular text, and then maybe, in a different study, choose between a salad and a piece of cake or something: what is that measuring? That's probably different for every possible subject's context. Can we generalize? I think very few laboratory protocols are any good. Still, they generate results and get attention for the hypotheses based on the methodology. And if you hear the claim, you often don't even necessarily think about the methodology when you hear, or maybe not hear but see, a tweet about a recent scientific finding. What I think is the ideal situation is if you think, well, what's that based on? What was the methodology there? And can I look at this with a straight face? Like, does this make sense as something that you could generalize to this abstract noun you're talking about?

SPENCER: Yeah, I think this idea of abstraction is really important. My preferred way to read papers now is I read the abstract just to get a sense of whether this is a paper I'm interested in. And if it is, I immediately jump to the methods section. I read exactly what they did, what exact questions they asked, what exact stimuli they used. It's very frustrating when they don't include the exact stimuli or exact questions. More and more people are, which is a great thing, but still, lots don't. And then after I read the methods section, I jump to the results section to see, okay, what exact statistics did they calculate, and what exact values did they get for those statistics? To me, that's actually the evidence I want to update on. The interpretation around it, I find, often actually makes things more confusing, because what they actually measured is they asked these three questions, and what they actually calculated was this particular, you know, ANOVA. But then when they talk about those, what they're saying is, well, we measured self-control, and we calculated whether self-control was affected by so-and-so. And it's like, well, sort of, but we've lost a lot of information in the conversion from "you asked these questions and you got this statistic" to, you know, "we showed that self-control was depleted."

LITERAL BANANA: Yeah, I think the loss of information is a really important factor: both losing information in going from very specific findings to general claims, and also losing information about the subjects and what they actually thought about the situation, because we'll really never know how they interpreted it. It's become more common to ask sort of exit questions, just to try to figure out if people discovered the purpose of the study, but I don't know how much that really gets. I think that's still not very much information. We don't really know how people interpreted the question, if it's a survey. Or, if they were playing an economic simulation game, were they paying attention, or were they just bored? And on a deeper level, the properties that we're claiming to measure, are those realistically measurable by the method that we're using?

SPENCER: One of my favorite, quite underused methods in social science is to ask people how they interpret questions. And usually you do this in piloting before you finalize your questionnaire. It's very time consuming, which is annoying, but it's extremely valuable. And I've had just some startling realizations when doing this. I can't remember if I've ever mentioned this on the podcast before, but one time we were developing a question for the sunk cost fallacy. And we asked people, okay, imagine you're eating at a restaurant, and the waiter brings your food and you start eating it and you realize not only are you already full, but you actually don't like the taste of the food either. Would you continue eating it? And most people said they would. And then we said, aha, you know, we measured the sunk cost fallacy. But, you know, trying to be careful, we asked, can you please explain your answer? And what a lot of people said is that the reason they'd keep eating is... do you want to guess, actually?

LITERAL BANANA: Social pressure, feeling polite, something?

SPENCER: Yeah, because they felt like most likely they would be with another person, and they would view it as awkward or impolite to not eat any food at the table. So then we changed the question. We specified, okay, you're eating alone, right? We made it really clear. Now, a bunch of people said they'd be worried about the fact that the chef spent all this time preparing the food and then they didn't eat any, and they'd feel bad. So in order to get this question to work, we ended up going through like four or five versions of it, where we kept adjusting it and asking people to interpret it, and ended up coming up with a version that kind of avoided these issues. But you could see how easy it would be to stop after the first one: okay, I've measured the sunk cost fallacy, I'm done. Right? And in fact, what have you really measured? You've measured something completely different than what you thought, something about feeling social pressure.

LITERAL BANANA: Yeah. And I'm glad to hear about how careful you are in that situation. I think that's probably not terribly common, though maybe it's becoming more common. I'm always really interested. My favorite kind of survey to read about is when people do exactly that. I'm interested in reading what people thought of these questions when they have people take a survey. For instance, the Conflict Tactics Scale is one that I think is really interesting here. The Conflict Tactics Scale is supposed to measure abuse, like domestic abuse, but instead of just saying, "Have you been abused?" the idea is, well, people don't know if they've been abused. So you ask them specific questions about specific acts. And they sound like really serious abuse, like have you been kicked or have you been bitten or something like that, and that counts as severe abuse. But when they actually interview people about what they meant by it, very frequently, a kick will be something like a kick in bed while play wrestling, or something a lot racier and less sad than abuse. So probably the Conflict Tactics Scale massively exaggerates the level of abuse. And I only know that specific reason because people have bothered to do studies kind of interviewing people about what they meant by this.

SPENCER: That's a great example.


SPENCER: I've also looked into a kind of a similar question around what percentage of people have experienced sexual assault? And as soon as you start thinking about the question, you realize, actually, that's a really tricky thing to measure.

LITERAL BANANA: Yeah. How do you define it?

SPENCER: Yeah, what do we really mean by it? And then even if we figure out what we mean by it, how do we ask in a way where it's clear what we mean? So for example, one method I was looking at was a survey some people ran. I noticed they had a question that was something like, have you had sex when you were, like, manipulated or something like this? Or maybe they said when you're drunk? And it's like, okay, you could definitely see how that could be assault, right? Like with a stranger. On the other hand, let's say someone's in a 10-year relationship, and every Friday, the two of them get drunk and have sex. Most people would probably not think of that as assault. You know, maybe some would, but it gets really tricky when you get down to defining what you mean.

LITERAL BANANA: Yeah, one of the scales that's used a lot is the sexual experiences scale, SES. And that one includes things about rape and stuff like that. But it also asks things like, have you engaged in some kind of sex because someone lied to you about how serious the relationship was? And that might be kind of a jerk move, but I don't think that's sexual assault. Different people may define it in different ways. But a very common tactic is to lump very severe stuff with very minor stuff, and then call it all the same thing, just have it all under the same heading, the same abstraction.

SPENCER: Something I think is not going to happen, but I actually think would be really beneficial, is if we had a way of adding numbers along with words about their severity, like assault one, assault two, assault three, up to assault 10, or something like this. Because so often people will say a word, and then it's just not that clear what they mean. And there are different versions of the thing that range from, like, incredibly bad to, like, only a little bad. You know, assault one could be, like, you throw a tiny pretzel at someone or something, right?

LITERAL BANANA: Practically, that's assault. Yeah.

SPENCER: Yeah, like assault eight could be you throw a brick at them, and assault 10 could be you throw a grenade at them. Right? And it doesn't really make sense to call those things, you know, physical assault, and just leave it at that. What I wish they would do on the surveys when they try to measure things like what percentage of people have been assaulted, or whatever, is give a bunch of different definitions. They'd say, well, if we define it this way, this is the percent we get, and if we define it that way, this is the percent, and try to make it clear. But often, what they end up doing is just rolling it up into one number. And it actually takes, you know, 40 minutes of really careful reading to figure out what that number even means.

LITERAL BANANA: Exactly. And most people just want to use the claim for their own purposes. I think one thing I hope to promote here is how fun it is to dig into the methodology. It's a really great hobby, as you'll never run out of suspicious scientific findings to investigate. It's kind of like little mysteries every time, trying to figure out, you know, how did they come up with this? And what does this mean?

SPENCER: Yeah, a process that I've found super helpful is just randomly sampling papers from top journals, and then kind of studying them. And the reason sampling is interesting is that it can give you kind of a flavor for a field. Like, what's going on in this field? What sorts of methods do they use? What are the sorts of mistakes that they make, or the kinds of conclusions they draw? And I feel like I've learned a lot about social science by doing that.

LITERAL BANANA: Yeah, I agree. Same.

SPENCER: Now, we talked about surveys and some of the challenges with them. But I think there was a point you wanted to make about the context of a survey, and how that's sort of different than the normal context that we communicate in?

LITERAL BANANA: Yeah. So in conversation, when you don't agree on something, like you're talking about the severity of something, if there's a misunderstanding, you have a chance to work it out. If people are talking about different things, they can figure out how to be talking about the same thing. They can realize there's a problem. But in surveys, there's no opportunity for that. And it's actually a lot of things, at a lot of different levels, including the context of the actual interviewer, if it was a face-to-face interview or a phone interview. Often surveys are online. That's a major part of that context. How is that different from everyday life? I think surveys are this special little thing that doesn't really have a counterpart in everyday life. It's not a conversation. It might be kind of communicative, like people might use surveys to communicate something. But it's not really like anything else. And one that I think particularly illustrates this: back in 2006, there was a study called, at least in the heading, "Social Isolation in America." And they claimed to find, based on the General Social Survey, a major decline in how many friends people had, how many close confidants they had. And this study has 3,000-plus citations. It's still being cited to this day; I spotted many citations to it this year. But people started questioning what the statement really meant. The task that they were basing it on was, okay, name your friends, basically, name your close confidants, and they kind of check off how many first names or initials you could give. There are actually a lot of reasons why you may or may not tell a person on the phone who your friends are. The study ended up having interviewer effects, meaning some of the interviewers were just terrible. So one reason you might just say, "No, I don't have any friends," is because you want to get off the phone with the person quickly.
And many of these interviewers had the majority of the people they contacted say they had no friends. So I think 27 out of 28 of their interviewees had no friends. That's something I wouldn't have thought of, except that, you know, these people kind of went back and did a bunch of math on it to question kind of a surprising finding. Often that'll come down to there being some difference in the interviewing environments. There are differences between phone surveys and in-person surveys, and there are differences between online surveys and phone surveys. So yeah, that's kind of what I mean by the context, that there's so much stuff there. It's not just the questions being asked in the ether. There's a specific context. And the person answering is in a particular mood and has things going on in their life and has thoughts in their head. And all those are influencing what happens in terms of the data. We pull out nice numerical data from these things, but the actual process of getting the data is not that clean. I think it's not always good to rely on it as if it were really this sort of mathematical truth arising from just reality.

SPENCER: It seems to me that there are some significant advantages to the online survey context. It's certainly not perfect and has disadvantages too. But I tend to believe that people are very influenced by the person they're talking to. So you can imagine an interviewer giving, like, subtle cues about what's the normal answer, or subtle cues about how they might judge the person based on the response they give.

LITERAL BANANA: Yeah, absolutely.

SPENCER: At least with the online survey, if it's anonymized and the person knows it's anonymized, at least you don't have social pressures involved.

LITERAL BANANA: Except to the extent that people on, like, survey platforms like Mechanical Turk really want their contribution accepted. If you read the forums of people who make that their job or are trying to make money that way, they're almost kind of paranoid. They're in this mode where they definitely don't want to miss, like, a trap question trying to check whether they're paying attention. They're trying to answer, I think, as helpfully as possible to the researchers so they get their contributions accepted. I wonder if that influences how easy it is to get a result. Like, if you can successfully communicate what you want them to say with your survey instrument, they're probably really willing to tell you what you want to hear.

SPENCER: Yeah. And I think in some contexts it's really clear that no matter how they respond, they're going to get paid. And in others it might be a little bit less clear, where they might feel like, oh, I want to follow the rules exactly, because I don't want to get my work rejected. But that actually could cause them to answer differently than they normally would. One thing that we do sometimes in our surveys is we'll actually say very explicitly, "The rest of the survey, you will get paid no matter what you say." Just to try to really emphasize it.

LITERAL BANANA: That's nice.

SPENCER: But you know, with all these issues you bring up, there's a sort of meta issue that encompasses many of them, which is that you have to work really hard to figure out the truth. If you're not being really skeptical of your own research, if you're not kind of looking for ways that your survey might not be measuring the thing you think, it's so easy to just get a nice-looking result and then be like, oh, I guess ego depletion is a thing, or I guess, you know, conscientiousness predicts this, and then not take that extra step and be like, hmm, but have I measured the thing that I think I measured? And what does this really mean? And, you know, could people be responding in a way because I designed the survey that way, and not because it's, like, the actual truth?

LITERAL BANANA: Yeah, and I think social science would benefit from being more adversarial. And not just social science. Adversarial in the sense of expecting to have your work criticized, of expecting to have people try to poke holes in it. Is that even meaningful? What does that even mean? This is something I think that people understand when it comes to, like, parapsychology, like ESP and stuff, and experiments to try to measure that: that we should be very skeptical and kind of assume it's fake unless it has really, really strong evidence that it's real. But I don't think that's really been applied in social science. The replication crisis ended up kind of being an introduction of a little bit of adversariality. And I would like to see that trend continue. I think it would be much better if it was just a brawl.

SPENCER: So I definitely agree that on the margin, we need more skepticism and criticism, and so on. But I actually think that it can introduce its own problems. And the way this comes up for me is, if I'm trying to figure out the truth about something, I often am adapting on the fly based on what I'm learning. So like, let's say I run the study, and I'm analyzing the data. And I learn something in the data, and I'm like, oh, wow, that's not what I expected. If I'm being fully truth seeking, right, like, let's just take that for granted, I'm fully truth seeking, that can actually really shift what I'm doing in the moment. I'm like, oh, I'm gonna analyze this a totally different way than I thought, and I actually need to, like, kick out these outliers in a way that I didn't expect to, and so on. And I think when people are doing a form of science that's like, I need to protect myself from criticism, what they do instead is they tend to do things like, I'm going to use the standard method of analysis, and I'm going to pre-register my result. And those are good things a lot of times, but it also gets you in this pickle where, like, no, what you actually needed to do to figure out the truth is do these things that you didn't expect, and do this thing that's not the standard way of doing it, rather than kind of do everything by the book so you can prove to everyone that you're, you know, uncriticizable.

LITERAL BANANA: Yeah, I think that's true. I think preregistration isn't nothing. It's good to limit degrees of freedom, but also, you're limiting degrees of freedom. So if you genuinely want to study something different, yeah, that can kind of limit you. And I see all the time pre-registered stuff with extra exploratory analysis. I think that's fine. It's very contextualized in that way: you know what you planned to do, and then you also know, here's what we did after the fact, when we looked at our data. That is pretty honest to me. I don't usually put a lot of faith in the results, but it at least seems more honest, better than giving the impression that this is what we planned to do from the beginning, or that we didn't make any particular changes in our methodology. Yeah, I think the procedural things are a solution to some problems. I don't think they're the solution. I don't think you could just have the procedural issues taken care of and automatically have something meaningful. Like, I've seen pre-registered stuff where I don't like the methodology at all.

SPENCER: Right. So to that point, you've lobbed a bunch of interesting critiques of ways to do social science. What is the solution? Like, how do we do science better?

LITERAL BANANA: I think one thing is we have to admit that a lot of the questions we're asking are not capable of any kind of answer. They're too general. The things about just people in general, properties of people's personality, how trusting people are, how risk averse people are: those are almost too general to have an actual answer. When I see things like, "Are attractive people more generous?" the economic simulation game evidence goes in one direction, and the survey evidence goes in another direction. I don't think either one is right, because I don't think that's a good question to be asking. I think it's more a situation of, like, less research is needed. I think that one response is you could just not do that. Just not use those methods anymore that aren't really getting knowledge and answering a question. The ideal situation for me is if it got a lot more specific: rather than pretending to study people in general, actually studying real groups of people, and realizing that things you learned from this group will likely not generalize to others. There may be patterns that pop out. That's kind of what we like to read. I think a lot of the point of doing science is to have stories. That's what we can understand. But I think a lot of the hope of discovering the truth about big social abstractions is just not going to happen. There isn't an answer, because it was never a good question.

SPENCER: Right. It's like if people vary too much, then you can't say, you know, "People like X are like Y." Well, some people like X are like Y. It seems that you want the questions you're posing to be as general as they can be while still being applicable. Maybe you can't say, "Are people like X like Y?" but maybe you can say, "Are people like Z like Y?" Maybe that actually has an answer.

LITERAL BANANA: I think there's something that's kind of hard to articulate. But it's that very few social science results are being used in the world. Like, not much happened in the world after the replication crisis, because those results weren't load-bearing for anything. Nothing depended on them being true.

SPENCER: Well, a bunch of people stopped power posing, which I find kind of ironic, because I think power posing probably works.

LITERAL BANANA: Fine, why not? You have to get good. Yeah, so what would social science look like where people were actually trying to do things in the world and solve problems? It might not look like a laboratory. It might look more like people going about their lives and doing things, trying to start clubs, stuff like that.

SPENCER: Well, we try to apply social science in the world. I don't know if you've happened to check out our website, Clearer Thinking. But we have about 60 modules on there. And many of them are trying to take an idea from social science and help people apply it to an actual thing. For example, we have a tool called Decision Advisor that tries to help people make a big life decision. And what we're doing there is throwing a bunch of techniques at the person while they're making the decision to try to help that decision go better. And so our goal there is very specific: make it so that people are more likely to be happy with their decision than they would have been had they not used our tool. And it doesn't have to apply to every person in the world. It just has to apply to the people who use the tool. So it's sort of like, in the context of people who use our tool in exactly the format of our tool, we want the decision to go better than it would have gone. And so by narrowing the scope to something very practical like that, I actually think that does have an answer. Like, it's not too vague to have an answer.

LITERAL BANANA: That's awesome. Yeah, I think it's possible. One way I think we can see it's possible is that we can hear people around us using certain social science concepts to explain things or to pick out patterns in their lives. I'm thinking about sort of pop psychology concepts like narcissism. People love to talk about narcissists and what their behavior's like. I think that's people using an abstraction in conversation, often potentially in a helpful way. Things like trauma, too. I think trauma theory is kind of cursed, but people seem to be really interested in using that as a way to communicate about their experiences, to conceptualize their lives. And whether those are net beneficial or not, who knows, but people are kind of using those. So even some of the pop psychology concepts that might not be based on experimental evidence, necessarily, to the extent that people use them, maybe they're fine.

SPENCER: But certainly narcissism, as a phrase, provides us with a kind of language to use to talk about certain things, certain patterns we observe in reality. And whether or not the narcissism scales that are out there measure it, you know, optimally, or even at all (I think they do measure it to some degree, but they may miss a lot), it still seems really useful to be like, "That person is narcissistic," and then you're communicating a lot of information. You're not going to communicate perfectly, but some ideas are still gonna be communicated by saying that.

LITERAL BANANA: Yeah, and there's a trend, kind of, of people making videos talking about specific people, public figures generally, and talking about how their behavior is narcissistic, or is some other thing. And so they're kind of trying to work out examples, and people disagree. And I think that's healthy. That's good.

SPENCER: So one phrase that you use is magic as metaphor in psychology. Could you unpack that for us and tell us how it relates to what we're talking about?

LITERAL BANANA: So there are kind of two directions. One is just an example that I noticed that I think is really interesting. So in the old days of magic, and in contemporary magic, for that matter (I'm talking about, like, stage illusions, that kind of magic, people doing magic tricks for entertainment), the classical sort of background hypothesis that you're supposed to have is they're really doing ESP, or they're really moving things with their head, or they're really making things disappear. We all kind of know that's not really true. But that's what the illusion's based on: the magician saying, "Oh, I'm gonna read your mind." There's a British magician called Derren Brown. And he has a really interesting style. His thing is, instead of claiming that his tricks are based on ESP, he claims that they're based on psychology, which is fascinating. According to practicing magicians, he does normal tricks. They're not actually based on psychology. They're based on sleight of hand and misdirection and stuff like that. But the fact that he takes that as his sort of background explanation is interesting to me. And it kind of seems like a diss to psychology, if you think about it: that the ideas of psychology are kind of powerful and magical enough that they can serve as the background explanation for a magic trick. It's very popular. It seems like a lot of psychology classes will go to his shows as part of their classes, as kind of a field trip. But it's interesting to me, and I think a lot of the priming studies, in particular, the sort of most obnoxious psychology studies that tried to show, like, if you do a word scramble with words related to old people, like Florida, then you'll walk a lot slower, those seem like magic tricks to me. Those seem like, I'm going to influence you with only the power of this text.
And I'm gonna make you do this interesting thing, kind of like you're hypnotized on stage. I think they were basically doing magic tricks.

SPENCER: I find that really interesting about Derren Brown. I've come to a similar conclusion. And my suspicion is that the reason he talks about his tricks in terms of psychology is because many people today would not really believe it if he referred to them as magic. But if you refer to them as psychology, people will actually sometimes believe it. And it's actually more impressive to them. So I've actually seen a bunch of people who I know be like, "Oh, yeah, Derren Brown, he does these amazing psychological, you know, manipulations," not realizing that it's just magic tricks.

LITERAL BANANA: I wonder how common that is, people not realizing that it's regular tricks. He is very good, but not for the stated reason.

SPENCER: Right. It makes it kind of more amazing. But I would also say he sometimes does, I think, throw in psychological tricks, which also kind of throws you off the trail, right? Because, let's say there's just one trick that's clearly using psychology, not a magic trick, and then the rest of his tricks are magic tricks, right? But now you're kind of convinced by his claim that he's, you know, just a psychology expert or something.

LITERAL BANANA: That's a good point. Yeah, I like that. So the other aspect of this: there was a famous skeptic magician named James Randi, the Amazing Randi. And in the 80s and 90s, he was the scourge of the woo people, the ESP and faith healing and things like that. And at the time, there were, you know, science labs trying to measure parapsychological phenomena. And as a stage magician, he was looking at this and saying, well, they're just doing tricks. And he did a lot of kind of adversarial activity. He planted his own people kind of undercover and taught them how to do tricks, how to fool these parapsychology researchers, and they were very successful at fooling them. On the other side of that, he would approach claims to have parapsychological powers from the standpoint of someone who knows how to do tricks. So knowing all the possible tricks, he was able to figure out how they were doing these things, instead of just sort of accepting, oh, well, it must be real or something. And he was very successful, because knowing the background of how you might do this trick, in a lot of cases, he was able to figure that out. I think there's a parallel in psychology, that once you've seen a way that a trick is done, you can see it in a lot of other places. And I think psychology probably deserves to be treated more like parapsychology in terms of the adversarial nature of it, of taking the position that there are tricks involved unless proven otherwise.

SPENCER: Right. I think we do have a few James Randis of the social science world that are not always looked on favorably.

LITERAL BANANA: Yeah, definitely. It seems almost impolite to criticize, like, they're working as hard as they can. But I think if you value truth more than politeness, it's kind of hard. They're making claims about truth in general, and those claims are very influential. And maybe it's more important to maintain the quality of the information commons than it is to be polite and to kind of let the researchers get on with whatever they wanted. And I don't necessarily think it's the most important thing to influence researchers. I think the most important thing is to shift the level of trust and the level of sort of unquestioning acceptance of scientific claims in headlines and abstracts, and to shift to a position where it's probably not meaningful unless there's really good evidence that it is meaningful.

SPENCER: I think it's such a sad thing, though. There are so many important questions that we could be answering with social science research, right? Like, how do we help humans have better health, mental health? How do we help humans be happier and have better relationships and less conflict and more productivity? And how do we help people achieve their goals more, make better decisions? Right? These are really important topics. And yet it feels like if we get to the state, which I think you would say we are in already, where with a typical paper you basically just have to assume it's not true until lots and lots of other studies can confirm it, it just seems hard to see how we're going to really make progress on these topics.

LITERAL BANANA: Yeah, I thought people were all excited about a single patient study that came out in news reports about a woman who had some kind of transcranial electrical stimulation to cure her depression, and she's been depression free for a year. But we have to think, like, there were a lot of examples of this in the past that didn't work out that well. And the early studies looked great for brain implants and depression. Once they actually did randomised controlled trials, there was nothing. There was nothing better. So yeah, I think knowing the history of it can help. But knowing the history just makes you more skeptical.

SPENCER: Yeah, I don't fundamentally think these topics are intractable. But it does seem like something has gone off course, and the current methods we're using do not allow us to make significant progress. Like, I do think some progress was made. I do think, you know, there are some people doing really great work. But it does seem like on many of the most important topics in social science, progress has not gone nearly as fast as you'd like.


SPENCER: If you could change one thing about the way social science is done, what would you change that you think would actually lead to better outcomes?

LITERAL BANANA: I would want a shift from theory-based stuff to collecting postage stamps. You know, I'm talking about the idea that biology before Darwin was just collecting postage stamps, was just kind of, you know, looking at little specific things and putting them in your curio cabinet. I don't think that's really been done. I think things like history as a social science are pretty good, because it's often extremely specific events. Generalization, of course, is that history. But things like people trying to study mundane, everyday life in the past in different places and times, I think that's really good. I think things about people in general tend to be much worse. And you can get into just as many problems when you study your own little area of history and then try to generalize it to everything. But I think just studying little specific things is much better, not big abstractions: studying particular people, studying phenomena, traffic instead of compassion or something like that.

SPENCER: Isn't a lot of the value, though, in being able to generalize?

LITERAL BANANA: It would be great. I don't think we're nearly there yet. I think, thinking about what had to be done in biology in order for evolution to sort of pop out of it, I don't think we're even necessarily on the road to that. It's like we are starting with nice, abstract theories without having as much on-the-ground description as we could have. I think we have more of that just outside of the social sciences. Things like people posting their whole life online, that's great. There's a lot of value from that. I would maybe interview every person over the age of 90 in my town and just, you know, ask them what life was like when they were young. That's the kind of thing that I would like to see preserved more than vague abstractions that may not be useful or meaningful in any way, just actual specifics of reality. And one problem is that the specifics aren't necessarily able to be pinned down in words. If you want to try to record some specifics, you have to choose, you know, how to write it, how to present it. And the quality of your job of doing that is going to determine how interesting it is, how useful it is. You probably can't get everything in there. You can't explain the story of a person's life and everything that was important to them in a few pages, but it's at least something. I would like to see people get a lot more specific.

SPENCER: I can see why you'd want to go that way. But I actually don't find that a very inspiring vision. And I don't think that that takes us to a future where we have much better social science. I think, from my point of view, that just lowers the ambition of it to just be descriptive and collect details or something. I'm curious to hear your reaction to this. But my approach to trying to do social science better is to get extremely concrete about what we're trying to accomplish, and on which populations, and then try to actually do that thing in a way that we can tell if it works. So you want to help people form new habits. You think you actually know how to do that? Well, actually build a digital tool that actually tries to get people to form new habits and then see if it works, right? You want to help people make better decisions? Actually try to develop an intervention. And what I like about digital interventions in particular is that they're scalable. Like, if you do it in a study, you can then give it to people and let everyone use it. Right? And, okay, it's true that if someone way outside the original target population uses it, maybe it's hard to generalize, but as long as the people using it are in a similar population to your original research, well, they're using literally the same protocol, because it's exactly the same for everyone. And so the generalizability problem is much smaller, but we don't have to lower our ambitions so much, because we're still trying to do the important things, like trying to help people with the things that they really care about.

LITERAL BANANA: Yeah, that sounds specific. Like, that sounds like you're trying to do something for specific people with a specific outcome. Yeah, I think that sounds like a good start.

SPENCER: Well, yeah, it is. It is specific, I guess. I guess what I'm responding to more is that this doesn't seem like stamp collecting. It seems like trying to do something that creates an effect, as opposed to just describing. Maybe I misunderstood your point, though.

LITERAL BANANA: Yeah, yeah, I think doing something and trying to have an effect is maybe a measure of how real it is in the first place. But your average [study] that's trying to do something is probably more real than something descriptive. You could still see tricks and stuff. Like, it's not necessarily guaranteed. It's still really difficult, probably, to do that. But yeah, I don't have any problem with people trying to develop apps to help people feel better. I think that's right, we need more options for how to help people feel better.

SPENCER: So, kind of wrapping up this discussion: a lot of what we talked about are reasons to be skeptical of papers. And I'm wondering, how much do you think we should trust papers in general? What is the attitude that you go in with when you read a paper?

LITERAL BANANA: I go in with, "This is probably fake. What's wrong with it? What are the main levels of problems with this thing?" I don't go in trusting something. I'm very suspicious when someone tweets something and says this is a great study. It might be, and it might turn out that I like it when I read it. But I go in very skeptical.

SPENCER: So what's the point of reading it then? Because I mean, I agree with you, I tend to go in very skeptical too. But someone could say, well, if you're skeptical, why read it in the first place? You aren't really gonna learn much.

LITERAL BANANA: One, because it's fun. And I don't think we choose what we're interested in. I feel like it's just inherently interesting. And another is kind of to figure out new tricks. Like, a lot of times you'll see the same things over and over. But sometimes you'll see something genuinely new, like some way to screw up that you had never seen before. I find that really exciting.

SPENCER: Oh, my God, it's so depressing. So you read to find new ways to screw up?

LITERAL BANANA: I want to learn the magic tricks.

SPENCER: That's like studying longevity to find new ways to die rather than ways to live long.

LITERAL BANANA: I don't know, I think it's entertaining the way mysteries are entertaining. Like, what's going on here? What has happened here? It's almost like doing a post mortem.

SPENCER: So, like, social science as a car wreck video or something, where you...

LITERAL BANANA: Basically, yeah, and the worst are better. And my favorite ones, the ones that make me the happiest, are the absolute worst ones that are just ridiculous.

SPENCER: See, I do feel like I learn quite a bit reading social science papers. But I often don't learn by reading the analysis by the authors so much as by looking at, okay, what exactly did they do and what exactly did they compute from what they did. And I do get updates from that. Like, a lot of times I'll read the paper and be like, yeah, I can't update anything important on this, because either I don't think they say something important, or I don't think their method allows you to draw any reasonable conclusions that are worth concluding. But quite often, I do get an update, where basically I'm like, oh, they did this thing, and I expected this thing to happen, and something else happened, so now I've learned something about the world, or at least gained some evidence that kind of shifts my probabilities around. It sounds like you don't feel like that's happening for you. Is that right?

LITERAL BANANA: I think that's very rare. I don't think it's 0% of the time. Yeah, I usually think of updating on "here's a new way to do a trick" or "I can't believe that people believe this." Yes, it's almost a compulsion. Like, it's fun to me. But it definitely draws me, and I want to figure out how it works. And it's usually not because I think they have something legitimate to say about reality. But to me, it's fascinating that the social sciences, as they are, are part of our reality, and that people have kind of non-skeptically believed in conclusions that now to me seem goofy, and still do. I feel like the more familiarity you have with it, the more tricks you've seen, the less seriously you're going to take the actual conclusions of those papers.

SPENCER: I can't help but be a little depressed while you were talking.

LITERAL BANANA: It's fun though. No, but I think it has to happen. What I want to know ultimately is what is left, like, what real stuff is underneath all of this garbage. And if there is stuff that's left, I really want to know about it. So I think it's harsh, probably, the way I treat it, but I really want to know what the truth is. I want to know, what misconceptions do I have? What am I believing that's not real, and what is potentially there that it could teach us? And I would feel like I could only come up with positive knowledge from that if people were taking seriously how bad it is, especially shifting from a generally trusting view toward a study to a generally skeptical view toward a study. I don't think the practice of science is helped by people just believing everything they say. I think that's probably the opposite of healthy for science.

SPENCER: There is an interesting tension where some people are total science deniers, and that can create one set of problems. And then some people are total science believers and that can be a different set of problems. And I think some people are reluctant to criticize science because they're worried they're gonna give support to, you know, people who are, you know, anti all medicine or, you know, completely skeptical of science in general.

LITERAL BANANA: Flat earth, right?

SPENCER: Exactly.

LITERAL BANANA: Yeah. And I think being out as being skeptical of some science kind of makes it seem like maybe you're wacky in general. But, yeah, so it's kind of a risk. And I think a lot of the bad science, like this sort of pseudoscience, you can learn a lot from, the same way that I learned a lot from fictitious science: you can learn new ways to fail and screw up. And I really enjoy a lot of pseudoscience documentaries, learning about how they're communicating and how they're making their points, and what things seem to resonate with people, kind of the structure of their claims. I think it's really interesting, as sort of a source of ways you can be wrong.

SPENCER: It's funny, because I think of myself as very pessimistic about social science in many ways. But I'm definitely more optimistic than you are. I think, to kind of try to concisely explain some of my reasons for more optimism: to learn about the world, all you have to do is design experiments where, if the evidence comes out a certain way, you know, the probability of the evidence given your hypothesis is much higher than the probability of the evidence given the hypothesis being false. You know, Bayes' rule teaches this; it tells us exactly what we have to do to get evidence that updates our beliefs. And I just don't think that that is unattainable. Like, I think there's tons of things that can be done in social science today, with tweaks on the existing methods, that actually give you that kind of Bayesian evidence on all sorts of topics. And so I try to go in to do research with, like, what's going to give me that update one way or the other on this topic, and try to take a truth-seeking attitude. And then I think it's actually quite easy, often, to design experiments that give those kinds of updates. And so while I agree with you that there's so many flaws with a lot of the existing literature, it doesn't seem unattainable to me to actually converge towards true things. And furthermore, I really believe that the world has lots of regular patterns. You know, humans are really, really complicated. And psychology is really, really complicated. Every human is different. And that makes this all very challenging. But there are patterns. And we know that there are patterns because we navigate our everyday social life pretty successfully. And we often make very successful predictions about how people will behave. So clearly, our minds are modeling regularities, and part of social science can be about formalizing those regularities, and maybe even uncovering ones that we can't pick up on day to day, but that do exist.
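
[Editor's note: The Bayesian point above can be illustrated with a small sketch. This is only an illustration, not something discussed on the show: the function name `posterior` and all of the probabilities are hypothetical, chosen to contrast an experiment whose result is much more likely under the hypothesis than under its negation with one whose result is about equally likely either way.]

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a yes/no hypothesis.

    prior: probability assigned to the hypothesis before seeing the evidence.
    p_evidence_if_true: probability of the evidence if the hypothesis is true.
    p_evidence_if_false: probability of the evidence if the hypothesis is false.
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# All numbers below are made-up assumptions for illustration.
prior = 0.5

# Informative experiment: the result is 8x more likely if the hypothesis is true.
strong = posterior(prior, p_evidence_if_true=0.8, p_evidence_if_false=0.1)

# Uninformative experiment: the result is nearly as likely either way.
weak = posterior(prior, p_evidence_if_true=0.5, p_evidence_if_false=0.45)

print(f"informative design:   {prior:.2f} -> {strong:.3f}")
print(f"uninformative design: {prior:.2f} -> {weak:.3f}")
```

[Under these assumed numbers, the informative design moves the belief a long way from the prior while the uninformative one barely moves it, which is the sense in which the ratio of the two probabilities determines how much an experiment can teach.]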

LITERAL BANANA: Yeah, I think that's the ideal situation. And I don't want people to think it's impossible. I want them to realize it's really hard and rare for there to be good information coming out of an experiment. One thing that depresses me is how bad the most important abstractions are. Like, for instance, the Diagnostic and Statistical Manual for diagnoses of psychological conditions, things like depression: these are not very scientific. And they're lumping in a bunch of symptoms that may or may not go together, may or may not be related to our conception of what depression means, for instance. And I think there are these words for all the different diagnoses, not just depression. And the stereotype is, you know, you're a first-year psychology student, and you're diagnosing all your friends with these conditions. What happens is, when people do, like, a psychiatry rotation, they start to see what those words mean. They get something in reality to map those things to. So, okay, well, that's what bipolar disorder means, right? But I thought it was what the words meant. But now that I see it, I see that it's this really extreme manifestation. And the problem is that everybody's getting a different set of patients to relate that to. So I think, in reality, every doctor has sort of a different idea of what the DSM says, and there's probably major variations between regions and between populations. And those are such important things, like treating depression. That's really important, and the abstractions are bad. The tests that are used, the inventories that are used to measure them, are bad, and they don't get much of a result when trying to prove the treatments work or not. The latest meta-analysis is that antidepressants cause about a two-point reduction on a 52-point scale over and above placebo. That's not something you really notice. So it seems like there are such serious problems, and they're so deep, and very few people seem to care about it. I worry about whether or not progress can be made. Like, it might be that someone can determine something new about some minor topic, but when the really deep ones are that bad, I think there's a major problem.

SPENCER: Yeah, on the antidepressant point, I think that's a contentious one, about how well the antidepressants work. The best article I've seen on this is a Slate Star Codex one that goes into a lot of discussion on how well they work and how biased the literature is. So I'd recommend people check that out if you want to see the various sides of that argument. But I totally agree with you that the DSM diagnostic system is fundamentally arbitrary, right? You know, they say, well, you have to have these two symptoms for at least this many weeks, and at least two of these other eight symptoms, and so on. Like, clearly, there's an arbitrary choice there where they're drawing a boundary. And they could have easily drawn that differently. And putting people in these binary categories is just not the way things really work in the real world, right? Like, there's such a thing as being a little depressed, a little more depressed, a little more depressed, all the way up to extreme depression. And to just have a binary, like they have major depression or they don't, is clearly going to lose a ton of information. That being said, I don't think that's all bad, because it is nice that at least people can agree on what a population is. Like, you know, if you're studying major depression and I'm studying major depression, at least there's some similarity between the groupings, and then if we find something works, and then another person is diagnosed with major depression, we feel more confident that that treatment will work for them. So it's at least not a total free-for-all. But it is very far from what you might consider a reasonable diagnostic system, in my opinion, when in fact almost any of these traits is a continuum and not a binary.

LITERAL BANANA: Yeah. And often the opposite traits will be equally evidence of the depression, such as eating too much, eating too little, sleeping too much, sleeping too little, stuff like being agitated, having psychomotor retardation. Yes, it's a very messy classification. I actually learned a lot from reading the most recent meta-analyses and all the papers kind of citing them and arguing with them. I think reading one paper is great, but reading different people's arguments, and what each side kind of concedes and what each side argues, what points they pick out to fight on, is one of the more interesting things. And I like that, again, it's adversarial. So they're really arguing against each other. I think you can learn more from that than just from one paper explaining, even if it's a literature review paper. I think you learn more from the debate.

SPENCER: Yeah, I love adversarial collaborations. I wish there were far more of those, where two people who disagree work together to, like, write a paper, do a study, and they kind of point out where they agree and where they disagree. I think you're talking about just back and forth in the literature, which is nice too, but it can be even better when they're, like, writing the same paper together and trying to agree on what they don't agree on.

LITERAL BANANA: Yeah. One of the ego depletion papers was that. They had one of the original researchers on the team.

SPENCER: The conclusion there?


SPENCER: Did the original researcher agree with that?

LITERAL BANANA: Yeah. Well, I mean, she has her name on the paper. So I assume she agreed with it.

SPENCER: All right. So before we wrap up, I just want to give you a quick, rapid fire round of questions, if that's alright.

LITERAL BANANA: Sure. Thank you.

SPENCER: So first question for you. What makes you a literal banana?

LITERAL BANANA: Well, because I'm literally a banana. That's all. It's nothing. Like it's not a joke or anything.

SPENCER: So you're literally a little banana. Okay, got it. Yeah. Okay. I thought that was more of a metaphor.

LITERAL BANANA: I do have a human suit that I can pilot at times. It's kind of like a mecha suit. But it's not very realistic.

SPENCER: Got it. And why are you obsessed with lichen?

LITERAL BANANA: Lichen are so awesome. They're beautiful. Lichen are symbiotic organisms that have a fungus, some kind of mushroom element. And then they have plant bacteria that photosynthesize for them.

SPENCER: So they're all living together, there's different.

LITERAL BANANA: Yeah, and they grow on rocks, mostly. But they can also grow on trees and fences and things like that. And they come in beautiful colors. They make beautiful natural dyes. Some of them are very sensitive, so they only live in places with very low air pollution, way high up in the mountains. So it tends to be that you find lichens in really beautiful places.

SPENCER: So what do you like about Twitter?

LITERAL BANANA: My favorite thing about Twitter is when somebody says some specific thing, some specific experience, and contextualizes it in a way that everybody else resonates with. They're like, "Oh, I've never thought about it before, but that's exactly how I experienced it." That's my favorite thing about Twitter.

SPENCER: Awesome. And what do you hate most about Twitter?

LITERAL BANANA: When you have a lot of followers, you get a lot of just thoughtless replies, that's gonna be a lot to skim through, but it doesn't really detract from the experience. I think it's great.

SPENCER: Do you experience a lot of anti-banana prejudice?

LITERAL BANANA: Not really. I think people joke about it, but I don't think many people really hate bananas. Sometimes it can be difficult to be taken seriously. But I think when people get into my work, they can kind of accept it, and they might even change their opinion about bananas, about what they're capable of and what they can do.

SPENCER: What do you think of the argument that bananas were made by God, and that's why they fit our hands so perfectly?

LITERAL BANANA: I think bananas were made by people. That's the origin story we have: they were cultivated by people over thousands of years and gradually came into our current form. But I think that could have been involved. I don't know.

SPENCER: You could have been nudging people along the way. So, why the pseudonym?

LITERAL BANANA: What pseudonym? I think having...

SPENCER: Well, I heard that you sometimes use a human pseudonym? Is that true?

LITERAL BANANA: Yeah. I sometimes write undercover as a person. And that was more about wanting to be taken seriously as a person. And I feel like when I write under the banana, I'm not making any claims to expertise. As Literal Banana, I'm not saying I have any degrees, I'm not claiming I have any credentials. As Literal Banana, I'm just trying to describe the world. So whether or not people take me seriously should be based on the ideas and the presentation.

SPENCER: When you write about banana-related topics, do you call on your lived experience as a form of credential?

LITERAL BANANA: I don't know if I'd call it a credential, because I think everybody has lived experience. But I think anecdotes are actually really good evidence. And they're good evidence to the extent that you trust the person telling them, and trust them across a lot of dimensions, like how well they remember things, not just whether they're telling the truth, how sensitive they are. So I think anecdotes get a lot of hate, but I think they're pretty great. It's not that there aren't failure modes, but they're a really great mode of transmitting information.

SPENCER: But isn't it the case that for any anecdote, there's basically an opposite anecdote that sort of makes the opposite point?

LITERAL BANANA: Yeah, that's how you know they're real. There's, you know, have you heard of the list of human universals, Donald Brown's list of human universals? It's almost like a poem. It's a beautiful document. He has a book that's great. But the list itself is just a list, allegedly, based on him and his reading of the anthropological literature, of what is in common between every human group ever studied. So if there's one that's missing from it, it's according to this. One of the alleged human universals is proverbs coming in mutually contradictory forms. So I like that a lot, the idea that a basis of human knowledge is that we have opposites, because we know there's a context when one thing is more true, and there's a context when the opposite is more true. And there's not just sort of universal truth to these proverbs. I think that's true of anecdotes too: if there are opposite anecdotes, that's probably because they're onto reality, and reality is a mess.

SPENCER: So what do we learn from anecdotes, then, if there are contradictory ones? Like, how do we use them?

LITERAL BANANA: First of all, contradictory anecdotes could absolutely both be true, if they're just in different contexts. So we learn actually something pretty rich, richer than maybe the moral of the anecdote itself: we learn that opposite things can be true in different situations. And we learn somebody's specific experience, and we can moderate how much we believe that based on how much we trust the person, especially on this specific issue. Maybe I trust somebody a lot regarding computer science, and not that much regarding aesthetics or something like that. I don't think an anecdote is just free-floating information. It's within a context. And that's why it can still be rich. And also why things like anecdotes from a friend of a friend don't carry a lot of information. Like, they may be a good story. And that's enough to tell it; like, if it's a good story, that's fine. But if it's divorced from its context, it has less information.

SPENCER: Because you can't know which information to condition on because you don't know the full context of what happened?

LITERAL BANANA: Exactly. Or who it even happened to, or whether...

SPENCER: Right, right. Yeah, people say, "Well, how do you explain such-and-such story that I heard from a friend of a friend of a friend?" You're like, yeah, that's basically the information. Yeah. Literal Banana, thanks so much for coming on. This was really fun.

LITERAL BANANA: Thank you so much for having me. That was really fun.


JOSH: During a podcast recording, do you ever have insights mid conversation or is everything you say pretty much ready to go before you start recording?

SPENCER: Almost nothing I say is prepared in advance. I know the topics that we're going to talk about, but that's pretty much it. And I don't know what the guest is gonna say, other than maybe I have some sense of their rough views on these particular topics, and I might have a sense of what I think on those topics, but no, I've no idea what I'm gonna say until we get into it.



