Episode 116: Are scientific journals just parasites? (with Chris Chambers)

August 8, 2022

How is outcome bias especially relevant to science publishing? What are some possible solutions for overcoming outcome bias? In what ways are the publishing and peer review processes flawed? Why do many (or maybe most) scientists perform peer reviews for free? What value do publishers add to the scientific process, especially given that the internet democratizes distribution? How do (and should) scientific journals differ from newspapers? What kinds of changes must academic systems implement in order to improve in parallel with the proposed improvements for journals? How likely is it that a random preregistered study will replicate? Why can't we come to a consensus about some fraught topics like ego depletion?

Chris Chambers is a professor of cognitive neuroscience at Cardiff University. His primary research focuses on the psychology and neurobiology of executive functions and higher cognition. He is also interested in the relationship between science and the media, ways the academic community can better contribute to evidence-based public policy, and methods for improving the reliability and transparency of science. You can reach Chris via email at chambersc1@cardiff.ac.uk, follow him on Twitter at @chrisdc77, or learn more at his website.

JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Chris Chambers about biases and academic publishing, popular journalism, and reforming the peer review process.

SPENCER: Chris, welcome.

CHRIS: Thank you, and good to be here.

SPENCER: I'm excited to talk to you because I think you have some really interesting ideas about how to reform science. And in particular, how to reform it from an institutional level where we modify or change the institutions of science, the expectations of science, to make science go faster, be more robust, and so on. So, my first question for you about this is tell me about why you think outcome bias is important in science and kind of what your ideas are to try to reform that?

CHRIS: Yeah. So first of all, thank you for having me. It's really great to talk about these topics because I think they're so important now. And all of the issues we're going to talk about today are just increasing in importance. Outcome bias — for those listeners who aren't familiar with it — it's a phenomenon, and it's a cognitive bias in which the outcome of a particular intervention or a particular policy or a particular decision determines whether or not that decision was a good one or not. And it might seem intuitive, in some ways that, if a decision is made in the right way, then perhaps the outcome should be positive or should be good. But in fact, it's a logical problem. Because in science, especially if we allow what we consider to be good science to be determined by the results we get — whether we like those results, whether we like those outcomes — then we risk creating a scientific record that is quite biased and doesn't reflect reality. One of the main goals that I have, and many of the colleagues I work with have, is to try and neutralize this type of bias as much as possible so that we can create a scientific record that we can trust.

SPENCER: Could you give a concrete example just to illustrate the point?

CHRIS: Yeah, so let's suppose I'm running a clinical trial on a particular drug to treat some disease. I have in my trial the drug I want to test, and I have a placebo control as a baseline. And let's suppose I run my trial, and I find that the results don't support the drug working. So they constitute what we call a negative result. And let's suppose, faced with the evidence that my drug is probably not very effective, I decide I don't like those results. So I'm not going to publish those. I'm not going to tell anybody about those results. Instead, I'm gonna just take that result and file it away and make up some reason why maybe I didn't like the study. I'm gonna let the outcome determine my judgment of whether the study was good quality or not. That's it, I won't speak of it again. Now, suppose I do that enough, and other people are running similar trials on this drug, and by chance, their trials show that it works. Imagine the kind of literature you create at the end of that process where all of the negative results, all of the inconvenient results, which suggest the drug is ineffective, get effectively censored from the scientific record. And anyone who looks at the scientific record based upon the sum of the evidence would conclude, "Hey, the drug works. Look at all these trials showing that it's effective," with all the negative trials having disappeared. And that's one of the simplest examples really of how outcome bias can distort knowledge. And it can happen across all of the sciences, basically, not just clinical medicine.

SPENCER: So that's a helpful example. And in that case, we had an instance where the researcher had a desired outcome. And then when they didn't get this outcome, they didn't want to publish it. But my understanding is that another big reason this can happen is that they might find it harder to publish, or they find it less worth investing the time to try to get it published. Do you want to comment on that?

CHRIS: Yeah, exactly. Outcome bias can come from a number of different directions. And in science, it takes a very specific form that we call publication bias. Publication bias is the form of outcome bias that we see reflected in the scientific literature, when the results themselves (whether they conform to expectations, whether they're what we call statistically significant, whether there are reliable differences or effects shown) influence, separately from everything else, whether the study is published. And that bias can spring from the researchers themselves saying, "I don't want to publish my research because the results are inconvenient. And I don't like those results." It can also arise from scientific journals, which control the scientific literature and act as gatekeepers. Journal editors and reviewers who are faced with inconclusive or difficult results or inconvenient results might say, "Oh, I don't like these results." Or, "Given these results, I'm going to nitpick the methodology in a way I wouldn't with results that I like." And therefore they can block publication of those sorts of findings. So you can imagine this bias springing from a number of different sources, both from the individual researcher and also from the community around them, particularly those who serve gatekeeping roles.

SPENCER: I'll just add another source of this, which is that negative results can just be a lot more boring. So people feel that they're not going to publish them in a top journal because top journals are looking for exciting and novel-seeming results rather than boring results.

CHRIS: This is part of the problem, right? So journals have manufactured brands and reputations and prestige based not primarily on publishing the best science, but publishing the flashiest results. If you think about that market that you've created there, it's very dangerous. Because if the most popular journals, the most prestigious journals, selectively publish results they deem to be exciting and novel and interesting and not boring, then you're creating a lopsided scientific record in which certain findings get prioritized and may influence the way science goes and applications, drug development, whatever it might be. Whereas, findings of equal quality stemming from studies of equivalent or even higher quality, that produce perhaps boring results, or less interesting results, or more confusing results, get buried away in more obscure journals or not published at all. And it's kind of insane, actually, when you step back from it, that we allow this to happen. Because the scientific record should not be determined by the results, it should be determined by the quality of the work going into it. I mean, think of commercial air travel. Imagine if the safety checks on planes were only considered where the checks passed, and checks where planes were discovered to have faults were ignored. Would anybody fly?

SPENCER: I want to dig into this idea of quality a little bit because if you think about someone running a study, there's a huge number of different possible things they could study, for a huge number of possible outcomes. If you just think about, for example, nutrition research, there are thousands of different compounds people could ingest. There are thousands of different outcomes you could look at. Everything from heart health to diabetes, to all these kinds of things. And if you just throw a random dart at one of those a priori, it's extremely unlikely to work. If you take some random berry for some random condition, it's probably not going to work. So a negative result for just a random example that was not really driven by theory seems to be not that valuable, because our a priori expectation should be that it's very unlikely to work anyway. Whereas in other cases, you might be studying something where maybe it's already believed that it works. So if you were actually to put a negative result out there, that would actually have a lot of evidence value to people, because it would help change their minds about something. Or there are cases where maybe it's really high stakes: if this turned out to work, well, that would be a really big deal. So it just seems to me that we have to prioritize to some extent. Not all really well-done studies are equally valuable because they don't provide equally important information.

CHRIS: But what's interesting about all the examples you just used is that everything that gave those studies value was before the results were known. So some studies were trivial out of the gate, because the chances of finding an effect are so low. Other studies have more at stake, and the findings might be more influential. All of those judgments take place before the results are known. You can make that judgment based upon the importance of the question and the quality of the methodology that's being used to answer that question. The results themselves are simply the results. So, to step back from it, if you believe that testing an obscure intervention in nutrition is worth proposing and worth doing, then surely the answer is worth knowing. Otherwise, what's the point of doing research in the first place?

SPENCER: Right, and I think this is a good segue into one of your proposed interventions, which is registered reports. So do you want to elaborate on how those work and how you came to develop that idea?

CHRIS: Registered reports try to neutralize outcome bias and publication bias by ensuring that journals decide which research gets published before results are known. So the way you do this is you take the normal peer review process, which typically is undertaken only after studies are completed and manuscripts have been written and results are fully interpreted. Instead of performing peer review only at that end stage, what you instead do is go back and perform peer review after the completion of a protocol. So after a protocol is written and proposed, with a full theoretical background and all of the methodological detail, that part is peer reviewed before the study is done. And then based upon an evaluation of the quality of the method, the importance of the question, the ethical standards and so on (all of those elements that really contribute to scientific quality), the journal then decides, "Will we accept this or not, without knowing the results?" So, results can no longer influence the evaluation. It's impossible because the results are not known. And based upon this, the journal issues what we call an in-principle acceptance, and the authors then go away and do the research, armed with the knowledge that the outcomes of the research will not determine whether or not it gets published. When they're finished, they come back, and the same reviewers at the same journal look at it again. And they say, "Did you follow the protocol? Are your conclusions based upon the evidence?" And if they are, tick, published. And so the idea here is that you eliminate all of the biasing influences, all of the biasing processes, from the scientific process. It's thinking of the scientific publication process as itself an intervention on science, and then trying to eliminate the bias really at root.

SPENCER: Yeah, I love this idea. I'll just comment that it seems very important that the protocol of how you're going to do the analysis, and how you're gonna conduct the study, be really well specified. Because if the protocol of how you're doing the analysis is a little bit ambiguous, it still allows researchers to make the results more like they want them to be through choices in the statistical analysis or choices of what to report.

CHRIS: Absolutely. And this is one of the reasons why it's so important to perform peer review of those protocols before the results are known. We know from clinical trials that what we call registered protocols, which are written in advance of doing trials but are not peer reviewed, are often, as you say, quite vague. And that vagueness provides researchers with the opportunity to still cherry pick certain findings or certain approaches out of their protocol, and then highlight them to accentuate the positive as it were. But what's great about registered reports is that by performing the peer review of the design at the protocol stage, it provides the opportunity for expert reviewers to really drill down into that detail. Is this precise enough? Is this method the best? Is this control condition the most robust? Does this hypothesis make sense in light of the theory? And so on. So that by the time a manuscript goes through this process, by the time a protocol is approved, we know it's as rigorous, as robust, as solid, and really as important as it can be. And that gives us some assurance, hopefully, that what gets through the review process and eventually is published as a registered report at the end meets a certain quality standard.

SPENCER: Could you tell us a bit about the history of this idea and to what extent it's being adopted now?

CHRIS: The idea of pre-registration as a way of pre-specifying predictions, methods and analyses has been around for a long, long time. And the earliest references to this go back to the 19th century, in some of the very early scientific approaches that considered this as a way of preventing cherry picking and bias. Throughout the last 50 years especially, there's been a lot of discussion about the value of trying to control these kinds of biases (the outcome bias, the publication bias) that are so prevalent. And it really wasn't until about seven or eight years ago that a number of different lines of argument took shape to create registered reports. So the key ingredient of registered reports is what we call pre-study peer review. So the protocol, the design, the theory, the methodology is evaluated prior to the research being undertaken by expert reviewers. And based upon that, there's an in-principle acceptance regardless of outcome. Those two ingredients hadn't really existed before about eight or so years ago. And amongst the journals which created this was the journal Cortex, where I'm an editor. We've pushed quite hard to promote this initiative since then, and now it's about 300 journals or so that offer the format.

SPENCER: That's fantastic you've gotten so much adoption. So, they offer the format, but to what extent are people able to use it? And what does that process look like relative to using the traditional method of publishing?

CHRIS: Well, it's still very much the minority. So, 300 journals sounds like a big number, but there are thousands of journals in science of varying prominence and quality, and 300 is a drop in the ocean. And by the same token, if you look at any one of those 300 journals, registered reports still make up a small percentage of the overall empirical submissions that they receive: somewhere between 2% and 5%, typically, at a journal. Now, this is not necessarily a bad thing. So the registered reports format, as it is, works for certain kinds of research, where researchers are operating in a mode where they can specify things before they do them. Science doesn't always work that way. There are observational exploratory approaches in science which defy pre-specification, where you're not really sure what you're looking for when you set out, you're just kind of mapping the terrain, you're testing some exploratory idea. It's not always possible to specify your protocols at the level of detail that's needed, which is perfectly fine. Registered reports were never designed to be the only way of doing science. But where they are applied, they turn out to be (and we're finding this now by looking at the impact of the initiative) quite high strength, like a 'superdrug' for science. And they've been shown to be quite influential, well-cited, possibly quite influential in driving theory. For me, that's promising. So it means that we don't have to have a huge number of these submissions. What we do need is for them to be universally available to scientists in every field so that they can avoid all of these biases if and when it's appropriate to do so.

SPENCER: I'm really glad you mentioned these different ways of doing science for different purposes. Because in my own work, where we're studying topics in psychology, I find it's often really useful to do a whole bunch of exploratory studies early on to kind of map out the phenomenon. And then only at the end, do a confirmatory study to make sure we're not bullshitting ourselves. But that seems to me to come at the end of the process. So, I guess the way I think about it (and I think a lot of people disagree with me on this, so I'll just say this is somewhat controversial) is that the confirmatory stuff is the bit poking above the water, if you think of science as an iceberg. And most of the stuff is trying to understand the phenomenon, where we use a lot of different methods to try to do that. But the confirmatory part is super, super important, because it's what is put out there in the scientific record, and what other people can see and base their own judgments on.

CHRIS: Right, and it's a really interesting topic, isn't it, of how much of what goes on in science should be above that 'sea' level. How much should we see? And people make different arguments. You can argue, more from your end, that perhaps the scientific record should be dominated by the findings that we believe, that we trust, that have been replicated, confirmed (as you say) maybe following a series of exploratory approaches. Others argue perhaps that we're not seeing enough even then. There's an awful lot of wasted science, a lot of wasted effort, when researchers spend years and millions of pounds or dollars investigating certain exploratory routes. And they never publish any of it, because none of it bears what they would consider to be fruit. Then, are we making the best use of the resources we have? Are we advancing knowledge in the most efficient way? I tend to lean more in that direction: yes, exploratory science is important, and so are the outcomes of exploratory science. And so even when we're not pursuing the registered reports format, when we're not actively eliminating that bias from the process, we should still be trying to surface as much of our exploration as possible for the benefit of the scientific community and for public good.

SPENCER: Yeah, I can certainly see the value, for example, in publishing datasets. If you didn't end up using them in your research, you might as well publish them as long as you can anonymize them properly, and so on, so that others can benefit from the data you've collected. But I will just say, in my own experience with psychology research, what I find very often happens is the first study we conduct really just shows us all the things we don't know [laughs]. So it shows us, "Oh, wow. Okay, that was not the right way to study this phenomenon. Now, we learned something about how to study it." So it often takes two or three studies to get our bearings and actually understand enough about the phenomenon to know even how to study it. I don't know how ubiquitous that is.

CHRIS: Yeah, totally. It's true of registered reports, as well. So, I've edited hundreds of registered reports. And often the conclusion of a registered report is that there's obviously a ton we don't know. We may have conducted a relatively bias-free test of a proposition. But still, we've raised as many questions as we have answered. This is, I think, common not just in psychology, but in all sciences. And I think the lesson I take away from it is that it doesn't necessarily matter. The main thing is that we're not fooling ourselves along the way. Because if we invest too much in conducting a ton of exploratory research, without making it transparent, without actually revealing what we found, as inconclusive or challenging or assumption-challenging as it might be, then we're potentially leading ourselves in the wrong direction anyway. There are so many blind alleys in so many fields that people keep treading down because nobody publishes their previous attempts down those very same alleys. And it's just this kind of ridiculous process of repeating mistakes over and over again, when we could learn so much more if we were just more open in the way we do our science.

SPENCER: It raises an interesting question about how that kind of information can be shared. So you go design a big study, and you run it, and you kind of realize, "Oh, that wasn't really the right way to study this." And we were kind of off in these ways. It's not a classic scientific finding in the normal sense, a lot of times. It's not the sort of thing you could get a flashy publication from. It's more like, "Oh, yeah, don't do it this way." [laughs]

CHRIS: Right. But that's so useful.

SPENCER: That is. So, how do you publish that?

CHRIS: Yeah, you know what, publish it as a registered report, because I've edited registered reports which have gone down exactly that road, where they've pre-registered a protocol which has gone through peer review and been really rigorous, very, very thoroughly reviewed. And then they found that for some reason, the hypothesis wasn't testable using that approach. Some assumptions failed. Something went wrong. But because it was so rigorously evaluated, that's still an informative finding. And it's still a very useful methodological message for the field: "Guess what, guys, if you try this, try it in a different way." There are still lessons to be learned, even from what we might consider to be failure. And I think, as an academic community, as a scientific community, we're a little bit too afraid of failure. And we define failure as just not getting perfect outcomes or not advancing knowledge in some very pure sense. But there's so much we can learn from any rigorous, valid attempt to answer really any question that we shouldn't be trying to be a gatekeeper at that level. We should be saying, "You know what, you've done this to a high standard, it doesn't matter what you found, that's not going to determine whether you get published or the prominence with which you get published." But instead, it will tell us what we are going to learn from this at different levels.

SPENCER: I'm curious if you relate to the following experience, which I've had a number of times, where I'm doing some research, and I realize that the thing that I think is actually the most truth-seeking is not the thing that is the prettiest. And so if I put on the hat of how I would get this published, it looks really different than the hat of how to actually figure out what's true. Just as an example of this, sometimes when you're doing your analysis, you realize after the fact that there's a good reason to exclude things that you never thought of. So your pre-registration said, "Oh, I was only going to exclude these examples." But then once you dig into the data, you're like, "Actually, if I don't exclude these other examples, I think it's actually gonna distort the record. It's gonna be misleading." But the cleanest result would be to not exclude this thing, because I said I wasn't going to do it and it's going to be the easiest to publish. So I find that these trade-offs come up a lot, just like doing science to look nice versus doing science to seek the truth.

CHRIS: Yeah, yeah. And by the way, again, the approach you described, the better approach, which is to report the extra analysis, is totally compatible with the registered reports format. Researchers do this all the time. It's what we call exploratory analysis. They'll get their data, they'll have their protocol, they'll look at their data, and they'll realize they might have missed something. Maybe there was some assumption that they missed or, like in your example, maybe there's some other way of analyzing the data, which deals with the problem, and which is more informative. And there's nothing stopping researchers doing that. In fact, one could argue that the registered reports format is more protected against the bias of reviewers against certain outcomes that they may not like to see emerge from exploratory analyses, because the manuscript cannot be rejected based upon any of the results. So, you can do all of this just fine. And the only requirement really, when we're thinking in an open science kind of mindset, is that we're transparent about it. That we don't try to reinvent history to pretend that something we discovered after the fact was, in fact, something we predicted, or that something which looked nicer should dominate the conclusions only because it looks nicer, rather than because it's the most valid approach to take.

SPENCER: Yeah, I find registered reports extremely appealing, especially because you're not advocating that everything should be registered reports. You're just saying this should be an option on the table, and I feel like that's really hard to argue against. It's just such a strong idea. And I'm wondering, does anyone oppose this? Or is everyone like, "Yeah, that seems like probably a good idea."

CHRIS: Early on, people did. So, no matter how strenuously we made the case that we're only advocating this as a universal option, there was a fear, I think, in certain corners, particularly of psychology and neuroscience, that it would become dominant so quickly because of its inherent advantages, that there'd be a slippery slope toward it being the only way to do science. So in my opinion, that's the classic slippery slope fallacy. The academic community loves playing out worst case scenarios. They love nitpicking any possible idea to find the one weakness and then expanding it to take up the entire space. It's kind of what we're paid to do in many ways. And so early on, I think we faced quite a lot of that political opposition to this initiative, despite all of the attempts we made to try and address that argument directly. And really, what's defeated that particular objection in the long run is just the passage of time and the fact that these fears have not been realized in reality. When we look at how registered reports are doing, we see that they do very well, they're popular, they're impactful, and there are signs that they are eliminating the very biases that they are designed to eliminate. But we're not seeing them dominate science in some way that puts science in chains or restricts freedoms or inhibits exploration and serendipity, and whatever it might be. We're finding, in fact, that they're complementing science very well in certain areas because we need them. They're filling a gap, there's a natural incentive for researchers to use them, and they have a natural place at the table.

SPENCER: Now, has anyone investigated how it affects your odds of getting published if you apply to a journal with the registered report format versus applying to the same journal without the registered report format?

CHRIS: No, it's a really good question. So the idea of a randomized trial in which you would compare the outcomes of a registered report process with a traditional process is the Holy Grail of this area of metascience because it will provide the strongest evidence that the approach is effective in eliminating various kinds of bias. However, it's a very difficult trial to run because researchers very much value their academic freedom and they don't appreciate being randomized into different groups and told, "You're in this group, you're going to do research this way. And you're in that group, you're going to do research that way." This is something that is anathema to the academic community. So, whilst it's possible to do a trial, and there are ways in which it can be engineered, it's turned out to be challenging for researchers to get funding to do such a causal intervention study. So to date, most of the research that's looked at this has done so retrospectively.

SPENCER: I can see why it's really difficult to run this trial. But it'd be really interesting to get a group of researchers who all are in favor of doing registered reports and say to them, "Hey, next time you're thinking of submitting a registered report, would you be willing to flip a coin? And if it lands heads, you submit a registered report, and if it lands tails, you submit in the regular way." [laughs]

CHRIS: Yeah, absolutely. You know what, I predict if you found no difference between the two conditions, one of the obvious explanations could be that because you pre-selected scientists who already were favorable about registered reports, perhaps they wouldn't engage in biased practices anyway, because they've changed the way they work. They've changed their lab culture so that regardless of whether or not they happen to use the registered reports format in a particular setting, they would be more immune (if you like) to the kinds of typical biases we see anyway. So it's tricky to really do this in a way that controls that sort of sampling bias. You'd want to introduce the intervention in such a way that researchers aren't even aware that it's happening, which is ethically dubious, and technically almost impossible. But these are the kinds of conceptual and logistical challenges we face when we try to do these sorts of real world trials.

SPENCER: Yeah, it's really tricky.

[promo]

SPENCER: So one thing you could do with a trial like that is investigate the level of bias in the published research. But another thing you could do is just see whether people increase their odds or decrease their odds of getting published by using registered reports. And I'm wondering, is there any preliminary data on that even if it's just observational?

CHRIS: Well, what we do know is that the level of rejections with registered reports after results are in is virtually zero. There isn't any systematic review of this so far because it's still early days in the general life of the initiative. But we know this anecdotally from just talking to editors at journals, that when an article gets stage one acceptance (or in-principle acceptance), it's very, very rare for it then to be rejected after the results are in because most of the heavy lifting has been done. So most of the really deep evaluation that would lead to a quality-based rejection has been done. And none of the traditional reasons for rejection, such as you got no results, or your results aren't as cool as we'd like, or your results disagree with my pet theory, etc, etc., are admissible. And so there's not much room left for rejection, except if authors go off the rails and try to conclude something crazy, for instance, from their findings, which rarely happens. So what we can say early on, I suppose, is that registered reports are working in correcting that bias after results are in, compared to the very high rejection rates we see for regular articles. And we're also seeing promising signs that hypotheses that are tested within registered reports are much more likely to be disconfirmed. So we're much more likely to find out that we're wrong when we go down the registered reports track compared to the regular track, which tells me, well, we're much less likely to be fooling ourselves.

SPENCER: Yeah, there's something really interesting about this format, which is that it's both a check that others can evaluate, right? So it helps others know that you did this research in a kind of unbiased way, but also it helps you prevent self-deception. I think self-deception is actually an extremely large problem in science. It's not just that people are tricking others. It's actually that they're tricking themselves. I think that most researchers believe their own reasoning is valid even when it's not.

CHRIS: Right. This was Richard Feynman's famous warning that you are the easiest person to fool in your own research. We all want to see certain things. We ask questions because we care about the answers. But if we care about the answers too much, if we allow our own biases to try and influence those answers and push them around, then we are fooling ourselves first and foremost, and then we're fooling everybody else as well. And because we fooled ourselves, we're going to try and convince everybody else that we're right along the way, because we don't see it, we're blind to our own bias. And this is something very powerful that we're seeing from the registered reports initiative. By just engineering a process which eliminates that bias at root, just by nature of the structure of the review process, we're seeing a very different scientific record emerging, in which we're finding out that we're wrong an awful lot, because we're not fooling ourselves as much.

SPENCER: So switching topics, you also have interesting critiques of the way publishing works more broadly. Do you want to go into that a bit?

CHRIS: Yeah. So the typical peer review process, which I think has a lot of value, and particularly when it's occurring in a registered report, is typically managed by publishers. And the most dominant publishers in science are commercial publishers, which exist primarily to make profit. It's because of the publishers controlling the peer review process that they maintain quite a lot of power over science and over the academy. This is despite the fact that, who's doing the peer review and the evaluation is us, the scientists. So we're contributing our expertise to review each other's articles for free — usually, we're not paid for this and it's something we do as a quid pro quo — and publishers are serving as the curators, the managers of this process, and thereby retaining power over it. This is a big problem, because it serves to reinforce, I think, an unhealthy control of the scientific community by profit-making organizations. So I think that if we can take back control of peer review, we can reduce the cost of publishing greatly, and we can start to level the landscape.

SPENCER: So why do people agree to do peer review for free? I mean, when I was a PhD student, I did it because my advisor would ask me [laughs]. So, I was just doing it for my advisor, in a sense, but I don't really think I fully understand the spectrum of motivations there.

CHRIS: There's a range of intrinsic motivations, and also kind of more incentive based ones. So, intrinsically, I think, most scientists are motivated to do some public good. And they see peer review as a way of contributing service to the scientific community, and helping their colleagues, helping the field advance because it's in everybody's interest. But there are also more selfish reasons to engage quite actively in peer review, which is you get to know about science before it's published. So you get sort of advance tickets to the theater. You're also getting yourself known to journal editors as being someone who is (particularly if you're an early career researcher) well read, careful, with a keen analytical mind, a keen critical mind, somebody who is perhaps worthy of note in the field. So there are ways of personally advancing yourself as well by being known to be a good reviewer, which can in turn lead to you becoming an editor one day, sitting in the driver's seat, in the captain's chair, and deciding which articles make the cut or not. And there are obvious career advantages as well for being known in your field as being someone who does very careful reviews.

SPENCER: I'm confused about that. Because isn't it usually anonymous?

CHRIS: It's not as anonymous as most people think. I think people certainly know that at an official level, it's anonymous. And where review is anonymous, editors don't divulge identities. But still people can usually guess, within a reasonable range, who's reviewing their paper. People who are careful scientists often get an implicit reputation from doing this. Also, many reviewers sign. There's an increasing push toward more transparency in the review process to try and overcome certain biases and tone issues that can infect it at the moment. So there's a whole range of motivations that drive peer review. And I think because we are responsible for it, because the academic community is providing the service to the academic community, we should be the ones who ultimately manage it and take control over it.

SPENCER: I think I'm still a little confused about the benefits of being a reviewer in terms of reputation. Are you saying that it gets known internally who the good reviewers are? And that gives them prestige, even though it's usually not published who the reviewers are and even though it's anonymous?

CHRIS: It's only anonymous to the authors. Reviewers are, of course, known to the editor because the editor has to know who they're choosing. An editor selects a reviewer. If that reviewer gives a really excellent, thorough, careful, nuanced, clear review, then the editor is likely to return to that reviewer again in the future. Maybe when the journal thinks, "We need to refresh our editorial board and take on some new people," the editor thinks, "You know what, that person was really good. And they've done so many great reviews for us. They really deserve to have a role as an associate editor at our journal." All of a sudden, you've gone up a level. And so now you're at a different stage in the process. So at that level, just at the level of doing the reviews for the journals, you can build a reputation that advances your career. Being on an editorial board, being a journal editor, is seen as being something good. It's something that fellowship panels, grant panels, promotion panels care about and can work in your interests.

SPENCER: Now, if we think about the history of commercial publishers, I think it makes a lot of sense that they were developed over time. Back in the day, you had to actually print things on paper. You had to distribute these paper publications. You had to go around and make deals with different universities, and ship them and so on. Then the internet comes into play, and suddenly a lot of these dynamics shift. So, I'm curious to hear your analysis on what commercial publishers' role was, and how you see it changing with technology?

CHRIS: The publishing system that we have is essentially a 17th century model, the serial exchange of letters between scientists as was originally conceived by the Royal Society, re-presented in the 21st century at a very macro scale. And it's a big business. Even in the last 20-30 years, we've seen it expand tremendously. The number of journals has exploded, the number of metrics that are used to assess them and rank them, and so on and so forth. Publishers maintain a very powerful position in the scientific world because of this. And I think that has shed light on the process for many academics who were thinking about reform and led them to ask, "Have they got too powerful? Do we really need corporate publishers to be managing processes that we conduct ourselves? And are they really adding the value that they could be adding or should be adding to the scientific process?" I think this is something that all of us have to consider. Particularly given the cost of publishing is so high, and that in turn is a burden, a public tax burden, really, in many cases.

SPENCER: Well, I think we can differentiate the two different roles for the publishers. One is distribution and the other is credibility, like deciding what is a credible paper. The distribution function used to be really, really important back in the 18th century or whatever. Now with the internet, it's really quite easy to distribute things. You just put them on a website and they're distributed basically. It takes extra effort to put them behind a paywall. But the credibility function, maybe one could argue that that's still really needed. And so I'm wondering, when you think about how one decides what is credible, how would you like that to work in your ideal world?

CHRIS: Well, let's look at who's giving it credibility. It's not publishers who give peer reviewed science credibility. It's the peer reviewers. Publishers don't perform peer review, they manage it.

SPENCER: They have brands though. But the journal is associated with credibility.

CHRIS: Those brands are built upon freely provided services by the scientific community, that the scientific community could do of its own accord, and contribute exactly the same quality, exactly the same reputation. But the publishers take ownership of it, because they very cleverly took control of managing that process. So, they basically claim credit for it. Credibility does not come from publishers. They're a proxy for it. They're a heuristic that we use, but there is no reason why peer review can't be controlled and managed at a very low cost by the academic community itself, using preprints. Preprints are a very efficient way of publishing scientific research freely prior to peer review. We can perform peer review of those preprints at very low cost and provide exactly the same credibility as publishing in a prestigious journal would. Doing so ensures that the academic community retains control over that process, and then will force publishers to provide genuine quality. If publishers want to claim all of this money and all of these profits, then I think it's incumbent on them to prove that they're actually adding value to the scientific process, rather than cannibalizing the quality that we already provide.

SPENCER: Well, let me use an analogy. Suppose you have a firm in consulting, like McKinsey, or BCG. One thing you could say is, "Well, what is the firm really adding?" The individual consultants are the ones doing the work. They could just split apart and form consulting teams and consult for businesses. They would make a lot more money individually, instead of having it go through this corporate entity. The counter argument to that is the corporate entity is actually building those relationships. It builds a brand of quality. It also has internal resources that help organize the teams and give them lots of internal research and things like this. So, you can see these two extremes. On the one hand, you could say, "Well, an entity is just a collective of individuals and individuals could just do the thing as well." On the other extreme, you could say, "Well, the entity is actually what holds the quality brand, and if the individuals were on their own, you wouldn't know that any individual is reliable. But you do know that the group as a whole is reliable." And so I think you're arguing that it's more at the extreme of the individuals, or that we could just move the quality seal over somewhere else very easily, totally separated from the entity itself. So, I'm curious to hear your thoughts on that way of looking at it.

CHRIS: The problem is not with the entity. In fact, it's good for there to be an entity which provides an overall trust badge, as it were, a reputation for performing review to a high standard and managing it. That's important. What we can do is ensure that that entity is something that is not driven by the need for profit. We don't need profit-making publishers charging huge costs to university libraries and huge open access costs for authors if they want to publish in a way that breaks paywalls. We don't need any of this. It's all completely superfluous. We can become that entity ourselves. And I've been very much convinced by an initiative called the Peer Community In or PCI initiative, which was created a few years ago and takes control of the peer review process at a very grassroots level. It's a nonprofit, and it exists to manage peer review of preprints. And then it publishes, in a very open way, those reviews and evaluations. Having done this, it then says if journals wish to use these evaluations and decide what to publish, they can. But we're taking control of the evaluation, and we're publishing these evaluations, so they're very transparent. It's beautiful, in a way, because it's an entity. It has a reputation, which is fine. We're not trying to fragment everything or atomize this thing to a bunch of individuals on their own. We're simply saying we can act as an entity that works in our interests and in the interests of the public, rather than in the interests of corporate shareholders.

SPENCER: I like that idea a lot. I'm just trying to understand the details. So when they publish a review of an article, how does one, as a consumer of the research, decide what's low or high quality?

CHRIS: Well, you read the science, and you look at the evaluations, and it's all there for you, in exactly the way it would be at a journal that was just as open. But in fact, most journals are not very open. Most journals don't publish the peer reviews alongside the paper. And most journals don't show you the versions of the article that underwent revision previously. So you can't see the evolution of the article throughout the review process. This is the great strength of transparency. When you build a process which serves the interests of science rather than the interests of corporate branding, you can build something in which it's much easier to assess quality. I'll give you an example. So, when we set up Peer Community In Registered Reports, which brings together the logic and the power of the registered reports initiative and implements it using the PCI initiative, we perform all of the quality evaluation at the level of preprints before journal submission. And then, we have a whole fleet of journals on board, which have committed to accepting our evaluations without further peer review. So, our review process, our quality evaluation, substitutes for the evaluation that the journal would have managed, had authors submitted to that journal. This is incredibly important and incredibly powerful, because it gives authors the power to decide where to publish their science, not journals. Authors, armed with their acceptance from PCI Registered Reports, can say, "Right, I want to publish in this journal." And the journal will say, "Yep, that's fine. You've got a PCI Registered Reports acceptance, and so we'll accept your article without further peer review."

SPENCER: I love that idea, and it's super interesting. I'm still kind of confused about this, though, because I don't understand how this interacts with a kind of credibility seal. It's a sad state of affairs. But the reality is, a lot of people use the name of a journal as a very quick quality metric. Like "Oh, that was published in Nature. It's probably more worth reading than if it was published in some journal I've never heard of." That's how people use a lot of this stuff, or they're reviewing someone's resume, and they see they've published in Nature, and they think, "Ah, okay, that lends them more credibility." And there are a lot of problems with doing it that way. But I think that is the reality of how people think a lot of the time. So, I'm wondering, how do we replace that with a different system if we assume that human nature is not going to change? Are people still going to use these really quick shorthands a lot of times for evaluating credibility?

CHRIS: Well, I think you have to look at why people use those heuristics. Why are we judging the book by its cover? It's because there's way too much science published in the first place. There's too much to read and evaluate at a deep level. And also, you've got to look at what the publisher is taking ownership of that gives it this credibility, that gives it this flashy cover. And if you look at it, it's two things. The first is results based evaluation, outcome bias. So it's cherry picking certain findings which are flashier than others. And the second is taking control of the review process and managing the review process. So, with journals that have these reputations, academics will look at them and say, "Well, yeah, they publish really important findings, and they've got a really rigorous review process." And so academics fall into the trap of thinking that because it's published in journal X, that's why it's high quality, but it's got nothing to do with that. It's got nothing to do with any of that. All of that stuff is either biased and completely misleading (as we've discussed, outcome bias does not help advance knowledge), or it's a credibility signal that is provided by the academic community itself. There is nothing that the journal itself is adding to the process. It's simply taking ownership of it. It's like a parasite. And it's sitting on top of it saying this is mine, even though I did absolutely nothing, all I did was manage the process. I had a bit of software, which invited reviewers. My editors did some work, but there's no real value being added. And so I think, to break this kind of heuristic, we have to go back to basics and say, "Let's strip away. Let's take back control of those parts of the review process that belong to the academic community and ensure that that ownership remains within the academic community." And then, we can perform a proper test. Can these publishers really contribute true value? Does their credibility remain when the parts that we owned all along are given back to us in order to manage?

SPENCER: I'm so much of a fan of so much of what you're doing, but I don't think I buy this particular argument. Just because I think that it's a very human thing. It's part of human nature that we want these simple heuristics. When someone hears someone went to Princeton, they're gonna make a judgment about them immediately based on the reputation of Princeton. Or if someone hears someone worked at Google, they're gonna make a judgment about them. And I think this happens in every field. So what I guess I would be in favor of is academics providing a similar credentialing mechanism outside of the publishing system, where there are certain credentials that say, "Ah, this paper was accepted at this level of quality." So that people have another system to use in replacement for the traditional publishing system, as a way to get those quick checks. I'm not saying those quick checks are a good thing. I'm just saying, I think that that's the reality of how human nature works a lot of the time. We want there to be a quick symbol of quality, right?

CHRIS: It is. You can rely on heuristics to some extent, okay. You can maybe decide whether certain initial entry tests will pass. But there really is no substitute for reading science. If you want to assess quality, and you want to do it right, and you want to do it in a way which I think honors the trust that is placed in us as publicly funded researchers, then you've got to read the science. You've got to perform proper evaluation. And yes, you can say "Yeah, but people don't do it." But this is why we're changing the system. This is why we have initiatives like registered reports, which escape, and completely bomb the hell out of, that heuristic that results determine quality. And that's something that journals are using, and they're already being seen themselves as a heuristic: "Oh, this was published as a registered report. And that means it must be of good quality." That may not be a good heuristic in the long run, but it's one that people are adopting. We've got to try and just move the community toward making more rational decisions. And you do that, I think, by identifying all of these points, like outcome based evaluation, like peer review itself and who controls it, and asking about the people who are controlling this at the moment and buying the credibility of those initiatives: Do they really deserve it? Is it a signal that we can really trust? Or are we the ones who really should be in control?

SPENCER: I certainly think there's a really strong place for this kind of open field where everything can be evaluated on its own merits. And for each thing, you can go read the reviews. But I also think that if you think about the time versus information trade-off, it's true that if you really want to understand a person, you need to look beyond, "Did they go to Princeton?" "Did they work at Google?" But if you just have one minute, and you find that they went to Princeton, okay, that gives you a decent amount of information relative to the amount of time you're investing. There's a fundamental trade-off here of time versus information. There's also something fundamental about human nature, the way we use these credibility symbols to make quick judgments. So I'm just wondering, is there a version of this that could work in sync with what you're doing? For instance, could you have websites that are run by academics that are non-profit, where to get listed there, you have to go through a more stringent standard of people evaluating it both on the reliability of the research and on how important it is in the field or how valuable it is to the field? Or do you feel that's something that couldn't work?

CHRIS: I think it can, and I think it's probably inevitable that it will be created, because there's an increasing push now for researchers to be promoted, to be hired, and to be assessed really based on deeper evaluations of quality. Not just on, for instance, broken metrics, like the impact factor, or how much grant income they've got, or the names of the journals in which they've published. These may have been the traditional heuristics and the heuristics that many scientists still use, particularly senior and older scientists. But they're falling out of favor for a very good reason, which is that they don't work. Well, just because somebody went to Princeton or somebody published in Nature doesn't mean they're a good scientist. It could mean they got lucky. Or, they were favored by certain structural biases in the system. There's all kinds of missing information, so we have to be very careful. But if you build a system which evaluates merit and quality at a very rigorous level, then it's certainly conceivable that you could take that and create a simplified presentation, which can be assessed quickly. You've got to do the hard work first. You've got to build the systems which evaluate quality in an actual way, in a real way, rather than taking the easy way out and saying, "Oh, because it was published in Science Magazine, it must be good."

SPENCER: Yeah, I agree. A lot of the traditional credentialing we have is not nearly as reliable an indicator as people tend to think because there's so much pressure to get these flashy results. Actually, I'm not sure that something published in Nature is actually more likely to be correct than something published in a second or third tier journal.

CHRIS: This argument has been made that if you look at the retraction rates due to fraud, they're significantly higher in more prestigious journals, and more prominent journals. And there are various reasons why that might be the case, but one of them is potentially that the research quality there is lower because the filter through which research must pass in order to be published in those journals is so strict, so outcome oriented, that it will select for a combination of people who got lucky, people who made their results up, and people who actually conducted very high quality research. And you will never be able to tell which of those three it is just from reading the paper. Only time will tell: replication, the traditional scientific approach. So as you say, we have to be very careful in relying on just the name of the journal.

SPENCER: My dream world would be one where there's a really robust, credible rating system where if I go look at a paper, I can see that it was rated as, "Okay, this is a really high quality robust study design, they did good practices." I can quickly assess that, in case I don't have time to go dig in, which might take an hour or two to really dig into the paper. Maybe I don't have time for that. But I would love to have a quick assessment. I can see "Oh, okay, this looks really reliable," or "Huh, this doesn't seem that reliable." And then in addition to that, a filtering system for what's actually important, because that's sort of a second thing that journals are doing: they're directing attention. They're saying this work is more important than that work. So it's not just about reliability.

CHRIS: Yeah. And this is where you've got to, I think, very clearly distinguish between the role of a newspaper and the role of a scientific record. I think what's happened, unfortunately, in the scientific publishing world, is that the role of curating the scientific record has become blurred with the role of promoting cool stuff. So the outcome of that is that we have journals which have their own sort of evaluations of newsworthiness, that decide what's cool, what's not. "Our journal has its own brand, we want to protect that. We want to publish the cool stuff, we want to be the cool kids." And they've kind of forgotten that that's not their mission. That's the role of a traditional newspaper. That's not the role of a scientific journal. What we should be doing, as you say, is thinking about this in terms of two tiers. At one level, there should be a scientific record in which research is published, regardless of outcome, based solely on an evaluation of quality. Quality includes the methodological rigor, the importance of the question, the innovation, etc., involved in conducting the research, separately from the results. And that scientific record is something which we should be using to judge and evaluate scientists and science itself. And then you can imagine on top of that, there's another layer, perhaps of what's the cool stuff, what's the stuff from that record that we should perhaps be talking about? That might be the research that triggers the most new ideas or drives forward an area. And that's fine, as long as the two levels are distinguished and they don't blur into one, which is what we've got at the moment.

SPENCER: Yeah, it seems like one drawback of having journalism draw attention to what's important is that journalists seem less equipped to do that than scientists in many cases. They seem more likely to be taken in by the flashiest, silliest finding. You see so many popular science articles on something that I think most scientists would just kind of say is ridiculous.

CHRIS: Right. And this is super important. I have a whole side branch of research that has looked at the relationship between science and the media and the importance of ensuring that the press releases, these knowledge subsidies that we release in order to raise publicity about science, are as accurate as possible, because most journalists who report on science do not have a high level of scientific training. They're often media graduates. They can be very experienced journalists, which is a very unique skill of its own, but they may not always have had the level of training or expertise to be able to detect bullshit, basically. And so therefore, as you say, a lot of the news stories about science can be easily distorted. We have a big responsibility, I think, as scientists to ensure that the press releases we issue and the ways in which we talk to journalists are as careful as possible and as factual as possible, and try to minimize and control that kind of spin. This is why I sat for several years on the advisory committee of the Science Media Centre, which is an independent press office run out of London, which connects journalists and scientists in order to improve the quality of science reporting. And I learned that there's a huge gap between the standard of scientific knowledge that we really need amongst many reporters and the level that they've actually achieved. There are a number of different initiatives, for instance, to improve statistical reasoning in journalists, because that's a huge part of detecting error and detecting bias, and other steps to take. And I think there's still a long way to go.

SPENCER: I agree. I'll also just add, I think there's an incentive issue, which is that more and more journalists are incentivized to get clicks. And getting clicks is kind of at odds with filtering out the most ridiculous but exciting research.

CHRIS: It's true. And clicks will be driven, in some ways, by findings, right? So imagine two trials of equal quality addressing the same question. One of them gets published in Nature because it got cool results. And the other one goes into PLOS ONE, because it got no results. Which one do you think is going to end up in the news? Which one do you think people on their daily commute will read in the newspaper, on their phone, whatever? Which one might influence policymakers and politicians? All because reporters picked up on the one in the flashy journal. And that problem goes right back to the hierarchy of journals that we the scientific community have allowed to exist. We've allowed journals to prioritize certain findings over others. And this is the root of the problem. A lot of the problems we see in news reporting are actually our problems as scientists that we haven't addressed at a lower level.

SPENCER: I would say there's also an outcome bias in sexiness of results. Imagine two studies are done, one finds that chocolate helps cure cancer and one finds that chocolate doesn't help cure cancer. Which one do you think journalists are going to write about?

CHRIS: Let's go more extreme. Imagine there were a hundred studies done asking whether chocolate cures cancer, and 99 of them found nothing. And one of them found incredible results, just by chance. You can imagine that one would probably get published, potentially in a prominent journal. What's gonna happen to the 99? Are they even published at all? Do they even get into the scientific record? Let alone, do they get reported by the journalists?
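
[Editor's note: the arithmetic behind the hundred-studies scenario can be sketched in a few lines of Python. This is an illustrative sketch only, not something discussed in the episode. It assumes (these are assumptions, not figures from the conversation) that chocolate has no real effect, that the hundred studies are independent, and that each one uses the conventional significance threshold of 0.05. The function names are hypothetical.]

```python
# Assumptions (not from the conversation): the true effect is zero, the
# studies are independent, and each uses a significance threshold of 0.05.
ALPHA = 0.05
N_STUDIES = 100


def prob_at_least_one_false_positive(n_studies: int, alpha: float) -> float:
    """Chance that at least one of n null studies is 'significant' by luck."""
    # Each study avoids a false positive with probability (1 - alpha);
    # all n avoid one with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_studies


def expected_false_positives(n_studies: int, alpha: float) -> float:
    """Average number of null studies that come out 'significant' by luck."""
    return n_studies * alpha


if __name__ == "__main__":
    p_any = prob_at_least_one_false_positive(N_STUDIES, ALPHA)
    n_expected = expected_false_positives(N_STUDIES, ALPHA)
    print(f"P(at least one chance finding): {p_any:.3f}")
    print(f"Expected number of chance findings: {n_expected:.1f}")
```

[Under those assumptions, a chance "positive" somewhere among the hundred studies is close to certain, so if only positive results reach the record, a reader who sees the single published study has no way to know about the others.]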

SPENCER: Alright, so the last topic I want to bring up before we wrap up is about the way that researchers are hired and promoted and kind of switching these incentives for doing research by changing the way that tenure works and things like that. So do tell us some of the background and how it works now. And then, what are your thoughts on improving the system?

CHRIS: So along with the push to try and improve the quality of science and the openness and transparency of science itself, there's this drive to change the way we evaluate scientists, because there's no point trying to change one part of the system, whilst leaving the other ones still stuck in the old ways. You've got to try and move everything together. The academic system is very much like a machine with cogs. When you change one part, you've got to change all the other kinds of cogs and elements around it to work in synchrony. And one of those major cogs, one of the major incentives which drives researchers, is the conditions under which they're hired, or they're promoted or assessed or evaluated. Traditionally, that has gone down the road that we've discussed, where panels rely on crude heuristics like which journals have they published in, or how many papers have they published, or how much grant money have they attracted? And this is wrong, all of it is wrong. It's not assessing the quality of the research that's been done. It's lazy, fundamentally. It also rewards questionable practices which are unrelated to quality and might actually work against quality. So I believe, along with many of my colleagues working together on this, that we need to try and alter the ways in which researchers are hired and promoted. For instance, to value open practices, to value quality, to value the value for money in the science that they conduct.

SPENCER: Ultimately, how are these decisions made right now? Is it fundamentally a committee and what that committee values and the committee consists of, probably in most cases, older, more prominent researchers in that same field?

CHRIS: Very much so. In many cases, those who sit on promotion and hiring panels will be the most senior academics in a particular department. Probably those who have done the best under the old system, and are therefore, the least inclined to challenge the way it works. Because, "Hey, I did well. I've got a job. So everything must be fine." They're not necessarily going to be the ones that are prepared to challenge all of the irrationalities in the system. So yeah, you've got these people really holding the strings of power. And they'll be looking at all of these traditional indicators, like volume of publications and grant money achieved, because these are the ways in which they were assessed and the ways they were evaluated. They won under that system. And so the instinct is very much to apply that same system again.

[promo]

SPENCER: So how do we nudge the system into a different kind of equilibrium?

CHRIS: We change it again. I'm very much a reductionist. When I'm thinking about these changes in policy, you identify the pressure points, just as we did with registered reports, just as we're doing with peer review. You identify the points in the system where things can be changed, and you focus attention on them very precisely, like a laser. So you get one institutional panel to say that candidates for research positions at this institution will be assessed, not just based upon the traditional indicators, but also on the extent to which they conformed to growing standards in open science. So to what extent have they made their data available, their materials available? To what extent have they pre-registered their protocols where it's appropriate to do so? We can ask evaluation panels to also consider these at the table when they're looking at who gets a job or who gets promoted. And you do it step by step. So, you start by introducing these as perhaps desirable criteria. Then over time, you can increase the amount of weight that's given to them. So that we move the community gradually toward assessing people on a much more rational basis. You can't change the system overnight. It's never gonna happen. Academia is very slow to change. What you can do is make focused nudges in different areas. And that seems to be working reasonably effectively at the moment, although I wish it was faster.

SPENCER: I'm not sure I understand why they would agree to do that. Is it just because there's enough people on board with the idea that we should be doing this, that there is some appeal?

CHRIS: That's right. I mean, those of us who are at the mid-tier level, who have gone through being taught all of the old ways, we know how to win in the old game, right? And we could just as easily keep playing that game if we wanted to. But there's many of us who have decided we don't like that game. We don't like the rules. But we're getting to a position where we are able to start changing those rules from within. Those people are starting to take positions of power, on panels, in departments, at funders. They're the ones sitting at a table going, hang on a second, when the silverback professor says, "Well, this candidate has published three papers in Nature. So surely they're the best." There's somebody now saying, "Uh, uh, uh. That's not the way we do it. We shouldn't be looking at the journal title. We shouldn't be looking at these crude heuristics. We should be looking at, for instance, open practices." And that person is getting that text written into policy. That policy is being spread and it's being distributed, it's being propagated. So we're seeing a gradual, I suppose, changing of the guard. This is why it's relatively slow, I think, because it's very hard to change people's minds. But time is guaranteed to change who is in those positions.

SPENCER: Tell me your thoughts on things like citation count, and to what extent you think that they should be taken into account.

CHRIS: Well, citations are a certain measure of impact. So, a highly cited article is one that is likely to have been read more. Although that comes with the caveat that a lot of citations are not based upon a careful reading of the work, and many are based simply upon the scanning of an abstract. But that caveat aside, articles with high citation counts tend to be more impactful. And that can be because they make an influential claim or they propose an influential theory, or they have influential findings or whatever. But it doesn't necessarily mean that the work is of higher quality. We have to be very careful not to fall into the trap of relying on superficial metrics as some kind of proxy for quality.

SPENCER: Yeah, my experience often, when I would submit a paper, was that I'd get back from the reviewers, "Oh, you should cite these 10 other authors." And so I feel, "Okay, I have to go add citations to these people because I want my paper to be published."

CHRIS: It's even more fun when the reviewer says, "Cite these 10 articles by this one person." You're thinking, "Gee, I wonder who that reviewer is." Obviously we have a great deal of power over which research is cited. I think sometimes that's inappropriate. I think sometimes that can cross an ethical line where reviewers try to coerce authors into citing work inappropriately. And equally, authors, there are ethical issues the other way where authors will strategically avoid certain work, which may be inconvenient to the arguments they're trying to make. There's a whole area there. And citations matter, they do matter. But I think, like anything, they have to be seen for what they are and what they are not. They may be a measure of the relatively short term impact of a piece of work, but they should be viewed as one part of a program of science. And they shouldn't be seen as an indication of quality.

SPENCER: Yeah, it's interesting to think about the use of citations. So in math, which is my original field, they're very helpful, because if you don't — let's say you're not familiar with a theorem that they're citing, like, "Oh, okay, I can go look up that theorem and this other paper they're citing." Okay, that's really helpful. It helps me get the information I need quickly. Whereas when citations are used as a way of bolstering your argument, I find that often very iffy, because it's extremely cherry picking oriented. I'm making an argument, I'm just gonna pick a paper that's going to support my argument, like people don't cite papers that contradict their argument.

CHRIS: Well, yeah, indeed. And I think this is where peer review is so important. And perhaps one area that has failed a little bit is in holding authors properly to account in ensuring that citations are properly balanced. And I think there are ways in which the Open Science movement is helping this. One of them is through the publication of preprints, prior to formal journal publication, because when preprints are published, there's an opportunity for the community to actually give feedback before the record is fixed, if you like. So, if somebody has cited work inappropriately, or in a biased manner, there's the opportunity for the community to push back and say, "Actually, you missed this area, or these key papers, or this is biased." And so, there's an opportunity for discussion, and community involvement. I think also, pre-registration can help by encouraging authors to really justify their arguments before they have results. And really think in-depth about the theoretical motivation for a particular argument, or the rationale for a particular hypothesis or method. I think that can also help avoid some of these traps.

SPENCER: In math, I had these funny experiences, where before you go submit to the journals, you just put out a preprint on the arXiv, which is just a public website. I think it's funded by a donor that just funds it, or a series of donors, and it doesn't make a profit. You put it out there, and immediately mathematicians who are interested get in touch with you about it and have questions and stuff. And then a year or two later, eventually, your papers will be published. But by that point, it's sort of irrelevant. You've accomplished your goal of getting it out in the world and getting a conversation going about it. And then the publication is just the thing you put on your resume two years later.

CHRIS: Exactly. Preprints are the future. And this is why I think, as we were discussing earlier, it's so important that peer review, if it occurs, happens at the level of preprints. Because preprints are science. They are the science. If you're a particle physicist, or in many areas of physics or math (like you've said), the preprint is the meat, that's the sandwich, that is everything, that is the product. And what goes into the journal at the end is just the icing on the cake, or the ticking of the box, or whatever you want to call it. I think this is the way science is inevitably going in all fields, faster in some than others. I think physics and math are probably about 50 years ahead of the life and social sciences (where I'm from), but we should certainly be thinking about this in that way.

SPENCER: Yeah. And also another thing is the preprints are not paywalled. In my experience, anyone can download them for free anywhere in the world, and you can actually update them, and the way they handle that is just a versioning system. You can make changes, and it will just show all the different versions, which is so helpful. If you have some mistake in a formula or something in a regular publication, it's just stuck there forever. I suppose that you can petition to have a change, but it's just a much more difficult process.

CHRIS: You can, of course. The scientific record can be corrected through errata and corrigenda and whatnot. The versioning system of preprints is really much more advanced, as science for the future. Because again, the traditional journal based publishing system is stuck in the 18th century, and has a permanent fixed record, which is there. And that's just not the way science works. Science is always evolving, errors are being corrected, versioning is the way to go. And it's also, I should add, very educational for students to be able to go back and look at how the trajectory of a particular paper changed with each version in synchrony with peer review or whatever other feedback was obtained. This is incredibly valuable for us to look under the bonnet and see how science actually works when you're a junior scientist, rather than just seeing this perfect product at the end, which looks good. Yeah, it might look flashy, but it doesn't really teach you very much.

SPENCER: All right, before we finish up, I want to do a quick rapid fire round, where I'm just gonna ask you a bunch of questions and get your quick takes. How does that sound?

CHRIS: Sure. Let's do it, alright.

SPENCER: So, the first question. Suppose you take a paper from one of the top (let's say) 10 journals in social science, and someone were going to attempt to do a really good replication of it. What do you think the percentage chances are that it replicates?

CHRIS: Huh. 50-50?

SPENCER: Okay. It was, I'm assuming, not a registered report.

CHRIS: If it's not a registered report, I think it's probably much lower in general.

SPENCER: Like 30-40% chance of replicating?

CHRIS: Possibly, yeah, possibly lower. I mean, some of the large scale replication initiatives have shown success rates of about one in three, one in four. It really depends how you select, but yeah, it's certainly lower than the majority.

SPENCER: Interesting because I would say my best guess is something like 60% replicate but that's, yeah, it's interesting.

CHRIS: It's very much dependent on which field you're in.

SPENCER: Absolutely. Yeah. Social psych has had more replication issues than (like) cognitive science, as I understand it, or cognitive psych, I should say. So yeah, maybe there's some dependency there. Okay, next question for you. What do you think of Sci-hub? And could you also explain what Sci-hub is just for those who don't know?

CHRIS: Sci-hub is a website, which was created to try and essentially, it's like the 'Robin Hood' basically of publishing. So with Sci-hub, all the articles that are behind paywalls are essentially made available freely, and it's technically illegal. But the reality is that a lot of scientists use it. It's very, very popular. And it's the subject of various legal challenges, criminal charges, and whatever; a long list of battles that Sci-hub continually faces. One of the interesting little realities of Sci-hub is a lot of scientists who have legitimate access to journals use it as well, because it's a hell of a lot easier to navigate than the labyrinth and bureaucracy of their own internal library systems, which I think speaks volumes about how user-unfriendly the scientific publishing system has become.

SPENCER: Yeah, there's an interesting analogy to music sharing. How we went through this period where everyone was sharing mp3 files. And now, it just seems to happen way less, because it's just so easy to access music legally, and people don't bother anymore. Right?

CHRIS: Right, we haven't got to that level really yet with publishing because, of course, Sci-hub is technically illegal. But I think if I was to make a value judgment, I would say that on balance, it's a positive force, because it is encouraging discussion about the degree of openness in science. And also, it's forcing publishers to update their systems and try and innovate in order to make research more openly available than it is.

SPENCER: Our next question. Do you think we should lower p-values? And if so, could you elaborate a little on why?

CHRIS: Well, I think there's certainly a strong argument for why the 0.05 threshold is too liberal, and can lead to too many false discoveries. Researchers have argued that we should perhaps lower that threshold for making claims down to 0.005, or even lower in some cases. It's very challenging to do this, because 0.05 has become this edifice of science that if you get a p-value less than 0.05, it's real, and it's publishable. And all of the nonsense and all of the bias that we've talked about really comes into play in enforcing that threshold. I would prefer to see a system in which p-values were irrelevant to editorial decisions. But if forced to evaluate the regular literature, I would say a lower p-value threshold is probably a good thing.
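
[Illustration, not part of the conversation: a minimal Python sketch of the arithmetic behind lowering the threshold. It assumes independent studies of an effect that is truly absent, so each study has probability alpha of producing a false positive. The function names are hypothetical, and the figure of 100 studies simply echoes the chocolate example from earlier in the episode.]

```python
def expected_false_positives(n_null_studies: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return n_null_studies * alpha


def prob_at_least_one_false_positive(n_null_studies: int, alpha: float) -> float:
    """Chance that at least one of n independent null studies crosses the threshold."""
    return 1.0 - (1.0 - alpha) ** n_null_studies


if __name__ == "__main__":
    n = 100  # assumed number of independent studies of a nonexistent effect
    for alpha in (0.05, 0.005):
        expected = expected_false_positives(n, alpha)
        at_least_one = prob_at_least_one_false_positive(n, alpha)
        print(f"alpha={alpha}: expected false positives={expected:.2f}, "
              f"P(at least one)={at_least_one:.3f}")
```

[Under these assumptions, 100 null studies yield about 5 expected false positives at 0.05 and about 0.5 at 0.005, and the chance of at least one falls from roughly 99% to roughly 39%. The sketch ignores publication bias and flexible analysis, which would make the picture worse.]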

SPENCER: All right, Bayesianism vs. frequentism.

CHRIS: [laughs] That's the subject of wars. Wars are fought on social media over this topic. I'm not a statistician. And so I can't comment intelligently on all of the various detailed arguments that people make in favor of one or the other. What I can say is that I don't think it matters very much. I don't think whether we use Bayesian or frequentist statistics is diagnostic of good science, or will necessarily lead to better results or more conclusive findings or more reliable results. The main thing is that we use appropriate methods. And you can find appropriate methods within Bayesian and frequentist statistics.

SPENCER: It's a very safe answer. [laughs]

CHRIS: I think it's the only answer, really. We can't fall into that trap. The Bayesian versus frequentist wars have been going on for a long time. And I don't think anybody cares, except statisticians. I think ultimately, what scientists want is to have tools they understand and which can answer their questions. And if you look at these methods, both sets of philosophies are capable of doing that. It's just a case of maybe which you prefer, which you're better trained in. And there's bigger issues. Eliminating outcome bias in science is so much more important than deciding whether to use a Bayesian hypothesis test or a frequentist t-test.

SPENCER: So why is data so rarely published in science? You would think that because a lot of science is publicly funded, why not just have funders require that all the data be published, obviously, with appropriate anonymization and so on?

CHRIS: Well, in many fields they do. If you're in crystallography or genomics, then you deposit your data or you don't get published. So you've got to look at the prevailing norms in a field. In certain areas, I think, where the importance of transparency has been identified as very high, then mandates to share data and to make data publicly available are applied. But in other fields like psychology and the social sciences, we haven't yet evolved to that level where data mandates are in place. So we're seeing a shift in that direction. So the TOP guidelines, the Transparency and Openness Promotion guidelines, for instance, set standards, which are working to improve data transparency. But we haven't got there yet. Because I think at a fundamental level, many social scientists think that they own their data, that their data somehow belongs to them. And I think this is a very deep misconception that only time and efforts and various changes in policies and practices will be able to reverse and actually create a more healthy norm.

SPENCER: A final question for you. Take a topic like ego depletion, the idea that exerting self-control will make it harder to kind of use self-control effectively later. This is a crazy topic because you have well over 100 papers on it. And yet, it seems like there's still a lack of consensus on, is ego depletion real? What's going on with it? Is it all just statistical artifacts, or whatever? So this is kind of an open ended question. But why do you think that debates like this continue? Why can't we come to a conclusion and a consensus about a topic like ego depletion?

CHRIS: Well, I think there's two broad issues there. One is that I think, in any area of science, consensus is unusual. Unless you're talking about the most basic theories that have been established over centuries, I think lack of consensus is normal for a start, and it's healthy. I think it's healthy for scientists to disagree. At a more local level, I think the low quality of so much of that work has shrouded it in unnecessary uncertainty. And if we look at the hundreds of studies that suggested ego depletion existed, what we see is rampant bias, publication bias, outcome bias, reporting bias, which has created a massively distorted literature. The minute we eliminate that bias, ego depletion disappears. I personally don't believe it exists. If it does, it's really nothing like it's proposed at the moment in terms of theory. It's a warning, I think, to the psychological field that we need to work much harder to control our own biases and avoid fooling ourselves.

SPENCER: Chris, thanks so much for coming on. It was a great conversation.

CHRIS: My pleasure. It's been fun.

[outro]

JOSH: Some people consider you to be a polymath. Even if you're not an expert in a bunch of areas, you're at least able to carry on thoughtful, detailed conversations with guests who are experts in their fields. So how did you get to be so knowledgeable about so many things?

SPENCER: Well, first of all, I just spend a lot of time learning. I'm learning every day. I'm reading scientific papers. I'm reading articles every day. I'm reading many books, the web, and so on. So, if you make a habit of taking in information from a wide variety of sources, then you will learn a lot of stuff as you get older and older. So I think that's one thing. Another thing I would say is that I just really, really care about understanding the structure of things, and how to model things. And just this drive to understand the structure of things is part of why I'm so fascinated with things like mathematics, because you can use mathematics to model things; or machine learning, which you can use to make predictions about things; psychology, which you can use to make predictions about people and to better understand the way people operate; and so on. So I think that is at the core of a lot of my drive to understand things. And that drive leads me to constantly be taking in information to improve my understanding of how the world works. And then when I'm talking to a guest, even if I don't know much about what they're talking about, I'm trying to relate it to the things I already do understand and see how I can use what they're saying to better understand the world.

Staff

Spencer Greenberg — Host / Director
Josh Castle — Producer
Ryan Kessler — Audio Engineer
Uri Bram — Factotum
Janaisa Baril — Transcriptionist
