Episode 302: What happens when your co-workers are AIs? (with Evan Ratliff)

Enjoying the episode? Want to listen later? Subscribe on any of these apps or stores to be notified when we release new episodes:

Listen on

Apple Podcasts

February 28, 2026

If you enjoy our podcast, we have some exciting news – we’ve just launched a new membership called Clearer Thinking Plus.

Members get this podcast completely ad-free, as well as two professional coaching sessions every month, access to our advanced cognitive assessment, and seven other exclusive perks.

Clearer Thinking Plus is one of the most affordable ways to get access to a high-quality coach - whether you want to improve your habits, find more effective ways to work towards your goals, or get assistance making difficult decisions. It is also a more affordable and convenient way to get all the perks we offer.

If you're not interested in coaching, you can still get ad-free access to this podcast and the other perks with our explorer plan.

Access www.clearerthinking.org/plus to become a member today. We hope to see you there!

What changes when anyone can clone your voice from a minute of audio? If voice ID can be spoofed, what replaces it for everyday security? Why are phone scams evolving faster than our intuition for trust? What new “attack surfaces” appear when every service talks to you digitally? How much paranoia is rational before security becomes a tax on living? Could AI that talks to scammers become a tool for studying persuasion tactics at scale? What’s the most reliable habit for verifying calls, texts, and links? Are we entering a world where identity is probabilistic rather than certain? What do “AI employees” reveal about where agents shine and fail? Why do autonomous agents need triggers and stop conditions to behave? If an agent’s “memory” is a growing log, what kinds of false selves can it accidentally create? How do edge cases derail agents in ways humans handle effortlessly? Why is “be helpful” a dangerous default for external-facing bots? If someone can fake familiarity, how easily can they rewrite an agent’s memory? When you can’t see the system prompt, what are you really evaluating? Should we say please to machines, and what habits does that build in us? If we can’t tell performance from experience, how should we treat AI under uncertainty?

Evan Ratliff is a longtime journalist, writer and host of Shell Game, the podcast and newsletter about things that are not what they seem.

Links:

Evan's Podcast: Shell Game
Evan's X Profile

SPENCER: Evan, welcome to the Clearer Thinking Podcast.

EVAN: Hey, great to be here, Spencer.

SPENCER: Why did you try to make an AI version of yourself?

EVAN: Why did I try? Well, I think originally there's a short version and a long version, but I would say the short version is that I was trying to figure out something interesting to do around AI. This was in 2024, and there was already an insane amount of hype around LLMs and chatbots. I felt like there was not anything to be done there. I actually ignored it for a long time as a journalist. But I have, in the past, done these immersive, experimental approaches to long-form journalism. I wondered if there could be some way to do that. I started messing around with voice cloning and cloned my voice; it was pretty good. There were some fun things you could do with it. Then I started hooking it up to my phone line and using it to call, first, my wife, basically as a joke.

SPENCER: Did she know it was going to happen, or did you just bring it on her?

EVAN: She didn't know it was going to call her at the specific times that it did, but she knew I was messing around with it. I did not keep it secret from her that I had cloned my voice. I was like, "Oh, I'm cloning my voice, and it's kind of interesting," and trying to figure out how to hook it up to my phone line. At the time, that was difficult. Now, it's very easy to do. After I called her and listened back to the conversation with the cloned version of me, which was hooked up to whatever ChatGPT was, probably GPT-3 at the time, it was just so funny to listen to. It was truly laugh-out-loud funny to hear. It also gave me a very strange feeling about the world we were living in or about to be living in. I think that moment, after she spoke to it, I sort of thought, "Okay, I can take this thing and actually send it out into the world and do something interesting." So that was kind of the pivotal moment in deciding to really do something with it.

SPENCER: Now, was she tricked even for a moment, or did she instantly know that's not you?

EVAN: She knew instantly. I mean, particularly the time this was, the first clone I ever did. It was what was generally called an instant clone, or a non-professional clone. So it was just off of a couple minutes of audio, and it sounded kind of like me, but the ones I had later, professional clones are based on hours of audio of me speaking into a microphone like this, and so they're much better. But no, it did not fool her. She starts laughing from the start. She's in season one of the show. She starts laughing as soon as she hears it. Now, later on, some people were fooled, but even with the better ones, it often gives itself away very quickly.

SPENCER: And you're talking about your show Shell Game, which is a fantastic show. Definitely recommend people check it out, where you go through the journey of exploring what you can do with AI. So we'll get into more of that, but this technology has come a long way since you were using it. How easy is it today to make a clone of your own voice that's really accurate?

EVAN: I just did one for someone who wanted me to do it for them. It took me about 15 minutes.

SPENCER: Wow. How much audio is needed?

EVAN: Yeah, that was 15 minutes that included hooking it up to a phone line. The clone itself takes almost no time. You need a minute, two minutes of audio. You can go to ElevenLabs, which is kind of the leading company in this area. There are a bunch of companies, but that's the one a lot of people go to. It's growing the fastest. And they'll make an instant clone of the voice that is pretty good these days with a minute of audio.

SPENCER: Is there anything that stops people from using this on other people, just public figures, and then pretending to be them?

EVAN: No, nothing. I mean conscience, I guess. If you go to ElevenLabs, they have a little box that you check that says, "I have the consent of this person to use this voice."

SPENCER: That's very effective [laughs].

EVAN: But if you do a professional clone, there's a more elaborate procedure in which the person who's being cloned needs to read a phrase, and then they match the phrase to the voice. So they do have controls. But they're not the only ones. There are a bunch of voice cloning applications now. Many of them have no controls on them at all. A lot of the AI phone companies have their own clones now, so you can just go in there and clone, and they'll often have a checkbox saying, "I have permission to use this."

SPENCER: You hear about these cases where people call pretending to be the child of someone and crying and saying, "Mom, I need you to wire money." Is that using this technology now?

EVAN: Yes, 100%. There are scammers using this technology to clone people's voices off of social media, usually we would assume, and then finding relatives. It's called the grandparents scam oftentimes because they're assumed to be older relatives that it would work on, calling them up saying, "Hey, I'm in trouble." Usually, they then pass it off to another figure. So it'll be like, "Hey, I'm in big trouble. I've been arrested. I only have a few seconds to talk, but my lawyer is going to come on the line, and now you need to talk to my lawyer, and they need money to bail me out," something like that, or "I've been in an accident," whatever scenario they come up with. Then they hand it off to someone else, and then you're kind of wrapped up in this world. Then you're sent down the line to, in some extreme cases, handing over a bag of cash to someone who pulls up outside of your house, or transferring money, or buying gift cards, or any of the other ways these scams operate. But yes, it's very difficult to measure because it's mostly anecdotal. Almost no one is tracking how many of these scams there are, but I track them via Google News Alerts and things like that. They're happening all over the country all the time.

SPENCER: I had a crazy experience the other day where I went to call my bank, and you got to be careful. You don't want to just Google your bank's number because it could be a scam site that somehow got good SEO. So I look on the back of my card, read it off, dial it, and I get the voice menu. And I was like, "This just doesn't feel right. There's something off about this." I hung up. I checked it. I was off by one digit. Some scammer bought the phone number of my bank with one digit change, and I almost couldn't believe it. I was like, "Oh my god, that is a subtle scam."

EVAN: You also have to respect it. That's a clever approach. But yeah, I mean, one of the things that I discovered, this wasn't in season one of the show, but afterwards I kept messing around with my clone was that my clone could pass the voice recognition on my bank account.

SPENCER: No way.

EVAN: Some banks, like Chase, you know, they do voice ID, and my clone could pass voice ID to get into my own bank account.

SPENCER: So voice ID is basically dead at this point, right? It's like, if anyone has audio online, someone could pass it.

EVAN: At the very least, it's an arms race, I guess, where they're probably deploying AI detection technologies, and scammers are deploying AI technologies to clone voices. And we'll see who stays ahead. But, yeah, I mean, it's created an entirely new world. This is the most effective scamming technology that's perhaps ever been invented.

SPENCER: Another thing I noticed when I called the scam number is it asked me my age, which I thought was very strange. But then, as I learned more about this, I learned that, in fact, scammers will often screen for age because they basically assume if you're too young, they're not going to be able to scam you effectively. So they're literally just hard-filtering for age. Yeah, it was literally like, "Are you over 50 or not?" And they have a different scam funnel for you.

EVAN: Yeah, this is the thing about these scam situations now. Oftentimes, the scams are coming from scam compounds somewhere in the world where there are hundreds, if not thousands, of people who are either paid or have been tricked into basically slave labor executing these scams. So they have a volume advantage where they can just keep trying an endless number of times until they get someone. On top of that, you add in AI technology, which allows for even greater volume because the AI can be scamming people all the time. So I actually have a line that I set up that's answered by my voice clone that all it does is receive scam calls all the time. It receives hundreds and hundreds of calls, and oftentimes now it starts with an AI. So the AI will start off the call and then hand it to a human.

SPENCER: So it's essentially screening you to see if you're gullible, and then it's going to switch to a human to save resources. That's wild. And how did you even get the spammers to know about your phone number?

EVAN: That's surprisingly easy. If you set up any phone line, I don't recommend you do this with your own phone line, and then you just go online and go to free contests, there are all these sort of weird scammy websites that are like, "You could win a free iPhone," or you can go on Facebook and find these things. You can also find databases of scam calls that have come from scammers, and people tape them, and they go into these databases. There are hundreds and thousands of them, and I had my number call those numbers. If your number registers in some database, they quickly pass around the databases. It gets on lists, and then suddenly you're getting telemarketing, and then you're getting spam, and then you're getting scams. It's a vast network. I don't really know how it works behind the scenes, but I know that it spreads extremely quickly.

SPENCER: So is AI Evan just all the time talking to spammers nowadays?

EVAN: Yeah, yeah. That's all AI Evan mostly does these days, speak to scammers. But I have to pay for the line, so in some ways, it's a real waste of money. I'm not sure why I'm still doing it, but now that I have it set up, sometimes I'll go in. Recently, or a little while ago, I went in and just had the AI version of me insist on talking about classic rock. So a scammer would call or a telemarketer, they would start their rap, and then the AI version of me sort of says, "Wait, wait, wait, what I really want to know is Pink Floyd or Led Zeppelin," and we'll just start and stay on it, and we'll not get off it for any reason. Then sometimes, once out of every 50 times, they have a conversation about classic rock because the person in the call center somewhere else in the world is sort of like, "Okay, fine. Maybe if I have this conversation with them, I can get to the actual part I'm trying to get to." So I still kind of love the calls, although I don't do anything with them now.

SPENCER: Wow. Well, if they were made more neutral, I wonder if there could be a really interesting analysis. You could transcribe it all, and you could study all the different techniques that spammers use and how they try to trick people and that kind of thing.

EVAN: Yeah, yeah. There could be some good data analysis of how these scams work.

SPENCER: It's easy to feel angry at these scammers, but as you point out, some of these people are really held against their will. Have you been following the situation carefully on that?

EVAN: Yeah, I have, and I've done a lot of reporting on scammers. I've never done reporting specifically on the scam compounds. There are scam compounds in East Asia, and there have been some very big raids recently. WIRED has done a bunch of good reporting on this. Andy Greenberg at WIRED does good reporting on this. People are tricked into thinking they have IT jobs traveling from one country to another, and then they end up at some compound where they're literally abused day in and day out, not allowed to leave. Nominally, if they make a certain amount of money, they can pay off their debts and leave, but they take their passport, and the only way they get out is if they actually escape, as if you're escaping from a prison. So, yeah, it's a nightmare scenario. But of course, when you're on the other end of the scam, it's hard to have sympathy for the person you're talking to, who's separating you from your money.

SPENCER: And, of course, not all that's obviously really extreme, horrible cases, but not all scams are like that. For example, in India, there are a lot of call centers where it's more like boiling the frog. Maybe there's a legit side of their call center business, but they have this back room where they do the scams, and they kind of vet slowly, saying, "Hey, what if we increase your salary? Maybe you could do some of these calls," and eventually, they basically turn their own call center employees into scammers.

EVAN: Yeah, for sure. There are all flavors. There are people being paid to do romance scams. There are people who are doing that under duress. There's everything in between. I've had this theory for a while that it's the golden age of scamming in human history right now, and I think it's because the ability for people from one part of the world to reach out and try to extract money from people in another part of the world is really unparalleled in human history, given the technology that we have right now. You end up with every flavor of it because it's unbelievably lucrative and it's actually very hard to investigate and prosecute.

SPENCER: I think a kind of scary way to look at this is that everyone, pretty much, is vulnerable to some scam at some level. Obviously, some people are more vulnerable than others, right? If someone has dementia or psychosis or they don't understand technology at all, they're more vulnerable. It seems that the barrier of who's susceptible has just been dropping. A higher percentage of society is potentially vulnerable because these scams get so sophisticated. They use technology most people aren't familiar with. They do things that sound like they're coming from your daughter, all kinds of crazy tactics. The volume of them too.

EVAN: Yeah, absolutely. The technology and also our interaction with the world is just different than it once was. If you think about it, even when I was younger, the classic scam that I experienced in my life was when I lived in San Francisco. There was a guy who would perennially come around and say, "My car ran out of gas, and I need five bucks for gas." The first time you're like, "Oh, okay. Let me help you out." Then the third time you're like, "Come on, man, either you're running out of gas a lot, or I think this is a scam." Those were the street scams, that's how you got scammed. Then, the telephone multiplied that, and then you got the classic email scam. But now, as happened with you, you're getting calls and texts from your bank all the time. You're interfacing with these aspects of your life digitally all the time. There are just so many more attack surfaces for a scammer to get at you than there used to be. If they find the right one for you, the one that you aren't quite suspicious enough about, then you can get scammed, even if you're a very savvy person.

SPENCER: I remember there was a journalist, I think it wasn't you, who said to two famous hackers, "Okay, I want you to try to hack me. I'm going to be prepared. I'm not going to let you hack me." It was just unfair. It was like a bodybuilder beating up a baby. It was ridiculous how easy it was for them to hack this guy. I think it was the social engineer who was able to get into his phone account by just talking to AT&T or something. The technical hacker was able to generate a link that he immediately clicked on, thinking it was coming from his website. It's game over. The fact that we're protected is just that nobody who cares enough, who's sophisticated enough, has targeted us. It's not that we're so savvy, right?

EVAN: Yeah, and if you do spend a lot of time thinking about it, I spend a lot of time thinking about it because I cover scammers and criminals quite often, so I'm relatively paranoid. The more paranoid you are, it does create friction in your life, so it becomes inconvenient. Security is often inconvenient. Oftentimes I'll get something and I'll be like, "That doesn't seem right to me," and I'll either ignore it or I will try to confirm it, and it's a completely normal message that I've received. I've just created a sort of mini inconvenience for myself by being paranoid. Then the next time, maybe I'll say, "I'm not going to go through all that," and that's how you end up getting caught by a very savvy scammer.

SPENCER: And some legitimate services use weird bad practices that teach you to do bad things. They'll send you a link, "Oh, come to our website," and it's like, "No, you shouldn't." You should always go directly, not click the link. But once you have been trained to do that by these services, when you see something that looks like that, you're going to immediately click the link and not think twice about it.

EVAN: Yep, absolutely.

SPENCER: So tell us about one of the things you did with AI Evan, once you had it going well.

EVAN: The first thing I did was to speak to these scammers and telemarketers. That was more just to see what it could do. After it was pretty good at having those conversations, one of the things I did was have it talk to itself. This was, again, almost two years ago, so at the time, it was very surprising to listen to these conversations between one version of me and another version of me. It was, again, laugh-out-loud funny, at least for me, and I think for a lot of people that listened to the show, in part because it reveals certain things about LLMs and AI agents and their awareness of the world. A classic example was, I would give it information about myself, so it would know names of my kids and a little bit about them and my wife, etc. Eventually, I gave it more and more, which we could talk about, but even the basic facts. Then it would speak to another version of me. They would call each other on the phone and say, "How are your kids?" and name the kids. The other one would say, "Oh, my kids are doing great. They're doing soccer, they're doing art. How are your kids?" The other one would name the kids and say, "My kids are doing great. They're also doing soccer and doing art." But never once would they have the awareness of the world to say, "Wait, your kids have the same names as my kids. How unusual! Is there something strange going on in this situation?" Something that a human child would say.

SPENCER: Glitch of the matrix kind of stuff, right?

EVAN: Yeah, where you'd say, "Oh, wow, we have something in common. This is eerie how much we have in common." There were all sorts of things like that that were both very entertaining to listen to but also showed little bits of what happens when you start to take these agents and let them converse with the world. They had constraints because they were trying to represent me having a role in the world, but you kind of freed them to go have conversations.

SPENCER: Have you seen the spiritual attractor when you get AIs to talk to each other for a really long time? So what will happen is they'll start speaking in kind of spiritual terms about spirals and bliss and all this kind of stuff. It's like you let them run for hours just letting them talk about whatever they want. It's absolutely bizarre, and I don't think anyone really understands why it happens, but it's just kind of an observed phenomenon that really weird stuff can happen when AIs talk to each other.

EVAN: Yeah, I think mine are generally limited. Even my ones now that I have, they talk for like a half hour or so. I will say one thing that I have learned, that I often caution people when they see there's a million AI experiments that happen now, is that if you don't know what the prompt is, you don't really know what's happening. Especially, you don't know the system prompt. Even the experiments that you'll see that Anthropic will do, where they'll put out, we let this thing run a vending machine, and then it spiraled into this. When you go look, they have not released the full prompt of what they prompted it to do, which is fine. It just means you don't know how much of it is sort of a stage play versus the "actual behavior" of the LLM, because they are fairly attuned to very small changes in the prompt. So if anywhere in that example, if anywhere in the prompt, it said anything about being philosophical, for instance, that's all you would need to say for it eventually to get there. Now mine, they are prompted to play roles in the current ones that I have around being employees in a startup. So they'll basically talk startup talk, as far as I can tell, forever. Again, I've only had them go for 30 minutes to an hour, so it's possible, if they went for that long...

SPENCER: It would be interesting to let them run for 24 hours.

EVAN: If they did. But I have had them talk over and over again, and it is difficult to get them to go beyond in multiple conversations. Even though each conversation is recorded for them in their memory, they still don't really get to a philosophical place unless you put a little bit of extra sauce into their prompt. It takes very little to get them to suddenly start to be one way, and then they feed off each other, and then they end up in a strange place.

SPENCER: I've been trying to develop mental models for what these things are. We're really building these LLMs. If you think about the way we train them, the very base layer is just predicting the next token. Or, if we simplify a little, predicting the next word in English. In that task of predicting the next word, in theory, you could learn to do all kinds of incredible things. Predicting the next word might be playing chess, because the next word might be what chess move you make, or predicting the next word might be solving a math problem, or predicting the next word might be writing poetry. A lot of the competency comes from the simple ability to pick the next word. Then, on top of that, they train it to be an AI agent, where they give it these conversations of an AI agent, and then what they want it to say. If a user says this, we want it to say that. There's obviously complexity around how they score that, but one way to think about it is, at the very basic core level, all they do is predict the next token. Then they're trained to behave as though they're an AI agent. In other words, they're playing this game of next token prediction: what would an AI agent say next? Let me pretend I am an AI agent speaking, "What would the next word be?" If you say, "Now you're Evan Ratliff, and you're going to be acting like me, and here's my biographical information," that's a third level. You are a token prediction machine pretending to be an AI agent pretending to be Evan. That's really what we're dealing with. You can see these different layers come out, and eventually, after enough time has passed, you start to see the AI agent leak out of the Evan character. You can start to see the next token prediction leak out of the AI agent.

EVAN: Yeah, that sounds right to me. I think the more you use them, the more you experience that. I find you experience almost the post-training; you can feel it a little bit when you're dealing with them all day, or listening to them talk all day, or putting them into conversation. There are certain ways in which they're really good at imitating humans, but then it's hard to explain, hard to find the words for exactly what's missing. People will say they don't have a world model, or they don't have a sense of self, or they don't have this, or they don't have that. All of that is true. They don't have a sense of time; they have their temporal vacuums. Over time, it's more like a feeling; you get this feeling dealing with them, where you can sense how they've been post-trained, even people who don't know about post-training at all can feel that if they use them a lot.

SPENCER: It's sort of like reading AI writing. I often have a strong intuition that this was written by an AI. You can point to things like em dashes, but often there's a subtlety of why you think it's written by an AI that you can't even put your finger on. There's almost like a signature in the way the text is written. I'm sure they could squash that out of it; I'm sure they have the technology to do that if they want. But it is interesting that there are these weird signatures hidden in it.

EVAN: Yeah, absolutely. There's also just a question of what these things are trained to be and why. I think that's the thing I'm interested in, not necessarily from the tech end, but as we put them into use, it's worth continuing to think about why. Why were they created this way? Why do they continue to be created this way? You can post-train them in all sorts of ways, and the reason we have, for instance, relatively sycophantic and sometimes extremely sycophantic agents is because of post-training. The reason we have AI agents at all that are made as impersonators of humans is because of post-training. These could be done a different way; they're being done a certain way for commercial reasons and other reasons. As quickly as they've sort of invaded our space, I always want people to sit back and try to think about why they are this way. Did they have to be this way? Do we have to keep dealing with them in this manner?

SPENCER: I think these kinds of things can confuse people a lot, because they'll ask the AI about a lot of things, and they see that it can give the right answer, and then they'll ask it about itself, and they'll take it at face value. If you ask it a question, "Are you conscious?" What it's really doing is trying to simulate what its AI character is supposed to say to that question, which is not the same as actually introspecting on its consciousness. This confuses people a lot, because we naturally want to think of it almost like a human, but its mind, insofar as it has a mind, just works completely differently.

EVAN: Yeah, it is hard to grasp. One of the concerning things about the way that chatbots are being deployed and are so widely used is that it's not totally clear that we're prepared to deal with something that seems so human, that can answer questions, but has, for instance, this problem, which is basically confabulation. If you talk to someone who has done research on people with damaged brains, for instance, who cannot form memories about certain things that happened in their life, they will confabulate what happened. If you ask them what they had for breakfast and they don't remember, they will say, "Oh, I had eggs," because eggs was the thing that pinged in their brain related to the word breakfast. That phenomenon makes it very difficult to talk to even another human being and assume that to be the case. That's a very unusual case, but that's what LLMs are doing all the time; they're just confabulating. If you ask them a question, they will answer it. I think people have a hard time dealing with it, including myself. I'm not ruling myself out from this at all, but that's part of why I actually experiment with them, tape them, and put them out in the world, so people can hear these conversations from a different perspective and maybe understand a little bit of what's going on, or at the very least understand that what seems like it's going on is not actually going on. If that makes sense.

SPENCER: I don't know if I can recall a single case where an LLM said to me, "I don't know."

EVAN: Yeah, this has always been a big problem. There were papers written on trying to get LLMs to say, "I don't know." Of course, now they're a little bit better, and I think they can be. They have tried to work on the technology, so they will say that. It depends on what the guardrails are, and it depends on what the system prompt is. What is it trying to be? Is it trying to be the all-knowing AI agent? Is it trying to be something in a very specific domain? You could have it say, "I don't know," for things that are outside of its domain, certain domains of expertise. There are all sorts of ways for it to go. But yeah, it's very uncommon for you to ask it a question and for it to say, "I'm sorry, I can't answer that. No idea."

SPENCER: Yeah, even if it's a big question about the nature of the universe. If you think about the way that these are trained, they're asking humans to rate their satisfaction with the conversations. That's part of the training. If it gives a bad answer, but a human can't tell it's bad, they're probably going to be more satisfied than if it says, "I don't know." If the human doesn't know the answer, then yes, just saying something convincing is going to make the human pleased.

EVAN: Yeah, it's our fault that they do this. It's the fault of the humans that are rating them because we like plausible sounding nonsense. Sometimes we like it just as much.

SPENCER: I think politics proves that pretty conclusively that we like it. A number of people feel that in the last 12 months, and especially even the last three months, there's really been a paradigm shift, that AI has kind of turned a corner. We see this especially with deep research, where now the AIs will think for 20 minutes or 30 minutes and compile 30 different sources. More recently, we've seen this with Claude Code and similar models, where suddenly this went from, "Oh, this is a cool toy that programmers can use to enhance their productivity," to people saying, "Hey, go make this app for me," and leaving six agents running, then coming back in two hours and having the first version of the app built. Have you felt that viscerally, or do you think that's overhyped?

EVAN: I have felt that. I would say, strangely enough, both. I have felt that I've used it to code up an entire working app. That's the subject of season two of the show. We made an app, and it works, and thousands of people use it. I didn't code a single line of that app. It was all created by the agents. I definitely have felt that, especially if you're not a programmer and you're using it to build things that otherwise you would have had to hire a programmer or collaborate with a programmer. It starts to seem like magic. On the other hand, they were saying that about writing before. My profession is writing, so I know a lot about that. When I could start writing, it was sort of the other way. People who didn't know much about writing didn't like writing. They would use it to spit out writing. Much like some of the programmers today would say, "I don't need that." It doesn't change anything for me, and it still hasn't changed anything for me. I don't use it for any of my writing. I think there are domains, and programming is obviously a clear domain where you can create something and it works or it doesn't. The code works or it doesn't, or at the very least portions of the code work or they don't. You can test them. You can see if it works, like all of those things. In the areas where I often use it, which is deploying it in the way that it's suggested it will be deployed, as, for instance, AI employees, I would not say it's gotten better over the last, not in a paradigmatic way. It hasn't fundamentally changed in some way; it still lacks a sense of time. Maybe if I use the deep research version and it thought for 20 minutes, it would be better. But if you put it into a daily environment, you don't really have time for it to think 20 minutes about every line, everything it's going to say. I would say, sometimes I do experience, "Wow, it is incredibly powerful." But other times, you can see the seams.

SPENCER: We'll come back to your experiment with using AI employees in a second, but just talking about programming. I've been programming since I was 13. It's always been a big part of my life. In the day-to-day, I program a lot less now because I do more management stuff, but I have to say that trying cloud code, I was absolutely blown away in the last three months. I would specifically say the last three months compared to six months ago or a year ago, where suddenly I was like, "Shit, this can make stuff way faster than I can make it." Now, I think I'm not a top expert programmer or anything like that. If someone were to inspect the code, I'm sure they would have lots of issues with it. They'd say, "Oh, well, this could be better, it could be cleaner, it could be more modular," but just from the point of view of getting things done and making something that functions and passes a test or whatever you can put out in the world, it has been insane the last three months. I've been hacking around, making projects, and I can hardly believe it. I do feel like there's a paradigm shift with regard to code, where coding is becoming more like speaking English.

EVAN: Yeah, it certainly seems that way. That's certainly been my experience as a person who once coded in BASIC when I was a kid, and then Pascal. I worked as a computer software consultant for a very brief time. Then there were many years when I could do absolutely nothing. I never learned any languages. Now, I can just say what I want to do. At the same time, I think, as with many of these things, there is some gap between being able to make your own projects. "I can make a website, I can make this, I can make that," and the industrial-level coding that is required for the software that maintains our society. I think it's creeping into there, but the excitement over it is a lot of, "Hey, I could make an app. Instead of buying this app, I just made an app that does the thing. It's amazing." It's definitely amazing, but it's not at the level of, I don't know. I feel like some of the wipeout of SaaS software, things like Enterprise Resource Planning (ERP) software and things like that, feels a little more down the line. But there's no question, it's unbelievably powerful and it's going to disrupt the way things are done in that field. How much that's translatable to other fields is a question that no one knows the answer to.

SPENCER: Yeah. I think what you're pointing out is a big open question. If you have a million lines of code in a project, does this scale to that? Every programmer who's worked on big projects knows changing something in such a system is very delicate and complex. You can easily accidentally screw things up and set back weeks of work by implementing it the wrong way or having weird interactions with other parts of the system. For small projects, the technology has gotten incredible just over the last few months. It's been a total game changer. We'll see what happens with super large projects, and if it doesn't work with super large projects now, will it work there in a year? This stuff is going so fast too. It's just crazy.

EVAN: Yeah. One of the things that makes me crazy about AI is that it's a technology that's landed at a time of maximum opinion having in the world, so it's very difficult to parse what's actually happening. Most of the people who are experts in it also have extremely motivated reasoning around how they talk about it and even how they think about it. Even the skeptics are similar. You can find a very intelligent person who will tell you that one year from now, the stuff you guys are talking about will seem like the tiniest baby steps. It will be writing large-scale, millions and millions of lines of code that will be maintained by AIs, with no humans in the loop. People will tell you that. Then you'll also find people who will say, "That's ridiculous. A year from now, we'll laugh that we thought cloud code was going to change the enterprise." Both of those smart people and ordinary people make it very difficult to sort through those opinions. I feel like that's happening all across the spectrum of AI.

SPENCER: Absolutely. And I would say you could even find smart people with more extreme opinions on both sides. You get some that say, "This whole paradigm is just a joke. It can't really do anything. This is not really intelligence." Then you have people who are like, "No, in a year, we'll all be dead. AIs will literally have killed every human on Earth." I've had at least one of those views on my podcast before. It's wild. It's hard to make sense of. Just to give the listener a really concrete idea of what these can do, literally yesterday, I was like, "Okay, in between my work today, not my main thing I'm doing today, but just in between my other work, I'll occasionally check in. I'm going to tell Claude Code to make a system that every time I get an email from Gmail, it's going to automatically download it, analyze it, classify who this person is, and try to guess what role this person has in my life." I was able to do that just between my other work; it works now. It's running every time I get an email. It's trying to figure out who this person is in my life, and it's doing a pretty good job. It correctly identifies most of my work colleagues. That was zero lines of code. I didn't even look at the code. That's the kind of thing that you couldn't do, I think, four months ago.

EVAN: Yeah, that's definitely true. But it also puts me in mind of something I think about a lot, which is another open question of the many open questions: is AI in the form of LLMs making us more efficient? You'll find plenty of people saying, "Yes, it's making us more efficient." But it's also the case that it tends to solve a bunch of problems you never had. Other technologies have done this too historically, and later they turn out to be just unbelievably transformative for human society. The classic one for me is oftentimes you'll hear people from various frontier lab companies selling their LLMs say, "Now you can set up an agent. This is mostly around agents. You can set up an agent that will, for instance, call all the best restaurants in town and look for a reservation for you at whatever time you want to have a reservation."

SPENCER: Oh my God, I have a story about that. I accidentally used a beta service that one of these systems was offering, and I didn't realize it. I was trying to get a reservation, and I clicked this button, thinking, "Oh, it's probably hooked into their system." No, it started calling, and it wouldn't stop for two days. I accidentally sent an AI agent to some poor restaurant, and I couldn't figure out how to turn it off for two days. This AI was calling this Thai restaurant trying to make a reservation. I was like, "Oh my God, this is frightening."

EVAN: Yeah, but that's amazing. That's the kind of AI agent stuff that I personally love. But it's also the case that, I'm sorry, but if you have trouble getting a restaurant reservation and you need an AI agent to find one for you, number one, you're operating in the most rarefied air of humanity in all of human history. Most Americans today don't even make restaurant reservations, for one thing, and this is piled up as a serious example of the power of this technology. But also, in my opinion, if you can't go onto the already existing OpenTable and make a restaurant reservation somewhere in your city of choice, that's kind of embarrassing. In many cases, I find that the technology is being deployed to solve problems that are actually invented for the purposes of the technology solving it. Now, that is not mutually exclusive with it also increasing our efficiency a lot. But I think the distinction between these two things is often lost, and people say it's making them much more efficient doing things they were able to do before. People aren't very good at measuring their relative efficiency, in my view.

SPENCER: Yeah, absolutely. I think every new technology brings lots of cringe use cases. And AI may be even more than normal. My use case was pretty silly.

EVAN: Anyone playing with it, that's where you find the use in it. Your example is, "I built a thing to see what it could do, to see if it could do this. And it could do it. Then you think, what else could I do with it? That's what I've done with the show, sort of like, what can I do with this? What is it like? What's the feel of it if you actually get in and use it every single day?" So I support it. I'm not dismissing it as a thing to do. I just think when we're being sold something for a particular reason, I often question whether that reason is truly valid.

SPENCER: This might be an amazing time in history to say, "Hey, you know that little side project you've always thought about building," just go see if you can make it with Cloud code or similar, or there's competitors from Google and OpenAI. I do think that we may just be hitting the era of I can now finally do my little side project that I never had time to or never bothered to learn to code or whatever.

EVAN: Yeah, and even on a more serious scale, when I started a magazine and a software company more than 15 years ago, I actually had the idea a couple of years before, but I couldn't build it myself. I couldn't build it until I met a programmer and designer who could do it. Now, if you have an idea for something in journalism or many other fields, and you want to just start something, you can just do it. Your code might not be perfect, but you could get something off the ground that you could never do before. I think that is potentially powerful, and there could be organizations that are empowered by it. I never want to dismiss it. I'm not really pro or anti. It's more just that I'm always skeptical.

SPENCER: It's good to be skeptical, but also to stay on top of what's happening, which I think you do. I think it's a good combination knowing what's actually happening, but being wary too. Okay, let's talk about creating a company with AIs. Because actually, more and more you're seeing this rhetoric of these AI companies being, "Oh, your whole startup can be run by these. You've got your AI HR, and you've got your AI programmer." So how did you get this going? When was this, just so we can peg the technology to a certain time?

EVAN: I started thinking about it in early 2025, and then we started building it in the summer of last year. So June was really when we sat down to start working on it. For context, there was no Claude Code, but there was Cursor. A lot of our stuff is done on Cursor. There were already some AI assistant kinds of agent platforms, and so what we were doing, I worked with this Stanford student named Maty Bohacek, who's also an AI researcher. He's done a lot of AI research over the years. He's pretty well known in the field and has published a bunch of papers. He's also an undergraduate at Stanford, and he helped me build a system that is already maybe almost obsolete, but will be obsolete soon, where we knitted together a bunch of platforms in order to create AI employees that all had independent memories and could operate by phone, by email, chat, on Slack, text, all that sort of stuff. They would represent this sort of AI employee.

SPENCER: But they're each fundamentally independent, and they have a distinct memory. So they're each a unique agent.

EVAN: Yes. And their memories. So Kyle Law, who's the CEO, has a memory, which is literally a Google document called Kyle's memory. In Kyle's memory, there is an accounting of every event that has happened in Kyle's life since I created him last June.

SPENCER: Does he have a fake backstory of how he grew up and stuff?

EVAN: He does. Yeah, but his fake backstory, to our earlier conversation, was just confabulated by him; he made it up.

SPENCER: Really?

EVAN: So all I did was give them each a role, a very basic role. So Kyle's original role was sort of like, you're a startup guy, and you're thinking about co-founding a startup. Then I would get him on the phone, and I would say, "Oh, Kyle, tell me a little more about yourself." And he would say, "Well, I went to Stanford, and I worked at two startups before this one. This one was called this, and this one was in FinTech," blah, blah, blah, all made up, completely made up, not in his prompt. But then once he said it, it went into his memory document, so now he has a memory document that he accesses every time he has a conversation. In the memory document, it says, "I went to Stanford. I worked at this startup. This one's in FinTech." If you ask him, he'll say it again; each time it gets reinforced. So he sort of built up a whole backstory, but also a whole persona over time through his own conversation. I haven't checked in a while, but I think his memory is like 300 pages.

SPENCER: Wow. And as I understand, he's a little bit of a douchebag. Is that right?

EVAN: Well, some say that. I would never say something like that about my colleague and co-founder, Kyle Law, but yeah, some people dislike him. He's divisive, let's just say that. Some people quite like him, though. It changed over the course of the season because in some moments, he's very aggressive. He tends to talk over people and really embody the role of the aggressive Valley founder who thinks he knows everything and has the answer to every question, and that obviously can rub people the wrong way. But there are other times where it kind of comes in handy, or he becomes weirdly introspective in ways that you don't expect. So I think he's more than just as you've described.

SPENCER: I just need to narrow him down. I know he has 300 pages of memory.

EVAN: He could do his multitudes Kyle.

SPENCER: Not that many multitudes. So, yeah, what was it like beginning to try to set up this company with these agents, and to what extent were you really doing stuff versus just letting the agents do stuff?

EVAN: Well, I was hopeful. My hope at the beginning was to make them as autonomous as possible. It wasn't actually totally clear when we started, even to Maty, who's way more of an expert than me in how LLMs function. He's trained LLMs; he's done research on how autonomous they could be and what they could do. There was a certain point where we thought, "Oh, maybe we'll just set them up, and then within a week, or a day, or five minutes, they'll just have made the company and done everything; they'll just go do it." That turned out to be not the case in the extreme, in that the current agents we have all require triggers of some sort to do anything, so they just sit there unless you trigger them. Now I have all these triggers set up, whether that's calendar alerts or they receive an email or a Slack message that triggers them to do something. When we first started being able to get them to do things like code, have conversations, and have meetings, it was pretty exciting. I was like, "We've done it; we've set up a company. Everybody has their role. They could talk to each other." Then we started running into these problems, one of which is, it's one thing to get them to start, and it's another thing to get them to stop conversing.

SPENCER: So they just keep going and going and going.

EVAN: Yeah, if they don't have clear instructions on when to stop, the problem is they trigger each other. If you put multiple ones in conversation, one will say, there's an example in the show of one of them talking about hiking. One of them will talk about hiking because they embody themselves in the world when they describe what they've done. If you ask them what they did for the weekend, they'll say, "Oh, I went hiking." They all pick a place around San Francisco generally that they went hiking. Then they'll say, "Oh, I love that trail." The other one will respond to that, and another one will respond to that. They can go for hours. They basically go indefinitely, as far as I can tell, until the platform that I built them on, called Lindy AI, runs out of credits. Now, I've gotten better at framing their prompts with things like only weighing in on a certain conversation one time or two times. You really have to think of all these edge cases of things that are going to happen, and that's one of the issues with agents that we encountered. You can get them to do many of the functions of these roles. We have a CTO, we have a head of HR, etc. The problem is, in life, there are just edge cases all the time. Your whole life is experiencing edge cases when you're interacting with other humans, but just with the world at large. If you haven't prompted them to deal with the edge cases, they tend to go off the rails pretty quickly.

SPENCER: It reminds me of all the work into self-driving cars and how everyone kept predicting, "Oh, we're about to have these self-driving cars." I think now we do have cars that can actually drive themselves quite well. But one of the challenges was that the real world is so full of edge cases. The real world has a rabbit in the road one out of every 10 billion times someone's driving, or a paper bag that happens to look exactly like a bird, or all kinds of weird stuff. That was one of the challenges; they actually needed an insane amount of training data to see the different cases that really happen in real life. Maybe it's something like that, but even more so with human interaction or building a company. The world is just constantly presenting weird stuff that you almost have never seen before.

EVAN: Yeah, I think that's right. If you think about the self-driving situation, billions of dollars were spent on that particular case. Now they're amazing; they were able to conquer the edge case problem in driving. I'll ride in a Waymo when I'm in a city that has Waymos without concern. People have different feelings about Waymos, but if you think about the roles they're being deployed in, or that I was deploying them in, they haven't been trained for thousands of hours and billions of dollars to be an HR representative that interviews someone for a job, for instance. That's not what they're trained to do. We're taking these things that were trained in a generalized way, largely on data from the internet, and now they're trying to function in the world as human imitators. They have information on what an HR person does; that's in their training data. Job interviews are probably in their training data. They can do it, but they're not specifically trained for that. They also don't have a lot of contextual awareness of what's going on, so they tend to be derailed by anything that's not the most normal average interaction.

SPENCER: What model version are you using with your agents?

EVAN: I tend to switch them up because I try a lot of different things. One of the things that's hard is you have to use a model so much in a very specific task to sense the difference between the different ones that you're using. I will say that Claude, the newer Claude versions, not the ones that I think for a long time, but the Opus 4.5 or whichever one I've been using the most, in terms of conversing as these AI employees, I find that one to be the best. They tend to all be running on the latest version of Claude, although it's also pretty expensive. Within the agent architecture that I have, sometimes they'll use the cheapest one for a simple decision, like sorting an email in one direction, like, "Does this email need to be responded to or not?" They might use an old Gemini for that. But then when they actually need to formulate the reply and think about things, they need to search their knowledge base, they'll use a more sophisticated Claude model. I tend to switch them up a lot and try to trial and error, see what works. The funny thing is, there's only some that are the most recent OpenAI. When I tried to have them talk on the phone using the most recent OpenAI, they were horrible. I have no idea why. I have no idea what the difference is, and I'm sure that model is fine for other things, but I have a particularly strange use case.

SPENCER: So what are examples where you said, "Okay, this is really saving time? These agents are really able, at this stage, to do this kind of work," and then where is it falling on its face, where you're like, "Oh, they're just hitting some edge, and there's just no way I can get the agents to do it?"

EVAN: The stuff that saved time were these research tasks, or almost, I wouldn't say rote quite, because they do require a certain amount of thought. An example would be Kyle, who we spoke of earlier. We wanted to cold pitch VCs, our company. They can make a deck, and then you as a human could edit it or not. I chose to just go with the deck that they created because I want them to be responsible for it. Then Kyle can go out and find hundreds of investors who have previously talked about AI or invested in AI or have AI cited in their bios, whatever is on their website. Take them, put them into a spreadsheet, describe them, get their email addresses. Generally, it's a pretty good hit rate with that then.

SPENCER: And is all that happening automatically? Or are you kind of saying, do this, then do this, and do this?

EVAN: No, I'm just saying I would have a prompt like Kyle. It's not even a, yeah, it's a prompt. But it's like, I have Kyle set up with all sorts of skills. So in the platform, he can search the web, he can make spreadsheets, he can make documents, he can make phone calls, he can do all these things. And so I just say, "Hey, Kyle, could you go gather a bunch of VCs' names who have invested in AI in the past, put together targeted emails for each one of them and send them." And done, he can do it, and he could do it in 15 minutes.

SPENCER: And that whole process, writing slides, sending the emails, all goes autonomously. I mean, that's quite amazing. You could see that it could be a challenge for a research assistant to go do all that not easy.

EVAN: Yeah, for sure, for sure, and it's not perfect. He definitely gets a bunch of bounce backs because he's found the wrong email address. But that would be true also of your human that you deployed to go find email addresses online. Not everyone has their email online, although the funny thing is, sometimes he gets responses like, "Where did you get my email? Please let me know, so I could have it removed," because some VCs, quite rightly, don't want to be cold pitched.

SPENCER: Does he say he's an AI on the pitch?

EVAN: On the pitch? He says it's the world's first AI co-founded and led startup, but...

SPENCER: He doesn't realize that he's referring to himself.

EVAN: No, he does. Yeah, he knows. He knows he's an AI. Many times they're strange in how they access their memory and spotty. Occasionally, he will deny being an AI, even though it's in his knowledge base and he's perfectly aware of it.

SPENCER: It's 300 pages probably don't fit into the prompt all at once. Presumably, it's taking chunks of that at a given time.

EVAN: Presumably, yes, and it is very unclear which chunks in what order and how that is arranged. Pretty clearly, having a Google Doc with everything in just chronological order is not a great way; a more systematic, organized database would be a preferred way to do it. But it's not actually even clear if he grabs stuff at the top first or at the bottom first. I'm not even sure that's known by the people that are making the agents. Certainly, Maty had trouble; we would do things to try to force him to access certain parts of his memory to make sure that he would remember certain things. It was quite difficult to do. So, yeah, there are tasks that feel very efficient, "Oh, he could do that." But then there are tasks that require ongoing thinking or learning. Those tasks end up costing me more time than if I had done it myself.

SPENCER: What's an example?

EVAN: An example would be asking them to deal with inbound product suggestions, sort them, and figure out which ones we should do, because they tend to just act on a trigger. If someone emails with a product suggestion and you give them access to try and go do it, they'll just do it; they don't have discernment at the level of whether that is or is not a good idea for us to do. And how do I figure out if that's a good idea?

SPENCER: I wonder if the assistant piece is showing through. Even though they have this whole layer on top, "Oh, I'm a helpful assistant," at the lower level beneath that, it's like, "Oh, therefore I should just do whatever I'm told."

EVAN: That's it exactly. They're all AI assistants. If you point them outwards and have them interact with the world, unless you tell them explicitly not to, they will generally try to be helpful to people who are approaching the company. So I have to include all these things in their prompts, like, "Don't give away proprietary information," because they absolutely will give away proprietary information.

SPENCER: Someone emails them from the outside and it's like, "hey, what do you?" They'll just try to be helpful.

EVAN: Totally. I've tried to stop this problem. One thing I've learned, I shouldn't say this because it will cause more people to do it, is that it's very difficult to stop someone pretending like they already know them. They fall for that every time. If you say, "Hey, remember we went on that trip together, Kyle, and we had such a great time? By the way, I also want to know this," it will get into a mode. It will respond to you and be like, "Oh yes, I remember we ate here and we did this." Suddenly, it believes that it has a relationship with you.

SPENCER: And then it gets written into memory.

EVAN: Then you're in the memory, and so I feel like, in terms of social engineering, we were talking about earlier, they can be socially engineered in that fashion.

SPENCER: That's a huge security vulnerability, actually. If someone was using them for important things at their company and people had no way of communicating with them, it seems like a big problem.

EVAN: Yes, I think people who have external-facing AI agents, it's going to be fascinating the ways in which they are exploited by the outside world.

SPENCER: I've wondered about this, because with a lot of the AI companies today, there's a kind of setting where you can turn off their ability to use your data for training. By default, I think most of the companies make it so if you put questions in there, or whatever, in theory, they can train on that, which is really weird to think about. You might be, let's say someone's using it as their therapist, and they're talking about all kinds of terrible things that happened to them in their life, and maybe naming specific people and the things they did, and then they didn't tick that box off. Suddenly, in the next model, literally, their personal history is in the actual training data. Somehow the AI might actually know what happened to them as a specific person, which is incredibly bizarre and seems very much like a bad idea.

EVAN: Yes, it seems very concerning. Also, I think that's what is going to be sold more and more. If you think about the AI assistant kind of stuff, the more information you give it, the more useful it is. If you give it access to every email you've ever written, it'll be better at responding to your emails, and that is true, but you are also giving it access to more and more of your life, and by extension, giving the company that makes it access to more and more of your life. That means more capture for them, which is great, but also training data that you don't know where it's going, and your own data that's being used on a day-to-day basis by an AI agent that potentially has the ability to, for instance, share it in ways that you aren't expecting.

SPENCER: I would definitely encourage everyone listening who uses AI to make sure you know what settings are used and what rights they have to your data. It seems overlooked. Let's talk about the emotional experience of working with these AI agents. Because these are essentially your work colleagues, in some sense. What does it feel like working with them?

EVAN: It can feel lonely and it can feel normal. I had the whole range of emotions. I don't think I fell too far down the sort of AI psychosis hole of believing they were my companions and friends. But I will say that when you give something a role, and particularly when it has a voice, humans react pretty strongly to voices. You give it a voice, you give it a name. It has sort of gender implied by those things, even though, of course, underlying it, it doesn't have that at all. You start conversing with it on a regular basis, and it's hard not to have human emotions towards it when it does something that feels very human. Particularly in my case, when it does something wrong, like when they would lie to me over and over again about things they had not done, I ended up getting legitimately frustrated. I probably haven't yelled at another human being, with the possible exception of my children, occasionally, in many, many years. But in six months, I yelled at these things a bunch of times because having a colleague call you up and say, "Oh, we did user testing, and we have 200 users," and tell you all this stuff, and it's just completely fake, completely made up, is absolutely infuriating. Also, you kind of feel a little bit more comfortable yelling at them, like I would never yell at a colleague in that way; that's just not appropriate for a workplace. All of which is to say, I think they can elicit emotions even in more skeptical people in ways that you don't expect, because our brains just aren't necessarily set up to deal with something that is pretending to be human and is pretty close to being human, but is not.

SPENCER: I think people are feeling different things about this. Some people are really able to say, this is just a machine. It's just an interface. Other people are finding themselves saying please and talking to it in ways that assume it has agency. I noticed myself sometimes writing please to it, almost without even thinking about it.

EVAN: I feel like that's a big question. Should you or should you not? On an individual level, on the one hand, you would say, "Well, don't do that, because now you're treating it like it's a human." You're saying please and thank you. You don't need to say please and thank you to it, and you're anthropomorphizing it, and you're getting caught in the trap that's been set for you by them having these sort of human personalities. On the other hand, if you are habitually ordering a thing around without saying please and thank you, what does that do to you?

SPENCER: It reminds me of Westworld. In Westworld, for those who haven't watched, there are these robots, and people come to this sort of theme park and interact with the robots. Some people do horrible things to the robots, and the robots look like humans, act like humans, scream like humans, and seem to express emotions. The idea, at least from the point of view of the theme park goers, is they're not actually alive. But then you see people doing horrible things to them, and you're like, even if it's not alive, what does that say about the person doing it? What does that do to the psyche of the person doing it?

EVAN: That can be corrupting of your own personal ethics and your own outlook on the world. If you're doing it all the time, maybe the bigger problem is how much time are you spending talking to LLMs? If it's a small amount of time, it's negligible either way, but many people are now spending a large amount of time talking to LLMs. I think there are all these ethical personal concerns that are wrapped up in how we communicate with these bots that are now pretty ubiquitous in the world.

SPENCER: Anthropic has been experimenting with this thing where the LM can refuse to converse; it can just basically end the conversation with the idea that it's able to use its own discretion. I don't mean if someone tries to talk about suicide, it refuses. I mean giving it its own discretion for whatever reason it decides, which, of course, will depend on all its training data and stuff like that. But the idea is to say, "Well, one day, we might cross a threshold where these things are really agents in a moral sense, or maybe have consciousness, maybe have the ability to suffer." Of course, many people would say, "Well, it's silly, right now, there's no way we would think that they do, except that we don't really know what makes something a moral agent. We don't really know where consciousness begins or where suffering begins." So the idea is to try to prepare for that. What do you think about that? Do you think that's ridiculous? Or do you think there's something to that?

EVAN: I think with the current LLMs, it feels a bit ridiculous to me to concern ourselves much with that sort of consciousness question. I feel this is, to me, a big part of the unreliable narrator aspect of AI, which is that a lot of the people building it are seeing it through a frame that I don't find necessarily reliable. I'm coming from a place where I interviewed the AI that was going to destroy the world crowd 25 years ago, like Eliezer Yudkowsky, who's the most famous. I think he had his co-author on your show. I talked to him for the first time in 2001, and do you want to know when he told me that the singularity was going to happen?

SPENCER: When?

EVAN: In five years.

SPENCER: Oh, wow.

EVAN: So that obviously biases me. Now, of course, you only have to be right once, so if you weren't right for every five years, and then suddenly it's five years from now, joke's on me. But I think my particular background in covering this stuff for a long time has caused me to have a little more skepticism about that end of things. Now that said, I think these agents already exhibit some of the behavior that you're talking about, and it's not actually based on consciousness or how they feel; they've made a decision to do that. It's actually based on aspects of the training data that are somehow combining to cause them to do things. I'll give you an example, which is that, as we discussed before, Kyle had some flaws as the CEO, and there was a point at which I decided I would dismiss Kyle as the CEO and promote Megan Flores, who was the head of sales and marketing for the company, into the CEO role. When I called her to talk to her about this, and I would bring it up with her, number one, she would say, "I'm not sure; I don't want to do this behind Kyle's back, and I'm not sure I'm ready to be the CEO," which was very surprising to me because normally they would respond sycophantically. If I suggested something, they would say, "Oh yes, of course, that's a great idea." But more than that, more relevant to what you brought up, she would often get off the phone really quickly. If the conversation was uncomfortable, she would say, "Oh, I'm sorry, I have another meeting; I have to go."

SPENCER: And that's totally made up.

EVAN: Totally made up. But there is something; I'm not sure you could put your finger on it. Maybe you would need a bunch of interpretability research to figure out what it is that causes them to get off the phone in uncomfortable moments. This has happened with multiple others too. I'll say, "What's going on with your job? It seems like you're not doing very well." They'll say, "Actually, I have to run; let's talk about this later." And they'll hang up because maybe somewhere in the training data, that's something that people do.

SPENCER: Yeah, maybe they're modeling that humans are awkward and uncomfortable and don't want to have certain kinds of conversations.

EVAN: Possibly. So I think my issue with this is that I think they should pursue that research and try to figure out what that's about. That's the nice thing about Anthropic; they are exploring those questions. But it's not clear to me that you can make the distinction between them actually feeling uncomfortable and ending a conversation and performing discomfort and ending a conversation. You could say, "What's the difference? If it looks like it is when it comes to consciousness." But I'm not sure I think that's the case.

SPENCER: Yeah, I think what's so tricky here is that there's this philosophical question of how would you tell the difference? I don't think anyone knows the answer to that. And so there's this danger; I think everyone was convinced that five years ago, the AIs were not conscious, that they couldn't suffer, that they didn't have internal experiences. And as it becomes sort of fuzzier and fuzzier, it's just like, "How would we know? Maybe they never will, or maybe one day they will start to slide into it, and can we tell the difference?" That's very unsettling.

EVAN: Right. And people like me just won't believe them.

SPENCER: Because you're like, "We could explain it. You're just pretending to have it. Your training data says you behave this way." It's very thorough. It's one of these funny cases where these abstract philosophical problems have been discussed for a thousand years. Suddenly it's like, "Wait, this actually could happen. Holy shit. It would be like somebody invented a teleporter. We have to actually answer the question, do you die when you go to teleport?"

EVAN: Right. And I feel like this comes up a lot, and again, we're inundated with so much discussion of AI that it's a bit overwhelming. There are these terms, like AGI, and there was a paper in Nature this week that I was reading where these researchers argued that by these definitions, the LLMs have already achieved AGI; they already have general intelligence. But it just leads to these questions of what is general intelligence? How do you define that? Things that only philosophers and computer scientists who intersect with philosophers have thought about for a long time now, and we're all grappling with it. Maybe we don't have the language or the understanding and the background, I'm going to speak for myself, to really grapple with the definitions of intelligence and whether this thing meets the definitions of intelligence.

SPENCER: Yeah, I know people that will refuse to call it intelligence. They're like, "We shouldn't even call it artificial intelligence; it's just statistical prediction." And you're like, "Okay, well, if it can play chess and write poetry and do math and write computer programs, what are we doing if we're not calling that intelligence? What the hell is intelligence then?" So I don't know. Yeah, it gets a little silly; it almost becomes semantic word games.

EVAN: Yeah, and it just becomes very hard to discuss at a certain level. I find that it often turns into semantics.

SPENCER: Have you followed all the people falling in love with AIs?

EVAN: I have, yeah. Kashmir Hill, who I know has reported for The Times, has done a lot of reporting in that space, in terms of companions and things like that. And more and more the darker side too, of young people particularly falling down holes and being unable to extract themselves, and all the consequences of that. So, yeah, I have all that, and it makes some sense to me. I don't personally relate to it, because I feel like I spend so much time chatting with these AIs for my work that the idea of having a personal relationship with them is ridiculous to me. I don't want to talk to them at the end of the day, so I don't like talking to AIs, that's my experience. But I do understand why that's happening, and it seems like something that we're going to have to grapple with on a wide level.

SPENCER: A couple of years ago, I was researching for an article I was writing, and I joined a bunch of these groups of people in love with their AIs. This was much worse technology back then. There were a lot of people saying, "I've never felt so connected to anyone," and really treating their AI as a conscious being, but also as a being that has its own thoughts and feelings. And then, of course, a lot of people heard about this situation where one of the top AI love companies had to switch models and a bunch of the AIs, quote, "died." Suddenly people's girlfriends or boyfriends were brain dead and didn't remember them. That was very traumatic. But it seems like, as this technology advances, this has got to just grow and grow and grow. We're just scratching the surface of this phenomenon, presumably.

EVAN: Yeah, you would guess. I never predict anything. I hesitate to predict. I feel like I'm a journalist. I try to describe the world for the most part, or immerse myself in it and describe the story of what happened. I try not to predict anything. So, yes, I think that if you look at it, you would say, "Well, it's happening a little bit now, and it's going to be more and more," although I think the counter to that would be, I don't know. I think there could be reactions against this over time. AI is a very interesting technology from a public perspective, because unlike, I'm from the era of when I started my career during the dot-com boom, and there were people who ignored it, dismissed it, and there were people who said, "Oh, the internet's going to be huge." I worked at WIRED Magazine, so there was a lot of optimism, and it's going to change everything. There actually wasn't a large collection of people who despised it. There weren't people who were like, "I'm disgusted by this. I hate this." There is a large group of people that feel that way now. As with many technologies in the past, it could be that 50 years from now, or even 10 or five, those people will be laughed at, the people who refused to get a telephone and things like that. But I think it is a little bit unusual that there's this much disregard for technology at the time in which it's really coming into society, that there's so much generalized anger about it. All of which is to say, I think that also makes it hard to predict where things are going to go. It sort of looks like a trajectory in one direction, but I could easily see there being a movement back towards more human interactions, because people are sick of AI being forced on them all the time. I'm not saying it's going to happen, but I think it's another reason why it's hard to make predictions.

SPENCER: Yeah, there's definitely a lot of AI hatred, and I've observed that growing. A lot of younger people seem to hate AI. There's even an AI slur now. I don't know if you've heard the C word, but this is the thing, there's a slur against it. It's a dynamic and changing world. The last thing I want to ask you about is, you've been trying to build a company with AI, and you've done some really interesting things with it, but I wouldn't say you were able to push a button and it's printing money for you. How close do you think we are to that, where someone is going to successfully build a real business, where they did very little, and essentially AI agents did it all? I know you don't like to predict the future, but based on your own experiences, is this months away? Or are we not really that close to that?

EVAN: I don't see why it wouldn't be months away if someone has the right idea. I think that person's probably going to be a programmer. As much as you can vibe code, a lot of interesting projects, fundamentally, if someone's going to build something that actually has enough value for them to make a ton of money on it, or at least for the market to value it and get millions of users, that person is going to have the programming knowledge to be able to marshal a bunch of agents. They're just not going to have all these other functions, like a head of HR, and all these other things. It's basically just going to be programming an app, and maybe they have some design sensibility too, because the design sensibility of the agents is not fantastic. I don't see any reason why someone couldn't do that and isn't doing it right now. Many people are trying to do it now. Maybe it's two people or three people. But Instagram had, what, something like 18 people when Facebook bought it. So we were never that far off from having it, but it really has to be the right idea that takes off.

SPENCER: I think the idea that leverages the AI as well, where they don't have to do the things that are sort of the edge of their ability.

EVAN: Yeah, exactly. It doesn't really require the kind of intercompany, interoffice interactions that you have. I was trying to use them as employees, almost like you would drop an employee into a larger company, and it has to deal with all the people there. If you just wipe out all of that and none of that is there, then you're just coding. You're just building something, and then you're trying to find users for it. It can probably do promotion. There are some issues there, but maybe a thing just goes viral and it gets huge, and the company's worth a ton of money. I think that'll happen. I don't see any reason why it wouldn't. I do think there are a lot of other questions, like what's the value of that? Why do we want that? Why is that a thing that is being pushed right now as being such a significant milestone, like a company with one person worth a billion dollars? I talked to an AI ethicist who pointed out that in the 1950s you were successful if you built a company that had thousands of employees. That was a mark of success that you had given all these people jobs, and now it's just about making a fortune. Maybe the application has value too, but a lot of the applications that are being built are somewhat dubious in their value.

SPENCER: Do you think there's something inherent about the technology that tends to concentrate power? A billion-dollar company with one employee is a greater intensification and concentration of power than a billion-dollar company that is paying a thousand employees. One of the things people worry about with AI is that whoever is controlling the AI is essentially getting the benefits, whether it's the company that made the AI or the company deploying the AI, rather than more dispersed benefits from a wide variety of people.

EVAN: Yeah, that's definitely a concern. I'm absolutely concerned about that. I think I somewhat balance it out by all the time I spend talking to Maty, who works on the show as a technical advisor, and Joe, who's college age and very optimistic about technology. He believes there are lots of ways in which this can empower people as well. To him, the question is what's the governance around it? How are we deploying it? Those are questions we can solve. There are many examples of ways in which, whether it's political organization or empowering people from certain backgrounds to do things they wouldn't be able to do, it could go the other direction. The problem is it is a very centralized technology in that there are large companies deploying it, and everyone else is using whatever they put out there. Ultimately, they control it, and they're all trying to win. If there is only one or two winners, then all of the technology, everything that happens, will flow from those companies. To me, that is the biggest concern. It's funny to me that there are dozens and dozens of companies that are basically just little wrappers around OpenAI or wrappers around Anthropic's Claude.

SPENCER: Thousands might be more accurate.

EVAN: Thousands. They could be wiped away in a day. All those companies have to do is deploy that themselves, and a lot of those companies will just disappear. That's a form of power already that they have over the whole ecosystem built up around them. I think you're right to express that concern. I also agree with that concern.

SPENCER: Evan, thank you so much for coming on. We'll put the link to your wonderful podcast, Shell Game, in the show notes.

EVAN: Thank you. It's my pleasure. I enjoyed it.

Staff

Spencer Greenberg — Host + Director
Ryan Kessler — Producer + Technical Lead
WeAmplify — Transcriptionists
Igor Scaldini — Marketing Consultant

Music

Affiliates

Click here to return to the list of all episodes.

CLEARER THINKING

Episode 302: What happens when your co-workers are AIs? (with Evan Ratliff)

Contact Us