with Spencer Greenberg
the podcast about ideas that matter

Episode 128: What, if anything, do AIs understand? (with ChatGPT Co-Creator Ilya Sutskever)

October 27, 2022

Can machines actually be intelligent? What sorts of tasks are narrower or broader than we usually believe? GPT-3 was trained to do a "single" task: predicting the next word in a body of text; so why does it seem to understand so many things? What's the connection between prediction and comprehension? What breakthroughs happened in the last few years that made GPT-3 possible? Will academia be able to stay on the cutting edge of AI research? And if not, then what will its new role be? How can an AI memorize actual training data but also generalize well? Are there any conceptual reasons why we couldn't make AIs increasingly powerful by just scaling up data and computing power indefinitely? What are the broad categories of dangers posed by AIs?

Ilya Sutskever is Co-founder and Chief Scientist of OpenAI, which aims to build artificial general intelligence that benefits all of humanity. He leads research at OpenAI and is one of the architects behind the GPT models. Prior to OpenAI, Ilya was co-inventor of AlexNet and Sequence to Sequence Learning. He earned his Ph.D. in Computer Science from the University of Toronto. Follow him on Twitter at @ilyasut.

JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, the podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you joined us today. In this episode, Spencer speaks with Ilya Sutskever about the nature of neural networks, the psychology and sociology of machine learning, and the increasing power of AI.

SPENCER: Ilya, welcome.

ILYA: Thank you, it's good to be here.

SPENCER: You've really been a pioneer in the advancement of AI, so I'm really excited to talk to you today about where AI has been, what kind of progress we've seen, where AI is going, and your thoughts for the future as well.

ILYA: Yeah, happy to talk about it.

SPENCER: Great. So just to start off on a bit of a philosophical note, people sometimes debate whether machines today, while doing these things that seem intelligent, whether they're really intelligent. So I just wanted to ask to start, how do you think about what intelligence means?

ILYA: Intelligence is a little tricky to define. But I think there are two useful ways to think about it. One useful way of thinking about intelligence is by saying, “Well, we're not sure what intelligence is, but human beings are intelligent. And let's look at the things that human beings can do. And if you have computers which can do some of those same things, it will mean that those computers are intelligent.” You can also try to come up with formal definitions of intelligence, for example, a system is intelligent if it can do really well on a certain broad range of tasks specified in a certain formal way. But I find those definitions to be a little less useful. I find it more useful to just think about the things people or animals (people primarily) can do: if computers can do those things too, then these computers are intelligent. And the more of those things computers can do, the more intelligent they are.

SPENCER: One thing that's been fascinating to me in the history of AI and machine learning is how so many tasks have existed where people said, “Oh, a computer's never going to be able to do that.” Or they say, “If a computer could do that, it could do anything a human can do.” And yet, again and again and again, we seem to be able to get computers to do the thing, despite many people's expectations that there's no way it could do that, or that by doing that it must be able to do everything. And so it seems like many more tasks are narrower than people realize. And I'm curious if you have a perspective on that.

ILYA: I think what's happening here is that our intuitions about intelligence are not exactly perfect. Intelligence is not something that's very easy to fully observe and to understand empirically, as evidenced by the examples that you brought up. I think people have intuitions where certain tasks feel hard for them. And so they feel that it would be really impossible for a computer to solve those tasks without solving everything else. I think on the flip side, people did say things like, “Well, if you can play chess, then you can do all these other amazing things.” And it turns out that those tasks have indeed been quite narrow. But I'd say that in the situation we are in today with deep learning, we do have quite general-purpose tools. And if you want to get really good results on some task, and you can collect a large amount of data, you will, in fact, get very good results on this task.

SPENCER: One of the things that most impressed me about GPT-3, a system that I know you were really integral in creating, is that while you trained it to do just one thing, it was able to do many things as a consequence. And just for those listeners that don't know too much about GPT-3, maybe you could just start by telling the listener a little bit about what it is. And then we could talk about how it's able to generalize across tasks.

ILYA: So to explain what GPT-3 is, it will be helpful to understand what neural networks are, and how they are the foundation of what GPT-3 came out to be. So the way to think about a neural network is that it's a certain kind of parallel computer which can program itself. That's what a neural network is. It is a parallel computer that has a lot of parallel compute and a limited amount of sequential compute. And the thing that's special about this computer is that it has a learning algorithm. It has a way of automatically programming itself. So the reason why I described neural networks as these restricted computers in this case is because it becomes a lot easier to understand the idea that if a neural network is like this computer, which can program itself, it can actually do a whole lot. And the next question that arises immediately is, so let's suppose it is true that we have these neural networks, and the larger they are, the more powerful the computer they implement. And the question we need to ask then is, what should we ask the neural network to learn? Because okay, you have this capability, so to what should you apply it? Like you say, the neural network can be trained, i.e. the computer which is implemented by the neural network can be programmed by the following procedure. You feed it some inputs, you see how it behaves, and you say, “Uh-uh, this is not the behavior which I want, please do this other thing.” And the neural network will say, “Okay, I get it, I'm going to modify myself to not make this mistake in the future.” So the real insight of GPT-3 is that there is one task, which, if you get a neural network to be really, really good at, will give you, as a byproduct, all kinds of other capabilities and tasks, which are very interesting and meaningful to us. The task that I'm talking about is doing a really good job at guessing the next word in a corpus of text.
So the way it would be set up is that you have a neural network, and you give it some text, and then you ask it to guess the next word. And the way guessing the next word works is that you output probabilities about what the next word should be. So you say, “Well, maybe the next word is this one, or maybe that one.” And you want to place your bets such that, in general, you're correctly confident, and your predictions are generally correct. Sometimes you can narrow down your predictions by a whole lot, sometimes not so much. So let's recap where we are. We have these neural networks, which are these parallel computers, which, if you make them large, can learn all kinds of stuff. We can then point them to the task of predicting the next word, guessing the next word in a large corpus of text really well. Now, the third thing I want to talk about is the implication of this capability. Suppose you have a system which, if you give it some text, guesses the next word pretty well. Then you just take the guess, and you feed it back to the neural network, and you do it again. And this will generate text. And now you have a system which can respond with meaningful text to any other meaningful text. And this ability hides within itself lots of abilities that we want. Like you say, “Hey, what do people do in text as expressed on the internet?” Well, sometimes they might summarize text, sometimes they might converse in text. All the things that GPT-3 does are, in some sense, a reflection and interpolation of things that people do in text that's expressed online.
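
As a rough illustration of the loop described above (output probabilities for the next word, take a guess, feed it back in, repeat), here is a minimal sketch. It is not from the conversation itself and is not how GPT-3 is implemented: it swaps the neural network for a simple table of word-pair counts, the corpus is made up, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real model is trained on vastly more text.
CORPUS = "the cat sat on the mat and the cat ate the fish".split()


def train_bigram(words):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the follow-up counts for `prev` into one probability per candidate word."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}


def generate(counts, start, max_words):
    """Repeatedly take the most likely next word and feed it back in as the new context."""
    text = [start]
    for _ in range(max_words):
        probs = next_word_probs(counts, text[-1])
        if not probs:  # the last word was never followed by anything in the corpus
            break
        text.append(max(probs, key=probs.get))
    return text


counts = train_bigram(CORPUS)
print(next_word_probs(counts, "the"))        # the model's "bets" after seeing "the"
print(" ".join(generate(counts, "on", 2)))   # text produced by feeding guesses back in
```

Always taking the single most likely word is a simplification chosen to keep the sketch deterministic; sampling from the probabilities is the other common choice.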

SPENCER: Right. So even though you never program the system to write poetry, or translate languages, or do simple math problems, it learns how to do all of those things. Because if you are going to predict what's going to come next, if you start with the beginning of a poem, the most likely thing is going to be the end of a poem, right? If you start with a math problem, the most likely thing that's going to come next is the answer, and so on. So, in order to predict, it essentially has to learn all these different sub tasks.

ILYA: That's exactly right. I want to add another thing, which is, at the level of tasks we are talking about, directly programming your system to do these tasks is, basically, really impossible. The only way to get capabilities like this into a computer is by building a neural network inside the computer, and then training it on a task like this, which is something like guessing the next word.

SPENCER: So here with GPT-3, we're trying to get it to predict the next word. What's the connection between predicting and understanding?

ILYA: There is an intuitive argument that can be made, that if you have some system, which can guess what comes next in text (or maybe in some other modality) really, really well, then in order to do so, it must have a real degree of understanding. Here is an example which I think is convincing here. So let's suppose that you have consumed a mystery novel, and you are at the last page of the novel. And somewhere on the last page, there is a sentence where the detective is about to announce the identity of whoever committed the crime. And then there is this one word, which is the name of whoever did it. At that point, the system will make a guess of the next word. If the system is really, really good, it will have a good guess about that name; it might narrow it down to three choices or two choices. And if the neural network has paid really close attention (well, certainly that's how it works for people), if you pay really close attention in the mystery novel, and you think about it a lot, you can guess who did it at the end. So this suggests that if a neural network could do a really good job of predicting the next word, including this word, then it would suggest that it's understood something very significant about the novel. Like, you cannot guess what the detective will say at the end of the book without really going deep into the meaning of the novel. And this is the link between prediction and understanding, or at least this is an intuitive link between those two.

SPENCER: Right. So the better you understand the whole mystery novel, the more ability you have to predict the next word. So essentially understanding and prediction are sort of two sides of the same coin.

ILYA: That's right, with one important caveat, or rather, there is one note here. Understanding is a bit of a nebulous concept like, what does it mean if the system understands one concept or doesn't? It's a bit hard to answer that question. But it is very easy to measure whether a neural network correctly guesses the next word in some large corpus of text. So while you have this nebulous concept that you care about, you don't have necessarily a direct handle on it; you have a very direct handle on this other concept of how well is your neural network predicting text and you can do things to improve this metric.
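
One way to make "how well is your neural network predicting text" concrete is the average negative log-probability the model assigned to the words that actually came next. The conversation does not name a specific metric, so treating it as this cross-entropy-style score is an assumption, and the probabilities below are invented for illustration.

```python
import math


def average_negative_log_likelihood(probs_of_true_words):
    """Mean of -log(p), where each p is the probability given to the word that really came next."""
    return -sum(math.log(p) for p in probs_of_true_words) / len(probs_of_true_words)


# Hypothetical probabilities two models assigned to the same four true next words.
confident_and_right = [0.9, 0.8, 0.95, 0.7]
vague = [0.25, 0.25, 0.25, 0.25]

loss_good = average_negative_log_likelihood(confident_and_right)
loss_vague = average_negative_log_likelihood(vague)

print(loss_good, loss_vague)     # lower means better prediction
print(math.exp(loss_vague))      # "perplexity": as uncertain as a uniform pick among this many words
```

A lower score means the model placed its bets better, which is a number one can directly try to improve.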

SPENCER: So it sort of operationalizes understanding in a way that we can actually optimize for.

ILYA: Precisely.

SPENCER: (Now, as an aside, I'll just tell the listener, if you've never seen GPT-3 in action, I actually recorded a podcast episode where I interviewed GPT-3 and had it play different characters. So I'd suggest you go check that out, and you might want to come back to this conversation after that, just so you can see what it's actually like.) So my next question for you: what was the barrier, let's say five years ago, to building something like GPT-3, and how did you overcome that barrier?

ILYA: I think three things needed to come together. GPT-3 is not a small neural network, in terms of the amount of compute that was used. And one very direct barrier was to simply have the compute in place. That means having really fast GPUs, having access to a large cluster, and having the infrastructure and the techniques for utilizing a large cluster to train a single large neural network. This was the first obstacle. The second obstacle is that, to actually use this compute successfully, you need to have a neural network architecture and the optimization tools to successfully train it. In other words, you need to have a neural network architecture such that, if you were to apply this compute to it, you would actually get good results. That was also not the case in the past. Five years ago, I guess, things would have been quite a bit worse. We didn't even have the transformer back then. So we would not have been able to make anywhere near as efficient use of the compute that we had. And the third obstacle was to realize that this is a good idea. The thing about deep learning is that we have this strange phenomenon, where in theory, if you train larger neural networks on bigger data sets, they should get more amazing results. But it's something that's not easy to believe in. And just believing in that is a major, major part of the advance that led to GPT-3.

SPENCER: So let's go back to those three points. I think each of them has really interesting stuff there. So point one, you mentioned about compute and infrastructure. Could you give people some idea of just how much compute or infrastructure is needed to train a system like GPT-3?

ILYA: On the compute, you need to use thousands of GPUs for quite a while, for at least a number of weeks, in order to get the result that you need. So it's just a very large amount of compute. And it's not easy to get it.

SPENCER: And with Moore's law, I'm guessing that this would have essentially been impossible five years ago to do this much compute, or it would have been just so obscenely expensive that it wouldn't have been reasonable.

ILYA: It would have been more expensive. That would definitely be true. But in addition, I think the transformers came out in late 2017. Just to explain what the transformer is and to talk about what deep neural networks are, and what's the deal there: a neural network is a circuit. And a circuit is best thought of, in my opinion, as a parallel computer. Because the neural network has a learning algorithm, the parallel computer that's implemented by the neural network can program itself from data. Now, various attributes of the neural network determine the shape of the computation that's implemented by it. So if you have a shallow neural network, it means that you have a parallel computer which can only do one step in parallel. That doesn't seem very powerful. And that's actually totally obvious. If you have a parallel computer, but it only has one step of parallel computing, it won't be able to do much at all. This is basically what shallow models in AI are like. You can prove mathematical theorems about “Hey, it's shallow, so I can say it's easy to handle mathematically. So I can prove that my learning algorithm will find the best function in my shallow function class.” But if you think about it from the perspective of what computer is being implemented by my functions, it's like a parallel computer that does only one or a very, very small number of steps. Once you have deep networks, the number of steps that's afforded to the parallel computer increases significantly. And because of it, it can do a lot more. The transformer then takes this to the next level. But the main usefulness of the transformer comes from what may appear to be a technicality at a sufficiently high level, like the conversation that we are having.
The technicality is this: if you think about neural networks as you describe them, where you have these matrix multiplies followed by element-wise nonlinearities, it is very natural to apply them to vectors, but it becomes clunky to apply them to sequences of vectors. And most interesting data, or a great deal of very interesting data, comes in the form of sequences. Language is an example. And so you say, “Okay, well, I want my neural network to process long sequences of vectors. How to do that?” And then you start saying, “Okay, well, maybe we can apply a neural network in one way, or maybe we can apply a neural network in another way.” And the dominant way of applying neural networks to sequences was called the recurrent neural network. (I'm not going to explain to you what it is, because it will be too much of a digression.) It's a very, very beautiful and elegant neural network architecture design, it's very attractive, but it had a problem: it was difficult to train well. So it would not be able to process sequences as well as we hoped. Now back then, we didn't know that something better was possible. Then, the main innovation of the transformer is that it lets you process long sequences of vectors in a very compute-efficient way. And most crucially, in a way that's easy to learn for the learning algorithm. And this is a point which I haven't elaborated on, but I think it's important that I do so now. I mentioned that neural networks are like parallel computers, which have a way of programming themselves automatically from data. But their learning algorithm is finicky. It doesn't work under all conditions. And it's not easy to tell a priori (and it especially wasn't easy to tell back then) when precisely it would work and when it wouldn't, and what would be the way to fix it.
And so you can see, a great deal of the breakthroughs in AI came from figuring out (honing in on) the conditions under which the learning algorithm that programs the neural networks, which is called the backpropagation algorithm, works reliably and well. And the transformer is an architecture with the following three properties: it can process long sequences of vectors very naturally; it is comparatively straightforward to learn with the backpropagation algorithm; and in addition, it has one other big advantage: it runs really fast on GPUs. And GPUs are the main way in which we implement neural networks. So the fact that something runs quickly on GPUs is a huge advantage. So those were the three things that made the transformer really, really successful.

SPENCER: So in some sense, you can think of transformers as giving us the power to better leverage the computation that we already have. Is that right?

ILYA: In one sense, it is right. But in another sense, it gives us something even more than that, if you care about processing sequences. For example, GPT-3, which we spoke about, processes a sequence of 2000 words to make its prediction about the next word. Back in the day, that was a lot of words. By comparison, if you were to look at recurrent neural networks, the most anyone has ever gone successfully was a hundred words. So it's not only that we leverage computers more efficiently, it's that the optimization problem is much easier than that of the previous best approach for processing sequences. Training a recurrent neural network to process a sequence of 2000 words would just not be a meaningful thing. It would not benefit from these extra words. Do you see what I mean?

SPENCER: Right. And so with GPT-3, you could give it a whole essay that's 2000 words long and say, “Well, what's the next word in the essay?” Whereas with these older models, you were limited to such a small input that it limited the capabilities of the system.

ILYA: That's right. Now, in fairness, there have been a lot of advances in optimization in machine learning and our understanding of what makes neural network architectures work is far greater than it used to be. So it is very possible that, with a little bit of work, it will be possible to bring back recurrent neural networks and make them competitive with transformers. But it seems unnecessary because transformers have so much going for them. And these GPUs work so well. So you've got this really good thing going. And I don't think we are hitting the limitations of our approaches just yet.

SPENCER: Now you mentioned that these are really good for processing sequences of vectors. And I just want to explain that briefly for those that might be confused about what that means here. So as I understand it, a word (or really a token, which is generally a piece of a word) is going to be represented as a vector. And so if you're processing 2000 words, one way to think about that is you're really processing a sequence of vectors. So essentially, a vector is a list of numbers, each token in English is represented as such a list of numbers, and the sequence is the ordered collection of the vectors for all the different words. Is that correct?

ILYA: Yes, that's right. Sequences are just very natural. They occur all over. Sequences of words... language is sequential, speech is sequential.
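
The "sequence of vectors" picture can be sketched in a few lines. This is only an illustration, not something from the conversation: the vocabulary, the three-number vectors, and the whitespace tokenizer are all made up, whereas a real model learns its vectors and splits text into sub-word tokens.

```python
# Hypothetical 4-entry vocabulary with invented 3-number vectors.
EMBEDDINGS = {
    "language": [0.2, -1.0, 0.5],
    "is": [0.0, 0.3, -0.7],
    "sequential": [1.1, 0.4, 0.0],
    "<unk>": [0.0, 0.0, 0.0],  # fallback for tokens outside the vocabulary
}


def tokenize(text):
    """Whitespace split as a stand-in for a real sub-word tokenizer."""
    return text.lower().split()


def embed(tokens):
    """Look up one vector per token, keeping the tokens in their original order."""
    return [EMBEDDINGS.get(token, EMBEDDINGS["<unk>"]) for token in tokens]


sequence = embed(tokenize("Language is sequential"))
for vector in sequence:
    print(vector)  # one list of numbers per token, in order
```

The order of the list matters: the same vectors in a different order would represent a different sentence.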

SPENCER: Okay. So this brings us to the third point you made about why it would have been hard to build a thing like GPT-3 five years ago, which is just this belief in large networks. So I'd be really interested to hear your thoughts on that. Why did people not believe that large networks would give them such a big advantage? And why was it that you and your colleagues in OpenAI thought otherwise?

ILYA: So the second question is much harder to answer. But the first question, I think the answer has to do with psychology. I think there are very powerful psychological forces in play, where if you work a lot with a particular system, you can feel its limitations so keenly, and your intuition just screams that there are all those things that this (whatever you're doing) cannot do. And I think that this is why researchers in AI have consistently underestimated neural networks. Especially in hindsight, it is so clear that, of course, neural networks can do a lot of things. But I think even now, there are plenty of people who will tell you that you need to, for example, combine neural networks with symbolic approaches because of inherent limitations to neural networks. So I think this belief prevails in some circles, even today.

SPENCER: I've heard you mention elsewhere that there may also be an issue with the way that machine learning algorithms were measured, essentially that academics would use fixed datasets where they say, “Here's the benchmark, see how well you can perform on that.” Do you want to unpack that a little bit?

ILYA: Yeah, that's right. This is a really correct point. So I mentioned the psychology of machine learning but I forgot to mention the sociology of machine learning. Indeed, the way research has been done, especially 5, 10, 15, or 20 years ago, is that researchers were primarily interested in developing the great machine learning algorithm, but they were very uninterested in building datasets. And the reason for that is that building datasets is not intellectually stimulating. I think that's all there is to it. Because building datasets is not intellectually stimulating, researchers would primarily just use the datasets that already existed. They would just say, “Oh, here is a well-established dataset. Let us now try to get better results on this dataset.” If you are in this regime, then if you try to increase the size of your neural network, you will not get better results, or much better results. Because for the larger neural network to get better results, it also needs more data. It makes intuitive sense if you think about it, because the larger neural network has more trainable parameters, it has more synapses. And by the way, the way you should think about the parameters of a neural network is as the strengths of the connections between its neurons; the bigger the neural network, the more neurons, the more connections there are between the neurons, and the more data you need to constrain all these connections. Again, these things are obvious in hindsight, but in fairness, pretty much every really significant advance in at least AI has been fairly obvious in hindsight.

SPENCER: This reminds me of “The Bitter Lesson”, which is a short essay written by Rich Sutton. Have you read that essay?

ILYA: Yes, yes.

SPENCER: I'll just read a very brief excerpt from it. Because I think it's super interesting. Sutton says, “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain. But the only thing that matters in the long run is the leveraging of computation.” And so I think the point of this essay is that researchers keep trying to build better AIs by using their cleverness, their knowledge of the domain that the AI is trying to solve. And maybe for a year or two that does produce better results. But then someone ends up blowing past them by throwing more computation and more data at the same problem, but with less clever methods.

ILYA: Yeah, that's definitely borne out in neural networks research, in deep learning research. Researchers want to make progress. And researchers want to apply their skills to make progress. And oftentimes, it is easy and satisfying in the short term (exactly, as you said) to say, “Hey, what if we change things up a little bit in a certain way?” And performance improves. And most of those changes lead to improvements on a specific task, but they don't generalize and you can potentially spend a very large amount of human effort baking those improvements into the system. There is a certain flavor, there's a certain taste to improvements in methods that actually stand the test of time and that work better as you have more compute. Those improvements exist and (you mentioned) we discussed some of them, the switch from recurrent neural networks to the transformer was one, because it is still an improvement that researchers have produced, but it leads to better utilization of compute and unlocking new capabilities. Obviously, improvements in optimization, which is something that we discussed as well, is another category of such improvements, and there's a whole bunch of others. So I don't want to overstate the message of “The Bitter Lesson”, that the only thing that matters are simple methods. What I think is true, is that the only thing that matters are really good simple methods that scale.


SPENCER: So now that we're in this era where these massive networks are used in order to get state-of-the-art predictions, is academia able to keep up? I mean, it can cost millions of dollars to train one of these networks. And so I wonder if we're gonna get to the point where you have to be part of a group like OpenAI or DeepMind to really be at the cutting edge.

ILYA: Yeah, I think the situation for academia is such that academia will no longer be able to be at the cutting edge of training the absolutely largest models. I think something similar is happening in systems research. And something similar has happened in semiconductor research as well. There was a time when the most cutting-edge research in distributed systems was taking place in academia. Now it's taking place in companies like Google and maybe other companies. There is still interesting academic research going on that studies questions which are neglected by these big companies. And with respect to AI, I can see things going two ways for academia. I think the first is that they could produce a lot of foundational understanding of the methods that we use. And I would also expect there to be a lot of collaboration between academia and some of the companies that you mentioned, like OpenAI, or DeepMind, or Google, because these companies often expose the models that they train through various APIs. And people in academia could study their properties and modify them in various ways and see what happens and discover new and useful things.

SPENCER: Got it. So you think there could still be a helpful role for academics pushing forward the state of the art, even if they can't actually train the state-of-the-art models?

ILYA: That's right. It will be a different kind of work. And there will be some people who will find it very interesting and attractive, and they will work on it.

SPENCER: Now, when we talk about these large models: with GPT-3, we have about 175 billion parameters, at least in the original version of GPT-3 (I don't know whether that's changed). And I think it's such a shockingly large number of things being learned, essentially numbers in the model being learned. I think a lot of people who, let's say, have some statistics background might think, “Well, isn't this just going to overfit like crazy?” And so, to get an intuition for overfitting, imagine you're just doing a simple linear regression and you have a scatterplot with 10 points; you could fit a line through those points. But you could also just take a pencil and kind of draw some crazy squiggly line that hits every single point perfectly. But there's some intuition people have that, if you draw some arbitrary squiggly line that hits every point perfectly, that's not going to generalize very well to new data. Whereas if you kind of fit a smooth line through the points, that's probably going to generalize better. And you know, when you have 175 billion parameters, I think a lot of people who have some statistics under their belt, their intuition is going to be: how can this learn anything, right? Isn't this going to immediately overfit?
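
The scatterplot intuition can be reproduced with a small sketch. This is an illustration rather than anything from the conversation: the 10 points are invented (a true trend of y = 2x with a deliberately alternating +1/-1 "noise" so the result is deterministic), the straight line is an ordinary least-squares fit, and the "squiggly line" is taken to be the degree-9 polynomial that passes through every point.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def squiggle(xs, ys, x):
    """Lagrange interpolation: the polynomial that hits every training point exactly."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def true_value(x):
    """The underlying trend that the invented data was generated from."""
    return 2.0 * x


xs = [float(i) for i in range(10)]
ys = [true_value(x) + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]

slope, intercept = fit_line(xs, ys)
new_x = 0.5  # a point that was not in the training data

print("line error on new point:    ", abs(slope * new_x + intercept - true_value(new_x)))
print("squiggle error on new point:", abs(squiggle(xs, ys, new_x) - true_value(new_x)))
```

The squiggle has zero error on the 10 training points, yet on the new point it is far further from the underlying trend than the line is.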

ILYA: Yeah, that's a very deep question. And this question has indeed perplexed a lot of researchers. And I think right now we have pretty decent answers to it. There are actually two parts to what I'm going to say. The first question that you brought up was GPT-3 specifically. It has so many parameters, 175 billion parameters, which is a lot. However, it was trained on 300 billion tokens. So it was actually trained on more tokens than it has parameters. And so you could argue that, even if we were in the regime that the statistics people are used to, for something like GPT-3, the situation wouldn't be too bad, because the amount of data has been quite large compared to the number of parameters. However, there is another phenomenon that researchers have observed, which I think is what you're truly alluding to: a neural network can generalize even when it has far more parameters than it has training points. And that until recently has been the case for various vision classification tasks, because even though you feed it an entire image, you only learn from a small label on the output. So you have neural networks with far more parameters than labels. How can those generalize? There, it turns out that this happens due to properties of the optimization algorithm, stochastic gradient descent, where various arguments can be made around stochastic gradient descent finding a solution that has the smallest norm naturally and automatically. And there are other arguments that argue (I would say, with some degree of convincingness) that stochastic gradient descent in particular has an effect where it removes information from the parameters: it minimizes a certain measure of the information in the parameters, which is distinct from their actual numerical quantity.
So to sum it up, even if those details may not have been the easiest to grasp immediately, it is the property of the training procedure that makes it so that, even when you have a huge neural network with a huge number of parameters that vastly exceeds the amount of data that you have, you still achieve good generalization.
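
One well-known special case of the "smallest norm" argument can be checked directly. The sketch below is a simplification and not the setting Ilya describes: it uses a linear model and plain full-batch gradient descent rather than a neural network and stochastic gradient descent, and the problem sizes and random data are assumptions. In that simplified setting, gradient descent started from zero weights ends up at the minimum-norm solution among all the solutions that fit the training data perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy problem: 5 training points, 20 parameters (far more parameters than data).
n_points, n_params = 5, 20
X = rng.normal(size=(n_points, n_params))
y = rng.normal(size=n_points)

# Plain gradient descent on 0.5 * ||X w - y||^2, starting from zero weights.
w = np.zeros(n_params)
step = 1.0 / np.linalg.norm(X, ord=2) ** 2   # 1 / (largest singular value)^2
for _ in range(5000):
    w -= step * X.T @ (X @ w - y)

# The minimum-norm perfect fit, computed directly with the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

# A different perfect fit: add a direction the training data cannot "see".
null_direction = np.linalg.svd(X)[2][-1]     # unit vector v with X @ v ≈ 0
w_other = w_min_norm + 3.0 * null_direction

print("GD fits the data:        ", np.allclose(X @ w, y))
print("other solution fits too: ", np.allclose(X @ w_other, y))
print("GD equals min-norm fit:  ", np.allclose(w, w_min_norm))
print("norms (GD, other):       ", np.linalg.norm(w), np.linalg.norm(w_other))
```

Both weight vectors memorize the five training points, so the data alone cannot choose between them. The training procedure is what picks the smaller one, which is the sense in which the optimizer, and not the parameter count, determines which solution is found.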

SPENCER: Right. So maybe an intuition to help people here is that if the model could really take on any value for any of those parameters, then it probably would overfit. It would be predicting the noise, not just the signal. But the training procedure actually puts in a preference for some of those parameter values over others. So even though all those parameters technically exist, it's sort of more constrained than it seems. And one way to think of it is that it essentially has a prior of preferring certain solutions to others. Is that a fair way to put it?

ILYA: That's a fair way to put it, yes. Suppose you had some adversary that would say, “I can absolutely very easily produce a neural network that's going to do perfectly well on the training set, and do horrendously poorly on the test set.” Not only do such adversaries exist, but if you try really hard, it is possible to actually program a learning algorithm that will have deliberately poor generalization properties. Of course, it's not something we would ever want to use. But yes, the learning algorithm does a lot of the heavy lifting in the success of these neural networks, not only in training, but also in generalization, their real-life performance.

SPENCER: One thing that surprises me when I'm using GPT-3 sometimes is that it seems to know exact texts in some cases. If you give it the beginning of a really famous book, it might be able to actually write the exact correct sentences that are actually in that book. But a lot of times it will also generalize; it will say things that have never been said. You can Google the phrases it comes up with and they've never been said by a human, at least not on a webpage that's available and indexable by Google. So I'm curious (maybe this is hard to answer): in some sense, it is memorizing some of its training input, but it's also generalizing. Do you have anything to say about that?

ILYA: Yes, I do. I think it's a very profound and deep question. It touches on the question of where does memorization end and generalization begin? And I think that they go hand in hand to a significant degree. And there is an intuition from Bayesian reasoning. You can construct an intuition about how this memorization is fully compatible with generalization by considering a Bayesian example. So suppose you have your prior distribution over some space of functions. And suppose that I tell you, “Hey, on this input, you have some output. Please give me the posterior distribution over the functions, all of which satisfy the condition that, on this input, they have this output.” You could say, “No problem, here's your Bayesian posterior.” You could then use it to make really good predictions. If your prior distribution has been over an interesting set of functions, this posterior will give you great generalization. You can do it for several data points, and you'll get a posterior distribution over all the functions which perfectly hit their desired outputs on the training data, and yet you will generalize well. What I'm trying to say here is that simple formulations of Bayesian inference are compatible with memorization and generalization simultaneously, because this posterior distribution over functions will have the property that every function in it memorizes the training set perfectly. Yet when you average over this posterior, you will get good predictions. Do you see what I mean?
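
The Bayesian construction described above can be made concrete with a deliberately tiny function class. Everything specific here is an assumption for illustration: the class of threshold rules, the uniform prior, and the two training pairs are hypothetical and are not from the conversation.

```python
from fractions import Fraction

# Hypothetical function class: threshold rules f_t(x) = 1 if x >= t, else 0.
# Prior: uniform over the thresholds t = 0, 1, ..., 21.
thresholds = range(0, 22)

def f(t, x):
    return 1 if x >= t else 0

# Training set of (input, output) pairs.
train = [(3, 0), (12, 1)]

# Posterior: uniform over every function that reproduces all training outputs exactly.
posterior = [t for t in thresholds if all(f(t, x) == y for x, y in train)]

def predict(x):
    """Posterior-averaged prediction: the probability that the output at x is 1."""
    return Fraction(sum(f(t, x) for t in posterior), len(posterior))

print("thresholds kept in the posterior:", posterior)
for x in (1, 3, 8, 12, 20):
    print(x, predict(x))
```

Every function left in the posterior memorizes the training set, so the averaged prediction is exactly 0 at input 3 and exactly 1 at input 12. Away from the training inputs the average is confident where all surviving functions agree (inputs 1 and 20) and uncertain where they disagree (input 8, where 5 of the 9 surviving thresholds output 1).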

SPENCER: Right. So you can essentially — if we think of the neural net, as learning many, many, many functions internally — you can have some of those functions that are essentially memorizing data. And by memorizing we mean, for a given input, they'll produce the exact output; you give it the first sentence of a book, it will produce the next sentence, right? Whereas, because it's combining all of these functions together, it's still able to generalize to new circumstances, despite having these many pieces that are kind of exact memorizers. Is that right?

ILYA: This is close, but not exactly right. So the example with the idealized Bayesian inference is not something that the neural network literally does. The neural network does something that tries to approximate it but in a way that's a bit different that I don't want to get into. The reason I brought up the Bayesian inference example is because it is relatively widely known that Bayesian inference gives you correct generalization. If you follow the recipe of Bayesian inference down to the dot (it is computationally intractable to do precisely, but suppose you did that), you will get very good predictions. Yet suppose I say, “Okay, I have my class of functions, all my neural networks together” (not sub-networks; every function is a different configuration of weights), and I say, “Here's my training set. Please give me the posterior distribution over all the neural networks which satisfy this training set.” So I'm going to have a posterior distribution over neural networks, each of which memorizes the data set perfectly. Yet, when you make Bayesian predictions, when you average over these neural networks, they will be really good. So I feel like the details may be a little hard to follow but they're not important. The important thing is this: it appears to be perplexing. How can it be that a neural network memorizes all this data and generalizes at the same time? Yet I claim that idealized Bayesian inference, which is known to make the best predictions possible in a certain sense, has the same property. If you say, “Here's my training set,” you will get the Bayesian posterior over functions which memorize perfectly, yet you will get good generalization. So Bayesian inference also exhibits the same...

SPENCER: So maybe we're confused about this idea that memorization is a bad thing just because, in early AI systems, it happened to often show that you had not trained properly, but it's not inherently a problem.

ILYA: Yeah, I think that's right. I wouldn't say it's because of the early AI systems. I would say the reason is different. I think this is one area where humans and the human brain operate differently from our neural networks. I think our subjective experience is that memorization is something we really don't like to do, generally speaking. At least in my personal experience, I know I don't like memorizing things. I find it difficult and uncomfortable. And so perhaps we can do better. Perhaps it is possible to do better than just memorizing everything. But my statement is different. My statement is that the gold standard formalism of generalization -– or at least a gold standard formalism -– Bayesian inference, is very consistent with: you memorize everything, yet you fully generalize.

SPENCER: That's really interesting. So we've talked about how these systems work. I want to now switch and talk about the future of these systems and where you think they're going. One thing that comes up as soon as you start thinking about the future of these systems is: is there a limit to throwing more scale and more computation at these problems? You know we've gone from these early natural language processing systems to GPT-2 to GPT-3, and we're seeing these incredible gains from building bigger networks and using more data and computation. And some people are skeptical that we're going to continue going all the way with this. And other people say, “No, maybe we can just keep going and going indefinitely.” So I'm curious to hear your thoughts on that.

ILYA: Yeah. I think it is undeniable that we will go extremely far. I think that the current wave of progress with this specific paradigm has not ended yet. I think that's definitely true. I do think that there may be some capabilities for which it's not clear how this specific approach will give rise to them. An example of that would be: suppose we were to scale the GPTs much, much further, as is, without any modification. If we were to allow modification, the argument would break down. Let's suppose no modification at all, just keep scaling compute and data. There was a blog post recently that I read. Someone was asking: would the GPT that will be produced by training on even more internet data be able to beat the world champion in chess? Probably not. Because the answer depends on whether someone has accidentally uploaded a lot of super high quality chess data to the internet. If no one has done that, then we shouldn't expect the GPTs to be good at chess, even if they were scaled up a lot. So I think they will be incredible, but there is a good chance that there will still be some gaps we'll need to address.

SPENCER: This is a tricky question but how do you view the difference between what a system like GPT-3 is doing and what the human brain is able to do? Because it seems like GPT-3 can learn to do just about anything that involves — okay, you've got a series of tokens of text, and you want to predict what text comes next. And there are many problems that can fit into that, whether it's writing poetry or essays or answering questions or being a chatbot. But it does still seem like there are some things that humans do that just aren't gonna fit into that paradigm.

ILYA: Yeah, I think there are definite things that you can point to about the human brain as being more efficient in multiple ways than the GPTs and I'll give some examples. Some examples are from the way we can point at how people learn. And a different example would be from the observable behavior of people. I'll start with the second one because I feel like it's very easy to see. If you look at the capability of a GPT, it's very non-human in the following sense: a neural network that's been trained to guess the next word on a very large corpus of texts on the internet will have a lot of familiarity with nearly everything. It will be able to speak pretty well about any topic — essentially, any topic that's been discussed and to talk about imaginary topics as well. It will have an incredible vocabulary; it's hard to imagine there is an English word that a GPT model wouldn't know and not be able to use. And yet, we also know that it makes mistakes that humans would never make. Whereas if you compare it to a human, I think human beings have far smaller vocabularies. And the set of topics that human beings know seems to be far smaller, but human beings seem to know those topics far, far more deeply. I think that's like a real and meaningful distinction.

SPENCER: There's a sense in which the GPT-3 is already super intelligent in certain ways, right? As you point out, its vocabulary is essentially super intelligent. And just this knowledge base, in some sense, it knows about way more topics than you could ever learn in your lifetime.

ILYA: That's exactly right. And yet the depth that GPT-3, at least, exhibits is less than the depth that a human has. A human that studies one topic can achieve more depth than a GPT-3 of today. Of course, we expect future GPTs to increase their depth as well but that seems almost like a qualitative difference between these neural nets and humans. And relatedly (related to the earlier comments about human beings really not liking memorizing information), human beings are also extremely selective about the information they consume, extremely selective. You know, if you look at how to train the GPT, you just give it random web pages, and it's going to keep getting better. Whereas for human beings, are you kidding? If you give a person lots of random text to read and suppose somehow they've motivated themselves to do so, it's not clear that they'll benefit much at all. So I think that this is another very important way in which there is a difference, where people are so choosy about the data they consume. And in fact, they will go to great lengths to find exactly the data that they need to consume. I think this points to another difference. Now, this doesn't mean that pushing further in the GPTs will not lead to further progress; that's not the case at all. But I think it does mean that it is still possible to do better.

SPENCER: There's also an efficiency issue here, right? It seems that a human can learn with way less information than GPT-3 can?

ILYA: Well, that one I think is a tricky one. I think there is truth to it but I think it's tricky for the following reason. So the right comparison to make is, how quickly do people learn at the age of 20, let's say? And how quickly does the GPT learn after you finish training it? Because by the time a person gets to the age of 20, that person has been exposed to a lot of concepts and information; they've learned a whole lot. With a GPT model, because it doesn't have the benefits that biological evolution has given us in terms of various instincts and knowledge of what's the important data and what's important to focus on, we compensate by giving our GPT far more data. And the question to ask then is: how quickly can a GPT learn once it's been trained on all this vast amount of data? And we know that, at that point, the neural networks learn far faster than they do at the early stages. So the gap between what people normally think about the speed of learning of artificial neural networks and the speed of the kinds of GPT systems that you have today, after we finish the training, is significant; they learn much faster than it seems. But I do think that there is truth to the claim that human beings learn even faster still, and this is probably another gap. Though it is interesting to see what will happen to this gap as we continue to make simple and straightforward progress on GPTs through scaling them up and through making efficiency improvements.

SPENCER: Two common critiques you hear about the standard neural net paradigm are, one, that symbolic thinking or reasoning is missing from today's systems. And another is that some kind of embodiment in the world and interaction with the world may be necessary to learn the way humans learn. And so I'm just curious to hear your thoughts on those two critiques.

ILYA: So on the first critique, I feel like there has been quite interesting recent work. It's been trending on Twitter (I think it's from Google), and it shows that if you simply take a GPT-3 and, rather than ask it to answer some question, you ask it to use reasoning to answer your question or to show its step-by-step reasoning when it answers the question, then the GPT-3 will, in fact, generate step-by-step reasoning, perfectly good symbolic reasoning, and get much, much better results on the kinds of tasks where symbolic reasoning seems to help, like math. So I think there is some evidence that maybe the current approach gets us to symbolic reasoning, or at least that we should expect to make some progress. More deeply, the human brain is a neural network, and it's perfectly capable of symbolic reasoning, so why should an artificial neural network be fundamentally incapable of symbolic reasoning?

SPENCER: Do you think that there's enough in common between the neural nets we're implementing in computers and neural nets in brains, that we can be confident that there's not some kind of qualitative difference there?

ILYA: I think we can't be sure. I think it's definitely conceivable that the human brain is doing something that's quite a bit better than our artificial neural networks. And I think what it will mean primarily is that the amount of compute that will be required to reach human-level intelligence will be larger than one might guess today. Because basically, if you think about a neuron, a biological neuron...there was a neuroscience paper from a few years ago, where some people took the most sophisticated model of a biological neuron they could find and they tried to approximate it as an artificial neural network. And they were able to do a really good job with a neural network which had 100,000 connections. So then you could say, “Okay, well, if you replace each neuron with this little gadget made out of artificial neurons with 100,000 connections, now you have a giant artificial neural network, which is actually very similar to the brain.” Now you need even more compute. So I would say that, rather than things being fundamentally different, I think it's more a matter of degree in terms of how much compute would be needed for the artificial neural network to get to the point where it is, in some sense, comparable to the human brain.
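
As a rough back-of-the-envelope version of this argument, the arithmetic can be written out. The neuron count below is an assumption that does not come from the conversation (roughly 86 billion is a commonly cited estimate for a human brain), and comparing connections with GPT-3 parameters is only a loose way to convey scale, not a claim that the two are equivalent.

```python
# Assumption (not from the conversation): about 86 billion neurons in a human brain.
BIOLOGICAL_NEURONS = 86e9
# From the conversation: about 100,000 artificial connections to mimic one biological neuron.
CONNECTIONS_PER_NEURON_MODEL = 100_000
# From earlier in the conversation: the original GPT-3 has 175 billion parameters.
GPT3_PARAMETERS = 175e9

total_connections = BIOLOGICAL_NEURONS * CONNECTIONS_PER_NEURON_MODEL
ratio_to_gpt3 = total_connections / GPT3_PARAMETERS

print(f"connections in the hypothetical brain-sized network: {total_connections:.1e}")
print(f"ratio to GPT-3's parameter count: {ratio_to_gpt3:,.0f}")
```

Under these assumptions the hypothetical network is tens of thousands of times larger than GPT-3, which fits the framing of the gap as a matter of degree in compute.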

SPENCER: I wonder if this is in some way a rephrasing of the idea that neural nets are function approximators and they can kind of approximate any function. And so even if a human neuron does something really complicated, as far as we can tell, there's no particular reason that that couldn't be approximated to an arbitrary degree of accuracy by a neural net.

ILYA: There is truth to what you say on some level. Universal approximation of neural networks basically means that if you have a neural network with a single layer (so it's a shallow network), but the layer is exponentially wide (so that you have a neuron for every possible bit of your input space), then this neural network can approximate any function.
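
A one-dimensional version of this idea can be built by hand. The sketch below is an illustration under assumptions, not the theorem itself: the target function (sine), the interval, and the widths are arbitrary choices, and the weights are constructed directly, with no training. It places one ReLU unit per small piece of the input range, and the error shrinks as the single hidden layer gets wider. In higher dimensions, covering the input space piece by piece in this way needs a number of units that grows exponentially with the dimension, which is the impractical part.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def build_shallow_net(target, lo, hi, width):
    """One hidden layer of `width` ReLU units: f(x) = bias + sum_i c_i * relu(x - knot_i)."""
    knots = np.linspace(lo, hi, width + 1)
    values = target(knots)
    slopes = np.diff(values) / np.diff(knots)
    # Output weights: the first slope, then the change in slope at each later knot.
    out_weights = np.concatenate(([slopes[0]], np.diff(slopes)))
    hidden_knots = knots[:-1]
    bias = values[0]

    def net(x):
        x = np.asarray(x, dtype=float)
        hidden = relu(x[..., None] - hidden_knots)   # shape (..., width)
        return bias + hidden @ out_weights

    return net

def max_error(width):
    """Largest gap between the network and sin(x) on a dense grid over [0, 2*pi]."""
    net = build_shallow_net(np.sin, 0.0, 2.0 * np.pi, width)
    xs = np.linspace(0.0, 2.0 * np.pi, 2001)
    return float(np.max(np.abs(net(xs) - np.sin(xs))))

errors = {width: max_error(width) for width in (5, 20, 100)}
for width, err in errors.items():
    print(f"width {width:>3}: max error {err:.5f}")
```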

SPENCER: And that's just wildly unrealistic, right?

ILYA: That's right. It's not relevant to anything, but there is a sense in which there is some kind of universal approximation going on but of a different kind, where you can say, “Hey, you have your neural network.” And suppose you have a neural network, and I'm saying, to build something intelligent, we need to have some kind of computational gadget inside. Well, the neurons inside the neural network can organize themselves to simulate this computational gadget. Do you see what I mean? So then suppose we say, “You need some kind of special operation inside the neural network.” Well, the neurons in the neural network can say, “No problem. Let us organize ourselves using training so as to implement that precise operation.” And suppose we say you actually need biological neurons. Well, you can literally say, “Hey, let's imagine a truly gigantic artificial neural network, where groups of neurons with 100,000 connections between them each correspond to a biological neuron.” And now this whole system can simulate a large number of biological neurons. Maybe instead of calling it universal approximation, it might be better to call it universal simulation.

SPENCER: Right. It can essentially approximate any algorithm that's needed.

ILYA: Yes.

SPENCER: So the other critique that comes up sometimes is that AIs may need to be embodied in the world and interacting with the world in order to one day learn to do all the things humans can do.

ILYA: Certainly there is an argument to be made since humans interact with the world and they are embodied and they are our only example of intelligence. I think that possibility exists. I think it's also quite likely that it's possible to not be physically embodied, to compensate for the lack of embodiment with the vast amount of data that exists on the internet. So probably, my bet is that physical embodiment is not necessary though there is some chance that it will make things a lot easier.

SPENCER: Some groups like DeepMind, I know, have been experimenting with putting AI in kind of simulated worlds, which is sort of somewhere in between (where it's not putting it in the real world), but it is putting it in a world where it can do things and then see their impacts and learn from that.

ILYA: Yeah, taking action seems important. But doing it in a physical form factor may be less so.


SPENCER: So I'm curious to know, before we wrap up the conversation, how much do you worry about potential dangers from AI?

ILYA: I would say the questions around AI and the challenges and dangers that it poses vary by timescale. As capabilities continue to increase, the power of AI will become greater. And the greater the power of a system is, the more impact it has. And that impact can have great magnitude in all kinds of directions. And I would say it could have a positive direction, or it could have an epic negative direction. When we started OpenAI, we already sensed that AI, once sufficiently developed, could indeed pose a danger because it is powerful. And although with these new technologies it is very hard to predict how things will unfold, it seems desirable to, at minimum, be thinking quite hard about the different ways in which AI, once it becomes very, very capable and powerful, could be used in truly undesirable ways. Or, since AI is a special technology in this respect, the ways in which, if it is built incorrectly, if it is designed without sufficient care, it could lead to outcomes nobody wants, without anyone's intention. So yeah, I spend quite a bit of time thinking about it.

SPENCER: I tend to divide the potential dangers from AI up into three categories. One is misapplication or bad application of narrow AIs like we have today. For example, someone using a large language model to generate sort of endless combinations of spam that maybe spam detectors can't detect. Or narrow AIs today maybe having racial bias because they're trained on data that has racial bias or something like that. So that's narrow dangers. The second category I think about is ways that AI could enhance the power of one group over all other groups, like an authoritarian regime using AIs that monitor every person in the population or quadcopters flying around watching everyone all the time and even assassinating people if they misbehave. And then the third category of AI dangers I think about is potential dangers from uncontrolled AI, like potential AI super intelligence. If one day we make AIs that are really smarter than humans, not just in narrow ways, but in many ways or even every way, then those systems themselves, if not properly controlled, could cause great danger. So I'm just curious to get your take just on these three categories. Starting with category one on dangers of narrow AI, what are some things that you're concerned about? And what do you think we can do to try to protect against that?

ILYA: All the dangers that you mentioned are extremely valid. And we are already facing the first danger that you're describing today. It is indeed the case that, if you train a neural network on data which exhibits undesirable biases, and then you do nothing to address them somehow inside the neural network, then the neural network will exhibit them. And so this is a problem that the field of AI today is very keenly focused on. And I think that there are multiple things that one could do in order to mitigate this concern or this danger. So the first one would be around something that we do at OpenAI, where we don't just release these models into the wild. We expose them through an API, and we carefully monitor how the API is being used, and we carefully restrict the allowed use cases. So this is kind of a semi-manual solution, but it is effective, where we say, “Okay, if your particular use case is going to expose a lot of surface area for bias, or for other kinds of negative uses, then we should not allow it. And if a different use case is okay, then we should allow it.” And then of course, all this is on top of further training of the model so that it learns to not exhibit these highly undesirable biases. And there is a lot of work about it in the field and there is quite a bit of awareness of this problem as well. So that's the thing I will say about the first problem.

SPENCER: Yeah, I really like that approach, because it gives you feedback. Because you have this API, you're able to monitor how people are using it and then you're able to notice patterns in misuse and then update your system to detect them. So that seems really valuable to me. But it does make me concerned about copycats. So even if you have the absolute cutting edge AI and you monitor it really carefully, others copy the work you do and even if they're a year or two behind you in their development -– because you're the best at what you're doing -– it just means that people can use these kind of more dangerous applications on a model that's one to two years out of date. And so I'm just wondering how you think about that.

ILYA: Yeah, I think this one is definitely a trickier one. The issue that you're describing is that the technology is going to be diffused and there'll be lots of different companies that are going to implement it. That's just a true point.

SPENCER: Yeah, it's tough because it's kind of a collective action problem, right?

ILYA: That's right. But one thing that OpenAI has tried to do is to implement some self-regulation. We recently came together with Cohere and AI21 (all three of us are large language model API companies) to make a statement about shared principles for how these models should be used and how they should be deployed, precisely to address the risk you're describing. And the hope is that other entrants into the space will follow our example, and that they will also follow those principles.

SPENCER: That's good to hear that you're trying to solve this collective action problem by getting people together, especially the groups that are sort of cutting edge and trying to all come to agreements on what you all want to enforce together. Of course, you can't bind the most nefarious actors (laughs) who are just trying to make a buck and don't care about principles, but at least you can get the major players on board.

ILYA: That's right. And if you get the major players on board, you also can get the great majority of volume of use on board as well, if you see what I mean. At least that's the aspiration, and there is a story that it will be the case.

SPENCER: One thing I'm worried about is whether people would use these natural language models to generate sort of customized spam, or customized attacks. An example would be that we know that certain countries will manipulate Twitter and have warehouses of people that will be posting to try to create propaganda. We also know that, around the world, there are people that sort of do custom attacks on individuals that are going to compromise their computers, and phishing attacks, and so on. And you can imagine that the natural language processing systems are getting so good that they might actually be able to automate this work, so that instead of a thousand people in a warehouse, it's 10 million automated bots, all saying actually different things, but sort of all espousing some propaganda in a particular direction, or doing customized phishing attacks based on what's known about a person. So I'm just curious if you've seen any evidence of those kinds of attacks occurring or what your thoughts are about trying to prevent them.

ILYA: Yeah, so I have two thoughts on this. The first one is, when we first looked into releasing GPT-3 through an API, this use case in particular was one that we were really worried about. We really worried about people using GPT-3 to create propaganda and to make persuasive arguments in a particular direction in order to manipulate people or to persuade them. We thought this would be a major use case. Empirically, it hasn't been the case. We were looking for it and we weren't able to find it. So that's good. It doesn't mean it won't emerge in the future. It may mean, for example, that countries who engage in these kinds of activities have sufficiently cheap human labor. So maybe that scale-up...maybe the cost benefit isn't quite there yet for them. Maybe it will be with better GPTs in the future. But I do think that, in addition, there is one possible dynamic (I don't know if it will happen, but I think it could happen and there is a plausible story for it happening) that I'd like to mention. So one thing which I think will hopefully be true is that the attempt for the AI industry to self-regulate will be successful to a significant degree. And so we should expect that all the companies with the best models will be on board with it. And there may be systems from a couple of years behind which will be used for these nefarious purposes. But I would hope that at that point in time, Google, for example, will be able to deploy a massive persuasion detection system in their email or something. It seems totally within reach as well, for that particular concern that you're describing. I wouldn't say it would happen right away. But I could imagine that once that starts to be an issue, once there are indeed bot attacks of this kind, there will be a lot of interest in using similar neural networks to detect the application of such personalized manipulation.

SPENCER: An important advantage in having the cutting-edge systems be in the hands of groups that are more responsible is that maybe that means that they can actually detect the less cutting-edge systems that are one or two years behind. Maybe GPT-3 can fool itself but maybe it's able to tell that GPT-3 is generating something, if you see what I'm saying?

ILYA: Yeah, exactly right, especially if the more powerful system is trying to recognize if less powerful systems are trying to do something, that is definitely a story that they should be successful, or at least they could be successful.

SPENCER: The second big category of AI dangers that I think people are concerned about is a concentration of power where, if you have an AI system that's really, really good, it may enable some groups to just gain a huge leverage over other groups, or maybe even over the rest of the world. And some examples of this would be like, if one AI created by far the best financial trading system ever, and was able to make, not just billions of dollars, but trillions of dollars? Another example would be if AIs can get so smart that they can do automated hacking, and you can have essentially millions of AI hackers hacking every system simultaneously around the world. Or even just making predictions, if you have AI systems that can make predictions far better than humans can about many different things, this could lead to one group having a huge power differential. Or I'll just give one final example. If you have AI systems that can replace human labor and one company could have 100 million AI laborers essentially, and replace large swaths of human labor around the world, but it's all owned by one company. So all of these examples are essentially concentrating power in a way that could potentially be scary. So I'm curious to hear your thoughts on this.

ILYA: Yeah, I think there's a lot of truth to the underlying idea that AI is a concentrating technology simply because of the need for a very large cluster. So the biggest AI systems will always be the most capable and most powerful. I would say that there are two questions you can ask. One is whether there is an overall increase in the power that AI systems wield. I think the answer is definitely yes and this power is going to be increasing because the systems will continue to become increasingly more capable over time. You can make a reasonably strong case that the various capabilities that you mentioned will be increasing continuously. And the reason you see that is because we have some degree of continuity today. For example, we discussed GPT-3 and there are all those things that it can do, just not very well. And you wouldn't want to use it. It kind of gives you a taste. You could say, “Well, one day it could do all those things.” But right now, we wouldn't want to use it. In the same way, with the self-driving car, you can kind of see how great it would be if it worked, but it really doesn't work at all right now. I mean, it kind of works, but it also doesn't. So I think we will see this kind of gradual increase where, as the capabilities increase, people will find new ways to use them in order to grow the economy in productive ways or in harmful ways, like the hacking you mentioned, or some other ways, like manipulation. I think as capabilities increase, people will find new ways of using those capabilities to advance their goals. So I think this is an argument for some degree of continuity. But yeah, I think there is definitely a real concern that AI could lead to more concentration of power than is desired. And I think it may be that society will need to discuss (perhaps ahead of time) how this should be dealt with.

SPENCER: My understanding is that OpenAI has kind of a capped return for its investors? I think it's 100x return and then after that, the idea is to kind of give back any further profits to humanity? Is that right?

ILYA: That's right. The underlying idea there is that, if you have really significant AIs creating an incredible amount of wealth (an actually incredible amount of wealth), if you run into the future a little bit and you allow yourself to imagine the sci-fi ideas of AI being materialized, then the world will look very different. And it seems desirable (it seems desirable to us) to at least have the option to not be forced to maximize revenue at the absolute largest speed possible. Or at least to have one fewer powerful incentive to do so. The future is hard to predict. I don't know what will happen. Maybe it's all going to be fine. Maybe it's going to be like, we have multiple competing AI companies, the price of the services drops to zero, they're profitable, very small, maybe that's fine. We don't know what's going to happen. So at least let's not be forced to go down this route with absolute force if we can just avoid it. Let's not make it so that we are absolutely forced to maximize revenue at the highest speed possible. Let us have the ability to not do so if it seems like the best course of action.

SPENCER: Yeah, it seems like maximizing profit is (laughs) not the ideal when you're talking about the potential for these technologies. If we're just talking about incremental gains, okay, making profit, fine. But if we're talking about really something transformative, where potentially hundreds of millions of jobs or billions of jobs are getting replaced with AI, the effects on society could be so great that we need to proceed with extreme caution.

ILYA: Yeah, pretty much. We are moving into very unknown territory and it's going to be tricky to navigate. So it seems net positive to have more degrees of freedom in your actions and say, “Hey, if we can slow down our growth strategically because it will lead to some better outcome, we should be able to do so.”

SPENCER: We only have a few minutes left. The last thing I want to ask you about is some of the concerns around this third topic, which is the idea that one day we may build an AI that is really more intelligent than humans, or at least so intelligent that we can really think of it as a general intelligence and not just a narrow intelligence. My first question around that is your reaction to the Effective Altruism community, because this is something the Effective Altruism community talks a lot about, the potential dangers from such an AI. And I just wonder how you feel about their critiques.

ILYA: I have to say that this is an example where the community has done some very foresightful thinking about the problem far ahead of its time. There is definitely a possibility. So, let's look at the situation. What is the statement? The statement is, as you said, if you have a truly super intelligent AI, what are the dangers that it poses? Not just people misusing it, which is going to be already a tricky, tricky question to navigate, but risks from the technology itself. And the thing which I will say is, I think it is definitely a positive and a productive thing that the community has done to bring awareness of this question into the minds of more mainstream AI researchers. I think those questions have the potential to be very real and very important. There is the possibility that things will work out. The possibility exists because there will be unexpected discoveries in ML. There'll be radical changes in our understanding as they have happened before. But there's also a possibility that, indeed, those AI systems will be very, very tricky to handle when they are super intelligent. Because if you take seriously the idea of super intelligence, it's one hell of a thing. It's a very, very potent object. And this is a question that we think about a lot at OpenAI. How to deal with this, how we should do research about it, and the strategic implications are all still a work in progress. But, yeah, it's not possible to think seriously about AI and where it's going without thinking about these questions as well.

SPENCER: Yeah, and I think a critique that I've heard some people in the EA community make is that, if these systems have such potential for danger, then we can't responsibly try to build them at all, that until we figure out how to build them safely, we should not build them. Or, at the very least, it should be a collaborative effort where lots of groups work together and take extreme caution. And really make sure to avoid any kind of race dynamics where people are racing and rushing to beat each other to build it. So I'm curious to hear your reaction to that.

ILYA: I'll comment on this question rather than answer it by saying two things. The first is that, indeed, it is a very tricky situation with AI. But I can mention to you the evolution that we've undergone in our own thinking inside OpenAI, and I expect that many labs will go through a similar evolution of thinking as well. When we started OpenAI, we thought that open sourcing would be a very good path forward because you had these big companies and they would be able to concentrate all their progress, and we would open source it. And that would be a way of dealing with this concentration of power that the natural progress in AI would lead to. But then we realized that actually that's not the way to go. Once we truly internalized the power that AI systems will have, it became clear that just open sourcing the technology is no longer the way to go. Instead, the way to go is something much more careful, which is the approach we've taken with the slow releases of our systems through an API, which is slow, deliberate, careful. And that will continue to be more the case as we continue to make progress. I expect that, as the different labs fully internalize just how capable AI will be, then these concerns will propagate into all the labs at the frontier. And so I think there is a reasonable case to be made that care will emerge. But the other thing is that, ultimately, it's not possible to not build AI. And the thing that we have to do is to build AI and do so in the most careful way possible.

SPENCER: I think one of the biggest concerns that I have, personally, is that there will end up being a race as we get closer. And it seems to me that anything that can be done to prevent that is great, because it means that, if you're not racing, then you can take your time, you can really think about risks, you can really be careful. Are you optimistic that, as AI gets more and more capable, the top groups really will work together instead of feeling competition with each other?

ILYA: I definitely feel that, for the Western groups, there's a pretty good chance of that.

SPENCER: That's great to hear. Ilya, thank you. This was a fascinating conversation. I really appreciate you taking the time.

ILYA: Yeah, thank you so much. It's a pleasure.


JOSH: A listener asks: You've mentioned in a number of episodes that causation is important. Why do you think causation is important? For example, evidential decision theory or EDT doesn't use explicitly causal information at all. So what do you think is wrong about that?

SPENCER: I think I have a slightly controversial opinion about causation, which is that I don't think it's fundamental. Well, I'm not very confident, but my best guess is that causation is not fundamental. Causation is like an aspect of a model. So causation comes into play when you start asking questions like, what would have happened had things been different? But that's not the way reality works; reality just does its thing. And then models have this property — you can ask a question about them, like what would happen if things had been different? So I think causality is like an element of a model. And the reason I think it's really important is because we want to do things in the world. We want to make the world different than it would have been had we not existed. And so, when you're in that frame of wanting the world to be different than it would have been had you not existed, then you automatically are concerned with causality because causality is the aspect of the model that lets you ask the question: what do I need to do to achieve this different outcome than would have occurred? But I don't put too much stock in evidential decision theory. I feel like it doesn't perform that well in some of the weird philosophical thought experiments. So, I don't abide by it, particularly.



