July 22, 2021
What is a choreographic interface? Is dance for everyone? What is dance for? What can dancers learn from roboticists, and what can roboticists learn from dancers? What are some of the practical difficulties one encounters when programming a robot to dance? Why do robots break or fail so often? What are the pros and cons of using virtual reality to train or test robots? Why don't we see more robots in everyday life? What are some everyday robotics applications coming up on the horizon? Are humans still needed in the robot training process? Can robots create art?
Catie Cuan is a dancer, choreographer, and researcher. Catie is currently a PhD candidate in the Mechanical Engineering department at Stanford University, where she completed a Master of Science in Mechanical Engineering in spring 2020. Her artistic and research work focuses on dance and robotics. She is a 2018 TED Resident, 2018 ThoughtWorks Arts Resident, and the 2017-2018 Artist-in-Residence at the Robotics, Automation, and Dance Lab at the University of Illinois at Urbana-Champaign. You can find Catie on Instagram at @itscatie and on her website, catiecuan.com.
JOSH: Hello, and welcome to Clearer Thinking with Spencer Greenberg, a podcast about ideas that matter. I'm Josh Castle, the producer of the podcast, and I'm so glad you've joined us today. In this episode, Spencer speaks with Catie Cuan about exploring the design space of motion, developing narratives about the robot-human relationship, and programming robots for virtual reality and AI.
SPENCER: Catie, welcome. It's great to have you here.
CATIE: Thanks for having me, Spencer.
SPENCER: So you had one of the more unusual life paths, taking you through dance and the world of robots. Do you want to tell us a little about how you started dancing with robots?
CATIE: Sure. So after I graduated from college (I went to UC Berkeley), I moved to New York City to take a management consulting job. But I grew up dancing my entire life, and the prospect of being a professional dancer seemed impossible. Especially for me: I'm first-generation, my parents are immigrants, and it didn't seem super viable to consider a full-time dance career. But when I was living in New York, I was dancing on the weekends. I actually wound up pursuing a parallel path in both dance and creative direction, so designing websites and working for a big agency. And when I was doing both of those things at the same time, I thought they had quite a bit in common. I thought I could bring my general technical interest in making websites and working with the internet to my dance practice. So I started using video and different dance technologies, a kind of dance-and-technology hybridization, in my choreographic practice. And then, in 2017, I met a woman named Amy LaViers at the Conference for Research on Choreographic Interfaces, which has very quickly become a marquee gathering place for technologists, designers, academics, and practicing artists who are working in the fields of dance and technology.
SPENCER: What is a Choreographic Interface?
CATIE: Yeah, great question. This conference was founded by a guy named Sidney Skybetter, a longtime New York choreographer who realized the ubiquity of the Apple Watch. Because it had a gyroscope in it, you could track people's motions with a couple of sensors in a nicely packaged wearable watch. And all of a sudden, Apple was doing things like detecting whether or not someone was giving a thumbs up based on how the watch was oriented on someone's wrist, relative to the ground and their own body. And Sidney thought, "Oh, my god, Apple's doing choreography." That's essentially the type of practice they're entrenched in. And the Apple Watch is a choreographic interface because you're inferring things about people's motion from the outputs that you get from the device. So he founded this conference on that premise. At this conference I met Amy, who runs an institution called the Robotics, Automation, and Dance Lab, and she invited me to be the artist in residence at her lab. So, about four years ago, I got to start dancing with robots, and it was phenomenal for me because it was the full feedback loop of getting to program something and then interact with it physically. And that experience was so enlivening for me, and so challenging from a creative and a research point of view, that I decided to apply to grad school. Stanford was the place I endeavored to go because it had this incredible tradition of mechanical engineering, computer science, interdisciplinary work, philosophy, dance, everything. I really see it at the forefront of thinking about some of these topics from a truly interdisciplinary standpoint. I've been at Stanford since the fall of 2018, and I'm getting my Ph.D. in mechanical engineering. It's a robotics Ph.D. mostly, but I study haptic interactions between humans and robots, teleoperation, imitation learning, using virtual reality to control robots, and all of these different kinds of abstracted forms of dancing with robots.
So, I took my artistic practice, and now it's a part of my research practice.
SPENCER: People might be surprised by this because they think of robots as something really practical, right? You make a robot to accomplish some goal, move something from point A to point B, clean something, or whatever. So tell me more about how you think about robots.
CATIE: My definition of a robot is something that has sensing capability. It can sense things about the environment. It has a computer, so it can take that sensory information and analyze it. And it has actuators, so it can move through the environment and, in some cases, reach out and touch it, acting physically on the basis of the information it gets from the computer. So for me, robots have to have sensing, actuation, and computation. But about that definition of robot: I love Andrea Kay, the founder of Silicon Valley Robotics, and she likes to tell me, "Catie, that's a very modern definition of robots." Because for a long time, robots were anything you could program automatically. And that can be a puppet, or a toilet, for example, anything you can program automatically that will act on the basis of that program's information. So, I agree with you; people tend to think of robots as something quite practical. But for me, robot is a much broader category than something that manufactures cars, like these big heavy ABB robots, for example, or something that flies through the air and delivers packages. We've seen those mock-ups from Amazon. So for me, the definition of a robot is slightly more abstract, which means that it can encompass a lot of other things that aren't necessarily strictly practical.
SPENCER: Japanese toilets certainly feel like robots. Like what is this thing? Why so complicated? [laughs]
SPENCER: So why dance with robots? Is it a form of artistic expression? I'm curious how you frame it.
CATIE: I should remark that human beings are some of the only species that dance, and every historical early human society shows evidence of dance (Scientific American has published a lot of nice articles about this). And it's because it codifies this non-verbal, inter-behavioral group dynamic through this rhythmic, repetitive motion. And people dance for all kinds of reasons, right? They used to dance to ask for the rain, celebrate the hunt, and for weddings. And scientists have another word for this intermeshing of motion and music. It is called entrainment. And it's something that we do from the time we're born: we're able to pair sound and movement. And not all dance needs to have sound. But the reason that I provide that as a little bit of a premise is because dance is this incredibly important, foundational part of what it means to be human. And I think everyone can dance, and everyone should dance.
SPENCER: Clearly, you've never seen me on the dance floor.
CATIE: [laughs] Well, I think it's one of the ways that we codify social bonds. And I'm hearing a little bit of self-consciousness in your voice when you talk about being on the dance floor. But I think some of that is a byproduct of the idea that dance belongs in only certain codified, sanctioned-off spaces or that you have to be trained in order to be a dancer or to dance well, and I don't necessarily subscribe to that idea. But maybe I'm hearing a little bit of the self-consciousness in your voice about that because we've sort of evolved to think that dance is not for everyone when I really think that it is. But I provide that as an initial context setting. Because when we see things that move autonomously, when we see shapes moving or random patterns of dots, we are so attuned to paying so much attention to the way they move and the patterns they create because movement is meaningful to us. And I think because robots are embodied, they're three-dimensional, they take up space, they move around, they can reach out and hit their environments, or at least navigate through them, that means that they're engaged in a type of dancing, right? They're engaged in a type of making deliberate motions on behalf of whomever — whether it's a robotics programmer or a choreographer — that's dictated those motions. So, I think dance is like this lens where you can not only plan different motions for robots but also think about what those motions mean in different contexts. I just talked about dancing at weddings, dancing for rain, or dancing as a performance. And those are all completely different contexts where dance means different things. And so, if you apply that same logic to robots, robots moving through a factory are very different from robots moving to a hospital, even though they're moving. In both instances, the context is so different that it means something different to the people there. 
And dancers spend a lot of time thinking about motion and thinking about context. So for me, it seems like the absolute right lens. And actually, Sidney Skybetter (the person who established this conference that I just told you about) is a professor of choreography at Brown. He's co-teaching a choreo-robotics course at Brown next semester with Stefanie Tellex (who's a professor of Computer Science who specializes in robotics). And the course will be attended by dancers, choreographers, and roboticists, and will really look to create a sort of shared pedagogy or shared language around some of the thorny topics we just talked about.
SPENCER: So I'm interested to hear what you think roboticists have to learn from dancers, and maybe the reverse. What have the dancers learned from roboticists?
CATIE: When you take Choreo 101, or you take sources of movement or improvisation, you explore this huge design space of how you can create motion and talk about motion. So those are things like repetition, inversion, transformation (you take a motion you're doing with a hand and do it with your foot), and canon. You learn to talk about space in a particular way: diagonals, upstage, downstage, that sort of transference of moving across in one plane versus other planes. And so, in Choreo 101, or any sort of dance pedagogy, you start to build this incredible toolbox for creating and performing motions. And that toolbox often results in performances. It results in writing. It's really like the study of how to make and do motion, and then how to perform it for people in different contexts. So, I think what roboticists have to learn from dancers is this huge vocabulary. A roboticist might say: "Oh, I got the robot to do this thing. I did it on the basis that I wanted the gripper to hit some point in Cartesian space. And so, I ran a basic inverse kinematics algorithm. And I don't really know what I just did from a naming or nomenclature point of view." Whereas the dancer might say that they call that a space hold, for instance. There's a parallel term for that in dance. So, I think that's one, which is this huge design space, and a design space that can possibly be more experimental and strange than the ones that roboticists are used to working in (I'll provide you with a concrete example). So, sometimes dancers decide to make an "ugly" dance, right? You're bending your leg, and you're not really on the music. It's a little bit rushy, or all over the place, chaotic, disorienting. And that's a viable dance, just like you have Swan Lake, where it's very measured, beautiful, and extremely linear. And so dancers are used to having not only this rich vocabulary but also this huge design space, being able to make motion that's different in all these different ways.
And roboticists — because you don't want to break the robot because you don't want to put undue stress on some of the mechanical components — you design robot motion, by and large, to be a certain way. And so, I think roboticists can also learn about these "ugly" or possibly less-explored types of motion when they learn and work with dancers.
SPENCER: Well, one thing I take away from what you're saying is that dancers have basically come up with a language for describing motion through three-dimensional space, and that maybe roboticists can learn from that language, which talks about what the motion is and about different aspects of it that most roboticists may not be aware of.
CATIE: Yeah, that's exactly right. And what I would also add on to that is that the term choreography is literally a portmanteau. It means dance writing, so it comes from dance notation. In the same way that musicians have this wonderful practice of being able to write a score (you can replicate and recreate the music that's being generated simply by looking at it), that was one of the initial use cases of choreography, which, by and large, is the same argument that roboticists make. They use a different vocabulary in order to codify motion. They use a vocabulary of these joint angles, or this particular orientation relative to this point, or this distance from the initial starting point until symmetry. So roboticists have that written language, too. It's just very different. And I would argue, in some ways, a little bit smaller. And so, what roboticists and dancers stand to benefit from working together is this kind of mashing, broadening, smashing up of these two languages and seeing what other possibilities fall out of it.
SPENCER: I see. So there are two different ways of describing this motion. And maybe there's some aspect that choreography does better than robotics language and vice versa.
CATIE: Yeah, absolutely. And I think that comes from the tradition that they're involved in. As you said at the beginning of this conversation, there's this expectation that robots come from a place of practicality, whereas dance comes from a place of expression. And when you want to express, you want all of the paints in the art store, right? Because you're like, "Well, what if I want to make magenta versus deep saturated purple? Give me all the colors." Whereas in robotics, you want to do something optimally, so maybe you do want the full landscape of possibilities, but maybe you don't. Maybe you already know, based on the math that's in front of you, what's going to give you the shortest trajectory or what's going to give you the quickest outcome. So I think there's that push-and-pull between "Do we want this full, rich history of expression?" and "Do we want to make some constrained and limiting assumptions about how much expression we want?" because we know that we're looking for these types of optimal constraints. And I don't know if the tradition of dance is truly an expressive one or if the tradition of robotics is truly an optimal one. I think it has been, but I think that's changing.
SPENCER: Well, it seems like when robots interact with people, as we can expect to see more and more over time, then these questions of optimal motion will change, right? Because it's not just about making the most efficient motion possible, it's your motion that will actually affect the person standing next to you. Maybe if you make the most efficient motion, it could look scary, unfriendly, or whatever.
CATIE: Totally, 1,000%. And think about where that applies, right? I think about my parents, who are a perfect example. There's been so much exploration about robots being used for eldercare, so that people like my dad, who lives alone in his 70s, can live at home for as long as possible. My dad cares a lot about how that robot moves around him: whether it spooks him or makes him feel comfortable, whether he feels empowered by the robot or belittled by the robot. And as we talked about at the beginning of this conversation, motion is unconsciously important to human beings. And in a context where the robot has to move around a person, whether it's my dad in his house or a bunch of people in Costco or Walmart restocking shelves, or whether it's in an airport, the stakes of that motion are so heightened. And then you exponentiate that when you have more and more robots, right? Like in San Francisco, the Knightscope robots are going around. They look like big cameras on tripods, essentially, that kind of capsule-like object. And sometimes, you'll see more than one of them on a single block. And it's like, "Oh, my god, there are two robots now."
SPENCER: What do they do?
CATIE: Well, they surveil. So they mostly check to see if, I suppose, there are people in the immediate vicinity doing things the operators don't want them to be doing. But when there are multiple of them, and they're moving around you, and you're outside, it's like, wow, the interactive capability of those robots tends to boil down to how they move around me. They're not talking to me; they're not blinking lights, there's no screen, and all I have to infer about their actions is how they move. And so I come back to this point about the choreographic interface. When something moves, it winds up becoming an interface. The whole robot winds up becoming an interface. And by way of that status, how it moves winds up being a primary component of that interface.
SPENCER: I went to see this amazing show of synchronized quadcopters. I think there were probably well over 100 of them moving in a dance-like fashion to music. And there was just some moment during the show where I just got extremely creeped out when it suddenly felt like this was like a swarm of bees or something and had all these flashes of light. And you started to think, "wow, this could be used in war." And something is terrifying about 100 robots flying to create… but also can be very beautiful in another moment. So that was my personal experience with this.
CATIE: What do you think triggered that for you? There's just you got partway through the performance, and then you started to feel different?
SPENCER: It was to your point, I think. It has a lot to do with motion, right? There's a motion that feels like these are friendly creatures, if you will. And then there's a motion that feels like these could be violent, these could hurt people. And all these psychological factors are in the motion, just like you're pointing at.
CATIE: And that can change on a dime, right? And this is what choreographers play with and oscillate all the time. You can make deliberate choices about those things. I'll provide a tangible example. When I did a TED talk a couple of years ago, I also did this TED dance with a little SoftBank NAO robot, which is two feet high and looks like a humanoid. And there was a moment in the dance where we'd been dancing together and then I was going to kick the robot over, literally kick it over as a symbolic move of "I am reclaiming my humanity in this whole interaction." And when we got to that point in the dance (kid you not), the robot had been kind of buggy. Despite the fact that we'd run this piece 100 times, the robot had been buggy that day (that day of all the days). And then, at the exact moment when I was about to kick the robot over, it overheated, and it bent at the waist and fell backward. So it looked as though the robot had died, or the robot had collapsed or given up. And it was so insane because it completely changed the meaning of that piece. And we only filmed it once. So the original intended trajectory of the piece was humans and robots collaborating and being together, and then humans deciding that they can move beyond robots and reclaiming their own humanity. And where the piece wound up was that humans and robots are collaborating together, and then the robot, a victim of the trials and tribulations of the world, decides to fold and becomes a sympathetic character because the universe is so hard. And then, the human being has to pick up and care for the robot when it breaks down. And so, I think this is to your point about the swarm that sort of freaked you out [laughs]. Those choices can be deliberate, right? The artists can have a very clear intention about how you as the participant, you as the viewer, are going to walk away from that experience.
And because robots, by and large, have been in fiction and in our imaginations for so long, a lot of that transference of artistic intention to robot impression has happened already. Because you've seen a lot of movies about robots, you've read a lot of books about robots, you've seen a lot of plays about robots. But I'd just underscore that, for the type of feeling that you walked away with, someone who's really astute about how to make those choices can choose and oscillate those emotions for you by how they design the piece.
SPENCER: So let's go back to the inverse question. What can dance learn from robots?
CATIE: There are so many layers to this question, and hopefully I'm not going to rant too long about this [laughs]. So for a long time, the idea of what it meant to be a dancer (with a capital D) seemed like you've got to join a dance company, you have to have a contract, you're dancing with this dance company for X number of years, you get promoted through the ranks, and it's your full-time job. And you are part of this coterie of performers that sit underneath the pharaoh choreographer, who's just handing dances down, and you're the recipient of those dances. And that's a very Western ballet, contemporary, modern tradition of what dancer meant with a capital D. But I think that when you spend X number of years and X number of hours thinking about the minutiae of how your body moves through space, becoming very learned in your ability to take instruction about how to move, to interpret motion, to create motion, that is a type of knowledge. And for so long, it seemed like it only belonged in performances. But that expertise can go a lot of different places. You can design personalities for robots based on how they move. You can design interaction modes for Apple Watches based on how they're oriented, like the thumbs-up, thumbs-down I mentioned at the beginning. But in order to do that, you need to learn about all of those other fields. There needs to be shared knowledge. And so what I think dancers can learn from roboticists, and what roboticists really offer dancers, is a new platform, a new body that can employ their knowledge: take your dance knowledge and put it into some of these other contexts, like the Apple Watch, or the robot. We do need to expand our skill sets, right? You do need to learn how to program. You do need to learn how sensors work, and why robot motion planners are designed a certain way. The idea that everyone needs to learn how to code is a tough one for me to unpack.
But I think how I would say it here is that roboticists can force dancers out of this puritanical mindset that dance is a thing that's done by and for people, because dance is a thing that happens all the time. There was this beautiful photo essay in the New York Times recently, where they showed people reaching for hotdogs over a hotdog stand, or pulling a newspaper out of the container. And you're engaged and entrenched in this fully investigatory body moment when you're reaching for the hotdog. And that is a type of dancing. And that's what roboticists are doing when they're making robots do things, because they're engaged in a type of choreography. And dance is a field (I would argue) that is going to become so staid if the only way that we can think about dance is that it belongs in a proscenium, on a stage, for people. And so, roboticists can implore dancers to think about their expertise, and to think about dance as something that goes beyond the human body, beyond the stage, and belongs to many different kinds of interactions in many different kinds of places. And if we don't do that, as a field, I don't know where dance goes from there. And we're seeing that stratification happen in lots of places. People are dancing on Instagram and on television, and people are uploading videos of themselves doing dances in their living rooms because that's what we have now. But I think in order for us to progress as a field and to unlock all of that knowledge that I was describing earlier that dancers have, we need to explode the idea that dance only happens in certain places with certain bodies, and I think roboticists can force dancers to do that.
SPENCER: One thing I'd really like to dig into is just the practical issues of programming a robot. Could you tell us a bit about what it actually looks like?
CATIE: It depends on the kind of robot and on the tools that you're using. A lot of robotics companies like ABB have a trackpad. It's got a little knob on it, a little nugget that you sort of turn around. And on this trackpad, you can command each of the joints to a certain point. So, that would be making choices about the robot in what we call joint space. I should also say right now that when I'm providing a lot of these descriptors, I'm talking about the most popular kinds of robots, which are serial manipulators. Serial manipulator just means that you have a lot of joints in a row (for example, 1, 2, 3, 4, 5), and those joints can be lots of different kinds of joints. They can be prismatic, which means they slide back and forth like beads on a string, or they could be revolute, which means they rotate around like a clock hand on a clock face, or they could be spherical, which is a little more like a ball and socket. And if you have a bunch of these joints in a row, each joint connected to the next, you have what's called a serial manipulator. And one of the ways that you program those is by commanding each of those joints, either simultaneously or in sequence. So for example, if I'm looking at my own arm and I'm wanting to command the joints, I could command the elbow and then the wrist, or the wrist and then the elbow, or both at the same time. And you can do that with a trackpad. You could do that with a software interface like Rhino (I think that's one that I've used before with ABB), or you could do that with any type of robot programming software that's able to translate your high-level Python or C++ commands into low-level joint torques for all of the motors that comprise each of those different joints.
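[Illustrative sketch, not part of the conversation: the joint-space commanding Catie describes, one joint at a time or several at once, might look roughly like this in Python. The `Joint` and `SerialArm` classes and the joint names are hypothetical and are not ABB's or any other vendor's API; a real controller would go on to turn these targets into motor commands.]

```python
from dataclasses import dataclass


@dataclass
class Joint:
    """One joint in the chain (hypothetical model, for illustration only)."""
    name: str
    kind: str           # "revolute", "prismatic", or "spherical"
    value: float = 0.0  # radians for revolute, metres for prismatic


class SerialArm:
    """A serial manipulator: joints in a row, each connected to the next."""

    def __init__(self, joints):
        self.joints = {j.name: j for j in joints}
        self.log = []  # one entry per command step

    def command(self, **targets):
        """Set every joint named in `targets` within a single step."""
        for name, value in targets.items():
            self.joints[name].value = value
        self.log.append(dict(targets))


arm = SerialArm([Joint("elbow", "revolute"), Joint("wrist", "revolute")])

# In sequence: the elbow first, then the wrist.
arm.command(elbow=0.5)
arm.command(wrist=-0.2)

# Simultaneously: both joints in the same step.
arm.command(elbow=1.0, wrist=0.3)
```

Each call to `command` stands for one step, so the log here holds two single-joint steps followed by one two-joint step.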
SPENCER: Do you have a series of commands that are “move this joint in this direction,” “this amount,” or that kind of thing?
CATIE: Totally. And so you could input exactly what you just described in a bunch of different ways. You could write it as a line in Python code. You could do it on a trackpad, where you just literally select the joint and then twist a knob to decide how far you want to go. You can do it with your body. Something that we worked on in a project that I did with ThoughtWorks was: if I'm just moving through free space, can they take my extracted skeleton from a depth camera (like a Microsoft Kinect) and use that to make a robot move?
SPENCER: So each of your joints is mapped onto one of the robot's joints, is that the idea?
CATIE: Yeah, exactly. But fundamentally, all of those different kinds of interfaces work at various degrees of abstraction. At the end of the day, they're sending a series of joint commands (torques, positions, or forces) that drive each motor to a certain location or a certain value. And that's what we call, in robotics, the low-level controllers. And those controllers can also have to do with Cartesian space. If I just want my hand to go to a certain point in space with a certain orientation, and I don't really care about where my elbow and my wrist end up, I just want my hand to be somewhere, that's called a Cartesian space command. And then you could let all of the other joints just kind of optimize for a certain equation to make sure that you reach that point in space.
SPENCER: So you'll say “move the hand from this XYZ position to that XYZ position” and then it's able to do some automatic calculations to figure out how all the joints have to move to make that happen efficiently, is that right?
CATIE: Exactly. And you could do that in either a forward or backward way. You could call that forward kinematics or inverse kinematics. If you're picking the location of the hand, it's an inverse kinematics instruction. And all of those are also predicated on a lot of constraints the robot has. Some robots' limbs are only a certain length, and by that constraint, you're not going to be able to get to certain points in space. Sometimes the robots have joints that only move 90 degrees instead of 360 degrees. All of those different types of constraints you have to account for as well, which a lot of these nice simulation platforms or robot control platforms have already done for you. They've done all the hardcore math. I love that you asked about programming a robot because in our mind's eye, I think, before I started actually doing this myself and was really entrenched in how to do it, it all just struck me as kind of magical. Oh, the robot moves. Whereas when you actually get down to the nitty-gritty, and you're deciding how to tell it what to do, you realize that generating motions for robots is a fairly precise process.
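[Illustrative sketch, not part of the conversation: forward and inverse kinematics for the simplest possible case, a two-link arm moving in a plane. The link lengths, the joint names "shoulder" and "elbow", and the elbow limits are all assumptions made for this example; real robots have more joints and rely on a control or simulation platform for this math.]

```python
import math

# Assumed link lengths in metres (upper arm, forearm).
L1, L2 = 1.0, 0.8


def forward_kinematics(shoulder, elbow):
    """Joint angles (radians) -> hand position (x, y)."""
    x = L1 * math.cos(shoulder) + L2 * math.cos(shoulder + elbow)
    y = L1 * math.sin(shoulder) + L2 * math.sin(shoulder + elbow)
    return x, y


def inverse_kinematics(x, y, elbow_limits=(0.0, math.pi)):
    """Hand position -> joint angles, or None if the point is unreachable.

    A point can be unreachable because the limbs are too short or long
    for it, or because the elbow would have to bend past its limits.
    """
    cos_elbow = (x * x + y * y - L1 ** 2 - L2 ** 2) / (2 * L1 * L2)
    if abs(cos_elbow) > 1.0:
        return None  # outside the reach allowed by the link lengths
    elbow = math.acos(cos_elbow)  # one of the two mirror-image solutions
    low, high = elbow_limits
    if not low <= elbow <= high:
        return None  # the joint cannot rotate that far
    shoulder = math.atan2(y, x) - math.atan2(
        L2 * math.sin(elbow), L1 + L2 * math.cos(elbow)
    )
    return shoulder, elbow


target = (1.2, 0.6)
angles = inverse_kinematics(*target)   # pick the hand location...
reached = forward_kinematics(*angles)  # ...and check where the joints put it

too_far = inverse_kinematics(5.0, 0.0)                                # limbs too short
too_bent = inverse_kinematics(0.3, 0.0, elbow_limits=(0.0, math.pi / 2))  # 90-degree elbow
```

Running the joint angles back through `forward_kinematics` should land on the chosen target, while the last two calls return `None`, which mirrors the two kinds of constraint mentioned above: limb length and limited joint rotation.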
SPENCER: Now it's making me think. Imagine I had to say how to dance by spelling out the movements of different joints. That sounds like a difficult problem.
CATIE: Totally. Whereas with people, you just stand in front of a group of dancers, and you move around. They're so gifted that they can watch you move, and then replicate the motion almost automatically. And so one of the pieces — actually the first piece I ever made with robots that became this 20-headed monster of so many different videos, software, performances, all these different components — is actually called Time-To-Compile because human beings can learn motion almost immediately. And when you have to go through all of the teeny tiny tunings of all of these robot motions, it takes a long time, which is why the piece was called Time-To-Compile because the compile times between humans and robots were so different.
SPENCER: So what was that piece like?
CATIE: So we had a bunch of different components under that umbrella. There was a short film between me and the Baxter that was called Partitions. Baxter is a hybrid research industrial robot from Rethink Robotics. We had that as a short film. We had this big installation which we called The Loop, which took all of these different sensors and all of these different robots and interconnected them into these different rooms. So you think that you're dancing with one robot, but actually, that robot is populating the motion of another robot that another person is dancing with. And so you have this very long game of movement telephone.
SPENCER: Let's unpack The Loop. I think that was really cool. I definitely recommend searching for it online. As I understand it, the idea of The Loop was that you would have a human watching a robot and trying to imitate it. And then you'd have another robot that's following what that human is doing, etc. But instead of there being a starting point, it just kind of feeds back on itself. So from the perspective of each human or robot, they're just kind of copying what's happening in front of them. And then the motion sort of arises organically from everyone copying everyone else. Is that right?
CATIE: You summarized it beautifully. Yeah, and I should share the way we came up with this piece. I was working in the RAD Lab from 9 am to 9 pm, or whatever, Monday through Saturday. And I really loved mirroring, especially this Baxter robot. And so we would — we, meaning Shawn and Nov. Nov was an undergrad at the time, and Sean was a grad student — and we would work these crazy long hours in the RAD Lab, and we would program a sequence onto the Baxter. For example, we just had it move its arms around. And as a dancer, I just loved watching the robot and repeating that motion back to it. It was really stimulating for me as a performer and a dancer to do that. And the Baxter and I would have these one-to-one interactions. And then at the same time, we were also creating these kinds of interactions in VR with a robot avatar — a moving robot in virtual reality while you're wearing an HTC Vive. And I realized that we think we're having these partitioned interactions, but if we connect all of these robots together (like this Baxter, the VR avatar, the video of a moving robot), we can connect them all together. And basically, all the people who are moving in front of them wind up becoming, simultaneously, the interpreter and performer of the motion that they're seeing from the Baxter, and also the generator of motion for another robot in another room. And for me, that felt like a big commentary on society, because we're all sitting behind these screens, sort of mirroring, translating, receiving, sending in our body data, like our body information is traveling all over the place. And that piece, The Loop, was totally — you could expand it and contract it, right? 
You could add two more nodes to the loop, and you would have five segments, and you would have a loop with five nodes, or you could contract it, and you could have a loop with three nodes, because in each of these interactions you could expand and contract based on the number of sensors and actuators, the number of sensors and robots you wanted to use. And that — I should also say, Spencer — that piece broke so many times. And I think that the other thing that people don't realize about dancing with robots is they see these wonderful, fully polished, nicely edited videos online when in fact robots break all the time. And they also break (like that TED example that I provided) at the exact moment when you cannot have them break. And that's what's so hard too about performing with robots: it creates this heightened state of awareness for me because I'm so worried that the thing is going to break down. And that piece, The Loop that we did, we did it with a bunch of different audiences and so many different permutations. And I think there was maybe like once or twice when it didn't completely fall apart.
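As a rough illustration of the ring structure described above, here is a minimal Python sketch. It is an idealized toy, not the installation's actual software: it assumes every node copies the previous node perfectly, with no sensor noise and nothing breaking, and the names `step` and `add_node` are hypothetical.

```python
from typing import List


def step(shown: List[str]) -> List[str]:
    """Advance the loop by one beat.

    Each node (a robot, a VR avatar, a video) now shows the motion that the
    node before it in the ring was showing, so there is no starting point.
    """
    n = len(shown)
    return [shown[(i - 1) % n] for i in range(n)]


def add_node(shown: List[str], index: int, pose: str = "still") -> List[str]:
    """Expand the loop by inserting one more node at the given position."""
    return shown[:index] + [pose] + shown[index:]


# A three-node loop where only the first node starts out moving.
loop = ["wave", "still", "still"]
for _ in range(3):
    loop = step(loop)
    print(loop)
```

In this toy version a motion introduced at one node travels all the way around and comes back to where it started after as many beats as there are nodes, which is the "movement telephone" idea with perfect transmission.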
SPENCER: Why did they break so much?
CATIE: I think there's the kind of rational explanation [laughs] about why things break, and then, a slightly irrational one that I tend to subscribe to. The rational reason why I think some of this stuff breaks is not everything runs perfectly every time, right? You export software onto a robot, and sometimes there's a bug that day, or somebody else has uploaded a different program onto the robot.
SPENCER: So, it’s more software and not hardware malfunctions?
CATIE: Yeah, totally. Like someone uploaded a different piece of software, and all of a sudden, the robot doesn't have the right drivers activated when you run your program. And then it doesn't want to work and then you don't know why. Sometimes there are hardware errors, like the one I mentioned with the NAO robot. It literally just overheated and fell down, and I had never had that happen before, but it's because we had the robot on for a really long time. And we were waiting, waiting, waiting to start the show and for the audience to come in, and then they came in, and then the robot broke. There can be hardware errors, there can be software errors. Sometimes, the internet connection is really bad too. And you need the robot to be updating at a certain frequency, and it can't update that fast because the internet where you're performing is slow, or the live feed that you're trying to get the 360 camera to run on YouTube is slow, it's just not connecting. And even in the few years that I've been doing this, all of those things have become more robust. But it just breaks my heart to think about the number of times things have broken, and not worked. It's so painful. And then I think there are kind of irrational reasons why a lot of this stuff happens. I have had so many experiences where everything is perfect, up until the exact moment when people have to see it. And the number of times that has happened is so spooky. It kind of makes me think... Robots aren't sentient, obviously; they have no consciousness. I'm going to be very, very clear about that. But for whatever reason, there's some atmosphere that happens with my energy when I'm getting ready to perform, where even though the robot has done something a hundred times, I feel like my energy goes on to the machine and it will just break on occasion.
SPENCER: I hope this is bad luck, and not you infusing the robots with mischievous energy.
CATIE: [laughs] Or maybe it will be that point that you just described. And yeah, bad luck sounds a little more palatable.
SPENCER: One thing I want to ask about is programming robots to run in virtual reality so that you can kind of simulate them without having to deploy this big chunky thing. I want to talk about that for a moment.
CATIE: So this has been around for quite some time. Stefanie Tellex, who I mentioned earlier, and some of her students, David Whitney and Eric Rosen; there are so many people who study this. My advisor, Allison Okamura, is also very well known for this. She's worked on programming robots and on virtual reality for teleoperation, primarily for medical purposes. If you have a heart surgeon who's in Montana, and you have a patient who's in Baltimore, someone might be able to teleoperate a robot like the Da Vinci. You do it with this very fancy, custom teleoperation interface. You don't do it in virtual reality, but there are definitely a number of viable use cases for why you might want to teleoperate a robot in virtual reality. And it's sort of exactly what you might think. You take the position of a human being's controller, where their hand is the VR controller that they're holding, and you perform some sort of scaling in order to move the robot's arm and the robot's end-effector around. So if I move my hand, let's say six inches in front of me, the robot will move its hand six inches in front of it.
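The position mapping Catie describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the interface used in any particular lab: the `Vec3` type, the `end_effector_target` function, and the `scale` parameter are hypothetical names, positions are in metres, and a real system would also handle orientation, workspace limits, and safety checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    """A 3D position in metres (hypothetical helper type)."""
    x: float
    y: float
    z: float

    def __add__(self, other: "Vec3") -> "Vec3":
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __sub__(self, other: "Vec3") -> "Vec3":
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)

    def scaled(self, k: float) -> "Vec3":
        return Vec3(self.x * k, self.y * k, self.z * k)


def end_effector_target(controller_now: Vec3,
                        controller_start: Vec3,
                        effector_start: Vec3,
                        scale: float = 1.0) -> Vec3:
    """Map how far the hand-held controller has moved since teleoperation
    began onto a target position for the robot's end-effector.

    scale = 1.0 reproduces the hand motion one-to-one; a smaller value
    shrinks the motion and a larger value amplifies it.
    """
    displacement = controller_now - controller_start
    return effector_start + displacement.scaled(scale)


# Hand moves six inches (0.1524 m) forward along x with one-to-one scaling.
six_inches = 0.1524
target = end_effector_target(
    controller_now=Vec3(six_inches, 0.0, 0.0),
    controller_start=Vec3(0.0, 0.0, 0.0),
    effector_start=Vec3(0.5, 0.0, 0.3),
)
print(target)
```

With `scale` left at 1.0 the end-effector target moves by the same six inches as the hand, matching the example in the conversation; changing `scale` is one simple way to express the "some sort of scaling" step.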
SPENCER: So you're almost like inhabiting the robot. And it sounds like a video game where you become a robot?
CATIE: Totally. And that would be a type of position translation. Sometimes people will hook up what the robot sees to what you see, so then you really do feel like you're embodying the robot. And there's been some cool theory about how people tend to pick up the rest of the cognitive load. So if the robot is not actually doing what you want it to be doing, human beings use our incredible dexterity and ability to adjust to the circumstances, and figure out how to do it. So for example, if you're trying to get the robot to push through a door, but the robot arm is too short, human beings would infer that you should move forward, right? So we get to put the burden of cognitive load on human beings in typical VR. And then, you can get haptic feedback in virtual reality, and get cues about the environment in virtual reality. You can simulate different kinds of physics so that you can make things heavier, lighter, etc. All of that allows researchers to learn a lot about what types of interfaces are going to allow people to teleoperate those robots more interestingly. But I agree, it's totally video game inspired. I think the first time I ever did that was actually at Brown in 2018 and I was like, "This is insane!" Getting to be in virtual reality, I feel like I am the robot, moving my arms around, and the robot's arms are moving around. It was totally mind-boggling the first time I did that.
SPENCER: There seems to be this effect where, when we get very proficient with a tool, we start almost acting like it's part of our body. You imagine someone who is a fencer and fights with a sword, or is a violinist, and it sort of becomes second nature to use that tool. And so your brain has kind of adopted it as part of itself. And I wonder, if you're in a robot for 10 hours in virtual reality, would you momentarily forget that you're not this robot?
CATIE: Yeah, I think you just asked about something so compelling, which is this idea that I can inhabit or be another body that's not my own. The first time I heard about that from an artistic point of view — Espronceda, the Institute of Art and Culture, on Espronceda Street in Barcelona. And it's called BeAnotherLab. And they came out with a piece in 2014 (or this is 2012, actually), where you literally stand in front of this mirror and move around. It's called an embodiment virtual reality system. And it's another body that is in the mirror, it's not you, right? And it was this idea that you could swap genders, or you could be a different human being by participating in this mirror. And I think the first time I saw that piece, it was so insane to me, because we talk a lot about technology's ability to “connect us” in the sort of Facebook connection use case. But this piece was doing that from an embodiment point of view: I could literally be another human being and move through their body. And this lab has explored that in a number of different ways. They've had versions where you can put on a VR headset, and you're a person who maybe is a different race than you are. And it allows you to experience or feel a little bit (obviously, it's not going to be perfect) what it might be like to move through the world with that kind of body. And so you asked about a sort of human-to-robot empathy. This is a little bit more of a human-to-human empathy. But I thought that piece was so beautiful because we talk a lot about walking a mile in someone else's moccasins, and this is literally doing that.
SPENCER: Many say there's another use of VR with robots, which is the reverse where you're simulating the robot in VR as a way to learn about the right motions or experiment with motions or things like that. Do you want to talk about that briefly?
CATIE: That's a very popular use case. VR, in the way that you're using it, and I'm expanding that a little bit to also say simulation. So VR is a type of simulation, whereas you can also simulate on a computer screen. But people are doing this for all kinds of stuff right now. For example, at UC Berkeley, Pieter Abbeel and Ken Goldberg are creating these robots in sim in their research labs, and they're having them pick up 10,000 different objects and move them around. Because you can render water bottles, boxes of cereal, pieces of beef jerky, etc. You can render them in simulation, and then have the robot practice picking up and moving objects around, and you can generate huge amounts of simulated robot learning data, training the robot on how to do things by doing it in simulation. And you could do that multiplied, take that to the exponential. So I could do that with 20 robots in Unity (which is a VR development platform) and have each of them practice picking up and moving objects for 10 hours, all in simulation, and then use that data to create a policy for how that robot should pick up objects in the real world. And so, that's become an incredibly popular use case for simulation platforms and virtual reality platforms. And in addition, there are a lot of questions about when we learn a bunch of stuff in simulation, in virtual reality, or in a two-dimensional simulation platform, how well does it translate to the robot being able to actually do this in the real world?
SPENCER: This is simulated physics and everything, right?
CATIE: The physics is super — especially to pick up and move something. I'm just picking up my mouse right now, and I'm applying a normal force and a shear, and I am using all of the kinesthetic information from my joints and bones to decide how much my muscles should support this thing, right? Picking something up is not that easy of a task. There have been a lot of modifications to the task. Like, again at Berkeley, they put suction cups and stuff like that on the robot, to try to do it without all of those more subtle touchpoints. It's become a wildly useful tool, in order to be able to train robots and get them to do things. And it's also a way that you can get a lot of data, as you said, without having to learn to program the robot itself.
SPENCER: But it's not just the number of hours you can do in simulation. I imagine you could simulate much faster than it can move in real life.
CATIE: Yeah, I think all of that and more. Plus, maybe I don't have 600 different types of beef jerky sticks in all of the convenience stores around me, but I can simulate 600 of them. Or I can just pull 600 mesh files off the internet and all of a sudden I have it. So, there are some practical constraints around that, too.
SPENCER: It's interesting to think that the robot's brain is essentially software, plus maybe in some cases, a machine learning algorithm that's been trained. And you can kind of take their brain, temporarily put it into the simulated world, have it learn a bunch of stuff, like updating its machine learning models, then download it back into the physical body, or copy it across 100 physical bodies, or 1000 physical bodies. It's a weird idea. It's so different from the way we as humans are kind of stuck in one body, and that's it.
CATIE: It's precisely that. And it's because we have full knowledge of how that robot body is designed and how it operates. And just as I was saying, because robots are designed by people, you know exactly how long that limb is and exactly how far those joints can move. And you can do all kinds of calibration to make sure that your understanding of the robot is true. And you could make the same argument for people, right? You could say, "I know exactly how long my forearm is," and people are doing all kinds of mocap on track and field athletes, and stuff like that. There's a lot of human sensing and that kind of capability. But I agree it's not only that you can import and export information across these brains and across a digital body versus a real body. There are a lot of reasons why that's hard to do with people, which aren't just the fact that we only have one body; it's also that our bodies are tricky. We think that we have perfect knowledge about how they function, but we don't even know how our brains work. If we wanted to employ a similar thing, if I wanted to learn a lot about how to pick up an object, export that information, and import it onto you, or onto my fiance, or onto my parents, or whatever, it's hard to have complete knowledge about what our bodies can and can't do. Whereas because robots have been CADed and manufactured, and built, and all that stuff, we have a better understanding of what they can and can't do.
SPENCER: So why is it that we don't see more robots in everyday life? I think if you ask people like in the 80s, I think they would have said, "Oh, yeah, we're gonna have all kinds of robots all around in a few years, or at least a couple of decades." And it's like where are all the robots?
CATIE: Robots are hard. There's a nice video that they play in the Intro to Robotics class where they show an autonomous vehicle parking itself in the 90s — I think the 80s or the 90s. And it's a solved problem that the autonomous vehicle parallel parks itself. But to do that at 99.9% performance across every single parking lot, in every single weather condition, with every single surface of asphalt, grass, sidewalk, whatever, is really hard, and it's hard to do inexpensively. I should say Mark Yim, who is the head of the GRASP Lab at Penn, talks about this a lot. Low-cost robots that can do things robustly over a long period of time without breaking, at 100% or 99.9% predictable performance, are challenging. So for example, there's a teleoperation device that's really popular called the Force Dimension Sigma 7. It's essentially a handle with a nice couple of metal beams off it. And as you move this handle around, you can get a robot to move and it gives you some haptic feedback. Those are $50,000 because the motors and the sensors that are in there are so expensive, because you need this incredibly fine, precise, high resolution. It has to be nicely designed, and it needs to hook up to a computer and provide all the information at a really high frequency. And those aren't cheap. And so there are only so many use cases where a price threshold like that makes perfect sense. Medicine is a really good one: medical robots, the Da Vinci Surgical Robot, etc., or even robots that are used in hospitals, because nurses spend so much time moving prescriptions around and moving literal stuff from one part of the hospital to another. There's a very clear price point and use case for a robot that's fairly expensive and equipped with a lot of high-expense sensors, calibration, and perception stuff. There are only so many instances where that high price point provides a very rational and necessary benefit. 
As sensors and actuators are really expensive, especially really good ones, I think that's maybe one reason why we don't see more robots. Also, I would say there's the question of a general thesis about what robots are good for that you couldn't do with other things. So we see all the time that we can have a smartphone that's also a camera. It doesn't seem like there's a huge benefit in my house to having a smartphone that's also a camera that can also move around, which would kind of make it a little more robotic. Why would I have that in my house? There's sometimes a question of, "Well, what are robots good for?" or "What do we need them for?" If you have a really good high-performing dishwasher that's going to last you 20 years, why do you need a dishwashing robot? So I think there are also some questions around what the use cases are. What are the instances where we actually need robots, where they would be really helpful for us? And by and large, a lot of these use cases right now are still ones like construction. Construction robotics is booming. When you have to put in drywall, for example, that's something that a robot could do really well. When you have a 50,000-square-foot construction site and it's hard for you to install surveillance cameras throughout the entire site, a robot that has a camera on it can navigate through that space. That's what Spot, the robot, is doing in a lot of construction places right now. If you have 50 miles of solar panels, it’s hard to get those perfectly clean so they can operate at the highest efficiency level. That would be a great thing for robots to do. So maybe those are the two reasons that I'll offer first: robots are still pretty expensive, since robots have to perform robustly in all of these different, changing environments; and the use cases for robots are still ones where we don't necessarily see them in our day-to-day.
SPENCER: So we think about consumer robots. The Roomba is probably the most famous, right? That will clean your house and it seems to work quite well, people seem to like it a lot. What are some other interesting applications you see maybe coming soon? Or maybe they're already here that are not as well known? You mentioned elderly care, I think that's a really interesting one. Do you want to talk about that first?
CATIE: Yeah, in Japan especially. And I can't speak too much to this, but I've seen a few articles about it. In Japan, there's a lot of exploration, like the PARO robot, a soft, seal-shaped robot that people can cuddle. And that is something which has a use case if you're perhaps living in a nursing home and maybe it's not really safe for you to have a dog or a pet or something like that, but it would be really nice to have something that provides a little bit of that animal-like companionship. There are PARO robots that are designed for that explicit purpose. I think I've seen some examples of robots that help people stand up and get out of bed, for example, which would allow you a little more independence. If I didn't have to call a nurse every time I need to get out of bed to use the bathroom, but I had a robot in my room that was going to help me get up, that would be a fantastic use case for eldercare and robots. And then some of these household things that I also mentioned. If my dad wanted to stay at his house and he was having trouble with his dexterity, and it wasn't so easy for him to cook, then rather than ordering food all the time, he could have a food-preparing robot that makes salads or is able to pull things out of the fridge, put them on the skillet, and then clean the dishes when they're done. That is something that a bunch of robotics companies, like Toyota, are exploring. And you've probably seen the hamburger-flipping robots. There were some pizza-making robots for a while. So the kitchen and independent cooking application is getting really big. So I would start with those three in terms of consumer robots for eldercare. This is a little bit less of a consumer application, but one area I think is so cool is robots in space. There have been a lot of robots in space for a long time, like the Canadarm, that do all kinds of things like move around the exterior of the ISS. 
And now, Mark Cutkosky's group at Stanford is looking into using this kind of gecko adhesive to grab and pull space junk out of the atmosphere, which is so cool. And so this is less of a consumer application and more robots in space, and you could imagine, at some point in the future, if human beings colonized Mars, or they needed to live elsewhere, and you wanted a sort of hybrid cyborgian spacesuit that you could wear, there might be some components that are a little bit robotic, for example, an exoskeleton for you to be able to walk around an atmosphere that has a different gravity level. And that's not quite a consumer application, but I think robots in space are so cool and also a really challenging problem, because as you said earlier, the physics are very different.
SPENCER: That's really cool. So let's talk a little bit about humans in the loop with robots. Because I think when people imagine a robot, they're usually thinking about it acting totally autonomously. Do you want to talk a little bit about what it means to have a human in the loop?
CATIE: One good example is these Starship robots, which are also very popular. So these delivery robots are all around on different college campuses. (I think it is Starship, which is UC Berkeley.) And they go to a restaurant, someone at the restaurant scans a QR code, puts the food inside, the lid of the robot closes and seals shut, and it drives somewhere else. And then a human being scans a QR code, and the person who intended to pick up the food opens the lid of the robot and pulls the food out of it. So I think for a long time, people thought those robots had a map of the world, and they were just moving around. They were going from Chipotle to someone's house. But there are actually a lot of people involved. If that robot gets stuck, if that robot tries to turn a corner and it doesn't understand where it's going, it pings, or it provides part of its state so as to say, "Oh, I'm confused. I'm lost." And a human being who's not located where the robot is, a remote operator, can take over the robot. And so then they become a “human in the loop,” because now there's a human who, whether it's through a VR interface or a simulation platform, or a joystick, or any other (choose your poison) robot operation software, can take over the robot temporarily or permanently, and take it where it needs to go. So that would be an example of a human in the loop. What I also just talked about with the surgical robots, those are purely teleoperated. Actually, that might not be accurate; there are some parts of what the robots are doing that are automated. And then a lot of what the robot is doing is also being directed by a surgeon. So you have a surgeon in the loop, a human in the loop, who's directing that robot through space. And maybe sometimes the robot touches an organ or gets to a location where it doesn't want to go, and it will provide that feedback to the human being to say, "Oh, based on my camera, I don't think we should move in this direction." 
And then there's a little bit of negotiation or dance between the human and the robot, which is always won by the human because that's how most of this programming and robot teleoperation software is written. But those are two examples: the delivery robot and the medical robot.
SPENCER: It reminds me of this trend in machine learning more generally, where I think for a long time, people thought, “Oh, we're just going to completely replace humans with the machine learning algorithm.” But then what a lot of people realize is that for more sophisticated tasks, you often need to constantly QA the results, and say, "Well, we're really confident about this result, so we can just let the machine learning algorithm just do its thing." Whereas, "We're not that confident about this result, so we need a human to look at it and decide whether it's right or not." And then that answer from the human then goes back into training the machine learning algorithm to make it better. So an example of this might be if you want a machine-learning algorithm to play the role of an assistant that helps you schedule your meetings. So in some cases, it may understand the email right away about, "Oh, I want a meeting on Tuesday at 3 pm." etc. But other times it might get confused. And then it needs to push it up to a human to decide what was intended when the meeting is supposed to be scheduled, and then that becomes training data for the algorithm.
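The confidence-based handoff Spencer describes can be sketched as follows. This is a minimal illustration, not any real scheduling product: `toy_model`, `ask_human`, `handle_request`, and the 0.9 threshold are all hypothetical stand-ins, and the "model" is a hard-coded stub rather than a trained algorithm.

```python
from typing import Callable, List, Tuple

Prediction = Tuple[str, float]  # (proposed meeting time, confidence from 0 to 1)


def toy_model(email: str) -> Prediction:
    """Stand-in for a trained model: confident only on one clear phrasing."""
    if "Tuesday at 3 pm" in email:
        return ("Tuesday 15:00", 0.97)
    return ("unknown", 0.40)


def handle_request(email: str,
                   model: Callable[[str], Prediction],
                   ask_human: Callable[[str, str], str],
                   training_data: List[Tuple[str, str]],
                   threshold: float = 0.9) -> str:
    """Let the model answer when it is confident; otherwise escalate to a
    person and keep their answer as a new labelled training example."""
    guess, confidence = model(email)
    if confidence >= threshold:
        return guess
    corrected = ask_human(email, guess)
    training_data.append((email, corrected))
    return corrected


# Hypothetical human reviewer who resolves the ambiguous email.
def reviewer(email: str, guess: str) -> str:
    return "Thursday 10:00"


collected: List[Tuple[str, str]] = []
print(handle_request("I want a meeting on Tuesday at 3 pm.", toy_model, reviewer, collected))
print(handle_request("Sometime late next week works, mornings ideally.", toy_model, reviewer, collected))
print(collected)
```

The design choice worth noticing is that the human's answer is both returned to the user and stored, so every escalation also becomes training data for the next version of the model.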
CATIE: Absolutely. And people are doing this actually with artwork, too. So AI-generated artwork — Rebecca Fiebrink, who's a professor at Goldsmiths in the UK, got her Ph.D. at Princeton. She created something called the Wekinator, where you can take a machine learning model, you can use it to write music or make dance or whatever. But the model comes back to you after it's trained the first time. It says, "Oh, do you like this sonata that I just wrote or not?" And then you as the artist can tune and make changes. But again, it's like humans in the loop with an ML model, as you described, in order to tune and create the artistic output that they want. One idea that's kind of popular is, "Oh, my god, AI is generating artwork." It is a total fallacy for me because it's a human being that's deciding what training data goes into that model. How are we going to tune these hyperparameters to make the model sound, or do, or create, or generate whatever it is that I want? And then what am I going to do with that stuff once I have it? Am I going to put it up on my website? Am I going to take it to a gallery? There's clearly a human hand in all of that, but this idea that AI “makes artwork in a vacuum by itself,” for me, says that people don't really have a clear understanding of how human beings use these tools in order to make the work that they want. As a choreographer, I use music in my dances, and I make a dance. The music, in no way, is making the dance. Do you know what I mean? I am employing these tools to make the thing that I want. But because a lot of these technologies are super opaque, or for whatever reason have been so hyped in certain forums, the narrative winds up being, "Oh, the AI made the artwork," when that's just a completely wrong assessment of what exactly is happening in that relationship.
SPENCER: I largely agree with you right now, in the sense that there's still a huge amount of work that humans are doing too, besides choosing what the training data is and the other steps that you mentioned, but it seems like if we keep going in this direction, we're gonna get to a point where it's, "Oh, you want to make beautiful art, you just type in a few words for themes you want. And the AI generates 500 beautiful-looking pieces of art, and then you just pick your five favorites.” And it seems like that is a very different role than someone who imagines the image in their mind or finds it in the physical world, and then painstakingly paints it over a period of days or weeks, right? If it's something that anyone could do by typing a few words, and then you just pick your five favorites, that really does fundamentally change the role of the artist.
CATIE: Yeah, people can do that now, right? I can type a couple of words into Runway, or any other sort of ML generative thing, and I can get a whole poem or a whole play, and it's been pre-trained on GPT-3, or it's been pre-trained on whatever image StyleGAN is useful for. We can do that already. I still don't think that changes my point of view. It just so happens that now I'm collaborating with the person who trained GPT-3 or created GPT-3. I'm not like the servant or the subject of what GPT-3 is going to turn out. That's my take on it. Again, people will disagree. And I think what I should also provide as a counterargument to that is that we have brand new forms of art that are cropping up all the time. I very much come from the lens that dancing with robots and robot dance has been a genre for a very long time that was kind of funny, or seemed like a one-off, or seemed unusual. But it is a genre. It's been around since forever. Margo Apostolos, who was a Ph.D. student at Stanford in the 80s, was making dances with robots. And it's become popularized lately, for sure. I think she is a big part of that change, and lots of other people who I've worked with and collaborated with are a big part of that change. But 50, 60, or 100 years ago, I don't know if people would have said, "dancing with robots is a genre of art-making". So I think my counterpoint to that a little bit is that new types of art-making, new types of choreography, and artistic building are changing all the time. And AI, and all of these models like GPT-3, and any of these StyleGANs, they're taking information that we already have and creating something new with it, versus creating brand new forms and ways of making artistic products from scratch, which is not something that I think I've necessarily seen in the AI generative art space yet.
SPENCER: Yeah, it just seems to me that we're a little bit like in an era of chess machines before chess machines were as good as humans. Sure, people think they're pretty good, but they're not as good as the best people. And I feel like we're that way right now with, like, GPT-3 generating language, or using StyleGAN to generate art. It's pretty cool and sometimes on par with humans. Maybe occasionally GPT-3 can generate a haiku that's on par with humans, but it's not something where you can say, "Write a play about this," and it writes a play that's as good as a human play, right? And I just wonder what happens if we cross that threshold where actually GPT-5, or whatever the version is, can write a play on par with the best human playwrights, or better than the best human playwrights. Does that fundamentally change the nature of art in a way that it doesn't when GPT-3 is only 80% as good?
CATIE: I think a little bit of what I'm hearing is about being able to compare, and this is what people want to send us about with these Boston Dynamics videos. They were like, "Oh, my god, the robot's doing the mashed potato, and the robot's doing the mashed potato better than I can." And it becomes this human versus robot. Can we compare the goodness of this thing, the goodness of this play, the goodness of this article from ESPN.com that was generated by GPT-3, the goodness of this, whatever. And I am so not interested in that. What I am much more interested in is human beings creating brand new art forms that are something that people do, that's not something that AIs do, right? Human beings created virtual reality. We're now going to employ AI in virtual reality in order to come up with AI-generated games and AI-generated VR environments, but the nexus of that idea, to create a VR headset experience, blah, blah, blah, that is not an AI-generated thing. And we're coming up with NFTs and all of these other things that are brand new human creations. That's because the current AI models, especially the supervised learning models that we have, are all based on historical data. AI can't create VR from scratch if it doesn't know that VR exists. So it's a slightly subtle point. It's a little bit different than the one that you're describing; it's more about being able to create different forms, literal new artistic mediums that didn't exist.
SPENCER: Awesome, Catie. This was a really fun conversation. Thanks so much for coming on.
CATIE: Thank you, Spencer.