I’ve been asked multiple times now how to sync animations and speech on a NAO – or Pepper for that matter; especially from Python.
The answer to that is, there are two options:
The first one is to create the animation in Choreograph and then export it to a python script. You then create your usual handle to the text-to-speech module, but instead of calling the say method directly, e.g., `tts.say(“Hello”)`, you call it through the module’s `post` method, e.g., tts.post.say(“Hello”). This method exists for every function in the API and essentially just makes a non-blocking call. You can then call your animation.
You create a custom animation in Choreograph, upload it to the robot, and call it through AnimatedSay or QiChat. Other than being the, I think, cleaner solution, it allows you more fine grained control over when in the sentence the animation starts and when it should stop. This is what I will describe in more detail below.
Step 1: Create the Animation
Fairly straight forward, and the same for both solutions. You use Choreograph to create a new Timeline box in which you create the animation that you would like. You then connect the timeline box to the input and output of the behavior and make sure it works as you’d expect when you press the green play button.
Step 2: Configure the Project and Upload it to the Robot
In this step, you configure the new animation to be deployed as an app on the robot.
Go to the properties of the project.
Then make sure to select a minimum naoqi version (for NAO 2.1, for Pepper 2.5), the supported models (usually any model of either NAO or Pepper respectively) and set the ID of the Application. We will use this when calling the animations, so choose something snappy, yet memorable. Finally, it is always nice to add a small Description.
Next, we need to reorganize the app a bit. Create a new folder and name it after your animation; again, we will use this name to call our behavior, so make sure it’s descriptive. Then move the behavior that contains your animation – by default called behavior1.xar – into the folder you just created, and rename it to behavior.xar .
Finally, connect to your robot and use the first button in the bottom right corner to upload the app you just created to your robot.
Step 3: Use ALAnimatedSpeech from Python
Note:If you don’t want NAO to use the random gestures it typically uses when speaking in animated speech, consider setting the BodyLanguageMode to disabled. You can still play animations, but it won’t automatically start any.
For existing animations – that come with the robot by default – you call the animation like this
"Hello! ^start(animations/Stand/Gestures/Hey_1) Nice to meet you!"
Now, animations is nothing but an app that is installed on the robot. You can even see listed it in the bottom right corner of Choreograph. Inside the app, there are folders for the different stable poses of NAO like Stand, or Sit, which are again divided into types of animations, e.g., Gestures which you can see above. Inside these folders there is, yet another, folder named after the animation (Hey_1), inside of which is a behavior file called behavior.xar.
We have essentially recreated this structure in our own app and installed it right next to the animations app. So, we can call our own animations using the exact same logic:
"Hello! ^start(pacakge_name/animation_name) Nice to meet you!"
It also works with all the other aspects of the ALAnimatedSpeech module, so ^stop, ^wait, ^run, will work just as fine. You can also assign tags to your animations and then make it choose random animations for that tag group.
Our lab owns robots build by SoftBank that we use for experiments; we have a Pepper and some NAOs. At the moment, I’m working on a NAO.
They are quite pretty robots. I mean, they can barely walk around, let alone navigate the environment, they can’t do proper grasping, the build-in CPU is so slow and hogged by the default modules, and streaming video from the robot for remote processing happens at about 5 FPS. So you can’t really do any of the things you would expect you can do, but hey, they look really cool 😀
Okay, jokes aside, the manufacturing of the robots is actually pretty solid. Being able to get your hands on a biped for about 6000€ is solid, and, despite some stability issues, it can walk – however, nobody really uses that feature in social robotics research. They also come with a huge sensor array, that makes every smartphone jealous. Hardware wise both, NAO and Pepper, are good robots.
The thing that is lacking – by a landslide – is the software. The robots come with an API, but that API is proprietary – in itself, not a problem. The problem starts where the documentation ends. Documentation is shaky, disorganized, not very clear, and – for all the cool parts – nonexistent. In short, you don’t get to read the code and you don’t get good documentation to help you either; hence, if something breaks, you are blind and deaf somewhere in the forest of code and have to find the way out yourself.
Pepper can grasp, it can do navigation, and you can stream video data at a decent FPS – the same is true for NAO; it can do all the things I just complained about. You just have to write the code yourself.
This is what I will talk about in this post. I will not go into grasping or walking, but we will look into navigating the joint space more efficiently. That is, we will have a more in-depth look at ALRobotPosture, some of the hidden / undocumented functions, and how we can use this module for some pretty sick motion planning.
Note: Everything in this post works for both NAO and Pepper. For ease of reading, I will only reference the NAO, because – I think – that is the robot most people reading this will own.
ALRobotPosture, an Overview
If you own either a NAO or a Pepper, you have probably noticed that, when you turn (and autonomous life activates) it on, it moves into a certain pose. For Pepper, it is always the same, for the NAO, it depends if it was turned on sitting or standing. This is RobotPosture in action. The same is true after we play an animation. Once it finishes, NAO moves back into a specific pose, waiting for the next command.
This is the most visible action of the module. When no other movement task is running, it will move the NAO into a stable position. The other thing it does, is it transitions between these stable poses. For example, when you want NAO to either sit down or stand up, then it doesn’t play an animation. It actually uses RobotPosture to navigate the joint space from one stable posture to another until it reaches the Sit or Stand posture respectively.
In essence, RobotPosture is a list of configurations – points in joint space – that serve as stable positions the robot can move into. These points are connected; there is a neighbor relationship between them. They are also attractive; hence, when no other motion is running, NAO will move into the closest posture (closeness being defined as closeness in joint space).
The interesting part is that movement between poses is not done as a direct line in joint space. This could be rather dangerous, since the robot would just fall, if it would move in a straight line from the Stand to Sit. Instead, planning is done in the topological map – a directed graph -, that is defined by the poses and their neighbors. NAO then moves in a (joint space) direct line from the current pose to a neighboring pose and goes through different poses until it reaches the final, desired pose.
I visualized the standard poses in Figure 1. Additionally, there is the USRLookAtTower pose, which is a custom posture I’ve added for a project I’m working on. You can also see it in the picture I chose for the beginning of the post. It looks a lot like normal sitting, but the head is tilted downwards. I will walk you through how I did that in the next section. I also color coded the sitting and standing postures, because they are the most used – but mainly because it looks nice 🙂 .
The graph is laid out using force-directed graph drawing, where the force corresponds to the euclidean distance between nodes in joint space. However, I took a bit of liberty to prevent label overlap. As you can see, there is no direct connection between Sit and Stand; the robot would move through unstable territory. (We could, however, add such trajectories ourselves, creating a fast, dynamic stand up motion – e.g., for robot football.)
Another advantage of this approach is that it is very computationally efficient. Since we have an abstract map of how poses are connected, we can quickly figure out if a pose is reachable, and compute a path to that given pose.
Enough theory, show some code already! Okay … okay. Here is how to use the basics of the module:
The snippet will make the robot run through all the available poses and announce the pose’s name once there. This is about the best you can do with the official part of ALRobotPosture; not that much.
There is a lot more functionality in the module. There just isn’t any documentation of it on the web. We can look at all the methods in a module via:
Alternatively we can use qicli (with the parameter –hidden) to list all the functions in a similar fashion. Qicli is documented here.
Here we can find a few very promising functions:
_isRobotInPosture(string, float, float)
This function is similar to getRobotPosture(). However, instead of giving the current posture, it gives a boolean that is true if the robot is in the given posture. The two floats are threshold values for the joint angles and stiffness. That is, by how much is the current pose allowed to deviate from the defined pose for us to consider them the same.
It returns a triple of (bool, [bool] * 26, [bool] * 2) on a NAO robot. The first boolean tells us if the pose has been reached overall, the second is a breakdown if the pose has been reached for each joint. Finally, the last array is the same for stiffness.
This function is useful if two poses are close together. In this case getRobotPosture() may not show the correct pose; however, we can still differentiate with _isRobotInPosture().
Make your own network of postures, export it, use this to upload it to an army of NAOs, and dominate the world.
Given a serialized graph of poses, it will load it and replace the current posture graph. The string is the (relative) path to the file. It returns a boolean indicating if the loading has succeeded.
Important: The file path is relative to ~/.local/share/naoqi/robot_posture on the robot, so the posture file has to be stored in that directory on the robot.
This is a strange one. While not immediately useful to us, it will re-generate a Cartesian map that the module uses internally to navigate between poses. You have to call this after loading a new posture library or adding individual postures. Otherwise the new postures won’t work!
Pretty self explanatory. Look up the id of the posture using it’s name. Takes the name of the posture and returns an integer that is the id.
Takes a posture id and returns a boolean. True if the posture is reachable from the current robot pose.
The string is the name that we want to save the file as and it will be saved under ~/.local/share/naoqi/robot_posture on the robot.
_addNeighbourToPosture(int, int, float)
Adds a vertex to the graph pointing from the first posture (indexed by the first int) to the second posture. The third value is the cost of traversing along this edge, which can be used for more sophisticated path planning.
Saves the current pose as an edge with ID int and name string.
Putting all these together, we can create custom poses as follows:
Use the Animation Mode (or any other method) to move NAO into the desired pose
_saveCurrentPostureWithName() to add the node to the graph
_addNeighbourToPosture() to connect it to the graph (edges are directed! we have to add both ways)
export the postures via _savePostureLibrary() (this will save the file in the correct place)
In our code: import our custom poses using _loadPostureLibraryFromName()
re-generate the cartesian map _generateCartesianMap()
Here is a code snippet that adds a custom posture called “myPosture”, exports the library, imports it, makes the robot sit down, and then go into “myPosture”.
And just for good measure, a video showing what the robot does when running the snippet:
Naturally, you can be more fancy with this. I am particularly excited about the possibility to do dynamic movements, i.e., one-way trajectories. However, my supervisor will probably kill me if I actually dabble in this area, because the chances of breaking a NAO like this are … elevated.
I hope this article is useful. If you liked it, feel free to leave a like, comment, or follow this blog! I will keep posting tutorials in the area of robotics, AI, and social robotics research.
I’ve added the poster that we published in HAI 2017 as a project to this website. Unfortunately, it doesn’t appear on the front page, since only blog posts are shown. In time I might have to look for a different template.
In the meantime this will serve as an (internal) cross-post so that the project is easier to find.