Franka Emika Panda in Gazebo (without ROS using Docker)

Screenshot of the panda arm with and without gripper attachment.

The background for this post is that I am currently visiting ISIR, and I’ve started a new project working with the panda robot by Franka Emika. Unfortunately, we only have a single physical robot, and we are 2 PhDs using it for different experiments. To minimize downtime, I am, hence, setting up a simulator for it, and, to facilitate reusability, I decided to go with a Docker based setup.

The image is available on GitHub and called panda_sim. You can clone it, build it, and see how it works for you. While I tried to keep ROS out of this, the current model still loads the gazebo ros plugin for control. If I get around to it, I will remove this dependency in a future version, to get it completely ROS independent.

I am currently working on a ROS based image that runs a Gazebo server and a ROS controller for the robot with gripper. This way, you can easily spin up a simulator for your robot behavior. You can even do that in parallel for some epic deep reinforcement learning action.

Preparing the Robot Model

To use Panda in the simulator, we need to convert the existing model into .sdf format. While Gazebo can work with .urdf files, this requires a parallel ROS installation, which we try to avoid. Internally, the .urdf is converted to .sdf anyway, so we might as well supply .sdf and save the dependency.

Screenshot of the location of the .xacro files on GitHub.

First, we need to get the model from franka_ros which is located in the `franka_description` folder. It is a .urdf model, so in order to use it in Gazebo, we need to add some additional information such as joint inertia, or that the arm should be attached rigidly to the world frame. Erdal Pekel also has a tutorial how to bring Panda into Gazebo (using ROS). I used his numbers and suggestions to modify the files.

Next, as we are ripping out the model from an existing ROS package, we will also need to update the paths in the .urdf. In particular, I removed `$(find franka_description)/robots/` in both the `panda_arm.urdf.xacro` and the `panda_arm_hand.urdf.xacro`, and changed the `robot_name` to panda. I also changed the `description_pkg` value to `panda_arm_hand` or `panda_arm` in the `panda_arm.xacro`, depending on the model; this name of the package needs to match the name of the model folder (see below). In `hand.xacro` the value for description_pkg is hardcoded, so I introduced the description_pkg variable, and set it appropriately.

Converting .xacro to .sdf

After doing all the necessary modifications, we need to convert the .xacro to a .urdf. Docker can again be incredibly helpful, as we can spin up a throw away ROS instance, do our conversion, and save the result on the host:

docker run -v <path/to/modified>/franka_description:/xacro osrf/ros:kinetic-desktop-full rosrun xacro xacro --inorder /xacro/robots/panda_arm_hand.urdf.xacro > panda_arm_hand.urdf

Next, we want to convert the model to a .sdf. This is again solved in an easy one liner using docker:

docker run -v <path/to/generated/urdf>:/urdf gazebo:latest gz sdf -p /urdf/panda.urdf > model.sdf

Assembling the Model

Finally, all that is left is to assemble the pieces into a full Gazebo model of Panda. For this we create a new folder called `panda` and copy the meshes folder and the model.sdf in there. We then create a `models.config` to describe the model to Pandas as follows

<?xml version="1.0"?>
<model>
<name>Panda Robot</name>
<version>1.0</version>
<sdf version="1.6">model.sdf</sdf>
<author>
<name>YOUR NAME</name>
<email>YOUR EMAIL</email>
</author>
<description>
A sdf model of the Franka Emika Panda robot adapted from an existing urdf model.
This model is intended to be used in Gazebo.
</description>
</model>
view raw model.config hosted with ❤ by GitHub
The model.config file

Here is the folder structure:

panda_arm_hand
|__meshes
|  |__collision
|  |  |__files from franka_description
|  |__visual
|     |__files from franka_description
|__model.config
|__model.sdf 

Copying this folder into ~/.gazebo/models will make it available to Gazebo. To pack it into a docker image, I wrote a small script in `build.sh` that will construct the above model and then build a docker image with the models already installed. The script will create a tmp folder where it stores the fully constructed models, so you can also run the script and get just the models, if that’s what you need.

Thank you for reading, and happy coding! If you liked this article, and would like to hear more ROS, Gazebo, or Panda related stuff, consider giving this post a like, or leaving me a comment 🙂

Advertisement

How to play custom animations during speech on a NAO (or Pepper)

I’ve been asked multiple times now how to sync animations and speech on a NAO – or Pepper for that matter; especially from Python.

The answer to that is, there are two options:

  1. The first one is to create the animation in Choreograph and then export it to a python script. You then create your usual handle to the text-to-speech module, but instead of calling the say method directly, e.g., `tts.say(“Hello”)`, you call it through the module’s `post` method, e.g., tts.post.say(“Hello”). This method exists for every function in the API and essentially just makes a non-blocking call. You can then call your animation.
  2. You create a custom animation in Choreograph, upload it to the robot, and call it through AnimatedSay or QiChat. Other than being the, I think, cleaner solution, it allows you more fine grained control over when in the sentence the animation starts and when it should stop. This is what I will describe in more detail below.

Step 1: Create the Animation

timeline

Fairly straight forward, and the same for both solutions. You use Choreograph to create a new Timeline box in which you create the animation that you would like. You then connect the timeline box to the input and output of the behavior and make sure it works as you’d expect when you press the green play button.

Step 2: Configure the Project and Upload it to the Robot

In this step, you configure the new animation to be deployed as an app on the robot.

properties-choreography

Go to the properties of the project.

properties-configured

Then make sure to select a minimum naoqi version (for NAO 2.1, for Pepper 2.5), the supported models (usually any model of either NAO or Pepper respectively) and set the ID of the Application. We will use this when calling the animations, so choose something snappy, yet memorable. Finally, it is always nice to add a small Description.

project_structure

Next, we need to reorganize the app a bit. Create a new folder and name it after your animation; again, we will use this name to call our behavior, so make sure it’s descriptive. Then move the behavior that contains your animation – by default called behavior1.xar – into the folder you just created, and rename it to behavior.xar .

buttons_upload

Finally, connect to your robot and use the first button in the bottom right corner to upload the app you just created to your robot.

Step 3: Use ALAnimatedSpeech from Python

Note: If you don’t want NAO to use the random gestures it typically uses when speaking in animated speech, consider setting the BodyLanguageMode to disabled. You can still play animations, but it won’t automatically start any.

For existing animations – that come with the robot by default – you call the animation like this

"Hello! ^start(animations/Stand/Gestures/Hey_1) Nice to meet you!"

Now, animations is nothing but an app that is installed on the robot. You can even see listed it in the bottom right corner of Choreograph. Inside the app, there are folders for the different stable poses of NAO like Stand, or Sit, which are again divided into types of animations, e.g., Gestures which you can see above. Inside these folders there is, yet another, folder named after the animation (Hey_1), inside of which is a behavior file called behavior.xar.

We have essentially recreated this structure in our own app and installed it right next to the animations app. So, we can call our own animations using the exact same logic:

"Hello! ^start(pacakge_name/animation_name) Nice to meet you!"

It also works with all the other aspects of the ALAnimatedSpeech module, so ^stop, ^wait, ^run, will work just as fine. You can also assign tags to your animations and then make it choose random animations for that tag group.

Finally, please be aware that the robot will return to it’s last specified pose after finishing an animation. Hence, if you want the robot to wait in a different position after the animation finished, you will have to do that by creating a custom posture. I have some comments on that here: The hidden potential of NAO and Pepper – Custom Robot Postures in naoqi v2.4

I hope this will be useful to some of you. Please feel free to like, share, or leave a comment below.

Happy coding!

The hidden potential of NAO and Pepper – Custom Robot Postures in naoqi v2.4

Introduction

Our lab owns robots build by SoftBank that we use for experiments; we have a Pepper and some NAOs. At the moment, I’m working on a NAO.

naw_low_quality
NAO looking at a Tower of Hanoi

They are quite pretty robots. I mean, they can barely walk around, let alone navigate the environment, they can’t do proper grasping, the build-in CPU is so slow and hogged by the default modules, and streaming video from the robot for remote processing happens at about 5 FPS. So you can’t really do any of the things you would expect you can do, but hey, they look really cool 😀

Okay, jokes aside, the manufacturing of the robots is actually pretty solid. Being able to get your hands on a biped for about 6000€ is solid, and, despite some stability issues, it can walk – however, nobody really uses that feature in social robotics research. They also come with a huge sensor array, that makes every smartphone jealous. Hardware wise both, NAO and Pepper, are good robots.

The thing that is lacking – by a landslide – is the software. The robots come with an API, but that API is proprietary – in itself, not a problem. The problem starts where the documentation ends. Documentation is shaky, disorganized, not very clear, and – for all the cool parts – nonexistent. In short, you don’t get to read the code and you don’t get good documentation to help you either; hence, if something breaks, you are blind and deaf somewhere in the forest of code and have to find the way out yourself.

Pepper can grasp, it can do navigation, and you can stream video data at a decent FPS – the same is true for NAO; it can do all the things I just complained about. You just have to write the code yourself.

This is what I will talk about in this post. I will not go into grasping or walking, but we will look into navigating the joint space more efficiently. That is, we will have a more in-depth look at ALRobotPosture, some of the hidden / undocumented functions, and how we can use this module for some pretty sick motion planning.

Note: Everything in this post works for both NAO and Pepper. For ease of reading, I will only reference the NAO, because – I think – that is the robot most people reading this will own.

ALRobotPosture, an Overview

If you own either a NAO or a Pepper, you have probably noticed that, when you turn (and autonomous life activates) it on, it moves into a certain pose. For Pepper, it is always the same, for the NAO, it depends if it was turned on sitting or standing. This is RobotPosture in action. The same is true after we play an animation. Once it finishes, NAO moves back into a specific pose, waiting for the next command.

This is the most visible action of the module. When no other movement task is running, it will move the NAO into a stable position. The other thing it does, is it transitions between these stable poses. For example, when you want NAO to either sit down or stand up, then it doesn’t play an animation. It actually uses RobotPosture to navigate the joint space from one stable posture to another until it reaches the Sit or Stand posture respectively.

In essence, RobotPosture is a list of configurations – points in joint space – that serve as stable positions the robot can move into. These points are connected; there is a neighbor relationship between them. They are also attractive; hence, when no other motion is running, NAO will move into the closest posture (closeness being defined as closeness in joint space).

The interesting part is that movement between poses is not done as a direct line in joint space. This could be rather dangerous, since the robot would just fall, if it would move in a straight line from the Stand to Sit. Instead, planning is done in the topological map – a directed graph -, that is defined by the poses and their neighbors. NAO then moves in a (joint space) direct line from the current pose to a neighboring pose and goes through different poses until it reaches the final, desired pose.

I visualized the standard poses in Figure 1. Additionally, there is the USRLookAtTower pose, which is a custom posture I’ve added for a project I’m working on. You can also see it in the picture I chose for the beginning of the post. It looks a lot like normal sitting, but the head is tilted downwards. I will walk you through how I did that in the next section. I also color coded the sitting and standing postures, because they are the most used – but mainly because it looks nice 🙂 .

PostureVisualization.png
Figure 1: Visualization of the available postures on a NAO v4. The distances between nodes roughly correspond to the euclidean distance in joint space. Big nodes are targets for ALRobotPosture.goToPosture(), small nodes are used for the transition. Blue nodes belong to the family Sitting and red nodes belong to the family Standing.

The graph is laid out using force-directed graph drawing, where the force corresponds to the euclidean distance between nodes in joint space. However, I took a bit of liberty to prevent label overlap. As you can see, there is no direct connection between Sit and Stand; the robot would move through unstable territory. (We could, however, add such trajectories ourselves, creating a fast, dynamic stand up motion – e.g., for robot football.)

Another advantage of this approach is that it is very computationally efficient. Since we have an abstract map of how poses are connected, we can quickly figure out if a pose is reachable, and compute a path to that given pose.

Enough theory, show some code already! Okay … okay. Here is how to use the basics of the module:


from naoqi import ALBroker
from naoqi import ALProxy
# start a local broker that connects to the NAO
robot_ip = "your-ip-here"
myBroker = ALBroker("myBroker", "0.0.0.0", 0, robot_ip, 9559)
# get a handle to the module
posture_proxy = ALProxy("ALRobotPosture")
tts_proxy = ALProxy("ALTextToSpeech")
announcement = "I am in posture {posture}. It is part of {posture_family}."
# list current postures
postures = posture_proxy.getPostureList()
print(postures)
# round-trip through all available postures
for posture in postures:
posture_proxy.goToPosture(posture, 1.0)
posture_family = posture_proxy.getPostureFamily()
# not needed, just for demonstration
posture_name = posture_proxy.getPosture()
tts_proxy.say(announcement.format(posture=posture_name,
posture_family=posture_family))

view raw

basics.py

hosted with ❤ by GitHub

The snippet will make the robot run through all the available poses and announce the pose’s name once there. This is about the best you can do with the official part of ALRobotPosture; not that much.

Hidden Features

There is a lot more functionality in the module. There just isn’t any documentation of it on the web. We can look at all the methods in a module via:


from naoqi import ALBroker
from naoqi import ALProxy
from pprint import pprint
# start a local broker that connects to the NAO
robot_ip = "your-ip-here"
myBroker = ALBroker("myBroker", "0.0.0.0", 0, robot_ip, 9559)
# get a handle to the module
posture_proxy = ALProxy("ALRobotPosture")
pprint(posture_proxy.getMethodList())

Alternatively we can use qicli (with the parameter –hidden) to list all the functions in a similar fashion. Qicli is documented here.

Here we can find a few very promising functions:

_isRobotInPosture(string, float, float)

This function is similar to getRobotPosture(). However, instead of giving the current posture, it gives a boolean that is true if the robot is in the given posture. The two floats are threshold values for the joint angles and stiffness. That is, by how much is the current pose allowed to deviate from the defined pose for us to consider them the same.

It returns a triple of (bool, [bool] * 26, [bool] * 2) on a NAO robot. The first boolean tells us if the pose has been reached overall, the second is a breakdown if the pose has been reached for each joint. Finally, the last array is the same for stiffness.

This function is useful if two poses are close together. In this case getRobotPosture() may not show the correct pose; however, we can still differentiate with _isRobotInPosture().

_loadPostureLibraryFromName(string)

Make your own network of postures, export it, use this to upload it to an army of NAOs, and dominate the world.

Given a serialized graph of poses, it will load it and replace the current posture graph. The string is the (relative) path to the file. It returns a boolean indicating if the loading has succeeded.

Important: The file path is relative to ~/.local/share/naoqi/robot_posture on the robot, so the posture file has to be stored in that directory on the robot.

_generateCartesianMap()

This is a strange one. While not immediately useful to us, it will re-generate a Cartesian map that the module uses internally to navigate between poses. You have to call this after loading a new posture library or adding individual postures. Otherwise the new postures won’t work!

_getIdFromName(string)

Pretty self explanatory. Look up the id of the posture using it’s name. Takes the name of the posture and returns an integer that is the id.

_isReachable(int)

Takes a posture id and returns a boolean. True if the posture is reachable from the current robot pose.

__savePostureLibrary(string)

The string is the name that we want to save the file as and it will be saved under ~/.local/share/naoqi/robot_posture on the robot.

_addNeighbourToPosture(int, int, float)

Adds a vertex to the graph pointing from the first posture (indexed by the first int) to the second posture. The third value is the cost of traversing along this edge, which can be used for more sophisticated path planning.

_saveCurrentPostureWithName(int, string)

Saves the current pose as an edge with ID int and name string.

 

Custom Poses

Putting all these together, we can create custom poses as follows:

  1. Use the Animation Mode (or any other method) to move NAO into the desired pose
  2. _saveCurrentPostureWithName() to add the node to the graph
  3. _addNeighbourToPosture() to connect it to the graph (edges are directed! we have to add both ways)
  4. export the postures via _savePostureLibrary() (this will save the file in the correct place)
  5. In our code: import our custom poses using _loadPostureLibraryFromName()
  6. re-generate the cartesian map _generateCartesianMap()
  7. goToPosture()

Here is a code snippet that adds a custom posture called “myPosture”, exports the library, imports it, makes the robot sit down, and then go into “myPosture”.


from naoqi import ALBroker
from naoqi import ALProxy
from pprint import pprint
# start a local broker that connects to the NAO
robot_ip = "love"
myBroker = ALBroker("myBroker", "0.0.0.0", 0, robot_ip, 9559)
# get a handle to the module
posture_proxy = ALProxy("ALRobotPosture")
file_name = "firefoxmetzger-awesome-pose.pose"
posture_proxy._saveCurrentPostureWithName(9942, "myPosture")
custom_posture_id = 9942
stand_posture_id = posture_proxy._getIdFromName("Stand")
posture_proxy._addNeighbourToPosture(stand_posture_id, custom_posture_id, 1)
posture_proxy._addNeighbourToPosture(custom_posture_id, stand_posture_id, 1)
posture_proxy._savePostureLibrary(file_name)
posture_proxy._loadPostureLibraryFromName(file_name)
posture_proxy._generateCartesianMap()
posture_proxy.goToPosture("Sit", 0.5)
posture_proxy.goToPosture("myPosture", 0.5)

And just for good measure, a video showing what the robot does when running the snippet:

Naturally, you can be more fancy with this. I am particularly excited about the possibility to do dynamic movements, i.e., one-way trajectories. However, my supervisor will probably kill me if I actually dabble in this area, because the chances of breaking a NAO like this are … elevated.

I hope this article is useful. If you liked it, feel free to leave a like, comment, or follow this blog! I will keep posting tutorials in the area of robotics, AI, and social robotics research.

Happy coding!

Questback + MTurk — No more survey codes with ExternalQuestions

Introduction

A few weeks ago, I ran my first pilot study on Amazon Mechanical Turk (MTurk).

Essentially, MTurk is a platform that you can use to label data and get participants for studies (using money). There is of course more that can be done here, but from my current understanding, these seem to be the two main uses in academia.

The completion of a survey usually takes the following form:

  1. A worker / participant on MTurk accepts the work and is presented a link to the survey.
  2. The worker copies his unique, anonymized workerID into the survey (so that you can reference the data)
  3. The worker completes the survey and is presented a unique random code at the end
  4. The worker copies the code into a form on MTurk’s website.
  5. You match the IDs of workers that completed the survey with IDs in your database and reward workers for their time

You used MTurk before? This sounds familiar? Yes – however, two parts in this chain are rather weak: (1) the manual copying of the workerID and (2) the generation, matching, and copying of the random code at the end. If either of these fails, you have to work out manually if a worker has participated or not.

Principle

As you have already guessed, there is a better way to do this (why else would I be writing about this? :D). MTurk offers to host a so called ExternalQuestion, which allows you to embed a custom website via an <iframe>. On top, it passes some meta information, such as the workerID, to the website; all you need to do is read that info and use the value as you see fit. Additionally, it allows the website to submit a form back to MTurk as proof of having completed the work, which we will do at the end of the survey.

In short, we integrate the survey directly into MTurk thereby getting rid of above pitfalls. It roughly follows these steps:

  1. Create the Survey on the survey tool (in this case Questback)
  2. Create a small adapter website on GitHub (may not be needed if you have a different survey tool)
    • this will accept and forward requests from Mturk
    • allow you to create an arbitrary preview of the survey
    • pipe the form back to MTurk when the survey is finished
  3. Add URL parameters to the survey
  4. forward people to the adapter website upon completion
  5. Host an external question pointing to the adapter website on GitHub

Small Adapter Website on GitHub

At first, I tried to link Questback and MTurk directly; then I discovered two limitations making this impossible: (1) Questback only accepts URL parameters named “a=..&b=..&c=..” instead of full variable names, and (2) Questback can not post the results of a form; I could only find forwarding via GET.

Hence, I set up a small website hosted on GitHub to do the plumbing between both websites.


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Redirecting to the Survey …</title>
<!–[if IE]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script&gt;
<![endif]–>
</head>
<body id="home">
<img src="PreviewQuestionnaire.png" />
<script>
let survey_url = 'https://www.unipark.de/uc/robot_capabilities/&#39;
let plain_url = 'https://firefoxmetzger.github.io/robot_capabilities_glue/&#39;
let mturk_url = 'https://www.mturk.com/mturk/externalSubmit&#39;
// let mturk_url = 'https://workersandbox.mturk.com/mturk/externalSubmit&#39;
if (RegExp("returnToMTurk").test(window.location.href)) {
console.log(window.location.href);
var match = RegExp("assignmentId=(.*?)&").exec(window.location.href);
var attention_response = RegExp("attentionCheckPassed=(.*?)&").exec(window.location.href);
var form = document.createElement('form');
document.body.appendChild(form);
form.method = 'post';
form.action = mturk_url + '?assignmentId=' + match[1] + '&attentionCheckPassed=' + attention_response[1];
console.log(form.action)
form.submit();
} else if (RegExp("ASSIGNMENT_ID_NOT_AVAILABLE").test(window.location.href)) {
// do nothing
} else {
let url = window.location.href;
url = url.replace('assignmentId','a');
url = url.replace('hitId','b');
url = url.replace('turkSubmitTo','c');
url = url.replace('workerId','d');
url = url.replace(plain_url,survey_url);
window.location.href = url;
}
</script>
</body>
</html>

view raw

index.html

hosted with ❤ by GitHub

Note: If you are working with the sandbox, you have to change the mturk_url appropriately.

This does 3 things, depending on how it is called:

  1. if the URL contains a parameter called “returnToMTurk” then we assume that the Questback is forwarding to this website via GET. In this case we take the payload (in this case attentionCheckPassed) and forward it to MTurk as a POST request.
  2. if the URL contains ASSIGNMENT_ID_NOT_AVAILABLE it means that the survey is being previewed. In this case we do nothing and show this website, which will act as the preview of the survey (e.g. display a screenshot of the hit or some other, relevant information to inform people what this task is about). Note that you want to avoid people submitting your survey at this stage, so showing them the raw survey may be counterproductive at this stage.
  3. Otherwise the website is called from MTurk by a worker who wants to complete the survey; in this case we rename the URL parameters from their actual names into a, b, c, d and forward the request to the Questback survey.

Add URL parameters to the survey

In Questback this can be done in the survey properties > User-defined variables.

Steps

As mentioned before, the URL parameters will have names a,b,c, … and can be accessed in the survey tool via #p_0001#, #p_0002#, #p_0003#, … .

Forward People to the GitHub Adapter Upon Completion

This is easily done in the properties section of the final page under Questionnaire editor > Final Page > Properties > Redirect to Survey .

URL forwarding questback

The address should point to the GitHub Adapter and the URL should include three parameters: (1) the assignmentID sent from MTurk, (2)-this is really important– at least one additional parameter to store in MTurk as the result of the task and (3) the “returnToMTurk” parameter used to tell the adapter what to do. An example of a URL could look like this

https://firefoxmetzger.github.io/robot_capabilities_glue/?assignmentId=#p_0001#&attentionCheckPassed=False&returnToMTurk=True

Note that the assignmentID is set to #p_0001# which is the the first parameter (“a“) passed to the survey from MTurk. In above example attentionCheckPassed is a variable from the survey which we used to determine if participants payed attention or just mindlessly filled out the survey. This is an aggregate of multiple questions and computed by the survey tool upon completion of the survey. This can later be used to automatically accept / reject / ban workers that have completed the assignment.

It is also important to note that MTurk expects the assignmentID and at least one additional parameter to be send through the form’s POST request. For some reason, the additional parameter is mentioned nowhere in the documentation, but, instead, tacitly assumed.

Additionally, the checkbox next to the phrase Automatically add ospe.php3 to URL  and Add return ticket have to be disabled. You would use these if you were forwarding / returning to another ESF survey; this isn’t the case here.

Host an External Question Pointing to the GitHub Adapter

All that is left is to actually host a task on MTurk. In this case an External Question.

Unfortunately, this is currently impossible to do through the web UI. Hence, we have to use the API; I decided to do it in Python by adapting a code snippet that I found on the web. It consists of two files: (1) config.py, which stores the credentials, and (2) create_hit.py which creates the actual hit.


AWS_ACCESS_KEY_ID = "YOUR KEY ID HERE"
AWS_SECRET_ACCESS_KEY = "YOUR SECRET KEY HERE"
USE_SANDBOX = False

view raw

config.py

hosted with ❤ by GitHub


import config
from boto.mturk.connection import MTurkConnection
from boto.mturk.question import ExternalQuestion
from boto.mturk.qualification import (Qualifications,
PercentAssignmentsApprovedRequirement,
NumberHitsApprovedRequirement, LocaleRequirement)
from boto.mturk.price import Price
import datetime
# ============================HELPER METHODS=======================
# Quick method to encode url parameters
def encode_get_parameters(baseurl, arg_dict):
queryString = baseurl + "?"
for indx, key in enumerate(arg_dict):
queryString += str(key) + "=" + str(arg_dict[key])
if indx < len(arg_dict)1:
queryString += "&"
return queryString
# ============================VARIABLES============================
# START AWS CONFIGURATION VARS
AWS_ACCESS_KEY_ID = config.AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY = config.AWS_SECRET_ACCESS_KEY
# END MAIN CONFIGURATION VARS
# START IMPORTANT HIT VARIABLES
sandbox = config.USE_SANDBOX
base_url = "https://firefoxmetzger.github.io/robot_capabilities_glue/&quot;
params_to_encode = {}
assignments_per_hit = 25
payment_per_assignment = 0.8
# END IMPORTANT HIT VARIABLES
# START QUALIFICATION CONFIGURATION
qualifications = Qualifications()
qual_1 = PercentAssignmentsApprovedRequirement(
comparator="GreaterThan",
integer_value="95")
qualifications.add(qual_1)
qual_2 = NumberHitsApprovedRequirement(
comparator="GreaterThan",
integer_value="100")
qualifications.add(qual_2)
qual_3 = LocaleRequirement(
comparator="In",
locale=["US", "GB", "CA", "AU", "NZ"])
qualifications.add(qual_3)
# END QUALIFICATION CONFIGURATION
# START DECORATIVE HIT VARIABLES
hit_title = "Evaluate Robot Capabilities! (6 minutes / $0.8 USD)"
hit_description = "Rate a robot on a number of capabilities in a short survey (approximately 6 minutes)"
hit_keywords = ["university study", "robot", "survey", "short", "quick"]
duration_in_seconds = 20*60
frame_height = 1200
# END DECORATIVE HIT VARIABLES
# =================================================================
# Initialize boto connection based on sandbox.
if sandbox:
AMAZON_HOST = "mechanicalturk.sandbox.amazonaws.com"
else:
AMAZON_HOST = "mechanicalturk.amazonaws.com"
connection = MTurkConnection(
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
host=AMAZON_HOST)
# Selecting which endpoint to pass as parameter
if sandbox:
external_submit_endpoint = "https://workersandbox.mturk.com/mturk/externalSubmit&quot;
else:
external_submit_endpoint = "https://www.mturk.com/mturk/externalSubmit&quot;
#params_to_encode['host'] = external_submit_endpoint
encoded_url = encode_get_parameters(base_url, params_to_encode)
create_hit_result = connection.create_hit(
title=hit_title,
description=hit_description,
keywords=hit_keywords,
duration=duration_in_seconds,
lifetime=datetime.timedelta(days=8),
max_assignments=assignments_per_hit,
question=ExternalQuestion(encoded_url, frame_height),
reward=Price(amount=payment_per_assignment),
# Determines information returned by certain API methods.
response_groups=('Minimal', 'HITDetail'),
qualifications=qualifications)

view raw

create_hit.py

hosted with ❤ by GitHub

That’s it. Running the code will create a HIT that will point to the website on GitHub, which itself points to our survey. The survey will point back to the GitHub website, which will point back to MTurk, going full circle. Neat!

As always, I hope this is useful to some of you and feel free to drop a comment or reach out to me if you have questions.

Self-Signed SSL Certificates with IP SAN v2.0 — This time in Docker

For the love of all things good; there is two things I don’t like: (1) unnecessary convoluted setups and (2) redoing work I’ve done earlier. SSL Certificates seem to combine both into one beautifully painful mess.

I once again found myself in the need to generate a throwaway SSL certificate for local development and testing. As I’ve posted earlier there is a way to do that with more or less effort. Since I found myself doing this for the third time now I decided to spice things up a bit, cutting down the time it takes. I’ve created a small docker container that’s sole purpose it is to create a certificate from a given config file. The way it works is:

  1. Create a folder (e.g. ./certs) and place the below config file (named config.cfg) into it.
  2. Modify the config file, adding all the SANs required for this certificate
  3. docker run –rm -it -v./certs:/certs firefoxmetzger/create_ssl

That’s it. It will drop a `private.key` and `certificate.crt` into the previously created folder. It will also print the properties of the certificate into console so you can make sure the SANs are actually added.

Here is the config.cfg template


[ req ]
default_bits = 2048
distinguished_name = req_distinguished_name
x509_extensions = v3_ca # The extentions to add to the self signed cert
string_mask = utf8only
[ req_distinguished_name ]
countryName = Country Name (2 letter code)
countryName_default = AU
countryName_min = 2
countryName_max = 2
stateOrProvinceName = State or Province Name (full name)
stateOrProvinceName_default = Some-State
localityName = Locality Name (eg, city)
0.organizationName = Organization Name (eg, company)
0.organizationName_default = Internet Widgits Pty Ltd
organizationalUnitName = Organizational Unit Name (eg, section)
commonName = Common Name (e.g. server FQDN or YOUR name)
commonName_max = 64
emailAddress = Email Address
emailAddress_max = 64
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
[ v3_ca ]
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid:always,issuer
basicConstraints = CA:true
subjectAltName=@alternate_names
[ alternate_names ]
IP.1 = 127.0.0.1

view raw

config.cfg

hosted with ❤ by GitHub

And here a link to the GitHub repo with the code of the docker image:

https://github.com/FirefoxMetzger/create_ssl

Google Code Jam 2018: Trouble Sort

Link to the exercise.

The second problem in this years Code Jam was Trouble Sort. This problem was substantially easier then the first one. At least for the small test.

The idea is that one performs bubble sort with three elements at a time and, if the last element in the triplet bigger then the first, one reverses the entire 3-element sub-list. Pseudo-code for this algorithm was given. The goal of the exercise was to, given a list, sort that list using above strategy and then asserting that it was ordered. If it was one should print “OK” otherwise one should print the index of the (0-indexed) first element of non-increasing order (e.g. [3,9,7] should return 1).


import sys
def trouble_sort(L):
# in place sorting
done = False
while not done:
done = True
for idx in range(len(L) 2):
if L[idx] > L[idx + 2]:
done = False
L[idx],L[idx+1], L[idx + 2] = L[idx + 2], L[idx+1], L[idx]
def assert_list(sorted):
# there is a bug, but I don't see it
idx = 0
while idx < len(sorted) 1:
if sorted[idx] <= sorted[idx+1]:
idx += 1
else:
return str(idx)
return "OK"
def fast_sort(unsorted):
list1 = unsorted[0::2]
list2 = unsorted[1::2]
list1.sort()
list2.sort()
# concatenate the list
unsorted[0::2] = list1
unsorted[1::2] = list2
def next_case(file):
test_cases = list()
num_cases = int(file.readline())
for case_idx in range(num_cases):
N = int(file.readline())
read_list = [int(elem) for elem in file.readline().split(" ")]
assert len(read_list) == N
test_cases.append((N,read_list))
return test_cases
if __name__ == "__main__":
test_cases = next_case(sys.stdin)
for idx, case in enumerate(test_cases):
N, test_list = case
fast_sort(test_list)
result = assert_list(test_list)
print("Case %d: %s" % (idx+1,result))

view raw

trouble_sort.py

hosted with ❤ by GitHub

The problem with passing the big test was that lists could have up to `10^8` many elements and above trouble sort implementation has `O(n^2)`. This means using the naive implementation will quickly run into the timeout.

The “trick” is to see that trouble sort is equivalent to splitting the unsorted list into elements with even and odd index, sorting them and then concatenating those sorted lists. Since this sort can use a reasonably fast algorithm (e.g. pythons native one) it can happen in O(n*log(n)) which will be significantly faster and work on the second, big set of tests. (I know this because I read it in the analysis).

The way I got that intuition was thinking about the fact that reversing a three element list is the same as swapping elements i-1 and i+1 . So we swap with a “distance” of two which made me wonder if it would be possible to swap the second and third element in the list in some way. The only swaps that involve the third element are when the “cursor” is at position two and four and that would swap with the first or fifth element respectively which are both uneven numbers. This trend continues for all numbers (even swap with even, odd with odd) and suddenly you begin to see a pattern.

Unfortunately I failed this challenge completely. My logic is correct, however if you look carefully at the last line, you will see that it outputs something like “Case 1:” however it should output “Case #1:”. I spent over 2 hours trying to find the mistake…

Sometimes it’s the small things that catch you. This probably won’t happen to me again … ever.

Google Code Jam 2018: Saving The Universe Again

During last week I’ve learned about a Code Jam hub in our University. The idea is to meet up and participate in the Google Code Jam 2018 . I ended up participating and advancing into the first round, so I thought I’d share my solutions here for anybody interested. (And for me for further reference)

Task 1: Saving The Universe Again (Link)

An alien robot is shooting a beam that will destroy all algorithms knowledge (not sure how an algorithm can have knowledge, but lets not get philosophical).

The beam starts with strength of 1 and follows a given sequence of actions (string with two literals “S” and “C”). Whenever it shoots “S” it deals damage equal to it’s current strength and whenever it charges “C” the strength is doubled. The sequence “SCSS” would thus deal 5 damage.

You have a shield that can absorb D damage (humanity needs the D!) and can swap any two adjacent literals in the string. What is the minimal number of swaps to reduce the damage dealt to a number <= D? (If impossible return “IMPOSSIBLE”)

You can understand this problem as a tree search (again! gives you a competitive advantage to know these things). The state is the current string and it’s children are all the new states that result from swapping in different places. For each such state you can calculate the damage. To find the closest node to the root node that has damage <= D a possibility is to use simple breadth first search.

Breaching from my usual style I will post the code first and then talk about it below. This is the pure, unmodified code I submitted during the hub, so it is intentionally left a bit messy. It’s still fairly readable though … It’s python and it’s short.


from collections import deque
import sys
def get_damage(sequence):
current_strength = 1
damage = 0
for char in sequence:
if char == "C":
current_strength *= 2
if char == "S":
damage += current_strength
return damage
def get_nodes(sequence):
list_sequence = list(sequence)
children = list()
for idx in range(len(list_sequence) 1):
new_node = list_sequence.copy()
new_node[idx], new_node[idx + 1] = new_node[idx + 1], new_node[idx]
children.append("".join(new_node))
return children
def get_moves(initial_sequence, shield):
q = deque()
# tupel in queue:
# (damage, node, num_swaps)
visited = dict()
current_best = (get_damage(initial_sequence),initial_sequence, 0)
q.appendleft(current_best)
while q:
damage, node, num_swaps = q.pop()
if damage <= shield:
return (damage, node, num_swaps)
if damage < current_best[0]:
current_best = (damage, node, num_swaps)
if node in visited:
continue
visited[node] = (damage, num_swaps)
for child in get_nodes(node):
dmg = get_damage(child)
q.appendleft((dmg, child, num_swaps + 1))
return (current_best[0],current_best[1],"IMPOSSIBLE")
# do pruning on minimum damage and max
def read_file(location):
with open(location, "r") as f:
return next_case(f)
def next_case(file):
test_cases = list()
num_cases = file.readline()
for line in file:
line = line[:1]
shield, sequence = line.split(" ")
test_cases.append((int(shield), sequence))
return test_cases
if __name__ == "__main__":
#test_cases = read_file("foo.txt")
test_cases = next_case(sys.stdin)
for idx, case in enumerate(test_cases):
shield, sequence = case
num_moves = get_moves(sequence, shield)
print("Case #%d: %s" % (idx+1, str(num_moves[2])))

view raw

shield.py

hosted with ❤ by GitHub

Note: This will work for “small” sequences and is too inefficient for large ones. The branching factor of the tree is len(sequence)-1 which can be up to 30. I timed out on the second (big) test, because I didn’t optimize this part.

Pure breadth first search on a tree with branching factor >10 is generally not a good idea. As such I’ve implemented the search using a queue rather then choosing a recursive approach. This allows to better scale the code as ideas for run time optimization come in.

By optimization I mean pruning of nodes that are clearly disadvantageous. One example is in line 32 where I keep a dictionary of nodes that have been visited. If I encounter the same state again, I already know a shorter path to it; thus I can safely ignore it. To be honest, I should have done the weeding out directly in get_nodes to reduce constant overhead.

Another optimization, which I didn’t implement, would be to see that this is a convex problem. There is exactly one clear minimum (S[…]SC[…]C) and one clear maximum (C[…]CS[…]S) along the border of the problem and there are no local optima in between. Thus, we can ignore all swaps that increase the damage or keep it constant, i.e. “SS”, “SC”, “CC”, and focus on those that swap “CS” into “SC”. This will severely decrease the branching factor. (Note: We already ignore “SS” and “CC” as part of above optimization)

Another consequence of this is that we can swap the queue type from a FIFO (i.e. breadth first) into a priority queue, sorting by current damage dealt choosing the node with lowest damage as the next node. This is essentially Dijkstra’s algorithm (as we don’t use a heuristic, otherwise it would be A*). It keeps amazing me again and again how easily you can swap between different tree searches by simply changing the queue type.

Applying those three optimizations should cut down search time by enough to cope with the big test. It also gets very close to the other solutions I’ve seen; that is “pass over the string back to front and switch all “CS” into “SC”. Repeat until no change was made for an entire sweep of the string and then output the total number of swaps. My approach is simply more formal and can be applied to other problems more easily.

Layered Layers: Residual Blocks in the Sequential Keras API

I’ve been looking at the AlphaGo:Zero network architecture [1] and was searching for existing implementations. I’ve found quite a few (here , here and here) with varying degrees of completeness. The cleanest is probably this one but it depends on Jupyter.

What surprised me was that I couldn’t find one that used Keras’ sequential API. While residual blocks aren’t exactly sequential, from a high level view the architecture itself is; it simply stacks (a lot of) residual blocks. So it should be possible to create something like this, right?

The answer is, of course: Yes, there isn’t much that you can’t do in Python. We are actually using this strategy already. Sequential itself inherits from Layer and, in fact, Container (a class sitting between Sequential and Layer in the inheritance hierarchy) states so itself: A Container is a directed acyclic graph of layers. It is the topological form of a “model”. A Model is simply a Container with added training routines. (source)

It works by defining the residual block as a new Keras layer. Depending on how tightly integrated you want it this can be quite short:


from keras.engine.topology import Layer
from keras.layers importActivation, Conv2D, Add
class Residual(Layer):
def __init__(self, channels_in,kernel,**kwargs):
super(Residual, self).__init__(**kwargs)
self.channels_in = channels_in
self.kernel = kernel
def call(self, x):
# the residual block using Keras functional API
first_layer = Activation("linear", trainable=False)(x)
x = Conv2D( self.channels_in,
self.kernel,
padding="same")(first_layer)
x = Activation("relu")(x)
x = Conv2D( self.channels_in,
self.kernel,
padding="same")(x)
residual = Add()([x, first_layer])
x = Activation("relu")(residual)
return x
def compute_output_shape(self, input_shape):
return input_shape

Inside the block we fall back to the functional way of stacking layers. If you want better integration, e.g. model.summary() showing the number of trainable weights, there is additional plumbing. Above just shows the gist . . . (gosh! That pun was bad).

Once that is written, we can use model.add( Residual(32, (3,3) )) as we would any other layer. Nice!

To close with an example, I modified the Keras CNN example on CIFAR10 and replaced the hidden convolutional layers with residual ones. I haven’t optimized performance, but you can see how it works. If you are familiar with the example, you might appreciate how similar it looks.


'''Train a simple residual network on the CIFAR10 small images dataset.
It gets to 75% validation accuracy in 25 epochs, and 79% after 50 epochs.
(it's still underfitting at that point, though).
'''
from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, Add
import os
from keras.engine.topology import Layer
# Define the residual block as a new layer
class Residual(Layer):
def __init__(self, channels_in,kernel,**kwargs):
super(Residual, self).__init__(**kwargs)
self.channels_in = channels_in
self.kernel = kernel
def call(self, x):
# the residual block using Keras functional API
first_layer = Activation("linear", trainable=False)(x)
x = Conv2D( self.channels_in,
self.kernel,
padding="same")(first_layer)
x = Activation("relu")(x)
x = Conv2D( self.channels_in,
self.kernel,
padding="same")(x)
residual = Add()([x, first_layer])
x = Activation("relu")(residual)
return x
def compute_output_shape(self, input_shape):
return input_shape
batch_size = 32
num_classes = 10
epochs = 100
data_augmentation = True
num_predictions = 20
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_cifar10_trained_model.h5'
# The data, split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same',
input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Residual(32,(3,3)))
model.add(Residual(32,(3,3)))
model.add(Residual(32,(3,3)))
model.add(Residual(32,(3,3)))
model.add(Residual(32,(3,3)))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
# initiate RMSprop optimizer
opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)
# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
if not data_augmentation:
print('Not using data augmentation.')
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
validation_data=(x_test, y_test),
shuffle=True)
else:
print('Using real-time data augmentation.')
# This will do preprocessing and realtime data augmentation:
datagen = ImageDataGenerator(
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
rotation_range=0, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.1, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.1, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
# Compute quantities required for feature-wise normalization
# (std, mean, and principal components if ZCA whitening is applied).
datagen.fit(x_train)
# Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(x_train, y_train,
batch_size=batch_size),
epochs=epochs,
validation_data=(x_test, y_test),
workers=4)
# Save model and weights
if not os.path.isdir(save_dir):
os.makedirs(save_dir)
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s ' % model_path)
# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

view raw

cifar10_res.py

hosted with ❤ by GitHub

References

[1] Silver, David, et al. “Mastering the game of go without human knowledge.” Nature 550.7676 (2017): 354.