open the door

Calamity and I started a discussion about giving InMoov a bit of autonomy.

I found the Shiffman YouTube series on capturing Kinect data very interesting.

Based on the Markus video where his InMoov tried to hit a light switch, I came up with the idea of having a movable InMoov try to open a door (European door handle, not a turning knob).

To make the task a bit easier I thought we could help with one or more optical tags on the handle to enable us to calculate pitch, yaw and roll of the handle.

It was once mentioned in a calibration thread that MRL already has a service for detecting optical tags. What I found on the net are ArUco markers, which (with the help of a library) should be identifiable with OpenCV.

I have the following setup in mind:

InMoov has a movable/rotatable base (as Markus has built). The program tries to find an optical tag in the Kinect image. If none is found, it starts to scan the room using base and/or mid-stomach rotation.

Once the tag is identified we should be able to get pitch, yaw and roll from the tag, and the distance from the depth data of the Kinect at that location (for the Kinect v1 some additional calculation might be needed to find the correct depth elements for the visual location). We could do this iteratively until the positioning is down to small movements.
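
As a rough sketch of the step from "tag location in the image plus depth value" to a 3D point, a simple pinhole camera model could be used. The image size and the 57 degree horizontal field of view below are nominal Kinect v1 figures taken as assumptions, not calibrated values. The function names are made up for illustration, and the pixel is assumed to be already expressed in depth image coordinates.

```python
import math

# Assumed nominal Kinect v1 values (not calibrated): 640x480 depth image,
# about 57 degrees horizontal field of view, square pixels.
WIDTH, HEIGHT = 640, 480
H_FOV_DEG = 57.0


def focal_length_px(width, fov_deg):
    """Focal length in pixels of a pinhole camera with the given field of view."""
    return (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def pixel_to_camera_xyz(u, v, depth_mm, width=WIDTH, height=HEIGHT,
                        fov_deg=H_FOV_DEG):
    """Back-project pixel (u, v) with a depth value (mm along the optical axis)
    into camera coordinates: x to the right, y up, z forward."""
    f = focal_length_px(width, fov_deg)
    cx, cy = width / 2.0, height / 2.0
    x = (u - cx) * depth_mm / f
    y = (cy - v) * depth_mm / f  # image rows grow downward, so flip the sign
    return x, y, depth_mm
```

A tag found at the image centre maps to a point straight ahead of the camera. A tag at the right image border maps to a point about 28.5 degrees to the right, which is half the assumed field of view.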

From these values we might be able to calculate a good 3D position for the right arm, which would allow the further movements of hand and arm to grab the handle, rotate it down and push/pull the door open.

For the moment that looks to me like enough challenges for a start.

1) What kind of tags could be used, and what is needed to find them in an image?

2) Assuming we can evaluate pitch, yaw, roll and distance of the handle in relation to the Kinect's position, how do we move the body in front of the handle at a distance of maybe 60 cm?

3) My Marvin is on non-motorised casters, but it could tell me, based on the calculations, where I have to move and rotate him.
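
For question 2, one possible way to get a standing position is plain 2D geometry on the floor plane. The sketch below assumes the handle position is known in the Kinect frame (x to the right, z forward, in mm), and that a tag yaw of 0 means the tag faces the camera straight on. The sign convention and the function name are assumptions for illustration only.

```python
import math


def standoff_pose(handle_x, handle_z, yaw_deg, standoff=600.0):
    """Return (target_x, target_z, heading_deg) in the current Kinect frame.

    target_x / target_z: floor point `standoff` mm in front of the handle,
    measured along the door normal.
    heading_deg: direction to face at that point, relative to the current
    forward (+z) axis, so that the robot looks straight at the handle.
    """
    yaw = math.radians(yaw_deg)
    # Unit normal of the door, pointing from the handle toward the robot side
    # (assumed convention: yaw 0 means the normal points back along -z).
    nx, nz = math.sin(yaw), -math.cos(yaw)
    target_x = handle_x + standoff * nx
    target_z = handle_z + standoff * nz
    heading_deg = math.degrees(math.atan2(-nx, -nz))
    return target_x, target_z, heading_deg
```

With the handle 1.5 m straight ahead and no yaw, this gives a point 0.9 m ahead and no turn. For a robot on casters, as in question 3, the same three numbers could simply be announced to the person pushing it.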



Mats's picture

Target finding

Hi Juerg

I find Adrian Rosebrock's series about image processing, which is now moving on to deep learning, very interesting.

Here is a video where he shows how to find targets in an image, and his mobile base is a bit more challenging than ours. And it is not just showing off. He shows the code and explains very well what it is doing. The only downside is that he uses Python and not Jython, which is what MRL uses, so some of the libraries he uses are not available in MRL. But for learning it's very good.


calamity's picture

Hi juerg, I have watched those videos and yes, they give me a lot of ideas of things to try. But I have a lot of catching up to do before I can do anything with image processing, whether with the Kinect or OpenCV. I'm sure there are things to do with that, though. Mats' link also looks very interesting.


With IK it should not be too hard to enter parameters so that it can translate to a point that fits an arm position and base movement to reach the target. It's just some maths to apply.

I did some reading about collision detection, so I have a basic idea of how to implement it, at least for the InMoov body parts. With the DH parameters and the IK service, we can know the position of each joint and the length of each part. By adding a radius for the size of each body part, I can easily get a 3D shape representing the body part, along with its angle. Then I will need a good algorithm to find out whether those shapes will cross during a movement. Nothing is implemented yet, just a basic idea of how to implement the collision detection.
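
One way the "joint positions plus a radius" idea could be sketched is to treat each body part as a capsule, that is, a line segment between two joints with a radius around it. Two parts then collide when the closest distance between their segments is smaller than the sum of the radii. This is only an illustration of the idea with the standard closest-point-between-segments test, not the code of the IK service, and all names are made up.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _clamp(v):
    return max(0.0, min(1.0, v))


def segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Closest distance between segments p1-q1 and p2-q2 (3D points)."""
    d1, d2, r = _sub(q1, p1), _sub(q2, p2), _sub(p1, p2)
    a, e, f = _dot(d1, d1), _dot(d2, d2), _dot(d2, r)
    if a <= eps and e <= eps:        # both segments are points
        s = t = 0.0
    elif a <= eps:                   # first segment is a point
        s, t = 0.0, _clamp(f / e)
    else:
        c = _dot(d1, r)
        if e <= eps:                 # second segment is a point
            s, t = _clamp(-c / a), 0.0
        else:
            b = _dot(d1, d2)
            denom = a * e - b * b    # zero when the segments are parallel
            s = _clamp((b * f - c * e) / denom) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:
                s, t = _clamp(-c / a), 0.0
            elif t > 1.0:
                s, t = _clamp((b - c) / a), 1.0
    c1 = tuple(p + s * d for p, d in zip(p1, d1))
    c2 = tuple(p + t * d for p, d in zip(p2, d2))
    diff = _sub(c1, c2)
    return math.sqrt(_dot(diff, diff))


def capsules_collide(seg_a, radius_a, seg_b, radius_b):
    """True if two capsules (segment plus radius) overlap."""
    return segment_distance(seg_a[0], seg_a[1], seg_b[0], seg_b[1]) < radius_a + radius_b
```

Checking a planned movement would then mean running this test for every pair of parts at a number of intermediate poses along the movement.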

But before I'm able to work on that, I need to configure each body part with DH parameters. So it is probably worth working on a calibration method at the same time. That's where I want to work first.

I should be able to set up some basic things about calibration and review them with you all.

I'm seeing the island in the distance, but there is a lot of fog and it still needs a lot of rowing to get there.



juerg's picture

You can find the DH parameters I used for "lookAtHand" here. The fingers are not included as I was unsure how to pop them onto the hand ;-)


calamity's picture

Thanks for sharing, it helps to have another set to compare.


I'm struck by how different our DH parameters (kwatters', yours and mine) look at first, but following the matrix transformations we all end up with very similar DH parameters. The main difference is where we apply the 'rest position' and the direction of the x axis.


juerg's picture

Not real progress, but the path I would like to take:

I made my Kinect work with my Windows PC (finally, after driver issues) and have a running image and depth display in Processing. It looks like the Kinect 1 expects at least 1 m of distance, which is a bit on the brink I guess.

Next will be to experiment with the ARToolKit library to identify a marker on the door handle and hopefully get its yaw, pitch and roll.

I will try to pass this to MRL through the REST API, but I am rather unsure yet how to do that.

Question: If I can send you distance, yaw, pitch and roll of the handle (in relation to the Kinect front) through this REST API (I currently have about 3 times per second in mind) will you be able to move the robot into a fitting opening position?
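
To make the question a bit more concrete, the message could look something like the sketch below: a small JSON payload plus a limiter that lets an update through about 3 times per second. The field names and the `Throttle` helper are hypothetical, and how the payload is actually delivered to MRL is deliberately left open, as the REST path is not known here. Python 3 is assumed.

```python
import json
import time


def make_payload(distance_mm, yaw_deg, pitch_deg, roll_deg):
    """Serialize one handle measurement (relative to the Kinect front) as JSON.
    The field names are hypothetical, chosen only for this illustration."""
    return json.dumps({
        "distance_mm": distance_mm,
        "yaw_deg": yaw_deg,
        "pitch_deg": pitch_deg,
        "roll_deg": roll_deg,
    }, sort_keys=True)


class Throttle:
    """Allows an action at most `rate_hz` times per second."""

    def __init__(self, rate_hz, clock=time.monotonic):
        self.min_interval = 1.0 / rate_hz
        self.clock = clock
        self.last = None

    def ready(self):
        now = self.clock()
        if self.last is None or now - self.last >= self.min_interval:
            self.last = now
            return True
        return False
```

In the capture loop one would call `ready()` on every frame and only build and send a payload when it returns `True`, so the receiver is not flooded at the full camera frame rate.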

calamity's picture

Hi juerg

The roll/pitch/yaw is not implemented yet in the IK service. Maybe I should work on that next, as it will be needed to interact with objects in its surroundings.

Last week I worked on a collision detection algorithm. Right now I can have it avoid its own body parts and objects set in its environment. It still needs testing to ensure it works like it should, but it looks very promising and I'm quite excited about it.

For example, I did this test:

Move from the rest position (fingertip at (280, 180, -330)) to a high position (200, 500, 1000). I use x as the left/right coordinate, y as front/back and z as up/down.

The IK service makes it reach the point mostly by moving the shoulder up and adjusting rotate or mtorso to reach the target position.

Now I add an object in front of it (from point (-1000, 300, 0) to (1000, 300, 0)) that blocks the shoulder from going up.

The IK service still finds a way to reach the point by turning mtorso more so the shoulder can clear the object, and adjusting the other parts to reach the target point.

What you want to do with the Kinect looks very promising too. If you can get a coordinate of the target object from the field of view of the Kinect, that coordinate can be converted to the space of the IK service, and then the robot will definitely be able to reach the object. But it remains to have it reach with the correct angle (roll/pitch/yaw).

That is so much fun, I feel like a kid on Christmas Eve :D

juerg's picture

Eh, nice to hear about your progress. Am I correct that you run the inverse Jacobian to do that and eliminate movements that would interfere with the object?

I thought a bit more about my door handle task and think I will first need a way to calibrate the Kinect image. I will first try to identify and locate a marker that is directly in front of the Kinect at the exact same height as the camera. With ustom and mstom centred I should be able to get the image position of the marker center and also the depth value. By changing ustom and mstom I should be able to compare the degree settings of the servos with the new marker positions in the image. As the weather now looks to be changing in Switzerland (the warm and sunny days made me spend my time mostly in my backyard), I should be able to spend more time on this and hopefully also have some success with it.
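
The comparison of servo degrees with marker positions could be reduced to a straight-line fit, giving a "pixels per degree" factor. The sketch below is a plain least-squares fit in pure Python. The sample numbers are invented purely to show the arithmetic and are not measurements, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pixel_to_servo_angle(pixel_x, slope, intercept):
    """Invert the fitted line: which servo angle puts the marker at pixel_x?"""
    return (pixel_x - intercept) / slope


# Invented example data: mstom angle in degrees and the marker centre's
# image column in pixels observed at that angle.
servo_deg = [80, 85, 90, 95, 100]
marker_px = [230, 275, 320, 365, 410]

px_per_deg, offset = fit_line(servo_deg, marker_px)
```

With these made-up samples the fit gives 9 pixels per degree. The same fit could be repeated for ustom against the marker's image row.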

I am still trying to work on my math skills, with slow but steady improvements. I still need better skills to be able to follow Oussama Khatib's lectures, available on YouTube and recommended by Kevin.

Addition: I just noted that you specified your coordinate usage: "I use x as left/right coordinate, y as front/back and z as up/down". When you say "left", is this left from the InMoov's point of view? And is left a negative value? And is front a positive or a negative value? Maybe we can visualize this in a drawing to make it clear?

calamity's picture

So far I have tested using a genetic algorithm approach, which means generating random solutions, finding the best solutions in the pool, and applying some modifications to the best solutions to try to improve the results.
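
To illustrate the loop described above (random pool, keep the best, modify them), here is a toy version for a flat two-joint arm. It is not the IK service code. The link lengths, pool size and mutation strength are arbitrary assumptions, and a real InMoov arm has more joints and works in 3D.

```python
import math
import random

L1, L2 = 300.0, 250.0  # hypothetical link lengths in mm


def forward(angles):
    """End point of a planar two-link arm for joint angles in radians."""
    a1, a2 = angles
    x = L1 * math.cos(a1) + L2 * math.cos(a1 + a2)
    y = L1 * math.sin(a1) + L2 * math.sin(a1 + a2)
    return x, y


def error(angles, target):
    """Distance between the arm's end point and the target."""
    x, y = forward(angles)
    return math.hypot(x - target[0], y - target[1])


def solve(target, pool_size=30, keep=5, generations=60, sigma=0.2, seed=1):
    """Return (best_angles, history), where history holds the best error
    seen at the start of each generation."""
    rng = random.Random(seed)
    pool = [(rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
            for _ in range(pool_size)]
    history = []
    for _ in range(generations):
        pool.sort(key=lambda a: error(a, target))
        best = pool[:keep]                      # keep the best solutions
        history.append(error(best[0], target))
        children = [tuple(g + rng.gauss(0.0, sigma) for g in rng.choice(best))
                    for _ in range(pool_size - keep)]  # modified copies
        pool = best + children
    pool.sort(key=lambda a: error(a, target))
    return pool[0], history
```

Because the best solutions are always carried over unchanged, the best error can never get worse from one generation to the next. A collision test could be added by giving colliding poses a large penalty in `error`.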

But I want the collision test to apply to the inverse Jacobian method as well, so both computing methods can be used.

I don't know if one computing method is better than the other; they both give pretty good results so far.

I'm glad to see that you have a plan for object identification and positioning. When everything works together we will have a very good AI for the robot's interaction with its environment.

I'm not good at drawing, but yes, I use the coordinates as you deduced:

x: negative -> toward the left side

x: positive -> toward the right side

y: positive -> toward the front

y: negative -> toward the back

z: positive -> up

z: negative -> down


with the origin (0,0,0) being the center point of mtorso.


It's not hard to change that definition if it makes it easier for you.

juerg's picture

slight change in procedure

After struggling to get raw depth values with the Kinect4WinSDK library I decided to use the size of the marker to estimate the distance. As I have a calibration step with known sizes and distances, this should be sufficient for my first tests.
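
The size-based estimate follows from the pinhole model: the apparent size of the marker in pixels is inversely proportional to its distance. A minimal sketch, assuming one calibration measurement and a marker that faces the camera roughly straight on (the function and parameter names are made up):

```python
def estimate_distance(size_px, ref_size_px, ref_distance_mm):
    """Estimate the marker distance from its apparent size in the image.

    ref_size_px / ref_distance_mm: marker size in pixels measured once at a
    known distance during calibration.
    """
    if size_px <= 0:
        raise ValueError("marker size must be positive")
    return ref_distance_mm * ref_size_px / size_px
```

If the marker measured 120 px at 1 m during calibration and now appears 60 px wide, the estimate is 2 m. A marker seen at an angle looks narrower, so the estimate is only meaningful once the camera is roughly perpendicular to the tag.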

I also haven't been successful so far in getting the orientation of the tag (a lot of the nyar4psg library is in Japanese, which leaves me helpless).

But I realized that as a first approach I will not need it, as I can simply decide from the location of the tag in the image whether the bot has to move (or has to be moved manually, in my case) to the left, right, forward or backward. Once perpendicular to the tag I should be able to estimate the distance from the shape size.
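
The decision from the tag location could be as simple as the sketch below: first centre the tag horizontally, then use its apparent size to decide between forward and backward. The image width, target size and tolerances are placeholder assumptions, and the returned strings are just what might be handed to the speech output.

```python
def movement_hint(tag_x, size_px, image_width=640, target_size_px=100,
                  tol_px=20, tol_size=10):
    """Return a spoken-style instruction from the tag's image position.

    tag_x: horizontal pixel position of the tag centre.
    size_px: apparent tag size in pixels (bigger means closer).
    """
    offset = tag_x - image_width / 2.0
    if offset < -tol_px:
        return "move left"
    if offset > tol_px:
        return "move right"
    if size_px < target_size_px - tol_size:
        return "move forward"
    if size_px > target_size_px + tol_size:
        return "move backward"
    return "stay"
```

Called once per processed frame, this gives one instruction at a time until the tag is centred and has the expected size.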

I intend to use Mats' example (thanks, btw) of sending commands to Acapela speech to command me for the movements (maybe I need a base like the one Markus has ...).

Once positioned properly I hope I can send something like "open door" to calamity's IK service, providing the distance and the rotation about the x-axis of the handle with respect to the Kinect camera.

So let me start working and, in case of success, create a short video about it.

calamity's picture

Hi juerg

I just realized that I used a different coordinate orientation than what kwatters has used.

What did you use as the base coordinates for your lookAtHand script?

It's just a matter of reference and it just changes the angles set in the DH parameters.

I feel like what I use is different from what was used previously, and I will probably switch to what Kevin used, if you have also used this.

kwatters seems to have used this:


   * A 6 dimensional vector representing the 6 degrees of freedom in space.
   * @param x
   *          - left / right axis
   * @param y
   *          - up / down axis
   * @param z
   *          - forward / backward axis
   * @param roll
   *          - rotation about the z axis
   * @param pitch
   *          - rotation about the x axis
   * @param yaw
   *          - rotation about the y axis

It's just a reference, but we should use the same reference.


juerg's picture

Hello calamity

Agree fully that we should use an "InMoov" definition of axes and orientation. As you mentioned before, it's not a problem to convert from one system to another, but it will make things much easier if we all talk in the same language!

As for the values to use, I went for up/down as Z for my lookAtHand, but Google shows that several combinations are in use. What seems to be common is that plus is normally used for going up, moving right and moving forward.

Maybe, as Kevin has been the initiator, we just go by his first example. We should however add the +/- definitions, in my opinion:

x: (-) left, (+) right
y: (-) down, (+) up
z: (-) backward, (+) forward

roll: rotation about z axis, (-) lean left, (+) lean right
pitch: rotation about x axis, (-) backward, (+) forward
yaw: rotation about y axis, (-) rotate left, (+) rotate right

and maybe fix the mstom rotation centre as the point of origin, assuming for the moment its roll, pitch and yaw to be 0?
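
Switching a point between the two conventions discussed in this thread (x right, y front, z up versus x right, y up, z forward) is just a swap of the last two values. A small sketch, with function names made up for illustration and assuming both systems share the same origin:

```python
def yfront_zup_to_yup_zforward(point):
    """Convert (x right, y front, z up) to (x right, y up, z forward)."""
    x, y, z = point
    return (x, z, y)


def yup_zforward_to_yfront_zup(point):
    """Convert (x right, y up, z forward) back to (x right, y front, z up).
    The mapping is its own inverse, since it only swaps two axes."""
    x, y, z = point
    return (x, z, y)


# The test target from earlier in the thread, (200, 500, 1000) in the
# y-front/z-up convention, expressed in the y-up/z-forward convention.
target = yfront_zup_to_yup_zforward((200, 500, 1000))
```

The target becomes (200, 1000, 500). One thing to keep in mind is that swapping two axes changes the handedness of the coordinate system, so the signs of roll, pitch and yaw need to be checked separately and cannot simply be carried over.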


calamity's picture

OK, let's go with that definition.

I think Kevin ended up with z as backward/forward because he used the omoplate as the first DH parameter.

Yesterday I worked on adding roll/pitch/yaw to the IK service. I still need to find a way to include those data in the IK computation.

Progressing :)