Calamity and I started a discussion about giving InMoov a bit of autonomy.
I found Shiffman's YouTube series on Kinect data capture very interesting.
Based on Markus's video where his InMoov tried to hit a light switch, I came up with the idea of having a movable InMoov try to open a door (European lever handle, not a turning knob).
To make the task a bit easier, I thought we could help by placing one or more optical tags on the handle, enabling us to calculate its pitch, yaw, and roll.
It was once mentioned in a calibration thread that MRL already has a service for detecting optical tags. What I found on the net are ArUco markers, which (with the help of a library) should be identifiable by OpenCV.
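As a rough idea of what the detection could look like, here is a minimal Python sketch. The dictionary choice and the file name are placeholders, and it assumes the opencv-contrib build with the pre-4.7 aruco API:

```python
import cv2

# ArUco lives in the opencv-contrib build; the dictionary choice is arbitrary here
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
parameters = cv2.aruco.DetectorParameters_create()

frame = cv2.imread("kinect_rgb_frame.png")   # placeholder for a live Kinect frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

corners, ids, rejected = cv2.aruco.detectMarkers(gray, dictionary, parameters=parameters)
if ids is not None:
    print("found tags:", ids.flatten())
```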
I have the following setup in mind:
InMoov has a movable/rotatable base (as Markus has built). The program tries to find an optical tag in the Kinect image. If none is found, it starts to scan the room using base and/or mid-stomach rotation.
Once the tag is identified, we should be able to get pitch, yaw, and roll from the tag, and the distance from the Kinect's depth data at that location (for the Kinect v1, additional calculation might be needed to map the RGB pixel position to the correct depth elements). We could do this iteratively until the positioning comes down to small movements.
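A hedged sketch of that step, continuing from the detection above: camera_matrix and dist_coeffs would have to come from a calibration of the Kinect RGB camera, marker_length is the printed tag size, and the Euler convention is just one common choice, so all of these are assumptions:

```python
import math
import cv2
import numpy as np

def tag_pose(corners, camera_matrix, dist_coeffs, marker_length=0.05):
    """Turn one detected tag into pitch/yaw/roll (degrees) and distance (metres)."""
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, marker_length, camera_matrix, dist_coeffs)
    rvec, tvec = rvecs[0][0], tvecs[0][0]
    R, _ = cv2.Rodrigues(rvec)                 # rotation vector -> 3x3 matrix

    # one common Euler convention; adjust to whatever the arm code expects
    pitch = math.degrees(math.atan2(R[2, 1], R[2, 2]))
    yaw   = math.degrees(math.atan2(-R[2, 0], math.hypot(R[2, 1], R[2, 2])))
    roll  = math.degrees(math.atan2(R[1, 0], R[0, 0]))

    # straight-line distance from the camera; the Kinect depth value at the
    # tag's pixel position can serve as a cross-check
    distance = float(np.linalg.norm(tvec))
    return pitch, yaw, roll, distance, tvec, R
```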
From these values we might be able to calculate a good 3D position for the right arm, which would allow the further movements of hand and arm to grab the handle, rotate it down, and push/pull the door open.
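A minimal sketch of the step in between, assuming a fixed, measured transform between the Kinect and the robot's body origin (T_body_cam below is made up): with it, the handle position from the pose step can be expressed in body coordinates, which is what the arm positioning would work from.

```python
import numpy as np

T_body_cam = np.eye(4)                    # camera pose in the body frame (measured!)
T_body_cam[:3, 3] = [0.0, 0.45, 0.10]     # e.g. 45 cm up, 10 cm forward of body origin

tvec = np.array([0.3, -0.1, 1.8])         # example handle position in camera frame [m]
handle_body = T_body_cam @ np.append(tvec, 1.0)
print("handle target in body frame [m]:", handle_body[:3])
```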
For the moment, that looks to me like enough challenges for a start:
1) What kind of tags could be used, and what is needed to find them in an image?
2) Assuming we can evaluate pitch, yaw, roll, and distance of the handle relative to the Kinect's position, how do we move the body in front of the handle, at a distance of maybe 60 cm? (one possible approach is sketched after this list)
3) My Marvin is on non-motorised wheels, but based on the calculations he could tell me where I have to move and rotate him.
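For question 2, one possible approach, sketched under assumptions (tag z-axis pointing out of the marker along the handle's normal; ground motion described as turn / roll / turn; 0.6 m stand-off). For a non-motorised Marvin, the printed values would just tell me where to push him:

```python
import math
import numpy as np

def approach_command(tvec, R, standoff=0.6):
    """tvec: handle position in camera frame [m]; R: tag rotation matrix."""
    normal = R[:, 2]                          # outward normal of the tag
    target = tvec + standoff * normal         # point 60 cm in front of the handle
    turn  = math.degrees(math.atan2(target[0], target[2]))   # rotate toward it
    drive = math.hypot(target[0], target[2])                  # distance to roll
    # final rotation so the body faces the handle again
    face = math.degrees(math.atan2(tvec[0] - target[0],
                                   tvec[2] - target[2])) - turn
    return turn, drive, face

# example: tag 1.8 m ahead, facing the camera (its z-axis flipped back toward us)
R_facing = np.diag([1.0, -1.0, -1.0])
turn, drive, face = approach_command(np.array([0.3, -0.1, 1.8]), R_facing)
print(f"turn {turn:.1f} deg, roll {drive:.2f} m, then turn {face:.1f} deg")
```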