Rachel found the beer!

Playing with the YOLO filter!



The ultimate goal is to use IK (inverse kinematics) for that; for now I have scripted 3 possible positions in space (left / center / right).

The YoloClassification publisher gives each object's coordinates in the frame, so we can deduce the position of one object relative to another.
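Here is a minimal sketch of that idea, not the actual script. It assumes each detection gives a bounding box as a left edge and a width in pixels; the function names, the (x, width) tuple shape and the equal-thirds split of the frame are all assumptions for illustration, and the real YoloClassification fields may differ.

```python
def zone_for_box(x, width, frame_width):
    """Bucket a bounding box into 'left', 'center' or 'right'.

    x is the left edge of the box and width its width, both in pixels;
    frame_width is the width of the camera frame. Hypothetical helper.
    """
    if frame_width <= 0:
        raise ValueError("frame_width must be positive")
    center = x + width / 2.0          # horizontal center of the box
    third = frame_width / 3.0         # assumed split: three equal zones
    if center < third:
        return "left"
    if center < 2 * third:
        return "center"
    return "right"


def relative_position(box_a, box_b):
    """Compare two (x, width) boxes by their horizontal centers."""
    center_a = box_a[0] + box_a[1] / 2.0
    center_b = box_b[0] + box_b[1] / 2.0
    if center_a < center_b:
        return "left of"
    if center_a > center_b:
        return "right of"
    return "aligned with"
```

Left and right here are in frame coordinates; whether that matches the robot's own left and right depends on how the camera is mounted.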

I also filtered out some objects, because I don't want the robot to count the table as an object.
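The filtering can be sketched like this. It is only an illustration: the (label, confidence) tuple shape, the `IGNORED_LABELS` set and the label strings in it are assumptions, and the exact label names depend on the class list of the YOLO model in use.

```python
# Hypothetical ignore list; real label strings depend on the model's classes.
IGNORED_LABELS = {"diningtable", "chair"}


def keep_detections(detections, ignored=IGNORED_LABELS, min_confidence=0.0):
    """Drop detections whose label is ignored or whose confidence is too low.

    detections is a list of (label, confidence) tuples, confidence in 0..1.
    The original order of the remaining detections is preserved.
    """
    return [
        (label, confidence)
        for label, confidence in detections
        if label not in ignored and confidence >= min_confidence
    ]
```

For example, `keep_detections([("bottle", 0.9), ("diningtable", 0.8)])` keeps only the bottle.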



Code :


vocal commands :


Ash's picture

Hi Moz4r,

Fun and awesome !! ;-)

But, you know, a beer is also an object that makes it possible to communicate... 

See you

hairygael's picture

Great video!

I can only guess you are using the latest Nixie because of the YOLO filter, which means you are able to run OpenCV without crashing. That's very good news!!

I also really like how you filtered out the unnecessary objects!

It's time to test your script modifications on my InMoov.



moz4r's picture

It's not 100% stable yet; for the demo I had to run the filter permanently, which is bad for the CPU. We only need a 4 or 5 second inventory.

Hey Ash, you're right :)

martoys1's picture

thumbs up

Great work... impressive!

kwatters's picture

you did it!

Wow, great job moz4r!  That is a fantastic video.  It shows off so many different technologies working together in such a seamless way.

We can add a throttle to the yolo filter so it only attempts 1 FPS or something like that.. 

It'd be cool to have the yolo filter classify one frame, then add a tracker object to track the classified object for you.

So, YOLO would only be run once at the beginning to detect the objects and start tracking them...  hmm..  many possible optimizations...
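The throttle idea could look roughly like this. It is only a sketch in plain Python, not the OpenCV filter code: `ThrottledDetector`, the `detect` callable and the injectable `clock` are hypothetical names, and the one-second default simply mirrors the "1 FPS" suggestion above.

```python
import time


class ThrottledDetector:
    """Run an expensive detector at most once per min_interval seconds.

    Between runs, the last result is returned unchanged, so callers can
    feed every frame without paying the detection cost each time.
    """

    def __init__(self, detect, min_interval=1.0, clock=time.monotonic):
        self._detect = detect              # hypothetical callable: frame -> result
        self._min_interval = min_interval  # seconds between real detections
        self._clock = clock                # injectable for testing
        self._last_run = None
        self._last_result = None

    def process(self, frame):
        now = self._clock()
        due = self._last_run is None or now - self._last_run >= self._min_interval
        if due:
            self._last_result = self._detect(frame)
            self._last_run = now
        return self._last_result
```

The same wrapper with a very large `min_interval` gives the "detect once, then reuse" behaviour; handing the result to a tracker would be a separate step.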

hairygael's picture

Hello again,

So I am testing it right now, with inmoov-develop + Nixie 958 on a mini PC with a simple webcam. I am not using it on the complete InMoov yet, though I am using Virtual InMoov.

It seems the code requires the ultrasonic sensor to be connected, because the AIML is set to say something about wrong distances, but I see no implementation of the ultrasonic sensor in the scripts.

YOLO is seeing the phone, the cup and the beer, but I can't seem to filter other objects because of the ultrasonic barrier. The OpenCV window over-expands and freezes after launching "what do you see".

But if I reset the camera via the command "camera on", YOLO is active and the OpenCV window works again.

I guess it needs a bit more study of the script to understand what does what, and then it should work.



I tried changing the bottle and the cup, but it didn't help.


hairygael's picture

The error I get

Here is the error I get; it seems the OOB process is not finding its path.

WHAT DO YOU SEE <THAT> * <TOPIC> * _inmoovGestures.aiml
[INFO] Set predicate startupSentence to This takes a few seconds to process. in WHAT DO YOU SEE <THAT> * <TOPIC> *
[INFO] Set predicate yoloReturn to none

moz4r's picture


Hi Gael!

The OOB processing did not return an error but "none": YOLO didn't detect any object, so ProgramAB said "I saw nothing", I think.
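To illustrate what I mean, here is a rough sketch of that logic. It is not the real script or AIML: `describe_detections` is a hypothetical helper, and treating the string "none" as "no object" is an assumption based on the yoloReturn predicate in the log above.

```python
def describe_detections(labels):
    """Build the sentence the robot would say from a list of labels.

    Empty entries and the placeholder "none" are treated as no detection.
    """
    seen = [label for label in labels if label and label != "none"]
    if not seen:
        return "I saw nothing"
    if len(seen) == 1:
        return "I saw a " + seen[0]
    # Join all but the last label with commas, then add the last with "and".
    return "I saw a " + ", a ".join(seen[:-1]) + " and a " + seen[-1]
```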

I just modified yolo.py, just for the demo, so that we can run "what do you see" multiple times now, because there is still a bug that crashes OpenCV or the system with a memory overflow if we close the YOLO filter after execution.
The CPU will run hot :) but the test can be executed multiple times.

Also, you are still affected by other OpenCV bugs that should be resolved soon; @Grog is working on the OpenCV side.

About the ultrasonic sensor, we need to implement it.

Updated inmoov to download the latest develop branch.

"what do you see" should work..


hairygael's picture

Hello moz4r!

Version 958 finally sees the objects without the OOB processing returning "none".

But I do not understand precisely why, nothing changed in the view frame.

One thing that could be the reason: the head moves while speaking, and it might be taking the single frame at that moment, which slightly changes the view. Maybe we should pause the head while it is taking the frame.

I noticed that at first it would see the cell phone, but now it just doesn't want to see it, even if I restart MRL. Do you think it's because it has been classified as an object not to be seen? Is there anywhere I can de-classify some objects?

I have added some extras to the gestures to make them a bit more natural, and I also made the robot return to "relax" after showing a detected object.

Thanks for this super great feature, I love it!

GroG's picture

Ahahahaha :D

Awesome video moz4r !  :D

Not just "beer" .. "Great Beer !" ;)

I refactored a whole lot of OpenCV including its gui (code part - not so much ui part) ... but the adding and removing of filters is much more stable.

I'll include addition and removal of yolo filter (several times) in the new unit test.

I showed a colleague your video, he was very impressed - I can't wait for delivery ! :D

Precious ;)

At the moment I'm still working on a training filter which should allow for semi-supervised continuous training of faces .. (possibly other things too)

hairygael's picture

Testing further,

Some unstable results appear; sometimes the robot keeps not seeing things.

- Although YOLO detects objects on the table with more than 70% confidence.

- And sometimes YOLO doesn't even launch, though nothing has changed in the configuration (table distance, head position, object setup).

It seems to me the YOLO filter is less accurate at recognizing some objects than when I was using it with version 1.0.2693, when it was not an OpenCV filter. That's only an impression, based on the recognition of my iPhone.

My phone was always recognised when using the 1.0.2693 version, but now with the OpenCV filter it gets recognised as a cell phone about 20% of the time, though I present the phone to the camera the same way. Any idea what could be the reason?

Maybe I need to get myself a new iPhone.


EDIT: I just noticed that OpenCV gives me 1000 fps when objects are not seen...

moz4r's picture

Wahoo I saw 1000 fps !!!!

Don't change your phone now :) You can retry things once OpenCV has been polished.

If you manage to grab the object for a demo it would be great!

We need IK and the 3D coordinates of objects for a perfect result, maybe with OpenNI point cloud correspondence, but we can emulate it by scripting for now. FUN !