New Tracking Algorithms, TLD, MedianFlow, and More!

OpenCV 3.x has a new opencv_tracking module that is exposed via JavaCV.

This module contains 7 different implementations of object trackers. There is a new "Tracker" filter that lets you switch between any of these 7 tracking algorithms. It should work much like the LK tracking does, in that it publishes the point being tracked. Right now the filter only supports tracking a single point.

I found this page that has a good writeup of the pros and cons of each of the tracking algorithms.


One other difference is that these tracking algorithms track bounding boxes, not single points, so I think they will be very useful for following objects while generating training datasets for deep learning.


wvantoorn's picture


Hi Kevin,

These are great additions to our toolbox, and I see many useful attributes for tracking. Can we still add other filters like we used to? And one last question: how do these new filters/algorithms affect frame rate?

Is there a possibility of adding an option to easily change the resolution of the feed? For tracking we don't need much video resolution, if I have been told correctly. Also, would it be possible to turn off the feed we see once tracking has started? I know it is a bit hard to explain, but if the program controls the tracking and it works, shutting off the feed to our screen could improve frame rate, or at least I think so. I could be completely wrong, but my thinking is that the less video needs to be encoded, the less CPU it uses and the more CPU is left over for tracking.

Maybe I am totally wrong, but I just read a piece about a face recognition sensor commonly used in advertising to collect data about gender, age and so on. It said that, when tracking, more fps can be achieved if no video is output to a screen.


The sensor I am referring to is the Omron electronics B5T HTC face detection sensor module

GroG's picture

Oh My, This Looks So Fun And Useful .. :)

Very Squirrel Worthy 


kwatters's picture

Training data generation

Yeah, it was a little side track based on XRobots' questions on tracking and inspired by a link that you shared. This is probably the first reference implementation of the Tracker API in JavaCV. I'm pretty happy with how simple and clean the interface is across all the different implementations.

I think some of the algorithms throw an exception because they can't find some resources, but I figure that's more of an issue with how JavaCV is packaged anyway, so I didn't bother digging further.

Having different trackers that operate on bounding boxes is nice. I'm hoping to move the training object around in front of the robot and capture good bounding box definitions along with the associated label that is being trained.

It would be fantastic if the robot automatically chose the proper bounding box size, but for now, I'll probably just say something like "look at this small object"  "look at this large object"  ... to denote a default box size to choose for this training object... 

We could target a final image resolution of 224x224 so it can be input directly to VGG16 without having to resize... ah, so much to do.
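As a rough sketch of that 224x224 idea: the tracked bounding box could be grown to a square, clamped to the frame, and scaled once at capture time so nothing needs resizing later. This uses only the standard java.awt classes; the class name TrainingCrop and its methods are hypothetical helpers made up for illustration, not part of JavaCV or the Tracker filter.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only; not part of JavaCV.
class TrainingCrop {
    static final int TARGET = 224; // VGG16 input edge length in pixels

    // Grow the tracked box to a square around its center and keep it inside the frame.
    static Rectangle squareAround(Rectangle box, int frameW, int frameH) {
        // The square side is the longer box edge, capped by the shorter frame edge.
        int side = Math.min(Math.max(box.width, box.height), Math.min(frameW, frameH));
        int cx = box.x + box.width / 2;
        int cy = box.y + box.height / 2;
        // Shift the square back into the frame if it would hang over an edge.
        int x = Math.max(0, Math.min(cx - side / 2, frameW - side));
        int y = Math.max(0, Math.min(cy - side / 2, frameH - side));
        return new Rectangle(x, y, side, side);
    }

    // Crop the square region around the box and scale it to TARGET x TARGET.
    static BufferedImage cropTo224(BufferedImage frame, Rectangle box) {
        Rectangle sq = squareAround(box, frame.getWidth(), frame.getHeight());
        BufferedImage out = new BufferedImage(TARGET, TARGET, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        // Source rectangle is the square crop; destination is the full output image.
        g.drawImage(frame, 0, 0, TARGET, TARGET,
                sq.x, sq.y, sq.x + sq.width, sq.y + sq.height, null);
        g.dispose();
        return out;
    }
}
```

Cropping a square first keeps the object's aspect ratio intact; scaling the raw box straight to 224x224 would stretch tall or wide objects.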



miketalmage's picture


This is what I am currently working on for my InMoov, but I still have a lot of stuff to do. It uses histograms from different detections and then chooses object blob or movement, but with every workaround I run into 6 more problems, mainly in trying to convert C++ and C# to Java.


kwatters's picture

Tracking is hard, don't re-invent the wheel.

I applaud your efforts to implement your own optical tracking. The reality is that many researchers have worked up these algorithms and contributed them to OpenCV. That's all written in C/C++. We use JavaCV, which under the covers is actually calling the OpenCV C libraries via JNI.

There are a bunch of tracking algorithms and options. Some of these algorithms give you a point to track, others provide a bounding box. As the object moves, the bounding box size scales appropriately, so you can detect whether the object is closer or farther away based on box size.
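To make the box-size idea concrete, here is a minimal sketch in plain Java that compares the current box against a reference box. The class BoxRange, its method names, and the tolerance parameter are all assumptions made up for this example, not anything from OpenCV or JavaCV. The distance estimate assumes a simple pinhole camera and an object that keeps facing the camera, so apparent width and height shrink in proportion to distance.

```java
// Hypothetical helper for illustration only; names and tolerance are assumptions.
class BoxRange {
    enum Motion { CLOSER, FARTHER, STEADY }

    // Ratio of current box area to reference box area; above 1 means the box grew.
    static double areaRatio(int refW, int refH, int curW, int curH) {
        return ((double) curW * curH) / ((double) refW * refH);
    }

    // Rough distance relative to the reference distance (1.0 = unchanged).
    // Area scales with 1/distance^2 under the pinhole assumption, hence the square root.
    static double relativeDistance(int refW, int refH, int curW, int curH) {
        return Math.sqrt(1.0 / areaRatio(refW, refH, curW, curH));
    }

    // Classify the change, ignoring area changes smaller than the given tolerance.
    static Motion classify(int refW, int refH, int curW, int curH, double tolerance) {
        double r = areaRatio(refW, refH, curW, curH);
        if (r > 1.0 + tolerance) return Motion.CLOSER;
        if (r < 1.0 - tolerance) return Motion.FARTHER;
        return Motion.STEADY;
    }
}
```

For example, a box that goes from 100x100 to 200x200 has 4 times the area, which this sketch reads as the object being at roughly half the reference distance. A tolerance helps because tracker boxes tend to jitter a little from frame to frame.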

I guess I'm curious, what are you porting over to Java? We might be able to provide examples. Are you currently working with OpenCV, or are you rolling your own stuff from the ground up?