AdaBoost Runtime Reference Manual
AdaBoost Runtime is a C/C++ library for real-time 2D object detection using WaldBoost-trained image classifiers with Haar features. The library is distributed with source code, together with a gcc makefile and an MS Visual Studio project to simplify its use. This manual explains which functions and structures the library provides and how to use them in your code.
The library provides the structure TRect to represent the position and size of a particular image subwindow, and the structure TSize to represent the general size of an object (image size, window size, ...). The TRect structure is also used to represent detections in an image.
struct TRect
{
int x, y, w, h;
};
struct TSize
{
int w, h;
};
struct TPoint
{
int x, y;
};
Since this runtime library should depend on other libraries as little as possible (especially for image processing), it includes several classes that provide image representation and pixel access. The most general class is TImage, which is a template with one parameter specifying the pixel data type (for instance, TImage<float> for float images). There are predefined basic image types: TImageUChar (8-bit unsigned char), TImageUInt (32-bit unsigned integer) and TImageFloat (float pixels).
Constructors and methods of TImage are:
TImage::TImage();
Creates an uninitialized image.
TImage::TImage(TSize size);
Creates a new image of the specified size.
// 100x100 8-bit image
TImageUChar image(TSize(100, 100));
TImage::TImage(const TImage & other);
Copy constructor: makes a deep copy of the other image. Even if the other image is a reference image, a new image is allocated and the data are copied.
TImage::TImage(_T * data, TSize size, int yoffset);
Creates a reference image over preallocated data. This can be useful for accessing images created by another image library (such as OpenCV or DigILib). yoffset is the address offset between two image rows, in bytes.
// Static float array
float data[100*100];
// New TImage instance referencing the static data
TImageFloat image(data, TSize(100, 100), 100*sizeof(float));
TImage reference(TImage & other);
Returns a reference image of the other image.
TImage subimage(TRect & area);
Returns a reference image of the subwindow defined by area.
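To make the role of yoffset and of reference images more concrete, the following sketch shows how a row offset given in bytes might be used to address pixels in memory owned by someone else, and how a subwindow can share that memory. RefView, at() and sub() are hypothetical names invented for this illustration. They are not part of the library, and TImage may organise its data differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a reference image; NOT the library's TImage.
template <typename T>
struct RefView
{
    T * data;     // first pixel of the first row (memory is not owned)
    int w, h;     // size in pixels
    int yoffset;  // distance between two rows, in bytes

    // Address pixel (x, y): step y rows in bytes, then x pixels.
    T & at(int x, int y) const
    {
        assert(x >= 0 && x < w && y >= 0 && y < h);
        unsigned char * row = reinterpret_cast<unsigned char *>(data)
                            + static_cast<std::ptrdiff_t>(y) * yoffset;
        return reinterpret_cast<T *>(row)[x];
    }

    // A subwindow shares the same memory and keeps the parent's row offset.
    RefView sub(int x, int y, int sw, int sh) const
    {
        assert(sw > 0 && sh > 0 && x >= 0 && y >= 0 && x + sw <= w && y + sh <= h);
        RefView v = { &at(x, y), sw, sh, yoffset };
        return v;
    }
};
```

Because the subwindow keeps the parent's row offset, writing through it changes the parent's pixels. This is the behaviour one would expect from a reference image as opposed to a deep copy.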
During classification, the classifier needs access to integral image data and to the standard deviation of the classified subwindow, which is used to normalize feature responses. The TSampleImage structure holds references to the intensity and integral images and automatically calculates the standard deviation of the sample (using the integral images). A TSampleImage can be passed to a classifier to calculate its response.
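The standard deviation of a window can be obtained from two integral images, one of pixel values and one of squared pixel values. The sketch below illustrates that general technique. The names Integrals, buildIntegrals, windowSum and windowStdDev are hypothetical, and the layout with a zero border is an assumption of this example, not a description of how the library stores its integral images.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration; NOT the library's TFrame.
struct Integrals
{
    int w, h;                    // size of the source image
    std::vector<double> sum;     // (w+1) x (h+1) integral of pixel values
    std::vector<double> sqsum;   // (w+1) x (h+1) integral of squared values
};

// Build both integral images from a row-major 8-bit image.
Integrals buildIntegrals(const std::vector<unsigned char> & img, int w, int h)
{
    assert(w > 0 && h > 0 && img.size() == static_cast<std::size_t>(w) * h);
    Integrals I;
    I.w = w; I.h = h;
    I.sum.assign((w + 1) * (h + 1), 0.0);
    I.sqsum.assign((w + 1) * (h + 1), 0.0);
    const int s = w + 1;  // row length of the integral images
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
        {
            double p = img[y * w + x];
            int i = (y + 1) * s + (x + 1);
            // current = pixel + left + top - top-left
            I.sum[i]   = p     + I.sum[i - 1]   + I.sum[i - s]   - I.sum[i - s - 1];
            I.sqsum[i] = p * p + I.sqsum[i - 1] + I.sqsum[i - s] - I.sqsum[i - s - 1];
        }
    return I;
}

// Sum over the window [x, x+rw) x [y, y+rh) using four lookups.
double windowSum(const std::vector<double> & ii, int w, int x, int y, int rw, int rh)
{
    const int s = w + 1;
    return ii[(y + rh) * s + (x + rw)] - ii[y * s + (x + rw)]
         - ii[(y + rh) * s + x]        + ii[y * s + x];
}

// Standard deviation of the window: sqrt(E[p^2] - E[p]^2).
double windowStdDev(const Integrals & I, int x, int y, int rw, int rh)
{
    assert(rw > 0 && rh > 0 && x >= 0 && y >= 0 && x + rw <= I.w && y + rh <= I.h);
    double n = static_cast<double>(rw) * rh;
    double mean = windowSum(I.sum, I.w, x, y, rw, rh) / n;
    double var = windowSum(I.sqsum, I.w, x, y, rw, rh) / n - mean * mean;
    return std::sqrt(var > 0.0 ? var : 0.0);  // clamp rounding noise
}
```

Once the two integral images exist, the cost per window is constant regardless of window size, which is presumably why a sample can be created from a frame without recomputing anything.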
Methods:
TSampleImage::TSampleImage(TImageUChar & image, TRect & area, TSize sampleSize);
Constructor: initializes an instance of the specified size (sampleSize). The image region defined by area is rescaled to fit the sample size, and the integral representations and the standard deviation of the sample are calculated. This constructor allocates new images and performs resampling.
TSampleImage::TSampleImage(TFrame & frame, TRect & area);
Constructor: the sample size is defined by the size of area. No images are allocated; this constructor only creates reference images into the given frame and calculates the standard deviation.
The structure TFrame is similar to TSampleImage, but instead of holding a single sample it holds a whole image and its integral representations. Samples can then be extracted directly from a TFrame by TSampleImage without any memory allocation or integral image calculation.
Methods:
TFrame::TFrame(TSize sz);
Constructor: allocates images (intensity and two integral images) of the given size.
void insert(TImageUChar & image, int * rescale);
Inserts a new image, which is rescaled using a precalculated rescale table. The table should be calculated using the calcScaleTable() function. The integral representations are calculated automatically.
To provide multiresolution image scanning, the library supports image pyramids through the class TImagePyramid.
Methods:
TImagePyramid::TImagePyramid(TSize sz, float f, int l);
Initializes a pyramid instance, where sz is the size of the input image (top pyramid level), f is the scale factor and l is the number of levels.
void insert(TImageUChar & image);
Inserts image into the pyramid: the image is automatically rescaled to all levels, and the integral representations are calculated as well. The size of image should match the size specified in the constructor.
TFrame & frame(int level);
Returns the TFrame structure of the specified pyramid level.
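The relationship between the input size, the scale factor and the number of levels can be illustrated with a small calculation. The sketch below assumes that level 0 has the input size and that every further level is the previous one divided by the scale factor. The manual does not state the exact rounding or ordering the library uses, so this is only one plausible reading. pyramidLevelSizes is a hypothetical helper, and TSize is redeclared so that the example compiles without abr.h.

```cpp
#include <cassert>
#include <vector>

// Redeclared from the manual so the sketch compiles without abr.h.
struct TSize
{
    int w, h;
};

// Hypothetical helper: sizes of up to `levels` pyramid levels, ASSUMING that
// level 0 has the input size and each further level is the previous one
// divided by `factor`.
std::vector<TSize> pyramidLevelSizes(TSize top, float factor, int levels)
{
    assert(top.w > 0 && top.h > 0 && factor > 1.0f && levels > 0);
    std::vector<TSize> sizes;
    float w = static_cast<float>(top.w);
    float h = static_cast<float>(top.h);
    for (int i = 0; i < levels; ++i)
    {
        TSize s;
        s.w = static_cast<int>(w + 0.5f);  // round to the nearest pixel
        s.h = static_cast<int>(h + 0.5f);
        if (s.w < 1 || s.h < 1)
            break;                          // the level would be empty
        sizes.push_back(s);
        w /= factor;
        h /= factor;
    }
    return sizes;
}
```

Under this assumption a 320x240 input with factor 2 and 3 levels gives 320x240, 160x120 and 80x60. A smaller factor such as 1.33 produces more closely spaced scales at the cost of more pixels to scan.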
All classifiers are represented by the TClassifier class, which can load a classifier from an XML structure. Its most important method is scanImage, which detects objects in a video frame. Another method, evalSample, evaluates the classifier response to a single sample image. The following function loads a classifier from an XML file and returns a pointer to an initialized instance of TClassifier.
TClassifier * loadStrongClassifier(string & filename);
The most important function in the library is scanImagePyramid(), which performs the actual object detection: a multiresolution image scan with classifier evaluation.
int scanImagePyramid(TImagePyramid * p, TClassifier * c, std::vector<TRect> & r);
The parameters of the function are the pyramid (p), the classifier (c) that will perform object detection, and a reference to a vector (r) where the results will be placed. The function scans all pyramid levels with the given classifier and returns the number of detections.
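As a rough picture of what a multiresolution scan involves, the sketch below slides a fixed-size window over every level and records accepted windows in input-image coordinates. It is a generic sliding-window loop, not the library's implementation. Level, AcceptFn and scanLevels are hypothetical names, the step size and the mapping back to input coordinates are assumptions, and TRect is redeclared so that the example compiles without abr.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TRect { int x, y, w, h; };   // redeclared from the manual

// Hypothetical description of one pyramid level.
struct Level
{
    int w, h;      // level size in pixels
    float scale;   // input size divided by level size
};

// Hypothetical classifier callback: returns true if the window whose
// top-left corner is (x, y) on the given level is accepted.
typedef bool (*AcceptFn)(int level, int x, int y);

// Slide a win x win window over every level with the given step and append
// accepted windows, mapped back to input-image coordinates, to `out`.
// Returns the number of windows appended.
int scanLevels(const std::vector<Level> & levels, int win, int step,
               AcceptFn accept, std::vector<TRect> & out)
{
    assert(win > 0 && step > 0 && accept != 0);
    int count = 0;
    for (std::size_t l = 0; l < levels.size(); ++l)
    {
        const Level & lv = levels[l];
        for (int y = 0; y + win <= lv.h; y += step)
            for (int x = 0; x + win <= lv.w; x += step)
            {
                if (!accept(static_cast<int>(l), x, y))
                    continue;
                TRect r;
                r.x = static_cast<int>(x * lv.scale + 0.5f);
                r.y = static_cast<int>(y * lv.scale + 0.5f);
                r.w = static_cast<int>(win * lv.scale + 0.5f);
                r.h = r.w;
                out.push_back(r);
                ++count;
            }
    }
    return count;
}
```

Keeping the window size fixed and shrinking the image means the classifier never has to be rescaled. A window accepted on a coarse level corresponds to a large object in the input image.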
AdaBoost Runtime supports XML classifiers with the following structure.
<ROOT>
  <TWaldBoostLearner>
    <WaldBoostClassifier sizeX="24" sizeY="24">
      <stage negT="-4.5" posT="1E+10">
        <DecisionStumpWeakHypothesis threshold="-0.5" alpha="0.9" parity="-1">
          <HaarHorizontalDoubleFeature positionX="6" positionY="4" blockWidth="4" blockHeight="5" />
        </DecisionStumpWeakHypothesis>
      </stage>
      <stage>
        ... another weak hypothesis
      </stage>
      ... other stages
    </WaldBoostClassifier>
  </TWaldBoostLearner>
</ROOT>
The classifier is placed in the <WaldBoostClassifier> node and holds “stages” with WaldBoost thresholds. Every stage holds one weak hypothesis (only DecisionStumpWeakHypothesis is supported), and each weak hypothesis holds one image feature. The only supported feature types are Haar features named HaarHorizontalDoubleFeature (a), HaarVerticalDoubleFeature (b), HaarHorizontalTernalFeature (c) and HaarVerticalTernalFeature (d), which correspond to the wavelets in figure 2. Haar features are based on the intensity difference between adjacent rectangular regions. The meaning of the wavelet parameters in the XML is explained in figure 3.
Figure 2: Haar wavelets
Figure 3: Feature parameters
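The way stages, decision stumps and Haar features fit together can be sketched as follows. This is an illustration under several assumptions, not the library's code. It assumes that a horizontal double feature consists of two side-by-side blocks of blockWidth x blockHeight whose response is the left sum minus the right sum, that a stump votes +alpha or -alpha depending on parity and threshold, and that evaluation stops as soon as the running sum falls below negT or rises above posT. The normalization by standard deviation mentioned earlier is left out for brevity, and all type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integral image with a zero border: (w+1) x (h+1) values, row-major.
struct Integral
{
    int w, h;
    std::vector<long> v;
};

Integral buildIntegral(const std::vector<unsigned char> & img, int w, int h)
{
    assert(w > 0 && h > 0 && img.size() == static_cast<std::size_t>(w) * h);
    Integral I;
    I.w = w; I.h = h;
    I.v.assign((w + 1) * (h + 1), 0L);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
        {
            int i = (y + 1) * (w + 1) + (x + 1);
            // current = pixel + left + top - top-left
            I.v[i] = img[y * w + x] + I.v[i - 1] + I.v[i - (w + 1)] - I.v[i - (w + 2)];
        }
    return I;
}

// Sum of the block [x, x+bw) x [y, y+bh).
long blockSum(const Integral & I, int x, int y, int bw, int bh)
{
    assert(x >= 0 && y >= 0 && bw > 0 && bh > 0 && x + bw <= I.w && y + bh <= I.h);
    const int s = I.w + 1;
    return I.v[(y + bh) * s + x + bw] - I.v[y * s + x + bw]
         - I.v[(y + bh) * s + x]      + I.v[y * s + x];
}

// ASSUMED layout of a horizontal double feature: two blocks side by side,
// response = left block sum - right block sum.
long haarHorizontalDouble(const Integral & I, int px, int py, int bw, int bh)
{
    return blockSum(I, px, py, bw, bh) - blockSum(I, px + bw, py, bw, bh);
}

// Hypothetical stage mirroring the XML attributes.
struct Stage
{
    float negT, posT;        // WaldBoost thresholds of the stage
    float threshold, alpha;  // decision stump parameters
    int parity;              // +1 or -1
    int px, py, bw, bh;      // positionX, positionY, blockWidth, blockHeight
};

// ASSUMED stump convention: vote +alpha when parity * response is below
// parity * threshold, otherwise -alpha.
float stumpResponse(const Stage & s, float response)
{
    return (s.parity * response < s.parity * s.threshold) ? s.alpha : -s.alpha;
}

// Sequential evaluation: stop as soon as the running sum leaves [negT, posT].
bool evalStages(const Integral & I, const std::vector<Stage> & stages)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < stages.size(); ++i)
    {
        const Stage & s = stages[i];
        long f = haarHorizontalDouble(I, s.px, s.py, s.bw, s.bh);
        sum += stumpResponse(s, static_cast<float>(f));
        if (sum < s.negT) return false;   // early rejection
        if (sum > s.posT) return true;    // early acceptance
    }
    return true;                          // survived every stage
}
```

A very large posT, such as the 1E+10 in the XML example, effectively disables early acceptance for that stage, so only early rejection can end the evaluation there.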
Here is an example of using AdaBoost Runtime to detect objects in an image.
#include "abr.h"
#include <string>
#include <vector>
using namespace std;
...
// loadStrongClassifier takes a non-const string reference
string filename = "test.xml";
TClassifier * classifier = loadStrongClassifier(filename);
TImagePyramid * pyramid = new TImagePyramid(size, 1.33f, 8);
TImageUChar image(size);
vector<TRect> detections(max_detections);
if (classifier && pyramid && classifier->ok())
{
// Acquire image from camera or video file
pyramid->insert(image);
int count = scanImagePyramid(pyramid, classifier, detections);
// vector detections now contains count detected areas
}
The AdaBoost Runtime was tested with MS Visual Studio 2005 and gcc 3.4 on Windows. To install it on your system, follow these steps:
Build libabr.lib (libabr.a) using the Visual Studio solution or the makefile
Place the static library in your compiler's lib path
Copy the header files (except common.h) from the src directory to the include directory of your compiler
To build the static library successfully, you need libxml2 installed; it is used to process XML classifiers. To use the functions and structures the library provides, include abr.h in your code and link the code with the static library (-labr in gcc, additional library libabr.lib in VS).