AdaBoost Runtime Reference Manual

Introduction

AdaBoost Runtime is a C/C++ library for real-time 2D object detection using WaldBoost-trained image classifiers with Haar features. The library is distributed with source code, together with a gcc makefile and an MS Visual Studio solution to simplify its use. This manual explains which functions and structures the library provides and how to use them in your code.

Classes, Structures and Functions

Support classes

The structure TRect represents the position and size of a particular image subwindow, and TSize represents the general size of some object (image size, window size, ...). TRect is also used to represent detections in an image. TPoint holds a pair of x, y coordinates.

struct TRect
{
    int x, y, w, h;
};

struct TSize
{
    int w, h;
};

struct TPoint
{
    int x, y;
};
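As an illustration of how these structures fit together, here is a minimal sketch of two helpers that test a point or a rectangle against bounds. The helpers rectContains and rectFitsInside are hypothetical and not part of the library, and the sketch assumes that (x, y) is the top-left corner of the rectangle. The structures are redeclared locally so the sketch compiles without the library headers.

```cpp
// Local stand-ins mirroring the structures above, so the sketch compiles
// without the library headers.
struct TRect  { int x, y, w, h; };
struct TSize  { int w, h; };
struct TPoint { int x, y; };

// Hypothetical helper: true if the point lies inside the rectangle.
// Assumes (x, y) is the top-left corner; right and bottom edges are exclusive.
inline bool rectContains(const TRect & r, const TPoint & p)
{
    return p.x >= r.x && p.x < r.x + r.w &&
           p.y >= r.y && p.y < r.y + r.h;
}

// Hypothetical helper: true if the rectangle is non-empty and lies fully
// inside an image of the given size.
inline bool rectFitsInside(const TRect & r, const TSize & image)
{
    return r.x >= 0 && r.y >= 0 && r.w > 0 && r.h > 0 &&
           r.x + r.w <= image.w && r.y + r.h <= image.h;
}
```

A check like rectFitsInside could be applied before a rectangle is used to address a subwindow, so that no pixel outside the image is touched.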

Image Representation

Since this runtime library should be as independent of other libraries as possible (especially in the image processing part), it contains several classes that provide image representation and pixel access. The most general class is TImage, which is a template with one parameter specifying the data type of the pixels (TImage<float> for float images, for instance). There are predefined basic image types: TImageUChar (8 bit unsigned char), TImageUInt (32 bit unsigned integer) and TImageFloat (float pixels).

Constructors and methods of TImage are:

TImage::TImage();
Uninitialized image.

TImage::TImage(TSize size);
New image with specified size.

// 100x100 8bit image
TImageUChar image(TSize(100, 100));

TImage::TImage(const TImage & other);
Copy constructor: makes a deep copy of the other image. Even if the other image is a reference image, a new image is allocated and the pixels are copied.

TImage::TImage(_T * data, TSize size, int yoffset);
Reference image with preallocated data. This can be useful for accessing images created by another image library (like OpenCV, DigILib, ...). yoffset is the address offset between two image rows (in bytes).

// Static float array
float data[100*100];
// New TImage instance referencing the static data (100 floats per row)
TImageFloat image(data, TSize(100, 100), 100*sizeof(float));

TImage reference(TImage & other);
Returns reference image of other image.

TImage subimage(TRect & area);
Returns reference image of particular subwindow defined by area.
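The idea behind reference images and subimages can be sketched as follows. This is not the TImage implementation; StridedView, pixelAt and subView are hypothetical names used only to show how a byte offset between rows lets a view address pixels it does not own, and how a subwindow can share memory with its parent.

```cpp
#include <cstddef>

// Hypothetical stand-in for a reference image: it does not own its pixels.
template <typename T>
struct StridedView
{
    T * data;       // first pixel of the view
    int w, h;       // view size in pixels
    int yoffset;    // distance between two rows, in bytes
};

// Address of pixel (x, y): rows are stepped in bytes, columns in elements.
template <typename T>
inline T & pixelAt(const StridedView<T> & v, int x, int y)
{
    unsigned char * row = reinterpret_cast<unsigned char *>(v.data)
                        + static_cast<std::size_t>(y) * v.yoffset;
    return reinterpret_cast<T *>(row)[x];
}

// A subwindow shares the parent's memory: only the start pointer and the
// size change, while the row offset stays that of the parent.
template <typename T>
inline StridedView<T> subView(const StridedView<T> & v, int x, int y, int w, int h)
{
    StridedView<T> s = { &pixelAt(v, x, y), w, h, v.yoffset };
    return s;
}
```

Because the subwindow keeps the parent's row offset, writing through the subwindow changes the parent image as well, and no pixel is copied.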

Sample Images and Frames

During classification you need to access integral image data and to calculate the standard deviation of the classified subwindow in order to normalize the feature response. The TSampleImage structure holds references to the intensity and integral images and automatically calculates the standard deviation of the sample (using the integral images). A TSampleImage can be passed to a classifier to calculate its response.

Methods:

TSampleImage::TSampleImage(TImageUChar & image, TRect & area, TSize sampleSize);
Constructor: initializes an instance of the specified size (sampleSize). The part of the image defined by area is rescaled to fit the sample size, and the integral representations and standard deviation of the sample are calculated. This constructor allocates new images and performs resampling.

TSampleImage::TSampleImage(TFrame & frame, TRect & area);
Constructor: the sample size is defined by the size of area. No images are allocated; this constructor only creates reference images into the given frame and calculates the standard deviation.
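The way a standard deviation can be obtained from integral images is sketched below. The sketch assumes that the two integral representations are the integral of pixel values and the integral of squared pixel values; the manual does not state this, and the layout used by the library may differ. The names Integrals, buildIntegrals, windowSum and windowStdDev are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Hypothetical container for the two integral images of a w x h picture.
// Both tables have one extra row and column of zeros, so entry (x, y)
// holds the total of all pixels above and to the left of (x, y).
struct Integrals
{
    int w, h;
    std::vector<double> sum;    // integral of pixel values
    std::vector<double> sqsum;  // integral of squared pixel values
};

inline Integrals buildIntegrals(const unsigned char * pixels, int w, int h)
{
    Integrals I;
    I.w = w; I.h = h;
    I.sum.assign((w + 1) * (h + 1), 0.0);
    I.sqsum.assign((w + 1) * (h + 1), 0.0);
    for (int y = 0; y < h; ++y)
    {
        double rowSum = 0.0, rowSq = 0.0;
        for (int x = 0; x < w; ++x)
        {
            double p = pixels[y * w + x];
            rowSum += p;
            rowSq  += p * p;
            int i = (y + 1) * (w + 1) + (x + 1);
            // Entry above plus the running sum of the current row.
            I.sum[i]   = I.sum[i - (w + 1)]   + rowSum;
            I.sqsum[i] = I.sqsum[i - (w + 1)] + rowSq;
        }
    }
    return I;
}

// Sum over the window [x, x+w) x [y, y+h) from four table lookups.
inline double windowSum(const std::vector<double> & t, int stride,
                        int x, int y, int w, int h)
{
    return t[(y + h) * stride + (x + w)] - t[y * stride + (x + w)]
         - t[(y + h) * stride + x]       + t[y * stride + x];
}

// Standard deviation of the window: sqrt(E[p^2] - E[p]^2).
inline double windowStdDev(const Integrals & I, int x, int y, int w, int h)
{
    double n    = static_cast<double>(w) * h;
    double mean = windowSum(I.sum,   I.w + 1, x, y, w, h) / n;
    double msq  = windowSum(I.sqsum, I.w + 1, x, y, w, h) / n;
    double var  = msq - mean * mean;
    return var > 0.0 ? std::sqrt(var) : 0.0;
}
```

Once both tables are built, the cost of windowStdDev does not depend on the window size, which is why extracting a sample from precomputed integral images needs no further pass over the pixels.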

The structure TFrame is similar to TSampleImage, but instead of holding only one sample it holds a whole image and its integral representations. Samples can then be extracted directly from a TFrame by TSampleImage without any memory allocation or integral image calculation.

Methods:

TFrame::TFrame(TSize sz);
Constructor: allocates images (intensity and 2 integrals) of the given size.

void insert(TImageUChar & image, int * rescale);
Inserts a new image, rescaled using the precalculated table rescale. The table should be calculated with the calcScaleTable() function. The integral representations are calculated automatically.
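The general idea of a precalculated rescale table can be sketched as a nearest-neighbour lookup. This is only an illustration of the technique: the format of the table produced by calcScaleTable() is not described in this manual, and makeNearestTable and resample are hypothetical functions that do not belong to the library.

```cpp
#include <vector>

// Hypothetical nearest-neighbour table: entry i is the source coordinate
// that destination coordinate i is read from.
inline std::vector<int> makeNearestTable(int srcLen, int dstLen)
{
    std::vector<int> table(dstLen);
    for (int i = 0; i < dstLen; ++i)
    {
        // Sample at the centre of destination cell i, mapped to the source axis.
        int s = static_cast<int>((i + 0.5) * srcLen / dstLen);
        table[i] = s < srcLen ? s : srcLen - 1;
    }
    return table;
}

// Resample a row-major 8-bit image using one table per axis.
inline std::vector<unsigned char> resample(const std::vector<unsigned char> & src,
                                           int srcW, int srcH, int dstW, int dstH)
{
    std::vector<int> xs = makeNearestTable(srcW, dstW);
    std::vector<int> ys = makeNearestTable(srcH, dstH);
    std::vector<unsigned char> dst(dstW * dstH);
    for (int y = 0; y < dstH; ++y)
        for (int x = 0; x < dstW; ++x)
            dst[y * dstW + x] = src[ys[y] * srcW + xs[x]];
    return dst;
}
```

The benefit of a table is that the coordinate arithmetic is done once per image size, so inserting each new video frame only costs one lookup per destination pixel.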

Image Pyramids

To support multiresolution image scanning, the library provides image pyramids in the class TImagePyramid.

Methods:

TImagePyramid::TImagePyramid(TSize sz, float f, int l);
Initializes a pyramid instance, where sz is the size of the input image (top pyramid level), f is the scale factor and l is the number of levels.

void insert(TImageUChar & image);
Inserts an image into the pyramid. The image is automatically rescaled to all levels and the integral representations are calculated. The size of the image should be the same as specified in the constructor.

TFrame & frame(int level);
Returns the TFrame structure of the specified pyramid level.
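How the level sizes could follow from the constructor parameters is sketched below. The sketch assumes that every level is f times smaller than the previous one and that sizes are truncated to whole pixels; the rounding used by TImagePyramid is not documented here. LevelSize and pyramidSizes are hypothetical names.

```cpp
#include <vector>

struct LevelSize { int w, h; };  // hypothetical stand-in for TSize

// Size of every level when each level is `factor` times smaller than the
// previous one. Truncating to whole pixels is an assumption of this sketch.
inline std::vector<LevelSize> pyramidSizes(int w, int h, float factor, int levels)
{
    std::vector<LevelSize> sizes;
    float scale = 1.0f;
    for (int i = 0; i < levels; ++i)
    {
        LevelSize s = { static_cast<int>(w / scale), static_cast<int>(h / scale) };
        if (s.w < 1 || s.h < 1) break;  // stop once a level would be empty
        sizes.push_back(s);
        scale *= factor;
    }
    return sizes;
}
```

With a fixed classifier window, a smaller level corresponds to a larger object in the input image, so the scale factor controls how finely object sizes are sampled.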

Classifier structure

All classifiers are represented by the TClassifier class, which can load a classifier from an XML structure. Its most important method is scanImage, which detects objects in a video frame. Another method, evalSample, evaluates the classifier response to a single sample image. The following function loads a classifier from an XML file and returns a pointer to an initialized instance of TClassifier.

TClassifier * loadStrongClassifier(string & filename);

Functions

The most important function in the library is scanImagePyramid(), which performs the actual object detection: a multiresolution image scan with classifier evaluation.

int scanImagePyramid(TImagePyramid * p, TClassifier * c, std::vector<TRect> & r);

The parameters are the pyramid (p), the classifier (c) that performs the object detection, and a reference to the vector (r) where the results are placed. The function scans all pyramid levels with the given classifier and returns the number of detections.
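The scan of a single pyramid level can be pictured as a sliding window, sketched below. This is not the code of scanImagePyramid(); Box, WindowTest and scanLevel are hypothetical, the step between windows is an assumed parameter, and the sketch assumes that detections are reported in the coordinates of the input image.

```cpp
#include <vector>

struct Box { int x, y, w, h; };  // hypothetical stand-in for TRect

// Hypothetical callback: returns true if the window at (x, y) on the
// current level is accepted by the classifier.
typedef bool (*WindowTest)(int x, int y);

// Slide a winW x winH window over one level of size levelW x levelH and
// append accepted windows, scaled back to input-image coordinates.
// `scale` is the ratio of the input image size to the level size.
inline int scanLevel(int levelW, int levelH, int winW, int winH, int step,
                     float scale, WindowTest accept, std::vector<Box> & out)
{
    int found = 0;
    for (int y = 0; y + winH <= levelH; y += step)
        for (int x = 0; x + winW <= levelW; x += step)
            if (accept(x, y))
            {
                Box b = { static_cast<int>(x * scale), static_cast<int>(y * scale),
                          static_cast<int>(winW * scale), static_cast<int>(winH * scale) };
                out.push_back(b);
                ++found;
            }
    return found;
}
```

Calling such a function once per level and summing the returned counts gives the total number of detections over the whole pyramid.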

Classifier format

AdaBoost Runtime supports XML classifiers with the following structure.

<ROOT>
  <TWaldBoostLearner>
    <WaldBoostClassifier sizeX="24" sizeY="24">
      <stage negT="-4.5" posT="1E+10">
        <DecisionStumpWeakHypothesis threshold="-0.5" alpha="0.9" parity="-1">
          <HaarHorizontalDoubleFeature positionX="6" positionY="4" blockWidth="4" blockHeight="5" />
        </DecisionStumpWeakHypothesis>
      </stage>
      <stage>
        ... another weak hypothesis
      </stage>
      ... another stages
    </WaldBoostClassifier>
  </TWaldBoostLearner>
</ROOT>

The classifier is placed in the <WaldBoostClassifier> node and holds "stages" with WaldBoost thresholds. Every stage holds one weak hypothesis (DecisionStumpWeakHypothesis is supported), and each weak hypothesis holds one image feature. The only supported feature types are Haar features, named HaarHorizontalDoubleFeature (a), HaarVerticalDoubleFeature (b), HaarHorizontalTernalFeature (c) and HaarVerticalTernalFeature (d), which correspond to the wavelets in figure 2. Haar features are based on the intensity difference between adjacent rectangular regions. The meaning of the feature parameters in the XML is explained in figure 3.


Figure 2: Haar wavelets


Figure 3: Feature parameters
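The role of the stage attributes can be illustrated with a sketch of sequential WaldBoost evaluation. It is written under several assumptions that this manual does not confirm: that negT and posT are compared with the accumulated response after every stage, that the stump votes +alpha or -alpha, that parity flips the direction of the threshold comparison, and that a sample surviving all stages is accepted. Stage, stumpResponse and evalStages are hypothetical names, and the feature values are taken as given instead of being computed from integral images.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical in-memory form of one <stage>; field names follow the XML
// attributes, but the struct itself is not part of the library.
struct Stage
{
    float negT, posT;        // WaldBoost thresholds on the accumulated response
    float threshold, alpha;  // decision stump parameters
    int   parity;            // +1 or -1, flips the direction of the comparison
};

// Decision stump: votes +alpha or -alpha depending on which side of the
// threshold the feature value falls. The sign convention is an assumption.
inline float stumpResponse(const Stage & s, float feature)
{
    bool positive = s.parity * feature < s.parity * s.threshold;
    return positive ? s.alpha : -s.alpha;
}

// Sequential evaluation: features[i] is the (normalized) response of the
// feature of stage i. Returns true if the sample is accepted.
inline bool evalStages(const std::vector<Stage> & stages,
                       const std::vector<float> & features, float & response)
{
    response = 0.0f;
    for (std::size_t i = 0; i < stages.size(); ++i)
    {
        response += stumpResponse(stages[i], features[i]);
        if (response < stages[i].negT) return false;  // early rejection
        if (response > stages[i].posT) return true;   // early acceptance
    }
    return true;  // survived every stage (assumed to mean "accepted")
}
```

Under these assumptions, a posT of 1E+10 as in the XML above effectively disables early acceptance, so only negT can end the evaluation before the last stage.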

Example

Here is an example of using AdaBoost Runtime to detect objects in an image.

#include "abr.h"
#include <string>
#include <vector>
using namespace std;
...
// size (TSize) and max_detections are assumed to be defined above
string filename = "test.xml";
TClassifier * classifier = loadStrongClassifier(filename);
TImagePyramid * pyramid = new TImagePyramid(size, 1.33f, 8);
TImageUChar image(size);
vector<TRect> detections(max_detections);
if (classifier && pyramid && classifier->ok())
{
    // Acquire image from camera or video file
    pyramid->insert(image);
    int count = scanImagePyramid(pyramid, classifier, detections);
    // vector detections now contains count number of detected areas
}

Installation

The AdaBoost Runtime was tested with MS Visual Studio 2005 and gcc 3.4 on Windows. To install it on your system, follow these steps:

  1. Build libabr.lib (libabr.a) – use Visual Studio solution or makefile

  2. Place static library to your compiler lib path

  3. Copy header files (except common.h) from src directory to include directory of your compiler

To build the static library you need libxml2 installed; it is used to process XML classifiers. To use the functions and structures the library provides, include abr.h in your code and link the code with the static library (-labr in gcc, additional library libabr.lib in VS).