Investigating Hidden Markov Models' capabilities in 2D shape classification

Manuele Bicego

Object recognition, shape analysis, and classification are important research areas in computer vision. Three-dimensional (3D) object recognition has been addressed by a large number of approaches, many of which are based on the analysis of two-dimensional (2D) aspects of objects.
In this context, a basic issue is the kind of object description to be used. Shape analysis methods can be classified accordingly into boundary (or external) and global (or internal) algorithms. Typical examples of the former class are methods that code the object boundary, such as Fourier descriptors and chain codes, whereas examples of the latter class are algorithms based on medial axis extraction and moment-based approaches.

In particular, object contours have proved to be very effective in many applications, and different types of approaches have been proposed in past years, each with different characteristics, such as robustness to noise and occlusions, invariance to translation, rotation, and scale, computational requirements, and accuracy.

In this context, this work investigates the capabilities of Hidden Markov Models (HMMs) for 2D shape classification. Shapes are represented by contours and described by their curvature coefficients along the boundary. HMM performance is assessed under translation, rotation, noise, occlusion, shearing transformations, and combinations of these perturbations.
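To make the contour description concrete, the following is a minimal sketch of one possible curvature-like descriptor: the signed turning angle at each vertex of a closed polygonal contour. The abstract does not specify which curvature estimator is used, so the function name `turning_angles` and the choice of turning angle as a proxy for curvature are assumptions made purely for illustration. Because the angle depends only on differences between consecutive points, it is unchanged by translation and rotation of the contour.

```python
import math

def turning_angles(contour):
    """Signed turning angle (radians) at each vertex of a closed contour.

    `contour` is a list of (x, y) points ordered along the boundary.
    This is an illustrative proxy for curvature, not the paper's estimator.
    """
    n = len(contour)
    angles = []
    for i in range(n):
        x0, y0 = contour[i - 1]          # previous point (wraps around)
        x1, y1 = contour[i]              # current point
        x2, y2 = contour[(i + 1) % n]    # next point (wraps around)
        ax, ay = x1 - x0, y1 - y0        # incoming edge
        bx, by = x2 - x1, y2 - y1        # outgoing edge
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles

# Usage: a unit square traversed counter-clockwise.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_angles = turning_angles(square)

# The same square after translation gives the same descriptor.
shifted = [(x + 5.0, y - 3.0) for x, y in square]
shifted_angles = turning_angles(shifted)
```

The resulting sequence of angles is the kind of one-dimensional observation sequence that could be fed to an HMM, one value per boundary point.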

Special attention is devoted to the training of the HMMs, in particular to the initialization of the learning session and to the model selection issue. Initialization is crucial because the standard procedure used to estimate the HMM parameters behaves locally, so the result depends on the starting point: here it is addressed with a Gaussian Mixture Model clustering approach. Model selection is another fundamental issue, concerning the choice of the topology and the number of states: an appropriate choice typically prevents overtraining. In our approach, model selection is handled with a fast and reasonable method linked to the initialization, based on the Bayesian Information Criterion (BIC).
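As a rough illustration of BIC-based model selection tied to a Gaussian mixture, the sketch below scores already-fitted one-dimensional mixtures and keeps the one with the lowest BIC. Several things here are assumptions rather than the procedure of this work: the "lower is better" form of the criterion (-2 log L + k log N), the parameter count for a 1D mixture, the helper names, and the idea that the selected number of components would then serve as the number of HMM states. The fitting step itself is not shown; the candidate parameters are hand-picked toy values.

```python
import math

def gmm_log_likelihood(data, weights, means, variances):
    """Log-likelihood of 1D data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        p = sum(
            w * math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(p)
    return total

def bic(log_lik, n_params, n_obs):
    """BIC in the 'lower is better' form (assumed convention)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

def gmm_param_count(k):
    """Free parameters of a 1D mixture: (k - 1) weights, k means, k variances."""
    return 3 * k - 1

def select_num_components(candidates, data):
    """Pick the candidate with the lowest BIC.

    `candidates` maps k -> (weights, means, variances), assumed already fitted.
    """
    scores = {
        k: bic(gmm_log_likelihood(data, *params), gmm_param_count(k), len(data))
        for k, params in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores

# Usage with toy data: two tight clusters around 0 and 10.
data = [-0.1, 0.0, 0.1, 9.9, 10.0, 10.1]
candidates = {
    1: ([1.0], [5.0], [25.0]),                    # one broad component
    2: ([0.5, 0.5], [0.0, 10.0], [0.01, 0.01]),   # one component per cluster
}
best, scores = select_num_components(candidates, data)
```

On this toy data the two-component mixture has the lower BIC, so `best` is 2. The penalty term `n_params * log(n_obs)` is what discourages models with more states than the data support.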

Results are promising: with 50% occlusion, 95% correct classification is achieved.

Object databases utilized