Part-based object retrieval with binary partition trees

Author

Giró i Nieto, Xavier

Director

Marqués Acosta, Fernando

Codirector

Chang, Shih-Fu

Date of defense

2012-05-31

Legal Deposit

B. 10510-2013

Pages

211 p.



Department/Institute

Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions

Abstract

This thesis addresses the problem of visual object retrieval, where a user formulates a query to an image database by providing one or multiple examples of an object of interest. The presented techniques aim both at finding the images in the database that contain the object and at locating the object in each image and segmenting it from the background. Every image considered, whether used as a query or contained in the target database, is represented as a Binary Partition Tree (BPT), the hierarchy of regions previously proposed by Salembier and Garrido (2000). This data structure offers multiple opportunities and challenges when applied to the object retrieval problem.

A first application of BPTs appears during the formulation of the query, when the user must interactively segment the query object from the background. Firstly, the BPT can assist in adjusting an initial marker, such as a scribble or bounding box, to the object contours. Secondly, the BPT can define a navigation path that lets the user adjust an initial selection to the appropriate spatial scale.

The hierarchical structure of the BPT is also exploited to extract a new type of visual words named Hierarchical Bag of Regions (HBoR). Each region defined in the BPT is described with a feature vector that combines a soft quantization on a visual codebook with an efficient bottom-up computation through the BPT. These descriptors allow the definition of a novel feature space, the Parts Space, where each object is located according to the parts that compose it.

HBoR descriptors have been applied to two object retrieval scenarios, both of them solved by considering the decomposition of the objects into parts. In the first scenario, the query is formulated with a single object exemplar, which is to be matched against each BPT in the target database. The matching problem is solved in two stages: an initial top-down stage that assumes that the hierarchy of the query is respected in the target BPT, and a second bottom-up stage that relaxes this condition and considers region merges that are not in the target BPT. The second scenario considers a query composed of several visual objects. In this case, the provided exemplars are used as a training set to build a model of the query concept. This model has two levels: a first one where each part is modelled and detected separately, and a second one that characterises the combinations of parts that describe the complete object. The analysis process exploits the hierarchical nature of the BPT by using a novel classifier that drives an efficient top-down analysis of the target BPTs.
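As a rough illustration of the bottom-up computation mentioned in the abstract, the sketch below propagates codeword histograms from the leaves of a binary tree to its root, so that each merged region reuses the histograms of its two children instead of being recomputed from pixels. This is not taken from the thesis: the `Node` class, the `accumulate` and `normalise` functions, and the three-word codebook are hypothetical names and values chosen for the example, and plain summation of child histograms is an assumed aggregation rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical BPT node: a leaf carries a histogram, an inner node merges two children."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    hist: Optional[List[float]] = None  # unnormalised (possibly soft) codeword counts


def accumulate(node: Node) -> List[float]:
    """Fill in histograms bottom-up: a parent's histogram is the sum of its children's."""
    if node.left is None and node.right is None:
        if node.hist is None:
            raise ValueError("leaf region has no histogram")
        return node.hist
    left_hist = accumulate(node.left)
    right_hist = accumulate(node.right)
    node.hist = [a + b for a, b in zip(left_hist, right_hist)]
    return node.hist


def normalise(hist: List[float]) -> List[float]:
    """Scale a histogram to sum to 1 so regions of different sizes are comparable."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


# Toy tree over an assumed 3-word codebook: root = merge(merge(a, b), c).
a = Node(hist=[2.0, 0.0, 1.0])
b = Node(hist=[0.0, 3.0, 0.0])
c = Node(hist=[1.0, 1.0, 1.0])
ab = Node(left=a, right=b)
root = Node(left=ab, right=c)

accumulate(root)
print(ab.hist, root.hist)
```

Each node is visited once, so the cost grows linearly with the number of regions in the tree. Normalisation is kept as a separate step because summing already-normalised child histograms would ignore the relative sizes of the merged regions.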

Subjects

004 - Computer science; 621.3 Electrical engineering

Documents

TXGN1de1.pdf

16.34Mb

 

Rights

Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-sa/3.0/es/
