An unbalanced approach to metric space searching
Proximity queries (the searching problem generalized beyond exact match) is mostly modeled as metric space. A metric space consists of a collection of objects and a distance function defined among them. The goal is to preprocess the data set (a slow procedure) to quickly answer proximity queries. Th...
Autores principales:  , , 

Formato:  Objeto de conferencia 
Lenguaje:  Inglés 
Publicado: 
2005

Materias:  
Acceso en línea:  http://sedici.unlp.edu.ar/handle/10915/21087 
Aporte de:  Aportado por :
SEDICI (UNLP) de
Universidad Nacional de La Plata .

Sumario:  Proximity queries (the searching problem generalized beyond exact match) is mostly modeled as metric space. A metric space consists of a collection of objects and a distance function defined among them. The goal is to preprocess the data set (a slow procedure) to quickly answer proximity queries. This problem have received a lot of attention recently, specially in the pattern recognition community.
The Excluded Middle Vantage Point Forest (VP–forest) is a data structure designed to search in high dimensional vector spaces. A VP–forest is built as a collection of balanced Vantage Point Trees (VP–trees).
In this work we propose a novel twofold approach for searching. Firstly we extend the VP– forest to search in metric spaces, and more importantly we test a counterintuitive modification to the VP–tree, namely to unbalance it. In exact searching an unbalanced data structure perform poorly, and most of the algorithmic effort is directed to obtain a balanced data structure. The unbalancing approach is motivated by a recent data structure (the List of Clusters ) specialized in high dimensional metric space searches, which is an extremely unbalanced data structure (a linked list) outperforming other approaches. 
