%0 Journal Article
%T Fast Exact Nearest Neighbour Matching in High Dimensions Using d-D Sort
%A Ruan Lakemond
%A Clinton Fookes
%A Sridha Sridharan
%J ISRN Machine Vision
%D 2013
%R 10.1155/2013/405680
%X Data structures such as k-D trees and hierarchical k-means trees perform very well in approximate nearest neighbour matching, but are only marginally more effective than linear search when performing exact matching in high-dimensional image descriptor data. This paper presents several improvements to linear search that allow it to outperform existing methods and recommends two approaches to exact matching. The first method reduces the number of operations by evaluating the distance measure in order of significance of the query dimensions and terminating when the partial distance exceeds the search threshold. This method does not require preprocessing and significantly outperforms existing methods. The second method improves query speed further by presorting the data using a data structure called d-D sort. The order information is used as a priority queue to reduce the time taken to find the exact match and to restrict the range of data searched. The d-D sort structure is very simple to implement, requires no parameter tuning, and takes significantly less time to construct than the best-performing tree structure; data can also be added to the structure relatively efficiently.
1. Introduction
The nearest neighbour matching (k-NN) problem is encountered in many applications of computer science. It is the problem of finding the k points in a database nearest to a given query point. The complexity of a simple linear search is proportional to nd, where n is the number of database entries and d is the number of data dimensions. Many attempts have been made to reduce the search time by implementing data storage and indexing structures so that the minimum number of data points has to be compared to the query point. Unfortunately, these methods are only effective in low dimensions or when using approximate nearest neighbour matching [1, 2].
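The first method described in the abstract (partial distance evaluation with early termination) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The function name `partial_distance_nn` is hypothetical, the distance is assumed to be Euclidean, and the "significance" of a query dimension is assumed here to mean the magnitude of the query's component in that dimension; the paper may define it differently.

```python
import math


def partial_distance_nn(data, query):
    """Exact 1-nearest-neighbour linear search with early termination.

    Hypothetical helper for illustration only. Returns (index, distance)
    of the nearest row in `data`, or (-1, inf) if `data` is empty.
    """
    # Assumption: visit dimensions in order of decreasing |query[j]|,
    # used here as a stand-in for "order of significance".
    order = sorted(range(len(query)), key=lambda j: -abs(query[j]))

    best_index, best_sq = -1, math.inf  # best_sq acts as the search threshold
    for i, row in enumerate(data):
        partial = 0.0
        for j in order:
            diff = row[j] - query[j]
            partial += diff * diff
            # The squared distance can only grow, so this row cannot
            # beat the current best once the partial sum reaches it.
            if partial >= best_sq:
                break
        else:
            # Loop finished without a break: this row is strictly closer.
            best_index, best_sq = i, partial
    return best_index, math.sqrt(best_sq)


if __name__ == "__main__":
    database = [[0, 0, 0], [1, 2, 2], [3, 0, 4]]
    print(partial_distance_nn(database, [1, 2, 3]))
```

The result is the same as a plain linear search because a row is only abandoned once its partial squared distance already rules it out; the saving comes from skipping the remaining dimensions of that row.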
Where an exact solution is required in more than 20 dimensions, linear search is only a fraction slower than the best existing search structure. This paper proposes methods for improving the performance of linear search for the purpose of exact nearest neighbour matching using typical visual descriptors, such as the scale invariant feature transform (SIFT) [3, 4]. Many modern visual descriptors, such as the gradient location and orientation histogram (GLOH) [5], are based on SIFT and have very similar characteristics from a search perspective. First, it is shown that simple modifications to the linear search algorithm can allow it to outperform all existing search structures, without the need for any data preprocessing. Secondly, by
%U http://www.hindawi.com/journals/isrn.machine.vision/2013/405680/