Design, Analysis and Implementation of New Variants of Kd-trees
Tutor / director / evaluatorRoura Ferret, Salvador,
Document typeMaster thesis
Rights accessOpen Access
The representation of multidimensional data is a central issue in database design, as well as in many other elds, including computer graphics, com- putational geometry, pattern recognition, geographic information systems and others. Indeed, multidimensional points can represent locations, as well as more general records that arise in database management systems. For instance, consider an employee record that has attributes corresponding to the employee's name, address, sex, age, height and weight. Although the di erent dimensions have di erent data types (name and address are strings of characters; sex is a binary eld; and age, height and weight are numbers), these records can be treated as points in a six-dimensional space. We may see a database as a collection of records. Each record has several attributes, some of which are keys. The associative retrieval problem consists of answering queries with respect to a le of multidimensional records. Such an associative query requires the retrieval of those records in the le whose key attributes satisfy a certain condition. Examples of associative queries are intersection queries and nearest neighbor queries. In order to facilitate the retrieval of records based on some conditions on its key attributes, it is usually helpful to assumed the existence of an ordering for its values. In the case of numeric keys, such an ordering is quite obvious. In the case of alphanumeric keys, the ordering is usually based on the alphabetic sequence of the characters making up the attribute value. Furthermore, certain queries, like nearest neighbor searches, require the existence of a distance function.