Rank selection in multidimensional data
Document typeExternal research report
Rights accessOpen Access
Suppose we have a set of K-dimensional records stored in a general purpose spatial index like a K-d tree. The index efficiently supports insertions, ordinary exact searches, orthogonal range searches, nearest neighbor searches, etc. Here we consider whether we can also efficiently support search by rank, that is, to locate the i-th smallest element along the j-th coordinate. We answer this question in the affirmative by developing a simple algorithm with expected cost O(na(1/K) log n), where n is the size of the K-d tree and a(1/K) < 1 for any K ¿ 2. The only requirement to support the search by rank is that each node in the K-d tree stores the size of the subtree rooted at that node (or some equivalent information). This is not too space demanding. Furthermore, it can be used to randomize the update algorithms to provide guarantees on the expected performance of the various operations on K-d trees. Although selection in multidimensional data can be solved more efficiently than with our algorithm, those solutions will rely on ad-hoc data structures or superlinear space. Our solution adds to an existing data structure (K-d trees) the capability of search by rank with very little overhead. The simplicity of the algorithm makes it easy to implement, practical and very flexible; however, its correctness and efficiency are far from self-evident. Furthermore, it can be easily adapted to other spatial indexes as well.
CitationDuch, A., Jiménez, R., Martínez, C. "Rank selection in multidimensional data". 2009.
Is part ofLSI-09-32-R