dc.contributor.author: Rajani, Nazneen
dc.contributor.author: McArdle, Kate
dc.contributor.author: Dhillon, Inderjit S.
dc.date.accessioned: 2015-07-29T11:32:22Z
dc.date.available: 2015-07-29T11:32:22Z
dc.date.issued: 2015
dc.identifier.citation: Rajani, Nazneen; McArdle, Kate; Dhillon, Inderjit S. Parallel k nearest neighbor graph construction using tree-based data structures. In: HPGM: High Performance Graph Mining. "1st High Performance Graph Mining workshop, Sydney, 10 August 2015". 2015.
dc.identifier.uri: http://hdl.handle.net/2117/76382
dc.description.abstract: Construction of a nearest neighbor graph is often a necessary step in many machine learning applications. However, constructing such a graph is computationally expensive, especially when the data is high dimensional. Python's open source machine learning library Scikit-learn uses k-d trees and ball trees to implement nearest neighbor graph construction. However, this implementation is inefficient for large datasets. In this work, we focus on exploiting these underlying tree-based data structures to optimize parallel execution of the nearest neighbor algorithm. We present parallel implementations of nearest neighbor graph construction using such tree structures, with parallelism provided by OpenMP and the Galois framework. We empirically show that our parallel and exact approach is efficient as well as scalable, compared to the Scikit-learn implementation. We present the first implementation of k-d trees and ball trees using Galois. Our results show that k-d trees are faster when the number of dimensions is small (2^d ≪ N); ball trees, on the other hand, scale well with the number of dimensions. Our implementation of ball trees in Galois has almost linear speedup on a number of datasets irrespective of the size and dimensionality of the data.
dc.format.extent: 8 p.
dc.language.iso: eng
dc.relation.ispartof: High Performance Graph Mining workshop (1st: 2015: Sydney)
dc.rights: Attribution-ShareAlike 3.0 Spain
dc.rights.uri: http://creativecommons.org/licenses/by-sa/3.0/es/
dc.subject: UPC subject areas::Mathematics and statistics
dc.subject.lcsh: Algorithm
dc.title: Parallel k nearest neighbor graph construction using tree-based data structures
dc.type: Conference report
dc.subject.lemac: Algorismes
dc.identifier.doi: 10.5821/hpgm15.1
dc.rights.access: Open Access
upcommons.citation.contributor: HPGM: High Performance Graph Mining
upcommons.citation.published: true
upcommons.citation.publicationName: 1st High Performance Graph Mining workshop, Sydney, 10 August 2015

Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-ShareAlike 3.0 Spain