Robust clustering of data collected via crowdsourcing
Document type: Conference report
Rights access: Open Access
Crowdsourcing approaches rely on the collection of multiple individuals to solve problems that require analysis of large data sets in a timely and accurate manner. The inexperience of participants, or annotators, motivates the use of robust techniques. Focusing on clustering setups, the data provided by all annotators is suitably modeled here as a mixture of Gaussian components plus a uniformly distributed random variable to capture outliers. The proposed algorithm is based on the expectation-maximization algorithm and jointly allows for soft assignments of data to clusters, rating of annotators according to their performance, and estimation of the number of Gaussian components in the non-Gaussian/Gaussian mixture model.
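The mixture described in the abstract can be illustrated with a minimal sketch of expectation-maximization for Gaussian components plus one uniform outlier component. This is not the authors' algorithm: it pools all data without modeling individual annotators, does not rate them, and takes the number of Gaussian components as given rather than estimating it. The function names, the `init_means` and `reg` parameters, and the choice of the data's bounding box as the support of the uniform component are all assumptions made for illustration.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log-density of a multivariate Gaussian at each row of X."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, (X - mean).T)      # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)                # squared Mahalanobis distances
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def em_gaussian_uniform(X, n_clusters, n_iter=100, reg=1e-6,
                        init_means=None, seed=0):
    """EM for a mixture of Gaussians plus a uniform outlier component.

    Column 0 of the returned responsibilities is the outlier component.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Assumption: outliers are uniform over the bounding box of the data
    # (every coordinate must have a nonzero range).
    span = X.max(axis=0) - X.min(axis=0)
    log_uniform = -np.sum(np.log(span))

    if init_means is None:
        means = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    else:
        means = np.array(init_means, dtype=float)
    base_cov = np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)
    covs = np.stack([base_cov] * n_clusters)
    weights = np.full(n_clusters + 1, 1.0 / (n_clusters + 1))

    for _ in range(n_iter):
        # E-step: soft assignments computed in the log domain for stability.
        log_p = np.empty((n, n_clusters + 1))
        log_p[:, 0] = np.log(weights[0]) + log_uniform
        for k in range(n_clusters):
            log_p[:, k + 1] = (np.log(weights[k + 1])
                               + log_gaussian(X, means[k], covs[k]))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixing weights, means, and covariances.
        counts = resp.sum(axis=0) + 1e-12
        weights = counts / counts.sum()
        for k in range(n_clusters):
            r = resp[:, k + 1]
            means[k] = r @ X / counts[k + 1]
            diff = X - means[k]
            covs[k] = ((r[:, None] * diff).T @ diff / counts[k + 1]
                       + reg * np.eye(d))

    return weights, means, covs, resp
```

Calling `em_gaussian_uniform(X, 2)` on an array `X` of shape `(n, d)` returns the mixing weights, the component means and covariances, and an `(n, 3)` matrix of soft assignments whose first column is the posterior probability that a point is an outlier.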
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Citation: Pages, A., Giannakis, G.B., López, R., Gimenez, P. Robust clustering of data collected via crowdsourcing. In: IEEE International Conference on Acoustics, Speech, and Signal Processing. "ICASSP 2017 - 42nd IEEE International Conference on Acoustics, Speech and Signal Processing: March 5-9, 2017: New Orleans, USA: Proceedings book". New Orleans: 2017, p. 4014-4018.