Clustering media items stemming from multiple social networks
Rights access: Open Access
We have created and evaluated an algorithm capable of deduplicating and clustering exact- and near-duplicate photo and video media items that are shared on multiple social networks in the context of events. This algorithm works in an entirely ad hoc manner without requiring any pre-calculation. When people attend events, they increasingly share event-related media items publicly on social networks so that their contacts can witness and relive those events. In the past, we have worked on methods to accumulate such public user-generated multimedia content in order to summarize events visually, for example, in the form of media galleries or slideshows. In this paper, first, we introduce social-network-specific reasons and challenges that cause near-duplicate media items. Second, we detail an algorithm for the task of deduplicating and clustering exact- and near-duplicate media items stemming from multiple social networks. Finally, we evaluate the algorithm's strengths and weaknesses and thoroughly compare its performance with the state-of-the-art feature detection algorithms SIFT, ASIFT and SURF. We show that, for the given use case, it performs almost equally well in terms of accuracy but strongly outperforms them in terms of speed.
Citation: Steiner, T., Verborgh, R., Gabarro, J., Mannens, E., Van de Walle, R. Clustering media items stemming from multiple social networks. "The Computer journal (paper)", 27 September 2015, vol. 58, no. 9, p. 1861-1875.