Speech/music audio classification for publicity insertion and DRM
Tutor / director: Tarrés Ruiz, Francisco
Document type: Master thesis
Rights access: Open Access
The goal of this project is to develop, implement and optimize an algorithm based on an existing method called Continuous Frequency Activation (CFA). The aim is to address the discomfort caused to the viewer when advertising is inserted at random points in TV programmes, films, audio podcasts, etc. The basic idea is to avoid inserting adverts in the middle of a conversation. The final insertion criteria will be selected taking into account video metadata (change of shot, change of scene, fade-out, etc.) and audio metadata (voice, music). To do this, we have developed an algorithm capable of discriminating between music and voice. This algorithm has been developed exclusively for this purpose and does not require a training database. Before creating the algorithm, different existing methods for discriminating between music and voice were studied and their pros and cons were analysed. Following this study, the method selected was Continuous Frequency Activation (CFA). CFA is among the methods with the best statistical results, and it does not require large databases for training. The algorithm has been implemented in MATLAB®. The trials used a database covering five audio classes: classical music, blues, electronic music, jazz and speech. The audio files for each class were edited with the Audacity® software. After performing all the tests, it can be said that the developed algorithm works correctly and is able to distinguish music from voice in a very high percentage of cases (97.55%). Given the results obtained in the trials, this method could be used by companies working in media, television (Antena 3, Telecinco, etc.) and/or audio podcasting. The goal is to automatically insert advertising into audio podcasts at the most appropriate moment.
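The abstract does not describe the internals of CFA, so the following is only a minimal Python sketch of the general idea as it is usually described in the literature, not the thesis's MATLAB implementation. The underlying observation is that music tends to contain sustained partials that stay active at the same frequency over many frames, whereas speech does not. The function name `cfa_score`, all parameter values (frame length, hop, 21-bin smoothing, 3 dB binarization threshold, five peaks) and the simplified peak measure are assumptions made for illustration. The sketch also scores a whole clip at once instead of working on short blocks of frames, and the two test signals are synthetic stand-ins rather than real music or speech.

```python
import numpy as np

def cfa_score(signal, frame_len=1024, hop=512, smooth_bins=21,
              threshold_db=3.0, n_peaks=5):
    """Simplified CFA-style score for one clip (all parameters are assumed values)."""
    # 1. Short-time power spectrum in dB (Hann-windowed frames).
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power_db = 10.0 * np.log10(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + 1e-12)

    # 2. Emphasize local spectral peaks: subtract a running mean over frequency.
    kernel = np.ones(smooth_bins) / smooth_bins
    local_mean = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, power_db)

    # 3. Binarize: a bin is "active" if it stands out from its neighbourhood.
    active = (power_db - local_mean) > threshold_db

    # 4. Frequency activation: fraction of frames in which each bin is active.
    #    Edge bins are dropped because the running mean is unreliable there.
    half = smooth_bins // 2
    activation = active.mean(axis=0)[half:-half]

    # 5. Quantify peaks: height of each local maximum above the lowest
    #    activation in its neighbourhood (a simplified peak measure).
    peaks = []
    for k in range(1, len(activation) - 1):
        if activation[k] >= activation[k - 1] and activation[k] > activation[k + 1]:
            lo, hi = max(0, k - half), min(len(activation), k + half + 1)
            peaks.append(activation[k] - activation[lo:hi].min())

    # 6. Score: sum of the strongest peaks; higher suggests sustained partials.
    return float(sum(sorted(peaks, reverse=True)[:n_peaks]))

# Synthetic stand-ins: a steady harmonic tone (music-like) and white noise
# (no sustained partials). Real speech and music would replace these.
sr = 16000
t = np.arange(4 * sr) / sr
tone = sum(np.sin(2 * np.pi * 440 * h * t) for h in range(1, 7))
noise = np.random.default_rng(0).standard_normal(t.size)

tone_score = cfa_score(tone)
noise_score = cfa_score(noise)
print("tone:", tone_score, "noise:", noise_score)
```

In a classifier built on this idea, the score would be compared against a threshold chosen on labelled material; that threshold is not given in the abstract and is left open here.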