Doctoral thesis (Dissertations and theses)
A Combining Approach to Cover Song Identification
Osmalsky, Julien
2017
 

Files


Full Text
phd-thesis-osmalskyj.pdf
Author postprint (19.45 MB)
Annexes
phd-presentation-osmalskyj.zip
Publisher postprint (25.59 MB)

Details



Keywords :
Cover Song Identification; Combining; Music Information Retrieval; MIR; Audio Features; Rank Aggregation; Million Song Dataset; Second Hand Song Dataset; QMax
Abstract :
This thesis addresses the problem of determining whether two songs are different versions of each other. Known as cover song identification, this task is challenging because different versions of the same song can differ in pitch, tempo, voicing, instrumentation, structure, etc. Our approach differs from existing methods by considering as much information as possible to identify cover songs. More precisely, we consider audio features spanning multiple musical facets, such as tempo, duration, harmonic progression, musical structure, and the relative evolution of timbre, among others. To that end, we evaluate several state-of-the-art systems on a common database of 12,856 songs, a subset of the Second Hand Song dataset. In addition to evaluating existing systems, we introduce our own methods, which are based on global features and use supervised machine learning algorithms to build a similarity model. To evaluate and compare the performance of 10 cover song identification systems, we propose a new, intuitive evaluation space based on the notions of pruning and loss. This evaluation space represents the performance of the selected systems in two dimensions. We further demonstrate that it is compatible with standard metrics such as the mean rank, the mean reciprocal rank, and the mean average precision. Using our evaluation space, we present a comparative analysis of the 10 systems. The results show that few systems are usable in a commercial setting: the most efficient one identifies a match at the first position for 39% of the analyzed queries, which corresponds to 4,965 songs. In addition, we evaluate the systems when they are pushed to their limits, by analyzing how they perform when the audio signal is strongly degraded. To improve the identification rate, we investigate ways of combining the 10 systems.
We evaluate rank aggregation methods, which aggregate ordered lists of similarity results to produce a new, better ordering of the database. We demonstrate that such methods produce improved results, especially for early pruning applications. In addition to evaluating rank aggregation techniques, we study combination through probabilistic rules. As the 10 selected systems do not all produce probabilities of similarity, we investigate calibration techniques that map scores to relevant posterior probability estimates. After calibration, we evaluate several probabilistic rules, such as the product, sum, and median rules. We further demonstrate that a subset of the 10 initial systems performs better than the full set, showing that some systems are not relevant to the final combination. Applying a probabilistic product rule to a subset of systems significantly outperforms any individual system on the considered database. In terms of direct identification (top-1), we achieve an improvement of 10% (5,460 tracks identified), and in terms of mean rank, mean reciprocal rank, and mean average precision, we improve the performance by 40%, 9.5%, and 12.5%, respectively, with respect to the previous state-of-the-art performance. We further implement our final combination in a practical application, named DISCover, which lets a user select a query and listen to the produced list of results. While a cover is not systematically identified, the produced list of songs is often musically similar to the query.
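The product, sum, and median rules named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the track identifiers, and the probability values below are hypothetical, and the scores are assumed to be already calibrated into probabilities, one per system, for a single query.

```python
from math import prod
from statistics import median


def combine(probabilities, rule="product"):
    """Fuse the calibrated probabilities that several systems assign
    to one (query, candidate) pair into a single score."""
    if rule == "product":
        return prod(probabilities)
    if rule == "sum":
        # Dividing by the number of systems does not change the ranking.
        return sum(probabilities) / len(probabilities)
    if rule == "median":
        return median(probabilities)
    raise ValueError(f"unknown rule: {rule}")


def rank_candidates(scores, rule="product"):
    """Return candidate identifiers sorted by decreasing fused score.

    `scores` maps each candidate to a list of probabilities,
    one per system (hypothetical input layout)."""
    fused = {cand: combine(probs, rule) for cand, probs in scores.items()}
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical calibrated outputs of three systems for one query.
scores = {
    "track_a": [0.90, 0.20, 0.80],
    "track_b": [0.60, 0.60, 0.60],
    "track_c": [0.10, 0.30, 0.20],
}

for rule in ("product", "sum", "median"):
    print(rule, rank_candidates(scores, rule))
```

With these made-up values, the product rule ranks track_b first because a single low probability pulls down track_a, whereas the sum and median rules rank track_a first. This shows how the choice of rule can change the final ordering; it says nothing about which rule performs best on real data.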
Disciplines :
Computer science
Author, co-author :
Osmalsky, Julien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Language :
English
Title :
A Combining Approach to Cover Song Identification
Defense date :
05 October 2017
Number of pages :
187
Institution :
ULiège - Université de Liège
Degree :
Doctorat en Sciences Informatiques
Promoter :
Embrechts, Jean-Jacques ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore)
Van Droogenbroeck, Marc  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Secretary :
Geurts, Pierre ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Jury member :
Dixon, Simon
Peeters, Geoffroy
Dupont, Stéphane
Available on ORBi :
since 10 October 2017

