We report on experiments aimed at improving the performance of a music similarity measure that we introduced in Aucouturier and Pachet 2002. The technique compares music titles on the basis of their global “timbre”, a capability with many applications in the field of Music Information Retrieval. Such measures of timbre similarity have attracted growing interest lately, and every contribution (including ours) is yet another instantiation of the same basic pattern recognition architecture, differing only in algorithm variants and parameters. Most give encouraging results with little effort and imply that near-perfect results could be reached simply by fine-tuning the algorithms' parameters. However, such systematic testing over large, inter-dependent parameter spaces is both difficult and costly, as it requires working on a whole, general meta-database architecture. This paper contributes in two ways to the current state of the art. First, we report on extensive tests over a large number of parameters and algorithmic variants, some already envisioned in the literature and some not. These tests lead to an improvement over existing algorithms of about 15% R-precision. Second, and most importantly, we describe many variants that, surprisingly, do not lead to any substantial improvement. Moreover, our simulations suggest the existence of a “glass ceiling” at an R-precision of about 65%, which probably cannot be overcome by pursuing such variations on the same theme.

