Top 10!
- ricky martin - 37.5232
- smash mouth - 43.5782
- steve winwood - 47.0957
- third eye blind - 47.3322
- korn - 48.7898
- fleetwood mac - 49.2755
- jennifer paige - 49.5098
- sugar ray - 49.724
- lionel richie - 51.0326
- mya - 51.204
Bottom 10!
- prince - 708.6668
- jamiroquai - 673.2426
- natalie imbruglia - 610.4856
- oasis - 493.2065
- bloodhound gang - 342.4461
- daft punk - 231.6445
- radiohead - 229.4721
- tool - 198.714
- miles davis - 197.7051
- frank sinatra - 184.6662
So, I dig deeper. The r-precision ranks were based on the top recommendations for each song, just what hubs (and anti-hubs) are best at mucking up. The list above is based on means of distances, which are particularly sensitive to outliers (which we have seen are usually badly modeled anti-hubs).
Let's take Jamiroquai. Below is a visualization of his inter-song log-distances (red = distant).
Looks like we have an outlier, and its name is "Picture Of My Life" from the epic album "Funk Odyssey". This track's hubness (using 100-occurrences) is 2, so it's easily considered an anti-hub. Taking a look at the "activation-gram", we see a weird section about 3.5 minutes in.
Clearly at least 7 of the 32 components were trained to solely model this part of the track. What is this strange musical section, you may ask? It turns out to be the silence between the end of the song and the beginning of the "hidden track", "So Good To Feel Real". You can even see that the second song is also not as well modeled as the first. Oh, the joys of content-based recommendation!
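For readers unfamiliar with the hubness number quoted above, here is a minimal sketch of a k-occurrences count, assuming a precomputed square matrix of song-to-song distances. The function name `k_occurrences` and the toy data are made up for illustration; this is not the code used for the numbers in this post.

```python
import numpy as np

def k_occurrences(dist, k=100):
    """For each song, count how many other songs list it among their k nearest neighbors."""
    n = dist.shape[0]
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)            # a song is never its own neighbor
    k = min(k, n - 1)                      # cannot have more neighbors than other songs
    nearest = np.argsort(d, axis=1)[:, :k]  # k closest songs for every row
    return np.bincount(nearest.ravel(), minlength=n)

# Toy example: four "songs" placed on a line, the last one far from the rest.
positions = np.array([0.0, 1.0, 3.0, 10.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
counts = k_occurrences(toy_dist, k=1)
print(counts)  # the isolated song at 10.0 is nobody's nearest neighbor
```

A song with a count near zero, like the last one here, is what I have been calling an anti-hub; a song with a very large count is a hub.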
After more looking around, it looks like most artists at the bottom of the list above have just one song in their set that, for some weird but reasonable reason (e.g. there's a Michael Jackson "song" which seems to just be a bonus voice-over included on the remastered edition of "Off The Wall"), doesn't fit with the others.
There's a statistic that's particularly good at weeding out these outlier songs: the median. It turns out, and makes sense, that the median intra-artist distances are more (de)correlated with the average r-precision: corr. coef. = -0.4195, p-value = ~0. And we indeed replace the suspect bottom of the above list with our familiar, typically inconsistent artists. So, the median is good and I am a fan (although without first using the mean I would have never listened to those hot Jamiroquai tracks).
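To make the mean-versus-median point concrete, here is a small sketch with made-up numbers, assuming each artist is summarized by the pairwise distances between their songs. The helper `intra_artist_spread` is hypothetical, and the real distances in this post come from the song models, not from points on a line.

```python
import numpy as np

def intra_artist_spread(dist):
    """Return (mean, median) of the pairwise distances between one artist's songs."""
    upper = np.triu_indices(dist.shape[0], k=1)  # each pair once, no self-distances
    pairs = dist[upper]
    return float(np.mean(pairs)), float(np.median(pairs))

# Toy artist: four consistent songs plus one outlier (think hidden track after silence).
positions = np.array([0.0, 1.0, 2.0, 3.0, 100.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
mean_d, median_d = intra_artist_spread(toy_dist)
print(mean_d, median_d)  # the mean is dragged up by the outlier, the median barely moves
```

One bad track out of five is enough to push the mean well beyond anything the four consistent songs would produce, while the median stays close to the typical distance between them.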
I'd also like to counter what you may be asking: why not use better data? I could, and have often thought about it, but the uspop collection is something of a standard, which makes it easier for anyone to cross-check my work against their own. Besides, input problems like the ones shown here are realistic problems any good recommendation engine should be able to handle.
2 comments:
Mark:
I'm enjoying your blog. Keep it up!
Rebecca Fiebrink and I (well mostly Rebecca) worked on developing a metric for objectively evaluating music similarity systems using a combination of the artist distance, genre distance, and their ratios (giving a number that represents how well artists cluster within genres and how well genres cluster within the entire collection). We wrote it up, but never published it. You may find it interesting. I'd be happy to send you a copy if you think it might be useful.
Paul,
Thanks for reading and the encouragement!
I would really like to see that paper. I was thinking of incorporating some genre information at some point or some other "ground truth" metrics (like musicseer data) and would love to see some additional ideas.