Inspired by Spotify’s year in music feature (I wrote a post on it as well), I decided to analyze some music-related data that I had at my disposal. The data I chose was the list of all the artists I’ve seen live (78 at the time of this analysis).
There were two things that I wanted to surface from this data:
- Which genres of music have I seen the most live?
- Which artists should I see next, based on the artists I’ve already seen?
To answer both these questions I decided to use the Echo Nest API. And Python. All the code I wrote to analyze the data can be found here. I wrote this code when I should have been sleeping so the quality is not the best. Oh well.
About halfway through writing the code, I decided that generating a word cloud for the first question would be cooler than simply listing the top genres. After failing miserably to get word_cloud working on my machine, I used an online word cloud generator instead. Here’s the resulting word cloud:
To answer the second question, I fetched the list of similar artists for each artist I’ve seen live, removed the artists I’ve already seen, and counted how many times each unseen artist was listed as a similar artist. Here are the top recommendations generated by my algorithm (format: <artist, number of times listed as similar artist>):
- Swedish House Mafia, 5
- The Raconteurs, 4
- Cut Copy, 3
- Beach Fossils, 3
- Kaiser Chiefs, 3
- Iron Maiden, 3
- Dio, 3
- Ellie Goulding, 2
- Black Sabbath, 2 (seeing them in September)
- Animals as Leaders, 2
My recommendation algorithm is extremely simple but produced surprisingly good results.
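For the curious, here is a minimal sketch of that counting approach. It is not the code I actually wrote: the function name `recommend`, the `similar_to` dictionary, and the sample artist names are all made up for illustration, and the similar-artist lists are assumed to have been fetched from the API already. It also assumes artist names match exactly, including case.

```python
from collections import Counter


def recommend(seen, similar_to, top_n=10):
    """Rank unseen artists by how many seen artists list them as similar.

    seen       -- iterable of artist names already seen live
    similar_to -- dict mapping an artist name to a list of similar artist
                  names (hypothetical shape; assumed to be fetched earlier)
    """
    seen_set = set(seen)
    counts = Counter()
    for artist in seen_set:
        # set() so a candidate listed twice for one artist counts only once
        for candidate in set(similar_to.get(artist, [])):
            if candidate not in seen_set:  # drop artists already seen
                counts[candidate] += 1
    # Highest counts first; returns a list of (artist, count) tuples
    return counts.most_common(top_n)


# Made-up sample data, only to show the shape of the input
similar_to = {
    "Artist A": ["Artist X", "Artist Y", "Artist B"],
    "Artist B": ["Artist X", "Artist A"],
    "Artist C": ["Artist X", "Artist Y"],
}
seen = ["Artist A", "Artist B", "Artist C"]

print(recommend(seen, similar_to))
```

With this sample data, "Artist X" is listed by all three seen artists and "Artist Y" by two, so they come out on top, while "Artist A" and "Artist B" are filtered out because they are already in the seen list.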
The Echo Nest API is incredible.
P.S. I tried using pyechonest, but there didn’t seem to be a way to retrieve artist genre information, which is why I decided to call the Echo Nest API directly.