How Do Programs like iTunes Genius and Pandora Work?
My music tastes aren’t really mainstream, and I usually find the talk and commercials on the radio annoying. So the question is: how do I find new music? Do I spend hours listening to short clips of songs on iTunes? Do I surf the web and check out the top 100 hits? Do I try out Pandora, where they don’t even have a lot of my favorite artists? How do iTunes Genius and Pandora work? Can I figure out a way to make them work better for me?
So say you use iTunes. Genius looks at what songs you have and how often you play them, and it compares them to the collections of other iTunes users. Statistics are generated for each song. Apple engineer Erik Goldman says, "These statistics are computed globally at regular intervals and stored in a cache." And then they apply fancy algorithms.
What really counts, though, is how they define the weighting factor that gives more weight to the things that really matter. The usual way to do this is with tf-idf, or term frequency–inverse document frequency.
Wikipedia says:
“the tf–idf weight (term frequency–inverse document frequency) is a weight often used in information retrieval and text mining. This weight is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. The importance increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word in the corpus. Variations of the tf–idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document's relevance given a user query. Tf-idf can be successfully used for stop-words filtering in various subject fields including text summarization and classification.”
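To make the definition concrete, here is a minimal sketch of tf-idf applied to music libraries instead of text. It assumes each user's library plays the role of a "document" and each song plays the role of a "word", with play counts standing in for word counts. That mapping, the function name `tf_idf`, and the sample users and songs are all made up for illustration; Apple has not said this is how Genius does it.

```python
import math

def tf_idf(libraries):
    """Return {user: {song: weight}} given {user: {song: play_count}}.

    Assumption: a library is treated like a document and a song like a term.
    """
    n_libraries = len(libraries)

    # Document frequency: in how many libraries does each song appear?
    doc_freq = {}
    for plays in libraries.values():
        for song in plays:
            doc_freq[song] = doc_freq.get(song, 0) + 1

    weights = {}
    for user, plays in libraries.items():
        total_plays = sum(plays.values())
        weights[user] = {}
        for song, count in plays.items():
            # Term frequency: share of this user's plays that went to this song.
            tf = count / total_plays if total_plays else 0.0
            # Inverse document frequency: songs that everybody owns score low.
            idf = math.log(n_libraries / doc_freq[song])
            weights[user][song] = tf * idf
    return weights

# Hypothetical play counts, not real data.
libraries = {
    "ann": {"Hit Single": 10, "Rare B-Side": 10},
    "bob": {"Hit Single": 5},
    "cem": {"Hit Single": 8, "Other Song": 2},
}
weights = tf_idf(libraries)
print(weights["ann"])
```

In this toy data everyone owns "Hit Single", so its weight drops to zero for every user, while "Rare B-Side" keeps a positive weight for the one person who plays it. That is the "offset by the frequency in the corpus" part of the definition at work.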
OK, tf-idf is easy for documents that are full of words, but music is different, right? You don’t like a song because it uses the word ‘cloud’ or something, do you? No, it’s the sound of the music, the rhythm, the instruments and other factors. Apple, being Apple, isn’t really forthcoming about how they determine these factors. Pandora is more open about how they pick songs.
Pandora has the Music Genome Project which is an effort to "capture the essence of music at the fundamental level". Pandora uses almost 400 attributes to describe songs and a complex mathematical algorithm to organize them. Songs listed on Pandora are represented by a vector (a list of attributes) containing approximately 400 "genes".
The genes correspond to musical characteristics. The songs are analyzed by musicians who follow in-house standards and assign the musical characteristics. It takes 20-30 minutes to analyze each song.
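Once every song is a vector of gene scores, "similar songs" can be found by comparing vectors. The sketch below uses cosine similarity, which is one common way to compare vectors. Pandora does not publish its actual algorithm, so the choice of cosine similarity, the five gene names, the numeric scores, and the song names are all assumptions made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical genes (a real song would have roughly 400 of them).
GENES = ["vocal_centric", "acoustic_instrumentation", "major_key",
         "syncopation", "vocal_harmony"]

# Hypothetical scores, one per gene, in the same order as GENES.
songs = {
    "song_a": [5, 3, 4, 2, 4],
    "song_b": [4, 3, 5, 2, 3],
    "song_c": [1, 0, 1, 5, 0],
}

def most_similar(seed, catalog):
    """Return the name of the song whose vector is closest to the seed's."""
    others = (name for name in catalog if name != seed)
    return max(others, key=lambda name: cosine_similarity(catalog[seed], catalog[name]))

print(most_similar("song_a", songs))
```

With these made-up scores, song_b comes out as the closest match to song_a because their gene profiles point in nearly the same direction, while song_c is dominated by a different gene.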
According to a quick spreadsheet that I made using some information from Pandora, the songs that I like are likely to have: vocal-centric aesthetic, mixed acoustic and electric instrumentation, major key tonality, mild rhythmic syncopation, and vocal harmony. Of course, they don’t have Turkish music like Tarkan or Gripin. It seems that Tarkan’s Come Closer album features modern R&B styling, melodic songwriting, call and answer vocal harmony (antiphony), and electronica influences. Unfortunately, I could not find a list of attributes for my favorite song on the album, If You Only Knew. What can I say, it’s a waltz; I am hopelessly romantic.
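The spreadsheet tally is easy to reproduce in a few lines of code. This is a rough sketch, assuming you have typed in the attribute list for each song you like. The song names and their attribute lists here are placeholders, not real Pandora data, and `common_attributes` is a made-up helper name.

```python
from collections import Counter

# Placeholder data: attributes copied by hand for each liked song.
liked_songs = {
    "song_1": ["vocal-centric aesthetic", "major key tonality", "vocal harmony"],
    "song_2": ["vocal-centric aesthetic", "mild rhythmic syncopation",
               "major key tonality"],
    "song_3": ["vocal-centric aesthetic",
               "mixed acoustic and electric instrumentation"],
}

def common_attributes(songs, min_share=0.5):
    """List (attribute, share of songs) pairs, most common first.

    Only attributes found in at least min_share of the songs are kept.
    """
    # set() guards against the same attribute being listed twice for one song.
    counts = Counter(attr for attrs in songs.values() for attr in set(attrs))
    total = len(songs)
    return [(attr, n / total) for attr, n in counts.most_common()
            if n / total >= min_share]

profile = common_attributes(liked_songs)
for attr, share in profile:
    print(f"{share:.0%}  {attr}")
```

The attributes that survive the cutoff give a rough taste profile, which is the same thing the spreadsheet was doing by hand.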
The problem that I have with iTunes Genius is that if I use it to make a playlist based on, say, Gripin’s song Cok Kisa (Acoustic Version), I get a playlist of 100 songs, and the only Turkish music is by Gripin. This is actually great: I now know that they are similar to some of the older U2 music, Sting, Snow Patrol and others that I have. I can use this information to create a station on Pandora, which will then play songs that I haven’t heard before. Of course, this doesn’t help me find new Turkish music, which would be really nice, but maybe sometime in the future I will be able to.
When I base a Genius playlist on any song by Tarkan, the ONLY music I get is by Turkish artists, or I am told that there are not enough related songs to create a playlist. Urgh. I even get only Turkish artists when I am able to create a playlist using one of his English songs! (BTW, Start The Fire is the only song on the English album that Genius can create a playlist for.)
I really wanted to find more artists like Tarkan. (Yes, I know there can be only one...) But I can’t create a station on Pandora to look for music similar to Tarkan’s music because I don’t know what artists are similar. I will have to do more looking to see what else I can try. I wonder if you can create a station on Pandora based on attributes...?