Latest from MIT Tech Review – I Was There When: AI became the DJ

I Was There When is an oral history project that’s part of the In Machines We Trust podcast. It features stories of how breakthroughs and watershed moments in artificial intelligence and computing happened, as told by the people who witnessed them. In this episode we meet Gustav Söderström, who helped create algorithms aiming to understand our taste in music. 

Credits:

This episode was produced by Jennifer Strong, Anthony Green and Emma Cillekens. It’s edited by Michael Reilly and Mat Honan. It’s mixed by Garret Lang, with original music by Jacob Gorski. 

Full transcript:

[PREROLL]

[TR ID]

[MUSIC IN] 

Jennifer: People started using machines to help create music playlists long before apps and smartphones…  The top-40 format that radio stations use? It probably began in the 1950s, in Omaha, Nebraska, where a station manager noticed certain songs would be played over and over again on a local bar’s jukebox… (and if you’ve never seen one of those – it’s a machine that plays music…and lets people choose the songs they want to hear.)

The radio station used the list of songs compiled from this jukebox as the basis for their music playlist.

Then in the early 2000’s, anyone with an iPod and an iTunes account could become their own personal DJ… rearranging downloaded songs to match whatever theme they liked. 

These days, half a billion people subscribe to music streaming services… that recommend music you might like and curate personalized playlists based on your listening habits. 

Spotify is by far the biggest player in this space, with a third of the market, thanks in no small part to the sophistication of its music recommendation algorithms. 

I’m Jennifer Strong, and this is I Was There When—an oral history project featuring the stories of breakthroughs and watershed moments in AI and computing, as told by those who witnessed them.

This episode, we meet someone behind Spotify’s recommendation engine… who helps algorithms understand our taste in music and podcasts. 

Gustav Söderström: I’m Gustav Söderström. I’m the chief R&D officer at Spotify. And I was there when AI changed the way we consume music. It didn’t happen in a single moment, but rather it took years of work and iterative development.

Even if you grew up in the pre-internet days of music discovery, like I did, it’s hard to remember exactly how difficult it used to be to find new music. Now, all you have to do is press a play button and you’ll get a stream of songs tailored to your exact taste. But in the past, you had to put in some work. Back when I was an engineering student at Stockholm’s Royal Institute of Technology in the early 2000’s, I remember my hardcore music nerd friends combing through record stores, looking for that new Depeche Mode album. Personally, I was into slightly heavier synths, like Front 242. And they loved it. Which is why at Spotify, for the first few years, we didn’t see just how pivotal machine learning powered music recommendations would come to be. Around 2010, I distinctly remember sitting in the office with Daniel Ek, Spotify CEO and founder, and Oskar Stål, another early Spotifyer, talking about what was next for Spotify.

And at the time, we actually all agreed that recommendations weren’t really core to our product. We cared about music discovery, but as far as we were concerned, Spotify was already great at it. When Spotify first launched in 2008, it was a game changer for music aficionados. We had created the perfect tool for someone with an encyclopedic knowledge of artists and genres who already keeps up with the latest releases and enjoys spending hours at a time looking through back catalogs and putting together carefully crafted mixes.

So we thought that all you really needed was a good search bar and an advanced playlisting feature. From there, you could soundtrack your own life perfectly. Right? Well, it turns out that even though everyone loves the rush of discovering a new favorite song, not everyone loves working that hard or has the time to work that hard to find it.

We also noticed the industry changing. While the first wave of the internet was about cataloging things that were offline and bringing them online, with a search bar to let people find what they wanted, the next wave was about recommending things that they didn’t even know they wanted yet. Like Pinterest, for example. Luckily for us, by 2011, we had something that barely anyone had at the time—hundreds of millions of user-created playlists.

Spotify was already arguably the largest music curation database in history. And that database has only grown larger since then. It now contains over 4 billion playlists. We knew that we could harness all that data to bring listeners music they never would’ve found on their own, but we didn’t quite know how yet. 

So we hired two machine learning engineers. For the better part of a year, they worked on a technology called collaborative filtering. Collaborative filtering is a type of recommendation algorithm that sweeps a data set and looks for patterns based on how often two items appear in the same set. In our case, that meant how often tracks appeared on the same user-generated playlist. If two songs showed up together on playlists over and over again, a collaborative filtering algorithm would deduce that A, the songs are somehow similar to one another. And B, if someone likes one of those songs, then they would probably like the other song too. 
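The co-occurrence logic Söderström describes can be sketched in a few lines. This is a toy illustration, not Spotify’s implementation; the playlists and track names are invented:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_scores(playlists):
    """Count how often each pair of tracks appears in the same playlist."""
    counts = defaultdict(int)
    for playlist in playlists:
        for a, b in combinations(sorted(set(playlist)), 2):
            counts[(a, b)] += 1
    return counts

def recommend(track, playlists, top_n=3):
    """Rank other tracks by how often they co-occur with `track`."""
    counts = cooccurrence_scores(playlists)
    scores = defaultdict(int)
    for (a, b), n in counts.items():
        if a == track:
            scores[b] += n
        elif b == track:
            scores[a] += n
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

playlists = [
    ["song_a", "song_b", "song_c"],
    ["song_a", "song_b"],
    ["song_b", "song_d"],
]
print(recommend("song_a", playlists))  # song_b ranks first: it co-occurs with song_a twice
```

Note that nothing here inspects the audio or genre of the tracks, only which ones sit together in playlists, which is exactly the blindness Söderström describes next.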

But collaborative filtering on its own wasn’t enough. The recommendation algorithm would often come up with absolutely amazing and unintuitive suggestions that few humans would’ve ever found. But it also made simple mistakes that no human would’ve ever made. The problem with collaborative filtering is that it doesn’t pay any attention to the content or style of the tracks themselves. Only whether they appear together. You could say that it’s blind in that way. In other words, collaborative filtering is only as good as the data it’s working off of, and our data didn’t always make sense. 

For example, around Christmas time, people often made playlists with Christmas carols and pop songs in the same set. So when you later put on Justin Bieber over spring break, the algorithm might deduce that you’re also in the mood for some Jingle Bell Rock, because they used to appear together back in December. This actually happened by the way. You can see how the system was kind of working in that the code and the models were functioning correctly, but not really working in that the system didn’t really understand the musical style or genre of what it was recommending because that factual human-defined type of information just wasn’t in the data. It only looked at what tracks appeared together. 

But through the acquisition of a company called The Echo Nest, which started as a spinoff from the MIT Media Lab, we managed to find that type of factual, human-understandable genre data, and combine it with our collaborative playlist-based algorithm. With this new combination of data sets, the irregularity started fading out and the recommendations improved drastically. I remember trying the radio feature where you could input a track as the seed, and then get hours of surprising, but still sonically and genre-consistent tracks served up. 
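One simple way to picture how human-defined genre metadata can correct co-occurrence-based suggestions is to filter candidates against the seed track’s genre. This is an assumption-laden sketch (the track names, scores, and genre labels are invented, and the episode doesn’t describe Spotify’s actual combination method):

```python
def filter_by_genre(seed, scored_candidates, genres):
    """Keep only co-occurrence candidates that share the seed track's genre.

    `scored_candidates` maps track -> co-occurrence score;
    `genres` stands in for Echo Nest-style human-understandable genre data.
    """
    seed_genre = genres.get(seed)
    ranked = sorted(scored_candidates, key=scored_candidates.get, reverse=True)
    return [t for t in ranked if genres.get(t) == seed_genre]

genres = {
    "pop_hit": "pop",
    "another_pop_song": "pop",
    "jingle_bell_rock": "christmas",
}
# Holiday playlists inflated the Christmas song's co-occurrence score...
scores = {"jingle_bell_rock": 9, "another_pop_song": 5}
# ...but the genre filter keeps it out of a springtime pop session.
print(filter_by_genre("pop_hit", scores, genres))  # ['another_pop_song']
```

The genre check discards the spurious December correlation while leaving the genuinely surprising same-genre discoveries intact.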

The machine finally worked, but the obvious next question was how to present this new tool to millions of Spotify users. Because Spotify was all about playlisting, not listening to radio channels. We played around with a number of design solutions, like highlighting related artists and searches, and building a Discover page. But none of it seemed to really drive significant usage. This was because listeners didn’t think of Spotify as a place to go for music recommendations yet. They had access to millions of songs, which they searched for and created personal playlists with. At no point did they suppose Spotify would help them discover new content. So in a sense, building out machine learning was the easy part. Making the recommendations actually reach users in a design that made sense to them turned out to be an unexpectedly difficult problem to solve. We didn’t make much progress until one of our annual hack weeks—when one group of employees came up with a rather simple and beautiful solution: a personalized playlist updated weekly, where all you had to do was press play. Just to reiterate how much of a facepalm moment this was… by this point, Spotify was made of billions of playlists, including playlists by some of the world’s best, most knowledgeable curators. Think music critics and DJs. It was one of the key features our users loved the most. One that they were intrinsically familiar with. But nobody had come up with the idea to create an algorithm that could construct playlists all by itself. Instead of asking users to try a new feature, we just asked them to keep doing what they were doing: listening to playlists. 

So in the summer of 2015, we launched our first fully algorithmically-generated playlist individualized for every single user: Discover Weekly. It was a critical juncture for our use of machine learning. And it was an overnight success. In just 10 weeks, Discover Weekly reached a billion streams. Discover Weekly transformed how we thought about music discovery. And over time, our recommendations continued to improve. We learned how to break down songs acoustically: applying even more elaborate machine learning techniques, like machine listening and classifiers that could hear a song’s tempo and beat, as well as the less describable elements contained in the pattern of the music, like energy. 


We had crossed a new threshold where the whole business model and product needed to be rethought based on machine learning. In just a few years, we expanded our discovery offering to include algorithmically generated playlists for nearly all different artists and genres, moods, use cases, habits, and interests. We also introduced playlists that leveraged the taste and expertise of our curators in order to add a human dimension to machine learning, a process we named “Algotorial.” Curators can label tracks as coffeehouse vibes, or songs to sing in the car. The kinds of traits that machines have a really hard time understanding. Then an algorithm picks individual songs from that giant pool of tracks and arranges them into a playlist just for you. So my coffeehouse vibes are going to be slightly different from your coffeehouse vibes. 
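The "Algotorial" step described above, picking from a curator-labeled pool per listener, could be sketched roughly like this. The pool, track names, and per-user affinity scores are invented for illustration; the episode doesn’t say how Spotify computes user preference scores:

```python
def algotorial_playlist(pool, user_affinity, length=5):
    """From a curator-labeled pool of tracks (e.g. 'coffeehouse vibes'),
    pick the tracks a given user has the highest affinity for.

    `user_affinity` maps track -> a per-user preference score (hypothetical).
    Tracks the user has no score for default to 0.0.
    """
    ranked = sorted(pool, key=lambda t: user_affinity.get(t, 0.0), reverse=True)
    return ranked[:length]

coffeehouse_pool = ["track_1", "track_2", "track_3", "track_4"]
alice = {"track_2": 0.9, "track_4": 0.7, "track_1": 0.1}
print(algotorial_playlist(coffeehouse_pool, alice, length=2))  # ['track_2', 'track_4']
```

Two listeners drawing from the same curated pool get different playlists because only the affinity scores differ, which is why one person’s coffeehouse vibes differ from another’s.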

And what was the key lesson for us? While machine learning has provided the opportunity for fantastic recommendations, to really bring it to market, designing the end product and understanding the user experience mattered just as much as the technology itself. Integrating machine learning completely upended the way music discovery worked on Spotify and represented a shift in the way music was offered to listeners around the world.

More importantly, however, it brought a whole new realm of possibilities for fans to connect with new artists they had never even heard of. Today, listeners discover new artists on Spotify close to 16 billion times every month. Now we’re bringing everything we’ve learned from developing machine learning powered music discovery to the discovery of podcasts and other types of audio.

And there’s no doubt, if it wasn’t for machine learning, my list of favorite artists and songs would look very different today. 

Jennifer: I Was There When… is an oral history project featuring the stories of people who witnessed or created breakthroughs in artificial intelligence and computing. 

Do you have a story to tell? Know someone who does? Drop us an email at podcasts at technology review dot com.

[MIDROLL]

[CREDITS]

Jennifer: This project was produced by me with Anthony Green and Emma Cillekens. We’re edited by Michael Reilly and our mix engineer is Garret Lang. 

Thanks for listening, I’m Jennifer Strong. 

[TR ID]
