Introduction: It’s the remix!

At the beginning of the Computational Musicology course, we were asked to choose a corpus for the portfolio. A key aspect of the corpus selection was to come up with a selection that allowed for meaningful comparisons and contrasts, to answer a specific research question. Since the ideal corpus size was around 50-100 tracks, I immediately thought of my playlist of around 80 remixes, which I started maintaining during the first COVID-19 lockdown in the Netherlands.

I started maintaining this playlist for myself because there are generally a large number of remixes for artists that I follow, but I was unfamiliar with most of them. Remixes are often an underappreciated part of an artist’s discography. Most often, a remix is made to adapt or revise a song for radio or nightclub play, as is stated beautifully, among other reasons, on Wikipedia. The remixes on my playlist are either extended remixes by the original artists/producers or remixes in which a significant amount of the song’s production is altered from the original. Remixes where the only difference is a new guest artist are common nowadays, but they are not included on the playlist. Listening to a lot of remixes and collecting those that I like was an interesting musical journey. Before diving deep down the “Singles and EPs” section of a lot of artists’ Spotify profiles, I allowed myself to be influenced by various threads in the popheads subreddit.

I will try to answer one main research question in this portfolio: What makes the remixes in my playlist different from their original recordings?

First, I take a look at the track-level Spotify features for both groups, then I dive a bit deeper with a few lower-level track analyses. For example, an expected change is the tempo of a remix, which is probably higher than that of the original track. Finally, I try to see if a remix can easily be detected by a classifier, based on this corpus.

Typical tracks:

  1. Katy Perry - Chained To The Rhythm (Oliver Heldens Remix)
    • This is quite a typical dance remix of a pop song, of which there are lots on Spotify. Producers like Oliver Heldens and R3HAB just love to put these out. This is the type of remix that does no harm but also doesn’t bring a lot (different) to the table.
  2. Lady Gaga & Ariana Grande - Rain On Me (Purple Disco Machine Remix)
    • This is an extended version of the original, but the beat is a bit more house. There are more extended versions/remixes of tracks on this playlist, so it’s quite typical.

Atypical tracks:

  1. NAO - In The Morning - Mura Masa Edit
    • Instead of making this track more energetic, this remix swaps the beat and keeps it low-key, with a very present filter applied to NAO’s vocals.
  2. Kesha - Praying (Frank Walker Remix)
    • This turns an acclaimed ballad into a dancefloor banger, with a good drop and build-ups. It may sound typical, but it feels so different and flips across genres.

A remix can change a lot of aspects of a song; an expected change is the tempo


Since remixes exist for a large number of reasons, and since one of the main reasons is making a song more suitable for the dancefloor, there is probably some change in tempo. One of the many features the Spotify API provides is the average tempo across a track.

In the violin plot, which shows the full tempo distribution for both categories, it’s apparent that for original recordings, the tempo for most tracks lies between 100 and 140 bpm, with a peak around 120 bpm. The tempo distribution for the remixes is a lot more concentrated, with a peak just slightly above 120 bpm. Most house music, and thus most remixes that fall into that genre, is around 128 bpm nowadays, which can be a significant factor in the overall tempo change.

Tempo alone doesn’t define how club-ready a track is. Other important factors are energy, danceability, and loudness, all of which are features in the Spotify API.

Other important factors regarding remixes: danceability, energy, and loudness


When thinking of dance tracks, one thinks of high-energy, danceable tracks, and the music being played loudly. Since Spotify has all of these features computed, we can look at the distribution.

According to the Spotify API documentation, energy is measured from 0.0 to 1.0 and “represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy”. Danceability also ranges from 0.0 to 1.0 and describes “how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity”. Loudness is the overall loudness of a track in decibels, averaged across the entire track.

Means for several features, per category
Category              Danceability   Energy   Loudness (dB)
Original recordings   0.665          0.718    -5.609
Remixes               0.678          0.740    -5.903

As visible in the boxplots and the table with means, remixes have just slightly higher energy and danceability, as one would expect. Overall, they are just a bit less loud than the original recordings.
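The per-category means in the table can be computed along these lines. This is only a minimal sketch: the rows below are made-up placeholder values rather than tracks from my corpus, and the dictionary keys simply mirror the Spotify feature names.

```python
from statistics import mean

# Hypothetical rows for illustration only; these are NOT tracks from the corpus.
tracks = [
    {"category": "Original recordings", "danceability": 0.60, "energy": 0.70, "loudness": -5.0},
    {"category": "Original recordings", "danceability": 0.70, "energy": 0.74, "loudness": -6.0},
    {"category": "Remixes", "danceability": 0.65, "energy": 0.80, "loudness": -5.5},
    {"category": "Remixes", "danceability": 0.75, "energy": 0.70, "loudness": -6.5},
]

FEATURES = ("danceability", "energy", "loudness")


def feature_means(rows, features=FEATURES):
    """Return {category: {feature: mean}} for the given track rows."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["category"], []).append(row)
    return {
        category: {f: mean(r[f] for r in members) for f in features}
        for category, members in grouped.items()
    }


means = feature_means(tracks)
for category, values in means.items():
    print(category, {f: round(v, 3) for f, v in values.items()})
```

Since loudness is in decibels, a plain arithmetic mean of the values is a simplification, but it matches how the table above is described.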

Big changes in energy point towards a remix being really different from the original


As we’ve seen, energy for remixes is just slightly higher on average, and the energy values are not distributed in an entirely different way. However, that doesn’t mean that there can’t be huge differences for an individual remix.

After listening to some of the tracks with a big difference in Spotify-measured energy, I agreed that those were the remixes in which a lot of elements were changed from the original.
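Ranking remixes by how much their energy differs from the original comes down to pairing each remix with its original and sorting on the absolute difference. A rough sketch, assuming each pair is a (title, original energy, remix energy) tuple; the titles and values here are hypothetical placeholders, not entries from the playlist:

```python
def rank_by_energy_change(pairs, top_n=10, largest=True):
    """Sort (title, original_energy, remix_energy) pairs by absolute energy change.

    Returns (title, signed_change) tuples; a positive change means the remix
    has the higher energy. Set largest=False to get the smallest changes first.
    """
    scored = [
        (title, round(remix - original, 3))
        for title, original, remix in pairs
    ]
    scored.sort(key=lambda item: abs(item[1]), reverse=largest)
    return scored[:top_n]


# Hypothetical energy values, for illustration only.
pairs = [
    ("Track A", 0.40, 0.85),
    ("Track B", 0.72, 0.70),
    ("Track C", 0.90, 0.55),
]

biggest = rank_by_energy_change(pairs, top_n=2)
smallest = rank_by_energy_change(pairs, top_n=1, largest=False)
print(biggest)
print(smallest)
```

Sorting on the absolute value means that a remix that is much calmer than its original ranks just as high as one that is much more energetic.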

Interestingly enough, Flume and Disclosure each have both a track of theirs that was remixed and a remix that they made themselves in this top 10.

And what about the smallest differences in energy?


After looking at the remixes with the biggest change in energy, it’s interesting to look at the remixes with the smallest change as well. The top 10 “least different” really don’t feel that different from the original.

Half of the songs in the top 10 are remixes that are mostly just extended versions of the original song, namely: Circus, Flames, Better When You’re Gone, Don’t Start Now, and Toxic. It makes sense that an extended version of a song is not drastically different from the original. Also of note: Mura Masa has a track in both the top 10 biggest and top 10 smallest changes in energy.

Emotional valence and arousal: do remixes shift the mood of a song?


This graphic shows the Spotify API-determined energy and valence for all tracks in the corpus. Most original recordings in the corpus were already quite high in energy, with almost no tracks in the lower half of the plot.

Looking at Russell’s circumplex model of affect, almost all tracks, whether remix or original, are in the upper half, between Sad, Nervous, Active, Enthusiastic, and Happy.

Looking at the remixes, the valence seems to move more towards the extreme values on both ends. Overall, there seems to be a slight shift to the upper-left, indicating that a portion of the remixes should feel more nervous.

When hovering over a point in the graphic, the exact values and song title are shown. The track’s counterpart in the other category gets highlighted as well. Highlights can be undone by double-clicking on any blank space in the graphic. Every point’s size stands for the track’s tempo, and its color for the mode (minor or major).

The need for a tempo, even when it’s not always accurate


In 2014, British pop artist Charli XCX released her sophomore album, SUCKER. The song “Need Ur Luv” features on the record, and has a similar vibe to BØRNS’ song “Electric Love”, which came out just a month before. Australian indie-pop artist Japanese Wallpaper made a remix of “Need Ur Luv”, which got an official release in 2018.

A snippet of Charli XCX - Need Ur Luv (Japanese Wallpaper Remix)

According to Spotify, the tempo of the remix is 216 beats per minute, which should make it quite a fast piece. As seen in the first tempogram, the different sections of the track lead to different strengths of the tempo octaves (the tempo multiplied or divided by powers of two). The parts with stronger percussion are stronger at 108 beats per minute, but it seems like the chimes threw off Spotify’s algorithm when it picked the tempo for the whole piece. In the second tempogram, all tempi are cycled into the most common range of 80-160 beats per minute, resulting in a very clear line at around 108 beats per minute. It’s not unfair to assume that the decision for the overall tempo of the piece also influences other metrics, like energy and danceability, which might be “inaccurate” for some pieces, even though only Spotify really knows how those are computed exactly.
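The idea of cycling a tempo into the 80-160 bpm range can be shown for a single value. This is just the arithmetic behind a tempo octave, not how the tempogram itself is computed; the function name and the half-open range are my own choices for the sketch.

```python
def fold_tempo(bpm, low=80.0, high=160.0):
    """Halve or double a tempo until it lies in the range [low, high).

    The range has to span at least one octave (high >= 2 * low), otherwise
    some tempi could never land inside it.
    """
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    if high < 2 * low:
        raise ValueError("range must span at least one tempo octave")
    while bpm >= high:
        bpm /= 2
    while bpm < low:
        bpm *= 2
    return bpm


print(fold_tempo(216.0))  # 108.0, the tempo octave below Spotify's estimate
print(fold_tempo(54.0))   # 108.0 as well
```

Both 216 and 54 bpm fold to the same 108 bpm, which is why the cycled tempogram shows one clear line instead of several weaker ones.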

Gonna be you, and me / It’s gonna be everything, restructured by Flume


Flume’s remix of Disclosure’s “You & Me” (featuring Eliza Doolittle) had the biggest change in energy in the corpus compared to its original recording. It’s also way more popular than the original: as of writing, the remix has 398 million Spotify streams, while the original has 19 million.

In the original recording, both self-similarity matrices show horizontal and vertical lines for novelty just before the chorus, where the song strips some elements. The timbre-based matrix shows a lot of blocks, each of which is a coherent part of the song. The chroma (pitch classes)-based matrix most clearly shows the different parts of the song. Besides the novelty lines, the parts just before and just after each line run parallel to their repetition later in the song. The difference between the first chorus and the second verse is visible as well, because the chorus stands out as a homogeneous block.

The Flume remix is not only different in energy; its structure is completely different as well. The verses are completely gone, and only a part of the chorus (and the part just before it) is used. New are the two drops the song now has. The chroma-based matrix shows a lot of parallel lines, which means that parts of the song are highly similar to later parts. But mostly, this matrix shows that there’s a high similarity between the two drop parts (the large dark blue squares) and between all parts that are not the drop. The timbre-based matrix more clearly shows the (mostly just musical) parts after the drops, and the verses before the drop are somewhat visible, due to the actual drop parts being darker.
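For readers unfamiliar with self-similarity matrices: every cell compares the features of one moment in the song with those of another moment. Below is a toy sketch that uses cosine distance on made-up three-dimensional frames. Both the distance measure and the data are assumptions for illustration; real chroma frames have twelve values, and the matrices in this portfolio were not produced with this code.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat all-zero (silent) frames as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def self_similarity(frames):
    """Compare every frame with every other frame; returns a square matrix."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical frames following an A A B A pattern.
frames = [
    [1.0, 0.0, 0.0],
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
]

ssm = self_similarity(frames)
for row in ssm:
    print([round(value, 2) for value in row])
```

Low distances away from the main diagonal (here between the first and last frame) are what show up as dark blocks and parallel lines when a part of the song returns.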

Dance, dance, dance: do remixes have a different timbre?


The Spotify low-level track analysis for each track holds information for each segment (a very small part, mostly less than 1 second, depending on the track), including chroma and timbre. We can use the timbre information to see if remixes could be using different types of sounds.

Every segment has timbre information, which holds the value for all the Spotify timbre coefficients. From all segments, a mean per coefficient for each song can be calculated. As seen here, when those means are visualized with a boxplot, the distribution of timbre coefficient values overall isn’t that different, especially for the later timbre coefficients. However, for coefficients 3, 4, and 6 we see a difference.
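Calculating the mean per coefficient for one song can be sketched as follows. I assume here that each segment is a dictionary with a "timbre" list, as in Spotify's audio analysis, and that every segment counts equally (weighting by segment duration would be an alternative). The example values are made up and use three coefficients instead of the full set to keep it short.

```python
def mean_timbre(segments):
    """Average each timbre coefficient over all segments of one track."""
    if not segments:
        raise ValueError("track has no segments")
    n_coeffs = len(segments[0]["timbre"])
    totals = [0.0] * n_coeffs
    for segment in segments:
        for i, value in enumerate(segment["timbre"]):
            totals[i] += value
    return [total / len(segments) for total in totals]


# Hypothetical segments, for illustration only.
segments = [
    {"timbre": [1.0, 2.0, 3.0]},
    {"timbre": [3.0, 4.0, 5.0]},
]

print(mean_timbre(segments))  # [2.0, 3.0, 4.0]
```

Repeating this for every track gives one value per coefficient per song, which is what the boxplots summarise for each category.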

Coefficient 3 roughly explains the flatness of a sound, and its values are a lot more spread out. Coefficient 4 explains how strong the “attack” of a sound is, which overall is higher. Coefficient 6 is more spread out, and lower for the remixes.

Can a song easily be classified as a remix based on Spotify’s features?


In the previous pages, some small, overall differences between the remixes in my playlist and the original recordings were already identified, namely in some timbre coefficients and in the tempo of the tracks. A human, of course, knows when something is a remix because most of them have the word “Remix” in the title. But can we find out just from the Spotify features?

To answer this question, a random forest was used to find the most important features. It’s visible that the most important features are:

  1. Instrumentalness
  2. Timbre coefficient 6
  3. Duration
  4. Timbre coefficient 4

We can use these features for a reduced k-NN classifier. The duration probably makes it easy to find the extended versions of songs in the playlist. Some remixes also have a (way) higher instrumentalness score, because they strip out some of the original’s vocals.
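To give an idea of what a k-nearest-neighbours classifier does with these four features, here is a bare-bones sketch. The choice of k = 3, the Euclidean distance, and all feature values are assumptions for the example; it is not the model behind the results in this portfolio. The vectors are assumed to be scaled already, because a raw duration in milliseconds would otherwise dominate the distance.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled vectors in the order:
# (instrumentalness, timbre coefficient 6, duration, timbre coefficient 4)
train = [
    ((0.0, 0.2, 0.1, 0.3), "original"),
    ((0.1, 0.3, 0.2, 0.2), "original"),
    ((0.1, 0.1, 0.0, 0.4), "original"),
    ((0.8, 0.7, 0.9, 0.8), "remix"),
    ((0.7, 0.9, 0.8, 0.6), "remix"),
    ((0.9, 0.8, 0.7, 0.9), "remix"),
]

print(knn_predict(train, (0.85, 0.8, 0.8, 0.8)))
print(knn_predict(train, (0.05, 0.2, 0.1, 0.3)))
```

In this toy data the two groups are far apart, so the classifier has an easy job. In the real corpus the groups overlap a lot more.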

On the left, the performance is shown. Out of 160 songs, 50 original recordings were classified correctly, along with 38 remixes. This leaves us with an accuracy of 55%, just slightly better than random chance. In conclusion, this probably isn’t a useful way to identify a remix. Performance is also influenced by the types of songs that are remixed, which in this case, sometimes, are already dance tracks.
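The accuracy follows directly from the counts mentioned above: the number of correctly classified songs divided by the total. As a small check (the function name is just for this example):

```python
def accuracy(correct_per_class, total):
    """Share of all songs that were assigned to the right category."""
    return sum(correct_per_class.values()) / total


# Counts as reported above: 50 originals and 38 remixes correct, out of 160 songs.
correct = {"original": 50, "remix": 38}
acc = accuracy(correct, total=160)
print(f"{acc:.0%}")  # 55%
```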

Conclusion

Even with a limited musicological background, looking at all the information provided by the Spotify API helped in gaining insights into the remixes that I like, and what makes them different from their original counterparts.

Taking a look at lower-level features and the visualisations of those, like the self-similarity matrix and tempogram, was interesting as well. I also feel like I now have a slight grasp of how Spotify determines its features.

Beforehand, I already knew why I added remixes to my remix playlist. I wanted longer versions of songs I knew or nice dance versions of songs. The data confirmed that my remixes on average are longer (which the playlist length also already shows beautifully) and that they’re a bit more energetic and danceable. It also showed things that I didn’t know but that make sense, for example that the remixes are slightly more “nervous”-feeling. Now it makes sense why I listen to the playlist mostly while working on assignments.

This playlist started as a passion project during the first lockdown, and it’s funny to see how it became my university project where I got to use techniques I already knew in (web) programming and even learned an entirely new language during another lockdown. I hope to enjoy some of the remixes in a club when that’s possible again. MEROL beautifully expressed this sentiment in “knaldrang”, so I’ll leave you with that.