How to pick the wrong music for soundtracks: choose by the lyrics.

A very common trend in many recent soundtracks: choosing the music according to the song lyrics – which, by definition, requires the use of songs. And songs, unbeknownst to many, are not the only musical form (interesting, isn’t it? “Song” has been used for roughly a century now as the standard word for any piece of music. A topic probably worth another post).

One example, amongst many (which I’ll keep anonymous, out of bon ton – although series aficionados might be able to spot it): I recently watched a TV series with an episode containing a very important scene, the death of a woman, scored with the well-known song “Ain’t No Sunshine” – even though (and this is the main problem), emotionally speaking, it didn’t connect with the scene. At all. Because, here’s the thing: lyrics are not music. Lyrics are words – another, completely autonomous, reality. So autonomous that you can make art out of lyrics alone: it’s called poetry.

Music has an emotional content of its own, independent from lyrics.

Lyrics might have words matching the context shown on screen. But the music, the actual sounds that compose it, is not guaranteed to do the same.

This is the very same reason why you don’t have to learn to listen to music: you just do it. You can listen to music you’ve never heard before, played on instruments you’ve never seen, by people you’ve never met, and like it – something that, for those who want to, happens on a daily basis. The same cannot be said of lyrics: you have to know the language and the sociological context. Share grammar, syntax, vocabulary, culture… There has to be a lot of shared common ground to make verbal communication possible. Because, again: language is made of abstract symbols, which receive meaning only when you share enough cultural context to understand what they refer to. Music is not: music is an experience – like watching a sunset or smelling freshly baked bread. You don’t need somebody to explain what it is: you just feel it.

The problem is rooted in the fact that those who choose the music on these occasions don’t have musical knowledge – and by “musical knowledge” I don’t mean being up to date on who’s at the top of the latest Spotify chart or trending on Google: I mean, given paper, pen and a piano, being able to write a symphony. Or a Hard Rock piece, or a Reggae one – being able to make music.

Google to the rescue.

So, the moment they were given the task of choosing the music, they were disarmed: where to even look? Because, to pick something from a topic as colossal as music, and make an informed choice, you have to understand how it works. A bit like when your car suddenly stops, you open the hood, and all you see is a tangle of mechanical things: chaos. You have no idea what’s what, or how to figure out a solution – whereas, when your mechanic does the same, he knows what he is looking at. And how to make sense of it, and fix it – he knows how to make order out of chaos (a treat for all of you Jungians out there).

So, back again to music: should I use violins? Maybe better plucked strings – a guitar? Maybe an electric one? How many strings…6? 7? 8? What tuning? Or maybe better synthesizers – analog? Analog modular? Digital? Digitally stabilized analog? Maybe virtual? …virtual analog, or software ROMplers (can we truly call them synthesizers, by the way?)? And so forth – and I could go on for a very long time, with all the infinite possibilities. A very simple task for someone who understands orchestration, composition, instrumental practice and all that makes music – but for someone who doesn’t? Hieroglyphics. Uncharted territory.

So, how to hack it? Simple: lyrics.

“This project is an epic movie with medieval-like imagery mixed with futuristic science fiction, and strong references to Norse mythology.”

Ah, simple: let’s Google what songs have lyrics about Norse mythology – there it is! Led Zeppelin!

Even though: does Led Zeppelin’s musical style match well with epicness, futuristic science fiction or medieval imagery…? Not at all. (This is another example, by the way.)

This also opens up another very interesting topic: why aren’t musicians chiming in to help? It’s not that the world doesn’t like music anymore: everyone likes music – even those who don’t know it yet. Musicians just need to show up and say: “This is what I do for a living: let me help you!”.

In conclusion

Whenever you have to choose music for a soundtrack, choose, first of all, music. Lyrics are a nice optional extra.

If you don’t, no amount of apologies or rationalizations is going to fix what you’ve broken: your brain is the one that likes music – whether you want it to or not. That’s why I made the example about sunsets and the smell of fresh bread: those are direct neural inputs that give you back feelings – they bypass your cognitive side entirely. Just like music does. And that’s why brains are surprisingly good at picking good music – until we decide to mess with them, of course (which is, more or less, the same principle I’ve written about in this other article of mine: https://www.linkedin.com/pulse/difference-between-rational-social-purchase-lorenzo-lmk-magario/ ).

The use of artificial intelligence in music (written by a musician)

I’m seeing more and more news about “use of AI in music”: let’s have a closer look at this topic – this time, with a musician around (me).

First things first: a disambiguation of what an “AI” is

There are two kinds of “artificial intelligence”: one true, one fake.

The “true” intelligence, the one with which you can have a nice chat and ask for opinions about a movie, is called “strong AI” (https://www.investopedia.com/terms/s/strong-ai.asp). Think of Star Trek’s Data and Star Wars’ C-3PO.

The “fake” intelligence, the one we actually have, and that you can see at work at its best in video games and parking gates, is called “weak AI” (https://www.investopedia.com/terms/w/weak-ai.asp). “Fake” because it isn’t actually an “intelligence” – as in: the capability for cognition and self-awareness. It’s just a long set of predetermined “if X then Y”. It cannot create anything original. It cannot feel – nor is it supposed to: it’s called “intelligence” as a handy label, not because it’s “actually intelligent”. No shock for anyone who has ever played video games (and surely never thought of having a conversation with a Quake 2 end-level boss just because it had “artificial intelligence”).
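To make that long set of predetermined “if X then Y” concrete, here is a toy sketch in Python; the states and thresholds are invented purely for illustration, with no real game engine implied.

```python
# Toy illustration of "weak AI" as a chain of predetermined "if X then Y" rules:
# a trivial video-game enemy. States and thresholds are invented for this
# example; there is no cognition here, just branching.
def enemy_action(health: int, player_distance: float) -> str:
    if health < 20:
        return "flee"      # if low on health, then run away
    if player_distance < 5.0:
        return "attack"    # if the player is close, then attack
    if player_distance < 20.0:
        return "chase"     # if the player is in sight, then chase
    return "patrol"        # otherwise, keep patrolling


print(enemy_action(health=80, player_distance=3.0))  # prints "attack"
```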

With this out of the way, let’s go further.

How is music born?

Music is born in the depths of our brain – the infamous limbic system. The one that creates emotions, which some of you might have heard called the “reptilian brain”. After these emotions are born, we can transform them into practical phenomena by interpolating them with our “rational side” – the grey matter. (And, by the way: this is the process we call “creativity”: using our rational side to shape emotions into tangible creations.)

As a quick reference:

  • Grey matter = information processing (reason).
  • Limbic system = emotions (primal instincts).

In practical terms, for music: the limbic system “generates the emotions”, the grey matter “rationalizes them into music”. These two systems are physically interconnected by the white matter. See the picture for a visual reference:

Music is, more or less, a physical sublimation of our emotions – which is, by the way, the definition of art.

No emotions = no music.

I guess you’ve found your answer already

Until machines are able to feel, they won’t be able to make music.

As of right now, our machines are able to pretend they feel – just as much as you can make your car pretend to be happy by drawing a smile on its hood – but they cannot actually feel. Which is fine: people do just fine at being people…

…or do they?

Why would you want to use AI to make music?

Now: this is an interesting question.

If people are perfectly able to make music, why would you want to use machines for it?

Very simple answer: not everyone has trained enough to acquire the talent necessary to make music. So they’re looking for ways to “cheat”: they hope to find tools that will do the job they didn’t do – as untalented people usually do (even though out of the “right” natural impulse: humans, deep in their nature, are built for survival – no matter what. And if cheating is what they need to make it out alive, that’s what they’re going to do. But, for civil living, it is necessary not to put ourselves into situations that will push us to harm others – like a pathological craving for the status symbol of “music maker”).

What weak AIs can actually help with in music

If you’re a music software company, this section is for you:

there are a lot of fields in which weak AIs can be used in music production.

For instance (but not limited to):

  • MIDI humanization (software to humanize MIDI patterns – seriously: why do we still have to do this damn thing all by hand?? It’s 2020! A minimal sketch of the idea follows this list.)
  • MIDI scrubbing (software to recognize potential MIDI recording errors – again: it is pretty easy to spot blatant errors, like a stray 1/128 note, and delete them automatically, rather than having someone go through the track and do it by hand)
  • MIDI drum arranger (software you can use to build drum patterns to your taste – drum patterns are often repetitive, yet every single beat has to be written by hand, because the software I’ve just described doesn’t exist! Why can’t we have a tool we can tell “8 crescendo hits on the snare drum, and then 2 hard hits, clockwise, starting from the 1st tom, on all toms”?)
  • And so on…
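As promised above, here is a minimal sketch of the first idea, MIDI humanization, assuming the open-source mido library; the file names and the jitter amounts are placeholders, not a finished product.

```python
# Minimal sketch of "MIDI humanization": add small random velocity and timing
# variations to note events so a programmed part sounds less robotic.
# Assumes the open-source `mido` library; file names and jitter amounts are
# placeholders.
import random

import mido


def humanize(path_in: str, path_out: str,
             velocity_jitter: int = 6, tick_jitter: int = 3) -> None:
    mid = mido.MidiFile(path_in)
    for track in mid.tracks:
        for msg in track:
            if msg.type == "note_on" and msg.velocity > 0:
                # Nudge the velocity a little, staying inside the MIDI range.
                jitter = random.randint(-velocity_jitter, velocity_jitter)
                msg.velocity = max(1, min(127, msg.velocity + jitter))
            if msg.type in ("note_on", "note_off"):
                # Nudge the delta time (in ticks); never let it go negative.
                msg.time = max(0, msg.time + random.randint(-tick_jitter, tick_jitter))
    mid.save(path_out)


humanize("stiff_take.mid", "humanized_take.mid")
```

A real tool would of course be smarter than random jitter (it would follow the groove, the accents, the tempo map), but even this toy version shows how little “intelligence” the job actually requires.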

I might write a separate post about this, because it’s a topic that really interests me.

What is HRTF? (Brief explanation)

Let’s get to know psychoacoustics.

HRTF stands for Head Related Transfer Function.

In other words: the frequency and phase response of our head. In fact, a transfer function is a mathematical formula that groups both data sets together.
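For readers who like formulas, here is the textbook way to see it (a general definition, not tied to any particular HRTF measurement): a transfer function is a complex-valued function of frequency whose magnitude is the frequency response and whose angle is the phase response.

```latex
% A transfer function groups both data sets into one complex function of frequency:
% |H(f)| is the frequency (magnitude) response, \varphi(f) is the phase response.
H(f) = |H(f)| \, e^{\, j \varphi(f)}
```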

Basically, it is the way in which our head changes the sounds that reach our eardrums.

These changes are dictated by the structure of our head: nose, forehead, mouth, hair, bone density, auricles… every feature of ours that the sound hits before reaching our eardrums. And, if the sound is coming from below, our shoulders too.

Every “obstacle” the sound strikes is going to faintly change it, altering its frequencies and phases.

Depending on where the sound comes from (in front, behind, above, below), it is going to encounter different obstacles – and undergo different acoustic alterations.

Our brain has finely memorized these peculiarities, and it takes advantage of them to understand which direction the sound is coming from.

That is the reason why, even with our eyes closed, we can still tell the position of a sound source.

The organs that influence these alterations the most are the auricles (the outer ears): all their folds are needed to extensively characterize the auditory changes, by having the sound “clash” against them.

(Now you finally understand why ears have such a “weird” shape, instead of simply being flat.)

Here’s an example of front (continuous line) and back (dotted line) HRTF.

Front and rear HRTF

If we apply the continuous line frequency response to a signal, our brain will understand that the source of the sound is in front of us.

If we apply the dotted line frequency response to a signal, our brain will understand that the source of the sound is behind us.
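In software terms, that “applying” is usually just a convolution. Here is a minimal sketch, assuming you already have a pair of head-related impulse responses (the time-domain counterpart of an HRTF) for the desired direction saved as WAV files; the file names are placeholders and no specific HRTF database is implied.

```python
# Minimal sketch: place a mono sound "in front of" or "behind" the listener by
# convolving it with a pair of head-related impulse responses (the time-domain
# counterpart of an HRTF). File names are placeholders; any measured HRIR set
# recorded at the same sample rate as the source would do.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve


def place_source(mono_wav: str, hrir_left_wav: str, hrir_right_wav: str,
                 out_wav: str) -> None:
    x, sr = sf.read(mono_wav)             # mono source signal
    h_left, _ = sf.read(hrir_left_wav)    # impulse response for the left ear
    h_right, _ = sf.read(hrir_right_wav)  # impulse response for the right ear
    left = fftconvolve(x, h_left)         # what reaches the left eardrum
    right = fftconvolve(x, h_right)       # what reaches the right eardrum
    n = min(len(left), len(right))
    out = np.stack([left[:n], right[:n]], axis=1)
    out /= np.max(np.abs(out)) + 1e-12    # simple peak normalization
    sf.write(out_wav, out, sr)


# Swap in the "front" or "back" HRIR pair to move the perceived position.
place_source("voice_mono.wav", "hrir_front_L.wav", "hrir_front_R.wav",
             "voice_in_front.wav")
```

Real decoders do much more than this (they interpolate between many measured directions and compensate for the headphones), but the core operation is essentially this convolution.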

Needless to say, every one of us has their own physical structure, and for this reason no two HRTFs are ever going to be identical. However, there is enough resemblance between all HRTFs to allow our brain to effectively interpret signals that have been processed with other people’s HRTFs.

Moreover, did you know that there are systems to record with HRTF information, so that the recording doesn’t just capture left and right (one-dimensional), but also above, below and to the sides (three-dimensional, or binaural)?

Here’s one of the most common ones: a dummy head.

 

Dummy head (Neumann KU100)

That is, a tool that simulates the shape of a human head, with microphones in place of the eardrums, so as to effectively record the HRTF information.

And it works pretty well: listening to a recording made with this tool feels like being on stage.

Software-wise, there are HRTF decoders that allow HRTF data to be applied to a signal, thus giving it 3D spatiality.

Another interesting implementation of HRTF is found in almost every pair of modern headphones: to try and avoid the “sound inside the head” effect, a frontal HRTF impression is imprinted into their response.

Which is also the reason why headphone frequency responses can’t be interpreted “with the naked eye”.

By the way, here’s an example: the frequency response of a professional pair of headphones.

http://www.headphone.com/

Even if the curve is quite irregular, the auditory result is going to be reliable anyway, given that the anomalies are due to a particular frontal HRTF impression.

Actually, not exactly frontal, since the typical position of two speakers is at the vertices of an equilateral triangle, with our head at the third vertex.

If you want to know why this tutorial was made, you’ll find out more in this post:

Our first post.

And you’ve got our entire website to hear for yourself whether we can actually do what we talk about.

We want to hear from you!

If you found this post useful, please share your experience with us on our social pages!
Maybe together with a link to what you’ve created, using our official hashtag #lmkmprod so we can find you all.

We’re looking forward to hearing from you!