AirPods Max
Jun 27, 2021 at 5:58 PM Post #4,471 of 5,629
Don't take machine learning for mastering tracks out of the equation just yet. It would be very similar to applying a "style" to an image (look up deep style transfer), granted that we're talking about a 1D signal vs. a 2D image. You can have ML learn the "style" in which music is mastered. The input to such a network would be the raw tracks and the output would be a single mixed track. The training data would pair the raw tracks that were used to generate the final master with the final master itself. Granted, we don't have access to this data, but I'm sure the record companies do (assuming they keep it all). If they did keep it, they'd have a wealth of data to train these models with.
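
To make the "raw tracks in, master out" idea concrete, here is a deliberately tiny sketch in Python. It is not how a real mastering network would look: the "model" is just one gain per stem, fitted by gradient descent, and the data is random noise mixed by a made-up "engineer" (`TRUE_GAINS`, `mix`, and `fit_gains` are all names invented for this example). It only shows the shape of the training pairs.

```python
import random

def mix(stems, gains):
    """Weighted sum of equal-length stems (each stem is a list of samples)."""
    return [sum(g * s[i] for g, s in zip(gains, stems)) for i in range(len(stems[0]))]

def fit_gains(pairs, n_stems, lr=0.05, epochs=200):
    """Fit one gain per stem by gradient descent on mean squared error.

    pairs: list of (stems, master) training examples.
    """
    gains = [1.0] * n_stems
    for _ in range(epochs):
        for stems, master in pairs:
            pred = mix(stems, gains)
            n = len(master)
            for k in range(n_stems):
                # d(MSE)/d(gain_k), using the prediction made before this update
                grad = sum(2 * (pred[i] - master[i]) * stems[k][i] for i in range(n)) / n
                gains[k] -= lr * grad
    return gains

# Hypothetical "engineer" who always mixes stem 0 at 0.8 and stem 1 at 0.3.
random.seed(0)
TRUE_GAINS = [0.8, 0.3]
pairs = []
for _ in range(20):
    stems = [[random.uniform(-1, 1) for _ in range(64)] for _ in TRUE_GAINS]
    pairs.append((stems, mix(stems, TRUE_GAINS)))  # (raw tracks, final master)

learned = fit_gains(pairs, n_stems=2)
print([round(g, 3) for g in learned])
```

With clean synthetic data the fitted gains land on the engineer's choices. Real masters involve EQ, compression, and placement, so the model would need to be vastly more expressive, but the (stems, master) pairing would stay the same.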

In terms of "put the cellos over there" and "pianos over there," the ML could learn that certain "types" of sounds are placed and mixed a certain way. Given enough examples, it would definitely learn this. You have to remember that a neural net starts out as a blank slate (unless you're doing transfer learning, in which case it doesn't), and it needs to learn what features exist, what patterns occur within those features, and how they connect. We can train an autoencoder to generate realistic-looking faces. There isn't a clearly defined "put the eyes next to each other" or "a nose below and between the eyes" rule built into the network, but it learns this as it sees more examples. These networks learn a general pattern and apply it. The autoencoder will learn that eyes have eyeballs and that eyeballs have dark circles with irises surrounding them.

One could try, for example, to make a model that's trained on recordings from just one genre or from one specific producer; I feel finding patterns for a smaller set of related things would be easier than trying to make one model to rule them all. But that would be the end goal. If you can get one model to work, then you could apply transfer learning to get the model to replicate other genres/producers.

The problem? Well, there is inherent risk involved. It's likely that a very complex pattern exists (the way humans do things does tend to follow certain patterns, even if there is a little variation from one work to the next). The question is what it costs to learn this model. Time? These models can take years to train, assuming there is hardware out there that is capable of doing it. And it's not only the training time but also the time it takes to tune the model, which is in itself still more of an art than a science (though there are approaches to doing this, like a genetic algorithm or some other form of optimization such as Bayesian optimization). Hardware costs are another huge thing. You can rent, but that will cost you a lot in the long term. You can buy, but that's not cheap either. Manpower and expertise... Who's going to do the work? They need to get paid too... It likely takes a team.

I feel like the two biggest factors right now are hardware and time. Getting a machine that can train a NN with this sort of data is going to be very costly, if not impossible. Without good hardware, training each model will take forever, and when optimizing and tuning a model you train many models, with that number growing combinatorially. I'll also admit there is a lot of luck involved. Note that as of this writing there isn't much hardware out there that can train a NN on raw genomic DNA... Even a bacterial genome is too large (either the space needed is too big or the time is too big). Basically, I don't think the hardware has caught up enough to do music yet.
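
For a rough feel of the "growing combinatorially" part, here is a small Python sketch that counts how many models a plain grid search would train. The search space and the 24 hours per model are invented numbers, purely for illustration.

```python
from itertools import product

# Hypothetical search space; the names and values are made up for illustration.
search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "num_layers": [4, 8, 16],
    "hidden_size": [256, 512, 1024],
    "batch_size": [16, 32, 64],
}

def grid_size(space):
    """Number of models a full grid search would have to train."""
    return sum(1 for _ in product(*space.values()))

def total_hours(space, hours_per_model):
    """Total training time if every combination is trained once, one after another."""
    return grid_size(space) * hours_per_model

n_models = grid_size(search_space)          # 3 * 3 * 3 * 3 combinations
hours = total_hours(search_space, 24)       # assuming one day of training per model
print(n_models, "models,", hours / 24, "days")
```

Every extra hyperparameter multiplies the count again, which is why smarter search strategies are attractive: they try far fewer combinations than the full grid.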



To the first one, Warner actually has a playlist of this (so does Apple). The Atmos recordings don't sound too different to the originals; they are the least altered ones IMO. Artists wouldn't be doing the custom mixes, the songs are owned by the record companies. They definitely have the budget to do this. Especially if there is backing from Amazon, Apple, Spotify, etc.

Edit: I honestly couldn't care less about 3D effects and stuff. They're cool and all... But I'd rather the record labels go back to mastering things quiet again instead of loud. The Dolby Atmos tracks for the most part accomplish this particular goal IMO. It's really the only reason why I like them. I hear more dynamics with the Atmos tracks vs. the original masters (on AAC). So yeah, forget the Atmos stuff... Just do the mastering in a way that doesn't completely destroy the dynamics of the music. Though I guess some people prefer their music loud.
I was wondering about that too when encountering the volume change in stereo vs Atmos comparisons. It did seem like stereo had everything crammed together in a small range of loudness (less dynamic range), whereas Atmos gave more room between instruments and vocals. That is, Atmos felt like I could hear a greater range, even if only a trick of staging separation, that allowed me to appreciate the qualities of instruments better at quiet vs loud volumes.
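
One rough way to put a number on "crammed together in a small range of loudness" is the crest factor (peak-to-RMS ratio). The Python sketch below is only an illustration: the test signal is a made-up swelling tone, and `loud_master` is a crude hard-clip-and-normalize stand-in for loud mastering, not what any real mastering chain does.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; a higher value means more room between average and peak level."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

def loud_master(samples, ceiling=0.3):
    """Crude 'loudness war' stand-in: hard-clip at a ceiling, then scale back up to full scale."""
    clipped = [max(-ceiling, min(ceiling, s)) for s in samples]
    peak = max(abs(s) for s in clipped)
    return [s / peak for s in clipped]

# Made-up test signal: a 440 Hz tone that swells from quiet to loud over one second.
rate = 8000
dynamic = [(0.1 + 0.9 * n / rate) * math.sin(2 * math.pi * 440 * n / rate)
           for n in range(rate)]
squashed = loud_master(dynamic)

print("dynamic  crest factor: %.1f dB" % crest_factor_db(dynamic))
print("squashed crest factor: %.1f dB" % crest_factor_db(squashed))
```

Both versions peak at roughly the same level, but the squashed one has a much higher average level and a smaller crest factor, which is the "everything is loud all the time" effect.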

The loudness alterations in particular got me thinking, as you seem to be saying, that the “loudness war” we’ve had in music for many years may be softening up. Maybe Atmos will help lead artists to focus more on impressing listeners with staging and space, rather than just blasting the dB higher on each side of the ears.

Or maybe we will just have a new wave of loudness wars, but within Atmos. Crap
 
Jun 27, 2021 at 6:06 PM Post #4,472 of 5,629
Well a lot of people like their music loud. Especially when it means that the "good" music is very low volume... If they already maxed out the volume slider, then they definitely will hate soft recordings. I do hope that more record labels start doing stuff softer, but I'm not holding my breath.
 
Jun 27, 2021 at 6:34 PM Post #4,473 of 5,629
I know it’s been like this for a long time — recording everything so loud. But really it’s perplexing. I mean, why and how did sound engineers decide that listeners turning up the volume a bit was suddenly too difficult a task to do?
 
Jun 27, 2021 at 7:19 PM Post #4,474 of 5,629
If I'm not mistaken, it was a result of radio and FCC regulations. The FCC set maximum volume levels for radio, and louder stuff "pops" more since it simply has more energy. So radio stations started asking for louder recordings, with each song trying to be louder than the last. Unfortunately, it wasn't the engineers' decision.

Note that FCC regulation applies to TV too if I'm not mistaken. This is the reason why the volume level doesn't alter drastically when going channel to channel or even TV show to commercial.
 
Jun 27, 2021 at 7:42 PM Post #4,475 of 5,629
All that is great, except that TV commercials DO GET LOUD, so much so that certain television manufacturers have a volume-leveling feature to prevent the pharmaceutical ads from blowing out your eardrums.
 
Jun 27, 2021 at 8:22 PM Post #4,476 of 5,629
This is anecdotal, but in my experience it used to be far worse before a US law about this issue was passed and eventually implemented. That said, I agree it still happens; the volume spikes are shorter and less frequent (to me), and it seems less obnoxious than it used to be. Less obnoxious but no less annoying.
 
Jun 27, 2021 at 9:07 PM Post #4,477 of 5,629
It's been a while since I watched TV to begin with, so I don't remember how bad the matching actually is. The other thing to note is that FCC can only deal with public transmissions. I don't remember if cable or satellite falls under that. I know my parents have a satellite at their house that gets them some Vietnamese channels; the volume matching from show to show and commercials to show and even channel to channel is horrendous.
 
Jun 28, 2021 at 2:46 AM Post #4,479 of 5,629
Good video. It describes well that there are way too many variables involved. IMHO.

Artist's original vision vs audio mangled with the new gimmick - pick one.

I would trust Apple more if their hardware could reproduce Hi-Res Lossless. But it can't. So them saying this is the next big thing (leapfrogging over lossless) seems totally disingenuous. How about getting lossless right first? No?

Oh well.

As for Spatial or Atmos somehow being a "cure" for the loudness wars: I am sorry, but that's a definite no. The audio is mangled using out-of-phase spatial simulation to simulate multi-speaker reproduction; this does not automatically equate to "more dynamic range" (especially if the starting point is an already dynamically lacking master).
 
Jun 28, 2021 at 3:01 AM Post #4,480 of 5,629
Let's not forget the 1950s-1970 45 RPM "singles" (vinyl records) played on jukeboxes.

There was fierce competition to sound "the best" (loudest) when played on a jukebox.
 
Jun 28, 2021 at 3:53 AM Post #4,481 of 5,629
Question, is Atmos already in an X.Y surround sound format (IE higher than stereo input)? If so, there aren't any cars (that I know of) that support higher than stereo, so you'd end up with the stereo output from the source anyways. I can see sound stage sounding like it's increased if the brain interprets the HRTF through the speaker system as being "bigger." To be honest, I've never taken any sort of audio that was done in a 3D-esque way through a speaker system. The closest I've done is a binaural track which never really sounds any bigger.

I was unsure if the source did the HRTF function or if the master had that already. I don't think there are multiple renders... That would make it very odd for downloading music (unless the render was done on the fly by the source) since when you download music you only get one version of the song (standard iTunes Plus, Lossless, Dolby Atmos) and if you want to play other versions you'd have to delete and redownload the song. Which render would you get when downloading Atmos? Spatial Audio version or standard headphone (I personally don't feel like there are two renders though).

The APP and APM will apply head tracking once that feature is available (iOS 15) which is really the only thing special to the APM.
I've been thinking quite a lot about how to describe Dolby Atmos in simple terms, and I think the best way to think about it is not as a codec but rather as extra positional information that the renderer can use to position a sound object in the sound field.

The reason I say this is that Dolby Atmos can be carried in various Dolby and non-Dolby transports/codecs in a backwards-compatible way. For example, you can have Dolby Atmos carried in Dolby TrueHD or Digital Plus. If you have a standard decoder with a standard 5.1 setup, you'd get TrueHD or Digital Plus the same way as before, but a Dolby Atmos renderer would use those 6 speakers to try to position the sound objects, simulating, for example, sounds moving overhead, which a standard 5.1 setup cannot do.
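
To illustrate the "positional information plus a renderer" idea, here is a toy Python sketch. To be clear, this is not Dolby's renderer or file format: `SoundObject`, `render_gains`, the azimuth convention, and the speaker angles are all assumptions made up for the example. It only shows how the same object metadata can be turned into different speaker gains depending on the playback layout.

```python
import math
from dataclasses import dataclass

@dataclass
class SoundObject:
    name: str
    azimuth_deg: float  # 0 = straight ahead, positive = to the right (assumed convention)

def render_gains(obj, speaker_azimuths):
    """Constant-power pan between the two speakers on either side of the object.

    speaker_azimuths: sorted speaker angles in degrees. Returns one gain per speaker.
    """
    gains = [0.0] * len(speaker_azimuths)
    # Objects outside the speaker arc are pulled to the nearest edge speaker.
    az = min(max(obj.azimuth_deg, speaker_azimuths[0]), speaker_azimuths[-1])
    for i in range(len(speaker_azimuths) - 1):
        lo, hi = speaker_azimuths[i], speaker_azimuths[i + 1]
        if lo <= az <= hi:
            frac = (az - lo) / (hi - lo)
            gains[i] = math.cos(frac * math.pi / 2)
            gains[i + 1] = math.sin(frac * math.pi / 2)
            break
    return gains

piano = SoundObject("piano", azimuth_deg=20.0)
stereo = render_gains(piano, [-30.0, 30.0])            # left, right
front_three = render_gains(piano, [-30.0, 0.0, 30.0])  # left, center, right
print(stereo, front_three)
```

The audio and its position stay the same; only the renderer's output changes with the speaker layout. A real renderer also has to handle height, distance, room layout, and headphone (HRTF) output, none of which is attempted here.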

What is stored when you download Dolby Atmos in Apple Music is something that really only Apple knows, in the same way when you download an AAC song.
 
Jun 28, 2021 at 6:19 AM Post #4,482 of 5,629
Does anyone know how the ‘always on’ Atmos option works? Does it send stereo music in an Atmos package, so to speak, when playing stereo, or is it the ‘bit perfect’ stereo version (same as turning Atmos off)? The reason I ask is that when I’m listening to a stereo track via my IE900s and switch Atmos from ‘always on’ to ‘off’, there is a moment of silence and then the track continues playing. It seems to sound slightly different between ‘always on’ and ‘off’, but the gap is long enough that I might be imagining it.

Edit1: I’m trying the same test with an Atmos album and the toggle isn’t doing anything to change the sound??
Edit2: I think the toggle only works with Atmos when streaming; once downloaded, it is locked into the format that was downloaded.
 
Jun 28, 2021 at 8:34 AM Post #4,483 of 5,629
UPDATE (playing Atmos)

On a road trip, I’m unable to turn Atmos on and off with any files that have been downloaded (using settings to switch between “Always On” and “Off” doesn’t make a difference in playback; it ALWAYS plays Atmos if the file downloaded is Atmos, even with settings switched off for playback).

Removing the downloaded file, I am then able to switch between Atmos on and off. This makes sense since the file is downloaded as Atmos, but I would’ve hoped a downloaded file in Atmos would somehow allow for both modes.

Looks like the only way I’ve been able to compare Atmos on and off is to refrain from downloading any music. Streaming only for comparisons.

EDIT: I want to clarify that this was playing off car speakers. But, if playing off APP/APM, maybe playing off downloaded Atmos still allows for switching it off and on. I think? I have to test later.
 
Jun 28, 2021 at 10:56 AM Post #4,484 of 5,629
Atmos isn't a solution for loudness, since the format itself isn't responsible for being softer. However, the majority of the Atmos masters on Apple Music are mixed a lot softer than modern recordings. This is very audible and something that many people complain about (even more so when it first came out); I can't count the number of "it's not loud enough" comments I've read about it. That doesn't mean it'll remain this way, though.
I'm still curious how you could get the proper Dolby sound out of a car that is running in stereo. Sure, you have more speakers, but it's still playing in stereo, unless that particular person had a properly set up surround sound DAC/amp to power his speakers (it didn't seem like he did, from the way he described his system). Most cars don't come with that setup because of cost.
Yes, you can only download one version of the song, so you won't get multiple. Realistically, the only effect that can be adjusted with the APP and APM once you've downloaded the song is the 3D head tracking, which supports any song (Atmos or not).
 
Jun 28, 2021 at 11:57 AM Post #4,485 of 5,629
In that case, I’ll only download files normally (not Atmos downloads). At least that way I can toggle Atmos on and off when playing music.

EDIT: Unless downloading non-Atmos prohibits flipping on Atmos! In THAT case, I wouldn't download any of my music, opting to stream at all times so I can flip between Atmos on and off (to better explore my library's sound in both formats).
 