Thank you
@Mr.Jacob for the interesting research and your effort to bring that experience to our headphone community as well, and to
@jude for bringing this to our attention.

After lots of blah blah in my previous posts, I finally found the time to at least watch your linked presentations. I have several questions.
Appreciate it.

And credit goes to Magnus and the rest of the team working under Dr. Gierlich for the excellent research spanning many years.
I am happy to answer thoughtful and genuine questions like yours as best as I can.
1. I can maybe guess the reason behind the smaller distortion coefficient in the simplified model here. My question is, how can inexperienced listeners, who don't fully understand a subjectively vague concept like distortion, make a rating about it, especially when the volume is low or when the distortion manifests as an oddness in the FR (assuming it is audible)? Does it even make sense for headphones?
Your link doesn't pin down a timestamp, but I'm assuming you are referring to the equation shown at ~9m40s? That goes back to early in our research, which showed the initial trends and relationships between the 3 dimensions and overall quality. The low distortion coefficient reflects people generally accepting some distortion without it grossly affecting their overall perception of the audio quality. The linear regression works to an extent; however, when you think about evaluating audio quality, you can also see why it doesn't work.
As an example, take a headphone that is perfect in every way, except it has terrible distortion. No one would rate its overall quality as high, yet a linear model with a small distortion weight would still predict a good score. That is where the linear regression fails, as the sketch below illustrates.
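To make that failure mode concrete, here is a toy calculation in Python. The weights are invented for illustration and are not the MDAQS coefficients; they just mimic the "small distortion coefficient" trend described above.

```python
# Toy numbers, NOT the MDAQS equation: a hypothetical linear model
# with a deliberately small weight on distortion.
def linear_overall(timbre, distortion, immersiveness):
    # Scores on a 1-5 scale; weights sum to 1.0.
    return 0.6 * timbre + 0.1 * distortion + 0.3 * immersiveness

balanced = linear_overall(timbre=4.0, distortion=4.0, immersiveness=4.0)  # -> 4.0
broken = linear_overall(timbre=5.0, distortion=1.0, immersiveness=5.0)    # -> 4.6

# The model rates the grossly distorting headphone HIGHER than the
# balanced one, which no listener panel would do.
print(balanced, broken)
```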
And to be clear, we DON'T use that equation in the final version of MDAQS.
Either way, the participants in the auditory tests were primed about what each of the categories represented, so they could comfortably make their judgements (to the best of their ability). And as I recall, all audio samples were scaled to the same loudness, so the effect of level is taken out of their judgement and the audio systems were played back at "normal" operating conditions.
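For readers curious what that level matching looks like in practice, here is a minimal sketch assuming simple RMS alignment. The match_level helper and the -23 dBFS target are my own choices for illustration; the actual study may well have used a perceptual loudness measure such as ITU-R BS.1770 instead.

```python
import numpy as np

def match_level(x: np.ndarray, target_dbfs: float = -23.0) -> np.ndarray:
    """Scale a mono float signal so its RMS level hits target_dbfs.

    A crude stand-in for the perceptual loudness alignment typically
    used in listening tests; the exact procedure in the MDAQS study
    is not stated here.
    """
    rms = np.sqrt(np.mean(x.astype(np.float64) ** 2))
    gain = 10 ** (target_dbfs / 20.0) / rms
    return x * gain
```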
2. Have you also tried it with experienced critical listeners, especially those who can relate the subjective experience to objective facts, and did you see similar subjective results? Or did they just become part of the same data used to train your DL network? Was there a study to evaluate them separately? I personally am not the same person I was only a few years ago, and what I would have rated positively back then might be a hard fail today.
Yes! In fact, we did have a panel of critical listeners in our study at one point.
And I'm told that they were extremely hesitant to use the 3 dimensions. They wanted many more dimensions to properly evaluate the audio quality.
However, once we strong-armed them into doing the listening test, their scores actually came out very similar to those of our panel of untrained listeners. They generally had the same rankings and preferences.
We made the decision to continue our auditory tests using untrained listeners, because one of our stated goals was to create an algorithm that would match a general consumer, as opposed to someone trained or specialized in specific fields of audio.
Lastly, your comment about how you have changed over the last few years is something to be aware of with any instrumental assessment model (whether for speech, echo, or audio quality): such models are a snapshot in time of the auditory test participants' preferences and, when done well, capture the zeitgeist. However, years or decades from now, they are likely to be less accurate.
3. Is the intention to create an industry standard? Can we, in the future, expect different manufacturers to provide "from the factory" evaluations and ratings of, for example, their headphones? For the automotive industry this is of course a different story, as standardization in that industry is a must.
Good question!
Short answer: yes.
Slightly longer answer:
I've often used the automotive industry as an example because it is so standardized. It would be nice to shop for a vehicle that states top speed, (e)mpg, weight, 0-60, etc. AND audio quality, so you have some knowledge going into the test drive or shopping experience. Right now, car stereos are hard to judge.
In terms of headphones, much like the frequency response provides good insight into audio playback performance, it would be wonderful to see audio quality scores published alongside it.
However, standardization is a democratic (and, frankly, often very "political") process that can take years! So we decided to release MDAQS as is and see how the market responds.
We obviously think there is value in the scores (3 dimensions + Overall Quality), and that they can help clarify certain industry claims.
4. There is a lot of resistance to quantified data (even if it has a foot in subjective feedback) in this community. You can see traces of it in this thread too, full of hope that your work will prove that frequency response is meaningless. Unfortunately, many vendors, actively or passively, go along with those claims so as not to spread negative or unwanted information about their products (I use the term snake oil for it). Do you think the community can be convinced without honest efforts from the vendors? My guess is Head-Fi is a small community within a much, much larger world of casual HP listeners, so we don't really matter much.
Ha! The feedback from this community is appreciated! Even if it is an order of magnitude (or three) smaller than the general HP world.
And we're not shy about mentioning the merits of a frequency response. It's immensely useful to audio product designers, and as consumers, we're getting better at interpreting frequency response plots. So we're not here to discredit the frequency response measurement, but we are here to say that there is more to Audio Quality than what can be deduced from a frequency response plot. And more data, especially if presented clearly, can be helpful, and perhaps lead to a more nuanced and informed perspective on a product's audio performance.
Lastly, my opinion: I value the effort to bring this quantitative approach to our headphone community and hope that it will also be honestly supported by the vendors. I personally am not too convinced by a single-number evaluation rating, as I value, for example, timbre over everything else, and would prefer to look at the individual ratings, maybe in addition to the frequency response. But it is still a valuable contribution that supports us on the way to becoming more informed consumers.
Thanks!
Thanks
@DarginMahkum.
I think we all wish we could hear things with our own ears before we buy, but since we can't, we have to rely on the reviews and measurements available to us. And MDAQS is another (cool) tool in our toolbelt. I see MDAQS as a great complement - or addition - to frequency response measurements, and hope the various industries/communities will be open to it.