Toyo:Yes. On top of that, the "WH-1000XM4" is equipped with "DSEE Extreme" as a new sound-quality enhancement function. Yamamoto will explain it.
Yamamoto:"DSEE Extreme" is a further evolution of Sony's proprietary "DSEE HX," a sound-enhancement function that has been rolled out across Walkmans, wireless speakers, AV amplifiers, and other products released since 2013. "DSEE HX" is a technology that upscales a CD or compressed sound source to a quality equivalent to high resolution. "DSEE Extreme" further improves the high-frequency complement performance to deliver sound quality even closer to high resolution.
What kind of processing did you do to achieve sound quality closer to high resolution?
Yamamoto:With "DSEE HX" and "DSEE Extreme", we predict the high-frequency signal components that were lost when the music data was compressed, using the low-frequency components that remain, and then complement them. What matters here is accurately analyzing the low-frequency signal that the prediction is based on. This is because there are many kinds of sounds in the world, such as vocals, percussion instruments, guitars, and pianos, and each has different characteristics. For example, vocals contain relatively little high-frequency content and their attack is not very fast, whereas percussion instruments contain a lot of high-frequency content and their attack is fast. As a result, the conventional "DSEE HX" had a drawback: tuning the sound toward vocals weakened the percussion, and tuning it toward percussion made the vocals sound unnatural.
So, in the new "DSEE Extreme", we analyze the sound being played back by making full use of deep neural network (DNN) technology, one of the most advanced AI technologies, and switch in real time between upscaling suited to vocals and upscaling suited to percussion instruments. This has made it possible to enhance the powerful attack of percussion instruments while at the same time rendering vocals beautifully.
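To illustrate the idea of switching the upscaling per sound type, here is a minimal Python sketch. It is not Sony's implementation: a simple zero-crossing-rate rule stands in for the DNN, and the names `PROFILES`, `classify_frame`, and `choose_profiles`, along with the threshold and strength values, are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical upscaling profiles: how strongly predicted high-frequency
# content would be added for each sound type. Values are illustrative only.
PROFILES = {"vocal": 0.2, "percussion": 0.8}


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign (a crude brightness proxy)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(frame) - 1, 1)


def classify_frame(frame, zcr_threshold=0.25):
    """Stand-in for the DNN (assumption): label a frame by its zero-crossing rate."""
    return "percussion" if zero_crossing_rate(frame) > zcr_threshold else "vocal"


def choose_profiles(signal, frame_size=64):
    """Split the signal into fixed-size frames and pick a profile label for each."""
    labels = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        labels.append(classify_frame(signal[start:start + frame_size]))
    return labels


if __name__ == "__main__":
    # A slow sine (vocal-like) followed by a rapidly alternating signal
    # (percussion-like), one frame each.
    smooth = [math.sin(2 * math.pi * 2 * n / 64) for n in range(64)]
    bright = [(-1) ** n for n in range(64)]
    for label in choose_profiles(smooth + bright):
        print(label, PROFILES[label])
```

In this sketch the profile is chosen independently for every frame, which mirrors the real-time switching described above. A real system would replace the rule-based classifier with a trained model and would synthesize the missing high-frequency band itself instead of only selecting a strength value.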
Could you also tell us about the background behind this further improvement in sound quality? What motivation led to the birth of "DSEE Extreme"?
Yamamoto:Customers who want high-grade headphones such as the "WH-1000XM4" are naturally particular about sound quality, so the biggest reason is that we wanted to deliver an evolution that would resonate with those people. Personally, I think that satisfaction with music is determined by multiplying "the amount of content" by "sound quality", and in recent years, with the advent of music streaming services, the former has largely been satisfied. Under such circumstances, we thought it would be great added value to let people enjoy those music streaming services with even higher sound quality.
At the same time, AI technology has seen breakthroughs worldwide, and it helped that we could now overcome problems that had been difficult to solve until now.
What was the hardest part of developing DSEE Extreme?
Yamamoto:I mentioned earlier that we make full use of a DNN in "DSEE Extreme", but as the "D" for "deep" suggests, it requires extremely large processing power and memory to run. In recent years, many services using DNNs have appeared, but most of them send data to the cloud, not the device in front of you, and produce results with ample machine power. The "WH-1000XM4", on the other hand, is a compact mobile product, so the processing has to run in real time, and the processing load has to be kept down to some extent for the sake of battery life. It was hard to fit a high-performance DNN into that.
In addition, when building a DNN, we improve its performance by repeating a cycle of "learning" from a large amount of data and appropriately "evaluating" the results. On both fronts, the strength of the Sony Group, which includes a music label, has been very helpful. The DNN "learns" from many high-resolution sound sources owned by Sony Music Entertainment, and the optimal algorithm for "DSEE Extreme" is built from them. For "evaluation", subjective evaluation is important in addition to numerical evaluation, and here we also receive sound advice from mastering engineers who are actually involved in producing sound sources.
How did the mastering engineers at the forefront of music production feel when they heard the sound of "DSEE Extreme"?
Yamamoto:Actually, they had also noticed the technical problem with "DSEE HX" that I talked about earlier, and it seems they had thought that high-frequency complementing could not compete with high-resolution sound sources. But they said that "DSEE Extreme" comes quite close to the original sound quality. Personally, I was very happy about this. Of course, they also said that it is best to listen to high-resolution sound sources (laughs).
The "WH-1000XM4" can be connected to a music player by wire using the supplied cable, and it also supports "LDAC", a Bluetooth codec that lets you enjoy high-resolution sound quality wirelessly, so if you have a high-resolution sound source, you can also enjoy it at the highest sound quality as it is.
Yamamoto:Yes. I hope people will choose between these options according to their own listening environment.