Thanks for that - and interesting that you mentioned headphones because I still feel that's what makes the biggest difference to the sound signature.
That said, I was actually wondering how much, on balance, the N3 affects the signature compared to my headphones. For example, I absolutely love the sound of my P7 wired cans, but there's no doubt they sound different on the N3 than on a DAC/amp like the E5 or direct from my phone.
On the N3 the soundstage seems more intimate, even though clarity and separation are still there. On the Soundblaster E5 (via Audirvana+), though, the soundstage is instantly wider and the separation and vocal clarity are more apparent - even though the N3 uses a more advanced DAC chip, so I assume it's the E5's more powerful amp making the difference here.
It would be good to know which parts of the chain have more effect on specific qualities of sound, otherwise it's a blind mix-and-match guessing game...
In theory, the amp *should* be the most consistent and transparent part of the chain, as a good amp should do nothing but increase the power of the signal equally across all frequencies. However, a lot of non-professional amps tint the sound to make it more appealing to the user. The next most transparent should be the source, but that varies based on the designer's interpretation of how the digital signal should be converted to an analog waveform.
The least transparent would be the headphones/speakers. In a pro studio, the speakers are placed in an acoustically treated room specifically designed to remove echo, reverb, and other environmental effects that would tint the sound. But, even then, the listener's ear "design" changes the tone, so no two eardrums receive exactly the same wave. Headphones are less affected, as they sit next to the head, but they are still influenced by the acoustic chamber that is created and by other variables - amount of earwax, alignment of the driver, tightness of the pad seal, etc. - compared against the tuning of the headphone at the factory. Custom IEMs would be the least affected, as a properly designed pair should sit in almost the same position every time relative to how they were tuned at the manufacturer.
If a pure sine wave were sent from an electronically generated source (a tone generator, not a playback device), it should come out of a good amp perfectly identical: in sync in both phase and frequency. The only thing that should change is the amplitude ("volume") of the wave. With speakers, that wave should come out of the drivers and across the room without any distortion (reverb against the room walls/ceiling), phase distortion (drivers not translating the signal into a physical wave correctly), or added harmonics (reflections from the back wall or objects around the listener). Headphones should respond the same, but in a smaller environment - the seal against the head can create its own "room" that affects the sound, and reflections can come off the head itself, echoing back to the headphone driver. Custom IEMs will do the same, albeit in the mm range, since the output is close to the eardrum and can only echo off the ear canal walls.
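To make the "an ideal amp only changes amplitude" idea concrete, here's a minimal Python sketch. It models a perfectly transparent amp as nothing but multiplication by a constant; the tone frequency, sample rate, and gain value are made-up example numbers, not properties of any real device:

```python
import math

SAMPLE_RATE = 48000          # samples per second (example value)
FREQ = 1000.0                # 1 kHz test tone (example value)
GAIN = 2.0                   # hypothetical ideal amp: a constant multiplier, nothing else

def sine(n, freq=FREQ, rate=SAMPLE_RATE):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

def ideal_amp(samples, gain=GAIN):
    """A transparent amp only scales amplitude; phase and frequency are untouched."""
    return [gain * s for s in samples]

tone = sine(480)             # 10 ms of the 1 kHz tone
out = ideal_amp(tone)

# Every output sample is exactly GAIN times the input sample, so the zero
# crossings (phase) and period (frequency) are identical - only the
# amplitude ("volume") has changed.
```

Anything a real amp does beyond that constant multiplication - frequency-dependent gain, added harmonics, phase shift - is exactly the "tinting" described above.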
The N3's translation of the digital file is naturally tinted, or biased, by its designers - both Cayin's amplifier circuit design and the AKM DAC chip designers at Asahi Kasei. The filters are their way of tinting how the analog wave is created from the digital signal, kinda like the way a violinist or horn player would play a score differently to add their own touch to a composition.
Done properly, an EQ should never be used to "improve" the sound of playback. It should be used to neutralize the listening environment. For example, in a room where bass is reverbing ("blooming"), the EQ could be used to cut the room's resonant frequencies. If a room was overly padded or filled with soft furniture, the EQ could boost the high frequencies to compensate for the highs being absorbed. Then, the source would be played without EQ, transparently, into the amp, which shouldn't change the frequency balance at all.
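As a sketch of that "neutralize the room" use of EQ: a single peaking-EQ band, using the standard biquad formulas from Robert Bristow-Johnson's Audio EQ Cookbook, can cut a few dB right at a room's bass resonance while leaving the rest of the spectrum essentially untouched. The 60 Hz center frequency, Q of 2, and -6 dB cut below are invented example values for a hypothetical "blooming" room, not a recommendation:

```python
import cmath
import math

def peaking_eq(fs, f0, q, gain_db):
    """RBJ Audio EQ Cookbook peaking-EQ biquad coefficients (b, a), normalized so a[0] == 1."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin]
    a = [1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin]
    return [bi / a[0] for bi in b], [ai / a[0] for ai in a]

def magnitude(b, a, fs, freq):
    """Magnitude response of the biquad at `freq` Hz."""
    z = cmath.exp(-1j * 2 * math.pi * freq / fs)   # z**-1 on the unit circle
    num = b[0] + b[1] * z + b[2] * z * z
    den = a[0] + a[1] * z + a[2] * z * z
    return abs(num / den)

# Hypothetical room with a 60 Hz "bloom": cut 6 dB at 60 Hz with a Q of 2.
b, a = peaking_eq(fs=48000, f0=60.0, q=2.0, gain_db=-6.0)

# At the center frequency the filter delivers the full -6 dB cut...
db_at_60 = 20 * math.log10(magnitude(b, a, 48000, 60.0))
# ...while a frequency far from the resonance passes through nearly unchanged.
db_at_4k = 20 * math.log10(magnitude(b, a, 48000, 4000.0))
```

That narrow, targeted cut is the "room correction" use of EQ; sweeping broad bands up and down to taste is the "improving the playback" use the paragraph above argues against.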
Most listeners (and reviewers) can only guess at what the real audio track *should* sound like. Unless they were there when the piece was recorded, they have no idea what it should really sound like. Even then, you are actually listening to the recording and mastering engineers' interpretation of what THEY hear. When somebody says "the soundstage is wide with xxx", I always take it with a grain of salt, since the piece may have been recorded and mixed narrower (or maybe even wider) than they are hearing. Some bands record in massive rooms for the natural reverb of the space. Others like tiny, dampened rooms to get a more personal feel. The only way to really know how the playback chain affects a given signal is to have stood in the room with the musicians, been the engineer tracking the mix, and mastered it into what one believes is as accurate a reproduction as possible, producing a known "standard" file. Only then can the file be played back on a DAP to determine exactly how it changes when different pieces of the chain are altered.
Most engineers will then take that "perfect" mix and play it on the crappiest audio sources possible - car stereos, phones, and Bluetooth speakers - to see how their mix is affected. A lot of the time, what sounds perfect in the studio ends up as a thin, tinny mess on these devices because of the lack of bass (not the device's fault, but rather its speakers' size or displacement). It's then time to make a "radio mix," which modifies the balance to sound better at the expense of accuracy or fidelity. (Think theatrical releases vs. director's cuts in the motion picture industry.) I had to do a lot of cuts like that because the original recording was way too detailed and needed toning down for normal playback.
So yes, an end user can play all day with their settings to make what sounds most "exciting" to them, but to compare apples to apples, one would need to create their own track from scratch and use it as the source - a true comparison against a non-moving target.