I'd say detail is how closely the audio reproduction system produces perfect sinusoidal waves, which are directly correlated with audio tones. Infinitesimally accurate sine waves are synonymous with more details, whereas jagged/rough sine waves are not as detailed.
All sound waves are made of pure/perfect sine waves, regardless of how "jagged/rough" they are (including square waves, sawtooth waves, etc.), and we've known this beyond any doubt for about 200 years. So I don't really understand what you're trying to say, because "jagged/rough sine waves" are effectively made of "infinitesimally accurate sine waves" and therefore the level of detail must be the same.
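As an aside, you can see this Fourier principle concretely. Here's a minimal numpy sketch (the language and all the parameters are my own illustrative choices, not anything from the post above) that builds a "jagged" square wave out of nothing but perfect sine waves:

```python
import numpy as np

# Fourier series of a square wave: a sum of odd sine harmonics with
# amplitude (4/pi)/k for harmonic k. With enough terms, the "jagged"
# square wave emerges from nothing but perfect sine waves.
def square_from_sines(t, n_harmonics):
    wave = np.zeros_like(t)
    for k in range(1, 2 * n_harmonics, 2):  # odd harmonics 1, 3, 5, ...
        wave += (4 / np.pi) * np.sin(2 * np.pi * k * t) / k
    return wave

t = np.linspace(0, 1, 1000, endpoint=False)
ideal = np.sign(np.sin(2 * np.pi * t))  # the "jagged" target wave

# More sine terms -> a closer match to the square wave
err_few = np.mean(np.abs(square_from_sines(t, 5) - ideal))
err_many = np.mean(np.abs(square_from_sines(t, 200) - ideal))
print(err_many < err_few)
```

With 200 harmonics the mean error is already small; only the tiny overshoot at the edges (the Gibbs phenomenon) remains, and that shrinks in width as more sines are added.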
Essentially, [1] the sine wave looks more like a staircase, and from my understanding, [2] the goal is to get those stairs infinitesimally small (e.g. 4 stairs per unit distance vs. 10,000 stairs for the same distance) to simulate the perfect sine wave, which takes more expensive equipment that can handle these calculations
There are two different (but related) errors in your understanding:
1. The sine wave "looks more like a staircase" simply because that is the convention for how audio editing/analysis software graphically represents the digital audio data. In other words, the "staircase" you see when you zoom in (in audio software) is due purely to the limitations of the graphical displays/interfaces; in reality there is no "staircase"!
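To make point 1 concrete, here's a small numpy sketch (the sample rate, test frequency and the use of ideal sinc interpolation are my own illustrative assumptions): the stored data is just a list of instantaneous sample values, and the smooth wave *between* the samples can be reconstructed, with no staircase anywhere:

```python
import numpy as np

# Digital audio stores instantaneous sample values, not "stairs".
# Whittaker-Shannon sinc interpolation (the ideal reconstruction that
# a real DAC approximates) rebuilds the smooth wave between samples.
fs = 100.0                      # sample rate in Hz (illustrative)
f = 3.0                         # a 3 Hz sine, well below fs/2
n = np.arange(100)              # sample indices (1 second of "audio")
samples = np.sin(2 * np.pi * f * n / fs)

# Reconstruct the waveform at instants *between* the stored samples
t = np.linspace(0.3, 0.7, 41)   # seconds, away from the record edges
recon = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])
true = np.sin(2 * np.pi * f * t)

# Small reconstruction error (only from truncating to 100 samples):
print(np.max(np.abs(recon - true)))
```

The only error here comes from having a finite number of samples; with an infinitely long record the reconstruction of a band-limited wave is mathematically exact, which is the content of the sampling theorem.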
2. Yes, more "stairs per unit distance" will make the graphical display more perfectly "simulate" the visual appearance of sine waves on your computer screen when you zoom in. However, that pertains ONLY to simulating the visual appearance of sine/sound waves on your computer screen; it does NOT pertain to the digital audio data (or the resultant audio) itself! The digital audio data itself is not trying to "simulate the perfect sine wave"; that's what analogue audio tries to do, but digital audio is completely different.

Think of it like the old telegraph system: we have a message, we convert that message into a binary code (Morse code, a series of dots and dashes) and then convert that Morse code back into the message. The Morse code itself doesn't look anything like the original message and it's not supposed to; it's not trying to "simulate" the message and it only works because it isn't! This analogy might not appear pertinent to digital audio, but in fact few analogies would be more pertinent, because the fundamental theory of digital audio was born out of the telegraph system, developed by an engineer (Harry Nyquist) working on that very system.
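The telegraph analogy can even be run as code. A toy sketch (the four-letter Morse table is a deliberately tiny subset, just for illustration): the encoded form looks nothing like the message, yet decoding recovers it exactly:

```python
# Toy telegraph: the Morse code looks nothing like the message,
# yet the message is recovered perfectly from it.
MORSE = {"A": ".-", "E": ".", "S": "...", "T": "-"}  # small subset
DECODE = {v: k for k, v in MORSE.items()}

def encode(msg):
    return " ".join(MORSE[ch] for ch in msg)

def decode(code):
    return "".join(DECODE[sym] for sym in code.split(" "))

msg = "TEASE"
code = encode(msg)          # "- . .- ... ."
print(decode(code) == msg)  # True: perfect recovery
```

The dots and dashes aren't a "simulation" of the letters; they're a code for them, exactly as PCM samples are a code for the sound wave rather than a picture of it.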
The only thing I know for a fact is that the sine waves that are output from the audio chain are not ideal (since no machine can be made to perfect mathematical precision) ...
While your statement is true, the whole point of digital audio (and digital information theory in general) is that it neatly bypasses this fact and thereby makes it irrelevant. Going back to the telegraph system: it wasn't possible to transmit dots and dashes perfectly, but the system worked because that didn't make any difference; the telegraph neatly bypassed the issue. It didn't matter how much noise/interference existed in the system or how badly it distorted the dots and dashes: provided they weren't distorted so badly that they couldn't be recognised/differentiated, the message could be recovered absolutely perfectly! This is the identical principle upon which all digital systems work. If you think about it, your smartphone, laptop, computer, etc. is moving around and operating on many billions of bits of data every second; if even one of those billions of bits were not recovered and processed perfectly, your smartphone/computer/etc. would crash every second!
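That principle fits in a few lines of numpy (the levels, noise range and threshold here are my own illustrative choices): add heavy analogue noise to two-level "dots and dashes", and as long as the noise never pushes a level across the decision threshold, every one of the 10,000 bits comes back perfectly:

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 random bits sent as two well-separated levels:
# the electronic equivalent of dots and dashes.
bits = rng.integers(0, 2, 10_000)
levels = np.where(bits == 1, 1.0, -1.0)

# Heavy noise/distortion, but bounded (|noise| < 1.0) so no level
# is ever pushed past the decision threshold at zero.
received = levels + rng.uniform(-0.9, 0.9, bits.size)

# The receiver only decides "above or below zero" for each symbol.
recovered = (received > 0.0).astype(int)

print(np.array_equal(recovered, bits))  # True: bit-perfect despite the noise
```

The received waveform is a mess, yet the data is recovered without a single error; only if the noise grew large enough to cross the threshold would bits start to flip.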
Of course, once we're out of the digital domain and into the analogue and then acoustic domains, we ARE trying to "simulate the perfect sine waves", and then the imperfections inherent in all machines become relevant again and have an effect on our sound wave "simulations".
G