An amplifier basically takes an input signal (usually low power) and produces an output with enough power to drive the load, while otherwise matching the input. Overall current is fairly small in headphones, and the power requirements aren't much, either. The ideal output of the amplifier is a higher-power copy of the input, the "straight wire with gain" ideal. Typically this is done with transistor-based ICs, where the input and output sides can pretty much be isolated from one another.
A capacitor in the way is mostly there to handle bursts of current draw and to keep the supply voltage stable. Past a certain amount of stored charge, anything bigger is pointless, unless there are large losses in the system (many separate components vs. few integrated components, for instance), or some other design consideration requires more.
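To put rough numbers on the "anything bigger is pointless" point, here is a minimal sketch using the ideal capacitor relation dV = I * dt / C. The burst current, burst length, and capacitor values are made-up illustration numbers, not measurements from any real amp, and the function name is hypothetical.

```python
def droop_volts(current_a, duration_s, capacitance_f):
    """Voltage sag of an ideal capacitor supplying a constant current burst.

    Uses dV = I * dt / C and ignores ESR, leakage, and any recharge from
    the supply during the burst (all simplifying assumptions).
    """
    return current_a * duration_s / capacitance_f


if __name__ == "__main__":
    burst_current = 0.05   # 50 mA burst (assumed)
    burst_length = 0.001   # lasting 1 ms (assumed)
    for cap_uf in (100, 1000, 10000):
        sag = droop_volts(burst_current, burst_length, cap_uf * 1e-6)
        print(f"{cap_uf:>6} uF -> sag of {sag * 1000:.1f} mV")
```

Each tenfold jump in capacitance cuts the sag by ten, so once the sag is already down in the millivolts, another jump buys almost nothing you could hear.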
Typically, you won't surpass 15mW by the time you're up to ear-splitting volumes. For headphones, being extremely insensitive to the load (whose impedance varies with frequency, and quite a bit) is much more important than actual power. As such, the trouble with fitting an amp into a tiny case is really cramming the necessary circuitry in there and keeping noise down, all while assembling it by hand (small market = not enough $$$ for much automation) and still fitting a battery that is easy to replace.
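A quick sketch of both points, under stated assumptions: the 32 ohm nominal load, the 32 to 64 ohm impedance swing, and the two output impedances are illustrative values I picked, not figures for any particular headphone or amp. The first function uses P = V^2 / R; the second treats the amp's output impedance and the headphone as a plain resistive voltage divider.

```python
import math


def rms_voltage(power_w, load_ohms):
    """RMS voltage needed to put power_w into a resistive load (P = V^2 / R)."""
    return math.sqrt(power_w * load_ohms)


def rms_current(power_w, load_ohms):
    """RMS current for the same power and load (P = I^2 * R)."""
    return math.sqrt(power_w / load_ohms)


def level_swing_db(z_out, z_min, z_max):
    """Level difference in dB between the highest and lowest load impedance.

    Models the amp as an ideal source behind z_out, so the load sees
    z_load / (z_load + z_out) of the source voltage.
    """
    gain_at_max = z_max / (z_max + z_out)
    gain_at_min = z_min / (z_min + z_out)
    return 20 * math.log10(gain_at_max / gain_at_min)


if __name__ == "__main__":
    volts = rms_voltage(0.015, 32)
    amps = rms_current(0.015, 32)
    print(f"15 mW into 32 ohms: {volts:.3f} V rms, {amps * 1000:.1f} mA rms")
    for z_out in (0.1, 10):
        swing = level_swing_db(z_out, 32, 64)
        print(f"output impedance {z_out} ohm -> {swing:.2f} dB swing")
```

Under these assumptions, the voltage and current needed are tiny, but a few ohms of output impedance is enough to tilt the frequency response by a noticeable amount, which is why load insensitivity matters more than raw power.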
None of what a headphone amp does has much to do with its size. The size of current ones, which is actually exceptionally large, is necessitated by having to use many discrete parts rather than being able to stick it all inside of a small chip. If the demand were high enough, that could and would be done, and it would add only a very small bit of size to current portable players.
Maybe growing up with an electrician and ham for a father has colored my view on such things, but I very much see it as, "finally, it's getting down to size," rather than, "how is something so small sounding so good?" Antennae take up space. Capacitors take up space. The rest really could be microscopic, or very near it, inside of a typical IC package, except that there is not a large enough demand to warrant it. If the demand were high enough, a very good amp could just be a single IC with a capacitor behind it.