One of the issues at hand is that X11, the predecessor of Wayland, does not have a standardized way to tell applications what scale they should use. Applications on X11 get their scale from environment variables (completely bypassing X11), or from Xft.dpi, or by providing in-application settings, or they guess it through some unorthodox means, or they simply don't scale at all. It's a huge mess overall.
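To give a taste of how ad-hoc this is, here's a minimal sketch (assuming plain Xlib) of an application probing Xft.dpi by hand. Note that Xft.dpi is just a resource-database convention, and the 96-DPI fallback is a convention too; nothing in the protocol mandates either:

```c
/* Minimal sketch: probe Xft.dpi from the X resource database,
 * one of the ad-hoc scale sources mentioned above.
 * Build with: cc probe_dpi.c -lX11 */
#include <stdio.h>
#include <stdlib.h>
#include <X11/Xlib.h>

int main(void) {
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) {
        fprintf(stderr, "cannot open display\n");
        return 1;
    }

    /* Look up the "Xft.dpi" resource; 96 is the conventional
     * fallback when nothing has set it. */
    const char *res = XGetDefault(dpy, "Xft", "dpi");
    double dpi = res ? atof(res) : 96.0;
    printf("Xft.dpi = %.1f (scale factor ~ %.2f)\n", dpi, dpi / 96.0);

    XCloseDisplay(dpy);
    return 0;
}
```

And nothing forces a toolkit to do even this much, which is exactly why every toolkit ends up doing something slightly different.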
It is one of the more-or-less fundamentally unfixable parts of the protocol, since X11 wants everything to live in a single coordinate space (i.e. 1 pixel is 1 pixel everywhere), which is… quite unsuitable for modern mixed-DPI, multi-monitor systems.
Wayland does work the way you describe, and applications that support Wayland will scale properly in HiDPI environments.
However, a lot of people and applications are still on X11 for various reasons.
Last time I asked around about this, the answer was, surprisingly, "probably not much"! When a low-power x86 chip (like the mobile parts) is idling (which is pretty much all the time if all you're doing is hosting a server on it), it draws very little power, roughly on par with an idling Pi. It's only when the clock ramps up that performance-per-watt gets noticeably worse on x86.
Edit: my own quick test showed my x86 laptop faring slightly worse than my Pi 3 in idle power (~2 watts higher, it seems), but that laptop is oooooooold.
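For scale, here's a back-of-the-envelope sketch of what a gap like that costs for an always-on box. The 2 W delta is just my measurement above, and the electricity price is an assumed placeholder:

```c
/* Back-of-the-envelope: yearly cost of a ~2 W idle-power gap
 * for a 24/7 home server. Numbers are assumptions, not data. */
#include <stdio.h>

int main(void) {
    double delta_watts = 2.0;              /* assumed idle-power gap */
    double hours_per_year = 24.0 * 365.0;  /* always-on */
    double price_per_kwh = 0.30;           /* placeholder rate, EUR */

    double kwh = delta_watts * hours_per_year / 1000.0;
    printf("%.1f kWh/year extra (~%.2f EUR at %.2f EUR/kWh)\n",
           kwh, kwh * price_per_kwh, price_per_kwh);
    return 0;
}
```

That works out to about 17.5 kWh a year, i.e. on the order of a few euros, which is why "probably not much" is a fair answer for an idle-mostly workload.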