Quote from: george on March 13, 2021, 11:21:10
can someone explain in layman terms how is this possible?
I don't know if this is layman enough, but one thing you need to understand is that frequency impacts efficiency. It stems from the relationship between voltage and frequency: the higher the operating frequency, the higher the voltage required to ensure stable operation. And dynamic power rises roughly with frequency times the square of voltage. That's why this relationship is a very important characteristic of a processor from an efficiency standpoint. Firestorm cores operate at a much lower frequency than their x86 counterparts under heavy load, so they run much closer to their optimum efficiency point. As a rule of thumb, 3 GHz is roughly considered the knee point for efficiency in modern x86 processors, meaning that efficiency deteriorates quickly beyond it. That's why the base frequencies of mobile processors are where they are. Running around 5 GHz is definitely not good for efficiency.
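Here's a minimal back-of-the-envelope sketch of that V²·f relationship. The voltage/frequency pairs are made-up round numbers, not measurements of any real chip; the point is only the shape of the curve.

```python
# Rough sketch of dynamic power scaling: P ~ C * V^2 * f.
# The voltage/frequency pairs are made-up illustrative numbers,
# not measurements of any real processor.

def relative_dynamic_power(voltage, freq_ghz, capacitance=1.0):
    """Dynamic power in arbitrary units: C * V^2 * f."""
    return capacitance * voltage ** 2 * freq_ghz

operating_points = [
    ("3.0 GHz @ 0.90 V", 0.90, 3.0),
    ("5.0 GHz @ 1.25 V", 1.25, 5.0),
]

for label, volts, freq in operating_points:
    power = relative_dynamic_power(volts, freq)
    perf_per_watt = freq / power  # crude performance proxy: clock per unit power
    print(f"{label}: power ~{power:.2f} units, perf/W ~{perf_per_watt:.2f}")
```

Even with these invented numbers, the higher clock needs disproportionately more power for the extra performance, which is exactly the knee-point effect.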
There are two basic approaches to increasing the performance of a processor: working faster (frequency) and doing more simultaneously (so-called width). Historically, these two approaches were considered mutually exclusive. Modern processors combine them, but they still pull in different directions. M1 is a wider design. Widest on the market. Actually, modern x86 designs are internally not that dissimilar. The challenge is that you've got to feed the core with instructions. In the case of an x86 processor, that means decoding x86 instructions and turning them into internal micro-instructions. But the instruction set is very complex, and there is a big complication in the form of variable-length instructions: you don't know where the next instruction starts without looking at the previous one. That complicates the design of the decoders and makes going wider more challenging. This is where SMT comes in. By processing two (or more) threads simultaneously, you provide more instructions for the core to chew on while it works its magic (out-of-order execution, optimizing utilization of resources).
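A toy sketch of the variable-length problem. The "low two bits of the first byte give the length" rule below is completely hypothetical and nothing like real x86 encoding; it only shows that each instruction boundary depends on decoding the previous instruction, whereas fixed-length boundaries are known up front.

```python
# Toy illustration of why variable-length instructions complicate wide decode.
# The length rule is hypothetical; real x86 encoding is far messier. The point
# is only that boundary N+1 depends on decoding instruction N.

FIXED_WIDTH = 4  # fixed-length (ARM-style): every instruction is 4 bytes

def boundaries_fixed(code: bytes):
    # All boundaries are known up front, so decoders can work in parallel.
    return list(range(0, len(code), FIXED_WIDTH))

def boundaries_variable(code: bytes, length_of):
    # Each boundary is only known after the previous instruction's length
    # has been determined, which makes the scan inherently serial.
    offsets, pos = [], 0
    while pos < len(code):
        offsets.append(pos)
        pos += length_of(code[pos])
    return offsets

toy_length = lambda first_byte: (first_byte & 0b11) + 1  # 1 to 4 bytes

stream = bytes([0x03, 0x10, 0x20, 0x30, 0x01, 0x55, 0x02, 0xAA, 0xBB])
print(boundaries_fixed(bytes(12)))              # [0, 4, 8]
print(boundaries_variable(stream, toy_length))  # [0, 4, 6]
```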
Also, the relationship between frequency and performance is not exactly straightforward. A processor is much faster than memory, and a big factor in performance is how much time you spend waiting for data. Memory latency is roughly fixed in nanoseconds, so the higher the frequency, the more cycles get wasted waiting. And again, SMT can come to the rescue by masking that latency.
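A quick back-of-the-envelope calculation, assuming a round 100 ns miss latency purely for illustration:

```python
# Back-of-the-envelope: a memory access takes roughly the same wall-clock
# time regardless of core clock, so a faster core stalls for more cycles.
# The 100 ns figure is just an illustrative round number.

MISS_LATENCY_NS = 100

for freq_ghz in (3.0, 5.0):
    cycles_stalled = MISS_LATENCY_NS * freq_ghz  # ns * cycles-per-ns
    print(f"{freq_ghz} GHz core: ~{cycles_stalled:.0f} cycles stalled per miss")
```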
Apple's designs are extreme in more ways than one. It's hard to say exactly what's going on without access to internal information, but those are the two main factors. They are so efficient mainly because they run at such a low frequency compared to x86 processors, and they are competitive because they're very wide yet their frequency isn't too low. There are many little things going on besides. Consider how Apple focuses on low latency (many benchmarks are latency sensitive). And then there is the fact that Apple uses the most advanced manufacturing process on the market, better than what AMD and Intel are using, which means higher efficiency and being able to cram more transistors into the same space.
Desktop processors consume 100+ W primarily because they can. Efficiency is more of a consequence of increasing performance than a target. The power budget for a high-performance personal computer can be over 500 W. Mobile x86 processors are derivatives of desktop designs (and power-saving technologies do trickle back). Here we see the opposite: designs from the highly power-constrained world of mobile phones scaled up into the world of personal computers. A single core in my desktop computer has a higher power budget than an entire iPad. Apple succeeded in an effort that was considered very difficult, if not borderline impossible. And it doesn't look like the design is running out of steam.