Crank It to the fMax
For those who value single-core performance over running many cores at once, there’s a trick to “turning it up to 11” and getting a lot more out of your hardware.
Getting the best CPU performance isn’t only about choosing a processor with the most cores and highest clock speed. It’s more nuanced than that. One major factor is how a CPU can be optimized for a specific application to get the best performance or to minimize its power use. This practice of hardware performance tuning is notoriously complex, as there are hundreds of “knobs” on any given CPU that can be adjusted for thousands, or millions, of unique combinations. It’s a mix of art and science.
While some companies are using AI to dynamically tune CPUs and other chips, in this post we’ll explain how a modern server CPU can be tuned for one specific thing: to run single-threaded applications much faster than the processor’s advertised frequency.
In the last decade, chip makers have been cramming more processor cores onto a single die. This is largely fueled by an explosion in cloud-first workloads that can run in parallel—in which case, the more cores the better! But what if a particular application needs single-threaded performance? What if the speed of a single core is more important to your workload than being able to run many cores in parallel?
What’s TDP Got to Do With Performance Tuning?
If single-threaded performance is your jam, we say crank that clock speed up to 11! You can do this type of performance tuning with most modern x86 processors without physically changing them, within a limit. That limit is called TDP, or Thermal Design Power, and it’s determined by the silicon manufacturer. TDP expresses the power a CPU is designed to draw under sustained load, which in turn sets the amount of heat its cooling system must be able to dissipate.
This performance tuning technique, often referred to as “all-core boost,” relies on disabling some of the CPU cores while increasing the frequency of the remaining ones. The more cores you disable, the more you can crank the active ones before you hit that overall TDP ceiling, enabling maximum possible clock speed for a reduced number of cores.
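To make the trade-off concrete, here is a deliberately simplified sketch of how the power budget per active core grows as cores are disabled. It assumes the TDP is split evenly across active cores after setting aside a fixed share for the rest of the chip; the 32-core, 180-watt part, the 60-watt `uncore_watts` figure, and the function name are all hypothetical, chosen only for illustration. Real processors are messier: power rises faster than frequency because voltage has to rise too, so doubling the per-core budget does not double the clock speed.

```python
def per_core_budget_watts(tdp_watts, active_cores, uncore_watts=0.0):
    """Toy model: watts available to each active core under a fixed TDP.

    uncore_watts is an assumed fixed cost for everything that is not a
    core (memory controller, I/O, fabric). It is not a vendor figure.
    """
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    if uncore_watts >= tdp_watts:
        raise ValueError("uncore_watts must be smaller than tdp_watts")
    return (tdp_watts - uncore_watts) / active_cores


# Hypothetical 180 W part with an assumed 60 W of non-core overhead.
for cores in (32, 16, 8):
    budget = per_core_budget_watts(180, cores, uncore_watts=60)
    print(f"{cores} active cores: {budget:.2f} W per core")
```

The point of the sketch is the direction of the relationship, not the absolute numbers: fewer active cores means more thermal room for each one that remains.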
Servers ship with all cores enabled by default. Most applications naturally take advantage of all the available cores while keeping their servers below the TDP limit, although this is being challenged by the latest super-high-core-count processors.
How Much Performance Can Be Unlocked?
Each core on a modern processor has the potential to run much faster than what the default setting allows. Hence the two speeds you normally see when reading processor specifications: base and maximum.
For example, the 32-core AMD EPYC 7502P “Rome” processor found in our m3.large bare metal server operates at a 2.5 GHz base speed and base TDP of 180 watts, but each of its cores is capable of reaching a maximum speed of 3.35 GHz. Meanwhile, an Intel Xeon Gold 5120 processor, with 14 cores total, runs at a 2.2 GHz base clock speed and a 105-watt TDP, but the maximum speed for each core on this SKU is 3.2 GHz.
That’s a 34% difference that performance tuning can make in the AMD example and a 45% upside for the Intel!
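The headroom between base and maximum clock speed is simple arithmetic, and it can be checked for any SKU with a few lines of Python. The base and maximum frequencies below are the ones quoted in this post; the function name is just an illustrative choice.

```python
def boost_headroom_pct(base_ghz, max_ghz):
    """Percentage by which the maximum clock exceeds the base clock."""
    return (max_ghz - base_ghz) / base_ghz * 100.0


# (base GHz, max GHz) as quoted above
specs = {
    "AMD EPYC 7502P": (2.5, 3.35),
    "Intel Xeon Gold 5120": (2.2, 3.2),
}

for name, (base, peak) in specs.items():
    print(f"{name}: {boost_headroom_pct(base, peak):.0f}% headroom")
```

Keep in mind that this is the theoretical ceiling per core. How much of it you can sustain depends on how many cores are active and how well the system is cooled.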
When Clock Speed Means Cash
One area where maximum single-threaded performance is important is FinTech. All-core boost can make a big difference for applications like equities trading risk management or blockchain validation, something we learned firsthand while scaling alongside the Solana ecosystem. Hardware and system optimizations provide a strong ROI for these applications.
Users of traditional cloud services don’t have access to the raw infrastructure underneath the virtualization layer to do this kind of hardware performance tuning. They only see virtual machines or containers, which are products managed and delivered with defined specs. The only way to tune single-threaded performance is to have full control of your hardware.
At Equinix Metal, we’ve helped customers operating at scale with specific high-clock-speed needs perform thermal testing and then adjust BIOS settings to run a reduced number of cores at maximum clock speed. A great example is the Solana Server Program, which leverages all-core-boosted versions of our standard m3.large and c3.medium platforms.
The Liquid Cooling Opportunity
So why not enable this performance tuning option for all servers and customers at Equinix? The answer lies in evolving our relationship with TDP, of course! In short, pushing the all-core performance envelope means we’re playing at the upper threshold of the thermal limits of each system. Go too far, and you’ll b0rk the system, which is a pretty undesirable user experience.
The engineering team at Equinix Metal is excited about the opportunity to unlock higher levels of performance through investments in liquid cooling, which can provide more consistent performance when all-core boost is enabled and the TDP envelope is pushed to the max.
Cooling hardware with air introduces the potential for fluctuations at the top end of the CPU frequency. That issue doesn’t exist with direct-to-chip liquid cooling, which ensures a more consistent temperature and (as a result) a more consistent CPU frequency. Translation? Blazing-fast application performance that’s stable, predictable, and more environmentally responsible.
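One way to put a number on “consistent” is to sample a core’s clock speed over time and look at how much it varies around its average. The sketch below does that with the coefficient of variation (standard deviation divided by mean). The sample values are made up for illustration and are not measurements from any real system, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def frequency_stability(samples_mhz):
    """Return (mean MHz, coefficient of variation) for clock samples."""
    if not samples_mhz:
        raise ValueError("need at least one sample")
    avg = mean(samples_mhz)
    return avg, pstdev(samples_mhz) / avg


# Illustrative, invented samples: one steady series, one that dips.
steady = [3350, 3350, 3350, 3350]
wobbly = [3350, 3200, 3350, 3100]

for label, samples in (("steady", steady), ("wobbly", wobbly)):
    avg, cv = frequency_stability(samples)
    print(f"{label}: mean {avg:.0f} MHz, variation {cv:.2%}")
```

A lower variation figure means the clock is holding closer to its target, which is what you want when an application’s latency depends on a single fast core.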
Equinix is investing in energy efficiency and sustainability throughout the data center “stack,” including the deployment of liquid-cooled systems. At our Co-Innovation Facility near Washington, D.C., we are piloting direct-to-chip liquid cooling technology in Open19 servers. Open19 is a Linux Foundation hardware project, fostering a community around open hardware.
Using liquid instead of air to dissipate heat, we’ll be able to not only crank it to the fMax, but keep it at the fMax!