Editor's Note:
Matt Bach is the head of Puget Labs and has been part of Puget Systems, a boutique builder of gaming and workstation PCs, since the early days. This article was originally published on the Puget blog.

While not yet a common term, over the past year or so we have started to see a rise in the usage of the term "Aggregate CPU Frequency" as a way to estimate the performance between different CPU models. This term appears to be used most often when people are discussing high core count Xeon or dual Xeon CPU configurations, but lately we have seen it used when looking at CPUs with as few as just four cores.

At first glance, this term seems reasonable enough: it simply takes the frequency of the CPU (how fast it can complete calculations) and multiplies it by the number of cores (the number of simultaneous calculations it can perform) to arrive at a total or "aggregate" frequency for the processor. For example, below is the number of cores and base frequency for four different CPU models along with their calculated "aggregate frequency":

| CPU | No. of cores | Base frequency | Aggregate frequency |
| --- | --- | --- | --- |
| Intel Core i7 6700K | 4 cores | 4.0 GHz | 4 * 4.0 = 16 GHz |
| Intel Xeon E5-1650 V3 | 6 cores | 3.5 GHz | 6 * 3.5 = 21 GHz |
| Intel Core i7 6850K | 6 cores | 3.6 GHz | 6 * 3.6 = 21.6 GHz |
| 2x Intel Xeon E5-2690 V4 | 28 cores | 2.6 GHz | 28 * 2.6 = 72.8 GHz |
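
The arithmetic behind these figures can be sketched in a few lines of Python. The function name `aggregate_frequency` and the dictionary layout are our own, chosen only for illustration; the core counts and base frequencies are the ones listed above.

```python
def aggregate_frequency(cores: int, base_ghz: float) -> float:
    """Return the so-called aggregate frequency in GHz: cores x base clock."""
    return cores * base_ghz


# (total cores, base frequency in GHz), taken from the list above
cpus = {
    "Intel Core i7 6700K": (4, 4.0),
    "Intel Xeon E5-1650 V3": (6, 3.5),
    "Intel Core i7 6850K": (6, 3.6),
    "2x Intel Xeon E5-2690 V4": (28, 2.6),
}

for name, (cores, base_ghz) in cpus.items():
    print(f"{name}: {aggregate_frequency(cores, base_ghz):.1f} GHz")
```

Nothing in this calculation knows anything about the workload, which is the root of the problems described below.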

Unfortunately, in the majority of cases, trying to estimate the relative performance of a CPU in this manner is simply going to give you inaccurate results. So before this term begins to be used more commonly, we wanted to explain why "aggregate frequency" should not be used and give some examples showing how (in)accurate it really is.

There are quite a few reasons why "aggregate frequency" is an inaccurate representation of CPU performance, but the primary reasons are the following:

It is typically calculated using the advertised base frequency

Most modern Intel CPUs have a wide range of frequencies they run at including the base frequency (the frequency that is advertised in the model name) and the various Turbo frequencies. Turbo Boost allows the CPU to run at higher frequencies depending on three main factors: the number of cores being used, the temperature of the cores, and the amount of power available to the CPU. On modern desktop systems with quality components, however, the cooling and power considerations are pretty much non-factors which means that the frequency of an Intel CPU should only be limited by the number of cores that are being used. In fact, Turbo Boost is so reliable on modern CPUs that, except in a few edge cases, every system that ships out our door is checked to ensure that it is able to maintain the all-core Turbo Frequency even when the system is put under an extremely heavy load.

How big of a difference would it make to use the all-core Turbo Boost frequency instead of the base frequency? If you were to calculate the "aggregate frequency" for an Intel Xeon E5-2690 V4 CPU, you would get a result of 36.4 GHz since that CPU has 14 cores and a base frequency of 2.6 GHz. However, if you instead use the all-core Turbo frequency of 3.2 GHz (which any well-designed and adequately cooled workstation should be able to sustain indefinitely), the aggregate frequency changes to 44.8 GHz, which is a difference of about 23%.
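
The size of that gap can be checked directly. This is a minimal sketch using only the core count and the two frequencies quoted above; the helper name `percent_increase` is ours.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


cores = 14                # single Intel Xeon E5-2690 V4
base_ghz = 2.6            # advertised base frequency
all_core_turbo_ghz = 3.2  # all-core Turbo frequency

base_aggregate = cores * base_ghz              # about 36.4 GHz
turbo_aggregate = cores * all_core_turbo_ghz   # about 44.8 GHz

gap = percent_increase(base_aggregate, turbo_aggregate)
print(f"{base_aggregate:.1f} GHz -> {turbo_aggregate:.1f} GHz (+{gap:.0f}%)")
```

Because the core count cancels out, the gap is simply the ratio of the two clock speeds, so it applies to any core count with the same base and Turbo frequencies.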

It does not take the rest of the CPU and system specs into account including the amount of cache, architecture, and chipset

Processors are extremely complex, and just looking at the number of cores and frequency ignores everything else that can make one CPU faster or slower than another. This can include the amount of cache (whether it is L1, L2, L3, or Smart Cache), the bus type and speed, and the type and speed of memory it can use. However, more than almost anything else it ignores the architecture and manufacturing process that was used to produce the CPU.

While the amount of difference all of these other specs can make varies from application to application, as an example we saw up to a 35% difference in SOLIDWORKS between a Skylake CPU and a Haswell-E CPU when both were operating with 4 cores at 4.0 GHz.

It assumes that programs can make perfect use of all the CPU cores

More than anything else, this is the main problem with aggregate frequency. Using the base frequency can throw things off, but in most cases probably only by a maximum of about 10-30%. Likewise, as long as you only compare CPUs from the same product family, the architecture of the CPUs likely won't come into play. But working under the assumption that a program is going to be able to make perfect use of all of the CPU's cores is so wrong that it makes using an "aggregate frequency" less accurate in most cases than simply choosing a CPU at random.

It appears that most people who use this term understand that there are some programs that are single threaded (parametric CAD programs are a prime example), but many of our articles have shown over and over that even if a program tries to use all the available cores, how effectively it can do so varies wildly. The reason depends on a number of factors including how the program is coded, how well the task lends itself to multi-threading, and how much the other components in the system (including the hard drive, GPU, and RAM) affect performance. There are some programs that are very effective at utilizing multiple CPU cores in parallel, but even the best of them (such as offline rendering) are at best only ~99.5% efficient, and often as low as 90% efficient. This is extremely good, but still low enough that it will throw off any attempt to use an "aggregate frequency" to estimate performance.
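
To get a feel for how much those efficiency figures matter, here is a rough sketch. It assumes that "efficiency" means the fraction of the work that can run in parallel, in the Amdahl's Law sense; the function name `amdahl_speedup` and the choice of 28 cores are ours for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


cores = 28  # e.g. a dual 14-core Xeon configuration

# 100% is what "aggregate frequency" silently assumes;
# 99.5% and 90% are the efficiency figures quoted above.
for fraction in (1.0, 0.995, 0.90):
    speedup = amdahl_speedup(fraction, cores)
    print(f"{fraction:.1%} parallel: {speedup:.1f}x on {cores} cores")
```

Under that assumption, even a 99.5% parallel program falls short of 28x on 28 cores, and a 90% parallel program manages less than 8x.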

Unfortunately, the only way to know how well a program can use multiple cores is to do comprehensive testing on that specific application. We have tested a number of programs including Premiere Pro, After Effects, Photoshop, Lightroom, SOLIDWORKS, Keyshot, Iray, Mental Ray, and Photoscan, but this is only a tiny drop in a giant bucket compared to the number of programs that exist today.

Examples

We can talk about all the reasons why we believe "aggregate frequency" is wildly inaccurate, but there is no substitute for specific examples using actual benchmark data. To help prove our point, we are going to look at a number of different applications and compare the performance predicted by the "aggregate frequency" for a variety of CPUs against the performance they actually deliver.

We like to be fair, so to give this term the best chance possible we are only going to use CPUs with the same architecture (Broadwell-E/EP). If you were to mix older and newer architectures (such as a Core i7 6700K versus a Core i7 6850K or a Xeon V3 versus a Xeon V4), expect the "aggregate frequency" to become even more inaccurate.

To make it easier to see how close or far from reality the "aggregate frequency" is, whenever the expected performance using the "aggregate frequency" is within 10% of the actual performance, we will color the results in green. Anything that is 10-50% off will be in orange, and anything more than 50% off will be in red.
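
The rating rule can be written down as a small function. This is only a sketch: the name `accuracy_color` is ours, "off by" is taken as the percentage-point gap between expected and actual relative performance (which is how the "off by" figures in the results are computed), and treating a gap of exactly 10 as green and exactly 50 as orange is our assumption, since the rule above does not say which way the boundaries fall.

```python
def accuracy_color(expected_pct: float, actual_pct: float) -> str:
    """Rate how far an aggregate-frequency estimate is from the measured result."""
    off_by = abs(expected_pct - actual_pct)  # gap in percentage points
    if off_by <= 10:   # boundary handling is an assumption
        return "green"
    if off_by <= 50:   # boundary handling is an assumption
        return "orange"
    return "red"


print(accuracy_color(204, 207))  # 3 points apart
print(accuracy_color(139, 156))  # 17 points apart
print(accuracy_color(337, 111))  # 226 points apart
```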

Example 1: Cinema4D CPU Rendering

Offline rendering of 3D images and animations is among the most efficient tasks you can run on a CPU which makes rendering engines like those found in Cinema4D exceptional at using high numbers of CPU cores. This also makes it a best-case scenario for the term "aggregate frequency":

| CineBench R15 Multi CPU | Specs | Aggregate Frequency | Expected Performance Compared to Core i7 6850K | Actual Performance Compared to Core i7 6850K |
| --- | --- | --- | --- | --- |
| Intel Core i7 6850K | 6 Cores, 3.6 GHz (3.7-4.0 GHz Turbo) | 21.6 GHz | 100% | 100% |
| Intel Core i7 6950X | 10 Cores, 3.0 GHz (3.4-4.0 GHz Turbo) | 30 GHz | 139% | 156% (off by 17%) |
| 2x Intel Xeon E5-2630 V4 | 20 Cores, 2.2 GHz (2.4-3.1 GHz Turbo) | 44 GHz | 204% | 207% (off by 3%) |
| 2x Intel Xeon E5-2690 V4 | 28 Cores, 2.6 GHz (3.2-3.5 GHz Turbo) | 72.8 GHz | 337% | 358% (off by 21%) |

We run CineBench R15 on nearly every system that goes out our door, and the results above are taken directly from our benchmark logs. Comparing the expected performance to the actual performance between the different CPUs, in every case the "aggregate frequency" ended up expecting lower performance than what each CPU was able to achieve in reality. The most accurate result was the dual Xeon E5-2630 V4 in which the expected performance difference compared to the i7 6850K was only off by about 3% which is actually extremely accurate. However, the other two CPU results were off by about 20% which means that although "aggregate frequency" has a chance of being fairly accurate, it also has a good chance of being off by a moderate amount.
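
For readers who want to reproduce the comparison, the sketch below derives the "expected" column from the aggregate frequencies and sets it against the measured CineBench R15 percentages quoted above. The function and variable names are ours; the numbers are the ones from this example.

```python
BASELINE_AGGREGATE_GHZ = 6 * 3.6  # Core i7 6850K: 6 cores at 3.6 GHz base


def expected_relative_pct(cores: int, base_ghz: float) -> float:
    """Performance relative to the i7 6850K as predicted by aggregate frequency."""
    return cores * base_ghz / BASELINE_AGGREGATE_GHZ * 100.0


# (name, total cores, base GHz, measured CineBench R15 result relative to the i7 6850K)
results = [
    ("Intel Core i7 6950X", 10, 3.0, 156),
    ("2x Intel Xeon E5-2630 V4", 20, 2.2, 207),
    ("2x Intel Xeon E5-2690 V4", 28, 2.6, 358),
]

for name, cores, base_ghz, actual_pct in results:
    expected_pct = round(expected_relative_pct(cores, base_ghz))
    off_by = abs(expected_pct - actual_pct)
    print(f"{name}: expected {expected_pct}%, actual {actual_pct}%, off by {off_by} points")
```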

Example 2: Premiere Pro

We have found in previous testing that Premiere Pro is decently effective at using a moderate number of CPU cores, but with modern hardware there is little need for something like a dual CPU workstation. Still, there are many recommendations on the web to use a dual Xeon workstation for Premiere Pro, so let's take a look at how the actual performance you would see in Premiere Pro compares to what you would expect from the "aggregate frequency":

| Premiere Pro | Specs | Aggregate Frequency | Expected Performance Compared to Core i7 6850K | Actual Performance Compared to Core i7 6850K |
| --- | --- | --- | --- | --- |
| Intel Core i7 6850K | 6 Cores, 3.6 GHz (3.7-4.0 GHz Turbo) | 21.6 GHz | 100% | 100% |
| Intel Core i7 6950X | 10 Cores, 3.0 GHz (3.4-4.0 GHz Turbo) | 30 GHz | 139% | 123% (off by 16%) |
| 2x Intel Xeon E5-2643 V4 | 12 Cores, 3.4 GHz (up to 3.7 GHz Turbo) | 40.8 GHz | 189% | 117% (off by 77%) |
| 2x Intel Xeon E5-2690 V4 | 28 Cores, 2.6 GHz (3.2-3.5 GHz Turbo) | 72.8 GHz | 337% | 111% (off by 226%) |

The results in the chart above are taken from this Adobe Premiere Pro CC 2015.3 CPU Comparison article where we looked at exporting and generating previews in Premiere Pro with a variety of codecs and resolutions. While the results aren't too far off with somewhat similar CPUs, the "aggregate frequency" expected the i7 6950X to be about 16% faster compared to the i7 6850K than it is in reality. This isn't completely out in left field, but the difference between a 39% improvement in performance and a 23% improvement from a CPU that is more than twice as expensive is likely to make quite a big difference if you are trying to decide on which CPU to purchase.

For the dual CPU options, the "aggregate frequency" was much further from reality, being about 77% off on the dual Xeon E5-2643 V4 and a huge 226% off on the dual Xeon E5-2690 V4. Where the "aggregate frequency" predicted the dual E5-2690 V4 CPUs to be the fastest option, they were in fact slower than the dual E5-2643 V4 CPUs (or even the Core i7 6950X) while costing significantly more.

Example 3: 3ds Max

3ds Max is a 3D modeling and animation program that is primarily single threaded, so you would expect it to be a worst-case scenario for "aggregate frequency". You may argue that no one should use this term for these types of lightly threaded tasks, but we have started to see it pop up even in discussions of single or lightly threaded tasks, so we wanted to show just how inaccurate it can be when someone uses "aggregate frequency" as a catch-all term for CPU performance:

| 3ds Max | Specs | Aggregate Frequency | Expected Performance Compared to Core i7 6850K | Actual Performance Compared to Core i7 6850K |
| --- | --- | --- | --- | --- |
| Intel Core i7 6850K | 6 Cores, 3.6 GHz (3.7-4.0 GHz Turbo) | 21.6 GHz | 100% | 100% |
| Intel Core i7 6950X | 10 Cores, 3.0 GHz (3.4-4.0 GHz Turbo) | 30 GHz | 139% | 102% (off by 37%) |
| 2x Intel Xeon E5-2690 V4 | 28 Cores, 2.6 GHz (3.2-3.5 GHz Turbo) | 72.8 GHz | 337% | 89% (off by 248%) |

The results in the chart above are taken from this AutoDesk 3ds Max 2017 CPU Performance article where we looked at animations, viewport FPS, and scanline rendering with a variety of projects. As expected from a mostly single-threaded application, the "aggregate frequency" was very optimistic in each case. Depending on which CPU you look at, the performance predicted by the "aggregate frequency" ranged from being 37% off to being 248% off compared to the actual performance! On the extreme end - with a pair of high core count Xeons - this means that instead of more than a 3x increase in performance compared to an i7 6850K, in reality you would see roughly a 10% decrease in performance.

Example 4: After Effects

After Effects is an interesting application because it used to be very well threaded and benefited greatly from high core count workstations. However, in the 2015 version Adobe changed its focus from multi-threading to GPU acceleration. In the long term this should greatly improve performance for AE users, but the result is that with modern hardware there is little need for a CPU with more than 6-8 cores, and higher core count and dual CPU setups often show a decrease in performance. So while you may think a single-threaded application like 3ds Max would be the worst case for the term "aggregate frequency", After Effects should be even worse:

| After Effects | Specs | Aggregate Frequency | Expected Performance Compared to Core i7 6850K | Actual Performance Compared to Core i7 6850K |
| --- | --- | --- | --- | --- |
| Intel Core i7 6850K | 6 Cores, 3.6 GHz (3.7-4.0 GHz Turbo) | 21.6 GHz | 100% | 100% |
| Intel Core i7 6950X | 10 Cores, 3.0 GHz (3.4-4.0 GHz Turbo) | 30 GHz | 139% | 96% (off by 43%) |
| 2x Intel Xeon E5-2643 V4 | 12 Cores, 3.4 GHz (up to 3.7 GHz Turbo) | 40.8 GHz | 189% | 90% (off by 99%) |
| 2x Intel Xeon E5-2690 V4 | 28 Cores, 2.6 GHz (3.2-3.5 GHz Turbo) | 72.8 GHz | 337% | 86% (off by 251%) |

The results in the chart above are taken from the 2D Animation portion of our Adobe After Effects CC 2015.3 CPU Comparison article which tested rendering and timeline scrubbing across six different projects. As you can see, trying to use an "aggregate frequency" to estimate the difference between different CPU models is going to be wildly inaccurate. Compared to the i7 6850K, the other CPU choices - which should be anywhere from 39% faster to over 3 times faster - are instead all slower than the Core i7 6850K. In fact, the faster the "aggregate frequency" predicted a CPU configuration to be, the slower it ended up being in reality!

The allure of an all-pervasive specification like "aggregate frequency" is something we completely understand. It would be great if there was an easy way to know which CPU will be faster than another and by roughly how much, but unfortunately there is no magic bullet. To be completely fair, for highly threaded tasks like rendering, the "aggregate frequency" should be close enough that you at least wouldn't end up spending more money for lower performance, but it still isn't going to be great at estimating precisely how much of a performance increase you would see with one CPU over another.

Outside of rendering and a few other highly parallel applications, however, there is no way to know whether the "aggregate frequency" is going to be accurate or not without detailed benchmarking. For example, simulations are often touted as being highly parallel (which means they should be perfect for this term), but we have found that performing simulations in SOLIDWORKS is only moderately efficient - worse in many cases than Premiere Pro! Other simulation packages like ANSYS or COMSOL should be more efficient, but without specific testing there is no way to know for sure.

So if "aggregate frequency" is not accurate, what should people use to decide which CPU to purchase? Like we said earlier, there is no magic bullet for this. If your application is CPU-bound (the GPU, HD, and RAM don't impact performance significantly), you could use Amdahl's Law, which takes into account the parallel efficiency of the program to calculate the theoretical performance difference between two CPUs. If you are interested in this, we recommend reading our guide on how to use Amdahl's Law. You are still limited to CPUs of the same architecture, it doesn't take into account things like CPU cache, and you have to do a lot of testing up front to determine the parallel efficiency of the program - but this method should be much more accurate than simply multiplying together a CPU's cores and frequency.
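
As a rough illustration of that approach, the sketch below estimates the relative performance of two CPU configurations from Amdahl's Law. Several things here are our assumptions rather than part of the method described in our guide: the function names, scaling the Amdahl speedup linearly by clock frequency, using 3.7 GHz (the low end of the listed Turbo range) as the all-core frequency of the i7 6850K, and the parallel fractions, which are hypothetical values rather than measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core when only part of the work runs in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_performance(cpu, baseline, parallel_fraction: float) -> float:
    """Estimated performance of `cpu` as a percentage of `baseline`.

    Each CPU is a (cores, all_core_ghz) tuple. Scaling the speedup by clock
    frequency is a simplifying assumption that only makes sense within one
    architecture.
    """
    def score(cores: int, ghz: float) -> float:
        return ghz * amdahl_speedup(parallel_fraction, cores)

    return score(*cpu) / score(*baseline) * 100.0


i7_6850k = (6, 3.7)    # assumed all-core Turbo: low end of the listed Turbo range
dual_xeon = (28, 3.2)  # 2x Xeon E5-2690 V4 at its all-core Turbo frequency

for fraction in (0.99, 0.90, 0.50):  # hypothetical parallel fractions
    estimate = relative_performance(dual_xeon, i7_6850k, fraction)
    print(f"parallel fraction {fraction:.2f}: dual Xeon at about {estimate:.0f}% of the i7 6850K")
```

With a parallel fraction of 0.5, this estimate already puts the dual Xeon slightly below the six-core CPU, which is the same direction we saw in the lightly threaded examples above. The real numbers still have to come from testing the actual application.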

If your application does utilize the GPU to improve performance or if you want to compare CPUs with different architectures, however, there is really no easy way to estimate which CPU will be faster than another and by how much. In these situations, the only reliable method is good old-fashioned benchmarking. Again, we wish there was a better method that was still accurate - it would save us so much time! - but this is simply the reality. This is why we at Puget Systems have started to benchmark different CPUs on as many professional applications as we have the time and expertise to handle, to ensure that we are recommending exactly the right CPU to our customers. We unfortunately can't test every program we wish we could (or really even a majority), but keep an eye on our article list as we expand our testing across more and more applications.