When the news broke about these issues, there was understandably a lot of anger in the community. Now that various teams (as well as many of you!) have had time to run more in-depth benchmarks across multiple components, the patches have come through, and the dust has settled, I have been able to take a proper look at the fallout.
So what went wrong?
A lot of this problem is down to fundamental CPU design, so I’ll do a little bit to explain the bigger problem before explaining the way the fix works.
The reason this was such a big deal is that Meltdown came about from how the CPU handles data, specifically around when it processes information. A modern CPU is almost predictive about your day-to-day actions on your computer: it works ahead on instructions it expects to need (this is called speculative execution), and commonly used data is kept in different levels of cache on the CPU so that when you inevitably need it again, it can be accessed quickly.
It often does this for data which is traditionally slow to access, which is also often the more secure information stored on your PC. The problem arose because the exploit allowed access to what is normally a very secure part of this active memory (meaning whatever the CPU happened to be storing there at the time could be read).
As this was an issue with the CPU, which is the heart of any PC, this could have theoretically been any piece of data on your PC. So, as you might imagine, users didn’t want their normally secure passwords, emails, bank details, and so on being at risk.
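To make the caching idea above a little more concrete, here is a minimal toy sketch in Python. It is not how a real CPU cache is built: the ToyCache class, its capacity, and the cycle counts are all made-up illustrative assumptions. All it shows is that a second read of the same address is far cheaper than the first, and that timing gap between a cache hit and a miss is what Meltdown-style attacks measure to infer data they should not be able to see.

```python
from collections import OrderedDict

# Illustrative latencies in CPU cycles: assumed round numbers, not measurements.
CACHE_HIT_CYCLES = 4
MEMORY_CYCLES = 200


class ToyCache:
    """A tiny least-recently-used cache standing in for one CPU cache level."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, oldest entry first

    def read(self, address, memory):
        """Return (value, cycles) for a read of `address`."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as recently used
            return self.lines[address], CACHE_HIT_CYCLES
        value = memory[address]  # slow path: fetch from main memory
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used
        return value, MEMORY_CYCLES


# Hypothetical "main memory": 16 addresses holding arbitrary values.
memory = {addr: addr * 2 for addr in range(16)}
cache = ToyCache(capacity=4)

_, first = cache.read(3, memory)   # miss: has to go out to main memory
_, second = cache.read(3, memory)  # hit: served straight from the cache
print(first, second)  # prints: 200 4
```

The exact numbers don’t matter; the point is the size of the gap between the two reads.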
Now, normally, big security issues like this are not such a big deal on a practical level. Usually they are patched (like the major problems have been now, for the most part) and then we are simply informed after the fact.
This one blew up because people jumped the gun and announced the issues on social media, meaning knowledge of the exploit was out in the wild before the patches were ready for release, suddenly leaving millions of PCs vulnerable.
Intel was badly impacted by this, as its CPU architecture leans very heavily on this method of accessing data. AMD’s CPUs weren’t as affected because they have an additional layer within their architecture, specifically around this level of access. ARM had a few processors which were at risk too.
Finally, a note on Spectre. This is the one which did impact nearly all modern CPUs (Intel, AMD, and ARM alike) and, again, it is similar to Meltdown in that it gives access to secure memory. Unlike Meltdown, however, it is generally harder for an attacker to take advantage of. But once again, this is being patched as it is a security risk.
But Muh Performance?
Overly simplistic explanations aside, make sure you install the updates! What I’m breaking down below is which things have been impacted by this, and to what level. So let’s get to it!
CPU Core Performance (synthetics vs. real world)
One thing that doesn’t show up too well in synthetic benchmarks is the drop-off in performance for real-time memory access. Looking at CPU performance, most benchmarks put anything newer than a 4th-generation Intel chip at around a 1-3% performance drop. Some older CPUs, like the i5-2500K that some of you are still rocking (god, I miss that build…), have seen very different results here; a lot of that, though, is down to the next (bigger) result of the update: storage speeds.
Overall though, the majority of consumer Intel CPUs have seen a minimal drop, most of which can be dismissed as within the margin of error. The one thing the synthetics do show after the patch is the impact on CPU floating-point performance. Floating-point maths is a core part of what a CPU does all the time, so it is something that often took advantage of the cache to get the best compute speeds.
Now what all this means in real terms depends on your use case:
On a single PC, a raw 1-3% drop in performance is not something you’re going to be able to see. This is more of an enterprise-level issue where you have multiple CPUs in a server, made even worse if they’re running multiple virtual environments. Even just a 1% drop in performance across a whole farm can be huge. If you take an average large server farm that has something like 25,000 CPU cores, that 1% drop with the update would be the equivalent of suddenly having 250 cores fail. So it’s bad times for the likes of Google and other companies with massive setups.
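The server farm arithmetic above is easy to check for yourself. Here is a minimal sketch, where the function name is my own and the 25,000-core farm is just the example figure from above:

```python
def equivalent_cores_lost(total_cores, drop_percent):
    """Cores' worth of throughput lost to a given percentage slowdown."""
    return total_cores * drop_percent / 100


# The 1-3% range reported for newer Intel CPUs, applied to a 25,000-core farm.
for drop in (1, 2, 3):
    lost = equivalent_cores_lost(25_000, drop)
    print(f"{drop}% drop on 25,000 cores is like losing {lost:.0f} cores")
```

At the top of that 1-3% range, the same farm is effectively down 750 cores.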
Finally, anything around a 1% difference in results when benchmarking is something you could reasonably argue is within the margin of error. Where it becomes important is if multiple tests of large systems had been done before the update and they were within that margin of error, versus after the patch where it is starting to slide over that margin.
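One rough way to judge whether a drop is sliding over the margin of error is to compare it against the run-to-run spread of your own benchmark results. The sketch below is only one simple rule of thumb, not a proper statistical test: the function name, the threshold of two standard deviations, and the scores are all hypothetical.

```python
from statistics import mean, stdev


def drop_exceeds_noise(before, after, k=2.0):
    """Return (drop_percent, significant) for two sets of benchmark scores.

    The drop counts as significant when the difference between the means is
    larger than k times the bigger run-to-run standard deviation.
    """
    noise = max(stdev(before), stdev(after))
    drop = mean(before) - mean(after)
    drop_percent = 100 * drop / mean(before)
    return drop_percent, drop > k * noise


# Hypothetical scores (higher is better) from five runs before and after patching.
before_patch = [1000, 1004, 996, 1002, 998]
after_patch = [990, 994, 986, 992, 988]

percent, significant = drop_exceeds_noise(before_patch, after_patch)
print(f"{percent:.1f}% drop, outside the noise: {significant}")
```

With tightly grouped runs like these, even a 1% drop stands out from the noise; with noisier runs the same 1% would not.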
A lot of knowing if you really have been impacted by this comes down to experience with the system you are testing and monitoring (if you’re a big-time server admin).
All you really need to know is that—for people like you and me sitting on a home PC—you’re not going to notice any change to the core operating performance of your CPU whatsoever.
This is where things get interesting (and in some rare cases, unfortunate).
Traditional SATA-based drives have been largely unaffected. This is mainly down to the fact that their limitation in speed has always been the bandwidth available through a SATA port, not so much how quickly the CPU can access it (although you do get the difference in speed from spinning up an HDD vs. instant access from an SSD).
Where things vary is how you have your system set up if you are using M.2 drives:
If you’re using it just as a storage drive (like me, as a high-speed drive for my video editing), then you won’t see a difference. However, if the drive is being used as a main OS or program drive, you might be seeing a nasty drop-off.
This is where the big scary 30% drop-off that all the media reported can occur, and it’s down to how M.2 drives use PCIe lanes to the CPU for ultra-fast access to data. When benchmarking, you’ll see this most obviously in the sequential read results. Again though, not all benchmarks will even show this, as it depends on exactly what data is stored on the drive and what the benchmark is asking the CPU to access on it.
Don’t get me wrong, there are big nasty looking results out there:
Like I was saying, though, this varies so much with the use case that you might never, ever see results like this. For comparison, even when I have projects open and a million things running across two screens (like while I’m writing this), my own 500GB 960 EVO doesn’t show anything remotely out of the ordinary when it’s under load and I run a benchmark on it:
Moral of the story: don’t be too worried by cherry-picked footage and benchmarks. Like I said, even with me actively using the drive, mine isn’t as bad as the results I posted above.
So yes, this can potentially have a hugely negative impact on the performance of your system, provided that you are (A) using an extremely high-speed M.2 SSD, and (B) making particularly heavy use of it in certain operations, including as an OS-and-core-programs drive. If you fall into that category, your frustration would be understandable: the whole point of M.2 drives is that they’re super fast, and while they’re still faster than drives connected via SATA ports, a 30% drop in performance is not what anyone wants to see.
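To put that worst case in perspective, here is a quick back-of-the-envelope sketch. The 3,200 MB/s figure is an assumed rated sequential read for a fast NVMe M.2 drive, not a measurement of mine, and the SATA number is the interface’s theoretical ceiling (6 Gbit/s, which works out to about 600 MB/s after encoding overhead):

```python
SATA3_LIMIT_MB_S = 600  # theoretical ceiling of a 6 Gbit/s SATA III link


def throughput_after_drop(rated_mb_s, drop_percent):
    """Sequential throughput left after a given percentage slowdown."""
    return rated_mb_s * (100 - drop_percent) / 100


nvme_rated = 3200  # assumed rated sequential read of a fast NVMe M.2 drive
after = throughput_after_drop(nvme_rated, 30)

print(f"After a 30% drop: {after:.0f} MB/s")
print(f"Still {after / SATA3_LIMIT_MB_S:.1f}x the SATA III ceiling")
```

So even in the worst case being reported, a drive like that would still sit well above anything a SATA port can deliver.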
Gaming and GPU PCIe Utilisation
So this is the one you’re really interested in, let’s be honest! Given how M.2 drives have shown varying levels of performance after the patch, and that the other big component which uses PCIe lanes is the GPU, this naturally had a lot of gamers worried!
Fore1gn already gave some good early results in his summary article, yet as the week has gone on more and more tests have been completed.
Thankfully, gaming performance is essentially unaffected. If anything, most games are showing around 1% difference, which we can all dismiss as being within the margin of error. A lot of this is down to modern game design not being programmed in a way that would have been impacted by the patch. So how your CPU and GPU behave remains the same. So, as Susano would say, “REJOICE!” (shout-out to my FFXIV buddies!)
The thing here is that even if you had the game files installed on an M.2 drive whose speed was impacted by how you have things set up, a game doesn’t need the insane transfer rates that an M.2 drive can provide. Likewise, the CPU is only really telling the GPU to do its thing when playing games, so their communication isn’t something that would have been impacted by this at all.
For most consumers, this is nothing more than a big pain that we shouldn’t have had to deal with. When patched, most people are seeing little or no change on modern consumer PCs. There are exceptions to this, like with CPUs older than 4th-gen Intels, as well as a more noticeable slowdown for users running Windows 7 and 8.1 versus 10.
Not only that, I haven’t even touched on the variants of each of these, which are still being worked on across all devices. Those of you using mobile platforms have probably seen notifications to update your iOS or Android software (if you’re running with an official Android partner who still provides updates, that is). Then you have to consider how these have an impact on Apple desktops too.
All told, the “leak” of this couldn’t have been worse. The manufacturers had all originally agreed to reveal this information at CES, if only because it’s a convenient time to have all the press and manufacturers in one place, and it is something that could have been controlled. What we got instead was a lot of misinformation, false positives, and a whole lot of unknowns.
Where it is starting to be noticed is at the server level, purely because of the scale of the setups. Like I mentioned above, a 1% difference to a consumer might mean a 1-3 FPS difference in a game, if that. And certainly, 1-3 FPS is something that could just as easily be due to variance in your system as it’s performing tasks. But if the decrease is at least that much, if not a little more, the impact on large server farms could be devastating. Right now, the lawyers at Intel are already hard at work on a number of lawsuits that have been filed against the company.
Although this is something that has been patched at the software level, fundamentally fixing it going forward will need a change in CPU architecture design. The companies are not out of the woods yet, and this has all the makings of something that is going to have echoes and effects for a long time to come.