I'm now wondering if I have enough material to do an interesting writeup of my time as a CPU bug-hunter in verification.
The client (a now-vanished startup) had a small 8-bit CPU design that they wanted validated, using the technique of executing random sequences of instructions and comparing the results against an emulator. We wrote the emulator independently, from their architecture description. Given that most instructions were a single byte plus arguments, and most of those encodings were valid, the test coverage was pretty thorough. All looked fine until I added support for interrupts, at which point we discovered that an interrupt during a branch would not return to the correct point in execution.
Verifying the security properties of processors is really hard; you can go looking for specific types of failure, but I'm not aware of any general way of proving the absence of information leakage.
Article Excerpt: "So a speculatively-executed xdcbt was identical to a real xdcbt!"
I'd never thought about it until I saw the above line in the article, and the thought went something like this:
"Assembler instructions that might never have their execution conditions met during a program's runtime might nonetheless be speculatively executed, and this, depending on the nature of the instruction, might have huge ramifications, up to and including program incorrectness and even program failure."
In other words, your absolutely 100% deterministic Turing machine (or the programs you write on it that you deem 100% deterministic) may not be quite so deterministic in light of this...
It adds a whole new dimension to what must be thought about when writing assembler code...
Anyway, it's a really great article!
Awesome story! I'm curious, though, why the branch predictor was running the xdcbt instruction if "The symptoms were identical. Except that the game was no longer using the xdcbt instruction".
Was the game no longer "using the xdcbt instruction", yet the branch predictor still caused issues because they put a jmp instruction in front of it instead of removing the instruction entirely?
The issue that xdcbt is supposed to solve also shows up at a completely different level. Performing a full disk-to-tape backup would slow systems down because the entire disk would be copied through the OS buffer cache, evicting data used by other processes.
The UNIX fix for this was to use a raw device or O_DIRECT to bypass the buffer cache.
Maybe Intel's new cache partitioning feature offers a similar fix, see:
Actually, in the comments someone mentions using cache partitioning for security. Maybe the threads used by JIT code could be placed in their own cache partition to avoid some of Spectre.
Would it be fair to say that having any sequence of bytes in memory that looks like the xdcbt instruction (even if those bytes are just data) is unsafe, given that a stale entry in the branch prediction table might end up pointing at those bytes?
Would gcc's "__builtin_expect" construct help in these cases?
Correct me if I'm wrong, but won't the Meltdown/Spectre bugs allow people to jailbreak pretty much any OS/device with a CPU that "supports" these bugs? This could potentially open up a lot of currently closed devices.
Where are the L1 caches?
I have seen so many bugs and crashes created exactly because of this kind of thinking:
> [insert X here] was no longer guaranteed, but hey, we’re video game programmers, we know what we’re doing, it will be fine.
Very well written. Thanks for sharing!
Scratches PowerPC off the list of trustworthy CPUs
Sooo, out you go, G5. Any suggestions what to use for online banking? A 486 can't handle all the jQuery and tracking scripts.