Cranelift, Part 3: Correctness in Register Allocation
In this post, I will cover how we worked to ensure correctness in our register allocator, regalloc.rs, by developing a symbolic checker that uses abstract interpretation to prove correctness for a specific register allocation result. By using this checker as a fuzzing oracle, and driving just the register allocator with a focused fuzzing target, we have been able to uncover some very interesting and subtle bugs, and achieve a fairly high confidence in the allocator’s robustness.
Push some big numbers through your system and look for bugs
Why does this matter? Okay, let’s say you have a JSON message where you pass around the unique ID of some object in your system. Let’s further say that your system “mints” IDs out of a 64-bit number space, and it spreads them around, so large numbers can turn up every now and then. What happens when you finally get an object ID with a value of 1152921504606846976 and put it into a message?
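A minimal sketch of the failure mode, assuming a consumer that stores every JSON number as a 64-bit float (as JavaScript does). The ID used here is the one quoted above plus one, an assumption chosen because an odd value of that size cannot be represented exactly as a double. `parse_int=float` is only a stand-in for such a consumer; Python's own `json` module keeps integers exact by default.

```python
import json

# 2**60 + 1: the ID quoted above plus one (an illustrative assumption).
original_id = 1152921504606846977
message = json.dumps({"id": original_id})

# Python's json module parses integers exactly by default.
exact = json.loads(message)["id"]

# parse_int=float imitates a parser that keeps every number as a double.
lossy = int(json.loads(message, parse_int=float)["id"])

# Sending the ID as a string avoids numeric parsing altogether.
as_string = int(json.loads(json.dumps({"id": str(original_id)}))["id"])

print(exact == original_id)      # exact round trip
print(lossy == original_id)      # the double-based round trip changes the ID
print(as_string == original_id)  # the string round trip preserves it
```

Doubles near 2**60 are spaced 256 apart, so the lossy path silently lands on a neighbouring ID rather than raising an error.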
is-promise post mortem
I had been intending to set up more of my projects to be automatically published via CI, instead of manually published from my local machine, but because is-promise is such a tiny library, I figured it probably wasn’t worth the effort. This was definitely a mistake. However, even if I had set up publishing via CI, is-promise may not have had sufficiently thorough tests.
Big Tech Is Testing You
Large-scale social experiments are now ubiquitous, and conducted without public scrutiny. Has this new era of experimentation remembered the lessons of the old?
Physics, chemistry, and medicine have had their revolution. But now, driven by experimentation, a further transformation is in the air. That’s the argument of “The Power of Experiments” (M.I.T.), by Michael Luca and Max H. Bazerman, both professors at the Harvard Business School. When it comes to driving our decisions in a world of data, they say, “the age of experiments is only beginning.”
Despite being highly privileged and processing untrusted input by design, it is unsandboxed and has poor mitigation coverage. Any vulnerabilities in this process are critical, and easily accessible to remote attackers.
SafeSide is a project to understand and mitigate software-observable side-channels: information leaks between software domains caused by implementation details outside the software abstraction.
Dynamically scoped variables in Go
What we want is to be able to access a variable whose declaration is neither global nor local to the function, but somewhere higher in the call stack. This is called dynamic scoping. Go doesn’t support dynamic scoping, but it turns out, for restricted cases, we can fake it.
The 3 A.M. Phone Call
It went to a national security adviser, Zbigniew Brzezinski, who was awakened on 9 November 1979, to be told that the North American Aerospace Defense Command (NORAD), the combined U.S.–Canada military command, was reporting a Soviet missile attack. Just before Brzezinski was about to call President Carter, the NORAD warning turned out to be a false alarm. It was one of those moments in Cold War history when top officials believed they were facing the ultimate threat. The apparent cause? The routine testing of an overworked computer system.
Helping Generative Fuzzers Avoid Looking Only Where the Light is Good
Using a generative fuzzer — which creates test cases from scratch, rather than mutating a collection of seed inputs — feels to me a lot like being the drunk guy in the joke: we’re looking for bugs that can be triggered by inputs that the generator is likely to generate, because we don’t have an obviously better option, besides doing some hard work in improving the generator. This problem has bothered me for a long time.
Binary symbolic execution with KLEE-Native
KLEE is a symbolic execution tool that intelligently produces high-coverage test cases by emulating LLVM bitcode in a custom runtime environment. Yet, unlike simpler fuzzers, it’s not a go-to tool for automated bug discovery. Despite constant improvements by the academic community, KLEE remains difficult for bug hunters to adopt. We’re working to bridge this gap!
My internship produced KLEE-Native, a version of KLEE that can concretely and symbolically execute binaries, model heap memory, reproduce CVEs, and accurately classify different heap bugs. The project is now positioned to explore applications made possible by KLEE-Native’s unique approaches to symbolic execution. We will also be looking into potential execution time speed-ups from different lifting strategies. As with all articles on symbolic execution, KLEE is both the problem and the solution.
Write Fuzzable Code
Fuzzing is sort of a superpower for locating vulnerabilities and other software defects, but it is often used to find problems baked deeply into already-deployed code. Fuzzing should be done earlier, and moreover developers should spend some effort making their code more amenable to being fuzzed.
This post is a non-comprehensive, non-orthogonal list of ways that you can write code that fuzzes better. Throughout, I’ll use “fuzzer” to refer to basically any kind of randomized test-case generator, whether mutation-based (afl, libFuzzer, etc.) or generative (jsfunfuzz, Csmith, etc.). Not all advice will apply to every situation, but a lot of it is sound software engineering advice in general. I’ve bold-faced a few points that I think are particularly important.
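The two families of fuzzer mentioned above can be contrasted with a toy sketch. It is illustrative only: the function names and the arithmetic-expression grammar are assumptions, and real tools such as afl or Csmith are far more elaborate.

```python
import random

def generate_expr(rng, depth=0):
    """Generative: build a well-formed arithmetic expression from scratch."""
    if depth >= 3 or rng.random() < 0.3:
        return str(rng.randint(0, 9))
    op = rng.choice(["+", "-", "*"])
    left = generate_expr(rng, depth + 1)
    right = generate_expr(rng, depth + 1)
    return f"({left} {op} {right})"

def mutate(rng, seed):
    """Mutation-based: overwrite one byte of an existing seed input."""
    data = bytearray(seed)
    pos = rng.randrange(len(data))
    data[pos] = rng.randrange(256)
    return bytes(data)

rng = random.Random(0)  # fixed seed so runs are repeatable
generated = generate_expr(rng)
mutated = mutate(rng, b"(1 + 2)")

print(generated)  # always syntactically valid
print(mutated)    # same length as the seed, possibly no longer valid
```

The generator can only produce what its grammar describes, while the mutator stays close to its seed and may break the syntax; that trade-off is why the choice of fuzzer kind matters when making code easier to fuzz.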
Design and Evolution of C-Reduce
Since 2008, my colleagues and I have developed and maintained C-Reduce, a tool for programmatically reducing the size of C and C++ files that trigger compiler bugs. C-Reduce also usually does a credible job reducing test cases in languages other than C and C++; we’ll return to that later.
Part 2: https://blog.regehr.org/archives/1679
Vintage TV Test Patterns
As you might expect, the BBC test card with the girl and clown has both a backstory and a cult following.
hey - HTTP load generator
hey is a tiny program that sends some load to a web application.
Increasing coverage of signal semantics in regression tests
Kernel signal code is a complex maze; it’s very difficult to introduce non-trivial changes without regressions. Over the past month I worked on covering missing elementary scenarios involving the ptrace(2) API. Some of the new tests are marked as expected to succeed; however, a number of them are expected to fail.
I ran Cypress (the JS testing tool) exactly one time ever.
Today I noticed that it put 42,471 files in ~/Library/Caches. 41% of all cache files on my machine are from that one launch. The resource consumption of modern programming tools is just reckless.
Time to the first reply literally beginning with the words “who cares”: about one hour.
Some people claim that unit tests make type systems unnecessary: “types are just simple unit tests written for you, and simple unit tests aren’t the important ones”. Other people claim that type systems make unit tests unnecessary: “dynamic languages only need unit tests because they don’t have type systems.” What’s going on here? These can’t both be right. We’ll use this example and a couple others to explore the unknown beliefs that structure our understanding of the world.
Really about our hidden assumptions.
“Before I was alive I was wrong about this.”
My favorite papers of 2017
The (machine) learning was strong this year.
With the Router, In the Conference Room
The killer was Cathy, in the issue tracking system, with the snarky bug report.
DeepXplore: automated whitebox testing of deep learning systems
The state space of deep learning systems is vast. As we’ve seen with adversarial examples, that creates opportunity to deliberately craft inputs that fool a trained network. Forget adversarial examples for a moment, though: what about the opportunity for good old-fashioned bugs to hide within that space? Experience with distributed systems tells us that there are likely to be plenty! And that raises an interesting question: how do you test a DNN?
At first glance this seems like more of the same adversarial stuff, fun as that may be, but they seem to do a better job finding real world scenarios that are misclassified. Nothing malicious, per se, just bad luck.