Pointers Are Complicated II, or: We need better language specs
Below, I will show a series of three compiler transformations that each seem “intuitively justified”, but when taken together they lead to a clearly incorrect result. I will use LLVM for these examples, but the goal is not to pick on LLVM—other compilers suffer from similar issues. The goal is to convince you that to build a correct compiler for languages permitting unsafe pointer manipulation such as C, C++, or Rust, we need to take IR semantics (and specifically provenance) more seriously. I use LLVM for the examples because it is particularly easy to study with its single, extensively-documented IR that a lot of infrastructure evolved around. Let’s get started!
Exploring mullender.c - A deep dive into the first IOCCC winner
I will discuss the code, how I got such old and obscure code to run, as well as include snippets from my conversations with one of the original authors (who was very helpful in figuring some of this out). If all that wasn’t enough I managed to obtain the original PDP and VAX source code, it will be hosted here with permission. I want to give a huge thank you to Sjoerd Mullender and Don Libes for their assistance and permission in reproducing some of this material.
A 35-year-old bug in patch found in efforts to restore 29 year old 2.11BSD
Larry Wall posted patch 1.3 to mod.sources on May 8, 1985. A number of versions followed over the years. It’s been a faithful alley for a long, long time. I’ve never had a problem with patch until I embarked on the 2.11BSD restoration project. In going over the logs very carefully, I’ve discovered a bug that bites this effort twice. It’s quite interesting to use 27 year old patches to find this bug while restoring a 29 year old OS...
How can CharUpper and CharLower guarantee that the uppercase version of a string is the same length as the lowercase version?
Ray Tracing In Notepad.exe At 30 FPS
A few months back, there was a post on Reddit (link), which described a game that used an open source clone of Notepad to handle all its input and rendering. While reading about it, I had the thought that it would be really cool to see something similar that worked with stock Windows Notepad. Then I spent way too much of my free time doing exactly that.
I ended up making a Snake game and a small ray tracer that use stock Notepad for all input and rendering tasks, and got to learn about DLL Injection, API Hooking and Memory Scanning along the way. It seemed like writing up the stuff I learned might make for an interesting read, and give me a chance to show off the dumb stuff I built at the same time, so that’s what these next couple blog posts will be about.
15 years later: Remote Code Execution in qmail (CVE-2005-1513)
In 2005, three vulnerabilities were discovered in qmail but were never fixed because they were believed to be unexploitable in a default installation. We recently re-discovered these vulnerabilities and were able to exploit one of them remotely in a default installation.
Tale of two hypervisor bugs - Escaping from FreeBSD bhyve
VM escape has become a popular topic of discussion over the last few years. A good amount of research on this topic has been published for various hypervisors like VMware, QEMU, VirtualBox, Xen and Hyper-V. Bhyve is a hypervisor for FreeBSD supporting hardware-assisted virtualization. This paper details the exploitation of two bugs in bhyve - FreeBSD-SA-16:32.bhyve  (VGA emulation heap overflow) and CVE-2018-17160  (Firmware Configuration device bss buffer overflow) and some generic techniques which could be used for exploiting other bhyve bugs. Further, the paper also discusses sandbox escapes using PCI device passthrough, and Control-Flow Integrity bypasses in HardenedBSD 12-CURRENT
How are Unix pipes implemented?
This article is about how pipes are implemented the Unix kernel. I was a little disappointed that a recent article titled “How do Unix pipes work?” was not about the internals, and curious enough to go digging in some old sources to try to answer the question.
Patching nVidia GPU driver for hot-unplug on Linux
Anyway, I was kind of annoyed of rebooting every time it happens, so I decided to reboot a few more dozen times instead while patching the driver. This has indeed worked, and left me with something similar to a functional hot-unplug, mildly crippled by the fact that nvidia-modeset is a completely opaque blob that keeps some internal state and tries to act on it, getting stuck when it tries to do something to the now-missing eGPU.
Speculative Load Hardening
While several approaches are being actively pursued to mitigate specific branches and/or loads inside especially risky software (most notably various OS kernels), these approaches require manual and/or static analysis aided auditing of code and explicit source changes to apply the mitigation. They are unlikely to scale well to large applications. We are proposing a comprehensive mitigation approach that would apply automatically across an entire program rather than through manual changes to the code. While this is likely to have a high performance cost, some applications may be in a good position to take this performance / security tradeoff.
Translating Quake 3 into Rust
The Rust-loving team at Immunant has been hard at work on C2Rust, a migration framework that takes the drudgery out of migrating to Rust. Our goal is to make safety improvements to the translated Rust automatically where we can, and help the programmer do the same where we cannot. First, however, we have to build a rock-solid translator that gets people up and running in Rust. Testing on small CLI programs gets old eventually, so we decided to try translating Quake 3 into Rust. After a couple of days, we were likely the first people to ever play Quake3 in Rust!
A new cycle-stepped 6502 CPU emulator
I wrote a new version of my 6502/6510 emulator in the last weeks which can be stepped forward in clock cycles instead of full instructions.
Pointer authentication is a technology which offers strong probabilistic protection against exploiting a broad class of memory bugs to take control of program execution. When adopted consistently in a language ABI, it provides a form of relatively fine-grained control flow integrity (CFI) check that resists both return-oriented programming (ROP) and jump-oriented programming (JOP) attacks.
While pointer authentication can be implemented purely in software, direct hardware support (e.g. as provided by ARMv8.3) can dramatically lower the execution speed and code size costs. Similarly, while pointer authentication can be implemented on any architecture, taking advantage of the (typically) excess addressing range of a target with 64-bit pointers minimizes the impact on memory performance and can allow interoperation with existing code (by disabling pointer authentication dynamically). This document will generally attempt to present the pointer authentication feature independent of any hardware implementation or ABI. Considerations that are implementation-specific are clearly identified throughout.
Chunking Optimizations: Let the Knife Do the Work
“Letting the knife do the work” means writing a correct program and lifting unnecessary constraints so that the compiler can use whatever chunk size is appropriate for the target.
Addressing of AF_INET, AF_INET6 and AF_UNIX sockets
A freshly created socket isn’t very useful. We have to tell it to either listen for incoming data, or connect to a remote peer. To achieve anything useful we need to perform a syscall dance, which involves either bind() or connect() or both.
And some notes about the DNS resolver rabbit hole.
Clang format tanks performance
Let’s benchmark toupper implementations.
Actually, I don’t really care about toupper much at all, but I was writing a different post and needed a peg to hang my narrative hat on, and hey toupper seems like a nice harmless benchmark. Despite my effort to choose something which should be totally straightforward and not sidetrack me, this weird thing popped out.
Tearing apart printf()
If ‘Hello World’ is the first program for C students, then printf() is probably the first function. I’ve had to answer questions about printf() many times over the years, so I’ve finally set aside time for an informal writeup.
This wild goose chase is not only a great learning experience, but also an interesting test for the dedicated beginner. Will they come back with an answer? If so, how detailed is it? What IS a good answer?
children_tcache writeup and tcache overview
This article is intended for the people who already have some knowledge about heap exploitation. If you already know some heap attacks on glibc<2.26 it’ll be fully understandable to you. But if you don’t, don’t worry - I’ve tried to make this post approachable for everyone with just basic knowledge. If you really know nothing about the topic, I recommend heap-exploitation.
Tcache is an internal mechanism responsible for heap management. It was introduced in glibc 2.26 in the year 2017. It’s objective is to speed up the heap management. Older algorithms are not removed, but they are still used sometimes - for example for bigger chunks, or when an appropriate tcache bin is full. But heap exploitation with this mechanism is a lot easier due to a lack of heap integrity checks.
Analyzing Android's CVE-2019-2215 (/dev/binder UAF)
Over the past few weeks, those of you who frequent the DAY streams over on our Twitch may have seen me working on trying to understand the recent Android Binder Use-After-Free (UAF) published by Google’s Project Zero (p0). This bug is actually not new, the issue was discovered and fixed in the mainline kernel in February 2018, however, p0 discovered many popular devices did not receive the patch downstream. Some of these devices include the Pixel 2, the Huawei P20, and Samsung Galaxy S7, S8, and S9 phones. I believe many of these devices received security patches within the last couple weeks that finally killed the bug.
After a few streams of poking around with a kernel debugger on a virtual machine (running Android-x86), and testing with a vulnerable Pixel 2, I’ve came to understand the exploit written by Jann Horn and Maddie Stone pretty well. Without an understanding of Binder (the binder_thread object specifically), as well as how Vectored I/O works, the exploit can be pretty confusing. It’s also quite clever how they exploited this issue, so I thought it would be cool to write up how the exploit works.
Trap representations and padding bits
The C programming language does not hide from you how the values you manipulate are represented. One consequence is that when padding happens, its presence may have observable effects in carelessly crafted programs. Padding is well-known to appear between members of a struct, and also possibly after the last member of a struct. The remaining space in a union when the active member is not the widest one is also considered padding. A C programmer that only cares for usual x86 platforms might be excused for thinking that, for them, this is it. As for trap representations, these may be believed to be reserved for weird hardware that use one’s complement or explicit parity bits.
The example at the end regarding _Bool is nice.