Running the “Reflections on Trusting Trust” Compiler
In October 1983, 40 years ago this week, Ken Thompson chose supply chain security as the topic for his Turing award lecture, although the specific term wasn’t used back then. (The field of computer science was still young and small enough that the ACM conference where Ken spoke was the “Annual Conference on Computers.”) Ken’s lecture was later published in Communications of the ACM under the title “Reflections on Trusting Trust.” It is a classic paper, and a short one (3 pages); if you haven’t read it yet, you should. This post will still be here when you get back.
In the lecture, Ken explains in three steps how to modify a C compiler binary to insert a backdoor when compiling the “login” program, leaving no trace in the source code. In this post, we will run the backdoored compiler using Ken’s actual code. But first, a brief summary of the important parts of the lecture.
A tale of /dev/fd
Many versions of Unix provide a /dev/fd directory to work with open file handles as if they were regular files. As usual, the devil is in the details.
"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs
This paper reflects work done in late 2022 and 2023 to audit for vulnerabilities in terminal emulators, with a focus on open source software. The results of this work were 10 CVEs against terminal emulators that could result in Remote Code Execution (RCE), in addition various other bugs and hardening opportunities were found. The exact context and severity of these vulnerabilities varied, but some form of code execution was found to be possible on several common terminal emulators across the main client platforms of today.
The Internet Worm Program: An Analysis
This report gives a detailed description of the components of the worm program—data and functions. It is based on study of two completely independent reverse-compilations of the worm and a version disassembled to VAX assembly language. Almost no source code is given in the paper because of current concerns about the state of the ‘‘immune system’’ of Internet hosts, but the description should be detailed enough to allow the reader to understand the behavior of the program.
And some modern commentary: https://infosec.exchange/@hovav/110950949212380779
Shamir Secret Sharing
It’s 3am. Paul, the head of PayPal database administration carefully enters his elaborate passphrase at a keyboard in a darkened cubicle of 1840 Embarcadero Road in East Palo Alto, for the fifth time. He hits Return. The green-on-black console window instantly displays one line of text: “Sorry, one or more wrong passphrases. Can’t reconstruct the key. Goodbye.”
This is the story of a catastrophic software bug I briefly introduced into the PayPal codebase that almost cost us the company (or so it seemed, in the moment.)
Today, should you try to read up the programmer’s manual (AKA the man page) on getpass, you will find it has been long declared obsolete and replaced with a more intelligent alternative in nearly all flavors of modern Unix.
A fork() in the road
The received wisdom suggests that Unix’s unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability. We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives.
Lotus 1-2-3 For Linux
I’ll cut to the chase; through a combination of unlikely discoveries, crazy hacks and the 90s BBS warez scene I’ve been able to port Lotus 1-2-3 natively to Linux – an operating system that literally didn’t exist when 1-2-3 was released!
That simple script is still someone's bad day
What’s this? The second part of the pipeline still ran? Of course it did. It’s *already running* at the point that the reader fails. Its stdin is hooked to the stdout of the other thing, Unix-centipede style.
The Dirty Pipe Vulnerability
This is the story of CVE-2022-0847, a vulnerability in the Linux kernel since 5.8 which allows overwriting data in arbitrary read-only files. This leads to privilege escalation because unprivileged processes can inject code into root processes.
It all started a year ago with a support ticket about corrupt files. A customer complained that the access logs they downloaded could not be decompressed. And indeed, there was a corrupt log file on one of the log servers; it could be decompressed, but gzip reported a CRC error. I could not explain why it was corrupt, but I assumed the nightly split process had crashed and left a corrupt file behind. I fixed the file’s CRC manually, closed the ticket, and soon forgot about the problem.
Months later, this happened again and yet again. Every time, the file’s contents looked correct, only the CRC at the end of the file was wrong. Now, with several corrupt files, I was able to dig deeper and found a surprising kind of corruption. A pattern emerged.
Sandboxing and Workload Isolation
Workload isolation makes it harder for a vulnerability in one service to compromise every other part of the platform. It has a long history going back to 1990s qmail, and we generally agree that it’s a good, useful thing.
From chroot to privsep to docker to firecracker.
A handy diff argument handling feature that's actually very old
If only one of file1 and file2 is a directory, diff shall be applied to the non-directory file and the file contained in the directory file with a filename that is the same as the last component of the non-directory file.
Hacking With Environment Variables
Interesting environment variables to supply to scripting language interpreters
Make system(3) and popen(3) use posix_spawn(3) internally
After 1 week of reading POSIX and writing code, 2 weeks of coding and another 1.5 weeks of bugfixes I have successfully implemented posix_spawn in usage in system(3) and popen(3) internally.
Unix's design issue of device numbers being in stat() results for files
Sometimes, you will hear the view that Unix’s design is without significant issues, especially the ‘pure’ design of Research Unix (before people who didn’t really understand Unix like Berkeley and corporate AT&T got their hands on it). Unfortunately that is not the case, and there are some areas where Research Unix made decisions that still haunt us to this day. For reasons beyond the scope of this entry, today’s example is that part of the file attributes that you get from stat() system call and its friends is the ‘device number’ of the filesystem the file is on.
I think it’s a bit exaggerated to say this is an issue that haunts us. More like a historical note.
Why is there a "V" in SIGSEGV Segmentation Fault?
My program received a SIGSEGV signal and crashed with “Segmentation Fault” message. Where does the “V” come from?
The Deprecated *nix API
But for “*nix”, without any clarifying context, I for one think in terms of shell scripts and their utilities. And the problem is that my own naïve scripts, despite being written on a legit *nix variant, simply will not run on a vanilla Linux, macOS, or *BSD installation. They certainly can—I can install fish, and sd, and ripgrep, and whatever else I’m using, very easily—but those tools aren’t available out-of-the-box, any more than, I dunno, the PowerShell 6 for Linux is.
Exploring munmap() on page zero and on unmapped address space
The difference between Linux and FreeBSD is in what they consider to be ‘outside the valid range for the address space of a process’. FreeBSD evidently considers page zero (and probably low memory in general) to always be outside this range, and thus munmap() fails. Linux does not; while it doesn’t normally let you mmap() memory in that area, for good reasons, it is not intrinsically outside the address space. If I’m reading the Linux kernel code correctly, no low address range is ever considered invalid, only address ranges that cross above the top of user space.
The Early History of Usenet
>November 2019 is, as best I can recall, the 40th anniversary of the conception of Usenet. (What’s Usenet? The Wikipedia article is ok but not perfect.) I should have written a proper paper; instead, there will (probably) be an irregular series of blog posts.
I didn’t notice the series concluded a while back, so if you were waiting to read the whole thing, it’s done.
How are Unix pipes implemented?
This article is about how pipes are implemented the Unix kernel. I was a little disappointed that a recent article titled “How do Unix pipes work?” was not about the internals, and curious enough to go digging in some old sources to try to answer the question.
Curiosity around 'exec_id' and some problems associated with it
The logic responsible for handling ->exit_signal has been changed a few times and the current logic is locked down since Linux kernel 3.3.5. However, it is not fully robust and it’s still possible for the malicious user to bypass it. Basically, it’s possible to send arbitrary signals to a privileged (suidroot) parent process (Problem I.). Nevertheless, it’s not trivial and more limited comparing to the CVE-2009-1337.