How are Unix pipes implemented?
> This article is about how pipes are implemented the Unix kernel. I was a little disappointed that a recent article titled “How do Unix pipes work?” was not about the internals, and curious enough to go digging in some old sources to try to answer the question.
A "living" Linux process with no memory
> This code gets a list of all memory maps from /proc/self/maps, then creates a new executable map where it jits some code that calls munmap() on each of the maps it just got, and finally on the map it’s on. This is just a quick example with no portability in mind, so the source code contains the actual bytes that would be emitted by a x64 compiler. After unmapping the final map, where the jit code lies, there’s no new instruction to execute and a segfault is raised.
Hypervisor Necromancy; Reanimating Kernel Protectors
> In this (rather long) article we will be investigating methods to emulate proprietary hypervisors under QEMU, which will allow researchers to interact with them in a controlled manner and debug them. Specifically, we will be presenting a minimal framework developed to bootstrap Samsung S8+ proprietary hypervisor as a demonstration, providing details and insights on key concepts on ARM low level development and virtualization extensions for interested readers to create their own frameworks and Actually Compile And Boot them ;). Finally, we will be investigating fuzzing implementations under this setup.
TRRespass: Exploiting the Many Sides of Target Row Refresh
> Well, after two years of rigorous research, looking inside what is implemented inside CPUs and DDR4 chips using novel reverse engineering techniques, we can tell you that we do not live in a Rowhammer-free world. And we will not for the better part of this decade. Turns out while the old hammering techniques no longer work, once we understand the exact nature of these mitigations inside modern DDR4 chips, using new hammering patterns it is trivial to again trigger plenty of new bit flips. Yet again, these results show the perils of lack of transparency and security-by-obscurity. This is especially problematic since unlike software vulnerabilities, we cannot fix these hardware bit flips post-production.
Escaping the Chrome Sandbox with RIDL
> Vulnerabilities that leak cross process memory can be exploited to escape the Chrome sandbox. An attacker is still required to compromise the renderer prior to mounting this attack. To protect against attacks on affected CPUs make sure your microcode is up to date and disable hyper-threading (HT).
This is a pretty clear write-up and comes with a nice footnote:
> When I started working on this I was surprised that it’s still exploitable even though the vulnerabilities have been public for a while. If you read guidance on the topic, they will usually talk about how these vulnerabilities have been mitigated if your OS is up to date with a note that you should disable hyper threading to protect yourself fully. The focus on mitigations certainly gave me a false sense that the vulnerabilities have been addressed and I think these articles could be more clear on the impact of leaving hyper threading enabled.
Avoiding gaps in IOMMU protection at boot
> But setting things up in the OS isn’t sufficient. If an attacker is able to trigger arbitrary DMA before the OS has started then they can tamper with the system firmware or your bootloader and modify the kernel before it even starts running. So ideally you want your firmware to set up the IOMMU before it even enables any external devices, and newer firmware should actually do this automatically. It sounds like the problem is solved.
> This post explains how to implement heap allocators from scratch. It presents and discusses different allocator designs, including bump allocation, linked list allocation, and fixed-size block allocation. For each of the three designs, we will create a basic implementation that can be used for our kernel.
A Deep Dive Into Samsung's TrustZone
> After a general introduction on the ARM TrustZone and a focus on Qualcomm’s implementation, this new series of articles will discuss and detail the implementation developed by Samsung and Trustonic.
> These blog posts are a follow up to the conference Breaking Samsung’s ARM TrustZone that was given at BlackHat USA this summer. While an event such as this one is a great opportunity to present a subject we have been working on, many details have to be overlooked to fit the 50-minute format. This blog post, and the following ones, will explain all the details that were missing from the presentation as well as release the different tools mentioned in the talk and developed along the way.
The Year Ahead
> There are a few conferences from 2019 that I didn’t manage to get to last year (notably CCS, SOCC, and NeurIPS) which are still on my plate. And then I’ve pulled together this initial ‘watch list’ for the coming year.
Purgeable Memory Allocations for Linux
> It allocates anonymous pages using mmap(2). When the allocation is “unlocked” — i.e. the process isn’t actively using it — its pages are marked with MADV_FREE so that the kernel can reclaim them at any time. To lock the allocation so that the process can safely make use of them, the MADV_FREE is canceled. This is all a little trickier than it sounds, and that’s the subject of this article.
CPU Introspection: Intel Load Port Snooping
> We’re going to go into a unique technique for observing and sequencing all load port traffic on Intel processors. By using a CPU vulnerability from the MDS set of vulnerabilities, specifically multi-architectural load port data sampling (MLPDS, CVE-2018-12127), we are able to observe values which fly by on the load ports. Since (to my knowledge) all loads must end up going through load ports, regardless of requestor, origin, or caching, this means in theory, all contents of loads ever performed can be observed. By using a creative scanning technique we’re able to not only view “random” loads as they go by, but sequence loads to determine the ordering and timing of them.
> We’ll go through some examples demonstrating that this technique can be used to view all loads as they are performed on a cycle-by-cycle basis. We’ll look into an interesting case of the micro-architecture updating accessed and dirty bits using a microcode assist. These are invisible loads dispatched on the CPU on behalf of the user when a page is accessed for the first time.
On the Metal: Ron Minnich
> On this episode of On the Metal, we interview Ron Minnich. Ron has had a fascinating career working on the interface between software and hardware. Join us as ~we install Gentoo and compile GCC~ to hear a mesmerizing conversation about Unix, Plan9, LinuxBIOS, Chromebooks, RISC-V, of course some Gentoo jokes, flip flop programming toys, and more!
Didn’t actually listen, but there’s a pile of links here anyway.
Sculpt OS on HP EliteBook 840 G5
> Unfortunately, the first boot of a recent Sculpt OS USB flash drive just hanged after GRUB showing the GENODE boot logo. So, it was time to get my hands dirty and debug the boot process. From a debuggers point of view, the used i5-8350U CPU luckily comes with Intel vPRO support, which means enabling AMT Serial-Over-LAN is just a matter of some configuration tweaks. Additionally, I adapted the Sculpt configuration to use the core LOG service, which reflects all messages on the first UART - in our case (and thanks to bender) AMT SOL.
> Meet the ZedRipper – a 16-core, 83 MHz Z80 powerhouse as portable as it is impractical. The ZedRipper is my latest attempt to build a fun ‘project’ machine, with a couple of goals in mind:
Real hardware breakthroughs, and focusing on rustc
> After the addition of the NVMe driver a couple months ago, I have been running Redox OS permanently (from an install to disk) on a System76 Galago Pro (galp3-c), with System76 Open Firmware as well as the un-announced, in-development, GPLv3 System76 EC firmware . This particular hardware has full support for the keyboard, touchpad, storage, and ethernet, making it easy to use with Redox.
> Building Redox OS on Redox OS has always been one of the highest priorities of the project. Rustc seems to be only a few months of work away, after which I can begin to improve the system while running on it permanently, at least on one machine. With Redox OS being a microkernel, it is possible that even the driver level could be recompiled and respawned without downtime, making it incredibly fast to develop for. With this in place, I would work more efficiently on porting more software and tackling more hardware support issues, such as filling in the USB stack and adding graphics drivers.
The Bytecode Alliance: Building a secure, composable future for WebAssembly
> We have a vision of a WebAssembly ecosystem that is secure by default, fixing cracks in today’s software foundations. And based on advances rapidly emerging in the WebAssembly community, we believe we can make this vision real.
> WebAssembly can provide the kind of isolation that makes it safe to run untrusted code. We can have an architecture that’s like Unix’s many small processes, or like containers and microservices. But this isolation is much lighter weight, and the communication between them isn’t much slower than a regular function call. This means you can use them to wrap a single WebAssembly module instance, or a small collection of module instances that want to share things like memory among themselves.
Snap: a microkernel approach to host networking
> This paper describes the networking stack, Snap, that has been running in production at Google for the last three years+. It’s been clear for a while that software designed explicitly for the data center environment will increasingly want/need to make different design trade-offs to e.g. general-purpose systems software that you might install on your own machines. But wow, I didn’t think we’d be at the point yet where we’d be abandoning TCP/IP! You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea.
The July Galileo Outage: What happened and why
> This post is an excerpt of a far longer post on Galileo, its structures and the cause of the outage. Here we’ll only focus on the outage - the potential underlying reasons behind it are described in the full article.
> Since the week-long outage in July I’ve been fascinated by Galileo and, together with a wonderful crew of developers, experts and receiver operators, have learned so much about what I now know are called ‘Global Navigation Satellite Systems’ or GNSS. This has lead to the galmon.eu project, which monitors the health and vital statistics of GPS, Galileo, BeiDou and GLONASS. More about the project can be read in the full article.
I totally missed the fact that there was an outage, but some interesting commentary.
An analysis of performance evolution of Linux’s core operations
> When you get into the details I found it hard to come away with any strongly actionable takeaways though. Perhaps the most interesting lesson/reminder is this: it takes a lot of effort to tune a Linux kernel. For example:
> “Red Hat and Suse normally required 6-18 months to optimise the performance an an upstream Linux kernel before it can be released as an enterprise distribution”, and
> “Google’s data center kernel is carefully performance tuned for their workloads. This task is carried out by a team of over 100 engineers, and for each new kernel, the effort can also take 6-18 months.”
Making the Tokio scheduler 10x faster
> We’ve been hard at work on the next major revision of Tokio, Rust’s asynchronous runtime. Today, a complete rewrite of the scheduler has been submitted as a pull request. The result is huge performance and latency improvements. Some benchmarks saw a 10x speed up! It is always unclear how much these kinds of improvements impact “full stack” use cases, so we’ve also tested how these scheduler improvements impacted use cases like Hyper and Tonic (spoiler: it’s really good).
> In preparation for working on the new scheduler, I spent time searching for resources on scheduler implementations. Besides existing implementations, I did not find much. I also found the source of existing implementations difficult to navigate. To remedy this, I tried to keep Tokio’s new scheduler implementation as clean as possible. I also am writing this detailed article on implementing the scheduler in hope that others in similar positions find it useful.
> The article starts with a high level overview of scheduler design, including work-stealing schedulers. It then gets into the details of specific optimizations made in the new Tokio scheduler.