The multi-generational LRU
One of the key tasks assigned to the memory-management subsystem is to optimize the system’s use of the available memory; that means pushing out pages containing unused data so that they can be put to better use elsewhere. Predicting which pages will be accessed in the near future is a tricky task, and the kernel has evolved a number of mechanisms designed to improve its chances of guessing right. But the kernel not only often gets it wrong, it also can expend a lot of CPU time to make the incorrect choice. The multi-generational LRU patch set posted by Yu Zhao is an attempt to improve that situation.
Some near-term arm64 hardening patches
The arm64 architecture is found at the core of many, if not most, mobile devices; that means that arm64 devices are destined to be the target of attackers worldwide. That has led to a high level of interest in technologies that can harden these systems. There are currently several such technologies, based in both hardware and software, that are being readied for the arm64 kernel; read on for a survey on what is coming.
vDSO, 32-bit time, and seccomp
The seccomp() mechanism is notoriously difficult to use. It also turns out to be easy to break unintentionally, as the development community discovered when a timekeeping change meant to address the year-2038 problem created a regression for seccomp() users in the 5.3 kernel. Work is underway to mitigate the problem for now, but seccomp() users on 32-bit systems are likely to have to change their configurations at some point.
The problems inherent in exposing very low level interfaces in one place (seccomp) and high level interfaces in another (libc).
A proposed API for full-memory encryption
Hardware memory encryption is, or will soon be, available on multiple generic CPUs. In its absence, data is stored — and passes between the memory chips and the processor — in the clear. Attackers may be able to access it by using hardware probes or by directly accessing the chips, which is especially problematic with persistent memory. One new memory-encryption offering is Intel’s Multi-Key Total Memory Encryption (MKTME) [PDF]; AMD’s equivalent is called Secure Encrypted Virtualization (SEV). The implementation of support for this feature is in progress for the Linux kernel. Recently, Alison Schofield proposed a user-space API for MKTME, provoking a long discussion on how memory encryption should be exposed to the user, if at all.
The Firecracker virtual machine monitor
Cloud computing services that run customer code in short-lived processes are often called “serverless”. But under the hood, virtual machines (VMs) are usually launched to run that isolated code on demand. The boot times for these VMs can be slow. This is the cause of noticeable start-up latency in a serverless platform like Amazon Web Services (AWS) Lambda. To address the start-up latency, AWS developed Firecracker, a lightweight virtual machine monitor (VMM), which it recently released as open-source software. Firecracker emulates a minimal device model to launch Linux guest VMs more quickly. It’s an interesting exploration of improving security and hardware utilization by using a minimal VMM built with almost no legacy emulation.
What's a CPU to do when it has nothing to do?
Idle states are not free to enter or exit. Entry and exit both require some time, and moreover power consumption briefly rises slightly above normal for the current state on entry to idle and above normal for the destination state on exit from idle. Although increasingly deep idle states consume decreasing amounts of power, they have increasingly large costs to enter and exit. This implies that for short idle periods, a fairly shallow idle state is the best use of system resources; for longer idle periods, the costs of a deeper idle state will be justified by the increased power savings while idle. It is therefore in the kernel’s best interests to predict how long a CPU will be idle before deciding how deeply to idle it. This is the job of the idle loop.
Making C Less Dangerous
Kees Cook gave a presentation on some of the dangers that come with programs written in C. In particular, of course, the Linux kernel is mostly written in C, which means that the security of our systems rests on a somewhat dangerous foundation. But there are things that can be done to help firm things up by “Making C Less Dangerous” as the title of his talk suggested.
The NOVA filesystem
NOVA is intended to be such a filesystem. It is not just unsuited for regular block devices, it cannot use them at all, since it does not use the kernel’s block layer. Instead, it works directly with storage mapped into the kernel’s address space.
Memory use in CPython and MicroPython
Two ways to do the same thing, the big way and the little way.
2038: only 21 years away
Bigger time_t. Either in the kernel or in userspace or maybe somewhere in between.
I’m pretty sure this is the biggest release we’ve ever had, at least
in number of commits.
Function multi-versioning in GCC 6
Toolchain magic for specialized functions.
Linux on the Mac — state of the union
It’s complicated. Lots of custom drivers for custom hardware.
The Emacs dumper dispute
It’s hard work making an editor resume from hibernation in a fast and portable manner.
The status of linux kernel hardening
KASLR, link randomization, memory management, ref counting, etc.