Purgeable Memory Allocations for Linux
> It allocates anonymous pages using mmap(2). When the allocation is “unlocked” — i.e. the process isn’t actively using it — its pages are marked with MADV_FREE so that the kernel can reclaim them at any time. To lock the allocation so that the process can safely make use of them, the MADV_FREE is canceled. This is all a little trickier than it sounds, and that’s the subject of this article.
On Linux's Random Number Generation
> I have been asked about the usefulness of security monitoring of entropy levels in the Linux kernel. This calls for some explanation of how random generation works in Linux systems.
> So, randomness and the Linux kernel. This is an area where there is longstanding confusion, notably among some Linux kernel developers, including Linus Torvalds himself.
Sculpt OS on HP EliteBook 840 G5
> Unfortunately, the first boot of a recent Sculpt OS USB flash drive just hanged after GRUB showing the GENODE boot logo. So, it was time to get my hands dirty and debug the boot process. From a debuggers point of view, the used i5-8350U CPU luckily comes with Intel vPRO support, which means enabling AMT Serial-Over-LAN is just a matter of some configuration tweaks. Additionally, I adapted the Sculpt configuration to use the core LOG service, which reflects all messages on the first UART - in our case (and thanks to bender) AMT SOL.
Boosting the Real Time Performance of Gnome Shell 3.34 in Ubuntu 19.10
> As you may have read many times, Gnome 3.34 brings much improved desktop performance. In this article we will describe some of the improvements contributed by Canonical, how the problems were surprising, how they were approached and what other performance work is coming in future.
> The thing is in the case of Gnome Shell its biggest performance problems of late were not hot spots at all. They were better characterised as cold spots where it was idle instead of updating the screen smoothly. Such cold spots are only apparent when you look at the real time usage of a program, and in not the CPU or GPU time consumed.
Nice write-up on addressing stuttering and lag.
Some near-term arm64 hardening patches
> The arm64 architecture is found at the core of many, if not most, mobile devices; that means that arm64 devices are destined to be the target of attackers worldwide. That has led to a high level of interest in technologies that can harden these systems. There are currently several such technologies, based in both hardware and software, that are being readied for the arm64 kernel; read on for a survey on what is coming.
drgn - Scriptable debugger library
> drgn (pronounced “dragon“) is a debugger-as-a-library. In contrast to existing debuggers like GDB which focus on breakpoint-based debugging, drgn excels in live introspection. drgn exposes the types and variables in a program for easy, expressive scripting in Python.
children_tcache writeup and tcache overview
> This article is intended for the people who already have some knowledge about heap exploitation. If you already know some heap attacks on glibc<2.26 it’ll be fully understandable to you. But if you don’t, don’t worry - I’ve tried to make this post approachable for everyone with just basic knowledge. If you really know nothing about the topic, I recommend heap-exploitation.
> Tcache is an internal mechanism responsible for heap management. It was introduced in glibc 2.26 in the year 2017. It’s objective is to speed up the heap management. Older algorithms are not removed, but they are still used sometimes - for example for bigger chunks, or when an appropriate tcache bin is full. But heap exploitation with this mechanism is a lot easier due to a lack of heap integrity checks.
Analyzing Android's CVE-2019-2215 (/dev/binder UAF)
> Over the past few weeks, those of you who frequent the DAY streams over on our Twitch may have seen me working on trying to understand the recent Android Binder Use-After-Free (UAF) published by Google’s Project Zero (p0). This bug is actually not new, the issue was discovered and fixed in the mainline kernel in February 2018, however, p0 discovered many popular devices did not receive the patch downstream. Some of these devices include the Pixel 2, the Huawei P20, and Samsung Galaxy S7, S8, and S9 phones. I believe many of these devices received security patches within the last couple weeks that finally killed the bug.
> After a few streams of poking around with a kernel debugger on a virtual machine (running Android-x86), and testing with a vulnerable Pixel 2, I’ve came to understand the exploit written by Jann Horn and Maddie Stone pretty well. Without an understanding of Binder (the binder_thread object specifically), as well as how Vectored I/O works, the exploit can be pretty confusing. It’s also quite clever how they exploited this issue, so I thought it would be cool to write up how the exploit works.
An analysis of performance evolution of Linux’s core operations
> When you get into the details I found it hard to come away with any strongly actionable takeaways though. Perhaps the most interesting lesson/reminder is this: it takes a lot of effort to tune a Linux kernel. For example:
> “Red Hat and Suse normally required 6-18 months to optimise the performance an an upstream Linux kernel before it can be released as an enterprise distribution”, and
> “Google’s data center kernel is carefully performance tuned for their workloads. This task is carried out by a team of over 100 engineers, and for each new kernel, the effort can also take 6-18 months.”
How a months-old AMD microcode bug destroyed my weekend
> Unfortunately, unpatched Ryzen 3000 says “yes” to the CPUID 01H call, sets the carry bit indicating it has successfully created the most artisanal, organic high-quality random number possible... and gives you a 0xFFFFFFFF for the “random” number, every single time.
> Unfortunately, after successfully applying the update and rebooting again, I realized my error—yes, Asus showed a later date for the BIOS, but the actual version was the same as the one I already had—3.2.0. My CPU still thought 0xFFFFFFFF was the randomest number ever, always, no matter what.
> At this point, I began to get paranoid—systemd had already quietly worked around the bug. But with most applications just quietly ignoring the problem, how would I know if it ever had been patched? What if two years later, I was still vulnerable to stack-smashing that I shouldn’t have been, due to ASLR that wasn’t actually randomizing?
Another entry for the bad workarounds file.
> unfork(2) is the inverse of fork(2). sort of.
> By combining userfaultfd with process_vm_readv, any userspace application can obtain a copy-on-write mapping (with some limitations) of memory it never owned. All it needs is ptrace privileges, which is to say, having the same uid usually works.
Binary symbolic execution with KLEE-Native
> KLEE is a symbolic execution tool that intelligently produces high-coverage test cases by emulating LLVM bitcode in a custom runtime environment. Yet, unlike simpler fuzzers, it’s not a go-to tool for automated bug discovery. Despite constant improvements by the academic community, KLEE remains difficult for bug hunters to adopt. We’re working to bridge this gap!
> My internship produced KLEE-Native; a version of KLEE that can concretely and symbolically execute binaries, model heap memory, reproduce CVEs, and accurately classify different heap bugs. The project is now positioned to explore applications made possible by KLEE-Native’s unique approaches to symbolic execution. We will also be looking into potential execution time speed-ups from different lifting strategies. As with all articles on symbolic execution, KLEE is both the problem and the solution.
vDSO, 32-bit time, and seccomp
> The seccomp() mechanism is notoriously difficult to use. It also turns out to be easy to break unintentionally, as the development community discovered when a timekeeping change meant to address the year-2038 problem created a regression for seccomp() users in the 5.3 kernel. Work is underway to mitigate the problem for now, but seccomp() users on 32-bit systems are likely to have to change their configurations at some point.
The problems inherent in exposing very low level interfaces in one place (seccomp) and high level interfaces in another (libc).
Killing a process and all of its descendants
> Unix-like operating systems have sophisticated process relationships. Parent-child, process groups, sessions, and session leaders. However, the details are not uniform across operating systems like Linux and macOS. POSIX compliant operating systems support sending signals to process groups with a negative PID number.
I think some of this is not entirely correct, but as noted, it’s a complicated subject.
shebangs and busybox
> neat, right? this lets us write shell scripts that are portable across all sorts of different setups. except there’s a problem.
Extending the Kernel with Built-in Kernel Headers
> Kernel headers are usually unavailable on the target where these BPF tracing programs need to be dynamically compiled and run. That is certainly the case with Android, which runs on billions of devices. It is not practical to ship custom kernel headers for every device. My solution to the problem is to embed the kernel headers within the kernel image itself and make it available through the sysfs virtual filesystem (usually mounted at /sys) as a compressed archive file (/sys/kernel/kheaders.tar.xz). This archive can be uncompressed as needed to a temporary directory. This simple change guarantees that the headers are always shipped with the running kernel.
I feel this is the wrong solution, but interesting nevertheless.
security things in Linux v5.2
> page allocator freelist randomization
And some other things as well.
OpenSSH Taking Minutes To Become Available, Booting Takes Half An Hour ... Because Your Server Waits For A Few Bytes Of Randomness
> Basically as of now the entropy file saved as /var/lib/systemd/random-seed will not - drumroll - add entropy to the random pool when played back during boot. Actually it will. It will just not be accounted for. So Linux doesn’t know. And continues blocking getrandom(). This is obviously different from SysVinit times2 when /var/lib/urandom/random-seed (that you still have lying around on updated systems) made sure the system carried enough entropy over reboot to continue working right after enough of the system was booted.
And then... it just kinda keeps getting worse. The problem is understandable, the inability to resolve it less so.
> Let’s talk about files! Most developers seem to think that files are easy.
> In this talk, we’re going to look at how file systems differ from each other and other issues we might encounter when writing to files. We’re going to look at the file “stack”, starting at the top with the file API, moving down to the filesystem, and then moving down to disk.
Linux and FreeBSD Kernel: Multiple TCP-based remote denial of service issues
> Netflix has identified several TCP networking vulnerabilities in FreeBSD and Linux kernels. The vulnerabilities specifically relate to the minimum segment size (MSS) and TCP Selective Acknowledgement (SACK) capabilities. The most serious, dubbed “SACK Panic,” allows a remotely-triggered kernel panic on recent Linux kernels.