Curiosity around 'exec_id' and some problems associated with it
> The logic responsible for handling ->exit_signal has been changed a few times and the current logic is locked down since Linux kernel 3.3.5. However, it is not fully robust and it’s still possible for the malicious user to bypass it. Basically, it’s possible to send arbitrary signals to a privileged (suidroot) parent process (Problem I.). Nevertheless, it’s not trivial and more limited comparing to the CVE-2009-1337.
A "living" Linux process with no memory
> This code gets a list of all memory maps from /proc/self/maps, then creates a new executable map where it jits some code that calls munmap() on each of the maps it just got, and finally on the map it’s on. This is just a quick example with no portability in mind, so the source code contains the actual bytes that would be emitted by a x64 compiler. After unmapping the final map, where the jit code lies, there’s no new instruction to execute and a segfault is raised.
Speeding up Linux disk encryption
> At one point we noticed that our disks were not as fast as we would like them to be. Some profiling as well as a quick A/B test pointed to Linux disk encryption. Because not encrypting the data (even if it is supposed-to-be a public Internet cache) is not a sustainable option, we decided to take a closer look into Linux disk encryption performance.
> To be fair the request does not always traverse all these queues, but the important part here is that write requests may be queued up to 4 times in dm-crypt and read requests up to 3 times. At this point we were wondering if all this extra queueing can cause any performance issues. For example, there is a nice presentation from Google about the relationship between queueing and tail latency. One key takeaway from the presentation is: A significant amount of tail latency is due to queueing effects
Another look at two Linux KASLR patches
> In the end, this random number generator was quickly removed, and that was that. But one can still wonder—is this generator secure but unanalyzed, or would it have been broken just to prove a point?
A Compendium of Container Escapes
> The goal of this talk is to broaden the awareness of the how and why container escapes work, starting from a brief intro to what makes a process a container, and then spanning the gamut of escape techniques, covering exposed orchestrators, access to the Docker socket, exposed mount points, /proc, all the way down to overwriting/exploiting the kernel structures to leave the confines of the container.
The FreeBSD-linuxulator explained (for users)
> First, the linuxulator is not an emulation. It is “just” a binary interface which is a little bit different from the FreeBSD-“native”-one. This means that the binary files in FreeBSD and Linux are both files which comply to the ELF specification.
Patching nVidia GPU driver for hot-unplug on Linux
> Anyway, I was kind of annoyed of rebooting every time it happens, so I decided to reboot a few more dozen times instead while patching the driver. This has indeed worked, and left me with something similar to a functional hot-unplug, mildly crippled by the fact that nvidia-modeset is a completely opaque blob that keeps some internal state and tries to act on it, getting stuck when it tries to do something to the now-missing eGPU.
Avoiding gaps in IOMMU protection at boot
> But setting things up in the OS isn’t sufficient. If an attacker is able to trigger arbitrary DMA before the OS has started then they can tamper with the system firmware or your bootloader and modify the kernel before it even starts running. So ideally you want your firmware to set up the IOMMU before it even enables any external devices, and newer firmware should actually do this automatically. It sounds like the problem is solved.
The Linux CSPRNG Is Now Good!
> Oceans of ink and hours on stage have been spent to convince the world that the best random number generator is /dev/urandom, the kernel one. And it is, and it’s always been. However, an uncomfortable truth was that the Linux CSPRNG really could have been better than it was. Userspace CSPRNGs couldn’t be better than the kernel one, so our advice was still valid, but that space for improvement always frustrated me.
> Good news everyone! In recent years, the Linux CSPRNG got a number of great incremental improvements, and I can now say in good conscience that it’s not only the best, it’s also good.
Purgeable Memory Allocations for Linux
> It allocates anonymous pages using mmap(2). When the allocation is “unlocked” — i.e. the process isn’t actively using it — its pages are marked with MADV_FREE so that the kernel can reclaim them at any time. To lock the allocation so that the process can safely make use of them, the MADV_FREE is canceled. This is all a little trickier than it sounds, and that’s the subject of this article.
On Linux's Random Number Generation
> I have been asked about the usefulness of security monitoring of entropy levels in the Linux kernel. This calls for some explanation of how random generation works in Linux systems.
> So, randomness and the Linux kernel. This is an area where there is longstanding confusion, notably among some Linux kernel developers, including Linus Torvalds himself.
Sculpt OS on HP EliteBook 840 G5
> Unfortunately, the first boot of a recent Sculpt OS USB flash drive just hanged after GRUB showing the GENODE boot logo. So, it was time to get my hands dirty and debug the boot process. From a debuggers point of view, the used i5-8350U CPU luckily comes with Intel vPRO support, which means enabling AMT Serial-Over-LAN is just a matter of some configuration tweaks. Additionally, I adapted the Sculpt configuration to use the core LOG service, which reflects all messages on the first UART - in our case (and thanks to bender) AMT SOL.
Boosting the Real Time Performance of Gnome Shell 3.34 in Ubuntu 19.10
> As you may have read many times, Gnome 3.34 brings much improved desktop performance. In this article we will describe some of the improvements contributed by Canonical, how the problems were surprising, how they were approached and what other performance work is coming in future.
> The thing is in the case of Gnome Shell its biggest performance problems of late were not hot spots at all. They were better characterised as cold spots where it was idle instead of updating the screen smoothly. Such cold spots are only apparent when you look at the real time usage of a program, and in not the CPU or GPU time consumed.
Nice write-up on addressing stuttering and lag.
Some near-term arm64 hardening patches
> The arm64 architecture is found at the core of many, if not most, mobile devices; that means that arm64 devices are destined to be the target of attackers worldwide. That has led to a high level of interest in technologies that can harden these systems. There are currently several such technologies, based in both hardware and software, that are being readied for the arm64 kernel; read on for a survey on what is coming.
drgn - Scriptable debugger library
> drgn (pronounced “dragon“) is a debugger-as-a-library. In contrast to existing debuggers like GDB which focus on breakpoint-based debugging, drgn excels in live introspection. drgn exposes the types and variables in a program for easy, expressive scripting in Python.
children_tcache writeup and tcache overview
> This article is intended for the people who already have some knowledge about heap exploitation. If you already know some heap attacks on glibc<2.26 it’ll be fully understandable to you. But if you don’t, don’t worry - I’ve tried to make this post approachable for everyone with just basic knowledge. If you really know nothing about the topic, I recommend heap-exploitation.
> Tcache is an internal mechanism responsible for heap management. It was introduced in glibc 2.26 in the year 2017. It’s objective is to speed up the heap management. Older algorithms are not removed, but they are still used sometimes - for example for bigger chunks, or when an appropriate tcache bin is full. But heap exploitation with this mechanism is a lot easier due to a lack of heap integrity checks.
Analyzing Android's CVE-2019-2215 (/dev/binder UAF)
> Over the past few weeks, those of you who frequent the DAY streams over on our Twitch may have seen me working on trying to understand the recent Android Binder Use-After-Free (UAF) published by Google’s Project Zero (p0). This bug is actually not new, the issue was discovered and fixed in the mainline kernel in February 2018, however, p0 discovered many popular devices did not receive the patch downstream. Some of these devices include the Pixel 2, the Huawei P20, and Samsung Galaxy S7, S8, and S9 phones. I believe many of these devices received security patches within the last couple weeks that finally killed the bug.
> After a few streams of poking around with a kernel debugger on a virtual machine (running Android-x86), and testing with a vulnerable Pixel 2, I’ve came to understand the exploit written by Jann Horn and Maddie Stone pretty well. Without an understanding of Binder (the binder_thread object specifically), as well as how Vectored I/O works, the exploit can be pretty confusing. It’s also quite clever how they exploited this issue, so I thought it would be cool to write up how the exploit works.
An analysis of performance evolution of Linux’s core operations
> When you get into the details I found it hard to come away with any strongly actionable takeaways though. Perhaps the most interesting lesson/reminder is this: it takes a lot of effort to tune a Linux kernel. For example:
> “Red Hat and Suse normally required 6-18 months to optimise the performance an an upstream Linux kernel before it can be released as an enterprise distribution”, and
> “Google’s data center kernel is carefully performance tuned for their workloads. This task is carried out by a team of over 100 engineers, and for each new kernel, the effort can also take 6-18 months.”
How a months-old AMD microcode bug destroyed my weekend
> Unfortunately, unpatched Ryzen 3000 says “yes” to the CPUID 01H call, sets the carry bit indicating it has successfully created the most artisanal, organic high-quality random number possible... and gives you a 0xFFFFFFFF for the “random” number, every single time.
> Unfortunately, after successfully applying the update and rebooting again, I realized my error—yes, Asus showed a later date for the BIOS, but the actual version was the same as the one I already had—3.2.0. My CPU still thought 0xFFFFFFFF was the randomest number ever, always, no matter what.
> At this point, I began to get paranoid—systemd had already quietly worked around the bug. But with most applications just quietly ignoring the problem, how would I know if it ever had been patched? What if two years later, I was still vulnerable to stack-smashing that I shouldn’t have been, due to ASLR that wasn’t actually randomizing?
Another entry for the bad workarounds file.
> unfork(2) is the inverse of fork(2). sort of.
> By combining userfaultfd with process_vm_readv, any userspace application can obtain a copy-on-write mapping (with some limitations) of memory it never owned. All it needs is ptrace privileges, which is to say, having the same uid usually works.