How are Unix pipes implemented?
> This article is about how pipes are implemented in the Unix kernel. I was a little disappointed that a recent article titled “How do Unix pipes work?” was not about the internals, and curious enough to go digging in some old sources to try to answer the question.
A "living" Linux process with no memory
> This code gets a list of all memory maps from /proc/self/maps, then creates a new executable map where it jits some code that calls munmap() on each of the maps it just got, and finally on the map it’s on. This is just a quick example with no portability in mind, so the source code contains the actual bytes that would be emitted by an x64 compiler. After unmapping the final map, where the jit code lies, there’s no new instruction to execute and a segfault is raised.
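The first step the quote describes, collecting the mapped address ranges, can be sketched on its own without any of the destructive parts. This is a minimal sketch, assuming the usual Linux `/proc/<pid>/maps` line layout (`start-end perms offset dev inode [path]`). `MemoryMap`, `parse_maps` and the sample text are hypothetical names and data made up for illustration, and nothing here is actually unmapped.

```python
from typing import List, NamedTuple


class MemoryMap(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text: str) -> List[MemoryMap]:
    """Parse the contents of a /proc/<pid>/maps file into address ranges."""
    maps = []
    for line in text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_hex, end_hex = fields[0].split("-")
        path = fields[5].strip() if len(fields) == 6 else ""
        maps.append(MemoryMap(int(start_hex, 16), int(end_hex, 16), fields[1], path))
    return maps


# Hypothetical sample in the /proc/<pid>/maps format (not real output).
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]
7f0000000000-7f0000001000 rw-p 00000000 00:00 0
"""

for m in parse_maps(SAMPLE):
    # A munmap(addr, length) call would take m.start and this length.
    print(hex(m.start), m.end - m.start, m.perms, m.path)
```

On a Linux machine the same function could be fed the text of `/proc/self/maps` instead of the sample.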
firefox's low-latency webassembly compiler
> The goals of high throughput and low latency conflict with each other. To get best throughput, a compiler needs to spend time on code motion, register allocation, and instruction selection; to get low latency, that’s exactly what a compiler should not do. Web browsers therefore take a two-pronged approach: they have a compiler optimized for throughput, and a compiler optimized for latency. As a WebAssembly file is being downloaded, it is first compiled by the quick-and-dirty low-latency compiler, with the goal of producing machine code as soon as possible. After that “baseline” compiler has run, the “optimizing” compiler works in the background to produce high-throughput code. The optimizing compiler can take more time because it runs on a separate thread. When the optimizing compiler is done, it replaces the baseline code. (The actual heuristics about whether to do baseline + optimizing (“tiering”) or just to go straight to the optimizing compiler are a bit hairy, but this is a summary.)
> This article is about the WebAssembly baseline compiler in Firefox. It’s a surprising bit of code and I learned a few things from it.
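The tiering idea in the quote (run baseline code right away, swap in optimized code when a background thread finishes) can be sketched in a few lines. This is a toy model and not how Firefox implements it: `baseline_compile`, `optimizing_compile` and `TieredFunction` are hypothetical names, and the two "compilers" just return Python closures that compute the same sum in a slow and a fast way.

```python
import threading
from typing import Callable


def baseline_compile(source: str) -> Callable[[int], int]:
    """Stand-in for the fast compiler: cheap to produce, slow to run."""
    def run(n: int) -> int:
        total = 0
        for i in range(n + 1):  # naive loop
            total += i
        return total
    return run


def optimizing_compile(source: str) -> Callable[[int], int]:
    """Stand-in for the slow compiler: costly to produce, fast to run."""
    def run(n: int) -> int:
        return n * (n + 1) // 2  # closed form of the same sum
    return run


class TieredFunction:
    def __init__(self, source: str) -> None:
        self.tier = "baseline"
        self._code = baseline_compile(source)  # available immediately
        self._lock = threading.Lock()
        self._worker = threading.Thread(target=self._optimize, args=(source,))
        self._worker.start()

    def _optimize(self, source: str) -> None:
        optimized = optimizing_compile(source)
        with self._lock:  # swap in the optimized code once it is ready
            self._code = optimized
            self.tier = "optimized"

    def __call__(self, n: int) -> int:
        with self._lock:
            code = self._code
        return code(n)

    def wait_for_optimized(self) -> None:
        self._worker.join()


f = TieredFunction("sum 0..n")
first = f(100)           # may run on either tier, depending on timing
f.wait_for_optimized()
second = f(100)          # runs on the optimized tier
print(first, second, f.tier)
```

Callers never see which tier they hit; both tiers must return the same result, which is the property that makes the swap safe.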
Speeding up Linux disk encryption
> At one point we noticed that our disks were not as fast as we would like them to be. Some profiling as well as a quick A/B test pointed to Linux disk encryption. Because not encrypting the data (even if it is supposed-to-be a public Internet cache) is not a sustainable option, we decided to take a closer look into Linux disk encryption performance.
> To be fair, the request does not always traverse all these queues, but the important part here is that write requests may be queued up to 4 times in dm-crypt and read requests up to 3 times. At this point we were wondering if all this extra queueing can cause any performance issues. For example, there is a nice presentation from Google about the relationship between queueing and tail latency. One key takeaway from the presentation is: A significant amount of tail latency is due to queueing effects.
Your Circuit Breaker is Misconfigured
> Circuit breakers are an incredibly powerful tool for making your application resilient to service failure. But they aren’t enough. Most people don’t know that a slightly misconfigured circuit is as bad as no circuit at all! Did you know that a change in 1 or 2 parameters can take your system from running smoothly to completely failing?
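To make the "1 or 2 parameters" concrete, here is a minimal circuit breaker sketch. It is an illustration under assumptions, not the configuration model of any particular library: `CircuitBreaker`, `failure_threshold` and `reset_timeout` are hypothetical names, the breaker counts consecutive failures only, and the clock is injectable so the example does not need to sleep.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], object]) -> object:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result


def failing() -> None:
    raise RuntimeError("service down")


now = [0.0]  # fake clock, in seconds
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0, clock=lambda: now[0])

for _ in range(2):
    try:
        breaker.call(failing)
    except RuntimeError:
        pass
# The circuit is now open: calls fail fast until reset_timeout has elapsed.
```

In this sketch, a `failure_threshold` set too high means the breaker rarely opens at all, and a `reset_timeout` set too low means it reopens the floodgates almost immediately; either way it behaves much like having no breaker.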
LVI - Hijacking Transient Execution with Load Value Injection
> LVI is a new class of transient-execution attacks exploiting microarchitectural flaws in modern processors to inject attacker data into a victim program and steal sensitive data and keys from Intel SGX, a secure vault in Intel processors for your personal data.
> LVI turns previous data extraction attacks around, like Meltdown, Foreshadow, ZombieLoad, RIDL and Fallout, and defeats all existing mitigations. Instead of directly leaking data from the victim to the attacker, we proceed in the opposite direction: we smuggle — “inject” — the attacker’s data through hidden processor buffers into a victim program and hijack transient execution to acquire sensitive information, such as the victim’s fingerprints or passwords.
The unexpected Google wide domain check bypass
> Let me tell you this “funny” story of me trying to bypass a domain check in a little webapp, and accidentally bypassing a URL parser that is used in (almost) every Google product.
Spoiler: it’s a regex bug.
Cryptographic Signatures, Surprising Pitfalls, and LetsEncrypt
> In the above attack Eve managed to create a valid public key that validates a given signature and message. This is because, as Andrew Ayer wrote:
> A digital signature does not uniquely identify a key or a message
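A toy example shows why a signature alone does not pin down a key. This sketch uses textbook RSA with tiny numbers and no hashing or padding, so it is an illustration of the principle rather than the construction from the article or a practical attack; `sign`, `verify` and `forge_key` are hypothetical helpers, and a real verifier may reject a key with exponent 1.

```python
from typing import Tuple

PublicKey = Tuple[int, int]  # (modulus n, public exponent e)


def sign(message: int, n: int, d: int) -> int:
    """Textbook RSA signature: message^d mod n (no hashing, no padding)."""
    return pow(message, d, n)


def verify(message: int, signature: int, key: PublicKey) -> bool:
    """Textbook RSA verification: signature^e mod n must equal the message."""
    n, e = key
    return pow(signature, e, n) == message


def forge_key(message: int, signature: int) -> PublicKey:
    """Build a different 'public key' under which the same pair verifies.

    With e = 1 verification reduces to signature mod n == message, so
    n = signature - message works whenever it is larger than message.
    """
    n = signature - message
    if n <= message:
        raise ValueError("toy construction needs signature > 2 * message")
    return (n, 1)


# Alice's toy key pair (tiny textbook primes, for illustration only).
p, q = 61, 53
alice_public: PublicKey = (p * q, 17)
alice_private_d = 2753  # 17 * 2753 == 1 mod (p - 1) * (q - 1)

message = 65
signature = sign(message, alice_public[0], alice_private_d)

# Eve never learns Alice's private key; she only sees message and signature.
eve_public = forge_key(message, signature)
print(verify(message, signature, alice_public))  # check against Alice's key
print(verify(message, signature, eve_public))    # check against Eve's key
```

Both checks succeed, so a protocol that treats "this signature verifies under key K" as proof that K's owner produced it is relying on a property signatures do not provide.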
Clear Your Terminal in Style
> If you’re someone like me who habitually clears their terminal, sometimes you want a little excitement in your life. Here is a way to do just that.
How Explaining Copyright Broke the YouTube Copyright System
> This is a story about how the most sophisticated copyright filter in the world prevented us from explaining copyright law. It doesn’t involve TikTok dance moves or nuanced 90s remixes featuring AOC. No, it involves a debate at a law school conference over how and when one song can infringe the copyright of another and how exactly one proves in a courtroom if the accused song is “substantially similar” enough to be deemed illegal. In the end, because it was blocked by one of the music companies who owns the song, it also became a textbook study in how fair use still suffers online and what it takes to push back when a video is flagged. A copyright riddle wrapped up in an algorithmic enigma, symbolic of the many current content moderation dilemmas faced by online platforms today.
Donald Knuth Was Framed
Knuth writes 8 pages and McIlroy writes six lines.
> A damning counter. But neither of us had ever read the paper. And as you know, I’m all about primary sources. We pulled up the paper here and read through it together. And it left us with a very different understanding of literate programming, and the challenge, than the famous story gave.
> C++ “move” semantics are simple, but they are still widely misunderstood. This post is an attempt to shed light on that situation.
I like that the appendix is 3 times the article’s length.
Don't touch my clipboard
> You can (but shouldn’t) change how people copy text from your website.
I Add 3-25 Seconds of Latency to Every Page I Visit
> So if you can inject latency into sites artificially, you can reduce the actual impact of the addiction in a controllable way while not denying the enjoyment of the Internet to yourself.
> Hacker News with 100ms latency feels like liquor: Hacker News with 9000ms latency feels like small beer.
> In this blog post I’d like to look at these simple machines up close. I’ll explain how gears affect the properties of rotational motion and how the shape of their teeth is way more sophisticated than it may initially seem.
> Movement is important in this article so most of the visualizations are animated – you can play and pause them by tapping on the button in their bottom left corner. By default the animations are enabled, but if you find them distracting, or you want to save power, you can globally pause all animations, just make sure to unpause them as needed.
This is very neat.
How the CIA used Crypto AG encryption devices to spy on countries for decades
My FOSS Story
> I’d like to break from my normal tradition of focusing almost strictly on technical content and share a bit of my own personal relationship with Free and Open Source Software (FOSS). While everyone is different, my hope is that sharing my perspective will help build understanding, empathy and trust.
Gathering Intel on Intel AVX-512 Transitions
> This is a post about AVX and AVX-512 related frequency scaling. Now, something more than nothing has been written about this already, including cautionary tales of performance loss and some broad guidelines, so do we really need to add to the pile?
> Perhaps not, but I’m doing it anyway. My angle is a lower level look, almost microscopic really, at the specific transition behaviors. One would hope that this will lead to specific, quantitative advice about exactly when various instruction types are likely to pay off, but (spoiler) I didn’t make it there in this post.
> murex is a shell, like bash / zsh / fish / etc. It follows a similar syntax to POSIX shells like Bash, but supports more advanced features than you’d typically expect from a $SHELL.
> It aims to be similar enough to traditional shells that you can retain most of your muscle memory, while not being afraid to make breaking changes where “bash-isms” lead to unreadable, hard to maintain, or unsafe code.
Real-Time Ray-Tracing in WebGPU
> Note that RTX is not available officially for WebGPU (yet?) and is only available for the Node bindings for WebGPU. Recently I began adapting an unofficial Ray-Tracing extension for Dawn, which is the WebGPU implementation for Chromium. The Ray-Tracing extension is only implemented into the Vulkan backend so far, but a D3D12 implementation is on the Roadmap. You can find my Dawn Fork with Ray-Tracing capabilities here.
> Now let me introduce you to the ideas and concepts of the Ray-Tracing extension.