An improved chkstk function on Windows
If you’ve spent much time developing with Mingw-w64 you’ve likely seen the symbol ___chkstk_ms, perhaps in an error message. It’s a little piece of runtime provided by GCC via libgcc which ensures enough of the stack is committed for the caller’s stack frame. The “function” uses a custom ABI and is implemented in assembly. So is the subject of this article, a slightly improved implementation soon to be included in w64devkit as libchkstk (-lchkstk).
Identifying Rust's collect::<Vec<_>>() memory leak footgun
This is the story of how I identified the bug. (TLDR: collect::<Vec<_>>() will sometimes reuse allocations, resulting in Vecs with large excess capacity, even when the length is exactly known in advance, so you need to call shrink_to_fit if you want to free the extra memory.)
Ordinarily, that wouldn’t have been a problem, since the into_iter().map().collect() line used to pack them into (u32, u32)s would allocate a new vector with only the exact amount of space required. However, thanks to the allocation reuse optimization added in Rust 1.76, the new vec shared the backing store of the input vec, and hence had a capacity of 16560, meaning it was using 132480 bytes of memory to store only 16 bytes of data.
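The workaround mentioned in the TLDR can be sketched as follows. This is a minimal illustration, not the article's code: the `pack_pairs` helper and the `u64` input type are assumptions made for the example, and whether `collect` actually reuses the input allocation depends on the Rust version and the element layouts, so the sketch makes no claim about the capacity before shrinking.

```rust
/// Hypothetical helper: splits each u64 into a (u32, u32) pair, then
/// releases whatever excess capacity the collected Vec may carry.
fn pack_pairs(wide: Vec<u64>) -> Vec<(u32, u32)> {
    let mut packed: Vec<(u32, u32)> = wide
        .into_iter()
        .map(|x| ((x >> 32) as u32, x as u32))
        .collect();
    // collect() may or may not have reused the input's backing store;
    // shrink_to_fit asks the allocator to drop the surplus either way.
    packed.shrink_to_fit();
    packed
}

fn main() {
    // A large allocation holding only two elements, as in the excerpt.
    let mut wide: Vec<u64> = Vec::with_capacity(16560);
    wide.push(0x0000_0001_0000_0002);
    wide.push(0x0000_0003_0000_0004);

    let packed = pack_pairs(wide);
    println!("len = {}, capacity = {}", packed.len(), packed.capacity());
}
```

Note that `shrink_to_fit` only promises to get as close to the length as the allocator allows, so code should not rely on capacity being exactly equal to length afterwards.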
When Random Isn't
So there were two environments: an insecure one where you can get all information but can’t act on it, and a secure one where you can act but can’t get the information needed for automation.
An evil idea came into my head: random number generators (RNGs) used in computers are almost always pseudorandom number generators with (hidden) internal state. If I can manipulate this state, perhaps I can use that to pass information into the secure environment.
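To make the "hidden internal state" point concrete, here is a small sketch of a pseudorandom generator whose output is fully determined by its state. It uses a generic xorshift generator chosen for brevity; it is an assumption for illustration and not the RNG discussed in the article.

```rust
/// Illustrative xorshift64 generator (not the article's RNG).
/// Every output is a pure function of the hidden `state`.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// The state must be nonzero, so a zero seed is replaced by 1.
    fn new(seed: u64) -> Self {
        XorShift64 { state: if seed == 0 { 1 } else { seed } }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

fn main() {
    // Two generators with the same state produce the same "random" stream,
    // so whoever controls the state controls the numbers.
    let mut a = XorShift64::new(42);
    let mut b = XorShift64::new(42);
    for _ in 0..3 {
        println!("{} {}", a.next_u64(), b.next_u64());
    }
}
```

Because the stream depends only on the state, anyone able to influence that state can, in principle, steer the numbers that come out, which is the property the excerpt sets out to exploit.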
Annoying details of a Z-buffer rasterizer
I wrote a software rasterizer for occlusion culling and hit many small speed bumps along the way. Here I reveal what I’ve learned in the hope of you writing your own with more ease than I did.
Low-level thinking in high-level shading languages 2023
This, and the followup, is a presentation that I recommend as required reading to people wanting to get deeper into shader programming, not just for the knowledge but also the attitude towards shader programming (check compiler output, never assume, always profile). It has been 10 years since it was released though; in those 10 years a lot of things have changed on the GPU/shader model/shader compiler front and not all the suggestions in those presentations are still valid. So I decided to do a refresh with a modern compiler and shader model to see what still holds true and what doesn’t. I will target the RDNA 2 GPU architecture on PC using HLSL, the 6.7 shader model and the DXC compiler (using https://godbolt.org/) in this blog post.
How to (and how not to) fix color banding
I love to use soft gradients as backdrops when doing graphics programming, a love started by a Corona Renderer product shot sample scene shared by user romullus and its use of radial gradients to highlight the product. But they are quite horrible from a design standpoint, since they produce awful color banding, also referred to as posterization. Depending on things like screen type, gradient colors, viewing environment, etc., the effect can be sometimes not present at all, yet sometimes painfully obvious.
I Ran a Chess Programming Tournament, Here's How it Went!
There are no strings on me
There is a kind of magic to those systems that is worth experiencing. But it’s also worth examining why we prefer to build puppets.
Because I’ve had days where I’ve had to debug my surly emacs boy, and I’ve quickly discovered that his behaviour has very little to do with the code that I’m reading. Methods overridden at runtime, traces that end with a call to a closure that no longer exists, event handlers whose execution order depends on side-effects during module loading, stack-traces which contain multiple different versions of the same function. On the worst days I find myself debugging code that doesn’t even exist on disk but was evaluated in the repl weeks before.
Real-time dreamy Cloudscapes with Volumetric Raymarching
I spent the past few months diving into the realm of Raymarching and studying some of its applications that may come in handy for future 3D projects, and while I managed to build a pretty diverse set of scenes, all of them consisted of rendering surfaces or solid objects. My blog post on Raymarching covered some of the many impressive capabilities of this rendering technique, and as I mentioned at the end of that post, that was only the tip of the iceberg; there is a lot more we can do with it.
One fascinating aspect of Raymarching I quickly encountered in my study was its capacity to be tweaked to render volumes. Instead of stopping the raymarched loop once the ray hits a surface, we push through and continue the process to sample the inside of an object. That is where my obsession with volumetric clouds started, and I think the countless hours I spent exploring the many Sky Islands in Zelda Tears of the Kingdom contributed a lot to my curiosity to learn more about how they work. I thus studied a lot of Shadertoy scenes leveraging many Volumetric Raymarching techniques to render smoke, clouds, and cloudscapes, which I obviously couldn’t resist giving a try rebuilding myself.
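The idea of marching through a volume instead of stopping at its surface can be sketched on the CPU. This is a simplified illustration and not the author's shader: the constant-density sphere, the fixed step size, and the function names are all assumptions, and the accumulation is plain exponential (Beer-Lambert style) absorption.

```rust
/// Hypothetical density field: a unit sphere of constant density centered
/// at (0, 0, 3), empty space everywhere else.
fn density(p: [f32; 3]) -> f32 {
    let d = [p[0], p[1], p[2] - 3.0];
    if d[0] * d[0] + d[1] * d[1] + d[2] * d[2] < 1.0 { 1.0 } else { 0.0 }
}

/// Marches a fixed number of steps along the ray and returns the fraction
/// of light that survives. The loop never stops at a "surface"; it keeps
/// sampling the inside of the volume.
fn march_transmittance(origin: [f32; 3], dir: [f32; 3], step: f32, steps: u32) -> f32 {
    let mut transmittance = 1.0f32;
    for i in 0..steps {
        let t = i as f32 * step;
        let p = [
            origin[0] + dir[0] * t,
            origin[1] + dir[1] * t,
            origin[2] + dir[2] * t,
        ];
        // Attenuate by the density sampled at this point of the ray.
        transmittance *= (-density(p) * step).exp();
    }
    transmittance
}

fn main() {
    let through = march_transmittance([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], 0.01, 600);
    let miss = march_transmittance([5.0, 0.0, 0.0], [0.0, 0.0, 1.0], 0.01, 600);
    println!("through the sphere: {through}, missing it: {miss}");
}
```

A real cloud renderer would replace the sphere with a noise-based density field and add lighting, but the accumulation loop keeps the same shape.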
Running the “Reflections on Trusting Trust” Compiler
In October 1983, 40 years ago this week, Ken Thompson chose supply chain security as the topic for his Turing award lecture, although the specific term wasn’t used back then. (The field of computer science was still young and small enough that the ACM conference where Ken spoke was the “Annual Conference on Computers.”) Ken’s lecture was later published in Communications of the ACM under the title “Reflections on Trusting Trust.” It is a classic paper, and a short one (3 pages); if you haven’t read it yet, you should. This post will still be here when you get back.
In the lecture, Ken explains in three steps how to modify a C compiler binary to insert a backdoor when compiling the “login” program, leaving no trace in the source code. In this post, we will run the backdoored compiler using Ken’s actual code. But first, a brief summary of the important parts of the lecture.
Polonius refers to a few things. It is a new formulation of the borrow checker. It is also a specific project that implemented that analysis, based on datalog. Our current plan does not make use of that datalog-based implementation, but uses what we learned implementing it to focus on reimplementing Polonius within rustc.
Arena allocator tips and tricks
Over the past year I’ve refined my approach to arena allocation. With practice, it’s effective, simple, and fast; typically as easy to use as garbage collection but without the costs. Depending on need, an allocator can weigh just 7–25 lines of code — perfect when lacking a runtime. With the core details of my own technique settled, now is a good time to document and share lessons learned. This is certainly not the only way to approach arena allocation, but these are practices I’ve worked out to simplify programs and reduce mistakes.
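As a rough picture of how small such an allocator can be, here is a bump-allocator sketch. It is not the article's implementation (which targets C): the `Arena` type, its offset-based interface, and the method names are assumptions made so the example stays in safe Rust.

```rust
/// Hypothetical bump arena: one fixed buffer and a cursor. Allocations are
/// returned as offsets into the buffer, aligned relative to its start.
struct Arena {
    buf: Vec<u8>,
    used: usize,
}

impl Arena {
    fn new(capacity: usize) -> Self {
        Arena { buf: vec![0; capacity], used: 0 }
    }

    /// Reserves `size` bytes at a multiple of `align` (a power of two).
    /// Returns the starting offset, or None if the arena is exhausted.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        if !align.is_power_of_two() {
            return None;
        }
        // Round the cursor up to the next multiple of `align`.
        let start = self.used.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        self.used = end;
        Some(start)
    }

    /// Borrows the bytes of a previous allocation.
    fn bytes_mut(&mut self, start: usize, size: usize) -> &mut [u8] {
        &mut self.buf[start..start + size]
    }

    /// Frees everything at once by rewinding the cursor.
    fn reset(&mut self) {
        self.used = 0;
    }
}

fn main() {
    let mut arena = Arena::new(64);
    let a = arena.alloc(4, 4).expect("arena has room");
    arena.bytes_mut(a, 4).copy_from_slice(&[1, 2, 3, 4]);
    println!("allocated at offset {a}, {} bytes used", arena.used);
    arena.reset();
}
```

The appeal is that there is no per-object free: everything allocated from the arena is released together by a single `reset`.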
See also: https://nullprogram.com/blog/2023/09/30/
An easy-to-implement, arena-friendly hash map
Champagne for my real friends
Real pain for my sham friends, real tricks for my meh friends, and finding more like this with NLP
Getting RCE in Chrome with incorrect side effect in the JIT compiler
In this post, I’ll exploit CVE-2023-3420, a type confusion in Chrome that allows remote code execution (RCE) in the renderer sandbox of Chrome by a single visit to a malicious site.
The WebP 0day
This means that someone, somewhere, had been caught using an exploit for this vulnerability. But who discovered the vulnerability and how was it being used? How does the vulnerability work? Why wasn’t it discovered earlier? And what sort of impact does an exploit like this have?
There are still a lot of details that are missing, but this post attempts to explain what we know about the unusual circumstances of this bug, and provides a new technical analysis and proof-of-concept trigger for CVE-2023-4863 (“the WebP 0day”).
How I implemented MegaTextures on real Nintendo 64 hardware
This showcases a demo of megatextures running on N64 hardware. A “megatexture” for the N64 is really just a normal-sized texture by modern standards, but with it you can do some prebaked scenes that look like they don’t belong on the N64.
The Internet Worm Program: An Analysis
This report gives a detailed description of the components of the worm program—data and functions. It is based on study of two completely independent reverse-compilations of the worm and a version disassembled to VAX assembly language. Almost no source code is given in the paper because of current concerns about the state of the ‘‘immune system’’ of Internet hosts, but the description should be detailed enough to allow the reader to understand the behavior of the program.
And some modern commentary: https://infosec.exchange/@hovav/110950949212380779
FreeBSD on Firecracker
Experiences porting FreeBSD 14 to run on the Firecracker VMM
Shamir Secret Sharing
It’s 3am. Paul, the head of PayPal database administration, carefully enters his elaborate passphrase at a keyboard in a darkened cubicle of 1840 Embarcadero Road in East Palo Alto, for the fifth time. He hits Return. The green-on-black console window instantly displays one line of text: “Sorry, one or more wrong passphrases. Can’t reconstruct the key. Goodbye.”
This is the story of a catastrophic software bug I briefly introduced into the PayPal codebase that almost cost us the company (or so it seemed, in the moment.)
Today, should you try to read up the programmer’s manual (AKA the man page) on getpass, you will find it has been long declared obsolete and replaced with a more intelligent alternative in nearly all flavors of modern Unix.
Raytraced Order Independent Transparency
About a year ago I reviewed a number of Order Independent Transparency (OIT) techniques (part 1, part 2, part 3), each achieving a different combination of performance, quality and memory requirements. None of them fully solved OIT though and I ended the series wondering what raytraced transparency would look like. Recently I added (some) DXR support to the toy engine and I was curious to see how it would work, so I did a quick implementation.
The implementation was really simple. Since there is no mechanism to sort the nodes of a BLAS/TLAS based on distance from the camera, the ray generation shader keeps tracing rays using the result of the closest hit shader as the origin for the next ray until there is nothing else to hit.
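The loop described above can be mimicked on the CPU with a one-dimensional stand-in for the scene. This is a sketch under stated assumptions and not the author's shader code: the `Layer` type, the `closest_hit` helper, and the front-to-back blend are all hypothetical, with each transparent surface reduced to a depth along the ray.

```rust
/// Hypothetical transparent surface, reduced to a depth along the ray.
#[derive(Clone, Copy)]
struct Layer {
    depth: f32,
    color: [f32; 3],
    alpha: f32,
}

/// Stand-in for the closest hit shader: the nearest layer strictly beyond
/// `t_min`, regardless of the order the layers are stored in.
fn closest_hit(layers: &[Layer], t_min: f32) -> Option<Layer> {
    layers
        .iter()
        .copied()
        .filter(|l| l.depth > t_min)
        .min_by(|a, b| a.depth.total_cmp(&b.depth))
}

/// Stand-in for the ray generation shader: keep tracing from the last hit
/// until nothing is left, blending front to back (an assumed blend mode).
fn trace_transparency(layers: &[Layer]) -> [f32; 3] {
    let mut color = [0.0f32; 3];
    let mut transmittance = 1.0f32;
    let mut t = 0.0f32;
    while let Some(hit) = closest_hit(layers, t) {
        for i in 0..3 {
            color[i] += transmittance * hit.alpha * hit.color[i];
        }
        transmittance *= 1.0 - hit.alpha;
        t = hit.depth; // the hit becomes the origin of the next ray
    }
    color
}

fn main() {
    // Stored back to front on purpose: the loop sorts them implicitly.
    let layers = [
        Layer { depth: 2.0, color: [0.0, 1.0, 0.0], alpha: 1.0 },
        Layer { depth: 1.0, color: [1.0, 0.0, 0.0], alpha: 0.5 },
    ];
    println!("{:?}", trace_transparency(&layers));
}
```

Because each iteration asks for the nearest surface past the previous hit, the surfaces are visited in depth order without any explicit sort, which is the property the excerpt relies on.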