Polonius refers to a few things. It is a new formulation of the borrow checker. It is also a specific project that implemented that analysis, based on datalog. Our current plan does not make use of that datalog-based implementation, but uses what we learned implementing it to focus on reimplementing Polonius within rustc.
Understanding DeepMind's Sorting Algorithm
A few days ago, DeepMind published a blog post talking about a paper they wrote, where they discovered tinier kernels for sorting algorithms. They did this by taking their deep learning wisdom, which they gained by building AlphaGo, and applying it to the discipline of of superoptimization. That piqued my interest, since as a C library author, I’m always looking for opportunities to curate the best stuff. In some ways that’s really the whole purpose of the C library. There are so many functions that we as programmers take for granted, which are the finished product of decades of research, distilled into plain and portable code.
DeepMind earned a fair amount of well-deserved attention for this discovery, but unfortunately they could have done a much better job explaining it.
Beautiful Branchless Binary Search
I read a blog post by Alex Muscar, “Beautiful Binary Search in D“. It describes a binary search called “Shar’s algorithm”. I’d never heard of it and it’s impossible to google, but looking at the algorithm I couldn’t help but think “this is branchless.” And who knew that there could be a branchless binary search? So I did the work to translate it into a algorithm for C++ iterators, no longer requiring one-based indexing or fixed-size arrays.
The Unreasonable Effectiveness of JPEG: A Signal Processing Approach
The JPEG algorithm is rather complex and in this video, we break down the core parts of the algorithm, specifically color spaces, YCbCr, chroma subsampling, the discrete cosine transform, quantization, and lossless encoding. The majority of the focus is on the mathematical and signal processing insights that lead to advancements in image compression and the big themes in compression as a whole that we can take away from it.
NaN Gates and Flip FLOPS
A new kind of computer architecture that’s more elegant than 1s and 0s, being based directly on Mathematics.
The impossible chessboard puzzle
Bit strings, error correcting, and coloring the corners of higher dimensional cubes.
Cranelift, Part 3: Correctness in Register Allocation
In this post, I will cover how we worked to ensure correctness in our register allocator, regalloc.rs, by developing a symbolic checker that uses abstract interpretation to prove correctness for a specific register allocation result. By using this checker as a fuzzing oracle, and driving just the register allocator with a focused fuzzing target, we have been able to uncover some very interesting and subtle bugs, and achieve a fairly high confidence in the allocator’s robustness.
Multimodal Neurons in Artificial Neural Networks
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy in classifying surprising visual renditions of concepts, and is also an important step toward understanding the associations and biases that CLIP and similar models learn.
The good, and the bad...
By exploiting the model’s ability to read text robustly, we find that even photographs of hand-written text can often fool the model.
Improving texture atlas allocation in WebRender
This is a longer version of the piece I published in the mozilla gfx team blog where I focus on the atlas allocation algorithms. In this one I’ll go into more details about the process and methodology behind these improvements. The first part is about the making of guillotiere, a crate that I first released in March 2019. In the second part we’ll have a look at more recent work building upon what I did with guillotiere, to improve texture memory usage in WebRender/Firefox.
On Modern Hardware the Min-Max Heap beats a Binary Heap
The heap is a data structure that I use all the time and that others somehow use rarely. (I once had a coworker tell me that he knew some code was mine because it used a heap) Recently I was writing code that could really benefit from using a heap (as most code can) but I needed to be able to pop items from both ends. So I read up on double-ended priority queues and how to implement them. These are rare, but the most common implementation is the “Interval Heap” that can be explained quickly, has clean code and is only slightly slower than a binary heap. But there is an alternative called the “Min-Max Heap” that doesn’t have pretty code, but it has shorter dependency chains, which is important on modern hardware. As a result it often ends up faster than a binary heap, even though it allows you to pop from both ends. Which means there might be no reason to ever use a binary heap again.
SAT solver on top of regex matcher
A SAT problem is an NP-problem, while regex matching is not. However, a quite popular regex ‘backreferences’ extension extends regex matching to a (hard) NP-problem.
I still believe it would be possible to build a high quality editor based on the original design. But I also believe that this would be quite a complex system, and require significantly more work than necessary.
A few good ideas and observations could be mined out of this post.
Discovering Dennis Ritchie’s Lost Dissertation
It may come as some surprise to learn that until just this moment, despite Ritchie’s much-deserved computing fame, his dissertation—the intellectual and biographical fork-in-the-road separating an academic career in computer science from the one at Bell Labs leading to C and Unix—was lost. Lost? Yes, very much so in being both unpublished and absent from any public collection; not even an entry for it can be found in Harvard’s library catalog nor in dissertation databases.
How Much of a Genius-Level Move Was Using Binary Space Partitioning in Doom?
A decade after Doom’s release, in 2003, journalist David Kushner published a book about id Software called Masters of Doom, which has since become the canonical account of Doom’s creation. I read Masters of Doom a few years ago and don’t remember much of it now, but there was one story in the book about lead programmer John Carmack that has stuck with me. This is a loose gloss of the story (see below for the full details), but essentially, early in the development of Doom, Carmack realized that the 3D renderer he had written for the game slowed to a crawl when trying to render certain levels. This was unacceptable, because Doom was supposed to be action-packed and frenetic. So Carmack, realizing the problem with his renderer was fundamental enough that he would need to find a better rendering algorithm, started reading research papers. He eventually implemented a technique called “binary space partitioning,” never before used in a video game, that dramatically sped up the Doom engine.
History of research into BSP.
Landmark Computer Science Proof Cascades Through Physics and Math
Computer scientists established a new boundary on computationally verifiable knowledge. In doing so, they solved major open problems in quantum mechanics and pure mathematics.
What is the random oracle model and why should you care?
About eight years ago I set out to write a very informal piece on a specific cryptographic modeling technique called the “random oracle model”. This was way back in the good old days of 2011, which was a more innocent and gentle era of cryptography. Back then nobody foresaw that all of our standard cryptography would turn out to be riddled with bugs; you didn’t have to be reminded that “crypto means cryptography“. People even used Bitcoin to actually buy things.
That first random oracle post somehow sprouted three sequels, each more ridiculous than the last. I guess at some point I got embarrassed about the whole thing — it’s pretty cheesy, to be honest — so I kind of abandoned it unfinished. And that’s been a major source of regret for me, since I had always planned a fifth, and final post, to cap the whole messy thing off. This was going to be the best of the bunch: the one I wanted to write all along.
A brief history of liquid computers
A substrate does not have to be solid to compute. It is possible to make a computer purely from a liquid. I demonstrate this using a variety of experimental prototypes where a liquid carries signals, actuates mechanical computing devices and hosts chemical reactions. We show hydraulic mathematical machines that compute functions based on mass transfer analogies. I discuss several prototypes of computing devices that employ fluid flows and jets. They are fluid mappers, where the fluid flow explores a geometrically constrained space to find an optimal way around, e.g. the shortest path in a maze, and fluid logic devices where fluid jet streams interact at the junctions of inlets and results of the computation are represented by fluid jets at selected outlets. Fluid mappers and fluidic logic devices compute continuously valued functions albeit discretized. There is also an opportunity to do discrete operation directly by representing information by droplets and liquid marbles (droplets coated by hydrophobic powder). There, computation is implemented at the sites, in time and space, where droplets collide one with another. The liquid computers mentioned above use liquid as signal carrier or actuator: the exact nature of the liquid is not that important. What is inside the liquid becomes crucial when reaction–diffusion liquid-phase computing devices come into play: there, the liquid hosts families of chemical species that interact with each other in a massive-parallel fashion. I shall illustrate a range of computational tasks, including computational geometry, implementable by excitation wave fronts in nonlinear active chemical medium. The overview will enable scientists and engineers to understand how vast is the variety of liquid computers and will inspire them to design their own experimental laboratory prototypes.
Xor Filters: Faster and Smaller Than Bloom Filters
Among other alternatives, Fan et al. introduced Cuckoo filters which use less space and are faster than Bloom filters. While implementing a Bloom filter is a relatively simple exercise, Cuckoo filters require a bit more engineering.
Could we do even better while limiting the code to something you can hold in your head?
It turns out that you can with xor filters. We just published a paper called Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters that will appear in the Journal of Experimental Algorithmics.
Introducing Glush: a robust, human readable, top-down parser compiler
It’s been 45 years since Stephen Johnson wrote Yacc (Yet another compiler-compiler), a parser generator that made it possible for anyone to write fast, efficient parsers. Yacc, and its many derivatives, quickly became popular and were included in many Unix distributions. You would imagine that in 45 years we would have further perfected the art of creating parsers and would have standardized on a single tool. A lot of progress has been made, but there are still annoyances and problems affecting every tool out there.
This is great, even just for the overview of parsing.
The CYK algorithm (named after Cocke–Younger–Kasami) is in my opinion of great theoretical importance when it comes to parsing context-free grammars. CYK will parse all context-free parsers in O(n3), including the “simple” grammars that LL/LR can parse in linear time. It accomplishes this by converting parsing into a different problem: CYK shows that parsing context-free languages is equivalent to doing a boolean matrix multiplication. Matrix multiplication can be done naively in cubic time, and as such parsing context-free languages can be done in cubic time. It’s a very satisfying theoretical result, and the actual algorithm is small and easy to understand.