B-Trees: More Than I Thought I'd Want to Know
https://benjamincongdon.me/blog/2021/08/17/B-Trees-More-Than-I-Thought-Id-Want-to-Know/ [benjamincongdon.me]
2025-01-04 11:26
tags:
compsci
database
programming
storage
systems
In my college Data Structures and Algorithms course, we covered B-Trees, but I didn’t grok why I’d choose to use one. As presented, B-Trees were essentially “better” Binary Search Trees, with some hand-waving done that they had improved performance when used in database applications. I remember needing to memorize a bunch of equations to determine the carrying capacity of a M-degree B-Tree, and a vague understanding of B-Tree lookup/insertion/deletion, but not much else. Which is a shame! They’re interesting structures.
source: HN
Static search trees: 40x faster than binary search
https://curiouscoding.nl/posts/static-search-tree/ [curiouscoding.nl]
2025-01-02 01:18
tags:
compsci
perf
programming
rust
In this post, we will implement a static search tree (S+ tree) for high-throughput searching of sorted data, as introduced on Algorithmica. We’ll mostly take the code presented there as a starting point, and optimize it to its limits. For a large part, I’m simply taking the ‘future work’ ideas of that post and implementing them. And then there will be a bunch of looking at assembly code to shave off all the instructions we can. Lastly, there will be one big addition to optimize throughput: batching.
https://en.algorithmica.org/hpc/data-structures/s-tree/
source: HN
With Fifth Busy Beaver, Researchers Approach Computation’s Limits
https://www.quantamagazine.org/amateur-mathematicians-find-fifth-busy-beaver-turing-machine-20240702/ [www.quantamagazine.org]
2024-07-02 17:46
tags:
article
compsci
math
Today, the team declared victory. They’ve finally verified the true value of a number called BB(5), which quantifies just how busy that fifth beaver is. They obtained the result — 47,176,870 — using a piece of software called the Coq proof assistant, which certifies that mathematical proofs are free of errors.
source: HN
Reverse engineering standard cell logic in the Intel 386 processor
http://www.righto.com/2024/01/intel-386-standard-cells.html [www.righto.com]
2024-03-13 07:33
tags:
article
compsci
cpu
hardware
photos
tech
The 386 processor (1985) was Intel’s most complex processor at the time, with 285,000 transistors. Intel had scheduled 50 person-years to design the processor, but it was falling behind schedule. The design team decided to automate chunks of the layout, developing “automatic place and route” software. This was a risky decision since if the software couldn’t create a dense enough layout, the chip couldn’t be manufactured. But in the end, the 386 finished ahead of schedule, an almost unheard-of accomplishment.
In this article, I take a close look at the “standard cells” used in the 386, the logic blocks that were arranged and wired by software. Reverse-engineering these circuits shows how standard cells implement logic gates, latches, and other components with CMOS transistors. Modern integrated circuits still use standard cells, much smaller now, of course, but built from the same principles.
I Ran a Chess Programming Tournament, Here's How it Went!
https://www.youtube.com/watch?v=Ne40a5LkK6A [www.youtube.com]
2023-12-21 02:11
tags:
compsci
csharp
gaming
programming
video
Polonius update
https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html [blog.rust-lang.org]
2023-10-08 19:10
tags:
compiler
compsci
programming
rust
update
Polonius refers to a few things. It is a new formulation of the borrow checker. It is also a specific project that implemented that analysis, based on datalog. Our current plan does not make use of that datalog-based implementation, but uses what we learned implementing it to focus on reimplementing Polonius within rustc.
source: L
Understanding DeepMind's Sorting Algorithm
https://justine.lol/sorting/ [justine.lol]
2023-06-12 21:55
tags:
compsci
cpu
performance
sorting
A few days ago, DeepMind published a blog post talking about a paper they wrote, where they discovered tinier kernels for sorting algorithms. They did this by taking their deep learning wisdom, which they gained by building AlphaGo, and applying it to the discipline of of superoptimization. That piqued my interest, since as a C library author, I’m always looking for opportunities to curate the best stuff. In some ways that’s really the whole purpose of the C library. There are so many functions that we as programmers take for granted, which are the finished product of decades of research, distilled into plain and portable code.
DeepMind earned a fair amount of well-deserved attention for this discovery, but unfortunately they could have done a much better job explaining it.
https://www.deepmind.com/blog/alphadev-discovers-faster-sorting-algorithms
source: HN
Beautiful Branchless Binary Search
https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/ [probablydance.com]
2023-04-28 23:45
tags:
compsci
cxx
programming
I read a blog post by Alex Muscar, “Beautiful Binary Search in D“. It describes a binary search called “Shar’s algorithm”. I’d never heard of it and it’s impossible to google, but looking at the algorithm I couldn’t help but think “this is branchless.” And who knew that there could be a branchless binary search? So I did the work to translate it into a algorithm for C++ iterators, no longer requiring one-based indexing or fixed-size arrays.
https://muscar.eu/shar-binary-search-meta.html
source: L
The Unreasonable Effectiveness of JPEG: A Signal Processing Approach
https://www.youtube.com/watch?v=0me3guauqOU [www.youtube.com]
2022-05-08 03:26
tags:
compression
compsci
graphics
math
video
The JPEG algorithm is rather complex and in this video, we break down the core parts of the algorithm, specifically color spaces, YCbCr, chroma subsampling, the discrete cosine transform, quantization, and lossless encoding. The majority of the focus is on the mathematical and signal processing insights that lead to advancements in image compression and the big themes in compression as a whole that we can take away from it.
NaN Gates and Flip FLOPS
https://www.youtube.com/watch?v=5TFDG-y-EHs [www.youtube.com]
2022-05-01 22:19
tags:
compsci
hardware
math
programming
solder
video
A new kind of computer architecture that’s more elegant than 1s and 0s, being based directly on Mathematics.
The impossible chessboard puzzle
https://www.youtube.com/watch?v=wTJI_WuZSwE [www.youtube.com]
2021-03-25 22:55
tags:
compsci
math
video
Bit strings, error correcting, and coloring the corners of higher dimensional cubes.
Cranelift, Part 3: Correctness in Register Allocation
https://cfallin.org/blog/2021/03/15/cranelift-isel-3/ [cfallin.org]
2021-03-19 23:12
tags:
compiler
compsci
development
fuzzing
jit
programming
rust
testing
In this post, I will cover how we worked to ensure correctness in our register allocator, regalloc.rs, by developing a symbolic checker that uses abstract interpretation to prove correctness for a specific register allocation result. By using this checker as a fuzzing oracle, and driving just the register allocator with a focused fuzzing target, we have been able to uncover some very interesting and subtle bugs, and achieve a fairly high confidence in the allocator’s robustness.
source: HN
Multimodal Neurons in Artificial Neural Networks
https://openai.com/blog/multimodal-neurons/ [openai.com]
2021-03-10 03:07
tags:
ai
compsci
graphics
paper
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy in classifying surprising visual renditions of concepts, and is also an important step toward understanding the associations and biases that CLIP and similar models learn.
The good, and the bad...
By exploiting the model’s ability to read text robustly, we find that even photographs of hand-written text can often fool the model.
https://distill.pub/2021/multimodal-neurons/
Improving texture atlas allocation in WebRender
https://nical.github.io/posts/etagere.html [nical.github.io]
2021-02-05 02:11
tags:
compsci
graphics
malloc
programming
This is a longer version of the piece I published in the mozilla gfx team blog where I focus on the atlas allocation algorithms. In this one I’ll go into more details about the process and methodology behind these improvements. The first part is about the making of guillotiere, a crate that I first released in March 2019. In the second part we’ll have a look at more recent work building upon what I did with guillotiere, to improve texture memory usage in WebRender/Firefox.
https://mozillagfx.wordpress.com/2021/02/04/improving-texture-atlas-allocation-in-webrender/
source: HN
Floating-Point Formats
http://www.quadibloc.com/comp/cp0201.htm [www.quadibloc.com]
2020-12-13 07:05
tags:
compsci
format
math
reference
retro
systems
On Modern Hardware the Min-Max Heap beats a Binary Heap
https://probablydance.com/2020/08/31/on-modern-hardware-the-min-max-heap-beats-a-binary-heap/ [probablydance.com]
2020-09-02 23:22
tags:
compsci
perf
programming
The heap is a data structure that I use all the time and that others somehow use rarely. (I once had a coworker tell me that he knew some code was mine because it used a heap) Recently I was writing code that could really benefit from using a heap (as most code can) but I needed to be able to pop items from both ends. So I read up on double-ended priority queues and how to implement them. These are rare, but the most common implementation is the “Interval Heap” that can be explained quickly, has clean code and is only slightly slower than a binary heap. But there is an alternative called the “Min-Max Heap” that doesn’t have pretty code, but it has shorter dependency chains, which is important on modern hardware. As a result it often ends up faster than a binary heap, even though it allows you to pop from both ends. Which means there might be no reason to ever use a binary heap again.
Code: https://github.com/skarupke/heap/blob/master/minmax_and_dary_heap.hpp
source: HN
SAT solver on top of regex matcher
https://yurichev.com/news/20200621_regex_SAT/ [yurichev.com]
2020-07-08 00:05
tags:
compsci
programming
text
A SAT problem is an NP-problem, while regex matching is not. However, a quite popular regex ‘backreferences’ extension extends regex matching to a (hard) NP-problem.
source: trivium
xi-editor retrospective
https://raphlinus.github.io/xi/2020/06/27/xi-retrospective.html [raphlinus.github.io]
2020-07-01 00:55
tags:
compsci
concurrency
development
programming
rust
swtools
text
I still believe it would be possible to build a high quality editor based on the original design. But I also believe that this would be quite a complex system, and require significantly more work than necessary.
A few good ideas and observations could be mined out of this post.
source: L
Discovering Dennis Ritchie’s Lost Dissertation
https://computerhistory.org/blog/discovering-dennis-ritchies-lost-dissertation/ [computerhistory.org]
2020-06-20 23:21
tags:
academia
compsci
retro
It may come as some surprise to learn that until just this moment, despite Ritchie’s much-deserved computing fame, his dissertation—the intellectual and biographical fork-in-the-road separating an academic career in computer science from the one at Bell Labs leading to C and Unix—was lost. Lost? Yes, very much so in being both unpublished and absent from any public collection; not even an entry for it can be found in Harvard’s library catalog nor in dissertation databases.
How Much of a Genius-Level Move Was Using Binary Space Partitioning in Doom?
https://twobithistory.org/2019/11/06/doom-bsp.html [twobithistory.org]
2020-03-21 20:52
tags:
article
compsci
graphics
programming
A decade after Doom’s release, in 2003, journalist David Kushner published a book about id Software called Masters of Doom, which has since become the canonical account of Doom’s creation. I read Masters of Doom a few years ago and don’t remember much of it now, but there was one story in the book about lead programmer John Carmack that has stuck with me. This is a loose gloss of the story (see below for the full details), but essentially, early in the development of Doom, Carmack realized that the 3D renderer he had written for the game slowed to a crawl when trying to render certain levels. This was unacceptable, because Doom was supposed to be action-packed and frenetic. So Carmack, realizing the problem with his renderer was fundamental enough that he would need to find a better rendering algorithm, started reading research papers. He eventually implemented a technique called “binary space partitioning,” never before used in a video game, that dramatically sped up the Doom engine.
History of research into BSP.