Links

tag: programming

How fast are Linux pipes anyway?

https://mazzo.li/posts/fast-pipes.html [mazzo.li]

2025-06-22 18:06

tags: dupe linux perf programming systems

In this post, we will explore how Unix pipes are implemented in Linux by iteratively optimizing a test program that writes and reads data through a pipe.
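As a rough illustration of the kind of test program the excerpt describes, here is a minimal Python sketch that pushes bytes through a pipe and times the transfer. It is not the linked post's program: the function name `pipe_throughput`, the chunk size, and the use of a writer thread instead of a second process are all assumptions made for brevity.

```python
import os
import threading
import time


def pipe_throughput(total_bytes=16 * 1024 * 1024, chunk_size=64 * 1024):
    """Write total_bytes into a pipe from one thread and read them back in another.

    Returns (bytes_received, elapsed_seconds).
    """
    read_fd, write_fd = os.pipe()
    chunk = b"x" * chunk_size

    def writer():
        remaining = total_bytes
        try:
            while remaining > 0:
                view = memoryview(chunk)[: min(chunk_size, remaining)]
                # os.write may accept fewer bytes than offered, so loop until done.
                while len(view) > 0:
                    written = os.write(write_fd, view)
                    view = view[written:]
                    remaining -= written
        finally:
            # Closing the write end is what lets the reader see end-of-file.
            os.close(write_fd)

    thread = threading.Thread(target=writer)
    start = time.perf_counter()
    thread.start()

    received = 0
    while True:
        data = os.read(read_fd, chunk_size)
        if not data:  # empty read means the writer closed its end
            break
        received += len(data)

    elapsed = time.perf_counter() - start
    thread.join()
    os.close(read_fd)
    return received, elapsed


if __name__ == "__main__":
    received, elapsed = pipe_throughput()
    print(f"{received / elapsed / 1e6:.1f} MB/s")
```

The number this prints mostly reflects Python and system-call overhead, so it is only a baseline for comparison, not a measure of what the kernel's pipe implementation can do.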

source: HN

phkmalloc

https://phk.freebsd.dk/sagas/phkmalloc/ [phk.freebsd.dk]

2025-06-17 21:08

tags: c development malloc programming systems

Jason Evans laid jemalloc to rest yesterday, and gave a kind shoutout to my malloc, aka. “phkmalloc”, and it occurred to me, that I should write that story down.

source: L

jemalloc Postmortem

https://jasone.github.io/2025/06/12/jemalloc-postmortem/ [jasone.github.io]

2025-06-17 21:07

tags: c development malloc programming systems

The jemalloc memory allocator was first conceived in early 2004, and has been in public use for about 20 years now. Thanks to the nature of open source software licensing, jemalloc will remain publicly available indefinitely. But active upstream development has come to an end. This post briefly describes jemalloc’s development phases, each with some success/failure highlights, followed by some retrospective commentary.

source: HN

Pure vs. impure iterators in Go

https://jub0bs.com/posts/2025-05-29-pure-vs-impure-iterators-in-go/ [jub0bs.com]

2025-06-01 01:34

tags: go programming

Because iterators are so powerful, they’re likely to mushroom in libraries even beyond Go’s standard library. Therefore, to forestall any confusion in the discourse about iterators, the terminology surrounding them should be as precise as possible.

This passage of the documentation seemingly divides iterators into two categories. I’ll attempt to elucidate them through a couple of examples.

source: HN

parking_lot: ffffffffffffffff...

https://fly.io/blog/parking-lot-ffffffffffffffff/ [fly.io]

2025-05-31 02:27

tags: bugfix concurrency programming rust

You’re reading a 3,000 word blog post about a single concurrency bug, so my guess is you’re the kind of person who compulsively wants to understand how everything works. That’s fine, but a word of advice: there are things where, if you find yourself learning about them in detail, something has gone wrong.

source: L

The radix 2^51 trick

https://www.chosenplaintext.ca/articles/radix-2-51-trick.html [www.chosenplaintext.ca]

2025-05-31 00:51

tags: cpu math perf programming

The obvious solution would be to break up each 256-bit number into four 64-bit pieces (commonly referred to as “limbs”).

The first reason is that adc is just slower to execute than a normal add on most popular x86 CPUs. Since adc has a third input (the carry flag), it’s a more complex instruction than add. It’s also used less often than add, so there is less incentive for CPU designers to spend chip area on optimizing adc performance.

The key insight here is that we can use this technique to delay carry propagation until the end. We can’t avoid carry propagation altogether, but we can avoid it temporarily. If we save up the carries that occur during the intermediate additions, we can propagate them all in one go at the end.
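The delayed-carry idea from the excerpts can be sketched in Python, under some assumptions: values below 2^256 are split into five limbs (four of 51 bits, with the top limb taking the remaining bits), each limb is imagined to live in a 64-bit word, and the helper names `to_limbs`, `add_limbs`, `normalize`, and `from_limbs` are made up for this illustration. Python integers never overflow, so the 64-bit limit is only modelled with an assertion.

```python
LIMB_BITS = 51
LIMB_MASK = (1 << LIMB_BITS) - 1
NUM_LIMBS = 5
WORD_MAX = (1 << 64) - 1  # models the 64-bit register holding each limb


def to_limbs(x):
    """Split x (0 <= x < 2**256) into five limbs, least significant first."""
    limbs = [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(NUM_LIMBS - 1)]
    limbs.append(x >> (LIMB_BITS * (NUM_LIMBS - 1)))  # top limb keeps the rest
    return limbs


def add_limbs(a, b):
    """Add limb by limb with no carry between limbs.

    Each limb has spare high bits, so carries simply pile up inside the limb.
    The five additions are independent of each other.
    """
    out = [x + y for x, y in zip(a, b)]
    assert all(v <= WORD_MAX for v in out), "limb overflowed its 64-bit word"
    return out


def normalize(limbs):
    """Propagate all saved-up carries in one pass, lowest limb first."""
    out = []
    carry = 0
    for i, v in enumerate(limbs):
        v += carry
        if i < NUM_LIMBS - 1:
            out.append(v & LIMB_MASK)
            carry = v >> LIMB_BITS
        else:
            out.append(v)  # top limb absorbs the final carry
    return out


def from_limbs(limbs):
    """Recombine limbs into one integer (works before or after normalize)."""
    return sum(v << (LIMB_BITS * i) for i, v in enumerate(limbs))


if __name__ == "__main__":
    values = [(1 << 256) - 1 - i for i in range(1000)]
    acc = to_limbs(0)
    for v in values:
        acc = add_limbs(acc, to_limbs(v))  # no carry propagation here
    print(from_limbs(normalize(acc)) == sum(values))
```

With 51-bit limbs in 64-bit words there are 13 spare bits per lower limb, which is why a long run of additions can go by before `normalize` is needed.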

source: L

UCSD Pascal In Depth

https://markbessey.blog/2025/04/29/ucsd-pascal-in-depth/ [markbessey.blog]

2025-05-28 05:09

tags: pascal programming retro series systems text

The p-System comes with an editor. It’s a full-screen editor, with some fairly advanced features for the time, like auto-indent, bookmarks, and cut and paste. It’s modal, which is hardly surprising, considering that modal editors were the latest usability improvement of the age, compared to the line-oriented editors of the previous decade.

Also: https://markbessey.blog/2025/04/30/ucsd-pascal-in-depth-2/

Some features of the p-System were really ahead of their time. And then, there is the filesystem. Whenever you set out to create any software, but especially an operating system, which you intend to be aggressively cross-platform, you inevitably run into conflicts between being sophisticated, and hitting the lowest common denominator.

Also: https://markbessey.blog/2025/05/08/ucsd-pascal-in-depth-3-n/

But the 1970s were a very different time. So let’s talk about the text file format for the UCSD p-System. This is not just something that applies to the text editor, incidentally. If you declare a file as “text” type in Pascal, it gets the same formatting applied. The formatting is transparently stripped from the file if you send it to the PRINTER: or CONSOLE: device.

Overview: https://markbessey.blog/ucsd-p-system-info/

Also: https://github.com/mbessey/p-system-tools

source: trivium

Making the rav1d Video Decoder 1% Faster

https://ohadravid.github.io/posts/2025-05-rav1d-faster/ [ohadravid.github.io]

2025-05-25 00:24

tags: c compiler perf programming rust

rav1d is a port of dav1d, created by (1) running c2rust on dav1d, (2) incorporating dav1d’s asm-optimized functions, and (3) changing the code to be more Rust-y and safer.

Video decoders are notoriously complex pieces of software, but because we are comparing the performance of two similar deterministic binaries we might be able to avoid a lot of that complexity - with the right tooling.

source: HN

Go Scheduler

https://nghiant3223.github.io/2025/04/15/go-scheduler.html [nghiant3223.github.io]

2025-05-21 22:40

tags: article concurrency go programming systems

Understanding the Go scheduler is crucial for Go programmers to write efficient concurrent programs. It also helps us become better at troubleshooting performance issues or tuning the performance of our Go programs. In this post, we will explore how the Go scheduler evolved over time, and how the Go code we write runs under the hood.

source: HN

Build your own ResponseWriter: safer HTTP in Go

https://anto.pt/articles/go-http-responsewriter [anto.pt]

2025-05-09 19:14

tags: go programming web

Go’s http.ResponseWriter writes directly to the socket, which can lead to subtle bugs like forgetting to set a status code or accidentally modifying headers too late.

source: L

Beating the Fastest Lexer Generator in Rust

https://alic.dev/blog/fast-lexing [alic.dev]

2025-05-09 19:07

tags: compiler perf programming rust text

I was aware of the efficiency of state machine driven lexers, but most generators have one problem: they can’t be arbitrarily generic and consistently optimal at the same time. There will always be some assumptions about your data that are either impossible to express, or outside the scope of the generator’s optimizations. Either way, I was curious to find out how my hand-rolled implementation would fare.

source: L

Write the most clever code you possibly can

https://buttondown.com/hillelwayne/archive/write-the-most-clever-code-you-possibly-can/ [buttondown.com]

2025-05-09 18:55

tags: development essay ideas programming

How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I’m “good at” comes from banging my head against it more than is healthy. That suggests a really good reason to write clever code: it’s an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers.

source: L

Cheating the Reaper in Go

https://mcyoung.xyz/2025/04/21/go-arenas/ [mcyoung.xyz]

2025-04-21 23:49

tags: garbage-collection go malloc programming

These things mean that despite Go having a GC, it’s possible to do manual memory management in pure Go and in cooperation with the GC (although without any help from the runtime package). To demonstrate this, we will be building an untyped, garbage-collected arena abstraction in Go which relies on several GC implementation details.

source: HN

Marching Events: What does iCalendar have to do with ray marching?

https://pwy.io/posts/marching-events/ [pwy.io]

2025-04-18 05:31

tags: format programming rust

I’ve found a way of describing occurrences through distance functions. This means that instead of implementing logic for all combinations of frequencies and parameters - as that spooky table from before suggests one might do - we can simply compose a couple of distance functions together.
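One way to picture composing distance functions for recurrences is the Python sketch below. It is a guess at the general shape, not the post's Rust code: each hypothetical rule function returns how many days lie between a date and the next date that satisfies the rule (0 if the date already matches), and `next_occurrence` steps forward by the largest of those distances, much as ray marching steps by a distance bound.

```python
from datetime import date, timedelta


def days_until_weekday(d, weekday):
    """Days from d to the next date falling on weekday (Monday is 0)."""
    return (weekday - d.weekday()) % 7


def days_until_month_day(d, day):
    """Days from d to the next date whose day-of-month equals day."""
    n = 0
    while (d + timedelta(days=n)).day != day:
        n += 1
    return n


def next_occurrence(start, distance_fns, max_steps=10_000):
    """March forward from start until every rule reports distance 0.

    Stepping by the maximum distance is safe: the rule that reported it has
    no matching date any sooner, so no combined match can be skipped.
    """
    current = start
    for _ in range(max_steps):
        step = max(fn(current) for fn in distance_fns)
        if step == 0:
            return current
        current += timedelta(days=step)
    raise ValueError("no occurrence found within max_steps")


if __name__ == "__main__":
    friday_the_13th = [
        lambda d: days_until_weekday(d, 4),     # Friday
        lambda d: days_until_month_day(d, 13),  # the 13th
    ]
    print(next_occurrence(date(2025, 1, 1), friday_the_13th))
```

Adding another constraint means appending one more function to the list rather than writing logic for every combination of rules.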

source: HN

I want a good parallel computer

https://raphlinus.github.io/gpu/2025/03/21/good-parallel-computer.html [raphlinus.github.io]

2025-03-22 17:56

tags: concurrency cpu graphics hardware programming

I believe a simpler, more powerful parallel computer is possible, and that there are signs in the historical record. In a slightly alternate universe, we would have those computers now, and be doing the work of designing algorithms and writing programs to run well on them, for a very broad range of tasks.

source: L

The Defer Technical Specification: It Is Time

https://thephd.dev/c2y-the-defer-technical-specification-its-time-go-go-go [thephd.dev]

2025-03-19 22:48

tags: c compiler programming standard

Time for me to write this blog post and prepare everyone for the implementation blitz that needs to happen to make defer a success for the C programming language.

source: HN

Robust Wavefront OBJ model parsing in C

https://nullprogram.com/blog/2025/03/02/ [nullprogram.com]

2025-03-15 19:25

tags: c graphics programming

Wavefront OBJ is a line-oriented, text format for 3D geometry. It’s widely supported by modeling software, easy to parse, and trivial to emit, much like Netpbm for 2D image data. Poke around hobby 3D graphics projects and you’re likely to find a bespoke OBJ parser. While typically only loading their own model data, so robustness doesn’t much matter, they usually have hard limitations and don’t stand up to fuzz testing. This article presents a robust, partial OBJ parser in C with no hard-coded limitations, written from scratch. Like similar articles, it’s not really about OBJ but demonstrating some techniques you’ve probably never seen before.

Quicksort with Jenkins for Fun and No Profit

https://susam.net/jenkins-quicksort.html [susam.net]

2025-03-14 22:48

tags: programming sorting swtools turtles

Jenkins supports pipeline scripts written in Groovy as a first-class entity. A pipeline script effectively defines the build job. It can define build properties, build stages, build steps, etc. It can even invoke other build jobs, including itself.

Wait a minute! If a pipeline can invoke itself, can we, perhaps, solve a recursive problem with it? Absolutely! This is precisely what we are going to do in this post. We are going to implement quicksort as a Jenkins pipeline for fun and not a whit of profit!

source: trivium

Constant-Time Code: The Pessimist Case

https://eprint.iacr.org/2025/435 [eprint.iacr.org]

2025-03-08 06:09

tags: compiler cpu crypto paper pdf perf programming turtles

This note discusses the problem of writing cryptographic implementations in software, free of timing-based side-channels, and many ways in which that endeavour can fail in practice. It is a pessimist view: it highlights why such failures are expected to become more common, and how constant-time coding is, or will soon become, infeasible in all generality.

From compiler optimizations to CPU pipelines and register renaming.

Zen and the Art of Microcode Hacking

https://bughunters.google.com/blog/5424842357473280/zen-and-the-art-of-microcode-hacking [bughunters.google.com]

2025-03-08 06:03

tags: bios cpu exploit hash programming security systems

In this post, we first discuss the background of what microcode is, why microcode patches exist, why the integrity of microcode is important for security, and how AMD attempts to prevent tampering with microcode. Next, we focus on the microcode patch signature validation process and explain in detail the vulnerability present (using CMAC as a hash function). Finally, we discuss how to use some of the tools we’ve released today which can help researchers reproduce and expand on our work (skip to the Zentool section of this blogpost for a “how to” on writing your own microcode).

source: HN