I want a good parallel computer
https://raphlinus.github.io/gpu/2025/03/21/good-parallel-computer.html [raphlinus.github.io]
2025-03-22 17:56
tags:
concurrency
cpu
graphics
hardware
programming
I believe a simpler, more powerful parallel computer is possible, and that there are signs in the historical record. In a slightly alternate universe, we would have those computers now, and be doing the work of designing algorithms and writing programs to run well on them, for a very broad range of tasks.
source: L
The Defer Technical Specification: It Is Time
https://thephd.dev/c2y-the-defer-technical-specification-its-time-go-go-go [thephd.dev]
2025-03-19 22:48
tags:
c
compiler
programming
standard
Time for me to write this blog post and prepare everyone for the implementation blitz that needs to happen to make defer a success for the C programming language.
source: HN
Robust Wavefront OBJ model parsing in C
https://nullprogram.com/blog/2025/03/02/ [nullprogram.com]
2025-03-15 19:25
tags:
c
graphics
programming
Wavefront OBJ is a line-oriented, text format for 3D geometry. It’s widely supported by modeling software, easy to parse, and trivial to emit, much like Netpbm for 2D image data. Poke around hobby 3D graphics projects and you’re likely to find a bespoke OBJ parser. While typically only loading their own model data, so robustness doesn’t much matter, they usually have hard limitations and don’t stand up to fuzz testing. This article presents a robust, partial OBJ parser in C with no hard-coded limitations, written from scratch. Like similar articles, it’s not really about OBJ but demonstrating some techniques you’ve probably never seen before.
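To show how line-oriented the format is, here is a minimal Python sketch that reads only `v` and `f` lines. It is not the article's C parser and makes no claim to its robustness. The name `parse_obj` is made up for illustration, and rejecting faces that reference vertices not yet seen is a simplifying assumption.

```python
def parse_obj(text):
    """Parse 'v' and 'f' lines from OBJ text; every other line is ignored."""
    vertices = []
    faces = []
    for raw in text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # strip comments, split on whitespace
        if not tokens:
            continue
        if tokens[0] == "v" and len(tokens) >= 4:
            try:
                vertices.append(tuple(float(t) for t in tokens[1:4]))
            except ValueError:
                continue  # malformed number: skip the line instead of crashing
        elif tokens[0] == "f" and len(tokens) >= 4:
            face = []
            for token in tokens[1:]:
                head = token.split("/", 1)[0]  # keep only the position index
                try:
                    idx = int(head)
                except ValueError:
                    face = None
                    break
                # OBJ indices are 1-based; negative ones count back from the
                # most recently defined vertex. Zero is never valid.
                if idx > 0:
                    pos = idx - 1
                elif idx < 0:
                    pos = len(vertices) + idx
                else:
                    pos = -1
                if not 0 <= pos < len(vertices):
                    face = None  # out-of-range reference: drop the whole face
                    break
                face.append(pos)
            if face:
                faces.append(tuple(face))
    return vertices, faces
```

Faces come back as tuples of 0-based vertex positions, so both `f 1 2 3` and `f -3 -2 -1` after three vertices refer to the same triangle.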
Quicksort with Jenkins for Fun and No Profit
https://susam.net/jenkins-quicksort.html [susam.net]
2025-03-14 22:48
tags:
programming
sorting
swtools
turtles
Jenkins supports pipeline scripts written in Groovy as a first-class entity. A pipeline script effectively defines the build job. It can define build properties, build stages, build steps, etc. It can even invoke other build jobs, including itself.
Wait a minute! If a pipeline can invoke itself, can we, perhaps, solve a recursive problem with it? Absolutely! This is precisely what we are going to do in this post. We are going to implement quicksort as a Jenkins pipeline for fun and not a whit of profit!
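For reference, the recursion that the self-invoking pipeline mirrors looks like this in plain Python. This is only a sketch of quicksort itself, not the post's Groovy pipeline: each recursive call here plays the role of one build job triggering another.

```python
def quicksort(items):
    """Return a sorted copy of items using the first element as pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Each recursive call corresponds to one self-invocation of the job.
    return quicksort(smaller) + [pivot] + quicksort(larger)
```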
source: trivium
Constant-Time Code: The Pessimist Case
https://eprint.iacr.org/2025/435 [eprint.iacr.org]
2025-03-08 06:09
tags:
compiler
cpu
crypto
paper
pdf
perf
programming
turtles
This note discusses the problem of writing cryptographic implementations in software, free of timing-based side-channels, and many ways in which that endeavour can fail in practice. It is a pessimist view: it highlights why such failures are expected to become more common, and how constant-time coding is, or will soon become, infeasible in all generality.
From compiler optimizations to CPU pipelines and register renaming.
Zen and the Art of Microcode Hacking
https://bughunters.google.com/blog/5424842357473280/zen-and-the-art-of-microcode-hacking [bughunters.google.com]
2025-03-08 06:03
tags:
bios
cpu
exploit
hash
programming
security
systems
In this post, we first discuss the background of what microcode is, why microcode patches exist, why the integrity of microcode is important for security, and how AMD attempts to prevent tampering with microcode. Next, we focus on the microcode patch signature validation process and explain in detail the vulnerability present (using CMAC as a hash function). Finally, we discuss how to use some of the tools we’ve released today which can help researchers reproduce and expand on our work (skip to the Zentool section of this blogpost for a “how to” on writing your own microcode).
source: HN
0+0 > 0: C++ thread-local storage performance
https://yosefk.com/blog/cxx-thread-local-storage-performance.html [yosefk.com]
2025-02-17 21:29
tags:
compiler
concurrency
cxx
library
perf
programming
We’ll discuss how to make sure that your access to TLS (thread-local storage) is fast. If you’re interested strictly in TLS performance guidelines and don’t care about the details, skip right to the end — but be aware that you’ll be missing out on assembly listings of profound emotional depth, which can shake even a cynical, battle-hardened programmer. If you don’t want to miss out on that — and who would?! — read on, and you shall learn the computer-scientific insight behind the intriguing inequality 0+0 > 0.
source: HN
Can atproto scale down?
https://bsky.bad-example.com/can-atproto-scale-down/ [bsky.bad-example.com]
2025-02-17 21:10
tags:
networking
perf
programming
social
storage
And skipping right to the end, my answer to “can it scale down” is just: “yes!”. Here’s my Raspberry Pi 4b, at home, consuming a few watts and pulling around 20GB of simplified firehose events per day. It’s an AppView indexing all cross-repo references (backlinks) in the AT-mosphere, often up to 1,500 created per second. It’s closing in on one billion backlinks, eating up an old SATA SSD connected over a salvaged USB adapter.
source: L
"A calculator app? Anyone could make that."
https://chadnauseam.com/coding/random/calculator-app [chadnauseam.com]
2025-02-17 21:02
tags:
android
compsci
math
programming
ux
A calculator should show you the result of the mathematical expression you entered. That’s much, much harder than it sounds.
source: HN
How do modern compilers choose which variables to put in registers?
https://langdev.stackexchange.com/questions/4325/how-do-modern-compilers-choose-which-variables-to-put-in-registers [langdev.stackexchange.com]
2025-02-17 20:59
tags:
compiler
cpu
programming
This is a very broad subject. The problem of deciding how to map a program with arbitrarily many variables onto a fixed set of registers is known as register allocation, and it has been the subject of much research, study, and engineering effort since the very earliest compilers. One of the canonical approaches, graph coloring, was first proposed in 1981. Countless other approaches and variants have been explored since then, and I cannot hope to cover the full breadth of the topic in a single answer.
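A rough idea of what graph coloring means here, as a Python sketch: variables that are live at the same time interfere and must get different registers. This is a simplified greedy coloring, not the full algorithm the answer refers to. The function name and the "highest degree first" ordering are assumptions made for illustration, and real allocators choose spill candidates far more carefully.

```python
def allocate_registers(interference, num_registers):
    """Greedily color an interference graph.

    interference maps each variable to the set of variables it conflicts with.
    Returns (assignment, spilled): a variable -> register index mapping and
    the list of variables that did not fit in any register.
    """
    assignment = {}
    spilled = []
    # Handle the most constrained variables first (ties broken by name).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)  # no register left: keep it in memory
    return assignment, spilled
```

With three mutually interfering variables and only two registers, one of them has to be spilled, which is the basic tension the whole topic revolves around.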
source: HN
Get in loser. We're rewinding the stack.
https://andrews.substack.com/p/get-in-loser-were-rewinding-the-stack [andrews.substack.com]
2025-02-17 20:57
tags:
perl
programming
series
wasm
In my last post, I expressed frustration at how the lack of exnref support in most WebAssembly runtimes made zeroperl effectively unusable. However, complaining alone doesn’t solve problems—if something is broken, fix it. Don’t accept the status quo or let it derail your goals.
Using libsetjmp from the WASI SDK for setjmp/longjmp breaks compatibility across WebAssembly runtimes, so I decided to implement it myself. Binaryen has an Asyncify feature, which provides more than enough functionality to build a setjmp implementation from scratch.
Part of a series on getting Perl running in WebAssembly:
https://andrews.substack.com/p/zeroperl-sandboxed-perl-with-webassembly
source: HN
The Art of Dithering and Retro Shading for the Web
https://blog.maximeheckel.com/posts/the-art-of-dithering-and-retro-shading-web/ [blog.maximeheckel.com]
2025-02-03 19:47
tags:
gl
graphics
interactive
programming
visualization
web
I spent the past few months building my personal website from the ground up, finally taking the time to incorporate some 3D work to showcase my shader and WebGL skills. Throughout this work, I got to truly understand the crucial role that post-processing plays in making a scene actually look good, which brought some resolutions to long-term frustrations I had with my past React Three Fiber and shader projects where my vision wouldn’t materialize regardless of the amount of work and care I was putting into them.
Taking the time to build, combine, and experiment with custom post-processing effects gave me an additional creative outlet, and among the many types I got to iterate on, I always had a particular affection for the several “retro” effects I came up with. With subtle details such as dithering, color quantization, or pixelization/CRT RGB cells, they bring a pleasant contrast between the modern web landscape and a long-gone era of technology we 90s/early 2000s kids are sometimes longing for.
source: HN
JavaScript Temporal is coming
https://developer.mozilla.org/en-US/blog/javascript-temporal-is-coming/ [developer.mozilla.org]
2025-01-30 20:14
tags:
browser
javascript
library
programming
update
web
Implementations of the new JavaScript Temporal object are starting to be shipped in experimental releases of browsers. This is big news for web developers because working with dates and times in JavaScript will be hugely simplified and modernized.
source: HN
Bilinear down/upsampling, aligning pixel grids, and that infamous GPU half pixel offset
https://bartwronski.com/2021/02/15/bilinear-down-upsampling-pixel-grids-and-that-half-pixel-offset/ [bartwronski.com]
2025-01-27 23:28
tags:
graphics
programming
So I figured it’s an opportunity for another short blog post – on bilinear filtering, but in context of down/upsampling. We will touch here on GPU half pixel offsets, aligning pixel grids, a bug / confusion in Tensorflow, deeper signal processing analysis of what’s going on during bilinear operations, and analysis of the magic of the famous “magic kernel”.
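The half pixel offset is easiest to see in one dimension. The Python sketch below assumes the common convention that pixel `i` has its center at `i + 0.5` and that samples outside the source are clamped to the edge. Both are assumptions of this sketch, not something taken from the post, and `resize_linear_1d` is a made-up name.

```python
import math


def resize_linear_1d(src, dst_size):
    """Linearly resample a 1D signal with pixel centers at half-integers."""
    scale = len(src) / dst_size
    out = []
    for i in range(dst_size):
        # Map the destination pixel center into source index space.
        x = (i + 0.5) * scale - 0.5
        # Clamp so samples past the border reuse the edge pixel.
        x = min(max(x, 0.0), len(src) - 1.0)
        lo = int(math.floor(x))
        hi = min(lo + 1, len(src) - 1)
        t = x - lo
        out.append(src[lo] * (1.0 - t) + src[hi] * t)
    return out
```

Downsampling `[0, 2, 4, 6]` to two samples averages each pair of neighbors. Dropping the `+ 0.5` and `- 0.5` terms would instead pick up source pixels 0 and 2 unchanged, which shifts the result by half a source pixel.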
source: HN
Go 1.24 interactive tour
https://antonz.org/go-1-24/ [antonz.org]
2025-01-15 21:07
tags:
garbage-collection
go
programming
update
Go 1.24 is scheduled for release in February, so it’s a good time to explore what’s new. The official release notes are pretty dry, so I prepared an interactive version with lots of examples showing what has changed and what the new behavior is.
source: L
Why is my CPU usage always 100%?
https://www.downtowndougbrown.com/2024/04/why-is-my-cpu-usage-always-100-upgrading-my-chumby-8-kernel-part-9/ [www.downtowndougbrown.com]
2025-01-13 22:14
tags:
bugfix
c
investigation
linux
programming
systems
That’s really weird! Why would top be using all of my CPU? It says 100% usr in the second line. Sometimes the usage showed up as 50% usr and 50% sys. Other times it would show up as 100% sys. And very rarely, it would show 100% idle. In that rare case, top would actually show up with 0% usage as I would expect. The 2.6.28 kernel did not have this problem, so it was something different about my newer kernel.
source: HN
WorstFit: Unveiling Hidden Transformers in Windows ANSI!
https://blog.orange.tw/posts/2025-01-worstfit-unveiling-hidden-transformers-in-windows-ansi/ [blog.orange.tw]
2025-01-10 14:54
tags:
exploit
programming
security
text
turtles
windows
The research unveils a new attack surface in Windows by exploiting Best-Fit, an internal charset conversion feature. Through our work, we successfully transformed this feature into several practical attacks, including Path Traversal, Argument Injection, and even RCE, affecting numerous well-known applications!
source: HN
How to triangulate a polyline with thickness
https://jvernay.fr/en/blog/polyline-triangulation/ [jvernay.fr]
2025-01-05 22:33
tags:
c
gl
graphics
interactive
programming
visualization
To render any geometric figure to a GPU (with OpenGL / Direct3D / Vulkan / ...), it must first be triangulated, i.e. decomposed into a series of triangles. Some figures are trivial to transform into triangles: for instance, a segment with thickness is represented by a rectangle, which can be rendered with two triangles. But a segment strip with thickness (a.k.a. polyline) is not trivial.
Ultimately, this exploration has been a rabbit hole, also partly due to some digressions along the path — let’s prototype with a bare implementation of GeoGebra in vanilla JavaScript — let’s do a WebGL + WASM demo to verify the algorithm works correctly ... 😅 At least, it gives some fancy interactive visuals for this blog post. 😁
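The trivial case mentioned in the excerpt, a single segment with thickness, can be sketched in a few lines of Python. This covers only the rectangle-as-two-triangles step and none of the join handling that makes polylines hard. The function name is hypothetical and is not taken from the post.

```python
import math


def thick_segment_triangles(p0, p1, thickness):
    """Return two triangles covering the segment p0-p1 widened to thickness."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return []  # degenerate segment: nothing to draw
    # Unit normal to the segment, scaled to half the thickness.
    nx = -dy / length * thickness / 2
    ny = dx / length * thickness / 2
    a = (p0[0] + nx, p0[1] + ny)
    b = (p0[0] - nx, p0[1] - ny)
    c = (p1[0] - nx, p1[1] - ny)
    d = (p1[0] + nx, p1[1] + ny)
    # Split the rectangle a-b-c-d along its diagonal a-c.
    return [(a, b, c), (a, c, d)]
```

Both triangles share the diagonal from `a` to `c` and have the same winding order, so together they cover the rectangle exactly once.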
source: HN
Don't clobber the frame pointer
https://nsrip.com/posts/clobberfp.html [nsrip.com]
2025-01-05 09:34
tags:
bugfix
compiler
cpu
go
programming
Recently I diagnosed and fixed two frame pointer unwinding crashes in Go. The root causes were two flavors of the same problem: buggy assembly code clobbered a frame pointer. By “clobbered” I mean wrote over the value without saving & restoring it. One bug clobbered the frame pointer register. The other bug clobbered a frame pointer saved on the stack. This post explains the bugs, talks a bit about ABIs and calling conventions, and makes some recommendations for how to avoid the bugs.
source: L
Way too many ways to wait on a child process with a timeout
https://gaultier.github.io/blog/way_too_many_ways_to_wait_for_a_child_process_with_a_timeout.html [gaultier.github.io]
2025-01-04 18:00
tags:
best
c
concurrency
programming
systems
unix
So let’s implement our own that does both! As we’ll see, it’s much less straightforward, and thus more interesting, than I thought. It’s a whirlwind tour through Unix deeps. If you’re interested in systems programming, Operating Systems, multiplexed I/O, data races, weird historical APIs, and all the ways you can shoot yourself in the foot with just a few system calls, you’re in the right place!
Very good.
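For contrast with the article's hand-rolled C approaches, this is roughly what the same task looks like when a standard library hides the mechanism. It is a Python sketch using `subprocess`, not one of the article's implementations, and `run_with_timeout` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; return its exit code, or None if it had to be killed."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # A child that sleeps longer than the timeout gets killed.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5))
```

The second `wait()` after `kill()` matters: without it the killed child is never reaped.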
source: trivium