Links

tag: concurrency

parking_lot: ffffffffffffffff...

https://fly.io/blog/parking-lot-ffffffffffffffff/ [fly.io]

2025-05-31 02:27

tags: bugfix concurrency programming rust

You’re reading a 3,000 word blog post about a single concurrency bug, so my guess is you’re the kind of person who compulsively wants to understand how everything works. That’s fine, but a word of advice: there are things where, if you find yourself learning about them in detail, something has gone wrong.

source: L

Go Scheduler

https://nghiant3223.github.io/2025/04/15/go-scheduler.html [nghiant3223.github.io]

2025-05-21 22:40

tags: article concurrency go programming systems

Understanding the Go scheduler is crucial for Go programmers who want to write efficient concurrent programs. It also helps us become better at troubleshooting performance issues or tuning the performance of our Go programs. In this post, we will explore how the Go scheduler evolved over time, and how the Go code we write runs under the hood.

source: HN

I want a good parallel computer

https://raphlinus.github.io/gpu/2025/03/21/good-parallel-computer.html [raphlinus.github.io]

2025-03-22 17:56

tags: concurrency cpu graphics hardware programming

I believe a simpler, more powerful parallel computer is possible, and that there are signs in the historical record. In a slightly alternate universe, we would have those computers now, and be doing the work of designing algorithms and writing programs to run well on them, for a very broad range of tasks.

source: L

Travertine (CVE-2025-24118) - An absolutely wild race condition in the macOS kernel

https://jprx.io/cve-2025-24118/ [jprx.io]

2025-03-14 23:14

tags: auth c concurrency exploit macos security systems

It involves a combination of several cutting-edge features in the macOS kernel (XNU): Safe Memory Reclamation (SMR), read-only page mappings, per-thread credentials, memcpy implementation details, and of course, a race condition tying everything together. This bug allows for corruption of a thread’s kauth_cred_t credential pointer. Specifically, the SMR-protected p_ucred field of a process’s read-only struct can be corrupted to point to invalid memory, or potentially to a different (maybe even more privileged) credential.

https://github.com/jprx/CVE-2025-24118

source: trivium

0+0 > 0: C++ thread-local storage performance

https://yosefk.com/blog/cxx-thread-local-storage-performance.html [yosefk.com]

2025-02-17 21:29

tags: compiler concurrency cxx library perf programming

We’ll discuss how to make sure that your access to TLS (thread-local storage) is fast. If you’re interested strictly in TLS performance guidelines and don’t care about the details, skip right to the end — but be aware that you’ll be missing out on assembly listings of profound emotional depth, which can shake even a cynical, battle-hardened programmer. If you don’t want to miss out on that — and who would?! — read on, and you shall learn the computer-scientific insight behind the intriguing inequality 0+0 > 0.

source: HN

Way too many ways to wait on a child process with a timeout

https://gaultier.github.io/blog/way_too_many_ways_to_wait_for_a_child_process_with_a_timeout.html [gaultier.github.io]

2025-01-04 18:00

tags: best c concurrency programming systems unix

So let’s implement our own that does both! As we’ll see, it’s much less straightforward, and thus more interesting, than I thought. It’s a whirlwind tour through Unix deeps. If you’re interested in systems programming, Operating Systems, multiplexed I/O, data races, weird historical APIs, and all the ways you can shoot yourself in the foot with just a few system calls, you’re in the right place!

Very good.
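
For comparison with the C approaches the article tours, here is a minimal Python sketch of the same goal: wait for a child, but give up after a deadline. It is not the article's code. The helper name `wait_with_timeout` and the two sample child commands are made up for illustration, and it leans on the standard library's `Popen.wait(timeout=...)` rather than raw system calls.

```python
import subprocess
import sys


def wait_with_timeout(args, timeout_s):
    """Run a child process; return its exit code, or None if it was killed on timeout."""
    child = subprocess.Popen(args)
    try:
        return child.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        child.kill()  # the deadline passed: terminate the child
        child.wait()  # reap it so no zombie process is left behind
        return None


# Hypothetical children: one exits at once with code 3, one sleeps past the deadline.
quick = [sys.executable, "-c", "import sys; sys.exit(3)"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]

if __name__ == "__main__":
    print(wait_with_timeout(quick, 10))
    print(wait_with_timeout(slow, 0.2))
```

The second `wait()` after `kill()` matters: without it the killed child stays around as a zombie until the parent exits.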

source: trivium

The case of the application that used thread local storage it never allocated

https://devblogs.microsoft.com/oldnewthing/20221128-00/?p=107456 [devblogs.microsoft.com]

2024-03-15 22:42

tags: bugfix concurrency development malloc programming windows

Upon closer inspection, the real problem was not that the application’s TLS was being corrupted. The problem was that the application was using TLS slots it never allocated, so it was inadvertently using somebody else’s TLS slots as its own. And of course, when the true owner updated the TLS value, the application interpreted that as corruption.

Smashing the state machine: the true potential of web race conditions

https://portswigger.net/research/smashing-the-state-machine [portswigger.net]

2023-08-10 16:24

tags: concurrency exploit networking security web

HTTP request processing isn’t atomic - any endpoint might be sending an application through invisible sub-states. This means that with race conditions, everything is multi-step. The single-packet attack solves network jitter, making it as though every attack is on a local system. This exposes vulnerabilities that were previously near-impossible to detect or exploit.
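
A toy model of the "invisible sub-state" idea, as a rough sketch only: two simulated requests redeem the same single-use coupon, and a `threading.Barrier` holds both of them in the gap between the check and the update. The `Coupon` class and function names are invented for illustration. The barrier stands in for the precise timing alignment the article gets from the single-packet attack; it is not that technique.

```python
import threading


class Coupon:
    """A single-use coupon guarded only by a check-then-act sequence."""

    def __init__(self):
        self.used = False
        self.redemptions = []  # list.append is atomic in CPython


def redeem(coupon, who, gate):
    if not coupon.used:  # step 1: check
        gate.wait()  # both requests now sit in the sub-state between check and update
        coupon.used = True  # step 2: act
        coupon.redemptions.append(who)


def race_redeem():
    """Fire two aligned requests at one coupon and return how many succeeded."""
    coupon = Coupon()
    gate = threading.Barrier(2)
    threads = [
        threading.Thread(target=redeem, args=(coupon, name, gate))
        for name in ("request-a", "request-b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(coupon.redemptions)


if __name__ == "__main__":
    print(race_redeem())
```

Both requests pass the check before either one records the redemption, so the coupon is redeemed twice. Holding a lock across the check and the update together closes the window.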

source: L

Breaking java.lang.String

https://wouter.coekaerts.be/2023/breaking-string [wouter.coekaerts.be]

2023-07-11 23:58

tags: concurrency java programming

Let’s abuse a bug in java.lang.String to make some weird Strings. We’ll make “hello world” not start with “hello”, and show that not all empty Strings are equal to each other.

source: HN

Paving the Road to Vulkan on Asahi Linux

https://asahilinux.org/2023/03/road-to-vulkan/ [asahilinux.org]

2023-03-20 18:25

tags: concurrency gl graphics linux programming systems

In every modern OS, GPU drivers are split into two parts: a userspace part, and a kernel part. The kernel part is in charge of managing GPU resources and how they are shared between apps, and the userspace part is in charge of converting commands from a graphics API (such as OpenGL or Vulkan) into the hardware commands that the GPU needs to execute.

Between those two parts, there is something called the Userspace API or “UAPI”. This is the interface that they use to communicate between them, and it is specific to each class of GPUs! Since the exact split between userspace and the kernel can vary depending on how each GPU is designed, and since different GPU designs require different bits of data and parameters to be passed between userspace and the kernel, each new GPU driver requires its own UAPI to go along with it.

source: HN

The futex_waitv() syscall and gaming on Linux

https://www.collabora.com/news-and-blog/blog/2023/02/17/the-futex-waitv-syscall-gaming-on-linux/ [www.collabora.com]

2023-02-17 23:48

tags: concurrency gaming linux perf programming systems

The futex_waitv syscall is a new syscall through which the process can wait for multiple futexes. The task wakes up when any futex in the list is awakened. This can be used to implement wait on multiple locks and wait lists, etc, without the limitations imposed by using eventfd.

source: L

How fast are Linux pipes anyway?

https://mazzo.li/posts/fast-pipes.html [mazzo.li]

2022-06-02 22:56

tags: concurrency linux malloc perf programming systems

In this post, we will explore how Unix pipes are implemented in Linux by iteratively optimizing a test program that writes and reads data through a pipe.

We will proceed as follows:
A first slow version of our pipe test bench;
How pipes are implemented internally, and why writing and reading from them is slow;
How the vmsplice and splice syscalls let us get around some (but not all!) of the slowness;
A description of Linux paging, leading up to a faster version using huge pages;
The final optimization, replacing polling with busy looping;
Some closing thoughts.
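
To make the first step of that outline concrete, here is a small Python analogue of a naive pipe test bench: one thread writes fixed-size chunks into a pipe while the main thread reads until end of file. This is only a sketch under stated assumptions. The article's bench is written in C, and the function name and default sizes below are made up. It illustrates the plain write/read path only, not vmsplice, splice, or huge pages.

```python
import os
import threading
import time


def pipe_throughput(total_bytes=1 << 24, chunk=1 << 16):
    """Push total_bytes through an OS pipe; return (bytes_received, seconds)."""
    read_fd, write_fd = os.pipe()
    buf = b"\0" * chunk

    def writer():
        sent = 0
        while sent < total_bytes:
            # os.write may write fewer bytes than asked, so count what it reports.
            sent += os.write(write_fd, buf[: min(chunk, total_bytes - sent)])
        os.close(write_fd)  # closing the write end makes the reader see end of file

    t = threading.Thread(target=writer)
    start = time.perf_counter()
    t.start()
    received = 0
    while True:
        data = os.read(read_fd, chunk)
        if not data:  # empty read: the writer closed its end
            break
        received += len(data)
    elapsed = time.perf_counter() - start
    t.join()
    os.close(read_fd)
    return received, elapsed


if __name__ == "__main__":
    n, secs = pipe_throughput()
    print(f"{n / secs / 1e9:.2f} GB/s")
```

Every chunk here is copied into the kernel's pipe buffer on write and copied out again on read, and that double copy is the slowness the later steps of the article work around.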

source: L

What's new in CPUs since the 80s?

https://danluu.com/new-cpu-features/ [danluu.com]

2022-04-19 17:10

tags: article concurrency cpu perf programming systems

Everything below refers to x86 and linux, unless otherwise indicated. History has a tendency to repeat itself, and a lot of things that were new to x86 were old hat to supercomputing, mainframe, and workstation folks.

x86 chips have picked up a lot of new features and whiz-bang gadgets.

Overall, a pretty good introduction to modern CPUs, performance, and concurrency.

Why Rust mutexes look like they do

http://cliffle.com/blog/rust-mutexes/ [cliffle.com]

2022-04-02 05:25

tags: concurrency programming rust

In the rest of this post I’ll walk through a typical C mutex API, compare with a typical Rust mutex API, and look at what happens if we change the Rust API to resemble C in various ways.

source: HN

Curious lack of sprintf scaling

https://aras-p.info/blog/2022/02/25/Curious-lack-of-sprintf-scaling/ [aras-p.info]

2022-02-25 22:08

tags: c concurrency investigation mac perf programming

Some days ago I noticed that on a Mac, doing snprintf calls from multiple threads shows a curious lack of scaling (see tweet). Replacing snprintf with the {fmt} library can speed up the OBJ exporter in Blender 3.2 by 3-4 times. This could have been the end of the story, filed under an “eh, sprintf is bad!” drawer, but I started to wonder why it shows this lack of scaling.

source: HN

Beyond the Remake of 'Shadow of the Colossus': A Technical Perspective

https://www.youtube.com/watch?v=fcBZEZWGYek [www.youtube.com]

2022-02-23 06:20

tags: concurrency development gaming malloc programming video

Intro to porting games between platforms, followed by a deep walkthrough of a custom allocator library.

Eliminating Data Races in Firefox – A Technical Report

https://hacks.mozilla.org/2021/04/eliminating-data-races-in-firefox-a-technical-report/ [hacks.mozilla.org]

2021-04-07 00:02

tags: compiler concurrency cxx development programming update

We successfully deployed ThreadSanitizer in the Firefox project to eliminate data races in our remaining C/C++ components. In the process, we found several impactful bugs and can safely say that data races are often underestimated in terms of their impact on program correctness. We recommend that all multithreaded C/C++ projects adopt the ThreadSanitizer tool to enhance code quality.

source: HN

ARM and Lock-Free Programming

https://randomascii.wordpress.com/2020/11/29/arm-and-lock-free-programming/ [randomascii.wordpress.com]

2020-12-11 04:33

tags: concurrency cxx programming systems

This is intended to be a casual introduction to the perils of lock-free programming (which I last wrote about some fifteen years ago), but also some explanation of why ARM’s weak memory model breaks some code, and why that code was probably broken already. I also want to explain why C++11 made the lock-free situation strictly better (objections to the contrary notwithstanding).

What went wrong with the libdispatch. A tale of caution for the future of concurrency.

https://tclementdev.com/posts/what_went_wrong_with_the_libdispatch.html [tclementdev.com]

2020-11-25 01:48

tags: concurrency development library mac programming

The future was multithreading and we had to use the libdispatch to get there. So we did.

As we went down that rabbit hole, things got progressively worse.

source: L

Windows Timer Resolution: The Great Rule Change

https://randomascii.wordpress.com/2020/10/04/windows-timer-resolution-the-great-rule-change/ [randomascii.wordpress.com]

2020-10-11 22:01

tags: concurrency systems update windows

The behavior of the Windows scheduler changed significantly in Windows 10 2004, in a way that will break a few applications, and there appears to have been no announcement, and the documentation has not been updated. This isn’t the first time this has happened, but this change seems bigger than last time.

The short version is that calls to timeBeginPeriod from one process now affect other processes less than they used to, but there is still an effect.