How fast are Linux pipes anyway?
https://mazzo.li/posts/fast-pipes.html [mazzo.li]
2022-06-02 22:56
tags:
concurrency
linux
malloc
perf
programming
systems
In this post, we will explore how Unix pipes are implemented in Linux by iteratively optimizing a test program that writes and reads data through a pipe.
We will proceed as follows:
A first slow version of our pipe test bench;
How pipes are implemented internally, and why writing and reading from them is slow;
How the vmsplice and splice syscalls let us get around some (but not all!) of the slowness;
A description of Linux paging, leading up to a faster version using huge pages;
The final optimization, replacing polling with busy looping;
Some closing thoughts.
source: L
What's new in CPUs since the 80s?
https://danluu.com/new-cpu-features/ [danluu.com]
2022-04-19 17:10
tags:
article
concurrency
cpu
perf
programming
systems
Everything below refers to x86 and linux, unless otherwise indicated. History has a tendency to repeat itself, and a lot of things that were new to x86 were old hat to supercomputing, mainframe, and workstation folks.
x86 chips have picked up a lot of new features and whiz-bang gadgets.
Overall, a pretty good introduction to modern CPUs, performance, and concurrency.
Why Rust mutexes look like they do
http://cliffle.com/blog/rust-mutexes/ [cliffle.com]
2022-04-02 05:25
tags:
concurrency
programming
rust
In the rest of this post I’ll walk through a typical C mutex API, compare with a typical Rust mutex API, and look at what happens if we change the Rust API to resemble C in various ways.
source: HN
Curious lack of sprintf scaling
https://aras-p.info/blog/2022/02/25/Curious-lack-of-sprintf-scaling/ [aras-p.info]
2022-02-25 22:08
tags:
c
concurrency
investigation
mac
perf
programming
Some days ago I noticed that on a Mac, doing snprintf calls from multiple threads shows curious lack of scaling (see tweet). Replacing snprintf with {fmt} library can speed up the OBJ exporter in Blender 3.2 by 3-4 times. This could have been the end of the story, filed under a “eh, sprintf is bad!” drawer, but I started to wonder why it shows this lack of scaling.
source: HN
Beyond the Remake of 'Shadow of the Colossus': A Technical Perspective
https://www.youtube.com/watch?v=fcBZEZWGYek [www.youtube.com]
2022-02-23 06:20
tags:
concurrency
development
gaming
malloc
programming
video
An intro to porting games between platforms, followed by a deep walkthrough of a custom allocator library.
Eliminating Data Races in Firefox – A Technical Report
https://hacks.mozilla.org/2021/04/eliminating-data-races-in-firefox-a-technical-report/ [hacks.mozilla.org]
2021-04-07 00:02
tags:
compiler
concurrency
cxx
development
programming
update
We successfully deployed ThreadSanitizer in the Firefox project to eliminate data races in our remaining C/C++ components. In the process, we found several impactful bugs and can safely say that data races are often underestimated in terms of their impact on program correctness. We recommend that all multithreaded C/C++ projects adopt the ThreadSanitizer tool to enhance code quality.
source: HN
ARM and Lock-Free Programming
https://randomascii.wordpress.com/2020/11/29/arm-and-lock-free-programming/ [randomascii.wordpress.com]
2020-12-11 04:33
tags:
concurrency
cxx
programming
systems
This is intended to be a casual introduction to the perils of lock-free programming (which I last wrote about some fifteen years ago), but also some explanation of why ARM’s weak memory model breaks some code, and why that code was probably broken already. I also want to explain why C++11 made the lock-free situation strictly better (objections to the contrary notwithstanding).
What went wrong with the libdispatch. A tale of caution for the future of concurrency.
https://tclementdev.com/posts/what_went_wrong_with_the_libdispatch.html [tclementdev.com]
2020-11-25 01:48
tags:
concurrency
development
library
mac
programming
The future was multithreading and we had to use the libdispatch to get there. So we did.
As we went down that rabbit hole, things got progressively worse.
source: L
Windows Timer Resolution: The Great Rule Change
https://randomascii.wordpress.com/2020/10/04/windows-timer-resolution-the-great-rule-change/ [randomascii.wordpress.com]
2020-10-11 22:01
tags:
concurrency
systems
update
windows
The behavior of the Windows scheduler changed significantly in Windows 10 2004, in a way that will break a few applications, and there appears to have been no announcement, and the documentation has not been updated. This isn’t the first time this has happened, but this change seems bigger than last time.
The short version is that calls to timeBeginPeriod from one process now affect other processes less than they used to, but there is still an effect.
The Watchdog Hydra
https://thedailywtf.com/articles/the-watchdog-hydra [thedailywtf.com]
2020-09-29 01:23
tags:
auth
bugfix
concurrency
Ammar checked, and sure enough, his code was sending hundreds of thousands of requests per second. It didn’t take him long to figure out why: requests from the watchdog were failing with a 500 error, so it called the login method. The login method had been succeeding, so another watchdog got scheduled. Thirty seconds later, that failed, as did all the previously scheduled watchdogs, which all called login again. Which, on success, scheduled a fresh round of watchdogs. Every thirty seconds, the number of scheduled calls doubled. Before long, Ammar’s code was DoSing the API.
Fun times.
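The doubling is easy to underestimate, so here is a small back-of-the-envelope sketch in Go. It assumes a single watchdog at the start (the story does not give the starting count) and that every failing watchdog stays scheduled while its login call schedules one more. The function name scheduledWatchdogs is made up for this note.

```go
package main

import "fmt"

// scheduledWatchdogs returns how many watchdog calls are scheduled after the
// given number of 30-second rounds. Assumption: one watchdog at round 0, and
// each round every existing watchdog fails, logs in again, and thereby adds
// one fresh watchdog, so the count doubles.
func scheduledWatchdogs(rounds uint) uint64 {
	return uint64(1) << rounds
}

func main() {
	for _, minutes := range []uint{1, 5, 10} {
		rounds := minutes * 2 // two 30-second rounds per minute
		fmt.Printf("after %2d min: %d scheduled watchdogs\n",
			minutes, scheduledWatchdogs(rounds))
	}
}
```

Under these assumptions, ten minutes (20 rounds) is enough to go from one watchdog to 1,048,576.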
xi-editor retrospective
https://raphlinus.github.io/xi/2020/06/27/xi-retrospective.html [raphlinus.github.io]
2020-07-01 00:55
tags:
compsci
concurrency
development
programming
rust
swtools
text
I still believe it would be possible to build a high quality editor based on the original design. But I also believe that this would be quite a complex system, and require significantly more work than necessary.
A few good ideas and observations can be mined from this post.
source: L
Memory Ordering in Modern Microprocessors
https://www.linuxjournal.com/article/8211 [www.linuxjournal.com]
2020-06-18 00:17
tags:
concurrency
cpu
programming
systems
Latency in Asynchronous Python
https://nullprogram.com/blog/2020/05/24/ [nullprogram.com]
2020-05-26 22:00
tags:
concurrency
programming
python
This week I was debugging a misbehaving Python program that makes significant use of Python’s asyncio. The program would eventually take very long periods of time to respond to network requests. My first suspicion was a CPU-heavy coroutine hogging the thread, preventing the socket coroutines from running, but an inspection with pdb showed this wasn’t the case. Instead, the program’s author had made a couple of fundamental mistakes using asyncio. Let’s discuss them using small examples.
When Parallel: Pull, Don't Push
https://nullprogram.com/blog/2020/04/30/ [nullprogram.com]
2020-05-01 04:54
tags:
concurrency
programming
I’ve noticed a small pattern across a few of my projects where I had vectorized and parallelized some code. The original algorithm had a “push” approach, the optimized version instead took a “pull” approach. In this article I’ll describe what I mean, though it’s mostly just so I can show off some pretty videos, pictures, and demos.
What Outranks Thread Priority?
https://randomascii.wordpress.com/2020/04/14/what-outranks-thread-priority/ [randomascii.wordpress.com]
2020-04-15 11:45
tags:
concurrency
investigation
perf
systems
turtles
ux
windows
This investigation started, as so many of mine do, with me minding my own business, not looking for trouble. In this case all I was doing was opening my laptop lid and trying to log on. The first few times that this resulted in a twenty-second delay I ignored the problem, hoping that it would go away. The next few times I thought about investigating, but performance problems that occur before you have even logged on are trickier to solve, and I was feeling lazy. When I noticed that I was avoiding closing my laptop because I dreaded the all-too-frequent delays when opening it I realized it was time to get serious.
A lot of effort for a rather unsatisfactory conclusion, but I won’t spoil the surprise.
Exploiting Race Conditions Using the Scheduler
https://www.youtube.com/watch?v=MIJL5wLUtKE [www.youtube.com]
2020-04-10 01:04
tags:
concurrency
exploit
linux
security
video
This talk shows how two bugs involving somewhat narrow-looking race windows (https://crbug.com/project-zero/1695 in the Linux kernel, https://crbug.com/project-zero/1741 in Android userspace code) can be stretched wide enough to win the race conditions on a Google Pixel 2 phone, running a Linux 4.4 kernel, by making use of the unprivileged sched_*() syscalls.
source: grugq
How Go's net.DialContext() stops things when the context is cancelled
https://utcc.utoronto.ca/~cks/space/blog/programming/GoDialCancellationHow [utcc.utoronto.ca]
2020-01-17 02:04
tags:
concurrency
go
programming
When I started looking into the relevant standard library code I expected to find that things like net.Dialer.DialContext() had special hooks into the runtime’s network poller (netpoller) to do this. This turns out to not be the case; instead dialing uses an interesting and elegant approach that’s open to everyone doing network IO.
In order to abort an outstanding dial operation if the context is cancelled, the net package simply sets an expired (write) deadline.
Stop worrying about blocking: the new async-std runtime, inspired by Go
https://async.rs/blog/stop-worrying-about-blocking-the-new-async-std-runtime/ [async.rs]
2019-12-17 00:45
tags:
concurrency
library
programming
release
rust
async-std is a mature and stable port of the Rust standard library to its new async/await world, designed to make async programming easy, efficient, worry- and error-free.
Today, we’re introducing the new async-std runtime. It features a lot of improvements, but the main news is that it eliminates a major source of bugs and performance issues in concurrent programs: accidental blocking.
source: L
The Go runtime scheduler's clever way of dealing with system calls
https://utcc.utoronto.ca/~cks/space/blog/programming/GoSchedulerAndSyscalls [utcc.utoronto.ca]
2019-12-08 18:34
tags:
concurrency
go
programming
One of Go’s signature features is goroutines, which are lightweight threads that are managed by the Go runtime. The Go runtime implements goroutines using a M:N work stealing scheduler to multiplex goroutines on to operating system threads. The scheduler has special terminology for three important entities; a G is a goroutine, an M is an OS thread (a ‘machine’), and a P is a ‘processor’, which at its core is a limited resource that must be claimed by an M in order to run Go code. Having a limited supply of Ps is how Go limits how many things it will do at once, so as to not overload the overall system; generally there is one P per actual CPU that the OS reports (the number of Ps is GOMAXPROCS).
source: HN
Thread-safety, torn reads, and the like
http://joeduffyblog.com/2006/02/07/threadsafety-torn-reads-and-the-like/ [joeduffyblog.com]
2019-11-10 01:30
tags:
concurrency
programming
I was on a mail thread today, the topic for which was the meaning—and perhaps lack of comprehensiveness—of the statement: “This type is thread safe.” Similar statements are scattered throughout our product documentation, without any good central explanation of its meaning and any caveats.