Links

tag: text

Slightly better named character reference tokenization than Chrome, Safari, and Firefox

https://www.ryanliptak.com/blog/better-named-character-reference-tokenization/ [www.ryanliptak.com]

2025-06-27 23:28

tags: browser compsci html programmer text

So, I thought I’d take you through what I came up with and how it compares to the implementations in the major browser engines. Mostly, though, I just think the data structure I used is neat and want to tell you about it (fair warning: it’s not novel).

This article continually and relentlessly grew in scope, and has ended up quite a bit more in-depth than I originally imagined.

source: L

The Missing 11th of the Month

https://drhagen.com/blog/the-missing-11th-of-the-month/ [drhagen.com]

2025-06-19 01:07

tags: history investigation language text

There are not many days that seem to be smaller than the typical size. February 29th is a tiny speck, for instance. But if you stare at the comic long enough, you may get the impression that the 11th of most months is unusually small. The title text of the comic concurs, reading “In months other than September, the 11th is mentioned substantially less often than any other date. It’s been that way since long before 9/11 and I have no idea why.” After digging into the raw data, I believe I have figured out why.

source: HN

UCSD Pascal In Depth

https://markbessey.blog/2025/04/29/ucsd-pascal-in-depth/ [markbessey.blog]

2025-05-28 05:09

tags: pascal programming retro series systems text

The p-System comes with an editor. It’s a full-screen editor, with some fairly advanced features for the time, like auto-indent, bookmarks, and cut and paste. It’s modal, which is hardly surprising, considering that modal editors were the latest usability improvement of the age, compared to the line-oriented editors of the previous decade.

Also: https://markbessey.blog/2025/04/30/ucsd-pascal-in-depth-2/

Some features of the p-System were really ahead of their time. And then, there is the filesystem. Whenever you set out to create any software, but especially an operating system, which you intend to be aggressively cross-platform, you inevitably run into conflicts between being sophisticated, and hitting the lowest common denominator.

Also: https://markbessey.blog/2025/05/08/ucsd-pascal-in-depth-3-n/

But the 1970s were a very different time. So let’s talk about the text file format for the UCSD p-System. This is not just something that applies to the text editor, incidentally. If you declare a file as “text” type in Pascal, it gets the same formatting applied. The formatting is transparently stripped from the file if you send it to the PRINTER: or CONSOLE: device.

Overview: https://markbessey.blog/ucsd-p-system-info/

Also: https://github.com/mbessey/p-system-tools

source: trivium

Beating the Fastest Lexer Generator in Rust

https://alic.dev/blog/fast-lexing [alic.dev]

2025-05-09 19:07

tags: compiler perf programming rust text

I was aware of the efficiency of state machine driven lexers, but most generators have one problem: they can’t be arbitrarily generic and consistently optimal at the same time. There will always be some assumptions about your data that are either impossible to express, or outside the scope of the generator’s optimizations. Either way, I was curious to find out how my hand-rolled implementation would fare.

source: L

Memory safety for web fonts

https://developer.chrome.com/blog/memory-safety-fonts [developer.chrome.com]

2025-03-19 22:52

tags: browser graphics library text

The FreeType library is used by Chrome to compute metrics and load hinted outlines from fonts. Overall, use of FreeType has been a huge win for Google. It does a complex job, and does it well; we rely on it extensively and contribute back to it. However, it is written in unsafe code and has its origins in a time when malicious inputs were less likely. Merely keeping up with the stream of issues found by fuzzing costs Google at least 0.25 full time software engineers. Worse, we observably don’t find everything or find things only after the code has shipped to users.

source: HN

Shift Happens - A book about keyboards

https://shifthappens.site/ [shifthappens.site]

2025-03-14 23:25

tags: book history interactive text

The book is sold out, but there are some fun widgets to play with as well.

Kerning, the Hard Way

https://home.octetfont.com/blog/kerning-hard.html [home.octetfont.com]

2025-03-14 20:29

tags: design graphics text

It looks a bit like L and T have been clipped, but in fact they’ve been drawn over. Black parts of L overlap the T, and vice versa: black parts of the T overlap L. The effect is what you can see: where L and T share a space, the black bars overlap and are solid, obliterating the reversed out letterforms. So how do I kern this font, if not with GPOS lookups?

source: L

The hardest working font in Manhattan

https://aresluna.org/the-hardest-working-font-in-manhattan/ [aresluna.org]

2025-02-17 21:05

tags: article design history photos text urban

In 2007, on my first trip to New York City, I grabbed a brand-new DSLR camera and photographed all the fonts I was supposed to love. I admired American Typewriter in all of the I <3 NYC logos, watched Akzidenz Grotesk and Helvetica fighting over the subway signs, and even caught an occasional appearance of the flawlessly-named Gotham, still a year before it skyrocketed in popularity via Barack Obama’s first campaign.

But there was one font I didn’t even notice, even though it was everywhere around me. Last year in New York, I walked over 100 miles and took thousands of photos of one and one font only. The font’s name is Gorton.

source: L

The history and use of /etc/glob in early Unixes

https://utcc.utoronto.ca/~cks/space/blog/unix/EtcGlobHistory [utcc.utoronto.ca]

2025-01-13 18:57

tags: sh text unix

One of the innovations that the V7 Bourne shell introduced was built in shell wildcard globbing, which is to say expanding things like *, ?, and so on. Of course Unix had shell wildcards well before V7, but in V6 and earlier, the shell didn’t implement globbing itself; instead this was delegated to an external program, /etc/glob (this affects things like looking into the history of Unix shell wildcards, because you have to know to look at the glob source, not the shell).
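
As a rough illustration of what a separate expansion step does, the sketch below expands wildcard arguments against a list of file names before a command would be run. It is only an assumption-laden model, not the V6 code: `expand_args` is a hypothetical helper, the file names are made up, and leaving an unmatched pattern unchanged is a simplification chosen here.

```python
import fnmatch

def expand_args(args, filenames):
    """Expand wildcard arguments against a list of names; leave other arguments alone."""
    expanded = []
    for arg in args:
        if any(ch in arg for ch in "*?["):
            matches = sorted(fnmatch.filter(filenames, arg))
            # Simplification: an unmatched pattern is passed through unchanged.
            expanded.extend(matches if matches else [arg])
        else:
            expanded.append(arg)
    return expanded

# Hypothetical directory contents and command line.
names = ["a.c", "b.c", "notes.txt"]
print(expand_args(["cc", "*.c"], names))
```

The point of the split is that the caller (the shell, in the excerpt's terms) only has to notice wildcard characters; everything about matching lives in the separate expander.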

source: HN

WorstFit: Unveiling Hidden Transformers in Windows ANSI!

https://blog.orange.tw/posts/2025-01-worstfit-unveiling-hidden-transformers-in-windows-ansi/ [blog.orange.tw]

2025-01-10 14:54

tags: exploit programming security text turtles windows

The research unveils a new attack surface in Windows by exploiting Best-Fit, an internal charset conversion feature. Through our work, we successfully transformed this feature into several practical attacks, including Path Traversal, Argument Injection, and even RCE, affecting numerous well-known applications!

source: HN

Cutting edge calligraphy

https://languagelog.ldc.upenn.edu/nll/?p=67761 [languagelog.ldc.upenn.edu]

2025-01-04 07:01

tags: art text video

Green sand and razor blade. Nice video. From https://www.tiktok.com/@qiaobiangugu

The history of Alt+number sequences, and why Alt+9731 sometimes gives you a heart and sometimes a snowman

https://devblogs.microsoft.com/oldnewthing/20240702-00/?p=109951 [devblogs.microsoft.com]

2024-07-02 16:56

tags: text ux windows

A customer reported that a recent Windows update broke their ability to type a snowman by using Alt+9731. We explained that the update was not at fault; rather, Alt+9731 was never supposed to produce a snowman at all! But the customer insisted that it used to work.
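
As a rough illustration of how one key sequence can yield two different characters, the sketch below assumes two interpretations of the typed number: a legacy one that keeps only the low byte and looks it up in the OEM code page, and one that treats the whole number as a Unicode code point. The function names and the partial `CP437_LOW_GLYPHS` table are hypothetical helpers, and code page 437 is an assumed OEM code page.

```python
# Hypothetical, partial table of the glyphs code page 437 shows for low byte values.
CP437_LOW_GLYPHS = {1: "\u263A", 2: "\u263B", 3: "\u2665", 4: "\u2666"}

def alt_code_legacy(number: int) -> str:
    """Assumed legacy behaviour: keep the low 8 bits, then look up the OEM glyph."""
    return CP437_LOW_GLYPHS[number % 256]

def alt_code_unicode(number: int) -> str:
    """Assumed alternative behaviour: treat the number as a Unicode code point."""
    return chr(number)

print(9731 % 256)                          # 3
print(hex(ord(alt_code_legacy(9731))))     # 0x2665, BLACK HEART SUIT
print(hex(ord(alt_code_unicode(9731))))    # 0x2603, SNOWMAN
```

Under these assumptions, which character appears depends entirely on which of the two paths the receiving program takes.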

source: HN

State of the Terminal

https://gpanders.com/blog/state-of-the-terminal/ [gpanders.com]

2024-04-30 04:31

tags: development systems text tty unix

It’s only been in the last couple of years that I’ve begun to dig deep into the inner workings of how terminal emulators, and the applications that run inside of them, really work. I’ve learned that there is a lot of innovation and creative problem solving happening in this space, even though the underlying technology is over half a century old.

I’ve also found that many people who use terminal based tools (including shells like Bash and editors like Vim) know very little about terminals themselves, or some of the modern features and capabilities they can support.

In this article, we’ll discuss some of the problems that terminal based applications have historically had to deal with (and what the modern solutions are) as well as some features that modern terminal emulators support that you may not be aware of.

source: Dfly

How Not To Release Historic Source Code

https://www.os2museum.com/wp/how-not-to-release-historic-source-code/ [www.os2museum.com]

2024-04-28 02:30

tags: development format retro text windows

For practical purposes, old source files are not text files. They are binary files, and must be preserved without modification. It is not OK to take an old source file and convert it to UTF-8. For one thing, UTF-8 didn’t even exist in the times of MASM 5.10 and Microsoft C 5.1, so of course old tools can’t deal with it!
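
A small sketch of why re-encoding changes what an old tool sees. It assumes the file was written in code page 437, which is an illustrative choice; the excerpt does not say which code page the sources used.

```python
# One byte as a DOS-era tool might have stored it: 0xC4 is a box-drawing line in CP437.
original = b"\xc4"

# Re-encoding to UTF-8 keeps the character but changes the bytes on disk.
converted = original.decode("cp437").encode("utf-8")

print(original.hex())                 # c4
print(converted.hex())                # e29480
print(len(original), len(converted))  # 1 3
```

A tool that expects one byte per character now reads three unrelated bytes, which is why the excerpt treats such files as binary.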

source: L

2023 Emoji Law Year-in-Review

https://blog.ericgoldman.org/archives/2024/01/2023-emoji-law-year-in-review.htm [blog.ericgoldman.org]

2024-03-14 23:39

tags: links policy text

I continue to maintain my census of U.S. cases referencing emojis or emoticons. In 2023, I logged 225 such cases (this number will grow a bit due to lags with the electronic databases). The case count continues to grow exponentially. The 2023 count represented a 17% increase over the 2022 count.

a history of the tty

https://computer.rip/2024-02-25-a-history-of-the-tty.html [computer.rip]

2024-03-11 07:44

tags: article hardware retro text tty

It’s one of those anachronisms that is deeply embedded in modern technology. From cloud operator servers to embedded controllers in appliances, there must be uncountable devices that think they are connected to a TTY.

source: Dfly

Fonts are still a Helvetica of a Problem

https://www.canva.dev/blog/engineering/fonts-are-still-a-helvetica-of-a-problem/ [www.canva.dev]

2024-03-06 19:45

tags: security text turtles

CVEs in three strange places and the unique problem of safely processing and handling fonts.

Although the previous research focused primarily on memory corruption bugs in font processing, we wondered what other kinds of security issues might occur when handling fonts.

source: HN

npm search RCE? - Escape Sequence Injection

https://blog.solidsnail.com/posts/npm-esc-seq [blog.solidsnail.com]

2023-12-16 00:59

tags: exploit security text tty turtles

In a previous post I went over a vulnerability I discovered in iTerm2 that allowed code execution in the shell by leveraging the output of a command. Today, we’ll focus on the other side of that interaction, the application running underneath the terminal.
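
One common application-side mitigation (not necessarily the one the post recommends) is to make control characters visible before printing untrusted text, so the terminal never interprets them. A minimal sketch, where `neutralize_controls` and the sample description are hypothetical:

```python
import re

# C0 control characters (except tab and newline), DEL, and C1 controls.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")

def neutralize_controls(untrusted: str) -> str:
    """Replace control characters with a visible \\xNN form."""
    return _CONTROL_CHARS.sub(lambda m: f"\\x{ord(m.group()):02x}", untrusted)

# Hypothetical package description carrying an ANSI colour sequence.
description = "nice package \x1b[31mred text\x1b[0m"
print(neutralize_controls(description))
```

Escaping rather than deleting keeps the suspicious bytes visible to the user, which is a design choice of this sketch.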

"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs

https://dgl.cx/2023/09/ansi-terminal-security [dgl.cx]

2023-10-20 19:20

tags: exploit security text tty turtles unix

This paper reflects work done in late 2022 and 2023 to audit for vulnerabilities in terminal emulators, with a focus on open source software. The results of this work were 10 CVEs against terminal emulators that could result in Remote Code Execution (RCE); in addition, various other bugs and hardening opportunities were found. The exact context and severity of these vulnerabilities varied, but some form of code execution was found to be possible on several common terminal emulators across the main client platforms of today.

source: HN

A Blog Post With Every HTML Element

https://www.patrickweaver.net/blog/a-blog-post-with-every-html-element/ [www.patrickweaver.net]

2023-08-04 00:16

tags: docs essay html standard text ux web

I could, element by element, continue to add support (mostly by making CSS updates for each element to fit in with the rest of my style choices) as I came across specific needs for them, but not one to shy away from an exhaustive exploration, I decided to write this post and attempt to use every element.

A goal of the post was to avoid delaying other future posts with CSS updates on a previously unused element, but in reality it took a year and a half to make all the updates for just this post! I am using the MDN Web Docs list of HTML elements as a reference, which has more than 100 tags divided into a few categories, which I will also use in this post.

source: L