present - A terminal-based presentation tool with colors and effects.
https://github.com/vinayak-mehta/present [github.com]
2020-08-30 21:36
tag: text
A 35-year-old bug in patch found in efforts to restore 29 year old 2.11BSD
http://bsdimp.blogspot.com/2020/08/a-35-year-old-bug-in-patch-found-in.html [bsdimp.blogspot.com]
2020-08-17 17:46
Larry Wall posted patch 1.3 to mod.sources on May 8, 1985. A number of versions followed over the years. It’s been a faithful ally for a long, long time. I’ve never had a problem with patch until I embarked on the 2.11BSD restoration project. In going over the logs very carefully, I’ve discovered a bug that bites this effort twice. It’s quite interesting to use 27 year old patches to find this bug while restoring a 29 year old OS...
source: HN
Implementing a Type-safe printf in Rust
https://willcrichton.net/notes/type-safe-printf/ [willcrichton.net]
2020-08-17 04:35
I show how to use heterogeneous lists and traits to implement a type-safe printf in Rust. These mechanisms can ensure that two variadic argument lists share important properties, like the number of format string holes matches the number of printf arguments.
source: HN
How can CharUpper and CharLower guarantee that the uppercase version of a string is the same length as the lowercase version?
https://devblogs.microsoft.com/oldnewthing/20200804-00/?p=104040 [devblogs.microsoft.com]
2020-08-05 00:49
The CharUpper function tries to convert the string in place, but if the uppercase and lowercase versions of a character are not the same length, then it panics and does something strange.
Also: https://devblogs.microsoft.com/oldnewthing/20200803-00/?p=104038
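To see why an in-place conversion can run into trouble, here is a minimal Python sketch that lists characters whose uppercase form has a different length. It uses Python's `str.upper`, which applies full Unicode case mappings; that is an assumption made for illustration and says nothing about how CharUpper itself is implemented. The helper name is hypothetical.

```python
def length_changing_uppercase(text):
    """Return (char, uppercased) pairs whose uppercase form changes length.

    Hypothetical helper for illustration only; it relies on Python's
    full case mapping, not on the Win32 CharUpper function.
    """
    return [(ch, ch.upper()) for ch in text if len(ch.upper()) != len(ch)]


if __name__ == "__main__":
    # The German sharp s uppercases to two letters in Python.
    print(length_changing_uppercase("Straße"))
```

Any API that promises to keep the buffer size unchanged has to pick a different answer for such characters.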
Let's build a Full-Text Search engine
https://artem.krylysov.com/blog/2020/07/28/lets-build-a-full-text-search-engine/ [artem.krylysov.com]
2020-07-30 16:48
Today we are going to build our own FTS engine. By the end of this post, we’ll be able to search across millions of documents in less than a millisecond. We’ll start with simple search queries like “give me all documents that contain the word cat” and we’ll extend the engine to support more sophisticated boolean queries.
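A rough sketch of the idea in Python, assuming the usual inverted-index approach (the excerpt above does not say which data structure the post uses, and all names here are hypothetical). Each token maps to the set of documents containing it, and a multi-word query is answered by intersecting those sets.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token of the query (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    docs = ["The cat sat", "A dog and a cat", "Dogs bark"]
    index = build_index(docs)
    print(search(index, "cat"))      # documents 0 and 1
    print(search(index, "cat dog"))  # document 1 only
```

There is no stemming here, so "dogs" and "dog" are different tokens; a real engine would add filters for that.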
source: L
SAT solver on top of regex matcher
https://yurichev.com/news/20200621_regex_SAT/ [yurichev.com]
2020-07-08 00:05
A SAT problem is an NP-problem, while regex matching is not. However, a quite popular regex ‘backreferences’ extension extends regex matching to a (hard) NP-problem.
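A small Python sketch of one well-known encoding of SAT as a backreference regex; this is an assumption about the general technique and not necessarily the exact construction in the linked article. Each variable becomes a group `(x?)` that captures either "x" (true) or "" (false), and each clause must match exactly "x," using backreferences to those groups. The function name and the DIMACS-style clause format are my own choices.

```python
import re


def solve_sat(num_vars, clauses):
    """Try to satisfy a CNF formula by regex matching with backreferences.

    clauses: lists of non-zero ints, e.g. [1, -2] means (x1 OR NOT x2).
    Returns a list of booleans (one per variable) or None if unsatisfiable.
    """
    def literal(lit):
        # A positive literal is the bare backreference: it matches "x"
        # only when the group captured "x". A negative literal is the
        # backreference followed by "x": it matches "x" only when the
        # group captured the empty string.
        return f"\\{lit}" if lit > 0 else f"\\{-lit}x"

    pattern = "^" + "(x?)" * num_vars + ".*;"
    for clause in clauses:
        pattern += "(?:" + "|".join(literal(lit) for lit in clause) + "),"
    pattern += "$"

    # One "x" per variable to capture, then one "x," per clause to satisfy.
    subject = "x" * num_vars + ";" + "x," * len(clauses)
    match = re.match(pattern, subject)
    if match is None:
        return None
    return [group == "x" for group in match.groups()]


if __name__ == "__main__":
    print(solve_sat(3, [[-1], [1, 2], [-2, 3]]))
    print(solve_sat(1, [[1], [-1]]))
```

The regex engine's backtracking does the search over assignments, which is why matching can take exponential time.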
source: trivium
xi-editor retrospective
https://raphlinus.github.io/xi/2020/06/27/xi-retrospective.html [raphlinus.github.io]
2020-07-01 00:55
I still believe it would be possible to build a high quality editor based on the original design. But I also believe that this would be quite a complex system, and require significantly more work than necessary.
A few good ideas and observations could be mined out of this post.
source: L
Unicode Security Considerations
https://unicode.org/reports/tr36/ [unicode.org]
2020-06-11 17:41
Because Unicode contains such a large number of characters and incorporates the varied writing systems of the world, incorrect usage can expose programs or systems to possible security attacks. This is especially important as more and more products are internationalized. This document describes some of the security considerations that programmers, system analysts, standards developers, and users should take into account, and provides specific recommendations to reduce the risk of problems.
A large number of problems as well.
source: solar
Psychic Paper
https://siguza.github.io/psychicpaper/ [siguza.github.io]
2020-05-02 00:39
Yesterday Apple released iOS 13.5 beta 3 (seemingly renaming iOS 13.4.5 to 13.5 there), and that killed one of my bugs. It wasn’t just any bug though, it was the first 0day I had ever found. And it was probably also the best one. Not necessarily for how much it gives you, but certainly for how much I’ve used it for, and also for how ridiculously simple it is. So simple, in fact, that the PoC I tweeted out looks like an absolute joke. But it’s 100% real.
I dubbed it “psychic paper” because, just like the item by that name that Doctor Who likes to carry, it allows you to get past security checks and make others believe you have a wide range of credentials that you shouldn’t have.
source: grugq
Notes on Parsing in Rust
https://blog.wesleyac.com/posts/rust-parsing [blog.wesleyac.com]
2020-04-30 22:37
I’ve recently been writing a bit of parsing code in Rust, and I’ve been jumping back and forth between a few different parsing libraries - they all have different advantages and disadvantages, so I wanted to write up some notes here to help folks who are undecided choose what libraries and techniques to consider, and also to offer some suggestions for the future of the Rust parsing ecosystem.
source: L
Hashtag of note
https://languagelog.ldc.upenn.edu/nll/?p=46455&utm_source=rss&utm_medium=rss&utm_campaign=hashtag-of-note [languagelog.ldc.upenn.edu]
2020-03-18 17:18
You will probably notice immediately that it contains a full-width dash, in other words a Unicode (probably Chinese-origin?) character. For some reason, this is all over Twitter in posts from Anglophone people I am almost completely sure have no input method installed that can actually produce it.
It’s not a real dash at all but a “Katakana-Hiragana prolonged sound mark” (U+30FC, ー).
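A quick way to check what a suspicious dash really is, sketched in Python with the standard `unicodedata` module. The helper name is hypothetical.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # An ASCII hyphen next to the look-alike from the hashtag.
    for code_point, name in describe("-\u30fc"):
        print(code_point, name)
```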
The unexpected Google wide domain check bypass
https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/ [bugs.xdavidhu.me]
2020-03-09 21:01
Let me tell you this “funny” story of me trying to bypass a domain check in a little webapp, and accidentally bypassing a URL parser that is used in (almost) every Google product.
Spoiler: it’s a regex bug.
source: HN
JetBrains Mono
https://www.jetbrains.com/lp/mono/ [www.jetbrains.com]
2020-01-24 05:31
Another developer font. With a fancy web site to explain the design.
source: DF
Introducing Glush: a robust, human readable, top-down parser compiler
https://www.sanity.io/blog/why-we-wrote-yet-another-parser-compiler [www.sanity.io]
2019-12-18 17:54
It’s been 45 years since Stephen Johnson wrote Yacc (Yet another compiler-compiler), a parser generator that made it possible for anyone to write fast, efficient parsers. Yacc, and its many derivatives, quickly became popular and were included in many Unix distributions. You would imagine that in 45 years we would have further perfected the art of creating parsers and would have standardized on a single tool. A lot of progress has been made, but there are still annoyances and problems affecting every tool out there.
This is great, even just for the overview of parsing.
The CYK algorithm (named after Cocke–Younger–Kasami) is in my opinion of great theoretical importance when it comes to parsing context-free grammars. CYK will parse all context-free grammars in O(n³), including the “simple” grammars that LL/LR can parse in linear time. It accomplishes this by converting parsing into a different problem: CYK shows that parsing context-free languages is equivalent to doing a boolean matrix multiplication. Matrix multiplication can be done naively in cubic time, and as such parsing context-free languages can be done in cubic time. It’s a very satisfying theoretical result, and the actual algorithm is small and easy to understand.
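To back up the "small and easy to understand" claim, a minimal CYK recognizer in Python. It assumes the grammar is already in Chomsky normal form; the rule representation and the example grammar (strings of the form a^n b^n, n ≥ 1) are my own illustration, not taken from the linked post.

```python
def cyk(word, start, unary, binary):
    """Return True if `word` is derivable from `start`.

    unary:  list of (lhs, terminal) rules, e.g. ("A", "a")
    binary: list of (lhs, (left, right)) rules, e.g. ("S", ("A", "B"))
    The grammar must be in Chomsky normal form.
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][l - 1] holds the nonterminals deriving word[i:i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {lhs for lhs, terminal in unary if terminal == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for lhs, (b, c) in binary:
                    if b in left and c in right:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]


# Example grammar: S -> A B | A T, T -> S B, A -> 'a', B -> 'b'
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]

if __name__ == "__main__":
    for candidate in ["ab", "aabb", "aab", "ba"]:
        print(candidate, cyk(candidate, "S", UNARY, BINARY))
```

The three nested loops over length, start position and split point are where the cubic running time comes from.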
source: trivium
Hacking GitHub with Unicode's dotless 'i'.
https://eng.getwisdom.io/hacking-github-with-unicode-dotless-i/ [eng.getwisdom.io]
2019-12-17 02:51
GitHub’s forgot password feature could be compromised because the system lowercased the provided email address and compared it to the email address stored in the user database. If there was a match, GitHub would send the reset password link to the email address provided by the attacker, which was, technically speaking, not the same email address.
This is beautiful.
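The underlying hazard is that case conversion can map distinct Unicode strings to the same ASCII string. A short Python illustration follows; it shows collisions in Python's own case mappings and makes no claim about which mapping or language GitHub's code used. The helper name and the example addresses are hypothetical.

```python
def collide(transform, a, b):
    """True if two different strings become equal after `transform`."""
    return a != b and transform(a) == transform(b)


if __name__ == "__main__":
    # U+0131 LATIN SMALL LETTER DOTLESS I uppercases to a plain "I".
    print(collide(str.upper, "m\u0131ke@example.org", "mike@example.org"))
    # U+212A KELVIN SIGN lowercases to a plain "k".
    print(collide(str.lower, "\u212aate@example.org", "kate@example.org"))
```

Comparing the normalized strings is fine; the mistake is then acting on the un-normalized input, for example by mailing the link to it.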
source: HN
Teletext’s creative legacy lives on
https://wepresent.wetransfer.com/story/teletext-creative-legacy/ [wepresent.wetransfer.com]
2019-12-09 06:06
Like Walkmans and VHS recorders, teletext now seems impossibly quaint. But designer and writer Craig Oldham explains that not only was Teletext a revolutionary technology in its prime, its creative legacy lives on with a new generation of artists who love its creative limits.
source: Dfly
Announcing the Allsorts Font Shaping Engine
https://yeslogic.com/blog/allsorts-rust-font-shaping-engine.html [yeslogic.com]
2019-11-21 03:24
Today YesLogic is open-sourcing the Allsorts font parser, shaping engine, and subsetter for OpenType, WOFF, and WOFF2 under the Apache 2.0 license. Allsorts was extracted from the Prince HTML to PDF typesetting and layout tool and is implemented in Rust.
Font shaping is the process of laying out the glyphs of a font in order to represent some input text. Rasterisation of the glyphs is a separate process. Font shaping for Latin text is quite simple. For some scripts, like those used by Indic languages, it is quite complex and requires reordering and substituting the glyphs in each syllable to produce the final output. There are only three main font shaping engines in use today: DirectWrite on Windows, CoreText on macOS and iOS, and HarfBuzz on open-source operating systems and some web-browsers. Of these, only HarfBuzz is open source.
source: L
Text Editing Hates You Too
https://lord.io/blog/2019/text-editing-hates-you-too/ [lord.io]
2019-10-29 01:00
Alexis Beingessner’s Text Rendering Hates You, published exactly a month ago today, hits very close to my heart.
Back in 2017, I was building a rich text editor in the browser. Unsatisfied with existing libraries that used ContentEditable, I thought to myself “hey, I’ll just reimplement text selection myself! How difficult could it possibly be?” I was young. Naive. I estimated it would take two weeks. In reality, attempting to solve this problem would consume several years of my life, and even landed me a full time job for a year implementing text editing for a new operating system.
source: L
An unexpected character replacement
https://www.datafix.com.au/BASHing/2019-10-18.html [www.datafix.com.au]
2019-10-18 06:04
A few weeks ago I found a replacement in GBIF that I’d never seen before: M<fc>ller. It was a hexadecimal value for the character “ü” enclosed in angle brackets. That particular hex value for “ü” appears in Windows-1252 and other encodings, but what program did this replacement? And why?
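One plausible way such a string could arise, sketched in Python: Windows-1252 bytes get decoded as UTF-8, and the decoder's error handling writes each invalid byte as its hex value in angle brackets. This is only a guess at a mechanism, not the answer given in the linked post, and the handler name `angle_hex` is made up.

```python
import codecs


def angle_hex(error):
    """Codec error handler: render each undecodable byte as <hh>."""
    bad_bytes = error.object[error.start:error.end]
    replacement = "".join(f"<{byte:02x}>" for byte in bad_bytes)
    return replacement, error.end


# Register the hypothetical handler under a made-up name.
codecs.register_error("angle_hex", angle_hex)


def misread_as_utf8(text, source_encoding="cp1252"):
    """Encode text in a legacy encoding, then decode it as UTF-8."""
    raw = text.encode(source_encoding)
    return raw.decode("utf-8", errors="angle_hex")


if __name__ == "__main__":
    print(misread_as_utf8("Müller"))
```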
source: HN
Text Rendering Hates You
https://gankra.github.io/blah/text-hates-you/ [gankra.github.io]
2019-09-29 17:48
Rendering text, how hard could it be? As it turns out, incredibly hard! To my knowledge, literally no system renders text “perfectly”. It’s all best-effort, although some efforts are more important than others.
I lost it at multicolored ligatures.
source: L