danger + opportunity ≠ crisis
> There is a widespread public misperception, particularly among the New Age sector, that the Chinese word for “crisis” is composed of elements that signify “danger” and “opportunity.” I first encountered this curious specimen of alleged oriental wisdom about ten years ago at an altitude of 35,000 feet sitting next to an American executive. He was intently studying a bound volume that had adopted this notorious formulation as the basic premise of its method for making increased profits even when the market is falling. At that moment, I didn’t have the heart to disappoint my gullible neighbor who was blissfully imbibing what he assumed were the gems of Far Eastern sagacity enshrined within the pages of his workbook. Now, however, the damage from this kind of pseudo-profundity has reached such gross proportions that I feel obliged, as a responsible Sinologist, to take counteraction.
What is a 'Weenus' ('Wenis,' 'Weenis')?
> The loose skin at the joint of one’s elbow
Errant v. Arrant
> But curiously, arrant and errant are historically the same word, with an interesting and tangled history.
History of Information
Lots of little facts organized in various ways.
How to Explain What Words Mean
> It pains me to admit it, but this one even confuses me.
Text Rendering Hates You
> Rendering text, how hard could it be? As it turns out, incredibly hard! To my knowledge, literally no system renders text “perfectly”. It’s all best-effort, although some efforts are more important than others.
I lost it at multicolored ligatures.
The secret-sharer: evaluating and testing unintended memorization in neural networks
> This is a really important paper for anyone working with language or generative models, and just in general for anyone interested in understanding some of the broader implications and possible unintended consequences of deep learning. There’s also a lovely sense of the human drama accompanying the discoveries that just creeps through around the edges.
> Disclosure of secrets is of particular concern in neural network models that classify or predict sequences of natural language text… even if sensitive or private training data text is very rare, one should assume that well-trained models have paid attention to its precise details…. The users of such models may discover— either by accident or on purpose— that entering certain text prefixes causes the models to output surprisingly revealing text completions.
> I would like to apologize.
> It’s Not Wrong that “🤦🏼‍♂️”.length == 7 But It’s Better that “🤦🏼‍♂️”.len() == 17 and Rather Useless that len(“🤦🏼‍♂️”) == 5
> The string that contains one graphical unit consists of 5 Unicode scalar values. First, there’s a base character that means a person face palming. By default, the person would have a cartoonish yellow color. The next character is an emoji skintone modifier that changes the color of the person’s skin (and, in practice, also the color of the person’s hair). By default, the gender of the person is undefined, and e.g. Apple defaults to what they consider a male appearance and e.g. Google defaults to what they consider a female appearance. The next two scalar values pick a male-typical appearance specifically regardless of font and vendor. Instead of being an emoji-specific modifier like the skin tone, the gender specification uses an emoji-predating gender symbol (MALE SIGN) explicitly ligated using the ZERO WIDTH JOINER with the (skin-toned) face-palming person. (Whether it is a good or a bad idea that the skin tone and gender specifications use different mechanisms is out of the scope of this post.) Finally, VARIATION SELECTOR-16 makes it explicit that we want a multicolor emoji rendering instead of a monochrome dingbat rendering.
And then we move on from there, in quite some depth.
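For anyone who wants to check the three numbers in the title, here is a small Python sketch. It spells the emoji out as explicit escapes so the result does not depend on how this page stores the character. The `lengths` helper and its key names are my own, not from the post. I am assuming the 7 is a count of UTF-16 code units (what JavaScript's `.length` reports) and the 17 a count of UTF-8 bytes (what Rust's `.len()` reports).

```python
# The facepalm emoji from the quoted post, one escape per Unicode scalar value.
facepalm = (
    "\U0001F926"  # FACE PALM (base character)
    "\U0001F3FC"  # EMOJI MODIFIER FITZPATRICK TYPE-3 (skin tone)
    "\u200D"      # ZERO WIDTH JOINER
    "\u2642"      # MALE SIGN
    "\uFE0F"      # VARIATION SELECTOR-16 (ask for emoji presentation)
)


def lengths(text: str) -> dict:
    """Measure one string in three different units (hypothetical helper)."""
    return {
        # Python's len() counts code points; no lone surrogates here,
        # so this equals the number of Unicode scalar values.
        "scalars": len(text),
        # UTF-16 uses 2 bytes per code unit, so halve the byte count.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # UTF-8 byte count.
        "utf8_bytes": len(text.encode("utf-8")),
    }


print(lengths(facepalm))
# {'scalars': 5, 'utf16_units': 7, 'utf8_bytes': 17}
```

None of the three is the "1" a reader sees on screen. Counting user-perceived characters needs grapheme cluster segmentation, which Python's standard library does not provide.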
Women's Romanization for Hong Kong
> This is not to say that this type of ad hoc, spontaneous Romanization of Cantonese has not already existed for some time. Indeed, young people have been using it extensively for texting, on social media, etc. for years. What’s new is that it is now consciously being employed to out fake protesters who do not know Hong Kong Cantonese and its informal writing system.
> Probably because of something my ancestors did.
FUCT in the brain
> Scientists have found that swearing most likely originates in the right hemisphere of the brain, and within that half, in the “primitive” part of the brain, the limbic system. The right half of the brain [which] is responsible for nonpropositional or automatic speech, which includes greetings, conventional expressions such as ‘not at all,’ counting, song lyrics, and swearwords. Propositional speech—words strung together in syntactically correct forms to create an original meaning—occurs in the left hemisphere.
> But the evidence for this conclusion is weak, in my opinion.
This map shows the most commonly spoken language in every US state, excluding English and Spanish
> English is, unsurprisingly, the most commonly spoken language across the US, and Spanish is second most common in 46 states and the District of Columbia. So we excluded those two languages in the above map.
Alphabetical order in Korean
> Alphabetical order in Korean has an interesting twist I haven’t seen in any other language.
> In Korean, alphabetization is also done at the syllable level.
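A rough Python sketch of what comparing "at the syllable level" can look like. It uses the arithmetic the Unicode Standard defines for precomposed Hangul syllables: each syllable block splits into an initial, a medial, and a final index. The function names and word list are mine, and I am assuming the common South Korean ordering that the Unicode block layout follows. The linked post may describe a different detail.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable, 가
INITIALS = 19          # leading consonants
MEDIALS = 21           # vowels
FINALS = 28            # trailing consonants, index 0 meaning "none"


def decompose(syllable: str) -> tuple:
    """Split one precomposed Hangul syllable into (initial, medial, final) indices."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < INITIALS * MEDIALS * FINALS:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial, rest = divmod(index, MEDIALS * FINALS)
    medial, final = divmod(rest, FINALS)
    return initial, medial, final


def sort_key(word: str) -> list:
    """Compare words syllable by syllable, each syllable by its three parts."""
    return [decompose(ch) for ch in word]


words = ["한국", "가방", "나무", "강"]   # illustrative words only
print(decompose("한"))                  # (18, 0, 4): initial ㅎ, vowel ㅏ, final ㄴ
print(sorted(words, key=sort_key))      # ['가방', '강', '나무', '한국']
```

Because the Unicode block is laid out in this same initial, medial, final order, plain `sorted(words)` gives the same result for text made only of precomposed syllables. The explicit key just makes the per-syllable comparison visible.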
> So “-bachi” is now an English suffix for any food prepared live by Asians on a metal plate.
Emily Wilson on Translations and Language
> In a recent Twitter thread, Emily Wilson listed some of the difficulties of translating Homer into English. Among them: “There aren’t enough onomatopoeic words for very loud chaotic noises” (#2 on the list), “It’s very hard to come up with enough ways to describe intense desire to act that don’t connote modern psychology” (#5), and “There is no common English word of four syllables or fewer connoting ‘person particularly favored by Zeus due to high social status, and by the way this is a very normal ordinary word which is not drawing any special attention to itself whatsoever, beyond generic heroizing.’” (#7).
> Using Twitter this way is part of her effort to explain literary translation. What do translators do all day? Why can the same sentence turn out so differently depending on the translator? Why did she get stuck translating the Iliad immediately after producing a beloved translation of the Odyssey?
> She and Tyler discuss these questions and more, including why Silicon Valley loves Stoicism, whether Plato made Socrates sound smarter than he was, the future of classics education, the effect of AI on translation, how to make academia more friendly to women, whether she’d choose to ‘overlive’, and the importance of having a big Ikea desk and a huge orange cat.
> “Whaumau” is a well-formed but non-existent Māori word, which would be pronounced /faʉmaʉ/ — that is, basically the same as the English pronunciation of the internet acronym FOMO, Fear Of Missing Out. And that’s what it means.
Size Venn Diagram
The large dipper and great potatoes.
German for Programmers
> After 2 years of learning German I’ve noticed that, for the most part, you can go a long way by mapping foreign concepts to ones that you already know. In particular, I’ve had success mapping aspects of German grammar to programming concepts I use every day. After all, programmers deal with weird grammars all the time, why not take advantage of that skill?
Emoji Law 2018 Year-in-Review
> As I’ve mentioned before, I track every U.S. court opinion in Westlaw and Lexis that references “emoji” or “emoticon.” This is not a comprehensive census for several reasons, including my inability to set up alerts when a court displays the symbol without calling it an emoji or emoticon (which, in many emoji cases, aren’t even displayed in Westlaw or Lexis) and the other known skews and limits of Westlaw’s and Lexis’ case collections. Still, FWIW, I’ve posted the updated roster of cases.