Over 200 offensive slurs could soon be banned from competitive Scrabble
> The North American Scrabble Players Association (NASPA) seems poised to remove hundreds of offensive slurs from tournament-level Scrabble play.
> Words that are “used to cause offense on scatological, prurient, profane or other grounds” are not under discussion this time around. NASPA publishes an obfuscated, anagrammed list of which offensive words fall into each category.
The Art of the Bad Faith Argument
> The person who types “lol” is never actually laughing; the person who types I’M SCREAMING is silently dabbing at a screen. In the same way, the person who is perpetually shocked and outraged and brimming with righteous fury is almost always lying to themselves. They’re as affectless as the rest of us: play-acting, downloading synthetic emotions, and then passing them on.
Unicode Security Considerations
> Because Unicode contains such a large number of characters and incorporates the varied writing systems of the world, incorrect usage can expose programs or systems to possible security attacks. This is especially important as more and more products are internationalized. This document describes some of the security considerations that programmers, system analysts, standards developers, and users should take into account, and provides specific recommendations to reduce the risk of problems.
A large number of problems as well.
Welp, sup, yep, yup, nope
> Though we have presented quite a bit of informal and recent use, our earliest written use of welp goes back over 70 years. It shows up in a scholarly article on two of welp’s linguistic cousins: yep and nope. Well gained that final -p as part of a normal process of articulation: the lips come together to stop the sound of well and prepare for the next sound, and some hear that stoppage as a -p. This means it is very common in speech. One linguist went so far as to say that anyone who didn’t know what welp meant was probably an alien.
> A San Diego federal judge Friday dismissed a $10 million defamation lawsuit filed by the owners and operators of San Diego-based One America News Network against MSNBC and political commentator Rachel Maddow. Last summer, the liberal host told her viewers that the Trump-friendly conservative network “really literally is paid Russian propaganda.”
How to decode a data breach notice
> But data breach notifications have become an all-too-regular exercise in crisis communications. These notices increasingly try to deflect blame, obfuscate important details and omit important facts. After all, it’s in a company’s best interest to keep the stock markets happy, investors satisfied and regulators off their backs. Why would it want to say anything to the contrary?
Metaphors in man pages
> I went through some of the examples of metaphors in Metaphors To Live By and grepped all the man pages on my computer for them.
danger + opportunity ≠ crisis
> There is a widespread public misperception, particularly among the New Age sector, that the Chinese word for “crisis” is composed of elements that signify “danger” and “opportunity.” I first encountered this curious specimen of alleged oriental wisdom about ten years ago at an altitude of 35,000 feet sitting next to an American executive. He was intently studying a bound volume that had adopted this notorious formulation as the basic premise of its method for making increased profits even when the market is falling. At that moment, I didn’t have the heart to disappoint my gullible neighbor who was blissfully imbibing what he assumed were the gems of Far Eastern sagacity enshrined within the pages of his workbook. Now, however, the damage from this kind of pseudo-profundity has reached such gross proportions that I feel obliged, as a responsible Sinologist, to take counteraction.
What is a 'Weenus' ('Wenis,' 'Weenis')?
> The loose skin at the joint of one’s elbow
Errant v. Arrant
> But curiously, arrant and errant are historically the same word, with an interesting and tangled history.
History of Information
Lots of little facts organized in various ways.
How to Explain What Words Mean
> It pains me to admit it, but this one even confuses me.
Text Rendering Hates You
> Rendering text, how hard could it be? As it turns out, incredibly hard! To my knowledge, literally no system renders text “perfectly”. It’s all best-effort, although some efforts are more important than others.
I lost it at multicolored ligatures.
The secret-sharer: evaluating and testing unintended memorization in neural networks
> This is a really important paper for anyone working with language or generative models, and just in general for anyone interested in understanding some of the broader implications and possible unintended consequences of deep learning. There’s also a lovely sense of the human drama accompanying the discoveries that just creeps through around the edges.
> Disclosure of secrets is of particular concern in neural network models that classify or predict sequences of natural language text… even if sensitive or private training data text is very rare, one should assume that well-trained models have paid attention to its precise details…. The users of such models may discover— either by accident or on purpose— that entering certain text prefixes causes the models to output surprisingly revealing text completions.
> I would like to apologize.
It’s Not Wrong that “🤦🏼♂️”.length == 7 But It’s Better that “🤦🏼♂️”.len() == 17 and Rather Useless that len(“🤦🏼♂️”) == 5
> The string that contains one graphical unit consists of 5 Unicode scalar values. First, there’s a base character that means a person face palming. By default, the person would have a cartoonish yellow color. The next character is an emoji skintone modifier that changes the color of the person’s skin (and, in practice, also the color of the person’s hair). By default, the gender of the person is undefined, and e.g. Apple defaults to what they consider a male appearance and e.g. Google defaults to what they consider a female appearance. The next two scalar values pick a male-typical appearance specifically regardless of font and vendor. Instead of being an emoji-specific modifier like the skin tone, the gender specification uses an emoji-predating gender symbol (MALE SIGN) explicitly ligated using the ZERO WIDTH JOINER with the (skin-toned) face-palming person. (Whether it is a good or a bad idea that the skin tone and gender specifications use different mechanisms is out of the scope of this post.) Finally, VARIATION SELECTOR-16 makes it explicit that we want a multicolor emoji rendering instead of a monochrome dingbat rendering.
And then we move on from there, in quite some depth.
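The three numbers in that headline can be checked by hand. Below is a minimal Python sketch, not code from the linked post: it builds the string from the five scalar values the quote describes, written as explicit escapes, and counts it three ways. The name `FACEPALM` and the helper `lengths` are made up for illustration. As I read the headline, 7 is the UTF-16 code unit count, 17 is the UTF-8 byte count, and 5 is the scalar value count.

```python
import unicodedata

# The five scalar values from the quote, as explicit escapes so the example
# does not depend on how this page happens to store the emoji:
# base face palm, skin tone modifier, ZERO WIDTH JOINER, MALE SIGN,
# VARIATION SELECTOR-16.
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def lengths(text: str) -> dict:
    """Count one string three ways (hypothetical helper for illustration)."""
    return {
        # Python 3 len() counts code points; with no lone surrogates in the
        # string, that equals the number of Unicode scalar values.
        "scalar_values": len(text),
        # Each UTF-16 code unit is two bytes.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


if __name__ == "__main__":
    for ch in FACEPALM:
        # Names come from the Unicode database bundled with the interpreter.
        print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
    print(lengths(FACEPALM))
```

None of the three counts is 1, which is what a reader looking at a single glyph would expect. Counting user-perceived characters needs grapheme cluster segmentation, which Python's standard library does not provide.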
Women's Romanization for Hong Kong
> This is not to say that this type of ad hoc, spontaneous Romanization of Cantonese has not already existed for some time. Indeed, young people have been using it extensively for texting, on social media, etc. for years. What’s new is that it is now consciously being employed to out fake protesters who do not know Hong Kong Cantonese and its informal writing system.
> Probably because of something my ancestors did.
FUCT in the brain
> Scientists have found that swearing most likely originates in the right hemisphere of the brain, and within that half, in the “primitive” part of the brain, the limbic system. The right half of the brain [which] is responsible for nonpropositional or automatic speech, which includes greetings, conventional expressions such as ‘not at all,’ counting, song lyrics, and swearwords. Propositional speech—words strung together in syntactically correct forms to create an original meaning—occurs in the left hemisphere.
> But the evidence for this conclusion is weak, in my opinion.
This map shows the most commonly spoken language in every US state, excluding English and Spanish
> English is, unsurprisingly, the most commonly spoken language across the US, and Spanish is second most common in 46 states and the District of Columbia. So we excluded those two languages in the above map.