Marty Weitzman’s Noah’s Ark Problem
> Marty Weitzman passed away suddenly yesterday. He was on many people’s shortlist for the Nobel. His work is marked by high theory applied to practical problems. The theory is always worked out in great generality and is difficult even for most economists. Weitzman wanted to be understood by more than a handful of theorists, however, and so he also went to great lengths to look for special cases or revealing metaphors. Thus, the typical Weitzman paper has a dense middle section of math but an introduction and conclusion of sparkling prose that can be understood and appreciated by anyone for its insights.
> The Noah’s Ark Problem illustrates the model and is my favorite Weitzman paper.
On studying mathematics
> I understand that most of us do not have a good experience when studying mathematics in high school or college. I assure you that this is a common problem, not only in Vietnam, but all over the world. Mathematics is taught and learned mechanically, without joy. But don’t worry, and please be optimistic, school is not the only place we can learn. I have never had a college degree, and if I can study math, anyone can learn it. The only thing we need is an open mind, daring to try new things, the rest can be left to math!
The Only Way to Win Is Not to Play the Game
> When I became a math and science writer, I had no idea that one of the most common requests I would get would be to weigh in on order of operations problems that somehow go viral in some segment of the internet.
> The real answer, the one I believe any mathematician, physicist, engineer, or other number-cruncher would tell you, is to make sure your expressions aren’t ambiguous.
Another take: https://danso.ca/blog/order-of-operations/
Decades-Old Computer Science Conjecture Solved in Two Pages
> “This conjecture has stood as one of the most frustrating and embarrassing open problems in all of combinatorics and theoretical computer science,” wrote Scott Aaronson of the University of Texas, Austin, in a blog post. “The list of people who tried to solve it and failed is like a who’s who of discrete math and theoretical computer science,” he added in an email.
> Over the years, computer scientists have developed many ways to measure the complexity of a given Boolean function. Each measure captures a different aspect of how the information in the input string determines the output bit. For instance, the “sensitivity” of a Boolean function tracks, roughly speaking, the likelihood that flipping a single input bit will alter the output bit. And “query complexity” calculates how many input bits you have to ask about before you can be sure of the output.
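The "sensitivity" measure described above can be checked by brute force for small functions. Here is a sketch (the exhaustive loop is exponential in the input length, so this is for intuition only, not how one would handle large n):

```python
from itertools import product

def sensitivity(f, n):
    """Max over all inputs x of the number of single-bit flips
    that change f(x)."""
    best = 0
    for x in product([0, 1], repeat=n):
        flips = 0
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip bit i
            if f(tuple(y)) != f(x):
                flips += 1
        best = max(best, flips)
    return best

# OR on 3 bits: at input 000, flipping any bit changes the output,
# so the sensitivity is 3.
f_or = lambda x: int(any(x))
print(sensitivity(f_or, 3))  # -> 3
```

The conjecture Huang proved relates this quantity polynomially to the other complexity measures, such as query complexity.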
> Induced subgraphs of hypercubes and a proof of the Sensitivity Conjecture
One law to rule them all?
> Power-law distributions seem to be everywhere, and not just in word-counts and whale whistles. Most people know that Vilfredo Pareto found them in the distribution of wealth, two or three decades before Udny Yule showed that stochastic processes like those in evolution lead to such distributions, and George Kingsley Zipf found his eponymous law in word frequencies. Since then, power-law distributions have been found all over the place
> Or maybe not? Many of the alleged “power-law” examples are actually log-normal, or some other heavy-tailed distribution, according to a paper by Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman, “Power-law distributions in empirical data” (SIAM Review 2009). As an alternative to the paper, you can read Cosma’s blog post “So You Think You Have a Power Law — Well Isn’t That Special?“, 6/15/2007; or this summary of the results in “Cozy Catastrophes“, 2/15/2012:
I found two identical packs of Skittles, among 468 packs with a total of 27,740 Skittles
> This is a follow-up to a post from earlier this year discussing the likelihood of encountering two identical packs of Skittles, that is, two packs having exactly the same number of candies of each flavor. Under some reasonable assumptions, it was estimated that we should expect to have to inspect “only about 400-500 packs” on average until encountering a first duplicate.
> So, on 12 January of this year, I started buying boxes of packs of Skittles. This past week, “only” 82 days, 13 boxes, 468 packs, and 27,740 individual Skittles later, I found the following identical 2.17-ounce packs:
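As a sanity check on the "400-500 packs" figure, here is a quick Monte Carlo sketch under deliberately simplified assumptions: exactly 59 candies per pack and 5 equally likely flavors. These are not the fitted distributions from the original post, so expect only ballpark agreement.

```python
import random
from collections import Counter

def packs_until_duplicate(pack_size=59, flavors=5, rng=random):
    """Open packs until one repeats an already-seen flavor count vector."""
    seen = set()
    while True:
        pack = Counter(rng.randrange(flavors) for _ in range(pack_size))
        key = tuple(pack[i] for i in range(flavors))
        if key in seen:
            return len(seen) + 1
        seen.add(key)

random.seed(1)
trials = [packs_until_duplicate() for _ in range(100)]
print(sum(trials) / len(trials))  # roughly a few hundred packs on average
```

The exact average depends heavily on the pack-size and flavor distributions, which is why the original analysis modeled those from data.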
A Go implementation of Poly1305 that makes sense
> Although it’s really a fraction of the complexity of e.g. elliptic curves, most of the implementations I’ve read look decidedly like magic, mysteriously multiplying values by enchanted constants, and shuffling bits like The Sorcerer’s Apprentice in Fantasia. Even the paper does not explain why and how its design decisions lead to faster code!
> Still, after reverse-engineering what the implementations were doing, I grew convinced that cryptography code could be perfectly understandable if only we commented it.
How to generate uniformly random points on n-spheres and n-balls
> For many Monte Carlo methods, such as those in graphical computing, it is critical to sample uniformly from $d$-dimensional spheres and balls. This post describes over twenty different methods for sampling uniformly at random from the (surface of a) $d$-dimensional sphere or the (interior of a) $d$-dimensional ball.
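Two of the best-known methods from this family can be sketched in a few lines: normalize a vector of Gaussians for the sphere (by rotation invariance of the normal distribution), and add a radial scaling for the ball:

```python
import math
import random

def sample_sphere(d, rng=random):
    """Uniform point on the surface of the unit sphere in R^d:
    normalize a vector of i.i.d. standard normals."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sample_ball(d, rng=random):
    """Uniform point in the unit ball in R^d: scale a sphere sample
    by U**(1/d), since volume grows like r**d."""
    r = rng.random() ** (1.0 / d)
    return [r * x for x in sample_sphere(d, rng)]

p = sample_sphere(3)
print(sum(x * x for x in p))  # -> 1.0 up to floating-point error
```

The linked post covers many more methods, including ones with better numerical or performance characteristics in high dimensions.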
Just going to link to the whole blog.
A Deep Dive on RSA Accumulators
> Accumulators have been a topic of interest in academia since 1994. Similarly to a Merkle Tree, they are used to cryptographically commit to the knowledge of a set of data. At a later point in time, membership of a subset of the dataset can be proven by publishing a proof. In Merkle Trees the proof is called a Merkle Branch (or Merkle Proof), and grows logarithmically with the size of the committed data (commit 16 elements, prove inclusion by revealing log_2(16)=4 hashes).
> Accumulators on the other hand, allow proving membership in constant size, as well as batching of proofs for multiple elements (which is not a feature of Merkle trees).
> The focus of this post will be on describing the building blocks of RSA Accumulators, how we can construct proofs of (non-)membership as well as batch them across multiple blocks. This particular technique also has applications in UTXO-Based Plasma, and has given birth to the Plasma Prime variant. A lot of effort is being put into designing an accumulator that allows compaction of the UTXO-set in Plasma.
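The core mechanics can be illustrated with a toy sketch. Everything here is illustrative: the modulus is tiny and its factorization is known, whereas a real accumulator needs a large RSA modulus of unknown factorization, and set elements must first be mapped to primes.

```python
# Toy RSA accumulator: commit to a set of primes, prove membership.
N = 3233  # 61 * 53 -- demo only, never use a factorable modulus
g = 2     # generator

elements = [3, 5, 7, 11]  # set members, already prime

def accumulate(items):
    """Raise g to the product of all items, via repeated exponentiation."""
    acc = g
    for p in items:
        acc = pow(acc, p, N)
    return acc

A = accumulate(elements)

# Witness for 7: accumulate everything except 7.
w = accumulate([p for p in elements if p != 7])

# Verification is a single exponentiation, regardless of set size:
# w^7 = g^(3*5*11*7) = A (mod N).
print(pow(w, 7, N) == A)  # -> True
```

The constant-size property follows directly: both the accumulator value and the witness are single group elements, no matter how many elements were committed.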
Probability and the Real World
> What aspects of the real world involve chance? What does mathematical probability tell us about those aspects? What concepts from mathematical probability can be illustrated by interesting contemporary real data? This web site records my efforts to articulate some answers to such questions. It is aimed at readers who have either read some “popular science” style account of probability, or taken a college course involving probability. So I won’t explain very basic stuff, and I mostly avoid discussing material easily found elsewhere.
The beginning of Public-key cryptography
> “Can the reader say what two numbers multiplied together will produce the number 8616460799? I think it unlikely that anyone but myself will ever know”
> -William S Jevons, The Principles of Science, 1874
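Jevons underestimated progress: his number yields to simple trial division in well under a second on modern hardware. A minimal sketch (the odd-only trial division is the naive method, chosen for brevity, not speed):

```python
# Jevons' challenge number from 1874.
n = 8616460799

def smallest_factor(n):
    """Naive trial division over odd candidates (n is odd here)."""
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n

p = smallest_factor(n)
print(p, n // p)  # -> 89681 96079
```

The same asymmetry between multiplying and factoring, at vastly larger scales, is exactly what RSA later built on.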
My Favorite Paradox
> Here is Simpson’s Paradox once again.
> Trends which appear in groups of data may disappear or reverse when the groups are combined.
> I love this because every layer you peel off can invert your conclusion. At first glance B is better. Look deeper and A is better. Look deeper still and B may be better again.
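The reversal is easy to reproduce with entirely hypothetical numbers. In this made-up clinic dataset, treatment A wins within each severity group, yet B wins overall, because A was mostly given the hard cases:

```python
# Hypothetical data: (cured, treated) per severity group and treatment.
data = {
    "easy": {"A": (9, 10),  "B": (80, 90)},
    "hard": {"A": (30, 90), "B": (3, 10)},
}

# Within each group, A has the higher cure rate.
for group, arms in data.items():
    for arm, (cured, treated) in arms.items():
        print(group, arm, f"{cured / treated:.0%}")

# Pooled across groups, B looks better.
for arm in ("A", "B"):
    cured = sum(data[g][arm][0] for g in data)
    treated = sum(data[g][arm][1] for g in data)
    print("overall", arm, f"{cured / treated:.0%}")
```

Easy cases: A 90% vs B 89%. Hard cases: A 33% vs B 30%. Overall: A 39% vs B 83%. The confounder is which patients each treatment received, not the treatments themselves.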
A Fairly Fast Fibonacci Function
> Fair warning: this is a bit of a rabbit hole, with no other purpose than to optimize the hell out of something for which there is frankly no practical use. But we get to do a bit of linear algebra and try out some pretty interesting optimization techniques; that’s what I call a good time!
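For reference, a standard fast baseline (not necessarily the post's final optimized version) is the fast-doubling method, which follows from the matrix form the post's linear algebra builds on and needs only O(log n) big-integer multiplications:

```python
def fib(n):
    """Fast doubling: F(2k) = F(k)*(2*F(k+1) - F(k)),
    F(2k+1) = F(k)**2 + F(k+1)**2."""
    def pair(k):
        # Returns (F(k), F(k+1)).
        if k == 0:
            return (0, 1)
        a, b = pair(k >> 1)
        c = a * (2 * b - a)   # F(2m) where m = k >> 1
        d = a * a + b * b     # F(2m+1)
        if k & 1:
            return (d, c + d)
        return (c, d)
    return pair(n)[0]

print([fib(i) for i in range(10)])  # -> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Beating this meaningfully is where the rabbit hole begins: the remaining wins come from the big-integer multiplication itself.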
Faster remainders when the divisor is a constant: beating compilers and libdivide
> Not all instructions on modern processors cost the same. Additions and subtractions are cheaper than multiplications which are themselves cheaper than divisions. For this reason, compilers frequently replace division instructions by multiplications. Roughly speaking, it works in this manner. Suppose that you want to divide a variable n by a constant d. You have that n/d = n * (2^N/d) / 2^N. The division by a power of two (/ 2^N) can be implemented as a right shift if we are working with unsigned integers, which compiles to a single instruction: that is possible because the underlying hardware works in base 2. Thus if 2^N/d has been precomputed, you can compute the division n/d as a multiplication and a shift. Of course, if d is not a power of two, 2^N/d cannot be represented as an integer. Yet for N large enough, we can approximate 2^N/d by an integer and have the exact computation of the remainder for all possible n within a range. I believe that all optimizing C/C++ compilers know how to pull this trick and it is generally beneficial irrespective of the processor’s architecture.
> Can we do better? It turns out that in some instances, we can beat both the compilers and a library like libdivide.
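The quotient trick from the quote is easy to verify exhaustively for small operands. Here N = 34 and d = 7 are picked for illustration; compilers choose N based on the operand width and the constant:

```python
# Multiply-and-shift division by a constant: precompute m = ceil(2**N / d),
# then (n * m) >> N equals n // d for all n in a range depending on N and d.
N = 34
d = 7
m = (1 << N) // d + 1  # ceil(2**N / d); d is not a power of two

for n in range(1 << 16):  # exhaustive check over all 16-bit inputs
    assert (n * m) >> N == n // d

print("n // 7 == (n * m) >> 34 for all 16-bit n")
```

The linked post's contribution is computing the *remainder* directly from the fractional part of n * m, instead of deriving it as n - d * (n // d).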
The Curious Case of Convexity Confusion
> There are several things worth highlighting about this bug. Firstly, computational geometry is hard. Seriously. I have some experience with it and, while I can’t say I’m an expert, I know that much at least. Handling all the special cases correctly is a pain, even without considering security issues. And doing it using floating point arithmetic might as well be impossible. If I was writing a graphics library, I would convert floats to fixed-point precision as soon as possible and wouldn’t trust anything computed based on floating-point arithmetic at all.
> Secondly, the issue highlights the importance of doing variant analysis - I discovered it based on a public bug report and other people could have done the same.
> Thirdly, it highlights the importance of defense-in-depth. The latest patch makes sure that drawing a concave path with convex path algorithms won’t result in memory corruption, which also addresses unknown variants of convexity issues. If this was implemented immediately after the initial report, Project Zero would now have one blog post less :-)
Big Integer Design
> This page explains the design and implementation of operations on big (modular) integers, used for RSA and generic elliptic curve computations.
> This section describes the mathematical foundations of the algorithms implemented in BearSSL. This is in no way an exhaustive treatise on computations on big integers; there are many representations and algorithms that are not shown here. These algorithms are the ones that are used in BearSSL (except Karatsuba multiplication, which is briefly presented below, but not used in BearSSL right now).
> Imagine you compressed a file using your favorite compression algorithm, but you realize that there was a typo in the file. You then correct it, adding a single bit to the original file.
> Compress it again and you get a much larger compressed file… for just a one-bit difference! Most compression algorithms do not have this strange behavior, but for LZ’78, one of the most famous of them, this surprising scenario might well happen. This is what we will overview in this post.
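To make the setting concrete, here is a minimal LZ'78 compressor sketch (dictionary-building only, no bit-level encoding; the fragility described above comes from how a small edit can reshuffle the greedy phrase parsing):

```python
def lz78_compress(s):
    """LZ'78: greedily extend the longest known phrase, emit
    (phrase index, next char), and add the new phrase to the dictionary."""
    dictionary = {"": 0}
    out = []
    phrase = ""
    for ch in s:
        if phrase + ch in dictionary:
            phrase += ch
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Flush a trailing phrase as (index of its prefix, last char).
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

print(lz78_compress("aaaa"))  # -> [(0, 'a'), (1, 'a'), (0, 'a')]
```

Each token covers one phrase, so output size tracks the number of distinct phrases the parse discovers, and a one-bit change early in the input can shift every phrase boundary after it.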
> I’ve recently been looking into a fascinating corner of mathematics that at first glance appears a little bit silly, but actually has far-reaching applications, from physics to numerical methods to machine learning. I thought I’d share what I’ve learned over the next few episodes.
> I assume you recall what a complex number is, but perhaps not all of the details. A complex number is usually introduced as a pair of real numbers (a, b), where a is called the “real part” and b is called the “imaginary part”.
> A brief aside: it has always bugged me that these labels are unnecessarily value-laden. There is no particular “reality” that is associated with the real part; it is every bit as “imaginary” as the imaginary part. They might as well be called the “rezrov part” and the “gnusto part”, but we’re stuck with “real” and “imaginary”. Moving on.