Pointers Are Complicated II, or: We need better language specs
Below, I will show a series of three compiler transformations that each seem “intuitively justified”, but when taken together they lead to a clearly incorrect result. I will use LLVM for these examples, but the goal is not to pick on LLVM—other compilers suffer from similar issues. The goal is to convince you that to build a correct compiler for languages permitting unsafe pointer manipulation such as C, C++, or Rust, we need to take IR semantics (and specifically provenance) more seriously. I use LLVM for the examples because it is particularly easy to study, with its single, extensively documented IR around which a lot of infrastructure has evolved. Let’s get started!