PyPy's new JSON parser
> In the last year or two I have worked on and off on making PyPy’s JSON faster, particularly when parsing large JSON files. In this post I am going to document those techniques and measure their performance impact.
"Stubs" in the .NET Runtime
> ‘Stubs’, as they’re known in the runtime (sometimes ‘Thunks’), provide a level of indirection throughout the source code, there’s almost 500 mentions of them!
> This post will explore what they are, how they work and why they’re needed.
The Baseline Interpreter: a faster JS interpreter in Firefox 70
> The Baseline Interpreter sits between the C++ interpreter and the Baseline JIT and has elements from both. It executes all bytecode instructions with a fixed interpreter loop (like the C++ interpreter). In addition, it uses Inline Caches to improve performance and collect type information (like the Baseline JIT).
The story of a V8 performance cliff in React
Getting Into Browser Exploitation
Last post in series, toc at the top.
> 0x00: New Series: Getting Into Browser Exploitation
> 0x02: The Butterfly of JSObject
> 0x04: WebKit RegExp Exploit addrof() walk-through
> 0x05: The fakeobj() Primitive: Turning an Address Leak into a Memory Corruption
> 0x07: Preparing for Stage 2 of a WebKit exploit
> 0x08: Arbitrary Read and Write in WebKit Exploit
It’s Time for a Modern Synthesis Kernel
> The promise of kernel-mode runtime code generation is that we can have very fast, feature-rich operating systems by, for example, not including code implementing generic read() and write() system calls, but rather synthesizing code for these operations each time a file is opened. The idea is that at file open time, the OS has a lot of information that it can use to generate highly specialized code, eliding code paths that are provably not going to execute.
Trashing the Flow of Data
> As has been shown many times before, often bugs that don’t seem exploitable at first can be turned into an arbitrary read/write. In this case in particular, triggering the garbage collector while our fake pointer was on the stack gave us a very strong exploitation primitive. The V8 team fixed the bug very quickly as usual. But more importantly, they’re planning to refactor the InferReceiverMaps function to prevent similar bugs in the future. When I noticed this function in the past, I was convinced that one of the callers will get it wrong and audited all the call sites. Back then I couldn’t find a vulnerability but a few months later I stumbled over this newly introduced code that didn’t add the right runtime checks. In hindsight, it would have been worthwhile to point out this dodgy API to the team even without vulnerabilities to show for.
A year with Spectre: a V8 perspective
> In theory, it would be sufficient to defeat either of the two components of an attack. Since we do not know of any way to defeat any of the parts perfectly, we designed and deployed mitigations that greatly reduce the amount of information that is leaked into CPU caches and mitigations that make it hard to recover the hidden state.
> Fortunately or unfortunately, our offensive research advanced much faster than our defensive research, and we quickly discovered that software mitigation of all possible leaks due to Spectre was infeasible. This was due to a variety of reasons. First, the engineering effort diverted to combating Spectre was disproportionate to its threat level. In V8 we face many other security threats that are much worse, from direct out-of-bound reads due to regular bugs (faster and more direct than Spectre), out-of-bound writes (impossible with Spectre, and worse) and potential remote code execution (impossible with Spectre and much, much worse). Second, the increasingly complicated mitigations that we designed and implemented carried significant complexity, which is technical debt and might actually increase the attack surface, and performance overheads. Third, testing and maintaining mitigations for microarchitectural leaks is even trickier than designing gadgets themselves, since it’s hard to be sure the mitigations continue working as designed. At least once, important mitigations were effectively undone by later compiler optimizations. Fourth, we found that effective mitigation of some variants of Spectre, particularly variant 4, to be simply infeasible in software, even after a heroic effort by our partners at Apple to combat the problem in their JIT compiler.
Attacking Clientside JIT Compilers
> Our research focused on 3 front end compilers and back end JIT engines for which little, or no public security research exists. We explore the potential security impacts of using JIT engines in applications such as web browsers and language runtimes and describe the tools we developed for security researchers to build on our JIT research. We also discuss a case study of a security vulnerability we found in the Firefox SpiderMonkey front end and discuss ways the back end JaegerMonkey JIT can be used to exploit the vulnerability. Finally, we will conclude with discussion on possible techniques for hardening JIT implementations that apply to both browser and language runtime JIT engines.
Per the author, “Despite being written by a much younger, and dumber, me, this paper on JIT engines has aged well.”
Standardizing WASI: A system interface to run WebAssembly outside the web
> WebAssembly is an assembly language for a conceptual machine, not a physical one. This is why it can be run across a variety of different machine architectures.
> Just as WebAssembly is an assembly language for a conceptual machine, WebAssembly needs a system interface for a conceptual operating system, not any single operating system. This way, it can be run across all different OSs.
> This is what WASI is — a system interface for the WebAssembly platform.
> How does it work? Essentially, V8 switches into an interpreter-only mode based on our existing technology: all JS user code runs through the Ignition interpreter, and regular expression pattern matching is likewise interpreted. WebAssembly is currently unsupported, but interpretation is also in the realm of possibility. V8’s builtins are still compiled to native code, but are no longer part of the managed JS heap, thanks to our recent efforts to embed them into the V8 binary.
> Ultimately, these changes allowed us to create V8’s heap without requiring executable permissions for any of its memory regions.
Is C# a low-level language?
Specifically, what happens when translating a C++ raytracer and trying to make it fast.
> I started by simply porting the un-obfuscated C++ code line-by-line to C#. Turns out that this was pretty straight forward, I guess the story about C# being C++++ is true after all!!
Spectre is here to stay: An analysis of side-channels and speculative execution
WebAssembly Is Not a Stack Machine
> This poses a problem for optimisation.
Maybe you don't need Rust and WASM to speed up your JS
Exploiting the Math.expm1 typing bug in V8
Ruby 2.6.0 Released
> It introduces a number of new features and performance improvements, most notably:
> A new JIT compiler.
> The RubyVM::AbstractSyntaxTree module.
> The JIT compiler aims to improve the performance of Ruby programs. Unlike traditional JIT compilers which operate in-process, Ruby’s JIT compiler writes out C code to disk and spawns a common C compiler to generate native code. For more details about it, see the MJIT organization by Vladimir Makarov.
> Ruby 2.6 introduces the RubyVM::AbstractSyntaxTree module. Future compatibility of this module is not guaranteed. This module has a parse method, which parses the given string as Ruby code and returns the AST (Abstract Syntax Tree) nodes of the code. The parse_file method opens and parses the given file as Ruby code and returns AST nodes.
More consistent LuaJIT performance
> So, did we achieve everything we wanted to in 12 months? Inevitably the answer is yes and no. We did a lot more benchmarking than we expected; we’ve been able to make a lot of programs (particularly large programs) have more consistent performance; and we’ve got a fair way down the road of implementing a new GC. To whoever takes on further LuaJIT work – best of luck, and I look forward to seeing your results!
Nginx on Wasmjit
> As of a few days ago, Wasmjit is able to run Nginx 1.15.3 in user space. This means you can use Wasmjit to run the same nginx.wasm on all POSIX systems. So far it’s been tested on Linux, OpenBSD, and macOS. It’s not a stripped down version of Nginx: it’s the whole thing, compiled straight from source without modification and with multi-process capability. All the complex bits of the POSIX API required for Nginx have been implemented, including signal handling and forking.