AWS re:Invent 2019: Speculation & leakage: Timing side channels & multi-tenant computing
> In January 2018, the world learned about Spectre and Meltdown, a new class of issues that affects virtually all modern CPUs via nearly imperceptible changes to their micro-architectural states and can result in full access to physical RAM or leaking of state between threads, processes, or guests. In this session, we examine one of these side-channel attacks in detail and explore the implications for multi-tenant computing. We discuss AWS design decisions and what AWS does to protect your instances, containers, and function invocations. Finally, we discuss what the future looks like in the presence of this new class of issue.
This is a good recap. Specific defenses start around 42:00.
A Compendium of Container Escapes
> The goal of this talk is to broaden the awareness of the how and why container escapes work, starting from a brief intro to what makes a process a container, and then spanning the gamut of escape techniques, covering exposed orchestrators, access to the Docker socket, exposed mount points, /proc, all the way down to overwriting/exploiting the kernel structures to leave the confines of the container.
My infrastructure as of 2019
> The goal for my infrastructure is to run the services I need. While a lot of people in the homelab community experiment and play with software for its own sake, I actively use the stuff I host. When I stop, I kill the service (though I’m not as proficient at this as Google). These are my production systems, and when one of them is down, I do miss it.
How Tailscale works
> There is one last question that comes up a lot: given that Tailscale creates a mesh “overlay” network (a VPN that parallels a company’s internal physical network), does a company have to switch to it all at once? Many BeyondCorp and zero-trust style products work that way. Or can it be deployed incrementally, starting with a small proof of concept?
> Tailscale is uniquely suited to incremental deployments. Since you don’t need to install any hardware or any servers at all, you can get started in two minutes: just install the Tailscale node software onto two devices (Linux, Windows, macOS, iOS), login to both devices with the same user account or auth domain, and that’s it! They’re securely connected, no matter how the devices move around. Tailscale runs on top of your existing network, so you can safely deploy it without disrupting your existing infrastructure and security settings.
OpenBSD on DigitalOcean
> They are both sort of old at this point and with OpenBSD 6.6 out I ran into a bit of a snag. The default these days is to use a GPT partition table to enable EFI booting. This is generally pretty sane but it looks to me like the FreeBSD droplet doesn’t support this. After the installer rebooted the VM failed to boot, being unable to find the bootloader.
> Thankfully DigitalOcean has a recovery ISO that you can boot by simply switching to it and powering off and then on your Droplet.
dd miniroot over FreeBSD, reboot, lemonade!
Three ways to reduce the costs of your HTTP(S) API on AWS
> Since we would send this five billion times per day, every byte we could shave off would save five gigabytes of outgoing data, for a saving of 25 cents per day per byte removed.
It all adds up.
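
The arithmetic in the quote is easy to replay. This is a minimal sketch: the request volume comes from the quote, while the $0.05 per GB egress price is only what the quoted numbers imply (25 cents for 5 GB), not a figure taken from an AWS price list. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted saving per byte removed.
REQUESTS_PER_DAY = 5_000_000_000   # "five billion times per day" (from the quote)
BYTES_PER_GB = 1_000_000_000       # decimal gigabytes, as the quote appears to use
PRICE_PER_GB_USD = 0.05            # assumption: implied by 25 cents / 5 GB


def daily_saving_usd(bytes_removed: int) -> float:
    """Dollars saved per day when each response shrinks by `bytes_removed` bytes."""
    gb_saved = bytes_removed * REQUESTS_PER_DAY / BYTES_PER_GB
    return gb_saved * PRICE_PER_GB_USD


if __name__ == "__main__":
    for removed in (1, 10, 100):
        print(f"{removed:>3} bytes removed: ${daily_saving_usd(removed):.2f} per day")
```

One byte is pocket change, but a hundred bytes of needless headers is the same formula with a bigger input.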
Defense in depth against SSRF vulnerabilities with the EC2 Instance Metadata Service
> Today, AWS is making v2 of the EC2 Instance Metadata Service (IMDSv2) available. The existing instance metadata service (IMDSv1) is fully secure, and AWS will continue to support it. But IMDSv2 adds new “belt and suspenders” protections for four types of vulnerabilities that could be used to try to access the IMDS. These new protections go well beyond other types of mitigations, while working seamlessly with existing mitigations such as restricting IAM roles and using local firewall rules to restrict access to the IMDS. AWS is also making new versions of the AWS SDKs and CLIs available that support IMDSv2.
Eh, seems this could have been better from the start, but oh well.
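
For reference, the IMDSv2 flow is two requests: a `PUT` that obtains a session token, then a `GET` that presents it. The sketch below builds both with the standard library. The endpoint, paths, and header names are the documented ones; the helper names and the default TTL are my own choices, and `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"  # link-local address of the metadata service


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT asking IMDSv2 for a session token valid for ttl_seconds."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "/latest/meta-data/") -> urllib.request.Request:
    """Step 2: a GET that presents the session token in a request header."""
    return urllib.request.Request(
        f"{IMDS_BASE}{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str = "/latest/meta-data/", timeout: float = 2.0) -> str:
    """Run both steps against the real service (only reachable from an instance)."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token, path), timeout=timeout) as resp:
        return resp.read().decode()
```

The point of the extra step is that a typical SSRF bug lets an attacker choose a URL for a `GET`, but not issue a `PUT` or set custom headers, so the token exchange is out of reach.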
How I accidentally took down GitHub Actions
> Commit shorthashes have a major problem: As a repository accumulates a large number of commits, eventually it will contain two commit hashes that start with the same seven characters (and have the same shorthash). After this happens, tools that use shorthashes will start to break because the commit shorthash is ambiguous (it’s no longer a pointer to a single commit). Due to the birthday problem, any repository that has at least 19291 commits is likely to have a pair of ambiguous commits somewhere. So if we waited for the actions/docker repo to have tens of thousands of commits, one of the shorthashes would eventually become ambiguous and break someone’s build.
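
The 19291 figure can be checked directly. A seven-character shorthash has 16^7 possible values, and the birthday problem asks for the smallest number of commits at which a collision is more likely than not. A minimal sketch, assuming hashes are uniformly distributed; the function name is hypothetical.

```python
def commits_for_likely_collision(hex_chars: int = 7, threshold: float = 0.5) -> int:
    """Smallest commit count whose shorthash collision probability reaches threshold.

    Assumes each shorthash is an independent, uniformly random hex string.
    """
    space = 16 ** hex_chars   # number of distinct shorthashes
    p_all_unique = 1.0        # probability that all commits so far are distinct
    commits = 0
    while 1.0 - p_all_unique < threshold:
        # Adding one more commit: it must avoid the `commits` hashes already taken.
        p_all_unique *= 1.0 - commits / space
        commits += 1
    return commits


if __name__ == "__main__":
    print(commits_for_likely_collision())  # 19291 for seven hex characters
```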
Snap: a microkernel approach to host networking
> This paper describes the networking stack, Snap, that has been running in production at Google for the last three years+. It’s been clear for a while that software designed explicitly for the data center environment will increasingly want/need to make different design trade-offs to e.g. general-purpose systems software that you might install on your own machines. But wow, I didn’t think we’d be at the point yet where we’d be abandoning TCP/IP! You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea.
Kubernetes made my latency 10x higher
> Problems often appear just because we put some pieces together in the first place.
OpenTitan - open sourcing transparent, trustworthy, and secure silicon
> Today, along with our partners, we are excited to announce OpenTitan - the first open source silicon root of trust (RoT) project. OpenTitan will deliver a high-quality RoT design and integration guidelines for use in data center servers, storage, peripherals, and more. Open sourcing the silicon design makes it more transparent, trustworthy, and ultimately, secure.
OpenBSD on Google Compute Engine
> This tutorial outlines a simple way to get OpenBSD working on GCE, utilizing only OpenBSD to create the image and send up into gcloud.
Defense at Scale
> Last year, my colleague Chris Rohlf gave a keynote at BSidesNOLA entitled “Offense at Scale”. Offense sounds fun. Pwn all the things. And you’re always going to win! And normally I’m a big fan of being massively offensive. Unfortunately, I find myself on the defense when it comes to information security.
> Here’s how you defend at scale. Can’t be done. The end. Everything’s fucked. You’re pwned.
Plenty of good points here. Also a fun read.
Migrating From Cloudflare
> Okay so here’s the thing: Cloudflare isn’t just the CDN provider for the instance, it is also the domain’s nameserver. That means that it holds all the DNS records that point mastodon.technology to the various IP addresses used for HTTP requests, email, and even public DKIM keys for mail server verification. These DNS settings are really, really important. If they get messed up, everything about the instance can break.
> So I split up the migration from Cloudflare to BunnyCDN into two phases: first migrate the CDN provider, and then migrate the DNS provider. Getting this right is really important, and I mostly did okay, but hopefully you can learn from my experiences.
Public Suffix List Problems
> This is a collection of thoughts from a maintainer of the Public Suffix List (PSL) about the importance of avoiding new Web Platform features, security, or privacy boundaries assuming the PSL is a good starting point.
> Equally terrifying, however, is how many providers only discovered the existence of the PSL once LE was using it to rate limit - meaning that their users were able to influence cookies and other storage without restriction, until an incidental change (wanting to get more certs) caused the server operator to realize.
Preventing The Capital One Breach
> Every indication is that the attacker exploited a type of vulnerability known as Server Side Request Forgery (SSRF) in order to perform the attack. SSRF has become the most serious vulnerability facing organizations that use public clouds. SSRF is not an unknown vulnerability, but it doesn’t receive enough attention and was absent from the OWASP Top 10.
> SSRF is a bug hunters dream because it is an easy to perform attack and regularly yields critical findings, like this bug bounty report to Shopify. The problem is common and well-known, but hard to prevent and does not have any mitigations built in to the AWS platform.
Google Groups entirely ignores SMTP time rejections
> Google Groups ignored this rejection and began sending email messages from the group/mailing list to my spamtrap address. Each of these messages was rejected at SMTP time, and each of them contained a unique MAIL FROM address (a VERP), which good mailing list software uses to notice delivery failures and unsubscribe addresses. Google Groups is, of course, not good mailing list software, since it entirely ignored the rejections. I expect that this increases the metrics of things like ‘subscribers to Google Groups’ and ‘number of active Google Groups’ and others that the department responsible for Google Groups is rewarded for. Such is the toxic nature of rewarding and requiring ‘engagement’, especially without any care for the details.
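
For anyone who has not met VERP: the trick is to encode the recipient into the envelope sender, so a bounce or SMTP-time rejection identifies the failing subscriber without parsing the bounce body. A minimal sketch of the common `prefix+user=domain@list-domain` encoding; the function names and addresses are illustrative, and this is not a claim about how any particular mailing list software implements it.

```python
def verp_sender(bounce_prefix: str, list_domain: str, recipient: str) -> str:
    """Build a per-recipient envelope sender (MAIL FROM) for one list message."""
    user, _, domain = recipient.rpartition("@")
    return f"{bounce_prefix}+{user}={domain}@{list_domain}"


def verp_recipient(envelope_sender: str) -> str:
    """Recover the subscriber address from a VERP envelope sender."""
    local_part, _, _ = envelope_sender.rpartition("@")
    # Everything after the first "+" is the encoded recipient
    # (assumes the bounce prefix itself contains no "+").
    _, _, encoded = local_part.partition("+")
    user, _, domain = encoded.rpartition("=")
    return f"{user}@{domain}"


if __name__ == "__main__":
    sender = verp_sender("list-bounces", "lists.example.org", "alice@example.com")
    print(sender)
    print(verp_recipient(sender))
```

A list manager that sees a rejection for that sender knows exactly which address to unsubscribe.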
Building Facebook’s service encryption infrastructure
> In this post, we’ll talk about how we migrated our encryption infrastructure in data centers from the Kerberos authentication protocol to TLS. Optimizing for operability and performance, while still satisfying the right security model for each service, required navigating difficult trade-offs. By sharing our experiences, we hope to show how we think about our encryption infrastructure and help others as they think through their own implementation.
The Unusual Case of Status code- 301 Redirection to AWS Security Credentials Compromise
> The redirection that I got in the first step was now becoming a Server Side Redirection, not just a client-side redirection. Now if its a server side redirection then there would definitely be a big chance of SSRF (Server Side Request Forgery) attack.
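
One defensive habit this suggests: when server-side code follows redirects, check each `Location` before fetching it. The sketch below is one possible check, not what the write-up's target did or should have done. It only handles hosts given as literal IP addresses; a real filter would also resolve hostnames and check every address returned, which is omitted here. The function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def ip_is_internal(ip_text: str) -> bool:
    """True for private, loopback, and link-local addresses (e.g. 169.254.169.254)."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def redirect_target_needs_blocking(location: str) -> bool:
    """Decide whether an absolute redirect URL points at an internal IP literal."""
    host = urlsplit(location).hostname
    if host is None:
        return True  # no host to inspect: refuse rather than guess
    try:
        return ip_is_internal(host)
    except ValueError:
        # Not an IP literal. Assumption: the caller resolves the hostname and
        # runs ip_is_internal on each resolved address before connecting.
        return False


if __name__ == "__main__":
    for url in ("http://169.254.169.254/latest/meta-data/", "https://8.8.8.8/"):
        print(url, redirect_target_needs_blocking(url))
```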
Stealing Downloads from Slack Users
> The vulnerability could have allowed a remote attacker to submit a masqueraded link in a slack channel, that “if clicked” by a victim, would silently change the download location setting of the slack client to an attacker owned SMB share. This could have allowed all future downloaded documents by the victim to end up being uploaded to an attacker owned file server until the setting is manually changed back by the victim.