Confirmed Sessions

We're just now beginning to announce the sessions. 70+ more to go. We'll be adding new sessions every few days.

Who guards the guardians? Designing for resilience in cluster orchestrators

Preetha Appan - HashiCorp

Cluster orchestrators enable reliable and repeatable application deploys and provide fault tolerance without operator intervention. These orchestrators are themselves complex distributed systems like the applications they manage. The blast radius when a cluster orchestrator fails is huge; it could take down all your applications. Designing resilience into the orchestrator is a unique challenge given its critical operational nature.
Preetha Appan outlines various failure modes ranging from network failures to entire server failures in Nomad, an open source scheduler that supports heterogeneous workloads. You’ll discover how building graceful degradation and resilience to address these failures involves looking at the problem as a trade-off between three system features: correctness, performance, and availability. Along the way, Preetha shares examples of design decisions that impact the availability of applications managed by the scheduler and lessons learned that apply to building any complex distributed system.

Isolate Computing

Zack Bloom - Cloudflare

For forty years computation has been built around the idea of a process as the fundamental abstraction of a piece of code to be executed. In that time, how we write code has changed dramatically, culminating with serverless, but the nature of a process has not.
Processes unfortunately incur a context-switching overhead as the operating system moves the processor from executing one serverless container to another, wasting CPU cycles. Processes also can only do I/O and other critical tasks by firing interrupts into the kernel, which waste as much as 33% of the execution time of an I/O-bound function. Processes also incur startup time as heavyweight virtual machines like Node.js are initialized, which we experience in the serverless world as a cold start. The fear of cold starts requires us to do complex work to warm serverless functions, and requires even infrequently used functions to consume precious memory to avoid them.
There may be an alternative. Web browsers have solved the same problem, the need to run many instances of untrusted code with minimal overhead and start new code execution lightning-fast, in an entirely different way. They run a single virtual machine, and encapsulate each piece of code not in a process, but in an ‘isolate’. These isolates can be started in 5 milliseconds, 100 times faster than a Node Lambda serverless function. They also consume 1/10 the memory.
Beyond serverless, being able to initiate execution of server-side code in less time than it takes for a web request to connect opens dramatic possibilities. Services can be scaled to millions of requests per second instantaneously. They can be deployed to hundreds of locations around the world with the same economics as deploying to just one. Even better, by eliminating process-related overhead, it brings us close to the economics of running on bare metal, but with the ergonomics of serverless programming.
Leaving this presentation, attendees will have an understanding of where isolate-based serverless might be more appropriate than other forms of compute. In those situations, they will be able to deploy code that can be affordably run close to every Internet visitor, that can autoscale instantaneously, and that can be as much as three times less expensive than container-based serverless systems per CPU cycle.

Enterprise transformation (and you can too)

Donovan Brown - Microsoft

“That would never work here.” You’ve likely heard this sentiment echo from your company’s conference rooms or board rooms (or maybe you’ve said it yourself). There are always reasons: established processes (with vested interests supporting them), legacy codebases and data centers (both with large install footprints), and scale (for some values of scale), to name just a few.
Good news: change is possible. Donovan Brown walks you through a case study from Microsoft’s Visual Studio Team Services (VSTS). VSTS went from a three-year waterfall delivery cycle to three-week iterations and open sourced the VSTS task library and the Git Virtual File System (GVFS). To make these changes, the team had to question its tool choices, change its processes, and empower its people. You’ll learn why integration of cross-functional teams is key to the continuous delivery of value to end users.

Data Modeling in the 24th and ½ Century with Apache Cassandra

Amanda Moran - DataStax

Why do I want a cloud-native database? Why all this migration headache? Can’t I just keep my relational database? This talk will focus on Apache Cassandra data modeling, how to do it right, and how to be successful with cloud-native distributed databases by avoiding common mistakes. Some of the topics covered in this session are:
- What needs to be considered when moving from a relational database to Apache Cassandra?
- What needs to be considered when moving from another NoSQL database to Apache Cassandra?
- What is the difference between SQL and CQL?
- How do you do data modeling in Apache Cassandra? Steps for getting your data model right.
- Common mistakes and how to fix them.

How to Scale your Customer Experience

Chris McCraw - Netlify

Do you wish your company's Support team was constantly bringing you User Stories and filing better bugs? This talk will demonstrate how to create a better environment for collaborative work across teams, particularly as they grow in size and products grow in complexity. We'll cover topics including:
- helping your support team think like engineers. This leads to better escalations and feedback.
- developing an engineering relationship directly with your customers.
- working to develop a model for actionable feedback (instead of "this is broken", "this could meet more use cases we've heard about if the behavior changed")
- developing a better escalation path for customer-facing issues
You'll end up with some thoughts and practices that will help your customer experience and Support team interactions scale as fast as your business.
Intended audience: Engineers, Engineering Managers, or Product Managers who want to get "closer to the meat" (receive more direct and actionable customer feedback)

Schema Evolution Patterns

Alex Rasmussen - Bits on Disk

Everybody’s talking about microservices, but nobody seems to agree on how to make them talk to each other. How should you version your APIs, and how does API version deprecation actually work in practice? Do you use plain old JSON, Thrift, protocol buffers, GraphQL? How do teams communicate changes in their services’ interfaces, and how do consumer services respond?
Separately, nobody seems to agree on how to handle migrating a service’s structured data without downtime. Do you write to shadow tables? Chain new tables off the old ones? Just run the migration live and hope nothing bad happens? Switch everything over to NoSQL?
Both these problems are instances of issues with schema evolution: what happens when the structure of your structured data changes. In this talk, rather than taking a prescriptive approach, I’ll try to distill a lot of institutional knowledge and computer science history into a set of patterns and examine the tradeoffs between them.

Serverless Security: Attackers & Defenders

Ory Segal - PureSec

In cloud-native environments in general, and serverless in particular, the cloud provider is responsible for securing the underlying infrastructure, from the data centers all the way up to the container and runtime environment. This relieves the application owner of much of the security burden; however, it also poses many unique challenges when it comes to securing the application layer. In this presentation, we will discuss the most critical challenges related to securing serverless applications – from development to deployment. We will also walk through a live demo of a realistic serverless application that contains several common vulnerabilities, see how attackers can exploit them, and learn how to fix them.
Key takeaways include:
1) Understand application security challenges for serverless architectures
2) Learn about the key risks and developer mistakes for serverless applications
3) See how an attacker approaches serverless apps, and exploits weaknesses
4) Learn how to protect and defend your serverless code
5) Learn about open source tools that can help

Base64 is not encryption - a better story for Kubernetes Secrets

Seth Vargo - Google

Secrets are a key pillar of Kubernetes’ security model, used internally (e.g. service accounts) and by users (e.g. API keys), but did you know they are stored in plaintext? That’s right, by default all Kubernetes secrets are base64-encoded and stored as plaintext in etcd. Anyone with access to the etcd cluster has access to all your Kubernetes secrets.
Thankfully there are better ways. This lecture provides an overview of different techniques for more securely managing secrets in Kubernetes including secrets encryption, KMS plugins, and tools like HashiCorp Vault. Attendees will learn the tradeoffs of each approach to make better decisions on how to secure their Kubernetes clusters.
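As a small illustration of the point in the title, the sketch below uses only Python's standard library to show that base64 is a reversible encoding that needs no key. The function names and the sample value are hypothetical and are not taken from the talk or from Kubernetes itself.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Base64-encode a value, the way a secret value is commonly represented."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding. No key or password is involved at any point."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample value, for illustration only.
    stored = encode_secret_value("hypothetical-api-key")
    print("stored form:   ", stored)
    print("recovered form:", decode_secret_value(stored))
```

Anyone who can read the stored form can recover the original value with a single call, which is why encoding alone offers no confidentiality.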

Everything is a Little Bit Broken ~or~ The Illusion of Control

Heidi Waterhouse - LaunchDarkly

We never change the amount of work or technical debt, we just shift it, and with it, we change how it emerges and appears.
Our systems don’t have to be perfect to be operational – planes, networks, and elite athletes all function at extremely high levels even though they are not operating at 100%.
As an industry, we have moved the locus of control from hardware to operating system to virtual machine, to container, to orchestration, and now we are approaching serverless. None of that has reduced the amount of work that must happen; it just makes it possible to re-use and conceptually compress the work of others. Since we are making the work in our tools less visible, we also have less control over how they work. We end up assuming that the promises that have been true will continue to be true, but that is not in our control.
How do we handle this level of uncertainty? By adding error budgets, layered access, and other accommodations for failure, and by designing our systems for function over form or purity.
The audience will leave with some concrete ideas about how to add resiliency to their systems by learning to trust their underlying tools while mitigating their reliance on those tools’ perfect performance.
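For readers unfamiliar with the error budgets mentioned above, here is a minimal sketch of the usual arithmetic, assuming an availability target measured over a fixed window. The function names, the 30-day window, and the sample numbers are illustrative assumptions, not material from the talk.

```python
def error_budget_minutes(availability_target: float, window_days: int = 30) -> float:
    """Return the downtime, in minutes, that the target tolerates over the window."""
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in the range (0, 1]")
    total_minutes = window_days * 24 * 60
    return (1.0 - availability_target) * total_minutes


def budget_remaining(
    availability_target: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Return the unspent budget; a negative result means the budget is overspent."""
    return error_budget_minutes(availability_target, window_days) - downtime_minutes


if __name__ == "__main__":
    # Assumed example: a 99.9% target over 30 days, with 10 minutes already spent.
    print(round(error_budget_minutes(0.999), 1))
    print(round(budget_remaining(0.999, 10.0), 1))
```

Under these assumptions, a 99.9% target over 30 days tolerates about 43.2 minutes of downtime. Treating that allowance as something to spend, rather than something to eliminate, is one way of accepting that a system can be operational without being perfect.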