Accepting that imperfect things still work is fundamental to preventing failures from becoming catastrophes.
Seeing a year’s worth of capacity growth in a matter of weeks, the CDN services provider hustled to build and reinforce the infrastructure it needed to serve its users (and European soccer fans).
To build a high-performing software delivery system, your stack’s capabilities are just one part of the picture.
As software systems become ever more complex, chaos engineering provides a (not-actually-so-chaotic) tool kit for building more reliable and resilient systems.
The way we fight fires affects how quickly we can resolve outages. Appointing an incident commander can help—and you (yes, you) can be one.
Pseudo-tested methods can be a reliability risk. Here, the authors explain how they developed a methodology and tool to uncover them in Java applications.
A case for designing consumer software with safety-critical principles and formal methods in mind.
Leaders at Deliveroo, DigitalOcean, Fastly, and Headspace share how their organizations think about reliability and resiliency and their advice to engineering orgs embarking on reliability journeys.
Occasionally I see a comic that has a grocery store checkout that says “fewer.” Here’s another one. The digression is below the comic.
Okay, digression. This is about the likelihood of picking the fastest line and why we never seem to pick it. The reason is that line speeds are effectively random, and when we pick a line, we compare it with the line on each side. That makes the chance of being in the fastest of those three lines one in three. So sometimes we do pick the fastest line, but winning only one time out of three feels worse than it is. I read this in an article in Scientific American, so it must be true.
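For the curious, here is a quick way to check that arithmetic. It is only a sketch, and it rests on an assumption of mine rather than anything in the article: each line's speed is an independent random number drawn from the same distribution. Under that assumption, the chance that our line is the fastest of the three we compare should come out near one in three. The function name and parameters are made up for illustration.

```python
import random


def fastest_line_rate(trials=100_000, lines=3, seed=0):
    """Estimate how often our line is the fastest of the lines we compare.

    Assumption (mine, for illustration): line speeds are independent
    draws from the same uniform distribution.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    wins = 0
    for _ in range(trials):
        # Index 0 is our line; the others are the neighbors we watch.
        speeds = [rng.random() for _ in range(lines)]
        if speeds[0] == max(speeds):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # With three lines the estimate should land close to 1/3.
    print(f"Share of trips in the fastest line: {fastest_line_rate():.3f}")
```

Watching more neighbors makes it worse: with `lines=5` the same sketch should land near one in five.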
Best practices for crafting your resume, doing your research, and—hopefully—landing the job.