Building Secure and Reliable Systems
Best Practices for Designing, Implementing, and Maintaining Systems
Google's site-reliability and security teams jointly write down what it actually takes to build systems that are both safe and dependable, from threat models and design reviews to rollback culture and crisis response.
- Published: 2020
- Publisher: O'Reilly Media
- Pages: 558
- Language: English
Read this if
You are a staff-and-up engineer, SRE, or security lead designing or operating systems where reliability and security must be argued for in the same room. The book treats reliability and security as one engineering discipline, which is the right model, and almost no other book argues it.
Skip this if
You want a tooling tutorial or vendor-neutral checklists. The case studies are Google-shaped, and the patterns assume you have the discipline (postmortems, code review, paved roads) to execute them. If your org cannot stop a deploy, half the book will read as aspirational.
Key takeaways
- Reliability and security share a common substrate: both are about designing for failure modes you cannot fully predict, and both decay if not exercised.
- Recovery, not prevention, is the core skill of mature security organizations; the rollback, response, and recovery chapters are the heart of the book.
- Most security wins come from boring infrastructure (paved roads, default-secure libraries, code review, sandboxing) rather than detection magic.
Notes
Pair with the original SRE book (Site Reliability Engineering) and The Site Reliability Workbook for the reliability half of the argument, and with Threat Modeling: Designing for Security (Shostack) for the design-review practice. The book is freely readable on Google's site (sre.google/books), but the print edition is worth it for the cross-team study sessions it tends to spawn. Read alongside Designing Secure Software (Kohnfelder) to fill in the application-layer gap.
What to read before
Advanced · 2023
Security Chaos Engineering
Kelly Shortridge and Aaron Rinehart on treating security as a property of complex adaptive systems: instead of preventing failure, you continuously simulate it, and design the organization to learn from each result.
Advanced · 2020
Security Engineering
Ross Anderson's comprehensive textbook on the design of secure systems, covering protocols, access control, side channels, economics of security, and policy.
Intermediate · 2021
Designing Secure Software
Loren Kohnfelder, who introduced public-key certificates in his 1978 MIT thesis, on how to weave security thinking through requirements, design, implementation, and operations rather than bolt it on at the end.
What to read next
Advanced · 2023
Security Chaos Engineering
Kelly Shortridge and Aaron Rinehart on treating security as a property of complex adaptive systems: instead of preventing failure, you continuously simulate it, and design the organization to learn from each result.
Advanced · 2020
Security Engineering
Ross Anderson's comprehensive textbook on the design of secure systems, covering protocols, access control, side channels, economics of security, and policy.
Advanced · 2024
Evasive Malware
Kyle Cucci on the anti-analysis arms race: sandbox detection, anti-debug, anti-VM, packing, and the analyst-side tooling and tradecraft that get past those layers.
Explore similar books
Advanced · 2023
Security Chaos Engineering
Kelly Shortridge and Aaron Rinehart on treating security as a property of complex adaptive systems: instead of preventing failure, you continuously simulate it, and design the organization to learn from each result.
Advanced · 2020
Security Engineering
Ross Anderson's comprehensive textbook on the design of secure systems, covering protocols, access control, side channels, economics of security, and policy.
Advanced · 2024
Evasive Malware
Kyle Cucci on the anti-analysis arms race: sandbox detection, anti-debug, anti-VM, packing, and the analyst-side tooling and tradecraft that get past those layers.