
Cost Effective Data Pipelines
The definitive guide on reducing data pipeline costs, by Data Scout Founder Sev Leonard. Based on over a decade of industry experience, including best practices from companies like Uber and Doordash. Trusted by engineers and managers running small to large data deployments, from startups to Fortune 1000 companies.
The cloud data revolution of the mid-2010s gave data engineers easy access to compute and storage at extraordinary scale, but this sea change also made engineers responsible for the daily dollars and cents of their workloads. This is the book we’ve been waiting for to provide clear, opinionated guidance on monitoring, controlling and optimizing the costs of high performance cloud data systems.
Sev’s best practices and strategies could have saved my employer millions of dollars. That’s a pretty good return on investment for the price of a book and the time to read it.
Real-world data pipelines are notoriously fickle. Things change, and things break. This book is a great resource for getting ahead of costly data pipeline problems before they get ahead of you.
Managing data at scale has always been challenging. Most organizations struggle with over-provisioning resources and inflated project costs. This book provides crystal clear insight on overcoming these challenges and keeping your costs as low as possible.
Practical guide with good software engineering fundamentals ⭐⭐⭐⭐⭐
I’ve read this book and recommended it several times to other engineers in my company. The book really focuses on using good software engineering principles such as building lean prototypes, building extensible scalable tooling, and having good testing and observability from the start.
The book has a lot of real-world examples for how to work with spot instances, laying data out with an eye toward cloud storage costs and manageability, and building intelligent tests with real or mocked data.
This is the manual I wish I had when I was just getting started with data; it would have saved me a lot of suffering! But whether you’re just getting started or have decades of experience, the accessible strategies Sev has developed will not only help you build more reliable, cost-effective pipelines; they will also help you communicate about them to a variety of stakeholders. A must-read for anyone working with data!
This is the most readable guide I’ve seen in decades for designing and building robust real-world data pipelines. With plenty of context and detailed, non-trivial examples using real-world code, this book will be your 24/7 expert when working through messy problems that have no easy solutions. You’ll learn to balance complex trade-offs among cost, performance, implementation time, long-term support, future growth, and myriad other elements that make up today’s complex data pipeline landscape.
Excellent book on data pipelines ⭐⭐⭐⭐⭐
Well written with engaging prose and well thought out diagrams that illustrate complex topics. Very up-to-date examples and realistic scenarios that organizations large and small are likely to encounter. Throughout, recommended readings are generously provided for delving deeper into a specific topic. Highly recommended!
