
Part 4: Measure, Improve, Repeat – Using Golden Signals and DORA Metrics to Drive Growth

From Gut Feel to Data-Driven Improvement

In the whirlwind of startup life, it’s easy to judge your engineering performance by gut feel: “We’re deploying pretty fast, right? Our quality seems okay, I think.” But as we grew from 5 to 50 engineers, we realized that what gets measured gets improved. We wanted to turn our hard-earned lessons into sustainable practices. Enter DORA metrics – four key metrics identified by the DevOps Research and Assessment (DORA) group, now part of Google Cloud, that objectively gauge software delivery performance: Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Time to Restore Service. In plain terms: how often you deploy, how quickly a change goes from code to production, how often those changes fail, and how fast you recover when they do. Speed and stability captured in four numbers.

We decided to treat these metrics as our scorecard. Not in a gamification way (“yay, 10 deploys a day!”), but as a mirror to see where our feedback loops shined and where they had cracks. For instance, if Deployment Frequency is low, why? Is testing taking too long? If Change Failure Rate is high, what are we missing in quality checks? DORA gave us a common language to discuss improvement.

The Dashboard of Truth

We built a simple internal “Engineering Health” dashboard. It pulled data from our CI/CD pipeline and incident management system to automatically plot our DORA metrics each week. The first time we displayed it in an all-hands, there were some surprises. Our Deployment Frequency was actually pretty high – multiple times a day on average (thanks to our confidence with previews and good tests). Lead Time for Changes (commit to deploy) was a bit higher than we’d like; on analysis, we found code review delays were the culprit. Our Change Failure Rate (the share of deploys that caused an incident) was around 10% – not catastrophic, but we set a goal to halve it. And Time to Restore (how quickly we fix things when they go wrong) averaged a few hours – those 2 AM issues, mostly. We wanted that down to under an hour.
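If you want a concrete starting point for a dashboard like this, here is a minimal sketch of how the four DORA metrics can be derived from raw deploy and incident records. The record shapes and field names (`Deploy`, `Incident`, `caused_incident`) are illustrative assumptions, not our actual pipeline schema; in practice you would populate them from your CI/CD provider’s API and your incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median

@dataclass
class Deploy:
    commit_at: datetime      # when the change was committed
    deployed_at: datetime    # when it reached production
    caused_incident: bool    # flagged later if this deploy triggered an incident

@dataclass
class Incident:
    started_at: datetime
    resolved_at: datetime

def dora_metrics(deploys: list[Deploy], incidents: list[Incident], window_days: int = 7) -> dict:
    """Summarize the four DORA metrics over one reporting window (e.g. a week)."""
    lead_hours = [(d.deployed_at - d.commit_at).total_seconds() / 3600 for d in deploys]
    restore_hours = [(i.resolved_at - i.started_at).total_seconds() / 3600 for i in incidents]
    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_hours) if lead_hours else 0.0,
        "change_failure_rate": (sum(d.caused_incident for d in deploys) / len(deploys)) if deploys else 0.0,
        "mean_time_to_restore_hours": (sum(restore_hours) / len(restore_hours)) if restore_hours else 0.0,
    }
```

Even a script this small, run weekly and charted over time, is enough to start the conversations described below.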

By seeing these metrics, we could pinpoint which feedback loop to tighten. Long lead times? Let’s improve CI speed or streamline code reviews. High failure rate? Maybe add more integration tests, or use canary releases. DORA metrics basically pointed to the weakest link in our fast & safe equation.

One quarter, the dashboard showed our deployment frequency dipped. We investigated and discovered a bottleneck: our manual QA on staging was slowing releases. That insight pushed us to expand our automated test coverage and trust our preview environments more. The next month, deployment frequency rebounded because we weren’t waiting on a long staging sign-off for every little change.

Another insight: we noticed our Time to Restore was largely dominated by one or two hairy incidents (like a complex data migration gone wrong that took a day to fix). In retrospectives, this led us to implement blameless postmortems and make specific improvements (in that case, improving our backup and rollback procedures). The result? The next incident of that type was resolved in under an hour. Data proved our improvement – incredibly satisfying to see on the graph.

Celebrating Progress and Learning from Pitfalls

DORA’s research famously shows that elite performers achieve multiple deploys per day, <1 hour lead times, <15% change failure, and <1 hour restore times. These aren’t just vanity numbers – they correlate with high-performing teams that can innovate quickly without burning out or blowing things up. We aspired to that level. And we got close! At one point, we were deploying on average 5x a day (thanks to feature flags and automated everything), lead time ~1 day (some changes still required a bunch of tests), failure rate ~5%, restore time ~30 minutes. Those numbers represented countless optimizations and cultural shifts – not overnight, but steady progress. And DORA metrics gave us feedback on our improvement journey.

For example, when we introduced preview environments (from Part 2), we hypothesized it would catch issues earlier and thus reduce change failure rate. Over the next few deployment cycles, we saw that metric indeed trend downward – a validation that the feedback loop was working. When we rolled out structured logging and better monitoring (from Part 3), our Time to Restore began dropping – we could verify that our faster debugging translated to less downtime. It’s one thing to feel like “we’re getting better”; it’s another to see it in black and white on a chart. Those wins energized the team and justified further investment in tooling and process.

DORA metrics also sparked great retrospective conversations. If a metric stagnated or worsened, it forced us to ask “Why?” in a constructive way. One quarter, our lead time plateaued and even regressed for certain teams. Discussion revealed that our frontend build times had ballooned (a side effect of adding lots of features). Armed with that knowledge, we allocated time to optimize webpack and CI caching. The following month, lead time was back on track. Without measuring, we might have missed that trend until it got much worse. DORA metrics took ambiguity out of the equation – we couldn’t cherry-pick feelings or anecdotes; the data kept us honest.

Golden Signals: What to Watch in Real Time

While DORA tells us how we’re performing over weeks and months, Google’s Golden Signals tell us what to watch minute‑to‑minute in production:

  • Latency: How long requests take. Track p50/p95/p99 and watch for regressions during releases.
  • Traffic: How much demand the system sees (RPS, concurrent users, queue depth). Sudden spikes often precede issues.
  • Errors: The error rate seen by users (HTTP 5xx/4xx where relevant, app‑level failures). This is your immediate quality bar.
  • Saturation: How “full” your system is (CPU, memory, I/O, connection pools). Running hot reduces headroom and resilience.

We wired these into dashboards and release checklists. Before and after a deploy, we glance at Golden Signals to catch any instant regressions; over time, DORA trends tell us whether our overall loop is getting faster and safer. Together, they form a powerful pairing: Golden Signals for fast detection, DORA for sustained improvement.
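To make the “glance before and after a deploy” habit concrete, here is a hedged sketch of a release check that compares the error and latency signals against simple thresholds. The request-sample shape and the threshold values are assumptions for illustration; a real setup would usually query a metrics backend (Prometheus, Datadog, etc.) rather than raw samples.

```python
from dataclasses import dataclass

@dataclass
class RequestSample:
    latency_ms: float
    status_code: int

def p95(values: list[float]) -> float:
    """Rough 95th percentile: good enough for a release gut-check, not for exact SLO math."""
    ordered = sorted(values)
    return ordered[max(0, int(0.95 * len(ordered)) - 1)]

def golden_signal_check(samples: list[RequestSample],
                        max_error_rate: float = 0.01,
                        max_p95_latency_ms: float = 500.0) -> bool:
    """Return True if the latest window of traffic looks healthy enough to proceed."""
    if not samples:
        return True  # no traffic in the window; nothing to judge yet
    error_rate = sum(s.status_code >= 500 for s in samples) / len(samples)
    latency_p95 = p95([s.latency_ms for s in samples])
    healthy = error_rate <= max_error_rate and latency_p95 <= max_p95_latency_ms
    print(f"error_rate={error_rate:.2%}  p95={latency_p95:.0f}ms  healthy={healthy}")
    return healthy
```

Running the same check a few minutes before and after a deploy turns “glance at the dashboards” into a repeatable step; if the post-deploy numbers regress, that’s the cue to roll back while the change is still small and fresh in everyone’s mind.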

Culture: Focus on Improvement, Not Blame

A key aspect of using these metrics was avoiding the trap of Goodhart’s Law – “when a measure becomes a target, it ceases to be a good measure.” We reminded everyone: the goal isn’t to game the metrics (like deploying junk just to raise frequency) – the goal is to improve what the metrics represent: our ability to ship quickly and reliably. By discussing metrics openly and blamelessly, we built a culture where feedback (even about our own process) was welcome. For instance, if deployment frequency dropped because folks were batching more changes together (maybe due to an overly cumbersome release process), we talked about how to make releasing easier rather than blaming anyone for not deploying often enough. Metrics were our shared responsibility.

We also balanced metrics with qualitative feedback. Every retro, alongside DORA numbers, we’d ask “How do people feel about our velocity and quality?” Sometimes an engineer would say, “I know the metrics look good, but I feel super stressed with how we’re rushing.” That’s important! We’d then dig into root causes – maybe we needed a better on-call rotation or more slack time between sprints. The metrics won’t tell you that directly, but they start the conversation. We learned to use them as a compass, not a hammer.

The Payoff: Better Software, Happier Team

As our startup matured, these feedback loops and metrics became second nature. The payoff was huge. We achieved a reputation (internally and with users) for being ultra-responsive. If a user reported a bug in the morning, often a fix was deployed by afternoon. That’s because our whole system – from dev environment to preview to logging to metrics – was optimized to shorten the path from problem discovery to solution. We heard comments like, “Wow, your team fixes things faster than any product we use.” Music to a startup’s ears!

On the flip side, we could ship new features confidently, knowing our safety nets were in place. This confidence meant we took more swings at bold changes than competitors who might be paralyzed by fear of breaking things. It’s no coincidence that high DORA performers also tend to win in the market. We felt that effect. By striving to be “elite” in engineering operations, we were able to innovate quickly for customers – a true competitive advantage.

Perhaps most importantly, our engineers were happier. There’s nothing worse than slogging through weeks of debugging or being stuck in deployment hell. By removing much of that pain via fast feedback loops, engineers spent more time doing what they love – building cool stuff – and less time firefighting. And when firefighting was needed, it was a quick blaze, not a days-long forest fire.

The Never-Ending Loop

Continuous improvement is, fittingly, continuous. We haven’t “solved” everything – no one ever does. But we’ve ingrained a mindset of measuring and learning. Whenever something feels off, we ask, “Is there a feedback signal or metric that can guide us?” And we invest in that. The four posts you’ve read aren’t isolated tips; together they form a holistic approach:

  • Dev inner loop: Make it insanely fast and consistent (Dev Containers, Docker) so devs can iterate without friction.
  • Team feedback loop: Get changes in front of eyes and users early (Preview Environments) to catch issues and refine in context.
  • Production loop: Instrument your runtime (Structured Logging, Monitoring) so you can rapidly detect and fix problems in the wild.
  • Process improvement loop: Continuously measure delivery performance (DORA Metrics) to identify and address bottlenecks in how you build and release.

Each loop feeds the next. Improvements in dev environment speed help you deploy more often. Better preview testing reduces failures in prod. Better prod monitoring reduces restore times, which feeds back into confidence to deploy frequently. It’s all connected in a virtuous cycle. Our journey is testament to this: every loop we tightened made the others stronger too.

In the end, fast and safe software execution isn’t a pipe dream; it’s the result of intentional feedback loops and a culture that embraces them. By telling our story across these four parts, we hope you saw not just the how, but the why – and felt a bit of the excitement and relief that comes when these practices click. Whether you’re a scrappy startup or a growing tech team, these principles can scale with you. Stay agile, keep the feedback flowing, and never stop improving. Your users will thank you, and your future self will too.


Series navigation