Podcast

You have developer productivity metrics. Now what?

Many teams struggle to use developer productivity data effectively because they don’t know how to turn it into decisions about what to do next. We know that data is there to help us improve, but how do you know where to look? And even then, what do you actually do to put the wheels of change in motion? Listen to this conversation with Abi Noda and Laura Tacho (CEO and CTO at DX) about data-driven management and how to take a structured, analytical approach to using data for improvement.

Show Notes

  • Common mistakes organizations make with developer productivity metrics:

    • Reverting to old habits: Simply adding the metrics to leadership reports without driving real change.
    • Overwhelming teams with data: Expecting teams to derive meaning from hundreds of measurements without providing adequate support or clear expectations.
    • Failing to connect metrics with decision-making: Collecting data that sits unused in dashboards rather than influencing team behavior and strategy.
  • Questions high-performing engineering teams will ask about productivity metrics:

    • Who is this data for? Are we diagnosing or improving? How will this data be used in decision-making?
  • Two primary use cases for metrics:

    • Engineering organizations: These organizations use metrics to assess efficiency and drive improvement. Leadership uses this data to guide transformation efforts, ensure alignment with business goals, increase quality, and improve engineering velocity. Teams use metrics to make daily and weekly decisions to improve their performance and velocity.
    • Systems teams (Platform Eng, DevEx, DevProd): These teams use metrics to understand how engineering teams interact with internal systems and to assess the ROI of DevEx investments. These metrics are existential for these teams, who need to show the impact of their work in order to measure the success of their investment. These measurements are also crucial for setting future priorities.
  • Categories of metrics:

    • Diagnostic metrics: These are high-level, summary metrics that provide insights into trends over time.
      • Collected with lower frequency
      • Benefit from industry benchmarks to contextualize performance
      • Best used for directional or strategic decision-making
      • Examples: DX Core 4 primary metrics, DORA metrics
    • Improvement metrics: These metrics drive behavior change.
      • Collected with higher frequency
      • Focused on smaller variables
      • Often within the teams’ locus of control
  • Other best practices for ensuring metrics lead to action:

    • Tell a story with data: Rather than presenting raw numbers, frame metrics in the context of progress toward key business goals.
    • Use industry benchmarks for context: Comparing your organization’s metrics to industry benchmarks can help make data actionable. You can download the full set of DX Core 4 benchmarks here.
    • Mix qualitative and quantitative data: Looking at quantitative data from systems can tell you what is happening, but only self-reported data from developers can tell you why. For improvement, the “why” is critical.

Timestamps

  • (0:00) Intro
  • (2:07) The challenge we’re seeing
  • (6:53) Overview on using data
  • (8:58) Use cases for data - engineering organizations
  • (15:57) Use cases for data - engineering systems teams
  • (21:38) Two types of metrics - Diagnostics and Improvement
  • (38:09) Summary

Transcript

Abi Noda: So, hey everyone, thanks so much for joining. We’re really excited for this conversation today. I’m Abi, CEO and co-founder of DX, and Laura, of course, is CTO of DX. And today we’re talking about a subject that Laura and I have been discussing with a lot of our customers, within our company, and with our researchers, and that is how to actually use metrics in our businesses. So really excited to dive into this topic. But Laura, first I’ll turn it over to you for a little bit of housekeeping.

Laura Tacho: Yeah, so welcome everyone. It’s so great to see such a big group here. This is such an existential problem that a lot of us are now having in our organizations. Abi and I are here in service of you. We want this to be really useful for you, so if you have questions, you can use either the Q&A feature in Zoom or feel free to drop it in the chat. If we miss it, I apologize in advance. It’s hard to sometimes keep up. We encourage everyone to chat with each other and make friends and carry the conversation out.

One question, just to take 30 seconds to gather your own thoughts before Abi and I launch into what we have prepared for you: we’re all here because we have metrics, so now what do we do? I’m wondering what you all would love to learn from these 45 minutes. What’s the question that you hope Abi and I answer? Just take 30 seconds and write it in the chat so that Abi and I can make sure we’re taking this time that you’ve so generously chosen to spend with us and making it as useful as possible for you.

Abi Noda: Well, Laura, thanks for doing the introductions. So let’s jump in. As you set the stage, today, the topic, the title is, You Have Developer Productivity Metrics, Now What? And we were talking earlier today how this is just a problem a lot of us seem to be facing right now. Us at DX and our research team, this is a topic we’re talking about, our customers are asking us about it. It’s something we’re hearing about. What are you seeing and hearing?

Laura Tacho: Yeah. I think what’s interesting now is that this problem is becoming extremely visceral. Obviously it is for the two of us, who spend all of our time thinking about developer productivity metrics, but you can look at the chat and see all these really great questions. You all are feeling it too. And I don’t want to suggest that the problem of what to measure is in any way solved, but I think that we are crossing … we are going over the hump here, so to speak, and we’ve maybe solved that problem sufficiently. People have figured out what to measure, they’re using a framework, maybe it’s DORA, maybe it’s Core 4.

And now the next problem on the problem ladder is really presenting itself and showing up in a big way. We have people with access to metrics but without really knowing what to do with them. One thing I hear all the time is like, “Okay, we just spent three months setting up our quantitative metrics or we just ran our first developer satisfaction survey. What do I do next?” It’s just a simple question. “What do I do next? I don’t even know what to do next.” Is that a similar question that you’re seeing Abi?

Abi Noda: Yeah, I see it in many forms. It might be like you said, I’ve talked to organizations that come to us and say, “Hey, we spent a year setting up DORA metrics and now we’re starting to figure out what to do with them.” And there’s always a tinge of regret there from folks bringing this up because it’s, “Hey, we’ve invested a year into this thing and now we’re trying to figure out how to get value out of it.”

And I think, like you said, Laura, that question of what to measure is not solved. But at least here at DX, in the work we’re doing with customers, we feel like we’ve made progress on that with frameworks like the DX Core 4. And so the biggest question we’ve been getting around the Core 4 is how do we actually use this in our organization? Which again brings us back to this problem.

Laura Tacho: Yeah. And maybe, Abi, we can get a little more concrete with some of what we’re actually seeing. I’ve gotten emails from people, and I’m seeing it in the chat here: “I don’t know what to do next. I don’t know how to present this in an Excel file. I don’t know who should see it.” There are some very fundamental questions of what do we do with the data? We also see what I could describe as reverting to old habits: trying to do the wrong things with the data, borrowing patterns from things that we’ve done before.

Like, okay, we have a monthly leadership review, let’s just add these to the leadership review. I’ve seen people struggle to figure out how to sell the metrics up or how to report up to their executives or to their board, and they end up creating these documents that have 40 metrics in them, and it’s just a readout of the data that they have access to. It’s transparent, yes; actionable, no. There’s a gap of understanding around how do I get this data out of the dashboard and into the teams.

Abi Noda: Two recent examples I would share. One: I’ve talked with several organizations who’ve gotten a bunch of data and metrics stood up, and now they’re thinking about, okay, how do we go to all of our frontline managers and teams and say, “Hey, here’s this data, here’s how you should use it.” I think that’s a really common challenge for leaders and folks in enablement roles.

And earlier this week I was talking to a leader who’s leading a DevEx function at Adobe and they have to do a monthly review with the GM of their business unit and they’re trying to get ready for how do we tell our story, how do we show we’re making progress? And again, that gap from having the data to telling a good story with that data and presenting and packaging that data, there’s a huge chasm to cross there, and that’s something I think we all sometimes take for granted. Yeah, lots of challenges around this problem.

Laura, we’ve been spending a lot of time thinking about this. How can we provide guidance to folks? We have definitely not figured it all out yet, but I do feel like we’ve made a little bit of progress. Share with everyone where our head is at currently around this problem.

Laura Tacho: Yeah. Patricia, you had such a thoughtful question in the chat, I’m just going to read it for everyone. “How can we change the mindset of engineers that metrics are for improvements and not just as a report?” And I think this sums up exactly the problem here so well, that we don’t want these metrics just sitting and rotting on a dashboard somewhere collecting dust. We want them to be a living, breathing part of the fabric of how your organization operates. It’s like updating your organizational operating system in a way to include these metrics for decision-making.

Abi and I have made a lot of progress in how we formulate our mental model for thinking about it. So we’re excited to share with you just a small piece of our guidance on how to think about this. In short, we’ve got two axes along which we think about this problem. I’ll give an overview and I’ll let Abi then go into a little bit more detail here. But we want to think about the use case. So you might be an engineering leader or a frontline team thinking about using metrics to evaluate performance and improve performance. This is, Patricia, the use case that you’re talking about in your question here.

I think the other use case here is systems teams. Like platform teams, developer experience, developer productivity, infra teams, who have this existential problem of needing to know what is happening out there in the organization in order to make decisions about what to do next. So we’ve got two use cases. That’s one axis.

And then metrics can either be like a diagnostic metric, where we have an overview of what’s going on and we’re getting direction, we’re getting trend, we’re looking at benchmarks. Or they can be used for improvement, which is: today, what do I need to do differently? Or this week, what do I need to do differently to drive improvement?

Abi Noda: Great overview, Laura. I feel the first axis that you described, that bifurcation of using data as engineering leadership to gauge the organization and improve its velocity versus a centralized platform or enablement team that’s using this to shape its roadmap and show its success, I think that’s one we’ve had a pretty good finger on the pulse of for a while.

I really think it’s the second one, the second axis you mentioned around this distinction between diagnostic and improvement, that for us is I think unlocking a pretty powerful new way of thinking about this problem. And we’ll go into more detail. But first, let’s talk about this idea of the two distinct use cases and give some examples to folks, talk about the differences between these two use cases as it pertains to the types of data we need to feed into this.

Laura Tacho: Yeah. I think this first use case of the engineering team, the engineering leadership, the engineering organization trying to comprehensively improve efficiency and performance is what gets talked about the most. I don’t want to say it’s the most common or most popular use case. It definitely gets talked about the most. And I think that’s what people have stronger emotional reactions to. And when we talk about the traps of metrics like overemphasizing one metric and people gaming the system, it’s usually being talked about in context of this particular use case.

This is taking data about efficiency and performance and then making a plan. Whether that plan is a midterm plan or a long-term plan, or we can look at more granular data and make plans for today, this week, this sprint, in order to drive improvement. But this is more about the organizational health and organizational efficiency of those product teams, all the frontline engineering teams.

Abi, do you have a couple examples to share of companies who stick out to you as doing this really well or maybe doing this in public?

Abi Noda: Yeah, a lot of the customers we work with are really focused on this, right? There’s a top-down mandate in the business to really focus on efficiency, accelerate delivery, and there’s a platform or enablement organization that is part of trying to drive that, but really the onus of driving that transformation sits on the shoulders of the VPs and directors, and that trickles down to all the teams.

A good public example I think is actually Microsoft. Microsoft has a big initiative around developer productivity called Edge Thrive. And one thing I think they’re doing really well is figuring out how to make that something that all the teams and all the leadership around the company are focused around. And one piece of advice they’ve shared is this idea of pressurizing the system. The idea that to really drive this type of improvement using data, you have to get senior leadership to care about this data and to be told, “Hey, this is important. This is going to be part of determining your success.”

And then naturally that prioritization is going to trickle down all the way to the frontline teams and lead to action being taken there. Yeah, I think Microsoft is actually a great example of where they’re really thinking about this problem not just through the lens of centralized investments, but how do we get the whole organization to think about change and improvement?

Laura Tacho: Yeah, this is such a hairy problem of getting a whole organization invested and capable of using data, and just building that muscle up. I think we talked a lot about what the problem looks like, analysis paralysis, sitting in front of dashboards of all this data, but the cost of that is, to me, very palpable in this particular use case. One of the failure modes that I see a lot is that an engineering organization collects data and makes it available to their teams, but without the squeezing from the top, without pressurizing the system. So what happens is the data just gets pushed out to the leaders and the expectation is, “Okay, go do something with this.” There’s no really clear direction or pace setting from executives to show that this is not just an activity to squeeze in while you’re having lunch at your desk, in the margins of all the other work and delivery pressure that you have, but that instead we’re fundamentally changing the way our organization operates and we want to embrace continuous improvement.

That’s a fundamental part of making this use case successful is having buy-in that’s clearly communicated, clear expectations from executives, because we can’t just spray and pray the data, just give it to people and expect them to immediately know what to do with it. There needs to be a lot more support organizationally than that.

Abi Noda: I think you hit on a really important point there. And we see this all the time. You can aggregate all this data, create really awesome dashboards and reports, and you can put it in front of all your teams. And some of those teams are probably going to be really excited and latch on and start using these in their retrospectives and planning sessions. But a lot of teams are probably going to say, “This is great, but I have deadlines to hit. We’re already slammed and this dashboard looks pretty, but this isn’t our number one priority.”

And I think that’s something we see happen. And again, this goes back to I think that great advice from Microsoft, that lesson from Microsoft that, to really drive full organizational change, culture change, and adoption around something like this, you do need to really start at the top if you want to get to 100%. If you’re okay with 20%, 30%, 40%, even 50%, then sure just give this data to your teams and there’s going to be a lot of value created there. But if you want to drive a transformation and aligned effort around this, you need to start by pressurizing the system from the top.

Laura Tacho: Yeah, absolutely. So this use case of engineering teams, engineering organizations using data, incorporating data into their decision making in order to drive continuous improvement is probably the most talked about use case. And if you’re looking at content on LinkedIn, my content on LinkedIn, obviously, a lot of the times we’re talking about this particular use case.

There’s another use case though that is equally as important and probably one that speaks very personally to a lot of you in the audience here, which is those of you who work on systems teams like platform, DevEx, developer productivity teams, infrastructure teams. You need this data in order to know the impact, the adoption of the systems that you’re building, and to get direction in order to know what problems are important to be solved next.

Abi Noda: And I was talking to an executive a couple of weeks ago who said, “Look, in the back of the mind of every senior executive is this question: are the investments we’re making in DevEx, platform, and infra getting ROI, or is it a black hole?” And as you say, this points at the existential challenge for infra, platform, and DevEx leaders. You are often viewed as a cost center. That’s the cold hard reality. And you need data to justify your existence, to shine a light on the opportunities and problems that you are focused on solving.

And you also need data to be able to show that the investment the business is making is actually moving the needle. And so for me, I think a lot about this audience. Obviously, the podcast we do is very focused on this audience, but this is an existential challenge, as you said for leaders. If you work in product, you ship a new feature, customers are happy, you sign that deal. The impact is really clear. With enablement and internal productivity, as obvious as it is to those of us who can see the amount of inefficiency that exists in the organization, showing that to the business, showing that to executives, showing that to the CFO is a big challenge and that’s a really distinct use case for metrics and data that we need to remember and focus on.

Laura Tacho: Yeah. The truth is there is some overlap between these two use cases. Abi and I don’t necessarily have the silver bullet, all the answers of exactly the steps you need to take to improve productivity, but quite a few of the ones we would call out, like driving urgency and increasing pace, can be done at the frontline team level. Managing and up-skilling talent, making sure people are supported, managing quality: those are all things that can be done at the frontline level. There are some things, though, like simplifying complex workflows, that really need to be done at an organizational level. Even if frontline teams are really equipped to use this data for continuous improvement, they will need support from their organization in the form of a services team, an infra team, a dev prod team, a platform team in order to simplify that complexity and really see that advantage accrue over time when it comes to improving productivity and efficiency.

I see some joking, like half joking, half crying, half laughing, about data to justify our existence, but man, if it isn’t the truth, right? Platform teams have, I would say, a higher standard or a higher bar set when it comes to showing impact, showing the ROI of projects. And getting data from your customers, from your users, even though they’re internal users, is so existential for defending your position, getting funding, creating your strategy, and making sure that you’re retaining the funding and resources that you have in order to continue doing the great and impactful work that you do. We just need the numbers to be able to communicate that and tell a better story, and this data really does allow you to do that.

Abi Noda: Let me add a couple things and then I’d love to move on, Laura, to the next section, which we feel, at least for ourselves, is a breakthrough. I think these use cases intersect in many ways. I’ve seen in the comments and the questions folks commenting on this. But one way they intersect is that oftentimes, typically I would say, the folks actually leading, spearheading, and championing these organizational transformations, the flag bearers, are the enablement teams and the enablement organization. And so these two things aren’t happening in isolation from one another. It’s typically the enablement org that is able to bring the data to the table, bring data to the leadership team, that then sparks the organizational transformation and top-down motion around really driving change across the organization.

And another way these two use cases intersect is that it’s always best if both of these use cases are aligned around the same set of core metrics. And this was one of the main goals of the Core 4: to be able to bring these two lenses together. And the reason why that’s so important is that if different people in the organization, if the CFO and CTO, are thinking about engineering productivity through one set of metrics and through one lens, and enablement and infra are looking at the problem through a different lens, there’s a disconnect. One or the other is going to win out, and the one that has the wrong definition is going to lose out on funding and buy-in. So it’s really important to get aligned around the same definitions and set of metrics. And again, that’s one of the things we’ve aimed to do with the Core 4.

With that said, Laura, let’s move on to activities. And again, I’ve been really excited about this part of our conversation because I think this is potentially a breakthrough in terms of how we think about this. Again, this problem, how do we really use metrics in the organization? So give folks a little dive into what we mean by diagnostics versus improvement.

Laura Tacho: Yeah. So we have the use case axis, whether you are an engineering org or a platform team, trying to figure out what’s driving impact and using data for continuous improvement. But within those use cases, there are also different kinds of metrics that you’re going to encounter. One kind is the diagnostic metric. These are metrics that are generally summary metrics. They’re comprehensive. They’re showing trends over time or giving you some directionality. So is this thing moving up or down? Is it trending this way or that way? This is where benchmarks can be really, really important in helping contextualize your performance.

These metrics are the bigger-picture metrics that can feel hard to act on. To find the action, we need a different kind of metric, and that’s what Abi and I are calling an improvement metric. These are metrics that are a bit more focused in their purpose. They’re usually collected more frequently and have a higher granularity, a higher resolution. They can help you guide decisions today or decisions this week. Abi, maybe you want to share your great example of the blood panel versus the continuous glucose monitor to help crystallize this for folks who might be hearing about this framing for the first time.

Abi Noda: Yeah, I think you introduced it well. There are different types of data, different types of metrics and measurements, that are well suited for one or sometimes both of these activities. But I think it’s really important to point out that these are distinct activities. These are different modes, different things that leaders and teams are focused on, and different cadences. So the analogy I love to use is: you go to the doctor once per year and you get a blood panel, and that’s going to give you this diagnostic of how you are doing overall. How do your numbers stack up against other people? Against reference ranges? And then your physician is going to help you interpret those numbers, and based on that, you’re going to maybe make some decisions around things in your life you want to change or areas you want to improve. So you might say, “Okay, I want to lose weight.” Or, “I want to improve my metabolic health.”

Then to accomplish those goals, you’re going to also probably use data, but you’re going to use data differently and probably different types of data. And I think the best concrete example of this is, as you mentioned, the continuous glucose monitor. At least here in the United States, I don’t know if everyone has the same standard, we have a metric called A1C, which is essentially a baseline of your trailing 90-day blood glucose level. And that’s something that’s standard when you go to the doctor. There’s also something called a CGM, or continuous glucose monitor, which is something you actually wear on your body, but it gives you a real-time view into your blood glucose.

And to me, this is a great example because it’s actually the same thing that you’re measuring, blood glucose, in both cases. But the way in which you’re measuring and how you’re using this data is completely different. The A1C blood panel is this periodic benchmark to tell you whether this is an area you should be focused on or make sure you’re still doing well. CGM is something that you’re using hourly, daily, weekly to get really fast feedback to make adjustments and ultimately make improvements, so that the next time you go back to the doctor, that A1C score is hopefully going to have improved as well. Yeah, that’s one of my favorite analogies right now.

Laura Tacho: Yeah. And I think what we see a lot with leaders, or folks who have access to this data and are finding it really difficult to figure out what to do next, is that they’re looking at diagnostic data and expecting it to point them to something to do today or something to do this week. And so just the acknowledgement that, “Oh, I’m looking at diagnostic data. This is great for telling me a trend. This is great for contextualizing my performance. This is great for setting maybe more midterm to long-term goals about overall performance.” That is just very freeing, to know that you’re not doing anything wrong; it’s just that the data you’re looking at isn’t designed to be the thing that’s driving decisions on a daily or weekly basis. You need more granular data that’s collected in a different way, maybe calculated in a different way, in order to do those actual improvement activities within the org.

If there’s one thing that I would want you to take away, it’s that if you feel stuck, I want you to ask yourself, “Is this metric that I’m looking at suitable for the decision I’m trying to make?” And I’ll put this up here on the screen; this is the checklist that I go through when I am either looking at a metric myself or coaching someone on how to think about metrics. Whether we’re in the diagnostic area or looking at data for improvement on a daily or weekly basis, I think about things like frequency: is this collected infrequently or very frequently, low or high? Specificity: how isolated is the variable? Am I taking a measurement of this very complex system and just taking one aggregate measurement, or am I getting really specific about parts of the system?

And resolution: can the system of measurement detect smaller changes, or can it only detect shifts of certain orders of magnitude? The more the metric you’re looking at tends toward the high end on all of these, the more actionable it’s going to be for you and your teams on a daily, weekly, monthly basis; if it’s in the low category, it’s more of a diagnostic metric. If you open up DX and you look at your Core 4 scorecard, know that that’s not supposed to be the thing that you take to your team to make a decision about what to change for this sprint or for this week necessarily. Some teams can be successful that way, no doubt, but we want to look at improvement metrics, not necessarily the diagnostic metrics, for that. It’s like a report card versus a pop quiz, or again, that blood panel versus looking at your glucose to decide if you should take a nap or go for a run.
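
As a rough way to picture that checklist, here is a minimal sketch; the low/high ratings and the simple two-out-of-three rule are illustrative assumptions, not a DX feature.

```python
# Minimal sketch of the diagnostic-vs-improvement checklist described above.
# The dimensions and the "majority of highs" rule are illustrative assumptions.

def classify_metric(frequency: str, specificity: str, resolution: str) -> str:
    """Rate each dimension as 'low' or 'high' and lean toward the matching use."""
    highs = [frequency, specificity, resolution].count("high")
    if highs >= 2:
        return "improvement"  # frequent, isolated, fine-grained: act on it today or this week
    return "diagnostic"       # infrequent, aggregate, coarse: use for trends and benchmarks

# A quarterly Core 4 scorecard number vs. time-to-first-review measured per PR
print(classify_metric("low", "low", "low"))     # -> diagnostic
print(classify_metric("high", "high", "high"))  # -> improvement
```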

Abi Noda: Yeah, there are questions about giving more concrete examples, so let’s use the Core 4. I think both the DORA key metrics and the DX Core 4 are great examples of diagnostic measures. The DORA metrics, originally, were actually called the DORA scorecard or the DevOps scorecard, similar to how we typically present the Core 4 metrics, which is as a Core 4 scorecard. Your Core 4 metrics are going to be this diagnostic. It’s your blood panel. And to tie it back to the health example we were using, when we actually do the diagnostic for Core 4, let’s take a metric like PR throughput. A question that comes up is, “Well, if you’re doing a quarterly checkup on PR throughput, how do you actually measure that? Are you taking the last 30 days of data or the last 90?” If we use A1C, that blood glucose example, it uses a trailing 90-day window.

With Core 4, we typically standardize on a trailing 90-day diagnostic of these different metrics. Then you use that data as your checkup of how we are doing in these different areas. Suppose that throughput, velocity, is an area that your organization wants to focus on. Then you would step away from the diagnostic that is the Core 4 and really drill into different types of data and tools to help you specifically with velocity. So you might look at a real-time continuous monitor of PR throughput. You might look at real-time data around cycle time, lead time, code review time. And these real-time metrics are going to be the tool that you’re using on a day-to-day, weekly basis to ultimately make improvements to velocity. And then when you go back to the doctor, when you get that next baseline diagnostic of your Core 4, you would want to see improvement. That’s one concrete example of taking engineering metrics specifically and making that distinction between how to approach a diagnostic versus what types of reporting and dashboards you might want for the improvement part.
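
As a rough sketch of that trailing-90-day diagnostic, assuming a simple list of merged-PR records; the field names here are hypothetical and not a DX schema.

```python
# Minimal sketch: trailing-90-day PR throughput per engineer (the "blood panel" view).
# The record shape and numbers are made up for illustration.
from datetime import datetime, timedelta, timezone

merged_prs = [
    {"author": "sam", "merged_at": datetime(2025, 1, 10, tzinfo=timezone.utc)},
    {"author": "sam", "merged_at": datetime(2025, 2, 21, tzinfo=timezone.utc)},
    {"author": "ida", "merged_at": datetime(2025, 3, 2, tzinfo=timezone.utc)},
]

def trailing_pr_throughput(prs, engineers, as_of, window_days=90):
    """PRs merged per engineer per week over the trailing window."""
    start = as_of - timedelta(days=window_days)
    in_window = [p for p in prs if start <= p["merged_at"] <= as_of]
    return len(in_window) / engineers / (window_days / 7)

print(trailing_pr_throughput(merged_prs, engineers=2,
                             as_of=datetime(2025, 3, 15, tzinfo=timezone.utc)))
```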

Laura Tacho: Yeah. And I think one important thing to emphasize for all of you here: it’s a skill, one that you’ll need to work on in practice, to be able to look at any diagnostic metric and then identify the improvement metrics that correlate with it, just as Abi laid out in the example. So if you want to improve cycle time, you might be looking at the average cycle time from the last quarter, and that’s a great diagnostic metric, but an improvement metric might be looking at how many PRs have been waiting longer than X hours for a first review. That is going to give your team something to do today, to do this week, and to focus on over a much shorter timeframe. But something really, really specific. Cycle time is a big metric.

If we think back to those scales, we’re looking at something that’s collected quarterly, it doesn’t have great resolution, and it’s also really big. It’s taking into account a lot of different moving parts. We can isolate, though, time to first review or time to first approval. That’s a really specific, high-resolution slice that’s measured at a frequent cadence, and that is going to be a metric that’s really useful for driving improvement. And so for every diagnostic metric, you can do the work to figure out what the improvement metrics are that influence that diagnostic. Do the work then in your org to improve them. And then the next time you take that diagnostic, you’ll be able to see … because I think, as is pointed out in the chat, these diagnostics are often lagging indicators.

They’re looking backward into the last quarter sometimes into the last half year or year, and so they take a while to respond to the current conditions, whereas improvement metrics are much more point in time or current measurements for what’s actually going on in your org.
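
A minimal sketch of that kind of improvement metric, flagging open PRs that have waited longer than a threshold for a first review; the field names and the 24-hour threshold are illustrative assumptions.

```python
# Minimal sketch: open PRs still awaiting a first review past a threshold,
# something a team can act on today. Field names and data are made up.
from datetime import datetime, timezone

open_prs = [
    {"id": 101, "opened_at": datetime(2025, 3, 14, 9, 0, tzinfo=timezone.utc),
     "first_review_at": None},
    {"id": 102, "opened_at": datetime(2025, 3, 15, 8, 0, tzinfo=timezone.utc),
     "first_review_at": datetime(2025, 3, 15, 10, 0, tzinfo=timezone.utc)},
]

def stale_reviews(prs, now, max_wait_hours=24):
    """Return IDs of PRs with no first review after max_wait_hours."""
    return [
        p["id"] for p in prs
        if p["first_review_at"] is None
        and (now - p["opened_at"]).total_seconds() / 3600 > max_wait_hours
    ]

print(stale_reviews(open_prs, now=datetime(2025, 3, 16, 9, 0, tzinfo=timezone.utc)))  # [101]
```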

Abi Noda: And another analogy would be sleep, right? You can go get a sleep study. It’s going to be a comprehensive diagnostic. It’s going to benchmark you. Again, say, here’s what’s good, here’s what’s bad. Of course, you need interpretation, because based on your age, your gender, your DNA, your actual personal sleep needs and tendencies are going to be different. But then to actually improve sleep, you’re going to maybe implement a checklist on good sleep hygiene, or you’re going to get an Apple Watch and track your nightly sleep habits and look to optimize that, or even an oximeter.

Again, really thinking about: am I in the mode of diagnosing and understanding where to focus, or am I in the mode of trying to focus on a thing and improve it? What are the tools I need? I think that distinction is really important, especially when we’re giving guidance to teams and managers, folks who aren’t maybe thinking about the ins and outs of metrics as much as enablement and dev prod folks, who tend to have more time to think about these problems.

And of course, Laura, one of our big goals at DX this year is to bake in this type of clarity into our platform and product as well. So for those of you who are listening and thinking, “Wow, this makes sense, but how do we operationalize this?” Laura and I are working on more guidance resources and baking this into our platform as well.

Laura Tacho: Yeah. I would also be remiss not to say that we’ve given a lot of quantitative examples in this discussion, but absolutely the same methodology of thinking applies to qualitative metrics as well. One example I can think of: someone mentioned quality. We might have a lagging indicator of change failure rate as an indicator of quality, and that’s going to be in the diagnostic as well for the Core 4; that’s our quality dimension there. But you need to have some intermediary metrics in order to keep track of whether quality is trending up or down.

And interestingly enough, a qualitative measure of satisfaction with quality practices has been shown in the Developer Productivity for Humans series to be a fairly good indicator of quality over time, and sort of a leading indicator of whether that change failure rate is going to move up or down, because incidents should happen very infrequently, and the absence of an incident doesn’t mean that quality is actually increasing. So we need to bring developer voices into this methodology as well. Everything that Abi and I said, even though we’re using quantitative examples, definitely also applies to the qualitative data. And both Abi and I firmly believe, and we’ve talked about this very often before, that mixing qualitative and quantitative data when trying to improve developer team performance and productivity really is the only way.

We need to get the whole picture and that can only happen when we bring developers into the conversation, ask them about their experiences, ask for self-reported data. They’re the ones who are using these tools every day. They’re the ones who are very aware of what’s busted and what needs to be fixed, and they’re the ones that can help you fix it as well. So I want to call that out just so that we don’t end this conversation talking a lot about quantitative data without the very important aspect of qualitative data as well.
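
A tiny illustration of pairing self-reported data with system data per team; the team names, scores, and thresholds below are made up for illustration.

```python
# Minimal sketch: put a survey score (the "why") next to a system metric (the "what") per team.
# All names and numbers are invented for illustration.
quality_satisfaction = {"checkout": 3.1, "payments": 4.4}   # survey score, 1-5 scale
change_failure_rate = {"checkout": 0.22, "payments": 0.06}  # share of deploys causing failures

for team, score in quality_satisfaction.items():
    cfr = change_failure_rate[team]
    flag = "investigate" if score < 3.5 and cfr > 0.15 else "ok"
    print(f"{team}: satisfaction={score}, change_failure_rate={cfr:.0%} -> {flag}")
```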

Abi Noda: And that reminds me of one last analogy and then I’ll let you wrap up, Laura. Which is, aerobic fitness. Again, let’s talk about diagnostic versus improvement. Diagnostic of aerobic fitness is a VO2 Max test or an aerobic threshold test. I’ve gotten both. And when you get that diagnostic, you get an idea of how is your aerobic fitness against your age, your gender, your goals, and then if you want to actually improve your aerobic fitness, you are given a set of metrics like heart rate training zones that you want to monitor and target.

And in aerobic fitness, there’s actually a qualitative metric that’s commonly used, which is perceived rate of exertion. In fact, if you have a Fitbit, or I think in Strava, after you do an exercise a questionnaire pops up where you input that qualitative, perceptual measure of effort and exertion, side by side with heart rate. So I always love that example from fitness and health, where you use that mix of qualitative and quantitative to get the full picture of how your training is going.

Laura Tacho: Yeah, absolutely. The myth that only quantitative metrics are objective is just not accurate. It’s a myth. And that’s, I think, one example that really resonates, because we often think about health data as either your heart rate is up or it’s down, but perceived rate of exertion is a really important qualitative metric, just like developer experience is really important to bring into this conversation about developer productivity.

We’re going to wrap up a bit as we’re nearing the end of our time together. We went over quite a bit today. We talked about the challenge that we’re seeing in organizations, which is that when you present a lot of data to engineering leaders and platform teams, it can feel very overwhelming, and we’re seeing a lot of teams just not know what to do next. There are plenty of reasons for that. We want to give you some guidance on how to make progress toward taking that data off of the dashboards and out of the tooling and into your teams.

And I think the first thing to do is to start thinking about the mental model that you have when approaching the data. Is this for continuous improvement organization-wide on those frontline teams or as an engineering leadership group? Or are you a platform team using this data to answer that existential question, are we having impact or not? Once you know your use case, you can figure out what activity you need to do. Are you trying to do diagnostics? Are you trying to look at trends? Are you looking at benchmarks? Are you thinking big picture comprehensive? Or are you in search of a metric that’s going to help you make a decision today or tomorrow or this week?

And so diagnostics and improvement metrics is also another mental model that can be really powerful in making sure you’re looking at the right data in the right context to make the right decision. In a perfect world, Abi and I would tell you, it’s great to know those questions and find the answers before you set up your instrumentation to collect the data, before you author your survey or send it out for the first time. We know that that’s usually not the case. For many of you, you’re getting access to data that’s already there and you’re trying to make sense of it, which is a really difficult job. And so if it feels hard, it’s because it is hard.

But if it feels like you’re looking at a metric and just scratching your head, what do I do with this? Use that mental model. Figure out what’s your use case and then what’s your activity. Is this a diagnostic piece of data that you’re trying to use for improvement? Is this an improvement piece of data that you’re trying to use for big-picture strategy? There’s a mismatch there. And so just being able to identify that can help you make that next step toward using data really effectively in your organizations.

Abi Noda: Just want to say thanks to everyone and there’ve been a lot of great comments and questions, so lots of inspiration for Laura and I on more topics to cover in the future. With that said, we’re out of time for today, so we’ll wrap up. Laura, I’ll turn it over to you for a couple quick announcements and then we’ll wrap.

Laura Tacho: Yeah, Abi and I did mention the Core 4, which is a new framework that we published a few months ago. So you can head to DX’s website and look in our research tab to find more information about this new developer productivity framework that we worked on in collaboration with the authors of DORA, SPACE, and DevEx. It’s a unified framework. It also has industry benchmarks broken down by company size and by sector. So if you’re doing diagnostics and you want some industry benchmarks to look at, DORA benchmarks are of course great; those are pretty popular. You can also look at these Core 4 benchmarks to get a sense of, for example, for PR throughput, what’s median or top-quartile performance. We’ll include these in the email blast that goes out to everyone who’s attending here live or has registered. I’m also teaching my course on developer productivity metrics, where I give you a deep dive into many popular frameworks including DORA, SPACE, DevEx, and the Core 4.

We talk about how to actually use the metrics, how to author developer satisfaction surveys, and what to do with the data. That’s running on February 13th, and you can go to bit.ly/lauratacho if that’s interesting for you. It’s live coaching plus some self-paced material. We’ll send a link out as well in the blast email that comes after this. Thanks, all of you, for spending some time with Abi and me today, but mostly thanks to yourselves, because you’re taking time to invest in your own knowledge and your teams are better off for it. So a pat on the back to all of you for investing your time wisely today. Thanks so much and we’ll see you next time.

Abi Noda: Thank you, everyone.