
How to cut through the hype and measure AI’s real impact (Live from LeadDev London)

In this special episode of the Engineering Enablement podcast, recorded live at LeadDev London, DX CTO Laura Tacho explores the growing gap between AI headlines and the reality inside engineering teams—and what leaders can do to close it. Laura shares data from nearly 39,000 developers across 184 companies, highlights the Core 4 and introduces the AI Measurement Framework, and offers a practical playbook for using data to improve developer experience, measure AI’s true impact, and build better software without compromising long-term performance.

Show Notes

View the slides

The AI hype cycle vs. ground truth

  • The “disappointment gap” refers to the widening space between sensational AI headlines and the lived reality of teams on the ground. Organizations are being pushed to move faster with AI, yet few have defined what success even looks like.
  • Headlines touting “90% of code written by AI” inflate expectations and erode trust. Developers feel let down when tools don’t deliver on the hype. Executives, in turn, expect exponential productivity gains without understanding what’s realistically achievable.
  • The best way to close this gap is with data. Leaders need to ground their AI strategies in facts, not forecasts.

AI’s current role in high-performing engineering orgs

  • In the top quartile of organizations, around 60% of developers are now using AI tools daily or weekly. However, this usage does not translate directly into AI generating most of the code.
  • These organizations are seeing the best results because they invest in enablement, support, and identifying practical use cases that actually work.
  • Across nearly 39,000 developers at 184 companies, the average reported time savings from AI use is 3 hours and 45 minutes per week. It’s a meaningful uplift, but not a silver bullet.

Engineering leaders must shape the narrative

  • Engineering leaders are also business leaders—and they need to take on the responsibility of educating peers and execs on what AI adoption actually looks like.
  • Effective leaders can clearly answer:
    1. How is our organization performing today?
    2. How is AI helping—or not helping?
    3. What are we doing next to improve?

Back to basics: what defines engineering excellence?

  • A shared definition of engineering performance is essential before measuring the effects of AI. The DX Core 4 framework offers this foundation.
  • Core 4 combines elements of DORA, SPACE, and DevEx into a single, balanced model with four key dimensions: speed, effectiveness, quality, and impact.
  • These metrics must be evaluated together. Optimizing one at the expense of another (e.g., speed at the cost of quality) risks destabilizing the system.

Developer experience drives performance outcomes

  • Developer experience is the strongest performance lever available to engineering organizations. The DXI (Developer Experience Index) measures 14 evidence-based drivers of experience and correlates directly with time savings.
  • For each DXI point gained, developers save 13 minutes per week. While that may seem small, the impact scales dramatically across teams, as the sketch after this list shows.
  • Block used DXI to identify 500,000 hours lost annually due to friction—data that directly shaped their investment decisions and enabled faster delivery without compromising quality.
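A quick back-of-the-envelope sketch of that scaling, in Python. The 13-minutes-per-point figure comes from the episode; the team sizes, the five-point gain, and the 46-working-weeks assumption are illustrative placeholders. (At 46 working weeks, one point per developer works out to roughly the 10 hours per year cited later in the talk.)

```python
# Back-of-the-envelope scaling of the DXI time-savings figure.
MINUTES_SAVED_PER_DXI_POINT_PER_WEEK = 13   # figure cited in the episode
WORK_WEEKS_PER_YEAR = 46                    # assumption: ~46 weeks after PTO/holidays

def annual_hours_saved(developers: int, dxi_points_gained: float) -> float:
    """Org-wide hours saved per year for a given DXI improvement."""
    weekly_minutes = developers * dxi_points_gained * MINUTES_SAVED_PER_DXI_POINT_PER_WEEK
    return weekly_minutes * WORK_WEEKS_PER_YEAR / 60

for team_size in (50, 100, 200):
    hours = annual_hours_saved(team_size, dxi_points_gained=5)  # hypothetical 5-point gain
    print(f"{team_size} devs, +5 DXI points: ~{hours:,.0f} hours/year")
```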

A complementary framework for measuring AI

  • The AI Measurement Framework adds clarity by tracking the effect of AI across three pillars: utilization, impact, and cost.
  • Utilization captures how broadly and consistently AI tools are being used. The biggest gains typically come when teams move from occasional to consistent usage.
  • Time savings per week is the metric the industry has most broadly aligned on for measuring AI’s impact.
  • Cost includes not just licenses but also investment in training and enablement—areas that are often overlooked but essential for success. (A minimal computation sketch follows this list.)
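For a concrete sense of how the three pillars might be tallied, here is a minimal sketch. The record schema, seat counts, and dollar figures are hypothetical; the framework defines the pillars, not this particular code.

```python
from dataclasses import dataclass

@dataclass
class DevRecord:
    uses_ai_weekly: bool         # from telemetry or self-report (hypothetical field)
    hours_saved_per_week: float  # self-reported time savings

records = [DevRecord(True, 4.0), DevRecord(True, 2.5), DevRecord(False, 0.0)]

annual_license_cost = len(records) * 20 * 12  # assumption: $20/seat/month
enablement_cost = 5_000                       # training/support: easy to overlook

utilization = sum(r.uses_ai_weekly for r in records) / len(records)
avg_savings = sum(r.hours_saved_per_week for r in records) / len(records)

print(f"Utilization: {utilization:.0%}")            # breadth/consistency of use
print(f"Avg time saved: {avg_savings:.1f} h/week")  # the headline impact metric
print(f"Total annual cost: ${annual_license_cost + enablement_cost:,}")
```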

Using both frameworks together creates clarity and confidence

  • Core 4 answers: “What does high performance look like?”
  • The AI Measurement Framework answers: “How is AI affecting that performance?”
  • Together, these frameworks enable leaders to move beyond guesswork and act with clarity, especially during times of rapid change.

AI is a multiplier—but only with the right foundations

  • Accelerating software delivery with AI is possible, but it requires strong fundamentals in place. Cutting corners on quality, stability, or developer experience for short-term gains can create long-term damage.
  • When grounded in solid frameworks and real data, AI can improve velocity, collaboration, and developer satisfaction without compromising core engineering values.
  • Better software faster is possible—not by chasing hype, but by aligning teams on what matters and measuring what works.

Timestamps

(00:00) Intro: Laura’s keynote from LDX3

(01:44) The problem with asking “how much faster can we go with AI?”

(03:02) How the disappointment gap creates barriers to AI adoption

(06:20) What AI adoption looks like at top-performing organizations

(07:53) What leaders must do to turn AI into meaningful impact

(10:50) Why building better software with AI still depends on fundamentals

(12:03) An overview of the DX Core 4 Framework

(13:22) Why developer experience is the biggest performance lever

(15:12) How Block used Core 4 and DXI to identify 500,000 hours in time savings

(16:08) How to get started with Core 4

(17:32) Measuring AI with the AI Measurement Framework

(21:45) Final takeaways and how to get started with confidence


Transcript

Hi, friends. So if you caught Gergely’s talk this morning, you heard about this really big gap between how organizations are actually using AI and what we read in the headlines and on social media. And this gap is called the disappointment gap, and a lot of us are stuck in it right now. So I want to talk to you today a bit about building better software faster, about being a leader in a time of constant change in the age of AI.

My name is Laura Tacho. I’m the CTO at DX. DX is a developer intelligence platform. We take data from all of your system tooling and pair it with data from your developers about their developer experience so you can get deep insight into not only the performance of your organization, but also what you can do to remove friction and improve efficiency. Every day I help software engineering leaders, just like yourselves, get really deep insight into how their organization is doing and what they could do better, and I’m hoping to give you some of that wisdom and lessons that I have learned today as well.

So let’s get back to this question. How much faster can we go with AI? This question is dominating every conversation that I’m having with VPs and CTOs and other engineering leaders, staff engineers, you name it. Your board wants the numbers, your CEO is asking you for timelines, your engineers just want tools that work and don’t feel gimmicky. But here’s the problem with this question: we’re really focused on the faster part, and a lot of us haven’t stopped to think about where we’re even going.

There’s so much space for innovation with AI. We can use it in new and novel ways, but at the end of the day, AI is a tool that needs to help us improve the system. We need to think about using AI on an organizational level if we really want to see the organizational benefits that are being promised.

We agreed a long, long time ago that lines of code are not equal to developer productivity, right? We’ve established this already. But when we look at what’s in the headlines about AI, there’s a huge focus on the percentage of code written or lines of code written, really focusing on the output. And AI can have tremendous impacts on output, for better or for worse, but really AI is a tool that improves developer experience. It’s not about more lines of code. We need good developer experience, we need reliability, we need long-term stability and maintainability if we want this tool to actually help our organizations win.

But let me tell you, the headlines these days aren’t really helping us as leaders stuck in this disappointment gap. If you just open LinkedIn right now, don’t do it on conference Wi-Fi, but if you were to open LinkedIn right now, it doesn’t take you very long to come across some sensational headline about 30% of code being written by AI, or 60% of code, or, in three to six months, 90% of code. And we’re hearing that AI will simultaneously replace all junior engineers, but also all senior engineers, and eventually all engineers.

And this is the reality of the situation. These headlines are headlines for a reason, right? It’s like looking at a glossy fashion magazine filled with Photoshopped models, or looking at an advertisement for food that’s all been fluffed and zhuzhed and the milk is actually shaving cream, right? It just doesn’t hold up to reality. And I’m not saying there aren’t great things happening with AI. It’s just that we have to have realistic expectations, otherwise we’re going to stay stuck in that disappointment gap.

The antidote to all of this hype is data. Data beats hype every single time, and you, to be a leader during this time of change, need to have the right data to explain the capabilities but also the limitations, and to explain what a realistic expectation looks like: to yourself, most importantly, to your leaders and your organization, and also to your development teams.

And again, AI can be really transformative. And I’ve been working with organizations that are very far along their AI journey that have seen some tremendous results, but that’s not what we’re seeing in the average organization right now. Among industry peers, people like me who have access to large amounts of data across the industry, it’s well understood that those headlines don’t reflect the reality on the ground. And if you caught Gergely’s talk this morning where he went through the particular personas and what they might have at stake with claiming things about AI, you might understand that a little bit better. We’re seeing lots of claims. Some people make those claims because they have economic benefit. Some people make those claims because they want to help others, and there’s just a lot of different media coverage all over the place.

This hype around AI is in many ways the biggest barrier to its adoption, because what happens is that developers, especially, see these headlines and they think it’s promising them the world. They go to use the tool, and no tool can live up to the expectations in those headlines because they’re sensational headlines, and so perhaps then developers abandon the tool or they perceive it as just a gimmick. On the other hand, we have leaders who maybe don’t have a background in software engineering seeing those headlines and having inflated expectations of what can actually happen in reality. This quote is from Brian Houck, who is a co-author of the SPACE framework of developer productivity. He’s researching AI and its impact at Microsoft right now. He’s incredibly smart. We have a podcast episode with an interview with him, actually two, and I would highly recommend checking those out for your plane ride or train ride home.

So here are some statistics from real organizations. Right now, the organizations that are doing the best, the top quartile, the top 25% of organizations, have about 60% of developers using AI tools on a daily or weekly basis. This is not necessarily in line with 30% of code, or 100% of code, being written by AI right now. So there’s quite a gap. Don’t get stuck in it.

But these organizations, and every organization that is putting muscle and money into training, support, and enablement to help engineers adopt AI tools and identify the use cases that are working, are seeing this number rise every day. We’re getting the tools into the hands of the developers. They like using them, and they’re seeing good organizational results and good individual results.

On average, developers who use AI tools are saving about three hours and 45 minutes per week by using AI to assist or augment their development workflows. This is data across 38,880 developers at 184 companies, data that we gathered in Q1 and Q2 of this year, 2025. So this is very, very fresh data. This is what we’re seeing in reality on the ground right now at organizations just like the ones that you work at. And in fact, some of you here, your data is reflected here in these numbers. So these are very promising results, but not necessarily near what we see in the sensational headlines.

You have a really hard job to do right now as an organizational leader, and if you don’t already think of yourself as a business leader, I want you to put that hat on, just for a second, because the reality is that you have subject matter expertise in engineering, and that’s what makes you great at your job. But if you are in a leadership position, you are an organizational leader. You need to be leading on the business side. And whether you like to hear this or not, it is your responsibility to educate around the hype. It’s your responsibility to educate others in your company about what to expect from AI and how it’s actually impacting your organizational performance.

To do that, you need two things. First, we have to go back to the fundamentals of what great engineering performance actually looks like. It’s really easy to get distracted by a shiny new tool when we don’t have a really strong foundation of what it is we’re chasing after.

So the first thing that you need to do is have a very good definition of engineering performance and excellence that’s a common language across your business so that you’re all speaking the same language and you have a really solid foundation of principles and a definition of performance to go back to.

Then you can enhance that definition, build on top of it, and look at some very specific AI measurements that are going to help you measure the impact of AI in your organization. You need to know what’s working with AI, what’s not working with AI, and how it’s improving or not improving some of those foundational measurements of performance in order to make the right choice of what to do next.

I talk with a lot of engineering leaders every day, and the truth is a lot of us still feel like we’re showing up and guessing, especially when it comes to the AI part, because it’s all so new and it’s all changing so fast. If that’s you, that’s not because you’re a bad leader or you’re bad at your job, it’s just because this is really hard. It’s changing really, really fast, and it can be really hard to keep up with. This is all brand new stuff. I don’t want you to feel like you’re on your own, and we’ve been doing research on this for a year plus, so let me help you get some definitions and help you show up to those meetings with confidence.

I want you to show up to every single meeting with your exec team, your leadership team when they’re asking, “Hey, what are we doing with AI,” I want you to be able to tell them three things. First is how you’re performing as an organization in general. The second is how AI is helping you or not helping you. And the third thing is, what are you going to do next?

And again, it is your job to educate others around you, especially your business counterparts, cross-functional stakeholders, cross-functional counterparts. I know this can feel unfair that we have to bear the burden of educating around AI, and I’d rather have you hear it from me now and say, oh, I really don’t like that, than realize it three months from now when it’s a little too late and too much time has passed and you have to go undo some things.

So let’s get back to the question in the room. What does it actually take to build better software faster? Can AI do this for us? Well, AI is accelerating the good, solid fundamentals of software delivery and software performance. It’s changing the way we work, absolutely, but it’s not changing those fundamentals. We need quality, reliability. We need a great developer experience. We need teams who can collaborate with each other. We want a fast time to market so we can get feedback from our customers and adjust our course accordingly.

You also need to protect your org from long-term damage. And what I mean by that is not sacrificing long-term velocity, stability, and maintainability for short-term gains when it comes to AI, because that is the reality for a lot of companies. We’re producing more code because AI makes it easier to produce more code. And if we’re not taking care to make sure that code is high quality, that our pipelines are ready for that code, that our SRE operations can support that amount of code going into production, we can do some lasting damage, which we don’t want, so we need to have a complete picture.

The first thing I’m going to share with you is the way to align your organization on a definition of excellence. The DX Core 4 is a framework that I co-authored with Abi Noda, who is the co-author of the DevEx framework. This framework is a collaboration between the two of us, and also Nicole Forsgren, who founded DORA, Dr. Margaret-Anne Storey, who’s the co-author of the SPACE framework, and many other researchers.

The idea here was to bring together DORA, SPACE, and DevEx and put it together in one framework that is easy to deploy and easy for you to use. This framework measures across four different categories: speed, effectiveness, quality, and impact. These are all holding each other in tension, so the idea here is you have to look at all of them together. Just like the strands of a spider web hold each other up, if one dimension goes down, the whole structural integrity is compromised. And the same thing goes for the Core 4. We can’t optimize for speed at the expense of developer experience. We can’t invest so much in quality and pay so much attention to it that our innovation and business impact grinds to a halt.

Dropbox, Etsy, Pfizer, hundreds of other companies are already using Core 4. Dropbox was a very early adopter. And Drew Houston, who is an engineer but also co-founder and CEO of Dropbox, says that, “Core 4 gives you a much more cohesive picture of what’s happening in your organization. It answers the question, what does performance look like?”

And Drew, being an engineer, knows that developer experience is at the heart of great engineering performance. Developer experience is the biggest lever that you can pull as an engineering leader to improve your organizational performance, your team performance, and your individual performance. And this is not just my opinion. This is what the data says.

Over 20% of time is lost due to friction across the whole developer population that we’ve been studying as we’ve developed the Core 4. Twenty percent of time. I know some of you may look at that number and think it seems a little low. Twenty percent of time is lost to friction, to inefficiencies and poor tooling.

When we study developer experience and its correlation to time loss, there is such a strong correlation that we were able to make a model out of it. The DXI is the way that we measure developer experience. It’s built on 14 research-backed, evidence-based drivers of developer experience, and the DXI is a composite metric that reflects those drivers.

For every one-point increase in DXI, developers save 13 minutes a week, or about 10 hours annually. That might not sound like a lot, but if you think about your developer population, 50, 100, 200 developers, and how much opportunity you have to improve developer experience, that time adds up so quickly. It is so valuable to your organization. Time is a very scarce commodity, and the fact that 20% of it is being lost due to poor developer experience is a critical business problem. So if you are in a meeting trying to explain the value of developer experience, DXI, mapped back to time and perhaps to recovered salary, is a really effective way to show up to those meetings prepared with a rock-solid business case.
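A hedged sketch of that business case, in Python. The 20% friction figure and the roughly 10 hours per DXI point per year come from the talk; the headcount, fully loaded cost, working hours, and eight-point gain are hypothetical placeholders to swap for your own numbers.

```python
# Hypothetical business case: translate friction time and DXI gains into dollars.
developers = 200                     # assumption: your headcount
fully_loaded_cost_per_dev = 180_000  # assumption: annual fully loaded cost, USD
working_hours_per_year = 2_000       # assumption

friction_share = 0.20                # ~20% of time lost to friction (from the talk)
hours_per_dxi_point_per_year = 10    # ~10 h/dev/year per DXI point (from the talk)

cost_of_friction = developers * fully_loaded_cost_per_dev * friction_share
dxi_gain = 8                         # hypothetical improvement after fixing friction
hours_recovered = developers * dxi_gain * hours_per_dxi_point_per_year
salary_recovered = hours_recovered / working_hours_per_year * fully_loaded_cost_per_dev

print(f"Friction costs ~${cost_of_friction:,.0f} per year")
print(f"+{dxi_gain} DXI points recovers ~{hours_recovered:,} hours (~${salary_recovered:,.0f})")
```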

Block, which is the company behind Cash App, Square, and TIDAL, has been using the metrics in the Core 4. They used DXI to identify 500,000 hours annually that they were losing to inefficiencies and friction. What they did then is they took the data around that developer experience and decided what to invest in next. And Azra, who leads the developer experience team over at Block, said, “Those improvements then helped us move faster,” great, “without sacrificing quality or focus.” And it’s so important to have that without clause in there for you as well, in your own organization.

You don’t need to be a big, huge company like Block and have a dedicated DevEx team specifically to be able to use Core 4 to make better decisions about where to invest and to assess the performance of your engineering organization. We designed it to be easy to deploy even on your own. And so here’s how you can do it.

Here’s a template that you can use with Google Forms, Typeform, pick your form, whatever you want. Make sure that when you’re choosing a tool to run this survey, you’re doing it in a way that lets you export the data into Google Sheets or Excel, because you’re going to have to do a little bit of computation. This way you can get all of the core metrics from Core 4, all the primary metrics, from self-reported data, and make better decisions in days or weeks, not months.
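As a minimal sketch of that computation step, assuming the survey export is a CSV with one numeric column per question; the file name and column names below are hypothetical, so match them to whatever your template actually uses.

```python
import csv
from statistics import median

# Hypothetical column names -- rename to match your survey template.
QUESTIONS = ["speed_score", "effectiveness_score", "quality_score", "impact_score"]

with open("core4_survey_export.csv", newline="") as f:  # hypothetical export file
    rows = list(csv.DictReader(f))

for question in QUESTIONS:
    scores = [float(r[question]) for r in rows if r.get(question)]
    print(f"{question}: median {median(scores):.1f} (n={len(scores)})")
```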

The other thing that we designed the Core 4 with is industry benchmarks. So I know firsthand going into some of those big, scary meetings, sometimes you need to have a little bit more ammunition to make a business case for something, and industry benchmarks have been highly valuable for that. It’s one of the reasons that DORA became so popular because they have industry benchmarks, and so we wanted Core 4 to be benchmarkable.

These are open for you all. You can go check out the benchmarks. You can use the survey template, then compare your results to the benchmarks here. You’re going to get median and 75th-percentile values as well, so you can see where you’re doing really well and where you might need a little bit more investment.
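A tiny illustrative comparison against benchmark values might look like the following; the metric name and benchmark numbers here are made up, so substitute the published Core 4 benchmarks and your own survey results.

```python
# Made-up benchmark values for illustration; substitute the published numbers.
benchmarks = {"dxi": {"p50": 68, "p75": 76}}
our_results = {"dxi": 71}  # hypothetical survey result

for metric, value in our_results.items():
    b = benchmarks[metric]
    if value >= b["p75"]:
        status = "at or above 75th percentile"
    elif value >= b["p50"]:
        status = "above median, below 75th percentile"
    else:
        status = "below median: a candidate for investment"
    print(f"{metric}: {value} ({status}; p50={b['p50']}, p75={b['p75']})")
```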

The Core 4 is designed to be a comprehensive, robust, future-proof way to measure engineering performance. This is the whole point of software. The whole point of developing software is to do it in a way that’s sustainable, high quality, high business impact, and that’s what the Core 4 is designed to measure. When you introduce any new tool, whether it’s AI or a new framework or a new process, you’re going to see the impact in the Core 4 metrics.

So with AI, we do want to accompany the Core 4 with some specific measurements that capture the effect that AI is having on our organization. And so I’m actually very proud to share this more widely, for the first time, here at LeadDev. We published it just a few days ago.

The AI Measurement Framework is the result of a year’s worth of research at top companies out in the field. We also collaborated with Cursor, with DORA, with Sourcegraph, and plenty of other individuals at AI companies and large corporations as well. The AI Measurement Framework gives you hard metrics and very clear guidance on what to capture across three different categories: utilization, impact, and cost.

Utilization is the first part of the story here. Tracking AI tool usage is important because, according to our data, we see the biggest uplift from AI when people go from being non-users to being persistent, consistent, daily or weekly users. Even going from non-usage to occasional weekly or monthly usage can bring quite a lot of uplift. So we want to get the tools into the hands of more developers, because that’s where we’re seeing the biggest gains.
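A minimal sketch of that utilization segmentation, assuming you can pull a days-active count per developer from tool telemetry; the field name and bucket thresholds are assumptions, not part of the framework.

```python
from collections import Counter

# Hypothetical telemetry: days each developer used an AI tool in the last 30 days.
days_active_last_30 = {"dev_a": 22, "dev_b": 3, "dev_c": 0, "dev_d": 12}

def segment(days: int) -> str:
    if days == 0:
        return "non-user"
    if days < 8:                 # assumption: roughly monthly/occasional usage
        return "occasional"
    return "consistent"          # weekly-or-better usage

counts = Counter(segment(d) for d in days_active_last_30.values())
for bucket in ("non-user", "occasional", "consistent"):
    share = counts[bucket] / len(days_active_last_30)
    print(f"{bucket}: {share:.0%}")
```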

On impact, it’s so important to keep in mind that AI is a tool propelling you to some outcome. It’s not just there for the sake of being there. We need to look at its impact on Core 4 metrics and on developer satisfaction with the tool. Our top metric that we recommend tracking in terms of impact is time savings per week. That’s where the industry has aligned. Google published an article outlining how they do their measurements, and AI-driven time savings is also at the top of their list, so that’s a good place to standardize your measurement.

And then finally, cost. Cost is an important part of this conversation because we are in the business of business. And so if you want to have better conversations about the ROI of these tools, understand where to invest, where not to invest, that’s an important thing. It’s not just about the cost of licenses though. It’s also about the cost of training, enablement, and support. You want to make sure you’re not underspending in those categories.
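Here is a hedged ROI sketch for the cost pillar. The 3.75 hours per week is the industry average cited earlier; every other number (seats, license price, enablement budget, hourly value, working weeks) is a placeholder to plug your own data into.

```python
# Hypothetical ROI comparison: total AI spend vs. the value of measured time savings.
developers = 100
license_per_dev_month = 25            # assumption: $/seat/month
training_and_enablement = 40_000      # annual: workshops, champions, support
hours_saved_per_dev_week = 3.75       # industry average cited in the talk
value_per_hour = 90                   # assumption: loaded hourly cost, USD

annual_cost = developers * license_per_dev_month * 12 + training_and_enablement
annual_value = developers * hours_saved_per_dev_week * 46 * value_per_hour  # ~46 work weeks

print(f"Annual cost:  ${annual_cost:,}")
print(f"Annual value: ${annual_value:,.0f}  (ratio {annual_value / annual_cost:.1f}x)")
```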

Again, like the Core 4, we want to look at all of these dimensions together and not overemphasize one or the other, because overemphasizing, for example, utilization gives us that tunnel vision where we might see some short-term gains but forget about all of the other fundamentals in our foundation of what software engineering excellence looks like.

Booking.com has used this measurement framework to inform their own strategy of rolling out AI to their developer population of 3,500. Zane, who’s on the DevEx team at Booking, said, “It showed us where to focus and helped us get way more impact out of the tools, both in how deeply and how widely they’re being used.” They have 3,500 engineers. They saw 65% higher adoption and saved an additional 150K hours by looking at metrics like the ones in the AI Measurement Framework and then figuring out what to do next.

When you combine these frameworks, you have Core 4 answering the question of what great organizational performance looks like, and the AI Measurement Framework telling you what the impact of AI is in your organization. You can make confident decisions and be sure that you’re operating with the right data to know what to do next.

The leaders who have the right data are going to win. We don’t want to walk into a meeting and make the wrong decision based on the wrong information, so you need to have the right data in order to adapt quickly. And using the Core 4 together with the AI Measurement Framework is going to help you lead during this time of rapid change.

AI can help us build better software faster, but it is up to you to figure out how to leverage it best for your particular organization. Looking at industry trends, like knowing that the average developer is saving 3.75 hours a week, is very important for contextualization, but you must know what AI is actually doing on the ground in your own organization in order to make better decisions. I don’t want you to stay stuck in that disappointment gap between inflated expectations and reality. Data beats hype every time, so it’s on you to get the right data.

So if you are an engineering leader who needs to communicate engineering performance out to the rest of your leadership team, maybe to your board, or if you are worried about AI adoption and don’t know how to measure it or what to do next, you have some work to do when you get back to your desk on Wednesday morning. You can start using the Core 4 to align your business on a solid definition of what engineering performance looks like. Use the template, use the benchmarks; those are all free and available for you to use. You can identify points of friction, go fix them, and improve your developer experience.

Then you can use the AI Measurement Framework to start tracking how AI is actually having an impact on the development population at your company. Think about it as a research project: what’s working, what’s not working, how can we replicate some patterns, and where might we need to try something different?

Again, I want you to go into every single meeting with confidence and the data and frameworks to back it up. I want you to be able to answer three questions. What is organizational performance, what’s the definition, and how are we doing? Second is what is AI actually doing in our organization, what’s the impact? And the third thing is what are we going to do next?

Better software faster is possible with AI, but AI is not a magic button that’s going to do it for us. I don’t want you to be stuck in that disappointment gap. Data beats hype every time, and it’s up to you to get the good data.

Hope to see you around the event, and otherwise, I will see you around the internet. Take care.