Laura: Bruno, it’s so nice to have you here. Thanks for joining us today.
Bruno: Thanks for having me.
Laura: Can you give us a bit of idea of the scale that Booking.com operates on and also a little bit about your role there?
Bruno: Yeah, sure. So I work on the product side of developer experience, and over the past year and a half-ish, I’ve been overseeing how we roll out GenAI, or how we discover and enable engineers to use GenAI for engineering at Booking.com. For those who don’t know, Booking.com is one of the largest OTAs, so online travel agencies, on the planet. We serve about 1.5 million room nights a day. And from a technical perspective, our developer team is over 3000 people, with about 250,000 merge requests a year, so that should give an idea of the scale. We run around 2.8 million CI pipelines, and we are very experiment-driven. It’s been part of our culture from the beginning to A/B test pretty much everything that we put forward, and we try to bring that to the engineering side as well. So hence the quest to discover the value and where GenAI fits into our engineering community.
Laura: When you introduced AI, obviously you have a really strong experimentation mindset, so this wasn’t just bringing in AI because it was the cool thing to do or because it was new and novel. Of course, there’s a lot of promise, a lot of excitement, but you started out from the very beginning with an experimental mindset, trying to figure out what the capabilities were and where it could be best applied. What were some of the specific business goals that you hoped bringing AI tooling to your 3000-plus developers would achieve?
Bruno: Yeah, really good question. So our main goal was: can we accelerate the speed at which developers do what they do, but also, can we remove the toil of their day-to-day work? Anything that is boring KTLO, keeping the lights on, and not adding any value to the business, could we potentially remove that from the day-to-day of the developers so that they focus 80, 90% of their time on innovating on the problem we want to solve, which is travel? So that was the main goal for us amid the whole GenAI loudness that was in the industry.
Laura: And that business impact metric, the innovation ratio, that’s the one you’re paying close attention to. Just to give the listeners out there some comparative value, we have that innovation ratio as part of our Core 4 framework and we have global benchmarks for it, from tech and non-tech companies of all sizes. When you say 80 to 90% innovation, that’s above even the P90, the 90th percentile, the top 10% of all those companies. That’s outstanding, right?
I think that’s a really admirable goal, and it’s something more and more businesses are looking at: that metric, and the thinking that if we can spend more time on innovation, we are more defensible. It’s all about shortening time to market and making sure that we’re spending the most time on delighting our customers, so that’s a really good place to start from when you’re bringing in AI to solve a specific problem: we want to increase the innovation ratio. The top 10% of companies, I think, are still only spending… I’d have to look at the specific number, I think it’s 65% of their time on innovation. So trying to get up to 80% is a very aggressive goal, but AI can bring a sort of unlimited ceiling of benefits.
Bruno: We think so. As we get deeper into exploring it, we see that some areas of the company will find it easier to get closer to 80%, others not. And so we are trying to apply GenAI to fast-track some of those areas where our developer experience is a little bit slower, where time to market is a bit slower. We are trying to see how we can leverage AI to make it faster. For instance, re-platforming our code base, moving from a more legacy code base to a newer code base. And we see some signs of it being useful there.
Everything up until now is signs, right? We are evaluating and seeing that it could work in some areas, but it’s very much early. For instance, the 80 to 90% is super ambitious. We think we can be and should be ambitious, but it’s not yet guaranteed we’d get there, right?
Laura: Yeah, it’s a good horizon to chase. How old is Booking actually as a company?
Bruno: Yeah, I believe that we are on our 30th birthday this year. I might be wrong. I’d have to look at the numbers, but certainly over 25 years.
Laura: Yeah.
Bruno: And another important factor to mention here is that because of the culture of experimentation, we got to where we are today in success, but also in some more messy parts of our code base, right?
Laura: Mm-hmm.
Bruno: As the experiments accumulate and the site becomes more and more optimized, it also becomes very hard to get an experiment right. And so what that means is that we try to go super fast in throwing experiments out there to see if we can get to a positive experiment, and cleaning up those experiments becomes harder and harder as we try to move on to the next business problem. And what we see today is that we have quite a large amount of debt when it comes to cleaning up those experiments, which inherently makes the developer experience slower, cycle time slower, and more prone to errors.
Laura: I think when a lot of folks hear about such an ambitious goal, like 80 to 90% spent on innovation, what comes to mind is a brand new company that has a code base that’s less than five years old, and that is not the environment that you’re operating in. You’ve got a lot of… I mean there’s Greenfield stuff, but there’s a lot of Brownfield stuff as well. You’re dealing with legacy code, you’re dealing with migrations. There’s really complex gnarly problems to solve. And maybe despite all of it, or maybe because of it, that’s why you’re able to… You’re experimenting really quickly in applying AI to very specific business problems, which is a great way to start. I want to walk back to the beginning of your journey with AI. And so how far in the past was this? Two years ago?
Bruno: Yeah, about one and a half years ago. We started at the very end of December 2023… But let’s say we put all our focus into it in 2024. And yeah, we started with the hype, right? At the time we worked with Sourcegraph’s search, which is an amazing product by the way. If you don’t know about it, I suggest you look into it. They helped us navigate our more legacy codebase via their search product. And so they had the context of our codebase already, and they launched a product called Cody, and that was a no-brainer: with the context that they had about the codebase, adding Cody to it with an LLM in front of it made it really useful for us to be able to experiment a ton. And so in the beginning it was about the hype. It was this bogus metric of hours saved, and everybody was talking about hundreds of-
Laura: The promise, right?
Bruno: Yeah. That was a huge promise and so we started that and a year and a half later, I think the world is changing dramatically. I remember when we first started, we only had the choice of one LLM for the entire company. The developers couldn’t choose an LLM to play with. Sourcegraph kind of paired with us and partnered with us to be able to make sure that we changed it so we could have the best shot at trying this new paradigm and technology.
Laura: Yeah. Things have really become commoditized quickly in AI. I mean, thinking a year and a half ago is like… Is it dog years, AI years? It’s a long time ago. Things have changed really rapidly, and I think when you started, or Booking started, your journey with an AI tool, it was the same way a lot of companies are starting theirs: you had an existing vendor that offered an AI tool and the on-ramp was very easy because it was just there. And that’s the experience of… I know a lot of Cody users, and a lot of Copilot users as well; you have the vendor already, and so people just start using it. What was the initial adoption like when Cody was first available?
Bruno: It was really, really low. We suspected at the time that developers didn’t know how that would play out in terms of sharing a job with an AI.
Laura: With a robot.
Bruno: So I think there was a bit of fear of, will I lose my job because of this? And so for us it was very important to restate that goal: we would like to make you as productive as possible so you can focus on the things that are actually nice and sexy to focus on from an engineering perspective. And so with time, we started massaging that muscle. In the beginning we had very little adoption. The second point that made that a thing was the lack of knowledge. We realized that no one knew, one, how to interact with an LLM. What was an LLM? What was the whole prompting and engineering behind it? How do you give context to an LLM? But also, what can they use an LLM for legally, right? We are a company that is quite big, we’re a public company, and so it becomes a bit of a fear to understand where the line is.
And so for us it was very important to show them where that line was so they could really give their best. And so we saw adoption rising as we started working directly with the developers.
Laura: Yeah. What’s funny is we talked about being able to choose the LLM as something that’s become commoditized. In that way, the world has really changed a lot since 2023, 2024. But as you said, acceptable use policies, not understanding how to use it, needing enablement. These things are still as true today as they were back when you were introducing Cody, and I think a lot of folks are still struggling with those particular barriers to adoption. You, of course, have a really experimental mindset here, and so can you talk me through how you approached measurement and evaluation, treating this initial pilot of Cody as an experiment? What was that like?
Bruno: Yeah, it’s a really good question. So the first metric that we were presented with by Sourcegraph was the concept of hours saved. And that was the metric used across the industry. It wasn’t just a Sourcegraph thing. But the problem with that metric was that it was qualitative only, with a very small sample size of users feeding back over a survey how much time they would save. And so it was the best we had, but for us it wasn’t enough. We couldn’t make business decisions with hours saved. And so it was important for us to strategize how we were going to try to measure, and it wasn’t possible, and in some places it’s still not possible, to measure everything, but we wanted to know, one, speed, efficiency, quality, but ultimately, can we use GenAI to re-platform and rewrite some of our legacy code base?
So that was ultimately the goal, because we knew that if we did that, we would go back to that original goal, which was to remove the toil from the day-to-day of the developer and reduce the cycle time it takes for features to get to the end user. And so that’s where we went, and we paired really nicely with Sourcegraph to try to measure some of those metrics. Then you folks came into the picture as well, and DX was incredibly important for us, and still is, to, one, get a posture on what we are trying to measure, but also how we’re going to measure it. And it’s a journey. It’s been a one-and-a-half-year journey and we are getting somewhere.
Laura: Yeah, definitely getting somewhere. I really like that you called out something a little bit subtle that I just want to highlight, because I think it maybe wasn’t obvious. Because adoption was so low, you didn’t have a sufficient sample size to actually make an accurate business decision. And I think that’s such an important point, because when we talk about adoption, it’s not just adoption for adoption’s sake. When you have an experimental mindset and you’re trying to do testing, we have to orchestrate this experiment properly, support it properly, staff it properly, do proper enablement. And having, as you said, only 10% of your users, especially across all the business units that you have, it just wasn’t sufficient, right?
Bruno: That’s right. Yeah. The main metric, and it’s important that you call that out, in the beginning was: let everybody use it as much as possible. Get everyone to use it. And so that became sort of the north star. As you mentioned, we are split into different business units and those business units have different flavors of development. And so for us, it was very important to be able to capture a healthy amount of users from each of those, so we could really measure how GenAI is having an impact on them. And so the main thing behind adoption was education, so we just doubled down on that. Sourcegraph provided a bunch of material. We started collecting learning paths, so developers knew how to start, but also the bare minimum that we needed them to know in order to 10X their skills with GenAI.
And so once we started teaching and showing developers the basics, we then started pairing with those business units on learning journeys where we’d spend one whole day just walking through prompt engineering: how to write better prompts, how to give the right context, token sizes and limits. Just shoving everything at the LLM wouldn’t do as good of a job as giving it to it in bite sizes. And then on the second day, we’d pair with them on a business problem that we would attempt to solve using GenAI. And that was really cool because it made them ground themselves in a particular problem that they were trying to solve for their business units. And we saw that once that happened, we started getting advocates, and that helped us roll out more initiatives and also POC other providers as well.
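To make the “bite sizes” point concrete, here is a minimal sketch of chunking context before handing it to an LLM. It assumes a naive whitespace tokenizer and a made-up budget; a real pipeline would count tokens with the model’s own tokenizer, and nothing here is Booking.com’s actual tooling.

```python
def chunk_context(text: str, max_tokens: int = 1000) -> list[str]:
    """Split a large piece of context into bite-sized chunks instead of
    shoving everything at the LLM at once. Whitespace-separated words stand
    in for tokens here; this is an illustration, not a real tokenizer."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

sample = "word " * 2500  # stand-in for a large legacy source file
print([len(c.split()) for c in chunk_context(sample)])  # [1000, 1000, 500]
```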
Laura: Yeah. You mentioned in one of our previous conversations that when you think about bringing AI to Booking, it wasn’t just about giving everyone access to licenses, it was really about the cultural change needed to embrace AI as an accelerator for progress. And I think what you spoke about just now is a very illustrative example: it wasn’t just about granting licenses to your developers, you were doing very targeted enablement, training, and support, and changing their attitude from maybe hesitant, or an early user who doesn’t quite know what they’re doing, to someone who has sufficient mastery and then becomes an advocate internally, right?
Bruno: A hundred percent. And you’re right, we tried in the beginning. We were like, “Okay, well, everybody can use this,” and almost no one did. And so for us it was important to change that culture and bring AI into the culture of development. And one important point here is that developers are not the only target; leaders of the different business units need to understand what it is that we are bringing into development today in order to help us facilitate that. And we also realized that at some point we were taking their developers out of their business problems for hours in order to teach them and show them, but the leaders weren’t aware of what was happening. And at the same time, they were going outside of the business and educating themselves on what this was and on the metrics that other companies were measuring, so it was important for us to bring everybody together on the same journey. That education with leadership was also super important. Still is.
Laura: Still is. Yeah, definitely still is. I’d like to get a little bit more detail on this measurement and data visibility challenge, because coming with such an experimental mindset, those things were essential, not just for your evaluation, but also, this is 3000-plus developers. Or sorry, was it 2,500? Was that the number that we want to use?
Bruno: You’re right.
Laura: So you’re supporting 3000 developers here, and these are big decisions just because of the scale that you’re operating at. And so I want to get into a little bit more detail on how you tackled that data visibility and measurement challenge, because I think everyone is struggling with that right now. You talked about hours saved being one measure, but not everything. There was obviously some limit to what the tool could give you in terms of telemetry metrics, and then there’s the big question that, as an industry, we didn’t quite know how to answer. What were some of the things that you looked at, or some of the places that you looked for data, in order to put together a full picture of what the impact actually was?
Bruno: It’s still a big journey that we are going through. For us, going into the hours saved piece and then coming out of it to look at other metrics, we wanted to know what it was that we were doing with GenAI. And so the percentage of the code that is being written by GenAI was a piece that we started looking into. I remember having a meeting with the folks from Sourcegraph, and Grayson from DX was in the meeting, and Grayson, as he does, stayed quiet, listened more than he talked, and then came back the next day and went, “Hey, I have a report here that I think could be really useful in order to build out the value of GenAI so far.”
And I think that shows some of the gold behind the eggs and why we are also pairing with your folks so much, to understand the impact of what’s going on with GenAI. But then we looked into, yes, what is the percentage of the code that’s being written by GenAI, and then started questioning, is this code eventually going to production or not? Which is still a big piece of the puzzle here. We assume that because an MR was created, that code is going to production. And so-
Laura: It’s not always the case.
Bruno: Not always the case at all. And so I think once we know, and I think that’s the journey that we are all on now, once we know what it is that is going to production, it will be easier for us to look back and say: is the code of better quality? Is it less vulnerable? Does it introduce fewer bugs? Is it readable? Is it easy for a human to then interact with that code afterwards, in light of outages and failures? And so I think that’s where we are going, but yes, the time that it takes, for example, to debug the code, or the time that it takes to merge code that has GenAI footprints in it, are some of the metrics that we are looking into right now.
But also one metric that goes into the main goal, which was tech modernization, is how much can we automate the whole cleanup of our code bases? We know that there’s a ton of experiments and feature flags that are lingering around without being used, and so how much can we start automating the deletion and removal of that code? So we started paying attention to that as well.
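As a rough sketch of what that kind of cleanup automation could look for: concluded experiments whose flags still appear in code are candidates for deletion. The registry, flag names, and call style below are all hypothetical assumptions for illustration, not Booking.com’s actual tooling.

```python
import re
from pathlib import Path

# Hypothetical registry: experiment flag -> status in the experiment platform.
EXPERIMENT_REGISTRY = {
    "new_checkout_flow": "concluded",
    "search_ranking_v2": "running",
    "legacy_payment_banner": "concluded",
}

# Assumed call style for reading a flag in application code.
FLAG_PATTERN = re.compile(r'is_enabled\(["\'](\w+)["\']\)')

def find_stale_flag_references(repo_root: str) -> dict[str, list[str]]:
    """Return concluded experiment flags that still appear in source files."""
    stale: dict[str, list[str]] = {}
    for path in Path(repo_root).rglob("*.py"):
        for flag in FLAG_PATTERN.findall(path.read_text(errors="ignore")):
            if EXPERIMENT_REGISTRY.get(flag) == "concluded":
                stale.setdefault(flag, []).append(str(path))
    return stale

for flag, files in find_stale_flag_references(".").items():
    print(f"{flag}: cleanup candidate in {len(files)} file(s)")
```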
Laura: What I’m hearing from you is an approach that I’ve seen pretty commonly across the industry, which is that we have to look at some kind of activity metrics around AI tool utilization. Percentage of code being written, percentage of MRs being created with AI-assisted engineering. Just to understand: are developers actually using this tool? Because even though that’s not the only thing to pay attention to, it is still part of the story, right? But what I appreciated about your outlook, and what you just shared, is that you’re also looking at the same solid performance metrics that you have been looking at this whole time: maintainability, quality, cycle time.
It’s not just about generating more code for the sake of generating more code that has to be tied back to that business impact. And that’s a question I get a lot and I’m sure you get a lot as well, which is what are the longer term impacts of AI? And I wonder, you already shared a little bit about it, but are there other things that you’re looking at to measure the longer term, mid to long-term impact of AI?
Bruno: I think the more lagging metrics like change failure rate, and cycle time in general, those metrics, those Core 4 metrics, need to stay sort of like-
Laura: Front and center still.
Bruno: I think because that’s how we do DevEx and how we measure DevEx in general, GenAI needs to bind to that. Then there’s more of the soft side of things, like, for example, are developers enjoying what they’re doing? So we measure developer CSAT, paired with the tooling that we try, so we can see what is coming into their lives with a lower barrier to learning, or what they’re enjoying the most. So that’s also important for us to look at as well.
Laura: Okay, great. Can we talk a little bit about the specific interventions that you did organization-wide in order to increase adoption? Thinking back a year and a half ago, but also, I know some of these things are ongoing. You mentioned a couple of really interesting things, like putting people in a room and giving them a business problem to solve. Not doing generic training about here’s how you write a specific kind of prompt, but really putting them in an immersive learning experience with a real business problem. I know you’ve done hackathons, you’ve done office hours. What are some of the other education and enablement things that you’ve done that have actually shown pretty good results?
Bruno: To emphasize, developers coming to the table with their business problems in hand is super useful, right? And just to correct, we weren’t giving them a business problem, we were asking them to bring the business problems that they have-
Laura: Oh, even better.
Bruno: From their BUs, so we could attempt to solve them. And it was very clear in the beginning: we might not be able to solve them, but we think we can. And with what you know so far, what is it that you think GenAI could help solve? And so they would bring those into the equation. Then it would turn into a mini hackathon where we would be in a big room at five or six tables, and people would go try to solve most of those problems and then showcase afterwards what they created.
But one interesting thing that we saw recently is that we started adopting what we call EBAs, an Amazon concept where you do experience-based accelerations. So you bring everyone into a room to solve a particular problem, in this case using GenAI, and we spend about three to five days focusing and obsessing over that particular problem. And we saw a lot of benefits, both in the learning of GenAI, because we bring providers into that journey as well, but also in removing the barriers of the day-to-day meetings and distractions and getting folks to focus on a particular problem. And we are seeing a ton of value in doing that, and GenAI is playing a big role here.
We estimate that about 70% of the code written and the problems solved in those EBAs is done with GenAI at the moment. Those are estimated numbers from the sample size that we have and what we asked developers via surveys after those EBAs. They’re estimating that themselves, which shows the journey from the beginning, when folks didn’t want to use it or were scared of it, to now, when they’re even helping us estimate and being excited about what we do.
Laura: Mm-hmm.
Bruno: And so we use those moments both to fix a business problem and to try different tools. And so with EBAs, we bring a different tool and we separate the cohorts. We make some of them stick to the tool that we’re using at present, others use something new, and then we compare afterwards the benefit that each tool brought to the equation.
Laura: I think it’s such a unique way to do enablement and training and actually when anything is rooted in a real-life business problem, it’s like you’re kind of double-dipping because not only do you get real practice, which helps the people better understand and become more fluent in a tool, but you also get an artifact or something that you produce during that training time that’s actually useful to the business so it’s kind of a win-win on both sides, which is very cool.
Bruno: Another useful thing that GenAI brought to the equation is that we were able to come out of our silos of development, out of the status quo that we were living by, and really traverse the organization with that GenAI hat on. And from an experimentation place, to me that’s been really useful. We are talking a lot more across business units. We are interacting with each other so much more, and to me, that’s a massive achievement as well.
Laura: Yeah, like you said, it’s not just a tool, it’s a cultural shift. It’s changing the way that work gets done. And time to experiment is something that we also see confirmed across the industry as really essential for enablement and adoption; not having time to experiment I would call a barrier. And that’s what you’re solving here with the EBAs and the sort of mini hackathons, or having the developers bring business problems. I want to talk just a little bit about the other barriers to adoption that were not education or training, more like procurement, security, those kinds of things. As you said, Booking is a public company, you’re operating at a very large scale. This is not developer-with-a-credit-card territory, right? So can you talk a little bit about some of those other bottlenecks or points of friction that existed and what you did to get around them?
Bruno: Sure. It’s a really good point. In the beginning we saw that folks were reluctant to try GenAI, and when we started educating them, the wish to bring more GenAI just exploded, right?
Laura: Yeah.
Bruno: So we went from having one particular tool, one LLM, to folks just wanting to try everything that they saw out there. And so developers started suggesting, “Let’s try X, let’s try Y.” And so we became the bottleneck for developers to actually start to experiment with GenAI. So it was really important for us to centralize a little bit more. As I mentioned, we are split into different business units, and we didn’t want each of those business units to have to go through legal, procurement, risk, and security individually, because we think we would waste a lot more time doing that. And so what we did is we created, for lack of a better word, a committee, and I hate the word committee by the way, that could centralize that effort, you know.
Laura: Yeah.
Bruno: So when developers wanted to try something new, we would put it through a process where that committee would meet every single week and look at the number of things that developers and the business wanted to try. And so we were able to really fast-track POCing some of those tools, but also, where developers didn’t know what we already had, we could bridge that gap. And so it was useful for us to be able to say to developers, “Hey, cool that you want to try this, but did you know we have this one thing here that we know solves that particular problem you’re trying to solve? Is it cool if you start experimenting with this so you can go faster?”
We put new tools through our procurement process, and it’s still improving; we’re getting much, much better. It also took educating those support roles, from a legal and procurement perspective, for them to understand where we wanted to get to, and now we have a ton of super excited folks on the topic, really working hard to let developers and the business, because this is not just about developers in this case, experiment and be excited about it.
Laura: What a testament to your success, though, because you set out to turn people into advocates and you did so well that you moved the bottleneck down the line, to where your team was the bottleneck or procurement was the bottleneck. But what a nice signal that what you did worked, because now you had all these people trying to get their hands on the latest tools. I think what’s funny is it’s kind of the typical platform problem, right? You had folks who had a need and didn’t know that there were services or tools already available within the company that they could use. You’re trying to centralize the pain a little bit, in kind of a platform team way. It’s a lot of the same patterns, and I think it’s a nice reminder that GenAI is just another tool. It has a lot of promise, there’s a lot of optimism, there’s definitely a high ceiling on what can be accomplished, but at the end of the day, we have to follow a lot of the same patterns that work for other developer tools as well.
Bruno: A hundred percent. And I think GenAI brought a really nice… Because of the hype, and because of the speed at which everyone wanted to see the value of it, it really gave us tools to interact with our audience, with our developers, and with our stakeholders in a different way. Like I mentioned, we could traverse the organization, we could get together, we could really experiment together, and we brought some of those learnings into other DevEx tools that we have in the company: how we do, for instance, rollouts of new features, how we interact with our community, how we get feedback from our community. And we also started gaining more trust from the community because we’re all closer together. And so I also really appreciate that about the newness of GenAI and the hype behind it, right?
Laura: Yeah. Just what a great catalyst in general. I think that’s really nice.
Bruno: And I think that’s a really important thing to do in order to have great DevEx organizations.
Laura: Yeah. Is there anything else you want to cover about barriers, training, or enablement before we get into… I want to talk now about evaluating, so we’ll fast-forward a year: now you’ve got all of these tools, how do you know which ones should stay and which ones should go? And then we’ll talk about results and then where you want to get to. Is there anything else on those topics?
Laura: Okay.
Bruno: I think we’re good to go to that next part.
Laura: Okay. So let’s kind of think about where you are in this journey. A year and a half ago you started using Cody because it came in through Sourcegraph, which was an existing vendor. You had low adoption at first. You did some very specific things in order to increase adoption, like training and enablement, fast-tracking procurement, getting over the legal hurdles, also building community and having time for experimentation.
Those things might seem unimportant, but they are a force multiplier. And so you were doing all these things and now you’re at the point where you have tons of advocates across the company. People are very interested in getting their hands on all different kinds of categories of tools as well. And so you have all these tools flooding in, you obviously need to have a little bit of method to the madness. How did you think about this with your experimentation mindset? How were you kind of systematically evaluating all these tools so that you could figure out which ones were worth investing in and which ones you would kind of pass over?
Bruno: Yeah, good question. So one of the things that was really important for us, one of the values that we lived by and still do, is to be open and honest with our providers. We weren’t sticking to one. We were going to try multiple, and I think that was important to be clear about from the get-go. And so as we started trying new providers, it was important for us to be able to evaluate them consistently against each other, right? And so we built a very, very simple framework where we started listing what’s important to us. Is the provider compliant, for instance? Could we just integrate our systems with their systems and be [inaudible 00:36:57]?
Would we have access to multiple LLMs from the beginning? How often do they bring new LLMs into their product roadmap? Do they have good documentation? Do they have good integrations? Would they help us hands-on? That last one we didn’t mention in the beginning, but it was a really, really important part of the whole process: bringing the providers into the business and having them help evaluate the problems. We assumed that because they were providing the tools, they were more expert in the topic, and so bringing them in to listen to where we were struggling, or what it was that we were trying to solve, was really, really important.
And so we talked about, if we find a bug for instance, how quickly do they solve it? And so we started listing some of what was important to us and giving each item a weight of importance, but also a rating for each of those providers. Now that was well intended in the beginning, but everything is moving so, so fast that we started going, “Okay, well, let’s just try it first.” What the community wanted to try was important, to bring and maintain that excitement, but also whatever was out there being launched that was new. We were welcoming all of that to be able to try them in parallel.
One thing that we are lucky for is that our community is quite large, and so we could segregate it into different cohorts, try different tools, and compare them simultaneously. That has been super useful for us. So we started trying the staples, comparing with folks like Amazon to see what they were doing and how they would do things differently. Folks like Google, for instance, bringing their workforce into our business to try things from their angle and see what we could get to. And so that has also been really important for us in evaluating different providers.
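A minimal sketch of the kind of weighted evaluation framework described above: the criteria come from the conversation, but the weights, provider names, and scores below are invented for illustration.

```python
# Criteria with importance weights (weights sum to 1.0; values are made up).
criteria = {
    "compliance": 0.25, "llm_choice": 0.20, "documentation": 0.15,
    "integration": 0.15, "hands_on_support": 0.15, "bug_turnaround": 0.10,
}

# Each provider rated 1-5 per criterion (hypothetical ratings).
providers = {
    "provider_a": {"compliance": 5, "llm_choice": 4, "documentation": 3,
                   "integration": 4, "hands_on_support": 5, "bug_turnaround": 4},
    "provider_b": {"compliance": 4, "llm_choice": 5, "documentation": 4,
                   "integration": 3, "hands_on_support": 3, "bug_turnaround": 5},
}

def weighted_score(ratings: dict[str, int]) -> float:
    """Sum of weight * rating across all criteria."""
    return sum(weight * ratings[name] for name, weight in criteria.items())

for name, ratings in sorted(providers.items(),
                            key=lambda p: weighted_score(p[1]), reverse=True):
    print(f"{name}: {weighted_score(ratings):.2f}")  # provider_a: 4.25, ...
```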
Laura: So where do you stand right now with adoption? Because you’ve had such a purposeful approach to increasing adoption, and we know where you started, which was hesitant, perhaps lower adoption than you would’ve hoped, and you did a lot of interventions. And so tell me about the state of the world right now.
Bruno: So most of our developers have adopted GenAI, but one of the interesting signs that we found is about the people that were using it day-to-day, and we call day-to-day sort of like three times a week, because we know developers don’t just write code. And so we established that a daily user would be a user that uses it three times a week, so 12 times a month, roughly speaking. We found a correlation between those users and shipping or creating more merge requests or more [inaudible 00:41:06]. And so we started looking at adoption from a different lens. We were like, “Okay, well, just adopting is not good enough. What is it that we have to do in order to make folks use it a lot, make it part of their day-to-day?”
And so we have segregated our community and started to go, “Okay, well, the folks that are not using it on a day-to-day basis, what is it that we need to teach them in order for them to see the value?” Because we know that the ones that are using it more often are more… Speed is the wrong word, but more efficient.
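As an illustration of the “daily user” definition above (roughly three active days a week, so about twelve a month), here is a minimal sketch; the event format, sample data, and function names are assumptions for illustration, not Booking.com’s actual telemetry.

```python
from collections import defaultdict
from datetime import date

# Hypothetical usage log: (developer, day on which they used a GenAI tool).
events = [
    ("dev_a", date(2024, 6, 3)), ("dev_a", date(2024, 6, 5)),
    ("dev_a", date(2024, 6, 6)), ("dev_b", date(2024, 6, 4)),
]

def classify_users(events, weeks: int = 4, days_per_week: int = 3):
    """Label a user 'daily' if they averaged >= 3 active days per week
    over the window (~12 active days a month), else 'occasional'."""
    active_days = defaultdict(set)
    for user, day in events:
        active_days[user].add(day)
    threshold = weeks * days_per_week
    return {user: "daily" if len(days) >= threshold else "occasional"
            for user, days in active_days.items()}

print(classify_users(events))  # {'dev_a': 'occasional', 'dev_b': 'occasional'}
```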
Laura: Yeah, more efficient. They’re shipping stuff to production more often, which is a signal. It’s a signal of how healthy the system is, and also a signal of friction, I guess, is really what it comes down to, if you can ship more frequently. And I want to ask about some of the appropriate questioning or criticism of that PRs or MRs per developer metric, which is in the Core 4 as well, and you’re using Core 4 to kind of underlie all of these performance conversations. One of the things that comes up is like, well, is AI just shipping more junk code? Is it just shipping more small, insignificant changes? And so how are you answering that question with data, or what do you see?
Bruno: We see that the folks that are using GenAI on a daily basis create around 30% more merge requests. That’s one sign that we need to dig deeper into. And we also saw that those MRs were around 70% lighter, and we don’t know how to make sense of that yet. That’s something that we’ve seen, but we don’t know how to go deeper into evaluating it. One of the things that I am leaving for as late as possible to automate is the whole merge request review process, because that is, in my opinion, the only bit that we have now to really validate, from a human perspective, what is being shipped to production. And so our code still gets reviewed.
Laura: Yeah.
Bruno: Every MR gets reviewed by at least two humans before it can be shipped to production. And so there is no report from those reviews that the code is not usable, right?
Laura: Okay.
Bruno: But we still need to dig deeper. We need to understand why are those MRs lighter? Is the code readable, is the code efficient? Is the code less vulnerable? And so what is the quality behind what’s being shipped to production?
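A rough sketch of the kind of cohort comparison behind deltas like “30% more MRs” and “70% lighter”: median MR volume and size per cohort, compared as ratios. The per-developer numbers below are invented to roughly reproduce those deltas; this is not Booking.com’s actual data or pipeline.

```python
from statistics import median

# Hypothetical per-developer stats: (cohort, MRs last quarter, median MR size
# in changed lines). Values are made up for illustration.
devs = [
    ("daily", 14, 120), ("daily", 18, 90), ("daily", 16, 110),
    ("occasional", 11, 380), ("occasional", 12, 350), ("occasional", 13, 400),
]

def cohort_summary(devs, cohort):
    """Median MR count and median MR size for one cohort."""
    rows = [(mrs, size) for c, mrs, size in devs if c == cohort]
    return median(m for m, _ in rows), median(s for _, s in rows)

daily_mrs, daily_size = cohort_summary(devs, "daily")
occ_mrs, occ_size = cohort_summary(devs, "occasional")
print(f"MR volume delta: {daily_mrs / occ_mrs - 1:+.0%}")   # +33%
print(f"MR size delta:   {daily_size / occ_size - 1:+.0%}")  # -71%
```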
Laura: Yeah.
Bruno: From the population that we have today: around 60% of our developers have adopted GenAI, of those about 84% use it on a weekly basis, and about 70% of those developers use it daily.
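Those percentages compound. Taking the roughly 3,000 developers mentioned earlier as a round base (an assumption for illustration), the funnel works out like this:

```python
developers = 3000                 # rough population from earlier in the talk
adopted = developers * 0.60       # ~60% have adopted GenAI -> 1800
weekly = adopted * 0.84           # ~84% of adopters use it weekly -> 1512
daily = weekly * 0.70             # ~70% of weekly users use it daily -> ~1058
print(f"adopted: {adopted:.0f}, weekly: {weekly:.0f}, daily: {daily:.0f}")
```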
Laura: And also for some context here, that’s top 25% of companies according to the data that we have, which is data from, of course, DX customers, but also other companies that are not affiliated with DX coming in from different research studies. So that’s a great result already. You’re staying ahead of the curve and you have even more ambitious goals. What are you going to do about those 40% that are not using it?
Bruno: We are trying to talk more with those users, right?
Laura: Yeah.
Bruno: We want to understand why they are not using it. We want to understand what it is that they want to try that perhaps is not there, and how they want to do that. One of the things that I mentioned in the beginning is that bringing leadership along was important for the adoption to happen. And we’ve seen that the more leaders become versed in GenAI and use it and love it, the more they can empower their communities to use it too.
And so we started creating hackathons with business unit leaders as well and making them code, vibe code. Those became really fun sessions, and they come out of that room energized and pushing their communities to use it. And so we’ve already seen an uptick from those 40%, with folks starting to.
Laura: Yeah.
Bruno: Being in touch with our developers, that’s essentially the tool that I’m using to help them onboard into GenAI.
Laura: Have you been able to see, is it like they haven’t tried AI at all or they have tried and then stopped using? Are you able to see any of those trends?
Bruno: We are. We are able to look at the churn of starting to use it and then not using it. One of the things that we started changing as well is how easy it is for them to onboard into the tool. So for example, when we adopted Cody, folks could only use Cody if they were on GitLab, because the login to Cody was via GitLab. And so we started to change that to an [inaudible 00:46:27] perspective. So anybody, not just developers, can use it and adopt it, and we assume that that’s already removing a barrier to start using the tool. But yeah, I think it’s important to say that we avoid going down to the developer level. From the beginning, one of the values that we established was that we think it could be really dangerous to look at the developer level, at which developer was using it and which was not.
Because folks could potentially use GenAI in the wrong way, and so there are ethics behind the whole thing. And so we started looking at the business unit level, going down at most to the director level, and a director here covers a big org, and we stop there. And so we are able to connect with those leaders and say, “Hey, your business unit could have a higher uptake. What is missing for that to happen? Perhaps could we train your community a little bit better?”
And so we don’t have developer-level visibility, and we want to stay that way.
Laura: And it’s important because, as you said, there are ethics, there’s gamification, there are a lot of bad things that can happen. And it’s been my position, and we are very aligned on this, that these metrics are for improvement. They’re not for performance evaluation. And keeping it at the director level, or even the team-of-teams level, is a really good way to reinforce that. I think the downside is that it can be difficult to see if there’s an individual who needs support; they have to ask for it. And so that’s sort of an extra lift for your team, to provide the mechanism for them to get the support they need. When you can’t see directly what’s happening, they need to come to you.
Bruno: We started using communities in Slack, so we have an engineering, a GenAI engineering channel where we see folks entering daily quite a bit.
Laura: Yeah.
Bruno: And so we also look at that from an adoption perspective, and the questions that come up there, some of them even go into our training paths, because we see that developers there are more relaxed about asking their questions. And instead of us going down to the developer level, the developer is coming to us. And so that’s been really useful for us to understand where the bottlenecks are. With some of the questions that they ask, it’s clear that a little bit more education is missing on what [inaudible 00:48:57] do, actually providing the right context and using the right prompts. And so we are able to then go and hold their hands and help them do that. So yeah, it’s essentially a way that we found for the developers to find us instead of us finding the [inaudible 00:49:13].
Laura: Well, Bruno, thanks so much for sharing about your journey, where you started, and the results that you’re having. What I’m taking away from this is that bringing AI into Booking was not just a technical thing, it was also a cultural thing. So you wanted to get the licenses into the hands of the developers, but you were also trying to create communities and advocates, trying to get people enthusiastic about it. It was also a catalyst for cross-functional collaboration across the business units, which is really great.
Training and enablement, and time to experiment, are really, really important, as is fast-tracking some of those procurement things, and you have the results to show for it. I mean, you have achieved really outstanding results and you still have some very ambitious goals. One thing I appreciate about Booking’s approach is that you approached it with a hypothesis, with a business problem to solve, not just, “Hey, let’s see what sticks and what doesn’t,” and I think there are appropriate times for both. To kind of wrap up our conversation, I want to ask you: where do you plan to take AI adoption at Booking over the next year or two? What’s in store for you?
Bruno: Our goal is that everyone uses GenAI every day. To me, or to us, the question is no longer “is GenAI impactful?” We know it is. We now need to get everyone using it so we can go to the micro level of that impact. The question about what GenAI outputs is still out there, and we need to understand where it’s good and where it’s not, because we want those developers to be able to focus 90% of their time on the problem that they’re trying to solve and less on [inaudible 00:50:59]. So the ultimate goal is to improve DevEx, right? Make sure that we ship faster. So yeah, everyone uses it every day. That’s the goal.
Laura: And it all comes back to basics: AI is a tool to improve developer experience, and when you have a great developer experience, your time to market is reduced. You can ship faster and delight your customers more often.
Bruno: A hundred percent.
Laura: I love it. Thanks Bruno so much. It’s been a pleasure.
Bruno: Thank you for having me.