Jason Valentino:
Hello, hello. Well, hey all, thank you for attending my breakout session. The other one’s probably better. So you made a mistake, but I appreciate that because it builds my self-confidence, and my therapist thinks that’s important. I am Jason Valentino. I work at the Bank of New York. I’m in charge of our software engineering strategy. Essentially, I own software strategy, all the developer tools, our DevEx teams, and, unfortunately, compliance for 8,000 developers and their day-to-day. Today, we’re going to talk about a little journey that my team went through last year, which was as we were getting ready to celebrate the victory of, wow, everybody is using AI, coding assistants are prevalent, they’re everywhere, people love them. And, as you heard from the other speakers, are we making any more money? Not exactly. And so I want to talk about some exercises that we did to figure out exactly where we need to embed AI, where we need to actually focus a little bit more just on old-fashioned DevEx and automation, and how we prioritized it throughout our day-to-day.
And then talk about some of the results and just share that with you guys, see if it resonates. It’s incredibly intimidating talking about AI. I’m a pretty good public speaker, but we are all running at such a high clip right now. I think if you are in a job similar to mine, this has probably been the busiest year of your life. And so, as I wrote this two months ago, it seems outdated to me, and you probably have all figured this out, which is great. Tell me how to fix it better after this talk. So there’s some good news here though. The coding assistants themselves are pretty dope.
I’m back in the code again. That’s been years in the making. DX’s Q1 benchmark came out, I think they’re saying about 27% now of all code is authored by AI in some meaningful form. At BNY, our number is right around just sub 50%, like 47. And again, similar to what you heard in some of the keynotes earlier, that isn’t exactly translating to 40% more releases, 40% more MRs, but it is having really unique and awesome effects throughout the organization, some of which were ones that we didn’t necessarily expect. I hate to admit this on stage. I had a pretty bad code coverage problem at the beginning of 2025. We made that completely go away. We made it completely go away with use of AI tools in a way where even the product executives in our company still think they have additional engineering capacity. In an old world, things like that would’ve happened and engineering would’ve been yelled at for slowing down the pipes, for not delivering enough product, for focusing on tech debt when we need them putting new products out to our customers.
We’re starting to see this innovation happening in these small pockets where people are finally finding the time and [inaudible 00:07:27] to fix the little things that have been bothering them over and over and over again.
And I think Nancy called this the pipe earlier, writing code is just one of many, many, many steps in what it takes to build and ship product. And it’s kind of the fun part of it. So we’ve effectively made the fun part of the job less of the job so that we can focus on the rest of it, which is having the world’s greatest coding assistant doesn’t necessarily help you do product research, doesn’t help you write stories, doesn’t help you get requirements [inaudible 00:08:05]. And then at the end, after the engineers do get those requirements and actually get to work, then we have the next task, peer review becomes the next holdup, right? And I know there’s lots of AI peer review tools. Let’s assume you don’t have them yet. And then also everything else that you have in your release governance, your testing harnesses, and whatnot, all of this needs to be changed along the way for you to be successful, if you’re really going to see the promises of AI.
And so I have a quick little mental model I like, which we shared last year. It’s three simple steps. The first is, just ask your teams, every team that’s responsible from research all the way to production releases, what happens to your systems if we start moving at a three times clip? If we were really to get to a point where we were seeing three times the amount of PRs going through, what falls over? Do my build systems fall over? Does my release struts fall over? Does my infrastructure fall over? The answer’s probably yeah, in a couple different places, but those become places that you now can put on a roadmap and say, “Hey, if we really believe this thing, we want to de-risk the organization from an AI surge, where are we going to invest next?” The second thing we like to do is figure out a way to measure the impact of where we want to attack with AI.
And that’s really just decomposing in… I’m looking at my product manager, who’s probably giggling because it didn’t look sexy like this when we did it. It was an Excel spreadsheet with every task in the SDLC in it. We take every task in the SDLC, we do a little bit of our own research. We’re customers of DX, so we can pull a lot of the sentiment in from what our developers are telling us, where the pain points are, what’s taking longer than industry average. We went through and just listed every single step and said, “Is it manual today? Is it automated today? Is it partially automated today? Would it pass or fail that three times stress test?”
And then at the same time, what is the sentiment around each of these tasks? What’s the sentiment about peer review? Got to be honest, no one was complaining about peer reviews a year ago. They’re complaining about peer reviews now because of the sheer amount of stuff that’s coming through. And so this helps you just understand, and where I’m going with this is I don’t think the answer is some magic vendor gets up on stage next and says, “I solve all these problems. I will complete… Give your SDLC to me.” I think what you’re probably dealing with is that each of your businesses has pretty unique challenges to how you build software. I’m this thing called a G-SIB. I don’t even know what that means, but apparently Fed regulators have my phone number and it’s not great.
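The spreadsheet exercise described here can be sketched in a few lines of Python. This is only an illustration, not BNY’s actual model: the `SdlcTask` fields, the `priority` weights, and the sample rows are all assumptions made up for the example. The idea is simply that a step which fails the three times stress test, is still manual, and has poor developer sentiment floats to the top of the roadmap.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Automation(Enum):
    MANUAL = "manual"
    PARTIAL = "partial"
    AUTOMATED = "automated"


@dataclass
class SdlcTask:
    name: str
    automation: Automation
    passes_3x: bool   # would this step survive three times the volume?
    sentiment: float  # developer sentiment, 0.0 (painful) to 1.0 (fine)


def priority(task: SdlcTask) -> float:
    """Hypothetical score: higher means attack sooner."""
    score = 1.0 - task.sentiment  # unhappy developers raise the score
    if not task.passes_3x:
        score += 1.0              # failing the stress test dominates
    if task.automation is Automation.MANUAL:
        score += 0.5
    elif task.automation is Automation.PARTIAL:
        score += 0.25
    return score


def rank(tasks: List[SdlcTask]) -> List[SdlcTask]:
    """Return tasks ordered from most to least urgent."""
    return sorted(tasks, key=priority, reverse=True)


# Made-up sample rows, standing in for the real spreadsheet.
tasks = [
    SdlcTask("write story", Automation.MANUAL, True, 0.4),
    SdlcTask("peer review", Automation.MANUAL, False, 0.2),
    SdlcTask("build", Automation.AUTOMATED, False, 0.7),
    SdlcTask("security scan", Automation.AUTOMATED, True, 0.8),
]
ranked = rank(tasks)
for task in ranked:
    print(f"{task.name}: {priority(task):.2f}")
```

The weights are arbitrary; the point is that each row carries the same four answers the talk lists, so the ordering can be recomputed whenever sentiment data or automation status changes.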
So there are very, very specific things that need to happen when we build software to secure the supply chain, to make sure certain checks and balances happen. And so this is how we’ve kind of figured out our own prioritization methods towards what problems to attack next, knowing that some of these will be fixed with vendor products. Some of them will be fixed with good old-fashioned, we’re going to build it ourselves, and some’s going to be a combination of the two.
The third technique that we were thinking of also is unlike the AI coding assistant, which is a fairly straightforward, I buy a thing, I give a thing to engineers, the thing goes. Each of these problems has probably one of three different deployment patterns. There’s the one that we’re all used to, which is like, let’s buy a thing and install a thing, the IDE, the CLI, great examples. You put the thing in and the user itself is allowed to take their creativity and just supercharge it. In these examples, you’re not too worried about the confines of what that user does. That’s probably a developer that still knows how to use the rails of your organization. That was probably a smart person before they got AI. Now they’re a smart person with AI. Of course, we’re finding ways to train those tools to be a little bit smarter about our organizations.
We have a marketplace for skills and for workflows, yada, yada, yada. We have a library of MCP servers that are approved for internal use. All of that is being built behind the scenes. But at the end of the day, this is kind of the category for creativity, the fun part of the job. And then there’s the second category, which is truly autonomous AI running on its own. So beyond the agent you might spin up on your computer, we have a digital worker in our company.
That sounds so much cooler than it really is. It’s like a bratty intern that can do a few things and it’s a little bit psychopathic, so you kind of watch out what things it does. But this is like the category of, “Hey, every time X happens, I want an agent to launch to do Y.” Our digital engineers, again, they sound cool. They do very simple work, very simple security review stuff, very awful monotonous work that our developers don’t like doing, like getting simple access requests nailed down and turned on, helping groom your Jira backlogs, but just autonomous. I do see more and more going this way, but occasionally there’s something in the SDLC. An example for us is like when a build breaks that we love when an automated agent just spins up and goes and solves the problem. It’s a great example. And then there’s the third, and this for anybody that has a little bit more of this regulatory splash to your job becomes really exciting.
It’s when AI absolutely needs to be built into the workflow, the workflow of the SDLC. What is happening when an MR gets submitted? What’s happening when test evidence is uploaded? And this is a pretty exciting spot. It’s kind of boring, but it’s exciting because this is where I think a lot of the benefits really lie. We’ll talk about peer review for a second. Right now, peer review is something like, “Hey, you have Windsurf or you have Claude Code installed, you used the peer review agent, it makes life better.” It’s still human driven. And I don’t think my answer to the regulators is going to be like, “Yeah, we kind of use AI. It’s optional.” My answer’s going to be, “When an MR hits, a deterministic evaluation goes through and says, ‘This is a healthy looking chunk of incremental code. I will trust the AI to work with the engineer to get this merged.’ And then there’s the, ‘Hey, someone is trying to put that famous 75,000 line PR through. That’s not the AI’s job to solve that problem. That is quickly going to get escalated out to a human path.’”
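A deterministic gate of the kind described here might look something like the sketch below. Everything in it is an assumption for illustration: the `MergeRequest` fields, the `MAX_LINES` and `MAX_FILES` thresholds, and the idea of a protected-path flag are hypothetical, not rules taken from the talk. The gate itself uses no AI; it only decides which track an MR goes down and records why.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical enterprise-level limits; real values would be agreed with risk partners.
MAX_LINES = 400
MAX_FILES = 20


@dataclass
class MergeRequest:
    lines_changed: int
    files_changed: int
    touches_protected_paths: bool  # e.g. release or security configuration


def route(mr: MergeRequest) -> Tuple[str, List[str]]:
    """Return ("ai_assisted" or "human", reasons) using fixed rules only."""
    reasons = []
    if mr.lines_changed > MAX_LINES:
        reasons.append(f"{mr.lines_changed} lines changed exceeds {MAX_LINES}")
    if mr.files_changed > MAX_FILES:
        reasons.append(f"{mr.files_changed} files changed exceeds {MAX_FILES}")
    if mr.touches_protected_paths:
        reasons.append("touches protected paths")
    # Any reason at all escalates to the human track; the reasons become evidence.
    return ("human", reasons) if reasons else ("ai_assisted", [])


small = route(MergeRequest(lines_changed=120, files_changed=4, touches_protected_paths=False))
huge = route(MergeRequest(lines_changed=75_000, files_changed=300, touches_protected_paths=False))
print(small)
print(huge)
```

Because the same input always produces the same route and the same list of reasons, the decision can be stored alongside the MR as audit evidence.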
And so if we think about how your SDLC works, how you evidence everything, how you ensure things happen every time, you start building solutions that are actually embedded, like you’re codifying your SDLC. It’s something you probably did before AI came through. I’m a GitLab customer. I watch their data stream. There are certain things I do when I see certain events. Now it’s just doing that and then throwing an AI action after it. And these are the examples that we’ve seen inside the walls of BNY. The IDEs, the CLIs, we have our Windsurfs, our Copilots, our Claude Codes, and we’re playing with Codex as well.
We’ve gone ahead and built our shared resources for all of those to consume. We have our bratty intern in the middle. Don’t tell Brian I call it that. That’s going ahead and running just like a junior partner inside your teams, talking to you on Slack or Teams, executing tasks. And then there’s the more exciting stuff, which is like, what can we actually embed into the workflows themselves? A lot of this is starting off homegrown. I do see more vendors playing in this space. I do think it’s something that I might be able to throw more money at and less engineering time, but we’ll see.
So we’ll open this up for Q&A in a little bit. Just in closing, AI absolutely just multiplies everything. And if I could leave you with something, it’s kind of like start with that 3X exercise, which is the, how big does the pipe need to be for you to be successful? How much do you have to grease? How much extra old DevEx work do you want to do inside your organization to make sure it’s able to withstand the world where pretty soon MRs just start coming crashing through? And then the other thing I would like to leave people with: raise your hand if you have unlimited budget for all the stuff that you think you need to do walking out of this.
This is the reality of AI. If there was ever a time to really open up to internal contribution models, to just allow innovation in, it’s now. None of us are able to keep up with the clip of this thing. And we especially can’t keep up with the clip of this thing if folks in positions like mine, yours, similar, sit down and say, “Whoa, this is my problem to solve. We’re going to put this on the backlog for next PI.” That’s not how this works. Right now, innovation’s happening in corners of your organization that you probably can’t even see. Do your best to make sure this stuff is getting cataloged, is showing up in showcases. Really get to a point where you develop this show and tell culture around AI, because otherwise the best ideas get snuffed. And then be willing to say yes. What I mean by say yes is, a meeting hits my calendar, someone shows up. I talked about the marketplace where our skills and workflows and MCP registry sit.
That came from a customer team. They showed up with something that was completely just mocked up, probably Windsurfed up, and showed up to my desk. I can take the answer, “Oh, why didn’t you build this like our existing Backstage-ish frameworks marketplace? This should be over here. We got to keep the…” Or I could just do what we did, which is like, “Oh yeah, we’ll put this live.” It isn’t perfect, nothing’s perfect. But if you just take more of a ship it culture again, kind of like almost 10 years ago, you’re going to find the whole organization really will bolster around these products and help build together. Obviously have the roadmap, obviously know where you want the North Star to be with this, but just start… I don’t know. My advice, and this is advice coming from someone who’s highly regulated, and if you’re a regulator, please don’t be in this conference, just start saying yes.
I might get in trouble for that one. Just start saying yes. I think at the end of the day, DevEx is truly the intersection between product and engineering. It’s learning what our engineers need, solving problems for them, breaking blocks. This is just another example of it, and you can go a long way from just letting people in. So with that, I’ll end my structured talk and we’ll open it up, I believe, for questions. Ah, I was supposed to introduce you again, but I didn’t.
Brittni Allen:
Perfect. Thanks, Jason. That was amazing. So many good takeaways from that already. Questions are starting to flow through from the audience. So one of the questions is, you spoke a lot about some of the different ways that we can apply AI, but how do you see the role of platform and productivity teams changing to support this demand of AI tooling beyond just the engineering portion?
Jason Valentino:
It’s back to the, you’re totally going to get so much more budget and everyone’s going to realize the value you bring to the organization. No, but this is, to me, this is a platform capability. Every org is different. Some of you may see this being led from some kind of AI think tank or center of excellence. In our world, it is within platform engineering. So I report to a gentleman who runs effectively frameworks and platform engineering for the organization. A lot of our DevEx spend now is going towards AI tooling, AI enablement. The CIO has been very kind to move a little bit more, knowing that we need to invest here. She’s not going to be happy until I think we hit 100% maybe of AI code written. So we’ve got to find a way to get there. And then we need to be able to endure the amount of just throughput that that might cause the organization.
Brittni Allen:
Totally. I think so many of the people in the audience here are facing those same questions. So yeah, definitely agree. Another question coming through, where have you found the greatest challenge to be in integrating AI tooling into existing processes or modifying those processes? And how did you overcome those?
Jason Valentino:
I’d be lying if I said I overcame them all, but we have actually… One of the weird things about my role is I both own productivity, developer experience. I also own compliance for the SDLC, which means the voices in my head argue, but at least I don’t have to reach over the table to someone else to have these conversations. So we are rewriting a lot of our policies as we go. The wording that we probably codified two, three years ago no longer applies. Little things like when you talk about review, it’s like you have to scratch that word human in front of it off the policy really quick and then redefine what are we going to define as these and get them republished. If anyone else lives in a controlled banking sector, we kind of have a policy, we have a procedure, we have handwritten controls which map everything written in that doc to technical things we do.
All of those are being scrambled right now. But you have to almost want to start with a clean slate of what I’d want this to be and then go work with your risk partners, your audit partners and whatever and be like, “Would you be okay if I made this go away and answered it a different way?”
Brittni Allen:
Totally. Okay. So similar to the conversation of risk, let’s talk a little bit about quality. We’re seeing dramatic changes in throughput. Are you also seeing any changes in rework or revert rates and how are you treating that as a sign of quality?
Jason Valentino:
I haven’t yet. I haven’t yet. And I think it’s happening, but I haven’t been able to sniff it out. Quality wise, we have seen a new… We’re also in the middle of a transformation when it comes to what’s the role of a quality engineer. And so we’re finally starting to see the point where engineers that have been resistant to owning their outcomes and tests, now, in order to get the velocity from this, are really starting to rethink their “this is someone else’s job” pose. And they’re really starting to inner-loop test a lot more, which is a pleasant thing to see throughout the organization. We’re also seeing that our test maturity throughout the organization is naturally climbing without any stick techniques right now. And I think a lot of it is that people just finally have the time to do the work they always wanted to do, but never did.
Brittni Allen:
Totally. Yeah. I think we’re seeing a lot of that as well. I mean, a lot of your session today spoke about some of the concepts of what we’re seeing change across the SDLC. What are some of the things you’re most proud of in how you’ve applied that specifically within BNY? What are some of those tactical examples?
Jason Valentino:
We hacked Jira. We used to use the data center version of it. And we built all these AI plugins into Jira just to take the monotonous stuff, make it a little bit more fun, like writing a story, writing an epic, pulling an epic from our planning software. And it’s like these little… As you guys map your SDLC, don’t just list it as like there’s planning and there’s coding. Actually get down into what does a human do here and like to the point where you’re watching them over their shoulders, when you see an action that is highly manual, just like writing a well-written story, that’s an attack point. And then when you master that attack point, you can then kind of expand on it. So now we have AI writing the Jira story, and then we have AI linting the Jira story, then we have AI assigning a confidence score to the Jira story.
All of a sudden that story is something I can say, okay, AI, now let’s generate test cases that are going to be used later in the software delivery process. And it just cascades and cascades and cascades to the point where you’re like, even if the developers are very much in control of what we build and how our software works, the outer loop of it, we’re trusting AI a lot more and more.
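The cascade described here (write the story, lint it, score it, and only then generate test cases) can be sketched as a small gated pipeline. This is a hypothetical outline, not the Jira plugins themselves: the `Story` type, the `min_confidence` threshold, and the three stand-in functions are assumptions, and in a real setup each stage would call whatever model or service the team has approved.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Story:
    text: str
    lint_findings: List[str] = field(default_factory=list)
    confidence: float = 0.0
    test_cases: List[str] = field(default_factory=list)


def run_cascade(
    story: Story,
    lint: Callable[[str], List[str]],
    score: Callable[[str, List[str]], float],
    generate_tests: Callable[[str], List[str]],
    min_confidence: float = 0.8,  # hypothetical gate
) -> Story:
    """Run each stage in order; later stages only run on stories that clear the gate."""
    story.lint_findings = lint(story.text)
    story.confidence = score(story.text, story.lint_findings)
    if story.confidence >= min_confidence:
        story.test_cases = generate_tests(story.text)
    return story


# Stand-ins for the AI-backed stages, so the sketch runs without any external service.
def fake_lint(text: str) -> List[str]:
    return [] if "so that" in text.lower() else ["missing 'so that' clause"]


def fake_score(text: str, findings: List[str]) -> float:
    return 1.0 if not findings else 0.5


def fake_tests(text: str) -> List[str]:
    return [f"verify: {text}"]


good = run_cascade(
    Story("As a trader I want alerts so that I can react quickly"),
    fake_lint, fake_score, fake_tests,
)
weak = run_cascade(Story("Add alerts"), fake_lint, fake_score, fake_tests)
print(good.confidence, good.test_cases)
print(weak.confidence, weak.lint_findings)
```

Passing the stages in as functions keeps the gate logic separate from whichever tool does the writing, linting, or scoring, so one stage can be swapped without touching the others.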
Brittni Allen:
Yeah. I know a little bit earlier today, a lot of the other speakers spoke a little bit about how one of the things that they want to make sure to keep an eye on is with all of these new changes, with all of these new possibilities, how do we make sure duplicated efforts don’t happen? Or even more so, how do we make sure chaos doesn’t completely unfold? How are you treating that at BNY? What are you doing to prevent that?
Jason Valentino:
Next question. I’m not. I’m going to have a meeting on Monday about four different one click UI testing solutions that were built in the last week. The good news is that I found four of them, so one’s bound to be good, but that goes back to the comment around show and tell culture. You’re never, ever, ever… Especially, we have 8,000 engineers and then probably like 4,000 people that want access to Windsurf and Claude because they’re going to be engineers, right? You can’t convince those folks to go to a central registry of all the efforts that are going on and be like, “I want to solve this… Oh, someone’s working on that. I’m going to let them do it.” It’s not going to happen. They’re going to run. But the good news is there’s not but so much waste if people are running super fast and they’re showing up to show and tell the thing they built.
And at that point, in my example where I have to go figure out four different AI solutions to basic Playwright UI testing, it’s going to be a fun meeting. It’s going to be a fun meeting for me because I’m going to probably just say like, “Can I get these features from all four together? Would you guys be willing to work together? Is anyone okay working together?” I have to host it anyway, so yes. And so long story short, I can’t stop duplication, but I can at least get to a point where as quick as it happens, we turn around, we address it, we bring it in, and then it turns into something that could be beautiful for the organization.
Brittni Allen:
Totally. I think we’re going to see a whole new era of cross-team collaboration and what that could look like in the future. Yeah, absolutely. Okay. Along these same lines, how has your adoption of these AI tools assisted in any regulatory audits or any of the compliance checks that you all have to work through?
Jason Valentino:
Has it made regulatory better?
Brittni Allen:
Yeah. Has it made it better? Has it made it harder? What have you seen?
Jason Valentino:
I’m trying to think of specific examples that could be fun here. Yes, but it’s a very small example and it’s probably boring to whoever asked that question. We do get a lot of fourth line external audit requests that just tend to fly in. And if you have external examiners, they don’t really care about your PI schedule. They just show up and it’s like, “I want to know everyone in GitLab whose name starts with M.” And those, we blast through them now, right? So it’s no longer that one analyst has to run with it, there’s a broken sprint, there’s pain. Simple requests like that now are just simple things you can Claude up and then let launch.
Brittni Allen:
Amazing. So two of the highest friction points on some of the graphs that you showed were code reviews and change tickets.
Jason Valentino:
I also had making coffee and no one caught it.
Brittni Allen:
Hey.
Jason Valentino:
The front row’s cool here. Red. Sorry, back to seriousness.
Brittni Allen:
No, totally fair point. How have you been able to break down those gates though?
Jason Valentino:
Say them again because I made a joke and I thought it was funny.
Brittni Allen:
Code reviews and change ticket.
Jason Valentino:
Yes. So code reviews, we’re still working on it. Right now it’s happening locally. So people are using the AI coding assistants to help with their code reviews. We’ve built a skill that’s in our marketplace that people can use. My goal on that one is that it’s 100% codified, that there’s something very deterministic that sees the inbound request. If it’s fun and it’s little and it’s incremental, I want AI to just take the wheel, with a rule set that we agree on at the enterprise level and with a rule set that the local teams can also set, so that we’re getting the best of both worlds. And then when something gross comes through, I want that thing taken out and put on the human track. And that’s not a bad thing, right? There can be a reason for a giant 10,000 line MR. It could just be someone who was working on new test automation for an app.
It was totally done under the guise of their manager. Great. Manager can attest that and approve it. And then I have evidence of that happening. But when it’s just a bunch of AI slop, it gives us a chance to actually just stop the production line, take a look at it, have a heart-to-heart, maybe a learning moment, and then you go back to business.
Brittni Allen:
It’s almost like you need metrics to make sure that all these things are happening and amazing.
Jason Valentino:
The second part of that I think was change tickets.
Brittni Allen:
Yes. Change tickets.
Jason Valentino:
We are trying to get to a point where there’s no more human review if you’re building with the right golden path and your health metrics are fine. And we had that before we got to this AI world. We just don’t have it vastly rolled out, where your code composition analysis came out clean, your security scan came out clean. This is within your release window. This is the same stuff a lot of you guys probably do now. You haven’t caused a major incident in a while, so you’re a more trustable team. Voila, instantaneous approval of the change control ticket. That’s very not AI right now. It’s very just deterministic. And we’re looking at ways where AI could maybe play a little bit of a part there, but not too much right off the bat.
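The change ticket rule described here is a plain conjunction of checks, which makes it easy to sketch. The field names below and the 90 day incident-free window are assumptions for illustration; the talk only says the team has not caused a major incident “in a while”.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

INCIDENT_FREE_DAYS = 90  # hypothetical definition of "in a while"


@dataclass
class ChangeTicket:
    on_golden_path: bool
    composition_analysis_clean: bool
    security_scan_clean: bool
    in_release_window: bool
    last_major_incident: Optional[date]  # None if the team has never had one


def is_trusted_team(ticket: ChangeTicket, today: date) -> bool:
    """A team is trusted if its last major incident is old enough, or absent."""
    if ticket.last_major_incident is None:
        return True
    return today - ticket.last_major_incident >= timedelta(days=INCIDENT_FREE_DAYS)


def auto_approve(ticket: ChangeTicket, today: date) -> bool:
    """Every check must pass; any failure sends the ticket to human review."""
    return all([
        ticket.on_golden_path,
        ticket.composition_analysis_clean,
        ticket.security_scan_clean,
        ticket.in_release_window,
        is_trusted_team(ticket, today),
    ])


today = date(2025, 6, 1)
clean = ChangeTicket(True, True, True, True, date(2024, 12, 1))
recent_incident = ChangeTicket(True, True, True, True, date(2025, 5, 20))
print(auto_approve(clean, today))
print(auto_approve(recent_incident, today))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the rule reproducible when an approval has to be re-evaluated for an audit.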
Brittni Allen:
Fair. I think that dovetails nicely into the second question, which is when you’re seeing twice the amount of changes, is that changing your on-call model or any other safety nets that you have in place to make sure that, as those changes get released, you have the safety nets in place to be able to handle anything that might go wrong?
Jason Valentino:
A little bit. We haven’t seen twice as many changes yet. That would be a pretty cool victory, and come next year at DX Annual, and I’ll be telling you how we accomplished that. Right now, we are actually seeing probably twice the amount of commit jitter that we saw when we started this journey in 2024, but that doesn’t mean twice as many MRs and it doesn’t mean twice as many releases. But the good news is our folks that architect our production support models, they have these tools too. And so what we’re seeing is just as I have my bratty intern for writing software, my partner in crime on the SRE organization has their bratty intern for watching level one requests, for checking monitors. Our observability team is all over AI as well. We’ve seen, as a regulated industry, I have to always say a reduction in incidents. But we have seen… While it has been one of the busiest years of my career, it has been one of the most stable years for the bank.
Brittni Allen:
A huge win. Something that’s not typically seen. So amazing. So there must be a lot of others in the audience who are also in regulated industries because a lot of regulatory questions are coming through. So let’s ask-
Jason Valentino:
I see the faces, the help face. I get it.
Brittni Allen:
We’re all friends here. Amazing. So tell us a little bit about how you’ve been able to manage some of those regulatory constraints to be able to not slow you down.
Jason Valentino:
The one thing we tend to forget in the regulated industry is that we are the ones that author the policies and governance that our companies abide by. If anybody has lived through a Fed exam or whatnot, there is no… You can show up and say, “This is how I do this thing based off NIST or based off this policy or based off this.” They care more about the spirit of whether you are protecting your institution, whether you’re de-risking the institution. And so you always just, as you build and as you build these policies… I have a really good partner in our second line risk group. The gentleman’s name is Tim, and Tim is very quick to want to design these things with me rather than wait for me to build something and then show up at his desk and say, “Hey, Tim, I’m turning this control that you have off.” That doesn’t work. You really have to bring them into the dev process.
And for a Claude license, you can get anyone to… If you happen to be the person who owns Claude and Windsurf licenses in the company, you can get a lot of bribes from people who are non- [inaudible 00:33:12] that don’t have access to it. So just use that to your advantage.
Brittni Allen:
I see so many head nods, I know from talking with so many of you that we’re all running the same playbook. So it’s so fun to hear. We’ve talked a little bit about the approaches that you’ve taken to rolling out these tools, what you’ve seen, things like this. Tell us a little bit more about how you’ve been able to help teams go from what they’re individually doing with these tools to scaling what works best across the entire organization or across an entire department.
Jason Valentino:
Show and tells. So every Friday from, I think, 8:00 to 9:00, we have an AI stakeholder meeting that has 600-some-odd attendees. I don’t even know how you get on the list, but I know I haven’t been able to escape it. We expect just an hour of innovation show and tell. And that’s not just from tech, either. That could be someone in operations who figured out a way to destroy a swivel chair process. It could be someone in sales that has figured out a new way of arming their client dossier before they go on a trip. It’s innovation across the bank, but it is a place for people to start talking, to start sharing ideas. We try to keep it half engineering focused, half the rest of the bank focused. We also do, when we roll out a new tool, we try to manage them like products and we try to make sure…
It’s 2024, we’re rolling out Copilot. As the first couple use cases show up where someone’s like, “Wow, this is really good at this,” then come back to the community of practice, get those on there. It’s the last slide I showed: really making sure that your people feel like they have a show and tell culture, and making sure that you’re accepting as many ideas as possible on the path to production, goes a long way towards your entire organization leveling up.
Brittni Allen:
Totally. I think that comes back to Jen’s talk earlier as well. So it sounds like that will be a consistent theme throughout the day. I think we have time for maybe one more question. Tell us a little bit about anything else you’re really proud of what’s happened so far this year at BNY. You said it’s the busiest year you’ve had.
Jason Valentino:
It’s the busiest year I’ve ever had. And AI is supposed to be making it easier and it doesn’t seem to make sense, does it? I thought it was supposed to be coming for our jobs and now I’m like, “Oh, that promise was such a lie.” No, I think the team is just… This isn’t just a Jason experience. By the way, I didn’t have gray hair in 2025. The team has just been burning it, absolutely burning it nonstop. But the fun part about it is, it’s not like you’re under pressure for a project deadline to get something to a customer. We’ve all had those 70 hours, 70 hours, 70 hour lurches in our career and hopefully we don’t have but so many of them. This one’s one where everybody’s just enjoying themselves as they go.
The Teams chat at night is just still lit up. Folks in the U.S. bragging about stuff we’re working on by the time my India team comes online like, “What are you guys doing awake?” If I was to say I was proud of anything, I think I’m proud of just the energy that folks have around this and I don’t know how… I don’t have the playbook for how we became happy enough to do this work or we’re given this trust by the organization to build this stuff, but it feels amazing. And so I’ll close with that.
Brittni Allen:
Engineering became fun again. I love that.
Jason Valentino:
Yeah. Yeah.
Brittni Allen:
So many things to think about, Jason. Thanks so much for being with us here today. That was incredible. So many takeaways that I know all of us have gained, so that was great.
Jason Valentino:
Thank you for having me, Brittni. Thank you all. Appreciate it.