Podcast

Leading infrastructure change at scale at DAT

Ian White, Director of Platform Engineering at DAT, joined the company to scale their Kubernetes-based cloud infrastructure, which has come under stress as their business has grown over the past couple of years. Here he shares how he partnered with developers to learn about their challenges, how he conveyed a vision for how the company needed to evolve, and how he’s been working with development teams and business stakeholders to successfully drive change.

Timestamps

  • (1:00) The challenges DAT was facing as Ian joined
  • (5:13) How Ian used customer interviews to understand problems
  • (10:48) The typical journey companies take to scale their infrastructure as they grow
  • (16:20) How early changes were positioned and received
  • (20:00) The four personas Ian identified
  • (25:14) How Ian evangelized the vision
  • (28:48) Areas of pushback Ian foresees as they introduce new changes
  • (33:00) Handling teams that want to stay on self-managed infrastructure instead of moving to a managed infrastructure
  • (41:55) Managing business stakeholders
  • (45:00) Partnering with finance

Listen to this episode on Spotify, Apple Podcasts, Pocket Casts, Overcast, or wherever you listen to podcasts.

Transcript

Abi: Ian, thanks for coming on the show today. Really excited to chat with you.

Ian: Likewise. Thanks for having me.

Abi: So I understand you started at DAT around four months ago with this problem statement of, “How do we scale infrastructure as demand scales?” Can you share a little bit more about the context and the challenges the company has been facing, the context for bringing you in, and the problem you’re trying to solve?

So DAT connects truckers, carriers, and brokers. We’re the largest load board in America, and we’ve been in business for over 44 years. In the technology space, I think DAT started off as a fax and phone based kind of company and transitioned into a platform based company a number of years ago. And through that transition, especially on the cloud side, we’ve seen that our platforms have really been focused on building out self-service platform modules that enabled that consumption of compute and storage and raw infrastructure. And that’s really good. It actually was a key component to maturing our products, our platforms, our services, both internally and externally, which is great. But the challenge that we ran into and that DAT ran into is that the demand in those areas outpaced all of the available resources, team structure, and the capabilities. So really it’s just a challenge of scale.

That’s sort of where I came into the picture is that many of these self-service platforms have been built, but they were challenged with scale. And so where I’ve kind of come into play is trying to help them evolve and accelerate our delivery processes so that we can reduce the cognitive load that we have on developers today and remove some of those bottlenecks, frankly, so that developers can really deliver reliable, high quality products faster.

Well, you mentioned the self-service platforms, teams were struggling with them and organizational structures maybe were also needing to be rethought. When you came in or maybe your predecessors, how were people trying to solve this problem before you got started?

So our platform is kind of GitOps-based, Kubernetes infrastructure, running on EKS. And that was good in the sense that it really did allow a lot of rugged individualism. You could rapidly create infrastructure, get your environments online, which was good. I think the challenges that my predecessors ran into is that it’s awesome to have as a kind of early startup. As you start to mature, you start to see a lot of patterns that everyone’s sort of solving the same problems differently. I call it non-differential problems. And we saw that play out many times. “Why was this product down for X hours?” “Oh, we ran into the same thing that another team hit downstream a while ago, but we didn’t have the mechanism to do it or we weren’t aware of it.” Or, “Hey, how come it’s taken so long to get a new environment, a new feature online?”

“Well, we’ve got to rebuild this from scratch. We’re building our own DevOps pipelines and CI/CD pipelines.” Well, there’s 20 other teams that have that. So although there is a good community around knowledge sharing and enablement across the company, it was just one of the challenges I think of a fully decentralized platform model that’s fully kind of only self-service, right? So I think that was one of them. I think the other challenge is that, I mean if you look at the pandemic, the pandemic had a massive impact on our industry. We saw 40, 50, 60% increase in our usage and volume, and that’s across transactions, that’s across the amount of loads happening through the pandemic. And so a lot of our systems had never been pushed to that degree before.

And so it did unearth some fragility in some of our architectures and some of our frameworks. And so that’s also I think a response to, “Okay, cool. We learned a lot right through that, how do we really make sure that we’re ready for the next kind of challenge like that?”

So a lot of growth, a lot of increased demand. So you come in and you mentioned to me you spent the first couple months just doing interviews, workshops, base-lining measurements. So I’d love to dive into all of this further. For starters, what do you mean by customer interviews? What did you do when you joined the company?

Yeah, I mean it was a literal customer interview in that we sat across the table with, I think, it was probably 10 different application teams, maybe a little more than that. We’ve got about 30 all in. So it was a little more than half. And I would ask them questions about their experience with their infrastructure, where they’ve seen fragility in their environments, what’s their wish list of things that they would love to have? What are the things that they hate about the platform today? And then what are the things that they’re not ready for yet but they’d like to see in the future? And that was a super candid conversation. A lot of things came out of that. I think the three themes that came out of it were, one, “Hey, we want really good quality testing across our environment, especially around the platform and the app loads.” Two, they really wanted to make sure that they were getting single pane of glass observability around the platform, which historically we do have modern observability, but it’s not a single pane of glass, I would say.

And then three, I think they really wanted to make sure that there is proactive alerting around incidents as we compile a lot of data for logs and things like that today, but how are we using logs, metrics, and traces to get more proactive and automate the responses to incidents and have more self-healing infrastructure. But it was a literal conversation. I think before we got to the conversation, one of the things that I did was I wrote a North Star vision statement. It was probably a 24-page deck or so, but the meat and potatoes was in the first seven. And I shared that out to the organization. I said, “Hey, this is what I’ve observed in the last 30 days, what I’ve seen already. I want to get two clicks down, so I’m going to be meeting with you to interview you and understand if these assumptions are valid.”

And I think that was a really good alignment around, I wasn’t coming into the conversation, just kind of asking questions in the dark. We were generally aligned on the things that I’ve seen, the pain points I was observing. And some of that had already been compiled through one on one conversations but it was good to now get into a team context and hear from the whole dynamic of the DevOps teams talking through the concerns. So I think that was the meat and potatoes of the actual interviews, which was fun.

That makes sense. Well, you mentioned some of the needs expressed from the teams, or at least some that arose from the interviews. I’m curious to dive a little bit deeper into the pain points or the things you were observing, right, happening with teams. Were there problems? Clearly, it sounds like there were some resiliency problems. You’ve mentioned security, release management. Can you describe what the status quo looked like just from your eyes as you looked across how teams were self-managing?

Yep, absolutely. So the benefit of the platform was that you’ve got self-service, right? So it’s very flexible, it’s very extensible, it’s pretty reliable if you set it up correctly. The challenge or weaknesses in that environment that I was seeing with dev teams was inconsistency, inconsistency of deployment processes, test processes, release processes, a lot of inconsistency in terms of observability, both from what we should measure as well as how we should measure it, and then what we should do with those measurements. There was inconsistency in terms of practices and processes, especially as it comes to incident management and reliability and scalability. Each team was approaching these problems with their own domain knowledge and with their own contextual background history with it. So I would say your experience would be highly dependent on your more senior developer and your least senior developer on the team. We also saw that just the operating model that we had for cloud was not really scalable to the needs.

There were some restrictions in terms of workflows and a lot of hoops to jump through, too many workflows to jump through to ship software. I think at one point I counted 15 or so different steps. There’s just too many different steps in the process. I think this has led to a lot of silos. You’ve got both silos of excellence, teams that are just crushing it and absolutely killing it, but also then you have silos of deficit and pain, right? Teams that are struggling sort of visibly and sometimes invisibly, right? Because they are learning and doing these things for the first time and sometimes not having a whole lot of help on how to do it since it’s kind of a self-managed environment. I think the other thing is that we weren’t really measuring the hidden costs, call it developer toil, if you’d like, or even just developer productivity.

We weren’t really measuring that. And so as a result, some of these teams didn’t really kind of have the ability to zoom out and go, “Why is this so painful and what are the root causes of addressing that?” Because, “Hey, I got to get the next feature out the door.” And so I think that ability to have that 30,000 foot context around toil, productivity, availability, release frequency, the number of incidents you’re having, all of those burdens hadn’t really been pulled into a single view to say, “Oh, that’s why it hurts. I get it now. All of those things I have, those are all issues for me. And what can you as a platform team do to unblock me on those barriers?”

I’m curious for those of us, myself included, who aren’t as entrenched in the world of Kubernetes, is this a typical journey? Is this a common pattern where organizations kind of start out in a self-service model with Kubernetes? And I’ve briefly worked with Kubernetes and have also seen other people comment on its complexities. So is this kind of the natural result of teams trying to self-manage Kubernetes in your experience?

I think so. Yeah, I think so. I’ve run into this now, I would say, at least three times in my career, pretty hard where we get to a certain level of maturity with it that the anchor weight of all of the complexities starts to weigh down on the teams and you have to approach it in different ways. In other teams and other worlds I’ve lived in, sometimes you do that with tooling, sometimes you do that with process, and sometimes by taking a different cut of technology. I think it probably takes all three.

There’s obviously a lot of iterations, improvements in this place but yeah, I do think it’s sort of a very common thing. I think also one of the things, it’s not just Kubernetes. I think in general all of the hyperscalers out there are providing a lot of Legos for the Lego box, but it’s sort of like we’re running into the same six challenges over and over around security and scalability and stability. It’s just a couple of concerns that I think it would be great at some point if we get more of that functionality out of the box from the hyperscalers. But for now, you do have to build that resiliency and that pattern of, say, governed and scalable cloud yourself.

Thanks for sharing your perspectives on the Kubernetes journey, but also just the more universal journey of all organizations, figuring out how to scale their infrastructure as they grow. One thing you had mentioned to me before was that you noticed that some teams weren’t having problems self-managing. So I’m curious in your view, why were some teams struggling and others not?

That’s a really good question. Yeah, I think two things come to mind. The teams that weren’t struggling were larger. They tended to be larger teams is what I’ve seen. Some of our biggest products, as you would almost expect and infer, had the most amount of maturity. They had really senior resources, they had resources that had been in this environment for… At some point, some of them had over a decade of experience working with these products at scale. That means they’ve had quality assurance capabilities and resources embedded into their environments. At DAT, we don’t have those resources everywhere in terms of full coverage. I think the other observation is that they tended to also have an engineer or two that were really specialized on cloud infrastructure. That also tended to be the thing. Actually now that I think about it, that was a pretty big insight.

The bigger teams had bigger, larger resources and many of those teams also had an embedded resource that had experience, like deep experience, with cloud infrastructure. I think the smaller teams and some of the up and comers as well are smaller, they’re trying to figure it out. They’re like, “Wait,” and they run into a lot of the same, the big boy challenges, but they hadn’t had those experiences yet. And sometimes the tools were just so new. GitOps is great once you understand it, but it can be really frustrating trying to understand how it works the first time. So those are the things I’ve observed.

The other thing I would say, kind of loosely coupled, as we’ve started to mature our observability, both in terms of instrumentation as well as in terms of process review of incidents and what we had captured, what we didn’t have captured, what alarms should have been configured that weren’t, et cetera, the bigger teams tended to have a little bit more maturity around observability and they had already readied dashboards while teams were saying, “Oh, I need a dashboard? Okay, I’ll build one today but I didn’t know I needed it.” So I think that was probably the two things that stood out to me. And I think, just real quick before we go on, what I sort of inferred from that is that I believe that we need to really democratize that experience. Everyone should have access to best in class tools across DAT.

And it shouldn’t just be relegated to how big you are or how large your budget is, right? Or your future roadmap. Building performant, reliable systems at scale should be democratized. And that was basically the big aha, it was like, “Oh, we’ve got this in certain places, but we don’t have it everywhere. How do we take some of the goodness that some of these larger teams are doing and make sure that they have…? And also, how do we enable, frankly, some of those larger teams that have learned some of these things through pain, they’ve got the T-shirt and they’ve got the bruise, how do we make sure that those things are not repeated as they continue to scale or they go and try and do bigger, larger things?”

Yeah, that makes sense. You don’t want only the teams with the super senior people to be able to manage their infrastructure successfully. Before going into where this journey took you next, I’d love to ask you about that 24 page document, the sort of memo to the company that you put out. What was the mood on the ground as you got started? I mean, I imagine some teams thought that change was coming, could have been feeling a little worried or even threatened by that. So what did you get a sense of on the ground and what was maybe in your memo? How did you try to manage that early on?

Yeah, I kind of led with, “I want to make sure we have the right resources for the right teams at the right time.” And so I think early on, that messaging was well received because there was a resource scarcity challenge. I don’t think I mentioned it, but when I got here, there was one cloud engineer for the whole company of 500 something developers. So there’s an obvious resource concern. But one of the early pieces of resistance, and I shouldn’t say resistance, feedback that I received and I thought it was a good one, was like, “Hey, are we going to go do multiple clouds? 'Cause that is terrifying to me.” And I remember saying early and often, “No, we’re going to get great at one. AWS is our preferred player today. Let’s go do that and let’s get awesome at that. And then after that, if we have use cases that support those additional clouds, we don’t want to restrict ourselves from being able to get there, but we can do that as a next step.”

And I think that was one of the larger fears I heard and I made sure that that was addressed. I think the other thing was just being… In my communication, I believe in just radical candor, right? Radical transparency. So I was also very candid about how I thought our platform was working today. We graded ourselves and shared those grades internally like, “Hey, we think that this capability is kind of okay, this one’s not great at all. Here’s the thing that we believe we are absolutely missing today. The platform doesn’t support it, can’t support, it’s inflexible.” And I think that was key to talking about some of the rigidity that existed in the current platform. As a good example, a lot of our company is really baked around data. We really are a large, big data consumer and we use the analytics across that data to inform our customers.

And so our data science teams are at the tip of the spear of driving much of that growth, understanding, and sharing of knowledge and information not only internally, but also to our partner ecosystem. And they have so many evolving patterns around ML and algorithm processing that, I think as a platform team, we’re struggling to keep up with our current implementation. And so we use that also as a real example of, “Here’s how we believe taking a different cut of this and providing more flavors and more options will satisfy concerns like these as well as others.”

So how are we going to capture the 90% and how are we also going to capture the edge? And I appreciate that in that communication we had all of those concerns thought through. Again, you’ve got to build platforms for the majority, not the minority, but you cannot forget about the minority either. And I think that was really key to have that. If you’re a developer at DAT, you could see yourself in each of those personas. And then it became a “What does this persona really need?” question versus “I don’t see myself in any of these personas.”

Well, I love those tips. I think that’s really helpful for listeners who are either currently or in the future going to embark on a similar journey of change as you led. I love to fast forward a little bit. So you’d done these interviews, put out a lot of communication, and you developed a vision for where the company needed to go. And you mentioned that it included these four buckets. Can you share more about what this kind of vision was and how you conveyed it to the company?

The four personas that I saw from a platform offering and capability perspective that we have at DAT were, one, what we already have today, which is a self-managed platform. It’s got integrated guardrails, some best practices for infrastructure, security, networking. It’s got some rational defaults, it has some blueprints and modules off the shelf that teams can use to spin up new environments. The challenge is that we were using that persona for a hundred percent of our environment for our application teams, whether they really fit into that persona or they didn’t. And where I see us continuing to leverage that persona in the future is using it specifically for things like sandbox environments or we do have acquisitions and mergers that are happening all the time. Those often kind of run a little differently than everything else and I think it’s appropriate for a self-managed experience, a self-managed platform.

And then, like I mentioned earlier, we also have teams that are just really autonomous. They have unique use cases. Data science is an interesting one of them, I would agree. They need to use the full gamut of the catalog with zero restrictions and I think they would take advantage of a self-managed platform. As I mentioned earlier, I think the question is that today that’s a hundred percent, I’d articulate that in our new cohort plan that should be probably less than 10% of our environment that’s running through that. The second persona that we really need to build is a managed platform, and that’s where we’re spending the next six to eight months kind of building this up, which is a platform that’s constantly being enhanced with best in class tools, with best in class product services and processes. It’s designed and architected for resilience.

It’s got security defaults, SEPs, all of those best practices, well-architected architecture built into the framework so that you’ve got security, scalability, recoverability, all done as infrastructure as code across our cloud environments. Ideally this persona could extend to our private cloud environments, our hybrid cloud environments, and eventually could extend to our multi-cloud environments, but we use that as a framework to say that this is going to be our golden path and our gold standard for platforms at DAT. This would also mean that this is something that has SLAs, it’s got support systems, it’s got evolving nascent cloud capability, and it’s got the ability to plug and play the right modular infrastructure for the right app team. So that was a big piece of that.

The third persona that we saw is that community contribution model, is what I like to call it, and that’s that we actually have a really great foundation of developers and engineers today that are always sharing information, but I really want to make that more of a golden path and curate content that is built sometimes for specific needs for a specific AI team that could be injected with best practices, security defaults, all that good stuff, and then given back to the broader DAT community. I think enabling a good developer experience through great documentation, great guidelines, great standards, reusable, modular blueprints, infrastructure as code, all those things are critical. And I think that those things can be delivered across a self-managed platform as well as a managed platform. So I really see it as an incubator for awesome things and it removes our platform team and our DevOps teams from being the bottlenecks or the gatekeepers for building innovative solutions.

The fourth persona is brokered platforms. It’s something that we do today, maybe by happenstance. I think of these as externally run platforms and environments that should be using our guidelines or best practices or guardrails. And even sometimes our engineering specs, right, coming from a platform team but not being so prescriptive that those teams can’t support themselves. I’ll give you a good example. We’ve got emerging needs all the time whether that’s marketing sites or perhaps we’re thinking about a secondary or a tertiary cloud that we want to experiment with for a specific data science need.

We want to be able to support those brokered platforms and those brokered experiences even if the platform team is not running those platforms. So how do we make sure that we’re enabling that persona and that experience to exist while also thinking, “At some point, we may need to either incubate those learnings back into our managed platform and evolve the platform that way,” or at some point, they may be transitioned over to the platform team to run itself? So as we think about our WordPress sites, our Drupal sites, our marketing sites, our sales sites, all those things, we want to continue to do those things and really allow the organization to move fast but not moving away. That’s not reasonable.

Well, thanks for sharing all that. I love the four buckets or personas and it sounds like a really clear way to convey the future state to the rest of the company. This was a couple weeks ago, but you mentioned to me there’s a lot of clamor right now for this vision. So I’d love to ask, how have you evangelized this vision to teams to build up all the hype around it?

Well, I think it started with the interviews, frankly. You start to get people excited about the art of the possible. What we did with the interviews, by the way, is we turned those interviews into user stories. So there was a set of user stories that then we got sized and then became part of a roadmap. And every month I put out a list of priorities for the team and the group and the organization and what we plan to achieve that month and what we achieved from our previous month’s commitments. And then in November I did that and I also added an update around our cloud strategy. So company wide, shared out exactly what we’re talking through today, “Here’s what we learned, here’s where we came from, here are the metrics that we believe are important to us and to our growth. Here’s how we’re going to make those metrics go green or get better or improve by building out these capabilities around managed experiences.”

And it broke down. We provided, I want to say, over 30 architecture diagrams, detailed specs in terms of how we’re going to achieve the items on our roadmap. Our user stories are available publicly. We’ve got a feedback module that we have implemented into our roadmap so that users can give us real time feedback on things that they want to see added, moved, changed, up in priority, et cetera. And that was something that, again, went to everyone. I think the other piece of this is that we’ve continued to try to inject things that make sense into our current processes even before the platform is online. So we have a new capability coming along around SRE and we’ve been able to take some of the best practices in that space and inject them into our operational excellence weekly reviews that we’re doing with all of the teams.

I think the last piece is we’re doing a lot of, you can call them roadshows or you can call them kind of education/feedback sessions with the application teams. We’re sitting in some of their sprints, we’re sitting in some of their planning sessions. As part of our annual operating process, we’ve injected our frameworks, our roadmap into what we’re doing. And as a company we are rolling out OKRs, which is awesome to have that being rolled out company wide. And so our OKRs are directly tied to the top level OKRs that then sort of cascade across the company. So we’re injected into how we’re just going to do business in 2023, which is really exciting. I think that’s been the biggest piece of it. And then the more visual piece of it is that they can obviously see the team size growing. Again, I started with a single cloud engineer and as of next week, we’ll be up to about 26.

That’s awesome. Well, sounds like you’re doing a lot. I mean sounds like you’re doing a great job at communicating and being transparent in a really deliberate way, the constant interactions and embedding and road shows as you called them, and then support from leadership, the OKRs and everything. So it sounds like you’re doing a lot. As you look ahead, once said managed platform comes into existence, what are the ways in which you think change…? Because I mean earlier you mentioned the goal is like a hundred percent self-managed to 10% self-managed. That’s a pretty big change for the company. So what are the ways in which you think this change is going to be hard?

Yeah. You kind of got me at the end. I was going in one direction, I got the other. In terms of hardship, I would say it’s a couple of things, at least two that come to mind. One is we’ve started with things that I think much of the change that we’re going to introduce early will be things that people are clamoring for and they want, right? They’re excited about it, they’ve asked for it for years. They’re passionate promoters. It’s the NPS, right? They are promoters of some of the function that’s coming forward and that’s going to be great. We’re starting with our kind of most mature, earliest adopters. We’ve got a pilot team of about three application teams that we’re working exclusively with that have been part of the interviews, part of the consultation, frankly, helping us to educate us on what is wanted beyond what we see.

And they’ll be part of the first kind of tip of the spear going into it. And that’ll be great, I think in terms of starting to get real users kind of proliferating this information. After that I think is where it gets hard, not just in terms of onboarding the rest of the DAT ecosystem, but also in terms of, look, if you have a really reliable scalable platform, it’s going to reveal fragility within your current application architecture. And so at some point, my customers are going to have to change, not just me. So I think the two junction points I expect to be complicated is anything that involves customer-oriented change. 

The first big one’s going to be onboarding to the platform. That’s always a somewhat disruptive affair, going from one to another and decommissioning the old and running two in parallel at one time. There’s always a migration process and that’s a bit clunky. It’s always a learning experience and it’s a learning experience for every team. That’s why we’re going to try and pilot it with some of our bigger, baddest, I say baddest in terms of good, ambassadors of what we’re doing so we can take these lessons forward. But migrations, there’s always nuance to them. 

The second piece is going to be, “Hey, your application is not architected in a way that can take advantage of the platform capabilities.” One of the things that we want to have, it’s an aggressive goal, is multi-region DR across the platform. But there are application realities that have to be able to take advantage of that. And we don’t have that in all places. Some of our… Not some but a good set of our backend infrastructure is on-prem. And so there’s some systemic bottlenecks that happen organically as a result of that relationship.

So breaking more of our applications into microservices, moving away from monoliths, lifting capability up and into shared models that are living in cloud and shared across multiple applications, applications treating themselves as a service to each other, building contracts between applications as a service, I think all of that’s going to get exposed and it’s going to stir up the pot internally around how that works. And I also think this is a good thing, but we will have a single measurement for what we think good looks like for the company. Every app will be held essentially to that yardstick and every app team will have to make changes in order to achieve those targets. And that has not existed before. I think every app team’s kind of made their own yardstick or had their own truth around what best in class looked like. And instead we’re saying, “No, this is what best in class will be across the company.”

And so I think those are the things we’re going to run into probably about mid-cycle. I don’t think it’s the first three months. I think it’s probably the next six to nine months that we’ll run into those. And I think that’s healthy conflict. I really believe in healthy conflict and healthy debate. We should all be aligned and hyperfocused on our customers’ experience. And if we’re doing that, we’re going to debate each other, we’re going to come to blows on certain things and approaches on how you should solve that problem. And ultimately, if we’re oriented around the customer and DAT is a very customer-centric company, then we’ll solve it in the right way. But yeah, we’re going to run headfirst into that.

Yeah, this last part is really interesting to me because again, thinking back to that goal of a hundred percent, the 10%, I imagine there’s going to be some teams that just want to stay on their self-managed infrastructure versus move on to the managed infrastructure. Whether that’s valid or invalid reasoning behind that, how do you anticipate navigating that piece specifically?

Yeah, the way to get to the root of that because that will happen, and I do anticipate that as well, the way I approach it is leading with data. And one of the things I don’t think I talked a whole lot about yet was we did establish KPIs for our teams and for the company that didn’t previously exist around toil and productivity and availability and things like that. And what I am building as part of this platform is a single pane of glass at the executive view so that executives can look and see the release velocity, the security vulnerabilities of each environment, of each application team, the performance of each application team, and how it runs across the platform. You have a single pane of glass around all of those things, the release velocity. All of those things will be in a single pane of glass. And what we are building is a governance council that includes executive sponsorship that will review the performance across those metrics.

So application teams don’t need to explain it to me as to why they’re not hitting one of those metrics or another, they’ll have to explain it to the committee. And I think more than that, the interesting piece of that is I do believe that each app team is going to have to make a series of trade-offs. All things equal, you don’t always have unlimited resources, budget, time. So you do have to make some trade-offs. And in some cases what I expect to see and what I’ve seen in the past is that some teams will say, “Hey, yeah, we have more vulnerabilities or a higher list of vulnerabilities in our environment today because we’re making this trade-off in terms of speed. We are in hyper growth mode, we are very young and so we are not focusing on that concern.” Probably security’s not the right thing by the way, I’d probably get in trouble for saying that.

“But we’re going to deemphasize that one rule book, that one metric in pursuit of aggressive speed.” Or we may say, “Hey, yeah, we are really expensive. We’ve got a little bit more infrastructure than we need and we know it. You can see it in terms of other applications of similar size in the company where we’ve got a little more spend than we should, but we’re doing that because we expect peak season or we expect this kind of growth or we’re launching this campaign and so we’re going to run like this for the next 45 days. And then it’s all going to disappear into the ether.” But you’re making now a very intentional, a very knowledgeable trade-off that’s informed by business value and you’re having that conversation directly with the business and with the executive sponsorship across DAT as well as with the enablement teams, cloud platform teams, DevOps teams, SREs, around how to make those things real.

And there might be opportunities for us to say, “Cool, that makes sense, that trade-off makes sense. However, I can also help you remove some of those concerns. Is there a way for me to automate that for you? Is there a way for me to inject some capability of the platform so you don’t have to think about that? Is there something we can learn here?” And I think again, that level of candor, transparency, visibility, and awareness of the trade-offs and concerns will ultimately drive us to better customer outcomes but we don’t have that today. So I think that’s going to be one of the first things we’re doing. And I’m actually trying to rope… One of my other teams is the enterprise data team. So I’m trying to rope some of my reporters into building that dashboard out so that we can get that as soon as possible because having that baseline will show us, one, how we’re improving, but, two, allow us to have those conversations.

This really ties into the next question I was going to ask, which is around rollout and adoption. And you already mentioned right now you’re starting with these pilot teams and you’re heavily focused on the success of those teams. From what you were describing previously, it sounds like one of the ways you’re hoping to drive adoption is by establishing those KPIs and saying, “Teams, you have to measure up. You have to hit these success criteria.” And are you envisioning then that that will naturally lead a lot of teams to the managed platform because by switching from self-managed to managed, they’ll essentially be able to hit most of those success criteria out of the box? Is that one of the key approaches to the adoption?

It is. Yeah, it is. And it is because I believe a couple things are true. One, I love the change process, the chasm of change that you’re kind of like, “Oh, am I going to jump over this thing?” You will if there’s something for you on the other side of it. And so I appreciate the ability for some of the folks, the people that go first to champion to their peers. I think that speaks louder than I ever could, champion to their peers, the success they’re seeing. Two, I think the visibility at the enterprise level across our application teams and seeing some of the things, the concerns that the platform will do even in the early days, even around MVP and the capability that will give and the amount of time and productivity that teams will get back, I think there’s going to be folks that organically just want to jump into that as well.

And I think since there will be visibility around those things, I also think there’ll be some healthy pressure. I think it’s healthy. “Well, you’ve had X amount of incidents in the existing space and this other thing is available, so why not consider it?” And there’s probably going to be like, “Hey, when is the right…?” And that’s the right conversation. It’s not if, it’s when do we onboard and that’s where I’d like to see some of those conversations happen. But yeah, I think it’s mostly organic. And then I think the last piece, there always are the delayed adopters. There always are, and we talked about it a little earlier in our conversation. I expect that, you always have that. And that is where I think we’ll use the two vehicles around. If it is meeting your needs, well, maybe then we don’t need to move you.

And you know what? The 90% is a target, but I think we’re going to be constantly learning a lot as we go through this process. And it may be 80%, it may be 70%. If it is less than 90%, I think the question that I reflect on is, “Cool, what do I need to change in the platform? What do we need to change in the platform to get that last mile?” And that’s totally fair. So I think the great thing about this change adoption process is that at every juncture, our application teams will have choice. And that was really important to me. I think that’s really important to DAT. What I’ve heard is really important from our application teams is that they have choice, right, around the level of adoption. Again, it’s modular. So you can pick the things you want to use. The when of adoption as well as the if.

And again, if all of those personas are true, then cool, you have an if choice, “What am I and what do I aspire to be, right? And where am I today?” I think it’s about meeting people where they’re at. “Where am I today?” And then meeting them where they’re at and then helping them with that maturity journey. And it is a journey. So it’d be great if we get a hundred percent on December 2nd, 2023, but frankly, if we get over 50, I call it a win because we will be learning and iterating on what we learn. And that alone will drive the right kind of behavior change.

Well, I love the thoughtfulness around giving teams the choice, the agency to decide, and also love that you mentioned that by giving teams this choice, it will potentially force your team to reflect on, “How do we improve the platform to win over more teams?” I had a similar conversation recently with a leader at Wayfair who said the same thing about how he thought about adoption of developer platforms. So thanks for those thoughts. We’ve talked a lot about how you’ve managed stakeholders in terms of developers and how you’ve interfaced with those teams. I’d love to ask you about who the other stakeholders are in this and then ask you about how you’re managing those relationships. I imagine InfoSec, finance, compliance, legal maybe. Who’s involved in this outside of the application engineering teams?

All of those are definitely involved. Legal, compliance, finance. We actually, as part of the design sprint process, when we were designing all these things, we brought a number of those folks in. We had folks from our security teams; we are driving towards the DevSecOps model and that was key to us to make sure they’re not only included, but at the tip of the spear of that conversation. Our finance teams have been with us the whole way. One of the things they’ve thought a lot about is one, “How do we really use this platform capability to drive growth in the business? And what do you really need to make that true? And when it materializes, what does governance really start to look like?”

I think the other teams that have been in our periphery a little bit have been the sales and marketing teams because they have a really deep connection and relationship with our customers and a lot of the things that end up on our roadmap kind of come in through a lot of those lines. And so understanding where capability needed to exist and what our top three priorities were going to be for 2023 as a company fed into our roadmap in terms of where we wanted to get that done. Actually, a material change was that originally, when we mapped this out, we’re like, “Oh, it’s June. July is when we’re going to get this.” And we’re like, “Based on what we’re trying to deliver and what the company’s trying to achieve, it’s too late.” So we went, took another cut at it, and really fine-tuned what MVP was and landed on March-April. So yeah, that was another big interaction.

I think the last big one is just ELT, our executive leadership team. They’ve been our biggest supporters, our biggest promoters. Our CEO, Claude, is just an incredible guy and he literally reviews our plans and talks through what we’re trying to learn and he stepped into some of the sessions and design sessions. So it’s been great to have that level of collaboration with the C-suite. Our CTO has been at the forefront of this as well as he’s thinking about a lot of the evolution of the SDLC process. How does that vary with what we’re doing on the platform side?

So those are, I think, the various personas that we’ve interacted with building this. It’s just been really fun to capture that. I’m trying to think if there’s another one. Oh yeah, I was thinking on the finance team specifically, I remember one of the accountants was like, “So are we going to have financial governance?” We’re like, “Yes, it’s awesome.” And then we layered out into, “What does that look like to you?” And then we talked about what’s capable, how fast we can get those things. And I love how those conversations have influenced our timelines and getting visibility into that at the right levels.

Well, I love all the sort of anecdotes you’ve shared and the stakeholders you’ve listed. I’d love to double-click a little more on the finance piece. For people listening who are going through a similar journey, I’d love to really pick apart a few maybe concrete headwinds they may face or strategies that you’ve been able to use to kind of really partner closely with finance. So you mentioned that governance piece. Sounds like that was a big win for finance. What have been some other concerns, issues, even just concerns around overall costs that have come up that you’ve been able to navigate through?

Finance sits at a very incredible intersection of every business. And I always kind of consider them as the canary in the coal mine, right? Because they understand the business at a level that is just incredible. I kind of feel like we all need to be closer to finance and go take those courses in college, it’s worth it, because they just have an incredible viewpoint on material impact and business value. And so a lot of the conversations early on, actually, as we were forming up the strategy, I was sharing literally on a day-to-day basis updates around the strategy with my finance partner. And I was having conversations with our CFO around his observations in terms of things that we’ve tried in the past and where it caught us, the investments we’ve made in the past and where it didn’t really materialize or where there were moonshots that we could learn from that he appreciated.

Our CFO is an interesting guy, Tony, because he also comes from a technology background and so he knows this stuff and we can talk shop at a different level that I really appreciate. And I think really my biggest advice would be they’re your friends, right? They are ultimately focused on business growth and they’re going to ask hard questions around that. They’re great questions, but they’re hard questions around, “What is the ROI of this? What is the value of this?” And really helping you to crystallize and strip away any of the noise, any of the distraction to just get down to brass tacks, “What is this going to enable, change, or deliver to our customers? And if it’s not going to do that, do we really need it?” And I appreciate that lens and we’ve done that, again, like I said, it was a daily conversation as we’ve moved to another rev of a deck or if we got another level of understanding or we saw a different how that might impact our hiring forecasts or our needs going into the year, and we did try to look at this in terms of three things.

What is the current spend associated with 2022? What is the predicted spend associated with 2023 and what does that get us from an ROI perspective? And what does this team and what does this capability and spend look like in 2024? 

We actually went three years out to really think through that. We started that conversation in August. And so what’s great is we all understand the milestones, we all understand the drivers, we all understand the strategy, and as we’ve encountered challenges or as we’ve learned things, we’ve had to pivot on certain approaches, finance has been there along the way, like, “Oh, but are we still going to get that?” Yes. Cool. Okay. “Well, how does that change when we’re going to get it?” “Well, actually I think it makes it faster.” Okay, cool. Awesome. And we’re having that as a live dialogue so that there aren’t any surprises. I think of them as the best accountability partner you could ever have.

And I’m big on accountability. I want to hold myself accountable to deliver for my customers. The only way to be empathetic is to put something on the line. So I think they’ve been a fantastic partner in making sure that it’s thoughtful, it’s valuable, there’s accountability baked into it.

The last piece on this that I think finance is incredibly helpful with is in thinking through actually how to financially achieve this stuff. We had levers to pull that I would not have been aware of if they were not part of this conversation from the beginning. And that really has been awesome because when you go into a conversation with your CEO or with the C-suite and finance is saying it for you and with you, it’s so much of an easier conversation because building these things is not inexpensive. There is material cost to it and there’s long-term expense that’s associated with it as you staff up a team, right, and go from one person to 26. But because we’ve been able to do it together and we’ve been able to stay focused on the metrics that matter to the business, it’s been a really enjoyable process. And I think, frankly, I’m sort of astounded that we’ve been able to do all of this in four months, and I think it’s only because we work with partners like finance to make this real.

That makes a lot of sense. I think that’s great advice for people to think about and really take seriously partnering with finance. I think a lot of people often see them as a gatekeeper rather than a potential strategic partner in, like you said, even winning over the executive leadership team. So love that advice and, Ian, this conversation’s been full of insights. Really excited to follow the rest of your journey as you go on. Lots of great tips and insights shared here today. Thanks so much for coming on the show. Really enjoyed speaking with you.

Thank you. It’s been a pleasure. I really appreciate it. It’s fun to talk about this stuff.