Transcript
Abi: I just listened to your talk from PlatformCon today. I have to say I really love the way you think about platform work and developer experience, because you’re clearly someone who’s recognized the similarities between the challenges of internal facing product work and external facing product work. It seems like you’ve really gone all in on achieving mastery in this domain. I understand your story. You spent a number of years working on platform at Netflix, where you had a bunch of big learnings.
Now, you’ve recently started a new role at Doma, where you’ve applied a lot of these learnings. Today, I’d love to first start with your experience at Netflix, and then tie it back to what you’re working on today.
Michael: Great. Platform engineering, just as a high level, it’s a super interesting space. Anybody that looks at the hype cycle will see that it’s definitely in the early stages of rising, and there’s a whole lot of interest in this space, and I’m excited by that. My journey in platform engineering, I had bits and pieces throughout my career, but it really took off at Netflix when I had the opportunity to move into delivery engineering there. Delivery engineering is part of platform engineering at Netflix. Platform at Netflix roughly has two major organizations: the productivity organization and the infrastructure organization.
There are some changes that have happened since then, of course, where data also has a footprint as well. But for the most part, it was that way when I was there. I was part of the productivity organization focused on delivery engineering, which meant, largely, the product that most folks associate, Spinnaker, so managing teams that worked on the Spinnaker product, and really bringing about the next generation of solutions when it comes to delivery. One of the bigger projects that we were leading near the end of my time there was a project called Managed Delivery, which I believe the domain still exists, managed.delivery.
In any event, during that time that I was in productivity there, we did explore a lot of concepts, and I’m excited that we’ll be able to talk about today in relation to how you understand what your customers need. This was a big problem for Netflix and, I think, with most platform organizations, where you end up really focusing on the things that you know are problems or believe are horizontal problems, but how you see those problems may not be how your customers see those problems, and certainly not necessarily they will see the priority of those problems in the same way. It’s an interesting challenge.
Most of what I know about platform engineering really grew from that experience there. So when I brought it over to Doma to really help lead the shape of the platform engineering organization at Doma, I was able to leverage a lot of what we learned at Netflix.
You mentioned Spinnaker. I’m familiar with it, but for those who are listening, can you just quickly describe what Spinnaker is?
Spinnaker started off as really an open source project between Google and Netflix. I want to say as far back, I want to say it was around 2014, maybe 2015. I’d have to check the exact times. But really, it was a way to present continuous delivery more accessibly for engineering. By that, it meant it really introduced an interface that focused on providing pipelines as a first class experience, and so you could assemble your workflow just through the use of a very visual UI experience that was, like I said, pipeline based.
That made it much easier for people to visualize, conceive, and implement delivery workflows. Because it made it easy to do that, it allowed us to do things that were historically more complicated, like introducing canaries into your delivery workflows, or introducing more complex designs into your workflows because maybe your application needed it. So, it made a fairly complicated blue-green, or, as we called it at Netflix because we were Netflix, red-black deployment strategy accessible to every engineering team without needing to be an expert in delivery.
Got it. Your team was primarily focused on this tool. You’ve mentioned to me that there was some point during your time at Netflix where you realized maybe things weren’t working so well, right? I think you mentioned the org asking, sort of, “What is it that you do?” Can you dive into that eureka moment where you realized something was out of alignment, in your words?
It’s a really interesting problem. So when I was at Netflix, it was the first time that I had ever heard of the concept of “You build it, and they will come” as it applies to platform organizations. I’d heard this in the past in relation to products, but it really actually is a big pitfall that can easily happen, or a trap that can easily happen.
So it’s a common pitfall that can happen where teams get so focused on a set of problems that they start going heads down for months or even quarters, and nobody outside of the platform organization, sometimes even other teams within platform aren’t really clear why they’re working on this, or what they’re trying to achieve or what the problem is. This became an issue at Netflix primarily, because teams had asks that we were saying no to because we didn’t have bandwidth for it. It wasn’t specific to delivery engineering or anything. This was a platform engineering challenge.
Different teams did balance this in different ways, but nevertheless, it got to points where our other stakeholders in the product engineering organizations didn’t understand why it’s taking us so long to implement x. A good example of this was a thing called feature branches that took us a long time to come out with a strategy for, which has to do with just as a technique to enable engineering teams to do fast feedback on their development workflows. These were asks that went unanswered for a long period of time.
So once that started to build up, you started to get a lot of pressure from other parts of the organization saying, “What are you guys doing then? What are you here for, and what value do you provide?” That hit a really eye-opening moment within the platform organization, where we realized that there is this big disconnect between these sets of asks and what we’re actually delivering and how we’re communicating value. So when we first tried to tackle it, we thought, “Well, maybe we just need to communicate our value better.”
Then we realized over time that, “Nope, it wasn’t about our communication. We can communicate to the day’s end, but if we’re talking about things that don’t matter to them, they’re not going to listen to it. They’re not going to hear it. It’s going to be hard to connect those dots.” What ended up happening was we basically started to follow an approach that really was defined in this book called Customer-Driven Playbook. What that essentially is, is a framework for gathering insights and input, basically data. What do your customers care about? What is their world like?
Gathering all of this information, identifying problems, proposing hypotheses to solve those problems, and then proposing solutions to address those problems. Throughout that life cycle, you are constantly engaging with your customers to validate that yes, this is a meaningful problem. Yes, this is a solution that resonates with them. So throughout the whole process, the value you are providing is obvious and is connected to what they need from you.
That’s so interesting. I’d love to zoom in to this journey from point A to point B, from the moment where the existence of your team was challenged or at least questioned to this adoption of a different approach. First, I’m curious to ask, when you mentioned that there were questions around what is this team even doing? Was that a result of just complete… Was there downward pressure on the business to cut costs? I’m just curious where that pressure came from.
I mean, part of it is when you have organizational changes that occur in different parts of the business, lack of progress in certain areas, or work that’s done on the product engineering sides that people say, “Why are we the ones doing this? This feels like this should be a platform-owned project.” All of those start to build up the frequency at which those questions get asked. When you have leadership changes, we had some leadership changes that forced us also to reevaluate what our purpose here is and what we’re trying to accomplish.
That’s when I think we started to really realize, “Hey, we actually don’t have a lot of great answers.” When we go into the room and we start talking to leaders of other organizations, it doesn’t feel like the things that we’re talking about are connecting with them. You can sense a frustration. There’s X number of tickets that haven’t been addressed yet, or there’s X number of questions or open asks. I think it wasn’t like there was one cataclysmic event, but I would say that it was a sequence of leadership changes as well as organizational adjustments that really brought this to a head.
The catalyst, I think, for us to approach it from the customer-driven perspective though was a decision that was made to bring in product management into the platform organization. So once product management became a concept in platform, that’s when all of a sudden this idea of, “Hey, we need to actually treat platform like a product,” became more obvious and became more apparent. Then we learned how to do it effectively. I would say that we didn’t feel like we had necessarily something we could borrow from somebody else, so we ended up just trying to pick up and figure this out, and evolve our approach.
That’s really interesting. To be blunt, it sounds like your customers were frustrated, and perhaps your roadmap felt in misalignment or disconnected from their needs. You brought in product management, and you mentioned you had to learn as a platform group how to be customer driven. Tell us more. I mean, what did that look like? This book, the Customer-Driven Playbook, I’ve heard about it. Was this just a book that was on every product manager’s bookshelf at Netflix, and they just said, “Hey, here, the rest of your engineers, read this?”
Where did that book come from? Did your whole team read it?
Well, yeah. Actually, we were all given copies of it. We all were essentially tasked. We assembled a working group of individuals across productivity, and took various roles in terms of driving the process of getting definition around the problems and the solution definitions. I would say one other bit that maybe is a useful context as well is even before we really dove into the customer-driven aspects of it, we had some individuals that took on product management roles that had a good understanding of gathering context and customer pain from their previous experiences.
Part of this had to do with the fact that they came from the product engineering organization, so they brought some of those grievances into our organization. I think that that also helped provide light on, “Hey, this is an issue that we should be taking a look at.” When that happened, some of those individuals actually ended up driving, I think, what was the most impactful piece to this entire journey, which was going and starting to do deep dive customer interviews, literally recording an hour-long conversation with an engineer, and diving into questions like, “Tell us about your working environment. Tell us about the tools that you like to use. Tell us… What are the steps you go through when you want to create a new application?”
Really just getting an understanding of how they operate today. What are they doing? What are they working on? What is their feeling of pain? What is their way of work… What is their environment work style like to get context? These were recorded. These were transcribed, and then we leveraged… I mean, I think there was maybe 100 of them that were done. A lot of us deep dove into those, and we pulled out phrases and data and information and quotes from these to start collecting themes of areas that needed investment. When we started to get that sense of, “Okay, here are some of the big items that are coming up. Discoverability is a problem.”
Clarity in terms of which tool I should use for what problem is an issue. The sprawl of services that we have is an issue. So, some of these things started to pop up as themes. I think, once we had that information, then it started to become a matter of identifying frameworks that would be maybe helpful from this. I was not directly involved in the decision around the customer-driven playbook approach. That was brought in from the product management team, but it absolutely helped us put shape to, “Now that we have all these signals, what do we do to move these into areas of investment?”
That’s really interesting to hear. I’m, first of all, curious to ask. I don’t know how involved you were in those interviews, but you mentioned they were recorded. You mentioned you did about 100. So, for others looking to employ the same practice, did the book have a pretty fixed interview structure or template, or how did you design these interviews?
It’s a great question. I will say I ended up repeating the same process when I came to Doma. I found it so immensely valuable to gather this context. So when we did, we actually drafted an interview guide based off of what we had done with the customer-driven approach, and the Customer-Driven Playbook does provide you some guidance and structure around this. We put together our own version of that at Doma, which is pretty much very, very similar, I would say, to the way we did it at Netflix. Maybe a little bit of adjustment.
I do have a guide and a set of questions that I’d be more than happy to share, and that you can include if you would like. But the general takeaway here is these interviews are done in a common way that you would normally see a customer research interview style. It’s very much one interviewer with one interviewee. Everybody else is essentially… If anybody else sits in, their cameras are off. They’re muted. They’re not allowed to directly engage. They’re just in the background just to listen for context. If they have any questions, they can send it through Slack to the interviewer who may or may not decide to ask that question.
I did actually sit in on a number of the interviews. I did a few of them myself at Doma. This format was extremely effective in terms of establishing a conversation, which is really what you want. You want that conversation to happen with the engineer, and you want to make sure that your questions are all entirely focused around curiosity. You’re not judging. It’s an interesting approach there, because you want to make sure that you’re pulling out as much information as you can, and so you avoid things like, “Why did you do X? Why do you do Y?”
As opposed to, “How do you solve this? What are your biggest pain points, or if you had a magic wand, what might you eliminate out of your development experience that you wish could go away today or something like that?” You’re really just trying to solicit a lot of this insight, because you’re going to analyze it later. What you’re looking for is signals. You’re not trying to change what somebody’s doing today, or try to question their approach. They may not understand how to use your product. That’s not their fault. That’s yours, and so you want to encourage that insight.
Right. You mentioned 100 interviews-ish, right, and probably more, I believe, after you left Netflix, but I mean who were you having? Was it just product managers doing these interviews? I mean, how many weeks did it take to get through 100? How distributed was the work across the team, I suppose I’m asking?
That’s a great question. We did this at Doma. You want individuals within the platform team involved in different ways. There’s lots of different ways. One way is to be the actual interviewer. Another way is to be a note taker that’s silently sitting there, and you can schedule. There’s lots of different roles that you can play in this process. At Netflix, it took I would say months, I think. It took several months. It’s very hard to schedule these things, but all of these conversations are immensely valuable.
The big challenge we had, when you’re looking at 100 interviews that are an hour long each, is it becomes virtually impossible to have one person who can listen to all of them, and gather that context. So even at Doma, I think we did about 20 of them. We ended up then dividing up the responsibility of listening to the interview after the interview, and condensing that down into summary takeaways, and converting it essentially from conversation into some structured metadata that you could use to start finding signals or structured data, rather, structured information.
This was a big problem at Netflix. We actually talked about tools or techniques that we could potentially use to do this more efficiently. We never really arrived at an answer for that, so we ended up as human effort having to go through and really categorize the information, identify individual quotes that were relevant to different pain points and so on.
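As a rough illustration of what "converting conversation into structured data" could look like, here is a minimal sketch. It assumes each human reviewer records a theme label and a verbatim quote per excerpt; the `CodedQuote` fields, the `theme_reach` helper, the interview IDs, and the sample quotes are all made-up placeholders, not anything described in the interview. The theme names echo the ones mentioned earlier (discoverability, tool clarity).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedQuote:
    """One excerpt tagged by a human reviewer (hypothetical structure)."""
    interview_id: str  # placeholder identifier for a recorded session
    theme: str         # label the reviewer assigned to the excerpt
    quote: str         # verbatim excerpt from the transcript


def theme_reach(quotes):
    """Count how many distinct interviews mention each theme."""
    # De-duplicate so one talkative interviewee does not inflate a theme.
    seen = {(q.interview_id, q.theme) for q in quotes}
    return Counter(theme for _, theme in seen)


# Invented sample data, for illustration only.
quotes = [
    CodedQuote("int-01", "discoverability", "I didn't know that tool existed."),
    CodedQuote("int-01", "discoverability", "Where are the docs for this?"),
    CodedQuote("int-02", "discoverability", "I found it by asking around."),
    CodedQuote("int-02", "tool clarity", "Which tool do I use for this?"),
]
reach = theme_reach(quotes)
```

Counting distinct interviews per theme, rather than raw quotes, is one possible design choice: it surfaces how widespread a pain point is instead of how much one person talked about it.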
Well, first of all, I love that practice you shared of involving team members in different ways. They might not be the interviewer, but you had them maybe be the note taker for example. I love that suggestion. I was also chuckling to myself, because I’ve done qualitative research, and as you mentioned, the process of coding interviews into structured data is intensive. I was going to ask you if you had used some tool for that, but it sounds like it was just human effort.
I mean, at Netflix, sometimes I think there’s the mystique about Netflix as this company that just can throw a powerhouse of machine learning at whatever. In truth, I suspect that we probably could have encouraged some team to jump on it, and spend a hack day trying to figure something out. But the reality is I think there’s no substitute for actually having people listen through it, and identify this information. The other reason, besides it just being really hard to do that programmatically, is that the person listening to the interview and categorizing this information often will come out of that experience with a completely different view of the world.
Because all of a sudden, they’re hearing things that they say, “Oh my God, I had no idea that this was difficult for them.” They might find themselves screaming at the screen, “Why didn’t you just push the button? It’s up at the top right.” But the thing is that they walk out of this realizing, “I thought that it was so easy to use our experience. It’s not. I thought that their problem was A, and it’s actually B.” So, really, what this does is it causes folks to come through with a much better understanding of what our customers care about.
They become more passionate customer advocates, and they’re much more empathetic when customers come back into channels later, and express a frustration because they’ve actually now really heard how what they’ve been working on interacts with the lives of our engineers and developers.
I love that observation about the value of that just face-to-face experience sharing, if you want to call it. I’m curious to know. You went through this process at Netflix, and clearly sounds like it had a big impact on the way you viewed how platform work should be approached. You then started this new role at Doma. Tell me how have you applied these principles at Doma, and also just share a little bit about the company and the definition of your charter with your new org there.
Well, Doma is… It’s a proptech company in the home closing space. It really is focused on anybody that’s bought a house, or has gone through the home ownership process. I’m sure you can relate to the absolutely atrocious and complex process that a home closing looks like. Emailing off highly sensitive documents, and hoping everything lines up, and then rushing to get it all signed. It’s a very stressful time in everybody’s life, and so Doma is focused on making that a much more modern and automated approach, really driven by lots of modern technology, which is applied to a 100-year-old industry.
It’s an interesting space. I’m a big fan of disrupting areas that really need disruption. This one is definitely one of those areas. So coming over to the platform engineering organization, when I joined Doma, there was a set of teams that were focused on platforming kinds of concepts. So developer experience, there was a team focused on that. There was a team focused on cloud infrastructure and some of the early designs of our Kubernetes layer in Azure. We had an early product security team and a central test automation or quality automation team that provided frameworks for teams. It was an interesting combination.
The thing was is that when I arrived, even though these teams all were adjacent to each other, there wasn’t a clear understanding of what platform is here for. So, one of the things I talk about in my talk is specifically this challenge, that when platform lacks a clearly-articulated purpose and a clear focus, as an organization, not from the individual teams, but as an organization, what ends up happening is you end up having either missing expectations from customers, because they expect something, because they think it sounds like it should be a platform thing.
It shouldn’t be theirs, or you effectively get into the more nefarious situation, which is when you’re in front of a senior leadership or a CTO, and they’re not really clear why you have the number of engineers you have in your organization. What value are you providing? To them, you’re not delivering on a feature or a product necessarily. What are you actually doing? Frankly, it was a small example of the same thing that in some ways was happening at Netflix. So when I saw it, it smelled very, very similar to that.
We took the same approach that we did there, which was our start of our journey of really clarifying who we are and the value we provide.
I’m curious, I mean, you described the process at Netflix of doing those customer interviews. It sounds like that’s one of the first things you did at Doma. Was it as intensive? I mean, was it, again, a multi-month effort, or what flavor of it did you execute on at Doma?
I mean, this first step of gathering this information, these customer interviews, is something that you don’t want to rush. You need… First of all, it’s new for lots of folks, especially in the platform space. They don’t necessarily typically do this activity. I think when you’re looking at rolling something like this out, you should have an expectation that this will take at least a month, maybe two, depending on how many folks you decide to interview and how you establish this. There’s playbooks now. I mean, like I mentioned, the docs I have can help.
That book is useful as well. But at the end of the day, you still need to get the folks that are going to run this together. You need to connect them with why you’re doing this, and practice doing it a little bit so that you get quality interviews. It took us, I think, I want to say about two months it did for us from the, “We need to do this,” to, “We’ve completed all of our interviews.” It’s just because these things take a little time for folks to be effective at.
Well, I’d love to know someone who’s never spent this much time interviewing internal customers. I’d love to hear just some of the raw learnings. What did you hear when you went into this organization, started asking developers about their experience?
One thing that was a very interesting insight, so at Netflix, the spin up of a new application or a new service, that was a common activity that people did. So if you had an idea, you use the internal CLI tool called Newt. You newt in it, and you would… Some application type, right? I’m doing a JavaScript backend service or something like that. Then it would scaffold everything up. It would set up a rudimentary pipeline in Spinnaker. It would put the footprint into GitHub, and set up all the basic configuration you needed.
Once you did that, you essentially had a skeleton application that was a hop, skip, and a jump away from just being able to push, and have it get deployed to some staging environment essentially. More or less, that’s how it usually worked. It literally meant you went from idea to some running code within minutes, maybe 15 minutes all depending. One of the questions we asked at Doma in our questionnaire was, “How long does it take you to create a new application?” The vast majority of teams that we talked to, individuals had said, “Well, I haven’t had to create a new application in months. I think the last time I did it, it took two weeks, and I had to talk to X, Y, Z individuals.”
It was fascinating to hear this, because that was just not a problem that people had to solve here at Doma. It took a lot for me to chew on that, because my fundamental understanding as to how do you evolve something was based on the idea that you could start a new application, and roll with it. This was a very interesting insight, and once I found that out, it led me to more conversations after the interviews. When you have new functionality that you need to implement, where does it go? How do you make that decision? Many teams, at that time, were following a pattern of just adding that functionality to existing services, which can be a problem depending on how teams want to evolve.
The bigger issue is if they wanted to spin up a new application, in their minds, it already took two weeks. So gosh, even if it was a good idea, even if I should put this into a separate microservice, I don’t really want to, because that’s a big lift, two weeks. That, all of a sudden, revealed an interesting insight that if we had just asked all of our teams, “Hey, what do you want us to work on?” That might not have been something that they would’ve brought up. But if we had asked, “Hey, when you want to roll out a new idea, when you have something new that you want to put out there, how would you make that happen?”
Even if they knew it should go into a new service, that would be a big lift and a big barrier if they decided to do it. So, it became clear that doing that kind of work, simplifying that workflow had the potential to add a lot of value for the business, because it was a path that they were unable to take, and many of them actually, it could make sense to them to do it, and they couldn’t do it easily. So, it was very helpful to get that insight.
Well, thanks so much for sharing all that insight. That sounds like an enlightening experience for you as you joined the company. You just touched on something that I found really interesting, which was you mentioned that had you just asked developers across the company, what’s the number one thing you want us to work on or fix? It might have been different than what your interviews led you to discover as probably being a more important strategic problem. So, how do you think about that today?
It’s a great question. I think, a mistake… I’m glad you brought this up, because a mistake in thinking about what these interviews are for is to take them and immediately translate them into a backlog of asks. I would actually generally say you already have that; you don’t need to do these interviews for it. You have the support channels. Your customers are already making requests. You can go after all of those items if you want, but that’s an anti-pattern from a platform organization standpoint. If you just run after a list of asks that come in from your support channels, you have no strategic benefit.
You are just chipping away at immediate pain, and the only way that you can help that organization scale is by purely adding more people to your platform engineering organization to handle more of the backlog. But what you actually want is a situation where your platform is looking for opportunities to eliminate whole classes of work. That’s how your platform organization can scale sublinearly to the rest of your engineering organization, which is exactly what your target should be. So, when you are listening to these interviews, you’re not necessarily looking for the surface signal. I don’t like X. Y is painful.
What you’re looking at more is about how they are solving their problems. What are they doing? What is their workflow like? Most of the time, engineers, especially after they’ve been at a place long enough, have developed workarounds and scripts and solutions to solve the most immediate pains, right? Then the rest of them, the dent in their foot from the pebble in their shoe has just become a known experience, and then they just move on. So what you actually need to find is, “Oh my god, that pebble in their shoe is actually keeping them from running. That’s why they’re not running is because they have that pebble.”
The answer is we need to remove the pebble, and then they have the impediments removed from them to be able to run. It’s hard to do that without understanding the activities and workflows that engineers actually need to do in order to innovate and get work done. That’s where you really do need engineers to be involved in these interviews, because they’re the ones that can see that and say, “Well, wait, if they’re having to do these 15 steps in order to get context for a production issue, that’s going to keep them from checking those issues on a regular basis, or from publishing as frequently because it’s hard to debug in production.”
So, you start to stitch together some theories around cause and effect for where the business needs to go based off of the insights that you learned from those interviews.
Well, I love that analogy, and I love everything you just shared on how you approach this. You mentioned to me that at Netflix, you use the term leverage as a way of thinking about, “Where are those pebbles in the shoes, if you will?” Can you share your thinking? You mentioned you have a rough formula around that. How do you think about prioritizing or ranking these different potential investments?
Right. This is a classic problem. I remember when I first joined platform engineering at Netflix, we had this all hands. We discussed this concept of leverage, right? Leverage, that’s how we justify the things we work on. This was before we had product managers involved and so on. At a very high level, I think, everybody understands this idea that if I spend six hours of time or six months of effort, it should impact some percent of the organization. I should be a force multiplier for the rest of the organization. Somehow, that work should enable more out of the organization than I am putting into it, right?
The challenge is that thinking about how to measure that, how to understand that is constantly up for debate, because it’s not an easily measurable thing. What I have generally found in the experience that I’ve had at Netflix as well as at Doma has to do with this idea that really leverage breaks down to two different bits. Who is impacted? Basically, how many people are impacted, or how many teams are impacted, and how often are they impacted by this work that you’re about to do? That’s a bit of a leverage, but you multiply that leverage by the impact. Impact is actually… It comes from a variety of different things.
How much cognitive load are you alleviating? How much manual toil are you eliminating? Really, the way that you can understand that depends on a few different factors like the engineering maturity that you have as an organization, the risk profile that you’re willing to accept as an organization, and I would say the complexity that your organization exists in. Some of those are elements that as you start to put them together, and you say, “Okay, if we work on this problem, and this problem will be utilized by 50% of the organization, and it alleviates the need to understand…”
I’ll give you a good example of this, alleviate the understanding of some security concept within the business. That’s a pretty impactful… That’s a pretty meaningful thing to work on. It may not be something that any of those individual teams surfaces as the most important thing. But as a horizontal, it’s very clear that there is value to invest there. We actually had exactly this problem at Netflix. There was a concept called security groups in AWS, which is a very nebulous concept for engineering. It’s not something that just pops up top of mind when they are setting up a new application. What should the security groups be for my application?
We’ve abstracted so much of the underlying infrastructure and concepts from people that asking that question used to force them to actually deep dive down. I remember actually when I first joined, you actually had to provide IP address ranges instead of even just talking about it at a security group level. “I want my application A to be able to communicate with other applications within the security group.” Now, we abstracted a lot of those concepts so that you could get closer to saying, “My application should be able to talk to this other application,” which is really all that you think of when you’re actually designing these things.
You’re not thinking about the underlying policies. Every team, every service had to define these. If you didn’t define them correctly, it either wouldn’t communicate effectively, or it would be too open, which was a security problem, but it’s a one-time pain. You did it, and you moved on. The problem is that every team, every person did that, and had to deal with that every single time. It’s an example of a pebble. It’s not keeping you from doing your work, but it’s something that knocks on you and knocks on you and knocks on you.
That’s one way to think about prioritization: really looking at that. The other piece, though, that I think a lot of platform organizations don’t… Sometimes they really get it, and sometimes I think they set it aside, because it doesn’t fit into the leverage equation. You need this other bit that I call strategic value. This one’s a bit trickier, because this one… The way I like to see it is leverage times impact as one part of the equation, and then parens around that, plus strategic value.
The reason why strategic value is outside of the parens is that you need it as a variable that, if it’s large enough, essentially equates to, “We should still do this activity,” because there are certain things that are not high leverage that you absolutely want to do. So you could have a high impact, but if it only impacts one guy, and it has no strategic value, it probably is not worth it for a platform organization. That makes a lot of sense. Similarly, if you have high leverage and virtually no impact, very small impact, it may not necessarily be the right thing to do. You really want high leverage and medium to high impact for it to make sense.
The strategic value stuff though is where it’s an opportunity for you to solve a meaningful problem maybe for a small part of the business, but that that small part of the business is a strategic differentiator for the business. We did this at Netflix too. This was an example. A lot of folks associate Netflix with using canaries everywhere. Canaries are a mechanism during delivery where you slowly introduce the new version of your software into production traffic, and then can detect whether or not after a period of time that change was a good change to make or not.
If it’s not a good change, you can roll it back. If it is, you can continue to progress it and move it out there. Ultimately, it’s an experimentation platform. I’m already hearing all my resilience friends out there probably screaming at the screen. There’s a lot more to it than what I just described, but canaries are very useful for that from a delivery standpoint. That usage was not across the entire company, because you don’t get enough traffic across all aspects of the company, but it was very specific to a part of the business, and it was very valuable to a part of the business.
Should platform engineering focus on that? The answer is yes, because alleviating pain from that is a strategic differentiator, and has a whole lot of value, even if the size of the audience that you would be impacting is maybe relatively small. It’s a factor that you have to consider in your overall investment equation.
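The rough formula described above, (leverage × impact) + strategic value, can be written down as a small sketch. Everything concrete here is an assumption for illustration: the field names, the 0 to 1 and 0 to 10 scales, and the scores given to each candidate are hypothetical and not figures from Netflix or Doma. Only the shape of the formula and the "50% of the organization" reach come from the conversation.

```python
from dataclasses import dataclass


@dataclass
class Investment:
    name: str
    reach: float            # fraction of teams/people affected, 0.0-1.0 (assumed scale)
    frequency: float        # how often they hit the problem, 0.0-1.0 (assumed scale)
    impact: float           # cognitive load / toil removed, 0-10 (assumed scale)
    strategic_value: float  # importance to the business, 0-10 (assumed scale)


def priority(inv: Investment) -> float:
    # Leverage is "who is impacted" times "how often they are impacted".
    leverage = inv.reach * inv.frequency
    # Strategic value sits outside the parens so a large value can win on its own.
    return (leverage * inv.impact) + inv.strategic_value


# Hypothetical scores, chosen only to show how the terms interact.
candidates = [
    Investment("security-group abstraction", reach=0.5, frequency=0.5,
               impact=8, strategic_value=0),
    Investment("canary tooling", reach=0.1, frequency=0.5,
               impact=8, strategic_value=5),
    Investment("one-off request", reach=0.02, frequency=0.5,
               impact=9, strategic_value=0),
]

ranked = sorted(candidates, key=priority, reverse=True)
for inv in ranked:
    print(f"{inv.name}: {priority(inv):.2f}")
```

With these made-up numbers, the canary work ranks first despite its small audience because the strategic term is additive, the broad security-group work ranks second on leverage alone, and the high-impact request that affects almost nobody ranks last.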
I love that. What’s clear is that deciding your strategy is a tricky, nuanced problem. You mentioned… You keep using this term strategic value. I want to ask you more about that, because, I mean, today… Well, this week, there was that announcement, I don’t know if you’ve heard about it, about Amazon building a new builder experience team to really focus on clearing away the muck, if you will, of things that are slowing down their developers.
Most product organizations have a set of north stars, whether it’s an NPS score or engagement or ARR. So for a platform team or a developer experience group, when you talk about strategic value, what is that? What is the strategic value?
Platform isn’t an island unto itself, right? When you think about strategic value, it’s not… I think of it less as strategic value from a platform standpoint. It’s more about what the business sees as the most important outcomes that it needs to achieve. So, platform may need to focus on a part of the business, and alleviate that significant pain from a part of the business in order for that part of the business to be unlocked and be able to execute and run faster, which may mean that you don’t strictly adhere to a pure horizontal vision or view of the world, but that you might have a bit more of a mountainous landscape, if you will.
Something where, for certain parts of the organization, you might have a higher layer of abstraction, and you maintain that for that part of the organization, because that part of the organization is so… It has a specific need that unlocks business value. So when I think about strategic value, I really think about it more as, “What are the VPs, the CTOs, the directors talking about that needs to actually happen for the business, and what are the tentpole problems or challenges that the organization deals with?”
So understanding that within the space allows you to find ways to provide maybe a unique platform level solution to those. There’s two different ways that can happen, actually. At Netflix, we had this concept called local central teams, where you actually had a platform-ish team embedded within those organizations that might do some of that closer to the domain platform work. In theory, that work could potentially graduate to the more horizontal platform layer. Depending on your organization, that may be a viable strategy.
But I think at the end of the day, the real point is having a platform focus on verticals that need higher levels of abstraction, it shouldn’t preclude your platform organization from thinking about how to engage in that. Even if it may mean that you do graduate, say, something from that local central team, that may still not quite spread horizontally. But by them graduating it to your platform organization, you unlock them to even raise that layer higher, or provide more services or value. There’s opportunities there, I think, that you can’t ignore.
If you’re too, I would say… Gosh, I want to say too… I’m trying to use a term that would be kind. If you’re too strict, let’s just say it that way. If you’re too strict on what you will and won’t do, and you are ignoring what the business needs, you end up falling into a trap as an organization, unless you are delivering overwhelming value.
Well, one of the things I’ve taken away from this conversation is that when you talk about strategic value, it really starts with the business. Like you mentioned, it’s not just, “What is the strategic value of the platform team?” It’s, “What does the business care about? Which areas or aspects of the business do the leaders of the business want to amplify and accelerate?” That is where the value of platform is: to actually help drive acceleration, or force multiplication as you called it, or leverage to improve those areas. I really love that advice.
100%. Just to bookend it, I think if you think about it from the customer interview conversations, even at the beginning when you asked where did this come from at Netflix, I think one of the key points is senior leadership in other parts of the business also were not clear what value platform was necessarily providing. You can’t forget the business, and it’s easy in platform to fall into that trap unintentionally. You really are trying to do the right thing, but you’re disconnected from the business needs, so yep, 100%.
Well, Michael, thanks so much for coming on the show today. I really enjoyed this conversation.
Thanks for having me. This was fun.