
How Shopify's Infrastructure Team Looks for Impact

November 13, 2022 | Episode 22
Featuring Mark Côté, Shopify

In this interview, Mark Côté, the Director of Engineering of Developer Infrastructure at Shopify, explains an exercise the Infrastructure group went through to define their boundaries of work. He shares their areas of focus, the team’s guiding principles, how they use their developer happiness survey to decide what to prioritize, and more. 

If you enjoy this discussion, check out more episodes of the podcast. You can follow on iTunes, Spotify, or anywhere you listen to podcasts.

Full Transcript

Abi: Mark, thanks so much for coming on the show today. Really excited to chat.

Thank you for having me. So am I.

Well, would love to just dive in with your background, your current role. You've been at Shopify four years. What were you doing before Shopify and what's your role currently at Shopify?

Yeah, sure. So I've been in management for around 10 or 12 years. I was at Mozilla before Shopify, where I was in a similar role. I joined there as a senior engineer on their automation and tools team, which evolved in various ways into what eventually became an engineering workflow team.

And I was managing a team of about 13 people there before I left. I joined Shopify as a manager of the mobile tooling team, so a specific automation and productivity team around the mobile investment there. And then I moved up to a senior manager and then most recently as a director of developer infrastructure. So we were about 60 people devoted towards supporting the development platform at Shopify for all of engineering.

And I know infra orgs can consist of many different intricate pieces and parts, but just at a high level, how's the infra org currently broken down?

Yeah, good question. So we're part of developer acceleration, which is around a hundred people. Our focus is on accelerating Shopify development, so making Shopify developers more productive. We're split into two wings. One side is the Ruby on Rails infrastructure team, which focuses, as the name implies, on Ruby on Rails, which is our bread and butter at Shopify.

And then there's my org, which is developer infrastructure. We're split largely into the different phases of the development workflow. So if you think about it: when you're coding, you're in a development workflow, then you move into validating on CI or canaries, then you're deploying, and then you're managing your services afterwards.

We have teams representing each one of those phases so that we can concentrate on each part of the development workflow separately. And then we have a few cross-cutting teams as well: the mobile tooling team I mentioned before, which is focused specifically on mobile enablement and mobile technologies, and an internal support team called Developer Success that supports our internal engineering tooling.

And then there's another one called Code Scale, a relatively new team that handles the organizational challenges of the amount of code we have, the number of different repositories, and the sheer volume of changes that go into our code bases every day.

Well, it sounds like an inspiring effort, and you actually had me chuckling a little bit because I always go and talk to people about how there's all these different teams with different names, dev prod, dev enablement, and today's actually the first time I've ever heard the name Developer Acceleration.

And I know Shopify went through, especially in the last few years, a lot of growth in head count. So I'm curious, is this org, it sounds like a pretty sizable org. Has it grown a lot just in the last couple years or has Shopify always had a pretty large investment in developer acceleration?

I guess I would say both. The roots of this team go back far before I started, to at least 2015, which I think is when the name developer acceleration was chosen. It emerged out of a different team; we were part of the production engineering team at that point, which also handled all of Shopify's production servers and services. So there was an early investment in this area.

I think one of the things that I really like about working at Shopify is this really deliberate investment. Because of our growth and our scale, I think leadership knew early on that we had to invest in this area and that it would just continue to grow. We're building for the long term in this case, so we always need this investment. And we scaled up over the last few years along with engineering as a whole. So the investment continues, and again, that's one of the reasons I love working here.

Yeah, sounds like an awesome environment for leaders who care about developer experience and developers. I know you recently had a sort of summit with your leaders of your organization to realign around what your charter is and where you're headed from here. So first of all, I want to ask you fresh off the summit, what is your charter?

Yeah. So our charter exists to align everybody on what we do, to make sure everybody feels connected and engaged with a larger mission, and so that we understand what we are doing as a developer acceleration group.

We split it mainly into two areas: our opportunities for impact and our guiding principles. These two things together help us identify what problems we want to work on and where the impact is, and then how we make decisions around that: how we approach the problems, how we gauge the impact, that kind of thing.

Sounds like you recently got together with your team to really realign everyone around how your group can have an impact and how you're going to measure that impact. So curious to know what is the way your group thinks about having an impact on developer acceleration?

Yeah. So we've defined that part of the charter as the opportunities for impact. And we split that into three things: tightening feedback loops, reducing cognitive overhead, and scaling up engineering. To talk about each of those three: tightening feedback loops is about providing developers with the information that they need at the right time to understand their code and to make sense of their applications.

We get that kind of information from a bunch of different places, from local development, from CI, from our deploy process and from our production environment. And it's all fed back in so the developers can course correct as quickly as they can. So understanding the problems they're working on and then getting that information back so that they can make sure that they're continuing to solve that same problem and that they're not waiting overly long.

The longer you wait before you get your feedback, the more you might be distanced from your problem and the more you have to fix later. I think the simplest example of a feedback loop that every coder would know is the development cycle of coding, building, and testing, then going back around. If you look at a really productive programmer, they're running these build-test cycles really quickly.

And they're getting that information as they go, instead of writing a whole program, compiling it at the end, and then debugging it all, which is a much longer way of getting to the same end result. There are so many different feedback loops when you're developing, and they come from all these different areas. So we try to get that feedback to you in the right place at the right time so that you can make these decisions.

I love that. Well, first of all, you gave these three buckets: tightening feedback loops, reducing cognitive overhead, scaling up engineering. Having talked to a lot of teams that try to come up with charters, and having worked on conceptual frameworks around developer experience myself, I know how hard it is to boil things down this way. So I'm curious, was there a fourth or fifth candidate that didn't make the cut? Was there any debate or challenge in distilling it down to these three?

Yeah, for sure. We started with a lot of rough notes, thinking about what projects we're working on and where the similarities are. I'm a really big fan of mental models, and I like to take as much raw information as possible and then see where the patterns and commonalities are, so that we can invert that and use those categories to find new problems inside of them.

We're thinking of breaking down this charter even further to the specific teams because there are areas where we don't necessarily have commonality across the entire department. So our Ruby on Rails infrastructure team almost surprisingly, they do a lot of open source work, they contribute to the Ruby language, they contribute to the Rails core.

Developer infrastructure, we don't do that as much. We're building a development platform. And so we thought about does that fit in? Is there something in there around our open source initiatives that is at the highest level in dev accel? And we thought, yeah, probably not. I think that's more of a specific thing. Anything that we do that's open source is a little more auxiliary to our main purpose, to our main mission.

That makes sense. And I love how it sounds like you're a mental model nerd. I definitely am as well. So another question around the three buckets. I'm going to get into each of the buckets in more detail here shortly, but just topically speaking, do a lot of your projects often fall in more than one bucket at once? Is this more of a Venn diagram rather than three separate lanes?

Yeah. That's an excellent question. That's exactly what we were thinking. I think a lot of them would fall under possibly even all three of them, and definitely two. Let me talk a little bit about the two aspects of reducing cognitive overhead.

So there's so much context at Shopify. At any large engineering organization, you're just hit with this fire hose of information, and trying to figure out what information you should be paying attention to right now is very, very difficult.

So our mission is to give you the right information, again at the right time, so that we can reduce the number of decisions you have to make. There are constantly decisions that any coder has to make, any manager has to make, and the fewer decisions you have to make, the more effort you can put into the ones that matter.

So some of the overlap between feedback loops and cognitive overhead is around this kind of information. The feedback loop provides you information so that you can course correct right now, while reducing cognitive overhead provides you the right information to make the right decision right now, without paying attention to things you don't need to think about at this moment.

Yeah. That makes a lot of sense, and I can definitely see the connection there. Before I ask you a ton of questions about what you're doing in these different areas and where you hope to go, I want to ask about the charter itself. It sounds like it was a fun exercise to distill this down to this model. How is it the same as or different from what your organization's charter already was before you realigned on this?

Yeah. Well, we didn't really have one. So that would be the major difference. Up until this point we had, I guess, a combination of very short-form and very long-form things. We had a mission: our mission for developer acceleration is to make Shopify developers highly productive. That is nice and concise and a very good mission statement, but it is extremely open to interpretation.

And so we want to make sure that all of our projects are contributing towards that, but it still left wide open how we think about this, where we look for impact, and where the problems are that we need to solve. And so I had written, with a principal engineer, sort of our guidelines for how we have impact in our group and how we think about this, because impact is a necessarily nebulous word when you talk about engineering, and then you have to focus it in.

That was a multi-page, 10-page document or something on how your team should think about impact, and how you should think about impact as an individual developer or as part of your team. So we really needed something in the middle: a one-pager that says, here are some places to think about impact, here are some principles you should use to think about it, and here are areas of opportunity where you should be looking to find those problems.

Well, I really love that and I think you hit on a really important point. I just recently spoke with a leader who was previously at Netflix and he told this story about how their platform team at one point was almost dissolved because the organization came and said, "What is this team actually doing? What are they working on?"

And so it sounds like maybe you realigned the organization preemptively, before you hit a point where people are asking, "What has this team been doing?" So that's kind of my question: what prompted you to do this now, other than you just being a good leader? Were there questions coming from the outside or even the inside? Was there a lack of focus? What really drove you to do this?

Yeah. I guess it's many things. It's funny you say that, because I think developer acceleration groups or productivity groups are pretty autonomous a lot of the time. We have to set our own goals. I mean, we might get some top-down direction that's fairly broad, but a lot of the time, if things are working, if CI is running, if deploys are flowing out, there's not a lot of attention paid to us.

I don't want to say that leadership doesn't value us, because I think the investment in developer acceleration at Shopify shows that we're highly valued, but it's easy to forget about us while there are big business problems to solve for the end users of Shopify.

We wanted to get a little ahead of that so that we could communicate better to leadership like, "This is why we exist, this is how we think about our problems." And that allows them to get a broader overview of what we're doing and to focus on the right levels of abstraction, I guess you would say.

So if they're looking at an individual project and kind of wondering, "How does that fit in? I don't really understand that," they can go to our charter and understand, "Okay, this ladders up into a particular area of opportunity and this is how we're thinking about it." And we can talk about it in those terms and kind of stay out of the real nitty gritty that you would need a lot of context to understand why one of our systems needs a particular new feature in it.

We can just go back to that charter and explain all this. But it's also, as you mentioned, because we grew a lot, our groups grew a lot, and we've all been remote for years. Developer acceleration has actually always been distributed; it was one of the groups pre-pandemic that was spread across multiple countries and multiple time zones. But it's even more so now, because we used to get together in person, and that's slowed down a lot.

And so it was important for us to get everybody on the same page, not feeling siloed into individual teams, but understanding how these teams connect and fit together into some whole. Because that was a signal we were getting: people understood their exact domains, they understood their team, and they had good team relationships, but they didn't really understand what developer infrastructure is exactly, what developer acceleration is exactly, and how they fit into a bigger whole.

So it was really an attempt to make sure that our individual teams, the individual contributors and the team managers, as well as leadership, all understood what we were doing and that we could all have a form of input into making sure we're aligned on all this.

I love all those reasons, and I hope that a lot of listeners out there who are at similar stages to where you're at today, or even just getting off the ground, can take in that lesson. Because being able to communicate to the business and align internally on the impact of your group is so important for platform and infra teams.

I want to ask one more question about the charter, and that is: what outside inspiration, if any, did your group draw upon to come up with those buckets? Were you looking at something like the SPACE framework or Tim Cochran's paper, anything out there, or was this really homegrown?

I guess that's one of those things where it's probably, again, a bit of both. There was nothing very specific that we looked at. We didn't even know what the major sections of this charter were going to be. We just had raw notes, and then we thought, "Wait, there are emerging problem areas here," along two dimensions: what the problems are, and how we think about them and make decisions about them.

And then that emerged into, "Ah, we have opportunities for impact and we have guiding principles." But a lot of us read different blogs and listen to podcasts around developer productivity, and I've had multiple chats with different leaders in this space. It's funny you mentioned developer acceleration as one of the names you'd never heard of.

I talked to somebody at Wayfair who leads the developer acceleration team there, and he thought they were the only developer acceleration team out there. He was very amused that we both came to this name independently. So having these conversations with other leaders and keeping up with this space has set a lot of ideas in the back of my head that then slowly percolate to the top and emerge here.

To answer your question more directly: there was no specific framework, but there are so many inputs and so many background processes always running that whenever I sit down to write a model, things just emerge out of all these disparate ideas that have been bouncing around in the back of my head for weeks, if not months.

To add on to this conversation about these three core pillars: earlier you also mentioned you had guiding principles. So I'm curious, what are they for, and what are they?

So we came up with four. I should preface that these are still in flux. After our leadership summit last week, we had all kinds of notes and about a thousand stickies up on boards that we now need to take back and see whether these really reflect everything or whether we need to adjust them.

So these are kind of draft things. But one of them, I think, actually relates to what we were saying earlier about why the charter exists as a whole. This is a little self-referential, but one of our principles is that we have to hold ourselves accountable for measuring success. We don't have product managers; I think those are still relatively rare inside infra and productivity groups.

And so it's up to us, not to prove our worth, but to show the impact that we're having. We need to make sure that we're thinking about success and what it looks like, and that we're concentrating on a problem and not a solution, so that we can explain to other people, "Yeah, this is what we're doing and this is why we're doing it."

And we've got measurable metrics around that success. I think another very important one is that we maximize our impact with effective two-way communication. It ties into not having product managers: we have to be talking to our users all the time, communicating what we're doing and listening to what their needs are.

And this is its own sort of feedback loop, where we show that we're listening to their needs by communicating back to them: "Well, here are the plans that we have, here's what we've delivered." We're making sure that the engineers themselves feel heard. So it's this constant back and forth of understanding the problems and then communicating our solutions back to them.

What is your process for determining where to actually focus? Within these pillars, how do you identify the areas of largest impact and opportunity?

Yeah, another great question. There are many sources that we use. I mean, there's no substitute for just talking with our developers. We'll have random conversations. Sometimes we'll have office hours where people can come in and just ask us questions. There are some people in the product groups who find our group really interesting, and they'll want to sit down and just talk about our tooling and give us direct feedback, which we love and incorporate back into our plans.

But we have a couple of other more deliberate sources. One is the team in my group I mentioned called Developer Success. They staff a channel called Help Eng Infrastructure on our Slack installation, and this is where people go when they have questions about how our infrastructure works.

This team handles first-line support, answers broader questions about how to use our tools and the problems people run into, and then escalates to our individual development teams as necessary. They actually have a system for categorizing all of these support interactions, and they feed this feedback back to our development teams.

Just listening to these support questions gives us quite a lot of insight into where users are confused, where they're having problems, and where they seem to be going off the green path, either because our solutions aren't meeting their needs or because they don't understand the parameters of our systems as we set them out and are using them in unexpected ways. Another primary way we get a lot of information is our developer happiness survey.

This is something we run twice a year. We partner with a people analytics team and send it out to half the developers at a time, so every developer should get the survey once a year. We try to keep it not too long, something that takes 15 to 20 minutes. And it provides us with a wealth of both quantitative and qualitative information to help us determine where we're going to have impact.

Well, from the outside I've heard of the Shopify developer happiness survey, and I want to devote the latter half of our conversation to learning more about that. But the work you describe doing to really understand the needs of users across the organization made me think of product managers, and you've mentioned you don't have PMs as part of your group. Is that a deliberate decision, and what are the pros and cons? Why is it that way?

I don't know if it's exactly a deliberate decision; it's just the sort of thing that happened, I noticed. I learned the value of product management thinking the hard way. When I started at Mozilla, we had some early successes in the automation and tools team, and then a string of not-so-successful projects. And it took a while for me and other managers to understand why some of these ideas that seemed so great weren't getting traction.

And we realized we just didn't know anything about product management. We didn't understand how to think about our tools as products: even if we think something's a great idea, if we're not going and talking to our users and understanding their problems, we're going to build something that nobody uses. So that got me interested in that space and made me realize it's crucial here.

And we've discussed internally, should we have PMs, would they make sense. I think it would be a challenging job for a PM. I know there are some teams, some companies out there that have had PMs for internal tooling groups. Not very many, but I've talked to one or two. And I think it's a very challenging job because our scope for such a relatively small team is extremely broad.

We have a lot of services and products that developer infrastructure, just my group, a few dozen of us, supports. One PM isn't going to be able to represent just one product; we'd need 20 PMs or something, one for every product that we have here.

So it's a very wide-ranging thing, and I don't think that kind of scope would really work with a lot of traditional approaches to product management as it tends to operate now. So instead, every engineering manager and staff engineer, as well as senior engineers, and ideally really any engineer, should be thinking about these things as products and should be at least an amateur product manager.

Well, I'm sure that's both really fun for the people put in those roles and also a challenge. And you're right: when we talk to companies out there about how they're running their developer productivity or developer acceleration functions, there aren't very many PMs in these organizations.

And I think that's in large part due to the types of challenges you described. I want to loop back to those three pillars again, and one just won't get out of my mind: reducing cognitive overhead. Can we talk about a couple of specific examples of things you've done in the past and things you're hoping to do in the future to really impact that pillar?

Yeah, for sure. I think probably the best example of something we've built in the last couple of years is our cloud development environment, which is called Spin. We've actually got a number of blog posts about this on the Shopify Engineering blog. And a lot of it is around reducing cognitive overhead.

So before Spin, even though we have an amazing tool called Dev, which bootstraps applications, it was really being stretched pretty far. The first time you do what we call a dev up on your system (you've got a brand new laptop and you want to get the Shopify monolith running), it'll probably run fine. It'll take a long time, normally 45 minutes, but you'll get something at the end that 99% of the time will work.

But over time, as developers install different things on their laptops, as other applications are installed and there's no real clean environment isolation, at some point you're going to do a dev up and something's going to break. And you're probably not going to be able to understand it right away, especially if you're relatively new.

Then you're going to have to ping the environments team, and they're going to have to walk through it with you. Over time we realized that just does not scale. So pivoting to a cloud-based environment means you now do a spin up instead of a dev up, and you get a fully working system in less than a minute.

There's no need to think about all the steps required to install your dependencies, build your app, and get the server running; it just works. So there's this whole class of problem you don't need to think about. And if your development environment breaks because you've gone and installed something or changed something, you can just spin up a brand new one in a minute or two, and you have it in front of you and working.

You can work on multiple branches at the same time. There's just so much you don't have to think about. So I would say that's a huge part of reducing our cognitive overhead. But stepping back, one of the ways we've reduced cognitive overhead, which even predates our dev accel team, is just by picking several technologies and really doubling down on those.

As I mentioned earlier, one of the big parts of reducing cognitive overhead is reducing the number of decisions you have to make: putting your cognitive effort into a smaller number of decisions that matter most for your business impact.

Right? So the fact that we are a Ruby on Rails shop means that most of the time, when you need to start a new service, you're just going to go with Ruby on Rails. You're not going to think about it. You're not going to ask, "Should I use Python? Should I use Node.js? Should I use Flask? Should we use Express?"

No, you just go with that. It's a decision you don't really have to make. If a specialized need emerges later, you can spend that energy then, once you've understood the problem better. But at the beginning, don't spend it; go straight into prototyping with Ruby on Rails. We're continuing to invest in Spin, and we've created a VS Code extension. We've again tried to double down on certain technologies, and in this case it's VS Code.

Not everybody uses it. There are some RubyMine developers who I don't know that we'll ever win over to VS Code. But for the most part we've got a lot of people using VS Code, and this extension gives us an avenue to communicate lots of information to our developers: a single point of contact where we can push out as much information as possible.

And this is where we get into the overlaps between our different pillars. We're providing tight feedback loops: you can see all the services running on your Spin instance inside your VS Code extension.

So you can tell, "Okay, they're all green, they're all running." We can restart services and get logs right in front of you. A really tight feedback loop, all within your editor. But we can also pull in information like CI results, or files where we've had incidents or deployment failures in the past.

We can highlight that stuff in front of you, so you're given more information about the riskiness of certain changes you're working on. You can be particularly careful in those areas, or you know that the test file you're working on has a history of intermittent failures. So maybe you'll fix it up while you're there, or at a minimum you'll be very careful with those tests because you know they're going to give you false positives.

Well, particularly that last concept you shared around this VS code extension sounds amazing. If we had another hour, I think I would love to just devote a whole conversation to hearing the story around that product.

But I want to conclude by asking: you mentioned that support intake channel, so how does your group balance customer or user requests versus your strategic roadmap and investments? How do you find the right balance between the two?

Yeah. So we have a couple of strategies there. One thing we realized a few years ago is that we need to separate the people working on user support from the people working on longer-term projects. So we actually have a rotating role on all of our teams that we call ATC, air traffic controller, where for a period of time, usually a couple of days, that person's responsibility is first and foremost user support.

So any support requests that get escalated up to the team, that person is responsible for answering. If it's a request for help, they're there to give guidance. If it's a bug, they're there to triage it and file it. If it's a request for a feature, they're the one who gets it written up.

And this allows not just a high degree of responsiveness to our user base, but also ensures that the rest of the team is able to focus on project work. Sometimes in the past we've had very helpful people who want to help and will jump on support requests, but we don't need everybody working on those. We want the people working on the longer-term, higher-impact project work to understand that our support requests are being handled, and that they do not need to keep one eye on the support channel at all times.

They can focus knowing that everything's taken care of there. As for what comes up in those requests, we bucket them by severity, I guess. Is something actually blocking somebody from getting something done right now? Because a user will come to us with some sort of issue and it's not always clear: is this actually preventing them from shipping something right now, or is it just something they're thinking about?

So we need to dig in a little bit and see, "Okay, if this is actually blocking somebody from shipping today, we will need to escalate that, fix it somehow." If it's actually more of a longer term thing, then we can sit down and actually have a conversation with them around their feature. Is this something that they can work around? Is it actually a feature that we need to implement?

Is it something that we really want to implement, or is somebody trying to use our system in a way that wasn't intended? In which case maybe our documentation isn't up to date, maybe we're not actually explaining the main use case of our system, and they're trying to do something that belongs somewhere else. And another thing is: how much of our user population would be affected by this?

So if somebody comes with a feature request that is very, very narrow and will only benefit a particular team, we might try to work with them: is this something they can implement themselves? Is it something they can work around to get things done? But if it's something that impacts 80% of our users, that's something we're going to want to fix right now. So we try to prioritize by impact: how broad is the impact, how deep is the impact?
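As a rough illustration only, not Shopify's actual tooling, the triage heuristic described above could be sketched like this. The field names (`blocking`, `affected_share`) and thresholds are hypothetical:

```python
def triage(request):
    """Sketch of the triage heuristic described above: requests that
    block someone from shipping today get escalated immediately,
    broad-impact requests get prioritized, and narrow requests become
    a conversation about workarounds or ownership."""
    if request["blocking"]:
        return "escalate"      # somebody can't ship right now: fix it today
    if request["affected_share"] >= 0.8:
        return "prioritize"    # impacts most of the user base: schedule now
    return "discuss"           # narrow request: workaround, or wrong tool?
```

The point of the sketch is the ordering: blocking issues short-circuit everything else, and breadth of impact only matters once nobody is actively stuck.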

That makes sense. Well, thanks for sharing that. I'd love to turn the conversation now to the developer happiness survey. And I know you just kind of casually mentioned, "Oh yeah, we do this by a semi-annual survey," but as you probably know, surveys are really hard and it's something that a lot of organizations we talk to are struggling with.

So I really want to just go through in detail to understand what are some of the things that you guys do that other organizations could be doing to have more success or get more impact or value out of these types of surveys. I'd love to just start with the design.

You mentioned you partner with the people team on this. Can you just share more about, I mean, what is the process? And I'm guessing, it's never really a finished product. Is this something you're updating every time you run the survey? Can you share what that process and working group sort of looks like?

Yeah, for sure. So I guess I could talk about the goals of our happiness survey first. There are two broad pieces of information that we get out of it: where do we invest, and how are our investments doing? So when we approach the design of our questions, we're trying to figure out one of those two things, or sometimes both at the same time.

So we'll have longitudinal questions, I guess you could say, around general happiness with our tools. We ask these questions every single time and we track them over time. And if there's a decrease, we'll try to dig into why. The surveys are sometimes just an indicator that we need to dig a bit deeper.

So if our happiness score has dropped a little bit, then I need to figure that out; I'll need to go talk to some users. Or maybe we'll have our own theories that we can test. But also, when we've made specific investments into things, we'll ask questions to verify whether we've had an effect there.

So in our CI systems, if we've spent some time trying to shorten the testing feedback loop, how is that question trending now? Are more people happy with it? Is it dropping down the list of pain points, or has it stayed the same?

And if it's stayed the same, then maybe our investments haven't worked and we need to go back and figure that out. But yeah, we try to keep the number of questions about the same. One of our approaches is to never add a question without deleting a question, or at least to attempt not to. It's very, very difficult.

As our survey has gone on, we've gotten more and more people from different parts of the org who are interested in contributing to it. And so we have to be very careful. We do get strategic questions in there. If we've launched something new, like our cloud development environment, we have to insert a few questions about it because it's a brand new thing.

We want to know how people are using it and what their sentiments are about it. And then we have other questions, around our CI systems for example, that have been there forever, because we're always investing in CI and it's always something we're going to need. The people analytics team is there more to crunch the numbers afterwards and to help us craft the questions in the best ways possible.

So they're the ones who know you'll get a better signal if you phrase a question a certain way, or whether we should use a star rating instead of a yes/no, or whether we should have freeform text and, if so, how we correlate those answers. They're there for the mechanics of the survey, I guess. And then our team, developer acceleration, is there for the actual content. We'll source different people and different groups to see what they're interested in learning about.

Really interesting that you mention that last piece. And you touched on it earlier as well, when you said you're getting more and more people contributing questions while you're trying to keep the scope and length the same.

You mentioned you use this survey to do discovery and to verify the impact of your team's work, but it sounds like other people are coming to you and saying, "Hey, can we measure this other thing too?" Who else is involved, or trying to be involved, with this?

So I guess our acceleration efforts are mostly confined to the developer acceleration team. But there are other teams that work on internal tools. One would be front-end developers. We have a public design system called Polaris.

And the teams that work on Polaris have two audiences: our external audience, the partners who build third-party apps with us, and also our internal developers who build our first-party apps.

And so they want information about, for example, how many developers are doing front-end work, and what their satisfaction is with the Polaris library. Is it meeting their needs, is it understandable, does the documentation work, where are its limitations? So we've had a few people from departments that are auxiliary to developer acceleration who want to get more information from the engineering group as a whole.

I think you mentioned earlier the survey doesn't go out to every developer. Do you use some sort of sampling strategy?

Yeah. So it only goes to people who have been at Shopify six months or longer. We want to make sure people have had a chance to onboard to our tools, understand them, and ship a number of changes, so that they have a full view of our tooling.

And then we split that cohort, everybody who's been here six months or longer, into two. 50% of those people get the survey earlier in the year and the other 50% get it later in the year, so that we don't have survey fatigue.
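A minimal sketch of that sampling approach, under assumed data shapes (the `handle` and `start_date` field names are hypothetical, not Shopify's actual implementation): filter by tenure, then split the eligible cohort into two stable halves surveyed at different times of year.

```python
from datetime import date
import hashlib

def survey_cohorts(developers, today, min_tenure_days=182):
    """Split developers with roughly 6+ months of tenure into two
    halves. Hashing the handle keeps each person in the same half
    across runs, so the early/late split stays stable year to year."""
    early, late = [], []
    for dev in developers:
        if (today - dev["start_date"]).days < min_tenure_days:
            continue  # not onboarded long enough to survey yet
        digest = hashlib.sha256(dev["handle"].encode()).hexdigest()
        (early if int(digest, 16) % 2 == 0 else late).append(dev["handle"])
    return early, late
```

Hashing rather than random assignment is one way to get the property the transcript implies: each person is surveyed once per year, in a consistent half, without storing extra state.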

And it's pretty clear, I think, how you and your teams use the survey, but the results also get shared back out to the teams themselves and to managers. How do you expect those other people to use the results?

Yeah, that's a good question. So I think for the most part, the results are of most interest to the developer acceleration groups themselves, for us to prioritize our work and figure out where our opportunities for impact are. But we do a presentation with the people analytics team after every survey where we go over the results. We share this all internally so that everybody knows. And then we go into an action plan.

So the people analytics team does the first part of the presentation, where they talk about significant changes and notable things in this survey: have levels of satisfaction with different things gone up or down? And then it goes to my group, and we talk about what we did since the last survey to address the points that came up the last time around, and then our action plan for the next six months and how we're addressing any new things that have come up.

And what kind of participation rate do you guys get?

Right now I think we're around 50%, which is not bad. Given the current size of Shopify, I think it's a pretty admirable number. I wouldn't mind if it were higher, but there's always a lot going on. So we send out reminders and try to nudge people. We also have a separate internal survey for more company-wide things, so we try to stagger with that so people don't have too many surveys at the same time.

And I'm just curious, since you mentioned nudging people: are the announcements and comms around this coming solely from the developer acceleration group, or is the CTO of Shopify also telling folks, "Hey, this is important, this is something we all need to do"?

Yeah. It's largely coming from us, but we do get a bump occasionally. Senior leadership are interested in the results of this as well, so we'll sometimes get a little boost from them.

I'd love to better understand that. You mentioned the action plan piece, but tell me more about what your group does with the results in the weeks that follow. I imagine you're getting so much data back. And I think it might have been you who mentioned earlier that this really helps start conversations as much as provide data.

So what kinds of next steps do you take? Is it mostly just analysis and confirmation of things you're doing or does this then kind of launch a series of follow up conversations or investigations into things that maybe surprised you in the results?

Yeah. I guess it's one of the inputs when we go into our planning. So when we do our quarterly planning, we'll take a look at the unexpected results in particular. When we've seen in the past that the length of CI runs is one of our highest pain points, that's not usually terribly surprising.

So sometimes the survey is just there for us to validate what we already pretty much figured. But if something bubbles up higher than that, then we'll try to prioritize it more. We usually get these signals in different ways. Documentation has been a more recent pain point: training, documentation, onboarding, all that kind of stuff.

Because we do have a lot of internal tools here. And we've gotten that kind of feedback through support interactions as well, either directly, where people say, "I can't find the docs for this," or indirectly, where it's clear they're not sure how a system works, and therefore our docs are either not discoverable enough, not comprehensive enough, or not readable enough.

So we'll usually have these signals coming in from different points, and then we'll make a concerted effort: we'll either start a new project, to improve our documentation for example, or we'll use that signal to do targeted user interviews.

So again, depending on the fidelity, if you will, of the signal that we're getting from the developer happiness survey, it's either something we can act on right now or just an area to dig into more.

And again, in the product management mindset, we like to look for highly engaged leaders out there, ICs or managers who are invested in our particular systems, and then have high-quality, high-bandwidth conversations with them where we can really dig into these areas and figure out the underlying problem. We use the five whys approach: keep asking, "Well, why are you trying to accomplish that? Why is that a problem?" and try to dig down to some fundamental thing that we can then fix and that hopefully has a broad impact.

Well, thanks for sharing that. I think that's such a rigorous and comprehensive process for following up on surveys. And that's one area where a lot of organizations get it wrong or stumble a little bit: they don't really know what to do after they run the survey. As a result, the survey kind of dwindles and people stop responding and participating.

I think the follow-up that we do, where we present our action plan and what we're doing as a result, is a key part of that. It's not just that all these survey results go into a black box and who knows what they do with them. We try to get out there and say, "This is what we're hearing. Here's where we're investing. Continue to provide us this feedback and we'll use it to make everybody's experience better, because that's our mission."

I love that. Well, the last question I had for you is just today I was talking to someone who reached out to me and said, "I'm at this large corporation, 44 years old." Sorry, the company is 44 years old, "And I'm trying to get a dev [inaudible 00:43:06] function off the ground. And it's a team of one right now."

I mean, you compare that to the situation you're in, with this entire developer acceleration group, huge investment, a large number of projects, a mature function. What advice do you have for that person at this old company who's trying to advocate for developer acceleration and developer experience? How do you do that?

I've actually been there. I was a team-of-one tooling department at a small startup before. Unfortunately, I left before more people were hired onto it. But I think I would advise starting with the clearest value. Usually that's around the CI/CD systems.

I'm not sure about an old company like that, but any newer org, and I'm sure many older orgs, have now understood the value of CI/CD. It's easy to get the basics set up, but it requires investment to get everything nice and smooth. If you don't have that investment, you'll probably see a lot of intermittent failures or a lot of deploys that aren't working.

And so looking at that kind of low-hanging fruit, getting the CI/CD system flowing as smoothly as possible, will show immediate results to the engineering population and to leadership as a whole, and can be used to bootstrap into what I would consider the more interesting problems.

So it's really proving out that value right away: that just having one or two people dedicated to the engineering processes, automation, and workflows will smooth out the work that everybody's doing. That'll help prove out the need for this function and the investment to explore the problem space.

I'm glad I asked you that question because I think that was such a great response. A lot of people give the advice of start with low hanging fruit, but I like how you went even further and just said, "Look at CI/CD."

Because I agree with you. That's often an area that gets neglected and wastes so much developer time that there's a clear ROI to the business in improving it. So I love that advice and I will be sure to pass it along. Mark, it's been so great talking to you today. Thanks for coming on the show.

Thank you very much for having me.