Supporting autonomous teams at the Financial Times

Joining us for this episode is Victoria Morgan-Smith, the Director of Delivery for Engineering Enablement at Financial Times. Victoria shares some of the tradeoffs in having an autonomous, “you build it, you run it” culture. She also shares how her group equips engineering teams with metrics, best practices, and more.


Abi: Can you start off by introducing yourself and giving a quick overview of your background?

Victoria: I’ve been leading teams at the Financial Times that deliver internal-facing products for quite a long time. I’m very interested in internal systems and people who use them. Now I’m leading an Engineering Enablement team, which in many ways is the ultimate internal product.

Along the way, I also co-wrote a book on Internal Tech Conferences with Matthew Skelton, which talks about how events can stimulate and fuel a culture of collective learning and change.

What’s your team’s charter and core responsibilities?

The Engineering Enablement team’s mission is to help product developers go as fast as they need to in order to do their work. Building software products these days is massively complicated. As a full-stack engineer, there are a lot of things you have to think about; some are in a developer’s comfort zone, and some maybe are not. If we can make some of those things as easy as possible, whether by providing tools, clarity, support, and assistance, or by standardizing some of those options and simplifying the estate, all of that is the space that we’re in.

Does the company also have platform, infrastructure, or engineering tools teams in addition to your team?

My teams look after the infrastructure tooling that developers use. We look after AWS and Heroku, hosting things in the cloud, and setting up systems. We look after observability, so all the monitoring and logging, and the edge authentication; the operations teams are in my space as well. A lot of the infrastructure tooling that developers and engineers use consists of systems that we provide, and we make them as easy to use as possible. There are other teams who look after other aspects of the infrastructure that the entire company uses, but developer-oriented infrastructure is our space.

When you think about your team’s scope, is it primarily focused on tooling? Or is it weighted towards process and culture?

Well, we’re interested in both. At the FT, we have very autonomous teams. They’re very DevOps oriented. “You build it, you run it.” That is the culture. And we look to enable tooling within that culture.

If we want to simplify and standardize best practices, we have to do that within the context of teams who are at liberty to do whatever they think is going to be best for them. Culture is a very big part of what we do and how we do it. A lot of what we do is about surfacing information to teams to equip them to make the right decisions: making it as easy as possible for them to see the things they need to address or take seriously, and the options that they have.

I’d love to know your personal story. What drew you into this internal facing enablement role, as opposed to a more traditional customer facing role?

Personally my entry point into delivery was scrum mastering. I used to be a developer. I really enjoyed making things. And I discovered that I took pride in other people around me doing well. That’s what drew me into delivery because it wasn’t about big shiny projects. It was about their success.

That’s why I enjoy internal systems and internal tools, because all of that is about this idea of enabling people in the organization to be successful. What I enjoy is helping other people do what they need to do, whether that’s my team or the people using the tools that my teams build.

Earlier in our conversation you mentioned that there are challenges with autonomous teams. The “you build it, you run it” culture has advantages, but there are also things that can weigh these teams down. What are the biggest problems you’ve seen with autonomous teams at Financial Times?

A lot of it is the complexity that can arise. We have stream-aligned teams. We’ve done what we can to reduce the cognitive load of teams in terms of them needing to understand the entire organization and how they work by having product teams who are aligned to business areas. They can therefore be fully equipped to make the decisions they need in their space.

But what it can mean is that they forget to talk to each other because they don’t have to. Once you get rid of dependencies and people don’t have to talk, they forget to, wheels get reinvented many times, only slightly differently shaped, and the estate becomes really complex. Say I’ve written a piece of code in my IDE. I want to get that live, and I want it to be secure and supportable and all the rest of it. How do I do that? Well, there are all these choices to choose from.

It can take a long time for people to get up and running because there are so many options available to them, because people don’t swap notes, and because there isn’t a standardized way. The autonomy speeds people up initially, but then slows them down.

How long lived are most of these teams? I’ve worked at places that reorg every quarter.

Quite long lived. We have groups of teams, and they may change within those groups. But we have a group of teams who focus on the website and the apps. We have a group of teams who focus on internal products, which could be marketing tools, sales tools, and editorial tooling. Those can be quite long-standing teams who become very deeply connected with the business area that they work with. They become really good at creatively solving the business problems in that area, because they’ve been there for so long that they become close to those problems and really passionate about them. That works fantastically, actually; it’s something we’re really proud of.

How much freedom do the autonomous teams have? Can they choose whatever language they want and infrastructure they want? I’m curious what governs the way they work.

We have in place a tech governance group. People don’t go there for permission to do things; they go to say, “This is what we intend to do.” And they have an opportunity to be asked, “Have you thought about this?”, “Did you know about this other thing?”, “Are you aware that you’re about to triple our costs?”. They get presented with things that they may not have considered, or couldn’t have considered, to help them go away and make the right decision, even if that is a different decision.

One area we’re focused on is putting in a tech radar to highlight the things which are supported by our group. We’ve got opinions, and the things that we have opinions on we’ll support really well and make really easy for developers to use. We would encourage you to use them, but you can go and do something else if you want; just come back and tell us about it so that we can see whether we might change what we share with people.

Do you find that teams are often paving their own path or do they normally follow the paved path?

We’re still paving the path, if I’m honest. A lot of it is trying to harvest the good practices that are out there. Communication, harvesting, and building that community are a big part of what we are doing, so that people can share with each other.

Sometimes we find that people will go and make their own decisions, and that’s fine. They’re entitled to do that, but it does complicate the estate.

A lot of people are now asking for simplicity. What’s going to make us successful is that there is an actual request. There is a demand from a significant number of people: “Please don’t give us 25 different brands of baked beans. Just give us something that we can choose and get going with.”

Earlier, you had mentioned a big part of what you do is equipping teams with information and trying to impact their culture. Can you share more about what that means concretely and what kinds of initiatives and projects you’ve tried to drive?

We’re currently working on implementing Accelerate metrics. One of the definitions of success for this group, if you’re looking at metrics, is essentially: are we speeding up developers? But there’s more to it than that. It is not just about us measuring ourselves, but also about providing metrics so the teams can see themselves, whether it is lead time or the other metrics. A lot of teams are asking for those metrics, which is great, because they’re taking seriously this idea of being responsible and accountable, not just anarchic.

We’re looking to implement that for them. We also have a lot of other, lower-level metrics, so that if one of those metrics rings a little alarm bell and they go, “Maybe there’s something going on here,” we can help them dig down into it, whether they’ve got too many repos, not enough repos, too many dependencies, or whatever the thing is, to help them see what it is and whether they need investment and support.
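As a rough illustration of the kind of team-level numbers being described here, the following is a minimal sketch of two Accelerate-style metrics, lead time and change failure rate, computed from a list of deployment records. The record fields (`committed`, `deployed`, `failed`), the team name, and the dates are all made up for the example; nothing here reflects how the Financial Times actually stores or calculates these metrics.

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records for one team; field names are assumptions.
deployments = [
    {"team": "apps", "committed": datetime(2024, 3, 1, 9, 0),
     "deployed": datetime(2024, 3, 1, 15, 0), "failed": False},
    {"team": "apps", "committed": datetime(2024, 3, 2, 10, 0),
     "deployed": datetime(2024, 3, 3, 10, 0), "failed": True},
    {"team": "apps", "committed": datetime(2024, 3, 4, 8, 0),
     "deployed": datetime(2024, 3, 4, 10, 0), "failed": False},
]


def lead_time_hours(records):
    """Median number of hours between a commit and its deployment."""
    return median(
        (r["deployed"] - r["committed"]).total_seconds() / 3600
        for r in records
    )


def change_failure_rate(records):
    """Fraction of deployments that were marked as failed."""
    return sum(r["failed"] for r in records) / len(records)


print(f"median lead time: {lead_time_hours(deployments):.1f} hours")
print(f"change failure rate: {change_failure_rate(deployments):.0%}")
```

The output is deliberately just a pair of numbers with no score or threshold attached, in keeping with the idea that the team with the context decides what the numbers mean.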

Other things are like cost attribution. If we get one big AWS bill, can we attribute those costs to teams? If teams see they’re spending a fortune, because they’re leaving all of their dev servers running all night, they might do something about it. How can we make that sort of information visible to teams?
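To make the cost attribution idea concrete, here is a minimal sketch that rolls one flat bill up into per-team totals. It assumes each billing line item carries a `team` tag, which is an assumption for the example; the resource names, teams, and amounts are invented, and untagged items are grouped under a hypothetical `unattributed` bucket so they stay visible.

```python
from collections import defaultdict

# Hypothetical billing line items; names, tags, and amounts are illustrative.
line_items = [
    {"resource": "dev-server-1", "team": "apps", "cost": 120.0},
    {"resource": "dev-server-2", "team": "apps", "cost": 80.0},
    {"resource": "search-db", "team": "internal-products", "cost": 300.0},
    {"resource": "orphan-bucket", "team": None, "cost": 25.0},
]


def attribute_costs(items):
    """Sum costs per owning team; untagged items go to 'unattributed'."""
    totals = defaultdict(float)
    for item in items:
        owner = item["team"] or "unattributed"
        totals[owner] += item["cost"]
    return dict(totals)


costs_by_team = attribute_costs(line_items)
for team, total in sorted(costs_by_team.items()):
    print(f"{team}: {total:.2f}")
```

Keeping the untagged spend as its own line is a design choice: it shows how much of the bill cannot yet be shown to any team.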

There’s a big tooling aspect to these types of initiatives you’re describing, but also a big cultural element. For example, the metrics initiative. I’m curious, because metrics can be a controversial and sometimes inflammatory subject at companies, how have you approached that in terms of talking about the purpose of those metrics? And then with the rollout, have you hit any concern or pushback from teams?

We have found that we need to be very careful in how we present them. Some of those lower level metrics, if we were to put all the teams side by side and start giving them a score, which indicates this one’s good, and this one’s bad, then that makes teams feel as if someone’s waving a big red flag at them and making them look like a problem team.

We have to not apply judgment. If I put my delivery hat on, I think of this a bit like a burndown chart in a team: something that is useful for the team. They know their story; they know what that burndown chart tells them. It’s not for someone else to point at and say, “That’s bad.” It’s data; it’s information for the team, who have the context to explain it. We treat it exactly the same.

With these Accelerate metrics, and these other metrics about the health of a team’s estate, we don’t put scores or a health status or anything like that on them. These things might be triggers for conversations within a team, and potentially for leaders who might be looking to see what extra support a team might need. It’s all about driving conversation and not about judgment. We reinforce that quite a lot.

It sounds like the metrics are for the teams, and maybe their immediate leaders, to help with inquiry and sparking conversation. Is there any sort of ritualized way in which you’ve seen teams leverage these types of metrics?

The intent is for them to be used as part of quarterly OKR planning. We have an OKR system, so every quarter, teams will set their goals. Those will relate to departmental goals, but teams will also have a lot that they determine themselves, which are about improving the quality of their estate or their sustainability as a team. They will define those.

All this information that we’re surfacing to them, about the risks of their systems, things they might want to think about, or something in their working practices, is information for them to use when they define the things that they want to work on to try and improve in the next quarter.

That’s interesting and actually very powerful. To follow up on that, are the Accelerate metrics, for example, OKRs at a more global level? Are they being set as departmental goals, or is this a bottom-up thing?

It’s fairly bottom-up at the moment. We have a high-level departmental goal to get Accelerate metrics in place so that teams have the information, and then it’ll be down to groups and teams to decide how they want to use them to drive improvement in their area.

I would expect that teams would highlight areas that they want to improve. But perhaps groups of teams, for example the internal products group or Engineering Enablement ourselves, might look at the metrics and go, “Actually, we have an awful lot of things as a group that are failing release and coming back. Maybe as a group, we want to tackle that.” Then individual teams would see how much of a part they have to play in that.

You also mentioned that all teams, as part of their OKRs, are also prioritizing things pertaining to their sustainability and the way they work. Tell me more about that emphasis on sustainability and team health. Where did that originate from?

We’ve paid the price of optimizing too much on speed and not enough on flow in the past.

You need flow; you need sustainability. Things need to be at a pace where you’re not just generating lots of problems that you pay for later, and you pay for those things later by things falling over and by having to fix them.

Then you have lots of totally demoralized engineers who are spending all their time just fixing problems and not building products. We recognize, therefore, that for the organization, engineers, and products to survive and thrive, sustainability is absolutely key. Part of that is carving out space in OKRs to make it explicit that this is part of the commitment in that quarter, rather than engineers having to ask permission every five minutes: “Can I just do this?” We bake it in.

Earlier, you talked about equipping teams and spreading this culture of learning and improvement. How do you educate or indoctrinate culture?

The culture evolves over a long period of time and it’s something that we’ve been very conscious about in what we’ve been doing over time.

We started a few years ago by looking at what people needed in order to enable full autonomy. You’ve got Daniel Pink’s Drive: engineers need to have mastery, and you need to be able to trust them. They need to be responsible. They need to be interested in what they’re doing. They need clarity on their goals. So those things all need to be lined up. Then it’s building on that and asking, what actually motivates people? People are actually motivated by being successful. They’re motivated by realizing that they’re surrounded by loads of other really smart people and that they can keep learning.

Building in opportunities for people to take time, whether it’s to work in working groups or 10% time, is a really big part of that; it keeps people engaged. It gets people working with people they don’t normally work with. And internal tech conferences are a lovely big event that encourages a lot of collective learning and sharing. That was a significant factor in a step change that we saw in our culture a few years ago, and that we’ve been continuing to build on since.

In terms of educating people on culture, we just keep talking about it and we keep celebrating it. Celebrating, saying what a great culture it is, and saying how wonderful it is that these people are sharing this just reminds people that they’re doing this, and that it’s great. They want to do more of it. The celebration is a really big part of it.

You’ve written a book on internal tech conferences – we won’t make you recite the entire book, but I’m curious, what does that look like at Financial Times?

At Financial Times, it’s very much about running something that is by the people, for the people.

They can be run in any number of different ways at different organizations. But for us, it’s about giving the floor to our engineers and saying, what matters to you? What questions do you want to answer? What do you think is important? What’s a hot topic? What’s a common problem? What’s something that’s really cool that you just want to share with people?

We have a variety of different formats. We’ll have lightning talks, some of which are more serious and some of which might be about someone building a monitoring system in their shed for the birds that were going past or something. It can be just something that was really interesting to them, or it can be something serious that they’ve been doing in their team. But it’s anything that is sharing, anything that can spark conversation.

A few years ago we switched to microservices and then realized that alerts were going off all over the place. How on earth do we keep on top of that? Somebody gave a talk on that. Then that triggered a working group to go and solve it.

We’re trying to stimulate further conversations that ripple on after the event, as much as celebrating the event itself, where people come together.

And for context, how large is your team right now?

The Engineering Enablement team is about 50.

How do you determine the right size of your org? How do you kind of think about headcount and ratios as your organization matures?

Our starting point is that with the capabilities that we host, own, and maintain, there needs to be a certain base number of engineers just to maintain those systems. Then we add a lot onto that, so that they’ve got the bandwidth and space to be more innovative and creative and take time to explore options. It’s not so much about the ratio between them and the rest of the organization. It’s more about asking what these teams need in order to be able to achieve their goals. That’s how we treat any team size. They’ve got goals. What size team do they need for that? We have an awful lot of things that we look after, so it’s quite a broad cognitive load if the team is too small, and we don’t want them to feel stifled by that. We expand the team enough so that they can be excited about their work.

How do you see the role of Engineering Enablement changing over time? Right now it sounds like you have the metrics initiative and lots of tools and are working on creating alignment. Over time do you see, for example, a shift towards building versus buying, or a shift towards trying to impact teams in different ways?

At the moment, I don’t see us shifting towards building over buying, because a lot of what we provide is a commodity at the end of the day. It would cost a lot of money for us to build, and then there’d be a bigger dependency on us and our knowledge, in that everyone would have to come to our team. They couldn’t build their own expertise.

We want to be able to provide tools that people can ultimately become experts in using, and not waste internal talent on building things that are already out there and being iterated on much more quickly by organizations that specialize in them. So I don’t see us going there.

What I would hope is that we end up gaining more confidence, actually, across all of the engineers in the team. I would quite like to see the team become a whole team full of internal DevRels. Can they have more time and capacity, once we’ve reached a certain point, to be out there engaging with the teams, pairing with teams, spending time seconded out in them, supporting and enabling them, championing and celebrating their successes? More focus on evangelizing and celebrating and sharing. At the moment, they’re still quite technically hands-on with the systems that they’re improving, but the more they can be working embedded in teams, helping them with the specific problems that they have and then helping everyone get excited about it, the better.

That’s an interesting idea around the embedded model. How would you envision that working? Would you envision folks from enablement spending a quarter on a product team, or how would that rotation work?

No more than a quarter, because otherwise we’d become divorced and detached from where we are. At the moment we have people from other teams come and second with us for a quarter, and that’s great for them because they come and learn what we do and they come and tell us their problems. That helps us understand what they need, and we become better as a result. But very soon we are going to start seconding people from our teams out into others, so that they can go and see the problems for themselves and see where they can help. It probably wouldn’t be for more than a quarter, though, because then we would lose them, and we need them to bring back the knowledge.

I’d like to loop back more on the topic of your current metrics initiative. You mentioned that right now, you’re trying to get set up to just have those metrics, but how are you approaching that?

We have a really excellent system that gives us a starting point. We’ve got a thing that we built a few years ago called BizOps, which is a system built around GraphQL APIs. It started out just as information about all the systems we’ve got: who the technical owners are, what the links to codebases and runbooks are, and stuff like that.

But we’ve been iterating on that, so now it’s got business owners and context and dependencies, and it’s starting to have risks attached to it. We’re looking to see how we can add costs into it. We’re adding AWS and infrastructure information, and lots of different kinds of information are being fed into it around changes and releases that are made as well, so that we can generate those metrics.

Because it’s in this graph database, other engineers can discover it. There’s a really big appetite; almost all engineers use BizOps in one way or another, and they’re quite excited about it. It gives us a playground where we can experiment with what we can learn if we look at it this way or that way. It’s surfacing information, and we’re learning what’s interesting to people. That’s given us a really good starting point, so we can have lots of conversations with teams and with leaders about what they want to know, what they can find out themselves, and what we need to massage together and try to generate.
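To show why holding systems, owners, and dependencies in a graph is useful for this kind of discovery, here is a minimal in-memory sketch. It is not BizOps and does not use its GraphQL API or schema; the system names, owners, and the `depends_on` field are all hypothetical stand-ins for the sort of records described above.

```python
# Hypothetical system records: each has an owning team and direct dependencies.
systems = {
    "article-api": {"owner": "content", "depends_on": ["auth-edge", "search"]},
    "search": {"owner": "content", "depends_on": ["auth-edge"]},
    "auth-edge": {"owner": "enablement", "depends_on": []},
    "sales-dashboard": {"owner": "internal-products", "depends_on": ["article-api"]},
}


def transitive_dependencies(name, graph):
    """Return every system that `name` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph[name]["depends_on"])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current]["depends_on"])
    return seen


def owners_to_notify(name, graph):
    """Teams owning any system that depends, at any depth, on `name`."""
    return {
        info["owner"]
        for system, info in graph.items()
        if name in transitive_dependencies(system, graph)
    }


print(sorted(transitive_dependencies("sales-dashboard", systems)))
print(sorted(owners_to_notify("auth-edge", systems)))
```

A question like “who is affected if this shared component changes?” becomes a simple traversal once ownership and dependencies live in the same graph.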

That sounds like an awesome tool. I’m curious, how does your org then do planning on its own? What’s the cadence for understanding what the biggest pain points are for these teams across the org and then incorporating that into planning?

At a high level, we’ve got our quarterly OKRs, and that’s where we see what other teams are doing and what their goals are for the coming quarter. What might be problematic for them? We knew there was going to be a bit of a spike in recruitment this year. We knew that helping new starters get up and running as quickly as possible was going to be a big priority, so we put a lot of effort into the onboarding process, whether it was making things easier technically or providing information.

The quarterly rhythm is the basic one, but we are also just out talking to engineers and other teams all the time. There might be something that increases security risks, where it’s all hands on deck for everyone to work out what we need to do to deal with a particular issue, which may be present in any number of the things that we provide and that all the engineers use.

You mentioned a big part of how you plan is just by having conversations with folks across the company to understand priorities. Are there any other tools or methods you use to identify priorities, or are you looking at certain metrics across organizations to find teams that you might want to inquire and look into?

We do surveys. We ask engineers what their pain points are in particular areas. That determines where we focus our efforts.

There are also metrics. These are sometimes cost-related. For example, we discovered that the way our monitoring tool was being used was about to cost an enormous amount of money. We set that as a target to address, and that involved getting everybody else to slightly change the way that they used it. That made a tremendous impact. Actually, we reduced the cost by a significant amount.

Sometimes there are external factors like cost or security, sometimes it’s things where we know we want to get to a point where teams are able to move more quickly because there’s a particular barrier that’s getting in the way.

Sometimes it’s just information that we hear from teams, where there are particular grumbles, or a team has decided to solve a particular problem and we’ve seen that and gone, “Well, we know that problem exists in other places. So let’s take that and try to make it more centrally available by making it scale, or work in a slightly broader-reaching way that’s more generally valuable.”

A lot of it is listening. A lot of it is that we set our own targets, because we want to stop being the team that goes, “You need to patch,” and instead enable teams to see for themselves that they need to patch.

When I worked at GitHub we had this goal to accelerate software delivery. When I would go talk to teams, one of the things I found was that the things that they said were slowing them down most were actually not the tools, but more had to do with the processes and particular issues around product management. At Financial Times, from your perspective, what are the things that actually bog down developers and teams the most?

Well, we find that some of it is the processes that they have within their teams, which we can’t really influence, but a lot of it is the tooling. How do they set up a new cloud instance? We are doing an awful lot there in terms of building components that they can plug and play with, plus documentation and training workshops, to try and make that as easy as possible. Or configuring some of the edge authentication; some of that can feel a bit complex. Again, how do we standardize and simplify that as much as possible, as well as making the team available to support and advise as and when teams need it? Those are two of the bigger pain points that we’ve got.

Another one is our design system team, who enable our designers and brand team to have consistent brand representation across all of our products. They do that by baking the design and brand into components and style sheets that everybody uses across all of the products. Again, anything that we can do that takes away some of what might feel like repeated churn or toil that people have to do is a big driver.

When looking at tools in particular, how much of that do you think is your team’s responsibility, as opposed to things that local teams can learn and improve on their own?

A lot of our groups have teams within them who focus on their own developer experience, but again, they’re small teams and they may be reinventing the same thing several times. If they’re reporting similar pain points, similar challenges, or just a lack of knowledge of what the others are doing, these are things that we hear where we can take some of the pain away. Ultimately, if we don’t make what we support, provide, and recommend easy to use, they can and will just take their own route. They do have full accountability for what they run.

I think that’s appropriate. If you are supporting a thing, if you’re taking on “you build it, you run it,” then you need the autonomy and the freedom to make those decisions.

This is why we are not taking an approach of “we’re going to build a platform and you have to use this, and you have no choice,” because that’s not the way we want to do things. Sometimes it does make sense for teams to optimize locally.

But ultimately that’s been slowing them down, because it gives them too much choice. Whoever set a thing up leaves the company and the thing is no longer supported. Or they’ve picked a nice, free, lightweight, open source thing that has suddenly become really expensive and enterprise, or else it’s gone away and they have to find another one, which is another part of the internet that they have to keep in their brain. Really, it would be nice if they didn’t have to, just so they can build an application.

They have the autonomy, they have the freedom, but we want the things that we offer and provide to be good things that we are harvesting from them. That way we can keep running and improving on a standard that’s going to work for them, so they don’t feel they have to go off and find other solutions.