How Shopify’s infrastructure team defined their charter
Brook Perry
Marketing
At Shopify, the Developer Infrastructure team exists within a broader Developer Acceleration organization that defines its mission as making developers at the company highly productive. Although this mission statement helps guide direction, it is a broad description that leaves much room for interpretation about what projects to focus on.
This article covers how the Developer Infrastructure team built on its parent organization’s mission statement to define its team charter and guiding principles. Mark Côté, who leads the Developer Infrastructure team, provided the basis for this article through an interview on our podcast.
What is an infrastructure team?
An infrastructure team is a specialized group within an organization focused on maintaining and developing the underlying technical framework that supports all its IT operations. Infra teams manage the physical and virtual core services needed for hosting, data storage, cloud platforms, operating systems, and applications.
Their duties often include handling servers, data centers, network architecture, and software deployment platforms, ensuring these components work seamlessly to support various business functions.
The role of an infrastructure team extends beyond maintenance; the team also contributes to strategic planning and innovation. By continually evaluating and integrating new technologies, it acts as an internal service that improves system capabilities and resilience, contributing to overall business agility and continuity. This work keeps the company’s technological foundation robust and adaptable, ready to meet current and future demands.
Infrastructure team at Shopify
On the Engineering Enablement podcast, Mark discusses his background and the evolution of his career from Mozilla to Shopify, where he currently serves as the Director of Developer Infrastructure. He elaborates on the organization and function of Shopify’s infrastructure teams, which fall under the umbrella of Developer Acceleration—a group focused on enhancing the productivity of Shopify developers.
Mark describes the infrastructure organization as divided into two primary sections: the Ruby on Rails infrastructure team and the broader Developer Infrastructure division. The latter is segmented into several teams, each focusing on a different stage of the development workflow (coding, validation, deployment, and service management) so that each phase can be optimized separately.
Mark’s perspective on infra teams highlights their strategic importance in accelerating development processes. He discusses their initiatives to tighten feedback loops, reduce cognitive complexity, and scale up engineering effectively. Priorities include providing timely feedback to engineers, streamlining decision-making processes, and addressing scalability issues posed by Shopify’s growth and the extensive volume of code and changes in their repositories.
He also touches on the importance of aligning the infrastructure organization’s goals with Shopify’s broader objectives, ensuring all team members effectively understand and contribute to these targets. The team facilitates this alignment by regularly reevaluating their charter, which defines their mission and strategic focus areas.
Mark’s insights reveal a solid commitment to continuous improvement and proactive management within Shopify’s infrastructure teams, aiming to foster an environment where developers can thrive and continually enhance productivity.
The need for a charter
Shopify’s Developer Infrastructure team is responsible for many initiatives, from developer environments and CI/CD processes to frameworks, libraries, and productivity tools. Previously lacking a defined charter, the team often addressed issues not necessarily critical to the business.
To address this, they decided to establish a formal charter. According to Mark, the charter was created to better communicate the purpose of their work, and the reasoning behind it, to leadership.
“We wanted to clarify to leadership why we exist and how we approach our challenges. This way, when they review an individual project and question its relevance, they can refer to our charter and understand how it aligns with broader business opportunities. It facilitates project discussions among software engineers without delving into the detailed reasons for specific system features.”
Defining a charter
The Developer Infrastructure team’s charter includes opportunities for impact and guiding principles.
Clarifying an engineering team’s areas of focus, also known as “opportunities for impact,” helps determine which problems the team will address and which ones will not. The infra team identified these areas by analyzing past projects and considering potential future opportunities. They also reviewed notes from blogs, podcasts, and discussions about how similar teams at other companies define their charters.
The guiding principles define the team’s operational approach and decision-making processes.
Both components tie back to the department’s mission of making engineers at Shopify more productive.
Opportunities for impact
Tightening development feedback loops
Tightening development feedback loops means providing developers with the information they need precisely when they need it. Two projects the team has concentrated on that align with this objective are:
- Creating a test automation system, “Caution Tape Bot,” incorporated into their service catalog. This system monitors pull requests for specific patterns and warns about potential problems or unexpected system interactions.
- Enhancing the notifications from their deployment system to be more precise and alert the correct people at the right time, avoiding unnecessary noise.
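The general idea behind a bot like Caution Tape Bot, matching pull request changes against known patterns and emitting warnings, can be sketched as follows. This is a minimal illustration under stated assumptions, not Shopify’s implementation: the `Rule` type, the `warnings_for` function, the example rules, and the input shape (a mapping from file path to the lines a pull request adds) are all hypothetical.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Rule:
    """Hypothetical warning rule: a path glob, a regex for added lines, a message."""
    path_glob: str
    added_line_pattern: str
    message: str


# Illustrative rules only (assumptions, not taken from the article).
RULES = [
    Rule("db/migrate/*.rb", r"\bremove_column\b",
         "Dropping a column: confirm no running code still reads it."),
    Rule("config/*.yml", r"timeout",
         "Timeout change: check dependent services."),
]


def warnings_for(changed_files: dict[str, list[str]], rules=RULES) -> list[str]:
    """Return one warning per (file, rule) pair whose glob and regex both match.

    changed_files maps a file path to the lines the pull request adds to it.
    """
    found = []
    for path, added_lines in changed_files.items():
        for rule in rules:
            # Skip rules that do not apply to this file path.
            if not fnmatch(path, rule.path_glob):
                continue
            # Warn once per file and rule if any added line matches.
            if any(re.search(rule.added_line_pattern, line) for line in added_lines):
                found.append(f"{path}: {rule.message}")
    return found


if __name__ == "__main__":
    pull_request = {
        "db/migrate/20240101_drop_legacy.rb": ["    remove_column :orders, :legacy_id"],
        "app/models/order.rb": ["  def total; end"],
    }
    for warning in warnings_for(pull_request):
        print(warning)
```

A real bot would also need to fetch the diff from the code host and post the warnings back as a review comment; both steps are omitted here. Keeping the rules as plain data makes it easier for other teams to contribute their own patterns.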
Scaling up engineering
The goal of scaling up engineering is to strengthen the engineering organization’s overall effectiveness. Spin, Shopify’s development environment platform, was launched to tackle the challenges of running its monolithic network infrastructure and related services on local machines.
As Mark points out, “keeping dependencies up to date slowly became a huge problem, frustrating developers and overloading our team with support requests.”
To alleviate this, Spin offers individual cloud environments that are pre-configured and ready for immediate use. This setup enables engineers to avoid the complications of managing increasingly complex application configurations, streamlining their workflow.
Reducing cognitive overhead
The goal of reducing cognitive overhead is to reduce the number of decisions coders and managers must make. Efforts in this area have included projects such as:
- Developing a static analysis system to detect and automatically eliminate dead Ruby code. “We’re also integrating this into CI, which aligns with ‘tightening development feedback loops,’” Mark mentions.
- Streamlining complex systems for reporting completion rates for action items and degraded Service Level Objectives (SLOs). The aim is to “clarify the results and ensure they are communicated to leadership so they can easily assess the engineering health of their organizations and make deliberate, informed trade-offs.”
Currently, the team exclusively undertakes projects that fit these categories. They identify and prioritize projects by engaging with developers through various methods. These include a quarterly survey, office hours sessions, and one-on-one interviews. Additionally, they receive further roadmap ideas from a dedicated team within their parent organization that supports their internal tools.
Guiding principles
Maximize impact with effective two-way communication
“We have to talk to our user base all the time,” Mark says, “both listening to their needs and communicating our solutions back to them.”
Pursue both incremental and significant step-change improvements
Mark explains, “We have many core services, and they all have rough edges, bugs, and missing features. While one of these services provides value, we’ll continue to iterate on it.”
“However,” he adds, “sometimes we need a significant alteration to the way engineering works; sometimes this is because we can’t further scale a system, sometimes because we’re getting diminishing returns on a solution, and sometimes because there’s a new need or useful technology to leverage.”
Enable self-service by providing extensible platforms, information, green paths, and guard rails
The infra team can’t solve every problem for developers, so they try to build open, extensible systems that engineers can build on and extend independently. An extensible platform allows the Infrastructure group to concentrate on higher-impact work.
Absorb complexity
Accelerating development requires reducing the overall complexity that developers face.
“We take on some of this complexity for global simplicity,” says Mark. “Accordingly, we don’t push any of the complexity of our systems onto our users. Sometimes, this is a lot of work for our teams but saves our user base more time and effort.”
--
In summary, although internal-facing teams focused on developer productivity may understand the value of their work, leaders of such teams still sometimes need help to articulate this value to leadership and explain why they prioritize specific projects. If you find yourself in a similar situation, consider adopting Shopify’s strategy as a constructive guide: defining a team charter can effectively shape all conversations with leadership, ensuring that the team’s projects are clearly and appropriately valued.