Measuring developer activity: what the research says
At some point in an organization’s efforts to measure developer productivity, the question usually arises of whether to track individual developer activity metrics such as the number of commits, lines of code, or pull requests. Interest in these metrics is typically prompted by leaders wanting objective insight into developer performance or into how much work is getting done.
Although the reasons for tracking these types of metrics make intuitive sense, opinions in the industry on this practice are polarized, and the debate sometimes spills out into public view. Fortunately for leaders, there’s extensive research on this topic that can help them navigate discussions and decisions about what to measure and what not to measure. This article summarizes that research.
Activity metrics capture only a thin slice of developers’ work
Software engineering is difficult to measure in part because it is highly complex work, so traditional productivity metrics that focus on output volume do not readily apply. Furthermore, whether these metrics are evaluated for individuals or for teams, more is not necessarily better. A recent paper written by Google researchers articulates this point bluntly:
Similarly, in their seminal paper “Defining Productivity in Software Engineering,” authors Stefan Wagner and Florian Deissenboeck state that leaders must consider factors beyond activity when measuring developer productivity:
Aside from the creative nature of the work, developer activity is also difficult to measure because it involves many different types of tasks that are not easily captured. In the paper titled “The SPACE of Developer Productivity,” the authors caution that “because of the complex and diverse activities that developers perform, their activity is not easy to measure or quantify.”
Google researcher Ciera Jaspan further elaborates on this idea in the paper, “No Single Metric Captures Productivity”:
Still, many leaders today focus solely on measuring individual activity metrics. As described by Dr. Margaret-Anne Storey, a co-author of the SPACE framework, this is something researchers are concerned about:
Next we’ll cover research on the potential consequences of utilizing developer activity metrics.
Activity metrics create misincentives and can hurt morale
Several research studies have investigated the consequences of tracking individual activity metrics. One key finding is that developers are skeptical and fearful of these types of engineering metrics, which may result in “gaming the system” out of self-preservation. This is further discussed in a recent paper from Google:
In another paper titled “Summarizing and Measuring Development Activity,” the authors caution that the mere presence of activity metrics can warp incentives, even when those metrics are not explicitly being used to reward or penalize developers:
Google researchers have also written about the morale issues that can arise from use of individual activity metrics. These morale issues negatively impact overall productivity, and can also cause attrition issues.
Activity metrics provide limited value overall
There are several common reasons why a company may want to measure the activity of its developers. In each of these cases, however, research shows that activity metrics generally do a poor job of providing valuable insights.
Identifying high and low performers. Companies often look to individual metrics to help them assess individual performance. However, as described earlier, activity metrics capture only a narrow slice of developers’ work and often provide little value for this purpose. In “No Single Metric Captures Productivity,” Google researchers state:
Helping developers grow their skills. Although many metrics vendors cite this as a use case, there are no research studies that validate the use of activity metrics for helping developers grow their skills. Rather, a recent paper by Microsoft researchers identified the top five attributes of great engineers as:
- Writing quality code
- Anticipating future needs and trade-offs
- Gathering information to make informed decisions
- Enabling others to make decisions efficiently
- Continuously learning and finding solutions
For leaders who hire or coach developers, these findings present valuable insights for hiring, upskilling, and supporting developers’ growth. Activity metrics, however, do not help developers or managers assess or boost skills across these attributes.
Instead of attempting to use individual activity metrics to evaluate developer performance, consider improvements to your formal performance review processes, starting with a clear definition of role expectations and then working backwards to identify potential metrics.
Improving engineering productivity. Another common motivator for tracking individual activity metrics is to identify inefficiencies and improve engineering productivity. However, activity metrics are heavily influenced by outside forces, so when they’re used in isolation they may give misleading signals about where productivity issues exist. This is explained by Microsoft researchers in their recent paper:
A recent paper by Thoughtworks further recommends against using activity metrics to try to improve developer effectiveness:
Whether to track developer activity metrics is a frequently debated topic. The findings discussed in this article, which come from researchers at companies like Google and Microsoft, can help leaders better navigate the potential pitfalls of using these types of metrics.