Measuring Agile Efficiency

This blog post is inspired by another Quora question: “What metrics do you use to track Agile Efficiency?”

To begin with, I want to state that if I had to choose between efficient and effective, I’d choose effective. Efficiency is often about output (how many widgets per hour), whereas Effectiveness is often about outcome (was the purpose consistently met).

Agility is about responding to change. Efficiency is achieved by driving out variation. An over-focus on efficiency will lead down a path of standardization and control, making for a less agile system.

That said, given the question was specifically about agile efficiency, I’d look at four things: Throughput, Cycle Time, Deployment Frequency, and Mean Time to Recovery.

Throughput

This is the number of cards completed in a given time period. For many teams this is every two weeks. You might decide it is weekly or monthly. I prefer weekly. It is small enough to give us quick feedback and large enough to be meaningful on most teams. If you are running weekly throughput and find it is extremely volatile, you might want to consider expanding the time period. As the team matures, you can contract it back down.
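As a rough illustration of weekly throughput, here is a minimal Python sketch. It assumes you can export each card's completion date from your board; the function name `weekly_throughput` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_throughput(completed_dates):
    """Count completed cards per ISO week, keyed by (ISO year, ISO week)."""
    counts = Counter()
    for d in completed_dates:
        iso = d.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Hypothetical completion dates exported from a board.
done = [
    date(2024, 1, 2), date(2024, 1, 4),                     # ISO week 1
    date(2024, 1, 9), date(2024, 1, 10), date(2024, 1, 12),  # ISO week 2
]
print(weekly_throughput(done))  # {(2024, 1): 2, (2024, 2): 3}
```

Note that a week with no completed cards does not appear in the result at all, so you would need to fill in zeros before charting or forecasting from it.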

Throughput is useful for forecasting. The best technique I know of today is the Monte Carlo Forecast tool provided by Focused Objective on their Tools and Resources page. I provided detailed instructions on this technique in my book Escape Velocity.
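To give a feel for the idea, here is a minimal Monte Carlo sketch in Python. It is not the Focused Objective tool or the method from the book, just a simplified illustration. It assumes future weeks will resemble past weeks, so each simulated week draws at random from the team's historical weekly throughput. The function name, the 85th percentile default, and the sample numbers are all assumptions.

```python
import random


def forecast_weeks(history, backlog, trials=10000, percentile=0.85, seed=None):
    """Estimate the number of weeks needed to finish `backlog` cards.

    history: past weekly throughput values (cards completed per week).
    Returns the week count that `percentile` of simulated runs finished within.
    """
    if backlog <= 0:
        return 0
    if not any(t > 0 for t in history):
        raise ValueError("history needs at least one week with completed cards")

    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        remaining, weeks = backlog, 0
        while remaining > 0:
            remaining -= rng.choice(history)  # sample one past week at random
            weeks += 1
        results.append(weeks)

    results.sort()
    index = min(len(results) - 1, int(percentile * len(results)))
    return results[index]


# Hypothetical data: five weeks of throughput and a 40-card backlog.
print(forecast_weeks([3, 5, 4, 6, 2], backlog=40, seed=1))
```

Reading the result at a high percentile rather than the average gives a forecast you are more likely to hit, at the cost of a later date.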

Related to efficiency, Throughput indicates a rate of flow. In general, higher throughput means better flow. Better flow means higher efficiency.

Cycle Time

Cycle Time is the amount of time it takes a card to move from one state to another on the board. Many teams look specifically at the time a card spends between the start of development and being ready for production.
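As a minimal sketch of that measurement in Python, assume each card records a timestamp for when development started and when it became ready for production. The names and sample timestamps are hypothetical.

```python
from datetime import datetime
from statistics import median


def cycle_time_days(started, ready):
    """Elapsed days between two board states for a single card."""
    return (ready - started).total_seconds() / 86400


# Hypothetical cards: (development started, ready for production).
cards = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 3, 9)),
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 3, 21)),
    (datetime(2024, 1, 4, 9), datetime(2024, 1, 10, 9)),
]
times = [cycle_time_days(start, end) for start, end in cards]
print(median(times), max(times))
```

Looking at the median alongside the maximum is one way to spot the occasional oversized card that an average alone would blur.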

Lower cycle times indicate the team is more responsive. Lower cycle times usually indicate the team is working in a focused manner with few items in progress at once and that the items are generally small in size.

Long cycle times indicate too much work in progress or that the work items themselves are too large.

Related to efficiency, cycle time helps us further evaluate the flow of work through the team. It helps us to see if WIP is too high or batch sizes are too large. Shorter cycle times indicate higher efficiency.

Deployment Frequency

Throughput is often measured when a card is ready to be deployed. Cycle Time often does not include deployment. As a result, a team can look quite efficient and still release software infrequently. An automotive dealer wouldn’t consider a car manufacturer efficient if it reported producing 10,000 cars/week at a cycle time of 1 day/car but delivered 140,000 convertibles all at once, 8 weeks into the summer season.

Deployment Frequency measures how often the team pushes to production. For high-performing, highly efficient teams, this is multiple times per day.
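A minimal Python sketch of the calculation, assuming you can pull production deployment timestamps from your pipeline. The function name and the sample data are hypothetical.

```python
from datetime import datetime


def deployments_per_day(deploy_times, period_days):
    """Average number of production deployments per day over the period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deploy_times) / period_days


# Hypothetical deployment log covering three days.
deploys = [
    datetime(2024, 1, 8, 10), datetime(2024, 1, 8, 15),
    datetime(2024, 1, 9, 11), datetime(2024, 1, 9, 16),
    datetime(2024, 1, 10, 9), datetime(2024, 1, 10, 14),
]
print(deployments_per_day(deploys, period_days=3))  # 2.0
```

Dividing by the full length of the period, rather than by the number of days that had a deployment, keeps quiet days from being hidden.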

Mean Time to Recovery

So far, we’ve talked about how efficient the team is at delivering features. That’s great, but what about recovering from issues? If we want to move fast, we need to be able to respond fast and recover fast. Mean Time to Recovery is the average time it takes to remediate a production issue once found.

For many teams I work with, there is no list of defects and no classification of bugs as low, medium, or high. There are two states: broken (known defect present) and working (no known defects present). When production is broken, we stop everything else and get it working.

In fact, we build mechanisms in to ensure we can get back to working as fast as possible. Blue-green deployments, feature toggles, and rolling deployments are a few of the techniques we use. The idea is to be able to recover in an instant. Once we have production back in the working state, we set about correcting the defect and moving the code back into production as soon as possible.
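To make the Mean Time to Recovery calculation concrete, here is a minimal Python sketch. It assumes each production incident is logged with the time it was found and the time production was back in the working state; the names and sample incidents are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (restored - found) across incidents, as a timedelta."""
    if not incidents:
        raise ValueError("no incidents to average")
    total = sum((restored - found for found, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (found, production working again).
incidents = [
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 10)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 30)),
]
print(mean_time_to_recovery(incidents))  # 0:20:00
```

The clock here stops when production is working again, not when the underlying defect is corrected, which matches the broken and working states described above.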

From an efficiency perspective, it is paramount that you are efficient at recovery.