Software Development Audits
Due Diligence for Software Development


Relative Complexity and Sizing

One of the most important tools in the Agile Development toolbox is point-based sizing of user stories. In contrast to time-based estimation in hours and days, points estimate size in a relative way: two stories of equal size get an equal number of points.

While this sounds nice in theory, it’s hard to get an intuition for what that means. What is a point? What criteria do we use to get to our point size? If you ask 10 agile teams about their sizing criteria, you’ll probably get 10 different answers. Some use effort, others technical complexity. Sometimes points are not even numerical but rather ‘Small’, ‘Extra Large’, or even ‘Cat’, ‘Dog’, ‘Cow’. Especially in teams that are just starting to use point-based sizing there can even be different criteria within the team.

Before diving into a few ways to develop an intuition for relative sizing, let’s briefly review the benefits of using points.

Time-based sizing

With time-based sizing, a big problem is the impression of accuracy it gives. If I ask a developer for a time estimate and she tells me it’s going to take 4 hours, it seems reasonable to expect the task to be done 4 hours later. This is not how it works.

Especially in software development, the gap between estimate and reality can be very large. While a 400-hour project might take 500 hours to complete, a 1-hour task can just as easily take 5 hours. That’s a 400% overrun, compared to 25% for the larger project. The estimation error is not always proportional to the original estimate.
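To make the arithmetic explicit, here is a minimal sketch that computes the overrun relative to the original estimate for the two cases above. The function name `relative_overrun` is made up for this illustration.

```python
def relative_overrun(estimated_hours: float, actual_hours: float) -> float:
    """Return the overrun as a fraction of the original estimate."""
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    return (actual_hours - estimated_hours) / estimated_hours


# The two cases from the paragraph above.
project = relative_overrun(400, 500)  # 100 extra hours on a 400-hour estimate
task = relative_overrun(1, 5)         # 4 extra hours on a 1-hour estimate

print(f"project: {project:.0%}, task: {task:.0%}")
# prints "project: 25%, task: 400%"
```

The absolute overrun is far larger for the project (100 hours versus 4), but relative to the estimate the small task is off by a much wider margin.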

Another issue with time-based sizing is that while the best estimators (I have yet to find one) might be fairly accurate in estimating their own work, that accuracy breaks down when estimating the effort required from a team. Time-based estimates depend on who implements a task, the technology used, the state of the codebase, the rate and length of interruptions, previous experience with similar problems, and other things that vary wildly across people and tasks.

A third problem is psychological. If 4 hours are estimated for a task but I’m done in 2, I often feel like I missed something. I don’t want to deliver my work with time to spare, so I take a step back and look for improvements. Finishing early a number of times might also cause future estimates to be adjusted downwards. As a result, the time spent on time-estimated tasks tends to converge towards the estimate, not towards the optimum.

All things considered, time-based estimates are poor predictors of the duration of a single task done by a single person.

Point-based sizing

Point-based sizing, on the other hand, does not aim for individual accuracy. The ambiguity in what a ‘point’ is makes it impossible to assess whether a single story is exactly 5 points or not.

Instead, points shine when aggregated over a larger number of stories. They get their true value from empirical evidence: “Over the past N sprints we were able to deliver X points.” This is referred to as the velocity of a team.
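As a rough sketch, velocity can be computed from a team's sprint history. This assumes one common reading of velocity, the average number of points delivered per sprint over the last N sprints. The function name and the sample history are invented for illustration.

```python
def velocity(points_per_sprint: list[int], last_n: int) -> float:
    """Average points delivered per sprint over the most recent `last_n` sprints."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    recent = points_per_sprint[-last_n:]
    if not recent:
        raise ValueError("no sprint history available")
    return sum(recent) / len(recent)


# Hypothetical history: points delivered in six consecutive sprints.
history = [21, 18, 25, 23, 20, 24]

print(velocity(history, last_n=3))  # average of the last three sprints
print(velocity(history, last_n=6))  # average of all six sprints
```

No single story in that history needs to have been sized exactly right. The average only has to be stable enough to plan with.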

The lack of a measurable quantity for what a point represents makes it hard to estimate in points. Teams should find a common framework of what to include in their estimation, and above all remain consistent over time.

Qualities of a point

It’s difficult to describe points in terms of measurable qualities. This will differ from team to team. Instead, what we can do is describe what points should not be. Looking at the drawbacks of time-based estimation, we can state that points should not:

- Change based on technology used
- Change based on experience of the developer
- Change based on interruptions or distractions
- Change based on the maturity of the existing code
- Change based on the order of other stories
- Change based on availability of team members

Each of these factors will likely change the number of hours a story takes to implement. But instead of being incorporated into each story individually, they affect the overall velocity.

Using this framework, we can improve our productivity by addressing the above items individually. By improving, for instance, the maturity of the existing code, we can measurably increase our velocity and deliver more points each sprint. It helps the team find bottlenecks and address them by asking the question: “How can we increase our overall velocity?”

Using these guidelines means that implementing a story today might take multiple sprints, while doing the same story later, when more building blocks are already in place, might take significantly less time. While this may seem counterintuitive at first, this is exactly how we want to use points. Having those building blocks in place means a direct increase in velocity and, in turn, in the capability to deliver more stories, faster.
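A small sketch of that effect, under the assumption that the point size of the work stays fixed and only the velocity changes. The backlog size and both velocities are hypothetical numbers chosen for the example, and `sprints_needed` is an illustrative helper, not an established formula.

```python
import math


def sprints_needed(backlog_points: int, velocity: float) -> int:
    """Number of whole sprints needed to deliver a backlog at a given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


backlog = 120  # hypothetical backlog; its point size does not change

print(sprints_needed(backlog, velocity=20))  # before the building blocks exist: 6
print(sprints_needed(backlog, velocity=30))  # after velocity has improved: 4
```

The stories keep their points. What changes is how many of those points the team can deliver each sprint.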

Kamiel Wanrooij