Is a spike a type of story?

Many teams that take an “agile” approach to software development define requirements in the form of “user stories.” Not all the work a team performs lends itself to this form. It seems that people have come up with definitions of various “types” of stories other than “user” stories so that they can deal with work items that do not represent vertical slices of functionality that can be demonstrated to a business stakeholder. That’s fine as long as it’s practical and helps teams deliver effectively.

One of these non-story stories is the so-called “spike story.” I keep running into this in my team coaching work, so I did a quick Google search and found the term “spike story” is used pretty consistently. Definitions published by the Scrum Alliance, Agile Learning Labs, Scaled Agile Framework (SAFe), SolutionsIQ and others all refer to spikes as a type of “story.”

Answers to a question about spikes on Stack Overflow are more encouraging, as they mention spikes are time-limited and at least two of the respondents refrain from calling them “stories.” Even so, the notion that a spike is a kind of story seems to have become a widespread infection.

What I see in the field is that treating spikes as stories interferes with teams’ ability to keep their work flowing smoothly. I would prefer we stop referring to spikes as any kind of “story.”

As I see it, a “story” has certain characteristics:

  • A discrete priority level set by the Product Owner or equivalent
  • Some form of “value” to at least one stakeholder
  • Some sort of functionality is delivered
  • Testable acceptance criteria
  • Indeterminate service time, so that the story is sized or estimated in some way during planning

I know that isn’t the canonical INVEST list. I know people can warp a “spike” so that it can be argued that it conforms to INVEST. That isn’t my point.

As I see it, a “spike” has certain characteristics, and they differ from those of a story:

  • It is not prioritized by the Product Owner as part of the “what.” The need for a spike is determined by the team, as part of the “how.”
  • It does not provide immediate or direct value to any stakeholder. It provides indirect value to the development team itself.
  • It does not result in the delivery of any functionality.
  • Its purpose is (a) to answer a question, or (b) to arrive at a better question.
  • A fixed time-box is specified for the spike, so it is not sized or estimated.

These differences are significant enough that it makes sense to treat spikes quite differently from stories. Using the word “story” tends to cause confusion on the part of teams that are not very mature in agile thinking. I advise teams not to treat spikes as stories at all. This seems to help reduce friction and stress, reduce planning overhead, and improve flow.

I’ve observed that treating spikes as stories leads to a domino effect of suboptimal practices. First, teams spend an inordinate amount of time trying to estimate or size spikes, because the outcome of a spike is unclear. The very reason to do a spike is that we don’t have enough information to know or guess the size or duration of the work.

Second, teams aren’t sure when a spike “story” ends. There is no deliverable and no testable acceptance criteria, so the definition of “done” is very different from that of a proper story.

Third, treating spikes as stories skews the metrics – particularly Velocity and Cycle Time. That leads to low predictability in short-term planning for future iterations or cadences.
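
To make the Velocity skew concrete, here is a minimal sketch with made-up numbers. The iteration data, the point values, and the helper name `velocity_series` are all assumptions for illustration, not observations from any real team. It compares the Velocity a team would report when spike points are counted against the Velocity from delivered stories alone.

```python
from statistics import mean

# Hypothetical data, one tuple per iteration:
# (points from delivered stories, points assigned to spike "stories")
iterations = [(20, 0), (14, 8), (21, 0), (12, 13)]

def velocity_series(iterations, count_spikes):
    """Return per-iteration Velocity, optionally counting spike points."""
    return [
        stories + (spikes if count_spikes else 0)
        for stories, spikes in iterations
    ]

# Velocity as reported when spikes are treated as stories.
with_spikes = velocity_series(iterations, count_spikes=True)

# Velocity based only on stories that delivered functionality.
stories_only = velocity_series(iterations, count_spikes=False)

print("counting spikes:", with_spikes, "mean:", mean(with_spikes))
print("stories only:   ", stories_only, "mean:", mean(stories_only))
```

With these invented figures, the reported average is 22 points per iteration, while the stories that actually delivered functionality average 16.75. A team that plans the next iteration from the larger number would be forecasting more deliverable work than its history supports.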

When teams treat spikes as time-bound experiments aimed at refining or answering a question, then these problems disappear. I’m not suggesting that we stop doing spikes; I’m suggesting that we stop calling them “stories,” because that word causes confusion for teams.

Anyone else have any observations on the subject?

9 thoughts on “Is a spike a type of story?”

  1. My experience is analogous to yours, and I agree with your points.
    One thing I noticed is that several teams don’t put a time box on the spike, but run it for an entire iteration, and also don’t make it focused enough so that the learning at the end is clear.

    1. Interesting. Running a spike “on the side” throughout an iteration would have an insidious effect on Velocity or Cycle Time. If they’re tracking Velocity, then the time spent on the spike will reduce the team’s Velocity for the iteration. If they’re tracking Cycle Time, then the time spent on the spike will accumulate across all other work in an “in progress” state. It seems obvious those effects would increase Cycle Time variability and degrade planning predictability.

    2. I have seen this too. I think not putting a time box on the spike is not too much of a problem when you limit the time worked on it in another way. Like, only two of our ten-person team will work on the spike (but then the learning for the whole team will be more difficult).
      I totally agree that every spike should have a well defined goal (what do we want to learn? what possibilities do we want to explore?) and be focused.

      1. I’m not so sure that approach is as clean as time-boxing the spike. It sounds a lot like the situation when a spike drags on throughout the whole iteration.

  2. I once asked a team to stop estimating time-boxed work (like spikes but also other types of work that do not add value for the customer, like database version upgrades) and to stop calling them “stories”. My argument was “when we already *know* how long it takes, we don’t have to estimate how long it will take”. The product owner strongly opposed this idea because “Then our velocity would be lower. We would look bad.”
    I tried to convince him that our velocity would be unreasonably high when we treat them as stories, because we would not deliver any functionality to the customer (but count the points as if we had). He ultimately ruled that we continue to treat this work as user stories.
    What other arguments could I have used?

    1. In my experience, that sort of thing happens when teams (or stakeholders) try to use Velocity for multiple purposes, such as tracking the amount of work done (OK), tracking the value delivered (wrong), and tracking how people spend their time (wrong). Without realizing it, that Product Owner is causing the Velocity observations to be false, which makes the observations meaningless for planning. It’s another among many examples of how Velocity is misapplied.

  3. Hi Dave,

    I generally teach that spikes are time-boxed efforts to discover enough information to estimate something or make a decision about something. If the information isn’t sufficient by the end of the time box, an explicit decision is made on whether another spike is worthwhile.

    For this reason, spikes require acceptance criteria – an explicit agreement on what information is sought. A spike along the lines of “investigate nosql options” would not qualify, but a spike along the lines of “confirm MongoDB will work with our hardware” would make sense.

    So – they aren’t stories, and they aren’t story proxies. They don’t necessarily produce value, they simply enable an estimate or decision to be made. They don’t get estimated, because there is no point in estimating something with a fixed duration.

    ymmv of course – but that’s how we’ve been using spikes.


    1. Sounds perfect to me. The fact you call the agreed-upon definition of the purpose of the spike “acceptance criteria” seems fine. Sometimes the words we choose affect behavior in undesired ways, but I don’t think “acceptance criteria” would have that effect. Using the word “story” to define a spike does seem to cause undesired behaviors, IME.

  4. Well said, well written, and I agree. In my classes I teach that Spike is a timeboxed activity… and don’t bother estimating story points for it.
