December 14, 2010

The scientific method

Filed under: Community,Process,Reading — Freek Leemhuis @ 11:42 pm

The ‘scientific method’ – or is it?

I like the work of David Anderson on Kanban, which I think deserves the attention it is now getting in the software development industry. His latest post describes the basic principles well. I noticed that he and others in the field describe the use of the ‘scientific model’ as an approach to improving one’s process. In this post I would like to examine the use of this term, because I think there’s a lot of misunderstanding about scientific methods and how they can be applied to software projects.
In David’s post it is described as follows:

The use of models allows a team to make a prediction about the affect of a change (or intervention). After the change is implemented the outcome can be observed by measuring the flow and examining the data. The outcome can be compared to the prediction expected from the model and the change can be assessed as an improvement, or not. This process of evaluating an empirical observation with a model, suggesting an intervention and predicting the outcome based on the model, then observing what really happens and comparing with the prediction, is use of the scientific method in its fundamental sense. This scientific approach, is I believe, more likely to lead to learning at both the individual and organizational level. Hence the use of the scientific approach in Kanban will lead directly to the emergence of learning organizations.

It makes a lot of sense: change a variable and then measure the difference it makes. I do believe, however, that some scientific methods are overlooked in this type of description. I’ll try to explain.

Suppose you’ve run a project and you have gathered all kinds of metrics on the performance of your team. The data shows that velocity has increased since one bald developer joined an otherwise hairy team. This leads you to the hypothesis that having at least one bald developer on the team leads to better team performance. In order to test this hypothesis, you introduce a bald programmer to another team, and again you collect empirical evidence to see if your hypothesis is confirmed. Again, the numbers show an increase in productivity. You’ve followed ‘the scientific method’, so you can now state firmly that

bald programmers lead to better team productivity.

And you have numbers to back it up.

The example is deliberately silly, and you will immediately think of alternative explanations. Quite often, bald developers are of advanced age and therefore, on average, more senior and experienced. There are all sorts of alternative explanations that would, if true, show the correlation between performance and amount of hair to be spurious. That is one problem that is not easily solved: there can always be an underlying variable that is the true predictor of the result.
However, the outcome can also be influenced by other factors, and these can be controlled by using methods that are ‘more scientific’.
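To make the spurious correlation concrete, here is a minimal sketch in Python. The data is invented for illustration, and I have rigged it so that output depends only on seniority, while baldness merely travels along with seniority. The names `developers` and `mean_output` are my own, not from any library.

```python
from statistics import mean

# Hypothetical, hand-made data: (is_bald, is_senior, story_points_per_sprint).
# By construction, output depends only on seniority (senior = 10, junior = 6).
developers = [
    (True, True, 10), (True, True, 10), (True, True, 10), (True, False, 6),
    (False, True, 10), (False, False, 6), (False, False, 6), (False, False, 6),
]


def mean_output(rows, bald, senior=None):
    """Mean output of bald (or hairy) developers, optionally within one seniority level."""
    selected = [
        points
        for is_bald, is_senior, points in rows
        if is_bald == bald and (senior is None or is_senior == senior)
    ]
    return mean(selected)


# Naive comparison: bald versus hairy, ignoring seniority.
naive_gap = mean_output(developers, True) - mean_output(developers, False)

# Stratified comparison: bald versus hairy within each seniority level.
senior_gap = mean_output(developers, True, senior=True) - mean_output(developers, False, senior=True)
junior_gap = mean_output(developers, True, senior=False) - mean_output(developers, False, senior=False)

print("naive gap:", naive_gap)
print("gap among seniors:", senior_gap)
print("gap among juniors:", junior_gap)
```

The naive comparison shows bald developers ahead, but the gap disappears once you compare like with like. Of course this only works for confounders you thought of and measured, which is exactly why the problem is not easily solved.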

Let’s talk about two types of controls that are common in social scientific research: the use of control groups and sample distribution.

Control groups

In the social sciences, data to test a hypothesis is gathered in controlled experiments. Usually, a researcher uses two different (sets of) subjects simultaneously: a treatment group, for which the effect of a manipulation is observed, and a control group, which is subject to all of the same conditions as the first except for the manipulation itself. The effect of the manipulation can then be measured by comparing the results of both groups. If you do not use a control group, the data that you so carefully collect might be influenced by factors outside of your manipulation. An increase in performance might, for example, be due to the team working on easier parts of the system during that period, or to the end of a bout of flu that had kept important team members from contributing. Using a control group keeps factors related to the timing of your experiment from distorting the measured effect of your manipulation, because such factors affect both groups alike. Well-established misinterpretations such as the Hawthorne effect can thus be avoided.
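One simple way to compare the two groups is to subtract the change seen in the control group from the change seen in the treatment group, a calculation usually called difference-in-differences. The sketch below is my own illustration, not something from David's post. The velocities are made up, and the calculation assumes that both teams would have followed the same trend had nothing been changed.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after, control_before, control_after):
    """Change in the treated team minus the change in the control team."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical velocities in story points per sprint.
treated_before = [20, 22, 21]   # team that will get the bald developer
treated_after = [27, 26, 28]    # same team, after he joined
control_before = [19, 21, 20]   # comparable team, same sprints, no change
control_after = [24, 25, 23]

naive_effect = mean(treated_after) - mean(treated_before)
effect = difference_in_differences(treated_before, treated_after, control_before, control_after)

print("before/after change in treated team:", naive_effect)
print("change net of the control team:", effect)
```

In this invented data most of the improvement also shows up in the team that did not change anything, so only the remainder can reasonably be attributed to the manipulation.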

Sample characteristics

Secondly, you’re looking for findings that apply not just to the team in question but somewhat more broadly. But can you say that findings for the one team are expected to apply to all teams in your organization? To do so you need a sample (the team) that is representative of the population (all teams) that you want to target.
In science, the sample size and the distribution of relevant characteristics within the sample are used to estimate the probability that the effect of the manipulation also holds for the population. If you want to measure the popularity of Ajax football club in the whole of Holland, for example, you would be wise to include people of different ages and sexes, from different parts of the country, with different incomes and so on. Obviously, the closer the sample size is to the size of the population, the higher the probability that the findings will apply to the population as a whole. We know teams can vary greatly. If you add the bald developer to a team of Java developers, the effect might be totally different than adding him to a team of PHP developers. Adding a baldy to a new team might increase productivity; adding him to a well-established team, however, could be counter-productive.
There are all sorts of team attributes that might be important to the effect of the manipulation. One should look carefully at the attributes of a team in order to decide if the sample is representative.
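The link between sample size and certainty can be sketched with the textbook margin of error for a proportion, including the finite population correction. This is an illustration under assumptions of my own: a simple random sample drawn without replacement, a normal approximation, and a hypothetical function name `margin_of_error`. It says nothing about whether the sample is representative, which is the harder question.

```python
import math


def margin_of_error(p, n, population, z=1.96):
    """Approximate 95% margin of error for an observed proportion p.

    Assumes a simple random sample of size n drawn without replacement
    from a finite population, using the normal approximation.
    """
    if population < 2 or not 0 < n <= population:
        raise ValueError("need population >= 2 and 0 < n <= population")
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: shrinks to 0 as n approaches the population size.
    correction = math.sqrt((population - n) / (population - 1))
    return z * standard_error * correction


# Hypothetical survey: half of the respondents say they support Ajax.
population = 1000
for n in (100, 400, 1000):
    print(n, round(margin_of_error(0.5, n, population), 3))
```

The margin shrinks as the sample grows and vanishes when everyone in the population has been asked. With one team out of many, you are at the other end of that scale.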


I don’t think there is such a thing as one ‘scientific method’. There are different methods, some more scientific than others, that can lead to an estimate of the probability that a hypothesis is true or false. And sure, in practice it will not always be feasible to improve our process in a rigorously scientific manner. I do think that, for best results, we should at least understand and communicate these factors and account for them where possible.



  1. Good post! To determine whether you’re dealing just with a correlation or a causation is indeed quite hard, especially if you have so many variables when you’re doing a project (context may also have an impact on the final result). I also think that a good thing of practicing Lean is that you focus on the entire value stream, which is also something that may help. The more detailed your measurements, the more difficult it may be to find causalities. Also, the flow you’re trying to achieve in Lean will also help you to become more predictable.

    And also, the shorter your cycles, the more predictable you become. The longer the cycle from initiation to ‘in production’, the more variables you have to deal with, and the harder it becomes to be predictable and find causalities.

    Just as you wrote in your post, the essence of using a control group is to remove interfering variables. Using a control group is not always an option in real-life, so finding other ways to remove variables (short cycles, flow, etc.) is a good way to become predictable.

    So again, great post, and maybe some causality you didn’t mention is the level of baldness instead of just bald or hairy 🙂

    Comment by Jan Willem Tulp — December 15, 2010 @ 10:11 am | Reply

  2. Well, actually there is a thing called “scientific method”; Wikipedia has a decent treatment on

    One of its particular points (missing in the bald programmer example) is that one is expected to state the assumptions under which the hypothesis should work, in this case e.g. the team characteristics and type of projects. Unstated assumptions can of course radically change the outcomes.

    The key is that the whole experiment has to be repeatable by an independent investigator, so that it is open to proof or refutation. If the hypothesis, assumptions, and model / experiment design are documented well enough, anyone can stand up and show that your conclusion is wrong. There is nothing bad about ending up with a wrong conclusion, as long as someone else can come up with a correct one. The bad thing with respect to the scientific method is to publish only selected parts of the process (“magic happens here”) and call the conclusion “the truth”.

    Other than that, thanks for a nice post.

    Comment by Premek Brada — December 16, 2010 @ 10:57 am | Reply
