Data Torture

Data torture is the practice of repeatedly reinterpreting source data until it yields a desired result.

The concept of data torture was first proposed by the British economist Ronald Coase during the 1960s. He believed that if you torture the data, it will confess to anything. There is often more than one way to sample and interrogate data, and deciding which approach is best is subjective and open to bias.

In the following chart you can clearly see trends such as increases, decreases, spikes and plateaus. None of the lines represents the individual values recorded in the source data, which is shown below in the bar graph and table of data. Instead, the lines illustrate some of the possible scenarios for interpreting the data.

[Figure: line chart of the aggregate data]

For example, from the source data it would be true to say that the men’s figures are in decline and the boys’ figures are improving. However, from the aggregate data in the line chart (above) you can see that the adult figures are improving and the male figures are in decline. So a slight shift in terminology (men to adult, boy to male) could give the intended audience a different impression.
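The effect of regrouping can be sketched in a few lines of Python. The numbers below are hypothetical, invented purely for illustration and not taken from the article's source data. The group names, the `combine` and `direction` helpers, and the assumption that a higher figure counts as "improving" are likewise illustrative choices.

```python
# Hypothetical yearly figures for 2010-2016, invented for illustration only.
SOURCE = {
    "men":   [50, 48, 46, 44, 42, 40, 38],
    "women": [30, 33, 36, 39, 42, 45, 48],
    "boys":  [10, 11, 12, 13, 14, 15, 16],
    "girls": [20, 20, 20, 20, 20, 20, 20],
}


def combine(*groups):
    """Sum the yearly values of the named source groups into one series."""
    return [sum(values) for values in zip(*(SOURCE[g] for g in groups))]


def direction(series):
    """Label a series by comparing its last value with its first.

    Assumes, for this sketch, that a higher figure is an improvement.
    """
    change = series[-1] - series[0]
    if change > 0:
        return "improving"
    if change < 0:
        return "in decline"
    return "stable"


# The same source data, described with slightly different labels.
views = {
    "men": SOURCE["men"],
    "boys": SOURCE["boys"],
    "adult": combine("men", "women"),  # men + women
    "male": combine("men", "boys"),    # men + boys
}

for label, series in views.items():
    print(f"{label}: {direction(series)}")
```

With these made-up figures, "men" and "male" are both in decline while "boys" and "adult" are both improving, so swapping one label for a near-synonym changes the story without changing a single recorded value.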

Furthermore, you could be selective about the period of data: take the female figures from 2010 to 2013 and say they are improving, or take 2013 to 2016 and say they are stable.
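Period selection can be sketched the same way. The series below is hypothetical, invented for illustration and not the article's female figures. The `direction_between` helper and the assumption that a higher figure counts as "improving" are illustrative as well.

```python
# Hypothetical female figures for 2010-2016, invented for illustration only.
YEARS = list(range(2010, 2017))
FEMALE = [40, 43, 46, 50, 50, 50, 50]


def direction_between(start_year, end_year):
    """Compare the value in end_year with the value in start_year.

    Assumes, for this sketch, that a higher figure is an improvement.
    """
    first = FEMALE[YEARS.index(start_year)]
    last = FEMALE[YEARS.index(end_year)]
    if last > first:
        return "improving"
    if last < first:
        return "in decline"
    return "stable"


# One series, three different reporting windows.
for start, end in [(2010, 2013), (2013, 2016), (2010, 2016)]:
    print(f"{start} to {end}: {direction_between(start, end)}")
```

Here the first window reads as improving and the second as stable, even though both describe the same series. Which window gets reported is a choice made by the analyst, not a property of the data.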

In these basic examples you can see how easy it is to slip from one impression to something quite different by taking a different approach to interpreting the data. Data torture simply reflects that if you keep coming at the data from different angles, you can get a whole range of answers. Sometimes this is unintentional, which underlines how much care needs to be taken. However, more often than not, data torture reflects the will of the analyst to keep working until they derive a result that suits their agenda.

[Figures: bar graph and table of the source data]