I’ve seen a few posts about Iron Butt Rally statistics. The simple data, such as position, score, miles, make, model, and rider name, are up from 1997 to 2011. It was interesting to me to see people talk about different brands as being reliable or unreliable, and the question always remains: is such discussion based in fact? The issue with BMW final drives seems to be a significant problem, but what does the data show, even in a simplified form? So I set out to analyze and evaluate the data using some extra time and a spreadsheet, knowing that there are lots of issues with using the data in this way.
The first thing to say when discussing the data is that I had to clean it up significantly. In some years the brand was H-D, in other years Harley, and in others Harley-Davidson. So I set out to make it one thing and chose Harley-Davidson. Another item is that although brand denotes one thing, the model of bike is just as important. A 250 Kawasaki Ninja is significantly different from a Concours 1400. Another issue is that some bikes start the rally with significantly higher mileage and a perception that they may be less likely to finish. As an example, no Suzuki (in this data set) has finished the rally, but at least two of the set were RE5s. So right from the start the utility of these numbers is limited. Remember we’re only talking about the 1999, 2001, 2003, 2005, 2007, 2009, and 2011 Iron Butt Rallies. The data is all on the Iron Butt Rally website if you want to play with the numbers too.
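For anyone who would rather script the brand cleanup than do it by hand in a spreadsheet, here is a minimal sketch in Python. The alias table and the function name normalize_brand are my own illustration; only the three Harley spellings come from the data described above, and other brands would need their own entries once you look at the real file.

```python
# Hypothetical alias table: only the Harley spellings are taken from the post.
# Any other variants would have to be added after inspecting the real data.
BRAND_ALIASES = {
    "h-d": "Harley-Davidson",
    "harley": "Harley-Davidson",
    "harley-davidson": "Harley-Davidson",
}

def normalize_brand(raw: str) -> str:
    """Return one canonical spelling for a brand name."""
    tidy = " ".join(raw.split())  # trim and collapse stray whitespace
    # Look up the lower-cased name; unknown brands pass through unchanged.
    return BRAND_ALIASES.get(tidy.lower(), tidy)

cleaned = [normalize_brand(b) for b in ["H-D", "Harley", " harley-davidson ", "BMW"]]
print(cleaned)
```

Brands that are not in the table are left alone, so nothing gets silently renamed.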
Iron Butt Rally Bike Brands
If you flip a coin at the start of the Iron Butt Rally you have almost the same chance of hitting heads as seeing a BMW. The BMW marque owns about 43% of the field at the start of the rally. The next brand in contention is Honda with 28% of the field at the start. Add in Yamaha at almost 13% and you have roughly 84% of the field by brand. There were slightly more than 700 starters in the years listed. Of those, approximately 150 did not finish the rally. Starters here means entries, not individuals: it does not mean 700-plus different people have run the rally, because a significant number of riders have run multiple rallies.
1999 through 2011 DNF’s
So who doesn’t make it? If half the bikes starting are BMWs, are half the bikes failing also BMWs? Well, the answer is a qualified yes. The distribution of bikes that did not finish (DNF) is nearly the same as the distribution of bikes that started. BMW’s share of the DNFs is slightly larger than its share of the starters, and Honda’s share of the DNFs is slightly smaller than its share of the starters. Yamaha is similar to Honda: its share of the DNFs is lower than its share of the starters, which suggests that Honda and Yamaha are more reliable than BMW. All the previous caveats, and many more, apply to such a statement. Personally, without having done a regression analysis, I would say they are close enough to call the top five brands pretty much equally reliable across the board. There are just more of some bikes than others.
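The comparison here is share of starters against share of DNFs, brand by brand. A rough sketch of that calculation in Python follows. The records are made-up toy data, not the rally results, and brand_shares is just a name I picked for the helper.

```python
from collections import Counter

def brand_shares(brands):
    """Map each brand to its fraction of the given list."""
    counts = Counter(brands)
    total = sum(counts.values())
    return {brand: n / total for brand, n in counts.items()}

# Made-up toy records of (brand, finished?). These are NOT the rally data.
entries = [
    ("BMW", True), ("BMW", True), ("BMW", False), ("BMW", True),
    ("Honda", True), ("Honda", True), ("Honda", False),
    ("Yamaha", True), ("Yamaha", True), ("Yamaha", True),
]

start_share = brand_shares([brand for brand, _ in entries])
dnf_share = brand_shares([brand for brand, finished in entries if not finished])

# A brand whose DNF share is above its start share fails more often than
# its presence in the field alone would predict.
for brand in start_share:
    print(brand, round(start_share[brand], 2), round(dnf_share.get(brand, 0.0), 2))
```

With only a handful of bikes per brand the two columns can differ a lot by pure chance, which is the same caution that applies to the real numbers.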
Chance that a particular brand will fail based on the number of bikes riding in the rally
Seven brands represent 97% of the failures. So what kind of results do you get if, instead of comparing each brand to the population (bad statistics word but we’ll use it), you compare each brand’s DNFs to its own starters? What we’re doing is comparing the BMWs that started the rally to the BMWs that DNF’d, giving us the percentage of that brand that failed. In this example we get that 24% of the BMWs failed to finish, 18% of the Hondas, and around 16% of the Yamahas. We could say that if you ride a BMW you have about a one in four chance of DNF’ing, and if you ride a Yamaha you have about a one in six chance of DNF’ing. That does not bode well for the BMWs in the crowd. We have to keep in mind that finishing is not winning and there is more to the equation than mere brand. One of the BMWs in the sample is known to have had nearly half a million miles on it when it competed. Outliers like that are common, and they make for interesting data points that should be taken into consideration.
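As a sanity check, the rounded figures quoted in this post can be multiplied through to see what share of all DNFs each brand would account for. This is back-of-envelope arithmetic on rounded numbers, not a calculation from the spreadsheet, and implied_dnf_share is a helper name I made up for the sketch.

```python
# Rounded figures quoted in the post; treat every result as approximate.
STARTERS = 700     # "slightly more than 700 starters"
TOTAL_DNF = 150    # "approximately 150 did not finish"
start_share = {"BMW": 0.43, "Honda": 0.28, "Yamaha": 0.13}
dnf_rate = {"BMW": 0.24, "Honda": 0.18, "Yamaha": 0.16}

def implied_dnf_share(brand):
    """Estimated fraction of all DNFs that belong to one brand."""
    brand_starters = STARTERS * start_share[brand]
    brand_dnfs = brand_starters * dnf_rate[brand]
    return brand_dnfs / TOTAL_DNF

for brand in start_share:
    print(f"{brand}: {start_share[brand]:.0%} of starters, "
          f"about {implied_dnf_share(brand):.0%} of DNFs")
```

On these rounded inputs BMW comes out somewhat above its share of the starters and Honda and Yamaha come out somewhat below theirs, which lines up with the direction of the DNF distribution described earlier.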
Brand Top Ten Finishers
Starting is not winning, and DNFs are a possibility for a variety of reasons. If you want to be in the top ten at the end of the rally you are almost always riding a BMW, Honda, or Yamaha. Nearly half the top ten finishers were riding BMWs, which is a somewhat larger share than BMW’s roughly 43% of the starters. The distribution by brand follows a similar pattern for starts, DNFs, and top ten finishes, with a few skews. Triumph and Harley-Davidson have had exactly one each in the top ten in the years examined.
It’s the rider not the bike? Top ten finishers
In the years examined, can we make any kind of prediction about whether it is the bike or the rider? Not really, unless your name is Hoogeveen or Jewell. If it is and you’re in the rally, you are likely going to finish in the top ten. Examining the top finishers, we find seven people with three or more top ten finishes. Riders with one or two top ten finishes account for about two-thirds of the field. It would be interesting to see how many rookies are in the top ten each time. We can derive that data, should we wish, by counting the years in which each name appears. The problem is that we don’t have the data from the very beginning of the rally, so somebody who rode in 1995 might not have participated again until 2005. We’d incorrectly code that person as a rookie.
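Here is a small sketch of that rookie coding in Python, mostly to show where it goes wrong. The riders and years are invented for illustration, and the function names are my own.

```python
from collections import defaultdict

# Invented riders and years for illustration. These are NOT the rally results.
entries = [
    (1999, "Rider A"), (2001, "Rider A"), (2005, "Rider A"),
    (2001, "Rider B"), (2003, "Rider C"), (2005, "Rider C"),
]

def first_seen(entries):
    """Earliest year each rider appears in the data we actually have."""
    first = {}
    for year, rider in entries:
        first[rider] = min(year, first.get(rider, year))
    return first

def rookies_by_year(entries):
    """Riders whose first appearance in this data set falls in each year."""
    result = defaultdict(list)
    for rider, year in first_seen(entries).items():
        result[year].append(rider)
    return dict(result)

print(rookies_by_year(entries))
```

Everybody in the earliest year of the data comes out as a rookie, and anyone who rode before the data starts and then skipped a few rallies does too, so the counts are an upper bound at best.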
Iron Butt Rally BMW Models
The data for the model of BMW bike in the rally is all over the board and very noisy. So, we’re going to say right up front that it would take weeks of time to clean the data up, and even then it might not mean anything. What is a BMW R1150GS with all of the pieces to make it an R1150 GS Adventure? Is it still a GS or is it an Adventure? The same could be said about several models. The data could be scrubbed down to “is it an R, K, or F bike” and what the designation is (650, 1000, 1150, 1200, 1300, 1600). As such, this section is more a guess by some numbers than hard-hitting analysis. I still think you will find the information interesting.
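The coarse “series letter plus designation” scrub could be sketched with a regular expression like the one below. The pattern and the coarse_model name are my own assumptions about how the model strings look; the real entries may well need more cases than this handles.

```python
import re

# Optional "BMW" prefix, then the series letter (R, K, or F), then the
# numeric designation. Everything after the number (GS, LT, Adventure) is ignored.
MODEL_RE = re.compile(r"^\s*(?:BMW\s+)?([RKF])\s*(\d{2,4})", re.IGNORECASE)

def coarse_model(raw):
    """Return (series, designation) such as ("R", 1150), or None if no match."""
    m = MODEL_RE.match(raw)
    if m is None:
        return None
    return m.group(1).upper(), int(m.group(2))

for raw in ["R1150GS", "R1150 GS Adventure", "k1200lt", "K1300GT", "R75", "Gold Wing"]:
    print(raw, "->", coarse_model(raw))
```

This sidesteps the GS versus Adventure question by putting both in the same R 1150 bucket, which loses detail but at least makes the counts consistent.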
BMW Iron Butt Rally DNF’s by Model
Which BMWs fail to complete? It is interesting to see the values represented across the marque. The best way is to compare the number of each model, as coded, that started with the number that finished. As an example, 100% of the coded R75s failed, and approximately one-third of the K1200LTs failed. You have to be careful with the numbers and with your reasons for accepting them. The K1300GT is a good example. In the years examined, exactly one of those was in the Iron Butt Rally. Any time you have a single instance of an entity you should look at the possible causes for a DNF.
People like to examine the data and draw conclusions about the ability of a bike to complete a rally like the Iron Butt Rally. Some brands obviously don’t finish, but that doesn’t mean the bike failed. Rider fatigue, accidents, home or family issues, and so many other things happen during the two weeks of the Iron Butt Rally that would have a bike and rider dropping out. In the end I did this little exercise because I like to play with numbers. This is play. To draw firm conclusions I would have to do a lot more analysis than the 50 or 60 minutes I spent on this project. There are lots of comparisons that can be made with the data set. I haven’t even looked at the points and mileage efficiencies, as I think there is more variability there than can be explained. I’m not sure, which is why it might be fun to look at, but it shouldn’t be taken to an extreme.
Lots of people have looked at these numbers but I wanted to see what I could find.
Excel spreadsheet, should you wish to play.