Monday, November 1, 2010

The Art of Statistical Analysis

Statistics are the infamous tool of the diabolical.  Statistics are also the crutch for the careless and lazy.  Many times you have to do your own homework to find the other half of the complete “truth” or “fact” in order to place it in proper (and accurate) perspective. 

(Just a note: I wasn’t at all excited to register for my first semester of college Statistics years ago.  But when I finished with a 4.0 I continued on with more.  I found it interesting how so much science and technique could be fabricated on top of such a murky foundation.  I think of it as a house being built where you can modify the foundation as the walls and roof are also being built).

Case in point: The NHTSA says that highway deaths in 2009 were the lowest they have been “since 1950”, and that the reduction was due to increased safety efforts by manufacturers and checkpoints by law enforcement.  [full report here]  The main points of this report were:

  • 9.7 percent fewer vehicle crash deaths from 2008 to 2009
  • 1.13 deaths per 100 million vehicle miles traveled
  • 7.4 percent fewer drunk-driving deaths
  • 5.3 percent fewer vehicle crashes overall

They based this report on FARS data from the BTS [report].

But is this really accurate?

Based on another report by the Earth Policy Institute (EPI) [report]: the number of cars “scrapped” outnumbered the number sold in 2009 by 4 million.  The total number of cars in the U.S. fell from 250 million in 2008 to 246 million in 2009.  That’s a reduction of roughly 2 percent.  In addition, the number of “fleet vehicles” fell almost 29 percent from 2008 to 2009 based on R.L. Polk survey data [report].

So, most indications point to fewer vehicles on the roads in 2009 as compared to 2008.  Given a (potential) 2 percent drop in personal vehicles and a 29 percent drop in commercial fleet vehicles, a drop of 5.3 percent in total crashes is not really all that impressive.  The 9.7 percent reduction in accident deaths is, however, fairly consistent with these bounds: it is a reasonably proportional figure.
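For anyone who wants to check the arithmetic, here is a rough sketch in Python.  It uses only the figures quoted above; the function and variable names are my own, and it assumes (loosely) that the number of registered vehicles is a fair stand-in for how much driving actually happened, which the reports do not establish.

```python
def percent_drop(before, after):
    """Return the percentage decrease from `before` to `after`."""
    return (before - after) / before * 100.0


# Figures quoted in this post (EPI vehicle counts, in millions).
vehicles_2008 = 250
vehicles_2009 = 246

# Drop in total registered vehicles: the "roughly 2 percent" above.
registered_drop = percent_drop(vehicles_2008, vehicles_2009)

# Percentage drops taken directly from the reports cited above.
reported_drops = {
    "fleet vehicles (R.L. Polk)": 29.0,
    "crash deaths (NHTSA)": 9.7,
    "crashes overall (NHTSA)": 5.3,
}

print(f"registered vehicles: {registered_drop:.1f}% drop")
for label, drop in reported_drops.items():
    print(f"{label}: {drop:.1f}% drop")

# The two vehicle figures form the "bounds" used in the argument.
low, high = registered_drop, reported_drops["fleet vehicles (R.L. Polk)"]
for label in ("crash deaths (NHTSA)", "crashes overall (NHTSA)"):
    inside = low <= reported_drops[label] <= high
    print(f"{label} within {low:.1f}%-{high:.1f}% bounds: {inside}")
```

The exact vehicle drop works out a bit under the rounded 2 percent.  Note that the bounds are wide, so landing inside them is a weak test at best; a stronger comparison would need miles driven, not just vehicle counts.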

If this is an accurate assumption, it would indicate that other factors cited by the NHTSA report really had no meaningful impact on the outcome.  Fewer people crashed and died because fewer people were driving.

Other Interesting Numbers

Another report by the Bureau of Transportation Statistics (BTS) [report] indicates that most fatal accidents involve a single vehicle, rather than multiple vehicles, and most occur on straight roadways, rather than curved roads.  Additionally, there were more single-vehicle accidents in rural areas than there were multi-vehicle accidents in urban areas (where each of these categories is the predominant type in its setting).  So more people died crashing their own cars into something other than another vehicle, on straight roads out in the countryside.

You’re safer driving in the city than in the country.  Those numbers are not per capita.  Those are the total numbers.  Damn.

Convergence

So, if we can assume that fewer people died in car crashes in 2009 compared with 2008, and that most people who were killed in car crashes died in “single vehicle” accidents, on straight roads, away from urban areas, AND that the most probable way to be killed is in a motor vehicle accident:

Statistically (yes, I know how ironic it is to say that given the flavor of this article), the biggest threat to your life is you.

Conclusion

What’s even more interesting than just bending numbers or building conclusions on partial input, is how much intentional manipulation is possible (and used) after the numbers are crunched in a particular “direction”.  For example, they could say the reduction in deaths was “good” because of intentional efforts to improve safety.  Or they could say it was “bad” because it was determined by economic pressures which reduced traffic overall.  This is called “spin”.

The numbers can be selectively filtered, then analyzed with a selective lens to produce desired results, which can then be spun into being either good or bad.  Statistics is one of the most un-mathematical mathematical arts in the world.
