Sunday, September 7, 2014

Statistics paralysis ...

Technical people know all about statistics. We focus on frames per second, INSERTs per second, or any number of other stats to tell us whether we did a good job with our programming. It's rather amazing (to me) how much of our statistical tracking is about performance. I suppose that's a side effect of the perception by most of the world that the primary purpose of computers is to do things faster. It might also be that performance is easy to measure and communicate. You don't often see technical people relaying stats about improved accuracy or improved quality as a result of software. It happens, but not nearly as often as somethings per second are reported.

On the business side, people tend to worry about ROI-focused statistics: things like cost per click or number of page views for marketing, or profit and loss for the accountants.

Most of this isn't new. If you see an old movie with a room full of workers at adding machines, they're probably compiling the statistics for their company. What is relatively new to the business world is the speed and simplicity with which statistics can be collected and converted into a consumable form (such as graphs and charts). Computers can collect data and turn it into statistics so fast that a lot of people don't even realize that there's a difference between data and statistics, or that there's a process for converting one into the other.
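To make the data-versus-statistics distinction concrete, here's a minimal sketch. The records and the `summarize` function are hypothetical, made up for illustration; they don't come from any real advertising system. The raw list is the data, and the small dictionary that comes out the other end is the statistics.

```python
# Hypothetical raw data: one record per ad impression.
# (Invented values, purely for illustration.)
impressions = [
    {"hour": 9, "clicked": False},
    {"hour": 9, "clicked": True},
    {"hour": 10, "clicked": False},
    {"hour": 11, "clicked": False},
]


def summarize(records):
    """Reduce raw impression records (data) to a few summary numbers (statistics)."""
    views = len(records)
    clicks = sum(1 for r in records if r["clicked"])
    # Click-through rate; guard against dividing by zero when there is no data.
    ctr = clicks / views if views else 0.0
    return {"views": views, "clicks": clicks, "ctr": ctr}


stats = summarize(impressions)
print(stats)
```

The conversion step is tiny, which is exactly why it's easy to forget that it's there at all.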

I ran afoul of this last week, experimentally running some advertisements while at the same time doing some performance improvements to Idamu Caverns.  I found myself micro-profiling parts of the code until I ran out of ideas on how to make it faster, then spending a lot of time refreshing the statistics of the advertising system I was using to watch the page views and clicks change over time.

There were two big problems with this:

  1. The game was already fast enough.  Sure, it can always be faster, but the performance was no longer the most important thing I needed to work on, yet I'd become so focused on making the numbers smaller that I'd failed to notice that fact.
  2. Watching advertising statistics in time intervals less than 24 hours is almost a complete waste of time.

But I could do both of those things, so I did. It wasn't until the next day, when I was reviewing my work and planning what to do next, that I realized I'd become paralyzed by the statistics, a slave to them perhaps, constantly tweaking ridiculous things and watching to see if the numbers improved.

That's not to say that super-tweaking is never good. If you have a server that's hosting a large number of users and you can improve performance by 0.1%, that could result in significant gains over time. But that's a business justification. Improving performance simply because you can is not productive, and that's what can happen when you look at the numbers without carefully considering what they mean.
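As a rough back-of-the-envelope sketch of that business justification: the function name and every number below are assumptions I picked for illustration, not measurements from any real server.

```python
def cpu_hours_saved_per_year(requests_per_day, ms_per_request, improvement_fraction):
    """Estimate CPU-hours saved per year by a fractional performance improvement.

    All inputs are hypothetical; improvement_fraction=0.001 means 0.1%.
    """
    cpu_seconds_per_day = requests_per_day * ms_per_request / 1000
    saved_seconds_per_day = cpu_seconds_per_day * improvement_fraction
    return saved_seconds_per_day * 365 / 3600


# Assumed workload: 50 million requests a day at 20 ms of CPU time each.
saved = cpu_hours_saved_per_year(50_000_000, 20, 0.001)
print(f"{saved:.0f} CPU-hours saved per year")
```

With those made-up inputs the saving comes out to roughly 101 CPU-hours a year. Whether that's worth the effort depends entirely on what the time costs, which is the point: the number only means something once it's tied to a business reason.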

I've seen this in large companies where there are people whose job it is, specifically, to look at the stats. Since that's all they do, it's very easy for them to bog down others in their constant analysis of stats. I think the solution to this is strong leadership within the company that specifically lays out the time periods over which statistics are to be analyzed. You see some of this already. For example, boards of directors will commonly want to see statistics on a quarterly basis, and don't want to be bothered with them more often than that.

On a personal level, it's an act of self-discipline (for me, at least). I find myself watching the stats when the important tasks I have to do are unpleasant ones, and I also spend too much time on it because I fail to notice that things are already "good enough." I find that scheduling daily times when I go over my TODO list and prioritize it keeps me from getting too far off track. But I also sometimes allow myself to do the tasks that I enjoy, even if they aren't the most important. It takes some of the sting out of doing the unpleasant ones.

Being both the worker and the boss is frequently a challenge for me.  I often joke that I don't get along with my boss at all.
