Business Math, Ch. 7
Why is the standard deviation big?
As you probably already know from your reading, standard deviation indicates the spread of the data. There can be many reasons that data has a big spread. Some common reasons are:
1. That’s just the way the data happens. An example of this would be if we were randomly choosing men anywhere in the world and measuring their height. The possibility exists that we would choose the world’s tallest man and/or the world’s shortest man to include in our data. These heights would make our numbers look weird, but in fact, they are correct.
2. There is a problem with the data. This can happen for many reasons:
- The measurement was not done properly
- The machinery was miscalibrated
- The wrong units were used (years rather than days, net income versus gross income, etc.)
- The respondents in our sample misunderstood the question. One example of this from my business: when we asked a question about “talk shows,” we did not anticipate that a large part of our sample would consider morning shows on music stations to be “talk shows.”
- The procedure for measurement has changed over a period of time (for example, the way autism is diagnosed has changed over the years, so data from years past may not be comparable with current data).
- Machinery is more sensitive and/or sophisticated and can now measure more, and more precisely, than before. For instance, we can now detect earthquakes and measure attributes of hurricanes that could not be measured in the 1800s.
- There is a typo in our data.
- The interviewer (in the case of an in-person interview) misunderstood the respondent’s answer, or recorded it incorrectly.
3. There is an underlying variable that is causing variation in our data. An example of this would be our DJ example above. Most likely, the DJ with the higher standard deviation has a polarizing personality that causes one gender to like the DJ while the other does not, or the DJ appeals to a younger audience rather than an older one, etc. Another common example comes from retail: if we were looking at the number of transactions per store, an underlying variable we would need to account for before beginning the analysis is store size.
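To see how much reasons 2 and 3 can move the standard deviation, here is a small sketch in Python. All of the numbers are made up for illustration (they are not from the reading or from any real store), and the variable names are my own. The first part shows what a single typo does to the sample standard deviation. The second part shows how mixing two sizes of stores produces a big spread even when each group on its own is fairly consistent.

```python
import statistics

# Hypothetical daily transaction counts for one store.
# In the first list, the last value was typed as 1200 instead of 120.
with_typo = [118, 122, 120, 119, 121, 1200]
corrected = [118, 122, 120, 119, 121, 120]

# statistics.stdev computes the sample standard deviation (divides by n - 1).
sd_with_typo = statistics.stdev(with_typo)
sd_corrected = statistics.stdev(corrected)

print("SD with typo:   ", round(sd_with_typo, 2))
print("SD after fixing:", round(sd_corrected, 2))

# Hypothetical transaction counts for small and large stores.
small_stores = [95, 100, 105, 98, 102]
large_stores = [480, 500, 520, 495, 505]
all_stores = small_stores + large_stores

# Store size is the underlying variable: the combined data looks very
# spread out, but each size group by itself is much tighter.
sd_all = statistics.stdev(all_stores)
sd_small = statistics.stdev(small_stores)
sd_large = statistics.stdev(large_stores)

print("SD, all stores:  ", round(sd_all, 2))
print("SD, small stores:", round(sd_small, 2))
print("SD, large stores:", round(sd_large, 2))
```

If you run this, compare the standard deviation before and after fixing the typo, and then compare the combined figure with the two per-size figures. In practice you would not know in advance which value is a typo or which variable is hiding in the data, so treat this as a way to build intuition rather than as a procedure for cleaning data.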
Of course, the lists above are not exhaustive. There are other reasons, but these are some of the main ones. Have you encountered any of these scenarios in your job? If so, please share your experience with the class.