Pop quiz: which of the following statements about decisions do you agree with?
- You need at least thirty data points to get a statistically significant result.
- One data point tells you nothing.
- In a business decision, the monetary value of data is more important than its statistical significance.
- If you know almost nothing, almost anything will tell you something.
I’m reintroducing the Measurement Challenge for the blog. I ran it for a couple of years on the old site and had some very interesting posts.
Use this thread to post comments about the most difficult – or even apparently “impossible” – measurements you can imagine. I am looking for truly difficult problems that might take more than a couple of rounds of query/response to resolve. Give it your best shot!
I came across more interesting research about possible “placebo effects” in decision making. According to the two studies cited below, receiving formal training in lie detection (e.g. so that law enforcement officers are more likely to detect an untruthful statement by a suspect) has a curious effect. The training greatly increases the experts’ confidence in their own judgments even though it may decrease their performance at detecting lies. Such placebo effects were a central topic of The Failure of Risk Management. I’m including this in the second edition of How to Measure Anything as another example of how some methods (like formal training) may seem to work and may increase the confidence of their users but, in reality, don’t work at all.
- DePaulo, B. M., Charlton, K., Cooper, H., Lindsay, J.J., Muhlenbruck, L. “The accuracy-confidence correlation in the detection of deception” Personality and Social Psychology Review, 1(4) pp 346-357, 1997
- Kassin, S.M., Fong, C.T. “I’m innocent!: Effect of training on judgments of truth and deception in the interrogation room” Law and Human Behavior, 23 pp 499-516, 1999
Thanks to my colleague Michael Gordon-Smith in Australia for sending me these citations.
Originally posted at http://www.howtomeasureanything.com, on Tuesday, September 08, 2009 8:20:46 AM, by Dreichel.
“This is my first post on your board. I am excited to actually be able to pose this to the author (or to others who want to chime in).
First, some background:
I work as an Analyst predicting Budget “Burn Rate”, that is to say “Here is what we forecasted we’d spend, here is what we actually spent”. A book like “How to Measure Anything” is invaluable because I am often asked to come up with ways to “measure the intangible” and more importantly predict what that value is going to be once you can measure it.
I do not always use formal statistical models to do my work. In fact, I tend to blend what I call “Common Sense” modeling into my approach. This involves using my past experience as a guideline to tell me when to use a six weighted moving average and when to consider a particularly unusual situation (for example, “This Holiday falls on a Saturday; even though we are not open, it WILL impact us, because people will take that Friday or the Monday after off”). Often, this common sense approach follows a quantitative logical pattern, but there is no set-in-stone approach to these methods.
Part of my job involves removing the statistical jargon and telling them in plain English “I took a 10% reduction in expected working hours for Friday and subsequently 15% for Monday because I expect a certain number of people to call in sick or take vacation around a Saturday Holiday (July 4th) greater than normal” or simplifying it further still.
The old “I just asked you what time it is, why are you telling me how to make a watch?” phrase applies here. I must keep my message simple.
Second, the question(s):
ONE: I have incorporated the concept of “90% CI” into my approach from Chapter Two of the book for Forecasting. Naturally, if you read the book you have an idea what 90% CI means, so I won’t go into it here.
However, I often deal with Managers who do not understand the concept of 90% CI. I have prepared a sort of Elevator speech (30 seconds or less) for what it means. However, I’d like to ask you for yours.
TWO: they ask for a Target Number for a Forecast, let’s say it is 10,124,556. This is created by adding up several other values that are provided to me from other sources. Unfortunately, I cannot round this to a less precise value like 10.1M when I express it. I would prefer to do this, but they are used to seeing the dollar value.
They do not want that value expressed as a range. They already have metrics in place as a Target of +/- 5% to their Target Number. How do I assign a confidence factor to a single target value? I don’t feel that 90% CI is correct when applied to a single value.
THREE: This is the big question. How do you answer “I don’t care about 90% right, I want 100% CI. I want to know what you think it WILL be”?
Many managers get hung up on the concept that you must go with your BEST guess, not your 90% guess. What are some things you have done when that came up in the past? I am sure I am not the only one to hear “100% CI”. They want to feel they are working with the best information possible, and I try to explain that I am trying not to be over-confident that my number is perfect. They don’t seem to buy that. They want to know what I REALLY think it will be.
As an example, take the calibration tests the Author gave: get 9 out of 10 questions correct. The typical person who brings up “Why not 100%?” would say “What if I happen to know all 10 answers? Are you saying I should purposely get one wrong, just to throw it off?”
Let’s go back to that forecasted number, but let’s make it 10 million for the sake of simple math. If I said the forecast was going to be 10 million dollars with a CI of 90%, they would then ask “Does that mean I can take 10% of 10 million +/- (in this case 1 million dollars) and you really think it’s going to fall within 9 million or 11 million?”
This range would be unacceptable to someone who must hit their target within +/-5%, so if that is the case, should I be trying to get 95% CI? And if so, what additional rigor needs to take place to get there? They would ask, if I could get 95% CI, why not 100%?
[That’s] more than a few questions for an introductory post, so I’ll just sit back and hope to hear from you.”
Thanks for your question and for giving me the chance to cover some of these important concepts again.
For starters, don’t presume how much or how little other managers might understand if the material is explained correctly. I constantly run into managers who warn me about how little “other” managers will understand about these basic concepts. Yet, when I explain it, I don’t find any of the resistance they anticipated. What I do find more often is that the first manager didn’t quite understand the issues and was explaining them poorly.
Some of your questions, in fact, indicate to me that there might be some confusion about the meaning and use of some of these concepts, and that some key points in the book were missed. Otherwise, responses to the kinds of questions you encounter should be fairly obvious. For example, at one point, you reference an estimate of “…10 million dollars with CI of 90%…”. 10 million dollars can’t be a CI because it isn’t an interval, but an exact point. You have to state an upper and lower bound to have an interval of any confidence. If you presented it that way, perhaps you would have avoided the speculation about what it might mean (i.e. “Does that mean I can take +/- 10% of 10 million”). I don’t encounter those types of questions because I give them the whole interval and don’t make them guess, as in “The 90% CI is 6 million to 14 million”. So first, I would make sure you understand the concepts very well yourself before we infer how much others, who are understandably confused by that kind of comment, would understand if it were correctly presented.
The confidence interval, of course, must be an interval (i.e. a range with an upper and lower bound) and it must have a stated confidence (e.g. 90%). My elevator pitch for a 90% CI is that it is a range of values for which there is a 90% chance that the range will actually contain the true answer. In other words, if I go back and look at all my 90% CIs (e.g. over the last few months or years) I should find that about 90% of the intervals contained the true answer. (For example, about 90% of my intervals for sales should have contained the actual sales.)
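The look-back check described above can be sketched in a few lines of Python. This is only an illustration: the intervals and outcomes are made-up numbers, and the function name `hit_rate` is a hypothetical helper, not something from the book.

```python
def hit_rate(intervals, actuals):
    """Fraction of (lower, upper) intervals that contained the actual value."""
    if len(intervals) != len(actuals):
        raise ValueError("need one actual value per interval")
    hits = sum(1 for (lo, hi), x in zip(intervals, actuals) if lo <= x <= hi)
    return hits / len(intervals)

# Hypothetical 90% CIs for monthly sales (in $ millions) and the actual results.
past_intervals = [(6, 14), (8, 12), (5, 9), (10, 15), (7, 11),
                  (9, 13), (4, 8), (6, 10), (11, 16), (8, 14)]
past_actuals = [10, 11, 9.5, 12, 8, 10, 6, 7, 13, 9]

rate = hit_rate(past_intervals, past_actuals)
print(f"{rate:.0%} of the 90% CIs contained the actual value")
```

A rate well below 90% over many intervals would suggest the ranges were too narrow (overconfidence); a rate well above 90% would suggest they were wider than the estimator's real uncertainty.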
We often use a 90% CI instead of a 100% CI because the 100% CI can be so wide that it is useless to us. The 100% CI for the change in the next day of the Dow Jones Industrial Average, for example, could be greater than +/- 25% (since larger price changes have occurred, we know it is possible). With a 100% CI we are effectively saying that anything outside of it is absolutely impossible and should never occur, ever. But the 90% CI for the one-day change in the DJIA is a little less than +/- 2%. With that range we are saying that very large changes are possible, but the change is much more likely to be in this narrower range. This is a useful description of our actual uncertainty.
Regarding the calibration exams, I explain that 10 is a very small sample and you could easily get 10 right by chance alone. However, since most people are initially very overconfident (that is, they are right much less often than their stated confidence would predict) the sample of 10 is usually sufficient to demonstrate they are not well calibrated. It is common for most people in their first attempt to get fewer than half of the answers within their stated 90% CIs. If only 4 out of 10 of your 90% CIs contained the true answer, then you are probably very overconfident. (A little math shows that if there really were a 90% chance that each interval contained the answer, then there should be only a 1 in 6807 chance you would get fewer than 5 out of 10 within the ranges.)
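The “little math” here is a binomial calculation. The sketch below reproduces it, assuming the 10 questions are independent and each interval really has a 90% chance of containing the answer; the helper name `prob_at_most` is hypothetical.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance that fewer than 5 of 10 true 90% CIs contain the answer.
p_under_5 = prob_at_most(4, 10, 0.9)
# Chance that all 10 of 10 true 90% CIs contain the answer.
p_all_10 = 0.9 ** 10

print(f"P(fewer than 5 of 10) = {p_under_5:.6f}, about 1 in {1 / p_under_5:.0f}")
print(f"P(10 of 10) = {p_all_10:.3f}")
```

Under the same assumptions, a perfectly calibrated person gets all 10 within range roughly 35% of the time, which is why a perfect score on one small test is not evidence of anything unusual.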
We also have to make sure you understand that underconfidence and overconfidence are equally undesirable. You could put absurdly wide ranges in all of your calibration tests but then those wouldn’t be your 90% CI and would not represent your real knowledge about the question. A range of, say, 1 to a million miles for the air distance between NYC and LA (a range I’ve seen someone use in the calibration tests) implies the estimator believes it is possible for NYC and LA to be as little as 5 miles apart or to be separated by many times the circumference of the planet. This range does not represent their real knowledge. A well calibrated person is right just as often as they expect to be – no more, no less. They are right 90% of the time they say they are 90% confident and 75% of the time they say they are 75% confident.
Furthermore, I highly recommend calibration training for any manager who has to deal with uncertainties. I find that most of them (from a variety of industries and education backgrounds) understand it quite well. And when they get calibrated, they just don’t generate the kinds of questions you mention. Calibration puts your “common sense” to the test. Einstein said common sense is just all of the prejudices you accumulated by the age of 18. Your intuition about forecasts has a performance that can be measured and calibration is one way to measure it.
I would also recommend just collecting historical data about estimates in your organization. Apparently, your organization has been making these forecasts for a while, so you should have lots of historical data (and if you don’t have the data, it is not too late to start tracking it). Once managers see how forecasts historically compared to actual outcomes they seem to “get” the point of ranges. At the very least, they will probably see that a perception of “+/-5%” certainty is an utter delusion.
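One simple way to look at such a history is to compute how far each past forecast landed from the actual outcome and count how many fell within the +/-5% target. The figures below are invented for illustration, and `pct_errors` and `share_within` are hypothetical helper names.

```python
def pct_errors(forecasts, actuals):
    """Signed forecast error as a fraction of the actual value."""
    return [(f - a) / a for f, a in zip(forecasts, actuals)]

def share_within(errors, tolerance):
    """Fraction of errors whose absolute size is within the tolerance."""
    return sum(1 for e in errors if abs(e) <= tolerance) / len(errors)

# Hypothetical forecast/actual pairs (in $ millions), for illustration only.
forecasts = [10.1, 9.8, 11.2, 10.5, 9.0, 12.0, 10.0, 8.7]
actuals = [9.2, 10.0, 12.9, 10.3, 10.4, 11.8, 8.5, 9.1]

errors = pct_errors(forecasts, actuals)
print(f"share of forecasts within +/-5% of actual: {share_within(errors, 0.05):.0%}")
print(f"largest miss: {max(abs(e) for e in errors):.0%}")
```

With real data, the same table of errors also gives a starting point for a realistic range: the spread of past misses shows how wide an interval would have needed to be to contain about 90% of the outcomes.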
By the way, I’m scheduling some webinars for calibration training. I’ll be covering all of these issues and more, and people will overcome these problems by applying the concepts in a series of tests.
Thanks for your questions.
Originally posted at http://www.howtomeasureanything.com, on Sunday, September 06, 2009 9:20:07 PM, by sujoymitra17.
“While using Lens Model (multiple regression), I am getting negative scores for a few parameters and positive scores for a few others. I am computing the score using the formula:
– <Coeff of parameter-1>*Val of parameter-1+<Coeff of parameter-2>*Val of parameter-2….+Intercept.
Since a few parameters are showing -ve scores and others +ve (considering a few have -ve correlation-coeff and others have +ve correlation-coeff), how do I formulate weights?”
I’m a little confused by your message. The coefficients in a regression model ARE the “weights”. The output of a regression analysis includes the coefficients. A regression analysis is how the weights are computed for a Lens model (the former is a tool for the latter; they are not the same thing).
Are you actually performing a least-squares best-fit linear regression analysis? Are you using the regression tool in Excel? Just making a formula with parameters and coefficients is not a linear regression.
Getting negative coefficients is not necessarily a problem, since a negative relationship makes sense in many situations (examples include criminal convictions and income, body fat and life expectancy, driving speed and mileage, etc.). If you are doing an actual regression, a negative coefficient can even be the expected outcome.
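To make the distinction concrete, here is a minimal pure-Python sketch of a least-squares fit that solves the normal equations directly. In practice a regression tool (Excel's, or a statistics library) would do this step; the data are made up, and `least_squares` is a hypothetical helper written only for this illustration. The judgments are constructed so that the second cue has a negative effect, and the fit recovers that negative weight.

```python
def least_squares(rows, targets):
    """Fit targets ~ rows by solving the normal equations (X'X) b = X'y.

    Assumes the columns of rows are not collinear (X'X is invertible).
    """
    n_cols = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n_cols)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n_cols)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_cols):
        pivot = max(range(col, n_cols), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_cols):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n_cols] / aug[i][i] for i in range(n_cols)]

# Hypothetical cues (parameters) shown to an expert in six scenarios.
cues = [(1.0, 4.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 3.0), (6.0, 6.0)]
# Made-up expert judgments, constructed as 3 + 2*cue1 - 1.5*cue2.
judgments = [3.0 + 2.0 * c1 - 1.5 * c2 for c1, c2 in cues]

# A trailing column of ones gives the model an intercept.
design = [(c1, c2, 1.0) for c1, c2 in cues]
w1, w2, intercept = least_squares(design, judgments)

def score(cue1, cue2):
    """Predicted judgment: coefficient * value for each cue, plus intercept."""
    return w1 * cue1 + w2 * cue2 + intercept

print(f"weights: {w1:.2f}, {w2:.2f}; intercept: {intercept:.2f}")
```

The fitted coefficients are the weights; the scoring formula in the question is only the last step, which applies those weights to a new set of parameter values.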
Perhaps you can describe what you are attempting to do in more detail.