What to say when they ask, “Why not 100% CI?”

Originally posted at http://www.howtomeasureanything.com, on Tuesday, September 08, 2009 8:20:46 AM, by Dreichel.

“Hello,

This is my first post on your board. I am excited to actually be able to pose this to the author (or to others who want to chime in).

First some background;

I work as an Analyst predicting Budget “Burn Rate”, that is to say “Here is what we forecasted we’d spend, here is what we actually spent”. A book like “How to Measure Anything” is invaluable because I am often asked to come up with ways to “measure the intangible” and more importantly predict what that value is going to be once you can measure it.

I do not always use formal statistical models to do my work. In fact, I tend to blend what I call “Common Sense” modeling into my approach. This involves using my past experience as a guideline to tell me when to use a six weighted moving average and when to consider a particularly unusual situation “This Holiday falls on a Saturday, even though we are not open, it WILL impact us, because people will take that Friday Off or Monday after”. Often, this common sense approach follows a quantitative logical pattern, but there is no set-in-stone approach to these methods.

Part of my job involves removing the statistical jargon and telling them in plain English “I took a 10% reduction in expected working hours for Friday and subsequently 15% for Monday because I expect a certain number of people to call in sick or take vacation around a Saturday Holiday (July 4th) greater than normal” or simplifying it further still.

The old “I just asked you what time it is, why are you telling me how to make a watch?” phrase applies here. I must keep my message simple.

Second the question (s);

ONE: I have incorporated the concept of “90% CI” into my approach from Chapter Two of the book for Forecasting. Naturally, if you read the book you have an idea what 90% CI means, so I won’t go into it here.

However, I often deal with Managers who do not understand the concept of 90% CI. I prepare sort of an Elevator (30 seconds or less) speech for what it means. However, I’d like to ask you for yours?

TWO: they ask for a Target Number for a Forecast, let’s say it is 10,124,556. This is created by adding up several other values that are provided to me from other sources. Unfortunately, I cannot round this to a less precise value like 10.1M when I express it. I would prefer to do this, but they are used to seeing the dollar value.

They do not want that value expressed as a range. They already have metrics in place as a Target of +/- 5% to their Target Number. How do I assign a confidence factor to a single target value? I don’t feel that 90% CI is correct when applied to a single value.

THREE: This is the big question. How do you answer “I don’t care about 90% right, I want 100% CI I want to know what you think it WILL be?”

Many managers get hung up on the concept that you must go with your BEST guess, not your 90% guess. What are some things you have done when that came up in the past as I am sure I am not the only one to hear “100% CI”. They want to feel they are working with the best information possible and I try to explain that I am trying not to be over-confident that my number is perfect. They don’t seem to buy that. They want to know what do I REALLY think it will be.

As an example, take the calibration tests the Author gave: get 9 out of 10 questions correct. The typical person to bring up the “Why not 100%?” would say “What if I happen to know all 10 answers? Are you saying I should purposely get one wrong, just to throw it off?”

FOUR:

Let’s go back to that forecasted number, but let’s make it 10 million for sake of simple math. If I said the forecast was going to be 10 million dollars with a CI of 90%, they would then ask “Does that mean I can take 10% of 10 million +/- (in this case 1 million dollars) and you really think it’s going to fall within 9 million or 11 million?”

This range would be unacceptable to someone who must hit their target within +/-5%, so if that is the case, should I be trying to get 95% CI? and if so, what additional rigor needs to take place to get there? They would ask if I could get 95% CI why not 100%?

[That’s] more than a few questions for an introductory post, so I’ll just sit back and hope to hear from you.”

Thanks for your question and for giving me the chance to cover some of these important concepts again.

For starters, don’t presume how much or how little other managers might understand if the material is explained correctly. I constantly run into managers who warn me about how little “other” managers will understand about these basic concepts. Yet when I explain it, I don’t find any of the resistance they anticipated. What I find more often is that the first manager didn’t quite understand the issues and was explaining them poorly.

Some of your questions, in fact, suggest to me that there may be some confusion about the meaning and use of these concepts, and that some key points in the book were missed. Otherwise, responses to the kinds of questions you encounter should be fairly obvious. For example, at one point, you reference an estimate of a “…10 million dollars with CI of 90%…”. 10 million dollars can’t be a CI because it isn’t an interval, but an exact point. You have to state an upper and lower bound to have an interval of any confidence. If you presented it that way, perhaps you would have avoided the speculation about what it might mean (i.e. “Does that mean I can take +/- 10% of 10 million”). I don’t encounter those types of questions because I give them the whole interval and don’t make them guess – as in “The 90% CI is 6 million to 14 million”. So first, I would make sure you understand the concepts very well yourself before we infer how much others – who are understandably confused by that kind of comment – would understand if it were correctly presented.

The confidence interval, of course, must be an interval (i.e. a range with an upper and lower bound) and it must have a stated confidence (e.g. 90%). My elevator pitch for a 90% CI is: a range of values for which there is a 90% chance that the range will actually contain the true answer. In other words, if I go back and look at all my 90% CIs (e.g. over the last few months or years), I should find that about 90% of the intervals contained the true answer (e.g. 90% of the intervals for sales contained the actual sales).
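The look-back check described above can be sketched in a few lines of Python. This is only an illustration: the `past_forecasts` records and the `coverage` helper are hypothetical names, and the numbers are invented, not taken from any real forecast history.

```python
# Hypothetical track record of stated 90% CIs, in $ millions:
# (lower bound, upper bound, actual outcome). All values are made up.
past_forecasts = [
    (6.0, 14.0, 10.1),
    (4.5, 9.0, 7.2),
    (8.0, 12.5, 13.1),   # the actual fell outside this interval
    (2.0, 5.0, 3.3),
    (10.0, 18.0, 11.4),
    (1.5, 3.0, 2.9),
    (7.0, 11.0, 8.8),
    (3.0, 6.5, 4.0),
    (12.0, 20.0, 15.6),
    (5.0, 8.0, 6.1),
]


def coverage(records):
    """Fraction of intervals whose bounds contained the actual outcome."""
    hits = sum(1 for low, high, actual in records if low <= actual <= high)
    return hits / len(records)


share = coverage(past_forecasts)
print(f"{share:.0%} of the stated 90% CIs contained the actual value")
```

If the share comes out well below 90%, the stated ranges were too narrow (overconfidence); if it is well above 90% over a long record, they were probably too wide.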

The reason why we often use a 90% CI instead of a 100% CI is because often the 100% CI can be so wide it might be useless to us. The 100% CI for the change in the next day of the Dow Jones Industrial Average, for example, could be greater than +/- 25% (since larger price changes have occurred, we know it is possible). We are effectively saying that anything outside of the 100% CI is absolutely impossible and should never occur – ever. But the 90% CI for the one-day change in the DJIA is a little less than +/- 2%. We are saying that very large changes are possible, but it is much more likely to be in this narrower range. This is a useful description of our actual uncertainty.

Regarding the calibration exams, I explain that 10 is a very small sample and you could easily get 10 right by chance alone. However, since most people are initially very overconfident (that is, they are right much less often than their stated confidence would predict), the sample of 10 is usually sufficient to demonstrate they are not well calibrated. It is common for people in their first attempt to get fewer than half of the answers within their stated 90% CIs. If only 4 out of 10 of your 90% CIs contained the true answer, then you are probably very overconfident. (A little math shows that if there really were a 90% chance that each interval contained the answer, then there should be only a 1 in 6807 chance you would get fewer than 5 out of 10 within the ranges.)
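The "little math" here is a binomial calculation. A minimal Python sketch, assuming the 10 questions are independent and each interval truly has a 90% chance of containing the answer (the function name `prob_at_most` is just an illustrative choice):

```python
from math import comb


def prob_at_most(k_max, n, p):
    """Binomial probability of k_max or fewer hits in n independent tries."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


# Chance of fewer than 5 hits (i.e. 4 or fewer) out of 10 for a calibrated person.
p_fewer_than_5 = prob_at_most(4, 10, 0.9)
print(f"P(fewer than 5 of 10) = {p_fewer_than_5:.7f}")
print(f"about 1 in {1 / p_fewer_than_5:.0f}")

# Chance that a calibrated person gets all 10 within their ranges.
p_all_ten = 0.9**10
print(f"P(all 10 of 10) = {p_all_ten:.3f}")
```

The second figure shows why a perfect score on a single 10-question test proves little: a well calibrated person would get all 10 roughly a third of the time.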

We also have to make sure you understand that under-confidence and overconfidence are equally undesirable. You could put absurdly wide ranges in all of your calibration tests but then those wouldn’t be your 90% CI and would not represent your real knowledge about the question. A range of, say, 1 to a million miles for the air distance between NYC and LA (a range I’ve seen someone use in the calibration tests) implies the estimator believes it is possible for NYC and LA to be as little as 1 mile apart or many times farther apart than the circumference of the planet. This range does not represent their real knowledge. A well calibrated person is right just as often as they expect to be – no more, no less. They are right 90% of the time they say they are 90% confident and 75% of the time they say they are 75% confident.

Furthermore, I highly recommend calibration training for any manager who has to deal with uncertainties. I find that most of them (from a variety of industries and education backgrounds) understand it quite well. And when they get calibrated, they just don’t generate the kinds of questions you mention. Calibration puts your “common sense” to the test. Einstein said common sense is just all of the prejudices you accumulated by the age of 18. Your intuition about forecasts has a performance that can be measured and calibration is one way to measure it.

I would also recommend just collecting historical data about estimates in your organization. Apparently, they have been doing this for a while and you should have lots of historical data (and if you don’t have the data it is not too late to start tracking it). Once managers see how forecasts historically compared to actual outcomes they seem to “get” the point of ranges. At the very least, they will probably see that a perception of “+/-5%” certainty is an utter delusion.
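As a rough sketch of what that historical comparison might look like, the Python below computes how often actuals landed within +/-5% of the forecast. The `history` data are invented purely for illustration, and the helper names are hypothetical; real records would replace them.

```python
# Hypothetical (forecast, actual) pairs in dollars. Invented for illustration.
history = [
    (10_000_000, 10_300_000),
    (10_000_000, 9_100_000),
    (8_000_000, 8_900_000),
    (12_000_000, 11_700_000),
    (9_500_000, 10_900_000),
    (11_000_000, 10_100_000),
    (10_500_000, 10_600_000),
    (9_000_000, 7_900_000),
]


def pct_error(forecast, actual):
    """Signed difference of the actual from the forecast, as a fraction."""
    return (actual - forecast) / forecast


def share_within(records, tolerance):
    """Fraction of records whose actual fell within +/- tolerance of forecast."""
    hits = sum(1 for f, a in records if abs(pct_error(f, a)) <= tolerance)
    return hits / len(records)


errors = sorted(pct_error(f, a) for f, a in history)
print(f"Within +/-5%: {share_within(history, 0.05):.0%} of past forecasts")
print(f"Observed errors ranged from {errors[0]:+.1%} to {errors[-1]:+.1%}")
```

With real data, the spread of observed errors gives managers a concrete picture of how wide an honest range needs to be.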

By the way, I’m scheduling some webinars for calibration training. I’ll be covering all of these issues and more, and people will overcome these problems by applying the concepts in a series of tests.

Thanks for your questions.

Doug Hubbard

Lens Model: Negative Value

Originally posted at http://www.howtomeasureanything.com, on Sunday, September 06, 2009 9:20:07 PM, by sujoymitra17.

“While using Lens Model (multiple regression), I am getting negative scores for a few parameters and positive scores for few others. I am computing the score using the formula:

– <Coeff of parameter-1>*Val of parameter-1+<Coeff of parameter-2>*Val of parameter-2….+Intercept.

Since few parameters are showing -ve scores and others +ve (considering few have -ve correlation-coeff and others have +ve correlation-coeff), how do I formulate weights?”

I’m a little confused by your message. The coefficients in a regression model ARE the “weights”. The output of a regression analysis includes the coefficients. A regression analysis is how the weights are computed for a Lens model (the former is a tool for the latter; they are not the same thing).

Are you actually performing a least-squares best-fit linear regression analysis? Are you using the regression tool in Excel? Just making a formula with parameters and coefficients is not a linear regression.

Getting negative coefficients is not necessarily a problem, since a negative relationship makes sense in many situations (examples include criminal convictions and income, body fat and life expectancy, driving speed and mileage, etc.). If you are doing an actual regression, negative values can even be the expected outcome.
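To make the point concrete, here is a minimal pure-Python sketch of a least-squares fit with two predictors. It is not the Excel regression tool and not the procedure from the book; the `cues` and `judgments` data are invented (and deliberately noise-free, so the fitted values are easy to check), and all function names are hypothetical. The fitted coefficients are the weights, and one of them comes out negative.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def least_squares(rows, targets):
    """Return [intercept, coef_1, ..., coef_k] minimizing squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X'X) w = X'y


# Hypothetical data: two cue values per scenario and the judge's score.
cues = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 2), (6, 4)]
judgments = [1.0, 1.5, 0.5, -0.5, 1.5, -1.0]

weights = least_squares(cues, judgments)
intercept, coefs = weights[0], weights[1:]


def score(values):
    """Coefficient times value for each parameter, plus the intercept."""
    return intercept + sum(w * v for w, v in zip(coefs, values))


print([round(w, 3) for w in weights])
print(round(score((4, 1)), 3))
```

Solving the normal equations directly is fine for a toy example like this; for real work a spreadsheet regression tool or a statistics library would be the more robust choice.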

Perhaps you can describe what you are attempting to do in more detail.

Thanks,

Doug Hubbard

How to Measure Innovation

Originally posted at http://www.howtomeasureanything.com, on Thursday, March 05, 2009 7:30:54 PM, by JBehling.

“I am Six Sigma Black Belt for an IS Organization and my team has been struggling to measure the impact of “Innovation” in our company. We bring new and innovative systems to our business partners to help them streamline their practices and processes.

Any thoughts on how to develop a measurement system for innovation? Are there any standard practices for measuring IS Innovation? HELP!”

 

How to Measure Performance

Originally posted at http://www.howtomeasureanything.com, on Friday, March 20, 2009 9:14:48 PM, by jerry.

“Greetings,

I loved your book. Thanks for sharing such valuable information. Now I’m trying to apply it.

I am leading a project of training developers and instructional designers and am attempting to put together a meaningful way to measure their performance. I have come up with some parameters that seem evident to me, such as time to complete a lesson, number of edits recommended (to the designer), type of edits recommended (order, strategies, completeness of content), edit recommendation trends (is the number of recommended edits going up or going down).

Is there a particular part of your book I should re-read that would help me frame a thorough performance evaluation measuring framework? Or can you suggest anything that would help expand the framework or make it a more reliable measure of performance?

Thank you in advance for any direction you can point me in or for any suggestions you can provide.

Jerry”

Thanks for reading my book. I think you might find part of what you are looking for in Chapter 11 on measuring preferences and attitudes. On page 197 I show how different performance measures of a software developer could be combined into a single metric by quantifying the acceptable tradeoffs.

You might also consider more of an “end result” metric of some kind. Isn’t the ultimate success of the instructional material measured by the performance of students? Obviously, many things affect the performance of students but among those should be the design of the material. Individual students will vary but if one set of material consistently results in better student performance than another set, then I think it’s fair to attribute some of that to the material designer.

Thanks,

Doug Hubbard

Lens Model Example – Chapter 12

Originally posted at http://www.howtomeasureanything.com, on Sunday, March 01, 2009 1:30:45 PM, by Paddy.

“Could you please clarify in what scenarios the Lens Model can remove human inconsistency in decision making (i.e., problems that are well defined/repeatable or unstructured)? Would like to apply Lens Model to evaluate computer interfaces.

Also, could you please clarify the variables in step 6 of the Lens Model Procedure – Perform regression analysis. For example, could you please clarify independent and dependent variables in step 6 and the end output in step 7. Diagram was great, example would be better.

Thanks,

Amran”

Originally posted at http://www.howtomeasureanything.com, on Friday, April 17, 2009 9:21:44 AM, by Paddy.

“Any help with an example would be much appreciated.

Thank you”

Chapter 6

Originally posted on http://www.howtomeasureanything.com/forums/ on Thursday, February 19, 2009 1:41:48 PM, by Thakur.

“I enjoyed reading Chapter 6 (Measuring Risk: Introduction to the Monte Carlo Simulation). It was very informative. After reading it I tried to do the following using Excel. But I failed.

1). Simulating the Monty Hall Problem.

2). Simulating Birthdays

3). Genetics: Simulating Population Control

Can You please help me and guide me.

Thanks

Thakur”

You are asking for a lot! But how about I answer a bit at a time? First, let’s do Monty Hall.

For those of you who might not have heard of this problem, it’s based on a classic probability theory example. Imagine that you are on the 70’s game show “Let’s Make a Deal” hosted by Monty Hall. You are a contestant and you are given three doors to choose from. Behind one of the doors is a brand new car! If you choose the door with the car behind it, you get to drive it away.

You choose a door. But then Monty Hall shows you what is behind one of the other doors to reveal one of the “joke prizes” (e.g. a donkey). Then he asks you if you would like to keep the door you first chose or switch to the other remaining door. People often think that the odds of winning would be the same whether you switched or not. But they would be wrong.

To demonstrate why switching doors would be better, let’s set up a spreadsheet simulation where we define columns for the prize door, the chosen door, and the revealed door. One more column will be used as a flag to indicate whether we would have won if we stayed with the first door we chose or if we should have switched doors. Then we will copy down the first row of these columns to a few thousand rows to see the outcome.

Column 1, The Prize Door: This is the door the prize is really behind. As a contestant, you wouldn’t know this information, but we need it for the simulation. Write “The Prize Door” in cell A1. In cell A2 write =int(rand()*3+1). This will randomly generate the value of 1, 2 or 3.

Column 2, The Chosen Door: This is the door the contestant chose. In B1, write “The Chosen Door” and in B2 write the same formula you wrote in A2; =int(rand()*3+1). Again, this will randomly generate the value of 1, 2 or 3.

Column 3, The Revealed Door: This is the door Monty Hall reveals. Monty will always reveal a door you didn’t choose and it will always be a door that doesn’t have a prize behind it. In cell C1 write “The Revealed Door” and in C2 write:

=if(and(a2=1,b2=1),int(rand()*2+2),if(and(a2=1,b2=2),3,if(and(a2=1,b2=3),2,if(and(a2=2,b2=1),3,if(and(a2=2,b2=2),int(rand()+.5)*2+1,if(and(a2=2,b2=3),1,if(and(a2=3,b2=1),2,if(and(a2=3,b2=2),1,int(rand()*2+1)))))))))

This seems clumsy, but it’s visually easier to decompose and understand than some approaches I might have taken. This will generate values according to the following table:

Prize Door    Chosen Door    Revealed Door
1             1              2 or 3
1             2              3
1             3              2
2             1              3
2             2              1 or 3
2             3              1
3             1              2
3             2              1
3             3              1 or 2
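For readers who prefer code to nested IFs, the rule behind this table can be sketched in Python: Monty opens any door that is neither the prize door nor the chosen door. The function names below are illustrative only, and this is an alternative expression of the same rule rather than a translation of the spreadsheet formula.

```python
import random

DOORS = (1, 2, 3)


def allowed_reveals(prize, chosen):
    """Doors Monty may open: not the prize door, not the contestant's door."""
    return [d for d in DOORS if d not in (prize, chosen)]


def revealed_door(prize, chosen, rng=random):
    """Pick one allowed door; random only when the contestant chose the prize."""
    return rng.choice(allowed_reveals(prize, chosen))


# Reproduce the table: one row per (prize, chosen) combination.
for prize in DOORS:
    for chosen in DOORS:
        options = " or ".join(str(d) for d in allowed_reveals(prize, chosen))
        print(prize, chosen, options)
```

There are two options only when the prize door and the chosen door are the same; otherwise Monty has no choice.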

Column 4, Winning Strategy: This cell tells you what the winning strategy would have been. Either you stick with the door you first chose or you switch doors. In D1 write “Winning Strategy” and in D2 write =if(a2=b2,0,1). This will generate a 0 if the winning strategy would have been to stick with the door you have and a 1 if you were better off switching.

Now copy row 2 down a thousand rows and take the average of the values in column 4 (remember not to include the text in D1 in the average). One way to do this is to write =average(D2:D1001) in cell E1. If you were just as well off sticking with the first chosen door as switching, this average would be .5. But you will find that the average is about .667. In other words, two thirds of the time the winning strategy was switching doors.

The reason this works is that when Monty Hall reveals one of the other doors, he gives you additional information you didn’t have before. He reveals ONLY a door that doesn’t have a prize and ONLY a door you didn’t choose. When you first choose a door, you have a 2/3 chance of not winning (the prize is behind one of the other two doors). Once he reveals which of the other 2 doors is not a winner, that entire 2/3 chance rests on the one remaining door, while your original door still has only its 1/3 chance.
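The same experiment can be run outside a spreadsheet. The Python sketch below mirrors the four columns (prize door, chosen door, revealed door, and a flag for whether switching wins); the function name `simulate` and its `trials` and `seed` parameters are illustrative choices, not part of the spreadsheet.

```python
import random


def simulate(trials=100_000, seed=None):
    """Return the fraction of games in which switching doors wins the prize."""
    rng = random.Random(seed)
    doors = (1, 2, 3)
    switch_wins = 0
    for _ in range(trials):
        prize = rng.randint(1, 3)    # column 1: the prize door
        chosen = rng.randint(1, 3)   # column 2: the chosen door
        # Column 3: Monty opens a door that is neither the prize nor the choice.
        revealed = rng.choice([d for d in doors if d not in (prize, chosen)])
        # The door the contestant would move to by switching.
        switched = next(d for d in doors if d not in (chosen, revealed))
        # Column 4: 1 if switching wins, 0 if staying wins.
        switch_wins += switched == prize
    return switch_wins / trials


print(f"Switching wins in {simulate():.1%} of simulated games")
```

Because the switched-to door is computed explicitly from the revealed door, the result does not rely on the shortcut that switching wins whenever the first choice was wrong, yet it lands on the same two-thirds figure.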

Check back for my responses to your other questions. For clarification, when you talk about birthdays, do you mean simulating the problem where you find the minimum number of people needed before there are even odds that at least 2 of them share a birthday?

Thanks for your question
Doug Hubbard