Help please! Determining a 95% accuracy statement

Thread Starter

DiodeMan

Joined Feb 3, 2013
13
Hello all!

It's been a while since I was in school, and I can't remember how to calculate an uncertainty value (or what this value would be called!). Sorry in advance if this is something that should be easy to figure out; I am but a lowly technician :p.

I have a multiple linear regression model used to calculate an output based on a number of inputs. This model was created using ~1,500 input and output data points.

I have back calculated predicted y-values for the entire data set, and determined the error between the predicted and actual values. What I would like to do using this data is come up with a generic statement which reflects the accuracy of any future predictions.

For example, if the model predicts a value of 43.912, the accuracy statement might look something like this:
43.912 +/- 0.15%

That is, 95% of the time (2 standard deviations), the actual value will be between 43.846 (= 43.912 * 0.9985) and 43.978 (= 43.912 * 1.0015).

Can anybody please help me figure out how to get to this +/- error value in terms of % of reading, or please direct me to a resource which clearly explains this?

All help is appreciated! :)
 

jpanhalt

Joined Jan 18, 2008
11,087
Can anybody please help me figure out how to get to this +/- error value in terms of % of reading, or please direct me to a resource which clearly explains this?
The term you may be looking for is coefficient of variation (aka CV). Divide the standard deviation by the mean and multiply by 100% to convert to a percentage. For example, if the s.d. = 1 and the mean = 40, the CV = (1/40)*100% = 2.5%. CVs can be used just as you used the s.d. That is, ±2 CV covers roughly 95% of values if the data are normally distributed. Of course, you need to check your data for normality, but s.d. and CV are pretty robust statistical parameters.

John
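
As a rough illustration of the CV arithmetic described above, here is a minimal Python sketch using only the standard library. The function name `coefficient_of_variation` and the `readings` list are made up for this example; only the s.d. = 1, mean = 40 case comes from the post.

```python
import statistics

def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample s.d. (n - 1 in the denominator)
    return sd / mean * 100.0

# The worked numbers from the post: s.d. = 1, mean = 40.
cv_from_post = 1 / 40 * 100.0

# Hypothetical readings, invented purely for illustration.
readings = [39.0, 40.0, 41.0, 40.5, 39.5]
cv = coefficient_of_variation(readings)
print(f"CV from post = {cv_from_post}%")
print(f"CV of readings = {cv:.2f}%, so +/- 2 CV = {2 * cv:.2f}%")
```

Note that this describes the spread of a set of values around their own mean. Whether that is the right spread for a regression accuracy statement depends on which values are fed in.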
 

wayneh

Joined Sep 9, 2010
17,496
I have back calculated predicted y-values for the entire data set, and determined the error between the predicted and actual values. What I would like to do using this data is come up with a generic statement which reflects the accuracy of any future predictions.
What you want is called the confidence interval. There are a lot of tools to compute it, and Excel has the functions you need built in.

A rule of thumb for small data sets is that the 90% confidence interval will be roughly ± 2 standard deviations. With a large data set, the confidence interval will narrow.

Be precise with your wording: so far you have computed the "fit" or "model" values of Y for the existing X1, X2, ..., Xn values. Making predictions is different, especially if you are extrapolating outside of the current X ranges.

If future observations are drawn from the same population as the existing sample data, you can estimate the confidence interval.
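
For the "% of reading" statement asked about in the first post, one possible approach under that same-population assumption is sketched below in plain Python. It is only a sketch: the function names and the sample data are hypothetical, it assumes the relative errors are roughly normal and centered near zero, and it ignores the uncertainty in the fitted coefficients (which is usually small with ~1,500 points but is not zero).

```python
import statistics

def percent_of_reading_band(actual, predicted, k=2.0):
    """Return (bias_pct, half_width_pct) computed from relative residuals.

    Relative error is (actual - predicted) / predicted * 100, so the band
    is expressed as a percentage of the predicted value ("% of reading").
    """
    rel_err = [(a - p) / p * 100.0 for a, p in zip(actual, predicted)]
    bias = statistics.mean(rel_err)
    half_width = k * statistics.stdev(rel_err)  # k = 2 for roughly 95%
    return bias, half_width

def empirical_coverage(actual, predicted, half_width_pct):
    """Fraction of points whose actual value lies inside predicted +/- band."""
    inside = sum(
        abs(a - p) <= abs(p) * half_width_pct / 100.0
        for a, p in zip(actual, predicted)
    )
    return inside / len(actual)

# Hypothetical data, invented for illustration only.
predicted = [40.0, 42.0, 44.0, 46.0, 48.0, 50.0]
actual = [40.02, 41.97, 44.03, 45.96, 48.05, 49.98]

bias, half_width = percent_of_reading_band(actual, predicted)
coverage = empirical_coverage(actual, predicted, half_width)
print(f"bias = {bias:+.3f}%, band = +/- {half_width:.3f}%, coverage = {coverage:.0%}")
```

Checking the empirical coverage on the real data set is a cheap sanity test: if far fewer than about 95% of the points fall inside the ±2 s.d. band, the normality assumption is probably not holding, and a percentile of the absolute relative errors may be a safer basis for the statement.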

Also, you should look at the T value (estimated coefficient value divided by its standard error) for each independent variable. All should ideally have |T| > 2 to be considered a "significant" factor with greater than 90% confidence.
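
To make the T value concrete, here is a small self-contained Python sketch that fits an ordinary least-squares model and reports coefficient / standard error for each term. Everything in it (the helper names `invert` and `ols_t_values`, and the data) is hypothetical and written for illustration. In practice a spreadsheet or statistics package reports the same numbers.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (collinear inputs?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols_t_values(x_rows, y):
    """Fit y = b0 + b1*x1 + ... ; return (coefs, std_errs, t_values)."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend intercept column
    n, p = len(design), len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    xtx_inv = invert(xtx)
    coefs = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [yi - sum(c * v for c, v in zip(coefs, r))
                 for r, yi in zip(design, y)]
    sigma2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    std_errs = [math.sqrt(sigma2 * xtx_inv[i][i]) for i in range(p)]
    t_values = [c / se for c, se in zip(coefs, std_errs)]
    return coefs, std_errs, t_values

# Hypothetical data: y depends on x1 (about 1 + 2*x1 plus noise), not on x2.
x_rows = [(1, 5), (2, 3), (3, 6), (4, 2), (5, 7), (6, 1), (7, 4), (8, 8)]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]

coefs, std_errs, t_values = ols_t_values(x_rows, y)
for name, c, se, t in zip(["intercept", "x1", "x2"], coefs, std_errs, t_values):
    verdict = "significant" if abs(t) > 2 else "not significant"
    print(f"{name}: coef = {c:.4f}, s.e. = {se:.4f}, T = {t:.2f} ({verdict})")
```

In this made-up data set x1 should come out with a large T and x2 with a small one, which is the pattern the |T| > 2 rule of thumb is meant to flag.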
 