# least square line fit

Discussion in 'Math' started by samjesse, Oct 26, 2011.

1. ### samjesse Thread Starter Active Member

Sep 14, 2008
202
0
Hi

I have 79 observation which are increasing in value, it suppose to give a positive slop but I get a negative one (-0.00279).
y = a * bx
where x is 1 to 79
I rechecked my work many times over but not able to find out why I am getting a negative "b".

Pls. help.
thx

1 121.1745
2 121.181
3 121.1985
4 121.2245
5 121.253
6 121.2775
7 121.268
8 121.279
9 121.315
10 121.3505
11 121.3475
12 121.3615
13 121.409
14 121.4445
15 121.419
16 121.393
17 121.407
18 121.4355
19 121.4575
20 121.4745
21 121.4905
22 121.5205
23 121.5405
24 121.534
25 121.515
26 121.4995
27 121.478
28 121.4665
29 121.48
30 121.515
31 121.5265
32 121.5315
33 121.5495
34 121.5085
35 121.5375
36 121.5895
37 121.6065
38 121.596
39 121.5805
40 121.5865
41 121.559
42 121.5395
43 121.553
44 121.574
45 121.582
46 121.585
47 121.5895
48 121.605
49 121.615
50 121.6155
51 121.6405
52 121.675
53 121.6925
54 121.6845
55 121.657
56 121.645
57 121.654
58 121.6555
59 121.646
60 121.638
61 121.6445
62 121.635
63 121.6505
64 121.6565
65 121.653
66 121.668
67 121.6865
68 121.6685
69 121.6635
70 121.673
71 121.6865
72 121.6855
73 121.6885
74 121.686
75 121.712
76 121.7495
77 121.7605
78 121.796
79 121.8515

Mar 9, 2011
329
396
3. ### samjesse Thread Starter Active Member

Sep 14, 2008
202
0
a is the constant
b is the slop

I am sure the formula as written in the below code is correct.
Code ( (Unknown Language)):
1.
2.       a = ((sumY * sumX2) - (sumX * sumXY))/((n * sumX2) - MathPow(sumX,2));
3.       b = ((n*sumXY) - (sumX*sumY)) / ((n*sumX2) - MathPow(sumX,2));
4.
5.
it works good till n gets larger than some value then it fails.
it is time series data thus n are sequential numbers 1 to 79 in this case.

4. ### justtrying Active Member

Mar 9, 2011
329
396
I think you made a mistake in your calculations. I had originally obtained same value, but I used wrong sum in the formula - be careful what your x and y are. It is confusing since number of data points is same as the x. I think what I got now is correct - slope is 1.81. Check out the excel sheet...

File size:
13 KB
Views:
22
5. ### samjesse Thread Starter Active Member

Sep 14, 2008
202
0
Sorry, I could not open the file you posted.
here is my work, could you please tell me where I went wrong.
thx

Code ( (Unknown Language)):
1.
2.
3.       double sumY = 0.0;
4.       double sumXY = 0.0;
5.
6.       int sumX = n*(n+1)/2;  //--- sum of consecutive integers.
7.       int sumX2 = n*(n+1)*(2*n+1) / 6;   //--- sum of squares of above.
8.       for( ushort i = 0; i < n; i++)
9.       {
10.          double Y = gets calculated here;
11.          sumY += Y;
12.          sumXY += ((i+1) * Y);
13.       }
14.       a = ((sumY * sumX2) - (sumX * sumXY))/((n * sumX2) - MathPow(sumX,2));
15.       b = ((n*sumXY) - (sumX*sumY)) / ((n*sumX2) - MathPow(sumX,2));
16.
17.
18.
19.
20.

6. ### MrChips Moderator

Oct 2, 2009
12,624
3,451
You must be doing something wrong. My results to a straight line y = mx + c gives me nothing like what you are getting.

Jul 7, 2009
1,585
141
Here's the regression I get with a script I wrote a number of years ago:

Code ( (Unknown Language)):
1.
2.
3. ANOVA Table
4.
5. Source of
6. Variation                   SS          df                MS
7. --------------------------------------------------------------
8. Regression      SSR =      1.54318       1  MSR =      1.54318
9. Error           SSE =      0.20356      77  MSE =   0.00264364
10. --------------------------------------------------------------
11. Total          SSTO =      1.74674      78
12.
14.          R^2 =  88.35
15.
16. F statistic with df=(1,77) = 583.7
17.
18. Covariance matrix:
19. [[  1.36e-04  -2.57e-06]
20.  [ -2.57e-06   6.44e-08]]
21.
22. Model parameters:    Estimate        StdDev    100*StdDev/Est
23.     b0                121.298       0.01168              0.01
24.     b1             0.00612904     0.0002537              4.14
25.
26. Residuals = Res:
27.                 Y         Yhat          Res    100*Res/Y
28.    1        121.2        121.3        -0.13        -0.11
29.    2        121.2        121.3      -0.1297        -0.11
30.    3        121.2        121.3      -0.1183        -0.10
31.    4        121.2        121.3     -0.09844        -0.08
32.    5        121.3        121.3     -0.07607        -0.06
33.    6        121.3        121.3     -0.05769        -0.05
34.    7        121.3        121.3     -0.07332        -0.06
35.    8        121.3        121.3     -0.06845        -0.06
36.    9        121.3        121.4     -0.03858        -0.03
37.   10        121.4        121.4    -0.009211        -0.01
38.   11        121.3        121.4     -0.01834        -0.02
39.   12        121.4        121.4     -0.01047        -0.01
40.   13        121.4        121.4       0.0309         0.03
41.   14        121.4        121.4      0.06027         0.05
42.   15        121.4        121.4      0.02864         0.02
43.   16        121.4        121.4    -0.003485        -0.00
44.   17        121.4        121.4     0.004386         0.00
45.   18        121.4        121.4      0.02676         0.02
46.   19        121.5        121.4      0.04263         0.04
47.   20        121.5        121.4       0.0535         0.04
48.   21        121.5        121.4      0.06337         0.05
49.   22        121.5        121.4      0.08724         0.07
50.   23        121.5        121.4       0.1011         0.08
51.   24        121.5        121.4      0.08848         0.07
52.   25        121.5        121.5      0.06335         0.05
53.   26        121.5        121.5      0.04172         0.03
54.   27        121.5        121.5       0.0141         0.01
55.   28        121.5        121.5    -0.003534        -0.00
56.   29        121.5        121.5     0.003837         0.00
57.   30        121.5        121.5      0.03271         0.03
58.   31        121.5        121.5      0.03808         0.03
59.   32        121.5        121.5      0.03695         0.03
60.   33        121.5        121.5      0.04882         0.04
61.   34        121.5        121.5     0.001692         0.00
62.   35        121.5        121.5      0.02456         0.02
63.   36        121.6        121.5      0.07043         0.06
64.   37        121.6        121.5       0.0813         0.07
65.   38        121.6        121.5      0.06468         0.05
66.   39        121.6        121.5      0.04305         0.04
67.   40        121.6        121.5      0.04292         0.04
68.   41        121.6        121.5     0.009289         0.01
69.   42        121.5        121.6     -0.01634        -0.01
70.   43        121.6        121.6    -0.008969        -0.01
71.   44        121.6        121.6     0.005902         0.00
72.   45        121.6        121.6     0.007773         0.01
73.   46        121.6        121.6     0.004643         0.00
74.   47        121.6        121.6     0.003014         0.00
75.   48        121.6        121.6      0.01239         0.01
76.   49        121.6        121.6      0.01626         0.01
77.   50        121.6        121.6      0.01063         0.01
78.   51        121.6        121.6       0.0295         0.02
79.   52        121.7        121.6      0.05787         0.05
80.   53        121.7        121.6      0.06924         0.06
81.   54        121.7        121.6      0.05511         0.05
82.   55        121.7        121.6      0.02148         0.02
83.   56        121.6        121.6     0.003353         0.00
84.   57        121.7        121.6     0.006224         0.01
85.   58        121.7        121.7     0.001595         0.00
86.   59        121.6        121.7     -0.01403        -0.01
87.   60        121.6        121.7     -0.02816        -0.02
88.   61        121.6        121.7     -0.02779        -0.02
89.   62        121.6        121.7     -0.04342        -0.04
90.   63        121.7        121.7     -0.03405        -0.03
91.   64        121.7        121.7     -0.03418        -0.03
92.   65        121.7        121.7     -0.04381        -0.04
93.   66        121.7        121.7     -0.03494        -0.03
94.   67        121.7        121.7     -0.02257        -0.02
95.   68        121.7        121.7      -0.0467        -0.04
96.   69        121.7        121.7     -0.05782        -0.05
97.   70        121.7        121.7     -0.05445        -0.04
98.   71        121.7        121.7     -0.04708        -0.04
99.   72        121.7        121.7     -0.05421        -0.04
100.   73        121.7        121.7     -0.05734        -0.05
101.   74        121.7        121.8     -0.06597        -0.05
102.   75        121.7        121.8      -0.0461        -0.04
103.   76        121.7        121.8     -0.01473        -0.01
104.   77        121.8        121.8    -0.009857        -0.01
105.   78        121.8        121.8      0.01951         0.02
106.   79        121.9        121.8      0.06889         0.06
107.
108.

8. ### justtrying Active Member

Mar 9, 2011
329
396
This is for Excel, I cannot help much with code as I am in the beginner stage myself. But I think there is a calculation error happening somewhere. I forgot which values got mixed up when I got the same value as you did initially. Maybe consider using different formulas?

Code ( (Unknown Language)):
1.
2.
3.   a*79+b*780=9601,94    - a*n + b*SumXi = SumYi
4.
5.
6.   a*780+b*167480=384329.5   - a*SumXi + b*SumXi^2=SumXiYi
7.
8.
9.
10.   a=103,65   (intercept)
11.
12.
13.   b=1,81 (slope)
14.    X    Y       X^2     XY
15.
16.   1   121,17   1   121,1745
17.
18.   2   121,18   4   242,362
19.
20.   3   121,20   9   363,5955
21.
22.   4   121,22   16   484,898
23.
24.   5   121,25   25   606,265
25.
26.   6   121,28   36   727,665
27.
28.   7   121,27   49   848,876
29.
30.   8   121,28   64   970,232
31.
32.   9   121,32   81   1091,835
33.
34.   10   121,35   100   1213,505
35.
36.   11   121,35   121   1334,823
37.
38.   12   121,36   144   1456,338
39.
40.   13   121,41   169   1578,317
41.
42.   14   121,44   196   1700,223
43.
44.   15   121,42   225   1821,285
45.
46.   16   121,39   256   1942,288
47.
48.   17   121,41   289   2063,919
49.
50.   18   121,44   324   2185,839
51.
52.   19   121,46   361   2307,693
53.
54.   20   121,47   400   2429,49
55.
56.   21   121,49   441   2551,301
57.
58.   22   121,52   484   2673,451
59.
60.   23   121,54   529   2795,432
61.
62.   24   121,53   576   2916,816
63.
64.   25   121,52   625   3037,875
65.
66.   26   121,50   676   3158,987
67.
68.   27   121,48   729   3279,906
69.
70.   28   121,47   784   3401,062
71.
72.   29   121,48   841   3522,92
73.
74.   30   121,52   900   3645,45
75.
76.   31   121,53   961   3767,322
77.
78.   32   121,53   1024   3889,008
79.
80.   33   121,55   1089   4011,134
81.
82.   34   121,51   1156   4131,289
83.
84.   35   121,54   1225   4253,813
85.
86.   36   121,59   1296   4377,222
87.
88.   37   121,61   1369   4499,441
89.
90.   38   121,60   1444   4620,648
91.
92.   39   121,58   1521   4741,64
93.
94.   40   121,59   1600   4863,46
95.
96.   41   121,56   1681   4983,919
97.
98.   42   121,54   1764   5104,659
99.
100.   43   121,55   1849   5226,779
101.
102.   44   121,57   1936   5349,256
103.
104.   45   121,58   2025   5471,19
105.
106.   46   121,59   2116   5592,91
107.
108.   47   121,59   2209   5714,707
109.
110.   48   121,61   2304   5837,04
111.
112.   49   121,62   2401   5959,135
113.
114.   50   121,62   2500   6080,775
115.
116.   51   121,64   2601   6203,666
117.
118.   52   121,68   2704   6327,1
119.
120.   53   121,69   2809   6449,703
121.
122.   54   121,68   2916   6570,963
123.
124.   55   121,66   3025   6691,135
125.
126.   56   121,65   3136   6812,12
127.
128.   57   121,65   3249   6934,278
129.
130.   58   121,66   3364   7056,019
131.
132.   59   121,65   3481   7177,114
133.
134.   60   121,64   3600   7298,28
135.
136.   61   121,64   3721   7420,315
137.
138.   62   121,64   3844   7541,37
139.
140.   63   121,65   3969   7663,982
141.
142.   64   121,66   4096   7786,016
143.
144.   65   121,65   4225   7907,445
145.
146.   66   121,67   4356   8030,088
147.
148.   67   121,69   4489   8152,996
149.
150.   68   121,67   4624   8273,458
151.
152.   69   121,66   4761   8394,782
153.
154.   70   121,67   4900   8517,11
155.
156.   71   121,69   5041   8639,742
157.
158.   72   121,69   5184   8761,356
159.
160.   73   121,69   5329   8883,261
161.
162.   74   121,69   5476   9004,764
163.
164.   75   121,71   5625   9128,4
165.
166.   76   121,75   5776   9252,962
167.
168.   77   121,76   5929   9375,559
169.
170.   78   121,80   6084   9500,088
171.
172.   79   121,85   6241   9626,269
173.       sum   780   9 601,94   167480   384329,5
174.
175.
176.
177.

Last edited: Oct 26, 2011
9. ### samjesse Thread Starter Active Member

Sep 14, 2008
202
0
you posted
Sum of x = 780

my excel shows it to be: 3160
and the code
sumX = n*(n+1)/2= 3160
where n = 79
i.e 79 observations numbered sequentially 1 - 79
how did you get it to be 780?

that is
1 + 2 + ....+ 79= 3160

Last edited: Oct 27, 2011
10. ### justtrying Active Member

Mar 9, 2011
329
396
my bad, that does not take all data into account (only took 40 rows). It should be 3160. Than the slope comes out as 0.00613 when corrected (all other values are right). It is also the same value as you would get if you used excel to get the best fit line (excel equation is y=0.0061x+121.3).

11. ### samjesse Thread Starter Active Member

Sep 14, 2008
202
0
I got it fixed. thx. it was my bad. ushort data type does not go more than 65536. so I changed it to int and all is well.