least square line fit

Thread Starter

samjesse

Joined Sep 14, 2008
212
Hi

I have 79 observation which are increasing in value, it suppose to give a positive slop but I get a negative one (-0.00279).
y = a * bx
where x is 1 to 79
I rechecked my work many times over but not able to find out why I am getting a negative "b".

Pls. help.
thx

1 121.1745
2 121.181
3 121.1985
4 121.2245
5 121.253
6 121.2775
7 121.268
8 121.279
9 121.315
10 121.3505
11 121.3475
12 121.3615
13 121.409
14 121.4445
15 121.419
16 121.393
17 121.407
18 121.4355
19 121.4575
20 121.4745
21 121.4905
22 121.5205
23 121.5405
24 121.534
25 121.515
26 121.4995
27 121.478
28 121.4665
29 121.48
30 121.515
31 121.5265
32 121.5315
33 121.5495
34 121.5085
35 121.5375
36 121.5895
37 121.6065
38 121.596
39 121.5805
40 121.5865
41 121.559
42 121.5395
43 121.553
44 121.574
45 121.582
46 121.585
47 121.5895
48 121.605
49 121.615
50 121.6155
51 121.6405
52 121.675
53 121.6925
54 121.6845
55 121.657
56 121.645
57 121.654
58 121.6555
59 121.646
60 121.638
61 121.6445
62 121.635
63 121.6505
64 121.6565
65 121.653
66 121.668
67 121.6865
68 121.6685
69 121.6635
70 121.673
71 121.6865
72 121.6855
73 121.6885
74 121.686
75 121.712
76 121.7495
77 121.7605
78 121.796
79 121.8515
 

Thread Starter

samjesse

Joined Sep 14, 2008
212
a is the constant
b is the slop

I am sure the formula as written in the below code is correct.
Rich (BB code):
      a = ((sumY * sumX2) - (sumX * sumXY))/((n * sumX2) - MathPow(sumX,2)); 
      b = ((n*sumXY) - (sumX*sumY)) / ((n*sumX2) - MathPow(sumX,2));
it works good till n gets larger than some value then it fails.
it is time series data thus n are sequential numbers 1 to 79 in this case.
 

justtrying

Joined Mar 9, 2011
429
I think you made a mistake in your calculations. I had originally obtained same value, but I used wrong sum in the formula - be careful what your x and y are. It is confusing since number of data points is same as the x. I think what I got now is correct - slope is 1.81. Check out the excel sheet...
 

Attachments

Thread Starter

samjesse

Joined Sep 14, 2008
212
Sorry, I could not open the file you posted.
here is my work, could you please tell me where I went wrong.
thx

Rich (BB code):
      double sumY = 0.0;
      double sumXY = 0.0;   

      int sumX = n*(n+1)/2;  //--- sum of consecutive integers.
      int sumX2 = n*(n+1)*(2*n+1) / 6;   //--- sum of squares of above.
      for( ushort i = 0; i < n; i++) 
      {
         double Y = gets calculated here;
         sumY += Y;
         sumXY += ((i+1) * Y);
      }
      a = ((sumY * sumX2) - (sumX * sumXY))/((n * sumX2) - MathPow(sumX,2)); 
      b = ((n*sumXY) - (sumX*sumY)) / ((n*sumX2) - MathPow(sumX,2));
 

MrChips

Joined Oct 2, 2009
21,842
You must be doing something wrong. My results to a straight line y = mx + c gives me nothing like what you are getting.
 

someonesdad

Joined Jul 7, 2009
1,583
Here's the regression I get with a script I wrote a number of years ago:

Rich (BB code):
ANOVA Table 
 
Source of
Variation                   SS          df                MS
--------------------------------------------------------------
Regression      SSR =      1.54318       1  MSR =      1.54318
Error           SSE =      0.20356      77  MSE =   0.00264364
--------------------------------------------------------------
Total          SSTO =      1.74674      78

Adjusted R^2 =  88.19
         R^2 =  88.35

F statistic with df=(1,77) = 583.7

Covariance matrix:
[[  1.36e-04  -2.57e-06]
 [ -2.57e-06   6.44e-08]]

Model parameters:    Estimate        StdDev    100*StdDev/Est
    b0                121.298       0.01168              0.01
    b1             0.00612904     0.0002537              4.14

Residuals = Res:
                Y         Yhat          Res    100*Res/Y 
   1        121.2        121.3        -0.13        -0.11
   2        121.2        121.3      -0.1297        -0.11
   3        121.2        121.3      -0.1183        -0.10
   4        121.2        121.3     -0.09844        -0.08
   5        121.3        121.3     -0.07607        -0.06
   6        121.3        121.3     -0.05769        -0.05
   7        121.3        121.3     -0.07332        -0.06
   8        121.3        121.3     -0.06845        -0.06
   9        121.3        121.4     -0.03858        -0.03
  10        121.4        121.4    -0.009211        -0.01
  11        121.3        121.4     -0.01834        -0.02
  12        121.4        121.4     -0.01047        -0.01
  13        121.4        121.4       0.0309         0.03
  14        121.4        121.4      0.06027         0.05
  15        121.4        121.4      0.02864         0.02
  16        121.4        121.4    -0.003485        -0.00
  17        121.4        121.4     0.004386         0.00
  18        121.4        121.4      0.02676         0.02
  19        121.5        121.4      0.04263         0.04
  20        121.5        121.4       0.0535         0.04
  21        121.5        121.4      0.06337         0.05
  22        121.5        121.4      0.08724         0.07
  23        121.5        121.4       0.1011         0.08
  24        121.5        121.4      0.08848         0.07
  25        121.5        121.5      0.06335         0.05
  26        121.5        121.5      0.04172         0.03
  27        121.5        121.5       0.0141         0.01
  28        121.5        121.5    -0.003534        -0.00
  29        121.5        121.5     0.003837         0.00
  30        121.5        121.5      0.03271         0.03
  31        121.5        121.5      0.03808         0.03
  32        121.5        121.5      0.03695         0.03
  33        121.5        121.5      0.04882         0.04
  34        121.5        121.5     0.001692         0.00
  35        121.5        121.5      0.02456         0.02
  36        121.6        121.5      0.07043         0.06
  37        121.6        121.5       0.0813         0.07
  38        121.6        121.5      0.06468         0.05
  39        121.6        121.5      0.04305         0.04
  40        121.6        121.5      0.04292         0.04
  41        121.6        121.5     0.009289         0.01
  42        121.5        121.6     -0.01634        -0.01
  43        121.6        121.6    -0.008969        -0.01
  44        121.6        121.6     0.005902         0.00
  45        121.6        121.6     0.007773         0.01
  46        121.6        121.6     0.004643         0.00
  47        121.6        121.6     0.003014         0.00
  48        121.6        121.6      0.01239         0.01
  49        121.6        121.6      0.01626         0.01
  50        121.6        121.6      0.01063         0.01
  51        121.6        121.6       0.0295         0.02
  52        121.7        121.6      0.05787         0.05
  53        121.7        121.6      0.06924         0.06
  54        121.7        121.6      0.05511         0.05
  55        121.7        121.6      0.02148         0.02
  56        121.6        121.6     0.003353         0.00
  57        121.7        121.6     0.006224         0.01
  58        121.7        121.7     0.001595         0.00
  59        121.6        121.7     -0.01403        -0.01
  60        121.6        121.7     -0.02816        -0.02
  61        121.6        121.7     -0.02779        -0.02
  62        121.6        121.7     -0.04342        -0.04
  63        121.7        121.7     -0.03405        -0.03
  64        121.7        121.7     -0.03418        -0.03
  65        121.7        121.7     -0.04381        -0.04
  66        121.7        121.7     -0.03494        -0.03
  67        121.7        121.7     -0.02257        -0.02
  68        121.7        121.7      -0.0467        -0.04
  69        121.7        121.7     -0.05782        -0.05
  70        121.7        121.7     -0.05445        -0.04
  71        121.7        121.7     -0.04708        -0.04
  72        121.7        121.7     -0.05421        -0.04
  73        121.7        121.7     -0.05734        -0.05
  74        121.7        121.8     -0.06597        -0.05
  75        121.7        121.8      -0.0461        -0.04
  76        121.7        121.8     -0.01473        -0.01
  77        121.8        121.8    -0.009857        -0.01
  78        121.8        121.8      0.01951         0.02
  79        121.9        121.8      0.06889         0.06
 

justtrying

Joined Mar 9, 2011
429
This is for Excel, I cannot help much with code as I am in the beginner stage myself. But I think there is a calculation error happening somewhere. I forgot which values got mixed up when I got the same value as you did initially. Maybe consider using different formulas?

Rich (BB code):
  a*79+b*780=9601,94    - a*n + b*SumXi = SumYi
  
      
  a*780+b*167480=384329.5   - a*SumXi + b*SumXi^2=SumXiYi
  
      
  
  a=103,65   (intercept)
  
  
  b=1,81 (slope)
   X    Y       X^2     XY   
      
  1   121,17   1   121,1745   
      
  2   121,18   4   242,362   
      
  3   121,20   9   363,5955   
      
  4   121,22   16   484,898   
      
  5   121,25   25   606,265   
      
  6   121,28   36   727,665   
      
  7   121,27   49   848,876   
      
  8   121,28   64   970,232   
      
  9   121,32   81   1091,835   
      
  10   121,35   100   1213,505   
      
  11   121,35   121   1334,823   
      
  12   121,36   144   1456,338   
      
  13   121,41   169   1578,317   
      
  14   121,44   196   1700,223   
      
  15   121,42   225   1821,285   
      
  16   121,39   256   1942,288   
      
  17   121,41   289   2063,919   
      
  18   121,44   324   2185,839   
      
  19   121,46   361   2307,693   
      
  20   121,47   400   2429,49   
      
  21   121,49   441   2551,301   
      
  22   121,52   484   2673,451   
      
  23   121,54   529   2795,432   
      
  24   121,53   576   2916,816   
      
  25   121,52   625   3037,875   
      
  26   121,50   676   3158,987   
      
  27   121,48   729   3279,906   
      
  28   121,47   784   3401,062   
      
  29   121,48   841   3522,92   
      
  30   121,52   900   3645,45   
      
  31   121,53   961   3767,322   
      
  32   121,53   1024   3889,008   
      
  33   121,55   1089   4011,134   
      
  34   121,51   1156   4131,289   
      
  35   121,54   1225   4253,813   
      
  36   121,59   1296   4377,222   
      
  37   121,61   1369   4499,441   
      
  38   121,60   1444   4620,648   
      
  39   121,58   1521   4741,64   
      
  40   121,59   1600   4863,46   
      
  41   121,56   1681   4983,919   
      
  42   121,54   1764   5104,659   
      
  43   121,55   1849   5226,779   
      
  44   121,57   1936   5349,256   
      
  45   121,58   2025   5471,19   
      
  46   121,59   2116   5592,91   
      
  47   121,59   2209   5714,707   
      
  48   121,61   2304   5837,04   
      
  49   121,62   2401   5959,135   
      
  50   121,62   2500   6080,775   
      
  51   121,64   2601   6203,666   
      
  52   121,68   2704   6327,1   
      
  53   121,69   2809   6449,703   
      
  54   121,68   2916   6570,963   
      
  55   121,66   3025   6691,135   
      
  56   121,65   3136   6812,12   
      
  57   121,65   3249   6934,278   
      
  58   121,66   3364   7056,019   
      
  59   121,65   3481   7177,114   
      
  60   121,64   3600   7298,28   
      
  61   121,64   3721   7420,315   
      
  62   121,64   3844   7541,37   
      
  63   121,65   3969   7663,982   
      
  64   121,66   4096   7786,016   
      
  65   121,65   4225   7907,445   
      
  66   121,67   4356   8030,088   
      
  67   121,69   4489   8152,996   
      
  68   121,67   4624   8273,458   
      
  69   121,66   4761   8394,782   
      
  70   121,67   4900   8517,11   
      
  71   121,69   5041   8639,742   
      
  72   121,69   5184   8761,356   
      
  73   121,69   5329   8883,261   
      
  74   121,69   5476   9004,764   
      
  75   121,71   5625   9128,4   
      
  76   121,75   5776   9252,962   
      
  77   121,76   5929   9375,559   
      
  78   121,80   6084   9500,088   
      
  79   121,85   6241   9626,269   
      sum   780   9 601,94   167480   384329,5
 
Last edited:

Thread Starter

samjesse

Joined Sep 14, 2008
212
you posted
Sum of x = 780

my excel shows it to be: 3160
and the code
sumX = n*(n+1)/2= 3160
where n = 79
i.e 79 observations numbered sequentially 1 - 79
how did you get it to be 780?

that is
1 + 2 + ....+ 79= 3160
 
Last edited:

justtrying

Joined Mar 9, 2011
429
my bad, that does not take all data into account (only took 40 rows). It should be 3160. Than the slope comes out as 0.00613 when corrected (all other values are right). It is also the same value as you would get if you used excel to get the best fit line (excel equation is y=0.0061x+121.3).
 

Thread Starter

samjesse

Joined Sep 14, 2008
212
I got it fixed. thx. it was my bad. ushort data type does not go more than 65536. so I changed it to int and all is well.
 
Top