Two-dimensional Gaussian distribution

Thread Starter

joeyd999

Joined Jun 6, 2011
5,287
I could work out the math on my own, but I'm feeling lazy today. One of you probably knows the answer off the top of your head:

If I have two independent variables, each with its own mean and standard deviation, how do I compute a combination of the two?

For a (simple) example, let's say I have a population of rectangles. I compute the mean and SD of the length and height of the rectangles in a sample, and I know (or determine) that the two parameters are independent.

Now, I wish to choose, say 1 SD of rectangles, taking into consideration both length and height. Obviously, 1 SD of length will give me 68% of rectangles, and 1 SD of height will give me another (different set of) 68%.

How do I choose 68% of all rectangles based upon both length and height?
 

wayneh

Joined Sep 9, 2010
17,498
More likely, an ellipse.
If the data is normalized, it should give you a circle if both dimensions are normal distributions. I'm thinking that if the radius is in units of SD, the percentages included/excluded are the same as in one dimension. So if the origin is at the mean, you can rotate the axes by any arbitrary angle and see the same distribution along each axis.

The thing that's nagging me is that the area increases with the square of the radius. So there must be some r•dr/dθ thing (θ = angle). Not sure that's a relevant concern.
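
The area element r·dr·dθ does turn out to matter. A short derivation, assuming both axes are normalized to zero mean and unit SD and are independent (so the joint density is the product of two standard normal densities), with R the distance from the mean and k the radius in SD units:

```latex
\begin{aligned}
P(R \le k) &= \int_0^{2\pi}\!\int_0^{k} \frac{1}{2\pi}\, e^{-r^2/2}\, r\,dr\,d\theta
            = \int_0^{k} r\, e^{-r^2/2}\, dr
            = 1 - e^{-k^2/2}, \\
P(R \le 1) &= 1 - e^{-1/2} \approx 0.393, \\
P(R \le k) &= 0.68 \;\Rightarrow\; k = \sqrt{-2\ln(0.32)} \approx 1.51 .
\end{aligned}
```

Under these assumptions a circle of radius 1 SD holds about 39% of the points rather than 68%, and the circle that holds 68% has a radius of roughly 1.51 SD.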
 

DGElder

Joined Apr 3, 2016
351
You must adjust your acceptance criteria for length and height so that each includes the best 80% (0.64^0.5) of its population. Then 64% of the total population, on average, will be acceptable for both parameters.

If the means equal the nominal values for each parameter and the populations are normally distributed, then your acceptance criteria would be ±1.28 SD for each parameter.
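
The arithmetic behind the per-axis window can be checked with a minimal sketch, assuming both parameters are normal and independent. The function name `box_half_width` is illustrative only. It is run for the 64% figure used here and for the 68% asked about in the original question:

```python
from math import sqrt
from statistics import NormalDist


def box_half_width(joint_fraction: float) -> float:
    """Half-width, in SDs, of a per-axis window such that two independent
    normal parameters are both inside it with probability joint_fraction."""
    per_axis = sqrt(joint_fraction)  # each axis must pass this fraction
    # Two-sided window: the upper limit sits at the (1 + per_axis) / 2 quantile.
    return NormalDist().inv_cdf((1.0 + per_axis) / 2.0)


for target in (0.64, 0.68):
    print(f"{target:.0%} jointly -> {sqrt(target):.1%} per axis, "
          f"+/- {box_half_width(target):.2f} SD")
```

For 64% this reproduces the ±1.28 SD figure. For 68% the per-axis fraction is about 82.5%, so the window is somewhat wider.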
 

Thread Starter

joeyd999

Joined Jun 6, 2011
5,287
You must adjust your acceptance criteria for length and height so that each includes the best 80% (0.64^0.5) of its population. Then 64% of the total population, on average, will be acceptable for both parameters.
This is my assumption. But I'd want to prove this somehow before I depend on it.

I've got a nagging feeling in the back of my head that there is a two-dimensional chi-squared test that should arrive at this result.
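
For what it's worth, a result along those lines does exist: for two independent normal variables, the sum of the squared z-scores follows a chi-squared distribution with 2 degrees of freedom, whose CDF has the closed form 1 - exp(-x/2). A minimal sketch under those assumptions (the function names are made up for illustration); note that it describes an elliptical region, not the per-axis box quoted above:

```python
from math import exp, log, sqrt


def chi2_2dof_cdf(x: float) -> float:
    """CDF of a chi-squared variable with 2 degrees of freedom."""
    return 1.0 - exp(-x / 2.0)


def ellipse_radius(fraction: float) -> float:
    """Radius, in SD units, of the ellipse expected to hold `fraction`
    of an independent bivariate normal population."""
    # Invert the CDF for r^2, then take the square root.
    return sqrt(-2.0 * log(1.0 - fraction))


r68 = ellipse_radius(0.68)
print(f"68% ellipse radius: {r68:.3f} SD")
print(f"round-trip check:   {chi2_2dof_cdf(r68 ** 2):.3f}")
```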
 

wayneh

Joined Sep 9, 2010
17,498
I think what I was describing was computing a normalized geometric mean (sqrt(L^2 + W^2) = radius) for every rectangle. That collapses it to a single metric.
 

Thread Starter

joeyd999

Joined Jun 6, 2011
5,287
I think what I was describing was computing a normalized geometric mean (sqrt(L^2 + W^2) = radius) for every rectangle. That collapses it to a single metric.
Hmmmm....that's different. Yet I can think of lots of *very different* rectangles with exactly the same "radius".
 

Thread Starter

joeyd999

Joined Jun 6, 2011
5,287
Here is a sample of the data I am working with. I did it with code tags. I think there is a way to do tables, but I forgot how.

Code:
Sample    X    Y
1    -116    356
2    -104    198
3    -137    305
4    -92    219
5    -118    318
6    -125    334
7    -99    188
8    -136    173
9    -89    58
10    -129    163
11    -95    213
12    -73    -191
13    -53    -6
14    -122    249
15    -169    260
16    -95    220
17    -51    135
18    -115    69
19    -122    236
20    -131    250
21    -153    239
22    -167    49
23    -73    18
24    -51    -24
25    -128    105
26    -114    163
27    -63    85
28    32    100
29    -42    126
30    29    121
31    -28    -121
32    -147    371
33    -89    58
34    -64    -328
35    -114    9
36    -51    45
37    -65    158
38    -157    202
39    -76    224
40    -128    223
41    -39    269
42    5    -107
43    9    -82
44    -5    27
45    48    261
46    -109    281
47    -98    301
48    -105    30
49    -75    326
50    -59    115
51    -125    87
52    -142    177
53    -8    288
54    -76    83
55    -85    186
56    -92    251
57    -23    66
     
Count    57    57
Average    -84.2    142.6
SD    51.4    137.8
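
A minimal sketch of how the posted averages and SDs could be used to test individual samples against an elliptical region. It assumes X and Y are independent and roughly normal, and treats the sample statistics as if they were the population values. None of that has been verified for this data, and the helper names are hypothetical:

```python
from math import hypot, log, sqrt

# Summary statistics as posted above (57 samples).
MEAN_X, SD_X = -84.2, 51.4
MEAN_Y, SD_Y = 142.6, 137.8


def normalized_radius(x: float, y: float) -> float:
    """Distance from the mean point, measured in SDs along each axis."""
    return hypot((x - MEAN_X) / SD_X, (y - MEAN_Y) / SD_Y)


def in_ellipse(x: float, y: float, fraction: float = 0.68) -> bool:
    """True if (x, y) lies inside the ellipse expected to hold `fraction`
    of an independent bivariate normal population."""
    limit = sqrt(-2.0 * log(1.0 - fraction))
    return normalized_radius(x, y) <= limit


# A few rows from the table: samples 1, 2 and 34.
for sample, x, y in [(1, -116, 356), (2, -104, 198), (34, -64, -328)]:
    print(sample, round(normalized_radius(x, y), 2), in_ellipse(x, y))
```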
 

DGElder

Joined Apr 3, 2016
351
This is my assumption. But I'd want to prove this somehow before I depend on it.

I've got a nagging feeling in the back of my head that there is a two-dimensional chi-squared test that should arrive at this result.

That is straightforward probability. The population distribution doesn't matter as long as the two parameters are independent and you know the percentage of each population that meets your dimensional criteria. If you don't know these things, then you need to describe the problem in much more detail to get useful practical statistics.
 

DGElder

Joined Apr 3, 2016
351
What are you trying to accomplish?
 

Thread Starter

joeyd999

Joined Jun 6, 2011
5,287
That is straightforward probability. The population distribution doesn't matter as long as the two parameters are independent and you know the percentage of each population that meets your dimensional criteria. If you don't know these things, then you need to describe the problem in much more detail to get useful practical statistics.
The only thing that doesn't feel right about your approach is that it results in a rectangular box in which the points that meet the criteria lie. I'd expect an elliptical result.

What are you trying to accomplish?
Just what I said in post #1. I'd like to generalize the problem, though, in order to select items that fit within an arbitrary number of SDs.
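
To make the generalization concrete, here is a minimal sketch comparing the two candidate regions for an arbitrary number of SDs, k. It assumes both parameters are normal and independent, and the function names are illustrative only:

```python
from math import erf, exp, sqrt


def box_coverage(k: float) -> float:
    """Fraction with BOTH parameters within +/- k SD (rectangular region)."""
    one_axis = erf(k / sqrt(2.0))  # P(|z| <= k) for one normal variable
    return one_axis ** 2


def ellipse_coverage(k: float) -> float:
    """Fraction with normalized radius sqrt(zx^2 + zy^2) <= k."""
    return 1.0 - exp(-k * k / 2.0)


for k in (1.0, 2.0, 3.0):
    print(f"k={k:.0f}: box {box_coverage(k):.3f}, ellipse {ellipse_coverage(k):.3f}")
```

For the same k the ellipse sits inside the box, so it always holds the smaller fraction. Neither region holds 68% at k = 1.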
 

DGElder

Joined Apr 3, 2016
351
The only thing that doesn't feel right about your approach is that it results in a rectangular box in which the points that meet the criteria lie. I'd expect an elliptical result.



Just what I said in post #1. I'd like to generalize the problem, though, in order to select items that fit within an arbitrary number of SDs.
I don't understand what you are talking about. Are you talking about plotting the length and height dimensions on an X-Y scatter plot, and the shape of the two-dimensional region that fills in on that plot as n approaches infinity? If so, that would look like a rectangle. But if you are talking about the probability of falling at points within that rectangle, you would need a third dimension, Z, to illustrate the probabilities: P(z)=P(x)*P(y), which can be thought of as the density function for the population.

Alternatively, if n is large enough to fill in the 2D plot with a good distribution (solid black near the origin) but not so large as to darken the whole rectangle, then the density of the plotted points would look oval-ish, as the least probable points near the rectangle corners fade to near invisibility due to their low density.
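
The P(z)=P(x)*P(y) idea can be illustrated with a minimal sketch, assuming normal distributions. The means and SDs below are made up for illustration and do not come from any data in this thread:

```python
from statistics import NormalDist

# Hypothetical length/height distributions; the numbers are illustrative only.
length = NormalDist(mu=10.0, sigma=1.0)
height = NormalDist(mu=5.0, sigma=0.5)


def joint_density(x: float, y: float) -> float:
    """Joint density of independent length and height: P(x) * P(y)."""
    return length.pdf(x) * height.pdf(y)


on_x_axis = joint_density(11.0, 5.0)   # 1 SD out along length only
on_y_axis = joint_density(10.0, 5.5)   # 1 SD out along height only
box_corner = joint_density(11.0, 5.5)  # 1 SD out along both

print(on_x_axis, on_y_axis, box_corner)
```

Points 1 SD out along either axis alone share the same density, while the corner of the ±1 SD box has a lower one. That is why contours of equal density come out as axis-aligned ellipses rather than rectangles.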
 

BR-549

Joined Sep 22, 2013
4,928
Arranging the squares into a rectangle with one side equal to the number of values, n, results in the other side being the distribution's variance, σ².

Does that work? It's all geek to me.
 