Analyzing Random numbers

MrAl · Apr 2, 2019

bogosort said:
In order for N tosses of a fair coin to have a mean of exactly 0.5, you must have N/2 heads. This rules out any sequence with an odd number of tosses. In the case of an even number of tosses, the probability of getting exactly N/2 heads is

nCr(N, N/2) / 2^N

As 2^N grows faster than nCr(N, N/2), the probability of any particular sequence having exactly N/2 heads diminishes as the length of the sequence grows. Indeed, the probability approaches zero as N → ∞.

Recall that you were talking about a practical test for determining whether a given (finite) sequence is random. I've given you "my theory" as to why it is a terrible test for randomness; if you disagree, please explain how my analysis fails.

Because a practical test shows clearly otherwise, the result 'settles' to 1/2 or rather the average of the samples.
To say that settling to the average is antithetical means we must not be talking about the same things or else you have NEVER EVER NEVER EVER done a REAL LIFE practical test to prove or disprove your theoretical 'result'.
DO A GOD DANG TEST WILL YA ?? ha ha

bogosort · Apr 2, 2019

WBahn said:
So just because one compression algorithm does very poorly on a particular data set does not really say much about whether that data set is truly random; it could simply be a data set that is too poor a match for the kind of data that the algorithm was designed for. What we CAN say is that if we DO get any significant compression (and we could probably put a threshold on it), then we have high confidence that the data is NOT truly random since all compression algorithms exploit nonrandomness in the data to achieve compression. But if we don't get any significant compression, about all we can say is that we haven't ruled out the possibility that the data is truly random and, perhaps, we have some reason to suspect that it might be.

Agreed, randomness cannot be proven. It may not even really exist, except as a measure of our own ignorance.

bogosort · Apr 2, 2019

MrAl said:
Because a practical test shows clearly otherwise, the result 'settles' to 1/2 or rather the average of the samples.
To say that settling to the average is antithetical means we must not be talking about the same things or else you have NEVER EVER NEVER EVER done a REAL LIFE practical test to prove or disprove your theoretical 'result'.
DO A GOD DANG TEST WILL YA ?? ha ha

I gave you a practical test, a really easy one to perform yourself. Flip a coin 10 times; did you get 5 heads? Likely not. Flip a coin 1,000 times; did you get 500 heads? I'll bet you $100 you didn't. How much more practical of an experiment do you need?

MrAl · Apr 2, 2019

WBahn said:
[1] The details are in the fine print.

[2] The odds that it will be within an arbitrarily small range centered on 1/2 will approach unity.

[3] The odds that it will BE 1/2 will approach zero.

Hi,

Not sure what you mean by the last 3rd statement. Why do you make this sound significant?

MrAl · Apr 2, 2019

bogosort said:
I gave you a practical test, a really easy one to perform yourself. Flip a coin 10 times; did you get 5 heads? Likely not. Flip a coin 1,000 times; did you get 500 heads? I'll bet you $100 you didn't. How much more practical of an experiment do you need?

Hello,

I dont think you can rely on so few samples. There are runs in these experiments. You can see a run of 10 heads for example.
However, it looks like you either dont want to do a good test or you believe so strongly that it wont work that you feel strongly it would be a waste of time, so then i'll have to repeat the test myself and show you the results. I hope you are not wearing a hat today because from what i hear they dont taste very good

Thanks for being a good sport about this though too. I appreciate that you discuss this mostly with intelligence and not with an over emphasis on emotion.

bogosort · Apr 2, 2019

MrAl said:
I dont think you can rely on so few samples. There are runs in these experiments. You can see a run of 10 heads for example.
However, it looks like you either dont want to do a good test or you believe so strongly that it wont work that you feel strongly it would be a waste of time, so then i'll have to repeat the test myself and show you the results.

Did you see the part above about the probability of getting N/2 heads in N samples diminishing as N gets larger? How many samples is good enough for you? How about a billion? The chances of you getting exactly 500 million heads in a billion throws is less than three-thousandths of a percent.

I hope you are not wearing a hat today because from what i hear they dont taste very good

I'm happy to eat my hat if it means that I'm able to correct a mistake in the way I've been thinking. But, so far, it feels like I've been providing quantifiable evidence for my case, while you haven't provided any. Show me the money, as it were.

djsfantasi · Apr 2, 2019

bogosort said:
Did you see the part above about the probability of getting N/2 heads in N samples diminishing as N gets larger? How many samples is good enough for you? How about a billion? The chances of you getting exactly 500 million heads in a billion throws is less than three-thousandths of a percent.

I'm happy to eat my hat if it means that I'm able to correct a mistake in the way I've been thinking. But, so far, it feels like I've been providing quantifiable evidence for my case, while you haven't provided any. Show me the money, as it were.

Does anyone else feel that this discussion is pedantic and has no resolution. IMHO, bogosort is taking an untenable position, with which he is claiming undeniable truth. WBahn has stated the problem succinctly in post#77. I agree with bogosort that finding a probability of EXACTLY 0.5 is unlikely.

That’s irrelevant.

The limit of the process is the real story. And it WILL be 0.5, no matter what he says.

The experiment will approach a value of 0.5 with a sample size that approaches infinity. As sample size increases, the range of values around 0.5 decreases decrease to a sufficiently small value. And I don’t know about you, but there comes a point where that range is PRACTICALLY zero.

There’s an old joke we told at Georgia Tech. There was a ball and at the end of the night, they had a special dance. All women were on one side of the ballroom; the men were on the other. At each verse, each approached the opposite side by half of the distance. The mathematicians left. The Engineers stayed, because they new they could get close enough for all practical purposes.

And that is that!

Ya’akov · Apr 2, 2019

djsfantasi said:
There’s an old joke we told at Georgia Tech. There was a ball and at the end of the night, they had a special dance. All women were on one side of the ballroom; the men were on the other. At each verse, each approached the opposite side by half of the distance. The mathematicians left. The Engineers stayed, because they new they could get close enough for all practical purposes.

And the physicists stayed by going home which was an order of magnitude approximation of staying.

bogosort · Apr 2, 2019

djsfantasi said:
Does anyone else feel that this discussion is pedantic and has no resolution. IMHO, bogosort is taking an untenable position, with which he is claiming undeniable truth. WBahn has stated the problem succinctly in post#77. I agree with bogosort that finding a probability of EXACTLY 0.5 is unlikely.

That’s irrelevant.

On the contrary, the context of this "pedantic" discussion is a practical test for randomness, namely, that if you sum the extreme values in a sequence and divide by 2, the sequence is random if it equals the mean value. In other words, MrAl's test requires that a sequence of coin flips have a mean of 0.5 EXACTLY. My analysis of this test is based on the very practical fact that the vast majority of actual random sequences will not have this property. How this can be interpreted as theoretical pedantry is beyond me.

djsfantasi · Apr 2, 2019

bogosort said:
On the contrary, the context of this "pedantic" discussion is a practical test for randomness, namely, that if you sum the extreme values in a sequence and divide by 2, the sequence is random if it equals the mean value. In other words, MrAl's test requires that a sequence of coin flips have a mean of 0.5 EXACTLY. My analysis of this test is based on the very practical fact that the vast majority of actual random sequences will not have this property. How this can be interpreted as theoretical pedantry is beyond me.

Sorry, I just don’t get it. I’ve reread every one of MrAl’s posts and disagree with your statement on his conclusion.

WBahn · Apr 2, 2019

djsfantasi said:
Sorry, I just don’t get it. I’ve reread every one of MrAl’s posts and disagree with your statement on his conclusion.

Look at Post #60: "One test that i've used in the past that i remember is the mean. The mean of an infinite number of samples of a true random variable is 1/2 of the sum of the two extremes (given every number between is generated also). So if you generate numbers 3, 4, and 5, the mean is 4 so after an infinite number of generations you better get 4 exactly."

So here's my PRNG that will pass this test.

C:

int prng()
{
   static int n = 3;
   n = (n - 2) % 3 + 3;
   return n;
}

It generates every number between 3 and 5, inclusive, and after an infinite number of generations the average of the extremes is exactly the mean. (provided infinity is divisible by 3

).

Or consider a prng that uses a true rng -- which we will assume exists and is accessible -- that returns a random integer between 0 and one less than the number passed to it as follows:

C:

int prng()
{
   static int history[] = {0,0,0};

   n = 3 + trueRandom(3);

   if (3 * history[n] < 1)
   {
      history[n]++;
      return n + 3;
   }

   history[(n + 1) % 3]++;
   return 3 + ((n + 1) % 3) ;
}

The idea here is that you preferentially return a number if that number is presently underrepresented in the sequence thus far.

Note that you can easily tweak this to prevent overflow in the history array (well, make it astronomically unlikely).

You can also easily tweak it so that the likelihood of the least represented number is increased just slightly while the likelihood of most represented number is decreased just slightly and by the same amount (leaving the likelihood of the third unchanged).

By definition the result is NOT a truly random sequence precisely because the trials are not independent -- the outcome probabilities for the next trial is dependent on the results of prior trials.

If I make the shift in the probabilities small enough, this prng will pass any test you devise that is based on looking at the results matching the expected distributions close enough. The tests that it might fail will be the ones that look at the likelihood that the results match the expected distributions too closely.

WBahn · Apr 2, 2019

MrAl said:
Hi,

Not sure what you mean by the last 3rd statement. Why do you make this sound significant?

Because you've stated that the results go to EXACTLY the expected value.

djsfantasi · Apr 2, 2019

WBahn said:
Look at Post #60: "One test that i've used in the past that i remember is the mean. The mean of an infinite number of samples of a true random variable is 1/2 of the sum of the two extremes (given every number between is generated also). So if you generate numbers 3, 4, and 5, the mean is 4 so after an infinite number of generations you better get 4 exactly."

So here's my PRNG that will pass this test.

C:

int prng() { static int n = 3; n = (n - 2) % 3 + 3; return n; }

It generates every number between 3 and 5, inclusive, and after an infinite number of generations the average of the extremes is exactly the mean. (provided infinity is divisible by 3 ).

Or consider a prng that uses a true rng -- which we will assume exists and is accessible -- that returns a random integer between 0 and one less than the number passed to it as follows:

C:

int prng() { static int history[] = {0,0,0}; n = 3 + trueRandom(3); if (3 * history[n] < 1) { history[n]++; return n + 3; } history[(n + 1) % 3]++; return 3 + ((n + 1) % 3) ; }

The idea here is that you preferentially return a number if that number is presently underrepresented in the sequence thus far.

Note that you can easily tweak this to prevent overflow in the history array (well, make it astronomically unlikely).

You can also easily tweak it so that the likelihood of the least represented number is increased just slightly while the likelihood of most represented number is decreased just slightly and by the same amount (leaving the likelihood of the third unchanged).

By definition the result is NOT a truly random sequence precisely because the trials are not independent -- the outcome probabilities for the next trial is dependent on the results of prior trials.

If I make the shift in the probabilities small enough, this prng will pass any test you devise that is based on looking at the results matching the expected distributions close enough. The tests that it might fail will be the ones that look at the likelihood that the results match the expected distributions too closely.

Ok. You found the smoking gun. But that was the only strict statement. The posts I remember state that the mean should be within a small but finite range of the mean. Focusing on this one statement, IMHO, had diminished the rest of the discussion.

WBahn · Apr 2, 2019

djsfantasi said:
Ok. You found the smoking gun. But that was the only strict statement. The posts I remember state that the mean should be within a small but finite range of the mean. Focusing on this one statement, IMHO, had diminished the rest of the discussion.

The claim repeated elsewhere, as well.

But note that the majority of my post isn't about that and so let's assume we are talking about getting close to the expected values of whatever metric we are using. I put forth a prng which is demonstrably nonrandom but which I claim (and I could be wrong, but it's what I believe right now) will pass any suite of tests that is based on this approach. The most likely way to detect the nonrandomness is probably to test for it getting too close too quickly to those expected values.

It's interesting that some (many? most?) of the more widely used suites, including diehard, do not check the data for whether all of the possible values occurred with (about) equal frequency. Presumably this is (or at least might be) because such tests are of such poor validity and reliability that they aren't worth performing even though they are far simply and faster to perform than the other tests in the suite.

MrAl · Apr 3, 2019

WBahn said:
Because you've stated that the results go to EXACTLY the expected value.

Hello again,

Well that is only when the limit of the number of samples approaches infinity. So would you rather state it as the difference between 1/2 and 1/infinity?
So in other words, it approaches:
0.5+1/inf
or:
0.5-1/inf

which looks like exactly 0.5 to me.

MrAl · Apr 3, 2019

bogosort said:
On the contrary, the context of this "pedantic" discussion is a practical test for randomness, namely, that if you sum the extreme values in a sequence and divide by 2, the sequence is random if it equals the mean value. In other words, MrAl's test requires that a sequence of coin flips have a mean of 0.5 EXACTLY. My analysis of this test is based on the very practical fact that the vast majority of actual random sequences will not have this property. How this can be interpreted as theoretical pedantry is beyond me.

Hi again,

Before i produce the actual test results let me run one more thing by you.

Let's use the numbers 1,2,3 as the possible samples. So either of those three numbers can come out of the generator.
The simple formula i suggested was the mean is (1+3)/2=2 but the full fledged formula would be:
mean=(1+2+3)/3=2
and since we dont want to leave out any samples let's use that last one.

Now the probability that 1 would be generated is just as likely as 2 and just as likely as 3.
What that means is that 1 is not systematically generated and neither is 2 or 3, so they must all be generated with the same frequency over a large number of samples. That's because if 1 is generated then 1 suddenly has less probabliity than 2 or 3 otherwise 2 and 3 would not have equal probability of being generated with 1 being systematic.
What this means is that for a large number of samples we would see 1 as many times as 2 and as many times as 3 being generated.
Reducing the set for simplicity, we might see:
1 generated 1 million times
2 generated 1 million times
3 generated 1 million times.
Now of course this is optimal, but we might see let's say 2 generated 1 million +10 times or 1000010 times.
The mean of this is 2.00000666666666 with repeating 6.
Now since 2 was generated 10 more times than 1 or 3, we would expect 1 and 3 to be generated more times in the future than 2 which would then bring the count closer to maybe 2000020 times for all three which brings the mean back to 2 again.
The reason we would see 1 and 3 generated more times is again because 2 being generated can not be systematic because if it was then the generator could not be claimed to be generating numbers with equal probability.

Next i will produce actual test values and what we will see (because i've done this numerous times) is the mean getting closer and close to the presumed value which is the mean as the number of samples goes up. Thus i cant help but think that as the number of samples approaches infinity the mean would be "mean+1/inf" which is just the mean.
If this requirement is not met then one of the numbers must be being produced systematically and that's not random. As we see from above if 2 is produced more times than 1 or 3 we dont get exactly 2.000000000000000 we get 2+err where err is some small error.
Note that this is not the ONLY test in that it is not a "pass or fail" test on the RNG, it's just one test that might be called a first test because if it passes that does not mean that it is definitely random, it's just that if it fails then it cant be. The reason of course is that (2+2+2)/6 is also equal to 2, so a RNG that always produces "2" will also pass the test obviously, but (2+2+3)/6 will not (with 1 never being generated).

bogosort · Apr 3, 2019

MrAl said:
That's because if 1 is generated then 1 suddenly has less probabliity than 2 or 3 otherwise 2 and 3 would not have equal probability of being generated with 1 being systematic.

If I'm understanding you correctly -- that the probabilities change depending on previous outcomes -- then you're not describing a random process. The reason that we do see runs of 10 heads in a row on coin flips is because each flip is independent of the previous; the probability of heads is always 0.5 (for a fair coin). A process that always generates exactly N/2 heads for N flips is the opposite of random.

Next i will produce actual test values and what we will see (because i've done this numerous times) is the mean getting closer and close to the presumed value which is the mean as the number of samples goes up.

If your "big reveal" is that the mean is approached when you take the average mean of an ensemble of sequences, don't bother, as that's not surprising in the least. Likewise, if your big reveal is that the mean is approached as N → ∞, again, don't bother.

You billed this method as a real-world randomness test for a sequence of numbers. This of course implies finite sequences. You acknowledge that your test is susceptible to false positives (a string of "2"s will pass the test), and say that if a sequence fails the test then it "cant be" random. My challenge to you is to produce ten sequences, any length you want, generated by established random or pseudo-random methods, and see how many pass your test. Flip coins, use rand(), count Geiger clicks and transform the results to a uniform distribution, whatever. If you don't have the time, tell me the properties the sequences should have and I can quickly whip something up in MATLAB.

MrChips · Apr 3, 2019

The probability of flipping heads is 0.5.

However, knowing that the previous outcome was heads, the probability of flipping heads is now 0.25.

This is called conditional probability.

Reference:
https://en.wikipedia.org/wiki/Conditional_probability

bogosort · Apr 3, 2019

MrChips said:
The probability of flipping heads is 0.5.

However, knowing that the previous outcome was heads, the probability of flipping heads is now 0.25.

This is both false and depressing if you really believe this. I'm hoping that what you really meant to say was that the probability of flipping two heads in a row in two flips is 0.25.

WBahn · Apr 3, 2019

MrChips said:
The probability of flipping heads is 0.5.

However, knowing that the previous outcome was heads, the probability of flipping heads is now 0.25.

This is called conditional probability.

Reference:
https://en.wikipedia.org/wiki/Conditional_probability

Are you really trying to claim that the probability of flipping heads on the next flip is determined by what the outcome on the previous flip was?

What will the probability be of flipping heads on the tenth flip if we know the prior nine flips were all heads?

I can't tell for sure how you are getting the 0.25 value. Is this your line of thinking:

If I flip a coin twice then there are four possible outcomes: HH, HT, TH, TT.
So the probability of getting HH after two flips is 0.25
If I flip the coin once and it comes out heads, then on the next flip if it comes out heads I will have HH and that can only happen 25% of the time, so my probability of getting heads on the second flip must therefore be 0.25

If so, then follow that same line of reasoning to determine the probability of getting tails on the next flip. It will also be 0.25 since if you flip tails you then have HT and that, too, can only happen 25% of the time.

Now let's do it correctly.

The probability that you will have HH after two flips given that I have H after one flip is 0.50.
The probability that you will have HT after two flips given that I have H after one flip is 0.50.
The probability that you will have TH after two flips given that I have H after one flip is 0.00.
The probability that you will have TT after two flips given that I have H after one flip is 0.00.
The probability that you will have HH after two flips given that I have T after one flip is 0.00.
The probability that you will have HT after two flips given that I have T after one flip is 0.00.
The probability that you will have TH after two flips given that I have T after one flip is 0.50.
The probability that you will have TT after two flips given that I have T after one flip is 0.50.

Thread starter	Similar threads	Forum	Replies	Date
M	Is LTSpice consistenly analyzing identical circuits differently?	PCB Layout , EDA & Simulations	16	Jul 19, 2026
Y	analyzing current limiting switch circuit	Analog & Mixed-Signal Design	34	Feb 5, 2026
Y	analyzing pulse generating circuit	Analog & Mixed-Signal Design	19	Jan 17, 2026
	Problem analyzing a resistor network	Homework Help	19	Oct 23, 2024
C	Need Help Analyzing Silicon Structures in a IC	IC Design	1	May 7, 2024

Analyzing Random numbers

Join our Engineering Community! Sign-in with:

Analyzing Random numbers

MrAl

bogosort

bogosort

MrAl

MrAl

bogosort

djsfantasi

Ya’akov

bogosort

djsfantasi

WBahn

WBahn

djsfantasi

WBahn

MrAl

MrAl

bogosort

MrChips

bogosort

WBahn

You May Also Like

Breaking AI Bottlenecks: 3 Startups Look Beyond the Chip

Rambus Rolls Out Memory Chipset to Unburden AI PCs

Exclusive—Powering AI Processors Using a 48 V Factorized Power Architecture

PCIe 7.0 Roundup: Test and Timing Tools Emerge as Ecosystem Takes Shape