Interesting Algorithms

WBahn

Joined Mar 31, 2012
32,847
I have a feeling this algorithm is nowhere to be found online because it was developed by myself and it was unlike any other that I have ever seen. That's mostly because of how fast the result gets better and better, doubling the number of precise digits for each pass. That's quite different I think because if we got up to 32 digits, the next pass would give us 64 digits, and the next 128 digits, etc.
This is known as quadratic convergence and there are a few algorithms that achieve it (or even better, at least in some cases). I mentioned AGM methods previously, which are used for the ultra-precise applications (thousands of digits) that you were talking about.
 

MrAl

Joined Jun 17, 2014
13,704
You sound like a bit of an expert in this area. Years ago I developed a compiler and needed a floating point library for the Intel CPU. I tracked down a guy who had one, the licensing was pricey, no doubt a money spinner for him.
Hi,

I don't think I could be called an expert in that area, but I spent so much time over the years developing math routines for both high level languages and assembler that I found a lot of tricks that probably are not published yet or very rarely published. It's amazing how many tricks there are that are not well-known even by all mathematicians. It seems like one mathematician knows some tricks that the other does not know, and that other one knows tricks that the first one does not know. This seems evident by the differences in books written by each.
 

MrAl

Joined Jun 17, 2014
13,704
This is known as quadratic convergence and there are a few algorithms that achieve it (or even better, at least in some cases). I mentioned AGM methods previously, which are used for the ultra-precise applications (thousands of digits) that you were talking about.
Hi,

Yes, quadratic convergence is the goal.

Amazingly, I found an algorithm for sin(x) that has quadratic convergence but it's just a theoretical first step in what I might have found years and years ago. It's in theory only right now and still needs work to get anything useable, if it is even possible.

When you say AGM methods do you mean Arithmetic Geometric Mean? I must have missed that.
Any reference for sin(x) or similar?
And, would that get me quadratic convergence somehow?
 

Futurist

Joined Apr 8, 2025
758
Hi,

Yes, quadratic convergence is the goal.

Amazingly, I found an algorithm for sin(x) that has quadratic convergence but it's just a theoretical first step in what I might have found years and years ago. It's in theory only right now and still needs work to get anything useable, if it is even possible.

When you say AGM methods do you mean Arithmetic Geometric Mean? I must have missed that.
Any reference for sin(x) or similar?
And, would that get me quadratic convergence somehow?
Go to copilot and enter "quadratic convergence sin(x)"
 

Futurist

Joined Apr 8, 2025
758
Regarding the sin(x) function. The series has several options for optimization.

1. factorial n can be a lookup table.
2. the numerator of each term is x^2 * the numerator of previous term.
3. any angle θ an be normalized to 0 >= θ <= 90

I'm sure proper implementations leverage these things.

But how does "floating point hardware" do these things?

I have an old 287 coprocessor in my archives, big deal back in the day.
 
Last edited:

xox

Joined Sep 8, 2017
936
Regarding the sin(x) function. The series has several options for optimization.

1. factorial n can be a lookup table.
2. the numerator of each term is x^2 * the numerator of previous term.
3. any angle θ an be normalized to 0 >= θ <= 90

I'm sure proper implementations leverage these things.

But how does "floating point hardware" do these things?

I have an old 287 coprocessor in my archives, big deal back in the day.
One thing that make me wonder about the Taylor series expansion of sin(x) using standard floating point operations is the precision. When I run the following program with x=1, the output is fine, but otherwise the accuracy seems to suffer.

Code:
#include <math.h>
#include <float.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char** argv) {
  if (argc < 2 || argc > 4)
    return printf("Usage: %s [ARGUMENT] [DIGITS] [EPSILON]\n", argv[0]);
  double x = atof(argv[1]);
  int digits = 53;
  if (argc > 2)
    digits = atoi(argv[2]);
  double epsilon = DBL_EPSILON;
  if (argc > 3)
    epsilon = atof(argv[3]);
  int iterations = 0;
  double computed = 0;
  double x_n = fmod(x, 2 * M_PI);
  double x_squared = x_n * x_n;
  double n = 1;
  double factorial = n;
  double last = 0;
  double sign = 1;
  while (true) {
    ++iterations;
    computed += sign * (x_n / factorial);
    x_n *= x_squared;
    factorial *= ((n + 1) * (n + 2));
    n += 2;
    sign = -sign;
    double check = fabs(computed - last);
    if (check <= epsilon)
      break;
    last = computed;
  }
  double standard = sin(x);
  double diff = standard - computed;
  double error = (diff / standard);
  printf("--- sin(%g) ---\n", x);
  printf("Computed: %.*g\n", digits, computed);
  printf("Standard: %.*g\n", digits, standard);
  printf("AbsError: %g%%\n", error * 100);
  printf("Digits  : %d\n", digits);
  printf("Epsilon : %g\n", epsilon);
  printf("Terms   : %d\n", iterations);
}
*** EDIT ***

Ah, right. It's because the factorial grows so quickly.
 
Last edited:

BobTPH

Joined Jun 5, 2013
11,516
1. factorial n can be a lookup table.
2. the numerator of each term is x^2 * the numerator of previous term.
3. any angle θ an be normalized to 0 >= θ <= 90
Yes, and, as I described, it can be reduced to 0 to 45 by mapping it to a cos for 45 to 90. This can improve any sin algorithm.

Each term takes only 2 multiplies and an add.
 

Futurist

Joined Apr 8, 2025
758
Yes, and, as I described, it can be reduced to 0 to 45 by mapping it to a cos for 45 to 90. This can improve any sin algorithm.

Each term takes only 2 multiplies and an add.
Seems like quite a few of you guys have delved deeply into all this, very interesting stuff too.

Anyone else here recall using "log books" at school? (I still have one somewhere around here...)
 
Last edited:

WBahn

Joined Mar 31, 2012
32,847
Seems like quite a few of you guys have delved deeply into all this, very interesting stuff too.

Anyone else here recall using "log books" at school? (I still have one somewhere around here...)
Sure. I was fortunate to be in (what I consider to be) the Goldilocks zone, a period that only lasted about three years, in which math courses were still no-calculator but science courses allowed scientific calculators. So we had to learn and become proficient with the fundamentals, but could leverage technology for the grunt work. That fundamentals component has paid off so many times that I can't begin to count them. While my math classes had ditched slide rules, the class before me used them. Fortunately, I got to become comfortable with them in pilot training.
 

xox

Joined Sep 8, 2017
936
OK, this one is really mind blowing. You can actually multiply numbers together using little more than pure randomness!

 

WBahn

Joined Mar 31, 2012
32,847
OK, this one is really mind blowing. You can actually multiply numbers together using little more than pure randomness!

The concept is very simple and is anything but new. Stochastic signal processing is used in many fields.

She very conveniently overlooks a few major issues.

She claims that all of the thousands of transistors that make up a high-precision multiplier can be replaced by a single AND gate. She then claims that this is a big deal because of the energy used by the multiplier as well as the silicon die area used and in a couple of places she implies that this stochastic approach would also be much faster because it only uses a single simple gate. But she conveniently forgets something that she had previously relied on in explaining how it works -- that this single AND gate has to be walked bit-by-bit across the stream of random bits that represent the two numbers. While she does point out that accuracy requires lots of random samples and that speed and accuracy are tradeoffs, she very conveniently fails to give any indication of how long the bitstreams would have to be in order to achieve even a modest degree of accuracy and precision. The problem is that the needed length of the bitstream grows exponentially with the needed resolution and again grows exponentially with the required precision.

To grasp the significance of this, let's consider that we want to be able to represent something with just eight bits of precision. That means that we want to be able to distinguish 256 different values, If these values are represented via the density of a bit stream, then at the very minimum we would have to have a stream that was 255 bits long such that the number is represented by the number of bits in that stream that are set. So we have to run a 256-bit stream through our simple AND gate, which is going to take time and consume energy.

But if only it were only that bad.

The bits in the stream are stochastic, with each bit being a 0 or 1 independent of any of the other bits in the stream. The value being represented merely determines the probability of each bit being set.

With a stream that is 255 bits long, it is virtually guaranteed that the stream that is generated will represent the wrong value.

A simple example that most people here can probably relate to is the classic coin-flipping experiment.

If we have the proverbial "fair coin", the the probability of a 1 (aka, heads) is 0.5, thus the resulting bit stream will represent the number 0.5 (or, in binary, 0.1b).

Since we are wanting eight bits of precision, that means that we want our bitstream to represent 0.5 (i.e., 0.10000000b) as being different from 0.01111111b and 0.10000001b (as well as every other eight bit value in the range 0 <= x < 1).

Now, since this is a stochastic process, we can't guarantee this. The best we can do is say something like we want probability that the bit density of the bit stream is within one-half of an lsb of the represented value to be at least some minimum threshold. But what should that threshold be? That will depend on what's important which will be related to why we said that we wanted eight bits of precision to begin with.

If I've reconstructed the math right, the probability that a random 256-bit bitstream will have exactly 50% 1s is about 5%. Maybe that's good enough, maybe it's not. But how could we improve it?

If we quadruple the length of our bitstream to 1024 bits, then if the number of 1s is 510, 511, 512, 513, or 514 we are within 0.5 LSB of the intended value. While the likelihood of getting exactly 0.5 goes down is cut in half, the fact that we can now accept five nearby values as being identical makes up for it and raises the likelihood of getting a correct number to about 10%. Hence you see the exponential scaling problem -- we have to quadruple the length of the bit stream to double the probability of having the correct value. So to get the probability of not being off by more than 0.5 LSB down below 1%, we would likely need pushing (or perhaps even more) than a million bits.

So is generating two million-bit streams of random numbers, bitwise AND'ing them together, and then counting the number of 1s in the resulting bitstream going to be faster and consume less power than using an 8-bit multiplier? How would we speed it up? Well, we could put a million AND gates on the chip and do that part of it in parallel. But now we are using up that valuable die area again.

Then there is the question of how we generate these two random bit streams in the first place.

She talks about using the noise from a resistor as a source of true random numbers. Okay. Fine. Now how do you use that to produce a bit stream in which each bit has the specified probability of being a one?

There are ways to do it, but they are not fast.
 

xox

Joined Sep 8, 2017
936
The concept is very simple and is anything but new. Stochastic signal processing is used in many fields.

She very conveniently overlooks a few major issues.

She claims that all of the thousands of transistors that make up a high-precision multiplier can be replaced by a single AND gate. She then claims that this is a big deal because of the energy used by the multiplier as well as the silicon die area used and in a couple of places she implies that this stochastic approach would also be much faster because it only uses a single simple gate. But she conveniently forgets something that she had previously relied on in explaining how it works -- that this single AND gate has to be walked bit-by-bit across the stream of random bits that represent the two numbers. While she does point out that accuracy requires lots of random samples and that speed and accuracy are tradeoffs, she very conveniently fails to give any indication of how long the bitstreams would have to be in order to achieve even a modest degree of accuracy and precision. The problem is that the needed length of the bitstream grows exponentially with the needed resolution and again grows exponentially with the required precision.

To grasp the significance of this, let's consider that we want to be able to represent something with just eight bits of precision. That means that we want to be able to distinguish 256 different values, If these values are represented via the density of a bit stream, then at the very minimum we would have to have a stream that was 255 bits long such that the number is represented by the number of bits in that stream that are set. So we have to run a 256-bit stream through our simple AND gate, which is going to take time and consume energy.

But if only it were only that bad.

The bits in the stream are stochastic, with each bit being a 0 or 1 independent of any of the other bits in the stream. The value being represented merely determines the probability of each bit being set.

With a stream that is 255 bits long, it is virtually guaranteed that the stream that is generated will represent the wrong value.

A simple example that most people here can probably relate to is the classic coin-flipping experiment.

If we have the proverbial "fair coin", the the probability of a 1 (aka, heads) is 0.5, thus the resulting bit stream will represent the number 0.5 (or, in binary, 0.1b).

Since we are wanting eight bits of precision, that means that we want our bitstream to represent 0.5 (i.e., 0.10000000b) as being different from 0.01111111b and 0.10000001b (as well as every other eight bit value in the range 0 <= x < 1).

Now, since this is a stochastic process, we can't guarantee this. The best we can do is say something like we want probability that the bit density of the bit stream is within one-half of an lsb of the represented value to be at least some minimum threshold. But what should that threshold be? That will depend on what's important which will be related to why we said that we wanted eight bits of precision to begin with.

If I've reconstructed the math right, the probability that a random 256-bit bitstream will have exactly 50% 1s is about 5%. Maybe that's good enough, maybe it's not. But how could we improve it?

If we quadruple the length of our bitstream to 1024 bits, then if the number of 1s is 510, 511, 512, 513, or 514 we are within 0.5 LSB of the intended value. While the likelihood of getting exactly 0.5 goes down is cut in half, the fact that we can now accept five nearby values as being identical makes up for it and raises the likelihood of getting a correct number to about 10%. Hence you see the exponential scaling problem -- we have to quadruple the length of the bit stream to double the probability of having the correct value. So to get the probability of not being off by more than 0.5 LSB down below 1%, we would likely need pushing (or perhaps even more) than a million bits.

So is generating two million-bit streams of random numbers, bitwise AND'ing them together, and then counting the number of 1s in the resulting bitstream going to be faster and consume less power than using an 8-bit multiplier? How would we speed it up? Well, we could put a million AND gates on the chip and do that part of it in parallel. But now we are using up that valuable die area again.

Then there is the question of how we generate these two random bit streams in the first place.

She talks about using the noise from a resistor as a source of true random numbers. Okay. Fine. Now how do you use that to produce a bit stream in which each bit has the specified probability of being a one?

There are ways to do it, but they are not fast.
I agree, she's definitely glossing over the details quite a bit (albeit in the sort of way one might expect from a pop-science YouTuber). I actually hacked together a semi-working implementation, but it mostly just confirms what you've said. Really no way to guarantee accuracy in the computation and the number of iterations required to converge on a "reasonable" approximation are pretty high.

Code:
#include <stdint.h>
#include <time.h>
#include <stdlib.h>
#include <stdio.h>

uint32_t random_bits(int bits) {
  static FILE* stream = NULL;
  if (stream == NULL) {
    stream = fopen("/dev/random", "rb");
    srand(time(0));
  }
  uint32_t value;
  if (stream != NULL)
    fread(&value, sizeof(value), 1, stream);
  else
    value = rand();  // FIXME: our fallback probably isn't random enough!
  return value & ((1 << bits) - 1);
}

int count_bits(uint32_t value) {
  int count = 0;
  do {
    ++count;
    value >>= 1;
  } while (value != 0);
  return count;
}

uint64_t stochastic_multiply(uint32_t lhs, uint32_t rhs, int trials) {
  int lhs_bits = count_bits(lhs);
  int rhs_bits = count_bits(rhs);
  int max_bits = lhs_bits + rhs_bits;
  int div_shift = count_bits(trials) - 1;
  if (div_shift > max_bits) {
    div_shift = max_bits;
    trials = 1 << max_bits;
  }
  uint64_t counter = 0;
  while (trials-- != 0)
    if ((random_bits(lhs_bits) < lhs) && (random_bits(rhs_bits) < rhs))
      ++counter;
  return counter << (max_bits - div_shift);
}

int main(int argc, char** argv) {
  puts(" Stochastic Multiplication");
  if (argc < 3 || argc > 4)
    return printf("Usage: %s [LHS] [RHS] [LOG2_TRIALS]\n", argv[0]);
  uint32_t lhs = atoi(argv[1]);
  uint32_t rhs = atoi(argv[2]);
  uint32_t log2_trials = 20;
  if (argc > 3)
    log2_trials = atoi(argv[3]);
  int trials = 1 << log2_trials;
  uint64_t result = stochastic_multiply(lhs, rhs, trials);
  printf("Trials            :     %d\n", trials);
  printf("LHS               :     %d\n", lhs);
  printf("RHS               :     %d\n", rhs);
  printf("Product (Actual)  :     %zu\n", uint64_t(lhs) * uint64_t(rhs));
  printf("Product (Computed):     %zu\n", result);
}
So yeah. Probably not the most viable technology. The process is however extremely fascinating. I mean, it's like "math out of thin air" or something!
 
Last edited:

WBahn

Joined Mar 31, 2012
32,847
I agree, she's definitely glossing over the details quite a bit (albeit in the sort of way one might expect from a pop-science YouTuber).
True that!

So yeah. Probably not the most viable technology. The process is however extremely fascinating. I mean, it's like "math out of thin air" or something!
It follows directly from the basic principles of probability.

If the probability of A is p1 and the probability of B is p2, and if A and B are independent, then the probability of A and B both occurring is p1*p2.

There are lots of ways to visualize this, each of which is probably (no pun intended) easier for some people to see the connection that others.

Here's one that comes to mind.

Imagine that I have lots of pennies and each penny is either a 2000 or a 2001 penny (i.e., is made in one of two possible years). Those pennies are also either manufactured at the Philly Mint or the Denver Mint (i.e., is made at one of two possible mints). So I have four possibilities -- 2000-P, 2000-D, 2001-P, and 2001-D.

By lots of pennies, let's just pick a big number, say something in excess of one million of each of the four possibilities. Doesn't matter how balanced they are, just that I have a lot of each kind available.

Now let's say that I have two numbers between 0 and 1 that I want to multiply together. Let's use n1 = 0.2 and n2 = 0.7.

I choose to represent n1 as the fraction of pennies made in 2000. So I might start by taking four empty buckets and in one of them putting 200,000 2000-P pennies and another 200,000 2000-D pennies. In the other two buckets, I put 800,000 of each of the 2001 pennies.

At this point, my buckets contain the following:

A (2000-P) 200,000
B (2000-D) 200,000
C (2001-P) 800,000
D (2001-P) 800,000

Now, I represent n2 as the fraction of pennies that are from the Philly mint. So I go to each bucket that has Philly pennies and I remove 30% of them. I go to each bucket that has Denver pennies and I remove 70% of them.

At this point, my buckets contain the following:

A (2000-P) 140,000
B (2000-D) 60,000
C (2001-P) 560,000
D (2001-P) 240,000

I now put all of the pennies into one huge bucket and mix them up. I have a total one million pennies, of which 140,000 are 2000 Philly pennies.

If I draw a penny at random from the bucket, what is the probability that it will be a 2000 Philly penny? 14%, which is 0.2 multiplied by 0.7, or n1*n2.

I can make the stochastic nature of this more obvious by being less exact and combining things earlier.

A thousand pennies ($10 worth) is about 2.5 kg. So say I start with $200 of each of the four types (giving me something over 100 lb of each kind).

First, I deal with n1 = 0.2 by taking 20 lb of each type of 2000 penny and putting them into two buckets -- a Philly bucket and a Denver bucket. I then take 80 lb of the 2001 pennies of each type and add them to the appropriate city bucket. Each bucket now has 100 lb of pennies, 20% of which were made in 2000. I thoroughly mix the contents of each bucket.

Next, I deal with n2 = 0.7 by taking 70 lb of pennies from the Philly bucket and combining that with 30 lb of pennies from the Denver bucket. I thoroughly mix up the contents of this bucket.

I know take out about a pound of pennies from this final bucket (which is about 1% of the contents and which will be about, very roughly, 200 pennies). I now count how many of them are 2000 Philly pennies and divide that by the total number that I pulled out and that is my estimate for n1*n2.

We can see all the many weak spots that kill the accuracy of our estimate. How exact were the various amounts that I took out? Did I weight those 80 lb of 2001 pennies down to the 2.5 g level and do so accurately? If not, then I only have approximately the intended fraction. The same for the other weights. When I pulled a from any bucket that had any kind of mix in it, was the mix of the sample I pulled exactly equal to the mix of the bucket as a whole? Pretty much guaranteed that it wasn't. What my final sample large enough to give me an estimate to the desired resolution, even if the mix ratio was right what it should be?
 
Last edited:

MrAl

Joined Jun 17, 2014
13,704
The concept is very simple and is anything but new. Stochastic signal processing is used in many fields.

She very conveniently overlooks a few major issues.

She claims that all of the thousands of transistors that make up a high-precision multiplier can be replaced by a single AND gate. She then claims that this is a big deal because of the energy used by the multiplier as well as the silicon die area used and in a couple of places she implies that this stochastic approach would also be much faster because it only uses a single simple gate. But she conveniently forgets something that she had previously relied on in explaining how it works -- that this single AND gate has to be walked bit-by-bit across the stream of random bits that represent the two numbers. While she does point out that accuracy requires lots of random samples and that speed and accuracy are tradeoffs, she very conveniently fails to give any indication of how long the bitstreams would have to be in order to achieve even a modest degree of accuracy and precision. The problem is that the needed length of the bitstream grows exponentially with the needed resolution and again grows exponentially with the required precision.

To grasp the significance of this, let's consider that we want to be able to represent something with just eight bits of precision. That means that we want to be able to distinguish 256 different values, If these values are represented via the density of a bit stream, then at the very minimum we would have to have a stream that was 255 bits long such that the number is represented by the number of bits in that stream that are set. So we have to run a 256-bit stream through our simple AND gate, which is going to take time and consume energy.

But if only it were only that bad.

The bits in the stream are stochastic, with each bit being a 0 or 1 independent of any of the other bits in the stream. The value being represented merely determines the probability of each bit being set.

With a stream that is 255 bits long, it is virtually guaranteed that the stream that is generated will represent the wrong value.

A simple example that most people here can probably relate to is the classic coin-flipping experiment.

If we have the proverbial "fair coin", the the probability of a 1 (aka, heads) is 0.5, thus the resulting bit stream will represent the number 0.5 (or, in binary, 0.1b).

Since we are wanting eight bits of precision, that means that we want our bitstream to represent 0.5 (i.e., 0.10000000b) as being different from 0.01111111b and 0.10000001b (as well as every other eight bit value in the range 0 <= x < 1).

Now, since this is a stochastic process, we can't guarantee this. The best we can do is say something like we want probability that the bit density of the bit stream is within one-half of an lsb of the represented value to be at least some minimum threshold. But what should that threshold be? That will depend on what's important which will be related to why we said that we wanted eight bits of precision to begin with.

If I've reconstructed the math right, the probability that a random 256-bit bitstream will have exactly 50% 1s is about 5%. Maybe that's good enough, maybe it's not. But how could we improve it?

If we quadruple the length of our bitstream to 1024 bits, then if the number of 1s is 510, 511, 512, 513, or 514 we are within 0.5 LSB of the intended value. While the likelihood of getting exactly 0.5 goes down is cut in half, the fact that we can now accept five nearby values as being identical makes up for it and raises the likelihood of getting a correct number to about 10%. Hence you see the exponential scaling problem -- we have to quadruple the length of the bit stream to double the probability of having the correct value. So to get the probability of not being off by more than 0.5 LSB down below 1%, we would likely need pushing (or perhaps even more) than a million bits.

So is generating two million-bit streams of random numbers, bitwise AND'ing them together, and then counting the number of 1s in the resulting bitstream going to be faster and consume less power than using an 8-bit multiplier? How would we speed it up? Well, we could put a million AND gates on the chip and do that part of it in parallel. But now we are using up that valuable die area again.

Then there is the question of how we generate these two random bit streams in the first place.

She talks about using the noise from a resistor as a source of true random numbers. Okay. Fine. Now how do you use that to produce a bit stream in which each bit has the specified probability of being a one?

There are ways to do it, but they are not fast.
Hi,

First, thanks for posting this detailed reply now I don't have to watch the whole video. I hate watching pseudo-science videos only to find out at the end that "we might have something new here". That's why I seldom watch videos anymore. It's almost always about somebody trying to make waves about nothing just so they can get noticed somehow. What a shame; it consumes atomic space and muddies up the internet which I think makes the real stuff seem less believable.

As to the probability factor, I think any of these methods are going to be much slower than anything else. It reminds me of solving for 'pi' by generating random 2d numbers. Yeah it works, but geeze, the number of random numbers needed to get anything even partly useful is sky high and that means a long processing time.
 

MrAl

Joined Jun 17, 2014
13,704
I agree, she's definitely glossing over the details quite a bit (albeit in the sort of way one might expect from a pop-science YouTuber). I actually hacked together a semi-working implementation, but it mostly just confirms what you've said. Really no way to guarantee accuracy in the computation and the number of iterations required to converge on a "reasonable" approximation are pretty high.

Code:
#include <stdint.h>
#include <time.h>
#include <stdlib.h>
#include <stdio.h>

uint32_t random_bits(int bits) {
  static FILE* stream = NULL;
  if (stream == NULL) {
    stream = fopen("/dev/random", "rb");
    srand(time(0));
  }
  uint32_t value;
  if (stream != NULL)
    fread(&value, sizeof(value), 1, stream);
  else
    value = rand();  // FIXME: our fallback probably isn't random enough!
  return value & ((1 << bits) - 1);
}

int count_bits(uint32_t value) {
  int count = 0;
  do {
    ++count;
    value >>= 1;
  } while (value != 0);
  return count;
}

uint64_t stochastic_multiply(uint32_t lhs, uint32_t rhs, int trials) {
  int lhs_bits = count_bits(lhs);
  int rhs_bits = count_bits(rhs);
  int max_bits = lhs_bits + rhs_bits;
  int div_shift = count_bits(trials) - 1;
  if (div_shift > max_bits) {
    div_shift = max_bits;
    trials = 1 << max_bits;
  }
  uint64_t counter = 0;
  while (trials-- != 0)
    if ((random_bits(lhs_bits) < lhs) && (random_bits(rhs_bits) < rhs))
      ++counter;
  return counter << (max_bits - div_shift);
}

int main(int argc, char** argv) {
  puts(" Stochastic Multiplication");
  if (argc < 3 || argc > 4)
    return printf("Usage: %s [LHS] [RHS] [LOG2_TRIALS]\n", argv[0]);
  uint32_t lhs = atoi(argv[1]);
  uint32_t rhs = atoi(argv[2]);
  uint32_t log2_trials = 20;
  if (argc > 3)
    log2_trials = atoi(argv[3]);
  int trials = 1 << log2_trials;
  uint64_t result = stochastic_multiply(lhs, rhs, trials);
  printf("Trials            :     %d\n", trials);
  printf("LHS               :     %d\n", lhs);
  printf("RHS               :     %d\n", rhs);
  printf("Product (Actual)  :     %zu\n", uint64_t(lhs) * uint64_t(rhs));
  printf("Product (Computed):     %zu\n", result);
}
So yeah. Probably not the most viable technology. The process is however extremely fascinating. I mean, it's like "math out of thin air" or something!
Hi,

There are a lot of interesting probability schemes that create math curiosities. I think of the generation of the number 'pi' using random numbers as an example. The problem is it always takes so long to get anything useful.

There are some problems that have to use probability, but for those that don't, I don't think there is anything faster using probability because decent precision requires a large number of trial and error experiments.
 

MrAl

Joined Jun 17, 2014
13,704
Seems like quite a few of you guys have delved deeply into all this, very interesting stuff too.

Anyone else here recall using "log books" at school? (I still have one somewhere around here...)
Hi,

You mean the log tables?
Ha ha, yeah, if anything went out the window any faster when computers came onto the scene I can't think of it :)

Now that I think about it, a lot of stuff became obsolete due to advanced math software.
Remember massive tables of integrals? Who needs them now.
Remember engineers? Who needs them now :)
 

Thread Starter

joeyd999

Joined Jun 6, 2011
6,297
This is one of the ways we used to perform analog multiplication (in the old days): convert the analog signals to PDM (pulse density modulation) waveforms, AND them together, and low-pass filter the result.

It's also the heart of modern day Delta Sigma converters.
 

xox

Joined Sep 8, 2017
936
There are a lot of interesting probability schemes that create math curiosities. I think of the generation of the number 'pi' using random numbers as an example. The problem is it always takes so long to get anything useful.

There are some problems that have to use probability, but for those that don't, I don't think there is anything faster using probability because decent precision requires a large number of trial and error experiments.
Yep, that almost always seems to be the case. Intriguing if inefficient.

Here's another: Select two N-bit numbers at random and take the maximum value. It turns out that the distribution of the result is precisely the same as if you had generated a single 2N-bit number at each iteration and then calculated the ceiling of the square root of that.

This can be generalized to K numbers; the max of K N-bit random numbers is "essentially the same" as taking the ceiling of the Kth root of a KN-bit number. (Which is to say that they produce identical distributions/histograms.) None of these are very useful of course because taking roots is invariably going to be more expensive than finding the max. But interesting nonetheless. :)

This is one of the ways we used to perform analog multiplication (in the old days): convert the analog signals to PDM (pulse density modulation) waveforms, AND them together, and low-pass filter the result.

It's also the heart of modern day Delta Sigma converters.
Wow, I almost can't believe I'd never heard of these methods before! I assume that the results were integers. How was the accuracy?
 

Futurist

Joined Apr 8, 2025
758
Hi,

You mean the log tables?
Ha ha, yeah, if anything went out the window any faster when computers came onto the scene I can't think of it :)

Now that I think about it, a lot of stuff became obsolete due to advanced math software.
Remember massive tables of integrals? Who needs them now.
Remember engineers? Who needs them now :)
I recall doing school math tests where each student was given one of these:

1760722786714.png

When calculators emerged I was stunned, even this (my first ever) was amazing to me and was my workhorse when studying electronics.

1760722911520.png

Commodore SR7919
 
Top