Interesting Algorithms

MrAl

Joined Jun 17, 2014
13,704
The question of how you know how many digits are truly significant takes on added emphasis given your example.

The sine(π/4) = √2 / 2

Your result and the result from Wolfram Alpha:

Code:
0.7071067811865475244008443621048490392848359376884740365883398689 409498
0.7071067811865475244008443621048490392848359376884740365883398689 953662392310535194251937671638207864
As we can see, the match for 65 sig figs, but part company after that. So there's no point using whatever approach you used beyond that.

I think that, for high-precision computation of trig functions, that the preferred approach is to use algebraic/geometric means and elliptic integrals.
Hi,

This question is quick to answer so I'll do this one first.

The application for that result only need 50 digits but I calculate more digits to make sure at least 50 are correct, then often just truncate the result to 50 digits. I use that for a lot of things that are not too demanding. It's just like a hand calculator that does various little tasks and has 16 immediate memories unless we consider the disk saves which gives almost unlimited memory space. It's linked to other programs similar to a 'plugin' so it can do other things like units conversions. it's not really meant to be used for very complicated math tasks.
 

WBahn

Joined Mar 31, 2012
32,847
Hi,

This question is quick to answer so I'll do this one first.

The application for that result only need 50 digits but I calculate more digits to make sure at least 50 are correct, then often just truncate the result to 50 digits.
Makes sense.

Though it begs the question (and this is a question that is generally germane) of how one determines how many more digits are needed in order to ensure that the desired number of digits are accurate.
 

MrAl

Joined Jun 17, 2014
13,704
Maybe you already mentioned this previously, but I couldn't spot it quickly. What was the application where 800 digits wasn't enough?

A proton's diameter is approximately 0.84 fm (0.84 x 10^-15 m).

The diameter of the visible universe is about 93 billion light-years, which is about 880 Ym (0.88 x 10^27 m).

To represent that distance to the resolution of the diameter of a proton would thus require just 42 digits.

Number all of the fundamental particles in the known universe would require about 80 digits.

So what application needs more than 800 digits?

And how do you know how many digits in whatever result you get are truly significant?
Hi,

This is actually a very good question. Now that I think about I am a little surprised nobody asked before this.

The answer in my case is that the ultimate reason is I like to be prepared for the need for more precision in case I need it in the future. In that respect it is more for pure math than for applied math like that used in engineering. Engineering math is a walk in the park compared to theoretical math which can require billions of digits.
Some applications in experimental mathematics would include:
1. Proving that pi has no repeating patterns.
2. Finding zeros of the zeta function (some found are over 1000 digits).

The secondary reason for me was to calculate the diameter of a circle that encloses other N circles of smaller diameter to high precision. For example, if N=2 then the enclosing circle diameter is simply 1+2 . For N=3 it's a little more complicated because they spread out in 2d space, and N=4 more complicated, etc., etc. The diameters of the N smaller circles would be N=1 has diameter 1, N=2 has diameters 1 and 2, and N=3 has diameters 1, 2, and 3, etc., etc. It's a sort of curiosity more than having a real world application, I think. I used either 1024 or 2048 digits can't remember which it was now.

Another reason for me is to calculate a result with more digits than another so I can check the lesser with the result that has more digits. For example, if I calculate 1.002 and with the more accurate result I get 1.0020 I will start to think that 1.002 was accurate to those 4 digits. If in doubt, I go higher and higher with more digits to look for a trend. If I get 1.0021 I might go to more digits than that.

The way to check these is to design a function y=f(x) and design the inverse function at the same time x=g(y). You can then check that g(y) gives the original result 'x' that was used with y=f(x). It has to match withing the required error.
Another way is to calculate a host of results and match them against slower methods just for the matter of testing the better algorithm.

The way I checked the result I provided earlier though was I only needed 64 digits.

I do have the algorithm now that I had mention originally in this thread and other threads asking for ideas from other members on how it might have been developed because I can't remember. Since I have it now though I was going through the process of reverse engineering it but haven't gotten back to it yet. I have to boil it back down to the basic algorithm.
Maybe I should get back to doing that so I can post the algorithm and then we can all compare it to the other ones we talked about here, which are also interesting BTW.

If we read around the web we always see that the Taylors series is slow to converge but that does not mean that I would not use it. I'm sure I used it in the past at some point, and if I wanted to use it but preferred the Chebyshev I might still use it to avoid the more complex coding of the latter.
I don't know if anyone used the very old Sinclair computer, but in that ROM they used the Chebyshev for trig functions.

I still have one more mystery to solve someday though. It's about representing an ellipse using just one variable instead of two (replace 'a' and 'b' with just 'c'). For the life of me I can't remember how to do this either anymore. It greatly simplifies some elliptic integrals used in magnetic field calculations like for inductors. That's because sometimes the shape of the curve is more important than the actual dimensions, but that's all I remember unfortunately. It amazed me when I first learned about it, but that one I did not develop myself I read about it somewhere.

Anymore ideas from anyone would still be interesting to hear about.
 

MrAl

Joined Jun 17, 2014
13,704
Makes sense.

Though it begs the question (and this is a question that is generally germane) of how one determines how many more digits are needed in order to ensure that the desired number of digits are accurate.
There must be a trend because I see this a lot. The RKF45 algorithm uses this idea too. The 5th order DE solver checks the results of the 4th order solver. I think that is a standard way to do this, but having an inverse function is good to have too.
 

sparky 1

Joined Nov 3, 2018
1,218
A look at Matlab code for a simple multiplication ported into photonic processor.

A look at the network specs. click on pam-4 demo video icon
BCM87400 | 7 nm 400GbE PAM-4 PHY (8:4)

Code:
% Define weight matrix (crossbar multipliers)
W = [2 3;
     1 4];

% Define input vector (optical amplitudes)
x = [5; 6];

% Perform matrix-vector multiplication
y = W * x;

% Display results
disp('Input vector:');
disp(x);

disp('Weight matrix:');
disp(W);

disp('Output vector:');
disp(y);
 
Last edited:

Futurist

Joined Apr 8, 2025
753
Maybe you already mentioned this previously, but I couldn't spot it quickly. What was the application where 800 digits wasn't enough?

A proton's diameter is approximately 0.84 fm (0.84 x 10^-15 m).

The diameter of the visible universe is about 93 billion light-years, which is about 880 Ym (0.88 x 10^27 m).

To represent that distance to the resolution of the diameter of a proton would thus require just 42 digits.

Number all of the fundamental particles in the known universe would require about 80 digits.

So what application needs more than 800 digits?

And how do you know how many digits in whatever result you get are truly significant?
In some chaotic systems, differences even in a 1000th digit of an input, can lead to huge differences in future states.
 
Last edited:

MrAl

Joined Jun 17, 2014
13,704
I find a paradox there ... infinity can and does have repeatable patterns, last time I checked
Hi there and thanks for the reply,

Your statement teeters on the edge of reality, which makes it interesting because it helps us identify the difference between not only reality and theory, but the difference between one theory and another theory in the way that their respective contexts define how they should be understood and used by us. After all, even theory has to be grounded in reality eventually or else we don't know the theory well enough yet, or we don't know the reality behind it yet.

Here we have 'perhaps' the possible contrast between the idea of infinity as it stands by itself, and the theory of numbers. The theory of infinity is different than the theory of numbers most simply put I think by simply recognizing that infinity is not a number while pi is a number.

Infinity can have repeated patterns like 123412341234 etc., and we can find that pattern somewhere within pi I would bet, but the idea is that if you keep repeating that to an infinite number of digits then we will not be able to find that within pi.

Perhaps stating that 'repeating patterns' is not the best way to frame these ideas, we really need to think about repeating patterns that repeat an infinite number of times. Infinity can have that while pi cannot have that.
For example, in infinity we might find:
123123
and in pi we might find that also, and in infinity we might find:
123123123
and in pi we might find that also, and we can repeat this test out to many digits. However, if we repeat that pattern an infinite number of times we can't say that we can find that within the digits of pi. So we might call this a cyclic pattern.

If, on the other hand, pi did happen to settle onto that pattern, then it would no longer be considered to be irrational, which it is strongly believed that it actually is, based on over 200 trillion digits (don't remember how many offhand) and some logical proofs.

Lambert proved this a long time ago I think using the tangent function which is 'known' to produce irrational numbers from rational numbers like 1/2. If we take the tangent of pi/4 for example we get an exact number: 1 (one, no digits to the right of it). If we assumed that pi/4 was rational (because pi is assumed rational) then if we take the tangent of that we must get an irrational number, but we don't, we get a rational number. Something like that anyway :)
There's another proof somewhere too.

Now to the question that is probably starting to rise in everyone's mind:
Will pi ever be proven to have a pattern that repeats indefinitely?
Well it would be hard to believe because then pi, after a certain LARGE number of digits, would have to have a pattern that keeps repeating over and over without end. To be brief a shortened version might be something like:
pi=3.1415926... with more digits unknown yet, and then we find later:
pi=3.14159260123012301230123012301230123012301230123012301230123
where the patten 0123 repeats indefinitely.

This seems unlikely, but to ground that idea in reality I think we should at least consider the possibility. We then start to get really, really, real ("RRR" ha ha) because now we are precisely considering ALL the possibilities of not only nature but of our very own ability to reason.
In doing so, we might then ask for the thing that usually grounds everything: is there any application that exists for this. The answer for now is probably only in number theory not in science and engineering, but hey, you never know what the future holds. Maybe proving there is a repeated cyclic pattern in pi will help to prove something else in the future. We won't know until then.

If this seems a little long, just keep in mind you were the one who brought it up :)
 

MrAl

Joined Jun 17, 2014
13,704
A look at Matlab code for a simple multiplication ported into photonic processor.

A look at the network specs. click on pam-4 demo video icon
BCM87400 | 7 nm 400GbE PAM-4 PHY (8:4)

Code:
% Define weight matrix (crossbar multipliers)
W = [2 3;
     1 4];

% Define input vector (optical amplitudes)
x = [5; 6];

% Perform matrix-vector multiplication
y = W * x;

% Display results
disp('Input vector:');
disp(x);

disp('Weight matrix:');
disp(W);

disp('Output vector:');
disp(y);
Hi,

What are you trying to say here. Not sure what you are trying to show and if it is related to calculating sin(x) or something else.
 

MrAl

Joined Jun 17, 2014
13,704
In some chaotic systems, differences even in a 1000th digit of an input, can lead to huge differences in future states.
Hi,

Oh yes that's another area for a calculation using a large number of digits. I have not gotten into that area too much myself though.
I think fractal calculations might be another application.

I think a lot of us here are used to working with 8 or 16 digits, maybe 32 at most. That's for real world applications, and even a precision of 0.5 percent is often good enough which could mean only 4 digits :)

I did a lot of image processing routines, and there we have to think about doing calculations that come one after the other in order to get the final pixel results. Doing one operation usual does not affect anything, but if the precision is too low the errors can accumulate when we have a pipeline of operations that have to be completed. With the computers of today this gets easier because of the way the CPU's can handle math functions now.
 

Futurist

Joined Apr 8, 2025
753
Hi,

Oh yes that's another area for a calculation using a large number of digits. I have not gotten into that area too much myself though.
I think fractal calculations might be another application.

I think a lot of us here are used to working with 8 or 16 digits, maybe 32 at most. That's for real world applications, and even a precision of 0.5 percent is often good enough which could mean only 4 digits :)

I did a lot of image processing routines, and there we have to think about doing calculations that come one after the other in order to get the final pixel results. Doing one operation usual does not affect anything, but if the precision is too low the errors can accumulate when we have a pipeline of operations that have to be completed. With the computers of today this gets easier because of the way the CPU's can handle math functions now.
Are you old enough to remember this:

https://www.youtube.com/shorts/vp98qy9N_jM
 

MrAl

Joined Jun 17, 2014
13,704
Are you old enough to remember this:

https://www.youtube.com/shorts/vp98qy9N_jM
Hi,

I think I remember that, 1994, and another in 1997 the Pentium F00F bug, and another security issue in 2018 I think.

The one that had the biggest impact on my work was the not-as-well-known AMD Bulldozer Controversy.
They tried to bamboozle the public by calling their 'cores' 'modules', which were in fact degenerate cores. They were sued in 2015, a class action lawsuit that I was a part of.
I noticed the slowdown right away because I did a lot of image processing that was floating point intensive. Instead of 8 cores all with floating point units, they provided only 4 floating point units and 8 integer cores. But they still called it an 8 core CPU.
I am very surprised it took 3 or 4 years for anyone to do anything legal about it, but even after that I only got something like $35 USD while the CPU price as around $200 dollars. They should have replaced the entire CPU. There must have been a lot of people who were depending on the full 8 cores.
After that I moved to Intel exclusively, but I do know now that they changed the entire architecture which they called "Zen" and it might be doing better because the 8 core version does have 8 floating point units, from what I read anyway. So they went back to "core" and don't call them "modules" anymore.
There is a trend now for other CPU's to start degrading their cores which really sux. Some cores on the same chip may be limited either in capabilities or in speed. They do advertise this information though so you have to watch out.
I think the total payout for AMD was around 12 million USD, but that's nothing compared to the Intel problem which was almost 500 million I think. AMD got off cheap.
The only real solution for me was I had to develop a fixed point math package that used big integers to do fixed point instead of floating point. Luckily this resulted in only a small error which I don't think could be noticed with the human eyes when viewing an image that has been processed.
 

Futurist

Joined Apr 8, 2025
753
Hi,

I think I remember that, 1994, and another in 1997 the Pentium F00F bug, and another security issue in 2018 I think.

The one that had the biggest impact on my work was the not-as-well-known AMD Bulldozer Controversy.
They tried to bamboozle the public by calling their 'cores' 'modules', which were in fact degenerate cores. They were sued in 2015, a class action lawsuit that I was a part of.
I noticed the slowdown right away because I did a lot of image processing that was floating point intensive. Instead of 8 cores all with floating point units, they provided only 4 floating point units and 8 integer cores. But they still called it an 8 core CPU.
I am very surprised it took 3 or 4 years for anyone to do anything legal about it, but even after that I only got something like $35 USD while the CPU price as around $200 dollars. They should have replaced the entire CPU. There must have been a lot of people who were depending on the full 8 cores.
After that I moved to Intel exclusively, but I do know now that they changed the entire architecture which they called "Zen" and it might be doing better because the 8 core version does have 8 floating point units, from what I read anyway. So they went back to "core" and don't call them "modules" anymore.
There is a trend now for other CPU's to start degrading their cores which really sux. Some cores on the same chip may be limited either in capabilities or in speed. They do advertise this information though so you have to watch out.
I think the total payout for AMD was around 12 million USD, but that's nothing compared to the Intel problem which was almost 500 million I think. AMD got off cheap.
The only real solution for me was I had to develop a fixed point math package that used big integers to do fixed point instead of floating point. Luckily this resulted in only a small error which I don't think could be noticed with the human eyes when viewing an image that has been processed.
Very interesting, thanks for sharing.
 

MrAl

Joined Jun 17, 2014
13,704
Hello again,

Well unfortunately, I looked over the algorithm I found after many years of not using that computer, and realized I captured the wrong algorithm. It's not the one I had been talking about. Still interesting, but not as good as the one I was really looking for. So that means I still do not yet have the algorithm I wanted.

The one I did find is also interesting though. It's rather simple in theory. It uses one-third angle compression to repeatedly compress the value of x (for calculating sin(x)) and then computes the sine using a Taylor Series with highest power of 7. It then decompresses using the third angle formula.
So we divide the angle by 3 repeatedly until we get to a small threshold related to the number of digits of precision we need, then use a Taylors with highest power x^7 (show elsewhere in this thread by Bob), then use the expander:
y=3*z-4*z^3 (where simply z=y[n-1])
the same number of times we had to divide by 3 originally.
It works pretty nice really and has some advantages over a direct Taylors. I am not sure it is any faster I'd have to check, but there is no large integer division for example.

This was probably one of the precursors I used way back in the who knows what years before the newer algorithm came to mind. I originally thought it was the right algorithm because I had it coded in lines that incremented by 100 for each function, sine, cosine, tangent, exponential, etc.
This was brought on because the old TRS80 did not have built in support for double precision trig functions, only single precision which was a very sorry thing for them to do since add, subtract, multiply, and divide were all ok with double precision.
In the newer algorithm, I would first do y=sin(x) in raw built in code which would give me roughly 8 digits to start with, so no series needed for that, and then one pass of the new algorithm and I had 16 digits which was the target precision at the time. Of course all this was very slow compared to what we have now running on a 4MHz computer (ha ha). Yes that is 4MHz not 4GHz as the heart of that machine was a Z80 CPU running at 4MHz.

So the search for the newer algorithm continues... :)

I should note though that in this day and age it's more of a curiosity what I was doing back then rather than something I really need that bad. The computers today run fast enough for most things, at least until we get up to a huge number of digits and then things slow down pretty fast. I had to wait days with some calculations, but back in the day of the TRS80 it would have been months I bet, so I would have never even attempted it back then.
 

Thread Starter

joeyd999

Joined Jun 6, 2011
6,282
Hello again,

Well unfortunately, I looked over the algorithm I found after many years of not using that computer, and realized I captured the wrong algorithm. It's not the one I had been talking about. Still interesting, but not as good as the one I was really looking for. So that means I still do not yet have the algorithm I wanted.

The one I did find is also interesting though. It's rather simple in theory. It uses one-third angle compression to repeatedly compress the value of x (for calculating sin(x)) and then computes the sine using a Taylor Series with highest power of 7. It then decompresses using the third angle formula.
So we divide the angle by 3 repeatedly until we get to a small threshold related to the number of digits of precision we need, then use a Taylors with highest power x^7 (show elsewhere in this thread by Bob), then use the expander:
y=3*z-4*z^3 (where simply z=y[n-1])
the same number of times we had to divide by 3 originally.
It works pretty nice really and has some advantages over a direct Taylors. I am not sure it is any faster I'd have to check, but there is no large integer division for example.

This was probably one of the precursors I used way back in the who knows what years before the newer algorithm came to mind. I originally thought it was the right algorithm because I had it coded in lines that incremented by 100 for each function, sine, cosine, tangent, exponential, etc.
This was brought on because the old TRS80 did not have built in support for double precision trig functions, only single precision which was a very sorry thing for them to do since add, subtract, multiply, and divide were all ok with double precision.
In the newer algorithm, I would first do y=sin(x) in raw built in code which would give me roughly 8 digits to start with, so no series needed for that, and then one pass of the new algorithm and I had 16 digits which was the target precision at the time. Of course all this was very slow compared to what we have now running on a 4MHz computer (ha ha). Yes that is 4MHz not 4GHz as the heart of that machine was a Z80 CPU running at 4MHz.

So the search for the newer algorithm continues... :)

I should note though that in this day and age it's more of a curiosity what I was doing back then rather than something I really need that bad. The computers today run fast enough for most things, at least until we get up to a huge number of digits and then things slow down pretty fast. I had to wait days with some calculations, but back in the day of the TRS80 it would have been months I bet, so I would have never even attempted it back then.
Are you thinking of CORDIC?

Compliments Grok (because I'm lazy):

C:
// 32-bit fixed-point CORDIC sine (16.16 format)
int32_t cordic_sin(int32_t theta) {
    int32_t x = K; // K = 0.607252935 in 16.16 fixed-point (~39806)
    int32_t y = 0;
    int32_t z = theta;
    int32_t atan_table[16] = { ... }; // Precomputed arctan(2^-i) in 16.16

    for (int i = 0; i < 16; i++) {
        int32_t d = (z >= 0) ? 1 : -1;
        int32_t tx = x - (d * (y >> i));
        int32_t ty = y + (d * (x >> i));
        z = z - (d * atan_table[i]);
        x = tx;
        y = ty;
    }
    return y; // y contains sin(theta)
}
 

WBahn

Joined Mar 31, 2012
32,847
Hello again,

Well unfortunately, I looked over the algorithm I found after many years of not using that computer, and realized I captured the wrong algorithm. It's not the one I had been talking about. Still interesting, but not as good as the one I was really looking for. So that means I still do not yet have the algorithm I wanted.

The one I did find is also interesting though. It's rather simple in theory. It uses one-third angle compression to repeatedly compress the value of x (for calculating sin(x)) and then computes the sine using a Taylor Series with highest power of 7. It then decompresses using the third angle formula.
So we divide the angle by 3 repeatedly until we get to a small threshold related to the number of digits of precision we need, then use a Taylors with highest power x^7 (show elsewhere in this thread by Bob), then use the expander:
y=3*z-4*z^3 (where simply z=y[n-1])
the same number of times we had to divide by 3 originally.
It works pretty nice really and has some advantages over a direct Taylors. I am not sure it is any faster I'd have to check, but there is no large integer division for example.

This was probably one of the precursors I used way back in the who knows what years before the newer algorithm came to mind. I originally thought it was the right algorithm because I had it coded in lines that incremented by 100 for each function, sine, cosine, tangent, exponential, etc.
This was brought on because the old TRS80 did not have built in support for double precision trig functions, only single precision which was a very sorry thing for them to do since add, subtract, multiply, and divide were all ok with double precision.
In the newer algorithm, I would first do y=sin(x) in raw built in code which would give me roughly 8 digits to start with, so no series needed for that, and then one pass of the new algorithm and I had 16 digits which was the target precision at the time. Of course all this was very slow compared to what we have now running on a 4MHz computer (ha ha). Yes that is 4MHz not 4GHz as the heart of that machine was a Z80 CPU running at 4MHz.

So the search for the newer algorithm continues... :)

I should note though that in this day and age it's more of a curiosity what I was doing back then rather than something I really need that bad. The computers today run fast enough for most things, at least until we get up to a huge number of digits and then things slow down pretty fast. I had to wait days with some calculations, but back in the day of the TRS80 it would have been months I bet, so I would have never even attempted it back then.
A common algorithm used in integer-only processors was CORDIC, which use shift-add operations with no multiplications or divisions. Back then, multiplication was expensive and division was astronomically so. Today, on many processors, integer multiplication is pretty tame, though division is still quite expensive. So there might be some algorithms, perhaps variants on CORDIC, that leverage this change to trade of doing lots of shift-add operations in exchange for a few multiplications. I don't know what the general methods that are used in modern embedded systems and simple handheld calculators. My guess is that it depends on a number of factors and I wouldn't be surprised if CORDIC is still used in lower-end systems because it is amenable to cheap processors and low-power operation. Even in the processors available in the 1980s, such as the Sharp scientific calculators I was partial to in high school, these operations were fast enough that there was no perceptible delay and the batteries typically lasted for several years of pretty heavy use, so there probably wasn't a compelling reason to switch to something else unless there was an obvious win in other areas. But in more capable processors, particularly if they have floating point support, they likely use some kind of series approximation, probably with range reduction and/or argument transformation. Of course, for application-specific designs, I would imagine that some use table-lookup and interpolation, particularly in real-time systems that need to be very responsive.
 

MrAl

Joined Jun 17, 2014
13,704
Are you thinking of CORDIC?

Compliments Grok (because I'm lazy):

C:
// 32-bit fixed-point CORDIC sine (16.16 format)
int32_t cordic_sin(int32_t theta) {
    int32_t x = K; // K = 0.607252935 in 16.16 fixed-point (~39806)
    int32_t y = 0;
    int32_t z = theta;
    int32_t atan_table[16] = { ... }; // Precomputed arctan(2^-i) in 16.16

    for (int i = 0; i < 16; i++) {
        int32_t d = (z >= 0) ? 1 : -1;
        int32_t tx = x - (d * (y >> i));
        int32_t ty = y + (d * (x >> i));
        z = z - (d * atan_table[i]);
        x = tx;
        y = ty;
    }
    return y; // y contains sin(theta)
}
Hi,

Well, no that's not it either, but thanks for posting that too it looks interesting as well.

I have a feeling this algorithm is nowhere to be found online because it was developed by myself and it was unlike any other that I have ever seen. That's mostly because of how fast the result gets better and better, doubling the number of precise digits for each pass. That's quite different I think because if we got up to 32 digits, the next pass would give us 64 digits, and the next 128 digits, etc.

It has bugged me for several years now that I could not remember how I did this nor could I find it anywhere written down, although I am sure I had it written down somewhere because I told someone not only the algorithm itself but how it was developed, and they were a member here on this forum.
It's starting to bug me again so maybe I'll try to find it again. I'll certainly post it if I find it again, and post some comparison test runs to see how it works.

I might roughly compare this to the difference between using a Taylors method vs Gears method for solving ODE's. Taylors is series based while Gears includes an iterative phase, which does much better for nonlinear stuff. In this case though Gears is a more complicated algorithm.

I hope I can find this, and if not, I'll have to try to remember (again) how I came about finding this solution. It was not that complicated; it just was tricky. It was just one of those things that rarely comes up.

All I can remember at this point is I started by thinking about how asin() is the inverse function of sin(), and if we can use asin() to check sin() and find a way to get that to converge, then we have an algorithm for sin(). There is a chance though that I moved away from that idea but I just can't remember that far back.
What happened was once I moved from the TRS80 to the more modern computers with Intel CPU's, I ran into the 386 with math coprocessor, then the 486 with that built right into the CPU, and at that time I only needed 16 digits of precision so I pretty much stopped using the TRS80 and the sin(x) algorithm that I desperately needed for that machine due to its limitations. The new CPU's were so much faster too, so I thought I would never need the algorithm anymore. Then I started talking to someone here about solving something and I remembered enough to explain it to them, but that was probably 10 years or more ago already, and I have not talked about it again after that.

Memory can fade with time, that's the bigger problem :)
 

MrAl

Joined Jun 17, 2014
13,704
A common algorithm used in integer-only processors was CORDIC, which use shift-add operations with no multiplications or divisions. Back then, multiplication was expensive and division was astronomically so. Today, on many processors, integer multiplication is pretty tame, though division is still quite expensive. So there might be some algorithms, perhaps variants on CORDIC, that leverage this change to trade of doing lots of shift-add operations in exchange for a few multiplications. I don't know what the general methods that are used in modern embedded systems and simple handheld calculators. My guess is that it depends on a number of factors and I wouldn't be surprised if CORDIC is still used in lower-end systems because it is amenable to cheap processors and low-power operation. Even in the processors available in the 1980s, such as the Sharp scientific calculators I was partial to in high school, these operations were fast enough that there was no perceptible delay and the batteries typically lasted for several years of pretty heavy use, so there probably wasn't a compelling reason to switch to something else unless there was an obvious win in other areas. But in more capable processors, particularly if they have floating point support, they likely use some kind of series approximation, probably with range reduction and/or argument transformation. Of course, for application-specific designs, I would imagine that some use table-lookup and interpolation, particularly in real-time systems that need to be very responsive.
Hi,

Yes that's an interesting algorithm too, probably targeted for hardware.
I might be able to adapt it in some way, but I'd still like to find my old algorithm I've been talking about.

I tried to search my PM's but can't find a way to do that. It seems to only search the forums and blogs.
 

Futurist

Joined Apr 8, 2025
753
Hello again,

Well unfortunately, I looked over the algorithm I found after many years of not using that computer, and realized I captured the wrong algorithm. It's not the one I had been talking about. Still interesting, but not as good as the one I was really looking for. So that means I still do not yet have the algorithm I wanted.

The one I did find is also interesting though. It's rather simple in theory. It uses one-third angle compression to repeatedly compress the value of x (for calculating sin(x)) and then computes the sine using a Taylor Series with highest power of 7. It then decompresses using the third angle formula.
So we divide the angle by 3 repeatedly until we get to a small threshold related to the number of digits of precision we need, then use a Taylors with highest power x^7 (show elsewhere in this thread by Bob), then use the expander:
y=3*z-4*z^3 (where simply z=y[n-1])
the same number of times we had to divide by 3 originally.
It works pretty nice really and has some advantages over a direct Taylors. I am not sure it is any faster I'd have to check, but there is no large integer division for example.

This was probably one of the precursors I used way back in the who knows what years before the newer algorithm came to mind. I originally thought it was the right algorithm because I had it coded in lines that incremented by 100 for each function, sine, cosine, tangent, exponential, etc.
This was brought on because the old TRS80 did not have built in support for double precision trig functions, only single precision which was a very sorry thing for them to do since add, subtract, multiply, and divide were all ok with double precision.
In the newer algorithm, I would first do y=sin(x) in raw built in code which would give me roughly 8 digits to start with, so no series needed for that, and then one pass of the new algorithm and I had 16 digits which was the target precision at the time. Of course all this was very slow compared to what we have now running on a 4MHz computer (ha ha). Yes that is 4MHz not 4GHz as the heart of that machine was a Z80 CPU running at 4MHz.

So the search for the newer algorithm continues... :)

I should note though that in this day and age it's more of a curiosity what I was doing back then rather than something I really need that bad. The computers today run fast enough for most things, at least until we get up to a huge number of digits and then things slow down pretty fast. I had to wait days with some calculations, but back in the day of the TRS80 it would have been months I bet, so I would have never even attempted it back then.
You sound like a bit of an expert in this area. Years ago I developed a compiler and needed a floating point library for the Intel CPU. I tracked down a guy who had one, the licensing was pricey, no doubt a money spinner for him.
 
Top