PIC 18F family - CORDIC division algorithm: dividend (up to 48 bits) / divisor (up to 32 bits)

Thread Starter

atferrari

Joined Jan 6, 2004
4,255
Ego massage mode maybe but based on genuine interest.

I recently became aware of the so called CORDIC division algorithm thanks to a post by member @cmartinez which brought me to this site. If I recall right, member @jpanhalt contributed there and maybe @Papabravo and Bob something (sorry).

Instead of limiting myself to 16 bits I decided to greedily work out a routine able to cope with up to 48 bits (dividend) and 32 bits (divisor). The algorithm gives de facto the remainder whether you need it or not.

Two hours ago I completed the first tests with valid results. Had to struggle with lot of clerical errors and a printed flow diagram was invaluable.

Code is not optimized in any way albeit seems to be correctly debugged but I do not plan to show it yet.

The calculations for the initial alignment of both MS bits plus the subsequent shiftings make for a lot of code, lot of RAM registers and a lot of T cycles.

My request: before deciding whether to stop here or to continue improving my code, I would like to hear comments if it sounds reasonable simply maybe because this CORDIC variety is not the way to go for huge numbers.

Gracias.

F2 79 D8 94 65 31 / BC = 01 4A 2E 16 93 9A | 19
 

cmartinez

Joined Jan 17, 2007
7,294
Hello Agustín, my friend. If you dig a little more, you will find that virtually every large processor out there makes use of an in-hardware resident look up table to perform division operations in the fastest possible way. Of course, emulating said technique would use either a lot of code, or a lot of ram. Another possibility would be to use an external chip containing said table that could be consulted whenever a division operation is required. But that would also be a slower and more expensive technique.

Here's an excellent article detailing a disaster suffered by the first generation of Intel's Pentium chip due to an error in its division look up table more than 25 years ago.
 

Thread Starter

atferrari

Joined Jan 6, 2004
4,255
Gracias César.

A priori, I do not see where a LUT could have part in a pure "CORDIC" algorithm (considering it is basically a repetitive substract divisor from dividend , shift divisor, add base index to quotient, shift base index).

For the little I know of all this, barrel shifters are the tool to have. What they can't do (because they do not exist in those micros) I have to do it with laborious multi step shifts.

The other part, which adds a huge overhead, is to calculate (and do) the initial alignment of the divisor's MSb under the dividend's MSb. Contradictory as it could seem, this complexity, in my case, adds to the convenience of being able to divide from FF FF FF FF FF FF /1 up to FF FF FF FF FF FF / FF FF FF FF.

I am already using MPLABX 5.40 what implies breakpoints in simulations being close to impossible; benchmarking this routine is illusory for now.
 

jpanhalt

Joined Jan 18, 2008
11,088
I had planned to respond earlier but had a whopper of a headache this morning. So, I mowed for 3 hours. That didn't help. I will write a more detailed response later with code examples.

I played with that algorithm quite a bit a couple of years ago. Some takeaways:

1) You can use indirect addressing to help in aligning the bits/bytes.
2) At some point, a decision must be made between speed and length of code. That is, aligning the msb(bit) gives a faster result, but longer code. All that needs to be done is ensure the divisor is equal to or greater than [(dividend/2) + 1]. It can be arbitrarily larger than that.
3) You can avoid "trial subtractions" (or their equivalent) by adding the divisor in the next step when the subtraction is negative. Of course, you rotate in zeros when it is negative.
4) Multi-precision math with 8-bit devices will require multiple registers, but no more than expected, e.g., 6 for dividend, etc.
 

Thread Starter

atferrari

Joined Jan 6, 2004
4,255
1) You can use indirect addressing to help in aligning the bits/bytes.

2) At some point, a decision must be made between speed and length of code. That is, aligning the msb(bit) gives a faster result, but longer code. All that needs to be done is ensure the divisor is equal to or greater than [(dividend/2) + 1]. It can be arbitrarily larger than that.

3) You can avoid "trial subtractions" (or their equivalent) by adding the divisor in the next step when the subtraction is negative. Of course, you rotate in zeros when it is negative.

4) Multi-precision math with 8-bit devices will require multiple registers, but no more than expected, e.g., 6 for dividend, etc.
Hola John

1) Yes, indirect addressing is a marvel. I used it extensively in many projects. Coding the John Conway's LIFE game for a 128x64 GLCD without it would have been a nightmare.

2) That will come later when I have improved the most crude details of my code. Optimizing even more the usage of indirect addressing is in the list.

3) I avoided them by using an orderly sequence of "compare" (18F family has 3 for >, = or <). Will be revised though; I have the feeling that it could be shortened radically.

4) Currently with absolute address:
DEND 5:0
DSOR 6:0
BASE_INDX 5:0
QUOTIENT 5:0

plus eight additional GPRs.
 
Last edited:
Top