Compare Conventional and Alternative Division Methods

Thread Starter

jpanhalt

Joined Jan 18, 2008
11,087
This thread is related to a previous thread on binary division here: http://forum.allaboutcircuits.com/threads/divide-a-binary-number.121934/

Rather than continuing off topic, I decided to start a new thread to compare a "standard" multiprecison method for division with one based on the Kenyan multiplication algorithm but developed independently.

For the standard method I chose Peter Hemsley's from Piclist (http://www.piclist.com/techref/microchip/math/div/32by16ph.htm). I have used it many times and simply adapted it to 24-bit X 16-bit for this experiment.

For the Kenyan method, I simply followed the algorithm developed by G. Reichborn-Kjennerud and presented here: http://www.piclist.com/techref/method/math/muldiv.htm The code is written for 24-bit X 24-bit. I will refer to it as the R-K algorithm -- not to be confused with any other person living or dead. ;)

Both methods were simulated on MPLab 8.92 using MPLab SIM at 4 MHz. The pseudo-Kenyan method uses the instruction set from an enhanced mid-range PIC 16F1829, namely the subwfb instruction. My code is still pretty crude, and I do intend to post it once it is cleaned up. The common subwf(b) method destroys the number being subtracted from (minuend) and my current code uses a shadow register to reconstruct those registers when needed. I don't like doing that, if it can be avoided. It can be done other ways, but the added loop time for doing that may be self-defeating. For example, this destroys the minuend, which ends up with the result:
Code:
     movf      T2L,w     ;
     subwf     T1L,f     ;
     movf      T2H,w     ;
     subwfb    T1H,f     ;
     nop
This preserves the minuend, but takes more time:
Code:
     movf      T2L,w     ;
     subwf     T1L,w
     movwf     T3L     ;
     movf      T2H,w     ;
     subwfb    T1H,w   ;
     movwf     T3H
     nop
Cutting to the chase, here are my timing results:
upload_2016-5-21_7-1-14.png
The Hemsley method uses rotations and has a relatively fixed time. The alternative method depends on the number of subtractions needed. It can be quite short when a number is divided by itself or long when the divisor is 1. My code does not check (yet) for a zero divisor.

Although no question is presented here, I would like to hear from others who have efficient multi-precision methods or who have also tried the Reichborn-Kjennerud algorithm themselves.

Regards, John
 

Thread Starter

jpanhalt

Joined Jan 18, 2008
11,087
Due to the overwhelming interest, it looks like time to put this baby to bed.

I updated the code a little (posted below). Basically, one must do a trial subtraction. If that result is negative (STATUS,0 is clear), a zero is rotated into the result. If the subtraction result is positive, a 1 is rotated into the result. In either instance, the subtrahend is divided by 2 and the process is repeated. Subtraction often results in destruction of the minuend registers. The first approach shown above used back-up registers and restored from those registers when a subtraction gave a negative result ("failed"). In today's version, the subtraction is done twice. The first time avoids destruction of the minuend registers. Then if needed, the subtraction is repeated with the result put in to the minuend registers as usual. It seemed like a quotient with a lot of set bits (e.g., dec. 63) would take longer than one with a lot of clear bits (e.g, dec. 64). That is the case; however, the double-subtraction method is still faster.

Here is a revised chart from my first post:
upload_2016-5-22_10-37-35.png
Here is the revised code:
Code:
;Author: ©JPANHALT
;Date: May 22,2016

    list        p=16f1829      ; list directive to define processor
    #include    <p16f1829.inc> ; processor specific variable definitions
     errorlevel -302,-305
     RADIX     DEC
    
    __CONFIG _CONFIG1, _FOSC_INTOSC & _WDTE_OFF & _PWRTE_OFF & _MCLRE_ON & _CP_OFF & _CPD_OFF & _BOREN_OFF & _CLKOUTEN_ON & _IESO_OFF & _FCMEN_OFF
    __CONFIG _CONFIG2, _WRT_OFF & _PLLEN_OFF & _STVREN_OFF & _BORV_19 & _LVP_OFF

     cblock    0x20
     T1U
     T1H
     T1L
     T2U
     T2H
     T2L
     count
     resultL
     resultH
     endc

    org 0x0000

     goto      Start
Start
;T1(24bit)/T2(16bit)
     movlw     high(881) 
     movwf     T2H
     movlw     low(881)   
     movwf     T2L
     call      Refresh
     clrf      count
     clrf      T2U
     clrf      resultL
     clrf      resultH
     nop
Kenyan
     incf      count,f
     movf      T2L,w
     subwf     T1L,f
     movf      T2H,w
     subwfb    T1H,f
     movf      T2U,w
     subwfb    T1U,f
     btfss     STATUS,0
     goto      Calculate
     lslf      T2L
     rlf       T2H
     rlf       T2U
     goto      Kenyan

Calculate
     call      Refresh
  
Cycle                       
;T1 registers are destroyed during subtraction.  When the subtrahend is too
;big, which results in a zero being rotated into the result, those registers
;must be recovered.  Back-up registers can be used; however, several steps are
;required to restore them.  Doing a trial subtraction and preserving the
;minuend registers was slightly faster.
     movf      T2L,w
     subwf     T1L,w
     movf      T2H,w
     subwfb    T1H,w
     movf      T2U,w
     subwfb    T1U,w
     btfss     STATUS,0       ;if C=0, subtrahend too big divide by 2 and repeat
     goto      Result
     movf      T2L,w          ;subtractand OK, repeat subtraction w/saves in f
     subwf     T1L,f
     movf      T2H,w
     subwfb    T1H,f
     movf      T2U,w
     subwfb    T1U,f          ;carry will be set 
Result     
     rlf       resultL,f      ;carry bit is set or clear correctly from above
     rlf       resultH,f
     decfsz    count,f
     goto      DivideT2
     goto      Finish
DivideT2
     lsrf      T2U
     rrf       T2H
     rrf       T2L
     goto      Cycle

Finish
     nop                      ;answer in resultH and resultL

;Subroutine

Refresh 
     movlw     upper(55536)
     movwf     T1U
     movlw     high(55536)
     movwf     T1H
     movlw     low(55536)
     movwf     T1L
     return   

     end
Regards, John
 
Top