Keil uVision4 for 8051 slow in doing float * float products. Why?

Thread Starter

beamrider

Joined Dec 14, 2011
3
Maybe I am missing a setting or an optimization. I do not know what to do to speed up the calculations.

I wanted to multiply two float numbers, af=15343E-34 and bf=-23474E21. I noticed that the Keil compiler generates assembly code that needs 271 clock cycles (on a Silabs C8051F121, an accelerated 8051) to compute the product.
Now, if I use a trick and calculate the mantissa and exponent of af * bf separately, I need only 63 clock cycles (see the code), i.e. the same result is obtained 4.3 times faster.
(Multiplying two float numbers reduces to multiplying two signed integer mantissas and adding two signed integer exponents).
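
The trick described above can be sketched in portable C as follows. This is only an illustration under stated assumptions: the names `dec_t` and `dec_mul` are hypothetical, the values are held as a decimal mantissa and a power-of-ten exponent (which is not necessarily how the compiler stores a `float`), and the mantissas are widened to 32 bits before multiplying so that the product fits.

```c
#include <stdint.h>

/* Hypothetical decimal "mantissa x 10^exponent" pair, for illustration only. */
typedef struct {
    int32_t mant;  /* signed decimal mantissa */
    int8_t  exp;   /* signed power-of-ten exponent */
} dec_t;

/* Multiply two such pairs: multiply the mantissas, add the exponents.
   No normalisation, rounding or exponent-range checks are done here. */
static dec_t dec_mul(int16_t am, int8_t ae, int16_t bm, int8_t be)
{
    dec_t p;
    p.mant = (int32_t)am * (int32_t)bm;  /* widen first so the product fits in 32 bits */
    p.exp  = (int8_t)(ae + be);
    return p;
}
```

With the numbers from this post, `dec_mul(15343, -34, -23474, 21)` gives a mantissa of -360161582 and an exponent of -13.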

I should mention that I had the same problem with AVR Studio 4.18 for Atmel AVR 8-bit microcontrollers: floating-point operations were incredibly slow. Fortunately there was a fix for AVR: a library (libm.a) had to be added to the project. With libm.a, float * float became 14 times faster (see: http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=114878 )

In the case of Keil I do not know what to do or which settings to change to speed up the calculations. Maybe you know and can help me.

One thing is sure: the 8051 is capable of multiplying float quantities at least 4 times faster than it does with Keil's default configuration.

Rich (BB code):
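#include <Si8250.H>
#include <stdio.h>
#include <math.h>

typedef unsigned char uint8_t;
typedef char int8_t;
typedef short int int16_t;
typedef long int int32_t;

uint8_t i;
volatile uint8_t q;

volatile int8_t ae8, be8;    // decimal exponents of a and b
volatile int8_t se8;         // sum of the exponents

volatile int16_t am16, bm16; // decimal mantissas of a and b
volatile int32_t pm32;       // product of the mantissas

volatile float af, bf;
volatile long pf;            // declared long, so af * bf is converted float -> long before the store

int main(void)
{
    ae8 = -34;
    be8 = 21;
    am16 = 15343;
    bm16 = -23474;
    af = 15343E-34;
    bf = -23474E21;

    for (i = 1; i < 5; i++)
    {
        pm32 = am16 * bm16; // cycle counter on reaching this line: 567
        se8 = ae8 + be8;    // 624 (integer multiply took 57 cycles)
        pf = af * bf;       // 630 (exponent add took 6 cycles)
        q = 1;              // 901 (float multiply and store took 271 cycles)
    }
    return 1;
}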
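
One caveat about the timing loop above, offered as a hedged observation rather than a diagnosis: Keil C51 uses a 16-bit `int`, so `am16 * bm16` is evaluated in 16 bits and only then widened for the store into `pm32`. The 57 cycles would therefore time a 16 x 16 -> 16 bit multiply whose result has overflowed. The sketch below is portable C for a desktop compiler; the helper names `mul16_wrapped` and `mul16_wide` are hypothetical, and the wrap-around is modelled as "keep the low 16 bits", which is what typically happens in practice.

```c
#include <stdint.h>

/* Hypothetical helper: what a 16 x 16 -> 16 bit multiply leaves behind,
   i.e. the low 16 bits of the product reinterpreted as a signed value. */
static int16_t mul16_wrapped(int16_t a, int16_t b)
{
    uint32_t full = (uint32_t)((int32_t)a * (int32_t)b);
    uint16_t low  = (uint16_t)(full & 0xFFFFu);
    /* Manual two's-complement reinterpretation of the low 16 bits. */
    return (low >= 0x8000u) ? (int16_t)((int32_t)low - 65536) : (int16_t)low;
}

/* Widened multiply: cast one operand so the product is computed in 32 bits. */
static int32_t mul16_wide(int16_t a, int16_t b)
{
    return (int32_t)a * b;
}
```

For 15343 and -23474, `mul16_wide` returns -360161582 while `mul16_wrapped` returns 24274. On the 8051 the widened form would be written `pm32 = (int32_t)am16 * bm16;` and would be expected to cost more than 57 cycles; that cost is not measured here.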
 

thatoneguy

Joined Feb 19, 2009
6,359
What speed do you get when using a gcc port for the 8051, or a different free/limited-use compiler?

If the other compilers also make faster code, submit a bug report to the vendor.
 

Thread Starter

beamrider

Only somebody with direct experience of Keil and the 8051 can answer my question.
I do not believe Keil has such bugs, because people generally speak of it with respect.
More likely it is a setting I am missing, a library I do not have, or something like that.
 

ErnieM

Joined Apr 24, 2011
8,377
beamrider said:
"Now, if I use a trick and calculate the mantissa and exponent of af * bf separately, I need only 63 clock cycles (see the code), i.e. the same result is obtained 4.3 times faster.
(Multiplying two float numbers reduces to multiplying two signed integer mantissas and adding two signed integer exponents)."

Multiplying two float numbers is far more complicated than just multiplying two signed integer mantissas and adding two signed integer exponents.
 

Thread Starter

beamrider

ErnieM said:
"Multiplying two float numbers is far more complicated than just multiplying two signed integer mantissas and adding two signed integer exponents."

Why far more!?

af=15343E-34;
bf=-23474E21;
af*bf= -360161582E-13

ae=-34; (exp of a)
be=21;
am=15343; (mantissa of a)
bm=-23474;

am*bm=-360161582
ae+be=-13

 

thatoneguy

If that is how your compiler breaks it down to asm.

Otherwise it loads the floating point library and does everything the hard way to cover all possibilities.

That's why I suggested a different compiler: its library may be more efficient. Or you could break it into integer math on your own, as you are doing.
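
To make "the hard way" concrete, here is a minimal sketch of what a software routine has to do for one single-precision multiply, assuming the IEEE-754 binary32 layout (1 sign bit, 8-bit biased exponent, 23-bit fraction with an implied leading 1). `soft_fmul` and the two bit-copy helpers are hypothetical names, not Keil's library routine, and the code is written for a desktop C compiler with 64-bit integers. Subnormal, NaN and infinity inputs are deliberately not handled.

```c
#include <stdint.h>
#include <string.h>

static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Multiply two normal (or zero) binary32 values using integer operations only. */
static float soft_fmul(float a, float b)
{
    uint32_t ua = float_to_bits(a), ub = float_to_bits(b);
    uint32_t sign = (ua ^ ub) & 0x80000000u;      /* sign of the product */
    int32_t ea = (int32_t)((ua >> 23) & 0xFFu);   /* biased exponents */
    int32_t eb = (int32_t)((ub >> 23) & 0xFFu);
    uint32_t ma, mb, m, rem;
    uint64_t prod;
    int32_t e;

    if (ea == 0 || eb == 0)
        return bits_to_float(sign);               /* zero; subnormals are flushed to zero here */

    ma = (ua & 0x7FFFFFu) | 0x800000u;            /* restore the implied leading 1 (24 bits) */
    mb = (ub & 0x7FFFFFu) | 0x800000u;
    prod = (uint64_t)ma * mb;                     /* 24 x 24 -> 48 bit product */
    e = ea + eb - 127;                            /* add exponents, remove one bias */

    if (prod & ((uint64_t)1 << 47))
        e += 1;                                   /* product in [2, 4): bump the exponent */
    else
        prod <<= 1;                               /* product in [1, 2): align to bit 47 */

    m   = (uint32_t)(prod >> 24);                 /* top 24 bits become the mantissa */
    rem = (uint32_t)(prod & 0xFFFFFFu);           /* discarded bits drive the rounding */
    if (rem > 0x800000u || (rem == 0x800000u && (m & 1u))) {
        m += 1;                                   /* round to nearest, ties to even */
        if (m == 0x1000000u) { m >>= 1; e += 1; } /* rounding overflowed the mantissa */
    }

    if (e <= 0)   return bits_to_float(sign);                /* underflow: flush to zero */
    if (e >= 255) return bits_to_float(sign | 0x7F800000u);  /* overflow: infinity */
    return bits_to_float(sign | ((uint32_t)e << 23) | (m & 0x7FFFFFu));
}
```

Unpacking, the 24 x 24 bit multiply, normalising, rounding, repacking and the range checks are all separate steps, and on an 8-bit CPU the wide multiply itself has to be assembled from 8-bit multiplies. That is plausibly where much of the gap between the library call and a bare integer multiply comes from.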
 