Problem with representing real numbers in the binary system

Thread Starter

Ghina Bayyat

Joined Mar 11, 2018
135
Thank you all for your help, but all of you said that the fixed point representation is more common and easier. So why do many books and many videos on YouTube say that the floating point representation is the more common way? I'm really getting lost now. Can anyone please give me a summary, or is there a table including all the ways of representing negative and real numbers? Which one is better, or which one is used nowadays? And how can I decide which way to use? Does it depend on the computer and on the number?
 

Papabravo

Joined Feb 24, 2006
15,092
Normally floating point calculations are done with dedicated hardware. This is expensive in terms of chip real estate. In low end microcontrollers it is common to support only integer operations. In some cases you don't even get integer multiplication and division.
 

andrewmm

Joined Feb 25, 2011
1,048
Quote us the books, and we might be able to comment more,

This is posted in digital design, ( see the end )

In digital design, despite what the silicon vendors say, silicon is expensive, both in dollars and in power used.

One of the main skills of a digital designer is to use the "most efficient" method of achieving the requirement,

Floating point is always going to be an order of magnitude or more larger in silicon and power than a fixed point solution,
If minimum time to develop is the requirement, and you have dollars and watts to chuck at the problem, go floating point,

As an example, take the classic problem sqrt ( A*A + B*B)

where A and B are in range 0 to 1

Simple answer , is to use real fractional numbers, and a sqrt algorithm,
this would take a bunch of silicon,

Alternatively you could use largest + 1/2 smallest (the larger of A and B plus half the smaller), and do it all in fixed point,
it would be up to about 12 percent in error, but maybe that's good enough.
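A minimal sketch of the "largest + 1/2 smallest" shortcut, using integers only. The Q15 scaling (1.0 stored as 32768) and all function names are assumptions made for illustration, not something from the post. The scan at the end measures how far the estimate strays from the true sqrt(A*A + B*B) over a grid of inputs.

```python
import math

FRAC_BITS = 15            # assumed Q15 scaling: 1.0 is stored as 32768
ONE = 1 << FRAC_BITS

def to_fixed(x):
    """Convert a float in [0, 1] to an unsigned fixed point integer."""
    return int(round(x * ONE))

def approx_magnitude(a, b):
    """Estimate sqrt(a*a + b*b) as largest + smallest/2, integers only."""
    largest, smallest = (a, b) if a >= b else (b, a)
    return largest + (smallest >> 1)   # halving is a single right shift

def worst_relative_error(steps=200):
    """Scan a grid of inputs and return the largest relative error seen."""
    worst = 0.0
    for i in range(1, steps + 1):
        for j in range(0, steps + 1):
            a, b = i / steps, j / steps
            exact = math.hypot(a, b)
            approx = approx_magnitude(to_fixed(a), to_fixed(b)) / ONE
            worst = max(worst, abs(approx - exact) / exact)
    return worst
```

The worst case sits where the smaller input is half the larger one; there the estimate is 1.25 against a true value of about 1.118. No multiplier or square root is needed, which is the silicon saving being described.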

digital design is all about understanding the constraints and requirements, and compromises,

BUT

Then you mention CPU,
now CPU is not digital design, it's programming,

Traditionally, a CPU did not have a floating point unit, thus floating point numbers were "evaluated" in software in multiple steps, i.e. they were traditionally much slower than integers,

More modern CPUs have floating point units which are almost as fast as the integer units,
so the argument for using integers only in small programs is weaker,
the saving from using the integer unit instead of the floating point unit is maybe only 10:1 in execution time,

To a first approximation, if a small program took say 10ms in floating point , it would take 1ms in integer.
who cares,

but

if your program took 10 hours to run and going integer brought it down to 1 hour, then it's a different story,
 

jpanhalt

Joined Jan 18, 2008
11,088
Normally floating point calculations are done with dedicated hardware. This is expensive in terms of chip real estate. In low end microcontrollers it is common to support only integer operations. In some cases you don't even get integer multiplication and division.
Sounds "good," but I have never seen a low end MCU (limited to 12F5xx and above) that didn't support fixed point math including routines with multiplication and division. Of course, those are not single step operations either.
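For readers wondering what a multiplication "routine" looks like on a part with no multiply instruction, here is a rough sketch of the classic shift-and-add method. The function name and the 8-bit default width are assumptions chosen for illustration; a real MCU library would be written in assembler and tuned for the device.

```python
def shift_add_multiply(a, b, width=8):
    """Multiply two unsigned width-bit integers using only shifts and adds."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    for _ in range(width):      # one pass per multiplier bit
        if b & 1:               # low bit set: add the shifted multiplicand
            product += a
        a <<= 1                 # next bit is worth twice as much
        b >>= 1
    return product
```

The loop runs once per bit, which is why these operations take many cycles rather than one.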
 

MrChips

Joined Oct 2, 2009
22,893
No one here is saying one is better than the other. There is never a "best" solution. It depends on your application and implementation options.

Floating point is costly in one way or another.
If the MCU does not have FP HW then you have to implement it in SW.
If you implement it in SW it will require lots of program space and will take lots of execution cycles.
FP math is not exact. There are errors in FP computation.

Integer arithmetic is exact.
Integer computation in SW is fast and efficient. Many MCUs have HW multiply and divide. This makes computation even faster and more code-space efficient.

If you are running programs such as spreadsheet or MATLAB on a PC, go ahead and use FP. I am not going to quibble. If you are building an embedded system on an MCU my choice would be integer arithmetic for speed and efficiency.

I once had to calculate dewpoint on an MCU. There was no way to implement this in FP in 2K bytes on an 8-bit MCU.
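A small illustration of the "FP is not exact, integer is exact" point made in this post. It assumes the quantity can be scaled to whole hundredths; the function names are made up for the example. Integer results stay exact only as long as they fit in the word size, which Python's unbounded integers hide.

```python
def sum_float(step, count):
    """Accumulate a floating point step count times."""
    total = 0.0
    for _ in range(count):
        total += step
    return total

def sum_scaled(step_hundredths, count):
    """Accumulate the same step held as an integer number of hundredths."""
    total = 0
    for _ in range(count):
        total += step_hundredths
    return total

float_total = sum_float(0.1, 10)   # 0.1 has no exact binary representation
int_total = sum_scaled(10, 10)     # ten hundredths, added ten times
```

Here `float_total` ends up slightly below 1.0, while `int_total` is exactly 100 hundredths.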
 

Papabravo

Joined Feb 24, 2006
15,092
As another anecdotal data point, it might be worthwhile to point out that DSP is often done with fixed point arithmetic, with things normalized to lie in the interval (-1,1), an open interval that does not include either endpoint.
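A sketch of what that normalization can look like in practice, assuming the common Q15 layout (16-bit two's complement with 15 fraction bits). The format choice, rounding step and helper names are assumptions for illustration, not taken from the post.

```python
Q15_ONE = 1 << 15                 # scale factor: 1.0 would be 32768
Q15_MIN, Q15_MAX = -32768, 32767  # limits of a 16-bit two's complement word

def float_to_q15(x):
    """Scale a value in (-1, 1) to a Q15 integer, saturating at the limits."""
    raw = int(round(x * Q15_ONE))
    return max(Q15_MIN, min(Q15_MAX, raw))

def q15_to_float(q):
    """Convert a Q15 integer back to a float for inspection."""
    return q / Q15_ONE

def q15_mul(a, b):
    """Multiply two Q15 values using integer operations only."""
    product = a * b               # Q30 result, fits a 32-bit accumulator
    product += 1 << 14            # add half an LSB so the shift rounds
    result = product >> 15        # drop 15 fraction bits to get back to Q15
    return max(Q15_MIN, min(Q15_MAX, result))
```

Because both operands have magnitude below 1, the product does too, so it never outgrows the format. That is the main reason for normalizing to (-1,1).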
 

MrChips

Joined Oct 2, 2009
22,893
Good point. FP FFT would be too darn slow. FFT is usually computed in scaled integer arithmetic.
 

MrChips

Joined Oct 2, 2009
22,893
Here is a straightforward comparison.
Take any MCU. Compare how long it takes (in time or clock cycles) to add two numbers
(a) using integers
(b) using floating point.
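To give a feel for why case (b) costs more when there is no FP hardware, here is a toy software floating point add. The format is an assumption invented for this sketch (unsigned, an 8-bit mantissa and a separate exponent, truncating instead of rounding) and is not IEEE 754. An integer add is a single `+`; the floating point add has to align, add and renormalize.

```python
MANT_BITS = 8  # toy mantissa width, chosen only for illustration

def normalize(mant, exp):
    """Shift the mantissa until its top bit sits at bit MANT_BITS - 1."""
    if mant == 0:
        return 0, 0
    while mant >= (1 << MANT_BITS):        # too wide: drop low bits
        mant >>= 1
        exp += 1
    while mant < (1 << (MANT_BITS - 1)):   # too narrow: shift up
        mant <<= 1
        exp -= 1
    return mant, exp

def soft_float_add(x, y):
    """Add two toy floats given as (mantissa, exponent) pairs."""
    (mx, ex), (my, ey) = x, y
    if mx == 0:
        return y
    if my == 0:
        return x
    if ex < ey:                            # make x the larger exponent
        mx, ex, my, ey = my, ey, mx, ex
    my >>= ex - ey                         # align: low bits of y are lost
    return normalize(mx + my, ex)          # add, then renormalize

def to_value(x):
    """Convert a toy float to a Python float for inspection."""
    mant, exp = x
    return mant * 2.0 ** exp
```

With this toy format 3 + 5 gives 8, but 1000 + 1 gives 1000, because the 1 is shifted out entirely during alignment.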
 

Thread Starter

Ghina Bayyat

Joined Mar 11, 2018
135
thanks a lot for your help
so depending on the requirements i can use any appropriate method :
No one here is saying one is better than the other. There is never a "best" solution. It depends on your application and implementation options.

Floating point is costly in one way or another
Normally floating point calculations are done with dedicated hardware. This is expensive in terms of chip real estate
Floating point is always going to be an order or more in magnitude than a fixed point solution,
If minimum time to develop is the requirement, and you have dollars and watts to chuck at the problem, go floating point
But one last question: I noticed that none of you mentioned the 2's complement method and you only talked about fixed and floating point.
Is there a reason, like it is not used when we talk about real numbers?
If it is used, then how? I can't understand how I can use 2's complement to represent a real number.
 

LesJones

Joined Jan 8, 2017
2,953
The 2's complement is just an easy way to change the sign of a number. Invert all the bits (Most microcontrollers have a complement instruction.) which gives the 1's complement. Then just add 1 to the 1's complement to give the 2's complement. Using this method makes subtraction easier. Just add the 2's complement of a number to the number you want to subtract it from.
If you program in assembler you will soon get to understand binary. (And other base number systems.)

Les.
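The invert-and-add-1 recipe from this post can be sketched as follows. An 8-bit word is assumed and the function names are made up for the example; the masking stands in for the fixed register width that hardware gives you for free.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1          # 0xFF for an 8-bit word

def ones_complement(x):
    """Invert every bit within the word."""
    return ~x & MASK

def twos_complement(x):
    """Negate: invert all bits, then add 1 (carry out is discarded)."""
    return (ones_complement(x) + 1) & MASK

def subtract(a, b):
    """Compute a - b by adding the two's complement of b."""
    return (a + twos_complement(b)) & MASK

def as_signed(x):
    """Read an 8-bit pattern as a signed two's complement value."""
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x
```

For example, `twos_complement(5)` is 0xFB, and `subtract(5, 9)` yields 0xFC, which reads as -4 when interpreted as signed.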
 

BobTPH

Joined Jun 5, 2013
2,878
As I said a few posts into this thread, fixed point can use two’s complement to represent negative numbers.

What makes you think it cannot?

Bob
 

jpanhalt

Joined Jan 18, 2008
11,088
i noticed that neither of you mentioned the 2's complement method and you only talked about fixed and floating point
is there a reason like it is not used when we talk about real numbers ?
if it is used then how because i can't understand how can i use 2's complement to represent a real number
2's complement is frequently used to represent both positive and negative "real" numbers, as shown in post#5. Take these 3 examples from that table:

1590833400617.png

The left 12 bits are the integer part.

0011 1110 1000 = 0x3E8 = 512+256+128+64+32+8 = 1000
0000 0110 0100 = 0x64 = 100
0000 0001 1001 = 0x19 = 25

One of the nice things about 2's complement, besides its use in subtraction, is that positive numbers represented in that manner do not change (as illustrated above). Thus, you will see 2's complement used very often for data from a variety of sensors.
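A hedged sketch of decoding such a reading. The table from post #5 is not reproduced here, so the layout is an assumption: a 16-bit two's complement word whose left 12 bits are the integer part and whose right 4 bits are the fraction. The function name is hypothetical.

```python
FRACTION_BITS = 4   # assumption: 12 integer bits followed by 4 fraction bits
WORD_BITS = 16      # assumption: readings arrive as 16-bit words

def decode_reading(raw):
    """Interpret a 16-bit two's complement fixed point word as a number."""
    raw &= (1 << WORD_BITS) - 1
    if raw & (1 << (WORD_BITS - 1)):   # top bit set means negative
        raw -= 1 << WORD_BITS
    return raw / (1 << FRACTION_BITS)  # move the binary point 4 places left
```

Under that assumed layout, 0x3E80 decodes to 1000.0, 0x0191 to 25.0625 and 0xFE6F to -25.0625. Positive values need no special handling, which is the convenience being described.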
 

andrewmm

Joined Feb 25, 2011
1,048
Two's complement and fixed point are two different things.

Fixed / floating is a way of representing a number with a "point" in it.

2's complement, ones' complement, offset binary et al. are all ways of representing a number that can be negative or positive.

Fixed / floating can be represented in 2's complement, 1's complement, offset binary et al.
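To make the two independent choices concrete, here is a small sketch: first the same value encoded under several sign conventions, then one bit pattern read with the point in different places. An 8-bit word and a bias of 128 for offset binary are assumed; all names are invented for the example.

```python
def sign_magnitude(v):
    """Top bit is the sign, remaining 7 bits hold the magnitude."""
    return (0x80 | -v) if v < 0 else v

def ones_complement(v):
    """Negative values are the bitwise inverse of the magnitude."""
    return (~(-v)) & 0xFF if v < 0 else v

def twos_complement(v):
    """Negative values are the inverse of the magnitude plus one."""
    return v & 0xFF

def offset_binary(v):
    """Add a fixed bias (128 assumed) so every code is non-negative."""
    return v + 128

def decode_twos_fixed(bits, fraction_bits):
    """Read an 8-bit two's complement pattern with a chosen point position."""
    signed = bits - 256 if bits & 0x80 else bits
    return signed / (1 << fraction_bits)
```

The value -5 becomes 0x85, 0xFA, 0xFB or 0x7B depending on the sign convention. Separately, the pattern 0xFB means -5 with no fraction bits and -0.3125 with four fraction bits.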

There is an interesting theory, from the 70's, that the universe is coded in 2's complement!

Look at item 154:
http://www.inwap.com/pdp10/hbaker/hakmem/hacks.html#item154
 

MrChips

Joined Oct 2, 2009
22,893
2's complement was not mentioned because it is a given. That means that it is a commonly used technique to differentiate between positive and negative numbers.
 

Thread Starter

Ghina Bayyat

Joined Mar 11, 2018
135
thank you all so much
everything is clear now
As I said a few posts into this thread, fixed point can use two’s complement to represent negative numbers.

What makes you think it cannot?
I didn't know that before. I thought fixed point, floating point and 1's and 2's complement were all separate ways of representing numbers in binary, and that a number could be represented in only one of them, so that's what confused me about how to represent a negative real number.
But thanks to what you said, I get it now.
Two's complement and fixed point are two different things.

Fixed / floating is a way of representing a number with a "point" in it.

2's complement, ones' complement, offset binary et al. are all ways of representing a number that can be negative or positive.
thanks for your help
 

MrChips

Joined Oct 2, 2009
22,893
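To be clear, floating point formats carry a sign as well, so the sign encoding question applies there too. (IEEE 754 stores a separate sign bit with an unsigned magnitude rather than 2's complement.)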
In the example I gave you on fixed point arithmetic in post #9 I used 2's complement. Hence it is not one versus the other.
 

andrewmm

Joined Feb 25, 2011
1,048
To be clear,

Floating point CAN use 2's complement, it does not HAVE to,
IEEE 754 does not (it uses a sign bit with a magnitude, and a biased exponent), but other formats are possible and have been found in the past.
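One way to check how IEEE 754 single precision actually stores the sign is to pull the fields apart. This sketch uses only Python's standard `struct` module; the helper name is made up. In this format the sign is a separate top bit, the exponent is stored with a bias of 127, and the fraction is an unsigned magnitude.

```python
import struct

def float32_fields(x):
    """Split an IEEE 754 single into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits, unsigned magnitude
    return sign, exponent, fraction
```

`float32_fields(1.0)` and `float32_fields(-1.0)` differ only in the sign field, which is sign-and-magnitude behaviour rather than 2's complement.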
 