Floating point 'equality' testing

MrChips

Joined Oct 2, 2009
30,802
How does this answer the question regarding testing for the equality of two floating point numbers?
Do ALL calculations in fixed point arithmetic.

For example, I can do dewpoint calculations from temperature and humidity readings with two-decimal-place accuracy, all done in fewer than 8k bytes of code using integer arithmetic.
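
As an illustration of the scaled-integer idea (a minimal sketch, not the actual dewpoint code -- the type and helper names here are hypothetical):

#include <stdint.h>

typedef int32_t fix100_t;   /* value scaled by 100: 23.45 C is stored as 2345 */

/* addition needs no rescaling */
static fix100_t fix100_add(fix100_t a, fix100_t b) { return a + b; }

/* multiplication widens to 64 bits, then rescales back to hundredths;
   the +50 rounds to nearest for non-negative products */
static fix100_t fix100_mul(fix100_t a, fix100_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (fix100_t)((p + 50) / 100);
}

Equality at two-decimal-place accuracy then becomes a plain integer compare, which is where the code-space and cycle savings come from.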
 

joeyd999

Joined Jun 6, 2011
5,283
Do ALL calculations in fixed point arithmetic.
Granted. But that wasn't the question.

I've already said that I cannot fathom a case where I would need to test two floats for absolute equality. One should always assume floats represent a continuous and infinite field of possible values -- two of which may be arbitrarily close to -- or far away from -- each other.

Edit: there are times when floating point numbers represent a far easier solution than fixed point. A good example, one I use often, is polynomial regression for on-the-fly curve fitting. I tried it once in fixed point -- the code was nasty.
 

WBahn

Joined Mar 31, 2012
30,052
If the floats are represented using IEEE-754, or something comparable, then exploit the structure of the representation to do thresholded equality comparisons using just integer arithmetic on the raw bit pattern itself.
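
One common way to code that idea for 32-bit floats (a sketch assuming IEEE-754 binary32; the function names are mine, and NaNs/infinities are not handled):

#include <stdint.h>
#include <string.h>

/* Map the raw bit pattern to an integer that is monotonic in the float's
   value, so integer distance approximates distance in representable steps. */
static int32_t float_to_ordered(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);            /* raw bit pattern */
    return (i < 0) ? INT32_MIN - i : i;  /* fold negatives to keep ordering */
}

/* True when a and b are within max_ulps representable values of each other. */
static int nearly_equal(float a, float b, int32_t max_ulps)
{
    int64_t d = (int64_t)float_to_ordered(a) - float_to_ordered(b);
    if (d < 0) d = -d;
    return d <= max_ulps;
}

No floating point instructions execute at all; the whole test is two loads, a subtraction, and a compare.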
 

joeyd999

Joined Jun 6, 2011
5,283
If that's what you said, then why are you talking about checking exponents and aligning significands by shifting (which is going to have problems because of the inferred leading 1, by the way)?
The inferred 1 is a given, and processed during the extraction of the significand. This was implied, but not explicitly stated.

The single shift is required prior to the signed integer subtraction -- only in the case where the exponents differ by 1 -- because two very close values, say 1.9999 and 2.0000, may have different exponents. Therefore, the bits must be aligned prior to subtraction. A single-bit right shift, even across a 3- or 4-byte significand, is trivial.

How would you account for two values with similar significands but different exponents?

Edit: In fact, a further test must be performed. Each operand must be tested explicitly for zero prior to converting the significand to a signed int, lest the inferred 1 create substantial errors.
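
For concreteness, here is one way the steps described above could look for 32-bit IEEE-754 floats (a sketch, my names; subnormals, NaN and infinity are not trapped):

#include <stdint.h>
#include <string.h>

/* Zeros are trapped before the implied 1 is restored; if the exponents
   differ by exactly 1, the smaller-exponent significand is shifted right
   one bit so both sit at the same scale before the signed subtraction. */
static int close_by_significand(float a, float b, int32_t tol)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    if ((ua << 1) == 0 && (ub << 1) == 0) return 1;   /* both are +/-0  */
    if ((ua << 1) == 0 || (ub << 1) == 0) return 0;   /* exactly one 0  */
    if ((ua ^ ub) & 0x80000000u)          return 0;   /* opposite signs */

    int ea = (int)((ua >> 23) & 0xFFu);
    int eb = (int)((ub >> 23) & 0xFFu);
    int32_t sa = (int32_t)((ua & 0x7FFFFFu) | 0x800000u);  /* implied 1 */
    int32_t sb = (int32_t)((ub & 0x7FFFFFu) | 0x800000u);

    if      (ea - eb == 1) sb >>= 1;   /* align to the larger exponent */
    else if (eb - ea == 1) sa >>= 1;
    else if (ea != eb)     return 0;   /* too far apart to be close    */

    int32_t d = sa - sb;
    return (d < 0 ? -d : d) <= tol;
}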
 
Last edited:

ci139

Joined Jul 11, 2016
1,898
I had to bypass the system to round the log10() results correctly to find where the decimal point actually is on a PC.
A couple of days of headache, testing everything I could come up with -- but I still have it in a .lib and it's usable.
I haven't regretted the work done so far.
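
A sketch of the issue (the epsilon nudge is illustrative, not the actual .lib code):

#include <math.h>

/* log10(1000.0) can come back as 2.999999... on some implementations,
   so a plain floor() puts the decimal point one place too low.
   Nudging by a small epsilon before flooring is one workaround. */
static int decimal_exponent(double x)
{
    return (int)floor(log10(fabs(x)) + 1e-9);
}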
 

NorthGuy

Joined Jun 28, 2014
611
This is for embedded code and I am trying to avoid fabs(A-B)<0.00001 which looks like a lot of code space and clock cycles.
Your comparison involves a fixed scale. Which means you know what the scale is (otherwise you would use something like (fabs(A-B)/(A+B))<0.00001). In such a situation you most likely don't really need floats and can do everything with scaled integers. Floats are for the case where the scale is unknown.
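
The two tests side by side (epsilon values illustrative; the relative form is rearranged to avoid the division and to keep the denominator positive):

#include <math.h>

/* Fixed-scale test: only meaningful when you know the magnitudes. */
static int abs_close(double a, double b)
{
    return fabs(a - b) < 0.00001;
}

/* Scale-free test: normalizes the difference by the magnitudes. */
static int rel_close(double a, double b)
{
    return fabs(a - b) < 0.00001 * (fabs(a) + fabs(b));
}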
 

JohnInTX

Joined Jun 26, 2012
4,787
If the floats are represented using IEEE-754, or something comparable, then exploit the structure of the representation to do thresholded equality comparisons using just integer arithmetic on the raw bit pattern itself.
That may be problematic in the PIC C world for some compilers at least. At build-time you can choose IEEE754, IEEE modified 24 bit or Microchip Floating Point representations all of which are different. For example, MCHP 32bit floats rearrange the exponent and mantissa sign bits so that the exponent sign + 7bit exponent occupies most sig. 8 bits and the mantissa sign + 23bit mantissa is in the lower 3 bytes. Sometimes, you have to convert between the two representations to make the compiler happy while still preserving compatibility with external systems that expect IEEE754. MPLAB 8.x (at least) offers several formats for watch window floats but I've used compilers that don't use any of them.
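
For what it's worth, here is a sketch of converting one direction, assuming the Microchip AN575 layout (exponent byte on top, mantissa sign at bit 23) -- check the compiler manual before trusting this for any particular toolchain:

#include <stdint.h>

/*  MCHP 32-bit : [ exponent 8 ][ sign 1 ][ mantissa 23 ]
    IEEE-754    : [ sign 1 ][ exponent 8 ][ mantissa 23 ]
    so the sign bit and the exponent byte simply trade places. */
static uint32_t mchp_to_ieee754(uint32_t m)
{
    uint32_t sign = (m >> 23) & 1u;
    uint32_t exp  = (m >> 24) & 0xFFu;
    uint32_t mant =  m        & 0x7FFFFFu;
    return (sign << 31) | (exp << 23) | mant;
}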

I suppose you could argue that for a given build, with care, exact values are bit compatible but it would be ill advised to count on it IMHO. Who knows what the next compiler rev. brings?
FWIW, I use the fabs() method in either C or assembler.
 

WBahn

Joined Mar 31, 2012
30,052
The inferred 1 is a given, and processed during the extraction of the significand. This was implied, but not explicitly stated.

The single shift is required prior to the signed integer subtraction -- only in the case where the exponents differ by 1 -- because two very close values, say 1.9999 and 2.0000, may have different exponents. Therefore, the bits must be aligned prior to subtraction. A single-bit right shift, even across a 3- or 4-byte significand, is trivial.

How would you account for two values with similar significands but different exponents?

Edit: In fact, a further test must be performed. Each operand must be tested explicitly for zero prior to converting the significand to a signed int, lest the inferred 1 create substantial errors.
Keeping my part of this restricted to IEEE-754 then for most values (and I need to look at the special cases and see if they need to be specifically trapped or not) your format is

(sign)(exponent)(significand)

This format was specifically intended to make magnitude comparisons easy using the same hardware as for comparing unsigned integers, with only minor tweaks for some special cases.

Now, magnitude comparisons are admittedly more forgiving than equality comparisons (i.e., wanting the absolute value of the relative difference to be below some threshold), so if we use this feature we can expect to give up some accuracy, but let's see where it leads.

Let's look at the half-precision standard having 16 bits with a five-bit exponent and 10 bits of significand. Let's say that we want two values to agree to 7 bits (~1%), which means 6 bits of stored significand thanks to the implied one. As an integer, those last 4 insignificant bits mean that we want the values to match to within 16. To actually get 1% we want the difference of the (integer) significands to be less than 20 if they are near the max (2.0000) and 10 if they are near the min (1.0000).

Let's first look at two values that are both just below the max and that differ by 1%. Let's choose 127.68 and 126.40. The bit patterns for these are

127.68 = 0 10101 1111111011 = 0101 0111 1111 1011 = 0x57FB = 22523
126.40 = 0 10101 1111100110 = 0101 0111 1110 0110 = 0x57E6 = 22502

The difference is 21, right at our upper limit.

Next let's look at two values that are both just above the min and that differ by 1%. Let's use 128.90 and 130.19. The bit patterns for these are

128.90 = 0 10110 0000000111 = 0101 1000 0000 0111 = 0x5807 = 22535
130.19 = 0 10110 0000010010 = 0101 1000 0001 0010 = 0x5812 = 22546

The difference is 11, right at the lower limit.

But what about values that straddle the limit but are separated by 1%? Let's use 127.68 and 128.90.

128.90 = 0 10110 0000000111 = 0101 1000 0000 0111 = 0x5807 = 22535
127.68 = 0 10101 1111111011 = 0101 0111 1111 1011 = 0x57FB = 22523

The difference is 12, right near the lower limit. That we land near the lower limit makes sense: these are effectively being seen as a significand just above 1.0 and one just below 1.0, with both having the higher exponent. This is due to the impact of the inferred leading 1 that is not included in the integer interpretation.

So if we simply interpret the bit pattern as an integer and take the difference, we can compare it to a threshold representing a rough fractional threshold and we have our result -- without looking at the exponent or shifting things or doing any floating point operations at all.

Now, there are special cases to consider. Doing relative thresholds runs into problems when one of the values is exactly zero. But that's always the case, even with integers. We also would need to consider the impact if one (or both) were denormalized values (which are rare to non-existent in most applications). We also need to consider the flag values (which are also rare).
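
Expressed as code, the whole test collapses to this (a sketch for positive, normalized half-precision values only, per the caveats just listed):

#include <stdint.h>
#include <stdlib.h>

/* Treat the 16-bit half-precision pattern as a plain integer and
   threshold the difference; ~16 counts ignores the low 4 significand
   bits, i.e. roughly 1% agreement for the worked examples above. */
static int agree_to_about_1pct(uint16_t a_bits, uint16_t b_bits)
{
    int32_t d = (int32_t)a_bits - (int32_t)b_bits;
    return abs(d) <= 16;
}

With the worked values: 0x5807 vs 0x5812 gives 11 (passes) and 0x57FB vs 0x57E6 gives 21 (fails), consistent with a single threshold landing between the near-min and near-max limits.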
 

WBahn

Joined Mar 31, 2012
30,052
That may be problematic in the PIC C world for some compilers at least. At build-time you can choose IEEE754, IEEE modified 24 bit or Microchip Floating Point representations all of which are different. For example, MCHP 32bit floats rearrange the exponent and mantissa sign bits so that the exponent sign + 7bit exponent occupies most sig. 8 bits and the mantissa sign + 23bit mantissa is in the lower 3 bytes. Sometimes, you have to convert between the two representations to make the compiler happy while still preserving compatibility with external systems that expect IEEE754. MPLAB 8.x (at least) offers several formats for watch window floats but I've used compilers that don't use any of them.

I suppose you could argue that for a given build, with care, exact values are bit compatible but it would be ill advised to count on it IMHO. Who knows what the next compiler rev. brings?
FWIW, I use the fabs() method in either C or assembler.
I agree that tricks that exploit the internal representation are often fragile. A floating point representation is actually one of the most basic examples of an abstract data type, in which the representation and the operations that act on it are intended to provide the user interface, and the user is not supposed to be aware of -- or go fiddling around with -- the internal representation. But in resource-starved and/or performance-critical applications that is also going to be one of the first fictions that gets dispensed with, leading people to go charging straight off the edge of the map where there be demons.
 

Thread Starter

AlbertHall

Joined Jun 4, 2014
12,346
I deliberately chose NOT to use the "daemons" spelling, but to stick to the notion of monsters and demons being beyond the edge of the known world as was commonly depicted on nautical charts several hundred years ago.
Not "Here there be dragons"?
 

joeyd999

Joined Jun 6, 2011
5,283
Keeping my part of this restricted to IEEE-754 then for most values (and I need to look at the special cases and see if they need to be specifically trapped or not) your format is

(sign)(exponent)(significand)

This format was specifically intended to make magnitude comparisons easy using the same hardware as for comparing unsigned integers, with only minor tweaks for some special cases.

Now, magnitude comparisons are admittedly more forgiving than equality comparisons (i.e., wanting the absolute value of the relative difference to be below some threshold), so if we use this feature we can expect to give up some accuracy, but let's see where it leads.

Let's look at the half-precision standard having 16 bits with a five-bit exponent and 10 bits of significand. Let's say that we want two values to agree to 7 bits (~1%), which means 6 bits of stored significand thanks to the implied one. As an integer, those last 4 insignificant bits mean that we want the values to match to within 16. To actually get 1% we want the difference of the (integer) significands to be less than 20 if they are near the max (2.0000) and 10 if they are near the min (1.0000).

Let's first look at two values that are both just below the max and that differ by 1%. Let's choose 127.68 and 126.40. The bit patterns for these are

127.68 = 0 10101 1111111011 = 0101 0111 1111 1011 = 0x57FB = 22523
126.40 = 0 10101 1111100110 = 0101 0111 1110 0110 = 0x57E6 = 22502

The difference is 21, right at our upper limit.

Next let's look at two values that are both just above the min and that differ by 1%. Let's use 128.90 and 130.19. The bit patterns for these are

128.90 = 0 10110 0000000111 = 0101 1000 0000 0111 = 0x5807 = 22535
130.19 = 0 10110 0000010010 = 0101 1000 0001 0010 = 0x5812 = 22546

The difference is 11, right at the lower limit.

But what about values that straddle the limit but are separated by 1%? Let's use 127.68 and 128.90.

128.90 = 0 10110 0000000111 = 0101 1000 0000 0111 = 0x5807 = 22535
127.68 = 0 10101 1111111011 = 0101 0111 1111 1011 = 0x57FB = 22523

The difference is 12, right near the lower limit. That we land near the lower limit makes sense: these are effectively being seen as a significand just above 1.0 and one just below 1.0, with both having the higher exponent. This is due to the impact of the inferred leading 1 that is not included in the integer interpretation.

So if we simply interpret the bit pattern as an integer and take the difference, we can compare it to a threshold representing a rough fractional threshold and we have our result -- without looking at the exponent or shifting things or doing any floating point operations at all.

Now, there are special cases to consider. Doing relative thresholds runs into problems when one of the values is exactly zero. But that's always the case, even with integers. We also would need to consider the impact if one (or both) were denormalized values (which are rare to non-existent in most applications). We also need to consider the flag values (which are also rare).
@WBahn, I am preparing for a hurricane. Your essay is interesting. I will comment when I've had a chance to read and comprehend it.
 

WBahn

Joined Mar 31, 2012
30,052
@WBahn, I am preparing for a hurricane. Your essay is interesting. I will comment when I've had a chance to read and comprehend it.
Good luck weathering the storm!

I'll be interested in hearing your thoughts on it. I may get a chance to consider the implications of the special cases, but I've got lots of other things that I need to make higher priority, too.
 