The importance of 2,147,483,647 ...

jgessling

Joined Jul 31, 2009
82
That is an interesting piece, but I'm disappointed at its lack of detail about the Ariane rocket explosion. What really happened was that the developers of the code for controlling the rocket reused an existing library function from the previous version of the rocket. Seems like a good idea, except that the library used 16-bit integers, which couldn't handle the magnitude of the values from the new craft. So the control program failed, the rocket went off course, and it was destroyed. The message is that reuse is not safe without control over the inputs and proper exception handling. Reuse has been a source of problems in many engineering disciplines, including bridge building in the 19th century. If you want more details see:

http://www.rvs.uni-bielefeld.de/publications/Reports/ariane.html

Enjoy.
 

nsaspook

Joined Aug 27, 2009
13,315
It's something everyone knows about now. I've had to deal with 32-bit signed integer rollover when using nanoseconds as the base element of a 1-second counter in some kernel drivers' C data structures. You can either use a long long (and eat CPU cycles on a 32-bit architecture) or normalize the numbers to microseconds and lose precision. For most slow interfaces that humans react to, a jitter of ±1 µs is not a worry unless you need hard real-time event timing, but then you would be using a fast 64-bit machine without that problem, or a dedicated controller without an OS to get in the way.
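A minimal sketch of the trade-off, assuming a signed 32-bit counter (the numbers fall straight out of INT32_MAX = 2,147,483,647):

Code:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int32_t ns = INT32_MAX;                 /* 2,147,483,647 ns = ~2.147 s */
	printf("max ns in int32_t: %.3f s\n", ns / 1e9);

	/* one more nanosecond wraps; cast through unsigned to avoid the
	   undefined behaviour of signed overflow (wraps negative on
	   two's-complement machines) */
	int32_t wrapped = (int32_t)((uint32_t)ns + 1u);
	printf("after +1 ns: %ld\n", (long)wrapped);

	/* normalizing to microseconds buys a 1000x bigger range
	   (~35.8 minutes) at the cost of sub-microsecond precision */
	printf("max us in int32_t: %.1f s\n", ns / 1e6);
	return 0;
}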

Blaming the software is sort of a cop-out. What's missing is testing the software for sanity of results. Robust software should be able to take random noise as input and not fail if its function is critical.
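A minimal sketch of that kind of sanity test, with a made-up checksum_frame() standing in for whatever critical routine is under test: hammer it with noise and check it never does anything but accept or reject.

Code:
#include <stdint.h>
#include <stdlib.h>
#include <assert.h>

/* stand-in for the function under test: must reject bad input, never crash */
static int checksum_frame(const uint8_t *buf, size_t len)
{
	if (len < 2)
		return -1;                      /* too short: reject, don't crash */
	uint8_t sum = 0;
	for (size_t i = 0; i < len - 1; i++)
		sum = (uint8_t)(sum + buf[i]);  /* deliberate wrap-around is fine here */
	return (sum == buf[len - 1]) ? 0 : -1;
}

int main(void)
{
	uint8_t buf[64];
	for (long trial = 0; trial < 1000000L; trial++) {
		for (size_t i = 0; i < sizeof buf; i++)
			buf[i] = (uint8_t)rand();   /* pure noise as input */
		int rc = checksum_frame(buf, sizeof buf);
		assert(rc == 0 || rc == -1);    /* accept or reject, nothing else */
	}
	return 0;
}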
 

MrChips

Joined Oct 2, 2009
30,824
Number precision overflow is ubiquitous in all areas of computers and computation. There is no excuse for blaming bad code or libraries. It is a stark failure of due diligence on the part of designers, engineers and quality assurance.

What is shocking, even less understood, and hence given too little attention, are the pitfalls of floating-point arithmetic. Many programmers resort to floating point when they think the problem is insufficient precision in integer arithmetic, usually for lack of comprehension of the differences between integer and floating-point arithmetic.
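Two classic examples of those pitfalls, as a minimal sketch (nothing here is compiler-specific; 2^53 is where an IEEE-754 double stops representing every integer):

Code:
#include <stdio.h>

int main(void)
{
	/* 0.1 and 0.2 have no exact binary representation */
	double a = 0.1 + 0.2;
	printf("0.1 + 0.2 == 0.3 ? %s  (a = %.17g)\n",
	       (a == 0.3) ? "yes" : "no", a);

	/* above 2^53 a double can no longer hold every integer, so
	   "upgrading" from a 64-bit integer to a double loses precision */
	double big = 9007199254740992.0;        /* 2^53 */
	printf("2^53     = %.0f\n", big);
	printf("2^53 + 1 = %.0f  (unchanged!)\n", big + 1.0);
	return 0;
}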
 

nsaspook

Joined Aug 27, 2009
13,315
Number precision overflow is ubiquitous in all areas of computers and computation. There is no excuse for blaming bad code or libraries. ...
Ain't that the truth. It's French so they must have used Pascal but even strongly typed languages can't stop stupid.

One thing I try to do in C is specify the exact bit width of every variable in functions and prototypes using the standard types; if the header is not there, I create one to give me those types. It won't stop overflow errors, but it should at least make you think about them when coding.

Code:
#ifdef INTTYPES
#include <stdint.h>   /* toolchain provides the standard fixed-width types */
#else
#define INTTYPES
/* Fallback typedefs for toolchains without <stdint.h>.  These assume a
   16-bit int and 32-bit long, as on many small embedded targets; they
   would be wrong on a typical 32-bit desktop compiler. */
/*unsigned types*/
typedef unsigned char uint8_t;
typedef unsigned int uint16_t;
typedef unsigned long uint32_t;
typedef unsigned long long uint64_t;
/*signed types*/
typedef signed char int8_t;
typedef signed int int16_t;
typedef signed long int32_t;
typedef signed long long int64_t;
#endif
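The types won't stop overflow by themselves, but with the width spelled out the check is easy to write. A hypothetical example (add_u16_checked() is invented for illustration):

Code:
#include <stdint.h>
#include <stdbool.h>

/* returns false instead of silently wrapping past 65535 */
bool add_u16_checked(uint16_t a, uint16_t b, uint16_t *sum)
{
	if (a > (uint16_t)(UINT16_MAX - b))
		return false;               /* a + b would exceed 65535 */
	*sum = (uint16_t)(a + b);
	return true;
}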
 

WBahn

Joined Mar 31, 2012
30,082
This is tightly coupled to units -- in fact, it would not be hard to directly embed datatype and unit into the variable names.

uint16_t length_m_16ui
double speed_mps_dbl
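Going one step further, the compiler could enforce the units rather than just document them. A hypothetical sketch (the single-member wrapper structs are my own invention, reusing the naming above):

Code:
#include <stdint.h>

/* distinct struct types make metres and seconds incompatible,
   so mixing them up is a compile error, not a silent bug */
typedef struct { uint16_t v; } length_m_16ui;
typedef struct { double   v; } time_s_dbl;
typedef struct { double   v; } speed_mps_dbl;

static speed_mps_dbl speed_from(length_m_16ui d, time_s_dbl t)
{
	speed_mps_dbl s = { d.v / t.v };
	return s;
}

int main(void)
{
	length_m_16ui dist = { 1500u };
	time_s_dbl    t    = { 120.0 };
	speed_mps_dbl s    = speed_from(dist, t);
	/* speed_from(t, dist) would fail to compile: wrong types */
	(void)s;
	return 0;
}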

Some coding standards already mandate datatype information in variable names. I think that unit information is probably more important (I think that more errors go uncaught because of poor unit control than poor datatype control, but I have no information to support that claim and I could be wrong).

What is really missing is that we have an engineering/scientific/mathematical culture that fails to recognize the importance of these things, in part (and only in part) because all three communities want to emphasize working at a "higher level of abstraction".
 

jgessling

Joined Jul 31, 2009
82
Actually, a Frenchman has provided us with a language with built-in syntax to prevent these types of errors. It's called Eiffel, developed by Bertrand Meyer, now chair of software engineering at ETH Zurich. The basic idea is known as "design by contract" (DBC): if you provide a function, then you need to specify for what inputs it will guarantee a correct output. In the rocket case the function could have been written like this (simplified, with my own errors):

Code:
trajectory (x, y: INTEGER): DOUBLE
		require
			x <= 65535
			y <= 65535
		do
			-- calculate as needed, but now we know that the inputs are within range
			Result := (x ^ 2 + y ^ 2) / sqrt (2.0)  -- totally wrong maths, but you get the idea
		end

If the calling code has not provided valid data, then the call will fail and the program will not continue; hopefully the mainline code will do something in this case. Or, even better, the whole thing will not compile in the first place and force someone to look at the issue.
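For those of us stuck outside Eiffel, a rough C approximation of the same contract using plain assertions (trajectory() here is my own stand-in, with the same deliberately wrong maths):

Code:
#include <assert.h>
#include <stdint.h>
#include <math.h>

double trajectory(int32_t x, int32_t y)
{
	/* the 'require' clause: fail fast at the call boundary
	   instead of propagating out-of-range data */
	assert(x >= 0 && x <= 65535);
	assert(y >= 0 && y <= 65535);
	return ((double)x * x + (double)y * y) / sqrt(2.0);
}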

If it's not obvious by now, you should know that I am a big fan of Mr. Meyer and DBC. Unfortunately I've never been able to get a job writing Eiffel code, but I still keep it in my heart in whatever I do. The basic idea is not to trust whatever stuff precedes the current project without careful inspection.
 

Thread Starter

cmartinez

Joined Jan 17, 2007
8,257
I've actually taken advantage of integer overflows. I remember I once wrote an assembly program for the 8051 that would delay a series of digital signals by storing historical values in 256 bits. I used 32 bytes of memory, plus another byte that pointed at the bit to be consulted: the upper 5 bits selected which of the 32 bytes contained the bit info I wanted, and the lower 3 bits pointed at the specific bit within that byte. At the same time, each bit in the 32-byte "carousel" was being written with the current signal value. When the pointer byte overflowed, it would start overwriting the values stored in the carousel. The whole thing worked like a dynamic database "caterpillar" of sorts.
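In C, the same trick might look something like this sketch (names invented; the original was 8051 assembly, and the wrap-around comes free with a uint8_t index):

Code:
#include <stdint.h>
#include <stdbool.h>

static uint8_t history[32];  /* 32 bytes = 256 bits of signal history */
static uint8_t idx;          /* wraps from 255 back to 0 on its own   */

/* read the 256-sample-old bit, then overwrite it with the new sample */
bool delay_256(bool sample_in)
{
	uint8_t byte = idx >> 3;     /* upper 5 bits: which byte */
	uint8_t bit  = idx & 0x07u;  /* lower 3 bits: which bit  */
	bool sample_out = (history[byte] >> bit) & 1u;

	if (sample_in)
		history[byte] |= (uint8_t)(1u << bit);
	else
		history[byte] &= (uint8_t)~(1u << bit);

	idx++;  /* deliberate uint8_t overflow implements the wrap */
	return sample_out;
}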
 

nsaspook

Joined Aug 27, 2009
13,315
Actually, a Frenchman has provided us with a language with built-in syntax to prevent these types of errors. It's called Eiffel, developed by Bertrand Meyer, now chair of software engineering at ETH Zurich. The basic idea is known as "design by contract" (DBC)...
Another pretty 'OOP' sandbox language with no pointers or low-level bit-banging capabilities. That's great on top of the Eiffel Tower, but not so great on the ground. As cmartinez said, sometimes you need to bend the rules, but always know what the side-effects are. One of the reasons I liked the ETH Modula-* (Algol family) languages designed for systems programming (with Ada and Oberon among the descendants) is the ability to directly talk to and address memory in a way that's machine-like, instead of building the human-typed world that's mandated by most OOP languages.

http://www.ethoberon.ethz.ch/books.html
 

MrChips

Joined Oct 2, 2009
30,824
Interesting to note that ETH Zurich, home of Niklaus Wirth, was central to the birth of Structured Programming, along with pioneers such as Tony Hoare, Edsger Dijkstra and Peter Naur. We had to study the works of these computer scientists in one of my graduate-level computer science courses.
 

nsaspook

Joined Aug 27, 2009
13,315
Interesting to note that ETH Zurich, home of Niklaus Wirth, was central to the birth of Structured Programming, along with pioneers such as Tony Hoare, Edsger Dijkstra and Peter Naur...
They are all great people (Wirth is one of my personal heroes, with a history of usable research) with great methods for developing safe computer systems, not just safe code, but C/C++ will never die. Most programmers like having a job, and writing code that nobody else can understand is the key to job security. The more complex and baroque the language, the better.

Men like Wirth believe that simplicity in computer languages, more than pure elegance, is the key to safe software systems.
Thank you, Niklaus, for leading the way.
 

GopherT

Joined Nov 23, 2012
8,009
Actually, a Frenchman has provided us with a language with built-in syntax to prevent these types of errors. It's called Eiffel, developed by Bertrand Meyer, now chair of software engineering at ETH Zurich. The basic idea is known as "design by contract" (DBC)...
The Ariane 5 software was developed in Ada, a strongly typed language that does support design-by-contract. As @nsaspook suggested, it is based on Pascal, but then again, many others are as well. It seems the type mismatch was known, but high values were not expected. I didn't read the entire report to see whether blame was placed on the aero-engineers who assumed high values would not happen, on the software team, or on someone in between.

Here is the report into the failure. Section 2.1 is most interesting.
https://www.ima.umn.edu/~arnold/disasters/ariane5rep.html
 

nsaspook

Joined Aug 27, 2009
13,315
Here is the report into the failure. Section 2.1 is most interesting.
https://www.ima.umn.edu/~arnold/disasters/ariane5rep.html
That's a very good report with some important conclusions.

These nozzle deflections were commanded by the On-Board Computer (OBC) software on the basis of data transmitted by the active Inertial Reference System (SRI 2). Part of these data at that time did not contain proper flight data, but showed a diagnostic bit pattern of the computer of the SRI 2, which was interpreted as flight data.
...
An underlying theme in the development of Ariane 5 is the bias towards the mitigation of random failure. The supplier of the SRI was only following the specification given to it, which stipulated that in the event of any detected exception the processor was to be stopped. The exception which occurred was not due to random failure but a design error. The exception was detected, but inappropriately handled because the view had been taken that software should be considered correct until it is shown to be at fault. The Board has reason to believe that this view is also accepted in other areas of Ariane 5 software design. The Board is in favour of the opposite view, that software should be assumed to be faulty until applying the currently accepted best practice methods can demonstrate that it is correct.
Stopped, and then output a diagnostic bit pattern to another computer that assumed correct data; no wonder it blew up.
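The failure path in the report is an unprotected 64-bit float to 16-bit signed integer conversion, so a defensive alternative might look like this sketch (the saturation policy here is my own assumption, not what Ariane's specification called for):

Code:
#include <stdint.h>
#include <math.h>

/* clamp to the representable range instead of trapping or wrapping */
int16_t float_to_i16_saturating(double value)
{
	if (isnan(value))
		return 0;                    /* policy choice: map NaN to zero */
	if (value >= (double)INT16_MAX)
		return INT16_MAX;
	if (value <= (double)INT16_MIN)
		return INT16_MIN;
	return (int16_t)value;
}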

http://en.wikipedia.org/wiki/Defensive_programming
 