RESULT OF EXPRESSION

WBahn

Joined Mar 31, 2012
29,979
Very often I want to do a calculation such as:

R = A / B * C;

Assume that all variables are 16-bit unsigned integers and values do not result in overflows.
I want to make sure that I get the best precision in the result R.
Since I have no idea how the compiler will choose the order of operations, I force the compiler to do it in the manner I want.
Hence I would break up the calculation as follows.

R = A * C;
R = R / B;

To be honest, I don't even know whether doing this is necessary or what an optimizing compiler would choose to do. The only way to find out is to look at the assembly code generated by the compiler. Time to go check this out.
In this case the order of operations dictates that the expression be evaluated as:

R = (A / B) * C;

Since / and * are of the same precedence and left associative.

So if you want it to evaluate it as (A * C) / B, then just do:

R = A * C / B;

Either way you have some concerns. With A/B first you have the potential for round off issues, with A*C first you have potential issues with overflow.
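A minimal sketch of that trade-off, assuming the 16-bit unsigned variables from the question are `uint16_t`. The function names are hypothetical. Dividing first truncates the quotient before it is scaled, while multiplying first in a 32-bit intermediate avoids both the truncation and the intermediate overflow, as long as the final result still fits in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Divide first: the integer quotient a / b is truncated before scaling. */
uint16_t scale_divide_first(uint16_t a, uint16_t b, uint16_t c)
{
    assert(b != 0);
    return (uint16_t)((uint32_t)(a / b) * c);
}

/* Multiply first in a 32-bit intermediate, so a * c cannot overflow
   (65535 * 65535 still fits in uint32_t). Only the final quotient is
   narrowed back to 16 bits. */
uint16_t scale_multiply_first(uint16_t a, uint16_t b, uint16_t c)
{
    assert(b != 0);
    return (uint16_t)(((uint32_t)a * c) / b);
}
```

With A = 7, B = 2, C = 4 the exact answer is 14. Dividing first gives 12 because 7 / 2 truncates to 3, while multiplying first gives 14.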
 

Thread Starter

Saumyojit

Joined Feb 20, 2020
16
If the side-effects are deferred until after evaluation of the expression, then (++a) will evaluate to 5 and (a--) will evaluate to 4, making 'result' equal to 9, regardless of whether you evaluate left-to-right or right-to-left. But what about the value of 'a'? You are setting it equal to 5 and also setting it equal to 4, so it will be whichever of the two is done last.
Does "side effects are deferred" mean that after ++a the value has increased to 5 but the new value is not reflected in a--, so that when a-- is evaluated it starts with the original value of a, i.e. 4?

I have printed the value of a and it comes out as 4. Which expression was executed last, then?
 

WBahn

When you evaluate the expression a++ there are two things that happen. One of them is getting the value of the expression. This is completely independent of the side effect. For instance, if a=6 and b=7 and I evaluate a+b, this expression evaluates to 13. Neither a nor b change because there is no side effect. So when I evaluate a++ the expression evaluates to the value that the variable a has at the time I evaluate the expression. The side effect is the changing of the value stored in the variable a to one greater than it was when I accessed it. That can happen at any time.

So let's think of our good old list situation.

Say I have

a = ++a + a++;

This parses as

a = expr
expr = left + right
left = ++a
right = a++

These all fall between the same pair of sequence points, so we have one list of things to do:

Access the variable 'a' for ++a and set temp_a1 = a
Evaluate the expression ++a as left = (temp_a1 + 1)
Set the value of a to (temp_a1 + 1)
Access the variable 'a' for a++ and set temp_a2 = a
Evaluate the expression a++ as right = (temp_a2)
Set the value of a to (temp_a2 + 1)
Evaluate the expression expr = left + right
Set the value of a to (expr)

The items in this list can be done in ANY order, subject only to the need to evaluate each operand before it is used.

So I could do them in this order:

Access the variable 'a' for ++a and set temp_a1 = a
Access the variable 'a' for a++ and set temp_a2 = a
Evaluate the expression a++ as right = (temp_a2)
Evaluate the expression ++a as left = (temp_a1 + 1)
Evaluate the expression expr = left + right
Set the value of a to (expr)
Set the value of a to (temp_a2 + 1)
Set the value of a to (temp_a1 + 1)

Note particularly that the value of 'a' is changed three times and the final value depends very much on which one of them is done last.

But even if I set 'a' to (expr) last, I can get different results depending on the relative order of when I access the variable and when I set it.

Consider:

Access the variable 'a' for ++a and set temp_a1 = a
Evaluate the expression ++a as left = (temp_a1 + 1)
Set the value of a to (temp_a1 + 1)
Access the variable 'a' for a++ and set temp_a2 = a
Evaluate the expression a++ as right = (temp_a2)
Set the value of a to (temp_a2 + 1)
Evaluate the expression expr = left + right
Set the value of a to (expr)

Versus:

Access the variable 'a' for ++a and set temp_a1 = a
Access the variable 'a' for a++ and set temp_a2 = a
Set the value of a to (temp_a1 + 1)
Set the value of a to (temp_a2 + 1)
Evaluate the expression ++a as left = (temp_a1 + 1)
Evaluate the expression a++ as right = (temp_a2)
Evaluate the expression expr = left + right
Set the value of a to (expr)

Versus:

Access the variable 'a' for a++ and set temp_a2 = a
Set the value of a to (temp_a2 + 1)
Evaluate the expression a++ as right = (temp_a2)
Access the variable 'a' for ++a and set temp_a1 = a
Set the value of a to (temp_a1 + 1)
Evaluate the expression ++a as left = (temp_a1 + 1)
Evaluate the expression expr = left + right
Set the value of a to (expr)

The compiler writer was given this flexibility because different processor architectures lend themselves to different approaches and an approach that works extremely well on one architecture might perform abysmally poorly on another, largely dependent on how many registers they have and what constraints there are as far as what can and can't be done with data in memory directly, between registers, or between registers and memory.
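To make the orderings above concrete, here is a sketch that replays three of them as ordinary, well-defined C using explicit temporaries. The function names are made up for illustration, and the starting value used in the note below is an assumption. A real compiler is free to pick any of these orders, or a different one entirely.

```c
/* Each function replays one legal ordering of the steps for
   a = ++a + a++ with explicit temporaries standing in for registers. */

/* Each side effect is applied right after its operand is read. */
int order_each_in_turn(int a)
{
    int temp_a1 = a;            /* access 'a' for ++a */
    int left = temp_a1 + 1;     /* value of ++a */
    a = temp_a1 + 1;            /* side effect of ++a */
    int temp_a2 = a;            /* access 'a' for a++ */
    int right = temp_a2;        /* value of a++ */
    a = temp_a2 + 1;            /* side effect of a++ */
    int expr = left + right;
    a = expr;                   /* the assignment is done last */
    return a;
}

/* Both reads happen before either side effect is applied. */
int order_reads_first(int a)
{
    int temp_a1 = a;
    int temp_a2 = a;
    a = temp_a1 + 1;            /* side effect of ++a */
    a = temp_a2 + 1;            /* side effect of a++ */
    int left = temp_a1 + 1;
    int right = temp_a2;
    int expr = left + right;
    a = expr;                   /* the assignment is done last */
    return a;
}

/* The assignment is stored first and the increments overwrite it. */
int order_increments_last(int a)
{
    int temp_a1 = a;
    int temp_a2 = a;
    int right = temp_a2;
    int left = temp_a1 + 1;
    int expr = left + right;
    a = expr;                   /* assignment */
    a = temp_a2 + 1;            /* side effect of a++ */
    a = temp_a1 + 1;            /* side effect of ++a is done last */
    return a;
}
```

Starting from a = 4, these return 10, 9, and 5 respectively, which is three different final values for the same source line.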
 

Thread Starter

Saumyojit

I don't understand where temp_a1 and temp_a2 come from. Why have you used so many variables? It confuses me which is right and which is left...
I would rather keep a++ simple. In my terms it would be:
(1) first the original value of a is used in the expression (++a + a++)
(2) then a = a + 1

for ++a (1): a = a + 1
(2): then the new value of a is used in the expression

Do I have to break it down into more variables for a deeper understanding? Is that why you used temp_a1 and temp_a2, or can we work with my two assumptions above?
 

djsfantasi

Joined Apr 11, 2010
9,156
Temp_a1 and temp_a2 actually exist during the evaluation of this statement. So you need to understand them in order to understand what the statement might be doing.

They only exist in the execution of the compiled code. They may be a register or two. They could exist on a stack. You don’t know nor can you control where they are.

The point being that during execution, there are intermediate values which are not stored in your code’s variables. The order of calculation and where they are stored are decided upon by the compiler authors.

This is why the concept of ‘sequence points’ becomes important. The language specification guarantees that the calculations are consistent at sequence points, but not between them.

Personally, I try to avoid complex statements with possible side-effects, unless I am 100% sure these intermediate values can NOT affect the result. I actually go out of my way to use separate statements so I control the sequence of operations and control any side-effects. I do NOT trust the compiler authors. That way, I have a better chance that my code will still work when a different compiler is used.
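One way to follow that advice, as a sketch: pick the order you intend and spell it out with one side effect per statement. The function name and the chosen order (increment, read both operands, increment again) are assumptions for illustration only, since the original one-line expression does not define any particular order.

```c
/* One side effect per statement: every ';' ends a full expression,
   so each update to *a is complete before the next line runs. */
int sum_pre_and_post_increment(int *a)
{
    *a += 1;              /* the "++a" part: increment first */
    int left = *a;        /* value after the pre-increment */
    int right = *a;       /* the "a++" part: read the current value */
    *a += 1;              /* then apply the post-increment */
    return left + right;
}
```

Called with a variable holding 4, this returns 10 and leaves the variable at 6 on every conforming compiler.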
 

WBahn

No, you cannot use your assumption because your assumption is not valid -- you are assuming that the compiler is going to do what you WANT it to do and the person that wrote the compiler is NOT constrained by your wishes.

Any expression has to be evaluated in small pieces and, in nearly all cases, the data is accessed from memory and placed into one of the registers in the processor (those are the temp variables). The results of applying the operators are also stored in registers (more temp variables) before being written back into the variables. When you have an expression that involves side effects, you have, in effect, two independent evaluations going on. One is the evaluation of the value that will become the value of the expression that is used in later computations in a larger expression and the other is the evaluation of the value that will be used to execute the side effect. These may or may not be the same value.

The behavior specified by the language is intentionally loose and only certain things HAVE to happen. There are many ways to implement that behavior and, for the things that are not specified or defined by the language, there can be several possible results. The specification is loose specifically so that the compiler writer can craft an implementation that exploits the capabilities and respects the limitations of the target hardware and thus results in the best performance to achieve the specified behavior by not having to worry about what happens for the unspecified or undefined behaviors. It is up to the programmer to write their code so that it does not invoke unspecified or undefined behavior.

C is an extremely powerful language that gives programmers extreme levels of control and, in exchange, gives the programmer plenty of rope with which to hang themselves. It thus requires the programmer to become very knowledgeable about exactly what the language does and does not specify and the self-discipline to live within the resulting bounds since the language will not do it for you. Many programmers are not sufficiently emotionally prepared to deal with this.
 

Thread Starter

Saumyojit

Access the variable 'a' for ++a and set temp_a1 = a
Evaluate the expression ++a as left = (temp_a1 + 1)
Set the value of a to (temp_a1 + 1)
Access the variable 'a' for a++ and set temp_a2 = a
Evaluate the expression a++ as right = (temp_a2)
Set the value of a to (temp_a2 + 1)
Evaluate the expression expr = left + right
Set the value of a to (expr)

I seriously don't understand left and right, or any of these steps. I know that this is the breakdown of your expression a = ++a + a++;
Is ++a named 'left' because it is on the left side of the arithmetic plus, or because ++ is to the left of a?

Temp_a1 and temp_a2 may be a register or two.
Can variables be stored in registers automatically even if I don't declare them with the register keyword? I am asking just for the concept.

@WBahn says: The execution of a program is nothing more than a bunch of tasks happening one after another -- a sequence of events. Some of those tasks have side effects, such as changing the value stored in a variable. Imagine all of these tasks that happen written on a long list with perhaps a thousand items on it. Highlight all of the tasks on the list that have side effects in yellow. Now draw a red line between any two of the tasks on the list and do this randomly until you have, say, twenty lines. These lines are the sequence points. Between any pair of sequence points, if there are any tasks highlighted in yellow then those side effects can be applied at any time between the bounding sequence points, but they cannot be applied prior to the earlier sequence point and must be applied before reaching the later one.

Can I see this visually and in simple words, with an example that uses many different operators in a long list?


@djsfantasi
This is why the concept of ‘sequence points’ becomes important. The language specification guarantees that the calculations are consistent at sequence points, but not between them.

I still don't get this term, sequence points.
 

djsfantasi

A sequence point is a place in the code where the calculations are consistent. I.e., there are no temporary results and all side effects have been calculated.
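A small sketch of what that means in practice, using two sequence points that standard C defines: the end of a full expression (the `;`) and the `&&` operator. The function names are hypothetical.

```c
/* The ';' after a++ is a sequence point: the increment is complete
   before the next statement reads a. */
int read_after_statement(int a)
{
    a++;
    int b = a;      /* always sees the incremented value */
    return b;
}

/* && is also a sequence point: the left operand yields the old value,
   its side effect is then applied, and only after that is the right
   operand evaluated. */
int and_sees_increment(int a)
{
    int before = a;
    return (a++ == before) && (a == before + 1);
}
```

Called with 4, the first returns 5 and the second returns 1 (true), and neither depends on the compiler, because no variable is modified twice between sequence points.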
 

WBahn

There's a limit to how simple this concept can be made and we are rapidly approaching it.

Consider how YOU evaluate an expression like:

x = a * b + c * d

You multiply a * b and then have to remember it somehow. So you write it down on a piece of paper and set that aside for the moment. Let's call that temp1.

Next you multiply c * d and now you have to remember that somehow. So you write it down on another piece of paper. Let's call that temp2.

Now you can add what is written down on temp1 and temp2 to get the final result.
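The same bookkeeping written out in C, as a sketch. The function name is hypothetical, and the two locals play the role of the pieces of paper; a compiler would typically keep them in registers.

```c
/* x = a * b + c * d with the intermediate results made explicit. */
int sum_of_products(int a, int b, int c, int d)
{
    int temp1 = a * b;   /* first product, set aside */
    int temp2 = c * d;   /* second product, set aside */
    return temp1 + temp2;
}
```

For example, with a = 2, b = 3, c = 4, d = 5 the temporaries hold 6 and 20, and the result is 26.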

Now imagine a processor architecture in which the only thing you can do with general purpose memory (RAM) is load a value from memory into a CPU register or store a value from a CPU register into memory. Nothing else. So you might have

LOAD reg ram // load the contents of the given RAM cell into the given register.
STORE ram reg // store the contents of the given register into the given RAM cell

All of your operation operands must be stored in registers and the result must be written to a register. So you might have

ADD R2 R1 R0 // Add the contents of registers R1 and R0 and store the result in register R2.

So this expression might translate to something like:

LOAD R0 a
LOAD R1 b
LOAD R2 c
LOAD R3 d
MULT R4 R0 R1
MULT R5 R2 R3
ADD R6 R4 R5
STORE x R6

Now, your processor might not even have seven registers; I've worked with processors that only have one, while others have 32. But let's assume that we have plenty of registers available.

So how about
b = a++;

That could be implemented as

LOAD R0 a
STORE b R0
INC R0 // Increment the contents of the register
STORE a R0

Or it could be implemented as

LOAD R0 a
ADD R1 R0 1 // Add the literal (immediate) value 1 to R0 and store result in R1
STORE a R1
STORE b R0

Now change this to

a = a++;

Our two implementations are

LOAD R0 a
STORE a R0
INC R0
STORE a R0

and

LOAD R0 a
ADD R1 R0 1
STORE a R1
STORE a R0

The first loads the value of 'a' into a register and then writes it to the target of the overall expression (which is 'a') and then increments the value and writes that to the target of the side effect (which is also 'a'). The result is that the final value of 'a' is one more than the original value.

The second loads the value of 'a' into a register and then adds 1 to it and stores the result in a different register before storing it into the target of the side effect (which is 'a'). The original value of 'a' is then written to the target of the overall expression (which is also 'a'). The result is that the final value of 'a' is equal to the original value.
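The two instruction sequences for a = a++ can be replayed in well-defined C, with plain local variables standing in for registers R0 and R1. This is only a sketch of the hypothetical machine described above, and the function names are invented for illustration.

```c
/* First sequence: store the expression value, then the incremented value. */
int assign_then_increment(int a)
{
    int r0 = a;      /* LOAD R0 a */
    a = r0;          /* STORE a R0 (the assignment) */
    r0 = r0 + 1;     /* INC R0 */
    a = r0;          /* STORE a R0 (the side effect) */
    return a;
}

/* Second sequence: store the incremented value, then the expression value. */
int increment_then_assign(int a)
{
    int r0 = a;          /* LOAD R0 a */
    int r1 = r0 + 1;     /* ADD R1 R0 1 */
    a = r1;              /* STORE a R1 (the side effect) */
    a = r0;              /* STORE a R0 (the assignment) */
    return a;
}
```

Starting from 4, the first returns 5 and the second returns 4, matching the two outcomes described above.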
 

Thread Starter

Saumyojit

@MrChips wrote:
If the second operator (a--) is performed first
a = 4
(a--) = 4 because a has not yet been updated?

When is a updated? After the operator or after the assignment to result?

Can I ask why you put a question mark on the line "(a--) = 4 because a has not yet been updated"?

Isn't it obvious that with postfix, a gets updated later and the original value is used first?
 

WBahn

No, it is NOT obvious because that is NOT what the language definition requires, demands, suggests, implies, or anything else. It is undefined specifically so that the compiler writer can choose when THEY want to update it and they are under NO obligation to do it some way that makes you happy.

All that postfix versus prefix determines is the value of the expression (namely, whether it returns the value that 'a' had when it accessed that variable or one more than that value). This is completely and totally separate from when the value of 'a' is changed as a result of executing the side effect. The language standard only requires that the change NOT happen sooner than the prior sequence point, and that it NOT happen later than the subsequent sequence point. Other than that, it can happen at ANY time with the only caveat being that each side effect on 'a' has to happen after 'a' was accessed (since the processor that the program is running on is not clairvoyant).
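The difference in the value of the expression can be seen in isolation with a sketch like this one, where each operator is the only thing touching the variable in its statement, so the behavior is fully defined. The function names are hypothetical.

```c
/* Postfix: the expression's value is the value read from *a;
   the stored value ends up one greater. */
int value_of_postfix(int *a)
{
    return (*a)++;
}

/* Prefix: the expression's value is one more than the value read
   from *a; the stored value ends up the same as that result. */
int value_of_prefix(int *a)
{
    return ++(*a);
}
```

Starting with a variable holding 4, value_of_postfix returns 4 and leaves 5 behind, and a following value_of_prefix returns 6 and leaves 6 behind. In both cases the return statement is a full expression, so the update is guaranteed to be finished by the time the caller looks.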

I have given you several examples of how this can result in different results while adhering to the language specifications. But you insist that things MUST happen in the order that YOU want them to happen in. Until you get over that, don't program in C (or, really, in any language since all languages have such behaviors, but C is probably the worst in this sense because it implicitly assumes that people programming in that language are past such foolish notions).
 

Thread Starter

Saumyojit

If I look at a-- in a situation that is not undefined, suppose the expression is:
int a = 4;
result = a-- + c;

Then first the original value of a, which is 4, will be used for a--, and after that a will be 3.
 

WBahn

Sure. If you don't invoke undefined behavior, then the behavior is as it is defined. Not a big surprise.
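As a sketch of that defined case: here the variable is modified once and not read anywhere else in the expression, so every conforming compiler gives the same answer. The function name is hypothetical, and the value used for c in the note below is an assumption, since the post never gives one.

```c
/* result = a-- + c, wrapped in a function so the caller can inspect
   both the result and the updated variable. */
int post_decrement_sum(int *a, int c)
{
    int result = (*a)-- + c;   /* uses the old value of *a */
    return result;             /* *a is one less by this point */
}
```

With a = 4 and c = 10, the result is 14 and a is 3 afterwards.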
 