RESULT OF EXPRESSION

Thread Starter

Saumyojit

Joined Feb 20, 2020
14
#include <stdio.h>
int main()
{
int a =4;
int result=++a + a--;
printf("%d",result);
}

RESULT: If I dry-run this going from left to right, the result is 8;
going from right to left, the result is 10.
Neither is the correct answer. Why is that, and what is the logic?
 

WBahn

Joined Mar 31, 2012
25,290
There is no correct answer.

You are invoking undefined behavior, so ANY behavior is equally valid -- generating a compile time error, generating a compile time warning, producing a run time error, starting a global thermonuclear war, or doing exactly what you thought it should.

The underlying problem is that the C language specification defines "sequence points" which are points during program execution at which all side-effects before it must have been applied and at which all side-effects after it must not have been applied. But between sequence points the order in which side-effects are applied is undefined. The compiler writer is given complete freedom to handle this however they see fit, which enables them to leverage the strengths and weaknesses of the underlying platform to squeeze performance out of the machine -- something that is very desirable in a language intended for system-level programming.

Your code has three side effects on the line where you set the value of 'result': changing the value of 'a' due to the pre-increment operation, changing the value of 'a' due to the post-decrement operation, and changing the value of 'result' due to the assignment operation. The compiler is free to do these in any order it chooses.

EDIT: Fix typos and clean up the grammar a bit.
 

nsaspook

Joined Aug 27, 2009
6,951
Your code with gcc: cc (Debian 9.2.1-30) 9.2.1 20200224

All warnings

# cc -Wall lo.c
lo.c: In function 'main':
lo.c:5:19: warning: operation on 'a' may be undefined [-Wsequence-point]
5 | int result=++a + a--;
| ~^~

Result: 9
 

WBahn

Joined Mar 31, 2012
25,290
Order of operations require that this line parses as if it were:

result = (++a) + (a--);

That's the box you have to work within. The binary addition CANNOT be applied before either of the other two.

The value of 'a' will be changed twice and these changes can occur at any time and in any order, so you have to allow for all possibilities (and that's assuming that the compiler writer didn't play some game that you can't even imagine -- remember, undefined behavior is exactly that).

If the side-effects are deferred until after evaluation of the expression, then (++a) will evaluate to 5 and (a--) will evaluate to 4, making 'result' equal to 9, regardless of whether you evaluate left-to-right or right-to-left. But what about the value of 'a'? The increment wants to set it to 5 and the decrement wants to set it to 3, so it will be whichever of the two is done last.

If you assume that side-effects are applied as they are encountered, then if you go left-to-right you get (++a) evaluates to 5 and changes 'a' to 5 and then you get (a--) which evaluates to 5 and changes 'a' to 4. So that would give 'result' a value of 10 and 'a' a value of 4.

If you assume that side-effects are applied as they are encountered, then if you go right-to-left you get (a--) evaluates to 4 and changes 'a' to 3 and then you get (++a) which evaluates to 4 and changes 'a' to 4. So that would give 'result' a value of 8 and 'a' a value of 4.

There are many other possibilities, but these are perhaps the three most likely.
 

WBahn

Joined Mar 31, 2012
25,290
Note that you should avoid undefined behavior, pretty much at all costs. Do not experiment and figure out what the compiler does and then leverage that knowledge in your code. Yes, been there, done that, got burned because I had never heard of undefined behavior (that is how I learned about it), the notion of side effects (that is how I learned about them), or the concept of sequence points (it would be many years before I learned about them).

IF you wrote the compiler or IF the compiler writer explicitly defined what that behavior is for THAT version of THAT compiler, then you are probably safe. But the compiler writer is completely free to change the behavior in the next version. Even if you never plan to use another compiler or upgrade the one you have, you can't assume that your experiments have revealed everything. The compiler writer might have done something very different for expressions with fewer than N terms than they did for expressions with more than N because of the practicality of putting things into registers when there are just a few terms. For similar reasons, they might deal with side effects for early terms different than they do later terms.

So learn what things result in undefined behavior and then religiously stay the hell away from them.

Probably the biggest source of undefined behavior for most C programmers is using a variable more than once in a statement that has side effects involving that variable.

About the only exception that I can think of right now would be the normal assignment statement such as

a = a + a;

Although this has a side effect that changes the value of 'a', it can't be applied until after the expression is evaluated, so this is safe (and it would be a really bad thing if it weren't).

But if you do something like

b = a + (a = 3);

Now you have invoked undefined behavior. The second 'a' will definitely evaluate to 3, but the first 'a' could evaluate to the old value of 'a', or the new value of 'a'.
 

Thread Starter

Saumyojit

Joined Feb 20, 2020
14
The underlying problem is that the C language specification defines "sequence points" which are points during program execution at which all side-effects before it must have been applied and at which all side-effects after it must not have been applied.
A sequence point defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed

Please explain this line with some examples, including some different operators (++, bitwise AND &, ...).

I don't get what "sequence point" means.
 

WBahn

Joined Mar 31, 2012
25,290
There is a lot of information about sequence points out there on the Internet for the asking. Just Google "sequence points" and start reading.

The meaning is exactly as it says. The execution of a program is nothing more than a bunch of tasks happening one after another -- a sequence of events. Some of those tasks have side effects, such as changing the value stored in a variable. Imagine all of these tasks that happen written on a long list with perhaps a thousand items on it. Highlight all of the tasks on the list that have side effects in yellow. Now draw a red line between any two of the tasks on the list and do this randomly until you have, say, twenty lines. These lines are the sequence points. Between any pair of sequence points, if there are any tasks highlighted in yellow, then those side effects can be applied at any time between the bounding sequence points, but they cannot be applied prior to the earlier sequence point and must be applied before reaching the later one.

The language standard defines where the sequence points are.
 

MrChips

Joined Oct 2, 2009
20,329
result = (++a) + (a--);

This is a thought process and the results should never be relied on in any real engineering application.

Given a = 4 initially

(++a) = 5

(a--) = 3
but it really depends on how the compiler performs its operations.

Hence, in conclusion, the result is indeterminate.
 

MrChips

Joined Oct 2, 2009
20,329
The question is, "which value of the variable a is being used in the calculation?"

There are many possible scenarios which would result in different outcomes.

Suppose the variable a is in global application memory.
Now suppose the calculation uses a CPU register to temporarily hold the value of a. That in itself causes a problem.

Now add to that the fact that (++a) is pre-increment operator while (a--) is post-decrement.
Which operator is performed first?

If we are to assume that (++a) is performed first then
a = 4
(++a) = 5

What value of a is used in the next operator (a--)?
Has the value of a been updated in memory?
Is the value in the holding register equal to 4 or 5?
What if two different registers were initialized by the compiler for the two separate operations?

If the second operator (a--) is performed first
a = 4
(a--) = 4 because a has not yet been updated?

When is a updated? After the operator, or after the assignment to result?
 

Thread Starter

Saumyojit

Joined Feb 20, 2020
14
Why did @MrChips say this: "
(++a) is pre-increment operator while (a--) is post-decrement.

Which operator is performed first?
"

https://en.cppreference.com/w/c/language/operator_precedence#cite_note-1 According to this link, the unary operators have higher precedence than arithmetic addition, so ++ and -- will operate first. Now a-- has higher precedence than ++a, so a-- gets executed.


the calculation uses CPU register to temporarily hold the value of a. That in itself causes a problem
How is that a problem? Please elaborate.
 

MrChips

Joined Oct 2, 2009
20,329
What is the point of continuing this discussion?

https://en.cppreference.com/w/c/language/eval_order

Undefined behavior
1) If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object, the behavior is undefined.
i = ++i + i++; // undefined behavior
i = i++ + 1; // undefined behavior
f(++i, ++i); // undefined behavior
f(i = -1, i = -1); // undefined behavior
This has clearly been identified as undefined behavior.

unspecified behavior - two or more behaviors are permitted and the implementation is not required to document the effects of each behavior. For example, order of evaluation, whether identical string literals are distinct, etc. Each unspecified behavior results in one of a set of valid results and may produce a different result when repeated in the same program.

https://en.cppreference.com/w/c/language/behavior#UB_and_optimization
 

MrChips

Joined Oct 2, 2009
20,329
Let us make the problem a little simpler.
Suppose the problem is
A = 4;
B = 4;
result = (++A) + (B--);

Clearly,
result = 5 + 4 = 9

whereas your stated problem is declared as undefined behavior, since we cannot be certain of the behavior of a.
 

Thread Starter

Saumyojit

Joined Feb 20, 2020
14
There is no concept of left-to-right or right-to-left evaluation in C, which is not to be confused with left-to-right and right-to-left associativity of operators: the expression f1() + f2() + f3() is parsed as (f1() + f2()) + f3() due to left-to-right associativity of operator+, but the function call to f3 may be evaluated first, last, or between f1() or f2() at run time.

"what is the difference between associativity and evaluation?"

{If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object } ??

I also did not understand the line above.

result = ++a+ a--;
In the precedence table at the link I mentioned, it is clearly stated that post-decrement should be evaluated first, then pre-increment.
 

WBahn

Joined Mar 31, 2012
25,290
result = (++a) + (a--);

This is a thought process and the results should never be relied on in any real engineering application.

Given a = 4 initially

(++a) = 5

(a--) = 3
but it really depends on how the compiler performs its operations.

Hence, in conclusion, the result is indeterminate.
What assumptions are you making that result in (a--) evaluating to 3? This would require that at the time it was evaluated, the value stored in a was 3.
 

WBahn

Joined Mar 31, 2012
25,290
https://en.cppreference.com/w/c/language/operator_precedence#cite_note-1 According to this link, the unary operators have higher precedence than arithmetic addition, so ++ and -- will operate first. Now a-- has higher precedence than ++a, so a-- gets executed, which gives 4; after the post-decrement a is 3, and ++a gives 4, so the result is 8. And in this line, int result=++a + a--; where is the sequence point?
There IS no sequence point within that line -- THAT's why the writer of the compiler has the option to apply the side effect in whatever order and at whatever time they choose.

It is not that -- has higher precedence than ++. The suffix increment/decrement operators have higher precedence than the prefix increment/decrement operators.

But that's immaterial here as precedence and associativity only apply when the two operators in question have a shared operand.

For instance

y = a + b * c - d;

The + and * have b as a shared operand, so precedence says that the * must be done before the +.
The * and the - have c as a shared operand, so precedence says that the * must be done before the -.

So we can put parens around the b * c without changing anything.

y = a + (b * c) - d;

The + and - have (b * c) as a shared operand, so associativity says that the + must be done before the -.

y = (a + (b * c)) - d;

Notice how the parens force the evaluation to be *, then +, then -.

Now consider

y = a * b + c / d;

The * and + have b as a shared operand, so precedence says that the * has to be done before the +.
The + and / have c as a shared operand, so precedence says that the / has to be done before the +.

y = (a * b) + (c / d);

At this point there are no more shared operands.

Notice that the parens do not dictate whether the * is done before the /. In fact, there is no way to put parens in this expression so as to dictate that this happen. In fact, the compiler writer is completely free to evaluate the operands to the + in either order.

But even if it did require the ++a to be evaluated before the a--, that does not resolve the ambiguity because the evaluation of the expression is completely separate from the application of the side effect (changing the value stored in the variable 'a').
 

Thread Starter

Saumyojit

Joined Feb 20, 2020
14
What is the difference between associativity and evaluation?

What is the meaning of this line, in simple words: {If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object}?
 

WBahn

Joined Mar 31, 2012
25,290
Associativity has to do with the order in which operations are applied. Evaluation has to do with evaluating an expression.

x = (expression#1) + (expression#2) + (expression#3);

Associativity requires that I evaluate (expression#1) + (expression#2) before adding the result of that evaluation to (expression#3).

But I can evaluate the three expressions in any order I want. Clearly I have to evaluate (expression#1) & (expression#2) before I can evaluate (expression#1) + (expression#2), but there is NOTHING that prevents me from evaluating (expression#3) first, then (expression#2), and finally (expression#1). Or, I can hold off evaluating (expression#3) until after I evaluate (expression#1) + (expression#2). My choice.

Unsequenced simply means that they lie between the same set of sequence points.

Imagine you give someone a list of things to do.

List #1
Drop off the prescriptions. (side effect: none)
Pick up dog food. (side effect: money in wallet goes down)
Get gas in the car. (side effect: money in wallet goes down)
Withdraw $50 from the ATM. (side effect: money in wallet goes up)

List #2
Pick up the prescriptions. (side effect: money in wallet goes down)
Pay the utility bill. (side effect: money in wallet goes down)
Pick up pizza for dinner. (side effect: money in wallet goes down)

The rules are that everything on List #1 must be accomplished before anything on List #2 can be done. But, within a list, they can be done in any order.

Assuming you start out with enough (let's say just enough) money in your wallet to pay for the dog food and the gas, it doesn't matter in which order you do those four things, but it is critical that you withdraw the money before you do anything in the second list, otherwise you will not have enough money to pay for the purchase. It's also critical that dropping off the prescription be done before picking the filled prescriptions up.

If everything were on a single list, then you would be free to do them in any order. You could pick up the prescriptions first and withdraw the money last. Either of those causes problems.

That's what sequence points do -- they break your program up into lists of instructions. Within each list (between any two consecutive sequence points), the compiler writer is free to apply the side effect in any order they want, including all right at the beginning or all right at the end.
 

MrChips

Joined Oct 2, 2009
20,329
Very often I want to do a calculation such as:

R = A / B * C;

Assume that all variables are 16-bit unsigned integers and values do not result in overflows.
I want to make sure that I get the best precision in the result R.
Since I have no idea how the compiler will choose the order of operations, I force the compiler to do it in the manner I want.
Hence I would break up the calculation as follows.

R = A * C;
R = R / B;

To be honest, I don't even know if doing this is necessary or what an optimizing compiler would choose to do. The only way to find out is to look at the ASM code generated by the compiler. Time to go check this out.
 