Is there a robust way to implement an ASM table-like structure in C?

Thread Starter

Robin66

Joined Jan 5, 2016
275
Hi, I'm using xc16 on PIC24F and I have a block of code that could be implemented in a much nicer fashion if I were using asm. Is there a trick I'm missing in C that can replicate the asm efficiency?

My C code is fast if phase==0 but gets progressively slower up to phase==5, because every earlier condition needs to be evaluated along the way.
Code:
if(phase==0)
  LATAbits.RA0 = 1;
else if(phase==1)
  LATCbits.RC7 = 1;
else if(phase==2)
  LATBbits.RB2 = 1;

Etc…
ASM code is fast for all phases, because the program counter jumps straight to the appropriate instruction.
Code:
phase = phase*2;

Add PCL, phase
BSF LATA, 0
GOTO done
BSF LATC, 7
GOTO done
BSF LATB, 2
GOTO done
Etc…
done:
 

Papabravo

Joined Feb 24, 2006
21,228
Depends entirely on the compiler, but the case statement has numerous opportunities for optimization.
C:
switch(phase)
{
  case 1: /* code for case 1 */ break;
  case 2: /* code for case 2 */ break;
  case 3: /* code for case 3 */ break;
  default: /* code for none of the above */ break;
}
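Applied to the latch example from the first post, the switch might look like the sketch below. The function name set_phase_output and the plain variables standing in for LATA, LATB and LATC are assumptions so that it builds on a host; on the PIC24F the registers would come from the device header. Whether xc16 turns this into a jump table is not established here, so the disassembly still needs checking.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the PIC24F latch registers so the sketch
   builds on a host; on the target these come from the device header. */
static volatile uint16_t LATA, LATB, LATC;

/* Set one output bit per phase. The case labels are consecutive (0, 1, 2)
   and every case ends in break, which is the pattern a compiler can turn
   into a jump table instead of a chain of compares. */
void set_phase_output(unsigned phase)
{
    switch (phase) {
    case 0:  LATA |= (uint16_t)(1u << 0); break;
    case 1:  LATC |= (uint16_t)(1u << 7); break;
    case 2:  LATB |= (uint16_t)(1u << 2); break;
    default: /* out-of-range phase: leave the outputs alone */ break;
    }
}
```

Calling set_phase_output(1) sets bit 7 of the LATC stand-in and leaves the other two untouched.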
 

Thread Starter

Robin66

Joined Jan 5, 2016
275
Ok, I wouldn't expect that to compile to anything faster than the if..else block. For the optimizer to turn this into the desired asm it would need to look at each case statement, realise that their value is incrementing by 1 each time, and that each case then breaks out to the end.... seems a big ask of the optimiser.

I'll try it tonight tho and check the disassembly.
 

nsaspook

Joined Aug 27, 2009
13,315
As recommended above, use a switch, because under the hood it's just a structured goto (a base address plus an offset value). Forget the books and the naysayers: if you are a good programmer, you can also use the C goto for code like this when you need to short-circuit function execution and run cleanup.
 
It's fundamentally different...
Case:
calculate a pointer offset from the ordinal argument (just simple maths), then jump to a predefined address.

Nested ifs can call many and varied math/logic routines, and the whole chain is evaluated every time until a true condition is found.
If the first if is true then there probably isn't much in it; after that, the processor time taken grows roughly linearly with the number of conditions that have to be tested.

Al
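The "offset from the ordinal argument, then jump" idea can also be written out by hand in C as a table of function pointers, so the dispatch does not depend on how a switch gets compiled. This is only a sketch: the handler names, dispatch_phase, and the host stand-ins for the latch registers are all assumptions, and the indirect call has its own overhead that would need measuring on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host stand-ins for the PIC24F latch registers. */
static volatile uint16_t LATA, LATB, LATC;

/* One small handler per phase. */
static void phase0(void) { LATA |= (uint16_t)(1u << 0); }
static void phase1(void) { LATC |= (uint16_t)(1u << 7); }
static void phase2(void) { LATB |= (uint16_t)(1u << 2); }

typedef void (*phase_handler)(void);

/* The table index is the phase number, so lookup is a single indexed load. */
static const phase_handler handlers[] = { phase0, phase1, phase2 };

#define NUM_PHASES (sizeof handlers / sizeof handlers[0])

/* Call the handler for this phase.
   Returns 0 on success, -1 if phase is outside the table. */
int dispatch_phase(size_t phase)
{
    if (phase >= NUM_PHASES)
        return -1;
    handlers[phase]();
    return 0;
}
```

The range check costs one compare per call, but it stops an unexpected phase value from jumping through an address past the end of the table.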
 

Papabravo

Joined Feb 24, 2006
21,228
Not really. You have identified the critical features enabling optimization, and the smart compiler writers have also.
 

Stuntman

Joined Mar 28, 2011
222
I use a switch in these types of situations, but as previously mentioned, it may depend on exactly how time critical you need the routine to be.

From a different perspective: if your concern is execution consistency rather than speed, why not replace each "else if" with a plain if statement? This approach assumes a few things, such as that the test value cannot satisfy two conditions and that the variable being tested is not modified during execution of an earlier if statement.
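A minimal sketch of that idea, using hypothetical host stand-ins for the latch registers and an assumed function name. Every comparison is written to run on every call, so the source-level path is the same length for each phase; the optimizer is still free to restructure it, so the timing claim needs checking in the disassembly.

```c
#include <stdint.h>

/* Hypothetical host stand-ins for the PIC24F latch registers. */
static volatile uint16_t LATA, LATB, LATC;

/* Independent if statements: no else, so no test is skipped because an
   earlier one matched. phase is never modified inside the function, and
   the values are mutually exclusive, so at most one body runs. */
void set_phase_output_flat(unsigned phase)
{
    if (phase == 0) LATA |= (uint16_t)(1u << 0);
    if (phase == 1) LATC |= (uint16_t)(1u << 7);
    if (phase == 2) LATB |= (uint16_t)(1u << 2);
}
```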
 

Thread Starter

Robin66

Joined Jan 5, 2016
275
I spent all last night debugging a FET driver. I will try the switch before the weekend and report back.

The goal is for the code to be consistently fast so I'll see how they compare. A series of if statements actually might be better on avg than the if..else block because it can use the skip-if-zero instruction. Worth a try. Thanks for all the feedback. It's given me 2 alternatives to try.
 

Papabravo

Joined Feb 24, 2006
21,228
As you investigate this beware of jumping to a conclusion based on a single test case. With only two or three alternatives the code might look the same as the code for an if-else. There might be some minimal number of alternative cases before the code generator/optimizer generates the code you are looking (hoping?) for. Don't forget to try all the compiler configuration options for doing optimization.
 

WBahn

Joined Mar 31, 2012
30,088
But relying on a compiler to always optimize code a particular way, even if it is capable of doing it, is very dangerous. These kinds of optimizations rely on the compiler to recognize a particular code pattern. It is quite easy to make a minor change to the code that, to the writer, seems inconsequential but that makes it so that the compiler fails to see the required pattern.
 

WBahn

Joined Mar 31, 2012
30,088
Use a lookup table

Code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{

   unsigned LATAbits = 0;
   unsigned LATBbits = 0;
   unsigned LATCbits = 0;
 
   unsigned *latch[] = {&LATAbits, &LATCbits, &LATBbits};
   unsigned  bits[]  = {  0,  7,  2};
 
   printf("A: %u\n", LATAbits);
   printf("B: %u\n", LATBbits);
   printf("C: %u\n", LATCbits);

   int phase = 1;
   printf("phase = %i\n", phase);

   *latch[phase] |= 1 << bits[phase];
 
   printf("A: %u\n", LATAbits);
   printf("B: %u\n", LATBbits);
   printf("C: %u\n", LATCbits);
 
   return EXIT_SUCCESS;
}
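The same lookup-table idea, moved a step closer to register-style code, might look like the sketch below. It assumes 16-bit latch registers accessed through volatile pointers and stores a precomputed mask instead of a bit number, so no shift is needed at run time. The struct, the table name, set_phase_pin, and the host stand-ins for LATA, LATB and LATC are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host stand-ins for the PIC24F latch registers. */
static volatile uint16_t LATA, LATB, LATC;

/* One table entry per phase: which register, and which bit mask to OR in. */
struct phase_pin {
    volatile uint16_t *latch;
    uint16_t mask;
};

static const struct phase_pin phase_pins[] = {
    { &LATA, 1u << 0 },  /* phase 0: RA0 */
    { &LATC, 1u << 7 },  /* phase 1: RC7 */
    { &LATB, 1u << 2 },  /* phase 2: RB2 */
};

/* Set the output bit for this phase.
   Returns 0 on success, -1 if phase is outside the table. */
int set_phase_pin(size_t phase)
{
    if (phase >= sizeof phase_pins / sizeof phase_pins[0])
        return -1;
    *phase_pins[phase].latch |= phase_pins[phase].mask;
    return 0;
}
```

Every valid phase takes the same path through the function: one bounds check, one table lookup, one read-modify-write of the selected register.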
 

Thread Starter

Robin66

Joined Jan 5, 2016
275
Oh wow, thanks @WBahn. I'll try this and check out the disassembly. I like what you did with the left bit shift to OR the correct bit of the addressed LAT register.
 

Papabravo

Joined Feb 24, 2006
21,228
Of course. "Trust, but verify" is good advice for compilers and arms control as well.
 

WBahn

Joined Mar 31, 2012
30,088
But this would mean that you would need to verify that it optimized the code just the way you wanted it to EVERY time you recompile it.

Far better to just write the code so that it does what you are trying to achieve whether it is optimized in a particular way or not.
 

Papabravo

Joined Feb 24, 2006
21,228
Your concern is a bit overblown. You might check it if you make a change to the number of entries, or when you upgrade to a new version of the compiler, but hey, without a specific example we're just trying to hit a target blindfolded.
 

WBahn

Joined Mar 31, 2012
30,088
No, I'm not overblowing the concern.

We wrote some macros that did bit rotations. Since C doesn't have a bit rotation operator, we had to brute-force it by doing a fractional left shift combined with a fractional right shift. The Xcode compiler, much to our surprise, recognized the pattern as a rotate and implemented it using the hardware's rotate instruction. We were thrilled, because this was critically important code that needed to run very fast. It was running faster than we thought it should be able to, so after a lot of experimentation, and finally looking at the object code, we discovered that it was using the rotate instruction. We were quite impressed.

Everything was going fine until we made a change in a file that didn't even include these macros and that had nothing to do with the speed-critical code. We had made a lot of code enhancements since the last time we stress-tested the critical part, but nothing we had done affected those macros (or any of the code that called them). But when we went to show some people the code, it was now running a lot slower. Pretty quickly we looked at the object code and discovered that it wasn't using the rotate. After a lot of head scratching, since we had changed none of the code that should affect the compilation of that part, we started undoing things, and as soon as we reverted to the previous version of that unrelated file it started optimizing the rotate again.

Over the next few months we went through this several times. We never figured out why this was happening, but we put in a little benchmarking routine that tested the speed of the bottleneck code whenever we launched the program.
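For reference, the shift-and-OR rotate described above is commonly written along these lines. This is a sketch, not the original macros: the name rotl32 and the 32-bit width are assumptions. Whether a given compiler maps it to a hardware rotate instruction is exactly the kind of behaviour that can vary between builds.

```c
#include <stdint.h>

/* Rotate x left by n bits using two shifts and an OR.
   Masking both shift counts keeps them in the range 0..31, which avoids
   the undefined behaviour of shifting a 32-bit value by 32 when n is 0. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

For example, rotl32(0x12345678u, 8) yields 0x34567812: the top byte wraps around to the bottom.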
 

Papabravo

Joined Feb 24, 2006
21,228
Not every situation will be similar to yours. In many cases the root cause of the problem can be identified and dealt with. Do you consider that experience the norm, or just an outlier? If it is the norm, then maybe you should consider different tools, or get a hotline to the compiler developer.
 

WBahn

Joined Mar 31, 2012
30,088
I didn't say every situation is going to be like that. Did I?

I said relying on the optimization to always work a specific way is dangerous for critical pieces of code and that it is better to write the code so that it achieves what you want directly.

What is wrong with that statement?

Is it incorrect to bring that to the attention of the TS?

Even if the compiler reliably does the optimization you are counting on, it is very poor practice. Unless you thoroughly document the code pointing out this reliance, future users of the code (including yourself) are highly likely to not use that exact compiler with that exact optimization configured and then have no clue why the code doesn't perform as needed. Even if you do document the code adequately, you have severely restricted the utility of the code. You yourself stated that the compiler might not behave the same unless the code meets some criteria, such as the number of alternatives in a switch() statement, so now future users must be made aware of this and must perform tests anytime they change the number of alternatives. They also can't use any compiler configurations except the ones that perform this particular optimization. And if the next version of the compiler doesn't happen to do this particular optimization in this particular configuration, then they have to go through a bunch of options to hopefully find one that does (and hope that that configuration still does all of the other optimizations that you were relying on for acceptable behavior). If none of them do, then they can't upgrade their compiler or they have to try other compilers hoping that they can find one that exhibits just the right set of magic optimizations.

Or, they could do what you should have done to begin with and rewrite the code so that it does not rely on a specific compiler optimization to meet any critical performance requirement.
 