takao21203
- Joined Apr 28, 2012
- 3,702
That was the approach 20 or 25 years ago: optimize individual instructions and count cycles.
Since the Pentium era, the answer has been MMX and the other streaming extensions, and multiple cores.
Applications are no longer written in assembler; none of the mainstream stuff like Word, Skype, or Excel uses it.
Video drivers use some MMX, but to a large degree via intrinsics or special compilers rather than hand-written assembler.
Forget about these optimizations; they don't make a difference at the end of the day. Get your job done as quickly as possible, and if you don't get enough performance there are two approaches:
a) more powerful chip, higher clocking frequency, more memory
b) multiple controllers
b) might be favourable in some cases where you only want an 8-bit architecture but the MCU gets badly clogged by display refresh, for instance; so you do the refresh on a separate MCU in real time, and send it serial commands with a latency of tens of milliseconds.
You can scale up like they do with CPU cores in PCs: use 2, 4, 8, or more, just to keep everything on the same architecture (and the same electrical specs).
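The main-MCU-plus-display-MCU split comes down to a tiny serial protocol. Here is a minimal sketch of what the sending side might look like; the frame layout (start byte, opcode, length, payload, XOR checksum) and the function name are my own illustration, not any standard:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical frame: [0xA5][opcode][len][payload...][checksum],
   where the checksum is a plain XOR over opcode, len, and payload.
   The display MCU parses these at its leisure; tens of ms of
   latency is fine for a screen. */
#define FRAME_START 0xA5u

size_t build_frame(uint8_t *out, uint8_t opcode,
                   const uint8_t *payload, uint8_t len)
{
    uint8_t sum = (uint8_t)(opcode ^ len);

    out[0] = FRAME_START;
    out[1] = opcode;
    out[2] = len;
    for (size_t i = 0; i < len; i++) {
        out[3 + i] = payload[i];
        sum ^= payload[i];
    }
    out[3 + len] = sum;
    return (size_t)(4u + len);  /* total bytes to push out the UART */
}
```

You would then hand the returned buffer to whatever UART-write routine your toolchain provides; the main loop never waits on the display.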
Also, C programs are optimized, not down to the bone, but good programmers know what to do; some constructs are just painfully slow. The extra instructions from a free compiler are a minor trade-off: you don't get 4x speed just from optimizations when you already write good C code.
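As an example of a construct that is painfully slow regardless of compiler, here is a classic one (my own illustration, not from the post): re-evaluating strlen() in a loop condition, versus hoisting it out.

```c
#include <string.h>
#include <ctype.h>

/* Painfully slow: strlen() walks the whole string on every
   iteration, turning a linear scan into O(n^2). */
void upcase_slow(char *s)
{
    for (size_t i = 0; i < strlen(s); i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}

/* Same result in one pass: hoist the length. No assembler
   needed; even a free compiler does fine with this. */
void upcase_fast(char *s)
{
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

Fixing that kind of thing in the C source buys far more than hand-tuning the emitted instructions ever will.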
It's a marketing gag.
Look at the PIC32: you get very fast serial ports and things like DMA. You don't need assembler anymore for time-critical tasks.
If a PIC32 is still too slow, use a cluster of them, or another MCU with even more MHz.
When you look at the manual for the PIC32 and its assembler instruction set, you know it's a lost cause, and when you compare the price of a small PIC32 to an 8-bit 16F, you also know that optimizing a 16F assembler program for speed is a lost cause too.
I mean, do it for some months, but don't get absorbed by it. It no longer plays an important role in the embedded world.
1. Get your job done.
2. If you have extra time, optimize it. If not, just move on to the next project.