Testing Arduino code efficiency

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
I have an Arduino-compatible MCU (Adafruit QT Py, based on SAMD21). I thought of a couple different ways to write a function I want to use. What's the best way to test each one for clock-cycle efficiency? I read online I can use millis(), call it before and after and get the difference, but I also read that's not super accurate. I've also read about using digitalWrite() high before and low after (or vice-versa) and use an oscilloscope to read the pulse width, but that could be even less accurate. What're the best ways you know of?
 

click_here

Joined Sep 22, 2020
541
Call the function 100 times and work out the average time
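A minimal sketch of that averaging approach, under stated assumptions: `average_ticks_per_call` is a hypothetical helper, not part of any Arduino API, and the tick source is passed in so that something like `micros()` could be used on the board. It times the whole batch of calls and divides by the count, which spreads the timer's coarse resolution over many runs.

```cpp
#include <cstdint>

// Hypothetical helper, not part of any Arduino API.
// now  : callable returning the current tick count
//        (on an Arduino core, micros() is one candidate)
// fn   : the code under test
// runs : number of calls to average over (must be > 0)
template <typename Clock, typename Fn>
double average_ticks_per_call(Clock now, Fn fn, uint32_t runs) {
    const uint32_t start = static_cast<uint32_t>(now());
    for (uint32_t i = 0; i < runs; ++i) {
        fn();
    }
    const uint32_t end = static_cast<uint32_t>(now());
    // Unsigned subtraction stays correct across a single counter rollover.
    const uint32_t elapsed = end - start;
    return static_cast<double>(elapsed) / runs;
}
```

On the board this might be called as `average_ticks_per_call(micros, myFunction, 1000)`, where `myFunction` is a placeholder for the code under test. The figure includes loop overhead, and a compiler may remove calls whose results are never used, so writing each result to a `volatile` variable is a common precaution.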
 

Ya’akov

Joined Jan 27, 2019
5,657
Profiling is a good thing but are you having some sort of performance problem that requires optimization? Also, have you looked at the code generated by the compiler? Your source may produce something quite different than you expect.

Premature optimization is a common problem that wastes your cycles, and probably does nothing to help with the final product.

Not saying you are definitely doing it, just a heads-up that it is a common pitfall.
 

djsfantasi

Joined Apr 11, 2010
8,323
Making the signal setting faster by port manipulation won’t make your measurement more accurate. The errors introduced by any circuit lag will cancel each other out.
  • Event occurs at time t1 and you set the GPIO pin
  • The pin takes additional time, d, to assert and hence is seen at t1+d.
  • The event end occurs at time t2 and you set the GPIO pin
  • The pin takes additional time to assert and hence is seen at t2+d

The event time is ideally t2-t1. The observed time is (t2+d)-(t1+d), which can be rewritten as t2+d-t1-d, or t2-t1+d-d. Which is… drum roll… t2-t1.
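The cancellation can be written as a small model, sketched here with hypothetical helper names and arbitrary time units. The second helper is an assumption that goes beyond the post: the time spent inside the pin-setting call between the two edges is part of the pulse, so measuring an empty pulse once (pin set, then immediately cleared) gives a baseline that can be subtracted.

```cpp
#include <cstdint>

// Model of what the oscilloscope shows: both edges are delayed by the same
// pin latency d, so the observed width depends only on t1 and t2.
constexpr uint32_t observed_width(uint32_t t1, uint32_t t2, uint32_t d) {
    return (t2 + d) - (t1 + d);
}

// Hypothetical calibration step: subtract the width of an empty pulse
// from the measured pulse to estimate the time of the code in between.
constexpr uint32_t net_width(uint32_t measured, uint32_t empty_pulse) {
    return measured - empty_pulse;
}
```

For example, with t1 = 100, t2 = 350 and d = 7 the model gives a width of 250, the same as t2-t1.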
 

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
I don't have any performance problems, but I want to turn my code into a library, so I want to optimize it so it won't cause performance problems for anyone else down the line. As for the compiled code, no, I haven't looked at it. What exactly do you mean? Like which one takes up more space/RAM on the chip?
 

Ya’akov

Joined Jan 27, 2019
5,657
Size is one thing, but they also may be very similar as machine code, depending on what you mean by “two different ways”. If you are profiling CPU, don’t neglect to profile memory as well.
 

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
Thanks. I've never looked directly at compiled code before. The "two different ways" amounts ultimately to simple operations (>, ==, &&, etc), but the operations can be done in a few ways and in different orders, grouped in certain ways, and I want to see which way is fastest.
 

Ya’akov

Joined Jan 27, 2019
5,657
The nice thing about looking at the object code is that you see what the compiler is actually doing. Sometimes we try to tweak source code because it looks “long”, and it turns out the compiler is smarter than we are and optimizes the source into the same object code in either case. The “optimized” source is then harder to read and maintain, so you actually make things worse.

Profiling is very helpful stuff, and in the case of MCUs that will run on battery power, power profiling is also important. For the purpose of code, though, it is good enough to know how long the code executes, because that will be proportional to the power use. The exception is when peripheral hardware is called asynchronously by the code; then you have to actually look at power consumption if you are trying to squeeze every nW out of the battery.
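As an illustration of two sources that may end up as the same object code, here is a sketch with two hypothetical functions (not from the thread) that express one range check in different ways. With optimization enabled a compiler may well emit the same instructions for both; the way to find out is to disassemble the build, for instance with `arm-none-eabi-objdump -d` on the ELF file, assuming the ARM GCC toolchain is the one in use.

```cpp
// Two hypothetical ways to write the same range check.

// Version A: both conditions joined with &&.
constexpr bool in_range_a(int x, int lo, int hi) {
    return x >= lo && x <= hi;
}

// Version B: the negated form, using || on the opposite comparisons.
constexpr bool in_range_b(int x, int lo, int hi) {
    return !(x < lo || x > hi);
}
```

Both versions return the same result for every input, so the more readable one can be kept unless the disassembly or a measurement shows a real difference.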
 

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
I have heard about compilers often being smarter than we are. The difference ultimately is between more comparisons and fewer assignments (more ==, fewer =), or vice versa. I would be inclined to think that == takes fewer cycles than = since it's all reading and no writing, but I don't know for certain, and I have to use extra comparisons in place of assignments. I'll try tonight or tomorrow and eventually upload my code with the results.
 

MrChips

Joined Oct 2, 2009
26,084
Sorry, I do not understand what you are trying to do.
Assignment operator (=) and comparison operator (==) in C/C++ syntax are two very different operations. You cannot replace one with the other.
 

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
I can significantly rewrite the code to use more comparisons and fewer assignments, or vice versa. If I'm not clear, don't worry about it; how I'm rewriting the code isn't the purpose of this thread anyway.
 

nsaspook

Joined Aug 27, 2009
9,722
I wouldn't worry much about it unless there is an actual performance problem or a hardware module that needs tweaking. You want the code to be clear and easily understandable to a human. Let the compiler and optimization twist your well-structured source code.
 

Thread Starter

LikeTheSandwich

Joined Feb 22, 2021
151
So when making a library, most would encourage simple, good, legible code over strictly optimized?
 