I have an Arduino-compatible MCU (an Adafruit QT Py, based on the SAMD21). I've thought of a couple of different ways to write a function I want to use. What's the best way to test each one for clock-cycle efficiency? I read online that I can call millis() before and after the function and take the difference, but I've also read that this isn't very accurate. Another suggestion I've seen is to call digitalWrite() to set a pin high before and low after (or vice versa) and measure the pulse width with an oscilloscope, but that could be even less accurate. What are the best ways you know of?
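
For reference, this is roughly how I understand the before/after timing approach. It is only a minimal sketch: `elapsedTicks` and `candidateA` are hypothetical names I made up, and I'm assuming micros() rather than millis() for finer resolution. It repeats the call many times so the timer's granularity matters less. The loop overhead is included in the result, so I'd expect it to be useful only for comparing versions against each other, not for exact cycle counts.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of the Arduino API): calls `fn` `iterations`
// times between two reads of the timer function `now` and returns the total
// elapsed ticks. Unsigned 32-bit subtraction keeps the difference correct
// even if the counter wraps around once during the measurement.
template <typename NowFn, typename Fn>
uint32_t elapsedTicks(NowFn now, Fn fn, uint32_t iterations) {
  const uint32_t start = static_cast<uint32_t>(now());
  for (uint32_t i = 0; i < iterations; ++i) {
    fn();
  }
  const uint32_t end = static_cast<uint32_t>(now());
  return static_cast<uint32_t>(end - start);
}

// Assumed usage on the board (candidateA is a placeholder for my function):
//   volatile uint32_t sink = 0;   // volatile so the work is not optimized away
//   void candidateA() { sink = sink * 3u + 1u; }
//   ...
//   uint32_t totalUs = elapsedTicks(micros, candidateA, 10000);
//   Serial.println(totalUs / 10000.0f);   // average microseconds per call
```

I would run the same measurement for each version of the function with the same iteration count and compare the totals.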