10ns signal/event timestamper

jars121 · Jan 30, 2023

Hi all,

I'm currently updating and improving an existing design. The architecture of the updated design is as follows:

Multiple Cortex A cores running Linux.
Multiple (homogeneous) Cortex M cores running FreeRTOS (SMP).

The Cortex M cores interface with peripherals on the board and perform real-time data acquisition and pre-processing. Data is then passed to the Linux user space environment (Cortex A cores) with RPMsg.

What I'm looking to implement is a signal/event timestamping capability, and I'm aiming for a 10ns resolution. When a signal or event is captured within FreeRTOS, a timestamp is captured alongside it for processing in Linux. The timestamp is used both for general timestamping purposes (i.e. understanding the relative timing of various inputs, signals, events, etc.) as well as part of the digital input processing function (i.e. measuring pulse width, period, frequency, etc. of digital inputs).
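As a rough illustration of the digital input processing described above, the sketch below derives pulse width, period and frequency from three captured timestamps. It assumes a free-running 32-bit counter with one tick per 10 ns; `TICK_NS`, `pulse_measurement_t`, `ticks_between` and `measure_pulse` are hypothetical names, not part of the existing design.

```c
#include <stdint.h>

#define TICK_NS 10u /* assumption: one counter tick = 10 ns (100 MHz) */

typedef struct {
    uint64_t width_ns;  /* high time: rising edge to falling edge */
    uint64_t period_ns; /* rising edge to the next rising edge */
    double   freq_hz;   /* 0.0 if the period is zero */
} pulse_measurement_t;

/* Unsigned subtraction wraps modulo 2^32, so the difference is still
 * correct if the counter rolled over once between the two captures. */
static inline uint32_t ticks_between(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

/* rise1, fall and rise2 are raw counter values captured on consecutive
 * rising, falling and rising edges of the same input. */
static inline pulse_measurement_t measure_pulse(uint32_t rise1,
                                                uint32_t fall,
                                                uint32_t rise2)
{
    pulse_measurement_t m;
    m.width_ns  = (uint64_t)ticks_between(rise1, fall)  * TICK_NS;
    m.period_ns = (uint64_t)ticks_between(rise1, rise2) * TICK_NS;
    m.freq_hz   = (m.period_ns != 0u) ? 1e9 / (double)m.period_ns : 0.0;
    return m;
}
```

The subtraction only stays valid while consecutive edges are less than one full counter period apart; slower signals would need a wider timestamp.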

On the current version of the design, the timestamping function is built using multiple cascading/chained timer counters within the FreeRTOS-based MCU which is also doing all the real-time data acquisition. This works to an extent, but due to the frequency of the peripheral clock and the associated TC dividers, the resolution isn't quite what I'm after. Furthermore, given the amount of context switching and ISR handling going on in the MCU, there is noticeable jitter in the resultant timestamp as well. I'm looking for a more deterministic solution, with as little overhead for the Cortex M cores as possible.

With that context out of the way, these are some of the options I've been considering. They are by no means validated or exhaustive, and I'd very much appreciate any insight, suggestions or clarification.

1. Use a 32-bit binary counter (probably a dual 16-bit chained counter) IC with a parallel or serial output (I2C/SPI) interface to read the current count whenever a signal/event is captured. I know the SN74 series can be used for this purpose, but I'm not sure if I can reliably achieve 100MHz operation, and the latency associated with reading the count via an 8-bit parallel or (relatively slow) serial interface isn't ideal.
2. Use a small MCU and use the same TC method as above, with an additional output to interrupt the primary Cortex M cores when a rollover occurs. As with the above option, reading from the standalone MCU via a serial interface introduces additional latency; perhaps have the small MCU write to a small asynchronous dual-port RAM module?
3. Use a CPLD/FPGA. This approach would obviously provide the best timing performance/reliability, and would give flexibility as to how the timestamps are read by the main Cortex M cores, but potentially comes with additional complexity. I could use a high speed serial interface to minimise read latency from the Cortex M cores (e.g. 50MHz SPI), but I had also considered a 32-bit parallel output from the CPLD/FPGA, mapped to a 32-bit wide GPIO register in the Cortex M core. This adds space and layout complexity to the design, but in theory, would allow the Cortex M to read the 32-bit count with a single read of the GPIO port's memory address. Is that feasible?
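Whichever external counter is used, a 32-bit count at 100MHz rolls over every 2^32 x 10ns, or roughly 42.9 seconds, so the reader has to track rollovers. The sketch below shows one way this might look: a single volatile read of a memory-mapped 32-bit register, followed by software extension to 64 bits. The names `read_count`, `ts_extender_t` and `ts_extend` are hypothetical, and the register address would depend on the actual part. It does not address whether the parallel bus is stable at the instant it is sampled.

```c
#include <stdint.h>

/* Single 32-bit read of a memory-mapped register, e.g. a GPIO input
 * data register wired to the counter outputs (address is part-specific). */
static inline uint32_t read_count(const volatile uint32_t *reg)
{
    return *reg;
}

typedef struct {
    uint32_t last_raw; /* most recent raw 32-bit count seen */
    uint64_t high;     /* accumulated rollovers, already shifted left by 32 */
} ts_extender_t;

/* Extend a raw 32-bit count to 64 bits. A raw value smaller than the
 * previous one is treated as exactly one rollover, so this must be called
 * at least once per counter period and from a single context. */
static inline uint64_t ts_extend(ts_extender_t *s, uint32_t raw)
{
    if (raw < s->last_raw) {
        s->high += (uint64_t)1 << 32;
    }
    s->last_raw = raw;
    return s->high | raw;
}
```

A rollover interrupt from the external device, as suggested in the second option, could replace the comparison by incrementing `high` directly.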

Are there any other options I haven't considered?

Thanks!

MrChips · Jan 30, 2023

Since I am already an STM32 ARM user I would choose that chip.

You can get STM32 with 550MHz core frequency. They have built-in 32-bit counter modules so you don't need any external hardware to do time-stamping.

In fact I am already doing time-stamping with 10ns resolution.

jars121 · Jan 30, 2023

Thanks for your input. That certainly sounds like a good option. Accessing the count is where I become unstuck; perhaps a high speed serial interface (50MHz SPI as mentioned above) would be a good compromise.

nsaspook · Jan 30, 2023

I would likely use a CP counter as a high resolution timing source.
https://developer.arm.com/documentation/ka001406/latest

https://interrupt.memfault.com/blog/profiling-firmware-on-cortex-m
Let me explain how the system works: CYCCNTENA enables a cycle counter in the DWT unit of your microcontroller. This counter is incremented every CPU cycle (i.e. 168 million times per second).
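A minimal sketch of enabling and reading the cycle counter described above follows. The addresses and bit positions are the architecturally defined ARMv7-M ones (DEMCR.TRCENA, DWT_CTRL.CYCCNTENA, DWT_CYCCNT); the `cyccnt_regs_t` wrapper and the function names are hypothetical, introduced only so the logic can be exercised off-target. Some parts may require extra vendor-specific steps that are not shown here.

```c
#include <stdint.h>

/* ARMv7-M debug register addresses and bits */
#define DEMCR_ADDR          0xE000EDFCu
#define DWT_CTRL_ADDR       0xE0001000u
#define DWT_CYCCNT_ADDR     0xE0001004u
#define DEMCR_TRCENA        (1u << 24) /* enables the DWT unit */
#define DWT_CTRL_CYCCNTENA  (1u << 0)  /* enables the cycle counter */

/* Hypothetical wrapper holding pointers to the three registers. */
typedef struct {
    volatile uint32_t *demcr;
    volatile uint32_t *dwt_ctrl;
    volatile uint32_t *dwt_cyccnt;
} cyccnt_regs_t;

/* Initialiser for use on the target itself. */
#define CYCCNT_REGS_ON_TARGET {                             \
    (volatile uint32_t *)(uintptr_t)DEMCR_ADDR,             \
    (volatile uint32_t *)(uintptr_t)DWT_CTRL_ADDR,          \
    (volatile uint32_t *)(uintptr_t)DWT_CYCCNT_ADDR }

/* Turn on trace, zero the counter, then start it counting. */
static inline void cyccnt_enable(const cyccnt_regs_t *r)
{
    *r->demcr      |= DEMCR_TRCENA;
    *r->dwt_cyccnt  = 0u;
    *r->dwt_ctrl   |= DWT_CTRL_CYCCNTENA;
}

/* One 32-bit load returns the current cycle count. */
static inline uint32_t cyccnt_read(const cyccnt_regs_t *r)
{
    return *r->dwt_cyccnt;
}
```

With the CMSIS headers, the same registers are usually reached as `CoreDebug->DEMCR`, `DWT->CTRL` and `DWT->CYCCNT`, which avoids the raw addresses.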

MrChips · Jan 30, 2023

jars121 said:
Thanks for your input. That certainly sounds like a good option. Accessing the count is where I become unstuck; perhaps a high speed serial interface (50MHz SPI as mentioned above) would be a good compromise.

Why bother to use SPI when you can read the counter register directly in parallel with one instruction?

jars121 · Jan 30, 2023

nsaspook said:
I would likely use a CP counter as a high resolution timing source.
https://developer.arm.com/documentation/ka001406/latest

https://interrupt.memfault.com/blog/profiling-firmware-on-cortex-m
Let me explain how the system works: CYCCNTENA enables a cycle counter in the DWT unit of your microcontroller. This counter is incremented every CPU cycle (i.e. 168 million times per second).

Ah, I had seen something about this and forgot to include it as an option. So once enabled, it's as simple as reading the relevant register address to get the current cycle count; that makes perfect sense. I don't suppose there's a watermark or overflow interrupt available as well?

I'm using Cortex M4 cores @ 266MHz; does this mean I'd have access to the full 266MHz cycle rate, or are dividers incorporated?
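For a sense of scale, the sketch below converts cycle counts to nanoseconds and computes how often a 32-bit cycle counter would wrap. It assumes the counter increments once per core clock with no divider, which would need to be confirmed for the specific part; `CORE_CLOCK_HZ`, `cycles_to_ns` and `wrap_period_ns` are hypothetical names. Under that assumption, one count at 266MHz is about 3.76ns and the counter wraps roughly every 16.1 seconds.

```c
#include <stdint.h>

#define CORE_CLOCK_HZ 266000000u /* from the post: Cortex M4 at 266MHz */

/* Convert a cycle count to nanoseconds (truncating). Whole seconds and
 * the remainder are handled separately so the multiplication by 1e9
 * cannot overflow 64 bits for any 32-bit clock rate. */
static inline uint64_t cycles_to_ns(uint64_t cycles, uint32_t clock_hz)
{
    uint64_t secs = cycles / clock_hz;
    uint64_t rem  = cycles % clock_hz;
    return secs * 1000000000ull + (rem * 1000000000ull) / clock_hz;
}

/* Time for a 32-bit counter to count through all 2^32 values. */
static inline uint64_t wrap_period_ns(uint32_t clock_hz)
{
    return cycles_to_ns((uint64_t)1 << 32, clock_hz);
}
```

Because the conversion truncates, a single cycle reports as 3ns; keeping timestamps in raw cycles and converting only differences avoids accumulating that rounding error.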
