FIFO logic block meta-stability

Thread Starter

Engineering_Junkie

Joined Sep 9, 2015
41
Good afternoon everyone,

I have a few questions on FIFO logic blocks.

Here's a description of my project: I'll be syncing two high-speed, 12-bit ADCs to a Raspberry Pi. Because of the latency issues with the Raspberry Pi's SPI bus, I want to put a FIFO memory block between the ADCs and the Pi to ensure no samples are lost. I want to run the FIFO's input and output at different clock speeds (asynchronously).


Problem: While doing my initial research online, it seems that running a FIFO logic block asynchronously can result in meta-stability, which will result in inaccurate data acquisition.

1) Is meta-stability something I need to take into account or does the FIFO logic block designer account for it?
2) If I need to account for it, how can I do so without losing data?
3) Besides meta-stability is there anything else I need to take into account when running data acquisition through these logic blocks?

Thanks in advance for any help
 

nsaspook

Joined Aug 27, 2009
13,275

Thread Starter

Engineering_Junkie

Joined Sep 9, 2015
41
Could you explain in detail the latency issues you see. The current RPi2/3 DMA driver for SPI should be able to handle >200Ksps (ADS8330) with a few mods to the native SPI driver with a custom protocol kernel driver instead of the normal ioctl interface.
https://forum.allaboutcircuits.com/threads/raspberry-pi-daq-system.75543/#post-1047682
16 bit ADC SPI data transfers.

http://www.jumpnowtek.com/rpi/Analyzing-raspberry-pi-spi-performance.html

I'm completely new to the Pi, but I'll be sampling 12 bits at around 2 MSPS for around 10 milliseconds. What's worrying me is this post: https://www.raspberrypi.org/forums/viewtopic.php?t=19489

and a few others that say that, because the Pi runs Linux and the SPI driver lives in the kernel, the SPI bus will not behave the way it would under a real-time operating system.
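
As a rough sizing check on what one capture amounts to, here is a small sketch. The function names are made up for illustration, and it assumes both ADCs run at the full 2 MSPS for the whole 10 ms while the Pi reads nothing:

```c
#include <stdint.h>

/* Total samples in one capture: rate (samples/s) x duration (us) x number of ADCs. */
uint64_t capture_samples(uint64_t rate_sps, uint64_t duration_us, unsigned n_adc)
{
    return rate_sps * duration_us / 1000000u * n_adc;
}

/* Bytes needed if samples of `bits` width are packed back to back (rounded up). */
uint64_t packed_bytes(uint64_t samples, unsigned bits)
{
    return (samples * bits + 7u) / 8u;
}

/* Bytes needed if every sample is padded out to a 16-bit word. */
uint64_t padded_bytes(uint64_t samples)
{
    return samples * 2u;
}
```

For two ADCs that comes to 40,000 samples, i.e. 60,000 bytes if the 12-bit samples are packed tightly or 80,000 bytes if each one is padded to 16 bits.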

EDIT: Reading through your post now
 

nsaspook

Joined Aug 27, 2009
13,275
Shouldn't the pi be able to handle this if its SPI can run up to 125 MHz?
Sure, you might get that speed for a few bytes of a transfer, but sustained throughput without timing variations over milliseconds is a tricky hardware/software problem on the Pi with generic desktop Linux.

I created a 'special' device for a protocol driver to test how the RPi3 4 core cpu handles fast (64MHz) serial transfers.
From the Linux device driver code.
C:
    {
        .name = "special",
        .ai_subdev_flags = SDF_READABLE | SDF_GROUND | SDF_CMD_READ | SDF_COMMON,
        .max_speed_hz = 64000000,
        .min_acq_ns = 30000,
        .rate_min = 30000,
        .spi_mode = 3,
        .spi_bpw = 8,
        .n_chan_bits = 12,
        .n_chan = 2,
        .n_transfers = 64,
    },
...
    case special: // dummy device transfer speed testing
        pdata->one_t.len = devpriv->ai_spi->device_spi->n_transfers;
        for (i = 0; i < 32; i++) // set the first 32 bytes to 0xff, leave the next 32 bytes 0
            pdata->tx_buff[i] = 0xff;

        spi_message_init_with_transfers(&m,
                        &pdata->one_t, 1);
        spi_bus_lock(spi->master);
        spi_sync_locked(spi, &m);
        spi_bus_unlock(spi->master);
        val = 99;
        devpriv->ai_count++;
        break;
This kernel 'device' sends a 64-byte burst at 64MHz. The first 32 bytes are 0xff and the rest are 0x00. The internal SPI hardware has a 16-entry FIFO of 32-bit words with DMA to reduce or eliminate CPU intervention. Up to 64 bytes, the timing and spacing of each transfer should be very stable. Beyond 64 bytes, the DMA pump timing stability could be affected by context switches on the kernel workqueue as it keeps the tx/rx pipeline full.
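
As a back-of-envelope check on those numbers, here is a sketch. The helper names are mine for illustration, and it assumes one bit per SCK cycle with no gaps between bytes, which real hardware may not achieve:

```c
#include <stdint.h>

/* Bytes held by a hardware FIFO of `words` entries, each `word_bits` wide. */
unsigned fifo_bytes(unsigned words, unsigned word_bits)
{
    return words * (word_bits / 8u);
}

/* Ideal time in nanoseconds to clock out `nbytes` at `sck_hz`, ignoring any gaps. */
uint64_t burst_time_ns(uint64_t nbytes, uint64_t sck_hz)
{
    return nbytes * 8u * 1000000000ull / sck_hz;
}
```

A FIFO of 16 words by 32 bits holds 64 bytes, and 64 bytes at 64MHz is an ideal 8000 ns (8 us) on the wire, so a burst of that size fits in the hardware FIFO in one go.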

RPI3 with external I/O board and ADC connected to a Tek MSO 2012B

The user program bmc (cpu1) executes a loop of reads on the 'special' device while the spi task (cpu2) handles the I/O.
RIP Adam West.

SPI 64 byte burst to burst timing. line 12 CS

64 byte spi transfer time

SPI byte to byte timing. line 9 MISO

SPI clock timing. line 8 MSCK

The timing here is just raw speed. There is an ADS8330 ADC (it normally runs at 16MHz) on the bus that looks like it responds at 64MHz (data line 10) with something after CS goes low for the first two bytes.

If I get time, I will post some results with much larger transfers.
 

nsaspook

Joined Aug 27, 2009
13,275

20000 byte SPI transfer @ 64MHz

64 byte block timing gaps in the 20000 byte transfer stream.

The 20000 byte transfer is using DMA so the cpu usage for kernel SPI is close to zero.

RPi3 16MHz SPI with ADS8330 ADC in normal 2-byte, 16-bit data DMA transfer mode @330ksps.

Timing specs are nice and stable at 16MHz using DMA. I could maybe run the ADC at 32MHz (I/O clock max is 50MHz), but I don't really need the speed (the rate only increases to 400ksps due to fixed system delays with short transfers), and the data/clock line transmission characteristics start to get tricky at those speeds.
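
To put a number on those fixed delays, here is a rough utilization estimate. It is only a sketch: `bus_utilization_pct` is a made-up helper, and it assumes each sample costs exactly 16 SCK cycles on the wire:

```c
/* Percentage of SCK cycles that actually carry sample data. */
double bus_utilization_pct(double samples_per_s, double bits_per_sample, double sck_hz)
{
    return 100.0 * samples_per_s * bits_per_sample / sck_hz;
}
```

With the figures above, 330ksps at 16MHz works out to about 33% of the clock cycles carrying data, and 400ksps at 32MHz to about 20%, so doubling the clock mostly adds idle time between the short transfers.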

The display program xoscope with 32MHz SPI, lots of noise and buffer overruns.


32 byte transfer per SPI 16MHz transaction. The gap here is mainly the setup time for DMA between blocks.

The short term timing stability between blocks is pretty good. line 12 is CS for 32 byte blocks using the DMA engine.

CPU usage for the above SPI transfer. SPI is using the DMA engine, so the CPU usage is almost zero for actual physical transfers. The daqgerth_a is a kworker thread locked on cpu3 that sends data to the DMA engine while the user testing program xoscope runs on cpu 0.

You can run the system without the DMA engine, using only interrupts and the CPU, but that limits the sample rate to about 70ksps on an RPi3.
 

Thread Starter

Engineering_Junkie

Joined Sep 9, 2015
41

Thank you for the extremely detailed reply. Mainly because of the latency and speed limits, I'd still like to have a FIFO between the ADC and the Pi. I was thinking of something like this: http://www2.mouser.com/ProductDetail/Texas-Instruments/SN74V293PZAEP/?qs=sGAEpiMZZMvM2rVpdjNFG1V69%2bXOAvndWvINI0HBZF4=

However, the only interface type I've found for these logic blocks is parallel. Is there a way to connect something like this to the Pi without using a PISO shift register? Thanks for all the help.
 

Thread Starter

Engineering_Junkie

Joined Sep 9, 2015
41
Actually, I think I can just connect it to the GPIO pins and have the Pi read from the FIFO at its own pace.
 

nsaspook

Joined Aug 27, 2009
13,275
I don't think you will find an external SPI serial FIFO ready-made for this.

If you really need continuous high-speed SPI data acquisition, then my solution would be to completely offload the ADC SPI timing and buffering to a 16/32-bit microcontroller with two SPI channels (40MHz on PIC32), because the Pi is not designed for this. Controller channel #1 would be the slave for the RPi SPI data stream, which could easily use back-to-back transfers for speed, with a few GPIO lines for hardware control. Channel #2 would be the master channel to control the ADC devices.
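
The buffering on the controller could be as simple as a ring buffer between the two channels. This is a minimal sketch, not tested on hardware: the names (`struct ring`, `ring_put`, `ring_get`, `RING_SIZE`) are made up for illustration, nothing here is a PIC32 API, and it assumes a single-core controller with exactly one producer (the channel #2 ADC side) and one consumer (the channel #1 RPi side):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 1024u /* must be a power of two; the value is an arbitrary example */

struct ring {
    uint16_t buf[RING_SIZE];
    volatile size_t head; /* free-running count, written only by the producer */
    volatile size_t tail; /* free-running count, written only by the consumer */
};

/* Producer side: store one sample. Returns false (sample dropped) if the ring is full. */
bool ring_put(struct ring *r, uint16_t sample)
{
    if (r->head - r->tail == RING_SIZE)
        return false;
    r->buf[r->head & (RING_SIZE - 1u)] = sample;
    r->head++;
    return true;
}

/* Consumer side: fetch the oldest sample. Returns false if the ring is empty. */
bool ring_get(struct ring *r, uint16_t *out)
{
    if (r->head == r->tail)
        return false;
    *out = r->buf[r->tail & (RING_SIZE - 1u)];
    r->tail++;
    return true;
}
```

The idea would be to call `ring_put` from the ADC receive interrupt and `ring_get` when loading the slave transmit register; a `false` from `ring_put` means the RPi fell behind and a sample was dropped. With DMA or a multi-core part you would need proper memory barriers rather than plain `volatile`.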
 

Deleted member 115935

Joined Dec 31, 1969
0
Shouldn't the pi be able to handle this if its SPI can run up to 125 MHz?
Why?

The two are not connected.

Say the SPI runs and you get the data;
then what?

In the chip, the SPI data-available flag is raised, which causes an interrupt to store the data in memory,
or maybe you will DMA it into internal memory.

If the processor was not doing anything else, then the delay would 'just' be the interrupt latency,
BUT, if you're running other things on the processor, the SPI available flag in the CPU is only going to get serviced when the processor gets around to it.

Some real-time processors, like the M4 series, often have built-in FIFOs on the peripheral ports for just this sort of problem.

With an OS running, you're totally bound by the constraints of the OS.
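
To size a FIFO against that, the depth has to cover whatever arrives while the processor is busy elsewhere. A rough sketch (the function name is made up, and the latency figures in the note below are assumed for illustration, not measured):

```c
#include <stdint.h>

/* Samples that pile up during a stall: rate (samples/s) x latency (us), rounded up. */
uint64_t fifo_depth_needed(uint64_t rate_sps, uint64_t latency_us)
{
    return (rate_sps * latency_us + 999999u) / 1000000u;
}
```

At the 2 MSPS mentioned earlier, an assumed 100 us stall needs 200 samples of FIFO, and an assumed 1 ms stall needs 2000.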
 