How can I read large amounts of data through an MCU configured as an SPI slave?

peckett · Mar 29, 2023

Hello,

I am trying to send 20,000KB of data through some LED display panels.
I am using SPI to send the data from the MCU, and it is converted into RS485 so it can be transmitted along the length of the sign, which is roughly 8 meters.
As I need to detect any errors in the sign, the data is returned through each LED panel and back to the MCU.
Currently only the data is returned, and this limits the speed at which I can drive the sign to about 3MHz.
As only the MOSI signal is being returned to the MCU, the data is out of sync with the SCK.
The attached application note from TI says that by returning the clock signal as well and setting up the MCU as a slave, the MOSI and SCK will still be in sync.
I have tried this method with some development boards (Teensy 4.1 and ESP32) and have successfully sent data between a master and a slave at speeds up to 30MHz.
The only issue is that I was limited in the amount of data I was able to receive.
I was using a couple of libraries for the slave, https://github.com/hideakitai/ESP32SPISlave and https://github.com/tonton81/SPISlave_T4, and they are both limited to receiving only about 32 bytes of data.
Is there an easy way for me to read up to 20,000 bytes of data with any MCU configured as an SPI slave or is there a better solution to what I am trying to achieve?
I have also attached a drawing for an example.

Thanks

Craig

Papabravo · Mar 29, 2023

If the MCU is going to be the slave device, then somebody has to be the master because it is the master that generates the SCLK. SPI is really a piss-poor way to do what you are doing. Almost any other choice except maybe CAN or I2C would be superior. You could pretend the sign was a very small disk drive and implement a SCSI interface with 256-byte sectors and error correction codes. You could do a USB interface. Who came up with this scheme?

ronsimpson · Mar 29, 2023

I2C is not fast.
Hardware: any system where data is sent differentially is good for distance.
Many CPUs have DMA. I have pointed it at memory, set the length to 1024, turned on DMA and let the hardware do the job. My LED sign, many years ago, had a Z80 computer and we moved all the data via DMA and SPI. No return data.

nsaspook · Mar 29, 2023

DMA is IMO the only reliable way to send/receive (ttl level serial) data that fast (30MHz) on a MCU. SPI is easily capable of that speed if there is multi-word FIFO buffering for TX and RX to handle data on the physical serial bus and DMA to handle MCU memory movements to and from SPI.
https://ww1.microchip.com/downloads/en/DeviceDoc/61106G.pdf

23.3.2.2 ENHANCED BUFFER MODE
The Enhanced Buffer Enable (ENHBUF) bit in the SPI Control (SPIxCON<16>) register can be
set to enable the Enhanced Buffer mode.
In Enhanced Buffer mode, two 128-bit FIFO buffers are used for the transmit buffer (SPIxTXB)
and the receive buffer (SPIxRXB). SPIxBUF provides access to both the receive and transmit
FIFOs and the data transmission and reception in the SPISR buffer in this mode is identical to
that in Standard Buffer mode. The FIFO depth depends on the data width chosen by the
Word/Half-Word Byte Communication Select (MODE<32,16>) bits in the SPI Control
(SPIxCON<11:10>) register. If the MODE field selects 32-bit data lengths, the FIFO is 4 deep, if
MODE selects 16-bit data lengths, the FIFO is 8 deep, or if MODE selects 8-bit data lengths the
FIFO is 16 deep.
The SPITBF status bit is set when all of the elements in the transmit FIFO buffer are full and is
cleared if one or more of those elements are empty. The SPIRBF status bit is set when all of the
elements in the receive FIFO buffer are full and is cleared if the SPIxBUF buffer is read by the
software.
The SPITBE status bit is set if all the elements in the transmit FIFO buffer are empty and is
cleared otherwise. The SPIRBE bit is set if all of the elements in the receive FIFO buffer are
empty and is cleared otherwise. The Shift Register Empty (SRMT) bit is valid only in Enhanced
Buffer mode and is set when the shift register is empty and cleared otherwise.
There is no underrun or overflow protection against reading an empty receive FIFO element or
writing a full transmit FIFO element. However, the SPIxSTAT register provides the Transmit
Underrun Status bit (SPITUR) and Receive Overflow Status bit (SPIROV), which can be
monitored along with the other status bits.
The Receive Buffer Element Count bits (RXBUFELM<4:0>) in the SPI Status
(SPIxSTAT<28:24>) register indicate the number of unread elements in the receive FIFO. The
Transmit Buffer Element Count bits (TXBUFELM<4:0>) in the SPI Status (SPIxSTAT<20:16>)
register indicate the number of elements not transmitted in the transmit FIFO.

https://forum.allaboutcircuits.com/threads/pic32mk-mc-qei-example.150351/post-1618239

panic mode · Mar 29, 2023

20000kb is about 164Mbit and using 30MHz will take 5.5 seconds to transfer - if uninterrupted.

traditional fieldbus networks based on RS485 go to some 5Mbps or sometimes even 1Mbps.
CanBus and DeviceNet are at the low end of the spectrum with baud rates 125-500kbps.
Profibus uses special chips that can push this to 20MHz.
But all of those are fading away as more capable products are available and trying to match the Ethernet speeds.
Some of the more recent chips like SN65HVD76 go up to 50Mbps. The fastest one that I can recall is the Maxim MAX22500E, which goes up to 100Mbps.

peckett · Mar 29, 2023

Papabravo said:
If the MCU is going to be the slave device, then somebody has to be the master because it is the master that generates the SCLK. SPI is really a piss-poor way to do what you are doing. Almost any other choice except maybe CAN or I2C would be superior. You could pretend the sign was a very small disk drive and implement a SCSI interface with 256-byte sectors and error correction codes. You could do a USB interface. Who came up with this scheme?

The idea would be to have either a master MCU and a slave MCU on the same PCB, or a slave MCU at the end of the sign that sends the faulty pixel map back to the master MCU. The LED drivers we are using are SPI-style devices: they have data, data clock, latch and data out pins, so I am not sure how else I would send data to them. I don't think USB can travel 16 meters? This scheme was already in use when I started working here; I am just trying to improve it. I tried using LVDS over RS485, which was a lot faster with a few panels, but by the time I got to the width of a full sign I was seeing the same sort of speeds.

peckett · Mar 29, 2023

panic mode said:
20000kb is about 164Mbit and using 30MHz will take 5.5 seconds to transfer - if uninterrupted.

traditional fieldbus networks based on RS485 go to some 5Mbps or sometimes even 1Mbps.
CanBus and DeviceNet are at the low end of the spectrum with baud rates 125-500kbps.
Profibus uses special chips that can push this to 20MHz.
But all of those are fading away as more capable products are available and trying to match the Ethernet speeds.
Some of the more recent chips like SN65HVD76 go up to 50Mbps. The fastest one that I can recall is the Maxim MAX22500E, which goes up to 100Mbps.

Sorry, I meant 20 kbytes. I tried LVDS chips and they are a lot faster (100-400Mbps), and I can drive the LEDs just fine with that. It's just that the data shifted out of the last LED driver needs to come back to the controller so I can read the faulty pixel data. When the data is returned on its own into the MISO pin it starts getting corrupted, so I need to feed the clock signal back and read the data using a separate MCU configured as a slave device (I think).

peckett · Mar 29, 2023

nsaspook said:
DMA is IMO the only reliable way to send/receive (ttl level serial) data that fast (30MHz) on a MCU. SPI is easily capable of that speed if there is multi-word FIFO buffering for TX and RX to handle data on the physical serial bus and DMA to handle MCU memory movements to and from SPI.
https://ww1.microchip.com/downloads/en/DeviceDoc/61106G.pdf

https://forum.allaboutcircuits.com/threads/pic32mk-mc-qei-example.150351/post-1618239

Thanks, I haven't played around much with DMA so will do some research.

ronsimpson · Mar 30, 2023

I see the problem is that the only SPI clock is the transmit clock and you have 1/3 delay round trip.
Maybe, build a SPI receiver with its own clock, using TTL shift register. Read the data on a parallel port. I can't remember the part number, there is a 8 bit SR with 8 bit latch, so you can be shifting data in/out and still be holding the data from last byte.

ronsimpson · Mar 30, 2023

This might be what I used to make an SPI to parallel converter. Note the latches in the middle. I think there are a number of different parts that are very similar. I believe there is a parallel to serial version. You might be able to shift in by one clock, latch, load, then shift out by another clock.

nsaspook · Mar 30, 2023

ronsimpson said:
I see the problem is that the only SPI clock is the transmit clock and you have 1/3 delay round trip.
Maybe, build a SPI receiver with its own clock, using TTL shift register. Read the data on a parallel port. I can't remember the part number, there is a 8 bit SR with 8 bit latch, so you can be shifting data in/out and still be holding the data from last byte.

The normal solution is to split the TX and RX functions into separate clock and data pairs (per the application note from TI), so you only need one additional driver to return SCK to the second SPI receiver.

A quick PIC32MK DMA to DMA demo of split remote SCK receive on a DEMO board using MPLABX 6.05 and MCC.

C:

#include <stddef.h>                     // Defines NULL
#include <stdbool.h>                    // Defines true
#include <stdlib.h>                     // Defines EXIT_FAILURE
#include <stdint.h>                     // Defines uint32_t, uintptr_t
#include <stdio.h>
#include <string.h>
#include <proc/p32mk0512mcj064.h>
#include "definitions.h"                // SYS function prototypes
#define    BANK1        0xA000A000    // bank 1 frame buffer memory address
#define    DMA_GAP        1        // set to 0 for SPI byte gaps in DMA transmissions
#define USE_DMA

/*
* use DMA-able memory for string storage
*/
char __attribute__((address(BANK1), coherent)) spi_buffer[] = "The quick brown fox jumps over the lazy dogs back";
char __attribute__((address(BANK1 + 128), coherent)) spi_rec_buffer[128];
volatile bool dmaT_done = false, dmaR_done = false, purge = true;

/*
* DMA complete callbacks
*/
void SPI1DmaChannelHandler_State(DMAC_TRANSFER_EVENT, uintptr_t);
void SPI2DmaChannelHandler_State(DMAC_TRANSFER_EVENT, uintptr_t);

int main(void)
{
    uint32_t startCount, endCount;
    /* Initialize all modules */
    SYS_Initialize(NULL);

    /* Start system tick timer */
    CORETIMER_Start();

    /*
     * SPI single W/R channel with split devices
     */
    DMAC_ChannelCallbackRegister(DMAC_CHANNEL_0, SPI1DmaChannelHandler_State, 0); // TX, SCK pair
    SPI1CONbits.STXISEL = DMA_GAP; // set to 0 for byte gaps
    SPI1CONbits.ENHBUF = true; // enable FIFO

    DMAC_ChannelCallbackRegister(DMAC_CHANNEL_7, SPI2DmaChannelHandler_State, 0); // RX, SCK pair
    SS2_IN_InterruptDisable();

    while (true) {
#ifndef USE_DMA
        LED_Toggle();
#endif
        CORETIMER_DelayMs(100); // 10 Hz updates for blink-led
        dmaT_done = false;
        dmaR_done = false;
#ifdef USE_DMA
        SPI2_REC_DATA_Set(); // debug sig
        SS_CS_Clear(); // enable the remote slave SPI
        DMAC_ChannelTransfer(DMAC_CHANNEL_7, (const void *) &SPI2BUF, (size_t) 1, (const void *) spi_rec_buffer, (size_t) strlen(spi_buffer), (size_t) 1);
        RS_Set(); // debug sig
        DMAC_ChannelTransfer(DMAC_CHANNEL_0, (const void *) spi_buffer, (size_t) strlen(spi_buffer), (const void *) &SPI1BUF, (size_t) 1, (size_t) 1);
        /* Calculate the end count for the given delay */
        endCount = (CORE_TIMER_FREQUENCY / 1000000) * 200;
        startCount = _CP0_GET_COUNT();
        while (!dmaR_done) { // While DMA running processing loop, check for receive errors or timeouts
            CSB_Toggle(); // do something
            if (purge) { // clear out startup SPI2 receiver buffer junk
                purge = false; // run only once
                break;
            }

            if ((_CP0_GET_COUNT() - startCount) > endCount) { // timeout after 200us
                LED_Toggle(); // toggle LED to flag the receive timeout
                SS_CS_Set(); // disable slave SPI
                SPI2_REC_DATA_Clear();
                break;
            }
        }
#else
        SPI1_Write(spi_buffer, strlen(spi_buffer));
#endif
    }
    /* Execution should not come here during normal operation */
    return( EXIT_FAILURE);
}

/*
* interrupt at the end of strlen(spi_buffer) DMA byte TX transfers
*/
void SPI1DmaChannelHandler_State(DMAC_TRANSFER_EVENT event, uintptr_t contextHandle)
{
    if (event == DMAC_TRANSFER_EVENT_COMPLETE) {
        dmaT_done = true;
        RS_Clear();
    }
}

/*
* interrupt at the end of receive DMA strlen(spi_buffer) RX byte transfer
*/
void SPI2DmaChannelHandler_State(DMAC_TRANSFER_EVENT event, uintptr_t contextHandle)
{
    if (event == DMAC_TRANSFER_EVENT_COMPLETE) {
        dmaR_done = true;
        LED_Toggle(); // blink per block received
        SS_CS_Set(); // disable slave SPI
        SPI2_REC_DATA_Clear();
    }

}

Lightly tested.

https://github.com/nsaspook/spi_loopback

30MHz SPI clock.

Polled SPI timing. Gaps between SPI bytes from checking for buffer complete flags.

DMA SPI1 TX, no gaps between bytes.

The falling edge of the yellow trace marks SPI2 receive complete. The purple trace is the SPI1 TX clock. 6 inch SPI1 to SPI2 clock loop jumper.

5 foot SPI1 to SPI2 clock loop jumper.

Back to the short jumper to show the time delta.

That 5 foot jumper gives a 3.7us clock skew from the SPI1 clock reference, but because SPI2 receives both the delayed returned SCK and the MISO data, they stay synchronized.

nsaspook · Mar 31, 2023

A small addition to the demo program to display the looped data transfer to usart2 and to detect SPI comm errors (incorrect or missing clocks back to the SPI2 receiver).
https://github.com/nsaspook/spi_loopback/tree/main

C:

        UART2_Write(spi_rec_buffer, display_len); // send the received SPI2 buffer for signal loop monitor
        CORETIMER_DelayMs(100); // 10 Hz updates for blink-led

Yellow trace SPI1 to SPI2 normal loop transaction, Purple USART echo of received data.

SCK IN wire disconnected from SPI2 receiver, causing receive timeout and USART warning message.
