Raspberry PI DAQ system

Discussion in 'Embedded Systems and Microcontrollers' started by nsaspook, Oct 11, 2012.

  1. nsaspook

    Thread Starter AAC Fanatic!

    Aug 27, 2009
    2,909
    2,170
    This is a project I'm working on to provide a Linux kernel module for the GPIO and analog in/out ports on the Gertboard, or on an add-on PIC18 controller with a slave SPI interface to the RPi.

    The first objective was to write a Comedi-compatible Linux module, called daq_gert, that interfaces with the I/O pins in a standard manner usable with the Comedi data acquisition library. The analog interface code is still being designed.

    Comedi home page.
    http://www.comedi.org/

    A RPi Debian based HOWTO and C source code are here.
    Howto: https://github.com/nsaspook/daq_gert/blob/master/RPi_Comedi_HOWTO.txt
    Source code: https://github.com/nsaspook/daq_gert.git

    The second objective is the PIC18-based ADC module, which can provide twelve 10-bit ADC channels and 8 additional I/O pins, using the on-board SPI master in the RPi board for communication. This part of the project is in the very early stages of testing, but the current software does demonstrate interrupt-driven ADC and SPI interface code.

    See RPi_PIC directory in the above source code location.

    A blinking LED demo of the software running on the RPi board. http://flic.kr/p/dh4ur3
     
  2. nsaspook

    Last edited: Jan 16, 2013
  3. nsaspook

    The Raspberry PI daq driver also includes code for my PIC ADC channel SPI expander. The MCP3002 chip is removed and a DIP-8 header connected to the off-board controller is inserted in its place. When the daq_gert module is loaded it can detect the expander and configure the Linux driver to use it instead of the normal 2-channel input chip. The expander code running on a 28-pin 18F25K22 chip can provide 11 extra 10-bit analog channels.

    Some prototype pictures:
    http://flic.kr/p/dMFtWS
    http://flic.kr/p/dMzVvp

    Code: https://github.com/nsaspook/daq_gert/blob/master/RPi_PIC/SlaveO.c
     
  4. nsaspook

    While updating my RPi Linux driver for the new Raspberry PI 2 I ran into a problem many others have: the board is camera-shy.
    It's a great board that's light years faster than the original, but just don't use a flash for a photo of the bare board.

    A photo of my prototype cube, taken with my trusty Olympus and its true xenon flash, caused a lockup.
    [​IMG]
     
  5. nsaspook

    Spent some spare time getting this to work with the new board.
    Completed a (mainly) working version of the RPi 2 kernel module AI/AO/DIO driver with async command control (background channel scanning) for the Linux DAQ library so Comedi DAQ applications like XOSCOPE can be used.

    The program is running on the PI but is displayed on a remote Linux PC via ssh X forwarding.


    Two-channel analog sample speed is pretty limited, due to the slow ADC chip on the Gertboard (MCP3202, 12-bit resolution, 100k samples/second, so SPI is limited to 1 MHz) and the lack of DMA for memory transfers, but I should be able to tweak the speed up to at least a few thousand samples per second using the native hardware and SPI without DMA.

    The 4-core RPi 2 makes real kernel threads possible on the hardware, so I'm only using 20% of one core for DAQ I/O while the scope software uses 90%+ of another core.
    https://github.com/nsaspook/daq_gert/blob/master/daq_gert.c
     
    Last edited: Apr 24, 2015
  6. nsaspook

    I've added code to my kernel protocol driver that speeds up sampling by transferring 1024 samples in one chunk at the SPI device kernel level instead of issuing single requests. This reduces the scope software's user-mode CPU use to less than 50% while raising the max possible scope sample rate to at least 10k per second. The kernel thread CPU use has increased with the lack of DMA but that's OK with 4 cores.
    With full DMA the speed should be at least 20k and usable for lo-fi audio waveforms.

    The RPi SPI hardware interface is the current bottleneck but good people are working on it.
    https://github.com/msperl/spi-bcm2835/wiki
    https://www.kernel.org/doc/Documentation/spi/spi-summary
    The Linux SPI system uses a work and message queueing mechanism that, while good, is not real-time, especially at the user level.

    Code (Text):
    From my driver:
    /*
     * In the async command mode transfers can be handled in HUNK mode by creating a SPI message
     * of many conversion sequences into one message, this allows for close to native driver wire-speed
     * An optimized DMA driver is in the works
     * https://github.com/msperl/spi-bcm2835/wiki
     * (the current interrupt driven kernel driver is limited to a 12 to 64 byte FIFO and no DMA) HUNK_LEN data samples
     * into the Comedi read buffer with a special mix_mode for sampling both ADC devices in an alt sequence for
     * programs like xoscope at full speed. The transfer array is currently static but can easily be made into
     * a config size parameter runtime value if needed with kmalloc for the required space
     */

    // lets talk to the device in a blast on the wire

    #define HUNK_LEN   1024
    struct comedi_control {
        u8 *tx_buff;
        u8 *rx_buff;
        struct spi_transfer t[HUNK_LEN];
        struct mutex daqgert_platform_lock;
    };

    ...
    static unsigned int daqgert_ai_get_sample(struct comedi_device *dev,
        struct comedi_subdevice *s)
    {
    ...
        struct spi_param_type *spi_data = s->private;
        u8 *tx_buff, *rx_buff;
    ...
        memset(&pdata->t, 0, sizeof(pdata->t)); // clear the transfer array
        if (devpriv->ai_hunk) { /* for single channel command scans with pre-formatted tx_buffer */
            if (spi_data->device_type == MCP3002) { // 10 bit adc data
                len = 2;
            } else {
                len = 3;
            }
            tx_buff = pdata->tx_buff;
            rx_buff = pdata->rx_buff;
            for (i = 0; i < HUNK_LEN; i++) { /* be sure we toggle CS between ADC commands */
                pdata->t[i].cs_change = 1;
                pdata->t[i].len = len;
                pdata->t[i].tx_buf = tx_buff;
                pdata->t[i].rx_buf = rx_buff;
                tx_buff += len; /* move the buffer ptr to the next transfer slot in the buffer memory */
                rx_buff += len;
            }
            spi_message_init_with_transfers(&m, &pdata->t[0], HUNK_LEN); // make the proper message with the transfers
    ...
        spi_sync(spi_data->spi, &m); // exchange SPI data
    ...
    }
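    The driver excerpt above relies on a "pre-formatted" tx buffer whose contents are not shown. As a rough user-space illustration only (not the daq_gert code), the sketch below fills such a buffer for the 3-byte MCP3202 case and decodes the matching rx bytes. It assumes the byte-aligned framing from the MCP3202 datasheet (start bit in the first byte; SGL/DIFF, ODD/SIGN and MSBF in the top bits of the second byte; the 12-bit result in the low nibble of the second byte plus the third byte). The names mcp3202_fill_tx and mcp3202_decode are hypothetical, and the real driver may lay out its buffers differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MCP3202_XFER_LEN 3 /* bytes per conversion, matching the len = 3 case */

/* Fill tx with n back-to-back single-ended conversion commands for one channel. */
static void mcp3202_fill_tx(uint8_t *tx, size_t n, unsigned int chan)
{
	size_t i;

	assert(chan < 2); /* the MCP3202 has two inputs */
	memset(tx, 0, n * MCP3202_XFER_LEN);
	for (i = 0; i < n; i++) {
		uint8_t *slot = tx + i * MCP3202_XFER_LEN;

		slot[0] = 0x01;                          /* start bit */
		slot[1] = (uint8_t)(0xA0 | (chan << 6)); /* SGL=1, ODD/SIGN=chan, MSBF=1 */
		slot[2] = 0x00;                          /* dummy byte, clocks out the low data byte */
	}
}

/* Extract the 12-bit result of conversion i from the received bytes. */
static uint16_t mcp3202_decode(const uint8_t *rx, size_t i)
{
	const uint8_t *slot = rx + i * MCP3202_XFER_LEN;

	return (uint16_t)(((slot[1] & 0x0F) << 8) | slot[2]);
}
```

    With one slot per conversion, a whole block of samples can be described by one array of transfers pointing into these two buffers, which is the idea behind the HUNK loop in the excerpt.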
     
  7. nsaspook

    Still hacking the RPi2 driver to something usable.:)
    Things are looking pretty good for the MCP3X02 ADC driver section of the code. I can get an average of 50 usec per sample over a 1 second burst period using xoscope, with 25 usecs for wire-speed and ~25 for CPU overhead without DMA. Without DMA the output data stream is not totally continuous at full speed, as I have to stop the SPI transfer to process and copy data, but that can be fixed with a double-buffer on the SPI side by using another thread and core. When the driver is asked for less than the full-speed sample rate, the software inserts calculated delays between samples (or groups of samples during a two-channel scan) to adjust each sample to the correct time slice. I'm thinking about adding an option to always run at full speed and then integrate the extra samples (costing more CPU cycles) into one per requested time slice. (For example: the driver runs at 20,000 S/sec, the request is for 1000 S/sec, and the driver integrates every 20 samples into 1 for output to the data consumer.)
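    The "integrate every 20 samples to 1" option could look something like the sketch below. This is only an illustration of the idea in plain user-space C, assuming that "integrate" means a simple rounded average; the function name decimate_avg is hypothetical and nothing here is taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Average each group of `ratio` input samples into one output sample.
 * Returns the number of output samples written; a trailing partial
 * group is dropped.
 */
static size_t decimate_avg(const uint16_t *in, size_t n_in,
			   uint16_t *out, unsigned int ratio)
{
	size_t n_out = 0, i;

	assert(ratio > 0);
	for (i = 0; i + ratio <= n_in; i += ratio) {
		uint32_t sum = 0;
		unsigned int k;

		for (k = 0; k < ratio; k++)
			sum += in[i + k];
		out[n_out++] = (uint16_t)((sum + ratio / 2) / ratio); /* rounded mean */
	}
	return n_out;
}
```

    For the numbers in the post the ratio would be 20000 / 1000 = 20, so 1000 raw samples become 50 output samples.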
     
    Last edited: May 6, 2015
  8. nsaspook

    That's just about what I said before :p but I think I have a better idea: modifying the transmit command buffer on the fly.:) It seems easy as a concept if it were just a simple polled system. A Comedi command sequence can contain up to 256 separate channel, range, count and timing parameters in one block. The driver communicates back to the user program what it can do by modifying the sent command, and the user program can then either accept that to be run continuously in the background by the kernel thread (at Linux Ring zero for something close to real-time priority) or quit. A real DMA driver for SPI is expected in the Raspberry kernel at version 4.1 (the RPi foundation kernel is still at 3.18 and 4.01 is out), so all of this trickery won't be needed much longer to maintain precise sample timing. The speed won't be much faster, but the timing will be hard-locked to the DMA/CPU clock instead of a stream of instructions in the CPU execution cache.

    The driver also has the capability to use a PIC18 or 24 device to offload most of the hardware interface timing issues from the RPi, but that currently only works in sync/polled mode.
     
    Last edited: May 7, 2015
  9. nsaspook

    This is pretty close to the max you can get with the current RPi2 4.0.y driver via SPI with a 12-bit ADC/DAC system on the Gertboard (1 MHz ADC and 8 MHz DAC clocks).

    Two 8-bit analog signals (from a simple 256-entry table) are generated on the RPi2 by a separate program (bmc) using the Comedilib programming interface. They go out via the two AO MCP4822 DAC channels and are looped back into the two AI MCP3202 ADC channels, with Xoscope (via X on another machine) using Comedi as the data source at 10 kS/s with two channels per scan. The driver is working in an unblocked mode here, so long-term timing is affected. In blocked mode for AI sampling it holds the SPI bus for 1000-sample blocks in a kthread at a time, so timing is much better at the expense of other SPI users like AO.

    [​IMG]

     
    Last edited: May 29, 2015
  10. nsaspook

    Another video using asynchronous commands to both generate and display several waveforms concurrently. Linux kthreads (each bound to a core on the RPi2) are used by two I/O scan sequence commands to share the SPI link to the analog devices. I'm missing the real DMA driver (still waiting for the 'official PI SPI with DMA kernel driver' to be released), so the timing jitter is nasty at high sample rates when the SPI bus is not locked to the I/O data stream for 1000 sample periods, as seen in these demos.



    The ao_waveform program source is from the Debian comedilib demo source directory.
    https://packages.debian.org/sid/i386/libcomedi-dev/filelist

    board_info from the driver
    ao_waveform -v -c 0 -f /dev/comedi0_subd2 -n0
    -v verbose debug
    -c 0 first channel of the subdevice
    -f /dev/comedi0_subd2 is the AO device file created by the driver when it loads during boot
    -n 0 first waveform type

    Code (Text):
    /* ... First it creates a command to submit to the driver: */
    cmd.subdev = options.subdevice;
    cmd.flags = CMDF_WRITE;
    cmd.start_src = TRIG_INT;
    cmd.start_arg = 0;
    cmd.scan_begin_src = TRIG_TIMER;
    cmd.scan_begin_arg = 1e9 / options.freq;
    cmd.convert_src = TRIG_NOW;
    cmd.convert_arg = 0;
    cmd.scan_end_src = TRIG_COUNT;
    cmd.scan_end_arg = options.n_chan;
    cmd.stop_src = TRIG_NONE;
    cmd.stop_arg = 0;

    cmd.chanlist = chanlist;
    cmd.chanlist_len = options.n_chan;

    /* ... sends it to the driver */

    if ((err = comedi_command(dev, &cmd)) < 0) {
        comedi_perror("comedi_command");
        exit(1);
    }

    /* ... preloads the buffer with some data to size check */

    dds_output(data,BUF_LEN);
    n = BUF_LEN * sizeof(sampl_t);
    m = write(comedi_fileno(dev), (void *)data, n);
    if(m < 0){
        perror("write");
        exit(1);
    }else if(m < n)
    {
        fprintf(stderr, "failed to preload output buffer with %i bytes, is it too small?\n"
            "See the --write-buffer option of comedi_config\n", n);
        exit(1);
    }
    if (options.verbose)
        printf("m=%d\n",m);

    /* ... triggers the command to start */

    ret = comedi_internal_trigger(dev, options.subdevice, 0);
    if(ret < 0){
        perror("comedi_internal_trigger\n");
        exit(1);
    }

    /* ... writes more data */

    while(1){
        dds_output(data,BUF_LEN);
        n=BUF_LEN*sizeof(sampl_t);
        while(n>0){
            m=write(comedi_fileno(dev),(void *)data+(BUF_LEN*sizeof(sampl_t)-n),n);
            if(m<0){
                perror("write");
                exit(0);
            }
            if (options.verbose)
                printf("m=%d\n",m);
            n-=m;
        }
        total+=BUF_LEN;
        //printf("%d\n",total);
    }

    }
     
    Last edited: May 30, 2015
  11. nsaspook

    Some interesting points on Linux drivers and the kernel in general.
    http://kernelnewbies.org/FAQ/LinkedLists
    Many Linux device structures are managed by lists/queues/stacks and the Linux kernel provides several easy to use methods for them. This driver is a very simple example.

    In this RPi driver the SPI master hardware (with its own kernel driver) has two native chip selects, so two devices are created for the protocol code to use. CS0 is used for ADC devices and CS1 is used for DAC devices, with each device having channel selects in their respective command codes.

    When the RPi boots, device tables in the kernel (or user supplied tables) tell it which module(s) to load for each device probed. For this driver I created a table entry that selects my driver when the SPI master kernel module code is loaded.

    in arch/arm/mach-bcm2709/bcm2709.c
    Code (Text):
    #ifdef CONFIG_BCM2708_SPIDEV
    static struct spi_board_info bcm2708_spi_devices[] = {
    #ifdef CONFIG_SPI_SPIDEV
        {
            .modalias = "spidev",
            .max_speed_hz = 500000,
            .bus_num = 0,
            .chip_select = 0,
            .mode = SPI_MODE_0,
        }, {
            .modalias = "spidev",
            .max_speed_hz = 500000,
            .bus_num = 0,
            .chip_select = 1,
            .mode = SPI_MODE_0,
        }
    #endif
    #ifdef CONFIG_SPI_COMEDI
        {
            .modalias = "spigert",
            .max_speed_hz = 500000,
            .bus_num = 0,
            .chip_select = 0,
            .mode = SPI_MODE_0,
        }, {
            .modalias = "spigert",
            .max_speed_hz = 500000,
            .bus_num = 0,
            .chip_select = 1,
            .mode = SPI_MODE_0,
        }
    #endif
    };
    #endif
    If CONFIG_SPI_COMEDI is set when the kernel is built, then my module "spigert" is loaded and probed with two SPI devices, chip_select = 0 and chip_select = 1.
    ...
    The 'probe' code in my driver then looks for the correct chip_select and puts the device into a global Linux linked list, so that other functions in the 'daq_gert' part of the program can use it later.
    Code (Text):
    /*
     * Do only two chip selects for the Gertboard
     */
    if (spi->chip_select == CSnA) {
        /*
         * get a copy of the slave device 0 to share with comedi
         * we need a device to talk to the ADC
         */
        INIT_LIST_HEAD(&pdata->device_entry); /* create entry into the Comedi device list */
        pdata->slave.spi = spi;
        list_add_tail(&pdata->device_entry, &device_list); /* put entry into the Comedi device list */
    }
    if (spi->chip_select == CSnB) {
        /*
         * we need a device to talk to the DAC
         */
        INIT_LIST_HEAD(&pdata->device_entry);
        pdata->slave.spi = spi;
        list_add_tail(&pdata->device_entry, &device_list);
    }
    This creates a new entry for each correct chip_select seen during the two probes. The link structure inside my device structure, pdata->device_entry, is initialised with INIT_LIST_HEAD and then added to the queue 'device_list' with list_add_tail.
    ....
    After the probes are completed the Comedi DAQ protocol part of the driver is auto-loaded.
    Code (Text):
    if (gert_autoload)
        ret = comedi_auto_config(&slave_spi->spi->master->dev, &daqgert_driver, 0);
    ...
    This part of the driver needs to find SPI devices for it to send and receive analog data so it searches the spigert 'device_list' queue for matching devices from a 'board' structure.
    Code (Text):
    static const struct daqgert_board daqgert_boards[] = {
        {
            .name = "Gertboard",
            .board_type = 0,
            .n_aichan = 2,
            .n_aochan = 2,
            .ai_ns_min = 50000, /* values plus software overhead */
            .ai_ns_min_calc = 35000,
            .ai_rate_min = 20000,
            .ao_ns_min = 5000,
            .ao_ns_min_calc = 4500,
            .ao_rate_min = 10000,
            .ai_cs = 0,
            .ao_cs = 1,
            .ai_max_speed_hz = 1000000,
            .ao_max_speed_hz = 8000000,
        },
    ....
    static int32_t daqgert_auto_attach(struct comedi_device *dev, unsigned long unused_context)
    {
        const struct daqgert_board *thisboard = &daqgert_boards[gert_type];
        struct comedi_subdevice *s;
        int32_t ret, i;
        int32_t num_ai_chan, num_ao_chan, num_dio_chan = NUM_DIO_CHAN;
        struct daqgert_private *devpriv;
        struct comedi_spigert *pdata;
        struct spi_param_type *slave_spi_adc=NULL, *slave_spi_dac=NULL;
    ....
        /*
         * loop the spi device queue for needed devices
         */
        if (list_empty(&device_list))
            return -ENODEV;

        list_for_each_entry(pdata, &device_list, device_entry)
        {
            if (pdata->slave.spi->chip_select == thisboard->ai_cs) {
                slave_spi_adc = &pdata->slave;
                pdata->slave.spi->max_speed_hz = thisboard->ai_max_speed_hz;
                spi_setup(pdata->slave.spi);
                dev_info(dev->class_dev, "setup: spi cd %d: %d Hz: assigned to adc devices\n",
                    pdata->slave.spi->chip_select, pdata->slave.spi->max_speed_hz);
            } else {
                slave_spi_dac = &pdata->slave;
                pdata->slave.spi->max_speed_hz = thisboard->ao_max_speed_hz;
                spi_setup(pdata->slave.spi);
                dev_info(dev->class_dev, "setup: spi cd %d: %d Hz: assigned to dac devices\n",
                    pdata->slave.spi->chip_select, pdata->slave.spi->max_speed_hz);
            }
        }
        /*
         * check for possible bad spigert table entry
         */
        if (!slave_spi_adc || !slave_spi_dac)
            return -ENODEV;
    We only have 0 and 1 for chip_selects here, so a simple if statement is enough to check for a match with the needed 'thisboard->ai_cs' structure element. On a match, the SPI pointer held in the list entry (&pdata->slave, reached through the 'pdata' list-entry pointer) is copied to the local pointer variable (slave_spi_adc or slave_spi_dac) for actual data transmission use.
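    For readers who have not used the kernel list helpers, here is a small user-space sketch of the same pattern: a link structure embedded in a device record, entries appended to a list head, and a walk that matches on chip select. The helpers below are simplified stand-ins written for this example, not the kernel's <linux/list.h> itself, and struct demo_dev and find_by_cs are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel's intrusive list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points at itself. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry && head);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical device record; the real driver keeps a struct spi_device pointer. */
struct demo_dev {
	int chip_select;
	struct list_head device_entry;
};

/* Walk the list and return the first record with a matching chip select. */
static struct demo_dev *find_by_cs(struct list_head *head, int cs)
{
	struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct demo_dev *d = container_of(pos, struct demo_dev, device_entry);

		if (d->chip_select == cs)
			return d;
	}
	return NULL;
}
```

    The list node lives inside the record, so no extra allocation is needed per entry, and container_of recovers the enclosing record from the node pointer.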

    It seems a complex way just to set up a SPI device, but the Linux way of using non-global pointers to variables and dynamic variables with queues and lists makes it possible to write easily reusable C code that can automatically adjust to changes in the hardware with just a few table entries, avoid data duplication, and reduce the number of program locks needed for correct operation on multi-processor machines like the new 4-core RPi2 board.
     
    Last edited: Jun 3, 2015
  12. nsaspook

    The next step is to use the RPi2 multi-core capability.

    When kernel threads are created they run in the same memory space but can execute on different 'nodes' with SMP machine hardware. The RPi2 has four cores, so we want each of our two SPI I/O threads to have its own CPU, to reduce context switching latencies and to clock our sampling periods from the CPU the thread is running on instead of the user process CPU.

    First we check the number of CPU cores online and update the cores we want to run on from the board data structure.
    Code (Text):
    ... added node information
    static const struct daqgert_board daqgert_boards[] = {
        {
            .name = "Gertboard",
            .board_type = 0,
            ....
            .ai_node = 3,
            .ao_node = 2,
        },
    .....
        /*
         * setup kthreads on other cores if possible
         */
        cpu_nodes = num_online_cpus();
        dev_info(dev->class_dev, "%d cpu(s) online for threads\n", cpu_nodes);
        if (cpu_nodes >= 4) {
            devpriv->ai_node = thisboard->ai_node;
            devpriv->ao_node = thisboard->ao_node;
        }
    With this information we can call a thread setup routine using nodes 2&3 for execution units (the devpriv structure memory is set to zero in kzalloc so the default nodes are 0 if the cpu_node test fails).
    Code (Text):
    /*
     * make two threads for the i/o streams
     */
    static int32_t daqgert_create_thread(struct comedi_device *dev, struct daqgert_private *devpriv)
    {
        const char hunk_thread_name[] = "daqgerth", thread_name[] = "daqgert";
        const char *name_ptr;

        if (devpriv->hunk)
            name_ptr = hunk_thread_name;
        else
            name_ptr = thread_name;

        devpriv->ai_spi->daqgert_task = kthread_create_on_node(&daqgert_ai_thread_function, (void *) dev,
            cpu_to_node(devpriv->ai_node), "%s_a/%d", name_ptr, devpriv->ai_node);
        if (!IS_ERR(devpriv->ai_spi->daqgert_task)) {
            kthread_bind(devpriv->ai_spi->daqgert_task, devpriv->ai_node);
            wake_up_process(devpriv->ai_spi->daqgert_task);
        } else
            return PTR_ERR(devpriv->ai_spi->daqgert_task);

        devpriv->ao_spi->daqgert_task = kthread_create_on_node(&daqgert_ao_thread_function, (void *) dev,
            cpu_to_node(devpriv->ao_node), "%s_d/%d", name_ptr, devpriv->ao_node);
        if (!IS_ERR(devpriv->ao_spi->daqgert_task)) {
            kthread_bind(devpriv->ao_spi->daqgert_task, devpriv->ao_node);
            wake_up_process(devpriv->ao_spi->daqgert_task);
        } else
            return PTR_ERR(devpriv->ao_spi->daqgert_task);

        return 0;
    }
    We check if 'hunk' (SPI master controlled sample timing) is possible and, if so, select a process name for the thread with an 'h' appended.
    Next the actual thread is created on the node with that name and the wanted node number; a pointer to the task is returned with the process sleeping. The error condition for thread creation is a little special. A possible error is encoded in the returned pointer value itself (a negative errno value cast to a pointer, which falls outside the range of valid kernel addresses), so IS_ERR is used to detect this condition and PTR_ERR converts it back to a 'normal' error code to return. The process is then started with wake_up_process.
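    The error-pointer convention can be imitated in user space to see how it works. The sketch below is an illustration written for this post, not the kernel's <linux/err.h>: the lower-case helpers err_ptr, is_err and ptr_err mirror what ERR_PTR, IS_ERR and PTR_ERR do, and demo_create is a hypothetical function that stands in for something like kthread_create_on_node.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* same bound the kernel uses for error pointers */

/* Encode a negative errno value as a pointer at the very top of the address space. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True when the pointer value falls in the reserved error range. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int demo_task; /* stands in for a real task structure */

/* Hypothetical creator: returns a valid pointer, or an encoded error. */
static void *demo_create(int fail)
{
	return fail ? err_ptr(-ENOMEM) : (void *)&demo_task;
}
```

    A caller checks the result with is_err() first and only then either uses the pointer or returns ptr_err() as an ordinary negative error code, which is the shape of the two if/else blocks in daqgert_create_thread.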
    ....

    To get the best timing between samples without direct DMA hardware timing we use the hrtimer subsystem. On the RPi the hardware timer behind it is a 64-bit counter running at 1 MHz, while the Linux hrtimer resolution is 64-bit 1 nsec, so we are really limited to +-1 usec best-case precision. (see attachment)
    Because the code is now running on an SMP host with multi-level caches on each core, we need to make sure state flags are atomic and are updated on all cores. This is normally handled with smp_mb_* 'memory barrier' functions and atomic bit operations coded in machine assembly.
    https://www.kernel.org/doc/Documentation/memory-barriers.txt
    https://www.kernel.org/doc/Documentation/atomic_ops.txt
    Code (Text):
    smp_mb__before_atomic();
    set_bit(SPI_AO_RUN, &devpriv->state_bits);
    smp_mb__after_atomic();

    __set_current_state(TASK_UNINTERRUPTIBLE);
    pdata->kmin = ktime_set(0, pdata->delay_nsecs);
    schedule_hrtimeout_range(&pdata->kmin, 0, HRTIMER_MODE_REL_PINNED);
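    As a rough user-space analogue of the set_bit/barrier pattern above, the sketch below uses C11 atomics. It is an assumption-laden illustration, not kernel code: a sequentially consistent read-modify-write plays roughly the role of set_bit wrapped in the two smp_mb__* calls, the bit number given to SPI_AO_RUN is made up, and state_set, state_clear and state_test are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { SPI_AO_RUN = 1 }; /* bit number; the value is illustrative only */

static atomic_ulong state_bits; /* zero-initialised flag word shared between threads */

/* Atomically set one flag bit with sequentially consistent ordering. */
static void state_set(unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_or_explicit(&state_bits, 1UL << bit, memory_order_seq_cst);
}

/* Atomically clear one flag bit. */
static void state_clear(unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_and_explicit(&state_bits, ~(1UL << bit), memory_order_seq_cst);
}

/* Read the current value of one flag bit. */
static bool state_test(unsigned int bit)
{
	return (atomic_load_explicit(&state_bits, memory_order_seq_cst) >> bit) & 1UL;
}
```

    The point of the atomic read-modify-write is that two threads on different cores can set or clear different bits of the same word without losing each other's updates.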
    The hrtimer system uses rbtrees to sort events.
    https://www.kernel.org/doc/Documentation/rbtree.txt

    A 'top' screen-shot of the threads running on separate cores using the 'ps ax' command with the system generating and displaying a sawtooth wave to check for timing variations.
    [​IMG]
     
    Last edited: Jun 5, 2015
  13. nsaspook

    Now that most of the SPI-based routines are running it's time to check the actual timing so corrections can be made and maybe some software routines can be optimized for better performance. I used the old Tek 2465, triggered on the input start bit, to look at the input, clock and output data lines on the MCP3202 ADC chip. The RPi2 SPI master is programmed to send the data in three 8-bit segments.
    [​IMG]
    Top trace input, middle clock, bottom output data.
    [​IMG]
    1 MHz clock with SPI mode 3

    The total time between samples is about 45us with the software overhead (~26us of that is data transmission), but the design target is 50us for a 20 kHz max sample rate, so a timing delay will be added between samples to correct for that. The actual average sample jitter over several seconds is not too bad for a non-RT Linux system.
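    The correction itself is simple arithmetic: the target period minus the measured time per sample. A minimal sketch, assuming nanosecond units and a hypothetical helper named pad_delay_ns (this is not the driver's function):

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Nanoseconds of padding to add after each sample so that the total
 * period matches the requested rate. Returns 0 when the measured time
 * per sample is already as long as, or longer than, the requested period.
 */
static uint64_t pad_delay_ns(uint32_t rate_hz, uint64_t measured_ns)
{
	uint64_t period_ns;

	assert(rate_hz > 0);
	period_ns = NSEC_PER_SEC / rate_hz;
	return period_ns > measured_ns ? period_ns - measured_ns : 0;
}
```

    With the figures from the scope, pad_delay_ns(20000, 45000) gives 5000, i.e. about 5us of added delay per sample to reach the 50us design period.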

    The output bits are from the RPi2 generated analog sine wave signal.
     
    Last edited: Jun 8, 2015
  14. nsaspook

    If you use the RPi and need to write your own device driver then you must understand at some level how things are configured at boot. The traditional Linux way is to use kernel module tables and files to control and configure hardware, but on most new embedded systems, and now as the 'new way' on the RPi SoC, you will use 'device-tree' files instead. The original concept is from the PowerPC Open-Firmware line of systems: https://ols.fedoraproject.org/OLS/Reprints-2008/likely2-reprint.pdf

    http://events.linuxfoundation.org/sites/events/files/slides/petazzoni-device-tree-dummies.pdf
    Video of the above pdf:

    The Raspberry Pi's version with some needed changes is here: https://www.raspberrypi.org/documentation/configuration/device-tree.md

    The 'overlay' tree for the daq_gert module is simple as it uses the existing platform tree and modifies only a few things.

    1. It's compatible with three existing SPI driver platform structures that create the spi0.0 and spi0.1 devices.
    2. An existing protocol driver for accessing the SPI devices (spidev) is disabled so I can use them.
    3. I then specify which GPIO pins the master spi0 will use to communicate with the devices (spi0_pins).
    4. The information needed to use the devices with my driver is given (spigert0/1).
    (listed in the compatible platform files)
    Code (Text):
    /*
     * Device Tree overlay for the spigert module used by Comedi daq_gert analog devices
     *
     */

    /dts-v1/;
    /plugin/;

    / {
        compatible = "brcm,bcm2835", "brcm,bcm2708", "brcm,bcm2709";
        /* disable spi-dev for spi0.0 & spi0.1 */
        fragment@0 {
            target = <&spi0>;
            __overlay__ {
                status = "okay";

                spidev@0{
                    status = "disabled";
                };
                spidev@1{
                    status = "disabled";
                };
            };
        };

        fragment@1 {
            target = <&gpio>;
            __overlay__ {
                spi0_pins: spi0_pins {
                    brcm,pins = <7 8 9 10 11>;
                    brcm,function = <4>; /* alt0 */
                };
            };
        };

        fragment@2 {
            target = <&spi0>;
            __overlay__ {
                #address-cells = <1>;
                #size-cells = <0>;
                pinctrl-names = "default";
                pinctrl-0 = <&spi0_pins>;

                spigert@0 {
                    compatible = "spigert";
                    reg = <0>;  /* CE0 */
                    spi-max-frequency = <500000>;
                    status = "okay";
                };

                spigert@1 {
                    compatible = "spigert";
                    reg = <1>;  /* CE1 */
                    spi-max-frequency = <500000>;
                    status = "okay";
                };
            };
        };
    };
    Output from the running system device-tree: dtc -I fs /proc/device-tree
    Code (Text):
    spi@7e204000 {
        reg = <0x7e204000 0x1000>;
        dmas = <0x3 0x6 0x3 0x7>;
        interrupts = <0x2 0x16>;
        pinctrl-0 = <0x27>;
        compatible = "brcm,bcm2835-spi";
        cs-gpios = <0x0 0x0>;
        clocks = <0x7>;
        status = "okay";
        #address-cells = <0x1>;
        phandle = <0x13>;
        #size-cells = <0x0>;
        dma-names = "tx", "rx";
        pinctrl-names = "default";
        linux,phandle = <0x13>;

        spidev@0 {
            reg = <0x0>;
            compatible = "spidev";
            spi-max-frequency = <0x7a120>;
            status = "disabled";
            #address-cells = <0x1>;
            #size-cells = <0x0>;
        };

        spidev@1 {
            reg = <0x1>;
            compatible = "spidev";
            spi-max-frequency = <0x7a120>;
            status = "disabled";
            #address-cells = <0x1>;
            #size-cells = <0x0>;
        };

        spigert@0 {
            reg = <0x0>;
            compatible = "spigert";
            spi-max-frequency = <0x7a120>;
            status = "okay";
        };

        spigert@1 {
            reg = <0x1>;
            compatible = "spigert";
            spi-max-frequency = <0x7a120>;
            status = "okay";
        };
    };
    Output from the system boot log:
    Code (Text):
    [35090.568221] daq_gert: module is from the staging directory, the quality is unknown, you have been warned.
    [35090.570027] spigert spi0.1: setup: cd 1: bpw 8, mode 0x3
    [35090.570260] spigert spi0.0: setup: cd 0: bpw 8, mode 0x3
    [35090.571456] comedi comedi0: setup: spi cd 1: 8000000 Hz: assigned to dac devices
    [35090.571488] comedi comedi0: setup: spi cd 0: 1000000 Hz: assigned to adc devices
    [35090.571821] comedi comedi0: Gertboard WiringPi pins setup
    [35090.571845] comedi comedi0: RPi new scheme rev a21041, serial 00000000f3083d7a, new rev 1
    [35090.571863] comedi comedi0: driver gpio board rev 3
    [35090.571885] comedi comedi0: Gertboard WPi pins set [0..7] to outputs
    [35090.571901] comedi comedi0: Gertboard spi slave device detection started
    [35090.572242] comedi comedi0: Gertboard adc board pre detect code 0, daqgert_conf option value 1
    [35090.572400] comedi comedi0: Gertboard adc chip board detected, 2 channels, range code 0, device code 0, PIC code 0, detect code 0
    [35090.572452] comedi comedi0: 4 cpu(s) online for threads
    [35090.572987] comedi comedi0: daq_gert attached: gpio iobase 0x3f200000, ioremaps 0xbe5e0000  0xf3003000, io pins 0x0, 1Mhz timer value 0x8:0x2badae9c
    [35090.580193] comedi comedi0: driver 'daq_gert' has successfully auto-configured 'Gertboard'.
     
    Last edited: Jun 12, 2015
  15. nsaspook

    Timing stability from the new spi-2835 RPi driver, still no DMA but it's much better.


    Top to bottom (MCP3202/channel 0 then 1) with delay_usecs working great: data in (trigger), chip select, clock (1 MHz), data out
    [​IMG]
    [​IMG]

    Top to bottom (MCP3202/channel 0 then 1) with delay_usecs: data in, chip select (trigger), clock (1 MHz), data out
    [​IMG]
     
  16. nsaspook

    Fixed some bugs in my driver and worked around some quirks in the spi framework for Linux to finally get the correct timing.

    The project target is a precisely timed 20k S/sec: sample ADC/DAC channels 0 and 1 as close together as possible, then delay for the correct amount of time for two samples before restarting the process. The whole sequence is sent as a 1000-sample transfer queue in one message from the SPI master to reduce timing jitter. It's off frequency by about 1% over the needed S/sec, but well within tolerance for regular Linux without direct SPI DMA timing.

    The old 2465 with all the options is still the wonder of the analog scope world.
    [​IMG]

     
  17. nsaspook

    200000 sample scan uniformity test
    [​IMG]

    Cpu usage:
    [​IMG]
     
    Last edited: Jun 20, 2015
  18. nsaspook

  19. nsaspook

    The protocol driver has been updated to use the ADS8330 ADC chip as an option. I needed something with a higher resolution and sample rate for a project. The ADS8330 is a 16-bit, 1 MHz, dual-input ADC that in this driver configuration can sample at up to 325 ksps using the SPI interface at ~16 MHz. One of the neat things about this chip is the ability to cascade several chips into a SPI daisy-chain with manual triggers (using a PIC controller for the device sequencer glue). In this Linux driver I've configured the device for auto-triggered conversions using the internal 21 MHz conversion clock.
    [​IMG]
    http://www.ti.com/lit/ds/symlink/ads8330.pdf
    [​IMG]
    SPI buffered 'hunk' transfer mode 16-bit sampling. 3.08 usecs

    [​IMG]
    Single acquisition read times between 16-bit samples using a Comedi.org compatible C program. 13.60 usecs.

    I should be able to up the SPI clock to 32 MHz with a proper board instead of the prototype wire jungle. :D

    [​IMG]
    [​IMG]
     
    Last edited: Oct 7, 2016