Speech Comparing

Discussion in 'Embedded Systems and Microcontrollers' started by Jorgy, Nov 26, 2009.

  1. Jorgy

    Thread Starter New Member

    Oct 10, 2009
    I am having trouble finding a working way to compare two speech inputs from the microphone using the AC97 codec on the Virtex-II Pro. I can record input from the mic and play it back at the push of a button, but what I want is to compare two inputs and see whether the same word was said. I have the following for the read and write commands:

    //------------ To Record ---------------------//
    if (~pressed & BTN_UP) {
        print("-- recording --\r\n");
        while (sampleCnt < SAMPLE_SIZE) {
            // loop body not shown in the post; it stores each ReadFifo value in the array
        }
    }

    //------------ To Playback -------------------//
    if (~pressed & BTN_DOWN) {
        print("--Playing sample -\r\n");
        for (i = 0; i < SAMPLE_SIZE; i++) {
            recorded1 = SAMPLE1[i];  // index restored; the forum markup appears to have eaten "[i]"
            XAC97_WriteFifo(XPAR_AUDIO_CODEC_BASEADDR, recorded1);
        }
    }

    Like I said, I can record multiple inputs and play them back with no problems. The problem comes when I want to compare them to see if the same word was said by the same person. I am storing the values from ReadFifo in an array, and my assumption was that if a word was said twice into the mic as two separate inputs, and these inputs were stored in different arrays, then the arrays should be the same. I found out this assumption was wrong when I monitored the data being stored (I was getting values like -10089 198678 67811); it just seems random. I then took the sum of each array and compared the sums to see if they were the same. Even when I repeatedly said the same thing in the same tone and volume, I got extremely different results (1173899756 vs -67324328582), even though the inputs sound identical when played back. I am going to try a few more things, but I wanted to see if anyone had suggestions for what might work.
  2. beenthere

    Retired Moderator

    Apr 20, 2004
    If you read through the material at the link - http://en.wikipedia.org/wiki/Sound_card - you will get a better idea of how a sound card works.

    In the specific case you are dealing with, one thing you have to know is the setting for the A to D converter. It can be set to several different resolutions (bits per conversion). This has a great deal to do with interpreting the output data stream. The converter also alternates right/left channel conversions, and can adjust the conversion rate from 44.1 kHz to several lower rates. You need to know all this in order to set up the array.

    This means you have to know the bit length of each conversion and how to assign the two channels' outputs for proper storage in the array. Once you have done that, it may be a lot easier to find matches in similar words. If your conversion is at the highest rate of 44.1 kHz, there are going to be a lot of individual digital values to compare.
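
    As a rough sketch of that unpacking step, here is one way the raw FIFO words could be split into per-channel signed samples before any comparison. It rests on an assumption you would need to check against the codec and driver documentation: that each 32-bit word holds one stereo frame, with a signed 16-bit left sample in the upper half and a signed 16-bit right sample in the lower half. The function names are made up for illustration and are not part of the Xilinx driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret the low 16 bits of v as a two's-complement signed sample. */
static int16_t to_signed16(uint32_t v)
{
    v &= 0xFFFFu;
    return (int16_t)(v >= 0x8000u ? (int32_t)v - 0x10000 : (int32_t)v);
}

/* ASSUMPTION: left channel in bits 31..16, right channel in bits 15..0.
   Swap the two lines in the loop if your hardware packs them the other way. */
static void unpack_frames(const uint32_t *words, size_t n,
                          int16_t *left, int16_t *right)
{
    assert(words != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < n; i++) {
        left[i]  = to_signed16(words[i] >> 16);
        right[i] = to_signed16(words[i]);
    }
}

/* Number of individual sample values produced per second of audio. */
static uint32_t values_per_second(uint32_t sample_rate_hz, uint32_t channels)
{
    return sample_rate_hz * channels;
}
```

    Under that assumed layout, a raw word of 0xD8970064 unpacks to -10089 on the left channel and 100 on the right, and values_per_second(44100, 2) gives 88200 values for every second recorded. Treating a packed word as one plain integer mixes the two channels together, which may be part of why the stored values look random.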