Programmer's Corner: Discussion forum for all aspects of programming and software engineering. Any software programming language welcome: C, C++, C#, Fortran, Java, Matlab, etc.
#11
Quote:
Many of the functions and mathematical operations can be implemented using built-in Matlab functions, allowing the user to abstract away some of the complexities. The only real mathematical operation is the FFT; have you come across this before? Have you done the standard Fourier Transform? I'll explain it if you are unsure about how it works.
Dave
__________________
"If I have seen further, it is by standing on the shoulders of giants" - Sir Isaac Newton All About Circuits
#12
Hi Dave,
Have you done the standard Fourier Transform? Yes, I have, in signals analysis and communication courses, but not the FFT yet. I'm supposed to take the DSP course next semester, "including FFT", but what I want is to understand the Matlab code posted previously and get the idea mathematically, i.e. what is the relation between the matrix we get out of the FFT after "standardization" (?) using the "bins" (?), and the correlation which the code used for comparing and judging the voice? I hope you got the idea. Thanks for your help.
Last edited by afab1986; 12-17-2007 at 09:39 AM.
#13
Quote:
When you perform the DFT you are mapping a time-domain signal (such as a voice signal) into the frequency domain. When we consider voice comparison we are looking not at pitch similarities (which to the human ear are similar for different people) but at a match of frequency components in the sound output. So where person A may sound, to you and me, like person B, if we were to map the audio into the frequency domain we would expect distinct differences that mark one from the other. The conjugation and absolute-value operations are nothing more than crude conditioning operations that allow one spectrum to be compared with another.

Why split into bins? An FFT bin, as used here, is a set of values from the FFT matrix that contains the energy (or effective voltage) from a frequency range; it is not a single frequency. Single frequency components are not of much use because of other, often experimentally related, variables. We could say there is safety in numbers. It is important to stress that too small a bin size is useless for comparative purposes, whereas too large a bin dilutes the result. There are lots of sources on how to determine your bin size in relation to your frequency range. You also need to average your (energy) value across the bin to obtain a single value for that bin. Finally you divide the FFT matrix by the average value for the bin within that frequency range; this tells you how your FFT value (your frequency mapping for your chosen voice sample) for a particular frequency component compares to the average for that bin.

Plotting energy against bin/frequency will show you a mapping from which you can judge the similarity of two voices. As I stated previously, this is a crude method that will allow you to distinguish between two different people. There are many further tweaks and analysis techniques you can implement to make the recognition package better, but hopefully this will give you a start.
Dave
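The binning and comparison idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked scripts: the synthetic two-tone "voices", the sample rate `fs`, the bin count `nBins`, the helper names, and the choice to scale each signature to unit sum are all assumptions, and `corrcoef` simply stands in for whatever correlation measure the original script used.

```matlab
% Sketch: binned power spectra compared by correlation.
% All names and parameter values here are illustrative assumptions.
fs = 8000;                          % assumed sample rate in Hz
t  = (0:fs-1)' / fs;                % one second of samples (column vector)
voiceA = sin(2*pi*200*t) + 0.5*sin(2*pi*400*t);   % stand-in for speaker A
voiceB = sin(2*pi*300*t) + 0.5*sin(2*pi*900*t);   % stand-in for speaker B

nBins = 40;                         % assumed number of frequency bins

% Hypothetical helpers (anonymous functions, so this runs as a plain script)
powerSpectrum = @(x) abs(fft(x)).^2 / numel(x);        % same as X.*conj(X)/N
lowerHalf     = @(P) P(1:floor(numel(P)/2));           % drop the mirrored half
binAverage    = @(P, n) mean(reshape(P(1:n*floor(numel(P)/n)), [], n), 1)';
signature     = @(x) binAverage(lowerHalf(powerSpectrum(x)), nBins);

% One average energy value per bin, scaled so loudness does not matter
sigA = signature(voiceA);  sigA = sigA / sum(sigA);
sigB = signature(voiceB);  sigB = sigB / sum(sigB);

% Correlation coefficient between signatures: near 1 means similar shape
cAB = corrcoef(sigA, sigB);  similarityAB = cAB(1, 2);
cAA = corrcoef(sigA, sigA);  similarityAA = cAA(1, 2);

fprintf('A vs B: %.3f, A vs A: %.3f\n', similarityAB, similarityAA);
```

With real recordings the two signatures would come from separate voice samples, and the bin count would need tuning: too few bins blur speakers together, too many make the score sensitive to noise.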
#14
I would suggest using cepstral analysis instead of the FFT. It is much easier to recognise patterns deriving from the effect of the vocal tract physiology in the cepstral domain.
Cepstrum
If you are an IEEE member you could search IEEE Xplore for a plethora of cepstrum-related voice recognition papers. Good work
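For readers unfamiliar with the cepstrum suggested above, here is a small sketch of the real cepstrum (inverse FFT of the log magnitude spectrum) applied to a synthetic harmonic signal. The signal, the sample rate, the frame length, the small spectral floor, and the pitch search range are all assumptions chosen for illustration; real speech frames would normally be windowed first.

```matlab
% Sketch: real cepstrum of a synthetic "voiced" frame.
% All values below are illustrative assumptions, not from the thread.
fs = 8000;                         % assumed sample rate in Hz
N  = 800;                          % frame length: exactly 10 periods of 100 Hz
t  = (0:N-1)' / fs;
f0 = 100;                          % assumed pitch of the test signal in Hz

% Harmonic series with a 1/k amplitude roll-off as a crude voice stand-in
x = zeros(N, 1);
for k = 1:39
    x = x + (1/k) * sin(2*pi*f0*k*t);
end

% Real cepstrum: inverse FFT of the log magnitude spectrum.
% The small floor is a numerical guard against log(0) between harmonics.
magnitude = abs(fft(x));
floorVal  = 1e-2 * max(magnitude);
c = real(ifft(log(magnitude + floorVal)));

% c(q+1) holds quefrency q in samples. Look for the strongest peak in a
% plausible pitch range (60 Hz to 400 Hz, an assumed range).
minQ = round(fs / 400);
maxQ = round(fs / 60);
searchRange = (minQ:maxQ)';
[~, i] = max(c(searchRange + 1));
peakQ = searchRange(i);
pitchEstimate = fs / peakQ;        % pitch implied by the cepstral peak

fprintf('Cepstral peak at %d samples, pitch estimate %.1f Hz\n', peakQ, pitchEstimate);
```

The low-quefrency part of `c` describes the smooth spectral envelope (related to the vocal tract), while the peak at higher quefrency reflects the pitch period, which is why the two are easier to separate here than in the raw spectrum.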
#15
Quote:
Dave
#16
Hi Dave,
I will start researching and studying this project as soon as possible because I liked the idea, and hopefully I will implement it in my future graduation project. Thanks again, and keep up the good work.
#17
Quote:
Dave
#18
Hey, I've just started reading about speech recognition systems, and before starting to write code myself, I wanted to see the sample code someone mentioned above.
That page has been removed or something, so can anyone help me with this?
#19
Quote:
http://web.archive.org/web/200709021...ice/soundSig.m
http://web.archive.org/web/200709021...cs/voice/run.m
It is very crude and I would suggest you look at more advanced techniques (I have embellished here in this thread). But these codes are a good starting point.
Dave
#20
Hey Dave,
I have imported the sound file directly into Matlab and now I want to plot its FFT. I am trying some code but it's not working. I'm posting the code here; can you tell me where I am going wrong?

    [data,fs,nbits] = wavread("host.wav"); % Read wav file
    data_fft = fft(data); % Perform FFT
    P_data_fft = data_fft.* conj(data_fft) / size(data_fft,2); % Deduce Power Spectra
    f = 1000*(0:(size(data_fft,2)))/size(data_fft,2); % Define frequency range over which to plot power spectra. This is half the size of the fft since there is merely a reflection around the dc point
    plot(f,P_data_fft(1:(size(data_fft,2)+1))) % Plot
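One possible corrected version of the snippet above, offered as a sketch rather than a definitive answer. The likely problems are that `wavread` returns samples down the rows, so `size(data_fft,2)` is the channel count (1 for mono) rather than the number of samples; that the frequency axis is scaled by 1000 instead of the sample rate `fs`; and that the index range runs past the available data. To keep the example self-contained it uses a synthetic 440 Hz tone (an assumption) in place of `host.wav`.

```matlab
% Sketch: one-sided power spectrum of a mono signal.
% A synthetic tone stands in for the recording. With a real file one
% would read it instead, for example:
%   [data, fs] = audioread('host.wav');   % wavread in older releases
fs   = 8000;                       % assumed sample rate in Hz
t    = (0:fs-1)' / fs;
data = sin(2*pi*440*t);            % column vector, like one wav channel

data = data(:, 1);                 % keep the first channel if stereo
N    = length(data);               % number of samples (not size(data,2))

data_fft   = fft(data);                          % perform FFT
P_data_fft = real(data_fft .* conj(data_fft)) / N;   % power spectrum

% Keep DC up to Nyquist; the rest mirrors this half for real input
half   = floor(N/2) + 1;
f      = (0:half-1)' * fs / N;     % frequency axis in Hz
P_half = P_data_fft(1:half);

plot(f, P_half);
xlabel('Frequency (Hz)');
ylabel('Power');

% Sanity check: the strongest component should sit at the tone frequency
[~, k] = max(P_half);
peakFrequency = f(k);
fprintf('Strongest component at %.1f Hz\n', peakFrequency);
```

Note also that Matlab strings are conventionally written with single quotes, as in `'host.wav'`.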