All About Circuits Forum  

#11
12-15-2007, 09:51 AM
Dave
Administrator
Join Date: Nov 2003
Location: United Kingdom (GMT)
Posts: 6,645
Blog Entries: 17

Quote:
Originally Posted by afab1986
Hello everybody,
First of all, I am a new member here and I'm really interested in what you are all discussing.
The process that Dave added is good, but could anybody explain the operations mathematically, i.e. with equations, for Dave's process above?
I haven't taken a DSP course yet.
In fact, I may include this idea in my graduation project if I get it right.
Keep up the good work.

Hi,

Many of the functions and mathematical operations can be implemented using built-in Matlab functions, which lets the user abstract away some of the complexity.

The only real mathematical operation is the FFT. Have you come across this before? Have you done the standard Fourier Transform? I'll explain it if you are unsure about how it works.

Dave
__________________
"If I have seen further, it is by standing on the shoulders of giants" - Sir Isaac Newton

#12
12-17-2007, 09:33 AM
afab1986
New Member
Join Date: Dec 2007
Posts: 3

Hi Dave,
"Have you done the standard Fourier Transform?"
Yes, I have, in my signals analysis and communications courses, but not the FFT yet.
I'm supposed to take the DSP course, which includes the FFT, next semester.
What I want is to understand the Matlab code posted previously and get the idea mathematically, i.e. what is the relation between the matrix we get out of the FFT after "standardization"(?) using the "bins"(?), and the correlation that the code uses for comparing and judging the voice?
I hope you get the idea.
Thanks for your help.

Last edited by afab1986; 12-17-2007 at 09:39 AM.

#13
12-20-2007, 11:34 PM
Dave
Administrator

The FFT is just an efficient algorithm for computing the Discrete Fourier Transform (DFT). It achieves this by removing redundancy from the calculation, exploiting the symmetry and periodicity of something called the roots of unity (a Google search will explain what they are). Basically, it works on the principle that many of the products in the DFT are repeated, so they can be reused without being explicitly recalculated. The important point is that the FFT gives the same result as the DFT.
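
To put an equation to it: for an N-point sequence x(n), the DFT is X(k) = sum over n = 0..N-1 of x(n)*exp(-j*2*pi*k*n/N), for k = 0..N-1. The factors exp(-j*2*pi*k*n/N) are powers of the N-th root of unity, and many of them repeat, which is the redundancy the FFT exploits. As a minimal sketch (the test signal and variable names are made up for illustration, and this is not one of the scripts referenced in this thread), you can evaluate the sum directly and check it against Matlab's fft:

```matlab
% Direct DFT versus fft on a short, made-up test signal
N = 8;
n = 0:N-1;
x = cos(2*pi*n/N) + 0.5*sin(2*pi*2*n/N);   % arbitrary test signal

% W(k+1,n+1) = exp(-j*2*pi*k*n/N): powers of the N-th root of unity
W = exp(-1j*2*pi*(n')*n/N);
X_dft = (W * x.').';      % direct evaluation of the DFT sum, O(N^2) work
X_fft = fft(x);           % same transform via the FFT, O(N log N) work

max_err = max(abs(X_dft - X_fft));   % difference between the two results
disp(max_err)
```

The difference should be at rounding-error level, which is all "the FFT is the DFT" means in practice.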

When you perform the DFT you are mapping a time-domain signal (such as a voice signal) into the frequency domain. For voice comparison we are not looking at pitch similarities (which, to the human ear, can be similar for different people) but at a match of frequency components in the sound output. So where person A may sound, to you and me, like person B, if we map the audio into the frequency domain we would expect distinct differences that mark one from the other.

The conjugation and absolute-value operations are nothing more than crude conditioning steps: they turn the complex FFT output into real powers or magnitudes so that one spectrum can be compared with another.
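
As a quick illustration of that conditioning (a sketch with made-up sample values, not the original script): multiplying each FFT value by its conjugate gives the squared magnitude, so it carries the same information as abs.

```matlab
X = fft([1 2 0 -1 3 0.5 -2 1]);   % FFT of a few made-up samples
P = X .* conj(X);                 % power: |X|^2, imaginary part zero up to rounding
A = abs(X);                       % magnitude: |X|
max_diff = max(abs(P - A.^2));    % how far P is from A squared
disp(max_diff)
```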

Why split into bins? A bin here is a set of values from the FFT matrix that contains the energy (or effective voltage) from a frequency range; it is not a single frequency. Single frequency components are not of much use on their own because they are sensitive to other, often experimentally related, variables. You could call it safety in numbers. It is important to stress that too small a bin size is useless for comparative purposes, whereas too large a bin dilutes the result. There are lots of sources on how to determine your bin size in relation to your frequency range. You also need to average your (energy) value across the bin to obtain a single value for that bin.

Finally, you divide the FFT matrix by the average value for the bin within that frequency range. This tells you how your FFT value (your frequency mapping for your chosen voice sample) for a particular frequency component compares with the average for that bin. Plotting energy against bin/frequency gives you a mapping from which you can judge the similarity of two voices.
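
The binning, averaging and comparison steps might look something like the sketch below. It is not the original script: the synthetic "voices", the sample rate, the bin count and all variable names are assumptions, the normalisation shown (dividing each profile by its own mean) is only one possible reading of the step described above, and the comparison uses a plain correlation coefficient, as mentioned in the question.

```matlab
% Sketch: bin-averaged power spectra of three signals, compared by correlation
fs = 8000;                                    % sample rate in Hz (assumed)
t  = (0:fs-1)/fs;                             % one second of samples
a  = sin(2*pi*200*t) + 0.5*sin(2*pi*900*t);   % stand-in for voice A
b  = a + 0.01*cos(2*pi*50*t);                 % second take of voice A
c  = sin(2*pi*350*t) + 0.5*sin(2*pi*1500*t);  % stand-in for voice B

N       = numel(t);
half    = N/2;                 % keep the non-mirrored half of the spectrum
nbins   = 40;                  % number of bins (assumed; tune for your data)
per_bin = half/nbins;          % FFT points per bin, here 100

sigs = [a; b; c];
S = zeros(3, nbins);           % one binned profile per row
for i = 1:3
    X = fft(sigs(i,:));
    P = real(X .* conj(X)) / N;                    % power spectrum
    P = P(1:half);
    S(i,:) = mean(reshape(P, per_bin, nbins), 1);  % average power per bin
    S(i,:) = S(i,:) / mean(S(i,:));                % remove overall loudness
end

% Pearson correlation between two binned profiles
pearson = @(u,v) sum((u-mean(u)).*(v-mean(v))) / ...
          sqrt(sum((u-mean(u)).^2) * sum((v-mean(v)).^2));
r_ab = pearson(S(1,:), S(2,:));   % same "speaker": expect a value near 1
r_ac = pearson(S(1,:), S(3,:));   % different "speaker": expect a low value
disp([r_ab r_ac])
```

Real recordings are much messier than these tones, so treat the correlation threshold for "same voice" as something to find experimentally.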

As I stated previously, this is a crude method that will allow you to distinguish between two different people. There are many further tweaks and analysis techniques you can implement to improve the recognition package, but hopefully this will give you a start.

Dave

#14
12-21-2007, 05:33 PM
BlackBox
Junior Member
Join Date: Apr 2007
Posts: 20

I would suggest using cepstral analysis instead of the FFT alone. It is much easier to recognise patterns deriving from the effect of the vocal tract physiology in the cepstral domain.

Cepstrum

If you are an IEEE member, you can search IEEE Xplore for a plethora of cepstrum-related voice recognition papers.

Good work

#15
12-21-2007, 06:12 PM
Dave
Administrator

Yes, it is certainly worth considering. You will still need to take the Fourier Transform (FFT) if you are using cepstral analysis, as it is an explicit step in the calculation. From my brief browse of IEEE Xplore, there is plenty of evidence to suggest that it is a suitable tool for voice recognition.
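
To show where the FFT sits in that calculation, here is a minimal sketch of the real cepstrum, i.e. the inverse FFT of the log magnitude spectrum. The signal is a made-up decaying sequence with an artificial echo rather than real speech, and the names and numbers are assumptions for illustration only.

```matlab
% Sketch: real cepstrum of a made-up signal containing an echo
N = 1024; D = 50; alpha = 0.5;         % length, echo delay, echo gain (assumed)
s = 0.9 .^ (0:N-1);                    % decaying stand-in for a speech frame
x = s;
x(D+1:end) = x(D+1:end) + alpha * s(1:end-D);   % add the delayed copy

X = fft(x);                            % the FFT is the first step
c = real(ifft(log(abs(X) + eps)));     % real cepstrum; eps guards against log(0)

% Look for the strongest peak away from the low-quefrency region
search = 21:N/2;                       % 1-based sample indices to search
[~, i] = max(c(search));
peak_quefrency = search(i) - 1;        % convert to a 0-based lag in samples
disp(peak_quefrency)
```

The periodic ripple that the echo puts on the spectrum shows up as a peak at the echo delay in the cepstrum, which is the same mechanism that separates pitch from vocal tract shape in speech.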

Dave

#16
12-22-2007, 09:00 PM
afab1986
New Member

Hi Dave,
I will start researching and studying this project as soon as possible, because I like the idea and hope to implement it in my graduation project.
Thanks again, and keep up the good work.

#17
12-23-2007, 10:46 AM
Dave
Administrator

Good luck with your research and project. Keep us posted on how it goes; I'd be interested in seeing your project come along. Also, feel free to ask any further questions. If we can help, we will.

Dave

#18
12-20-2008, 03:14 PM
d_devil
New Member
Join Date: Dec 2008
Posts: 1
Hey, I've just started reading about speech recognition systems, and before starting to write code myself, I wanted to see the sample code someone mentioned above.

Well, that page has been removed or something, so can anyone help me in this regard?

#19
12-24-2008, 08:18 AM
Dave
Administrator

The two scripts previously referenced can be retrieved from the Web Archive:

http://web.archive.org/web/200709021...ice/soundSig.m

http://web.archive.org/web/200709021...cs/voice/run.m

The method is very crude and I would suggest you look at more advanced techniques (I have elaborated on this earlier in this thread), but these scripts are a good starting point.

Dave

#20
03-25-2009, 01:13 PM
pulkit.143
New Member
Join Date: Mar 2009
Posts: 3
voice recognition

Hey Dave,
I have imported the sound file directly into Matlab and now I want to plot its FFT. I'm trying some code but it's not working. I'm posting the code here; can you tell me where I'm going wrong?


[data,fs,nbits] = wavread("host.wav"); % Read wav file
data_fft = fft(data); % Perform FFT
P_data_fft = data_fft.* conj(data_fft) / size(data_fft,2); % Deduce Power Spectra
f = 1000*(0:(size(data_fft,2)))/size(data_fft,2); % Define frequency range over which to plot power spectra. This is half the size of the fft since there is merely a reflection around the dc point
plot(f,P_data_fft(1:(size(data_fft,2)+1))) % Plot
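
A few things in the posted code look likely to cause trouble, assuming host.wav is a mono file: wavread returns the samples as a column vector, so size(data_fft,2) is 1 rather than the number of samples; the frequency axis should be scaled by the sample rate fs rather than a fixed 1000; and only the first half of the spectrum needs plotting. Older Matlab releases also expect single quotes around the file name. Below is a sketch of one possible corrected version. It uses a synthetic 440 Hz tone (an assumption, so that it runs without the file) in place of host.wav; for a real file you would load data and fs first, with wavread on older releases or audioread on newer ones.

```matlab
% Sketch of a corrected version on a synthetic tone instead of host.wav
fs   = 8000;                          % sample rate in Hz (assumed)
data = sin(2*pi*440*(0:fs-1)'/fs);    % column vector, like wavread output

N = length(data);                     % number of samples (assumed even)
data_fft   = fft(data);               % perform FFT
P_data_fft = data_fft .* conj(data_fft) / N;   % power spectrum
f      = fs * (0:N/2) / N;            % frequency axis: 0 Hz up to fs/2
P_half = real(P_data_fft(1:N/2+1));   % drop the mirrored half

plot(f, P_half); xlabel('Frequency (Hz)'); ylabel('Power');

[~, k]  = max(P_half);
peak_hz = f(k);                       % frequency of the largest peak
disp(peak_hz)
```

For a stereo file, pick one channel first, for example data = data(:,1).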
User-posted content, unless source quoted, is licensed under a Creative Commons Public Domain License.
Copyright © 2009, All About Circuits. All Rights Reserved.