Text-Independent Speaker Verification

This project is a text-independent speaker verification system based upon classification of Mel-Frequency Cepstral Coefficients (MFCC) using a minimum-distance classifier and a Gaussian Mixture Model (GMM) Log-Likelihood Ratio (LLR) classifier. The speaker recognition system was implemented in MATLAB using training data and test data stored in WAV files. I developed custom matching and testing routines based upon minimum-distance classification, extracted feature vectors using the melcepst function from the Voicebox toolkit, and used an open source GMM library. For testing, I used 8 speakers (4 male, 4 female) from the popular TIMIT speaker database, each saying two phonetically diverse sentences. One sentence was used for training and the other for testing. Manually tuning the threshold for the minimum-distance classifier resulted in 91% classification accuracy. I developed this project independently as part of a Digital Signal Processing course. A presentation and report follow. The code is available on GitHub.

The presentation provides an overview of the theory behind MFCCs, minimum-distance classification, and log-likelihood ratio classification using Gaussian Mixture Models, and discusses the experimental results of my implementation.
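As a rough illustration of the minimum-distance idea, here is a minimal sketch in Python rather than the project's MATLAB. It assumes each utterance is summarized by the mean of its per-frame feature vectors and that templates are compared with Euclidean distance. The post does not spell out the actual matching routine, so the function names, the toy 3-dimensional features (standing in for real MFCC frames), and the threshold of 0.5 are all hypothetical.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length feature vectors (one per audio frame)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(test_frames, enrolled_template, threshold):
    """Accept the identity claim if the mean of the test frames lies
    within `threshold` of the claimed speaker's enrolled template."""
    distance = euclidean(mean_vector(test_frames), enrolled_template)
    return distance <= threshold, distance


# Toy data (assumption): real MFCC frames would have more dimensions.
enrolled = mean_vector([[1.0, 2.0, 3.0], [1.2, 1.8, 3.1]])
genuine = [[1.1, 2.1, 2.9], [0.9, 1.9, 3.0]]
impostor = [[4.0, 0.5, 1.0], [4.2, 0.4, 1.1]]

accepted_genuine, d_genuine = verify(genuine, enrolled, threshold=0.5)
accepted_impostor, d_impostor = verify(impostor, enrolled, threshold=0.5)
print(accepted_genuine, accepted_impostor)
```

The threshold trades false accepts against false rejects, which is why it has to be tuned on held-out recordings instead of being fixed in advance.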

The report elaborates on the material in the presentation above. In particular, it describes both the theory and the practical details of a reference implementation of a text-independent speaker verification system. Furthermore, it discusses an advanced technique for classification based upon Gaussian Mixture Models (GMM). Finally, it discusses the results of a set of experiments performed using my reference implementation.
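To make the GMM log-likelihood ratio concrete, the following is a minimal Python sketch, not the project's MATLAB code. It assumes diagonal-covariance mixtures whose parameters are already trained, and it scores a test utterance as the average per-frame log-likelihood under the claimed speaker's model minus that under a background model. The model parameters, 2-dimensional toy frames, and threshold of 0 are hypothetical values chosen only for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    `gmm` is a list of (weight, mean, variance) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def llr_score(frames, speaker_gmm, background_gmm):
    """Log-likelihood ratio: speaker model versus background model."""
    return gmm_log_likelihood(frames, speaker_gmm) - gmm_log_likelihood(
        frames, background_gmm
    )


# Hypothetical, hand-picked model parameters (not trained on real speech).
speaker_gmm = [(0.5, [1.0, 2.0], [0.1, 0.1]), (0.5, [1.5, 2.5], [0.1, 0.1])]
background_gmm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]

genuine_score = llr_score([[1.1, 2.0], [1.4, 2.4]], speaker_gmm, background_gmm)
impostor_score = llr_score([[3.1, 2.9], [0.1, -0.2]], speaker_gmm, background_gmm)

threshold = 0.0  # assumption: in practice this is tuned on held-out data
print(genuine_score > threshold, impostor_score > threshold)
```

A positive score means the test frames are better explained by the speaker's model than by the background model. The scorer needs both models, so a background model has to be trained or otherwise supplied before any ratio can be computed.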


25 comments

By codyaray – September 15, 2010

25 Responses


Saad says

Hi, I really liked your code for the speaker recognition technique. Can you please tell me how many test sounds you’ve used?

Thanks

May 18, 2012, 2:37 am
Saad says

OK, you’ve used 8 speakers. Can you please send the test voices to my email?

Thanks in advance

May 18, 2012, 2:42 am
- codyaray says
  
  Thanks Saad. Unfortunately, I completed this project a while ago and don’t have the data files anymore. If you have access to the TIMIT data set, though, you can find an ample supply of speaker recordings. Otherwise, it’s easy enough to hook a mic to your computer (if it’s not built in, as on a Mac) and record your own data set. Just enlist the help of a few friends or co-workers.
  
  PS – If you’re a student, feel free to use this code for reference, but I’d still encourage you to develop your own solution.
  
  May 18, 2012, 9:47 am
Saad says

Thanks codyaray, thank you for your kind advice. I’m working on it, but first I want to understand the different codes available.
Your code is giving me the following error:
Attempted to access PC(2); index out of bounds because numel(PC)=1.

Error in Probability_of_Cluster_given_X (line 2)
PY = PC(Label_of_Cluster)*Mixing_Coefficient(X,Means(:,Label_of_Cluster),Variances(:,Label_of_Cluster));

Error in GMM (line 41)
Probability_of_Cluster_given_Point(i,j) = Probability_of_Cluster_given_X(Input(:,j),Mu,Variances,PC,i);

Error in main (line 63)
[training_idx1, training_mu1, training_sigma1] = GMM(training_features1', No_of_Clusters, No_of_Iterations);

Any help, please?

May 19, 2012, 1:25 pm
- codyaray says
  
  Try the timit.m file. It was the one I actually used. However, you have to have all the files present for it to work properly. If you only have a subset, comment out some of the code for interacting with the other files.
  
  May 20, 2012, 9:33 am
codyaray says

My guess is that the index-out-of-bounds error occurs because No_of_Clusters or No_of_Iterations is higher than it needs to be. But I haven’t used this code in a couple of years, so I don’t really recall the details off the top of my head.

May 20, 2012, 9:37 am
- Saad says
  
  I have reduced No_of_Clusters and No_of_Iterations, but now it is giving me a different error:
  
  Undefined function or variable 'background_mu'.
  
  Error in main (line 91)
  log(Cluster_Probability(testing_features1', training_mu1)) - log(Cluster_Probability(testing_features1', background_mu))
  
  May 23, 2012, 11:51 am
  - codyaray says
    
    As stated above, use timit.m instead of main.m.
    
    May 28, 2012, 12:35 pm
    - Saad says
      
      No, both timit.m and main.m are giving me the same errors. Also, can you explain how I can make it run in real time?
      
      Thanks
      
      May 28, 2012, 2:53 pm
      - codyaray says
        
        Sorry, I can’t help you any further. Now I recall (and the presentation shows) that the GMM implementation wasn’t quite complete. I don’t have access to MATLAB any longer to help complete it. This code should give you a guideline, though, not solve your homework problems for you. You need to understand what’s going on to really make use of it.
        
        May 28, 2012, 6:34 pm
Saad says

Can you please tell me one more thing: what is ‘background_mu’? Have you assigned some variable to it, or is it just a function?

June 14, 2012, 2:19 pm
ritika says

@saad could you finish the thing?

February 17, 2013, 6:44 am
ajay mishra says

Hi, I really want your code for the speaker recognition technique. Can you please give it to me as soon as possible? And can you tell me how many test sounds you’ve used?

regards
ajay mishra

February 20, 2013, 12:30 am
- codyaray says
  
  Both a link to the code and the number of test sounds are included in the article. Please read. Also note that the GMM implementation was incomplete, if that’s what you’re seeking. Good luck.
  
  February 20, 2013, 10:51 am
rania says

hi
please send your code for this

March 8, 2014, 1:10 am
- codyaray says
  
  Hi rania,
  
  There’s a link to the source in the first paragraph: https://github.com/codyaray/speaker-recognition. Note that this is quite old and unsupported at this point.
  
  -Cody
  
  March 9, 2014, 12:02 pm
tony says

hi
can we use SVM for speaker recognition?

April 7, 2014, 2:52 pm
Priya says

hiii…

Nowadays, phone log-likelihood ratios (PLLR) are used for speaker recognition. Can you please give some idea of how to implement them in MATLAB?

January 18, 2015, 7:20 am
- codyaray says
  
  Sorry – I haven’t touched this in a long time. If you figure it out, feel free to drop a note here with resources for others.
  
  March 4, 2015, 10:28 pm
Joseph says

hello codyaray

Is this speaker-independent or speaker-dependent?
I ask because I’m looking for a speaker-independent system.

thanks a lot

April 17, 2015, 8:26 am
codyaray says

Joseph – As this is speaker *verification*, it only makes sense in a speaker-dependent context. That is, the objective is to verify whether a test sample corresponds to a known speaker.

-Cody

April 20, 2015, 12:50 pm
hany1992 says

Hi Codyaray
please reply to the question below:
“Can you please tell me one more thing: what is ‘background_mu’? Have you assigned some variable to it, or is it just a function?”
Thanks a lot for your good site & code 😉

February 22, 2017, 2:36 am
- hany1992 says
  
  because I have the same error:
  “Undefined function or variable ‘background_mu’.”
  in both timit and main
  
  February 22, 2017, 2:39 am
imene says

Hello,
I’m working on a final project for my master’s degree on biometric voice authentication for access to a Wi-Fi network.
I have done the speaker verification and obtained an EER of 0.14.
Now I need help writing a MATLAB program to test the acceptance and rejection of a speaker.

May 28, 2018, 1:38 pm
Vinay says

I downloaded the code from https://github.com/codyaray/speaker-recognition. I noticed that a few functions were missing, such as GMM() and Cluster_Probability(). I downloaded Voicebox and could not find these functions there either. Please let me know how to get access to them.

September 21, 2018, 6:12 am