International Journal of Engineering Science and Technology , 2010,
Abstract: The AUDIO SIGNAL PROCESSING (Speaker Recognition) project implements a recognizer in MATLAB that identifies a person by processing his or her voice. The MATLAB functions and scripts were all documented and parameterized so that they can be reused in the future. The basic goal of the project is to recognize and classify the speech of different persons. This classification is based mainly on key features, such as Mel Frequency Cepstral Coefficients (MFCC), extracted from the speech signals of those persons in MATLAB. These features may also include pitch, amplitude, frequency, etc. Using a statistical model such as a Gaussian mixture model (GMM) and the features extracted from the speech signals, we build a unique identity for each person enrolled for speaker recognition. The Expectation-Maximization (EM) algorithm, an elegant and powerful method for finding the maximum likelihood solution for a model with latent variables, is used to fit the models, and later speech is tested against the database of all enrolled speakers.
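The identification step described in this abstract (score the test features under each enrolled speaker's GMM and pick the best match) can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' MATLAB code: the models are hand-set rather than trained with EM, the 2-D frames stand in for MFCC vectors, and the names `identify`, `speaker_models`, `alice` and `bob` are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components for numerical stability.
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def identify(frames, speaker_models):
    """Return the enrolled speaker whose GMM best explains the frames."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )

# Toy 2-D "feature" frames and hand-set models (illustrative values only;
# a real system would fit each GMM to enrollment MFCCs with EM).
speaker_models = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "bob": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best = identify(test_frames, speaker_models)
```

Averaging the log-likelihood over frames keeps scores comparable across utterances of different lengths.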
Iterative Unsupervised GMM Training for Speaker Indexing
M. Paralic,R. Jarina
Radioengineering , 2007,
Abstract: The paper presents a novel algorithm for speaker searching and indexing based on unsupervised GMM training. The proposed method does not require a predefined set of generic background models, and the GMM speaker models are trained only from test samples. The constraint of the method is that the number of speakers has to be known in advance. The results of initial experiments show that the proposed training method makes it possible to create precise GMM speaker models from only a small amount of training data.
FPGA Implementation for GMM-Based Speaker Identification
Phaklen EhKan,Timothy Allen,Steven F. Quigley
International Journal of Reconfigurable Computing , 2011, DOI: 10.1155/2011/420369
Abstract: In today's society, highly accurate personal identification systems are required. Passwords or pin numbers can be forgotten or forged and are no longer considered to offer a high level of security. The use of biological features, biometrics, is becoming widely accepted as the next level for security systems. Biometric-based speaker identification is a method of identifying persons from their voice. Speaker-specific characteristics exist in speech signals due to different speakers having different resonances of the vocal tract. These differences can be exploited by extracting feature vectors such as Mel-Frequency Cepstral Coefficients (MFCCs) from the speech signal. A well-known statistical modelling process, the Gaussian Mixture Model (GMM), then models the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes the hardware implementation for classification of a text-independent GMM-based speaker identification system. The aim was to produce a system that can perform simultaneous identification of large numbers of voice streams in real time. This has important potential applications in security and in automated call centre applications. A speedup factor of ninety was achieved compared to a software implementation on a standard PC.
1. Introduction
Speaker recognition is an important branch of speech processing. It is the process of automatically recognizing who is speaking by using speaker-specific information included in the speech waveform. It is receiving increasing attention due to its practical value and has applications ranging from police work to automation of call centers. Speaker recognition can be classified into speaker identification (discovering identity) and speaker verification (authenticating a claim of identity).
A closed-set speaker identification system selects the speaker in the training set who best matches the unknown speaker. Open-set speaker identification allows for the possibility that the unknown speaker may not exist in the training set; thus, an additional decision alternative is required for the unknown speaker who does not match any of the models in the training set [1]. Reconfigurable computing systems use reconfigurable hardware to augment a CPU-based system. The application is decomposed into parts running on the CPU and parts running on the reconfigurable hardware, which is used to form a custom hardware accelerator for the portions of the algorithm that are capable of
Multistage VQ Based GMM For Text Independent Speaker identification System
Piyush Lotia,M.R. Khan
International Journal of Soft Computing & Engineering , 2011,
Abstract: Gaussian Mixture Models (GMMs) are the most common choice in speaker identification because they can be used in a completely text-independent setting. Although effective for speaker identification, they require long processing times in practice. In this paper, we propose a decision function that uses vector quantization (VQ) techniques to reduce the GMM training model and thereby the processing time. In the first phase of the proposed model, we exploit the computational simplicity of VQ to distinguish between male and female speakers. In the second phase of classification, decision tree rules are applied to split similar speakers of the same gender into two different groups. In the third phase, GMM is applied to the resulting subgroup of speakers to obtain the accuracy rates. Experimental results show that our hybrid VQ/GMM method consistently improved accuracy and reduced processing time by almost 20%.
Application of Speaker Recognition Based on LSSVM and GMM Mixture Model
Haiyan Yang,Xinxing Jing,Ping Zhou
Information Technology Journal , 2012,
Abstract: Speaker recognition technology is maturing. This study discusses a speaker recognition system based on a mixture of a Least Squares Support Vector Machine (LSSVM) and a Gaussian Mixture Model (GMM). The system is designed with application in an Internet environment in mind. The performance of different feature parameters, such as LPCC, MFCC and WPTMFC, is compared. A comparison of the LSSVM model with the LSSVM-GMM mixture model shows that the mixture model achieves a better recognition rate, so it is chosen. The final results show that the system has a good recognition rate and has potential for practical applications.
Archana Shende,Subhash Mishra,Shiv Kumar
International Journal of Soft Computing & Engineering , 2011,
Abstract: The performance of speaker recognition systems has improved due to recent advances in speech processing techniques, but there is still room for improvement. In this paper, we compare different parameters used in an automatic speech recognition system in order to increase the accuracy of the system. The main goal is a detailed evaluation of parameters such as window type, MFCC frame size, number of Gaussian mixtures, and the GMM and VQ/GMM techniques. We also propose a decision function that uses vector quantization techniques to reduce the GMM training model and thereby the processing time.
The Likelihood Ratio Decision Criterion for Nuisance Attribute Projection in GMM Speaker Verification
Boštjan Vesnicer,France Mihelič
EURASIP Journal on Advances in Signal Processing , 2008, DOI: 10.1155/2008/786431
Abstract: We propose a way of integrating the likelihood ratio (LR) decision criterion with nuisance attribute projection (NAP) for Gaussian mixture model- (GMM-) based speaker verification. Experiments on the core test of the NIST speaker recognition evaluation (SRE) 2005 data show that the performance of the proposed approach is comparable to that of the standard NAP approach, which uses support vector machines (SVMs) as the decision criterion. Furthermore, we demonstrate that the two criteria provide complementary information that can significantly improve verification performance if a score-level fusion of both approaches is carried out.
GMM-UBM Based Speaker Verification in Multilingual Environments
Utpal Bhattacharjee,Kshirod Sarmah
International Journal of Computer Science Issues , 2012,
Abstract: Speaker verification systems perform poorly when the speaker model is trained in one language and tested in another. This is a major problem in multilingual speaker verification systems. In this paper, we report an experiment carried out on a recently collected multilingual and multichannel speaker recognition database to study the impact of language variability on speaker verification. The database consists of speech recorded from 200 speakers whose mother tongue is one of the Arunachali languages of North-East India. The speech samples were collected in three languages: English, Hindi and a local language of Arunachal Pradesh. The database is evaluated with a Gaussian Mixture Model based speaker verification system that uses a universal background model (UBM) for alternative speaker representation and Mel-Frequency Cepstral Coefficients (MFCC) as front-end feature vectors. The impact of the mismatch between training and testing languages has been evaluated.
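A GMM-UBM verifier of the kind evaluated here typically accepts or rejects a claimed identity by comparing the likelihood of the test features under the claimed speaker's model with their likelihood under the UBM. The following Python sketch shows that log-likelihood-ratio decision under simplifying assumptions: features are 1-D, both models are hand-set (real systems usually derive the speaker model from the UBM by adaptation), and the threshold of 0.0 and the names `verify`, `ubm` and `target` are illustrative, not taken from the paper.

```python
import math

def avg_log_lik(frames, gmm):
    """Mean per-frame log-likelihood under a 1-D GMM of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        total += math.log(sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in gmm
        ))
    return total / len(frames)

def verify(frames, claimed_model, ubm, threshold=0.0):
    """Accept the claim when the log-likelihood ratio exceeds the threshold."""
    llr = avg_log_lik(frames, claimed_model) - avg_log_lik(frames, ubm)
    return llr > threshold, llr

# Hand-set toy models: a broad background model and a narrower target speaker.
ubm = [(0.5, -2.0, 4.0), (0.5, 2.0, 4.0)]
target = [(0.5, 1.5, 0.5), (0.5, 2.5, 0.5)]

accepted, llr_true = verify([1.8, 2.2, 2.0], target, ubm)       # frames near the target
rejected, llr_impostor = verify([-3.0, -2.5, -1.0], target, ubm)  # frames far from it
```

In a cross-language experiment, the mismatch studied in the abstract would show up as lower log-likelihood ratios for genuine speakers, which pushes more true claims below the threshold.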
Score regulation based on GMM token ratio similarity for speaker recognition

- , 2017, DOI: 10.16511/j.cnki.qhdxxb.2017.21.006
Abstract: This paper presents a speaker recognition approach based on GMM token ratio similarity based score regulation (GTRSR), which judges the reliability of a test score from the GMM token ratio similarity. Within the GMM-UBM (universal background model) framework, the GMM token of a frame is the index of the UBM Gaussian component that gives the highest score for that frame. During the adaptive training and testing phases, the proportion of frames assigned to each token is computed and saved as a vector called the GMM token ratio (GTR) of the utterance. In the test phase, the GTR of the test utterance is compared with that of the adaptive training utterance to compute the GMM token ratio similarity (GTRS) for a target speaker. When the GTRS is below a threshold, the original test score is multiplied by a penalty factor to give the final score of the test utterance. Experiments on the MASC database show that the method improves speaker recognition performance to some extent.
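The score regulation steps in this abstract can be sketched in Python as below. Several details are assumptions made for illustration only: the abstract does not name the similarity measure, so cosine similarity is used; features are 1-D; and the threshold (0.5), the penalty factor (0.8) and all function names are hypothetical.

```python
import math

def top_component(x, ubm):
    """Index of the 1-D UBM component (weight, mean, var) scoring highest on x."""
    scores = [
        math.log(w) - 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in ubm
    ]
    return scores.index(max(scores))

def token_ratio(frames, ubm):
    """GTR vector: fraction of frames whose top component is each index."""
    counts = [0] * len(ubm)
    for x in frames:
        counts[top_component(x, ubm)] += 1
    return [c / len(frames) for c in counts]

def cosine(a, b):
    """Cosine similarity (an assumed choice; the abstract names no measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0

def regulate(score, train_gtr, test_gtr, threshold=0.5, penalty=0.8):
    """Multiply the score by the penalty when GTR similarity is below threshold."""
    if cosine(train_gtr, test_gtr) < threshold:
        return score * penalty
    return score

# Toy UBM with three well-separated components (illustrative values only).
ubm = [(1 / 3, -4.0, 1.0), (1 / 3, 0.0, 1.0), (1 / 3, 4.0, 1.0)]
train_gtr = token_ratio([-4.2, -3.9, 0.1, -4.0], ubm)    # mostly component 0
similar_gtr = token_ratio([-3.8, -4.1, -0.2, -4.4], ubm)  # same token pattern
different_gtr = token_ratio([4.1, 3.9, 4.3, 0.2], ubm)    # mostly component 2
```

With these toy values, `regulate(2.0, train_gtr, similar_gtr)` leaves the score unchanged, while `regulate(2.0, train_gtr, different_gtr)` scales it down. Multiplying by a factor below one only lowers a score that is positive, so a real system would need to account for the sign and range of its scores.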
Mandarin-Sichuan dialect bilingual text-independent speaker verification using GMM

ZHAO Jing,GONG Wei-guo,YANG Li-ping

计算机应用 (Journal of Computer Applications) , 2008,
Abstract: Due to the mismatch between Mandarin and Sichuan dialect in the training and test stages, the performance of a speaker verification system degrades dramatically. To solve this problem, a combined Gaussian Mixture Model (GMM), trained by proportionally pooling Mandarin and Sichuan dialect speech, is presented in this paper. Compared with the Gaussian mixture model trained solely on Mandarin or Sichuan dialect, the combined Gaussian mixture model described the characteristics of the speaker from both Mandarin and Sichuan diale...