Search Results: 1 - 10 of 100 matches for " "
All listed articles are free for downloading (OA Articles)
Text Independent Speaker Recognition and Speaker Independent Speech Recognition Using Iterative Clustering Approach  [PDF]
International Journal of Computer Science & Information Technology , 2009,
Abstract: This paper presents the effectiveness of perceptual features and an iterative clustering approach for performing both speech and speaker recognition. The procedure used to form the training speech differs between developing training models for speaker-independent speech recognition and for text-independent speaker recognition. This work therefore emphasizes the use of clustering models developed from the training data to obtain accuracies of 91%, 91% and 99.5% for the mel frequency perceptual linear predictive cepstrum in three categories: speaker identification, isolated digit recognition and continuous speech recognition, respectively. This feature also yields a low equal error rate of 9%, which is used as the performance measure for speaker verification. The work is experimentally evaluated on the set of isolated digits and continuous speech from the TI digits_1 and TI digits_2 databases for speech recognition, and on the speech of 50 speakers randomly chosen from the TIMIT database for speaker recognition. The noteworthy features of the speaker recognition algorithm are the evaluation of the testing procedure on identical messages from all 50 speakers, theoretical validation of the results using the F-ratio, and validation of the results by statistical analysis using the χ² distribution.
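The equal error rate quoted in the abstract above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure can be estimated from trial scores, not the paper's own procedure. It assumes similarity scores where higher means "same speaker"; the function name equal_error_rate and the toy score lists are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Estimate (EER, threshold) from two lists of similarity scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, for illustration only).
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```

With only a handful of trials the two error curves may not cross exactly, so the sketch reports the midpoint of the two rates at the closest threshold.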
Deep Speaker Vectors for Semi Text-independent Speaker Verification  [PDF]
Lantian Li,Dong Wang,Zhiyong Zhang,Thomas Fang Zheng
Computer Science , 2015,
Abstract: Recent research shows that deep neural networks (DNNs) can be used to extract deep speaker vectors (d-vectors) that preserve speaker characteristics and can be used in speaker verification. This new method has been tested on text-dependent speaker verification tasks, and improvement was reported when combined with the conventional i-vector method. This paper extends the d-vector approach to semi text-independent speaker verification tasks, i.e., the text of the speech is in a limited set of short phrases. We explore various settings of the DNN structure used for d-vector extraction, and present a phone-dependent training which employs the posterior features obtained from an ASR system. The experimental results show that it is possible to apply d-vectors on semi text-independent speaker recognition, and the phone-dependent training improves system performance.
A Tutorial on Text-Independent Speaker Verification  [cached]
Douglas A. Reynolds,Dijana Petrovska-Delacrétaz,Javier Ortega-García,Teva Merlin
EURASIP Journal on Advances in Signal Processing , 2004, DOI: 10.1155/s1687617204310024
Abstract: This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly used speech parameterization in speaker verification, namely cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step in dealing with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications related to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. The paper concludes by giving a few research trends in speaker verification for the next couple of years.
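As a rough illustration of the Gaussian mixture modeling mentioned in the abstract above, the sketch below scores a sequence of feature frames against a diagonal-covariance GMM and compares a claimed speaker's model with a background model through a log-likelihood ratio. The ratio against a background model is a common choice assumed here, not a detail taken from the abstract, and all function names and model values are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of the frames under a GMM."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        # Log-sum-exp over mixture components for numerical stability.
        mx = max(comps)
        total += mx + math.log(sum(math.exp(c - mx) for c in comps))
    return total / len(frames)


def llr_score(frames, speaker_gmm, background_gmm):
    """Log-likelihood ratio: positive values favour the claimed speaker."""
    return (gmm_avg_loglik(frames, *speaker_gmm)
            - gmm_avg_loglik(frames, *background_gmm))


# Hypothetical one-dimensional, single-component models for illustration.
speaker = ([1.0], [[1.0]], [[1.0]])      # (weights, means, variances)
background = ([1.0], [[0.0]], [[1.0]])
print(llr_score([[1.0]], speaker, background))
```

In a verification setting the resulting score would be compared with a threshold, and score normalization would typically be applied before that comparison.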
Text Independent Speaker Identification System for Authentication
S. Selva Nidhyananthan,R. Shantha Selva kumari,M. Jahidha Sultana,S. Lavanya
International Journal of Advanced Electrical and Electronics Engineering , 2013,
Abstract: This paper focuses on the development of text-independent speaker identification for authentication. The speaker identification system is developed using several feature extraction techniques: Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), and Revised Perceptual Linear Prediction (RPLP). The system is extended with a modification to RPLP in order to achieve a better result. Speakers are modeled with a Gaussian Mixture Model (GMM), which is used for both training and identification. The system is intended for deployment in real-life applications, for which the Graphical User Interface Development Environment (GUIDE) is tried out. This provides a path for combining the hardware with the software so that processing is improved. The proposed speaker identification system provides 90.5% identification accuracy on a telephone-recorded database of 42 speakers.
International Journal of Innovative Research in Computer and Communication Engineering , 2013,
Abstract: This article presents the implementation of a text-independent speaker identification system. It involves two parts: "Speech Signal Processing" and "Artificial Neural Network". The speech signal processing part uses a Mel Frequency Cepstral Coefficients (MFCC) acquisition algorithm that extracts features from the speech signal in the form of coefficient vectors. The backpropagation algorithm of the artificial neural network stores the extracted features in a database and then identifies the speaker based on that information. Raw speech cannot be used directly to identify the voice or speaker. Since the speech signal is not always periodic and only half of the frames are voiced, it is not good practice to work with a mix of voiced and unvoiced frames. The speech must therefore be preprocessed to identify a voice or speaker successfully. The major goal of this work is to derive a set of features that improves the accuracy of the text-independent speaker identification system.
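MFCC extraction, as named in the abstract above, relies on a filterbank whose bands are spaced evenly on the mel scale. The sketch below shows only that spacing step, using the widely used conversion mel = 2595 log10(1 + f/700); whether the paper uses this exact variant is an assumption, and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Centre frequencies (Hz) of n_filters bands equally spaced in mels."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]


# Example: 10 bands between 0 Hz and 8 kHz (values chosen for illustration).
centers = mel_center_frequencies(10, 0.0, 8000.0)
print([round(c) for c in centers])
```

The centres crowd together at low frequencies and spread out at high frequencies, which is the perceptual warping that MFCCs inherit.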
Text-Independent Speaker Identification Using Hidden Markov Model  [cached]
World of Computer Science and Information Technology Journal , 2012,
Abstract: This paper presents a text-independent speaker identification system based on Mel-Frequency Cepstrum Coefficient (MFCC) feature vectors and a Hidden Markov Model (HMM) classifier. The implementation of the HMM is divided into two steps: feature extraction and recognition. In the feature extraction step, the paper reviews MFCCs, by which the spectral features of the speech signal can be estimated, and shows how these features can be computed. In the recognition step, the theory and implementation of the HMM are reviewed, followed by an explanation of how the HMM can be trained to generate the model parameters using the Forward-Backward algorithm and tested using the forward algorithm. The HMM is evaluated using data from 40 speakers extracted from the Switchboard corpus. Experimental results show an identification rate of about 84%.
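The forward algorithm mentioned in the abstract above computes the likelihood of an observation sequence under an HMM, and identification can then pick the speaker model with the highest likelihood. The sketch below is a simplified illustration that assumes discrete observation symbols rather than MFCC vectors; the model parameters and the names forward_likelihood, identify, spk_a and spk_b are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward recursion.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of emitting symbol o in state i
    """
    n = len(start)
    # Initialisation with the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    # Induction over the remaining observations.
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return sum(alpha)


def identify(obs, models):
    """Return the name of the model that gives obs the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical two-state model over two symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.5, 0.5], [0.1, 0.9]]
print(forward_likelihood([0, 1, 1], START, TRANS, EMIT))
```

For long sequences the recursion is usually run with scaling or in the log domain to avoid underflow; that refinement is omitted here for brevity.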
Text Independent Speaker Identification using Integrating Independent Component Analysis with Generalized Gaussian Mixture Model
N M Ramaligeswararao,Dr.V Sailaja,Dr.K. Srinivasa Rao
International Journal of Advanced Computer Sciences and Applications , 2011,
Abstract: Recently, much work has been reported in the literature on text-independent speaker identification models. Sailaja et al. (2010)[34] developed a text-independent speaker identification model assuming that the speech spectra of each individual speaker can be modeled by Mel frequency cepstral coefficients and a Generalized Gaussian mixture model. The limitation of this model is that the feature vectors (Mel frequency cepstral coefficients) are high in dimension and assumed to be independent. However, the features represented by MFCCs are dependent, and dropping some of the MFCCs distorts the model. Hence, in this paper a new text-independent speaker identification model is developed by integrating MFCCs with independent component analysis (ICA) to obtain independence and to achieve low dimensionality in feature vector extraction. Assuming that the new feature vectors follow a Generalized Gaussian Mixture Model (GGMM), the model parameters are estimated using the EM algorithm. A Bayesian classifier is used to identify each speaker. The experimental results on a database of 50 speakers reveal that the proposed procedure outperforms the existing methods.
Robust Features for Automatic Text-Independent Speaker Recognition Using Gaussian Mixture Model  [PDF]
R. Rajeshwara Rao,A. Prasad,Ch. Kedari Rao
International Journal of Soft Computing & Engineering , 2011,
Abstract: In this paper, robust features for text-independent speaker recognition are explored. Through different experimental studies, it is demonstrated that speaker-related information can be effectively captured using Gaussian mixture models (GMMs). The study of the effect of feature vector size on speaker recognition demonstrates that a feature vector size in the range of 20-24 can capture speaker-discriminating information effectively for a speech signal sampled at 16 kHz. It is established that the proposed speaker recognition system requires significantly less data during both training and testing. Speaker recognition using robust features is studied for different numbers of mixture components and for different training and test durations. We demonstrate the speaker recognition studies on the TIMIT database.
Performance evaluation of Statistical Approaches for Automatic Text-Independent Speaker Recognition using Robust Features
R. Rajeswara Rao,A. Prasad,Ch. Kedari Rao
International Journal of Computer Science Issues , 2012,
Abstract: This paper introduces the performance evaluation of statistical approaches for an automatic text-independent speaker recognition system. The purpose of an automatic text-independent speaker recognition system is to quickly and accurately identify a person from his or her voice. The study of the effect of feature vector size on speaker recognition demonstrates that a feature vector size in the range of 18-22 can capture speaker-related information effectively for a speech signal sampled at 16 kHz. It is demonstrated that time-varying speaker-related information can be captured more effectively using hidden Markov models (HMMs) than GMMs. It is established that the HMM-based speaker recognition system requires significantly less data during both training and testing than the GMM-based speaker recognition system. The performance of speaker recognition using robust features is evaluated for the HMM-based and GMM-based methods with different numbers of mixture components and different training and test durations. We demonstrate the speaker recognition studies on the TIMIT database.
Data-Model Relationship in Text-Independent Speaker Recognition  [cached]
John S. D. Mason,Nicholas W. D. Evans,Robert Stapert,Roland Auckenthaler
EURASIP Journal on Advances in Signal Processing , 2005, DOI: 10.1155/asp.2005.471
Abstract: Text-independent speaker recognition systems such as those based on Gaussian mixture models (GMMs) do not include time sequence information (TSI) within the model itself. The level of importance of TSI in speaker recognition is an interesting question and one addressed in this paper. Recent work has shown that the utilisation of higher-level information such as idiolect, pronunciation, and prosodics can be useful in reducing speaker recognition error rates. In accordance with these developments, the aim of this paper is to show that as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. This paper presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM), embeds dynamic time warping (DTW) into a GMM framework. Results are presented on the 2000-speaker SpeechDat Welsh database which show improved speaker recognition performance with the SMM.
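Dynamic time warping, which the abstract above says is embedded in the segmental mixture model, aligns two sequences that may differ in speaking rate. The sketch below shows plain DTW on scalar sequences only; it is not the SMM itself, and the function name dtw_distance and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Cumulative DTW cost between two numeric sequences.

    Local cost is |a[i] - b[j]|; allowed steps are match, insertion
    and deletion, with no slope constraints.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The second sequence is a time-stretched copy of the first.
print(dtw_distance([1, 2, 3], [1, 1, 2, 3, 3]))
```

For speech, each element would be a feature vector and the local cost a vector distance, but the recursion is unchanged.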

Copyright © 2008-2017 Open Access Library. All rights reserved.