Signal Processing Augmentations to Spectrum-Based Modeling for Speaker Recognition
- Author
- Metzger, Richard Anthony
- Published
- [University Park, Pennsylvania] : Pennsylvania State University, 2018.
- Physical Description
- 1 electronic document
- Additional Creators
- Doherty, John F.
Access Online
- etda.libraries.psu.edu, Connect to this object online.
- Restrictions on Access
- Open Access.
- Summary
- When processing real-world recordings of speech, it is highly probable that noise will be present at some point in the signal. The problem is compounded when the noise occurs in short, impulsive bursts at random intervals. Traditional voice activity detectors (VADs) rely on an energy threshold in the spectrum of the incoming signal to make a decision, and can therefore erroneously flag noise segments as speech. This noise then propagates through the speaker recognition system, increasing the system error rate. An approach is therefore needed that removes the noise before feature modeling takes place while preserving the spectral features left uncorrupted by the noise. Motivated by principles in both information theory and signal processing, a novel processing algorithm that mitigates both high- and low-entropy noise will be explored. In this dissertation, the following topics will be investigated: (1) a speech and noise detection algorithm will be constructed from the approximate entropy (ApEn) statistic, (2) the ApEn algorithm will be tested on various noise cases and its resulting model will be compared to models produced by an energy-based VAD, and (3) improvements will be made to ApEn by adding empirical mode decomposition (EMD) to the processing chain. The results put forth in this dissertation present a promising technique for noise mitigation and represent a novel approach to spectrum-based modeling in noisy environments.
- Dissertation Note
- Ph.D. Pennsylvania State University 2018.
- Reproduction Note
- Microfilm (positive). 1 reel ; 35 mm. (University Microfilms 13871888)
- Technical Details
- The full text of the dissertation is available as an Adobe Acrobat .pdf file ; Adobe Acrobat Reader required to view the file.
catkey: 25793995
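
The summary above names the approximate entropy (ApEn) statistic as the basis of the speech and noise detector. As a rough illustration only, the following Python sketch computes the textbook form of ApEn for a one-dimensional signal. It is not the dissertation's algorithm: the function name `approximate_entropy`, the embedding dimension `m=2`, and the tolerance `r = 0.2 * std(x)` are conventional defaults assumed here, and the record does not state how frames are formed or how ApEn values are thresholded.

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """Textbook approximate entropy of a 1-D signal (illustrative sketch).

    m is the template length and r the match tolerance; both defaults are
    common conventions, assumed here rather than taken from the dissertation.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if r is None:
        r = 0.2 * np.std(x)

    def phi(length):
        # All overlapping templates of the given length.
        templates = np.array([x[i:i + length] for i in range(n - length + 1)])
        # Chebyshev (max-abs) distance between every pair of templates.
        diff = np.abs(templates[:, None, :] - templates[None, :, :])
        dist = diff.max(axis=2)
        # Fraction of templates within tolerance r (self-matches included,
        # so the fraction is never zero and the log is always defined).
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(300)
    regular = np.sin(2 * np.pi * t / 25)      # highly predictable signal
    irregular = rng.standard_normal(300)      # white Gaussian noise
    print("ApEn (sine): ", approximate_entropy(regular))
    print("ApEn (noise):", approximate_entropy(irregular))
```

A regular signal such as the sine wave yields a lower ApEn than white noise, which is the kind of contrast an entropy-based detector can exploit. The pairwise distance matrix makes this version quadratic in memory, so it suits short frames rather than whole recordings.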