Source-specific informative prior for i-vector extraction

An i-vector is a low-dimensional, fixed-length representation of a variable-length speech utterance. It is defined as the posterior mean of a latent variable conditioned on the observed feature sequence of the utterance. The usual assumption is that the prior for the latent variable is non-informative, since an informative prior offers no gain in generality for homogeneous datasets. This work shows that informative priors are applicable when extracting i-vectors for a heterogeneous dataset, one containing speech samples recorded from multiple sources, and that they lead to favorable results.
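To make the role of the prior concrete, the following is a minimal sketch of posterior-mean extraction under a Gaussian prior. It assumes the conventional total-variability model, in which the non-informative prior is a standard normal; the abstract does not spell this out. The function name, argument names, and toy dimensions are hypothetical, and the code is not the authors' implementation. A source-specific informative prior would be supplied here as a non-zero `prior_mean` and a non-identity `prior_cov`.

```python
import numpy as np

def extract_ivector(T, sigma_inv, N, F_centered, prior_mean=None, prior_cov=None):
    """Posterior mean of the latent variable w under a Gaussian prior.

    Assumed model (not stated in the abstract): supervector = m + T w.
    T          : (CD, R) total variability matrix
    sigma_inv  : (CD,) inverse of the diagonal UBM covariances
    N          : (CD,) zeroth-order statistics, repeated per feature dimension
    F_centered : (CD,) first-order statistics centered on the UBM means
    prior_mean, prior_cov : Gaussian prior on w; default is the standard normal
    """
    R = T.shape[1]
    if prior_mean is None:
        prior_mean = np.zeros(R)
    if prior_cov is None:
        prior_cov = np.eye(R)
    prior_prec = np.linalg.inv(prior_cov)

    Tt_sigma_inv = T.T * sigma_inv                 # (R, CD), scales each column
    post_prec = prior_prec + (Tt_sigma_inv * N) @ T
    rhs = Tt_sigma_inv @ F_centered + prior_prec @ prior_mean
    return np.linalg.solve(post_prec, rhs)         # posterior mean, i.e. the i-vector

# Toy example with random statistics (illustrative values only).
rng = np.random.default_rng(0)
CD, R = 12, 3
T = rng.standard_normal((CD, R))
sigma_inv = np.ones(CD)
N = np.full(CD, 5.0)
F_centered = rng.standard_normal(CD)

w_standard = extract_ivector(T, sigma_inv, N, F_centered)
w_informed = extract_ivector(T, sigma_inv, N, F_centered,
                             prior_mean=np.array([1.0, 0.0, -1.0]),
                             prior_cov=0.5 * np.eye(R))
```

With no observed frames (all statistics zero), the posterior mean falls back to the prior mean, which is one way to see why the choice of prior matters most for short or mismatched recordings.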

Experiments on the NIST 2008 and 2010 Speaker Recognition Evaluation (SRE) datasets show that the proposed method outperforms three baselines. On the short2-short3 core task of SRE’08, it beats the baselines in five of the eight common conditions for female speakers and in six of the eight for male speakers. On the core-core task of SRE’10, it beats them in five of the nine common conditions for both genders.
