AES E-Library

Deep Neural Network Based Forensic Automatic Speaker Recognition in VOCALISE using x-Vectors

In this article we present a Deep Neural Network (DNN)-based version of the VOCALISE (Voice Comparison and Analysis of the Likelihood of Speech Evidence) forensic automatic speaker recognition system. DNNs mark a new phase in the evolution of automatic speaker recognition technology, providing a powerful framework for extracting highly discriminative speaker-specific features from a recording of speech. The latest version of VOCALISE aims to preserve the ‘open-box’ philosophy of its predecessors, offering the forensic practitioner flexibility in the configuration and training of all parts of the automatic speaker recognition pipeline. VOCALISE continues to support both legacy and state-of-the-art speaker modelling algorithms, the latest of which is the ‘x-vector’ framework, which uses a DNN to extract compact speaker representations. Here, we introduce the x-vector framework and its implementation in VOCALISE, and demonstrate its performance on forensically relevant data.

Author(s): Kelly, Finnian; Forth, Oscar; Kent, Samuel; Gerlach, Linda; Alexander, Anil
Affiliation: Oxford Wave Research Ltd., Oxford, UK; Oxford Wave Research Ltd., Oxford, UK; Oxford Wave Research Ltd., Oxford, UK; Philipps-Universität Marburg, Germany; Oxford Wave Research Ltd., Oxford, UK (See document for exact affiliation information.)
Publication Date: 2019-06-06
Permalink: https://aes2.org/publications/elibrary-page/?id=20477

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.

Type: Conference Paper
E-Library location: TMP/conf/2019/forensics/
