V. Ramasubramanian and Amitav Das
This tutorial is concerned with speaker-recognition, with a particular focus on “text-dependent” systems, in which a person is recognized (or verified) by his or her natural voice characteristics as typified by a spoken “password”. The text-dependent mode has a higher performance potential than the text-independent mode of operation and is therefore the preferred mode in most successful speaker-recognition products currently in use. In the highly sensitive field of IT security today, whether the concern is access to financial transactions through tele- and net-banking, ATMs, sensitive web-sites, private information such as medical records, or high-security installations such as defense laboratories and nuclear plants, the need for a highly reliable biometric that offers both high security and high convenience is paramount. While the notion of voice as a biometric is already well acknowledged and documented, the use of “passwords” and highly individualized “personal profiles” is becoming increasingly popular and well accepted, as these cater to both the (usually conflicting) requirements of security and convenience. This has led to the emergence of “text-dependent” speaker-recognition techniques and systems as a key player in the biometric arena. While there are several excellent overview and tutorial papers on the general subject of speaker-recognition dating from the 1970s, none of them pays particular attention to “text-dependent” systems. This tutorial attempts to fill this gap and provides an overview of this class of speaker-recognition techniques and systems, which have very high performance potential and viability for practical deployment.