**Staff:**

Lecturer: Dr. Yossi Keshet

Teaching assistant: Shua Dissen

**Info:**

Exam Moed A will be on 25.7.17

Exam Moed B will be on 12.9.17

**Course books:**

Rabiner and Schafer: Theory and Applications of Digital Speech Processing, Prentice Hall, 2010.

Huang, Acero, and Hon: Spoken Language Processing, Prentice Hall, 2001.

Rabiner and Juang: Fundamentals of Speech Recognition, Prentice Hall, 1993.

Deller, Hansen, and Proakis: Discrete-time Processing of Speech Signals, 2000.

Quatieri: Discrete-time Speech Signal Processing, Prentice Hall, 2001.

**Class notes:**

Some of the lecture notes are based on those of the speech recognition course given at Columbia University (e6870).

Lecture 1 - Introduction and signal processing. Accompanying MATLAB code: what_is_fft.m (a bonus will be given to anyone who translates the code into Python).

Lecture 2 - Signal processing and features.

Lecture 3 - Dynamic Time Warping (DTW).

Lecture 4 - Gaussian Mixture Models (GMM). Further reading on the EM algorithm and GMMs can be found in Jeff A. Bilmes's tutorial.

Lecture 5 - Hidden Markov Models (HMM). They are explained beautifully in the seminal tutorial by Lawrence R. Rabiner.

Lecture 6 - Language modelling. See also the book chapter on N-grams by Jurafsky and Martin.

Lecture 7 (based on a presentation by Lim Zhi Hao, 2015) - Weighted Finite State Transducers (WFSTs). See also Mohri, Pereira, and Riley's survey.

**Assignments:**

A discussion group for this course is available on Piazza. Before the first use, you need to register using this link with the code "89608".

Assignment 1 (corrected version) and its WAV and transcription (TextGrid) files. Many additional spoken-digit examples can be found here. -- Due: May 3, 2017. Grades

Assignment 2 and its files; more files can be downloaded from here (new link 22.5.2017). -- Due: ~~May 21, 2017~~ ~~May 28, 2017~~ June 4, 2017

Assignment 3 -- Due: June 30, 2017