Audio and Speech Processing with MATLAB
eBook - ePub

  1. 330 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
About this book

Speech and audio processing has undergone a revolution over recent decades, one that has accelerated in the last few years and generated game-changing technologies such as truly successful speech recognition systems, a goal that had remained out of reach until very recently. This book gives the reader a comprehensive overview of such contemporary speech and audio processing techniques, with an emphasis on practical implementations and illustrations using MATLAB code. Core concepts are covered first, with an introduction to the physics of audio and vibration together with their representations using complex numbers, Z transforms and frequency analysis transforms such as the FFT.

Later chapters describe the human auditory system and the fundamentals of psychoacoustics. Insights, results, and analyses given in these chapters are subsequently used as the basis for understanding the middle section of the book, which covers wideband audio compression (MP3 audio etc.), speech recognition and speech coding.

The final chapter covers musical synthesis and applications, describing methods such as AM, FM and ring modulation techniques (with MATLAB examples). It closes with an example of time-frequency modification used to implement a so-called phase vocoder for time stretching (in MATLAB).

Features



  • A comprehensive overview of contemporary speech and audio processing techniques from perceptual and physical acoustic models to a thorough background in relevant digital signal processing techniques together with an exploration of speech and audio applications.


  • A carefully paced progression in the complexity of the described methods, building, in many cases, from first principles.


  • Speech and wideband audio coding together with a description of associated standardised codecs (e.g. MP3, AAC and GSM).


  • Speech recognition: Feature extraction (e.g. MFCC features), Hidden Markov Models (HMMs) and deep learning techniques such as Long Short-Term Memory (LSTM) methods.


  • Book and computer-based problems at the end of each chapter.


  • Contains numerous real-world examples backed up by many MATLAB functions and code.

Audio and Speech Processing with MATLAB, by Paul Hill, is available in PDF and/or ePUB format and is catalogued under Computer Science & Computer Networking.

Information

1 MATLAB® and Audio
CONTENTS
1.1 Reading Sounds
1.2 Audio Display and Playback
1.3 Audio-Related MATLAB
1.4 Example Audio Manipulation
1.5 Summary
1.6 Exercises
To me, it’s always a joy to create music no matter what it takes to actually get there. The real evils are always whatever stops you from doing that – like if your CPU is spiking and you have to sit there and bounce all your MIDI to audio. Now that’s annoying!
Skrillex
MATLAB (MATrix LABoratory) is used throughout this book for audio processing and manipulation together with associated visualisations. This chapter therefore gives an introduction to the basic capabilities of MATLAB for audio processing. Appendix B also gives a list of core MATLAB functions and commands that are applicable to audio processing for those starting to use the language. This chapter (and Appendix B) can be skipped or skimmed if the reader is familiar with the basic operations of the MATLAB programming language and its visualisation capabilities.
More general information is given on the MathWorks (creators of MATLAB) website, www.mathworks.com, and within the many help files, demos and manuals packaged with MATLAB.
TABLE 1.1: Audio formats available to be read by MATLAB command audioread

Audio File Format   Description                              File extension
WAVE                Raw audio                                .wav
OGG                 OGG Vorbis                               .ogg
FLAC                Lossless audio compression               .flac
AU                  Raw audio                                .au
AIFF                Raw audio                                .aiff, .aif
AIFC                Raw audio                                .aifc
MP3                 MPEG1 Layer 3, lossy compressed audio    .mp3
MPEG4 AAC           MPEG4, lossy compressed audio            .m4a, .mp4
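As a small illustration of how Table 1.1 might be used in practice, the sketch below checks whether a file name carries one of the listed extensions before any attempt is made to read it. The extension list is copied from the table rather than queried from MATLAB, and the file name is hypothetical; actual format support can depend on the platform and MATLAB release, so this is only a convenience check under those assumptions.

```matlab
% Extensions listed in Table 1.1 (copied by hand, an assumption of this sketch).
supportedExt = {'.wav', '.ogg', '.flac', '.au', '.aiff', '.aif', ...
                '.aifc', '.mp3', '.m4a', '.mp4'};

filename = 'exampleAudio.WAV';               % hypothetical file name
[~, ~, ext] = fileparts(filename);           % ext includes the leading dot
isListed = any(strcmpi(ext, supportedExt));  % case-insensitive comparison
```

The comparison is case-insensitive because file extensions are often written in upper case, as in the hypothetical name used here.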
1.1 Reading Sounds
Audio is read into MATLAB using the function audioread, whose basic functionality is as follows.
    [y, Fs] = audioread(filename);
where filename is a MATLAB variable containing a string (array of chars) defining the entire name of the audio file to be read, including any file extension (e.g., mp3, wav, etc.). A typical example of a call to audioread would be
    [y, Fs] = audioread('exampleAudio.wav');
where y is the array or matrix of sampled audio data and Fs is the sampling frequency of the input audio. audioread is able to read the formats shown in Table 1.1. In this example, filename is 'exampleAudio.wav', the file to be read in (it is given as an array of chars and is therefore delimited by single quotes). filename can also include a path (defined in the format of your operating system) to any location on your hard drive. For example, filename could be 'c:\mydirectory\mysubdirectory\exampleAudio.wav' (on Windows) or '~/mydirectory/mysubdirectory/exampleAudio.wav' (on OSX/Unix/Linux).
A statement in MATLAB automatically displays its result. It is therefore common to suppress this output by placing a semicolon at the end of each line where no output is required.
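The call described above can be tried with a minimal sketch such as the following. So that it runs without any external file, it first synthesises a one-second 440 Hz tone and writes it with audiowrite to the temporary directory; the tone, the 8 kHz sampling frequency, the location in tempdir and the variable names FsOut, t and tone are illustrative assumptions, not taken from the book.

```matlab
% Assumption: an illustrative test file is created first so the example is self-contained.
FsOut = 8000;                      % sampling frequency in Hz (arbitrary choice)
t = (0:FsOut-1)' / FsOut;          % one second of sample times, as a column vector
tone = 0.5 * sin(2*pi*440*t);      % 440 Hz sine, amplitude 0.5 to stay within [-1, 1]

filename = fullfile(tempdir, 'exampleAudio.wav');  % path built for the current OS
audiowrite(filename, tone, FsOut); % write a mono WAV file

% Read the file back; the trailing semicolon suppresses the printed output.
[y, Fs] = audioread(filename);

% No semicolon on these lines, so MATLAB displays the results.
Fs
size(y)
```

Using fullfile avoids hard-coding the path separator, so the same lines should work with either the Windows or the Unix style of path mentioned above.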
It is often useful to determine detailed information about an audio file before (or indee...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Preface
  8. List of Acronyms
  9. Introduction
  10. 1 MATLAB® and Audio
  11. 2 Core Concepts
  12. 3 Frequency Analysis for Audio
  13. 4 Acoustics
  14. 5 The Auditory System
  15. 6 Fundamentals of Psychoacoustics
  16. 7 Audio Compression
  17. 8 Automatic Speech Recognition: ASR
  18. 9 Audio Features for Automatic Speech Recognition and Audio Analysis
  19. 10 HMMs, GMMs and Deep Neural Networks for ASR
  20. 11 Speech Coding
  21. 12 Musical Applications
  22. A The Initial History of Complex Numbers
  23. B MATLAB Fundamentals (Applicable to Audio Processing)
  24. Index