Working with the Web Audio API is the definitive and instructive guide to understanding and using the Web Audio API.
The Web Audio API provides a powerful and versatile system for controlling audio on the Web. It allows developers to generate sounds, select sources, add effects, create visualizations and render audio scenes in an immersive environment.
This book covers all essential features, with easy-to-implement code examples for every aspect. The theory behind each feature is explained, so that the reader can understand the design choices as well as the core audio processing concepts. Advanced concepts are also covered, so that the reader will gain the skills to build complex audio applications running in the browser.
Aimed at a wide audience of potential students, researchers and coders, this is a comprehensive guide to the functionality of this industry-standard tool for creating audio applications for the web.
This chapter introduces the Web Audio API. It explains the motivations behind it, and compares it to other APIs, packages and environments for audio programming. It gives an overview of key concepts, such as the audio graph and how connections are made. The AudioContext is introduced, as well as a few essential nodes and methods that are explored in more detail in later chapters. A “hello world” application is presented as a code example, showing perhaps the simplest use of the Web Audio API to produce sound. We then extend this application to show alternative approaches to its implementation, coding practices, and how sound is manipulated in an audio graph.
The Web Audio API
The Web Audio API is a high-level Application Programming Interface for handling audio operations in web applications. It makes audio processing and analysis a fundamental part of the web platform. It has a lot of built-in tools, but also allows one to create their own audio processing routines within the same framework. Essentially, it allows one to use a web browser to perform almost any audio processing that one could create for stand-alone applications. In particular, it includes capabilities found in modern game engines and desktop audio production applications, including mixing, processing, filtering, analysis and synthesis tasks.
The Web Audio API is a signal flow development environment. It has a lot in common with visual data flow programming environments, like LabVIEW, MATLAB’s Simulink, Unreal’s Blueprints, Pure Data, or Max/MSP. Those environments all provide a graphical representation of signal processing. Unlike them, the Web Audio API is programmed in text-based JavaScript, not graphically. There are third-party tools to work with a graphical representation for web audio development, but they are still in early stages.
With the Web Audio API, one can define nodes, which include sound sources, filters, effects and destinations. One can also create one's own nodes. These nodes are connected together, thus defining the routing, processing and rendering of audio.
The audio context
Audio operations are handled within an audio context. The audio operations are performed with audio nodes (consisting of sources, processors and destinations), and the nodes are connected together to form an audio routing graph. The graph defines how an audio stream flows from sources (such as audio files, streaming content or audio signals created within the audio context) to the destination (often the speakers).
The audio context is defined with a constructor, AudioContext(), as we will see in the Hello World example below.
All routing occurs within an AudioContext containing a single AudioDestinationNode. In the simplest case, a single source can be routed directly to the output, as in Figure 1.1. The audio nodes appear as blocks. The arrows represent connections between nodes.
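As a minimal sketch of this simplest case, the code below wires a single oscillator source directly to the context's destination. The helper name connectSourceToDestination is hypothetical, chosen for illustration only, and the click listener is an assumption about how the page starts audio, since browsers generally require a user gesture before sound can play.

```javascript
// Hypothetical helper: route one source node straight to the output.
// `context` is expected to be an AudioContext.
function connectSourceToDestination(context) {
  const source = context.createOscillator(); // a source node: no inputs, one output
  source.connect(context.destination);       // context.destination is the AudioDestinationNode
  return source;
}

// Browser-only usage (assumed setup): start audio on the first click.
if (typeof document !== 'undefined' && typeof AudioContext !== 'undefined') {
  document.addEventListener('click', () => {
    const context = new AudioContext();
    const source = connectSourceToDestination(context);
    source.start();                        // begin producing sound now
    source.stop(context.currentTime + 1);  // stop after one second
  }, { once: true });
}
```

The context is passed in as a parameter so that the routing logic stays separate from how and when the context is created.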
Modular routing allows arbitrary connections between different audio nodes. Each node can have inputs and/or outputs. A source node has no inputs and a single output. Sources are often based on sound files, but the sources can also be real-time input from a live instrument or microphone, redirection of the audio output from an audio element, or entirely synthesized sound.
A destination node has one input and no outputs. Though the final destination node is often the loudspeakers or headphones, you can also process without sound playback (for example, if you want to do pure visualization) or do offline processing, which results in the audio stream being written to a destination buffer for later use.
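Offline processing can be sketched with an OfflineAudioContext, whose destination is a buffer rather than the loudspeakers. The helper name renderToneOffline and its default values (one channel, one second, 44100 Hz) are assumptions made for this illustration.

```javascript
// Hypothetical helper: render a short tone into a buffer instead of playing it.
// OfflineContextClass is passed in; in a browser it would be OfflineAudioContext.
function renderToneOffline(OfflineContextClass, seconds = 1, sampleRate = 44100) {
  const frameCount = Math.ceil(seconds * sampleRate); // buffer length in sample frames
  const context = new OfflineContextClass(1, frameCount, sampleRate); // 1 channel (mono)
  const source = context.createOscillator();
  source.connect(context.destination); // here the destination is a buffer, not the speakers
  source.start();
  return context.startRendering();     // a promise that resolves to the rendered AudioBuffer
}

// Browser-only usage: nothing is heard, the result is kept for later use.
if (typeof OfflineAudioContext !== 'undefined') {
  renderToneOffline(OfflineAudioContext).then((buffer) => {
    console.log(`Rendered ${buffer.length} sample frames`);
  });
}
```

The graph is built in the same way as for live playback. Only the type of context, and therefore the destination, differs.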
Other nodes such as filters can be placed between the source and destination nodes. Such nodes can often have multiple incoming and outgoing connections. By default, if there are multiple incoming connections into a node, the Web Audio API simply sums all the incoming audio signals. The developer also doesn’t usually have to worry about low-level stream format details when two objects are connected together. For example, if a mono audio stream is connected to a stereo input it should just mix to left and right channels appropriately.
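The summing behavior can be illustrated with a rough sketch in which two oscillators feed the same filter, which in turn feeds the destination. The helper name buildSummedGraph and the particular frequencies and filter settings are assumptions chosen only for the example.

```javascript
// Hypothetical helper: two sources summed at the input of one filter node.
function buildSummedGraph(context) {
  const oscA = context.createOscillator();
  const oscB = context.createOscillator();
  oscA.frequency.value = 220; // Hz, arbitrary choice
  oscB.frequency.value = 330; // Hz, arbitrary choice

  const filter = context.createBiquadFilter();
  filter.type = 'lowpass';
  filter.frequency.value = 1000; // cutoff in Hz, arbitrary choice

  oscA.connect(filter); // first incoming connection
  oscB.connect(filter); // second incoming connection, summed with the first
  filter.connect(context.destination);
  return { oscA, oscB, filter };
}

// Browser-only usage (assumed setup): build and start the graph on the first click.
if (typeof document !== 'undefined' && typeof AudioContext !== 'undefined') {
  document.addEventListener('click', () => {
    const { oscA, oscB } = buildSummedGraph(new AudioContext());
    oscA.start();
    oscB.start();
  }, { once: true });
}
```

No explicit mixer node is needed here, since the two connections into the filter are added together at its input.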
Modular routing also permits the output of AudioNodes to be routed to an audio parameter that controls the behavior of a different AudioNode. In this scenario, the output of a node can act as a modulation signal rather than an input signal.
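A sketch of this kind of modulation is a tremolo, where a low-frequency oscillator drives the gain parameter of a GainNode instead of being heard directly. The helper name buildTremolo, the default rate of 4 Hz and the depth values are assumptions made for illustration.

```javascript
// Hypothetical helper: a low-frequency oscillator modulates a gain parameter.
function buildTremolo(context, rateHz = 4) {
  const carrier = context.createOscillator(); // the audible signal
  const amp = context.createGain();
  amp.gain.value = 0.5;                       // base gain, before modulation

  const lfo = context.createOscillator();     // the modulation signal, swinging between -1 and 1
  lfo.frequency.value = rateHz;
  const depth = context.createGain();
  depth.gain.value = 0.5;                     // scales the modulation to the range -0.5 to 0.5

  carrier.connect(amp);                       // node-to-node: an input signal
  amp.connect(context.destination);
  lfo.connect(depth);
  depth.connect(amp.gain);                    // node-to-parameter: added to the base gain
  return { carrier, amp, lfo, depth };
}

// Browser-only usage (assumed setup): build and start the graph on the first click.
if (typeof document !== 'undefined' && typeof AudioContext !== 'undefined') {
  document.addEventListener('click', () => {
    const { carrier, lfo } = buildTremolo(new AudioContext());
    carrier.start();
    lfo.start();
  }, { once: true });
}
```

With these values the effective gain moves between 0 and 1, since the scaled modulation signal is added to the parameter's base value of 0.5.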
A single audio context can support multiple sound inputs and complex audio graphs, so, generally speaking, we will only need one for each audio application we create.
The default nodes of the Web Audio API are fairly minimal, only 19 in all.