
The best 40 Open-Source Audio Datasets for ML


Audio Datasets

October is over, and so is DagsHub’s Hacktoberfest challenge. When we announced the challenge, we didn’t think we’d reach the finish line with almost forty new audio datasets, publicly available and parseable on DagsHub! Huge kudos to our community for doing wonders and pulling off such a fantastic effort in so little time. Thanks also to DigitalOcean, GitHub, and GitLab for organizing the event.

This year we focused our contribution on the audio domain. For that, we improved DagsHub’s audio catalog capabilities. Now you can listen to samples hosted on DagsHub without having to download anything locally. For each sample, you get additional information such as waveforms, spectrograms, and file metadata. Last but not least, the datasets are versioned with DVC, making them easy to improve and ready to use.

To make it easier for audio practitioners to find the dataset they’re looking for, we collected all of the Hacktoberfest contributions in this post. We have datasets in seven(!) different languages, from various domains and sources. If you’re interested in a dataset that is missing here, please let us know, and we’ll make sure to add it.

Even though the month-long virtual festival is over, we’re still welcoming contributions to open source data science. If you’d like to enrich the audio datasets hosted on DagsHub, we’d be happy to help you in the process! Please reach out on our Discord channel for more information.

The Acted Emotional Speech Dynamic Database (AESDD) is a publicly available speech emotion recognition dataset. It consists of utterances of acted emotional speech in the Greek language. It is divided into two main categories, one containing utterances of acted emotional speech and the other containing spontaneous emotional speech. You can contribute to this dataset by submitting recordings of emotional speech to the website. They will be validated and made publicly available for non-commercial research purposes.

Contributed by: Abid Ali Awan
Original dataset
Arabic Speech Corpus

The Arabic Speech Corpus was developed as part of Ph.D. work by Nawar Halabi at the University of Southampton. The corpus was recorded in south Levantine Arabic (Damascian accent) using a professional studio. Speech synthesized using this corpus has a high-quality, natural voice.

Contributed by: Mert Bozkır
Original dataset

Att-hack: French Expressive Speech

This dataset contains acted expressive speech in French: 100 phrases with multiple versions/repetitions (3 to 5) in four social attitudes: friendly, distant, dominant, and seductive. This research has been supported by the French Ph2D/IDF MoVE project on modeling of speech attitudes and application to an expressive conversational agent, and funded by the Ile-de-France region. This database led to a publication at the 2020 Speech Prosody conference in Tokyo. For a more detailed account, see the research article.

Contributed by: Filipp Levikov
Original dataset

This repository contains the code and data used in “Interpreting and Explaining Deep Neural Networks for Classifying Audio Signals”. The dataset consists of 30,000 audio samples of spoken digits (0–9) from 60 different speakers. Additionally, it holds the audioMNIST_meta.txt file, which provides meta information such as the gender or age of each speaker.

Contributed by: Mert Bozkır
Original dataset
BAVED: Basic Arabic Vocal Emotions

The Basic Arabic Vocal Emotions Dataset (BAVED) consists of 7 Arabic words spoken at different levels of emotion and recorded in audio/wav format. Each word is recorded at three levels of emotion, as follows:

Level 0: The speaker is expressing a low level of emotion. This is similar to feeling tired or down.
Level 1: The “standard” level, where the speaker expresses neutral emotions.
Level 2: The speaker is expressing a high level of positive or negative emotions.

Contributed by: Kinkusuma
Original dataset

Bird Audio Detection

This dataset is part of a challenge hosted by the Machine Listening Lab at Queen Mary University of London in collaboration with the IEEE Signal Processing Society. It contains datasets collected in real live bio-acoustics monitoring projects and an objective, standardized evaluation framework. The freefield1010 dataset hosted on DagsHub is a collection of over 7,000 excerpts from field recordings around the world, gathered by the FreeSound project and then standardized for research. This collection is very diverse in location and environment.



Contributed by: Abid Ali Awan
Original dataset

The CHiME-Home dataset is a collection of annotated domestic environment audio recordings. The audio recordings were originally made for the CHiME project. In the CHiME-Home dataset, 4-second audio chunks are each associated with multiple labels, based on a set of 7 labels associated with sound sources in the acoustic environment.

Contributed by: Abid Ali Awan
Original dataset
CMU-Multimodal SDK

CMU-MOSI is a standard benchmark for multimodal sentiment analysis. It is especially suited to training and testing multimodal models, since most of the newest works on multimodal temporal data use this dataset in their papers. It holds 65 hours of annotated video from more than 1,000 speakers, 250 topics, and 6 emotions (happiness, sadness, anger, fear, disgust, surprise).

Contributed by: Michael Zhou
Original dataset
CREMA-D: Crowd-sourced Emotional Multimodal Actors

CREMA-D is a dataset of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74, coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified). Actors spoke from a selection of 12 sentences. The sentences were presented using six different emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and four different emotion levels (Low, Medium, High, and Unspecified).

Participants rated the emotion and emotion levels based on the combined audiovisual presentation, the video alone, and the audio alone. Due to the large number of ratings needed, this effort was crowd-sourced, and a total of 2,443 participants each rated 90 unique clips: 30 audio, 30 visual, and 30 audio-visual.

Contributed by: Mert Bozkır
Original dataset
Children’s Song

The Children’s Song Dataset is an open-source dataset for singing voice research. This dataset contains 50 Korean and 50 English songs sung by one Korean female professional pop singer. Each song is recorded in two separate keys, resulting in a total of 200 audio recordings. Each audio recording is paired with a MIDI transcription and lyrics annotations at both the grapheme level and the phoneme level.

Contributed by: Kinkusuma
Original dataset
Device and Produced Speech

The DAPS (Device and Produced Speech) dataset is a collection of aligned versions of professionally produced studio speech recordings and recordings of the same speech on common consumer devices (tablet and smartphone) in real-world environments. It has 15 versions of audio (3 professional versions and 12 consumer device/real-world environment combinations). Each version consists of about 4 1/2 hours of data (about 14 minutes from each of 20 speakers).

Contributed by: Kinkusuma
Original dataset
Deeply Vocal Characterizer

The Deeply Vocal Characterizer is a human nonverbal vocal sound dataset consisting of 56.7 hours of short clips from 1,419 speakers, crowdsourced from the general public in South Korea. The dataset also includes metadata such as age, sex, noise level, and quality of utterance. This repo holds only 723 utterances (ca. 1% of the whole corpus) and is free to use under CC BY-NC-ND. To access the complete dataset under a more restrictive license, please contact deeplyinc.

Contributed by: Filipp Levikov
Original dataset

The EMODB database is a freely available German emotional database. The database was created by the Institute of Communication Science, Technical University, Berlin. Ten professional speakers (five males and five females) participated in the data recording. The database contains a total of 535 utterances. The EMODB database comprises seven emotions: anger, boredom, anxiety, happiness, sadness, disgust, and neutral. The data was recorded at a 48 kHz sampling rate and then down-sampled to 16 kHz.

Contributed by: Kinkusuma
Original dataset
EMOVO Corpus

The EMOVO Corpus is a database built from the voices of 6 actors who performed 14 sentences simulating six emotional states (disgust, fear, anger, joy, surprise, sadness) plus the neutral state. These emotions are the ones found in most of the literature on emotional speech. The recordings were made with professional equipment in the Fondazione Ugo Bordoni laboratories.

Contributed by: Abid Ali Awan
Original dataset
ESC-50: Environmental Sound Classification

The ESC-50 dataset is a labeled collection of 2,000 environmental audio recordings suitable for benchmarking methods of environmental sound classification. The dataset consists of 5-second-long recordings organized into 50 semantic classes (with 40 examples per class) loosely arranged into five major categories:


Animals.

Natural soundscapes & water sounds.

Human, non-speech sounds.

Interior/domestic sounds.

Exterior/urban noises.

Clips in this dataset were manually extracted from public field recordings gathered by the Freesound project. The dataset has been prearranged into five folds for comparable cross-validation, making sure that fragments from the same original source file are contained in a single fold.

Contributed by: Kinkusuma
Original dataset
EmoSynth: Emotional Synthetic Audio

EmoSynth is a dataset of 144 audio files, each about 5 seconds long and 430 KB in size, which 40 listeners have labeled for their perceived emotion along the dimensions of Valence and Arousal. It includes metadata about the classification of the audio based on the dimensions of Valence and Arousal.

Contributed by: Abid Ali Awan
Original dataset
Estonian Emotional Speech Corpus

The Estonian Emotional Speech Corpus (EEKK) is a corpus created at the Estonian Language Institute within the framework of the state program “Estonian Language Technological Support 2006–2010”. The corpus contains 1,234 Estonian sentences that express anger, joy, or sadness, or are neutral.

Contributed by: Abid Ali Awan
Original dataset
Flickr 8k Audio Caption Corpus

The Flickr 8k Audio Caption Corpus contains 40,000 spoken audio captions in .wav audio format, one for each caption included in the train, dev, and test splits of the original corpus. The audio is sampled at 16,000 Hz with 16-bit depth and stored in Microsoft WAVE audio format.

Contributed by: Michael Zhou
Original dataset
Golos: Russian ASR

Golos is a Russian corpus suitable for speech research. The dataset mainly consists of recorded audio files manually annotated on a crowd-sourcing platform. The total duration of the audio is about 1,240 hours.

Contributed by: Filipp Levikov
Original dataset
JL Corpus

Emotional speech in New Zealand English. This corpus was constructed by maintaining an equal distribution of four long vowels. The corpus has five secondary emotions along with five primary emotions. Secondary emotions are important in Human-Robot Interaction (HRI), where the aim is to model natural conversations between humans and robots.

Contributed by: Hazalkl
Original dataset

LJ Speech
A public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours. The texts were published between 1884 and 1964 and are in the public domain. The audio was recorded in 2016–17 by the LibriVox project and is also in the public domain.

Contributed by: Kinkusuma
Original dataset

This dataset contains a large collection of clean speech files and a variety of environmental noise files in .wav format sampled at 16 kHz. It provides a recipe to mix clean speech and noise at various signal-to-noise ratio (SNR) conditions to generate a large, noisy speech dataset. The SNR conditions and the number of hours of data required can be configured depending on the application requirements.

Contributed by: Hazalkl
Original dataset
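
The idea of mixing clean speech and noise at a chosen SNR, mentioned for the noisy speech dataset above, can be sketched in a few lines. This is only an illustration and not the dataset’s own recipe: the helper names `meanPower` and `mixAtSnr` are made up for this example, and the signals are synthetic.

```javascript
// Average power (mean of the squared samples) of a signal.
function meanPower(samples) {
  let sum = 0;
  for (const s of samples) sum += s * s;
  return sum / samples.length;
}

// Scale `noise` so that the clean-to-noise power ratio equals `snrDb`
// (in decibels), then add it to `clean`. Hypothetical helper.
function mixAtSnr(clean, noise, snrDb) {
  const targetNoisePower = meanPower(clean) / Math.pow(10, snrDb / 10);
  const scale = Math.sqrt(targetNoisePower / meanPower(noise));
  return clean.map((s, i) => s + scale * noise[i % noise.length]);
}

// Synthetic stand-ins for real recordings: a 440 Hz tone and white noise at 16 kHz.
const clean = Float32Array.from({ length: 1600 }, (_, n) =>
  Math.sin((2 * Math.PI * 440 * n) / 16000));
const noise = Float32Array.from({ length: 1600 }, () => Math.random() * 2 - 1);
const noisy = mixAtSnr(clean, noise, 10); // noisy speech at roughly 10 dB SNR
```

A lower `snrDb` gives a noisier mix; 0 dB means speech and noise carry the same power.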
Public Domain Sounds

A wide variety of sounds that can be used for object detection research. The dataset is small (543 MB) and divided into subdirectories by format. The audio files vary from 5 seconds to 5 minutes.

Contributed by: Abid Ali Awan
Original dataset
RSC: Sounds from RuneScape Classic

Extract RuneScape Classic sounds from cache to wav (and vice versa). Jagex used Sun’s original .au sound format, which is headerless, 8-bit, u-law encoded, 8000 Hz PCM samples. This module can decompress original sounds from the sound data as headered WAVs, and recompress (+ resample) new WAVs into files.

Contributed by: Hazalkl
Original dataset
Speech Accent Archive

This dataset contains 2,140 speech samples, each from a different talker reading the same passage. Talkers come from 177 countries and have 214 different native languages. Each talker is speaking in English.

Contributed by: Kinkusuma
Original dataset

Speech Commands Dataset
The dataset (1.4 GB) has 65,000 one-second-long utterances of 30 short words by thousands of different people, contributed by members of the public through the AIY website. It is a set of one-second .wav audio files, each containing a single spoken English word.

Contributed by: Abid Ali Awan
Original dataset
TESS: Toronto Emotional Speech Set

The Northwestern University Auditory Test No. 6 was used to create these stimuli. Two actresses (aged 26 and 64 years) recited a set of 200 target words in the carrier phrase “Say the word _____,” and recordings were made of the set portraying each of seven emotions (anger, disgust, fear, happiness, pleasant surprise, sadness, and neutral). There are a total of 2,800 stimuli.

Contributed by: Hazalkl
Original dataset

The URDU dataset contains emotional utterances of Urdu speech gathered from Urdu talk shows. There are 400 utterances of four basic emotions in the dataset: angry, happy, neutral, and sad. There are 38 speakers (27 male and 11 female). The data was collected from YouTube.

Contributed by: Abid Ali Awan
Original dataset
VIVAE: Variably Intense Vocalizations of Affect and Emotion

The Variably Intense Vocalizations of Affect and Emotion Corpus (VIVAE) consists of a set of human non-speech emotion vocalizations. The full set, comprising 1,085 audio files, features eleven speakers expressing three positive (achievement/triumph, sexual pleasure, and surprise) and three negative (anger, fear, physical pain) affective states, each parametrically varied from low to peak emotion intensity.

Contributed by: Mert Bozkır
Original dataset
FSDD: Free Spoken Digit Dataset

A simple audio/speech dataset consisting of recordings of spoken digits in wav files at 8 kHz. The recordings are trimmed so that they have near-minimal silence at the beginnings and ends.

Contributed by: Kinkusuma
Original dataset
LEGOv2 Corpus

This spoken dialogue corpus contains interactions captured from the CMU Let’s Go (LG) system by Carnegie Mellon University in 2006 and 2007. It is based on raw log files from the LG system. It has 347 dialogs with 9,083 system-user exchanges; emotions are classified as garbage, non-angry, slightly angry, and very angry.

Contributed by: Kinkusuma
Original dataset

Multi-track music dataset for music source separation. There are two versions of MUSDB18, the compressed and the uncompressed (HQ).

MUSDB18: consists of a total of 150 full-track songs of different styles and includes both the stereo mixtures and the original sources, divided between a training subset and a test subset.

MUSDB18-HQ: the uncompressed version of the MUSDB18 dataset. It consists of a total of 150 full-track songs of different styles and includes both the stereo mixtures and the original sources, divided between a training subset and a test subset.

Contributed by: Kinkusuma
Original dataset

Voice Gender
The VoxCeleb dataset (7,000+ unique speakers and utterances, 3,683 males / 2,312 females). VoxCeleb is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube. VoxCeleb contains speech from speakers spanning a wide range of different ethnicities, accents, professions, and ages.

Beginner’s Guide to Audio Data

Audio Data

Audio data processing refers to the manipulation and modification of audio signals using various techniques and algorithms. It involves applying digital signal processing (DSP) techniques to audio data in order to enhance, modify, or analyze the sound. Audio processing is used in a wide range of applications, including music production, telecommunications, speech recognition, audio compression, and more.

This article goes over basic terminology and signal processing implementation.

Let’s talk about signal processing.


Signal processing is a fundamental area of study in engineering, computer science, and related fields. It plays a crucial role in advancing technology and has a wide range of applications that touch many aspects of our daily lives.

Signal processing is an electrical engineering subfield that focuses on analyzing, modifying, and synthesizing signals.


Signal processing allows us to better understand and make use of the vast amounts of data generated in our increasingly connected and digitized world. Its applications span numerous industries, from healthcare to communication, from entertainment to scientific research, making it a crucial field for modern technology and progress.

Where is it used?

  • Audio and music
  • Image and video processing
  • Communications
  • Speech and voice recognition
  • Medical imaging and diagnostics

These are only a few examples of the wide range of applications for signal processing. Its importance extends to many other fields where the processing and analysis of signals are crucial for understanding, communication, and decision-making.



Types of signal:

Continuous signal: A continuous signal, or continuous-time signal, is a varying quantity (a signal) whose domain, which is often time, is a continuum (e.g., a connected interval of the reals).

Discrete signal: A discrete-time signal is a sequence of values that correspond to particular instants in time. The time instants at which the signal is defined are the signal’s sample times, and the associated signal values are the signal’s samples.

Digital signal: A digital signal is a signal that represents data as a sequence of discrete values; at any given time it can only take on, at most, one of a finite number of values.
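
To make the three signal types concrete, here is a minimal sketch that starts from a function of continuous time, samples it into a discrete-time signal, and then quantizes the samples into a digital signal. The function names (`sample`, `quantize`) and the 5 Hz test tone are assumptions chosen for illustration.

```javascript
// "Continuous" signal: a function of real-valued time t (in seconds).
const continuous = (t) => Math.sin(2 * Math.PI * 5 * t);

// Discrete-time signal: evaluate the function only at the instants n / sampleRate.
function sample(signal, sampleRate, durationSeconds) {
  const count = Math.round(sampleRate * durationSeconds);
  return Array.from({ length: count }, (_, n) => signal(n / sampleRate));
}

// Digital signal: map each sample in [-1, 1] to an integer level
// in [-maxLevel, maxLevel], a finite set of values.
function quantize(samples, bits) {
  const maxLevel = 2 ** (bits - 1) - 1; // 127 for 8 bits
  return samples.map((x) => Math.round(Math.max(-1, Math.min(1, x)) * maxLevel));
}

const discrete = sample(continuous, 100, 1); // 100 samples covering one second
const digital = quantize(discrete, 8);       // integers between -127 and 127
```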

We now turn to the fundamentals of signal processing.

1. Amplitude Envelope:
The amplitude envelope refers to the changes in the amplitude of a sound over time, and is an influential property because it affects our perception of timbre. It is an important property of sound, because it is what allows us to effortlessly identify sounds and uniquely distinguish them from other sounds.
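
One common way to approximate the amplitude envelope is to take the largest absolute sample value in each frame. The sketch below assumes that definition; the function name and the frame and hop sizes are illustrative choices, not part of the original article.

```javascript
// Peak absolute amplitude of each frame; frames start every `hopSize` samples.
function amplitudeEnvelope(samples, frameSize, hopSize) {
  const envelope = [];
  for (let start = 0; start < samples.length; start += hopSize) {
    const end = Math.min(start + frameSize, samples.length);
    let peak = 0;
    for (let i = start; i < end; i++) {
      peak = Math.max(peak, Math.abs(samples[i]));
    }
    envelope.push(peak);
  }
  return envelope;
}

// Three non-overlapping frames of two samples each.
const envelope = amplitudeEnvelope([0.1, -0.5, 0.2, 0.9, -0.3, 0.0], 2, 2);
// envelope is [0.5, 0.9, 0.3]
```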

2. RMS:

Root mean square: the square root of the mean of the squares. RMS is (to engineers anyway) a meaningful way of calculating the average of values over a period of time. With audio, the signal value (amplitude) is squared, averaged over a period of time, and then the square root of the result is calculated. The result is a value that, when squared, is related (proportional) to the effective power of the signal.
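
The square, average, square-root recipe translates directly into code. A minimal sketch (the function name `rms` is just a label for this example):

```javascript
// Root mean square: square every sample, average, then take the square root.
function rms(samples) {
  let sumOfSquares = 0;
  for (const s of samples) sumOfSquares += s * s;
  return Math.sqrt(sumOfSquares / samples.length);
}

// A full-scale sine over whole periods has an RMS close to 1 / sqrt(2).
const sine = Array.from({ length: 1000 }, (_, n) =>
  Math.sin((2 * Math.PI * 10 * n) / 1000));
const level = rms(sine); // approximately 0.7071
```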

3. Zero-Crossing Rate:

The zero-crossing rate (ZCR) of an audio frame is the rate of sign changes of the signal during the frame. In other words, it is the number of times the signal changes sign, from positive to negative and vice versa, divided by the length of the frame.
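
A small sketch of that definition follows. It makes one assumption the text leaves open: samples that are exactly zero are treated as non-negative.

```javascript
// Number of sign changes between consecutive samples, divided by the frame length.
function zeroCrossingRate(frame) {
  let crossings = 0;
  for (let i = 1; i < frame.length; i++) {
    const previousNegative = frame[i - 1] < 0;
    const currentNegative = frame[i] < 0;
    if (previousNegative !== currentNegative) crossings++;
  }
  return crossings / frame.length;
}

const rate = zeroCrossingRate([1, -1, 1, -1]); // 3 sign changes / 4 samples = 0.75
```

Noisy or high-pitched frames tend to have a high ZCR, while low-pitched or silent frames have a low one.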

4. Fourier Transform: The Fourier transform decomposes a signal into its sine and cosine components. The output of the transformation represents the signal in the Fourier or frequency domain, while the input is the time-domain equivalent (for an image, the spatial-domain equivalent).

5. Magnitude Spectrum:

The Fourier transform can be used to construct the power spectrum of a signal by taking the square of the magnitude spectrum. The power spectral curve shows signal power as a function of frequency. The power spectrum is particularly useful for noisy or random data where phase characteristics have little meaning.
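
As an illustration, the magnitude and power spectra can be computed with a direct (and deliberately slow) discrete Fourier transform. Real code would use an FFT library; the function names here are assumptions for this sketch.

```javascript
// Magnitude of each non-negative frequency bin, using a naive O(n^2) DFT.
function magnitudeSpectrum(samples) {
  const n = samples.length;
  const bins = Math.floor(n / 2) + 1;
  const magnitudes = new Array(bins);
  for (let k = 0; k < bins; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    magnitudes[k] = Math.hypot(re, im);
  }
  return magnitudes;
}

// Power spectrum: the square of the magnitude spectrum.
function powerSpectrum(samples) {
  return magnitudeSpectrum(samples).map((m) => m * m);
}

// A sine completing 4 cycles in 64 samples puts its power in bin 4.
const tone = Array.from({ length: 64 }, (_, t) => Math.sin((2 * Math.PI * 4 * t) / 64));
const power = powerSpectrum(tone);
const strongestBin = power.indexOf(Math.max(...power)); // 4
```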

6. Spectrogram: A spectrogram is a visual way of representing the signal strength, or “loudness”, of a signal over time at the various frequencies present in a particular waveform. Not only can one see whether there is more or less energy at, for example, 2 Hz vs 10 Hz, but one can also see how energy levels vary over time.

STFT: The short-time Fourier transform (STFT) is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time.

STFT vs. spectrogram: The STFT focuses on the Fourier transform of windowed and segmented (overlapped) data, and its output can be used to reconstruct the original signal (under certain conditions). The spectrogram focuses on spectral estimation based on the STFT. It has options for the power spectrum or the power spectral density.
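
The windowing and segmenting can be sketched as follows: cut the signal into overlapping frames, multiply each by a Hann window, and keep the squared DFT magnitudes of every frame. The Hann window, the naive DFT, and all function names are assumptions made for this example.

```javascript
// Symmetric Hann window of the given length.
function hannWindow(length) {
  return Array.from({ length }, (_, n) =>
    0.5 - 0.5 * Math.cos((2 * Math.PI * n) / (length - 1)));
}

// Squared DFT magnitudes of one frame (non-negative frequency bins only).
function framePower(frame) {
  const n = frame.length;
  const power = [];
  for (let k = 0; k <= n / 2; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += frame[t] * Math.cos(angle);
      im -= frame[t] * Math.sin(angle);
    }
    power.push(re * re + im * im);
  }
  return power;
}

// Power spectrogram: one power spectrum per overlapping, windowed frame.
function spectrogram(samples, frameSize, hopSize) {
  const win = hannWindow(frameSize);
  const frames = [];
  for (let start = 0; start + frameSize <= samples.length; start += hopSize) {
    const windowed = win.map((w, i) => w * samples[start + i]);
    frames.push(framePower(windowed));
  }
  return frames; // frames[timeIndex][frequencyBin]
}

const signal = Array.from({ length: 256 }, (_, n) => Math.sin((2 * Math.PI * n) / 8));
const spec = spectrogram(signal, 64, 32); // 7 frames with 33 bins each
```

The complex values inside `framePower` are the STFT proper; discarding the phase and squaring the magnitude is what turns it into a spectrogram.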

7. Mel Spectrogram

The Mel spectrogram is used to provide our models with sound information similar to what a human would perceive. The raw audio waveforms are passed through filter banks to obtain the Mel spectrogram.


  • Choose the number of mel bands
  • Construct the mel filter banks:
      • Convert the lowest/highest frequency to mel
      • Create equally spaced points between them, one set per band
      • Convert the points back to Hertz
      • Round to the nearest frequency bin
      • Create triangular filters
  • Apply the mel filter banks to the spectrogram
  • Obtain the mel-filtered spectrogram and its log-scaled version
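
The steps above can be sketched as follows. The Hz-to-mel formula used here (2595 · log10(1 + f / 700)) is one common convention and is an assumption, since the article does not say which one it uses; the function names and parameters are likewise illustrative.

```javascript
// One common Hz <-> mel convention (assumed here).
const hzToMel = (hz) => 2595 * Math.log10(1 + hz / 700);
const melToHz = (mel) => 700 * (10 ** (mel / 2595) - 1);

// Build `numBands` triangular filters over the bins of an FFT of size `fftSize`.
function melFilterBank(numBands, fftSize, sampleRate, minHz = 0, maxHz = sampleRate / 2) {
  const numBins = Math.floor(fftSize / 2) + 1;
  const minMel = hzToMel(minHz);
  const maxMel = hzToMel(maxHz);
  // numBands + 2 equally spaced mel points, converted to Hz, rounded to FFT bins.
  const binPoints = Array.from({ length: numBands + 2 }, (_, i) => {
    const hz = melToHz(minMel + (i * (maxMel - minMel)) / (numBands + 1));
    return Math.round((hz * fftSize) / sampleRate);
  });
  const filters = [];
  for (let b = 1; b <= numBands; b++) {
    const left = binPoints[b - 1];
    const center = binPoints[b];
    const right = binPoints[b + 1];
    const filter = new Array(numBins).fill(0);
    // Rising slope from left to center, falling slope from center to right.
    for (let k = left; k < center; k++) filter[k] = (k - left) / (center - left);
    for (let k = center; k <= right; k++) {
      filter[k] = right === center ? 1 : (right - k) / (right - center);
    }
    filters.push(filter);
  }
  return filters;
}

// Apply the filter bank to one power spectrum and take the log of each band energy.
function logMelEnergies(filters, powerSpectrum) {
  return filters.map((filter) => {
    let energy = 0;
    for (let k = 0; k < filter.length; k++) energy += filter[k] * powerSpectrum[k];
    return Math.log(energy + 1e-10); // small offset avoids log(0)
  });
}

const filters = melFilterBank(10, 512, 16000); // 10 filters over 257 bins
```

Applying `logMelEnergies` to every frame of a power spectrogram gives a log-mel spectrogram.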


Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC. They are derived from a type of cepstral representation of the audio clip (a nonlinear “spectrum-of-a-spectrum”). The difference between the cepstrum and the mel-frequency cepstrum is that in the MFC, the frequency bands are equally spaced on the mel scale, which approximates the human auditory system’s response more closely than the linearly spaced frequency bands used in the normal spectrum.

This frequency warping can allow for better representation of sound, for instance, in audio compression.

MFCCs are generally derived as follows:

Take the Fourier transform of (a windowed excerpt of) a signal.

Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows or, alternatively, cosine overlapping windows.

Take the logs of the powers at each of the mel frequencies.

Take the discrete cosine transform of the list of mel log powers, as if it were a signal.

The MFCCs are the amplitudes of the resulting spectrum.
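
The last two steps can be illustrated with an unnormalized DCT-II applied to a list of log mel powers. This is a sketch under assumptions: the input values are made up, the names `dct2` and `mfccFromLogMel` are hypothetical, and libraries often apply an extra scaling factor that is omitted here.

```javascript
// Unnormalized DCT-II: X[k] = sum over i of x[i] * cos(pi / N * (i + 0.5) * k).
function dct2(values) {
  const n = values.length;
  return values.map((_, k) => {
    let sum = 0;
    for (let i = 0; i < n; i++) {
      sum += values[i] * Math.cos((Math.PI / n) * (i + 0.5) * k);
    }
    return sum;
  });
}

// Keep the first `numCoefficients` DCT amplitudes as the MFCCs of one frame.
function mfccFromLogMel(logMelPowers, numCoefficients) {
  return dct2(logMelPowers).slice(0, numCoefficients);
}

// Made-up log mel powers for a single frame with 8 mel bands.
const logMel = [2.1, 2.4, 1.9, 1.2, 0.8, 0.5, 0.3, 0.1];
const mfcc = mfccFromLogMel(logMel, 4); // 4 coefficients for this frame
```

Keeping only the first few coefficients retains the overall spectral shape and discards fine detail, which is why MFCC vectors are usually short.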

Block diagram and output
These are the fundamental concepts of signal processing, and in the next article we will talk about sound prediction (sound recognition using deep learning).

AudioData. Description

AudioData. An audio track consists of a stream of audio samples, each sample representing a captured moment of sound. An AudioData object is a representation of such a sample. Working alongside the interfaces of the Insertable Streams API, you can break a stream into individual AudioData objects with MediaStreamTrackProcessor, or construct an audio track from a stream of frames with MediaStreamTrackGenerator.



  • public class AudioData
  • Defines a ring buffer and some utility functions to prepare the input audio samples.

Maintains a ring buffer to hold input audio data. Clients should feed input audio data through the “load” methods and access the aggregated audio samples through the “getTensorBuffer” method.
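
To show what a ring buffer does, here is a minimal JavaScript sketch. It is not the class described above: `FloatRingBuffer`, `load`, and `toArray` are hypothetical names, and the real implementation may differ in how it handles overflow.

```javascript
// Fixed-capacity buffer that overwrites its oldest samples when it is full.
class FloatRingBuffer {
  constructor(capacity) {
    this.data = new Float32Array(capacity);
    this.writeIndex = 0; // next slot to write (and, once full, to overwrite)
    this.size = 0;       // number of valid samples currently stored
  }

  // Append samples, dropping the oldest ones once capacity is reached.
  load(samples) {
    for (const s of samples) {
      this.data[this.writeIndex] = s;
      this.writeIndex = (this.writeIndex + 1) % this.data.length;
      this.size = Math.min(this.size + 1, this.data.length);
    }
  }

  // Copy out the stored samples, ordered from oldest to newest.
  toArray() {
    const out = new Float32Array(this.size);
    const start = (this.writeIndex - this.size + this.data.length) % this.data.length;
    for (let i = 0; i < this.size; i++) {
      out[i] = this.data[(start + i) % this.data.length];
    }
    return out;
  }
}

const buffer = new FloatRingBuffer(4);
buffer.load([1, 2, 3]);
buffer.load([4, 5, 6]);
const latest = buffer.toArray(); // holds 3, 4, 5, 6: the four newest samples
```

This keeps a sliding window of the most recent audio, which suits models that expect a fixed-length input.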

Note that this class can only handle audio input in short (AudioFormat.ENCODING_PCM_16BIT) or float (AudioFormat.ENCODING_PCM_FLOAT) formats. Internally, it converts and stores all audio samples in PCM float encoding.
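
Converting 16-bit PCM to float is usually a simple rescaling. The sketch below divides by 32768, a common convention that maps the 16-bit range onto roughly [-1, 1); whether the class above uses exactly this scale is an assumption, and `int16ToFloat` is a hypothetical name.

```javascript
// Rescale signed 16-bit PCM samples to floats in [-1, 1).
function int16ToFloat(pcm16) {
  return Float32Array.from(pcm16, (s) => s / 32768);
}

const floats = int16ToFloat(Int16Array.of(-32768, 0, 16384, 32767));
// floats[0] is -1, floats[1] is 0, floats[2] is 0.5, floats[3] is just below 1
```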

Nested classes

class AudioData.AudioDataFormat: Wraps a few constants describing the format of the incoming audio samples, namely the number of channels and the sample rate.


This specification describes a high-level web API for processing and synthesizing audio in web applications. The primary paradigm is that of an audio routing graph, in which a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized C/C++/assembly code), but direct script processing and synthesis is also supported.

The introductory section covers the motivation behind this specification.

This API is designed to be used in conjunction with other APIs and elements on the web platform, notably XMLHttpRequest [XHR] (using the responseType and response attributes). For games and interactive applications, it is anticipated to be used with the canvas 2D [2dcontext] and WebGL [WEBGL] 3D graphics APIs.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

Future updates to this Recommendation may incorporate new features.

Audio on the web has been fairly primitive up to this point, and until very recently has had to be delivered through plugins such as Flash and QuickTime. The introduction of the audio element in HTML5 is very important, as it allows basic streaming audio playback. However, it is not powerful enough to handle more complex audio applications. For sophisticated web-based games or interactive applications, another solution is needed. The goal of this specification is to include the capabilities found in modern game audio engines, as well as some of the mixing, processing, and filtering tasks found in modern desktop audio production applications.

The APIs were designed with a wide variety of use cases in mind [webaudio-usecases]. Ideally, it should be able to support any use case that could reasonably be implemented with an optimized C++ engine controlled via script and run in a browser. That said, modern desktop audio software can have very advanced capabilities, some of which would be difficult or impossible to build with this system.

Apple’s Logic Audio is one such application; it supports external MIDI controllers, arbitrary plug-in synthesizers and audio effects, highly optimized direct-to-disk audio file reading/writing, tightly integrated time stretching, and so on. Nevertheless, the proposed system will be quite capable of supporting a large range of reasonably complex games and interactive applications, including musical ones. It can also be a very good complement to the more advanced graphics features offered by WebGL. The API has been designed so that more advanced capabilities can be added at a later time.

The API supports these primary features:

  • Modular routing for simple or complex mixing/effect architectures.
  • High dynamic range, using 32-bit floats for internal processing.
  • Sample-accurate scheduled sound playback with low latency for musical applications requiring a very high degree of rhythmic precision, such as drum machines and sequencers. This also includes the possibility of dynamic creation of effects.
  • Automation of audio parameters for envelopes, fade-ins/fade-outs, granular effects, filter sweeps, LFOs, etc.
  • Flexible handling of channels in an audio stream, allowing them to be split and merged.
  • Processing of audio sources from an audio or video media element.
  • Processing live audio input using a MediaStream from getUserMedia().
  • Integration with WebRTC:
      • Processing audio received from a remote peer using a MediaStreamTrackAudioSourceNode and [webrtc].
      • Sending a generated or processed audio stream to a remote peer using a MediaStreamAudioDestinationNode and [webrtc].
  • Audio stream synthesis and processing directly using scripts.
  • Spatialized audio supporting a wide range of 3D games and immersive environments:
      • Panning models: equal power, HRTF, pass-through
      • Distance attenuation
      • Sound cones
      • Obstruction/occlusion
      • Source/listener based
  • A convolution engine for a wide range of linear effects, especially very high-quality room effects. Here are some examples of possible effects:
      • Small/large room
      • Cathedral
      • Concert hall
      • Cave
      • Tunnel
      • Hallway
      • Forest
      • Amphitheater
      • Sound of a distant room through a doorway
      • Extreme filters
      • Strange backwards effects
      • Extreme comb filter effects
  • Dynamics compression for overall control and sweetening of the mix.
  • Efficient real-time time-domain and frequency-domain analysis / music visualizer support.
  • Efficient biquad filters for lowpass, highpass, and other common filters.
  • A waveshaping effect for distortion and other non-linear effects.
  • Oscillators.

Modular routing

Modular routing allows arbitrary connections between different AudioNode objects. Each node can have inputs and/or outputs. A source node has no inputs and a single output. A destination node has one input and no outputs. Other nodes, such as filters, can be placed between the source and destination nodes. The developer does not need to worry about low-level stream format details when two objects are connected together; the right thing just happens. For example, if a mono audio stream is connected to a stereo input, it should just mix to the left and right channels appropriately.

In the simplest case, a single source can be routed directly to the output. All routing occurs within an AudioContext containing a single AudioDestinationNode:

Figure: A simple example of modular routing.
To illustrate this simple routing, here is a simple example playing a single sound:

const context = new AudioContext();

function playSound() {
  // dogBarkingBuffer is assumed to be an already decoded AudioBuffer.
  const source = context.createBufferSource();
  source.buffer = dogBarkingBuffer;
  source.connect(context.destination);
  source.start(0);
}

Here’s a more complex example with three sources and a convolution reverb send, with a dynamics compressor at the final output stage:

Figure: A more complex example of modular routing.

let context;
let compressor;
let reverb;
let source1, source2, source3;
let lowpassFilter;
let waveShaper;
let panner;
let dry1, dry2, dry3;
let wet1, wet2, wet3;
let mainDry;
let mainWet;

// manTalkingBuffer and footstepsBuffer are assumed to be decoded AudioBuffers.
function setupRoutingGraph() {
  context = new AudioContext();

  // Create the effects nodes.
  lowpassFilter = context.createBiquadFilter();
  waveShaper = context.createWaveShaper();
  panner = context.createPanner();
  compressor = context.createDynamicsCompressor();
  reverb = context.createConvolver();

  // Create main wet and dry.
  mainDry = context.createGain();
  mainWet = context.createGain();

  // Connect the final compressor to the final destination.
  compressor.connect(context.destination);

  // Connect main dry and wet to the compressor.
  mainDry.connect(compressor);
  mainWet.connect(compressor);

  // Connect the reverb to main wet.
  reverb.connect(mainWet);

  // Create a few sources.
  source1 = context.createBufferSource();
  source2 = context.createBufferSource();
  source3 = context.createOscillator();
  source1.buffer = manTalkingBuffer;
  source2.buffer = footstepsBuffer;
  source3.frequency.value = 440;

  // Connect source1.
  dry1 = context.createGain();
  wet1 = context.createGain();
  source1.connect(lowpassFilter);
  lowpassFilter.connect(dry1);
  lowpassFilter.connect(wet1);
  dry1.connect(mainDry);
  wet1.connect(reverb);

  // Connect source2.
  dry2 = context.createGain();
  wet2 = context.createGain();
  source2.connect(waveShaper);
  waveShaper.connect(dry2);
  waveShaper.connect(wet2);
  dry2.connect(mainDry);
  wet2.connect(reverb);

  // Connect source3.
  dry3 = context.createGain();
  wet3 = context.createGain();
  source3.connect(panner);
  panner.connect(dry3);
  panner.connect(wet3);
  dry3.connect(mainDry);
  wet3.connect(reverb);

  // Start the sources now.
  source1.start(0);
  source2.start(0);
  source3.start(0);
}

Modular routing also allows the output of AudioNodes to be routed to an AudioParam parameter that controls the behavior of a different AudioNode. In this scenario, the output of a node can act as a modulation signal rather than an input signal.
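
As a rough illustration of routing a node's output into an AudioParam, the sketch below uses a low-frequency oscillator to modulate the gain of another node (a tremolo). The function name connectTremolo and its rateHz and depth parameters are hypothetical; the function only assumes that `context` exposes createOscillator(), createGain(), and destination, as an AudioContext does, and that `source` is an AudioNode.

```javascript
// Minimal sketch, not a definitive implementation: an oscillator's output is
// connected to the `gain` AudioParam of a GainNode, so it acts as a
// modulation signal instead of an audio input.
function connectTremolo(context, source, rateHz = 2, depth = 0.5) {
  // Low-frequency oscillator that supplies the modulation signal.
  const lfo = context.createOscillator();
  lfo.frequency.value = rateHz;

  // Scales the oscillator output to set the modulation depth.
  const lfoDepth = context.createGain();
  lfoDepth.gain.value = depth;

  // The node whose gain parameter is being modulated.
  const amplifier = context.createGain();

  lfo.connect(lfoDepth);
  lfoDepth.connect(amplifier.gain); // node output -> AudioParam
  source.connect(amplifier);
  amplifier.connect(context.destination);

  lfo.start(0);
  return { lfo, lfoDepth, amplifier };
}
```

In a browser this would be called with a real AudioContext and any source node, for example connectTremolo(context, source1).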

currentTime, of type double, read-only

While the BaseAudioContext is in the “running” state, the value of this attribute increases monotonically and is updated by the rendering thread in uniform increments, each corresponding to one render quantum. Thus, for a running context, currentTime increases steadily as the system processes audio blocks, and always represents the start time of the next audio block to be processed. It is also the earliest possible time at which any change scheduled in the current state might take effect.

currentTime must be read atomically on the control thread before being returned.

destination, of type AudioDestinationNode, read-only

An AudioDestinationNode with a single input representing the final destination for all audio. Usually this will represent the actual audio hardware. All AudioNodes actively rendering audio will directly or indirectly connect to destination.

listener, of type AudioListener, read-only

An AudioListener used for three-dimensional spatialization.

onstatechange, of type EventHandler

A property used to set the EventHandler for an event that is dispatched to BaseAudioContext when the state of the AudioContext has changed (that is, when the corresponding promise would have resolved). An event of type Event will be dispatched to the event handler, which can query the AudioContext’s state directly. A newly created AudioContext will always begin in the suspended state, and a statechange event will be fired whenever the state changes to a different state. This event is fired before the oncomplete event is fired.

sampleRate, of type float, read-only

The sample rate (in sample-frames per second) at which the BaseAudioContext handles audio. All AudioNodes in the context are assumed to run at this rate. Because of this assumption, sample-rate converters or “varispeed” processors are not supported in real-time processing. The Nyquist frequency is half this sample-rate value.

state, of type AudioContextState, read-only

Describes the current state of the BaseAudioContext. Getting this attribute returns the contents of the [[control thread state]] slot.

Starting an AudioContext is said to be allowed if the user agent allows the context’s state to transition from “suspended” to “running”. A user agent may disallow this initial transition, and allow it only when the AudioContext’s relevant global object has sticky activation.

AudioContext has an internal slot:

[[suspended by user]]
A boolean flag representing whether the context is suspended by user code. The initial value is false.

Constructors

AudioContext(contextOptions)

If the current settings object’s responsible document is not fully active, throw an InvalidStateError and abort these steps.

When creating an AudioContext, execute these steps:

  • Set a [[control thread state]] to suspended on the AudioContext.
  • Set a [[rendering thread state]] to suspended on the AudioContext.
  • Let [[pending resume promises]] be a slot on this AudioContext, which is an initially empty ordered list of promises.
  • If contextOptions is given, apply the options:
    • Set the internal latency of this AudioContext according to contextOptions.latencyHint, as described in latencyHint.
    • If contextOptions.sampleRate is specified, set the sampleRate of this AudioContext to this value. Otherwise, use the sample rate of the default output device. If the selected sample rate differs from the sample rate of the output device, this AudioContext must resample the audio output to match the sample rate of the output device.
    Note: If resampling is required, the latency of the AudioContext may be affected, possibly by a large amount.
  • If the context is allowed to start, send a control message to start processing.
  • Return this AudioContext object.

Sending a control message to start processing means executing the following steps:

  • Attempt to acquire system resources. In case of failure, abort the following steps.
  • Set the [[rendering thread state]] to running on the AudioContext.
  • Queue a media element task to execute the following steps:
    • Set the state attribute of the AudioContext to “running”.
    • Queue a media element task to fire an event named statechange at the AudioContext.

Note: It is unfortunately not possible to programmatically notify authors that the creation of the AudioContext failed. User agents are encouraged to log an informative message if they have access to a logging mechanism, such as a developer tools console.

Arguments for the AudioContext.constructor(contextOptions) method.

Parameter Type Nullable Optional Description
contextOptions AudioContextOptions User-specified options controlling how the AudioContext should be constructed.

baseLatency, of type double, read-only

This represents the number of seconds of processing latency incurred by the AudioContext passing the audio from the AudioDestinationNode to the audio subsystem. It does not include any additional latency that might be caused by any other processing between the output of the AudioDestinationNode and the audio hardware, and specifically does not include any latency incurred by the audio graph itself.

For example, if the audio context is running at 44.1 kHz and the AudioDestinationNode implements double buffering internally and can process and output audio each render quantum, then the processing latency is (2⋅128)/44100 ≈ 5.805 ms.
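
The arithmetic in this example can be checked with a short sketch. The helper name estimateProcessingLatency and its bufferCount parameter are hypothetical; the 128-frame render quantum and the double-buffering assumption come from the example above. A real context reports its actual value through the baseLatency attribute.

```javascript
// Render quantum size used in the example above (128 sample-frames).
const RENDER_QUANTUM_FRAMES = 128;

// Hypothetical helper: latency in seconds when `bufferCount` render quanta
// are buffered at `sampleRate` sample-frames per second.
function estimateProcessingLatency(sampleRate, bufferCount = 2) {
  return (bufferCount * RENDER_QUANTUM_FRAMES) / sampleRate;
}

// Double buffering at 44.1 kHz, as in the example.
const latencyMs = estimateProcessingLatency(44100) * 1000;
console.log(latencyMs.toFixed(3) + " ms"); // prints "5.805 ms"
```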

outputLatency, of type double, read-only

The estimate, in seconds, of the audio output latency, that is, the interval between the time the UA requests the host system to play a buffer and the time at which the first sample in the buffer is actually processed by the audio output device. For devices such as speakers or headphones that produce an acoustic signal, this latter time refers to the time when a sample’s sound is produced.

The outputLatency attribute value depends on the platform and the connected hardware audio output device. The outputLatency attribute value does not change over the lifetime of the context as long as the connected audio output device remains the same. If the audio output device is changed, the outputLatency attribute value will be updated accordingly.

Methods

close()

Closes the AudioContext, releasing the system resources being used. This will not automatically release all objects created by the AudioContext, but it will suspend the progression of the AudioContext’s currentTime and stop processing audio data.

When close is called, execute these steps:

  • If this’s relevant global object’s associated document is not fully active, return a promise rejected with an “InvalidStateError” DOMException.
  • Let promise be a new Promise.
  • If the [[control thread state]] flag on the AudioContext is closed, reject the promise with InvalidStateError, abort these steps, and return promise.
  • Set the [[control thread state]] flag on the AudioContext to closed.
  • Queue a control message to close the AudioContext.
  • Return promise.

Running a control message to close an AudioContext means running these steps on the rendering thread:

  • Attempt to release system resources.
  • Set the [[rendering thread state]] to suspended. This will stop rendering.
  • If this control message is being run in reaction to the document being unloaded, abort this algorithm. In this case, there is no need to notify the control thread.
  • Queue a media element task to execute the following steps:
    • Resolve promise.
    • If the state attribute of the AudioContext is not already “closed”:
      • Set the state attribute of the AudioContext to “closed”.
      • Queue a media element task to fire an event named statechange at the AudioContext.

When an AudioContext is closed, any MediaStreams and HTMLMediaElements that were connected to the AudioContext will have their output ignored. That is, they will no longer produce any output to speakers or other output devices. For more flexibility in behavior, consider using HTMLMediaElement.captureStream().

Note: When an AudioContext has been closed, the implementation can choose to release resources more aggressively than when suspending.

No parameters.
Return type: Promise

createMediaElementSource(mediaElement)

Creates a MediaElementAudioSourceNode given an HTMLMediaElement. As a consequence of calling this method, audio playback from the HTMLMediaElement will be re-routed into the processing graph of the AudioContext.

Arguments for the AudioContext.createMediaElementSource() method.
Parameter Type Nullable Optional Description
mediaElement HTMLMediaElement ✘ ✘ The media element that will be re-routed.
Return type: MediaElementAudioSourceNode

Creates a MediaStreamAudioDestinationNode.

No parameters.
Return type: MediaStreamAudioDestinationNode

Creates a MediaStreamAudioSourceNode.

Arguments for the AudioContext.createMediaStreamSource() method.
Parameter Type Nullable Optional Description
mediaStream MediaStream ✘ ✘ The media stream that will act as the source.
Return type: MediaStreamAudioSourceNode

Creates a MediaStreamTrackAudioSourceNode.

Arguments for the AudioContext.createMediaStreamTrackSource() method.
Parameter Type Nullable Optional Description
mediaStreamTrack MediaStreamTrack ✘ ✘ The MediaStreamTrack that will act as the source. The value of its kind attribute must be equal to “audio”, or an InvalidStateError exception must be thrown.

Return type: MediaStreamTrackAudioSourceNode

Returns a new AudioTimestamp instance containing two related audio stream position values for the context: the contextTime member contains the time of the sample frame that is currently being rendered by the audio output device (that is, the output audio stream position), in the same units and origin as the context’s currentTime; the performanceTime member contains the time estimating the moment when the sample frame corresponding to the stored contextTime value was rendered by the audio output device, in the same units and origin as performance.now() (described in [hr-time-3]).

If the context’s rendering graph has not yet processed a block of audio, then the getOutputTimestamp call returns an AudioTimestamp instance with both members containing zero.

Once the context’s rendering graph has started processing blocks of audio, its currentTime attribute value always exceeds the contextTime value obtained from the getOutputTimestamp method call.

The value returned from the getOutputTimestamp method can be used to estimate the performance time for a slightly later context time value:

function outputPerformanceTime(contextTime) {
    const timestamp = context.getOutputTimestamp();
    const elapsedTime = contextTime - timestamp.contextTime;
    // contextTime is in seconds; performanceTime is in milliseconds.
    return timestamp.performanceTime + elapsedTime * 1000;
}

In the example above, the accuracy of the estimate depends on how close the argument value is to the current output audio stream position: the closer the given contextTime is to timestamp.contextTime, the better the accuracy of the estimate.

Note: The difference between the context’s currentTime and the contextTime value obtained from the getOutputTimestamp method call cannot be considered a reliable estimate of output latency, because currentTime may be incremented at non-uniform time intervals. The outputLatency attribute should be used instead.

No parameters.
return type: AudioTimestamp

Resumes the progression of the AudioContext’s currentTime when it has been suspended.

When resume is called, execute these steps:

  • If this’s relevant global object’s associated document is not fully active, return a promise rejected with an “InvalidStateError” DOMException.
  • Let promise be a new Promise.
  • If the [[control thread state]] on the AudioContext is closed, reject the promise with InvalidStateError, abort these steps, and return promise.
  • Set [[suspended by user]] to false.
  • If the context is not allowed to start, append promise to [[pending promises]] and [[pending resume promises]] and abort these steps, returning promise.
  • Set the [[control thread state]] on the AudioContext to running.
  • Queue a control message to resume the AudioContext.
  • Return promise.

Running a control message to resume an AudioContext means running these steps on the rendering thread:

  • Attempt to acquire system resources.
  • Set the [[rendering thread state]] on the AudioContext to running.
  • Start rendering the audio graph.
  • In case of failure, queue a media element task to execute the following steps:
    • Reject all promises from [[pending resume promises]] in order, then clear [[pending resume promises]].
    • Additionally, remove those promises from [[pending promises]].
  • Queue a media element task to execute the following steps:
    • Resolve all promises from [[pending resume promises]] in order.
    • Clear [[pending resume promises]]. Additionally, remove those promises from [[pending promises]].
    • Resolve promise.
    • If the state attribute of the AudioContext is not already “running”:
      • Set the state attribute of the AudioContext to “running”.
      • Queue a media element task to fire an event named statechange at the AudioContext.

No parameters.
Return type: Promise

Suspends the progression of the AudioContext’s currentTime, allows any current context processing blocks that are already processed to be played to the destination, and then allows the system to release its claim on audio hardware. This is generally useful when the application knows it will not need the AudioContext for some time and wishes to temporarily release system resources associated with the AudioContext. The promise resolves when the frame buffer is empty (has been handed off to the hardware), or immediately (with no other effect) if the context is already suspended. The promise is rejected if the context has been closed.

When suspend is called, execute these steps:

  • If this’s relevant global object’s associated document is not fully active, return a promise rejected with an “InvalidStateError” DOMException.
  • Let promise be a new Promise.
  • If the [[control thread state]] on the AudioContext is closed, reject the promise with InvalidStateError, abort these steps, and return promise.
  • Append promise to [[pending promises]].
  • Set [[suspended by user]] to true.
  • Set the [[control thread state]] on the AudioContext to suspended.
  • Queue a control message to suspend the AudioContext.
  • Return promise.

Running a control message to suspend an AudioContext means running these steps on the rendering thread:

  • Attempt to release system resources.
  • Set the [[rendering thread state]] on the AudioContext to suspended.
  • Queue a media element task to execute the following steps:
    • Resolve promise.
    • If the state attribute of the AudioContext is not already “suspended”:
      • Set the state attribute of the AudioContext to “suspended”.
      • Queue a media element task to fire an event named statechange at the AudioContext.

While an AudioContext is suspended, MediaStreams will have their output ignored; that is, data will be lost because of the real-time nature of media streams. HTMLMediaElements will similarly have their output ignored until the system is resumed. AudioWorkletNodes and ScriptProcessorNodes will cease to have their processing handlers invoked while suspended, but will resume when the context is resumed. For the purpose of AnalyserNode window functions, the data is considered a continuous stream; that is, resume()/suspend() does not cause silence to appear in the AnalyserNode’s stream of data. In particular, calling AnalyserNode functions repeatedly while an AudioContext is suspended must return the same data.

No parameters.
return type: Promise
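
The control-thread side of close, resume, and suspend can be pictured as a small state machine. The sketch below is a simplified model, not the browser's implementation: the class name ControlThreadStateModel is hypothetical, it tracks only the [[control thread state]] and [[suspended by user]] slots, and it leaves out promises, the rendering thread, the "allowed to start" check, and statechange events. Each method returns a string instead of a promise.

```javascript
// Simplified model of the control-thread steps described above.
// Returns "ok" where the real method would return a pending promise, and
// "InvalidStateError" where it would reject because the context is closed.
class ControlThreadStateModel {
  constructor() {
    this.state = "suspended"; // a newly created context starts suspended
    this.suspendedByUser = false;
  }

  resume() {
    if (this.state === "closed") return "InvalidStateError";
    this.suspendedByUser = false;
    this.state = "running";
    return "ok";
  }

  suspend() {
    if (this.state === "closed") return "InvalidStateError";
    this.suspendedByUser = true;
    this.state = "suspended";
    return "ok";
  }

  close() {
    if (this.state === "closed") return "InvalidStateError";
    this.state = "closed";
    return "ok";
  }
}

const model = new ControlThreadStateModel();
model.resume();  // state becomes "running"
model.suspend(); // state becomes "suspended", suspendedByUser is true
model.close();   // state becomes "closed"; later calls report InvalidStateError
```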
1.2.4. AudioContextOptions
The AudioContextOptions dictionary is used to specify user-specified options for an AudioContext.

dictionary AudioContextOptions {
    (AudioContextLatencyCategory or double) latencyHint = "interactive";
    float sampleRate;
};

1.2.4.1. Dictionary AudioContextOptions Members
latencyHint, of type (AudioContextLatencyCategory or double), defaulting to “interactive”

Identifies the type of playback, which affects tradeoffs between audio output latency and power consumption.

The preferred value of latencyHint is a value from AudioContextLatencyCategory. However, a double can also be specified for the number of seconds of latency, for finer control over the balance between latency and power consumption. It is at the browser’s discretion to interpret the number appropriately. The actual latency used is given by the AudioContext’s baseLatency attribute.

sampleRate, of type float

Sets the sampleRate to this value for the AudioContext that will be created. The supported values are the same as the sample rates for an AudioBuffer. A NotSupportedError exception must be thrown if the specified sample rate is not supported.

If sampleRate is not specified, the preferred sample rate of the output device for this AudioContext is used.

1.2.5. AudioTimestamp
dictionary AudioTimestamp {
    double contextTime;
    DOMHighResTimeStamp performanceTime;
};

1.2.5.1. Dictionary AudioTimestamp Members
contextTime, of type double
Represents a point in the time coordinate system of BaseAudioContext’s currentTime.

performanceTime, of type DOMHighResTimeStamp
Represents a point in the time coordinate system of a Performance interface implementation (described in [hr-time-3]).

1.3. The OfflineAudioContext Interface
OfflineAudioContext is a particular type of BaseAudioContext for rendering or mixing down (potentially) faster than real time. It does not render to the audio hardware, but instead renders as quickly as possible, fulfilling the returned promise with the rendered result as an AudioBuffer.

interface OfflineAudioContext : BaseAudioContext {
    constructor(OfflineAudioContextOptions contextOptions);
    constructor(unsigned long numberOfChannels, unsigned long length, float sampleRate);
    Promise<AudioBuffer> startRendering();
    Promise<undefined> resume();
    Promise<undefined> suspend(double suspendTime);
    readonly attribute unsigned long length;
    attribute EventHandler oncomplete;
};

1.3.1. Constructors

OfflineAudioContext(contextOptions)

If the current settings object’s responsible document is not fully active, throw an InvalidStateError and abort these steps.

Let c be a new OfflineAudioContext object. Initialize c as follows:

  • Set the [[control thread state]] for c to “suspended”.
  • Set the [[rendering thread state]] for c to “suspended”.
  • Construct an AudioDestinationNode with its channelCount set to contextOptions.numberOfChannels.

Arguments for the OfflineAudioContext.constructor(contextOptions) method.
Parameter Type Nullable Optional Description
contextOptions OfflineAudioContextOptions The initial parameters needed to construct this context.

OfflineAudioContext(numberOfChannels, length, sampleRate)
The OfflineAudioContext can be constructed with the same arguments as AudioContext.createBuffer. A NotSupportedError exception must be thrown if any of the arguments is negative, zero, or outside its nominal range.

The OfflineAudioContext is constructed as if

new OfflineAudioContext({
    numberOfChannels: numberOfChannels,
    length: length,
    sampleRate: sampleRate
})

were called instead.

Arguments for the OfflineAudioContext.constructor(numberOfChannels, length, sampleRate) method.
Parameter Type Nullable Optional Description
numberOfChannels unsigned long Determines how many channels the buffer will have. See createBuffer() for the supported number of channels.
length unsigned long Determines the size of the buffer in sample-frames.
sampleRate float Describes the sample rate of the linear PCM audio data in the buffer, in sample-frames per second. See createBuffer() for valid sample rates.

1.3.2. Attributes
length, of type unsigned long, readonly

The size of the buffer in sample-frames. This is the same as the value of the length parameter for the constructor.

oncomplete, of type EventHandler

An EventHandler of type OfflineAudioCompletionEvent. It is the last event fired on an OfflineAudioContext.

1.3.3. Methods

startRendering()

Given the current connections and scheduled changes, starts rendering audio.

Although the primary method of getting the rendered audio data is through its promise return value, the instance will also fire an event named complete for legacy reasons.

Let [[rendering started]] be an internal slot of this OfflineAudioContext. Initialize this slot to false.

When startRendering is called, the following steps must be performed on the control thread:

  • If this’s relevant global object’s associated document is not fully active, return a promise rejected with an “InvalidStateError” DOMException.
  • If the [[rendering started]] slot on the OfflineAudioContext is true, return a rejected promise with InvalidStateError, and abort these steps.
  • Set the [[rendering started]] slot of the OfflineAudioContext to true.
  • Let promise be a new promise.
  • Create a new AudioBuffer, with a number of channels, length, and sample rate equal respectively to the numberOfChannels, length, and sampleRate values passed to this instance’s constructor in the contextOptions parameter. Assign this buffer to an internal slot [[rendered buffer]] in the OfflineAudioContext.
  • If an exception was thrown during the preceding AudioBuffer constructor call, reject promise with this exception.
  • Otherwise, in the case that the buffer was successfully constructed, begin offline rendering.
  • Append promise to [[pending promises]].
  • Return promise.

To begin offline rendering, the following steps must happen on a rendering thread that is created for the occasion:

  • Given the current connections and scheduled changes, start rendering length sample-frames of audio into [[rendered buffer]].
  • For every render quantum, check and suspend rendering if necessary.
  • If a suspended context is resumed, continue to render the buffer.
  • Once the rendering is complete, queue a media element task to execute the following steps:
    • Resolve the promise created by startRendering() with [[rendered buffer]].
    • Queue a media element task to fire an event named complete using an instance of OfflineAudioCompletionEvent whose renderedBuffer property is set to [[rendered buffer]].

No parameters.
Return type: Promise

Resumes the progression of the OfflineAudioContext’s currentTime when it has been suspended.

When resume is called, execute these steps:

  • If this’s relevant global object’s associated document is not fully active, return a promise rejected with an “InvalidStateError” DOMException.
  • Let promise be a new Promise.
  • Abort these steps and reject promise with InvalidStateError when any of the following conditions is true:
    • The [[control thread state]] on the OfflineAudioContext is closed.
    • The [[rendering started]] slot on the OfflineAudioContext is false.
  • Set the [[control thread state]] flag on the OfflineAudioContext to running.
  • Queue a control message to resume the OfflineAudioContext.
  • Return promise.

Running a control message to resume an OfflineAudioContext means running these steps on the rendering thread:

  • Set the [[rendering thread state]] on the OfflineAudioContext to running.
  • Start rendering the audio graph.
  • In case of failure, queue a media element task to reject promise and abort the remaining steps.
  • Queue a media element task to execute the following steps:
    • Resolve promise.
    • If the state attribute of the OfflineAudioContext is not already “running”:
      • Set the state attribute of the OfflineAudioContext to “running”.
      • Queue a media element task to fire an event named statechange at the OfflineAudioContext.

No parameters.
Return type: Promise

Schedules a suspension of the time progression in the audio context at the specified time and returns a promise. This is generally useful when manipulating the audio graph synchronously on an OfflineAudioContext.

Note that the maximum precision of suspension is the size of the render quantum, and the specified suspension time will be rounded up to the nearest render quantum boundary. For this reason, it is not allowed to schedule multiple suspends at the same quantized frame. Also, scheduling should be done while the context is not running to ensure precise suspension.
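
To make the rounding concrete, here is a minimal sketch of the quantization described above. It assumes a render quantum of 128 sample-frames and follows the "rounded up" wording of this text. The helper names quantizeSuspendTime and hasDuplicateSuspend are hypothetical, and a real implementation may treat edge cases differently.

```javascript
// Assumed render quantum size, in sample-frames.
const QUANTUM_FRAMES = 128;

// Rounds a suspend time (seconds) up to the next render quantum boundary.
// Returns both the quantized frame index and the corresponding time.
function quantizeSuspendTime(suspendTime, sampleRate) {
  const frame = suspendTime * sampleRate;
  const quantizedFrame = Math.ceil(frame / QUANTUM_FRAMES) * QUANTUM_FRAMES;
  return { frame: quantizedFrame, time: quantizedFrame / sampleRate };
}

// Reports whether two requested suspend times land on the same quantized
// frame, which the text above says is not allowed.
function hasDuplicateSuspend(suspendTimes, sampleRate) {
  const seen = new Set();
  for (const t of suspendTimes) {
    const { frame } = quantizeSuspendTime(t, sampleRate);
    if (seen.has(frame)) return true;
    seen.add(frame);
  }
  return false;
}

console.log(quantizeSuspendTime(0.5, 48000).frame);     // 24064
console.log(hasDuplicateSuspend([0.5, 0.5005], 48000)); // true
```

At 48 kHz, 0.5 s is frame 24000, which sits between the boundaries 23936 and 24064, so the suspension takes effect at frame 24064.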

Copies the samples from the specified channel of the AudioBuffer to the destination array.

Let buffer be the AudioBuffer with $N_b$ frames, let $N_f$ be the number of elements in the destination array, and let $k$ be the value of bufferOffset. Then the number of frames copied from buffer to destination is $\max(0, \min(N_b - k, N_f))$. If this is less than $N_f$, then the remaining elements of destination are not modified.
An UnknownError can be thrown if source cannot be copied to the buffer.

Let buffer be the AudioBuffer with $N_b$ frames, let $N_f$ be the number of elements in the source array, and let $k$ be the value of bufferOffset. Then the number of frames copied from source to the buffer is $\max(0, \min(N_b - k, N_f))$. If this is less than $N_f$, then the remaining elements of buffer are not modified.
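
The frame-count rule above is easy to check in code. The following sketch uses plain Float32Arrays as a stand-in for one channel of an AudioBuffer; framesToCopy and copyFromChannelData are hypothetical helper names, not part of the Web Audio API.

```javascript
// Number of frames copied: max(0, min(Nb - k, Nf)), where Nb is the buffer
// length, Nf the array length, and k the bufferOffset.
function framesToCopy(bufferLength, arrayLength, bufferOffset = 0) {
  return Math.max(0, Math.min(bufferLength - bufferOffset, arrayLength));
}

// Copies from `channelData` (standing in for one AudioBuffer channel) into
// `destination`, leaving any remaining destination elements untouched.
function copyFromChannelData(channelData, destination, bufferOffset = 0) {
  const count = framesToCopy(channelData.length, destination.length, bufferOffset);
  destination.set(channelData.subarray(bufferOffset, bufferOffset + count));
  return count;
}

const channel = Float32Array.of(1, 2, 3, 4);
const destination = new Float32Array(3).fill(9);
const copied = copyFromChannelData(channel, destination, 2);
console.log(copied, Array.from(destination)); // 2 [ 3, 4, 9 ]
```

Only two frames remain after offset 2, so the third element of the destination keeps its previous value.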

Arguments for the AudioBuffer.getChannelData() method.

Parameter Type Nullable Optional Description
channel unsigned long ✘ ✘ This parameter is an index representing the particular channel to get data for. An index value of 0 represents the first channel. This index value must be less than [[number of channels]] or an IndexSizeError exception must be thrown.
Return type: Float32Array

Note: The methods copyToChannel() and copyFromChannel() can be used to fill part of an array by passing in a Float32Array that is a view onto the larger array. When reading data from an AudioBuffer’s channels, and the data can be processed in chunks, copyFromChannel() should be preferred to calling getChannelData() and accessing the resulting array, because it may avoid unnecessary memory allocation and copying.

An internal operation, acquire the contents of an AudioBuffer, is invoked when the contents of an AudioBuffer are needed by some API implementation. This operation returns immutable channel data to the invoker.

When an acquire the contents operation occurs on an AudioBuffer, run the following steps:

If the IsDetachedBuffer operation on any of the AudioBuffer’s ArrayBuffers returns true, abort these steps and return a zero-length channel data buffer to the invoker.

Detach all ArrayBuffers for arrays previously returned by getChannelData() on this AudioBuffer.

Note: Because an AudioBuffer can only be created through createBuffer() or through the AudioBuffer constructor, this cannot throw.

Retain the underlying [[internal data]] from those ArrayBuffers and return references to them to the invoker.

Attach ArrayBuffers containing copies of the data to the AudioBuffer, to be returned by the next call to getChannelData().

The acquire the contents of an AudioBuffer operation is invoked in the following cases:

When AudioBufferSourceNode.start is called, it acquires the contents of the node’s buffer. If the operation fails, nothing is played.

When the buffer of an AudioBufferSourceNode is set and AudioBufferSourceNode.start has been previously called, the setter acquires the contents of the AudioBuffer. If the operation fails, nothing is played.

When a ConvolverNode’s buffer is set to an AudioBuffer, it acquires the contents of the AudioBuffer.

When the dispatch of an AudioProcessingEvent completes, it acquires the contents of its outputBuffer.

Note: This means that copyToChannel() cannot be used to change the contents of an AudioBuffer currently in use by an AudioNode that has acquired the contents of that AudioBuffer, because the AudioNode will continue to use the data previously acquired.

