<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Infrasonic Audio Blog]]></title><description><![CDATA[Development log, tutorials, and insights brought to you by Infrasonic Audio: independent and freelance software engineering, DSP, and firmware development for s]]></description><link>https://blog.infrasonicaudio.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1661206862210/U1d46ZrH1.png</url><title>Infrasonic Audio Blog</title><link>https://blog.infrasonicaudio.com</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 04:07:18 GMT</lastBuildDate><atom:link href="https://blog.infrasonicaudio.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Real-Time Audio Synthesis on ESP-32]]></title><description><![CDATA[Background
I've recently been learning the ESP-32 platform as part of some contract work I'm doing for a synthesizer-adjacent product line. The requirements for the work involve real-time, audio-rate digital synthesis via I2S which is something that ...]]></description><link>https://blog.infrasonicaudio.com/real-time-audio-synthesis-on-esp-32</link><guid isPermaLink="true">https://blog.infrasonicaudio.com/real-time-audio-synthesis-on-esp-32</guid><category><![CDATA[C]]></category><category><![CDATA[embedded]]></category><category><![CDATA[music]]></category><dc:creator><![CDATA[Nick Donaldson]]></dc:creator><pubDate>Sat, 30 Apr 2022 16:32:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1651267078345/Ce9Ph4MCJ.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-background">Background</h2>
<p>I've recently been learning the <a target="_blank" href="https://www.espressif.com/en/products/socs/esp32">ESP-32</a> platform as part of some contract work I'm doing for a synthesizer-adjacent product line. The requirements for the work involve real-time, audio-rate digital synthesis via <a target="_blank" href="https://en.wikipedia.org/wiki/I%C2%B2S">I<sup>2</sup>S</a>, which is something that is a bit lacking in the official ESP-IDF examples. With some research, prior examples, and a bit of stabbing in the dark, I was able to get a standard buffer-callback-like architecture working for providing samples to I<sup>2</sup>S DMA buffers using a FreeRTOS task, providing a simple foundation on which to build real-time audio signal processing on ESP-32 for any application.</p>
<p>Although there is a <a target="_blank" href="https://github.com/espressif/esp-idf/tree/master/examples/peripherals/i2s/i2s_basic">basic I<sup>2</sup>S DSP example</a> provided in the official ESP-IDF SDK which creates a simple oscillator tone, it isn't all that helpful in practice. For one, it isn't really doing "real-time" synthesis, in the sense that the static oscillator tone it produces is at a frequency specifically chosen such that its period in samples is (and must be) <em>exactly</em> one DMA buffer in length. This means that the buffer only has to be filled up one time – except when changing the waveform or bit depth – and the I<sup>2</sup>S driver will continue to read the same samples out of the DMA buffer over and over, creating a sustained tone from a perfectly repeating single-cycle wave.</p>
<p>I published a GitHub repository containing my own example code which you can access via the link below. Read on to learn more about how it works.</p>
<p><a target="_blank" href="https://github.com/infrasonicaudio/esp32-i2s-synth-example">&gt;&gt; Get the Example Code here</a></p>
<h2 id="heading-a-quick-overview">A Quick Overview</h2>
<p>At a high level the example code is doing a few things:</p>
<ol>
<li>Configure the I<sup>2</sup>S driver to use the ESP-32's onboard DAC at our defined sample rate</li>
<li>Start a FreeRTOS Task to continuously fill the DMA buffers with samples</li>
<li>Within the task loop, generate a sine wave at an arbitrary predefined frequency</li>
</ol>
<p>Steps 1 and 3 are the focus of this article and covered in some detail below, with a summary at the end that ties everything together with step 2.</p>
<h2 id="heading-the-esp-idf-i2s-driver">The ESP-IDF I2S Driver</h2>
<p>ESP-IDF, the official SDK for development on the ESP-32, provides a C driver API for I<sup>2</sup>S just as it does for other types of peripherals. The driver is <a target="_blank" href>fairly well-documented</a> so we won't go into a ton of detail on how to use it, but rather focus on how it is used in the example code linked above.</p>
<h3 id="heading-configuration-options">Configuration Options</h3>
<p>The first thing we see in <code>main</code> is a declaration of a config struct with a bunch of values:</p>
<pre><code class="lang-c"><span class="hljs-keyword">i2s_config_t</span> i2s_config = {
    .mode = I2S_MODE_MASTER | I2S_MODE_TX | I2S_MODE_DAC_BUILT_IN,
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_MSB,
    .dma_buf_count = DMA_NUM_BUF,
    .dma_buf_len = DMA_BUF_LEN,
    .use_apll = <span class="hljs-literal">false</span>,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL2
};
</code></pre>
<p>Here's an overview of each meaningful value for our use case:</p>
<p><code>.mode = I2S_MODE_MASTER | I2S_MODE_TX | I2S_MODE_DAC_BUILT_IN</code></p>
<p>Bitmask which puts the driver in the "master" (preferred term: "leader") mode, tells the driver we want to transmit (TX) data and not receive it, and that we want to use the ESP-32's built-in 8-bit DAC rather than an external DAC or codec IC.</p>
<p><code>.sample_rate = SAMPLE_RATE</code></p>
<p>Sets the sample rate for the driver. This also informs how the underlying clock source will be configured/divided.</p>
<p><code>.bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT</code></p>
<p>Sets the number of bits per sample. For the built-in DACs we have to use 16 bit as documented even though they only support 8-bit resolution. Only the most significant byte (the upper 8 bits) of each 16-bit sample is used (more on that later).</p>
<p><code>.channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT</code><br />
<code>.communication_format = I2S_COMM_FORMAT_STAND_MSB</code></p>
<p>Sets the I<sup>2</sup>S channel and communication format (including endianness). For external codecs these need to be set to something with which the codec is compatible and will inform how data is transmitted at the signal protocol level. For the internal DACs we have to use exactly these values.</p>
<p><code>.dma_buf_count = DMA_NUM_BUF</code><br />
<code>.dma_buf_len = DMA_BUF_LEN</code></p>
<p>Sets the number and size (in sample frames, NOT bytes) of the allocated DMA buffers. In common real-time audio terms, this effectively determines our buffer size and latency, as discussed in more detail below.</p>
<p><code>.use_apll = false</code></p>
<p>Tells the driver NOT to use the ostensibly more accurate and evenly divisible <code>APLL</code> for the main I<sup>2</sup>S source clock, in favor of using <code>PLL_D2</code>. The <code>APLL</code> does not seem to work correctly for the internal DAC, though from <a target="_blank" href="https://www.espressif.com/sites/default/files/documentation/esp32_technical_reference_manual_en.pdf#subsection.12.3">reference documentation</a> it seems like <code>APLL</code> is preferred for external codecs in order to avoid stability issues from dividing by decimal values.</p>
<p><code>.intr_alloc_flags = ESP_INTR_FLAG_LEVEL2</code></p>
<p>Sets the interrupt flags for allocating the I<sup>2</sup>S DMA interrupt. This is the second highest C-level interrupt we can choose. Passing <code>0</code> here instead will have the system automatically choose a dedicated interrupt level which may be a better option.</p>
<h3 id="heading-dma-buffer-size-and-latency">DMA Buffer Size and Latency</h3>
<p>For the purposes of the example program, we are telling the driver to allocate two 32-frame DMA buffers for our I<sup>2</sup>S output. In bytes, this comes out to <code>2 * 32 frames * 2 channels * 2 bytes/sample = 256 bytes</code>. The program must be able to fill up one of these buffers in the time it takes for the DMA controller to write the samples in the other buffer out to I<sup>2</sup>S, which is about 725 microseconds at a sample rate of 44.1 kHz – this is effectively our output latency. For those acquainted with real-time audio APIs in desktop or mobile operating systems, this should all sound very familiar.</p>
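<p>To make that arithmetic easy to repeat for other buffer sizes, here is a minimal sketch of the two calculations. The helper names <code>dma_total_bytes</code> and <code>buffer_period_us</code> are made up for illustration; they are not part of ESP-IDF or the example repository.</p>
<pre><code class="lang-c">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Hypothetical helpers for illustration only (not an ESP-IDF API). */

/* Total bytes of DMA memory across all buffers. */
size_t dma_total_bytes(size_t buf_count, size_t frames_per_buf,
                       size_t channels, size_t bytes_per_sample)
{
    return buf_count * frames_per_buf * channels * bytes_per_sample;
}

/* Time in microseconds it takes to play out one buffer of frames. */
float buffer_period_us(size_t frames_per_buf, uint32_t sample_rate_hz)
{
    return 1.0e6f * (float)frames_per_buf / (float)sample_rate_hz;
}
</code></pre>
<p>With the values used in this article, <code>dma_total_bytes(2, 32, 2, 2)</code> gives 256 and <code>buffer_period_us(32, 44100)</code> gives roughly 725.6.</p>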
<p>Why only 2 buffers? Because we ideally want to minimize output latency. The program fills up the buffer not in use while the DMA controller is transmitting the other one, then the two swap, ad infinitum. The driver takes care of the "swapping" for us as the <code>i2s_write()</code> function will block until the DMA controller has buffer space available for us to write into. <em>Note that this is essentially an asynchronous block; the FreeRTOS scheduler will switch to other tasks while we're waiting (more on that a little later).</em></p>
<p>The more DSP calculations we run in our synthesis loop, the more CPU time we will be using before our "deadline" to fill the buffer. If we don't keep up, not only will our output have dropouts, but we will also end up completely consuming one of the ESP-32's CPU cores, leading to unexpected behavior and the FreeRTOS scheduler watchdog printing errors to the console. This example program is doing nothing but generating a simple sine wave oscillator, so it can keep up with a 32-frame buffer. More complex applications may require larger buffer sizes and/or counts, at the cost of added latency. I recommend <a target="_blank" href="https://blog.atomic14.com/2021/04/20/esp32-i2s-dma-buf-len-buf-count.html">this excellent overview</a> as a general explanation of tradeoffs when choosing the buffer size and number of buffers.</p>
<h3 id="heading-installing-the-driver">Installing the Driver</h3>
<p>Once we have our configuration struct populated correctly, there are two simple steps remaining to "install" the I<sup>2</sup>S driver at the system level:</p>
<p>First we call <code>i2s_driver_install()</code>, passing which I<sup>2</sup>S "port" we want to use and a pointer to our struct. The last two parameters set the size of, and a handle for, an optional queue on which the driver posts I<sup>2</sup>S events; we don't need it in this example, so we pass <code>0</code> and <code>NULL</code>.</p>
<pre><code class="lang-c">i2s_driver_install(I2S_NUM, &amp;i2s_config, <span class="hljs-number">0</span>, <span class="hljs-literal">NULL</span>);
</code></pre>
<p>Finally, we set which GPIO pins we want to use for our chosen I<sup>2</sup>S port. For an external codec IC we would need to choose pins for each of the discrete signals that are part of the protocol, but for the internal DACs we simply pass NULL in lieu of a configuration struct.</p>
<pre><code class="lang-c">i2s_set_pin(I2S_NUM, <span class="hljs-literal">NULL</span>);
</code></pre>
<p>These functions can both return error codes and it's probably wise to check those in a real application. If all is well, at this point we should have an installed and running I<sup>2</sup>S driver outputting silence from the allocated DMA buffers since we haven't written anything to them yet.</p>
<h2 id="heading-simple-sine-wave-synthesis">Simple Sine Wave Synthesis</h2>
<p>Above the <code>main</code> function there is another function defined called <code>audio_task</code> which contains a loop in which a simple sinusoidal oscillator is synthesized at a predefined frequency.</p>
<pre><code class="lang-c"><span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i=<span class="hljs-number">0</span>; i &lt; DMA_BUF_LEN; i++) {
    <span class="hljs-comment">// Scale sine sample to 0-1 for internal DAC</span>
    <span class="hljs-comment">// (can't output negative voltage)</span>
    samp = (sinf(p) + <span class="hljs-number">1.0f</span>) * <span class="hljs-number">0.5f</span>;

    <span class="hljs-comment">// Increment and wrap phase</span>
    p += PHASE_INC;
    <span class="hljs-keyword">if</span> (p &gt;= TWOPI)
        p -= TWOPI;

    <span class="hljs-comment">// Scale to 8-bit integer range</span>
    samp *= <span class="hljs-number">255.0f</span>;

    <span class="hljs-comment">// Shift to MSB of 16-bit int for internal DACs (interleaved buffer)</span>
    out_buf[i*<span class="hljs-number">2</span>] = out_buf[i*<span class="hljs-number">2</span>+<span class="hljs-number">1</span>] = (<span class="hljs-keyword">uint16_t</span>)samp &lt;&lt; <span class="hljs-number">8</span>;
}
</code></pre>
<p>There is nothing fancy or unusual going on here; for each sample, we're simply passing the current phase value into a standard <code>sinf()</code> function to get a sinusoidal wave sample at the current phase and then advancing the phase variable by the correct amount for our desired frequency at the output sample rate. In practice, depending on performance needs, this might instead be done with an interpolated table lookup from a table loaded into memory.</p>
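<p>As a rough idea of what the table-lookup alternative could look like, here is a minimal sketch using linear interpolation. The table size of 256 and the names <code>sine_table_init</code> and <code>sine_lookup</code> are assumptions for illustration and do not come from the example repository.</p>
<pre><code class="lang-c">#include &lt;math.h&gt;
#include &lt;stddef.h&gt;

#define SINE_TABLE_SIZE 256          /* assumed size, chosen for illustration */
#define TWO_PI_F 6.28318530718f

/* One extra guard entry so interpolation never reads past the end. */
static float sine_table[SINE_TABLE_SIZE + 1];

/* Fill the table once at startup, before the audio task runs. */
void sine_table_init(void)
{
    for (size_t i = 0; i &lt;= SINE_TABLE_SIZE; i++) {
        sine_table[i] = sinf(TWO_PI_F * (float)i / (float)SINE_TABLE_SIZE);
    }
}

/* Approximate sin(phase) for phase in [0, 2*pi) by linear interpolation. */
float sine_lookup(float phase)
{
    float pos = phase * ((float)SINE_TABLE_SIZE / TWO_PI_F);
    size_t idx = (size_t)pos;
    if (idx &gt;= SINE_TABLE_SIZE) {     /* guard against rounding right at 2*pi */
        idx = SINE_TABLE_SIZE - 1;
    }
    float frac = pos - (float)idx;
    return sine_table[idx] + frac * (sine_table[idx + 1] - sine_table[idx]);
}
</code></pre>
<p>In the synthesis loop, <code>sinf(p)</code> would then become <code>sine_lookup(p)</code>. Whether this is actually faster than <code>sinf()</code> on a given chip is something to measure rather than assume.</p>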
<p><code>PHASE_INC</code> is defined via the relationship between our desired oscillator frequency (in Hz) and our sample rate (also in Hz) as:</p>
<p><code>(TWOPI * WAVE_FREQ_HZ / SAMPLE_RATE)</code></p>
<p>The wave frequency and sample rate are simple preprocessor macro definitions, which may be redefined to arbitrary (sensible) values for experimentation. Changing <code>WAVE_FREQ_HZ</code> will produce a sine wave at another arbitrary frequency. In a more practical synth implementation, the frequency of the oscillator would be dynamic and influenced by some other means of control (directly via a knob, etc) which would not be a difficult modification to make to this simple example.</p>
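<p>For a frequency that changes at run time, the same relationship can be evaluated whenever the control value changes instead of at compile time. The following is only a sketch under that assumption; the <code>osc_t</code> struct and its functions are hypothetical names, not part of the example code.</p>
<pre><code class="lang-c">#define TWO_PI_F 6.28318530718f

/* Hypothetical oscillator state for illustration. */
typedef struct {
    float phase;      /* current phase in radians, kept in [0, 2*pi) */
    float phase_inc;  /* radians to advance per output sample */
} osc_t;

/* Recompute the increment; call this when the knob (or other control) moves. */
void osc_set_freq(osc_t *osc, float freq_hz, float sample_rate_hz)
{
    osc-&gt;phase_inc = TWO_PI_F * freq_hz / sample_rate_hz;
}

/* Return the phase for the current sample, then advance and wrap it. */
float osc_next_phase(osc_t *osc)
{
    float p = osc-&gt;phase;
    osc-&gt;phase += osc-&gt;phase_inc;
    if (osc-&gt;phase &gt;= TWO_PI_F) {
        osc-&gt;phase -= TWO_PI_F;
    }
    return p;
}
</code></pre>
<p>Since the sample rate is fixed, the division in <code>osc_set_freq</code> could also be replaced by a multiplication with a precomputed <code>TWO_PI_F / sample_rate_hz</code> constant.</p>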
<p>The sine wave sample is also scaled and offset from its standard range of <code>-1 to 1</code> to a range of <code>0 to 1</code> for purposes of using the internal DAC, which is unipolar (it can't produce negative voltages). With an external I<sup>2</sup>S codec IC designed for audio, you would not need to do this.</p>
<p>Next, the normalized floating point sample is scaled to a range of <code>0 - 255</code>. While it's convenient to work with floats during the synthesis phase, the DAC requires us to write integer values into the output buffer, and the internal DAC on ESP-32 has an 8-bit resolution - hence <code>2^8 - 1 = 255</code> as the max value.</p>
<p>Finally, we write the sample into the temporary output buffer. This buffer is <em>not</em> the DMA memory buffer but rather one that has been statically allocated into SRAM, as a temporary space for us to fill a buffer before it's copied to DMA memory. Note that the buffer is interleaved, so we write each sample for each channel next to each other in memory.</p>
<p>Remember how we configured the I<sup>2</sup>S driver for 16-bit samples, and that the internal DACs only use the most significant byte? Because of that, we need to cast our sample to a 16-bit integer and shift left by 8 bits before writing to the buffers. Note that in this example both DACs are given the same sample, but they could be given different samples for stereo or dual mono output.</p>
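<p>The offset, scale, and shift steps can be folded into one small conversion function. This is a sketch rather than the code from the repository: the name <code>float_to_dac_word</code> is hypothetical, and the clamping is an extra safety step I've assumed here that the loop above doesn't perform.</p>
<pre><code class="lang-c">#include &lt;stdint.h&gt;

/* Hypothetical helper: convert a bipolar sample in [-1, 1] into the
 * 16-bit word for the internal DAC (8-bit level in the upper byte). */
uint16_t float_to_dac_word(float samp)
{
    /* Offset and scale from [-1, 1] to [0, 1] (the DAC is unipolar). */
    float unipolar = (samp + 1.0f) * 0.5f;

    /* Assumed addition: clamp so out-of-range input can't wrap around. */
    if (unipolar &lt; 0.0f) unipolar = 0.0f;
    if (unipolar &gt; 1.0f) unipolar = 1.0f;

    /* Scale to the 8-bit range, then move into the most significant byte. */
    uint8_t level = (uint8_t)(unipolar * 255.0f);
    return (uint16_t)((uint16_t)level &lt;&lt; 8);
}
</code></pre>
<p>For example, an input of <code>-1.0f</code> maps to <code>0x0000</code>, <code>0.0f</code> maps to <code>0x7F00</code>, and <code>1.0f</code> maps to <code>0xFF00</code>.</p>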
<h4 id="heading-an-important-note-about-performance">⏱ An important note about performance</h4>
<p>The ESP-32 has a floating point unit (FPU), but it only natively supports single-precision operations and is quite slow at division compared to multiplication. (<a target="_blank" href="https://esp32.com/viewtopic.php?p=82090#p82090">Benchmarks</a> for reference). Thus, explicit use of single-precision library functions, constants, and variables is encouraged for CPU efficiency, as well as pre-calculating the reciprocal of divisor constants or variables that don't change often, so you can multiply instead of divide in the DSP loop.</p>
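<p>As a small illustration of the reciprocal trick, here is a sketch of a function that scales a block of samples by a divisor. The function name <code>scale_block</code> is made up for this example.</p>
<pre><code class="lang-c">/* Hypothetical example: divide every sample in a block by the same value. */
void scale_block(float *buf, int n, float divisor)
{
    /* One division outside the loop. The f suffix keeps this single precision. */
    const float inv = 1.0f / divisor;

    for (int i = 0; i &lt; n; i++) {
        buf[i] *= inv;   /* one multiply per sample instead of one divide */
    }
}
</code></pre>
<p>The same idea applies to constants: writing <code>0.5f</code> rather than <code>0.5</code>, and calling <code>sinf()</code> rather than <code>sin()</code>, avoids silently promoting the calculation to double precision.</p>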
<h3 id="heading-passing-output-to-i2s">Passing output to I2S</h3>
<p>There is one additional line of code in the synthesis task loop which copies the temporary output buffer into which we just wrote our samples to the I<sup>2</sup>S driver's allocated DMA memory buffers:</p>
<pre><code class="lang-c">i2s_write(I2S_NUM, out_buf, <span class="hljs-keyword">sizeof</span>(out_buf), &amp;bytes_written, portMAX_DELAY);
</code></pre>
<p>As previously mentioned, this will <em>block</em> the current task and let the scheduler run other tasks until there is space in the DMA memory, i.e. one of the buffers has just been shifted out via I<sup>2</sup>S and is now free for us to write new data into. The last parameter determines the timeout interval for which the scheduler should wait for free DMA memory space until "giving up" and moving on. We pass <code>portMAX_DELAY</code> because for this use case, we never want it to "give up".</p>
<p>This function also takes a pointer to a <code>size_t</code> variable which is populated with the number of bytes actually written when the function returns. If there were not enough space to write the entire temporary buffer before the scheduler timeout interval elapsed, the output value would be less than the number of bytes we requested to write. In our case, since our temporary buffer <code>out_buf</code> is exactly the same size as one of the 2 DMA buffers we allocated when configuring the driver and because we have an infinite timeout, we don't need to check this value.</p>
<h2 id="heading-putting-it-all-together">Putting it All Together</h2>
<p>So far we have covered how to configure and install the I<sup>2</sup>S driver as well as how to synthesize some audio samples and feed them to the output buffers. The last thing we need to do is set up a FreeRTOS task to run our output loop.</p>
<p>In short, a task is a "routine" with its own stack and specific priority for the FreeRTOS scheduler to make time to run, balanced against other tasks and things demanding CPU time. Tasks are generally not meant to ever return; they typically contain an infinite loop which waits for some work to do, does the work, and then delays or goes back to waiting for more work to do.</p>
<p>A task is created by invoking <code>xTaskCreate()</code> with a few parameters:</p>
<pre><code class="lang-c">xTaskCreate(audio_task, <span class="hljs-string">"audio"</span>, <span class="hljs-number">1024</span>, <span class="hljs-literal">NULL</span>, configMAX_PRIORITIES - <span class="hljs-number">1</span>, <span class="hljs-literal">NULL</span>);
</code></pre>
<p>The first parameter is a pointer to the function which represents the entry point of the task. The signature of the function should be <code>void func_name(void *data)</code> - it takes a void pointer to arbitrary data and returns void. Here we are passing our audio task function which contains the infinite loop to fill the output buffer.</p>
<p>The second parameter is a unique name for the task, and the third parameter is the size of the stack that will be allocated for the task. In our case 1024 bytes is plenty as we aren't allocating much data on the stack or invoking any deep function call trees. If it's not enough, ESP-32 will print a stack overflow error to the console and reboot.</p>
<p>The fourth parameter is a pointer to any arbitrary data we want to pass into the task entry point function. In our case we don't need to pass anything into the task, so we pass <code>NULL</code>.</p>
<p>The fifth parameter is the task priority. In the case of a low-latency, real-time synthesis output, our output deadline is paramount. We must under no circumstances miss the deadline or audio dropouts will occur. Thus, we want this task to have very high priority, hence we give it the system's maximum possible priority <code>configMAX_PRIORITIES - 1</code>.</p>
<p>The final parameter is an optional pointer to a task handle variable which will be populated with a reference to the task that is created. This is useful if you need a reference to the task, but we don't, so we pass <code>NULL</code>.</p>
<p>Once this function is called, our task is created and the system starts running it. At this point it's safe for us to return from <code>main</code> because we have created at least one other task that is still alive. Now we should have a sine wave oscillator output at both of the internal DAC pins! </p>
<hr />
<p>Thanks for reading, and I hope this is a useful resource to ESP-32 synth tinkerers! </p>
]]></content:encoded></item></channel></rss>