Recovery and restoration service:frontea online,corp.

Deep Abyss Audio (3) – Spinning up the I2S engine

Building a Pro Audio–grade USB DAC on ESP32‑S3

2025/12/25

Hello again—third article already. This time I’m starting from the output side: the I2S task module that actually drives the DAC. Without that, everything else is just a nice fantasy running in circles.

Hardware setup for this round

DAC board:

I’m using a PCM5102A kit board. It’s almost “plug and play” as far as circuitry goes, so there’s not much to draw schematically—but I’ll still treat it with the respect it deserves.



Included OLED:

The bundled OLED module is also part of the system. With these two pieces, we have the full “visual audio” front end.

Pro Audio requirement: MCLK is mandatory

Because this build targets Pro Audio behavior, the master clock (MCLK) is not optional. MCLK is properly wired to the PCM5102A (the “knock‑off MOS”, パクリMOS, in my notes), so the DAC sees a clean, explicit master clock instead of relying on internal tricks.

USB connector choice:

On the USB side, I’ve pulled D± out to a USB‑A connector.

The original USB‑C on the board? Let’s just say… it’s going into storage for now.

I2S startup module overview

The heart of this article is the i2s_init() function, the I2S startup module. At first glance it looks like a standard ESP‑IDF I2S initialization sequence, but there’s an important “+α” (a little something extra) hidden inside: a follow‑up step that revisits the PLL clock dividers to get closer to the ideal audio clock.


Conceptually, i2s_init() does four big things:

- Create and configure the I2S channel

- Configure standard I2S format and slot behavior

- Compute the ideal clock structure and compare it to the actual PLL dividers

- If necessary, re‑search and re‑apply better divider values, then bind data-processing functions


Let’s walk through those.


void i2s_init(void)

{

    i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_PORT, I2S_ROLE_MASTER);

    

    chan_cfg.dma_desc_num = i2s_dma_setting.count;

    chan_cfg.dma_frame_num = i2s_dma_setting.frame_size;

    chan_cfg.intr_priority = i2s_dma_setting.intr_priority;


    chan_cfg.auto_clear_after_cb = true;

    //chan_cfg.auto_clear = true; 

    //chan_cfg.queue_size = 8;


    ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, &tx_handle, NULL));


    i2s_std_config_t std_cfg = {

        .clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(uac_as_quality.sample_rate),

        .slot_cfg = I2S_STD_PHILIPS_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_STEREO),

        .gpio_cfg = {

            .mclk = PIN_MCK,

            .bclk = PIN_BCK,  

            .ws   = PIN_LRCK,

            .dout = PIN_DOUT,

            .din  = I2S_GPIO_UNUSED,

            .invert_flags = {

                .mclk_inv = false,

                .bclk_inv = false,

                .ws_inv = false,

            },

        },

    };


    if (uac_as_quality.resolution_bits == 24) {

        std_cfg.slot_cfg.data_bit_width = I2S_DATA_BIT_WIDTH_24BIT;


    } else if (uac_as_quality.resolution_bits == 32) {

        std_cfg.slot_cfg.data_bit_width = I2S_DATA_BIT_WIDTH_32BIT;

    }


    std_cfg.slot_cfg.ws_width = I2S_SLOT_BIT_WIDTH_32BIT;

    std_cfg.slot_cfg.slot_bit_width = I2S_SLOT_BIT_WIDTH_32BIT;


    if (std_cfg.slot_cfg.slot_bit_width == I2S_SLOT_BIT_WIDTH_24BIT) {

        if (uac_as_quality.sample_rate > 96000)

            std_cfg.clk_cfg.mclk_multiple = I2S_MCLK_MULTIPLE_192;

        else

            std_cfg.clk_cfg.mclk_multiple = I2S_MCLK_MULTIPLE_384;        

    }


    ESP_LOGI(TAG, "i2s slotcfg data=%d, ws=%d, slot=%d, mode=%d"

        , std_cfg.slot_cfg.data_bit_width, std_cfg.slot_cfg.ws_width, std_cfg.slot_cfg.slot_bit_width, std_cfg.slot_cfg.slot_mode);

 

    g_clk.src_clk = std_cfg.clk_cfg.clk_src == I2S_CLK_SRC_PLL_160M?SRC_CLK_160MHZ:SRC_CLK_40MHZ;


    g_cnt_ideal.src = g_clk.src_clk;

    g_cnt_ideal.mclk_multiple = std_cfg.clk_cfg.mclk_multiple;

    g_cnt_ideal.mclk = g_cnt_ideal.mclk_multiple * std_cfg.clk_cfg.sample_rate_hz;

    g_cnt_ideal.bclk = std_cfg.clk_cfg.sample_rate_hz * std_cfg.slot_cfg.slot_bit_width * std_cfg.slot_cfg.slot_mode;

    g_cnt_ideal.bclk_divn = g_cnt_ideal.mclk / g_cnt_ideal.bclk;

    g_cnt_ideal.bclk_divi = g_cnt_ideal.mclk % g_cnt_ideal.bclk;


    g_cnt_ideal.pcnt_fs = std_cfg.clk_cfg.sample_rate_hz;

    g_cnt_ideal.pcnt_warmup = 0;

    g_cnt_ideal.pcnt_fs_max = g_cnt_ideal.pcnt_fs * FEEDBACK_AVAILABLE_RANGE;

    g_cnt_ideal.pcnt_fs_min = g_cnt_ideal.pcnt_fs / FEEDBACK_AVAILABLE_RANGE;


    g_cnt_ideal.pcnt_fs_factor = 1000.0f / (float)g_cnt_ideal.mclk_multiple;


    std_cfg.clk_cfg.bclk_div = g_cnt_ideal.bclk_divn;


    ESP_LOGI(TAG, "i2s request mclk=%d, mclk_multi=%d blck=%d, blck_div=%d.%d"

        , g_cnt_ideal.mclk, g_cnt_ideal.mclk_multiple, g_cnt_ideal.bclk, g_cnt_ideal.bclk_divn, g_cnt_ideal.bclk_divi);


    ESP_ERROR_CHECK(i2s_channel_init_std_mode(tx_handle, &std_cfg));

    


#ifdef I2S_PIN_DRIVE_CAP

    gpio_set_drive_capability(PIN_MCK, I2S_PIN_DRIVE_CAP);

    gpio_set_drive_capability(PIN_BCK, I2S_PIN_DRIVE_CAP);

    gpio_set_drive_capability(PIN_LRCK, I2S_PIN_DRIVE_CAP);

    gpio_set_drive_capability(PIN_DOUT, I2S_PIN_DRIVE_CAP);

#endif


    i2s_current_clock(I2S_PORT, &g_clk);


    ESP_LOGI(TAG, "i2s(1) sel=%d src=%d mlck=%.2f num=%d x=%d y=%d z=%d yn1=%d"

        , g_clk.clk_sel, g_clk.src_clk, g_clk.mclk, g_clk.div_num, g_clk.div_x, g_clk.div_y, g_clk.div_z, g_clk.div_yn1);

    ESP_LOGI(TAG, "i2s(1) en=%d, act=%d bits=%d bck_div=%d"

        , g_clk.clk_enable, g_clk.clk_active, g_clk.bits_mod, g_clk.bck_div);

    ESP_LOGI(TAG, "i2s(1) latency %dus/1ms", CALC_US_LATENCY(g_cnt_ideal.mclk, g_clk.mclk));


    if (fabs(g_clk.mclk-(double)g_cnt_ideal.mclk) > 1000) {

        clock_info_t cur_clk = g_clk;

        for (int i=100; i<256; i++) {

            clock_info_t tmp_clk = cur_clk;

            if (i2s_scan_div(g_clk.src_clk, g_cnt_ideal.mclk, round(g_clk.src_clk/g_cnt_ideal.mclk), i, &tmp_clk)) {

                if (fabs((double)g_cnt_ideal.mclk - tmp_clk.mclk) < fabs((double)g_cnt_ideal.mclk - cur_clk.mclk)) {

                    cur_clk = tmp_clk;

                }

            }


        }

        ESP_LOGI(TAG, "i2s(2) sel=%d src=%d mlck=%.2f num=%d x=%d y=%d z=%d, yn1=%d"

            , cur_clk.clk_sel, cur_clk.src_clk, cur_clk.mclk, cur_clk.div_num, cur_clk.div_x, cur_clk.div_y, cur_clk.div_z, cur_clk.div_yn1);

        ESP_LOGI(TAG, "i2s(2) en=%d, act=%d bits=%d bck_div=%d"

            , cur_clk.clk_enable, cur_clk.clk_active, cur_clk.bits_mod, cur_clk.bck_div);

        ESP_LOGI(TAG, "i2s(2) latency %dus/1ms", CALC_US_LATENCY(g_cnt_ideal.mclk, cur_clk.mclk));

        

        i2s_set_clock_div(I2S_PORT, &cur_clk);

        i2s_current_clock(I2S_PORT, &g_clk);


        ESP_LOGI(TAG, "i2s(3) sel=%d src=%d mlck=%.2f num=%d x=%d y=%d z=%d, yn1=%d"

            , g_clk.clk_sel, g_clk.src_clk, g_clk.mclk, g_clk.div_num, g_clk.div_x, g_clk.div_y, g_clk.div_z, g_clk.div_yn1);

        ESP_LOGI(TAG, "i2s(3) en=%d, act=%d bits=%d bck_div=%d"

            , g_clk.clk_enable, g_clk.clk_active, g_clk.bits_mod, g_clk.bck_div);

        ESP_LOGI(TAG, "i2s(3) latency %dus/1ms", CALC_US_LATENCY(g_cnt_ideal.mclk, g_clk.mclk));


    }


    if (uac_as_quality.resolution_bits == 32) {

        g_data_functions.pSRC = linearSRC32;

        g_data_functions.padjustVolume = adjust_volume32;

        g_data_functions.pcalcRms = calc_rms32;


    } else if (uac_as_quality.resolution_bits == 24) {

        g_data_functions.pSRC = linearSRC24;

        g_data_functions.padjustVolume = adjust_volume24;

        g_data_functions.pcalcRms = calc_rms24;


    } else {

        g_data_functions.pSRC = linearSRC16;

        g_data_functions.padjustVolume = adjust_volume16;

        g_data_functions.pcalcRms = calc_rms16;


    }

}


1. I2S channel configuration

The code starts by creating a standard I2S transmit channel:


Channel role:

Master (I2S_ROLE_MASTER) on a given I2S_PORT.


DMA settings:

- Descriptor count: i2s_dma_setting.count

- Frame size: i2s_dma_setting.frame_size

- Interrupt priority: i2s_dma_setting.intr_priority


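Buffer behavior:

auto_clear_after_cb = true, so the DMA TX buffer is cleared after the on_sent callback and the I2S sends zeros (silence) whenever no new data has been written.

This is the “plumbing” that lets the I2S engine stream audio continuously, with DMA doing the heavy lifting. Whether it actually runs without underruns still depends on the data path keeping those buffers fed.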
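To get a feel for what the DMA settings mean in time and memory, here is a minimal sketch in plain C. The function names are mine, not part of the project, and the example values (6 descriptors, 240 frames) are illustrative assumptions rather than the contents of i2s_dma_setting. ESP‑IDF documents an upper bound of 4092 bytes for a single DMA buffer, which the last helper checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes held by one DMA descriptor: frames x slots x bytes per slot. */
uint32_t dma_desc_bytes(uint32_t frame_num, uint32_t slot_count, uint32_t slot_bits)
{
    return frame_num * slot_count * (slot_bits / 8u);
}

/* Total audio buffered across all descriptors, in microseconds. */
uint32_t dma_buffer_us(uint32_t desc_num, uint32_t frame_num, uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0u) {
        return 0u;
    }
    uint64_t frames = (uint64_t)desc_num * frame_num;
    return (uint32_t)(frames * 1000000u / sample_rate_hz);
}

/* True if one descriptor stays within the documented 4092-byte limit. */
bool dma_desc_fits(uint32_t frame_num, uint32_t slot_count, uint32_t slot_bits)
{
    return dma_desc_bytes(frame_num, slot_count, slot_bits) <= 4092u;
}
```

With these assumed values, 6 descriptors of 240 stereo 32‑bit frames hold 30 ms of audio at 48 kHz and 15 ms at 96 kHz, which is the kind of trade‑off between latency and underrun margin these three fields control.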


2. Standard I2S slot configuration (PCM5102A‑aware)

Next comes i2s_std_config_t std_cfg:


Clock config:

I2S_STD_CLK_DEFAULT_CONFIG(uac_as_quality.sample_rate)

This sets the base sample rate (e.g., 44.1k, 48k, 96k, etc.).


Slot config (Philips format):

I2S_STD_PHILIPS_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_STEREO)

Then adjusted according to the requested resolution:

  • 16‑bit → I2S_DATA_BIT_WIDTH_16BIT
  • 24‑bit → I2S_DATA_BIT_WIDTH_24BIT
  • 32‑bit → I2S_DATA_BIT_WIDTH_32BIT

GPIO mapping:

  • MCLK: PIN_MCK
  • BCLK: PIN_BCK
  • LRCLK (WS): PIN_LRCK
  • DOUT: PIN_DOUT
  • DIN: unused (I2S_GPIO_UNUSED)
  • No inversion flags are used here.

PCM5102A slot alignment

For the PCM5102A, the slot timing is adjusted:

  • Word select width: ws_width = I2S_SLOT_BIT_WIDTH_32BIT
  • Slot bit width: slot_bit_width = I2S_SLOT_BIT_WIDTH_32BIT

Even if the data is 24‑bit, the slot is 32‑bit. This is a common pattern with many DACs: they expect 24‑bit data left‑aligned in a 32‑bit frame.
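As a rough illustration of “24‑bit data left‑aligned in a 32‑bit slot”, the sketch below shows the bit arithmetic only. It describes what ends up on the wire, not the ESP‑IDF driver’s in‑memory buffer layout, and both function names are hypothetical.

```c
#include <stdint.h>

/* Place a signed 24-bit sample in the top 24 bits of a 32-bit slot.
 * The shift is done on an unsigned value to avoid shifting a negative int. */
int32_t pack_s24_in_slot32(int32_t sample24)
{
    return (int32_t)((uint32_t)sample24 << 8);
}

/* Recover the signed 24-bit value from a left-aligned 32-bit slot.
 * Division is exact here because the low 8 bits are zero. */
int32_t unpack_slot32_to_s24(int32_t slot32)
{
    return slot32 / 256;
}
```

The low byte of each slot is simply padding; the DAC clocks it in and ignores it.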


24‑bit mode and MCLK multiple

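The code also contains a branch for a 24‑bit slot width that picks an MCLK multiple divisible by 3, so that MCLK/BCLK stays an integer:

  • If sample_rate > 96000 → mclk_multiple = I2S_MCLK_MULTIPLE_192
  • Else → mclk_multiple = I2S_MCLK_MULTIPLE_384

Note that in the listing above, slot_bit_width has just been set to I2S_SLOT_BIT_WIDTH_32BIT, so this branch is not taken and the multiple set by I2S_STD_CLK_DEFAULT_CONFIG stays in effect. With 32‑bit stereo slots, BCLK is 64 × fs, which already divides the common multiples evenly. Either way, the goal is the same: keep the clock tree in a clean, integer relationship, which matters for jitter and for predictable feedback timing later.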
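The integer‑ratio argument is easy to check by hand. Here is a small sketch, with hypothetical helper names, that tests whether a given MCLK multiple divides evenly into BCLK clocks per frame, and mirrors the 192/384 choice from the listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* MCLK = multiple x fs and BCLK = slot_bits x slot_count x fs,
 * so MCLK/BCLK is an integer when the multiple is divisible by
 * the number of BCLK cycles per frame. */
bool mclk_divides_evenly(uint32_t mclk_multiple, uint32_t slot_bits, uint32_t slot_count)
{
    uint32_t bclk_per_frame = slot_bits * slot_count;
    return bclk_per_frame != 0u && (mclk_multiple % bclk_per_frame) == 0u;
}

/* Same selection rule as the 24-bit branch in i2s_init(). */
uint32_t pick_mclk_multiple_24bit(uint32_t sample_rate_hz)
{
    return sample_rate_hz > 96000u ? 192u : 384u;
}
```

For 24‑bit stereo slots (48 BCLK cycles per frame), 256 fails the check while 192 and 384 pass; for 32‑bit stereo slots (64 cycles), 256 passes.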


3. Ideal clock model vs. actual PLL dividers

Here comes the “deep” part.


Building the ideal clock structure

The code constructs an “ideal” clock model in g_cnt_ideal:

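  • Source clock: SRC_CLK_160MHZ if clk_src == I2S_CLK_SRC_PLL_160M, otherwise SRC_CLK_40MHZ
  • Ideal MCLK: mclk = mclk_multiple * sample_rate
  • Ideal BCLK: bclk = sample_rate * slot_bit_width * slot_mode (slot_mode doubles as the slot count: 2 for stereo)
  • BCLK divider: bclk_divn = mclk / bclk (integer quotient), bclk_divi = mclk % bclk (remainder)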

The code then sets std_cfg.clk_cfg.bclk_div to g_cnt_ideal.bclk_divn so that the requested BCLK matches the ideal ratio as closely as the standard config allows.
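The same arithmetic can be run off‑target to see what the “ideal” numbers look like for a given rate. This is a stand‑alone sketch: ideal_clock_t and ideal_clock() are hypothetical names, not the project’s g_cnt_ideal structure, and only the four clock fields are modeled.

```c
#include <stdint.h>

typedef struct {
    uint32_t mclk_hz;       /* mclk_multiple * sample_rate */
    uint32_t bclk_hz;       /* sample_rate * slot_bits * slot_count */
    uint32_t bclk_div_int;  /* mclk / bclk */
    uint32_t bclk_div_rem;  /* mclk % bclk, zero for a clean integer ratio */
} ideal_clock_t;

ideal_clock_t ideal_clock(uint32_t sample_rate_hz, uint32_t mclk_multiple,
                          uint32_t slot_bits, uint32_t slot_count)
{
    ideal_clock_t c = {0u, 0u, 0u, 0u};
    c.mclk_hz = mclk_multiple * sample_rate_hz;
    c.bclk_hz = sample_rate_hz * slot_bits * slot_count;
    if (c.bclk_hz != 0u) {
        c.bclk_div_int = c.mclk_hz / c.bclk_hz;
        c.bclk_div_rem = c.mclk_hz % c.bclk_hz;
    }
    return c;
}
```

For 96 kHz with a 256× multiple and 32‑bit stereo slots this gives MCLK = 24.576 MHz, BCLK = 6.144 MHz, and a divider of 4 with no remainder.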

Counters for feedback and latency

g_cnt_ideal also stores:

  • pcnt_fs: the nominal sample rate (for feedback timing)
  • pcnt_fs_max / pcnt_fs_min: allowable range for feedback
  • pcnt_warmup: warm‑up count (here 0)
  • pcnt_fs_factor: 1000.0 / mclk_multiple, a precomputed factor for later fs calculations

All of this is about making the USB feedback and internal timing predictable and measurable—not just “it plays sound”.
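The listing does not show how these counters are used later, so the following is only my reading of the names: if a pulse counter measures MCLK edges over 1 ms, multiplying that count by 1000 / mclk_multiple yields an estimate of fs in Hz, which can then be checked against the min/max window. Both functions and the range value of 1.01 are assumptions for illustration; FEEDBACK_AVAILABLE_RANGE is not given in the article.

```c
#include <stdbool.h>
#include <stdint.h>

/* Estimate fs from MCLK pulses counted in one millisecond.
 * The factor has the same shape as pcnt_fs_factor in i2s_init(). */
float fs_from_mclk_count(uint32_t mclk_pulses_per_ms, uint32_t mclk_multiple)
{
    float factor = 1000.0f / (float)mclk_multiple;
    return (float)mclk_pulses_per_ms * factor;
}

/* Accept a measured fs only inside [nominal / range, nominal * range]. */
bool fs_in_range(float fs_hz, float nominal_hz, float range)
{
    return fs_hz <= nominal_hz * range && fs_hz >= nominal_hz / range;
}
```

For example, 24576 pulses in 1 ms with a 256× multiple corresponds to 96 kHz, and a reading of 98 kHz would fall outside a 1 % window.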

4. PLL divider re‑search and refinement

After calling i2s_channel_init_std_mode(), the code reads back the actual clock registers via i2s_current_clock() into g_clk and logs:

  • Clock source, MCLK, divider numbers (x, y, z, yn1)
  • BCLK divider
  • A calculated latency in microseconds per millisecond of audio

Why re‑search?

Even if you request a certain MCLK, the internal PLL and divider structure may not hit it exactly. So the code checks whether |actual_mclk - ideal_mclk| > 1000 Hz. If so:

  • It scans possible divider combinations using i2s_scan_div() to find a closer match.
  • It keeps the best candidate (cur_clk) that minimizes the absolute difference from the ideal MCLK.
  • Once found, it writes the new divider set with i2s_set_clock_div() and reads back again to confirm.

This is the “+α” I mentioned: a self‑tuning step that pushes the ESP32‑S3’s I2S PLL closer to the mathematically ideal audio clock, instead of just accepting the first configuration.

The result is logged again, including the updated latency. This gives you a concrete, measurable view of how close your clocking is to the target.
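The implementation of i2s_scan_div() is not shown here, so the sketch below is a generic stand‑in for the idea, not the author’s code. It assumes a divider of the form N + b/a, tries every denominator a up to a caller‑chosen limit, and keeps the combination whose output is closest to the target. The mapping to the S3’s x/y/z/yn1 register fields is deliberately left out, and all names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t div_int;   /* N, the integer part of the divider */
    uint32_t frac_num;  /* b, numerator of the fractional part */
    uint32_t frac_den;  /* a, denominator of the fractional part */
    double   mclk_hz;   /* resulting clock: src / (N + b/a) */
} frac_div_t;

static double abs_diff(double x, double y)
{
    return x > y ? x - y : y - x;
}

/* Search denominators 1..max_den for the divider closest to target_hz.
 * Returns mclk_hz == 0.0 if no usable divider (N >= 2) was found. */
frac_div_t best_fractional_divider(uint64_t src_hz, uint64_t target_hz, uint32_t max_den)
{
    frac_div_t best = {0u, 0u, 1u, 0.0};
    double best_err = -1.0;  /* negative means "nothing found yet" */

    if (target_hz == 0u) {
        return best;
    }
    for (uint32_t a = 1u; a <= max_den; a++) {
        /* k = N*a + b, rounded to the nearest integer of src*a/target */
        uint64_t k = (src_hz * a + target_hz / 2u) / target_hz;
        if (k < 2u * (uint64_t)a) {
            continue;  /* would need N < 2, assumed not usable */
        }
        double mclk = (double)src_hz * (double)a / (double)k;
        double err = abs_diff(mclk, (double)target_hz);
        if (best_err < 0.0 || err < best_err) {
            best.div_int = (uint32_t)(k / a);
            best.frac_num = (uint32_t)(k % a);
            best.frac_den = a;
            best.mclk_hz = mclk;
            best_err = err;
        }
    }
    return best;
}
```

With a 160 MHz source and a 24.576 MHz target, the ratio is 625/96, so the search lands on N = 6, b = 49, a = 96 with zero error. For 11.2896 MHz (44.1 kHz × 256) the exact ratio is 6250/441, which is only reachable if the denominator limit is at least 441.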


Optional: strengthening the I2S pins

If I2S_PIN_DRIVE_CAP is defined, the code boosts the drive capability of:

  • MCLK
  • BCLK
  • LRCLK
  • DOUT

This is a practical, hardware‑level tweak: stronger drive can help maintain signal integrity over real‑world PCB traces and cables, especially at higher sample rates and bit clocks.

5. Binding data processing functions by resolution

Finally, the module assigns function pointers in g_data_functions based on the active resolution:

  • 32‑bit:
    • pSRC = linearSRC32
    • padjustVolume = adjust_volume32
    • pcalcRms = calc_rms32
  • 24‑bit:
    • pSRC = linearSRC24
    • padjustVolume = adjust_volume24
    • pcalcRms = calc_rms24
  • 16‑bit (default):
    • pSRC = linearSRC16
    • padjustVolume = adjust_volume16
    • pcalcRms = calc_rms16

So the I2S engine isn’t just pushing bits—it’s wired into a resolution‑aware processing pipeline: sample‑rate conversion, volume control, and RMS calculation all switch to the correct implementation automatically.
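For readers who have not used this pattern, here is a reduced sketch of resolution‑based binding with a lookup table instead of an if/else chain. The signatures of the real linearSRCxx, adjust_volumexx, and calc_rmsxx functions are not shown in this article, so everything below (types, names, the Q15 gain, and the assumption that 24‑bit samples live in 32‑bit containers) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*volume_fn_t)(void *buf, size_t samples, int32_t gain_q15);

typedef struct {
    int         bits;              /* resolution this entry serves */
    size_t      bytes_per_sample;  /* container size in memory */
    volume_fn_t adjust_volume;     /* resolution-specific worker */
} data_ops_t;

/* Scale 16-bit samples by a Q15 gain (32768 = unity). */
static void volume16(void *buf, size_t samples, int32_t gain_q15)
{
    int16_t *p = buf;
    for (size_t i = 0; i < samples; i++) {
        p[i] = (int16_t)(((int32_t)p[i] * gain_q15) / 32768);
    }
}

/* Scale 32-bit containers; 64-bit intermediate avoids overflow. */
static void volume32(void *buf, size_t samples, int32_t gain_q15)
{
    int32_t *p = buf;
    for (size_t i = 0; i < samples; i++) {
        p[i] = (int32_t)(((int64_t)p[i] * gain_q15) / 32768);
    }
}

static const data_ops_t k_ops[] = {
    {16, 2, volume16},
    {24, 4, volume32},  /* assumption: 24-bit data in 32-bit containers */
    {32, 4, volume32},
};

/* Pick the entry for the requested resolution; fall back to 16-bit,
 * matching the final else branch in i2s_init(). */
const data_ops_t *select_data_ops(int resolution_bits)
{
    for (size_t i = 0; i < sizeof k_ops / sizeof k_ops[0]; i++) {
        if (k_ops[i].bits == resolution_bits) {
            return &k_ops[i];
        }
    }
    return &k_ops[0];
}
```

The benefit is the same as in the original: the per‑sample hot path never branches on resolution, because the choice was made once at startup.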

What this module really achieves

On the surface, i2s_init() is “just” an I2S startup routine. But in practice, it:

- Treats MCLK/BCLK/LRCLK as a coherent, measurable clock system, not a black box.

- Uses an ideal clock model to define what “correct” means.

- Re‑searches PLL dividers to get closer to that ideal, instead of settling for the default.

- Aligns slot timing with the PCM5102A’s expectations.

- Binds data‑path functions to the active resolution, keeping the processing chain consistent.


This is the moment where the ESP32‑S3 stops being “a microcontroller that can output I2S” and starts behaving like the core of a Pro Audio–grade USB DAC engine.

If you’d like, next time we can zoom in on either:

- the USB feedback / fs counter side, or

- the data‑path functions (linearSRCxx, adjust_volumexx, calc_rmsxx) and how they interact with this clock model.


I2S_STD_CLK_DEFAULT_CONFIG(rate) — The Hidden Trap

This is an easy one to overlook, yet it directly affects the most important aspect of audio quality: clock integrity.

Inside this macro, there is a silent landmine:

  • .bclk_div = 8,

Yes: a fixed BCLK divider of 8 is hard‑coded into the default configuration.

The ESP32‑S3’s I2S PLL is flexible and capable of generating clean clocks, but this one parameter can undermine it.

If you don’t override this value before calling i2s_channel_init_std_mode(), the divider chain may not match the ideal MCLK/BCLK ratio computed in i2s_init(), and the resulting MCLK can land off target. (The ESP‑IDF header comment describes bclk_div as taking effect in slave role, so it is worth confirming how your IDF version treats it in master mode.)

This is a trap that can lead people to “hear jitter” in the ESP32’s I2S output.

If you’ve ever felt something was “off” in the PLL behavior, try setting bclk_div to the mathematically ideal value, as i2s_init() does with g_cnt_ideal.bclk_divn.


Initial I2S Bring‑Up Check

Some parts of i2s_init() aren’t needed at this stage, so ignore the deeper logic for now.

Once initialization is complete, call:

  • i2s_channel_enable(tx_handle);

You don’t even need to call i2s_channel_write().

As soon as the channel is enabled, the I2S engine begins outputting:

  • MCLK
  • BCLK
  • LRCK

So you can immediately probe the pins with an oscilloscope.
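Before probing, it helps to know what numbers to expect. With 32‑bit stereo slots and the macro’s default 256× multiple, a 96 kHz stream should show LRCK at 96 kHz, BCLK at 6.144 MHz, and MCLK at 24.576 MHz. The two helpers below are a hypothetical convenience for turning a scope or counter reading into a period and a deviation in ppm; they are not part of the project.

```c
/* Period of a clock in nanoseconds, for reading against the scope timebase. */
double period_ns(double freq_hz)
{
    return 1e9 / freq_hz;
}

/* Deviation of a measured frequency from the expected one,
 * in parts per million (positive means the clock runs fast). */
double ppm_error(double measured_hz, double expected_hz)
{
    return (measured_hz - expected_hz) / expected_hz * 1e6;
}
```

As a reference point, one MCLK period at 24.576 MHz is roughly 40.7 ns, and an LRCK reading of 96009.6 Hz would be about 100 ppm fast.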


LRCK Observation

And there it is—

a beautifully clean LRCK edge at 96 kHz.



This is the moment you realize:

The ESP32‑S3 doesn’t need an external clock module.

The internal PLL can absolutely deliver—

at least at these frequencies.


For mid‑range sample rates, the S3’s internal clocking is surprisingly competent.


MCLK Observation

Next, let’s look at the MCLK waveform.


At first glance you might think:

“Wait… is this analog?”


The waveform looks barely within the oscilloscope’s detection threshold.

The XE‑702S has a detection limit around 20 MHz,

and MCLK is slightly above that, so the scope is struggling.

Despite that, the PCM5102A accepts the signal without complaint.

The DAC is far more tolerant than the oscilloscope.

So for now, we trust the hardware and move forward—

assuming the scope is the one hitting its limit, not the S3.