Improvements in latency? #201

rerdavies · 2024-09-21T22:53:16Z

rerdavies
Sep 21, 2024
Maintainer

Stable, no under-runs, with plugins using 85% of available CPU, 64x3 buffer configuration, MOTU M2 USB audio adapter!

I'm not sure when it happened, but all of a sudden I seem to be getting it.

I was doing some testing on changes made to audio handling. Hoping to cause ALSA audio to get upset, I added a TooB NAM plugin AND a TooB ML plugin with one of the large Proteus models to the same preset. And it ran stably with NO under-runs! At 85% CPU use, with a 64x3 buffer configuration. (It also seems fine at 32x3 and 32x4.)
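To put the 85% figure in perspective, here is a rough back-of-the-envelope sketch of the time budget per audio buffer. It assumes a 48 kHz sample rate (the post doesn't state one) and treats the CPU percentage as the average fraction of each buffer period spent in plugin processing, which is a simplification; `period_budget` is a hypothetical helper, not part of PiPedal.

```python
def period_budget(period_frames, sample_rate_hz, cpu_load):
    """Return (period_ms, busy_ms, slack_ms) for one audio buffer period.

    cpu_load is the assumed average fraction of the period spent processing.
    """
    period_ms = 1000.0 * period_frames / sample_rate_hz
    busy_ms = period_ms * cpu_load
    return period_ms, busy_ms, period_ms - busy_ms


# Assumption: 48 kHz sample rate, 85% average load.
for frames in (64, 32):
    period_ms, busy_ms, slack_ms = period_budget(frames, 48000, 0.85)
    print(
        f"{frames} frames: period {period_ms:.3f} ms, "
        f"busy {busy_ms:.3f} ms, slack {slack_ms:.3f} ms"
    )
```

Under these assumptions a 64-frame buffer leaves roughly 0.2 ms of slack per period on average, so any extra scheduling jitter has to be absorbed by the additional buffers.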

There is a fix in TooB NAM that prevents memory allocations in the audio thread (a bug in the underlying Neural Amp Modeler library, which was fixed and pushed back upstream to Steven Atkinson's project), which allows NAM to run with less variability in each audio frame.

There were a bunch of updates to audio subsystems and drivers in a recent Raspberry Pi OS release. I'm wondering whether those changes account for the difference. One very reasonable candidate: updated device drivers for non-audio devices that have full PREEMPT_RT patches applied.

At any rate, I am quite amazed by this. I have never seen audio workstations on any OS running stably with 85% plugin CPU use.

So, I'm asking users of PiPedal to give me some feedback on what kind of CPU use works for you while still not under-running, as well as what kinds of buffer configurations are working for you, when you're using the latest release of PiPedal with full Raspberry Pi OS updates applied.

I'm also curious about whether I should be allowing things like 16x6 or 16x8 buffer configurations. These sorts of buffer configurations do work, and at first glance, they seem to be surprisingly stable. And I do have reason to think that 32x4 might actually be more stable than 64x3, while providing even lower latency. But I haven't yet done sufficient testing to make that an actionable point. So I'm asking.

In passing, so you know: graphics operations in Bookworm have a dire effect on audio stability. Much more so than in Buster. Just moving the cursor will cause under-runs on my system, even with large audio buffers. The solution: run headless, or disconnect your HDMI cable. Bookworm seems to shut down the GPU if there's no display attached, which is a very good thing for PiPedal. That may not be surprising to you, but it is quite a big change from previous versions of Raspberry Pi OS. I suspect that the problem is more than one of relative priority of GPU and audio interrupts, and may have more to do with the fact that graphics operations put a heavy load on the CPU's L2 memory caches. The heavyweight plugins (ML, NAM, and the convolution plugins) are carefully optimised to make best use of the CPU's L1 and L2 memory caches. So even low-priority processes can and will cause audio under-runs if they do big flat memory operations that invalidate the caches being used by real-time audio.

38github · 2024-11-25T21:04:10Z

38github
Nov 25, 2024

I use a RPi4 and can run two instances of NAM (one STANDARD and one LITE or FEATHER) plus IR, expander, split, EQ, delay with 64/2.

I also tested the Radxa Zero 3W with one STANDARD NAM, which used around 80% CPU but got xruns when using 64/2. I have not tried increasing to 3 or 4 yet. The board gets very hot even with a heat sink. The WiFi on it is really bad and doesn't work well with 2.4GHz in my tests. It creates a lot of rxfrag errors that hog the CPU and journald. As long as WiFi is not used it works quite well.

On a Libre La Frite I can use one FEATHER NAM and lots of additional effects at 64/2. It also does not get hot and has been very stable, and most of it works off of the mainline kernel.

PiPedal has been incredibly stable on my devices, while MODEP used a lot of CPU, which caused xruns even on an RPi4 and also an RPi5, I think.

rerdavies · 2024-11-25T22:56:00Z

rerdavies
Nov 25, 2024
Maintainer Author

@38github:

Buffer size

FYI, I think 64x2 is not a good choice of buffer size. 32x4 will definitely give you fewer underruns, with equivalent or better latency. I do know that I did release a version of PiPedal that defaulted to 64x2 buffers, but it quickly became apparent to me that this was a very bad choice. PiPedal currently defaults to 64x3 buffers, but I have suspected for quite some time that it should actually be defaulting to 32x4. I'm pretty sure this is true even on very lightweight hardware. Unfortunately, I've been chasing higher-priority issues for a while, so I haven't had the luxury of being able to play as much as I should before making a fairly risky change. There's even some reason to think that 16x5 or 16x6 buffer configurations might be a good idea.

I still don't have a firm understanding of how audio buffer configuration works on Linux. There seems to be lots of lore, and not a whole lot of detail. My current best understanding is that buffer size primarily affects PiPedal's software rendering, and the number of buffers affects how much breathing room ALSA has to keep the hardware fed. Under the covers, ALSA is mostly chasing buffer pointers for USB (or I2S), and doesn't really care what the buffer sizes are, just how many bytes of data it has to play with. In point of fact, the ALSA driver doesn't even get told what the buffer size is, just the value of SIZE x NUMBER OF BUFFERS.

There is deep lore that says that buffers should be a multiple of 48 bytes for USB audio devices. This may have been true for USB 1.0 devices, but I think it is no longer true for USB 2.0+ devices, which have a more efficient way to transfer bulk data. So I sincerely believe that piece of lore should be retired, except for exceptionally ancient hardware.

So I think... 32x4 is much better because PiPedal needs one buffer in which to process input, which gives the OS up to 32x3=96 samples of data to feed the hardware with. At any given time, ALSA is feeding the hardware with some portion of that buffered data, but it can release at least two of the three buffers back to PiPedal so that PiPedal can start filling them again. So: 1 buffer for PiPedal, 1 buffer to feed the hardware, and 2 buffers to keep things running smoothly.

In the 64x2 case, PiPedal needs one buffer to fill, but it can't get access to the next buffer until the OS has finished transferring the last byte of the other buffer to hardware. So: 1 buffer for PiPedal to fill, and some potentially very small lead time between the time that the hardware transfer completes and the time that PiPedal gets to start filling a new buffer. A disaster waiting to happen! For the 64x2 case, there's no spare buffer. PiPedal has to process the entire buffer in the interval between when the hardware releases the end of the previous buffer and when it starts requesting data for the start of the next buffer.

(Omitted for the sake of simplicity: input and output each get 32x4 buffers, so the same general argument holds for input buffers if you were just reading input data. In actual fact, input and output transfers are locked together, so it's not clear how many of the input buffers actually get used.)
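The arithmetic behind that comparison can be sketched as follows. This is only a toy model of the reasoning above, not a description of ALSA internals: it assumes a 48 kHz sample rate (not stated in the thread), counts in frames (the "samples" above), and assumes that exactly one buffer at a time is being filled by PiPedal. `BufferConfig` and `to_ms` are hypothetical names.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 48000  # assumption: the thread does not state a sample rate


@dataclass(frozen=True)
class BufferConfig:
    frames: int  # frames per buffer (the "32" in 32x4)
    count: int   # number of buffers (the "4" in 32x4)

    @property
    def total_frames(self) -> int:
        # Total buffered audio, which bounds the buffering latency.
        return self.frames * self.count

    @property
    def headroom_frames(self) -> int:
        # Audio left to feed the hardware while one buffer is being filled.
        return self.frames * (self.count - 1)


def to_ms(frames: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    return 1000.0 * frames / sample_rate_hz


for cfg in (BufferConfig(64, 2), BufferConfig(64, 3),
            BufferConfig(32, 4), BufferConfig(16, 6)):
    print(
        f"{cfg.frames}x{cfg.count}: "
        f"total {cfg.total_frames} frames ({to_ms(cfg.total_frames):.2f} ms), "
        f"headroom {cfg.headroom_frames} frames ({to_ms(cfg.headroom_frames):.2f} ms)"
    )
```

In this model, 32x4 and 64x2 both buffer 128 frames in total, but 32x4 leaves 96 frames of headroom against 64 for 64x2, while 64x3 buffers 192 frames. That is consistent with the claim of "equivalent or better latency" with fewer underruns.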

Problems with Wi-Fi

One of the nice things about the Pi 4 and above is that WiFi and USB run on separate buses. On older Pis, the WiFi device appears as a USB device, and shares an internal USB bus with USB audio. If your troublesome devices have USB 2.0 AND USB 3.0 connectors, you might want to experiment with using either the USB 2.0 or USB 3.0 ports, which may get their own dedicated buses and controllers. On my Pi 4, I take great care to ensure that my SSD drive goes on a USB 3.0 port, and my USB audio device goes on the USB 2.0 bus (so that it doesn't share an internal bus with the SSD).
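One way to check which bus each device actually landed on is to read the Linux sysfs USB tree. The sketch below is a minimal, hypothetical helper (`usb_devices` is not part of PiPedal) that assumes the standard `/sys/bus/usb/devices` layout, where each device directory exposes `busnum`, `speed` (in Mbit/s) and, usually, `product` files. Whether two bus numbers share an internal controller depends on the board, so treat the output as a starting point only.

```python
from pathlib import Path


def usb_devices(sysfs_root="/sys/bus/usb/devices"):
    """Return a sorted list of (busnum, speed, product) tuples.

    speed is the string reported by the kernel in Mbit/s (e.g. "480").
    Returns an empty list if the sysfs directory does not exist.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []

    def read(dev, name, default):
        f = dev / name
        return f.read_text().strip() if f.is_file() else default

    devices = []
    for dev in root.iterdir():
        # Interface entries such as "1-1:1.0" have no busnum file; skip them.
        if not (dev / "busnum").is_file():
            continue
        devices.append(
            (int(read(dev, "busnum", "0")),
             read(dev, "speed", "?"),
             read(dev, "product", dev.name))
        )
    return sorted(devices)


if __name__ == "__main__":
    for busnum, speed, product in usb_devices():
        print(f"bus {busnum}: {product} ({speed} Mbit/s)")
```

If the audio interface and the SSD show up under the same bus number, moving one of them to a different port type is worth trying.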

BorisSutin · 2024-11-27T14:35:50Z

BorisSutin
Nov 27, 2024

I use 32x4 on a Pi 5 with a hardware codec, and it is more stable than 64x2. The difference in latency is minimal. If you use 16, the sound with neural plugins starts to disappear completely. Apparently the network window can't be larger than the buffer.
