Is it really possible to get down to 0 Xruns?

Thanks for reviewing my post!

Some of this could give other people the wrong idea.

That is what I want to avoid!

What did you edit and what’s in them? JACK complains if it can’t run realtime threads, so your user must be given the ability to run realtime threads somewhere. Cadence shows you if your user is in the audio group under ‘System Checks’. I bet your user is in the audio group. :smiley:

I raised the realtime priority limits in the file and experimented with values higher than rtprio 90 to avoid xruns. That is what I meant by the action "limits.conf/audio.conf".
@merlyn: Thanks for pointing this out. Definitely confusing. Is it clearer now?

Additionally (you’re right) there is something else that has to be made clearer, which I had thought of as “a standard procedure” (since it fixes a general error rather than reducing xruns, although it may reduce them too):
I gave my user realtime priorities by adding the following lines (found here) to the audio.conf.

@audio - rtprio 90       # maximum realtime priority
@audio - memlock unlimited  # maximum locked-in-memory address space (KB)

and my user was in the audio group. :innocent:
I was never happy with it since there are reasons to avoid this (Biggest problem for me: audio card is blocked for other users that are logged in - might also be an advantage). I then thought if only my user needs realtime capabilities, then only that user should get these rights and not the rights of the whole audio group. Therefore I added only my user to the limits.conf:

@tobias - rtprio 90       # maximum realtime priority
@tobias - memlock unlimited  # maximum locked-in-memory address space (KB)

and removed my user from the audio group. :upside_down_face:
Current setup: I’m not in the audio group, but I made a change to the vanilla Debian system and that is of significant importance here: I gave my user realtime capabilities.

It sounds like you made a mess of that. :smiley: A realtime kernel is probably unnecessary for you, but it doesn’t increase CPU load like you have described. I am running a realtime kernel and my idea of low latency is 96kHz with a 32 buffer and 2 periods with JACK in synchronous mode. Not that my computer is powerful enough to do much with that but something light like Carla with Tal Noize Mak3r runs without Xruns at ~8% DSP load. The latency would have to go up to do anything more elaborate.

My fault. I most likely did not get a higher CPU load (I can’t remember, to be honest) - what I wanted to say is that I got a higher system load. Where did I get that information from? cpufreq (I cannot provide a link to GNOME extensions - not allowed) showed it. Does that make more sense to you?

Did you set up GRUB and reboot and all that?

I did set up GRUB again (sudo update-grub), rebooted the system and chose the RT kernel in GRUB.
Since it is fun and easy to do I might test the RT-kernel once again to make sure I did not do anything wrong here.

BTW: Pretty nice latency values. I never tried to go down that low.

Note that increasing the value in limits/audio.conf only increases the limit that you are allowed to set, so there will be no change in behavior unless you change the RT priority that your software is attempting to use. So if you had the jackd or ardour RT priority configured as 70, and increased the limit from say 80 to 90, but did not change the jackd or ardour configuration, then the audio software will still be running at RT priority 70. As the directory name implies the values in the files under the limits directory limit the maximum value you are allowed to set, they do not change the default value your software is using.

When you start jackd it will print the currently used RT priority to the console. I expect ardour would have a similar message if you open the log messages window.

You probably have to log out and log in again after making any changes to a configuration file which affects your user or group. Just adding for completeness in case someone runs across this thread.
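To illustrate the difference between the limit and the priority the software actually asks for, here is a minimal sketch. It assumes bash, whose `ulimit -r` builtin reports the maximum realtime priority allowed in the current session. `rtprio_allowed` is a made-up helper name for this example, not part of JACK, Cadence or PAM.

```shell
# Hypothetical helper: does a requested RT priority fit within the session limit?
#   $1 = priority the software is configured to request (e.g. 70)
#   $2 = limit reported by `ulimit -r` (a number, or "unlimited")
rtprio_allowed() {
    requested=$1
    limit=$2
    if [ "$limit" = "unlimited" ] || [ "$requested" -le "$limit" ]; then
        echo "ok: priority $requested is within limit $limit"
    else
        echo "too high: priority $requested exceeds limit $limit"
        return 1
    fi
}

# Usage in bash, after logging out and in again:
#   rtprio_allowed 70 "$(ulimit -r)"
```

If the limit printed by `ulimit -r` is still the old value, the edited file has probably not been picked up by the session yet.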

Yes, important to know. Setting the priority limit was not the only thing I did. I raised the priority for jack set up in cadence. I played around with the priorities and checked the result with the

ps axHo user,lwp,pid,rtprio,ni,command

command to compare the priorities of the jack threads (and the threads of the software which jack started) with the other threads on the test system. I can remember that I tested values of the jack priority up to 98 or 99. :flushed:
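The full ps listing is long, so it can help to keep only the threads that actually have a realtime priority. A minimal sketch, assuming the column order of the ps command above (RTPRIO is the fourth column); `rt_threads` is just an illustrative name:

```shell
# Keep the header plus every thread with a realtime priority,
# highest priority first. Reads the ps listing on stdin.
rt_threads() {
    IFS= read -r header              # first line is the column header
    printf '%s\n' "$header"
    # Column 4 is RTPRIO; a "-" there means the thread is not realtime.
    awk '$4 ~ /^[0-9]+$/' | sort -k4,4nr
}

# Usage:
#   ps axHo user,lwp,pid,rtprio,ni,command | rt_threads
```

This only sorts what ps reports. It does not say whether a given priority is a sensible choice.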

Absolute values are irrelevant. SCHED_FIFO simply runs threads with a higher value first. It’s a simple arithmetic sort.

Have you also configured the thread priorities after enabling this, e.g. with rtirq?

A really cutting edge link that is. :smiley: You could avoid the audio group so nobody logs in and takes over your soundcard. Do you find that happens often? :smiley:

Using ALSA MIDI with ALSA sequencer requires your user to be in the audio group. It’s the sort of thing where the system will work until you try to use ALSA MIDI then it won’t work and you may be left scratching your head. That’s one reason Cadence checks. If you type

$ sudo find / -group audio

into a terminal you will see all the files and directories that belong to the audio group, which is what you need to be in the audio group to access.
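For a quicker check than searching the whole filesystem, something like the following may do. It is a sketch: `in_group` is a hypothetical helper, and it assumes `id -nG` prints the current user's group names separated by spaces.

```shell
# Succeeds if group name $1 appears in the space-separated group list $2.
in_group() {
    # $2 is deliberately unquoted so each group name lands on its own line.
    printf '%s\n' $2 | grep -qxF -- "$1"
}

# Usage:
#   in_group audio "$(id -nG)" && echo "user is in the audio group"
# Device nodes owned by the audio group can be listed with a narrower search:
#   find /dev -group audio 2>/dev/null
```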

And more of the actions you listed are like that – they’re preventative rather than aimed at fixing an Xrunning system. Disable WIFI – recommended. Swappiness – recommended. CPU governor to performance – recommended.

My values were higher than those of nearly all other software running on that machine. The exceptions were the watchdog, migration and the rtkit-daemon. So I compared the values of my software with the other software on the system. (Used ps axHo user,lwp,pid,rtprio,ni,command)
Since values around 70 didn’t reduce xruns, I raised the values up to the highest values I was able to find on my system. (That’s dangerous, I know, but it worked for that test. Currently, with the low-latency kernel, I have reverted to an RT priority of 10, the standard value in Cadence.)
Let me know if this could be done better or different.

After testing the initial version of the file rtirq.conf I edited the file as stated in the [Linux audio system configuration](https://wiki.linuxaudio.org/wiki/system_configuration#rtirq) according to my needs.

Can you provide a more recent link regarding that topic?
My girlfriend was unable to get sound (using another sound card? - I don’t know anymore) when she was using my computer, because I was in the audio group and was logged in in parallel. (So it happened, but we don’t use the PC in parallel currently, so it is not an issue anymore.)

I currently use MIDI in my sessions by connecting some software with midi connections. Is that something that shouldn’t work or do you mean something else?

That’s fine - I just wanted to say that my experience is that they don’t really help me currently - maybe I’ll come back to these actions in the future. (That might be interesting to others)

I concur! A couple of weeks ago I reinstalled Linux after upgrading my 7200rpm drive to a Samsung Evo SSD and this is all I did to customize the antiX install aside from making cadence happy with being in an audio group etc. Compared to what I had to do to make Windows 10 happy with the same SSD re-install situation, this was heaven.

This is a good thread to share a recent experience. For years I had run KXStudio 14.04 – a distro based on Ubuntu 14.04. I had run it first on an Athlon based system, then moved the disk over to an FX-8350 system. Everything ran smoothly – no xruns or anything. In this new system I had an add-on 6Gb/s SATA card, because the ports on the board were 3Gb/s. But everything worked fine.

When FalkTX got busy on KXStudio again, I figured I’d make my own distro. I started with Kubuntu 19.10 and used a “spin your own distro” tool that kind of runs the distro in a jail while you add the repos etc. So I added the low-latency kernel, all the cool kxstudio stuff, ardour of course etc. Made myself an ISO … and installed it on a fresh disk.

I had all sorts of trouble, and I mean all sorts.

Meanwhile, I do some audio work on Windows 7, and that had never given me trouble either. But I did a fresh Windows 10 load on a drive and installed all my proprietary node-locked software and stuff.

Both the new Linux AND the new Windows (same computer for both – I have a gizmo where I just pop in the drive I need) had crackles, pops, xruns etc etc etc.

There were two problems.

The first is that I needed to set the setting in Cadence for the processor to always be at top speed. I had to do the exact same thing in Windows 10, which was tricky to find.

But I also had to remove that add-on SATA card because for some strange reason with both the latest Windows and the latest Linux kernel, it really messed up the USB bus and I use a USB sound interface.

Anyway, I thought y’all might find that interesting!

Shared IRQ seems a likely culprit. Might have been enough to move the SATA to a different slot.

Obviously not really recommended for most people; you need to know what you are doing for it to be successful, and I am not sure that some of those ‘roll-your-own’ tools are smart enough to allow us to fix everything that is needed. You are likely to have better luck using a distro close to it and customizing it.

   Seablade

Hi Seablade – that’s essentially what I did. I used a tool called “cubic” to make a customized kubuntu that already had all the cool Kxstudio stuff plus all the other stuff I wanted, all configured correctly, etc.

I suspect that moving the card would have been sufficient, but the SSDs could only, on the best day, move data at about 320 MB/s, which is less than 3Gb/s … so the 6Gb/s card wasn’t buying me anything anyway. What is odd is that before the upgrades to the 5.0+ Linux kernel and Windows 10, it never gave me trouble.

There is an update to my previously shared experience. (Is it really possible to get down to 0 Xruns?)

After more intensive testing there are still xruns when

  • the WIFI state is changed (ON/OFF) - always reproducible, and
  • during really long sessions (I started testing for 6+ hours overnight) there are a few (about 100) xruns which appear at one certain point in time (midnight :wink:). It seems the regeneration of man pages took place then and caused a sudden load on the system.

Another update is that I achieve similar results with the XanMod RT kernel as with the Liquorix kernel. (This is positive because both reduce xruns to zero for long sessions (three hours, as explained in my first comment).)

That’s it. I just wanted to give feedback immediately because we are talking about zero xruns here and for my shared experience this is no longer true when testing longer.
I think I’ll come back with updates since I want to test the actions of my first post with these two kernels to get better results in the long duration tests. I also need to find out more about when the xruns appear and how reproducible they are.

( I cannot edit the old post - but I replied to it so everyone finding it should find my update)

Some wifi drivers do that. Sometimes there is a different kernel module for the same wifi chip that works better. Otherwise, try a different wifi device or leave it powered off.

To get rid of sudden CPU usage… turn off cron while doing lowlatency work. Cron is supposed to run as a “nice” process but a high enough load is still a high load. I think there are some things done in the update process that end up running atomic.

So long as these things (that cause xruns) can be done when you are not doing recording/playback/audio that is not really a problem.

I have run machines for over 24 hours, even at a forced 800 MHz instead of full speed, with 0 xruns, but that was with no wifi and cron turned off, as well as making sure the audio device was properly prioritized. A steady speed is more important than a high speed. Audio seems to tolerate the speed of the CPU core going up better than going down, so the xruns happen as the CPU load goes away and ondemand or powersave lowers the speed (boost will also do this). Definitely not plug and play…
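To see whether every core is really held at one governor, the cpufreq files in sysfs can be summarised. A minimal sketch: `governor_summary` is an illustrative name, and the optional directory argument exists only so the function can be pointed at a copy of the tree.

```shell
# Print how many CPUs use each cpufreq governor.
#   $1 = sysfs CPU directory (optional, defaults to the real one)
governor_summary() {
    cpu_root=${1:-/sys/devices/system/cpu}
    cat "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null \
        | sort | uniq -c \
        | awk '{ print $2 ": " $1 " CPU(s)" }'
}

# Usage:
#   governor_summary
```

A single line such as `performance: 4 CPU(s)` would mean all four cores share the same governor. More than one line means the cores are not configured alike.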

I can also use a wired connection - not ideal in the living room - but possible. Another idea is to cache my learning videos and play them in jack using vlc.

I should do this to find out if this is the cause - good idea I’ll try that.

Already read about that in a github issue of cpufreq. I ran the XanMod kernel which set the CPU to steady AND high and I left it there. :slight_smile: But this is good to know. I found out that I cannot get xruns by using stress-ng --cpu 4 (so sudden CPU stress seems to be no problem), but I got xruns by filling up memory with stress-ng --brk 1 -t 5. I don’t know if having xruns when memory is full is okay or not.

I think I have to go through the measures of “wiki.linuxaudio org - system_configuration” again, since I reverted all of these things because they did not help with the stock kernel. That means, for example, that jack is running at realtime priority 10.

That means once it gets run it cannot be interrupted and blocks jackd or an audio interrupt? There are 4 CPUs. jackd or the interrupts may get another CPU if there are no higher priority processes running, and so on.

Can you tell me how you do that?
I know about the options in the system_configuration of the linux wiki. I mentioned that in a post before:

The wiki links to http://subversion.ffado.org/wiki/IrqPriorities where there are more details about irq prioritizing. I have a problem in step 2 since I do not see so many entries in my list as they have in their list.

Step 1

tobias@tobias-pc:~$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:          6          0          0          0   IO-APIC   2-edge      timer
  8:          0          0          1          0   IO-APIC   8-edge      rtc0
  9:          0          0          0          0   IO-APIC   9-fasteoi   acpi
 16:     170413          0          0          0   IO-APIC  16-fasteoi   ehci_hcd:usb1, ath9k, cx88[0], cx88[0], cx88[0]
 17:          0          0          0        911   IO-APIC  17-fasteoi   snd_hda_intel:card2
 18:          0          0          0          0   IO-APIC  18-fasteoi   i801_smbus
 19:         52          0          0          0   IO-APIC  19-fasteoi   snd_ice1724
 23:          0          0          0         33   IO-APIC  23-fasteoi   ehci_hcd:usb4
 24:          0          0          0          0   PCI-MSI 1572864-edge      enp3s0
 25:          0      67837          0          0   PCI-MSI 512000-edge      ahci[0000:00:1f.2]
 26:          0          0      76501          0   PCI-MSI 327680-edge      xhci_hcd
 27:     172837          0          0          0   PCI-MSI 524288-edge      nvkm
 28:          0         17          0          0   PCI-MSI 360448-edge      mei_me
 29:          0          0        903          0   PCI-MSI 442368-edge      snd_hda_intel:card0
NMI:         10          9         10         10   Non-maskable interrupts
LOC:    4538007    4547056    5124427    4138157   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:         10          9         10         10   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          0          0          0          0   APIC ICR read retries
RES:      23675      22253      20577      19001   Rescheduling interrupts
CAL:      27139      22127      29797      23675   Function call interrupts
TLB:      22821      18400      25576      19394   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
DFR:          0          0          0          0   Deferred Error APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:          3          4          4          4   Machine check polls
HYP:          0          0          0          0   Hypervisor callback interrupts
HRE:          0          0          0          0   Hyper-V reenlightenment interrupts
HVS:          0          0          0          0   Hyper-V stimer0 interrupts
ERR:          0
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
NPI:          0          0          0          0   Nested posted-interrupt event
PIW:          0          0          0          0   Posted-interrupt wakeup event

Step 2

tobias@tobias-pc:~$ ps -eLo pid,cls,rtprio,pri,nice,cmd | grep -i "irq"
    9  TS      -  19   0 [ksoftirqd/0]
   17  TS      -  19   0 [ksoftirqd/1]
   22  TS      -  19   0 [ksoftirqd/2]
   27  TS      -  19   0 [ksoftirqd/3]
  359  FF     50  90   - [irq/28-mei_me]
 9864  TS      -  19   0 grep -i irq

I have a USB audio interface (focusrite Scarlett 2i4).

Are you sure that is the right filename? I think rtirq these days uses /etc/default/rtirq anyway…

and it appears you have an internal audio and an ice1724 as well. Do you also use these?
Looking at your interrupts I see two USB2.0 buses: ehci_hcd:usb1 and ehci_hcd:usb4. The USB1 bus should absolutely not be used for audio work as it is on irq16 along with 4 other devices. Yet it appears that is where the USB sound device is plugged in. Try plugging your USB audio device into any other USB plug until you get it in USB4. I see there is also an xhci process which will have more usb buses… I would not use those either.
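A shared IRQ line like that can be picked out of /proc/interrupts automatically. The following is a minimal sketch that assumes devices sharing a line are listed comma-separated at the end of it, as in the Step 1 output. `shared_irqs` is just a name chosen for this example.

```shell
# List numbered IRQ lines that are shared by more than one device.
# Reads /proc/interrupts on stdin.
shared_irqs() {
    awk '$1 ~ /^[0-9]+:$/ && /, / {
        n = split($0, dev, ", ")           # devices are separated by ", "
        m = split(dev[1], head, " ")       # first chunk still holds the counters
        dev[1] = head[m]                   # its last word is the first device
        gsub(/[ \t]+$/, "", dev[n])        # drop trailing blanks
        list = dev[1]
        for (i = 2; i <= n; i++) list = list " " dev[i]
        irq = $1; sub(/:$/, "", irq)
        print "IRQ " irq " shared by " n " devices: " list
    }'
}

# Usage:
#   shared_irqs < /proc/interrupts
```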
Having said that, assuming you will find a way to get your USB audio plugged into USB4 (and have made sure nothing else is), in /etc/default/rtirq there is a line:
RTIRQ_NAME_LIST="snd usb i8042"
There is some debate if the i8042 needs to be there at all, but seeing it is last that is fine.
Using "snd" or "usb" in here will not help and may even make things worse. You may be able to get away with "usb4 snd_ice snd_hda i8042 usb". Then reboot and run in a terminal: /etc/init.d/rtirq status. You should find ehci_hcd:usb4 has the highest priority of all your USB ports, with the ice 5 below it, the hda 5 below that, and the rest of the USB stuff below that. If that does not work as expected, it may be necessary to use 23-ehci instead of usb4 in the line above. usb4 is better if it works because it will still work if you change the BIOS settings and the IRQ assigned to USB4 changes.

No more - I left them inside since I might use them again. I might get rid of them.

I tried all unused USB ports, but I only got the audio interface isolated on bus 3. Is that okay? It is a USB 3.0 port.

lsusb -t
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    |__ Port 4: Dev 2, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 2, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 2, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 2, If 3, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 2, If 4, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 2, If 5, Class=Vendor Specific Class, Driver=, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/8p, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/6p, 480M
        |__ Port 1: Dev 3, If 2, Class=Vendor Specific Class, Driver=btusb, 12M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=btusb, 12M
        |__ Port 1: Dev 3, If 3, Class=Application Specific Interface, Driver=, 12M
        |__ Port 1: Dev 3, If 1, Class=Vendor Specific Class, Driver=btusb, 12M
        |__ Port 2: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 2: Dev 4, If 2, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 2: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 5: Dev 5, If 0, Class=Vendor Specific Class, Driver=usbfs, 12M
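Reading the bus number out of that tree by eye works, but when trying port after port a small filter saves time. This sketch assumes the `lsusb -t` layout shown above (root hub lines start with `/:` and audio interfaces carry `Class=Audio`). `audio_usb_bus` is an illustrative name.

```shell
# Print the USB bus number(s) that carry an audio-class interface.
# Reads `lsusb -t` output on stdin.
audio_usb_bus() {
    awk '/^\/:/ { bus = $3; sub(/\.Port$/, "", bus) }   # remember current root hub
         /Class=Audio/ && bus != last { print "Bus " bus; last = bus }'
}

# Usage:
#   lsusb -t | audio_usb_bus
```

For the listing above this reports bus 03, the xhci_hcd root hub running at 480M.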

I need to find out the device names for the name list in rtirq next. (I don’t know where you get your suggestions from, but I would like to know where the names come from. The rtirq readme states:

The term service seems to refer to module names and sound device designations (so the output of lsmod and aplay -l respectively) and doesn’t have to correspond to the full output, part of the output may suffice as the rtirq script does the matching itself

)
Then prioritise them.

Can someone tell me how to prioritise a USB audio device correctly with rtirq?
I cannot see what to specify in rtirq config file that references the device I want to prioritise. I tried USB1 (For USB bus 1). Can I use snd_usb_audio? That seems to be the closest hit. However whatever I choose it doesn’t appear in the rtirq status.

That should be fine. The hard part is finding the service name. Looking at your IRQs above, xhci_hcd is on IRQ 26, so the process you are interested in is irq/26-xhci_hcd. rtirq does not need the whole thing, just a unique part like 26-xhci. So in /etc/default/rtirq at about line 30 you will find RTIRQ_NAME_LIST="rtc snd usb i8042" or something like that. You want to change that to RTIRQ_NAME_LIST="26-xhci snd-ice snd-hda i8042 usb". That is, you want the USB bus your audio device is on to have the highest priority. The ICE1724 should be next and the HDA after that, because HDA could interfere with the others (it has a higher minimum latency). I add usb again right at the end because I have seen times where, if one USB port is raised to say 90, all the rest end up right under it rather than at 50. Putting it at the end makes sure they are lower than everything else. Please note that anything else plugged into any USB 3.0 port may affect your audio. Even though you have two USB 3.0 buses, I only see one xhci (USB3) process.
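The IRQ number for a driver can be looked up rather than read off by hand. This is only a sketch for building a candidate name of the form number-driver. It does not reproduce rtirq's own matching, and `irq_token` is a made-up helper name. Kernel thread names can be cut short, so a shorter unique prefix such as 26-xhci may be the safer thing to put in the list.

```shell
# Print "<irq>-<driver>" for every numbered IRQ line mentioning the driver.
#   $1 = driver name as it appears at the end of a /proc/interrupts line
# Reads /proc/interrupts on stdin.
irq_token() {
    awk -v drv="$1" '$1 ~ /^[0-9]+:$/ && index($0, drv) {
        irq = $1; sub(/:$/, "", irq)       # "26:" -> "26"
        print irq "-" drv
    }'
}

# Usage:
#   irq_token xhci_hcd < /proc/interrupts
```

Since the number can change between boots or after BIOS changes, it is worth re-running this before editing the list.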
Also note that all motherboards are different. Finding a good setup for one does not just map to another. The minimum latency varies from system to system. I do not think there are any motherboards that are designed from the ground up for low latency (possibly some of the Compaq server setups… but they tend to optimize around network rather than audio). There is just not enough demand, and throughput is so much easier to measure and advertise, while low latency means lower throughput, which to most PR people is bad.

I cannot really get a specific IRQ process to be highest priority. Do you know how to solve this? If not I’ll ask the devs of rtirq.
With RTIRQ_NAME_LIST="29-xhci snd-ice snd-hda i8042 usb" I end up with:

/etc/init.d/rtirq status:
  PID CLS RTPRIO  NI PRI %CPU STAT COMMAND    
  120 FF      70   - 110  0.3 S    irq/16-ehci_hcd    
  122 FF      70   - 110  0.4 S    irq/29-xhci_hcd    
  121 FF      69   - 109  0.0 S    irq/23-ehci_hcd    
  105 FF      50   -  90  0.0 S    irq/9-acpi    
  123 FF      50   -  90  0.0 S    irq/8-rtc0    
  220 FF      50   -  90  0.0 S    irq/18-i801_smb 

So irq process 16 is still higher than 29, but why? Do they belong together from the hardware side?

IRQ 16 is not higher; it is the same (70 for both), just listed first. I do not know why this is, but I suspect some kind of motherboard thing. It would be interesting to try:
RTIRQ_NAME_LIST="29-xhci snd-ice snd-hda i8042 usb 16-ehci"
but I don’t have high hopes. It appears the designers of your MB have made some decisions that are less than stellar for low latency USB use. Being able to use the USB bus that goes to 23-ehci seems the best solution if it is possible. How does this affect how low a buffer size you can use without xruns?
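A tie like the one between IRQ 16 and IRQ 29 is easy to misread in the status listing, so here is a small sketch that prints only the IRQ threads sharing the highest realtime priority. It assumes the column layout of the rtirq status output shown above (CLS in column 2, RTPRIO in column 3, thread name last). `top_rt_irqs` is an illustrative name.

```shell
# Print the SCHED_FIFO threads that share the highest RTPRIO.
# Reads `/etc/init.d/rtirq status` output on stdin.
top_rt_irqs() {
    awk 'BEGIN { max = -1 }
         $2 == "FF" && $3 ~ /^[0-9]+$/ {
             prio = $3 + 0
             if (prio > max) { max = prio; n = 0 }   # new maximum: restart the list
             if (prio == max) names[++n] = $NF
         }
         END { for (i = 1; i <= n; i++) print max, names[i] }'
}

# Usage:
#   /etc/init.d/rtirq status | top_rt_irqs
```

If more than one line comes back, the audio bus is not alone at the top, which matches what the listing above shows.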