AI stem/source separation in Ardour

shantikari · January 14, 2026, 7:32pm

This is just a pipe dream post, wondering about possibilities for AI features in Ardour and difficulties in implementation.

AI stem separation is becoming a more common part of producer workflows, via third-party APIs like lalal.ai/LANDR (there are dozens of such web services) or even natively within DAWs like FL Studio and Logic.

Is there currently any discussion for implementing such features in Ardour?

I guess the most obvious difficulties are pricing and hardware limitations:

Pricing: With third-party APIs, costs would scale per request, so Ardour would have to limit calls or offer pricing tiers/subscriptions, as Matt Tytel did for Vital’s text-to-wavetable feature.
Hardware: Using open-source source separation models (like Spleeter or Demucs), there would be hardware (RAM/GPU) and cross-OS compatibility issues. (I guess platform-specific DAWs like Logic don’t worry about this.)
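
To make the "limit calls or pricing tiers" idea concrete, here is a minimal sketch of a per-user request quota. Everything in it is an assumption for illustration: the `SeparationQuota` class, the tier names and the limits are made up and do not correspond to any existing Ardour or third-party API.

```python
from dataclasses import dataclass


@dataclass
class SeparationQuota:
    """Hypothetical monthly quota for paid stem-separation requests."""

    monthly_limit: int  # requests included in the user's tier
    used: int = 0       # requests already spent this month

    def try_consume(self) -> bool:
        """Reserve one request; return False once the tier is exhausted."""
        if self.used >= self.monthly_limit:
            return False
        self.used += 1
        return True

    @property
    def remaining(self) -> int:
        return self.monthly_limit - self.used


# Hypothetical tiers: names and limits are invented for this example.
TIERS = {"free": 3, "supporter": 50}

quota = SeparationQuota(monthly_limit=TIERS["free"])
# Four attempts against a three-request tier: the last one is refused.
results = [quota.try_consume() for _ in range(4)]
```

The point of checking the quota locally before any network call is that a refused request costs nothing; a real service would still have to enforce the limit on the server side.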

x42 · January 14, 2026, 7:36pm

I had the idea to include Spleeter when it came out all those years ago.

spleeter depends on python (and various python libs), and bundling python with Ardour cross platform is where this project halted:

DHealey · January 14, 2026, 8:37pm

I hate trying to do anything with Python, even trying to get some programs to run in an isolated container can be a nightmare.

shantikari · January 14, 2026, 9:38pm

bundling python with Ardour cross platform is where this project halted

UVR and Mixxx devs implement Python AI libs by converting them to ONNX format… Would that route be plausible for Ardour development too? Or does it involve too much external dependency mess?

ccaudle · January 14, 2026, 10:32pm

Not exactly. ONNX is an interchange format for models; you still need a runtime to execute inference using the model weights. What the Mixxx project did was to extract the model parameters from Demucs, which were apparently intertwined with the code, into a separate model file which could be imported into a different runtime.
You still need to find an appropriate runtime inference engine to incorporate into the project.
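
A toy sketch of that split between a model file and a runtime, assuming nothing about how ONNX or Mixxx actually do it: the "model file" here is plain JSON holding only ops and weights, and `run_inference` is a hypothetical stand-in for an inference engine. A real setup would use an ONNX file and a real runtime instead.

```python
import json

# The "model file": pure data, no code. JSON is used only to keep the sketch
# dependency-free; it is not the ONNX format.
MODEL_FILE = json.dumps({
    "layers": [
        {"op": "linear", "weight": [[1.0, -1.0], [0.5, 0.5]], "bias": [0.0, 1.0]},
        {"op": "relu"},
    ]
})


def run_inference(model_json: str, x: list[float]) -> list[float]:
    """A toy 'runtime': interprets the ops listed in the model file."""
    model = json.loads(model_json)
    for layer in model["layers"]:
        if layer["op"] == "linear":
            # y = W x + b, one output value per weight row
            x = [sum(w * v for w, v in zip(row, x)) + b
                 for row, b in zip(layer["weight"], layer["bias"])]
        elif layer["op"] == "relu":
            x = [max(0.0, v) for v in x]
        else:
            raise ValueError(f"unsupported op: {layer['op']}")
    return x


output = run_inference(MODEL_FILE, [2.0, 3.0])
```

The same `MODEL_FILE` could be fed to any other program that understands the two ops, which is the property that makes an interchange format useful; the hard part is still choosing and shipping the runtime.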

x42 · January 15, 2026, 1:03am

OpenVINO™ is likely our best bet:

Although I’m yet to do my homework investigating how feasible this is for Ardour.

maybe libtorch can provide.

Lexridge · January 15, 2026, 3:25am

I built OpenVINO for Linux/Audacity last summer and it worked okay. It did pretty well when extracting two stems (music and vocals), but not so well at extracting four stems, which contained tons of artifacts. I would expect training my own models might have improved on this, but training models is extremely difficult.

GenGen · January 15, 2026, 9:02am

Speaking of Audacity, no time to research this right now, but leaving this as a mental note here.
Iirc, there was a stem separation V.A.M.P. Plugin. Gotta rush now, maybe someone else can shed light in the meantime.

Ljuba · January 15, 2026, 10:10am

There’s Demucs GUI (FOSS) available, and it does a very good job of separating tracks into 4 basic stems: drums, bass, vocals, other.
Where things get really interesting is at that “RipXDaw” (nonfree) level, where you can pretty clearly separate individual instruments.
The Demucs developer had the idea of doing a more in-depth version of his software (there’s a clip on YouTube of him speaking about it), but he hasn’t made/released it yet.
That’s the man (I think) who should be contacted about it.
Having a new, advanced version of Demucs inside Ardour would be awesome. (Then I could probably remix my first electronic music album that I did when I was still a high-school teenager.)

Majik · January 15, 2026, 10:37am

I don’t think this would be any better than Spleeter, which Robin has explored.

Just like Spleeter, Demucs is Python based and relies on a lot of Python libraries, which is extremely problematic to include in a binary distribution like Ardour.

But, unlike Spleeter, Demucs, which Demucs-GUI is dependent on, is abandonware.

Cheers,

Keith

Ljuba · January 15, 2026, 12:11pm

I see… So rewriting it all in C (if it’s even possible) would be a coding marathon, like another never-ending goodwill black hole.

jmantra · January 15, 2026, 1:59pm

Native stem separation in Ardour would be a cool feature to have. In the meantime, I have a project that uses Lua scripting and Demucs for four-stem separation, if you are interested:

GMaq · January 15, 2026, 3:09pm

FWIW,

I have a purchased copy of RipXDAW on my Windows partition. My intention was to use it mostly to help remix audio from old concert footage, and the results are heavily mixed, just like people report from the other solutions. Sure, on modern, already well-mixed material these tools can be impressive, but in my experience with live footage with audience applause, bands with both guitars and keyboards, or a string and horn section, things often end up worse than they started, with all kinds of new phase-y artifacts. I would hazard a guess that nobody is training the models on such old and difficult material, so even through several updates of the software the results have not really improved all that much. Suffice it to say I’m not getting Peter Jackson results…

That has been my experience with it so far, but I agree it’s going to pretty much be a required feature in the DAW future…

EZ4Stephen · January 15, 2026, 4:08pm

I’m not too familiar with AI stem separation outside of using Demucs [specifically, v4 ft (fine tuned)] and many other models on Ultimate Vocal Remover v5.6. Personally, I’d benefit as a user if it were implemented somehow in Ardour, but I assume finding the right set of processing methods and their relevant models would be tricky. And then UVR has advanced menus for each processing method and more. I wonder how an implementation in Ardour would appear…

While I also assume that almost always it’s used for remixing/sampling, so far I’ve personally used it for the purposes of

Trying to get a clearer copy of the vocals to recognize the words of a song (if I can’t find the lyrics)
Learning some drumming patterns
Trying to hear out and recognize chord patterns (or note patterns/intervals, if the sequence is faster than what I can understand without the efforts of separating the stem(s))

[For the latter point, I’ve also used NeuralNote sometimes on the separated stem, to try and help myself recognize some patterns, though I’m not fully accustomed to a MIDI pianoroll.]

Ljuba · January 15, 2026, 5:00pm

Yea, “phasey artifacts”, I kinda knew it… that’s exactly what I was afraid of. I was on the edge of purchasing RipX just to save my first work from oblivion (funny, I made an entire album back then, and I wasn’t even aware that’s called “music production”), but when I thought about it, my dilemmas just multiplied for various reasons. Phasey artifacts are something I definitely don’t want to introduce.
Lately I’m thinking of doing a proper “Redux” from scratch, but I freeze when I think about how much actual work is in it (MIDI programming, different instrument layering, envelope shaping etc.), ’cause, you know, it’s not a “Master Of Puppets” or something, it’s just a self-made album by a boy influenced by Moby, Enigma, Prodigy and d n’ b.
And I was stupid enough to keep the projects on floppies. Some I can’t even find, some contain something else entirely now, and some are just unreadable. All that is left safe is, by some miracle, a still-functional CD. How dumb can a young man be?

Majik · January 15, 2026, 10:53pm

Could something like this be adapted?

Cheers,

Keith

Locynaeh · January 16, 2026, 9:06am

I stumbled upon C++ ports of:

Demucs: sevagh/demucs.cpp (GitHub), a C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3
Open-Unmix: sevagh/umx.cpp (GitHub), a C++17 port of Open-Unmix-PyTorch with streaming LSTM inference, ggml, quantization, and Eigen

Maybe it can be of some use to @x42 ?

Though I wonder if a standalone app would not be more relevant.

Brent_Baccala · January 21, 2026, 3:22am

Interesting. I’ve used some of the online services, but this is the first I’ve heard of Spleeter and Demucs. I took a look at Spleeter; definitely worth a try.

@x42 mentioned the complexity of python, but if you’re running the models locally, you’ve also got GPU libraries that might need to be installed. On the other hand, if you’re using models in the cloud, you might not need anything more than some kind of REST client.
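
A minimal sketch of what "some kind of REST client" could amount to, using only Python's standard library. The endpoint URL, the `stems` query parameter and the bearer-token scheme are all hypothetical; no real service's API is being described, and the request is only built here, not sent.

```python
import urllib.request


def build_separation_request(audio: bytes, stems: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an upload request to a hypothetical cloud stem splitter."""
    # Hypothetical endpoint: the .invalid TLD is reserved and never resolves.
    url = f"https://example.invalid/v1/separate?stems={stems}"
    return urllib.request.Request(
        url,
        data=audio,  # raw audio bytes as the request body
        method="POST",
        headers={
            "Content-Type": "audio/wav",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


req = build_separation_request(b"RIFF....", stems=4, api_key="hypothetical-key")
# Sending it would be a single call, e.g. urllib.request.urlopen(req),
# followed by downloading whatever stem files the service returns.
```

Compared with bundling an inference stack, the client side stays tiny; the cost moves to the per-request pricing discussed at the top of the thread.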

I agree with @Locynaeh that a standalone app seems relevant, especially given the complexity of running on a local GPU.

Surely we don’t want Ardour to be dependent on something like TensorFlow, do we? I think that’s what it would be with Spleeter, basically.

Of course, TensorFlow integration would probably give Ardour the capability of running some of its effects processing on a GPU, which might be useful. Still…

There are a lot of AI models available. Which stem splitter to use should probably be up to the user, like picking samples or plugins.

What would the GUI look like? You select a region and click on an option to run it through a stem splitter?
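
One way to picture "select a region, pick a splitter, run it" is a small registry of interchangeable backends. This is only a sketch under stated assumptions: the `SEPARATORS` registry, `register` decorator and `separate_region` function are hypothetical names, and the single `passthrough` backend is a placeholder that does no real separation.

```python
from typing import Callable, Dict, List

Samples = List[float]
# A backend takes mono samples and returns named stems of the same length.
Separator = Callable[[Samples], Dict[str, Samples]]

# Hypothetical registry: user-installed backends would add themselves here.
SEPARATORS: Dict[str, Separator] = {}


def register(name: str) -> Callable[[Separator], Separator]:
    """Decorator that makes a backend selectable by name."""
    def wrap(fn: Separator) -> Separator:
        SEPARATORS[name] = fn
        return fn
    return wrap


@register("passthrough")
def passthrough(samples: Samples) -> Dict[str, Samples]:
    """Placeholder backend: silence in 'vocals', the whole signal in 'other'."""
    return {"vocals": [0.0] * len(samples), "other": list(samples)}


def separate_region(samples: Samples, start: int, end: int, backend: str) -> Dict[str, Samples]:
    """Run the chosen backend on the selected sample range only."""
    return SEPARATORS[backend](samples[start:end])


stems = separate_region([0.1, 0.2, 0.3, 0.4], 1, 3, "passthrough")
```

In a DAW, each returned stem would presumably become a new region on its own track, aligned to the start of the selection; the backend list is what the user would choose from, like picking a plugin.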

Axel99092 · January 21, 2026, 8:41pm

There’s another Stem Separation Tool with a “Complete AppImage”

x42 · January 21, 2026, 8:56pm

You don’t usually need GPU for that. spleeter runs fine and amazingly fast on the CPU.