What is Ardour's Generative AI policy?

Personally I am looking forward to seeing how well it can handle certain aspects of session analysis for my own purpose, again nothing to do with generating work at all, everything to do with analyzing existing work. Could I do it without an LLM? Sure, in fact I already do, but it takes hours I don’t have to spare to do so and I probably still don’t go as in depth as I could.

 Seablade

So can lua scripts :wink:

But you’re right, an LLM is way more convenient and accessible than the Lua API. Not everybody has the time and technical understanding to write a script that does the right thing in Ardour. Even LLMs fail at that task so far, as I have read a few times here.

Pretty sure that many fear that the AI will overstep its intern role and start either interrupting or taking over the creative process. A script runs and shuts up afterwards, or is only triggered on clearly defined signals from the session.

In my opinion, the assistant would need some very hard and easily definable boundaries, like checkboxes in the Ardour settings for what it can and can’t do, not some file in plain English asking it not to overstep.

… or deleting the complete project database - all wav and midi files - because it “thinks” it’s crap.
I read about a case where an AI bot deleted the complete company database, including backups on the network. They could only recover from backups stored externally.
Those are the issues we need to address :slight_smile:

BTW: Can also happen to human “bots”. I remember a case - loooong time ago - where a colleague of mine accidentally deleted the work of a whole week with the customer engineer sitting next to him, because a mount and cd command had failed, so his working directory was still the project directory instead of the backup device …

IF you know what the setup is going to be before you write the script.

I haven’t done this yet, but it has been interesting to watch videos of people getting tech riders from bands and having agents set up their console for them. Even if it isn’t perfect, as long as it is ‘good enough’ it can save a not insignificant amount of time on short days, getting you that much more prepared for a live gig and thus that much more successful. Mind you, don’t let it do all this unchecked; at least check a bit behind it. But it is still kind of impressive what it could potentially do. In at least one of those cases they had it triggered by emails to a specific address. That is doable with scripting too, but you are talking about a lot of work to get to that point, and then someone throws something like a hurdy gurdy at you one week and all your scripts fail because they don’t know how to handle it.

I can see similar potential use cases in the studio as well, again not generating audio in any way shape or form, but as Robin mentioned, an assistant to take some of the tedium out of things.

In terms of deleting entire collections of sounds etc., yes, this is definitely a concern and should be for anyone who allows this. There is a reason I haven’t trusted agents to manage much data on my drive, or given them the ability to do so. The more power you give them, the more destruction a wayward agent can cause. As others have said, this is not too different from wayward scripts/programs or commands by human users. There is definitely a learning curve to happen here, and yes, maybe the answer is to make sure that there is a similar permission structure built into Ardour’s MCP access.
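
To make that a bit more concrete, here is a rough sketch of what a deny-by-default permission gate could look like. This is not Ardour's actual MCP code: the class name, the action names and the `handle_request` helper are all made up for illustration. The point is only that the check lives in code, so the assistant cannot talk its way around it.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantPermissions:
    """Hypothetical switches, one per checkbox in a settings dialog."""
    allowed: set[str] = field(default_factory=set)

    def check(self, action: str) -> None:
        # Deny by default: anything not explicitly enabled is refused.
        if action not in self.allowed:
            raise PermissionError(f"assistant may not perform '{action}'")


def handle_request(perms: AssistantPermissions, action: str, run):
    """Call run() only if the named action has been enabled by the user."""
    perms.check(action)
    return run()


# Example: the user ticked two boxes and nothing else (names are assumptions).
perms = AssistantPermissions(allowed={"read_session", "mute_track"})
```

With this setup a request tagged `"mute_track"` goes through, while one tagged `"delete_files"` raises `PermissionError` before anything touches the disk.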

 Seablade

Has anybody actually demonstrated that the MCP server in Ardour (an opt-in feature, btw) allows one to “regurgitate a mishmash of other people’s creations”?

I’m not sure if it’s currently even possible. So far there is no audio import. Though FreeSound import would be an amazing feature to expose via MCP.

So far only MIDI drums and MIDI melodies are possible.

As for actual real world examples, have a look here:

If my memory serves me well, using FreeSound requires setting up access to the FreeSound server for each new launch of Ardour. The service really wasn’t built for this kind of integration (I don’t blame them, it’s just a fact). So it might come in handy, but the UX would still not be great.

My own experience with FreeSound is that I need to listen to 10-20 different records to pick the right one, I’d have to write a very detailed prompt for the LLM to make a semi-right choice for me, and then the LLM would have to be good enough to translate that prompt into the right action.

Yeah, I saw that. That’s precisely what I would expect from this kind of functionality: being a sound engineer’s assistant.

I am referring to LLMs in general. I do not think the server can or is supposed to do this, but any use of AI enables such things. Hence I’m against it no matter what supposed benefits it brings. Basically this post.

No amount of functionality is worth being complicit in this ever-present problem that’s only getting worse the more people try to justify its use.

I see what you’re saying and understand the goals, but use of AI is use of AI. It enables and furthers the same things even if you aren’t using it for more explicitly “generative” purposes. I’ll just be blunt and say there is no use case that justifies being complacent with fundamentally fascist technology.

It is possible to train AI models using only data you own the copyright to. For example there is Melisma which is an AI performance generator - you give it notation and it will render it as audio.

All of the audio it was trained on was purpose recorded, the musicians were fairly compensated (similar to when creating a sample library).

I don’t see any ethical issues with this approach.


The goal of LLMs is to minimize human involvement in activities that usually require and benefit from it. Any short-term gains from their usage are vastly overshadowed by the long-term effects of their use and the sheer intent behind them. This is why, despite accessibility always being a positive thing, those purposes do not justify use of LLMs either.

Your example strikes me as similar to the way session musicians were often credited, or lack thereof, in the mid-1900s. Many of them would record their parts for “solo” artists’ tracks and not be credited for decades, sometimes ever because records of their contributions were lost. The fact that they were paid doesn’t make that okay, no matter the sum. Same logic applies here.

I would reconsider the decision for adding support for LLMs into Ardour.

In the past few weeks, I watched this video called Suno, AI Music, and the Bad Future. In the video there is a section called “Futurism/Techno-Optimism”.

You should watch the section of the video (linked below at 0:59:05). My attempt to summarize the section is as follows:

In the early 20th century, a young poet named Filippo Tommaso Marinetti wrote the Futurist Manifesto. The manifesto advocated for breaking free from tradition, and asked people to embrace technology and the future. About a century later, venture capitalist Marc Andreessen wrote the Techno-Optimist Manifesto, which pays homage to Marinetti’s writing. This manifesto argues a stance similar to the Futurist Manifesto: that technological innovation should be used to help improve the economy and prevent deaths. He posits that AI will be key to helping with modern technological growth and that any “deceleration of AI will cost lives”.

It is important to note that Andreessen is a very important person in Silicon Valley culture. So when he writes this manifesto, the people in Silicon Valley take what he writes seriously. His writing is considered a manifesto of Effective Accelerationism (also known as e/acc), which holds that unrestricted technological progress (especially progress made by AI) will allow all of humanity’s problems to be solved.

Okay, so why is this important?

Later on in Marinetti’s life, he published another manifesto called The Manifesto of the Italian Fasces of Combat. This is better known today as the Fascist Manifesto. Yes, he wrote the manifesto that started the Fascist parties that later started World War II, with Marinetti serving in Mussolini’s army. Futurism’s love of progress was used as a nice aesthetic cover for the Fascist party.

Andreessen and many of the billionaires in Silicon Valley have aligned themselves with modern-day populism in the United States, similar to how the Futurists aligned with Mussolini. Andreessen himself sees Marinetti as a “Patron Saint of Techno-Optimism” and modeled his rise to political power, with Andreessen serving as an unpaid intern in the second Trump Administration.

To quote the video:

Techno-Capitalism is all too happy to align itself with right-wing authorities if it means the future can come faster. “Accelerate, or Die” is the motto of this movement. Great and terrible things are sometimes necessary if the future can be made to come faster.

One of the other goals of the “Techno-Capitalists” is to turn major cities into anarcho-capitalist “startup societies” ruled by philosopher-king CEOs. The way they plan to do this is by building a “parallel establishment” that bypasses democracy and its processes. Generative AI is key for this to succeed; with AI, the “Techno-Capitalists” can build “parallel” systems. Examples include Suno, which is a generative AI “parallel” music service, and Oboe, which is a generative AI “parallel” education service.

This is concerning. AI/LLMs are different from other music technologies like MIDI because many of the Silicon Valley billionaires appear to see themselves as the successors of the Fascists, and are using AI/LLMs as tools for political coercion to build “parallel” services to bypass democratic processes. For this reason, I believe that Ardour should remove the MCP server functionality, as its presence will help accelerate (or at the very least associate itself with) the goals of the “Techno-Capitalists.”


From what I can tell, Ardour uses an internal version of GTK2. How about updating to GTK4?

Yes, I know you have heard of this before, but the reason that was given in the past was that later versions of GTK “does not provide anything to users that Ardour does not already have”:

However, in early 2025, GTK4 gained a large update in accessibility, including allowing screen readers like Orca to provide information for keyboard shortcuts and having the AccessKit a11y backend merged into the framework.

Using LLMs seems like a band-aid fix for giving accessibility support to Ardour rather than a more complete solution.


Providing an MCP server is not nothing, but it’s a relatively easy patch as compared to a GTK4 port.

Not to mention the current pool of unknowns. For example, can GTK4 apps support using the same shortcut in different contexts differently?

I’m a die-hard GNOME user, and even I don’t pretend it’s all unicorns and rainbows. The number of GTK4-based applications with complex user-software interaction is extremely small, and those that exist are not exemplary by any measure.

Let’s be honest: modern GTK is not a popular choice when you want to create something complex. Dune3D/HorizonEDA dev regretted porting from GTK3 to GTK4. All sorts of projects, including Zrythm, moved from GTK to Qt/QML (meanwhile, hardware acceleration in Qt6 is hairy, to put it mildly, just listen to Krita and Friction devs).

  1. We do not use GTK for somewhere between 50% and 90% of the GUI, depending on how you measure and/or look at it. Consequently, moving to GTK4 would do essentially nothing for accessibility, and nothing for any other part of Ardour other than provide Wayland support. We do not use GTK’s shortcut mechanism (it is inadequate for our needs). The entire editor, other than the editor lists, is rendered with almost no GTK involvement, and what there is would not be useful for a11y purposes (box packing, mostly).

  2. Wayland support sounds nice, until you remember that there are almost no plugins on Linux that use Wayland for their GUIs, making all of their own GUIs unusable.

  3. Our long-term goal GUI-wise, which I do not expect to see realized during my time on the project, is to ditch GTK entirely, rely only on a lower-layer windows+events abstraction (e.g. GDK), and do everything else ourselves. This is what many other DAWs have done, and probably what I should have done back in 1999/2000 as I started the project.


It’s a mistake to assume that enabling an MCP capability ≍ using LLMs.

Yes, MCP is most associated with LLMs and is, at this point, probably the most useful API for them, and AI is the primary use-case for MCP.

But MCP itself does not mandate the use of LLMs or any AI at all. It’s really just an API that supports capability discovery and control.

It’s not dissimilar in scope and function to Linux D-Bus or UPnP.
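
As a small illustration of that point, here is a sketch of the two JSON-RPC 2.0 requests a plain script would build to discover and then invoke a capability, with no model anywhere in the loop. The `tools/list` and `tools/call` method names come from the MCP specification; the tool name `transport_roll` and the helper function are made up for this example and are not taken from Ardour's server. A real client would also perform the `initialize` handshake first and send these over whatever transport the server exposes.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique within one connection


def jsonrpc_request(method, params=None):
    """Serialize a JSON-RPC 2.0 request, the framing MCP is built on."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


# Capability discovery: ask the server which tools it offers.
discover = jsonrpc_request("tools/list")

# Control: invoke one tool by name ("transport_roll" is a hypothetical name).
call = jsonrpc_request("tools/call", {"name": "transport_roll", "arguments": {}})
```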

Cheers,

Keith


I just wanted to add a concrete note to this discussion.

I’ve got a RT-thread-safe version of PR #986 that lets Ardour trigger things like track muting/activation, plugin activation, transport position/roll from a MIDI Program Change or CC message. Hit a button on the keyboard and switch active tracks, turn plugins on and off, that kind of thing.
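
For anyone wondering what the mapping boils down to, here is a toy sketch in Python. It is not the code from the PR and ignores the real-time thread concerns entirely; the binding table and action names are invented for illustration. The only fixed facts are the MIDI status bytes: `0xB0` for Control Change and `0xC0` for Program Change, with the channel in the low nibble.

```python
CONTROL_CHANGE = 0xB0  # status high nibble for CC messages
PROGRAM_CHANGE = 0xC0  # status high nibble for Program Change messages

# Hypothetical bindings: (message kind, channel, number) -> action name.
BINDINGS = {
    ("pc", 0, 5): "activate_track_5",
    ("cc", 0, 64): "toggle_plugin_bypass",
}


def decode(msg: bytes):
    """Return (kind, channel, number) for CC/PC messages, else None."""
    if not msg:
        return None
    kind, channel = msg[0] & 0xF0, msg[0] & 0x0F
    if kind == PROGRAM_CHANGE and len(msg) >= 2:
        return ("pc", channel, msg[1])  # data byte 1 is the program number
    if kind == CONTROL_CHANGE and len(msg) >= 3:
        return ("cc", channel, msg[1])  # data byte 1 is the controller number
    return None


def action_for(msg: bytes):
    """Look up the bound action for a raw MIDI message, if any."""
    key = decode(msg)
    return BINDINGS.get(key) if key else None
```

With these bindings, `action_for(bytes([0xC0, 5]))` resolves to the track action, while a Note On such as `bytes([0x90, 60, 100])` is simply ignored.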

It was written by Claude, so no point in pushing it.

I read that Boris Cherny talks at Y Combinator, where founders are letting AI write 50-100% of their code. It’s all proprietary, of course. An open-source project like Ardour admits 0% AI code because we can’t copyright it.

This technology presents serious threats to free software, and not the least of those threats is that we’ll refuse to adopt it because we can’t make it fit our licensing model.

Wait… you can control Ardour with MIDI messages or using a keyboard?! Kind of a thing indeed :slight_smile:.

Well, that depends. The Linux kernel allows LLM assisted code generation:
https://docs.kernel.org/process/coding-assistants.html

So no vibe coding, but one can code with assistance of the stochastic parrot, and then is expected to review and understand the code in order to take responsibility. This could be described as “auto complete on steroids”.

Claude source code was accidentally leaked recently. Despite or because it is now written by Claude, it had abysmal code quality.
Imo ‘written by AI’ is just a seal of poor quality…
