I’m not sure you’re aware of this, or whether you like that Ardour will be used this way… take a look here:
Nexus Consulting: Nexus Consulting hiring Ardour Expert (Remote) in European Union | LinkedIn
I view this development with considerable reservation. I am not at all comfortable with the direction in which things appear to be heading. While I cannot prevent Ardour from being used in certain ways, I also do not feel this is a type of use I would want to actively support.
In my opinion, this direction is concerning — particularly if open-source tools are being leveraged for purposes that conflict with their original spirit and values.
What do others think about this? I’d be very interested to hear different perspectives.
There’s no shame in refusing an ecocidal madness that aims at removing all the fun of actually learning and doing things. Musicians who just want to play music don’t ask AI, they ask a fellow musician. I believe it’s the same with sound engineers: either find one, or learn to do what you need. The whole point of free software is to give power and understanding back to the people; this will just do the opposite.
Hey, maybe this will lead to a few features being contributed. That would make Ardour more flexible with regard to the future of AI, and we as users would all benefit from it. Isn’t that the FOSS philosophy? I think AI will still remain just a tool.
While I personally would not want to listen to AI-generated or AI-enhanced music - I would simply lose interest very quickly - it enables other creative people to make music based on their unique ideas, even without the in-depth knowledge to implement them.
And there is a grey-zone in between.
To be honest I could not care less about mass throwaway music - whichever way it is produced. It was throwaway music in earlier days too, when it was created by session musicians.
Interesting find.
There is a related pull request Ardour MCP server by zabooma · Pull Request #1060 · Ardour/ardour · GitHub that we’re currently debating whether to merge. It lets you ask an LLM to interact with Ardour.
It handles basic prompts like “Mute ardour’s master track and save session”. So far it’s not very impressive, and it’s somewhat limited by the number of tool actions that are exposed, but I expect that may change.
The MCP API is not too different from what Ardour already exposes (OSC remote control, Websocket,…). I can see use-cases e.g. allowing visually impaired people interacting with Ardour.
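For comparison, here is a minimal sketch of what driving Ardour over its existing OSC surface looks like, using only the Python standard library. It assumes Ardour’s documented /transport_play path and the default OSC port 3819; adjust both for your setup.

```python
# Sketch: encode an OSC message by hand (no third-party libs) and
# send it to Ardour's OSC port. /transport_play is one of the paths
# in Ardour's OSC documentation; 3819 is the default port.
import socket

def osc_message(address: str) -> bytes:
    """Encode an argument-less OSC message: address and type-tag
    string, each null-terminated and padded to a 4-byte boundary."""
    def pad(b: bytes) -> bytes:
        b += b"\x00"
        return b + b"\x00" * (-len(b) % 4)
    return pad(address.encode("ascii")) + pad(b",")

msg = osc_message("/transport_play")

# Fire-and-forget over UDP; Ardour must have OSC control enabled.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(msg, ("127.0.0.1", 3819))
```

In practice a library like python-osc or liblo would handle the encoding; it is spelled out here only to show how small the protocol surface is compared to MCP.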
Also, the GPL does not discriminate: if someone wants to use Ardour with an LLM, that is perfectly within their rights. Maybe at some point there will be a new XGPL to prevent LLM use (the way tivoization prompted GPLv3), but it won’t apply to Ardour. Anyway…
I believe we cannot prevent or resist this by technical means, but should rather focus on educating users. While one can use an LLM/AI, there are very good reasons why one should not.
That being said if you want to toy with it, and don’t want to use any of the commercial / proprietary LLM listed in the PR’s Readme, here’s how you can run it locally with ollama:
## see https://docs.ollama.com/linux to install ollama
ollama serve
ollama pull qwen2.5:7b
python -m venv mcp-bridge
source mcp-bridge/bin/activate
pip install --upgrade ollmcp
ollmcp -u http://localhost:4820/mcp
Fascinating thread. I’d be interested to know how close the Ardour session the LLM created was to what you actually needed on recording day (of course, you actually did record that band, didn’t you?)
I’ve been poking at similar stuff (but in a field unrelated to music) for $DAYJOB now for about a year. One way to think about this example is to compare it with a) a static session template for a similar recording session; or b) a LUA script which sets up such a session.
Assuming one records similar lineups frequently, a pre-made session template with tracks for all those mics / instruments (plus the extras for your studio’s usual customers, like extra keys, string parts, etc.) could achieve a similar outcome. The amount of time spent tweaking and updating that template is likely non-trivial, but then again, tweaking the various prompts, etc. to get the LLM to do that work is non-trivial as well.
Greater flexibility than the static template, and likely similar to that of the LLM-driven approach (presuming that the Lua scripting interface is a superset of the OSC-like MCP tool interface which I haven’t seen). To make it really flexible, you’d need to add UI bits to the script, to allow specifying which instruments to set up, and allow choosing EQ / Compressor / Verb plugins. I would guess roughly that the maintenance effort of the Lua script template would be about double that of the static template (because SOFTWAREZ IZ HARD).
Similar effort in maintaining the prompts to the pure Lua approach, and equally flexible, but without adding the UI: the LLM discovers the MCP tools (or has them packaged up as something like “skills”), and interprets the human natural language instructions to drive them.
In a way, telling the LLM how to set up the session puts the human in the “senior audio engineer” seat, directing the “intern” (LLM) to do the drudge work.
Now if only the LLM could crawl around on the floor next to the drummer’s stinky feet to set up the snare-bottom and kick-front mics!
The UI for interaction with Ardour itself should be accessible though. If it’s in a browser somewhere, that might just work. However, if the prompt window is in Ardour, that’s a problem as Ardour is not a11y-friendly really.
Here in the UK there was quite a funny item on breakfast TV a couple of weeks ago. Apparently some guy asked six AI engines: “I need to clean my car today and the car wash is only 50 metres from my house. Should I drive there or just walk?” One of them replied, “Well, if you want to clean the car you’ll need to take it with you”, and the other five said, “If it’s that close to your house, you might as well just walk!”
On a more serious note, there was a short TV series here in February about some of the horrendous mistakes that AI can make while it’s training itself. One example was from the early days of driverless cars: some countries drive on the right and others on the left, and all the car knew was that it needed to keep the pavement on one side and a white line on the other. But if they were driving on a road with no white lines they’d happily drive on the wrong side of the road! Or even drive up onto the pavement trying to find a white line!! It’s no wonder people are worried…
If you ask stupid questions, you’ll get stupid answers
…and there are plenty of humans that drive the wrong way on the highway, too.
For most LLMs the prompt interaction is a CLI in a terminal; those are the only ones I tested.
The LLM interface connects to Ardour via HTTP (similar to WebSockets or OSC). So I am not sure whether a browser-based interface would work, due to cross-site connection restrictions.
Anyway, the prompt interface is entirely separate. The current a11y bottleneck is that you need to load an Ardour Session first.
I have not tested if it works with headless Ardour, but it likely does.
Interaction with Ardour is done via MCP, which is just a protocol based on JSON-RPC (somewhat REST-like).
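To make the wire format concrete, here is a sketch of the JSON-RPC 2.0 envelope an MCP client sends to invoke a tool. The “list_tracks” name mirrors the example in this thread; the actual tool names depend on what the Ardour MCP server exposes.

```python
# Sketch of an MCP "tools/call" request. "list_tracks" is used here
# for illustration; real tool names come from the server's tool list.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "list_tracks", "arguments": {}},
}
print(json.dumps(request, indent=2))
```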
The thing is that the LLM does not talk to Ardour directly; it needs an agent capable of tool calling in between.
The conversation goes like this:
User: List all tracks in Ardour
Agent: passes the user’s request to the LLM and attaches a list of available tools (with metadata explaining their use)
LLM: realizes the user’s request needs interaction with the tools, looks at the available tools, and finds the “list_tracks” tool. The LLM now responds to the agent: please run the list_tracks tool
Agent: runs the list_tracks tool, gathers the response (raw JSON data), and passes it back to the LLM
LLM: looks at the response, formats it in a human-readable way, filters out the relevant info, etc., and sends that back
Agent: presents the response to the user
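The loop above can be sketched in a few lines of Python. The LLM call is faked with a canned decision, and the tool names and track data are made up for illustration; a real agent would send the request plus the tool metadata to an actual model.

```python
# Toy sketch of the agent loop: user request -> (fake) LLM decision ->
# tool call -> formatted answer. All names and data are invented.

TOOLS = {
    "list_tracks": lambda: [{"name": "Drums"}, {"name": "Bass"}],
}

def fake_llm(user_request, tool_names):
    # A real agent would send the request plus tool metadata to an LLM;
    # here we hard-code its decision to call list_tracks.
    if "tracks" in user_request.lower():
        return {"tool": "list_tracks"}
    return {"answer": "Nothing to do."}

def agent(user_request):
    decision = fake_llm(user_request, list(TOOLS))
    if "tool" in decision:
        raw = TOOLS[decision["tool"]]()          # run the chosen tool
        # the LLM would normally reformat the raw JSON for the user
        return ", ".join(t["name"] for t in raw)
    return decision["answer"]

print(agent("List all tracks in Ardour"))  # -> Drums, Bass
```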
So in order to use any MCP server you need an agent tool in between - for example Codex, Claude Desktop, Gemini CLI, or a custom-coded agent. The main thing is that they need to be MCP-aware.
I am developing a general-purpose local chat UI tool that can do this, soon to be released.
This is exactly how I see it, not just with Ardour but in any other area where AI is used. I know there is a lot of pushback, but people need to realize that this is just another tool in the toolbox.
Sure, but I see some risk in this phrasing and not mentioning any caveats. I’ll avoid elaborating on that (for now) to keep the discussion on-topic.
So far I’ve gathered the following points about why LLM+MCP could be used (trying not to overlap points):
Now I’m not for or against MCP tools/integration, but what would make it different from adding the same functionality in ways that don’t involve an LLM? The only metaphor/comparison that comes to mind is commands in a terminal.