Do the tutorials on Ardour's YouTube channel use generative AI?

I noticed the voices sound oddly robotic in some of the videos, and there were a few weird gaps between words that made me suspicious. While I’m sure there are use cases for tech similar to “generative” AI that are not harmful, it’s disappointing to find a FOSS project I love and support making use of an environment- and career-destroying theft machine. A big part of my switch to Linux and open source software was to get away from large, predatory tech companies with their privacy concerns and general corporate greed, so I’m really lost as to why AI voices would be used in such a project.
Sorry if this sounds overly hostile, I’m not trying to directly attack anyone, it’s just something I’ve been meaning to ask about for a while.

Possibly because some of the people who work on the videos are not native English speakers.
I don’t know if speech synthesis was used in the videos or not, but if the thought of that possibility bothers you so much perhaps you could offer free English language voice services.

We do not. We almost certainly will not.

We have used 3 different speakers: 2 native English speakers, one not. The voice tracks are sometimes edited from the original take to give correct alignment with the video.

I was in the middle of writing this when Paul posted his response. I don’t have free services to recommend, though I considered offering myself as a volunteer when writing the original post. I’ve been told I have a decent voice for such things, but I didn’t want to make my potential criticism look like a self promotion, you know?

I see, my apologies for the accusation then. I think I’m just a bit overly suspicious of AI use lately. Thanks for the clarification, though, I’m very happy to hear this!

Role of generative AI in anything related to Ardour: zero.

In addition, our forum terms of service prohibit the posting of material generated via LLMs or any other technology.

Once again, thank you very much. You have no idea how relieving this is to hear. Keep up the good work.

We have a limited budget for the videos, so we don’t hire professionals for voiceovers. We do it ourselves or hire a less experienced person.

When people without much experience read a voiceover, they sound less relaxed and a little robotic. Neither Julie, nor Monty, nor I have a ton of experience there, so that’s one reason.

I don’t think there are “odd” pauses there, though. In fact, I nudge regions specifically to even out pauses. Smaller pauses inside sentences, slightly larger between sentences, slightly larger between paragraphs.

Finally, for the life of me, I can’t think of a reason why someone would create generative AI that excels at an Eastern European accent and then someone would actually choose to use it :smiley:

heh.

Though really, it’s the same reason why someone created an AI to generate songs.
Here the profit likely comes in the form of being useful for propaganda. …and it all takes away the joy of actually doing things.

On the Linux side, many of us are running AI voice to speech, speech to voice, and translation on our own computers, with large language models we train ourselves via interaction. That is not at all the same thing as asking ChatGPT, or Grok (Crock O shit), which gets its info from who knows where, and from anyone and their dog!

I recently installed Speech Note, for that purpose: to help me via dictation, because although I have a good grasp of my language from a linguistics point of view :face_with_monocle:, I still suck at spelling, and even after over 45 years of using computers still can’t type worth crap! :woozy_face:

AI isn’t bad, but there certainly are bad, even terrible implementations thereof and ways to use it.

To be honest, given the immense feat it is to make help documents and any other documentation for world wide use in an ever evolving and changing tech landscape, Linux is severely lacking in good and thorough documentation, and a good place to use AI for at least the written stuff, and anything to narrate videos it can translate too in writing, and once it is, anyone of that language who can speak well and articulate could do the actual narration based on the generated text, even tweak it before they do for better cadence…

Add how it is perfect to help the visually, audibly, or otherwise impaired.

Even after reading that paragraph twice, I’m still not sure what you want to express therein.

But I do agree that one should distinguish between different uses of AI. Where the AI runs, which data it was trained on, and what you use it for certainly make a difference to its impact.

This is an example of an AI-utilizing technology (in this case a plugin & application) that’s super helpful in certain contexts (like mine): https://restemapp.com/ (ReStem)

For ReStem, all the processing happens locally on your computer, and all it does is process a full-stereo drum recording you have and then spit out the stems (e.g. snare, kick, toms, hi-hat, etc.).

I tested it out and it does a remarkable job actually, so they obviously trained the program pretty damn well. Obviously this sort of thing is still in its infancy, so each stem (on its own) sometimes sounds low-qual/digitally-compressed/etc., and is obviously way inferior to original, individual-mic recordings. But for people like me who have songs where you only have access to full-stereo drum tracks, this is now a very appealing option to help polish and emphasize certain drum elements, exaggerate transients, etc…

So yeah, there are a few uses of AI/training tech. that are nice and not wholly cancerous.

-J

Well it wasn’t cryptic, and you did get the part about how it’s implemented, but I wasn’t speaking of high impact, and I’m not in the marketing business. I was speaking of good and valid uses for it, since many people just say “AI = bad” and done, as if that’s not a gross generalization and a false equivocation logical fallacy all in one.

I don’t mind AI voices when using text to speech, and would totally welcome the ability to talk to the computer as an interface, like they do on Star Trek; more like the TNG era, not the Kirk era, because I don’t want my computer to fall in love with me, and then have to have a fist fighting match with it! :sweat_smile:

Yep, there sure are good uses for it, in all sorts of places and ways, but the most important part: Since AI is way more A than I for now, it takes someone with lots of I to implement AI in I ways that aren’t obviously A! :crazy_face:

I spent 20 years working in public radio, often as an on-air announcer.

If you give me a script, I will happily record myself reading it for you for free. I’d be very, very glad to help out - it would be an easy way for me to help support Ardour.

Thanks! I’ll contact you privately :slight_smile:

Ah, can’t PM here. Please email me: alexandre.prokoudine@gmail.com

We have a limited budget for the videos, so we don’t hire professionals for voiceovers. We do it ourselves or hire a less experienced person.

I have a huge amount of respect for this, and I’m glad it is how things are handled at Ardour! I think my suspicion speaks more to how on-edge I am when trying to detect AI “generated” stuff than it does to anyone’s performance. I’m likely reading way too much into things, haha.
Apologies to all three of you if this post was insulting, I’m glad Ardour has people willing to lend their voices to the videos.

If you want, I would be up for joining the speaker pool :innocent:

Likewise. If you need extra voices, I’m open to giving it a shot as well.
