Now you are getting to the heart of it: is Blender the best tool for the job?
Don’t get me wrong, the VSE in Blender has improved significantly, but it is still at its best when used to edit together sequences created in Blender. And most of the time for animation you are animating to the dialog, which makes this a moot point for the most part. I would edit the dialog and give it to the animators, they would animate to it to match lip sync etc., and then I would mix in music and SFX.
Up until now you had only mentioned Blender and Ardour, not that you were only editing externally created video. In that case, yes, I would edit the video along with the recorded audio and then import both into Ardour to resync the audio. Ideally you would import the audio and sync it in a video editor using something like PluralEyes, but there isn’t a great solution for that in open source or on Linux that I know of yet (though in the back of my head I remember seeing a project to solve that at one point). But honestly, I wouldn’t have both open at the same time: edit the video, tell the story, export, and then edit the audio in Ardour or similar.
For the record, this is what I did for years with fast turnaround times: working with others who edited video in Final Cut Pro/PluralEyes (the latter of which had a nasty habit of downmixing to mono for my editor at the time, but that was years ago) and other solutions, then taking it into Ardour and Mixbus to edit the audio.