Floating point MIDI values

(ChuckkH) #1

Bear with me.
I have been complaining for more than a decade about the lack of support for precise tuning in audio software. Today, 64-bit audio samples are standard… and so is 7-bit pitch information!
Yes, we have pitch bend, which means you can use 1 note per channel. Yes, we have .scl files, which means you can set up 128 exact pitches for the duration of a piece (in Harry Partch’s famous 43-tone octave, which he insisted was not exhaustive, you could play ALMOST 3 octaves with 128 notes).
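To illustrate the pitch-bend workaround mentioned above, here is a minimal sketch. It assumes a bend range of ±2 semitones (a common default, but it is configurable on the receiving synth), and the function name `note_to_pitch_bend` is hypothetical.

```python
import math


def note_to_pitch_bend(note: float, bend_range: float = 2.0) -> tuple:
    """Split a fractional note number into (integer note, 14-bit bend value).

    Assumes the receiver's bend range is +/- `bend_range` semitones.
    """
    nearest = math.floor(note + 0.5)      # closest integer note number
    offset = note - nearest               # remaining offset in semitones
    # 8192 is the centre position (no bend); 14-bit values run 0..16383.
    bend = 8192 + round(offset / bend_range * 8192)
    return nearest, max(0, min(16383, bend))


print(note_to_pitch_bend(69.25))  # a quarter of a semitone above A4
```

Because the bend value applies to the whole channel, every simultaneously sounding note with a different offset needs a channel of its own, which is the limitation described above.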

The reasons I think floating-point note numbers would be a good solution:

  1. You wouldn’t have to change the code that calculates pitch. It could still be 440*(2**((NN-69)/12)), for example. In C++, every function that handles MIDI note numbers could simply be copied and pasted, with the words “byte” or “int” replaced by “double” in the copy, and the new overload could coexist alongside the original byte/int version. Programs that use integer note numbers could then continue exactly as before, while others would have the option of sending floats or doubles for pitch information.

  2. There would be practically no limit to the number of possible pitches. You could play 100 pitches at the same time, all within 1 cent of each other.
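As a rough sketch of the two points above, the same formula works unchanged when the note number is a float. The function name `note_to_freq` and the A4 = 440 Hz reference are assumptions for illustration, not code from any existing program.

```python
import math


def note_to_freq(note: float, a4: float = 440.0) -> float:
    """Equal-tempered frequency for a (possibly fractional) note number."""
    return a4 * 2.0 ** ((note - 69.0) / 12.0)


# 100 distinct pitches packed into less than one cent above middle C.
notes = [60.0 + i * 0.0001 for i in range(100)]
freqs = [note_to_freq(n) for n in notes]
span_cents = 1200.0 * math.log2(freqs[-1] / freqs[0])
print(len(set(freqs)), span_cents)
```

Integer callers get exactly the familiar results (note 69 gives 440 Hz), so nothing changes for software that never sends fractional values.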

I have my own program that needs unlimited pitch precision, and I would love to use this as the means. But, for other programs, even Ardour’s piano roll could do it - It would just need a hotkey or option to turn off snap-to-grid for pitch. Most DAWs already have this option, to snap or not to snap, for rhythm, but not for pitch.
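The pitch snap option described above could look something like this. It is only a sketch: `snap_pitch` and its parameters are hypothetical and do not correspond to anything in Ardour's code.

```python
def snap_pitch(note: float, divisions: int = 12, snap: bool = True) -> float:
    """Snap a fractional note number to a grid of equal divisions per octave.

    With snap=False the value passes through untouched (free pitch).
    """
    if not snap:
        return note
    steps = round(note * divisions / 12)   # index of the nearest grid step
    return steps * 12 / divisions          # back to note-number units


print(snap_pitch(60.4))                 # standard semitone grid
print(snap_pitch(60.4, divisions=24))   # quarter-tone grid
print(snap_pitch(60.4, snap=False))     # snapping off
```

The `divisions` parameter is an extra assumption here; it would let the same grid logic serve other equal temperaments as well as the standard 12.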

Of course, the same would need to be done / could just as easily be done for plugins.

The only reason I’m suggesting this is because I think it would be fairly simple code-wise. I have begun to browse through some of the code, but it will take me a while. Just thought I’d put this out there in case someone who is more familiar finds it interesting. This is functionality that no other DAW has. There is very little music software out there that allows complete per-note tuning freedom. I believe the only reason no one is asking for it is because they haven’t tried it yet.


(Robin Gareus) #2

you lost me there. 17 bits marketing, 29 bits thermal noise, and perhaps 18 bits useful information at best :slight_smile:

The main reason is simple: MIDI.org does not include this in the specification that both hardware and software vendors use. Perhaps the upcoming MIDI 2.0 standard will help. A 32-bit integer for notes, perhaps (a ratio is much preferable to floating point).

If you want seamless pitch you could use a CV value and a modular synth, perhaps.
Other DAWs allow this via MPE, check out ROLI’s Seaboard and Bitwig for example.

(ChuckkH) #3

Hi. Thanks for your reply!

Maybe for some purposes, but, e.g., the Csound devs have offered - and strongly recommended - double-precision (64-bit) installers for well over a decade. And Csound is 0.0% marketing!
Regardless: even 16 bits of audio precision is considered trivial for modern DAWs. I get that the 7-bit limitation is a standard, but is it law? Is it forbidden to use a similar note system with a variation? OSC doesn’t follow the MIDI standard, either. What I’m proposing would be far easier to implement (and use) than OSC. It would be exactly the same as MIDI but with more bits for note number.

I agree that a ratio is preferable, but it would have to be implemented from the ground up just for people who want to compose just intonation, and we’re few; there are also people composing in, e.g. 53-tone equal temperament, or many other non-equal temperaments, non-octave tunings (e.g. Wendy Carlos decades ago, and she managed alone), Gamelan tunings (decades ago), etc. Even 32-bit float note numbers would be more than sufficient (195 million possible tones per octave over standard MIDI range if bit 0 is still reserved), at least until the aliens come and want to use our music software. Or cats!
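To put a number on the resolution of 32-bit float note numbers, the sketch below measures the gap between adjacent float32 values near a given note number. It uses only the standard `struct` module; the helper name `float32_step` is made up for this example and it assumes a positive, finite input.

```python
import struct


def float32_step(value: float) -> float:
    """Gap between `value` (rounded to float32) and the next larger float32."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    current = struct.unpack(">f", struct.pack(">I", bits))[0]
    following = struct.unpack(">f", struct.pack(">I", bits + 1))[0]
    return following - current


step = float32_step(69.0)          # spacing of note numbers around A4
print(step, step * 100, "cents")   # one semitone is 100 cents
```

For note numbers between 64 and 128 the spacing is 2**-17 of a semitone, well under a thousandth of a cent.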

Well, what I personally want is for my program to be able to send a note with any frequency at any time. It can with Csound or Pd, but the learning curve for microtuning is already steep enough without making people learn to use those daunting formats. Incidentally, Pure Data’s audio testing page is where I got the idea. It has a field for a MIDI note to play, and allows it to be floating-point. It is unlimited microtonality at one’s fingertips just in an audio settings test patch. Just that the sine wave is boring!

MPE makes no difference for sequencers, only for live controllers; there is still 1 pitch-bend value per channel, so a chord with 7 tones would still take 7 channels. To change quickly, or let a chord ring out over the next one, you’d need 14 channels. I just saw part of a score for Beethoven’s 5th Symphony and there were 14 distinct instruments (counting tympani as distinct from percussion). That’s two hundred years ago, hardly avant-garde. But 14 x 14 = 196 channels with plugins, 13 MIDI ports. My laptop couldn’t handle the DSP involved.
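The channel arithmetic above can be checked with a few lines. The function name and parameters are hypothetical; the only fixed fact used is that one MIDI port carries 16 channels.

```python
import math

CHANNELS_PER_PORT = 16  # one MIDI port carries 16 channels


def channels_needed(voices: int, instruments: int, overlap: bool = True) -> tuple:
    """Channels and ports needed when every bent note has its own channel.

    `overlap` doubles the per-instrument count so one chord can ring
    out while the next one starts.
    """
    per_instrument = voices * (2 if overlap else 1)
    total = per_instrument * instruments
    ports = math.ceil(total / CHANNELS_PER_PORT)
    return total, ports


print(channels_needed(7, 14))  # 7-note chords, 14 instruments
```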

The MIDI per-note MTS standard would do the trick, but there are all of 2 programs out there that have implemented it. It also requires a sysex message for each note, which is overkill. 24 more bits would eliminate the need for anything else.
It may take me a few years, but I hope to come up with a proof-of-concept myself if I can’t convince anyone else.
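For comparison, this is roughly what a per-note retuning message looks like. The sketch assumes the real-time Single Note Tuning Change layout from the MIDI Tuning Standard (`F0 7F <device> 08 02 <program> <count> [key xx yy zz] F7`); check it against the specification before relying on it. The function name is hypothetical.

```python
def mts_single_note_tuning(key: int, note: float,
                           program: int = 0, device: int = 0x7F) -> bytes:
    """Build a sysex message that retunes one key to a fractional note number.

    Assumed layout: F0 7F <device> 08 02 <program> <count> key xx yy zz F7,
    where xx is the semitone and yy/zz hold a 14-bit fraction of a semitone.
    """
    units = round(note * 16384)                        # 1/16384-semitone units
    units = max(0, min(units, 127 * 16384 + 16382))    # 7F 7F 7F is reserved
    xx, fraction = divmod(units, 16384)
    yy, zz = fraction >> 7, fraction & 0x7F
    return bytes([0xF0, 0x7F, device, 0x08, 0x02, program, 0x01,
                  key, xx, yy, zz, 0xF7])


msg = mts_single_note_tuning(key=60, note=69.5)
print(msg.hex(" "), len(msg), "bytes")
```

That is 12 bytes of sysex to retune one key, in addition to the 3-byte note-on that follows it.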


(Robin Gareus) #4

For the case at hand my issue with float is that it’s prone to rounding errors when converting back and forth between frequency and note.
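A small sketch of the concern: converting note to frequency and back goes through `2**x` and `log2`, so the result is only guaranteed to be close to the starting value, not bit-identical, whereas exact ratios compose without any error. The function names are illustrative assumptions.

```python
import math
from fractions import Fraction


def note_to_freq(note: float) -> float:
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)


def freq_to_note(freq: float) -> float:
    return 69.0 + 12.0 * math.log2(freq / 440.0)


def round_trip_error(note: float) -> float:
    """Absolute difference after note -> frequency -> note, in semitones."""
    return abs(freq_to_note(note_to_freq(note)) - note)


# Tiny, but comparing the results with == would be unsafe.
print(round_trip_error(60.37))

# Ratios stay exact: a just fifth times a just fourth is exactly an octave.
fifth, fourth = Fraction(3, 2), Fraction(4, 3)
print(fifth * fourth == 2)
```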

In any case Ardour uses SMF to save/load MIDI files, so that’s a non-starter.

(ChuckkH) #5

RE: floats, still far more precision than 7 bits, but point taken.
RE: SMF, again, point taken. My software is a sequencer that I want to be a MIDI clock slave to a DAW, and send all the tuning and MIDI data real time straight to plugins, leaving the DAW to handle audio, automations, timing, etc. It wouldn’t save or load MIDI files. It could export them using pitch bend messages, but this would happen after a piece is composed. During composition is when the user may not have decided on a fixed scale to use. Regarding my piano roll suggestion, perhaps that is a non-starter then; but an external piano roll program to do the same real-time would be a cinch.

If anyone’s curious, this is what I did with my program long ago, using (unfortunately) only Csound:

The tuning system illustrated:

(Robin Gareus) #6

Well, yeah, Csound is awesome. You might even use algoscore or some OSC sequencer and not even the sky is the limit.

But changing the MIDI standard requires getting involved with the MIDI consortium :wink:

(Paul Davis) #7

I am not sure if you’re aware of the MIDI Tuning Standard. If not, it seems as if you probably should be.

(ChuckkH) #8

Thank you! I am, and I’ll probably include it as well in my sequencer.

(ChuckkH) #9

Just a note, I just stumbled across this. Not something I would use, but the tuning table interface, hidden by the “SCALING” - “EDIT” button, represents the values of the different notes in hundredths of MIDI note numbers. Note 69, e.g., is listed as having 6900.00. It doesn’t seem to be directly editable in this web demo, but it doesn’t look like the MIDI consortium is stopping them from representing pitch that way.


(Paul Davis) #10

Chuck, I don’t want to be rude, but I’m not sure that you grasp the difference between the various MIDI specifications and the way a given application might present information.

MIDI note messages use integer values to describe pitch. The MIDI Tuning Standard allows you to map a given integer pitch value to a specific frequency, in units of 100/16384 cents.
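As an illustration of that resolution, the sketch below decodes the three data bytes that the MIDI Tuning Standard uses for a frequency (a semitone number plus a 14-bit fraction) into a fractional note number and a frequency. The helper names are hypothetical and A4 = 440 Hz is assumed.

```python
MTS_STEP_CENTS = 100 / 16384   # smallest step, about 0.0061 cents


def mts_to_note(xx: int, yy: int, zz: int) -> float:
    """Semitone number xx plus the 14-bit fraction (yy high, zz low 7 bits)."""
    return xx + ((yy << 7) | zz) / 16384


def mts_to_freq(xx: int, yy: int, zz: int) -> float:
    """Frequency in Hz, assuming 12-tone equal temperament at A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((mts_to_note(xx, yy, zz) - 69.0) / 12.0)


print(MTS_STEP_CENTS)
print(mts_to_note(69, 64, 0), mts_to_freq(69, 0, 0))
```

The note message itself still carries only the integer key number; the fractional part lives in the tuning table that the key is mapped through.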

Nothing in MIDI (1.0 at least) provides for any non-integer (read: non-rational) representation of pitch.

(ChuckkH) #11

You don’t seem rude. It’s nice of you to take the time to reply.
Yes, I understand all of this. As far as I know, there is no law forbidding anyone to use the same sets of numbers that MIDI uses but without restricting them to integers. If I do manage to do this with some open source software, will I receive a cease and desist letter from the MIDI people? If the application accepts standard MIDI format by default with no problems, and also - additionally - accepts the same messages with 31 bits instead of 7? Did they patent 0-127? If that would not be forbidden by law (and I can’t imagine how it could be), I don’t care whether anything in MIDI 1.0 provides for non-integer representation of pitch. The fact that it doesn’t exist is why I’m suggesting it. If you don’t see the need for it, that’s understandable, but the fact that the MIDI standard doesn’t include something is not a reason to exclude it.

Perhaps I should have posted in a different forum. I was considering doing the code-hunting myself, hoping for a hint or two where to start. It seems like it should be utterly trivial, simply overloading every function/method using note numbers, but Ardour is a lot of code to search, and as the site says, not simple to compile.

(Paul Davis) #12

The point here is not what would be useful - people have been arguing about pitch representations in MIDI for decades precisely because of the limits of the MIDI standard(s).

The point is that neither you nor we can change the MIDI specification. If you want to define some other protocol to convey pitch information, feel free to do so (remember to figure out how to get everybody else on board!). But you can’t change what MIDI is, and so if you want to use MIDI, you have to deal with it as it is.

You say “If the application accepts standard MIDI format by default with no problems, and also - additionally - accepts the same messages with 31 bits instead of 7?” … sure, no problem BUT the "also - additionally - " clause here isn’t MIDI. You’re not talking about the MIDI protocol there, but something else. You could write a synth engine, for example, that also accepts OSC messages in which pitch is specified in any way you want. And something else could send it the right OSC messages, and you’d be happy (and so would I). But that’s not MIDI.
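As a sketch of the OSC route, the snippet below hand-encodes a message carrying a float pitch using only the standard library, following the OSC 1.0 rules for padded strings and big-endian float32 arguments. The `/note` address and the (pitch, velocity) argument layout are invented for this example; sender and receiver would have to agree on them.

```python
import struct


def osc_string(text: str) -> bytes:
    """OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, *values: float) -> bytes:
    """Encode an OSC message whose arguments are all float32."""
    type_tags = osc_string("," + "f" * len(values))
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + type_tags + payload


# Hypothetical address: fractional note number 69.5 at velocity 0.5.
msg = osc_message("/note", 69.5, 0.5)
print(msg)
```

The bytes would then be sent over UDP (or another transport) to a synth engine that listens for that address.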

If you want to do this with MIDI, you have to use MIDI note messages and the MIDI Tuning Standard. This is conceptually flexible enough to do what you want, but it is cumbersome since it’s really designed to describe tuning systems rather than continuously variable pitch.

(ChuckkH) #13

That’s just it. I’m not trying to define a standard or change a specification. I’m trying to make something that works. Getting users would be simple; they’re waiting for something like this. As for implementation, if it’s GPL, who’s going to stop me? 1,2,3 programs. If those programs work, the usefulness will be immediately apparent to those who understand it, and I suspect ultimately other developers are slightly more on the side of usefulness than standards, as long as having an extra non-MIDI type of message coexisting with the MIDI isn’t violating anything.

I don’t want to change what MIDI is. Can I or can I not create (or modify) software to not follow the MIDI standard? AFAIK, I can.

Why would I care if it is or isn’t “MIDI”?

I do not understand why that would bother me.
Incidentally, I cannot find any reference anywhere to MIDI being sent at anything other than 31250 baud. Of course, I don’t have access to all the MIDI docs, so maybe that’s changed. Does Ardour send MIDI to plugins at 31250?
If manufacturers have never completely adhered to MIDI standards at any point in its existence, is there some reason I shouldn’t do this?

(Paul Davis) #14

You titled this thread “Floating Point MIDI values”. At the very least that implies you’re talking about MIDI. Had you called it “Floating Point Pitch values”, then it might be more obvious that the actual question is how to deliver/sequence/edit non-MIDI style pitch values.

But that’s an entirely different question, and the real answer is that there are no standards for this, which means that few existing tools (including Ardour) can really help. So to answer the question “Why would I care if it is or isn’t “MIDI”?”, the answer is that existing sequencers and plugin hosts are designed around MIDI, whether we like it or not.

The 31250 baud part of the MIDI spec is defined only for MIDI-over-DIN-serial. It doesn’t apply to MIDI delivered via different mechanisms. Using the same terminology as for networks, the 31250 baud is a description of the physical layer. MIDI-via-USB or MIDI-via-plugin-API is not and never has been confined to the serial physical layer, and thus the 31250 baud spec is not relevant.
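For a sense of what the serial rate means in practice, here is a small calculation. It assumes the usual DIN framing of one start bit, eight data bits and one stop bit per byte; the function name is made up.

```python
BAUD = 31250          # bits per second on a MIDI DIN cable
BITS_PER_BYTE = 10    # start bit + 8 data bits + stop bit


def din_transfer_ms(num_bytes: int) -> float:
    """Time in milliseconds to send `num_bytes` over MIDI DIN serial."""
    return num_bytes * BITS_PER_BYTE / BAUD * 1000.0


print(din_transfer_ms(3))    # one 3-byte note-on message
print(din_transfer_ms(12))   # a 12-byte sysex message
```

None of this timing applies when MIDI events are handed to a plugin through an API, since no serial link is involved.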

Contrary to your suggestion, MIDI manufacturers have actually been extremely compliant with the specifications. There are a few corner cases, but they generally only come up in the areas where the specifications are a little loose.

(J Rigg) #15

I think you’re confusing several unrelated things here. Csound (and some DAWs like Reaper) can do double-precision DSP calculations, but those operate on audio data that is stored in a smaller number of bits, e.g. 16- or 24-bit int or 32-bit float.

There may also be Csound installers for 64-bit operating systems, but that refers to CPU architecture, not audio data.

There are also 64-bit audio file formats like W64, but that refers to the number of bytes the file can store, not the audio data (which is still standard bit depth, e.g. 16- or 24-bit int or 32-bit float).

(ChuckkH) #16

Do you really know that?

The csound language provides three basic data types: i-, k- and a-types. The first is used for initialisation variables, which will assume only one value in performance, so once set, they will usually remain constant throughout the instrument code. The other types are used to hold scalar (k-type) and vectorial (a-type) variables. The first will hold a single value, whereas the second will hold an array of values (a vector) and internally, each value is a floating-point number, either 32- or 64-bit, depending on the version used.

Or see the official manual:
“The actual resolution should be the same as for the type of the audio sample variable. For ‘float’ Csound, that is a 32-bit, single-precision floating point number. It has 24 bits of precision in the mantissa. For ‘double’ Csound, that is a 64-bit, double-precision floating point number. It has 52 bits of precision in the mantissa.”

from https://csound.com/docs/manual/MiscCsound64.html

Or check the code:

static void MYFLT_to_short(int nSmps, MYFLT *inBuf, int16_t *outBuf, int *seed)

That is a buffer of MYFLT-sized input samples being forced to 16-bit samples for ALSA playback, from https://github.com/csound/csound/blob/6d319136c69dac4127c721403e76c52b6bf7a354/InOut/rtalsa.c

MYFLT is here in sysdep.h:

#ifndef __MYFLT_DEF
# define __MYFLT_DEF
# ifndef USE_DOUBLE
# define MYFLT float
# else
# define MYFLT double
# endif
#endif

The audio coming in and going out is in most cases not 64-bit, but internally, those samples are stored as 32 or 64, depending on build options.
As I said, regardless, the 7 bits of precision most DAWs offer for pitch is needlessly minuscule. You don’t need to update pitch information tens of thousands of times per second. But, yes, the Csound devs have been recommending a build that operates with 64-bit internal audio samples for over a decade, and whether they’re right or not, marketing and hype don’t explain that.
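The general idea of narrowing internal floating-point samples to 16 bits on output can be sketched as follows. This is not Csound's actual `MYFLT_to_short` code: it assumes samples normalised to [-1.0, 1.0] and skips whatever the `seed` parameter in the quoted signature is used for.

```python
def float_to_int16(samples):
    """Clamp float samples to [-1.0, 1.0] and scale them to 16-bit integers.

    Simplified illustration only: no dithering or noise shaping is applied.
    """
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))        # hard clip out-of-range values
        out.append(int(round(x * 32767)))
    return out


print(float_to_int16([0.0, 0.5, 1.0, -1.0, 2.0]))
```

Whatever precision the internal calculations had, everything below the 16-bit step size is discarded at this point.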


(Robin Gareus) #17

That article is not correct. Floating point rounding errors don’t propagate linearly with each operation. But aside from the technicalities…

If you write a synthesizer or algorithm that allows an input signal delta well below the thermal noise floor to influence output in the audible dynamic range, you’re doing something wrong. Using double precision won’t help you either, it’ll just be wrong differently :slight_smile:

(ChuckkH) #18

At the end of the day, I’m just a banjo player, but…
My understanding of that appendix is that an operation could introduce at most 6.02 dB of noise; not that it will. I understand that that part is just saying that it would take many operations for any noticeable effect to even be possible, let alone inevitable. The vast majority of actual cases will not add the same amount of error for each operation. So not linear, but that even the largest errors over and over would take many operations to become noticeable. But, as it implies, some people regularly use Csound to perform hundreds of operations on each sample (then play back the results in rooms with 53 speakers aimed at different places and 3 chairs). That’s one of the reasons it exists. Somebody somewhere wants to filter, shape and convolve a synthesized waveform that many times; not me, but I’m glad those people are out there, the same way I’m glad there was a Bobby Fischer.
So it isn’t one synthesizer or algorithm, but chains of sometimes hundreds.
Again, this may not suit everyone’s needs, but they’ve been doing it for ages. I brought up Csound only because there’s no sales involved. I’d never use Csound to record a bluegrass track, but for microtonal synthesis, honestly, it’s pretty useful.
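The 6.02 dB figure is simply the level change of one binary digit, 20*log10(2). A short sketch of that arithmetic, with a made-up function name:

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)   # about 6.02 dB per bit of precision


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one least-significant bit, in dB."""
    return bits * DB_PER_BIT


for bits in (16, 24, 52):
    print(bits, "bits:", round(dynamic_range_db(bits), 1), "dB")

# Headroom between a 52-bit mantissa and a 24-bit result, in bits and dB.
print(52 - 24, "bits:", round(dynamic_range_db(52 - 24), 1), "dB")
```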


(J Rigg) #19

Looks like we’re arguing semantics here. By “stored” I meant samples stored in an audio file on disk (which is what I thought you meant - sorry for the confusion). The Csound quote refers to samples stored temporarily in a variable in system memory. Both are correct :slight_smile:

(ChuckkH) #20

Understood. Csound does that too, but a large part of it is internally-calculated waveforms. In other words,