I wonder how on earth stuff like x86->ARM translation works so well if games break even after switching from x87 registers to SSE preserving all the logic otherwise...
I think the x87 FPU is the only 'weird' floating-point unit left. If you stick with 64-bit double-precision or 32-bit single-precision floats, where the registers are also 64 or 32 bits wide, all the modern stuff behaves the same. x87 is weird because its registers are 80 bits wide ... the idea was to get more accurate results from the extra precision, but it ends up weird because if you run out of registers and have to spill to memory, you typically lose precision.
Edit: since this post was second chanced, I can add that some of the pre-PC consoles have weird floats too. If they had floats at all. Lots of fun for emulation developers. Even fun for contemporaneous game developers... PilotWings on the SNES shipped with different revisions of its accelerator chip, and the demo only works properly on the early revisions (but I think? the later revisions have more accurate math). The PS2 FPU has weirdness around NaN, Infinity, very large numbers, and denormalized numbers. Etc.
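To make the spill point concrete, here is a rough analogy rather than real x87 behavior: Python floats are 64-bit doubles, so the sketch below plays "wide register" with a double and "spill to memory" with a round trip through a 32-bit single. The helper name `spill_to_single` is made up for illustration. The same arithmetic gives a different answer depending on whether the intermediate value was narrowed along the way.

```python
import struct

def spill_to_single(x):
    """Round a 64-bit double to a 32-bit single and back (hypothetical 'spill')."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a = 1.0
b = 2.0 ** -30  # too small to survive next to 1.0 in a 24-bit significand

# Intermediate stays "wide" (double) the whole time: b is recovered exactly.
kept_wide = (a + b) - a

# Intermediate is narrowed before the subtraction: b is rounded away.
spilled = spill_to_single(a + b) - a

print(kept_wide == b)  # True
print(spilled)         # 0.0
```

On real x87 the widths are 80 bits versus 64 or 32, but the shape of the problem is the same: the result depends on when the compiler happened to narrow the value.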
It's probably because you need two things at once: small precision differences, where the numbers are calculated ever so slightly differently, and some other effect where that difference matters, like a guard standing slightly too close to a door and getting clipped by it.
I debugged some software synthesizer code a while back (like 20 years or so now I think of it) where a build of it on one platform failed because of a precision bug. I can't remember the details, but there was a lot of "works fine on my machine" type discussion around it. Anyway it relied on a crude simulation of an RC circuit reaching very close to 0 asymptotically to trigger a state change, but on something like 64-bit Intel with a specific processor it never quite made it low enough to trip the comparison because of something to do with not flushing denormals.
From an electronics standpoint, treating "it's high enough" as about 0.7 and "it's low enough" as about 0.01 was far closer to the instrument they were trying to simulate, and making it massively imprecise like that got it working on everything.
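I don't have the original synth code, so this is only a sketch of the failure mode under assumptions: a made-up decay factor of 0.999 per sample, and a platform that keeps denormals rather than flushing them to zero (the usual default for Python on desktop CPUs). Once the value is a small enough denormal, multiplying by 0.999 rounds straight back to the same value, so a "wait until it hits zero" test never fires, while a loose threshold like 0.01 trips after a few thousand samples.

```python
import sys

COEFF = 0.999  # hypothetical per-sample decay factor, not from the original code

def decay_until_stuck(start=1.0, coeff=COEFF, max_steps=1_000_000):
    """Decay until multiplying no longer changes the value; return that value."""
    v = start
    for _ in range(max_steps):
        nxt = v * coeff
        if nxt == v:  # rounding gave back the same number: no further progress
            return v
        v = nxt
    return v

def steps_to_threshold(threshold, start=1.0, coeff=COEFF, max_steps=1_000_000):
    """Count samples until v <= threshold, or None if it never happens."""
    v = start
    for n in range(max_steps):
        if v <= threshold:
            return n
        v *= coeff
    return None

stuck = decay_until_stuck()
n_loose = steps_to_threshold(0.01)  # the "massively imprecise" comparison
n_zero = steps_to_threshold(0.0)    # the "reach zero" comparison

print(0.0 < stuck < sys.float_info.min)  # True: parked on a nonzero denormal
print(n_loose)                           # a few thousand samples
print(n_zero)                            # None: zero is never reached
```

With flush-to-zero enabled the last case would terminate, which is one way the same source can behave differently from one machine to the next.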
Denormals in audio code are kind of the "perfect storm", because they take ages to deal with - you're suddenly back into softfloat land - and because you have to deal with many thousands of them in a few hundred microseconds.
We take how fast hardware floating point is for granted. I suspect it would be interesting to compare something compiled with softfloat with a normal benchmark and see just how bad it is.
It's a great reason to do your DSP code in fixed-point, which is just integer with a couple of steps you have to write down on paper to keep straight until you get to the end. Or, I do, because I suck at arithmetic. Just do it all in machine-length signed ints, and forget all the mystical world of tiny tiny floating point values ;-)
Fixed point has been a mystery to me, and to be fair floating point is too. I know digital synths like Virus or Waldorf all used 24-bit fixed-point DSP math.
I remember this DSP site with lots of C and Delphi code; that is where I found out what denormals are.
Nowadays I don't see DSP code dealing with denormals. Maybe the CPU does not have to handle them in software anymore? I don't really know.
> I remember this DSP site with lots of C and Delphi code; that is where I found out what denormals are.
musicdsp.org?
> Fixed point has been a mystery to me, and to be fair floating point is too. I know digital synths like Virus or Waldorf all used 24-bit fixed-point DSP math.
If you imagine scaling a 16-bit value for like a volume control from 0 to 1, then you'd have maybe 32767 for maximum positive, and -32768 for maximum negative. You could convert those to floats, multiply, and convert back to a 16-bit integer.
But you don't want to use floats, you want to keep it all integer. So you make the volume range go from 0 to 255, and multiply your 16-bit value by that. Now you've got a 24-bit value, with a "binary point" between bits 7 and 8. The output is way off scale for the 16-bit DAC, but if we chop off the fractional part by shifting the result of the multiply right 8 bits, you've got your volume control.
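Roughly what that looks like in code, as a sketch with names I made up (`apply_volume` is not from any library). Python's `>>` on negative integers is an arithmetic shift that rounds toward negative infinity, which is close enough to what a DSP would do for illustration. Note that full volume is 255/256, so the maximum comes out slightly below the input.

```python
def apply_volume(sample, volume):
    """Scale a signed 16-bit sample by an 8-bit volume (0..255), integers only."""
    assert -32768 <= sample <= 32767
    assert 0 <= volume <= 255
    product = sample * volume  # up to 24 bits, binary point between bits 7 and 8
    return product >> 8        # drop the 8 fractional bits to get back to 16-bit

print(apply_volume(1000, 128))     # 500
print(apply_volume(32767, 255))    # 32639
print(apply_volume(-32768, 255))   # -32640
print(apply_volume(12345, 0))      # 0
```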
Some DSPs will actually do a 16 bit by 16 bit multiply which just discards the lower 16 bits of the result, with the assumption being that both 16-bit values mean "-1 to 1".
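A sketch of that "-1 to 1" multiply in the usual Q15 convention, where 32768 represents 1.0. The details here are my assumptions, not a specific chip: the exact shift varies between DSPs (keeping just the top 16 bits of the 32-bit product leaves the result at half scale unless the hardware also shifts left by one), so this version shifts by 15 and saturates the single overflow case of -1 times -1.

```python
Q15_MAX = 32767   # just under +1.0
Q15_MIN = -32768  # exactly -1.0

def q15_mul(a, b):
    """Multiply two Q15 values and return a saturated Q15 result."""
    product = a * b         # Q30 intermediate, fits in 32 bits
    result = product >> 15  # drop 15 fractional bits to return to Q15
    return max(Q15_MIN, min(Q15_MAX, result))

half = 16384  # 0.5 in Q15
print(q15_mul(half, half))        # 8192, i.e. 0.25
print(q15_mul(-half, half))       # -8192, i.e. -0.25
print(q15_mul(Q15_MIN, Q15_MIN))  # 32767, saturated because +1.0 does not fit
```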
I remember there was a huge scandal where Intel's compiler, icc (considered the fastest for quite a while back then), defaulted to x87 instead of SSE when it detected an AMD CPU, handicapping AMD CPUs (incidentally, that's the reason why x87 used to be much faster on AMD for a while).
A lot of games were shipped with icc, so my guess is they'd work just fine as they were tested with both.
Rosetta uses software emulation for x87 floating point. That's slow, but in practice it doesn't matter much: Mac software never had a reason to use x87 FP, since every Intel Mac had at least SSE3 support.
Looks like a demonstration that using `long double` math requires dipping into x87 instructions, specifically the `fldt` instruction: "floating point load ten bytes".