Extra Instructions Of The 65XX Series CPU (1996)

JetSetIlly · 2025-12-06T10:03:48 1765015428

A couple of threads on AtariAge are exploring the possibility of using the "unstable" opcodes in this group (ARR, etc.) as a sort of fingerprint. The hope is that the instability is a predictor of the specific model of CPU. To what end I'm not yet sure, but it's interesting research all the same.

https://forums.atariage.com/topic/385516-fingerprinting-6502... https://forums.atariage.com/topic/385521-fingerprinting-6502...

rossjudson · 2025-12-06T01:46:36 1764985596

Some of those crazy instructions were used for copy protection, back in the day. Those mystery page boundary overflows were entertaining.

embedding-shape · 2025-12-06T01:53:17 1764985997

Aah, that's much better and more realistic than my previous assumption that they were "government instructions", something used in the military and similar secretive contexts. Though I suppose they didn't use off-the-shelf components back then the way they perhaps do today.

cyco130 · 2025-12-06T02:41:53 1764988913

These instructions were not intentionally designed and put in there in secret. They're simply an unintended consequence of the "don't care" states of the instruction decoding logic.

The decoder is the part of the CPU that maps instruction opcodes to a set of control signals. For example "LDA zero page" (opcode 0xA5) would activate the "put the result in A" signal on its last cycle while "LDX zero page" (opcode 0xA6) would activate the "put the result in X" signal. The undocumented "LAX zero page" (opcode 0xA7) simply activates both because of the decoder logic's internal wiring, causing the result to be put in both registers. For other undocumented opcodes, the "do both of these things" logic is less recognizable but it's always there. Specifically disallowing these illegal states (to make them NOPs or raise an exception, for instance) would require more die space and push the price up.
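To make the "activates both" idea concrete, here is a toy sketch in Python. It is not the real 6502 decode logic: the two control lines, their names, and their bit masks are made up for illustration, and only chosen so that they behave as described for opcodes 0xA5, 0xA6 and 0xA7.

```python
# Toy model of a decoder with "don't care" bits. NOT the real 6502 PLA:
# the line names and masks below are assumptions made for illustration.
# A control line fires when (opcode & mask) == value; every bit outside
# the mask is a "don't care" for that line.
CONTROL_LINES = {
    "load_A": (0b11100001, 0b10100001),  # pattern 101xxxx1
    "load_X": (0b11100010, 0b10100010),  # pattern 101xxx1x
}

def decode(opcode):
    """Return the set of control lines that fire for this opcode byte."""
    return {
        name
        for name, (mask, value) in CONTROL_LINES.items()
        if (opcode & mask) == value
    }

for op in (0xA5, 0xA6, 0xA7):
    print(f"{op:02X}: {sorted(decode(op))}")
```

Nothing in this model "implements" 0xA7. It just happens to match both patterns, so both lines fire. Blocking that would need an extra term on each line, which is the extra logic nobody wanted to pay for.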

See here[1] for example to get a sense of how opcode bits form certain patterns when arranged in a specific way.

  [1] https://www.nesdev.org/wiki/CPU_unofficial_opcodes

kimixa · 2025-12-06T02:28:55 1764988135

I don't think they were "intended" for anything - it's just that that was the state of the control lines after it decoded that instruction byte, and that combination might do something somewhat sane.

Wiring all the "illegal" instructions to a NOP would have taken a fair bit of extra logic, and that would have been a noticeable chunk of the transistor budget at the time.

NobodyNada · 2025-12-06T03:10:53 1764990653

That's exactly right. There's a really good article about it here: https://www.pagetable.com/?p=39

ruk_booze · 2025-12-06T06:03:27 1765001007

A nice guide on how to actually put those "illegal opcodes" to work is "No More Secrets"

https://csdb.dk/release/?id=248511

gblargg · 2025-12-06T06:34:40 1765002880

That's a new name I hadn't heard that fits well: unintended opcodes. I also like unofficial. Undocumented isn't correct because these are quite well documented.

adrian_b · 2025-12-06T10:47:26 1765018046

They are well documented now, after reverse engineering.

The manufacturer did not document them, so they really were undocumented.

The same happened with many other CPUs, like Zilog Z80, Intel 8086 and the following x86 CPUs.

They all had undocumented instructions, which have been discovered by certain users through reverse engineering.

Some of the undocumented instructions were unintended: they existed only because of cost-cutting techniques used in the design of the CPU. The CPU manufacturer therefore intended to remove them in future models and had a valid reason not to document them.

However a few instructions that were undocumented for the public were documented for certain privileged customers, like Microsoft in the case of Intel CPUs, so they were retained in all future CPU models, for compatibility.

bonzini · 2025-12-06T13:15:57 1765026957

Not always. LOADALL was used heavily by Microsoft's HIMEM.SYS on the 286, but was not preserved on subsequent models.

adrian_b · 2025-12-06T17:17:01 1765041421

That was because LOADALL was impossible to preserve, since the internal state of the CPU changed in the next models.

The 80386 also had an undocumented LOADALL instruction, but it was encoded with a different opcode, because it restored many more registers and was therefore incompatible with the 80286 LOADALL.

After 1990, no successors to LOADALL were implemented, because Intel introduced the "System Management Mode" instead, which provided similar facilities and much more.

Dwedit · 2025-12-06T14:42:55 1765032175

One use for unofficial instructions is to get Read-Modify-Write operations in addressing modes that the official instructions don't support.

To understand, it helps if you write out the instruction table in columns, so here's the CMP and DEC instructions:

Byte C1: (add 4 to get to the next instruction in this table)

CMP X,ind [x indirect, read instruction's immediate value, add X, then read that pointer from zeropage, written like CMP ($nn,x)]

CMP zpg [zeropage, written like CMP $nn]

CMP # [immediate value, written like CMP #$nn]

CMP abs [absolute address, written like CMP $nnnn]

CMP ind,Y [indirect Y, read pointer from zeropage then add Y, written like CMP ($nn),Y]

CMP zpg,X [zeropage plus X, add X to the zeropage address, written like CMP $nn,X]

CMP abs,Y [absolute address plus Y, add Y to the address, written like CMP $nnnn,Y]

CMP abs,X [absolute address plus X, add X to the address, written like CMP $nnnn,X]

So that's 8 possible addressing modes for this instruction.
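The "add 4" layout can be sketched in a few lines of Python. The mode names are copied from the list above; the `column` helper is just a hypothetical name for this illustration.

```python
# Addressing modes in the order they appear going down one column of the
# opcode table (same order as the CMP list above).
MODES = ["X,ind", "zpg", "#", "abs", "ind,Y", "zpg,X", "abs,Y", "abs,X"]

def column(first_opcode):
    """Map each addressing-mode slot to its opcode byte, stepping by 4."""
    return {mode: first_opcode + 4 * i for i, mode in enumerate(MODES)}

# The CMP column starts at byte C1.
for mode, opcode in column(0xC1).items():
    print(f"CMP {mode:6} = ${opcode:02X}")
```

The same helper gives the slot-to-byte mapping for a column starting at any other byte. Whether a given slot holds an official instruction is a separate question.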

Immediately afterwards:

Byte C2: (add 4 to get to the next instruction in this table)

???

DEC zpg

DEX

DEC abs

???

DEC zpg,X

???

DEC abs,X

That's only 5 of the 8 slots filled: 4 DEC addressing modes plus DEX. So where's "DEC X,ind", "DEC ind,Y", and "DEC abs,Y"? They don't exist.

The table for Byte C3 is 8 undocumented instructions that aren't supposed to be used, so people worked out what they did. It turns out that most of them are a combination of DEC and CMP (decrement memory, then compare A against the result), so people named the instruction "DCP".

Byte C3:

DCP X,ind

DCP zpg

???

DCP abs

DCP ind,Y

DCP zpg,X

DCP abs,Y

DCP abs,X

Unlike the "DEC" instruction, you have the "X,ind", "ind,Y", and "abs,Y" addressing modes available. So if you want to decrement memory, and don't care about your flags being correct (because it's also doing a CMP operation), you can use this DCP instruction.
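As a rough sketch of what that means in practice, here is DCP modelled in Python as "DEC, then CMP against A". The function name, the dict-as-memory, and the flag dictionary are all conventions invented for this example, and it ignores cycle timing entirely.

```python
def dcp(a, memory, addr):
    """Sketch of DCP: decrement memory[addr], then compare A with the result.

    Returns the flags a CMP would set. A itself is not modified.
    """
    value = (memory[addr] - 1) & 0xFF   # the DEC half, wrapping at 8 bits
    memory[addr] = value
    diff = (a - value) & 0xFF           # the CMP half: A - memory, result discarded
    return {
        "C": a >= value,                # carry set when no borrow was needed
        "Z": a == value,
        "N": bool(diff & 0x80),
    }

memory = {0x10: 0x05}
flags = dcp(0x04, memory, 0x10)
print(hex(memory[0x10]), flags)
```

If all you wanted was the decrement, the memory write is the part you keep and the flags are the side effect you have to tolerate.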

The same idea applies to INC and SBC, which combine into the ISC instruction. It's for when you want to increment memory and don't care about register A and the flags afterwards.
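A similar sketch for ISC, modelled as "INC, then SBC". This assumes binary mode only (decimal mode is not modelled), and the function name and return shape are again made up for illustration.

```python
def isc(a, carry, memory, addr):
    """Sketch of ISC: increment memory[addr], then subtract it from A with borrow.

    Binary mode only. `carry` is 0 or 1. Returns (new_A, flags).
    """
    value = (memory[addr] + 1) & 0xFF   # the INC half, wrapping at 8 bits
    memory[addr] = value
    inverted = value ^ 0xFF             # SBC is an add of the one's complement
    total = a + inverted + carry
    result = total & 0xFF
    flags = {
        "C": total > 0xFF,              # carry set means no borrow occurred
        "Z": result == 0,
        "N": bool(result & 0x80),
        "V": bool((a ^ result) & (inverted ^ result) & 0x80),
    }
    return result, flags

memory = {0x20: 0x04}
new_a, flags = isc(0x10, 1, memory, 0x20)
print(hex(memory[0x20]), hex(new_a), flags)
```

This is why A and the flags are clobbered: the subtraction really happens, even if the increment is the only part you were after.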

djmips · 2025-12-06T14:40:11 1765032011

A good essay on how they work: https://www.pagetable.com/?p=39