On 1/6/2022 at 9:24 PM, rje said:
Ideally, the opcodes would be organized in some rational way. Instead, Woz just has them however he likes and uses a table.
But I was taught (if that's the word) that complex instruction codes ideally are organized rationally for decoding, rather than jumptabling.
On the gripping hand, though, Woz' jump table is only 64 bytes? That's pretty small. Maybe I can decode 32 instructions in less than 64 bytes (and maybe not!), but I certainly can't dispatch fast with decoding logic.
It's not all at random, though it's definitely not like a microcoded processor instruction set. It's more like the 6502, which feels free to take an opcode that doesn't make sense for one type of operation and use it for another. That is:
aaa d rrrr, address-mode, direction, register
rrrr is the 16-bit pseudo register, R0-R15
d=0: operand to ACC; d=1: ACC to operand
aaa is the operand address mode
aaa=000, immediate (followed by 16-bit immediate value)
aaa=001, register direct
aaa=010, register indirect post-increment (lower 8 bits, upper 8 bits cleared)
aaa=011, register double indirect post-increment
aaa=100, pre-decrement register indirect
aaa=110, pre-decrement register double indirect
... but one direction of the immediate mode is a nonsense action (e.g., you cannot store the accumulator to the number 768), so instead the "0000 rrrr" slot holds the non-register operations, with "rrrr" selecting the operation. With all of the indirect loads and stores being post-increment, you only need one direction of pre-decrement to make a stack. HOWEVER, the single-byte pre-decrement needs load AND store, so that together they can move a block of data from "back to front" when the source is below the destination and the two overlap. So the single-byte "POP" has both directions, but the double-byte one (to allow 16-bit value stacks) only needs one direction.
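To make the bit layout concrete, here is a minimal sketch of pulling the three fields out of an opcode byte. The function names and the mode table are my own for illustration, not anything from Woz's source.

```python
# Address modes keyed by the top three bits (aaa), as listed above.
ADDRESS_MODES = {
    0b000: "immediate",
    0b001: "register direct",
    0b010: "register indirect post-increment",
    0b011: "register double indirect post-increment",
    0b100: "pre-decrement register indirect",
    0b110: "pre-decrement register double indirect",
}


def decode_load_store(opcode):
    """Split an 8-bit opcode into the (aaa, d, rrrr) fields."""
    aaa = (opcode >> 5) & 0b111   # bits 7-5: address mode
    d = (opcode >> 4) & 0b1       # bit 4: direction
    rrrr = opcode & 0b1111        # bits 3-0: register number
    return aaa, d, rrrr


def describe(opcode):
    """Human-readable summary of a load/store opcode (hypothetical helper)."""
    aaa, d, rrrr = decode_load_store(opcode)
    mode = ADDRESS_MODES.get(aaa, "not a load/store mode")
    direction = "ACC to operand" if d else "operand to ACC"
    return f"{mode}, {direction}, R{rrrr}"
```

For example, describe(0x45) reports a post-increment indirect load through R5.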
Then there is arithmetic:
aaa s rrrr, arithmetic-op, sign, register
s = sign, 0=+ (plus), 1=- (minus)
aaa = 101, sum, ACC = ACC +/- register, set branch carry, zero, negative conditions
aaa = 110, compare (s=1 only), value = ACC - register, set branch carry, zero, negative conditions, discard value
aaa = 111, inc/decrement, register = register +/- 1
Of course, 6 load/store operations and 3 arithmetic operations do not fit into 3 bits. But the comparison operation only needs to subtract, and the double-byte pre-decrement only needs to work in one direction, so the two share aaa=110 and it all fits together like a jigsaw puzzle.
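The jigsaw can be checked mechanically. This is only a sketch: the descriptive names are mine, and placing the immediate load at bit 4 = 1 follows Woz's published table as I remember it ($1n), so treat that placement as an assumption.

```python
def build_opcode_map():
    """Assign each (aaa, d-or-s bit) pair to a high nibble, failing on overlap."""
    slots = {}

    def claim(aaa, bit, name):
        nibble = (aaa << 1) | bit
        assert nibble not in slots, f"collision at ${nibble:X}n"
        slots[nibble] = name

    # Load/store group: aaa d rrrr
    claim(0b000, 0, "non-register operations")   # reuses the nonsense slot
    claim(0b000, 1, "immediate load")            # assumed placement, see lead-in
    claim(0b001, 0, "register direct load")
    claim(0b001, 1, "register direct store")
    claim(0b010, 0, "single-byte post-increment load")
    claim(0b010, 1, "single-byte post-increment store")
    claim(0b011, 0, "double-byte post-increment load")
    claim(0b011, 1, "double-byte post-increment store")
    claim(0b100, 0, "single-byte pre-decrement load")
    claim(0b100, 1, "single-byte pre-decrement store")
    claim(0b110, 0, "double-byte pre-decrement load")  # one direction only

    # Arithmetic group: aaa s rrrr
    claim(0b101, 0, "add")
    claim(0b101, 1, "subtract")
    claim(0b110, 1, "compare")                   # subtract only
    claim(0b111, 0, "increment")
    claim(0b111, 1, "decrement")
    return slots
```

All 16 high nibbles end up claimed exactly once, with the double-byte pre-decrement load and the compare sharing aaa=110.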
Edit: Note that while putting the register in the bottom nibble and the instruction in the top nibble is done for functional reasons, there is ONE instruction whose position is almost implied by the design: CPR Rn. When execution begins, the four operation bits end up in bits 1-4 of the Y register (for the instruction table look-up), and CPR uses that value as the index of the target for the subtraction, so the result is placed in R13 rather than R0 (the accumulator). So the CPR opcode has to be $Dn, unless the CPR result register is relocated.
And then that implies that the two-byte POP instruction is at $Cn, by the "jigsaw puzzle" logic above.
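The arithmetic behind that is small enough to spell out. A rough sketch, assuming each 16-bit pseudo register takes two consecutive bytes starting at R0; the function names are mine.

```python
REGISTER_SIZE = 2  # assumption: each 16-bit pseudo register occupies two bytes


def table_index(opcode):
    """Operation nibble moved into bits 1-4, i.e. the nibble doubled."""
    return (opcode >> 4) << 1


def register_offset(n):
    """Byte offset of register Rn from the start of R0."""
    return n * REGISTER_SIZE
```

With the operation nibble $D, table_index gives 26, which is the same as register_offset(13). That is why the look-up index left in Y doubles as the offset of R13.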
Since I was attempting a re-implementation, I focused on the description of the functioning of the operations rather than Woz's implementation.
However, even with a different dispatch model, if trying to squeeze object size in a "Sweet 16 replacement" rather than optimizing for speed, I could imagine having a single indirect load routine and a single indirect store routine, each working out from the bits of the opcode and the status of the carry flag whether it is pre-decrement or post-increment and whether it is a single- or double-byte operation, covering 7 operations in two routines. Direct register moves could be handled by putting the source in Y and the destination in X, at the cost of using absolute rather than direct addressing for the Y-indexed operation, giving one routine for the two direct ones. One could imagine the immediate register load being run by the two-byte accumulator load, setting the indirect source register to R15, the PC register, and using a Y-indexed store, so the immediate load is taken over by the single indirect load routine as well.
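As a behavioural model of that single indirect load routine (not 6502 code, and not Woz's implementation), something like the following would do. The flag parameters stand in for whatever the opcode bits and carry flag would select, and the dest parameter is my own addition to cover the immediate-load idea.

```python
MASK16 = 0xFFFF


def indirect_load(regs, mem, src, dest=0, pre_decrement=False, double=False):
    """Load one or two bytes through regs[src] into regs[dest].

    Post-increment reads low byte first; pre-decrement reads high byte
    first, so a double-byte value comes back in the same byte order.
    """
    if pre_decrement:
        value = 0
        for _ in range(2 if double else 1):
            regs[src] = (regs[src] - 1) & MASK16
            value = (value << 8) | mem[regs[src]]
    else:
        value = mem[regs[src]]
        regs[src] = (regs[src] + 1) & MASK16
        if double:
            value |= mem[regs[src]] << 8
            regs[src] = (regs[src] + 1) & MASK16
    regs[dest] = value  # single-byte loads leave the upper 8 bits clear
```

Under this model the immediate load would be indirect_load(regs, mem, src=15, dest=n, double=True), assuming R15 already points at the first byte of the immediate value.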
Then at the cost of three more zero page bytes (two more bytes in a dedicated "register 17" initialized to $0001, and one set to either $80 or $00 based on whether adding or subtracting), setting up the correct target and operand index in X and Y would allow all five arithmetic operations to be done in a single routine. If that was done by shifting the instruction one bit to the left and using the carry flag and sign flag to split the code set into quarters, you might restrict the jump table to the $0n instructions, making it only 26-32 bytes long.