Commander X16

Posted: **Wed May 03, 2023 6:51 pm**

Pointers and C … They are one … right?

When I moved back into the 8-bit era after a long IT career, picking up an old hobby inspired by the Commander X16, I rediscovered cc65 and KICK C. Full of optimism and taking my first baby steps, I found that developing C on the CX16 in 1990s style was a recipe for a long path of discoveries and learning…

The CX16 is an 8-bit computer, but with a 16-bit address bus, so by definition, pointer typed variables are 16 bits = 2 bytes.
Addressing memory through pointers can quickly require the indirect indexed addressing mode of the 65C02 CPU (here we go again), so the compiler tries to find ways to avoid that addressing mode … Instead, it searches for opportunities to use constants, absolute addressing or absolute indexed addressing …
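
To make these cases a bit more concrete, here is a small C sketch. It is not taken from the KickC documentation, and the names `counter`, `table`, `set_fixed`, `set_indexed` and `set_via_pointer` are made up for illustration. The comments describe what a 65C02 compiler can typically do in each case; the actual output depends on the compiler and its optimizer.

```c
#include <stdint.h>

uint8_t counter;      /* fixed address, known at link time */
uint8_t table[16];    /* fixed base address, small index   */

/* The target address is a constant: absolute addressing
   ("sta counter") is enough. */
void set_fixed(uint8_t value) {
    counter = value;
}

/* Constant base plus a 1-byte index: this fits absolute
   indexed addressing ("sta table,x"). */
void set_indexed(uint8_t index, uint8_t value) {
    table[index] = value;
}

/* The address is only known at run time: this typically needs a
   zero-page pointer and indirect indexed addressing ("sta (ptr),y"),
   unless the optimizer can prove where ptr points. */
void set_via_pointer(uint8_t *ptr, uint8_t value) {
    *ptr = value;
}
```

All three functions do the same job from the C point of view; the difference only shows up in the generated assembler.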

To give a flavour of this learning experience, and to demonstrate what a great job Jesper has done with his KICK C compiler, let's consider a few use cases where pointers are a viable coding style, and see how to embed them in C code while still getting optimal code as a result. One key rule to keep in mind: pointers are a language feature, and the compiler will optimize and translate your C semantics into the best code it can find for the 65C02. In other words, when writing C for the 65C02, always keep in mind what assembler the compiler is likely to generate, and always check that the generated assembler matches your expectations.

Let's start with a simple example. What about the case where pointers are passed as parameters to functions? Consider the following program:

Code: Select all

#include <stdio.h>

void properties_id(unsigned int* id) {
    *id = 12;
}

void main() {
    unsigned int id = 0;
    properties_id(&id);
    printf("id = %u\n", id);
}

This generates the following assembler, where the optimizer did its work …
You can instruct KickC to use zero page where possible, so id is assigned to the zero-page addresses $18/$19.
Remember, id is an unsigned int, so 2 bytes …

Code: Select all

main: {
    .label id = $18
    lda #<0    // Note the funny syntax, a consequence of the unsigned int …
    sta.z id   // In 65C02 instruction set, this would be stz.z id and stz.z id+1 …
    sta.z id+1
    jsr properties_id  // Here the compiler calls properties_id …
    ...
}

Below you can see how the "parameter" is converted directly into the usage of the label and a constant $c (12 in decimal).
Since id is 2 bytes, the low and high bytes are assigned separately, and then rts returns.

Code: Select all

properties_id: {
    .label id = main.id
    lda #<$c
    sta.z id
    lda #>$c
    sta.z id+1
    rts
}

But what if this function gets called 2 times, with different variables…

Code: Select all

#include <stdio.h>

void properties_id(unsigned int* id) {
    *id = 12;
}

void main() {
    unsigned int id1 = 0;
    unsigned int id2 = 0;
    properties_id(&id1);
    properties_id(&id2);
    printf("id = %u\n", id1);
    printf("id = %u\n", id2);
}

The assembler routine for main would look like:

Code: Select all

main: {
    .label id1 = $18
    .label id2 = $1a
    lda #<0
    sta.z id1
    sta.z id1+1
    sta.z id2
    sta.z id2+1
    ...
}

Nothing spectacular, but now, if we observe the routine properties_id, we see that the compiler is using indirect indexed addressing! Why? Because id1 and id2 are different memory addresses which are processed by the same properties_id function, so the function now needs to deal with this dynamism! Also note that in assembler, id has become a 2-byte zero-page pointer ($10/$11) that holds the target address for the indirect indexed addressing mode, so the caller has to store the address of id1 or id2 there before each call.

Code: Select all

properties_id: {
    .label id = $10
    lda #$c
    ldy #0
    sta (id),y
    tya
    iny
    sta (id),y
    rts
}
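
Written out in C, the routine above does roughly the following. This is only a sketch to show the byte order: `store_word` is a hypothetical name, and a plain byte buffer stands in for the memory that the zero-page pointer refers to. The 65C02 is little-endian, so the low byte goes to offset 0 and the high byte to offset 1.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the generated properties_id routine:
   store a 16-bit value through a pointer, one byte at a time. */
void store_word(uint8_t *target, uint16_t value) {
    target[0] = (uint8_t)(value & 0xff);  /* ldy #0 : sta (id),y -> low byte  */
    target[1] = (uint8_t)(value >> 8);    /* iny    : sta (id),y -> high byte */
}
```

Every store goes through the pointer, which is exactly the part that costs extra cycles compared to a direct `sta.z id`.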

Imagine you don't want this; you just want the code to keep using absolute addressing.
The trick to use then is what we call "inline" functions. The compiler will "paste" the generated code right at each function call, which lets it produce the most optimal code for that call. This can increase your code footprint, but it generates fast code…

Code: Select all

#include <stdio.h>

inline void properties_id(unsigned int* id) {
    *id = 12;
}
void main() {
    unsigned int id1 = 0;
    unsigned int id2 = 0;
    properties_id(&id1);
    properties_id(&id2);
    printf("id = %u\n", id1);
    printf("id = %u\n", id2);
}

Resulting in only a main routine and no properties_id routine code being generated!

Code: Select all

main: {
    .label properties_id1_id = id1
    .label properties_id2_id = id2
    .label id1 = $18
    .label id2 = $1a
    lda #<0
    sta.z id1
    sta.z id1+1
    sta.z id2
    sta.z id2+1
    lda #<$c
    sta.z properties_id1_id
    lda #>$c
    sta.z properties_id1_id+1
    lda #<$c
    sta.z properties_id2_id
    lda #>$c
    sta.z properties_id2_id+1
    ...
}

As you can see, the compiler has generated the code for the id1 and id2 assignments within main. Indirect addressing is not used anywhere, and there are no JSRs or RTSes. But keep in mind that when using inline, the complete function body is generated at every call site: each function call results in another copy of the inline code!

OK. But what if this inline code is an issue? What are the options then?
Well, for scalar values I guess we've explored the options,
but for indexed code, there are still more options that can be considered...

However, on an 8-bit machine, try to avoid using pointers where you can.
For really efficient code, and I repeat myself, use Structures of Arrays with 1-byte indexes ...
Once you use 2-byte indexes and add pointers and pointer arithmetic to the equation, your code may end up
clunky and complex.
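
As a rough illustration of what "Structures of Arrays with 1-byte indexes" means, here is a sketch that compares it with the more usual array of structures. The `item` fields, `ITEM_COUNT` and the `tick_*` functions are hypothetical names invented for this example.

```c
#include <stdint.h>

#define ITEM_COUNT 32   /* hypothetical size; keep it at 256 or below */

/* Array of structures: element i lives at base + i * sizeof(struct item),
   so the index has to be scaled before a field can be reached. */
struct item {
    uint8_t type;
    uint8_t state;
    uint8_t timer;
};
struct item items_aos[ITEM_COUNT];

/* Structure of arrays: every field is its own byte array,
   so field[i] is simply base + i with a 1-byte index. */
struct items {
    uint8_t type[ITEM_COUNT];
    uint8_t state[ITEM_COUNT];
    uint8_t timer[ITEM_COUNT];
};
struct items items_soa;

/* Count a timer down to zero, array-of-structures flavour. */
void tick_aos(uint8_t i) {
    if (items_aos[i].timer) items_aos[i].timer--;
}

/* Same logic, structure-of-arrays flavour. */
void tick_soa(uint8_t i) {
    if (items_soa.timer[i]) items_soa.timer[i]--;
}
```

Both versions behave the same in C. The second layout is the one that maps naturally onto absolute indexed addressing, because the index register can be used as is.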

But I'm going to add that to this post later, since I've already been typing for 2 hours on this one ... Step by step ...

Leave a note in the comments if there are other aspects to document or discuss.

Sven

Posted: **Sat May 06, 2023 6:58 am**

One can observe that when you declare independent local or global variables and pass pointers to these variables as parameters to functions, and these functions need to update the contents of the variables, the indirect indexed addressing mode is required. Not only is this a rather slow addressing mode, it also requires preparing 2 zero-page memory addresses before it can be executed. So ... Is there a faster alternative to this?

The answer is yes: we can adjust our programming style and guide the compiler to use absolute indexed addressing within the function. The trick is to allocate these variables in an array, but it will only result in a tangible improvement if the index of the array can be expressed by one unsigned char (1 byte).

So when you declare:

Code: Select all

unsigned int ids[2] = { 1, 2 };

the memory area will have these variables stored as a list, which allows absolute indexed addressing to be used for adjusting the values...

Code: Select all

void properties_id(unsigned char index) {
    ids[index] = 12;
}

void main() {
    properties_id(0);
    properties_id(1);
    printf("id = %u\n", ids[0]);
    printf("id = %u\n", ids[1]);
}
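
One thing to watch with an array of unsigned int: each element is 2 bytes, so the byte offset is index * 2 and the index has to be scaled before it can serve as an offset. A common 6502 idiom is to split the values into a low-byte array and a high-byte array, so that the 1-byte index can be used directly. The sketch below shows that idea as an assumption on my side, not as something KickC does for you; `ids_lo`, `ids_hi`, `properties_id_split` and `id_value` are hypothetical names.

```c
#include <stdint.h>

#define ID_COUNT 2

uint8_t ids_lo[ID_COUNT] = { 1, 2 };   /* low bytes of the ids  */
uint8_t ids_hi[ID_COUNT] = { 0, 0 };   /* high bytes of the ids */

/* Same effect as ids[index] = 12, but on the split arrays. */
void properties_id_split(uint8_t index) {
    ids_lo[index] = 12;   /* low byte of 12  */
    ids_hi[index] = 0;    /* high byte of 12 */
}

/* Recombine both bytes into the 16-bit value. */
uint16_t id_value(uint8_t index) {
    return (uint16_t)(ids_lo[index] | ((uint16_t)ids_hi[index] << 8));
}
```

With this layout up to 256 ids can be reached with a single byte index, at the price of slightly less readable C.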

Sven

Posted: **Sat May 06, 2023 7:11 am**

I'm feeling very personally attacked, Sven, but thank you for looking at my code.

Posted: **Sat May 06, 2023 2:18 pm**

Posted: **Sat May 06, 2023 10:11 pm**

svenvandevelde wrote: ↑Sat May 06, 2023 2:18 pm
?

I believe that TediusTimmy is referring to the fact that if we look at where he says "Imagine you don't want this, you just want the code to keep using absolute addressing. The trick then to use is ...", and then use your solution, there is no need to inline code.

However, TediusTimmy, don't feel bad! While the specific example you gave might have a more elegant solution, not every example will, and so knowing how to inline code is still an incredibly valuable skill when working with any 65C02 C-assembler.

Indeed, if you are using a 6502 C-assembler that doesn't incorporate the additional 65C02 opcodes, then there are certain examples where you would inline code precisely to get those opcodes. For example, TSB and TRB set or reset the bits selected by a mask in the accumulator. So if you have defined PB1 in VIA#1 (accessed via the J6 pin header) as a serial TX pin, Set.TX() and Reset.TX() are quite efficiently inlined as the assembly routines "LDA #1 : TSB VIA1.PortB" and "LDA #1 : TRB VIA1.PortB", respectively. Doing it that way avoids all pointer handling and also avoids popping the set mask off the C data stack. Even more importantly, it avoids the risk of the wrong set/reset mask interfering with Kernel IEC operations, since only the first three lines of PortB are available on J6, and the rest of PortB is dedicated to system operations.
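
For readers less familiar with these two opcodes, their effect on the port can be modelled in plain C as below. This is a sketch under assumptions: `via1_port_b` is an ordinary variable standing in for the real VIA register so that the code runs anywhere, and `tx_set`, `tx_reset` and `TX_MASK` are hypothetical names. The mask value simply copies the "LDA #1" from the routines quoted above.

```c
#include <stdint.h>

/* Stand-in for the VIA port register; on real hardware this
   would be a volatile memory-mapped byte. */
uint8_t via1_port_b;

#define TX_MASK 0x01   /* same mask as the "LDA #1" quoted above */

/* Models TSB: OR the mask into the port, other bits untouched. */
void tx_set(void) {
    via1_port_b |= TX_MASK;
}

/* Models TRB: clear the masked bits, other bits untouched. */
void tx_reset(void) {
    via1_port_b &= (uint8_t)~TX_MASK;
}
```

The point of the fixed mask is visible here: whatever the other seven bits of the port contain, neither function can disturb them.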

Posted: **Mon May 08, 2023 1:39 am**

svenvandevelde wrote: ↑Sat May 06, 2023 2:18 pm
?

I was just joking that you could use the C code I have written as a list of examples of how not to write C on the 65C02.

Posted: **Mon May 08, 2023 4:48 am**

ah

... I hope the topics highlighted are of use (to you).
Just remember that there are other ways to avoid pointers; this is just scratching the surface.
Actually, the compiler could use those techniques in its optimizer to avoid indirect indexed addressing, but it doesn't at the moment.

There are tons of other things I need to write about, but it will take time. So I plan to write little articles, piece by piece, on KickC.

Stay tuned

Sven

KICK C and the 65C02 CPU - Applying pointers