KICK C and the 65C02 CPU - Applying pointers
Posted: Wed May 03, 2023 6:51 pm
Pointers and C … They are one … right?
After a long IT career, I moved back into the 8-bit era to pick up an old hobby. Inspired by the Commander X16, I rediscovered cc65 and KICK C. In all my optimism, taking my first baby steps, developing C on the CX16 in 1990s style turned out to be a recipe for a long path of discoveries and learning…
The CX16 is an 8-bit computer, but with a 16-bit address bus, so pointer-typed variables are 16 bits = 2 bytes.
Addressing memory through pointers can quickly require the indirect indexed addressing mode of the 65C02 CPU (here we go again), so the compiler tries to find ways to avoid that addressing mode … Instead, it searches for opportunities to use constants, absolute addressing or absolute indexed addressing …
So here is a flavour of this learning experience, demonstrating what a great job Jesper has done in his KICK C compiler. Let's consider a few use cases where pointers are a viable coding style, embed them in the C code, and still get optimal code as a result. One key rule to keep in mind: pointers are a language feature, and the compiler will optimize and translate your C semantics into the best code it can find for the 65C02. In other words, when writing C for the 65C02, always keep in mind how the compiler is likely to generate the assembler, and always check whether the generated assembler matches your expectations.
Let's start with a simple example. What about the case where pointers are passed as parameters to functions? Consider the following program:
Code:
    #include <stdio.h>

    void properties_id(unsigned int* id) {
        *id = 12;
    }

    void main() {
        unsigned int id = 0;
        properties_id(&id);
        printf("id = %u\n", id);
    }
This generates the following assembler, where the optimizer did its work …
You can instruct KickC to use zeropage where possible, so id is assigned the zeropage label $18/$19.
Remember, id is an unsigned int, so 2 bytes …
Code:
    main: {
        .label id = $18
        lda #<0              // Note the funny syntax, a consequence of the unsigned int …
        sta.z id             // With 65C02 instructions, this could be stz.z id and stz.z id+1 …
        sta.z id+1
        jsr properties_id    // Here the compiler calls properties_id …
        ...
    }
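Below you can see how the "parameter" is converted directly into the use of the label and the constant $c.
Since id is 2 bytes, the low and high bytes are assigned separately, and rts returns.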
Code:
    properties_id: {
        .label id = main.id
        lda #<$c
        sta.z id
        lda #>$c
        sta.z id+1
        rts
    }
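A side note on the #<$c and #>$c operands in the listing above: they select the low and the high byte of the 16-bit constant 12. Here is a minimal sketch of that split in plain C, just for illustration. The helper names lo_byte and hi_byte are hypothetical and not part of KickC or its library.

```c
#include <stdint.h>

/* Hypothetical helpers (not KickC API): split a 16-bit value into the
   two bytes that an 8-bit CPU has to store one at a time. */
uint8_t lo_byte(uint16_t v) {
    return (uint8_t)(v & 0xFF);   /* corresponds to the "<" operand */
}

uint8_t hi_byte(uint16_t v) {
    return (uint8_t)(v >> 8);     /* corresponds to the ">" operand */
}
```

For the constant 12 ($000c), lo_byte gives $0c and hi_byte gives $00, matching the two lda operands. For $1234 the bytes would be $34 and $12.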
But what if this function gets called twice, with different variables?
Code:
    #include <stdio.h>

    void properties_id(unsigned int* id) {
        *id = 12;
    }

    void main() {
        unsigned int id1 = 0;
        unsigned int id2 = 0;
        properties_id(&id1);
        properties_id(&id2);
        printf("id = %u\n", id1);
        printf("id = %u\n", id2);
    }
The assembler routine for main would look like:
Code:
    main: {
        .label id1 = $18
        .label id2 = $1a
        lda #<0
        sta.z id1
        sta.z id1+1
        sta.z id2
        sta.z id2+1
        ...
    }
Nothing spectacular. But if we now look at the routine properties_id, we see that the compiler is using indirect indexed addressing! Why? Because id1 and id2 live at different memory addresses and are processed by the same properties_id function, so the function now needs to deal with this dynamism! Also note that in the assembler, id has become a zeropage pointer holding the target address, which the indirect indexed addressing mode requires; the Y register supplies the offset.
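Code:
    properties_id: {
        .label id = $10
        lda #$c          // low byte of 12
        ldy #0
        sta (id),y       // store through the zeropage pointer, offset 0
        tya              // A = 0, the high byte of 12
        iny
        sta (id),y       // store through the zeropage pointer, offset 1
        rts
    }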
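To make the sta (id),y instruction a bit more tangible, here is a rough C model of it, assuming a flat 64 KB memory array. The names mem, sta_indirect_y and properties_id_sketch are hypothetical and only serve the illustration. On the real machine the pointer setup is presumably done by the caller, in the part of main that is elided above.

```c
#include <stdint.h>

/* Hypothetical 64 KB memory image standing in for the 16-bit address space. */
uint8_t mem[0x10000];

/* Models "sta (zp),y": read a little-endian 16-bit pointer from zeropage
   address zp, add the Y register, and store A at the resulting address. */
void sta_indirect_y(uint8_t zp, uint8_t y, uint8_t a) {
    uint16_t base = (uint16_t)(mem[zp] | (mem[(uint8_t)(zp + 1)] << 8));
    mem[(uint16_t)(base + y)] = a;
}

/* Models one call: first the pointer setup (assumed to be the caller's job),
   then the two stores that the routine performs. */
void properties_id_sketch(uint16_t target) {
    mem[0x10] = (uint8_t)(target & 0xFF);  /* pointer low byte at $10  */
    mem[0x11] = (uint8_t)(target >> 8);    /* pointer high byte at $11 */
    sta_indirect_y(0x10, 0, 0x0c);         /* low byte of 12  */
    sta_indirect_y(0x10, 1, 0x00);         /* high byte of 12 */
}
```

Calling properties_id_sketch(0x0018) and then properties_id_sketch(0x001a) writes 12 to both locations with the same code. That flexibility is what the extra pointer setup and the slower addressing mode pay for.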
Imagine you don't want this; you just want the code to keep using absolute addressing.
The trick is to use what we call "inline" functions. The compiler "pastes" the generated code right at the function call, which lets it produce the most optimal code. This can increase your code footprint, but it still generates fast code…
Code:
    #include <stdio.h>

    inline void properties_id(unsigned int* id) {
        *id = 12;
    }

    void main() {
        unsigned int id1 = 0;
        unsigned int id2 = 0;
        properties_id(&id1);
        properties_id(&id2);
        printf("id = %u\n", id1);
        printf("id = %u\n", id2);
    }
This results in only a main routine, with no properties_id routine code being generated!
Code:
    main: {
        .label properties_id1_id = id1
        .label properties_id2_id = id2
        .label id1 = $18
        .label id2 = $1a
        lda #<0
        sta.z id1
        sta.z id1+1
        sta.z id2
        sta.z id2+1
        lda #<$c
        sta.z properties_id1_id
        lda #>$c
        sta.z properties_id1_id+1
        lda #<$c
        sta.z properties_id2_id
        lda #>$c
        sta.z properties_id2_id+1
        ...
    }
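As you can see, the compiler has generated the code for the id1 and id2 assignments within main. Nowhere is indirect addressing used, and there are no JSRs or RTSes. But keep in mind that when using inline, the complete function is generated inline: each function call results in another copy of the code!
OK. But what if this inline code is an issue? What are the options then?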
Well, for scalar values I guess we've explored the options, but for indexed code there are still more options to consider...
However, on an 8-bit machine, try to avoid using pointers where you can.
For really efficient code, and I repeat myself, use Structures of Arrays with 1-byte indexes...
Once you use 2-byte indexes and add pointers and pointer arithmetic into the equation, your code may end up clunky and complex.
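As a rough idea of what a Structure of Arrays with a 1-byte index could look like, here is a minimal sketch. All names (ENTITY_COUNT, entity, entity_init, entity_move) are hypothetical and made up for this illustration. The assumption is that, because every field array sits at a fixed address and the index fits in one byte, the compiler has a good chance of using absolute indexed addressing; as always, check the generated assembler.

```c
#include <stdint.h>

#define ENTITY_COUNT 8   /* hypothetical size; at most 256 so a 1-byte index suffices */

/* Structure of Arrays: one array per field instead of an array of structs. */
struct entities {
    uint8_t x[ENTITY_COUNT];
    uint8_t y[ENTITY_COUNT];
    uint8_t energy[ENTITY_COUNT];
};

struct entities entity;

/* Every access is "field array + 1-byte index", with no pointer arithmetic. */
void entity_init(uint8_t i, uint8_t x, uint8_t y, uint8_t energy) {
    entity.x[i] = x;
    entity.y[i] = y;
    entity.energy[i] = energy;
}

/* Moves entity i; coordinates wrap around at 256, as 8-bit values do. */
void entity_move(uint8_t i, uint8_t dx, uint8_t dy) {
    entity.x[i] = (uint8_t)(entity.x[i] + dx);
    entity.y[i] = (uint8_t)(entity.y[i] + dy);
}
```

Note that the functions take an index rather than a pointer to an element, so there is no pointer to dereference in the first place.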
But I'm going to add that to this post later, since I've already been typing for 2 hours on this one ... Step by step ...
Leave a note in the comments if there are other aspects to document or discuss.
Sven