Standard interface for extending Basic

mgkaiser · Post by **mgkaiser** » Tue Dec 05, 2023 3:17 pm

How about introducing a standard interface for extending BASIC? My thought is that it would work something like this. We add a single new BASIC statement, "@", that tokenizes and takes a parameter list. The first parameter is an unquoted string, which is followed by a variable number of parameters. All the command does is parse the parameters and then call a vector. By default the vector will point to a "do nothing" function. When people write extensions, they will hook the vector. The convention would be to remember what the vector was previously set to and call that at the end of your handler if you did not process the command. That would allow multiple extension commands, written by different people, to be hooked to the interface.

Example
@MYCOMMAND,1,2,3
@ANOTHERCOMMAND,"SOME TEXT"

That would call the vector and pass 4 parameters to it. The vector handler would look at param1 and go "I handle MYCOMMAND" and handle it. If it didn't handle "ANOTHERCOMMAND" then it would just pass it along to the next vector handler.
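
A minimal sketch of the chaining idea, written in Python rather than 6502 assembly so it can be run anywhere. Everything here is hypothetical: `vector`, `hook` and the handler names are illustrative stand-ins, and the "work" each command does is made up. It only shows how each handler remembers the previous vector and passes along commands it does not recognise.

```python
# Hypothetical model of the proposed "@" vector chain; all names are illustrative.

def default_handler(name, args):
    """Stand-in for the default "do nothing" routine at the end of the chain."""
    return None

# The single vector that the "@" statement would call.
vector = default_handler

def hook(make_handler):
    """Install a handler, handing it the previous vector so it can chain to it."""
    global vector
    vector = make_handler(vector)

def my_command(previous):
    def handler(name, args):
        if name == "MYCOMMAND":
            return sum(args)            # placeholder for the command's real work
        return previous(name, args)     # not ours: pass it down the chain
    return handler

def another_command(previous):
    def handler(name, args):
        if name == "ANOTHERCOMMAND":
            return args[0].lower()      # placeholder for the command's real work
        return previous(name, args)
    return handler

# Two extensions from two different authors hook the same vector.
hook(my_command)
hook(another_command)

result_1 = vector("MYCOMMAND", [1, 2, 3])            # handled by the first extension
result_2 = vector("ANOTHERCOMMAND", ["SOME TEXT"])   # handled by the second extension
result_3 = vector("UNKNOWN", [])                     # falls through to "do nothing"
```

Note that the most recently hooked handler sees every command first, so a long chain means more comparisons before an early-loaded command is reached.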

I know we have the "USR" function, but that only allows a single function with a single parameter. What I am proposing allows as many commands as desired, allows multiple commands from multiple authors, and does not break tokenization. The only downside is that a custom command will be bigger than a tokenized command.

Double bonus is that these will work both in code and in immediate mode with no extra effort.

EXTRA POINTS if we add 2 KERNAL routines, one to hook and one to unhook commands so the commands don't even need to know how to hook and unhook themselves and can be loaded and unloaded in any order.
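
One way to picture the hook/unhook pair is as a central registry owned by the system, so that no extension has to store or restore anyone else's vector. This is only a sketch in Python under that assumption; `CommandRegistry`, `hook`, `unhook` and `dispatch` are hypothetical names, not existing KERNAL routines.

```python
# Hypothetical registry standing in for a system-owned hook/unhook pair.

class CommandRegistry:
    def __init__(self):
        self._handlers = {}

    def hook(self, name, handler):
        """Register a handler for one command name."""
        self._handlers[name] = handler

    def unhook(self, name):
        """Remove a command; unknown names are ignored."""
        self._handlers.pop(name, None)

    def dispatch(self, name, *args):
        """Call the handler for name, or do nothing if none is registered."""
        handler = self._handlers.get(name)
        if handler is None:
            return None
        return handler(*args)

registry = CommandRegistry()
registry.hook("FIRST", lambda *args: "first ran")
registry.hook("SECOND", lambda *args: "second ran")

# Unloading the earlier extension does not disturb the later one,
# because no handler holds a pointer to another handler.
registry.unhook("FIRST")
after_unhook_first = registry.dispatch("FIRST")
after_unhook_second = registry.dispatch("SECOND")
```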

skaratso · Post by **skaratso** » Mon Jan 08, 2024 1:55 am

What you're proposing here sounds very similar to the "&" (ampersand) command in Applesoft BASIC on Apple II computers.

What & did was an immediate jump to $3F5, called the ampersand vector. The difference is that people writing extended commands would do their own command parsing using Applesoft BASIC's own command parser. This would, of course, require that certain entry points within BASIC be preserved, but it would be much simpler to implement.

TomXP411 · Post by **TomXP411** » Mon Jan 08, 2024 2:52 am

My thinking was something slightly more sophisticated, but not by much...

Basically, I was thinking about creating a custom escape token that would be "execute user defined token", which would then point to a user created token table somewhere else in memory.

The user token table would basically be two lists:
List 1: Address Table
The address of each user routine, starting at token $80. Each entry is: 1 byte bank, 2 byte address.
List 2: Names Table
A list of token names. Last character of name is $80+character (same as CBM table.)
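
As a rough illustration of the two lists, here is a Python sketch that builds and reads them. The layout follows the description above (3-byte address entries, names terminated by setting the high bit of the last character, tokens starting at $80), but the little-endian byte order of the address and all function names are assumptions made for the example.

```python
# Hypothetical encoding of the two user token tables described above.

FIRST_TOKEN = 0x80  # first user token value, as described in the post

def build_names_table(names):
    """Concatenate names, setting the high bit on each name's last character."""
    out = bytearray()
    for name in names:
        raw = name.encode("ascii")
        out += raw[:-1]
        out.append(raw[-1] | 0x80)
    return bytes(out)

def parse_names_table(table):
    """Split a names table back into a list of names."""
    names, current = [], bytearray()
    for byte in table:
        current.append(byte & 0x7F)
        if byte & 0x80:                 # high bit marks the end of a name
            names.append(current.decode("ascii"))
            current = bytearray()
    return names

def build_address_table(entries):
    """Each entry is 1 byte bank + 2 byte address (little-endian is assumed)."""
    out = bytearray()
    for bank, address in entries:
        out.append(bank)
        out += address.to_bytes(2, "little")
    return bytes(out)

def lookup(token, address_table):
    """Return (bank, address) for a user token value."""
    offset = (token - FIRST_TOKEN) * 3
    bank = address_table[offset]
    address = int.from_bytes(address_table[offset + 1:offset + 3], "little")
    return bank, address

names_table = build_names_table(["FOO", "BAR"])
address_table = build_address_table([(1, 0xA000), (1, 0xA100)])
bar_entry = lookup(0x81, address_table)   # second token, i.e. BAR
```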

skaratso · Post by **skaratso** » Mon Jan 08, 2024 4:50 am

I'm not super familiar with Commodore 64 or Commander X16 BASICs, as I grew up using Apple II machines, but I just received notice that my X16 will be shipping very soon. But, given the similarities between Applesoft and Commodore BASIC, is the CHRGET routine still at $B1 on the zero page?

In Applesoft, when the ampersand is encountered and it jumps to $3F5, all you have to do is put a JMP <addr> to your parser. The character or token immediately after the ampersand is pointed to by TXTPTR and in the accumulator, so you can do a CMP <whatever> and then branch accordingly. If the high bit is clear, it's interpreted as an ASCII character, but if the high bit is set it's interpreted as whatever command token corresponds to that value. So CMP #$41 would be the letter A, but CMP #$BA would be the PRINT token. So you could have &A or even &PRINT, for example. Some ampersand parsers would go further and do full on string comparisons, like &"FOOBAR" but those are more complex. Are some of the extended commands in X16 BASIC implemented with 16 bit tokens? That would make things a little more difficult.
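
The high-bit test described above can be sketched like this in Python. The two byte values ($41 for the letter A, $BA for the PRINT token) come from the post; the handler table and function names are hypothetical and only stand in for the CMP-and-branch sequence an ampersand parser would use.

```python
# Hypothetical model of an ampersand parser's first decision.

def classify(byte):
    """High bit clear: a plain character. High bit set: a BASIC token."""
    if byte & 0x80:
        return ("token", byte)
    return ("char", chr(byte))

# Illustrative dispatch table keyed on the byte that follows the ampersand.
HANDLERS = {
    0x41: "handler for &A",
    0xBA: "handler for &PRINT",
}

def dispatch(byte):
    """Pick a handler, or fall back to an error like an unmatched CMP chain."""
    return HANDLERS.get(byte, "syntax error")

letter_a = classify(0x41)
print_token = classify(0xBA)
```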

What you propose, though, could work a little better in terms of implementing a list of additional commands after the "escape code" or character, but it seems that the parsing after the command would still need to be done with internal BASIC routines, no?

TomXP411 · Post by **TomXP411** » Mon Jan 08, 2024 5:49 am

skaratso wrote: ↑Mon Jan 08, 2024 4:50 am What you propose, though, could work a little better in terms of implementing a list of additional commands after the "escape code" or character, but it seems that the parsing after the command would still need to be done with internal BASIC routines, no?

In this case, the BASIC interpreter would know to look at the user-loaded token tables and parse and tokenize user tokens, as well as ROM based tokens. Obviously, this requires extending BASIC to make it work, but it would allow for any number of additions, especially if you just need a few commands to optimize an existing program.

skaratso · Post by **skaratso** » Fri Jan 12, 2024 6:02 pm

Ok, so what you're proposing is more or less a "standard" way for "escape character" commands to be passed to the code that's interpreting them. If we say, follow Apple's convention and use the & as the escape character, then when the interpreter sees "& FOO" it'll look it up in the user token table and pass that to the appropriate routine.

The next question is, how will the user routine parse any arguments? Would the BASIC interpreter do it automatically and pass a list of pointers to each argument to the user routine (which could either limit the arguments to a certain format or create more complexity for you guys to allow maximum flexibility), or would the user routine be responsible for parsing, as it is done in Applesoft?

In Applesoft, ampersand commands just use internal Applesoft routines to parse all the arguments passed. None of them were "official" entry points, though, so if Apple had made substantial changes to BASIC any user routines would have needed to be updated. If all the X16 project is guaranteeing is an API (meaning that internal entry points might change from ROM version to ROM version), then if you go the route of letting user routines parse their own arguments with BASIC's own parser then you might want to create a jump table or some other standard way of finding those routines regardless of where they might move.

If you would like to see some sample code for Applesoft ampersand commands as an illustration of how it's done I have some I can send you. I also have a list of the common internal entry points used as well. With Michael Steil's C64 BASIC & KERNAL ROM Disassembly (https://www.pagetable.com/c64ref/c64disasm/) with multiple sets of comments from several sources, we should be able to map the Applesoft entry points to the appropriate ones in X16 BASIC.

TomXP411 · Post by **TomXP411** » Fri Jan 12, 2024 7:01 pm

skaratso wrote: ↑Fri Jan 12, 2024 6:02 pm Ok, so what you're proposing is more or less a "standard" way for "escape character" commands to be passed to the code that's interpreting them. If we say, follow Apple's convention and use the & as the escape character, then when the interpreter sees "& FOO" it'll look it up in the user token table and pass that to the appropriate routine.

To be clear, I'm NOT proposing a printable escape character, like some BASIC extensions use. It's just a 2 byte token, with the first byte being one designated for user tokens. So if you wrote a command named FOO, it would just be written as FOO in the listing, but internally, BASIC would see it as something like $FF $85.

This method makes FOO a first-class command, just like PRINT or POKE.
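
To make the "$FF $85" example concrete, here is a small Python sketch of how a two-byte user token could be produced and listed back. The $FF prefix and the $85 value are just the "something like" figures from the post, the names before FOO are invented placeholders so that FOO lands on $85, and ordinary ROM tokens are ignored to keep the example short.

```python
# Hypothetical two-byte user tokens: a prefix byte followed by a table index.

USER_PREFIX = 0xFF        # assumed prefix byte, taken from the example in the post
FIRST_USER_TOKEN = 0x80   # assumed value of the first user token

# Placeholder names; FOO is sixth so that it encodes as $85.
USER_NAMES = ["CMD0", "CMD1", "CMD2", "CMD3", "CMD4", "FOO"]

def tokenize_keyword(word):
    """Encode a user keyword as prefix byte + token byte."""
    index = USER_NAMES.index(word)
    return bytes([USER_PREFIX, FIRST_USER_TOKEN + index])

def list_line(data):
    """Expand user tokens back into their names; other bytes are plain text."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == USER_PREFIX and i + 1 < len(data):
            out.append(USER_NAMES[data[i + 1] - FIRST_USER_TOKEN])
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)

foo_token = tokenize_keyword("FOO")
listed = list_line(foo_token + b" 1,2")
```

The listing shows plain FOO with no escape character, which is the point being made: the prefix byte exists only in the stored program.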

skaratso wrote: ↑Fri Jan 12, 2024 6:02 pm The next question is, how will the user routine parse any arguments? Would the BASIC interpreter do it automatically and pass a list of pointers to each argument to the user routine (which could either limit the arguments to a certain format or create more complexity for you guys to allow maximum flexibility), or would the user routine be responsible for parsing, as it is done in Applesoft?

You'd just do the same thing BASIC does. There are functions to read the next parameter in the line buffer, based on what you're expecting. So you can read an integer with GETNUM, a byte value with GETBYT, an address with GETADR, and so on.

So basically, each command is responsible for its own parsing, through the use of the helper functions.
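
A toy Python model of that division of labour: the interpreter only supplies helpers, and the command itself decides what to read and in what order. `LineReader`, `get_byte`, `get_address` and `expect_comma` are hypothetical stand-ins, not the ROM routines, and the sketch handles only unsigned decimal literals with no spaces.

```python
# Hypothetical helpers that a user command would call to parse its own arguments.

class LineReader:
    """Tracks a position in the rest of the BASIC line, like a text pointer."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def _number(self, limit):
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError("number expected")
        value = int(self.text[start:self.pos])
        if value > limit:
            raise ValueError("illegal quantity")
        return value

    def get_byte(self):
        """Read a value in the range 0-255."""
        return self._number(0xFF)

    def get_address(self):
        """Read a value in the range 0-65535."""
        return self._number(0xFFFF)

    def expect_comma(self):
        if self.pos >= len(self.text) or self.text[self.pos] != ",":
            raise SyntaxError("comma expected")
        self.pos += 1

def cmd_foo(reader):
    """An imaginary FOO address,value command that parses its own arguments."""
    address = reader.get_address()
    reader.expect_comma()
    value = reader.get_byte()
    return address, value

parsed = cmd_foo(LineReader("4096,255"))
```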

skaratso · Post by **skaratso** » Fri Jan 12, 2024 9:42 pm

Ok, so the command parsing itself is done the same way, essentially, with the same routines that BASIC uses. But user commands get their own entries as two byte tokens but are essentially first-class commands.

That does raise the question of how the end user will distinguish between "native" commands and "user" commands when looking at a listing.

I presume that if the commands aren't loaded and such a program is listed, the user commands will be replaced with some kind of "placeholder" text? Would it be possible to somehow highlight the user commands in a listing so that they can easily be distinguished? Making user commands first-class citizens is a much nicer idea than using an "escape" character. But the one advantage the & provides in Applesoft is that it is very easy to tell the difference between "native" and user commands.

I look forward to seeing this. I've written a handful of ampersand commands for Applesoft BASIC in the past and I'd love to try my hand at porting them over to the X16.

TomXP411 · Post by **TomXP411** » Fri Jan 12, 2024 11:03 pm

skaratso wrote: ↑Fri Jan 12, 2024 9:42 pm That does raise the question of how the end user will distinguish between "native" commands and "user" commands when looking at a listing.

I can't think of a reason that's necessary. However, if someone wanted to write their own LISTer, they could do that.

The only obvious issue I can see is that if the BASIC extender is not present, then tokenized programs with extended tokens won't list or run properly. Right now, if you have an invalid character in a BASIC line, you'll either get a Syntax Error, or you'll get a repeat of a different token.

There's not really a good way to solve that with tokenized BASIC, and running non-tokenized code slows things down a lot... which is why BASIC programs are tokenized to begin with (well, aside from saving 2-4 bytes per instruction.)

Anyway, it's just an idea. You'd have to get someone to actually implement it to make it useful, and I don't think anyone on the current dev team has the bandwidth for that. I'm playing around with a few ideas myself, but I'm also a long ways away from being able to write BASIC extenders, yet.

Martin Schmalenbach · Post by **Martin Schmalenbach** » Sat Jun 22, 2024 5:12 pm

TomXP411 wrote: ↑Fri Jan 12, 2024 11:03 pm
skaratso wrote: ↑Fri Jan 12, 2024 9:42 pm That does raise the question of how the end user will distinguish between "native" commands and "user" commands when looking at a listing.
I can't think of a reason that's necessary. However, if someone wanted to write their own LISTer, they could do that.

The only obvious issue I can see is that if the BASIC extender is not present, then tokenized programs with extended tokens won't list or run properly. Right now, if you have an invalid character in a BASIC line, you'll either get a Syntax Error, or you'll get a repeat of a different token.

There's not really a good way to solve that with tokenized BASIC, and running non-tokenized code slows things down a lot... which is why BASIC programs are tokenized to begin with (well, aside from saving 2-4 bytes per instruction.)

Anyway, it's just an idea. You'd have to get someone to actually implement it to make it useful, and I don't think anyone on the current dev team has the bandwidth for that. I'm playing around with a few ideas myself, but I'm also a long ways away from being able to write BASIC extenders, yet.

A great idea - I would very much welcome something like this...

I have played around with something very similar in the C64, drawing on the contents of a most excellent book for the C64 from its heady days of the early-mid 80s, called "Commodore 64 Machine Code Master". It presents the user with a 2-pass symbolic assembler/disassembler and simple monitor written in BASIC, and then shows you how to add additional commands to BASIC using the assembler etc, starting off with trapping & extending the token table and the data needed for the LISTer to also list out the additional tokens, doing this for functions as well as statements. Lots of fun!

It relied on the fact that the C64 ROMs could be copied into their underlying RAM and then disabled, so the user program could modify some of the JMPs and JSRs and thus intercept what would otherwise be some form of syntax error and go from there.

This isn't possible on the X16, so some kind of vector in RAM is needed to be built into BASIC to facilitate this. If nobody touches that vector everything works as normal.

By the way, it is possible to do it via the USR() function.

The USR() function accepts only a single parameter, which can be a numeric OR A STRING expression... your routine simply needs to know which and then go from there. ALSO, your own USR routine can look for additional parameters placed after the closing ")" of the USR function, and by using BASIC's own parameter parsing routines, which are quite straightforward, the user code can handle any number of, and any type of, parameter, including arbitrarily complex ones.

And it is possible for the USR() function to return a string result too. Almost as easy as returning a numeric one.

I'd be happy to work on this - I think for development work I can simulate this new vector and get the functionality working and proven, and then perhaps make a copy of the ROM with the appropriate JMP/JSR tweaked to go via this vector in RAM to demonstrate it working as required - probably best done in the emulator for now!

Any additional thoughts, things for me to take into account or corrections to my assumptions etc above?
