When my mood swings to machine learning, I think about implementing things on the X16.
I SHOULD be implementing cellular automata (CA) or genetic algorithms (GA) first.
CA are fun and easy. They work on a grid of byte values, usually displayed on the screen as pixels or characters. A CA program iterates over the grid (typically double-buffered), using neighborhood operations to adjust the value of each cell; when that's done, the grid is re-rendered. Repeat. Very satisfying. Probably quite doable in BASIC, in fact.
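A minimal sketch of one buffered pass, in C, assuming a simple two-state "4 or more live neighbors" rule and a small fixed grid (the rule, grid size, and names here are just placeholders):

#include <string.h>

#define W 40
#define H 30

static unsigned char grid[H][W];    /* current generation        */
static unsigned char buffer[H][W];  /* next generation (buffer)  */

/* One CA step: read neighborhoods from grid, write results to buffer,
   then copy the buffer back so the grid can be re-rendered. */
void ca_step(void)
{
    int x, y, dx, dy, count;
    for (y = 0; y < H; ++y) {
        for (x = 0; x < W; ++x) {
            count = 0;
            for (dy = -1; dy <= 1; ++dy)
                for (dx = -1; dx <= 1; ++dx)
                    if (dx || dy)
                        count += grid[(y + dy + H) % H][(x + dx + W) % W];
            /* placeholder rule: cell turns on if 4+ of 8 neighbors are on */
            buffer[y][x] = (count >= 4) ? 1 : 0;
        }
    }
    memcpy(grid, buffer, sizeof(grid));
}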
GA are more subtle but still fun. The program builds a population of random strings (or arrays of bytes), which represent behavioral parameters to a script; they are therefore somewhat "genetic". The population then interacts according to the script, and the result is evaluated by the program -- in short, the genetic strings are ranked by fitness from best to worst. The bottom half (or more) is discarded. The survivors are crossed at random points to generate a new population, which is then put through the process again. Iteration continues until a fitness threshold is reached.
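A rough sketch of one generation along those lines, in C, assuming 8-byte genomes, a caller-supplied fitness function, and simple one-point crossover (the names, sizes, and the evaluate() hook are all made up for illustration):

#include <stdlib.h>
#include <string.h>

#define POP   32   /* population size (even)          */
#define GENES  8   /* bytes of behavioral parameters  */

static unsigned char pop[POP][GENES];
static int score[POP];

/* Fitness is whatever the simulation script decides; assumed to exist. */
extern int evaluate(const unsigned char *genes);

void ga_generation(void)
{
    int i, j, cut;
    unsigned char tmp[GENES];

    /* 1. Evaluate everyone. */
    for (i = 0; i < POP; ++i)
        score[i] = evaluate(pop[i]);

    /* 2. Rank best to worst (simple exchange sort). */
    for (i = 0; i < POP - 1; ++i)
        for (j = i + 1; j < POP; ++j)
            if (score[j] > score[i]) {
                int s = score[i]; score[i] = score[j]; score[j] = s;
                memcpy(tmp, pop[i], GENES);
                memcpy(pop[i], pop[j], GENES);
                memcpy(pop[j], tmp, GENES);
            }

    /* 3. Discard the bottom half and rebuild it by crossing random survivors. */
    for (i = POP / 2; i < POP; ++i) {
        const unsigned char *a = pop[rand() % (POP / 2)];
        const unsigned char *b = pop[rand() % (POP / 2)];
        cut = rand() % GENES;               /* random crossover point */
        memcpy(pop[i], a, cut);
        memcpy(pop[i] + cut, b + cut, GENES - cut);
    }
}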
However, I jump straight to neural networks, because I am a glutton for punishment. One of the punishments this sets before me is to implement a "minifloat" in C -- a floating point library that uses the unsigned char as the essential type, with a 1.4.3 or 1.5.2 structure. Maybe more on that later.
Anyway, as I get bogged down, I search the web for people with as little sense as I have, and I found this: https://www.ekkono.ai/running-ekkonos-e ... modore-64/
So if the C64 can do it, then the X16 can do it better. I am reassured, a little.
Alife and Machine Learning on the X16
Re: Alife and Machine Learning on the X16
As an initial diversion I coded up a starter CA for myself in BASIC. I really need to move this to C to get some speedup.
Re: Alife and Machine Learning on the X16
Here's my first attempt at a "real" cellular automaton: the Hodgepodge machine, from Scientific American around 1989 or so.
Written in C with CC65. Faster than BASIC, slower than assembly.
Re: Alife and Machine Learning on the X16
Oh, I remember back in the early days, when the emulator first became available to the public, there were a couple of Conway's Life demos...
Re: Alife and Machine Learning on the X16
rje wrote: ↑Thu Jun 08, 2023 7:42 pm
However, I jump straight to neural networks, because I am a glutton for punishment. One of the punishments this sets before me is to implement a "minifloat" in C -- a floating point library that uses the unsigned char as the essential type, with a 1.4.3 or 1.5.2 structure. Maybe more on that later.

This is actually what I wrote FASTMATH for. I started working on fuzzy logic, neural networks, and genetic algorithms back in the early 90s, and have been working on a language that makes all three easy on 8-bit machines ever since. I call it FUNGAL (FUzzy logic Neural network Genetic Algorithm Language).
It'll probably take me a few more years before the language is ready for others to use. I've had to change the way neural networks work a bit. The key difference is limiting the neurons to a maximum of two inputs, which effectively turns them into adjustable fuzzy logic gates.
Re: Alife and Machine Learning on the X16
I'd suggest breaking it up into semi-independent but inter-compatible modules. In this way, you could work on them independently and release them separately. This has the benefits of
1. accomplishment
2. reducing complexity
3. more likely to be bottom-up code projects
But you already know this, because if you've been working on it on and off since the 90s then you're approximately my age and therefore know all this already. So I'll shut up about that.
Re: Alife and Machine Learning on the X16
Minifloats
I would prefer to steal someone else's minifloat code (and have been looking at examples lately); unfortunately minifloats on the 6502 really ought to be in assembly language, and I'm not too keen on doing that. Maybe I can approach it by writing very elementary and memory-focused C.
Thankfully, floating point math is well understood, and minifloats have been around for a while as well.
The minifloats I think I'm most interested in are 1.5.2,-28. As you know, that means one sign bit, a 5-bit exponent, and a 2-bit mantissa. So the mantissa field holds values 0 to 3, with an exponent ranging from 2^-28 up to 2^3, I think. Since neural net weights are values in [0, 1], they need to represent very small floats, but not large ones so much, and apparently their precision is not as important as their exponent range. (1.4.3 minifloats are interesting as well.)
If I recall correctly, anyway.
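If it does end up in C, a 1.5.2 layout with a bias of 28 could be unpacked for inspection roughly like this (the field layout, the bias, and the treatment of an all-zero exponent as a denormal are my guesses, not a settled design):

#include <math.h>

/* Unpack a 1.5.2 minifloat (bias 28) into a host float for checking.
   Assumed bit layout: s eeeee mm */
float mf_to_float(unsigned char m)
{
    int sign = (m >> 7) & 0x01;
    int exp  = (m >> 2) & 0x1F;   /* 5-bit exponent field, 0..31 */
    int man  =  m       & 0x03;   /* 2-bit mantissa field, 0..3  */
    float v;

    if (exp == 0)                 /* treat exponent 0 as denormal */
        v = (man / 4.0f) * ldexpf(1.0f, 1 - 28);
    else                          /* normal: implicit leading 1   */
        v = (1.0f + man / 4.0f) * ldexpf(1.0f, exp - 28);

    return sign ? -v : v;
}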
The operations done on weights, then, are multiplication, addition, and division. Each perceptron sums up the product of each input weight times its input value, then calculates a threshold function. Division is needed for the threshold function. So that's what the minifloats need to support.
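In other words, the core of a perceptron is just multiply-accumulate plus one threshold call. A sketch of that loop, with the minifloat primitives (mf_mul, mf_add, mf_div, mf_abs) left as assumed, not-yet-written routines, and the threshold approximated as x / (1 + |x|) just to show where the division comes in:

typedef unsigned char mf;        /* 1.5.2 minifloat packed in a byte */

/* Assumed minifloat primitives -- not implemented here. */
extern mf mf_mul(mf a, mf b);
extern mf mf_add(mf a, mf b);
extern mf mf_div(mf a, mf b);
extern mf mf_abs(mf a);
extern const mf mf_one;          /* encoding of 1.0 */

/* One perceptron: sum(w[i] * x[i]), then a threshold function. */
mf perceptron(const mf *w, const mf *x, unsigned char n)
{
    mf acc = 0;                  /* assuming 0x00 encodes 0.0 */
    unsigned char i;
    for (i = 0; i < n; ++i)
        acc = mf_add(acc, mf_mul(w[i], x[i]));
    return mf_div(acc, mf_add(mf_one, mf_abs(acc)));
}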
Re: Alife and Machine Learning on the X16
There's more. If you have more than two inputs to a neuron, then you need to normalize the weights. That means summing the squares of the weights, taking the square root, inverting, and then multiplying the result by each weight to give a relative weighting.
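For scale, that per-neuron normalization looks something like this in plain C with host floats (just to show the amount of work involved, not how FASTMATH does it):

#include <math.h>

/* Normalize n weights so the sum of their squares is 1:
   scale = 1 / sqrt(sum(w[i]^2)), then w[i] *= scale. */
void normalize_weights(float *w, int n)
{
    float sumsq = 0.0f;
    float scale;
    int i;
    for (i = 0; i < n; ++i)
        sumsq += w[i] * w[i];
    if (sumsq > 0.0f) {
        scale = 1.0f / sqrtf(sumsq);
        for (i = 0; i < n; ++i)
            w[i] *= scale;
    }
}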
That's a lot of math. That's why I limited my neurons to only two inputs. With only two inputs, you can have a single number - an angle - to represent the relative weight. A lookup table can convert that angle into two weights whose absolute values total 1; the first input gets multiplied by
cos(theta)/(abs(sin(theta))+abs(cos(theta)))
The second input is multiplied by
sin(theta)/(abs(sin(theta))+abs(cos(theta)))
For any value of theta, the sum of the absolute values of the weights is 1.
These two lookup tables have considerable overlap. One could do this as two separate tables, or two overlapping tables, or just a single sin table by adding pi/2 to theta for the cosine part.
This is why all values in FASTMATH are 1.7 fixed point, in the range from -1 to 1, and why the angles are all in bigrees (256 bigrees = 360 degrees).
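A host-side sketch of building that single table, assuming 256 bigrees per circle and 1.7 fixed-point entries in a signed byte (roughly -1..1); the table size and rounding are my own choices:

#include <math.h>

/* table[t] = sin(t)/(|sin t| + |cos t|) in 1.7 fixed point,
   with t in bigrees (256 bigrees = 360 degrees).
   The cosine weight is the same table read at (t + 64) & 0xFF. */
signed char weight_table[256];

void build_weight_table(void)
{
    int t;
    double a, w;
    for (t = 0; t < 256; ++t) {
        a = t * (6.283185307179586 / 256.0);   /* bigrees -> radians */
        w = sin(a) / (fabs(sin(a)) + fabs(cos(a)));
        weight_table[t] = (signed char)floor(w * 127.0 + 0.5);
    }
}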
There also needs to be a nonlinearity in there somewhere. Neural networks have been using the logistic function on the output to do this, but this could be applied at the inputs instead of the output.
There are actually a number of ways to apply nonlinearity, to either the inputs or the outputs, but probably the simplest is to use a Max function (equivalent to a fuzzy OR) or a Min function (equivalent to a fuzzy AND) on the two weighted inputs.
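As a sketch of how cheap that ends up being, here is a two-input neuron in 1.7 fixed point using the angle table from above and a min/max nonlinearity (fix7_mul's right shift assumes arithmetic shifting, and the whole thing is illustrative rather than how FASTMATH actually does it):

typedef signed char fix7;             /* 1.7 fixed point, roughly -1..1 */

extern signed char weight_table[256]; /* sin-based table from the sketch above */

/* Multiply two 1.7 values: widen to 16+ bits, shift back down by 7. */
static fix7 fix7_mul(fix7 a, fix7 b)
{
    return (fix7)(((int)a * (int)b) >> 7);
}

/* Two-input neuron: weight each input via the angle theta (in bigrees),
   then apply a fuzzy AND (min) -- use max instead for a fuzzy OR. */
fix7 neuron_and(fix7 x1, fix7 x2, unsigned char theta)
{
    fix7 a = fix7_mul(x1, weight_table[(theta + 64) & 0xFF]); /* cos part */
    fix7 b = fix7_mul(x2, weight_table[theta]);               /* sin part */
    return (a < b) ? a : b;
}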
Re: Alife and Machine Learning on the X16
rje wrote: ↑Mon Jun 12, 2023 3:37 am
I'd suggest breaking it up into semi-independent but inter-compatible modules. In this way, you could work on them independently and release them separately. This has the benefits of
1. accomplishment
2. reducing complexity
3. more likely to be bottom-up code projects
But you already know this, because if you've been working on it on and off since the 90s then you're approximately my age and therefore know all this already. So I'll shut up about that.

You're right, but just because I know something doesn't mean that others reading this do, so it's helpful to point all this out.
Re: Alife and Machine Learning on the X16
Ed Minchau wrote: ↑Thu Jun 15, 2023 12:27 am
There's more. If you have more than two inputs to a neuron, then you need to normalize the weights. That means summing the squares of the weights, taking the square root, inverting, and then multiplying the result by each weight to give a relative weighting.

But that looks like a new technique (2016?) that NNs until very recently didn't bother with. I imagine it emerged because computing power allowed NNs to become better at doing their thing on modern computers.
Since it's optional, and since it's computationally expensive, I'm not worried about weight normalization for now -- but I will recheck my notes all the same.