Proposal for a hardware-agnostic math accel API
Re: Proposal for a hardware-agnostic math accel API
A math coprocessor...
- ahenry3068
- Posts: 1136
- Joined: Tue Apr 04, 2023 9:57 pm
Re: Proposal for a hardware-agnostic math accel API
Of course! And it's great! Though it is limited to only certain math functions (I think basically just multiply?). Of note, I'm not trying to compete with FX, since it's built in and obviously nails the cases that are good for video. VERA is also, appropriately so, well scrutinized, since every X16 gets one and putting in sketchy code would be bad for every single user. It's also a (very) limited resource (noting FX eats up quite a few LUTs, making certain things, like the XOR addition for the triangle wave, harder to justify).
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.
FX has a specific goal. I was looking more at a general Math API / APU sort of approach. The main reason was to have something to work on that wasn't just hello world, as a means to help me learn both FPGA stuff and RISCV assembly. Those are the main goals. I think there are some cases where the APU could be useful. Enough to put a card in an X16? Perhaps not, but I think it's fun to work on just the same.
Anyways I took your comment as confrontational but that may not have been your intent. Either way though, I'mna keep at it as there's value for me in the journey as a minimum. As a maximum, it could bring about some interesting thoughts and ideas for future hardware or software.
Author of Dreamtracker (https://www.dreamtracker.org/)
Check Out My Band: https://music.victimcache.com/
- ahenry3068
- Posts: 1136
- Joined: Tue Apr 04, 2023 9:57 pm
Re: Proposal for a hardware-agnostic math accel API
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Anyways I took your comment as confrontational but that may not have been your intent. Either way though, I'mna keep at it as there's value for me in the journey as a minimum. As a maximum, it could bring about some interesting thoughts and ideas for future hardware or software.
It certainly wasn't my intent.
- ahenry3068
- Posts: 1136
- Joined: Tue Apr 04, 2023 9:57 pm
Re: Proposal for a hardware-agnostic math accel API
I'm all for whatever capabilities we can bring to the system. I was just saying we kind of have a limited math co-processor right now.
Re: Proposal for a hardware-agnostic math accel API
My apologies, I didn't get the notification that you replied, doh.
We do, and honestly I'm not sure how much more useful this may end up being. Even with SIMD, there's still the time it takes to load the registers and move things around, though if I don't do hidden registers (striding) then the data doesn't have to move around as much, depending on the use case. Division is a big one I think might be useful. It really depends on how fast the RISCV or discrete FPGA logic arrangements end up being.
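To make the load/move overhead concrete, here is a rough back-of-the-envelope model. It is only a sketch: the function names are made up for illustration, and it assumes every operand and result byte crosses the bus with one absolute-addressed `LDA` and one `STA` (4 cycles each on a 65C02). How long the card itself takes is left as a parameter, since that isn't known yet.

```python
# Rough model of the bus traffic cost for offloading one operation.
# LDA/STA absolute are 4 cycles each on a 65C02; all names here are hypothetical.
LDA_ABS = 4  # cycles to load a byte from an absolute address
STA_ABS = 4  # cycles to store a byte to an absolute address


def offload_cycles(operand_bytes, result_bytes, wait_cycles=0):
    """Estimate CPU cycles to ship operands to the card and read results back."""
    per_byte = LDA_ABS + STA_ABS  # each byte is loaded once and stored once
    return (operand_bytes + result_bytes) * per_byte + wait_cycles


def worth_offloading(software_cycles, operand_bytes, result_bytes, wait_cycles=0):
    """True when the round trip is cheaper than doing the math on the CPU."""
    return offload_cycles(operand_bytes, result_bytes, wait_cycles) < software_cycles


# 16-bit divide: 4 operand bytes in, 4 bytes out (quotient and remainder).
print(offload_cycles(4, 4))
```

Under those assumptions the round trip alone costs a fixed number of cycles per byte, so offloading only pays off for operations whose software version is clearly more expensive than that, which is why something like division looks like a better candidate than addition. If intermediate results can stay on the card, `result_bytes` shrinks and the overhead drops with it.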
As a general update, the Upduino, the bus transceivers, and Kevin (Texelec)'s prototype expansion cards came in (plus some pin headers and random accessories I'll need). I likely won't have a ton of time to work on it for a bit, but I did program the FPGA with the FemtoRV core and mess around with demo LED programs.
Once I get it on the expansion card, I think the next test is to light up an LED when writing to the configured MMIO address and start building it up from there. FemtoRV includes some serial code, so I can output debugging over USB, and that's my plan for how to figure out what's going on inside. That'll just take quite a bit more glue than the example program I made has.
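As a purely illustrative model of that first test (not the actual FPGA logic), the behaviour could be sketched like this. The address `0x9F60` and the "bit 0 drives the LED" layout are assumptions picked for the example, and `LedCard` is a hypothetical name.

```python
# Hypothetical model of the card's address decode. The address and the
# register layout are assumptions, not the actual design.
LED_REG = 0x9F60  # assumed MMIO address for the LED register


class LedCard:
    """Models a card that latches bit 0 of any write to its LED register."""

    def __init__(self, led_reg=LED_REG):
        self.led_reg = led_reg
        self.led_on = False

    def bus_write(self, addr, value):
        # Only respond when the address matches the configured register.
        if addr == self.led_reg:
            self.led_on = bool(value & 0x01)


card = LedCard()
card.bus_write(0x9F61, 1)   # wrong address: ignored
card.bus_write(LED_REG, 1)  # matching address: LED turns on
print(card.led_on)
```

The real version would be a few lines of Verilog comparing the address bus against the configured value on a write, but the pass/fail condition is the same: nothing happens on other addresses, and the LED follows the written bit on the matching one.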
Author of Dreamtracker (https://www.dreamtracker.org/)
Check Out My Band: https://music.victimcache.com/
Re: Proposal for a hardware-agnostic math accel API
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed.
I actually think it would be super-useful on any platform.
You seem to be proposing a standard similar to the tube interface on the Acorn BBC 8-bit micros. I'd be all for it.
Re: Proposal for a hardware-agnostic math accel API
Ah, I was unaware of Tube, but in its final form, yep, it would essentially be similar. The X16 bus includes RDY, which allows for suspending the main CPU as well. That can make for interesting solutions. That's a ways out, though, and the APU idea can be a workable stepping stone. I'll be sharing how I wired everything up, plus the schematics if I end up making a purpose-built card, so others can use this or a similar design to make FPGA-based solutions that do different things as well. The X16 bus seems like it can be a little particular, so it'll be interesting to see how things go once I start wiring things to the bus directly.
Author of Dreamtracker (https://www.dreamtracker.org/)
Check Out My Band: https://music.victimcache.com/
Re: Proposal for a hardware-agnostic math accel API
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.
Well, yes floating point on the X16 is pointless....but I still do it and I'll take any speed improvement I can get!
Re: Proposal for a hardware-agnostic math accel API
yock1960 wrote: ↑Thu Mar 07, 2024 12:00 am
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.
Well, yes floating point on the X16 is pointless....but I still do it and I'll take any speed improvement I can get!
That would be easy enough to do, I think, though it would need a larger FPGA. The FemtoRV's largest core (petitbateau, which implements RV32IMFC) supports single precision. I have no idea how fast it would be, but surely faster than BASIC. Alas, I haven't found an open source core that supports the vector extension yet, at least not one that isn't fairly specialized. I read up on the RISCV vector extension while on vacation (weird thing to read on vacation maybe, but I enjoyed it) and it has some clever design principles that would work well with what I was trying to do with the SIMD-style API calls.
Their approach limits how many instructions need to be added to the ISA to accommodate larger or smaller vectors. None of this would be visible on the X16 side, so using the vector extension might not be needed, but I nonetheless found it interesting and it may change how I implement SIMD in the API.
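A rough sketch of that idea, as the vector extension is commonly described: the code asks the hardware how many elements it can handle in this pass (the `vsetvli` family of instructions) and loops until the work is done, so the same code runs on wide or narrow vector units. The Python below only models that loop; `VLMAX`, `vsetvl`, and `vector_add` are illustrative names, and taking `min(requested, VLMAX)` is a simplification of what real hardware is allowed to return.

```python
# Model of a vector-length-agnostic ("strip-mined") loop.
# VLMAX stands in for whatever the hardware supports; the value is an assumption.
VLMAX = 4


def vsetvl(requested, vlmax=VLMAX):
    """Return how many elements to process this pass (simplified)."""
    return min(requested, vlmax)


def vector_add(a, b, vlmax=VLMAX):
    """Add two equal-length lists in chunks of at most vlmax elements."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    out = []
    i = 0
    remaining = len(a)
    while remaining > 0:
        vl = vsetvl(remaining, vlmax)  # hardware picks the chunk size
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        remaining -= vl
    return out


print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

The result is the same whatever `vlmax` is; only the number of passes changes. An API on the X16 side could borrow the same shape: the caller says how many elements it has, and the card decides how many it takes per step.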
Author of Dreamtracker (https://www.dreamtracker.org/)
Check Out My Band: https://music.victimcache.com/