Reconsidering the 65816 (W65C816S)

Posted: Wed Aug 18, 2021 7:51 pm
by rje


4 hours ago, paulscottrobson said:




RJE wrote about doing a P-System interpreter and I know Bruce was looking at the 6502 Pascal Compilers that are out there. If the 6502 was 5-10 times faster than it is now, say the speed of the Mega65's core, the problems about speed, data sizes and address space simply go away. 



And there's the "solution" -- if I want an interpreted HLL badly enough, I'll do it on the Mega65.

 


Reconsidering the 65816 (W65C816S)

Posted: Wed Aug 18, 2021 9:50 pm
by Wavicle


18 hours ago, BruceMcF said:




However, what he said there was why ... using the 65816 in 24bit address mode was using a CPLD to dereference the address, while going to the 64K address map with banking allowed the chip select to be done with glue logic. And the 64K address map could be done in the VIC-20 style that Dave preferred, where every part of the memory map only does one thing.



Of course, we don't get a blow-by-blow account of development, especially false starts and dead ends, so we don't know how much of the description of development hell in the second video was just general experience and how much was experience with the first generation of the design.



However, once you have a 64K address map, you can build the board with a 65C02 to have one less point of difference between the original Commodore Kernel and BASIC ROM code that was the starting point, so you have one fewer problem area in trying to get the board to boot up at full 8MHz speed. Then if the board can boot with the 65C02, it's possible to see if it will boot with the 65C816.



At this point, it doesn't boot with the 65C816, seemingly because of some problem in the Vera initialization, but the board has been designed to be electrically compatible with putting a 65C816 in the 65C02 socket.



I'm not sure that's what he said; he said it "requires a lot of external circuitry to decode this and split it out" (referring to demuxing the signals), but demultiplexing the top address bits from the data bits requires two 7400 series chips, and an inverter which might require one additional chip (no additional chip if you have an extra NAND gate or inverter lying around). This is my demux circuit on the breadboard: [attached image]



The top component is a latch and the bottom is a bus transceiver (the clock and RW lines are not here because I'm in the middle of a layout change). The green wires are address bus, blue wires data. You can see the data lines go from the CPU to the latch and the three wires on the right of the latch are BA0-BA2 (or A16-A18 if you prefer). If the concern is with the low 16 addresses going to the chip select circuit when BA0-BA7 != 0, you can fix that with another 74x245 on A8-A15 and a wired OR of BA0-BA7 connected to OE#.

A GAL/SPLD (I don't think this would need a CPLD) certainly decreases the component count and wiring complexity, but even with 74xx logic chips, I don't think it's going to increase the complexity much above the ten or so 74xx parts already on the board for bank and chip select generation. Hence why I asked the question.
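
For anyone who prefers to see the demux behaviour written out, here is a rough behavioural sketch in Python of a latch plus transceiver pair. It assumes the usual 65816 arrangement where D0-D7 carry the bank address while PHI2 is low and data while PHI2 is high; the class and function names are made up for illustration, and it models logic only, not timing.

```python
from typing import Optional, Tuple


class BankDemux:
    """Behavioural model: transparent latch + bus transceiver on D0-D7."""

    def __init__(self) -> None:
        self.bank = 0  # value held by the latch (BA0-BA7)

    def clock(self, phi2: bool, d_pins: int) -> Tuple[int, Optional[int]]:
        """Return (latched bank, byte passed by the transceiver or None)."""
        if not phi2:
            # PHI2 low: the latch is transparent (its enable is NOT PHI2,
            # hence the inverter) and the transceiver is disabled.
            self.bank = d_pins & 0xFF
            return self.bank, None
        # PHI2 high: the latch holds the bank, the transceiver passes data.
        return self.bank, d_pins & 0xFF


def full_address(bank: int, addr16: int) -> int:
    """Combine the latched bank with A0-A15 into a 24-bit address."""
    return ((bank & 0xFF) << 16) | (addr16 & 0xFFFF)


demux = BankDemux()
low_phase = demux.clock(False, 0x05)   # bank byte appears on D0-D7
high_phase = demux.clock(True, 0xEA)   # data byte appears on D0-D7
address = full_address(high_phase[0], 0x1234)
```

Only BA0-BA2 are wired up on the breadboard described above; the sketch keeps all eight bank bits for simplicity.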


Reconsidering the 65816 (W65C816S)

Posted: Wed Aug 18, 2021 10:19 pm
by Wavicle


6 hours ago, paulscottrobson said:




There are availability arguments about the 65816, it's not as popular as the 65C02 and it could go out of usage.



 



Has WDC given hints about this? My understanding is that most of their money comes from licensing the design and I suspect that their production of these parts exists primarily so that customers can prototype designs with an inexpensive part before committing to a core license.

To be fair, if I'm reading the date codes correctly, my brand new W65C816S purchased from Mouser in April was manufactured in 2010 while my brand new W65C02S purchased at the same time was manufactured in 2019. That would seem to suggest that 6502s move significantly more volume.


Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 12:01 am
by BruceMcF


1 hour ago, Wavicle said:




A GAL/SPLD (I don't think this would need a CPLD) certainly decreases the component count and wiring complexity, but even with 74xx logic chips, I don't think it's going to increase the complexity much above the ten or so 74xx parts already on the board for bank and chip select generation. Hence why I asked the question.



The design of the board he describes seems to be using a CPLD or FPGA for that, at least by the time that "memory map version 1" was dropped for the simpler "memory map version 2".

Of course, at the time, as at present, they were not talking about what was going on in detail, but they were at least referring to issues with bus timing, so it could as easily be a timing issue as a complexity issue ... the in-production, through-hole static RAM they are using has pretty tight timing for chip select partway through an 8MHz clock cycle, and pulling the chip select logic into a single PLD speeds up the timing of the assert. To put the chip selects and A16-A23 out through a single PLD requires more than the 10 output pins available on a 22V10, hence a CPLD, or doing it as one of the functions of an FPGA.

(In the C256 Foenix, that's done through a bus master FPGA, which AFAIR in the latest design has been integrated with the audio master FPGA.)


Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 3:06 am
by Wavicle


1 hour ago, BruceMcF said:




The design of the board he describes seems to be using a CPLD or FPGA for that, at least by the time that "memory map version 1" was dropped for the simpler "memory map version 2".



Of course, at the time, as at present, they were not talking about what was going on in detail, but they were at least referring to issues with bus timing, so it could as easily be a timing issue as a complexity issue ... the in-production, through-hole static RAM they are using has pretty tight timing for chip select partway through an 8MHz clock cycle, and pulling the chip select logic into a single PLD speeds up the timing of the assert. To put the chip selects and A16-A23 out through a single PLD requires more than the 10 output pins available on a 22V10, hence a CPLD, or doing it as one of the functions of an FPGA.



(In the C256 Foenix, that's done through a bus master FPGA, which AFAIR in the latest design has been integrated with the audio master FPGA.)



I think they are going to have a lot of difficulty pulling off 8MHz, even exploiting early address bus stabilization, if the parts shown in the videos are correct (xx74ACTxx logic, Alliance AS6C4008 SRAM). With tACC=70ns on the CPU and tACE=55ns on the SRAM, there may only be 15ns available from address stable until the correct CS# must be asserted. Those gates have a propagation delay up to 9.5ns, therefore if your output requires more than 1 level of combinatorial logic (which I think is guaranteed to be the case for memory below $9F00), you have the potential to hit a timing violation. Switching to faster xx74AHCTxx components will let you have two levels of logic if your load capacitance and temperature are sufficiently low (i.e. 15 pF and 25C or better). If I recall correctly, the problem with the v2 prototype board was that CS and PHI2 were effectively being ANDed together which should never work at 8MHz because you have less than 55ns from PHI2 until the end of tACC.

A16-A23 should not need to go through a PLD unless it is also demuxing the data bus, which is quite a bit more expensive than using a 74AHCT245 and only 1ns faster. In my experimental case, I only feed the high address bits into the SPLD and only the chip select signals come out.

Incidentally, the SPLD has a propagation delay as low as 7ns so you could go through two of them without a timing violation at 8MHz on a 65C02. I'm no expert at working with these old parts, but going strictly by the numbers, I just don't see how CX16 is going to be able to run stable at 8MHz unless the published timing values are all lies. If they do manage it, I am probably going to buy a sacrificial unit just to connect to the logic analyzer and figure out HOW. At 4MHz, all of this concern goes away and delay for demuxing the data bus is not going to be fatal to a 65C816.
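
Going strictly by the numbers above, the chip-select budget can be worked out in a few lines. This is a rough Python sketch using only the figures quoted in this post (tACC=70ns, tACE=55ns, 9.5ns worst-case gate delay, 7ns SPLD delay); the function names are hypothetical, and it ignores wiring delay, load capacitance and temperature.

```python
import math


def cs_budget_ns(t_acc_ns: float, t_ace_ns: float) -> float:
    """Time from address stable until CS# must be asserted."""
    return t_acc_ns - t_ace_ns


def max_logic_levels(budget_ns: float, t_pd_ns: float) -> int:
    """How many propagation delays of t_pd_ns fit inside the budget."""
    return math.floor(budget_ns / t_pd_ns)


budget = cs_budget_ns(t_acc_ns=70, t_ace_ns=55)   # 15 ns
act_levels = max_logic_levels(budget, 9.5)        # 1 level of 74ACT logic
spld_levels = max_logic_levels(budget, 7.0)       # 2 SPLDs in series
two_level_limit = budget / 2                      # 7.5 ns per level

print(budget, act_levels, spld_levels, two_level_limit)
```

The last value is the per-level delay any logic family would have to stay under for two levels of logic to fit in the same window.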


Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 4:44 am
by Lorin Millsap


Wavicle said:




I think they are going to have a lot of difficulty pulling off 8MHz, even exploiting early address bus stabilization, if the parts shown in the videos are correct (xx74ACTxx logic, Alliance AS6C4008 SRAM). With tACC=70ns on the CPU and tACE=55ns on the SRAM, there may only be 15ns available from address stable until the correct CS# must be asserted. Those gates have a propagation delay up to 9.5ns, therefore if your output requires more than 1 level of combinatorial logic (which I think is guaranteed to be the case for memory below $9F00), you have the potential to hit a timing violation. Switching to faster xx74AHCTxx components will let you have two levels of logic if your load capacitance and temperature are sufficiently low (i.e. 15 pF and 25C or better). If I recall correctly, the problem with the v2 prototype board was that CS and PHI2 were effectively being ANDed together which should never work at 8MHz because you have less than 55ns from PHI2 until the end of tACC.

A16-A23 should not need to go through a PLD unless it is also demuxing the data bus, which is quite a bit more expensive than using a 74AHCT245 and only 1ns faster. In my experimental case, I only feed the high address bits into the SPLD and only the chip select signals come out.

Incidentally, the SPLD has a propagation delay as low as 7ns so you could go through two of them without a timing violation at 8MHz on a 65C02. I'm no expert at working with these old parts, but going strictly by the numbers, I just don't see how CX16 is going to be able to run stable at 8MHz unless the published timing values are all lies. If they do manage it, I am probably going to buy a sacrificial unit just to connect to the logic analyzer and figure out HOW. At 4MHz, all of this concern goes away and delay for demuxing the data bus is not going to be fatal to a 65C816.



HCT is not faster. ACT is way faster. However, as a whole you are right.

Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 7:43 am
by Wavicle


2 hours ago, Lorin Millsap said:




HCT is not faster. ACT is way faster. However, as a whole you are right.



AHCT. It is slightly better than ACT:

[attached image]


Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 11:28 am
by Lorin Millsap


Wavicle said:




AHCT. It is slightly better than ACT:

[attached image]



Sorry. I missed that. However I think we are already using it in a few places.



Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 2:55 pm
by rje


16 hours ago, Wavicle said:




...if I'm reading the date codes correctly, my brand new W65C816S purchased from Mouser in April was manufactured in 2010 while my brand new W65C02S purchased at the same time was manufactured in 2019. That would seem to suggest that 6502s move significantly more volume.



Wow!  Thank you for sharing; current popularity is a data point I hadn't considered before.


Reconsidering the 65816 (W65C816S)

Posted: Thu Aug 19, 2021 2:56 pm
by rje


17 hours ago, Wavicle said:




I'm not sure that's what he said; he said it "requires a lot of external circuitry to decode this and split it out" (referring to demuxing the signals), but demultiplexing the top address bits from the data bits requires two 7400 series chips, and an inverter which might require one additional chip (no additional chip if you have an extra NAND gate or inverter lying around). This is my demux circuit on the breadboard: [attached image]







Wow, your breadboards are as clean as a Ben Eater video.  Mine are always so messy... but then I'm a rank amateur.