6502 RISC instruction set running at 3.4ghz..
-
- Posts: 7
- Joined: Sat Jul 10, 2021 2:55 am
6502 RISC instruction set running at 3.4ghz..
Could you imagine a processor running the 6502 (or similar) instruction set at 3.4 GHz or any other modern CPU speed?
I can't help but wonder if that wouldn't just run circles around Intel and AMD processors.
I could be wrong. But it sure is fun to think about (especially multi-core).
-
- Posts: 952
- Joined: Fri Mar 19, 2021 9:06 pm
6502 RISC instruction set running at 3.4ghz..
The biggest problem, I think, would be memory access time. Modern CPUs have complex fetching and caching schemes to pull lots of memory into the CPU at one time, and lots of units running in parallel to keep the CPU busy at all times. Whenever an x86 CPU has to access RAM, it may have to stall while it waits for the memory request to be fulfilled.
This is the primary reason why a 1 MHz 6502 was comparable to a 4.77 MHz 8088 for certain types of processing. If you could keep an 8088 busy with data already loaded into registers, it would potentially be a lot faster than a 6502 (depending on the instructions), but any time it had to go to memory it took 4 cycles, and since it was a 16 bit CPU with an 8 bit bus, it took 8 cycles to load a word.
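To put rough numbers on the comparison above, here is a minimal sketch of the peak byte-access rates. It takes the 4-cycles-per-access figure for the 8088 from the post and assumes (my assumption, not stated in the post) that the 6502 makes one byte-wide bus access per clock cycle.

```python
# Rough peak memory-access rates implied by the figures in the post.
# Assumption: the 6502 performs one byte-wide bus access per clock cycle.

MOS6502_CLOCK_HZ = 1_000_000     # 1 MHz
I8088_CLOCK_HZ = 4_770_000       # 4.77 MHz
MOS6502_CYCLES_PER_BYTE = 1      # assumed: one access per clock
I8088_CYCLES_PER_BYTE = 4        # bus cycle length quoted in the post


def bytes_per_second(clock_hz, cycles_per_byte):
    """Peak bytes moved per second if the bus is never idle."""
    return clock_hz / cycles_per_byte


rate_6502 = bytes_per_second(MOS6502_CLOCK_HZ, MOS6502_CYCLES_PER_BYTE)
rate_8088 = bytes_per_second(I8088_CLOCK_HZ, I8088_CYCLES_PER_BYTE)

print(f"6502 at 1 MHz:    {rate_6502:,.0f} bytes/s")
print(f"8088 at 4.77 MHz: {rate_8088:,.0f} bytes/s")
print(f"8088 / 6502:      {rate_8088 / rate_6502:.2f}x")
```

Under those assumptions the two chips move bytes over the bus at nearly the same rate, which is consistent with them being comparable on memory-heavy work.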
I would love to see a super fast 6502, but since the model requires a memory access (or more) with most instructions, it will always be limited to the speed of RAM access. Now, modern RAM can be accessed very quickly compared to the Good Old Days, but each access is (as I understand it) accessing 64 bits at one time. So the interface between the CPU and RAM has to be able to deal with the bits per access. A 6502 is built around 8 bit bytes, so either the CPU has to be rearchitected to deal with more bits per memory access, or a shim interface would have to be inserted between the CPU and RAM to mask out just the bits of interest, which would slow things down.
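The "shim" idea above can be sketched in a few lines. This is only an illustration: the function name `extract_byte` is hypothetical, and it assumes the memory returns aligned 64-bit words with little-endian byte lanes.

```python
def extract_byte(word64: int, address: int) -> int:
    """Pick the byte at `address` out of the aligned 64-bit word containing it.

    Assumes little-endian byte lanes: the lowest address sits in bits 0-7.
    """
    lane = address & 0x7                 # which of the 8 bytes in the word
    return (word64 >> (8 * lane)) & 0xFF  # shift it down and mask it out


# One 64-bit word as it might arrive from RAM for addresses 0x1000-0x1007.
word = 0x8877665544332211
print(hex(extract_byte(word, 0x1000)))
print(hex(extract_byte(word, 0x1003)))
print(hex(extract_byte(word, 0x1007)))
```

In hardware this would be a multiplexer rather than a shift and mask, but the selection logic is the same.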
http://forum.6502.org/viewtopic.php?f=1&t=6049 is a forum post that talks about the theory and practice of how fast 6502 designs have been pushed. Most notable is this quote (if accurate):
Quote
Bill Mensch (WDC's owner) said in an interview [...] that he estimated that with the newest technology of the day [circa 2017], [6502] could probably hit 10GHz.
So it isn't thought impossible by experts, but someone has to want to do it to make it happen. Clearly the market for it isn't there or else it would have been done already (most likely).
6502 RISC instruction set running at 3.4ghz..
Surely you could put the entire zero page in the processor as a register file, have a rotating 64-bit instruction cache, a 64-bit read cache and a rotating 64-bit write cache ... though for non-sequential writes you will often have to do a read then a write ... and crank up the CPU until it is running so fast it is always waiting for memory ...
... and that speed where it is always waiting for memory will be somewhere short of the maximum feasible speed.
The 6502 was designed as a memory-bound processor with an instruction set that required very few transistors but still allows you to get something useful done. It was also designed to avoid violating Motorola IP, as the 6501 had. It had its role in home computing because it was the cheapest way to do things at the time.
Now, a version with 128K of on-CPU cache, with the Low RAM entirely in that cache and the High RAM and ROM segments starting to be cached as soon as their banks are selected ... that version would let you crank the CPU speed a bit higher before it gets memory bound.
6502 RISC instruction set running at 3.4ghz..
3 hours ago, AstronautSurfer said:
Could you imagine a processor running the 6502 (or similar) instruction set at 3.4 GHz or any other modern CPU speed?
I can't help but wonder if that wouldn't just run circles around Intel and AMD processors.
I could be wrong. But it sure is fun to think about (especially multi-core).
Not really. Modern AMD64 processors are much more clock-efficient than the 6502 was, and Intel CPUs have been superscalar since the 90s. A superscalar CPU can execute more than one instruction per clock cycle, while every 6502 instruction takes at least two cycles.
If one were to assume MOS continued development of the 6502 and built a 32-bit and 64-bit chip, it would have ended up taking basically the same development path either Intel or ARM took: ARM trended toward RISC designs and low power CPUs (which is why virtually all cell phones use ARM processors), and Intel trended toward larger dies and more parallelism, which is why we have 20 execution units on a Core i9.
In fact, going back to 1985 or so... the 6502 only appeared to run faster than its competition, because it cheats: the 6502 splits the clock internally into two phases (Phi 1 and Phi 2), and it further splits each of those phases in half and does certain things on the front half and back half of the phase. While it worked at 1 MHz and 2 MHz clock speeds, I suspect this is unsustainable at higher speeds. You simply can't jam 3 or 4 T-States into a single clock tick and expect all of the disparate parts of a system to stay in sync at speeds of hundreds or thousands of megahertz.
- StephenHorn
- Posts: 565
- Joined: Tue Apr 28, 2020 12:00 am
- Contact:
6502 RISC instruction set running at 3.4ghz..
Well, with DDR5 promising transfer rates of 4800 MT/s and up to 8400 MT/s, I guess I could start to believe in the possibility of a high-quality manufactured ASIC hitting the GHz range while running the 6502 set of opcodes, without having to be throttled by memory fetches. Would it run circles around modern Intel and AMD processors, though? Lulz no, we're talking about an 8-bit CPU that can't even do its own integer multiplication, much less floating point, and let's not even start on division. Even if you could clock it up to modern CPU speeds, the programs you'd compare against would be spending thousands of cycles performing operations that a modern Intel, AMD, or ARM CPU can do in as few as 30 cycles. And the memory efficiency of the program code would suffer greatly as well, as the 6502 has no opcodes for floating point math, or different type sizes, or vector operations.
Sorry, we're all fans of the 6502 here, but there's just no reality where a 6502-circa-2022 is going to remotely compete on performance.
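To make the multiplication point above concrete, here is a minimal sketch of the shift-and-add routine a CPU without a multiply instruction has to run in software. It is written in Python for readability, not as 6502 assembly; the name `mul8_shift_add` is hypothetical, and the step count is loop iterations, not 6502 clock cycles.

```python
def mul8_shift_add(a: int, b: int) -> tuple[int, int]:
    """Multiply two unsigned 8-bit values using only shifts, masks and adds.

    Returns (16-bit product, number of loop iterations). Each iteration
    stands in for several 6502 instructions (shift, test, conditional add).
    """
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must fit in 8 bits")
    product = 0
    multiplicand = a
    steps = 0
    for _ in range(8):                      # one pass per bit of the multiplier
        if b & 1:                           # low bit set: add shifted multiplicand
            product = (product + multiplicand) & 0xFFFF
        multiplicand = (multiplicand << 1) & 0xFFFF
        b >>= 1
        steps += 1
    return product, steps


result, iterations = mul8_shift_add(200, 123)
print(result, iterations)
```

A CPU with a hardware multiplier does the same job with a single instruction, and this sketch only covers 8-bit integers; wider integers and floating point need longer routines built from the same pieces.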
Developer for Box16, the other X16 emulator. (Box16 on GitHub)
I also accept pull requests for x16emu, the official X16 emulator. (x16-emulator on GitHub)
-
- Posts: 952
- Joined: Fri Mar 19, 2021 9:06 pm
6502 RISC instruction set running at 3.4ghz..
Also: https://en.wikipedia.org/wiki/Instructions_per_second
So if we extrapolate from that list, bumping a 1 MHz 6502 to 3.4 GHz, we would get a 6502 running at 1462 MIPS. Compare that to an Intel Core i7 2600K from 2011 (the most recent CPU in the list running at 3.4 GHz), which achieves 176,170 MIPS. So roughly two orders of magnitude faster.
Of course, that is a 4 core CPU, so divide the MIPS by 4 to get 44042 MIPS for a single core. That means that the Core i7 is roughly 30 times faster, core for core, than a 6502.
An interesting note is that the Raspberry Pi 2 runs at about the same MIPS per core at only 1 GHz, roughly 1/3 the clock speed of the theoretical 6502 imagined here.
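The arithmetic above is easy to check. The sketch below uses only the figures quoted in the post (it does not re-read the Wikipedia list), and the linear scaling of MIPS with clock speed is the post's own simplifying assumption.

```python
# Figures quoted in the post.
CLOCK_MHZ = 3400                 # 3.4 GHz
MIPS_6502_EXTRAPOLATED = 1462    # 6502 scaled linearly from 1 MHz to 3.4 GHz
MIPS_I7_2600K = 176_170          # whole chip, 4 cores
I7_CORES = 4

# MIPS per MHz implied for the original 1 MHz 6502.
mips_per_mhz_6502 = MIPS_6502_EXTRAPOLATED / CLOCK_MHZ

# Per-core figure for the Core i7 and the two speed ratios.
mips_per_core_i7 = MIPS_I7_2600K / I7_CORES
whole_chip_ratio = MIPS_I7_2600K / MIPS_6502_EXTRAPOLATED
per_core_ratio = mips_per_core_i7 / MIPS_6502_EXTRAPOLATED

print(f"6502 MIPS per MHz:        {mips_per_mhz_6502:.2f}")
print(f"Core i7 MIPS per core:    {mips_per_core_i7:,.1f}")
print(f"Whole chip vs 6502:       {whole_chip_ratio:.1f}x")
print(f"Single core vs 6502:      {per_core_ratio:.1f}x")
```

This treats instructions as interchangeable, which flatters the 6502: one 64-bit instruction on the Core i7 can do work that takes many 8-bit instructions.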
6502 RISC instruction set running at 3.4ghz..
The 6502 running at 3.4 GHz would be significantly slower than modern CPUs. You can check out this article where someone manages to emulate the 6502 at speeds equivalent to 15 GHz on a modern Intel CPU running at a clock speed less than a third of that. So modern CPUs do a lot more in a single clock cycle than the 6502.
https://scarybeastsecurity.blogspot.com/2020/04/clocking-6502-to-15ghz.html
The 6502 is also not very RISC-like. Its whole design assumes that accessing memory is fast. Something like the zero page, if not cached, would be horribly slow in the world of modern memory, which is not competitive with CPU registers for speed at all.
6502 RISC instruction set running at 3.4ghz..
So, then, let's drive this discussion in another direction. Assuming that, clock-for-clock, the 65C02 wouldn't be 'competitive,' would a max-speed chip nonetheless still be... 'acceptable' for modern-day tasks? Could you watch a 144p YouTube video? Play a 128kbps MP3? Render a modern web page with scripts and CSS?
The question becomes — with what RAM? These things depend on huge datasets being able to be accessed and manipulated, often with SIMD instructions, and there's still only the 64K address space, even with the kind of things that the X16 does via paging.
With this kind of primitive MMU-style paging, though, a 65C816 suddenly becomes a much more attractive option. Windows 95 was able to run in the 16MB memory space that an '816 can access, and if the paging system used 4MB pages (scaled proportionally to the size of the X16's ROM pages [or 2MB if you'd prefer it scaled to the RAM page size]), then you've got a sort of supercharged version of the old DOS EMS. You can do some fairly complicated web pages with those kinds of resources.
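The 4MB and 2MB page sizes above follow from simple proportional scaling. The sketch below assumes the X16's bank sizes are 16 KB for ROM and 8 KB for banked RAM within a 64 KB address space; those sizes are not stated in the post, so treat them as my assumption.

```python
KIB = 1024
MIB = 1024 * KIB

X16_ADDRESS_SPACE = 64 * KIB       # 6502 address space
X16_ROM_BANK = 16 * KIB            # assumed X16 ROM bank size
X16_RAM_BANK = 8 * KIB             # assumed X16 banked-RAM page size
W65C816_ADDRESS_SPACE = 16 * MIB   # 24-bit address space of the '816


def scaled_page_size(bank_size: int) -> int:
    """Scale a bank so it covers the same fraction of the '816 address space."""
    return W65C816_ADDRESS_SPACE * bank_size // X16_ADDRESS_SPACE


rom_scaled = scaled_page_size(X16_ROM_BANK)
ram_scaled = scaled_page_size(X16_RAM_BANK)
print(f"ROM-proportional page: {rom_scaled // MIB} MiB")
print(f"RAM-proportional page: {ram_scaled // MIB} MiB")
```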
But without a 32-bit data bus and a hardware multiplier at the bare minimum, you're just not going to be able to move enough data fast enough and get enough arithmetic done to play those compressed media files. Uncompressed (or 'trivially' compressed, like RLE or TMV) is no problem. The X16 can do that if you have a fast enough storage interface that you can keep the buffer full. A 3.4GHz '816-based system could push data fast enough to play 4K HD video at 60p if that's all it has to be doing.
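As a rough sanity check on the uncompressed 4K figure above, this sketch works out the data rate and the cycle budget per byte. The resolution (3840x2160) and 24-bit color depth are my assumptions; the post does not specify them.

```python
WIDTH, HEIGHT = 3840, 2160       # assumed "4K" resolution
BYTES_PER_PIXEL = 3              # assumed 24-bit color
FRAMES_PER_SECOND = 60
CPU_CLOCK_HZ = 3_400_000_000     # 3.4 GHz, from the post

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FRAMES_PER_SECOND

# How many clock cycles the CPU can spend on each byte it moves.
cycles_per_byte = CPU_CLOCK_HZ / bytes_per_second

print(f"Bytes per frame:  {bytes_per_frame:,}")
print(f"Bytes per second: {bytes_per_second:,}")
print(f"Cycle budget:     {cycles_per_byte:.2f} cycles per byte")
```

Under these assumptions the budget is a little over two clock cycles per byte, so the copy loop and the storage interface would both have to be very tight for this to work.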
6502 RISC instruction set running at 3.4ghz..
10 hours ago, Serentty said:
The 6502 running at 3.4 GHz would be significantly slower than modern CPUs. You can check out this article where someone manages to emulate the 6502 at speeds equivalent to 15 GHz on a modern Intel CPU running at a clock speed less than a third of that. So modern CPUs do a lot more in a single clock cycle than the 6502.
https://scarybeastsecurity.blogspot.com/2020/04/clocking-6502-to-15ghz.html
The 6502 is also not very RISC-like. Its whole design assumes that accessing memory is fast. Something like the zero page, if not cached, would be horribly slow in the world of modern memory, which is not competitive with CPU registers for speed at all.
Yes, as I said, the first step after speeding up the CPU is to have the zero page on the CPU as a register file. Indeed, you'd have the stack page as a register file as well. But LDA (zp),Y is still three memory clocks unless you have an instruction cache and/or data cache.
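The "three memory clocks" count above can be tallied cycle by cycle. This sketch models LDA (zp),Y as five bus accesses (the no-page-crossing case) and counts how many still go off-chip once a given region is held on the CPU. The region labels and the function name `external_accesses` are hypothetical.

```python
# One entry per cycle of LDA (zp),Y without a page crossing:
# (what the cycle does, which region the access touches)
LDA_INDIRECT_Y = [
    ("fetch opcode", "program"),
    ("fetch zero-page operand address", "program"),
    ("read pointer low byte", "zero_page"),
    ("read pointer high byte", "zero_page"),
    ("read data at pointer + Y", "data"),
]


def external_accesses(steps, on_chip_regions):
    """Count the accesses that still have to leave the CPU."""
    return sum(1 for _, region in steps if region not in on_chip_regions)


print(external_accesses(LDA_INDIRECT_Y, set()))                     # plain 6502
print(external_accesses(LDA_INDIRECT_Y, {"zero_page"}))             # zero page on-chip
print(external_accesses(LDA_INDIRECT_Y, {"zero_page", "program"}))  # plus instruction cache
```

With only the zero page on-chip, the opcode fetch, operand fetch and final data read remain, which matches the three accesses mentioned above. An instruction cache removes two more, and a data cache could cover the last one.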