Implementing DMA

Lorin Millsap · Post by **Lorin Millsap** » Mon Mar 29, 2021 5:19 pm

Now that release is getting closer and real hardware expansions may become a thing, I’m going to attempt to explain and outline the requirements for implementing DMA features, so that cards avoid contention and work with each other and the base hardware.

First, clarifying DMA. Just because something is directly connected to the bus does not mean that it is DMA. DMA refers to devices that can access things on the bus themselves.

So what lines do you have available to implement DMA? There are three associated lines, and there are a couple of rules that need to be followed depending on what you are doing.

The first line we need to care about is /ML. This stands for Memory Lock, and when this line is active it means that memory is in use. This line is going to be the primary means of DMA arbitration. Before a DMA device takes control of the bus it needs to check the state of this line. If it is active, then either a CPU operation is in progress that can’t be safely interrupted, or another DMA is already in progress. That means your card needs to be able to wait until this line is deactivated before initiating a DMA operation. The second part is that once your card has initiated a DMA operation, it needs to activate the ML signal to let other DMA devices know that a DMA is in progress.

In summary, there is only simple first-come, first-served DMA arbitration, and each card needs to comply with this basic approach: always check the Memory Lock status before beginning an operation, and always activate the ML signal while a DMA is in progress. In theory you could use the /SYNC line, however we aren’t going to encourage that at this time, as it may have some useful future applications and using it for DMA may prevent that.

The next lines we care about are /RDY and /BE. These two lines can be used in different combinations depending on what you are doing.
/RDY essentially halts the CPU in its current state, and the CPU doesn’t resume until the next PHI2 high following the release. /BE tri-states the CPU’s outputs and prevents the CPU from driving the address and data lines. By activating both you essentially disable the CPU and give the DMA device full control of the buses. For most DMA operations you will assert both lines, but there may be some special cases where you don’t.

The next requirement is the access cycles. You still need to comply with the timing requirements of the buses, depending on what you are doing. Write data needs to be stable about 10ns prior to the falling edge of PHI2 and should be held for 10ns following the falling edge to ensure that the values get written correctly. Reads need to latch on the falling edge of PHI2. Addresses need to be valid at least 10ns prior to the rising edge of PHI2 and need to be held until at least 10ns after the falling edge of PHI2. In some cases accesses can be extended over multiple cycles, but the timing must still meet what I’ve outlined above. The PHI2 high time is just over 60ns.

The next requirement is that all DMA devices need to return to a safe state and release control in the event of a /RST (reset) signal. DMA devices need to default to a disabled state and then be turned on by the host.

So what about arbitration? To clarify, as per discussions, the X16 will use software arbitration. Each DMA device will have a control register which by default is set to false (off), and in order to begin a DMA access it needs to be set to true by the CPU. When the DMA access is complete, the control register needs to be cleared by the DMA device. If the DMA device needs to perform an operation, it can generate an IRQ, and the IRQ routine can perform the proper setup and enabling of the DMA operation.

I look forward to your questions. And at this point, no, DMA has not been tested, so this is all hypothetical until real tests are performed.
Edited to reflect items discussed below.
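
To make the sequence in this post concrete, here is a minimal Python model of the device side. It is only a sketch: the signal names /ML, /RDY, /BE and /RST come from the post, but the `Bus` and `DmaDevice` classes, their fields and the `step` method are hypothetical, and a real card would do this in logic rather than software. Active-low lines are modeled as booleans where `True` means asserted, and one transfer per PHI2 cycle is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Bus:
    """Shared control lines. True means the active-low line is asserted."""
    ml: bool = False   # /ML: memory locked (CPU atomic op or DMA in progress)
    rdy: bool = False  # /RDY asserted: CPU halted
    be: bool = False   # /BE asserted: CPU outputs tri-stated
    ram: dict = field(default_factory=dict)


@dataclass
class DmaDevice:
    """Hypothetical DMA card following the rules described in the post."""
    enabled: bool = False                        # control register, off after reset
    owns_bus: bool = False
    pending: list = field(default_factory=list)  # queued (address, value) writes

    def reset(self, bus: Bus) -> None:
        """On /RST: release the bus if held and return to the disabled state."""
        if self.owns_bus:
            bus.ml = bus.rdy = bus.be = False
        self.owns_bus = False
        self.enabled = False
        self.pending.clear()

    def step(self, bus: Bus) -> str:
        """Advance one PHI2 cycle and report what the card did."""
        if not self.enabled:
            return "idle"                  # CPU has not turned DMA on
        if not self.owns_bus:
            if bus.ml:
                return "waiting"           # memory is in use, keep waiting
            bus.ml = bus.rdy = bus.be = True
            self.owns_bus = True
            return "acquired"
        if self.pending:
            address, value = self.pending.pop(0)
            bus.ram[address] = value       # assumed: one write per cycle
            return "transfer"
        bus.ml = bus.rdy = bus.be = False  # done: hand the bus back
        self.owns_bus = False
        self.enabled = False               # card clears its own enable bit
        return "released"
```

Driving `step` once per cycle walks the card through idle, waiting, acquired, transfer and released; calling `reset` at any point drops all three lines and leaves the card disabled until the host enables it again.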

picosecond · Post by **picosecond** » Mon Mar 29, 2021 6:40 pm

5 hours ago, Lorin Millsap said:

For most DMA operations you will assert both lines, but there may be some special cases where you don’t

I think you always need to halt the CPU with /RDY and you always need to tri-state the busses with /BE. Can you give an example when both are not required?

5 hours ago, Lorin Millsap said:

it needs to activate the ML signal to let other DMA devices know that a DMA is in progress

This attempt at self-arbitration won't avoid bus contention when two DMA controllers want access on the same cycle. Without real hardware arbitration you are left with enabling one DMA controller at a time through software. There is nothing wrong with software arbitration but it renders this /ML business pointless.
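
The same-cycle race is easy to show with a toy model. This is only a sketch: the function `find_collisions`, the `burst_cycles` parameter and the card names are made up for illustration, and it assumes every card samples /ML at the start of a cycle and drives it later in that same cycle.

```python
def find_collisions(requests_by_cycle, burst_cycles=2):
    """Return (cycle, cards) for every cycle in which several cards took the bus.

    requests_by_cycle: one set per cycle naming the cards that want to start
    a DMA in that cycle. Cards that see /ML asserted simply wait; in this
    simplified model they must appear again in a later cycle to retry.
    """
    collisions = []
    ml_free_at = 0                      # first cycle at which /ML is released
    for cycle, requesters in enumerate(requests_by_cycle):
        if cycle < ml_free_at:
            continue                    # /ML asserted: every requester backs off
        # /ML looks free to every card that samples it in this cycle,
        # so all of them start driving the bus at once.
        starters = sorted(requesters)
        if starters:
            ml_free_at = cycle + burst_cycles
        if len(starters) > 1:
            collisions.append((cycle, starters))
    return collisions


trace = [{"net"}, {"disk"}, {"disk", "net"}, set()]
print(find_collisions(trace))  # [(2, ['disk', 'net'])]
```

Staggered requests (cycles 0 and 1) are handled correctly by checking /ML, but the two requests that arrive together in cycle 2 both see the line free and collide.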

I think I mentioned in another thread that it looks unsafe for DMA controllers to interrupt writes to auto-increment addresses. An easy way to avoid this problem is to take the bus only during opcode fetch (by monitoring SYNC). As a bonus, this inherently avoids breaking atomic operations so /ML is no longer needed.
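
As a rough illustration of the SYNC idea, the sketch below scans a per-cycle trace and picks the first opcode-fetch cycle at or after a DMA request. The function name and the trace format are hypothetical, and a real controller would watch the SYNC pin in hardware.

```python
def first_safe_cycle(sync_trace, request_cycle):
    """Index of the first opcode-fetch cycle at or after request_cycle, or None.

    sync_trace: one boolean per CPU cycle, True where SYNC marks an opcode fetch.
    """
    for cycle in range(request_cycle, len(sync_trace)):
        if sync_trace[cycle]:
            return cycle
    return None


# A 4-cycle absolute store (opcode fetch, two address bytes, write),
# for example to an auto-increment register, followed by the next instruction.
trace = [True, False, False, False,
         True, False]
print(first_safe_cycle(trace, 2))  # 4
```

A request that arrives in the middle of the store (cycle 2) is held off until cycle 4, so the write completes before the bus is taken.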

Also note that because /RDY is directly driven by the DMA controller it is impossible for DMA controllers to address anything that uses /RDY to add bus wait states.

5 hours ago, Lorin Millsap said:

The PHI2 high time is just over 60ns

It is nominally 62.5ns, but what is the duty cycle spec on your crystal oscillator? +/- 5% is pretty typical unless you pay extra for better. Or did you switch to a 16MHz oscillator and divide it by 2 to square things up?
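
For reference, the arithmetic behind those figures, assuming an 8 MHz PHI2 (which is what the 62.5ns nominal value implies) and reading "+/- 5%" as a duty cycle anywhere between 45% and 55%. The function name is just for illustration.

```python
def phi2_high_ns(freq_mhz: float, duty: float) -> float:
    """PHI2 high time in nanoseconds for a clock frequency and duty cycle (0..1)."""
    period_ns = 1000.0 / freq_mhz   # 8 MHz gives a 125 ns period
    return period_ns * duty


for duty in (0.45, 0.50, 0.55):
    print(f"{duty:.0%} duty: {phi2_high_ns(8.0, duty):.2f} ns high")
```

Under those assumptions the high time would range from 56.25ns to 68.75ns around the 62.5ns nominal.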

Lorin Millsap · Post by **Lorin Millsap** » Mon Mar 29, 2021 7:39 pm

On the DMA, you aren’t going to get hardware arbitration. However, if your default state is DMA disabled, then you can handle it in software by making sure that DMA accesses only occur when they are enabled by the CPU, and that when control is restored they are disabled until they are needed again.

An example could be a network card that generates an interrupt when it needs to get the CPU’s attention. The handler routine sets up and enables the DMA function, which reads or writes a section of RAM; when it is finished, the controller clears its DMA flag (so it’s automatically disabled until it gets turned back on by the CPU) and things return to normal. The same approach would be used for disk controllers, etc. So long as DMA defaults to off it helps avoid most conflicts, and combined with IRQ handlers it allows a fairly clean way to avoid contention, since under this approach DMA only starts when the CPU tells it to. You would still want to check /ML, since a DMA may not start immediately and it’s possible a CPU instruction could set /ML.

An example where you may not always want to set /BE is when you want to monitor the CPU’s output. In many cases you could do this passively, but in certain diagnostic situations you may want to see what the CPU is doing. Another example would be where you want to capture the CPU output, modify it, then write it to some location. Most common situations would never do this, so most of the time you would assert both /RDY and /BE.

As to the clock, the CPU clock is cleaned up so you always get nice square waves.

If you think this is a good explanation and approach I can add it to my original post.

picosecond · Post by **picosecond** » Mon Mar 29, 2021 10:28 pm

2 hours ago, Lorin Millsap said:

If you think this is a good explanation and approach I can add it to my original post.

This is application and expansion card dependent, so I don't see a need to be too prescriptive. The main requirement is that every DMA controller needs a "DMA enable" register whose reset state is disabled. DMA controllers may take the bus only when enabled and software may enable only one DMA controller at a time. The application can decide on enable scheduling in multi-controller situations. The main point expansion card designers need to know is that multi-controller arbitration is software, not hardware controlled.
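
A minimal sketch of what that could look like on the software side. Everything here is hypothetical: the class names, the FIFO policy, and the idea that software learns about completion through a second callback (for example a completion IRQ). The only parts taken from the thread are the enable register that resets to disabled, the card clearing it when done, and the rule that only one controller is enabled at a time.

```python
from collections import deque


class DmaController:
    """Hypothetical card exposing a single 'DMA enable' register bit."""

    def __init__(self, name):
        self.name = name
        self.enable = False          # reset state: disabled

    def finish(self):
        self.enable = False          # the card clears its own bit when done


class SoftwareArbiter:
    """Runs on the CPU and enables at most one controller at a time."""

    def __init__(self):
        self.waiting = deque()       # assumed policy: first come, first served
        self.active = None

    def on_irq(self, controller):
        """IRQ handler: a card has asked for a DMA transfer."""
        self.waiting.append(controller)
        self._grant_next()

    def on_done(self):
        """Called once the active card has cleared its enable bit."""
        if self.active is not None and not self.active.enable:
            self.active = None
        self._grant_next()

    def _grant_next(self):
        if self.active is None and self.waiting:
            self.active = self.waiting.popleft()
            self.active.enable = True
```

With two cards requesting back to back, the second stays disabled until the first has finished and `on_done` runs. An application could swap the FIFO for any other scheduling policy without touching the cards.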

I still say using SYNC is better than using /ML. It costs the same and avoids adding restrictions like "don't access Vera auto-increment registers if a DMA controller is enabled".

2 hours ago, Lorin Millsap said:

the CPU clock is cleaned up so you always get nice square waves

That's a nice improvement over proto#2.

neutrino · Post by **neutrino** » Wed Oct 19, 2022 10:53 pm

It might be necessary to distinguish between DMA and bus mastering. DMA enables transfers faster than the CPU can manage, while bus mastering enables a device to take over the computer completely, provided DRAM refresh etc. is taken care of. With an onboard DMA controller, devices won't need to replicate this functionality and can be made simpler.

Another option is to have bus mastering available and an add-on card that, upon software command, disables the CPU, transfers data and then releases control.