Does this seem eggregious to anyone besides me?

BruceMcF · Post by **BruceMcF** » Tue Nov 23, 2021 10:45 pm

On 11/23/2021 at 6:43 AM, Ed Minchau said:

That's what my little algorithm above does. The result isn't 16.8, but can be shifted there.

Yes, I was replying to a comment in the first page of the comments without reading the second page yet.

On 11/23/2021 at 4:35 AM, paulscottrobson said:

As it's a 16 bit value you could probably just divide the upper byte by 1024, which would be >> 8x >> 2 which would be quicker and only marginally less accurate. ...

Though for a 16.8 result, the high six bits of the lower byte are still part of the result, and the second from the bottom bit could be used for rounding ... indeed, do the /1024 second, so that the second from the bottom bit is in the carry. However, the /1024 means that high byte of the result is 0, so can be omitted. I get about 45 bytes.

proc calculate_tick_rate: near

           ; X/Y = tick rate (Hz) - divide by approximately 60 and store to zsm_steps

           ; use the ZP variable as tmp space

            ; Actually (X/64)+(X/1024), which is within 0.3% of X/60.

           ; use the ZP variable as tmp space

        val1 := r11

        frac1:= r12

        val2 := r13

        txa

        sty val1

        asl

        rol val1

        rol val1+1

        asl

        rol val1

        asl val1+1

        sta frac1

        txa

        sty val2

        lsr val2

        ror

        lsr val2

        ror

       adc frac1

       sta zsm_fracsteps

       lda va1

       adc val2

       sta zsm_steps

       lda val1+1

       adc #0

       sta zsm_steps+1

       rts

Ed Minchau · Post by **Ed Minchau** » Wed Nov 24, 2021 3:29 am

On 11/23/2021 at 3:45 PM, BruceMcF said:

Yes, I was replying to a comment in the first page of the comments without reading the second page yet.

Though for a 16.8 result, the high six bits of the lower byte are still part of the result, and the second from the bottom bit could be used for rounding ... indeed, do the /1024 second, so that the second from the bottom bit is in the carry. However, the /1024 means that high byte of the result is 0, so can be omitted. I get about 45 bytes.

Yep, we came up with nearly identical solutions, really just the entry and exit to the routine is the only difference. Either great minds think alike, or fools seldom differ.

BruceMcF · Post by **BruceMcF** » Thu Nov 25, 2021 4:04 am

On 11/23/2021 at 10:29 PM, Ed Minchau said:

Yep, we came up with nearly identical solutions, really just the entry and exit to the routine is the only difference. Either great minds think alike, or fools seldom differ.

I patterned the entry and exit after the original routine.

By taking 1/[1/60 - (1/64 + 1/1024)], which is around 15,000, the next power of 2 approximation is add X/16,384, that is, X/[(64)*(256)], which is already computed. The only change is that rather than just using the high bit from the low order part of val2, it is necessary to keep both bits and add to the "frac" for proper rounding.

I think it is something like this (untested, and I am VERY tired from a 12 hour shift then 3 hours driving to pick up my grandkid for Thanksgiving with his family up here!! -- so could be making an obvious mistake!!):

        val1 := r11

        frac1:= r12

        val2 := r13

        frac2 := r14

        stz val1+1 ; this is for both /64 and /16384 ; +3 clocks

        txa ; +2 = 5

        sty val1 ; +3 =8

        asl ; +2 = 10

        rol val1 ; +5= 15

        rol val1+1 ; +5 =20

        asl ; +2 = 22

        rol val1 ; +5 = 27

        asl val1+1 ; +5 = 32

        sta frac1 ; +3 = 35

        stx frac2 ; this is for /1024 ; +3 = 38

        sty val2 ; +3 = 42

        lda #0 ; +2 = 44

        lsr val2 ; +5 = 49

        ror frac2 +5 = 54

        ror ; +2 = 56

        lsr val2 ; +5 = 61

        ror frac2 ; +5 = 66

        ror ; Note: carry is clear ; +2 = 68

        adc frac1 ; the /16,384 version, this is really "a byte below 16.8" ; +3 = 71

        sta frac2+1 ; use high bit below for rounding if result is >= 128 ; +3 = 74

        lda frac2 ; 3 = 77

        adc val1 ; in /16384, this is the "frac" place of 16.8 ; +3 = 80

        sta frac2 ; +3 = 83

        lda val2 ; +3= 86

        adc val1+1 ; in /16384. this is the low byte of the integer part of 16.8 ; +3 = 89

        sta val2 ; +3 = 92

        asl frac2+1 ; round up if high bit below the fractional part is set. ; +5 = 97

        lda frac2 ; +3 = 102

        adc frac1 ; the /64 version, "the REAL fractional part in 16.8 result" ; +3 = 105

       sta zsm_fracsteps ; +4 = 109

       lda val2+1 ; +3 = 102

       adc val1 ; +3 = 105

       sta zsm_steps ; +4 = 109

       lda val1+1 ; +3 = 112

       adc #0 ; +2 = 114

       sta zsm_steps+1 ; +4 = 118

       rts ; +5 (+6 JSR) = 129

If it works correctly, it would be accurate to within 0.0005% ... I am guessing around 130 cycles.

EvilSnack · Post by **EvilSnack** » Fri Nov 26, 2021 9:41 pm

One way to divide any integer by 60, using only integer math:

1. Shift right by six bits. Save off this value.

2. Shift right by another four bits. Add this to the variable holding the saved-off value.

3. If you still have anything left in the shift variable, return to step 2.

4. The variable with the saved-off value will contain the quotient.

EvilSnack · Post by **EvilSnack** » Sat Nov 27, 2021 2:07 am

On 11/26/2021 at 3:41 PM, EvilSnack said:

One way to divide any integer by 60, using only integer math:

1. Shift right by six bits. Save off this value.

2. Shift right by another four bits. Add this to the variable holding the saved-off value.

3. If you still have anything left in the shift variable, return to step 2.

4. The variable with the saved-off value will contain the quotient.

If the most significant byte is 0x38 or lower, this will deliver more accurate results, and might be faster:

1. Shift left by two bits. Save off this value.

2. Shift right by four bits. Add this to the variable holding the saved-off value.

3. If you still have anything left in the shift variable, return to step 2.

4. Add 0x80 (this will round the final result to the nearest 1/60th).

5. Drop the least significant byte.

Ed Minchau · Post by **Ed Minchau** » Sat Nov 27, 2021 4:21 am

On 11/26/2021 at 2:41 PM, EvilSnack said:

One way to divide any integer by 60, using only integer math:

1. Shift right by six bits. Save off this value.

2. Shift right by another four bits. Add this to the variable holding the saved-off value.

3. If you still have anything left in the shift variable, return to step 2.

4. The variable with the saved-off value will contain the quotient.

This is the same algorithm that Bruce and I both posted above. Must be close to optimal.

BruceMcF · Post by **BruceMcF** » Sat Nov 27, 2021 11:38 pm

On 11/26/2021 at 11:21 PM, Ed Minchau said:

This is the same algorithm that Bruce and I both posted above. Must be close to optimal.

Yes ... "shift right six bits" is shift left two bits and treat the result as all being shifted down one byte. "Shift right another four bits" is ten bits total, so shift right two bits and treat the result as all being shifted down one byte. And "if you still have anything left, return to step 2" is 14 bits total, which is shift left two bits and treat the result as all being shifted down TWO bytes. I don't test to do 3, since the shifting has already been done, and since the target is to have a 16.8 result from a 16 bit dividend, so it makes more sense to simply go ahead and include it.

1/60 - [(1/64)+(1/1024)+(1/16,384)] ~= 4.07x10^(-6), or about 0.000407% discrepancy ... off by 4 parts in a million. That discrepancy should be imperceptible.

EvilSnack · Post by **EvilSnack** » Wed Dec 01, 2021 9:20 pm

My algorithm was more general-purpose, not assuming any particular length of the dividend.

The reason it works is because the series (1/16) + (1/16)^2 + (1/16)^3 + ... approaches 1/15, and then it's only a matter of dividing by 4 to get the final result.

BruceMcF · Post by **BruceMcF** » Thu Dec 02, 2021 12:06 am

On 12/1/2021 at 4:20 PM, EvilSnack said:

My algorithm was more general-purpose, not assuming any particular length of the dividend.

The reason it works is because the series (1/16) + (1/16)^2 + (1/16)^3 + ... approaches 1/15, and then it's only a matter of dividing by 4 to get the final result.

My algorithm is arrived at heuristically, taking the first power of 2 larger than 60, finding the residual of (1/60)-(1/64), inverting that, finding the first power of two larger than the inverse of that residual, and repeating until the residual is clearly going to be imperceptible for the application at hand.

That heuristic has no dependence on the length of the dividend.

Scott Robison · Post by **Scott Robison** » Thu Dec 02, 2021 1:14 am

My algorithm was fast! I don't know if it was correct, but it was fast. And if you're going to get an inexact answer anyway, why worry about how exact it is and just go with fast. ?

Of course, I guess 230 cycles can be improved upon.

LDA #0 is much faster, and it is guaranteed to be closer to the correct answer than infinity, so just use it! {slinking away}