PIC: Multiplication

From ScienceZero
Latest revision as of 19:09, 5 January 2009

Low-end microcontrollers from Microchip have no multiplication instruction, so any multiplication has to be done in software or with external hardware. This is a collection of efficient 8 by 8 bit multiplication methods, each giving a 16 bit product.


General (13-123 cycles, 18 instructions)

A simple starting point for experimentation that is easily extended to any word length. It is quite fast if a large portion of the multipliers are very small, because the loop ends as soon as no set bits remain in the multiplier.

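;General - 13-123 cycles, 18 instructions
;Remove "clrf tmpH" for 16 by 8 multiplication
;In:  mulplr = multiplier, mulcnd = multiplicand (both are destroyed)
;Out: prodH:prodL = 16 bit product

mul8x8g clrf    prodL           ;Clear the 16 bit product
        clrf    prodH
        clrf    tmpH            ;tmpH:mulcnd = 16 bit shifted multiplicand
        clrc                    ;Shift a zero into the multiplier
mulgl   rrf     mulplr          ;Next multiplier bit into carry
        skpc
        goto    noadd           ;Bit clear, nothing to add
        movfw   mulcnd          ;Bit set, add tmpH:mulcnd to the product
        addwf   prodL
        skpnc
        incf    prodH           ;Carry from the low byte
        movfw   tmpH
        addwf   prodH
noadd   rlf     mulcnd          ;Shift the 16 bit multiplicand left
        rlf     tmpH
        tstf    mulplr          ;Loop until no multiplier bits remain
        skpz
        goto    mulgl
        return                  ;Not included in the counts above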
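
The loop above can be checked against a small reference model. The following Python sketch is not PIC code; it only mirrors the register operations of the General routine as read from the listing, and the function name mul8x8_general is made up for illustration. It also counts loop passes, which shows why small multipliers finish early.

```python
def mul8x8_general(mulplr, mulcnd):
    """Model of the General routine: LSB-first shift-and-add with early exit.

    Returns (product, passes). The name and return shape are illustrative
    assumptions, not part of the original page.
    """
    prod = 0                      # models prodH:prodL
    shifted = mulcnd              # models tmpH:mulcnd
    passes = 0
    while True:                   # the assembly loop always runs at least once
        bit = mulplr & 1          # rrf mulplr: bit 0 goes to carry
        mulplr >>= 1
        if bit:
            prod = (prod + shifted) & 0xFFFF
        shifted = (shifted << 1) & 0xFFFF
        passes += 1
        if mulplr == 0:           # tstf mulplr / skpz
            break
    return prod, passes


print(mul8x8_general(3, 200))     # → (600, 2)
```

A multiplier of 3 needs only two passes, while 255 needs all eight, which matches the wide 13-123 cycle range quoted for this routine.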


Small (67 cycles constant, 11 instructions)

This one saves precious code memory and still has good speed. The principle of this method comes from Microchip application note 526, but it has been optimised for minimum size and maximum speed.

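;Small - 67 cycles constant, 11 instructions
;Same principle as application note 526
;Optimized for minimal size by Bjørn Bæverfjord

mul8x8s clrf    prodH
        movlw   .128            ;Sentinel bit, falls out of prodL after 8 shifts
        movwf   prodL
        movfw   mulcnd          ;W = multiplicand for the whole loop
mulsl   rrf     mulplr          ;Next multiplier bit into carry
        skpnc
        addwf   prodH           ;Bit set, add multiplicand to the high byte
        rrf     prodH           ;Shift the product right through carry
        rrf     prodL
        skpc                    ;Sentinel in carry means all 8 bits are done
        goto    mulsl
        return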
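
The trick that removes the loop counter is the value 128 preloaded into prodL: it acts as a sentinel bit that reaches the carry flag after exactly eight right shifts. A minimal Python sketch of that idea follows, modelling the carry flag explicitly. It is a reading of the listing above rather than PIC code, and mul8x8_small is a hypothetical name.

```python
def mul8x8_small(mulplr, mulcnd):
    """Model of the Small routine: right-shifting product with a sentinel bit.

    Returns (product, passes); name and return shape are illustrative.
    """
    prod_h = 0
    prod_l = 0x80                          # movlw .128 / movwf prodL
    passes = 0
    while True:
        carry = mulplr & 1                 # rrf mulplr
        mulplr >>= 1
        if carry:                          # skpnc / addwf prodH
            total = prod_h + mulcnd
            prod_h = total & 0xFF
            carry = total >> 8             # carry out of the 8 bit add
        # rrf prodH: old carry enters bit 7, bit 0 becomes the new carry
        carry, prod_h = prod_h & 1, (carry << 7) | (prod_h >> 1)
        # rrf prodL: same rotation on the low byte
        carry, prod_l = prod_l & 1, (carry << 7) | (prod_l >> 1)
        passes += 1
        if carry:                          # skpc: the sentinel has fallen out
            break
    return (prod_h << 8) | prod_l, passes
```

Every input takes eight passes in this model, which is why the routine runs in constant time.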


Medium (36 cycles constant, 36 instructions)

Directly from Microchip application note 526.

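;Medium - 36 cycles constant, 36 instructions
;From Microchip application note 526
;PIC16C5X / PIC16CXXX Math Utility Routines

mul8x8m clrf    prodH
        clrf    prodL
        movfw   mulcnd          ;W = multiplicand for all eight steps
        clrc
        btfsc   mulplr,0        ;Bit test plus add takes 2 cycles either way
        addwf   prodH           ;Add multiplicand if the multiplier bit is set
        rrf     prodH           ;Shift the product right through carry
        rrf     prodL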
        btfsc   mulplr,1
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,2
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,3
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,4
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,5
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,6
        addwf   prodH
        rrf     prodH
        rrf     prodL
        btfsc   mulplr,7
        addwf   prodH
        rrf     prodH
        rrf     prodL
        return


Fast (22-36 cycles, 51 instructions)

The fastest software solution I could find, taken from the PICList.

;Fast - 22-36 cycles, 51 instructions
;By Scott Dattalo from the PICList
;As written, the multiplicand is read from prodH on entry

mul8x8f movfw   prodH           ;W = multiplicand
        clrc
        clrf    prodL
        btfsc   mulplr,0        ;Jump in at the lowest set multiplier bit
        goto    mulf0
        btfsc   mulplr,1
        goto    mulf1
        btfsc   mulplr,2
        goto    mulf2
        btfsc   mulplr,3
        goto    mulf3
        btfsc   mulplr,4
        goto    mulf4
        btfsc   mulplr,5
        goto    mulf5
        btfsc   mulplr,6
        goto    mulf6
        btfsc   mulplr,7
        goto    mulf7
        clrf    prodH      ;Bugfix by Dmitry Kiryashov
        return

mulf0   rrf     prodH,f         ;Shift the product right through carry
        rrf     prodL
        btfsc   mulplr,1
        addwf   prodH,f         ;Destination must be f so the sum is kept
mulf1   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,2
        addwf   prodH,f
mulf2   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,3
        addwf   prodH,f
mulf3   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,4
        addwf   prodH,f
mulf4   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,5
        addwf   prodH,f
mulf5   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,6
        addwf   prodH,f
mulf6   rrf     prodH,f
        rrf     prodL
        btfsc   mulplr,7
        addwf   prodH,f
mulf7   rrf     prodH,f
        rrf     prodL
        return
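
The speed comes from skipping the trailing zero bits of the multiplier: the multiplicand already sitting in prodH serves as the first partial product, and the routine jumps into the unrolled chain at the lowest set bit. The Python sketch below models that flow under the assumption that every add writes its result back to prodH. It is not PIC code, and the name mul8x8_fast is hypothetical.

```python
def mul8x8_fast(mulplr, mulcnd):
    """Model of the Fast routine: enter the unrolled chain at the lowest set bit.

    Returns (product, shifts), where shifts counts the 16 bit right shifts
    actually executed. Name and return shape are illustrative.
    """
    if mulplr == 0:
        return 0, 0                        # clrf prodH / return
    lowest = 0
    while not (mulplr >> lowest) & 1:      # the btfsc/goto ladder
        lowest += 1
    prod = mulcnd << 8                     # prodH = multiplicand, prodL = 0
    shifts = 0
    for bit in range(lowest + 1, 8):       # blocks mulf<lowest> .. mulf6
        prod >>= 1                         # rrf prodH / rrf prodL
        shifts += 1
        if (mulplr >> bit) & 1:
            prod += mulcnd << 8            # add to prodH; bit 16 models carry
    prod >>= 1                             # mulf7: final shift
    shifts += 1
    return prod, shifts
```

For a multiplier of 0x80 the model performs a single shift, which corresponds to the shortest path through the routine.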


Hardware (2-12 cycles)

Figure: 16F84A hardware table lookup

This circuit reads a 128 kB lookup table in EPROM using only 10 I/O pins while keeping good performance. A larger PIC with more I/O pins will give higher performance, since the shared bus can be split into separate unidirectional address and data buses.

The value of R1 must be chosen so that the pulse from the EOR (exclusive-OR) gate is shorter than one instruction cycle of the PIC and longer than the minimum pulse width required by the specific latch used.
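
The 128 kB figure follows from the table size: 256 x 256 operand pairs with a 2 byte product each gives 131072 bytes. The Python sketch below generates such a table image on a PC. The address layout (first operand in the upper address bits, low byte of the product stored first) is purely an assumption for illustration, since the actual wiring is defined by the schematic, and the function names are hypothetical.

```python
def build_mul_table():
    """Build a 128 KiB multiplication table image.

    Assumed layout (not taken from the schematic): entry index = a * 256 + b,
    two bytes per entry, low byte first.
    """
    table = bytearray(2 * 256 * 256)
    for a in range(256):
        for b in range(256):
            product = a * b
            index = (a << 8) | b
            table[2 * index] = product & 0xFF        # low byte of the product
            table[2 * index + 1] = product >> 8      # high byte of the product
    return bytes(table)


def lookup(table, a, b):
    """Read one product back using the same assumed layout."""
    index = (a << 8) | b
    return table[2 * index] | (table[2 * index + 1] << 8)


table = build_mul_table()
print(len(table))                  # → 131072
print(lookup(table, 200, 100))     # → 20000
```

The resulting bytes could be written to a file and programmed into the EPROM once the layout has been adjusted to match the real address and data connections.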