# Matrix Multiplication with the TMS32010 and TMS32020

APPLICATION REPORT: SPRA008

Author: Charles Crowell
Digital Signal Processing – Semiconductor Group

Digital Signal Processing Solutions
1989

Copyright © 1997, Texas Instruments Incorporated

#### **Abstract**

This report describes matrix multiplication on the TMS32010 and TMS32020. Matrix multiplication is useful in applications such as graphics, numerical analysis, and high-speed control. Because of their fast multiply/accumulate operations and fast data I/O, both processors can multiply large matrices in microseconds, with the matrix sizes limited only by the internal data memory. Programs are included in the report to illustrate matrix multiplication on both processors.

#### INTRODUCTION

Matrix multiplication is useful in applications such as graphics, numerical analysis, and high-speed control. The purpose of this application report is to illustrate matrix multiplication on two digital signal processors, the TMS32010 and TMS32020. Both the TMS32010 and TMS32020 can multiply any two matrices of size $M\times N$ and $N\times P$.
The programs for the TMS32010 and TMS32020, included in the appendices, can multiply large matrices and are limited only by the amount of internal data RAM available. Assuming a 200-ns cycle time, the TMS32010 and TMS32020 can calculate $[1\times 3]\times[3\times 3]$ in 5.4 microseconds.

Before the two implementations of the matrix multiplication algorithm are discussed, a brief review of matrix multiplication is presented along with three examples of graphics applications.

#### MATRIX MULTIPLICATION

The size of a matrix is defined by the number of rows and columns it contains. For example, the following is a $5\times3$ matrix since it contains five rows and three columns.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \\ a_{51} & a_{52} & a_{53} \end{bmatrix}$$

Any two matrices can be multiplied together as long as the second matrix has as many rows as the first has columns. This condition is called conformability. For example, if a matrix A is an $M \times N$ matrix and a matrix B is an $N \times P$ matrix, then the two can be multiplied together, and the resulting matrix is of size $M \times P$.

$$A = \begin{bmatrix} 3 & 4 \\ 2 & 7 \end{bmatrix} \qquad B = \begin{bmatrix} 4 \\ 6 \end{bmatrix} \qquad AB = \begin{bmatrix} 36 \\ 50 \end{bmatrix}$$

$$M \times N = 2 \times 2 \qquad N \times P = 2 \times 1 \qquad M \times P = 2 \times 1$$

Example: the first element of $AB$ is (3)(4) + (4)(6) = 36.

Given the two conformable matrices A and B, the elements of $C = A \times B$ are given by:

$$c_{ij} = \sum_{k=1}^{N} a_{ik} \times b_{kj} \qquad \text{for } i = 1,\ldots,M \text{ and } j = 1,\ldots,P$$

#### Q12 FORMAT

Applications often require multiplication of mixed numbers. Since the TMS32010 and TMS32020 implement fixed-point arithmetic, the programs in the appendices assume a Q12 format, i.e., 12 bits follow an assumed binary point.
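As a small illustration of this convention, the following Python sketch converts between real numbers and 16-bit Q12 words. It assumes two's-complement words and round-to-nearest conversion; the helper names `to_q12` and `from_q12` are hypothetical and are not part of the report's programs.

```python
Q = 12                # fractional bits in Q12
SCALE = 1 << Q        # 4096

def to_q12(x):
    """Return the 16-bit two's-complement Q12 word closest to x."""
    if not -8.0 <= x < 8.0:
        raise ValueError("value does not fit in Q12")
    word = int(round(x * SCALE))
    word = min(word, 0x7FFF)      # values just below 8.0 would round up; clamp
    return word & 0xFFFF

def from_q12(word):
    """Interpret a 16-bit word as a signed Q12 number."""
    if word & 0x8000:             # sign bit set: negative value
        word -= 0x10000
    return word / SCALE

half = to_q12(0.5)                # 0x0800
minus_one = to_q12(-1.0)          # 0xF000
```

With these helpers, 0.5 encodes as `0x0800`, and the word `0x0DE0` (the bit pattern 0000.110111100000) decodes to 0.8671875.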
The bits to the right of the assumed binary point represent the fractional part of the number, and the four bits to the left represent the integer part of the number. An example of a multiplication in Q12 format is as follows:

$$\begin{array}{rl} 0000.110111100000 & \approx 0.866 \text{ in Q12} \\ \times\ 0000.100000000000 & = 0.5 \text{ in Q12} \end{array}$$

The result of a Q12 by Q12 multiplication is a number in Q24 format that can easily be converted to Q12 by a logical left-shift of four, after which the upper 16 bits are kept. The first four bits are lost, as are the last twelve, but for a result that fits in Q12 these bits are insignificant. Note that the programs in the appendices provide no protection against overflow; therefore, the design engineer should implement a format that best fits the application.

#### GRAPHICS APPLICATIONS

Operations in graphics applications, such as translation, scaling, and rotation, require matrix manipulations to be performed in a limited amount of time. The TMS32010 and TMS32020 processors are therefore well suited to these applications.

Scaling and rotation of points in a coordinate system require multiplication of matrices. Translation is typically implemented by addition of two matrices. However, when points are represented in a homogeneous coordinate system, translation can also be implemented by multiplication. In a homogeneous coordinate system, a point $P(x, y)$ is represented as $P(x, y, 1)$. This type of coordinate system is desirable since it lets translation be treated in the same way as scaling and rotation.

Translation can be defined as the moving of a point or points in a coordinate system from one location to another without rotating. This is accomplished by adding a displacement value $D_x$ to the X coordinate of a point and a displacement value $D_y$ to the Y coordinate, thus moving the point from one location to another. Figure 1 shows both addition and multiplication methods of translation and an example of each.

Similar to translation, scaling can be implemented by matrix multiplication.
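The multiplication forms of translation and scaling can be sketched in Python as follows. This is an illustration rather than code from the report: it assumes points are row vectors $[x \ y \ 1]$ multiplied on the left of a $3\times3$ matrix, the function names are hypothetical, and the sample point (2, 3) is chosen only for the example.

```python
def mat_vec_row(point, matrix):
    """Multiply a 1 x 3 row vector by a 3 x 3 matrix."""
    return [sum(point[k] * matrix[k][j] for k in range(3)) for j in range(3)]

def translation(dx, dy):
    """Homogeneous translation matrix: displacements sit in the bottom row."""
    return [[1, 0, 0],
            [0, 1, 0],
            [dx, dy, 1]]

def scaling(sx, sy):
    """Homogeneous scaling matrix: scale factors sit on the diagonal."""
    return [[sx, 0, 0],
            [0, sy, 0],
            [0, 0, 1]]

p = [2, 3, 1]                                 # the point (2, 3)
moved = mat_vec_row(p, translation(5, 1))     # same as adding [5, 1]
shrunk = mat_vec_row(p, scaling(0.5, 0.5))    # each coordinate halved
```

Multiplying by `translation(5, 1)` gives the same coordinates as adding the displacement $[5 \ 1]$ directly, which is what allows translation to share one multiply routine with scaling and rotation.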
Points can be scaled by multiplying each coordinate of a point (or points) by a scaling value $S_x$ or $S_y$. Scaling an object is similar to stretching or shrinking it. The coordinates of each point that makes up the object are multiplied by a scaling value, which scales the object to a larger or smaller size. Figure 2 shows the scaling of an object from one size to another.

$$[X_{NEW} \ Y_{NEW}] = [X_{OLD} \ Y_{OLD}] + [D_x \ D_y]$$

where $D_x = 5$ and $D_y = 1$

Figure 1. Translation of Coordinates

Let the scaling factors $S_x$ and $S_y = 0.5$

Figure 2. Scaling From One Size To Another

Rotation of the coordinates of a point (or points) through an angle $\theta$ can also be accomplished by a matrix multiplication. The following equation gives the matrix multiplication required to rotate an object through any angle.

$$[X_{NEW} \ Y_{NEW} \ 1] = [X_{OLD} \ Y_{OLD} \ 1] \cdot \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Figure 3 shows an implementation of this equation to rotate an object 30 degrees about the origin. Figures 4 and 5 show segments of straight-line TMS32010 and TMS32020 code, respectively. These programs calculate the coordinate rotation example using a Q12 format. Note that once the matrices are loaded into memory, the processors can calculate the results in 5.4 microseconds.

The segment of TMS32020 code in Figure 5 uses the MAC instruction. For small matrices, the MAC instruction in conjunction with the RPT instruction gains little due to the overhead timing of the MAC instruction. However, for larger matrices, this method is most efficient since the MAC instruction becomes single-cycle in the repeat mode. For applications that only require translation, scaling, or rotation of coordinates, straight-line code as in Figures 4 and 5 is more efficient than the larger programs in the appendices.

Figure 3.
Implementation of Rotation Matrix

```
* ASSEMBLED WITH THE 32010 FAMILY MACRO ASSEMBLER PC2.1 84.107
*
* THIS ROUTINE ASSUMES THE INPUTS ARE IN Q12.
* THE FIRST NINE INPUTS SHOULD BE THE ROTATION
* MATRIX (HOMOGENEOUS COORDINATES), ENTERED BY
* COLUMNS. THE LAST THREE INPUTS SHOULD BE THE
* OLD X AND Y COORDINATES.
*
ROTATE  LDPK    0
ANS     EQU     12
        LARP    0
        LARK    AR0,0       * POINT AT BEGINNING OF ROTATION MATRIX.
        LARK    AR1,9       * POINT AT BEGINNING OF COORDINATES.
        IN      *+,PA0      * INPUT ROTATION MATRIX AND OLD
        IN      *+,PA0      * COORDINATES.
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        IN      *+,PA0
        ZAC                 * CLEAR ACCUMULATOR.
        LARK    AR0,0
        LT      *+,1        * CALCULATE NEW X COORDINATE.
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        APAC
        SACH    ANS,4       * CONVERT TO Q12 AND OUTPUT RESULT.
        OUT     ANS,PA0
        ZAC                 * CALCULATE NEW Y COORDINATES.
        LARK    AR1,9
        LT      *+,1
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        APAC
        SACH    ANS,4       * CONVERT TO Q12 AND OUTPUT RESULT.
        OUT     ANS,PA0
        ZAC
        LARK    AR1,9       * FINISH HOMOGENEOUS MATRIX.
        LT      *+,1
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        LTA     *+,1
        MPY     *+,0
        APAC
        SACH    ANS,4
        OUT     ANS,PA0
        RET
* NO ERRORS, NO WARNINGS
```

Figure 4. TMS32010 Code for Rotation

```
* ASSEMBLED WITH THE 32020 FAMILY MACRO ASSEMBLER PC0.7 84.348
*
* THIS ROUTINE ASSUMES THE INPUTS ARE IN Q12.
* THE FIRST NINE INPUTS SHOULD BE THE ROTATION
* MATRIX (HOMOGENEOUS COORDINATES), ENTERED BY
* COLUMNS. THE LAST THREE INPUTS SHOULD BE THE
* OLD X AND Y COORDINATES.
*
ROTATE  LARP    1           * USE AUXILIARY REGISTER 1.
ANS     EQU     12
        ZAC                 * INITIALIZE ACCUMULATOR.
        LDPK    6
        LRLK    AR1,>300    * LOAD ROTATION MATRIX INTO B1.
        RPTK    8
        IN      *+,PA0
        LRLK    AR1,>200    * LOAD COORDINATES INTO BLOCK B0.
        RPTK    2
        IN      *+,PA0
        CNFP                * CONFIGURE B0 AS PROGRAM MEMORY.
        MPYK    >0          * CLEAR P REGISTER.
        LRLK    AR1,>300
        RPTK    2
        MAC     >FF00,*+    * CALCULATE THE NEW X COORDINATE.
        APAC
        SACH    ANS,4
        OUT     ANS,PA0     * OUTPUT NEW X COORDINATE.
        MPYK    >0          * CLEAR P REGISTER.
        ZAC
        RPTK    2
        MAC     >FF00,*+    * CALCULATE NEW Y COORDINATE.
        APAC
        SACH    ANS,4
        OUT     ANS,PA0     * OUTPUT NEW Y COORDINATE.
        MPYK    >0          * CLEAR P REGISTER.
        ZAC
        RPTK    2           * FINISH HOMOGENEOUS MATRIX.
        MAC     >FF00,*+
        APAC
        SACH    ANS,4
        OUT     ANS,PA0
        RET
* NO ERRORS, NO WARNINGS
```

Figure 5. TMS32020 Code for Rotation

To combine translation, scaling, and rotation, a more general matrix can be implemented.

## GENERAL MATRIX FOR TWO-DIMENSIONAL SYSTEMS

$$\begin{bmatrix} r_{11} & r_{12} & 0 \\ r_{21} & r_{22} & 0 \\ t_x & t_y & 1 \end{bmatrix}$$

The upper-left $2\times 2$ submatrix combines the rotation matrix and the scaling matrix. The $t_x$ and $t_y$ values are the translation values.

A three-dimensional general matrix can be developed similar to the two-dimensional translation, scaling, and rotation matrix.
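For the two-dimensional case, one way to see where such a general matrix comes from is to multiply the individual scaling, rotation, and translation matrices together. The Python sketch below does this under stated assumptions: points are row vectors, so the operations are applied left to right (scale, then rotate, then translate), and the function names are hypothetical rather than taken from the report.

```python
import math

def matmul3(a, b):
    """Multiply two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def general_matrix(sx, sy, theta, tx, ty):
    """Combine scaling, rotation by theta (radians), and translation."""
    c, s = math.cos(theta), math.sin(theta)
    scale = [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]
    rotate = [[c, s, 0], [-s, c, 0], [0, 0, 1]]
    translate = [[1, 0, 0], [0, 1, 0], [tx, ty, 1]]
    # Row-vector convention: p * scale * rotate * translate
    return matmul3(matmul3(scale, rotate), translate)

g = general_matrix(0.5, 0.5, math.radians(30), 5, 1)
```

In the result `g`, the upper-left $2\times2$ block holds the combined rotation and scaling terms, the last column is $[0 \ 0 \ 1]$, and the bottom row holds $t_x$, $t_y$, and 1, which matches the layout of the general matrix. A different order of operations would give different $r$ and $t$ values but the same layout.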
## GENERAL MATRIX FOR THREE-DIMENSIONAL SYSTEMS

$$\begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ t_x & t_y & t_z & 1 \end{bmatrix}$$

#### IMPLEMENTATION OF THE MATRIX MULTIPLICATION ALGORITHM FOR THE TMS32010

The implementation of the algorithm for the TMS32010 shown in Figure 6 assumes that the two matrices to be multiplied together are of size $M\times N$ and $N\times P$. Three major loops are included to multiply the two matrices. The outside loop control is labeled MCOUNT since it controls which row in the A matrix is being referenced during the multiplication. The secondary loop control is labeled PCOUNT because it counts how many columns in the B matrix have been processed. The inside loop control is labeled NCOUNT since it controls the multiplication of the values in the A matrix with the values in the B matrix.

Figure 6. TMS32010 Flowchart

Figure 7. TMS32020 Flowchart

#### IMPLEMENTATION OF THE MATRIX MULTIPLICATION ALGORITHM FOR THE TMS32020

The implementation of the algorithm for the TMS32020 is somewhat different since its advanced instruction set allows for a more efficient method of computing matrix multiplication. The TMS32020 version in Figure 7 also assumes that the two matrices to be multiplied are of size $M \times N$ and $N \times P$. This program takes a row of the A matrix, loads it into block B0 of data memory, and then multiplies this row by all columns in the B matrix. The TMS32020 continues this process until all the rows in the A matrix have been multiplied by all the columns in the B matrix.

The TMS32020 version is similar to the TMS32010 version in that the A matrix must be entered by rows and the B matrix by columns. This allows for a faster execution time.
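The three-loop structure and the row/column storage order can be sketched in Python as below. This is an illustrative model, not a translation of the appendix programs: it uses plain integers instead of Q12 words, and the function name and flat-list arguments are assumptions made for the example. The loop comments borrow the MCOUNT, PCOUNT, and NCOUNT labels from the description above.

```python
def matmul_flat(a_rows, b_cols, m, n, p):
    """Multiply an M x N matrix stored by rows with an N x P matrix stored by columns.

    Returns the M x P product as a flat list, row by row.
    """
    if len(a_rows) != m * n or len(b_cols) != n * p:
        raise ValueError("matrix data does not match the stated sizes")
    out = []
    for i in range(m):              # outer loop: row of A (MCOUNT)
        for j in range(p):          # secondary loop: column of B (PCOUNT)
            acc = 0
            for k in range(n):      # inner loop: multiply and accumulate (NCOUNT)
                acc += a_rows[i * n + k] * b_cols[j * n + k]
            out.append(acc)
    return out

# The 2 x 2 by 2 x 1 example from the review section: A by rows, B by columns.
product = matmul_flat([3, 4, 2, 7], [4, 6], 2, 2, 1)
```

Because A is stored by rows and B by columns, both operands of the inner loop are read from consecutive locations, which is consistent with the auto-incremented addressing (`*+`) used in the listings.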
Figure 7 shows the basic implementation of the matrix multiplication algorithm that the TMS32020 uses to multiply two matrices.

Since the programs in the appendices treat the matrices differently, a memory map is included to help in understanding the two versions. Figure 8 shows how the matrices should look in memory after they have been entered. Note that for the TMS32020 version, the A matrix values reside in program memory since the CNFP (configure as program memory) instruction has been executed. Note also that only one row of the A matrix is in this block since the program enters one row at a time.

For the following matrices,

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}$$

the memory would be configured in this manner for the TMS32010 and TMS32020.

| TMS32010 data memory location (hex) | Value | TMS32020 data memory location (hex) | Value | TMS32020 program memory location (hex) | Value |
|------|-----------------|------|-----------------|-------|-----------------|
| >00F | a<sub>11</sub> | >308 | b<sub>11</sub> | >FF00 | a<sub>i1</sub> |
| >010 | a<sub>12</sub> | >309 | b<sub>21</sub> | >FF01 | a<sub>i2</sub> |
| >011 | a<sub>21</sub> | >30A | b<sub>12</sub> | | |
| >012 | a<sub>22</sub> | >30B | b<sub>22</sub> | | |
| >013 | b<sub>11</sub> | >30C | b<sub>13</sub> | | |
| >014 | b<sub>21</sub> | >30D | b<sub>23</sub> | | |
| >015 | b<sub>12</sub> | | | | |
| >016 | b<sub>22</sub> | | | | |
| >017 | b<sub>13</sub> | | | | |
| >018 | b<sub>23</sub> | | | | |

Figure 8. Memory Maps

#### SUMMARY

The TMS32010 and TMS32020 processors can be used to multiply large matrices efficiently. A brief review of matrix multiplication has been given to assist in understanding its fundamentals.
Three examples of graphics applications have been presented since these applications often require multiplication of matrices.

The TMS320 family has the power and flexibility to cost-effectively implement a wide range of high-speed graphics, numerical analysis, digital signal processing, and control applications. Since the TMS32010 and TMS32020 combine the flexibility of a high-speed controller with the numerical capability of an array processor, a new approach to applications such as graphics can now be considered.

#### REFERENCES

- J.D. Foley and A. Van Dam, Fundamentals of Interactive Computer Graphics, Addison-Wesley Publishing Company, Inc. (1982).
- S.D. Conte and Carl de Boor, Elementary Numerical Analysis, McGraw-Hill, Inc. (1980).

#### Appendix A

```
* ASSEMBLED WITH THE 32010 FAMILY MACRO ASSEMBLER PC2.1 84.107
*
* ALL INPUTS AND OUTPUTS FOR THIS PROGRAM SHOULD
* BE OR ARE IN Q12 FORMAT EXCEPT FOR THE M, N,
* AND P INPUTS, WHICH SHOULD BE Q0.
*
        AORG    0
M       EQU     >0
N       EQU     >1
P       EQU     >2
C1      EQU     >3
C2      EQU     >4
C3      EQU     >5
ANS     EQU     >6
ADIS    EQU     >7
BDIS    EQU     >8
CDIS    EQU     >9
TEMP    EQU     >A
COI     EQU     >B
COS     EQU     >C
T       EQU     >D
ONE     EQU     >E
*
* INITIALIZATION
*
        LDPK    0
        LARP    0
        LACK    15
        SACL    COS
        SACL    T
        LACK    1
        SACL    ONE
*
* MATRIX A IS M × N AND MATRIX B IS N × P.
* THESE STATEMENTS READ IN THE SIZES OF
* THE TWO MATRICES.
*
        IN      M,PA0
        IN      N,PA0
        IN      P,PA0
*
* CALCULATE THE LENGTH OF THE A MATRIX AND
* STORE THIS VALUE IN ADIS.
*
        LT      M
        MPY     N
        PAC
        SACL    ADIS
*
* CALCULATE THE LENGTH OF THE B MATRIX AND
* STORE THIS VALUE IN BDIS.
*
        LT      N
        MPY     P
        PAC
        SACL    BDIS
*
* POINT AT THE END OF THE INITIAL DATA.
*
        LAR     AR0,COS
*
* READ THE A MATRIX VALUES INTO DATA RAM.
* THIS MATRIX MUST BE ENTERED BY ROWS.
* THE MATRIX VALUES WILL BE LOCATED IN
* DATA RAM FOLLOWING THE INITIALIZATION
* VALUES.
*
FST     LAC     COI
        ADD     ONE
        SACL    COI
        IN      *,PA0
        MAR     *+
        LAC     ADIS
        SUB     COI
        BNZ     FST
*
* RESET COUNTER TO READ IN THE B MATRIX VALUES.
*
        ZAC
        SACL    COI
*
* READ THE B MATRIX VALUES INTO DATA RAM.
* UNLIKE THE A MATRIX, THESE VALUES MUST BE
* ENTERED BY COLUMNS. THESE VALUES WILL BE
* LOCATED IN DATA RAM FOLLOWING THE A MATRIX VALUES.
*
SND     LAC     COI
        ADD     ONE
        SACL    COI
        IN      *,PA0
        MAR     *+
        LAC     BDIS
        SUB     COI
        BNZ     SND
*
* MORE INITIALIZATION
*
        LAC     T
        SUB     N
        SACL    C1
        LAC     T
        ADD     ADIS
        SACL    T
        SUB     N
        SACL    ADIS
*
* CALCULATE A × B
*
*                N
* OUTPUT(ij) =  SUM  A(ik) × B(kj)
*               k=1
*
FS      LAC     C1
        ADD     N
        SACL    C1
        LARP    1
        LAR     AR1,T
        LARP    0
        ZAC
        SACL    C2
SN      LAC     C2
        ADD     ONE
        SACL    C2
        LAR     AR0,C1
        ZAC
        SACL    ANS
        SACL    C3
TH      LAC     C3
        ADD     ONE
        SACL    C3
        ZALH    ANS
        LT      *+,AR1
        MPY     *+,AR0
        APAC
        SACH    ANS
        LAC     C3
        SUB     N
        BNZ     TH
*
* LOAD ACCUMULATOR WITH HIGH WORD OF Q24 RESULT.
* LEFT-SHIFT FOUR TO CONVERT TO Q12.
* NOTE THAT ONLY THE 12 MSB'S ARE SIGNIFICANT.
*
        LAC     ANS,4
        SACL    ANS
        OUT     ANS,PA0
        LAC     C2
        SUB     P
        BNZ     SN
        LAC     C1
        SUB     ADIS
        BNZ     FS
QUIT    B       QUIT
* NO ERRORS, NO WARNINGS
```

#### Appendix B

```
* ASSEMBLED WITH THE 32020 FAMILY MACRO ASSEMBLER PC0.7 84.348
*
* ALL INPUTS AND OUTPUTS FOR THIS PROGRAM
* SHOULD BE OR ARE IN Q12 FORMAT EXCEPT
* FOR THE M, N, AND P, WHICH SHOULD BE Q0.
*
        AORG    32
N       EQU     >0
M       EQU     >1
P       EQU     >2
ANS     EQU     >3
BDM1    EQU     >4
ONE     EQU     >5
NM1     EQU     >6
PM1     EQU     >7
*
* INITIALIZATION
*
        LDPK    6
        LRLK    AR1,>300
        LARP    1
        LACK    1
        SACL    ONE
*
* READ SIZES OF MATRICES.
*
        RPTK    2
        IN      *+,PA0
*
* MORE INITIALIZATION
*
        LAC     M
        ADD     ONE
        SACL    M
        LAC     N
        SUB     ONE
        SACL    NM1
        LT      N
        MPY     P
        PAC
        SUB     ONE
        SACL    BDM1
        LAC     P
        SUB     ONE
        SACL    PM1
*
* READ IN THE B MATRIX.
*
        LRLK    AR1,>308
        RPT     BDM1
        IN      *+,PA0
CALLER  LAC     M
        SUB     ONE
        SACL    M
        BZ      QT
*
* CALL ROUTINE TO READ IN A ROW
* OF THE A MATRIX.
*
        CALL    IO
        LRLK    AR1,>308
        LARP    1
        LAR     AR0,PM1
*
* CLEAR ACCUMULATOR AND P REGISTER.
*
MUL     MPYK    >0
        ZAC
*
* MULTIPLY A ROW BY A COLUMN.
*
        RPT     NM1
        MAC     >FF00,*+
        APAC
*
* OUTPUT RESULT.
*
        SACH    ANS,4
        OUT     ANS,PA0
        LARP    0
*
* CHECK TO SEE IF ALL COLUMNS HAVE BEEN PROCESSED.
*
        BANZ    MUL,*-,1
*
* GO GET NEXT ROW.
*
        B       CALLER
QT      IDLE
IO      CNFD
        LARP    1
        LRLK    AR1,>200
        RPT     NM1
        IN      *+,PA0
        CNFP
        RET
* NO ERRORS, NO WARNINGS
```
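When checking output from a program like the one in Appendix B, a host-side model of its arithmetic can be handy. The Python sketch below is an assumption-laden illustration rather than a cycle-accurate or bit-exact simulator: it takes Q12 values as signed Python integers, accumulates the Q24 products without modelling 32-bit accumulator overflow, and then imitates `SACH ANS,4` by keeping the upper 16 bits of the accumulator shifted left by four. The function names are hypothetical.

```python
def sach_shift4(acc):
    """Model SACH ANS,4: upper 16 bits of the accumulator shifted left by four."""
    word = ((acc << 4) >> 16) & 0xFFFF
    return word - 0x10000 if word & 0x8000 else word   # back to a signed value

def q12_row_times_columns(a_row, b_cols, n, p):
    """Multiply one Q12 row of A by each of the P columns of B (stored by columns)."""
    results = []
    for j in range(p):
        acc = 0
        for k in range(n):
            acc += a_row[k] * b_cols[j * n + k]   # Q12 x Q12 gives a Q24 product
        results.append(sach_shift4(acc))          # convert the sum back to Q12
    return results

# 30-degree rotation matrix in Q12, entered by columns:
# cos(30 deg) ~ 3547/4096, sin(30 deg) = 2048/4096, 1.0 = 4096.
ROT30_COLS = [3547, -2048, 0,
              2048, 3547, 0,
              0, 0, 4096]

rotated_x_axis = q12_row_times_columns([4096, 0, 4096], ROT30_COLS, 3, 3)
rotated_y_axis = q12_row_times_columns([0, 4096, 4096], ROT30_COLS, 3, 3)
```

Here the point (1, 0) maps to roughly (0.866, 0.5) and the point (0, 1) to roughly (-0.5, 0.866), each returned as Q12 integers together with the homogeneous 1.0.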