PC Bus Performance: NuBus vs Micro Channel

Author: William Nowlin
Source: Electronics & Wireless World, Sep 1988, Vol 94 No 1631 (pages 18-21 physical)


Two new computers have been introduced recently which use entirely new bus structures for add-in cards. Apple Computer's Macintosh II uses a personal computer version of the NuBus. IBM's new Personal System/2 family uses an improved personal computer bus called the Micro Channel Architecture. This article presents a comparison of the two buses and the two computers, and includes a technical examination of certain measured system characteristics using GPIB/IEEE488 boards which have been developed for both buses.

The Apple Macintosh II and the IBM Personal System/2 computers are attracting a great deal of attention from add-in board manufacturers. The article covers some of the technical aspects of designing cards for these machines.

The Macintosh II is a 32-bit 68020-based computer which uses a personal computer version of NuBus, a general-purpose 32-bit computer backplane bus conceived at MIT and developed further by Texas Instruments. At present, the IEEE P1196 committee is formalizing the NuBus specification into a form suitable for acceptance and publication.

The IBM PS/2 family currently consists of six computers, three of which use the new Micro Channel Architecture (MCA). The Micro Channel is a general-purpose computer bus which supports 8, 16, or 32-bit microprocessors, peripherals, and memory. PS/2 Models 50 and 60 are equipped with 16-bit 80286s, and Models 70 & 80 use the 32-bit 80386. Models 50 and 60 support the 16-bit version of the MCA, and Models 70 & 80 support both 16 and 32-bit add-in cards.

Performance Examples

The performance specifications listed in the panel are all theoretical in nature. To determine how each bus actually performs, I examined two add-in card examples and discuss the measurements obtained.

Table 1 gives a comparison of the two computers used: a Macintosh II computer and a PS/2 Model 80. The Macintosh II is run by a 16MHz 68020 microprocessor and the Model 80 by a 16MHz 80386. The d.m.a. controller in the PS/2 runs at 8MHz, while the Mac is equipped with a National Instruments NB-DMA-8-G board containing an 82380 d.m.a. controller running at 10MHz.


Table 1. Example computer systems

The Macintosh II has six card slots, one of which is used by a display adapter and another of which is occupied by the NB-DMA-8-G card. The Model 80 has eight slots, with one taken up by a disc controller and a second slot used by National Instruments' MC-GPIB card.

The Macintosh II runs the Macintosh operating system and the Model 80 runs PC DOS 3.3. Slave cards on the Macintosh II can introduce wait states of 100ns each, and on the Model 80 each wait state adds 62.5ns to the transfer.

The NB-DMA-8-G, in addition to its 82380 d.m.a.c. and bus master circuitry, has an IEEE488 interface which responds as a NuBus slave. The measurements described use the NB-DMA-8-G both as NuBus d.m.a. master and slave.

MC-GPIB is an IEEE488 interface for the Micro Channel. It can respond as a bus slave and also contains arbitration circuitry to allow it to be used in conjunction with the d.m.a. controller on the mother board.

Slave Performance

In this example, I treat the NB-DMA-8-G and the MC-GPIB cards as bus slaves and measure some of the timing parameters which are of interest and which affect i/o performance.

A block diagram of the NB-DMA-8-G is shown alongside. For performance testing, I had the Macintosh II's 68020 execute a simple i/o write test. The test programs, shown on this page, transfer data to and from the 16 × 16-bit fifo memory in the Turbo 488 integrated circuit on the card.

The same test was performed on the PS/2 Model 80 using the code shown. A block diagram of the MC-GPIB card, right, identifies the Turbo 488 and circuitry that perform slave functions which are quite similar to those on the NB-DMA-8-G.


The two boards and computers compared in these tests are National Instruments' MC-GPIB interface for the IBM PS/2, and the NB-DMA-8-G combined d.m.a. controller and 488 interface.

Table 2 gives a summary of the performance obtained using the test programs and the two different systems. The fifo read and write cycle times are theoretical best-case values. Note that the NB-DMA-8-G adds 200ns to the fastest possible transfer cycle on the NuBus. This extra delay is due to fifo synchronization arbitration and access time. Similarly, the MC-GPIB adds approximately 125ns to the minimum Micro Channel cycle. These delays establish maximum theoretical transfer rates of 2.5×10^6 16-bit transfers per second (5Mbyte/s) for the NB-DMA-8-G and 2.67×10^6 16-bit transfers per second (5.3Mbyte/s) for the MC-GPIB.
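As a rough check on these figures, the maximum rates follow from adding each card's delay to the minimum bus cycle. The sketch below assumes a 200ns minimum NuBus cycle and a 250ns minimum Micro Channel cycle on the 16MHz Model 80, both taken from the bus overview later in this article; the function name is illustrative only.

```python
def max_transfer_rate(bus_cycle_ns, card_delay_ns, bytes_per_transfer=2):
    """Return (transfers per second, bytes per second) for one slave access."""
    cycle_ns = bus_cycle_ns + card_delay_ns
    transfers_per_s = 1e9 / cycle_ns
    return transfers_per_s, transfers_per_s * bytes_per_transfer

# Assumed minimum bus cycles: 200ns (NuBus), 250ns (Micro Channel, 16MHz Model 80)
nubus = max_transfer_rate(200, 200)   # NB-DMA-8-G: 400ns per 16-bit transfer
mca = max_transfer_rate(250, 125)     # MC-GPIB: 375ns per 16-bit transfer

print(f"NB-DMA-8-G: {nubus[0] / 1e6:.2f}M transfers/s, {nubus[1] / 1e6:.1f} Mbyte/s")
print(f"MC-GPIB:    {mca[0] / 1e6:.2f}M transfers/s, {mca[1] / 1e6:.1f} Mbyte/s")
```

Under those assumptions the script reproduces the 2.5 and 2.67 million transfers per second quoted in the text.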


Table 2. Slave timing performance

The actual programmed i/o rates obtained using the i/o read and write test programs are 0.7Mbyte/s for the NB-DMA-8-G and 0.533Mbyte/s for the MC-GPIB. As you can see, bus speed is of less importance in this example than microprocessor performance (execution time, program memory speed, on-chip cache, instruction set, etc).

One final slave timing characteristic was measured for both systems: interrupt response time. For this test, the time between the card generating the interrupt request on the bus and the interrupt service routine being entered was measured.

The 35µs interrupt latency for the Macintosh II is indicative of the interrupt service queue used by the Macintosh Operating System rather than hardware performance features of the 68020. The 7µs interrupt latency for the PS/2 Model 80 is a close indicator of actual hardware performance, since no intervening interrupt service software overhead is incurred in PC DOS.

Bus Master Performance

To measure bus master performance and, in particular, d.m.a. performance, we use the 82380 d.m.a.c. on the NB-DMA-8-G to transfer data to and from the Macintosh II motherboard and the fifo on the card. For the Micro Channel, we use the arbitration circuitry on the MC-GPIB in conjunction with the PS/2 Model 80 d.m.a.c. to transfer data to and from the motherboard memory and the fifo on the card. The results of the test are shown in Table 3.


Table 3. Bus master performance

Burst mode fetch and deposit d.m.a. transfers were used on both machines. Transfers on the NuBus were 16 bits wide for i/o accesses (fifo reads and writes) and 32 bits wide for memory. (The 82380 automatically performs source and destination bit-width adjustments.) The master latency times show how long it took from the time the card requested use of the bus until the time it received bus control.

The i/o read and write times indicate the time taken to access the fifo by the d.m.a. controller. The memory read and write times show how long it took for the d.m.a. controller to access system motherboard ram. Note that it should be possible to substantially increase the performance of the Macintosh II in this example by using a high-speed ram, PC-style NuBus add-in card.

The data transfer rates are given last. The Macintosh II is able to overcome its relatively slow memory access time by transferring 32 bits at a time, as opposed to 16 bits on the PS/2.

Results

Both machines provide powerful platforms for add-in i/o cards, and both have distinct advantages and disadvantages, as well as unique characteristics. Some of the features that are common to the Macintosh II and the PS/2 Model 80 are: high-speed, 32-bit microprocessor, memory, and bus; automatic card configuration at power on (no jumpers); powerful interrupt mechanism; and support for multiple add-in cards. Some of the dissimilar features are listed in Table 4.


Table 4. Macintosh II and PS/2 Model 80 feature comparison

I feel that the 96-pin DIN connector used by the PC-style NuBus cards offers advantages over the edge fingers defined by the Micro Channel Architecture, primarily in terms of mechanical reliability.

The multiple versions of the Micro Channel make it more difficult to develop add-in cards that can be used on all machines and at the same time offer maximum performance. For example, if you were developing a high performance, intelligent peripheral processor card for the Micro Channel, you may be forced into developing two versions: one with a 16-bit interface and one with 32-bit interface.

The small board area available on Micro Channel cards makes development more difficult. In many cases, the developer will be forced into designs which require gate arrays or other asics and/or surface-mounted components. Both of these solutions lengthen development time and, except in very high volume cases, increase manufacturing costs.

At National Instruments, we find it easier to design for the NuBus because of its synchronous nature. The separate address and data lines on the Micro Channel do not turn out to be of advantage because timing requirements force addresses to be latched in much the same way that they are latched on NuBus implementations.

Although I don't have quantitative, measured data to support this next conclusion, it is obvious that a portion of the Micro Channel bus bandwidth is utilized by the arbitration process, and that the percentage used increases as more devices contend for the bus. Unlike the Micro Channel, NuBus arbitration occurs in parallel with normal bus traffic.

For i/o intensive applications involving large blocks of data, I feel that it is essential for the computer system to provide d.m.a. transfer capability. The Macintosh II requires that this capability be provided by the add-in cards themselves. The PS/2 uses a clever scheme which allows cards to use a d.m.a.c. on the motherboard. Unfortunately, the one on the PS/2 Model 80 is limited to 16-bit transfers. Perhaps future members of the PS/2 family will offer full 32-bit support.

The Micro Channel provides refresh support for dynamic ram add-in cards. This eliminates the expense of refresh circuitry on each ram card. It does, however, utilize bus bandwidth (less than 5% on the PS/2 Model 80). This characteristic is probably paralleled on the NuBus, however, by increased access times during memory refresh and bus access collisions.

Accessing the Macintosh II motherboard ram from an add-in master NuBus card is agonizingly slow by today's standards. The PS/2 family offers acceptable access to its motherboard memory. Perhaps future versions of the Macintosh II will correct this deficiency.


Comparative Overview

The table lists some of the primary features of the PC-style NuBus and the Micro Channel bus. The NuBus is considered a synchronous bus. All bus transactions are referenced to a single, 10MHz clock signal. Transactions on the Micro Channel bus are asynchronous and are not referenced to a specific clock signal.


Main features of NuBus and Micro Channel

The 10MHz ±0.01% clock on the NuBus has an unequal duty cycle of 25:75. The Micro Channel bus provides a 14.31818MHz ±0.01% clock signal but does not use it for bus timing.

Micro Channel Architecture provides for separate memory and i/o address spaces while NuBus provides for a single address space. I/O addresses on the Micro Channel are 16 bits wide, providing 64K bytes of i/o address space.

Memory addresses on the NuBus are 32 bits wide and allow up to 4Gbytes of ram or rom to be individually addressed. The 16-bit version of the Micro Channel uses a 24-bit memory address, which translates to 16Mbytes; the 32-bit version provides a full 32-bit, 4Gbyte memory address space.

Data can be transferred on the NuBus as bytes, 16-bit halfwords, or 32-bit words. The 16-bit version of the Micro Channel can transfer 8 or 16-bit data, while the 32-bit version can handle 8, 16, 24 and 32-bit data. Bit, byte, halfword, and word significance is the same on both the NuBus and the Micro Channel.

On the NuBus, cards are given 1/16th of the total 4Gbyte address space (256Mbyte). NuBus allows up to 16 cards in a system and each card is given 1/16th of the total 256Mbyte card address space (16Mbyte per card). The card automatically assumes its proper address by sampling four signals on the NuBus connector which provide a card identification code.
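The address arithmetic can be sketched as follows. The text above gives only the sizes; the sketch additionally assumes, as in the NuBus specification, that the card space is the top sixteenth of the 4Gbyte range and that the 4-bit card ID selects the 16Mbyte block within it. The names are hypothetical.

```python
SLOT_SPACE_BASE = 0xF0000000   # assumed: card space is the top 1/16th (256Mbyte)
SLOT_SIZE = 0x01000000         # 16Mbyte per card

def slot_address_range(slot_id):
    """Return (first, last) byte address owned by the card with this 4-bit ID."""
    if not 0 <= slot_id <= 15:
        raise ValueError("NuBus card IDs are 4 bits wide (0 to 15)")
    base = SLOT_SPACE_BASE + slot_id * SLOT_SIZE
    return base, base + SLOT_SIZE - 1

first, last = slot_address_range(0x9)
print(f"card ID 9: {first:#010x} to {last:#010x}")
```

Because the ID comes from the connector rather than from the card, the same board lands at a different base address in each slot without any jumpers.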

Micro Channel cards use the Programmable Option Select (POS) feature to determine i/o addresses. Add-in card address information is stored in a non-volatile r.a.m. available to the microprocessor. At power-on-time, the microprocessor reads the address information from the n.v.r.a.m. and configures each card in the system. Using the POS mechanism, add-in cards may occupy any free portion of the available memory or i/o address space.

Data transfers. A minimum single data transfer on the NuBus consists of two clock cycles (a start cycle and an acknowledge cycle), or 200ns, which translates to a 20Mbyte/s transfer rate (see table). The Micro Channel minimum cycle is 200ns, which, for 32-bit data transfers, translates to 20Mbyte/s. The 10MHz 80286 microprocessors used in the PS/2 Models 50 and 60 support a minimum cycle of 300ns, which translates to 6.7Mbyte/s. The 16MHz 80386 Models 70 & 80 have a minimum cycle of 250ns, or 16Mbyte/s. The higher-performance 20MHz 80386 version will be able to support the full 20Mbyte/s transfer rate of the Micro Channel.
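Each of these rates is simply the transfer width divided by the minimum cycle time. A minimal sketch of that arithmetic, assuming one transfer per cycle and taking 1Mbyte as 10^6 bytes (the helper name is illustrative):

```python
def peak_rate_mbyte_s(cycle_ns, width_bits):
    """Peak rate in Mbyte/s (10^6 bytes/s), assuming one transfer per bus cycle."""
    return (width_bits / 8) / cycle_ns * 1000

cases = [
    ("NuBus, 32-bit", 200, 32),
    ("Micro Channel minimum cycle, 32-bit", 200, 32),
    ("Models 50 and 60, 16-bit", 300, 16),
    ("16MHz Models 70 and 80, 32-bit", 250, 32),
]
for name, cycle_ns, width_bits in cases:
    print(f"{name}: {peak_rate_mbyte_s(cycle_ns, width_bits):.1f} Mbyte/s")
```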

Bus masters. NuBus arbitration priority is fixed and depends on the card slot ID, with slot number 15 having the highest priority. Figure 1 shows the relationship of the NuBus slot IDs for the Macintosh II. The Micro Channel defines 18 levels of fixed priority, see Table 3. The arbitration level used by an add-in card is normally controlled by a Programmable Option Select register on the card which allows a limited form of programmable priority.


Figure 1: NuBus Slot IDs
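The fixed-priority rule for NuBus described above can be illustrated with a short sketch. It models only the outcome of a contest (the highest requesting slot ID wins), not the bus signalling that produces it, and the function name is hypothetical.

```python
def nubus_arbitration_winner(requesting_slot_ids):
    """Return the slot ID that wins the bus, or None if no card is requesting."""
    ids = list(requesting_slot_ids)
    for slot_id in ids:
        if not 0 <= slot_id <= 15:
            raise ValueError("NuBus slot IDs are 4 bits wide (0 to 15)")
    # Fixed priority: the numerically highest slot ID always wins.
    return max(ids, default=None)

print(nubus_arbitration_winner([0x9, 0xB, 0xE]))
```

One consequence of a fixed scheme is that a card's priority is decided by the slot it is plugged into rather than by software.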

DMA transfers. The Micro Channel computers are equipped with an eight-channel d.m.a. controller. Add-in cards can use the d.m.a. controller to perform data transfers using the standard Micro Channel bus arbitration mechanism. The d.m.a. controller on PS/2 Models 50 and 60 runs with a 10MHz clock, and the 16MHz Model 80 d.m.a.c. runs at 8MHz.

The Macintosh II does not provide a d.m.a. controller on the backplane. For d.m.a. comparison, we used the NB-DMA-8-G card, Figure 2. This provides the Macintosh II with a general-purpose d.m.a. controller for the PC-style NuBus in addition to a 488 bus interface. It uses an Intel 82380 controller running at 10MHz.

An analysis of the d.m.a. capabilities of the Macintosh II equipped with an NB-DMA-8-G card as compared to a 16MHz IBM PS/2 Model 80 with built-in d.m.a. controller is shown in Table 3. The analysis is theoretical and assumes best-case conditions.

Each configuration can support eight independent channels. The 82380 on the NB-DMA-8-G can perform 8, 16, or 32-bit data transfers; the PS/2 Model 80 can perform 8 or 16-bit transfers. The Model 80 d.m.a.c. transfers data using a fetch and deposit technique, called flow-through or dual-addressing. The 82380 normally uses the same fetch and deposit method, but can also support single address or flyby d.m.a. transfers.

For this theoretical analysis, I assumed that the data transfers can be performed in the minimum time possible for both Micro Channel and NuBus.

Burst capabilities allow the d.m.a.c. to keep the bus as long as necessary. Assuming that no higher-priority master is using the NuBus, we can determine the maximum transfer rate obtainable using burst mode. For the NB-DMA-8-G, this time is 600ns per 32-bit fetch and deposit cycle (6.67Mbyte/s) and 400ns per 32-bit flyby cycle (10Mbyte/s).

On the PS/2 Model 80, the memory refresh arbitration level will override any burst transfer in progress. Since a memory refresh takes place an average of once every 15.8µs, this will limit the number of 16-bit fetch and deposit transfers that can be performed during a burst to 25. Using this fact, the maximum burst data transfer rate that can be achieved is 3.1Mbyte/s.
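The burst figures for both machines can be checked with a few lines of arithmetic. This is a sketch using only the cycle times, refresh interval and 25-transfer limit quoted above, with 1Mbyte taken as 10^6 bytes; the variable names are illustrative.

```python
def burst_rate_mbyte_s(bytes_per_transfer, ns_per_transfer):
    """Sustained rate in Mbyte/s when transfers run back to back."""
    return bytes_per_transfer / ns_per_transfer * 1000

# NB-DMA-8-G on the NuBus: 32-bit (4-byte) transfers
fetch_deposit = burst_rate_mbyte_s(4, 600)
flyby = burst_rate_mbyte_s(4, 400)

# PS/2 Model 80: at most 25 16-bit transfers between refreshes 15.8us apart
refresh_interval_us = 15.8
transfers_per_burst = 25
ps2_burst = transfers_per_burst * 2 / refresh_interval_us   # bytes per us = Mbyte/s

print(f"NuBus fetch and deposit: {fetch_deposit:.2f} Mbyte/s")
print(f"NuBus flyby:             {flyby:.2f} Mbyte/s")
print(f"PS/2 Model 80 burst:     {ps2_burst:.2f} Mbyte/s")
```

The last figure comes out slightly above 3.1Mbyte/s, which matches the quoted value if that value is truncated rather than rounded.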

The 82380 d.m.a.c. on the NB-DMA-8-G is equipped with a 24-bit transfer counter, allowing it to transfer a maximum of 16Mbyte with one d.m.a. transfer operation.

The custom d.m.a.c. on the PS/2 Model 80, and on Models 50 and 60, contains a 16-bit transfer counter, providing the ability to transfer 64K 16-bit words (128K bytes) or 64K bytes with a single d.m.a. operation.
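The block-size limits follow directly from the counter widths. The sketch below assumes that an n-bit counter allows 2^n transfers per operation (as the 64K figure implies) and that the 16Mbyte figure for the 82380 counts bytes; the function name is hypothetical.

```python
def max_block_bytes(counter_bits, bytes_per_transfer):
    """Largest block one d.m.a. operation can move, assuming 2**n transfers."""
    return (1 << counter_bits) * bytes_per_transfer

KBYTE = 1024
MBYTE = 1024 * 1024

print(max_block_bytes(24, 1) // MBYTE, "Mbyte  (82380, 24-bit counter)")
print(max_block_bytes(16, 2) // KBYTE, "Kbyte (PS/2, 16-bit transfers)")
print(max_block_bytes(16, 1) // KBYTE, "Kbyte  (PS/2, 8-bit transfers)")
```

Larger blocks on the PS/2 therefore need the d.m.a. channel to be reprogrammed between operations.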

Interrupts. The PC-style NuBus provides an interrupt request signal for slave cards called non-master requests, or NMRQ. As implemented on the Macintosh II, each card generates a separate interrupt request signal to the 68020 microprocessor on the motherboard. Each slot's interrupt has a programmable priority. Interrupts on the Macintosh II are level sensitive as opposed to edge sensitive.

The Micro Channel implements a shareable, level-sensitive, multiple-line interrupt mechanism. A total of 11 shareable interrupt lines are available on the bus. Priorities are shown in the Table.


Test Programs

Macintosh II I/O Write Test Source Code

code test-write
                 move.l    #$50, d1
                 move.l    #fifo_addr, a1
                 move.l    #buf_addr, a0
                 move.l    #isr3_addr, a2
                 move.l    #3, d0
loop             btst      d0, (a2)
                 beq       finis
                 move.w    (a0)+, (a1)
                 dbra      d1, loop
finis            rts
end-code

Macintosh II I/O Read Test Source Code

code test-read
                 move.l      #$50, d1
                 move.l      #fifo_addr, a1
                 move.l      #buf_addr, a0
                 move.l      #isr3_addr, a2
                 move.l      #2, d0
loop             btst        d0, (a2)
                 beq         finis
                 move.w      (a1), (a0)+
                 dbra        d1, loop
finis            rts
end-code

PS/2 Model 80 I/O Write Test Source Code

.286c
                 public _fillfifo
_TEXT            segment     byte public 'CODE'
                 assume cs:_TEXT
xx               proc        near
_fillfifo:
                 push        ax
                 push        dx
                 push        cx
                 push        bx
                 push        si
                 mov         si, 0ff00H
                 mov         bx, 0fe1aH
                 mov         cx, 20H
L2:              mov         dx, bx
L1:              in          ax, dx
                 test        ax, 8
                 jz          L3
                 mov         dx, 0fe18H
                 outs        dx, word ptr ds:[si]
                 loop        L2
L3:              pop         si
                 pop         bx
                 pop         cx
                 pop         dx
                 pop         ax
                 ret

xx               endp

_TEXT            ends
                 end

PS/2 Model 80 I/O Read Test Source Code

.286c
                 public _fillfifo
_TEXT            segment     byte public 'CODE'
                 assume cs:_TEXT
xx               proc        near
_fillfifo:
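                 push        ax
                 push        dx
                 push        cx
                 push        bx
                 push        di
                 push        es
                 push        ds
                 pop         es                  ; INS stores to ES:[DI], so make ES = DS
                 mov         di, 0ff00H          ; offset of the destination buffer
                 mov         bx, 0fe1aH          ; port polled before each transfer
                 mov         cx, 20H             ; read at most 20H words
L2:              mov         dx, bx
L1:              in          ax, dx
                 test        ax, 4               ; stop if bit 2 of the polled port is clear
                 jz          L3
                 mov         dx, 0fe18H          ; fifo data port
                 ins         word ptr es:[di], dx
                 loop        L2
L3:              pop         es
                 pop         di
                 pop         bx
                 pop         cx
                 pop         dx
                 pop         ax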
                 ret

xx               endp

_TEXT            ends
                 end

William Nowlin is engineering vice-president with National Instruments Corporation, Austin, Texas, represented in the UK by Amplicon Electronics.
