# PA-RISC Connects to 486 Bus

## W89K Reaches 95 Dhrystone MIPS, Uses Common Peripherals

### by Curtis P. Feigel

HP's PA-RISC architecture will gain a significant new processor next quarter when Winbond introduces its new W89K. Aimed at the embedded market, the processor uses a bus interface similar to the 486, allowing it to connect to common PC peripheral chips. This simplifies its integration into products that already use such devices, and it reduces time-to-market in new designs. The W89K achieves 95 Dhrystone MIPS at 66 MHz—a number that is all the more impressive considering the chip's price: \$40.

Winbond is a Taiwan-based maker of PC system-logic chip sets and peripheral chips. The company claims that the W89K's bus structure is so similar to that of a 486DX (or DX2) that it can be used in place of that chip in most existing PC motherboards. All that is required is to replace the processor and the BIOS. This level of compatibility opens the door to increased performance in embedded systems built around PCs. It also provides an inexpensive path for adding second-level cache, which is important because the processor's performance will be limited by memory bandwidth. The device contains neither an MMU nor an FPU, making it an inappropriate choice for a workstation.

#### **Bus Differences Are Minor**

The W89K can run in either standard or clock-doubled mode; that is, its processor can run at the same rate as the bus clock or at twice that rate. If the external logic can support it, however, the device can run with both its processor and bus clocks at 66 MHz. Its bus interface also incorporates an optional burst-write mode and can handle both big-endian and little-endian data. The W89K bus differs from the 486 only in that it provides no multiprocessor support and no floating-point exception protocol, allows burst writes, and has a different burst order.

Figure 1. The W89K's fully associative caches deliver hit rates similar to larger direct-mapped caches but use less space.

In selecting a CPU architecture, Winbond chose to follow HP's PA-RISC 1.1 (third edition) Level 0. The processor's speed comes from the fact that, in one cycle, many PA-RISC instructions perform two of the following: branch, load/store, data transform (see MPR 4/2/91, p. 6). Level 0 specifies a 4G address space with memory and I/O accessed using 32-bit absolute addresses. Figure 1 shows Winbond's implementation.

The processor's separate 2K instruction and data caches are each organized as 128 lines. The line size is four 32-bit words to maintain compatibility with the 486-type burst bus cycle. One notable characteristic is that the caches are fully associative. This would be unwieldy with the larger 48- and 64-bit virtual addresses of Level 1 and Level 2 PA-RISCs, but because it uses absolute addressing, the W89K requires only 32 bits. The typical disadvantage of fully associative caches is that they must perform a hit-or-miss determination before beginning to retrieve data. In the W89K, this is accomplished in a single cycle. The data cache uses a write-back policy that is very effective when combined with burst writes.
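The cache geometry described above can be checked with a small model. This is a minimal sketch, not Winbond's design: the class name `FullyAssociativeCache` is hypothetical, and the least-recently-used replacement policy is an assumption, since the article does not say how the W89K chooses a line to evict.

```python
from collections import OrderedDict

CACHE_BYTES = 2 * 1024                         # 2K per cache, from the article
WORDS_PER_LINE = 4                             # four 32-bit words per line
BYTES_PER_WORD = 4
LINE_BYTES = WORDS_PER_LINE * BYTES_PER_WORD   # 16 bytes, one burst cycle
NUM_LINES = CACHE_BYTES // LINE_BYTES          # 128 lines


class FullyAssociativeCache:
    """Any line of memory may occupy any cache entry (hypothetical model)."""

    def __init__(self, num_lines=NUM_LINES):
        self.num_lines = num_lines
        # Maps line tag -> True; insertion order tracks recency.
        # LRU replacement is an ASSUMPTION, not taken from the article.
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        tag = address // LINE_BYTES            # drop the 4 byte-offset bits
        if tag in self.lines:
            self.lines.move_to_end(tag)        # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)     # evict least recently used
        self.lines[tag] = True
        return False
```

In this model the tag is the upper 28 bits of the 32-bit absolute address, and every stored tag is a candidate on each lookup, which is what distinguishes a fully associative cache from a direct-mapped one.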

#### **Design Allows Flexibility**

The W89K addresses memory using a 32-bit unsigned integer pointing to the lowest-addressed byte of a four-byte word. Of the 4G address space, 256M is allocated for I/O and is non-cacheable. When accessing I/O space, the processor negates the M/IO signal and outputs zeros on the four most significant address lines. The W89K does not snoop the external bus to check the coherence of the cache. Peripherals that require memory space can use the non-cacheable block of I/O addresses; alternatively, external logic can use the processor's KEN (Cache Enable) pin to disable caching for a given address. Because the caches are not kept coherent by hardware, programmers must be aware of possible coherency problems when writing software.
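The address decoding described above can be sketched as follows. The article gives the size of the I/O block but not its location; placing it in the top sixteenth of the address space (starting at F0000000) is an assumption based on the usual PA-RISC convention. The function names, and the modeling of KEN as a simple boolean, are hypothetical.

```python
ADDRESS_SPACE_BYTES = 2 ** 32          # 4G absolute address space
IO_SPACE_BYTES = 256 * 2 ** 20         # 256M reserved for I/O
# ASSUMPTION: the I/O block sits at the top of the address space.
IO_BASE = ADDRESS_SPACE_BYTES - IO_SPACE_BYTES


def is_io(address):
    """True if the absolute address falls in the non-cacheable I/O block."""
    return address >= IO_BASE


def is_cacheable(address, caching_enabled=True):
    """I/O space is never cached; elsewhere external logic decides via KEN."""
    return (not is_io(address)) and caching_enabled


def bus_address(address):
    """Address as driven on the bus: the top four lines are zero for I/O."""
    if is_io(address):
        return address & 0x0FFFFFFF
    return address
```

Under these assumptions, an access to F0000100 is uncached and appears on the bus as 00000100 with M/IO signaling an I/O cycle, while an ordinary memory address passes through unchanged and is cached unless external logic disables caching for it.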

The processor reserves the FF0004Fx address for INTA (interrupt acknowledge) cycles responding to 8259-compatible requests. The PA-RISC architecture specifies that seven of the general-purpose registers must be copied to shadow registers when interrupt service begins. This eliminates the need to save and restore those registers in software, reducing the number of cycles spent in handling interrupts. Maximum interrupt latency of a 50-MHz W89K in DX mode is 540 ns.

The PA-RISC specification does not define the address of the first instruction to be executed after a hardware reset. Until now, most PA-RISC systems put startup code in high memory, but PCs and clones expect to find this auto-start vector in low memory. The W89K solves this with an input pin called PA/486 that selects the startup mode. When this pin is high, the processor executes its initial instruction at EFFFFF0; when low, it begins execution at 000FFFF0. The pin has an internal pull-down resistor, so if it is left unconnected, the initial instruction is executed from what would be ROM space in an x86-compatible board.
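To put the 540-ns interrupt latency quoted above in terms of clocks, the short calculation below converts it to cycles at 50 MHz. The function names are hypothetical; only the two input figures come from the article.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock period in nanoseconds."""
    return 1000 / clock_mhz


def latency_cycles(latency_ns, clock_mhz):
    """Number of clock periods that fit in the given latency."""
    return latency_ns * clock_mhz / 1000


# Figures from the article: 540 ns worst case at 50 MHz.
worst_case = latency_cycles(540, 50)
```

At 50 MHz each cycle lasts 20 ns, so the quoted worst case corresponds to 27 clocks.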

To aid in fault-finding, the W89K uses three registers, called AIRs, that are not part of the processor architecture. Two are reserved for use by the manufacturer, but AIR0 can be accessed by software to support debugging. Bits in this register enable and disable each of the caches and the burst-write capability, and can set the default endian mode for interrupt routines. Disabling the caches and the write bursting simplifies debugging by making it easy for logic-analysis hardware to follow the execution of instructions.

The W89K also implements DIAG instructions. The PA-RISC architecture defines a format for these, but leaves their actual syntax and function to be decided by the manufacturer. Thus, the W89K's DIAG instructions are not directly supported by the available assemblers, but Winbond supplies a macro to convert DIAG instructions into a recognizable syntax.

#### **Approach Has Advantages**

Winbond's atypical approach has several advantages. The W89K's cost is low for its performance level. It promises shortened time-to-market because designers can use existing and proven devices for second-level caches and system logic. In addition, 486-type peripheral chips are inexpensive because of their high volumes and the competition among manufacturers. Furthermore, product developers can prove their concepts using a W89K in an existing PC motherboard without having to build and debug a new hardware design first.

The processor's fully associative caches are a rare design decision that reduces the physical size of the cache while holding cache misses to an acceptable level. Although fully associative caches require more circuitry per entry than the direct-mapped variety, they need fewer entries, resulting in a net reduction in die area.

Winbond's target is the embedded market, which, as shown in Table 1, means it's up against serious competition. Currently, none of the other contenders approaches this speed; the W89K appears to be a leader in both performance and performance per dollar. Priced at \$40 and delivering 95 Dhrystone MIPS, the 66-MHz W89K has no direct comparison in Intel's i960 line. Intel's new 40-MHz i960CF, at \$160, is four times more expensive and has about two-thirds of the W89K's performance (62 Dhrystone MIPS). The 25-MHz i960KA, at \$22, is about half the cost of the Winbond device but delivers only 9.4 Dhrystone MIPS, one-tenth of the performance of the W89K. Intel has promised to introduce a version of the i960, code-named P110, that it claims will reach the 100-MIPS level. It seems unlikely that its price will put it on par with the W89K.

The company explicitly states that one target market for this device is laser printers, but there are not many chip sets for putting a 486-bus processor in a laser printer. Still, the W89K will undoubtedly find work in LAN routers and bridges. It would also make a good X-terminal controller. Another application for this device would be in an inexpensive entry-level development system running the public-domain GNU tools for PA-RISC.

While PA-RISC processors require software different from that of a 486, this is not a big drawback for embedded systems. Winbond is ready to supply PA-RISC code to control its line of 486-type peripherals, but it remains to be seen whether this approach can overcome the current trend: many modern embedded processors reduce cost by incorporating DMA, interrupt control, communications ports, and peripherals on one chip. What really makes the W89K stand out is its low cost for the performance it delivers. ◆

## Price & Availability

Samples of the W89K will be available in early April, with production scheduled for August. In 1,000-unit quantities, the price will be \$40 for the 66-MHz device and \$25 for devices running at 40 MHz and slower (much of the price difference is due to the ceramic package used for the faster device). The company is also planning a low-power version of the 66-MHz device to be released in October at a price of \$12. Development tools cost \$500 for HP workstations; GNU users can obtain tools on the Internet via ftp for no charge. For more information, contact Winbond (Hsinchu, Taiwan) via fax at 886.35.774527.

|                              | W89K    | i960CF | i960KA | 29030  |
|------------------------------|---------|--------|--------|--------|
| Clock (MHz)                  | 66      | 40     | 25     | 33     |
| Instruction cache (bytes)    | 2K      | 4K     | 512    | 8K     |
| Data cache (bytes)           | 2K      | 1K     | none   | none   |
| Interrupt latency            | 0.54 μs | 2.8 μs | 4.8 μs | 1.1 μs |
| Performance (Dhrystone MIPS) | 95      | 62     | 9.4    | 21     |
| Price                        | \$40    | \$160  | \$22   | \$39   |
| MIPS/\$                      | 2.37    | 0.39   | 0.43   | 0.90   |

Table 1. The Winbond chip leads both in performance and in performance per dollar. (Source: vendors' information)
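
The price and performance comparison between the W89K and the two i960 parts discussed in the text can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the input figures are the Dhrystone MIPS and 1,000-unit prices listed in Table 1.

```python
# (Dhrystone MIPS, price in dollars) as listed in Table 1
PARTS = {
    "W89K":   (95.0, 40.0),
    "i960CF": (62.0, 160.0),
    "i960KA": (9.4, 22.0),
}


def mips_per_dollar(name):
    """Performance per dollar for one part."""
    mips, price = PARTS[name]
    return mips / price


def relative_to_w89k(name):
    """Return (price ratio, performance ratio) against the W89K."""
    mips, price = PARTS[name]
    ref_mips, ref_price = PARTS["W89K"]
    return price / ref_price, mips / ref_mips


for part in PARTS:
    price_ratio, perf_ratio = relative_to_w89k(part)
    print(f"{part}: {mips_per_dollar(part):.3f} MIPS/$, "
          f"{price_ratio:.2f}x price, {perf_ratio:.2f}x performance")
```

With these inputs the i960CF comes out at four times the W89K's price for roughly two-thirds of its performance, and the i960KA at about one-tenth of the performance for a little over half the price.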