# Vitesse Unveils Pentium Cache Controller: GaAs Chip Set Offers Top Performance at Modest Price Premium

#### By Linley Gwennap

Vitesse, known for its gallium-arsenide (GaAs) gate arrays and ASICs, has unleashed its GaAs technology on a Pentium cache controller that provides zero-wait-state accesses with standard 10-ns SRAMs. GaAs chips are commonly thought to be too expensive and too hard to cool for mainstream applications, but Vitesse has produced a cost-effective design with reasonable cooling requirements.

The two-chip set controls up to 1M of second-level cache memory in a uniprocessor Pentium system. It comes in two flavors. The VSP945/946 synthesizes a 33-MHz, 486-like local bus that is easily connected to standard system-logic chip sets for memory and I/O interfaces. The VSP951/952 emulates a 33-MHz (half-speed) Pentium bus, either 32 or 64 bits wide. The Pentium bus provides better support for a write-back cache than the 486 bus.

## High-Performance Cache Design

The Vitesse design is the first Pentium cache controller with performance similar to Intel's 82496 (*see* 070403.PDF); other Pentium chip sets insert wait states during cache accesses. For example, Intel's 82430 PCIset (*see* 070403.PDF) takes three cycles to return the first doubleword and two cycles for successive doublewords (3-2-2-2 access pattern) with standard SRAMs. With synchronous parts, the PCIset provides a faster 3-1-1-1 pattern. The Vitesse controller, using standard SRAMs, returns data at a blazing 2-1-1-1 rate, the fastest possible with a Pentium processor.

Figure 1. The Vitesse chip set controls up to 1M of second-level cache for Pentium with zero wait states. The system bus can be either 32 bits (486 or P24T) or 64 bits (Pentium).

As a result, Pentium systems using the Vitesse chip set and 256K of cache should be able to match Intel's benchmark numbers, which are based on systems using the 82496. With a larger cache, the Vitesse design should beat these figures. By comparison, systems using the PCIset or similar products will probably be 5%-10% slower than Intel's published benchmarks.
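To put the access patterns quoted above side by side, the following sketch simply adds up the clocks in each four-transfer burst. It counts cache-hit clocks only; the function name and dictionary are illustrative, and the result says nothing by itself about overall benchmark scores, which also depend on hit rates and memory latency.

```python
def burst_cycles(pattern: str) -> int:
    """Total CPU clocks for one burst, given a pattern such as '3-2-2-2'."""
    return sum(int(clocks) for clocks in pattern.split("-"))


# Access patterns as quoted in the article.
PATTERNS = {
    "PCIset, standard SRAMs": "3-2-2-2",
    "PCIset, synchronous SRAMs": "3-1-1-1",
    "Vitesse, standard SRAMs": "2-1-1-1",
}

for name, pattern in PATTERNS.items():
    print(f"{name}: {pattern} = {burst_cycles(pattern)} clocks per burst")
```

On these figures, a burst takes nine clocks on the PCIset with standard SRAMs, six with synchronous SRAMs, and five on the Vitesse controller.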

The Vitesse cache is direct-mapped with 32-byte lines. Intel's 82496 cache is two-way set-associative to improve the hit rate, but it provides zero wait states only when accessing the most-recently-used set, while the Vitesse cache achieves zero wait states for any read or write hit. The Vitesse design includes a four-entry write buffer to prevent CPU stalls when writing to memory, as well as a snoop buffer to reduce conflicts with the processor on consecutive snoop accesses to the same line.
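As a reminder of what "direct-mapped with 32-byte lines" implies, the sketch below splits a physical address into tag, line index, and byte offset. It is the generic textbook decomposition, not a description of the Vitesse tag format, which the article does not give; the function name and example address are made up for illustration.

```python
from typing import Tuple

LINE_BYTES = 32  # line size stated in the article


def split_address(addr: int, cache_bytes: int) -> Tuple[int, int, int]:
    """Return (tag, index, offset) for a direct-mapped cache of cache_bytes."""
    num_lines = cache_bytes // LINE_BYTES
    offset = addr % LINE_BYTES                 # byte within the line
    index = (addr // LINE_BYTES) % num_lines   # the one line this address can use
    tag = addr // (LINE_BYTES * num_lines)     # identifies which block occupies it
    return tag, index, offset


# Hypothetical address in a 512K cache.
print(split_address(0x12345678, 512 * 1024))
```

Because each address maps to exactly one line, a hit requires only a single tag comparison, with no selection among sets. Two addresses that differ by a multiple of the cache size share an index and therefore evict each other, which is the hit-rate cost that a two-way set-associative design such as the 82496 reduces.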

Since the Vitesse cache uses a write-back protocol, it implements a "read bypass" to speed dirty cache misses. If a read access misses the second-level cache, the controller initiates a memory read. During the first four cycles of the memory latency period (which is typically six or more cycles), the controller copies any dirty data from the selected cache line into its write buffer. Thus, there is no overhead for writing back a dirty cache line.
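The benefit of the read bypass can be illustrated with a simplified cycle count. The four-cycle copy and the six-cycle memory latency come from the text above; the serial baseline (copy first, then start the read) is a hypothetical comparison point, not a description of any particular competing chip set.

```python
COPY_CYCLES = 4  # cycles to move a dirty line into the write buffer (from the article)


def miss_penalty(memory_latency: int, dirty: bool, read_bypass: bool) -> int:
    """Cycles until memory data arrives on a second-level read miss (simplified)."""
    if not dirty:
        return memory_latency
    if read_bypass:
        # The copy overlaps the memory latency, so it costs time only if it outlasts it.
        return max(memory_latency, COPY_CYCLES)
    # Hypothetical baseline: finish the copy before starting the memory read.
    return COPY_CYCLES + memory_latency


print(miss_penalty(6, dirty=True, read_bypass=True))
print(miss_penalty(6, dirty=True, read_bypass=False))
```

With a memory latency of six or more cycles, the dirty miss costs the same as a clean one in this model, which is the sense in which the write-back carries no overhead.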

The Vitesse design allows individual cache lines to be write-protected. This prevents the data from being overwritten and is useful for BIOS code, for example. The Intel chip sets do not implement this feature.

The 945 and 951 have identical cache features, with one exception: the 951 implements Pentium's bus pipelining (*see* 070502.PDF), which allows a new transaction to be started before the previous one has finished. Pipelined transactions can progress at the maximum speed of the bus (1-1-1) with the 951, but require an extra cycle (2-1-1) with the 945.

## Simpler SRAMs Reduce System Cost

The chip set consists of a GaAs cache control chip and a CMOS data path chip, as shown in Figure 1. The cache tags are stored in two external SRAMs. The cache data is two-way interleaved, creating a total data width of 128 bits. Sixteen $\times 8$ parts can be used, either $8K \times 8$ for a 128K cache or $32K \times 8$ for a 512K cache. The maximum 1M cache can be built from eight $64K \times 16$ SRAMs. The two tag SRAMs are also standard parts, either $8K \times 8$ for the smaller caches or $32K \times 8$ for the larger sizes. At 66 MHz, the data SRAMs must have a 10-ns access time, while the tags must meet an 8-ns limit.
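The SRAM configurations listed above can be checked with a little arithmetic. The sketch below (function names are illustrative) computes the capacity and total data width of each data-SRAM array from the chip count, depth, and width given in the text.

```python
def cache_bytes(chips: int, depth_k: int, width_bits: int) -> int:
    """Capacity in bytes of an array of `chips` SRAMs, each depth_k K words deep."""
    return chips * depth_k * 1024 * width_bits // 8


def data_width_bits(chips: int, width_bits: int) -> int:
    """Total data width when all chips are accessed side by side."""
    return chips * width_bits


# (chips, depth in K words, width in bits), as listed in the article
CONFIGS = [(16, 8, 8), (16, 32, 8), (8, 64, 16)]

for chips, depth_k, width in CONFIGS:
    size_k = cache_bytes(chips, depth_k, width) // 1024
    print(f"{chips} x {depth_k}K x {width}: {size_k}K, "
          f"{data_width_bits(chips, width)} bits wide")
```

Each configuration yields the same 128-bit width, that is, two interleaved banks of 64 bits, with capacities of 128K, 512K, and 1M (1024K) respectively.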

#### Price and Availability

The VSP945/946 Cache Controller is priced at \$150 for the two-chip set in quantities of 1000. The VSP951/952 version is priced identically. The 945 and 951 use a 184-pin PQFP package, while the 946 and 952 use a 208-pin PQFP. The 945/946 chip set is expected to sample in June, with production in 3Q93. The 951/952 is expected to sample in August, with production in 4Q93.

Contact Vitesse Semiconductor at 741 Calle Plano, Camarillo, CA 93012; 805/388-7582, fax 805/987-5896.

Interleaving the data cache reduces cost because each bank of SRAMs has two clock cycles to respond. In the Vitesse design, only the CPU and the two Vitesse chips need to be synchronized to the 66-MHz clock. With designs using synchronous or custom SRAMs, however, the high-frequency clock must be distributed to as many as 19 chips (including 16 SRAMs) with very little skew, a challenging design task.
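A rough feel for the timing comes from converting the clock rate into nanoseconds. This sketch assumes a nominal 66-MHz clock and two interleaved banks, as described above; how the resulting window is divided among address delivery, SRAM access, and data-path delays is not broken down in the article, so the code makes no attempt to model it.

```python
def cycle_ns(clock_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def bank_window_ns(clock_mhz: float, banks: int = 2) -> float:
    """Time available to each bank when consecutive accesses alternate among banks."""
    return banks * cycle_ns(clock_mhz)


print(f"cycle: {cycle_ns(66):.1f} ns, per-bank window: {bank_window_ns(66):.1f} ns")
```

A single 66-MHz cycle lasts about 15 ns, so a non-interleaved design would need to deliver data from a 10-ns asynchronous SRAM within one such cycle, leaving little margin; with two banks, each has roughly 30 ns.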

The interleaving allows the chip set to achieve zero wait states without expensive synchronous or custom SRAMs. In contrast, the 82496 uses Intel's 82491 cache memory, which incorporates write buffers and other logic on the cache chips themselves. As noted above, the PCIset requires 10-ns synchronous SRAMs to achieve its maximum performance. Both of these solutions increase system cost, particularly for larger caches.

Table 1 compares the cost of 512K cache subsystems built using the Vitesse chip set, the 82496, and the PCIset. The Vitesse design is much less expensive than the 82496/491 cache due to the price difference between standard SRAMs and the custom 82491 SRAMs. This is not an entirely fair comparison because the Intel chips offer multiprocessor capabilities not found in the Vitesse design. Some of the initial uniprocessor Pentium systems, however, are using the 82496, which is the only other zero-wait-state design on the market.

Vitesse's cost advantage is further improved when one considers the system interface. Since Vitesse offers a local-bus interface, its chip set can connect to low-cost 486 system-logic chip sets and even P24T chip sets when they become available. The 82496 assumes that memory and bus interfaces are implemented by the system designer in PALs or ASICs, increasing both design time and overall system cost.

Table 1 shows that the Vitesse cache even uses less total power than the 82496/491 combination. Lower power generally reduces the cooling requirements, although the slightly hotter GaAs controller may require a small heat sink. Of course, cooling the Vitesse chips is relatively simple compared to cooling the 13-watt Pentium CPU.

For a lower-cost system, the PCIset (or a similar design) is a better alternative than the Vitesse chip set. The Vitesse design is more expensive and requires the addition of a standard system-logic chip set, such as Intel's 82420 (486-to-PCI) chip set that costs about \$50. Taking this cost into account, the Vitesse design carries an added cost of about \$80 over the PCIset but provides 5%-10% better performance. The GaAs chip set also uses more power, but cooling should not be a problem in any system built to handle a Pentium processor.

| Qty | Component                        | Unit Price | Total Price | Total Power |
|-----|----------------------------------|------------|-------------|-------------|
| 1   | Vitesse 951/952 Cache Controller | \$150      | \$150       | 5 W         |
| 16  | 10 ns, 32K × 8 SRAMs             | \$18       | \$288       | 16 W        |
| 2   | 8 ns, 32K × 8 SRAMs              | \$30       | \$60        | 2 W         |
|     | Total                            |            | \$498       | 23 W        |
| 1   | Intel 82496 Cache Controller     | \$160      | \$160       | 3 W         |
| 16  | Intel 82491 32K × 8 SRAM         | \$36       | \$576       | 32 W        |
|     | Total                            |            | \$736       | 35 W        |
| 1   | Intel 82430 PCIset (ISA)         | \$84       | \$84        | 2 W         |
| 4   | 9 ns, 64K × 18 synch SRAM        | \$95       | \$380       | 10 W        |
|     | Total                            |            | \$464\*     | 12 W        |

Table 1. Price and power consumption for several 512K cache subsystems. All prices are for quantities of 1000; SRAM pricing from Motorola. \*The PCIset includes about \$50 of system logic (equivalent to 82420) not found in the two cache controllers.
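The subtotals in Table 1 and the "about \$80" premium can be reproduced directly from the table's line items. The sketch below copies the quantities, unit prices, and total power figures from Table 1; the \$50 system-logic figure is the article's approximate number for an 82420-class chip set, and the variable names are illustrative.

```python
# (quantity, unit price in dollars, total power in watts), copied from Table 1
SUBSYSTEMS = {
    "Vitesse 951/952": [(1, 150, 5), (16, 18, 16), (2, 30, 2)],
    "Intel 82496/82491": [(1, 160, 3), (16, 36, 32)],
    "Intel 82430 PCIset": [(1, 84, 2), (4, 95, 10)],
}
SYSTEM_LOGIC_COST = 50  # approximate, per the article


def totals(parts):
    """Return (total price, total power) for a list of line items."""
    price = sum(qty * unit for qty, unit, _ in parts)
    power = sum(watts for _, _, watts in parts)
    return price, power


for name, parts in SUBSYSTEMS.items():
    price, power = totals(parts)
    print(f"{name}: ${price}, {power} W")

vitesse_price, _ = totals(SUBSYSTEMS["Vitesse 951/952"])
pciset_price, _ = totals(SUBSYSTEMS["Intel 82430 PCIset"])
premium = vitesse_price + SYSTEM_LOGIC_COST - pciset_price
print(f"Vitesse premium over PCIset, including system logic: ${premium}")
```

Adding the system logic brings the Vitesse subsystem to \$548 against \$464 for the PCIset, a difference of \$84, consistent with the rounded figure in the text.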

## GaAs Outlook Improves

Vitesse's design is best suited for high-performance Pentium systems, both on the desktop and for servers. Although it supports 128K and 256K caches, these sizes require large numbers of $8K \times 8$ SRAMs and are not as cost-competitive with other chip sets at that size. The Vitesse chip set looks good for larger cache sizes, and the 1M cache size is particularly attractive for servers. None of the Intel chip sets supports caches as large as 1M, although OPTi's PTMAWB-V Pentium chip set allows up to 2M of second-level cache.

One advantage that Intel's 82496 holds is its support for the MESI coherency protocol in a multiprocessor system. Vitesse is working to close that gap with its own multiprocessor cache controller, the 947/948, and expects to sample that product by the end of the year. This version will rival MP products from LSI Logic and Corollary (*see* 070503.PDF) that are also due at that time.

The Vitesse cache controller is priced competitively with other Pentium chip sets while offering chart-topping performance. The company has chosen a good opportunity to enter the x86 chip-set market; the high performance and power demands of Pentium complement the abilities of gallium-arsenide technology. Vitesse has been the strongest of the GaAs vendors, which have all struggled to find a market niche over the past few years. The Pentium chip set may finally move this tricky technology into the high-volume PC world, albeit at the high end of that market. ◆