# MICROPROCESSOR REPORT

THE INSIDERS' GUIDE TO MICROPROCESSOR HARDWARE

### VOLUME 9 NUMBER 11

#### AUGUST 21, 1995

# Vendors Fight for Pentium Core-Logic Market

## Intel Is King of the Land, Most Others Are Losing Chip-Set Share

#### by Yong Yao

At least ten vendors produce Pentium chip sets today, and more are coming into the market. Competition among those chip makers is hotter than ever. With its Mercury, Neptune, and lately Triton, Intel has become the number one chip-set supplier in the world. Companies such as Opti, VLSI, UMC, and SiS are having a difficult time maintaining their market share while facing the technology leader, Intel.

The power of Pentium-class CPUs and the widespread understanding of most core-logic design issues make it harder for these vendors to differentiate their chip sets. To avoid being squeezed out of the market, a Pentium-class chip-set maker needs to quickly revise its products, as VIA Technologies has, by enhancing features, supporting the right memory technologies, and adopting new architectures.

One of the trends for PC core logic design is to encompass all motherboard functions except the CPU and main memory. Unlike in the past, today a chip-set architect has to consider not only memory and I/O transactions but also various multimedia functions.

#### Market Attracts a Dozen Players

Even though PC system core logic has always been a low-margin business, its gigantic volume has attracted more than a dozen vendors to fight for the same pie, including newcomers such as Cypress. MDR projects that Pentium shipments for this year will exceed 29 million units. Including products from AMD, Cyrix, and NexGen, total shipments of Pentium-class CPUs are projected to reach 50 million units in 1996.

To compete in this market for the rest of the year, a chip set must have the following minimum feature set:

- Support for Pentium (3.3-V) with different internal clock speeds and an external bus speed up to 66 MHz
- 3-1-1-1 burst reads/writes between the CPU and the secondary cache
- A glueless interface to PCI
- Integration of at least the control logic of the secondary cache with a write-back policy
- Support for burst and pipelined bus operations
- Support for alternative DRAMs and SRAMs, including EDO DRAM and pipelined burst SRAM

Other things in common among today's chip sets are that they all have on-chip tag comparators, so as to use standard SRAMs for cache tags, and all are manufactured in 3.3/5-V CMOS processes that meet the requirement of 3.3-V CPUs and 5-V memory or other peripheral devices. Since the CPU is tuned for memory and not for I/O operations, these chip sets all support memory-mapped I/O peripherals.

Except for Intel, all chip-set vendors claim to support AMD's K5 and Cyrix's M1. All the chip sets (except the Cypress HyperCache) support low-cost cacheless designs. Most chip sets include both interleaved and linear burst modes. To increase performance, all the chip sets include various buffer schemes, although the implementation and size of the buffers vary.

#### Intel Triton Dominates Market

In the past, Intel's core-logic products served to enable its highly profitable microprocessor business. These products were primarily targeted at the high-end desktop or server markets—a small triangle at the top of the PC pyramid. Although the main emphasis of its chip sets remains to grow the total PC market, Intel has taken a much more aggressive approach with its Triton chip set, which goes after mainstream Pentium PCs. Intel has an ambitious goal that other vendors refer to as Intel's 3-2-1: to ship 30 million Pentium processors, 20 million Triton chip sets, and 10 million Pentium motherboards in 1995.

Triton is Intel's third Pentium chip set. Compared with Mercury, Intel's first such chip set, Triton is rich in features and greatly enhances PCI performance. The major departures from its second-generation chip set, Neptune, are elimination of support for dual CPUs, support of master-mode IDE and EDO DRAM, and a reduction of production cost. These changes enable Triton to compete in the cost-sensitive volume PC market.

As Figure 1 shows, Triton is composed of four chips: the system controller, two data-path chips, and the peripheral controller. These four chips form a host-to-PCI bridge and provide the external cache control and a 64-bit data path to main memory.

The system controller integrates the cache and main-memory control functions and provides bus control for transfers among the CPU, cache, main memory, and PCI bus. The integrated cache logic supports a write-back cache policy for cache sizes of 256K and 512K. The external cache is in parallel with the main memory, allowing the CPU to access the cache and main memory simultaneously. This design makes the cache optional, so cacheless systems are supported. For overall system performance and cost savings, the system controller includes an internal tag RAM for cache-line status bits while leaving the address tags outside. The system controller supports five banks of DRAM for up to 128M of main memory.

The data-path chips provide the data paths connecting the CPU/cache, main memory, and PCI. They also contain the data portion of read prefetch and posted write buffers. These buffers are a common way to increase concurrency or to reduce the number of bus cycles that the CPU has to wait for completion of its reads and writes. The minimum size of these buffers is one cache line (32 bytes).

The peripheral controller is a PCI-to-ISA bridge with multiple I/O functions. The I/O functions include a seven-channel DMA controller, two 82C59 interrupt controllers, an 8254 timer/counter, Intel SMM power management, and control logic for NMI generation. In addition, the 82371FB fully supports PCI Plug and Play. It has an IDE interface with both programmed I/O and bus-master functions. The IDE interface supports two IDE connectors for up to four IDE devices, providing an interface for IDE hard disks and CD-ROMs.

Figure 1. Intel's Triton is a four-chip solution for Pentium system core logic.

Intel's success in the Pentium chip-set business comes from its early product introduction, the right feature set, its huge motherboard business, and the leverage of its microprocessors. Triton started production in February of this year, about three months ahead of most other vendors' Triton-class chip sets. It defines the baseline technology and feature set for a Pentium-class chip set. Despite the entrance of other chip-set vendors, we expect that Intel will retain more than 50% of the worldwide Pentium chip-set business in 1996.

#### VIA Gains Back Its Position

Founded in 1987, VIA Technologies was one of the early leaders in designing core logic for IBM-compatible PCs. Like Chips and Technologies, but for a very different reason, VIA lost its momentum in 1990. After focusing its resources on developing core logic for SPARC-based workstations, VIA ran into problems because of a lack of customers: the SPARC clone business did not grow as quickly as expected.

VIA's comeback started in late 1993 with the introduction of its VL/ISA chip set for 386 and 486 green PCs. Since then, the company has brought products to market very rapidly. VIA's first Pentium PCI chip set, named Apollo, was released in July 1994. Six months later, in January 1995, Apollo Plus was released with new features such as enhanced IDE, dual-processor support, and Plug and Play. In July 1995, another six months after Apollo Plus, VIA brought out its third-generation Pentium-class chip set, Apollo Master, which adds burst EDO support and IDE mastering capability.

The Apollo Master, shown in Figure 2, has a few distinguishing features. It is the first chip set that supports burst EDO DRAMs for main memory, which doubles the memory bandwidth compared with fast page-mode DRAMs. IDE mastering helps concurrent CPU and PCI operations and is critical for applications that take advantage of this concurrency under Windows 95. The transfer rate of the IDE controller is as fast as 20 Mbytes/s. In addition, the memory data path can be configured as either 64 or 32 bits wide. With its advanced power management and 32-bit data path, the chip set is suitable for notebook PCs. Only six TTLs are required for a complete motherboard implementation.

VIA's successful comeback stems from how quickly it has answered new technologies, architectural advances, and changing market requirements with the right products. The company also tries in every way to cut manufacturing costs. Fortunately, VIA came back to its PC core-logic business before the worldwide foundry capacity shortage, which allowed it to establish a strong relationship with a few key Far East semiconductor vendors. Instead of buying finished goods, VIA buys wafers, subcontracts chip assembly, and does its own testing to minimize cost.

Furthermore, to be cost effective, a chip set has to be developed using full-custom design. In retrospect, it was clearly the wrong business decision for VIA to get involved in SPARC core-logic development. It did, however, help VIA establish advanced CAD tools and state-of-the-art VLSI design methods. This experience enabled the vendor to bring its full-custom designs to market quickly.

Backed by Taiwan's multibillion-dollar Formosa Plastics Group, VIA is closely tied to major motherboard manufacturers such as First International Computer. Currently, VIA is the second-largest Pentium chip-set supplier to board manufacturers in Taiwan, after Intel. The chip-set vendor is one of the few to gain market share in 1995, with the combination of its advanced products and its strong connection to Taiwan. We believe that with its superior products, VIA will become one of the top five Pentium chip-set suppliers worldwide by 1996.

#### Opti Fights for Market Share

Opti's long-awaited Viper-M chip set has been in production since May 1995, about three months behind Triton and Apollo Plus. As Figure 3 shows, Viper-M's system architecture is not as straightforward as that of other chip sets: part of the ISA bus connects to the system controller, and the rest connects to the peripheral controller.

In addition to the common features for a Pentium-class chip set, Viper-M has a few unique features. It supports the Sony Sonic-2WP cache module and adaptive write-back, which means that if a CPU write hits in the secondary cache and is a page hit in main memory, the write will be treated as a write-through cycle. Viper-M can also run its PCI interface in either synchronous or asynchronous mode. In addition, it is the only chip set discussed in this article that supports a VL-bus slave device. The VL support, which complicates the Viper-M architecture, really belongs to 486-class systems. It seems unlikely that this feature will help Opti gain market share.
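The adaptive write-back rule described above can be read as a small decision function. The sketch below is only an interpretation of that one-sentence description, not Opti's documented logic: the function name, parameter names, and return strings are hypothetical, the rationale in the comments is one plausible reading, and the behavior on a secondary-cache miss is not stated in the article, so it is reported as unspecified.

```python
def viper_m_write_policy(l2_hit: bool, dram_page_hit: bool) -> str:
    """Pick how a CPU write is handled under the adaptive write-back rule.

    Hypothetical helper: names and return strings are illustrative only.
    """
    if not l2_hit:
        # The article does not say what happens on a secondary-cache miss.
        return "unspecified"
    if dram_page_hit:
        # Assumed rationale: the DRAM page is already open, so memory can be
        # updated right away and the cycle is treated as write-through.
        return "write-through"
    # Otherwise only the cache is updated and the line is written back later.
    return "write-back"


for l2_hit in (True, False):
    for page_hit in (True, False):
        print(l2_hit, page_hit, viper_m_write_policy(l2_hit, page_hit))
```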

Opti has been very successful in the chip-set business for many years. Its total market share for desktop PCs was about 20% in 1994. Like VLSI Technology, Opti is losing market share this year for three reasons. First, the Viper-M product has been late to market. Second, the product does not bring enough differentiation in performance and function. In particular, its PCI performance is relatively poor: the sustained data throughput is about 60 Mbytes/s, while all the other chip sets discussed in this article can sustain 100 Mbytes/s or more. Although Opti calls Viper-M a multimedia chip set, it does not have the features to separate it from others as a dedicated multimedia solution. Finally, Intel has simply taken away Opti's market share.



Figure 2. VIA's Apollo Master is a four-chip solution for Pentium system core logic. Unlike Triton, the PCI bus is isolated from the system controller and data paths.

Ironically, Opti has always been thought of as a low-cost chip maker, but it cannot compete with companies like VIA on cost. Opti was once the number-one chip-set supplier to Taiwan, but Opti's presence in the Far East market, which consumes more than 50% of all chip sets worldwide, is not significant any more. Opti seems unable to sustain its profit margins by selling to that market. In the U.S., Intel has stolen major system OEMs like Packard Bell, Gateway, and Dell that used to be either Opti or VLSI customers.

#### Cypress Is Eager to Take a Bite

Cypress Semiconductor is a new player in the PC core-logic market. From Cypress' point of view, a chip set is little more than a specialty memory. Its first Pentium chip set, named HyperCache, was introduced last June.



Figure 3. Opti's Viper-M is a three-chip solution for Pentium system core logic. Unlike Triton and Apollo Master, the ISA bus is split between the system controller and the peripheral controller.



Figure 4. Cypress' HyperCache is a three-chip solution for Pentium system core logic. It integrates a 128K secondary cache. The external cache size can be expanded by adding additional SRAM chips (128K each), up to seven pieces for a maximum of 1M. The 128K SRAM chip is specially designed for the HyperCache chip set.

The uniqueness of HyperCache is that it integrates a 128K cache SRAM. HyperCache is the only desktop chip set today that supports a two-way set-associative cache, improving the cache hit rate. This associativity makes the performance of a 128K HyperCache close to that of a 256K direct-mapped cache.
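The effect of associativity can be illustrated with a toy cache model. The sketch below is not a model of HyperCache itself: it assumes 32-byte lines and LRU replacement, the class and function names are hypothetical, and the access trace is a deliberately contrived worst case for a direct-mapped cache (two addresses 256K apart, touched alternately). It shows why a second way removes conflict misses; it does not reproduce the hit rates of real workloads.

```python
from collections import OrderedDict

LINE_BYTES = 32  # assumed cache-line size


class Cache:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes, ways):
        self.ways = ways
        self.num_sets = size_bytes // (LINE_BYTES * ways)
        # One ordered dict per set; the oldest key is the least recently used.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = addr // LINE_BYTES
        index = line % self.num_sets
        tag = line // self.num_sets
        ways_in_set = self.sets[index]
        if tag in ways_in_set:
            ways_in_set.move_to_end(tag)      # mark as most recently used
            return True
        if len(ways_in_set) >= self.ways:
            ways_in_set.popitem(last=False)   # evict the LRU line
        ways_in_set[tag] = True
        return False


def hit_rate(cache, addresses):
    hits = sum(cache.access(a) for a in addresses)
    return hits / len(addresses)


# Two locations 256K apart, accessed alternately 100 times each.
trace = [base for _ in range(100) for base in (0x00000, 0x40000)]

direct_256k = hit_rate(Cache(256 * 1024, ways=1), trace)
two_way_128k = hit_rate(Cache(128 * 1024, ways=2), trace)
print(f"256K direct-mapped: {direct_256k:.2f}")
print(f"128K two-way:       {two_way_128k:.2f}")
```

In this trace both addresses map to the same line of the direct-mapped cache and evict each other on every access, while the two-way cache holds both after the first two misses.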

For Pentium-class CPUs, especially if the CPU internal clock gets faster, a 128K or larger secondary cache is critical for system performance. Without it, the performance of today's chip sets, even with EDO DRAM, will be about 15% lower. Therefore, Cypress' integrated cache is suitable for most desktop designs.

As Figure 4 shows, HyperCache consists of three chips: a memory controller with an integrated 8K×21 cache tag, a data path with a 128K cache-data RAM, and a peripheral controller. In addition, HyperCache can be expanded to a maximum of 1M by using Cypress' CY82C694, a 16K×64 synchronous/pipelined cache burst SRAM designed specially for this chip set.
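The capacity figures quoted for HyperCache can be checked with simple arithmetic. The short sketch below uses only numbers given in the text and in the Figure 4 caption (a 16K×64 SRAM, 128K of integrated cache, and up to seven external SRAMs); the variable names are ours.

```python
KBYTE = 1024

# A 16K x 64 SRAM holds 16K words of 64 bits (8 bytes) each.
sram_bytes = 16 * KBYTE * 64 // 8

integrated_bytes = 128 * KBYTE   # cache RAM built into the data-path chip
max_external_srams = 7           # per the Figure 4 caption

total_bytes = integrated_bytes + max_external_srams * sram_bytes

print(sram_bytes // KBYTE, "K per external SRAM")
print(total_bytes // KBYTE, "K maximum cache")
```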

Because of its built-in cache RAM, HyperCache has certain limitations. Some OEMs are planning to support low-end, midrange, and high-end PCs using the same motherboard design. Since the low-end system may leave out the secondary cache, HyperCache will not be suitable for these OEMs.

#### Product Differentiation Gets Harder

Besides the four chip sets discussed above, there are other solutions. VLSI's Lynx is the first chip set that implements distributed DMA, allowing AT DMA cycles to be converted into PCI master cycles. Symphony's Rossini is the first chip set that supports 3.3-V and 5-V PCI. It can automatically detect the voltage level of the PCI bus. SiS's 551x is the first chip set that has an option to support the unified memory architecture (see **090801.PDF**). Table 1 summarizes these and other available Pentium chip sets.

The technology related to system core-logic seems well understood, and lots of skillful designers are available. Thus, it becomes difficult to separate one chip set from another these days, as Table 1 shows. If an important feature shows up in one vendor's product, it will soon be everywhere. Some differentiation does not have any practical value. For instance, Triton supports only 512K of cache, while the others support 1M or 2M. In mainstream Pentium PCs, however, designers rarely use more than 512K of SRAM for secondary cache. Another example is the dual-CPU ("pseudo-MP") support. End users are unlikely to use this feature due to the lack of performance scalability.

If the system configuration and memory are the same, the performance of these Pentium chip sets varies by only 2–5%. The important performance figures are 3-1-1-1 read and write accesses to the secondary cache with burst SRAMs, a 100-Mbyte/s sustained PCI data bandwidth, enhanced IDE with master mode, and support for EDO DRAM (burst EDO is even better).
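The burst notation used in these figures and in Table 1 can be unpacked with a little arithmetic. The sketch below is illustrative: it assumes a nominal 66-MHz bus clock, a 64-bit (8-byte) data bus, and four transfers per burst, and it reports the peak rate while bursts run back to back, not a sustained system figure. The helper names are ours. The 5-2-2-2 pattern is the Apollo Master read page hit from Table 1, and 5-1-1-1 applies the x-1-1-1 burst EDO timing from the table caption to the same lead-off.

```python
def burst_clocks(pattern: str) -> int:
    """Total bus clocks for a burst written as 'a-b-c-d', e.g. '3-1-1-1'."""
    return sum(int(n) for n in pattern.split("-"))


def peak_mbytes_per_s(pattern: str, bus_mhz: float = 66.0,
                      bytes_per_transfer: int = 8) -> float:
    """Peak transfer rate (10^6 bytes/s) for back-to-back bursts."""
    transfers = len(pattern.split("-"))
    return transfers * bytes_per_transfer * bus_mhz / burst_clocks(pattern)


for label, pattern in [
    ("L2 cache burst", "3-1-1-1"),
    ("EDO read page hit", "5-2-2-2"),
    ("burst EDO read page hit", "5-1-1-1"),
]:
    print(f"{label:24s} {pattern}: {burst_clocks(pattern):2d} clocks, "
          f"{peak_mbytes_per_s(pattern):.0f} Mbytes/s")
```

Under these assumptions, a 32-byte line takes 6 clocks from the cache, 11 clocks from EDO DRAM on a page hit, and 8 clocks from burst EDO.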

Most differences among these chip sets come from the level of concurrency. In the past, main memory was primarily used by the CPU, with occasional DMA accesses by devices such as a floppy-disk controller. Today, the main memory is used not only by the CPU but also by various PCI devices like bus-master IDE controllers, multimedia cards, and networking cards. Chip sets have to allocate the memory bandwidth and support concurrent operation of the CPU, PCI, ISA, and IDE buses.

Table 1 is just a partial list of Pentium chip sets, focusing only on desktop PC system logic. UMC, Weitek, Chips and Technologies, Cirrus Logic, and others are also playing in this market. In addition, there are several chip sets designed exclusively for Pentium notebooks, including Cirrus Logic's Redwood and its new Vesuvius (*see 0911MSB.PDF*), VLSI's Eagle, and Opti's Viper-N. For Pentium multiprocessor servers, there are ALI's Genie, Intel's C5C, Vitesse's VSP947, LSI Logic's Hydra, and Corollary's C-Bus II solutions. Other chip sets will emerge over time.

For example, Intel is working on its Triton II chip set, aiming to extend its Triton architecture to fault-tolerant computing. Besides the current Triton feature set, Triton II will support error checking and correction (ECC) for main memory, a 512M memory space, and dual Pentium processors.

So far, Intel does not have a chip set for Pentium notebook applications. This leaves a great opportunity for companies that do have Pentium notebook chip sets, as notebooks are the fastest-growing segment of the PC market and Pentium notebooks are just now beginning to ship. We do not, however, expect Intel to ignore this market segment for much longer.

#### 1996 Will Be an Exciting Year

To compete in 1996, a chip-set vendor has to have a unified memory architecture solution. None of the chip sets discussed above has been designed for this architecture except SiS's 551x. UMA chip sets for Pentium-class systems will soon be introduced by vendors such as VLSI, Opti, SiS, VIA, and Weitek. In addition to a normal 2D graphics accelerator, integrating multimedia functions like MPEG video playback will be a way for vendors to differentiate their chip sets in 1996.

If burst EDO support is a plus for competing in 1995, it will be a must for 1996. Chip-set vendors have to provide this flexibility not only for performance but also to help OEMs avoid the ongoing shortage of memory devices. Today, almost all DRAM vendors worldwide are trying to increase their fab output by shrinking die sizes, expanding existing fabs, or building new fabs. Even this increase, however, cannot meet the demand: because of Windows 95, the average PC memory size will jump from today's 6M to 12M in 1996. We project that the DRAM shortage will last at least another 12 to 18 months.

Most vendors will also add SDRAM to their memory device lists. Today's JEDEC-standard SDRAM is too complicated to be cost competitive. Companies like NEC are now working on what they call SDRAM "lite." The idea is to eliminate some unnecessary options in the standard SDRAM specification and leave only those features important for Pentium systems. The goal for the stripped-down SDRAM is to carry just a 5% price premium compared with conventional DRAM. The synchronous parts allow better performance: it is hard to boost the speed of burst EDO beyond 66 MHz, but a 75-MHz SDRAM is easier to build. We don't expect the "lite" SDRAM to reach volume production until 1997.

Burst EDO and SDRAM make cacheless PCs attractive, especially if AMD's K5 is used. The K5 supports out-of-order execution: in most cases, it will continue executing while waiting for read data to return. Therefore, the latency of the first 64-bit read data in K5 PCs is not as crucial as in Pentium systems.

The next most expensive item on a motherboard is the cache SRAMs. We expect that, in 1996, the most popular solutions will either be cacheless or have an integrated cache using Cypress-like chip sets. With burst EDO and SDRAM, it makes little sense to use standard asynchronous SRAMs for the external cache.

Other future trends we anticipate in Pentium chip-set design are a Universal Serial Bus (USB) interface, integration of functions that previously resided in a super I/O controller, and elimination of support for the EISA and VL buses. If Intel succeeds in establishing its NSP standard, chip-set vendors must support this capability in 1996, but at this time, several issues remain to be settled between Intel and Microsoft. An agreement on NSP between these two key players will certainly benefit the PC industry as a whole.

|                     | ALI<br>Aladdin     | Cypress<br>HyperCache | Intel<br>Triton    | Opti<br>Viper-M | SiS 551x      | Symphony<br>Rossini | VIA Apollo<br>Master | VLSI Lynx       |
|---------------------|--------------------|-----------------------|--------------------|-----------------|---------------|---------------------|----------------------|-----------------|
| Chip count          | 4 Chips            | 3 Chips               | 4 Chips            | 3 Chips         | 3 Chips       | 3 Chips             | 4 Chips              | 2 Chips         |
| Max cache size      | 1M                 | 1M                    | 512K               | 2M              | 1M            | 2M                  | 2M                   | 1M              |
| Cache associativity | Direct             | Two-way               | Direct             | Direct          | Direct        | Direct              | Direct               | Direct          |
| Linear burst        | Yes                | Yes                   | No                 | Yes             | Yes           | Yes                 | Yes                  | Yes             |
| Max memory size     | 768M               | 768M                  | 128M               | 512M            | 512M          | 512M                | 512M                 | 768M            |
| Max # of banks      | 6 Banks            | 6 Banks               | 5 Banks            | 6 Banks         | 8 Banks       | 8 Banks             | 8 Banks              | 6 Banks         |
| Burst EDO           | No                 | No                    | No                 | No              | Yes           | Yes                 | Yes                  | No              |
| Enhanced IDE        | Slave              | Master                | Master             | Master          | Master        | Master              | Master               | Master          |
| Max # of PCI slots  | 4 Slots            | 4 Slots               | 4 Slots            | 4 Slots         | 5 Slots       | 6 Slots             | 5 Slots              | 4 Slots         |
| PCI speed           | 1/2 CPU Bus        | 1/2 CPU Bus           | 1/2 CPU Bus        | Syn/asyn        | Syn/asyn      | 1/2 CPU Bus         | 1/2 CPU Bus          | Asyn            |
| # of ISA slots      | 4 Slots            | 4 Slots               | 5 Slots            | 6 Slots         | 5 Slots       | 5 Slots             | 8 Slots              | 8 Slots         |
| VL bus              | No                 | No                    | No                 | Yes             | No            | No                  | No                   | No              |
| Packages            | 208, 100, 100, 208 | 208, 208, 208         | 208, 100, 100, 208 | 208, 160, 208   | 208, 208, 208 | 208, 208, 208       | 208, 100, 100, 208   | 352BGA, 208MQFP |
| CMOS process        | 0.6 µm             | 0.65 µm               | 0.8 µm             | 0.8 µm          | 0.6 µm        | 0.6 µm              | 0.6 µm               | 0.6, 0.8 µm     |
| # of TTLs           | 10 TTLs            | No TTLs               | 10 TTLs            | 8 TTLs          | 7 TTLs        | 6 TTLs              | 6 TTLs               | 3 TTLs          |
| Read page hit       | 7-2-2-2            | 6-2-2-2               | 7-2-2-2            | 8-2-2-2         | 5-2-2-2       | 6-2-2-2             | 5-2-2-2              | 5-2-2-2         |
| Read row miss       | 10-2-2-2           | 9-2-2-2               | 9-2-2-2            | 10-2-2-2        | 8-2-2-2       | 9-2-2-2             | 7-2-2-2              | 8-2-2-2         |
| Read page miss      | 14-2-2-2           | 12-2-2-2              | 12-2-2-2           | 12-2-2-2        | 10-2-2-2      | 12-2-2-2            | 9-2-2-2              | 10-2-2-2        |
| Posted write        | 3-1-1-1            | 3-1-1-1               | 3-1-1-1            | 3-3-3-3         | 3-1-1-1       | 3-1-1-1             | 3-1-1-1              | 3-1-1-1         |
| Back-to-back read   | 14-2-2-2           | 6-2-2-2               | 7-2-2-2            | 8-2-2-2         | 5-2-2-2       | 6-2-2-2             | 5-2-2-2              | 5-2-2-2         |
| page hit            | 3-2-2-2            | 6-2-2-2               | 3-2-2-2            | 2-2-2-2         | 2-2-2-2       | 3-2-2-2             | 3-2-2-2              | 3-2-2-2         |
| Production date     | June 95            | Sept 95               | Feb 95             | May 95          | August 95     | 4Q95                | Sept 95              | 4Q95            |
| Price in 10,000s    | \$24               | \$48*                 | \$30               | \$25            | \$25          | \$24                | \$25                 | \$30            |

Table 1. Feature and performance comparison of eight recent Pentium chip sets. Memory read/write cycles assume EDO DRAM. For the chip sets that support burst EDO, their burst performance out of main memory is x-1-1-1. \*The price of the Cypress HyperCache includes a 128K cache. It is also a 1,000-piece price. (Source: vendors)

In addition, designers will further explore hardware parallelism and BGA packages. Advanced caching schemes such as read-bypass-write and byte gathering will become popular for next year's chip sets. Read-bypass-write allows a CPU memory-read cycle to complete before previous writes that are in the posted write buffer, as long as the read does not hit on that buffer. Byte gathering collects individual writes of either 8, 16, or 32 bits into a single 64-bit write transaction whenever these accesses are within the same 64-bit location.
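Both buffering schemes described above can be sketched with a toy posted-write buffer. This is a simplified software model under stated assumptions, not any vendor's design: the class and method names are hypothetical, the buffer has unlimited depth, and it merges a new write into any pending entry for the same aligned 64-bit location, whereas real hardware would have a fixed number of entries and its own ordering rules.

```python
class PostedWriteBuffer:
    """Toy model of byte gathering and read-bypass-write (illustrative only)."""

    def __init__(self):
        # Aligned 64-bit address -> {byte lane 0..7: value}, in posting order.
        self.pending = {}

    def post_write(self, addr: int, data: bytes) -> None:
        """Post an 8-, 16-, or 32-bit write, gathering it into the entry
        for the same 64-bit location if one is already pending."""
        if len(data) not in (1, 2, 4):
            raise ValueError("only 8-, 16-, or 32-bit writes are modeled")
        qword, offset = addr & ~0x7, addr & 0x7
        if offset + len(data) > 8:
            raise ValueError("write crosses a 64-bit boundary")
        lanes = self.pending.setdefault(qword, {})
        for i, byte in enumerate(data):
            lanes[offset + i] = byte

    def read_can_bypass(self, addr: int, length: int = 8) -> bool:
        """A read may complete ahead of posted writes only if it touches
        no 64-bit location that still has a write pending."""
        first = addr & ~0x7
        last = (addr + length - 1) & ~0x7
        return all(q not in self.pending for q in range(first, last + 8, 8))

    def drain(self):
        """Empty the buffer, returning one (address, byte-enable mask)
        pair per gathered 64-bit write transaction."""
        transactions = []
        for qword, lanes in self.pending.items():
            byte_enables = sum(1 << lane for lane in lanes)
            transactions.append((qword, byte_enables))
        self.pending.clear()
        return transactions


buf = PostedWriteBuffer()
buf.post_write(0x1000, b"\x11")               # 8-bit write
buf.post_write(0x1002, b"\x22\x33")           # 16-bit write, same 64-bit location
buf.post_write(0x1004, b"\x44\x55\x66\x77")   # 32-bit write, same 64-bit location

bypass_other = buf.read_can_bypass(0x2000)    # no pending write there
bypass_same = buf.read_can_bypass(0x1000)     # hits the buffer, must wait
gathered = buf.drain()
print(bypass_other, bypass_same, [(hex(a), bin(m)) for a, m in gathered])
```

In this example the three narrow writes leave the buffer as a single 64-bit transaction, and only the read to the untouched address is allowed to pass the posted writes.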

In short, 1996 will be a challenging year for Pentium chip-set vendors. Moving to P6, the chip-set business will become even tougher. With its huge fab capacity, Intel can produce more processors than the market can consume. Therefore, it is natural for Intel to further expand its core-logic business. This expansion also helps Intel secure its hold on the PC architecture.

Today, Intel's Orion is the only P6 chip set yet disclosed. We expect that companies like Opti and VIA will launch P6 chip sets by early 1996. But Intel will probably sell more P6 processors in systems or motherboards than as standalone CPUs in 1996. This situation will make it even harder for other chip-set vendors to compete for the P6 core-logic business. ◆