## **Contents**

| Section | Page |
|---|---|
| **Preface** | v |
| Scope of the Book | v |
| Studying Real Machines | vi |
| Introductory Notions: The Tools of the Trade | vii |
| The Dual Roles of the Chapters That Lie Ahead | xvi |
| The Amount of Detail and the Task at Hand | xvii |
| Text Insets | xix |
| Description, Content and Use of the CD-ROM | xx |
| Web-site | xxi |
| Acknowledgments | xxii |
| **List of Figures** | xxxiii |
| **List of Tables** | xxxvii |
| **1 Microprocessors, Platforms, and Systems** | 1 |
| Designing and Implementing a Microprocessor | 1 |
| What Needs to be Done? | 3 |
| Constraints | 4 |
| Testing | 8 |
| Defining a Processor's Instruction Set Architecture | 13 |
| Model of the Microprocessor at the RTL Level | 16 |
| Golden Representation | 22 |
| Behavioral (Functional) Simulation | 23 |
| Model at the Gate and Circuit Levels | 26 |
| Gate and Circuit Level Simulation and Hardware Emulation | 26 |
| Generate Netlists and Physical Layout | 28 |
| Mask Generation, Wafer Fabrication, and Packaging | 33 |
| Designing and Implementing a 3D Graphics PC Platform | 35 |
| PC Platforms: Key Components and Interconnections | 35 |
| Display Adapters | 46 |
| Perspectives from Eckert-Mauchly Award Winners | 53 |
| Summary of Chapter and How to Proceed | 61 |
| **2 A Microarchitecture Case Study** | 63 |
| An Overview of the K6 3D Microprocessor | 64 |
| A Range of Design Approaches | 64 |
| A Family of Microprocessors | 66 |
| K6 3D Block Diagram | 68 |
| L1-Cache, L2-Cache, Store Queue, and System Interface | 72 |
| Superscalar Design | 74 |
| The Execution Units | 77 |
| Latencies | 83 |
| Status Flags, Faults, Traps, Interrupts, and Abort Cycles | 83 |
| Architectural and Microarchitectural Registers | 88 |
| Register Number and Name Mappings | 91 |
| Special Registers and Model Specific Registers | 94 |
| Branch Direction Prediction Logic and the Branch Resolving Unit | 105 |
| The L1 and L2 Caches (revisited) | 108 |
| The Instruction Buffer and Instruction Registers | 113 |
| Predecoding Logic | 115 |
| Predecode Bits | 117 |
| Combinational Predecode Analysis Logic | 117 |
| Use of the Predecode Bits | 121 |
| The Decoders | 123 |
| Decoder Combinations | 126 |
| Decoder and Scheduler OpQuads | 126 |
| The Scheduler | 128 |
| Issue Selection Logic | 133 |
| Operand Selection Logic | 134 |
| Load/Store Ordering Logic | 134 |
| Status Flag Handling Logic | 135 |
| Status Flag-Dependent RegOp Logic | 135 |
| Branch Resolution Logic | 135 |
| Self-Modifying Code Support Logic | 136 |
| Global Control Logic | 136 |
| OpQuad Expansion Logic | 136 |
| Op Commit Unit | 137 |
| OpQuad Sequences and the RISC86 Operation Set | 137 |
| OpQuad Sequences | 137 |
| Formats for Decoder Ops | 142 |
| LdOp and StOp Field Descriptions | 143 |
| RegOp Field Descriptions | 150 |
| SpecOp Field Descriptions | 155 |
| LIMM Op Field Descriptions | 158 |
| Execution Pipelines | 158 |
| Faults, Traps, Abort Cycles, and the Pipelines | 168 |
| Fault and Trap Handling | 169 |
| LdOp Abort Cycles | 170 |
| LdOp Misaligned Accesses | 171 |
| StOp Abort Cycles | 172 |
| StOp Misaligned Accesses | 173 |
| BRU Pipeline | 174 |
| Handling Faults, Traps, and Precise Interrupts | 175 |
| Re-examining the Abort Cycle | 176 |
| Precise Interrupts and Precise Exceptions | 176 |
| System Interface | 177 |
| Chapter Summary | 181 |
| **3 The K6 3D Microarchitecture** | 183 |
| The Scheduler: An Expanded Description | 183 |
| Loading the Scheduler | 184 |
| Shifting OpFields from Row to Row | 187 |
| Pseudo-RTL Descriptions | 188 |
| Static Field Storage Element Shifting Operation | 190 |
| Dynamic Field Storage Element Operation | 190 |
| The LdEntry Signals: Shifting the OpQuads | 193 |
| Static and Dynamic Fields | 196 |
| An Op Entry's Static Fields in More Detail | 198 |
| An Op Entry's Dynamic Fields in More Detail | 205 |
| The OpQuad Fields in More Detail | 213 |
| The Scheduler Pipeline | 218 |
| Op Issue Stage Logic Overview | 222 |
| Operand Fetch Stage Logic Overview | 223 |
| LdOp-StOp Ordering Logic Overview | 225 |
| Status Flag Handling Logic Overview | 226 |
| Status Flag Dependent RegOp Logic Overview | 226 |
| Branch Resolution Logic Overview | 227 |
| Global Control Logic Overview | 228 |
| Self-Modifying Code Support Logic Overview | 228 |
| Issue Selection Logic | 228 |
| Operand Information Broadcast | 232 |
| RegOp Bumping | 246 |
| Load/Store Ordering Logic | 248 |
| Scheduler Op Entry Fields Read Out During Operand Transfer | 252 |
| Global Control Logic | 253 |
| Status Flag Handling Logic, Status Flag Dependent RegOp Logic, Branch Resolution Logic, and Nonabortable RegOp Logic | 258 |
| Self-Modifying Code Support Logic | 271 |
| The OCU: An Expanded Description | 272 |
| Commitment Constraints | 274 |
| FAULT Ops and LdStOps With Pending Faults | 276 |
| Debug Traps and Sequential and Branch Target Limit Violations | 276 |
| Aborts for Mispredicted BRCOND Ops | 277 |
| The Timing of Result Commitments | 278 |
| Memory Writes | 278 |
| The Timing of Aborts | 278 |
| General Register Commitment | 279 |
| Multiple Simultaneous Full and Partial Writes | 280 |
| Status Flag Commitment | 283 |
| StOps and Memory Write Commitment | 285 |
| Memory Read Fault Handling | 291 |
| FAULT Op Commitment | 292 |
| LDDHA and LDAHA Op Commitment | 293 |
| Sequential and Branch Target Segment Limit Violation Handling | 294 |
| Mispredicted BRCOND Op Handling | 295 |
| OpQuad Retirement | 295 |
| Abort Cycle Generation | 296 |
| Avoiding Deadlock | 298 |
| Register Renaming | 300 |
| Implications for Pipeline Operation | 301 |
| Explicit Register Renaming | 304 |
| Implicit Register Renaming | 307 |
| The MMX and 3D Registers | 310 |
| Differences Between the Implicit and Explicit Register Renaming Schemes | 312 |
| Summary of the Chapter | 314 |
| **4 Technology Components of Platform Architecture** | 315 |
| PC Design Metastandards | 316 |
| Forces Driving Platform Architecture | 316 |
| PC Design Guides and Specifications | 321 |
| Platform Categories | 326 |
| Platform-Level Technology | 329 |
| Enhanced User Experience | 329 |
| Appliance-Like Operation | 344 |
| Total Cost-of-Ownership | 347 |
| Connectivity | 350 |
| Component-Level Technology | 367 |
| Processors | 368 |
| Storage Devices | 370 |
| General-Purpose Buses | 372 |
| Device-Specific Buses and Ports | 401 |
| Chapter Summary | 425 |
| **5 Platform Memory Technology** | 427 |
| Basic Memory Technologies | 428 |
| Asynchronous DRAM | 428 |
| Synchronous DRAM | 434 |
| Emerging Memory Technologies | 435 |
| Rambus | 436 |
| Synchronous-Link DRAM (SLDRAM) | 442 |
| North-Bridge Memory Controller Overview | 443 |
| FPM, EDO, and SDRAM Support | 443 |
| Consequences of Multiple Banks of DRAM | 449 |
| Memory Controller Functions | 453 |
| Chapter Summary | 461 |
| **6 Platform Optimization Techniques and Directions** | |
| Survey of Platform Performance Optimizations | 464 |
| Improving Data Movement and Manipulation in the Platform | 464 |
| Overall System Architecture Performance Optimization | 467 |
| Performance Optimization Specifics in a Contemporary 3D Graphics Platform | 480 |
| Summary of Contemporary Platform Optimizations | 492 |
| Platform Directions | 495 |
| Overview of Contemporary Issues Impacting PC Platforms | 495 |
| Contemporary Platform Strategic Issues | 499 |
| Next-Generation Platforms | 506 |
| A Vision of Platforms Beyond the Millennium | 508 |
| Chapter Summary | 514 |
| **References to Authors and Other Individuals** | 515 |
| **References to Suggested Readings** | 519 |
| **Copyright and Legal Notices** | 521 |
| **Glossary/Index** | 525 |