This document is a technical specification for the EV3 and EV4 microprocessors, which are the first chips to implement the ALPHA architecture. It details their external interface, programming information, and micro-architecture. An important update memo is also included, outlining changes for the EV45 chip.
Key aspects covered:
Scope and Overview:
- The document describes the hardware/software interface and implementation-specific programming details, not the full ALPHA architecture or chip internal designs.
- EV3 is an earlier variant (CMOS-3, 1 micron, 10ns cycle time) primarily for system-level debugging and demonstration units. It is pin-compatible with EV4 but has less functionality.
- EV4 (CMOS-4, 0.75 micron, 6.6ns nominal cycle time, with 5ns possible) is a superscalar, superpipelined processor designed for a wide range of systems.
Micro-architecture (EVx refers to common EV3/EV4 features):
- Execution Units: Both chips share two independent execution units: the Integer Execution Unit (Ebox) and the Address Generation, Memory Management, Write Buffer, and Bus Interface Unit (Abox). EV4 adds a third, the Floating Point Unit (Fbox); EV3 lacks it, so floating-point operations on EV3 are emulated via PALcode. The Ibox acts as the central control unit.
- Dual Issue: EVx can issue two instructions per cycle to independent units under specific scheduling rules.
- Caches: Both chips have on-chip instruction (I-cache) and data (D-cache) caches, which are physical, direct-mapped, and use 32-byte blocks.
  - EV4: 8KB D-cache (write-through, read-allocate), 8KB I-cache (with ASN and branch history support, stream buffer).
  - EV3: 1KB D-cache, 1KB I-cache (no ASN, ASM, or history; requires software flushing for coherence).
- Memory Management: Features on-chip Translation Buffers (TB).
  - I-stream TB (ITB): 8-entry fully associative (8KB pages).
  - D-stream TB (DTB): 32-entry fully associative (8KB pages), plus a 4-entry TB for large pages (512 * 8KB = 4MB each).
- Write Buffer: Buffers store data so that the CPU does not stall on writes, and aggregates data for the external cache. EV3 has 8 entries; EV4 has 4, with more complex flow control for better utilization.
- Pipeline: EV4 has a 7-stage integer pipeline and a 10-stage floating-point pipeline. The document details scheduling, issue rules, and handling of pipeline stalls and exceptions.
- Performance Counters (EV4 only): Mechanisms to count hardware events and trigger interrupts for performance analysis. Not present in EV3.
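The cache and page sizes listed above determine how an address splits into fields. The following is a minimal sketch of that arithmetic, not the chips' actual logic: the function names are hypothetical, and only the sizes (32-byte blocks, 8KB and 1KB caches, 8KB and 512 * 8KB pages) come from the summary.

```python
# Sketch only: address arithmetic implied by the stated cache and page sizes.
# Function and constant names are illustrative, not taken from the specification.

BLOCK_BYTES = 32                      # block size stated for both chips
PAGE_BYTES = 8 * 1024                 # standard 8KB page
LARGE_PAGE_BYTES = 512 * PAGE_BYTES   # large page: 512 * 8KB = 4MB


def cache_fields(paddr, cache_bytes):
    """Split a physical address into (tag, index, offset) for a
    direct-mapped cache of cache_bytes with 32-byte blocks."""
    num_blocks = cache_bytes // BLOCK_BYTES
    offset = paddr % BLOCK_BYTES
    index = (paddr // BLOCK_BYTES) % num_blocks
    tag = paddr // cache_bytes
    return tag, index, offset


def page_fields(vaddr, page_bytes=PAGE_BYTES):
    """Split a virtual address into (virtual page number, offset in page)."""
    return vaddr // page_bytes, vaddr % page_bytes


if __name__ == "__main__":
    addr = 0x12345
    print("EV4 8KB cache:", cache_fields(addr, 8 * 1024))
    print("EV3 1KB cache:", cache_fields(addr, 1 * 1024))
    print("8KB page:     ", page_fields(addr))
    print("4MB page:     ", page_fields(addr, LARGE_PAGE_BYTES))
```

With an 8KB cache and 32-byte blocks there are 256 blocks, so the index and offset together occupy 13 bits, the same width as the offset within an 8KB page.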
Privileged Architecture Library Code (PALcode):
- PALcode is a key component for implementing ALPHA architecture functions that are not implemented directly in hardware, such as translation buffer fill routines, interrupt handling, and exception dispatch. It runs in a privileged environment with instruction stream mapping and interrupts disabled.
- Special PAL instructions (e.g., HW_MTPR, HW_MFPR, HW_REI, HW_LD, HW_ST) are defined for low-level access to internal processor registers (IPRs) and memory.
- The document provides extensive detail on PALmode restrictions, IPR definitions (for control, status, and error reporting), and error handling flows (ECC errors, parity errors, transaction terminations).
External Interface:
- EVx chips connect directly to external static RAMs, with programmable external cache interfaces.
- The specification defines signals for clocks, reset, the address bus, the data bus (64-bit or 128-bit wide), external cache control, and interrupts.
- Detailed transaction types (e.g., READ_BLOCK, WRITE_BLOCK, FETCH, BARRIER) and acknowledgment protocols are specified.
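As a rough illustration of what the two data bus widths mean for block traffic, the sketch below counts how many bus-wide transfers move one 32-byte cache block. It assumes a block transaction moves exactly one 32-byte block, which this summary does not state explicitly; the function name is hypothetical.

```python
# Sketch only: bus transfers needed per cache block for each bus width.
# Assumes one block transaction moves one 32-byte block (an assumption).

BLOCK_BYTES = 32  # cache block size from the cache description


def transfers_per_block(bus_bits):
    """Return the number of bus-wide transfers needed to move one block."""
    if bus_bits not in (64, 128):
        raise ValueError("the summary lists only 64-bit and 128-bit bus widths")
    return BLOCK_BYTES // (bus_bits // 8)


if __name__ == "__main__":
    for width in (64, 128):
        print(width, "bit bus:", transfers_per_block(width), "transfers per block")
```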
Electrical and AC Characteristics:
- Details power supply (3.3V +/- 5% CMOS mode), reference voltages, and input/output signal characteristics.
- Provides timing parameters for EV4 (6.6ns) and scaling factors for EV3 (1.5x slower).
- Includes a pinout diagram for the 431-pin PGA package.
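The cycle times and the EV3 scaling factor quoted in this summary can be cross-checked with simple arithmetic. The sketch below is only that check: the helper names are hypothetical, and it assumes the 1.5x factor applies uniformly to any EV4 timing parameter, which the summary implies but does not spell out.

```python
# Sketch only: converting cycle time to clock frequency and applying the
# stated 1.5x EV3 scaling factor. Helper names are illustrative.

EV3_SCALE = 1.5  # "1.5x slower" scaling factor from the summary


def frequency_mhz(cycle_ns):
    """Clock frequency in MHz for a given cycle time in nanoseconds."""
    return 1000.0 / cycle_ns


def ev3_time_ns(ev4_time_ns):
    """Scale an EV4 timing parameter to its assumed EV3 equivalent."""
    return ev4_time_ns * EV3_SCALE


if __name__ == "__main__":
    for label, cycle in (("EV4 nominal", 6.6), ("EV4 fastest", 5.0), ("EV3", 10.0)):
        print(f"{label}: {cycle} ns -> {frequency_mhz(cycle):.1f} MHz")
    print(f"EV4 6.6 ns scaled for EV3: {ev3_time_ns(6.6):.1f} ns")
```

Scaling the 6.6ns EV4 cycle by 1.5 gives about 9.9ns, which agrees with the 10ns EV3 cycle time given earlier.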
EV45 Update (Rev 1.1 memo):
This update introduces the EV45 chip, building upon the EV4 with several enhancements:
- Larger Caches: I-cache and D-cache increased to 16KB (from 8KB). D-cache includes new backmap write enable and invalidate request pins.
- Improved FPU: New floating-point divider with significantly lower latency and full IEEE compliance (fixes EV4's inexact flag issue).
- Advanced Branch Prediction: Uses a 4Kx2-bit history table.
- Cache Parity: I-cache and D-cache now include parity protection (tag and data parity bits). I-cache parity errors are recoverable.
- Byte Parity: New mode for byte parity on the external data bus.
- Improved LDx/L and STx/C: New "fast lock" operating mode for better performance.
- External Cache Read: Fixes a design bug in EV4 to support 3-cycle external cache reads.
- Clock Divisor: Wider range of system clock divisors supported.
- Cache Redundancy: The caches are physically implemented as two separate arrays with redundant rows, with "soft fuse" modes for electrical repair.
- IPR and Pinout Changes: Several new bits in control registers and re-purposed spare pins.
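To give a feel for what a 4Kx2-bit history table can do, the sketch below models a classic two-bit saturating counter predictor with 4096 entries. Only the table dimensions come from the memo summary. The counter semantics, the initial state, and the indexing by low instruction-address bits are assumptions made for illustration and may differ from the EV45 design.

```python
# Sketch only: a generic 2-bit saturating-counter branch predictor sized
# 4K x 2 bits. Indexing, initial state, and update policy are assumptions.

TABLE_ENTRIES = 4 * 1024  # 4K entries, each a 2-bit counter (values 0..3)


class TwoBitPredictor:
    def __init__(self):
        # Assumption: every counter starts at 1 ("weakly not taken").
        self.table = [1] * TABLE_ENTRIES

    def _index(self, pc):
        # Instructions are 4 bytes wide, so the low two address bits are
        # dropped before selecting an entry (indexing scheme is assumed).
        return (pc >> 2) % TABLE_ENTRIES

    def predict(self, pc):
        """Predict taken when the counter is in one of its upper two states."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter one step toward the actual outcome, saturating."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


if __name__ == "__main__":
    predictor = TwoBitPredictor()
    branch_pc = 0x1000
    for outcome in (True, True, False, True):
        print("predicted taken:", predictor.predict(branch_pc), "actual:", outcome)
        predictor.update(branch_pc, outcome)
```

With two bits per entry, a single mispredicted iteration (for example, a loop exit) does not immediately flip the prediction for the next execution of the same branch.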
In short, this document is a comprehensive guide for hardware and software engineers working with Digital Equipment Corporation's EV3, EV4, and EV45 ALPHA microprocessors, covering their features, interfaces, and operational details.