The "Alpha Architecture Handbook, Version 4" (October 1998), published by Compaq Computer Corporation, provides a comprehensive overview of the Alpha 64-bit load/store RISC architecture.
Core Design Principles:
- True 64-bit Architecture: All registers are 64 bits wide, and operations are performed between 64-bit registers. Memory is accessed via 64-bit virtual byte addresses, supporting both little-endian and optional big-endian byte numbering.
- High Performance Focus: Designed for very high-speed implementations, emphasizing clock speed, multiple instruction issue, and multiprocessor support. Its simple, fixed 32-bit instruction length and lack of special registers or condition codes facilitate efficient pipelining.
- Load/Store Model: All data manipulation is done between registers; memory is accessed only through load and store instructions.
- Multiprocessor Shared Memory: Utilizes load_locked and store_conditional sequences for atomic operations and explicit Memory Barrier (MB) instructions to enforce strict ordering of shared memory accesses, which scales well with fast caches.
- Performance Hints: Includes instruction-level hints for improved performance, such as target hints for jump instructions, memory prefetching, and granularity hints for virtual addressing.
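The load_locked/store_conditional pattern mentioned above can be sketched with a toy software model. This is a minimal illustration, not the handbook's definition: the class `ToyMemory`, the helper `atomic_increment`, and the `interfere` callback are hypothetical names, and the model tracks a lock per exact address, whereas real implementations track an implementation-dependent locked range.

```python
class ToyMemory:
    """Toy shared memory with one lock flag and locked address per processor.

    Hypothetical model for illustration only; not an Alpha implementation.
    """

    def __init__(self):
        self.cells = {}        # address -> value
        self.lock_flag = {}    # cpu id -> bool
        self.locked_addr = {}  # cpu id -> address

    def load_locked(self, cpu, addr):
        # Record the address and set this processor's lock flag.
        self.lock_flag[cpu] = True
        self.locked_addr[cpu] = addr
        return self.cells.get(addr, 0)

    def store(self, cpu, addr, value):
        # A store by one processor clears the lock flag of every other
        # processor that holds a lock on the same address.
        self.cells[addr] = value
        for other, locked in self.locked_addr.items():
            if other != cpu and locked == addr:
                self.lock_flag[other] = False

    def store_conditional(self, cpu, addr, value):
        # The store happens only if the lock from load_locked is intact.
        ok = self.lock_flag.get(cpu, False) and self.locked_addr.get(cpu) == addr
        if ok:
            self.store(cpu, addr, value)
        self.lock_flag[cpu] = False
        return ok


def atomic_increment(mem, cpu, addr, interfere=None):
    """Retry load_locked/store_conditional until the update succeeds.

    `interfere` is a hypothetical hook that runs once, between the load and
    the conditional store of the first attempt, to simulate another processor.
    Returns the number of attempts needed.
    """
    attempts = 0
    while True:
        attempts += 1
        old = mem.load_locked(cpu, addr)
        if interfere is not None and attempts == 1:
            interfere()
        if mem.store_conditional(cpu, addr, old + 1):
            return attempts
```

With no interference the increment succeeds on the first attempt. If another processor stores to the same address between the load and the conditional store, the conditional store fails and the sequence is retried with the fresh value.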
Key Architectural Components:
- Registers: Features 32 64-bit integer registers (R31 reads as zero), 32 64-bit floating-point registers (F31 reads as zero), a Program Counter (PC), Lock Registers for atomic operations, and a Processor Cycle Counter (PCC). Optional registers for memory prefetch and VAX compatibility are also defined.
- Data Types: Supports Byte (8-bit), Word (16-bit), Longword (32-bit), and Quadword (64-bit) integers. For floating-point, it supports VAX (F_floating, G_floating, D_floating) and IEEE (single S_floating, double T_floating, extended X_floating) formats, along with longword and quadword integer formats held in the floating-point registers.
- Instruction Set: Classified into five basic formats: Memory, Branch, Operate, Floating-Point Operate, and PALcode. The handbook details instructions for various operations including Load/Store, Control (branches, jumps), Integer Arithmetic (e.g., add, subtract, multiply, compare, bit counts, but no integer divide), Logical/Shift, Byte Manipulation, Floating-Point (full VAX and IEEE sets, conversions), and Miscellaneous functions (AMASK, EXCB, TRAPB, WMB, etc.). It also includes specific VAX Compatibility and Multimedia (graphics and video) instructions.
- Floating-Point Operations: Supports both VAX and IEEE computational models, with various rounding and trapping modes. The Floating-Point Control Register (FPCR) is crucial for managing these settings and handling exceptions.
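Because every instruction is a fixed 32 bits, decoding reduces to extracting bit fields. The sketch below splits a word into the fields of the Memory, Branch, and integer Operate formats. The helper names are hypothetical, and the field positions and the two sample opcodes (0x29 for LDQ, 0x10 with function 0x20 for ADDQ) are stated from memory of the handbook's format and opcode tables, so they should be checked against those tables before being relied on.

```python
def bits(word, hi, lo):
    """Return the field word<hi:lo> (inclusive) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value, width):
    """Interpret a `width`-bit unsigned field as a two's-complement number."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def decode_memory(word):
    # Memory format: opcode<31:26> Ra<25:21> Rb<20:16> disp<15:0>
    return {
        "opcode": bits(word, 31, 26),
        "ra": bits(word, 25, 21),
        "rb": bits(word, 20, 16),
        "disp": sign_extend(bits(word, 15, 0), 16),
    }


def decode_branch(word):
    # Branch format: opcode<31:26> Ra<25:21> disp<20:0>
    return {
        "opcode": bits(word, 31, 26),
        "ra": bits(word, 25, 21),
        "disp": sign_extend(bits(word, 20, 0), 21),
    }


def decode_operate(word):
    # Operate format: opcode<31:26> Ra<25:21> ... function<11:5> Rc<4:0>
    # Bit 12 selects an 8-bit literal<20:13> instead of register Rb<20:16>.
    fields = {
        "opcode": bits(word, 31, 26),
        "ra": bits(word, 25, 21),
        "function": bits(word, 11, 5),
        "rc": bits(word, 4, 0),
    }
    if bits(word, 12, 12):
        fields["literal"] = bits(word, 20, 13)
    else:
        fields["rb"] = bits(word, 20, 16)
    return fields


# Sample words built from the assumed opcodes above.
LDQ_R1_8_R2 = (0x29 << 26) | (1 << 21) | (2 << 16) | 8                # LDQ R1, 8(R2)
ADDQ_R1_R2_R3 = (0x10 << 26) | (1 << 21) | (2 << 16) | (0x20 << 5) | 3  # ADDQ R1, R2, R3
```

Calling `decode_memory(LDQ_R1_8_R2)` yields the opcode, both register numbers, and the signed displacement; `decode_operate(ADDQ_R1_R2_R3)` yields the register form because bit 12 is clear.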
System Architecture & Programming Implications:
- Memory Coherency and Ordering: Defines strict rules for how memory accesses are observed across multiple processors and I/O devices, emphasizing the need for explicit memory barriers to ensure predictable ordering.
- Trap Handling: Explains how arithmetic traps are handled in a pipelined environment, introducing concepts like "trap shadow" and detailing the role of the /S qualifier and TRAPB instruction for precise exception completion.
- Software Optimization: Provides guidelines for compilers and programmers to maximize performance, focusing on data and instruction alignment, branch prediction strategies, instruction scheduling, and synchronization for shared data structures.
- UNPREDICTABLE/UNDEFINED Behavior: Clearly defines and distinguishes these terms, which are critical for understanding the architecture's guarantees regarding security and system stability.
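The need for explicit memory barriers can be illustrated with a toy model of one processor publishing data behind a flag. Everything here is an assumption made for illustration: `ToyCPU`, its write buffer, and the newest-first drain order are hypothetical and chosen only to make reordering visible. They do not describe any real Alpha implementation.

```python
class ToyCPU:
    """Toy processor whose stores sit in a write buffer before reaching memory.

    Hypothetical model: the buffer drains newest-first, which is a worst case
    picked to expose reordering, not a description of real hardware.
    """

    def __init__(self, memory):
        self.memory = memory
        self.write_buffer = []  # pending (address, value) pairs

    def store(self, addr, value):
        self.write_buffer.append((addr, value))

    def drain_one(self):
        # Make the most recently buffered store visible in shared memory.
        addr, value = self.write_buffer.pop()
        self.memory[addr] = value

    def mb(self):
        # Barrier: every earlier store becomes visible before any later one.
        while self.write_buffer:
            self.drain_one()


def observe_after_partial_drain(use_barrier):
    """Return (flag, data) as seen by an observer after one store has drained."""
    memory = {"data": 0, "flag": 0}
    cpu = ToyCPU(memory)
    cpu.store("data", 42)
    if use_barrier:
        cpu.mb()
    cpu.store("flag", 1)
    cpu.drain_one()
    return memory["flag"], memory["data"]
```

Without the barrier, the observer can see the flag set while the data is still stale. With the barrier between the two stores, the data is visible before the flag. The model covers only the writing side; a reader on a real system would also need to order its load of the flag before its load of the data.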
PALcode (Privileged Architecture Library):
- A flexible, operating system-specific firmware layer that implements complex, atomic, and privileged functions. It handles low-level hardware interactions (e.g., context switching, memory management, interrupts, exceptions) that are not directly implemented in hardware.
- Enables the Alpha architecture to support diverse operating systems, including OpenVMS Alpha, Digital UNIX, and Windows NT Alpha, by abstracting implementation differences.
The document also includes detailed tables summarizing instruction formats, opcodes, and bit assignments, along with specific architectural waivers and implementation-dependent functionality for various DECchip processors (e.g., regarding IEEE divide and write buffer behavior).