This issue of the Digital Technical Journal, Volume 6 Number 3 (Summer 1994), focuses on Digital Equipment Corporation's Alpha systems, particularly advancements in multiprocessing and optimizations for scientific computing.
The document covers:
AlphaServer Multiprocessing Systems:
- Design: Introduces the AlphaServer 2100 (large pedestal) and 2000 (small pedestal) as high-performance multiprocessor servers. These systems combine Alpha RISC technology with PC-style I/O (PCI and EISA buses). The architecture supports up to four processors and 2 GB of memory, and uses a 128-bit system bus designed to serve multiple Alpha processor generations. The design goal was price/performance leadership while supporting three operating systems: Windows NT, DEC OSF/1, and OpenVMS.
- I/O Subsystem: Details the hierarchical, two-level I/O structure. It uses a custom T2 bridge chip (connecting the system bus to PCI) and an Intel PCI-to-EISA bridge chip set. Techniques such as data rate isolation, disconnected transactions, and a specialized I/O interrupt scheme keep data transfers efficient and concurrent across buses of different speeds, even though the system also supports older PC-standard peripherals. A "Standard I/O Module" integrates common PC functions and makes later hardware modifications easier.
DEC OSF/1 Symmetric Multiprocessing (SMP):
- Implementation: Describes the significant software work involved in adapting DEC OSF/1 Version 3.0, Digital's UNIX implementation, for symmetric multiprocessing on AlphaServer systems. The primary aim was to ensure high performance, reliability, and scalability.
- Key Challenges & Solutions: Addresses the issues that arise when a uniprocessor operating system moves to a shared-memory SMP platform. The work included lock-based synchronization (simple and complex locks, elevated SPL, and funneling), increased parallelism within the kernel, a robust lock package with debugging features, scheduler changes ("soft affinity" and "load balancing") to improve CPU utilization and cache efficiency, and an efficient TLB (translation lookaside buffer) shootdown algorithm.
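The "soft affinity" and "load balancing" ideas mentioned above can be illustrated with a small sketch. This is not the DEC OSF/1 scheduler: the names `Thread`, `Cpu`, `pick_cpu`, and `enqueue`, and the `max_imbalance` threshold, are hypothetical and chosen only for illustration. The sketch assumes that a thread prefers the CPU it last ran on (where its cache contents may still be warm), and that this preference is dropped when that CPU's run queue is much longer than the shortest one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    name: str
    last_cpu: Optional[int] = None  # CPU the thread last ran on, if any


@dataclass
class Cpu:
    cpu_id: int
    run_queue: List[Thread] = field(default_factory=list)


def pick_cpu(thread: Thread, cpus: List[Cpu], max_imbalance: int = 2) -> Cpu:
    """Choose a CPU for a runnable thread.

    Assumes cpus[i].cpu_id == i. The thread's last CPU is preferred
    (soft affinity) unless its queue exceeds the shortest queue by more
    than max_imbalance (a hypothetical load-balancing threshold).
    """
    # Shortest run queue; ties go to the lowest CPU number.
    least_loaded = min(cpus, key=lambda c: (len(c.run_queue), c.cpu_id))
    if thread.last_cpu is not None:
        preferred = cpus[thread.last_cpu]
        gap = len(preferred.run_queue) - len(least_loaded.run_queue)
        if gap <= max_imbalance:
            return preferred  # affinity wins: the load is close enough
    return least_loaded  # balance wins: migrate to the emptiest CPU


def enqueue(thread: Thread, cpus: List[Cpu], max_imbalance: int = 2) -> int:
    """Place the thread on a run queue and record where it went."""
    cpu = pick_cpu(thread, cpus, max_imbalance)
    cpu.run_queue.append(thread)
    thread.last_cpu = cpu.cpu_id
    return cpu.cpu_id
```

The affinity is "soft" in the sense that it is only a preference: a large enough load imbalance overrides it, so no CPU sits idle while another has a long queue.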
Scientific Computing Optimizations for Alpha:
- DXML (Digital eXtended Math Library): A high-performance scientific subroutine library optimized for Alpha systems. It accelerates applications by providing routines for numerically intensive operations, including the public-domain BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) libraries, as well as Digital's proprietary signal processing routines and sparse linear solvers. DXML obtains substantial performance gains by exploiting the memory hierarchy of Alpha systems.
- KAP Parallelizer: A preprocessor for DEC Fortran and DEC C programs that acts as a "super-optimizer," performing source-level optimizations beyond those the compilers apply. KAP parallelizes programs either implicitly (automatically) or explicitly (guided by X3H5 directives), using advanced data dependence analysis and loop transformations (e.g., loop interchange and fusion) to take full advantage of SMP systems. It also addresses efficient thread creation, synchronization, and data caching.
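The two loop transformations named above, interchange and fusion, can be shown on toy loops. The following is a minimal sketch, not KAP output: the function names are hypothetical, and Python is used only to show that the transformed loops compute the same results. The cache benefit itself applies to compiled code, where the inner loop should walk memory contiguously (along rows in C's row-major layout, along columns in Fortran's column-major layout).

```python
def column_order_sum(a, n, m):
    """Original nest: the inner loop walks down a column of a[i][j]."""
    total = [0.0] * n  # total[i] accumulates the sum of row i
    for j in range(m):
        for i in range(n):
            total[i] += a[i][j]
    return total


def row_order_sum(a, n, m):
    """After loop interchange: the inner loop walks along a row.

    The interchange is legal here because each total[i] still receives
    the same values in the same order (j = 0 .. m-1).
    """
    total = [0.0] * n
    for i in range(n):
        for j in range(m):
            total[i] += a[i][j]
    return total


def unfused(x):
    """Two separate loops: y is fully written, then read again for z."""
    y = [0.0] * len(x)
    z = [0.0] * len(x)
    for i in range(len(x)):
        y[i] = 2.0 * x[i]
    for i in range(len(x)):
        z[i] = y[i] + 1.0
    return y, z


def fused(x):
    """After loop fusion: one pass, y[i] is reused while still at hand.

    Fusion is legal because z[i] depends only on y[i] from the same
    iteration, so no iteration needs a value produced by a later one.
    """
    y = [0.0] * len(x)
    z = [0.0] * len(x)
    for i in range(len(x)):
        y[i] = 2.0 * x[i]
        z[i] = y[i] + 1.0
    return y, z
```

Both transformations depend on the data dependence analysis mentioned above: a tool may apply them only when it can show that the reordered iterations read and write the same values as the original.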
In essence, this issue highlights Digital's commitment to open, high-performance, cost-effective Alpha systems. It documents the engineering effort in both hardware (AlphaServer design and I/O subsystem) and software (DEC OSF/1 SMP, scientific computing libraries, and parallelizers) needed to realize the "Alpha vision" and set a new standard of price/performance in the industry.