This document details the architecture and design of Digital's HS-series StorageWorks array controllers (HSJ30, HSJ40, HSD30, HSZ40), which aimed to provide high-performance, highly available, and reliable storage solutions for both open systems and Digital's proprietary environments.
The controllers were designed with three primary goals:
Open Systems Capability: The controllers support industry-standard SCSI-2 devices (disk, tape, and optical). On the host side they support Digital's proprietary CI and DSSI interconnects and, on the HSZ40, an industry-standard SCSI-2 host interconnect for attaching non-Digital computers. Building on standard interfaces protects customer investment and reduces cost.
High Availability: This was achieved through:
- Controller Fault Tolerance: In a dual-redundant, dual-active configuration, either controller can take over the devices and cache of a failed partner. The design also includes hot-swappable components, hardware error correction (ECC) for memory, redundant power supplies and fans in the enclosures, and a "KILL" signal that supports controlled failover and prevents data corruption.
- Storage Fault Tolerance (RAID): The controllers implement "Parity RAID", a refined RAID Level 5 with dynamic update algorithms and write-back caching. It addresses two common RAID 5 weaknesses: the "small-write penalty" and the "write hole" (data loss when power fails during an update). Both remedies depend on a battery-protected nonvolatile cache, which preserves data integrity. Parity RAID recovery is automated and draws replacement disks from shared "spare pools" for rapid reconstruction, which reduces downtime.
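The parity arithmetic behind RAID Level 5 style protection can be sketched briefly. The following is a minimal illustration, not the controllers' firmware: the function names (xor_blocks, full_parity, small_write_parity, reconstruct) and the tiny two-byte blocks are assumptions made for the example. It shows why a small write normally costs extra I/O (the old data and old parity must be read before the new parity can be computed) and how a lost block is rebuilt from the survivors.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of one or more equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_parity(data_blocks: list[bytes]) -> bytes:
    """Parity for a whole stripe: XOR of every data block."""
    return xor_blocks(*data_blocks)


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write update: needs the old data and old parity first."""
    return xor_blocks(old_parity, old_data, new_data)


def reconstruct(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and parity."""
    return xor_blocks(parity, *surviving_blocks)


# Hypothetical three-disk stripe with two-byte blocks.
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_parity(stripe)

# Losing the middle block: the survivors plus parity recover it.
recovered = reconstruct([stripe[0], stripe[2]], parity)

# Overwriting the middle block: the incremental update matches a full recompute.
new_block = b"\xaa\xbb"
updated_parity = small_write_parity(parity, stripe[1], new_block)
```

The incremental update and the full recompute give the same parity, but the incremental path requires two reads before its two writes. That extra work is the small-write penalty that write-back caching is meant to hide.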
High Performance: Performance goals for throughput, latency, and data transfer rates were met through:
- Command Processing: Streamlined firmware reduces latency and allows for high parallelism and multiple outstanding commands.
- Caching: The cache was initially read-only. Write-back caching with battery backup was implemented later to dramatically reduce latency for write requests, especially in Parity RAID configurations, bringing performance close to that of RAID Level 0 (striping).
- Write Aggregation: Firmware techniques (contiguous, vertical, and horizontal aggregation) combine multiple write requests into fewer operations, so that Parity RAID writes approach RAID Level 3 performance.
- Hardware Acceleration: A dedicated FX gate array performs XOR operations on the fly, accelerating Parity RAID parity calculations and data comparisons.
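A rough sketch can show why aggregating cached writes helps Parity RAID. It assumes that horizontal aggregation means collecting dirty blocks that together cover every data disk in a stripe, so parity can be computed from the new data alone, as in RAID Level 3. The function plan_stripe_writes, its (stripe, disk) input format, and the I/O counts are hypothetical and chosen only for illustration. Partial stripes are assumed to fall back to a plain read-modify-write.

```python
from collections import defaultdict


def plan_stripe_writes(pending, data_disks):
    """Group dirty cached blocks by stripe and estimate the disk I/O needed.

    pending    -- iterable of (stripe_number, data_disk_index) pairs
    data_disks -- number of data disks per stripe (parity adds one more)

    Returns {stripe_number: (strategy, reads, writes)}.
    """
    by_stripe = defaultdict(set)
    for stripe, disk in pending:
        if not 0 <= disk < data_disks:
            raise ValueError(f"disk index {disk} out of range")
        # A set collapses repeated writes to the same block into one.
        by_stripe[stripe].add(disk)

    plan = {}
    for stripe, disks in sorted(by_stripe.items()):
        dirty = len(disks)
        if dirty == data_disks:
            # Every data block is new: parity comes from the cached data,
            # so no reads are needed before writing data plus parity.
            plan[stripe] = ("full-stripe", 0, data_disks + 1)
        else:
            # Assumed fallback: read old data and old parity, then write
            # the new data blocks and the updated parity.
            plan[stripe] = ("read-modify-write", dirty + 1, dirty + 1)
    return plan


# Hypothetical workload on a stripe set with three data disks.
pending_writes = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 1)]
plan = plan_stripe_writes(pending_writes, data_disks=3)
for stripe, (strategy, reads, writes) in plan.items():
    print(stripe, strategy, reads, writes)
```

In this model, stripe 0 is written with no preliminary reads because all of its data blocks are already in cache, while stripe 1 still pays for reading the old data and parity.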
The HS-series controllers share a common hardware core built around an Intel i960CA microprocessor, with a shared memory architecture and PCMCIA flash for program storage. The result is a cost-effective, high-performance controller that is significantly smaller than previous Digital controllers.