Parallel Processing: The Cm* Experience

Order Number: EY-6706E-DP

This document, "Parallel Processing: The Cm* Experience," provides a comprehensive account of a decade-long research project (1977-1986) at Carnegie-Mellon University focused on exploring parallel computation. The Cm* system was a pioneering experimental MIMD (Multiple Instruction, Multiple Data) multiprocessor, eventually scaled to 50 processors organized into five clusters.

The book is structured into four main parts:

  1. The Cm* Hardware: It details the hierarchical architecture of Cm*, which consists of individual Computer Modules (Cm's: processor-memory pairs) connected via local switches (Slocals) to microprogrammable communication controllers (Kmaps) that manage clusters. Communication between Cm's within a cluster or across clusters is handled by Kmaps through packet switching. Key measurements characterize the performance of local versus non-local memory access, illustrating the trade-offs in communication overhead. Reliability studies revealed that transient errors were significantly more frequent than permanent ones and followed a Weibull distribution with a decreasing failure rate, contrary to then-standard assumptions.

  2. Operating Systems: Two distinct operating systems, MEDUSA and STAROS, were developed and evaluated on Cm*. Both adopted principles like modularity, robustness, and policy/mechanism separation.

    • MEDUSA was designed for maximizing performance, closely reflecting the hardware structure. It used a message-based communication system with value-based message passing and implemented protection via descriptors and an "amplification" mechanism for utilities. It focused heavily on robust error handling, including the use of "co-objects" (shadows and standby copies) and "buddy activities" for recovery.
    • STAROS aimed for a more general and flexible multiprocessor environment, abstracting the hardware to a symmetrical global shared-memory model. It utilized a capability-based addressing scheme and primarily used reference-based message passing. STAROS also implemented a parallel garbage collection system. Performance comparisons between the two OS kernels and message systems highlight the trade-offs between functional richness, protection, and speed, often showing MEDUSA to be faster due to its closer hardware coupling and extensive microcoding.
  3. Programming Environments: The book discusses the software tools and languages developed to facilitate parallel programming on Cm*.

    • TASK and MEDLINK were languages used to describe the macro-static structure of parallel programs ("task forces"), their communication patterns, and resource-allocation directives, effectively serving as "blueprints" for parallel computations.
    • AMPL (A MultiProcessing Language) was an experimental high-level language emphasizing message passing for interprocess communication and dynamic process creation, operating on its own run-time system atop MEDUSA.
    • Other environments like NEST (for running early benchmarks) and ECHOES (for fast procedure calls and forks with shared memory) are also described, showcasing the flexibility of the Cm* architecture. An Integrated Instrumentation Environment (IIE) was developed to aid in experiment design, data collection, and analysis.
  4. Experiments: A significant portion of the book is dedicated to the experimental results obtained from running various parallel algorithms and emulating different multiprocessor architectures on Cm*.

    • Parallel Algorithms: Performance was primarily measured using "speedup" (how much faster a computation runs on N processors than on one). The experiments analyzed factors limiting speedup, such as algorithm penalties (separation and reconstitution overhead) and implementation penalties (access and contention overhead). Case studies included algorithms for partial differential equations (PDE), quicksort, integer programming, molecular motion simulations, design-rule checking, Ada rendezvous, and transaction processing. Notably, some algorithms, particularly searches, occasionally exhibited "superlinear speedup."
    • Multiprocessor Architecture: Cm* served as a testbed to evaluate different architectural designs. Experiments investigated methods for accurate time measurement in distributed systems, the impact of software voting for reliability (N-Modular Redundancy), and the emulation of various interconnection network topologies (e.g., tree, ring, mesh, hypercube, fully connected) to compare their performance.
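The three-level memory hierarchy described in Part 1 (local, intracluster via the Kmap, intercluster) can be modeled with a simple cost sketch. The latency values and the `mean_reference_cost` helper below are illustrative assumptions, not the book's measurements; only the 1:3:9 shape of the hierarchy is meant to echo the local-versus-non-local trade-off discussed above.

```python
# Hypothetical relative latencies for one memory reference at each level
# of the Cm* hierarchy (illustrative assumptions, not measured values).
LATENCY_US = {"local": 1.0, "intracluster": 3.0, "intercluster": 9.0}

def mean_reference_cost(mix):
    """Average cost of one memory reference for a given mix of access
    types, where `mix` maps each level to its fraction of references."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9, "fractions must sum to 1"
    return sum(frac * LATENCY_US[kind] for kind, frac in mix.items())

# A program making 80% local, 15% intracluster, and 5% intercluster
# references pays only a modest premium over purely local access:
cost = mean_reference_cost(
    {"local": 0.80, "intracluster": 0.15, "intercluster": 0.05}
)
```

Even a small fraction of intercluster references raises the average noticeably, which is why data placement mattered so much on Cm*.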
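The contrast drawn in Part 2 between MEDUSA's value-based and STAROS's reference-based message passing can be sketched in a few lines. This is a minimal illustration of the two semantics, not the actual MEDUSA or STAROS APIs; the function names and the list-backed mailbox are invented for the example.

```python
import copy

def send_by_value(mailbox, message):
    """Value-based style (MEDUSA-like): the message contents are copied
    into the mailbox, so sender and receiver share no mutable state."""
    mailbox.append(copy.deepcopy(message))

def send_by_reference(mailbox, message):
    """Reference-based style (STAROS-like): only a reference is enqueued;
    sender and receiver see the same underlying object."""
    mailbox.append(message)

msg = {"payload": [1, 2, 3]}
box_v, box_r = [], []
send_by_value(box_v, msg)
send_by_reference(box_r, msg)

msg["payload"].append(4)  # sender mutates the message after sending:
# the copied message is unaffected; the referenced one sees the change.
```

The trade-off mirrors the book's findings: copying is simpler to reason about and easier to protect, while passing references avoids copy overhead but couples the communicating parties.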
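The speedup metric central to Part 4 can be made concrete with a short worked example. The definitions below follow the standard formulation (speedup as single-processor time over N-processor time); the sample timings are hypothetical, chosen only to show how the overheads discussed above keep speedup below the ideal value of N.

```python
def speedup(t1, tn):
    """Speedup: elapsed time on one processor divided by elapsed time
    on N processors. Ideal (linear) speedup on N processors is N."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Efficiency: speedup per processor; 1.0 means perfectly linear."""
    return speedup(t1, tn) / n

# Hypothetical timings (seconds): access and contention overheads grow
# with N, so speedup falls increasingly short of linear.
t1 = 100.0
timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
for n, tn in timings.items():
    print(f"N={n}: speedup={speedup(t1, tn):.2f}, "
          f"efficiency={efficiency(t1, tn, n):.2f}")
```

A "superlinear" result, as the book reports for some searches, would show speedup greater than N, typically because the parallel version explores a search space in a luckier order or fits more of its working set in fast memory.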

In summary, "Parallel Processing: The Cm* Experience" is a detailed record of a pioneering effort in parallel computing. It provides deep insights into the challenges and solutions related to hardware design, operating system development, programming languages, and performance evaluation for multiprocessor systems, significantly influencing subsequent research and commercial parallel computer designs.

1987
473 pages
