This guide summarizes the Compaq Availability Manager User's Guide (Version 1.4, April 2001). The Availability Manager is a system management tool designed to detect and correct system availability problems.
Purpose:
The Availability Manager helps system managers monitor one or more OpenVMS nodes across an extended local area network (LAN) from either an OpenVMS or Windows NT node. It collects, analyzes, and displays system and process data, notifies users of performance problems, and provides real-time "fixes" to improve system availability.
Key Components & Operation:
- Data Collector Nodes (OpenVMS Alpha/VAX 6.2+): These nodes run software to collect data from the OpenVMS systems being monitored.
- Data Analyzer Node (Windows NT 4.0 SP3+, Windows 2000, OpenVMS 7.1+): This node runs the software that analyzes the collected data and presents it through a Java-based Graphical User Interface (GUI).
- Communication: The Data Analyzer and Data Collector nodes exchange data using an IEEE 802.3 Extended Packet protocol, so monitoring can continue even if standard network protocols are unavailable.
Core Functionality:
- Problem Detection: The Availability Manager continuously monitors system and process data. It identifies potential issues by comparing each data sample against user-defined thresholds and counting "occurrences" (the number of consecutive samples that exceed a threshold). Detected problems are signaled as "events" in the GUI's Events pane (often highlighted in red) and logged to an event file.
- Data Display: The GUI presents a high-level overview and detailed data for monitored nodes, including CPU usage, memory summaries, disk status and volume information (logical/physical for Windows NT), OpenVMS cluster details, network interconnect data, and specific single process information.
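The threshold-and-occurrence rule described under Problem Detection can be illustrated with a minimal sketch. This is not the Availability Manager's actual implementation. The class name `OccurrenceDetector`, the function `detect_events`, and the sample values are all hypothetical, and the sketch assumes an event is signaled on every sample for which the run of consecutive over-threshold samples has reached the occurrence value.

```python
from dataclasses import dataclass


@dataclass
class OccurrenceDetector:
    """Hypothetical model of threshold + occurrence evaluation."""
    threshold: float   # user-defined threshold for one data item
    occurrences: int   # consecutive over-threshold samples required
    _streak: int = 0   # current run of consecutive over-threshold samples

    def sample(self, value: float) -> bool:
        """Feed one data sample; return True if an event should be signaled."""
        if value > self.threshold:
            self._streak += 1
        else:
            # Any sample at or below the threshold breaks the run.
            self._streak = 0
        return self._streak >= self.occurrences


def detect_events(samples, threshold, occurrences):
    """Return the indexes of the samples at which an event is signaled."""
    detector = OccurrenceDetector(threshold, occurrences)
    return [i for i, value in enumerate(samples) if detector.sample(value)]


# Illustrative CPU-usage samples (percent); threshold 90, occurrence value 3.
cpu_samples = [50, 95, 96, 97, 40, 99, 99]
event_indexes = detect_events(cpu_samples, threshold=90, occurrences=3)
```

With these illustrative values, only the sample at index 3 completes a run of three consecutive readings above 90. The two high readings at the end do not signal an event because the run was reset by the 40 at index 4. Raising the occurrence value therefore trades faster notification for fewer events caused by brief spikes.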
Corrective Actions ("Fixes"): System managers can perform various real-time interventions on OpenVMS nodes to resolve availability issues. These include:
- Node Fixes: Crashing a node or adjusting cluster quorum.
- Process Fixes: Deleting a process, exiting an image, suspending/resuming a process, changing process priority, and adjusting process memory (working set) or resource limits (I/O count, AST queue, open file, lock queue, timer queue, subprocess creation, I/O byte).
Caution: Some fixes can have serious, irreversible repercussions and should only be performed by experienced system managers with appropriate privileges (CMKRNL).
Customization: Users can tailor the manager to their specific needs by:
- Specifying groups or individual nodes to monitor.
- Changing data collection types and intervals (background, foreground, event-triggered).
- Applying data filters (e.g., for CPU, disk status, memory, I/O, lock contention).
- Customizing event display (severity, occurrence values) and associated actions.
Security: Access is controlled using passwords (for Windows NT Data Analyzers and Collectors) and security triplets (for OpenVMS Data Analyzer/Collector nodes), which define network address, password, and read/write access permissions. OpenVMS Data Collector nodes also have file protection and process privilege features.
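The idea of a security triplet (network address, password, access permission) can be sketched as a small data structure with a matching check. This is an illustration only. The backslash-separated text layout, the single-letter `R`/`W` access codes, the rule that write access implies read access, and the names `SecurityTriplet`, `parse_triplet`, and `allows` are all assumptions made for this example, not the product's documented file format or matching logic.

```python
from typing import NamedTuple


class SecurityTriplet(NamedTuple):
    """Hypothetical in-memory form of one security triplet."""
    network_address: str
    password: str
    access: str  # assumed codes: "R" = read, "W" = write


def parse_triplet(line: str, sep: str = "\\") -> SecurityTriplet:
    """Parse 'address<sep>password<sep>access' (assumed layout)."""
    parts = [part.strip() for part in line.strip().split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    address, password, access = parts
    access = access.upper()
    if access not in {"R", "W"}:
        raise ValueError(f"unknown access code: {access!r}")
    return SecurityTriplet(address, password, access)


def allows(triplet: SecurityTriplet, address: str, password: str,
           wants_write: bool) -> bool:
    """Check a request against one triplet.

    Assumption: write access implies read access.
    """
    if address != triplet.network_address or password != triplet.password:
        return False
    return triplet.access == "W" or not wants_write


# Illustrative values only; the address and password are made up.
read_only = parse_triplet("08-00-2B-12-34-56\\EXAMPLEPW\\R")
can_read = allows(read_only, "08-00-2B-12-34-56", "EXAMPLEPW", wants_write=False)
can_fix = allows(read_only, "08-00-2B-12-34-56", "EXAMPLEPW", wants_write=True)
```

In this sketch, a Data Analyzer that matches a read-only triplet could display data but could not perform fixes, which matches the split between viewing data and taking corrective action described above.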
Intended Audience:
The User's Guide is intended for system managers who are responsible for installing and using the Compaq Availability Manager software. It assumes familiarity with Windows operating system terms and functions.