Document:	HSC70 Service Manual
Order Number:	EK-HSC70-SV
Revision:	002
Pages:	591
EK-HSC70-SV-002

HSC70
Service Manual

Prepared by Educational Services
Digital Equipment Corporation

First Edition, March 1986
Second Edition, September 1986

Copyright (c) Digital Equipment Corporation 1985
All Rights Reserved
Printed in USA
The material in this manual is for informational purposes and is
subject to change without notice.
Digital Equipment Corporation assumes no responsibility for any
errors which may appear in this manual.
The HSC70 Mass Storage Server is designed to work with Digital
Equipment Corporation host computers, tape products, and disk
products. Digital Equipment Corporation assumes no
responsibility or liability if the computers, tape products, or
disk products of another manufacturer are used with the HSC70
subsystem.
Class A Computing Devices:

NOTICE: This equipment generates, uses, and may emit radio
frequency energy. This equipment has been type tested and found
to comply with the limits for a class A computing device pursuant
to Subpart J of Part 15 of FCC rules designed to provide
reasonable protection against such radio frequency interference
when operated in a commercial environment. Operation of this
equipment in a residential area may cause interference. In such
a case, the owner, at his own expense, may be required to take
measures to correct the interference.
The following are trademarks of Digital Equipment Corporation,
Maynard, Massachusetts:
DEC
DECUS
DIGITAL
logo
PDP
UNIBUS
RA8!

VAX
RA91
HSC
RA81

DECnet
DECsystem-10
DECSYSTEM-20
DECwriter
DIBOL
EduSystem
RA60
VT
RA60
RA80

OMNIBUS
OS/8
PDT
RSTS
RSX
VMS
UDA50
KDA50-Q
IAS

TOPS-20

CONTENTS

CHAPTER 1  GENERAL INFORMATION

1.1 INTRODUCTION
1.2 GENERAL INFORMATION
  1.2.1 HSC70 Cabinet Layout
  1.2.2 External Interfaces
  1.2.3 Internal Software
  1.2.4 Subsystem Block Diagram
1.3 MODULE DESCRIPTIONS
  1.3.1 Port Link Module (LINK) Functions
  1.3.2 Port Buffer Module (PILA) Functions
  1.3.3 Port Processor Module (K.pli) Functions And Interfaces
  1.3.4 Disk Data Channel Module (K.sdi) Functions
  1.3.5 Tape Data Channel Module (K.sti) Functions
  1.3.6 Input/Output (I/O) Control Processor Module (P.ioj) Functions
  1.3.7 Memory Module (M.std2) Functions
1.4 HSC70 MAINTENANCE STRATEGY
  1.4.1 Maintenance Features
  1.4.2 HSC70 Specifications
1.5 HSC70 RELATED DOCUMENTATION

CHAPTER 2  HSC70 CONTROLS/INDICATORS

2.1 INTRODUCTION
2.2 OPERATOR CONTROL PANEL (OCP)
2.3 INSIDE FRONT DOOR CONTROLS/INDICATORS
2.4 MODULE INDICATORS AND SWITCHES
  2.4.1 Module Switches
2.5 POWER CONTROLLER
  2.5.1 Operating Instructions

CHAPTER 3  REMOVAL AND REPLACEMENT PROCEDURES

3.1 INTRODUCTION
3.2 SAFETY PRECAUTIONS
3.3 POWER REMOVAL
3.4 FIELD REPLACEABLE UNIT (FRU) REMOVAL
  3.4.1 Access From Cabinet Front Door
  3.4.2 Access From Cabinet Back Door
3.5 RX33 COVER PLATE AND DISK DRIVE REMOVAL AND REPLACEMENT
  3.5.1 RX33 Jumper Configuration
3.6 OPERATOR CONTROL PANEL (OCP) REMOVAL AND REPLACEMENT
3.7 LOGIC MODULES REMOVAL AND REPLACEMENT
3.8 BLOWER REMOVAL AND REPLACEMENT
3.9 AIRFLOW SENSOR ASSEMBLY REMOVAL AND REPLACEMENT
3.10 POWER CONTROLLER REMOVAL AND REPLACEMENT
3.11 MAIN POWER SUPPLY REMOVAL AND REPLACEMENT
3.12 AUXILIARY POWER SUPPLY

CHAPTER 4  INITIALIZATION PROCEDURES

4.1 INTRODUCTION
4.2 CONSOLE TERMINAL CONNECTION
4.3 HSC70 INITIALIZATION
  4.3.1 Init P.io Test
    4.3.1.1 Init P.io Test System Requirements
    4.3.1.2 Init P.io Test Prerequisites
    4.3.1.3 Init P.io Test Operation
  4.3.2 Fault Code Interpretation
  4.3.3 Init P.io Test Summaries

CHAPTER 5  INLINE DIAGNOSTICS

5.1 INTRODUCTION
  5.1.1 Inline Diagnostics Commonalities
    5.1.1.1 Inline Diagnostics Generic Error Message Format
5.2 INLINE RX33 DIAGNOSTIC TEST (ILRX33)
  5.2.1 ILRX33 System Requirements
  5.2.2 ILRX33 Operating Instructions
  5.2.3 ILRX33 Test Parameter Entry
  5.2.4 ILRX33 Setting/Clearing
  5.2.5 ILRX33 Progress Reports
  5.2.6 ILRX33 Test Termination
  5.2.7 ILRX33 Error Message Example
  5.2.8 ILRX33 Error Messages
  5.2.9 ILRX33 Test Summary
5.3 INLINE MEMORY TEST (ILMEMY)
  5.3.1 ILMEMY System Requirements
  5.3.2 ILMEMY Operating Instructions
  5.3.3 ILMEMY Progress Reports
  5.3.4 ILMEMY Error Message Example
  5.3.5 ILMEMY Error Messages
  5.3.6 ILMEMY Test Summaries
5.4 INLINE DISK DRIVE DIAGNOSTIC TEST (ILDISK)
  5.4.1 ILDISK System Requirements
  5.4.2 ILDISK Operating Instructions
  5.4.3 ILDISK Availability
  5.4.4 ILDISK Test Parameter Entry
  5.4.5 Specifying Requestor And Port - ILDISK
  5.4.6 ILDISK Progress Reports
  5.4.7 ILDISK Test Termination
  5.4.8 ILDISK Error Message Example
  5.4.9 ILDISK Error Messages
    5.4.9.1 MSCP Status Codes - ILDISK Error Reports
  5.4.10 ILDISK Test Summaries
5.5 INLINE TAPE TEST (ILTAPE)
  5.5.1 ILTAPE System Requirements
  5.5.2 ILTAPE Operating Instructions
  5.5.3 ILTAPE/User Dialogue
  5.5.4 ILTAPE User Sequences
  5.5.5 ILTAPE Progress Reports
  5.5.6 ILTAPE Test Termination
  5.5.7 ILTAPE Error Message Example
  5.5.8 ILTAPE Error Messages
  5.5.9 ILTAPE Test Summaries
    5.5.9.1 K.sti Interface Test Summary
    5.5.9.2 Formatter Diagnostics Test Summary
    5.5.9.3 User Sequences Test Summary
    5.5.9.4 Canned Sequence Test Summary
    5.5.9.5 Streaming Sequence Test Summary
5.6 INLINE TAPE COMPATIBILITY TEST (ILTCOM)
  5.6.1 ILTCOM System Requirements
  5.6.2 ILTCOM Operating Instructions
  5.6.3 ILTCOM Test Parameter Entry
  5.6.4 ILTCOM Test Termination
  5.6.5 ILTCOM Error Message Example
    5.6.5.1 ILTCOM Error Messages
  5.6.6 ILTCOM Test Summaries
5.7 INLINE MULTIDRIVE EXERCISER (ILEXER)
  5.7.1 ILEXER System Requirements
  5.7.2 ILEXER Operating Instructions
  5.7.3 ILEXER Test Parameter Entry
  5.7.4 Disk Drive User Prompts
  5.7.5 Tape Drive User Prompts
  5.7.6 ILEXER Global User Prompts
  5.7.7 ILEXER Data Patterns
  5.7.8 Setting/Clearing Flags - ILEXER
  5.7.9 ILEXER Progress Reports
  5.7.10 ILEXER Data Transfer Error Report
  5.7.11 ILEXER Performance Summary
  5.7.12 ILEXER Communications Error Report
  5.7.13 ILEXER Test Termination
  5.7.14 ILEXER Error Message Format
    5.7.14.1 ILEXER Prompt Error Format
    5.7.14.2 ILEXER Data Transfer Compare Error Format
    5.7.14.3 ILEXER Communications Error Format
  5.7.15 ILEXER Error Messages
    5.7.15.1 ILEXER Informational Messages
    5.7.15.2 ILEXER Generic Errors
    5.7.15.3 ILEXER Disk Errors
    5.7.15.4 ILEXER Tape Errors
  5.7.16 ILEXER Test Summaries

CHAPTER 6  OFFLINE DIAGNOSTICS

6.1 INTRODUCTION
  6.1.1 Offline Diagnostics Software Requirements
  6.1.2 Offline Diagnostics Load Procedure
  6.1.3 P.ioj ROM Bootstrap
    6.1.3.1 Bootstrap Initialization Instructions
    6.1.3.2 Bootstrap Failures
    6.1.3.3 Bootstrap Progress Reports
    6.1.3.4 Bootstrap Error Information
    6.1.3.5 Bootstrap Failure Troubleshooting
  6.1.4 Bootstrap Test Summaries
  6.1.5 Offline Diagnostics Error Reporting And Message Format
6.2 OFFLINE DIAGNOSTICS LOADER
  6.2.1 Offline Diagnostic Loader System Requirements
  6.2.2 Offline Diagnostic Loader Prerequisites
  6.2.3 Operating Instructions For The Offline Diagnostic Loader
  6.2.4 Offline Diagnostic Loader Commands
    6.2.4.1 Offline Diagnostic Loader HELP Command
    6.2.4.2 Offline Diagnostic Loader SIZE Command
    6.2.4.3 Offline Diagnostic Loader TEST Command
    6.2.4.4 Offline Diagnostic Loader LOAD Command
    6.2.4.5 Offline Diagnostic Loader START Command
    6.2.4.6 EXAMINE And DEPOSIT Commands
      6.2.4.6.1 Offline Diagnostic Loader EXAMINE Command
      6.2.4.6.2 Offline Diagnostic Loader DEPOSIT Command
      6.2.4.6.3 Offline Diagnostic Symbolic Addresses
      6.2.4.6.4 Repeating EXAMINE And DEPOSIT Commands
      6.2.4.6.5 Offline Diagnostics Relocation Register
      6.2.4.6.6 Offline Diagnostics EXAMINE And DEPOSIT Qualifiers (Switches)
      6.2.4.6.7 Setting And Showing Defaults
      6.2.4.6.8 Executing INDIRECT Command Files
  6.2.5 Offline Diagnostics Unexpected Traps And Interrupts
    6.2.5.1 Offline Diagnostics Trap And Interrupt Vectors
    6.2.5.2 Offline Diagnostics Loader Help File
6.3 OFFLINE CACHE TEST
  6.3.1 Offline Cache Test System Requirements
  6.3.2 Offline Cache Test Operating Instructions
  6.3.3 Offline Cache Test Parameter Entry
  6.3.4 Offline Cache Test Progress Reports
  6.3.5 Offline Cache Test Error Information
    6.3.5.1 Specific Offline Cache Error Messages
  6.3.6 Offline Cache Test Troubleshooting
  6.3.7 Offline Cache Test Descriptions
6.4 OFFLINE BUS INTERACTION TEST
  6.4.1 Offline Bus Interaction Test System Requirements
  6.4.2 Offline Bus Interaction Test Prerequisites
  6.4.3 Offline Bus Interaction Test Operating Instructions
  6.4.4 Offline Bus Interaction Test Parameter Entry
  6.4.5 Offline Bus Interaction Test Progress Reports
  6.4.6 Offline Bus Interaction Test Error Information
    6.4.6.1 Requestor Error Summary
    6.4.6.2 Offline Bus Interaction Memory Test Configuration
    6.4.6.3 Offline Bus Interaction Test Error Messages
    6.4.6.4 Offline Bus Interaction K Memory Test Algorithm
6.5 OFFLINE K TEST SELECTOR
  6.5.1 Offline K Test Selector System Requirements
  6.5.2 Offline K Test Selector Operating Instructions
  6.5.3 Offline K Test Selector Parameter Entry
  6.5.4 Offline K Test Selector Progress Reports
  6.5.5 Offline K Test Selector Error Information
    6.5.5.1 K.ci Path Status Information
    6.5.5.2 Offline K Test Selector Error Messages
  6.5.6 Offline K Test Selector Summaries
6.6 OFFLINE K/P MEMORY TEST
  6.6.1 Offline K/P Memory Test System Requirements
  6.6.2 Offline K/P Memory Test Operating Instructions
  6.6.3 Offline K/P Memory Test Parameter Entry
  6.6.4 Offline K/P Memory Test Progress Reports
  6.6.5 Offline K/P Memory Test Parity Errors
  6.6.6 Offline K/P Memory Test Error Information
    6.6.6.1 Offline K/P Memory Test Error Summary Information
    6.6.6.2 Offline K/P Memory Test Error Messages
  6.6.7 Offline K/P Memory Test Summaries
6.7 OFFLINE MEMORY TEST
  6.7.1 Offline Memory Test System Requirements
  6.7.2 Offline Memory Test Operating Instructions
  6.7.3 Offline Memory Test Parameter Entry
  6.7.4 Offline Memory Test Progress Reports
  6.7.5 Offline Memory Test Parity Errors
  6.7.6 Offline Memory Test Error Information
    6.7.6.1 Offline Memory Test Error Messages
  6.7.7 Offline Memory Test Summaries
6.8 RX33 OFFLINE EXERCISER
  6.8.1 RX33 Offline Exerciser System Requirements
  6.8.2 RX33 Offline Exerciser Operating Instructions
  6.8.3 RX33 Offline Exerciser Parameter Entry
  6.8.4 RX33 Offline Exerciser Progress Reports
  6.8.5 RX33 Offline Exerciser Error Information
    6.8.5.1 Specific RX33 Offline Exerciser Error Messages
  6.8.6 RX33 Offline Exerciser Test Summaries
  6.8.7 RX33 Offline Exerciser Data Patterns
6.9 OFFLINE REFRESH TEST
  6.9.1 Offline Refresh Test System Requirements
  6.9.2 Offline Refresh Test Operating Instructions
  6.9.3 Offline Refresh Test Parameter Entry
  6.9.4 Offline Refresh Test Progress Reports
  6.9.5 Offline Refresh Test Error Information
    6.9.5.1 Offline Refresh Test Error Messages
  6.9.6 Offline Memory Refresh Test Summaries
6.10 OFFLINE OPERATOR CONTROL PANEL TEST
  6.10.1 Offline Operator Control Panel Test System Requirements
  6.10.2 Operator Control Panel Test Operating Instructions
  6.10.3 Offline Operator Control Panel Test Parameter Entry
  6.10.4 Offline Operator Control Panel Test Error Information
    6.10.4.1 Offline Operator Control Panel Test Error Messages
  6.10.5 Offline Operator Control Panel Test Summaries
  6.10.6 Offline OCP Registers And Displays Via ODT
    6.10.6.1 Offline OCP Test Switch Check Via ODT
    6.10.6.2 Offline OCP Test Lamp Bit Check Via ODT
    6.10.6.3 Offline OCP Test Secure/Enable Switch Check Via ODT
    6.10.6.4 Offline OCP Test State LED Check Via ODT

CHAPTER 7  UTILITIES

7.1 INTRODUCTION
7.2 OFFLINE DISK UTILITY (DKUTIL)
  7.2.1 DKUTIL Initialization
  7.2.2 DKUTIL Command Syntax
  7.2.3 DKUTIL Command Modifiers
  7.2.4 DKUTIL Sample Session
  7.2.5 DKUTIL Command Descriptions
    7.2.5.1 DKUTIL DEFAULT Command
    7.2.5.2 DKUTIL DISPLAY Command
    7.2.5.3 DKUTIL DUMP Command
    7.2.5.4 DKUTIL EXIT Command
    7.2.5.5 DKUTIL GET Command
    7.2.5.6 DKUTIL POP Command
    7.2.5.7 DKUTIL PUSH Command
    7.2.5.8 DKUTIL REVECTOR Command
    7.2.5.9 DKUTIL SET Command
  7.2.6 DKUTIL Error Messages
    7.2.6.1 DKUTIL Error Message Variables
    7.2.6.2 DKUTIL Error Message Severity Levels
7.3 OFFLINE DISK VERIFIER UTILITY (VERIFY)
  7.3.1 VERIFY Initiation
  7.3.2 VERIFY Sample Session
  7.3.3 VERIFY Errors And Information Messages
    7.3.3.1 VERIFY Variable Output Fields
    7.3.3.2 VERIFY Error Message Severity Levels
    7.3.3.3 VERIFY Fatal Error Messages
    7.3.3.4 VERIFY Information Messages
    7.3.3.5 VERIFY Warning Messages
    7.3.3.6 VERIFY Type Error Messages
    7.3.3.7 VERIFY Informational Messages
7.4 OFFLINE DISK FORMATTER UTILITY (FORMAT)
  7.4.1 FORMAT Initiation
  7.4.2 FORMAT Sample Session
  7.4.3 FORMAT Errors And Information Messages
    7.4.3.1 FORMAT Error Message Variables
    7.4.3.2 FORMAT Message Severity Levels
    7.4.3.3 FORMAT Fatal Error Messages
    7.4.3.4 FORMAT Warning Message
    7.4.3.5 FORMAT Information Messages
    7.4.3.6 FORMAT Error Messages
    7.4.3.7 FORMAT Success Messages
7.5 RXFORMAT UTILITY
  7.5.1 RXFORMAT Initiation
  7.5.2 RXFORMAT Error Messages
7.6 VIDEO TERMINAL DISPLAY (VTDPY)
  7.6.1 VTDPY Error Messages
  7.6.2 VTDPY Display Example
    7.6.2.1 VTDPY Display Explanation

CHAPTER 8  TROUBLESHOOTING TECHNIQUES

8.1 INTRODUCTION
8.2 HOW TO USE THIS CHAPTER
8.3 INITIALIZATION ERROR INDICATIONS
  8.3.1 OCP Fault Code Displays
  8.3.2 Module LEDs
    8.3.2.1 P.ioj LEDs
    8.3.2.2 Power-Up Sequence Of I/O Control Processor LEDs
    8.3.2.3 Memory Module LEDs
    8.3.2.4 Data Channel LEDs
    8.3.2.5 Host Interface LEDs
  8.3.3 Communication Errors
  8.3.4 Requestor Status For Nonfailing Requestors
  8.3.5 Boot Flowchart
  8.3.6 Boot Diagnostic Indications
8.4 SOFTWARE ERROR MESSAGES
  8.4.1 Mass Storage Control Protocol Errors
  8.4.2 MSCP/TMSCP Error Format, Description, And Flags
    8.4.2.1 MSCP/TMSCP Error Format
    8.4.2.2 MSCP/TMSCP Error Message Fields
    8.4.2.3 MSCP/TMSCP Error Flags
    8.4.2.4 MSCP/TMSCP Controller Errors
      8.4.2.4.1 Controller Error List
    8.4.2.5 MSCP SDI Errors
    8.4.2.6 Disk Transfer Errors
  8.4.3 Bad Block Replacement Errors (BBR)
  8.4.4 TMSCP-Specific Errors
    8.4.4.1 STI Communication Or Command Errors
    8.4.4.2 STI Formatter Error Log
    8.4.4.3 STI Drive Error Log
    8.4.4.4 Breakdown Of GEDS Text Field
    8.4.4.5 Breakdown Of GSS Text Field
  8.4.5 Out-of-Band Errors
    8.4.5.1 CI Errors
    8.4.5.2 Load Device Errors
    8.4.5.3 Disk Functional Errors
    8.4.5.4 Tape Functional Errors
    8.4.5.5 Miscellaneous Errors
  8.4.6 Traps
    8.4.6.1 NXM (Trap Thru 4)
    8.4.6.2 Reserved Instruction (Trap Thru 10)
    8.4.6.3 Parity Error (Trap Thru 114)
    8.4.6.4 Level 7 K Interrupt (Trap Through 134)
    8.4.6.5 Control Bus Error Conditions (Hardware Detected)
      8.4.6.5.1 Level 7 K Interrupt Printout
    8.4.6.6 MMU (Trap Thru 250)

APPENDIX A  INTERNAL CABLING DIAGRAM

A.1 HSC70 INTERNAL CABLING

APPENDIX B  EXCEPTION CODES AND MESSAGES

APPENDIX C  GENERIC ERROR LOG FIELDS

C.1 GENERIC ERROR LOG FIELDS
C.2 MSCP/TMSCP EVENT CODES

APPENDIX D  INTERPRETATION OF STATUS BYTES

D.1 INTRODUCTION
D.2 OVERVIEW
D.3 HOW TO USE THE STATUS CODE TABLES
D.4 EXAMPLE EXAMINATION

APPENDIX E  HSC70 REVISION MATRIX CHART

E.1 INTRODUCTION

INDEX

EXAMPLES

8-1 MSCP/TMSCP Error Message Format
8-2 Controller Error Message Example
8-3 SDI Error Printout
8-4 Disk Transfer Error Printout
8-5 Bad Block Replacement Error Printout
8-6 STI Communication or Command Error Printout
8-7 STI Formatter Error Log Printout
8-8 STI Drive Error Log Printout
8-9 Tape Drive Related Error Message

FIGURES

1-1 Redundant Cluster Configuration
1-2 HSC70 Cabinet - Front
1-3 HSC70 - Inside Front View
1-4 HSC70 Module Utilization Label Example
1-5 HSC70 - Inside Rear View
1-6 HSC70 External Interfaces
1-7 HSC70 Internal Software
1-8 Subsystem Block Diagram
1-9 Memory Map (M.std2 - L0117)
1-10 HSC70 Specifications
2-1 Operator Control Panel
2-2 Controls/Indicators - Inside Front Door
2-3 RX33 and DC Power Switch
2-4 Module LED Indicators
2-5 HSC70 Module Utilization Label Example
2-6 Module (DIP) Switches
2-7 Power Controller - Front Panel Controls
2-8 881 Rear Panel
3-1 Location of Circuit Breaker on the Power Controller
3-2 DC Power Switch Location
3-3 FRU Removal Sequence
3-4 RX33 Cover Plate Removal
3-5 RX33 Disk Drive Removal
3-6 RX33 Jumper Configurations
3-7 Operator Control Panel Removal
3-8 Card Cage Cover Removal
3-9 Location of Node Address Switches
3-10 Main Cooling Blower Removal
3-11 Airflow Sensor Assembly Removal
3-12 Power Controller Removal
3-13 Main Power Supply Cables - Disconnection
3-14 Main Power Supply Removal
3-15 Auxiliary Power Supply Cable Disconnection
3-16 Auxiliary Power Supply Removal
4-1 Console Terminal Connection
4-2 Operator Control Panel Fault Code Displays
6-1 P.ioj Switch Display Register Layout
6-2 P.ioj Control and Status Register Layout
8-1 Operator Control Panel Fault Codes
8-2 HSC70 Boot Flowchart (1 of 4)
8-3 HSC70 Boot Flowchart (2 of 4)
8-4 HSC70 Boot Flowchart (3 of 4)
8-5 HSC70 Boot Flowchart (4 of 4)
8-6 Request Byte Field
8-7 Mode Byte Field
8-8 Error Byte Field
8-9 Controller Byte Field
8-10 RX33 Floppy Controller CSR Breakdown
8-11 MMSR0 Bit Breakdown
A-1 HSC70 Internal Cabling (1 of 5)
A-2 HSC70 Internal Cabling (2 of 5)
A-3 HSC70 Internal Cabling (3 of 5)
A-4 HSC70 Internal Cabling (4 of 5)
A-5 HSC70 Internal Cabling (5 of 5)
D-1 Subsystem Exception K-Detected Error (1 of 2)
E-1 HSC70 Revision Matrix Chart (1 of 4)

TABLES

1-1 Module Nomenclature
2-1 Functions of Logic Module LEDs
4-1 UPAR Register Addresses
4-2 Control Program Bits
4-3 Status of Requestors For Level 7 Interrupt
5-1 ILTCOM Header Record
5-2 ILTCOM Data Patterns
6-1 Error Table
6-2 RX33 Error Code Table
7-1 DKUTIL Command Summary
7-2 DKUTIL Error Messages
8-1 L0111-0 (P.ioj) LEDs
8-2 L0117-0 (M.std2) LEDs
8-3 L0108-YA/YB (K.sdi/K.sti) LEDs
8-4 K.ci (LINK, PILA, K.pli) LEDs
8-5 MSCP/TMSCP Error Message Field Description
8-6 MSCP/TMSCP Error Flags
8-7 MSCP/TMSCP Controller Error Message Field Description
8-8 SDI Error Printout Field Description
8-9 Request Byte Field Description
8-10 Mode Byte Field Description
8-11 Error Byte Field Description
8-12 Controller Byte Field Description
8-13 Disk Transfer Error Printout Field Description
8-14 Original Error Flags Field Description
8-15 Recovery Flags Field Definition
8-16 Bad Block Replacement Error Printout Field Definition
8-17 Replace Flags Bit Description
8-18 STI Communication or Command Error Printout Field Description
8-19 STI Formatter Error Log Field Description
8-20 Formatter E Log
8-21 STI Drive Error Log Field Description
8-22 GEDS Text
8-23 STI Drive Error Log
8-24 Status Register Summary
C-1 Generic Error Log Fields
C-2 Error Flags
C-3 MSCP/TMSCP Event Codes
D-1 K.ci Status Bytes
D-2 K.sdi Status Bytes
D-3 K.sti Status Bytes

PREFACE

This manual describes the HSC70 subsystem. It describes HSC70
controls and indicators, error reporting, field replaceable
units, troubleshooting, and diagnostic procedures. All
information in this manual is informational/instructional and is
designed to assist field service personnel with HSC70
maintenance. Operational theory is included wherever such
background is helpful to field service.
Installation procedures, most HSC utilities, and in-depth
technical descriptions are not included in this manual. For
source material on these and other subjects not within the scope
of this manual, refer to the list of related documentation at the
end of Chapter 1.

CHAPTER 1
GENERAL INFORMATION

1.1 INTRODUCTION

This chapter includes general information about the HSC70 Mass
Storage Server including:

o  Subsystem block diagrams
o  Packaging and logic module descriptions
o  Maintenance features
o  Physical specifications
o  Related documentation

1.2 GENERAL INFORMATION
Defined as a disk and/or tape subsystem, the HSC70 can interface
with multiple hosts using the Computer Interconnect (CI) bus. Two
CI buses are included with the subsystem as a safeguard against
bus failure. Refer to Figure 1-1 for a sample five-node cluster
configuration utilizing two HSC70s and three host computers. In
this figure, all three hosts access both HSC70s over the CI bus,
and through dual porting, both HSC70s can access the tape
formatter and the disks.

The HSC70 supports a combination of eight disk and tape data
channels. Each disk data channel supports four drives over the
Standard Disk Interface (SDI). Each tape data channel supports
four tape formatters over the Standard Tape Interface (STI).
Depending upon which formatter is used, from one to four tape
transports can be supported by each formatter.
Consult the HSC70 software release notes for the maximum number
of tape formatters conforming to the STI bus. These software
release notes are shipped with each HSC70 and with updates of the
software.
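
The channel arithmetic in the paragraphs above can be sketched as a short calculation. This is only an illustration: the constants come from the text, while the function name and the returned dictionary layout are assumptions made for this sketch. The results are hardware upper bounds only; the software release notes may set a lower limit on tape formatters.

```python
# Figures taken from the text above.
MAX_DATA_CHANNELS = 8             # combined disk and tape data channels
DRIVES_PER_DISK_CHANNEL = 4       # disk drives per disk data channel (SDI)
FORMATTERS_PER_TAPE_CHANNEL = 4   # tape formatters per tape data channel (STI)
MAX_TRANSPORTS_PER_FORMATTER = 4  # upper bound; depends on the formatter used


def subsystem_limits(disk_channels: int, tape_channels: int) -> dict:
    """Return upper bounds for a channel mix (hypothetical helper)."""
    if disk_channels < 0 or tape_channels < 0:
        raise ValueError("channel counts must be non-negative")
    if disk_channels + tape_channels > MAX_DATA_CHANNELS:
        raise ValueError("at most 8 data channels are supported in total")
    formatters = tape_channels * FORMATTERS_PER_TAPE_CHANNEL
    return {
        "disk_drives": disk_channels * DRIVES_PER_DISK_CHANNEL,
        "tape_formatters": formatters,
        "max_tape_transports": formatters * MAX_TRANSPORTS_PER_FORMATTER,
    }
```

For example, six disk data channels and two tape data channels give at most 24 disk drives, 8 tape formatters, and 32 tape transports.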

Figure 1-1  Redundant Cluster Configuration
(Drawing labels: HOST (three), HSC70 (two), VT220 (two), LA50 (two), CI INTERFACE.)

1.2.1 HSC70 Cabinet Layout
HSC70 logic and power systems are housed in a modified H9642
cross-products cabinet with both front and rear access. See
Figure 1-2 for a front view of the cabinet.

Figure 1-2  HSC70 Cabinet - Front

On the front of the cabinet are the operator control panel
switches and indicators. Switch operation and indicator
functions are described in Chapter 2.
To access the cabinet interior, open the front door with a key.
The door key is part of the door-lock mechanism, part number
12-25411-01.


The upper right-hand portion of the cabinet houses the RX33 dual
drives and connectors for the operator control panel.

The HSC70 contains two power supplies. Both are housed
underneath the RX33. See Figure 1-3. Each power supply has a
fan drawing air from the front of the cabinet across the power
unit and exhausting it through a rear duct.

Figure 1-3  HSC70 - Inside Front View
(Drawing labels: MAIN POWER SUPPLY, CARD CAGE, AUXILIARY POWER SUPPLY.)

A 14-slot card cage with a corresponding backplane provides
housing for the HSC70 logic modules (L-series extended hex).
When viewed from the front, the card cage occupies the upper left
of the cabinet. Above the card cage is a module utilization
label indicating the slot location of each module (Figure 1-4).
All unassigned slots contain baffles.

Figure 1-4  HSC70 Module Utilization Label Example

NOTE

Requestor slots A, B, C, D, E, F, M, and N,
illustrated in Figure 1-4, are optional tape or
disk data channels. Optional slot labels are
blank when no module is present. Appropriate
labels are provided with each data channel option
ordered.


Logic modules are cooled by a blower mounted behind the card cage
(Figure 1-5). Air is drawn in through the front door louver, up
through the modules, and exhausted through the larger duct at the
back.
NOTE

Figure 1-5 shows the blower motor outlet duct for
current models. Early models have a smaller
blower motor outlet duct.

Two levels of cable connections are found in the HSC70:
backplane to bulkhead and bulkhead to outside the cabinet. All
connections to the logic modules are made via the backplane. All
cables attach to the backplane with press-on connectors.

Figure 1-5  HSC70 - Inside Rear View
(Drawing labels include: BLOWER, BLOWER OUTLET DUCT, INTERNAL CI CABLES, POWER CONTROLLER.)

The power controller is in the lower left-hand rear corner of the
HSC70. The power control bus, delayed output line, and noise
isolation filters are housed in the power controller.
Exterior CI, SDI, and STI buses are shielded up to the HSC70
cabling bulkhead. These cables are attached to bulkhead
connectors located at the bottom rear of the cabinet. From the
interior of the I/O bulkhead connectors, unshielded cables are
routed to the backplane.

1.2.2 External Interfaces
Figure 1-6 shows the external hardware interfaces used by the
HSC70.

HSC70 CONTROLLER

CI BUS                             ONE OR MORE HOST COMPUTERS
SDI BUS                            DISK DRIVES (ONE CABLE PER DISK DRIVE)
STI BUS                            TAPE FORMATTER (ONE CABLE PER FORMATTER)
ASCII SERIAL LINE                  CONSOLE TERMINAL (I/O BULKHEAD J60)
ASCII SERIAL LINE                  (NOT USED)
ASCII SERIAL LINE                  (NOT USED)
RX33 DISK DRIVE SIGNAL INTERFACE   RX33 DISK DRIVE

(The drawing also carries the connector label BACKPLANE J18.)

Figure 1-6  HSC70 External Interfaces

External interface lines include:

o  CI Bus - Four coaxial cables (BNCIA-XX): a two-path
   serial bus with a transmit and a receive cable in each
   path. This is the communication path between the system
   host(s) and the HSC70.

o  SDI Bus - Four shielded wires for serial communication
   between the HSC70 and disk drives (one SDI cable per
   drive per controller) (BC26V-XX).

o  STI Bus - Four shielded wires for serial communication
   between the HSC70 and the tape formatter (one STI cable
   per formatter) (BC26V-XX).

o  Serial Line Interface - RS-232-C cable for console
   terminal communication with the I/O control processor
   module.

o  RX33 Signal Interface - Cable linking the RX33 drives with
   the RX33 controller located on the M.std2 module.
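
The cabling rules in the list above lend themselves to a quick tally. The following is a minimal sketch for a single HSC70, assuming every attached disk drive and tape formatter uses exactly one cable to that controller; the function name and result layout are invented for illustration. Cable part numbers are as given in the list.

```python
# Cable part numbers as listed above.
CABLE_PART = {"CI": "BNCIA-XX", "SDI": "BC26V-XX", "STI": "BC26V-XX"}

# Two CI paths, each with one transmit and one receive coaxial cable.
CI_PATHS = 2
CABLES_PER_CI_PATH = 2


def cables_for_one_hsc70(disk_drives: int, tape_formatters: int) -> dict:
    """Count external bus cables for one HSC70 (hypothetical helper)."""
    if disk_drives < 0 or tape_formatters < 0:
        raise ValueError("counts must be non-negative")
    return {
        "CI": CI_PATHS * CABLES_PER_CI_PATH,  # four coaxial cables
        "SDI": disk_drives,                   # one SDI cable per drive
        "STI": tape_formatters,               # one STI cable per formatter
    }


def cables_by_part_number(counts: dict) -> dict:
    """Group cable counts by part number (SDI and STI share BC26V-XX)."""
    totals = {}
    for bus, count in counts.items():
        part = CABLE_PART[bus]
        totals[part] = totals.get(part, 0) + count
    return totals
```

In a dual-ported configuration such as Figure 1-1, the SDI count applies to each controller separately, since one SDI cable is needed per drive per controller.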

1.2.3 Internal Software
Major HSC70 software modules operating internally are shown at a
block level in Figure 1-7. Each software module is described in
the following lists.

[Block diagram, drawing CX-929A. Blocks shown:]

    HOST CPUs, DISKS, TAPES
    K.CI, K.STI, K.SDI
    CI MANAGER, STI MANAGER, SDI MANAGER
    DIAGNOSTIC SUBROUTINES, UTILITY PROCESSES
    MSCP PROCESSOR, ERROR PROCESSOR
    TAPE I/O MANAGER, DISK I/O MANAGER
    DIAGNOSTIC MANAGER, UTILITIES MANAGER
    HSC70 CONTROL PROGRAM
    RX33 DRIVES, CONSOLE TERMINAL

Figure 1-7  HSC70 Internal Software

HSC70 Control Program - (found on the System diskette)
is the lowest level manager of the subsystem. It
provides a set of subroutines and services shared by all
HSC70 processes. This program performs the following
functions:


Initializing and reinitializing the subsystem
Managing the RX33 local storage media
Executing all auxiliary terminal I/O
Scheduling processes (both functional and
diagnostic) for execution by the P.ioc
Providing a set of system services and system
subroutines to HSC70 processes.
Functional processes within the HSC70 communicate with
each other and the HSC70 control program. They
communicate through shared data structures and
send/receive messages.

MSCP Processor - is responsible for validating,
interpreting, and routing incoming MSCP commands and
dispatching MSCP completion acknowledgments.

CI Manager - is responsible for handling virtual circuit
and server connection activities.

Error Processor - responds to all detected error
conditions. It reports errors to the diagnostic manager
and attempts to recover from errors (ECC, bad-block
replacement, retries, etc.). When recovery is not
possible, a diagnostic is run to determine if the
subsystem can function without the failing resource.
Then appropriate action is taken to remove the failing
resource or to terminate subsystem operation.

Tape I/O Manager - sets up the data transfer structures
for tape operations and manages the physical positioning
of the tape.

STI Manager - handles the STI protocol, responds to
attention conditions, and manages the online/offline
status of the tape drives.

Disk I/O Manager - performs the following functions:
Translates logical disk addresses into
drive-specific physical addresses
Organizes the data-transfer structures for disk
operations
Manages the physical positioning of the disk heads


SDI Manager - performs the following:
Handles the SDI protocol
Responds to attention conditions
Manages the online/offline status of the disk drives

Diagnostic Manager - is responsible for all diagnostic
requests, for error reporting, and for error logging.
It also provides decision-making and
diagnostic-sequencing functions and can access a large
set of resource-specific diagnostic subroutines.

Diagnostic Subroutines - run under the control of the
Diagnostic Manager and are classified as inline
diagnostics.

Utilities Manager - performs the following functions:
Interpreting incoming utility requests
Setting up the appropriate subsystem environment for
operation of the requested utility
Invoking the utility process
Returning the subsystem to its normal environment
upon completion of the utility execution

Utility Processes - perform volume-management functions
(formatting, disk-to-disk copy, disk-to-tape copy,
tape-to-disk restore). They also handle miscellaneous
operations required for modifying subsystem parameters
or for analyzing subsystem problems (such as COPY,
PATCH, and error dump).
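
The Error Processor sequence described in the list above (attempt recovery, then run a diagnostic, then remove the resource or stop) can be read as a small decision procedure. The Python sketch below is illustrative only. The function name, the callable parameters, and the returned strings are assumptions made for this example; they are not HSC70 software interfaces.

```python
from typing import Callable, Iterable


def handle_error(recovery_steps: Iterable[Callable[[], bool]],
                 can_run_without_resource: Callable[[], bool]) -> str:
    """Return the action taken for one detected error (hypothetical model)."""
    # Try each recovery method in turn (for example ECC, bad-block
    # replacement, retries). The first one that succeeds ends handling.
    for step in recovery_steps:
        if step():
            return "recovered"
    # Recovery was not possible: a diagnostic decides whether the
    # subsystem can function without the failing resource.
    if can_run_without_resource():
        return "remove failing resource"
    return "terminate subsystem operation"
```

Each recovery step is modeled as a callable that returns True on success, so the order of the list stands in for the order in which recovery methods would be tried.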

1.2.4 Subsystem Block Diagram
The HSC70 is a multimicroprocessor subsystem with two shared
memory structures, one for control and one for data. In
addition, the HSC70 I/O control processor fetches its own
instructions from a private (program) memory. Figure 1-8 shows
an HSC70 block diagram and the position of each component in the
subsystem.

[Block diagram, drawing CX-930B. Components shown:]

    HOST INTERFACE (K.CI):
        PORT PROCESSOR (K.PLI, L0107-YA)
        PORT BUFFER (PILA, L0109)
        PORT LINK (LINK, L0100)
        PLI BUS, ILI BUS
    CI BUS A and CI BUS B to the SC008 STAR COUPLER
    CONTROL BUS, DATA BUS, PROGRAM BUS
    INPUT/OUTPUT CONTROL PROCESSOR (P.IOJ, L0111)
    MEMORY MODULE (M.STD2, L0117)
    TAPE DATA CHANNEL MODULE(S) (K.STI, L0108-YB)
        STI BUS to MAGTAPE FORMATTER (TA78, ETC.) and TAPE TRANSPORTS
    DISK DATA CHANNEL MODULE(S) (K.SDI, L0108-YA)
        SDI BUS to DISK DRIVE (RA81, RA60, ETC.)
    TERMINAL, OPERATOR CONTROL PANEL
    ASCII PORT SERIAL LINE INTERFACE
    RX33 DRIVES

Figure 1-8  Subsystem Block Diagram

1.3 MODULE DESCRIPTIONS
This section describes each of the HSC70 logic modules.
References to modules by their engineering terms appear
throughout HSC70 documentation as well as on diagnostic
printouts.
For this reason, the engineering term is shown in
parentheses after the formal name for each module. These
relationships are also indicated in Figure 1-8 and Table 1-1.


Table 1-1  Module Nomenclature

Module Name             Engineering Name        Module Designation

Port Link               LINK or                 L0100
                        Interprocessor
                        Link Interface

Port Buffer             PILA                    L0109

Port Processor          K.pli                   L0107

Disk Data Channel       K.sdi                   L0108-YA (HSC5X-BA)

Tape Data Channel       K.sti                   L0108-YB (HSC5X-CA)

Input/Output Control    P.ioj                   L0111
Processor

Memory                  M.std2                  L0117

Host Interface          K.ci                    Consists of Port Link, Port
                                                Buffer, and Port Processor
                                                Modules

1.3.1 Port Link Module (LINK) Functions
The port link module (L0100), a part of the host interface module
set (K.ci), performs the following functions:

Serialization/deserialization, encoding/decoding, dc
isolation - permits transmission of a self-clocking
stream over the CI.
(Information transmitted over the
CI bus is serialized and Manchester encoded.) The driver
circuit includes a transformer for ac coupling the
encoded signal to the coaxial cable. Information
received from a CI transmission is decoded and converted
to bit-parallel form. The circuitry also provides
carrier detection for determining when the CI is in use
by another node.

Cyclic redundancy check (CRC) generation/checking -
checks the 32-bit CRC character generated and appended
to a message packet when it is received. An incorrect
CRC means either errors were induced by noise or a
packet collision occurred.


ACK/NAK generation - generates an ACK upon receipt of a
packet addressed to the LINK if the following conditions
exist:
Error-free CRC
Buffer space available for the message
Upon receipt of a packet addressed to this node, a NAK
is generated if the following conditions exist:
Error-free CRC
No buffer space available for the message
No response is made if a packet addressed to this node
is received with CRC error.

Packet transmission - performs the following functions:
Executes the CI arbitration algorithm
Transmits the packet header
Moves the stored information from the transmit
packet buffer to the Manchester encoder
Calculates and appends the CRC to the end of the
packet
Receives the expected ACK packet

Packet reception - performs the following functions:
Detects the start of the CI transmission
Detects the sync characters
Decodes the packet header information
Checks the CRC
Moves the data from the Manchester decoder
Returns the appropriate ACK packet
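
The ACK/NAK rules listed above reduce to a three-way decision for a packet addressed to this node. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, and `zlib.crc32` is used only as a stand-in 32-bit CRC, because this section does not state the polynomial used on the CI.

```python
import zlib
from typing import Optional


def packet_crc_ok(body: bytes, received_crc: int) -> bool:
    """Compare a received 32-bit CRC with one recomputed over the body.

    zlib.crc32 is a stand-in here; the actual CI polynomial is not
    given in this section.
    """
    return (zlib.crc32(body) & 0xFFFFFFFF) == received_crc


def link_response(crc_ok: bool, buffer_available: bool) -> Optional[str]:
    """Return "ACK", "NAK", or None (no response)."""
    if not crc_ok:
        # A packet received with a CRC error gets no response at all.
        return None
    # Error-free CRC: ACK if there is buffer space, otherwise NAK.
    return "ACK" if buffer_available else "NAK"
```

Returning None for the CRC-error case keeps "no response" distinct from a NAK, which the text reserves for the buffer-full condition.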


The port link module interfaces via line drivers/receivers
directly to the CI coaxial cables. On the HSC70 interior side,
the port link module interfaces to the port buffer module through
a set of interconnect link (ILI) signals. The port link module
also interfaces to the port processor module (indirectly through
the port buffer module) using a set of port link interface (PLI)
signals.
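
Section 1.3.1 states that information on the CI bus is serialized and Manchester encoded so that the stream is self-clocking. The Python sketch below shows the general idea of Manchester encoding only. The bit order (least significant bit first) and the polarity convention (1 sent as high-then-low) are assumptions chosen for the example, not documented CI behavior.

```python
from typing import List


def manchester_encode(data: bytes) -> List[int]:
    """Serialize bytes into half-bit line levels (two levels per bit)."""
    levels: List[int] = []
    for byte in data:
        for i in range(8):  # LSB first (assumed bit order)
            bit = (byte >> i) & 1
            # Every bit has a mid-bit transition, which carries the clock.
            levels.extend((1, 0) if bit else (0, 1))
    return levels


def manchester_decode(levels: List[int]) -> bytes:
    """Convert half-bit line levels back to bit-parallel bytes."""
    if len(levels) % 16 != 0:
        raise ValueError("expected 16 half-bit levels per byte")
    out = bytearray()
    for start in range(0, len(levels), 16):
        byte = 0
        for i in range(8):
            pair = (levels[start + 2 * i], levels[start + 2 * i + 1])
            if pair == (1, 0):
                byte |= 1 << i
            elif pair != (0, 1):
                raise ValueError("missing mid-bit transition")
        out.append(byte)
    return bytes(out)
```

Because each bit cell always contains a transition, a receiver can recover the clock from the data stream itself, which is what "self-clocking" refers to.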
1.3.2 Port Buffer Module (PILA) Functions

The port buffer module (L0109) provides a limited number of
high-speed memory buffers to accommodate the difference between
the burst data rate of the CI bus and HSC70 internal memory
buses. It also interfaces to the port link (CI link) module via
the ILI signals and the port processor module via port/link
interface (PLI) signals.
1.3.3 Port Processor Module (K.pli) Functions And Interfaces
The port processor module (L0107-YA) performs the following
functions:

Executes and validates low-level CI protocol

Moves command/message packets to/from HSC70 control
memory and notifies the correct server process of
incoming messages

Moves data packets to/from HSC70 data memory

The port processor module interfaces to three buses:

PLI bus interfaces the port buffer and port link modules

Control memory bus interfaces HSC70 control memory

Data memory bus interfaces HSC70 data memory

1.3.4 Disk Data Channel Module (K.sdi) Functions
Disk data channel module (L0108-YA) operation is controlled by an
onboard microprocessor with a local programmed read-only memory
(PROM). This data channel module performs the following
functions:

Transmits control and status information to the disk
drives

Monitors real-time status information from the disk
drives


Monitors in real time the rotational position of all the
disk drives attached to it

Transmits data between HSC70 data memory and the disk
drives

Generates and compares error correction code (ECC) and
error detection code (EDC) during data transfers

Commands and responses pass between the disk data channel
microprocessor and other internal HSC70 processes through control
memory. The disk data channel module interfaces to the control
memory bus and to the data memory bus. It can also interface to
four disk drives with four individual SDI buses. Currently,
combinations of up to eight disk data channel modules are
possible in the HSC70. Configuration guidelines are found in the
HSC70 Installation Manual.
1.3.5 Tape Data Channel Module (K.sti) Functions
Tape data channel module (L0108-YB) operation is controlled by an
onboard microprocessor with a local programmed read-only memory
(PROM). The tape data channel performs the following functions:

Transmits control and status information to the tape
formatters

Monitors real-time status information from the tape
formatters

Transmits data between the data memory and the tape
formatters

Generates and compares the EDC during data transfers

Commands and responses pass between the tape data channel
microprocessor and other internal HSC70 processes through control
memory. The tape data channel module interfaces to the control
memory bus and to the data memory bus. Maximum configurations
are outlined in the software release notes.
1.3.6 Input/Output (I/O) Control Processor Module (P.ioj) Functions
The I/O control processor module (L0111) uses a PDP-11 ISP (J-11)
processor with memory management and memory interfacing logic.
This processor executes the HSC70 internal software. Also, the
I/O control processor module contains the following:


Bootstrap read-only memory (ROM)

Arbitration and control logic for the control and data
buses

Program-addressable registers for subsystem
initialization and operator control panel communications

Parity checking and generation logic for all of its
accesses to memory

Program memory instruction cache (8 Kbytes of
direct-mapped high-speed memory)

The I/O control processor module interfaces to:

Program memory on the program memory bus

Control memory through the signals of the backplane
control bus

Data memory through signals of the backplane data bus

RX33 disk drives

Console terminal (RS-423 compatible signal levels)

1.3.7 Memory Module (M.std2) Functions
The memory module (L0117) contains three separate and independent
memories each residing on a different bus within the HSC70. In
addition, the memory module contains the diskette controller.
The three memories and diskette controller are known as:

Control Memory (M.ctl) - Two banks of 256 Kbytes of
dynamic RAM for subsystem control blocks and
interprocessor communication structures storage.

Data Memory (M.dat) - 512 Kbytes of static RAM to hold
the data from/to a data channel module.

Program Memory (M.prog) - 1 megabyte of RAM for the
control program loaded from the RX33 diskette.

RX33 Diskette Controller (K.rx) - resides on the program
bus and performs direct memory access word transfers
when reading or writing data to the RX33 diskette.

Using physical addresses, the memory space allocations for the
three memories are illustrated in Figure 1-9.


22-BIT ADDRESS ALLOCATION

ADDRESS (OCTAL)        SPACE             BUS        SIZE        COMMENT

17770000 - 17777777    I/O PAGE          INTERNAL   2KW         INTERNAL REGISTERS
17760000 - 17767777    CONTROL WINDOWS   CBUS       2KW         RESERVED ADDRESSES
17000000 - 17757777    UNDEFINED         NONE       248KW       NOT ACCESSIBLE
16000000 - 16777777    M.CTL             CBUS       256KB(X2)   CONTROL MEMORY
14000000 - 15777777    M.DAT             DBUS       512KB       DATA MEMORY
04000000 - 13777777    UNUSED            PBUS       2MB         EXPANSION ROOM
00000000 - 03777777    M.PROG            PBUS       1MB         PROGRAM MEMORY
                                                                (0-4000 RESERVED
                                                                FOR TRAP VECTORS)

CX-931A

Figure 1-9  Memory Map (M.std2 - L0117)
NOTE

Two completely redundant memory banks make up
control memory. Only one bank at a time is
usable during functional operation. Bank failure
detection and bank swapping are done at boot
time.

Interface to control memory is by the backplane control bus and
to data memory by the backplane data bus. The interface to the
I/O control processor local program memory is via a set of
backplane signals to the program memory module. In addition, the
memory module houses the control circuitry for the RX33 disk
drives.
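
The address ranges in Figure 1-9 can be checked with a short lookup table. The Python sketch below is illustrative: the `REGIONS` list and both function names are assumptions made for this example, and the region labels are paraphrased from the figure. Addresses are written in octal, as in the figure.

```python
# (first address, last address, space, bus) taken from Figure 1-9.
REGIONS = [
    (0o00000000, 0o03777777, "M.prog program memory", "PBUS"),
    (0o04000000, 0o13777777, "unused (expansion room)", "PBUS"),
    (0o14000000, 0o15777777, "M.dat data memory", "DBUS"),
    (0o16000000, 0o16777777, "M.ctl control memory", "CBUS"),
    (0o17000000, 0o17757777, "undefined (not accessible)", None),
    (0o17760000, 0o17767777, "control windows", "CBUS"),
    (0o17770000, 0o17777777, "I/O page", "INTERNAL"),
]


def region_for(address: int):
    """Return the (first, last, space, bus) entry containing a 22-bit address."""
    if not 0 <= address < 2 ** 22:
        raise ValueError("address is outside the 22-bit range")
    for region in REGIONS:
        if region[0] <= address <= region[1]:
            return region
    raise LookupError("no region covers this address")


def region_size_bytes(region) -> int:
    """Size of a region in bytes, computed from its address limits."""
    return region[1] - region[0] + 1
```

For example, `region_for(0o16000000)` returns the M.ctl entry, and `region_size_bytes` on that entry gives 262144 bytes, which matches the 256 Kbytes of one control memory bank.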
1.4 HSC70 MAINTENANCE STRATEGY
Maintenance of the HSC70 is accomplished with field replaceable
units (FRUs). Procedures for removal and replacement are
described in Chapter 4. Field service personnel should not
attempt to replace or repair component parts within FRUs.


Isolation of solid failures can be accomplished efficiently due
to the logical partitioning of the modules and extensive internal
diagnostics.
In addition to the device-resident diagnostics, the
HSC70-resident offline diagnostics are available to support and
verify corrective maintenance decisions.
1.4.1 Maintenance Features
The following features assist in troubleshooting the HSC70:

Self-contained and self-initiated diagnostics

Operator control panel fault code display

Console terminal

Module LED indicators

Various levels of diagnostics execute in the HSC70. Read-only
memory (ROM) diagnostics test each microprocessor in the disk and
tape data channels, port processor, and I/O processor modules.
Pressing the HSC70 Init button starts all internal ROM
diagnostics, which test 95 percent of the HSC70. The OCP or the
console terminal displays any failures. If further diagnostics
are needed, use the terminal to initiate diagnostics stored on
the RX33 diskettes.
The RX33 loads all HSC70 troubleshooting diagnostics upon
operator demand to check SDI/STI communication and interaction
between the HSC70 and disk or tape. Powerup, subsystem
initialization, or operator command can initiate these
diagnostics. Also, certain resource failure detections can
initiate them automatically.
The HSC70 subsystem allows logical assignment of a disk drive or
tape formatter to the diagnostics.
Inline diagnostics allow
drive diagnosis even though other active drives are connected to
the HSC70.
Background (periodic) diagnostics test HSC70 logic not currently
in use by the subsystem. Failures cause the HSC70 to reboot and
execute the initialization diagnostics.
Requestor-detected data memory errors cause an initiation of the
inline memory diagnostics to test the buffer causing the error.
Failures found in any data buffer cause removal of that buffer
from service. If no failure is found, the tested buffer is
returned to service, with one exception. If the same buffer is
sent to test twice, it is retired from service even though no
failure is found.
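
The buffer handling rule in the paragraph above (retire on a failed test, or on a second trip to test even when the test passes) can be sketched as follows. This Python model is an assumption-laden illustration: the class, its attributes, and the `run_diagnostic` callable are hypothetical names, not HSC70 data structures.

```python
from typing import Callable, Dict, Iterable, Set


class DataBufferPool:
    """Hypothetical model of data buffers sent to the inline memory test."""

    def __init__(self, buffer_ids: Iterable[int]) -> None:
        self.in_service: Set[int] = set(buffer_ids)
        self.retired: Set[int] = set()
        self.times_tested: Dict[int, int] = {}

    def report_error(self, buffer_id: int,
                     run_diagnostic: Callable[[int], bool]) -> str:
        """Test a buffer after a reported error; run_diagnostic returns True on failure."""
        self.in_service.discard(buffer_id)
        self.times_tested[buffer_id] = self.times_tested.get(buffer_id, 0) + 1
        failed = run_diagnostic(buffer_id)
        # Retire on a real failure, or when this is the second time the
        # same buffer has been sent to test, even if no failure is found.
        if failed or self.times_tested[buffer_id] >= 2:
            self.retired.add(buffer_id)
            return "retired"
        self.in_service.add(buffer_id)
        return "returned to service"
```

Counting trips to test per buffer, rather than counting failures, is what lets an intermittently bad buffer be removed even when the diagnostic never catches it.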


1.4.2 HSC70 Specifications
Figure 1-10 lists the HSC70 physical and environmental
specifications.

DESCRIPTION:          HSC70 MASS STORAGE SERVER
OPTION DESIGNATION:   HSC70-AA = 60 HZ, 120/208V
                      HSC70-AB = 50 HZ, 380-415V

MECHANICAL
    MOUNTING CODE         FS
    WEIGHT                400 LBS (181.2 KG)
    HEIGHT                106.7 CM
    WIDTH                 21.3 IN (54.1 CM)
    DEPTH                 91.4 CM
    CAB TYPE (IF USED)    MODIFIED H9642

POWER (AC)
    AC VOLTAGE    AC VOLTAGE         FREQUENCY      PHASE CURRENT   STEADY-STATE POWER
    NOMINAL       TOLERANCE          & TOLERANCE    (RMS)           CONSUMPTION (MAX)
    120/208       104-128/180-222    60 HZ ± 1      SEE BELOW       2250 WATTS
    380-415       331-443            50 HZ ± 1      SEE BELOW       2250 WATTS

POWER (AC)
    STEADY-STATE CURRENT (MAX AMPS) BY PHASE
                  120/208V    380-415V
    PHASE A       0.7         0.44
    PHASE B       12.4        6.8
    PHASE C       11.8        6.4
    NEUTRAL       17.1        9.4

POWER (AC)
    PLUG TYPE (NEMA NO.)          NEMA L21-30P
    POWER CORD LENGTH             15 FT (4.5 M)
    INTERRUPT TOLERANCE           4 MS (MIN)
    APPARENT POWER (KVA)          3.0 (KVA)

POWER (AC)
    INRUSH CURRENT (60 HZ, 50 HZ)   175 A PEAK
    SURGE DURATION                  1 CYCLE

DEVICE ENVIRONMENT
    TEMPERATURE, OPERATING*         59 - 90 °F (15 - 32 °C)
    TEMPERATURE, STORAGE            -40 - +151 °F (-40 - +66 °C)
    RELATIVE HUMIDITY, OPERATING    20 - 80%
    RELATIVE HUMIDITY, STORAGE      <96%
    RATE OF CHANGE, TEMP            20 °F/HR (11 °C/HR)
    RATE OF CHANGE, HUMIDITY        20%/HR
    HEAT DISSIPATION                7675 BTU/HR (8100 KJ/HR)

DEVICE ENVIRONMENT
    ALTITUDE (MAX), OPERATING           8000 FT (2.4 KM)
    ALTITUDE (MAX), STORAGE             16,000 FT (4.9 KM)
    AIR VOLUME (AT INLET)               210 FT3/MIN (5.92 M3/MIN)
    AIR QUALITY, PARTICLE COUNT (MAX)   N/A

* ALTITUDE CHANGES: DE-RATE THE MAXIMUM TEMPERATURE 1.8 °C PER THOUSAND
  METERS (1.0 °F PER THOUSAND FEET).

CX-912C

Figure 1-10  HSC70 Specifications
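
Several values in Figure 1-10 are related by simple arithmetic. The Python sketch below works through the altitude de-rating note and the power figures. It is a rough check, not part of the specification: the function names are hypothetical, and applying the de-rating linearly from an altitude of zero is an assumption, since the figure does not state a reference altitude.

```python
MAX_OPERATING_TEMP_C = 32.0   # upper operating limit from Figure 1-10
DERATE_C_PER_KM = 1.8         # de-rating per thousand meters
MAX_OPERATING_ALT_KM = 2.4    # maximum operating altitude


def derated_max_temp_c(altitude_km: float) -> float:
    """Maximum operating temperature at a given altitude (linear model, assumed)."""
    if not 0.0 <= altitude_km <= MAX_OPERATING_ALT_KM:
        raise ValueError("altitude is outside the operating range")
    return MAX_OPERATING_TEMP_C - DERATE_C_PER_KM * altitude_km


def watts_to_btu_per_hr(watts: float) -> float:
    """Convert power to heat dissipation in BTU/hr (1 W is about 3.412 BTU/hr)."""
    return watts * 3.412142


def watts_to_kj_per_hr(watts: float) -> float:
    """Convert power to heat dissipation in kJ/hr (1 W = 3.6 kJ/hr)."""
    return watts * 3.6


def power_factor(watts: float, kva: float) -> float:
    """Ratio of real power to apparent power."""
    return watts / (kva * 1000.0)
```

With the 2250 W maximum consumption, the conversions land close to the listed 7675 BTU/hr and 8100 kJ/hr, and 2250 W against 3.0 kVA corresponds to a power factor of 0.75.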


1.5 HSC70 RELATED DOCUMENTATION
Documents related to the HSC70 are available under the following
part numbers:
HSC User Guide                                AA-GMEAA-TK

HSC70 Installation Manual                     EK-HSC70-IN

HSC70 Illustrated Parts Breakdown             EK-HSC70-IP

Star Coupler User Guide                       EK-SC008-UG

VT220 Owners Manual                           EK-VT220-UG

VT220 Programmer Pocket Guide                 EK-VT220-HR

VT220 Installation Guide                      EK-VT220-IN

Installing and Using the LA50 Printer         EK-OLA50-UG

LA50 Printer Programmer Reference Manual      EK-OLA50-RM

These documents (except for the User Guide) can be ordered from
Publication and Circulation Services, 10 Forbes Road, Northboro,
Massachusetts 01532 (RCS Code: NR12, Mail Code: NR03/W3).
The User Guide can be ordered from the Software Distribution
Center, Digital Equipment Corporation, Northboro, Massachusetts
01532.
NOTE

Please consult the HSC Software Release Notes for
the latest hardware revision levels.


CHAPTER 2
HSC70 CONTROLS/INDICATORS

2.1 INTRODUCTION
This chapter describes the controls and indicators located in
five areas of the HSC70:

Operator Control Panel (OCP)

Inside front door

RX33 disk drives

Logic modules

Power controller

2.2 OPERATOR CONTROL PANEL (OCP)
Figure 2-1 illustrates the controls and indicators on the
Operator Control Panel (OCP).

[Drawing of the OCP. Legible callouts: two MOMENTARY CONTACT
SWITCHES, one ALTERNATE ACTION SWITCH, and the State, Power, and
Online indicators.]

Figure 2-1  Operator Control Panel


The OCP controls and indicators are described in the following
list.

State and Init Indicators - describe the state of the
HSC70. Under runtime conditions, the Init indicator is
off while the State indicator is pulsing. During
initialization, these indicators change to reflect the
current initialization phase of the subsystem.

Init Switch - causes the HSC70 to start its
initialization routine. The Secure/Enable switch must
be in the ENABLE position for this switch to be
operational.

Power Indicator - goes OFF if the dc voltage levels drop
below one-third of minimal. The power indicator is
driven from a dc comparator circuit on the I/O Control
Processor module (L0111) that constantly monitors the
+5, +12, and -5.2 voltages. The power indicator is also
driven by a logic gate that monitors the Power Fail
signal from the power supplies. If this signal is
asserted, the power indicator goes OFF.
NOTE
The power indicator ON does not mean these
voltages are within specification (±5 percent).

Fault Indicator and Switch - comes on when the HSC70
logic detects a fault. The Fault switch is used for the
OCP lamp test.
Fault Codes - When the Fault switch is pressed and
released, the lamps in Init, Online, Fault, and the
two blanks function as an error display.
If the
fault code is a hard fatal error, the fault code
blinks on and off until the HSC70 is powered down or
the Init switch is pressed again.
If the displayed fault code is a soft nonfatal
failure, the fault code clears on subsequent
toggling of the Fault switch. Multiple soft fault
codes can be queued in the fault code buffer.
Subsequent toggling of the Fault switch displays
each soft fault code until the buffer is emptied.


Soft fault codes are identified by the Fault
indicator ON (or displayed fault code) while the
State indicator is pulsing. With soft faults, the
HSC continues to operate without the use of the
failing resource. Hard fault codes are identified
by the Fault indicator ON (or displayed fault code)
while the HSC State indicator is not pulsing. With
hard faults the HSC will not continue operation
until the failure is remedied.
Error codes associated with the OCP display are
defined in Chapter 8 and in Chapter 4.
Lamp Test - Pushing and holding the Fault switch
causes all the OCP indicators to light and function
as a lamp test. Even if the Fault indicator is
already on before the switch is pushed, the lamp
test can be executed.

Online Switch - puts the HSC70 logic in the available
state when pushed to the IN position and allows a host
to establish a virtual circuit with the HSC70. When
this switch is released to the OUT position, no new
virtual circuits can be made.

Online Indicator - shows a virtual circuit exists
between the HSC70 and a host CPU when the Online
indicator is on. When this indicator is off, no virtual
circuits are established with any host.

Blank Indicators - form the lowest two bits of a 5-bit
fault code.
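
The fault display described above uses five lamps as a 5-bit code, with the two blank indicators as the lowest two bits. The Python sketch below shows how such a display could be read. The assignment of the upper three bits (Init, Fault, Online, from most to least significant) is an assumption for illustration only; the authoritative fault code tables are in the chapters this section refers to. The lamp names and function names are also hypothetical.

```python
from typing import Set

# Most significant bit first. Only the position of the two blank
# indicators (lowest two bits) comes from the text; the order of the
# other three lamps is assumed.
LAMP_ORDER = ("Init", "Fault", "Online", "Blank 1", "Blank 2")


def fault_code(lit_lamps: Set[str]) -> int:
    """Combine the lit lamps into a 5-bit fault code."""
    unknown = lit_lamps - set(LAMP_ORDER)
    if unknown:
        raise ValueError("unknown lamp names: %s" % sorted(unknown))
    code = 0
    for lamp in LAMP_ORDER:
        code = (code << 1) | (1 if lamp in lit_lamps else 0)
    return code


def fault_kind(state_indicator_pulsing: bool) -> str:
    """Soft faults leave the State indicator pulsing; hard faults do not."""
    return "soft" if state_indicator_pulsing else "hard"
```

Fault codes in this manual are quoted in octal, so a decoded value is most conveniently printed with `format(code, "o")`.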

2.3 INSIDE FRONT DOOR CONTROLS/INDICATORS
Figure 2-2 shows the controls and indicators available when the
front door is opened.

Secure/Enable Switch - disables the Init switch from the
OCP when in the SECURE position. Also, the SET utility
program cannot run, and the BREAK character from the
terminal is disabled. With the Secure/Enable switch in
the ENABLE position, the Init switch and all the utility
programs can be used.
The SHOW utility is operable with the Secure/Enable
switch in either position.


Enable Indicator - indicates the Secure/Enable switch is
in the ENABLE position when the Enable LED is
illuminated (all switches can be used). When the Enable
indicator is off, the OCP is secure.

[Drawing CX-902B. Callouts: OCP SHIELD, HSC70 SECURE/ENABLE
SWITCH, OCP SIGNAL/POWER LINE CONNECTOR.]

Figure 2-2  Controls/Indicators - Inside Front Door


RX33 LEDs - are lit to indicate which particular drive
is in use. There is an LED on the front panel of each
drive. When not in use, the RX33 diskettes are stored
inside the front door. See Figure 2-3.

DC Power Switch - is located on the left side of the
RX33 housing. See Figure 2-3. When the DC Power switch
is in the 0 position, the HSC70 is without dc power.
Moving the switch to the 1 position restores dc power.

[Drawing CX-932B. Callouts: DRIVE-IN-USE LEDs, PLATE, DISKETTE
STORAGE AREA.]

Figure 2-3  RX33 and DC Power Switch


2.4 MODULE INDICATORS AND SWITCHES

All logic modules have at least one LED to indicate board status.
Refer to Figure 2-4 for the locations of these LEDs and the
Module Utilization Label. Additionally, two of these logic
modules contain specific switches. Figure 2-5 shows the slot
location for each of the modules. Table 2-1 shows the functions
of the various module LEDs.

[Drawing CX-933A. Callouts: MODULE UTILIZATION LABEL, NOT USED,
NODE ADDRESS SWITCHES, LINK BOARD STATUS INDICATORS, MICRO ODT,
SERIAL LINE UNIT, MEMORY OK, SEQUENCING INDICATORS. Legend: RED,
AMBER, GREEN.]

Figure 2-4  Module LED Indicators

[Drawing CX-889A. The label lists, for each backplane slot, the
module (Mod), bulkhead (Bkhd), and requestor (Req) assignments.]

Figure 2-5  HSC70 Module Utilization Label Example

Table 2-1  Functions of Logic Module LEDs

Module      Color       Function

L0111       D1 Amber    Micro-ODT -- Used during J-11 power-up
                        microdiagnostics.

            D2 Amber    Terminal Port OK -- Used during J-11
                        power-up microdiagnostics.

            D3 Amber    Memory OK -- Used during J-11 power-up
                        microdiagnostics.

            D4 Amber    Sequencing Indicator -- Used during J-11
                        power-up microdiagnostics.

            D5 Amber    State Indicator -- mirrors the OCP State
                        indicator.

            D6 Amber    Run Indicator -- pulses at the on-board
                        microprocessor run rate.

            D7 Red      Board Status -- indicates an inoperable
                        module except during initialization when it
                        comes on during module testing.

            D8 Green    Board Status -- indicates the module has
                        passed all applicable diagnostics.

L0117       Green       Board Status -- indicates the operating
                        software is running and has successfully
                        tested this module.

            Amber       Indicates "Memory Active" - lit during
                        every memory cycle.

            Red         Board Status -- indicates an inoperable
                        module except during initialization when it
                        comes on during module testing.

L0108-YA    Green       Board Status -- indicates the operating
L0108-YB                software is running and that self-test
                        module microdiagnostics have completed
                        successfully.

            Red         Board Status -- indicates an inoperable
                        module except during initialization when it
                        comes on during module testing.

L0107-YA    Green       Board Status -- indicates the operating
                        software is running and that self-test
                        module microdiagnostics have completed
                        successfully.

            Red         Board Status -- indicates an inoperable
                        module except during initialization when it
                        comes on during module testing.

L0109       Green       Board Status -- indicates the operating
                        software is running and that all applicable
                        diagnostics have completed successfully.

            Red         Board Status -- indicates an inoperable
                        module except during initialization when it
                        comes on during module testing.

L0100       Amber       Always on. (Used only for engineering test
                        purposes.)

            Green       Indicates the node is either transmitting
                        or receiving. Dims or brightens relative to
                        the amount of local CI activity.

            Red         Indicates the module is in the Internal
                        Maintenance mode.


2.4.1 Module Switches
Specific switches are found on L0100, L0107, and L0109, as
follows:

Port Link Module (L0100) - Figure 2-4 shows the location
of the CI node address switches mounted on the L0100
module. Both sets of switches must be identically set
to avoid CI addressing errors. The chosen address must
not exceed the current maximum of 15 (decimal).
Addresses higher than 15 cause port link module faults
on the OCP (error code of 25 octal).

Port Processor and Port Buffer Modules (L0107 and L0109)
- Both the L0107 and L0109 modules have dual inline pack
(DIP) switches to indicate the hardware revision level.
DIP switch positions should not be changed except as
directed by a Field Change Order. Figure 2-6 shows the
location of these switches.

[Drawing CX-241C. Shows the L0107 and L0109 modules with the
callout HARDWARE REVISION LEVEL SWITCHES (DO NOT CHANGE EXCEPT
BY FCO).]

Figure 2-6  Module (DIP) Switches

P.ioj (L0111) - The L0111 module contains two punch-out
connector packs used to assign a unique value to the
P.ioj serial number register. The switch settings
should never be modified in the field.


The P.ioj module serial number is only used when a
default HSC SCS-ID is generated. The SCS-ID is a
hexadecimal number uniquely identifying the HSC as a
node in the cluster. This ID is usually generated by
initializing the HSC70 (toggling the Init switch on the
OCP) while holding in the OCP Fault switch until the
INIPIO banner is printed on the console. For all other
reboot cases, the HSC70 P.ioj serial number is not used.
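
The two rules given above for the CI node address switches on the L0100 module (both switch sets identical, address no higher than 15 decimal) are easy to express as a check. The Python sketch below is illustrative only. The function name and message strings are hypothetical, and the switch sets are represented as already-decoded integers because the physical switch encoding is not described in this section.

```python
from typing import List

MAX_NODE_ADDRESS = 15  # current maximum, decimal


def check_node_address(switch_set_1: int, switch_set_2: int) -> List[str]:
    """Return a list of problems with the node address switch settings."""
    problems: List[str] = []
    if switch_set_1 != switch_set_2:
        # Both sets of switches must be identically set.
        problems.append("switch sets differ")
    for name, value in (("set 1", switch_set_1), ("set 2", switch_set_2)):
        if not 0 <= value <= MAX_NODE_ADDRESS:
            # Per the text, this condition shows up as OCP error code 25 (octal).
            problems.append("%s address %d exceeds %d"
                            % (name, value, MAX_NODE_ADDRESS))
    return problems
```

An empty list means the settings satisfy both rules; the check is something a technician could apply mentally before reseating the module.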

2.5 POWER CONTROLLER
The 881 (Figure 2-7) is a general-purpose, three-phase power
controller that controls and distributes ac power to various ac
devices (power supplies, fans, blower motor, etc.) packaged
within an HSC70. The 881:

Controls large amounts of ac power with low level
signals.

Provides ac power distribution to single-phase loads on
a three-phase system.

Protects data equipment from electrical noise.

Disconnects ac power for servicing and in case of
overload.

In addition, the 881 features:

Local and remote switching

SWITCHED receptacles only

Convection cooling

Rack-mounting

AC line filtering

DIGITAL power control bus inputs

DIGITAL power control bus delayed output (to allow
sequencing of other controllers)

2.5.1 Operating Instructions
The two basic controls on the power controller are the circuit
breaker and the BUS/OFF/ON switch. These and all but one of the
other controls are located on the front panel of the controller
(Figure 2-7).

[Drawing CX-893A. Callouts: GROMMETED CORD OPENING, POWER CONTROL
BUS CONNECTORS, SECONDARY ON, SECONDARY OFF, REMOTE BUS CONTROL
(international symbols), SERIAL LABEL, LOGO LABEL, FUSE, CIRCUIT
BREAKER.]

Figure 2-7  Power Controller - Front Panel Controls


The operator controls are described in the following list:

Power Controller Circuit Breaker - controls the ac power
to all outlets on the controller. It also provides
overload protection for the ac line loads and is
unaffected by switching the BUS/OFF/ON control.

Fuse - protects the ac distribution system from an
overload of the power control bus circuitry. The fuse
is located on the front panel of the power controller.

DIGITAL Power Control Bus Connections - used if control
bus connections to another cabinet are required.
DIGITAL power control bus MATE-N-LOK connectors are J10,
J11, J12, and J13. Connectors J10 and J11 are not
delayed. Connectors J12 and J13 are delayed.

BUS/OFF/ON Switch - has three positions: BUS, OFF, and
ON. Assuming the circuit breaker for the power
controller is ON, the ac outlets are:
Energized when the BUS/OFF/ON switch is in the ON
position.
De-energized when the BUS/OFF/ON switch is in the
OFF position.
NOTE
The BUS position is intended for remote sensing
of DIGITAL power control bus instructions. The
switch is left in the ON position when the
DIGITAL power control bus is not used.

Total Off Connector - is a two-pin male receptacle on
the back of the power controller (Figure 2-8). It
removes power from the HSC70 whenever the air flow
sensor detects system air-flow loss. To reset the TOTAL
OFF, cycle the circuit breaker off and then back on
again.
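
The interaction of the circuit breaker, the BUS/OFF/ON switch, and the Total Off connector described in the list above can be summarized as a truth function. The Python sketch below is a simplified model under stated assumptions: the function and parameter names are hypothetical, and the behavior in the BUS position (outlets follow the DIGITAL power control bus request) is inferred from the note about remote sensing rather than stated outright.

```python
def outlets_energized(breaker_on: bool,
                      switch_position: str,
                      bus_requests_power: bool = False,
                      total_off_tripped: bool = False) -> bool:
    """Return True if the switched ac outlets are energized (simplified model)."""
    if switch_position not in ("BUS", "OFF", "ON"):
        raise ValueError("switch position must be BUS, OFF, or ON")
    # The circuit breaker controls ac power to all outlets, and a Total
    # Off trip (air-flow loss) removes power until the breaker is cycled.
    if not breaker_on or total_off_tripped:
        return False
    if switch_position == "ON":
        return True
    if switch_position == "OFF":
        return False
    # BUS position: assumed to follow the power control bus request.
    return bus_requests_power
```

For a stand-alone cabinet with no power control bus, the switch is left in ON, so only the breaker and the Total Off condition determine the result.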

[Drawing CX-934A. Callout: TOTAL OFF CONNECTOR.]

Figure 2-8  881 Rear Panel


CHAPTER 3
REMOVAL AND REPLACEMENT PROCEDURES

3.1 INTRODUCTION

This chapter describes procedures for removing and replacing the
field replaceable units (FRUs) in an HSC70. Observe the
following safety precautions before starting removal and
replacement procedures.
3.2 SAFETY PRECAUTIONS

Because hazardous voltages exist inside the HSC70, only a
qualified service representative should service the subsystem.
Bodily injury or equipment damage can result from improper
servicing. Always use the anti-static wrist strap provided when
removing and replacing logic modules.
WARNING
Always remove power from the HSC70 before
replacing internal parts or cables.

3.3 POWER REMOVAL

Before removing/replacing an FRU, turn off the ac power from the
power controller CB1. Open the back door with a 5/32-inch hex
wrench. The power controller is located on the lower left side
of the cabinet. Figure 3-1 shows the location of ac circuit
breaker CB1.
To remove ac power, turn off CB1 (Figure 3-1). To ensure
absolute safety, disconnect the ac plug from its receptacle.

[Drawing CX-1117A. Callouts: connectors J13, J12, J11, and J10;
CB1 CIRCUIT BREAKER; POWER CONNECTOR.]

Figure 3-1  Location of Circuit Breaker on the Power Controller


Following are the two methods for removing dc power:

Turning off the dc power switch, located on the side of
the RX33 housing. See Figure 3-2.

Turning off CB1 (ac power).
WARNING
Ensure the OCP Signal/Power line connector is
connected; otherwise the power indicator on the
OCP can show power off when the power is on.

[Figure 3-2  DC Power Switch Location (drawing CX-946B). Callouts: HSC70 dc power switch, OCP signal/power line connector]

3.4 FIELD REPLACEABLE UNIT (FRU) REMOVAL

Figure 3-3 shows the FRU removal sequence for an HSC70.

[Figure 3-3  FRU Removal Sequence (drawing CX-935A). Box labels, in the order printed: OPEN CABINET FRONT DOOR; MODULES; OCP; RX33; OPEN CABINET BACK DOOR; POWER CONTROLLER; BLOWER; AIR FLOW SENSOR ASSEMBLY; CABINET BACK DOOR; CABINET FRONT DOOR; MAIN POWER SUPPLY; AUXILIARY POWER SUPPLY]

3.4.1 Access From Cabinet Front Door

The FRUs accessed via the front door include the RX33, the
Operator Control Panel, and the logic modules. Should you decide
to remove the front door, use the following procedure:

1. Unlock the cabinet front door and lift the latch to open
   the door.

CAUTION
When performing the following steps, take care
not to damage the front spring fingers.

2. Remove HSC70 power by pushing the dc power switch to the
   "0" position.

3. Disconnect the ground wire from the door.

4. Disconnect the OCP cable at the bottom of the OCP shield
   (Figure 3-2).

5. Pull down on the spring-loaded rod on the top hinge
   inside the cabinet and then lift the door off its bottom
   pin.

Reverse the removal procedure to replace the front door.
3.4.2 Access From Cabinet Back Door

The FRUs accessed via the back door include the Power Controller,
Blower, Air Flow Sensor Assembly, Main Power Supply, and Auxiliary
Power Supply. To remove the back door, use the following
procedure:

1. Open the back door with a 5/32-inch hex wrench.

2. Pull down on the spring-loaded rod on the top hinge
   inside the cabinet and then lift the door off its bottom
   pin.

Reverse the removal procedure to replace the back door.
3.5 RX33 COVER PLATE AND DISK DRIVE REMOVAL AND REPLACEMENT

The RX33 disk drives are slide mounted in the HSC70 cabinet. A
cover plate ensures proper air flow and cooling. Use the
following procedure to remove the RX33 cover plate (Figure 3-4).

1. Unlock the cabinet front door and lift the latch to open
   the door.

2. Turn off dc power (Figure 3-2).

3. Rotate the four fasteners on the RX33 cover plate 1/4
   turn and remove the cover plate.

[Figure 3-4  RX33 Cover Plate Removal (drawing CX-1118A). Callouts: 1/4 turn fastener, drive cover plate]

Use the following procedure to remove the RX33 disk drives.

1. Completely loosen the two captive screws holding the
   drive assembly and mounting plate to the cabinet frame.

CAUTION
Avoid snagging the cables attached to the rear
of the drives during the next step.

2. Carefully pull out the slide-mounted RX33s until they
   clear their housing.

3. Support the drives with one hand, and remove the flat
   ribbon cables and power cables from the rear of the
   drives.

4. Determine whether drive 0 or drive 1 should be replaced.

5. Using a flat-bladed screwdriver, loosen the captive
   mounting screws on the drive to be replaced, as shown in
   Figure 3-5.

[Figure 3-5  RX33 Disk Drive Removal (drawing CX-936A). Callouts: mounting plate, mounting plate screw]

6. Configure RX33 jumpers on the replacement drive as shown
   in Figure 3-6. If replacing drive 0, be sure to insert
   jumper DS0. If replacing drive 1, be sure to insert
   jumper DS1. Section 3.5.1 briefly describes the
   function of each jumper.

NOTE
Replacement RX33 drives shipped from the vendor
are not configured for HSC70 application. Two
identical jumpers, DEC part number 12-18783-00,
must be added. If no extra jumpers are
available, remove two jumpers from the defective
drive. Correct jumper configuration is
necessary for proper operation of the
replacement RX33 drive (see next section).

7. Replace the defective drive with a new one.

Reverse the removal procedure to replace the RX33 drives.

[Figure 3-6  RX33 Jumper Configurations (drawing CX-937A). The jumper positions shown in the drawing are not recoverable from the scan; legible labels include HG, LG, IU, ML, RE, DC, RY, and the drive select positions]

3.5.1 RX33 Jumper Configuration

This section defines the RX33 jumpers. Jumpers identified with
an asterisk are connected for HSC operation.

o  * FG = Frame ground connection

o  LG = Logic low on NORMAL/HI DENSITY signal enables
   high-density mode

o  * HG = Logic high on NORMAL/HI DENSITY signal enables
   high-density mode

o  * DS0, 1, 2, 3 = Drive select number 0, 1, 2, 3

o  * I = Speed Mode I (dual speed mode)

o  II = Speed Mode II (single speed mode, 360 RPM only)

o  * U1, U2 = Selects mode of operation for loading heads
   and lighting bezel LED (see note).

o  HL, IU = Selects mode of operation for loading heads and
   lighting bezel LED (see note).

o  DC = Drive will assert DISK CHANGED signal on pin 34 of
   interface cable

o  * RY = Drive will assert DRIVE READY signal on pin 34 of
   interface cable

o  ML = Motor enable. No jumper installed for HSC70
   application

o  RE = Recalibration. No jumper installed for HSC70
   application

NOTE
The HSC70 loads heads and lights the drive-in-use
LED when DRIVE SELECT n and READY are both true.

3.6 OPERATOR CONTROL PANEL (OCP) REMOVAL AND REPLACEMENT

If any OCP lamp fails, replace the entire OCP as follows:

1. Open the front door by turning the key clockwise and
   lifting the latch.

2. Remove dc power (Figure 3-2).

3. Remove the four Kepnuts securing the OCP shield to the
   studs on the front door.

4. Remove the OCP shield.

5. Remove the two screws securing the OCP to the shield
   (Figure 3-7).

6. Remove the two connectors from the printed circuit board
   on the OCP.

7. Pull out the OCP carefully, allowing for indicator and
   switch clearance.

Reverse the removal procedure to replace the OCP.

[Figure 3-7  Operator Control Panel Removal (drawing CX-938A). Callouts: OCP shield, switch, OCP, OCP mounting screws]

3.7 LOGIC MODULES REMOVAL AND REPLACEMENT

A Velostat (antistatic) kit must be used during module
removal/replacement. The Velostat kit part number is 29-11762.
For convenience, an antistatic wrist strap is included in the
front door diskette storage area.

1. Open the front door by turning the key clockwise.

2. Push the DC Power switch to the 0 position (off). See
   Figure 3-2.

3. Turn the two nylon latches on the module cover plate
   one-quarter turn (Figure 3-8).

4. Pull the card cage cover up and out.

5. Check the module utilization label above the card cage
   for the location of the desired module. The module
   slots are numbered from right to left when viewed from
   the front.

6. Remove the module and replace with a new one.

7. To remove the L0100 port link module, the door latch
   plate attached to the left side of the cabinet frame
   must be moved away from the module removal path. In
   production model HSC70s, the latch plate is swivel
   mounted. Lift the plate slightly and press it flat
   against the cabinet frame. Before closing the cabinet
   door, return the door latch plate to its locked
   position.

Reverse the removal procedure to replace the card cage cover.

[Figure 3-8  Card Cage Cover Removal (drawing CX-887B). Callouts: nylon latches, diskette storage area]

NOTE

The I/O control processor module is identified by
factory-set jumpers. Each module has a unique
serial number that matches the pattern of the
jumpers. Do not reconfigure these jumpers.

If the port link module is being replaced, ensure
the node address switches are properly set on the
new module. Figure 3-9 shows the location of the
switches. See the system manager for the correct
node address.
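
As a rough illustration of how the switch weights in Figure 3-9 (1, 2, 4, 8, 16, 32, 64, 128 for switches 1 through 8) combine into a node address, the following Python sketch adds up the weights of the switches that are set. The function names are hypothetical, and the sketch assumes the caller already knows which switches contribute their value; which physical position that corresponds to must be read from the figure and is not asserted here.

```python
# Illustrative sketch only, not part of the HSC70 software.
# Weights of switches 1-8 as listed in Figure 3-9.
SWITCH_VALUES = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32, 7: 64, 8: 128}


def node_address(active_switches):
    """Return the sum of the weights of the active switches (numbered 1-8)."""
    total = 0
    for switch in set(active_switches):
        if switch not in SWITCH_VALUES:
            raise ValueError(f"no such switch: {switch}")
        total += SWITCH_VALUES[switch]
    return total


def switches_for(address):
    """Return the sorted switch numbers whose weights add up to address."""
    if not 0 <= address <= 255:
        raise ValueError("address must fit in eight switches (0-255)")
    return [n for n, value in SWITCH_VALUES.items() if address & value]


if __name__ == "__main__":
    # The figure's own example: binary 3 uses switches 1 and 2.
    print(node_address([1, 2]))
    print(switches_for(3))
```

The figure's example (binary 3) corresponds to switches 1 and 2. The correct address itself still comes from the system manager.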

[Figure 3-9  Location of Node Address Switches (drawing CX-888A). The port link module carries two 8-position DIP switches. Value of each switch: switch 1 = 1, 2 = 2, 3 = 4, 4 = 8, 5 = 16, 6 = 32, 7 = 64, 8 = 128. The example in the drawing shows binary 3.]

3.8 BLOWER REMOVAL AND REPLACEMENT

The blower, which provides forced air cooling for the cabinet, is
removed by using the following procedure:

1. Open the back door using a 5/32-inch hex wrench.

2. Turn off ac power (CB1 on the power controller).

3. Disconnect the blower power connector.

4. Remove the exhaust duct from the bottom of the blower by
   lifting up the quick release latches on each side of the
   duct (Figure 3-10).

5. Disconnect the airflow sensor power connector (J70) to
   allow removal of the exhaust duct.

NOTE
Figure 3-10, Figure 3-11, and Figure 3-12 show
the blower outlet duct for current HSC70s.
Early models have a smaller blower motor outlet
duct.

6. Loosen, but do not remove, the three Phillips screws
   holding the blower mounting bracket to the cabinet.

7. Lift the blower and bracket up and out of the cabinet.

Reverse the removal procedure to replace the cooling blower.

[Figure 3-10  Main Cooling Blower Removal (drawing CX-939B). Callouts: 3 Phillips head screws (secure blower mounting bracket), removable exhaust duct, cooling blower power connector, airflow sensor power connector (J70), airflow sensor, quick release latches]

3.9 AIRFLOW SENSOR ASSEMBLY REMOVAL AND REPLACEMENT

The airflow sensor assembly, housed in the cooling duct, is
removed by the following procedure:

1. Open the back door using a hex wrench.

2. Turn off the ac circuit breaker (CB1) on the HSC70 power
   controller.

3. Disconnect J70 (Figure 3-11).

4. Remove the Phillips head screw that holds the mounting
   clamp to the duct.

5. Slide the sensor assembly out of the duct.

[Figure 3-11  Airflow Sensor Assembly Removal (drawing CX-940B). Callouts: Phillips head screw, sensor clamp, airflow sensor]

Reverse the removal procedure to replace the airflow sensor
assembly and follow these three steps:

1. Align the slots in the airflow sensor tip horizontally
   with the floor.

2. After turning on ac power to the HSC70, test the new
   airflow sensor for proper operation.

3. Ensure the sensor is operable by blocking the flow of
   air. Pinching the sensor should trip CB1.

3.10 POWER CONTROLLER REMOVAL AND REPLACEMENT

The power controller must be removed to replace either of the
power supplies.

1. Open the back door.

2. Remove the rear door latch to allow clearance for power
   controller removal.

3. Remove ac power by placing CB1 in the off position
   (Figure 3-1).

4. Unplug the power controller from the power source.

5. Remove the two top screws and then the two bottom screws
   securing the power controller to the cabinet (Figure
   3-12). While removing the two bottom screws, push up on
   the power controller to take the weight off the screws.

CAUTION
Do not pull the power controller out too far
because cables are connected to the back and
top.

6. Pull the power controller towards you and then out.

7. Remove the power control bus cables from the J10, J11, J12,
   and J13 connectors at the front of the power controller.
   Refer to Figure 3-12.

8. Disconnect the total off connector at the rear of the
   power controller.

9. Disconnect all line cords from the top of the power
   controller.

NOTE
Be sure to rotate the line cord elbow to the
vertical position if replacing a defective power
controller with a new one. To rotate the elbow,
remove the set screw, rotate the elbow to the
position shown in Figure 3-12, and replace the
set screw in the other hole.

[Figure 3-12  Power Controller Removal (drawing CX-941B). Callouts: main power supply line, cooling blower line cord, phase diagram, power controller screws, power controller line cord]

Reverse the removal procedure to replace the power controller.

NOTE
To ensure proper phase distribution, reconnect
the main power supply, auxiliary power supply, and
cooling blower line cords as shown in Figure
3-12.

3.11 MAIN POWER SUPPLY REMOVAL AND REPLACEMENT

The following procedure covers the removal of the main power
supply:

WARNING
The power supply is heavy. Support it with both
hands to prevent dropping it.

1. Open the back door using a 5/32-inch hex wrench.

2. Turn off CB1 (ac power) on the power controller.

3. Unplug the power controller from the power source.

4. Remove the front door.

5. Remove the power controller (Section 3.10) to access the
   back of the power supply.

6. Unplug the main power supply line cord at the power
   controller.

7. Remove the nut from the -V1 stud (ground) on the back of
   the power supply (Figure 3-13).

8. Remove the nut from the +V1 stud (+5 volts) on the back
   of the power supply.

9. Remove the nut from the -V2 (-5.2 volt) stud on the back
   of the power supply.

10. Remove the nut from the +V2 (ground) stud.

11. Unplug J31 (+12 Vdc output from the supply to
    backplane).

12. Unplug P32 (+12 Vdc and +5 Vdc sense lines).

WIRE LIST (MAIN POWER SUPPLY)

COLOR     POSITION   SIGNAL
PUR       TB1-3-5    12 V
PUR       TB1-3-1    12 V SENSE
PUR       TB1-3-6    12 V
BLU       TB1-2-7    ACC
BLK       TB1-3-3    GND (12 V)
BRN       TB1-2-6    (signal not legible in source)
GRN/YEL   TB1-2-5    GND
ORN       TB1-2-2    -5 V SENSE
YEL       TB1-2-3    ON/OFF (-5.2 V)
BLK       TB1-2-1    GND (-5 V SENSE)
ORN       TB1-2-2    -5 V SENSE (S2-)
BRN       TB1-1-4    POWER FAIL
BLU       TB1-1-3    ON/OFF 5 V
BLK       TB1-1-2    GND (5 V SENSE)
BLK       TB1-1-2    GND (5 V SENSE)
RED       TB1-1-1    5 V SENSE
PUR       TB1-3-2    12 V
BLK       TB1-3-4    GND (12 V SENSE)

[Figure 3-13  Main Power Supply Cables - Disconnection (drawing CX-942B). Rear view of the main power supply. Callouts: J35 power to airflow sensor, power fail, line cord connections, J34 to auxiliary power supply, to backplane, +5 V (+V1), GND (-V1), flexbus]

13. Unplug J33 (to DC power switch).

14. Unplug J34 (remote on/off jumper to auxiliary power
    supply).

15. Unplug J35 (+12 Vdc power to the airflow sensor).

16. Turn the four captive screws on the front of the power
    supply counterclockwise (Figure 3-14).

17. Pull the power supply out about an inch. Check the back
    of the cabinet to ensure the cables and flexbus
    connectors are clear and will not snag when the supply
    is completely removed.

18. Carefully pull the power supply all the way out of the
    cabinet.

[Figure 3-14  Main Power Supply Removal (drawing CX-1157A). Callouts: main power supply cables, captive screws]

19. Remove the power cord from the failing unit and install
    it on the new power supply.

NOTE
Spare power supplies are not shipped with a
power cord.

Reverse the removal procedure to replace the main power supply.

3.12 AUXILIARY POWER SUPPLY

An HSC70 requires an auxiliary power supply. The auxiliary power
supply is mounted directly beneath the main power supply. The
procedure for removing the auxiliary power supply follows:
WARNING
This power supply is heavy. When removing,
support it with both hands to prevent dropping
it.

1. Open the back door using a 5/32-inch hex wrench.

2. Turn off CB1 (ac power) on the power controller.

3. Unplug the power controller from the power source.

4. Remove the front door.

5. Remove the power controller to access the back of the
   power supply (Section 3.10).

6. Unplug the auxiliary power supply line cord at the power
   controller.

7. Remove the nut from the +V1 stud (+5 volt) on the back
   of the power supply (Figure 3-15).

8. Remove the nut from the -V1 stud (ground) on the back of
   the power supply.

9. Disconnect J50 (sense line to voltage comparator).

10. Disconnect J51 (dc on/off jumper).

11. Turn the four captive screws on the power supply
    counterclockwise (Figure 3-16).

WIRE LIST (AUXILIARY POWER SUPPLY)

COLOR     POSITION   SIGNAL
BLACK     TB1-2      GROUND (5 V SENSE)
RED       TB1-1      5 V SENSE
BROWN     TB1-4      POWER FAIL
BLUE      TB1-7      ACC
BROWN     TB1-6      AC
GRN/YEL   TB1-5      CHASSIS GROUND
BLUE      TB1-3      ON/OFF
BLACK     TB1-2      GROUND (5 V SENSE)

[Figure 3-15  Auxiliary Power Supply Cable Disconnection (drawing CX-943A). Rear view of the auxiliary power supply. Callouts: power fail, to backplane, power supply terminal strip, J51 to backplane, J50 to main power supply, ground, line cord to power controller, +V1 (+5 VDC), flexbus, -V1 (ground)]

[Figure 3-16  Auxiliary Power Supply Removal (drawing CX-1158A). Callouts: auxiliary power supply cables, captive screws, auxiliary power supply]

12. Pull the power supply out about an inch. Check the back
    of the cabinet to ensure the cables and flexbus
    connectors are clear.

13. Carefully slide the power supply out through the front
    of the HSC70.

14. Remove the power cord from the failing unit and install
    it on the new power supply.

NOTE
Spare supplies are not shipped with a power
cord.

Reverse the removal procedure to replace the auxiliary power
supply.

CHAPTER 4
INITIALIZATION PROCEDURES

4.1 INTRODUCTION

This chapter tells how to connect the console terminal and how to
initialize the HSC70. Error reporting by fault codes displayed
on the OCP is also described.

4.2 CONSOLE TERMINAL CONNECTION

The console terminal designated for the HSC70 is the VT220. An
LA50 printer is connected to the terminal for hardcopy output.
Detailed operating information is provided in the owner manuals
accompanying the VT220 and LA50.

Figure 4-1 shows the placement of the EIA terminal connectors on
the HSC70 rear bulkhead. The console terminal connects to the
J60 connector as shown. Although three EIA connectors are shown,
two terminals cannot simultaneously connect to an HSC70.

Preferably, power is turned off before the console terminal is
installed. If power must be left on while connecting the
terminal, use the following procedure:

o  Put the Secure/Enable switch in the SECURE position.

o  Change terminal state (plug in, remove power, connect
   EIA line).

o  Type three space characters on the terminal keyboard.

o  Put the Secure/Enable switch in the ENABLE position if
   it is necessary to do so at this point.

NOTE
If this procedure is not followed, the HSC70 may
enter micro-Online Debugging Tool (ODT) mode. An
@ symbol on the screen indicates this mode.
Typing a P (proceed) exits this mode.

[Figure 4-1  Console Terminal Connection (drawing CX-891B). The cable bulkhead carries three EIA terminal connectors, J60 (console), J61, and J62; connect the console terminal to J60. The drawing also shows the data channel connections and the cable connectors within a data channel.]

4.3 HSC70 INITIALIZATION

This section describes the booting procedures for the HSC70
System diskette. This diskette also contains the software
necessary to execute the inline diagnostics and the utilities.
To boot and run the offline diagnostics from a separate Offline
diskette, refer to Chapter 6.

NOTE
Blank RX33 diskettes are unformatted. The format
procedure is described in Chapter 7.

In order to run the HSC70 inline, the System diskette must reside
in the RX33 drive. Customarily, this diskette resides in RX33
drive 0. However, drive 1 and drive 0 are identical, and disk
placement is arbitrary.

System boot is initiated by either powering on the unit or (if
the unit is already on) by depressing and releasing the Init
switch with the Secure/Enable switch in the ENABLE position.
This initiates the P.io ROM bootstrap tests and then loads the
Init P.io Test.

4.3.1 Init P.io Test

The Init P.io Test completes the P.ioj module and HSC70
memory testing previously started by the ROM bootstrap tests.
All P.ioj logic not tested by the bootstrap is tested. In
addition, the HSC70 Program, Control, and Data memories are
tested.

This test runs in a stand-alone environment (no other HSC70
processes are running). If a failure is detected, the failing
module is flagged. If the test runs without finding any errors,
the HSC70 operational software is loaded and started. The Init
P.io Test is not a repair level diagnostic. If a repair level
test is needed, run the Offline P.io Test, which provides standard
HSC70 error messages.

4.3.1.1 Init P.io Test System Requirements - In order to run
this test, the following hardware is required:

o  P.ioj (processor) module with HSC70 Boot ROM

o  At least one M.std2 (memory) module

o  RX33 controller with at least one working drive

In addition, an HSC70 System diskette (RX33 media) is required.

4.3.1.2 Init P.io Test Prerequisites - The Init P.io Test is
loaded by the HSC70 ROM Bootstrap program. The bootstrap tests
the basic J-11 instruction set, the lower 2048 bytes of Program
memory, an 8 Kword partition in Program memory, and the RX33
subsystem used by the bootstrap. When the Init P.io Test begins
to execute, most J-11 logic has been tested and is considered
working. Likewise, the Program memory occupied by the test and
the RX33 subsystem used to load the test are also considered
tested and working. The RX33 diskette is checked to ensure it
contains a bootable image.

4.3.1.3 Init P.io Test Operation - Follow these steps to start
the Init P.io Test:

1. Insert the HSC70 System diskette in the RX33 unit 0
   drive (left-hand drive).

2. Power on the HSC70, or depress and release the Init
   button on the HSC70 OCP with the Secure/Enable switch
   enabled. The Init lamp should light and the following
   should occur:

   o  The RX33 drive-in-use LED should light within 10
      seconds, indicating the bootstrap is loading the Init
      P.io Test to the Program memory.

   o  The I/O State light is on after diskette motion
      stops and the Init P.io Test begins testing.

   o  The Init P.io Test displays the following message on
      the HSC console when it begins: INIPIO-I BOOTING.

   o  HSC70 operational software indicates it has loaded
      properly when the State light blinks.

   o  HSC70 displays its name and version, indicating it is
      ready to perform host I/O.

Once initiated, the Init P.io Test is only terminated by halting
and rebooting the HSC. If the test fails to load using the
preceding start-up procedure, perform the next three steps.

1. Boot the diskette from the RX33 unit 1 drive (right-hand
   drive).

2. Boot another diskette. If that diskette boots, the
   original diskette is probably damaged or worn.

3. Boot the HSC70 Offline Diagnostic diskette. This
   diskette contains the Offline P.io Test, which provides
   extensive error reporting features. A console terminal
   must be connected to run the offline tests.

The progress of the Init P.io Test is displayed in the State LED.
Before the test starts, the State LED is off. When the test
starts, the State LED is turned on, and the INIPIO-I BOOTING
message is printed on the HSC console. When the test completes
with no fatal errors, the State LED begins to blink on and off.
If the test detects an error, the Fault lamp on the HSC70 OCP is
lit.
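
The indicator behavior described above can be summarized as a small lookup. The following Python sketch is only an illustration of that paragraph; the function name and the string values "off", "on", and "blinking" are assumptions made for the example and do not correspond to anything in the HSC70 software.

```python
# Illustrative sketch only: restates the State LED / Fault lamp
# behavior described for the Init P.io Test.
def init_pio_status(state_led, fault_lamp_lit):
    """Map observed OCP indicators to the test progress described in the text.

    state_led is one of "off", "on", or "blinking" (hypothetical labels).
    """
    if state_led not in ("off", "on", "blinking"):
        raise ValueError(f"unknown State LED condition: {state_led}")
    if fault_lamp_lit:
        # The Fault lamp takes priority: the test has detected an error.
        return "error detected"
    if state_led == "off":
        return "test not started"
    if state_led == "on":
        return "test running"
    return "test completed with no fatal errors"


if __name__ == "__main__":
    print(init_pio_status("on", False))
    print(init_pio_status("on", True))
```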

4.3.2 Fault Code Interpretation

All failures occurring during the Init P.io Test are reported on
the operator control panel LEDs. When the Fault lamp is lit,
pressing the Fault switch results in the display of a failure
code in the OCP LEDs. This code indicates which HSC70 module is
the most probable cause of the detected failure. The failure
code blinks on and off at 1-second intervals until the HSC is
rebooted if the fault code represents a fatal fault. A soft
fault code is cleared in the OCP by depressing the Fault switch a
second time. To restart the boot procedure, press the Init
switch. This procedure is detailed in Chapter 8. To identify
the probable failing module, refer to Figure 4-2.
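
As an aid to reading the lamp patterns, the following Python sketch turns a five-bit pattern from the BINARY column of Figure 4-2 into the octal fault code used in the text, together with the figure's description. It is a lookup table only, not HSC70 code. The function name is hypothetical, and the sketch assumes the leftmost digit of the printed pattern is the most significant bit.

```python
# Illustrative sketch only. Descriptions are taken from Figure 4-2;
# keys are the five-bit lamp patterns.
FAULT_CODES = {
    0b00001: "Port processor module failure",
    0b00010: "Disk data channel module failure",
    0b00011: "Tape data channel module failure",
    0b01000: "Instruction cache problem in I/O control processor",
    0b01001: "Host interface error",
    0b01010: "Data channel error",
    0b10001: "I/O control processor module failure",
    0b10010: "Memory module failure",
    0b10011: "Boot device failure",
    0b10101: "Port link module failure",
    0b10110: "Missing files required",
    0b11000: "No working K.sdi, K.sti, or K.ci",
    0b11001: "Reboot during boot",
    0b11010: "Software detected inconsistency",
}

# Octal codes 10, 11, and 12 are the soft (nonfatal) faults.
SOFT_CODES = {0o10, 0o11, 0o12}


def decode_fault(pattern):
    """Return (octal code, description, is_soft) for a pattern such as '10001'."""
    bits = pattern.replace(" ", "")
    if len(bits) != 5 or set(bits) - {"0", "1"}:
        raise ValueError("expected five binary digits")
    value = int(bits, 2)
    if value not in FAULT_CODES:
        raise KeyError(f"pattern {bits} is not listed in Figure 4-2")
    return format(value, "o"), FAULT_CODES[value], value in SOFT_CODES


if __name__ == "__main__":
    print(decode_fault("10001"))
    print(decode_fault("01010"))
```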

Figure 4-2  Operator Control Panel Fault Code Displays (drawing CX-905B)

DESCRIPTION                                           HEX  OCT  BINARY
Port processor module failure †                       01   01   00001
Disk data channel module failure †                    02   02   00010
Tape data channel module failure †                    03   03   00011
Instruction cache problem in I/O control processor ** 08   10   01000
Host interface error **                               09   11   01001
Data channel error **                                 0A   12   01010
I/O control processor module failure                  11   21   10001
Memory module failure                                 12   22   10010
Boot device failure ***                               13   23   10011
Port link module failure                              15   25   10101
Missing files required                                16   26   10110
No working K.sdi, K.sti, or K.ci                      18   30   11000
Reboot during boot                                    19   31   11001
Software detected inconsistency                       1A   32   11010

†   Incorrect version of microcode.
**  These are the so-called soft or non-fatal errors.
*** Possible memory module/controller on HSC70.

The following paragraphs describe specific fault codes displayed
in the OCP lamps. (All fault codes are indicated with octal
values.)

1. Fault Code 1 - K.pli error - indicates the CIMGR
   initialization routine discovered bad requestor status
   from a previously-tested good requestor module in
   requestor slot 1. The expected requestor status should
   be 001. The FRU is the L0107.

   During CIMGR initialization, the K.ci is directed to set
   the HSC node address into its own control structure. If
   the K.ci failed to modify this node address field after
   one-half second from K.ci requestor initialization, this
   fault code is displayed. In addition, the K.pli
   microcode version is checked to ensure it is compatible
   with this functional version. If compatibility checks
   fail, this is the fault code displayed.

   Run offline diagnostics to test the K.ci requestor.
   Replace the K.pli module on failure. If the fault code
   persists, refer to the HSC revision control document to
   verify all HSC components are at the current revision.

2. Fault Code 2 - K.sdi incorrect version of microcode -
   All K.sdi modules are initialized during the Disk Server
   functional code initialization. If a K.sdi passes
   initialization, the Disk Server initialization code
   checks the K.sdi microcode version number to ensure it
   is compatible with this version of functional code. If
   code versions are not compatible, this fault code is
   displayed. The FRU is the L0108-YA.

3. Fault Code 3 - K.sti incorrect version of microcode -
   indicates tape data channel microcode is incompatible.

4. Fault Codes 10, 11, and 12 - soft errors - are the
   so-called soft or nonfatal errors related to the data
   channels, the K.ci host interface, and the P.ioj cache.
   None of these errors causes the HSC70 functional
   operation to suspend when the fault is reported. Once
   displayed, soft error indicators cannot be recalled.
   The HSC may buffer up to eight soft fault codes.
   Subsequent toggling of the Fault switch displays all
   remaining soft fault codes until the buffer is empty.

   o  Fault Code 10 - P.ioj cache failure - results in
      disabling the cache and displaying this soft fault
      code for any failure detected in the J-11
      instruction cache during HSC70 subsystem
      initialization while the HSC70 continues operation.
      Replace the P.ioj module (L0111) and reboot.

   o  Fault Code 11 - K.ci failure - indicates the K.ci is
      not present or has failed its initialization tests.
      This soft fault is displayed while the HSC continues
      to operate. The most probable FRU is the Port Link
      module (L0100).

   o  Fault Code 12 - Data channel module failure - reports
      that an unknown requestor type was found
      in a requestor slot other than 0 or 1. Expected
      valid requestor types for requestor slots 2 through
      8 are either 002 (L0108-YA) or 203 (L0108-YB). The
      data channel with the red LED on is the failing
      module.

5. Fault Code 21 - P.ioj module failure - indicates the
   P.ioj module is the most probable cause of the failure
   detected by the Init P.io Test. If possible, run the
   Offline P.io Test for a more definitive report on the
   error. Otherwise, replace the P.ioj module, and run the
   Init P.io Test again. If the test still fails, run the
   Offline P.io Test to help further isolate the failure.

6. Fault Code 22 - M.std2 module failure - indicates the
   M.std2 (memory) module is the most probable cause of
   this bootstrap failure. Possible causes include:

   o  The failure of the memory test of the first 1 Kword
      (vector area) of Program memory as well as the use
      of the Swap Banks bit in the P.ioj in trying to
      correct the problem (Test 2).

   o  A contiguous 8 Kword partition not found in Program
      memory below address 00160000 (Test 3).

   o  A hard fault detected in the RX33 controller logic
      (Test 4).

   Determine the error that occurred by examining physical
   location 172340, which contains the number of the failing
   boot ROM test. In each of these cases, replace the
   M.std2 module, and run the initialization tests again.
   If the module still fails, run the Offline P.io Test.
   Enter the SETSHO utility and execute the SHO MEM
   command. If any memory locations appear in the suspect
   or disabled memory locations list, set the Secure/Enable
   switch to ENABLE and execute the SET MEM ENABLE/ALL
   command.

7. Fault Code 23 - RX33 failure - indicates a problem with
   an RX33 drive, the diskette, the RX33 controller, or the
   Read/Write logic on the memory module. This fault can
   be any of the following, in order of probability:

   o  A failure in the Read/Write logic of the M.std2
      module. Replace M.std2.

   o  A faulty RX33 controller/drive interface cable.
      Replace the cable.

   o  No diskettes installed in the drives.

   o  Doors were left open on the RX33 drives.

   o  Neither diskette contains a bootable image.

   Ensure a known good HSC70 bootable media is properly
   loaded in one of the RX33 drives. If checking the
   obvious, doors and diskettes, does not remedy the
   situation, refer to Chapter 6 for more information
   before beginning repair. Running the Offline P.io and
   Offline RX33 tests (if possible) is strongly recommended
   before modules are replaced. These tests may help
   further isolate or define the problem.

8. Fault Code 25 - Port Link node address switches out of
   range - indicates the L0100 module node address switches
   are set to a value outside the currently-suggested range
   of 15 decimal.

9. Fault Code 26 - missing files required - indicates the
   System diskette does not contain one of the files
   necessary for operation of the HSC70 Control Program.
   This failure should occur only if one of the required
   files is inadvertently deleted from the HSC70 System
   diskette.

   Note that the condition of the State light just prior to
   the fault occurrence must be observed. The State light is
   always steady (either ON or OFF) when the Fault light is
   lit during boot faults.

   While the State light is steady (ON) it can mean:

   o  SYSCOM.INI is not present on the load device.

   o  EXEC70.INI is not present on the load device.

   o  A version mismatch was found between either EXEC,
      SUBLIB, or SYSCOM and OLBVSN (Object Library Version
      Number).

   While the State light is blinking it can mean:

   o  Any of the normally-loaded programs (SINI, CERF,
      DEMON, etc.) is not present on the load device.

   o  A version mismatch was found on any one of the
      normally-loaded programs.

   Replace the diskette with a backup copy.

10. Fault Code 30 - No working K.ci, K.sdi, or K.sti in
    subsystem - indicates the HSC70 does not contain any
    working K.ci, K.sti, or K.sdi modules. Either none are
    installed in the HSC70, or all the ones installed failed
    their initialization diagnostics. Also, if the Disk
    Server code is loaded, and no working K.sdi is found,
    this fault code is displayed.

    Insert the HSC70 Offline Diagnostic diskette into the
    RX33 and reboot the HSC70. When the Offline Loader
    prompts with ODL>, type SIZE followed by a carriage
    return. The SIZE command displays the status of all the
    Ks. This status indicates whether the modules are
    missing or are failing initialization diagnostics.

    If all else fails, replace the P.ioj (L0111) and check
    subsystem power for proper operation.

11. OCP error code of 31 - indicates a crash occurred while
    the HSC70 was attempting to load and initialize its
    control program. Use Micro-ODT to diagnose these
    initialization crashes as follows:

    a. Press the break key on the local console terminal.

    b. Type 17 777 656/
       This is the address of the UPAR7 register. The
       reason for reboot codes are stored in UPAR7 bits 8
       to 11 when an OCP code of 31 has been detected. The
       other UPAR registers store useful information for
       some of the errors related to an OCP fault code of
       31. Refer to the fault code 31 reasons in the
       following paragraphs for UPAR content usage. Table
       4-1 shows the addresses of the UPAR registers.

    c. Analyze bits 8 to 11 of the 16-bit message displayed
       by examining UPAR7. Table 4-2 shows the bit/error
       relationship.

    d. If this error occurs repeatedly, it indicates an
       intermittent hardware error or degraded diskette
       media. The boot-in-progress flag is indicated by
       KPDR7 bit 3 set. The KPDR7 register address is 17
       772 316. Use micro-ODT to examine bit 3 (it can be
       reset).

12. OCP error code 32 - indicates an inconsistency in the
    software. Reboot the HSC. If this failure persists,
    use a backup copy of the System diskette. If the
    failure still persists, use the Offline diagnostics to
    help isolate any hardware failures in the subsystem.
    Also, try using an earlier version of the HSC operating
    software.

Table 4-1  UPAR Register Addresses

Register   Address
UPAR0      17 777 640
UPAR1      17 777 642
UPAR2      17 777 644
UPAR3      17 777 646
UPAR4      17 777 650
UPAR5      17 777 652
UPAR6      17 777 654
UPAR7      17 777 656
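
The addresses in Table 4-1 advance by 2 (octal) from one register to the next, starting at 17 777 640 for UPAR0. The Python sketch below reproduces the table from that pattern, which may be convenient when typing addresses at the micro-ODT prompt. The function names are hypothetical and the code is only an arithmetic illustration.

```python
# Illustrative sketch only: regenerate the Table 4-1 addresses.
UPAR0_ADDRESS = 0o17777640  # address of UPAR0 from Table 4-1


def upar_address(n):
    """Return the address of UPARn (n = 0 through 7) as an integer."""
    if not 0 <= n <= 7:
        raise ValueError("UPAR registers are numbered 0 through 7")
    return UPAR0_ADDRESS + 2 * n


def odt_format(address):
    """Format an address in the grouped octal style used in the manual."""
    digits = f"{address:08o}"
    return f"{digits[:2]} {digits[2:5]} {digits[5:]}"


if __name__ == "__main__":
    for n in range(8):
        print(f"UPAR{n}  {odt_format(upar_address(n))}")
```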

4-10

Table 4-2  Control Program Bits

16-BIT MESSAGE           MEANING                  FRUs

X XXX XXX 1XX XXX XXX    NXM                      L0111, L0117, Software
X XXX XX1 0XX XXX XXX    Illegal Inst.            L0111, L0117, Software
X XXX XX1 1XX XXX XXX    Parity Trap              L0117, L0111
X XXX X10 0XX XXX XXX    Level 7 Interrupt        L0108, L0107
X XXX X10 1XX XXX XXX    MMU Trap                 L0111, Software
X XXX X11 0XX XXX XXX    Software Crash           Software
X XXX 111 XXX XXX XXX    K.ci Host Reset          L0117
X XXX 100 0XX XXX XXX    User Requested Reboot    N/A
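
The address arithmetic in Table 4-1 and the bit extraction described for UPAR7 can be sketched as follows. This is a minimal illustration, not part of the HSC70 software: the function names are hypothetical, and the sample UPAR7 value is invented. The sketch only isolates bits 8 to 11 and prints the word in the bit grouping used by Table 4-2, so that the result can be compared against the table by eye.

```python
UPAR0_ADDRESS = 0o17777640  # UPAR0 address, taken from Table 4-1


def upar_address(n: int) -> str:
    """Return the Micro-ODT address of UPARn, grouped as in Table 4-1."""
    if not 0 <= n <= 7:
        raise ValueError("UPAR registers are numbered 0 through 7")
    # Consecutive registers are one 16-bit word (2 bytes) apart.
    digits = format(UPAR0_ADDRESS + 2 * n, "o")
    return f"{digits[:-6]} {digits[-6:-3]} {digits[-3:]}"


def parse_octal_word(text: str) -> int:
    """Convert an octal string such as '001400' to a 16-bit integer."""
    word = int(text.replace(" ", ""), 8)
    if not 0 <= word <= 0o177777:
        raise ValueError("value does not fit in 16 bits")
    return word


def reboot_reason_field(word: int) -> int:
    """Extract bits 8 to 11 of the UPAR7 contents as a 4-bit number."""
    return (word >> 8) & 0o17


def bit_groups(word: int) -> str:
    """Format a word as 1 + 3 + 3 + 3 + 3 + 3 bits, the layout of Table 4-2."""
    bits = format(word, "016b")
    return " ".join([bits[0]] + [bits[i:i + 3] for i in range(1, 16, 3)])


if __name__ == "__main__":
    word = parse_octal_word("001400")  # hypothetical value displayed for UPAR7
    print(upar_address(7))
    print(bit_groups(word))
    print(reboot_reason_field(word))
```

Printing the word in the grouped form makes it easy to line up the third and fourth groups against the rows of Table 4-2.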

The following list describes actions to be taken for each type of
error related to an OCP fault code of 31 as pointed out by
examining UPAR7.
o  NXM Trap: Examine UPAR1 to find the lower 16 bits of
the failing memory address by typing 17 777 642/.
Examine UPAR2's lower byte for the high 6 bits of the
failing memory address by typing 17 777 644/.

o  Illegal Inst: Obtain a crash dump and analyze the crash
to find the failing instruction.

o  Parity Trap: Use the same method for parity traps as
you did for NXM traps to determine the failing address.

o  Level 7 Interrupt: Determine which K has interrupted
the system by examining UPAR0 through UPAR4. Refer to
Table 4-1 for the address of each UPAR register. Each
byte of each register contains module status for each
requester (K) in the HSC70. Refer to Appendix C to
determine what a failing status code is. Refer to Table 4-3
for the designation of requesters to UPAR registers for
a level 7 interrupt.


o  Memory Management Unit (MMU) Trap: Examine UPAR1,
UPAR2, and UPAR3 to determine the status of the MMU at
the time of the OCP fault code of 31. When an MMU trap
occurs, status of the MMU is found in these registers.

o  Software crash: Check the first word on the kernel
stack to determine the reason for failing software.
Refer to Appendix B.

o  K.ci Host Reset: Hit the break key again and at the @
symbol type 17 770 000/ when a host reset is known as
the reason for an OCP fault code of 31. This is the
address of control memory window 0. When the / is hit,
the contents of control window 0 are displayed. Enter a
0 into this location followed by a carriage return.
Then type 16 000 002/. This is the second location in
control memory. The number displayed as the contents of
16 000 002 is the number of the host that issued the
HOST RESET command.

Table 4-3  Status of Requestors For Level 7 Interrupt

Register    HIGH BYTE    LOW BYTE

UPAR0       REQ 2        REQ 1
UPAR1       REQ 4        REQ 3
UPAR2       REQ 6        REQ 5
UPAR3       REQ 8        REQ 7
UPAR4       N/A          REQ 9
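
The two register decodings described above (the failing address for an NXM or parity trap, and the per-requestor status bytes of Table 4-3) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the register contents are assumed to have already been read with Micro-ODT and converted to integers, and the meaning of each status byte still has to be looked up in Appendix C.

```python
from typing import Dict, Sequence

# (high-byte requestor, low-byte requestor) for UPAR0..UPAR4, from Table 4-3.
# None marks the unused high byte of UPAR4.
REQUESTOR_LAYOUT = ((2, 1), (4, 3), (6, 5), (8, 7), (None, 9))


def failing_address(upar1: int, upar2: int) -> int:
    """Combine UPAR1 (lower 16 bits) with the low 6 bits of UPAR2's lower byte."""
    return ((upar2 & 0o77) << 16) | (upar1 & 0xFFFF)


def requestor_status(upars: Sequence[int]) -> Dict[int, int]:
    """Split UPAR0..UPAR4 into one status byte per requestor (level 7 interrupt)."""
    if len(upars) != len(REQUESTOR_LAYOUT):
        raise ValueError("expected the contents of UPAR0 through UPAR4")
    status = {}
    for word, (high_req, low_req) in zip(upars, REQUESTOR_LAYOUT):
        word &= 0xFFFF
        status[low_req] = word & 0xFF
        if high_req is not None:
            status[high_req] = word >> 8
    return status


if __name__ == "__main__":
    # Hypothetical register contents, for illustration only.
    print(format(failing_address(0o000002, 0o001), "o"))
    print(requestor_status([0x0201, 0x0000, 0x0000, 0x0000, 0x0009]))
```

Keeping the Table 4-3 layout in a single tuple means the decoding loop does not need a special case for each register.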

4.3.3 Init P.io Test Summaries
The Init P.io Test does not use a test numbering scheme for the
following reasons:

Test numbering adds overhead to the program both in
execution time and the memory size required for the
program. Because boot time is critical, the extra
overhead is not justified.

The only goal of the Init P.io Test is to provide module
callout on the fault code display.

The Offline P.io Test is provided for those situations
where a repair level diagnostic is needed. This offline
test produces standard HSC70 error reports. Chapter 6
describes each of the tests. They are functionally
identical to the tests provided in the Init P.io Test.


CHAPTER 5
INLINE DIAGNOSTICS

5.1 INTRODUCTION
Inline diagnostics executing in the HSC do not interfere with
normal operation. The following sections describe these tests:
o  Inline RX33 Diagnostic Test

o  Inline Memory Test

o  Inline Disk Drive Diagnostic Test

o  Inline Tape Test

o  Inline Tape Compatibility Test

o  Inline Multidrive Exerciser

5.1.1 Inline Diagnostics Commonalities
All inline diagnostics share two conventions: test prompts and
error messages both conform to standard formats. All prompts
issued by these diagnostics use the following generic syntax:
o  Prompts requiring user action or input are always
followed by a question mark.

o  Prompts offering a choice of responses show those
choices in parentheses.

o  A capital D in parentheses indicates the response should
be in decimal.

o  The square brackets enclose the prompt default or, if
empty, indicate no default exists for that prompt.
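
The prompt conventions listed above can be illustrated with a small parser. This is a sketch only: `parse_prompt` is a hypothetical helper, not an HSC70 routine, and it simply applies the four rules to a prompt string such as those shown later in this chapter.

```python
import re


def parse_prompt(prompt: str) -> dict:
    """Split a diagnostic prompt into text, choices, decimal flag, and default."""
    prompt = prompt.strip()
    if not prompt.endswith("?"):
        raise ValueError("prompts that need input end with a question mark")
    # Every parenthesized group is either a list of choices or the (D) marker.
    groups = [g.strip() for g in re.findall(r"\(([^)]*)\)", prompt)]
    # Square brackets hold the default; empty brackets mean no default.
    bracket = re.search(r"\[([^\]]*)\]", prompt)
    default = bracket.group(1).strip() if bracket else ""
    return {
        "text": re.split(r"[(\[]", prompt, maxsplit=1)[0].strip(),
        "choices": [g for g in groups if g != "D"],
        "decimal": "D" in groups,
        "default": default or None,
    }


if __name__ == "__main__":
    print(parse_prompt("# OF PASSES TO PERFORM (1 to 32767) (D) [1] ?"))
    print(parse_prompt("RUN A SINGLE DRIVE DIAGNOSTIC (Y/N) [N] ?"))
```

Only the capital D marker is interpreted here, because it is the only marker the list above defines.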


5.1.1.1 Inline Diagnostics Generic Error Message Format - All
inline diagnostics follow a generic error message format, as
follows:

XXXXXX>D>tt:tt T#aaa E#bbb U-ccc
<Text string describing error>
FRU1-dddddd FRU2-dddddd
MA -eeeeee
EXP-yyyyyy
ACT-zzzzzz

where:

XXXXXX>   Appropriate inline diagnostic prompt
D>        Letter indicating the diagnostic was initiated on
          demand. This field can contain a D, an A (diagnostic
          initiated automatically), or a P (diagnostic initiated
          as part of the periodic diagnostics).
tt:tt     Current time
aaa       Decimal number denoting test that failed
bbb       Decimal number denoting error detected
ccc       Unit number of drive being tested
FRU1      Most likely Field Replaceable Unit (FRU)
FRU2      Next most likely FRU
dddddd    Name of Field Replaceable Unit
MA        Media Address
eeeeee    Octal number denoting offset within block
yyyyyy    Octal number denoting data expected
zzzzzz    Octal number denoting data actually found

The first line of the error message contains general information
concerning the error. The second line describes the nature of
the error. Lines 1 and 2 are mandatory and appear in all error
messages. Line 3 and any succeeding lines display additional
information and are optional.
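
The first line of the generic format can be picked apart mechanically, for example when scanning console logs. The following is a minimal sketch, not an official tool: the function name and the regular expression are assumptions, and the pattern accepts both the T#aaa form of the format description and the spaced form (T 005 E 035) that appears in the error message examples in this chapter.

```python
import re
from typing import Optional

HEADER_RE = re.compile(
    r"^(?P<program>[A-Z0-9]+)>"          # diagnostic prompt, e.g. ILDISK
    r"(?P<initiated>[DAP])>"             # D, A, or P
    r"(?P<time>\d\d:\d\d)\s+"            # tt:tt
    r"T\s*#?\s*(?P<test>\d+)\s+"         # test number (decimal)
    r"E\s*#?\s*(?P<error>\d+)"           # error number (decimal)
    r"(?:\s+U-\s*(?P<unit>\S+))?\s*$"    # optional unit field
)

INITIATION = {"D": "demand", "A": "automatic", "P": "periodic"}


def parse_error_header(line: str) -> Optional[dict]:
    """Return the fields of an error message first line, or None if no match."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return {
        "program": match.group("program"),
        "initiated": INITIATION[match.group("initiated")],
        "time": match.group("time"),
        "test": int(match.group("test")),
        "error": int(match.group("error")),
        "unit": match.group("unit"),
    }


if __name__ == "__main__":
    print(parse_error_header("ILDISK>D>09:35 T 005 E 035 U-D00082"))
```

Lines that do not match (the text line, FRU line, and optional lines) return None, so a caller can treat them as continuation lines of the preceding header.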
5.2 INLINE RX33 DIAGNOSTIC TEST (ILRX33)
The Inline RX33 diagnostic tests either of the RX33 drives
attached to the HSC70. This test runs concurrently with other
HSC70 processes and uses the services of the HSC70 Control
Program and the Diagnostic Execution Monitor (DEMON). The Inline
RX33 test performs several writes and reads to verify the RX33
internal data paths and read/write electronics.
5.2.1 ILRX33 System Requirements
Hardware requirements include:

o  P.ioj (processor) module with HSC70 boot ROMs

o  At least one M.std2 (memory) module

o  RX33 controller with at least one working drive

o  Console terminal

NOTE
A scratch diskette is not required. This test
does not destroy any data on the system software
diskette.

This program tests only the RX33 and the data path (serial line)
between the P.ioj and the RX33. All other system hardware is
assumed working.

Software requirements include:

o  HSC70 Control Program

o  Diagnostic Execution Monitor (DEMON)

5.2.2 ILRX33 Operating Instructions
To start ILRX33, type a CTRL Y. The keyboard monitor responds
with a KMON prompt (HSC>). Next, type either RUN ILRX33 or RUN
DX0:ILRX33 followed by a carriage return to initiate the Inline
RX33 Test.
If the Inline RX33 Test cannot load from the specified diskette,
try loading the test from the other diskette. For example, if
RUN ILRX33 fails, try RUN DX1:ILRX33.
5.2.3 ILRX33 Test Parameter Entry
The device name of the RX33 drive to be tested is the only
parameter sought by this test. When the test is invoked, the
following prompt is displayed:
Device Name of RX33 to test (DX0:, DX1:, LB:) [] ?
NOTE
The string, LB:, indicates the RX33 drive last
used to boot the HSC70 Control Program.
One of the indicated strings must be entered. If one of these
strings is not entered, the test prints Illegal Device Name, and
the prompt is repeated.


5.2.4 ILRX33 Setting/Clearing Flags
ILRX33 only verifies that a particular RX33 drive and controller
combination is working or failing and should not be used as a
troubleshooting aid. This test does not support any flags.
Because the test always reads and writes the same block of the
diskette, looping the test would eventually result in media
damage. If the test indicates a particular controller or drive
is not operating correctly, the proper repair strategy is to
replace the drive and/or controller.

5.2.5 ILRX33 Progress Reports
At the end of the test, the following message is displayed:

ILRX33>D>tt:tt Execution Complete

where:

tt:tt = current time

5.2.6 ILRX33 Test Termination
This test is terminated by typing a ^Y (CTRL Y). The test
automatically terminates after reporting an error with one
exception. If the error displayed is RETRIES REQUIRED, the test
continues.
5.2.7 ILRX33 Error Message Example
All error messages produced by the Inline RX33 Test conform to
the HSC diagnostic error message format (Section 5.1.1.1).
Following is a typical ILRX33 error message:

ILRX33>D>00:00 T 001 E 003 U- 50182
ILRX33>D> No Diskette Mounted
ILRX33>D> FRU1-Drive
Other optional lines are found on different error messages.
5.2.8 ILRX33 Error Messages
The following paragraphs list specific information about each of
the errors produced by the Inline RX33 Test. Hints about the
possible cause of the error are provided where feasible.

Error 000 - RETRIES REQUIRED - indicates a Read or Write
operation failed when first attempted, but succeeded on
one of the retries performed automatically by the RX33
driver software. This error normally indicates the
diskette media is degrading and the diskette should be
replaced.

Error 001 - OPERATION ABORTED - is reported if the
ILRX33 test is aborted by a CTRL Y.


Error 002 - WRITE PROTECTED - indicates the RX33 drive
being tested contains a write-protected diskette. Write
enable the diskette and try again. If the diskette is
not write protected, the RX33 drive or controller is
faulty.

Error 003 - NO DISKETTE MOUNTED - indicates the RX33
drive being tested does not contain a diskette. Insert
a diskette before repeating the test. If this error is
displayed when the drive does contain a diskette, the
drive or controller is at fault.

Error 004 - HARD I/O ERROR - indicates the program
encountered a hard error while attempting to read or
write the diskette.

Error 005 - BLOCK NUMBER OUT OF RANGE - indicates the
RX33 driver detected a request to read a block number
outside the range of legal block numbers (0 through 2399
decimal). Because the Inline RX33 Test reads and writes
only disk block 001, this error may indicate a software
problem.

Error 006 - UNKNOWN STATUS
STATUS=xxx - indicates the Inline RX33 Test received a
status code it did not recognize. The octal value xxx
represents the status byte received. RX33 reads and
writes are performed for the Inline RX33 Test by the HSC
Control Program's RX33 driver software. At the
completion of each Read or Write operation, the driver
software returns a status code to the RX33 test,
describing the result of the operation. The test
decodes the status byte to produce a description of the
error.
An UNKNOWN STATUS error indicates the status value
received from the driver did not match any of the status
values known to the test. The status value returned
(xxx) is displayed to help determine the cause of the
problem. Any occurrence of this error should be
reported via a Software Performance Report (SPR). See
Appendix B for detailed information on SPR submission.

Error 007 - DATA COMPARE ERROR
MA -aaaaaa
EXP-bbbbbb
ACT-cccccc - indicates data written to the diskette does
not agree with the data subsequently read back. The
field aaaaaa represents the address of the failing word
within the block (512 bytes) that was read. The field
bbbbbb represents the data written to the word and the
field cccccc represents the data read back from the
word. Because this test only reads and writes block 1
of the diskette, all failures occur while trying to
access physical block 1.


Error 008 - ILLEGAL DEVICE NAME - indicates the user
specified an illegal device name when the program
prompted for the name of the drive to be tested. Legal
device names include: DX0:, DX1:, and LB:. LB:
indicates the drive from which the system was last
booted. After displaying this error, the program again
prompts for a device name. Enter one of the legal
device names to continue the test.

5.2.9 ILRX33 Test Summary
The test summary for this diagnostic is contained in the
following paragraphs.
o  Test 001 - Read/Write Test - verifies data can be
written to the diskette and read back correctly. All
reads and writes access physical block 1 of the RX33
(the RT-11 Volume ID Block). This block is not used by
the HSC operating software.
Initially, the contents of block 1 are read and saved.
Then three different data patterns are written to block
1, read back, and verified. This checks the read/write
electronics in the drive and the internal data path
between the RX33 controller and the drive. Following
the Read/Write Test, the original contents of block 1
are written back to the diskette.
If the data read back from the diskette does not match
the data written, a Data Compare Error is generated.
The error report lists the word (MA) in error within the
block together with the expected (EXP) and actual (ACT)
contents of the word.
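
The save, pattern, verify, and restore sequence of Test 001 can be sketched as follows. This is an illustration under stated assumptions, not the ILRX33 code: the three pattern values are hypothetical (the manual does not list them), the `read_block` and `write_block` callables stand in for the RX33 driver, and MA is reported here as a byte offset within the block.

```python
from typing import Callable, Optional

BLOCK_SIZE = 512   # bytes per block, as stated under Error 007
TEST_BLOCK = 1     # the test only touches physical block 1
PATTERNS = (0o125252, 0o052525, 0o177777)  # hypothetical 16-bit patterns


def read_write_test(
    read_block: Callable[[int], bytes],
    write_block: Callable[[int, bytes], None],
) -> Optional[str]:
    """Run the pattern test on block 1; return a compare error line or None."""
    saved = read_block(TEST_BLOCK)  # save the original contents first
    try:
        for pattern in PATTERNS:
            fill = pattern.to_bytes(2, "little") * (BLOCK_SIZE // 2)
            write_block(TEST_BLOCK, fill)
            data = read_block(TEST_BLOCK)
            for offset in range(0, BLOCK_SIZE, 2):
                actual = int.from_bytes(data[offset:offset + 2], "little")
                if actual != pattern:
                    # Octal fields, in the MA/EXP/ACT style of Error 007.
                    return f"MA -{offset:06o} EXP-{pattern:06o} ACT-{actual:06o}"
        return None
    finally:
        write_block(TEST_BLOCK, saved)  # always restore the original block


if __name__ == "__main__":
    disk = {TEST_BLOCK: bytes(BLOCK_SIZE)}  # in-memory stand-in for a diskette
    result = read_write_test(lambda n: disk[n],
                             lambda n, d: disk.__setitem__(n, bytes(d)))
    print(result)
```

The `finally` clause mirrors the statement that the original contents of block 1 are written back after the test, including when a compare error ends the test early.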

5.3 INLINE MEMORY TEST (ILMEMY)
The Inline Memory test is designed to test HSC70 data buffers.
This test can be initiated automatically or on demand. It is
initiated automatically to test data buffers that produced a
parity error when in use by the HSC70 Control Program. Buffers
that fail the memory test are removed from service by sending
them to the Disabled Buffer Queue. Buffers sent twice to this
test, but not failing the memory test are also sent to the
Disabled Buffer Queue. Buffers that pass the memory test and
have not been tested previously are sent to the Free Buffer Queue
for further use by the HSC70 Control Program.
When the test is initiated on demand, any buffers on the Disabled
Buffer Queue are tested, and the results of the test are
displayed on the terminal from which the test was initiated.


This test runs concurrently with other HSC70 processes and uses
the services of the HSC70 Control Program and the Diagnostic
Execution Monitor (DEMON).
5.3.1 ILMEMY System Requirements
Hardware requirements include:

o  P.ioj (processor) module with HSC70 boot ROMs

o  At least one M.std2 (memory) module

o  RX33 controller with at least one working drive

o  A console terminal (demand initiation only)

This program only tests data buffers located in the HSC70 Data
memory. All other system hardware is assumed to be working.

Software requirements include:

o  HSC70 Control Program (System diskette)

o  Diagnostic Execution Monitor (DEMON)

5.3.2 ILMEMY Operating Instructions
To start this test, type a CTRL Y to get the attention of the
HSC70 keyboard monitor. The keyboard monitor responds to the
CTRL Y with a prompt:
HSC70>
Type RUN DX0:ILMEMY and a carriage return to initiate the Inline
Memory Test. This program has no user-supplied parameters or
flags.
If the Inline Memory test is not contained on the specified
diskette (DX1:), an error message is displayed.
5.3.3 ILMEMY Progress Reports
Error messages are displayed as needed. At the end of the test,
the following message is displayed (by DEMON):
ILMEMY>D>tt:tt Execution Complete
where:

tt:tt = current time


5.3.4 ILMEMY Error Message Example
All error messages produced by the Inline Memory test conform to
the HSC70 diagnostic error message format (Section 5.1.1.1).
Following is a typical ILMEMY error message:
ILMEMY>A>09:33 T 001 E 000
ILMEMY>A>Tested Twice with no Error (Buffer Retired)
ILMEMY>A>FRU1-M.std2 FRU2-
ILMEMY>A>Buffer Starting Address (physical) = 15743600
ILMEMY>A>Buffer Ending Address (physical) = 15744776
5.3.5 ILMEMY Error Messages
The following list shows specific information about each of the
errors displayed by the Inline Memory Test.
o

Error 000 TESTED TWICE WITH NO ERROR - indicates the
buffer under test passed the memory test. However, this
is the second time the buffer was sent to the memory
test and passed it. Because the buffer has a history of
two failures while in use by the Control Program yet
does not fail the memory test, intermittent failures on
the buffer are assumed. The buffer is retired from
service and sent to the Disabled Buffer Queue.

Error 001 RETURNED BUFFER TO FREE BUFFER QUEUE -
indicates a buffer failed during use by the Control
Program but the Inline Memory test detected no error.
Because this is the first time the buffer was sent to
the Inline Memory test, it is returned to the Free
Buffer Queue for further use by the HSC70 Control
Program. The address of the buffer is stored by the
Inline Memory test in case the buffer again fails when
in use by the Control Program.

Error 002 MEMORY PARITY ERROR - indicates a parity error
occurred while testing a buffer. The buffer is retired
from service and sent to the Disabled Buffer Queue.

Error 003 MEMORY DATA ERROR - indicates the wrong data
was read while testing a buffer. The buffer is retired
from service and sent to the Disabled Buffer Queue.

5.3.6 ILMEMY Test Summaries
Test 001 receives a queue of buffers for testing. If the Inline
Memory test is initiated automatically, the queue consists of
buffers from the Suspect Buffer Queue.
When the HSC70 Control Program detects a parity error in a data
buffer, the buffer is sent to the Suspect Buffer Queue. While on
this queue, the buffer is not used for data transfers. The HSC70
Continuous Scheduler periodically checks the Suspect Buffer Queue
to see if it contains any buffers. If buffers are found on the
queue, they are removed, and the Inline Memory test is
automatically initiated to test those buffers.
If the ILMEMY test is initiated on demand, it retests only
buffers already known as disabled (a rather useless exercise).
If the test is initiated automatically, and the buffer passes the
test, the program checks to see if this is the second time the
buffer was sent to the Inline Memory test. If this is the case,
the buffer is probably producing intermittent errors. The buffer
is retired from service and sent to the Disabled Buffer Queue.
If this is the first time the buffer is sent to the Inline Memory
test, it is returned to the Free Buffer Queue for further use by
the HSC70 Control Program. In this last case, the address of the
buffer is saved in case the buffer again fails and is sent to the
Inline Memory test a second time.
When all buffers on the test queue are tested, the Inline Memory
Test terminates.
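
The buffer disposition rules for automatic initiation can be summarized in a short sketch. The class and method names are hypothetical, and the queues are modeled as plain Python lists; only the decision logic follows the description above.

```python
class BufferTriage:
    """Model of where ILMEMY sends a suspect buffer after testing it."""

    def __init__(self) -> None:
        self.passed_once = set()   # addresses saved after a first clean pass
        self.free_queue = []       # stands in for the Free Buffer Queue
        self.disabled_queue = []   # stands in for the Disabled Buffer Queue

    def dispose(self, address: int, passed_test: bool) -> str:
        """Route one tested buffer and return a short reason string."""
        if not passed_test:
            # Parity or data error during the test: retire the buffer.
            self.disabled_queue.append(address)
            return "failed memory test, buffer retired"
        if address in self.passed_once:
            # Second visit with no error: assume intermittent failures.
            self.disabled_queue.append(address)
            return "tested twice with no error, buffer retired"
        # First visit with no error: remember the address and reuse the buffer.
        self.passed_once.add(address)
        self.free_queue.append(address)
        return "returned buffer to free buffer queue"


if __name__ == "__main__":
    triage = BufferTriage()
    print(triage.dispose(0o15743600, passed_test=True))
    print(triage.dispose(0o15743600, passed_test=True))
```

Remembering the address after the first clean pass is what lets the second visit be recognized, which matches the statement that the buffer address is saved in case the buffer fails again.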
5.4 INLINE DISK DRIVE DIAGNOSTIC TEST (ILDISK)
The Inline Disk Drive Diagnostic (ILDISK) isolates disk
drive-related problems to one of the following three Field
Replaceable Units (FRUs):
1. Disk drive

2. SDI cable

3. HSC Disk Data Channel module

The Inline Disk Drive Diagnostic runs in parallel with disk I/O
from a Host CPU. However, the drive being diagnosed cannot be
Online to any host. This diagnostic can be initiated upon demand
via the console terminal or automatically by the HSC70 Control
Program when an unrecoverable disk drive failure occurs.
Currently, ILDISK is automatically invoked by default whenever
(with one exception) a drive is declared inoperative. The
exception is if a drive is declared inoperative while in use by a
diagnostic or utility. Automatic initiation of ILDISK can be
inhibited by issuing the SETSHO command, SET AUTO DISABLE. If
the SET AUTO DISABLE command is issued, ILMEMY (a test for
suspect buffers) is also disabled. For this reason, leaving
ILDISK automatically enabled is preferable.
The tests performed vary, depending on whether the drive is known
to the HSC70 Control Program.
1. DRIVE UNKNOWN - to the HSC70 Control Program. It is
either unable to communicate with the HSC70 or was
communicating and declared inoperative when it failed
during use by the HSC70. In this case, because the
drive cannot be identified by unit number, the user must
supply the requestor number and port number of the
drive. Then the SDI verification tests can execute.
The SDI verification tests check the path between the
K.sdi and the disk drive and command the drive to run
its self-test diagnostics. If the SDI verification
tests fail, the most probable FRU is identified in the
error report. If the SDI verification tests pass,
presume the drive is the FRU.

2. DRIVE KNOWN - to the HSC70 Control Program (i.e.,
identifiable by unit number). Read/write/format tests
are performed in addition to the SDI verification tests.
If an error is detected, the most probable FRU is
identified in the error report. If no errors are
detected, presume the FRU is the drive.

5.4.1 ILDISK System Requirements
Software requirements of this test include the HSC70 Control
Program, the Control Program disk functional code, and DEMON.
Hardware requirements include the disk drive and a disk data
channel, connected by an SDI cable. The test assumes the I/O
Control Processor module and the memory module are working.

A service manual for the disk drive is required to interpret
errors that occur in the drive's self-test diagnostics.

5.4.2 ILDISK Operating Instructions
Use the following steps to initiate ILDISK:

1. Type a CTRL Y.

2. In response to the prompt
HSC70>
type RUN DX0:ILDISK, followed by a carriage return.

3. Wait until ILDISK is read from the system software load
media into the HSC70 Program memory.

4. Enter parameters after ILDISK is started. Refer to
Section 5.4.4.

5.4.3 ILDISK Availability
If a diskette containing the Inline Disk Drive Diagnostic is not
loaded when you enter the R ILDISK command, an error message is
displayed. Insert the Operating System diskette containing
ILDISK and repeat Section 5.4.2.
5.4.4 ILDISK Test Parameter Entry
Upon demand initiation, ILDISK first prompts:
DRIVE UNIT NUMBER (U) [] ?
Enter the unit number of the disk drive for test. Unit numbers
are in the form Dnnnn, where nnnn is a decimal number between 0
and 4095 corresponding to the number printed on the drive unit
plug. Terminate the unit number response with a carriage return.
ILDISK attempts to acquire the specified unit via the HSC70
Diagnostic Interface. If the unit is acquired successfully,
ILDISK next prompts for the drive diagnostic to be executed. If
the acquire fails, one of the following conditions was
encountered:
1. The specified drive is UNAVAILABLE. This indicates the
drive is connected to the HSC70 but is currently online
to a host CPU or an HSC70 utility. Online drives cannot
be diagnosed. ILDISK repeats the prompt for the unit
number.

2. The specified drive is UNKNOWN to the HSC70 Disk
Functional software. Drives are UNKNOWN for one of the
following reasons:

o  The drive and/or disk data channel port is broken
and cannot communicate with the disk functional
software.

o  The drive was previously communicating with the
HSC70 but a serious error occurred, and the HSC70
has ceased communicating with the drive (marked the
drive as inoperative).

In either case, ILDISK asks if you desire to enter a
requestor number and port number. Refer to Section
5.4.5.
After receiving the unit number (or requestor and port), ILDISK
prompts:
RUN A SINGLE DRIVE DIAGNOSTIC (Y/N) [N] ?
Typing a carriage return causes the drive to execute its entire
diagnostic set. Typing a Y followed by a carriage return
executes a single drive diagnostic. If a single drive diagnostic
is selected, the test prompts:

DRIVE TEST NUMBER (H) [] ?
Enter a hexadecimal number specifying the drive diagnostic to be
executed. Consult the appropriate disk maintenance or service
manual to determine the number of the test to perform. Entering
a test number not supported by the drive results in an error #13
generated in Test 5.
The test prompts for the number of passes to perform:
# OF PASSES TO PERFORM (1 to 32767) (D) [1] ?

Enter a decimal number between 1 and 32767 specifying the number
of test repetitions. Terminate the response with a carriage
return. Typing a carriage return, without entering a number,
runs the test once.

5.4.5 Specifying Requestor And Port - ILDISK
Drives unknown to the HSC70 disk functional software are tested
by specifying the requestor number and port number of the drive.
Requestor number is any number 2 through 9 specifying the disk
data channel connected to the drive under test.
Port number is 0 through 3 specifying which of four disk data
channel ports is connected to the drive under test. The
requestor number and port number can be determined in one of two
ways:
1. By tracing the SDI cable from the desired disk drive to
the HSC70 bulkhead connector, then tracing the bulkhead
connector to a specific port on one of the disk data
channels.

2. By using the SHOW DISKS command to display the requestor
and port numbers of all known drives. To use this
method, exit ILDISK by typing a CTRL Y. Type SHOW DISKS
in response to the HSC70 prompt. This command displays
a list of all known drives including the requestor
number and port number for each drive. Each disk data
channel has four possible ports to which a drive can be
connected. By inference, the port number of the unknown
unit must be one not listed in the SHOW DISKS display
(assuming the unknown drive is not connected to a
defective disk data channel). A defective disk data
channel illuminates the red LED on the lower front edge of
the module. Refer to Chapter 2.

After a requestor number and a port number are supplied to
ILDISK, the program checks to ensure the specified requestor and
port do not match any drive known to the HSC70 software. If the
requestor and port do not match a known drive, ILDISK prompts for
the number of passes to perform, as described in Section 5.4.4.
If the requestor and port do match a known drive, ILDISK reports
Error 08.
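
The ranges stated for the ILDISK parameters (unit numbers of the form Dnnnn from 0 to 4095, requestor 2 through 9, port 0 through 3, and 1 to 32767 passes with a default of 1) can be checked with a few lines of code. This is a hedged sketch: the helper names are hypothetical, and the wording of the exceptions is illustrative rather than actual ILDISK output.

```python
import re


def parse_unit_number(text: str) -> int:
    """Accept a unit number of the form Dnnnn and return nnnn as an integer."""
    match = re.fullmatch(r"D(\d+)", text.strip())
    if match is None:
        raise ValueError("unit number is not of the form Dnnnn")
    unit = int(match.group(1))
    if unit > 4095:
        raise ValueError("unit number must be between 0 and 4095")
    return unit


def check_requestor_and_port(requestor: int, port: int) -> None:
    """Reject requestor or port values outside the documented ranges."""
    if not 2 <= requestor <= 9:
        raise ValueError("requestor number must be 2 through 9")
    if not 0 <= port <= 3:
        raise ValueError("port number must be 0 through 3")


def parse_pass_count(text: str) -> int:
    """Return the number of passes; an empty response selects the default of 1."""
    text = text.strip()
    if text == "":
        return 1
    count = int(text, 10)  # the (D) marker means the response is decimal
    if not 1 <= count <= 32767:
        raise ValueError("number of passes must be 1 to 32767")
    return count


if __name__ == "__main__":
    print(parse_unit_number("D0082"))
    check_requestor_and_port(4, 3)
    print(parse_pass_count(""))
```

Checking a planned requestor, port, and pass count against these ranges before starting ILDISK can save a round trip through the prompts.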
5.4.6 ILDISK Progress Reports
ILDISK produces an end-of-pass report at the completion of each
pass of the diagnostic. One pass of the program can take several
minutes depending upon the type of drive being diagnosed.
5.4.7 ILDISK Test Termination
ILDISK is terminated by typing a CTRL Y or CTRL C. A CTRL Y/CTRL
C may not take effect immediately because certain parts of the
program cannot be interrupted. An example would be during SDI
commands. Two minutes may be necessary to respond to a CTRL Y or
CTRL C if either is entered while an SDI DRIVE DIAGNOSE command
is in progress.
5.4.8 ILDISK Error Message Example
All error messages produced by the Inline Disk Drive diagnostic
conform to the HSC70 diagnostic error message format (Section
5.1.1.1). Following is a typical ILDISK error message.
ILDISK>D>09:35 T 005 E 035 U-D00082
ILDISK>D>Drive Diagnostic Detected Fatal Error
ILDISK>D>FRU1-Drive FRU2-
ILDISK>D>Requestor Number 04
ILDISK>D>Port Number 03
ILDISK>D>Test 0025 Error 007F
ILDISK>D>End Of Pass 00001
5.4.9 ILDISK Error Messages
Messages produced by ILDISK are described in the following list:
o  Error 01 DDUSUB INITIALIZATION FAILURE - The HSC70
diagnostic interface did not initialize. Error 01 is
not recoverable and is caused by:

1. Insufficient memory to allocate buffers and control
structures required by the diagnostic interface.

2. HSC Disk Functional software is not loaded.

Error 02 UNIT SELECTED IS NOT A DISK - The response to
the unit number prompt was not of the form Dnnnn. Refer
to Section 5.4.4.


Error 03 DRIVE UNAVAILABLE - The selected disk drive is
not available for diagnostic use.

Error 04 UNKNOWN STATUS FROM DDUSUB - A call to the
diagnostic interface resulted in the return of an
unknown status code. This indicates a software error
and should be reported via a Software Performance Report
(SPR). See Appendix B for detailed information on SPR
submission.

Error 05 DRIVE UNKNOWN TO DISK FUNCTIONAL CODE - The
disk drive selected is not known to the HSC Disk
Functional software. The drive may not be communicating
with the HSC, or the disk functional software may have
disabled the drive due to an error condition.
ILDISK
prompts the user for the drive's requestor and port.
Refer to Section 5.4.5 for information on specifying
requestor and port.

Error 06 INVALID REQUESTOR OR PORT NUMBER SPECIFIED -
The requestor number given was not in the range 2
through 9, or the port number given was not in the range
0 through 3. Specify a requestor and port within the
allowable ranges.

Error 07 REQUESTOR SELECTED IS NOT A K.SDI - The
requestor specified was not a Disk Data Channel (K.sdi).
Specify a requestor that contains a Disk Data Channel.

Error 08 SPECIFIED PORT CONTAINS A KNOWN DRIVE - The
requestor and port specified contain a drive known to
the HSC Disk Functional software. The unit number of
the drive is supplied in the report.
ILDISK does not
allow testing a known drive via requestor number and
port number.

Error 09 DRIVE CAN'T BE BROUGHT ONLINE - A failure
occurred when ILDISK attempted to bring the specified
drive Online. One of the following conditions occurred:
1. UNIT IS OFFLINE - The specified unit went to the
OFFLINE state and now cannot communicate with the
HSC70.

2. UNIT IS IN USE - The specified unit is now marked as
in use by another process.

3. UNIT IS A DUPLICATE - Two disk drives are connected
to the HSC70, both with the same unit number.

4. UNKNOWN STATUS FROM DDUSUB - The HSC70 diagnostic
interface returned an unknown status code when
ILDISK attempted to bring the drive Online. Refer
to Error 04 for related information on this error.


Error 10 K.SDI DOES NOT SUPPORT MICRODIAGNOSTICS - The
K.sdi connected to the drive under test does not support
microdiagnostics. This indicates the K.sdi microcode is
not at the latest revision level. This is not a fatal
error, but the K.sdi should probably be updated with the
latest microcode to improve error detection
capabilities.

Error 11 CHANGE MODE FAILED - ILDISK issued an SDI
CHANGE MODE command to the drive and the command failed.
The drive is presumed the failing unit, because the SDI
interface was previously verified.

Error 12 DRIVE DISABLED BIT SET - The SDI verification
test issued an SDI GET STATUS command to the drive under
test. The Drive Disabled bit was set in the status
returned by the drive, indicating the drive detected a
serious error and is now disabled.

Error 13 COMMAND FAILURE - The SDI verification test
detected a failure while attempting to send an SDI
command to the drive. One of the following occurred:

DID NOT COMPLETE - The drive did not respond to the
command within the allowable time. Further SDI
operations to the drive are disabled.

K.SDI DETECTED ERROR - The K.sdi detected an error
condition while sending the command or while
receiving the response.

UNEXPECTED RESPONSE - The SDI command resulted in an
unexpected response from the drive. This error can
be caused by a DIAGNOSE command if a single drive
diagnostic was selected, and the drive does not
support the specified test number.

Error 14 CAN'T WRITE ANY SECTOR ON TRACK - As part of
test 04, ILDISK attempts to write a pattern to at least
one sector of each track in the Read/Write area of the
drive DBN space. (DBN space is an area on every disk
drive reserved for diagnostic use.) During the write
process, ILDISK detected a track with no sector that
passed the Read/Write test. (ILDISK could not write a
pattern and read it back successfully on any sector on
the track.) The error information for the last sector
accessed is identified in the error report. The most
probable cause of this error is a disk media error.
If test 03 also failed, the problem could be in the disk
Read/Write electronics, or the DBN area of the disk may
not be formatted correctly. To interpret the MSCP
status code, refer to Section 5.4.9.1.


Error 15 READ/WRITE READY NOT SET IN ONLINE DRIVE - The
SDI verification test executed a command to interrogate
the Real Time Drive State line of the drive. The line
status reported the drive was in the Online state, but
the Read/Write Ready bit was not set in the status.

Error 16 ERROR RELEASING DRIVE - ILDISK attempted to
release the drive under test. The release operation
failed. One of the following occurred:

1. COULD NOT DISCONNECT - An SDI DISCONNECT command to
the drive failed.

2. UNKNOWN STATUS FROM DDUSUB - Refer to Error 04.

Error 17 INSUFFICIENT MEMORY, TEST NOT EXECUTED - The
SDI verification test could not acquire sufficient
memory for control structures. The SDI verification
test could not be executed. Use the SETSHO command,
SHOW MEMORY, to display available HSC memory. If any
disabled memory appears in the display, consider further
testing of the memory module. If no disabled memory is
displayed, and no other diagnostic or utility is active
on this HSC, submit an SPR.

Error 18 K MICRODIAGNOSTIC DID NOT COMPLETE - The SDI
verification test directed the disk data channel to
execute one of its microdiagnostics. The
microdiagnostic did not complete within the allowable
time. All drives connected to the disk data channel may
now be unusable (if the microdiagnostic never
completes), and the HSC70 probably must be rebooted.
The disk data channel module is the probable failing
FRU.

Error 19 K MICRODIAGNOSTIC REPORTED ERROR - The SDI
verification test directed the disk data channel to
execute one of its microdiagnostics. The
microdiagnostic completed and reported an error. The
disk data channel is the probable FRU.

Error 20 DCB NOT RETURNED, K FAILED FOR UNKNOWN REASON -
The SDI verification test directed the disk data channel
to execute one of its microdiagnostics. The
microdiagnostic completed without reporting any error,
but the disk data channel did not return the Dialogue
Control Block (DCB). All drives connected to the disk
data channel may now be unusable. The disk data channel
is the probable FRU and the HSC70 will probably have to
be rebooted.

Error 21 ERROR IN DCB ON COMPLETION - The SDI
verification test directed the disk data channel to
execute one of its microdiagnostics. The
microdiagnostic completed without reporting any error,
but the disk data channel returned the Dialogue Control
Block (DCB) with an error indicated. The disk data
channel is the probable FRU.

Error 22 UNEXPECTED ITEM ON DRIVE SERVICE QUEUE - The
SDI verification test directed the disk data channel to
execute one of its microdiagnostics. The
microdiagnostic completed without error, and the disk
data channel returned the Dialogue Control Block with no
errors indicated. However, the disk data channel sent
the Drive State Area to its service queue, indicating an
unexpected condition in the disk data channel or drive.

Error 23 FAILED TO REACQUIRE UNIT - In order for ILDISK
to allow looping, the drive under test must be released
and then reacquired. (This method is required to
release the drive from the Online state.) The release
operation succeeded, but the attempt to reacquire the
drive failed. One of the following conditions occurred:
1.  DRIVE UNKNOWN TO DISK FUNCTIONAL CODE - A fatal
    error caused the HSC70 Disk Functional software to
    declare the drive inoperative, hence the drive unit
    number is not recognized. The drive must now be
    tested by specifying requestor and port number.

2.  DRIVE UNAVAILABLE - The specified drive is now not
    available for diagnostic use.

3.  UNKNOWN STATUS FROM DDUSUB - Refer to Error 04.

Error 24 STATE LINE CLOCK NOT RUNNING - The SDI
verification test executed a command to interrogate the
Real Time Drive State of the drive. The returned status
indicates the drive is not sending State Clock to the
disk data channel. Either the port, SDI cable, or drive
is defective or the port is not connected to a drive.

Error 25 ERROR STARTING I/O OPERATION - ILDISK detected
an error when initiating a disk read or write operation.
One of the following conditions occurred:
1.  INVALID HEADER CODE - ILDISK did not supply a valid
    header code to the HSC70 diagnostic interface. This
    indicates a software error and should be reported
    via a Software Performance Report (SPR). See
    Appendix B for detailed information on SPR
    submission.

2.  COULD NOT ACQUIRE CONTROL STRUCTURES - The HSC70
    diagnostic interface could not acquire sufficient
    control structures to perform the operation.

3.  COULD NOT ACQUIRE BUFFER - The HSC70 diagnostic
    interface could not acquire a buffer needed for the
    operation.

4.  UNKNOWN STATUS FROM DDUSUB - The HSC70 diagnostic
    interface returned an unknown status code. Refer to
    Error 04.

NOTE
If problems 2 and 3 persist, retry ILDISK during a
period of lower HSC activity.

Error 26 INIT DID NOT STOP STATE LINE CLOCK - The SDI
verification test sent an SDI INITIALIZE command to the
drive. When the drive receives this command, it should
momentarily stop sending State Line Clock to the disk
data channel. The disk data channel did not see the
State Line Clock stop after sending the Initialize. The
drive is the most probable FRU.

Error 27 STATE LINE CLOCK DID NOT START UP AFTER INIT -
The SDI verification test sent an SDI INITIALIZE to the
drive. When the drive receives this command, it should
momentarily stop sending State Clock to the disk data
channel. The disk data channel saw the State Clock
stop, but the clock never restarted. The drive is the
most probable FRU.

Error 28 I/O OPERATION LOST - While ILDISK was waiting
for a disk read or write operation to complete, the
HSC70 diagnostic interface notified ILDISK that no I/O
operation was in progress. This error may be induced by
a hardware failure but indicates a software problem that
should be reported by a Software Performance Report
(SPR). See Appendix B for detailed information on SPR
submission.

Error 29 ECHO DATA ERROR - The SDI verification test
issued an SDI ECHO command to the drive. The command
completed but the wrong response was returned by the
drive. The SDI set and the disk drive are the probable
FRUs.

Error 30 DRIVE WENT OFFLINE - The drive, previously
acquired by the diagnostic, is now unknown to the disk
functional code. This indicates the drive spontaneously
went Offline or stopped sending clocks and is now
unknown. The test should be restarted using the
requestor and port numbers instead of drive unit number.

Error 31 DRIVE ACQUIRED BUT CAN'T FIND CONTROL AREA -
The disk drive was acquired, and ILDISK obtained the
requestor number and port number of the drive from the
HSC70 diagnostic interface. However, the specified
requestor does not have a control area. This indicates
a software problem and should be reported via a Software
Performance Report (SPR). See Appendix B for detailed
information on SPR submission.

Error 32 REQUESTOR DOES NOT HAVE CONTROL AREA - ILDISK
cannot find a control area for the requestor supplied by
the user. One of the following conditions exists:
1.  The HSC70 does not contain a disk data channel (or
    other type of requestor) in the specified requestor
    position.

2.  The disk data channel (or other type of requestor)
    in the specified requestor position failed its
    initialization diagnostics and is not in use by the
    HSC70.

Open the HSC70 front door and remove the cover from the
card cage. Locate the module slot in the card cage that
corresponds to the requestor. Refer to the module
utilization label above the card cage to help locate the
proper requestor. If a blank module (air baffle) is in
the module slot, the HSC70 does not contain a requestor
in the specified position. If a requestor is in the
module slot, check whether the red LED on the lower
front edge of the module is lit. If it is lit, the
requestor failed and was disabled by the HSC70. If the
red LED is not lit, a software problem exists and should
be reported via a Software Performance Report (SPR). See
Appendix B for detailed information on SPR submission.

Error 33 CAN'T READ ANY SECTOR ON TRACK - As part of
Test 03, ILDISK attempts to read a pattern from at least
one sector of each track in the Read Only area of the
drive DBN space.
(DBN space is an area on every disk
drive reserved for diagnostic use.) All drives have the
same pattern written to each sector in the Read Only DBN
space.
During the read process, ILDISK detected a track that
does not contain any sector with the expected pattern.
Either ILDISK detected errors while reading or the read
succeeded, but the sectors did not contain the correct
pattern. The error information for the last sector
accessed is supplied in the error report. The most
likely cause of this error is a disk media error. If
Test 04 also fails, the problem may be in the disk
Read/Write electronics, or the DBN area of the disk may
not be formatted correctly. To interpret the MSCP
status code, refer to Section 5.4.9.1.

Error 34 DRIVE DIAGNOSTIC DETECTED ERROR - The SDI
verification test directed the disk drive to run an
internal diagnostic. The drive indicated the diagnostic
failed, but the error is not serious enough to warrant
removing the drive from service. The test number and
error number for the drive are displayed (in hex) in the
error report. For the exact meaning of each error,
refer to the service manual for that drive.

Error 35 DRIVE DIAGNOSTIC DETECTED FATAL ERROR - The SDI
verification test directed the disk drive to run an
internal diagnostic. The drive indicated the diagnostic
failed and the error is serious enough to warrant
removing the drive from service. The test and error
number are displayed (in hex) in the error report.
For
the exact meaning of each error, refer to the service
manual for that drive.

Error 36 ERROR BIT SET IN DRIVE STATUS ERROR BYTE - The
SDI verification test executed an SDI GET STATUS command
to the drive under test. The error byte in the returned
status was nonzero, indicating one of the following
conditions:
1.  Drive error

2.  Transmission error

3.  Protocol error

4.  Initialization diagnostic failure

5.  Write lock error

For the exact meaning of each error, refer to the
service manual for that drive.

Error 37 ATTENTION SET AFTER SEEK - The SEEK command
issued to the drive by the SDI verification routine
completed but resulted in an unexpected ATTENTION
condition. The drive status is displayed with the error
report.

Error 38 AVAILABLE NOT SET IN AVAILABLE DRIVE - The SDI
verification routine executed a command to interrogate
the Real Time Drive State line of the drive. ILDISK
found Available is not set in a drive that should be
available.


Error 39 ATTENTION NOT SET IN AVAILABLE DRIVE - The SDI
verification routine executed a command to interrogate
the Real Time Drive State line of the drive and found
Attention is not asserted even though the drive is
Available.

Error 40 RECEIVER READY NOT SET - The SDI verification
routine executed a command to interrogate the Real Time
Drive State line of the drive. The routine expected to
find Receiver Ready asserted but it was not.

Error 41 READ/WRITE READY SET IN AVAILABLE DRIVE - The
SDI verification routine executed a command to
interrogate the Real Time Drive State line of the drive
and found Available asserted. However, Read/Write Ready
was also asserted. Read/Write Ready should never be
asserted when a drive is in the Available state.

Error 42 AVAILABLE SET IN ONLINE DRIVE - The SDI
verification routine issued an ONLINE command to the
disk drive. Then a command was issued to interrogate
the Real Time Drive State line of the drive. The line
status indicates the drive is still asserting Available.

Error 43 ATTENTION SET IN ONLINE DRIVE - The SDI
verification routine issued an ONLINE command to the
drive. The drive entered the Online state, but an
unexpected Attention condition was encountered.

Error 44 DRIVE CLEAR DID NOT CLEAR ERRORS - When ILDISK
issued a GET STATUS command, error bits were set in the
drive response.
Issuing a DRIVE CLEAR failed to clear
the error bits. The drive is the probable FRU.

Error 45 ERROR READING LBN - As part of Test 14, ILDISK
alternates between reading DBNs and LBNs. This tests
the drive's ability to seek properly. The error
indicates an LBN read failed. The drive is the probable
FRU.

Error 46 ECHO FRAMING ERROR - The framing code (upper
byte) of an SDI ECHO command response is incorrect. The
expected and actual ECHO frames are displayed with the
error message. The SDI set and the drive are the
probable FRUs.

Error 47 K.SDI DOES NOT SUPPORT ECHO - The disk data
channel connected to the drive under test does not
support the SDI ECHO command because the disk data
channel microcode is not the latest revision level.
This is not a fatal error, but the disk data channel
microcode should be updated to allow for improved
isolation of drive-related errors.


Error 48 REQ/PORT NUMBER INFORMATION UNAVAILABLE -
ILDISK was unable to obtain the requestor number and
port number from HSC70 disk software tables. The drive
may have changed state and disappeared while ILDISK was
running. This error can also be caused by
inconsistencies in HSC70 software structures.

Error 49 DRIVE SPINDLE NOT UP TO SPEED - ILDISK cannot
continue testing the drive because the disk spindle is
not up to speed. If the drive is spun down, it must be
spun up before ILDISK can completely test the unit. If
the drive appears to be spinning, it may be spinning too
slowly or the drive may be returning incorrect status
information to the HSC70.

Error 50 CAN'T ACQUIRE DRIVE STATE AREA - ILDISK cannot
perform the low-level SDI tests, because it cannot
acquire the drive state area for the drive. The drive
state area is a section of the K Control Area used to
communicate with the drive via the SDI interface. To
perform the SDI tests, ILDISK must take exclusive control
of the drive state area; otherwise, the HSC70
operational software may interfere with the tests. The
drive state area must be in an inactive state (no
interrupts in progress) before it can be acquired by
ILDISK. If the drive is rapidly changing its SDI state
and generating interrupts, ILDISK may be unable to find
the drive in an inactive state.

Error 51 FAILURE WHILE UPDATING DRIVE STATUS - When in
the process of returning the drive to the same mode as
ILDISK found it originally, an error occurred while
performing an SDI GET STATUS command. When a drive is
acquired by ILDISK, the program remembers whether the
drive was in 576-byte mode or 512-byte mode (reflected
by the S7 bit of the mode byte in the drive status).
When ILDISK releases the drive (once per pass of the
program), the drive mode is returned to the state the
drive was in when ILDISK first acquired it.
In order to
ensure the HSC70 disk functional software is aware of
this mode change, ILDISK calls the diagnostic interface
routines to perform a GET STATUS to the drive. These
routines also update the disk functional software
information on the drive to reflect the new mode.
Error 51 indicates the drive status update failed. The
diagnostic interface returns one of three different
status codes with this error:
1.  DRIVE ERROR - The GET STATUS command could not be
    completed due to an error during the command. If
    informational error messages are enabled (via a SET
    ERROR INFO command), an error message describing the
    failure should be printed on the console terminal.

2.  BAD UNIT NUMBER - The diagnostic interface could not
    find the unit number specified. The drive may have
    spontaneously transitioned to the OFFLINE state (no
    clocks) since the last ILDISK operation. For this
    reason, the unit number is unknown when the
    diagnostic interface tries to do a GET STATUS
    command.

3.  UNKNOWN STATUS FROM DDUSUB - Refer to Error 04.

Error 52 576-BYTE FORMAT FAILED - The program attempted
to perform a 576-byte format to the first two sectors of
the first track in the R/W DBN area. No errors were
detected during the actual formatting operation, but
subsequent attempts to read either of the reformatted
blocks failed. The specific error detected is
identified in the error report.

Error 53 512-BYTE FORMAT FAILED - The program attempted
to perform a 512-byte format to the first two sectors of
the first track in the R/W DBN area. No errors were
detected during the actual formatting operation, but
subsequent attempts to read either of the reformatted
blocks failed. The specific error detected is
identified in the error report.

Error 54 INSUFFICIENT RESOURCES TO PERFORM TEST - This
error indicates further testing must be aborted due to
lack of required memory structures. To perform certain
drive tests ILDISK needs to acquire Timers, a Dialogue
Control Block (DCB), Free Control Blocks (FCBs), Data
Buffers, and enough control memory to construct two Disk
Rotational Access Tables (DRATs). If any of these
resources are unavailable, testing cannot be completed.
Under normal conditions these resources should always be
available.

Error 55 DRIVE TRANSFER QUEUE NOT EMPTY BEFORE FORMAT -
ILDISK found a transfer already queued to the K.sdi when
the format test began. ILDISK should have exclusive
access to the drive at this time, and all previous
transfers should have been completed before the drive
was acquired. To avoid potentially damaging interaction
with some other disk process, ILDISK aborts testing when
this condition is detected.

Error 56 K.SDI DETECTED ERROR DURING FORMAT - The K.sdi
detected an error during a format operation. Each error
bit set in the Fragment Request Block (FRB) is
translated into a text message which accompanies the
error report.


Error 57 WRONG STRUCTURE ON COMPLETION QUEUE - While
formatting, ILDISK checks each structure returned by the
K.sdi to ensure the structure was sent to the proper
completion queue. An error 57 indicates one of these
structures was sent to the wrong completion queue. This
type of error indicates a problem with the K.sdi
microsequencer or a control memory failure.

Error 58 READ OPERATION TIMED-OUT - To guarantee the
disk is on the correct cylinder and track while
formatting, ILDISK queues a read operation immediately
preceding the format command. The read operation did
not complete within 16 seconds indicating the K.sdi is
unable to sense sector/index pulses from the disk, or
the disk is not in the proper state to perform a
transfer. ILDISK aborts the format test following this
error report.

Error 59 K.SDI DETECTED ERROR IN READ PRECEDING FORMAT -
To guarantee the disk is on the correct cylinder and
track while formatting, ILDISK queues a read operation
immediately preceding the format command. The read
operation failed so ILDISK aborts the format test. Each
error bit set in the Fragment Request Block (FRB) is
translated into a text message which accompanies the
error report.

Error 60 READ DRAT NOT RETURNED TO COMPLETION QUEUE - To
guarantee the disk is on the correct cylinder and track
while formatting, ILDISK queues a read operation
immediately preceding the format command. The read
apparently completed successfully, because the Fragment
Request Block (FRB) for the read was returned with no
error bits set. However, the Disk Rotational Access
Table (DRAT) for the read operation was not returned,
indicating a problem with the K.sdi.

Error 61 FORMAT OPERATION TIMED-OUT - The K.sdi failed
to complete a format operation. A format operation
consists of a read followed by a format. The read
completed successfully, but after waiting a 16-second
interval the format was not complete. A change in drive
state may prevent formatting, the drive may no longer be
sending sector/index information to the K.sdi, or the
K.sdi may be unable to sample drive state. The format
test aborts on this error to prevent damage to the
existing disk format.

Error 62 FORMAT DRAT WAS NOT RETURNED TO COMPLETION
QUEUE - The K.sdi failed to complete a format operation.
A format operation consists of a read followed by a
format. The read completed successfully, and the
Fragment Request Block (FRB) for the format was returned
by the K.sdi with no error indicated. However, the Disk
Rotational Access Table (DRAT) for the format operation
was never returned, indicating a probable K.sdi failure.
After reporting this error, the format test aborts.
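
The format-related errors above (56 and 58 through 62) can be read as
a chain of checks on one read-then-format attempt. The sketch below
is illustrative only: the function name, its parameters, and the
order of the checks are assumptions made for this example and are
not taken from the ILDISK source.

```python
from typing import Optional

# The 16-second limit is stated in the text for Errors 58 and 61.
OPERATION_TIMEOUT_SECONDS = 16


def classify_format_attempt(
    *,
    read_completed: bool,          # read finished within the time limit
    read_frb_error: bool,          # error bits set in the read FRB
    read_drat_returned: bool,      # read DRAT came back on its queue
    format_completed: bool,        # format finished within the time limit
    format_frb_error: bool,        # error bits set in the format FRB
    format_drat_returned: bool,    # format DRAT came back on its queue
) -> Optional[int]:
    """Return the ILDISK error number for the attempt, or None if clean.

    Hypothetical helper: the check order is an assumption.
    """
    if not read_completed:
        return 58   # READ OPERATION TIMED-OUT
    if read_frb_error:
        return 59   # K.SDI DETECTED ERROR IN READ PRECEDING FORMAT
    if not read_drat_returned:
        return 60   # READ DRAT NOT RETURNED TO COMPLETION QUEUE
    if not format_completed:
        return 61   # FORMAT OPERATION TIMED-OUT
    if format_frb_error:
        return 56   # K.SDI DETECTED ERROR DURING FORMAT
    if not format_drat_returned:
        return 62   # FORMAT DRAT WAS NOT RETURNED TO COMPLETION QUEUE
    return None
```

A read that times out is reported before any format condition is
examined, which matches the text: the format is only attempted after
the preceding read succeeds.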

Error 63 CAN'T ACQUIRE SPECIFIED UNIT - ILDISK was
initiated automatically to test a disk drive declared
inoperative. When initiated by the disk functional
software, ILDISK was given the requestor number, port
number, and unit number of the drive to test.
ILDISK
successfully acquired the drive by unit number, but the
requestor and port number of the acquired drive did not
match the requestor and port given when ILDISK was
initiated. This indicates the HSC is connected to two
separate drives with the same unit number plugs. To
prevent inadvertent interaction with the other disk
drive, ILDISK performs only the low-level SDI tests on
the unit specified by the disk functional software.
Read/Write tests are skipped because the drive must be
acquired by unit number to perform read/write transfers.

Error 64 DUPLICATE UNIT DETECTED - At times during the
testing sequence, ILDISK must release, then reacquire,
the drive under test. After releasing the drive and
reacquiring it, ILDISK noted the requestor and port
number of the drive it was originally testing do not
match the requestor and port number of the drive just
acquired. This indicates the HSC is connected to two
separate drives with the same unit number. To prevent
inadvertent interaction with the other disk drive,
ILDISK discontinues testing if this error is detected.
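
Errors 63 and 64 both come down to comparing the requestor and port
of the drive ILDISK expected with those of the drive it actually
acquired. A minimal sketch of that comparison follows; the type and
function names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class DriveLocation(NamedTuple):
    """Requestor and port through which a drive is reached."""
    requestor: int
    port: int


def duplicate_unit_error(expected: DriveLocation,
                         acquired: DriveLocation,
                         initial_acquire: bool) -> Optional[int]:
    """Return 63 or 64 when the locations differ, else None.

    initial_acquire=True models the first acquire of an automatically
    initiated run (Error 63); False models a reacquire after a
    release during testing (Error 64).
    """
    if expected == acquired:
        return None
    return 63 if initial_acquire else 64
```

For example, expecting requestor 4, port 1 but acquiring requestor 5,
port 1 after a release would yield 64, at which point ILDISK
discontinues testing.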

Error 65 FORMAT TESTS SKIPPED DUE TO PREVIOUS ERROR - To
prevent possible damage to the existing disk format,
ILDISK does not attempt to format if any errors were
detected in the tests preceding the format tests. This
error message informs the user that formatting tests
will not be performed.

Error 66 TESTING ABORTED - ILDISK was automatically
initiated to test a disk drive which was declared
inoperative by the disk functional code of the HSC. The
disk drive had previously been automatically tested at
least twice and somehow was returned to service.
Because the tests performed by ILDISK may be causing the
inoperative drive to be returned to service, ILDISK does
not attempt to test an inoperative drive more than
twice. On all succeeding invocations of ILDISK, an
Error 66 message prints and ILDISK exits without
performing any tests on the drive. This prevents ILDISK
from automatically initiating and dropping the drive
from the test over and over again.
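
The limit described for Error 66 amounts to a per-drive counter of
automatic test runs. The following sketch shows one way such a limit
could be kept; the class and method names are assumptions for this
example, and the HSC software may track this differently.

```python
# At most two automatic tests per inoperative drive, per the text.
MAX_AUTOMATIC_TESTS = 2


class AutoTestLimiter:
    """Hypothetical tracker of automatic ILDISK runs per unit number."""

    def __init__(self) -> None:
        self._runs = {}  # unit number -> automatic runs so far

    def may_test(self, unit: int) -> bool:
        """Return True and count the run, or False once the limit is hit.

        A False result corresponds to printing Error 66 and exiting
        without performing any tests on the drive.
        """
        runs = self._runs.get(unit, 0)
        if runs >= MAX_AUTOMATIC_TESTS:
            return False
        self._runs[unit] = runs + 1
        return True
```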

Error 67 NOT ENOUGH GOOD DBNS FOR FORMAT - In order to
guarantee the disk is on the proper cylinder and track,
all formatting operations are immediately preceded by a
read operation on the same track where the format is
planned. This requires the first track in the drive's
R/W DBN area to contain at least one good block which
can be read without error. An Error 67 indicates no
good block was found on the first track of the R/W DBN
area, so the formatting tests are skipped.

5.4.9.1 MSCP Status Codes - ILDISK Error Reports - This section
lists some of the MSCP status codes that may appear in ILDISK
error reports. All status codes are listed in the octal radix.
Further information on MSCP status codes is provided in Appendix
C.
007 - Compare Error

010 - Forced Error
052 - SERDES Overrun
053 - SDI Command Timeout

103 - Drive Inoperative
110 - Header Compare or Header Sync Timeout
112 - EDC Error

113 - Controller Detected Transmission Error
150 - Data Sync Not Found
152 - Internal Consistency Error

153 - Position or Unintelligible Header Error
213 - Lost Read/Write Ready
253 - Drive Clock Dropout
313 - Lost Receiver Ready
350 - Uncorrectable ECC Error

353 - Drive Detected Error
410 - One Symbol ECC Error
412 - Data Bus Overrun
413 - State or Response Line Pulse or Parity Error


450 - Two Symbol ECC Error
452 - Data Memory NXM or Parity Error
453 - Drive Requested Error Log
510 - Three Symbol ECC Error
513 - Response Length or Opcode Error
550 - Four Symbol ECC Error
553 - Clock Did Not Restart After Init
610 - Five Symbol ECC Error
613 - Clock Did Not Stop After Init
650 - Six Symbol ECC Error
653 - Receiver Ready collision
710 - Seven Symbol ECC Error
713 - Response Overflow
750 - Eight Symbol ECC Error
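
Because the codes above are octal, a lookup aid has to parse the
reported value in base 8 before matching it. The sketch below is a
convenience table built from the list in this section; the function
name and the fallback message are assumptions for this example.

```python
# MSCP status codes from Section 5.4.9.1, keyed by their octal value.
MSCP_STATUS = {
    0o007: "Compare Error",
    0o010: "Forced Error",
    0o052: "SERDES Overrun",
    0o053: "SDI Command Timeout",
    0o103: "Drive Inoperative",
    0o110: "Header Compare or Header Sync Timeout",
    0o112: "EDC Error",
    0o113: "Controller Detected Transmission Error",
    0o150: "Data Sync Not Found",
    0o152: "Internal Consistency Error",
    0o153: "Position or Unintelligible Header Error",
    0o213: "Lost Read/Write Ready",
    0o253: "Drive Clock Dropout",
    0o313: "Lost Receiver Ready",
    0o350: "Uncorrectable ECC Error",
    0o353: "Drive Detected Error",
    0o410: "One Symbol ECC Error",
    0o412: "Data Bus Overrun",
    0o413: "State or Response Line Pulse or Parity Error",
    0o450: "Two Symbol ECC Error",
    0o452: "Data Memory NXM or Parity Error",
    0o453: "Drive Requested Error Log",
    0o510: "Three Symbol ECC Error",
    0o513: "Response Length or Opcode Error",
    0o550: "Four Symbol ECC Error",
    0o553: "Clock Did Not Restart After Init",
    0o610: "Five Symbol ECC Error",
    0o613: "Clock Did Not Stop After Init",
    0o650: "Six Symbol ECC Error",
    0o653: "Receiver Ready collision",
    0o710: "Seven Symbol ECC Error",
    0o713: "Response Overflow",
    0o750: "Eight Symbol ECC Error",
}


def describe_status(octal_text: str) -> str:
    """Translate an octal status string such as '353' into its meaning."""
    code = int(octal_text, 8)  # raises ValueError on non-octal digits
    return MSCP_STATUS.get(code, "Not listed here; see Appendix C")
```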

5.4.10 ILDISK Test Summaries
Test summaries for ILDISK follow:

TEST 0 - Parameter Fetching - The part of ILDISK that
fetches parameters is identified as Test 0. The user is
prompted to supply a unit number and/or a requestor and
port number. This part of ILDISK also prompts for the
number of passes to perform.

TEST 01 - RUN K.SDI Microdiagnostics - Test 1 commands
the disk data channel to execute two of its resident
microdiagnostics. If the revision level of the disk
data channel microcode is not up to date, the
microdiagnostics are not executed. The microdiagnostics
executed are the Partial SDI test (K.sdi Test 7) and the
SERDES/RSGEN test (K.sdi Test 10).

TEST 02 - Check for Clocks and Drive Available - Test 02
issues a command to interrogate the Real Time Drive
State of the drive. This command does not require an
SDI exchange, but the real time status of the drive is
returned to ILDISK. The real time status should
indicate the drive is supplying clocks and the drive
should be in the Available state.


TEST 03 - Drive Initialize Test - Test 03 issues a
DRIVE INITIALIZE command to the drive under test. This
checks both the drive and the Controller Real Time State
line of the SDI cable. The drive should respond by
momentarily stopping its clock and then restarting it.

TEST 04 - SDI Echo Test - Test 04 first ensures the disk
data channel microcode supports the ECHO command.
If
not, a warning message is issued, and the rest of Test
04 is skipped. Otherwise, the test directs the disk
data channel to conduct an ECHO exchange with the drive.
An ECHO exchange consists of the disk data channel
sending a frame to the drive and the drive returning it.
An ECHO exchange verifies the integrity of the Write/Cmd
Data and the Read/Res Data lines of the SDI cable.

TEST 05 - Run Drive Diagnostics - Test 05 directs the
drive to run its internal diagnostics. The drive is
commanded to run a single diagnostic or its entire set
of diagnostics depending upon user response to the
following prompt:
Run a Single Drive Diagnostic?
Before commanding the drive to run its diagnostics, the
drive is brought Online to prevent the drive from giving
spurious Available indications to its other SDI port.
The drive diagnostics are started when the disk data
channel sends a DIAGNOSE command to the drive. The
drive does not return a response frame for the DIAGNOSE
until it is finished performing diagnostics. This can
require two or more minutes. While the disk data
channel is waiting for the response frame, ILDISK cannot
be interrupted by a CTRL Y.

TEST 06 - Disconnect From Drive - Test 06 sends a
DISCONNECT command to the drive and then issues a GET
LINE STATUS internal command to the K.sdi to ensure the
drive is in the Available state. The test also expects
Receiver Ready and Attention are set in drive status and
Read/Write Ready is not set.

TEST 07 - Check Drive Status - Test 07 issues a GET
STATUS command to the drive to check that none of the
drive's error bits are set. If any error bits are set,
they are reported and the test issues a DRIVE CLEAR
command to clear the error bits. If the error bits fail
to clear, an error is reported.

TEST 08 - Drive Initialize - Test 08 issues a command to
interrogate the Real Time Drive State of the drive. The
test then issues a DRIVE INITIALIZE command to ensure
the previous DIAGNOSE command did not leave the drive in
an undefined state.


TEST 09 - Bring Drive Online - Test 09 issues an ONLINE
command to the drive under test. Then a GET LINE STATUS
command is issued to ensure the drive's real time state
is proper for the Online state. Read/Write Ready is
expected to be true; Available and Attention are
expected to be false.

TEST 10 - Recalibrate and Seek - Test 10 issues a
RECALIBRATE command to the drive. This ensures the disk
heads start from a known point on the media. Then a SEEK
command is issued to the drive, and the drive's real
time status is checked to ensure the SEEK did not result
in an Attention condition. Then another RECALIBRATE
command is issued returning the heads to a known
position.

TEST 11 - Disconnect From Drive - Test 11 issues a
DISCONNECT command to return the drive to the Available
state. Then the drive's real time status is checked to
ensure Available, Attention and Receiver Ready are true
and Read/Write Ready is false.
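
The real time state expectations in Tests 09 and 11 line up with
Errors 15 and 38 through 43. The sketch below pairs each expectation
with its error number; the data class, the function names, and the
order of the checks are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineStatus:
    """Real time drive state bits named in the text."""
    available: bool
    attention: bool
    receiver_ready: bool
    rw_ready: bool


def check_available_state(s: LineStatus) -> List[int]:
    """Errors for a drive that should be in the Available state."""
    errors = []
    if not s.available:
        errors.append(38)  # AVAILABLE NOT SET IN AVAILABLE DRIVE
    if not s.attention:
        errors.append(39)  # ATTENTION NOT SET IN AVAILABLE DRIVE
    if not s.receiver_ready:
        errors.append(40)  # RECEIVER READY NOT SET
    if s.rw_ready:
        errors.append(41)  # READ/WRITE READY SET IN AVAILABLE DRIVE
    return errors


def check_online_state(s: LineStatus) -> List[int]:
    """Errors for a drive that should be in the Online state."""
    errors = []
    if not s.rw_ready:
        errors.append(15)  # READ/WRITE READY NOT SET IN ONLINE DRIVE
    if s.available:
        errors.append(42)  # AVAILABLE SET IN ONLINE DRIVE
    if s.attention:
        errors.append(43)  # ATTENTION SET IN ONLINE DRIVE
    return errors
```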

TEST 12 - Bring Drive Online - Test 12 attempts to bring
the disk drive to the Online state. Test 12 is only
executed for drives known to the HSC70 disk functional
software. Test 12 consists of the following steps:
1.  GET STATUS - ILDISK issues an SDI GET STATUS command
    to the disk drive.

2.  ONLINE - ILDISK directs the HSC70 Diagnostic
    Interface to bring the drive Online.

If the GET STATUS and the ONLINE commands succeed,
ILDISK proceeds to Test 13. If either command fails,
ILDISK goes directly to Test 17 (Termination). Note that
the Online is performed via the HSC70 diagnostic
interface, invoking the same software operations a host
invokes to bring a drive Online. An Online at this level
constitutes more than just sending an SDI ONLINE command.
The FCT and RCT of the drive are also read and certain
software structures are modified to indicate the new
state of the drive. If the drive is unable to read data
from the disk media, the Online operation fails. If Test
12 fails, ILDISK skips the remaining tests and goes to
Test 17.
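
The branching among Tests 12 through 17 described in these summaries
can be written as a small transition function. This is a reading of
the text, not ILDISK code; the function name and the handling of
other test numbers are assumptions.

```python
def next_test(current: int, passed: bool = True) -> int:
    """Return the test that follows `current` for Tests 12, 13 and 16.

    Tests 14 and 15 are described as not yet implemented, so the
    sequence goes from Test 13 straight to Test 16.
    """
    if current == 12:
        # A failed GET STATUS or ONLINE sends ILDISK to termination.
        return 13 if passed else 17
    if current == 13:
        return 16
    if current == 16:
        return 17
    raise ValueError("transition not covered by this sketch")
```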

TEST 13 - Read Only I/O Operations Test - Test 13 tests
that all R/W heads in the drive can seek and properly
locate a sector on each track in the drive Read Only DBN
space. (DBN space is an area on all disk media devoted
to diagnostic use.) Test 13 attempts to read at least
one sector on every track in the Read Only area of the
drive's DBN space. The sector is checked to ensure it
contains the proper data pattern. Bad sectors are
allowed, but there must be at least one good sector on
each track in the Read Only area. After each successful
DBN read, ILDISK reads one LBN to further enhance seek
testing. This ensures the drive can successfully seek
to and from the DBN area from the LBN area of the disk
media. When Test 13 completes, ILDISK proceeds to Test
16.

TEST 14 - Format 576-Byte Mode - This test is not yet
implemented.

TEST 15 - Format 512-Byte Mode - This test is not yet
implemented.

TEST 16 - I/O Operations Test (Read/Write) - Test 16
checks to see if the drive can successfully write a
pattern and read it back from at least one sector on
every track in the drive Read/Write DBN area.
(Read/Write DBN space is an area on every disk drive
devoted to diagnostic Read/Write testing.) Bad sectors
are allowed, but at least one sector on every track in
the Read/Write area must pass the test. After Test 16
completes, ILDISK proceeds to Test 17.
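
Tests 13 and 16 share one pass rule: bad sectors are tolerated, but
every track needs at least one sector that passes. A minimal sketch
of that rule follows; the function name and the shape of the input
(a mapping from track number to per-sector results) are assumptions
for illustration.

```python
from typing import Dict, List


def tracks_without_good_sector(results: Dict[int, List[bool]]) -> List[int]:
    """Return the tracks on which no sector passed.

    results maps a track number to one boolean per sector tried,
    True meaning the sector passed. A track with no True entry
    corresponds to Error 33 (read) or Error 14 (write).
    """
    return [track for track, sectors in results.items() if not any(sectors)]
```

For example, `{0: [False, True], 1: [False, False]}` reports track 1
only: track 0 has a bad sector but still passes.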

TEST 17 - Terminate ILDISK - Test 17 is the ILDISK
termination routine. The following steps are performed:
1.  If the drive is unknown to the HSC70 disk functional
    software, or if the SDI verification test failed,
    proceed to step 5 of this test.

2.  An SDI CHANGE MODE command is issued to the drive.
    The CHANGE MODE command directs the drive to
    disallow access to the DBN area and changes the
    sector size (512 or 576 bytes) back to its original
    state.

3.  The drive is released from exclusive diagnostic use.
    This returns the drive to the Available state.

4.  The drive is reacquired for exclusive diagnostic
    use. This is to allow looping if more than one pass
    is selected.

5.  If more passes are left to perform, the test is
    reinitiated.

6.  If no more passes are left to perform, ILDISK
    releases the drive, returns all structures acquired,
    and terminates.
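
The six termination steps can be summarized as a short decision
routine. The sketch below only names the actions in order; the
function name, parameter names, and action labels are hypothetical.

```python
from typing import List


def termination_actions(drive_known: bool,
                        sdi_test_passed: bool,
                        passes_left: int) -> List[str]:
    """List the Test 17 actions for the given conditions.

    Steps 2 through 4 are skipped when the drive is unknown to the
    disk functional software or the SDI verification test failed.
    """
    actions = []
    if drive_known and sdi_test_passed:
        actions.append("change_mode")   # step 2: restore sector size
        actions.append("release")       # step 3: back to Available
        actions.append("reacquire")     # step 4: allow looping
    if passes_left > 0:
        actions.append("reinitiate_test")        # step 5
    else:
        actions.append("release_and_terminate")  # step 6
    return actions
```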


5.5 INLINE TAPE TEST (ILTAPE)
ILTAPE initiates tape formatter resident diagnostics or a
functional test of the tape transport. In addition, the test
permits selection of a full test of the K.sti interface. When a
full interface test is selected, the K.sti microdiagnostics are
executed, line state is verified, an ECHO test is performed, and
a default set of formatter diagnostics is executed. See the
DRIVE UNIT NUMBER prompt in Section 5.5.3 for information on
initiating a full test. Detected failures result in fault
isolation to the FRU level.
See Section 5.5.9 for a summary of the three types of tape
transport tests listed below:
o  Fixed canned sequence
o  User sequence supplied at the terminal
o  Fixed streamer sequence

5.5.1 ILTAPE System Requirements
The following hardware and software are necessary to run ILTAPE.
Hardware requirements include:
o  HSC70 subsystem with K.sti
o  STI compatible tape formatter
o  TA78 tape drive (for transfer commands only)
o  Console terminal
o  RX33 disk drive or equivalent

The I/O control processor, Program memory, and Control memory
must be working.
Software requirements include:
o  CRONIC
o  DEMON
o  K.sti microcode
o  Tape Functional Code (TFUNCT)
o  Diagnostic/utility Interface (TDUSUB)


5.5.2 ILTAPE Operating Instructions
The following steps outline the procedure for running ILTAPE.
The test assumes an HSC70 is configured with a terminal and STI
interface. If the HSC70 is not booted, start with step 1. If
the HSC70 is already booted, proceed to step 2.
1.  Boot the HSC70

    Press the INIT button on the HSC70 OCP. The following
    message should appear at the terminal:

    INIPIO-I Booting...

    The boot process takes about one minute, and then the
    following message should appear at the terminal:

    HSC Version xxxx Date Time System n

2.  Type CTRL Y

    This displays the KMON prompt:

    HSC70>

3.  Type R DXn:ILTAPE

    This invokes the inline tape diagnostic program, ILTAPE. The DX
    in step 3 is the RX33 device name. The n refers to the unit
    number of the specific RX33 drive. For example, DX1: refers to
    RX33 drive number one. The following message should appear at
    the terminal:

    ILTAPE>D>hh:mm Execution Starting

5.5.3 ILTAPE/User Dialogue
The following paragraphs describe ILTAPE/user dialogue during
execution of ILTAPE. Note that default values for input parameters
appear within the brackets of the prompt. The absence of a value
within the brackets indicates the input parameter is not
defaultable.

DRIVE UNIT NUMBER (U) []?

If you want to run formatter diagnostics or transport tests,
enter Tnnn, where nnn is the MSCP unit number (such as T316).
If you want a full interface test, enter Xm (where m is any
number). Typing X, instead of T, requires a requestor number and
slot number. The following two prompts solicit requestor/slot
numbers.
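
The response rules for the DRIVE UNIT NUMBER prompt can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, ILTAPE itself is not written in Python, and treating the number as decimal is an assumption. The requestor and port ranges are taken from the two prompts that follow an X response.

```python
import re

# Hypothetical helpers that mirror the dialogue rules described above.
UNIT_RESPONSE = re.compile(r"([TX])(\d+)")


def parse_unit_response(text):
    """Classify a DRIVE UNIT NUMBER response.

    Tnnn selects formatter diagnostics or transport tests on MSCP unit nnn.
    Xm selects the full interface test, which then prompts for requestor
    and port numbers.
    """
    match = UNIT_RESPONSE.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"expected Tnnn or Xm, got {text!r}")
    kind, number = match.group(1), int(match.group(2))  # decimal assumed
    return {"full_interface_test": kind == "X", "number": number}


def valid_requestor(number):
    """ENTER REQUESTOR NUMBER accepts 2 through 9."""
    return 2 <= number <= 9


def valid_port(number):
    """ENTER PORT NUMBER accepts 0 through 3."""
    return 0 <= number <= 3
```

For example, `parse_unit_response("T316")` reports an ordinary unit selection, while `parse_unit_response("X7")` reports a full interface test.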


ENTER REQUESTOR NUMBER (2-9) []?

Enter the requestor number. The range includes numbers two
through nine, with no default value.

ENTER PORT NUMBER (0-3) []?

Enter the port number. The port number must be zero, one, two,
or three, with no default value. After this prompt is answered,
ILTAPE executes the K.sti interface test.

EXECUTE FORMATTER DIAGNOSTICS (YN) [Y]?

Enter Y (for yes) if you want to execute formatter diagnostics.
This is the default. Enter N if you do not want to run formatter
diagnostics.

MEMORY REGION NUMBER (H) [0]?

This prompt appears only if the response to the previous prompt
was Y. A formatter diagnostic is named according to the
formatter memory region where it executes. Enter the memory
region (hexadecimal) in which the formatter diagnostic is to
execute. ILTAPE continues at the prompt for iterations. Refer
to the appropriate tape drive service manual for more information
on formatter diagnostics.

EXECUTE TEST OF TAPE TRANSPORT (YN) [N]?

To test the tape transport, enter Y (the default is N). If no
transport testing is desired, the dialogue continues with the
ITERATIONS prompt. Otherwise, the following prompts appear.

IS MEDIA MOUNTED (YN) [N]?

This test writes to the tape transport, requiring a mounted
scratch tape. Enter Y if a scratch tape is already mounted.

FUNCTIONAL TEST SEQUENCE NUMBER (D) [1]?

You may select one of five transport tests. The default is 1
(the canned sequence). Enter 0 if a new user sequence will be
input from the terminal. Enter 2, 3, or 4 to select a user
sequence previously input and stored on the RX33 diskette. User
sequences are described in Section 5.5.4. Enter 5 to select the
streaming sequence.

INPUT STEP 00:

This prompt appears only if the response to the previous prompt
was 0. See Section 5.5.4 for a description of user sequences.

ENTER CANNED SEQUENCE RUN TIME IN MINUTES (D) [1]?

Answering this prompt determines the time limit for the canned
sequence. It appears only if the canned sequence is selected.
Enter the total run time limit in minutes. The default is one
minute.

SELECT DENSITY (0=ALL, 1=1600, 2=6250) [0]?

This prompt permits selection of the densities used during the
canned sequence. It appears only if the canned sequence is
selected. One or all densities may be selected; the default is
ALL.

SELECT DENSITY (1=800, 2=1600, 3=6250) [3]?

This prompt appears only if a user-defined test sequence was
selected. The prompt permits selection of any one of the
possible tape densities. The default density is 6250 bpi.
Enter 1, 2, or 3 to select the desired tape density.

1 - 800 bpi
2 - 1600 bpi
3 - 6250 bpi

The next series of prompts concern speed selection. The
particular prompts depend upon the type of speeds supported,
fixed or variable. ILTAPE determines the speed types supported
and prompts accordingly.
If fixed speeds are supported, ILTAPE displays a menu of
supported speeds, as follows:
Fixed Speeds Available:

(1) sss1 IPS
(2) sss2 IPS
(n) sssn IPS,

where sssn is a supported speed in inches per second. The
maximum number of supported speeds is four. Thus, n cannot be
greater than four. The prompt for a fixed speed is:

SELECT FIXED SPEED (D) [1]?

To select a fixed speed, enter a digit (n) corresponding to one
of the above displayed speeds. The default is the lowest
supported speed. ILTAPE continues at the data pattern prompt.

If variable speeds are supported, ILTAPE displays the lower and
upper bounds of the supported speeds as follows:

VARIABLE SPEEDS AVAILABLE:
LOWER BOUND = lll IPS
UPPER BOUND = uuu IPS

where lll is the lower bound and uuu is the upper bound of
supported speeds. The prompt for a variable speed is:

SELECT VARIABLE SPEED (D) [0 = LOWEST]?

To select a variable speed, enter a number within the bounds,
inclusive, of the displayed supported variable speeds. The
default is the lower bound.

NOTE
If only a single speed is supported, ILTAPE does
not prompt for speed. It runs at the single
speed supported.
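
The speed selection rules above can be summarized in a short Python sketch. The function names and the speed values used in the usage note are hypothetical. The sketch assumes an empty response means "take the default".

```python
def select_fixed_speed(speeds, response=""):
    """Return the speed chosen from a fixed-speed menu.

    speeds lists the supported speeds in menu order (at most four).
    An empty response selects the default, the lowest supported speed.
    """
    if not 1 <= len(speeds) <= 4:
        raise ValueError("between one and four fixed speeds are supported")
    if response.strip() == "":
        return min(speeds)
    choice = int(response)
    if not 1 <= choice <= len(speeds):
        raise ValueError("choice must match a displayed menu entry")
    return speeds[choice - 1]


def select_variable_speed(lower, upper, response=""):
    """Return a variable speed; an empty response or 0 selects the lower bound."""
    if response.strip() in ("", "0"):
        return lower
    speed = int(response)
    if not lower <= speed <= upper:
        raise ValueError("speed must lie within the displayed bounds")
    return speed
```

With a hypothetical menu of 25, 75, and 125 IPS, `select_fixed_speed([25, 75, 125], "2")` selects 75 IPS and an empty response selects 25 IPS.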

DATA PATTERN NUMBER (D) [3]?

Choose one of five data patterns.

0 - User supplied
1 - All zeros
2 - All ones
3 - Ripple zeros
4 - Ripple ones

The default is three. If the response is zero, the following
prompts appear.

HOW MANY DATA ENTRIES (D) []?

Enter the number of unique words in the data pattern. Up to 16
(decimal) words are permitted.

DATA ENTRY (H) []?

Enter the data pattern word in hexadecimal, for example, ABCD.
This prompt repeats until all the data words specified in the
previous prompt are exhausted.

SELECT RECORD SIZE (GREATER THAN OR EQUAL TO 1) (D) [8192]?

Enter the desired record size in decimal bytes. The default is
8192 bytes.

NOTE
This prompt does not appear if streaming is
selected.

ITERATIONS (D) [1]?

Enter the number of times the selected tests are to run. After
the number of iterations is entered, the selected tests begin
execution. Errors encountered during execution cause display of
appropriate messages at the terminal.
5.5.4 ILTAPE User Sequences
In order to test/exercise a tape transport, write a sequence of
commands at the terminal. This sequence may be saved on the RX33
diskette and recalled for execution at a later time. Up to
three user sequences can be saved on the RX33.
Following is a list of supported user sequence commands:

WRT     Write one record
RDF     Read one record forward
RDFC    Read one record forward with compare
RDB     Read one record backward
RDBC    Read one record backward with compare
FSR     Forward space one record
FSF     Forward space one file
BSR     Backspace one record
BSF     Backspace one file
REW     Rewind
RWE     Rewind with erase
UNL     Unload (after rewind)
WTM     Write tape mark
ERG     Erase gap
Cnnn    Counter set to nnn (0 ...)
Dnnn    Delay nnn ticks (0 ... 1000)
BRnn    Branch unconditionally to step nn
DBnn    Decrement counter and branch if nonzero to step nn
TMnn    Branch on Tape Mark to step nn
NTnn    Branch on no Tape Mark to step nn
ETnn    Branch on EOT to step nn
NEnn    Branch on not EOT to step nn
QUIT    Terminate input of sequence steps


Typing 0 in response to the prompt

FUNCTIONAL TEST SEQUENCE NUMBER (D) [1]?

initiates the user sequence dialogue. The following paragraphs
describe the ILTAPE/user dialogue during a new user sequence.

INPUT STEP nn

Enter one of the user sequence commands listed previously.
ILTAPE keeps track of the step numbers and automatically
increments them. Up to 50 steps may be entered. Typing QUIT in
response to the prompt

INPUT STEP nn

terminates the user sequence. At that time, the following prompt
appears:

STORE SEQUENCE AS SEQUENCE NUMBER (0,2,3,4) [0]?

The sequence entered at the terminal may be stored on the RX33 in
one of three files. To select one of these files, type 2, 3, or
4. Once stored, the sequence may be recalled for execution at a
later time by referring to the appropriate file (typing 2, 3, or
4) in response to the sequence number prompt.
Typing 0 (the default) indicates the user sequence just entered
should not be stored. In this case, the sequence cannot be run
at a later time.
An example of entering a user sequence follows:
INPUT STEP 00:  REW     ;Rewind the tape
INPUT STEP 01:  C950    ;Set counter to 950
INPUT STEP 02:  WRT     ;Write one record
INPUT STEP 03:  ET07    ;If EOT branch to step 7
INPUT STEP 04:  RDB     ;Read backward one record
INPUT STEP 05:  FSR     ;Forward space one record
INPUT STEP 06:  DB02    ;Decrement counter, branch
                        ;to step 2 if nonzero
INPUT STEP 07:  REW     ;Rewind the tape
INPUT STEP 08:  QUIT    ;Terminate sequence input

STORE SEQUENCE AS SEQUENCE NUMBER (0,2,3,4) [0]? 3
This sequence writes a record, reads it backward, and skips
forward over it. If an EOT is encountered prior to writing 950
records, the tape is rewound and the sequence terminates. Note
that the sequence is saved on the RX33 as sequence number 3 and
can be recalled at a later execution of ILTAPE.
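
The control flow of this example can be checked with a small Python model. It is a sketch under stated assumptions, not the ILTAPE implementation: the tape is reduced to a record position with an optional EOT threshold, no tape marks are ever present, Dnnn does nothing, and steps are numbered from 00 (the numbering at which ET07 lands on the final REW and DB02 on WRT). The names `run_sequence` and `EXAMPLE` are hypothetical.

```python
import re

BRANCH = re.compile(r"(BR|DB|TM|NT|ET|NE)(\d{2})")
COUNTER_OR_DELAY = re.compile(r"([CD])(\d+)")


def run_sequence(steps, eot_after_records=None, max_executed=100_000):
    """Run user sequence steps on a toy tape; return the number of WRT commands.

    eot_after_records is the record position at which the toy tape reports
    EOT (None means EOT is never reached).
    """
    position = counter = writes = executed = pc = 0
    at_eot = False
    while pc < len(steps):
        executed += 1
        if executed > max_executed:
            raise RuntimeError("sequence did not terminate")
        command = steps[pc].upper()
        pc += 1
        branch = BRANCH.fullmatch(command)
        setting = COUNTER_OR_DELAY.fullmatch(command)
        if branch:
            opcode, target = branch.group(1), int(branch.group(2))
            if opcode == "DB":
                counter -= 1
                taken = counter != 0
            elif opcode in ("ET", "NE"):
                taken = at_eot if opcode == "ET" else not at_eot
            else:
                # BR always branches. The toy tape holds no tape marks,
                # so NT is always taken and TM never is.
                taken = opcode in ("BR", "NT")
            if taken:
                pc = target
        elif setting:
            if setting.group(1) == "C":
                counter = int(setting.group(2))
            # Dnnn (delay) has no effect in this model.
        elif command == "REW":
            position, at_eot = 0, False
        elif command in ("WRT", "FSR", "RDB"):
            position = max(0, position + (-1 if command == "RDB" else 1))
            writes += command == "WRT"
            at_eot = eot_after_records is not None and position >= eot_after_records
        else:
            raise ValueError(f"command not modeled: {command}")
    return writes


# The example sequence above, without the QUIT that ends input.
EXAMPLE = ["REW", "C950", "WRT", "ET07", "RDB", "FSR", "DB02", "REW"]
```

Running `run_sequence(EXAMPLE)` performs 950 writes, and `run_sequence(EXAMPLE, eot_after_records=10)` stops after 10 writes because the ET07 branch is taken.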
5.5.5 ILTAPE Progress Reports
When transport testing is finished, a summary of soft errors
appears on the terminal. The format of this summary is:

SOFT ERROR SUMMARY:

READ     WRITE    COMPARE
xxxxxx   xxxxxx   xxxxxx

Successful completion of a formatter diagnostic is indicated by
the following message on the terminal:
TEST nnnn DONE
where nnnn is the formatter diagnostic test number.
When an error is encountered, an appropriate error message is
printed on the terminal.
5.5.6 ILTAPE Test Termination
ILTAPE terminates normally after the selected tests successfully
complete. The program also terminates when CTRL Y or CTRL C is
typed at any time. Further, certain errors cause ILTAPE to
terminate automatically.

5.5.7 ILTAPE Error Message Example
ILTAPE conforms to the diagnostic generic error message format
(Section 5.1.1.1). An example of an ILTAPE error message
follows:
ILTAPE>D>09:31 T011 U-T00101
ILTAPE>D>COMMAND FAILURE
ILTAPE>D>MSCP WRITE MULTIPLE COMMAND
ILTAPE>D>MSCP STATUS: 000000
ILTAPE>D>POSITION 001792
The test number reflects the state level where ILTAPE is
executing when an error occurs. This number does not indicate a
separate test that can be called. Test levels are defined as
follows:
Test Number   ILTAPE State

Initialization of tape software interface
Device (port, formatter, unit) acquisition
STI interface test in execution
Formatter diagnostics executing in response to Diagnostic Request (DR) bit
Tape transport functional test
User-selected formatter diagnostics executing
Termination and clean-up

The optional text is dependent upon the type of error.

5.5.8 ILTAPE Error Messages
The following list describes ILTAPE error messages.

Error 1 - INITIALIZATION FAILURE - Tape path software
interface cannot be established due to insufficient
resources (buffers, queues, timers, etc.).

Error 2 - SELECTED UNIT NOT A TAPE - Selected drive is
not known to the HSC as a tape.

Error 3 - INVALID REQUESTOR/PORT NUMBER - Selected
requestor number or port number is out of range or port
selected is not known to the system.

Error 4 - REQUESTOR NOT A K.STI - Selected requestor is
not known to the system as a tape data channel.

Error 5 - TIMEOUT ACQUIRING DRIVE SERVICE AREA - While
attempting to acquire the Drive Service Area (port) in
order to run the STI interface test, a timeout occurred.
If this happens, the tape functional code is corrupted.
ILTAPE invokes a system crash.

Error 6 - REQUESTED DEVICE UNKNOWN - Device requested is
not known to the tape subsystem.

Error 7 - REQUESTED DEVICE IS BUSY - Selected device is
Online to another controller or host.

Error 8 - UNKNOWN STATUS FROM TAPE DIAGNOSTIC INTERFACE
- An unknown status was returned from the
diagnostic software interface, TDUSUB.

Error 9 - UNABLE TO RELEASE DEVICE - Upon termination of
ILTAPE or upon an error condition, the device(s) could
not be returned to the system.

Error 10 - LOAD DEVICE WRITE ERROR - CHECK IF WRITE
LOCKED - An error occurred while attempting to write a
user sequence to the RX33. Check to see if the RX33
diskette is write protected. You are reprompted for a
user sequence number. To break the loop of reprompts,
type CTRL Y.

Error 11 - COMMAND FAILURE - A command failed during
execution of ILTAPE. The command in error may be one of
several types such as an MSCP or Level 2 STI command.
The failing command is identified in the optional text
of the error message. For example,
ILTAPE>D>MSCP READ COMMAND
ILTAPE>D>MSCP STATUS: nnnnnn

Error 12 - READ MEMORY BYTE COUNT ERROR - The requested
byte count used in the read (formatter) memory command
is different from the actual byte count received.
EXPECTED COUNT: xxxx ACTUAL COUNT: yyyy

Error 13 - FORMATTER DIAGNOSTIC DETECTED ERROR - A
diagnostic running in the formatter detects an error.
Any error text from the formatter is displayed.

Error 14 - FORMATTER DIAGNOSTIC DETECTED FATAL ERROR - A
diagnostic running in the formatter detects a fatal
error. Any error text from the formatter is displayed.

Error 15 - RX33 READ ERROR - While attempting to read a
user sequence from the RX33, a read error was
encountered. Ensure a sequence has been stored on the
RX33 as identified by the user sequence number. The
program reprompts for a user sequence number. To break
the loop of reprompts, type a CTRL Y.

Error 16 - INSUFFICIENT RESOURCES TO ACQUIRE SPECIFIED
DEVICE - During execution, ILTAPE was unable to acquire
the specified device due to a lack of necessary
resources. This condition is identified to ILTAPE by
the tape functional code via the diagnostic interface,
TDUSUB. ILTAPE has no knowledge of the specific
unavailable resource.

Error 17 - K MICRODIAGNOSTIC DID NOT COMPLETE - During
the STI interface test, the requestor microdiagnostic
timed out.

Error 18 - K MICRODIAGNOSTIC REPORTED ERROR - During the
STI interface test, an error condition was reported by
the K microdiagnostics.

Error 19 - DCB NOT RETURNED, K FAILED FOR UNKNOWN REASON
- During the STI interface test, the requestor failed
for an undetermined reason and the Diagnostic Control
Block (DCB) was not returned to the completion queue.

Error 20 - IN DCB UPON COMPLETION - During the STI
interface test, an error condition was returned in the
DCB.

Error 21 - UNEXPECTED ITEM ON DRIVE SERVICE QUEUE -
During the STI interface test, an unexpected entry was
found on the drive service queue.

Error 22 - STATE LINE CLOCK NOT RUNNING - During the STI
interface test, execution of an internal command to
interrogate the Real Time Formatter State line of the
drive indicated the state line clock is not running.

Error 23 - INIT DID NOT STOP STATE LINE CLOCK - During
the STI interface test, after execution of a formatter
INITIALIZE command, the state line clock did not drop
for the time specified in the STI specification.

Error 24 - STATE LINE CLOCK DID NOT START UP AFTER INIT
- During the STI interface test, after execution of a
formatter INITIALIZE command, the state line clock did
not start up within the time specified in the STI
specification.

Error 25 - FORMATTER STATE NOT PRESERVED ACROSS INIT -
The state of the formatter prior to a formatter
initialize was not preserved across the initialization
sequence.

Error 26 - ECHO DATA ERROR - Data echoed across the STI
interface was incorrectly returned.

Error 27 - RECEIVER READY NOT SET - After issuing an
ONLINE command to the formatter, the Receiver Ready
signal was not asserted.

Error 28 - AVAILABLE SET IN ONLINE FORMATTER - After
successful completion of a formatter ONLINE command to
the formatter, the Available signal is set.

Error 29 - RX33 ERROR - FILE NOT FOUND - During the user
sequence dialogue, ILTAPE was unable to locate the
sequence file associated with the specified user
sequence number. Check that an RX33 system diskette is
properly installed. The program reprompts for a user
sequence number. To break the loop of reprompts, type a
CTRL Y.

Error 30 - DATA COMPARE ERROR - During execution of the
user or canned sequence, ILTAPE encountered a software
compare mismatch on the data written and read back from
the tape. The software compare is actually carried out
by a subroutine in the diagnostic interface, TDUSUB.
The results of the compare are passed to ILTAPE.
Information in the text of the error message identifies
the data in error.

Error 31 - EDC ERROR - During execution of the user or
canned sequence, ILTAPE encountered an EDC error on the
data written and read back from the tape. This error is
actually detected by the diagnostic interface, TDUSUB,
and reported to ILTAPE. Information in the text of the
error message identifies the data in error.

Error 32 - INVALID MULTIUNIT CODE FROM GUS COMMAND -
After a unit number is input to ILTAPE and prior to
acquiring the unit, ILTAPE attempts to obtain the unit's
multiunit code via the GET UNIT STATUS command. This
error indicates a multiunit code of zero was returned to
ILTAPE from the tape functional code. Because a
multiunit code of zero is invalid, this error is
equivalent to a device unknown to the tape subsystem.

Error 33 - INSUFFICIENT RESOURCES TO ACQUIRE TIMER -
ILTAPE was unable to acquire a timer from the system;
insufficient buffers are available in the system to
allocate timer queues.

Error 34 - UNIT UNKNOWN OR ONLINE TO ANOTHER CONTROLLER
- The device identified by the selected unit number is
either unknown to the system, or it is online to another
controller. Verify the selected unit number is correct
and run ILTAPE again.

5.5.9 ILTAPE Test Summaries
Summaries of the tests contained in this diagnostic follow.
5.5.9.1 K.sti Interface Test Summary - This portion of ILTAPE
tests the STI interface of a specific tape data channel and port.
It also performs low-level testing of the formatter by
interfacing to the K.sti Drive Service Area (port) and executing
various level 2 STI commands. The testing is limited to dialogue
operations; no data transfer is done. The operations performed
are DIAGNOSE, READ MEMORY, GET DRIVE STATUS, and READ LINE
STATUS.
K.sti microdiagnostics are executed to verify the tape data
channel. A default set of formatter diagnostics (out of memory
region 0) is executed to test the formatter, and an echo test is
performed to test the connection between the port and the
formatter.
Failures detected are isolated to the extent possible and limited
to the tape data channel, the STI set, or the formatter. The STI
set includes a small portion of the K.sti module, and the entire
STI (all connectors and cables and a small portion of the drive).
The failure probabilities of the STI set are:

1.  STI cables or connectors (most probable)

2.  Formatter

3.  K.sti (least probable)

When the STI set is identified as the FRU, replacement should be
in the order indicated in the preceding list.
5.5.9.2 Formatter Diagnostics Test Summary - Formatter
diagnostics are executed out of a formatter memory region
selected by the user. Refer to the particular tape drive service
manual (for example, TA78 Magnetic Tape Drive Service Manual) for
a description of the formatter tests. Failures detected identify
the formatter as the FRU.


5.5.9.3 User Sequences Test Summary - User sequences are used to
exercise the tape transport. The particular sequence is totally
user-defined. Refer to Section 5.5.4.
5.5.9.4 Canned Sequence Test Summary - The canned sequence is a
fixed routine for exercising the tape transport. The canned
sequence first performs a quick verify of the ability to read and
write tape at all supported densities. Using a user-selected
record size, it then writes, reads, and compares the data written
over a 200-foot length of tape. Positioning over this length of
tape is also performed. Finally, random record sizes are used to
write, read, compare, and position over a 50-foot length of tape.
Errors encountered during the canned sequence are reported at the
terminal.
5.5.9.5 Streaming Sequence Test Summary - The streaming sequence
is a fixed sequence which attempts to write and read the tape at
speed (without hesitation). The entire tape is written, the tape
is rewound, and the entire tape is read back. Execution may be
terminated at any time by typing CTRL Y.
NOTE
In reading the tape, ILTAPE uses the ACCESS
command. This allows the tape to move at speed.
This is necessary because of the buffer size
restrictions existing for diagnostic programs.

5.6 INLINE TAPE COMPATIBILITY TEST (ILTCOM)
ILTCOM tests the compatibility of tapes which may have been
written on different systems and different drives with STI
compatible (TA78) drives connected to an HSC via the STI bus.
ILTCOM may generate, modify, read, or list a compatibility tape.
Data read from the compatibility tape is compared to the expected
pattern. A compatibility tape consists of groups of files,
called bunches, of records of specific data patterns. The format
of the compatibility tape is described in following paragraphs.
Each bunch contains a header record and several data records of
different sizes. Each bunch is terminated by a Tape Mark. The
last bunch on a tape is followed by an additional Tape Mark (thus
forming logical EOT).
Each bunch contains a total of 199 records: one header record
followed by 198 data records. The header record contains 48
(decimal) bytes of 6 bit-encoded descriptive information as
follows:

Table 5-1  ILTCOM Header Record

Field  Description            Length     Example

1      Drive type             6 Bytes    TA78
2      Drive Serial Number    6 Bytes    123456
3      Processor Type         6 Bytes    HSC-70
4      Processor Serial No.   6 Bytes    123456
5      Date                   6 Bytes    093083
6      Comment *              18 Bytes   Comment

* ILTCOM can read but cannot generate a comment field.
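
The field layout of the header record can be illustrated with a short Python sketch. The six field widths come from Table 5-1 and sum to 48. Everything else is an assumption made for illustration: the field names are hypothetical, ordinary character strings stand in for the 6-bit encoding (which this section does not define), and short fields are padded with spaces.

```python
# Field names are hypothetical; widths are the lengths given in Table 5-1.
FIELDS = [
    ("drive_type", 6),
    ("drive_serial", 6),
    ("processor_type", 6),
    ("processor_serial", 6),
    ("date", 6),
    ("comment", 18),
]
HEADER_LENGTH = sum(width for _, width in FIELDS)  # 48


def pack_header(values):
    """Build a 48-character header from a dict of field values."""
    parts = []
    for name, width in FIELDS:
        text = values.get(name, "")
        if len(text) > width:
            raise ValueError(f"{name} is longer than {width} characters")
        parts.append(text.ljust(width))  # space padding is an assumption
    return "".join(parts)


def unpack_header(record):
    """Split a 48-character header back into its six fields."""
    if len(record) != HEADER_LENGTH:
        raise ValueError("header record must be 48 characters long")
    fields, offset = {}, 0
    for name, width in FIELDS:
        fields[name] = record[offset:offset + width].rstrip()
        offset += width
    return fields
```

Packing the example values from the table (TA78, 123456, HSC-70, 123456, 093083) and unpacking the result returns the same five values with an empty comment.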

The data records are arranged as follows:

o  Sixty-six records 24 (decimal) bytes in length. These
   records sequence through 33 different data patterns.
   The 1st and 34th records contain pattern 1, the 2nd and
   35th records contain pattern 2, ... , the 33rd and 66th
   records contain pattern 33.

o  Sixty-six records 528 (decimal) bytes in length. These
   records sequence through the 33 data patterns as
   described above.

o  Sixty-six records 12,024 (decimal) bytes in length.
   These records sequence through the 33 data patterns in
   the same manner as the preceding ones.

The data patterns used are shown below:
Table 5-2  ILTCOM Data Patterns

Pattern
Number   Pattern            Description

 1       377                Ones
 2       000                Zeros
 3       274,377,103,000    Peak shift
 4       000,377,377,000    Peak shift
 5       210,104,042,021    Floating one
 6       273,167,356,333    Floating zero
 7       126,251            Alternate bits
 8       065,312            Square pattern
 9       000,377            Alternate frames
10       001                Track 0 on
11       002                Track 1 on
12       004                Track 2 on
13       010                Track 3 on
14       020                Track 4 on
15       040                Track 5 on
16       100                Track 6 on
17       200                Track 7 on
18       376                Track 0 off
19       375                Track 1 off
20       373                Track 2 off
21       367                Track 3 off
22       357                Track 4 off
23       337                Track 5 off
24       277                Track 6 off
25       177                Track 7 off
26       207,377,370,377    Bit peak shift
27       170,377,217,377
28       113,377,264,377
29       035,377,342,377
30       370,377,207,377
31       217,377,170,377
32       264,377,113,377
33       342,377,035,377
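
The arrangement of the 198 data records in a bunch can be sketched in Python. The pattern values are the octal bytes of Table 5-2, and the record sizes and the cycling through 33 patterns follow the description above. Filling a record by repeating the pattern bytes until the record is full is an assumption, as are the function names.

```python
# Octal byte values from Table 5-2, in pattern-number order.
PATTERNS_OCTAL = [
    "377", "000", "274,377,103,000", "000,377,377,000",
    "210,104,042,021", "273,167,356,333", "126,251", "065,312", "000,377",
    "001", "002", "004", "010", "020", "040", "100", "200",
    "376", "375", "373", "367", "357", "337", "277", "177",
    "207,377,370,377", "170,377,217,377", "113,377,264,377",
    "035,377,342,377", "370,377,207,377", "217,377,170,377",
    "264,377,113,377", "342,377,035,377",
]
PATTERNS = [bytes(int(b, 8) for b in p.split(",")) for p in PATTERNS_OCTAL]
RECORD_SIZES = (24, 528, 12024)
RECORDS_PER_SIZE = 66


def data_record(pattern_number, size):
    """Return one data record filled by repeating the pattern (assumed fill rule)."""
    pattern = PATTERNS[pattern_number - 1]
    repeats = -(-size // len(pattern))  # ceiling division
    return (pattern * repeats)[:size]


def bunch_layout():
    """Return (record_number, size, pattern_number) for the 198 data records."""
    layout = []
    for size in RECORD_SIZES:
        for index in range(RECORDS_PER_SIZE):
            layout.append((len(layout) + 1, size, index % 33 + 1))
    return layout
```

In this layout the 1st and 34th data records both use pattern 1 and the 66th uses pattern 33, matching the description of the 24-byte group.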

5.6.1 ILTCOM System Requirements
The following hardware and software are necessary to run ILTCOM.
Hardware requirements include an HSC subsystem with K.sti and an
STI-compatible tape formatter and drive. Because ILTCOM is not
diagnostic in nature, all of the necessary hardware is assumed to
be working. Errors are detected and reported, but fault isolation
is not a goal of ILTCOM. Software requirements include CRONIC,
DEMON, K.sti microcode, TFUNCT (Tape Functional Code), and TDUSUB
(Diagnostic/Utility Interface).

5.6.2 ILTCOM Operating Instructions
The following steps outline the procedure for running ILTCOM.
ILTCOM assumes the HSC is configured with a terminal, STI
interface, and a TA78 tape drive (or STI compatible equivalent).
If the HSC is already booted, proceed to step 2 below. If the
HSC needs to be booted, start with step 1.

1.  Boot the HSC

    Press the INIT button on the OCP of the HSC. The
    following message should appear at the terminal:

    INIPIO-I Booting...

    The boot process takes about one minute, and then the
    following message should appear at the terminal:

    HSC70 Version xxxx Date Time System n

2.  Type CTRL Y

    This displays the KMON prompt:

    HSC70>

3.  Type R DXn:ILTCOM, where n equals the number of the RX33
    drive containing the system diskette.

    This invokes the compatibility test program, ILTCOM.
    The following message should appear at the terminal:

    ILTCOM>D>hh:mm Execution Starting

    The subsequent program dialogue is described in the next
    section.

5.6.3 ILTCOM Test Parameter Entry
ILTCOM allows the writing, reading, listing, or modifying of
compatibility tapes. The following describes the user dialogue
during the execution of ILTCOM.
DRIVE UNIT NUMBER (U) []?

Enter the tape drive MSCP unit number (such as T21).

SELECT DENSITY FOR WRITES (1600, 6250) []?

Enter the write density by typing up to four characters of the
density desired (1600 or 1 for 1600 bpi).

SELECT FUNCTION (WR=WRITE,REA=READ,ER=ERASE,
LI=LIST,REW=REWIND,EX=EXIT) []?

Enter the function by typing up to four characters which uniquely
identify the desired function (for instance, READ or REA for
read).
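
Matching a response against the function names by unique prefix can be sketched in Python as follows. This is an illustration only: the function names `candidates` and `match_function` are hypothetical, and the sketch accepts any unique prefix (for example W for WRITE), whereas the prompt lists WR as the abbreviation.

```python
FUNCTIONS = ("WRITE", "READ", "ERASE", "LIST", "REWIND", "EXIT")


def candidates(response):
    """Return the functions whose names start with the first four characters typed."""
    key = response.strip().upper()[:4]
    if not key:
        return []
    return [name for name in FUNCTIONS if name.startswith(key)]


def match_function(response):
    """Return the single matching function, or raise if none or several match."""
    hits = candidates(response)
    if len(hits) != 1:
        raise ValueError(f"{response!r} does not uniquely identify a function")
    return hits[0]
```

Here REA selects READ and REW selects REWIND, while RE alone is rejected because it matches both.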
The subsequent dialogue is dependent upon the function selected.
WRITE -

The write function writes new bunches on the compatibility tape.
Bunches are either written one at a time or over the entire tape.
Bunches are written from the current tape position. If the write
function is selected, the following prompts occur.
PROCEED WITH INITIAL WRITE (YN) [N]?

Type Y or N (for yes or no, respectively). The default is no, in
which case program control continues at the function selection
prompt. If the response is yes, the following prompt occurs:
WRITE ENTIRE TAPE (YN) [N]?

Type Y (for yes), if the entire tape is to be written. Writing
of bunches begins at the current tape position and continues to
physical EOT (end-of-tape). Type N (for no), which is the
default, if the entire tape is not to be written. In this case,
only one bunch is written from the current tape position. This
prompt only appears on the initial write selection. After the
bunch(es) have been written, control continues at the function
selection prompt.
READ -

The read function reads and compares the data in the bunches with
an expected (predefined) data pattern. As the reads occur, the
bunch header information is displayed at the terminal. The
format of the display is shown in the following example:
BUNCH 01 WRITTEN BY TA78 SERIAL NUMBER 002965
ON A HSC70 SERIAL NUMBER 005993 ON 09-18-84

The number of bunches to be read is user selectable. All reads
are from BOT. If the read function is selected, the following
prompt appears:

READ HOW MANY BUNCHES (D) [0=ALL]?

Type the number of bunches to be read. The default (0) causes
all bunches to be read. After the requested number of bunches
have been read and compared, control continues at the function
selection prompt.

LIST -

The list function reads and displays the header of each bunch on
the compatibility tape from BOT. The display is the same as the
one described under the READ function. The data contents of the
bunches are NOT read and compared. After listing the tape bunch
headers, control continues at the function selection prompt.

ERASE -

The erase function erases a user-specified number of bunches from
the current tape position toward BOT. ILTCOM backs up the
specified number of tape marks and writes a second tape mark
(logical EOT). This effectively erases the specified number of
bunches from the tape. Thus, for example, if the current tape
position is at bunch 5 and the user wishes to erase two bunches,
three bunches are left on the tape after the ERASE command
completes.
If the erase function is selected, the following prompt appears
at the terminal:
ERASE HOW MANY BUNCHES FROM CURRENT POSITION (D) [0]?

Type the number of bunches to be erased. The default of zero
results in no change in tape contents or position. Control
continues at the function selection prompt.
REWIND -

The rewind function rewinds the tape to BOT.

EXIT -

The exit function rewinds the tape and exits the tape
compatibility program, ILTCOM.
5.6.4 ILTCOM Test Termination
ILTCOM is terminated normally by selecting the exit function
(EXIT) or by typing a CTRL Y or CTRL C. Further, certain errors
which occur during execution cause ILTCOM to terminate
automatically.
5.6.5 ILTCOM Error Message Example
ILTCOM conforms to the diagnostic generic error message format
(Section 5.1.1.1). An example of an ILTCOM error message
follows:
ILTCOM>D>09:29 T 000 E 003 U-T00100
ILTCOM>D>COMMAND FAILURE
where:

E nnn is an error number
U-Txxxxx indicates the Tape MSCP Unit Number
The optional text is dependent upon the type of error. Some
error messages contain the term, object count, in the optional
text. Object count refers to tape position (in objects) from
BOT.
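
When error output is captured from the terminal, the first line of an ILTCOM message can be picked apart with a small Python sketch. It assumes the layout shown in the examples in this section (program name, >D>, time, T nnn, E nnn, U-Txxxxx) and assumes the unit number consists of decimal digits. The function name is hypothetical.

```python
import re

# Layout assumed from the example messages in this section.
HEADER = re.compile(
    r"(?P<program>[A-Z]+)>D>(?P<time>\d\d:\d\d) "
    r"T (?P<test>\d{3}) E (?P<error>\d{3}) U-T(?P<unit>\d{5})"
)


def parse_error_header(line):
    """Return the fields of a message header line, or None if it does not match."""
    match = HEADER.fullmatch(line.strip())
    if match is None:
        return None
    return {
        "program": match["program"],
        "time": match["time"],
        "test": int(match["test"]),
        "error": int(match["error"]),
        "unit": int(match["unit"]),  # decimal digits assumed
    }
```

Lines of optional text, such as the COMMAND FAILURE line, do not match the header layout and yield None.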
5.6.5.1 ILTCOM Error Messages - The following are the ILTCOM
error messages.

Error 1 - INITIALIZATION FAILURE - Tape path cannot be
established due to insufficient resources.

Error 2 - SELECTED UNIT NOT A TAPE - User selected a
drive not known to system as a tape.

Error 3 - COMMAND FAILURE - A command failed during
execution of ILTCOM. The command in error may be one of
several types (MSCP level, STI level 2, etc.). The
failing command is identified in the optional text of
the error message. For example,
ILTCOM>D>tt:tt T 000 E 003 U-T00030
ILTCOM>D>COMMAND FAILURE
ILTCOM>D>MSCP READ COMMAND
ILTCOM>D>MSCP STATUS: nnnnnn

Error 5 - SPECIFIED UNIT NOT AVAILABLE - The selected
unit is Online to another controller.

Error 6 - SPECIFIED UNIT CANNOT BE BROUGHT ONLINE - The
selected unit is offline or not available.

Error 7 - SPECIFIED UNIT UNKNOWN - The selected unit is
unknown to the HSC configuration.

Error 8 - UNKNOWN STATUS FROM TDUSUB - An unknown error
condition returned from the software interface, TDUSUB.

Error 9 - ERROR RELEASING DRIVE - After completion of
execution or after an error condition, the tape drive
could not be successfully returned to the system.

Error 10 - CAN'T FIND END OF BUNCH - The compatibility
tape being read or listed has a bad format.

Error 11 - DATA COMPARE ERROR - A data compare error has
been detected. The actual and expected data are
displayed in the optional text of the error message.
For example,

ILTCOM>D>tt:tt T 000 E 011 U-T00030
ILTCOM>D>DATA COMPARE ERROR
ILTCOM>D>EXPECTED DATA: XXXXXX ACTUAL DATA: YYYYYY
ILTCOM>D>NUMBER OF FIRST WORD IN ERROR: nnnnn
ILTCOM>D>NUMBER OF WORDS IN ERROR: mmmmm
ILTCOM>D>OBJECT COUNT = cccccc

Error 12 - DATA EDC ERROR - An EDC error was detected.
Expected and actual values are displayed in the optional
text of the error message.

5.6.6 ILTCOM Test Summaries
ILTCOM writes, reads, and compares compatibility tapes upon user
selection. The testing that takes place looks for compatibility
of tapes written on different drives (and systems). As
incompatibilities are found, due to data compare errors or
unexpected formats, they are reported. ILTCOM makes no attempt
to isolate faults during execution; it merely reports
incompatibilities and other errors as they occur.
5.7 INLINE MULTIDRIVE EXERCISER (ILEXER)
The Inline Multidrive Exerciser exercises the various disk drives
and tape drives attached to the HSC subsystem. The exerciser is
initiated upon demand. Drives to be tested are selected by the
operator. The exerciser will issue random READ, WRITE, and
COMPARE commands to exercise the drives. The results of the
exerciser are displayed on the terminal from which it was
initiated. The reports given by ILEXER do not provide any
analysis of the errors reported, nor do they explicitly call out
a specific FRU. This is strictly an exerciser.
This exerciser runs with other processes on the HSC subsystem.
It is loaded from the RX33 and uses the services of DEMON
(Diagnostic Execution Monitor) and the HSC control software. Bad
block replacement is disabled for any disk unit being exercised.
CAUTION
Do not run ILEXER through DUP.

5.7.1 ILEXER System Requirements
In order for this program to run, the following hardware and
software items must be available:
1.  HSC subsystem including:

    a.  Console terminal

    b.  P.io

    c.  K.sdi and/or K.sti

    d.  Program, Control, and Data memories

    e.  RX33 System diskette or equivalent local HSC load
        device

2.  SDI compatible disk drive and/or STI compatible tape drive

3.  HSC system software including:

    a.  HSC internal operating system

    b.  DEMON

    c.  K.sdi microcode and/or K.sti microcode

    d.  SDI Manager and/or STI Manager or equivalent

    e.  Disk functional code and/or Tape functional code

    f.  Error Handler

    g.  Diagnostic Interface to Disk functional code and/or
        Diagnostic Interface to Tape functional code

Tests cannot be performed on drives if their respective interface
(K.sdi or K.sti) is not available.
5.7.2 ILEXER Operating Instructions
Perform the following steps to initiate ILEXER (Multidrive
Exerciser):
1.  Type CTRL Y.

2.  The HSC responds with an HSC70> prompt.

3.  Type:

    RUN DX0:ILEXER.DIA

The system loads the program from the specified local HSC load
media (any appropriate media with the image ILEXER.DIA in an RT11
format). When the program is successfully loaded, the following
message is displayed:
ILEXER>D>hh:mm Execution Starting
where 'hh:mm' is the current time.
ILEXER then prompts for parameters. After all prompts are
answered, the execution of the diagnostic proceeds. Error
reports and performance summaries are returned from ILEXER.
When ILEXER has run for the specified time interval, reported any
errors found, and generated a final performance summary, the
exerciser concludes with the following message:
ILEXER>D>hh:mm Execution Complete

5.7.3 ILEXER Test Parameter Entry
The parameters in ILEXER follow the format:
PROMPT DESCRIPTION (DATATYPE) [DEFAULT]?

The PROMPT DESCRIPTION explains the type of information
ILEXER needs from the operator.

The DATATYPE is the form ILEXER expects and can be one
of the following:
Y/N - Yes/No response
D - Decimal number
U - Unit number (see form below)
H - Hexadecimal number

The DEFAULT is the value used if a carriage return is
entered for that particular value. If a default value
is not allowed, it appears as [].
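
The prompt format described above can be illustrated with a short
Python sketch. This is not part of the HSC software; the names
Prompt and parse_prompt are hypothetical, and the sketch assumes
that every prompt carries a datatype in parentheses, which some
prompts in this chapter do not.

```python
import re
from typing import NamedTuple, Optional


class Prompt(NamedTuple):
    description: str
    datatype: str            # one of "Y/N", "D", "U", "H"
    default: Optional[str]   # None when the prompt shows [] (no default)


# PROMPT DESCRIPTION (DATATYPE) [DEFAULT]?
_PROMPT_RE = re.compile(
    r"^(?P<desc>.+?)\s*\((?P<dtype>Y/N|D|U|H)\)\s*"
    r"\[(?P<default>[^\]]*)\]\s*\?$"
)


def parse_prompt(line: str) -> Prompt:
    """Split an ILEXER-style prompt line into its three parts."""
    m = _PROMPT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an ILEXER-style prompt: {line!r}")
    default = m.group("default").strip()
    # An empty bracket pair means the parameter cannot be defaulted.
    return Prompt(m.group("desc"), m.group("dtype"), default or None)


if __name__ == "__main__":
    print(parse_prompt("DRIVE UNIT NUMBER (U) [] ?"))
    print(parse_prompt("HARD ERROR LIMIT (D) [20]?"))
```

A prompt whose default parses as None corresponds to a
nondefaultable parameter.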

The next prompt is:
DRIVE UNIT NUMBER (U) [] ?


Enter the unit number of the drive to be tested. This prompt has
no default. Unit numbers are either in the form Dnnnn or Tnnnn,
where nnnn is a decimal number between 0 and 4095 which
corresponds to the number printed on the drive's unit plug and
the D or T indicates either a disk drive or tape drive,
respectively. Terminate the unit number with a carriage return.
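
As an illustration only, the unit number form described above
(D or T followed by a decimal number from 0 to 4095) could be
checked as follows. The names Unit and parse_unit are hypothetical,
and accepting lowercase letters is an assumption of this sketch,
not documented ILEXER behavior.

```python
import re
from typing import NamedTuple


class Unit(NamedTuple):
    kind: str     # "disk" or "tape"
    number: int   # decimal unit number from the unit plug


_UNIT_RE = re.compile(r"^([DT])(\d{1,4})$", re.IGNORECASE)


def parse_unit(text: str) -> Unit:
    """Parse a unit number of the form Dnnnn or Tnnnn."""
    m = _UNIT_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unit must look like Dnnnn or Tnnnn: {text!r}")
    number = int(m.group(2))
    if not 0 <= number <= 4095:
        raise ValueError("unit number must be between 0 and 4095")
    kind = "disk" if m.group(1).upper() == "D" else "tape"
    return Unit(kind, number)


if __name__ == "__main__":
    print(parse_unit("D0030"))
    print(parse_unit("T100"))
```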
ILEXER attempts to acquire the specified unit via the HSC
Diagnostic Interface. If the unit is acquired successfully,
ILEXER continues with the next prompt. If the acquire fails with
an error, one of the following conditions was encountered:

The specified drive is unavailable. This indicates the
drive is connected to the HSC but is currently online to
a Host CPU or HSC utility. Online drives cannot be
diagnosed. ILEXER repeats the prompt for the unit
number.

The specified drive is unknown to the HSC Disk
functional software. Drives are Unknown for one of the
following reasons:

The drive and/or K.sdi port is broken and cannot
communicate with the disk functional software.

The drive was previously communicating with the HSC
when a serious error occurred and the HSC ceased
communicating with the drive.

In either case, ILEXER asks the operator if another
drive will be selected. If so, it asks for the unit
number. If not, ILEXER begins to exercise the drives
selected. If no drives are selected, ILEXER terminates.
After a drive is selected and ILEXER has both acquired the drive
and brought it online, the following prompts appear. If a disk
drive was specified, one set of prompts is presented. If a tape
drive was selected, an entirely different set of prompts is
presented. A CTRL Z at any time during parameter input selects
the default values for the remaining parameters. If a
nondefaultable parameter is encountered, the following message
appears and the test prompts for new parameters:
ILEXER>D>hh:mm Nondefaultable Parameter

Select up to 12 drives to be exercised; either all disk drives,
all tape drives, or a combination of the two.

5.7.4 Disk Drive User Prompts
The following prompts are presented if the drive selected is a
disk drive.


ACCESS USER DATA AREA (Y/N) [N]?

A Y answer to this and the following prompt directs ILEXER to
perform testing in the user data area. It is the operator's
responsibility to ensure that the data contained there is either
backed up or of no value. If this prompt is answered with an N,
or carriage return, testing is confined to the disk area reserved
for diagnostics (DBN area). When testing is confined to the DBN
area, the following five prompts are not displayed.
ARE YOU SURE (Y/N) [N]?

An N response causes the previous prompt to be repeated. A Y
response allows the exercise to take place in the user data area
of the disk.
START BLOCK NUMBER

This value specifies the starting block of the area ILEXER
exercises when the user data area is selected. If block 0 is
specified, ILEXER will exercise beginning with the first LBN on
the disk.
END BLOCK NUMBER (D) [0=MAX]?

This parameter specifies the ending block of the area ILEXER
exercises when the user data area is selected. If block 0 is
specified as the ending block, ILEXER exercises up to the last
LBN on the disk.
INITIAL WRITE TEST AREA (Y/N) [N]?

Answering Y to this prompt causes ILEXER to write the entire test
area before beginning random testing. If the prompt is answered
with an N or a carriage return, the prompt immediately following
this is omitted.
TERMINATE TEST ON THIS DRIVE FOLLOWING INITIAL WRITE (Y/N) [N]?

This question allows an initial write on the drive and terminates
the test at that point. The default answer, N, permits this
initial write. After completing the initial write, the test
continues to exercise the drive.
NOTE

The following prompts specify the test sequence
for that part of the test following the initial
write portion. That is, even if the operator
requests Read Only mode, the drive will not be
write protected until after any initial write has
been completed.


SEQUENTIAL ACCESS (Y/N) [N]?

The operator has the option of requesting all disk data access be
performed in a sequential manner.
READ ONLY (Y/N) [N]?

If answered N, the operator is asked for both a pattern number
and the possibility of Write Only mode. If the answer is Y,
ILEXER does not prompt for Write Only mode, but only asks for a
data pattern number if an initial write was requested.
DATA PATTERN NUMBER (0-15) (D) [15]?

The operator has the option of selecting one of 16 disk data
patterns. Selecting data pattern 0 allows selection of a pattern
with a maximum of 16 words. The default data pattern (15) is the
factory format data pattern.
WRITE ONLY (Y/N) [N]?

This option permits only write operations on a disk. This prompt
is not displayed if Read Only mode was selected.
DATA COMPARE (Y/N) [N]?

If this prompt is answered with an N or a carriage return, data
read from the disk is not checked; that is, disk data is not
compared to the expected pattern. If the prompt is answered with
a Y, the following prompt is issued. The media must have been
previously written with a data pattern in order to do a data
compare.
DATA COMPARE ALWAYS (Y/N) [N]?

Answering a Y causes ILEXER to check the data returned by every
disk read operation. Answering with an N or carriage return
causes data compares on 15 percent of the disk reads.
NOTE

Selection of data compares significantly reduces
the number of disk sectors transferred in a given
time interval.
ANOTHER DRIVE (Y/N) []?

Answering with a Y permits selection of another drive for
exercising. This prompt has no default. Answering with an N
causes ILEXER to prompt:
AVERAGE DISK TRANSFER LENGTH IN SECTORS (1 TO 400) []?

AVERAGE DISK TRANSFER LENGTH IN SECTORS (1 TO 35) []?

This prompt requests the selection of the average size (in
sectors) of each data transfer issued to the disk drives.
Once the preceding parameters are entered, ILEXER continues with
the prompts listed as global user prompts (Section 5.7.6).
5.7.5 Tape Drive User Prompts
ILEXER displays the following prompts if the drive selected is a
tape drive:
IS A SCRATCH TAPE MOUNTED (Y/N) [N]?

An N response results in a reprompt for the drive unit number.
A Y response displays the next prompt.
ARE YOU SURE (Y/N) [N]?

If the answer is N, the operator is reprompted for the drive unit
number. If answered with a Y, the following prompts are
displayed.
DATA PATTERN NUMBER (16-22) (D) [21]?

Seven data patterns are available for tape. The default pattern
(pattern 21) is defined in Section 5.7.7.
DENSITY (1=800, 2=1600, 3=6250) (D) [2]?

The response to this prompt is a 1, a 2, or a 3. Any other
response is illegal, and the prompt is displayed again. The
default is 2 or a density of 1600 bpi.
SELECT AUTOMATIC SPEED MANAGEMENT (Y/N) [N]?

Either Automatic Speed Management (if the feature is supported)
or a tape drive speed is selected at this point. If the choice
is Automatic Speed Management, the available speeds are not
displayed.
ILEXER>D>FIXED [VARIABLE] SPEEDS AVAILABLE:

This is an informational message identifying the speeds available
for the tape drive. If the speeds are fixed, the value is
presented. If the speed is variable within a range, the range is
listed, and the next prompt asks the operator to select a speed.
See the tape drive user manual for available speeds.


SELECT FIXED [VARIABLE] SPEED (D) [1]?
This prompt allows selection of the variable speed for the tape
drive selected. See the tape drive user manual for available
speeds.
RECORD LENGTH IN BYTES (1 to 12288) (D) [8192]?
Response to this prompt specifies the size in bytes of a tape
record. Maximum size is 12K bytes. The default value is 8192,
the standard record-length size for 32-bit systems. Constraints
on the HSC diagnostic interface prohibit selection of the maximum
allowable record length of 64K bytes.
DATA COMPARE (Y/N) [N]?
Answering N results in no data compares performed during a read
from tape. A Y response causes the following prompt.
DATA COMPARE ALWAYS (Y/N) [N]?
A Y response selects data compares to be performed on every tape
read operation. An N response causes data compares to be
performed on 15 percent of the tape reads.
ANOTHER DRIVE (Y/N) []?
If the answer is Y, the prompts, beginning with the prompt for
DRIVE UNIT NUMBER, are repeated. If the answer is N, the following
global prompts are presented. This prompt has no default, so the
operator can default all other prompts and still parameterize
another drive for this pass of ILEXER.
5.7.6 ILEXER Global User Prompts
The following prompts are presented to the operator when no more
drives or drive-specific parameters are to be entered into the
testing sequence. These prompts are global in the sense they
pertain to all the drives.
RUN TIME IN MINUTES (1 TO 32767) [10]?
The minimum time is 1 minute, and the default is 10. After the
exerciser has executed for that period of time, all testing
terminates and a final performance summary is displayed.
HARD ERROR LIMIT (D) [20]?
You are allowed to specify the number of hard errors allowable
for the drives being exercised. When a drive reaches this limit,
it is removed from any further exercising on this pass of ILEXER.
Hard errors include the following types of errors:


Tape drive BOT encountered unexpectedly

Invalid MSCP response received from functional code

UNKNOWN MSCP status code returned from functional code

Write on write-protected drive

Tape formatter returned error

Read compare error

Read data EDC error

Unrecoverable read or write error

Drive reported error

Tape mark error (ILEXER does not write tape marks)

Tape drive truncated data read error

Tape drive position lost

Tape drive short transfer occurred on read operation

Retry limit exceeded for a tape read, write, or read
reverse operation

Drive went OFFLINE or AVAILABLE unexpectedly
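
The effect of the HARD ERROR LIMIT parameter can be pictured with
the following Python sketch. It is a simplified model, not ILEXER
code: the class name HardErrorTracker and its methods are
hypothetical, and the sketch assumes only what is stated above,
namely that a drive is removed from further exercising once its
hard error count reaches the limit.

```python
from collections import Counter


class HardErrorTracker:
    """Count hard errors per unit and flag units that reach the limit."""

    def __init__(self, limit: int = 20):
        self.limit = limit        # default matches the [20] prompt default
        self.counts = Counter()
        self.removed = set()

    def record(self, unit: str) -> bool:
        """Record one hard error; return True if the unit is now removed."""
        if unit in self.removed:
            return True
        self.counts[unit] += 1
        if self.counts[unit] >= self.limit:
            self.removed.add(unit)
        return unit in self.removed


if __name__ == "__main__":
    tracker = HardErrorTracker(limit=3)
    for _ in range(3):
        removed = tracker.record("D0030")
    print("D0030 removed:", removed)
```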

NARROW REPORT (Y/N) [N]?

Answering Y presents a narrow report which displays the
performance summaries in 32 columns. The default display,
selected by answering N, or carriage return, is 80 columns. The
format of this display is described in further detail in Section
5.7.11. This report format is intended for use by small
hand-held terminals.
ENABLE SOFT ERROR REPORTS (Y/N) [N]?

Answering Y to this prompt enables soft error reports. By
default, the operator does not see any soft error reports
specific to the number of retries required on a tape I/O
operation. An N response results in no soft error report. Soft
errors are classified as those errors that eventually complete
successfully after explicit controller-managed retry operations.
They include read, write, and read-reverse requested retries.


DEFINE PATTERN 0 - HOW MANY WORDS (16 MAX) (D) [16]?

If data pattern 0 was selected for any preceding drive, the size
of the data pattern must be defined at this time. The pattern
can contain as many as 16 words, which is also the default. If a
number larger than 16 is supplied, an error message is displayed
and this prompt is presented again. When a valid response is
entered, the following prompt is displayed the specified number
of times.
DATA IN HEX (H) [0]?

This prompt is displayed as many times as the number of words
specified in the previous response. ILEXER expects a
4-character hex value as the answer to this prompt.
5.7.7 ILEXER Data Patterns
The data patterns available for use with ILEXER are listed in the
following sections. Note that Pattern 0 is a user-defined data
pattern. Space is available for a repeating pattern of up to 16
words.
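
Several of the patterns listed in this section are regular enough
to regenerate, which can help when checking a dump by hand. The
following Python sketch reproduces the 16-word octal sequences
shown for Patterns 4, 5, 12, and 13. The function names are
hypothetical and are not taken from the ILEXER source.

```python
MASK = 0o177777  # one 16-bit word, all bits set


def pattern_4():
    """Pattern 4: ones shift in from the low-order bit."""
    return [(1 << (i + 1)) - 1 for i in range(16)]


def pattern_5():
    """Pattern 5: zeros shift in from the low-order bit."""
    return [(MASK << (i + 1)) & MASK for i in range(16)]


def pattern_12():
    """Pattern 12: a single one bit ripples through the word."""
    return [1 << i for i in range(16)]


def pattern_13():
    """Pattern 13: a single zero bit ripples through the word."""
    return [MASK ^ (1 << i) for i in range(16)]


def octal(words):
    """Format words as 6-digit octal strings, as in the listing."""
    return [f"{w:06o}" for w in words]


if __name__ == "__main__":
    for name, gen in (("4", pattern_4), ("5", pattern_5),
                      ("12", pattern_12), ("13", pattern_13)):
        print("Pattern", name, " ".join(octal(gen())))
```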

Pattern 0
User Defined

Pattern 1
105613

Pattern 2
031463

Pattern 3
030221

Pattern 4
Shifting 1s
000001
000003
000007
000017
000037
000077
000177
000377
000777
001777
003777
007777
017777
037777
077777
177777

Pattern 5
Shifting 0s
177776
177774
177770
177760
177740
177700
177600
177400
177000
176000
174000
170000
160000
140000
100000
000000

Pattern 6
Alter 1s,0s
000000
000000
000000
177777
177777
177777
000000
000000
177777
177777
000000
177777
000000
177777
000000
177777

Pattern 7
B1011011011011001
133331

Pattern 9
B110...
155554

Pattern 8
B0101.../B1010...
052525
052525
052525
125252
125252
125252
052525
052525
125252
125252
052525
125252
052525
125252
052525
125252

Pattern 10
26455/151322
026455
026455
026455
151322
151322
151322
026455
026455
151322
151322
026455
151322
026455
151322
026455
151322

Pattern 11

Pattern 13
Ripple 0
177776
177775
177773
177767
177757
177737
177677
177577
177377
176777
175777
173777
167777
157777
137777
077777

Pattern 14
Manufacture
155555
133333
155555
155555
133333
155555
155555
133333
155555
155555
133333
155555
155555
133333
155555
155555

Pattern 15
Patterns
155555
133333
066666
155555
133333
066666
155555
133333
066666
155555
133333
066666
155555
133333
066666
155555

Pattern 12
Ripple 1
000001
000002
000004
000010
000020
000040
000100
000200
000400
001000
002000
004000
010000
020000
040000
100000


066666

Data patterns for tapes follow:
Pattern 16
Alternating
one and zero
bits
125252
125252

Pattern 17
All ones

Pattern 20
Alternating
two bytes
ones and two
bytes zeros

Pattern 21
Alternating
three bytes
ones and one
byte zeros

Pattern 18
Alternating
bytes of all

Pattern 19

all ones

Pattern 22

5.7.8 Setting/Clearing Flags - ILEXER
One parameter is specified in Section 5.7.6 which allows the
operator to inhibit the display of soft error reports. No other
error reports can be inhibited.

5.7.9 ILEXER Progress Reports
ILEXER has three basic forms of progress reports: the data
transfer error report, the performance summary, and the
communication error report.

The data transfer error report is printed each time an
error is encountered in one of the drives being tested.

The performance summary report is printed when ILEXER
completes this pass on each drive being exercised or
when the operator terminates the pass via a CTRL Y.
This performance summary is also printed on a periodic
basis during the execution of ILEXER.

The communication error report is sent to the console
terminal any time ILEXER is unable to establish and
maintain communications with the drive selected for
exercising.

5.7.10 ILEXER Data Transfer Error Report
The report described here is printed on the terminal each time a
data transfer error is found during the execution of this pass of
ILEXER.
The report describes the nature of the error and all
data pertinent to the error found.


The data transfer error report is a standard HSC error log
message. It contains all data necessary to identify the error.
The only exception is when ILEXER itself performed a data check
and found an error during the compare, which results in an
ILEXER error report.
5.7.11 ILEXER Performance Summary
The Performance Summary is printed on the terminal at the end of
the testing session, when manually terminated, or every specified
number of minutes for the periodic performance summary. This
report provides statistical data which was being tabulated by
ILEXER during the execution of this test.

The performance summary presents the statistics which are
maintained on each drive. This summary contains the drive unit
number, the drive serial number, the number of position commands
performed, the number of 0.5 Kbytes read and written, the number
of hard errors, the number of soft errors, and the number of
software correctable transfers. For tape drives being exercised
by ILEXER, an additional report breaks down the software
correctable errors into eight different categories.
The frequency of report display is altered in the following
fashion:
1.  Type CTRL G during the execution of ILEXER.

    The following prompt is displayed:

    MFGEXR>D>Options are:
    MFGEXR>D>0 = No action
    MFGEXR>D>1 = (not implemented)
    MFGEXR>D>2 = (not implemented)
    MFGEXR>D>3 = frequency of performance summary
    Enter Option (0,1,2,3) (D) []:

2.  Enter the preferred option. Currently, the only
    options available are 0 and 3.

3.  If option 3 is selected, the following prompt is
    displayed. The valid values range from 1 to 3599 for
    the number of seconds between printings of performance
    summaries. From that point on, the summary is displayed
    as often as specified. The operator can enter a 1,
    which prints a performance summary immediately but does
    not alter the frequency of the report. Also note, a
    value equal to or greater than one hour is not allowed
    to avoid a reboot of the HSC.

    Interval time for performance summary in seconds (D) [30]?


The format of the Performance Summary follows:

PERFORMANCE SUMMARY (DEFAULT)

UNIT R SERIAL      POSI  KBYTE      KBYTE      HARD  SOFT  SOFTWARE
NO     NUMBER      TION  READ       WRITTEN    ERROR ERROR CORRECTED

Dddd   HHHHHHHHHHH ddddd dddddddddd dddddddddd ddddd ddddd ddddd
Tddd   HHHHHHHHHHH ddddd dddddddddd dddddddddd ddddd ddddd ddddd

A performance summary is displayed for each disk drive and tape
drive active on the HSC.
where:
1.

UNIT NUMBER - the unit number of the drive. D for disk,
T for tape. The number is reported in decimal.

R - the status of the drive. If an asterisk (*) appears
in this field, the drive was removed from the test and
the operator was previously informed.
If the field is
blank, the drive is being exercised.

SERIAL NUMBER - the serial number (hexadecimal) for each
drive.

POSITION - the number of seeks.

KBYTE READ - the number of 1 Kbytes read by ILEXER on
each drive.

KBYTE WRITTEN - the number of Kbytes written by ILEXER.

HARD ERROR - the number of hard errors reported by
ILEXER for a particular drive.

SOFT ERROR - the number of soft tape errors reported by
the exerciser if enabled by the operator.

SOFTWARE CORRECTED - the number of ECC correctable reads
encountered by ILEXER. Only ECC correctable errors
above the specific drive ECC error threshold are
reported via normal functional code error reporting
mechanisms. ECC correctable errors below this threshold
are not reported via an error log report, but only
included in this count maintained by ILEXER.


If any tape drives were being exercised, the following summary is
displayed within each performance summary.
UNIT  MEDIA  DOUBLE  DOUBLE  SINGLE  SINGLE  OTHER  OTHER  OTHER
NO    ERROR  TRKERR  TRKREV  TRKERR  TRKREV  ERR A  ERR B  ERR C

Tddd  ddddd  ddddd   ddddd   ddddd   ddddd   ddddd  ddddd  ddddd
etc.

where:
1.

MEDIA ERROR - the number of bad spots detected on the
recording media.

DOUBLE TRKERR - the number of double track errors
encountered during a read or write forward.

DOUBLE TRKREV - the number of double track errors
encountered during a reverse read or write.

SINGLE TRKERR - the number of single track errors
detected during a read or write in the forward
direction.

SINGLE TRKREV - the number of single track errors
encountered during a reverse read or write.

Other Err A-C - reserved for future use.
PERFORMANCE SUMMARY (NARROW)

ILEXER>D>PER SUM

D[T]ddd
SN HHHHHHHHHHHH
P ddddd
R dddddddddd
W dddddddddd
HE ddddd
SE ddddd
SC ddddd

This report is repeated for each drive tested.
If any tape drives are being tested, the following report is
issued for each tape drive following the disk drive performance
summaries.


ILEXER>D>ERR SUM
ILEXER>D>Tddd
ILEXER>D>ME ddddd
ILEXER>D>DF ddddd
ILEXER>D>DR ddddd
ILEXER>D>SF ddddd
ILEXER>D>SR ddddd
ILEXER>D>OA ddddd
ILEXER>D>OB ddddd
ILEXER>D>OC ddddd
5.7.12 ILEXER Communications Error Report
Whenever ILEXER encounters an error that prevents it from
communicating with one of the drives to be exercised, ILEXER
issues a standard error report. This report gives details
enabling the operator to identify the problem. For further
isolation of the problem, the operator should run another
diagnostic specifically designed to isolate the failure (ILDISK
or ILTAPE).
5.7.13 ILEXER Test Termination
Upon completion of the exercise on each selected drive, reporting
of any errors found, and display of final performance summary,
ILEXER terminates normally. All resources, including the drive
being tested, are released. The operator may terminate ILEXER
before normal completion by typing a CTRL Y. The following
output is displayed, plus a final performance summary:
ILEXER>D>hh:mm DIAGNOSTIC ABORTED
ILEXER>D>PLEASE WAIT - CLEARING OUTSTANDING I/O
Certain parts of ILEXER cannot be interrupted, so the CTRL Y may
have no effect for a brief moment and may need repetition.
Whenever ILEXER is terminated, whether normally or by operator
abort, ILEXER always completes any outstanding I/O requests and
prints a final performance summary.
5.7.14 ILEXER Error Message Format
ILEXER outputs four types of error formats: prompt errors, data
compare errors, pattern word errors, and communication errors.
These formats agree with the generic diagnostic error message
format (Section 5.1.1.1).
5.7.14.1 ILEXER Prompt Error Format - Prompt errors occur when
the operator enters the wrong type of data or the data is not
within the specified range for a parameter. The general format
of the error message is:


ILEXER>D>error message
Where the error message is an ASCII string describing the type of
error discovered.
5.7.14.2 ILEXER Data Transfer Compare Error Format - A data
transfer compare error occurs when an error is detected during
the exercise of a particular drive.

The two formats for the data transfer compare error are,
depending upon the type of error, data compare error and pattern
word error.
A data compare error occurs when the data read does not match the
expected pattern. The format of the data compare error is:
ILEXER>D>hh:mm T ddd E ddd U-uddd
ILEXER>D>Error Description
ILEXER>D>MA - HHHHHHHHHH
ILEXER>D>EXP - HHHH
ILEXER>D>ACT - HHHH
ILEXER>D>MSCP STATUS CODE = HHHH
ILEXER>D>FIRST WORD IN ERROR = ddddd
ILEXER>D>NUMBER OF WORDS IN ERROR = ddddd
where:
hh:mm - a time stamp since the start of ILEXER
T - the test number in the exerciser
E - corresponds to the error number
U - the unit number for which the error is being reported
MA - the media address (block number) where the error occurred
EXP - the expected data
ACT - the data (or code) actually received
MSCP STATUS CODE - the code received from the operation
FIRST WORD IN ERROR - describes the number of the first word
found in error.
NUMBER OF WORDS IN ERROR - once an error is found, the routine
continues to check the remainder of the data returned and counts
the number of words found in error.
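
The way the FIRST WORD IN ERROR and NUMBER OF WORDS IN ERROR
fields relate to the data can be sketched in Python as follows.
This is an illustration, not the ILEXER routine. The function name
compare_words is hypothetical, word numbering from zero is an
assumption (the manual does not state the origin), and the
expected data is assumed to be the pattern repeated over the
length of the transfer.

```python
from itertools import cycle, islice


def compare_words(actual, pattern):
    """Compare the words read against a repeating expected pattern.

    Returns None when all words match; otherwise returns a dict with
    the first word in error, the count of words in error, and the
    expected and actual values at the first mismatch.
    """
    if not pattern:
        raise ValueError("pattern must contain at least one word")
    expected = list(islice(cycle(pattern), len(actual)))
    # Keep checking after the first mismatch so every bad word is counted.
    bad = [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
    if not bad:
        return None
    first = bad[0]
    return {
        "first_word_in_error": first,
        "words_in_error": len(bad),
        "expected": expected[first],
        "actual": actual[first],
    }


if __name__ == "__main__":
    pattern = [0o052525, 0o125252]
    data = [0o052525, 0o125252, 0o052525, 0o000000, 0o052525, 0o125253]
    print(compare_words(data, pattern))
```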


The format for the pattern word error is slightly different from
the data compare error. A pattern word error occurs when the
first data word in a block is not a valid pattern number. The
format is:
ILEXER>D>hh:mm T ddd E ddd U-uddd
ILEXER>D>Error Description
ILEXER>D>MA - HHHHHHHHHH
ILEXER>D>EXP - HHHH
ILEXER>D>ACT - HHHH
The MSCP status code, first word in error, and number of words in
error are not relevant for this type of error. The other fields
are as described for the data compare error.
5.7.14.3 ILEXER Communications Error Format - Communications
errors occur when ILEXER cannot establish/maintain communications
with a selected drive. The error message appears in the following
format:

ILEXER>D>hh:mm T ddd E ddd U-uddd
ILEXER>D>Error Description
ILEXER>D>Optional Data lines follow here
where:
hh:mm - time stamp for the start of ILEXER
T - the test number in the exerciser
E - corresponds to the error number
U - the unit number for which the error is being reported.
Error Description - an ASCII string describing the error
encountered.
Optional Data lines - a maximum of eight optional lines per
report.
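
When error output is captured to a file, the first line of each
report can be split into its fields with a small parser such as
the Python sketch below. The function name parse_header and the
sample lines are hypothetical. The sketch assumes that the time
separator may be either a colon or a period and that the T and E
numbers may be preceded by a # character, since both forms appear
in this chapter.

```python
import re

_HEADER_RE = re.compile(
    r"^(?P<prog>[A-Z]+)>D>(?P<hh>\d{2})[:.](?P<mm>\d{2})\s+"
    r"T\s*#?(?P<test>\d+)\s+"
    r"E\s*#?(?P<error>\d+)\s+"
    r"U-(?P<unit>[DT]\d+)$"
)


def parse_header(line: str) -> dict:
    """Split the first line of an error report into its fields."""
    m = _HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an error report header: {line!r}")
    return {
        "program": m.group("prog"),
        "time": f"{m.group('hh')}:{m.group('mm')}",
        "test": int(m.group("test")),
        "error": int(m.group("error")),
        "unit": m.group("unit"),
    }


if __name__ == "__main__":
    # Hypothetical sample lines in the documented format.
    print(parse_header("ILEXER>D>09:32 T#006 E#204 U-T00100"))
    print(parse_header("ILEXER>D>12:05 T 001 E 111 U-D0030"))
```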
5.7.15 ILEXER Error Messages
This section lists the informational messages and error messages
and explains the cause of each error. A typical error message
looks like:

ILEXER>D>09.32 T#006 E#204 U-T00100
ILEXER>D>Comm Error: TBUSUB call failed


5.7.15.1 ILEXER Informational Messages - These messages are not
fatal to the exerciser. They alert the user to incorrect input
to parameters, indicate missing interfaces, or are informational.
#1

Number must be between 0 and 15 - reported when the user
entered an erroneous value for the data pattern on a
disk.

Pattern Number must be within specified bounds - reported
when the operator tries to specify a disk pattern number for a
tape.

You May Enter at Most 16 Words in a Data Pattern - reported
if the operator specifies more than 16 words for a user defined
pattern, and the operator is reprompted for the value.

Starting BN is either Larger than Ending BN or Larger
than Total BN on Disk - reprompts for the correct
values. The operator selected a starting block number
for the test which was greater than the ending block
number selected, or it is greater than the largest block
number for the disk.

Please Mount a Scratch Tape - appears after an N
response to the prompt asking if the scratch tape is
mounted on the tape drive to be tested.

Disk Interface Not Available - indicates the disk
functionality is not available to exercise disk drives.
This means the K.sdi is not available or not operable.

Tape Interface Not Available - indicates the tape
functionality is not available to exercise tape drives.
This means the K.sti is not available or not operable.

Please Wait - Clearing Outstanding I/O - printed when
the operator enters a CTRL Y to stop ILEXER. All
outstanding I/O commands are aborted at this time.

5.7.15.2 ILEXER Generic Errors - The following list indicates
the number, text, and cause of errors displayed by ILEXER.
#1

No Disk or Tape Functionality ... Exerciser Terminated -
Neither the K.sdi nor the K.sti interface is available to
run the exercise. This terminates ILEXER.

Could not Get Control Block For Timer - Stopping
Multi-Drive Exerciser - ILEXER could not obtain a
transmission queue for a timer. This should occur only
on a heavily loaded system and is fatal to ILEXER.


Could not Get Timer For MDE - Stopping Multi-Drive
Exerciser - The exerciser cannot obtain a timer. Two
timers are required for ILEXER. This should only occur
on a loaded system and is fatal to ILEXER.

Disk functionality Unavailable-Choose Another Drive -
The disk interface is not available. A previous message
is printed at the start of ILEXER if any of the
interfaces are missing. This error prints when the
operator still chooses a disk drive for the exercise.

Tape Functionality Unavailable-Choose Another Drive -
The tape interface is not available. A previous message
is printed at the start of ILEXER if any of the
interfaces are missing. This error prints when the
operator still chooses a tape drive for the exercise.

Couldn't Get Drive Status-Choose Another Drive - ILEXER
was unable to obtain the status of a drive for one of
the following conditions:
1.

The drive is not communicating with the HSC. Either
the formatter or the disk is not Online.

The cables to the K.sdi or K.sti are loose.

Drive is Unknown-Choose Another Drive - The drive chosen
for the exerciser is not known to the HSC functional
software for that particular drive type. Either the
drive is not communicating with the HSC or the
functional software has been disabled due to an error
condition on the drive.

Drive is Unavailable-Choose Another Drive - This may be
the result of:

The drive port button is disabled for this port.

The drive is Online to another controller.

The drive is not able to talk to the controller on
the port selected.

Drive Cannot Be Brought Online - ILEXER was unable to
bring the selected drive online. One of the following
conditions occurred:
1.

The unit went into an off-line state and cannot
communicate with the HSC.

The unit specified is now being used by another
process.


There are two drives of same type with duplicate
unit numbers on the HSC.

An unknown status was returned from the HSC
diagnostic interface when ILEXER attempted to bring
the drive online.

#10

Could not return Drive to Available State - The release
of the drive from ILEXER was unsuccessful. This is the
result of a drive being taken from the test due to
reaching an error threshold or going off line during the
exercise.

#11

User Requested Write on Write Protected Unit - The
operator should check the entry of parameters and also
check the write protection on the drive to make sure
they are consistent.

#12

No Tape Mounted on Unit ... Mount and Continue - The
operator specified a scratch tape was mounted on the
tape drive selected when it was not mounted. Mount a
tape and continue.

#13

Record Length larger than 12K or less than 0 - The
record length requested for the transfer to tape was
either greater than 12K or less than 0.

#14

This unit already acquired - A duplicate unit number was
specified for a drive and the drive had already been
acquired.

#15

Invalid time entered ... must be from 1 to 3599 - This is
reported when the user enters an erroneous value to the
performance summary time interval prompt.

#16

Could not get buffers for transfers - This message is
reported when the buffers required for a tape transfer
cannot be acquired.

#17

Tape rewind commands were lost - cannot continue - This
error message results from the drive being unloaded
during ILEXER execution.

5.7.15.3 ILEXER Disk Errors - The following list includes the
number, text, and cause of ILEXER disk errors.
#102

Drive Spindle not Up to Speed. Spin Up Drive And
Restart - The disk drive is not spun up.


#103

This Drive Removed From Test - This is reported when a
disk drive reaches the hard error limit or the drive
goes off line to the HSC during the exercise.

#104

Couldn't Put Drive in DBN Space - Removed From Test - An
error or communication problem occurred during the
delivery of an SDI command to put the drive in DBN
space.

#105

No DACB Available - Notify Field Support, submit SPR -
This is reported if no DACBs can be acquired. If this
happens, contact Field Support as soon as possible and
submit an SPR.

#106

Some Disk I/O Failed to Complete - An I/O transfer did
not complete during an allotted time period.

#107

Command Failed - Invalid Header Code - ILEXER did not
pass a valid header code to the diagnostic interface for
the HSC.

#108

Command Failed - No Control Structures Available - The
diagnostic interface could not obtain disk access
control blocks to run the exercise. The HSC could be
overloaded. Try ILEXER on a quiet system. If the error
still occurs, test the HSC memory.

#109

Command Failed - No Buffer Available - The diagnostic
interface could not obtain buffers to run the exercise.
The HSC could be overloaded. Try ILEXER on a quiet
system. If the error still occurs, test the HSC memory.

#110

Write Requested on Write Protected Drive - The operator
requested an initial write operation on a drive which
was already write protected. The operator should pop
out the write protect button on the drive reporting the
error or have ILEXER do a READ ONLY operation on the
drive.

#111

Data Compare Error - Bad data was detected during a read
operation.

#112

Pattern Number Error - The first two bytes of each
sector, which contain the pattern number, did not match.

#113

EDC Error - Error Detection Code error: invalid data
was detected during a read operation.

#114

Unknown Unit number not allowed in ILEXER ... - The
operator attempted to enter a unit number of the
form, 'Xnnnn', which is not accepted by ILEXER.

#115

Disk unit numbers must be between 0 and 4095 decimal -
The operator specified a disk unit number out of the
allowed range of values.

#116

Hard Failure on Disk - A hard error occurred on the disk
drive being exercised.

The following errors identify the function attempted by ILEXER
which caused an error to occur. Error logs do not indicate the
operation attempted.
#117

Hard Failure on COMPARE Operation - A hard failure
occurred during a compare of data on the disk drive.

#118

Hard Failure on WRITE Operation - A hard fault occurred
during a write operation on the specified disk drive.

#119

Hard Failure on READ Operation - A hard failure occurred
during a read operation on the disk drive being
exercised.

#123

Hard Failure on INITIAL WRITE Operation - A hard failure
occurred during the first write to the disk drive.

#124

Drive went spontaneously available - A drive which was
being exercised went into an Available state. This
could be caused by the operator releasing the port
button on the drive. A fatal drive error could also
cause the drive to go into this state.

5.7.15.4 ILEXER Tape Errors - The following list includes the
number, text, and cause of ILEXER tape errors.

#201

Couldn't Get Formatter Characteristics - A
communications problem with the drive is indicated. This
could be caused by the unit not being online.

#202

Couldn't Get Unit Characteristics - The drive is not
communicating with ILEXER. The unit could be off line.

#203

Some Tape I/O Failed to Complete - The drive or
formatter stopped functioning properly during a data
transfer.

#204

Communication Error: TDUSUB call failed - ILEXER cannot
talk to the drive via interface structures. They have
been removed. Either the drive went available from
online, or is off-line, or a fault occurred.

#205

Read Data Error - A read operation failed during a data
transfer, and none was transferred.

#206

Tape Mark Error ... rewinding to restart - ILEXER does not
write tape marks. If this error occurs, it indicates a
drive failure.

#207

Tape Position Lost ... rewinding to restart - An error
occurred during a data transfer or a retry of one.

#209

Data Pattern Word Error ... Possible Media Defect - The
first two bytes of a record containing the data pattern
did not match.

#210

Data Read EDC Error ... continuing ... - Error Detection
Code error - incorrect data was detected.

#211

Could Not Set Unit Char ... removing from test - The drive
is off line and not communicating.

#213

Truncated Record Data Error ... rewinding to restart -
More data was received than expected, indicating a drive
problem.

#214

Drive Error ... Hard Error ... continuing - A hard failure
occurred with the drive being exercised.

#215

Unexpected Error Condition ... removing drive from test -
This is caused by MSCP error conditions which are not
allowed (for example, invalid commands, unused codes, or
a write to a write-protected drive).

#216

Unexpected BOT encountered ... will try to restart - The
drive is experiencing a positioning problem.

#217

Unrecoverable Write Error ... rewinding to restart - A
hard error occurred during a write operation. The write
did not take place due to this error.

#218

Unrecoverable Read Error ... rewinding to restart - A hard
error occurred during a read operation and a data
transfer did not take place.

#219

Controller Error ... Hard Error ... rewinding to restart -
This indicates a communications problem between the
controller and the formatter.

#220

Formatter Error ... Hard Error ... continuing - A
communications problem exists between the formatter and
the controller and/or drive.

#221

Retry Required on Tape Drive - A read/write operation
which failed required a retry before succeeding.

#222

Hard Error Limit Exceeded ... removing drive from test -
The drive exceeded the threshold of hard errors
determined by a global user parameter (Section 5.7.6).
The drive is then removed from the exercise.

#224

Drive went Offline ... removing from test - The drive went
off line during the exercise. This is caused by the
operator taking the drive off line or a hard failure
forcing the drive off line.

#225

Drive went Available ... removing from test - The drive
became available to ILEXER and was not at the beginning
of the exercise.

#226

Short Transfer Error ... rewinding to restart - Less data
was received than transferred.

#227

Tape Position Discrepancy - The tape position was lost,
indicating a hard failure.
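
Several of the messages above (#13, #15, #114, and #115) report operator entries that fall outside a stated range. The following Python sketch restates those ranges as checks. It is illustrative only and is not ILEXER code: the function names are hypothetical, "12K" is assumed to mean 12 * 1024, and treating a non-numeric unit entry as error #115 is an assumption.

```python
# Limits quoted from ILEXER errors #13, #15, #114, and #115 in this section.
# Function names and the reading of "12K" as 12 * 1024 are assumptions.
MAX_RECORD_LENGTH = 12 * 1024
MIN_INTERVAL, MAX_INTERVAL = 1, 3599
MIN_DISK_UNIT, MAX_DISK_UNIT = 0, 4095


def check_record_length(length):
    """Return 13 if the tape record length is rejected, else None."""
    return None if 0 <= length <= MAX_RECORD_LENGTH else 13


def check_summary_interval(value):
    """Return 15 if the performance summary interval is rejected, else None."""
    return None if MIN_INTERVAL <= value <= MAX_INTERVAL else 15


def check_disk_unit(entry):
    """Return 114 or 115 if the disk unit entry is rejected, else None."""
    entry = entry.strip()
    if entry[:1].upper() == "X":
        return 114  # unit numbers of the form 'Xnnnn' are not accepted
    # Assumption: anything that is not a decimal number is reported as #115.
    if not entry.isdecimal():
        return 115
    if not MIN_DISK_UNIT <= int(entry) <= MAX_DISK_UNIT:
        return 115
    return None
```

Each function returns the error number from the lists above, so the caller can look up the message text.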

5.7.16 ILEXER Test Summaries
The test numbers in ILEXER correspond to the module being
executed within ILEXER itself. The main module is called MOE,
and it calls all other modules.

Test Number 1 - Main Program: MOE
Multi-drive Exerciser is the main program within ILEXER.
It is responsible for calling all other portions of
ILEXER. It obtains the buffers and control structures
for the exerciser. It verifies either disk or tape
functionalities are available before allowing ILEXER to
continue.

Test Number 2 - INITT
INITT is called to initialize drive statistic tables.
It obtains the parameters and verifies the values of
each one entered. This routine calls INICOD to obtain
drive specific parameters.

Test Number 3 - INICOD
INICOD is the initialization code for ILEXER. It gets
the various parameters for the drives from the operator
and fills in the drive statistic tables with initial
data for each drive. It also verifies the validity of
the input for the parameters. INICOD, in turn, calls
ACQUIRE to acquire the disk and/or tape drive.

Test Number 4 - ACQUIRE
ACQUIRE is responsible for acquiring the drives as
specified by the parameters. It brings all selected
drives online to the controller and spins up the disk
drives. Errors reported in this routine cause the
removal of the drive from the exercise.

Test Number 5 - INITD
INITD initializes the disk drives for the exercise.
This routine clears all disk access control blocks and
invokes the initial write.

Test Number 6 - TPINIT
TPINIT initializes the tape drives for the exercise. It
rewinds all acquired tape drives and verifies the drives
are at the BOT. If an error occurs, the drive is
removed from the exerciser. TPINIT is also responsible
for obtaining buffers for each acquired tape drive.

Test Number 7 - Exerciser
EXER is the main code of the exerciser. It dispatches
to the disk exerciser (QDISK and CDISK) and the tape
exerciser (TEXER). It continuously queues up I/O
commands to disk and tape, and checks for I/O
completion. The subroutines EXER calls are responsible
for sending commands and checking for I/O completion.

Test Number 8 - QDISK
QDISK is part of the disk exerciser which selects
commands to send to the disk drives. If the initial
write is still in progress, it returns to EXER. QDISK
calls a routine to select the command to exercise the
disk drive. The following scenario is the algorithm
used to select the command:
If the drive is read only and data compare is not
requested, a Read operation is queued to the drive.

If read only and data compare (occasional) are
requested, a Read operation is queued along with a
random choice of compare/not-compare.
If read only and data compare (always) are requested
by the operator, a READ-COMPARE command is queued to
the drive.
If write only is requested, and data compare is not,
then a WRITE request is queued up to the disk drive.
If write only and data compare (occasional) are
requested, a Write operation is queued along with a
random choice of compare/not-compare.
If write only and data compare (always) are
requested, a WRITE-COMPARE command is queued to the
drive.
If only data compare (occasional) is requested, then
a random selection of READ/WRITE and
compare/not-compare will be done.
If only data compare (always) is requested, a
COMPARE command is paired with a random selection of
READ/WRITE.
QDISK randomly selects the number of blocks for the
selected operation.
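
The selection rules above can be condensed into a short sketch. The Python below is illustrative only and is not the QDISK code: the function name, the string values for the two parameters, and the use of Python's random module are assumptions. The case where neither a read-only/write-only restriction nor data compare is requested is not described above, so the sketch assumes a random READ or WRITE without compare.

```python
import random


def select_disk_command(access, compare, rng=random):
    """Pick a disk command following the QDISK rules described above.

    access:  "read_only", "write_only", or "read_write" (no restriction)
    compare: "none", "occasional", or "always"
    """
    if access not in ("read_only", "write_only", "read_write"):
        raise ValueError("unknown access mode: %r" % (access,))
    if compare not in ("none", "occasional", "always"):
        raise ValueError("unknown compare mode: %r" % (compare,))

    # Choose the transfer direction.
    if access == "read_only":
        operation = "READ"
    elif access == "write_only":
        operation = "WRITE"
    else:
        operation = rng.choice(("READ", "WRITE"))

    # Decide whether a compare accompanies the transfer.
    if compare == "always":
        with_compare = True
    elif compare == "occasional":
        with_compare = rng.choice((True, False))
    else:
        with_compare = False

    return operation + "-COMPARE" if with_compare else operation
```

For example, select_disk_command("read_only", "always") returns "READ-COMPARE", while a "read_write" request with "occasional" compare returns any of the four combinations.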

Test Number 9 - RANSEL

RANSEL is the part of the tape exerciser which is
responsible for sending commands to the tape drives.
This routine is called by TEXER, the tape exerciser
routine. RANSEL selects a command for a tape drive
using a random number generator. Following are some
constraints for the selection process:
No reads when there are no records before or after
the current position.
No writes when there are records after current
position.
No position of record when no records are before or
after the current position.

Reverse commands are permitted on the drive only when 16
reverse commands have previously been selected.
That is, 1 out of every 16 reverse commands is sent
to the drive. Immediately following a reverse
command, a position to the end-of-written-tape is
performed. The reason for forward biasing the tape
is to prevent thrashing.
The following commands are executed in exercising the
tape drives:
1.  READ FORWARD

2.  WRITE FORWARD

3.  POSITION FORWARD

4.  READ REVERSE

5.  REWIND

6.  POSITION REVERSE

RANSEL randomly selects the number of records to read,
write, or skip.
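
One possible reading of the RANSEL constraints is sketched below in Python. It is not the RANSEL code. The class and method names are hypothetical, and the sketch assumes the following: forward reads and positions need records after the current position, reverse reads and positions need records before it, REWIND is always eligible and is not counted as a reverse command, and every 16th reverse selection is the one sent to the drive. The position to end-of-written-tape that follows a reverse command is not modeled.

```python
import random

REVERSE_COMMANDS = ("READ REVERSE", "POSITION REVERSE")


class TapeCommandSelector:
    """Random tape command selection with a forward bias (illustrative)."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.reverse_picks = 0  # reverse commands selected so far

    def allowed(self, records_before, records_after):
        """List the commands eligible at the current tape position."""
        commands = ["REWIND"]
        if records_after > 0:
            # Records lie ahead: reading and positioning forward are valid,
            # but writing is not.
            commands += ["READ FORWARD", "POSITION FORWARD"]
        else:
            commands.append("WRITE FORWARD")
        if records_before > 0:
            commands += list(REVERSE_COMMANDS)
        return commands

    def select(self, records_before, records_after):
        """Return the next command to send to the drive."""
        while True:
            command = self.rng.choice(self.allowed(records_before, records_after))
            if command not in REVERSE_COMMANDS:
                return command
            # Forward bias: only every 16th reverse selection is sent.
            self.reverse_picks += 1
            if self.reverse_picks % 16 == 0:
                return command
```

Because REWIND is always eligible, the loop in select() always has a non-reverse command available and cannot stall.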

Test Number 10 - CDISK
CDISK checks for the completion of disk I/O specified by
QDISK. CDISK checks the return status of a completed
I/O operation and if any errors occur, they are
reported.

Test Number 11 - TEXER
TEXER is the main tape exerciser which selects random
writes, reads, and position commands. TEXER processes
the I/O once it is completed and reports any errors
encountered.

Test Number 12 - EXCEPT
EXCEPT is the ILEXER exception routine. This is the
last routine called by MOE. EXCEPT is called when a
fatal error occurs, when ILEXER is stopped with a
CTRL/Y, or when the program expires its allotted time.
It cleans up any outstanding I/O, as necessary, returns
resources, and returns control to DEMON.

CHAPTER 6
OFFLINE DIAGNOSTICS

6.1 INTRODUCTION
This chapter describes the offline diagnostics, how to run them,
errors that can occur, and summaries of the tests in each
diagnostic. Included in the offlines are:

Offline Diagnostic Loader

Offline Cache Test

Offline Bus Interaction Test

Offline K Test Selector

Offline KIP Memory Test

Offline Memory Test

RX33 Offline Exerciser

Offline Refresh Test

Offline Operator Control Panel Test

The offline diagnostics contain specific common characteristics,
discussed in the following three sections. They are listed
below.

Identical software requirements

Common load procedure

Identical bootstrap initialization procedures

Generic error message format

6.1.1 Offline Diagnostics Software Requirements
All offline diagnostics require an RX33 Offline Diagnostic
diskette containing a bootable image of the offlines software
programs.

6.1.2 Offline Diagnostics Load Procedure
The Offline diagnostics diskette boots from either RX33 drive and
should not be write-enabled. This diskette contains the
necessary software to run all the HSC70 Offline diagnostics.
Booting is done either by powering on or by depressing and
releasing the Init switch with the Secure/Enable switch in the
ENABLE position. This causes the P.ioj ROM bootstrap tests to
run followed by the Offline P.ioj test.

NOTE
For offline diagnostics, the HSC70 must be booted
with the Secure/Enable switch in the ENABLE
position. If a hardware error occurs during
boot, the software executes a halt instruction on
certain errors. A halt instruction, even in
Kernel mode, is valid only if the Secure/Enable
switch is in the ENABLE position. Otherwise, the
result can be an illegal instruction trap in
addition to the error causing the halt.
In order for the bootstrap to complete successfully, the
following must be operational:

Basic instruction set of the PDP-11

First 2048 bytes of Program memory plus 8 Kwords of
contiguous Program memory below address 160000

Rx33 controller and at least one drive containing a
diskette with a bootable image

Before control is turned over to the HSC70 bootstrap ROMs,
internal microcode tests execute in the J11 chip set. Refer to
Table 2-1 for definitions of the J11 module (P.ioj) LEDs. Also,
refer to Figure 8-4 for details of the P.ioj internal self-test
procedures.
6.1.3 P.ioj ROM Bootstrap
The HSC70/J11 P.ioj ROM Bootstrap verifies the basic integrity of
the P.ioj module, Program memory, and the Rx33 controller/drive
subsystem. The goal of the bootstrap tests is to test enough of
the HSC70 to allow further test loading from the Rx33.

The bootstrap test is the first step in the HSC70 initialization
process. It is run for every bootstrap or reload of the HSC70
operating system (CRONIC). The bootstrap is initiated
automatically each time the HSC70 is powered on and is also
initiated by CRONIC when a software reboot is required.
The bootstrap is a PDP-11 program written to execute in a DCJ11
CPU in a stand-alone environment. This means no other software
processes co-exist with the bootstrap.
Bootstrap failures are reported via the Fault lamp mechanism
which specifies the module most likely causing the problem. The
fault codes are defined in Figure 4-2. An error table is
maintained in Program memory addresses 00000400 through 00000412.
These addresses contain the reasons for each Rx33 drive boot
failure.
6.1.3.1 Bootstrap Initialization Instructions - The following
procedure lists the operating instructions for the P.ioj ROM
Bootstrap. Refer to Section 6.1.3.2 if this procedure fails.

Insert an Offline Diagnostics diskette with a bootable
image into the RX33 unit 0 drive (left-hand drive).

Turn power ON.

Set the Secure/Enable switch to the ENABLE position,
then depress the Init switch. The bootstrap will
initiate automatically.

At this point, the J11 P.ioj module executes internal
microdiagnostics and then begins to execute from the boot ROM.
The Init lamp lights on the HSC70 operator control panel when the
bootstrap PDP-11 tests are done. The RX33 drive-in-use LED
should light within 8 to 10 seconds, indicating the bootstrap is
attempting to load software into Program memory. If the load is
successful, the bootstrap transfers control to the first
instruction of the image just loaded from the diskette.
6.1.3.2 Bootstrap Failures - Most bootstrap failures result in
lighting the Fault lamp on the HSC70 operator control panel.
When this happens, depress the Fault switch momentarily, and read
the failure code displayed in the operator control panel lamps.
Section 6.1.3.5 indicates the HSC70 modules most likely causing
the bootstrap failure. Momentarily depressing the Init switch on
the operator control panel reinitiates the bootstrap.

The microdiagnostic LEDs on the J11 module indicate if a hard
fault exists causing the J11 to hang before control is passed to
the boot ROM. Section 6.1.3.5 contains an explanation of these
LEDs. If a failure occurs in the tests of the PDP-11 basic
instruction set, the Fault lamp mechanism does not report the
failure. Instead, the PDP-11 executes a Branch dot (BR .) and does
not continue the bootstrap program. A failure of this type is easily
detected because the Init lamp does not light. (The Init lamp
does light immediately after the basic PDP-11 tests successfully
complete.)
When a console terminal is connected to the P.ioj, the exact
instruction that failed is determined by depressing the terminal
BREAK key and noting the address displayed on the terminal. With
a bootstrap listing, this address indicates the instruction that
failed. Notify Field Service Support to investigate such
failures.
NOTE
The bootstrap does not accept user-modifiable
flags.

6.1.3.3 Bootstrap Progress Reports - The bootstrap does not
issue progress reports in the usual sense; however, certain
indications of bootstrap progress are shown in the following
list:

Lamps Clear - the bootstrap clears all of the HSC operator
control panel lamps. If the lamps fail to clear immediately
after the bootstrap is initiated, a failure of the P.ioj
is probable. (Circuitry on the P.ioj module is
responsible for initiating the bootstrap program.)

Init Lamp - lights as soon as the basic tests of the
PDP-11 instruction set are finished. These tests
normally complete within milliseconds after the
bootstrap is initiated. Failure of the Init lamp to
light indicates a failure in the P.ioj PDP-11 processor.

RX33 Drive-in-Use - lights as the bootstrap tries to
load the Init P.ioj Test (or Offline P.ioj Test) from
the RX33 following the test of the PDP-11 and Program
memory.

State Lamp - lights when the bootstrap completes and
initiates the Init P.ioj Test (or Offline P.ioj Test).
When the State lamp is ON, the Init lamp is OFF.

Fault Lamp - lights during the boot process if the ROM
bootstrap tests have detected a fatal error (Section
6.1.3.4).

6.1.3.4 Bootstrap Error Information - Specific error codes for
the P.ioj bootstrap (Codes 21, 22, and 23) are described in
detail in Chapter 4.
Because the bootstrap operates in a stand-alone environment, it
does not use the terminal as an error reporting mechanism.
Instead, the HSC70 operator control panel lamps are used to
report errors and to indicate the module most likely causing the
error.
When the bootstrap detects an error, it lights the Fault lamp on
the operator control panel. When the Fault switch is depressed,
the bootstrap displays a failure code in the operator control
panel lamps. The failure code blinks on and off at one-half
second intervals.
6.1.3.5 Bootstrap Failure Troubleshooting - The ODT program
(built into the PDP-11 microcode) contains further information
about bootstrap failures. This information is shown in the
following list.

Init is Off, Fault is Lit - a failure was detected after
control was passed to the bootable image loaded from the
diskette.

Init and Fault Both Lit - the fault code displays when
the Fault switch is momentarily depressed. The program is
halted by depressing the BREAK key on the console
terminal. Now type: 172340/. ODT responds by
displaying the contents of address 172340, the test
number. Use the test number to refer to the appropriate
test in Section 6.1.4.

Init and Fault Lamps are Both Off - either the bootstrap
program was not automatically initiated, or the
bootstrap PDP-11 instruction test failed.
Before proceeding, ensure the Secure/Enable switch is
set to the ENABLE position. If the switch was not in
the ENABLE position when the Init switch was depressed,
the HSC70 did not initiate its boot sequence. If the
Secure/Enable switch is in the correct position, the J11
microdiagnostics may have failed.

To check the microdiagnostics, remove the card cage
cover and examine the four LEDs on the central edge of
the J11 module. At powerup, all the LEDs should be set
and then turned off as the J11 proceeds through its
microdiagnostic sequence. When viewed from the edge of
the P.ioj module, the LED indications are as follows:
ODT LED - Lit while in console ODT.
SLU LED - Lit when SLU failed to respond at 1777560
(console UART present).
MEM LED - Lit when Program memory did not respond
during microdiagnostics.
SEQ LED - Lit when very basic J11 internal sequence
test failed.

LED    LED    LED    LED
SEQ    MEM    SLU    ODT    PROBABLE FAILURE CAUSE
--------------------------------------------------
ON     ON     ON     ON     P.ioj
OFF    ON     OFF    OFF    M.std2 first, then P.ioj
OFF    ON     ON     OFF    P.ioj
OFF    OFF    ON     OFF    P.ioj
OFF    OFF    OFF    ON     P.ioj
--------------------------------------------------
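
The Init and Fault lamp combinations described in this section can be summarized as a small lookup. The Python sketch below is illustrative only: the function name and the wording of the returned strings are not from the HSC70 software. The Init-lit, Fault-off case is not listed in this section; the sketch infers it from the progress indications in Section 6.1.3.3, and that inference is an assumption.

```python
def interpret_lamps(init_lit, fault_lit):
    """Map the Init and Fault lamp states to the next troubleshooting step."""
    if fault_lit and not init_lit:
        return ("Failure detected after control passed to the bootable "
                "image loaded from the diskette.")
    if fault_lit and init_lit:
        return ("Bootstrap test failure: depress the Fault switch to read "
                "the fault code, then use ODT to examine address 172340 "
                "for the test number.")
    if not init_lit:
        return ("Bootstrap not initiated or PDP-11 instruction test failed: "
                "check the Secure/Enable switch, then the J11 module LEDs.")
    # Assumption: Init lit with Fault off means the basic PDP-11 tests
    # passed and the bootstrap is still running (see Section 6.1.3.3).
    return "Basic PDP-11 tests passed; bootstrap in progress."
```

A call such as interpret_lamps(init_lit=True, fault_lit=True) returns the instruction to read the fault code and examine address 172340.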

6.1.4 Bootstrap Test Summaries
This section summarizes the bootstrap tests:

Test 0 - Basic PDP-11 Instruction Set - This test
verifies the correct operation of a PDP-11 instruction
subset. This instruction subset includes only those
instructions required for completion of the bootstrap.
The following instructions are tested:
Single Operand Instructions Tested (both word and
byte mode):
ADC,CLR,COM,INC,DEC,NEG,TST,ROR,ROL,ASR,ASL,SWAB,NOP
Double Operand Instructions (both word and byte
modes):
MOV,CMP,BIT,BIC,BIS,ADD,SUB
Branch Instructions Tested:
BR,BNE,BEQ,BPL,BMI,BCC(BHIS),BCS(BLO),
BGE,BLT,BGT,BLE,BHI,BLOS,BVC,BVS
Jump and Miscellaneous Instructions Tested:
JMP,JSR,RTS,SOB,MTPS,MFPS,
CCC,CLN,CLV,CLZ,SEN,SEV,SEZ
Addressing Modes Tested:
All eight addressing modes
The PDP-11 instruction set test uses two methods of
reporting errors. During the initial part of the test,
errors result in an infinite program loop at the
location of the detected error. During the latter part
of the test (when enough instructions have been tested),
the Fault lamp mechanism is used to report failures.
Refer to Section 6.1.3.2.

Test 1 - Program Memory (Swap Bank) - The HSC70 memory
module includes special logic that permits changing the
address range of Program memory. This address range is
controlled by the Swap Banks bit in the P.ioj Control
and Status Register (CSR). This test verifies the Swap
Banks bit can be set and cleared. (The actual memory
switching is not tested, only the setting and clearing
of the bit is tested.) A failure in this test indicates
the P.ioj module must be replaced.

Test 2 - Program Memory (Vector Area) - In order for the
HSC70 Control Program to function, the first 2048 bytes
(addresses 00000000 through 00003777) of Program memory
must be working. This test verifies the first part of
Program memory is operating properly. If the test
fails, the SWAP BANKS feature is used, attempting to
swap a portion of memory into the 00000000 through
00003777 address range. If the test still fails after
SWAP BANKS has been invoked, a Program memory error is
reported via the Fault lamp mechanism (Section 6.1.3.2).
A failure in this test indicates the M.std2 module must
be replaced.

Test 3 - Program Memory (8 Kword Partition) - After
verifying the first part of Program memory is working,
the bootstrap tries to find an 8 Kword piece of Program
memory between address 00004000 and address 00160000.
This partition is used to load the Init P.ioj Test from
the RX33. If insufficient memory is available, a
Program memory error is reported via the Fault lamp
mechanism.
A failure in this test indicates the M.std2 module must
be replaced.
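
The address arithmetic in Tests 2 and 3 can be checked with a short sketch. The Python below is illustrative only and is not the bootstrap code, which is a PDP-11 program in ROM. The function name and the word-by-word scan are assumptions, and 1 Kword is assumed to mean 1024 16-bit words. The caller supplies word_is_good, a hypothetical test of one memory word.

```python
VECTOR_AREA_END = 0o3777         # last byte of the area checked by Test 2
SEARCH_START = 0o4000            # first address searched by Test 3
SEARCH_END = 0o160000            # search stops below this address
PARTITION_BYTES = 8 * 1024 * 2   # 8 Kwords at 2 bytes per PDP-11 word


def find_partition(word_is_good, start=SEARCH_START, end=SEARCH_END,
                   size=PARTITION_BYTES):
    """Return the first address of a contiguous good region, or None.

    word_is_good(address) is a caller-supplied test of one memory word.
    """
    run_start = None
    for address in range(start, end, 2):
        if word_is_good(address):
            if run_start is None:
                run_start = address
            if address + 2 - run_start >= size:
                return run_start
        else:
            run_start = None  # a bad word breaks the contiguous run
    return None
```

The constants confirm the figures in the text: addresses 00000000 through 00003777 cover 2048 bytes, and an 8 Kword partition occupies 16384 bytes (octal 40000).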

Test 4 - RX33 Controller Test - This test verifies basic
functionality of the control logic on the M.std2 module.
The four controller registers are tested for stuck bits.
The DMA hardware is checked for correct cycling and
addressing. The interrupt logic is checked to ensure
interrupts are properly acknowledged. With the control
hardware verified, the bootstrap proceeds to the next step
and tries to read data from one of the drives.

Test 5 - RX33 Drive/Interface Test - The goal of this
test is to find a working RX33 drive containing a
diskette with a bootable image. Such an image is
identified by a PDP-11 NOP instruction in the first word
of the image. The intended drive is checked for DRIVE
READY from the interface. Then RECAL/VERIFY commands
the drive to seek to track zero. This command then
reads the diskette header to verify the recal did move
the head to track 0.
After a suitable drive is found, the first eight blocks
of the diskette are loaded into the 8 Kword partition
found in Test 3. The eight blocks loaded consist of the
first five blocks of the Init P.ioj Test (or Offline
P.ioj Test), the RT-11 Volume ID block, and the first
RT-11 directory segment on the diskette. (The directory
blocks are loaded at this time to save directory look-up
time in the Init P.ioj Test or the Offline P.ioj Test.)
RX33 drive 0 is tested first. A failure with drive 0
causes the bootstrap to proceed to drive 1 and begin the
tests again. If neither RX33 drive is working
correctly, an RX33 error is displayed by the Fault
lamps. An error table is maintained in Program memory
addresses 00000400 through 00000412 which records why
each rejected RX33 drive failed the boot.
The error table follows:

Table 6-1  Error Table

Address     Meaning

00000400    Contains controller error code (code 1 or code 2)
00000402    RX33 address being accessed, if applicable
00000404    Expected result
00000406    Actual result
00000410    Drive error code, byte-encoded: Drive 1/Drive 0
            (high-byte/low-byte)

NOTE
It is not possible to simultaneously have
information in addresses 00000400 and 00000410.
If the boot fails with an RX33 error, the ODT feature of
the PDP-11 is used to examine the RX33 error table to
determine why each RX33 drive failed the test.
(Remember the bootstrap tries both drives before
declaring an error.) Use the following procedure to
examine the RX33 error table:
Depress the BREAK key on the console terminal.
The terminal should type out the address of the
current instruction of the bootstrap, and then
prompt for input with an @ character.
Type nnn (appropriate address).
The terminal should print the (octal) contents of
that address.
Type linefeed to examine Table 6-2.

Table 6-2  RX33 Error Code Table

Controller Error    Failure Information

 1    NXM occurred while accessing RX33 registers.
 2    A bit was stuck in the registers. See
      expected/actual for more information.
 3    Force mode interrupt did not occur.
 4    DMA test mode hardware error occurred.
 5    DMA address counters were wrong after transfer.
 6    Incorrect data found after DMA test operation.
 7    Data parity was bad after DMA test operation.
10    Drive was not ready (no diskette inserted or
      door was open).
11    Hard error (CRC or Record Not Found) occurred
      on recal/verify.
12    Track 0 bit was not set after recal.
13    SEEK command timeout occurred.
14    Seek error (CRC or Record Not Found) occurred.
15    READ SECTOR command timeout occurred.
16    Hard error (CRC or Record Not Found) occurred
      on read.
17    Nonbootable image (non-NOP instruction) is
      the first word.

Failure information for both drives in address 00000410
is possible. In this case, nonzero data is in both
bytes. Only when failures are detected on both drives
does the boot ROM generate a LOADFAL failure code and
branch to the fault light routine.
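
As an aid when reading the octal value that ODT displays for address 00000410, the following Python sketch splits the word into its two bytes and looks up each code. It is illustrative only and is not part of the HSC70 software: the function names are hypothetical, and the sketch assumes that each byte holds one of the codes 10 through 17 (octal) from Table 6-2, with zero meaning no failure was recorded for that drive.

```python
# Drive-level codes 10 through 17 (octal) from Table 6-2.
DRIVE_ERRORS = {
    0o10: "Drive was not ready (no diskette inserted or door was open).",
    0o11: "Hard error (CRC or Record Not Found) occurred on recal/verify.",
    0o12: "Track 0 bit was not set after recal.",
    0o13: "SEEK command timeout occurred.",
    0o14: "Seek error (CRC or Record Not Found) occurred.",
    0o15: "READ SECTOR command timeout.",
    0o16: "Hard error (CRC or Record Not Found) occurred on read.",
    0o17: "Nonbootable image (non-NOP instruction) is the first word.",
}


def split_drive_word(octal_text):
    """Return (drive 1 code, drive 0 code) from the word at 00000410."""
    word = int(octal_text, 8)
    return (word >> 8) & 0o377, word & 0o377   # high byte, low byte


def describe_drive_word(octal_text):
    """Map each drive number to its failure text, or None if no failure."""
    drive1, drive0 = split_drive_word(octal_text)
    result = {}
    for unit, code in ((0, drive0), (1, drive1)):
        if code == 0:
            result[unit] = None  # assumption: zero means nothing recorded
        else:
            result[unit] = DRIVE_ERRORS.get(code, "code not listed in Table 6-2")
    return result


def both_drives_failed(octal_text):
    """True when both bytes are nonzero, the condition for a load failure."""
    drive1, drive0 = split_drive_word(octal_text)
    return drive1 != 0 and drive0 != 0
```

For instance, a displayed value of 004417 decodes to code 11 for drive 1 and code 17 for drive 0. That value is a made-up sample for illustration, not a captured display.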

Test 6 - Transfer Control to Loaded Image - This part of
the bootstrap is not actually a test. However, it is
given a test number in case an error occurs in this
section of code. The PDP-11 general registers are
loaded with certain parameters (CSR and unit of load
device, base address, and size of partition, etc.). The
image loaded from the Rx33 is initiated by jumping to
the first instruction. Any errors occurring in this
part of the bootstrap are probably unexpected traps or
interrupts caused by intermittent P.ioj or M.std2
failures. When the loaded image is started, the State
lamp is lit, and the Init lamp is turned off.

6.1.5 Offline Diagnostics Error Reporting And Message Format
The method of reporting errors and the message format are common
to the offline diagnostics. All errors are reported on the
console terminal as they occur. In all offline diagnostics,
error messages conform to the HSC diagnostic error message
format.
The first line of an error message contains general information
concerning the error and is mandatory. The second line of an
error message consists of text describing the error and is also
mandatory. The third and succeeding lines of the message are
used for additional information where required, and are optional.
The generic error message format follows:

XXXXXX>hh:mm Tn En U000
SEEK error detected during positioning operation
optional line 1
optional line 2
optional line 3
where:
XXXXXX> is the prompt for the particular diagnostic in question
(such as OFLCXT> or OBIT>), hh:mm is the number of hours and
minutes since system boot, Tn is a test number in the range of
the number of tests in the specific test, En is an error number
with a range of 1 through 77 (octal), and U000 is the unit
number. The final field in the first line appears only in
diagnostics where such information is appropriate. Each error
number has a unique text string associated with it. For errors
that consist of results that did not compare with the expected
value, the diagnostic uses the optional lines to show
expected/actual (EXP/ACT) data. Errors on data transfers and
SEEK commands use the optional lines to print out the LBN, track,
sector, and side to help isolate problems to the media or the
drive.
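
When error messages are captured from the console into a log, the first line can be split into its fields mechanically. The Python sketch below is illustrative only. The function name, the regular expression, and the exact spacing it accepts are assumptions; the manual does not state the radix of the test number, so it is kept as text, while the error number is converted from octal as described above.

```python
import re

# Assumed layout of the first line: PROMPT>hh:mm Tn En [Unnn]
HEADER = re.compile(
    r"^(?P<prompt>[A-Z0-9]+)>"
    r"(?P<hours>\d+):(?P<minutes>\d{2})\s+"
    r"T(?P<test>\d+)\s+"
    r"E(?P<error>[0-7]{1,2})"
    r"(?:\s+U(?P<unit>\d+))?\s*$"
)


def parse_error_header(line):
    """Split the first line of an offline diagnostic error message.

    Returns a dict of fields, or None if the line does not match.
    """
    match = HEADER.match(line)
    if match is None:
        return None
    return {
        "prompt": match.group("prompt"),
        "minutes_since_boot": int(match.group("hours")) * 60
                              + int(match.group("minutes")),
        "test": match.group("test"),             # radix not stated; kept as text
        "error": int(match.group("error"), 8),   # error numbers are octal
        "unit": match.group("unit"),             # None when the field is absent
    }
```

With a made-up sample line such as "OBIT>00:12 T3 E14 U000", the function reports 12 minutes since boot and error number 12 decimal (14 octal).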
6.2 OFFLINE DIAGNOSTICS LOADER
The Offline Diagnostic Loader provides a software environment for
the HSC70 Offline diagnostics. The Loader supports a command
language that loads and executes an offline diagnostic from the
Rx33 into Program memory. The Loader command language also
permits the display and modification of any address contents in
the HSC70 Program, Data, or Control memories.

The software environment provided for Offline diagnostics
includes an RX33 driver and a terminal driver. A standard
software interface between the diagnostics and the Rx33 and
terminal devices takes the place of individual interface routines
within the diagnostics. The Loader also maintains a timer that
keeps track of the relative time since the Loader was last
booted. This allows diagnostic error messages to be
time-stamped.
6.2.1 Offline Diagnostic Loader System Requirements
Hardware required to run the Offline Diagnostic Loader includes:

I/O control processor module with HSC70 Boot ROM.

At least one M.std2 (memory) module.

Rx33 controller with at least one working drive.

Terminal connected to I/O control processor console
interface.

6.2.2 Offline Diagnostic Loader Prerequisites
In the process of loading the Offline Diagnostic Loader, several
diagnostics are run. The ROM Bootstrap tests the basic PDP-11
instruction set, tests a partition in Program memory, and tests
the RX33 used for the boot. Then the bootstrap loads the Offline
P.ioj Test which completes the PDP-11 tests and the remainder of
the I/O control processor module tests.

After these tests, the Offline Diagnostic Loader is loaded from
the RX33 to memory and control is passed to the Loader. Due to
the sequence of tests that precede the Loader, the Loader assumes
the I/O control processor module and the Rx33 are tested and
working.
6.2.3 Operating Instructions For The Offline Diagnostic Loader
Follow these steps to start the Offline Loader:

Insert the HSC70 Offline diagnostics diskette into the
RX33 Unit 0 drive (left-hand drive).

Power on the HSC70, or depress and release the Init
button on the HSC70 OCP.

The Rx33 drive-in-use LED should light within a few
seconds, indicating the Bootstrap is loading the Offline
Diagnostic Loader to Program memory.

In less than 30 seconds, the Offline Diagnostic Loader
indicates it has loaded properly by displaying the
following:
HSC70 OFL Diagnostic Loader, Version Vnnn
Radix=Octal,Data Length=Word,Reloc=00000000
ODL>

The Offline Loader is now ready to accept commands.
Section 6.2.4 contains information on the Loader command
language.

6.2.4 Offline Diagnostic Loader Commands
The following list describes the commands recognized by the
Offline Loader. Section 6.2.5.2 of this document is a copy of
the Offline Loader Help file.
6.2.4.1 Offline Diagnostic Loader HELP Command - The HELP
command supplies an abbreviated list of all commands the Loader
recognizes. In response to the HELP command, the Loader reads
the file OFLLDR.HLP from the Rx33 and displays the contents of
this file on the HSC70 console terminal. Section 6.2.5.2
contains a listing of the Loader Help file.

6.2.4.2 Offline Diagnostic Loader SIZE Command - The Offline
System Sizer is invoked by the SIZE command. The Sizer
determines the sizes of the HSC70 Program, Control, and Data
memories, and the type of requestor in each HSC70 requestor
position. (The requestor position refers to the priority of a
particular requestor on the Data and Control memory buses. It
does not match the numbering of module slots.)
6.2.4.3 Offline Diagnostic Loader TEST Command - The Offline
Diagnostic Loader TEST Command is used to invoke the various
offline diagnostics available on the HSC70. The following list
shows the particular form of the TEST command used to invoke each
diagnostic. In general, the TEST command format allows
specification of the system component to be tested. For
instance, the TEST MEMORY command invokes the Offline Memory
Test.
Offline Cache Test - verifies the full functionality of
the onboard cache. The Offline Cache Test is invoked by
the TEST CACHE command.

Bus Interaction Test - is invoked by the TEST BUS
command. The Bus Interaction test generates contention
on the HSC70 Data and Control memory buses by two or
more Ks simultaneously testing different sections of the
Control and Data memories. Two or more working
requestors are required to run this test (including the
K.ci).

K Test Selector - is invoked by the TEST K command. The
K Test Selector allows you to run specific requestor
microdiagnostics.

K/P Memory Test - is invoked by the TEST MEMORY BY K
command. The K/P Memory Test uses one of the HSC70
requestors to test either Data or Control memory. This
test runs faster than the Offline Memory Test because a
requestor is roughly seven times faster than the I/O
control processor. Program memory cannot be tested
using the K/P Memory Test because the Ks do not have an
interface to the Program memory bus.

Offline Memory Test - is invoked by the TEST MEMORY
command. This test uses the I/O control processor to
test Program, Control, or Data memories.

Offline RX33 Exerciser - is invoked by the TEST RX
command. It is a combined hardware diagnostic and
exerciser for the M.std2/RX33 subsystem of the HSC70.

Memory Refresh Test - is invoked by the TEST REFRESH
command. The Memory Refresh test allows the refresh
feature of the memories to be tested.

OCP Test - is invoked by the TEST OCP command. The OCP
(Operator Control Panel) test checks the HSC70 lights
and switches. The test requires manual intervention by
an operator.
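
The TEST command forms listed above can be collected into a lookup table. The following Python sketch is illustrative only: the dictionary and the diagnostic_for helper are hypothetical names and are not part of the Loader, and only the full command words from the list are handled (abbreviations are not).

```python
# Command forms and diagnostic names copied from the list above.
TEST_COMMANDS = {
    "TEST CACHE": "Offline Cache Test",
    "TEST BUS": "Bus Interaction Test",
    "TEST K": "K Test Selector",
    "TEST MEMORY BY K": "K/P Memory Test",
    "TEST MEMORY": "Offline Memory Test",
    "TEST RX": "Offline RX33 Exerciser",
    "TEST REFRESH": "Memory Refresh Test",
    "TEST OCP": "OCP Test",
}

def diagnostic_for(command):
    """Return the diagnostic a TEST command invokes, or None if unknown."""
    key = " ".join(command.upper().split())  # normalize case and spacing
    return TEST_COMMANDS.get(key)

print(diagnostic_for("test memory by k"))  # K/P Memory Test
print(diagnostic_for("TEST  MEMORY"))      # Offline Memory Test
```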

6.2.4.4 Offline Diagnostic Loader LOAD Command - The LOAD
command loads a program into HSC70 Program memory without
starting it. The command format is LOAD <filename>, where
<filename> is the name of any file on the HSC70 OFFLINE diskette.
The Loader finds the specified file and loads it into Program
memory. This command is useful when you want to patch a program
image before starting execution. After the patch is made, the
program can be initiated via the START command described next.
6.2.4.5 Offline Diagnostic Loader START Command - The START
command initiates the Loader program currently loaded in Program
memory. The START command can be used in conjunction with the
LOAD command (see preceding section), or it may be used to
reinitiate the last loaded offline diagnostic. This saves the
time required to reload the program from the RX33. For example,
suppose you typed SIZE to run the Offline System Sizer and, after
the Sizer completes, you wish to run it again. Typing START
followed by a carriage return restarts the Sizer without
reloading the program from the RX33, saving many seconds of load
time.
6.2.4.6 EXAMINE And DEPOSIT Commands - The EXAMINE and DEPOSIT
commands are used to display or modify the contents of any
location in the HSC70 Program, Control, and Data memories.
Qualifiers (switches) can be used with these commands to display
bytes, words, long words or quad words. The radix (octal,
decimal, hex) of the displayed data can also be controlled by
qualifiers. Alternately, the SET DEFAULT command can be used to
set the default data length and radix for all EXAMINE and DEPOSIT
commands (Section 6.2.4.6.7).
6.2.4.6.1 Offline Diagnostic Loader EXAMINE Command - The
EXAMINE command is used to display the contents of any location
in the HSC70 Program, Data, or Control memories. The format of
the command is: EXAMINE <address>. The <address> can be a
string of digits in the current (default) radix. Certain
symbolic addresses are also permitted (see Section 6.2.4.6.3).
EXAMPLE:

ODL> E 14017776
(D) 14017776 125252

In the example, the user entered a command to examine the
contents of location 14017776.
(Notice the EXAMINE command can
be abbreviated to a single E.) When the Loader displays the
contents of location 14017776, the address is preceded by a (D)
indicating the location is within Data memory. The display shows
the location contains the value 125252.
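
When console output is captured to a log, an EXAMINE display line can be split into its three fields. The Python sketch below is a hypothetical helper, not part of the Loader. It assumes the display is in the octal radix and that the line has exactly the layout shown in the example: a one-letter memory tag in parentheses, the address, then the data.

```python
import re

# Layout assumed from the example above: "(D) 14017776 125252".
EXAMINE_LINE = re.compile(
    r"^\((?P<tag>[A-Z])\)\s+(?P<addr>[0-7]+)\s+(?P<data>[0-7]+)$"
)

def parse_examine(line):
    """Split an octal EXAMINE display line into (tag, address, data)."""
    m = EXAMINE_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not an EXAMINE display line: {line!r}")
    return m["tag"], int(m["addr"], 8), int(m["data"], 8)

tag, addr, data = parse_examine("(D) 14017776 125252")
print(tag, format(addr, "o"), format(data, "o"))  # D 14017776 125252
```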
6.2.4.6.2 Offline Diagnostic Loader DEPOSIT Command - The
DEPOSIT command is used to modify the contents of any location in
the HSC70 Program, Control, or Data memories. The format of the
command is: DEPOSIT <address> <data>. The <address> can be a
string of digits in the current (default) radix. Certain
symbolic addresses are also permitted (Section 6.2.4.6.3).
EXAMPLE:

ODL> D 14017776 123456

In this example, the user entered a command to store the value
123456 in the contents of address 14017776. The previous
contents of this Data memory location are replaced with the value
specified in the DEPOSIT command (123456).
6.2.4.6.3 Offline Diagnostic Symbolic Addresses - The four
symbols used as symbolic addresses in a DEPOSIT or EXAMINE
command are described in the following list.
Asterisk (*) - indicates the Loader is to use the same
address as used in the last EXAMINE or DEPOSIT command.
For example, if you just examined the contents of
address 16012344, and you now wish to deposit the value
1234 into the same address, you can type DEPOSIT * 1234
instead of typing DEPOSIT 16012344 1234.

Plus sign (+) - is also used as a symbolic address.
This symbol means the Loader is to use the address
following the last address used by an EXAMINE or DEPOSIT
command. When the Loader sees a + as an address, it
takes the last address used by EXAMINE or DEPOSIT and
adds an offset which depends on the current default data
length (Section 6.2.4.6.7).
If the current default data length is a byte, the Loader
adds one to the last address. If the default was a
word, the Loader adds two to the last address. The
offset is four for longword data length and eight for
quadword. This feature is useful when examining a
number of items stored in successive locations.
For example, if you are examining a table of words
beginning at address 14125234, you would examine the
first location by typing EXAMINE 14125234. The next
location could now be examined by typing EXAMINE +
instead of typing EXAMINE 14125236.

Minus sign (-) - is also used as a symbolic address. It
indicates the Loader is to use the address preceding the
last address used by either command. When the Loader
sees a - symbol as an address, the Loader takes the last
address used by an EXAMINE or DEPOSIT and subtracts an
offset which depends on the current default data length
(Section 6.2.4.6.7).
If the current default data length is a byte, the Loader
subtracts one from the last address. If the default is
a word, the Loader subtracts two from the last address.
The Loader subtracts four for longword data length and
eight for quadword. This feature is useful in the same
way as the + symbol, but examines a table starting at
the highest address and proceeding down to lower
addresses.
For example, if you want to examine a table of words
that ends at address 14012346, you would examine the
last location of the table by typing EXAMINE 14012346.
The preceding location in the table could now be
accessed by typing EXAMINE - instead of having to type
EXAMINE 14012344.

At symbol (@) - is used as a symbolic address. This
symbol means the Loader should use the data from the
last EXAMINE or DEPOSIT command as an address. This
feature is useful when following linked lists. For
example, you first examine location 123434 which
contains a pointer to a linked list. Now, you can type
EXAMINE @ to examine the location pointed to by the
first location.
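
The four symbolic addresses can be summarized as a small resolution rule. The following Python sketch models the behavior described above. The function name, its arguments, and the assumption that a non-symbolic operand is typed in octal are illustrative choices, not Loader internals.

```python
# Offsets per default data length, as described for the + and - symbols.
STEP = {"byte": 1, "word": 2, "long": 4, "quad": 8}

def resolve(symbol, last_addr, last_data, length="word"):
    """Return the address that an EXAMINE or DEPOSIT operand stands for."""
    if symbol == "*":
        return last_addr                 # same address as last command
    if symbol == "+":
        return last_addr + STEP[length]  # address following the last one
    if symbol == "-":
        return last_addr - STEP[length]  # address preceding the last one
    if symbol == "@":
        return last_data                 # last data used as an address
    return int(symbol, 8)                # assumes the default radix is octal

print(format(resolve("+", 0o14125234, 0), "o"))  # 14125236
print(format(resolve("-", 0o14012346, 0), "o"))  # 14012344
```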

6.2.4.6.4 Repeating EXAMINE And DEPOSIT Commands - When
troubleshooting memory problems, continuously executing an
EXAMINE or DEPOSIT command is sometimes useful. The REPEAT
command is used for this continuous execution. Type REPEAT,
followed by the EXAMINE or DEPOSIT command to be repeated.

EXAMPLE 1 - Repeating a DEPOSIT command
REPEAT DEPOSIT 14017776 125252
or RE D 14017776 125252
In this example, the value 125252 is continuously deposited into
address 14017776. The format of the DEPOSIT command does not
change. The DEPOSIT command is just preceded by the word REPEAT.
Also, the REPEAT command can be abbreviated to RE.

EXAMPLE 2 - Repeating an EXAMINE command
REPEAT EXAMINE 14017776
or RE E 14017776
In using this example, you can continuously examine the contents
of address 14017776. The format of the EXAMINE command does not
change. The EXAMINE command is just preceded by the word REPEAT.
In the example shown, the contents of location 14017776 are
displayed continuously on the terminal. This slows down the
repetition of the command and wastes paper on hard copy devices.
Output to the terminal can be stopped by typing CTRL/O. However, the
Loader also provides a special EXAMINE command qualifier
(/INHIBIT) for suppressing output to the terminal. This
qualifier is discussed in Section 6.2.4.6.6. To stop a repeated
command, type CTRL/C.
6.2.4.6.5 Offline Diagnostics Relocation Register - The Loader
provides a relocation register. It can be used to reduce the
number of address digits typed for an EXAMINE or DEPOSIT command
when all addresses are in either the Control or Data memories.
The contents of the relocation register are added to the address
given with an EXAMINE or DEPOSIT command. The relocation
register contains a zero when the Loader is initiated, so it
normally has no effect on the addresses typed in an EXAMINE or
DEPOSIT command.
If you wish to examine a large number of locations in Data
memory, use the following example:
EXAMPLE 1 - Relocation to Data memory
ODL> SET RELOCATION:14000000
ODL> EXAMINE 0
(D) 14000000 123432
ODL> EXAMINE 1234
(D) 14001234 154323
Load the relocation register with the address of the first
location in Data memory (14000000). When you issue an EXAMINE
command with an address of 0, the Loader adds the relocation
register to the address given resulting in the examination of
address 14000000. Likewise, when an EXAMINE command with an
address of 1234 is issued, the Loader displays the contents of
location 14001234.
The following example shows how to examine a large number of
locations in Control memory.

EXAMPLE 2 - Relocation to Control memory
ODL> SET RELOCATION:16000000
ODL> EXAMINE 0
(C) 16000000 125252
ODL> EXAMINE 4320
(C) 16004320 125432
The relocation register is loaded with the address of the first
location in Control memory (16000000). When an EXAMINE command
is issued with an address of 0, the Loader adds the relocation
register to the address given, displaying the contents of address
16000000. Likewise, when the user issues an EXAMINE command with
an address of 4320, the Loader displays the contents of location
16004320.
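
The relocation arithmetic in both examples is a plain octal addition. The Python sketch below reproduces it. The function name is hypothetical, and the typed address is assumed to be in the octal radix.

```python
def effective_address(typed, relocation=0):
    """Add the relocation register to an address typed in octal."""
    return int(typed, 8) + relocation

data_base = 0o14000000     # as set by SET RELOCATION:14000000
control_base = 0o16000000  # as set by SET RELOCATION:16000000

print(format(effective_address("1234", data_base), "o"))     # 14001234
print(format(effective_address("4320", control_base), "o"))  # 16004320
```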
6.2.4.6.6 Offline Diagnostics EXAMINE And DEPOSIT Qualifiers (Switches)

/NEXT - allows an EXAMINE or DEPOSIT command to work on
successive addresses. The format is /NEXT:<number>. When used
with a valid EXAMINE command, it specifies that after the
location given in the command has been displayed, the Loader
should also display the next <number> locations following the
first. For example, the command E 1000/NEXT:5 results in the
display of locations 1000, 1002, 1004, 1006, 1010, and
1012 (assuming the default radix is octal and the default data
length is a word). The <number> argument can be any value in the
current default radix that can be contained in 15 binary bits or
less. For instance, if the default radix is octal, <number> can
be any value between 1 and 77777.
The /NEXT qualifier works the same way for the DEPOSIT
command, except that the data given with the DEPOSIT
command are stored in the location specified and in the
next <number> locations following.

/BYTE/WORD/LONG/QUAD - are used to control the
data-length of examined or deposited data. Normally,
the Loader uses the default data-length (Section
6.2.4.6.7) when data is examined or deposited. However,
the data-length qualifiers can be used to override the
default for a single examine or deposit. For instance,
assume the default data-length is currently a word, and
you wish to examine a byte quantity at address 16001234.
The command EXAMINE 16001234/Byte followed by a carriage
return would display the proper byte without affecting
the default data length.

/OCTAL/DECIMAL/HEX - can be used with an EXAMINE command
to control the radix of the address and data displayed.
They are NOT used to control the radix of the address
supplied in the EXAMINE command. The radix of the
address and data displayed by an EXAMINE command is
usually controlled by the current default radix (Section
6.2.4.6.7), but the /OCTAL/DECIMAL/HEX qualifiers
override the default radix for a single EXAMINE
command. For example, assume the default radix is
octal. The command EXAMINE 14001234/Hex followed by a
carriage return displays the contents of address
14001234 (octal) in the hexadecimal radix. The EXAMINE
display would be as follows:
(D) 30029C HHHH
HHHH represents the contents (hex) of the location displayed.
The address is also displayed in hex.

/INHIBIT (abbreviated to /INH) - inhibits the display of
examined data when repeating an EXAMINE command. This
is useful both for saving paper on hardcopy devices and
for speeding up the EXAMINE operation for scope-loop
purposes. For example, the command REPEAT EXAMINE
16012346/INH results in the Loader continuously reading
the contents of location 16012346 without displaying
anything at the console.
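
Two of the qualifier examples above involve arithmetic that is easy to check by hand or in a few lines of code: the address sequence produced by /NEXT and the octal-to-hex conversion performed by /HEX. The Python sketch below is illustrative; next_addresses is a hypothetical name, and the step sizes follow the data-length offsets described for the + symbol.

```python
STEP = {"byte": 1, "word": 2, "long": 4, "quad": 8}

def next_addresses(start, count, length="word"):
    """Addresses touched by <start>/NEXT:<count>: the first plus count more."""
    return [start + i * STEP[length] for i in range(count + 1)]

# E 1000/NEXT:5 with octal radix and word data length
print([format(a, "o") for a in next_addresses(0o1000, 5)])
# ['1000', '1002', '1004', '1006', '1010', '1012']

# EXAMINE 14001234/Hex displays the octal address in hexadecimal
print(format(0o14001234, "X"))  # 30029C
```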

6.2.4.6.7 Setting And Showing Defaults - The SET DEFAULT command
is used to change the default radix and/or data length. The
default radix controls the radix of parameters supplied with
EXAMINE or DEPOSIT commands and the radix of data displayed by
the EXAMINE command. The default data length controls the length
(byte, word, long, quad) of data displayed by the EXAMINE command
or data stored by a DEPOSIT command.
The default radix may be set to octal, decimal, or hexadecimal.
When the Offline Loader first starts, it sets the default radix
to octal. Type in Set Default Hex followed by a carriage return
to set the default radix to hexadecimal. After the default radix
is set, it remains so until another SET DEFAULT command is issued
or the Loader is rebooted.
The default data length may be set to byte, word, longword, or
quadword. When the Loader is first started, it sets the default
data length to word (16 bits). Type Set Default Long followed
by a carriage return to set the default data length to longword
(32 bits). Setting the default data length to longword causes an
EXAMINE command to display longword quantities and causes the
DEPOSIT command to store longword quantities. (Because the
Loader is executing in a PDP-11, longwords are stored and
retrieved as two successive 16-bit words.) After the default data
length is set, it remains so until changed by another SET DEFAULT
command or until the Loader is rebooted.
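
Because the default radix governs how typed parameters are interpreted, the same digit string can name different values. The Python sketch below illustrates this, along with a range check against the default data length. The helper names are hypothetical. The word and longword widths come from the text above; the byte (8 bits) and quadword (64 bits) widths are assumed from conventional usage.

```python
RADIX = {"octal": 8, "decimal": 10, "hex": 16}
# word and long are stated in the text; byte and quad are assumed widths.
BITS = {"byte": 8, "word": 16, "long": 32, "quad": 64}

def parse_operand(text, radix="octal"):
    """Interpret a typed parameter in the given default radix."""
    return int(text, RADIX[radix])

def fits(value, length="word"):
    """True if value can be held in one item of the given data length."""
    return 0 <= value < (1 << BITS[length])

print(parse_operand("1000"))          # 512
print(parse_operand("1000", "hex"))   # 4096
print(fits(parse_operand("125252")))  # True
print(fits(0o200000))                 # False
```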

6.2.4.6.8 Executing INDIRECT Command Files - The Loader is
capable of executing indirect command files stored on the RX33.
These command files consist of valid Offline Loader commands,
each terminated by a carriage return <CR> and a line feed <LF>.
Comments may also be placed in indirect command files by
preceding a comment line with an exclamation mark (!). Comment
lines must also be terminated with a <CR> and <LF>. As an
example, the Offline Loader Help file is an indirect command file
that contains only comments (Section 6.2.5.2).

Indirect command files cannot be created by the Loader or by
CRONIC. The command files must be created in RT-11 format and
stored on the Offline Diagnostics diskette. Any editor that does
not insert line numbers in the output files can be used to create
command files.
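
The line-termination rule is the detail most easily missed when a command file is prepared on another system. The Python sketch below builds the file contents with an explicit <CR><LF> after every line. It is a hypothetical preparation step only: the function name and the sample commands are illustrative, and placing the result on the diskette in RT-11 format is outside the scope of the sketch.

```python
def build_command_file(lines):
    """Join Loader commands and '!' comments, ending each line with CR LF."""
    return b"".join(line.encode("ascii") + b"\r\n" for line in lines)

script = build_command_file([
    "! Hypothetical example: look at the start of Data memory",
    "SET RELOCATION:14000000",
    "EXAMINE 0",
    "EXAMINE +",
])
print(script.count(b"\r\n"))  # 4
```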
6.2.5 Offline Diagnostics Unexpected Traps And Interrupts
When the Loader detects an unexpected trap or interrupt, the
following message is displayed:

Unexpected trap through www, VPC=xxx, PSW=yyy
Error Address = zzz
where:
www = Address of the trap or interrupt vector
xxx = Virtual PC of Loader at time of trap
yyy = Contents of PSW at time of trap
zzz = Address of location causing NXM or parity trap

The first line of the unexpected trap report is issued for all
unexpected traps or interrupts. The second line is only issued
if the trap was through vector addresses 000004 (NXM trap) or
000114 (parity trap). The address of the vector is a direct clue
to the cause of the trap. Refer to Section 6.2.5.1 for a list of
the devices and error conditions associated with each vector.
The Virtual PC (VPC) of the instruction executing when the trap
occurs is sometimes useful in determining the cause of the trap.
The VPC can be referenced in the listing to find the instruction
causing the trap. Remember, the VPC is the address of the
instruction following the instruction executing when the trap
occurred. Notify Field Service Support to analyze such failures.
NXM traps can be caused by EXAMINE or DEPOSIT commands if you
specify an address not contained in a particular HSC70. For
example, if an HSC70 only contains data memory from addresses
14000000 through 14177776, and you try to examine or deposit
address 14200000, the Loader reports an NXM trap. In this
example, the NXM trap would not represent an error condition.
Parity traps can be caused by an EXAMINE command if a user
examines an address not initialized with good parity. For
example, when the HSC70 memories are powered on, the parity bits
are in random states. Thus, if a user examines a location not
written since power-on, the location may generate a parity error.
This does not constitute an error condition.
However, if a location produces a parity error and that location
has been written since power-on, a memory error is indicated.
(Also note the I/O control processor and Ks have bits allowing
them to write bad parity for testing the parity circuit. These
bits should never be used except by diagnostics.)
6.2.5.1 Offline Diagnostics Trap And Interrupt Vectors - Following
is a list of trap and interrupt vectors for various devices and
error conditions recognized by the I/O control processor PDP-11
processor:

Vector    Device or Error Condition

000004    Non-Existent Memory, Stack Overflow, Halt in User
          Mode, and Odd Address Trap
000010    Illegal Instruction
000014    BPT Instruction
000020    IOT Instruction
000024    Power Fail Interrupt
000030    EMT Instruction
000034    TRAP Instruction
000060    Console Terminal - Receiver Interrupt
000064    Console Terminal - Transmitter Interrupt
000100    Line Clock Interrupt
000114    Parity Trap
000120    Control Bus Interrupt - Level 4
000124    Control Bus Interrupt - Level 5
000130    Control Bus Interrupt - Level 6
000134    Control Bus Interrupt - Level 7
000230    RX33 Interrupt
000250    MMU Abort (Trap)
000300    SLU (Serial Line Unit) 1, Receiver Interrupt
000304    SLU (Serial Line Unit) 1, Transmitter Interrupt
000314    SLU (Serial Line Unit) 2, Receiver Interrupt
000310    SLU (Serial Line Unit) 2, Transmitter Interrupt
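
When reviewing a captured console log, the first line of an unexpected trap report can be decoded against the vector list. The Python sketch below is a hypothetical helper and makes several assumptions: the field values are octal digit strings, the spacing matches the message template in Section 6.2.5, and only a subset of the vector list is copied into the table. The sample values in the usage line are invented for illustration.

```python
import re

# Subset of the vector list above (octal vector -> condition).
VECTORS = {
    0o004: "Non-Existent Memory, Stack Overflow, Halt in User Mode, Odd Address Trap",
    0o010: "Illegal Instruction",
    0o100: "Line Clock Interrupt",
    0o114: "Parity Trap",
    0o230: "RX33 Interrupt",
    0o250: "MMU Abort (Trap)",
}

TRAP_LINE = re.compile(
    r"Unexpected trap through (?P<vec>[0-7]+), "
    r"VPC=(?P<vpc>[0-7]+), PSW=(?P<psw>[0-7]+)"
)

def describe_trap(line):
    """Decode the first line of an unexpected trap report."""
    m = TRAP_LINE.search(line)
    if m is None:
        raise ValueError("not an unexpected trap report")
    vec = int(m["vec"], 8)
    return {
        "vector": vec,
        "cause": VECTORS.get(vec, "not in this table"),
        # The Error Address line is issued only for vectors 000004 and 000114.
        "has_error_address": vec in (0o004, 0o114),
    }

info = describe_trap("Unexpected trap through 000004, VPC=001234, PSW=000340")
print(info["cause"], info["has_error_address"])
```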

6.2.5.2 Offline Diagnostics Loader Help File - An example of the
Offline Diagnostics Loader Help File follows:

!HSC70 OFL Diagnostic Loader Help File - Vnn-nn
!Capital letters = required input, lower case = optional
!COMMANDS (terminated by CR):
'Examine <address>'         ;display data at <address> specified
'Deposit <address> <data>'  ;deposit <data> to <address>
   <address> = digit string in current default radix or:
      '*' = use same address as last Ex or De
      '+' = use address following last address
      '-' = use address preceding last address
      '@' = use <data> from last Ex or De as <address>
'HElp'                      ;print this file
'@filename'                 ;execute indirect command file
'Load filename'             ;load file to diagnostic partition
'REpeat <command>'          ;repeat specified command until ^C
'SEt Default <option>'      ;set default radix or data length
   <option> = Byte,Word,Long,Quad,Hex,Octal,Decimal
'SEt Relocation:#'          ;set relocation register to #
   NOTE: Relocation register is 22-bit positive # added to address
         of all EXAMINE and DEPOSIT commands.
'SHow'                      ;display defaults and Loader version #
'SIze'                      ;Size HSC70 memories and display K status
'Start'                     ;start program in diagnostic partition
'Test Bus'                  ;load and start the OFL Bus Test
'Test MEmory'               ;load and start the OFL Memory Test
'Test MEmory By K'          ;load and start the OFL K/P Memory Test
'Test K'                    ;load and start the OFL K Test Selector
'Test OCP'                  ;load and start the OFL OCP Test
'Test Refresh'              ;load and start the OFL Memory Refresh Test
QUALIFIERS (switches) for 'Ex' and 'De'
'/Next:#'                   ;repeat Ex or De on next '#' addresses
'/Byte,/Word,/Long,/Quad'   ;use specified length vs. default
'/Octal,/Decimal,/Hex'      ;use specified radix for Examine display
'/INHibit'                  ;inhibit display of examined data
<end of help file>
6.3 OFFLINE CACHE TEST

The Offline Cache Test is a diagnostic that runs under the
Offline Loader in a stand-alone environment. It provides in-depth
testing of the cache logic on the J11 P.ioj. It verifies the
full functionality of the onboard cache. Execution time for a
single pass is between 16 seconds and 4 minutes depending on the
options selected.
6.3.1 Offline Cache Test System Requirements

The Offline Cache Test is loaded into memory via the Offline
Loader. This test requires 8 Kwords of memory to run. One-half
of this memory space contains the program; the other half is used
as a cached buffer. All terminal I/O and handling of the line
clock is done by the Offline Loader.

6.3.2 Offline Cache Test Operating Instructions
This section contains operating instructions specific to the
Offline Cache Test. If the HSC70 is not booted and running the
Offline Loader, the necessary instructions are found in Section
6.1.2, Section 6.1.3, and Section 6.2. If the HSC70 is already
booted and running the Offline Loader, enter the TEST CACHE
command at the ODL> prompt and press RETURN.
This command loads the Offline Cache Test from the media and
transfers control to the diagnostic. When it starts, the Offline
Cache Test should display the following:
HSC OFFLINE Cache Test Vxxx
Where Vxxx is a 3-digit version/edit number.

User-modifiable parameters are described in the following
section.
6.3.3 Offline Cache Test Parameter Entry
Following are the three user-modifiable parameters for the cache
test. In each case the default (invoked by a carriage return) is
shown in brackets. If no default is possible, the brackets are
empty.
Select Data Reliability Test - is the first
user-modifiable parameter, an optional selection of the
data reliability tests. It is a moving-inversions style
test for exercising the RAM array. The Offline Cache
Test prints:
Run extended cache ram test (Y/N) [N] ?
Selection of this optional test increases test time per
pass to about four minutes. It is useful for the
manufacturing burn-in and test areas. It is not
necessary to run this optional test in order to fully
verify the health of the cache.

Leave Cache Enabled - determines the cache state at the
termination of the diagnostic. The Offline Cache Test
prints out:
Leave cache enabled after successful completion (Y/N) [N]

This feature allows enabling the cache for further use
after running the diagnostic to verify the cache is
working. If the diagnostic detects any hard failures in
the cache, it is not enabled at the end of the
diagnostic. This prevents complications if the cache
contains hard failures and is inadvertently turned on.

Number of Passes - accepts a total number of passes from
1 to 32767 (decimal). The test prompts for this number
as follows:
# of passes to perform (D) [1] ?
Any decimal number up to 32767 can be used. Fatal
errors can cause the diagnostic to terminate before the
specified number of passes executes.
At the completion of the total passes requested by the user, the
diagnostic prompts:
reuse parameters (Y/N) [Y] ?
Answering this prompt with a Y allows you to rerun the diagnostic
with the same parameters as before. Answering with an N causes
repetition of the parameter entry questions.
6.3.4 Offline Cache Test Progress Reports
The Offline Cache Test provides summary information at the end of
each pass. The end of pass message is similar to this:

End of Pass 00001, 00000 Errors, 00000 Total Errors
The Errors field contains the number of errors for the pass. The
Total Errors field contains a running total of errors accumulated
since the start of the diagnostic.
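
For unattended runs whose console output is captured, the end-of-pass summary can be extracted with a short parser. The Python sketch below is a hypothetical helper. It assumes the three counters are decimal and that the wording matches the sample message exactly, which the text describes only as "similar to" the real output.

```python
import re

END_OF_PASS = re.compile(
    r"End of Pass (\d+), (\d+) Errors, (\d+) Total Errors"
)

def parse_end_of_pass(line):
    """Return (pass number, errors this pass, total errors), or None."""
    m = END_OF_PASS.search(line)
    if m is None:
        return None
    pass_no, errors, total = (int(g) for g in m.groups())
    return pass_no, errors, total

print(parse_end_of_pass("End of Pass 00001, 00000 Errors, 00000 Total Errors"))
# (1, 0, 0)
```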
6.3.5 Offline Cache Test Error Information
The Offline Cache Test displays the errors detected during
execution on the console terminal. All error messages follow the
offline diagnostics generic error message format (Section 6.1.5)
preceded by an OFLCXT> prompt.

Each error number has a unique text string associated with it.
For errors with results that did not compare with the expected
value, the diagnostic uses the optional lines to show
expected/actual data.
Soft errors (such as cache parity errors) can accumulate to a
point where the diagnostic classes them as fatal. The test then
terminates on a fatal error.
6.3.5.1 Specific Offline Cache Error Messages - The following
list describes in detail each possible error message. The errors
are listed in numerical order.

Error 00 - Memory parity error, VPC = xxxxxx (Applicable
to all tests.) - can occur at any time during execution
of the diagnostic. The virtual PC on the stack is
printed to help identify the program area where the
error occurred. The content of the error address
register is also displayed.
Both the virtual PC and the error address register
content are optional lines. Detection of this error
causes the testing to cease. Then the diagnostic
returns to the Reuse parameters prompt.

Error 01 - NXM Trap, VPC = xxxxxx (Applicable to all
tests.) - causes the diagnostic to return to the Reuse
parameters prompt. Additional data (such as the virtual
PC of the instruction which caused the trap and the
physical address contained in the error address
register) are printed as optional lines.

Error 02 - Cache parity error, VPC = xxxxxx (Applicable
to Tests 2 through 16.) - results when a trap through
the parity error vector is detected and the cache is
enabled. The virtual PC where the error was detected is
printed, as well as the content of the error address
register. If the 22-bit value in the error address
register is 177770024, no main memory error was present.
You can assume the parity error is from the cache.

Error 03 - Bit stuck in cache control register
(Applicable to Test 2.) - indicates a bit is
stuck-at-fault in the cache control register. The
expected and actual data values are printed as optional
lines.

Error 04 - Forced miss operation failed (Applicable to
Test 3.) - Bit 2 of the cache control register did not
prevent the cache from allocating a test location. This
could be a problem in the cache control gate array or in
the hit/miss compare logic.

Error 05 - Forced miss with abort failed (Applicable to
Test 3.) - bit 3 did not prevent the cache from
allocating when set. Failures of this nature mean the
cache cannot be disabled, and all memory references may
be allocating cache regardless of the intent of the code
being executed. The cache control gate array or the tag
compare logic may be at fault.

Error 06 - Expected cache hit did not occur (Applicable
to Tests 4, 6, 9, 12, and 14.) - The cache did not
allocate a given test location as expected, causing a
miss condition in the hit/miss register.

Error 07 - Expected cache miss did not occur (Applicable
to Tests 7, 9, and 10.) - A test location that was not
expected to be allocated or valid registered as a hit on
access.

Error 10 - Value in hit/miss register incorrect
(Applicable to Test 5.) - indicates the 6-bit value in
the hit/miss register was incorrect after a certain
sequence of instructions. The expected values, as well
as the actual content of the hit/miss register, are
printed as optional lines.

Error 11 - Write byte operation caused cache update
(Applicable to Test 6.) - A byte operation (on a miss)
did not cause cache to deallocate the test location.
Thus, when the test location was read back, a cache hit
resulted.

Error 12 - Write byte did not cause cache update
(Applicable to Test 6.) - A byte-value did not get
written into cache or main memory.

Error 13 - Cache failed to flush successfully
(Applicable to Test 8.) - When checking cache after a
flush command was executed, one or more locations still
contained valid data (were detected as cache hits).

Error 14 - Access with force bypass did not cause
invalidate (Applicable to Test 9.) - The second access
to an allocated location, with the force bypass bit (bit
9) set in the control register, did not result in a miss
as expected.

Error 15 - Tag Parity error did not set (Applicable to
Test 10.) - The diagnostic could not set the tag parity
error bit in the memory system error register when faced
with an actual tag parity error.

Error 16 - Abort on cache parity error did not occur
(Applicable to Test 11.) - The cache logic did not abort
the instruction under execution when a cache parity
error was forced and the abort bit (bit 7) was set in
the control register.

Error 17 - Unexpected parity trap during abort test
(Applicable to Test 10.) - Although expected to, cache
control Bit 0 did not prevent the cache logic from
taking a trap on bad parity. The address where the trap
occurred is printed as optional information.

Error 20 - Content of memory system error register
incorrect (Applicable to Test 11.) - The error bits in
the memory system error register (17777744) do not
reflect the correct status for the operation under test.
The expected and actual content are printed as optional
lines.

Error 21 - Return PC wrong during abort/interrupt test
(Applicable to Test 11.) - The return PC on the stack is
not equal to the value expected during an abort or
interrupt operation caused by a cache parity error. The
state sequencer gate array is most likely defective.

Error 22 - Cache data parity bit(s) did not set
(Applicable to Test 10.) - The diagnostic was unable to
set the data parity error bit(s) in the memory system
error register on a forced parity error. The parity
logic may not be detecting parity errors or one of the
bits in the memory system error register may be stuck
low.

Error 23 - Interrupt on parity error did not occur
(Applicable to Test 11.) - The cache did not interrupt
through vector 114 on a forced parity error. The state
sequencer or the parity detection logic may be faulty.

Error 24 - Expected NXM trap did not occur (Applicable
to Test 13.) - A NXM trap was not detected during an
access to location 1777757776. The timeout logic that
detects a NXM may be defective, or some problem may
exist in the cache data path gate array that prevents it
from acting on timeout.

Error 25 - Parity error was not blocked by NXM
(Applicable to Test 13.) - When accessing a location
expected to result in a NXM, the parity error flag set
instead, and a trap occurred through vector 114. The
NXM signal may not have been detected by the cache data
path gate array.

Error 26 - Cache data miscompare on word operation
(Applicable to Test 14.) - A word address in the cache
array did not have the correct data when read. This may
indicate address line faults or data path faults
allowing the location to be rewritten after the test
value was placed there. The expected/actual data values
are printed as optional lines.

Error 27 - Cache data miscompare on byte operation
(Applicable to Tests 14 and 15.) - A location in the
cache, when addressed in a byte fashion, did not have
the expected data pattern. This may indicate address
line faults or data path control faults which allowed
overwriting the expected value.


Error 30 - DMA write to memory did not cause cache to
invalidate (Applicable to Test 12.) - A DMA write by the
RX33 controller to a test location, allocated to cache,
still resulted in a hit status after the transfer. The
cache has stale data.

Error 31 - Instruction still completed during abort
condition (Applicable to Test 11.) - With the abort bit
set in the cache control register, an instruction set up
to detect a parity error on an operand fetch still
finished execution modifying the destination of the
instruction.

Error 32 - Load device error during DMA test (Applicable
to Test 12.) - The RX33 subsystem did not respond
correctly to the DMA test operation. There may be
faults in the RX33 controller or the interrupt service
logic. This message is informational in nature, and
this error is outside the scope of this diagnostic.

Error 33 - PDR cache bypass failed (Applicable to Test
7.) - Setting the PDR bypass bit in the PAR/PDR pair
under test did not bypass the cache. This points to an
MMU or cache data path gate array problem. The PDR
number and the CPU execution mode (Kernel or User) are
printed as optional lines in the error message.

Error 34 - Tag store address hit failure (Applicable to
Test 16.) - Changing the value of the tag bits (bits
16:22 of the physical address) still resulted in a hit
condition (even though the address should not have
compared) forcing a fetch to main memory. There may be
a problem in the tag RAMs, or the tag compare logic in
the cache data path may not be working.

Error 35 - Tag store address miss failure (Applicable to
Test 16.) - When going through the possible values for
the tag bits (16:22 of the physical address), the cache
failed to allocate for some combination of the bits.
Possible problems are stuck bits in the address lines
going to the cache array, bad RAMs in the cache array,
or a fault in the tag compare logic.

Error 41 - Processor type is not J11 (Applicable to Test
1.) - The processor type register does not show the
correct value for a J11 chip set. Attempting to run
this diagnostic on anything other than a J11 produces
this error.


6.3.6 Offline Cache Test Troubleshooting
All of the logic under test is contained on the J11 P.ioj module
with the exception of the memory used by the diagnostic. Main
memory parity errors usually point to the memory module. Because
much of the logic tested is buried within the two gate arrays on
the module, troubleshooting is often limited to a best-guess
replacement of one or both of these gate arrays.

Cache parity errors and data miscompare errors can usually be
traced to specific RAMs if proper attention is paid to the data
content and address.
For scope loops, the cache test should be run with a large number
of passes, and a CTRL/O typed on the console to inhibit error
message printout.
Constant hit/miss errors, or tag address hit problems, may also
be caused by the tag compare logic, which is separate from the
gate arrays and the data path.
6.3.7 Offline Cache Test Descriptions
Following are descriptions of the Offline Cache Tests 1 through
16.

Test 1 - Cache Register Access Test - checks for the
presence of the necessary cache control/status
registers, the cache control register (1777746), the
hit/miss register (1777752), and the memory system error
register (1777744). To perform further diagnosis, these
registers must respond.

Test 2 - Cache Control Register Bits - tests the
read/write bits of the cache control register (1777746)
for stuck-at faults. In addition, bits (8,11:15), which
are write-only, are checked for read data of zero. Bits
6 and 10, which cause data and tag parity to be written
incorrectly on new data allocated to cache, are treated
as special cases. After writing/reading each of these
bits, the cache is flushed to remove any bad parity
locations.


Test 3 - Force Miss Action - verifies that all references
made with either bit 3 or bit 2 of the cache control
register set cause a cache miss and leave the cache
entry unchanged. To perform this test, a test address is
first written with bits 3:2 cleared to allocate cache and
place a known data pattern into the cache. Then bit 2
is set, and the same test location is written again.
With bit 2 set, the cache will not update, and the data
in cache is still considered valid. When bit 2 is
cleared, and the test location is accessed again, the
old data from cache should be the result. If not, the
force miss action of bit 2 did not work. The same
sequence is repeated for bit 3, and the same results are
expected.

Test 4 - Hit/Miss Register Part I - checks the basic
operation of the hit/miss register in logging hit/miss
information on instruction fetches and data
reads/writes. The hit/miss register is critical to
further cache diagnosis, because it is the window into
what is actually going on inside the cache.
First, a test location is allocated with cache enabled.
Then cache is bypassed, and the test location is
accessed again by a write. This write should go
directly to main memory and bypass the cache. The cache
is enabled, and a read access to the test location
should result in a hit condition in the hit/miss
register. Then the test location offset by 8 Kwords is
accessed. This should result in a miss, since the upper
bits of the address (tag) will not match.
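
The hit and miss sequence that Test 4 looks for can be illustrated
with a small software model. The following Python sketch is not
HSC70 code. The direct-mapped organization, the ToyCache class, and
its write policy are assumptions made for illustration; only the
4 Kword cache size and the 8 Kword offset are taken from the test
descriptions.

```python
CACHE_WORDS = 4 * 1024      # cache size in words (4 Kwords, per Test 8)


class ToyCache:
    """Hypothetical direct-mapped cache model, for illustration only."""

    def __init__(self):
        self.lines = {}      # index -> (tag, data)
        self.enabled = True

    @staticmethod
    def split(word_addr):
        # Upper address bits form the tag, lower bits select the entry.
        return word_addr // CACHE_WORDS, word_addr % CACHE_WORDS

    def write(self, memory, word_addr, value):
        memory[word_addr] = value                 # memory is always written
        if self.enabled:
            tag, index = self.split(word_addr)
            self.lines[index] = (tag, value)      # allocate/update the entry

    def read(self, memory, word_addr):
        tag, index = self.split(word_addr)
        entry = self.lines.get(index) if self.enabled else None
        if entry is not None and entry[0] == tag:
            return "hit", entry[1]
        value = memory.get(word_addr, 0)
        if self.enabled:
            self.lines[index] = (tag, value)      # allocate on a miss
        return "miss", value


memory = {}
cache = ToyCache()
test_addr = 0o1000                         # arbitrary test location

cache.write(memory, test_addr, 0o125252)   # allocate with cache enabled
cache.enabled = False
cache.write(memory, test_addr, 0o052525)   # bypassed write: memory only
cache.enabled = True

first = cache.read(memory, test_addr)              # same tag: hit expected
second = cache.read(memory, test_addr + 8 * 1024)  # tag differs: miss expected
print(first[0], second[0])
```

In this model the second read misses because the 8 Kword offset
changes only the tag portion of the address, which is the condition
the test expects the hit/miss register to record.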

Test 5 - Hit/Miss Register Part II - checks all the
combinations of the six bits in the hit/miss register
for a single miss at different bit positions. This is
done by caching a certain sequence of instructions and
executing them, with miss conditions forced at each bit
position. At the completion of this test the hit/miss
register has been checked for both ones and zeros at
each bit position.

Test 6 - Byte Accesses - ensures byte references to the
cache are handled correctly by the control logic. The
first operation is a write byte to the test location not
allocated followed by a byte-read of the test location.
The read should result in a miss. Then the entire word
at the test location is allocated. The upper byte of
the test location is modified, and a cache hit is
expected. The entire word is also read and compared
against the expected result to see if the byte-write
occurred. A similar chain of events follows, this time
modifying the low byte.


Test 7 - PDR Cache Bypass Test - tests all of the Kernel
PDRs <0:7> as well as the user PDRs. It is very
important for the bypass cache bit (Bit 15 of any PDR)
to work correctly in the multiprocessing environment of
the HSC70.
To test PDR bypass, select from a table the PAR/PDR pair
to test. This PDR is remapped to point to Control
memory. Control memory is then written via the MMU
writing a data pattern and allocating cache. Control
memory windows are used to write Control memory to a
second pattern without involving the cache control
logic. When Control memory is read through the MMU with
the bypass bit set, the actual Control memory content
(second pattern) should be the result if the bypass bit
is actually set. If the old content (first pattern) is
read back, the bypass bit is not working. PARs 1, 2, 3,
5, and 6 are tested in this way.
PARs 0, 4, and 7 are treated as special cases due to
programming environment restrictions. They are tested
by allocating cache with some location mapped by the
PAR/PDR under test and then setting the bypass bit.
When the test location is read, the hit/miss register
should record a hit and then invalidate the location.
If the location is written or read again, it should
result in a miss as long as the bypass bit is set.
After all the Kernel PAR/PDR registers are tested, the
program maps user space identical to Kernel space and
switches into User mode to re-execute all the tests.
After all User PAR/PDR pairs have been tested, the
program swaps back into Kernel mode and proceeds to the
next test.

Test 8 - Cache Flush Action - allocates all 4 Kwords of
cache, and then executes a flush command by setting bit
8 in the cache control register. The cache control
logic then writes every location in cache with the data
value 17777746 and resets the valid bit for each
location. All 4 Kwords of cache allocated before the
flush are read again, and if any location responds with
a hit when read, an error is declared.

Test 9 - Unconditional Bypass to Main Memory - checks
the correct operation of Bit 9 of the cache control
register. Bit 9 is used to bypass cache in a fashion
similar to the bypass bits in the PAR/PDRs. Any
location allocated in cache before the bypass bit is set
results in a hit on the first access, and further
accesses all show as misses.


This function is used when it is desirable to
temporarily disable the cache in a fashion that does not
leave the cache with stale data when re-enabled. A test
location is allocated, and then the bypass bit is set.
The first access of the test location should be a hit,
and the second should be a miss.

Test 10 - Force Tag/Data Parity Errors - forces parity
errors in the tag and data fields of the cache array to
test the parity detection logic. A special diagnostic
mode is used, with bit 0 of the cache control register
and one of the force parity error bits set. When bit 0
is set, any trap through 114 is disabled on a parity
error detected in cache. If a parity trap does occur,
an error is declared.
First, tag errors are forced using bit 10 in the cache
control register. When this bit is set, locations
allocated to cache do so with bad tag parity. When
accessed again (resulting in a cache hit), they should
set the tag parity error bit (bit 5 in the memory system
error register). The force data parity error bit (bit 6
of the cache control register) is checked next. After a
location is allocated to cache with bad data parity,
further reads of that location result in setting the
data parity error bits (bits 6:7 of the memory system
error register). After using the force bad parity bits,
the program flushes the cache to remove these parity
errors.

Test 11 - Abort/Interrupt on Parity Errors - uses the
force parity error bits in the cache control register to
force parity errors in the cache array. Because testing
of the detection of such errors has been done, testing
of the other logic related to cache data or tag parity
errors can be done.
Different combinations of tag and data parity errors are
forced, with the cache control register set to interrupt
through 114, or abort through 114 on parity errors. An
interrupt through 114 should set the correct error
bit(s) in the memory system error register. Also, the
instruction detecting the parity error should complete.
On an abort through 114, the correct error bit(s) should
be set, but the instruction should not complete. If the
parity error is detected on the fetch of the source
data, the data in the destination of the instruction is
not modified. The PC on the stack after each interrupt
or abort instruction is checked against the PC that is
expected.


Test 12 - DMA Invalidate - modifies a location in a way
that leaves the cache with stale data unless the cache
logic detects the DMA change. The RX33/M.std2 subsystem is
used to generate DMA operations to program memory. A
DMA write to a program memory location allocated to
cache should result in a cache miss when it is accessed
after the DMA write.

Test 13 - Check Blockage of Parity Error on NXM Abort -
generates simultaneous NXM and parity errors. The NXM
trap should occur, overriding the parity error.

Test 14 - Cache Data RAM Test - tests the cache data
RAMs by mapping one PAR and using the cache solely for
data storage. A data pattern to detect dual-addressing
is written to the cache. Failures of the cache data to
match the expected data on read-back are considered
miscompare errors. The test is first done using word
addresses and test values, and then repeated with byte
addresses and byte data patterns. Each location
allocated is expected to be a hit from cache, and the
content is checked as well.

Test 15 - Tag Store RAM Test - checks the tag bits of
the cache array for dual address errors and stuck-at
faults. With the cache flushed and completely
deallocated, the first 256 locations of the cache are
written with a unique data value in each address. Then
the entire cache is read. Only the 256 locations
written should be cache hits, and only these locations
should have the expected data pattern. Then the upper
address bits are changed so a new combination of tag
bits results. This test is repeated 15 times until all
of the tag bits have been tested.

Test 16 - Data RAM Reliability Test - performs a
modified moving inversions test on the cache data RAM
array. Due to the geometry of the data RAMs, every
fourth bit is done concurrently to save time. This
results in using the same pattern in both nibbles of the
data word. This test must be selected by the user as it
does not normally run by default. About four minutes
are required to complete one pass of this test.

6.4 OFFLINE BUS INTERACTION TEST
The Offline Bus Interaction Test creates Control and Data bus
contention among the requestors in the HSC70 subsystem. The
contention is generated by simultaneously testing different
portions of the same memory (Control and/or Data) from different
requestors. In the process of testing the memories, the various
requestors in the subsystem contend with each other for the use
of the Control and Data buses.


In addition to the bus contention generated by the requestors,
you can select I/O control processor interaction with the
Program, Control, and Data memories, with the Operator Control
Panel (OCP), and/or the load device. If I/O control processor
interaction is selected, it occurs simultaneously with the bus
contention generated by the requestors.
This test requires a minimum of two working requestors in order
to operate and uses a maximum of seven requestors if they are
available. The more requestors available for use by this test,
the greater the amount of bus contention. A larger number of
requestors makes it easier to isolate failures to a particular
source. Also, the run time of this test increases linearly as
the number of requestors is increased.
If the Bus Interaction Test fails, you must first determine if
the failure was caused by an interaction problem. Determine this
by running the Offline K/P Memory Test (Test Memory By K). When
the test prompts for parameters, specify the requestor number of
the requestor that detected the failure in the Bus Interaction
Test. Also specify the same starting and ending addresses
displayed with the error report from the Bus Interaction Test.
If the requestor also fails the Offline K/P Memory Test, the
original problem was not an interaction problem. The problem
should be localized in the same manner as any ordinary memory
failure.
6.4.1 Offline Bus Interaction Test System Requirements
Hardware required to run this test is shown in the following
list.

I/O control processor module with HSC70 Boot ROMs

At least one M.std2 (memory) module

Working Control and Data memories

Rx33 controller with at least one working drive

Terminal connected to I/O control processor console
interface

At least two working requestors (K.sdi, K.sti, or K.ci.)

6.4.2 Offline Bus Interaction Test Prerequisites
Booting procedures and testing through successful loading of the
Offline Diagnostic Loader program is described in Section 6.1.2,
Section 6.1.3, and Section 6.2.


Due to the sequence of tests that precede the memory test, the
memory test assumes the I/O control processor module and the load
device are tested and working. This test also assumes the
Control and Data memories were previously tested with the Offline
Memory Test or the Offline K/P Memory Test and are working.
6.4.3 Offline Bus Interaction Test Operating Instructions
At the Loader prompt (ODL>), the operator types the TEST BUS
command and the Offline Bus Interaction test is loaded and
started. The test indicates it has been loaded properly by
displaying the following:
HSC OFL Bus Interaction Test
The test then sizes the Program, Control, and Data memories and
determines the number of requestors available for testing.
6.4.4 Offline Bus Interaction Test Parameter Entry
After displaying the program name and version, the Program,
Control and Data memories are sized. The bounds of each memory
are displayed on the terminal.
NOTE
For any of the Bus Interaction Test prompts, use
the DELete key to delete mistyped parameters
before the terminating carriage return is typed.
If you note an error in a parameter already
terminated with a carriage return, type a CTRL/C
to return to the Offline Loader. Then type
START, followed by carriage return, to restart
the test from the beginning.
The test prompts you to select the requestors used for the test,
as follows:
Use requestor 001, K.ci (Y/N) [Y] ?

Answer with a carriage return (or a Y followed by a carriage
return) if the K.ci should be used. Answer with an N followed by
a carriage return if the K.ci should not be used.
At least two working requestors must be used to run the bus
contention test because one requestor cannot generate bus
contention by itself. The program displays the following error
message if fewer than two requestors remain after you have
indicated which requestors should be used:
Not Enough Ks Available for Test


Next, the program prompts for the type of I/O control processor
interaction desired:
P.ioj Memory Interaction desired (Y/N) [Y] ?
Answer the prompt with a carriage return if you want I/O control
processor interaction with memory. Answer with an N followed by
a carriage return if you do not want I/O control processor
interaction with memory. If you answer the prompt with an N, the
following three prompts are skipped. If you answer the prompt
with a carriage return, the following prompts are displayed:
Interact with Program Memory (Y/N) [Y] ?
Interact with Control memory (Y/N) [Y] ?
Interact with Data Memory (Y/N) [Y] ?
For each prompt, answer with a carriage return if you want the
I/O control processor to interact with the specified memory while
the requestors are generating contention on the Control and Data
buses. Answer with an N followed by a carriage return if you do
not want the I/O control processor to interact with the specified
memory. (If I/O control processor interaction is selected, the
I/O control processor interacts with the memory at the same time
the requestors are generating Control and Data bus contention.)
The program next prompts for OCP interaction:
OCP Interaction Desired (Y/N) [Y] ?
If you want I/O control processor interaction with the OCP,
answer with a carriage return. If you do not want OCP
interaction, answer with an N, followed by a carriage return.
The test then prompts for load device interaction:
Interact with load device (Y/N) [Y] ?
If you want I/O control processor interaction with the load
device, answer with a carriage return. If you do not want such
interaction, answer with an N, followed by a carriage return.
The program then prompts:
Number of passes to perform (D) [1] ?
Enter a decimal number between 1 and 2,147,483,647 (omitting
commas), to specify the number of times the bus interaction test
should be repeated. (Entering a 0, or just a carriage return,
causes one pass of the test.) After the number of passes is
entered, the bus contention test begins. The test can be aborted
at any time by typing a CTRL/C. (The test may continue running
for a few seconds after CTRL/C is typed.)
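
The pass-count rule can be summarized in a few lines of Python.
This is only a sketch of the rule as stated above, not the
diagnostic's own code: the function name parse_pass_count is
hypothetical, and rejecting non-numeric or out-of-range replies
with an exception is an assumption, since the manual does not say
how the test handles them.

```python
MAX_PASSES = 2_147_483_647   # upper limit given for the pass count


def parse_pass_count(reply):
    """Return the number of passes implied by the operator's reply."""
    text = reply.strip()
    if text == "":
        return 1                     # bare carriage return: one pass
    if not text.isdigit():
        # Assumption: anything but plain decimal digits is rejected.
        raise ValueError("pass count must be a decimal number without commas")
    count = int(text)
    if count == 0:
        return 1                     # entering 0 also causes one pass
    if count > MAX_PASSES:
        raise ValueError("pass count exceeds 2147483647")
    return count


print(parse_pass_count(""), parse_pass_count("0"), parse_pass_count("25"))
```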
After the specified number of passes is completed, the following
prompt is issued:


Reuse parameters (Y/N) [Y] ?

To repeat the test using the last parameters specified, answer
this prompt with a carriage return or a Y followed by a carriage
return. To cause the test to prompt for new parameters, answer
the prompt with an N followed by a carriage return. Answering the
prompt with CTRL/C returns control to the Offline Loader.
6.4.5 Offline Bus Interaction Test Progress Reports
Each time the program completes one full set of bus contention
tests, an end of pass report is displayed. A pass consists of
completing a full set of contention tests, including: Control
Bus Tests, Data Bus Tests and Combined Control and Data Bus
Tests. The end of pass message is displayed as follows:

End of Pass nnnnnn, xxxxxx errors, yyyyyy total errors.
where:
nnnnnn = decimal count of the number of passes completed
xxxxxx = decimal count of the number of errors detected on
current pass
yyyyyy = decimal count of the total number of errors detected
since the test was initiated
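
When console output is captured to a file, the three counts can be
pulled out of each end of pass line with a short script. The Python
sketch below is an illustration only: the function name
parse_end_of_pass is hypothetical, and the exact spacing and
punctuation of the message are assumed from the layout shown above,
so the pattern is deliberately tolerant of the separator after the
pass number.

```python
import re

# Three decimal fields: pass number, errors this pass, total errors.
END_OF_PASS = re.compile(
    r"End of Pass\s+(\d+)\W+(\d+) errors,\s+(\d+) total errors"
)


def parse_end_of_pass(line):
    """Return the three counts as integers, or None if the line differs."""
    match = END_OF_PASS.search(line)
    if match is None:
        return None
    passes, current, total = (int(field) for field in match.groups())
    return {"pass": passes, "errors": current, "total_errors": total}


sample = "End of Pass 000003, 000001 errors, 000004 total errors."
print(parse_end_of_pass(sample))
```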
6.4.6 Offline Bus Interaction Test Error Information
All error messages produced by this test conform to the generic
diagnostic error message format (Section 6.1.5). Following is a
typical Bus Interaction Test error message:

OBIT>hh:mm T aaa E bbb U-000
Memory Test Error
Detected By K.sdi, requestor 006
MA -xxxxxxxx
EXP-yyyyyy
ACT-zzzzzz

<K-Error-Summary-Info>
Memory Test Configuration:
K.ci , requestor 001, M.ctl 16000700 - 16100274
K.sdi ,requestor 006, M.ctl 16100300 - 16177674
where:
hh = Hours since Offline Loader was last booted
mm = Minutes since Offline Loader was last booted
aaa = decimal number denoting test
bbb = decimal number denoting the error detected
xxxxxxxx = Address of location causing the error
yyyyyy = Data that was expected
zzzzzz = Data that was actually found
<K-Error-Summary-Info> = Refer to Section 6.4.6.1
Memory Test Configuration = Refer to Section 6.4.6.2

6.4.6.1 Requestor Error Summary - When the requestor reports a
memory test failure to the I/O control processor, the following
information is supplied:

Address of the failing memory location

Data expected and data actually found

Error summary information

The error summary information is supplied as a 3-bit field
including:
1. A bit indicating a parity error occurred while reading
the location

2. A bit indicating an NXM error occurred while accessing
the location

3. A bit indicating a Control Bus (CBUS) error occurred
while accessing the location

When a memory error report is issued for an error detected by the
requestor, the last line of the error report includes a list of
the error summary bits that were set (if any).
A Control Bus (CBUS) Error indicates the requestor asserted an
illegal combination of the three CCYCLE lines when accessing
Control memory. Because these lines were previously tested from
the I/O control processor (in the OFL P.ioj Test), a Control Bus
Error is probably caused by a problem with the requestor's
drivers that assert the CCYCLE lines.
6.4.6.2 Offline Bus Interaction Memory Test Configuration - The
memory test configuration lists each requestor being used for bus
interaction tests along with the section of memory each requestor
was testing when the failure occurred. The configuration
information consists of:

Type of requestor (K.ci, K.sdi, K.sti) and the requestor
number

Memory being tested by the requestor (M.ctl = Control
memory, M.data = Data memory)

First address of the chunk of memory being tested

Last address of the chunk of memory being tested

6.4.6.3 Offline Bus Interaction Test Error Messages - The
following list describes the nature of the failure indicated by
each error number:

Error 000 - Memory Test Error - indicates one of the
requestors detected a memory error in the Control or
Data memories. The following is a sample error report:
Memory Test Error
Detected by K.ci, requestor 001
MA -16010234
EXP-000177
ACT-000377
parity error
Memory Test Configuration:
K.ci , requestor 001, M.ctl 16000700 - 16100274
K.sdi, requestor 007, M.ctl 16100300 - 16177674
MA = The 22-bit address of the failing location
EXP = The data pattern expected by the requestor
ACT = The data pattern found by the requestor
Memory Test Configuration = other requestors enabled
when failure occurred.
This sample error report indicates the K.ci detected a
memory parity error while reading address 16010234 of
Control memory (M.ctl). The requestor expected to find
the value 000177 in the location but instead found the
value 000377. At the time the error occurred, the K.ci
in requestor 1 was testing addresses 16000700 through
16100274 of Control memory, and the K.sdi in requestor 7
was testing addresses 16100300 through 16177674 of the
Control memory.
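
Because the addresses in the report are octal, it can help to check
mechanically which requestor's range contains the failing address.
The Python sketch below uses the values from the sample report; the
function name owner_of and the tuple layout of the configuration
list are hypothetical and exist only for this illustration.

```python
def owner_of(address_octal, configuration):
    """Return (type, requestor, memory) for the range holding the address."""
    address = int(address_octal, 8)          # report addresses are octal
    for name, requestor, memory, first, last in configuration:
        if int(first, 8) <= address <= int(last, 8):
            return name, requestor, memory
    return None                              # address outside every range


# Memory Test Configuration lines from the sample report
configuration = [
    ("K.ci", 1, "M.ctl", "16000700", "16100274"),
    ("K.sdi", 7, "M.ctl", "16100300", "16177674"),
]

print(owner_of("16010234", configuration))
```

For the sample report, the failing address 16010234 falls inside the
range assigned to the K.ci in requestor 1.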

Error 001 - K Timed-out During Init - is displayed when
a requestor fails to complete its Init sequence in time.
This error usually indicates the specified requestor
failed one of its internal microdiagnostics. A sample
error report follows:
K Timed-out During Init
K.ci , requestor 001, Status = 104
Other Ks Enabled:
K.sdi, requestor 6
K.sdi, requestor 7

This sample error report indicates the K.ci in requestor
1 did not finish its initialization diagnostics in the
required time. The requestor status displayed with the
error report indicates the requestor failed test 4 of
its microdiagnostics (1XX in status = failed test XX).
Two other requestors were enabled at the time the
requestor K.ci timed-out. (One of these requestors may
be responsible for K.ci time-out.)


When the I/O control processor enables the requestor to
perform the memory test, the requestor begins its
initialization sequence (which includes executing
certain microdiagnostics). At the end of the
requestor's Init sequence, the requestor indicates it found
the K Control Area by complementing a pointer word in
Control memory. If the requestor fails to complement
this pointer word within 50 milliseconds (4.2 seconds
for the K.ci) of being enabled, error 001 is reported.
The contents of the K Status Register are displayed with
the error report.
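
The two rules given for error 001 (the Init time limit and the 1XX
status convention) can be restated as a short Python sketch. This
is an illustration only: the function names are hypothetical, the
status is handled as the three-digit string shown in the report,
and the test number is returned as the two XX digits without
interpreting their number base, which the text does not specify.

```python
DEFAULT_INIT_TIMEOUT_SECONDS = 0.050         # 50 milliseconds
INIT_TIMEOUT_SECONDS = {"K.ci": 4.2}         # the K.ci is allowed 4.2 seconds


def init_timeout(requestor_type):
    """Return the Init time limit, in seconds, for a requestor type."""
    return INIT_TIMEOUT_SECONDS.get(requestor_type, DEFAULT_INIT_TIMEOUT_SECONDS)


def failed_microdiagnostic(status):
    """Return the XX digits of a 1XX status, or None for any other status."""
    digits = status.strip()
    if len(digits) == 3 and digits.isdigit() and digits[0] == "1":
        return digits[1:]                    # 1XX in status = failed test XX
    return None


print(init_timeout("K.ci"), init_timeout("K.sdi"))
print(failed_microdiagnostic("104"))
```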

Error 002 - K Timed-out During Test - indicates the
specified requestor failed to complete its memory test
within the expected time. A sample error report
follows:
K Timed-out During Test
K.sdi, requestor 007, Status = 002
Memory Test Configuration:
K.ci , requestor 1, M.ctl 16000700 - 16100274
K.sdi, requestor 7, M.ctl 16100300 - 16177674
The sample error report indicates the K.sdi in requestor
7 never completed the memory test it was assigned. (Ks
are allowed up to one minute to complete a memory test.)
The memory configuration displayed with the error report
shows all Ks testing at the same time the K.sdi
timed-out. In this example, the K.ci in requestor 1 was
also testing at the time the K.sdi timed-out.
Test time-out failures may be caused by a failure in the
requestor that timed-out. They may also be caused by a
failure in one of the other requestors that was testing
at the same time.

Error 003 - Parity Trap - indicates the I/O control
processor detected a parity error. The 22-bit address
of the location causing the error is displayed as the MA
data in the error report, where:
MA = The address causing the parity trap.
VPC = The Virtual PC of the memory test at the time the
trap occurred. Reference this address in the listing to
locate the area of the test where the error occurred.

The data is lost when a parity trap occurs so no
expected or actual data can be displayed.

Error 004 - NXM Trap - indicates the I/O control
processor detected a Non-Existent Memory (NXM) error.
An NXM error is caused when no memory responds to a
particular address. The MA data in the error report
indicates the address which produced the NXM trap.


After the trap is reported, the program attempts to
restart the test from the beginning. The MA and VPC
fields have the same meanings as Error 003.
If this error occurs at a memory address that should be
in your memory configuration, the memory in question is
not supplying an ACK to the I/O control processor when
the specified address is presented on the memory bus.
The most probable point of failure is the logic on the
memory module that compares addresses on the memory bus
with the range of addresses to which the module should
respond. Also, the comparator itself could be faulty or
the [C IN, C OUT], [D IN, D OUT] or [P IN, P OUT] lines
on the backplane could be in error.

Error 005 - Memory Test Error (P.ioj Detected) -
indicates the I/O control processor detected an error
while testing Program memory. This error can only occur
if I/O control processor interaction with Program memory
is selected. This interaction consists of:
1. A series of PDP-11 instructions that perform
Read/Modify/Write (RMW) cycles to selected Program
memory locations.

2. Quick-verify tests of the entire Program memory
(done 6 Kwords at a time).

Error 005 can be caused by cross-talk between the
Program memory bus and either the Control or Data bus.
It can also be caused by a failure in the Program memory
logic which inhibits refresh cycles in the middle of a
RMW cycle.
NOTE
Errors 006 through 009 are HSC50 specific and do
not apply to the HSC70.

Error 010 (12 octal) - Cache Parity Trap, VPC = xxxxxx -
can happen during any test. The J11 trapped through the
parity vector. The error was caused by the cache.
NOTE
Errors 011 through 017 can occur on an HSC70
when load device interaction is enabled.

Error 011 - RX33 Drive Not Ready - indicates the drive
selected for the operation was not ready. The door may
be open or the diskette absent during a READ or POSITION
command.


Error 012 - RX33 CRC Error During Seek - indicates the
RX33 detected a CRC error during a seek. The RX33
could not verify position when reading header
information from the diskette.

Error 013 - RX33 Track 0 Not Set on Recalibrate -
indicates the RX33 did not show correct status after a
recalibrate command. A recalibrate (seek to track 0)
operation is performed before each block of read
operations.

Error 014 - RX33 Seek Timeout - prints if, during a
seek, the RX33 does not respond by interrupting.

Error 015 - RX33 Seek Error - indicates the seek error bit
(Bit 4 of the CR$) is set. At the end of a seek operation,
the RX33 was not at the position it expected.

Error 016 - RX33 Read Timeout - indicates the Rx33 did
not interrupt at the end of a READ command.

Error 017 - RX33 CRC/RNF Error on Read Command - can be
caused by a soft error or bad spot(s) on the disk. For
informational purposes, the following additional message
prints out:
First LBN In Transfer = xxxx
where:
xxxx is the LBN of the first block in the transfer. The
Offline Bus Interaction Test performs reads in blocks of
four.

6.4.6.4 Offline Bus Interaction K Memory Test Algorithm - The
Moving Inversions Memory Test (MOVI) is used to generate bus
contention among the requestors. Each requestor in an HSC
contains the Moving Inversions test as part of its
microdiagnostic software set. The Moving Inversions RAM test is
used to detect data and addressing problems in dynamic
semiconductor memories.
The following are the steps in the Moving Inversions Algorithm:
1. Write 000000 in each location being tested.

2. Read all locations in order from lowest to highest.
After reading a location and checking for a zero,
rewrite the same location with a single one in the
least-significant bit. Then reread the location and
verify the write worked correctly.

3. Again, read all locations in order from lowest to
highest, checking to see each location contains the data
previously written. Then rewrite the data found with a
single additional one bit and reread to check that the
write worked properly.

4. Repeat step 3 until the test pattern consists of a word
containing all ones (pattern 177777).

5. Repeat steps 1 through 4, but this time start at the
highest memory address each time and work down to the
lowest. However, instead of adding an additional 1, add
an additional 0. This changes each memory location from
all ones back to all zeros.

6. End of test.

7. All memory is cleared to 000000.

6.5 OFFLINE K TEST SELECTOR
The Offline K Test Selector allows you to command a K to perform
an internal microdiagnostic self-test. This Offline K Test
executes from the P.ioj and uses the HSC K Control Area for
instruction. You select the K for testing and the test number of
the microdiagnostic test for execution.
6.5.1 Offline K Test Selector System Requirements
The following hardware is required to run this test:

P.ioj (processor) module with HSC Boot ROMs

M.std2 (memory) module

A working section of Control memory for use as a K
Control Area

One working Rx33 drive

Terminal connected to the P.ioj console interface

At least one working K (K.sdi, K.sti, or K.ci)

Due to the sequence of tests that precede this test, you can
assume the P.ioj, Program memory, and Rx33 are working.
6.5.2 Offline K Test Selector Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If the Loader prompt (ODL>) is
displayed, follow these steps to start the K Test Selector:
1. Type the TEST K command. The RX33 drive-in-use LED
lights as the test is loaded.

2. The test indicates it has been loaded properly by
displaying the following:
HSC OFL K Test Selector

3. The test next prompts for parameters.

6.5.3 Offline K Test Selector Parameter Entry
This section gives detailed information on how to enter the test
parameters for the Offline K Test Selector. Items in square
brackets are the default value for each particular prompt. If no
default is possible, the brackets are empty.
NOTE
For any of the Offline K Test prompts, use the
DELete key to delete mistyped parameters before
the terminating carriage return is typed. If you
note an error in a parameter already terminated
with a carriage return, type CTRL/C to return to
the initial prompt and re-enter all parameters.
The Offline K Test Selector first prompts:
K requestor # (1 thru 7) [] ?

Answer this question with a single digit (1 through 7) that
specifies the requestor number of the K to be used. Terminate
the response by typing a carriage return. After the requestor
number is supplied, a K Control Area is located in Control memory
and tested. This area is required for communicating with the K
that will run its microdiagnostics. The test then prompts:
Test # (1 thru 11) (O) [] ?
Legal test numbers are octal numbers between 1 and 11(8), except
for Test 5. (Test 5 is the K's Control and Data memory test,
which is supported by the OFL K/P Memory Test.) Terminate the
test number with a carriage return. The test then prompts:
# of passes to perform (D) [1] ?

Enter a decimal number between 1 and 2,147,483,647 (omitting
commas) to specify the number of times the test should be
repeated. (Entering a zero, or just a carriage return, results
in performance of one pass.)

The P.ioj next instructs the K to perform the selected test, and
allows up to 4.2 seconds for the K to complete its test. If the
K completes the test within this time, the P.ioj displays an
end-of-pass message. If the K fails to complete within 4.2
seconds, the P.ioj displays a K Time-Out Error (Error 009).
The K microdiagnostics are designed to hang when an error is
detected, so all failures in the microdiagnostics are reported as
time-out errors. The current test may be aborted at any time by
typing CTRL/C.
After the first test has been specified and completed, the
following prompt is issued:
Reuse parameters (Y/N) [Y] ?
If you answer this prompt with a carriage return or a Y followed
by a carriage return, the last test specified is repeated, using
the same parameters. If you answer the prompt with an N,
followed by a carriage return, the test prompts for new
parameters.
6.5.4 Offline K Test Selector Progress Reports
Each time the K completes one full pass through the test
specified, an end-of-pass report is displayed. A full pass is
defined as either of the following:
1. The K completes the test with no errors detected.

2. The K fails its test, and the P.ioj times out.

The end-of-pass message is displayed as follows:
End of Pass nnnnnn, xxxxxx Errors, yyyyyy Total Errors
The pass count nnnnnn is a decimal count of the number of
complete passes made. The Errors count (xxxxxx) indicates the
number of errors detected during the current pass. The Total
Errors count (yyyyyy) indicates the number of errors detected
during all passes completed so far.
6.5.5 Offline K Test Selector Error Information
All error messages produced by this test conform to the HSC
generic diagnostic error message format (Section 6.1.5). Offline
K Selector Test error messages are preceded by an OKTS> prompt.
In this test optional lines three, four, and five show the
address of the failing location (MA), expected data (EXP), and
actual data (ACT).


6.5.5.1 K.ci Path Status Information - Whenever a K.ci is
enabled, it runs the CI Link test as part of its
microdiagnostics. The Link test performs loop-back tests on CI
paths A and B of the K.ci. To pass the Link test, one of the
paths must work (one failing path is not a fatal error). The
microdiagnostics then return information in the K Control Area
that specifies which paths worked and how many retries were
required. (The test retries 64 times before declaring a
failure.)
The Offline K Test Selector reports the CI path status each time
the K.ci is initialized. If the Link test is selected (K.ci Test
11), the path status is reported only after the Link test
completes. (When the K.ci is enabled, it runs all of its
microdiagnostics, including the Link test. If the Link test was
selected, the K.ci runs that test once more.)
The CI path status display indicates which path failed the Link
test, if any. If both paths fail, the microdiagnostics fail in
Test 11, and no path status information is displayed. The status
display also includes the number of retries required for paths
that passed the Link test.
6.5.5.2 Offline K Test Selector Error Messages - Errors detected
by this test fall into one of three classes:
1. Control memory errors occur when the P.ioj is testing
the portion of Control memory used to communicate with
the K. (The P.ioj does not test Data memory.) Error
numbers 000 through 007 are all Control memory errors
detected by the P.ioj. The difference between these
errors is the exact step in the memory test where they
are detected. The step where an error was detected can
be a helpful clue to the cause of the error.

2. Failures in a K microdiagnostic detected by a time-out.
Error 008 indicates the K failed to initialize properly.
Error 009 indicates the K failed the selected
microdiagnostic.

3. Unexpected traps detected by the P.ioj (NXM and Parity).
Errors 010 and 011 are unexpected trap errors detected
by the P.ioj. Error 010 signifies a parity trap
occurred, and error 011 indicates a Non-Existent Memory
trap. The reports for unexpected trap errors differ
slightly from a data error report, since they do not
display expected and actual data. Error 012 indicates
no working Control memory could be found for a K Control
Area. Error 013 is a cache parity trap.
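
As a quick reference, the error classes above can be written as a small lookup table. The following Python sketch is illustrative only: the names ERROR_CLASSES and describe_error are hypothetical, and the description strings are paraphrased from the text, not output of the diagnostic itself.

```python
# Illustrative lookup of OKTS error numbers (decimal, as printed in the
# error report) to the classes described in the text. All names here are
# assumptions made for this sketch.
ERROR_CLASSES = {
    **{n: "Control memory error in the K Control Area" for n in range(0, 8)},
    8: "K failed to initialize (time-out)",
    9: "K failed the selected microdiagnostic (time-out)",
    10: "Parity trap detected by the P.ioj",
    11: "Non-Existent Memory (NXM) trap detected by the P.ioj",
    12: "No working Control memory for a K Control Area",
    13: "Cache parity trap",
}


def describe_error(number: int) -> str:
    """Return the class text for an error number, or a fallback string."""
    return ERROR_CLASSES.get(number, "Unknown error number")
```

Errors 000 through 007 share one entry because they differ only in the step of the memory test at which the error was detected.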

The following list describes the nature of the failure indicated
by each error number:


Error 000 - occurs in the Moving Inversions test when
the P.ioj is testing the K Control Area. A memory
location did not contain the expected pattern, where:
MA = The address of the failing location.
EXP = The data pattern expected.
ACT = The data pattern actually found.

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem
(the location was incorrectly addressed and written when
some other location was written). At this step in the
test, a dual-addressing problem is characterized by:
1. The ACTual data contains a single additional one.

2. The additional one bit occurs immediately to the
left of the leftmost one in the EXPected data. For
example:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001

For the first example, the location in error was
probably written with the pattern 000777 when a
lower numbered address was being written with the
same pattern. When the location in error was
subsequently checked to ensure it still contained
the previous pattern (000377), it contained the next
pattern (000777).
Data errors at this step of the
test fall into one of the following classes:
a. The ACTual and EXPected data differ by more than
one bit:
EXP=017777, ACT=017477

b. The ACTual data contains fewer ones than the
expected data:
EXP=003777, ACT=001777

c. The bit in error is not in the bit position
immediately to the left of the leftmost one in
the expected data:
EXP=000777, ACT=002777

Error 001 - occurs in the Moving Inversions Test when
the P.ioj is testing the K Control Area. A location was
written with a pattern. Immediately after the write,
the location was read and found to contain an incorrect
pattern, where:
MA = The address of the failing location
EXP = The data pattern expected
ACT = The data pattern actually found
This error indicates a memory data problem. One of the
following hardware failures is indicated:

A bit was picked up or dropped when the location was
written.

A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly, but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations, but only occurs
in a particular nibble (4-bit field), one of the bus
data transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably faulty.
If the error occurs in more than one location, but the
addresses of the failing locations are similar, there
could be crosstalk between the memory data and
addressing lines. For instance, all failing addresses
end with either 2 or 6.
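
The rules of thumb above can be applied mechanically once several error reports have been collected. The following Python sketch is only an illustration: the function name suggest_cause and the (MA, EXP, ACT) tuple layout are assumptions, 16-bit words are assumed, and the crosstalk case (similar failing addresses) is deliberately not attempted.

```python
def suggest_cause(failures):
    """Suggest a likely cause from a list of (MA, EXP, ACT) tuples.

    Hypothetical helper: it only encodes the first three rules of thumb
    from the text and treats each word as 16 bits wide.
    """
    if not failures:
        raise ValueError("no failures to analyze")
    addresses = {ma for ma, _, _ in failures}
    nibbles = set()
    for _, exp, act in failures:
        diff = (exp ^ act) & 0xFFFF        # bits that differ
        for n in range(4):                 # four 4-bit fields per word
            if (diff >> (4 * n)) & 0xF:
                nibbles.add(n)
    if len(addresses) == 1:
        return "memory chip for the failing bit"
    if len(nibbles) == 1:
        return "bus data transceiver for that nibble"
    return "memory or bus timing"
```

The single-location check is made first because a failure confined to one address also falls, trivially, in a single nibble.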

Error 002 - occurs in the Moving Inversions test when
the P.ioj is testing the K Control Area. A memory
location did not contain the expected pattern, where:
MA = The address of the location in error.
EXP = The data pattern expected.
ACT = The data pattern actually found.

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.) At this step in
the test, a dual-addressing problem is characterized by:
1. The ACTual data contains one more zero than the
EXPected data.

2. The additional zero occurs in the same bit position
as the leftmost one in the expected data:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

In the first example, the location in error was probably
written with the pattern 001777 when a lower numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (003777), it
contained the next pattern (001777).
Data errors in this step of the Moving Inversions test
fall into one of the following categories:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777

2. The ACTual data contains more ones than the expected
data:
EXP=037777, ACT=077777

3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377
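
The dual-addressing signatures described for Errors 000 and 002 both reduce to one check: the ACTual data equals the next pattern in the sequence. The following Python sketch shows that check under stated assumptions: 16-bit words, ones added at the left of the existing ones, and zeros added at the position of the leftmost one, as in the EXP/ACT examples. The function and parameter names are hypothetical.

```python
WORD_MASK = 0o177777  # 16-bit word, all ones


def looks_like_dual_addressing(exp: int, act: int, adding_ones: bool) -> bool:
    """Return True if ACT is the pattern that would follow EXP.

    adding_ones=True models the steps where one bits are being added
    (as for Error 000); False models the steps where zeros are being
    added (as for Error 002). Anything else is treated as a data error.
    """
    if adding_ones:
        next_pattern = ((exp << 1) | 1) & WORD_MASK
    else:
        next_pattern = exp >> 1
    return act != exp and act == next_pattern
```

For instance, looks_like_dual_addressing(0o000377, 0o000777, True) is True, while the data-error example EXP=017777, ACT=017477 yields False.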

Error 003 - occurs in the Moving Inversions Test when
the P.ioj is testing the K Control Area. A location was
written with a pattern. Immediately after the write,
the location was read and found to contain an incorrect
pattern, where:
MA = The address of the failing location
EXP = The data pattern expected
ACT = The data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:
1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.

If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably faulty.
If the error occurs in more than one location, but the
addresses of the failing locations are similar, there
could be crosstalk between the memory data and
addressing lines. For instance, all failing addresses
end with either 2 or 6.

Error 004 - occurs in the Moving Inversions test when
the P.ioj is testing the K Control Area. A memory
location did not contain the expected pattern, where:
MA = The address of the failing location.
EXP = The data pattern expected.
ACT = The data pattern actually found.

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.) At this step in
the test, a dual-addressing problem is characterized by:
1. The ACTual data contains a single additional one.

2. The additional one bit occurs immediately to the
left of the leftmost one in the expected data:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001

In the first example, the location in error was probably
written with the pattern 000777 when a higher numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (000377), it
contained the next pattern (000777). Data errors at
this step of the test fall into one of the following
classes:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=017777, ACT=017477

2. The ACTual data contains fewer ones than the
expected data:
EXP=003777, ACT=001777

3. The bit in error is not in the bit position
immediately to the left of the leftmost one in the
EXPected data:
EXP=000777, ACT=002777

Error 005 - occurs in the Moving Inversions Test when
the P.ioj is testing the K Control Area. A location was
written with a pattern. Immediately after the write,
the location was read and found to contain an incorrect
pattern, where:
MA = The address of the failing location
EXP = The data pattern expected
ACT = The data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

A bit was picked up or dropped when the location was
written.

A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective. If the error occurs
in many locations but only occurs in a particular nibble
(4-bit field), one of the bus data transceivers for that
nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably faulty.
If the error occurs in more than one location, but the
addresses of the failing locations are similar, there
could be crosstalk between the memory data and
addressing lines. For instance, all failing addresses
end with either 2 or 6.

Error 006 - occurs in the Moving Inversions test when
the P.ioj is testing the K Control Area. A memory
location did not contain the expected pattern, where:
MA = The address of the location in error.
EXP = The data pattern expected.
ACT = The data pattern actually found.

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.) At this step in
the test, a dual-addressing problem is characterized by:
1. The ACTual data containing one more zero than the
expected data.

2. The additional zero occurring in the same bit
position as the leftmost one in the EXPected data.
For example:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

In the first example, the location in error was probably
written with the pattern 001777 when a higher numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (003777), it
contained the next pattern (001777). Data errors in
this step of the Moving Inversions Test fall into one of
the following categories:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777

2. The ACTual data contains more ones than the expected
data:
EXP=037777, ACT=077777

3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377

Error 007 - occurs in the Moving Inversions Test when
the P.ioj is testing the K Control Area. A location was
written with a pattern. Immediately after the write,
the location was read and found to contain an incorrect
pattern, where:
MA = The address of the failing location
EXP = The data pattern expected
ACT = The data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

A bit was picked up or dropped when the location was
written.

A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably faulty.
If the error occurs in more than one location, but the
addresses of the failing locations are similar, there
could be crosstalk between the memory data and
addressing lines. For example, all failing addresses
end with either 2 or 6.

Error 008 - indicates the selected K did not complete
its Init sequence properly. When the P.ioj enables the
K to perform a test, the K begins its Init sequence
(which includes executing certain microdiagnostics). At
the end of the K's Init sequence, the K indicates it
found the K Control Area by complementing a pointer word
in the Control memory. If the K fails to complement
this pointer word within 4.2 seconds of being enabled,
Error 008 is reported.
The contents of the K Status Register are displayed with
the error report.
If this error occurs, make sure the Requestor Number
parameter given matches the actual requestor number of
the K.

Error 009 - indicates the K failed the selected
microdiagnostic test. This usually indicates a serious
hardware problem in the K. The contents of the K Status
Register are displayed with the error report.

Error 010 - indicates the P.ioj detected a Parity trap.
The 22-bit address of the location that caused the trap
is displayed as the MA data in the error report, where:
MA = The address causing the parity trap.
VPC = The Virtual PC of the memory test at
the time the trap occurred. Reference
this address in the listing to locate
the area of the test where
the error occurred.
Because the data is lost when a parity trap occurs, no
EXPected or ACTual data is displayed. After the trap is
reported, the program attempts to restart the test from
the beginning.

Error 011 - indicates the P.ioj detected a Non-Existent
Memory trap. An NXM error is caused when no memory
responds to a particular address. The MA data in the
error report indicates the address which produced the
NXM trap, where:
MA = The address causing the NXM trap.
VPC = The Virtual PC of the memory test at
the time the trap occurred. Reference
this address in the listing to locate
the area of the test where
the error occurred.
After reporting the trap, the program attempts to
restart the test from the beginning.
If this error occurs at a memory address that should be
in your memory configuration, the memory in question is
not supplying an ACK to the P.ioj when the specified
address is presented on the Memory bus. The most
probable point of failure is the logic on the memory
module that compares addresses on the memory bus to the
range of addresses the module should respond to. Also,
the comparator itself could be faulty, or the [C IN, C
OUT], [D IN, D OUT], or [P IN, P OUT] lines on the
backplane could be in error.

Error 012 - indicates no working Control memory could be
found for a K Control Area. A K Control Area is
required to communicate with a K. The Control memory
must be repaired before the K Test Selector can be used
to test a K. Use the Offline Loader command TEST MEMORY
to test Control memory.

Error 013 - Cache Parity Trap, VPC = xxxxxx - can happen
during any test. During the run of the diagnostic, the
J11 took a trap through the parity error vector. This is
a cache error. The virtual PC at the time of the trap is
printed.

6.5.6 Offline K Test Selector Summaries
The following is a list of Offline K Selector test summaries.

Test 000 - Moving Inversions Test - is the Moving
Inversions (MOVI) memory test used by the P.ioj to test
a K Control Area. The K Control Area is used to pass
memory test parameters to the K and to return the
results of memory tests to the P.ioj. The Moving
Inversions RAM test is used to detect data and
addressing problems in dynamic semiconductor memories.
The following are the steps in the Moving Inversions
Algorithm:

1. Write 000000 in each location being tested.

2. Read all locations in order from lowest to highest.
After reading a location and checking for a zero,
rewrite the same location with a single one in the
least significant bit. Then reread the location and
verify the write worked correctly.

3. Again read all locations in order from lowest to
highest. Check that each location contains the data
previously written. Rewrite the data found with a
single additional one bit. Reread it to verify the
write operation worked properly.

4. Repeat step 3 until the test pattern consists of a
word containing all ones (pattern 177777).

5. Repeat step 3, but this time substitute a single
extra zero each time, instead of a one.

6. Continue step 5 until the test pattern consists of a
word of all zeros (pattern 000000).

7. Repeat steps 1 through 6, but this time start at the
highest memory address each time and work down to
the lowest. This will work each memory location
from all zeros to all ones, and back to all zeros.

8. End of test.

9. All memory is cleared to 000000.
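
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the diagnostic's actual code: it assumes 16-bit words, models memory as any list-like object, and assumes that zeros are added at the position of the leftmost one (as the EXP/ACT examples for Error 002 suggest). The names moving_inversions and _sweep, and the (address, expected, actual) error tuples, are hypothetical.

```python
ALL_ONES = 0o177777  # 16-bit word of all ones


def _sweep(mem, order, old, new):
    """Check each location holds `old`, write `new`, then reread to verify."""
    errors = []
    for addr in order:
        if mem[addr] != old:                 # read-and-check part of the step
            errors.append((addr, old, mem[addr]))
        mem[addr] = new                      # rewrite with the next pattern
        if mem[addr] != new:                 # immediate reread
            errors.append((addr, new, mem[addr]))
    return errors


def moving_inversions(mem):
    """Run the pattern sequence over `mem`; return (addr, exp, act) tuples."""
    errors = []
    size = len(mem)
    ascending = list(range(size))
    descending = ascending[::-1]
    for order in (ascending, descending):    # steps 1-6, then step 7
        for addr in order:                   # step 1: write zeros
            mem[addr] = 0
        pattern = 0
        while pattern != ALL_ONES:           # steps 2-4: add ones
            nxt = (pattern << 1) | 1
            errors += _sweep(mem, order, pattern, nxt)
            pattern = nxt
        while pattern != 0:                  # steps 5-6: add zeros
            nxt = pattern >> 1
            errors += _sweep(mem, order, pattern, nxt)
            pattern = nxt
    for addr in ascending:                   # step 9: clear memory
        mem[addr] = 0
    return errors
```

Running moving_inversions on an ordinary Python list returns an empty error list and leaves every element zero; a memory object that drops or aliases writes would produce tuples in the same MA/EXP/ACT form used by the error reports.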

Test 001 through Test 011 (K Microdiagnostics) - Refer
to the following three lists for the names of each
microdiagnostic. Included in each list is the type of K
being used and the failing test number.
1. K.ci Microdiagnostics - The following list shows the
test number and name of each of the K.ci
microdiagnostics:
Test 0 - Sequencer Test
Test 1 - ALU Test
Test 2 - Data Bus Test
Test 3 - Control Bus Test
Test 4 - PROM Parity Test
Test 5 - Memory Test (Not available via K Test
Selector)
Test 6 - RAM Test
Test 7 - PLI Interface Test
Test 10 - Packet Buffer Test
Test 11 - Link Test

2. K.sdi Microdiagnostics - The following list shows
the test number and name of each of the K.sdi
microdiagnostics:
Test 0 - Sequencer Test
Test 1 - ALU Test
Test 2 - Data Bus Test
Test 3 - Control Bus Test
Test 4 - PROM Parity Test
Test 5 - Memory Test (Not available via K Test
Selector)
Test 6 - RAM Test
Test 7 - SERDES/RSGEN Test
Test 10 - Partial SDI Interface Test

3. K.sti Microdiagnostics - The following list shows
the test number and name of each of the K.sti
microdiagnostics:
Test 0 - Sequencer Test
Test 1 - ALU Test
Test 2 - Data Bus Test
Test 3 - Control Bus Test
Test 4 - PROM Parity Test
Test 5 - Memory Test (Not available via K Test
Selector)
Test 6 - RAM Test
Test 7 - SERDES Test
Test 10 - Partial STI Interface Test

6.6 OFFLINE K/P MEMORY TEST
The Offline K/P Memory Test tests the HSC Control and Data
memories from a K.sdi, K.ci, or K.sti. This test executes from
the I/O control processor and uses the HSC K Control Area to
instruct one of the subsystem requestors to test either the
Control or Data memories. You select the K to be used as well as
the starting and ending addresses of the section of memory to be
tested. The test algorithm used by the K stresses the memories,
trying to detect transient errors caused by bus and memory timing
problems. Errors are reported at the console terminal as they
occur.
6.6.1 Offline K/P Memory Test System Requirements
Hardware required by this test includes:

I/O control processor module with HSC70 Boot ROMs

At least one memory module

RX33 controller with at least one working drive

Terminal connected to I/O control processor console
interface

At least one working K.sdi, K.sti, or K.ci

Working Control memory for a K Control Area


6.6.2 Offline K/P Memory Test Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If these preceding steps are
complete, you are at the ODL> prompt. Follow these next steps to
start the memory test.
1. Type TEST MEMORY BY K in response to the Loader prompt
(ODL>). The RX33 LED lights as the memory test is
loaded.

2. The memory test indicates it has been loaded properly by
displaying the following:
HSC70 OFL K/P Memory Test

3. The memory test then prompts for parameters.

6.6.3 Offline K/P Memory Test Parameter Entry
This section describes the various parameters for the Offline K/P
Memory Test.
NOTE
For any of the Offline K/P Memory Test prompts,
use the DELete key to delete mistyped parameters
before the terminating carriage return is typed.
If you note an error in a parameter already
terminated with a carriage return, type a CTRL/C
to return to the initial prompt and re-enter all
parameters.
The Offline K/P Memory Test first prompts:
requestor # of K (1 through 9) [] ?
Answer this question with a single digit (1 through 9) that
specifies the requestor number to be used. Terminate the
response by typing a carriage return. After the requestor number
is supplied, a K Control Area is located in Control memory and
tested. This area is required for communicating with the
requestor that performs tests of Data and Control memory. The
test then prompts:
Control (0) or Data (1) memory [0] ?
Type a zero to test Control memory or type a one to test Data
memory. Type a carriage return to terminate your response.
(Typing just a carriage return selects the Control memory test.)
The memory test next prompts for the first address to test:
First (min=XXXXXXXX) [min] ?

Enter the first address to be tested. Addresses are eight octal
digits in length. The [min] address displayed is the lowest
address that may be entered for the memory chosen. After typing
the address, terminate your response with a carriage return.
(Typing just a carriage return causes the first address to
default to the min address.)
NOTE
Because requestors test Control memory in 4-byte
units, the lowest two bits of the starting
address are ignored (treated as binary zeros).
For example, if address 16000223 is entered as
the first address, the requestor starts testing
at address 16000220.
Because requestors test Data memory in 64-byte
units, the lower six bits of the starting address
are ignored (treated as binary zeros). For
example, if address 14012376 is entered as the
first address, the K starts testing at address
14012300.

The test next prompts for the last address to test:
Last (max=XXXXXXXX) [] ?

Enter the last address to be tested. The max address displayed
is the highest address still within the memory chosen. If your
system does not have a fully populated memory, the last address
that may be tested is less than the max address displayed. If
you choose a last address that exceeds the amount of memory in
your system, the memory test displays a Non-Existent Memory (NXM)
error when the test reaches the first address beyond the end of
your memory. (Use the Offline Loader command SIZE to determine
the actual last address in a given HSC.)
NOTE
Because requestors test Control memory in 4-byte
units, the lower 2 bits of the ending address are
ignored (treated as binary ones). For instance,
if address 16023400 is specified as the last
address, the K will test up to and including
address 16023403.
Because requestors test Data memory in 64-byte
units, the lower 6 bits of the ending address are
ignored (treated as binary ones). If address
14005400 is specified as the last address, the
requestor will test up to and including address
14005477.
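
The rounding rules in the two NOTEs amount to clearing or setting the low-order address bits. The following Python sketch shows the arithmetic under the stated unit sizes (4 bytes for Control memory, 64 bytes for Data memory); the constant and function names are hypothetical and are not part of the test program.

```python
CONTROL_UNIT = 4    # bytes per Control memory test unit
DATA_UNIT = 64      # bytes per Data memory test unit


def effective_first(addr: int, unit: int) -> int:
    """Low-order bits of the first address are treated as zeros."""
    return addr & ~(unit - 1)


def effective_last(addr: int, unit: int) -> int:
    """Low-order bits of the last address are treated as ones."""
    return addr | (unit - 1)


# Examples taken from the NOTEs, written as octal literals.
print(f"{effective_first(0o14012376, DATA_UNIT):08o}")    # 14012300
print(f"{effective_last(0o16023400, CONTROL_UNIT):08o}")  # 16023403
print(f"{effective_last(0o14005400, DATA_UNIT):08o}")     # 14005477
```

Because both unit sizes are powers of two, unit - 1 is a mask of exactly the bits that are ignored (2 bits for Control memory, 6 bits for Data memory).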


Finally, the memory test prompts:
# of passes to perform (D) [1] ?

Enter a decimal number between 1 and 2,147,483,647 (omitting
commas) to specify the number of times the memory test should be
repeated. (If you enter a zero or a carriage return, the test
performs one pass.) The test can be aborted at any time by typing
CTRL/C.
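
The pass-count rules can be summarized in a short Python sketch. It only illustrates the rules stated above (decimal entry, no commas, zero or an empty reply meaning one pass); the function name parse_pass_count is hypothetical, and how the real test reacts to out-of-range input is not documented here, so raising ValueError is an assumption.

```python
MAX_PASSES = 2_147_483_647  # largest pass count accepted, per the text


def parse_pass_count(reply: str) -> int:
    """Interpret the reply to the '# of passes to perform' prompt."""
    text = reply.strip()
    if text == "":
        return 1                    # bare carriage return: one pass
    count = int(text)               # decimal only; "1,000" raises ValueError
    if count == 0:
        return 1                    # zero also means one pass
    if not 1 <= count <= MAX_PASSES:
        raise ValueError("pass count out of range")  # assumed behavior
    return count
```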
After the first memory test completes, the following prompt is
issued:
Reuse parameters (Y/N) [Y] ?
Answering this prompt with a carriage return or a Y followed by a
carriage return repeats the last test specified using the same
parameters. Answering the prompt with an N followed by a carriage
return causes the test to prompt for new parameters.
6.6.4 Offline K/P Memory Test Progress Reports
Each time the requestor completes one full pass through the
memory specified, an end-of-pass report is displayed. A full
pass is defined as:

A complete test of the memory specified with no errors
detected

Testing the memory specified until an error occurs

The end-of-pass message is displayed as follows:
End of Pass nnnnnn, xxxxxx Errors, yyyyyy Total Errors
The Pass count nnnnnn is a decimal total of the complete passes
made. The Errors count (xxxxxx) indicates the number of errors
detected on the current pass. The Total Errors count (yyyyyy)
indicates the number of errors detected during the passes
completed so far.
6.6.5 Offline K/P Memory Test Parity Errors
When a parity error occurs, it is desirable to know whether the
error was produced by the loss or gain of a data bit or by the
loss or gain of a parity bit. When a parity trap occurs in the
I/O control processor, the data that was read is discarded by the
PDP-11. However, a feature of the I/O control processor allows
parity traps to be disabled. Using this feature, a user can
determine if a parity error is being caused by a data or parity
bit as follows:

After a parity trap (P.ioj detected) is reported, type a
CTRL/C to terminate the memory test.

Type another CTRL/C to return to the OFL Diagnostic
Loader. The Loader prompts: ODL>.

Type Ex 17770042 followed by a carriage return.
The contents of the I/O control processor Switch Control
and Status Register (SWCSR) are displayed as follows:
"(I) 17770042 nnnnnn".

Type De * nnnn4n followed by a carriage return. The
nnnn4n represents the previous contents of the register,
including a one in bit 5. I/O control processor parity
traps are now disabled.

Return to the memory test by typing Start followed by a
carriage return.

Rerun the memory test with the original parameters.
If the location that previously produced a parity trap
then produces a data error, the original parity trap was
caused by a data bit problem. The error report
indicates the failing bit via the EXPected and ACTual
data displayed.
If the location that previously produced a parity trap
does not fail again when the memory test is rerun, the
original parity trap was caused by an error in one of
the parity bits (high or low byte) for that word.

Type a CTRL/C to return to the Loader, and re-enable
parity errors by typing De 17770042 nnnnOn followed by a
carriage return. The nnnnOn represents original
contents of the I/O control processor SWCSR, before
parity traps were disabled.
(Refer to step 5.)
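
The values typed in the De commands above differ from the displayed
SWCSR contents only in bit 5 (octal 000040). The following Python
sketch shows the arithmetic. The function names and the sample register
value are illustrative assumptions, not HSC software.

```python
PARITY_TRAP_DISABLE = 1 << 5  # bit 5 of the SWCSR, octal 000040


def disable_parity_traps(swcsr):
    """Value to deposit so that parity traps are disabled."""
    return swcsr | PARITY_TRAP_DISABLE


def enable_parity_traps(swcsr):
    """Value to deposit so that parity traps are enabled again."""
    return swcsr & ~PARITY_TRAP_DISABLE


def as_octal(value):
    """Format a 16-bit register value as six octal digits."""
    return format(value & 0o177777, "06o")
```

For example, if the Ex command displayed 000200 (a made-up value),
as_octal(disable_parity_traps(0o000200)) gives 000240, which is the
nnnn4n form described in the procedure.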

6.6.6 Offline K/P Memory Test Error Information
For generic diagnostic error message format, refer to Section
6.1.5. Listed below is a typical error message from Test Memory
by K:

OKPM>hh:mm T aaa E bbb U-000
<Text describing error>
MA -xxxxxxxx
EXP-yyyyyy
ACT-zzzzzz
<K-Error-Summary-Info>

where:

hh       = Elapsed hours since last bootstrap
mm       = Elapsed minutes
aaa      = Decimal number denoting test
bbb      = Decimal number denoting the error detected
xxxxxxxx = Address of location causing the error
yyyyyy   = Data that was expected
zzzzzz   = Data that was actually found
<K-Error-Summary-Info> = See next section.
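
A captured error report in this layout can be split into its fields
with a short script. The sketch below is illustrative Python, not part
of the diagnostic. It assumes the MA, EXP, and ACT values are octal (as
in the examples elsewhere in this chapter), and the function name and
sample text are hypothetical.

```python
import re

HEADER = re.compile(r"^(\w+)>(\d+):(\d+)\s+T\s*(\d+)\s+E\s*(\d+)\s+U-(\d+)")
FIELD = re.compile(r"^(MA|EXP|ACT)\s*-\s*([0-7]+)")


def parse_error_report(text):
    """Return a dict of the fields found in one error report."""
    report = {}
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            program, hours, minutes, test, error, unit = header.groups()
            report.update(program=program, hours=int(hours),
                          minutes=int(minutes), test=int(test),
                          error=int(error), unit=unit)
            continue
        field = FIELD.match(line)
        if field:
            # Assumption: addresses and data are printed in octal.
            report[field.group(1)] = int(field.group(2), 8)
    return report
```

The test and error numbers are converted as decimal, matching the
field definitions above.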

6.6.6.1 Offline K/P Memory Test Error Summary Information - When
the requestor reports a memory test failure to the I/O control
processor, the following information is supplied:

o  Address of the failing memory location

o  Data expected and data actually found

o  Error summary information

The error summary information is supplied as a 3-bit field,
including the following:

1. A bit indicating a parity error occurred while reading
the location

2. A bit indicating an NXM error occurred while accessing
the location

3. A bit indicating a Control Bus (CBUS) error occurred
while accessing the location

When a memory error report is issued for an error detected by the
K, the last line of the error report includes a list of the error
summary bits that were set (if any).
A Control Bus (CBUS) Error indicates the requestor asserted an
illegal combination of the three CCYCLE lines when accessing
Control memory. As these lines were previously tested from the
I/O control processor (in the OFL P.ioj Test), a Control Bus
error is most likely caused by a problem with the requestor's
drivers that assert the CCYCLE lines.
6.6.6.2 Offline K/P Memory Test Error Messages - Error messages
produced by this test can be caused by a memory error detected
either by the I/O control processor or by the requestor being
used to test memory. Errors detected by the I/O control
processor occur when the I/O control processor is testing the
portion of Control memory used to communicate with the K. (The
I/O control processor does not test Data memory.)

To determine whether the I/O control processor or the requestor
detected an error, examine the second line of the error message.
The text either begins with a (P) or a (K). If the text begins
with a (P), the I/O control processor detected the error. If the
text begins with a (K), the requestor detected the error.
Error numbers 000 through 007 are all Control memory errors
detected by the I/O control processor. The difference between
these errors is the exact step in the memory test where they are
detected. The step where an error was detected can be a helpful
clue to the cause of the error.
Error 008 indicates the requestor failed to initialize properly.
Error 009 indicates a Control or Data memory error detected by
the K. In addition to the normal error information, the last
line of the error report contains a K Error Summary (see previous
section).
Errors 010 and 011 are unexpected trap errors detected by the I/O
control processor. Error 010 signifies a parity trap occurred;
error 011 indicates a Non-Existent memory trap. The reports for
unexpected trap errors differ slightly from a data error report
because they do not display expected and actual data.
Error 012 indicates no working Control memory could be found for
a K Control Area. Error 013 indicates a parity trap caused by
cache.
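
The error-number summary above, together with the (P) and (K)
prefixes, can be written as a small lookup. This is an illustrative
Python sketch only. The function names are hypothetical and the
description strings paraphrase the text above.

```python
KP_ERRORS = {
    8: "requestor failed to initialize properly",
    9: "Control or Data memory error detected by the K",
    10: "parity trap detected by the I/O control processor",
    11: "Non-Existent Memory trap detected by the I/O control processor",
    12: "no working Control memory found for a K Control Area",
    13: "parity trap caused by cache",
}


def describe_kp_error(number):
    """Summarize an Offline K/P Memory Test error number."""
    if 0 <= number <= 7:
        return "Control memory error detected by the I/O control processor"
    return KP_ERRORS.get(number, "unknown error number")


def detected_by(second_line):
    """Report who detected the error, from the second report line."""
    text = second_line.lstrip()
    if text.startswith("(P)"):
        return "I/O control processor"
    if text.startswith("(K)"):
        return "requestor"
    return "unknown"
```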
The following list describes the nature of the failure indicated
by each error number:
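o  Error 000 - occurs in the Moving Inversions test (see
Section 6.6.7) when the I/O control processor is testing
the K Control Area. A memory location did not contain
the expected pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:

1. The ACTual data contains a single additional one.

2. The additional one bit occurs immediately to the
left of the leftmost one in the EXPected data, such
as:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001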
In the first example, the location in error was probably
written with the pattern 000777 when a lower numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (000377), it
contained the next pattern (000777).
Data errors at this step of the test fall into one of
the following classes:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=017777, ACT=017477

2. The ACTual data contains fewer ones than the
EXPected data:
EXP=003777, ACT=001777

3. The bit in error is not in the bit position
immediately to the left of the leftmost one in the
EXPected data:
EXP=000777, ACT=002777

o  Error 001 - occurs in the Moving Inversions Test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A location was written with
a pattern. Immediately after the write, the location
was read. It contained an incorrect pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk between the memory data and addressing lines
may be present. (For example, all failing addresses end
with either 2 or 6.)
o  Error 002 - occurs in the Moving Inversions test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A memory location did not
contain the expected pattern, where:

MA  = Address of the location in error
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:

1. The ACTual data contains one more zero than the
EXPected data.

2. The additional zero occurs in the same bit position
as the leftmost one in the EXPected data, such as:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

In the first example, the location in error was probably
written with the pattern 001777 when a lower numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (003777), it
contained the next pattern (001777).

Data errors in this step of the Moving Inversions Test
fall into one of the following categories:

1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777

2. The ACTual data contains more ones than the EXPected
data:
EXP=037777, ACT=077777

3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377

o  Error 003 - occurs in the Moving Inversions Test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A location was written with
a pattern. Immediately after the write, the location
was read. It contained an incorrect pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk between the memory data and addressing lines
could be present. (For example, all failing addresses
end with either 2 or 6.)
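o  Error 004 - occurs in the Moving Inversions test (see
Section 6.6.7) when the I/O control processor is testing
the K Control Area. A memory location did not contain
the expected pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:

1. The ACTual data contains a single additional one.

2. The additional one bit occurs immediately to the
left of the leftmost one in the EXPected data, such
as:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001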

In the first example, the location in error was probably
written with the pattern 000777 when a higher numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (000377), it
contained the next pattern (000777).
Data errors at this step of the test fall into one of
the following classes:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=017777, ACT=017477

2. The ACTual data contains fewer ones than the
EXPected data:
EXP=003777, ACT=001777

3. The bit in error is not in the bit position
immediately to the left of the leftmost one in the
EXPected data:
EXP=000777, ACT=002777
o  Error 005 - occurs in the Moving Inversions Test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A location was written with
a pattern. Immediately after the write, the location
was read. It contained an incorrect pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk between the memory data and addressing lines
could be present. (For example, all failing addresses
end with either 2 or 6.)
o  Error 006 - occurs in the Moving Inversions test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A memory location did not
contain the expected pattern, where:

MA  = Address of the location in error
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:

1. The ACTual data contains one more zero than the
EXPected data.

2. The additional zero occurs in the same bit position
as the leftmost one in the EXPected data, such as:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

In the first example, the location in error was probably
written with the pattern 001777 when a higher numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (003777), it
contained the next pattern (001777).
Data errors in this step of the Moving Inversions test
fall into one of the following categories:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777

2. The ACTual data contains more ones than the EXPected
data:
EXP=037777, ACT=077777

3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377

o  Error 007 - occurs in the Moving Inversions Test
(Section 6.6.7) when the I/O control processor is
testing the K Control Area. A location was written with
a pattern. Immediately after the write, the location
was read. It contained an incorrect pattern, where:

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk between the memory data and addressing lines
may be present. For example, all failing addresses end
with either 2 or 6.
o  Error 008 - indicates the selected requestor did not
complete its Init sequence properly. When the I/O
control processor enables the requestor to perform the
memory test, the requestor begins its Init sequence
(which includes executing certain microdiagnostics). At
the end of the requestor's Init sequence, the requestor
indicates it found the K Control Area by complementing a
pointer word in Control memory. If the requestor fails
to complement this pointer word within 50 milliseconds
(4.2 seconds for K.ci) of being enabled, error 008 is
reported.
The contents of the K Status Register are displayed with
the error report. If this error occurs, make sure the
Requestor Number parameter given matches the actual
requestor number.

o  Error 009 - indicates a Control or Data memory error
detected by the K, where:

MA  = 22-bit address of the failing location
EXP = Data pattern expected by the K
ACT = Data pattern found by the K

In addition to the address and the expected/actual data,
the K returns an error summary, displayed as the last
line of the error report. The error summary information
indicates whether the error was caused by a parity
error, a Non-Existent Memory (NXM) error, or a Control
Bus (CBUS) error. If the error was not caused by any of
these, the error summary line does not appear in the
error report. Refer to Section 6.6.6.1 for further
information on the error summary.
o  Error 010 - indicates the I/O control processor detected
a parity trap. The 22-bit address of the location that
caused the trap is displayed as the MA data in the error
report, where:

MA  = The address causing the parity trap.
VPC = The Virtual PC of the memory test at the time the
      trap occurred. Reference this address in the
      listing to locate the area of the test where the
      error occurred.

Because the data is lost when a parity trap occurs, no
EXPected or ACTual data can be displayed. To further
localize the problem, disable parity errors and rerun
the test (Section 6.6.5). If the original failure was
in a data-bit position, the memory test detects and
reports the error, displaying the EXPected and ACTual
data. This helps to trace the error to a particular
address and/or bit position. If no further errors are
detected after disabling parity errors, the original
failure was in one of the parity bits for the address
displayed in the parity trap report.
o  Error 011 - indicates the I/O control processor detected
a Non-Existent Memory trap. An NXM error is caused when
no memory responds to a particular address. The MA data
in the error report indicates the address which produced
the NXM trap. After the trap is reported, the program
attempts to restart the test from the beginning, where:

MA  = The address causing the NXM trap.
VPC = The virtual PC of the memory test at the time the
      trap occurred. Reference this address in the
      listing to locate the area of the test where
      the error occurred.

If this error occurs at a memory address that should be
in your memory configuration, the memory in question is
not supplying an ACK message to the I/O control
processor when the specified address is presented on the
Memory bus. The most probable point of failure is the
compare logic on the memory module. This logic compares
addresses on the Memory bus with the range of addresses
to which the module should respond. The comparator
itself could be faulty or the [C IN, C OUT], [D IN, D
OUT], or [P IN, P OUT] lines on the backplane could be
in error.
o  Error 012 - indicates no working Control memory could be
found for a K Control Area. A K Control Area is
required to communicate with a requestor. Control
memory must be repaired before the K/P Memory Test can
be used. Use the Offline Loader command TEST MEMORY to
test the Control memory.

o  Error 013 - Cache Parity Trap, VPC = xxxxxx - indicates
the J11 took a trap through the parity error vector
during the run of the diagnostic. This is a cache
error. The virtual PC at the time of the trap is
printed.
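
The EXPected/ACTual signatures listed under errors 000, 002, 004, and
006 can be checked mechanically. The following Python sketch is one
way to do so. It is an illustration under stated assumptions, not the
diagnostic's own logic: the function name is hypothetical, data words
are taken to be 16 bits, and a match means only that the signature is
consistent with a dual-addressing problem.

```python
WORD_MASK = 0o177777  # 16-bit data word


def classify(error_number, exp, act):
    """Compare EXP and ACT against the dual-addressing signature."""
    diff = (exp ^ act) & WORD_MASK
    if error_number in (0, 4):
        # Ones are being added: the suspect bit is immediately to the
        # left of the leftmost one in EXP.
        suspect = 1 << exp.bit_length()
    elif error_number in (2, 6):
        # Zeros are being added: the suspect bit is the leftmost one
        # in EXP itself.
        suspect = (1 << (exp.bit_length() - 1)) if exp else 0
    else:
        raise ValueError("signature applies to errors 000, 002, 004, 006")
    if diff != 0 and diff == suspect:
        return "possible dual addressing"
    return "data error"
```

With the examples from the text, classify(0, 0o000377, 0o000777)
reports a possible dual-addressing problem, while
classify(0, 0o017777, 0o017477) reports a data error because two bits
differ.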

6.6.7 Offline K/P Memory Test Summaries
The following is a summary of individual K/P memory tests:
o  Test 000 - Moving Inversions Test from P.ioj - is the
Moving Inversions (MOVI) memory test used by the I/O
control processor to test a requestor Control Area. The
K Control Area is used to pass memory test parameters to
the requestor and to return the results of memory tests
to the I/O control processor. The Moving Inversions RAM
test is used to detect data and addressing problems in
dynamic semiconductor memories.
The following are the steps in the Moving Inversions
Algorithm:

1. Write 000000 in each location being tested.

3. Again read all locations in order from lowest to
highest. Check each location for the data
previously written. Rewrite the data found with a
single additional one bit. Reread it to verify the
write operation worked properly.

4. Repeat step 3 until the test pattern consists of a
word containing all ones (pattern 177777).

5. Repeat step 3 but this time substitute a single
extra zero each time, instead of a one.

6. Continue step 5 until the test pattern consists of a
word of all zeros (pattern 000000).

7. Repeat steps 1 through 6 but this time start at the
highest memory address each time and work down to
the lowest. This changes each memory location from
all zeros to all ones and back to all zeros.

8. End of test.
All memory is cleared to 000000.

o  Test 001 - Moving Inversions Test from K - is the Moving
Inversions test implemented in the K microcode. The
algorithm is identical to that described in the previous
test, except steps 5 and 6 are omitted to save time.
When the requestor detects an error, the remainder of
the test is aborted, and the information concerning the
error is returned to the I/O control processor via the K
Control Area. The I/O control processor is responsible
for displaying the error report.
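
The Moving Inversions steps listed under Test 000 can be modeled in a
few lines of Python. This is a sketch for study only, not the P.ioj or
K implementation. It assumes 16-bit words and the pattern order
implied by the error examples in Section 6.6.6.2 (ones fill in from
the right, zeros fill in from the left). The DualAddressedMemory class
is a hypothetical fault model added for illustration.

```python
WORD_MASK = 0o177777  # 16-bit memory word


def pattern_steps():
    """Yield (previous, following) patterns: ones fill in from the
    right, then zeros fill in from the left."""
    pattern = 0
    for _ in range(16):
        following = ((pattern << 1) | 1) & WORD_MASK
        yield pattern, following
        pattern = following
    for _ in range(16):
        following = pattern >> 1
        yield pattern, following
        pattern = following


def moving_inversions(memory):
    """Test a list-like memory; return (address, expected, actual)
    tuples for every mismatch found."""
    errors = []
    size = len(memory)
    # First pass works upward, second pass works downward.
    for order in (range(size), range(size - 1, -1, -1)):
        for address in order:
            memory[address] = 0
        for previous, following in pattern_steps():
            for address in order:
                actual = memory[address]
                if actual != previous:       # check the old pattern
                    errors.append((address, previous, actual))
                memory[address] = following  # write the next pattern
                actual = memory[address]
                if actual != following:      # reread to verify the write
                    errors.append((address, following, actual))
    for address in range(size):
        memory[address] = 0                  # leave memory cleared
    return errors


class DualAddressedMemory(list):
    """Hypothetical fault: a write to address 2 also lands on address 5."""

    def __setitem__(self, address, value):
        super().__setitem__(address, value)
        if address == 2:
            super().__setitem__(5, value)
```

Running moving_inversions(DualAddressedMemory([0] * 8)) reports address
5 first, with EXP=000000 and ACT=000001, which is the dual-addressing
signature described for the step where ones are added while working
upward.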

6.7 OFFLINE MEMORY TEST
The Offline Memory test exercises the HSC memories. You may
select Control, Data, or Program memory for testing. Three
memory testing algorithms are used: the Quick Verify algorithm,
the Moving Inversions algorithm, and the Walking Ones algorithm.
The Quick Verify algorithm quickly uncovers stuck data and
address bits. The other two algorithms stress the memories,
attempting to detect transient errors caused by bus and memory
timing problems.
Errors are reported at the console terminal as they occur. After
reporting a data error, or a parity error from a location being
tested, testing continues where it left off. If an NXM error
occurs during the memory test, testing is restarted from the
beginning.
6.7.1 Offline Memory Test System Requirements
Hardware required for the Offline Memory Test follows:
o  I/O control processor module with HSC Boot ROMs

o  At least one memory module

o  RX33 controller with at least one working drive

o  Terminal connected to I/O control processor console
interface

6.7.2 Offline Memory Test Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If the HSC70 is booted and
loaded, the terminal displays an ODL> prompt. At this point,
follow these steps to start the memory test:
1. Type SIZE in response to the Loader prompt (ODL>). The
RX33 drive-in-use LED lights as the Offline System Sizer
is loaded.
The Sizer displays the bounds of the various memories in
the HSC. The memory size information includes the last
address of each memory.

2. Type TEST MEMORY in response to the Loader prompt
(ODL>). The RX33 drive-in-use LED lights as the memory
test is loaded.
The memory test indicates it has been loaded properly by
displaying the following:
HSC OFL Memory Test
The memory test next prompts for parameters. Refer to
the following section.

6.7.3 Offline Memory Test Parameter Entry
This section describes the Offline Memory Test parameter entry.
NOTE
For any of the Offline Memory Test prompts, use
the DELete key to delete mistyped parameters
before the terminating carriage return is typed.
If you note an error in a parameter that has already been
terminated with a carriage return, type a CTRL/C
to return to the initial prompt and re-enter all
parameters.
The Offline Memory Test first prompts:
Control(0), Data(1), or Program(2) Memory [0] ?
Type a 0 to test Control memory, type a 1 to test Data memory, or
type a 2 to test Program memory. Type a carriage return to
terminate your response. (Typing just a carriage return causes
the Control memory test to be selected.) The memory test next
prompts for the first address to test:
First (in=XXXXXXXX) [in] ?

Enter the first address to be tested. Addresses are eight digits
in length. The [in] address displayed is the lowest address that
may be entered for the memory chosen. Terminate your response
with a carriage return. (Typing just a carriage return causes
the first address to default to the in value.) The test next
prompts for the last address to test:
Last (max=XXXXXXXX) [] ?
Type the last address to be tested. The max address displayed is
the highest address still within the memory chosen. If your
system does not have a fully-populated memory, the last address
to be tested is less than the max address displayed. Use the
memory size information displayed by the ODL SIZE command to
answer this prompt with the correct address for the HSC under
test. If you choose a last address that exceeds the amount of
memory in your system, the memory test displays a Non-Existent
Memory (NXM) error when the test reaches the first address beyond
the end of your memory. The test then prompts:
# of passes to perform (D) [1] ?

Enter a decimal number between 1 and 2,147,483,647 (omitting
commas) to specify the number of times the memory test should be
repeated. (Entering a 0, or just a carriage return, results in
one pass.) The test can be aborted at any time by typing a CTRL/C
or CTRL/Y.
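
The rules for the pass-count reply can be summarized in a short Python
sketch. This is an illustration only, and the function name is
hypothetical. The upper limit, 2,147,483,647, equals 2**31 - 1.

```python
MAX_PASSES = 2_147_483_647  # 2**31 - 1


def parse_pass_count(reply):
    """Return the number of passes for a reply to the passes prompt."""
    text = reply.strip()
    if text == "":
        return 1                # carriage return alone: one pass
    count = int(text)           # raises ValueError on non-decimal input
    if count == 0:
        return 1                # 0 also results in one pass
    if not 1 <= count <= MAX_PASSES:
        raise ValueError("pass count out of range: %d" % count)
    return count
```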
After the first memory test is complete, the following prompt is
issued:
Reuse parameters (Y/N) [Y] ?
Answering this prompt with a Y followed by a carriage return or
with a carriage return alone, repeats the last test specified,
using the same parameters. If you answer the prompt with an N,
followed by a carriage return, the test prompts for new
parameters.
6.7.4 Offline Memory Test Progress Reports
A complete pass through the memory test consists of one pass
through the Quick Verify test, one pass through the Moving
Inversions test, and one pass through the Walking Ones test.
After each complete pass, an end-of-pass message is displayed as
follows:
End of Pass nnnnnn, xxxxxx Errors, yyyyyy Total Errors
The Pass count (nnnnnn) is a decimal count of the number of
complete passes made. The Errors count (xxxxxx) indicates the
number of errors detected on the current pass. The Total Errors
count (yyyyyy) indicates the number of errors detected on all
passes of the test completed so far.

NOTE
A complete pass through the memory test for
Program memory may take about eight hours.
Unless exhaustive memory testing is required,
allow this test to run only until the Quick
Verify Pass Complete message is displayed. This
should take no more than 10 minutes.

6.7.5 Offline Memory Test Parity Errors
The process to disable P.ioj parity traps is identical for the
Offline Memory Test and the Offline K/P Memory Test. This
process is described in Section 6.6.5.
6.7.6 Offline Memory Test Error Information
Refer to Section 6.1.5 for the generic diagnostic error message
format. The following is a typical Offline Memory Test error
message:

OMEM>hh:mm T aaa E bbb U-000
<Text describing error>
MA -xxxxxxxx
EXP-yyyyyy
ACT-zzzzzz

where:

hh       = Elapsed hours since last bootstrap
mm       = Elapsed minutes
aaa      = Decimal number denoting test
bbb      = Decimal number denoting the error detected
xxxxxxxx = Address of location causing the error
yyyyyy   = Data that was expected
zzzzzz   = Data that was actually found

Parity trap and NXM trap errors do not include EXPected and
ACTual data.
6.7.6.1 Offline Memory Test Error Messages - Error messages
produced by the memory test can be classed as either data errors
or unexpected traps. Error numbers 000 through 010 are all
memory data errors. The only difference between these errors is
the exact step in the testing algorithm where they are detected.
The step at which a data error occurs can be an important clue to
the cause of the error. Errors 000 through 007 are declared in
the Moving Inversions algorithm, while errors 008 through 010 are
declared in the Walking Ones algorithm.

Errors 011 and 012 are unexpected trap errors. Error 011
signifies a parity trap occurred and error 012 indicates a
Non-Existent Memory trap. The reports for unexpected trap errors
differ slightly from a data error report because they do not
display expected and actual data.
The following list describes the nature of the failure indicated
by each error number:
o  Error 000 - occurs in the Moving Inversions test (see
Section 6.6.7). A memory location did not contain the
expected pattern.

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
In the second case, the location was incorrectly
addressed and written when some other location was
written. At this step in the test, a dual-addressing
problem is characterized by:

1. The ACTual data contains a single additional one.

2. The additional one bit occurs immediately to the
left of the leftmost one in the EXPected data, such
as:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001

In the first example, the location in error was probably
written with the pattern 000777 when a lower numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (000377), it
contained the next pattern (000777).
Data errors at this step of the test fall into one of
the following classes:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=017777, ACT=017477

2. The ACTual data contains fewer ones than the
EXPected data:
EXP=003777, ACT=001777

3. The bit in error is not in the bit position
immediately to the left of the leftmost one in the
EXPected data:
EXP=000777, ACT=002777

o  Error 001 - occurs in the Moving Inversions Test
(Section 6.6.7) when the I/O control processor was
testing the K Control Area. A location was written with
a pattern. Immediately after the write, the location
was read. It contained an incorrect pattern.

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk could exist between the memory data and
addressing lines. (For example, all failing addresses
end with either 2 or 6.)
o  Error 002 - occurs in the Moving Inversions Test
(Section 6.6.7). A memory location did not contain the
expected pattern, where:

MA  = Address of the location in error
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:

1. The ACTual data containing one more zero than the
EXPected data.

2. The additional zero occurring in the same bit
position as the leftmost one in the EXPected data,
for example:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

Data errors in this step of the Moving Inversions test
fall into one of the following categories:

1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777

2. The ACTual data contains more ones than the EXPected
data:
EXP=037777, ACT=077777

3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377

o  Error 003 - occurs in the Moving Inversions Test
(Section 6.6.7). A location was written with a pattern.
Immediately after the write, the location was read and
found to contain an incorrect pattern.

MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem and one of
the following hardware failures is indicated:

1. A bit was picked up or dropped when the location was
written.

2. A bit was picked up or dropped when the location was
read.

If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk could be present between the memory data and
addressing lines. For example, all failing addresses
end with either 2 or 6.
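
The four rules of thumb above can be sketched as a small triage routine. This is only an illustrative Python sketch, not part of the diagnostic. The name `triage`, the 16-bit word width, the order in which the rules are tried, and the "two or fewer distinct final octal digits" check for similar addresses are all assumptions made for the example.

```python
CHIP = "memory chip for the failing bit"
TRANSCEIVER = "bus data transceiver for that nibble"
CROSSTALK = "crosstalk between data and addressing lines"
TIMING = "memory or bus timing"


def triage(failures):
    """Suggest a probable cause from a list of (MA, EXP, ACT) tuples.

    Each tuple holds the failing address, the expected word, and the
    word actually found, as integers.
    """
    addresses = {ma for ma, _, _ in failures}

    # Collect every bit position that was ever wrong.
    bad_bits = 0
    for _, exp, act in failures:
        bad_bits |= exp ^ act

    # Rule 1: repeated errors in a single location.
    if len(addresses) == 1:
        return CHIP

    # Rule 2: many locations, but all bad bits fall in one 4-bit field.
    nibbles = {bit // 4 for bit in range(16) if (bad_bits >> bit) & 1}
    if len(nibbles) == 1:
        return TRANSCEIVER

    # Rule 3: failing addresses look alike (assumed: same final octal digits).
    final_digits = {ma & 0o7 for ma in addresses}
    if len(final_digits) <= 2:
        return CROSSTALK

    # Rule 4: bad bits scattered through the word in many locations.
    return TIMING
```

For example, a report set in which every failing address ends in 2 or 6 and the bad bits are spread across the word falls through to the crosstalk rule.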
o

Error 004 - occurs in the Moving Inversions Test
(Section 6.7.7). A memory location did not contain the
expected pattern, where:
MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
In the latter case, the location was incorrectly
addressed and written when some other location was
written.
At this step in the test, a dual-addressing problem is
characterized by:
1. The ACTual data containing a single additional one.
2. The additional one bit occurring immediately to the
left of the leftmost one in the EXPected data, for
instance:
EXP=000377, ACT=000777
EXP=077777, ACT=177777
EXP=000000, ACT=000001

In the first example, the location in error was probably
written with the pattern 000777 when a higher numbered
address was being written with the same pattern. When
the location in error was subsequently checked to ensure
it still contained the previous pattern (000377), it
contained the next pattern (000777).
Data errors at this step of the test fall into one of
the following classes:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=017777, ACT=017477
2. The ACTual data contains fewer ones than the
EXPected data:
EXP=003777, ACT=001777
3. The bit in error is not in the bit position
immediately to the left of the leftmost one in the
EXPected data:
EXP=000777, ACT=002777
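
The dual-addressing signatures given for the Moving Inversions errors (one extra zero for Errors 002 and 006, one extra one for Error 004) can be checked mechanically. The following Python sketch is illustrative only. The function name `classify`, the `phase` argument, and the 16-bit word width are assumptions, and the check relies on the test patterns being a contiguous run of ones starting at the least-significant bit, as in the examples.

```python
WORD_MASK = 0o177777  # assumption: 16-bit memory words


def classify(exp, act, phase):
    """Classify one EXP/ACT pair as 'dual-addressing' or 'data'.

    phase is 'ones' while the test is adding ones to the pattern
    (Error 004) and 'zeros' while it is adding zeros (Errors 002, 006).
    """
    if phase == "ones":
        # The next pattern has one more one, just left of the leftmost one.
        alias = ((exp << 1) | 1) & WORD_MASK
    elif phase == "zeros":
        # The next pattern has a zero where the leftmost one used to be.
        alias = exp >> 1
    else:
        raise ValueError("phase must be 'ones' or 'zeros'")

    # A location that already holds the *next* pattern was probably
    # written when some other address was being written.
    if act == alias and act != exp:
        return "dual-addressing"
    return "data"
```

Any pair that does not match the next pattern exactly (more than one bit wrong, a bit wrong in the other direction, or a bit wrong in some other position) is reported as a data error, which mirrors the three data-error classes listed in the text.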

Error 005 - occurs in the Moving Inversions Test
(Section 6.7.7). A location was written with a pattern.
Immediately after the write, the location was read and
it contained an incorrect pattern, where:
MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:
1. A bit was picked up or dropped when the location was
written.
2. A bit was picked up or dropped when the location was
read.


If the error occurs repeatedly but only in a single
location, the memory chip containing the failing bit for
that address is probably defective.
If the error occurs in many locations but only occurs in
a particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably defective.
If the error occurs in many locations, and the bits in
error are randomly spaced throughout the word, the
memory or bus timing is probably the problem.
If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk between the memory data and addressing lines
could be present.
(For example, all failing addresses
end with either 2 or 6.)
o

Error 006 - occurs in the Moving Inversions Test
(Section 6.7.7). A memory location did not contain the
expected pattern, where:
MA  = Address of the location in error
EXP = Data pattern expected
ACT = Data pattern actually found

This error can be caused by a data error in the address
specified, or it may indicate a dual-addressing problem.
(The location was incorrectly addressed and written when
some other location was being written.)
At this step in the test, a dual-addressing problem is
characterized by:
1. The ACTual data containing one more zero than the
EXPected data.
2. The additional zero occurring in the same bit
position as the leftmost one in the EXPected data,
for example:
EXP=003777, ACT=001777
EXP=000017, ACT=000007
EXP=177777, ACT=077777

Data errors in this step of the Moving Inversions test
fall into one of the following categories:
1. The ACTual and EXPected data differ by more than one
bit:
EXP=177777, ACT=174777
2. The ACTual data contains more ones than the EXPected
data:
EXP=037777, ACT=077777
3. The bit in error is not in the same bit position as
the leftmost one in the EXPected data:
EXP=001777, ACT=001377

Error 007 - occurs in the Moving Inversions Test
(Section 6.7.7). A location was written with a pattern.
Immediately after the write, the location was read and
found to contain an incorrect pattern, where:
MA  = Address of the failing location
EXP = Data pattern expected
ACT = Data pattern actually found

This error indicates a memory data problem. One of the
following hardware failures is indicated:
1. A bit was picked up or dropped when the location was
written.
2. A bit was picked up or dropped when the location was
read.


If the error occurs in more than one location, but the
addresses of the failing locations are similar,
crosstalk may be present between the memory data and
addressing lines. For example, all failing addresses
end with either 2 or 6.
o

Error 008 - occurs in the Walking Ones test (Section
6.7.7). All locations in the memory under test were
written with the pattern 000000. Then all locations
were read to check that they all contained 000000. When
the location specified in the error report was read, it
did not contain 000000, where:
MA  = Address of the failing location
EXP = Data pattern expected (000000)
ACT = Data pattern actually found

Because all locations were cleared to 00000000 before
this error was detected, a dual-addressing problem is
unlikely. More likely, a bit was picked up when the
word was written or read.
If the error occurs repeatedly but only in one location,
the memory chip containing the bit in error for that
address is probably marginal.
If the error occurs in many locations but always occurs
in a particular nibble (4-bit field), one of the bus
data transceivers for that nibble is probably marginal.
If errors occur in many locations, and the bits in error
are randomly spaced throughout the words, the memory or
bus timing is probably marginal.
o

Error 009 - occurs in the Walking Ones Test (Section
6.7.7). One location in the memory under test was
written with the pattern 17777777 and all the other
locations should contain the pattern 00000000. While
reading to check that all other locations are clear, a
location was found containing something other than
00000000, where:
MA  = Address of the failing location
EXP = Data pattern expected (000000)
ACT = Data pattern actually found

This error is either a data error or a dual-addressing
error.
(The location was incorrectly addressed and
written when some other location was being written.)
At this step of the test a dual-addressing failure is
possible if the ACTual data is 17777777. During this
part of the test, one location in the memory was written
to 17777777. When this write was performed, the failing
location may also have been addressed and written with
the same data. When the test was checking that all
other locations were clear, it found the second location
with the pattern 17777777. If this is a true
dual-addressing problem, the error is repeated on each
pass of the test.
At this step of the test, a data error is probable if
the ACTual data is NOT 17777777. Some clues to the
possible causes of a data error follow.
If the error occurs repeatedly but only in a particular
bit in a single location, the memory chip that contains
the failing bit for that location is defective.
If errors occur in many locations but only occur in a
particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably marginal.
If errors occur in many locations and the bits in error
are randomly spaced throughout the words, the memory or
bus timing is probably marginal.
o

Error 010 - occurs in the Walking Ones Test (Section
6.7.7). At this step of the test, one location in the
memory under test was set to the pattern 17777777 and
all other locations were cleared to 00000000. After
checking that all other locations contain 00000000, the
location that should contain 17777777 was read. It
contained some other pattern. Because only read
operations were performed after writing the 17777777, a
dual-addressing problem is highly improbable.
MA  = Address of the failing location
EXP = Data pattern expected (177777)
ACT = Data pattern actually found

If the error occurs repeatedly but only in a particular
bit of a single location, the memory chip that holds
that bit for the failing location is defective.
If errors occur in many locations but only occur in a
particular nibble (4-bit field), one of the bus data
transceivers for that nibble is probably marginal.
If errors occur in many locations, and the bits in error
are randomly spaced throughout the words, the memory or
bus timing is probably marginal.
If errors occur in more than one location, but the
addresses of the failing locations are similar,
crosstalk may be present between the memory data and
addressing lines.
(For example, all failing addresses
end in 2 or 4.)


Error 011 - indicates a parity trap occurred. The
parity trap probably occurred in a location under test
but may have been caused by Program memory where the
memory test itself resides. The MA data in the error
report indicates the address of the location causing the
parity trap. After reporting the parity trap, the
memory test continues if the parity error occurred in a
memory location under test, where:
MA  = Address of the location causing the parity trap.
VPC = Virtual PC of the memory test at the time the trap
occurred. Reference this address in the listing to
locate the area of the test where the error occurred.

Because the data is lost when a parity trap occurs, no
EXPected or ACTual data is displayed. To further
localize the problem, disable parity errors and rerun
the test.
(Refer to Section 6.6.5.)
If the original failure was in a data-bit position, the
memory test detects and reports the error, displaying
the EXPected and ACTual data. This helps trace the
error to a particular address and/or bit position. If
no further errors are detected after disabling parity
errors, the original failure was in one of the parity
bits for the address displayed in the parity trap
report.
o

Error 012 - indicates a Non-Existent Memory trap
occurred. An NXM error is caused when no memory
responds to a particular address. The MA data in the
error report identifies the address that produced the
NXM trap. After reporting the error, the program
attempts to restart testing from the beginning, where:
MA  = The address being tested at the time the NXM trap
occurred.
VPC = The PC of the memory test at the time the trap
occurred. Reference this address in the listing to
locate the area of the test where the error occurred.

The most frequent cause of this error is specifying too
large a value at the Last Address to Test? prompt (that
is, trying to test beyond the end of the memory in your
system).
If this error occurs at a memory address that should be
within your memory configuration, the memory in question
is not supplying an ACK to the I/O control processor
when the specified address is presented on the memory
bus. The most probable point of failure is the logic on
the memory module that compares addresses on the memory
bus with the range of addresses the module should
respond to. The comparator itself could be faulty or
the [C IN, C OUT], [D IN, D OUT], or [P IN, P OUT] lines
on the backplane could be in error.
o

Error 013 - occurs in the Quick Verify test. This error
may indicate a dual-addressing problem. The Quick
Verify test consists of clearing the entire memory, then
writing two patterns to each location and checking that
the writes worked properly. Before writing the first
pattern to each location, the contents of the location
should be zero. Error 013 indicates a location
contained something besides a zero before the first
pattern was written.
If the ACTual data in the error report is 031463(8) or
146314(8), a dual-addressing problem is probably the
cause of the error. (When an address lower in memory
was written with a test pattern, the failing location
was also written with the same pattern.)
Dual-addressing problems are normally caused by shorts
between memory address bits.
If the ACTual data is other than 031463(8) or 146314(8),
the problem probably is caused by a memory bit or bits
stuck in the one state. The first pattern written is
146314(8). The second pattern written is the ones
complement of the first pattern, 031463(8).

Error 014 - occurs in the Quick Verify test. The MA in
the error report shows the failing address. The ACTual
data shows the bit or bits that failed.

Error 015 - occurs when an NXM trap occurs as the memory
under test is initially being cleared. The last address
to test (operator-supplied) exceeds the amount of memory
actually installed in the HSC or part of the memory
under test is not responding.
If the NXM occurs at an
address that should respond, use CTRL/C or CTRL/Y to
return to the Offline Loader. Use the Loader's REPEAT
EXAMINE (address that caused trap) to set up a scope
loop for isolating the problem.

Error 016 - Cache Parity Trap, VPC = xxxxxx - indicates
the J11 took a trap through the parity error vector
during the run of the diagnostic, and the error was
determined to be from the cache. The virtual PC at the
time of the trap is printed.


6.7.7 Offline Memory Test Summaries
The following list describes the three algorithms used by the
Offline Memory Test:

Test 000 - Quick Verify Test - quickly detects stuck
bits and dual-addressing problems. The algorithm used
by the Quick Verify test is as follows:
write 00000000 to each location of the memory
FOR i = First to Last address
    IF <location i does not contain zero> THEN <display error>
    write test pattern to location i (146314(8))
    IF <location i does not contain pattern> THEN <display error>
    write complement of pattern to location i (031463(8))
    IF <location i does not contain complement> THEN <display error>
NEXT i
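
The Quick Verify steps above can be sketched in Python against a simulated memory. This is an illustrative sketch, not the diagnostic's code. The memory is modeled as a mutable sequence of 16-bit words indexed by word number (an assumption), and reporting a failed pattern write as Error 014 is also an assumption, since the manual only states that Error 014 occurs in this test.

```python
WORD_MASK = 0o177777              # assumption: 16-bit memory words
PATTERN = 0o146314                # first test pattern from the manual
COMPLEMENT = PATTERN ^ WORD_MASK  # ones complement of the first pattern


def quick_verify(mem):
    """Run the Quick Verify steps over mem.

    Returns a list of (error, address, expected, actual) tuples.
    """
    errors = []

    # Clear the entire memory first.
    for addr in range(len(mem)):
        mem[addr] = 0

    for addr in range(len(mem)):
        # The location must still be zero before its first pattern write.
        actual = mem[addr]
        if actual != 0:
            errors.append((13, addr, 0, actual))

        # Write each pattern and read it straight back.
        for expected in (PATTERN, COMPLEMENT):
            mem[addr] = expected
            actual = mem[addr]
            if actual != expected:
                errors.append((14, addr, expected, actual))  # assumed number

    return errors
```

With a simulated short that makes every write to one address also land on a higher address, the higher address is found holding 031463 when it should still be zero, which is the dual-addressing signature described under Error 013.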

Test 001 - Moving Inversions Test - detects data and
addressing problems in dynamic semiconductor memories.
The Moving Inversions Algorithm performs the following:
1. Writes 00000000 in each location of the memory.
2. Reads all locations in order from lowest to highest.
After reading a location and checking for a zero,
rewrites the same location with a single one in the
least-significant bit. Then rereads the location
and verifies the write worked correctly.
3. Repeats step 2, adding one more one to the pattern on
each pass, until the test pattern consists of a word
containing all ones (pattern 17777777).
4. Repeats step 2 but this time substitutes a single
extra zero each time instead of a one.
5. Continues step 4 until the test pattern consists of
a word of all zeros (pattern 00000000).
6. Repeats steps 1 through 5 but this time starts at
the highest memory address each time and works down
to the lowest. This works each memory location from
all zeros to all ones and back to all zeros.
7. Clears all memory to 00000000.
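
The Moving Inversions steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the diagnostic itself: memory is a mutable sequence of 16-bit words indexed by word number, ones are added from the least-significant bit upward, and zeros are added from the most-significant bit downward (as the EXP/ACT examples for Errors 002 through 007 suggest). The function name and the error tuple layout are hypothetical.

```python
def moving_inversions(mem, word_bits=16):
    """Return a list of (address, expected, actual) tuples for mem."""
    mask = (1 << word_bits) - 1
    size = len(mem)
    errors = []

    # Start with all memory cleared.
    for addr in range(size):
        mem[addr] = 0

    # First pass works upward through memory, second pass downward.
    for order in (range(size), range(size - 1, -1, -1)):
        previous = 0
        for step in range(2 * word_bits):
            if step < word_bits:
                # Add a one just left of the leftmost one.
                pattern = ((previous << 1) | 1) & mask
            else:
                # Replace the leftmost one with a zero.
                pattern = previous >> 1

            for addr in order:
                # The location must still hold the previous pattern.
                actual = mem[addr]
                if actual != previous:
                    errors.append((addr, previous, actual))
                # Write the new pattern and verify it immediately.
                mem[addr] = pattern
                actual = mem[addr]
                if actual != pattern:
                    errors.append((addr, pattern, actual))

            previous = pattern

    # Leave memory cleared.
    for addr in range(size):
        mem[addr] = 0
    return errors
```

The check against the previous pattern is what exposes dual addressing: a location that was written by mistake while another address was being updated already holds the new pattern when the test reaches it.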

Test 002 - Walking Ones Test - is an algorithm that
stresses semiconductor memories and is effective in
locating timing problems on the memory module or on the
bus. The Walking Ones Algorithm performs the following:
1. Writes all memory to zeros (pattern=00000000).
2. Checks all memory for zeros. Declare Error 008 if
not zero.
3. Sets TESTADDRESS = first address to test.
4. Writes 17777777 to contents of TESTADDRESS.
5. Checks all other locations = 00000000. Declare an
Error 009 if not equal to 00000000.
6. Checks that TESTADDRESS contains 17777777. Declare
an Error 010 if not equal to 17777777.
7. Writes 00000000 to contents of TESTADDRESS.
8. IF <TESTADDRESS = last address to test>
THEN <done testing>
ELSE <add 2 to TESTADDRESS, GOTO step 4>
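
The Walking Ones steps can be sketched in Python as shown below. This is an illustrative sketch only. It assumes 16-bit words (so the all-ones pattern is 177777 octal, matching the EXP value listed for Error 010) and indexes memory by word number, whereas the manual advances TESTADDRESS by 2, presumably because it counts byte addresses. The function name and tuple layout are hypothetical.

```python
ALL_ONES = 0o177777  # assumption: 16-bit memory words


def walking_ones(mem):
    """Return a list of (error, address, expected, actual) tuples."""
    errors = []
    size = len(mem)

    # Steps 1 and 2: clear all memory, then check it (Error 008).
    for addr in range(size):
        mem[addr] = 0
    for addr in range(size):
        actual = mem[addr]
        if actual != 0:
            errors.append((8, addr, 0, actual))

    # Steps 3 through 8: walk the all-ones word through memory.
    for test_addr in range(size):
        mem[test_addr] = ALL_ONES

        # Every other location must still be clear (Error 009).
        for addr in range(size):
            if addr == test_addr:
                continue
            actual = mem[addr]
            if actual != 0:
                errors.append((9, addr, 0, actual))

        # The test location must still hold all ones (Error 010).
        actual = mem[test_addr]
        if actual != ALL_ONES:
            errors.append((10, test_addr, ALL_ONES, actual))

        mem[test_addr] = 0

    return errors
```

Because every location is reread for each test address, the run time grows with the square of the memory size, which is why this test stresses the memory and bus far more than the Quick Verify test.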

6.8 RX33 OFFLINE EXERCISER

OFLRXE is a combined hardware diagnostic and exerciser for the
HSC70 M.std2/RX33 subsystem. Diagnosis of the DMA hardware and
diskette controller are provided, as well as a read/write
exerciser to provide exercise for the actual drive portion of the
subsystem. OFLRXE is a stand-alone diagnostic running under the
Offline Diagnostic Loader. This loader provides terminal I/O
service, time keeping, string conversions, and interrupt
handling. OFLRXE is an 8 Kword program of which approximately
half is control code and half is mapped for data buffer
transfers.


6.8.1 RX33 Offline Exerciser System Requirements
This test requires a J11 P.ioj module and an M.std2
memory/controller board. At least one RX33 drive must be
present. One scratch diskette is needed for each drive to be
tested (maximum of two).
The J11 chip set, and the J11 cache if it is turned on, are
assumed to have been tested already. OFLRXE requires two tested
4 Kword partitions of memory.

6.8.2 RX33 Offline Exerciser Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If the HSC70 is already booted
and displaying the Offline Loader prompt (ODL>), proceed as
follows:
At the ODL> prompt, invoke the Offline Rx33 diagnostic by typing
TEST RX followed by a carriage return. This loads the Offline
Rx33 diagnostic (OFLRXE) from the media, and transfers control to
the diagnostic. At the start, the diagnostic should print out
the following string:
HSC70 Offline Rx33 Exerciser Vxxx

where:
Vxxx is a 3-digit version/edit number.
NOTE
If you are unable to boot from drive 0, move the
diskette to drive 1 and try again, or use a backup
copy of the Offline Diagnostics diskette.
The Offline Rx33 Exerciser terminates on CTRL/C, CTRL/Y, or on
expiration of the allotted time. The program also terminates on
fatal errors.
6.8.3 RX33 Offline Exerciser Parameter Entry
Following are the three user-modifiable parameters for this test:
1. Drive selection is prompted for by the program in the
following manner:
Test drive n (Y/N) [Y] ?
Where n is the drive number (0 or 1). The default is yes.
The prompt repeats for each available diskette drive on
the HSC70.
2. The operator is asked if an initial write operation
should be performed:
Perform initial write on this drive (Y/N) [Y]

The default is yes. This lays down a background pattern
on the entire disk in preparation for the random
read/write exerciser. Selecting this option adds 10
minutes of test time per drive.
As soon as you have answered the previous prompts, the
program directs you to place a scratch diskette in the
selected drive:
Insert a scratch diskette in the drive, type a carriage
return to continue.

At this point, insert your scratch diskette. The random
read/write exercise takes place over the entire surface
of the diskette, so be sure the diskette is a scratch
one only to be used for the exercise.
3. Run time of the exerciser is user-selectable and is
prompted for by the program as follows:
# of minutes to exercise (0) [30] ?

Enter a number between 1 and 32767. The default, if the
user types just a carriage return, is 30 minutes. This
30 minutes starts after the initial patterning of the
disk (if selected), so the total test time with two
drives and initial patterning is the requested time plus
20 minutes. A value of 1440 minutes gives a 24-hour run
time for burn-in purposes. The 30-minute default is
sufficient for installation use and repair verification.
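
The run-time arithmetic above can be written out as a short Python sketch. The function and parameter names are hypothetical. The 10 minutes per drive for the initial write, the 30-minute default, and the 1 to 32767 range come from the text.

```python
INITIAL_WRITE_MINUTES = 10  # added per drive when the initial write is selected


def total_test_minutes(exercise_minutes=30, drives=1, initial_write=True):
    """Estimate total run time: initial patterning plus the exercise time."""
    if not 1 <= exercise_minutes <= 32767:
        raise ValueError("exercise time must be between 1 and 32767 minutes")
    if drives not in (1, 2):
        raise ValueError("the exerciser tests one or two drives")
    patterning = INITIAL_WRITE_MINUTES * drives if initial_write else 0
    return exercise_minutes + patterning
```

For two drives with initial patterning and the default exercise time, this gives 30 + 20 = 50 minutes.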
At the end of the time allotted for the exerciser, the
program prompts the user by printing:
Reuse parameters (Y/N) [Y] ?
Answering this prompt with a Y allows you to run the diagnostic
again with the same parameters. Answering with an N takes you
back through the parameter entry questions again.
6.8.4 RX33 Offline Exerciser Progress Reports

The Offline RX33 Exerciser does not run in a conventional pass
sense. There are no pass completed messages. Instead some
informational messages are printed indicating what the exerciser
is currently doing. At the end of the initial write test (if
selected), the exerciser prints:

Initial write completed on drive 000n
Where n is the drive number, 0 or 1.

When the exerciser begins the random read/write phase of the
testing, the following message is printed:
Beginning random exerciser
The random exerciser is now in progress. It runs for the amount
of time requested by the user. When the requested time has
expired, the program prints the following string:
Exerciser completed.
The program then returns to the parameter entry routine.
The program also has a user-requested status report available.
If at any time the user types CTRL/T on the console, the program
responds:
Number of sectors transferred = xxxxxxxxxx, yyyyyy errors.

Where xxxxxxxx is a 16-digit number of sectors successfully
transferred, and yyyyyy is a 6-digit cumulative number of errors
detected.
6.8.5 RX33 Offline Exerciser Error Information
A generic message format for all offline diagnostic errors is
found in Section 6.1.5. The following section contains
information on specific errors associated with Rx33 Offline
Exerciser. A typical Rx33 Offline Exerciser error message is:
OFLRXE>52:22 T 008 E 010 D 001
SEEK error detected during positioning operation
LBN = 004356
Track = 000114
Sector =000007
Surface = 00000
Soft errors, such as seek errors, can build up to a point where
the diagnostic defines them as fatal and terminates on a fatal error.
The internal bias for soft errors is currently set to 20. When
this number is exceeded, the Exerciser determines the errors are
fatal and terminates.
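
The soft-error limit described above can be pictured with the following Python sketch. The class and method names are hypothetical. Only the limit of 20 and the "exceeded" condition come from the text, and whether the real exerciser keeps one count for the whole run or one per drive is not stated, so a single counter is assumed here.

```python
SOFT_ERROR_LIMIT = 20  # current limit stated in the manual


class SoftErrorCounter:
    """Count soft errors and flag when they should be treated as fatal."""

    def __init__(self, limit=SOFT_ERROR_LIMIT):
        self.limit = limit
        self.count = 0

    def record(self):
        """Record one soft error; return True once the limit is exceeded."""
        self.count += 1
        return self.count > self.limit
```

Under this reading, the first 20 soft errors are tolerated and the 21st terminates the exerciser.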
6.8.5.1 Specific RX33 Offline Exerciser Error Messages - The
following is a list of errors associated with test failures:
o

Error 00 - Parity trap, VPC = xxxxxx - (Applicable to
all tests) - occurs at any time during execution of the
diagnostic. The virtual PC on the stack is printed to
help identify the program area where the error occurred.
Both the content of the error address register and the
virtual PC are displayed as optional lines. This error
terminates the test. The diagnostic returns to the
Reuse parameters prompt.
o

Error 01 - NXM Trap, VPC = xxxxxx (Applicable to all
tests) - causes the diagnostic to return to the Reuse
parameters prompt. Additional data, such as the virtual
PC of the instruction which caused the trap, and the
physical address contained in the error address register
are printed as optional lines.

Error 02 - Bit Stuck in Register (Applicable to Test 1)
- indicates a stuck-at fault is present in one of the
RX33 control registers. The register address and the
expected and actual data are printed as optional lines
in the error message.
If the error is in the low byte,
the problem is the diskette controller chip.
If the
error is in the high byte, the problem is with the MAR
register at that address. If more than one register
show the same bit(s) in error, the problem is probably
in the bus transceivers.
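
The reasoning given for Error 02 can be sketched as a small Python routine. This is illustrative only. The function name, the input format, and the decision to test for bits common to all failing registers before looking at byte position are assumptions.

```python
def diagnose_stuck_bits(failures):
    """Suggest suspects from a list of (register_address, expected, actual).

    Returns a sorted list of suspect names, empty if nothing failed.
    """
    # Accumulate the bits in error for each register.
    masks = {}
    for addr, exp, act in failures:
        if exp != act:
            masks[addr] = masks.get(addr, 0) | (exp ^ act)
    if not masks:
        return []

    # Same bit(s) wrong in more than one register: suspect the transceivers.
    if len(masks) > 1:
        common = 0o177777
        for mask in masks.values():
            common &= mask
        if common:
            return ["bus transceivers"]

    suspects = set()
    for mask in masks.values():
        if mask & 0o000377:   # error in the low byte
            suspects.add("diskette controller chip")
        if mask & 0o177400:   # error in the high byte
            suspects.add("MAR register")
    return sorted(suspects)
```

For instance, a single register with bit 8 stuck points at the MAR register for that address, while the same bit stuck in two registers points at the bus transceivers.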

Error 03 - Interrupt Occurred Without Enable Set
(Applicable to Test 2) - indicates there is a stuck-at
fault in the register, or the etch going into the DC003
interrupt control chip. The interrupt enable bit, <13>
of the CSR, does not disable interrupts.

Error 04 - Rx33 Interrupt Occurred at Wrong Priority
(Applicable to Test 2) - indicates the RX33 interrupt
occurred with the priority at five or greater. The
virtual PC where the interrupt occurred is printed out
as an optional line. Using the listing of the program,
you can determine the priority at the time of the
interrupt.

Error 05 - Unexpected Interrupt from Rx33 (Applicable to
all tests) - indicates an unexpected interrupt. An
interrupt that occurs at any time when a command to the
RX33 is not in progress is defined as unexpected. The
virtual PC where the interrupt occurred is printed as an
optional line.

Error 06 - Track 0 Did Not Set After RECALIBRATE Command
(Applicable to Test 5) - indicates the track 0 status
bit (bit 2 of the CSR) did not set upon completion of a
RECALIBRATE command. The drive may not be sending the
signal, or the cable to the drive may be faulty.


Error 07 - Rx33 Did Not Interrupt as Expected
(Applicable to Test 2) - indicates an expected interrupt
never occurred. The interrupt control chip (DC003) may
be at fault, or the diskette controller chip interrupt
signal is stuck at 1. The J11 may be unable to
recognize interrupts from the diskette controller, or
the backplane etches carrying interrupt control signals
are open.

Error 10 - Seek Error Detected During Positioning
Operation (Applicable to Tests 5, 6, 7, and 8) -
indicates a seek error status (bit 4 of the CSR) was set
after a SEEK or RECALIBRATE command. The problem may be
in the diskette controller chip or the diskette. If the
errors are occurring mostly in test 5 starting with
track 0, the problem is probably fundamental; the
controller cannot read the diskette at all.
If the
errors occur in a random fashion, the problem is
probably the diskette.

Error 11 - Current Track Register Incorrect (Applicable
to Tests 5 and 6) - indicates the values in the track
register of the diskette controller chip are not as
expected after a given operation. This problem probably
is in the diskette controller chip.

Error 12 - CRC Error in Header Detected During Position
Verify (Applicable to Tests 5, 6, 7, and 8) - detects a
CRC error when reading a header during a position
verify. This error occurs when a valid header has been
found and read, but the CRC at the end is incorrect.
This is probably the diskette. If the controller is
able to detect the address and data marks that precede a
header (so that it knows that a header is being read),
the data separation logic is probably working.

Error 13 - Processor Type is Not J11 (Applicable to Test
0) - indicates the processor type read by the diagnostic
does not contain the value which defines a J11.
This error causes the diagnostic to terminate.

Error 14 - Drive Under Test is Not Ready (Applicable to
Tests 5, 6, 7, and 8) - indicates the diskette drive is
sending NOT READY status to the controller. The door
may be open on the drive, or no diskette is inserted.
If
these conditions are not the cause of the fault, the
ready signal from the drive may be stuck.

Error 15 - Last Command Did Not Complete (Applicable to
Tests 5, 6, 7, and 8) - indicates the last command
issued to the diskette controller never interrupted to
show completion. This error points to the diskette
chip, since it occurs after the interrupt logic has
already been tested.


Error 16 - RX33 Header Does Not Compare (Applicable to
Tests 7 and 8) - The header information written in the
data area of a sector is not what it should be for that
sector. Each sector has a unique header consisting of
track, sector, and side, written as part of the data in
that sector. This error happens when an undetected
positioning error has occurred, either during the read,
or the write of the sector involved. The LBN, track,
sector, and side are displayed as optional lines.

Error 17 - Record Not Found During Read (could also say
Write) (Applicable to Tests 7 and 8) - indicates the
controller was unable to find that sector on the current
track when attempting to read or write a given sector.
Either a misposition occurred, or that sector is
unreadable. Because this error occurs after basic read
capability has been tested, the most probable culprit is
the diskette, with the diskette chip being the next most
probable. The LBN, track, sector, and side are
displayed as optional lines.

Error 20 - CRC error in Data During Read (could also say
Write) (Applicable to Tests 7 and 8) - indicates the
controller detected a CRC error when reading the desired
sector. If the error occurs multiple times in a row for
a given sector, the problem is most likely the diskette
(or the drive it is installed in). An error that occurs
only once for a given LBN is a soft error. The
LBN, track, sector, and side information is printed as
optional lines.

Error 21 - Lost Data Detected During Read (could also
say Write) (Applicable to Tests 7 and 8) - indicates the
DMA logic did not service an I/O request of the diskette
controller chip in time. There are probably problems in
the DMA logic, or stuck-at faults exist in the etch
between the controller chip and the DMA logic.

Error 23 - Invalid Pattern Code in Buffer (Applicable to
Test 8) - indicates the data word, defined as the
pattern code, read from the diskette does not match any
of the possible patterns used. It is unlikely the data
was read incorrectly from the diskette and not detected
as a CRC error. Usually this error occurs when a
diskette is not written with the initial data pattern.
The LBN, track, sector, and side are displayed as
optional lines.


Error 24 - Drive is Write-Protected (Applicable to Tests
7 and 8) - indicates the drive is sending write protect
status. Either the interface is bad, or the drive is in
error (assuming you don't have a write-protected
diskette in the drive). This error terminates the
diagnostic, as you cannot write on a write-protected
diskette.

Error 25 - CRC Error in Header During Read (could also
say Write) (Applicable to Tests 7 and 8) - indicates the
controller detected bad CRC in the header it was reading
as part of a data transfer command. This is probably a
diskette error. The LBN, track, sector, and side are
displayed as optional lines.

Error 26 - Data Incorrect After DMA TEST MODE command
(Applicable to Tests 3 and 4) - indicates the memory
content after a DMA test mode command was not correct.
There are either stuck-at faults in the DMA registers,
or the transfer did not happen at all (that is, the
memory is unchanged). This is a fundamental error in
the diskette logic, and the diagnostic terminates after
detecting it.
o

Error 27 - Data Compare Error (Applicable to Tests 7 and
8) - indicates a manual check of data read by the
diskette turned up an error. Either the transfer did
not complete, an intermittent error occurred in the data
or address path, or what was written on the disk was
written incorrectly. The LBN, track, sector, and side
are displayed as optional lines.

Error 30 - RX33 Detected Parity Error During Read (could
also say Write) (Applicable to Tests 7 and 8) -
indicates the Rx33 detected a parity error when doing a
DMA read from memory. Either program memory is bad, or
the parity logic on the controller is in error.

Error 31 - Rx33 Detected NXM During Read (could also say
Write) (Applicable to Tests 7 and 8) - indicates the
Rx33 detected a NXM during a DMA operation. Either the
DMA address was loaded wrong and pointed to a
nonexistent location, or the handshake logic on the
M.std2 board is in error.

Error 32 - Rx33 MAR Value Incorrect After DMA Transfer
(Applicable to Test 3) - indicates the value of the MAR
address counters was in error after a DMA test
operation. The problem is probably in the counters or
the etch associated with them. The EXPected and ACTual
data are printed out as optional lines.

Error 33 - Parity Error Was Not Forced in Main Memory
(Applicable to Test 4) - indicates a write to program
memory with bad parity (bit 11 of the CSR) set did not
result in bad parity in memory. There is either a
stuck-at fault in the parity logic or the operation
never wrote memory in the first place.
o

Error 34 - Parity Error Did Not Set in CSR (Applicable
to Test 4) - indicates a DMA read of a location with
known bad parity did not set the parity error bit (bit
15 of the CSR). Either the data was never read, or
there is a stuck-at fault in the parity logic.

Error 35 - NXM Did Not Set in CSR (Applicable to Test 4)
- indicates a DMA read of a location expected to give a
NXM did not set NXM in the CSR. Look for stuck-at
faults in the NXM detection logic.

Error 36 - Parity Error Set Along with NXM in CSR
(Applicable to Test 4) - indicates both the parity error
and the NXM error set simultaneously in the CSR. On a
NXM error, the parity error should not set. Check for
stuck-at faults in the NXM/parity error logic.

Error 37 - Cache Parity Error, VPC = xxxxxx (Applicable
to all tests) - indicates the J11 took a trap through
the parity error vector because of a cache error during
the run of the diagnostic. The virtual PC at the time of
the trap is printed.

6.8.6 RX33 Offline Exerciser Test Summaries
The following is a summary of Rx33 offline tests:

Test 1 - Rx33 Controller Registers - performs stuck-at
testing on the RX33 controller registers at 177400,
177402, 177404, and 177406. A simple walking one's test
is performed on each register, except for the CSR
register at 177400 which only has the high byte tested.

Test 2 - Interrupt Hardware - exercises the interrupt
hardware on the M.std2. The interrupts generated are
also tested for the correct priority when they occur.

Test 3 - DMA Logic and Counters - checks out all of the
DMA handshake signals, the data path, and the address
path. A special DMA test mode in the controller is used
to perform one read or write to/from each memory
location loaded in the DMA address registers. Correct
incrementing action from the counters is checked. The
actual data loaded to memory on a DMA write is checked
as well.

Test 4 - Parity Logic - also uses DMA test mode in
addition to the force bad parity function (bit 11 of the
CSR) to prove parity errors can be detected, and correct
parity is written to memory by the DMA control logic.
NXM action is also lumped into this test. Correct
handling of NXM errors is checked as well as correct
reporting by the error bit in the CSR.

Test 5 - Verify Track Counters and Registers - uses the
step function of the diskette controller chip to verify
all cases of the track counter bits internal to the
diskette controller chip work as advertised. Step
functions are performed for each power of two in the
diskette track register, (step four times, step eight
times more, etc.). The verify option is set on each
step command so the diskette controller reads headers on
each track to verify position.

Test 6 - Oscillating Seek Test - performs an oscillating
seek test using the algorithm:
oscillating seek test
begin
  incnt = 0;
  outcnt = 124;
  while incnt <> outcnt do
  begin
    seek outcnt;
    CHECKSTATUS;
    if outcnt <> rxtrk then error 11;
    outcnt = outcnt - 1;
    seek incnt;
    CHECKSTATUS;
    if incnt <> rxtrk then error 11;
    incnt = incnt + 1;
  end;
end
oscillating seek test.
In this manner, all seeks are performed in both
directions with all seek counts between <0:77>.
Verification is performed on each track to check the
step logic.
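
For readers who want to trace the seek order, the loop can be sketched in Python. This is an illustration only, not the diagnostic's code. The callback `seek` is a hypothetical stand-in for the "seek; CHECKSTATUS" pair and is assumed to return the track register value (rxtrk) read back after the seek. The starting value 124 is copied from the listing.

```python
def oscillating_seek(seek, outer_start=124):
    """Trace the oscillating seek order; returns the list of tracks sought.

    seek(track) is a hypothetical stand-in for "seek; CHECKSTATUS" and must
    return the track register value (rxtrk) read back after the seek.
    """
    sought = []
    incnt, outcnt = 0, outer_start
    while incnt != outcnt:
        for target in (outcnt, incnt):      # outer seek first, then inner
            rxtrk = seek(target)
            if rxtrk != target:
                raise RuntimeError(f"error 11: sought {target}, rxtrk = {rxtrk}")
            sought.append(target)
        outcnt -= 1
        incnt += 1
    return sought


if __name__ == "__main__":
    # An ideal drive always lands on the requested track.
    order = oscillating_seek(lambda track: track)
    print(order[:6])
```

Taken literally, the loop ends only when the two counters meet, so the sketch (like the listing) relies on an even starting value.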

Test 7 - Sequential Read/Write Test - performs the basic
patterning of the diskette with a background pattern.
This test is user-selected. If selected, this test
writes each LBN on the Rx33 diskette in ascending order
with a unique pattern consisting of the track, sector,
and side of that LBN, and then an incrementing-byte
pattern for the remainder of the 512-byte sector. Each
LBN so written is then read back, and each word compared
to the data that was written. This test takes about 10
minutes per drive.

Test 8 - Random Reads/Writes - does random reads and
writes to the selected drives. If both drives are
selected for test, operations on each drive are
performed in groups of five.
This test runs until the allotted time for the exercise
expires, or the user terminates the test with a CTRL/C.
The mechanism of this test is as follows:
A random number is generated. The value of this number
determines if the operation is a read or a write, and
which LBN is used.
If the command is a read, the appropriate LBN is read
from the disk. The header bytes (0:5) of the data read
are then compared against the values expected. The
pattern number bytes (6:7) are then compared against a
list to see which pattern should be used to compare the
rest of the buffer (10:512).
If the command is a write, other bits of the random
number are used to select one of four different patterns
to write to the disk. A buffer is then set up with the
correct header bytes for the LBN to be written and the
correct background data pattern. This buffer is then
written out on the diskette.
Descriptions of the data patterns used are found in the
following section.

6.8.7 Rx33 Offline Exerciser Data Patterns
Four unique data patterns were selected to give maximum delta of
frequency with the MFM (modified frequency modulation) encoding
used on the RX33. These patterns are as follows:
PATTERN NUMBER    PATTERN VALUE

177400            Incrementing by bytes starting at 2404
11111             1000101110001011 binary, 105613 octal
22222             0011001100110011 binary, 031463 octal
33333             0011000010010001 binary, 030221 octal
44444             0000101110001011 binary, 005613 octal
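
As a cross-check of the table, the four fixed patterns can be written as a small Python lookup and used to fill or verify a data area. This is a sketch under stated assumptions, not the exerciser's code: the helper names are hypothetical, the word is assumed to repeat through the data area, and the low byte is assumed to be stored first (PDP-11 byte order). The incrementing-byte pattern (177400) is left out.

```python
# Pattern numbers and 16-bit pattern words, copied from the table (octal).
PATTERNS = {
    0o11111: 0o105613,
    0o22222: 0o031463,
    0o33333: 0o030221,
    0o44444: 0o005613,
}


def fill_words(pattern_word, n_bytes):
    """Return n_bytes of data made by repeating one 16-bit word, low byte first."""
    return pattern_word.to_bytes(2, "little") * (n_bytes // 2)


def check_fill(data, pattern_number):
    """Return the byte offsets where data differs from the selected pattern."""
    expected = fill_words(PATTERNS[pattern_number], len(data))
    return [i for i, (a, b) in enumerate(zip(data, expected)) if a != b]


if __name__ == "__main__":
    # Print each pattern number with its word in binary and octal.
    for number, word in PATTERNS.items():
        print(f"{number:05o}  {word:016b}  {word:06o}")
```

Printing the words in both bases is a quick way to confirm that the binary and octal columns of the table agree with each other.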

6.9 OFFLINE REFRESH TEST
The Offline Memory Refresh test finds memory problems related to
refresh. Patterns are written to memory and then checked after
waiting one minute. Three separate patterns are used to test
each memory bit (including parity bits) in both the one and zero
states. All three HSC memories are tested (Program, Control, and
Data), although only the Program and Control memories require
refreshing. Tests of Data memory are included because some
static RAM failures resemble refresh problems.
The refresh test can find problems in the memories not detected
by the normal memory tests. The refresh test is not intended to
be run on memories that fail the normal memory tests.
6.9.1 Offline Refresh Test System Requirements
The following hardware is required to run this diagnostic:
o  I/O control processor module with HSC70 Boot ROMs

o  At least one memory module that passes the Offline
   Memory test and/or the Offline K/P memory test

o  Rx33 controller with at least one working drive

o  Terminal connected to I/O control processor console
   interface

This test assumes the HSC memories pass both the Offline Memory
Test and the Offline K/P Memory Test. In addition, the test
assumes the memories are working except for the refresh
circuitry.
6.9.2 Offline Refresh Test Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If the HSC70 is already booted
and displaying the Offline Loader prompt (ODL>), proceed as
follows:

1.  Type TEST REFRESH in response to the prompt ODL>.

2.  The refresh test indicates it is loaded properly by
    displaying the following:

    HSC OFL Memory Refresh Test

3.  The refresh test now prompts for parameters. Refer to
    Section 6.9.3 for test parameter entries.

6.9.3 Offline Refresh Test Parameter Entry
This section describes the prompts for the Offline Refresh Test
parameters.
NOTE
For any of the Offline Refresh Test prompts, use
the DELete key to delete mistyped parameters
before the terminating carriage return is typed.
If you note an error in a parameter already
terminated with a carriage return, type a CTRL/C
to return to the initial prompt and re-enter all
parameters.
The Offline Memory Refresh Test first prompts:
# of passes to perform (D) [1] ?

Enter a decimal number between 1 and 2,147,483,647 (omitting
commas) to specify the number of times the refresh test should be
repeated. (Entering a 0, or just a carriage return, results in
one pass.) After selection of the number of passes the test
begins. The test can be aborted at any time by typing a CTRL/C.
Each pass of the test requires three minutes to complete.
After the refresh test completes, the following prompt is issued:
Reuse parameters (Y/N) [Y] ?
Answering this prompt with a carriage return, or a Y followed by
a carriage return, repeats the test using the same parameters.
Answering the prompt with a N followed by a carriage return
causes a prompt for new parameters.
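
The pass-count rules can be summarised in a short Python sketch, which is also handy for estimating run time before starting a long test. The function name is hypothetical. Raising an error for an out-of-range entry is an assumption, since the text does not say how the diagnostic itself reacts.

```python
MAX_PASSES = 2_147_483_647
MINUTES_PER_PASS = 3  # from the text: each pass takes three minutes


def parse_passes(reply):
    """Interpret a reply to the '# of passes' prompt as described in the text."""
    reply = reply.strip()
    if reply == "":
        return 1                      # bare carriage return: default of one pass
    value = int(reply)                # decimal entry, typed without commas
    if value == 0:
        return 1                      # 0 also results in one pass
    if not 1 <= value <= MAX_PASSES:
        raise ValueError("pass count out of range")  # assumed behaviour
    return value


if __name__ == "__main__":
    passes = parse_passes("20")
    print(f"{passes} passes, about {passes * MINUTES_PER_PASS} minutes")
```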
6.9.4 Offline Refresh Test Progress Reports
Each time the refresh test completes one full pass, an
end-of-pass report is displayed. Each pass of the test requires
three minutes to complete. The end-of-pass message is displayed
as follows:
End of Pass nnnnnn, xxxxxx Errors, yyyyyy Total Errors
The Pass count nnnnnn is a decimal count of the number of
complete passes made. The Errors count (xxxxxx) indicates the
number of errors detected on the current pass. The Total Errors
count (yyyyyy) indicates the number of errors detected during the
passes completed so far.
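
When console output is captured to a log, the end-of-pass line can be picked apart with a short Python sketch like the following. The function name is hypothetical, and the sketch assumes all three counts are printed as decimal digits (the text states this only for the pass count).

```python
import re

# Matches "End of Pass nnnnnn, xxxxxx Errors, yyyyyy Total Errors".
END_OF_PASS = re.compile(
    r"End of Pass\s+(\d+),\s+(\d+)\s+Errors,\s+(\d+)\s+Total Errors"
)


def parse_end_of_pass(line):
    """Return (pass_count, errors_this_pass, total_errors), or None if no match."""
    match = END_OF_PASS.search(line)
    if match is None:
        return None
    return tuple(int(field) for field in match.groups())


if __name__ == "__main__":
    print(parse_end_of_pass("End of Pass 000003, 000000 Errors, 000002 Total Errors"))
```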
6.9.5 Offline Refresh Test Error Information
All error messages produced by the refresh test conform to the
HSC Diagnostic Error message format (refer to Section 6.1.5).
Following is a typical Offline Refresh error message.

ORFT>hh:mm T aaa E bbb
<Text describing error>
MA -xxxxxxxx
EXP-yyyyyy
ACT-zzzzzz
where:
MA  = Address of the failing location
EXP = EXPected data
ACT = ACTual data
6.9.5.1 Offline Refresh Test Error Messages - The following list
describes the nature of the failure indicated by each error
number:

Error 01 - indicates the test detected a parity error
when reading the pattern from the indicated location.
The expected and actual data are included in the error
report. This error indicates a data bit or parity bit
was not refreshed (assuming the memory in question
passed the Offline Memory Test). If the expected and
actual data are the same, one of the parity bits was not
refreshed.

Error 02 - indicates the test detected a data compare
error when reading the pattern from the indicated
location. The expected and actual data are displayed in
the error report. Note, a parity error did not occur so
more than one bit must have failed to refresh.

Error 03 - indicates the I/O control processor detected
a parity error. The 22-bit address of the location that
caused the trap is displayed as the MA data in the error
report, where:
MA  = Address causing the parity trap.
VPC = Virtual PC of the memory test at the time the
      trap occurred. Reference this address in the
      listing to locate the area of the test where
      the error occurred.

Because the data is lost when a parity trap occurs, no
EXPected or ACTual data can be displayed. The parity
error occurred within the program itself, not within the
memory being tested. After the trap is reported, the
program attempts to restart the test from the beginning.

Error 04 - indicates the I/O control processor detected
a Non-Existent Memory trap. An NXM error is caused when
no memory responds to a particular address. The MA data
in the error report indicates the address that produced
the NXM trap. After the trap is reported, the program
attempts to restart the test from the beginning. (The
MA and VPC fields have the same meanings as those in
Error 03.)
If this error is at a memory address that should be in
your memory configuration, the memory in question is not
supplying an ACK to the I/O control processor when the
specified address is presented on the memory bus. The
most probable point of failure is the logic on the
memory module that compares addresses on the memory bus
with the range of addresses to which the module should
respond. The comparator itself could be faulty or the
[C IN, C OUT], [D IN, D OUT] or [P IN, P OUT] lines on
the backplane could be in error.

Error 05 - Cache Parity Trap, VPC = xxxxxx - indicates
the J11 took a trap through the parity error vector
during the run of the diagnostic. This is a cache
error. The virtual PC at the time of the trap is
printed.
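
The reasoning behind Errors 01 and 02 can be sketched in Python by comparing the EXPected and ACTual words bit by bit. The function name and the wording of the returned strings are hypothetical; the sketch only restates the rules given in the list above.

```python
def describe_refresh_error(error_number, exp, act):
    """Summarise Errors 01 and 02 from the expected and actual 16-bit words."""
    # Bit positions (0 = least significant) where the two words differ.
    failed_bits = [bit for bit in range(16) if (exp ^ act) >> bit & 1]
    if error_number == 1:
        if not failed_bits:
            return "parity error, data intact: a parity bit was not refreshed"
        return f"parity error, data bits {failed_bits} differ"
    if error_number == 2:
        return f"data compare error without parity error, bits {failed_bits} differ"
    raise ValueError("only Errors 01 and 02 are handled by this sketch")


if __name__ == "__main__":
    print(describe_refresh_error(2, 0o100001, 0o000000))
```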

6.9.6 Offline Memory Refresh Test Summaries
The following are the test summaries for the Offline Memory
Refresh Test:

Test 01 - Pattern 177777 - fills the memories with the
pattern 177777. This sets all data bits and also sets
the upper and lower byte parity bits. The entire
Control and Data memories are filled with the pattern.
All of Program memory not occupied by the Refresh test
and the Offline Loader is also filled with the pattern.
After filling the memories, the program delays for one
minute, then each memory location is read and checked
for the pattern. Any errors detected are reported on
the terminal.

Test 02 - Pattern 000000 - fills the memories with the
pattern 000000. This clears all data bits and sets the
upper and lower byte parity bits. The entire Control
and Data memories are filled with the pattern. All Program
memory not occupied by the Refresh test and the Offline
Loader is also filled with the pattern. After filling
the memories, the program delays for one minute, then
each memory location is read and checked for the
pattern. Any errors detected are reported on the
terminal.

Test 03 - Pattern 100001 - fills the memories with the
pattern 100001. This sets data bits 0 and 15 and clears
data bits 1 through 14. Both parity bits are also
cleared. The entire Control and Data memories are
filled with the pattern. All of Program memory not
occupied by the Refresh test and the Offline Loader is
also filled with the pattern. After filling the
memories, the program delays for one minute, then each
memory location is read and checked for the pattern.
Any errors detected are reported on the terminal.
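
The parity-bit states quoted for the three patterns are what odd parity per byte would produce. The manual does not name the parity scheme, so odd byte parity is an inference here, and the helper names are hypothetical. A minimal Python sketch of that check:

```python
def odd_parity_bit(byte):
    """Parity bit that makes the total number of ones (data + parity) odd."""
    return 0 if bin(byte).count("1") % 2 else 1


def parity_bits(word):
    """Return (upper_byte_parity, lower_byte_parity) for a 16-bit word."""
    return odd_parity_bit(word >> 8 & 0xFF), odd_parity_bit(word & 0xFF)


if __name__ == "__main__":
    # The three refresh test patterns, in octal as printed in the summaries.
    for pattern in (0o177777, 0o000000, 0o100001):
        print(f"{pattern:06o}", parity_bits(pattern))
```

Under this assumption the three patterns together exercise every data bit and both parity bits in the one and zero states, which matches the stated purpose of the test.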

6.10 OFFLINE OPERATOR CONTROL PANEL TEST
The Offline Operator Control Panel (OCP) test checks the
operation of the HSC lamps and switches. Testing includes the
five OCP lamps and switches, the State LED, the Secure/Enable
switch, and the Enable LED.
This section includes troubleshooting procedures for localizing
faults detected by this test.
6.10.1 Offline Operator Control Panel Test System Requirements
The following hardware is required to run this test:
o  I/O control processor module with HSC70 Boot ROMs

o  At least one memory module

o  Rx33 controller with at least one working drive

o  Terminal connected to I/O control processor console
   interface

o  Operator Control Panel

Due to the sequence of tests that precede this test, you can
assume the I/O control processor module, Program memory, and Rx33
are tested and working.
6.10.2 Operator Control Panel Test Operating Instructions
If the HSC70 is not booted and loaded, refer to Section 6.1.2,
Section 6.1.3, and Section 6.2. If the HSC70 is already booted
and displaying the Offline Loader prompt (ODL>), proceed as
follows:
Type TEST OCP in response to the ODL> prompt. The RX33 drive in
motion LED should be ON.

The test indicates it is loaded properly by displaying the
following message:

HSC OFL OCP Test

The test then prompts for parameters. Refer to Section 6.10.3
for test parameter entry.

6.10.3 Offline Operator Control Panel Test Parameter Entry
The test first checks the position of the Secure/Enable switch,
via a bit in the I/O control processor Control and Status
Register (address 17770040). If the switch is in the SECURE
position, the following prompt is issued. Otherwise, the test
skips to the next prompt:
Put Secure/Enable switch into ENABLE position
If the Secure/Enable switch is in the ENABLE position and the
above prompt is issued anyway, a problem is indicated with the
bit in the I/O control processor CSR that monitors the
Secure/Enable switch. Refer to the troubleshooting procedures in
Section 6.10.6. The program waits until the Secure/Enable switch
is changed to the ENABLE position and issues the following
message:
(Enable LED should be lit, State LED should be blinking)
Check to verify the Enable LED is lit and the OCP State LED is
blinking. There are two State LEDs: one is to the left of the
Init switch on the HSC OCP, the other is located on the I/O
control processor module (the fourth LED from the bottom of the
rightmost module in the HSC card cage). If either LED is not
blinking, refer to the troubleshooting procedures in Section
6.10.6. The test next prompts for a lamp test:
Press Fault (all OCP lamps should light) (Y/N) [Y] ?
Press the Fault lamp and observe that all OCP lamps light. If
none of the lamps light, a problem may be present in the lamp
test logic on the OCP assembly. If all lamps light properly,
type a carriage return to continue the test. If the lamp test
fails, replace the OCP.
Next, the program checks that all OCP switches are OFF (out
position). If any switch bits in the I/O control processor
Switch/Display register read as ones (ON), the program lights the
lamps for those switches and prompts:
Put all lit switches in OFF (out) position (Y/N) [Y] ?
If the Fault or Init lamps are lit (nonlocking switches), a
problem exists with the wiring in those switches or with their
respective bits in the Switch/Display register. Replace the OCP.

Otherwise, press all lit switches to release their locks and type
a carriage return. If the message repeats, and one or more lamps
remain lit even though the switches are OFF (out position), refer
to the troubleshooting procedures in Section 6.10.6.
The program then tests each of the OCP switches, one at a time.
A switch lights and the following prompt is displayed:
Press and release the lit switch
Press the switch that is lit. The program allows about one
second for the switch to be released after it is pressed and then
continues to the next prompt. If the program fails to respond
when a switch is pressed, refer to the troubleshooting procedures
in Section 6.10.6. For those switches that lock in the ON
position (Online switch and the two unmarked switches), the
program prompts:
Press and release the lit switch again
Press the switch again to return it to the OFF (out) position.
If the Online switch or either of the unmarked switches fails to
lock in the ON position, the switch is defective, and the OCP
should be replaced.
After the OCP switch tests are complete, several features of the
Secure/Enable switch are tested. The program begins these tests
by prompting:
Put Secure/Enable switch into SECURE position
The program waits until the Secure/Enable switch is in the proper
position before continuing. If the program fails to respond when
the switch is moved to the SECURE position, refer to the
troubleshooting procedures in Section 6.10.6. When the program
detects the switch is in the SECURE position, it prompts with:
(Enable LED should turn off)
Ensure the Enable LED is off. If this LED fails to turn off when
the switch is in the SECURE position, a short or wiring problem
is probable.
Next, the program prompts:
Press Init (HSC should not re-boot) (Y/N) [Y] ?
Press the Init switch. When the Secure/Enable switch is in the
SECURE position, pressing the Init switch should have no effect.
(Do not press any other switch or an error message results.) If
the HSC70 starts to perform a bootstrap, (Init lamp turns on and
green LED on I/O control processor turns off), the Secure/Enable
switch is not disabling the action of the Init switch. After
pressing the Init switch, type a carriage return to continue.

The test responds with the following prompt:
Press terminal BREAK key (HSC should not halt) (Y/N) [Y] ?
Press the BREAK key as directed. When in SECURE mode, the BREAK
key should not cause the J11 processor to halt (enter ODT). If
the terminal displays the @ character when BREAK is pressed, the
Secure/Enable switch is not disabling the action of the BREAK
key. Refer to the troubleshooting procedures in Section 6.10.6.
After pressing the BREAK key, type a carriage return to continue
the test. The final prompt of the test is:
Put Secure/Enable switch into ENABLE position.
The test waits until the Secure/Enable switch is returned to the
ENABLE position. At that point the test terminates and returns
to the Offline Loader.
The test may be aborted at any time by typing a CTRL/C.
6.10.4 Offline Operator Control Panel Test Error Information
All error messages produced by this test conform to the HSC
Diagnostic Error message format. Refer to Section 6.1.5. Listed
below is a typical Offline Operator Control Panel Test error
message format:
OOCP>hh:mm T aaa E bbb
<Text describing error>
MA -xxxxxxxx
EXP-yyyyyy
ACT-zzzzzz

6.10.4.1 Offline Operator Control Panel Test Error Messages - The
following list describes the nature of the failure indicated
by each error number:

Error 000 - Wrong Bit Set - occurs when the test detects
a switch bit set in the I/O control processor
Switch/Display register other than the switch bit being
tested. This error can be caused by:
The operator pressing the wrong switch.
A short causing an additional switch bit to set
along with the expected bit.
A wiring error causing the wrong bit to set when a
switch is pressed.
The MA (Media Address) field of the error report gives
the address of the I/O control processor Switch/Display
register. The EXPected and ACTual data in the error
report show the switch bit the program expected to find
set and the bit or bits that actually were set.
If the EXPected data and the ACTual data each consist of
only one bit, the failure was either caused by the
operator pressing the wrong switch or by a wiring error.
If the ACTual data consists of two or more set bits, a
short between switches is likely. Refer to the
troubleshooting procedures in Section 6.10.6.

Error 001 - Bit Set When Init Pressed - occurs when the
Init switch is pressed while the HSC is in the SECURE
mode (Test 008). This error can be caused by one of the
following:
Pressing some switch other than the Init switch.
Pressing the Init switch, causing a switch bit in
the I/O control processor Switch/Display register to
set.
The MA (Media Address) field of the error report gives
the address of the I/O control processor Switch Display
register. The EXPected data is always zero (no bit is
expected to set). The ACTual data shows the bit or bits
that read as a 1 when the Init switch was pressed.
Refer to the troubleshooting procedures in Section
6.10.6.
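
The rule of thumb given under Error 000 for reading the EXPected and ACTual data can be restated as a small Python sketch. The function name and result strings are hypothetical, and the sketch assumes both values contain only switch bits.

```python
def triage_wrong_bit(exp, act):
    """Suggest a likely cause for Error 000 from the EXP and ACT switch bits."""
    def ones(value):
        # Number of bits set in the value.
        return bin(value).count("1")

    if ones(act) >= 2:
        return "short between switches likely"
    if ones(exp) == 1 and ones(act) == 1:
        return "wrong switch pressed or wiring error"
    return "not covered by the rules in the text"


if __name__ == "__main__":
    # Expected the 4000(8) switch bit, but 4000(8) and 2000(8) were both set.
    print(triage_wrong_bit(0o4000, 0o6000))
```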

6.10.5 Offline Operator Control Panel Test Summaries
The following sections summarize Test 000 through Test 009:

Test 000 - Observe Enable and State LEDs - is performed
by the operator, because the program cannot tell whether
the Enable or State LEDs are lit. If the Enable LED is
off, a wiring problem may be the cause (LED not
connected to power/ground source) or the LED itself may
be faulty.
If the State LED on the OCP fails to blink, check the
State LED on the I/O control processor module (fourth
LED from the bottom of the rightmost module in the HSC70
card cage). If neither State LED is blinking, the
problem is probably caused by the bit in the I/O control
processor CSR register that controls the State LED.
(Refer to Section 6.10.6.4.) If one of the State LEDs is
blinking but the other is not, the nonblinking LED is
probably wired wrong or is faulty.

Test 001 - Lamp Test via Fault Switch - performs an
automatic lamp test. When the Fault switch is pressed,
all lamps should light and remain lit until the switch
is released.
If none of the lamps light when the Fault switch is
pressed, the problem is probably in the lamp test
circuitry on the OCP assembly. It is possible all lamps
are defective or they are not installed. Replace the
OCP.
If some lamps light when Fault is pressed but others do
not, replace the OCP.

Test 002 - Check All Switches OFF - reads the I/O
control processor Switch/Display register to see if any
of the switch bits read as ON (switch bit is a one). If
the bit for any switch reads as ON, the corresponding
lamp is lit, and the program prompts to turn off any
switch that is lit. The program will not proceed until
all switch bits read as OFF.
If a lamp remains ON, even though the corresponding
switch is OFF (out position), the switch is either wired
incorrectly or the bit in the I/O control processor
Switch/Display register for that switch is faulty.
Refer to Section 6.10.6.1 to localize the problem.

Test 003 - Fault Switch - directs pressing the lit
switch. The program then monitors the switch bits in
the I/O control processor Switch/Display register and
waits for the Fault switch bit to set. If any other
switch bit sets, an error is reported and the program
terminates.
If pressing the Fault switch has no effect, one of the
following could be the cause:
Fault switch is broken.
Fault switch is not wired properly.
Fault switch bit in the I/O control processor CSR
cannot be set.
Refer to the troubleshooting procedures in Section
6.10.6.
If pressing the Fault switch results in an error
message, refer to Section 6.10.4.1.

Test 004 - Online Switch - directs pressing the lit
switch. The program then monitors the switch bits in
the I/O control processor Switch/Display register and
waits for the Online switch bit to set. If any other
switch bit sets, an error is reported and the program is
terminated.
After the Online switch bit sets in the I/O control
processor Switch/Display register, the program directs
you to press the lit switch again returning it to the
OFF (out) position. Then the program waits until the
switch bit reads OFF (0) before proceeding to the next
test.
If pressing the Online switch has no effect, one of the
following could be the cause:
Online switch is broken.
Online switch is not properly wired.
Online switch bit in the I/O control processor CSR
cannot be set.
Refer to the troubleshooting procedures in Section
6.10.6.
If pressing the Online switch results in an error
message, refer to Section 6.10.4.1.

Test 005 - First Unmarked Switch - directs pressing the
lit switch. The program then monitors the switch bits
in the I/O control processor Switch/Display register and
waits for the first unmarked switch bit to set. If any
other switch bit sets, an error is reported and the
program is terminated.
After the first unmarked switch bit sets in the I/O
control processor Switch/Display register, the program
directs you to press the lit switch again returning it
to the OFF (out) position. Then the program waits until
the switch bit reads OFF (0) before proceeding to the
next test.
If pressing the first unmarked switch has no effect, one
of the following could be the cause:
First unmarked switch is broken.
First unmarked switch is not wired properly.
First unmarked switch bit in the I/O control
processor CSR cannot be set.

Refer to the troubleshooting procedures in Section
6.10.6.
If pressing the first unmarked switch results in an
error message, refer to Section 6.10.4.1.

Test 006 - Second Unmarked Switch - directs pressing the
lit switch. The program then monitors the switch bits
in the I/O control processor Switch/Display register and
waits for the second unmarked switch bit to set. If any
other switch bit sets, an error is reported and the
program terminates.
After the second unmarked switch bit sets in the I/O
control processor Switch/Display register, the program
directs you to press the lit switch again, returning it
to the OFF (out) position. Then the program waits until
the switch bit reads OFF (0) before proceeding to the
next test.
If pressing the second unmarked switch has no effect,
one of the following could be the cause:
Second unmarked switch is broken.
Second unmarked switch is not properly wired.
Second unmarked switch bit in the I/O control
processor CSR cannot be set.
Refer to the troubleshooting procedures in Section
6.10.6.
If pressing the second unmarked switch results in an
error message, refer to Section 6.10.4.1.

Test 007 - Enable LED Off - begins with a prompt to put
the Secure/Enable switch into the SECURE position. The
program waits until bit 15 of the I/O control processor
Control and Status register reads as a zero, indicating
the switch is in the SECURE position. Then the program
tells the operator to observe the Enable LED is OFF.
If the Enable LED fails to turn off when the switch is
in the SECURE position, replace the OCP.

Test 008 - Init Switch in Secure Mode - checks that the
Init switch has no effect when the Secure/Enable switch
is in the SECURE position. You are prompted to press
the Init switch while the program monitors the switch
bits in the I/O control processor Switch/Display
register. Monitoring ensures that pressing the Init
switch does not cause any switch bits to set.

If pressing the Init switch causes the HSC70 to reboot,
the SECURE position of the Secure/Enable switch is not
disabling the Init switch. Replace the OCP.
If pressing the Init switch causes one of the switch
bits in the Switch/Display register to set, an error
message is displayed. Refer to Section 6.10.4.1 for
further information.

Test 009 - BREAK Key in SECURE Mode - checks that the
terminal BREAK key has no effect when the Secure/Enable
switch is in the SECURE position. (Normally the BREAK
key causes the I/O control processor J11 CPU to halt and
enter ODT.) You are prompted to press the BREAK key and
to observe that the HSC70 does not halt.
If pressing the BREAK key causes the terminal to print
an @ symbol, the SECURE position of the Secure/Enable
switch is not disabling BREAK from halting the J11 CPU.

6.10.6 Offline OCP Registers And Displays Via ODT
The following paragraphs and layouts are included to assist you
with troubleshooting.
6.10.6.1 Offline OCP Test Switch Check Via ODT - To check the
operation of an HSC70 switch, follow this procedure:

1.  With the Secure/Enable switch in the ENABLE position,
    press the terminal BREAK key.
    The I/O control processor J11 CPU should halt and
    display an @ symbol.

2.  Type:

    17770042/

    The contents of address 17770042 (the I/O control
    processor Switch Display register) are displayed in
    octal. Refer to the layout of the Switch Display
    register in Figure 6-1 to locate the switch bits. Each
    bit is in the 1 state when the associated switch is ON
    (pressed in).

3.  Type a carriage return.

4.  You may now type a slash (/) to re-examine the Switch
    Display register.

5.  To restart the Offline Loader (or the diagnostic that
    was interrupted), type a carriage return, then type a P
    followed by another carriage return.

Using this method, the switch bits of the Switch/Display register
can be monitored when various switches are in the ON or OFF
position.

ADDRESS 17770042 VIA ODT

4000(8)  FAULT SWITCH
2000(8)  ONLINE SWITCH
1000(8)  FIRST UNMARKED SWITCH
 400(8)  SECOND UNMARKED SWITCH
         (UNUSED)
 200(8)  GREEN LED
 100(8)  CMEM/DMEM NXM
  40(8)  INH PARITY TRAP
  20(8)  INIT LAMP
  10(8)  FAULT LAMP
   4(8)  ONLINE LAMP
   2(8)  FIRST UNMARKED LAMP
   1(8)  SECOND UNMARKED LAMP

CX-1119A

Figure 6-1  P.ioj Switch Display Register Layout
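
An octal value read back through ODT can be decoded against Figure 6-1 by hand or with a short Python sketch such as the one below. The names are hypothetical, the masks are copied from the figure, and the NXM bit name is read here as CMEM/DMEM.

```python
# Bit masks (octal) and names from Figure 6-1.
SWITCH_DISPLAY_BITS = {
    0o4000: "FAULT SWITCH",
    0o2000: "ONLINE SWITCH",
    0o1000: "FIRST UNMARKED SWITCH",
    0o400: "SECOND UNMARKED SWITCH",
    0o200: "GREEN LED",
    0o100: "CMEM/DMEM NXM",
    0o40: "INH PARITY TRAP",
    0o20: "INIT LAMP",
    0o10: "FAULT LAMP",
    0o4: "ONLINE LAMP",
    0o2: "FIRST UNMARKED LAMP",
    0o1: "SECOND UNMARKED LAMP",
}


def decode_switch_display(value):
    """List the names of the bits set in a Switch Display register value."""
    return [name for mask, name in SWITCH_DISPLAY_BITS.items() if value & mask]


if __name__ == "__main__":
    # int(text, 8) converts the octal digits typed out by ODT.
    print(decode_switch_display(int("002204", 8)))
```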

6.10.6.2 Offline OCP Test Lamp Bit Check Via ODT - To check the
operation of the lamp control bits in the I/O control processor
Switch/Display register, use the following method:

1.  With the Secure/Enable switch in the ENABLE position,
    press the terminal BREAK key.
    The I/O control processor J11 CPU should halt and
    display an @ symbol.

2.  Type 17770042/
    The contents of the Switch/Display register are
    displayed in octal.

3.  Use Figure 6-1 to locate the bits controlling the OCP
    lamps.
    When a lamp bit is set, the corresponding lamp should be
    lit.

4.  To light a lamp, type the octal value that corresponds
    to the proper lamp, then type a carriage return. The
    lamp should light.

5.  Type / to re-examine the contents of the Switch/Display
    register.

6.  To restart the Offline Loader (or the diagnostic that
    was interrupted), type a carriage return, then type a P
    followed by another carriage return.

Using this method, various lamps can be manually enabled or
disabled.
6.10.6.3 Offline OCP Test Secure/Enable Switch Check Via ODT To manually check the operation of the Secure/Enable bit in the
I/O control processor Control and Status register, use the
following procedure. Using this method, the Secure/Enable bit in
the I/O control processor CSR can be checked with the
Secure/Enable switch in both positions.
1.

with the Secure/Enable switch in the ENABLE position,
press the terminal BREAK key.
(If the HSC70 is stuck in
the SECURE mode, this method cannot be used, because
BREAK is disabled.)

The I/O control processor Jll CPU halts and displays an
@ symbol.

Type 17770040/

The content of the I/O control processor Control and
Status register is displayed in octal. Refer to Figure
6-2 to identify the various bits of this register.
When the Secure/Enable switch is in the ENABLE position,
the contents of the register should be lxxxxx. When in
the SECURE position, the contents should be Oxxxxx.

6-114

ADDRESS 17770040 VIA ODT

100000(8)  0 WHEN SECURE
 40000(8)  ALWAYS 0
 20000(8)  ALWAYS 0
 10000(8)  ALWAYS 0
  4000(8)  SWAP BOARD
  2000(8)  SWAP BANK
  1000(8)  ALWAYS 0
   400(8)  SELECT BT PG2
   200(8)  ENA CMEM ARB
   100(8)  ALWAYS 0
    40(8)  HI BYTE PARITY TEST
    20(8)  LO BYTE PARITY TEST
    10(8)  STATE LED
     4(8)  NON-MEMORY-ACCESS (NMA)
     2(8)  CONTROL MEMORY INTERRUPT ENABLE
     1(8)  CONTROL MEMORY LOCK CYCLE ENABLE

CX-1120A

Figure 6-2  P.ioj Control and Status Register Layout

Type a carriage return and a / (slash) to re-examine the
register.

Type a carriage return, then type a P followed by
another carriage return to restart the Offline Loader
(or the diagnostic that was interrupted).
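
The octal value that ODT prints for address 17770040 can be checked against the Figure 6-2 bit weights by hand or with a short script. The following Python sketch is only an illustration: the function names and the sample values in the usage note are assumptions, and only the bit weights come from Figure 6-2.

```python
# Bit weights (octal) transcribed from Figure 6-2. The label text for the
# top bit is illustrative; the figure only states "0 WHEN SECURE".
CSR_BITS = {
    0o100000: "ENABLE (0 when SECURE)",
    0o4000: "SWAP BOARD",
    0o2000: "SWAP BANK",
    0o400: "SELECT BT PG2",
    0o200: "ENA CMEM ARB",
    0o40: "HI BYTE PARITY TEST",
    0o20: "LO BYTE PARITY TEST",
    0o10: "STATE LED",
    0o4: "NON-MEMORY-ACCESS (NMA)",
    0o2: "CONTROL MEMORY INTERRUPT ENABLE",
    0o1: "CONTROL MEMORY LOCK CYCLE ENABLE",
}


def decode_csr(octal_text):
    """Return the names of the bits set in an octal CSR value typed by ODT."""
    value = int(octal_text, 8)
    return [name for mask, name in CSR_BITS.items() if value & mask]


def switch_position(octal_text):
    """Report the Secure/Enable switch position implied by the top bit."""
    return "ENABLE" if int(octal_text, 8) & 0o100000 else "SECURE"
```

For example, a hypothetical reading of 100200 decodes as ENABLE with ENA CMEM ARB set, while 000200 decodes as SECURE.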

6.10.6.4 Offline OCP Test State LED Check Via ODT - There are
two State LEDs in the HSC70. One is on the OCP, far left. The
other State LED is on the I/O control processor module (rightmost
module in the HSC70 card cage, fourth LED from the bottom of the
module). Both LEDs are controlled by a bit in the I/O control
processor Control and Status register. (Refer to Figure 6-2 for
a layout of this register.) To manually control the State LED,
use the following procedure:
1.

With the Secure/Enable switch in the ENABLE position,
press the terminal BREAK key.
The I/O control processor J11 CPU should halt and
display an @ symbol.

Type 17770040/
The contents of the Control and Status register are then
displayed in octal.

Use Figure 6-2 to find the octal value corresponding to
the State LED.

To light the State LED, type the octal value
corresponding to the State LED, followed by a carriage
return. To extinguish the State LED, put a zero in the
same bit position and press a carriage return.
CAUTION
Bit 7 of the I/O control processor CSR must be
set to allow the HSC70 Ks to access Control
memory. The setting of other bits in the CSR
can result in strange side-effects. Be careful
not to set any bits except the State LED bit and
leave bit 7 set when you are done.

Type a slash (/) to re-examine the contents of the I/O
control processor CSR.

To restart the Offline Loader (or the diagnostic that

was interrupted), type a carriage return, then type a P
followed by another carriage return.
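
The value to deposit can be worked out before typing it at the ODT prompt. The sketch below is a hedged illustration rather than part of the procedure: the function name and sample values are assumptions, while the State LED weight (10 octal) and the bit 7 weight (200 octal) are taken from Figure 6-2 and the CAUTION above.

```python
STATE_LED = 0o10      # State LED bit weight from Figure 6-2
ENA_CMEM_ARB = 0o200  # bit 7, which the CAUTION says must be left set


def csr_with_state_led(current_octal, lit):
    """Return the octal CSR value to type so that only the State LED changes.

    current_octal is the value ODT displayed for 17770040 (hypothetical input).
    """
    value = int(current_octal, 8)
    if lit:
        value |= STATE_LED
    else:
        value &= ~STATE_LED
    value |= ENA_CMEM_ARB  # never leave bit 7 clear
    return format(value, "06o")
```

With a hypothetical current value of 100200, lighting the LED gives 100210, and extinguishing it again gives 100200.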

CHAPTER 7
UTILITIES

7.1 INTRODUCTION
This chapter contains the information required to run the
following offline utilities: DKUTIL (Disk utility), FORMAT, VERIFY,
RXMFT (RXFORMAT utility), and VTDPY (Video Terminal Display
utility). Topics include initiating the utility, using commands,
and interpreting error messages. These HSC70 utilities are
interactive and therefore are prompt-oriented. Note that prompt
information displayed in square brackets is the default.
For information on the other HSC utilities, refer to the HSC User
Guide. Utilities described in that manual include:
o  SETSHO
o  BACKUP Package
o  DKCOPY
o  PATCH
o  COPY

7.2 OFFLINE DISK UTILITY (DKUTIL)
DKUTIL is a general utility for displaying disk structures and
disk data. Unlike other utilities, DKUTIL is a command language
interpreter. Initially, the user is prompted for the unit number
of the appropriate disk. The program then goes into command mode
prompting for a command, executing it, and then prompting for
another. Execution is terminated by CTRL C, CTRL Y, CTRL Z, or
the EXIT command.
7.2.1 DKUTIL Initialization
DKUTIL is initiated via the standard CRONIC command syntax, RUN
DKUTIL.UTL. The program prompts for the unit number of the disk
to examine:
DKUTIL-Q Enter unit number (U) [D0]?
Reply with the appropriate unit number. The first block of the
Factory Control Table (FCT) is read, if possible, and dumped in a
format similar to a VERIFY printout. The unit is brought online
with the ignore media format error modifier so drives improperly
or not completely formatted can be examined. If the FCT cannot
be read or the mode is invalid, the program prompts for the
sector size:
DKUTIL-Q Enter sector size (512/576) [512]?
The program places the unit in diagnostic mode to access the DBN
area. After the initial prompts, DKUTIL goes into command mode
and prompts for a command.
DKUTIL>
Comment lines can be entered by prefixing them with an
exclamation point (!). Entering a CTRL Z terminates the program.
Commands are executed immediately and take only the time
necessary to print the results. A CTRL Y or CTRL C at any time
aborts the program and releases the drive.
7.2.2 DKUTIL Command Syntax
The DKUTIL commands are:
o  DEFAULT
o  DISPLAY
o  DUMP
o  EXIT
o  GET
o  POP
o  PUSH
o  REVECTOR
o  SET

Commands, command options, and modifiers are recognized by any
initial substring. For example, DUMP can be entered as DUM, DU, or
D. In cases where the initial substring can indicate one of several
commands, the match depends on an order based on history and
expected frequency of usage. Thus, D specifies DUMP, DI
specifies DISPLAY, and DE specifies DEFAULT. In the following
descriptions, only the part of the command or command option in
bold print must be specified.
Some command options take optional parameters which, if omitted,
take default values.
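
The abbreviation rule can be pictured as a first-match search through an ordered command list. This Python sketch is an illustration only: the manual confirms that D, DI, and DE select DUMP, DISPLAY, and DEFAULT, and its examples use P for POP and PU for PUSH, but the rest of the ordering and the function name are assumptions.

```python
# Assumed priority order. Only the relative order of DUMP, DISPLAY, DEFAULT
# and of POP before PUSH is supported by the manual text and examples.
COMMANDS = ["DUMP", "DISPLAY", "DEFAULT", "EXIT", "GET",
            "POP", "PUSH", "REVECTOR", "SET"]


def resolve(abbrev, names=COMMANDS):
    """Return the first name that starts with abbrev, or None if none does."""
    token = abbrev.upper()
    if not token:
        return None
    for name in names:
        if name.startswith(token):
            return name
    return None
```

Because the search stops at the first match, an ambiguous prefix such as D always selects the command listed earliest.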

7.2.3 DKUTIL Command Modifiers
Modifiers, specified only for commands that allow them, can occur
anywhere after the command itself. They are preceded by a slash
(one slash for each modifier). The following are equivalent:

DUMP/NOEDC RBN 0
DUMP /NOEDC RBN 0
DUMP RBN/NOEDC 0
DUMP RBN 0/NOEDC
DUMP RBN 0 /NOEDC
Modifiers are processed left to right and applied to the current
default modifiers. The DUMP command is the exception. The
default modifiers for DUMP can be changed via the DEFAULT
command. In the following descriptions, only the portion of the
modifier in bold print needs to be specified. The initial
default modifiers for DUMP are /DATA, /EDC, and /IFERROR.
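
One way to see why the five forms above are equivalent is to strip the slash-prefixed modifiers out of the line before looking at the remaining words. The sketch below is a minimal illustration under that assumption. It is not DKUTIL's actual parser, and the function name is hypothetical.

```python
import re


def split_modifiers(line):
    """Separate /MODIFIER tokens from the command words, wherever they occur."""
    modifiers = re.findall(r"/(\w+)", line)
    words = re.sub(r"/\w+", " ", line).split()
    return words, modifiers
```

Each of the five DUMP lines shown above yields the words DUMP, RBN, 0 and the single modifier NOEDC.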
7.2.4 DKUTIL Sample Session
The following is a sample session using DKUTIL. Command input is
indicated in bold print.

HSC> RUN DX0:DKUTIL
DKUTIL-Q Enter unit number (U) [D0]?D133
Serial Number:       0000000004
Mode:                512
First Formatted:     17-Nov-1858 00:35:47.48
Date Formatted:      04-Apr-1984 00:05:09.20
Format Instance:     6
FCT:                 VALID

DKUTIL> DIS/F FCT
Factory Control Table for D133 (RA80)
Serial Number:       0000000004
Mode:                512
First Formatted:     17-Nov-1858 00:35:47.48
Date Formatted:      04-Apr-1984 00:05:09.20
Format Instance:
FCT:                 VALID
Bad PBNs in FCT:     1 (512), 0 (576)
Scratch Area Offset: 63
Size (Not Last):     417
Size (Last):         289
Flags:               000000
Format Version:
PBNs in 512 Byte Subtable (04) 244865 (LBN 237213),
DKUTIL> REV 1000
ERROR-W Bad Block Replacement (Success) at 04-Apr-1984 17:47:24.20
   Command Ref #   00000000
   RA80 Unit #     133.
   Err Seq #       6.
   Error Flags     80
   Event           0014
   Replace Flags   A400
   LBN             1000.
   Old RBN         32.
   New RBN         33.
   Cause Event     004A
ERROR-I End of error.
DKUTIL> DIS/F RCT
Revector Control Table for D133 (RA80)
Serial Number:       0000000004
Flags:               000000
LBN Being Replaced:  1000 (000000 001750)
Replacement RBN:     33 (060000 000041)
Bad RBN:             32 (060000 000040)
Cache ID:            0000000000
Cache Incarnation:
Incarnation Date:    17-Nov-1858 00:00:00.00

Bad RBN: 32,      1000 *-> 33,      25512 --> 822,
139512 --> 4500,

RCT Statistics:
1 Bad RBNs,
3 Bad LBNs,
2 Primary Revectors,
1 Non-Primary Revectors,
0 Probationary RBNs.

DKUTIL> DEF/NODATA
DKUTIL> DUMP LBN 1000
Buffer for LBN 1000 (000000 001750), MSCP Status: 000000
Error Summary          header compare
Original Error Bits    004000
Error Recovery Flags   000
Error Retry Counts     0, 1, 0
BN = 1000 (000000 001750)
ECC Symbols Corrected = 0,0
Error Recovery Command = 000
Header   001750 030000 001750 030000 001750 030000 001750 030000
EDC = 000105     Calculated EDC Difference = 000000
ECC = 000000 000000 000000 000000 000000 000000
      000000 000000 000000 000003 000000 000000
DKUTIL> DIS CHAR LBN 1000
Characteristics for LBN 1000 (000000 001750)
Cylinder 1, Group 0, Track 4, Position 8
PBN 1032 (000000 002010)
primary RBN 32 (060000 000040) in RCT Block 3 at Offset 128
DKUTIL> DIS CHAR DISK
Drive Characteristics for D133
Type:           RA80 (576 byte mode allowed)
Media:          FIXED
Cylinders:      275 LBN, 2 XBN, 2 DBN
Geometry:       14 tracks/group, 2 groups/cylinder, 28 tracks/cylinder
                31 LBNs/track, 1 RBNs/track, 32 sectors/track, 32 XBNs
                896 XBNs/cylinder, 868 LBNs/cylinder, 28 RBNs/cylinder
Group Offset:   16 (LBN), 16 (XBN)
LBNs:           237212 (host), 238700 (total)
RBNs:           7700
XBNs:           1792
DBNs:           1344 (read/write), 448 (read only)
PBNs:           249984
RCT:            465 (size), 63 (non-pad), 4 (copies)
FCT:            480 (size), 63 (non-pad), 4 (copies)
SDI Version:
Transfer Rate:  97
Timeouts:       3 (short), 7 (long)
Retry Limit:
Error Recover:  0 command levels
ECC Threshold:  2 symbols
Revision:       10 (microcode), 0 (hardware)
Drive ID:       0A7A00000000
Drive Type ID:  1
DBN RO Groups:  1
Preamble Size:  11 (data), 4 (header)
DKUTIL> DUMP RCT BLOCK 3
RCT Block 3, Copy 1
Buffer for LBN 237214 (000003 117236), MSCP Status: 000000
Data    000000 000000 000000 000000
+16     000000 000000 000000 000000
+32     000000 000000 000000 000000
+48     000000 000000 000000 000000
+64     000000 000000 000000 000000
+80     000000 000000 000000 000000
+96     000000 000000 000000 000000
+112    000000 000000 000000 000000
+128    000000 040000 001750 030000
+144    000000 000000 000000 000000
+160    000000 000000 000000 000000
+176    000000 000000 000000 000000
+192    000000 000000 000000 000000
+208    000000 000000 000000 000000
+224    000000 000000 000000 000000
+240    000000 000000 000000 000000
+256    000000 000000 000000 000000
+272    000000 000000 000000 000000
+288    000000 000000 000000 000000
+304    000000 000000 000000 000000
+320    000000 000000 000000 000000
+336    000000 000000 000000 000000
+352    000000 000000 000000 000000
+368    000000 000000 000000 000000
+384    000000 000000 000000 000000
+400    000000 000000 000000 000000
+416    000000 000000 000000 000000
+432    000000 000000 000000 000000
+448    000000 000000 000000 000000
+464    000000 000000 000000 000000
+480    000000 000000 000000 000000
+496    000000 000000 000000 000000
EDC = 023277     Calculated EDC Difference = 000000
DKUTIL> EXIT
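
The DISPLAY CHARACTERISTICS LBN output in the sample session can be reproduced from the geometry printed by DISPLAY CHARACTERISTICS DISK. The Python sketch below is a worked check for the sample RA80 values only, not the algorithm DKUTIL uses. It assumes one RBN at the end of every track, LBN cylinders starting at PBN 0, 4-byte RCT entries beginning in RCT block 3, and no group offset (true for group 0 only). All names are hypothetical.

```python
# Geometry taken from the sample DISPLAY CHARACTERISTICS DISK output.
LBNS_PER_TRACK = 31
SECTORS_PER_TRACK = 32
TRACKS_PER_GROUP = 14
TRACKS_PER_CYLINDER = 28
SECTOR_BYTES = 512
RCT_ENTRY_BYTES = 4          # assumption: two 16-bit words per RCT entry
FIRST_RCT_ENTRY_BLOCK = 3    # assumption fitted to the sample output


def lbn_characteristics(lbn):
    """Recompute the sample LBN characteristics (valid for group 0 only)."""
    track, position = divmod(lbn, LBNS_PER_TRACK)
    cylinder, track_in_cylinder = divmod(track, TRACKS_PER_CYLINDER)
    group, track_in_group = divmod(track_in_cylinder, TRACKS_PER_GROUP)
    primary_rbn = track                      # one RBN per track
    pbn = track * SECTORS_PER_TRACK + position
    entries_per_block = SECTOR_BYTES // RCT_ENTRY_BYTES
    return {
        "cylinder": cylinder,
        "group": group,
        "track": track_in_group,
        "position": position,
        "pbn": pbn,
        "primary_rbn": primary_rbn,
        "rct_block": FIRST_RCT_ENTRY_BLOCK + primary_rbn // entries_per_block,
        "rct_offset": (primary_rbn % entries_per_block) * RCT_ENTRY_BYTES,
    }
```

For LBN 1000 this gives cylinder 1, group 0, track 4, position 8, PBN 1032, and primary RBN 32 in RCT block 3 at offset 128, matching the session output.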

7.2.5 DKUTIL Command Descriptions
Descriptions for the individual DKUTIL commands are found in
Table 7-1. In the syntax and modifier specifications, the
characters displayed in bold print are the minimum allowed
abbreviations. Command options are shown by separate lines in
the syntax specification. Parameters are indicated in the syntax
by braces ({}) and lower case. Options indicated by brackets
([]) can be omitted.
Table 7-1  DKUTIL Command Summary

Command                                      Description

DEFAULT                                      Change default modifiers
                                             for DUMP command.

DISPLAY                                      Display characteristics,
  DISPLAY ALL                                error history, RCT, or
  DISPLAY CHARACTERISTICS DBN {block}        FCT.
  DISPLAY CHARACTERISTICS DISK
  DISPLAY CHARACTERISTICS LBN {block}
  DISPLAY CHARACTERISTICS PBN {block}
  DISPLAY CHARACTERISTICS RBN {block}
  DISPLAY CHARACTERISTICS XBN {block}
  DISPLAY ERRORS
  DISPLAY FCT
  DISPLAY RCT

DUMP                                         Dump given block or
  DUMP [BUFFER]                              table of blocks.
  DUMP DBN [{block}]
  DUMP FCT [BLOCK {number}] [COPY {copy}]
  DUMP LBN [{block}]
  DUMP RBN [{block}]
  DUMP RCT [BLOCK {number}] [COPY {copy}]
  DUMP XBN [{block}]

EXIT                                         Terminate execution of
                                             the program.

GET                                          Change the current
  GET [{drive}]                              drive.

POP                                          Restore save buffer to
                                             current buffer.

PUSH                                         Save current buffer in
                                             save buffer.

REVECTOR                                     Force bad block
                                             replacement for the
                                             given LBN.

SET                                          Change various program
  SET [SIZE {size}]                          parameters.
7.2.5.1 DKUTIL DEFAULT Command - The DEFAULT command is outlined
as follows:
Purpose:

To change the default modifiers for the DUMP command.

Syntax:

DEFAULT

Parameters:

None

Modifiers:

Shown in the following list.

/IFERROR (NOIFERROR) (defaults ON) - dumps the error,
header, and ECC fields in the buffer if an error occurs
when reading the block. When this modifier is used in
conjunction with the /RAW modifier, the error must occur
on the reread of the block with the header code
extracted from the first read.

/ERRORS (NOERRORS) (defaults OFF) - dumps the error
fields in the buffer.

/EDC (NOEDC) (defaults ON) - dumps the EDC and
calculated EDC fields in the buffer.

/ECC (NOECC) (defaults OFF) - dumps the ECC fields in
the buffer.

/DATA (NODATA) (defaults ON) - displays the data in the
buffer unless the /NZ modifier was also specified and
the data is all zero.

/HEADERS (NOHEADERS) (defaults OFF) - displays the
header fields in the buffer.

/ALL (NONE) (same as /ERRORS/EDC/ECC/DATA/HEADERS) -
requests all fields be displayed. Its opposite, /NONE,
requests no fields be displayed. When using the /NONE
qualifier, only the MSCP status line prints.

/RAW (NORAW) - allows reading the original LBN that was
revectored rather than the RBN that would be read
without the /RAW qualifier. /RAW only affects
revectored (primary or nonprimary) LBNs. If /IFERROR is
in effect, this modifier also applies only to dumping a
revectored LBN.

/NZ (NONZ) (defaults OFF) - prevents the data from being
displayed if it is all zero. Instead, a single line
indicating the data is zero is printed. It has no
effect if the /DATA modifier is not specified or if it
is defaulted OFF.

/BBR (NOBBR) (defaults OFF) - permits bad block
replacement, which is usually inhibited when a block is
accessed. If this modifier is specified, bad block
replacement can occur. It only occurs, however,
if the error recovery code detects the block being
accessed as bad and the block is an LBN in the host
area.

/ORIGINAL (NOORIGINAL) (defaults OFF) - saves the first
data seen for display. When a block is accessed for
dumping, the data is seen twice by the program if an
error occurs. It is seen first just after the K detects
the error and sends it to error recovery. It is seen
again after error recovery takes place and the data has
been corrected or reread. Usually, the data is saved
for displaying when it is last seen.

Usage: The modifiers specified are applied to the current
default modifiers for the DUMP command. The result becomes the
new default.
Examples:

DEFAULT/NONE, DE/A/OR/NZ, and DEF/RAW/NODATA
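
The way modifiers combine with the current defaults can be sketched as a left-to-right update of a set of enabled switches. This is a hedged illustration, not DKUTIL's implementation. It uses full modifier names only, and it assumes that /NONE clears exactly the five fields that /ALL sets and leaves the other switches alone.

```python
FIELDS = {"ERRORS", "EDC", "ECC", "DATA", "HEADERS"}   # what /ALL turns on
KNOWN = FIELDS | {"IFERROR", "RAW", "NZ", "BBR", "ORIGINAL"}
INITIAL_DUMP_DEFAULTS = {"DATA", "EDC", "IFERROR"}     # stated in Section 7.2.3


def apply_modifiers(current, modifiers):
    """Apply modifiers left to right to a set of enabled switches."""
    result = set(current)
    for mod in modifiers:
        name = mod.upper()
        if name == "ALL":
            result |= FIELDS
        elif name == "NONE":
            result -= FIELDS            # assumption: other switches untouched
        elif name in KNOWN:
            result.add(name)
        elif name.startswith("NO") and name[2:] in KNOWN:
            result.discard(name[2:])
        else:
            raise ValueError(f'"{mod}" is not a recognized modifier')
    return result
```

Under these assumptions, DEF/RAW/NODATA applied to the initial defaults leaves EDC, IFERROR, and RAW enabled.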

7.2.5.2 DKUTIL DISPLAY Command - The DISPLAY command is outlined
as follows:
Purpose:
To display the disk characteristics, the
characteristics of a given block, the error history in the drive,
the FCT, and/or the RCT.
Syntax:
DISPLAY ALL
DISPLAY CHARACTERISTICS DBN {block}
DISPLAY CHARACTERISTICS DISK
DISPLAY CHARACTERISTICS LBN {block}
DISPLAY CHARACTERISTICS PBN {block}
DISPLAY CHARACTERISTICS RBN {block}
DISPLAY CHARACTERISTICS XBN {block}
DISPLAY ERRORS
DISPLAY FCT
DISPLAY RCT
Parameters: Block is a number specifying the DBN, LBN, PBN, RBN,
or XBN whose characteristics are displayed. The default radix is
decimal, and can be changed to octal by prefixing the number with
a zero.
Modifiers:
/FULL - displays all defined fields in xCT block 0.
/FULL applies only to the RCT and FCT command options.
For the RCT option, the bad block replacement and write
back caching fields in RCT block 0 are only displayed if
the appropriate flags in the flags field are set. These
flags indicate they are currently in use (BBR or caching
in progress). This modifier forces all fields to be
displayed regardless of the flags settings. For the FCT
option, the number of bad PBNs field is normally
displayed only if the FCT is valid. Also, the scratch
area parameters, format version, and format flags are
normally not displayed. This modifier forces all fields
in FCT block 0 to be displayed.
/NOITEMS - does not display the individual items in the
FCT or RCT. It applies only to the FCT and RCT command
options. If given, only the block 0 information is
displayed.
Usage:
DISPLAY ALL - displays FCT, RCT, and error history.
Because the error history in the drive is dumped by this
option, it should not be used for RA60 drives. Using
the SDI command to read RA60 error history is illegal
and causes the drive to become inoperative.
DISPLAY CHARACTERISTICS DISK - displays the drive type,
media, cylinders, geometry, group offsets, numbers of
LBNs, number of RBNs, number of XBNs, numbers of DBNs,
number of PBNs, RCT parameters, FCT parameters, SDI
version, transfer rate, SDI timeouts, SDI retry limit,
error recovery command levels, ECC threshold, revision
levels, drive ID, drive type ID, DBN Read/Only groups,
and preamble sizes.
DISPLAY CHARACTERISTICS xBN {block} - displays the
characteristics of the given block. For DBNs and XBNs,
these are the block number in decimal and octal,
cylinder, group, track, position, and PBN in decimal and
octal. For RBNs, the RCT block number and offset are
also displayed. For LBNs, the primary RBN number and
its RCT block number and offset are also displayed. For
PBNs, the display depends on the type of block: DBN,
LBN, RBN, or XBN.
DISPLAY ERRORS - reads the error history in the drive.
The error history in the drive is read from region 2,
offset 0, and dumped in hexadecimal. This option should
not be used for RA60 drives because it causes them to
become inoperative.
DISPLAY FCT - displays the information in FCT block 0.
Certain fields are not displayed unless the /FULL
modifier is given. The list of bad PBNs is displayed
unless the /NOITEMS modifier is given. For each item in
the list, the header bits, PBN number, type (DBN, LBN,
RBN, or XBN), and xBN number are displayed.
DISPLAY RCT - displays the information in RCT block 0.
Certain fields are not displayed unless the /FULL
modifier is given. The list of revectors, bad RBNs, and
probationary RBNs are displayed unless the /NOITEMS
modifier is given. For bad and probationary RBNs, just
the RBN number is displayed (in decimal). For
revectors, the LBN number and RBN number to which it is
revectored are displayed (in decimal). A primary
revector is distinguished by the character sequence
"-->". A nonprimary revector is distinguished by the
character sequence "*->".
Examples: DISPLAY/FULL ALL, DI/F A, DI C D, DIS CHAR
LBN 1000, DI/NOI RCT
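
The radix rule for block numbers (decimal by default, octal when the number starts with a zero) can be sketched as follows. The function name is hypothetical. The two error strings echo the "Invalid decimal number" and "Invalid octal number" messages listed in Table 7-2, but the exact conditions under which DKUTIL issues each one are an assumption.

```python
def parse_block_number(text):
    """Parse a block number: leading zero means octal, otherwise decimal."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("Invalid decimal number.")
    if text.startswith("0"):
        try:
            return int(text, 8)
        except ValueError:
            # Digits 8 or 9 appeared in a number that began with a zero.
            raise ValueError("Invalid octal number.") from None
    return int(text, 10)
```

For instance, 1000 and 01750 both name LBN 1000, which agrees with the "1000 (000000 001750)" pairs printed in the sample session.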

7.2.5.3 DKUTIL DUMP Command - The DUMP command is outlined as
follows:
Purpose:

To dump the given block or table of blocks.

Syntax:
DUMP [BUFFER]
DUMP DBN [{block}]
DUMP FCT [BLOCK {number}] [COPY {copy}]
DUMP LBN [{block}]
DUMP RBN [{block}]
DUMP RCT [BLOCK {number}] [COPY {copy}]
DUMP XBN [{block}]
Parameters: Block is a number specifying the DBN, LBN, RBN, or
XBN to be dumped. The default radix is decimal. It can be
changed to octal by prefixing the number with a zero.
Number is the relative block number in the FCT or RCT to be
dumped. The default radix is decimal and can be changed to octal
by prefixing the number with a zero. The value must be in the
range 1 through nonpad FCT or RCT size. That is, the first block
is number 1 (not 0) and the block must lie in the nonpad area.
Copy specifies which copy of the given block in the FCT or RCT is
to be dumped. The first copy is number 1. The value must not
exceed the number of copies.
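
The range rules for BLOCK and COPY can be expressed as a simple check. In this sketch the function name is hypothetical and the message text follows the "n is an invalid par number; maximum is n" entry in Table 7-2. The limits used in the usage note (63 nonpad blocks, 4 copies) come from the sample RA80 session and differ on other drives.

```python
def check_table_index(par, n, maximum):
    """Return None if 1 <= n <= maximum, else a Table 7-2 style message.

    par is "BLOCK" or "COPY"; maximum is the nonpad size or the copy count.
    """
    if 1 <= n <= maximum:
        return None
    return f"DKUTIL-E {n} is an invalid {par} number; maximum is {maximum}."
```

On the sample drive, DUMP RCT BLOCK 3 COPY 4 passes both checks, while BLOCK 64 or BLOCK 0 would be rejected.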
Modifiers:
/IFERROR (NOIFERROR) - (defaults ON) dumps the error,
header, and ECC fields in the buffer when an error
occurs while reading the block. When used in
conjunction with the /RAW modifier, the error must occur
on the read of the LBN (reread) with the header code
extracted from the RBN (first read). Refer to Section
7.2.5.1.
/ERRORS (NOERRORS) - (defaults OFF) dumps the error
fields in the buffer.
/EDC (NOEDC) - (defaults ON) dumps the EDC and
calculated EDC fields in the buffer.
/ECC (NOECC) - (defaults OFF) dumps the ECC fields in
the buffer.
/DATA (NODATA) - (defaults ON) displays the data in the
buffer unless the /NZ modifier was also specified.
/HEADERS (NOHEADERS) - (defaults OFF) displays the
header fields in the buffer.
/ALL (NONE) - is the same as
/ERRORS/EDC/ECC/DATA/HEADERS. It requests display of
all fields.
Its opposite, /NONE, requests display of no
fields. When using the /NONE qualifier, only the MSCP
status line prints.
/RAW (NORAW) - allows a read of the original revectored
LBN (rather than the RBN that would be read without the
/RAW qualifier). /RAW only affects revectored (primary
or non-primary) LBNs.
If in effect, the /IFERROR
modifier also applies only to dumping a revectored LBN.
/NZ (NONZ) - prevents data from being displayed when it
is all zeroes. Instead, a single line prints indicating
the data is zeroes. /NZ has no effect if the /DATA
modifier is not specified (or is defaulted OFF).
/BBR (NOBBR) - (defaults OFF) permits bad block
replacement. Normally, bad block replacement is
inhibited when a block is accessed. BBR occurs if the
block being accessed is detected as bad by the error
recovery code and is an LBN in the host area.
/ORIGINAL (NOORIGINAL) - saves the first data seen for
display. When a block is accessed for dumping, the data
is seen twice by the program when an error occurs.
It
is seen first just after the K detects the error and
sends it to error recovery.
It is seen again after
error recovery takes place and the data has been
corrected or reread. Normally, the data is saved for
displaying when it is last seen.

Usage:
DUMP [BUFFER]
The current buffer is dumped subject to the given
modifiers. If there is no current buffer, the error
message "There is no buffer to dump" is printed (see
Table 7-2).
DUMP xBN [{block}]
The specified DBN, LBN, RBN, or XBN is read in and
dumped subject to the given modifiers. If the block
number is not specified, it defaults to zero.
DUMP xCT [BLOCK {number}] [COPY {copy}]
If a BLOCK number is given, that block in the FCT or RCT
is read in and dumped. If none is specified, every
block in the nonpad area of the FCT or RCT is read in
and dumped. If COPY is not specified, it defaults to
copy 1.
Examples: DUMP RCT BLOCK 3 COPY 4, DU/NZ RCT C 2, DU LBN 1000, D
F B 2, D X, D/DATA
7.2.5.4 DKUTIL EXIT Command - The EXIT command is outlined as
follows:
Purpose:

To terminate execution of the program.

Syntax:

EXIT

Parameters:

None

Modifiers:

None

Usage: The current drive is released, all resources are
returned, and the program exits.
Examples:

EXIT, E

7.2.5.5 DKUTIL GET Command - The GET command is outlined as
follows:
Purpose:

To change the current drive.

Syntax:

GET [{drive}]

Parameters: Drive is a valid drive unit specification of the
form Dnnn. If this parameter is omitted, GET defaults to D000
(unit 0).
Modifiers:
/NOIMF - brings the drive online without the IMF MSCP
modifier. By default, a new drive is brought online
with the IMF (MO.IMF) MSCP modifier, which inhibits the
reading of FCT block 0 to determine the mode and the
reading and writing of RCT block 0 to verify the RCT is
sane. If /NOIMF is specified, the IMF modifier is not
used when the drive is brought online and these actions
take place.
/WP - brings the drive online with the MSCP SET WRITE
PROTECT modifier (MO.SWP) and WRITE PROTECT unit flag
(UF.WPS). The drive is then software or volume
write-protected.
/NOWP - brings the drive online with the MSCP SET WRITE
PROTECT modifier. The drive is not software
write-protected.
Usage: The current drive is released. The new drive is acquired
and then brought online with the requested modifiers and unit
flags. If the drive is nonexistent, in use, or inoperative, the
program prompts for another unit. The modifiers cannot be
changed for this other unit. If the mode word in FCT block 0 is
invalid or all copies of FCT block 0 are bad, the program prompts
for the sector size to use.
Examples:

GET D133, G/WP D64, G

7.2.5.6 DKUTIL POP Command - The POP command is outlined as
follows:
Purpose:

To restore the data in the current buffer from the save
buffer.

Syntax:

POP

Parameters:

None

Modifiers:

None

Usage: The data in the save buffer is restored to the current
buffer. The data in the current buffer is lost.
Examples:

POP, P

7.2.5.7 DKUTIL PUSH Command - The PUSH command is outlined as
follows:
Purpose:

To save the data in the current buffer in the save
buffer.

Syntax:

PUSH

Parameters:

None

Modifiers:

None

Usage: The data in the current buffer is saved in the save
buffer. The data in the save buffer is lost.
Examples:

PUSH, PU

7.2.5.8 DKUTIL REVECTOR Command - The REVECTOR command is
outlined as follows:
Purpose:

To force bad block replacement to occur for a given
LBN.

Syntax:

REVECTOR {block}

Parameters: Block is a number specifying the LBN to be replaced.
The default radix is decimal. It can be changed to octal by
prefixing the number with a zero.
Modifiers:

None

Usage: The specified LBN is sent to the bad block replacement
module to be revectored. If it is not a valid LBN or is in the RCT,
the revector fails, and an error message prints. Otherwise, the
result of the replace attempt shows in the error log produced (if
the appropriate message level (INFO) is enabled). The data
in the replacement RBN is read from the specified LBN.
Examples:

REVECTOR 1000, R 100

7.2.5.9 DKUTIL SET Command - The SET command is outlined as
follows:
Purpose:

To change various program parameters.

Syntax:

SET [SIZE {size}]

Parameters: The size parameter specifies the new sector size to
be used for the current drive. It must be either 512 or 576.
Modifier:

/SIZE {size}

The sector size is changed to the given value and the disk
parameters are recomputed. This new sector size is used when
doing I/O to the LBN area and is also reflected in the parameters
printed by the DISPLAY CHARACTERISTICS DISK command.

Examples:

SET SIZE 576, S S 512

7.2.6 DKUTIL Error Messages
Table 7-2 contains a list of error and information messages
printed out by DKUTIL. These messages are arranged
alphabetically.
7.2.6.1 DKUTIL Error Message Variables - Certain portions of the
error messages are variable and are shown in bold print. The
meanings of these variables are as follows:
n       = a decimal number
par     = BLOCK or COPY
parm    = the part of the command in error (modifier, etc.)
status  = MSCP status (an octal number)
text    = the actual text in error
xBN     = DBN, LBN, etc.
xCT     = FCT or RCT

7.2.6.2 DKUTIL Error Message Severity Levels - DKUTIL error
messages conform to the HSC utility error message format. In
each case, the utility name at the start of the message is
followed by a letter indicating the severity level of the
message. These are defined as follows:
o  E = Error
o  F = Fatal
o  I = Information
o  S = Success

Table 7-2  DKUTIL Error Messages

Error Message

Explanation

DKUTIL-S CTRL/Y or CTRL/C
Abort!

This termination message prints if
you abort DKUTIL by typing CTRL-C
or CTRL-Y.

DKUTIL-F Insufficient
resources to RUN!

This message prints if DKUTIL
cannot acquire the necessary
resources to run or if the disk
functional code is not loaded. The
program terminates after this
message is printed.

DKUTIL-F Drive went OFFLINE!

This message prints if the selected
unit goes offline while DKUTIL is
running. The program terminates
after this message is printed.

DKUTIL-F I/O request was
rejected!

This message prints if the
diagnostic interface (DDUSUB)
rejects a request to start an I/O
operation. It indicates a bug in
DKUTIL and should be reported to
Field Service Support. The program
terminates after this message is
printed.

DKUTIL-E Illegal response to
start-up question.

This error message prints if you
enter an invalid response to a
start-up question or to a prompt
for the GET command. The program
reprompts with the same question.

DKUTIL-E Nonexistant unit
number.

This error message prints if the
unit number entered does not
correspond to any known unit. The
program reprompts for the unit
number.

DKUTIL-E Unit is not
available.

This error message prints if the
unit requested is unavailable. The
unit may be in use by a host or
another diagnostic or it may be
inoperative. The program reprompts
for another unit.

DKUTIL-E Cannot bring unit
ONLINE.

This error message prints if the
requested unit is available, but
the ONLINE command failed. The unit
is released, and the program
reprompts for another unit.

DKUTIL-E Invalid decimal
number.

This error message prints if you
entered an invalid decimal number
in a command line.

DKUTIL-E Invalid octal
number.

This error message is printed if
the user entered an invalid octal
number in a command line.

DKUTIL-E Missing parameter.

This error message prints if a
command line is entered with a
required parameter missing.

DKUTIL-E There is no buffer
to dump.

This error message prints if the
DUMP BUFFER command is entered, and
there is no current buffer. This
can only happen if a drive has just
been selected.

DKUTIL-E Missing modifier
(only "/" was specified).

This error message prints if a
command line is entered with a
slash (/) followed by a blank or is
entered at the end of the line. A
modifier is expected, but is
missing.

DKUTIL-E SDI command was
unsuccessful.

This error message prints when an
SDI command is rejected by the
drive. A DISPLAY ERRORS command for
a RA60 drive always generates this
message.

DKUTIL-E n is an invalid par
number; maximum is n.

This error message prints if an
out-of-range number is entered for
a BLOCK or COPY value for the DUMP
command.

DKUTIL-E "text" is an
invalid parm.

This generic error message prints
when an invalid command, invalid
command option, invalid modifier,
invalid block type, or invalid SET
option is specified in a command
line.

DKUTIL-E Invalid block
number for xBN space.

This error message is printed if
the block number specified for a
DISPLAY CHARACTERISTICS xBN command
is out-of-range for the given
space.

DKUTIL-E Copy n of xCT Block
n (xBN n) is bad.

This error message prints when FCT
or RCT blocks cannot be read
correctly with error recovery. It
occurs when the FCT or RCT is being
read just after a drive has been
selected. It also occurs when the
DISPLAY FCT or DISPLAY RCT command
is being used.

DKUTIL-E All copies of
xCT Block n are bad.

This error message prints when all
copies of FCT or RCT blocks are
bad. It occurs when the FCT or RCT
is being read just after a drive
has been selected. It also occurs
when the DISPLAY FCT or DISPLAY RCT
command is being used.

DKUTIL-E Invalid sector
size; only 512 and 576 are
legal.

This error message prints if the
sector size entered for the SET
SIZE command is other than 512 or
576.

DKUTIL-E Revector for LBN n
failed, MSCP Status: status.

This error message prints if a
revector (using the REVECTOR
command) fails.

7.3 OFFLINE DISK VERIFIER UTILITY (VERIFY)
VERIFY is a utility that checks the integrity of the disk
architectural structure. This utility is a tool designed for
DIGITAL support personnel to check a disk to ensure it conforms
to the DIGITAL Standard Disk format.
VERIFY has many messages that may print during the course of a
disk structure verification. These messages have significance
only when VERIFY reports the drive is bad. At the end of its
run, VERIFY reports the drive is either OK or BAD.
NOTE
The VERIFY utility only reads the disk. It does
not destroy user data and does not perform Bad
Block Replacement.
The following steps describe the process by which this utility
verifies a disk.
1.

The first block of the Factory Control Table (FCT) is
read to determine how the disk is formatted. The serial
number, format mode, date first formatted, date last
formatted, format instance, state of the FCT, number of
bad PBNs, scratch area parameters (offset, size of not
last, and size of last), flags, and format version are
printed.

The first block of the Revector Control Table (RCT) is
then read. The information in it is printed, including
the serial number, flags, bad block replacement
variables (LBN being replaced, replacement RBN, and bad
RBN) , and cache variables (IO, incarnation, and
incarnation date).

All copies of the first two blocks in the ReT (used by
bad block replacement) are read and compared.
Discrepancies or bad blocks are reported.

7-20

4. All copies of the rest of the RCT are read and compared.
Any discrepancies or bad blocks are reported. The
information about revectors and bad RBNs is dumped. A
summary of the number of bad blocks and revectors by
type is printed.

5. All copies of FCT block 0 are read and compared, and bad
blocks or discrepancies are reported.

6. All copies of the appropriate FCT subtable are read (if
not null) and bad blocks or discrepancies are reported.

7. The list of bad PBNs is printed. Each entry is printed
with the header bits, PBN number, and xBN number (in
parentheses) as separate fields. If a bad PBN is found
which should be in the RCT but is not, the xBN field is
printed in brackets instead of parentheses. If any such
PBNs are found, an error message indicating the total
number is printed at the end of the bad PBN list.

8. After reading and dumping the FCT, a quick scan of DBN
space is done. Every block is accessed only once.
Counts of various detected errors are recorded for a
summary printed at the end of the scan. If more than
nine positioner errors are detected, a message is
printed suggesting DBN space be reformatted. If more
than nine EDC errors are detected, a message is printed
suggesting the INITIAL WRITE option should be used when
running ILEXER.

9. All LBN space up to the RCT and all RBNs are scanned.
Any block with an error is reread five more times to
determine the type of error. Information about bad
blocks and revectors collected in this phase is compared
with information collected from reading the RCT. During
the scan, four error classes can be found:

o Structure errors
o Permanent recoverable errors
o Permanent unrecoverable errors
o Transient errors

Structure and permanent unrecoverable errors are
considered inconsistencies and are always reported.
Permanent recoverable errors, usually ECC errors, are
reported if requested. During the five rereads of a
block with an error, a block read at least once with no
detected error is considered to have a transient error.
Transient errors are reported if you request them.
10. At the end of the scan, certain other errors are
reported. Some errors can only be determined at that
time by examining information collected during the scan.

11. Finally, a summary, by type, of the errors detected and
certain other information is printed. If no
inconsistencies were discovered, a message saying the
drive is OK prints. Otherwise, the message indicates
the number of inconsistencies.
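
The reread rule used in the LBN and RBN scan can be illustrated with a short sketch. This is not VERIFY's code. The function name and the "clean", "transient", and "solid" labels are hypothetical, and the sketch assumes only what the steps above state: a block that shows an error is read six times in total (the scan read plus five rereads), and any error-free read makes the error transient.

```python
from typing import Sequence


def classify_block(read_had_error: Sequence[bool]) -> str:
    """Classify one block from its initial read plus any rereads.

    read_had_error[i] is True when read number i detected an error.
    The labels returned here are illustrative names, not VERIFY output.
    """
    if not read_had_error:
        raise ValueError("at least one read is required")
    error_reads = sum(1 for had_error in read_had_error if had_error)
    if error_reads == 0:
        return "clean"
    if error_reads < len(read_had_error):
        # At least one read came back with no detected error.
        return "transient"
    # Every read detected an error.
    return "solid"


# One scan read plus five rereads, one of which was error-free.
print(classify_block([True, True, False, True, True, True]))  # transient
```

A solid error would then be examined further (header, ECC, EDC) to decide whether it is recoverable; that step is left out of the sketch.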

7.3.1 VERIFY Initiation
VERIFY is initiated via the standard CRONIC command syntax, RUN
DX0:VERIFY.UTL. The following prompt asks for the unit number of
the disk to verify.
VERIFY-Q Enter unit number to verify (U) [D0]?
It then prompts to determine if the unit was recently formatted:
VERIFY-Q Was this unit just FORMATted (Y/N) [Y]?
This question is asked because certain errors are classed as
inconsistencies only when the unit has not been subject to bad
block replacement following the execution of FORMAT. The next
prompt determines whether errors not considered inconsistencies
should be reported:
VERIFY-Q Print informational (non-warning) messages (Y/N) [N]?
If you reply N to this question, only inconsistencies are
reported. If your reply is Y, you are further prompted to decide
whether transient errors should be reported:
VERIFY-Q Report transient errors by block (Y/N) [N]?
Regardless of the response to this question, the number of
transient errors is printed in the final summary. The response
to this question determines whether or not individual blocks with
transient errors should be reported.
You can enter CTRL Z at any prompt to take the default response
(shown in square brackets) for all remaining prompts. Also, the
responses to subsequent questions can be supplied at any question
by typing them separated with commas. For example, if unit D133
(which was just formatted) is to be verified and all options are
to be selected, the user could type D133,,Y,Y at the first prompt.
If the unit does not exist or cannot be accessed, you are
notified and reprompted for another unit number. If the unit can
be accessed, it is acquired and brought online. VERIFY runs to
completion, unless aborted by CTRL Y or CTRL C.
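
As an illustration of the comma-separated response convention, the sketch below expands one typed line into answers for the four VERIFY prompts. It is a model of the behavior described above, not the utility's parser. The function name is hypothetical, and two points are assumptions: an empty field between commas takes the prompt's default, and answers not supplied on the line are still prompted for (represented here as None).

```python
from typing import List, Optional

# Defaults taken from the square brackets in the VERIFY prompts.
VERIFY_DEFAULTS = ["D0", "Y", "N", "N"]


def expand_responses(typed: str, defaults: List[str]) -> List[Optional[str]]:
    """Map one comma-separated response line onto a list of prompts."""
    fields = [field.strip() for field in typed.split(",")]
    if len(fields) > len(defaults):
        raise ValueError("more responses than prompts")
    answers: List[Optional[str]] = []
    for index, default in enumerate(defaults):
        if index >= len(fields):
            answers.append(None)       # not supplied: still to be prompted
        elif fields[index] == "":
            answers.append(default)    # empty field: take the default
        else:
            answers.append(fields[index])
    return answers


print(expand_responses("D133,,Y,Y", VERIFY_DEFAULTS))
# ['D133', 'Y', 'Y', 'Y']
```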

7.3.2 VERIFY Sample Session
The following is a sample session using VERIFY. User input is in
bold print.

HSC50> RUN DX0:VERIFY
VERIFY-Q Enter unit number to verify (U) [D0]?D133
VERIFY-Q Was this unit just FORMATted (Y/N) [Y]?
VERIFY-Q Print informational (non-warning) messages (Y/N) [N]?Y
VERIFY-Q Report transient errors by block (Y/N) [N]?Y

*** FCT Block 0 Information
Serial Number:        0000000004
Mode:                 512
First Formatted:      17-Nov-1858 00:35:47.48
Date Formatted:       10-Apr-1984 00:05:09.20
Format Instance:      6
FCT:                  VALID
Bad PBNs in FCT:      1 (512), 0 (576)

Scratch Area Offset:  63
Size (Not Last):      417
Size (Last):          289
Flags:                000000
Format Version:

*** RCT Block 0 Information
Serial Number:        0000000004
Flags:                000000

LBN Being Replaced:   0 (000000 000000)
Replacement RBN:      0 (060000 000000)
Bad RBN:              0 (060000 000000)

Cache ID:             0000000000
Cache Incarnation:
Incarnation Date:     17-Nov-1858 00:00:00.00

*** Revector Control Table for D133
VERIFY-I Copy 1 of RCT Block 2 (LBN 237213.) is bad.
25512 --> 822, 139512 --> 4500,
RCT Statistics:
0 Bad RBNs,
2 Bad LBNs,
2 Primary Revectors,
0 Non-primary Revectors,
0 Probationary RBNs,
1 Bad RCT Blocks,
1 Bad First Copy RCT Blocks.

*** Factory Control Table for D133
PBNs in 512 Byte Subtable
(04) 244865 (LBN 237213),

*** Quick Scan of DBN Area
Statistics:
0 total blocks with any error.

*** Scan of LBN Area
VERIFY-I LBN 26003. has a 1 symbol correctable ECC error.
VERIFY-I RBN 2471. has a 1 symbol correctable ECC error.
VERIFY-I LBN 139962. has a 1 symbol correctable ECC error.
Statistics:
3 total ECC symbols corrected,
3 blocks with 1 symbol ECC errors,
2 revectors verified,
5 total blocks with any error.
VERIFY-I Drive is OK.
The preceding example is the output of an actual session for an
RA80 disk with one bad PBN in the FCT. Notice this PBN
corresponds to copy 1 of RCT block 2. RCT block 2 is used to
store the copy of the user data during bad block replacement. In
its scan of the RCT, VERIFY noticed this block was bad and
printed an informational message indicating that. If
informational messages had been suppressed through use of the
SETSHO utility, this information would show only in the summary
of the RCT dump.
In the example, VERIFY also printed informational messages for
the three blocks it found with solid 1-symbol correctable ECC
errors. If informational messages had been suppressed, these
messages would not have printed. However, the number of such
blocks would show up in the summary statistics.
No transient errors were detected and, therefore, no count is
reported in the summary statistics. Also note that, although no
messages were printed for them, the two revectors in the RCT were
verified (as indicated in the summary statistics). Note the
unusual 17-Nov-1858 dates in the output. This date is the
default when no date is supplied by a host or a human during
manufacturing format. If structure inconsistencies had been
found, some of the following error messages would also print.
7.3.3 VERIFY Errors And Information Messages
This section describes error and information messages that may be
printed out by VERIFY. Error messages are arranged
alphabetically according to the actual message.

7.3.3.1 VERIFY Variable Output Fields - Error message fields
with variable output are shown in bold print. Definitions for
these fields are:
xCT = FCT or RCT
n   = a decimal number
n.  = a decimal LBN, RBN, or XBN
xBN = LBN, RBN, or XBN
o   = an octal number
t   = type code: I or W
x   = error: ECC, EDC, etc.

7.3.3.2 VERIFY Error Message Severity Levels - VERIFY error
messages conform to the HSC utility error message format. In
each case, the utility name at the start of the message is
followed by a letter indicating severity level. These are
defined as follows:
F = Fatal
I = Information
t = type: either W or I, depending on the error
W = Warning
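
When console output from a VERIFY run has been captured to a file, the severity letter makes it easy to sort messages. The following sketch splits a captured line into utility name, severity, and text. The function name and regular expression are illustrative assumptions; only the F, I, and W letters come from the list above, and any other letter (such as the Q used in prompts) is reported as "Unknown".

```python
import re
from typing import Optional, Tuple

# Severity letters as defined for VERIFY messages.
SEVERITY = {"F": "Fatal", "I": "Information", "W": "Warning"}

# <utility name>-<severity letter> <message text>
_MESSAGE = re.compile(r"^(?P<utility>[A-Z]+)-(?P<level>[A-Z])\s+(?P<text>.*)$")


def parse_message(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (utility, severity, text), or None for non-message lines."""
    match = _MESSAGE.match(line.strip())
    if match is None:
        return None
    severity = SEVERITY.get(match.group("level"), "Unknown")
    return match.group("utility"), severity, match.group("text")


print(parse_message("VERIFY-I Drive is OK."))
# ('VERIFY', 'Information', 'Drive is OK.')
```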

7.3.3.3 VERIFY Fatal Error Messages - Following is a list of the
error messages fatal to the VERIFY utility. The program
terminates after printing one of these messages.
o VERIFY-F All Copies of xCT Block n Are Bad! - prints if
all copies of some block in either the RCT or the FCT
are bad. The program cannot continue to run because
vital information is missing. In any case, it has
verified that the unit is bad!

o VERIFY-F Current System Sector Size is 512! - prints if
the mode field in FCT block 0 indicates the unit is
formatted in 576-byte mode, but the system sector size
is set to 512. In this case, VERIFY cannot run because
it cannot read sectors 576 bytes long.

o VERIFY-F Drive went OFFLINE! - prints if the unit
selected goes offline while VERIFY is running.

o VERIFY-F Insufficient Resources to Run! - prints if
VERIFY cannot acquire the necessary resources to run or
the disk functional code is not loaded.

o VERIFY-F I/O Request Was Rejected! - prints if the
diagnostic interface (DDUSUB) rejects a request to start
an I/O operation. It is an indication of a bug in
VERIFY and should be reported to Field Service Support.

o VERIFY-F Mode is Bad or Format is in Progress on This
Unit! - prints if the mode field in FCT block 0 of the
selected unit is not valid.

7.3.3.4 VERIFY Information Messages - The following messages are
informational only.
o VERIFY-I CTRL/Y or CTRL/C Abort! - prints if the user
aborts VERIFY by typing a CTRL Y or CTRL C.

o VERIFY-I Drive is OK. - is a termination message and
prints at the end of VERIFY if no inconsistencies were
discovered.

o VERIFY-I There Were n Inconsistencies Found for This
Drive. - is a termination message and prints at the end
of VERIFY if inconsistencies were discovered.

7.3.3.5 VERIFY Warning Messages - The following messages are
warning messages. In many cases, they are true warnings; in
other cases, they simply precede a reprompt.
o VERIFY-W n Bad PBNs (in Brackets Above) Not in the RCT.
- prints if the count of bracketed LBNs and RBNs is
anything other than zero. After the RCT has been
collected, the appropriate subtable of the FCT is read
and the list of PBNs is printed. The RCT is searched for
the RBNs and LBNs corresponding to those PBNs, which
should all be there. If one is not found, the LBN or RBN
corresponding to the PBN is printed in brackets and
counted.

o VERIFY-W Cannot ONLINE unit - prints if the unit
requested is available but the ONLINE command failed.
The unit is released and the user is reprompted for
another unit.

o VERIFY-W Cannot Read Track with Starting xBN n - prints
if a track read fails before the request is sent to the
drive. It is usually caused by failing hardware. When
VERIFY accesses LBN space or RBN space to check it, it
reads all LBNs or RBNs on a track with one request.
This operation is done with VERIFY processing all errors
for each LBN or RBN.

o VERIFY-W Copy n of xCT Block n (xBN n.) Does Not Compare
- prints whenever a block is found that does not compare
to the first good one. All copies of every RCT or FCT
block are read and compared to the first good copy read.

o VERIFY-W Illegal Response to Start-up Question! -
prints if an invalid response is entered for a start-up
question. The program reprompts with the same question.

o VERIFY-W LBN n., a Non-Primary Revector, is Improper. -
prints if an LBN was not a nonprimary revector but was
recorded in the RCT as such. When VERIFY reads an LBN
with a header indicating it is a nonprimary revector, it
looks it up in the collected RCT information and flags
the fact it was found to be a nonprimary revector.

o VERIFY-W LBN n., a Primary Revector, is Improper. -
prints if an LBN was not a primary revector but was
recorded in the RCT as such. When VERIFY reads an LBN
with a header indicating it is primarily revectored, it
looks it up in the collected RCT information and flags
the fact that it was found to be a primary revector.

o VERIFY-W LBN n. Revectors to RBN n. Which is Bad. -
prints if an LBN is revectored to an RBN that was never
found to be good. When VERIFY finds an RBN that is good
(can be read with error recovery) or only has a forced
error (after error recovery), it looks it up in the
collected RCT information. If found, VERIFY marks it as
good. If, after the scan is finished, this flag is not
set for an RBN revectored to, this message prints.

o VERIFY-W Nonexistent Unit Number. - prints if the unit
number entered does not correspond to any known unit.
The program reprompts for the unit number.

o VERIFY-W Unit is Not Available. - prints if the unit
requested is unavailable. It may be in use by a host or
another diagnostic. It may be inoperative. The program
reprompts for another unit.

o VERIFY-W xBN n. Has a Hard EDC Error. - prints for
LBNs and RBNs found to have a bad EDC (neither correct
nor forced error). This error is classed as an
inconsistency. Only a software error can result in a
record with a bad EDC (unless the WRITE/BAD DKUTIL
command is used).

o VERIFY-W xBN n. is Bad but Not in the RCT. - prints
when an LBN or RBN shows a header error on every access
but is not recorded in the RCT. VERIFY first accesses
each track of LBNs or RBNs only once. Any LBNs or RBNs
where errors are detected in this initial pass are
recorded. They are then read five more times, one LBN
or RBN at a time. If errors are detected each time the
LBN or RBN is accessed, and all of the errors are header
errors, but the LBN or RBN is not recorded in the RCT,
this error message is printed.

o VERIFY-W xBN n. I/O Error in Access (MSCP Code: o). -
indicates a problem in the drive or the K. When this
message prints, it is an inconsistency. VERIFY provides
its own error processing for records read where the K
detects errors. This message prints if the return from
the I/O operation is not SUCCESS (forced error, EDC
error, or uncorrectable ECC error).

7.3.3.6 VERIFY Type Error Messages - A list of the type error
messages produced by VERIFY follows. The t for type in these
messages can stand for either I (Information) or W (Warning).
o VERIFY-t LBN n. Has Corrupted Data (Forced Error). -
prints with t as a W if you answered Y to the prompt
about FORMAT. However, if the unit has been subject to
bad block replacement, this message is printed (if at
all) with t as an I.
Normally, all LBNs have a correct EDC indicating their
data is good. However, a bad block replacement which
occurs when the data could not be recovered produces a
revectored LBN with a forced error flag. This indicates
the data is probably bad. No such LBNs should exist
just after FORMAT has run.

o VERIFY-t RBN n. is Good but Not Used for a Revector. -
prints if a good RBN with a valid EDC is found in the
verification pass but not recorded in the RCT as used.
Unused RBNs on a disk are written with a forced error
indication (the EDC is the complement of the proper
EDC). No such records should exist just after FORMAT
has been run. If you answered Y to the prompt about
FORMAT, this message prints with t as a W. However, if
the unit has been subject to bad block replacement, this
message is printed (if at all) with t as an I.

o VERIFY-t RBN n. Marked Bad in the RCT was Not Bad. -
prints with t as a W if you answered Y to the prompt
about FORMAT. However, if the unit has been subject to
bad block replacement, this message prints (if at all)
with t as an I. When VERIFY reads a bad RBN (bad header
or header code of bad), it looks it up in the collected
RCT information and flags the fact it was indeed found
to be bad. If any bad RBN recorded in the RCT is in
fact all right, this flag is not set. No such RBNs
should exist just after FORMAT has been run.

o VERIFY-t xBN n. Has an Uncorrectable ECC Error. -
prints when VERIFY discovers an inconsistency. No LBN
should have an uncorrectable ECC error; it should be
revectored either by FORMAT or by bad block replacement.
Thus, for an LBN, this error is considered an
inconsistency. Also, FORMAT should have discovered all
RBNs with uncorrectable ECC errors and marked them as
bad in the RCT. If an RBN is found with an
uncorrectable ECC error, but that RBN is not in the RCT,
it is also considered an inconsistency.
In both of these cases, this message is printed with t
as a W. If an RBN is discovered with an uncorrectable
ECC error marked bad in the RCT, this message prints (if
at all) with t as an I.
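
The way the t field is chosen for the first three type messages can be sketched as follows. This is an illustration only. The function and constant names are hypothetical, the braces in the template stand in for the manual's t and n fields, and reading "(if at all)" as "only when informational messages are enabled" is an assumption. The uncorrectable ECC message follows different rules and is not modeled.

```python
from typing import Optional

# Template for one of the type messages; {t} and {n} replace t and n.
FORCED_ERROR = "VERIFY-{t} LBN {n}. Has Corrupted Data (Forced Error)."


def render_type_message(template: str, block: int, just_formatted: bool,
                        informational_enabled: bool) -> Optional[str]:
    """Return the message text, or None when it would be suppressed."""
    # W if the operator answered Y to the "just FORMATted" prompt, else I.
    severity = "W" if just_formatted else "I"
    if severity == "I" and not informational_enabled:
        return None  # assumed meaning of "(if at all)"
    return template.format(t=severity, n=block)


print(render_type_message(FORCED_ERROR, 26003, True, False))
# VERIFY-W LBN 26003. Has Corrupted Data (Forced Error).
```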

7.3.3.7 VERIFY Informational Messages - Following are
descriptions of the informational messages printed by VERIFY.
Note that a message of this type may or may not need informational
messages enabled in order to print.

o VERIFY-I Copy n of xCT Block n (xBN n.) is Bad. -
prints if informational messages are enabled for RCT or
FCT blocks that cannot be read correctly with error
recovery.
NOTE
Table is null or empty (no bad PBNs). This
message is printed for null or empty FCTs
whether or not informational messages are
enabled.

o VERIFY-I DBN Area Should Probably be Reformatted. -
prints whether or not informational messages are
enabled. If more than nine DBNs were detected with
positioner errors, this message prints after the DBN
scan.

o VERIFY-I INITIAL WRITE Should be Specified for ILEXER.
- prints whether or not informational messages are
enabled. If more than nine DBNs were detected with
EDC errors, this message prints after the DBN
scan.

o VERIFY-I LBN n., a Primary, Has a Bad Header (is
Non-Primary). - prints if informational messages are
enabled for LBNs recorded in the RCT as primary
revectors but have garbled headers. Such a condition is
abnormal but not erroneous.

o VERIFY-I xBN n. Has a Transient (n Out of 6) x Error.
- prints if an LBN or RBN has been read six times with at
least one error-free read when informational and
transient error messages are enabled. The number of
times out of six that errors were detected is indicated
in the message.

o VERIFY-I xBN n. Has a n Symbol Correctable ECC Error.
- prints for LBNs or RBNs with solid ECC errors (errors
on all six accesses) that are correctable when
informational messages are enabled. The highest number
of symbols corrected on a seventh access is indicated in
the message.

o VERIFY-I xBN n. Has Solid Errors: x. - prints for LBNs
or RBNs with errors on all six accesses when
informational messages are enabled. The errors include
those other than ECC or EDC. The record is read a
seventh time with error recovery to determine if the
error is correctable. If it is not, a warning message
prints.
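
The two advisory messages that follow the quick scan of DBN space depend only on error counts. A minimal sketch, assuming the "more than nine" thresholds given in the VERIFY process description (positioner errors for the reformat advisory, EDC errors for the INITIAL WRITE advisory); the function and constant names are hypothetical:

```python
from typing import List

# "More than nine" errors triggers each advisory.
DBN_ERROR_LIMIT = 9


def dbn_scan_advisories(positioner_errors: int, edc_errors: int) -> List[str]:
    """Return the advisory messages for one quick scan of DBN space."""
    messages = []
    if positioner_errors > DBN_ERROR_LIMIT:
        messages.append("VERIFY-I DBN Area Should Probably be Reformatted.")
    if edc_errors > DBN_ERROR_LIMIT:
        messages.append("VERIFY-I INITIAL WRITE Should be Specified for ILEXER.")
    return messages


print(dbn_scan_advisories(positioner_errors=10, edc_errors=0))
# ['VERIFY-I DBN Area Should Probably be Reformatted.']
```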

7.4 OFFLINE DISK FORMATTER UTILITY (FORMAT)
FORMAT is the HSC70 utility used to format disks. It formats
with either a 512- or 576-byte sector size. It can be used to
format only the DBN area or to format both the LBN area and the
DBN area.
CAUTION
The FORMAT utility destroys user data and can
destroy the FCT if used by persons not familiar
with DSA.
The DBN area is always formatted. If the user requests it, the
LBN area is also formatted. When the LBN area is formatted,
there are two modes of operation. In Best Guess mode, the XBN
area is formatted, no FCT is used, and a null FCT is generated.
In Reformat mode, the FCT on the disk is used and the XBN area is
not formatted. If a Reformat is requested, but the FCT is null
or clobbered, a modified Best Guess mode is used where only the
LBN area is formatted. The main difference between Best Guess
mode and Reformat mode is in the check pass. Best Guess mode
rereads each track at least three times, where Reformat mode
reads it once. If any error is detected, Best Guess mode rereads
the track 20 times, where Reformat mode rereads it 3 times.
CAUTION
Be careful when using CTRL C or CTRL Y to abort
the FORMAT utility after formatting operations
begin. Doing this may destroy the contents of
the FCT and/or the RCT. The FORMAT utility
should only be aborted under fatal-unrecoverable
disk failure conditions.

7.4.1 FORMAT Initiation
FORMAT is initiated via the standard CRONIC command syntax, RUN
DX0:FORMAT.UTL. Note the last field in the following prompts
(shown in square brackets). This indicates the default for that
prompt.
The program prompts for the unit number of the disk to format
with the following:
FORMAT-Q Enter unit number to format (U) [D0]?
The next prompt determines whether the LBN (user data) area
should be formatted or only the DBN (diagnostic) area. If you
answer Y to this prompt, user data is destroyed.
FORMAT-Q Format user data area (Y/N) [N]?
If you reply with N or a carriage return only (to obtain the
default), the program starts executing and formats the DBN
area only. If you enter a Y, the program prompts for the sector
size to use when formatting the disk:
FORMAT-Q Enter sector size to be used (512/576) [512]?
If you press carriage return only, the sector size used is 512
bytes. Otherwise, either 512 or 576 should be entered. The next
prompt determines which format mode should be used:
FORMAT-Q Use existing bad block information (Y/N) [Y]?
If you enter N, the program assumes a Best Guess mode format,
which is done depending on the response to the next question. If
you enter Y, a Reformat mode or modified Best Guess mode
format is used, depending on the state of the FCT on the disk and
the response to the next question. If N was entered in the
previous question, the next question is:
FORMAT-Q FCT will be destroyed; are you sure (Y/N) [N]?
If you enter N, the program reprompts with the previous question.
If you enter Y, the DBN, XBN, and LBN areas are formatted in Best
Guess mode. This mode is seldom, if ever, used because the FCT
contains important information about bad spots on the HDA which
FORMAT does not always find using Best Guess mode. User data
loss is highly probable when formatting in this mode.
If Y was entered for the bad block information prompt, the next
question is:
FORMAT-Q Continue if bad block information is inaccessable (Y/N)
[N]?
If you enter N, a Reformat mode is used if the FCT is valid. If
it is not valid, the program aborts with an appropriate error
message. If Y is entered, a Reformat mode is used if the FCT is
valid or a modified Best Guess mode is used if the FCT is null or
clobbered. In either case, the XBN area is not formatted.
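
The way the prompt responses combine to select a format mode can be summarized in a small decision function. This is a sketch of the rules described above, not FORMAT's implementation. The function name, parameter names, and returned strings are hypothetical, and the state of the FCT is reduced to a single valid or not-valid flag.

```python
def select_format_mode(format_user_area: bool,
                       use_existing_info: bool = True,
                       confirm_destroy_fct: bool = False,
                       continue_if_inaccessible: bool = False,
                       fct_valid: bool = True) -> str:
    """Return the action implied by the FORMAT prompt responses."""
    if not format_user_area:
        return "DBN area only"
    if not use_existing_info:
        # "FCT will be destroyed; are you sure" decides Best Guess mode.
        if confirm_destroy_fct:
            return "Best Guess (DBN, XBN, and LBN areas)"
        return "reprompt"
    if fct_valid:
        return "Reformat"
    if continue_if_inaccessible:
        return "modified Best Guess (XBN area not formatted)"
    return "abort"


# Defaults after answering Y to "Format user data area".
print(select_format_mode(format_user_area=True))  # Reformat
```

Note that only an explicit N to the bad block information prompt followed by Y to the confirmation reaches the mode that formats the XBN area.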
If the response to this prompt is Y or the response to the
destroy FCT prompt is Y, the program prompts for a serial number:
FORMAT-Q Enter a non-zero serial number (D)?
This serial number is used if a Best Guess mode format is used or
all copies of FCT block 0 are unreadable (in modified Best Guess
mode). FORMAT allows a number of special options, not only for
debugging purposes but also to increase data reliability. To
determine if any of these options are desired, the program
prompts with the following:
FORMAT-Q Do you want special options (Y/N) [N]?
If the response is N or a carriage return (the default of N),
FORMAT starts processing.
If the response is Y, the following three special option prompts
appear:
FORMAT-Q Revector blocks with 1 symbol ECC errors (Y/N) [N]?
Normally, blocks discovered during the check pass of formatting
with 1-symbol ECC errors are not retired. The program assumes
this level of error is tolerable. If the response to this prompt
is Y, all blocks with solid (nontransient) ECC errors are
retired. In all cases, blocks with 2-symbol (or more)
ECC errors are retired, regardless of the drive's ECC
symbol threshold.
The second special option prompt is:
FORMAT-Q Revector blocks with transient errors (Y/N) [N]?
After a track is formatted, it is read either once (Reformat) or
three times (Best Guess). If an error is detected, and the mode
is Reformat, the track is read twice more. If any block not
previously retired shows an error twice, it is retired, and the
track is reformatted with this check pass done again. If no
block had errors twice, the track is read 3 more times (Reformat)
or 20 more times (Best Guess). Blocks which show an error only
once during all of these reads are normally not retired. Such
errors are considered tolerable transient errors. If the
response to this prompt is Y, blocks that show any error are
retired.
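
The effect of the first two special options on the retire decision can be sketched as below. This is a simplified model, not FORMAT's algorithm. The function and parameter names are hypothetical, and several points are assumptions made for illustration: the check pass is summarized as a count of reads that showed an error and the worst ECC symbol count seen (0 meaning the error was not an ECC error), a block with an error on two or more reads is treated as solid, and the 2-symbol rule is applied first.

```python
def should_retire(error_reads: int, worst_ecc_symbols: int,
                  retire_1_symbol: bool = False,
                  retire_transient: bool = False) -> bool:
    """Decide whether a block is retired after the check pass."""
    if error_reads == 0:
        return False
    if worst_ecc_symbols >= 2:
        return True              # always retired
    if error_reads == 1:
        return retire_transient  # tolerable transient unless option set
    if worst_ecc_symbols == 1:
        return retire_1_symbol   # solid 1-symbol ECC error
    return True                  # solid error that is not an ECC error


print(should_retire(error_reads=1, worst_ecc_symbols=0))   # False
print(should_retire(error_reads=3, worst_ecc_symbols=1,
                    retire_1_symbol=True))                 # True
```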
The third and final special option prompt is:
FORMAT-Q Report position of bad blocks (Y/N) [N]?

Blocks retired during the format process are reported with a
single line printout. The type, block number, and cause are
printed. If the response to this prompt is Y, the PBN number,
cylinder, track, group, and position are also printed on a
subsequent line.
The user can enter CTRL Z at any prompt to use the default for
the remainder of the responses. Also, the responses to
subsequent questions can be supplied at any question by typing
the responses separated by commas. For example, if unit D133 has
an FCT and is to be formatted in 512-byte mode with no special
options, the user could type D133,Y,,,, at the first prompt.
7.4.2 FORMAT Sample Session
The following is a sample session using FORMAT. User input is in
bold print.

HSC70> RUN DX0:FORMAT
FORMAT-Q Enter unit number to format (U) [D0]?D133
FORMAT-Q Format user data area (Y/N) [N]?Y
FORMAT-Q Enter sector size to be used (512/576) [512]?
FORMAT-Q Use existing bad block information (Y/N) [Y]?
FORMAT-Q Continue if bad block information is inaccessable (Y/N)
[N]?
FORMAT-Q Do you want special options (Y/N) [N]?Y
FORMAT-Q Revector blocks with 1 symbol ECC errors (Y/N) [N]?
FORMAT-Q Revector blocks with transient errors (Y/N) [N]?
FORMAT-Q Report position of bad blocks (Y/N) [N]?
FORMAT-S Format begun.
FORMAT-I   2 cylinders left in DBN space at 00:05:34.60.
FORMAT-I 275 cylinders left in LBN space at 00:05:39.60.
FORMAT-I Bad LBN 237213 (FCT), in the RCT area.
FORMAT-I 265 cylinders left in LBN space at 00:06:05.60.
FORMAT-I 255 cylinders left in LBN space at 00:06:31.40.

FORMAT-I  25 cylinders left in LBN space at 00:16:36.20.
FORMAT-I  15 cylinders left in LBN space at 00:17:02.00.
FORMAT-I   5 cylinders left in LBN space at 00:17:28.40.
FORMAT-S Format completed.
FORMAT-I Stats:
0 Bad RBNs,
2 Revectored LBNs,
2 Primary Revectored LBNs,
0 Non-Primary Revectored LBNs,
1 Bad Blocks in RCT Area,
0 Bad Blocks in DBN Area,
0 Bad Blocks in XBN Area,
9 Blocks Retried on Check Pass.
FORMAT-I FCT was used successfully.

***********************************************************
*                                                         *
*   VERIFY must be RUN to complete FORMAT verification!   *
*                                                         *
***********************************************************

The preceding example is the output for an actual session for an
RA80 disk with one bad PBN in the FCT. Notice the message which
indicates it was retired because it was in the FCT, and it was in
the RCT area. Note the informational message which is printed
every 10 cylinders. This confirms that progress is actually
being made and shows at what rate. Also, note the two LBNs
which were retired because they had 2-symbol ECC errors; they
became primary revectors. The error log messages were printed
for them because, in the case of an RA80, two symbols are in
excess of the ECC drive threshold.
Note that the final statistics indicate two LBNs were revectored
and one bad LBN was found in the RCT area. The nine Blocks Retried
on Check Pass include the two bad LBNs plus seven other blocks
that had transient errors only and were therefore not retired. The
bad block in the RCT was not retried in the check pass because it
was known to be bad from the FCT. This would be true for any blocks
retired because they are listed in the FCT. The final message
indicates an FCT was found and was successfully used.
Note the message in the box, which indicates VERIFY must be run
to complete verification. This is an essential step and should
not be skipped.
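
If the console output of a FORMAT run is captured, the formatting rate can be estimated from two of the progress messages. The sketch below is an illustration under stated assumptions: the function names and the regular expression are hypothetical, and the timestamps are assumed to be hh:mm:ss.xx as defined for FORMAT messages.

```python
import re
from typing import Optional, Tuple

_PROGRESS = re.compile(
    r"(\d+) cylinders left in (\w+) space at (\d+):(\d+):(\d+\.\d+)",
    re.IGNORECASE,
)


def parse_progress(line: str) -> Optional[Tuple[int, str, float]]:
    """Return (cylinders left, space name, time in seconds) or None."""
    match = _PROGRESS.search(line)
    if match is None:
        return None
    cylinders, space, hours, minutes, seconds = match.groups()
    elapsed = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return int(cylinders), space, elapsed


def seconds_per_cylinder(earlier: str, later: str) -> float:
    """Average time per cylinder between two progress messages."""
    first, second = parse_progress(earlier), parse_progress(later)
    if first is None or second is None:
        raise ValueError("both lines must be progress messages")
    return (second[2] - first[2]) / (first[0] - second[0])


rate = seconds_per_cylinder(
    "FORMAT-I 275 cylinders left in LBN space at 00:05:39.60.",
    "FORMAT-I 265 cylinders left in LBN space at 00:06:05.60.",
)
print(round(rate, 2))  # about 2.6 seconds per cylinder
```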
7.4.3 FORMAT Errors And Information Messages
This section describes the error and information messages printed
by FORMAT. Error messages are arranged alphabetically according
to the actual message.
7.4.3.1 FORMAT Error Message Variables - Variable output in the
error and information messages is shown in bold print. These
fields are formed as follows:

n   = a decimal number
x   = the way a block was found bad: FCT or check
xBN = a space: DBN, XBN, or LBN
hh  = hours
mm  = minutes
ss  = seconds
xx  = hundredths of a second

7.4.3.2 FORMAT Message Severity Levels - FORMAT error messages
conform to the HSC utility error message format. In each case,
the utility name at the start of the message is followed by a
letter indicating severity level. These are defined as:
F = Fatal
I = Information
E = Error
S = Success
W = Warning
7.4.3.3 FORMAT Fatal Error Messages - This section describes the
fatal error messages printed by FORMAT.
o FORMAT-F Cannot Position to DBN Area! - prints if FORMAT
cannot confirm that the heads are positioned in the DBN
area. Unless it is running in Best Guess mode, FORMAT
attempts to verify it has positioned the heads to the
DBN area before it formats the disk. FORMAT does this by
reading the first sector of every track in the DBN
read/write area until a sector is read without a header
error. This fatal error message is printed if no such
sector can be found.

o FORMAT-F Current Maximum Sector Size is 512! - prints
if the user requests a 576-byte sector size but the
system sector size is set to 512. In this case, FORMAT
cannot run because I/O cannot be done with sectors that
are 576 bytes long.

o FORMAT-F DBN/XBN Format Error (Drive FORMAT Command
Failed)! - prints if a FORMAT command fails for five
retries when formatting the DBN or XBN area.

o FORMAT-F Drive Does Not Support 576 Mode on This Media!
- prints if the user requests a 576-byte sector size for
a drive that does not support it.

o FORMAT-F Drive is Write Protected! - prints if the
requested drive is hardware write-protected and
therefore cannot be formatted.

o FORMAT-F FCT Does Not Have Enough Good Copies of Each
Block! - prints if any block in the FCT does not have
two good copies.

o FORMAT-F FCT is Improper! - prints if one or more PBNs
in the FCT remain to be processed after the LBN area has
been formatted. When the program finishes formatting the
LBN area, it checks to see if all PBNs in the FCT have
been processed. This error usually indicates an FCT
where some PBNs are out of order.

o FORMAT-F FCT Nonexistent! - prints if the FCT is null
or clobbered, and the user has instructed the program
not to continue.

o FORMAT-F FCT Read Error! - prints if all copies of some
given block of the FCT cannot be successfully read.

o FORMAT-F FCT Write Error! - prints if all copies of
some given block of the FCT cannot be successfully
written.

o FORMAT-F Formatter Initialization Error! - prints if
FORMAT cannot acquire enough data buffers or control
blocks to start formatting, or if the disk functional
code is not loaded.

o FORMAT-F GET STATUS Failure! - prints if the unit
requested is not available or cannot be brought online.

o FORMAT-F LBN Format Error (Drive FORMAT Command Failed)!
- prints if a FORMAT command fails for five retries when
formatting the LBN area.

o FORMAT-F Nonexistent Unit Number! - prints if the unit
requested does not exist.

o FORMAT-F RCT Does Not Have Enough Good Copies of Each
Block! - prints if any block in the RCT does not have
two good copies.

o FORMAT-F RCT is Full! - prints if so many bad blocks
are encountered the RCT overflows.

o FORMAT-F RCT Read Error! - prints if all copies of some
given block of the RCT cannot be successfully read.

o FORMAT-F RCT Write Error! - prints if all copies of
some given block of the RCT cannot be successfully
written.

o FORMAT-F SDI Receive Error! - prints if a track cannot
be read at all after it has been formatted.

o FORMAT-F Too Many Bad RBNs Found Before RCT was
formatted. - prints if more RBNs than can be recorded
in memory are encountered before the RCT area has been
formatted.

o FORMAT-F Unsuccessful SDI Command! - prints if the
drive fails to respond to an SDI command. FORMAT issues
SEEK, RECALIBRATE, and DRIVE CLEAR SDI commands.

7.4.3.4 FORMAT Warning Message - The FORMAT utility prints only
one warning message.
o FORMAT-W WARNING: Possible Head Addressing Problem. -
prints if no sector was successfully read from one or
more tracks in the XBN area. Note that all cylinders
are checked. This is a simple check for a bad head.

7.4.3.5 FORMAT Information Messages - Following are the
informational messages printed by FORMAT:
o

FORMAT-I Bad LBN n (x), a Non-Primary Revector.
prints for LBNs retired by being revectored to some RBN
other than the primary RBN; they are marked in the RCT
as nonprimaries. They are formatted with a header code
of nonprimary or with a headeer code of bad if their
header area is bad.

FORMAT-I Bad LBN n (x), a primary Revector to RBN n.
prints for LBNs retired by being revectored to the first
RBN on the same track; they are marked in the RCT as
primaries. They are formatted with a header code of
primary.

FORMAT-I Bad LBN n (x), in the RCT Area. - prints for
retired LBNs in the RCT area. They are formatted with a
header code of bad.

FORMAT-I Bad RBN n (x). - prints for retired RBNs.
They are marked bad in the RCT and are formatted with a
header code of bad.

Cylinder n, Group n, Track n, Position n, PBN n. -
prints following the preceding four messages, if the
user requested the special option to print bad block
position.

FORMAT-I CTRL/Y or CTRL/C Abort! - prints if the user
aborts FORMAT by typing CTRL/Y or CTRL/C. Note that this
probably leaves the disk in an unusable state if the
format has begun.

FORMAT-I FCT was Not Used. - prints if a null or
clobbered FCT was found on the disk or generated at the
request of the user (Best Guess mode).

FORMAT-I FCT was used Successfully. - prints if a valid
FCT was found on the disk and used.

FORMAT-I n Cylinders Left in xBN Space at
hh:mm:ss.xx. - is an informational message and prints
after every 10 cylinders are formatted in order to
record the progress of the FORMAT program.

FORMAT-I Only DBN Area Formatted (n Bad DBNs). - prints
if the user requested formatting of the DBN area only.
It prints after the format of the DBN area is completed.
After this message prints, the program terminates.

7.4.3.6 FORMAT Error Messages - Following are the error messages
printed by FORMAT:
o

FORMAT-E Illegal Response to Start-up Question!
prints if an invalid input is supplied for a start-up
question. The program reprompts with the same question.

FORMAT-E Nondefaultable Parameter. - prints if the user
enters only a carriage return, requesting the default
for the only nondefaultable parameter (the serial
number). The program reprompts for the serial number.

7.4.3.7 FORMAT Success Messages - Following are the FORMAT
success messages:
o

FORMAT-S Format Completed. - prints after the format
process is done, and all verification tests are
complete.

FORMAT-S Format Begun. - prints when FORMAT actually
begins formatting the disk.

7.5 RXFORMAT UTILITY
The RXFMT utility program allows the user to format and verify
RX33 diskettes. These are 5 1/4-inch, two-sided, double-density
diskettes available from DIGITAL. This utility is used only to
format diskettes for the HSC70. The program should complete in
less than five minutes.
7.5.1 RXFORMAT Initiation
To run the RXFMT utility, select an HSC70. At the KMON prompt,
type:

HSC>RUN dev:RXFMT

where dev is the name of the drive containing the RXFORMAT
utility.
The program prompts the user from beginning to end. As with all
HSC prompts, material contained in square brackets is the
default. To accept the default, press the RETURN key. With
square brackets that do not contain material, you have to supply
the value and then press the RETURN key.
To abort the utility, type CTRL/Y or CTRL/C at any point.
However, note this action leaves your diskette in an unknown
state.
After the RUN command is input, the utility prompts:
RXFMT-Q Unit to format []?
RXFMT allows the user to select either drive to run the program.
Following is an example of a typical RXFMT session:
HSC70>RUN DX1:RXFMT
RXFMT-Q Unit to format []?DX1:
RXFMT-Q Ready to start formatting (Y or N) []?Y
RXFMT-I Formatting track 0, side 0, LBN 0
RXFMT-I Formatting track 8, side 0, LBN 240
RXFMT-I Formatting track 16, side 0, LBN 480
RXFMT-I Formatting track 24, side 0, LBN 720
RXFMT-I Formatting track 32, side 0, LBN 960
RXFMT-I Formatting track 40, side 0, LBN 1200
RXFMT-I Formatting track 48, side 0, LBN 1440
RXFMT-I Formatting track 56, side 0, LBN 1680
RXFMT-I Formatting track 64, side 0, LBN 1920
RXFMT-I Formatting track 72, side 0, LBN 2160
RXFMT-S Format successfully completed.
RXRD-I Reading track 0, side 0, LBN 0
RXRD-I Reading track 8, side 0, LBN 240
RXRD-I Reading track 16, side 0, LBN 480
RXRD-I Reading track 24, side 0, LBN 720
RXRD-I Reading track 32, side 0, LBN 960
RXRD-I Reading track 40, side 0, LBN 1200
RXRD-I Reading track 48, side 0, LBN 1440
RXRD-I Reading track 56, side 0, LBN 1680
RXRD-I Reading track 64, side 0, LBN 1920
RXRD-I Reading track 72, side 0, LBN 2160
RXFMT-I Program Exit
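
The progress lines in the sample session are consistent with 30 logical blocks per track (both sides together): track 8 is reported at LBN 240 and track 16 at LBN 480. The following Python sketch reproduces that arithmetic. The constant and the function name are illustrative assumptions inferred from the sample output only; they are not taken from an RX33 specification.

```python
# Assumption inferred from the sample session: each track (both sides)
# holds 30 logical blocks, because track 8 is reported at LBN 240.
BLOCKS_PER_TRACK = 30


def first_lbn_of_track(track: int) -> int:
    """Return the LBN reported when side 0 of the given track is reached."""
    if track < 0:
        raise ValueError("track number must be non-negative")
    return track * BLOCKS_PER_TRACK


# Reproduce the track/LBN pairs printed every eighth track in the session.
for track in range(0, 80, 8):
    print(f"track {track:2d}, side 0, LBN {first_lbn_of_track(track)}")
```

The loop steps by eight tracks because the sample session reports progress at tracks 0, 8, 16, and so on up to 72.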


7.5.2 RXFORMAT Error Messages
Error messages possible while running RXFORMAT follow:
o

RXFMT-E Requested unit is unavailable. - The unit
specified in the command line is unavailable.

RXFMT-F Aborting. - RXFMT tries to format and verify 10
times. If no progress is made, RXFMT issues an error
message and the program exits. This message is also
displayed after the user types either a CTRL/Y or a
CTRL/C. Try a different diskette. If the problem
persists, report it to appropriate personnel.

RXFMT-F Error comparing track. - RXFMT detected an
inconsistency. The data read from the diskette in the
verify pass did not match what was written.
Retry.

RXFMT-F Error formatting track. - This could be caused
by a bad diskette or a hardware problem. Retry. If the
problem still persists, try a different diskette.

RXFMT-F Error reading track. - This error could be
caused by a bad diskette or a hardware problem. RXFMT
tries to verify the formatting 10 times. If no progress
is made, the program exits. Run program again. If
problem persists, use a different diskette. If the
problem still persists, report it to Field Service
Support.

RXFMT-F Unable to allocate sufficient mapped memory.
Not enough blocks in Program memory are available to use
as buffer space. Try again later.

RXFMT-F Unable to allocate sufficient XFRBs. - The
common pool did not contain enough memory to allocate an
XFRB, required for RXFMT using load media. This is a
transient condition. Try again later.

RXFMT-W About to format diskette in boot device.
RXFMT warns the user the utility is about to format the
diskette in the boot device. The user must be very
cautious when running RXFMT. As a result, RXFMT not
only asks whether reformatting should start, but also
outputs this warning message.

RXFMT-I Formatting track, side, LBN. - RXFMT did not
encounter any problem while formatting previous track,
and simply reports.

RXFMT-I Please specify a valid unit. - The user must
specify the unit-id, either DX0: or DX1:.

RXFMT-I Program Exit. - The program is finished and is
exiting.

RXFMT-I Reading Track, side, LBN. - RXFMT did not
encounter any problems while verifying previous track.

RXFMT-S Format successfully completed. - RXFMT
completed without any errors or interruptions.

RXFMT-Q Unit to format []? - RXFMT asks which unit the
user will use to format the diskette.

RXFMT-Q Ready to start formatting (Y or N) []? - RXFMT
asks if the user is ready to format the diskette.
Ensure the diskette is loaded into the correct drive.

7.6 VIDEO TERMINAL DISPLAY (VTDPY)
VTDPY is a utility for gathering system statistics. This utility
displays, on a continuing basis, activity within the HSC. VTDPY
can display system throughput, AVAILABLE or ONLINE status of disk
and tape drives, and utilities running on other terminals. This
utility also indicates which nodes have virtual circuits,
connections, and multiple connections to the HSC.
NOTE
Do not run VTDPY using the command SET HOST/HSC
through the Diagnostic Utility Program (DUP).
DUP cannot manage VTDPY because too much optional
interrupt data is produced.
This utility requires a video terminal and does not display on an
LA12. Either a VT100 or a VT220, set at 9600 baud, must be
attached to the EIA port on the HSC to run VTDPY.
To run VTDPY, enter at the prompt
HSC> RUN dev:VTDPY (update-interval)
In this command, update-interval is in seconds, anywhere from 2
to 420. If you do not provide this update interval, VTDPY
prompts:
VTDPY-Q Interval (secs) ?
If your response is outside the allowable range, VTDPY displays
an error message. The higher the number for the update-interval,
the smaller the performance impact on the HSC.
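
The range check described above can be sketched in Python as follows. This is only an illustration of the stated 2 to 420 second limits; the function name and the use of exceptions are assumptions and do not describe how VTDPY itself is implemented.

```python
MIN_INTERVAL = 2     # seconds, lower bound stated in the text
MAX_INTERVAL = 420   # seconds, upper bound stated in the text


def check_interval(text: str) -> int:
    """Return the update interval as an int, or raise ValueError."""
    value = int(text.strip())  # raises ValueError for non-numeric input
    if not MIN_INTERVAL <= value <= MAX_INTERVAL:
        raise ValueError("Illegal Interval Value (2 to 420 seconds)")
    return value


print(check_interval("30"))
```

A caller that wants the reprompting behaviour described in the text would wrap the call in a loop and ask again whenever ValueError is raised.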
VTDPY terminates after the user enters a CTRL/Y or a CTRL/C. The
screen is cleared upon termination.

7.6.1 VTDPY Error Messages
This utility has only two error messages, as follows:
Message:      VTDPY-E Illegal Interval Value (2 to 420 seconds)

Explanation:  The user has entered an update-interval outside the
              range permitted. VTDPY reprompts for the
              update-interval.

User Action:  Re-enter a value within the correct range.

Message:      VTDPY-F Insufficient Common Pool

Explanation:  This message indicates insufficient memory to run
              VTDPY.

User Action:  Try again later.

7.6.2 VTDPY Display Example
An example of a VTDPY screen display follows:
HSC70 V3.00 C3PO     Id 000000000000 On 14-Apr-1986 12:28:13.12     UP: 113.49
42.9% Idle     39 Work Requests/Sec     40 Sectors/Sec     0 Records/Sec

Free Lists
  Ctrl Blks      2269 +
  SLCB/DCB         32 +
  Buffers         889 +
Pool Sizes
  SYSCOM         1800 +
  Kernel         6504 +
  Program      821120 +
  Control       32436 +
Data B/W used:    .0%

Host Status
0123456789012345
..MMMMM..MMMM...
..^^^BA..^^BA...

     Process  Pr  St
     Kernel
  4  VTDPY    11  Rn
 24  SYSDEV    1  Bl
 60  DEMON    11  Bl
 62  PDEMON    7  Bl
 64  PSCHED   13  Rn
 72  DISK      9  Rn
110  ECC       6  Bl
120  TAPE      8  Bl
122  TTRASH    7  Bl
124  HOST      4  Bl
126  POLLER    5  Bl
130  SCSDIR    5  Bl
146  DPOUT    10  Bl
150  DP2OUT   10  Bl
162  DUP       9  Bl

Time%
16.4%
19.2%
42.9%
16.0%
  .9%
  .9%

Disk Status
              1111111111
    +1234567890123456789
  0 ....................
 20 A.A..........A......
 40 ..........A.A.A.....
 60 .AA.......O..A..O...
 80 ....................
100 ....................
120 ....................
140 ....................
160 ....................
180 ..................A.
200 A...................
220 ....................
240 ....................

NOTE

A true video display contains solid diamond
symbols on the bottom line of the Host Status field,
indicated in this example as a caret (^).

7.6.2.1 VTDPY Display Explanation - The previous display example
constantly changes as different processes run in the HSC. These
changes are made automatically with the exception of the fields
relating to HSC memory. Memory statistics are updated only by
typing a CTRL/W. The major fields are explained as follows:
HSC70 V3.00 C3PO     Id 000000000000 On 14-Apr-1986 12:28:13.12     UP: 113.49

The top line, reading from left to right, shows the HSC model
number (HSC70), the baselevel of the operating software (V3.00),
the system name (C3PO), the system ID (Id; any hexadecimal number
unique to the cluster, in this case 000000), time, and date. The
last number on the right indicates the hours and minutes the HSC
has been running since the last boot or reboot.

42.9% Idle     39 Work Requests/Sec     40 Sectors/Sec     0 Records/Sec

This second line in the display shows the percentage of current
P.io idle time, average number of work requests (i.e., MSCP and
TMSCP) per second, number of disk data sectors transferred per
second, and number of tape data records transferred per second.
These numbers are normalized to match the update interval.
Free Lists
  Ctrl Blks      2269 +
  SLCB/DCB         32 +
  Buffers         889 +
Pool Sizes
  SYSCOM         1800 +
  Kernel         6504 +
  Program      821120 +
  Control       32436 +

This field represents the quantity of available memory and memory
structures. The sizes are usually followed by plus signs. If
followed by minus signs, the system is in memory deficit.
Extremely prolonged memory deficit results in HSC slowdown and
could eventually result in an HSC crash.

Data B/W used:    .0%

This display shows the percentage of data bandwidth used. This
is an instantaneous display and may often show 0% when the HSC is
busy because the sampling interval missed the instantaneous
bandwidth usage.

Host Status
0123456789012345
..MMMMM..MMMM...
..^^^BA..^^BA...

NOTE

A true video display contains solid diamond
symbols on the bottom line, indicated in this
example as a caret (^).

This field indicates host status. The line below Host Status
shows the node number (in the range 0 through 15) of the hosts in
the cluster. If no letter appears under this node number in the
next line, that node number is not a currently active host. If a
V appears on that line, a Virtual Circuit only is open and no
connection is present (host usually in the transitional state).
A C on this line indicates one connection to that host and an M
indicates multiple connections. Because each host can make a
separate connection to each of the Disk, Tape, and DUP servers,
this field frequently shows multiple connections.
The bottom line of this field contains CI path status information
and each position can contain either a diamond symbol, an A, or a
B. The meanings are as follows:
o  A diamond symbol equals normal operation in any position
   with a connection.

o  An A or B indicates only one CI path is operational. If
   an A is displayed, Path A is running, but Path B is not;
   if a B is displayed, Path B is running, but Path A is
   not. Either letter probably indicates a hardware
   problem.
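
The two Host Status lines can be read column by column, as in the following Python sketch. The sample strings are patterned on the display example, with a caret standing in for the diamond symbol and a period standing in for an empty position; both substitutions, and all names in the code, are assumptions made for illustration.

```python
def decode_host_status(conn_line: str, path_line: str) -> dict:
    """Map node number -> (connection state, CI path state)."""
    conn_names = {
        "V": "virtual circuit only",
        "C": "one connection",
        "M": "multiple connections",
    }
    path_names = {
        "^": "both paths",      # caret stands in for the diamond symbol
        "A": "path A only",
        "B": "path B only",
    }
    result = {}
    for node, (conn, path) in enumerate(zip(conn_line, path_line)):
        if conn == ".":         # no letter: node is not an active host
            continue
        result[node] = (conn_names.get(conn, "unknown"),
                        path_names.get(path, "unknown"))
    return result


status = decode_host_status("..MMMMM..MMMM...", "..^^^BA..^^BA...")
for node, (conn, path) in sorted(status.items()):
    print(f"node {node:2d}: {conn}, {path}")
```

Nodes that report a single path are the ones worth checking first, since the text notes that an A or B probably indicates a hardware problem.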


     Process  Pr  St
     Kernel
  4  VTDPY    11  Rn
 24  SYSDEV    1  Bl
 50  DEMON    11  Bl
 52  PDEMON    7  Bl
 54  PSCHED   13  Rn
 72  DISK      9  Rn
110  ECC       6  Bl
120  TAPE      8  Bl
122  TTRASH    7  Bl
124  HOST      4  Bl
126  POLLER    5  Bl
130  SCSDIR    5  Bl
146  DPOUT    10  Bl
150  DP2OUT   10  Bl
152  DUP       9  Bl

Time%
16.4%
19.2%

42.9%
16.0%

.9%
.9%

.LV

.L.JV

The headings in this display (from left to right) mean the
following:
o  The first column with numbers is the process number.

o  The Process column shows the name of the process running
   at the time.

o  The Pr column shows the priority of the process.

o  The St column shows the status of the process, either
   running (Rn) or blocked (Bl).

o  The Time% column is the percentage of P.io time each
   currently-running process is using.

Certain process names in the first column under Kernel (the
operating system) are defined as follows:
o  In this case, VTDPY. However, it could be another
   utility (in which case the priority number would change
   also).

o  SYSDEV is the load device driver.

o  DEMON indicates demand and automatic diagnostics are
   running.

o  PDEMON indicates periodic diagnostics are running.

o  PSCHED is the scheduler for periodic diagnostics. This
   is the HSC idle loop.

o  DISK is the disk server.

o  ECC is the error correction code process and is always
   displayed when disk I/O is indicated.

o  TAPE is the tape server.

o  TTRASH is always displayed when the tape server is
   active. It is the process that sends tape error logs to
   the host.

o  HOST is the process that interfaces to the host. It is
   always present.

o  POLLER polls for the host process and is always present
   when a connection is present.

o  SCSDIR processes directory requests from the host.

o  DPOUT and DP2OUT are the I/O from two different DUP
   processes.

o  DUP is the Diagnostic and Utility Protocol server.

Note, not all processes are necessarily shown. Because of
limited space on the screen, the display of some processes may be
truncated and the CPU time percentages may not total 100 percent.
Disk Status
              1111111111
    +1234567890123456789
  0 ....................
 20 A.A..........A......
 40 ..........A.A.A.....
 60 .AA.......O..A..O...
 80 ....................
100 ....................
120 ....................
140 ....................
160 ....................
180 ..................A.
200 A...................
220 ....................
240 ....................

The last area in the display can indicate either Disk Status or
Tape Status. This rightmost field fluctuates between the two
displays whenever both device types are connected to the HSC.
The line immediately under Disk Status indicates the following
unit numbers are augmented by 10 from the base number in the
leftmost column. To find the identification number of the disk
indicated by a single letter in this field, count from the left.
For instance, on the 20s line, the third A would be disk unit
number 33.
A letter anywhere in the field has a particular meaning for the
particular disk unit identified, as follows:
o  An A indicates the drive is available but not mounted.

o  An O indicates Online status. The drive is in use by a
   host, an HSC utility, or an HSC diagnostic.

o  A D indicates the HSC is connected to duplicate units
   (two or more drives with the same unit number).

o  A U indicates the drive went into an undefined state.

The letters and method of determining tape drive ID number are
the same when tape status is displayed. However, one additional
letter can be shown, an F, indicating no tape is mounted on the
tape drive.
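
The unit-number lookup described above amounts to adding a column offset to the base number of the row. A minimal Python sketch follows; the row string is the 20s line of the display example written as 20 columns, and the function name is an illustrative assumption.

```python
def units_with_status(base: int, row: str) -> dict:
    """Map unit number -> status letter for one 20-column row of the map."""
    return {base + offset: letter
            for offset, letter in enumerate(row)
            if letter != "."}


row_20 = "A.A..........A......"   # the 20s line of the example
units = units_with_status(20, row_20)
print(units)
```

The third A sits at column offset 13, which gives unit 33 and agrees with the worked example in the text.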


CHAPTER 8
TROUBLESHOOTING TECHNIQUES

8.1 INTRODUCTION
This chapter describes the types of errors occurring during HSC70
boot and operation. The major divisions are initialization
errors and system type errors. Initialization errors occur while
the HSC70 is trying to boot. System type errors occur while the
HSC70 is running functional code. System type errors may be
reported to a host node and possibly the HSC console device.
Some system errors may result in the HSC70 crashing and
rebooting. System errors include MSCP, TMSCP, BBR, and
out-of-band errors.
8.2 HOW TO USE THIS CHAPTER
Initialization error indications are found in the Operator
Control Panel (OCP) fault codes and the module LEDs. In
addition, the bootstrap diagnostics may produce error messages
printed out to the console. Read Section 8.3 for an
understanding of initialization errors that do not produce a
message. All errors displayed as English messages are shown in
the index to this manual. Section 8.3 divides initialization
errors into three types:

o  OCP fault codes
o  Module LEDs
o  Boot diagnostic messages

HSC console error messages for system type errors are described
in this chapter and are organized into the following sections:

o  MSCP/TMSCP errors -- Section 8.4.1
      Controller errors -- Section 8.4.2.4
      MSCP SDI errors -- Section 8.4.2.5
      Disk Transfer errors -- Section 8.4.2.6

o  BBR errors -- Section 8.4.3

o  TMSCP errors -- Section 8.4.4
      STI Communication or Command Errors -- Section 8.4.4.1
      STI Formatter Error Log -- Section 8.4.4.2
      STI Drive Error Log -- Section 8.4.4.3

o  Out-of-Band errors -- Section 8.4.5
      HOST-X -- CI errors -- Section 8.4.5.1
      SYSDEV-X -- Load Device errors -- Section 8.4.5.2
      DISK-X -- Disk Functional errors -- Section 8.4.5.3
      TAPE-X -- Tape Functional errors -- Section 8.4.5.4
      SINI-X -- Miscellaneous errors -- Section 8.4.5.5

Each message description includes the following:

o  Actual error message
o  Error message severity level
o  Message description
o  Field service action
o  Possible FRUs

8.3 INITIALIZATION ERROR INDICATIONS
Initialization errors are indicated by:

o  OCP fault code displays
o  Module LEDs
o  Boot diagnostic messages

8.3.1 OCP Fault Code Displays
OCP fault codes are divided into two categories, hard fault codes
and soft fault codes. Soft fault codes are also called nonfatal
fault codes. Soft faults impede HSC70 operation, but the fault
does not hinder the boot process. Hard fault codes are fatal to
the HSC and prevent further operation of the HSC subsystem until
the condition is remedied.

Figure 8-1 shows the possible displays available on the OCP in
the event of errors during initialization or operation. For
detailed interpretations of these fault codes, refer to Chapter
1.


OCP INDICATORS

DESCRIPTION                                 HEX   OCT   BINARY

PORT PROCESSOR MODULE FAILURE (+)            01    01   0 0001
DISK DATA CHANNEL MODULE FAILURE (+)         02    02   0 0010
TAPE DATA CHANNEL MODULE FAILURE (+)         03    03   0 0011
INSTRUCTION CACHE PROBLEM
  IN I/O CONTROL PROCESSOR (*)               08    10   0 1000
HOST INTERFACE ERROR (*)                     09    11   0 1001
DATA CHANNEL ERROR (*)                       0A    12   0 1010
I/O CONTROL PROCESSOR MODULE FAILURE         11    21   1 0001
MEMORY MODULE FAILURE                        12    22   1 0010
BOOT DEVICE FAILURE (**)                     13    23   1 0011
PORT LINK MODULE FAILURE                     15    25   1 0101
MISSING FILES REQUIRED                       16    26   1 0110
NO WORKING K.SDI, K.STI, OR K.CI             18    30   1 1000
REBOOT DURING BOOT                           19    31   1 1001
SOFTWARE DETECTED INCONSISTENCY              1A    32   1 1010

(+)  INCORRECT VERSION OF MICROCODE.
(*)  THESE ARE THE SO-CALLED SOFT OR NON-FATAL ERRORS.
(**) POSSIBLE MEMORY MODULE/CONTROLLER ON HSC70.

Figure 8-1  Operator Control Panel Fault Codes (CX-905B)
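
The binary lamp patterns in Figure 8-1 convert directly to the octal fault codes quoted elsewhere in this chapter (for example, 1 0001 is octal 21). The following Python sketch performs that conversion and flags the three codes the figure marks as soft. The table layout and function name are illustrative assumptions; only the patterns and descriptions come from the figure.

```python
# Binary patterns copied from Figure 8-1. The second tuple element is True
# for the codes the figure footnotes as soft (non-fatal).
FAULT_CODES = {
    "00001": ("Port processor module failure", False),
    "00010": ("Disk data channel module failure", False),
    "00011": ("Tape data channel module failure", False),
    "01000": ("Instruction cache problem in I/O control processor", True),
    "01001": ("Host interface error", True),
    "01010": ("Data channel error", True),
    "10001": ("I/O control processor module failure", False),
    "10010": ("Memory module failure", False),
    "10011": ("Boot device failure", False),
    "10101": ("Port link module failure", False),
    "10110": ("Missing files required", False),
    "11000": ("No working K.sdi, K.sti, or K.ci", False),
    "11001": ("Reboot during boot", False),
    "11010": ("Software detected inconsistency", False),
}


def describe(bits: str) -> str:
    """Describe a five-lamp pattern such as '1 0001' or '01010'."""
    key = bits.replace(" ", "")
    name, soft = FAULT_CODES[key]
    value = int(key, 2)
    kind = "soft" if soft else "hard"
    return f"{name}: octal {value:o}, hex {value:X} ({kind})"


print(describe("1 0001"))
print(describe("01010"))
```

An unknown pattern raises KeyError, which is a deliberate choice in this sketch: a lamp pattern that is not in the figure should not be silently interpreted.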


8.3.2 Module LEDs
HSC70 modules contain LEDs used as state indicators for each
module. These LEDs are described in the following tables.
Figure 2-1 in Chapter 2 shows the locations of the module LEDs.
8.3.2.1 P.ioj LEDs - Table 8-1 shows the L0111 (I/O Control
Processor module) LEDs and their functions.

Table 8-1  L0111-0 (P.ioj) LEDs

LED   Color    Meaning

D1    Yellow   Micro-ODT
               ON when J-11 executing micro-ODT

D2    Yellow   SLU OK
               Serial Line Unit output of UART

D3    Yellow   MEM OK
               Turned OFF as J-11 successfully accesses Program
               memory

D4    Yellow   SEQ Lamp
               Turned OFF as J-11 verifies proper functioning of
               its sequencers for control store

D5    Yellow   State Lamp
               Blinks in parallel with OCP State Lamp (under
               software control)

D6    Yellow   Fetch Lamp
               Blinks once for every PDP-11 instruction fetched
               (J-11 run LED)

D7    Red      Diagnostic/Testing Failure
               Initially on for powerup; turned off upon successful
               completion of J-11 initialization diagnostics

D8    Green    Diagnostics Passed
               Turned on upon successful completion of J-11
               initialization diagnostics

8.3.2.2 Power-Up Sequence Of I/O Control Processor LEDs - This
section defines the power-up sequence of the LEDs shown in Table
8-1. First, LED numbers D8 and D7 are used to indicate whether
the P.ioj module has successfully completed all of its
initialization diagnostics. The module powers up with the red
(D7) LED ON and the green (D8) LED OFF. D1 through D4 (yellow)
are initially on. As soon as the J-11 starts operating, D1
(micro-ODT LED) turns off.
Several microcode steps later, D4 (sequence LED) is turned off
indicating the J-11 is sequencing and succeeded in reaching this
point in its microcode. The J-11 performs several program memory
operations and, if successful, turns off D3 (memory OK LED).
Finally, the J-11 accesses the console terminal port of the UART
(universal asynchronous receiver/transmitter) and turns off D2
(SLU or serial line unit LED).
Upon successful completion of the boot time initialization
diagnostics, D8 (module OK LED) turns on, and D7 (module failure
LED) turns OFF. The J-11 then proceeds to the software
initialization programs.
In addition to being initially ON, the D1 (micro-ODT run LED) is
on any time the J-11 is executing micro-ODT. D6 (the fetch LED,
sometimes referred to as the run LED) blinks once for every
PDP-11 instruction fetch cycle. When the J-11 is running, D6 is
illuminated at half-brilliance compared to the other yellow LEDs.
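
Because the yellow LEDs go dark in a fixed order during a normal power-up, the first one still lit suggests how far the J-11 got. The following Python sketch encodes that reading. It is a troubleshooting aid under stated assumptions: the step descriptions paraphrase the text above, and treating the first lit LED as the stall point is an interpretation, not a documented diagnostic procedure.

```python
# Order in which the yellow LEDs turn off during a normal power-up,
# as described in the text: D1, then D4, then D3, then D2.
POWER_UP_STEPS = [
    ("D1", "J-11 has started operating (no longer in micro-ODT)"),
    ("D4", "J-11 sequencing reached this point in its microcode"),
    ("D3", "program memory operations succeeded"),
    ("D2", "console terminal port of the UART was accessed"),
]


def stalled_step(leds_on: set):
    """Return (LED, unmet milestone) for the first LED still lit, else None."""
    for led, milestone in POWER_UP_STEPS:
        if led in leds_on:
            return led, milestone
    return None


# Example: red D7 plus yellow D3 and D2 still lit.
print(stalled_step({"D7", "D3", "D2"}))
```

In the example the sketch reports D3, meaning the sequencer milestone (D4) was passed but the program memory milestone was not. Remember that D1 is also lit whenever the J-11 is legitimately executing micro-ODT.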
8.3.2.3 Memory Module LEDs - Table 8-2 shows the L0117-0
(M.std2) module LEDs and their functions. These LEDs are
controlled by a bit in the RX33 FDC MAR02 register. The green
LED is set to ON by the P.ioj boot/ROM self-test diagnostics
after the RX33 has passed its self-tests, and Program memory has
found 8 Kwords to load INIPIO/OFLPIO.
NOTE
The entire LED package on the M.std2 is called
D2. All three LEDs are contained in the D2
package.

Table 8-2  L0117-0 (M.std2) LEDs

LED   Color    Meaning

      Red      Mod Not OK

      Green    Mod OK

      Yellow   Memory Active
               Indicates access activity to any of the three
               memories on this module

8.3.2.4 Data Channel LEDs - Table 8-3 shows the L0108-YA and -YB
(K.sdi and K.sti) module LEDs and their functions with the system
software.

Table 8-3  L0108-YA/YB (K.sdi/K.sti) LEDs

LED   Color    Meaning

      Red      Module Failure
               Indicates a module microdiagnostic failed to
               successfully complete, or this module is still under
               initialization by the subsystem.

      Green    Module OK
               Turned on by the Init/Func Flag signal in the K
               functional microcode. The green LED comes on after
               successful initialization or while the data channel
               is running functional microcode.

8.3.2.5 Host Interface LEDs - Table 8-4 shows the three modules
in the K.ci set, their LEDs, and the functions of the LEDs with
the system software.

Table 8-4  K.ci (LINK, PILA, K.pli) LEDs

Module  LED    Color    Meaning

K.pli          Red      ON when P.io has booted or rebooted, but
                        K.pli module has not yet passed its
                        self-test.

K.pli          Green    ON when K.pli has passed its self-test.

PILA           Red      ON when PILA module has not yet passed the
                        test performed by the K.pli.

PILA           Green    ON when the PILA module has passed the test
                        performed by the K.pli. LED is controlled
                        by the port processor.

PILA           Yellow   (Not found on all etch rev modules.) ON
                        when K.pli is asserting init. When init is
                        true, both the red and the green PILA LEDs
                        are forced OFF.

LINK    D998   Green    ON when local activity is present on the
                        LINK module (whenever the LINK module
                        detects a message directed to its node or
                        when it detects an outgoing message).

LINK    D999   Red      ON during the CI maintenance loop test.

8.3.3 Communication Errors
It is possible for the HSC70 to complete its initialization and
not report the fact on the local console terminal (VT220). This
is an indication of a failure in the serial communication path
between the UART chip on the P.ioj (L0111-0) and the local
console terminal.
As a method of testing this serial path, the HSC70 echoes the
characters typed on the local console terminal as if the terminal
were in local mode. Use the following procedure to test the
serial path:

1.  Place the Secure/Enable switch in the ENABLE position.

2.  With power on, push in and hold the OCP Init switch.

3.  Type a series of characters on the terminal keyboard.

4.  Check to see if the series of characters echoed
    correctly on the terminal.

NOTE
When the Init switch is released, the HSC
reboots.

If this procedure fails to echo characters typed at the keyboard,
the failure is either a terminal/P.ioj baud-rate mismatch
(default is 9600), a P.ioj module failure, or a problem within
the terminal-cabling subsystem. Ensure the terminal set-up
parameters are correct. Refer to the HSC70 Installation Manual
(EK-HSC70-IN-001) for the proper terminal configuration. Refer
to the VT220 Owner's Manual (EK-VT220-UG-001) for problem-solving
techniques related to the VT220.

8.3.4 Requestor Status For Nonfailing Requestors
When a requestor successfully completes all internal
microdiagnostics, bits 0 through 5 contain the following codes
defining module types.
o  Code 001 represents a properly-functioning host
   interface module set (K.ci).

o  Code 002 represents a properly-functioning disk data
   channel module (K.sdi).

o  Code 203 represents a properly-functioning tape data
   channel module (K.sti).

o  Code 377 indicates the requestor slot does not contain a
   module.

NOTE
When a module fails internal microdiagnostics or
its functional code, the status byte reflects the
failure. See Appendix D for a complete list of
K.ci-, K.sdi-, and K.sti-detected failures.
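
The four codes listed above can be held in a small lookup table, as in the following Python sketch. Treating the codes as octal values is an assumption (the manual quotes OCP fault codes in octal, and 377 octal is a byte of all ones); the names are illustrative, and any other value is simply referred to Appendix D rather than decoded.

```python
# Assumption: the status codes in the text are octal values.
MODULE_TYPES = {
    0o001: "host interface module set (K.ci)",
    0o002: "disk data channel module (K.sdi)",
    0o203: "tape data channel module (K.sti)",
    0o377: "requestor slot does not contain a module",
}


def describe_requestor(status: int) -> str:
    """Return the module type for a good status, else point to Appendix D."""
    return MODULE_TYPES.get(status, "failure status; see Appendix D")


for code in (0o001, 0o002, 0o203, 0o377, 0o107):
    print(f"{code:03o}: {describe_requestor(code)}")
```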

8.3.5 Boot Flowchart
The HSC70 Boot Flowchart (Figure 8-2) maps the entire boot
sequence. The flowchart calls out useful visual milestones that
aid in troubleshooting the problems which can occur during
initialization.
The flowchart has three main divisions:
1.  Information on activity common to both the system and
    offline diskettes is contained in boxes A through O.

2.  Information on activity specific to the system diskette
    is contained in boxes SA through SJ.

3.  Information on activity specific to the offline diskette
    is contained in boxes OA through OG.

The flowchart begins when one of the following occurs:

o  Init button pushed
o  Powerup has started
o  Other software caused reboot

INTERNAL/EXTERNAL INITIALIZATION ENTRY POINT (TIME = 0)

J-11 PERFORMS INTERNAL MICRO TEST ... A THRU C.

A   TEST INTERNAL J-11 SEQUENCER. TURN OFF D1 (MICRO-ODT) IF NOT
    IN ODT. TURN OFF D4.

B   TEST MEMORY: LOC 0 RESPOND (NO NXM?); LOC 1777700 'SHOULD'
    NXM. TURN OFF D3.

    TEST FOR SLU, CHECK 177560 FOR RESPONSE. TURN OFF D2.

    BEGIN EXECUTION OF BOOT ROM. TURN OFF ALL OCP INDICATORS.

    TEST J-11 BASIC INSTRUCTIONS TEST 0.

    TURN ON 'INIT' INDICATOR.

FAIL BRANCHES ON THIS SHEET: NO FAULT CODE (FIVE BRANCHES);
FAULT = 21 OCTAL (ONE BRANCH).

NOTE: LEDs D1-D4 AND D7 ARE ON THE P.IOJ MODULE.

NOTE: D7 (RED LED) ON. ? MEANS OCP LEDs ARE INDETERMINATE AND
HAVE NO MEANING AT THIS TIME.

TIME < 1/2 SECOND: D7 STILL ON. OCP INDICATORS NOW RELIABLE.

Figure 8-2  HSC70 Boot Flowchart (1 of 4) (CX-945B, Sheet 1 of 4)

    TEST BANK SWAP BITS IN P.IOJ CSR.
    FAULT = 21 OCTAL. IF MEMORY FAILS W/NXM OR PARITY ERROR,
    FAULT WILL NOT BE SET.

    FIND 8KW OF GOOD PROGRAM MEMORY.
    FAULT = 22 OCTAL. IF MEMORY DATA ERROR IS DETECTED, FAULT
    IS 22, AND FAULT LED WILL BE ON.

    TEST 4: TEST RX33 CONTROLLER HARDWARE.
    FAULT = 23 OCTAL.
    NOTE: FOR MORE INFORMATION, SEE ERROR REPORTING SECTION
    IN CHAPTER 4.

    READ/RECALIBRATE TEST ON RX33 DRIVE.
    FAULT = 23 OCTAL. FAULT = 23 OCCURS ONLY IF 'BOTH' DRIVES
    FAIL.

    READ FIRST 8 BLOCKS FROM RX33 (BOOT BLOCKS).
    FAULT = 23 OCTAL. FAULT = 23 OCCURS ONLY IF 'BOTH' DRIVES
    FAIL. SEE CHAPTER 8.

    BRANCH ON DISKETTE LOADED: SYSTEM DISKETTE OR OFFLINE
    DISKETTE.

Figure 8-3  HSC70 Boot Flowchart (2 of 4) (CX-945B, Sheet 2 of 4)

SYSTEM DISKETTE

SA  INIT LED TURNED OFF, STATE LED TURNED ON SOLID, HSC
    CONSOLE O/P 'INIPIO-I-BOOTING'. LOAD REMAINDER OF
    INIPIO.INI.

SB  INIPIO PERFORMS INSTRUCTION TESTS AND MMU TESTS.
    FAIL: FAULT = 21 OCTAL.

SC  INIPIO LOADS INICAC AND TRANSFERS CONTROL.

SD  INICAC TESTS CACHE. IF CACHE FAILS, FLAG FAILURE TO
    INIPIO.

SE  INIPIO INITS ALL REQUESTORS AND GETS THEIR STATUS.

SF  J-11 TESTS PROG MEM. HIGHEST REQUESTOR # TESTS CONTROL
    AND DATA MEMORY.
    FAIL: FAULT = 22 OCTAL. TOTAL MEMORY FAILURE IN CONTROL
    'OR' DATA.

SG  INIPIO LOADS EXEC. INIPIO TURNS ON GREEN LED ON THE
    P.IOJ MODULE.
    FAIL: FAULT = 23 OCTAL. FAULT OCCURS IF BOOT DEVICE HAS
    ERROR WHEN LOADING EXEC.

SH  INIPIO TRANSFERS TO EXEC START. STATE LIGHT BLINKING AT
    1/2 SECOND INTERVALS.
    FAIL: MOST REMAINING FAULTS INDICATE SOFT FAULTS.

SI  EXEC RUNS SINI. SINI LOADS AND INITIALIZES REMAINING
    S/W MODULES.
    FAIL: FAULT CODE DEPENDENT ON FAILURE.

SJ  SINI TRANSFERS COMPLETELY TO EXEC. STATE LIGHT BLINKS
    AT 1 SECOND INTERVALS. OUTPUT OPERATING SOFTWARE HERALD.
    FAIL: SAME AS ABOVE.

NOTE: AFTER THE OPERATING SOFTWARE HERALD, OTHER
INITIALIZATION MESSAGES MAY BE REPORTED. SEE CHAPTER 8,
SECTION ON OUT-OF-BANDS FOR SINI ERRORS.

Figure 8-4  HSC70 Boot Flowchart (3 of 4) (CX-945B, Sheet 3 of 4)

OFFLINE DISKETTE

OA  TURNS INIT INDICATOR OFF, TURNS STATE INDICATOR ON
    SOLID.

OB  LOADS REST OF OFFLINE P.IOJ TEST (OFLPIO).

OC  RUNS OFFLINE P.IOJ TEST (OFLPIO).
    FAIL: ERROR TYPEOUT OR HALT AT 400.

OD  LOADS OFFLINE DIAGNOSTIC LOADER (ODL).

OE  TURNS ON P.IOJ GREEN LED.

OF  STARTS ODL. BLINKS STATE INDICATOR. ODL HERALD TO
    TERMINAL.

OG  ODL PROMPT. ODL WAITS FOR OPERATOR COMMAND, ROTATES OCP
    LAMPS FOR TEST.

ODL FEATURES

8 TESTS:          BUS, MEM, MEM BY K, K TEST SEL, OCP, REFRESH,
                  CACHE, RX33

11 CONVENIENCES:  SIZE, HELP, @, LOAD, START, SET DEFAULT,
                  SHOW DEFAULT, SET RELOCATION, EXAMINE,
                  DEPOSIT, REPEAT

NOTE: SEE CHAPTER 6, OFFLINE DIAGNOSTICS, FOR MORE
INFORMATION.

NOTE 1: FIRST PORTION OF THE OFFLINE P.IOJ TEST (OFLPIO) WAS LOADED
WITH THE PREVIOUS LOAD OF EIGHT BOOT BLOCKS.
NOTE 2: FOR DETAILED INFORMATION ON THE OFFLINE P.IOJ TEST AND
ERROR REPORTS, REFER TO CHAPTER 6, OFFLINE DIAGNOSTICS.

Figure 8-5  HSC70 Boot Flowchart (4 of 4) (CX-945B, Sheet 4 of 4)

8.3.6 Boot Diagnostic Indications
The HSC70 can pass boot diagnostics with a failing requestor.
Although the HSC70 passed the boot, the failure associated with
the requestor is considered an initialization error.
Following is an example of an error message displayed when a
requestor fails on initialization of the operating software. The
HSC70 has passed most of the initialization/boot diagnostics, but
a requestor has failed.
SINI-E ERROR SEQUENCE 2. AT 20-SEPT-1985 00:00:02.80
REQUESTOR 2 FAILED INIT DIAGS, STATUS = 107
The requestor with the red LED on is the failing requestor. In
this case, the diagnostic identifies requestor 2 as failing its
internal self-test number 7. Additionally, the Fault indicator
turns on, and a soft fault code of octal 12 is displayed on the
OCP after the Fault switch is pressed.
See Chapter 4 for more information on errors indicated by the
OCP.
8.4 SOFTWARE ERROR MESSAGES
Software error messages are classified into three categories:
1. MSCP/TMSCP errors
2. Bad Block Replacement errors (BBR)
3. Out-of-Band errors

8.4.1 Mass Storage Control Protocol Errors
The Mass Storage Control Protocol (MSCP/TMSCP) errors printed out
at the console terminal and reported to a host can be one of the
following types:
o  Controller Errors
o  SDI Errors
o  Disk Transfer Errors
o  STI Communication Errors
o  STI Formatter Errors
o  STI Drive Errors

8.4.2 MSCP/TMSCP Error Format, Description, And Flags
Error formats, descriptions of the fields within the error
format, and error flags are nearly identical for MSCP and TMSCP
errors. Differences are noted where they exist.

8.4.2.1 MSCP/TMSCP Error Format - Example 8-1 shows an error
format generic to all MSCP/TMSCP errors. Some errors may contain
optional lines with additional information.

Example 8-1 MSCP/TMSCP Error Message Format

ERROR-X Text of message at (date) (time)
   Command Ref #    xxxxxxxx
   Err Seq #        x.
   Error Flags      xx
   Event            xxxx
   (Optional line)
   (Optional line)
   (Optional line)
ERROR-I End of error.

8.4.2.2 MSCP/TMSCP Error Message Fields - Table 8-5 describes
the various fields found in an MSCP/TMSCP error message. These
fields are common to all error messages of this type.

Table 8-5 MSCP/TMSCP Error Message Field Description

Field

Description

ERROR-E

The E is a code indicating the severity level of
an error. Other codes are: Q for inquiry, I for
informational, F for fatal, W for warning, and S
for success. Note: Only severity levels E and Q
require user action. Information following the
severity level code is a textual version of the
error message describing the event code,
followed by the date and time.

Command Ref #

This number (in hexadecimal) is the MSCP/TMSCP
command number which caused the reported error.
It is zero if the error does not correspond to a
specific outstanding command. This number is
normally assigned by the issuing host CPU.

Err Seq #

This number (in decimal) is a sequential number
which counts error log messages since the
MSCP/TMSCP server established a connection with
the host. It is zero if the MSCP/TMSCP server
does not implement error log sequence numbers.

Format Type

This field is found only in TMSCP error
messages. This number, in bit format, is the
formatted density in bits per linear inch of
tape.

Error Flags

This number (in hexadecimal) indicates bit
flags, collectively called error log message
flags, used to report various attributes of the
error. Refer to Table 8-6.

Event

This number (in hexadecimal) identifies the
specific error or event being reported by this
error log message. This code consists of a 5-bit
major event code and an 11-bit subcode. The
event codes and their meanings are listed in
Appendix D.

ERROR-I

The I indicates the severity level of the end of
error message is informational.
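
The Event field layout described in Table 8-5 can be illustrated with a short Python sketch. It assumes that the major event code occupies the low-order 5 bits of the 16-bit value and the subcode the high-order 11 bits; the table gives only the field widths, so that placement is an assumption to be checked against Appendix D. The function name split_event_code is hypothetical. The sample values are the Event fields printed in Examples 8-2, 8-3, and 8-4.

```python
def split_event_code(event_hex):
    """Split a hexadecimal Event field into (major_code, subcode).

    Assumption: the 5-bit major event code is in the low-order bits and
    the 11-bit subcode is in the high-order bits of a 16-bit value.
    """
    value = int(event_hex, 16)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("Event field must fit in 16 bits")
    major_code = value & 0x1F   # low-order 5 bits (assumed position)
    subcode = value >> 5        # remaining high-order 11 bits
    return major_code, subcode


if __name__ == "__main__":
    # Event values taken from the example printouts in this chapter.
    for event in ("012A", "00EB", "01C8"):
        major, sub = split_event_code(event)
        print(f"Event {event}: major event code {major}, subcode {sub}")
```

The resulting major code and subcode are then looked up in the event code listing in Appendix D.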

8.4.2.3 MSCP/TMSCP Error Flags - Table 8-6 defines the
MSCP/TMSCP error flags.

Table 8-6

MSCP/TMSCP Error Flags

Bit       Bit Mask
Number    (Hex)       Format Description

If set, the operation causing this error log
message has successfully completed. The error log
message summarizes the retry sequence necessary
to successfully complete the operation.

If set, the retry sequence for this operation
continues. This error log message reports the
unsuccessful completion of one or more retries.

(MSCP-specific) If set, the identified logical
block number (LBN) needs replacement.

(MSCP-specific) If set, the reported error
occurred during a disk access initiated by the
controller bad block replacement process.

If set, the error log sequence number has been
reset by the MSCP server since the last error log
message sent to the receiving class driver.
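
The Error Flags field can be decoded with a small lookup, sketched below in Python. The bit positions used here (7, 6, 5, 4, and 0) are an assumption based on the customary MSCP error log flag layout, not on this table, and should be verified against the Bit Number and Bit Mask columns of Table 8-6. The names ERROR_FLAG_BITS and decode_error_flags are hypothetical.

```python
# Assumed bit positions; verify against Table 8-6 before relying on them.
ERROR_FLAG_BITS = {
    7: "operation successfully completed",
    6: "retry sequence continues",
    5: "LBN needs replacement (MSCP-specific)",
    4: "error occurred during bad block replacement access (MSCP-specific)",
    0: "error log sequence number reset",
}


def decode_error_flags(flags_hex):
    """Return descriptions of the flags set in a hexadecimal Error Flags byte."""
    value = int(flags_hex, 16)
    return [
        text
        for bit, text in sorted(ERROR_FLAG_BITS.items(), reverse=True)
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 41 is the Error Flags value printed in Example 8-2.
    for line in decode_error_flags("41"):
        print(line)
```

Under these assumed positions, the value 41 from Example 8-2 reads as a continuing retry sequence reported after a sequence number reset.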

8.4.2.4 MSCP/TMSCP Controller Errors - Example 8-2 is a printout
of a typical controller error.
Example 8-2 Controller Error Message Example

ERROR-E Data memory error (NXM or parity) at 5-Mar 1985 12:52:14.43
   Command Ref #    1C430008
   Err Seq #        1.
   Error Flags      41
   Event            012A
   Buffer Addr      143611
   Source Req.      0.
   Detecting Req.   3.
ERROR-I End of error.
NOTE

The direction of data transfer may be deduced
from the types of requestors identified in the
Source Requestor and Detecting Requestor fields of
the error message. In this example, the source
requestor (the P.ioj) filled the buffer and
requestor 3 is reading it.
This section lists controller and compare errors together because
their format and fields are the same. These errors contain three
optional fields in addition to those described in Table 8-5. The
controller/compare specific fields are shown in Table 8-7. The
actual descriptions for these errors follow in Section 8.4.2.4.1.
Table 8-7

MSCP/TMSCP Controller Error Message Field Description

Field

Description

Buffer Addr

This number (in octal) is the starting address
of the HSC data buffer where the error occurred.

Source Req.

This number (in decimal) is the requestor that
originally filled the buffer with data.

Detecting Req.

This number (in decimal) is the requestor that
detected the error.

8.4.2.4.1 Controller Error List - The following is an
alphabetical listing and an explanation of the controller errors.

Compare Error
Message Error Level:

Message Description: A compare error occurred during a
read-compare or a write-compare operation. For the
read-compare operation, the HSC again obtains the data from
the unit or shadow set and compares it with data obtained
from host memory. If the data is not the same, a compare
error results. For the write-compare operation, the
controller obtains data from each destination and compares
it with data again obtained from host memory. If the data
is not the same, a compare error results.
Field Service Action: Isolate the FRU by moving the disk or
tape unit to another data channel and retrying the exact
failing operation. Also, check the HSC data memory buffer
address for repetition. If failure occurs on multiple
physical units across multiple data channels and HSC data
memory buffer address is not repetitive, investigate a
possible K.ci problem.
Possible FRUs:
1. Isolated disk (or tape) unit
2. Data channel
3. M.std2
4. K.ci module set

Data Bus Overrun
Message Error Level:

Message Description: The HSC attempted to perform too many
concurrent transfers, causing one or more of them to fail
due to a data overrun or underrun. For example, data is
sent to a bus by a data producer and then removed from the
bus by a data consumer. If the producer sends data to the
bus more quickly than the consumer can remove it, a data
overrun occurs. If the consumer removes data more quickly
than the producer can send it, a data underrun occurs.
Field Service Action: Determine which module is the data
producer and which module is the consumer for a given error.
Use the requestor number for assistance.
If the problem persists after replacing the suspect
module(s), an HSC software problem should be investigated.
Possible FRUs:

Source or detecting requestor modules.

Data Memory Error (NXM or Parity)
Message Error Level:

Message Description: The HSC detected an error in internal
Data memory. The error was either a parity error, detected
via a parity generator/checker (data only - not address) on
the requestor module, or a nonresponding address (the
requestor did not receive a DACK from the memory module).
Field Service Action: Determine if this error is
repetitive; if so, the problem is probably the M.std2
module. However, it may be a data bus problem caused by a
number of things, such as failing bus drivers/receivers on
the indicated requestor modules.
Possible FRUs:

M.std2 or a possible data bus problem.

EDC Error
Message Error Level:

Message Description: The sector was read with correct or
correctable ECC and invalid EDC. A fault probably exists in
the ECC logic of either this controller or the controller
that last wrote the sector. Look at the source and
detecting requestor fields in the error message to determine
which requestor detected the error and the direction of the
transfer (read or write).
Field Service Action: Determine if other errors indicate a
problem with the data path circuitry on the indicated
requestor modules.
Possible FRUs:
1. K.sdi
2. M.std2, if an address parity error on Data memory
   occurs, as this is checked by the EDC field.

Internal Consistency Error
Message Error Level:

Message Description: A high-level check detected an
inconsistent data structure. For example, a reserved field
contained a nonzero value, or the value in a field was
outside its valid range. This error is most likely caused
by the requestor microcode or hardware.
Field Service Action: If the error is repetitive, check for
consistent requestor numbers in detecting requestor field of
error. Determine if any other surrounding error reports
indicate a possible internal memory error.
Possible FRUs:
1. FRU noted in the detecting requestor field
2. M.std2 memory module

SERDES Overrun
Message Error Level:

Message Description: This error is either a SERDES overrun
or underrun error. Either the drive is too fast for the
controller, or a controller hardware fault prevented
controller microcode from keeping up with data transfer to
or from the drive.
Field Service Action: Determine if other errors have
occurred that may indicate a K.sdi problem. Move the
offending drive to another requestor. If the problem
persists, test the drive further.
Possible FRUs:

K.sdi module

PLI Receive Buffer Parity Error
Message Error Level:

Message Description: When the data from the packet in a
receive buffer on the PILA module was transferred to the
K.pli module, a parity error was detected on the bus. In
this case, parity is generated by the LINK module (L0100)
and checked by the K.pli module (L0107). The PILA module
stores the data without checking or generating parity.

Field Service Action: If failure is persistent and is
accompanied by K.ci level 7 K interrupt HSC crashes, analyze
K.ci module status code for more detailed information. Run
Offline Test K diagnostic to test K.ci. Any error report
should more clearly indicate the specific K.ci module
failure. For very intermittent failures follow sequence of
possible FRUs.
Possible FRUs:
1. PILA
2. K.pli
3. LINK

PLI Transmit Buffer Parity Error
Message Error Level:

Message Description: When data was being transferred from
the K.pli to the PILA transmit buffer, a parity error was
detected on the bus. In this case, parity is generated by
the K.pli module and checked by the LINK module. The PILA
module stores the data without checking or generating
parity.
Field Service Action: If failure is persistent and is
accompanied by K.ci level 7 K interrupt HSC crashes, analyze
K.ci module status code for more detailed information. Run
Offline Test K diagnostic to test K.ci. Any error report
should more clearly indicate specific K.ci module failure.
For very intermittent failures follow sequence of possible
FRUs.
Possible FRUs:
1. PILA
2. LINK
3. K.pli

8.4.2.5 MSCP SDI Errors - The SDI type errors total 15. Example
8-3 shows a typical SDI error message. Table 8-8 describes the
fields specific to SDI errors. Tables 8-9, 8-10, 8-11, and 8-12
further define the fields in Table 8-8. For the remaining
fields, refer to Table 8-5.

Example 8-3 SDI Error Printout

ERROR-E Drive Detected Error at 5-Mar 1985 12:52:14.43
   Command Ref #    00000000
   RA81 unit #      124.
   Err Seq #        4.
   Error Flags      40
   Event            00EB
   Request          1B
   Mode             00
   Error            80
   Controller       00
   Retry/fail       00
   Extended Status  88
                    00
                    03
                    00
                    07
                    4B
                    1A
   Requestor #      6.
   Drive port #     2.
ERROR-I End of error.

Table 8-8 SDI Error Printout Field Description

Field

Description

Request

This number (in hexadecimal) is a byte
describing the state of the drive. Figure
8-6 shows the bits of this byte
field, and Table 8-9 describes the
bits. In this example, the 1B indicates:
o  RUN/STOP switch in
o  Port switch in
o  Log information in extended area
o  Spindle ready

Mode

This number (in hexadecimal) is a byte
describing the mode of the unit. Figure
8-7 shows the bits of this byte
field, and Table 8-10 describes the
bits. In this example, the 00 indicates:
o  No subunits are write protected.
o  The disk is in 512-byte sector format.

Error

This number (in hexadecimal) is a byte
describing the errors in the unit. Figure
8-8 shows the bits of this byte
field, and Table 8-11 describes the
bits. In this example, the 80 indicates a drive
error has occurred, and the drive FAULT lamp may
be on.

Controller

This number (in hexadecimal) is a byte
describing the subunits with attention available
messages suppressed in the controller and a
status code indicating various states of drive
operation. Figure 8-9 shows the bits
of this byte field, and Table 8-12
describes the bits. In this example, the 00
indicates:
o  No subunits with attention available
   message suppressed in the controller
o  Drive normal operation

Retry/fail

This number (in hexadecimal) is a byte
containing one of two types of information
depending upon the status of the DF bit in the
Error field. The DF bit describes the drive
initialization process. The DF bit is a zero if
the drive initialization was successful. In this
case, the Retry/fail field contains the retry
count from the previous operation. For example,
a Seek operation required 14 retries to be
successful. If a GET STATUS command is
initiated, the Retry/fail field contains the
number 14.
The DF bit set indicates the drive
initialization failed, and therefore, the
Retry/fail contains a specific drive error code.
This error code is defined in the appropriate
drive service manual.
In this example, 00 indicates no retry count
exists for the previous operation. (The DF bit
is zero in the Error field.)

Extended Status

These bytes, in hexadecimal, contain the
extended status of the particular drive. (In
this example it is an RA81.) Refer to the
appropriate drive service manual for the meaning
of these bytes.

In this example, the extended status is:
o  88 - Controller command functional code
   last executed by the drive. (In this case,
   a GET SUBUNIT CHARACTERISTICS command.)
o  00 - Interface error status bits which are
   all reset.
o  03 - Low-order cylinder address bits of the
   last Seek operation.
o  00 - High-order cylinder address bits of
   the last Seek operation.
o  07 - The present group address.
o  4B - Error code (index pulse error)
   displayed by the drive LEDs during the
   execution of a drive-resident diagnostic.
o  1A - Error code (Servo fine positioning
   error) displayed on the operator control
   panel of the RA81.

Requestor #

This number, in decimal, is the number of the
requestor connected to the drive.

Drive port #

This number, in decimal, is the number of the
port on the requestor. (The ports are numbered 0
through 3.)

CX-1121A

Figure 8-6 Request Byte Field

Table 8-9 Request Byte Field Description

Bits

Description

OA

A logical one in this position indicates the drive is
unavailable to the controller. A logical zero indicates
the drive is available to the controller.

A logical one in this position indicates the drive
requires an internal readjustment. Some drives do not
use this bit.

DR

A logical one in this position indicates a request is
outstanding to load a diagnostic in the drive
microprocessor memory. A logical zero indicates no
diagnostic is being requested of the host system.

SR

A logical one in this position indicates the drive
spindle is up to speed. A logical zero indicates the
drive spindle is not up to speed.

EL

A logical one in this position indicates usable
information in the extended status area. A logical zero
indicates no information is available in the extended
status area.

PS

A logical one in this bit position indicates the drive
port select switch for this controller is pushed in
(selected). A logical zero indicates the switch is out.

RU

A logical one in this position indicates the RUN/STOP
switch is pushed in (RUN). A logical zero indicates the
switch is out (STOP).
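
The Request byte from Example 8-3 can be decoded as sketched below in Python. The bit positions are an assumption: they are chosen so that the value 1B yields the four conditions listed for it in Table 8-8, with the remaining bits placed in the order the rows appear in Table 8-9. Figure 8-6 remains the authority for the actual layout. The names REQUEST_BITS and decode_request_byte are hypothetical.

```python
# Assumed bit positions (bit 0 = least significant); check against Figure 8-6.
REQUEST_BITS = {
    0: "RUN/STOP switch in",
    1: "Port switch in",
    3: "Log information in extended area",
    4: "Spindle ready",
    5: "Diagnostic load requested",
    6: "Drive requires internal readjustment",
    7: "Drive unavailable to the controller",
}


def decode_request_byte(request_hex):
    """Return the conditions flagged in a hexadecimal Request byte."""
    value = int(request_hex, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError("Request field must be a single byte")
    return [text for bit, text in sorted(REQUEST_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # 1B is the Request value printed in Example 8-3.
    for condition in decode_request_byte("1B"):
        print(condition)
```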

CX-1122A

Figure 8-7 Mode Byte Field

Table 8-10 Mode Byte Field Description

Bits

Description

W4-W1

Logical ones in any of these four bit positions
represent the write-protect status for the subunit.
(For example, a 0001 indicates subunit 0 within the
selected drive is write-protected.)

A logical one in this position indicates the drive was
disabled by a controller error routine or diagnostic.
The Fault light is on when this bit is set. A logical
zero indicates the drive is enabled for communication
to a controller.

A logical one in this position indicates the diagnostic
cylinders on the drive can be accessed.

A logical one in this position indicates the drive can
be formatted.

A logical one in this position indicates the 576-byte
sector format is selected. A logical zero indicates
that the 512-byte sector format is selected.

CX-1123A

Figure 8-8 Error Byte Field

Table 8-11 Error Byte Field Description

Bits

Description

A logical one in this position indicates a drive error
has occurred and the drive FAULT lamp may be on.

A logical one in this position indicates an error
occurred in the transmission of a command between the
drive and the controller. The error could be a checksum
error or an incorrectly formatted command string.

PE

A logical one in this position indicates improper
command codes or parameters were issued to the drive.

DF

A logical one in this position indicates a failure in
the initialization routine of the drive.

A logical one in this position indicates a write-lock
error has occurred.

CX-1124A

Figure 8-9 Controller Byte Field

Table 8-12 Controller Byte Field Description

Bits

Description

S4-S1

This is a 4-bit representation of the subunits with
Attention Available messages suppressed in the
controller. The rightmost bit position represents
subunit 0. The leftmost bit position represents subunit
3.

If one of the bits is set, it indicates the controller
is not to interrupt the host CPU with an Attention
Available message when the specified subunit raises its
available real-time drive status line to the
controller. The S4 through S1 bits reflect the results
of a CHANGE CONTROLLER FLAGS command in which Attention
Available messages are not desired for certain
subunits.

C4-C1

This is a 4-bit drive status code indicating various
states of drive operation. At the present time, only
three codes are valid:
o  0000 - Drive normal operation
o  1000 - Drive is offline because it is under the
   control of a diagnostic
o  1001 - Drive is offline due to another drive
   having the same unit identifier (for example,
   serial number, drive type, class).
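
The two halves of the Controller byte can be separated as sketched below in Python. The sketch assumes that S4-S1 occupy the high-order four bits and C4-C1 the low-order four bits, following the order of the rows in Table 8-12; Figure 8-9 should be consulted for the actual layout. The names DRIVE_STATUS_CODES and decode_controller_byte are hypothetical.

```python
# The three drive status codes listed as valid in Table 8-12.
DRIVE_STATUS_CODES = {
    0b0000: "Drive normal operation",
    0b1000: "Drive offline: under the control of a diagnostic",
    0b1001: "Drive offline: another drive has the same unit identifier",
}


def decode_controller_byte(controller_hex):
    """Return (suppressed_subunits, drive_status) for a Controller byte.

    Assumption: S4-S1 are the high-order nibble and C4-C1 the low-order
    nibble. Within S4-S1, the rightmost bit represents subunit 0.
    """
    value = int(controller_hex, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError("Controller field must be a single byte")
    subunit_mask = value >> 4
    status_code = value & 0x0F
    suppressed = [n for n in range(4) if subunit_mask & (1 << n)]
    status = DRIVE_STATUS_CODES.get(status_code, "Code not listed as valid")
    return suppressed, status


if __name__ == "__main__":
    # 00 is the Controller value printed in Example 8-3.
    print(decode_controller_byte("00"))
```

For the value 00 in Example 8-3, no subunits have Attention Available messages suppressed and the drive status code is the normal-operation code, which matches the interpretation given in Table 8-8.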

Following is an alphabetical listing of SDI type errors with an
explanation of each.
NOTE
When the HSC marks the drive as inoperative, it
places the drive in a state of Unit-Offline with
a substate of unit-inoperative relative to this
HSC.

Controller-Detected Transmission or Time Out Error
Message Error Level:

Message Description: The controller detected an invalid
framing code or a checksum error in a Level 2 response from
the SDI drive.
Field Service Action: Determine if this error is occurring
on more than one drive, which may indicate a K.sdi problem.
However, if it is occurring on only one drive, the SDI cable
or the drive may be at fault. Refer to the appropriate
drive service manual for assistance with drive FRUs.
Possible FRUs:
1. SDI cable
2. Drive SDI interface module
3. K.sdi module
4. SDI transition bulkheads

Drive Clock Dropout
Message Error Level:

Message Description: Either data or state clock was missing
when it should have been present. This is detected by the
requestors connected to this SDI drive.
Field Service Action: Determine if this error is occurring
on more than one drive, which may indicate a K.sdi problem.
However, if it is occurring on only one drive, the SDI cable
or the drive may be at fault. If other errors surround or
precede this one, those errors may have sequentially
triggered this error. Refer to the appropriate drive
service manual for assistance with drive FRUs.
Possible FRUs:
1. SDI cables
2. Drive SDI interface module
3. K.sdi module
4. SDI transition bulkheads

Drive Inoperative
Message Error Level:

Message Description: The drive is generating so many
unrecoverable errors that it appears inoperative. Once the
HSC reports the drive as inoperative, the drive state clocks
must transition to return the drive to an operational state.
Field Service Action: Refer to the drive service manual.
Run ILDISK to help isolate failure between HSC and drive.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module
3. SDI cables

Drive-Detected Error
Message Error Level:

Message Description: The controller received a GET STATUS
command or unsuccessful response with EL set, or the
controller received a response with the D flag set and does
not support automatic diagnosis for that SDI drive type.
Field Service Action: Determine if the drive has a hard
fault (fault light on, and an error code in the drive
microprocessor LEDs). Refer to the drive service manual for
assistance with drive internal diagnostics and LED error
codes. Decode remaining error message bytes for more
detailed error information. If error message decoding does
not clearly indicate a drive error, move the drive to
another requestor (or requestor port) to help isolate
failure between HSC and drive.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cables
3. SDI bulkheads

Drive-Requested Error Log (EL Bit Set)
Message Error Level:

Message Description: The controller requested a drive error
log because the drive returned a status message with the EL
bit set in the request byte field.
Field Service Action: Determine what drive-detected error
(previous error description) caused the drive to request a
drive error log by finding the error in the error log
report. Also decode remaining fields in the drive status
response of this error message and any preceding errors on
the unit.
Possible FRUs:

Drive modules (Refer to the drive service manual.)

Lost Read/Write Ready
Message Error Level:

Message Description: Read/Write Ready drops when the
controller attempts to initiate a transfer or at the
completion of a transfer with Read/Write Ready previously
asserted. This usually results from a drive-detected
transfer error, where additional error log messages
containing the drive-detected error subcode may be
generated.
Field Service Action: Look for surrounding drive-detected
errors and/or associated disk transfer error log. Move
suspect drive to another port or data channel to help
isolate failure, as this error may be caused by any of
several communication components.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module
3. SDI cables
4. SDI transition bulkheads

Lost Receiver Ready
Message Error Level:

Message Description: Receiver Ready was negated when the
controller attempted to initiate an SDI disk transfer or did
not assert at the completion of a transfer. This includes
all cases of the controller timeout expiring for a transfer
operation (LEVEL 1 REAL TIME command).
Field Service Action: Look for a probable drive error or a
possible SDI cable problem. Move suspect drive to another
port or data channel to help isolate failure, as this error
may be caused by any of several communication components.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module
3. SDI cables
4. SDI transition bulkheads

Position or Unintelligible Header Error
Message Error Level:

Message Description: The drive reported a Seek operation
was successful by returning successful status in response to
the INITIATE SEEK SDI command and asserting R/W Ready when
on the desired cylinder. However, the controller determined
the drive had positioned itself to an incorrect cylinder.
The header read from the drive is consistent (three out of
four header copies are identical) but does not match the
desired target header value. The error is considered
recoverable if the Error Flags bit indicates success or a
subsequent replacement succeeds.
Field Service Action: The drive Servo system or media is
probably at fault in this case. If one is available, move
the drive to a different requestor. A drive failure is
indicated if the failure persists on the new requestor.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module

Pulse or Parity Error
Message Error Level:

Message Description: The controller detected a pulse error
on either the SDI drive state or data line, or the
controller detected a parity error in a drive state frame.
The HSC does an SDI GET STATUS command, reports any errors
from it, and then clears those errors, if possible. After
this, the HSC retries the original command up to two more
times before considering the error unrecoverable.
Field Service Action: If the error is reported on more than
one drive, a K.sdi problem is indicated. If the error is
reported on only one drive, an SDI cable or drive problem is
indicated.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable
3. SDI transition bulkhead
4. K.sdi module

SI Clock Resumption Failed After INIT
Message Error Level:

Message Description: The drive clock did not resume
following a controller attempt to initialize the SDI drive.
This implies the drive encountered a fatal initialization
error. Closely examine error logs for surrounding disk
errors, as this error may be the result of a
previously-reported drive error.
Field Service Action: Determine if this drive has
encountered any other related problems which may be found in
an appropriate error log report. Also, this error may be
due to an SDI cable problem.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable

SI Clock Persisted After INIT
Message Error Level:

Message Description: The drive clock did not cease
following a controller attempt to initialize the SDI drive.
This implies the drive did not recognize the initialization
attempt. This error condition causes the HSC to retry the
INIT command eight more times before marking the drive
inoperative.
Field Service Action: Determine if this drive has
encountered any other related problems which may be entered
in an appropriate error log report. Also, this error may be
due to an SDI cable problem. Closely examine error logs for
surrounding disk errors, as the error may be a result of a
previously-reported drive error.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable

SI Command Timeout
Message Error Level:

Message Description: The controller timeout expired for
either a Level 2 exchange or the assertion of READ/WRITE
READY after an Initiate Seek. The HSC retries the command
three more times, reinitializing the SDI drive each time.
If the error persists on a single SDI level 2 exchange, the
drive is marked inoperative.
Field Service Action: Determine if this drive has
encountered any other related problems which may be found in
an appropriate error log report. Also, this error may be
due to an SDI cable problem. Closely examine error logs for
surrounding disk errors, as the error may be a result of a
previously-reported drive error.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable

Ensure the drive and all HSC modules are at the latest
revision levels.

SI Receiver Ready Collision
Message Error Level:

Message Description: This error occurs when the drive fails
to follow the SDI protocol during SDI command/reception.
For example, the controller sends the drive a command,
asserts Controller Receiver Ready, and waits for the SDI
response. The following lists the possible drive operations
that lead to this error:
1. The drive fails to deassert Drive Receiver Ready. In
   this case, the drive indicates it did not receive the
   command.
2. The drive deasserts Drive Receiver Ready and then
   reasserts it before sending a proper SDI response. In
   this case, the drive believes it has sent a response and
   is indicating so by reasserting Drive Receiver Ready,
   yet the controller has never received the response.

The HSC K.sdi detects this error. The HSC functional code
does an SDI GET STATUS command and clears the drive of any
errors found. The original command is then retried. This
cycle is repeated twice before the drive is initialized by
the HSC, and the entire operation is done two more times.
If the failure persists, the drive is marked inoperative.
Field Service Action: Determine if this drive has
encountered any other related problems which may be found in
an appropriate error log report. Also, this error may be
due to an SDI cable or SDI transceiver/encoder/decoder
problem. Closely examine error logs for surrounding disk
errors, as this error may be the result of a
previously-reported drive error.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable
3. K.sdi module

SI Response Length or Opcode Error
Message Error Level:

Message Description: A Level 2 response from the drive had
correct framing codes and checksum but was not a valid
response within the constraints of the SI protocol. The
response had an invalid opcode, was an improper length, or
was not a possible response in the context of the exchange.
The HSC K.sdi detects this error. The HSC functional code
does an SDI GET STATUS command and clears the drive of any
errors found. The original command is then retried. This
cycle is repeated twice before the drive is initialized by
the HSC, and the entire operation is done two more times.
If the failure persists, the drive is marked inoperative.
Field Service Action: Determine if the drive has
experienced other similar errors. Closely examine error
logs for surrounding disk errors, as this error may be the
result of a previously-reported drive error.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module

SI Response Overflow
Message Error Level:

Message Description: A drive sent back more frames than the
reception buffer could hold. This can be caused by a hung
drive microdiagnostic or a malfunctioning K.sdi.
Field Service Action: Determine if the drive is failing in
other ways, indicating a drive problem. If not, the K.sdi
may be the more likely cause.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module

8.4.2.6 Disk Transfer Errors - Disk transfer errors are either
data or media format type errors. Example 8-4 shows an example
Disk Transfer error printout, and Table 8-13 describes the
various fields of the printout.

Example 8-4 Disk Transfer Error Printout

ERROR-E SEVEN Symbol ECC Error at 27-Mar-1985 12:15:15.00
   Command Ref #      50400015
   RA81 unit #        120.
   Err Seq #          9.
   Error Flags        E0
   Event              01C8
   Recovery level     0.
   Recovery count     0.
   LBN                426978
   Orig err flags     100020
   Recovery Flags     000003
   Lvl A retry cnt    1.
   Lvl B retry cnt    0.
   Buffer addrs       143022
   Source Req.        5.
   Detecting Req.     5.
ERROR-I End of error.

Table 8-13 describes the fields in a disk transfer error message
not described in Table 8-5. Unless otherwise specified, all
fields in this table are shown in decimal numbers. These fields
are specific to an RA81 disk and may not be the same for other
RAXX type drives.

Table 8-13

Disk Transfer Error Printout Field Description

Field

Description

RA81 unit #

This is the number of the unit the error log
message relates to, or is zero if the message
does not relate to a specific unit. In this
example, the RA81 indicates the drive is an RA81
and is unit 120.

Recovery level

This number indicates the error recovery level
used for the most recent transfer attempt by the
unit. In this example, the 0 indicates it used
error recovery level 0. An RA81 only has a
recovery level of 0 (recalibration).

Recovery count

This number indicates the number of times the
recovery level was tried. In this example, the 0
indicates the recovery level was not retried.

LBN

This number indicates the logical block number.
In this example, the LBN is 426978.

Orig err flags

This number (octal) indicates the original
errors associated with this error. Table
8-14 describes the bits associated
with this field. In this example, the 100020
indicates:
o  ECC error
o  EDC error

Recovery flags

This number (octal) indicates the recovery actions
the software processes should take to recover
from this error. Table 8-15 describes
the bits associated with this field. In this
example, the 000003 indicates:
o  An LBN should be replaced.
o  The current error should be logged on the
   console and to the host if a connection is
   present.

Lvl A retry cnt

This number indicates the number of times the
HSC attempted the Level A recovery routines.
These routines are those not requiring any
extensive SDI exchanges as part of the recovery
sequence. In this example, the 1 indicates the
ECC error correction was completed in the HSC
without going over the SDI.

Lvl B retry cnt

This number indicates the number of times the
HSC attempted the Level B recovery routines.
These routines require extensive SDI exchanges
as part of the recovery sequence. In this
example, the 0 indicates no Level B recovery was
attempted.

Buffer addrs

This number (octal) is the address of the HSC
internal data buffer associated with this error.
In this example, the buffer address is 143022.

Source Req.

This number is the requestor that filled the
buffer with data. In this example, the 5
indicates the source requestor was requestor
number 5. A requestor of 1 in this field would
indicate a disk write operation. All other
values would indicate a disk read operation.

Detecting Req.

This number is the requestor that detected the
error. In this example, the 5 indicates
requestor number 5 detected the ECC error.

Table 8-14 shows definitions of the original error flags and
Table 8-15 defines the recovery flags.

Table 8-14  Original Error Flags Field Description

Bits        Mask      Definition
            (Octal)

15          100000    ECC error
14          040000    SERDES overrun error
13          020000    SDI RESPONSE/DATA line pulse error
12 and 11   014000    Suspected position error - low header mismatch
12          010000    Header sync timeout
11          004000    Header compare error - compare-64 performed
10          002000    Data sync timeout
09          001000    Drive clock timeout
08          000400    SDI STATE line pulse error
07          000200    Data Bus overrun
06          000100    Data Memory parity error
05          000040    Data Memory NXM
04          000020    EDC error
03 and 02   000014    READ/WRITE READY down at end of sector
03          000010    Lost READ/WRITE READY before transfer began
02          000004    Lost RECEIVER READY before transfer began
01          000002    Forced error (EDC ones complement of correct EDC)
00          000001    Drive inoperative

Table 8-15  Recovery Flags Field Definition

Bit   Mask      Definition
      (Octal)

05    000040    Indicates the error count reported by the ILEXER
                should be updated
04    000020    Indicates an error log message has already been
                generated for the current error
03    000010    Indicates an entry for the desired logical block
                number was found
02    000004    Indicates revectoring and replacement should be
                suppressed
01    000002    Indicates the current error should be logged on
                the console and to the host if a connection is
                present
00    000001    Indicates the logical block should be replaced
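
The octal flag fields can be decoded mechanically. The following Python sketch is illustrative only and is not part of the HSC software: the function names and the shortened recovery flag names are hypothetical, and the masks are transcribed from Tables 8-14 and 8-15. Checking the two-bit entries (014000 and 000014) before the single-bit entries they overlap is an assumption about how those table rows are meant to be read.

```python
# Masks transcribed from Table 8-14 (octal). Two-bit entries are listed
# separately; giving them precedence over the single bits is an assumption.
ORIG_ERR_COMBINED = [
    (0o014000, "Suspected position error - low header mismatch"),
    (0o000014, "READ/WRITE READY down at end of sector"),
]
ORIG_ERR_SINGLE = [
    (0o100000, "ECC error"),
    (0o040000, "SERDES overrun error"),
    (0o020000, "SDI RESPONSE/DATA line pulse error"),
    (0o010000, "Header sync timeout"),
    (0o004000, "Header compare error"),
    (0o002000, "Data sync timeout"),
    (0o001000, "Drive clock timeout"),
    (0o000400, "SDI STATE line pulse error"),
    (0o000200, "Data Bus overrun"),
    (0o000100, "Data Memory parity error"),
    (0o000040, "Data Memory NXM"),
    (0o000020, "EDC error"),
    (0o000010, "Lost READ/WRITE READY before transfer began"),
    (0o000004, "Lost RECEIVER READY before transfer began"),
    (0o000002, "Forced error"),
    (0o000001, "Drive inoperative"),
]
# Masks transcribed from Table 8-15 (octal); names are shortened paraphrases.
RECOVERY_FLAGS = [
    (0o000040, "Update ILEXER error count"),
    (0o000020, "Error log message already generated"),
    (0o000010, "Entry for desired LBN found"),
    (0o000004, "Suppress revectoring and replacement"),
    (0o000002, "Log error on console and to host"),
    (0o000001, "Replace logical block"),
]


def decode_orig_err_flags(text):
    """Decode an 'Orig err flags' value given as an octal string."""
    value = int(text, 8)
    names = []
    for mask, name in ORIG_ERR_COMBINED:
        if (value & mask) == mask:
            names.append(name)
            value &= ~mask  # do not report the component bits again
    for mask, name in ORIG_ERR_SINGLE:
        if value & mask:
            names.append(name)
    return names


def decode_recovery_flags(text):
    """Decode a 'Recovery Flags' value given as an octal string."""
    value = int(text, 8)
    return [name for mask, name in RECOVERY_FLAGS if value & mask]


if __name__ == "__main__":
    print(decode_orig_err_flags("100020"))
    print(decode_recovery_flags("000003"))
```

For the values in Example 8-4, 100020 decodes to the ECC error and EDC error bits, and 000003 decodes to the log and replace bits, which agrees with the readings given in Table 8-13.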

The following is an alphabetical listing of the disk transfer
errors with an explanation of each error.

Data Synch Not Found
Message Error Level:

Message Description: This error occurs when the SERDES 16
does not detect the SYNC character (26BC hex) immediately
preceding read data from the disk drive. The K.sdi has
already read a valid header and is awaiting the Data SYNC
character.
Field Service Action: Determine if additional errors occur
from this drive to indicate a drive or media error. If not,
the problem is probably the K.sdi module.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  K.sdi module
3.  SDI interface

ECC Errors
Message Error Level:

Message Description: The following description covers all
of the ECC error types:
o  Uncorrectable ECC Error
o  One Symbol ECC Error
o  Two Symbol ECC Error
o  Three Symbol ECC Error
o  Four Symbol ECC Error
o  Five Symbol ECC Error
o  Six Symbol ECC Error
o  Seven Symbol ECC Error
o  Eight Symbol ECC Error

ECC errors occur when the data read from the disk does not
agree with the data written. When data is written to the
disk, an ECC is calculated (by the R-S GEN) and appended to
the end of the sector. When the data is subsequently read
from the sector, the ECC is revalidated. The two possible
results are:
1.  The data error falls within the ECC error correction
    capability (fewer than nine 10-bit symbols in error) and
    data correction is performed. In this case, no data
    errors are shown.

2.  The data error does not fall within the error correction
    capability of the ECC, and the transfer is retried
    according to drive-dependent parameters. If all of the
    retries fail, an uncorrectable ECC error has occurred, and a
    bad block is reported via an end packet.
NOTE
An uncorrectable ECC error can also occur if the
Suppress Error Correction modifier is chosen and the
transfer encounters any type of ECC error.
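
The relationship between the number of symbols in error and the ECC message type can be sketched as follows. This Python fragment is a simplified illustration, not HSC code: the function name and parameters are hypothetical, it assumes the "N Symbol" message names correspond directly to the count of 10-bit symbols in error, and it models only the final outcome after any drive-dependent retries.

```python
SYMBOL_NAMES = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]


def ecc_message(symbols_in_error, suppress_correction=False):
    """Return the ECC message type for a sector read, or None if no error.

    symbols_in_error:    count of 10-bit symbols found in error (assumed input)
    suppress_correction: True if the Suppress Error Correction modifier is set
    """
    if symbols_in_error <= 0:
        return None
    # With correction suppressed, any ECC error is reported as uncorrectable.
    # Nine or more symbols in error exceeds the correction capability.
    if suppress_correction or symbols_in_error > len(SYMBOL_NAMES):
        return "Uncorrectable ECC Error"
    return f"{SYMBOL_NAMES[symbols_in_error - 1]} Symbol ECC Error"


if __name__ == "__main__":
    print(ecc_message(7))
    print(ecc_message(9))
    print(ecc_message(1, suppress_correction=True))
```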

Field Service Action: Determine if the ECC errors are just
normal occurrences or if a very large number of blocks is
being replaced. The latter indicates the drive may have a
read path problem.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  K.sdi module

Forced Error
Message Error Level:

Message Description: The sector was written with a Force
Error modifier indicating this is a replaced image and the
original data could not be read correctly using retries and
the ECC algorithms.
Field Service Action: Back up the media.
Possible FRUs: None

Header Error
Message Error Level:

Message Description: The subsystem reads an invalid or
inconsistent header for the requested sector. The header is
considered invalid if all of the following are true:
o  The header is consistent (three out of four copies
   match).
o  Two out of four of the low-word header values match the
   desired target header low-word value.
o  The high-word header values do not match the respective
   target header values.

For recoverable errors, this code implies a retry of the
transfer read a valid header. For unrecoverable errors,
this code implies the subsystem attempted nonprimary
revectoring and determined the requested sector is not
revectored. Causes of an invalid header include header
missync, header sync timeout, and an unreadable header.
Field Service Action: Determine if this error is repetitive
on this unit, indicating deteriorating media.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  K.sdi module

RCT Corrupted Error
Message Error Level:

Message Description: The RCT search algorithm encountered
an invalid RCT entry. The subcode may be returned under the
following conditions:
o  During replacement of a block
o  During nonprimary revectoring of a block
o  When bringing a unit online

Field Service Action: Determine if this error is repetitive
for this unit, possibly indicating defective media or a drive
read path failure.
Possible FRUs: Drive modules (Refer to drive service manual.)

8.4.3 Bad Block Replacement Errors (BBR)
Another type of error displayed on the console terminal is for a
bad block replacement request. The bad block replacement request
is the result of one of the following errors:

o  Data sync timeout
o  ECC symbol error above the threshold
o  Header compare error
o  Header sync timeout
o  Loss of R/W Ready at end of read from disk (SERDES read)
o  Uncorrectable ECC

Example 8-5 shows a bad block replacement message. This message
reports completion, successful or unsuccessful, of a bad block
replacement attempt. A message is generated regardless of the
success or failure of the replacement attempt. Refer to Table
8-16 for a definition of the fields in the message explicit to
this type of message. Table 8-17 describes replace flag bits.
Fields generic to all MSCP/TMSCP error messages are described in
Table 8-5.

Example 8-5  Bad Block Replacement Error Printout

ERROR-W Bad Block Replacement (Success) at 18-Dec-1985 18:05:37.1
   Command Ref #      B8590012
   RA60 Unit #        251
   Err Seq #          2
   Error Flags        80
   Event              0014
   Replace Flags      8000
   LBN                205
   Old RBN            0
   New RBN            5
   Cause Event        00E8
ERROR-I End of error

Table 8-16 defines BBR error fields not previously described in
Table 8-5. The replace flags are defined in Table 8-17.
Table 8-16

Bad Block Replacement Error Printout Field Definition

Field

Description

Replace Flags

This number, in hexadecimal, indicates bit flags
used to report in detail the outcome of the bad
block replacement attempt. (Refer to Table
8-17). In this example, the 8000
indicates the block was verified as bad.

LBN

This number, in decimal, is the logical block
number that is the target of the replacement. In
this example, the LBN is 205.

Old RBN

This number, in decimal, indicates the RBN the
bad LBN was formerly replaced with, or zero if
it was not formerly replaced. In this example,
the 0 indicates it was not formerly replaced.

New RBN

This number, in decimal, indicates the RBN the
bad LBN was replaced with, or is zero if no
actual replacement was attempted. In this
example the new RBN is 5.

Cause Event

This number, in hexadecimal, is the event code
from the original error that caused the
replacement to be attempted. The number is zero
if that event code is not available. (Refer to
Appendix C.) In this example, the 00E8 indicates
an uncorrectable ECC error caused the bad block
replacement.

Table 8-17  Replace Flags Bit Description

Bit      Bit Mask   Replace Flag Bit Definition
Number   (Hex)
------------------------------------------------------------------
15       8000       Replacement Attempted
                    This bit is set if the suspect bad block
                    indeed tested bad during the initial stages
                    of the replacement process. If not set, the
                    suspect block did not check bad and no
                    replacement was completed.

14       4000       Forced error
                    The data from the suspect bad block could
                    not be corrected or obtained without error.
                    The Forced Error Indicator will be written
                    to the replacement block along with the bad
                    data from the block that was replaced. The
                    user data from the bad block is read with a
                    forced error when accessed. If this
                    condition occurs frequently on a specific
                    drive, a closer analysis of the drive
                    for possible problems is recommended.

13       2000       Nonprimary revector
                    This bit is set if the replacement process
                    was accomplished and required putting the
                    bad block's data into a replacement block
                    that is not the bad block's primary RBN.

12       1000       Replace command failure
                    This bit is set during the replacement
                    process if the status coming back from the
                    execution of the MSCP REPLACE command is not
                    successful. If this occurs, the drive should
                    not be used until it can be reformatted.

11       800        RCT inconsistent
                    This bit is set if the Replacement Control
                    Tables are not usable. The drive should not
                    be used until it can be reformatted.

10       400        Bad replacement block
                    This bit is set if the bad block reported is
                    a replacement block. The replacement block
                    can be replaced just like any LBN.
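
The Replace Flags value can be decoded bit by bit in the same way as the other flag fields. The Python sketch below is illustrative only; the function name is hypothetical and the masks are transcribed from Table 8-17.

```python
# Hexadecimal masks transcribed from Table 8-17.
REPLACE_FLAGS = [
    (0x8000, "Replacement attempted"),
    (0x4000, "Forced error"),
    (0x2000, "Nonprimary revector"),
    (0x1000, "Replace command failure"),
    (0x0800, "RCT inconsistent"),
    (0x0400, "Bad replacement block"),
]


def decode_replace_flags(text):
    """Decode a 'Replace Flags' value given as a hexadecimal string."""
    value = int(text, 16)
    return [name for mask, name in REPLACE_FLAGS if value & mask]


if __name__ == "__main__":
    # Value taken from Example 8-5.
    print(decode_replace_flags("8000"))
```

For the 8000 shown in Example 8-5, only bit 15 is set.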

The following is an alphabetic listing of the BBR errors with an
explanation of each error.

Bad Block Replacement (Block OK)
Message Error Level: Warning

Message Description: Block tested OK - not replaced.

Field Service Action: Monitor drive for the frequency of
these reports. If frequency increases, troubleshoot the
error that triggers BBR.
Possible FRUs: Refer to the Cause Event error message field in
Table 8-16.

Bad Block Replacement (Drive Inoperative)
Message Error Level: Warning

Message Description: Replacement failure--drive access
failure. One or more transfers specified by the replacement
algorithm failed. If necessary and possible, write-protect
the drive and perform a volume backup immediately.
Field Service Action: Drive should be tested further. Move
the drive to another K.sdi (or to just another K.sdi port)
if available. If the problem persists, failure is most
likely in the drive.
Possible FRUs: Drive module (Refer to drive service manual.)

Bad Block Replacement (RCT Inconsistent)
Message Error Level: Warning

Message Description: Replacement failure--the RCT table is
not usable.

Field Service Action: Drive media should not be used until
replaced or verified as good. If necessary, write-protect
this drive, and have the customer perform a volume backup
immediately. Further testing of drive may be necessary.
Possible FRUs: Drive module (Refer to drive service manual.)

Bad Block Replacement (REPLACE Failed)
Message Error Level: Warning

Message Description: Replacement failure - REPLACE command
or its analogue failed. The status returned from the
replacement process indicates the command was not
successful.
Field Service Action: Drive media should not be used until
it is replaced or verified as good. If necessary,
write-protect this drive and have the customer perform a
volume backup immediately. Further testing of drive may be
necessary.
Possible FRUs: Drive module (Refer to drive service manual.)

Bad Block Replacement (Success)
Message Error Level: Warning

Message Description: The bad block was successfully
replaced.

Field Service Action: Monitor drive for the frequency of
these reports. If frequency increases, troubleshoot the
error triggering BBR.
Possible FRUs: Refer to the Cause Event error message field in
Table 8-16.

8.4.4 TMSCP-Specific Errors
The Tape Mass Storage Control Protocol (TMSCP) error messages
printed out at the console terminal are one of the following
types:

o  STI Communication or Command Errors
o  STI Formatter Error Log Errors
o  STI Drive Error Log Errors
o  Controller Errors (Section 8.4.1)

8.4.4.1 STI Communication Or Command Errors - The following is
an example of the console printout of an STI communication or
command Error.

Example 8-6 shows the printout and Table 8-18 explains the fields
additional to those defined in Table 8-5.

Example 8-6  STI Communication or Command Error Printout

ERROR-E Drive detected error at 6-Mar-1985 09:51:11.88
   Command Ref #      864E0004
   TA78 unit #        0
   Err Seq #          12
   Error Flags        40
   Event              00EB
   Position           13026
   GSS Text           02 00 00 00
                      05 00 00 00 00 00 00 00
ERROR-I End of error

Table 8-18

STI Communication or Command Error Printout Field
Description

Field

Description

Event

The number, in hexadecimal, identifies the
specific error or event reported by this
error log message. The event codes and
their meanings are shown in Appendix C. In
this example, the 00EB means drive
detected error.

Position

This is the last known tape position the
formatter received. This is given in gap
counts from BOT. In this example, the
number 13026 means 13026 gaps from BOT.

GSS Text

The GSS Text field is the response
received by the HSC from the formatter
when the HSC issues the GET SUMMARY STATUS
(GSS) and TOPOLOGY commands. The GSS text
in this example is 02 00 00 00 05 00 00 00
00 00 00 00. This means Level 2 protocol
error, Speed Management Enabled, Zero
Threshold. See Section 8.4.4.5
for details on field definitions and bit
decoding.

8.4.4.2 STI Formatter Error Log - The following is an example of
the console printout of an STI Formatter Error Log. Example 8-7
shows the printout, and Table 8-19 explains the fields not
previously defined in Table 8-5.

Example 8-7  STI Formatter Error Log Printout

ERROR-E Tape Formatter Requested Error Log at 30-Jan-1986 11:20:09.31
   Command Ref #      43900012
   TA81 unit #        95
   Err Seq #          47
   Format Type        08
   Error Flags        40
   Event              FF6C
   Position           1057
   Formatter E Log    40 00 00 81 00 00 00
                      01 98 72 00 00 00 00
                      C4 48 00 00
ERROR-I End of error.

Table 8-19

STI Formatter Error Log Field Description

Field

Description

Position

The last known tape position the formatter
received. This is given in gap counts from
BOT. In this example, the number 1057
means 1057 gaps from BOT.

Formatter E Log

See Table 8-20.

Table 8-20  Formatter E Log

Byte No.   Byte Data   Description

                       Formatter error

                       Data pulse parity error during data transfer

The information contained in these fields is product specific.
Refer to the appropriate drive manual for a description of the
remainder of the bytes.

8.4.4.3 STI Drive Error Log - The following is an example of a
console printout of an STI Drive Error Log. Example 8-8 shows
the printout, and Table 8-21 explains the fields additional to
those defined in Table 8-5. Table 8-22 describes the GEDS Text
field, and Table 8-23 describes the Drive Error Log field.

Example 8-8

STI Drive Error Log Printout

ERROR-I End of error

Table 8-21

STI Drive Error Log Field Description

Field

Description

Position

The last known tape position where the HSC
believes the tape drive is upon successful
completion of all outstanding commands.
This is given in gap counts from BOT. In
this example the number 1 means 1 gap from
BOT.

GEDS Text

See Table 8-22

Drive Error Log

See Table 8-23

See also Section 8.4.4.4 for field definitions and bit decoding.

Table 8-22  GEDS Text

Byte No.   Byte Data   Description

                       125 IPS tape drive

                       6250 BPI GCR encoding

                       MSCP unit number = 80.

                       GAP count = 1.

The information shown in Table 8-23 is product specific to the
TA78. See the TA78 service manual for details.

Table 8-23  STI Drive Error Log

Byte No.   Byte Data   Description

No SOFT error

Error ID number = 50.
Operational error
fault number indicates
possible cause general area
unknown fault number

RMC write fail bits

Statistics select clock stopped
STATUS VALID

NON-BOT cmd sts is ok

Last cmd sent to M8953 via
"RCMO" = normal NON-BOT read

Read channel AMTIE sts (CH 7:0)

End mark for read channels 7:0

Weak amplitude on parity bit
ECC corrected output (parity bit)

Read channel PE postamble detect

Data from read channels to ECC

CRC checker output bits

Corrected data (ECC to CRC)

Last STI level 2 cmd = 50(X)

Read channel illegal sts (CH 7:0)

2-TRK ECC performed on data
"AMTIE" during data of record

Channel 0 tie bus 2

Channel 3 tie bus 3

Tie bus

Tape unit bus line AMTIE 7:0

AMTIE parity
READ parity
WCS parity
Tape unit present

TU bus line read data 7:0

"CRC" to "WMC DR" bus

Tape unit selected

R/W Data, intermediate DRD bus

Unknown error code

"DR MBD" parity error

= 0F(X)

Byte count

65535.

PAD counter

65535.

"PE" write parity error
POWER OK

Tape unit serial #2597.

AMTIE threshold field = 2.
READ ENABLE
WRITE BIT 4

Online
READY ON
READY

Position 0 (normal)
125 IPS tape drive

8.4.4.4 Breakdown Of GEDS Text Field - The following is an
example of a tape drive related error message printed on the
HSC70 terminal.

Example 8-9  Tape Drive Related Error Message

ERROR-W Tape Drive Requested Error Log at 15-Aug-1984 18:43:05.80
   Command Ref        00001D8E
   TA78 unit          20.
   Err Seq            1.
   Error Flags        40
   Event              FF6B
   Position           2.
   GEDS Text          7D 02 0014 00000002
   Drive Error Log    00 00 00 00 C5 38 04 04
                      46 FF 07 FF 00 00 00 00
                      81 00 00 21 FF B0 00 04
                      00 00 80 FF 17 DE 00 08
                      00 00 21 FF FF 00 00 99
                      99 47 F4 E8 00 56 85 19
                      A2 0A 80 FF 17 DE

Both the GEDS Text and Drive Error Log portions of this message
result from a GET EXTENDED DRIVE STATUS command to the drive from
the HSC70. The Drive Error Log portion can be interpreted by
referencing the service manual for the appropriate tape drive.
(The preceding example is for a TA78 drive.)

Following is a breakdown of the information contained in the GEDS
Text field. The leftmost byte is referenced as the First Byte
and the rightmost byte as the Eighth Byte.
Bytes in the GEDS Text field are described as follows:
First Byte = Speed: The current speed setting of the drive; it is an
integer value (in hex) in inches per second (IPS)
rounded down to the nearest integer. For a totally
variable speed drive, the speed returned is the lower
bound on the range of permissible speeds. In the
example shown, this field contains a value of 7D, which
corresponds to 125 IPS.

Second Byte = Density: This is the current operating density of
the tape unit. Only one bit is set to indicate the
current operating density.
   04 = 6250 BPI
   02 = 1600 BPI
   01 = 800 BPI

Third and Fourth Bytes = Unit Number: These bytes contain the
drive unit number (hex).

Fifth through Eighth Bytes = Gap Count: The formatter's gap
count from the beginning of the tape to where the
tape drive is. The contents of this field may differ
from the Position field in this error message. The
Position field contains the HSC's gap count upon
successful completion of all outstanding
commands.
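
The byte breakdown above can be applied to a printed GEDS Text value with a few lines of code. The following Python sketch is illustrative only: the function and key names are hypothetical, and it assumes the multi-byte fields are printed most significant byte first, which is consistent with Example 8-9 (unit field 0014 hex for unit 20, gap count 00000002 for position 2). Only the 04 density code is mapped; other density codes are returned as raw values.

```python
# Density code taken from the description above; other codes are left
# undecoded in this sketch and reported by their raw value only.
DENSITY_NAMES = {0x04: "6250 BPI"}


def parse_geds_text(text):
    """Split a printed GEDS Text field (hex digits, spaces allowed) into fields."""
    raw = bytes.fromhex(text)  # ASCII whitespace between byte pairs is ignored
    if len(raw) != 8:
        raise ValueError("GEDS Text must contain exactly 8 bytes")
    density = raw[1]
    return {
        "speed_ips": raw[0],                             # first byte
        "density_code": density,                         # second byte
        "density": DENSITY_NAMES.get(density),           # None if not mapped
        "unit_number": int.from_bytes(raw[2:4], "big"),  # third and fourth bytes
        "gap_count": int.from_bytes(raw[4:8], "big"),    # fifth through eighth
    }


if __name__ == "__main__":
    # Value taken from Example 8-9.
    print(parse_geds_text("7D 02 0014 00000002"))
```

Applied to the value in Example 8-9, this gives a speed of 125 IPS, unit number 20, and a gap count of 2.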

8.4.4.5 Breakdown Of GSS Text Field - Following is another
example of a tape drive related error message printed at the
HSC70 console.

ERROR-E Drive detected error at 18-Aug-1984 12:05:34.82
   Command Ref        0346003
   TA78 unit          3.
   Err Seq            7
   Error Flags        40
   Event              00EB
   Position           0.
   GSS Text           02 20 00 00
                      28 00 00 00 00 00 14 00
ERROR-I End of error.

The HSC70 receives the GSS Text field of this error message
from the tape formatter when the HSC70 issues the GET SUMMARY
STATUS (GSS) and TOPOLOGY commands. The field is also the
unsuccessful response for all Level 2 commands. Following is a
breakdown of this response and an interpretation of the bits
contained in it.

+---+---+---+---+---+---+---+---+
|AF |A3 |A2 |A1 |A0 |OA |P  |   |  SUMMARY MODE BYTE 1
+---+---+---+---+---+---+---+---+
|FE |TE |PE |DF |   |   |   |   |  SUMMARY ERROR BYTE
+---+---+---+---+---+---+---+---+
|PS |EL |RP |RT |FD |   |   |   |  SUMMARY MODE BYTE 2
+---+---+---+---+---+---+---+---+
|C1 |C2 |C3 |C4 |C5 |C6 |C7 |C8 |  CONTROLLER BYTE
+---+---+---+---+---+---+---+---+
|TM |EOT|BOT|WL |OL |AV |MR |EL |  DRIVE 0 MODE BYTE
+---+---+---+---+---+---+---+---+
|DE |LP |PL |EX |DTE|SME|DI |ZT |  DRIVE 0 ERROR BYTE
+---+---+---+---+---+---+---+---+
|TM |EOT|BOT|WL |OL |AV |MR |EL |  DRIVE 1 MODE BYTE
+---+---+---+---+---+---+---+---+
|DE |LP |PL |EX |DTE|SME|DI |ZT |  DRIVE 1 ERROR BYTE
+---+---+---+---+---+---+---+---+
|TM |EOT|BOT|WL |OL |AV |MR |EL |  DRIVE 2 MODE BYTE
+---+---+---+---+---+---+---+---+
|DE |LP |PL |EX |DTE|SME|DI |ZT |  DRIVE 2 ERROR BYTE
+---+---+---+---+---+---+---+---+
|TM |EOT|BOT|WL |OL |AV |MR |EL |  DRIVE 3 MODE BYTE
+---+---+---+---+---+---+---+---+
|DE |LP |PL |EX |DTE|SME|DI |ZT |  DRIVE 3 ERROR BYTE
+---+---+---+---+---+---+---+---+

AF: Formatter Attention Asserted
A3: Drive 3 Attention Asserted
A2: Drive 2 Attention Asserted
A1: Drive 1 Attention Asserted
A0: Drive 0 Attention Asserted
AV: Drive Available to Formatter
BOT: Beginning of Tape
Cn: Controller Flags (C1 - C8) - currently not implemented
DE: Drive Error - asserted when any drive error not covered by
other status bits is detected.
DF: Formatter Diagnostic Failed
DI: Diagnostic Mode - when set, instructs the formatter to use
special internal algorithms to report imperfect performance.
D: Diagnostic Requested - asserted when the formatter is
requesting permission to execute a diagnostic.

DTE: Data Transfer Error - asserted when any error occurs which
prevents a data transfer from completing successfully.
EL: Error Logging Request - asserted by either the drive or
formatter when error logging information is available.
EOT: End of Tape - asserted when the tape is positioned at or
past the end of tape marker.
EX: Exception condition - asserted whenever the formatter
encounters TM, BOT, or EOT during a data transfer operation or
when EL is raised during a data transfer.
FD: Retry Bit - Failure / Direction - is asserted during error
recovery to indicate the direction of a retry or to indicate a
failing operation. If RP = 0 and RT = 1, then FD = direction to
transfer. FD = 0 means transfer in the same direction as
original operation; FD = 1 means transfer in the opposite
direction of original operation. If RP = 1 and RT = 0, then FD
indicates success or failure of operation. FD = 0 means the
retry sequence succeeded; FD = 1 means the retry sequence failed.
FE: Formatter Error - asserted on formatter errors not covered
by the TE, PE, or DF bits. These errors include fatal errors
that may turn on the drive fault indicator.
LP: Lengthy Operation in Progress - asserted when a rewind
operation (including the optional data security erase portion of
a rewind) is in progress.
MR: Maintenance Mode Request - asserted when the drive is put
into maintenance mode. On the TA78, this is accomplished via a
thumbwheel switch on the operator panel.
OA: Formatter Online or Available (for the TOPOLOGY command).
OL: Drive Online to Formatter.
PB: Active Port Button - PB = 0 if formatter is connected to
the controller through port A; PB = 1 if formatter is connected
to the controller through port B.
PE: Level 2 Protocol Error - asserted when a protocol error is
detected while processing a Level 2 command.
PL: Position Lost - asserted when the formatter is not certain
of the current tape position.
P: Port Switch - asserted when the port switch is enabled.
RP: Request Position - used by the formatter along with RT to
inform the controller of the next step in the error recovery
sequence.

o  Retryable   RP = 1, RT = 1
o  Transfer    RP = 0, RT = 1
o  Done        RP = 1, RT = 0
o  No Error    RP = 0, RT = 0

RT: Request Transfer - refer to the explanation for RP.
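
The RP, RT, and FD combinations described above can be summarized in a small lookup. This Python sketch is illustrative only; the function names and returned strings are hypothetical, and it covers just the combinations the text defines (FD is given no meaning for the Retryable and No Error states).

```python
# (RP, RT) pairs as listed under the RP definition.
RETRY_STATES = {
    (1, 1): "Retryable",
    (0, 1): "Transfer",
    (1, 0): "Done",
    (0, 0): "No Error",
}


def retry_state(rp, rt):
    """Name the error recovery step indicated by the RP and RT bits."""
    return RETRY_STATES[(rp, rt)]


def interpret_fd(rp, rt, fd):
    """Interpret the FD bit for the RP/RT combinations where it is defined."""
    if (rp, rt) == (0, 1):
        # FD gives the transfer direction relative to the original operation.
        return "same direction" if fd == 0 else "opposite direction"
    if (rp, rt) == (1, 0):
        # FD reports the outcome of the retry sequence.
        return "retry sequence succeeded" if fd == 0 else "retry sequence failed"
    return None  # FD meaning is not described for the other combinations


if __name__ == "__main__":
    print(retry_state(0, 1), "-", interpret_fd(0, 1, 1))
    print(retry_state(1, 0), "-", interpret_fd(1, 0, 0))
```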

SME: Speed Management Enabled - asserted whenever the formatter
may change the current operating speed of a particular drive at
any time (provided the changing of the drive operating speed is
transparent to the controller).
TE: Transmission Error - used by the formatter to report Level
0 and Level 1 STI errors. The formatter only reports Level 0
real-time state parity errors and Write/Cmd Data Line pulse
errors when a transfer is in progress. Level 1 errors are
framing errors, checksum errors, inappropriate value in data
field of real-time command, or a real-time command occurring in
an invalid context.
TM:

Tape Mark

WL:

Write Locked

ZT: Zero Threshold - instructs the formatter to change all
error thresholds from their default values to zero.
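
A drive mode or drive error byte from the GSS Text field can be expanded into the bit names defined above. The Python sketch below is illustrative only: the names are hypothetical, and it assumes the leftmost cell of each row in the byte diagram is the most significant bit. That assumption agrees with Table 8-18, where a drive error byte of 05 is read as Speed Management Enabled and Zero Threshold.

```python
# Bit names in diagram order, leftmost cell first (assumed to be bit 7).
DRIVE_MODE_BITS = ["TM", "EOT", "BOT", "WL", "OL", "AV", "MR", "EL"]
DRIVE_ERROR_BITS = ["DE", "LP", "PL", "EX", "DTE", "SME", "DI", "ZT"]


def decode_status_byte(value, names):
    """Return the names of the bits set in an 8-bit status byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("status byte must be in the range 00-FF hex")
    return [name for i, name in enumerate(names) if value & (0x80 >> i)]


if __name__ == "__main__":
    print(decode_status_byte(0x05, DRIVE_ERROR_BITS))
    print(decode_status_byte(0x28, DRIVE_MODE_BITS))
```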
A list of the tape errors and their meaning follows:
NOTE
Always verify proper dc voltage levels if the
indicated possible FRUs do not rectify failure.
Acknowledge Not Asserted At Start Of Transfer
Message Error Level: Error

Message Description: The HSC is ready to start a transfer
by sending the formatter a Level 1 command and the formatter
does not have ACKNOWLEDGE asserted.
Field Service Action: Check the formatter. This error may
indicate a formatter STI communications error, or if
preceded by tape transport errors, may be a result of a
transport failure.
Possible FRUs:
1.  Formatter
2.  K.sti module
3.  STI cable set

Buffer EDC Error
Message Error Level: Error

Message Description: The K.sti detected an EDC error on the
data buffer it read from memory on a Write operation.
Field Service Action: Test the data path from tape
formatter to HSC data memory.
Possible FRUs:
1.  Formatter
2.  M.std2 module
3.  K.sti module
4.  K.ci module

Cannot Clear Formatter Errors
Message Error Level: Error

Message Description: Issued a clear bit three times and
cannot clear the error.
Field Service Action: Check the formatter.
Possible FRUs:
1.  Formatter
2.  STI cable set
3.  K.sti module

Cannot Clear Drive Errors
Message Error Level: Error

Message Description: Issued a clear bit three times and
cannot clear the bit.
Field Service Action: Check the formatter and drive.
Further analysis of tape drive error log may be necessary.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  Formatter
3.  STI cable set
4.  K.sti module

Controller Detected Position Lost
Message Error Level: Error

Message Description: Information contained in the response
from the formatter to the HSC POSITION command did not match
the expected tape drive position.
Field Service Action: Check the formatter. If the error
persists, run the Inline Tape (ILTAPE) diagnostic to help
isolate to the FRU.
Possible FRUs: Formatter

Controller Transfer Retry Limit Exceeded
Message Error Level: Error

Message Description: The controller failed to perform the
command within the limit of allowable retries.
Field Service Action: Check the formatter and drive.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  Formatter

Could Not Complete Online Sequence
Message Error Level: Error

Message Description: Could not complete on-line sequence
due to a condition in the drive.
Field Service Action: Check the formatter and drive.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  Formatter

Could Not Get Extended Drive Status
Message Error Level: Error

Message Description: Issued the Get Status command and the
drive did not respond with the extended drive status.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Could Not Get Formatter Summary Status During Transfer Error
Recovery
Message Error Level: Error

Message Description: Issued the command and the formatter
did not respond with the formatter summary.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Could Not Get Formatter Summary Status While Trying To Restore
Tape Position
Message Error Level: Error

Message Description: Issued the command and the formatter
did not respond with the formatter summary status.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Could Not Position For Formatter Retry
Message Error Level: Error

Message Description: The HSC issued a command for data
recovery with position required, and the drive could not
complete the command.
Field Service Action: Check the media, drive, and formatter.
Possible FRUs:
1.  Drive modules (Refer to drive service manual.)
2.  Media
3.  Formatter

Could Not Set Byte Count
Message Error Level: Error

Message Description: Issued command to set byte count and
could not complete command.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Could Not Set Unit Characteristics
Message Error Level: Error

Message Description: Issued command to set unit
characteristics and could not complete command.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Data Ready Timeout
Message Error Level: Error

Message Description: The controller did not detect DATA
READY from the formatter within 5 ms after sending it a
Level 1 command.
Field Service Action: Check the STI path.
Possible FRUs:
1.  STI cable set
2.  K.sti module
3.  Formatter

Data Overflow Due To Pipeline Error
Message Error Level: Error

Message Description: No data buffers in HSC data memory
were available when the K.sti needed one during a data
transfer.
Field Service Action: Intermittent errors may indicate
excessive error recovery simultaneously occurring elsewhere
in the subsystem. Retry operation. Persistent failures may
indicate a tape data channel error during a read operation
or a K.ci problem during a tape write operation.
Possible FRUs:
1.  M.std2 module
2.  K.sti module
3.  K.ci module

Erase Command Failed
Message Error Level: Error

Message Description: Issued erase command and command
failed.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Erase Gap Command Failed
Message Error Level: Error

Message Description: Issued erase gap command and command
failed.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Formatter And HSC Disagree On Tape Position
Message Error Level: Error
Message Description: The formatter and the HSC disagree on
the position of the tape.
Field Service Action: Check the formatter.
Possible FRUs:
1. Tape drive module
2. Formatter
3. K.sti module

Formatter Detected Position Lost
Message Error Level: Error
Message Description: The formatter lost track of tape
position.
Field Service Action: Check media, drive, formatter.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. Formatter
3. Media

Formatter Requested Error Log
Message Error Level: Error
Message Description: The formatter detected an error and
set the EL bit to request that an error log be taken.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Formatter Retry Sequence Exhausted
Message Error Level: Error
Message Description: The formatter failed to complete a
command within the retry limit.
Field Service Action: Check the media, drive, formatter.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. Formatter
3. Media

Host Requested Retry Suppression On A Formatter Detected Error
Message Error Level: Error
Message Description: The formatter detected an error and
the host issued a command to suppress the retry of the
command that failed.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Host Requested Retry Suppression On A K.sti Detected Error
Message Error Level: Error
Message Description: An error was detected in the K.sti and
the host issued a command to suppress the retry of the
command that failed.
Field Service Action: Check the K.sti.
Possible FRUs: K.sti module

Lower Processor Error
Message Error Level: Error
Message Description: A bit was set in the lower processor
error register. Bits included in the lower processor error
register are Data Bus NXM, Data SERDES Overrun, Data Bus
Overrun, Data Bus Parity Error, Data Pulse Missing, and Sync
Real Time Parity Error.
Field Service Action: Check the K.sti.
Possible FRUs: K.sti module

Lower Processor Timeout
Message Error Level: Error
Message Description: The upper processor in the K.sti
detected that the lower processor had stopped and restarted
it.
Field Service Action: Check the K.sti.
Possible FRUs: K.sti module

Receiver Ready Not Asserted At Start Of Transfer
Message Error Level: Error
Message Description: The HSC is ready to start a transfer
by sending the formatter a Level 1 command, but the formatter
does not have Receiver Ready asserted.
Field Service Action: Check the formatter, cable, K.sti.
Possible FRUs:
1. Formatter
2. Cable
3. K.sti module

Record EDC Error
Message Error Level: Error
Message Description: On a read from tape operation, the EDC
calculated by the K.sti did not match the EDC generated by
the tape formatter.
Field Service Action: Check the formatter, cable, K.sti.
Possible FRUs:
1. Formatter
2. Cable
3. K.sti module

Retry Limit Exceeded While Attempting To Restore Tape Position
Message Error Level: Error
Message Description: A command was issued to restore the
tape position, and the command did not succeed within the
retry limit.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Reverse Retry Currently Not Supported
Message Error Level: Error
Message Description: Reverse Retry requests from the
formatter are currently not supported by the HSC.
Field Service Action:
Possible FRUs: None

Rewind Failure
Message Error Level: Error
Message Description: A command for a rewind was issued, and
the command failed (the controller received an unsuccessful
response from the formatter).
Field Service Action: Check the drive, formatter.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. Formatter

Tape Drive Requested Error Log
Message Error Level: Warning
Message Description: The drive detected an error condition
and set the EL bit for an error log to be taken.
Field Service Action: Check the drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

Topology Command Failed
Message Error Level: Error
Message Description: A topology command was issued and the
command failed.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Unable To Position To Before LEOT
Message Error Level: Error
Message Description: A command to position the tape before
LEOT was issued and could not be completed.
Field Service Action: Check the drive.
Possible FRUs: Drive module (Refer to drive service
manual.)

Unknown K.tape Error
Message Error Level: Error
Message Description: The ER bit was set but was undefined.
Field Service Action: Check the formatter.
Possible FRUs: Formatter

Word Rate Clock Timeout
Message Error Level: Error
Message Description: The K.sti detected the loss of clocks
from a drive during a transfer.
Field Service Action: Check the formatter, cable.
Possible FRUs:
1. Formatter
2. Cable

8.4.5 Out-of-Band Errors
The out-of-band errors are those not conforming to a specific
template format, as the MSCP and TMSCP errors do. The method of
reporting differs for individual errors.
NOTE
The HSC70 operating software allows the setting
of different levels of error reporting for
out-of-band type errors using the SETSHO utility.
These message error levels are Informational,
Warning, Fatal, Error, and Success. The
identifiers for the out-of-band errors are
followed by an I, W, F, E or S, depending on the
SETSHO value. The X in the following list
represents the message error level.
Out-of-band errors are further classified into five categories.
They are:
1. CI Errors - identified by HOST-X identifier printed
   prior to message
2. Load Device Errors - identified by SYSDEV-X identifier
   prior to message
3. Disk Functional Errors - identified by DISK-X identifier
   prior to message
4. Tape Functional Errors - identified by TAPE-X identifier
   prior to message
5. Miscellaneous (Software Inconsistencies) - identified by
   SINI-X identifier prior to message
NOTE
Some out-of-band errors report microcode-detected
error status codes within the printout. Refer to
Appendix D for a full list of all K.ci, K.sti,
and K.sdi microcode-detected errors.
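
The identifier scheme above can be sketched in a few lines of
Python. This is only an illustration for reading console logs
offline: the function and table names are hypothetical, and only
the five identifier prefixes and the five severity letters are
taken from the text above.

```python
# Prefixes and severity letters come from the category list and NOTE above.
# The names CATEGORIES, SEVERITIES and classify_identifier are hypothetical.
CATEGORIES = {
    "HOST": "CI error",
    "SYSDEV": "Load device error",
    "DISK": "Disk functional error",
    "TAPE": "Tape functional error",
    "SINI": "Miscellaneous (software inconsistency)",
}

SEVERITIES = {
    "I": "Informational",
    "W": "Warning",
    "F": "Fatal",
    "E": "Error",
    "S": "Success",
}


def classify_identifier(identifier):
    """Split an identifier such as 'SYSDEV-S' into (category, severity)."""
    prefix, dash, letter = identifier.strip().partition("-")
    if not dash or prefix not in CATEGORIES or letter not in SEVERITIES:
        raise ValueError("not an out-of-band identifier: %r" % identifier)
    return CATEGORIES[prefix], SEVERITIES[letter]
```

For example, classify_identifier("SYSDEV-S") yields the load
device category with the Success level.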

8.4.5.1 CI Errors - The following list shows each CI-detected
out-of-band error message and gives the error level, a message
description, field service action, and possible FRUs. The
messages are displayed in alphabetical order. When replacing
indicated FRUs, always verify correct dc voltage levels before
and after replacing a module.

Date/Time set by node nn
Message Error Level: Informational
Message Description: The HSC received either a START or
STACK (Start Acknowledge) message over the CI, and the date
and time were not set.
Field Service Action: None. This is a normal message as
part of establishing a VC between a host and an HSC.
Possible FRUs: None

VC open with node nn
Message Error Level: Informational
Message Description: A virtual circuit (VC) has been
established with the given node. The first VC established to
an HSC causes the ONLINE lamp on the HSC operator control
panel to light.
Field Service Action: None is required; this message is for
informational purposes only.
Possible FRUs: None

Node nn Cables have gone from uncrossed to crossed
Message Error Level: Warning
Message Description: This message occurs when an IDRSP (ID
Response) packet is received by an HSC in response to an
IDREQ (ID Request) message. Upon receiving an IDRSP packet,
the HSC checks two bits in the IDRSP message that indicate
which path the sending node used. If these two bits do not
indicate the same path the HSC received the message on, this
error occurs.
Field Service Action: Determine if the problem is broken
hardware in the HSC CI interface, broken hardware in the
host CI interface, or if the CI cables are crossed. Before
replacing any modules or cables, determine if the HSC is
encountering crossed paths to multiple nodes in the cluster
or only to a particular node. If the HSC is encountering
crossed paths to all nodes, the problem is probably in the
HSC or the cables. If it is encountering the problem to
only one node, it is likely a problem with that host node's
CI module set or the cables running from the host to the
star coupler.
Possible FRUs:
1. Cables physically connected wrong at HSC, Star Coupler,
   or host CI
2. Any of the three K.ci modules in the HSC (L0100, L0109,
   L0107)
3. Host CI module set
4. Duplicate node address settings

Node nn Cables have gone from crossed to uncrossed
Message Error Level: Error
Message Description: This message occurs only when the check
for a crossed path finds that a previously crossed path is no
longer crossed. More detail is covered in the description of
the preceding error message:
Node nn Cables have gone from uncrossed to crossed
Field Service Action: If both the "uncrossed to crossed" and
"crossed to uncrossed" messages are occurring, it is most
likely an indication of failing hardware, not a cable
problem. See the Field Service Action for the previous
message for more detail.
Possible FRUs:
1. CI cables, if a single message is displayed
2. K.ci module set, if both messages are displayed

Node nn Path (A or B) has gone from good to bad
Message Error Level: Warning
Message Description: K.ci microcode detects a hard
(nonrecoverable) transmission error on a previously good
path. Examples of hard transmission errors are:
1. Transmit Buffer Parity Error
2. Unrecoverable NAK
3. Unrecoverable NORSP
4. Transmitter Attention Timeout
Determining the reason for failure using the error message
is not possible.
Field Service Action: Before replacing any FRU, determine
if the message is occurring because of problems with one
host or problems with multiple hosts. If the problem
involves one host, it is most likely in the star coupler's
host side. If the problem involves multiple hosts, it is
most likely on the star coupler's HSC side. Also, if the
message occurs on both paths to a host, that host may have
been powered down, stopped, or may have crashed. Examine
the host console log and the error log to determine if
something did happen to the host.
Determining which error caused the bad path is not possible
except with the Transmit Buffer Parity Error (XBUF PE), which
prints as an MSCP type message.
Possible FRUs:
1. CI cable
2. Host
3. CI interface hardware in the host

Node nn path n has gone from bad to good
Message Error Level: Warning
Message Description: A disconnected CI cable has been
reconnected, or an intermittent hardware or cable problem is
indicated. More detail is found in the description of the
previous error message:

Node nn Path (A or B) has gone from good to bad

This message also occurs if an open VC node path was
previously found to be bad. During this polling cycle the
node sends out IDREQ (ID Request) packets to all nodes and
receives successful IDRSP (ID Response) messages.
Field Service Action: If the cable was reconnected, there
is no further action. Otherwise, replace the possible FRUs.
Possible FRUs:
1. CI cable
2. Host
3. CI interface hardware in the host

K.ci exception detected, code = nnn
Message Error Level: Warning
Message Description: The code here is the contents of
KH$FLG (the second word in the K.ci control area). Below is
a breakdown of the bits contained in this word:
1. 000001 KHF$PD - Path(s) disabled by K.ci due to Xmit
   error or VC breakage due to other K.ci-detected error.
2. 000002 KHF$EQ - Item(s) placed on error queue (KH$EQ).
3. 000004 KHF$BL - Data memory error during BMB list
   operation.
4. 000010 KHF$UP - Unreceivable packet. K.ci stopped
   (causes a crash).
5. 000100 KHF$NH - Sequenced message received while
   reserved-to-receive queue was empty.
6. 040000 KHF$PD - Set by diagnostics to disable
   interrupts.
Field Service Action: Compare the code from the printout to
the previous list, and determine whether the error code
points to an HSC70 module or to the host.
Possible FRUs:
1. Status 1:   K.pli module
2. Status 4:   M.std2 module
3. Status 10:  PILA module, Host K.ci set
4. Status 100: Host K.ci set
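
Comparing a printed code against the bit list can be sketched
as follows. This is a minimal illustration, not HSC software:
it assumes the printed code is octal (as the listed bit values
are), the function name is hypothetical, and the mnemonics are
copied as printed above, including the repeated KHF$PD.

```python
# Bit values, mnemonics and meanings are copied from the KH$FLG list above.
# Assumption: the "code = nnn" value in the message is printed in octal.
KH_FLG_BITS = [
    (0o000001, "KHF$PD", "path(s) disabled by K.ci"),
    (0o000002, "KHF$EQ", "item(s) placed on error queue (KH$EQ)"),
    (0o000004, "KHF$BL", "data memory error during BMB list operation"),
    (0o000010, "KHF$UP", "unreceivable packet; K.ci stopped"),
    (0o000100, "KHF$NH", "sequenced message received, queue empty"),
    (0o040000, "KHF$PD", "set by diagnostics to disable interrupts"),
]


def decode_kh_flg(code_text):
    """Return (octal mask, mnemonic, meaning) for each listed bit that is set."""
    value = int(code_text, 8)
    return [
        ("%06o" % mask, name, meaning)
        for mask, name, meaning in KH_FLG_BITS
        if value & mask
    ]
```

A code of 000011, for instance, decodes to the 000001 and
000010 entries, so both the path-disabled and the
unreceivable-packet conditions apply.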

VC closed with node nn due to unexpected disconnect
Message Error Level: Warning
Message Description: The HSC receives a DISCONNECTREQ
packet, and the following conditions exist inside the HSC.
o  A connection is not open.
o  The HSC is not in the DISCONNECTSENT state. (The
   DISCONNECTSENT state indicates the HSC also sent a
   DISCONNECTREQ packet.)
Field Service Action: Verify that no other node in the
cluster failed and sent an unexpected disconnect to the
HSC. If the failure persists, the K.ci module set may be
causing this error. Run the Offline Test K diagnostic to
test the K.ci. If no failure is found, verify that no
duplicate node addresses exist in this cluster (L0100 node
address switches).
Possible FRUs: K.pli module

VC closed with node nn due to disconnect timeout
Message Error Level: Warning
Message Description: A second disconnect call for the same
connection block has been received by the CI manager.
Field Service Action: Verify other cluster nodes have not
failed or have CI port problems. If the problem persists,
run the Offline Test K diagnostic to test the K.ci. If no
failures exist, verify that the SET parameters are valid, use
a backup copy of the HSC code, and replace the FRUs
indicated.
Possible FRUs: Host K.ci module set

VC closed with node nn due to request from K.ci
Message Error Level: Warning
Message Description: The K.ci microcode has detected both
CI paths have gone from good to bad during polling. More
details are found under the description for error message:
Node nn Path n has gone from good to bad
Field Service Action: See the descriptions and field
service action for the following error messages:
o  Node nn Path (A or B) has gone from good to bad
o  Node nn Path (A or B) has gone from bad to good
Possible FRUs:
1. K.ci hardware interface in HSC
2. CI cables

VC closed with node nn due to START received
Message Error Level: Warning
Message Description: A START message is received over the
CI to an already open virtual circuit (VC).
Field Service Action: Check for two HSC70s with the same ID
(not node address) on the cluster. This happens when a new
HSC70 is installed on the cluster and is given an existing
ID.
Possible FRUs: CI cables

No control block available to satisfy HMB request.
Message Error Level: Warning
Message Description: The CIMGR tried to allocate an HMB
(Host Memory Block) from the free control block queue when
none were available. If a significant amount of control
memory was removed from use due to errors detected during
boot, this message occurs. Otherwise, it may indicate an
internal HSC software problem where control blocks in HSC
memory are taken by some service and never returned to the
list of free control blocks.
Field Service Action: Type in the SHOW MEMORY command for
HSC50 software version V300 and later and HSC70 software
version V100 and later to determine how much control memory
is being used. Compare the amount of control memory shown
on the SHOW MEMORY printout to the amount contained in the
HSC. If more than 10% has been disabled from use, replace
the memory module. For HSC50 software before V300, run the
offline memory test on control memory to determine if
excessive solid failures are causing removal of a large
amount of memory. If the memory amount is adequate, the
problem may be caused by a software or microcode problem
within the HSC.
Possible FRUs:
1. M.std2 module
2. Software

HML$ER set - HM$ERR = nn
Message Error Level: Warning
Message Description: An HMB (host memory block) operation
resulted in an error. A breakdown of the HMB error word
(HM$ERR) bits follows:
1.  000002 HME$BM - Insufficient BMBs to receive message.
2.  000004 HME$NC - Sequenced message received over a
    connection with "0" in credit field.
3.  000010 HME$NC - Sequenced message received over a
    connection with credit field >"1". Excess has been
    added to CB$EM.
4.  000020 HME$OV - Oversize message received (>1096.
    bytes).
5.  000040 HME$DN - Data memory NXM during BMB operation.
6.  000100 HME$DP - Data memory parity error in BMB
    operation.
7.  000200 HME$DO - Data memory overrun during BMB
    operation.
8.  000400 HME$FP - Reception buffer parity error in packet
    "header". Message not receivable.
9.  001000 HME$PL - Reception buffer parity error in "body"
    of message.
10. 002000 HME$CN - Transmission not attempted because
    connection not valid.
11. 004000 HME$VC - Transmission not attempted because VC
    closed or connection invalid.
12. 010000 HME$TE - Transmission attempted but failed (no
    ACK).
13. 020000 HME$TP - Transmission failed due to transmission
    buffer parity error.
14. 040000 HME$HC - Packet inconsistent with K.ci context
    received from host.
15. 100000 HME$IC - Illegal control function opcode.
Field Service Action: Compare the displayed code to the
previous list and determine where the problem lies. For
example, a code of 000040 indicates a failure in the M.std2
module, and a code of 002000 indicates a problem in the K.ci
module set.
Possible FRUs:
1. PILA module
2. K.pli module
3. M.std2 module
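
The comparison of a displayed HM$ERR code against the bit list
can be sketched as below. This is an illustration only: it
assumes the displayed value is octal, the function name is
hypothetical, and the mnemonics are copied as printed above
(HME$NC appears twice in the list).

```python
# Bit values and mnemonics are copied from the HM$ERR list above.
# Assumption: the "HM$ERR = nn" value in the message is printed in octal.
HM_ERR_BITS = [
    (0o000002, "HME$BM"), (0o000004, "HME$NC"), (0o000010, "HME$NC"),
    (0o000020, "HME$OV"), (0o000040, "HME$DN"), (0o000100, "HME$DP"),
    (0o000200, "HME$DO"), (0o000400, "HME$FP"), (0o001000, "HME$PL"),
    (0o002000, "HME$CN"), (0o004000, "HME$VC"), (0o010000, "HME$TE"),
    (0o020000, "HME$TP"), (0o040000, "HME$HC"), (0o100000, "HME$IC"),
]


def decode_hm_err(code_text):
    """Return the listed bits that are set, as (octal mask, mnemonic) pairs."""
    value = int(code_text, 8)
    return [("%06o" % mask, name) for mask, name in HM_ERR_BITS if value & mask]


def unlisted_bits(code_text):
    """Return any set bits that the list above does not describe, in octal."""
    known = 0
    for mask, _ in HM_ERR_BITS:
        known |= mask
    return "%06o" % (int(code_text, 8) & ~known & 0o177777)
```

A displayed code of 002040 decodes to HME$DN and HME$CN, the
two conditions used as examples in the Field Service Action.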

Bad dispatch state in CB ...
Message Error Level: Warning
Message Description: The CI manager sends a SCS control
message and finds an invalid dispatch state in the control
block. The CI manager then uses the dispatch state to
determine where to send the proper control message. If this
is the only known problem, a software problem could exist
within the HSC. Otherwise, the problem could be caused by a
Control Bus addressing problem with the K.pli, M.std2, or
P.ioj modules.
Field Service Action: Replace the following FRUs.
Possible FRUs:
1. K.pli
2. M.std2
3. P.ioj

K.ci loopback microcode loaded
Message Error Level: Error
Message Description: The CIMGR detected K.ci loopback
microcode was loaded during initialization. When this
message occurs, a problem with the K.pli (L0107) module most
likely exists.
Field Service Action: Replace the following FRUs.
Possible FRUs: K.pli module

Resource lost to K.ci -- xxx xxx HMBs
Message Error Level: Error
Message Description: A control memory HMB (host message
block) data structure was lost. HMBs were expected in the
sequence message ready to receive queue (.KHSRR), but none
were found.
Field Service Action: Report the error, with frequency of
occurrence, to support. Also, note the sequence of events
that reproduces this failure. This message indicates a
software bug. Verify dc power levels are correct.
Possible FRUs:
1. Software
2. DC power

8.4.5.2 Load Device Errors - Detected errors from the RX33 load
device are classified into the out-of-band error category. The
following is an example printout of a detected RX33 error.

SYSDEV-S Seq 104. at 6-JAN-1986 10:12:00.76
DX1: LBN 1488. (49,0,02), Status 001
Seek 000, 000000
Tran 003, 021404
T.O. 000
87 3 1485 -7680 1 49 1 4
The -S following the SYSDEV prompt and before the Seq. number
indicates the severity level. The RX33 has three severity
levels:
1. Success (S): two or fewer errors during a command/retry
2. Informational (I): more than two errors
3. Error (E): unrecoverable error

The status field is most important and is a direct indication of
the error. Following is a list of the RX33 status codes:
o  000 : success
o  001 : success with retries
o  002 : S/W version mismatch (driver vs. operating code)
o  200 : command aborted via a CTRL/Y or exception operation
o  201 : illegal file name
o  202 : file not found
o  203 : file is not in a loadable image format
o  204 : insufficient memory to load image
o  205 : no free partition to load image into
o  206 : unit is S/W disabled
o  365 : unit is write protected
o  367 : no media mounted
o  375 : EOF detected during read or write
o  376 : hard disk error, other than the following:
      370 : bad unit number
      357 : data check error
      343 : motor broken (would not spin up)
      340 : uncorrectable seek error (desired cylinder not
            found)
      311 : bad record (LBN) number (not on media)
      272 : parity error in controller on M.std2 module
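
Looking up the status field of a captured printout can be
sketched as follows. The code is illustrative only: the function
name and the regular expression are assumptions about how a
captured console line might be scanned, and the codes are kept
as three-digit strings exactly as listed above so that no number
base is assumed.

```python
import re

# Codes and meanings are copied from the status list above.
RX33_STATUS = {
    "000": "success",
    "001": "success with retries",
    "002": "S/W version mismatch (driver vs. operating code)",
    "200": "command aborted via a CTRL/Y or exception operation",
    "201": "illegal file name",
    "202": "file not found",
    "203": "file is not in a loadable image format",
    "204": "insufficient memory to load image",
    "205": "no free partition to load image into",
    "206": "unit is S/W disabled",
    "365": "unit is write protected",
    "367": "no media mounted",
    "375": "EOF detected during read or write",
    "376": "hard disk error (not otherwise listed)",
    "370": "bad unit number",
    "357": "data check error",
    "343": "motor broken (would not spin up)",
    "340": "uncorrectable seek error (desired cylinder not found)",
    "311": "bad record (LBN) number (not on media)",
    "272": "parity error in controller on M.std2 module",
}


def status_from_line(line):
    """Find 'Status nnn' in a printout line and return (code, meaning)."""
    match = re.search(r"Status\s+(\d{3})\b", line)
    if match is None:
        raise ValueError("no Status field found")
    code = match.group(1)
    return code, RX33_STATUS.get(code, "unlisted status code")
```

Applied to the second line of the example printout, the lookup
reports status 001, success with retries, which agrees with the
-S severity shown on the first line.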

The failing floppy disk drive is indicated by DX1:. The logical
block number where the failure occurred is displayed by LBN 1488.
The three numbers in parentheses, separated by commas after the
logical block number, indicate, in the order shown, the
cylinder, the media surface, and the drive sector.
The Seek entry's first group of zeros shows the retry count for
seek/recal errors or the number of times the command was issued
but not completed. The second group of zeros shows an inclusive
OR of the control and status register (CSR) bits set during seek
error retries. The important bit in a seek error is bit 4.
The Tran (transfers) entry's first group of zeros shows the retry
count for read, write, and format errors, or the number of times
the command was issued and not completed. The second group of
zeros shows an inclusive OR of the CSR bits set during read,
write, and format error retries. A breakdown of the upper CSR
bits is shown in Figure 8-10. The status of the lower CSR bits
is shown in Table 8-24.

[Figure: upper CSR bit fields, in the order shown in the
figure: PAR ERR, NXM ERR, INTR ENABLE, DMA DIS, TST HI PAR,
TST LO PAR, MOTR ENABLE, DRV SEL]

Figure 8-10  RX33 Floppy Controller CSR Breakdown

Table 8-24  Status Register Summary

Column headings: BIT, ALL TYPE I COMMANDS, READ ADDRESS,
READ SECTOR, READ TRACK, WRITE SECTOR, WRITE TRACK

Status conditions named in the table: Not Ready, Write
Protect, Head Loaded, Record Type, Seek Error, RNF, CRC
Error, Track 0, Lost Data, Index Pulse, DRQ, Busy

The T.O. entry line is a timeout recording for each command
type. This counter reflects the total number of timeouts for the
command in error. All commands (read, write, recal, spinup, and
format track) time out in one second.
The last line in the error message is more complicated to
break down. Reading the example line from left to right, the
fields are as follows:

   87      err count number
   3       success count
   1485    LBN
   -7680   byte count (negative implies write)
   1       unit number
   49      cylinder number
   1       surface number
   4       sector number

Most information in the error printout is reiterated in the last
line. Starting from the right, sector, surface, cylinder number,
and unit number are displayed as in the main body of the error
message. The byte count has an indicator for write and read
commands; the negative indicates a write operation. The LBN in
this field is the starting LBN for this transfer. The LBN in the
main message body is the failing LBN. The success count and
error count are for informational purposes.
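
Splitting the last line into named fields can be sketched as
below. This is a sketch under stated assumptions: the class and
function names are hypothetical, and the left-to-right field
order (error count, success count, starting LBN, byte count,
unit, cylinder, surface, sector) is the order implied by the
description above, with the negative value taken as the byte
count.

```python
from typing import NamedTuple


class Rx33Summary(NamedTuple):
    # Assumed left-to-right field order of the last printout line.
    error_count: int
    success_count: int
    start_lbn: int
    byte_count: int
    unit: int
    cylinder: int
    surface: int
    sector: int

    @property
    def is_write(self):
        # Per the text, a negative byte count indicates a write operation.
        return self.byte_count < 0


def parse_summary_line(line):
    """Parse the eight whitespace-separated integers of the last line."""
    fields = [int(token) for token in line.split()]
    if len(fields) != 8:
        raise ValueError("expected 8 fields, found %d" % len(fields))
    return Rx33Summary(*fields)
```

For the example line, parse_summary_line("87 3 1485 -7680 1 49 1 4")
gives a starting LBN of 1485 and marks the transfer as a write.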
8.4.5.3 Disk Functional Errors - Although most disk drive
related errors are MSCP errors, several disk functional errors
fall into the out-of-band error category. They are identified by
the DISK-E identifier printed on the terminal display prior to
the error.
The message, message description, field service action, and
probable FRUs for the disk functional out-of-band errors follow
in alphabetical order.
Aborting Error Recovery Due to Excessive RECALS
Disk Unit xx
Requestor xx
Port xx
Message Error Level: Error
Message Description: For each transfer, a counter records
the number of recals attempted. If the count exceeds the
number of recals allowed, this message is printed. Recovery
from an error is not possible because of excessive recals.
Field Service Action: Refer to drive service manual to
determine reasons for persistent positioning failures.
Possible FRUs: Drive unit

Aborting Error Recovery Due to Excessive Timeouts
Message Error Level: Error
Message Description: The HSC detects several timeouts on
the disk drive. All error recovery attempts will be
aborted.
Field Service Action: Replace the following FRUs. Further
testing may be necessary.
Possible FRUs:
1. Drive module (Refer to drive service manual.)
2. K.sdi module

Attention condition serviced for ONLINE disk unit xxx
Message Error Level: Information
Message Description: A condition change in the drive needs
servicing. A Get Status exchange is invoked to the drive.
Field Service Action: Refer to the console printed Get
Status response.
Possible FRUs: Drive modules (Refer to drive service
manual.)

ATN. message sent to Node xx, for Unit xx
Message Error Level: Information
Message Description: The attention message has been sent.
This message corresponds to the previous message.
Field Service Action:
Possible FRUs: None

Clock dropout from ONLINE disk unit xx
Message Error Level: Error
Message Description: The online disk has lost its real-time
state clock.
Field Service Action: Check the path between the K.sdi and
the disk drive that was reported. Determine if the problem
is in the HSC or the disk drive. Other disk error reports
may precede this message and provide more detail about this
error condition.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. SDI cable
3. K.sdi module

Deferred ATN. message for Node xx, Unit xx
Message Error Level: Information
Message Description: An attention message is delayed in
process.
Field Service Action: None
Possible FRUs: None

Disk unit xx ready to transfer.!
Retrieval failure or subsystem deadlock probable.
Message Error Level: Information
Message Description: Necessary resources would not do the
transfer.
1. Out of buffers
2. K.sdi ready to die
Field Service Action: Check data transfer path. This error
may indicate too many utilities or inline diagnostics
running simultaneously. The problem might also be an HSC
software problem.
Possible FRUs: K.sdi

Disk Unit xx (Requestor xx, Port xx) being initialized
DCB addr: xxxxxx
Message Error Level: Information
Message Description: A disk is being initialized and
identified.
Field Service Action: None
Possible FRUs: None

Disk unit xxx. (Requestor xx.,Port xx.) declared inoperative
intervention required.
Message Error Level: Error
Message Description: The K.sdi sent a nondata transfer
command to the disk three times and received the same
error back three times. The HSC ignores the disk until it
detects some intervention. An example of intervention is
toggling the port button to drop the state clock.
Field Service Action: Examine previous error reports to
help resolve failure. Toggle port switch on drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

DRAT/SEEK timeout, disk unit xxx
Message Error Level: Information
Message Description: A stimulus resulting in error recovery
code action is the expiration of the DRAT/SEEK timer for the
drive. A DRAT represents data transfer action with the
drive, whereas the SEEK timer represents position requests
to the drive.
Each drive has a timer (set to three times the SDI drive
short timeout value) allocated on its behalf at subsystem
initialization time. This timer, called the DRAT/SEEK
timer, is active whenever data transfer activity to the
drive is outstanding.
When the disk transfer code queues transfer work to K.sdi on
behalf of a previously idle drive, the timer starts. When
it adds transfer work to a drive that already has transfer
work, the timer restarts. When it detects the completion of
the last DRAT queued to the drive, the timer stops. Thus,
the timer is running only as long as transfer work is
outstanding. A timer may expire for several reasons:
1. The drive has detected a drive error and has lowered
   Read/Write Ready.
2. The drive has stopped sending clock signals.
3. Another element in the subsystem that should have
   supplied resources to the disk transfer operation in a
   reasonable time did not.
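
The start, restart, and stop rules for the DRAT/SEEK timer can
be modeled as below. This is a simplified model for
illustration, not the HSC implementation: the class and method
names are hypothetical, time is passed in explicitly, and the
completion of a DRAT that is not the last one is assumed to
leave the timer unchanged, since the text does not say otherwise.

```python
class DratSeekTimer:
    """Toy model of the per-drive DRAT/SEEK timer described above."""

    def __init__(self, sdi_short_timeout):
        # The timer is set to three times the SDI drive short timeout value.
        self.limit = 3 * sdi_short_timeout
        self.outstanding = 0      # DRATs queued and not yet completed
        self.started_at = None    # None means the timer is stopped

    def queue_work(self, now):
        # Idle drive: the timer starts. Busy drive: the timer restarts.
        self.outstanding += 1
        self.started_at = now

    def complete_work(self):
        if self.outstanding == 0:
            raise RuntimeError("no transfer work outstanding")
        self.outstanding -= 1
        if self.outstanding == 0:
            # Completion of the last queued DRAT stops the timer.
            self.started_at = None

    def expired(self, now):
        return self.started_at is not None and now - self.started_at >= self.limit
```

With a short timeout of 2 time units the limit is 6: work
queued at time 0 and again at time 5 restarts the timer, so it
expires at time 11 rather than time 6 unless both DRATs
complete first.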

Field Service Action: Check out the drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

DRIVE CLEAR attempt on disk unit xx (Requestor xx, Port xx)
DCB addr: xxxxxx Error count xxxxxx
Message Error Level: Information
Message Description: The drive had some previous error and
now is attempting to clear that error.
Field Service Action: Examine the host error log to
determine what error the drive is trying to clear.
Possible FRUs: Drive

Duplicate disk unit xx
Message Error Level: Information
Message Description: Disk unit numbers are duplicated
within the system.
Field Service Action: Locate the duplicate disks and change
the plug number on one.
Possible FRUs: Drive modules (Refer to drive service
manual.)

FRB error: K.ci, 1st LBN xx buffers, FE$SUM xx
Message Error Level: Information
Message Description: A fragment request block arrives at
the error process. Example: A Revector.
Field Service Action: If excessive, reformat drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

FRB error: K.sdi, Unit xx, first LBN xxx, buffers, FE$SUM
Message Error Level: Information
Message Description: A fragment request block arrives at
the error process. Example: A Revector.
Field Service Action: If excessive, reformat drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

Illegal bit change in status from disk unit xxx
EL bit forced on so status logged.
Message Error Level: Error
Message Description: An unsupported bit was received in
status returned from disk unit.
Field Service Action: Check drive and version of software
in HSC.
Possible FRUs:
1. Drive module (Refer to drive service manual.)
2. Version of software

K.sdi in slot xx failed its init diagnostics, status = xxx
Message Error Level: Error
Message Description: A requestor fails during boot. The
displayed K.sdi has failed with the displayed status. This
message is only displayed at the end of the boot procedure.
Field Service Action: Record the status for module repair
purposes.
Possible FRUs: The K.sdi displayed

LBN Restored with Forced Error in RESTOR Operation!
Disk Unit xx LBN xx
Tape Unit xx
Message Error Level: Warning
Message Description: An error was detected in the LBN data
during backup. A forced error bit was set in the LBN.
Field Service Action: If excessive, reformat drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

Positioner error on disk unit xxx. DRAT addr:xxx
Desired hdr (lo,hi):xxx xxx
Actual hdr (lo,hi):xxx xxx
Message Error Level: Information
Message Description: The drive positioned the heads in the
wrong place.
Field Service Action: Check drive modules and K.sdi module.
Possible FRUs:
1. Drive modules (Refer to drive service manual.)
2. K.sdi module

Premature LP flag in RTNDAT sequence from host node xx
Message Error Level: Warning
Message Description: A violation of packet protocol; the
last packet flag was set before all data was received from a
host.
Field Service Action: If the problem is transient, monitor
the error for repetitive node numbers, as this may indicate
a host CI problem. If the problem is persistent across all
cluster nodes, test the K.ci.
Possible FRUs:
1. K.ci modules
2. CI cables

SDI exchange retry on disk unit xxx (Requestor xx Port xx)
DCB addr xx Error count xx
Message Error Level: Information
Message Description: The SDI command is retried on the
drive.
Field Service Action: None
Possible FRUs: None

Unexpected AVAILABLE signal from ONLINE disk unit xx
Message Error Level:
Message Description: The HSC believes the disk is already
online; therefore the disk should not be asserting
Available.
Field Service Action: Determine why the disk drive is
asserting the Available signal.
Possible FRUs: Drive modules (Refer to drive service
manual.)

Unrecoverable error on disk unit xx. Drive appears inoperative
intervention required.
Message Error Level: Error
Message Description: An error log message from the drive
caused this message, or the drive may be offline.
Field Service Action: Check error log and drive.
Possible FRUs: Drive modules (Refer to drive service
manual.)

Unsuccessful SEEK initiation, disk unit xxx. DCB addr: xxx
Message Error Level: Information
Message Description: The dialog control block sent the seek
exchange, and it was rejected or lost.
Field Service Action: Check drive.
Possible FRUs: Drive modules (Refer to the drive service
manual.)

VC closed due to timeout of RTNDAT/CNT from host node xx
Message Error Level:

Information

Message Description: The host issued a request over the CI,
and the response timed out.
Field Service Action: Determine if the problem lies in the
HSC K.ci module set or the host CI module.
Possible FRUs:
1. K.ci module set in the HSC
2. CI module set in the host

8.4.5.4 Tape Functional Errors - Although most tape errors are
covered under TMSCP errors, certain tape functional errors are
classified in the out-of-band error category. They are
identified by the TAPE-E identifier printed prior to the error
printout on the local console terminal.
The following lists each tape functional out-of-band error
message with its description, field service action, and
probable FRUs.

Data Error Flagged in Backup Record
Disk Unit xx LBN xx
Tape unit xx
Message Error Level: Warning
Message Description: During a backup, a data error was
encountered. During the BBR, the record was written with a
forced error bit set.
Field Service Action: Check BBR history on source drive.
Possible FRUs:
1. Disk unit
2. Media

Insufficient Control Memory for K.sti in Requestor xx
Message Error Level:

Error

Message Description: Not enough Control Memory left in pool
to allocate a control block. A certain amount of Control
Memory is needed to set up control blocks. Enough memory
has not been found to set up control blocks to turn the
K.sti functional code on.
Field Service Action: Use HSC SETSHO utility to show
available HSC memory (control, data, and program). If less
than 87.5% of available control memory is usable, replace
M.std2 module. Run Offline TEST MEM by K diagnostic and
test control memory.
Possible FRUs:
1. M.std2 module
2. P.ioj module
3. Software

Insufficient Private Memory remaining for TMSCP Server
Message Error Level:

Error

Message Description: In the SCT, a parameter determines the
maximum number of supported tape formatters. During
initialization, all the working K.sti modules are counted
and a calculation is done showing the maximum number of
possible formatters. These two parameters are compared.
Based on the comparison, a certain amount of Private memory
is allocated for the TMSCP Server. If that allocated
portion of Private memory is not enough, this message is
displayed.
Field Service Action: Use HSC SETSHO utility to show
available HSC program memory. If less than 87.5% of
available program memory is usable, replace M.std2. Run
Offline TEST MEM or TEST REFRESH to test program memory.
Possible FRUs:
1. M.std2 module
2. P.ioj module
3. Software

K.sti in Requestor xx has microcode incompatible with this TMSCP
Server
Message Error Level:

Error

Message Description: The data structure version within the
microcode version residing on the K.sti module is a lower
version than the TMSCP Server can support.
Field Service Action: Ensure the version of microcode on
the K.sti module is up to current revision. If not, replace
the microcode or replace the K.sti module with a K.sti
module of the current revision.
Possible FRUs:

K.sti module

No Tape Drive Structures available for Requestor xx Port xx Unit
xx Increase Structures via SET MAXTAPE command
Message Error Level:

Error

Message Description: An additional tape drive has been
added to an existing tape formatter, but the tape structures
set up in initialization have been exceeded.
Field Service Action: Use the SET/SHO utility to increase
the number of tape structures with the SET MAXTAPE
command.
Possible FRUs:

None

No Tape Formatter Structures available for Requestor xx Port xx
Increase structures via SET MAXFORMATTERS command
Message Error Level:

Error

Message Description: An additional tape formatter has been
added to the HSC70, but enough Tape Formatter Structures are
not available to service this additional tape formatter.
Tape Formatter Structures are set up during initialization.
Field Service Action: Use the SET/SHO utility to set the
structure level higher to compensate for the additional tape
formatter with the SET MAXFORMATTERS command.
Possible FRUs:

None

No usable K.sti boards were found by the TMSCP Server
Message Error Level:

Error

Message Description: The TMSCP server polled the HSC and
found no working K.sti modules. This message does not
appear frequently because the K.sti normally fails its
initialization diagnostics and displays the error message.
Field Service Action: Check for a failed initialization
diagnostic error message prior to this message. This prior
message displays the failed requestor slot and failing
status.
Possible FRUs: The K.sti(s) displaying the failing status
is the FRU.

Requestor xx has failed initialization diagnostics with status =
xx
Message Error Level:

Error

Message Description: The requestor in slot xx has failed
initialization diagnostics with the displayed status. The
message indicates the failed K.sti module.
Field Service Action: Refer to the section on status codes
to determine what the displayed status indicates the failure
to be.
Possible FRUs:

K.sti module in the indicated slot

Tape unit number xx connected to Requestor xx Port xx Ceased to
exist while Online
Message Error Level:

Error

Message Description: This message is similar to the
previous error message except in this case, the HSC70 was
using the tape drive to do data transfers when the tape
drive went Offline.
Field Service Action: Check to see if a breaker has blown.
The tape drive may also be in diagnostic mode, which makes the
tape drive go Offline.
Possible FRUs:
1. Tape drive
2. Tape formatter
3. SI cable

Tape unit number xx connected to Requestor xx Port xx Dropped
state clock while Online
Message Error Level:

Error

Message Description: The formatter supplies the state clock
over the SI cable. State bits such as AVAILABLE and
ATTENTION are encoded on this state clock waveform. As long as
the K.sti is receiving a state clock, the SI cable must
still be plugged in, and the formatter must be operating
correctly. Dropping state clock is equivalent to
disconnecting the SI cable from the HSC70.
Field Service Action: First isolate the problem to the
HSC70, SI cable, or tape unit. Next, try replacing or
swapping the K.sti module exhibiting the failure. If the
problem is not solved, try a known good tape unit.
Possible FRUs:
1. SI cable
2. Tape unit
3. K.sti module

Tape Formatter connected to Requestor xx Port xx Has been
declared Inoperative. Intervention required
Message Error Level:

Error

Message Description: The K.sti has sent a nondata transfer
command over the SI cable to the displayed tape formatter
three times and has received back the same error three
times. The HSC70 then ignores the tape formatter until it
detects some intervention such as a change in the state
clock.
Field Service Action: Replace the possible FRUs.
Deasserting the tape drive's port switches, recycling power,
unplugging the SI cable, or any action causing the state
clock to come and go is considered an intervention. The
HSC70 will not attempt to communicate with the failing tape
formatter until it detects this change in state clock.
Examine any previous error reports for more specific data
regarding this error message.
Possible FRUs:
1. Tape formatter
2. STI cabling
3. K.sti module

Tape unit number xx connected to Requestor xx Port xx Is not
asserting Available when it should be
Message Error Level:

Error

Message Description: The formatter is not online and is not
asserting its Available signal to the HSC70. The HSC70 does
not detect the Available signal and displays this message on
the local console terminal.
Field Service Action: First isolate the problem to either
the HSC70, the SI cable, or the tape unit. Next, try
replacing or swapping the K.sti module exhibiting the
failure. If the problem is not solved, try a known good
tape unit.
Possible FRUs:
1. SI cable
2. Tape unit
3. K.sti module

Tape unit number xx connected to Requestor xx Port xx went
Available without request
Message Error Level:

Error

Message Description: When the formatter is online,
Available is not normally asserted to the HSC70. When the
formatter is online and doing I/O and an Available is
asserted, the HSC70 detects this as an error. A formatter
does not need to send Available unless the K.sti requests
it.
Field Service Action: First isolate the error to the
formatter or to the active K.sti.
Possible FRUs:
1. K.sti
2. Formatter
3. SI cable

Tape unit number xx connected to Requestor xx Port xx went
Offline without request
Message Error Level:

Error

Message Description: The formatter lost contact with one of
the tape drives. The HSC70 detected this loss of a tape
drive and printed this message.
Field Service Action: Check to see if a breaker has blown.
The tape drive may also be in diagnostic mode, which makes the
tape drive go Offline.
Possible FRUs:
1. Tape drive
2. Tape formatter
3. SI cable

TMSCP fatal initialization error - TMSCP functionality not
available
Message Error Level:

Error

Message Description: Something went wrong during
initialization with the tape functional code (TFUNCT). A
routine was called up to initialize some part of the
functional code, and that part failed to initialize.
Typically, some other message is displayed prior to this
message giving more detail on the error.
Field Service Action: Take action depending on the previous
message.
Possible FRUs: Dependent on the previously displayed error
message.

TMSCP Server operation limited by insufficient Private Memory
Use the SET MAX command to reduce Private Memory requirements.
Message Error Level:

Error

Message Description: This message appears before the
message
Insufficient Private Memory remaining for TMSCP server
and indicates the same problem. Private memory has
insufficient space to hold the necessary structures the
TMSCP Server needs as dictated by the number of K.sti
modules and the number of tape formatters on the HSC70.
Field Service Action: Use HSC SETSHO utility to decrease
maximum number of tape formatters for which the HSC should
reserve memory structures.
Possible FRUs:
1. M.std2
2. P.ioj
3. Software

TTRASH fatal initialization error.
Message Error Level:

Error

Message Description: This message is similar to the
message,
TMSCP fatal initialization error - TMSCP functionality not
available
except the process failing to initialize is TTRASH instead
of the tape functional process (TFUNCT).
Field Service Action: Check for previous error reports
displaying a more specific reason for this error report. If
earlier error messages do not exist, reboot HSC using backup
HSC software copy.
Possible FRUs:
1. M.std2 module
2. Software

***WARNING*** K.sti microcode too low for large transfers.
Message Error Level:

Warning

Message Description: The amount of microcode I/O the K.sti
can accommodate is restricted. The code still attempts to
do transfers, but a warning has been issued.
Field Service Action: Check the microcode version level to
ensure the proper revision.
Possible FRUs: Change the level of K.sti microcode to a
supported version, or change the K.sti with the out-of-date
code.
8.4.5.5 Miscellaneous Errors - Miscellaneous errors are
identified by the SINI-E identifier printed on the local console
terminal. Many of these messages are one or two line messages,
but some have several lines of informational text and result from
subsystem exceptions. Subsystem exceptions detect
inconsistencies in the operating software. These SINI errors are
discussed in more detail in this section.
The following describes each message text, gives field service
action, and lists the probable FRUs associated with the SINI
out-of-band errors.

Booted from drive 1. Drive 0 Error (text)
Message Error Level: Informational
Message Description: The System diskette was booted from
Drive 1 of the RX33. Normal boot is from Drive 0.
Field Service Action: None
Possible FRUs: RX33 Drive 0

Cache disabled due to failure
Message Error Level:

Error

Message Description: SINI looks back at the Cache
diagnostic and senses the Cache is disabled due to cache
failure or manually disabled in the diagnostic. This error
also shows as a soft fault code on the OCP.
Field Service Action: Load the Offline Cache diagnostic and
answer the prompt asking to disable or enable Cache with an
enable. Reboot the System diskette and check if the
original message is displayed again.
Possible FRUs:
1. P.ioj module
2. M.std2 module

Hard transfer error loading (file) xx
Message Error Level:

Error

Message Description: The P.ioj detected a hard error while
loading a file from the System diskette into Program memory.
The particular files that can produce this error are DUP and
MIRROR. The xx field is the error status value from the
device driver.
Field Service Action: Load the file from the other disk
drive; load the back-up diskette.
Possible FRUs:
1. Diskette
2. RX33

Hard transfer error writing SCT xx
Message Error Level:

Error

Message Description: The HSC detected an error while
attempting to write the SCT. The xx designates the octal
byte that is the error status value returned from the device
driver.
Field Service Action: Make sure the drive is not write
protected; try the back-up diskette; try the other disk
drive.
Possible FRUs:
1. Diskette
2. RX33

Host Clear from CI node
Message Error Level:

Error

Message Description: The host cannot function with the
HSC70 for some reason such as a nonresponse within a certain
amount of time or too many errors on the CI.
Field Service Action: Check the HSC70 console messages and
the error logs of the systems connected to the HSC70.
Possible FRUs:
1. HSC70
2. HSC70 operating software
3. System software

Host interface (K.ci) failed INIT diags, status = xxx
Message Error Level: Error

Message Description: The failing status indicates which
module in the K.ci set has failed. A soft fault code is
generated and may be examined by pressing the Fault button
on the OCP.
Field Service Action: Determine which is the failing module
by comparing the failing status value to the values in
Appendix D. This comparison will point more directly to the
failing module.
Possible FRUs:
1. Link module
2. PILA module
3. K.pli module

Host interface (K.ci) is required but not present
Message Error Level:

Error

Message Description: A K.ci module set is absent, or the
failure in the K.ci module set was so severe upon
initialization, the initialization diagnostics did not run.
Field Service Action: Check for the presence of a K.ci
module set. If missing, install the K.ci module set. If the
K.ci module set is present, determine which module is
failing by running Offline diagnostics. This error
generates a soft fault, which can be examined by pressing the
Fault button on the OCP.
Possible FRUs: See list below and next error message (Last
soft init resulted from unknown cause)
1. K.pli module
2. K.ci module set (any one of the three modules in the
   set)

Last soft init resulted from unknown cause
Message Error Level:

Error

Message Description: Software has a list of known reasons
for reboot (Trap thru 134, Trap thru 250, CRASH$, SET/SHO,
etc.). If no reason for reboot is apparent, the software
may have failed to detect where the error came from.
Field Service Action: Check the HSC70 console error
messages and the system error logs on all the systems
connected to the HSC70. This error indicates a probable
software problem.
Possible FRUs: Dependent upon the information obtained from
the error logs.

Less than 87.5% of program memory is available
Less than 87.5% of control memory is available
Less than 87.5% of data memory is available.
Message Error Level:

Error

Message Description: These three messages are a result of
the P.ioj polling the memories on initialization and finding
an insufficient amount of working memory in any one of them.
Any combination of the three messages may appear.
Field Service Action: The error printout determines which
memory is failing.
Possible FRUs:

M.std2 module
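
The 87.5% threshold in these three messages is 7/8 of the memory. A minimal sketch of the comparison follows. The function name, the dictionary layout, and the idea of counting memory in arbitrary equal units are assumptions made for illustration only; the manual does not describe how the P.ioj performs the poll.

```python
THRESHOLD = 0.875  # 87.5%, that is 7/8 (threshold quoted in the messages)

def low_memory_messages(working, installed):
    """Return the console messages for each memory below the threshold.

    `working` and `installed` are hypothetical dicts keyed by memory kind,
    holding amounts in any consistent unit.
    """
    messages = []
    for kind in ("program", "control", "data"):
        if working[kind] < THRESHOLD * installed[kind]:
            messages.append(f"Less than 87.5% of {kind} memory is available")
    return messages

example = low_memory_messages(
    {"program": 7, "control": 6, "data": 8},
    {"program": 8, "control": 8, "data": 8},
)
```

In this sketch a memory with exactly 7/8 working does not produce a message, because the message text says "less than".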

P.ioj running with memory bank or board swap enabled
Message Error Level:

Error

Message Description: Upon initialization an error was
detected in the low address space of private memory. The
P.ioj asserted the SWAP BANK signal, and the second bank of
private memory was enabled. The P.ioj and memory
combination can still function under limited capabilities.
Field Service Action: Exchange the M.std2 module. The
HSC70 still functions with limited capabilities.
Possible FRUs:

M.std2 module

Requestor xx failed INIT diags, status = xxx
Message Error Level:

Error

Message Description: The data channel in the displayed
requestor has failed initialization diagnostics with the
displayed status.
Field Service Action: Determine which data channel is in
the displayed requestor slot. Make note of the status value
for module repair. Replace the failing data channel.
Possible FRUs: The data channel (K.sdi or K.sti) exhibiting
the failing status

SCT read or verification error. Using template SCT.
Message Error Level: Error
Message Description: An error was detected by the P.ioj as
it attempted to read the System Configuration Table (SCT) or
as it attempted to verify the SCT. This error
message occurred because a new, previously uninitialized
system diskette was booted. The default settings from
SYSCOM are used instead of the SCT from the load media. The
second sentence in this message indicates the SCT is new as
derived from the template SCT settings set in the factory.
Field Service Action: Reinstall the old system diskette and
do a SHO SYSTEM. Install the new diskette exhibiting the
error and set all system diskette fields to the old values
using the SET command. Reboot the HSC70 to validate these
values and ensure system continuity.
Possible FRUs:

System diskette

The following alphabetical list of the SINI out-of-band errors
consists of informational text. These SINI errors result from
subsystem exceptions. A detected inconsistency in the operating
software causes a subsystem exception and results in an HSC
crash.

Level 7 K interrupt (Trap thru 134)
process yyy
PC xxx
Status xxx xxx xxx xxx xxx xxx xxx xxx xxx
Message Error Level:

Error

Message Description: A level 7 K interrupt occurs when one
or more requestors detect a fatal error condition while
executing functional code. The requestor, upon detecting
the error, generates a level 7 K interrupt to the P.ioj.
The P.ioj traps through location 134 causing a reboot. The
status values of all requestors, including the failing
requestor's status value, are displayed on the last line of
the printout.
Field Service Action: In some cases, the error printout
shows a failing requestor when the real problem is in the
M.std2 module. Wait for two or more failures of this type to
determine if the real problem is the M.std2 module. If the
M.std2 is at fault, the same requestor is not displayed twice
as the failing requestor. Refer to Appendix D for failing status
values and their meanings. Check the status line message to
determine the failing requestor status. Change the
requestor exhibiting the failing status if the same
requestor is displayed more than once.
Possible FRUs:
1. Requestor displaying a continuous failing status value
2. M.std2 module

MMU (Trap thru 250)
Process yyy
PC xxxxxx
PSW xxxxxx
MMSR0 xxxxxx
MMSR1 xxxxxx
MMSR2 xxxxxx
Message Error Level: Error
Message Description: A failure was detected in the memory
management unit on the P.ioj. The active process is
displayed as well as the bit assignments for the memory
management status registers.
Field Service Action: Examine the MMSR registers to
determine the failure in the MMU.
Possible FRUs: P.ioj module

NXM (Trap thru 4)
process yyy
PC xxx
PSW xxx
Low err reg xxx
Hi err reg xxx
WBUSR xxx
Message Error Level:

Error

Message Description: For the J-11:
o  A memory location did not respond within the specified
   timeout period.
o  A stack overflow occurred.
o  An odd address access was attempted, for example, a byte
   access instead of a word.
o  A halt was executed in user mode.
Field Service Action: Determine which memory is failing by
examining the low and high error address registers for
module repair.
Possible FRUs:
1. M.std2 module
2. P.ioj module

Parameter change
Process yyy
PC xxx
PSW xxx
Reason xxx
Message Error Level: Informational
Message Description: A parameter has been changed via the
SET/SHO utility.
Field Service Action: None
Possible FRUs: None

Parity Error (Trap thru 114)
process yyy
PC xx
PSW xx
Lo err add xxxxxx
Hi err add xxxxxx
WBUSR
Message Error Level:

Error

Message Description: This message covers parity errors in
memory and in cache. In the case of a memory parity error,
the address of the failing memory is latched into the low
error address register. In the case of a cache parity
error, the address is not latched into the low error address
register. Instead, the address of the low error address
register is displayed in the error printout.
Field Service Action: Determine if the error occurred in
memory or in cache memory by reading the contents of the low
error address displayed in the error printout. If the
content is the address of the low error address register
(170024), the error is in cache memory. If the error is in
cache, the probable FRU is the P.ioj.
Possible FRUs:
1. P.ioj
2. M.std2

Reserved Instruction (Trap thru 10)
From process yyyy
PC xxx
PSW xxx
Message Error Level: Error
Message Description: The P.ioj detected an opcode resulting
in the execution of an invalid instruction. The process
indicated is the process that executed the nonexistent
instruction.
Field Service Action: Determine what process was active for
module repair.
Possible FRUs:
1. P.ioj module
2. M.std2 module
3. Software

Software inconsistency
Process yyy
PC xxxxxx
PSW xxxxxx
Stack dump xxxxxx xxxxxx xxxxxx
Message Error Level: Error
Message Description: During operation, the operating
software performs numerous consistency checks. When one of
these consistency checks fails, the HSC70 crashes and
reboots. The active process is displayed, as well as the
stack dump.
Field Service Action:
Possible FRUs:

None

The previous SINI error messages are a result of the operating
software performing a consistency check which failed. When
consistency checks fail, the HSC70 performs a soft initialization
causing it to crash and reboot. This is known as a subsystem
exception. Upon successful completion of the reboot, the
subsystem exception printout displays the contents of several
HSC70 registers as well as the status of all requestors. As a
result of the subsystem exception, the SINI error message is
printed. This message tells why the last soft init happened.
The actual sequence of events for a SINI-E out-of-band error
printout is as follows:
1. When the HSC70 detects an unrecoverable problem, a soft
   init or crash occurs. A system dump is performed under
   the heading SUBSYSTEM EXCEPTION. The HSC70 then
   reboots.
2. When the HSC70 reboots, a message indicating the HSC70
   has rebooted, followed by the multiline SINI message,
   gives the reason for the last soft init (crash).
3. The same message is written on the system diskette and
   can be examined with the SHO EXCEPTION command. A host
   error message log is also filed in host memory as an HSC
   datagram, storing the out-of-band error SINI message.

8.4.6 Traps
The four traps described in the following sections (Trap Thru 4,
Trap Thru 10, Trap Thru 114, and Trap Thru 134) are the same as
are found in the 11/70 CPU.

8.4.6.1 NXM (Trap Thru 4) - If the error registers in the NXM
printout equal 170024 000077, the error is not a nonexistent
memory error. Instead, it is a stack overflow or some illegal
instruction. When the error register is any number other than
170024 000077, the number represents the unresponsive address.
The nonexistent memory trap produces a subsystem exception
printout similar to the example in Section 8.4.6.5.1.
If the error register equals 16xxxx, the Window Bus register
equals the Control memory address causing the NXM error. If the
failing address is in Control memory and shows an NXM error, it
is definitely a hardware problem. Otherwise, it can be either a
software or a hardware problem.
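
The decision rules above can be sketched as a small classifier. This is an illustration only: the function name and return strings are hypothetical, and the sketch assumes the three register values have already been read from the NXM printout as octal numbers.

```python
LOW_ERR_REG_ADDR = 0o170024  # low error register value quoted in the text
HIGH_ERR_VALUE = 0o000077    # high error register value quoted with it

def classify_nxm(low_err, high_err, wbusr):
    """Interpret the error registers of an NXM (Trap Thru 4) printout."""
    if low_err == LOW_ERR_REG_ADDR and high_err == HIGH_ERR_VALUE:
        # Not a nonexistent memory error at all.
        return "not NXM: stack overflow or illegal instruction"
    if low_err >> 12 == 0o16:
        # 16xxxx: the Window Bus register holds the Control memory address.
        return f"NXM in Control memory, window bus address {wbusr:06o} (hardware)"
    # Any other value is the unresponsive address itself.
    return f"NXM at address {high_err:06o} {low_err:06o} (hardware or software)"
```

For example, `classify_nxm(0o170024, 0o000077, 0o020633)` reports that the trap is not a true NXM, while a low error register of 160004 is treated as a Control memory reference.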
8.4.6.2 Reserved Instruction (Trap Thru 10) - On the User PC:
line, the subsystem exception message for this trap shows
vector number 10 and identifies the trap as ILOP (an illegal
opcode). Refer to the (PC-6) to (PC): field in the example
(Section 8.4.6.5.1). With a trap thru 10, the first line is the
field; the third word from the left is the instruction causing
the trap. If this is a valid PDP-11 instruction, it is
definitely a hardware problem. Otherwise, the program may not
be executing in the right place, indicating the problem could be
either hardware or software.
8.4.6.3 Parity Error (Trap Thru 114) - This error, caused by
hardware, does not crash the HSC but causes a reboot and SINI
error message. The error message shows that the last reboot was
caused by the trap through 114 and gives the address that
caused the trap.
Determine if the error occurred in memory or in cache memory by
reading the contents of the low error address displayed in the
error printout. If the content is the address of the low error
address register (170024), the error is in cache memory. Any
other address indicates the error is in memory.
In the following example printout, note the low error address and
the high error address fields. When these fields contain the
exact addresses as shown in this example, the error is from the
P.ioj cache.
SINI-E Seq 1. at 17-Nov-1858 00:00:01.60
Parity Error (Trap Thru 114)
Process PSCHED
PC 111022
PSW 140000
Lo err adr 170024
Hi err adr 000077
WBUSR 020633
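
As an illustration of the rule above, the following sketch pulls the low error address out of a printout like the example and reports where the parity error occurred. The function name and the regular expression are assumptions; the field label is matched as either "Lo err adr" or "Lo err add" because both spellings appear in this chapter.

```python
import re

LOW_ERR_REG_ADDR = 0o170024  # address of the low error address register

SAMPLE = """Parity Error (Trap Thru 114)
Process PSCHED
PC 111022
PSW 140000
Lo err adr 170024
Hi err adr 000077
WBUSR 020633"""

def parity_source(printout):
    """Return 'cache (P.ioj)' or 'memory' for a Trap Thru 114 printout."""
    match = re.search(r"Lo err ad[dr]\s+([0-7]{6})", printout)
    if match is None:
        raise ValueError("no low error address field found")
    low_err = int(match.group(1), 8)  # the printout is octal
    return "cache (P.ioj)" if low_err == LOW_ERR_REG_ADDR else "memory"
```

Calling `parity_source(SAMPLE)` on the example printout identifies the P.ioj cache, which agrees with the text.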

8.4.6.4 Level 7 K Interrupt (Trap Through 134) - A level 7 K
interrupt, detected by hardware or microcode, occurs when one or
more requestors detect a fatal error condition while executing
functional code.
The microcode-detected errors causing level 7 K interrupts result
from a microcode consistency check failure in either K.sdi,
K.sti, or K.ci microcode.
Requestor hardware-detected errors are the result of errors
detected on the control bus.
K.ci hardware-detected errors are a result of errors detected on
the control bus, scratchpad RAM parity errors, data bus parity
errors, host clears, or control bus NXMs (not related to data
transfers).
The requestor, upon detecting the error, generates a level 7
interrupt to the P.ioj. The P.ioj traps through location 134
causing a reboot.

8.4.6.5 Control Bus Error Conditions (Hardware Detected) - The
hardware detected control bus errors causing level 7 K interrupts
are:
o  Control Bus Error - The requestor was in the process of
   executing a control bus cycle and received CERR L (control
   bus error low) from the P.ioj. The P.ioj had detected an
   illegal control bus cycle type.
o  Control Bus Parity Error - The requestor detected bad
   parity on the data it read off the control bus.
o  Control Bus NXM - The requestor tried to reference
   control memory and did not receive an acknowledgment
   (CACK L) from the M.std2 within the timeout period.

8.4.6.5.1 Level 7 K Interrupt Printout - An example of a
detected level 7 K Interrupt follows:

SUBSYSTEM EXCEPTION *V# 250              HSC LONDON
at 25 Oct 1985 00:08:46.64               0 23:23:21.40
User PC: 110574 caused by (134) Kint
PSW: 140011
PSCHED active    PCB addr = 054536
R0-R5: 000000 000024 000000
Kernel SP: 000774
Kernel Stack:
005046 000004 052744 045412 001012 000000 045644 000000
052136 000000 047260 000000 051300 000000 054742 000000
User Stack:
150042 147502 147516 000000 102146 000000
000000 000000 000000 000000 000000 000000
KPAR(0-7):
000440 000640 001040 1577770 001440 001240 000240 177600
KPDR(0-7):
077506 077506 077506 077406 077506 077406 077506
UPAR(0-7):
000000 000000 000000 002204 001240 000240 177600
UPDR(0-7):
077406 077406 077406 063406 077406 000116
User SP: 000774
MMSR(0-2): 000017 000000 037260
Window Index Reg: 000026
Window Bus Reg: 001431
WADR(0-7):
160004 161004 162004 163004 164004 165004 166004 167004
Translated WADR(0-7):
001401 001401 001401 001401
Error Regs: 170024 000077
Status of Requestors(1-9):
000001 000377 000175 000377
(PC-6) to (PC):
013737 141020 110560 013701
Control area for slot #000006
Control area address: 017660:
Register area contents:
000000
000000
000011
021154
102557
000770
000000
000000
017650
000000
057502
005317
002224
001000
000000
000671
000000
143444
107001
001000
005317
002212
000671
001000
000000
000000
000000
040506
000010
000374
043520
005400
001000
Booting
INIPIO-I Booting

Requestor 6 has failed with a status of 175. Refer to
Appendix D to determine if the failure was a control bus error.
At this time the HSC70 reboots. A message is displayed on the
local console terminal stating the HSC70 has rebooted.

000377

A*HSC Version 200 29-Sept-1985 23:17:28

System LONDON\*

The actual SINI error message is printed on the local console
terminal after the HSC70 has rebooted.
SINI-E Error sequence 1. at 17-Nov-1858 00:00:03.00
Last soft init caused by level 7 K interrupt
From process PSCHED
PC 110574
Status: 001 377 377 377 377 175 377 377 377

The resulting 134 trap information is printed on the local
console terminal. The PSCHED statement indicates PSCHED was the
active process when the error occurred. The status statement
shows requestor 6 failed with a status of 175. Also, three lines
after the status line is a message line indicating the control
area for slot six and the slot six control address. This indicates
requestor six is the failing requestor. The INIPIO-I Booting
statement indicates the HSC70 is attempting to reboot.
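
Reading the status line by hand amounts to counting fields from the left, one per requestor slot. A minimal sketch of that bookkeeping follows. The function names are hypothetical, and treating 001 and 377 as the values to skip is an assumption drawn only from this one example; the real meaning of each status value is defined in Appendix D.

```python
def requestor_statuses(status_line):
    """Map requestor slot number (1-9) to its octal status value."""
    fields = status_line.split(":", 1)[1].split()
    return {slot: int(value, 8) for slot, value in enumerate(fields, start=1)}

def suspect_requestors(statuses, ignore=frozenset({0o001, 0o377})):
    """List slots whose status is not in `ignore` (an assumed set, see above)."""
    return [slot for slot, code in statuses.items() if code not in ignore]

statuses = requestor_statuses("Status: 001 377 377 377 377 175 377 377 377")
suspects = suspect_requestors(statuses)
```

With the example status line, the only slot left in `suspects` is 6, matching the control area line for slot six in the subsystem exception printout.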
When the HSC70 completes the initialization, the Last Soft Init
caused by Level 7 K interrupt failure is printed on the local
console terminal identified by SINI-E. The active process at
time of failure is identified. In this case, the active process
was PSCHED. If the failure is a hard failure, the following
message may also be displayed on the local console terminal.
SINI-E ERROR SEQUENCE 1. 25-OCT-1858 00:00:02.80
REQUESTOR 6 FAILED INIT DIAGS, STATUS 107
This message is also considered an out-of-band error.
8.4.6.6 MMU (Trap Thru 250)
Following is a sample printout of a detected Memory Management
Unit (MMU) failure.
**SUBSYSTEM EXCEPTION** V# YIOB          HSC70 LAYER
at 12-DEC-1985 13:43:40.05               2 19:24:07.40
User PC: 004747 caused by (250) MMU
PSW: 140000
SETSHO active, PCB addr
R0-R5: 000320 000001 100000 104116 100212 000266
Kernel SP: 000774
000002
Kernel Stack:
005046 000004 053314 045762 001012 000000 046214 000000
047022 000000 047426 000000 052052 000000 051042 000000
User Stack:
040314 021356 033552 021356 021246 000040 017440 017440
020040 020037 020037 000330 101000 027113 000144 060542
KPAR(0-7):
000440 000640 001040 001440 002040 001240 000240 177600
KPDR(0-7):
077506 077506 077506
UPAR(0-7):
007074 007274 006410 000000 002240 001240 000240 177600
UPDR(0-7):
077506 077406 013406 077406 077506 000116
User SP: 000226
MMSR(0-2): 040145 000000 004743
Window index reg: 000002
Window bus reg: 001407
WADR(0-7):
160000 161004 162440 163000 164004 165004 166220 167034
Translated WADR(0-7):
000000 001401 067510 040000 001401 010444 001407 000203
000377
Error regs: 170024 000077
Status of requestors(1-9):
000001 000002 000002 000002
(PC-6) to (PC):
027441 067516 051040
000377
071545

Because the trap is a memory management trap, look first at the
register contents of MMSR0 (memory management status register 0).
Refer to Figure 8-11 for a breakdown of the bits in MMSR0.

Bit(s)   Meaning
15       ABORT, NONRESIDENT
14       ABORT, PAGE LENGTH ERROR
13       ABORT, READ ONLY ACCESS VIOLATION
12       TRAP, MEMORY MANAGEMENT
11       NOT USED
10       NOT USED
9        ENABLE MEMORY MANAGEMENT TRAP
8        MAINTENANCE MODE
7        INSTRUCTION COMPLETED
6-5      PAGE MODE
4        PAGE ADDRESS SPACE I/D
3-1      PAGE NUMBER
0        ENABLE RELOCATION

CX-1126A

Figure 8-11  MMSR0 Bit Breakdown

Look at the printout lines for MMSR(0-2). Compare the bits set
in MMSR0 to the bit breakdown in Figure 8-11. The example
indicates a page length violation on page 2: the page length
error bit is set, and the page number field contains 2.
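The bit comparison described above can be sketched in a few lines of Python. This is only an illustration: the field positions (abort flags in bits 15-13, page mode in bits 6-5, page number in bits 3-1, enable relocation in bit 0) are assumed from the conventional PDP-11 status register layout and should be checked against Figure 8-11, and the name decode_mmsr0 is hypothetical, not an HSC utility.

```python
def decode_mmsr0(value):
    """Split an MMSR0 word into the fields named in Figure 8-11.

    Bit positions are an assumption (conventional PDP-11 layout);
    verify them against Figure 8-11 before relying on the result.
    """
    modes = {0: "kernel", 1: "supervisor", 2: "illegal", 3: "user"}
    return {
        "abort_nonresident": bool(value & 0o100000),  # bit 15
        "abort_page_length": bool(value & 0o040000),  # bit 14
        "abort_read_only": bool(value & 0o020000),    # bit 13
        "page_mode": modes[(value >> 5) & 0o3],       # bits 6-5
        "page_number": (value >> 1) & 0o7,            # bits 3-1
        "relocation_enabled": bool(value & 0o1),      # bit 0
    }


# First MMSR word from the sample printout.
fields = decode_mmsr0(0o040145)
print(fields)
```

For the sample value 040145, the page length abort flag is set and the page number field is 2, which matches the reading given in the text.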
Next, check the PSW line to determine the mode in which the HSC70
reported this error. A 140000 in the PSW means user mode; a
000000 in the PSW means kernel mode. Also, above the PSW line
the word user or kernel appears to identify the mode. Our
example shows user mode is active. Therefore, the next registers
of interest are the UPAR and UPDR. If the active mode
had been kernel, the important registers would have been the KPDR
and KPAR registers.
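As a rough sketch of the mode check, the Python below reads the top two bits of the PSW and names the register sets to examine. Only the two encodings given in the text (140000 for user, 000000 for kernel) are recognized; treating bits 15-14 as the mode field is an assumption based on those two values, and the function names are hypothetical.

```python
def psw_mode(psw):
    """Return the processor mode held in PSW bits 15-14 (assumed field)."""
    modes = {0o0: "kernel", 0o3: "user"}
    return modes.get((psw >> 14) & 0o3, "other")


def registers_of_interest(psw):
    """Name the page register sets to inspect for the reported mode."""
    mode = psw_mode(psw)
    if mode == "user":
        return ("UPAR", "UPDR")
    if mode == "kernel":
        return ("KPAR", "KPDR")
    return ()


# PSW value from the sample MMU printout.
print(psw_mode(0o140000), registers_of_interest(0o140000))
```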

8-117

The first group of numbers under the UPAR(0-7) line is for page
zero, the second for page one, the third for page two, and so
forth. The third group of numbers in the example is for page
two, the violated page. Note the difference between the UPDR
contents for page two and the UPDR contents for the other pages.
The UPDR contents for the other pages all start with 077, designating
a full page of memory to be allocated for that page. The UPDR
contents for page 2 start with 013, which also indicates a failure.
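The comparison of UPDR values can be illustrated with the short Python sketch below. It assumes the conventional PDP-11 page descriptor layout, in which bits 14-8 hold the page length field and an upward-expanding page spans that value plus one blocks; neither detail is stated in this manual, and the function name is hypothetical.

```python
def page_length_blocks(pdr):
    """Return the page length in blocks for an upward-expanding page.

    Assumes the page length field occupies PDR bits 14-8
    (conventional PDP-11 layout, not confirmed by this manual).
    """
    plf = (pdr >> 8) & 0o177
    return plf + 1


# UPDR values for pages 0-4 as they appear in the sample printout.
updr = [0o077506, 0o077406, 0o013406, 0o077406, 0o077506]
for page, pdr in enumerate(updr):
    print(f"page {page}: UPDR {pdr:06o} -> {page_length_blocks(pdr)} blocks")
```

Under these assumptions the entries beginning with 077 describe a full 128-block page, while the page 2 entry beginning with 013 describes a much shorter page, which is consistent with a page length violation on that page.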
Two possible problems cause this error:
1. Memory Management Unit on the P.ioj
2. Software

If the error occurred in page 0, the problem is a hardware
problem. Replace the P.ioj. Otherwise, let the error recur and
see if different pages are affected.
Software Inconsistency (Trap thru 20) is reported in a manner
similar to the MMU trap. A subsystem exception is dumped on the
local console terminal with the trap vector reported being a Trap
thru 20 (IOT). An example printout and explanation are found in
Appendix B.

The subsystem exception is followed by the HSC70 reboot. Upon
successful reboot, the following message is displayed.
HSC70 Version YIOs 16-Jan-1986 15:30:20.20 System MASTER

Then the SINI error resulting from the detected subsystem
exception is printed.
SINI-E Sequence 1. at 16-Jan-1986 00:00:11.20
Last soft init caused by software inconsistency
From process HOST
PC 007044
PSW 140001
Stack dump: 000016 006401 015476

APPENDIX A
INTERNAL CABLING DIAGRAM

A.1 HSC70 INTERNAL CABLING

Figure A-1 is a diagram of the internal HSC70 cabling.

[Figure: cabinet cabling overview, drawing CX-944A, sheet 1 of 5. Legible callouts: AIRFLOW SENSOR CABLE (1701275-01); OCP/BACKPLANE CABLE (1701215-01); BACKPLANE (REAR VIEW); BP TO PS INTERCONNECT (1701266-01); POWER CONTROLLER ASSEMBLY; TOP I/O BULKHEAD ASSY; BOTTOM I/O BULKHEAD ASSY; 3 PHASE/NEUTRAL/GND AC POWER CORD.]

Figure A-1  HSC70 Internal Cabling (1 of 5)

A-2

WIRE TABLE: 1226092-01 A/F SENSOR
COLOR   FROM     TO     SIGNAL  REMARKS
RED     A1-+     J70-1
BLACK   A1-GND   J70-3
WHITE   A1-LOAD  J70-2

RED
BLK

I
!

YEL
YEL

WHT
WHT

FROM
P4-01
P4-02
P4-03
P4-04
P4-05
P4-06
P4-07
P4-08
P4-09
P4 10

...
SI -3
SI -6
SI -4
SI-5

I
I

WIRE TABLE
COLOR

S 1 -1
SI -2

1701202--01 OCP TO ROCKER SWITCH
REMARKS

SIGNAL
NO CONNECTION
NO CONNECTION
+5 VOLT
GND [+5 VOLT J
NO CONNECTION
GND
TERM ENABLE
NO CONNECTION
INIT SWL
INIT L

SPARE
KEYING PLUG

SPARE
i

SPARE

WIRE TABLE

COLOR
YELLOW
YEL/ORG
YEL/BLU
YEL/GRN
YEL/BLK
YEL/VIO
YEL/GRY
YEL/WHT
YEL/RED
YEL/BRN
YEL/BLK/GRYYEL/GRN/ORG
YEL/RED/WHT
BLACK
RED

FROM
J40-1
J40-2
J40-3
J40-4
J40-5
J40-6
J40-7
J40-8
J40-9
J40- 10
J40-! 1
J40-12
J40- 13
J40- 14
J40-15

I
i

1701203-01 OCP CABLE
REMARKS
OCP SIGNAL
:
STATE LAMP L
POWER LAMP L
i
LAMP ENA 0 L
TERM ENA L
P~-6
I LAMP ENA 2 L
P3-5
LAMP ENA 1 L
!
LAMP ENA 4 L
P3-8
P3-7
LAMP ENA 3 L
i
P3-10
PANEL SWITCH 1 l i
PANEL SWITCH 0 L!
P3-9
P3-12
PANEL SWITCH 3 Li
PANEL SWITCH 2 Li
P3-11
P3-15 • BDCOKH (INIT LJ
P3-14 i GND
P3-16
+5V
P3-20
KEY 1 Nr, PI 11r, (nrp 1
TO
P3-1
P3-2
P3-4
P3-3

WIRE TABLE: 1701215-01 OCP/BACKPLANE
COLOR          FROM    TO      SIGNAL
YELLOW         J12-1   P40-01  STATE LAMP L
YELLOW/ORG     J12-2   P40-02  POWER LAMP L
YELLOW/BLUE    J12-3   P40-03  LAMP ENA 0 L
YELLOW/GRN     J12-4   P40-04  TERM ENA L
YELLOW/BLACK   J12-5   P40-05  LAMP ENA 2 L
YELLOW/VIOLET  J12-6   P40-06  LAMP ENA 1 L
YELLOW/GRAY    J12-7   P40-07  LAMP ENA 4 L
YELLOW/WHITE   J12-8   P40-08  LAMP ENA 3 L
YELLOW/RED     J12-9   P40-09  PANEL SWITCH 1 L
YELLOW/BRN     J12-10  P40-10  PANEL SWITCH 0 L
YELL/BLK/GRY   J12-11  P40-11  PANEL SWITCH 3 L
YELL/GRN/ORG   J12-12  P40-12  PANEL SWITCH 2 L
YELL/RED/WHT   J12-13  P40-13  BDCOK H (INIT L)
BLACK          J12-14  P40-14  GROUND
RED            J12-16  P40-15  +5 VOLTS
RED            J12-19  P41-04  +5 VOLTS
RED            J12-20  P42-04  +5 VOLTS
BLACK          J12-21  P41-02  GROUND
BLACK          J12-22  P41-03  GROUND
BLACK          J12-23  P42-02  GROUND
BLACK          J12-24  P42-03  GROUND
VIOLET         J12-25  P41-01  +12 VOLTS
VIOLET         J12-26  P42-01  +12 VOLTS

CX-944A
Sheet 2 of 5

Figure A-1 HSC70 Internal Cabling (2 of 5)

A-3

WIRE TABLE: 1701231-01 RELAY TO PC A/F SENSOR
COLOR   FROM   TO     SIGNAL  REMARKS
WHITE   K1-3   P8-1   TRIP
WHITE   K1-5   P8-2   RETURN

WIRE TABLE
FROM
COLOR
YELLOW i S2-2
ORANGE I S2- I
bLUt. i S2-4
BLACK 1 S2-3

1701231-02 DC ON/OFF

SIGNAL
ON/OFF ( -S.2VI
S2ON/OFF (+S.OV)
S 1-

TO
P33-4
P33-3
P33-2 I
P33-1 1

REMARKS
I

WIRE TABLE

COLOR I
V I OLET !
VIOLET!
VIOLET I
V 10LET I
BLK
BLK
BLK
BLK
ORANGE
BLK
BRN
BLK
RED ,
BLK
VIOLET!
BLK
RED i
BRN I
1

FROM
J 13- I
J13-2
J13-3
J13-4
JI3-S
J13-6
J13-7
J13-8
J13-9
J13-10
J13- II
J13-14
J13- 13
J13- 16
J 13- IS
J13-17
J13-18
J13-20

COLOR
WHITE
WHITE/BLK
WHITE/BLU
WHITE/ORG
WHITE/RED
WHiTE/V 10
WHITE
WHITE/BLK
WHITE/BLU
WHITE/ORG
WHITE/RED
WHITE/VIO
WHITE
WHITE/BLK
WHITE/BLU
WHITE/ORG
WHITE/RED
WHiTE/V 10

1701266-01 BP TO PS
SIGNAL
REMARKS
I
!
I
+12V
+12V
I
+12V
i
I
!
I
+12V
GND(+12VI
GND{+12V)
I
GNDI+12VI
DOUBLE
P31-4
i
GND(+12V)
I CRIMPED 1 STANDARD
POWER
i
-S.2V SENSE
P31-6
TWISTED
SUPPLY
PAIR
GND ( - SV SENSE)
P31-8
I
P31-10
POWER FAIL L
J32- I
GND(+SV SENSE)
TWISTED
!
+SV SENSE
J32-2
PAIR
J32-3
GND(+12V SENSE)
TWISTED
PAIR
+12V SENSE
J32-4
PSO-2
OPTIONAL
GND (+SV SENSE)
TWISTED
POWER
PAIR
PSO-I
+SV SENSE
SUPPLY
POWER FAIL L i
PSO-3
TO
P31-1
P31-3
P31-S
P31 -7
P31-9
P31-2

•

1701267-01 EIA
WIRE TABLE
FROM
!
TO
BACKPLANE SIGNAL
HSC RDY+
JII-I
! J60-20 •
TERM PRES L
JII-2
J60-6 !
!
TERM XMTJII-3 i J60- 1 :
TERM XMT+
Jll -4
J60-2
i
TERM RCV+
Jll-S
i J60-3
IERM RCVJ60-7
Jll-6
i
I
J61-20
HSC RDY+
JII -9
!
")61-6
AUXI PRES L
JII-IO
AUXI XMTJII-II
J61- 1
i
AUXI XMT+
JII - 12 J61-2 !
I
AUXI RCV+
I
JII - 13 ! J61-3
J 11-14 J61-7
AUXI RCVJII - 17 J62-20
HSC RDY+
I
Jll-18 J62-6
AUX2 PRES L
I
I
JII-19 i J62-1
AUX2 XMTI
I
i
J 1 I - 20 i J62 - 2
AUX2 XMT+
:
AUX2 RCV+
JII-21 : J62-3
JII-22 : J62-7 I
\
AUX2 RCV-

REMARKS

1
1

!
i

CX-944A
Sheet 3 of 5

Figure A-1 HSC70 Internal Cabling (3 of 5)

A-4

WIRE TABLE

FROM
COLOR
VIOLET I P70- 1
VIOLET i P35
ORANGE j P70-2
ORANGE I P70-3

1701275-01 A/F SENSOR CABLE
SIGNAL
REMARKS
I
+
12
V
K
1
1
DOUBLE
CRIMP
:
LOAD [ -5 V)
K1-6
[-5.2V BUSBAR @ BACKPLANE
-5.2V
TO

WIRE TABLE
COLOR J FROM
I
GRN/YEL I GND STUDl
TB 1 - 1 -7 i
[3LUE
• TB 1 - 1 -6 1
eRN

•

•
•

1701276-01 STD POWER SUPPLY
REMARKS

SIGNAL
i
I GND
i ACC

.POWER CONTROLLER~ 2

1 AC

WIRE TABLE

COLOR I FROM
I
GRN/YEL!GND STUD I
i TBI-7
BLUE
L
ITBI -6
BROWN
I
COLOR
BLUE
BROWN
GREEN
BLACK

1701276-01 OPT POWER SUPPLY
SIGNAL
REMARKS
I GND
• POWER CONTROLLER ~ 3
I ACC
AC

~3
~3
~3

•
•

WIRE TABLE
1701276 - 02 BLOWER AC LI NE CORD
REMARKS
FROM
SIGNAL
I TO
AC
P80-1i
NEUTRAL
AC
IN MOLDED PLUG !P80-2
LINE
I P80-3
GROUND
JUMPER
P80-4 I
P80-5

WIRE TABLE
COLOR
BLUE
BROWN
GREEN
BLACK
BLACK

FROM
i

IN MOLDED PLUG
P80-7
P80-6

TO
P80-1
P80-2
P80-3
P80-4
P80-5

1701276-03 BLOWER AC LINE CORD
REMARKS
SIGNAL
AC
NEUTRAL
!
AC
LINE
GROUND
.JUMr-'t.H

JUMPER

WIRE TABLE

COLOR
VIOLET
VIOLET
VIOLET
VIOLET:
BLACK
BLACK
BLACK
ORANGE
BLACK
BROWN

FROM
TO
J31 -1
I TB 1 -3-5
J31-3
J31-5
TBI-3-6
J31 -7
J31 -9
TBI-3-3
J31 -2
TB 1 - 3-3
J31-4
I TB 1 -2-2
J31 -6
TB 1 - 2- 1
J31 -8
TB 1 - 1 - 4
J31 - 10

COLOR
BLACK
REO
BLACK
VIOLET

FROM
P32-1
P32-2
P32-3
P32-4

COLOR
REO
BLACK
BROWN

FROM
J50- 1
J50-2
J50-3

COLOR
YELLOW
ORANGE

FROM
J33-4
J33-3

7019680-01
SIGNAL
REMARKS
DOUBLE
+12 V
CRIMP
DOUBLE
CRIMP
DOUBLE
CRIMP

+12 V

GND [+ 12 V)
GND [+12 V)
TWISTED
-5V SENSE
GND [ - 5V SENSE) I PAIR
POWER FAIL

WIRE TABLE

~~33-2

BLUE
BLACK
BLACK

J34-2
J33- 1
J34-1

7019681-01
SIGNAL
REMARKS
GROUND
TWISTED PAIR
I +5V SENSE
GROUND
TWISTED PAIR
+12V SENSE
7019683-01
WIRE TABLE
TO
SIGNAL
I
REMARKS
+5V SENSE
TB1-1
TWISTED PAIR
GND
[+5V
SENSE)
I TBI-2
TBI-4
PWR FAIL
TO
TB 1 - 1 -2
' TBI - 1 - 1
TBI-3-4
I TB 1 - 3- 1

WIRE TABLE
TO
TB 1 - 2 - 3
T61-2-2
, TB 1 - 1 -3

7020197-01
SIGNAL
REMARKS
ON/OFF [-5.3V) i
S2ION/OFF [+5V)
DOUBLE CRIMP
!

, TB 1 - 1 - 2

; SI-

DOUBLE CRIMP
CX-944A
Sheet 4 of 5

Figure A-1 HSC70 Internal Cabling (4 of 5)

A-5

WIRE TABLE

COLOR
BLUE
alACK

I FROM
1 ,)51-2
1 ,)51-1

COLOR
BLACK
BLUE

I FROM

TO
I
I P51-1
IP51-2

I P34-1
IP34-2

COLOR 1 FROM
VIOl.,-ET I ,)35

TO
ITBI-3
I TBI-2

7020198-01
SIGNAL
I
1
ION/OFF +5V
1
I SI

REMARKS

WIRE TABLE

7020199-01
SIGNAL
I
I SI
ION/OFF (+5Vl
1
I

WIRE TABLE

TO
I TB 1 - 3 - 2

7020203-01
SIGNAL
1
I + 12 V
I

REMARKS

CX-944A
Sheet 5 of 5

Figure A-1 HSC70 Internal Cabling (5 of 5)

A-6

APPENDIX B
EXCEPTION CODES AND MESSAGES

This appendix describes all known HSC exception (crash) codes caused
entirely or in part by software inconsistencies. For ease of reference, these codes are arranged in numerical order (octal radix).
Each message contains the code number, the meaning of the crash,
the facility causing it, an explanation, and user action. Note
that the code number, but not the text, appears on hardcopy printouts.

B.1 OVERVIEW
In order to determine which exception code caused a particular
crash, refer to the crash dump printed out at your terminal. The
following HSC70 crash dump example shows you where to look.

B-1

Examples

SUBSYSTEM EXCEPTION *V100  HSC70 HSC001 [1]
at 17-Nov-1858 00:13:34.20  up 0 00:13:34.20
User [2] PC: 015066  caused by (20 IOT [3]
PSW: 140001
DEMON [4] active, PCB addr = 054214
R0-R5:
000005 000000 023004 147602 1n0020 154752
Kernel SP: 000774
Kernel Stack:
[5] 005045 000004 053336 046004 001012 000000 046230 000000
    047044 000000 047450 000000 052074 000000 055334 000000
User SP: 154734
User Stack:
[6] 002013 104262 140310 102250 000034 035064 004305 000000
    000000 000003 000001 000004 000000 000000 002445 000000
KPAR (0-7) :

Booting
INIPIO-I

Booting ..•

[1] This line calls out a crash and indicates the HSC70 is at
software version number V100. The last field is the assigned
node name (set with SET NAME).

[2] Mode of the crash. This can be either Kernel or User. It
indicates in which processor mode the crash occurred.

[3] This three-letter mnemonic indicates the crash is a software inconsistency. Any other combination of letters, such
as NXM (Non-Existent Memory), would designate a crash outside the scope of this appendix. Hardware exceptions are
defined in Appendix D.

[4] The initial name on this line identifies the process active at the time of the crash. It is valid only during user-mode crashes. This name can be used as a cross-check when
you look up your crash description.

[5] If the mode notation is Kernel, you would read the first
word of the Kernel Stack for your crash code.

[6] Because the mode notation in this example indicated User,
check the User Stack for your crash code number. This code
is always the first word on the stack (in this case 002013).
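The lookup rule just described (Kernel mode reads the Kernel Stack, User mode reads the User Stack, and the code is the first word) can be sketched as follows. The function name and the list representation of the stacks are illustrative assumptions; the values are the first words of the two stacks in the example dump.

```python
def crash_code(mode, kernel_stack, user_stack):
    """Pick the exception code: first word of the stack for the crash mode."""
    stack = kernel_stack if mode.strip().lower() == "kernel" else user_stack
    return stack[0]


# First words of each stack, taken from the example crash dump.
kernel_stack = [0o005045, 0o000004, 0o053336]
user_stack = [0o002013, 0o104262, 0o140310]

code = crash_code("User", kernel_stack, user_stack)
print(f"{code:06o}")
```

Printing the code as six octal digits keeps it in the same form as the headings in Section B.3.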

The crash codes are listed numerically in this appendix (Section
B.3). Consult them for explanations and suggested user action.
The following SINI-E error example appears immediately upon reboot after a subsystem exception. Information contained in this
error message is a condensation of the crash dump.
Examples
SINI-E Seq 1. at 17-Nov-1858 00:00:02.00
Software inconsistency [1]
Process DEMON [2]
PC 000002
PSW 140001
Stack dump: 002013 104262 140310

[1] This line defines the cause of the crash.

[2] This line and the following three lines duplicate the applicable information in the crash dump.
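In the example above, the first word of the Stack dump line (002013) is the same word that heads the User Stack in the crash dump. A small Python sketch for pulling those words out of a captured console log is shown below; the function name and the regular expression are illustrative assumptions about the message layout, based only on the sample shown here.

```python
import re


def stack_dump_words(message):
    """Return the octal words on the 'Stack dump:' line of a SINI-E message."""
    match = re.search(r"Stack dump:\s*((?:[0-7]{6}\s*)+)", message)
    if match is None:
        return []
    return match.group(1).split()


sample = """SINI-E Seq 1. at 17-Nov-1858 00:00:02.00
Software inconsistency
Process DEMON
PC 000002
PSW 140001
Stack dump: 002013 104262 140310
"""

words = stack_dump_words(sample)
print(words)
```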

In each of the exception descriptions in this appendix, Facility indicates the process(es) running at the time the crash occurred. The first name listed is the major process. The second
is the module of the process that generated the exception. This
may be a subprocess of the main process or simply a different code
module. A large number of these messages request submission of
an SPR (Software Performance Report). This process is described
in the following section.
B.2 SPR SUBMISSION
Include with the SPR the crash dump message and any other hardcopy information needed. Your customer will contact you or the
Customer Support Center if a crash dump appears with one of the
exception messages described in this appendix. The HSC User Guide
gives the customer a short explanation about the exception condition. This appendix shows the same messages, but provides more
detailed information needed to analyze the crash.
In many cases, the HSC User Guide tells your customer to submit
a Software Performance Report (SPR). The SPR should be submitted only after you decide a hardware condition did not cause the
error, and you suspect a software problem.
In some cases, not all of the necessary information which must
accompany the SPR is contained in the crash dump (the message printed
on the console when the HSC detects an exception). This appendix
lists the additional information you or the customer must gather
after the HSC has printed its crash dump message.
After this additional information is known, the Customer Support
Center may be able to assist you over the telephone. If an SPR
is necessary, your customer must include all the information listed
for the specific exception code.
After two or three similar exception messages occur and you determine the customer should submit the SPR, look up the exception message in this appendix. If a data structure (for instance,
HMB or PCB) should be included with the SPR, set the ODT parameter, causing the HSC to enter ODT after an exception. If data
structures are not requested in the applicable exception code,
you do not need to enter ODT.

8-4

To set the ODT parameter, type:
CTRL/Y
HSC> RUN SETSHO
SETSHO> SET ODT DUMP 8PT
SETSHO> EXIT
SETSHO-Q Rebooting HSC; Y to continue, CTRL/Y to abort:?

The HSC then reboots with the new parameter ODT DUMP 8PT set. When
the next exception occurs, the HSC prints the exception message,
followed by an asterisk (*) prompt. Instruct your customer to call
you or the Customer Support Center when the next crash occurs.
NOTE
If you instruct the customer
to call the Customer Support
Center for assistance, inform
the Center of the problem. Also,
let them know your customer will
need help gathering information related to the software
error.
When another crash occurs, check the appropriate exception code
in this appendix for information needed to analyze the crash. Include all the requested information with the SPR.
Data structures needed with the SPR must be formatted. These data
structures are addressed by a register or the contents of another
structure's field. To format the necessary data structure(s), substitute the x in Table B-1 with the pointer from the specified
register or location. Substitute only the x and type the rest of
the line exactly as you see it in the table, except for the information in parentheses. The number of = signs designates which
memory the data structure is in (= indicates Program memory, == indicates Control memory, and === indicates Data memory).
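The substitution rule can be sketched in Python as shown below. The helper names are hypothetical, the pointer is the PCB address from the example crash dump, and the mapping of two = signs to Control memory is an assumption drawn from the one-, two-, three-sign pattern in the sentence above.

```python
MEMORY_BY_EQUALS = {1: "Program", 2: "Control", 3: "Data"}


def odt_format_command(pointer, template):
    """Replace the leading x of a table entry with an octal pointer."""
    if not template.startswith("x"):
        raise ValueError("template must begin with the placeholder x")
    return f"{pointer:06o}" + template[1:]


def memory_for(template):
    """Name the memory selected by the number of = signs in the entry."""
    return MEMORY_BY_EQUALS[template.count("=")]


# PCB entry from the table, with the PCB address from the example dump.
command = odt_format_command(0o054214, "x=Z.")
print(command, memory_for("x=Z."))
print(memory_for("x==HM$"))
```

Only the x is replaced; everything after it is passed through unchanged, as the text requires.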

B-5

Table B-1  Obtaining Data Structure Information

Data Structure     Type this at the * prompt
Needed
-----------------  -------------------------------------------------
                   x==CB$
Counter            x=C. (and) x==C.
DCB                x==DC$DISK (or) x==DC$TAPE (if tape path problem)
DDCB               x=DD$
FRB                x==F$
HCB                x=HC$
HMB                x==HM$ (command packet)
                   x==HM$CPY (BACKUP)
                   x==HM$DATA (with BMBs)
                   x==HM$QUIET (diagnostic)
                   x==HM$XFR (used while work is outstanding)
                   x==HM$VC (used to alter VC state)
K Control Area     x==KG$
PCB                x=Z.
SLCB               x=SL$
TDCB               x=TD.
TFCB               x=TF.
TTCB               x=TT$
XFRB               x==X.

After the information is complete, the customer should fill out
the SPR and submit it together with all hardcopy as instructed
on the SPR form.
B.3 EXCEPTION MESSAGES

001001 ($CKERSTK)
Execution of Kernel Stack
Facility: EXEC, EXEC
Explanation: The HSC executive executed stack space.
User Action: Submit an SPR with a dump. You may reboot immediately.
001002 ($CPUMl)
Previous mode not user
Facility: EXEC, EXEC
Explanation: During context switch of user processes, the previous mode (as indicated by the PSW) was not user mode.
User Action: Submit a Software Performance Report (SPR) with
a dump. R5 points to PCB (Process Control Block).
001003 ($CEXPCB)
EXEC PCB was scheduled
Facility: EXEC, EXEC
Explanation: During process scheduling, the EXEC PCB (Process
Control Block) was scheduled. This dummy PCB is used only for
loading the process and should never be scheduled.
User Action: Submit an SPR with a dump. R2 points to PCB.

B-7

001004 ($CDEBCAC)
Cache setting in PDR is in incorrect state
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. A PDR (Page Descriptor Register) directed to program memory does not have "disable cache" set.
A PDR directed to data memory does have "disable cache" set.
User Action: Submit an SPR with a dump. R0 points to PDR.
001005 ($CPUM2)
Previous mode not user
Facility: EXEC, EXEC
Explanation: During context switch of user processes, the previous mode (as indicated by the PSW) was not user mode.
User Action: Submit an SPR with a dump.
001006 ($CCB4)
Spurious Interrupt from K at Control Bus Level 4
Facility: EXEC, EXEC

Explanation: This software inconsistency should not appear under normal circumstances. One of the Ks interrupted the p.ioc
at level 4 (an element should be on the level 4 interrupt queue),
yet upon examination, no elements were shown on the queue.
User Action: Save any dump before rebooting. Submit an SPR.
If this crash continues to occur, escalate the problem to Field
Service support.

B-8

001007 ($CCB5)
Spurious Interrupt from K at Control Bus Level 5
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the Ks interrupted the p.ioc
at level 5 (an element should be on the level 5 interrupt queue),
yet upon examination, no elements were shown on the queue.
User Action: Submit an SPR. If this crash continues to occur,
escalate the problem to Field Service support.
001010 ($CDC1)
Downcount failed
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. During processing of the level 5 interrupt queue, a down-count operation on a counter (down counted
by 1) failed.
User Action: Submit an SPR with a dump. R1 points to counter.
001011 ($CDC2)
Downcount failed
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. During processing of the level 5 interrupt queue, a down-count operation on a counter (down counted
by 1) failed.
User Action: Submit an SPR with a dump. R1 points to counter.

B-9

001012
Acquire on Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The ACQ$P System Service was called
with a Semaphore address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001013 ($CAML)
Acquire Multiple on Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The AMLT$P System Service was called
with a Semaphore address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001014 ($CRLP)
Release on Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The REL$P System Service was called
with a Semaphore address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-10

001015 ($CRRTI)
RRTI$ on Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The RRTI$P System Service was called
with a Semaphore address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001017 ($CRTI2)
RRTI$ on Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The RRTI$P System Service was called
with a Semaphore address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-11

001020 ($CRCPP)
Receive/Dequeue from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RCV$P FROM$P or DEQ$P FROM$P
System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001021 ($CRCCP)
Receive/Dequeue from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RCV$C FROM$P or DEQ$C FROM$P
System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001022 ($CRCCV)
Receive/Dequeue from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RCV$C FROM$P, DEQ$C FROM$P,
RCV$C FROM$W, or DEQ$C FROM$W System Services was called with
a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-12

001023 ($CRMPP)
Receive/Dequeue Multiple from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RMLT$P FROM$P, or DMLT$P
FROM$P System Services was called with a Queue Head address
of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001024 ($CRMCP)
Receive/Dequeue Multiple from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RMLT$C FROM$P, or DMLT$C
FROM$P System Services was called with a Queue Head address
of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001025 ($CRMCV)
Receive/Dequeue Multiple from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RMLT$C FROM$P, DMLT$C FROM$P,
RMLT$C FROM$W, or DMLT$C FROM$W System Services was called with
a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-13

001026
Receive All-Maybe from Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the RCAM$C FROM$P, or RCAM$C
FROM$W System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001027 ($CSPP)
Send/Enqueue to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the SEND$P TO$P or ENQ$P TO$P
System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001030 ($CSCP)
Send/Enqueue to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the SEND$C TO$P or ENQ$C TO$P
System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-14

001031 ($CSCV)
Send/Enqueue to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the SEND$C TO$P, ENQ$C TO$P,
SEND$C TO$W or ENQ$C TO$W System Services was called with a
Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001032 ($CSHPP)
Send-/Enqueue-to-Head to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the SNDH$P TO$P or ENQH$P TO$P
System Services was called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001033 ($CSHCP)
Send-/Enqueue-to-Head to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. One of the SNDH$C TO$P, ENQH$C TO$P,
SNDH$C TO$P, or ENQH$C TO$P System Services was called with
a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

B-15

001034 ($CIHPP)
Insert at Head to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The INSH$P TO$P System Service was
called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001035 ($CIHCP)
Insert at Head to Queue with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The INSH$C TO$P System Service was
called with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001036 ($CUPCV)
Upcount to Counter with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The UPC$ System Service was called
with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.

8-16

001037 ($CDWCV)
Downcount to Counter with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The DWNC$ System Service was called
with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001040
Set Timer operation to Timer with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. The SETTM$ System Service was called
with a Queue Head address of 0.
User Action: The process specified as active is the offender.
Submit an SPR with a dump.
001041 ($CSNZ1)
Release of Semaphore with address of 0
Facility: EXEC, EXEC
Explanation: This software inconsistency should not appear under normal circumstances. During some circumstances, a semaphore
will require a down count, without subsequent scheduling considerations. This typically happens when a process enters hibernation or exits. During the implicit release operation, the
Semaphore had an address of 0.
User Action: Submit an SPR with a dump.

8-17

001042 ($CTOVR)
Time-of-day overflowed
Facility: EXEC, EXEC
Explanation: During update of current time of day, the Executive detected an overflow. This can happen if a node on the
CI sets a bogus time to the HSC.
User Action: Examine previous console printouts to verify accurate date and time fields. If accurate, submit an SPR with
the console crash report. If inaccurate, set the HSC outband
error level to INFO. Then verify console report of date and
time set by a host node on the next HSC reboot. If a host node
problem is NOT indicated, escalate the problem to Field Service support.
001043 ($CPWFL)
Power Failure
Facility: EXEC, EXEC
Explanation: After a power failure indication on the p.ioc,
CRONIC will wait five seconds for power to diminish fully enough
to stop the processor. If the processor is still operating five
seconds after a power failure indication, CRONIC concludes that
the powerfail indication was bogus.
User Action: Verify the dc voltages are correct. If so, and
the problem persists, notify Field Service support.

8-18

001201 ($CNOHIBER)
Process on Recoverable List not Hibernating
Facility: EXEC, EXECLOAD
Explanation: When requested to load a utility or diagnostic,
the Loader first examined the Recoverable Memory List of cached
programs to determine whether a program might be loaded from
memory instead of from the load device. When the program was
indeed found on the Recoverable Memory List, its state was not
Hibernate State. This software inconsistency should not be seen
under normal circumstances.
User Action: Submit an SPR with a dump noting previous activity with the program requested. R3 points to PCB (Process Control Block) for process to restart.
001202 ($CIMAGE)
Memory extent encroaches defined area
Facility: EXEC, EXECLOAD
Explanation: The process to be loaded specified required additional memory for buffer space, as specified on the LFHEADER
(Loadable File Header) directive. When the additional memory
was allocated and mapped to the process, it had encroached upon
the loaded area. This software inconsistency should not appear under normal circumstances.
User Action: Submit an SPR with a dump. R0 points to XFRB (Extended Function Request Block) for loading the image. R4 points
to CH$ (Canonical File Header).

B-19

001203 ($CNOPROC)
No code parent process loaded
Facility: EXEC, EXECLOAD
Explanation: When a process was loaded, its PCB (Process Control Block) specified it should execute and share code associated with another process. When attempting to locate the code
parent, the loader found the parent was not loaded. This software inconsistency should not appear under normal circumstances.
User Action: Submit an SPR with a dump. R2 equals Process Number of code parent. R3 points to code child's PCB.
001204 ($CALLOCATE)
Insufficient Kernel Pool
Facility: EXEC, EXECLOAD
Explanation: When attempting to allocate either a PCB (Process Control Block--Z.) or an Address Descriptor (A.) structure from Kernel Pool for a new process, Kernel Pool was inadequate to support the additional structures.
User Action: Submit an SPR with a dump.
001205 ($CLFAO)

FAO overrun
Facility: EXEC, EXECLOAD
Explanation: When formatting a module version mismatch message, the string returned from FAO was too large for the buffer.
This software inconsistency should not appear under normal circumstances.
User Action: Submit an SPR with a dump. If possible, send a
copy of the load medium.

B-20

001401 ($CBUSY)
Performed receive when already busy with request
Facility: EXEC, EXECRDWR
Explanation: The READ$/WRITE$ service is single-threaded, handling only one request at a time. The service, while in its
exception routine, was already busy with one request while a
RCV$P operation was performed.
User Action: Submit an SPR with a dump.
001402 ($CNOLOADED)
Requested driver not loaded
Facility: EXEC, EXECRDWR
Explanation: A process within the HSC specified a READ$ or WRITE$
operation with a DDCB (Device Control Block) for a device not
configured on that model. For example, a program specified a
transfer for a TUS8 on an HSC70 model. Because the device is
not configured on the system, the driver is not loaded.
User Action: Submit an SPR with a dump, describing activity
on the HSC at the time of the exception. The process listed
as active may be the READ$/WRITE$ service, and not the process which performed the offending request. R3 points to XFRB
(Extended Function Request Block). R4 points to DDCB. R5 equals
CSR for device.


001403
Invalid DDCB specified
Facility: EXEC, EXECRDWR
Explanation: A request to the READ$/WRITE$ service specified
a DDCB (Device Control Block) that was invalid (or specified
an invalid device type in the DD$TYPE field).
User Action: Submit an SPR with a dump, describing activity
on the HSC at the time of the exception. The process listed
as active may be the READ$/WRITE$ service, and not the process which performed the offending request. R3 points to XFRB
(Extended Function Request Block). R4 points to DDCB. R5 equals
CSR for device. R0 equals Device Type.
001501
Software Inconsistency - Motor not Running
Facility: EXEC, EXECRX33
Explanation: The motor was not running when the Motor Shutdown Timer expired.
User Action: Submit an SPR with a crash dump.

001502
Software Inconsistency - Non-RX33 command requested
Facility: EXEC, EXECRX33
Explanation: An XFRB (CRONIC transfer request) was received
by the RX33 driver, but specified a DDCB (Device Control Block)
for a non-RX33 device. R4 points to DDCB, R5 points to XFRB
(Extended Function Request Block).
User Action: Submit an SPR with a crash dump.


001503
Software Inconsistency - Invalid Unit Number
Facility: EXEC, EXECRX33
Explanation: The DDCB (Device Control Block) specified an RX33
device, but the unit requested was not 0 or 1. R5 points to
XFRB (Extended Function Request Block).
User Action: Submit an SPR with a crash dump.

001504
Software Inconsistency - Zero byte count transfer
Facility: EXEC, EXECRX33
Explanation: A transfer was requested with a zero byte count.
User Action: Submit an SPR with the crash dump. R2 equals byte
count, R5 points to XFRB (Extended Function Request Block).

001505
Software Inconsistency - Invalid byte count
Facility: EXEC, EXECRX33
Explanation: A transfer was requested with a byte count that
was not a multiple of 512 (sector size). R2 equals byte count,
R5 points to XFRB (Extended Function Request Block).
User Action: Submit an SPR with a crash dump.

001506
Software Inconsistency - Invalid internal byte count
Facility: EXEC, EXECRX33
Explanation: Remaining byte count of a partially completed transfer was not a multiple of 512 (sector size). The original (requested) byte count was a multiple of 512. R2 equals byte count,
R5 points to XFRB (Extended Function Request Block).
User Action: Submit an SPR with a crash dump.
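
The three byte count checks described for crashes 001504, 001505, and 001506 can be illustrated with a short sketch. This is not the RX33 driver code; the function names are hypothetical, and only the 512-byte sector size and the crash codes are taken from the entries above.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the entries above


def check_requested_byte_count(byte_count):
    """Return the crash code a bad requested count maps to, or None if it is acceptable.

    Hypothetical helper; the real driver logic is not shown in this manual.
    """
    if byte_count == 0:
        return "001504"  # zero byte count transfer
    if byte_count % SECTOR_SIZE != 0:
        return "001505"  # requested count is not a multiple of the sector size
    return None


def check_remaining_byte_count(remaining):
    """Return "001506" if the remaining count of a partial transfer is not sector-aligned."""
    if remaining % SECTOR_SIZE != 0:
        return "001506"
    return None
```

A request for 1024 bytes passes both checks; a request for 1000 bytes would be rejected at the point corresponding to 001505.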


001507

Software/Hardware Inconsistency - RX33 hardware registers are incorrect
Facility: EXEC, EXECRX33
Explanation: RX33 hardware signaled successful completion of
an I/O operation, but the hardware registers (current sector,
current track, or memory address register) did not contain the
expected values.
User Action: The most probable candidates are M.std2 and the
RX33 drives. If the problem persists, submit an SPR with crash
dumps.
001510

Software Inconsistency - Invalid Head Select
Facility: EXEC, EXECRX33
Explanation: Software attempted to select a head other than
0 or 1. R0 equals head select.
User Action: Submit an SPR with a crash dump.
001511

Software Inconsistency - Memory Management
Facility: EXEC, EXECRX33
Explanation: Relocation is not enabled in the memory management hardware. (Bit 0 not set in MMR0.)
User Action: Submit an SPR with a crash dump.
001512

Software Inconsistency - Invalid Virtual Address
Facility: EXEC, EXECRX33
Explanation: The virtual address passed in the XFRB is not in
page 4. R0 points to XFRB (Extended Function Request Block).
User Action: Submit an SPR with a crash dump.
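
As an illustration of the page 4 test in crash 001512, the sketch below computes the page number of a 16-bit virtual address. It assumes standard PDP-11 style memory management (eight 8-Kbyte pages per address space, page number in the top three address bits); this manual does not state the page size, and the function names are hypothetical.

```python
PAGE_SHIFT = 13  # assumption: 8-Kbyte pages, so the page number is bits 15-13


def page_of(virtual_address):
    """Return the page number (0-7) of a 16-bit virtual address."""
    return (virtual_address & 0o177777) >> PAGE_SHIFT


def in_page_4(virtual_address):
    """True if the address lies in page 4 (octal 100000 through 117777 under the assumption above)."""
    return page_of(virtual_address) == 4
```

Under this assumption, an address of octal 120000 falls in page 5 and would fail the check.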

001513

Software/Hardware Inconsistency - Unexpected Interrupt from RX33
Facility: EXEC, EXECRX33
Explanation: An unexpected interrupt was received from the RX33
controller. This condition is not detected until a command is
about to be issued (i.e., the crash does not happen when the
interrupt is detected).
User Action: If problem persists, submit an SPR with crash dumps.
Further testing of the subsystem (load device area) may be necessary.
001514

Software Inconsistency - Invalid Internal Unit Number
Facility: EXEC, EXECRX33
Explanation: The unit number index value is not 0 or 2. This
unit number index value is contained in R4.
User Action: Submit an SPR with a crash dump.
001515

Software/Hardware Inconsistency - Non-Existent Memory
Facility: EXEC, EXECRX33
Explanation: RX33 controller returned an NXM error.
User Action: Further testing of the HSC subsystem (load device area) may be necessary. If problem persists, submit an
SPR with crash dumps.


001601

TYPE$ crosses page boundaries
Facility: EXEC, EXECTT
Explanation: A process requested a TYPE$ System Service (or
an ACPT$ Service with a prompt) specifying a buffer which crosses
a memory management page boundary. This is a restriction of
the driver.
User Action: Submit an SPR with a dump, describing activity
at the time of the exception. R0 equals size of print string.
R1 points to String Buffer. R4 points to TTCB (Device Control
Block). R5 points to XFRB (Extended Function Request Block).
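
The page boundary restriction can be pictured with the following sketch, which tests whether a buffer's first and last bytes fall in the same page. The 8-Kbyte page size is an assumption based on standard PDP-11 style memory management and is not stated in this manual; the function name is hypothetical.

```python
PAGE_SIZE = 8192  # assumption: 8-Kbyte memory management pages


def crosses_page_boundary(start_address, length):
    """Return True if a buffer of `length` bytes at `start_address` spans two pages."""
    if length <= 0:
        return False  # an empty buffer cannot span pages
    first_page = start_address // PAGE_SIZE
    last_page = (start_address + length - 1) // PAGE_SIZE
    return first_page != last_page
```

With these assumptions, a 16-byte string starting at octal 117770 crosses from page 4 into page 5, while an 8-byte string at the same address does not.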
001602 ($CPAG2)
ACPT$ crosses page boundary

Facility: EXEC, EXECTT
Explanation: A process requested an ACPT$ System Service specifying a buffer which crosses a memory management page boundary. This is a restriction of the driver.
User Action: Submit an SPR with a dump, describing activity
at the time of the exception. R4 points to TTCB (Device Control Block). R5 points to XFRB (Extended Function Request Block).
001603 ($CNOPCB)
PCB not found on run queue
Facility: EXEC, EXECTT
Explanation: When a process attached to a terminal is excepted
by a keyboard command, the exception manager of terminal service first performs an EXCPT$ on the Terminal Service and load
device driver. To prevent the attached process from running
while the drivers potentially run down any activity, the PCB
(Process Control Block) for the active process is removed from
the run queue. When searching the run queue specified in the
Z.RUNQ field of the PCB, the PCB itself was not found. This
is a software inconsistency.
User Action: Submit an SPR with a dump. R4 points to attached
PCB.
001701 ($CPAGE)
READ$ or WRITE$ crossed page boundary
Facility: EXEC, EXECTU58
Explanation: A request to the TU58 driver specified a buffer
which crossed a memory management page boundary. This is a restriction of the driver.
User Action: Submit an SPR with a dump, describing activity
at the time of the exception. The process listed as active may
be the READ$/WRITE$ service and not the process which initiated the offending request.
002001
Exception routine invoked for unknown reason
Facility: DEMON
Explanation: Demon's exception routine was activated, but not
for CTRL Y, CTRL C, or a diagnostic timeout. A software problem is the most likely cause of this crash.
User Action: Submit an SPR with the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence.


002002

Insufficient free memory to allocate a program stack
Facility: DEMON
Explanation: When DEMON was initialized, it could not allocate enough free program memory for use as a stack. A failing memory module is the most likely cause of the problem.
User Action: A failing memory module is the most likely cause
of the problem. If no hardware problem is found, submit an SPR
and the crash dump. If a certain sequence of operations causes
this crash, include a description of that sequence.
002003

DEMON was initiated when there was no diagnostic to run
Facility: DEMON
Explanation: DEMON did a receive on its work queue and received
a nondiagnostic request. A software problem is the most probable cause of this crash.
User Action: Submit an SPR and the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence.
002004

Failure in periodic control or data memory test
Facility: DEMON, PRMEMY
Explanation: One of the periodic control or data memory interface tests detected a failure. Failures in these tests are
fatal, and the HSC must reboot after displaying a message describing the failure. A failing P.ioj module is the most probable cause of this crash.
User Action: Further testing of the HSC memory and P.ioj may
be necessary.


002005

Failure in periodic K.sdi or K.sti test
Facility: DEMON, PRKSDI, PRKSTI
Explanation: The periodic K.sdi test or the periodic K.sti test
detected a failure. Failures in either test are fatal, and the
HSC must reboot after displaying a message which describes the
type of error and the requestor number of the failed module.
A failing K.sdi or K.sti module is the most probable cause of
this crash.
User Action: The requestor number of the probable failing module is displayed in the error message preceding the crash. Further testing of HSC data channels and HSC internal buses may
be necessary.
002006

ILDISK received illegal queue address
Facility: DEMON, ILDISK
Explanation: ILDISK requested exclusive access to a drive's
state area. The acquire operation should return the control
memory address of the Attention/Available Service Queue for
the specified drive. The address returned was zero, an illegal address for a queue. A software problem is the most likely
cause of this crash.
User Action: Submit an SPR and the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence. Also note if the problem occurs
only when a particular disk drive is tested.


002007
ILDISK received illegal buffer descriptor

Facility: DEMON, ILDISK
Explanation: ILDISK received a buffer descriptor from the free
buffer queue. A consistency check on the buffer descriptor failed
because the descriptor indicated the buffer was not in the HSC's
buffer memory. A software problem is the most likely cause of
this crash.
User Action: Submit an SPR which includes the crash dump information. If a certain sequence of HSC operations induced this
crash, include a description of that sequence. Also note if
the problem occurs only when a particular disk drive is tested.
002010
ILDISK detected inconsistency in exception routine

Facility: DEMON, ILDISK
Explanation: ILDISK's internal flags indicated exclusive ownership of a drive's state area, but the address of the K.sdi
control area was not available. When ILDISK has exclusive ownership of a drive state area, the address of the K.sdi control area should always be available. A software problem is
the most likely cause of this crash.
User Action: Submit an SPR and the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence. Also note if the problem occurs
only when a particular disk drive is tested.


002011
An ILEXER disk I/O request failed to complete
Facility: DEMON, ILEXER
Explanation: ILEXER attempted to abort all outstanding disk
I/O requests. After waiting two minutes, the program found one
or more I/O requests incomplete. The HSC is crashed and rebooted because ILEXER cannot exit with a request outstanding.
A faulty disk drive is the most likely cause of this problem.
User Action: Submit an SPR and the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence. Also note if the problem occurs
only when a particular disk drive is tested. Further testing
of suspect disk and associated requestor(s) may be necessary.
002012

An ILEXER tape I/O request failed to complete
Facility: DEMON
Explanation: ILEXER attempted to abort all outstanding tape
I/O requests. After waiting two minutes, the program found one
or more I/O requests incomplete. The HSC is crashed and rebooted because ILEXER cannot exit with a request still outstanding. A faulty tape drive or formatter is the most likely
cause of this problem. This crash could also be caused by the
K.sti clocks stopping due to a hardware error (such as an instruction parity error).
User Action: Submit an SPR and the crash dump. If a certain
sequence of HSC operations induced this crash, include a description of that sequence. Also note if the problem occurs
only when a particular tape drive or formatter is tested. Further testing of suspect tape subsystem and associated requestor(s)
may be necessary.


002013

ILTAPE was supplied an illegal requestor number
Facility: DEMON, ILTAPE
Explanation: ILTAPE was automatically initiated to test a particular formatter. One of the parameters supplied to ILTAPE
is the requestor number of the K.sti connected to the formatter. ILTAPE checked the specified requestor and found it was
not a K.sti. A software problem is the most likely cause of
this crash.
User Action: Submit an SPR and the crash dump. Also include
a summary of any tape error messages immediately preceding the
crash. If a certain sequence of HSC operations induced this
crash, include a description of that sequence. Also note if
the problem occurs only when a particular tape drive or formatter is used.
002014

ILTAPE timed-out waiting for drive state area
Facility: DEMON, ILTAPE
Explanation: In order to test a tape formatter, ILTAPE must
acquire exclusive access to the drive state area for that formatter. When ILTAPE requests exclusive access to a drive state
area, the request should complete within 60 seconds. Failure
to complete indicates a problem with the tape server.
User Action: Submit an SPR and the crash dump. Also include
a summary of any tape error messages immediately preceding the
crash. If a certain sequence of HSC operations induced this
crash, include a description of that sequence. Also note if
the problem occurs only when a particular tape drive or formatter is used.


002015
ILTAPE detected inconsistency after a command failure
Facility: DEMON, ILTAPE
Explanation: ILTAPE issued a command to the HSC tape diagnostic interface. The command failed. In the process of preparing an error message, ILTAPE found the command opcode was illegal, an unknown value. A software problem is the most likely
cause of this crash.
User Action: Submit an SPR and the crash dump information. Also
include a summary of any tape error messages immediately preceding the crash. If a certain sequence of HSC operations induced this crash, include a description of that sequence. Also
note if the problem occurs only when a particular tape drive
or formatter is used.
002016
ILTAPE detected inconsistency while restoring a TACB
Facility: DEMON, ILTAPE
Explanation: ILTAPE maintains a table of available Tape Access Control Blocks (TACBs). When a particular TACB is in use
by the program, the associated table entry is zeroed. When finished with a TACB, ILTAPE stores the address of that TACB into
one of the table entries which contains a zero. While trying
to return a TACB to the table, ILTAPE discovered all table entries are nonzero, implying no TACBs were in use. A software
problem is the most probable cause of this crash.
User Action: Submit an SPR and the crash dump. Also include
a summary of any tape error messages immediately preceding the
crash. If a certain sequence of HSC operations induced this
crash, include a description of that sequence. Also note if
the problem occurs only when a particular tape drive or formatter is used.
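
The TACB table bookkeeping described for crash 002016 can be sketched as follows. This is an illustrative model only, not ILTAPE code: the function and exception names are hypothetical, and the table is represented as a simple list of addresses in which zero marks a TACB that is in use.

```python
class TacbTableError(Exception):
    """Models the inconsistency that the text associates with crash 002016."""


def take_tacb(table):
    """Hand out the first available TACB address and zero its table entry."""
    for index, address in enumerate(table):
        if address != 0:
            table[index] = 0  # zeroed entry marks the TACB as in use
            return address
    return None  # every TACB is already in use


def restore_tacb(table, address):
    """Store a finished TACB address into the first entry that contains zero."""
    for index, entry in enumerate(table):
        if entry == 0:
            table[index] = address
            return
    # No zeroed entry: the table claims no TACB was in use.
    raise TacbTableError("all table entries are nonzero")
```

Returning a TACB when no entry is zero is the condition the explanation describes: the table implies nothing was handed out, yet the program holds a TACB to return.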


002017
ILTAPE detected inconsistency in exception routine
Facility: DEMON, ILTAPE
Explanation: ILTAPE's internal flags indicated exclusive ownership of a drive's state area, but the address of the K.sti
control area was not available. When ILTAPE has exclusive ownership of a drive state area, the address of the K.sti control area should always be available. A software problem is
the most likely cause of this crash.
User Action: Submit an SPR which includes the crash dump information. If a certain sequence of HSC operations induced this
crash, include a description of that sequence. Also note if
the problem occurs only when a particular tape drive is tested.
003001 ($CFMTTYP)
Illegal format type specified
Facility: CERF
Explanation: An illegal format type was specified in an error message to CERF.
User Action: Submit an SPR with a dump. R4 equals Format Type.
003002 ($CFAO1)
Output length too long
Facility: CERF
Explanation: When processing an MSCP error message, the FAO
output of the text string was too long for CERF's buffer.
User Action: Submit an SPR with a dump. R1 equals number of
bytes output.
003003 ($CFAO2)

Output length too long
Facility: CERF
Explanation: When processing an out of band message, the FAO
output of the text string was too long for CERF's buffer.
User Action: Submit an SPR with a dump. R1 equals number of
bytes output.
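
The crash codes listed in this appendix are octal numbers (for example, 001507 is followed by 001510), and in the entries shown here the leading digits track the first facility name. The sketch below illustrates both points. The prefix table is inferred from the listed entries rather than taken from a documented encoding, and the function names are hypothetical.

```python
# Inferred from the entries in this appendix; not a documented encoding.
FACILITY_BY_PREFIX = {
    "001": "EXEC",
    "002": "DEMON",
    "003": "CERF",
    "004": "DISK",
}


def parse_crash_code(text):
    """Convert a six-digit octal crash code string to an integer."""
    return int(text, 8)


def format_crash_code(value):
    """Format an integer as a six-digit, zero-padded octal crash code."""
    return format(value, "06o")


def facility_for(text):
    """Look up the first facility name by the code's three leading digits."""
    return FACILITY_BY_PREFIX.get(text[:3])
```

Because the codes are octal, the code that follows 004007 is 004010, not 004008.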
004001

No structure to ONLINE disk to connection
Facility: DISK, MSCX
Explanation: When an MSCP ONLINE command was issued to bring
a disk online to a connection, there was no structure to record
the necessary information. Since the initialization code allocates enough structures to bring every disk online to every connection, this crash indicates either memory corruption
or mismanagement of the free pool for this structure.
User Action: Submit an SPR with the crash dump. Specify the
number of hosts in the cluster.
004002

BMB reserved but not found
Facility: DISK, many
Explanation: A Big Memory Buffer (BMB) was reserved via a system function but not found when the table of BMBs was searched.
This indicates memory corruption or mismanagement of the BMB
pool.
User Action: Submit an SPR with the crash dump. Specify which
process was running.


004003

DUCB address zero in K Control Area
Facility: DISK, SDI
Explanation: A disk attention condition sent a control area
to a disk subprocess. The subprocess found a zero in the location which should have contained the DUCB address. This indicates an invalid structure address was passed to the process (possibly due to memory corruption), the structure was
corrupted, or it was not initialized properly.
User Action: Submit an SPR with the crash dump.
004004

Invalid action byte in Connect Block
Facility: DISK, SDI
Explanation: The subprocess within the disk path which processes requests from the CI manager received a Connect Block
with an invalid action byte. This indicates an invalid structure was passed to the process, the structure was passed at
the improper time, or that memory was corrupted.
User Action: Submit an SPR with the crash dump. Note the contents of user register 2 in the crash dump.
004005

Datagram received from a connection
Facility: DISK, MSCP
Explanation: The main MSCP command server process received a
nonsequenced message from some connection. This may indicate
memory corruption or improper message reception. It may also
indicate an improper structure was passed to the process. Host
software may have improperly sent such a message.
User Action: Submit an SPR with the crash dump. Note all levels of host software running in the cluster.


004006

MSCP message size exceeded maximum
Facility: DISK, MSCP
Explanation: The main MSCP command server process received a
sequenced message with a length greater than the MSCP 36-byte
maximum from some connection. This may indicate memory corruption or improper message reception. It may also indicate
an improper structure was passed to the process. Host software may have improperly sent such a message.
User Action: Submit an SPR with the crash dump. Note all levels of host software running in the cluster.
004007

Invalid error signaled by K.ci
Facility: DISK, MSCP
Explanation: An MSCP command packet with invalid error bits
set was received by the main MSCP command server from the K.ci.
This may indicate memory corruption or improper message reception. It may also indicate an improper structure was passed
to the process. Host software may have sent an improper message.
User Action: Submit an SPR with the crash dump. Note all levels of host software running in the cluster and the revision
level of the K.ci microcode.


004010

Server queue on work queue with no items
Facility: DISK, many
Explanation: The main disk process received a subprocess work
queue with no items from the main work queue. This indicates
either memory corruption or improper manipulation of items on
the subprocess work queue. An invalid structure may have been
queued to the main work queue.
User Action: Submit an SPR with the crash dump. Note the current process running.
004011

Invalid module number in subprocess work queue
Facility: DISK, many
Explanation: The main disk process received a subprocess work
queue containing an invalid module number. This indicates memory corruption or an invalid structure was queued to the main
work queue.
User Action: Submit an SPR with the crash dump. Note the current process running.
004012

SLCB not available when needed
Facility: DISK, SDI
Explanation: A Short Lifetime Control Block (SLCB) was needed
by the disk path but one was not available. Because many processes and subprocesses require SLCBs, this is unlikely except under extreme load circumstances. The number of SLCBs allocated by default should be sufficient to avoid this crash.
User Action: Submit an SPR with the crash dump. Note the configuration of the HSC and the number of disk and tape drives
online at the time of the crash.


004013
State change to ONLINE requested via gatekeeper
Facility: DISK, SDI
Explanation: The state change processor within the sequential
command gatekeeper received a Disk Unit Control Block extension with the current state set to online. This crash indicates an improper use of the state change mechanism.
User Action: Submit an SPR with the crash dump.

004014
Inconsistent drive state detected
Facility: DISK, SDI
Explanation: The state change processor within the sequential
command gatekeeper received a Disk Unit Control Block extension different than the current state. This crash indicates
an improper use of the state change mechanism.
User Action: Submit an SPR with the crash dump.

004015
Improper state change for shadow member
Facility: DISK, SDI
Explanation: The sequential gatekeeper mechanism suspends activity for shadow units before allowing a state change. This
crash indicates the mechanism failed to operate properly.
User Action: Submit an SPR with the crash dump.


004016

Shadow unit not found in Disk Unit Table
Facility: DISK, MSCP
Explanation: The subroutine SUREM could not find the shadow
unit in the Disk Unit Table. This crash indicates improper sequencing of actions to remove a shadow unit. The most probable cause is multiple calls on SUREM for the same unit.
User Action: Submit an SPR with the crash dump.
004017

Invalid diagnostic HMB
Facility: DISK, MSCP
Explanation: The diagnostic interface within the disk path received an HMB with a nonzero length field in the HM$LOF word.
This indicates an invalid request from some diagnostic or improper routing of the HMB by the disk path.
User Action: Submit an SPR with the crash dump. List any utilities or diagnostics running at the time of the crash.
004020

Too many seek blocks requested by diagnostic
Facility: DISK, MSCP
Explanation: A diagnostic or utility requested an excessive
number of seek blocks for transfers during initialization.
User Action: Submit an SPR with the crash dump. List any utilities or diagnostics running at the time of the crash.


004021
Diagnostic release of disk unit while online
Facility: DISK, MSCP
Explanation: A diagnostic or utility attempted to release a
disk unit while it was still online.
User Action: Submit an SPR with the crash dump. Specify the
utilities or diagnostics running at the time of the crash.

004022
Diagnostic release of HCB while units still online
Facility: DISK, MSCP
Explanation: A diagnostic or utility attempted to release a
Host Control Block (HCB) which keeps records of online units,
while some disk units were online via that HCB.
User Action: Submit an SPR with the crash dump. Specify the
utilities or diagnostics executing at the time of the crash.

004023
DRAT/Seek timer not allocated for disk unit
Facility: DISK, ERROR
Explanation: The disk path initialization code discovered a
disk unit without a DRAT/Seek timer allocated (address of zero).
This is an initialization inconsistency, possibly due to an
improper load of the disk path.
User Action: Submit an SPR with the crash dump. Specify the
configuration of the HSC which crashed.


004024
Not enough mapped memory to initialize disk path
Facility: DISK, ERROR
Explanation: The disk path initialization routine could not
allocate enough program memory to perform error recovery. The
most probable cause is insufficient available memory.
User Action: Determine the amount of available program (P.ioj)
memory. If it is lower than the minimum amount, replace the
memory module. If the memory appears to be sufficient, submit an SPR with the crash dump. Note the actual amount of available memory by executing the SHOW ALL command. If no hardware
problem exists, submit an SPR with a printout of SHOW ALL command results.

004025
Error identification table overwritten
Facility: DISK, ERROR
Explanation: This crash can only occur if the disk error identification table was overwritten or a wild branch was taken.
The most probable cause is a bad load.
User Action: If this crash occurs immediately after a boot,
try rebooting with a backup copy of the HSC software. Otherwise, submit an SPR with the crash dump.


004026

Invalid error bit value found during error recovery
Facility: DISK, ERROR
Explanation: The bit value describing a K.sdi error was not
valid for a given stage of the error recovery. The most probable cause is a design error within the error recovery code.
It is possible, although unlikely, the cause is a malfunctioning K.sdi.
User Action: If this error appears to recur from the same K.sdi,
replace the K.sdi. If no hardware problem exists, submit an
SPR with the crash dump.
004027

Invalid disk characteristics for operation
Facility: DISK, ERROR
Explanation: An arithmetic operation to compute some disk parameter caused an overflow or produced a result outside the
allowed range. The most probable cause is a design error within
the error recovery code. It is also possible, although unlikely,
a disk supplied invalid characteristics to the HSC.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of the disk and attached
requestor(s) may be necessary. If this error appears to recur from the same unit, repair the unit. If no hardware problem exists, submit an SPR with the crash dump.


004030

S bit not set in FRB error state
Facility: DISK, ERROR
Explanation: The S bit in the K control area port subarea for
a drive in FRB error state was not set as expected. This logical inconsistency indicates improper manipulation of the port
state. The most probable cause is a design error within the
error recovery code. It is also possible, although unlikely,
a K.sdi is malfunctioning.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of suspected requestor
may be necessary. If this error appears to recur from the same
K.sdi, replace the K.sdi. If no hardware problem exists, submit an SPR with the crash dump.

004031
DT$ERQ not zero in FRB error state
Facility: DISK, ERROR
Explanation: The FRB error queue in the DRAT being processed
by error recovery was not zero as expected. This logical inconsistency
indicates improper manipulation of the port state.
The most probable cause is a design error within the error recovery code. It is also possible, although unlikely, a K.sdi
is malfunctioning.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of suspected requestor
may be necessary. If this error appears to recur from the same
K.sdi, replace the K.sdi. If no hardware problem exists, submit an SPR with the crash dump.


004032

Unable to get to FRB error state
Facility: DISK, ERROR
Explanation: Error recovery was unable to place a port in FRB
error state in order to perform an error recovery operation.
This crash can occur in an extremely unlikely compound error
situation. The most probable cause, however, is a design error within the error recovery code.
User Action: Reboot the HSC. If this error persists, submit
an SPR with the crash dump.
004033

Non-ECC/EDC errors remaining after ECC correction
Facility: DISK, ERROR
Explanation: ECC error correction should take place after all
other errors except EDC have been corrected. This crash occurs because other error bits are set after ECC correction.
The most probable cause is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.
004034

Level B retry in wrong state
Facility: DISK, ERROR
Explanation: This crash occurs because a level B retry operation is attempted without the drive port being in FRB error
state. The only cause is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.


004035

Level C retry in wrong state
Facility: DISK, ERROR
Explanation: This crash occurs because a level C retry operation is being attempted without the drive port being in FRB
error state. The only cause is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.
004036

DCB state is busy with empty DCB queue
Facility: DISK, ERROR
Explanation: The drive state indicator in the K control area
indicates a K.sdi is processing a DCB, but the DCB queue is
empty. The most probable cause of this crash is a design error in the error recovery code. It is also possible, but unlikely, that the K.sdi is malfunctioning.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of suspect requestor may
be necessary. If this error appears to recur from the same K.sdi,
replace the K.sdi. If no hardware problem exists, submit an
SPR with the crash dump.
004037

Invalid error queue address in route
Facility: DISK, ERROR
Explanation: When attempting to route an FRB to an error queue,
the error queue address in a route descriptor was invalid. The
most likely cause of this crash is a corrupted route descriptor probably due to a logic error in the error recovery code.
User Action: Submit an SPR with the crash dump.


004040
Undefined error bit in error word from K
Facility: DISK, ERROR
Explanation: The error recovery routine IDENTIFY found an undefined
bit in the error word stored by either a K.sdi or K.ci.
The most probable cause of this crash is a logic error within
the error recovery code. It is also possible but unlikely a
K is malfunctioning.

User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of suspect requestor may
be necessary. If this error appears to recur from the same K.sdi,
replace the K.sdi. If no hardware problem exists, submit an
SPR with the crash dump.

004041
No buffer found in FRB when expected
Facility: DISK, ERROR
Explanation: The error recovery routine MAPBUF attempted to
map a buffer but found the buffer address to be zero. The only
cause of this crash is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.

004042
FRB not in error state for level D I/O operation
Facility: DISK, ERROR
Explanation: A call to the error recovery subroutine LVLDIO
was made without the port being in FRB error state. The only
cause of this logical inconsistency is a design error within
the error recovery code.
User Action: Submit an SPR with the crash dump.

004043

Stack too deep to save in thread block
Facility: DISK, ERROR
Explanation: A call to the error recovery subroutine LVLDIO
was made with too many items on the stack to save in a thread
block. The only cause of this logical inconsistency is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.
004044

Buffer not found for specified error
Facility: DISK, ERROR
Explanation: A call to the error recovery subroutine RCDHMX
specified a buffer which was not found in the list of buffers
for the specified FRB. The only cause of this logical inconsistency is a design error within the error recovery code.
User Action: Submit an SPR with the crash dump.
004045

Parent downcount failed
Facility: DISK, ERROR
Explanation: A downcount of the parent HMB failed during routing of an FRB in the error recovery subroutine RETIRE. This
crash is caused by improper manipulation of the parent counter
by some process or overwritten memory.
User Action: Submit an SPR with the crash dump.

004046

DRAT not found for FRB retirement
Facility: DISK, ERROR
Explanation: The error recovery subroutine RETIRE could not
locate the DRAT for down counting while attempting to retire
an FRB by simulating route completion. This crash is caused
by either a logic inconsistency within error recovery or overwritten memory. It is also possible, although unlikely, it is
caused by a malfunctioning K.sdi.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of requestors and HSC
internal buses may be necessary. If this error appears to recur from the same K.sdi, replace the K.sdi. If no hardware problem exists, submit an SPR with the crash dump.
004047

Sectors/track field in K control area is zero
Facility: DISK, ERROR
Explanation: The error recovery subroutine RETIRE found the
sectors/track field in the K control area to be zero while attempting to retire an FRB by simulating route completion. This
crash is caused by either a logic inconsistency within error
recovery or overwritten memory. It is also possible, although
unlikely, it is caused by a malfunctioning K.sdi.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If this error appears to recur from the
same K.sdi, replace it. If no hardware problem exists, submit an SPR with the crash dump.

004050

DRAT queue not empty for shadow copy
Facility: DISK, MSCP
Explanation: After obtaining exclusive use of a drive, the shadow
copy code found a DRAT queue for that drive was not empty. This
crash can only be caused by a design error within the MSCP command processing.
User Action: Submit an SPR with the crash dump.

004051
Inconsistent result for repair operation
Facility: DISK, MSCP
Explanation: An impossible combination of results was found
at the end of a shadow repair operation. This crash can only
be caused by a design error within the shadow repair code.
User Action: Submit an SPR with the crash dump.

004052
Known drive not found in the Disk Unit Table
Facility: DISK, MSCP
Explanation: While attempting to remove a known disk unit from
the Disk Unit Table, the unit was not found in that table. This
crash can only be caused by a design error within the MSCP command processing.
User Action: Submit an SPR with the crash dump. Note any utilities or diagnostics running at the time of the crash.

004053

Invalid block number for transfer operation
Facility: DISK, MSCP
Explanation: All MSCP transfer commands are prechecked for valid
parameters. This applies to most diagnostic transfers as well.
This crash indicates an invalid block number somehow slipped
past the checks. It indicates a design error within the disk
path transfer processing or a corrupted Disk Unit Control Block.
User Action: Submit an SPR with the crash dump. Note any utilities or diagnostics running at the time of the crash.
004054

Unexpected compare failure following write
Facility: DISK, ERROR
Explanation: The RCT.MWRITE routine writes, reads back, and
compares a block of data, one copy at a time, for all copies of that
block in the RCT. This crash indicates a block was read back
with no errors detected; however, it did not compare with the
original data written. This indicates data was delivered incorrectly by the K.sdi without any error indications. It is
possible, but unlikely, the failure is due to a legitimate undetected error.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If this error appears to recur from the
same unit, repair the unit. If no hardware problem exists, submit an SPR with the crash dump.
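The write, read-back, and compare sequence behind crash 004054 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the HSC implementation: the names `write_verify_copies`, `CompareFailure`, and the in-memory `FakeCopy` device are hypothetical, and the per-copy ordering is an assumption.

```python
class CompareFailure(Exception):
    """Raised when a block reads back cleanly but differs from what was written."""


class FakeCopy:
    """Hypothetical stand-in for one stored copy of a block."""

    def __init__(self, corrupt=False):
        self.data = b""
        self.corrupt = corrupt  # simulate bad data delivered with no error indication

    def write(self, block):
        self.data = block

    def read(self):
        if self.corrupt and self.data:
            # Flip the low bit of the first byte; no error is reported.
            return bytes([self.data[0] ^ 0x01]) + self.data[1:]
        return self.data


def write_verify_copies(block, copies):
    """Write, read back, and compare the block, one copy at a time."""
    for index, copy in enumerate(copies):
        copy.write(block)
        read_back = copy.read()
        if read_back != block:
            # The read itself reported no error, yet the data differs.
            raise CompareFailure(f"copy {index} did not compare after write")
    return len(copies)
```

A mismatch raised here corresponds to the case described above: the read completed without any error indication, but the data differs from what was written.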

004055

Attempt to enable drive interrupt already enabled
Facility: DISK, many
Explanation: The ARM subroutine was called to enable the interrupt for drive state changes when the interrupt was already
enabled. The only possible cause for this crash is a design
error.
User Action: Submit an SPR with the crash dump. Note the process running at the time of the crash.
004056

Attempt to enable drive interrupt with pending state change
Facility: DISK, many
Explanation: The ARM subroutine was called to enable the interrupt for drive state changes while a drive state change was
being processed. The only possible cause for this crash is a
design error.
User Action: Submit an SPR with the crash dump. Note the process running at the time of the crash.
004057

State change requested for available but inoperative drive
Facility: DISK, many
Explanation: The SCHSQM subroutine was called to schedule a
state change operation for an available but inoperative drive.
The only possible cause for this crash is a design error.
User Action: Submit an SPR with the crash dump. Note the process running at the time of the crash.

004060

Attempt to down count DRAT already at zero
Facility: DISK, many
Explanation: A call was made to the DWNCDT subroutine to down
count a DRAT when the count was already zero. The only possible cause for this crash is a design error.
User Action: Submit an SPR with the crash dump. Note the process running at the time of the crash.
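The consistency check behind crash 004060 is a guard against counting a reference counter down past zero. A minimal Python sketch of that kind of guard follows; the class and exception names are hypothetical and do not come from the HSC software.

```python
class DowncountError(Exception):
    """Raised when a counter that is already zero is counted down."""


class DownCounter:
    """Hypothetical outstanding-work counter with a guarded downcount."""

    def __init__(self, count):
        if count < 0:
            raise ValueError("count must not be negative")
        self.count = count

    def downcount(self):
        """Decrement the counter; refuse to go below zero."""
        if self.count == 0:
            # Mirrors the inconsistency the crash reports: nothing is left to retire.
            raise DowncountError("attempt to down count a counter already at zero")
        self.count -= 1
        return self.count
```

Failing at the point of the extra downcount, rather than letting the count wrap, keeps the evidence of the faulty caller close to the failure.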
004061

Thread block count not initialized
Facility: DISK, SDI
Explanation: During initialization, the routine which allocates thread blocks discovered the number of threads to be allocated was set to zero. This was probably caused by the failure of a previous initialization routine to initialize this
count word. This inconsistency may indicate an improper load.
User Action: Reboot the HSC. If the failure persists, submit
an SPR with the crash dump.
004062

Thread block area too small
Facility: DISK, SDI
Explanation: During initialization, the routine which carves
up thread blocks found the area too small to allocate all the
thread blocks required. This inconsistency may indicate an improper load.
User Action: Reboot the HSC with a backup copy of the HSC system software. If the failure persists, submit an SPR with the
crash dump.

004063
Seek DCB without Clear D Bit flag set

Facility: DISK, SDI
Explanation: A SEEK DCB failed because the Clear D Bit flag
was not set as expected. The DCB was not a SEEK DCB or the DCB
was improperly set up. The only possible cause is a design error within the DCB processing code.
User Action: Submit an SPR with the crash dump.

004064
DRAT/SEEK timer running with SEEK DCB queued
Facility: DISK, SDI
Explanation: During processing of a failed SEEK DCB, the DRAT/SEEK
timer was not running as expected. The only possible cause is
a design error within the DCB processing code.
User Action: Submit an SPR with the crash dump.

004065
D Bit set for port with SEEK DCB being processed
Facility: DISK, SDI
Explanation: During processing of a failed SEEK DCB, the D (Process DRAT) bit was set for the port to which the SEEK DCB had
been queued. The only possible cause is a design error within
the DCB processing code.
User Action: Submit an SPR with the crash dump.

004066
State changed during SDI ONLINE
Facility: DISK, SDI
Explanation: After completing an SDI ONLINE command, either
the state was not AVAILABLE or a state change was pending. Because state changes are inhibited during the SDI ONLINE, this
is a logical inconsistency. The only possible cause is a design error within the SDI manager.
User Action: Submit an SPR with the crash dump.

004067
SDI WRITE MEMORY command not implemented
Facility: DISK, SDI
Explanation: The SDI WRITE MEMORY command cannot be issued in
the current implementation of the SDI manager. This crash indicates some process attempted to issue an SDI WRITE MEMORY
command.
User Action: Submit an SPR with the crash dump. Note the diagnostics or utilities running at the time of the crash.

004070
Nonzero status for SUCCESSful DCB
Facility: DISK, SDI
Explanation: A DCB completed with a status of SUCCESS, but the
error word indicated errors anyway. The most probable cause
is a design error within DCB processing. It is possible, although unlikely, the cause is a malfunctioning K.sdi. This is
a logical inconsistency.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If this error appears to recur from the
same K.sdi, replace the K.sdi. If no hardware problem exists,
submit an SPR with the crash dump.

004071

D bit set in DCB error state
Facility: DISK, SDI
Explanation: During processing of a DCB, the D (process DRAT)
bit was set for the port to which the DCB had been queued. This
is a logical inconsistency. The only possible cause is a design error within DCB processing.
User Action: Submit an SPR with the crash dump.
004072

DCB state is busy with empty DCB queue
Facility: DISK, SDI
Explanation: The drive state indicator in the K control indicates a DCB is being processed by the K.sdi but the DCB queue
is empty. The most probable cause of this crash is a design
error within DCB processing. It is also possible, but unlikely,
the K.sdi is malfunctioning.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of requestors, HSC internal buses, and memory subsystem may be necessary. If this
error appears to recur from the same K.sdi, replace it. If no
hardware problem exists, submit an SPR with the crash dump.
004073

K.sdi is not responding
Facility: DISK, SDI
Explanation: A K.sdi failed to process an immediate DCB within
one second. The most probable cause is a broken K.sdi.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If the error persists, replace the K.sdi.

004074

DCB state is blocked after QUIESCE or DCBSTS DCB
Facility: DISK, SDI
Explanation: The drive state indicator in the K control indicates DCB activity is blocked. This should not be possible
after a QUIESCE or DCBSTS DCB. The most probable cause of this
crash is a design error within DCB processing. It is also possible, but unlikely, the K.sdi is malfunctioning.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. Further testing of requestors, HSC internal buses, and memory subsystem may be necessary. If this
error appears to recur from the same K.sdi, replace the K.sdi.
004075

Call to DCBOPR from process other than DISK
Facility: DISK, many
Explanation: The DCBOPR routine may only be called from the
DISK process. This crash indicates a call was made from some
other process.
User Action: Submit an SPR with the crash dump. Note the process running at the time of the crash.
004076

Port not in DCB error state for error DCB
Facility: DISK, SDI
Explanation: The DCBOPR routine received an error DCB, but the
port was not in DCB error state as expected. This logical inconsistency can only be the result of a design error within
DCB processing.
User Action: Submit an SPR with the crash dump.

004077

Match enable not set for DIALOG DCB
Facility: DISK, SDI
Explanation: The DCBOPR routine received a DCB with an improper
combination of request bits set. This logical inconsistency
can only be the result of a design error within DCB processing.
User Action: Submit an SPR with the crash dump.
004100
No thread block for operation

Facility: DISK, SDI
Explanation: The DCBWAIT routine was called, but no thread
block was available to suspend the process. This logical inconsistency can only be the result of a design error within
DCB processing.
User Action: Submit an SPR with the crash dump.
004101
Stack too deep to suspend process in thread block

Facility: DISK, SDI
Explanation: The DCBWAIT routine was called with too many words
on the stack to suspend the process in a thread block. This
logical inconsistency can only be the result of a design error within DCB processing.
User Action: Submit an SPR with the crash dump.

004102
Thread block pointer corrupted in DCB
Facility: DISK, SDI
Explanation: A DCB was returned from a K.sdi with a corrupted
thread block pointer. The most probable cause of this crash
is a design error within DCB processing. It is also possible
but unlikely the cause is a malfunctioning K.sdi.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If this error appears to recur from the
same K.sdi, replace it. If no hardware problem exists, submit an SPR with the crash dump.

004103
Insufficient pool to allocate a timer
Facility: DISK, SDI
Explanation: This crash indicates too much memory has been allocated from common pool. It can be caused by any process.
User Action: Submit an SPR with the crash dump. Specify the
diagnostics or utilities running at the time of the crash and
if there were DUP connections.

004104
DCB received with no errors and no frames
Facility: DISK, SDI
Explanation: A DCB was received from K.sdi with no frames in
the response and no error indications. It can be caused by a
design error within DCB processing. It is probably caused by
a malfunctioning K.sdi.
User Action: If possible, get the number of the requestor involved from the last error log printed on the console or from
the system error log. If this error appears to recur from the
same K.sdi, replace it. If no hardware problem exists, submit an SPR with the crash dump.
004105

Element in deferred seek queue with no FRB
Facility: DISK, MSCP
Explanation: An element was found in the deferred seek queue
for a disk unit having no FRBs. The only possible cause is a
design error within seek queue element processing.
User Action: Submit an SPR with the crash dump.
005001

ECC self-test string too big for FAO
Facility: ECC
Explanation: A self-test string generated for the ECC process
was too big to print with the FAO buffer allocated. This crash
can only occur if the self-test code is present and enabled.
The self-test code is not enabled for distributed base levels.
User Action: Submit an SPR with the crash dump.
005002

No ECC errors to correct
Facility: ECC
Explanation: An FRB with no errors was sent to the ECC process. This logical inconsistency can only occur due to a design error within error recovery.
User Action: Submit an SPR with the crash dump.

005003

Can't allocate XFRB to print self-test messages
Facility: ECC
Explanation: The ECC process failed to allocate an XFRB (Extended Function Request Block) for printing messages during
self-test. This crash can only occur if the self-test code is
present and enabled (not true for distributed base levels).
User Action: Submit an SPR with the crash dump.
005004

ECC found more than a 10-bit symbol error
Facility: ECC
Explanation: The ECC process was sent a buffer with more than
a 10-bit symbol error. Error recovery processing should never
pass on such a buffer. This logical inconsistency can only occur due to a design error within error recovery.
User Action: Submit an SPR with the crash dump.
006000

This class of crashes is for tape path software inconsistency errors
Facility: TAPE, TFxxxx
Explanation: A software inconsistency error occurred.
User Action: Submit an SPR with the crash dump. Specify the
utilities or diagnostics active at the time of the crash.
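The crash codes in this appendix use only the digits 0 through 7, which suggests octal notation, and the leading digits appear to track the facility. The Python sketch below shows one way a reader might sort codes by class. The prefix table is an assumption inferred only from the entries in this part of the appendix (004xxx DISK, 005xxx ECC, 006xxx TAPE), and the function names are hypothetical.

```python
# Leading three octal digits -> facility, inferred only from nearby entries.
# This table is an assumption, not a documented HSC structure.
FACILITY_BY_PREFIX = {0o004: "DISK", 0o005: "ECC", 0o006: "TAPE"}


def parse_crash_code(text):
    """Parse a six-digit octal crash code string into an integer."""
    if len(text) != 6 or any(ch not in "01234567" for ch in text):
        raise ValueError(f"not a six-digit octal crash code: {text!r}")
    return int(text, 8)


def facility_for(text):
    """Return the assumed facility name for a crash code string."""
    code = parse_crash_code(text)
    # The three low octal digits occupy 9 bits; shift them away.
    return FACILITY_BY_PREFIX.get(code >> 9, "UNKNOWN")
```

For example, `facility_for("006000")` yields `"TAPE"` under this assumed mapping, and a string containing any non-octal character is rejected.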

006001

An STI GET LINE STATUS failed
Facility: TAPE, TFATNAVL
Explanation: When issued to the tape data channel, the STI command GET LINE STATUS returned with a failure. This command should
not return failure when issued to a working tape data channel. General register 5 points to the windowed K control area
for the tape data channel in question. Offset KG$SLT points
to the tape requestor in question.
User Action: Investigate the tape data channel in question.
006002
Received an interrupt from an unknown tape data channel

Facility: TAPE, TFATNAVL
Explanation: Received an interrupt from an unknown tape data
channel. This is a software inconsistency. General register
1 points to the windowed tape data channel control area for
the tape data channel in question. General register 2 contains
the tape data channel slot number the interrupt was received
from.
User Action: Submit an SPR with the crash dump.
006003
Received an illegal Connection Block (CB) from the CIMGR

Facility: TAPE, TFCI
Explanation: A Connection Block (CB) with an illegal opcode
was sent to the tape path. General register 1 points to the
windowed address of the Connection Block (CB) in question. General register 2 contains the opcode in question.
User Action: Submit an SPR with the crash dump. Include the
Connection Block (CB) structure.

006004

An illegal diagnostic opcode was received
Facility: TAPE, TFDIAG
Explanation: A diagnostic Host Message Block (HMB) with an illegal opcode was sent to the tape diagnostic interface. General register 3 points to the windowed diagnostic Host Message Block (HMB). General register 1 contains the opcode in
question.
User Action: Submit an SPR with the crash dump. Specify the
utilities or diagnostics active at the time of the crash. Include the Host Message Block (HMB) structure.

006005
Diagnostics trying to acquire assigned drive state area
Facility: TAPE, TFDIAG
Explanation: Diagnostics are trying to acquire a previously assigned
drive state area. General register 3 points to the windowed
control memory address of the Host Message Block (HMB). General register 2 points to the Tape Formatter Control Block (TFCB).
User Action: Submit an SPR with the crash dump. Specify the
diagnostics or utilities active at the time of the crash. Include the Host Message Block (HMB) and Tape Formatter Control
Block (TFCB) structure.

006006
Inconsistencies during drive state area acquisition
Facility: TAPE, TFDIAG
Explanation: The software context word (SFW) is not equal to
the Tape Formatter Control Block (TFCB) address and/or DIALOG list head is nonzero when diagnostics are trying to acquire the drive state area. General register 0 points to the
windowed K control area. General register 2 points to the Tape
Formatter Control Block (TFCB).
User Action: Submit an SPR with the crash dump. Indicate the
utilities or diagnostics active at the time of the crash. Include the Tape Formatter Control Block (TFCB) structure.
006007
No Block Header supplied by BACKUP

Facility: TAPE, TFDIAG
Explanation: BACKUP did not supply the initial Block Header
buffer descriptor. General register 3 points to the windowed
Host Message Block (HMB) address. General register 5 should
point to the buffer descriptor and, in this case, should be
0.
User Action: Submit an SPR with the crash dump. Include details of the BACKUP operation. Include the Host Message Block
(HMB) structure.

006010
No buffers supplied in BACKUP operation
Facility: TAPE, TFDIAG
Explanation: No disk data block buffers were supplied in Host
Message Block (HMB) for BACKUP operation. General register 3
points to the windowed control memory address of the Host Message Block (HMB) in question. General register 0 should point
to the buffer descriptor list for the BACKUP operation (in this
case does so).
User Action: Submit an SPR with the crash dump. Include details
of the BACKUP operation. Include the Host Message Block (HMB) structure.
006011
Could not allocate an XFRB
Facility: TAPE, TFLIB
Explanation: Could not allocate an XFRB (Extended Function Request Block) through ALOCB for print routine.
User Action: Submit an SPR with the crash dump.
006012
Required CIMGR functionality not yet implemented
Facility: TAPE, TFMSCP
Explanation: The host sent the tape server a command packet
with an opcode that was not a sequenced message. General register 5 is the opcode received. General register 3 is the windowed control memory address of the command packet received
(Host Message Block (HMB)).
User Action: Submit an SPR with the crash dump. Indicate the host
software version. Include the Host Message Block (HMB) (command packet) structure.

006013
Required CIMGR functionality not yet implemented

Facility: TAPE, TFMSCP
Explanation: The tape server received a host command packet
longer than allowed (36. bytes). General register 4 is the size
of command packet received. General register 3 is the windowed
control memory address of the command packet in question.
User Action: Submit an SPR with the crash dump. Indicate the
host software version. Include the Host Message Block (HMB)
(command packet) structure.
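The length check that leads to crash 006013 amounts to a bounds test on the received command packet. A minimal Python sketch follows, assuming the "36." in the explanation is a decimal byte count (a trailing period is the usual DEC marker for a decimal number). The constant, exception, and function names are hypothetical.

```python
MAX_COMMAND_PACKET_BYTES = 36  # "36." in the text, read here as decimal


class PacketTooLong(Exception):
    """Raised when a host command packet exceeds the allowed length."""


def check_command_packet(packet):
    """Return the packet size in bytes, or raise if it is longer than allowed."""
    size = len(packet)
    if size > MAX_COMMAND_PACKET_BYTES:
        # The crash entry reports the received size for diagnosis.
        raise PacketTooLong(f"received {size} bytes, limit is {MAX_COMMAND_PACKET_BYTES}")
    return size
```

A packet of exactly 36 bytes passes in this sketch; only a longer packet is rejected, matching the wording "longer than allowed".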
006014
Required CIMGR functionality not yet implemented

Facility: TAPE, TFMSCP
Explanation: The tape server received a host command packet
with a status that currently is not executed. General register 3 points to the windowed control memory address of the command packet in question. Offset RM$ERR is the field in question.
User Action: Submit an SPR with the crash dump. Indicate the
host software version. Further testing of HSC hardware may be
necessary. Investigate K.ci. Include the Host Message Block
(HMB) (command packet) structure. If no hardware problem exists, submit an SPR.

006015

Could not find correct Tape Drive Control Block (TDCB) pointer
Facility: TAPE, TFSEQUEN
Explanation: A call to remove a host's access to a drive resulted in searching the current chain of Tape Drive Control
Blocks (TDCB) in that host's HCB. Inability to find the correct Tape Drive Control Block (TDCB) pointer resulted in this
message. General register 4 points to the Tape Drive Control
Block (TDCB) that is trying to have host access removed. General register 3 points to the windowed control memory address
of the Host Message Block (HMB). Offset HM$CTX in the Host Message Block (HMB) points to the Host Disk Block (HDB). Offset
HDB.TDCB in the HDB points to the Tape Drive Control Block (TDCB).
User Action: Submit an SPR with the crash dump. Include the
Host Message Block (HMB), Tape Drive Control Block (TDCB), Host
Disk Block (HDB) structures.

006016
Unable to allocate an HDB
Facility: TAPE, TFSEQUEN
Explanation: An attempt to add a host access (requiring allocation of a Host Disk Block (HDB)) failed for lack of resources.
User Action: Submit an SPR with the crash dump.

006017
Tape formatter does not support allowed densities
Facility: TAPE, TFSEQUEN
Explanation: The tape formatter does not support a density the
HSC supports. General register 4 points to the Tape Drive Control Block (TDCB) for the drive in question.
User Action: Submit an SPR with the crash dump. Include the
Tape Drive Control Block (TDCB) structure, the host software
version, and the tape formatter revision.
006020
An invalid density is set in the Tape Drive Control Block (TDCB)

Facility: TAPE, TFSEQUEN
Explanation: An invalid density was set in the Tape Drive Control Block (TDCB). This should not happen. General register
4 points to the Tape Drive Control Block (TDCB) in question.
User Action: Submit an SPR with the crash dump and the
Tape Drive Control Block (TDCB) structure.

006021
Read reverse emulation not flagged
Facility: TAPE, TFSEQUEN
Explanation: The tape server entered the read reverse emulation code without read reverse emulation being flagged in the
Tape Drive Control Block (TDCB) at offset TD.FLAGS bit TDF.RREVEM.
General register 3 points to the windowed control memory address of the Host Message Block (HMB). General register 4 points
to the Tape Drive Control Block (TDCB) for drive in question.
General register 2 points to the Tape Formatter Control Block
(TFCB) for formatter in question.
User Action: Submit an SPR with the crash dump. Include the
following structures: Host Message Block (HMB), Tape Drive Control Block (TDCB), Tape Formatter Control Block (TFCB).

006022
Route pointer for read reverse emulation zero
Facility: TAPE, TFSEQUEN
Explanation: The tape server entered the read reverse emulation code without having the route pointer set in the Host Message Block (HMB). General register 3 points to the windowed
control memory address of the Host Message Block (HMB) in question.
User Action: Submit an SPR with crash dump and the Host Message Block (HMB) structure.

006023
Requested transfer larger than 64Kb
Facility: TAPE, TFSEQUEN
Explanation: The requested transfer size for a read reverse
is larger than 64 Kb. This should not happen. General register 3 points to the windowed control memory address of the Host
Message Block (HMB) in question and offset HP •• BC indicates
the transfer size requested.
User Action: Submit an SPR with the crash dump. Include the
Host Message Block (HMB) structure.

006024
Read reverse emulation not flagged

Facility: TAPE, TFSEQUEN
Explanation: The tape server entered the read reverse emulation short retry code without read reverse emulation being flagged
in the Tape Drive Control Block (TDCB) at offset TD.FLAGS bit
TDF.RREVEM. General register 3 points to the windowed control
memory address of the Host Message Block (HMB). General register 4 points to the Tape Drive Control Block (TDCB) for drive
in question. General register 2 points to the Tape Formatter
Control Block (TFCB) for formatter in question.
User Action: Submit an SPR with the crash dump. Include the
following structures: Host Message Block (HMB), Tape Drive Control Block (TDCB), Tape Formatter Control Block (TFCB).
006025
Read reverse emulation not flagged

Facility: TAPE, TFSEQUEN
Explanation: The tape server entered the read reverse emulation long retry code without read reverse emulation being flagged
in the Tape Drive Control Block (TDCB) at offset TD.FLAGS bit
TDF.RREVEM. General register 3 points to the windowed control
memory address of the Host Message Block (HMB). General register 4 points to the Tape Drive Control Block (TDCB) for drive
in question. General register 2 points to the Tape Formatter
Control Block (TFCB) for formatter in question.
User Action: Submit an SPR with the crash dump. Include the
following structures: Host Message Block (HMB), Tape Drive Control Block (TDCB), Tape Formatter Control Block (TFCB).

006026
KT$SEM is equal to zero
Facility: TAPE, TFSEQUEN
Explanation: The K control area offset KT$SEM is zero. This
should not happen. General register 3 points to the K control
area in question.
User Action: Submit an SPR with the crash dump. Include the
K control area structure.

006027
The thread stack is not initialized
Facility: TAPE, TFSERVER
Explanation: The thread stack is not initialized to 52525(8)
for a process suspend. This should not happen.
User Action: Submit an SPR with the crash dump.

006030
The thread stack is not initialized
Facility: TAPE, TFSERVER
Explanation: The thread stack is not initialized to 52525(8)
for a process resume. This should not happen.
User Action: Submit an SPR with the crash dump.
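The stack checks in crashes 006027 and 006030 rely on a known marker value, 52525(8), which is the alternating bit pattern 0101010101010101 in a 16-bit word. A minimal Python sketch of a sentinel check of this kind follows. Where the marker sits in the stack is not stated in the text, so placing it in the first word is an assumption, and all names are hypothetical.

```python
STACK_SENTINEL = 0o52525  # 52525(8) from the text; the same value as 0x5555


def new_thread_stack(words):
    """Return a stack of the given size with every word preset to the sentinel."""
    return [STACK_SENTINEL] * words


def stack_is_initialized(stack):
    """Return True when the marker word (assumed to be index 0) holds the sentinel."""
    return len(stack) > 0 and stack[0] == STACK_SENTINEL
```

A stack that was never preset, or whose marker word was overwritten, fails the check in this sketch, which is the situation the suspend and resume checks report.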

006031
No available stacks
Facility: TAPE, TFSERVER
Explanation: There are no available stacks for a process trying to suspend.
User Action: Submit an SPR with the crash dump.

006032
The thread stack is not initialized

Facility: TAPE, TFSERVER
Explanation: The thread stack is not initialized to 52525(8)
for a process suspend. This should not happen.
User Action: Submit an SPR with the crash dump.
006033
Top of user stack for a resume is not set to server return

Facility: TAPE, TFSERVER
Explanation: The top of the user stack on a process resume is
not set to the routine server return. This is a software inconsistency.
User Action: Submit an SPR with the crash dump.
006034
Stack not valid for a process resume

Facility: TAPE, TFSERVER
Explanation: The stack being returned on a process resume is
not valid. This is a software inconsistency, caused by the stack
not being set to 52525(8).
User Action: Submit an SPR with the crash dump.

006035

Wrong port state for Dialogue Control Block (DCB)
Facility: TAPE, TFSTI
Explanation: The Dialogue Control Block (DCB) is in the wrong
port state for attempted operation. The port should be in DCB
error state. General register 4 points to the Dialogue Control Block (DCB) in question. General register 2 points to the
Tape Formatter Control Block (TFCB).
User Action: Submit an SPR with the crash dump. Include the
structures Tape Formatter Control Block (TFCB) and Dialogue
Control Block (DCB).
006036

Wrong port state
Facility: TAPE, TFSTI
Explanation: An error recovery Dialogue Control Block (DCB)
operation is attempted when the TRB is not in error state. Error state is determined by bit TFF.DE being set in offset TF.FLAGS
of the Tape Formatter Control Block (TFCB). This is a software inconsistency. General register 4 points to the Dialogue
Control Block (DCB) in question. General register 2 points to
the Tape Formatter Control Block (TFCB) for the drive in question.
User Action: Submit an SPR with the crash dump. Include the
following structures: Dialogue Control Block (DCB), Tape Formatter Control Block (TFCB).

006037
Tape data channel not idle

Facility: TAPE, TFSTI
Explanation: The tape data channel should be idle when queuing this Dialogue Control Block (DCB) to the idle Dialogue Control Block (DCB) list. General register 0 points to the K control area in question. General register 4 points to the Dialogue Control Block (DCB).
User Action: Further testing of the K.sti may be necessary.
Investigate tape data channel in requestor slot indicated by
the field KG$SLT of the K control area. If no hardware problem exists, submit an SPR. Include the following structures:
K control area and Dialogue Control Block (DCB).
006040
No stack available to suspend with

Facility: TAPE, TFSTI
Explanation: No stack available for suspending a process. General register 2 points to the Tape Formatter Control Block (TFCB).
General register 5 points to the K control area. General register 4 points to the Dialogue Control Block (DCB).
User Action: Submit an SPR with the crash dump and the following structures: Tape Formatter Control Block (TFCB), Dialogue Control Block (DCB), and K control area.
006041
Dialogue Control Block (DCB) operation timed out

Facility: TAPE, TFSTI
Explanation: A Dialogue Control Block (DCB) operation timed
out. This usually indicates a problem in the tape data channel. The tape requestor slot in question is given as the second word on the stack.
User Action: If no hardware problem exists, submit an SPR.

006042
Invalid context
Facility: TAPE, TFSTI
Explanation: A Dialogue Control Block (DCB) operation is being attempted from a context other than the TAPE server. This
is a software inconsistency.
User Action: Submit an SPR with the crash dump.

006043
Buffer descriptor address missing
Facility: TAPE, TXREVERSE
Explanation: The next address is missing from the linked list
of buffer descriptors. General register 5 points to the Fragment Request Block (FRB) in question. Offset F$BFHD points to
the buffer descriptor list in question.
User Action: Submit an SPR with the crash dump. Include the
Fragment Request Block (FRB) structure.

006044
Unexpected Fragment Request Block (FRB) error received
Facility: TAPE, TFERR
Explanation: An error was received from a software station rather
than from a hardware station. General register 5 points to the
fragment request block (FRB) in error.
User Action: Submit an SPR with the crash dump. Include the
FRB.


006045
Unknown Fragment Request Block (FRB) error received

Facility: TAPE, TFERR
Explanation: An unidentifiable error is flagged in a fragment
request block (FRB).
User Action: Submit an SPR with the crash dump.
006046
K.ci did not return a Fragment Request Block

Facility: TAPE, TFERR
Explanation: Transfer Request Blocks (TRB) have associated Fragment Request Blocks (FRB) that point to data buffers. When a
TRB is received in error, the FRBs must be deallocated. If an
FRB is held by K.ci and not returned within 20 seconds, this
crash occurs.
User Action: If no hardware problem exists, submit an SPR with
the crash dump. If the problem reoccurs, investigate the K.ci.
006047
Illegal downcount occurred on a Host Message Block (HMB) chain

Facility: TAPE, TFERR
Explanation: Whenever Transfer Request Blocks (TRB) are purged
from the K.sti input queue, the associated Host Message Block
(HMB) must not be returned to the host as an end message. This
catching mechanism relies on a chain of HMBs with associated
counters. This is a software consistency check to ensure control memory is not corrupted by the end of the chain. General
register 5 points to the HMB.
User Action: Submit an SPR with the crash dump. Include the
HMB.


006050
Sequence number corruption occurred

Facility: TAPE, TFERR
Explanation: Error recovery ensures against a deadlock on K.sti
by preventing a Transfer Request Block (TRB) from waiting for
a Dialogue Control Block (DCB) that will never execute. Such
a deadlock can only occur from a software inconsistency.
User Action: Submit an SPR with the crash dump.
007000
This class of crashes includes CIMGR software consistency errors

Facility: CIMGR, any
Explanation: A software inconsistency error occurred.
User Action: Submit an SPR with the crash dump. Specify the
utilities or diagnostics active at the time of the crash.
007001
Received a sequence message without a credit

Facility: CIMGR, CIDIRECT
Explanation: The SCS$DIRECT process received a sequence message in a Host Message Block (HMB) flagged by the K.ci as not
having a credit for the connection. General register 1 has the
address of the HMB in error.
User Action: Submit an SPR with the crash dump. Include the
HMB.


007002
Failed to acquire a control block from K.ci
Facility: CIMGR, CIMISCPRC
Explanation: The POLLER process was not able to obtain a control block from K.ci to resend a timed-out STACK datagram.
User Action: Further testing of the HSC subsystem may be necessary. Investigate the available control memory. If no hardware problem exists, submit an SPR with the crash dump.

007003
K.ci is hung
Facility: CIMGR, CIMISCPRC
Explanation: During the polling interval the POLLER ensures
that K.ci is still running. This trap indicates it is not.
User Action: Further testing of the HSC subsystem may be necessary. Investigate the K.ci boards. If no hardware problem
exists, submit an SPR with the crash dump.

007004
K.ci detected an unrecoverable error and stopped
Facility: CIMGR, CIMISCPRC
Explanation: K.ci sent its control area to the CIMGR exception process. This is done whenever K.ci has detected a nonrecoverable hardware error.
User Action: Further testing of the HSC subsystem may be necessary. Investigate the K.ci boards and data memory. If no hardware problem exists, submit an SPR with the crash dump.


007005
K.ci path status check failed
Facility: CIMGR, CIMISCPRC
Explanation: K.ci did not respond to a path status check within
eight seconds.
User Action: Investigate the K.ci boards. Further testing of
the HSC subsystem may be necessary. If no hardware problem exists, submit an SPR with the crash dump.

007006
System name is corrupted
Facility: CIMGR, CIROOT
Explanation: During initialization, the CIMGR discovered the
system name was corrupted in the SCT.
User Action: Release the Online button on the HSC (out). Reboot the HSC by holding the Fault button down until the State
light blinks. This will bypass using the SCT on the boot device. Run SETSHO to reset system name and ID, then reboot HSC
one more time before pushing in the Online button on the front
panel.

007007
HMB received with wrong number of BMBs
Facility: CIMGR, CISCS
Explanation: A Host Message Block (HMB) was received with the
wrong number of Big Message Blocks (BMBs). A START or ID packet
was received from K.ci without the proper number of associated data memory blocks. General register 0 points to the HMB.
User Action: If no hardware problem exists, submit an SPR with
the crash dump. Investigate the K.ci boards.


007010

Inconsistent connection state
Facility: CIMGR, CISCS
Explanation: An illegal state transition was attempted on a
connection. This is a software problem. General register 2 points
to the Connection Block (CB).
User Action: Submit an SPR with the crash dump. Include the
CB.

007011

Connection incarnation inconsistent
Facility: CIMGR, CISCS
Explanation: While a connection is in the process of opening,
the incarnation of that connection is flagged as formative.
The final step of opening the connection is to remove the flag.
This crash indicates the flag was prematurely removed indicating a state inconsistency for the connection. General register 2 points to the Connection Block (CB).
User Action: Submit an SPR with the crash dump. Include the
CB.

007012

Connection incarnation mismatch
Facility: CIMGR, CISCS
Explanation: The incarnation of an opening connection is kept
in both the Connection Block (CB) and the Connection Block vector table. As a connection opens a check is made to ensure these
incarnations agree. A disagreement indicates a dangling reference to an old incarnation of the connection. Register 2 points
to the Connection Block (CB).
User Action: Submit an SPR with the crash dump. Include the
CB.


007013
Inconsistent connection state due to a VC closure

Facility: CIMGR, CISCS
Explanation: An illegal state transition was attempted on a
connection. The state transition was initiated by a VC closure. General register 2 points to the Connection Block (CB).
User Action: Submit an SPR with the crash dump. Include the
CB.

007014
Unable to retrieve resource from K.ci during a disconnect

Facility: CIMGR, CISCS
Explanation: During a disconnect, the CIMGR was unable to retrieve the resources associated with the credits on that connection from K.ci.
User Action: Submit an SPR with the crash dump.
007015
K.ci did not respond to notification of a VC closure

Facility: CIMGR, CISUBRS
Explanation: The CIMGR informs K.ci when it marks a VC as closed.
It then allows the K.ci eight seconds to respond to the notification. This crash occurs if the response times out.
User Action: If no hardware problem exists, submit an SPR with
the crash dump. Investigate the K.ci.


007016

Illegal attempt to deallocate a Connection Block (CB)
Facility: CIMGR, CISUBRS
Explanation: An attempt was made to deallocate a Connection
Block (CB) without breaking the connection. General register
2 points to the CB.
User Action: Submit an SPR with the crash dump. Include the
CB.

007017

Attempt to deallocate a Connection Block without an incarnation
Facility: CIMGR, CISUBRS
Explanation: A Connection Block (CB) did not have a valid incarnation at the time it was deallocated. This crash indicates
a software inconsistency.
User Action: Submit an SPR with the crash dump. Include the
CB.

007020

Failure to retrieve SCS resources from K.ci
Facility: CIMGR, CISUBRS
Explanation: When trying to allocate resources for use across
a virtual circuit (VC), the count of data memory resources was
incorrect. The Host Message Block (HMB) for serializing VC traffic must have two Big Message Blocks (BMB). General register
0 points to the HMB.
User Action: Submit an SPR with the crash dump. Include the
HMB.


007021
The count of waiters for virtual circuit resources went negative
Facility: CIMGR, CISUBRS
Explanation: While processing the list of waiters for transmission resources for a virtual circuit (VC), a nonempty list
was detected to indicate a negative number of waiters. This
is strictly a software inconsistency. General register 1 points
to the system block (SB).
User Action: Submit an SPR with the crash dump. Include the
SB.

012001
Can't Find Connection Block
Facility: DUP
Explanation: When DUP receives an HMB, DUP tries to find a reference to the Connection Block (referred to by HM$CTX in the
HMB) in the DG$ structures (DUP Context Control Blocks). DUP
was unable to find a reference to the Connection Block, even
though it searched every DG$ structure.
User Action: Submit an SPR with the exception dump or startup
message indicating the contents of the stack.

012002
Illegal BMB Count
Facility: DUP
Explanation: The HMB (MSCP packet carrier) has an illegal number of Big Message Buffers (BMBs) allocated. DUP allows only
one. The HMB is invalid.
User Action: Submit an SPR with the exception dump or startup
message indicating the contents of the stack. The second word
of the stack contains the windowed address of the HMB. The third
word of the stack contains the value in HM$CN -- the count of
the number of BMBs.
012003

Illegal HMB Opcode
Facility: DUP
Explanation: The opcode specified in the HM$LOF field of the
HMB was not equal to HML$RM. (Received sequence message over
connection; HML$RM=000~0~.) HMB opcodes must indicate the HMB
is for a sequenced message.
User Action: Submit an SPR with the exception dump or startup
message indicating the contents of the stack. The second word
of the stack contains the illegal opcode.
012004

Illegal HMB Error
Facility: DUP
Explanation: The error specified in the HM$ERR field of the
HMB was not equal to 0, HME$EC, or HME$NC. (Extra credits received; HME$EC=10.) (No credits received; HME$NC=4.)
User Action: Submit an SPR with the exception dump or startup
message indicating the contents of the stack. The second word
of the stack contains the value in the HM$ERR field.
012021

Invalid Connection Block
Facility: DUP
Explanation: The DUP process received a Connection Block with
an invalid value in the CB$ACT field. The CB$ACT field contains the action value (action to be performed by the DUP server).
User Action: Submit an SPR with the exception dump or startup
message indicating the contents of the stack. The second word
on the stack contains the contents of the CB$ACT field.


012024

Bad Down Count
Facility: DUP
Explanation: DUP initiates return of the endpacket to the host
by down counting the reference counter in the related control
block. The down-count action should return 1. If the downcount
does not decrement the reference counter to 1, DUP crashes the
HSC.
User Action: Submit an SPR with the exception dump or startup message indicating the contents of the stack. The second
word on the stack is the value of the counter following the
downcount.
012036
Connection Broken
Facility: DUP
Explanation: While DUP was preparing to send a message to the
K.ci, the connection to the host was broken. The connection
was broken after DUP did an extensive check to ensure the connection existed. DUP detected the connection break the second time because the DG$CB field was set to 0.
User Action: Submit an SPR. This is an internal consistency
check and should never be seen.
042001
FAO message buffer overflow
Facility: DIRECT
Explanation: The program DIRECT was attempting to output the
end message, but the length of that message was longer than
the allotted FAO output buffer.
User Action: Submit an SPR with the crash dump.


043001
Wrong HMB received when trying to bring source online
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+SRC ONL HMB. An HMB (Host
Message Block) was sent to the disk server requesting the source
unit be brought online in a shadow set. When the completion
queue of this HMB was checked, it pointed to a different (incorrect) HMB.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to previous HMB.

043002
Bad downcount when trying to bring source online
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+SRC ONL CNT. When an MSCP
end message was to be sent over a connection to a host, a counter
keeping track of the transaction (decrementing by one) failed
to operate properly. This occurred after the disk server was
asked to bring the source unit online in a shadow set.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to counter.

043003
Wrong HMB received when trying to issue GCS to target unit
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+TGT GCS HMB. An HMB (Host
Message Block) was sent to the disk server requesting a GCS
(GET COMMAND STATUS) command be sent to the target unit. When
the completion queue of this HMB was checked, it pointed to a different (incorrect) HMB.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to previous HMB.


043004
Bad downcount when trying to issue GCS to target unit
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+TGT GCS CNT. When an MSCP
end message was to be sent over a connection to a host, a counter
keeping track of the transaction (decrementing by one) failed
to operate properly. This occurred after the disk server was
asked to send a GCS (GET COMMAND STATUS) command to the target unit.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to counter.

043005
Bad downcount when trying to bring target unit online
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+TGT ONL CNT. When an MSCP
end message was to be sent over a connection to a host, a counter
keeping track of the transaction (decrementing by one) failed
to operate properly. This occurred after the disk server was
asked to bring the target unit online into the shadow set.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to counter.

043006
Bad downcount when trying to issue abort command to target unit
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+TGT ABO CNT. When an MSCP
end message was to be sent over a connection to a host, a counter
keeping track of the transaction (decrementing by one) failed
to operate properly. This occurred after the disk server had
been asked to abort an online command to the target unit.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to counter.
043007

Wrong HMB received after issuing AVL command to shadow unit
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+SHA AVL HMB. An HMB (Host
Message Block) was sent to the disk server requesting the shadow
unit used to facilitate the copy operation be made available.
When the completion queue of this HMB was checked, it pointed
to a different (incorrect) HMB.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to previous HMB.
043010

Bad downcount when trying to issue AVL command to shadow unit
Facility: DKCOPY
Explanation: This is crash $CDKCOPY+SHA AVL CNT. When an MSCP
end message was to be sent over a connection to a host, a counter
keeping track of the transaction (decrementing by one) failed
to operate properly. This occurred after the disk server was
asked to make the shadow unit available.
User Action: Submit an SPR with the dump. Top of stack equals
crash code. Second word points to counter.


051001
An XFRB was not acquired to print messages
Facility: SETSHO,SSMAIN
Explanation: This is crash $CSETSHO+NOXFRB. An XFRB (Extended
Function Request Block) was not acquired by the SETSHO main
routine. A crash was initiated because the lack of this item
prevented communication between the HSC and the console.
User Action: Submit an SPR with the dump.
051002
Failed to properly send HMB to K.ci
Facility: SETSHO,SSMAIN
Explanation: This is crash $CSETSHO+CIHMB. An HMB (Host Message Block) was sent to the K.ci (the hardware that handles communication between the hosts and the HSC). A crash was initiated because confirmation of the HMB was not received from
the K.ci within the required time.
User Action: Submit an SPR with the dump.
051003
Too many characters intended for console printout
Facility: SETSHO,SSMAIN
Explanation: This is crash $SETSHO+PNTOVF. In this case, Formatted ASCII Output (FAO) was called and it generated more characters than the allocated buffer size would allow. The maximum is 510 characters.
User Action: Submit an SPR with the dump. R1 points to string
size.


051004

The SCT (System Control Table) crossed a page boundary
Facility: SETSHO,SSMAIN
Explanation: This is crash $SETSHO+SCTXPG. The SCT must remain on one page in memory. This crash typically indicates an incorrect amount of padding was placed at the end of the file SSDATA.MAC.
User Action: Submit an SPR with the dump.

051101
Failed in sending HMB to disk server for SET Dn [NO]HOST
Facility: SETSHO,SET
Explanation: This is crash $CSETSHO+SETDSK. An HMB (Host Message Block) was sent to the disk server in order to SET a disk
drive HOST or NOHOST. The crash was initiated because the confirmation of this command was not received within the required
time.
User Action: Submit an SPR with the dump.

051102
Failed in sending HMB to tape server for SET Tn [NO]HOST
Facility: SETSHO,SET
Explanation: This is crash $CSETSHO+SETTAP. An HMB (Host Message Block) was sent to the tape server in order to SET a tape
drive HOST or NOHOST. The crash was initiated because the confirmation of this command was not received within the required
time.

User Action: Submit an SPR with the dump.


051201
Failed in sending HMB to disk server for SHOW Dn

Facility: SETSHO,SHOW
Explanation: This is crash $CSETSHO+SHODSK. An HMB (Host Message Block) was sent to the disk server in order to SHOW a specific disk drive. The crash was initiated because the confirmation of this command was not received within the required
time.
User Action: Submit an SPR with the dump.
051202
Failed in sending HMB to tape server for SHOW Tn

Facility: SETSHO,SET
Explanation: This is crash $CSETSHO+SHOTAP. An HMB (Host Message Block) was sent to the tape server in order to SHOW a specific tape drive. The crash was initiated because the confirmation of this command was not received within the required
time.
User Action: Submit an SPR with the dump.
051203
SCT crash context table contained too many characters

Facility: SETSHO,SHOW
Explanation: This is crash $SETSHO+CSHOVF. The SCT crash context table contained too many characters. In this case FAO was
called and it generated more characters than the buffer size
would allow. The maximum is 510 characters.
User Action: Submit an SPR with the dump. R1 points to string
size.


052001 ($CDWMATH)
Double word math not consistent

Facility: SINI
Explanation: During calculation and allocation of control blocks
(allocate in quantities of double-word), the count of words
in control blocks was not a double-word multiple.
User Action: Submit an SPR with a dump. R0 points to the Memory
Descriptor (MD).
052002 ($CDIV10)
Divide operation set overflow

Facility: SINI
Explanation: During allocation of control blocks (set as 80
percent of available control memory), a divide operation set
the PSW Overflow bit.
User Action: Submit an SPR with a dump.
052003 ($CMUL8)
Multiply operation set overflow

Facility: SINI
Explanation: During allocation of control blocks (set as 80
percent of available control memory), a multiply operation set
the PSW Overflow bit.
User Action: Submit an SPR with a dump.


061001
XCALL stack corrupted
Facility: DIAGINT
Explanation: The DDUSUB transfer routines use a stack allocated from common pool for XCALLs (cross-address space calls)
from the disk server. The low word of this stack is initialized to a special value which should never change. This crash
occurs when the routine DDUTIO is called and the low word of the
stack contains a different value than the initialization value.
The most probable cause is corruption by the process running.
User Action: Submit an SPR with the crash dump. Note the diagnostics or utilities running at the time of the crash.
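
The guard-word check described for crash 061001 can be illustrated with a short sketch. This is not HSC code: the guard value, the class name, and the function name below are hypothetical and chosen only to show the idea of placing a fixed value in the low word of a stack and verifying it on entry to a transfer routine.

```python
# Hypothetical guard value; the manual does not state the real one.
XCALL_GUARD = 0o123456


class XcallStack:
    """Illustrative model of a stack whose low word holds a guard value."""

    def __init__(self, size_words: int) -> None:
        if size_words < 2:
            raise ValueError("stack needs a guard word plus at least one data word")
        self.words = [0] * size_words
        self.words[0] = XCALL_GUARD  # low word is set once, at allocation

    def is_intact(self) -> bool:
        # The guard must still hold the value written at allocation.
        return self.words[0] == XCALL_GUARD


def enter_transfer_routine(stack: XcallStack) -> str:
    """Check the guard word before using the stack (hypothetical routine)."""
    if not stack.is_intact():
        raise RuntimeError("crash 061001: XCALL stack corrupted")
    return "transfer started"
```

A process that writes below its allotted stack space overwrites the guard word, so the next entry to the routine detects the corruption instead of continuing with a damaged stack.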
062001 ($CNOWINDOW)
Process does not have windows declared
Facility: SUBLIB, ERTYP
Explanation: A process which requested an out-of-band error
log be issued via the ERTYP$ service in SUBLIB does not have
windows declared in its PCB (Process Control Block) declaration. A window set is required to use this service.
User Action: Submit an SPR with a dump.


APPENDIX C
GENERIC ERROR LOG FIELDS

C.1  GENERIC ERROR LOG FIELDS

Some fields described on HSC console message printouts are
generic, regardless of error type. These fields are described in
the following table. Error Flags and MSCP/TMSCP Event Codes are
covered in more depth in separate tables in this appendix.

Table C-1  Generic Error Log Fields

Field            Description

ERROR-X          The X represents the severity level of the error
                 message. Severity levels are E for error, S for
                 success, W for warning, I for informational, and
                 F for fatal. What follows is the English version
                 of the error message describing the event code,
                 the date and time.

Command Ref #    This number, in hexadecimal, is the MSCP command
                 number that caused the error reported, or zero
                 if the error does not correspond to a specific
                 outstanding command.

Err Seq #        This number, in decimal, is the sequence number
                 of this error log message since the last time
                 the MSCP server lost context, or zero if the
                 MSCP server does not implement error log
                 sequence numbers.

Error Flags      This number, in hexadecimal, indicates bit
                 flags, collectively called error log message
                 flags, used to report various attributes of the
                 error. See Table C-2 for a description of the
                 error flags.

Event            This number, in hexadecimal, identifies the
                 specific error or event being reported by this
                 error log message. This code consists of a
                 five-bit major event code and an 11-bit subcode.
                 The event codes and what they mean are listed in
                 Table C-3.
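
The split of the Event field into a major event code and a subcode can be sketched as follows. The sketch assumes that the major event code occupies the low five bits and the subcode the high 11 bits; that placement is inferred from Table C-3 (for example, 0003, 0023, and 0043 are all Unit Offline) and is not stated explicitly in this appendix. The function names and the partial class dictionary are illustrative only.

```python
from typing import Tuple

# Assumption: major event code = low 5 bits, subcode = high 11 bits.
MAJOR_BITS = 5
MAJOR_MASK = (1 << MAJOR_BITS) - 1  # 0x1F

# Partial, illustrative mapping; class names are taken from Table C-3.
CLASS_NAMES = {
    0x00: "Success",
    0x01: "Invalid Command",
    0x03: "Unit Offline",
    0x04: "Unit Available",
    0x08: "Data",
    0x09: "Host Buffer",
    0x0A: "Controller",
    0x0B: "Disk Drive",
    0x14: "Bad Block Replacement",
    0x15: "Invalid Parameter",
}


def split_event(event: int) -> Tuple[int, int]:
    """Return (major_code, subcode) for a 16-bit event value."""
    if not 0 <= event <= 0xFFFF:
        raise ValueError("event must fit in 16 bits")
    return event & MAJOR_MASK, event >> MAJOR_BITS


def describe_event(event: int) -> str:
    """Format an event value with its class name, if known."""
    major, subcode = split_event(event)
    name = CLASS_NAMES.get(major, "Unknown")
    return f"{event:04X}: class {major:02X} ({name}), subcode {subcode}"
```

For example, `describe_event(0x002B)` reports class 0B (Disk Drive) with subcode 1, which matches the Drive command timeout entry in Table C-3.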

Table C-2  Error Flags

Bit Number    Bit Mask (Hex)    Format Description

If set, the operation causing this error log
message has successfully completed. The error log
message summarizes the retry sequence necessary
to successfully complete the operation.

If set, the retry sequence for this operation
continues. This error log message reports the
unsuccessful completion of one or more retries.

(MSCP-specific) If set, the identified logical
block number (LBN) needs replacement.

(MSCP-specific) If set, the reported error
occurred during a disk access initiated by the
controller bad block replacement process.

If set, the error log sequence number has been
reset by the MSCP server since the last error log
message sent to the receiving class driver.

C.2  MSCP/TMSCP EVENT CODES

The following table is a sequential list of all known MSCP and
TMSCP event codes. Each event code cross references to an error
description. The first column is the event code number in
hexadecimal. The second column references the class of error.
The third column is the expanded description that matches the
event code.
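
The Invalid Command entry in Table C-3 gives its event value as offset*256.+code. The following sketch shows that arithmetic under two assumptions: that the status code for Invalid Command is 1 (taken from its 0001 entry in the table), and that the offset fits in the upper byte. The function names and the example offset of 8 are hypothetical.

```python
# Assumption: Invalid Command status code is 1, per its 0001 table entry.
INVALID_COMMAND = 0x01


def invalid_command_event(offset: int) -> int:
    """Compose an event value as offset*256 + code for an Invalid Command."""
    if not 0 <= offset <= 0xFF:
        raise ValueError("offset must fit in one byte")
    return offset * 256 + INVALID_COMMAND


def invalid_command_offset(event: int) -> int:
    """Recover the command message offset from an Invalid Command event."""
    if event & 0xFF != INVALID_COMMAND:
        raise ValueError("not an Invalid Command event value")
    return event >> 8


# Offset 0 yields the plain 0001 value; offset 8 is a made-up example.
print(f"{invalid_command_event(0):04X}")  # 0001
print(f"{invalid_command_event(8):04X}")  # 0801
```

Reading the value back with `invalid_command_offset` identifies the field of the command message that was in error, expressed as its offset within the message.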


Table C-3  MSCP/TMSCP Event Codes

Event
Code
(Hex)    Class            Description

0000     Success          Normal

0001     Invalid Command  Invalid message length
                          Other invalid command subcode values
                          should be referenced as follows. Note,
                          this is combined with the status code:
                          offset*256.+code
                          Offset is the command message offset
                          symbol for the field in error and code
                          is the symbol for the Invalid Command
                          status code.

0002     Command Aborted

0003     Unit Offline     Unit unknown or online to another
                          controller.

0004     Unit Available   Unit available

0007     Compare Error    Data compare error
                          Data compare error resulted from
                          COMPARE CONTROLLER DATA or COMPARE
                          HOST DATA command.

0008     Data             Disk - Sector was written with Force
                          Error modifier.
                          Tape - Long gap encountered.

0009     Host Buffer      Host buffer access error--cause not
                          available
                          The controller was unable to access a
                          host buffer to perform a transfer, but
                          has no visibility into the cause of
                          the error.

000A     Controller       Reserved for host--command timeout
                          expired.

000C     Shadow Set       Shadow set status has changed
         Status Has
         Changed

000D     BOT Encountered

000E     Tape Mark        Tape mark encountered
         Encountered

0010     Record Data      Record data truncated, data transfer
         Truncated        operation

0013     LEOT Detected    LEOT detected

0014     Bad Block        Bad block successfully replaced
         Replacement

0016     Access Denied    Access denied

0020     Success          Spindown ignored

0023     Unit Offline     Disk - No volume mounted or drive
                          disabled via RUN/STOP switch. Unit is
                          in known substate.
                          Tape - No media mounted or disabled
                          via switch setting

0026     Unit Available   No members in shadow set

0029     Host Buffer      Odd transfer address

002A     Controller       SERDES overrun or underrun error
                          Either the drive is too fast for the
                          controller, or a controller hardware
                          fault has prevented controller
                          microcode from being able to keep up
                          with data transfer to or from the
                          drive.

002B     Disk Drive       Drive command timeout
                          For SI drives, the controller timeout
                          expired for either a Level 2 exchange
                          or the assertion of Read/Write Ready
                          after an Initiate Seek.

0034     Bad Block        Block verified OK - not a bad block
         Replacement

0035     Invalid          Invalid key length
         Parameter        The key length is too short for the
                          specified key type.

0040     Success          Still connected

0043     Unit Offline     Unit is inoperative
                          For SI drives, the controller has
                          marked the drive inoperative due to an
                          unrecoverable error in a previous
                          Level 2 exchange, the drive CI flag is
                          set, or the drive has a duplicate unit
                          identifier.

0044     Unit Available   Shadow set copy in progress

0048     Disk Data        Invalid header
                          The subsystem read an invalid or
                          inconsistent header for the requested
                          sector. For recoverable errors, this
                          code implies a retry of the transfer
                          read a valid header. For unrecoverable
                          errors, this code implies the
                          subsystem attempted non primary
                          revectoring and determined the
                          requested sector was not revectored.
                          (As an example, the RCT indicates the
                          sector is not revectored). Causes of
                          an invalid header include header
                          missync, header sync timeout, and an
                          unreadable header.

0049     Host Buffer      Odd byte count

004A     Controller       EDC Error
                          The sector was read with correct or
                          correctable ECC and an invalid EDC. A
                          fault probably exists in the ECC logic
                          of either this controller or the
                          controller that last wrote the sector.

004B     Disk Drive       Controller-detected transmission error
                          For SI drives, the controller detected
                          an invalid framing code or a checksum
                          error in a Level 2 response from the
                          drive.

0054     Bad Block        Replacement failure-- REPLACE command
         Replacement      or its analogue failed

0055     Invalid          The controller does not implement the
         Parameter        specified key type.

0068     Disk Data        Data Sync not found (Data Sync
                          timeout)

0069     Host Buffer      Non-existent memory error

006A     Controller       Inconsistent internal control
                          structure
                          A high-level check detected an
                          inconsistent data structure. For
                          example, a reserved field contained a
                          nonzero value, or the value in a field
                          was outside its valid range. This
                          error almost always implies the
                          existence of a microcode problem.

006B     Disk Drive       Positioner error (misseek)
                          The drive reported a seek operation
                          was successful, but the controller
                          determined the drive had positioned
                          itself to an incorrect cylinder.

0074     Bad Block        Replacement failure-- inconsistent RCT
         Replacement

0075     Invalid          Invalid key value
         Parameter        A checksum or similar indicates the
                          key value is internally inconsistent.

0080     Success          Duplicate unit number

0083     Unit Offline     Duplicate unit number

0085     Media Format     Characteristics or protection mismatch
                          for shadow member

0088     Disk Data        Correctable error in ECC field
                          A transfer encountered a correctable
                          error where only the ECC field was
                          affected. All data bits were correct,
                          but a portion of the ECC field was
                          incorrect. The severity of the error
                          (the number of symbols in error) is
                          unknown. If the number of symbols in
                          error is known, an n Symbol ECC Error
                          subcode should be returned instead.

0089     Host Buffer      Host memory parity error

008A     Controller       Internal EDC error
                          A low-level check detected an
                          inconsistent data structure. For
                          example, a microcode-implemented
                          checksum or vertical parity (hardware
                          parity is horizontal) associated with
                          internal sector data was inconsistent.
                          This error usually implies a fault in
                          the memory addressing logic of one or
                          more controller processing elements.
                          It can also result from a double bit
                          error or other error exceeding the
                          error detection capability of the
                          controller hardware memory checking
                          circuitry.

008B     Disk Drive       Lost Read/Write Ready during or
                          between transfers
                          For SI drives, Read/Write Ready drops
                          when the controller attempts to
                          initiate a transfer or at the
                          completion of a transfer with
                          Read/Write Ready previously asserted.
                          This usually results from a
                          drive-detected transfer error, where
                          additional error log messages
                          containing the drive-detected error
                          subcode may be generated.

0094     Bad Block        Replacement failure-- drive access
         Replacement      failure
                          One or more transfers specified by the
                          replacement algorithm failed.

00A5     Disk Media       Disk not formatted with 512-byte
                          sectors
                          The disk FCT indicates it is formatted
                          with 576-byte sectors, although both
                          the controller and the drive support
                          only 512-byte sectors.

00A9     Host Buffer      Invalid page table entry
                          See Unibus/Q-bus Storage Systems Port
                          Specifications for additional detail.

00AA     Controller       LESI Adapter Card parity error on
                          input (adapter to controller)

00AB     Disk Drive       Drive clock dropout
                          For SI drives, either data or state
                          clock was missing when it should have
                          been present. This is usually detected
                          by means of a timeout.

00B4     Bad Block        Replacement failure, no replacement
         Replacement      block available
                          Replacement was attempted for a bad
                          block, but a replacement block could
                          not be allocated. For example, the
                          volume's RCT is full.

00C5     Disk Media       Disk not formatted or FCT corrupted
                          The disk FCT indicates the disk is not
                          formatted in either 512- or 576-byte
                          mode.

C-8

Event
Code

Hex

Class

Description

OOC9

Host Buffer

Invalid buffer name
The key in the buffer name does not
match the key in the buffer
descriptor, the B bit in the buffer
descriptor is clear, or the index into
the buffer descriptor table is too
large.

OOCA

Controller

LESI Adapter Card parity error on
output (controller to adapter)

OOCB

Disk Drive

Lost receiver ready for transfer
For SI drives, Receiver Ready was
negated when the controller attempted
to initiate a transfer or did not
assert at the completion of a
transfer. This includes all cases of
the controller timeout expiring for a
transfer operation (Level I real time
command) .

OOD4

Bad Block
Replacement

Replacement failure, recursion failure
Two successive RBNs were bad.

OOE8

Data

Disk - Uncorrectable ECC Error
A transfer without the Suppress Error
Correction modifier encountered an ECC
error exceeding the correction
capability of the subsystem error
correction ted.algorithms or a
transfer with the Suppress Error
Correction modifier encountered an ECC
error of any severity.
Tape - Unrecoverable read error

00E9

Host Buffer

Buffer length violation
The number of bytes requested in the
MSCP command exceeds the buffer length
as specified in the buffer descriptor.

00EA

Controller

LESI Adapter Card "cable in place" not
asserted.

00EB

Disk Drive

Drive-detected error
For SI drives, the controller received
a Get Status or unsuccessful response
with EL set or the controller received
a response with the DR flag set, and
it does not support automatic
diagnosis for that drive type.

0100

Success

Already online

0103

Unit Offline

Unit disabled by Field Service or
diagnostic
For SI drive, the drive DD flag is
set.

0105

Disk Media

RCT corrupted
The RCT search algorithm encountered
an invalid RCT entry. The subcode may
be returned under the following
conditions: during replacement of a
block, revectoring a faulty block, and
when a unit is brought online.

0106

Write Protected

Unit is data safety write protected

0108

Disk Data

One-Symbol ECC Error

0109

Host Buffer

Access control violation
The access mode specified in the
buffer descriptor is protected against
the PROT field in the PTE.

010A

Controller

Controller overrun or underrun
The controller attempted to perform
too many concurrent transfers, causing
one or more of them to fail due to a
data overrun or underrun.

010B

Disk Drive

Controller-detected pulse or state
parity error
For SI drives, the controller detected
a pulse error on either the state or
data line, or the controller detected
a parity error in a state frame.

0125

Disk Media

No replacement block available
Replacement of a faulty block was
attempted, but a replacement block
could not be allocated (i.e. the RCT
is full). This subcode may be returned
during actual replacement and when an
interrupted replacement is completed
as part of bringing a unit online.

0128

Disk Data

Two-Symbol ECC Error

012A

Controller

Controller memory error
The controller detected an error in an
internal memory, such as a parity
error or nonresponding address. This
subcode applies only to errors not
affecting the ability of the HSC70 to
properly generate End and Error Log
messages. Errors affecting End and
Error Log messages are not reported
via MSCP. For most controllers, this
subcode is returned only for
controller memory errors in data or
buffer memory and noncritical control
structures. If the controller has
several such memories, the specific
memory involved is reported as part of
the error address in the error log
message.

012B

Disk Drive

Drive-requested error log (EL bit set)

0148

Disk Data

Three-Symbol ECC Error

014A

Controller

PLI reception buffer parity error

014B

Disk Drive

Controller-detected protocol error
For SI drives, a Level 2 response from
the drive had correct framing codes
and checksum but was not a valid
response within the constraints of the
SI protocol. The response had an
invalid opcode, was an improper
length, or was not a possible
response in the context of the
exchange.

0168

Disk Data

Four-Symbol ECC Error

016A

Controller

PLI transmission buffer parity error

016B

Disk Drive

Drive failed initialization
For SI drives, the drive clock did not
resume following a controller attempt
to initialize the drive. This implies
the drive encountered a fatal
initialization error.

0188

Disk Data

Five-Symbol ECC Error

018B

Disk Drive

Drive ignored initialization
For SI drives, the drive clock did not
cease following a controller attempt
to initialize the drive. This implies
the drive did not recognize the
initialization attempt.

01A8

Disk Data

Six-Symbol ECC Error

01AB

Disk Drive

Receiver Ready collision
For SI drives, the controller
attempted to assert its Receiver Ready
when the Receiver Ready of the drive
was still asserted.

01C8

Disk Data

Seven-Symbol ECC Error

01CB

Disk Drive

Response overflow
A drive sent back more frames than the
reception buffer could hold. This can
be caused by a hung drive
microdiagnostic or a malfunctioning
K.sdi.

01E8

Disk Data

Eight-Symbol ECC Error
A transfer encountered a correctable
ECC error with the specified number of
ECC symbols in error. The number of
symbols in error roughly corresponds
to the severity of the error.

0200

Success

Still online

0203

Unit Offline

Exclusive use

0208

Disk Data

Nine-Symbol ECC Error.

0220

Success

Still Online/Unload ignored

0228

Disk Data

Ten-Symbol ECC Error.

0248

Disk Data

Eleven-Symbol ECC Error.

0268

Disk Data

Twelve-Symbol ECC Error.

0288

Disk Data

Thirteen-Symbol ECC Error.

02A8

Disk Data

Fourteen-Symbol ECC Error.

02C8

Disk Data

Fifteen-Symbol ECC Error.

0400

Success
Tape - EOT encountered

0404

Unit Available

Already in use

044B

Tape Drive

Drive error
Controller retry limit exhausted.

0800

Success

Invalid RCT

1000

Success

Read only volume format

1006

Write Protected

Unit is software write protected

2006

Write Protected

Unit is hardware write protected

F3AA

Controller

Unknown K.tape error

FCAA

Controller

Word Rate Clock timeout
The K.sti detected the loss of clocks
from a drive during a transfer.

FCEA

Controller

Receiver Ready not asserted at start
of transfer - The HSC70 is ready to
start a transfer by sending the
formatter a Level 1 command, and the
formatter does not have Receiver Ready
asserted.

FD2A

Controller

Data Ready timeout - This controller
did not detect Data Ready from the
formatter within 5 ms after sending it
a Level 1 command.

FD6A

Controller

Acknowledge not asserted at start of
transfer - The HSC70 is ready to start
a transfer by sending the formatter a
Level 1 command, and the formatter
does not have Acknowledge asserted.

FDEC

Tape Formatter

Could not get extended drive status

FE0C

Tape Formatter

Could not get formatter summary status
while trying to restore tape position

FE2A

Controller

Record EDC error - On a read from tape
operation the EDC calculated by the
K.sti did not match the EDC generated
by the tape formatter.

FE2B

Tape Drive

Could not set byte count

FE4B

Tape Drive

Could not write tape mark

FE6B

Tape Drive

Could not set unit characteristics

FE8A

Controller

Lower processor timeout - The upper
processor in the K.sti detected the
lower processor had stopped and
restarted it.

FE8B

Tape Drive

Unable to position to before LEOT

FEAB

Tape Drive

Rewind failure

FECB

Tape Drive

Could not complete online sequence

FEEB

Tape Drive

Erase gap failed

FF0B

Tape Drive

ERASE command failed

FF0C

Tape Formatter

TOPOLOGY command failed

FF31

Tape Drive
Position Lost

Retry limit exceeded while attempting
to restore tape position

FF68

Tape Data

Formatter retry sequence exhausted

FF6A

Controller

Lower processor error
A bit was set in the lower processor
error register. Bits included in the
lower processor error register are
Data Bus NXM, Data SERDES Overrun, Data
Bus Overrun, Data Bus Par Err, Data
Pulse Missing, and Sync Real Time Par
Err.

FF6B

Tape Drive

Tape drive requested error log

FF6C

Tape Formatter

Formatter requested error log

FF71

Tape Drive
Position Lost

Formatter-detected position lost

FF88

Tape Data

Controller transfer retry limit
exceeded

FF8A

Controller

Buffer EDC error
The K.sti detected an EDC error on the
data buffer it read from memory on a
Write operation.

FFA8

Tape Data

Host requested retry suppression on a
K.sti-detected error

FFAA

Controller

Data overflow due to Pipeline error
No data buffers in HSC70 data memory
were available when the K.sti needed
one during a data transfer

FFC8

Tape Data

Reverse retry currently not supported

FFCB

Tape Drive

Could not position for (formatter)
retry.

FFCC

Tape Formatter

Cannot clear formatter errors

FFD1

Tape Drive
Position Lost

Formatter and HSC70 disagree on tape
position

FFE8

Tape Data

Host requested retry suppression on a
formatter-detected error

FFEB

Tape Drive

Cannot clear drive errors

FFEC

Tape Formatter

Could not get formatter summary status
during transfer error recovery

FFF1

Tape Drive
Position Lost

Controller-detected position lost
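
The hexadecimal event codes in this table follow a visible pattern: the low five bits select the class and the remaining upper bits carry a subcode. The following Python sketch illustrates that split. The function names and the class dictionary are hypothetical helpers written for this illustration; the class names are simply read off the Class column of the portion of the table shown here, and the bit split is an assumption based on the usual MSCP status word layout rather than on anything stated in this appendix.

```python
# Illustrative sketch only: names below are hypothetical, not part of HSC70 software.
# Assumption: low 5 bits of the event code = class code, upper bits = subcode.
CLASS_BY_CODE = {
    0x00: "Success",
    0x03: "Unit Offline",
    0x04: "Unit Available",
    0x05: "Disk Media",
    0x06: "Write Protected",
    0x08: "Data",
    0x09: "Host Buffer",
    0x0A: "Controller",
    0x0B: "Drive",
    0x0C: "Tape Formatter",
    0x11: "Position Lost",
    0x14: "Bad Block Replacement",
}


def split_event_code(event_code):
    """Return (class_code, subcode, class_name) for a 16-bit event code."""
    class_code = event_code & 0x1F
    subcode = event_code >> 5
    return class_code, subcode, CLASS_BY_CODE.get(class_code, "Unknown")


def ecc_symbols(event_code):
    """Return the symbol count for the n-Symbol ECC Error codes (0108 to 02C8).

    Returns None for any other event code. In the table, One-Symbol is 0108
    (subcode 8) and Fifteen-Symbol is 02C8 (subcode 22).
    """
    class_code, subcode, _ = split_event_code(event_code)
    if class_code != 0x08 or not 8 <= subcode <= 22:
        return None
    return subcode - 7


if __name__ == "__main__":
    for code in (0x00EB, 0x0148, 0xFF31):
        print(f"{code:04X}", split_event_code(code), ecc_symbols(code))
```

Under these assumptions, 0148 decodes as a Data class code with subcode 10, which matches the Three-Symbol ECC Error entry.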

APPENDIX D
INTERPRETATION OF STATUS BYTES

D.1 INTRODUCTION
This appendix lists all possible codes each K can generate after
detecting a fatal error. Only K-detected errors are included.
When a K detects a fatal error, it puts a code in its status
register and performs a Level 7 Control Bus Interrupt. This
interrupt causes the HSC to trap through location 134 and crash.
The crash message contains the status codes from all Ks in the
Status of Requestors (1-9): field.
Figure D-1 shows a printout example from a K-detected error. In
this case, as in many others, the crash was not caused by the K
but was detected by the K forcing the crash. For additional
explanations of the fields in the crash message, refer to
Appendix B.

-* SUBSYSTEM EXCEPTION *-   V# Y10B   HSC70 HSC002
at 18-Jan-1986 01:15:14.50 up 0 00:08:46.20
User PC: 0027360 caused by (134) Kint
PSW: 140000
KBCTRL active, PCB addr = 102636
R0-R5:
024302 047632 000020 047626 0000000 141404

Kernel SP: 000774
Kernel Stack:
005046 000004 053354 046022 001012 050476 050476 000000
047062 047466 047466 000000 047264 000000 055352 000000
User SP: 023346
User Stack:
052525 052525 025252 025252 025252 025252 025252 025252
025252 025252 025252 025252 025252 025252 025252 025252
KPAR(0-7):
000440 000640 001040 001440 002040 001240 000240 177600
KPDR(0-7):
077506 077506 177506 077506 077406 077506 077506 077506
UPAR(0-7):
000440 000640 001040 001440 002040 001240 000240 177600
UPDR(0-7):
007406 007406 177406 007406 007406 007406 007406 100016
MMSR(0-2): 000017 000020 037654

Window index reg:
Window Bus Reg: 140105
WADR(0-7):
160004 161004 162004 163004 164004 165004 166034 167034
Translated WADR(0-7):
001401 001401 001401 001401 001401 001401 001407 001607
Figure D-1 Subsystem Exception K-Detected Error (1 of 2)

Error regs: 170024 000077

Status of requestors (1-9):
000177 000002 000002 000377 000377 000377 000377 000377 000203
(PC-6) TO (PC):

104002 012600 000003 011505
Control area for slot #000001
Control area address: 022010
Register area contents:
000000
000000
100307
040003
104000
140143
100007
000552
000200
012002
000000
000533
104000
000401
022000
000000
000001
000003
004572
000003
017176
000003
000063
000150
000000
000000
000372
040003
002501
002431
000000
000000
000000
Figure D-1 Subsystem Exception K-Detected Error (2 of 2)

D.2 OVERVIEW
The purpose of this appendix is to aid Field Service in analyzing
the K-detected failure codes through the use of the status code
tables. This appendix contains one status code table for each
type of K:

Table D-1 describes the K.ci status codes and applies
only to requestor number 1.

Table D-2 describes the K.sdi status codes.

Table D-3 describes the K.sti status codes.

D.3 HOW TO USE THE STATUS CODE TABLES
First of all, using these tables requires information as to the
type of requestor involved. In order to determine which
requestor detected the error, check the Status of requestors
(1-9): field in the crash message. This field shows the status
register contents of all requestors present in the subsystem.
NOTE

The registers referred to in this appendix are
not general registers, but the internal K
registers. All status codes followed by an * are
hardware-detected errors. More detailed
information for these errors is found in the
appropriate sequencer error register.
The normal operational status codes for requestors are 001 for a
K.ci, 002 for a K.sdi, and 203 for a K.sti. A 377 means no
requestor is in the slot. Any value other than a 001, 002, 203,
or 377 means the K detected an error. A K.ci-detected error
always shows in the far left position in the Status of requestors
(1-9): field of the message. In any other position, the type of
requestor must be determined.
Count over the Status of requestors (1-9): field to the status
contents showing an error (this is the requestor number). Type
SHOW REQUESTOR at the SETSHO> prompt to see whether the requestor
detecting the error is a K.sdi or a K.sti. Find the number of
the data channel that found the error in the displayed response.
This display shows whether that requestor number is a K.sdi or
a K.sti.
NOTE

If the HSC is not operational or the requestor in
question fails initialization self tests, check
the module utilization label above the card cage
to determine whether the involved requestor
number is a K.sdi or a K.sti.
Tables in this appendix consider only the rightmost two octal
characters in the failure code. Use the appropriate table (dependent
upon requestor type) to find the meaning of the status code.
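
The scan of the Status of requestors (1-9): field described above can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and it assumes the status sits in the low-order byte (three octal digits) of each word shown in the crash message. The normal values 001, 002, 203, and 377 come from the text above.

```python
# Illustrative sketch only: scan_requestors is a hypothetical helper name.
# Assumption: the requestor status is the low-order byte of each octal word.
NORMAL_STATUS = {0o001: "K.ci", 0o002: "K.sdi", 0o203: "K.sti"}
EMPTY_SLOT = 0o377


def scan_requestors(field):
    """Return a list of (requestor_number, table_code) for requestors in error.

    field is the text of the Status of requestors (1-9): line, for example
    "000177 000002 000002 000377 ...". table_code is the rightmost two octal
    digits, which is the value looked up in Tables D-1 through D-3.
    """
    findings = []
    for number, text in enumerate(field.split(), start=1):
        status = int(text, 8) & 0o377
        if status == EMPTY_SLOT or status in NORMAL_STATUS:
            continue  # normal operation or no requestor in the slot
        findings.append((number, status & 0o77))
    return findings


if __name__ == "__main__":
    line = "000177 000002 000002 000377 000377 000377 000377 000377 000203"
    for number, code in scan_requestors(line):
        print(f"requestor {number}: status code {code:02o}")
```

For the Figure D-1 values this reports requestor 1 with status code 77, so Table D-1 (K.ci) is the table to consult.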

D.4 EXAMPLE EXAMINATION
Notice the third line of the message states the crash was caused
by (134) Kint. The 134 indicates a K detected a fatal problem
and interrupted the P.ioj with a Level 7 interrupt.
In this crash, requestor number 1 (the K.ci) status shows a
000177. The K.ci detected a fatal condition. The two digits in
the status code are 77 (from the 000177 failure code).
Table D-1 provides additional information regarding status code
77. The description of this error indicates the HSC received a
HOST CLEAR command from a host node. The description for the 77
status also shows the node number of the host which sent the HOST
CLEAR is found in R17.
To find R17, look at the Register area contents: field on the
second page of the example. The first entry in the register area
contents is always the Q register from the K. The Q register
contains important information for some crashes. The second
entry is R0. In the example, count in octal up to R17 (remember
the first entry is the Q register). The contents of R17 are
000001. Many of the error descriptions in the following tables
indicate additional information exists in one of these registers.
Notice other entries below R17 in the register area contents. In
the K.sdi and K.sti register areas, these other entries are RAM0
through RAM17, and they sometimes contain important information.
On the K.ci, these entries are not significant for
troubleshooting crash messages.
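
Counting in octal through the register area is easy to get wrong by hand, so the following Python sketch labels each entry. It is only an illustration: the function name is hypothetical, and it assumes the area holds exactly 33 words in the order Q, R0 through R17 (octal), then RAM0 through RAM17 (octal), which is consistent with the 33 values printed in Figure D-1.

```python
# Illustrative sketch only: label_register_area is a hypothetical helper name.
# Assumption: the register area is Q, then R0-R17 (octal), then RAM0-RAM17 (octal).
def label_register_area(words):
    """Map each word of the Register area contents: field to its register name."""
    names = ["Q"]
    names += [f"R{n:o}" for n in range(16)]    # R0 .. R7, R10 .. R17
    names += [f"RAM{n:o}" for n in range(16)]  # RAM0 .. RAM7, RAM10 .. RAM17
    if len(words) != len(names):
        raise ValueError(f"expected {len(names)} words, got {len(words)}")
    return dict(zip(names, words))


# Register area contents as printed in Figure D-1 (2 of 2).
FIGURE_D1_WORDS = """
000000 000000 100307 040003 104000 140143 100007 000552 000200
012002 000000 000533 104000 000401 022000 000000 000001 000003
004572 000003 017176 000003 000063 000150 000000 000000 000372
040003 002501 002431 000000 000000 000000
""".split()

if __name__ == "__main__":
    registers = label_register_area(FIGURE_D1_WORDS)
    print("Q   =", registers["Q"])
    print("R17 =", registers["R17"])
```

With the Figure D-1 values, R17 comes out as 000001, which agrees with the walk-through above.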
NOTE
A statement, See Note., appears in several places
in the following tables. In each table, this
information appears on the last page.

Table D-1 K.ci Status Bytes

Status Code
(octal)

Description

Two conditions cause failure of the 2911
sequencer test upon powerup or reinitialization.
In one case, the requestor sent status back to
the P.io while Init was asserted. In the other
case the sequencer had already released the Init
signal, but the sequencer failed to reach the
point in its code where it could change the
status bits. A common occurrence of this status
code is from an HSC false power fail crash dump.
In this type of crash dump (IOT through 20), all
requestors present report a 00 status code.

2901 ALU test failed upon powerup or
reinitialization.

Data Bus (DBUS) test failed upon powerup or
reinitialization.

Control Bus (CBUS) test failed upon powerup or
reinitialization.

CROM test failed upon powerup or
reinitialization.

K.pli RAM test failed upon powerup or
reinitialization.

PLI interface test failed upon powerup or
reinitialization.

Packet buffer test failed upon powerup or
reinitialization.

LINK board test failed upon powerup or
reinitialization.

12

Control Bus/memory error occurred during a lock
cycle while the K.ci was attempting to locate
the K-Init packet in Control memory upon powerup
or reinitialization.

K.ci could not find a properly formatted K-Init
packet in Control memory after completing
power-up/init diagnostics.

An error was detected by the upper (control)
sequencer. While attempting to update the next
buffer pointer in an FRB, the pointer was found
to be zero (illegal). R11 contains the FRB
address.

15 *

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. The control stream found a structure
on its own work queue that is not an HMB or FRB.
R11 contains the structure address.

An error was detected by the upper (control)
sequencer. While constructing a slot (SNDDAT,
REQDAT) from an FRB, the FRB address was found
to be zero (illegal). R12 contains the slot
address.

20 *

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. A buffer allocate request was
initiated without sufficient buffers on the
Allocated queue in the control area to satisfy
the request. R11 contains the FRB address.

An error was detected by the upper (control)
sequencer. The queue head for an allocated Send
buffer was zero.

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the lower (control)
sequencer. The lower sequencer encountered an
inconsistent internal data structure. R2
contains the message slot address.

An error was detected by the lower (control)
sequencer. During the RTNDAT routine, the lower
sequencer finds a zero (illegal) FRB address.

An error was detected by the lower (control)
sequencer. The lower sequencer has received a
packet from a node with a node ID greater than
63. R7 contains the node number.

An error was detected by the lower (control)
sequencer. This error occurs when the lower
sequencer polling loop calls a routine which
adds or removes Big Message Block (BMB) pointers
to or from the BMB chain, if the queue that is
supposed to contain these pointers is empty.

30

An error was detected by the lower (control)
sequencer. This error occurs when the lower
sequencer determines that BMBs need to be
returned to the free BMB pool and during a
consistency check finds no BMBs to return. R2
contains the message slot address.
31

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. While attempting to transmit over a
connection, the upper sequencer found an
incarnation number of zero (invalid) in the
connection block structure. R11 contains the HMB
address, R14 contains the CB address.

33 through
41 *

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. A hardware error was detected
following a block move to Control memory. R10
contains the upper processor error register
contents. R16 contains the last Control memory
address in the block that was moved.

An error was detected by the upper (control)
sequencer. A hardware error was detected
following a block move out of Control memory.
R10 contains the upper processor error register
contents. R16 contains the last control memory
address in the block that was moved.

An error was detected by the upper (control)
sequencer. A hardware error was detected
following a Control memory receive operation.
R10 contains the upper processor error register
contents. R16 contains the Control memory
address of the item received. R17 contains the
Control memory address of the queue head.

45 and 46

An error was detected by the upper (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. A hardware error was detected during
a downcount operation. R10 contains the upper
processor error register value. R17 contains the
counter address.

50 *

An error was detected by the upper (control)
sequencer. A hardware error was detected while
de-queueing a Control memory item from a
scratchpad list. R10 contains the upper
processor error register contents. R11 contains
the Control memory address of the item.

An error was detected by the upper (control)
sequencer. A hardware error was detected while
internalizing an FRB. R10 contains the contents
of the upper processor error register, R11
contains the FRB address, R14 contains the CB
address. The Q register contains the work queue
index.

An error was detected by the upper (control)
sequencer. Either a consistency problem was
found with the scratchpad queue or an attempt
was made to send to a queue at address zero
(illegal address).

53 through
55 *

An error was detected by the upper (control)
sequencer. (See note.)

56 through
71 *

An error was detected by the lower (control)
sequencer. (See note.)

73 *

An error was detected by the lower (control)
sequencer. This error occurs while the lower
processor is trying to link a BMB on the BMB
free chain. R10 contains the lower processor
error register contents. R5 contains the BMB
data memory address.

An error was detected by the lower (control)
sequencer. A hardware error was detected during
a BMB list operation. R10 contains the lower
processor error register contents. R5 contains
the BMB data memory address.

An error was detected by the lower (control)
sequencer. A hardware error was detected during
a BMB list operation. R10 contains the lower
processor error register contents. R5 contains
the BMB data memory address.

An error was detected by the lower (control)
sequencer. (See note.)

An error was detected by the upper (control)
sequencer. While copying data from an HMB to a
message slot, the upper sequencer found the byte
count of the HMB was larger than the slot
capacity. R12 contains the slot address. R17
contains the text length.

An error was detected by the upper (control)
sequencer. A host clear sequence has been
received. R17 contains the address of the
issuing node number.

NOTE

The sequencers access Control memory several times before
checking for a hardware error. Thus, to help determine
the particular cause of the error, the sequencer saves
the contents of the error register present at the time of
the error check in R10 (octal). The contents of R10 are
visible within the crash dump and can help in narrowing
the error possibilities. The following lists show the
bits available from both the upper and lower processor
error registers. Those bits marked with (*) may cause a
crash.
Upper Processor Error Register:
Bit 0 = Even/Odd Bit Control Memory Address
Bits 3,2,1 = CCYCLE 2,1,0
* Bit 4 = Control Bus Error (Illegal Cycle)
* Bit 5 = Control Bus NXM
* Bit 6 = Control Data Parity Error
* Bit 7 = Instruction (CROM) Parity Error
* Bit 8 = Scratchpad Parity Error
Bit 9 = PLI Parity Error
Bits 10 through 15 indicate the K.ci hardware revision level
Lower Processor Error Register:
Bit 0 = Data Memory Address Bit 16
Bit 1 = Data Memory Address Bit 17
Bit 2 = Data Memory NMA
* Bit 5 = Data Bus NXM
* Bit 6 = Data Memory Parity Error
* Bit 7 = Data Memory Overrun
* Bit 8 = Scratchpad Parity Error
* Bit 9 = PLI Parity Error
Bits 10 through 15 indicate the K.ci hardware revision level
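
As an illustration of reading the K.ci upper processor error register value saved in R10, the following Python sketch breaks a 16-bit word into the fields listed above. The function and dictionary names are hypothetical, the sample values are made up for the example, and the sketch does not model which bits are marked with an asterisk.

```python
# Illustrative sketch only: names and sample values are hypothetical.
# Bit assignments follow the K.ci Upper Processor Error Register list in this note.
UPPER_ERROR_BITS = {
    4: "Control Bus Error (Illegal Cycle)",
    5: "Control Bus NXM",
    6: "Control Data Parity Error",
    7: "Instruction (CROM) Parity Error",
    8: "Scratchpad Parity Error",
    9: "PLI Parity Error",
}


def decode_kci_upper_error(word):
    """Split a K.ci upper processor error register word into its fields."""
    return {
        "even_odd": word & 1,              # bit 0
        "ccycle": (word >> 1) & 0o7,       # bits 3,2,1 = CCYCLE 2,1,0
        "errors": [name for bit, name in sorted(UPPER_ERROR_BITS.items())
                   if word & (1 << bit)],  # bits 4 through 9
        "revision": (word >> 10) & 0o77,   # bits 10 through 15
    }


if __name__ == "__main__":
    # 000060 octal has bits 4 and 5 set.
    print(decode_kci_upper_error(0o000060))
```

The word is passed as an integer, so a value read from the crash dump would first be converted with int(text, 8).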

Table D-2 K.sdi Status Bytes

Status Code
(octal)

Description

Two conditions cause failure of the 2911
sequencer test upon powerup or reinitialization.
In one case, the requestor sent status back to
the P.io while Init was asserted. In the other
case, the P.io had already released the Init
signal, but the sequencer failed to reach the
point in its code where it could change the
status bits. A common occurrence of this status
code is from an HSC false power fail crash dump.
In this type of crash dump (IOT through 20), all
requestors present report a 00 status code.

2901 ALU test failed upon powerup or
reinitialization.

Data Bus (DBUS) test failed upon powerup or
reinitialization.

Control Bus (CBUS) test failed upon powerup or
reinitialization.

PROM test failed upon powerup or
reinitialization.

Scratchpad RAM test failed upon powerup or
reinitialization.

R-S/Gen test failed upon powerup or
reinitialization.

Partial SDI test failed upon powerup or
reinitialization.

The K.sdi encountered a Control Bus/memory
problem while searching for the K-Init packet in
Control memory.

After completing power-up/init diagnostics, the
K.sdi could not find a properly formatted K-Init
packet in Control memory.

While trying to write the microcode version into
the control area at address R7+44 (R7 is base
address), the upper sequencer encountered a
Control Bus error. R11 contains the contents of
the upper error register. (See note.)

This error occurs if the upper processor tries
to advance the buffer descriptor pointer if the
old value of the pointer is zero (illegal).

While attempting to read the block number (LBN)
from a buffer descriptor in Control memory, the
upper processor encountered a hardware error.
R11 contains the contents of the upper error
register. (See note.)

17 through
30 *

The upper processor encountered an error while
attempting to access Control memory. R11
contains the upper processor error register
contents. (See note.)

This error occurs if, during transfer
completion, a DRAT counter goes to zero and the
DRAT list head in the control area is not locked
and not equal to the current DRAT value.

32 through
42 *

The upper processor encountered an error while
attempting to access Control memory. R11
contains the upper processor error register
contents. (See note.)

This error occurs while processing an active
DCB, if the dialogue state indicator is not
locked (a value of 100000 is not in KS$DHD) and
not valid (KS$IND does not contain the values 0,
1, 2, 3, OR4, or -1).

The upper processor encountered an error while
attempting to access Control memory. R11
contains the upper processor error register
contents. (See note.)

This error occurs if, after completing state 0
processing, the upper sequencer cannot find a
valid DCB opcode. (No valid state is present to
go to next.)

46 through
55 *

The upper processor encountered an error while
attempting to access Control memory. R11
contains the upper processor error register
contents. (See note.)

74 through
76

The upper processor attempted to downcount a
counter that was already at zero.

NOTE

The upper sequencer accesses Control memory several times
before checking for a Control Bus error. Thus, to help
determine the particular cause of the error, the upper
sequencer saves the contents of the error register
present at the time of the error in R11 (octal). The
contents of R11 are visible within the crash dump and may
help in narrowing the error possibilities. The following
list defines all the bits contained within the upper
processor error register (value loaded in R11). Those
bits that can possibly cause a crash are denoted with an
asterisk (*).
Upper Processor Error Register:
Bit 0 = Even/Odd bit Control Memory Address
Bits 3,2,1 = CCYCLE 2,1,0
* Bit 4 = Control Bus Error (Illegal Cycle)
* Bit 5 = Control Bus NXM
* Bit 6 = Control Data parity Error
* Bit 7 = Instruction (CROM) Parity Error
Bits 8 through 12 not used
* Bit 13 = Response Pulse Missing on SDI RD/RES Line
Bit 14 = Upper Processor RTC Clock Pulse Present
Bit 15 = Parity Error on RTDS Line

Table D-3 K.sti Status Bytes

Status Code
(octal)

Description

Two conditions cause failure of the 2911
sequencer test upon powerup or reinitialization.
In one case, the requestor sent status back to
the P.io while Init was asserted. In the other
case, the sequencer had already released the
Init signal, but the sequencer failed to reach
the point in its code where it could change the
status bits. A common occurrence of this status
code is from an HSC false power fail crash dump.
In this type of crash dump (IOT through 20), all
requestors present report a 00 status code.

2901 ALU test failed upon powerup or
reinitialization.

Data Bus (DBUS) test failed upon powerup or
reinitialization.

Control Bus (CBUS) test failed upon powerup or
reinitialization.

PROM test failed upon powerup or
reinitialization.

Scratchpad RAM test failed upon powerup or
reinitialization.

SERDES test failed upon powerup or
reinitialization.

Partial STI test failed upon powerup or
reinitialization.

The K.sti encountered a Control Bus/memory
problem while searching for the K-Init packet in
Control memory.

After completing power-up/init diagnostics, the
K.sti could not find a properly formatted K-Init
packet in Control memory.

Control Bus error. (See note.)

14 through
22

During transfer completion, the buffer
descriptor link word in the FRB was zero. RAM7
contains the lower processor status.

24 through
33 *

Control Bus error. (See note.)

The lower processor has timed out on a transfer
operation and the upper processor cannot restart
it.

35 and 36

Control Bus error. (See note.)

A software inconsistency. The STI state zero
processing code was entered when the drive state
indicator was not zero.

State zero processing is complete. However, the
next state (such as Send Level 1 frame, or Get
Drive Status) is not specified. Thus, the state
is undefined.

41 through
43 *

Control Bus error. (See note.)

While setting up a transfer, the next buffer
descriptor in the FRB was zero (no buffer was
there).

Attempted to downcount counter that was
already zero. R14 contains the FRB. R16 contains
the counter minus one. R17 contains the address
of the counter structure.

75 and 76

Control Bus error. (See note.)

NOTE
The upper sequencer accesses Control memory several times
before checking for a Control Bus error. Thus, to help
determine the particular cause of the error, the upper
sequencer saves the contents of the error register
present at the time of the error in R11 (octal). The
contents of R11 are visible within the crash dump and may
help in narrowing the error possibilities. The following
list defines all the bits contained within the upper
processor error register (value loaded in R11). Those
bits that can possibly cause a crash are denoted with an
asterisk (*).
Upper Processor Error Register:
Bit 0 = Even/Odd bit Control Memory Address
Bits 3,2,1 = CCYCLE 2,1,0
* Bit 4 = Control Bus Error (Illegal Cycle)
* Bit 5 = Control Bus NXM
* Bit 6 = Control Data Parity Error
* Bit 7 = Instruction (CROM) Parity Error
Bits 8 through 12 not used
* Bit 13 = Response Pulse Missing on SDI RD/RES Line
Bit 14 = Upper Processor RTC Clock Pulse Present
Bit 15 = Parity Error on RTDS Line

APPENDIX E
HSC70 REVISION MATRIX CHART

E.1 INTRODUCTION
Figure E-1 shows the revision status of all applicable HSC70
FRUs. An HSC70 must have all the FRUs at a particular revision
level in order to be supported. Initial release of HSC70-AA
(120/208 VAC, 60 Hz) and HSC70-AB (380-415 VAC, 50 Hz), including
V100 software, are at revision A1.

HSC70-AA/CA

NUMBER                    DESCRIPTION
L0100-00                  LINK (CI LINK)
L0107-YA                  K.pli
L0108-YA (HSC5X-BA)       K.sdi
L0108-YB (HSC5X-CA)       K.sti
L0109-00                  PILA
L0111-00                  P.ioj
L0117-AA                  M.std2
5417764-01                BACKPLANE

[REVISIONS columns (etch and revision levels such as B-ETCH, C-ETCH, C10, C22, C23) are not recoverable from the scan.]

CX-1271A
Sheet 1 of 4

Figure E-1 HSC70 Revision Matrix Chart (1 of 4)

REV A1

HSC70-AA/CA

NUMBER              DESCRIPTION                REV
70-20033-03         STD PS ASSY - 120VAC IN
70-20184-01         OPT PS ASSY - 120VAC IN
30-24374-01         881A PWR CNTR ASSY
70-23138            OCP ASSEMBLY
54-15286-01**       OCP
70-23129-01         FLOPPY DRIVE BKT ASSY
30-24962-01         RX33 DRIVE
EK-HSC70-IN-xxx     INSTALLATION MANUAL        001
QX926-H7            HSC70 SOFTWARE             V100
BL-FH74x-DE         HSC70 OFFLINE DIAGS        V300

**THIS BREAKDOWN IS FOR FIELD SERVICE INFORMATION ONLY.

[REVISIONS columns are not recoverable from the scan.]

CX-1271A
Sheet 2 of 4

Figure E-1 HSC70 Revision Matrix Chart (2 of 4)

HSC70-AB/CB    REV A1

NUMBER                   DESCRIPTION
L0100-00                 LINK (CI LINK)
L0107-YA                 K.pli
L0108-YA (HSC5X-BA)      K.sdi
L0108-YB (HSC5X-CA)      K.sti
L0109-00                 PILA
L0111-00                 P.ioj
L0117-AA                 M.std2
54-17764-01              BACKPLANE

CX-1271A
Sheet 3 of 4

Figure E-1  HSC70 Revision Matrix Chart (3 of 4)

HSC70-AB/CB    REV A1

NUMBER                   DESCRIPTION
70-20184-02              STD PS ASSY - 240VAC IN
30-24374-02              OPT PS ASSY - 240VAC IN
70-23138-01              881B PWR CNTR ASSY
70-20033-04              OCP ASSEMBLY
54-15286-01 **           OCP
70-23129-01              FLOPPY DRIVE BKT ASSY
30-24962-01              RX33 DRIVE
EK-HSC70-IN-xxx          INSTALLATION MANUAL (001)
QX926-H7                 HSC70 SOFTWARE (V100)
BL-FH74x-DE              HSC70 OFFLINE DIAGS (V300)

** THIS BREAKDOWN IS FOR FIELD SERVICE INFORMATION ONLY.

CX-1271A
Sheet 4 of 4

Figure E-1  HSC70 Revision Matrix Chart (4 of 4)

-A-

AC power
Removing, 3-1
ACK/NAK generation, 1-13
Address switches, node, 2-9
Airflow sensor assembly
Removing, 3-15
Auxiliary power supply
Removing, 3-21

-B-

BBR errors, 8-42
Block Diagram, 1-10 figure
Blower, 1-5
Removing, 3-13
Booting procedures, 4-2
Booting the Offline Diagnostics
diskette, 6-2
Booting the System diskette, 4-2

-C-

Cables
Backplane to bulkhead, 1-6
Bulkhead to outside, 1-6
CI, 1-7
CI bus, 1-7
SDI, 1-7
SDI bus, 1-7
STI, 1-7
STI bus, 1-7
Cache Test Descriptions, 6-29
Cache Test Parameters, 6-23
CI Bus
Connecting to multiple hosts,
1-1
CI cables
Port link module interfaces,
1-14
CI Errors, 8-71
CI manager, 1-9
Console Terminal
Troubleshooting, 8-7
Console terminal connection, 4-1
Control Bus Error Conditions
(Hardware Detected), 8-112
Control memory size, 1-16
Controls and indicators
DC power switch, 2-5
Enable indicator, 2-4
Operator control panel, 2-1
Secure/Enable switch, 2-3
Cooling, 1-2, 1-5

-D-

Data memory size, 1-16
DC power
Removing, 3-3
DC Power Switch
Location of, 3-4 figure
DC power switch, 2-5
DEFAULT command, DKUTIL, 1-0
Defaults, utility prompts, 7-1
Diagnostic manager, 1-10
Diagnostic subroutines, 1-10
Disabling P.ioj parity errors,
6-60
Disk functional errors, 8-84
Disk I/O manager, 1-9
DISPLAY command, DKUTIL, 7-10
DKUTIL
commands, 7-8
error messages, 7-20
DKUTIL command descriptions, 7-7
DKUTIL command modifiers, 7-3
DKUTIL command syntax, 7-2
DKUTIL Initiation, 7-1
Documents
ordering, 1-20
Door
Back, opening, 3-5
Front, opening, 3-4
DUMP command, DKUTIL, 7-12

-E-

Error classes, VERIFY, 7-21
Error message format, generic,
6-10
Error message severity levels,
DKUTIL, 7-17
Error message severity levels,
VERIFY, 7-25


Error message variables, DKUTIL,
7-17
Error message variables, FORMAT,
7-34
Error messages, FORMAT, 7-38
Error processor, 1-9
Errors
Aborting Error Recovery Due to
Excessive RECALS, 8-84
Aborting Error Recovery Due to
Excessive Timeouts, 8-84
Acknowledge Not Asserted At
Start Of Transfer, 8-58
ATN. message sent to Node xx,
for Unit xx, 8-85
Attention Condition serviced
for ONLINE disk unit xxx,
8-85
Bad Block Replacement (Block
OK), 8-45
Bad Block Replacement (Drive
Inoperative), 8-45
Bad Block Replacement (RCT
Inconsistant), 8-45
Bad Block Replacement (REPLACE
Failed), 8-46
Bad Block Replacement (Success),
8-46
Bad dispatch state in CB ... ,
8-80
Booted from drive 1. Drive 0
Error (text), 8-101
Buffer EDC Error, 8-59
Cables have gone from uncrossed
to crossed, 8-72
Cache disabled due to failure,
8-102
Cannot Clear Drive Errors, 8-59
Cannot Clear Formatter Errors,
8-59
Clock dropout from ONLINE disk
unit xx, 8-85
Compare Error, 8-17
Controller Detected Position
Lost, 8-60
Controller Transfer Retry Limit
Exceeded, 8-60
Controller-Detected
Transmission or Time Out
Error, 8-28
Could Not Complete Online
Sequence, 8-60

Could Not Get Extended Drive
Status, 8-61
Could Not Get Formatter Summary
Status During Transfer
Error Recovery, 8-61
Could Not Get Formatter Summary
Status While Trying To
Restore Tape Position, 8-61
Could Not Position For
Formatter Retry, 8-62
Could Not Set Byte Count, 8-62
Could Not Set Unit
Characteristics, 8-62
Data Bus Overrun, 8-17
Data Error Flagged in Backup
Record, 8-93
Data Memory Error (NXM or
Parity), 8-18
Data Overflow Due To Pipeline
Error, 8-63
Data Ready Timeout, 8-63
Data Synch Not Found, 8-39
Date/Time set by node nn, 8-71
Deferred ATN. message for Node
xx, Unit xx, 8-86
Disk Unit xx (Requestor xx,
Port xx) being initialized,
8-86
Disk unit xx ready to
transfer.!, 8-86
Disk unit xxx.(requestor
xx.,Port xx.) declared
inoperative, 8-87
DRAT/SEEK timeout, disk unit
xxx, 8-87
DRIVE CLEAR attempt on disk
unit xx, 8-88
Drive Clock Dropout, 8-28
Drive Inoperative, 8-29
Drive-Detected Error, 8-29
Drive-Requested Error Log (EL
Bit Set), 8-30
Duplicate disk unit xx, 8-88
ECC Errors, 8-40
Eight Symbol, 8-40
Five Symbol, 8-40
Four Symbol, 8-40
One Symbol, 8-40
Seven Symbol, 8-40
Six Symbol, 8-40
Three Symbol, 8-40

Two Symbol, 8-40
Uncorrectable, 8-40
EDC Error, 8-18
Erase Command Failed, 8-64
Erase Gap Command Failed, 8-64
Forced Error, 8-41
Formatter And HSC Disagree On
Tape Position, 8-64
Formatter Detected Position
Lost, 8-65
Formatter Requested Error Log,
8-65
Formatter Retry Sequence
Exhausted, 8-65
FRB error: K.ci, 1st LBN xx
buffers, FE$SUM xx, 8-88
FRB error: K.sdi, unit xx,
first LBN xxx, buffers,
FE$SUM, 8-89
Hard transfer error loading
(file) xx, 8-102
Hard transfer error writing SCT
xx, 8-103
Header Error, 8-41
HML$ER set - HM$ERR = nn, 8-78
Host Clear from CI node, 8-103
Host interface (K.ci) failed
INIT diags, status = xxx,
8-104
Host interface (K.ci) is
required but not present,
8-104
Host Requested Retry
Suppression On A Formatter
Detected Error, 8-66
Host Requested Retry
Suppression On A K.sti
Detected Error, 8-66
Illegal bit change in status
from disk unit xxx, 8-89
Insufficient Control Memory for
K.sti in Requestor xx, 8-93
Insufficient Private Memory
remaining for TMSCP Server,
8-94
Internal Consistency Error,
8-19
K.ci exception detected, code
nnn, 8-75

K.ci loopback microcode loaded,
8-80
K.sdi in slot xx failed its
init diagnostics, status
xxx, 8-89
K.sti in Requestor xx has
microcode incompatable with
this TMSCP Server, 8-94
Last soft init resulted from
unknown cause, 8-105
LBN Restored with Forced Error
in RESTOR Operation!, 8-90
Less than 87.5% of xx memory is
available, 8-105
Level 7 K Interrupt (Trap thru
134), 8-112
Level 7 K interrupt trap thru
134, 8-107
Lost Read/Write Ready, 8-30
Lost Receiver Ready, 8-31
Lower Processor Error, 8-66
Lower Processor Timeout, 8-67
MMU (Trap thru 250), 8-115
MMU Trap thru 250, 8-107
No control block available to
satisfy HMB request., 8-78
No Tape Drive Structures
available for Requestor xx
Port xx Unit xx Increase
Structures via SET MAXTAPE
command, 8-95
No Tape Formatter Structures
available for Requestor xx
Port xx Increase structures
via SET MAXFORMATTERS
command, 8-95
No usable K.sti boards were
found by the TMSCP Server,
8-95
Node nn Cables have gone from
crossed to uncrossed, 8-73
Node nn Path (A or B) has gone
from good to bad, 8-73
Node nn Path n has gone from
bad to good, 8-74
NXM (Trap thru 4), 8-111
NXM Trap thru 4, 8-108
P.ioj running with memory bank
or board swap enabled,
8-105
Parameter change, 8-108

Parity Error Trap thru 114,
8-109, 8-111
PLI Receive Buffer Parity Error,
8-19
PLI Transmit Buffer Parity
Error, 8-20
Position or Unintelligible
Header Error, 8-31
Positioner error on disk unit
xxx. DRAT addr:xxx, 8-90
Premature LP flag in RTNDAT
sequence from host node xx,
8-90
Pulse or Parity Error, 8-32

Receiver Ready Not Asserted At
Start Of Transfer, 8-67
Record EDC Error, 8-67
Requestor xx failed INIT diags,
status = xxx, 8-106
Requestor xx has failed
initialization dignostics
with status = xx, 8-96
Reserved Instruction (Trap thru
10), 8-111
Reserved Instruction Trap thru
10, 8-109
Resource lost to K.ci -- xxx
xxx HMBs, 8-80
Retry Limit Exceeded While
Attempting To Restore Tape
Position, 8-68
Reverse Retry Currently Not
Supported, 8-68
Rewind Failure, 8-68
SCT read or verification error.
Using template SCT., 8-106
SDI exchange retry on disk unit
xxx, 8-91
SERDES Overrun, 8-19
SDI Clock Persisted After INIT,
8-33
SDI Clock Resumption Failed
After INIT, 8-32
SDI Command Timeout, 8-33
SDI Receiver Ready Collision,
8-34
SDI Response Length or Opcode
Error, 8-35
SDI Response Overflow, 8-35

Software Inconsistency (Trap
thru 20), 8-118
Software inconsistency Trap
thru 20, 8-110
Tape Drive Requested Error Log,
8-69
Tape Formatter declared
inoperative, 8-97
Tape unit number xx connected
to Requestor xx Port xx
Ceased to exist while
Online, 8-96
Tape unit number xx connected
to Requestor xx Port xx
dropped state clock, 8-96
Tape unit number xx connected
to Requestor xx Port xx Is
not asserting Available
when it should be, 8-98
Tape unit number xx connected
to Requestor xx Port xx
Went Available without
request, 8-98
Tape unit number xx connected
to Requestor xx Port xx
Went Offline without
request, 8-99
TMSCP fatal initialization
error - TMSCP functionality
not available, 8-99
TMSCP Server operation limited
by insuffcient Private
Memory, 8-100
Topology Command Failed, 8-69
TTRASH fatal initialization
error, 8-100
Unable To Position To Before
LEOT, 8-69
Unexpected AVAILABLE signal
from ONLINE disk unit xx!
8-91
Unknown K.tape Error, 8-70
Unrecoverable error on disk
unit xx. Drive appears
inoperative, 8-91
Unsuccessful SEEK initiation,
disk unit xxx. DCB addr:
xxx, 8-92
VC closed due to timeout of
RTNDAT/CNF from host node
xx, 8-92
VC closed with node nn due to
disconnect timeout, 8-76
VC closed with node nn due to
request from K.ci, 8-77
VC closed with node nn due to
START received, 8-77
VC closed with node nn due to
unexpected disconnect, 8-76
VC open with node nn, 8-72
WARNING K.sti microcode too low
for large transfers., 8-101
Word Rate Clock Timeout, 8-70
Event Codes
MSCP, C-3
TMSCP, C-3
Exception codes and messages, B-1
EXIT Command, DKUTIL, 7-14
External interfaces, 1-7
-F-

Fatal error messages, FORMAT,
7-35
Fatal error messages, VERIFY,
7-25
Fault code interpretation, 4-4
Fault codes, 4-6
FORMAT, CAUTION, 7-30
FRU
Removal sequence, 3-4
-G-

GEDS Text Field
Breakdown of, 8-53
General information, 1-1
GET command, DKUTIL, 7-14
GSS Text Field
Breakdown of, 8-55
-H-

HSC70 control program, 1-8
HSC70 maintenance strategy, 1-17
HSC70 specifications, 1-19
-I-

ILDISK error messages
error 01 DDUSUB initialization
failure, 5-13

error 02 unit selected is not a
disk., 5-13
error 03 drive unavailable.,
5-14
error 04 unknown status from
DDUSUB., 5-14
ILEXER data patterns, 5-60
ILMEMY error messages
error 000 tested twice with no
error., 5-8
error 001 returned buffer to
free buffer queue., 5-8
error 002 memory parity error.,
5-8
error 003 memory data error.,
5-8
ILTAPE Error Messages
error 1 - initialization
failure, 5-39
error 1 - requested device is
busy, 5-40
error 10 - load device write
error - check if write
locked, 5-40
error 11 - command failure,
5-40
error 12 - read memory byte
count error, 5-40
error 13 - formatter diagnostic
detected error, 5-40
error 14 - formatter diagnostic
detected fatal error, 5-40
error 15 - Rx33 read error,
5-41
error 16 - insufficient
resources to acquire
specified device, 5-41
error 17 - k microdiagnostic
did not complete, 5-41
error 18 - k microdiagnostic
reported error, 5-41
error 19 - dcb not returned, k
failed for unknown reason,
5-41
error 2 - selected unit not a
tape, 5-39
error 20 - error in DCB upon
completion, 5-41
error 21 - unexpected item on
drive service queue, 5-41

error 22 - state line clock not
running, 5-41
error 23 - init did not stop
state line clock, 5-41
error 24 - state line clock did
not start up after init,
5-41
error 25 - formatter state not
preserved across init, 5-42
error 26 - echo data error,
5-42
error 27 - receiver ready not
set, 5-42
Error 28 - available set in
online formatter, 5-42
error 29 - Rx33 error - file
not found, 5-42
error 3 - invalid
requestor/port number, 5-39
error 30 - data compare error,
5-42
error 31 - edc error, 5-42
error 32 - invalid multiunit
code from GUS command, 5-42
error 33 - insufficient
resources to acquire timer,
5-42
error 34 - unit unknown or
online to another
controller, 5-43
error 4 - requestor not a k.sti,
5-40
error 5 - timeout acquiring
drive service area, 5-40
error 6 - requested device
unknown, 5-40
error 8 - unknown status from
tape diagnostic, 5-40
error 9 - unable to release
device, 5-40
ILTAPE Prompts
drive unit number (u) []?, 5-32
enter port number (0-3) []?,
5-33
enter requestor number (2-9)
[]?, 5-33
execute formatter diagnostics
(yn) [y]?, 5-33
execute test of tape transport
(yn) [n]?, 5-33

is media mounted (yn) [n]?,
5-33
memory region number (h) [0]?,
5-33
ILTCOM Error Messages
error 1 - initialization
failure, 5-50
error 10 - can't find end of
bunch, 5-50
error 11 - data compare error,
5-50
error 12 - data EDC error, 5-51
error 2 - selected unit not a
tape, 5-50
error 3 - command failure, 5-50
error 5 - specified unit not
available, 5-50
error 6 - specified unit cannot
be brought online, 5-50
error 7 - specified unit
unknown, 5-50
error 8 - unknown status from
TDUSUB, 5-50
error 9 - error releasing drive,
5-50
Information messages, FORMAT,
7-37
Information messages, VERIFY,
7-26
Informational messages, VERIFY,
7-29
Initialization error indications,
8-2
Initiating FORMAT, 7-31
Initiating VERIFY, 7-22
Inline Diagnostics
inline disk drive diagnostic
test (ILDISK), 5-9
inline memory test (ILMEMY),
5-6
inline multidrive exerciser
(ILEXER), 5-51
inline tape compatability test
(ILTCOM), 5-44
inline tape test (ILTAPE), 5-31
Inline diagnostics generic error
message format, 5-1
Inline diagnostics generic prompt
syntax, 5-1
Inline RX33 Diagnostic Test
(ILRX33), 5-2

Internal Software, 1-8 figure
Internal software, 1-8

-K-

K.pli
See Port processor module
K.sdi
See Logic modules
Disk data channel
K.sti
See Logic modules
Tape data channel

-L-

Load device
See Rx33 disk drives
Load device errors, 8-81
Loader commands, 6-12
Loader DEPOSIT Command, 6-15
Loader EXAMINE Command, 6-14
Loader HELP Command, 6-12
Loader LOAD Command, 6-14
Loader SIZE Command, 6-13
Loader START Command, 6-14
Loader TEST Command, 6-13
Logic Modules
LEDs, functions of, 2-8 table
Port Buffer Module (PILA),
functions of, 1-14
Port processor module,
interfaces, 1-14
Logic modules
Card cage
Module utilization label, 1-4
Disk data channel, functions of,
1-14
I/O control processor module,
functions, 1-15
Indicators and switches, 2-6
Memory module, functions, 1-16
Port link module (LINK),
functions of, 1-12
Port processor module,
functions, 1-14
Removing, 3-11
Switches, 2-9
Port link buffer, 2-10
Tape data channel, functions,
1-15
Logic modules, descriptions, 1-11
to 1-18

-M-

M.std2
See Logic modules
Memory module
Main power supply
Removing, 3-18
Maintenance features, 1-18
Message severity levels, FORMAT,
7-35
Microcode detected errors
K.ci, D-11
K.sdi, D-14

Miscellaneous errors, 8-101
Module indicators, 2-6
Module LEDs
Data Channel LEDs, 8-6
Host Interface LEDs, 8-6
Memory module LEDs, 8-5
P.ioj LEDs, 8-4
Power up sequence, 8-4
Module Nomenclature, 1-12 table
Module Switches
Module sin switches, 2-10
figure
Module switches
Node address switches, 2-9
Module Utilization Label, 1-5
figure
Moving Inversions algorithm, 6-42
MSCP errors, 8-13
Controller error list, 8-16
Disk Transfer Errors, 8-35
SDI Errors, 8-20
MSCP processor, 1-9
-N-

Node address switches, 2-9
-O-

OCP Fault code displays, 8-2
Offline bus interaction test
error messages
Error 000 - Memory test error,
6-39

Error 001 - K timed-out during
init., 6-39
Error 002 - K timed-out during
test., 6-40
Error 003 - parity trap., 6-40
Error 004 - NXM trap, 6-40
Error 005 - Memory test error
(P.ioj detected)., 6-41
Error 010 - Cache parity trap,
6-41
Error 011 - RX33 drive not
ready, 6-41
Error 012 - RX33 CRC error
during seek., 6-41

Error 013 - RX33 track 0 not
set on recalibrate., 6-42
Error 014 - RX33 seek timeout,
6-42
Error 015 - RX33 seek error.,
6-42
Error 016 - RX33 read timeout.,
6-42
Error 017 - RX33 CRC/RNF error
on read command., 6-42
Offline cache cumulative soft
error results, 6-24
Offline Cache Test Error Messages
error 00 - memory parity error,
6-25
error 01 - NXM trap, 6-25
error 02 - cache parity error,
6-25
error 03 - bit stuck in cache
control register., 6-25
Offline Cache Test Error messages
Error 04 - Forced miss
operation failed., 6-25
Offline cache test error messages
Error 05 - Forced miss with
abort failed., 6-25
Error 06 - Expected cache hit
did not occur., 6-25
Error 07 - Expected cache miss
did not occur., 6-25
Error 10 - Value in hit/miss
register incorrect., 6-26
Error 11 - Write byte operation
caused cache update., 6-26
Error 12 - Write byte did not
cause cache update., 6-26

Error 13 - Cache failed to
flush successfully., 6-26
Error 14 - Access with force
bypass did not cause
invalidate., 6-26
Error 15 - Tag Parity error bit
did not set., 6-26
Error 16 - Abort on cache
parity error did not occur.,
6-26
Error 17 - Unexpected parity
trap during abort test.,
6-26
Error 20 - Content of memory
system error register
incorrect., 6-27
Error 21 - Return PC wrong
during abort/interrupt
test., 6-27
Error 22 - Cache data parity
bit(s) did not set., 6-27
Error 23 - Interrupt on parity
error did not occur., 6-27
Error 24 - Expected NXM trap
did not occur., 6-27
Error 25 - Parity error was not
blocked by NXM., 6-27
Error 26 - Cache data
miscompare on word
operation., 6-27
Error 27 - Cache data
miscompare on byte
operation., 6-27
Error 30 - DMA write to memory
did not cause cache to
invalidate., 6-27
Error 31 - Instruction still
completed during abort
condition., 6-28
Error 32 - Load device error
during DMA test., 6-28
Error 33 - PDR cache bypass
failed., 6-28
Error 34 - Tag store address
hit failure., 6-28
Error 35 - Tag store address
miss failure., 6-28
Error 41 - Processor type is
not J11., 6-28

Index-8

Offline Diagnostics
Offline Bus Interaction Test,
6-33
offline cache test, 6-22
offline diagnostic loader, 6-11
Offline K Test, 6-43
Offline K/P Memory Test, 6-57
Offline Memory Test, 6-73
Offline Operator Control Panel
Test, 6-104
Offline Refresh Test, 6-100
Rx33 Offline Exerciser, 6-89
Offline diagnostics
P.ioj ROM Bootstrap, 6-2
Offlines common characteristics,
6-1
Offlines generic error message
format, 6-10
Operator control panel, 2-2 figure
Blank indicators, 2-3
Fault codes, 2-3
Fault indicator and switch, 2-2
Init switch, 2-2
Lamp test, 2-3
Online indicator, 2-3
Online switch, 2-3
Power indicator, 2-2
Removing, 3-9
State and Init indicators, 2-1
Out-of-band errors, 8-70
Categories of, 8-70
-P-

P.ioc
See logic modules
I/O control processor module
Packaging, 1-2
Packet reception, 1-13
Packet transmission, 1-13
Power, 1-2
Power control bus
See power controller, 1-6
Power Controller
rotating the line cord elbow,
3-16
Power controller
Bus/off/on switch, 2-12
Circuit breaker, 2-12
Delayed output line, 1-6
Description of, 2-10
Fuse, location of, 2-12

Noise isolation filters, 1-6
Operating instructions, 2-10
Power control bus, 1-6
Power control bus connections,
2-12
Removing, 3-16
Total Off connector and power
up, 2-12
Program memory size, 1-16
PUSH Command, DKUTIL, 7-15
-R-

Removing power, 3-1
Requestor Error Summary, 6-38
REVECTOR command, DKUTIL, 7-16
Rx33 cover plate
Removal, 3-5
RX33 Disk Drives
jumper configurations, 3-8, 3-9
RX33 disk drives, as program load
device, 2-5
Rx33 Diskette controller location,
1-16
RX33 error code tables, 6-10
RX33 Exerciser Data Patterns,
6-99

-S-

Safety precautions, 3-1
SDI manager, 1-10
Secure/Enable switch, 2-3
SET Command, DKUTIL, 7-16
SINI errors
See miscellaneous errors, 8-101
Software, 1-8, 1-9, 1-10, 1-15
Software Error Messages
Categories of, 8-13
Software Release Notes, 1-1, 1-15
Status bytes interpretation
K.ci, D-ll
K.sdi, D-14
K.sti, D-16
STI bus
Maximum number of tape
formatters, 1-1
STI Communication or Command
Errors, 8-46
STI manager, 1-9
Subsystem block diagram, 1-10


Success messages, FORMAT, 7-38
Switches, node address, 2-9
Symbolic addresses, DEPOSIT and
EXAMINE commands, 6-15

-T-

Tape functional errors, 8-92
Tape I/O manager, 1-9
Terminal connection, 4-1
Troubleshooting, Cache, 6-28
Typical Bus Interaction Test
error message, 6-37

-U-

Utilities manager, 1-10
Utility processes, 1-10

-V-

Variable output fields, VERIFY,
7-25
VERIFY process, 7-20
VERIFY Type error messages,
VERIFY, 7-28
VERIFY/FORMAT, 7-34

-W-

Warning message, FORMAT, 7-37
Warning messages, VERIFY, 7-26

Digital Equipment Corporation. Colorado Springs, CO 80919