Digital PDFs

EK-ORA90-SV-003

June 1990

256 pages

Original

11MB

Document:	EK-ORA90-SV-003 RA90 RA92 Service Jun90
Order Number:	EK-ORA90-SV
Revision:	003
Pages:	256
Original Filename:

OCR Text

RA90/RA92 Disk Drive Service Manual
Order Number EK-ORA90-SV-003

Digital Equipment Corporation
Maynard, Massachusetts

First Edition: June 1988
Second Edition: June 1989
Third Edition: June 1990
The information in this document is subject to change without notice and should not be construed as a commitment
by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may
appear in this document.
The software described in this document is furnished under a license and may be used or copied only in
accordance with the terms of such license.
No responsibility is assumed for the use or reliability of software on equipment that is not supplied by Digital
Equipment Corporation or its affiliated companies.
Restricted Rights: Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in
subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.
Copyright © 1989, 1990 by Digital Equipment Corporation

All Rights Reserved.
Printed in U.S.A.
The postpaid READER'S COMMENTS card requests the user's critical evaluation to assist in preparing future
documentation.

FCC NOTICE: The equipment described in this manual generates, uses, and may emit radio frequency energy.
The equipment has been type tested and found to comply with the limits for a Class A computing device pursuant
to Subpart J of Part 15 of FCC Rules, which are designed to provide reasonable protection against such radio
frequency interference when operated in a commercial environment. Operation of this equipment in a residential
area may cause interference, in which case the user at his own expense may be required to take measures to
correct the interference.
The following are trademarks of Digital Equipment Corporation:
DEC
DECUS
DECnet
HSC
KDA
MASSBUS
MicroVAX
MSCP
PDP

RC25
RQDX3
RSTS/E
RSX
RT-11

SA
TA

TU
UDA50
ULTRIX
UNIBUS
VAX
VAXsimPLUS
VMS

TOPS-10
TOPS-20

RA90 ©Digital Equipment Corporation 1987
Covered by one or more U.S. PAT. Nos.
4,475,212
4,150,172

4,503,420

4,434,487
and other patents pending

This document was prepared using VAX DOCUMENT, Version 1.1

Contents
About This Manual

1 Introduction
1.1 RA90 and RA92 Disk Drive Descriptions
1.1.1 Physical and Logical Media Layout
1.2 Maintenance Strategy
1.2.1 Service Delivery Strategy
1.2.1.1 Six-Step Maintenance Strategy
1.2.2 Tools Required for Maintenance
1.2.3 Preventative Maintenance
1.3 RA90/RA92 Disk Drive Specifications
1.4 Electrostatic Protection

2 Installation
2.1 Introduction
2.2 Site Preparation and Planning
2.2.1 Power and Safety Precautions
2.2.2 Three-Phase Power Requirements
2.2.3 AC Power Wiring
2.2.4 Thermal Stabilization
2.2.5 Floor Loading
2.2.6 Operating Temperature and Humidity
2.3 Unpacking the Cabinet
2.3.1 Deskidding the Cabinet
2.4 Installing SDI Cables and Power Cords
2.4.1 Removing the Front and Rear Access Panels
2.4.1.1 Front Access Panel Removal
2.4.1.2 Removing the Rear Access Panel
2.4.2 SDI Cable Connections and Routing
2.4.3 Power Cord Connections and Routing
2.5 Locating the RA90/RA92 Disk Drive Power Supply
2.5.1 Plugging in the Power Cord
2.6 International Operator Control Panel Labeling
2.7 RA90/RA92 Disk Drive Acceptance Testing Procedures
2.7.1 Voltage Selection
2.7.2 Applying Power to the Drive
2.8 Power-Up Resident Diagnostics
2.8.1 OCP Lamp Testing
2.8.2 Test Selection from the OCP
2.8.3 RA90/RA92 Idle Loop Acceptance Testing
2.8.4 Testing Spun-Down Drive
2.8.5 Testing Spun-Up Drive
2.9 Placing the Drive On Line
2.9.1 Programming the Drive Unit Address
2.10 Installing RA90/RA92 Add-On Disk Drives in 60-Inch Cabinets

3 Operating Instructions
3.1 Introduction
3.2 RA90/RA92 Disk Drive Components
3.2.1 Electronic Control Module (ECM)
3.2.1.1 I/O-R/W Module
3.2.1.2 Servo Module
3.2.2 Preamp Control Module (PCM)
3.2.3 Head Disk Assembly and Carrier Assembly
3.2.4 Dual Outlet Blower Motor
3.2.5 Power Supply
3.2.6 Drive Functional Microcode
3.2.7 OCP Functions
3.3 RA90/RA92 Operating Modes
3.3.1 Normal Mode Setup
3.3.2 Fault Display Mode Setup
3.3.3 Test Mode Setup
3.4 Programming the Drive Unit Address
3.4.1 Alternate Unit Address Display Mode

4 Drive-Resident Diagnostics and Utilities
4.1 Introduction
4.2 Power-Up and Idle Loop Diagnostics
4.2.1 Power-Up (Hardcore) Diagnostics
4.2.2 Idle Loop Tests (Drive Spun Down)
4.2.3 Idle Loop Tests (Drive Spun Up)
4.3 Sequence Diagnostics
4.4 Standard OCP Displays Indicating Procedural Problems
4.5 Software Jumper
4.6 Temperature's Affect on Drive Performance
4.7 Diagnostics Descriptions
4.7.1 Seek Timing Tests
4.7.2 Time, Seeks, and Spinups Display Interpretation

5 Troubleshooting and Error Codes
5.1 Troubleshooting Reference Material
5.1.1 Customer Support Training for the RA90/RA92 Disk Drive
5.2 RA90/RA92 Troubleshooting Aids
5.2.1 VAXsimPLUS
5.2.2 Host Error Logs
5.2.3 Extended Status Bytes
5.2.3.1 Response Opcode (Byte 1)
5.2.3.2 Unit Number Low Byte (Byte 2) and Subunit Mask (Byte 3)
5.2.3.3 Request Byte (Byte 4)
5.2.3.4 Mode Byte (Byte 5)
5.2.3.5 Error Byte (Byte 6)
5.2.3.6 Controller Byte (Byte 7)
5.2.3.7 Retry Count (Byte 8)
5.2.3.8 Previous Command Opcode (Byte 9)
5.2.3.9 HDA Revision Bits (Byte 10)
5.2.3.10 Cylinder Address (Bytes 11 and 12)
5.2.3.11 Error Recovery Level (Selected Group) (Byte 13)
5.2.3.12 Error Code (Byte 14)
5.2.3.13 Manufacturing Fault Code (Byte 15)
5.2.4 Drive Internal Error Log
5.2.4.1 Running DKUTIL From the HSC Console or KDM70 Controller
5.2.4.2 Running the Drive-Resident Utility Dump (T41) From the OCP
5.2.5 OCP Fault Indicator/Error Codes
5.2.6 Drive Power Supply Indicator
5.2.7 Drive Error Reporting Mechanisms
5.2.7.1 Detailed Description of Error Reporting Mechanisms
5.2.8 Host-Level Diagnostics and Utilities
5.3 General Troubleshooting Information
5.3.1 Drive-Resident Diagnostics Limitations
5.4 Step-by-Step Troubleshooting Procedure
5.4.1 Troubleshooting Worksheet
5.5 Identifying the Problem Drive
5.5.1 Talking to the System Operator/Checking the OCP Fault Indicator
5.5.2 Using VAXsimPLUS to Identify the Problem Drive
5.5.3 Using the Host Error Log to Identify the Problem Drive
5.5.4 Using the HSC Console Log to Identify the Problem Drive
5.5.5 Using the Host Console/User Terminal Trails to Identify the Problem Drive
5.5.6 Using Other Means to Identify the Problem Drive
5.6 Identifying the Problem FRU
5.6.1 Pre-Verifying Drive Symptoms
5.6.2 Using OCP Error Codes to Identify the Problem FRU
5.6.3 Using VAXsimPLUS to Identify the Problem FRU
5.6.4 Using the Host Error Log to Identify the Problem FRU
5.6.5 Using the HSC Console Log to Identify the Problem FRU
5.6.6 Using the Drive Internal Error Log to Identify the Problem FRU
5.7 Priority Order of Troubleshooting DSA Errors
5.7.1 Drive-Detected Drive Errors and Diagnostic Faults
5.7.1.1 Drive-Detected Protocol Errors Without Communication Errors
5.7.1.2 Drive-Detected Pulse or State Parity Errors
5.7.2 Controller-Detected EDC Error
5.7.2.1 Controller-Detected Protocol and Transmission Errors Without Communication Errors (Status/Event Codes 14B or 4B)
5.7.2.2 Controller-Detected Pulse or State Parity Errors (Status/Event Code 10B)
5.7.3 Controller-Detected Communication Events and Faults
5.7.3.1 Controller-Detected: LOSS OF READ/WRITE READY (Status/Event Code: 8B)
5.7.3.2 Controller-Detected: LOST RECEIVER READY (Status/Event Code: CB)
5.7.3.3 Controller-Detected: RECEIVER READY COLLISION (Status/Event Code: 1AB)
5.7.3.4 Controller-Detected: DRIVE CLOCK DROPOUT (Status/Event Code: AB)
5.7.3.5 Controller-Detected: DRIVE FAILED INITIALIZATION (Status/Event Code: 16B)
5.7.3.6 Controller-Detected: DRIVE IGNORED INITIALIZATION (Status/Event Code: 18B)
5.7.3.7 Controller-Detected: SERDES OVERRUN ERROR (Status/Event Code: 2A)
5.7.3.8 SDI Drive Command Timeout (Status/Event Code: 2B)
5.8 Media-Related Errors
5.8.1 Repeating LBNs/RBNs
5.8.2 Excessive Number of Blocks Replaced Because of R/W Path Problems
5.8.3 LBN Correlation to Single Group/Track
5.8.4 LBN Correlation to Head Groups
5.8.4.1 LBNs Correlated to Zone Write Boundaries
5.8.4.2 LBN Correlation to a Physical Cylinder
5.8.5 Multiple Controllers Report Same Error Types
5.8.6 Only Single Controller Port Affected
5.8.7 Isolating Random R/W Transfer Errors
5.8.7.1 Not Defined to a Specific Drive/Controller Port
5.9 Miscellaneous Checks
5.10 Are You Lost?
5.11 Using Host-Level Diagnostics as a Last Resort
5.11.1 HSC-Based Diagnostics
5.11.2 KDM-Based Diagnostics
5.11.2.1 On Line from VMS
5.11.2.2 Running Standalone Programs from the VAX Diagnostic Supervisor
5.11.3 xDA Controller-Based Diagnostics
5.12 Exiting Data Collection: Action Item List Process
5.13 FRU Replacement
5.13.1 Multiple Error Codes
5.13.2 Service Post-Verification
5.13.3 Return Disk Drive to User
5.14 Performance Issues When No Errors Are Being Logged
5.15 Troubleshooting VMS Mount Verification
5.15.1 VMS Mount Verification
5.15.2 VMS Problems Surrounding Diagnosis of "Why a Drive Mount-Verifies"
5.15.3 Non-VMS Mount Verification
5.16 Troubleshooting ECC Errors on RA90/RA92 Disk Drives
5.16.1 Uncorrectable ECC Errors--MSCP Status/Event E8
5.16.1.1 Hard Uncorrectable ECC Errors
5.16.1.2 Soft Uncorrectable ECC Errors
5.16.2 Correctable ECC Errors--MSCP Status/Event Codes 1A8, 1C8, 1E8
5.16.2.1 BBR Packet
5.17 Troubleshooting Controller-Detected Positioner Errors--MSCP Status/Event 6B
5.17.1 RA92 Disk Drive With MSCP Status/Event 6B
5.17.2 Evaluating MSCP 6B Events
5.18 Conclusion
5.19 Error Codes and Descriptions

6 Removal and Replacement Procedures
6.1 Introduction
6.2 Sequence for FRU Removal
6.3 Electrostatic Sensitivity
6.4 Power Precautions
6.5 Tools Checklist
6.6 Removing/Replacing Cabinet Front and Rear Access Panels
6.6.1 Removing/Replacing the Front Access Panel
6.6.2 Removing/Replacing the Rear Access Panel
6.7 Removing the Operator Control Panel
6.8 Removing the Blower/Bezel Motor Assembly
6.8.1 Separating the Bezel and Blower Motor Assembly
6.9 Removing the Electronic Control Module
6.10 Removing the Preamp Control Module
6.11 Removing/Replacing the Head Disk Assembly
6.11.1 Removing the HDA
6.11.2 HDA Thermal Stabilization Procedure
6.11.3 Replacing the HDA
6.11.4 Separating the HDA and Carrier
6.11.5 Removing the Spindle Ground Brush
6.11.6 Removing the Brake Assembly
6.11.7 Spindle Lock Solenoid Failure
6.12 Removing the Power Supply
6.13 Removing/Replacing the Rear Flex Cable Assembly
6.14 Media Removal Service for Customers

7 Microcode Update Procedure
7.1 Introduction
7.2 Microcode Update Cartridge Description
7.3 Microcode Update Port Description
7.4 Running Test 40 (T40)
7.5 Updating the Microcode
7.5.1 Error Codes/Common Problems During Microcode Update

A Capturing Information for LARS and CHAMPS

B RA90/RA92 Error Recovery Levels

C Customer Equipment Maintenance
C.1 Customer Responsibilities
C.1.1 Cleaning Supplies
C.1.2 Ongoing Equipment Care
C.1.3 Monthly Equipment Maintenance
C.1.4 Maintenance Records

D Customer Services Preventative Maintenance
D.1 PM Checklist for RA90/RA92 Disk Drives

Index
Examples
5-1 RA90 Cylinder Address and Group (Head)
5-2 RA92 Cylinder Address and Group (Head)
5-3 VMS Uncorrectable ECC Error Log--Hard
5-4 VMS Uncorrectable ECC Error Log--Soft
5-5 VMS BBR Packet
5-6 Positioner Mis-Seek MSCP Status/Event 6B

Figures
1-1 Example of Sector Format
1-2 RA90 Physical and Logical Media Layout
1-3 RA92 Physical and Logical Media Layout
1-4 ESD Wrist Strap
2-1 Electrical Plug Configurations
2-2 Unpacking the 60-Inch Cabinet
2-3 Cabinet Deskidding
2-4 Ramp Installation of Shipping Pallet
2-5 Leveler Adjustment
2-6 Front Panel Removal
2-7 Rear Access Panel Removal
2-8 SDI Cable Connections and Routing--SA600 Example
2-9 Power Cord Connections and Routing--SA600 Example
2-10 RA90/RA92 Power Supply Controls and Indicators
2-11 RA90/RA92 Operator Control Panel
2-12 Location of Voltage Selector Switch
2-13 Location of Power Controller Controls--881 Example
2-14 Test Selection Flowchart
2-15 OCP Displays During Testing
2-16 Unit Selection Flowchart
3-1 RA90/RA92 Disk Drive Block Diagram
3-2 I/O-R/W Module Block Diagram
3-3 Servo Module Block Diagram
3-4 PCM Block Diagram
3-5 PCM Switch Pack Location
3-6 HDA Block Diagram
3-7 Power Supply OK LED
3-8 RA90/RA92 OCP
3-9 OCP Fault Display Error Code Example
3-10 Fault Display Mode Flowchart
3-11 OCP Display After Test Selection
3-12 OCP Display While Running Test
3-13 Unit Address Selection Flowchart
3-14 Alternate Unit Address Display Mode Flowchart
4-1 Using Loopback Connectors
4-2 Hardware Revision Switches
4-3 Hardware Revision Byte
4-4 T65 FCY OCP Display
4-5 T65 LCY OCP Display
4-6 T65 INC OCP Display
4-7 T65 DLY OCP Display
5-1 RA90/RA92 Extended Drive Status Bytes
5-2 RA90/RA92 Drive Internal Error Log Memory Layout
5-3 RA90/RA92 Drive Internal Error Log Header Format
5-4 RA90/RA92 Drive Internal Error Log Descriptor Format
5-5 Drive Internal Error Log
5-6 Power Supply Indicators
5-7 Step-by-Step Troubleshooting Flowchart
5-8 Power Supply Cover Removal
5-9 WRT/CMD Data Format
6-1 RA90/RA92 Disk Drive - Exploded View
6-2 FRU Removal Sequence
6-3 Front Access Panel Removal
6-4 Rear Access Panel Removal
6-5 OCP Removal
6-6 Blower Motor Assembly Removal Sequence
6-7 Bezel and Blower Motor Assembly Separation
6-8 ECM Removal
6-9 PCM Removal
6-10 HDA Removal
6-11 HDA Carrier Separation
6-12 Spindle Ground Brush Removal
6-13 Contact Extraction Tool
6-14 RA90/RA92 Brake Assembly Removal/Replacement
6-15 Disabling the Solenoid for In-Field Data Recovery
6-16 Power Supply Removal
6-17 Rear Flex Cable Assembly Removal
6-18 HDA Media Removal - Top View
6-19 HDA Media Removal - Bottom View
7-1 Microcode Update Cartridge
7-2 Microcode Update Port
A-1 LARS Example
C-1 Customer Equipment Maintenance Log for Storage Array Cabinets


Tables
1-1 Specifications for RA90 and RA92 Disk Drives
1-2 Additional Electrical Specifications by Model for RA90 and RA92 Disk Drives
1-3 RA90/RA92 Environmental Limits
2-1 OCP Error Codes
3-1 ECM Module Types - Compatibility Matrix
3-2 I/O-R/W Module - Hardware Revision Matrix
3-3 Servo Module - Hardware Revision Matrix
3-4 PCM Switch Pack Setup
3-5 PCM Module - Hardware Revision Matrix
3-6 RA90/RA92 HDA Hardware Compatibility Matrix
3-7 RA90/RA92 Microcode Compatibility With Drive FRUs
3-8 Power-Up: Normal Mode Operations
5-1 Reference Material for Troubleshooting
5-2 Two-Board Controller Diagnostics
5-3 Summary of Controller-Detected Communication Errors
5-4 RA90/RA92 Write Zones
5-5 VDS-Based Off-Line Diagnostics
5-6 MDM-Based Off-Line Diagnostics
5-7 XXDP-Based Off-Line Diagnostics
5-8 Serial Number
5-9 Power Supply Voltage Measurements
5-10 HDA Connector Pin Designations
5-11 HDA Resistance Measurements
6-1 Digital Part Numbers for Recommended Tools
7-1 Common Error Codes/Problems During Microcode Update
B-1 RA90/RA92 Hardware Error Recovery Circuits
B-2 RA90/RA92 Error Recovery Levels


About This Manual
The information contained in this manual is intended for Digital Customer Services personnel
responsible for RA90/RA92 disk drive maintenance and service calls.
This manual contains checkout, servicing, and troubleshooting information for RA90 and RA92 disk
drives. Procedures for unpacking, deskidding, and cabling 60-inch cabinets are also included.
Procedures for installing RA90 and RA92 add-on disk drives in 60-inch cabinets are not included in
this manual. Refer to product-specific documentation.
Related documentation is listed below, in alphabetical order.
Document Title                                            Order Number

DSA Controller Documentation Kit                          QP905-GZ
DSA Drive Documentation Kit                               QP907-GZ
DSA Error Log Manual                                      EK-DSAEL-MN
DSA Error Log Pocket Service Guide                        EK-DSAEL-PG
Getting Started With VAXsimPLUS                           AA-KN79A-TE
HSC Service Manual                                        EK-HSCMA-SV
RA90 Disk Drive Illustrated Parts Breakdown               EK-ORA90-IP
RA90 Disk Drive Technical Description Manual              EK-ORA90-TD
RA90 Field Maintenance Print Set                          MP-01424-01
RA90/6000 Cabinet Series Upgrade Installation Guide       EK-RA9CK-IN
RA90/H9643 Cabinet Installation Guide                     EK-RA90H-IN
RA90/RA92 Disk Drive Pocket Service Card                  EK-ORA90-PS
RA90/RA92 Disk Drive User Guide                           EK-ORA90-UG
SA600/SA800 Storage Array Family Configuration Guide      EK-SA600-CG
SA650/SA850 Storage Array Family Configuration Guide      EK-SA650-CG
VAXsimPLUS Field Service Manual                           AA-KN82A-RE
VAXsimPLUS User Guide                                     AA-KN80A-TE


Introduction
1.1 RA90 and RA92 Disk Drive Descriptions
The RA90 and RA92 disk drives are high-density, fixed-media disk drives that use nonremovable,
thin film media and thin film heads. The RA90/RA92 heads, disks, rotary actuator, and filtering
system are encased in a single unit called the Head Disk Assembly (HDA).
The RA90 disk drive has a formatted data storage capacity of 1.216 gigabytes and an unformatted
data storage capacity of 1.604 gigabytes in a 16-bit word format. The RA92 disk drive has a
formatted data storage capacity of 1.506 gigabytes and an unformatted data storage capacity of
1.987 gigabytes in a 16-bit word format.
Thirteen surfaces contain data and embedded servo information. The embedded servo information
is within the intersector gaps. The embedded servo information accomplishes fine positioning of
read/write heads over the data tracks. Figure 1-1 is an example of the sector format used for
RA90/RA92 disk drives.
The fourteenth surface is a dedicated servo surface that, when decoded by the drive electronics,
provides information on:
• Coarse radial position
• Track crossing (velocity)
• Rotational index and sector position
• Generation of clock synch pulse
• Inner and outer guardband detection

DIGITAL INTERNAL USE ONLY


Figure 1-1 Example of Sector Format (CXO-2166A)

Field                   Length
A burst                 11 bytes
Header preamble         17 bytes
Header sync             2 bytes
Header                  16 bytes
Data preamble           37 bytes
B burst                 11 bytes
Data sync               2 bytes
Data                    512 bytes
EDC                     2 bytes
Pad                     1 byte
ECC                     23 bytes
Write/read recovery     26 bytes


1.1.1 Physical and Logical Media Layout
The physical structure of the media is transparent to the user. Figures 1-2 and 1-3 represent the
layout of logical information for the RA90 and RA92 media.
Figure 1-2 RA90 Physical and Logical Media Layout (CXO-2167B)

Regions shown, from the outer guard band to the inner guard band:
  Outer guard band
  Host application area (cylinders 0 to 2648), 897 LBNs per cylinder
  Revector control tables (cylinders 2649 to 2650)
  Format area (ends at cylinder 2653), XBNs 0 to 2729
  Diagnostic area (ends at cylinder 2655), DBNs 0 to 1819
  R/W diagnostic area
  Inner guard band
Replacement sectors: 13 RBNs per cylinder
Visible to host applications: LBNs 0 to 2,376,152
Visible to host operating systems: RBNs 0 to 34462, LBNs 0 to 2,377,946
Visible to controller: all of the above

1.2 Maintenance Strategy
The RA90 and RA92 disk drives introduce a new approach to repairing peripheral equipment. In
most cases, RA90/RA92 disk drives afford easy access to field replaceable units (FRUs) without the
use of tools.
Additional drive maintenance features include the following:
• A microprocessor-controlled operator control panel (OCP) interface eliminating the need for
  external test equipment
• EEPROM where an internal error log is stored
• Twelve error recovery levels
• Extensive drive-resident diagnostics
• Drive microcode that can be updated by way of the microcode update port


Figure 1-3 RA92 Physical and Logical Media Layout (CXO-2976A)

Regions shown, from the outer guard band to the inner guard band:
  Outer guard band
  Host application area (cylinders 0 to 3098), 949 LBNs per cylinder
  Revector control tables (cylinders 3099 to 3100)
  Format area (ends at cylinder 3105), XBNs 0 to 4809
  Diagnostic area (ends at cylinder 3107), DBNs 0 to 1923
  R/W diagnostic area
  Inner guard band
Replacement sectors: 13 RBNs per cylinder
Visible to host applications: LBNs 0 to 2,940,950
Visible to host operating systems: RBNs 0 to 40312, LBNs 0 to 2,942,848
Visible to controller: all of the above
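
The block counts in Figures 1-2 and 1-3 can be cross-checked against the cylinder counts. The following Python sketch is an illustration only, not part of the drive firmware or any Digital utility; the dictionary keys and the function name are hypothetical, and the values are the ones read from the two figures (LBNs per cylinder, user cylinders, the two revector control table cylinders, and 13 RBNs per cylinder).

```python
# Values read from Figures 1-2 and 1-3; all names here are illustrative.
LAYOUT = {
    "RA90": {"lbns_per_cyl": 897, "user_cyls": 2649, "rct_cyls": 2, "rbns_per_cyl": 13},
    "RA92": {"lbns_per_cyl": 949, "user_cyls": 3099, "rct_cyls": 2, "rbns_per_cyl": 13},
}


def block_counts(model):
    """Return (host application LBNs, operating system LBNs, RBNs) for a model."""
    g = LAYOUT[model]
    # Host applications see only the user cylinders.
    host_app_lbns = g["lbns_per_cyl"] * g["user_cyls"]
    # Host operating systems also see the revector control table cylinders.
    os_cyls = g["user_cyls"] + g["rct_cyls"]
    os_lbns = g["lbns_per_cyl"] * os_cyls
    rbns = g["rbns_per_cyl"] * os_cyls
    return host_app_lbns, os_lbns, rbns


for model in LAYOUT:
    apps, os_lbns, rbns = block_counts(model)
    # The figures number blocks from 0, so the highest block is count - 1.
    print(model, "host LBNs 0 to", apps - 1, "| OS LBNs 0 to", os_lbns - 1,
          "| RBNs 0 to", rbns - 1)
```

The highest block numbers computed this way agree with the ranges printed in the two figures, which is a quick way to confirm a reading of a poorly reproduced copy.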

1.2.1 Service Delivery Strategy
Real-time subsystem (drive) faults detected by the drive are recorded in the RA90/RA92 drive
internal error log. Real-time faults detected in the disk subsystem are recorded in the supporting
system host error log. Controller-detected errors (such as ECC errors) are also logged to the host
error log and not the RA90/RA92 drive-resident error log.
Use utility programs to obtain a print-out of the drive internal error log and isolate faults, provided
the error was drive-detected. Additionally, you can run the RA90/RA92 drive-resident utility T41
to access the drive internal error log. This provides the drive LED error codes only. Use of other
utility programs provides additional error information.
Use drive-resident diagnostics to validate repairs to RA90/RA92 disk drives. For more information
on drive-resident diagnostics and utilities, refer to Chapter 4.

1.2.1.1 Six-Step Maintenance Strategy
This section describes the maintenance strategy for RA90 and RA92 disk drives. Become familiar
with it as it determines the course of action necessary to successfully service RA90/RA92 disk
drives.
Implement the following six-step maintenance strategy on each service call for a drive problem:
1. Examine and analyze VAXsimPLUS.
2. Examine and analyze system error logs.
3. Examine and analyze the drive internal error log.
4. Correlate failure symptoms to the probable failing FRU through service documentation.


5. Replace the FRU only after a prime FRU is identified from previous steps.
6. Verify device repair through drive-resident diagnostics. (Running host-level diagnostics to verify
repairs is unnecessary and penalizes the customer by tying up the system.)

Use host-based diagnostics only as a last resort, to obtain symptomatic failure information, and
only if system and drive error logs are unavailable.
Verify the drive is on line and operational through normal system-level commands that access the
unit under repair.

1.2.2 Tools Required for Maintenance
Tools required for maintaining RA90/RA92 disk drives are identified in the procedures where they
are needed and in Chapter 6.

1.2.3 Preventative Maintenance
Customer responsibilities for preventative maintenance (60-inch cabinets only) are described in
Appendix C.
Digital Customer Services responsibilities for cabinet and RA90/RA92 disk drive maintenance are
described in Appendix D.

1.3 RA90/RA92 Disk Drive Specifications
Table 1-1 lists important operating and nonoperating specifications for RA90 and RA92 disk
drives.
Table 1-1 Specifications for RA90 and RA92 Disk Drives

Characteristic                        RA90 Disk Drive               RA92 Disk Drive

Head Disk Assembly (HDA)
Storage capacity, formatted           1.216 gigabytes               1.506 gigabytes
Storage capacity, unformatted         1.604 gigabytes               1.987 gigabytes
HDA word format                       16-bit only                   Same as RA90
Bits/square inch                      40 megabits                   49.4 megabits
Tracks/inch                           1750                          2045
Disk recording method                 Rate 2/3 modulation code      Same as RA90
Number of disks                                                     Same as RA90
Disk surfaces                         14 (13 data and 1 servo)      Same as RA90
Number of heads                                                     Same as RA90
Heads per surface                                                   Same as RA90
Data tracks                           34,437                        40,287
Logical cylinders                     2656                          3101
User logical cylinders                2649                          3099
Number of sectors                     69 + 1 spare                  73 + 1 spare
Number of logical blocks              2,376,153                     2,942,849

Seek Times
One cylinder                          5.5 milliseconds              3.0 milliseconds
Average seek                          18.5 milliseconds             16.0 milliseconds
Maximum cylinder seek                 31.5 milliseconds             29.0 milliseconds

Latency
Rotation speed                        3600 r/min                    3405 r/min
Average latency                       8.33 milliseconds             8.81 milliseconds
Maximum latency                       16.67 milliseconds            17.62 milliseconds

Single Start/Stop Time
Start (maximum)                       40 seconds                    Same as RA90
Inhibit between stop and restart      40 seconds                    Same as RA90

Data Rates
Transfer rate                         2.77 megabytes/sec            Same as RA90

Physical Characteristics
Height                                26.56 cm (10.42 inches)       Same as RA90
Width                                 22.19 cm (8.74 inches)        Same as RA90
Depth                                 68.47 cm (26.96 inches)       Same as RA90
Weight                                31.8 kg (70 pounds)           Same as RA90

Inrush Current
120 Vac                               60 amperes peak @ 132 Vac     Same as RA90
220-240 Vac                           70 amperes peak @ 264 Vac     Same as RA90

120 Vac                               4.6 amps                      Same as RA90
220-240 Vac                           2.4 amps                      Same as RA90

Power factor:
120 Vac                               0.7                           Same as RA90
220-240 Vac                           0.58                          Same as RA90

Line cord length (from the cabinet)   2.74 meters (9 feet)          Same as RA90
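
The latency and formatted-capacity rows of Table 1-1 follow from the rotation speed and the logical block count. The sketch below is illustrative only; the function names are hypothetical, and it assumes the 512-byte data field shown in Figure 1-1 and gigabytes of 10^9 bytes.

```python
BYTES_PER_BLOCK = 512  # data field size shown in Figure 1-1


def latency_ms(rpm):
    """Return (average, maximum) rotational latency in milliseconds."""
    revolution_ms = 60_000.0 / rpm  # time for one full revolution
    return revolution_ms / 2.0, revolution_ms


def formatted_capacity_gb(logical_blocks):
    """Formatted capacity in gigabytes, taking 1 gigabyte as 10**9 bytes."""
    return logical_blocks * BYTES_PER_BLOCK / 1e9


# Rotation speeds and logical block counts as listed in Table 1-1.
for model, rpm, blocks in (("RA90", 3600, 2_376_153), ("RA92", 3405, 2_942_849)):
    average, maximum = latency_ms(rpm)
    print(f"{model}: average {average:.2f} ms, maximum {maximum:.2f} ms, "
          f"{formatted_capacity_gb(blocks):.4f} GB formatted")
```

Average latency is half a revolution and maximum latency is one full revolution. The capacities in the table correspond to these products truncated to three decimal places.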


Table 1-2 contains additional electrical specifications by model for RA90 and RA92 disk drives.
NOTE
The RA90 and RA92 disk drives are not line-frequency dependent.
Table 1-2 Additional Electrical Specifications by Model for RA90 and RA92 Disk Drives

                   Nominal      Input Current (Amps)¹      Power          BTUs/Hour
Model              Voltage      Start-Up    PH1/Neutral    Dissipation    [kJ/Hour]²

RA90-xx/RA92-xx    120 volts    5.0         3.4            281 watts      960
RA90-xx/RA92-xx    240 volts    2.85        1.45           271 watts      [976]

¹ Currents are for nominal voltages of 120 Vac phase to neutral or for 240 Vac phase to neutral. For 101 Vac and 220
Vac nominal voltages, the drives will have proportionately higher phase currents by a ratio of 120/101 or 240/220 to the
currents specified in this table.
² Bracketed figures indicate kilojoules per hour.
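
The heat figures and the current scaling in the footnotes can be reproduced with simple arithmetic. This is a minimal sketch with hypothetical function names; it assumes the standard conversions of 1 watt = 3.412142 BTU/hour and 1 watt = 3.6 kilojoules/hour, and it applies the footnote's ratio exactly as stated.

```python
BTU_PER_HOUR_PER_WATT = 3.412142  # standard conversion factor
KJ_PER_HOUR_PER_WATT = 3.6        # 1 W = 1 J/s = 3600 J/h


def watts_to_btu_per_hour(watts):
    return watts * BTU_PER_HOUR_PER_WATT


def watts_to_kj_per_hour(watts):
    return watts * KJ_PER_HOUR_PER_WATT


def scaled_current(table_amps, table_volts, actual_volts):
    """Scale a table current to a lower nominal voltage, as in footnote 1.

    The footnote says currents rise by the ratio table_volts / actual_volts
    (120/101 or 240/220).
    """
    return table_amps * table_volts / actual_volts


print(round(watts_to_btu_per_hour(281)))        # table lists 960 BTUs/hour
print(round(watts_to_kj_per_hour(271)))         # table lists [976] kJ/hour
print(round(scaled_current(3.4, 120, 101), 2))  # a 3.4 A entry at 101 Vac
```

The table values appear to be rounded, so small differences from the computed figures are expected.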

Table 1-3 shows the maximum environmental limits and the recommended environmental
operating ranges to optimize equipment performance and reliability.
Table 1-3 RA90/RA92 Environmental Limits

Characteristic                         RA90/RA92 Disk Drive

Maximum Environmental Limits
Temperature (Required)
  Operating                            10°C to 40°C (50°F to 104°F) with a temperature gradient of
                                       20°C/hour (36°F/hour)
  Nonoperating                         -40°C to +60°C (-40°F to +140°F)
Relative humidity
  Operating                            10% to 90% (noncondensing) with a minimum wet bulb
                                       temperature of 28°C (82°F) and a minimum dew point of 2°C
                                       (36°F)
  Nonoperating                         10% to 90% with no condensation

Recommended Environmental Operating Ranges
Temperature                            18°C to 24°C (64.4°F to 75.2°F) with an average rate of change
                                       of 3°C/hour maximum and a step change of 3°C or less
Relative humidity                      40% to 60% (noncondensing) with a step change of 10% or less
                                       (noncondensing)
Air quality (maximum particle count)   Not to exceed 500,000 particles per cubic foot of air at a size of
                                       0.5 micron or larger
Air volume (at inlet)                  50 cubic feet per minute (.026 cubic meters per second)
Altitude
  Operating                            Sea level to 2400 meters (8000 feet); maximum allowable
                                       operating temperatures are reduced by a factor of 1.8°C/1000
                                       meters (1°F/1000 feet) for operation above sea level
  Nonoperating                         300 meters (1000 feet) below sea level to 7500 meters (16,000
                                       feet) above sea level (actual or effective by means of cabin
                                       pressurization)
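
The altitude derating in Table 1-3 can be worked as a short calculation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and it assumes the 1.8°C per 1000 meters reduction is applied linearly to the 40°C upper operating limit.

```python
SEA_LEVEL_MAX_C = 40.0        # upper operating limit from Table 1-3
DERATE_C_PER_1000_M = 1.8     # reduction per 1000 meters from Table 1-3
MAX_OPERATING_ALTITUDE_M = 2400.0


def max_operating_temp_c(altitude_m):
    """Maximum allowable operating temperature (°C) at a given altitude."""
    if not 0.0 <= altitude_m <= MAX_OPERATING_ALTITUDE_M:
        raise ValueError("altitude is outside the operating range in Table 1-3")
    return SEA_LEVEL_MAX_C - DERATE_C_PER_1000_M * altitude_m / 1000.0


def c_to_f(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


for altitude in (0, 1000, 2400):
    limit = max_operating_temp_c(altitude)
    print(f"{altitude} m: {limit:.2f} C ({c_to_f(limit):.1f} F)")
```

Under this reading, a site at the 2400 meter ceiling loses a little over 4°C of headroom, which is one more reason to stay inside the recommended 18°C to 24°C range.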

1.4 Electrostatic Protection
Electrostatic discharge (ESD) is the result of electrostatic buildup and its subsequent release. The
surface storage of an electrostatic charge from a person or object can damage hardware components
and may result in premature device or option failure.
The basic concept of static protection for electronic components is the prevention of static buildup,
where possible, and the safe release of existing electrostatic charge buildup. If the charged object is
a conductor, such as an object or person, complete discharge can be achieved through grounding the
person or object.
Use the following guidelines when handling static-sensitive components and modules:
CAUTION

Always use grounding straps to avoid product damage when handling static-sensitive
components and modules.
1. Read all instructions and installation procedures included with static control materials and
kits.
2. Use static-protective containers to transfer modules and components (including bags and tote
boxes).

3. Wear a properly grounded ESD wrist strap when handling components, modules, or other
static-sensitive devices. Figure 1-4 shows the ESD wrist strap in use.
When using an ESD wrist strap:
• Ensure the wrist strap fits snugly for proper conductivity.
• Attach the alligator clip securely to a clean, unpainted, grounded metal surface such as the
  drive chassis or cabinet frame.
• Do not overextend the grounding cord.

Figure 1-4 ESD Wrist Strap (CXO-2168C)
Callouts: chassis stabilizer bracket, grounding ESD wrist strap.

Installation
2.1 Introduction
The SA600 and SA650 cabinets are the most commonly used cabinets for RA90 and RA92 disk
drives. Procedures for unpacking, deskidding, and cabling 60-inch† cabinets are contained in this
chapter. This chapter also covers site preparation and planning considerations, drive acceptance
testing procedures, and power-up diagnostics.

Information on unpacking and installing add-on RA90 and RA92 disk drives in 60-inch cabinets can
be found in product-specific documentation and is not covered here.

2.2 Site Preparation and Planning
Site preparation and planning are necessary before installing an RA90 or RA92 disk drive
subsystem. Chapter 1 contains a full range of recommended environmental specifications. In
addition, consider the following items before attempting installation.

2.2.1 Power and Safety Precautions
The RA90/RA92 disk drives do not present any unusual fire or safety hazards. It is recommended,
however, that you check ac power wiring for the computer system to determine adequate capacity
for expansion.

2.2.2 Three-Phase Power Requirements
The RA90 and RA92 disk drives use a single-phase power supply; however, the 881 power controller
uses three phases. It is very important that the correct phase requirements for this product be met.
Refer to Chapter 1 for power specifications.

WARNING
Hazardous voltages are present in this equipment. Installation and service must be
performed by trained service personnel. Bodily injury or equipment damage may result
from incorrect servicing.
To prevent damage to equipment and personnel, ensure power sources meet the specifications
required for this equipment.

† The SA600 and the SA650 are both 60-inch cabinets.

Figure 2-1 Electrical Plug Configurations (CXO-1872D)

Power cords going to power controller:
  120 V, 60 Hz power cord, DEC No. A-PS-1700083-23 (plug, power controller end)
  240 V, 50 Hz power cord, DEC No. A-PS-1700083-24 (plug, power controller end)
  120/240 V, 47-63 Hz, 10 A/6 A power cord, DEC No. A-PS-1700442-18 or
  A-PS-1700442-19 (plug, drive end)

Plugs going to wall outlet (from controller):
  40-inch cabinet
    120 V, 60 Hz, 24 A, 1-phase: NEMA No. L5-30P, DEC No. 12-11193 (874-D)
    220/240 V, 50-60 Hz, 16 A, 1-phase: IEC 309 320-P6W, DEC No. 12-14379-03 (874-F)
  60-inch cabinet
    120/208 Vac, 60 Hz, 30 A, 3-phase wye, 5-wire: NEMA No. L21-30P;
    used with 881-A and 881-C power controllers
    220-240/380-415 Vac, 50 Hz, 20 A or 16 A, 3-phase wye, 5-wire, 4-pole: IEC 309;
    used with 881-B power controller


2.2.3 AC Power Wiring
The wiring used by Digital Equipment Corporation conforms to UL, CSA, and ISE standards.
Figure 2-1 shows the ac plug configurations for RA90 and RA92 disk drives and 881 and 874 power
controllers.

2.2.4 Thermal Stabilization
Thermal stabilization prevents temperature differences between the equipment and its environment
from damaging disk drive components.
Prior to installation, a 60-inch cabinet subsystem and the RA90/RA92 add-on drive must be stored
at a temperature of 60°F (16°C) or higher for a minimum of 24 hours. These units may be stored
either in the computer room or in another storage room under controlled temperature conditions. If
stored in another storage room, each unit must sit for an additional hour in the computer room in
which it is to be installed.

CAUTION
The thermal stabilization procedure is mandatory. Do not open the moisture barrier bag
until after the thermal stabilization period. Failure to thermally stabilize the equipment
may cause premature equipment failure.
After the thermal stabilization criteria have been met, carefully cut the moisture barrier bag and
proceed with the installation.
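
The stabilization criteria above reduce to a simple check that could be kept with site installation notes. The following sketch is hypothetical and is not a Digital tool; the function and parameter names are invented for illustration, and the thresholds are the 60°F, 24-hour, and one-hour figures stated in this section.

```python
MIN_STORAGE_TEMP_F = 60.0    # minimum storage temperature from this section
MIN_STORAGE_HOURS = 24.0     # minimum storage time from this section
MIN_ROOM_HOURS = 1.0         # extra time if stored outside the computer room


def thermally_stabilized(storage_temp_f, storage_hours,
                         stored_in_computer_room, hours_in_computer_room=0.0):
    """Return True if the moisture barrier bag may be opened."""
    if storage_temp_f < MIN_STORAGE_TEMP_F:
        return False
    if storage_hours < MIN_STORAGE_HOURS:
        return False
    # Units stored elsewhere must also sit in the computer room for an hour.
    if not stored_in_computer_room and hours_in_computer_room < MIN_ROOM_HOURS:
        return False
    return True


print(thermally_stabilized(68.0, 24.0, stored_in_computer_room=True))
print(thermally_stabilized(68.0, 30.0, stored_in_computer_room=False,
                           hours_in_computer_room=0.5))
```

The second call fails the check because the unit has not yet spent the additional hour in the computer room.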

2.2.5 Floor Loading
Consider the placement of this equipment, especially if a fully loaded 60-inch configuration is used.
A fully loaded 60-inch cabinet weighs approximately 390 kilograms (860 pounds). Each RA90 or
RA92 disk drive weighs approximately 31.8 kilograms (70 pounds).

2.2.6 Operating Temperature and Humidity
The required relative humidity range is between 10 percent and 90 percent with a minimum wet
bulb temperature of 28°C (82°F) and a minimum dew point of 2°C (36°F) (non-condensing) with a
step change of 10 percent or less.
The RA90 and RA92 disk drives can be operated within temperatures of 10°C to 40°C (50°F to
104°F). However, it is highly recommended that RA90 and RA92 disk drives be operated in a
temperature range below 25°C (77°F) to increase reliability and extend product life.

2.3 Unpacking the Cabinet
The 60-inch cabinet configuration is packed in a cardboard carton attached to a wooden shipping
pallet. Refer to Figure 2-2 and use the following procedure to unpack the cabinet:
1. Inspect the shipping carton for any sign of external damage. Report any damage to the local
carrier and to the Digital Customer Services or sales office.
2. Remove the two cardboard U-sections but leave the sealed moisture barrier with desiccant in
place during thermal stabilization.

CAUTION
This equipment must be thermally stabilized in the site environment for at least 24
hours before operation.

Figure 2-2 Unpacking the 60-Inch Cabinet (CXO-2717A_S)
Callouts: machine bolts (2 each side), shipping straps, cardboard U-section, shipping bolt.
*Do not open until the thermal stabilization procedure is complete.

2.3.1 Deskidding the Cabinet
Three people are required to deskid the 60-inch cabinet. See Figure 2-3.

WARNING
Serious injury could result if the cabinet is improperly handled.

Figure 2-3 Cabinet Deskidding

1. Remove the two unloading ramps from their carton located under the carton top cover.
2. Inspect the ramps, ramp side rails, and metal hardware for defects described in the following
list:
• Cracks more than 25 percent of the ramp depth, either across or lengthwise on the ramp.
• Knots or knotholes going through the thickness of the ramp and greater than 50 percent of
  the ramp width.
• Loose, missing, or broken ramp side rails.
• Loose, missing, or bent metal hardware.

If any of these conditions exist, do not use that ramp. Investigate alternate means of removing
the cabinet and/or order a new ramp. The part number for the left ramp is ~768~1; the
part number for the right ramp is ~768~2.


3. Remove shipping bolts from the shipping brackets on each of the four levelers. See inset in
Figure 2-2.
4. Remove shipping brackets from the four cabinet levelers.
5. Fasten unloading ramps onto the pallet by fitting the grooved end of each ramp over the metal
mating strip on the pallet. See Figure 2-4.
Figure 2-4 Ramp Installation on Shipping Pallet
Callouts: shipping pallet, right unloading ramp, steel dowel, right ramp attachment point.

6. Screw the cabinet levelers (Figure 2-5) all the way up until the cabinet rests on its rollers on
the pallet.
7. Carefully roll the cabinet down the ramps (three people are required).
8. Move the cabinet into its final position.
9. Turn each leveler hex nut clockwise until the leveler foot contacts the floor (no weight on the
casters) and the cabinet is level.


2.4 Installing SDI Cables and Power Cords
Generally, SDI cables and power cords are installed in the 60-inch cabinet prior to shipping. Use
this section as a reference should you need to remove or reinstall the power cords or SDI cables.

Figure 2-5 Leveler Adjustment
Callouts: locknut, leveler hex nut, leveler foot.

2.4.1 Removing the Front and Rear Access Panels
Use the following procedure to remove front and rear cabinet access panels.
2.4.1.1 Front Access Panel Removal
Refer to Figure 2-6 while performing this procedure:

1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top
of the panel. Turn the fasteners counterclockwise.

2. Grasp the panel by its edges, tilt it toward you, and lift it up about 2 inches. Remove the panel
and store it in a safe place.
To reinstall the front panel, lift it into place and lower it straight down until the tabs on the panel's
lower edge engage the slots in the cabinet support bracket. Hold the panel flush with the cabinet
and use a hex wrench to lock the fasteners.

Figure 2-6 Front Panel Removal (CXO-2130C)
Callouts: quarter-turn fastener, front panel, support bracket.

2.4.1.2 Removing the Rear Access Panel
Refer to Figure 2-7 while performing this procedure:
1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top
of the panel. Turn the fasteners counterclockwise.
2. Tilt the panel toward you and lift it up to disengage the pins at the bottom.
3. Lift the panel clear of the enclosure and store it in a safe place.
When replacing the rear panel, lift it into place and fit the pins into the holes at the top of the I/O
bulkhead. Push the top of the panel into place and turn the quarter-turn fasteners clockwise.

Figure 2-7 Rear Access Panel Removal (CXO-2131D)
Callouts: quarter-turn fastener, cabinet rear, bustle, hex wrench, rear access panel,
I/O bulkhead, pins.

2.4.2 SDI Cable Connections and Routing
Both external and internal cables are connected to the I/O bulkhead located at the base of the drive
cabinet. See Figure 2-8. Refer to product-specific documentation for more information.

Figure 2-8 SDI Cable Connections and Routing - SA600 Example (CXO-2132B)
Callouts: cable trough, SDI cables, I/O bulkhead.

2.4.3 Power Cord Connections and Routing
Figure 2-9 shows drive power cord connections and the recommended power cord routing for an
SA600 storage array cabinet. Refer to product-specific documentation for power cord connections
and routing for other subsystems.

Figure 2-9 Power Cord Connections and Routing - SA600 Example (CXO-2133B)
Callouts: disk drive power cords, power controller ac power cord.

2.5 Locating the RA90/RA92 Disk Drive Power Supply
To access the RA90 or RA92 disk drive power supply, remove the cabinet rear access panel
(Figure 2-7). Figure 2-10 shows the location of the RA90/RA92 disk drive power supply, circuit
breaker, and the Power OK LED.
Figure 2-10 RA90/RA92 Power Supply Controls and Indicators (CXO-2134B)
Callouts: drive rear, circuit breaker, green LED (Power OK).

2.5.1 Plugging in the Power Cord
The drive power cords in a fully-configured cabinet are already plugged into the power controller.
Only the ac power cord from the cabinet power controller needs to be plugged into an external
power source.
NOTE

Do not apply power to the power controller until proper voltage has been selected.
(Refer to Section 2.7.1.)


2.6 International Operator Control Panel Labeling
Each drive unit or cabinet configuration is shipped with a set of international labels for the operator
control panel (OCP). The labels come in a packet or on a single sheet. Select and apply the set of
labels applicable to the country in which the equipment is being installed.

2.7 RA90/RA92 Disk Drive Acceptance Testing Procedures
The following sections cover RA90/RA92 disk drive acceptance testing procedures. Follow each
procedure to completion before starting the next.
Refer to Figure 2-11 while performing acceptance testing on RA90 and RA92 disk drives. A more
detailed description of the RA90/RA92 OCP and its functions can be found in Chapter 3.
Figure 2-11 RA90/RA92 Operator Control Panel (CXO-2962A)
Callouts: four-character alphanumeric display (unit number), Test switch, Fault switch,
Run switch, state LED indicators.
NOTE: RA90 part no. 74-35109-02; RA92 part no. 74-39769-01.

2.7.1 Voltage Selection
Before applying power to RA90 or RA92 disk drives, ensure the proper operating voltage has
been selected for your area of operation. The voltage selector is a slide switch capable of selecting
120 volts or 240 volts. (The frequency 60 Hz or 50 Hz is universal.) To select the proper voltage,
perform the following steps:
1. Remove the cabinet rear access panel (refer to Section 2.4.1.2).
2. Verify the ac circuit breaker on the power controller is off.

3. Verify the circuit breaker on each disk drive is off (0).
4. Locate the voltage selector switch (Figure 2-12).
5. Using a non-conductive pointed object, slide the voltage selector switch into the position
applicable to your site.

Figure 2-12 Location of Voltage Selector Switch (CXO-2135D)
Callouts: Port A, Port B, voltage selector switch (120V or 240V), power supply.
NOTE: Voltage markings on some power supplies read 115/230V.

2.7.2 Applying Power to the Drive
Use the following procedure to apply power to RA90/RA92 disk drives:
1. Verify the drive voltage selector switch has been properly set (see Section 2.7.1).

2. Verify the ac circuit breaker on the power controller is off. Also verify the circuit breaker on
each disk drive is off. See Figures 2-10 and 2-13 for circuit breaker locations.

Figure 2-13 Location of Power Controller Controls - 881 Example (CXO-2136A)
Callouts: power controller, power control bus connectors, grommeted cord opening, fuse,
undelayed, delayed (0.5 sec), serial/logo label.

3. Verify the Local/Remote switch on the 881 power controller is in the Local position.
4. Verify the drive power cord is plugged into the power controller.
5. Verify the external power source is correct.
6. Plug the ac power cord from the power controller into an external power receptacle.
7. Switch the ac circuit breaker on the power controller to the on position.
8. Switch the ac circuit breaker on the RA90 or RA92 disk drive to the on position.


2.8 Power-Up Resident Diagnostics
A sequence of drive-resident diagnostics run at power-up. The sequence consists of hardcore tests
with basic processor tests. Successful completion of the hardcore tests is indicated by the following
OCP displays:
1. Blank (1 second)
2. WAIT (16 seconds)
3. [0000] (If previously programmed, the drive unit number is displayed; otherwise, zeros are
displayed.)

2.8.1 OCP Lamp Testing
Before continuing with acceptance testing, perform an OCP lamp test to ensure the LED state
indicators and alphanumeric display are working properly. Perform the following procedure before
selecting any other OCP switches (refer to Figure 2-11):
1. Select the Test switch. The Test LED indicator lights.

2. Select the Fault switch. All lamps light momentarily.
3. Deselect the Test switch.
All lamps should momentarily light. If not, ensure the OCP is seated properly and power is applied
to the drive. Repeat the test.
Replace the OCP if any lamps fail (refer to Section 6.7).

2.8.2 Test Selection from the OCP
It is necessary to select and run resident diagnostics from the OCP to complete acceptance testing.
Use the following procedure to select and run diagnostics from the OCP. Figure 2-14 is a flowchart
of this procedure.
1. Power up the drive (if not done previously).

2. Select the Test switch (test defaults to zero; no other operator action is required).
3. Select the Write Protect switch.
4. Select the diagnostic to run by using Port A and Port B switches. See the test selection
flowchart (Figure 2-14).
5. Start the test by selecting the Write Protect switch.
6. Stop the test by selecting either the Port A or Port B switch.
7. Restart the test by selecting the Write Protect switch again.
8. Select the Test switch to exit the test mode.
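
The switch presses needed to start a given test can be sketched as a short sequence generator. This is an illustration of the procedure above and of the Figure 2-14 flowchart, not drive microcode; the function name is hypothetical, and it assumes that both display digits start at 0, that the first Port A press selects the most significant digit and the second selects the least significant digit, and that each Port B press increments the flashing digit by one.

```python
def switch_sequence(test_number):
    """Return the OCP switch presses that select and start a two-digit test."""
    if not 0 <= test_number <= 99:
        raise ValueError("test number must be between 0 and 99")
    tens, ones = divmod(test_number, 10)
    presses = ["Test", "Write Protect"]   # enter test mode; display shows T 00
    presses.append("Port A")              # most significant digit flashes
    presses.extend(["Port B"] * tens)     # increment the flashing digit
    presses.append("Port A")              # least significant digit flashes
    presses.extend(["Port B"] * ones)
    presses.append("Write Protect")       # start the selected test
    return presses


print(switch_sequence(60))
```

Selecting T60 under these assumptions takes six Port B presses on the most significant digit and none on the least significant digit.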

2.8.3 RA90/RA92 Idle Loop Acceptance Testing
After the hardcore diagnostics have successfully run, the drive automatically enters an idle loop
diagnostic test sequence. Do not select any front panel switches. Allow the drive to remain in the
idle loop test for 5 minutes.


Figure 2-14 Test Selection Flowchart (CXO-2139B)
Flowchart boxes:
  Select port switches to deselect port(s).
  Select Write Protect switch.
  Select Port A switch (MSD begins flashing).
  Increment numbers 0-9 by selecting Port B switch.
  Select Port A switch (LSD begins flashing).
  Increment numbers 0-9 by selecting Port B switch.
  Select Write Protect switch (test starts).
  Display indicates start, then complete.
  Diagnostic can be stopped by selecting Port A or Port B, restarted by selecting Write
  Protect, or exited by selecting Test switch.
* Indicates flashing readout.


If an error occurs during power-up or during idle loop testing, the drive attempts to display an error
code. Table 2-1 lists error codes and required operator actions. Error codes not found in Table 2-1
indicate a problem requiring additional troubleshooting. Refer to Chapter 5 for troubleshooting
strategy.
Table 2-1  OCP Error Codes

Error   Description                      Action
        Drive write protected            Disable write protection with the OCP Write Protect
                                         switch or turn off software write protection.
        Drive over-temperature           Spin down and remove power from the drive. Ensure the
        condition                        cabinet air vent grill is clean and room temperature is
                                         within recommended limits. Call Digital Customer
                                         Services if neither a dirty air vent grill nor room
                                         temperature caused the over-temperature condition.
        Power supply over-temperature
        condition
3A,     Write protect errors             Disable write protection with the OCP Write Protect
                                         switch or turn off software write protection.

2.8.4 Testing Spun-Down Drive
To invoke resident diagnostics while the drive is still spun down:
1. Select Test switch (Test indicator lights).
2. Select the Write Protect switch: [T 00] is displayed.
3. Input [T 60] into display. This is a loop-on-test utility.
4. Start T60 by selecting the Write Protect switch a second time. The following occurs:
[S.60]
[LOT]
[C.60]
[T 00] (LSD flashing)
5. Input [T 00] into display.
6. Start T00 by selecting the Write Protect switch a second time.
The drive is now running a sequence of resident diagnostics. A number of displays are seen during
the execution of the diagnostics. These displays are normal. Examples of these displays are shown
in Figure 2-15.


Figure 2-15  OCP Displays During Testing (CXO-2137A)
(Example displays: a T display for the selected test, an S display when a test starts, and a C display
when a test completes. An asterisk indicates a flashing display.)

Allow drive tests to run for 5 minutes before continuing acceptance testing. To halt testing, select
the Test switch (Test LED extinguishes).

2.8.5 Testing Spun-Up Drive
To spin up the RA90 or RA92 disk drive, select the Run switch. The Run indicator lights and an
[R.-] appears in the display. Allow the drive to come to the ready state as indicated by the front
panel Ready indicator.
If either of the ports (A/B) is selected when the drive reaches the ready state, deselect the port
switches, then proceed as follows:
1. Select the Test switch. Test indicator lights.

2. Select the Write Protect switch. [T 00] is displayed.
3. Input [T 60] into display. This is a loop-on-test utility.
4. Start T60 by selecting the Write Protect switch a second time. [LOT] is displayed in the OCP.
5. Select the Write Protect switch.
6. Input [T 00] into the display.
7. Start T00 by selecting the Write Protect switch a second time.
The above steps invoke a sequence of resident-diagnostic tests. The tests check drive functions in
the following areas:
Processor
Servo bus
Positioner
Head select
Read/write circuitry
Fault detection circuitry


Allow the tests to run for 30 minutes to complete acceptance testing, then select the Test switch to
exit the test mode. The Test LED extinguishes, an [R..••] appears in the display and the Ready and
Run indicators light. Additionally, if either port switch is selected, it will be displayed after the unit
address: [R AB].
If an error occurs during power-up or during the idle loop diagnostics, the drive attempts to display
an error code. Table 2-1 lists error codes and required operator actions.
If no problems are encountered, place the drive on line.
NOTE

In an HSC cluster environment, you can duplicate system usage by running ILEXER for a
few minutes; in a non-HSC environment, a successful operating system disk initialization
and mount operation are sufficient for verifying subsystem operation.

2.9 Placing the Drive On Line
The following procedure assumes drive acceptance testing and cabling procedures have been
completed. If not, refer to the appropriate sections of this manual for details.

2.9.1 Programming the Drive Unit Address
The unit address can be set once power has been applied to the drive. The unit address is
programmable in the range of 0 to 4094. †
Enter the test mode to set the unit address. In the test mode, Port A and Port B switches have the
added function of selecting both the unit address numbers and test numbers.
After applying power, follow this procedure to set the drive unit address. Figure 2-16 is a flowchart
of this procedure.
1. Select the Test switch. The Test LED lights and zeros are displayed. (Something other than
zeros may be displayed if the unit address has been previously programmed.)
2. Select the Port A switch for the ones position. Position zero will blink.
3. Select the Port B switch. Position zero increments by one, up to 9, each time Port B is
selected.
4. Select the Port A switch for the tens position. Position one will blink.
5. Select the Port B switch. Position one increments by one, up to 9, each time Port B is
selected.
6. Select the Port A switch for the hundreds position. Position two will blink.
7. Select the Port B switch. Position two increments by one, up to 9, each time Port B is
selected.
8. Select the Port A switch for the thousands position. Position three will blink.
9. Select the Port B switch. Position three increments by one, up to 4, each time Port B is
selected.
10. Select the Test switch to exit the unit selection function.

† The KDA50/UDA50/KDB50 support drive logical unit addresses only up to 255.

Figure 2-16  Unit Selection Flowchart (CXO-2138A)
(Flowchart summary: from normal mode, deselect Port A and Port B to set the unit address, then select
the Test switch; the Test LED lights. For each of the four digit positions in turn, select the Port A
switch and then increment the digit by selecting the Port B switch: 0-9 for the ones, tens, and
hundreds positions, and 0-4 for the thousands position. Select the Test switch. A scrolling display
follows; to stop the display, select the Run switch. Select the Port B switch to toggle (Y) or (N).
To save the new address, select [Y] and exit by selecting the Test switch. To save the old address,
select [N] and exit by selecting the Test switch. A dot in the display indicates a flashing readout.)


Before exiting, you will be prompted to verify that you want the unit number changed. The OCP
displays the following prompt:
CHG UNT , {? [N]}

1. If you do not want to change the unit address, select the Test switch a second time.
2. To change the unit address, proceed as follows:
   • Toggle the Port B switch. CHG UNT , {? [Y]} displays.
   • Select the Test switch. The old unit address will be overwritten, and the new unit address
     will be displayed in the OCP.

NOTE

The unit address number is written to EEPROM and is not lost if the drive loses power.
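
The digit-by-digit entry in Section 2.9.1 can be worked out ahead of time. The following Python sketch is an illustration only; the function name is hypothetical, and it assumes the display starts at [0000] and that each Port B selection advances the flashing digit by one. If a previously programmed address is displayed, the counts will differ.

```python
def port_b_presses(unit_address):
    """Return the Port B selections needed for the ones, tens, hundreds,
    and thousands positions, in the order the procedure visits them.

    Assumes every digit starts at zero (an assumption, see the text above).
    """
    if not 0 <= unit_address <= 4094:
        raise ValueError("unit address must be in the range 0 to 4094")
    presses = []
    remaining = unit_address
    for _ in range(4):
        # Peel off the lowest decimal digit; ones position comes first.
        remaining, digit = divmod(remaining, 10)
        presses.append(digit)
    return presses
```

For unit address 255, the sketch gives five selections for the ones position, five for the tens, two for the hundreds, and none for the thousands.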

2.10 Installing RA90/RA92 Add-On Disk Drives in 60-Inch Cabinets
Information for unpacking and installing RA90/RA92 add-on disk drives into 60-inch cabinets can
be found in product-specific documentation. Refer to the preface, About This Manual, for a list of
related documentation.


Operating Instructions
3.1 Introduction
This chapter describes each of the RA90 and RA92 disk drive components. Module compatibility
tables are provided to explain the relationships between RA90 and RA92 disk drive hardware.
Drive block diagrams are included to illustrate component relationships.
This chapter also explains various operating modes of RA90/RA92 disk drives, and covers drive unit
address programming, test functions, and fault functions.

3.2 RA90/RA92 Disk Drive Components
The main components of RA90 and RA92 disk drives are:
• The electronic control module (ECM)
• The preamp control module (PCM)
• The blower motor assembly
• The head disk assembly (HDA)
• The drive power supply
• The operator control panel (OCP)

RA90/RA92 disk drives use three microprocessors to accomplish drive functions. The processors are
the master (or I/O), the servo (or DSP), and the operator control panel (OCP) processor.
Figure 3-1 shows a simplified block diagram of RA90/RA92 disk drives.

Figure 3-1 (simplified block diagram of RA90/RA92 disk drives; CXO-2185A)
(Blocks shown: I/O-R/W module, servo module, OCP, power supply, blower, HDA with spindle and
actuator, and the SDI interface.)

3.2.1 Electronic Control Module (ECM)
The ECM field replaceable unit (FRU) consists of two modules mounted back-to-back on a slide
carrier. One module contains the input/output-read/write (I/O-R/W) circuitry and is referred to as
the I/O-R/W module. The second module contains the servo circuitry and is referred to as the servo
module.
Each module has a set of four physical jumpers that are hard-wired at the factory. These jumpers
are ECO-controlled and are used to mark the differences in functionality between the two hardware
versions of the ECM modules. (These jumpers allow the microcode to display the correct hardware
revision codes for the I/O-R/W and servo modules when running drive utility T45.)
The two 70-class ECM module set versions and related 54-class component part numbers are listed
in Table 3-1.
Table 3-1  ECM Module Types - Compatibility Matrix

ECM P/N         I/O-R/W P/N    Servo P/N      Comments
70-22942-01¹    54-17771-01    54-17769-01    RA90 with HDA 70-22951-01 or 70-27268-01
70-22942-02¹    54-17771-02    54-17769-02    RA92-compatible

¹ The ECM FRU is available as a 70-class part. The individual 54-class parts are not field/customer
available due to repair and error log history strategies implemented by Digital.

The Digital circuit schematic (CS) revision alphanumeric marking on the ECM and its 54-class
component modules does not reflect the microcode loaded into the non-volatile EEPROM as
firmware code. This code is loaded in the field through the use of a microcode update cartridge.
The microcode can then configure itself (enabled by the physical jumpers on the ECM modules) to
assure the correct functionality of that particular ECM module.
The functions of the I/O-R/W and servo modules are described in the sections that follow.
3.2.1.1 I/O-R/W Module
Functionally, the I/O-R/W module can be divided into three primary areas: SDI interface, control,
and read/write. Figure 3-2 provides a block diagram of the I/O-R/W module.

The control circuitry on this module contains the following:
• MC6801 microprocessor (Master processor)
• Memory (ROM and RAM)
• Output control registers
• Input status registers

Figure 3-2 (block diagram of the I/O-R/W module; CXO-2186A)
(Blocks shown: SDI Port A, SDI Port B, SDI, master processor, R/W ENDEC, and I/O, with signal paths
to and from the servo module.)

The master microcode software controls drive functions through the control and status registers.
Functional and diagnostic software for the master is stored in ROM, RAM, EEPROM and PROM
memories. The master is the logic processor, and it controls and performs the following tasks:
• OCP communications
• Drive fault detection (including error recovery)
• Servo processor communications
• Functional servo microcode loading
• Standard disk interconnect (SDI) processing

The master processor controls the servo processor through the use of software. The servo
processor's response to master processor commands is also accomplished through the use of
software.
Upon power-up, the master processor (after self-testing the logic on the I/O-R/W) has the ability
to test portions of the servo processor logic, including the servo processor RAM memory. After a
successful test of the servo RAM, the master processor will execute a load of the functional servo
microcode from the EEPROM located on the I/O-R/W module.
RA90/RA92 disk drives are equipped with special error recovery circuits which the master processor
controls. If the drive receives error recovery commands from the disk controller, the master
processor software activates combinations of error recovery signals. As a result, drive read/write
and servo characteristics are altered in an attempt to recover drive data. Appendix B contains a
more detailed description of the RA90/RA92 error recovery mechanisms.
The master processor retains the drive OCP switch state information and drive unit number in
memory. This state information is saved into non-volatile EEPROM memory if a power loss is
detected. Upon restoration of drive power, the original state of the drive can be resumed.
Functional microcode in the drive provides base level revision information concerning the I/O-R/W
module. Drive utility T45 (refer to Chapter 4) displays a decimal code that
translates to the module's hardware revision. The display format is [IOP=xx]. Table 3-2 presents
the displayed codes and the corresponding module part numbers and revisions.
Table 3-2  I/O-R/W Module - Hardware Revision Matrix

T45 Displayed      I/O-R/W Module   C/S Part   Etch
Revision IOP=xx    Part Number      Revision   Revision   Compatibility
                   54-17771-01      Lx-Nx                 RA90 only
                   54-17771-01      Rx-xx                 RA90 only
                   54-17771-02      Ax-                   RA92-compatible

3.2.1.2 Servo Module
Figure 3-3 is a simplified block diagram of the servo portion of the ECM module. The servo portion
of the ECM uses a digital signal processor of the Texas Instruments TMS family. The digital signal
processor is called the servo processor (or sometimes the DSP processor).

Figure 3-3 (simplified block diagram of the servo module; CXO-2187A)
(Blocks shown: servo processor, DSP memory, DMA control, actuator power amplifier, spindle driver,
and the connections to the R/W module, HDA actuator, and spindle motor.)

The servo processor communicates with the master processor and does the following:
• Obtains embedded servo information from the I/O-R/W module for offset calibration of the
  read/write heads.
• Obtains dedicated servo information for positioning the read/write heads.
• Controls spindle motor spin-up and spin-down operations.
• Monitors HDA spindle speed and servo positioning (including errors).
• Controls servo-related internal diagnostics.

Additionally, the servo processor controls the following:
• Retract (moving heads off data surface)
• Return to zero (RTZ)
• Fine track (keeping heads on track centerline)
• Seeks

Functional microcode in the drive provides base level revision information concerning the servo
module. Drive utility T45 (refer to Chapter 4) displays a decimal code that
translates to the module's hardware revision. The display format is [SRV=xx]. Table 3-3 presents
the displayed codes and the corresponding module part numbers and revisions.
Table 3-3  Servo Module - Hardware Revision Matrix

T45 Displayed      Servo Module     C/S Part   Etch
Revision SRV=xx    Part Number      Revision   Revision   Compatibility
                   54-17769-01      Ax-Nx                 RA90 only
                   54-17769-01      Px-xx                 RA90 only
                   54-17769-02      Ax-                   RA92-compatible

3.2.2 Preamp Control Module (PCM)
The PCM FRU is part of the HDA/carrier assembly which is also an FRU. Figure 3-4 is a simplified
block diagram of the PCM.
The PCM performs the following operations:
• Decodes head select signals sent from the master to select the appropriate read/write head
  matrix chips (located inside the HDA), and the appropriate output from each matrix chip.
• Monitors unsafe read/write conditions.
• Provides differential write pulses to the preamplifiers.
• Passes through the HDA vendor type bits from the HDA to the master processor.
• Passes the type of format bits from the PCM switch pack to the master processor.

Two different PCM modules exist in the RA90/RA92 disk drive family. The two PCM types are
electrically incompatible in the interconnect between the PCM and the internal HDA electronics.
However, the PCMs are functionally compatible between the PCM and internal HDA and the ECM
variants that may be attached. A physical mechanism prevents the use of an incompatible PCM
with an HDA.
Table 3-4 describes the PCM switch pack settings with regard to the type of PCM, HDA, and RA9x
model.

Figure 3-4 (simplified block diagram of the PCM; CXO-2188A)
(Blocks shown: read data MUX, decoders, latch, write data switch, write current generator, and zone
select, with read data, write data, write current, chip enable, and head select lines to the HDA.)

Table 3-4  PCM Switch Pack Setup

                               PCM SW Pack Settings*
PCM P/N         HDA P/N        S1-4  S1-3  S1-2  S1-1    Comments
54-17758-01     70-22951-01                              RA90 long arm only
54-19724-01     70-27268-01                              RA90 short arm
54-19724-01¹    70-27492-01                              RA92 only
54-19724-01¹    70-27492-01                              Incompatible setup²
54-19724-01¹    70-27492-01                              Incompatible setup²
54-19724-01¹    70-27268-01                              Incompatible setup²
54-19724-01¹    70-27268-01                              Incompatible setup²

* 0 = ON = CLOSED, 1 = OFF = OPEN
¹ PCM spares shipped from logistics are configured by default to declare an incompatible situation.
This forces the field person to properly configure the replacement PCM to indicate the proper HDA
format type. The drive microcode uses the switch setting information to properly configure servo
operations.
² Drive LED error code CO signifies that the microcode has determined an incompatible situation
between the hardware and/or microcode components of the drive configuration, or a hardware failure
has caused the drive to believe the configuration is improper.

Functional microcode in the drive provides base level revision information concerning the PCM
module. Drive utility T45 (refer to Chapter 4) displays a decimal code that
translates to the module's hardware revision. The display format is [PCM=xx].
Table 3-5 presents the displayed codes and the corresponding module part numbers and revisions.

Table 3-5  PCM Module - Hardware Revision Matrix

T45 Displayed       PCM Module       C/S Part   Etch
Revision PCM=xx¹    Part Number      Revision   Revision   Compatibility
                    54-17758-01²     Ex-Hx                 HDA 70-22951-01 only
                    54-19724-01      Ax-                   HDA 70-27268-01 and 70-27492-01

¹ Switch positions S1-3 and S1-4 of switch pack S1 determine the displayed PCM hardware revision.
² These modules have a mechanical interlock that prevents the inadvertent mating of electrically
incompatible PCMs to the HDA.

There is a four-position switch pack on the PCM. Switch pack switches S1-3 and S1-4 determine
the PCM hardware revision (not CS revision) through OCP display T45. Switches S1-1 and S1-2
are used to tell the drive functional microcode the format type written on the HDA. There are two
planned format types - RA90-compatible and RA92-compatible.

A new HDA/carrier assembly FRU should have the switch pack set correctly by the manufacturing
plant. If the PCM is defective, set the switch pack switches appropriately. Figure 3-5 shows the
location of the switch pack on the PCM.
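
The split of the four switches into two fields can be sketched as follows. This Python example is illustrative only: the function and key names are hypothetical, and it deliberately does not map bit patterns to a specific format type or revision code, because those values must be taken from Table 3-4 and Table 3-5. It uses the table convention 0 = ON = CLOSED, 1 = OFF = OPEN.

```python
def decode_pcm_switch_pack(s1_1, s1_2, s1_3, s1_4):
    """Group the four S1 switch values into the two fields described above.

    Each argument is 0 (ON, closed) or 1 (OFF, open). The returned bit
    pairs are not interpreted here; look them up in the manual's tables.
    """
    for value in (s1_1, s1_2, s1_3, s1_4):
        if value not in (0, 1):
            raise ValueError("each switch value must be 0 or 1")
    return {
        # S1-1 and S1-2 tell the microcode the HDA format type.
        "format_type_bits": (s1_2, s1_1),
        # S1-3 and S1-4 determine the PCM hardware revision shown by T45.
        "hardware_revision_bits": (s1_4, s1_3),
    }
```

Keeping the two fields separate mirrors the text: changing S1-1 or S1-2 affects only the declared format type, while S1-3 and S1-4 affect only the revision that T45 displays.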

Figure 3-5  PCM Switch Pack Location (CXO-2963A)
(Labels shown: PCM switch pack, preamp control module (PCM), and PCM-ECM cable connector.
PCM switch: O = OPEN/OFF = LOGICAL 1; C = CLOSED/ON = LOGICAL 0.)

3.2.3 Head Disk Assembly and Carrier Assembly
Figure 3-6 is a simplified block diagram of the RA90/RA92 head disk assembly (HDA) and its
relationship to the rest of the drive.

The HDA consists of the following components:
• The spindle motor, spindle, and recording media
• The actuator motor to position the read/write heads
• The Hall sensors to monitor spindle speed
• The preamp/select chips
• The brake assembly
• The ground brush
• The positioner lock mechanism

Currently, there are three different HDAs in the RA9x disk drive products family. Two different
PCMs are available for these three HDAs. Table 3-6 is a compatibility matrix for HDA types, PCM
types, and RA9x models.

Figure 3-6 (simplified block diagram of the HDA; CXO-2189A)
(Blocks shown: servo module, HDA, R/W I/O module, and preamp control module, with the actuator power
amplifier, spindle driver, read/write data, head select, and power lines between them.)

Table 3-6  RA90/RA92 HDA Hardware Compatibility Matrix

HDA P/N                RA90            RA90           RA92
and Type               70-23899-01¹    70-23899-02    70-27490-01     PCM
70-22951-01,           Original*       Compatible     Incompatible    54-17758-01
Long-arm RA90 HDA
70-27268-01,           Compatible      Original*      Incompatible    54-19724-01²
Short-arm RA90 HDA
70-27492-01,           Incompatible                   Original*
Short-arm RA92 HDA

* Original = HDA type original to drive
¹ The RA90 disk drive was originally made from the base drive part number 70-23899-01. With the
introduction of the short-arm HDA, the variant of the base part number for the RA90 disk drive was
changed. The size and SDI disk topology of the 70-23899-01 and 70-23899-02 variant RA90 disk drives
are identical. There is not a duplication of drive serial numbers between the 70-class numbers.
Architecturally, the drives are identical. At the HDA FRU level, the short-arm HDA is electrically
compatible with the original long-arm HDA. However, microcode compatibility issues must be watched.
² The PCM switch pack must be set to indicate the type of HDA.
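
The PCM-to-HDA pairing stated in Table 3-5 can be expressed as a simple lookup. The following Python sketch is illustrative only; the dictionary and function names are hypothetical. It covers the part-number pairing alone and does not replace the remaining checks (PCM switch pack setup, ECM, and microcode compatibility).

```python
# PCM part number -> HDA part numbers it mates with, as listed in Table 3-5.
PCM_TO_HDAS = {
    "54-17758-01": {"70-22951-01"},
    "54-19724-01": {"70-27268-01", "70-27492-01"},
}


def pcm_matches_hda(pcm_part_number, hda_part_number):
    """Return True if the PCM and HDA part numbers are listed as a pair.

    Unknown PCM part numbers are treated as not matching.
    """
    return hda_part_number in PCM_TO_HDAS.get(pcm_part_number, set())
```

For example, pcm_matches_hda("54-19724-01", "70-27492-01") is true, while pairing 54-17758-01 with the same HDA is not.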

3.2.4 Dual Outlet Blower Motor
The blower motor assembly provides drive cooling. In addition, the blower motor contains speed
control circuitry to activate higher throughput if the ambient air temperature exceeds 23°C (75°F).
If the drive is operating without problems at or below this temperature, blower speed is reduced for
better acoustic levels.

3.2.5 Power Supply
The power supply provides the following voltages to RA90/RA92 disk drives:
• ±12 Vdc
• ±5.1 Vdc
• ±24 Vdc
• -5.2 Vdc

Normal power supply operation is indicated by the presence of a green Power OK LED located at
the rear of the drive. Refer to Figure 3-7 for the location of the Power OK LED.
The power supply operates on any line frequency within the range of 47 Hz to 63 Hz. It is
switch-selectable to either of two ranges: 120 Vac or 240 Vac.
CAUTION

If a unit has its voltage selector switch in the 120 Vac position and is plugged into 240
Vac, the power supply will be damaged.
If a unit has its voltage selector switch in the 240 Vac position and is plugged into 120
Vac, it may work, but would be very sensitive to low line voltage.
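
The two mismatch cases in the caution above can be captured in a short pre-power-up check. This Python sketch is illustrative only; the function name and the returned labels are hypothetical, and it considers only the two nominal ranges named in the text.

```python
def selector_outcome(selector_vac, line_vac):
    """Classify a voltage selector setting against the nominal line voltage.

    Both arguments are the nominal range, 120 or 240 (Vac).
    """
    if selector_vac not in (120, 240) or line_vac not in (120, 240):
        raise ValueError("nominal values must be 120 or 240")
    if selector_vac == line_vac:
        return "ok"
    if selector_vac == 120:
        # 120 Vac setting on a 240 Vac line: the power supply will be damaged.
        return "damage"
    # 240 Vac setting on a 120 Vac line: may work, sensitive to low line voltage.
    return "marginal"
```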

Figure 3-7  Power Supply OK LED (CXO-2134B)
(Labels shown: drive rear, circuit breaker, and green LED (Power OK).)

This power supply has two vendors, designated Vendor A and Vendor B. Power supplies from
Vendor A have a serial number with a CX site code. Power supplies from Vendor B have a serial
number with a KB site code. (Voltage markings on some power supplies may read 115/230 Vac.)
The power supplies from both vendors are functionally identical and carry the same Digital part
number.

3.2.6 Drive Functional Microcode
The drive functional microcode can be field loaded and upgraded using the OCP microcode update
port. ROM-based utility programs contained on the ECM module (I/O-R/W) allow microcode
loading.
Table 3-7 is a compatibility matrix for microcode cartridges, microcode levels, and ECM and HDA
FRUs.


Table 3-7  RA90/RA92 Microcode Compatibility With Drive FRUs

Microcode                      ECM FRU                      HDA FRU
Cart. P/N        Microcode
and Rev          Level         70-22942-01   70-22942-02    70-22951-01   70-27268-01   70-27492-01

70-24432-02 A1

Yes

70-24432-02 A1

Yes

70-24432-02 B1

Yes

No¹
No¹
No¹

70-24432-02 C1

Yes

70-24432-02 D1

Yes

No²
No²

Yes

No
No
No
No
No

70-27950-01 A1

Yes

70-27950-01 B1

Yes

Yes
Yes

¹ Results in LED Code 13
² Results in LED Code E2

NOTE

Microcode compatible with an ECM FRU means the code can be loaded into the ECM
FRU without error and will function, provided there is a compatible HDA with the
appropriate PCM and PCM switch settings are correct. (This does not apply to Hard
faults, because the microcode cannot be loaded into the ECM.)
Microcode compatible with an HDA FRU means that the code (when loaded into a
compatible ECM) will support the HDA identified in Table 3-7.
To determine total compatibility, you must verify the following:

- Code compatibility to ECM (Table 3-7)
- Code compatibility to HDA (Table 3-7)
- ECM compatibility to HDA (Table 3-1)
- PCM and HDA compatibility (Table 3-4)
- PCM switch pack setup (Table 3-4)

3.2.7 OCP Functions
The operator control panel (OCP) shown in Figure 3-8 functions as the interface to the RA90/RA92
disk drive. The OCP performs the following functions:
• Selects and displays the unit address.
• Selects Run, Write Protect, Port A, and Port B.
• Displays fault indication and error codes.
• Selects tests in the test mode.
• Controls the drive software update process.
• Communicates with the RA90/RA92 master processor.
• Monitors momentary contact switches for closure.

Figure 3-8  RA90/RA92 OCP (CXO-2962A)
(Labels shown: four-character alphanumeric display, unit number, Test switch, and state LED
indicators. NOTE: RA90 part no. 74-35109-02; RA92 part no. 74-39769-01.)

To execute or select these functions, you must be familiar with the following OCP features (refer to
Figure 3-8):

• Six input switches (Run, Fault, Write Protect, Port A, Port B, and Test).
• Seven LED indicators (Ready, Run, Fault, Write Protect, Port A, Port B, and Test).
• A four-character alphanumeric display.
• A software update port (refer to Chapter 7).

3.3 RA90/RA92 Operating Modes
RA90/RA92 disk drives operate in three setup modes: normal, fault display, and test. The following
sections describe the function of each of these modes.

3.3.1 Normal Mode Setup
The normal mode setup is the usual operating mode of the RA90 and RA92 disk drives. Switch
selection during normal operation usually consists of the Run switch, Write Protect switch (for
normal write protection), and Port A or Port B switch. No Fault or Test indicators are lit. The
switch states are displayed in the alphanumeric display, and the state of the drive relative to the
controller is displayed in the LED indicators.


In the normal operating mode:
1. Selecting the Run switch causes an R to appear in the OCP display and causes the drive to spin
up. Additionally, the Run LED indicator lights. The Ready LED indicator lights once the drive
is up to speed.
2. Selecting the Port A or Port B switch causes an A or B to appear in the OCP display and
logically makes the drive available to the controller.
3. Selecting the Write Protect switch logically write protects the drive and lights the Write Protect
LED indicator.
4. Selecting the Fault switch:
   • (Without a fault indicator) causes a 2-second OCP lamp test.
   • (With a fault indicator) causes an error code to display. Selecting the Fault switch a second
     time (with a fault indicator) clears the fault. (Refer to Section 3.3.2.)
5. Selecting the Test switch:
   • (With the Port A or Port B switch selected) causes a 2-second display of the unit address.
     (Refer to Section 3.4.1 for information on the alternate unit address display mode.)
   • (Without the Port A or Port B switch selected) causes the drive to enter the test mode. (At
     this time the Ready LED is extinguished.)

Table 3-8 details operator actions and the result of OCP switch selection(s) in the normal mode.
Power-up OCP functions and normal switch selection functions are covered.
Table 3-8 Power-Up: Normal Mode Operations

Operator Action   OCP Result   Drive Function

<Power-up>        [WAIT]       Drive is running power-up diagnostics
Default           [0000]       Unit number displayed may be something other than zero
<RUN>             [R...]       Spinup command issued to spindle
<A>               [R.A.]       Port A is enabled
<B>               [R.AB]       Port B is enabled

3.3.2 Fault Display Mode Setup
The fault display mode can only be entered if the Fault indicator is lit; otherwise, selecting the
Fault switch causes a 2-second OCP lamp test.
To enter the fault display mode, select the Fault switch. An error code is displayed in the format
shown in Figure 3-9. To exit the fault display mode and clear the fault, select the Fault switch a
second time.
NOTE

Hard faults will not clear.
Figure 3-9 shows a characteristic alphanumeric fault display error code. Figure 3-10 is a fault
display mode flowchart.
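The Fault switch behavior described in this section can be summarized as a small decision function. This is only an illustrative sketch: the function name, argument names, and returned strings are hypothetical and do not come from the drive microcode.

```python
def select_fault_switch(fault_lit: bool, code_displayed: bool,
                        hard_fault: bool = False) -> str:
    """Return the OCP action for one selection of the Fault switch.

    fault_lit      -- True if the Fault indicator is lit
    code_displayed -- True if the error code is already being displayed
    hard_fault     -- True if the fault is a hard fault (hard faults do not clear)
    """
    if not fault_lit:
        # No fault indicator: the switch only triggers the lamp test.
        return "lamp test (2 seconds)"
    if not code_displayed:
        # First selection with a fault indicator: show the error code.
        return "display error code"
    if hard_fault:
        # Second selection, but hard faults will not clear.
        return "fault remains"
    return "clear fault"
```

For example, select_fault_switch(True, False) models the first selection with the Fault indicator lit, and select_fault_switch(True, True) models the second selection.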


Figure 3-9  OCP Fault Display Error Code Example

Figure 3-10  Fault Display Mode Flowchart
(Note from the flowchart: the error display can be any combination of legal alphanumeric
error codes in hex.)


3.3.3 Test Mode Setup
You must enter the test mode to set the RA90 or RA92 disk drive unit address or to run resident
diagnostic tests. In this mode, Port A and Port B switches have the function of selecting both the
unit address numbers and test numbers. In addition, the port switches are used to abort running
diagnostics. The Write Protect switch starts the tests and the Port A or Port B switch stops selected
tests.
The test mode is characterized by three displays. Figure 3-11 shows an OCP after test selection is
made. Figure 3-12 shows a display while the test is running.

Figure 3-11  OCP Display After Test Selection
(* indicates flashing display)

Figure 3-12  OCP Display While Running Test
(* indicates flashing display)


3.4 Programming the Drive Unit Address
The unit address can be set once power has been applied to the drive. You must set the drive unit
address before placing the drive on line.
The RA90 or RA92 unit address is programmable from 0 to 4094. (Note that the operating system
or subsystem type can limit the unit address range.)
Use the following procedure to set the drive unit address. (Refer to Figure 3-13 for a flowchart of
this procedure.)
1. Select the Test switch. (The Test LED indicator lights and a unit address (if previously
programmed) is displayed; otherwise, zeros are displayed.)

2. Select the Port A switch for the ones position. (Position zero blinks.)

3. Select the Port B switch. (Position zero increments 1 through 9 every time Port B is selected.)
4. Select the Port A switch for the tens position. (Position one blinks.)

5. Select the Port B switch. (Position one increments 1 through 9 every time Port B is selected.)
6. Select the Port A switch for the hundreds position. (Position two blinks.)
7. Select the Port B switch. (Position two increments 1 through 9 every time Port B is selected.)

8. Select the Port A switch for the thousands position. (Position three blinks.)
9. Select the Port B switch. (Position three increments 1 through 4 every time Port B is selected.)

10. Select the Test switch to exit.
At this point, the OCP prompts you to verify that you want to change the unit address. The
following prompt scrolls through the OCP display:
CHG UNT # {? [N]}

• If you do not want to change the unit address, select the Test switch a second time. The drive
  returns to normal mode.

• To change the unit address:
  1. Toggle the Port B switch. CHG UNT # {? [Y]} displays.
  2. Select the Test switch.
The old unit address is overwritten with the new address. The new unit address is displayed in the
OCP.
NOTE

The new unit address is written to EEPROM and is not lost if the drive loses power.
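As a planning aid for the procedure above, the following sketch splits a unit address into its four display positions. It assumes that every position starts at 0 and that each Port B selection advances the blinking position by one; the function name is hypothetical.

```python
def port_b_presses(unit_address: int) -> list[int]:
    """Return the digit to dial in for positions zero through three.

    Position zero is the ones digit and position three is the thousands
    digit, matching the order in which Port A steps through the display.
    """
    if not 0 <= unit_address <= 4094:
        raise ValueError("unit address must be in the range 0 to 4094")
    digits = f"{unit_address:04d}"          # for example "0253"
    # Reverse so that index 0 is the ones position.
    return [int(d) for d in reversed(digits)]
```

Calling port_b_presses(253) gives the ones, tens, hundreds, and thousands digits in that order. Remember that the operating system or subsystem type can restrict the usable range further.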


Figure 3-13  Unit Address Selection Flowchart
(* indicates flashing readout)

3.4.1 Alternate Unit Address Display Mode
Future RA90 and RA92 disk drives will incorporate a microcode enhancement that will provide
an alternate unit address display mode. To display the unit address, refer to Figure 3-14 while
performing the following procedure:

1. The OCP display shows an R, A, and/or B.
2. While in normal mode, select the Port A and/or Port B switch.
3. Select the Test switch. At this point, the Run, Fault, Write Protect, Port A, and Port B switches
   are disabled.
4. The unit address is displayed until:
   • The Test switch is deselected.
   • Power is cycled.
   • An SDI HARD INIT occurs, or the drive forces a hard initialization due to a fatal error.

Any of these conditions will clear the OCP from the alternate display mode.

Figure 3-14  Alternate Unit Address Display Mode Flowchart

Drive-Resident Diagnostics and Utilities
4.1 Introduction
This chapter describes drive-resident diagnostic fault detection, power-up and idle loop diagnostic
routines, and sequenced or chained diagnostics. The RA90/RA92 drive-resident diagnostics and
utilities are described individually. These drive-resident diagnostics test for and detect errors in the
following field replaceable units (FRUs):
• Electronic Control Module (ECM) (input/output-read/write (I/O-R/W) and servo modules)

• Preamp Control Module (PCM)

• Head Disk Assembly (HDA)

4.2 Power-Up and Idle Loop Diagnostics
Resident diagnostics execute any time the drive is powered up or the master processor is reset.
Additionally, diagnostic routines execute during idle loop with the drive spun up or down. The Test
LED, when lit, indicates the drive is in idle loop testing.
The following sections describe power-up (reset) and idle loop diagnostic sequences.

4.2.1 Power-Up (Hardcore) Diagnostics
The following hardcore tests are run at power-up or upon reset of the master processor (refer to
Section 4.7 for a description of each test):
• Master CPU test (POR)

• Master ROM test (T01)

• Master RAM test (POR)

• Master timer test (T02)

• Serial communication test (SCI) (POR)

• Servo data bus loopback test (T03)

• Servo RAM test (POR)


4.2.2 Idle Loop Tests (Drive Spun Down)
Idle loop is defined as the drive being off line to the controller. The following sequence is executed
every 30 seconds during idle loop (refer to Section 4.7 for a description of each test):
• Master ROM test (T01)

• Master timer test (T02)

• Servo data bus loopback test (T03)

• Head select test (T06)

• Sector/byte counter test (T07)

• SDI loopback test (internal) (T08)

4.2.3 Idle Loop Tests (Drive Spun Up)
The following tests are run during idle loop with the drive spun up (refer to Section 4.7 for a
description of each test):
• Master ROM test (T01)

• Master timer test (T02)

• Servo data bus loopback test (T03)

• Head select test (T06)

• Gray code (track counter) test (T29)

• Guardband test (T30)

• Incremental seek test (quick verify mode) (T31)

• Random seek test (quick verify mode) (T33)

4.3 Sequence Diagnostics
A number of tests are sequenced together to form a chain of tests. The test [chain] numbers and
the individual test numbers that make up the chain are listed here. An example of the information
seen in the OCP alphanumeric display is also included. Refer to Section 4.7 for a description of
each test.
• T00 and T23 are the same when the drive is spun down, and include:
  T01
  T02
  T03
  T06
  T07
  T08

  Duration: 12 seconds

• T00 and T22 are the same when the drive is spun up, and include:
  T01
  T02
  T03
  T06
  T29
  T30
  T31
  T33
  T14
  T15
  T16

  Duration: 7:10 minutes
The following is an example of the information seen in the alphanumeric display as the drive
sequences through a chain:
1. [T 00] Enter test T00 from the OCP front panel.

2. [S 00] Start T00.
3. [S 01] T01 starts.
4. [C 01] T01 completes.

5. [S 02] T02 starts.

6. [C 02] T02 completes.
(and so on until each diagnostic in the chain is completed)
7. [T 00] Concludes with this display and the least significant digit (LSD) blinking. The OCP
display is read from left to right with the LSD on the right side.
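The display sequence in the example above follows a regular pattern, which the following sketch reproduces for any chain. The function name is hypothetical, and the sketch assumes every chain is displayed the same way as the T00 example; it does not model the blinking digit.

```python
def chain_displays(chain: int, tests: list[int]) -> list[str]:
    """List the OCP displays seen while a test chain runs.

    chain -- the chain (sequence) test number entered at the OCP
    tests -- the individual test numbers that make up the chain, in order
    """
    shown = [f"[T {chain:02d}]", f"[S {chain:02d}]"]   # load, then start the chain
    for test in tests:
        shown.append(f"[S {test:02d}]")                # individual test starts
        shown.append(f"[C {test:02d}]")                # individual test completes
    shown.append(f"[T {chain:02d}]")                   # chain number shown again at the end
    return shown
```

For example, chain_displays(18, [1, 2, 3, 6]) lists the displays expected for the T18 hardcore sequence.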
The majority of tests are of a relatively short duration, with the following exceptions:

• T31 (2.5 minutes; indefinite when standalone)

• T32 (1 minute; indefinite when standalone)

• T33 (55 seconds; indefinite when standalone)

Additional test chains are:
• T18: T01, 02, 03, 06 (8 seconds total)

• T19: T14, 15, 16 (20 seconds total)

• T20: (4 seconds if spun down; 2 seconds if spun up)

• T21: T03, 29, 30, 31 (4.5 minutes), 32, 33 (7:10 minutes total); error if spun down

• T22: Same as T00 except T31 (4.5 minutes) (7 minutes total)

• T23: T01, 02, 03, 06, 07, 08 (20 seconds total)

4.4 Standard OCP Displays Indicating Procedural Problems
If you attempt to load and run a nonexistent test, [INVL] (invalid) displays in the OCP, followed
by an error code. For example, if you attempt to run T10 (an invalid test number), the following
occurs:

1. [T 10] (Display)

2. [S 10] (Display)
3. [INVL] (2 seconds-indicates invalid test)
4. [C 10]
5. [T 10]

No error code is generated. To continue, simply select another diagnostic.
If you attempt to run a diagnostic while the drive is faulted and that particular diagnostic cannot
be run under fault conditions, the OCP displays [NRUN].


For example, read/write or seek tests cannot be run while the drive is faulted. However, ROM or
RAM tests can be run.
If you attempt to run a test that requires the drive be spun up (but the drive is spun down), the
following occurs:
1. [T 14] Load T14.
2. [S 14] Start T14.
3. [T 14] (with fault light)
4. Select the Fault switch.
5. [E CA] error code indicates the drive must be spinning for the test to run successfully. Unless
otherwise indicated, this is the format for all errors.
Select the Fault switch again to clear the fault and continue.
If you attempt to run a test that requires the drive be spun down (but the drive is spun up), the
following occurs:
1. [T 07] Load T07.

2. [S 07] Start T07.
3. [T 07] (with fault light)
4. Select the Fault switch.
5. [E 7B] Invalid-test-while-drive-is-spinning error. This is the format for all tests that are invalid
while the drive is spinning.
Select the Fault switch again to clear the fault and continue.
Some diagnostic test numbers call up other tests. These are displayed in the OCP after the
diagnostic starts. An example of this is T24. The following is displayed in the OCP:
1. [T 24] Load T24.
2. Start test.
3. [S 63] See test T63.
4. After the head(s) are selected, select Write Protect.
5. [T 31] Loaded by the drive.
6. [S 31] See test T31.
The reverse is not true. T63 does not start T24.

4.5 Software Jumper
References to a software jumper are frequently made throughout the discussion of diagnostics.
To use the software jumper, simply select the Run/Stop switch within 1.5 seconds of starting a
diagnostic requiring the jumper's use.
CAUTION

Do not use the jumper unless it is required. Valuable drive component information can
be accidentally lost. Use the jumper only when instructed to do so.


4.6 Temperature's Effect on Drive Performance
The RA90/RA92 drive utilities T36, T38, and T39 measure various seek time parameters. Compare
measured times to drive specifications in cases where seek time is in question.
At the areal densities of the RA90/RA92 disk drives, variations of mechanical responses within
the HDA mechanical structures change significantly over the wide temperature ranges acceptable
to the drive. To control these variations and their impact on the subsystem, the drive monitors
and compensates its seek profile to optimize the seek time performance. This compensation is a
dynamic process that assures top seek performance of the disk drive.

4.7 Diagnostics Descriptions
This section describes each of the diagnostic tests and utilities resident in the RA90/RA92 disk
drive. Tests are listed by a test number (where applicable), a name, an explanation of how the test
is invoked, and a test description.
Conventions include the following:
• (T00): test number

• (POR): power-up or reset

• (SDI): initialization performed by the controller over the SDI cable

• ([0000]): items enclosed in square brackets represent the OCP alphanumeric display

NOTE

Some diagnostics implement a scrolling display pattern. To stop the scrolling display
pattern, select the Run switch; this halts the display until you are ready to continue.
Select the Run switch again to continue the display.
Some tests run for several seconds then have results to display. These tests stop the
scrolling display and send an asterisk to the display. Press the Run switch to display test
results.

Master CPU Test (POR)
The Master CPU test verifies the basic functions of the drive master processor. Accumulator
functions, conditional codes, and other MCU chip functions are tested.

Master RAM Test (POR/SDI)
The Master RAM test runs at power-up only. It verifies the master processor internal and static
RAM. The test reads and writes, then reads each RAM location again to verify data integrity of the
component. The test is executed in both forward and reverse directions.


Serial Communications Interface (SCI) Test (POR)
The SCI test checks the master processor serial communication interface by looping a data pattern
from the serial output back to the serial input. It compares data out to data in for integrity.
Additionally, the serial port is tested for overrun error detection and overrun recovery. The test
simulates OCP MCU communication with the master MCU.

Servo RAM Test (POR)
The Servo RAM test checks the servo processor RAM by writing a pattern of ones and zeros through
RAM. The entire 16 Kbytes of RAM is tested.

(T01) Master ROM Test
The Master Processor ROM test verifies the master processor internal ROM, EEPROM, and the
associated address decode logic. A checksum is done on each ROM. Next, the test verifies that the
consistency codes match between the MCU ROM and the master processor EPROM and EEPROM.
If a failure occurs, the master processor attempts to display an error code to the OCP.

(T02) Master Timer Test (POR/SDI)
The Master Timer test verifies the output compare timer in the master processor by checking the
Output Compare Flag (OCF) for stuck bits. Additionally, the test operates the timer in polling and
interrupt modes.
In polling mode, the output compare register generates a compare every 50 ms and ensures that
the OCF sets within 60 ms.
In interrupt mode, the output compare register generates a compare every 50 ms and checks for
one interrupt within a 75 ms period.

(T03) Servo Data Bus Loopback Test (POR/SDI)
The Servo Data Bus Loopback test checks the data bus interface between the I/O-R/W module and
the servo module (ECM) by rotating a single bit through each bit position on the servo data bus.
The data pattern is written to the GASP register #1 and read back through GASP register #7.
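Rotating a single bit through each bit position is commonly called a walking-ones pattern. The sketch below shows the idea only. The 8-bit bus width and the write_then_read callback (standing in for the write to one register and the read back through another) are assumptions for illustration, not details taken from the drive.

```python
from typing import Callable


def walking_one_patterns(width: int = 8) -> list[int]:
    """Return one pattern per bit position, each with exactly one bit set."""
    return [1 << bit for bit in range(width)]


def loopback_ok(write_then_read: Callable[[int], int], width: int = 8) -> bool:
    """Return True if every walking-one pattern is read back unchanged.

    write_then_read is a hypothetical callback that writes a pattern to the
    bus and returns the value read back.
    """
    return all(write_then_read(p) == p for p in walking_one_patterns(width))
```

A bus line stuck at 0 or shorted to a neighbor changes at least one of these patterns, which is why a single rotating bit is enough to exercise each line individually.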


(T04) Drive S/N Bus Test (POR/SDI)
The Drive S/N Bus test checks the drive serial number bus (the rear flex cable between the HDA
and the servo module with the PCM switch pack). The three drive ID ports hardwired on the rear
flex cable assembly are read and concatenated into one 20-bit binary encoded serial number.
Bits 19 and 18 represent the manufacturing plant code of the drive (CX or KB). Bits 17 through 0
are the alphanumeric serial number (00000-Z9999). The numbering scheme is displayed using the
following:

Encoded Serial        Decimal Drive
Number Displayed      Serial Number

00000-99999           0-99,999
A0000-A9999           100,000-109,999
B0000-B9999           110,000-119,999
C0000-C9999           120,000-129,999
D0000-D9999           130,000-139,999
E0000-E9999           140,000-149,999
F0000-F9999           150,000-159,999
H0000-H9999           160,000-169,999
J0000-J9999           170,000-179,999
K0000-K9999           180,000-189,999
L0000-L9999           190,000-199,999
M0000-M9999           200,000-209,999
N0000-N9999           210,000-219,999
P0000-P9999           220,000-229,999
R0000-R9999           230,000-239,999
S0000-S9999           240,000-249,999
T0000-T9999           250,000-259,999
U0000-U9999           260,000-269,999*
V0000-V9999           270,000-279,999
W0000-W9999           280,000-289,999
Y0000-Y9999           290,000-299,999
Z0000-Z9999           300,000-309,999

*NOTE
U2143 is the maximum serial number that can be coded for the KB manufacturing site
because only the bottom 18 binary bits are used for the serial number range.
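The mapping from a decimal serial number to the displayed form can be sketched as follows. The letter sequence is read directly from the table above (G, I, O, Q, and X do not appear in it); the function and constant names are hypothetical.

```python
# Leading letters in table order; each letter covers a block of 10,000 serial numbers.
LETTERS = "ABCDEFHJKLMNPRSTUVWYZ"


def display_serial(decimal_serial: int) -> str:
    """Convert a decimal drive serial number to the five-character displayed form."""
    if not 0 <= decimal_serial <= 309_999:
        raise ValueError("serial number outside the range covered by the table")
    if decimal_serial < 100_000:
        return f"{decimal_serial:05d}"
    # Above 99,999 the first character becomes a letter, followed by four digits.
    block, rest = divmod(decimal_serial - 100_000, 10_000)
    return f"{LETTERS[block]}{rest:04d}"


# The largest value that fits in 18 bits is 2**18 - 1 = 262,143.
print(display_serial(2**18 - 1))  # U2143
```

The printed result agrees with the note above: an 18-bit field cannot hold a serial number higher than 262,143, which is displayed as U2143.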


The test passes or fails based on the following valid and invalid bit-encoded binary information:

VALID DRIVE S/N CODES

• CXO-built drive (serial number 1 through 262,143)

• CXO-built drive (serial number 262,144 through 309,999). Limitation is based upon
  the number of alphabetic characters available.

• KBO-built drive (serial number 1 through 262,143)

INVALID DRIVE S/N CODES

BITS      MIN BINARY VALUE      MAX BINARY VALUE
19 18     BITS<17:00>           BITS<17:00>

0  1      001011101011110000    111111111111111111
1  1      000000000000000000    111111111111111111

NOTE

Do not alter these switches in the field unless you are instructed to do so during an
ECO/FCO installation.

(T06) Head Select Test
The Head Select test checks the SDI gate array (SGA) head select register for stuck-at conditions.
The test writes a head select pattern to an SGA internal register and verifies the pattern by reading
it back through another SGA internal register. Each head select pattern is clocked to the preamp
control module (PCM) verifying the correct head select chip can be enabled.

(T07) Sector/Byte Counter Test
The Sector/Byte Counter test checks the sector preset by writing and reading each bit in the sector
preset register. The test checks the byte preset counter by presetting the byte counter. A full
counting sequence is needed to increment the sector count by one. Finally, the sector/byte counter
is checked with the actual preset values used in the functional code. A diagnostic clocking signal is
used.


(T08) SDI Loopback Test (Internal)
The internal SDI Loopback test is executed with the SDI gate array in loopback mode.

The State Frame part of this test asserts the state bits (RDY, ATN, R/W, SEC) in the Real Time
Drive State (RTDS) frame and checks the corresponding state bits (RDY, WRT, RD, 00) in the Real
Time Controller State (RTCS) frame for accuracy.
The Response Serializer part of this test sends framing codes (START, CONTINUE, END) by way
of the CMDREG register, along with response data (a pattern) by way of the RSPDAT register. The
test checks the correct framing codes by way of the INSTR2 register and the correct command data
through the CMDATA register. This test is executed on Ports A and B.

(T09) SDI Loopback Test (External)
The external SDI Loopback test is the same as the internal SDI Loopback test except the SDI
signals are looped back via connectors at the end of the SDI cables. See Figure 4-1.

(T14) Read-Only Test
The Read-Only test compares prerecorded data information from cylinder 2659 to the data read
by each head. The data pattern is different for each head. If the compare fails, an error code is
generated. In addition, if five off-track errors are detected while reading with any one head, an
error is generated. Errors are analyzed in the following manner:
• A sector is considered bad if the same sector fails to read the correct data three out of five times.

• A head is considered bad if the same head contains nine bad sectors.

If no errors are detected during this test, a compare error is induced to ensure that the lID chip
compare circuitry can detect a compare error.
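The error analysis rules for the Read-Only test can be expressed as two small checks. This sketch interprets "three out of five times" as three or more failed reads in five attempts, and "contains nine bad sectors" as nine or more; both readings, and the function names, are assumptions.

```python
def sector_is_bad(read_ok: list[bool]) -> bool:
    """Classify one sector from five read attempts (True = data compared correctly)."""
    if len(read_ok) != 5:
        raise ValueError("expected the results of five read attempts")
    failures = sum(1 for ok in read_ok if not ok)
    return failures >= 3


def head_is_bad(bad_sector_count: int) -> bool:
    """Classify a head from the number of bad sectors found while reading with it."""
    return bad_sector_count >= 9
```

For example, a sector that reads correctly on only two of its five attempts is counted as bad, and a head that accumulates nine such sectors is reported as bad.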

(T15) Write/Read Test
The Write/Read test executes only after the read-only (T14) test has passed. This test writes and
reads dedicated cylinder 2660 using all read/write heads.
Two patterns are used during this test:
1. First, all the heads are written with an all-zeros-plus-a-SYNC-BIT pattern and read to verify

that the data compares. If there are no errors, a NO SYNC detection test is run verifying that
the llD sync detection circuitry is working correctly and that it can detect a NO SYNC error.
2. Second, a ones-plus-a-SYNC-BIT pattern is written to cylinder 2660 and read back using each
data head. Data is compared to ensure data integrity.

Figure 4-1  Using Loopback Connectors

(T16) Read/Write Force Fault Test
Read/write safety detection circuits are tested by software and hardware routines that force
read/write faults.

(T17) Read-Only Cylinder Formatter
Read-only cylinder 2659 is written with a zeros-plus-a-SYNC-BIT pattern (all heads) and read
back to verify data. Then another pattern is written and read back, and the data is compared for
accuracy. This cylinder is not formatted by any other subsystem formatter.
NOTE

Use a software jumper to execute this utility. This protects the stored information from
unintentional clearing. Refer to Section 4.5.
Reformatting this cylinder is sometimes necessary in the field.

(T18) Hardcore Sequence Test
This sequence diagnostic consists of T01, 02, 03, and 06. Duration: 20 seconds. Drive may be spun
up or down.

(T19) Read/Write Sequence Test
The drive must be spun up to run the Read/Write sequence test. This sequence diagnostic consists
of T14, 15, and 16. Duration: 25 seconds.

(T20) Servo Spinup Sequence Test
See T03.

(T21) Total Servo Sequence Test
The drive must be spun up to run the Total Servo Sequence test. This sequence diagnostic consists
of T03, 29, 30, 31, 32, and 33. Duration: 4.5 minutes.


(T22) Total Drive Sequence Test (Spinning)
The drive must be spun up to run this test. This sequence diagnostic consists of T01, 02, 03, 06, 29,
30, 31, 33, 14, 15, and 16. Duration: 7 minutes.

(T23) Total Drive Sequence Test (Spun down)
The drive must be spun down to run this test. This sequence diagnostic consists of T01, 02, 03, 06,
07, and 08. Duration: 20 seconds.

(T24) Head Select and One Seek Test Sequence
See T63.

(T28) Drive-Sensed Temperature Display Utility
This utility was implemented with version 25 of the drive microcode to display the drive-sensed
temperature in degrees Fahrenheit, in a scrolling display on the OCP. Version 26 of the microcode
displays this temperature in degrees Fahrenheit and Celsius.
The OCP scrolling display is as follows:
[*TEMP=xxxF/xxC*]

(T29) Gray Code (Track Counter) Test
The Gray Code test checks that the correct gray code is generated from the two least significant
bits of the track counter as the drive seeks from cylinder 0 to 3 and 3 to 0. This test is executed on
the dedicated servo surface only.
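For reference, the standard reflected binary Gray code for a two-bit counter can be computed as shown below. Whether the drive uses exactly this encoding is an assumption; the sketch only illustrates the property a Gray code provides, namely that adjacent cylinders differ in a single bit.

```python
def gray_code(value: int) -> int:
    """Return the reflected binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


# Two least significant bits of the track counter for cylinders 0 through 3.
codes = [gray_code(cylinder & 0b11) for cylinder in range(4)]
print([f"{code:02b}" for code in codes])  # ['00', '01', '11', '10']
```

Seeking from cylinder 0 to 3 steps through these codes in order, and seeking from 3 to 0 steps through them in reverse.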

(T30) Guardband Test
The Guardband test checks the drive's ability to find inner and outer guardbands during seeks to
these areas.


(T31) Incremental Seek Test
The Incremental Seek test exercises the servo by seeking between two cylinders using an
incremental seek pattern. The starting cylinder, ending cylinder, and incremental value can be
default or user defined.
Default seek parameters are: starting cylinder 0, ending cylinder 2655 (last data cylinder), and an
incremental value of 1.
An example of the seek algorithm using the default seek parameters:
BEG: 0-1-2-3-4-5- etc. 2653-2654-2655 :END

(T32) Toggle Seek Test
The Toggle Seek test does repetitive seeks between two cylinders. The starting and ending cylinders
can be user defined or default cylinder addresses.
Default seek parameters are: starting cylinder 0 and ending cylinder 2655 (last data cylinder).
An example of the seek algorithm using the default seek parameters:

BEG: 0-2655-0-2655-0-2655- etc. :END

(T33) Random Seek Test
The Random Seek test does repetitive seeks between two cylinders. The starting and ending
cylinders can be user defined or default cylinder addresses.
Default seek parameters are: starting cylinder 0 and ending cylinder 2655 (last data cylinder).

(T34) Tapered Seek Test
The Tapered Seek test exercises the servo by seeking between two cylinders using a tapered seek
pattern. The pattern starts at the cylinder with the longest stroke and ends at the cylinder with
the shortest stroke.
The starting and ending cylinders can be user defined or default cylinders.
Default seek parameters are: starting cylinder 0 and ending cylinder 2660 (diagnostic write
cylinder).
This example has the reference cyl=O and ending cyl=2660:
BEG: 0-2660-0-2659-0-2658-0-2657-0-2656- etc. 0-6-0-5-0-4-0-3-0-2-0-1-0 :END
This example has the reference cyl=2660 and ending cyl=2660:
BEG: 2660-0-2660-1-2660-2-2660-3-2660-4 etc. 2660-2658-2660-2659-2660 :END
This example has the reference cyl=1330 and ending cyl=2660:
BEG: 1330-2660-1330-2659-1330-2658 etc. 1330-1332-1330-1331-1330 :END
BEG: 0-1330-1-1330-2-1330-3-1330-4 etc. 1327-1330-1328-1330-1329-1330 :END
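The three examples above are consistent with a simple rule: seek from the reference cylinder to each target above it, working down from the ending cylinder, and then to each target below it, working up from cylinder 0, returning to the reference cylinder after every seek. The sketch below generates that sequence. It is inferred from the examples only, the function name is hypothetical, and it joins the two passes of the third example into one list.

```python
def tapered_seek(reference: int, ending: int) -> list[int]:
    """Return the cylinder sequence of a tapered seek pattern.

    reference -- the cylinder the drive returns to after every seek
    ending    -- the highest cylinder used as a seek target
    """
    if not 0 <= reference <= ending:
        raise ValueError("reference cylinder must lie between 0 and the ending cylinder")
    sequence = [reference]
    # Targets above the reference, from the ending cylinder down to the nearest neighbor.
    for target in range(ending, reference, -1):
        sequence += [target, reference]
    # Targets below the reference, from cylinder 0 up to the nearest neighbor.
    for target in range(0, reference):
        sequence += [target, reference]
    return sequence
```

With a small geometry the taper is easy to see: tapered_seek(2, 4) gives 2-4-2-3-2-0-2-1-2.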


4.7.1 Seek Timing Tests
The following diagnostics are classified as seek timing tests. Seek timing tests can be executed
through the OCP or through the SDI level 2 DIAGNOSE command.
At the completion of a timing test, position three is blank, positions two and one contain a timing
test acronym (MH, MX, AV, HD), and position zero contains an asterisk (*).
At this point, the results can be displayed. A scrolling message display reports the test results to
the user. The message is scrolled, one character at a time, starting at the right side of the OCP
and continuing off to the left side of the OCP. The Run switch is used to start and stop the scrolling
display by pressing it once to start the display, and once to stop the display.
All the timing tests use a 1-microsecond clock to calculate seek times. Because of this, the short
seek and head switch times are not as accurate as the long seek times.

(T36) Minimum Seek Timing Test
This test executes the minimum seek timing algorithm and displays the results of the test in the
OCP. Test time is approximately 75 seconds.
The following scrolling message format is used to display test results:
[MIN TIM FWD=xx.xMS]
[MIN TIM REV=xx.xMS]

where xx.x is the seek time (in milliseconds). The minimum seek time is defined as the average
of 2655 single cylinder seeks (forward and reverse). This test uses the default incremental seek
pattern.
NOTE

If the time exceeds 99.9, the decimal point is shifted one digit to the right. The OCP
displays [999].

(T38) Average Seek Timing Test
This test executes the average seek timing algorithm and displays the test results to the OCP. This
test takes 5-7 minutes to complete.
The following message is scrolled across the OCP display:
[AVG TIM FWD=xx.xMS]
[AVG TIM REV=xx.xMS]

where xx.x is the seek time (in milliseconds). The average seek time is defined as the average of
512 one-third-length seeks. For the RA90 disk drive, the seek length is 855 cylinders. For the RA92
disk drive, the seek length is 1035 cylinders.
Average seek time: < 21 milliseconds for RA90.
Average seek time: < 19 milliseconds for RA92.


(T39) Head Switch Timing Test
This test executes the head switch timing algorithm and displays the test results to the OCP.
This test takes approximately 2 seconds to run. The following message is scrolled across the OCP
display:
[HD SWT TIME=xx.xMS]

where xx.x is the head switch time (in milliseconds).
The head switch time is defined as the summation of all possible head switches divided by the total
number of head switches.

(T40) Update Cartridge Utility (Spun Down)
The drive must be spun down to run the Update Cartridge utility.
This internal microcode update utility is used in the field to update the following internal drive
microcode functions:
•

Diagnostics microcode

•

Servo microcode

•

Functional microcode

New microcode is loaded in the following sequence:
1. Load update cartridge into update port.

2. Load test T40. (Drive must be spun down.)
3. Start test T40. The following occurs in the OCP display once this test has begun (S = start,
P = pass, C = complete):
   • [S 40] (2 seconds).

   • [P 1] (20 seconds) Pass one checks PROM to be loaded.

   • [P 2] (20 seconds) Pass two writes the new code into the even pages in EEPROM.

   • [P 3] (20 seconds) Pass three writes the new code into the odd pages in EEPROM.

   • [C 40] (1 second) Update is complete.

   • [WAIT] (10 seconds) Exits test mode and goes through power-up hardcore sequence.

   • [0000] Returns to display the drive unit address.


(T41) Display Error Log Errors
This utility displays the RA90/RA92 drive-resident error log. When initiated, it first verifies the
integrity of the error log by reading the first four bytes of the error log header and comparing them
to expected values. If the compare fails, the utility exits and an error code displays.
The error log is displayed starting with the latest entry first and continuing until all entries are
displayed. Positions three and two represent the error log entry in decimal. Positions one and zero
represent the two-digit LED hex error code. Each entry is displayed for 1.5 seconds with the option
of starting and stopping the display using the Run switch.
NOTE

Null entries are displayed as 00 and should be ignored.
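
As an illustration of the display layout described above, the following Python sketch splits one four-position T41 reading into its entry number and error code. It assumes that positions three and two are the two left-most characters and that a null entry is one whose error code is 00. The function name is hypothetical and is not part of any Digital tool.

```python
def decode_t41_display(display: str) -> dict:
    """Split a four-character T41 reading into entry number and error code."""
    if len(display) != 4:
        raise ValueError("expected exactly four display positions")
    entry = int(display[:2], 10)       # positions three and two: decimal entry
    error_code = int(display[2:], 16)  # positions one and zero: hex error code
    return {
        "entry": entry,
        "error_code": error_code,
        "is_null": error_code == 0,    # assumption: null entries carry code 00
    }


if __name__ == "__main__":
    print(decode_t41_display("19F5"))
```

A reading flagged as null would simply be skipped when transcribing the log by hand.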

4.7.2 Time, Seeks, and Spinups Display Interpretation
The time, seeks, and spinups display utilities all use the following format to display the counts to
the OCP:

[Figure CXO-2146B: OCP display positions OCP 1 through OCP 6]

The following conventions are used:
TM = time
SK = seeks
SP = spinups
OCP 1 contains either TM, SK, or SP, and the binary digits 9 and 8.

OCP 3 contains binary digits 7, 6, 5, and 4.
OCP 5 contains binary digits 3, 2, 1, and 0.
OCP displays 2, 4, and 6 are used as separators to indicate the display is changing.
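
The following Python sketch shows one way to reassemble the count from the three data displays. It assumes each display has been written down as a four-character string (for example "TM00"), and that the ten digits are decimal, as the T42 and T43 descriptions state. The function and dictionary names are hypothetical.

```python
# Labels that OCP 1 may carry, per the conventions listed above.
LABELS = {"TM": "time", "SK": "seeks", "SP": "spinups"}


def assemble_count(ocp1: str, ocp3: str, ocp5: str) -> tuple:
    """Join digits 9-8 (OCP 1), 7-4 (OCP 3), and 3-0 (OCP 5) into one count."""
    label, high = ocp1[:2], ocp1[2:]
    if label not in LABELS:
        raise ValueError(f"unknown label {label!r}")
    digits = high + ocp3 + ocp5
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError("expected ten decimal digits across the three displays")
    return label, int(digits)


if __name__ == "__main__":
    print(assemble_count("TM00", "0004", "2695"))
```

For the seeks utility, the result would still need to be read as thousands of seeks.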

(T42) Display Time Utility
A 10-digit decimal number representing time is displayed when this utility is run. This number is
time, in minutes, since the drive was first powered up. See Section 4.7.2 for display interpretation.

(T43) Display Seeks Utility
When this utility is run, the OCP displays the number of total seeks (times a thousand) since the
drive was first powered up. A 10-digit decimal number is displayed in six segments at the OCP.
Each segment is displayed for 1.5 seconds unless the display is halted by selecting the Run switch.
See Section 4.7.2 for display interpretation.

(T44) Display Spinups Utility
This utility displays the total number of spinups since the drive was first powered up. When
this utility is run, the total number of spinups is displayed on the OCP in six segments. Each
segment is displayed for 1.5 seconds unless the display is halted by selecting the Run switch. See
Section 4.7.2 for display interpretation.

(T45) Drive Revision Level Utility
This utility uses the following mnemonics to display drive component hardware and/or microcode
revisions as follows:
• DRV = Drive hardware revision
• DCD = Drive microcode revision (microcode)
• IOP = Master processor module (hardware)
• SRV = Servo module (hardware)
• PCM = Preamp control module (hardware)
• ORV = Operator control panel (hardware)
• OCD = Operator control panel (microcode)

Running this utility displays the revision level for each module in a scrolling message format across
the OCP. The following scrolling message format is used to display the information to the drive
OCP:
• DRV=www
where www is the decimal hardware revision (0 to 255) of the drive.
• DCD= yyy
where yyy is the decimal revision number (0 to 255) of the combined functional, servo, and
diagnostic microcode. The microcode is loaded from the microcode update cartridge.
NOTE
If a drive microcode revision (in the OCP display) contains an alpha character, for
example, DCD=L200, this signifies unreleased code. The drive microcode should be
updated with a formally released microcode revision.
• IOP= xx
where xx is the decimal revision number (0 to 15) of the appropriate module.
• SRV= xx
• PCM= xx
• ORV= xx
• OCD= z.z
where z.z is the decimal revision number (0.0 to 9.9) for the OCP microcode.
NOTE
If the OCD is displayed as version 5.1 (OCD= 5.1), the drive has an OCP that allows
the alternate unit address display mode to be used. Refer to Chapter 3.

The hardware revision switches (Figure 4-2) provide the subsystem with the ability to determine
base-level module revision compatibilities. The hardware switches are changed only by direction of
a drive ECO/FCO. All ECO and FCO activity will take into account the impact of the changes to
the drive and to the subsystem to which it is attached.

[Figure CXO-2147B: RA90/RA92 drive chassis front, showing the four-position hardware revision DIP switch]

Figure 4-2 Hardware Revision Switches

NOTE
Do not alter these switches in the field unless you are instructed to do so during an
ECO/FCO installation.
The hardware revision switches make up only part of the total reported hardware revision. The
total reported hardware revision is a byte of information determined as shown in Figure 4-3.

[Figure CXO-2716B: bit layout of the hardware revision byte, with the following fields]

HARDWARE REVISION SWITCHES

INDICATES SERVO SYSTEM IMPLEMENTED IS:
00 = DEDICATED (ONLY SERVO SYSTEM)
01 = EMBEDDED (BLEND) SERVO SYSTEM

HDA CONFIGURATION
00 = RA90 LONG-ARM HDA (P/N 70-22951-01)
01 = RA90 SHORT-ARM HDA (P/N 70-27268-01)
10 = RA92 HDA (P/N 70-27492-01)
11 = NOT USED

Figure 4-3 Hardware Revision Byte

(T46) HDA Revision Utility
This utility allows you to display the HDA revision/vendor bits in the OCP display. The first year of
production will reflect HDA revision/vendor bit 0.

[Figure CXO-2148B: OCP display showing VN= followed by the vendor code]

The two left-most places of the OCP display contain a VN for the vendor code. The right-most place
of the OCP display contains a vendor code of 0 through 3. These revision/vendor bits are used to
distinguish the HDA type to the drive microcode. These bits, in conjunction with PCM switches
S1-1 and S1-2, tell the microcode how the servo system should be configured in microcode.

(T47) Display Drive Serial Number Utility
This utility displays the drive serial number to the OCP.
The following message is scrolled (left to right) across the OCP display:
[DRV S/N xxy_zzzz]

where:
- xx is the manufacturing location of the drive (CX=CXO, KB=KBO)
- y is the alphanumeric digit 0-9 or A-Z (G, I, O, Q, and X are not allowed)
- zzzz is 0000-9999
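
The serial number format above can be checked mechanically. The following Python sketch is a minimal validator under several assumptions: the underscore in xxy_zzzz is taken as a literal character, the excluded characters are read as the letters G, I, O, Q, and X, and CX and KB are treated as the only location codes because they are the only ones listed here. The function name is hypothetical.

```python
import re

# xx = location (CX or KB), y = 0-9 or A-Z without G, I, O, Q, X, zzzz = 0000-9999
_SERIAL = re.compile(r"^(CX|KB)([0-9A-FHJ-NPR-WYZ])_([0-9]{4})$")


def parse_serial(text: str) -> tuple:
    """Return (location, y, zzzz) or raise ValueError for a malformed serial."""
    match = _SERIAL.match(text)
    if match is None:
        raise ValueError(f"not a valid drive serial number: {text!r}")
    return match.group(1), match.group(2), match.group(3)


if __name__ == "__main__":
    print(parse_serial("CXA_0123"))
```

Other manufacturing locations, if any exist, would need to be added to the pattern.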

(T50) Error Log Checkpoint Utility
The Error Log Checkpoint utility allows you to enter a checkpoint entry into the internal drive
error log. This is similar to a place marker.

(T53) Clear Seeks Utility
The Clear Seeks utility clears the total number of seeks since the drive was first powered up. Run
this test any time the HDA is replaced.
NOTE

Use a software jumper to execute this utility. This protects the stored information from
unintentional clearing. Refer to Section 4.5.
This test causes [INVL] to be displayed if you fail to use the software jumper.

(T54) Clear Spinups Utility
The Clear Spinups utility clears the total number of spinups since the drive was first powered up.
Run this test any time the HDA is replaced.
NOTE

(T55) Clear DD Bit Utility
The Clear DD Bit utility clears the DD bit set by the diagnostics or a controller.

(T60) Loop-On-Test Utility
This utility enables looping on a test. It can be set to loop on a diagnostic test or a diagnostic
sequence of tests. [LOT] is displayed on the OCP for 1.5 seconds when the loop utility is run. The
utility loops until an error is encountered or until the Test switch on the OCP is selected.

[Figure CXO-2149A: OCP display showing LOT]

(T61) Loop-On-Error Utility
This utility loops continuously on errors encountered during the execution of drive internal
diagnostics. The test loops as long as the error is present. [LOE] is displayed on the OCP for
1.5 seconds when the loop utility is run.

(T62) Loop-Off Utility
The Loop-Off utility terminates all loop-on conditions. [LOF] is displayed on the OCP for 1.5
seconds when this utility is run.
[Figure CXO-2150A: OCP display showing LOF]

The effects of the LOT or LOE utilities may be canceled manually (LOF) or by exiting OCP test
mode and letting the idle loop routine execute at least one time.

(T63) Head Select Utility
The Head Select utility allows you to select or change the head to be tested. When the utility is
first run, the currently selected head number is displayed in decimal (0-12) in the OCP display, and
the least significant digit (LSD) blinks.
The format is as follows:

[Figure CXO-2151A: OCP display showing H= followed by the two-digit head number]

The head number may be changed by selecting the Port B switch to increment the blinking digit.
When the desired head number is displayed in the OCP, pressing the Write Protect switch causes
that head to be selected and the head number to be changed in RAM. If the Test switch is pressed,
the test is aborted and the change does not take place.
The head remains selected until changed by this utility, power-up or reset, I/O processor reset, SDI
INIT, or controller intervention.

(T64) One Seek Utility
The One Seek utility can be used to seek and lock on a cylinder. When run, the following OCP
display is seen:

OCP 1  (CYLINDER 1.5 SEC)
OCP 2  (CYLINDER VALUE: 0-2660 SEC)
CXO-2152A

The right-most digit blinks to indicate cylinder value selection can begin. Selecting the Port A
switch selects the next desired digit position which starts blinking upon selection. Digit position
is from right to left (LSD to MSD). A wrap back to the LSD occurs if the Port A switch is selected
enough times. Selecting the Port B switch increments the blinking digit.

After the cylinder value is set, select the Write Protect switch to cause the heads to position
themselves at the desired cylinder. Selecting the Test switch aborts the process without changing
the cylinder value.
The selected cylinder value is stored in RAM until T64 is run again or a power-up reset, master
processor reset, or SDI INIT occurs.

(T65) Seek Parameter Input Utility
Four seek parameters can be examined or changed when using the seek timing tests T36, T38, and
T39. They are:
• FCY (first cylinder)
• LCY (last cylinder)
• INC (increment)
• DLY (delay)

Seek parameters are changed the same way as the seek utility parameters. Refer to tests T36, T38,
and T39 for a discussion on altering parameters for diagnostics.
The following describes the sequence of events which occur when test T65 is run:
FCY= is the first display seen when this utility is started (Figure 4-4). The first cylinder value
follows 1.5 seconds later. The FCY can be any decimal number between 0 and 2660.

OCP 1  (FIRST CYLINDER VALUE 1.5 SEC)
OCP 2  (DESIRED VALUE: 0-2660 SEC)
CXO-2153A

Figure 4-4 T65 FCY OCP Display

Next, select the Write Protect switch. LCY= is displayed (Figure 4-5). The last cylinder value
follows 1.5 seconds later. The LCY can be any decimal value between 0 and 2660.

OCP 1  (LAST CYLINDER VALUE 1.5 SEC)
OCP 2  (DESIRED VALUE: 0-2660 SEC)
CXO-2154A

Figure 4-5 T65 LCY OCP Display

Select the Write Protect switch again. INC= is displayed (Figure 4-6). The incremental value
follows 1.5 seconds later. The INC value can be any decimal number between 1 and 2660. If a
value of 0 is chosen, the test loops indefinitely.

OCP 1  (CURRENT INCREMENT VALUE 1.5 SEC)
OCP 2  (DESIRED VALUE: 0-2660 SEC)
CXO-2155A

Figure 4-6 T65 INC OCP Display

Select the Write Protect switch and DLY= is displayed (Figure 4-7). The delay value between seeks
is displayed 1.5 seconds later. A delay value can be between 0 and 2999 milliseconds.

OCP 1  (CURRENT DELAY VALUE 1.5 SEC)
OCP 2  (DESIRED VALUE: 0-2999 SEC)
CXO-2156A

Figure 4-7 T65 DLY OCP Display

The seek parameters remain changed until this utility is run again or a power-up reset, I/O
processor reset, or SDI INIT occurs.
NOTE

T65 does not check for out-of-range values. Do not exceed the maximum specified input
values. Also, the last cylinder parameter must always be equal to or greater than the
first cylinder parameter. If an invalid cylinder value is entered, a (servo) seek failed
error (F5) occurs.
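
Because T65 does not range-check its inputs, it can help to check a planned set of values before entering them. The following Python sketch applies the limits stated above (FCY and LCY 0 to 2660, INC 1 to 2660, DLY 0 to 2999 milliseconds, LCY not less than FCY). It is an illustrative aid only; the function name and message wording are hypothetical.

```python
MAX_CYLINDER = 2660   # highest FCY/LCY/INC value given for T65
MAX_DELAY_MS = 2999   # highest DLY value given for T65


def check_seek_parameters(fcy: int, lcy: int, inc: int, dly: int) -> list:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    if not 0 <= fcy <= MAX_CYLINDER:
        problems.append("FCY must be between 0 and 2660")
    if not 0 <= lcy <= MAX_CYLINDER:
        problems.append("LCY must be between 0 and 2660")
    if lcy < fcy:
        problems.append("LCY must be equal to or greater than FCY")
    if inc == 0:
        problems.append("INC of 0 makes the test loop indefinitely")
    elif not 1 <= inc <= MAX_CYLINDER:
        problems.append("INC must be between 1 and 2660")
    if not 0 <= dly <= MAX_DELAY_MS:
        problems.append("DLY must be between 0 and 2999 milliseconds")
    return problems


if __name__ == "__main__":
    print(check_seek_parameters(fcy=100, lcy=50, inc=1, dly=0))
```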

(T66) Variable Average Seek Timing Test
This test executes the average seek timing algorithm and allows you to time any length seek. To
set the seek length, modify the first (FCY) and last (LCY) cylinder addresses through the seek
parameter input utility (T65).
The run time for this test varies, depending on the length of the seek used. The run time should
not take longer than 45 seconds, regardless of the length of the seek.
The following message is scrolled across the OCP display:
[AVG TIM FWD=xx.xMS]
[AVG TIM REV=xx.xMS]

where xx.x is the seek time (in milliseconds). The variable average seek time is defined as the
average (AVG) of 512 seeks in forward (FWD) and reverse (REV) directions.

Troubleshooting and Error Codes

5.1 Troubleshooting Reference Material

When running diagnostics and interpreting error logs, you will need the documents listed
(alphabetically) in Table 5-1.
Table 5-1 Reference Material for Troubleshooting

Document Title                          Order Number
DSA Error Log Manual                    EK-DSAEL-MN
DSA Error Log Pocket Service Guide      EK-DSAEL-PG
Getting Started With VAXsimPLUS         AA-KN79A-TE
HSC Service Manual                      EK-HSCMA-SV
VAXsimPLUS Field Service Manual         AA-KN82A-RE
VAXsimPLUS User Guide                   AA-KN80A-TE

Refer to Section 5.19 for RA90/RA92 disk drive error codes and descriptions.

5.1.1 Customer Support Training for the RA90/RA92 Disk Drive
You must have the proper training to efficiently support the RA disk family. This training is
available at most Customer Services Training Centers, category A and B sites. Consult with your
Customer Services unit managers for training information.
DSA Level I and HSC Level I courses are prerequisites to the RA90 IVIS training.
Although support organizations are available to assist in problem solving, there is no substitute for
proper training. Support training resources include DSA Level II and DSA Troubleshooting courses,
and the RA90 Disk Drive Technical Description Manual.

5.2 RA90/RA92 Troubleshooting Aids
The following aids are available for disk drive troubleshooting:
• VAXsimPLUS (VMS systems) (see Section 5.2.1)
• Host error logs (see Section 5.2.2)
• Drive internal error log (see Section 5.2.4)
• Operator control panel (OCP) fault indicator/error codes (see Section 5.2.5)
• Drive power supply indicator (see Section 5.2.6)
• Drive error reporting mechanisms (see Section 5.2.7)
• Host-level diagnostics/utilities (see Section 5.2.8)

5.2.1 VAXsimPLUS
The VAX System Integrity Monitor (VAXsimPLUS) provides access to VMS error log data. The
three VAXsimPLUS manuals needed to operate VAXsimPLUS effectively are listed in Section 5.1.

5.2.2 Host Error Logs
Refer to the appropriate system error logs for error interpretation. The DSA Error Log Manual and
the DSA Error Log Pocket Service Guide contain system error log descriptions for most operating
systems.

5.2.3 Extended Status Bytes
Extended status bytes are part of the response to the SDI GET STATUS/TOPOLOGY command or
any unsuccessful response to a level 2 command. These bytes are passed through the controller
to the host for error logging purposes. Figure 5-1 shows a breakdown of the RA90/RA92 extended
drive status bytes. Extended status bytes are described in detail in the sections that follow.

BYTE 01   RESPONSE OPCODE
BYTE 02   UNIT NUMBER LOW BYTE
BYTE 03   SUBUNIT MASK
BYTE 04   REQUEST BYTE              GENERIC DRIVE STATUS BYTE
BYTE 05   MODE BYTE                 GENERIC DRIVE STATUS BYTE
BYTE 06   ERROR BYTE                GENERIC DRIVE STATUS BYTE
BYTE 07   CONTROLLER BYTE           GENERIC DRIVE STATUS BYTE
BYTE 08   RETRY COUNT
BYTE 09   PREVIOUS CMD OPCODE       EXTENDED DRIVE STATUS BYTE
BYTE 10   HDA REVISION BITS         EXTENDED DRIVE STATUS BYTE
BYTE 11   CYLINDER ADDR (LO)        EXTENDED DRIVE STATUS BYTE
BYTE 12   CYLINDER ADDR (HI)        EXTENDED DRIVE STATUS BYTE
BYTE 13   RECOVERY LVL | GROUP NO
BYTE 14   ERROR CODE                EXTENDED DRIVE STATUS BYTE
BYTE 15   MFG FAULT CODE            EXTENDED DRIVE STATUS BYTE
CXO-2157B

Figure 5-1 RA90/RA92 Extended Drive Status Bytes

5.2.3.1 Response Opcode (Byte 1)

Response Opcode (Byte 1) is the drive-to-controller response opcode and indicates the success or
failure of the previous controller-to-drive command. Generally, this is transparent to the user.
5.2.3.2 Unit Number Low Byte (Byte 2) and Subunit Mask (Byte 3)

[Figure CXO-3017A: bit layout of bytes 2 and 3]

BYTE 2   UNIT NUMBER LOW BYTE (OF SUBUNIT REPORTING THIS STATUS)
BYTE 3   SUBUNIT 0 MASK
         SUBUNIT 1 MASK (NOT USED)
         SUBUNIT 2 MASK (NOT USED)
         SUBUNIT 3 MASK (NOT USED)

5.2.3.3 Request Byte (Byte 4)

BYTE 4   REQUEST BYTE

(RU) 0 = RUN/STOP SWITCH OUT
     1 = RUN/STOP SWITCH IN
(PS) 0 = PORT SWITCH OUT
     1 = PORT SWITCH IN
(PB) 0 = PORT A RECEIVERS ENABLED
     1 = PORT B RECEIVERS ENABLED
(EL) 0 = NO LOGGABLE INFORMATION IN EXTENDED STATUS AREA
     1 = LOGGABLE INFORMATION IN EXTENDED STATUS AREA
(SR) 0 = SPINDLE NOT READY (NOT UP TO SPEED)
     1 = SPINDLE READY
(DR) 0 = NO DIAGNOSTIC IS BEING REQUESTED FROM THE HOST
     1 = THERE IS A REQUEST FOR A DIAGNOSTIC TO BE
         LOADED INTO THE DRIVE MICROPROCESSOR MEMORY
(RR) 0 = DRIVE REQUIRES NO RECALIBRATE COMMAND
     1 = DRIVE REQUESTS RECALIBRATE COMMAND
(OA) 0 = DRIVE ON LINE OR AVAILABLE TO CURRENT CONTROLLER
     1 = DRIVE UNAVAILABLE (IT IS ALREADY ON LINE TO ANOTHER
         CONTROLLER)
CXO-1281A
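
When a request byte is read from an error log as a hex value, the individual flags can be picked out with a small helper. The following Python sketch assumes the figure lists the flags from the least significant bit (RU, bit 0) to the most significant bit (OA, bit 7). That ordering is an inference from the figure layout and should be confirmed against the SDI documentation before relying on it. The names used are hypothetical.

```python
# Assumed bit order, least significant first, following the figure's listing.
REQUEST_BITS = ("RU", "PS", "PB", "EL", "SR", "DR", "RR", "OA")


def decode_request_byte(value: int) -> dict:
    """Map each request byte flag name to True (bit set) or False (bit clear)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("request byte must fit in eight bits")
    return {name: bool(value >> bit & 1) for bit, name in enumerate(REQUEST_BITS)}


if __name__ == "__main__":
    flags = decode_request_byte(0x11)
    print([name for name, is_set in flags.items() if is_set])
```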

5.2.3.4 Mode Byte (Byte 5)

BYTE 5   MODE BYTE

(S7)  0 = 512-BYTE SECTOR FORMAT (16-BIT)
      1 = 576-BYTE SECTOR FORMAT (18-BIT)
      (NO CURRENT PLAN TO IMPLEMENT 18-BIT)
(DB)  0 = DBN AREA ACCESS DISABLED
      1 = DBN AREA ACCESS ENABLED
(FO)  0 = FORMATTING OPERATIONS DISABLED
      1 = FORMATTING OPERATIONS ENABLED
(DD)  0 = DRIVE ENABLED BY CONTROLLER ERROR ROUTINE
          OR DIAGNOSTIC
      1 = DRIVE DISABLED BY CONTROLLER ERROR ROUTINE
          OR DIAGNOSTIC (FAULT LIGHT = ON)
(W1)  0 = WRITE-PROTECT SWITCH FOR SUBUNIT 0 IS OUT
      1 = WRITE-PROTECT SWITCH FOR SUBUNIT 0 IS IN
(W2)  NOT IMPLEMENTED
(ED1) ERROR LOG DISABLE (SET BY TWO-BOARD CONTROLLER DIAGNOSTICS)
(ED0) ERROR LOG DISABLE (SET BY TWO-BOARD CONTROLLER DIAGNOSTICS)
CXO-2193A

Bits ED1 and ED0 can only be set by two-board controller diagnostics. If either ED1 or ED0 is set
(EDx=1), the RA90/RA92 disk drive turns off internal error logging.
5.2.3.5 Error Byte (Byte 6)

BYTE 6   ERROR BYTE

(WE) 0 = NO ERROR
     1 = WRITE LOCK ERROR (ATTEMPT TO WRITE WHILE
         WRITE-PROTECTED)
NOT USED
(DF) 0 = NO ERROR
     1 = DRIVE FAILURE DURING INIT
(PE) 0 = NO ERROR
     1 = LEVEL 2 PROTOCOL ERROR (IMPROPER COMMAND
         CODES OR PARAMETERS ISSUED TO DRIVE)
(RE) 0 = NO ERROR
     1 = SDI RECEIVE ERROR ON SDI TRANSMISSION
         LINE(S) FROM CONTROLLER
(DE) 0 = NO ERROR
     1 = DRIVE ERROR (DRIVE FAULT LIGHT MAY BE ON;
         CAN BE CLEARED VIA DRIVE CLEAR COMMAND)
CXO-1283C

The error byte is one of four generic status bytes. Error bits in the error byte are set by the drive
for drive-detected errors. The controller clears the bits with the SDI DRIVE CLEAR command. The
bits are described as follows:
• The DE bit reports any internal drive error that requires explicit controller recovery action
other than simple command retransmission or context readjustment.
• The RE bit reports transmission errors detected by the drive.
• The PE bit reports level 2 protocol errors detected by the drive.
• The DF bit indicates the drive did not pass its initialization/diagnostics the last time it was
initialized or powered up.
• The WE bit reports the drive received a SELECT TRACK AND WRITE command or a FORMAT
command while the drive was write protected.

NOTE

Drive-detected errors fit into one of the five classes described above and are reported as
such.
Controller-detected drive errors are logged without any of these bits being set. For
example, the drive actuator has positioned itself to a cylinder other than the one the
controller requested. The controller detects this failure as a drive positioner error or an
invalid header error.
5.2.3.6 Controller Byte (Byte 7)

BYTE 7   CONTROLLER BYTE

0000 = NORMAL DRIVE OPERATION
1000 = DRIVE IS OFF LINE AND UNDER CONTROL
       OF A DIAGNOSTIC
1001 = DRIVE IS OFF LINE DUE TO ANOTHER DRIVE
       HAVING THE SAME UNIT SELECT IDENTIFIER
(S1) 1 = NOT USED
(S2) 1 = NOT USED
(S3) 1 = NOT USED
(S4) 1 = NOT USED
CXO-2158A

5.2.3.7 Retry Count (Byte 8)

Byte 8 is the retry count during the last SEEK or RECALIBRATION command. (The retry count is
the number of times the command was retried, internal to the drive, in an attempt to successfully
complete the SEEK or RECALIBRATE operation.)

5.2.3.8 Previous Command Opcode (Byte 9)

BYTE 9   LAST OPCODE (EXTENDED DRIVE STATUS BYTE)

OPCODE OF THE LAST PREVIOUS LEVEL 2 DRIVE COMMAND
DECODED BY THE DRIVE (RECEIVED FROM THE SDI CONTROLLER)
81 = CHANGE MODE
82 = CHANGE CONTROLLER FLAGS
03 = DIAGNOSE
84 = DISCONNECT (DRIVE)
05 = DRIVE CLEAR
06 = ERROR RECOVERY
87 = GET COMMON CHARACTERISTICS
88 = GET SUBUNIT CHARACTERISTICS
0A = INITIATE SEEK
8B = ON LINE
0C = RUN
8D = READ MEMORY
8E = RECALIBRATE
90 = TOPOLOGY
0F = WRITE MEMORY
FF = SELECT GROUP (LEVEL 1 COMMAND - PROCESSED BY FIRMWARE
     SEEK HEAD SELECT SUBROUTINES)
CXO-1285B
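
The opcode list above lends itself to a simple lookup when reading byte 9 from an error log. The following Python sketch is illustrative only. It takes READ MEMORY as 8D on the assumption that the top bit of each opcode is an even-parity bit, which is what the other listed values suggest. The table and function names are hypothetical.

```python
# Opcode values as listed for byte 9. READ MEMORY is assumed to be 0x8D.
LEVEL2_OPCODES = {
    0x81: "CHANGE MODE",
    0x82: "CHANGE CONTROLLER FLAGS",
    0x03: "DIAGNOSE",
    0x84: "DISCONNECT (DRIVE)",
    0x05: "DRIVE CLEAR",
    0x06: "ERROR RECOVERY",
    0x87: "GET COMMON CHARACTERISTICS",
    0x88: "GET SUBUNIT CHARACTERISTICS",
    0x0A: "INITIATE SEEK",
    0x8B: "ON LINE",
    0x0C: "RUN",
    0x8D: "READ MEMORY",
    0x8E: "RECALIBRATE",
    0x90: "TOPOLOGY",
    0x0F: "WRITE MEMORY",
    0xFF: "SELECT GROUP (LEVEL 1 COMMAND)",
}


def has_even_parity(opcode: int) -> bool:
    """True when the eight-bit opcode contains an even number of 1 bits."""
    return bin(opcode & 0xFF).count("1") % 2 == 0


def describe_previous_opcode(byte9: int) -> str:
    """Return the command name for byte 9, or a marker for unlisted values."""
    return LEVEL2_OPCODES.get(byte9, f"unlisted opcode {byte9:02X}")


if __name__ == "__main__":
    print(describe_previous_opcode(0x0A))
```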

5.2.3.9 HDA Revision Bits (Byte 10)

Byte 10, bits 0 and 1, indicate which vendor heads are used in the HDA. Bit 7 is the
UNCALIBRATED bit and indicates the drive failed during drive recalibration.
5.2.3.10 Cylinder Address (Bytes 11 and 12)

Decoding bytes 11 and 12 gives you the cylinder address from the last SDI SEEK command issued
to the drive. See Examples 5-1 (for the RA90 disk drive) and 5-2 (for the RA92 disk drive) to
determine cylinder address and group (head).
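
Combining the two bytes is a matter of treating byte 12 as the high-order half and byte 11 as the low-order half of a 16-bit value, as labeled in Figure 5-1. A minimal Python sketch, with a hypothetical function name:

```python
def cylinder_from_status(byte11_lo: int, byte12_hi: int) -> int:
    """Combine CYLINDER ADDR (LO) and CYLINDER ADDR (HI) into one cylinder number."""
    for value in (byte11_lo, byte12_hi):
        if not 0 <= value <= 0xFF:
            raise ValueError("each status byte must fit in eight bits")
    return (byte12_hi << 8) | byte11_lo


if __name__ == "__main__":
    # Bytes 64 (LO) and 0A (HI) combine to hex 0A64.
    print(cylinder_from_status(0x64, 0x0A))
```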

The RA90 implements the following geometry for logical
addressing:
The RA90 has 1 logical track = 1 physical track
The RA90 has 1 logical group = 1 logical track
The RA90 has 1 logical cylinder = 13 logical groups

The current cylinder address and current group bytes indicate the
cylinder address and group where the read/write heads are
positioned. The following formula outlines how to obtain
the cylinder and head from the logical block number (LBN).

Cylinder (cyl) = LBN/897 = cyl.fraction (discard fraction)
Head = (LBN - (cyl * 897))/69 = head.fraction (discard fraction)

LBN to physical cylinder and head number conversion:

If LBN = 23609
Then 23609/897 = 26.32 (discard fraction)
CYL = 26

To find the head, use the following example:
Head = (23609 - (26 * 897))/69
Head = 4.16 (discard fraction)
Head = 4

As you can see LBN 23609 = head 4 and physical cylinder 26.

DBN to physical cylinder and track (head on RA90 disk drives)
conversion:
CYL = 2654 + DBN/910 = cylinder.fraction (discard fraction)
Head = (DBN - ((CYL - 2654) * 910))/70 = head.fraction (discard fraction)

XBN to physical cylinder and head conversion:
CYL = 2651 + XBN/910 = cylinder.fraction (discard fraction)
Head = (XBN - ((CYL - 2651) * 910))/70 = head.fraction (discard fraction)

RBN: to convert an RBN to the associated physical
cylinder and head, use the following formula:
CYL = RBN/13 = cylinder.fraction (discard fraction)
Head = RBN - (CYL * 13)

Example 5-1 RA90 Cylinder Address and Group (Head)

The RA92 implements the following geometry for logical
addressing:
The RA92 has 1 logical track = 1 physical track
The RA92 has 1 logical group = 1 logical track
The RA92 has 1 logical cylinder = 13 logical groups

The current cylinder address and current group bytes indicate the
cylinder address and group where the read/write heads are
positioned. The following formula outlines how to obtain the
cylinder and head from the logical block number (LBN).

Cylinder (cyl) = LBN/949 = cyl.fraction (discard fraction)
Head = (LBN - (cyl * 949))/73 = head.fraction (discard fraction)

LBN to physical cylinder and head number conversion:

If LBN = 23609
Then 23609/949 = 24.88 (discard fraction)
CYL = 24

To find the head, use the following example:
Head = (23609 - (24 * 949))/73
Head = 11.411 (discard fraction)
Head = 11

As you can see LBN 23609 = head 11 and physical cylinder 24.

DBN to physical cylinder and track (head on RA90 disk drives)
conversion:
CYL = 3104 + DBN/962 = cylinder.fraction (discard fraction)
Head = (DBN - ((CYL - 3104) * 962))/74 = head.fraction (discard fraction)

XBN to physical cylinder and head conversion:
CYL = 3101 + XBN/962 = cylinder.fraction (discard fraction)
Head = (XBN - ((CYL - 3101) * 962))/74 = head.fraction (discard fraction)

RBN: to convert an RBN to the associated physical
cylinder and head, use the following formula:
CYL = RBN/13 = cylinder.fraction (discard fraction)
Head = RBN - (CYL * 13)

Example 5-2 RA92 Cylinder Address and Group (Head)
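
The LBN conversions in Examples 5-1 and 5-2 use the same arithmetic with different constants (897 and 69 for the RA90, 949 and 73 for the RA92). The following Python sketch restates that arithmetic with integer division standing in for "discard fraction". The Geometry type and function name are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    lbns_per_track: int   # divisor used for the head calculation
    tracks_per_cyl: int   # 13 logical groups per logical cylinder


RA90 = Geometry(lbns_per_track=69, tracks_per_cyl=13)   # 69 * 13 = 897
RA92 = Geometry(lbns_per_track=73, tracks_per_cyl=13)   # 73 * 13 = 949


def lbn_to_cyl_head(lbn: int, geometry: Geometry) -> tuple:
    """Return (cylinder, head) for an LBN, discarding fractions at each step."""
    lbns_per_cyl = geometry.lbns_per_track * geometry.tracks_per_cyl
    cyl = lbn // lbns_per_cyl
    head = (lbn - cyl * lbns_per_cyl) // geometry.lbns_per_track
    return cyl, head


if __name__ == "__main__":
    print(lbn_to_cyl_head(23609, RA90))   # worked value from Example 5-1
    print(lbn_to_cyl_head(23609, RA92))   # worked value from Example 5-2
```

Run against LBN 23609, the sketch reproduces the worked results in both examples: cylinder 26, head 4 on the RA90, and cylinder 24, head 11 on the RA92.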

5.2.3.11 Error Recovery Level (Selected Group) (Byte 13)

BYTE 13   ERROR RECOVERY LEVEL (SELECTED GROUP)

GROUP NUMBER FOR LAST GROUP SELECT
COMMAND, OR LAST SUCCESSFUL GROUP
SELECT DURING A SEEK COMMAND
(R/W HEAD NUMBER)

CURRENT ERROR RECOVERY LEVEL
CXO-2159A

5.2.3.12 Error Code (Byte 14)

Refer to Section 5.19 for drive error codes and their descriptions.
5.2.3.13 Manufacturing Fault Code (Byte 15)

Byte 15 contains the manufacturing repair code and is used by the repair depot.

5.2.4 Drive Internal Error Log
All drive-detected disk subsystem errors are recorded in the RA90/RA92 drive internal error log.
Power-related errors are also recorded. ECC errors are not recorded in the drive internal error log.
Figure 5-2 shows the RA90/RA92 drive internal error log memory layout; Figure 5-3 shows the
RA90/RA92 drive internal error log header format; and Figure 5-4 shows the RA90/RA92 drive
internal error log descriptor format.
There are three ways to extract the RA90/RA92 drive internal error log:
1. Run DKUTIL from the HSC console or KDM controller (see Section 5.2.4.1).

2. Run utilities for two-board controllers. (Table 5-2 lists the systems that use two-board
controllers.)

3. Run drive-resident utility T41 from the RA90/RA92 OCP (see Section 5.2.4.2).
Table 5-2 Two-Board Controller Diagnostics

Monitor   KDA/KDB/UDA
XXDP      ZUDM
VDS       EVRLL
MDM       Test drive internal error log utility at the device utility menu

LABEL     ADDRESS   BYTE WIDE MEMORY
LOGBUF    0A006H    START OF ERROR LOG HEADER
SAVESET   0A010H    START OF POWER DOWN PAGE; FIRST 8 BYTES ARE DRIVE GENERIC
SAVEO     0A018H    SECOND 8 BYTES ARE DRIVE SPECIFIC
          0A025H    LAST BYTE OF HEADER
          0A026H    UNUSED
          0A02FH    UNUSED
DSCBEG    0A030H    START OF ERROR LOG DESCRIPTORS
          0A42FH    LAST BYTE OF LAST DESCRIPTOR
DSCEND    0A430H    END OF DESCRIPTOR MARKER; FROM HERE ON EEPROM IS NOT
                    USED FOR ERROR LOG
CXO-2162A

Figure 5-2 RA90/RA92 Drive Internal Error Log Memory Layout

LOGBUF (ADDRESS LABEL) = 0A006H

WORD 00   FFFB
WORD 01   SIZE
WORD 02   DEVICE TYPE | ERRORLOG SIZE
WORD 03   LO ORDER SEEKS SINCE LAST POWERUP
WORD 04   HI ORDER SEEKS SINCE LAST POWERUP

SAVESET (ADDRESS LABEL) = 0A010H   (POWER DOWN DATA*)

WORD 05   LO ORDER CUMULATIVE SEEKS
WORD 06   HI ORDER CUMULATIVE SEEKS
WORD 07   LO ORDER TOTAL ELAPSED TIME (MIN)
WORD 08   HI ORDER TOTAL ELAPSED TIME (MIN)
WORD 09   OCP SWITCH STATUS | UNIT NUMBER ONES DIGIT
WORD 10   UNIT NUMBER TENS DIGIT | UNIT NUMBER 100S DIGIT
WORD 11   UNIT NUMBER 1000 DIGIT | S.SA2 STATUS BYTE
WORD 12   CUMULATIVE NUMBER OF SPINUPS
WORD 13   NOT USED = 0000H
WORD 14   BAD ERROR LOG FLAG | FAULT TABLE POINTER
WORD 15   POINTER TO DESCRIPTOR ENTRY THAT FAILED

*MUST BE SAVED AT AN EEPROM PAGE BOUNDARY (XXX0H).
CXO-2160A

Figure 5-3 RA90/RA92 Drive Internal Error Log Header Format

DSCBEG (ADDRESS LABEL) = 0A030H

WORD 00   ERROR TYPE | ERROR CODE
WORD 01   FRU/DRU NUMBER | NUMBER OF ASCII BYTES
WORD 02   LO NUMBER OF SEEKS AT TIME OF ERROR
WORD 03   HI NUMBER OF SEEKS AT TIME OF ERROR
WORD 04   ENTRY WRITE COUNT
WORD 05   NUMBER OF SPINUPS SINCE FIRST POWERUP
WORD 06   CURR GROUP | ERR RCVRY LVL | TBD
WORD 07   DESIRED CYLINDER
WORD 08   LO ORDER TOTAL ELAPSED TIME (MIN)
WORD 09   HI ORDER TOTAL ELAPSED TIME (MIN)
WORD 10   ASCII BYTE
WORD 11   ASCII BYTE
WORD 12   ASCII BYTE
WORD 13   ASCII BYTE
WORD 14   ASCII BYTE
WORD 15   ASCII BYTE

(The figure brackets the words into DRIVE GENERIC INFORMATION and DRIVE SPECIFIC
INFORMATION groups.)
CXO-2161A

Figure 5-4 RA90/RA92 Drive Internal Error Log Descriptor Format

5.2.4.1 Running DKUTIL From the HSC Console or KDM70 Controller

Running DKUTIL from the HSC console controller dumps the drive internal error log to the HSC
console. The same capability exists for the KDM70 controller.

To display the drive internal error log, enter the DISPLAY ERROR command at the HSC prompt
(see the example below).
First do:
DKUTIL> GET Dxxxx (If Drive is capable of being put on line)
OR
DKUTIL> GET Dxxxx/NOONLINE (If Drive is incapable of being put on line)
THEN
DKUTIL> DISPLAY ERROR

Figure 5-5 shows an example of a formatted drive internal error log. The data in this example will
help you determine the time elapsed since a failure occurred.

ERROR LOG ENTRIES FOR DRIVE 0090
SELECT STARTING ENTRY LOCATION (1-32) [20]?
ENTER HOW MANY ERROR LOG ENTRIES TO DISPLAY (0-32) [32]?
PAUSE AND PROMPT AFTER 10 ERROR LOG ENTRIES [(Y),N]? Y

[Figure CXO-2994A: formatted DKUTIL error log listing]

Header fields: DRIVE TYPE (RA90), MAX#ENTRYS, SEEKS/POWER ON, CUM. SEEKS,
CUMULATIVE POWER-ON MINUTES (D) 0000042695, (H) 0000A6C7*

ENTRY   ENTRY   ERR   ERR    DRIVE ERR
LOCTN   COUNT   TYP   CODE   MESSAGE
(D)     (D)     (A)   (H)    (A)
20      4       PE    2B     inv.dmr.num.
19      4       DE    F5     dsp.sek.flt.
18      4       RE    07     frm.seq.err.

Each entry also lists SEEK COUNT (D), MFG CODE (H), and DRIVE SPECIFIC HEX DATA,
BYTE 0-9, RIGHT TO LEFT (H). The callouts on the hex data are TIME**, CYL,
HEAD/GROUP, ERROR REC LEVEL, and SPIN-UPS SINCE FIRST POWER-UP.

(D) = decimal
(A) = ASCII
(H) = hex

              *  0000A6C7 (H) CUMULATIVE POWER-ON MINUTES
(SUBTRACT) - **  00003F1C (H) LEFT-MOST FOUR "TIME" BYTES
(EQUALS)   =     000067AB (H) TIME LAPSE SINCE LAST ERROR
                 (D) = 26,539 MINUTES

CONVERT HEX TIME LAPSE TO DECIMAL MINUTES, THEN
CONVERT TO HOURS, THEN
CONVERT TO DAYS.

Figure 5-5 Drive Internal Error Log
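
The elapsed-time arithmetic in Figure 5-5 can be sketched in Python as follows. The two hex strings are the values shown in the figure. The function names are hypothetical and the sketch is only an aid to the hand calculation.

```python
def minutes_since_error(cumulative_hex: str, error_time_hex: str) -> int:
    """Subtract the entry's time bytes from the cumulative power-on minutes."""
    lapse = int(cumulative_hex, 16) - int(error_time_hex, 16)
    if lapse < 0:
        raise ValueError("error time is later than cumulative power-on time")
    return lapse


def minutes_to_hours_days(minutes: int) -> tuple:
    """Convert minutes to (hours, days) as floating-point values."""
    return minutes / 60, minutes / (60 * 24)


if __name__ == "__main__":
    lapse = minutes_since_error("0000A6C7", "00003F1C")
    hours, days = minutes_to_hours_days(lapse)
    print(lapse, round(hours, 1), round(days, 1))
```

For the figure's values the lapse is 26,539 minutes, which is roughly 442 hours or a little over 18 days of power-on time.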

The ten bytes of drive-specific hex data printed by the DKUTIL utility are divided by the
RA90/RA92 into five data fields. The drive-specific hex data fields are:
1. Time (minutes)

2. Cylinder
3. Head/group
4. Undefined
5. Spinups since the last power-up
NOTE

All five data fields represent the drive state at the time of the error.

5.2.4.2 Running the Drive-Resident Utility Dump (T41) From the OCP

Run drive-resident utility T41 to display the drive internal error log. (Refer to Chapter 4 for
instructions on how to run this utility.) The drive internal error log is displayed starting with the
latest entry and continuing until all entries are displayed. Positions three and two represent the
error log entry in decimal. Positions one and zero represent the two-digit LED hex error code. Each
entry is displayed for 1.5 seconds. You can start or stop the display using the Run switch.

5.2.5 OCP Fault Indicator/Error Codes
The OCP Fault indicator lights when a hard fault is detected. Select the Fault switch to display
an error code. These error codes are described in Section 5.19. Each description includes fault
isolation information.

5.2.6 Drive Power Supply Indicator
The drive power supply has a green LED that, when lit, indicates the power supply is operating
normally. If the LED is not lit and the drive is experiencing problems, begin troubleshooting in this
area. Figure 5-6 shows the location of the green LED.

Labels shown in the figure: DRIVE REAR, CIRCUIT BREAKER, GREEN LED (POWER OK)
CXO-2134B

Figure 5-6 Power Supply Indicators

If the green LED appears to be at about half brilliance and the OCP has no display, the power
supply is in a crow-bar state. Recycling the circuit breaker may clear the condition.

5.2.7 Drive Error Reporting Mechanisms

The RA90/RA92 detects and reports the majority of real-time errors and faults in the drive,
including intermittent failures.
All drive-detected errors are reported to the controller. If error logging is available and enabled, the
controller reports errors to the host.
5.2.7.1 Detailed Description of Error Reporting Mechanisms

RA90/RA92 disk drives have five mechanisms available to report error conditions to the controller.
The mechanism used is based on the state of the drive, the drive activity at the time of the error,
and the error that occurred. The five mechanisms are listed below. As described in this list, it is
assumed that a port or ports have been selected from the OCP port select switches.
1. STOP TRANSMITTING CLOCKS AND DATA OVER ANY SDI LINE-The drive stops
transmitting clocks and data over any SDI line connected to either port if any of the following
conditions exist:
• The drive is off line to the controller.
• Power is failing.
• A failure is detected that prevents communication between the drive and the controller.

2. TRANSMIT CLOCKS BUT NO STATE INFORMATION-The drive transmits drive clock
but does not transmit state (RTDS) information if it is off line to the controller or if it failed
resident diagnostics. The only time a drive executes resident diagnostics is at power-up or reset
and when an SDI INIT is received by the real time controller state (RTCS) line. If a drive
receives an SDI INIT, it executes resident diagnostics verifying processor and communications
paths to the controller.

3. ASSERT ATTENTION IN THE RTDS-The drive uses the RTDS attention mechanism to
report error conditions if the drive is on line to the controller. The RTDS attention mechanism
is used when the command timer expires or when one of the generic status bits changes, with
the following exceptions: when a generic status bit changes as a result of a correct operation
during an SDI level 2 command, or when an error occurs in an SDI level 2 command.

4. SEND UNSUCCESSFUL RESPONSE-An unsuccessful response to an SDI level 2 command
is sent to the controller if any of the following conditions exist:
• The execution of an SDI level 2 command could not be completed successfully. (For example,
a level 2 DRIVE CLEAR command was issued but the error condition could not be cleared.)
• A transmission error occurred during an SDI level 1 exchange and the drive successfully
received a valid SDI level 1 end frame.
• A protocol error occurred.
• A fault occurred while the drive was executing a topology command.

5. CONTROLLER RESPONSE TIMEOUT-This is not a drive mechanism, but it indicates to
the controller that the drive has an error condition.

5.2.8 Host-Level Diagnostics and Utilities
If possible, avoid running host-level diagnostics to recreate the symptoms. You only extend the
service period. However, under certain conditions you may need to run host-level diagnostics. Refer
to Section 5.11.

Do not use host-level diagnostics to verify drive repair; use resident diagnostics tests. Use system-level commands to ensure the drive is on line and operating normally.

5.3 General Troubleshooting Information
The drive internal error log records all drive-detected (DD) faults as error codes. Use the recorded
error codes to help isolate faults to a failing or failed FRU. Run the RA90/RA92 disk drive utility
program T41, Display Drive Error Log, to extract drive internal error log information.
Real-time faults detected by the disk subsystem are recorded in the host error log of the supporting
operating system software. Host error logs contain detailed information on intermittent and hard
drive errors and can also be used to isolate the failing field replaceable unit (FRU).
ECC-type errors are detected by controllers and logged in the host (or HSC) level error logs. These
errors are not recorded in the drive internal error log. The drive only reports drive-detected errors.
Once a disk drive fault has been isolated to an FRU and repairs have been made, use drive-resident
diagnostics to verify proper drive operation.

5.3.1 Drive-Resident Diagnostics Limitations
The following disk functions or areas are not covered by resident diagnostic testing:
1. Customer data areas (are never read or written to during testing).
2. Data paths between the drive and controller.
3. Internal loopback testing (only tests the SDI loopback through the TSID gate array). External
SDI testing can be accomplished with resident diagnostic T09 and use of a loopback connector
(Digital part number 7~19074-01).
"At-speed" testing of the SDI circuitry is not done. SDI interface testing is accomplished by
internally looping the SDI signals within the SDI gate array and TSID. Transformer couplings
are not tested.
If you suspect media, go to Section 5.8.
Drive-resident diagnostics descriptions are in Chapter 4.

5.4 Step-by-Step Troubleshooting Procedure
Use this troubleshooting procedure when you are reasonably certain the problem is in a disk drive.
Some troubleshooting procedures may require that you follow the entire procedure before isolating
the problem. If you have an error code, go to Section 5.19 for a description of the error and an FRU
replacement list.
Included in this section is a step-by-step troubleshooting flowchart (Figure 5-7). Each section
heading that follows this flowchart contains a number, enclosed within a box, that corresponds to
those in the step-by-step troubleshooting flowchart.

Figure 5-7 Step-by-Step Troubleshooting Flowchart (CXO-2163C, Sheets 1 through 6)

Box labels, by sheet:

Sheet 1: START; 1. IDENTIFY PROBLEM DRIVE; decision steps 1.1, 1.2, 1.3, 1.4, 1.5; 1.6 OTHER
MEANS; CONSIDER REMOTE SUPPORT; 2. IDENTIFY PROBLEM FRU.

Sheet 2: 2. IDENTIFY PROBLEM FRU; 2.1 PRE-VERIFY DRIVE SYMPTOMS; decision steps 2.3, 2.4,
2.5, 2.6; 3. DSA ERRORS; FRU REPLACEMENT PROCEDURES; 5. MISC CHECKS.

Sheet 3: DSA; SEE SECTION ON "ERROR CODES AND DESCRIPTIONS"; CONTROLLER;
CONTROLLER/DRIVE POWER CABLING; ECM; DRIVE, SDI CABLES, CONTROLLER.
Legend:
DDDE = DRIVE-DETECTED DRIVE ERROR
DDDF = DRIVE-DETECTED DIAGNOSTIC FAULT
DDPE = DRIVE-DETECTED PROTOCOL ERROR
RE = TRANSMISSION ERROR

Sheet 4: SEE "ORDER OF TROUBLESHOOTING DSA ERRORS" SECTION; 4. MEDIA; 4.1; RUN DKUTIL;
R/W PATH PROBLEMS; CONTROLLER; MEDIA; 4.2; MAY NEED TO FORMAT AFTER REPAIR.

Sheet 5: HDA; ECM; ECM, CABLES, HDA, CONTROLLER; ECM CABLING; P.S.; HDA; 5. MISC CHECKS;
SEE SERVICE MANUAL; YOU ARE LOST.

Sheet 6: YOU ARE NOT LOST; DRIVE IS IDENTIFIED, PROBLEM IS NOT; USE HOST-LEVEL DIAGS AS
LAST RESORT; 9.1; 9.2; 9.3; FRU REPLACEMENT; EXECUTE DRIVE SEQUENCE TESTS (POWERUP,
SPINUP); SERVICE POST VERIFICATION; RETURN DRIVE TO USER (MOUNT, ACCESS, BASIC
APPLICATIONS); SEE "MULTIPLE ERROR CODES" IN THE "FRU REPLACEMENT" SECTION.

5.4.1 Troubleshooting Worksheet
Develop a worksheet to aid in collecting error data. Identify only those errors being reported
against the identified drive. Arrange a piece of wide, line-printer paper with columns identified as
follows:
• MSCP Status/Event Code
• Comment Area
• Block Number
• Block Type (LBN or RBN)
• Cylinder
• Group
• Sector
• Drive LED Error
• Drive-Reported Previous/Current Group
• Date/Time of Error
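
If the worksheet is kept electronically rather than on line-printer paper, the same columns can be modeled as a simple record. The following Python sketch is only one possible layout: the class name, field names, field types, and the sample row are all hypothetical, and only the column headings come from the list above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WorksheetRow:
    status_event_code: str       # MSCP Status/Event Code
    comment: str                 # Comment Area
    block_number: int            # Block Number
    block_type: str              # Block Type: "LBN" or "RBN"
    cylinder: int
    group: int
    sector: int
    drive_led_error: str         # Drive LED Error (hex code as text)
    previous_current_group: str  # Drive-Reported Previous/Current Group
    date_time: str               # Date/Time of Error

    def __post_init__(self) -> None:
        if self.block_type not in ("LBN", "RBN"):
            raise ValueError("block_type must be LBN or RBN")


def worksheet_csv(rows) -> str:
    """Render worksheet rows as CSV text, one column per worksheet heading."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(WorksheetRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical sample entry, for illustration only.
sample = WorksheetRow("8B", "sample entry", 123456, "LBN", 100, 2, 17,
                      "2A", "1/2", "01-JUN-1990 10:15")
print(worksheet_csv([sample]))
```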

5.5 Identifying the Problem Drive [1]
The cause of local drive error problems generally requires minimum analysis. These problems
can be identified by noting that the drive is not performing basic operational functions (power-up,
spinup, spindown, and so on), by incorrect lamp indications, or by OCP error codes.
Once you have isolated the problem drive, proceed to Section 5.6.
If you have not isolated the problem drive, refer to Sections 5.5.1 through 5.5.6. These sections
describe procedures to use for problem drive identification.

5.5.1 Talking to the System Operator/Checking the OCP Fault Indicator [1.1]
Discuss drive errors with the system operator/manager and users. Operators or users can provide
valuable information concerning system activity at the time of the error (such as applications that
were running, disks the data is stored on, affected users, and impact on other applications).
Check the OCP for fault indications.

5.5.2 Using VAXsimPLUS to Identify the Problem Drive [1.2]
Use VAXsimPLUS to obtain a summary of information that may lead to direct identification of the
failing drive. Section 5.1 lists appropriate VAXsimPLUS documentation.
If the problem drive is identified using information obtained with VAXsimPLUS, go directly to
Section 5.6.

5.5.3 Using the Host Error Log to Identify the Problem Drive [1.3]
Study available host error logs. Host error logs provide failing drive and error code information.
Use this information to identify failing FRUs.
Refer to the DSA Error Log Manual for detailed descriptions of most system-level host error logs.

5.5.4 Using the HSC Console Log to Identify the Problem Drive [1.4]
Drives attached to HSC controllers send drive state information to the HSC console log. Use the
HSC console log to identify problem drives. Correlate time-of-error information to user operations.

5.5.5 Using the Host Console/User Terminal Trails to Identify the Problem Drive [1.5]
If no host error log or VAXsimPLUS resource is available, check host console trails or user terminal
trails. These may indicate drive problems and identify the problem drive.

5.5.6 Using Other Means to Identify the Problem Drive [1.6]
If no hard fault indications, error logs, or console logs are available to identify the problem drive,
refer to Section 5.9.
It is important to identify the failing disk drive before attempting to isolate the failing subsystem
component. If more than one drive exhibits the same failure symptoms, examine the possibility of a
controller or system problem.
NOTE

Using DSA utilities such as Error Log Dumper (ZUDM/EVRLL/MDM/DKUTIL) to dump
the RA90/RA92 drive internal error log may identify problem hardware areas. However,
there may be a significant negative impact on the availability of hardware and data to
the customer. Consider off-line diagnostics only as a last resort.
DSA utilities (Bad Block Replacement or HSC Verify) verify that the logical structures of the user
data are correct. Additionally, these utilities check the status of any revectored blocks, blocks with
forced error flags set, blocks marked bad in the RCT area, the number of primary and non-primary
replaced blocks, and blocks that exceed symbol error thresholds. User data areas that have flagged
forced error conditions are identified as disk areas that cannot be accessed due to media or drive
problems.
Transient problems may require the use of off-line diagnostics. EVRL, ZUD, and MDM frequently
miss a problem executing in the DBN area of a disk. You may have to exercise the customer data
area of the disk to increase the chances of generating an error.
CAUTION

Back up customer data before executing diagnostics on customer data areas of the disk.
Refer to Section 5.11 for host-level diagnostics information.

5.6 Identifying the Problem FRU [2]
After identifying the problem drive, you must identify the failing FRU. The following sections
describe procedures to use for identifying the problem FRU.
Use the host error log or HSC console log to fill in the troubleshooting worksheet (described in
Section 5.4.1). Calculate the logical cylinder, group, and sector from the targeted LBN or RBN
and add that information to the worksheet. Drive-reported errors (SDI error packet) include valid
extended drive status bytes that call out the logical cylinder, the previous and current group select,
and the master drive error code.
After the data is collected, analyze the data to select the most logical replacement FRU. Proceed to
Section 5.7 and compare the collected data to determine troubleshooting priority.

5.6.1 Pre-Verifying Drive Symptoms [2.1]
After identifying the drive, you should verify drive failure symptoms by performing pre-verification
testing of the drive. Pre-verification of drive symptoms using resident diagnostics has the following
benefits:
• Establishes a basis for post-verification and:
  - Ensures that no new problems have been introduced.
  - Ensures that a replaced FRU corrected the problems detected during pre-verification
    testing.
• Establishes a more reliable error code or condition to troubleshoot. Generally, errors detected
  while performing drive-resident diagnostics have a higher priority than errors or symptoms
  derived from any source previously mentioned.

To complete pre-verification testing, perform the following steps:
1. Spin up the drive.
2. Execute resident diagnostic test T60 (loop-on-test utility).
3. Execute resident diagnostic test T00 (sequence test).
Examine the drive internal error log and note the type of errors. Compare the generated errors to
the error symptoms originally encountered. The following sections help isolate the failure symptoms
to the failing FRU.

5.6.2 Using OCP Error Codes to Identify the Problem FRU [2.2]
Correlate error codes displayed in the OCP, host error logs, or drive internal error logs to error
descriptions given in Section 5.19. Each error description includes a list of suggested replacement
FRUs. Use this list to repair the drive. Verify repairs using the post-verification procedures defined
in Section 5.13.2.

5.6.3 Using VAXsimPLUS to Identify the Problem FRU [2.3]
VAXsimPLUS identifies FRU replacements based upon an analysis of the errors being recorded by
the VMS error logging system. VAXsimPLUS identifies the failing FRU through a theory number.
The procedure for cross-referencing theory numbers to drive FRUs is determined by individual
Digital service areas. Each service area has the responsibility of defining and implementing
VAXsimPLUS in line with individual area service goals and strategies.
If VAXsimPLUS identifies a failing FRU, replace the FRU, then proceed with post-verification
testing. Refer to Chapter 6 for FRU removal and replacement procedures.

5.6.4 Using the Host Error Log to Identify the Problem FRU [2.4]
If the system does not support host error logs, or if a host error log cannot be obtained, go to
Section 5.6.5.
If you are working in a cluster environment, it may be easier to use the HSC console log. The HSC
console log is a condensed version of the host error log. Proceed to Section 5.6.5 for information on
using the HSC console log.

The following is a data collection step:
Access the host error log. Obtain the drive and controller event (error) codes. Note the LBNs
involved in read/write disk transfer errors.

Note the LBN being reported in the data transfer error packet. Also note if any of the following
error types have been detected by the controller:

• Data errors
• ECC errors
• Uncorrectable ECC errors
• Header-not-found errors
• Invalid header errors
• Header compare errors
• Format errors
• Data sync timeout errors

Study the SDI error packet of the error log for drive-detected errors and check for the following
information:
• Error code
• Drive group number
• Logical cylinder number

For controller-detected (communication) errors, such as protocol or transmission errors, note the
controller-reported error code in the status/event code field.

5.6.5 Using the HSC Console Log to Identify the Problem FRU [2.5]
If the disk drive is not attached to an HSC or KDM and no supporting error data is available, go to
Section 5.6.6.
The amount of subsystem error information reported by the HSC console log depends upon the HSC
error threshold level setting. The HSC SETSHO utility can be set to alter the error threshold level
as follows:
• Information
• Warning
• Error
• Fatal
Execute the HSC SHO SYSTEM command to display the error threshold parameter setting. If the
error threshold is set sufficiently high (fatal), no error information may be available from the HSC
console log. Refer to Section 5.6.6 to continue error analysis.
If the drive is attached to an HSC, check the HSC console log. Use the HSC Service Manual
to decode the console error log. Obtain status/event codes, drive extended status bytes for the
drive LED error codes, and the LBN addresses at the time of the error. Organize the gathered
information on the troubleshooting worksheet to help isolate the failing FRU. Proceed to Section 5.7
and compare the collected data to determine troubleshooting priority.
If the information from the HSC console log does not identify the problem FRU, go to Section 5.6.4
to examine the host error log, or Section 5.6.6 to examine the drive internal error log.

5.6.6 Using the Drive Internal Error Log to Identify the Problem FRU [2.6]
If the drive is connected to a cluster, it is strongly recommended that you dump the drive internal
error log before troubleshooting or attempting FRU replacement.
To extract the RA90/RA92 drive internal error log, use one of the following methods:

• Run DKUTIL from the HSC console or KDM controller (see Section 5.2.4.1).
• Run drive-resident utility T41 from the RA90/RA92 OCP (see Section 5.2.4.2).
• As a last resort, run utilities such as MDM, EVRLL, or ZUDMxx.

NOTE

Off-line diagnostics remove system availability from the user and should only be used as
a last resort.
Media problems such as ECC errors are not logged in the drive internal error log.
Proceed to Section 5.8 for media errors.
If you cannot access the drive internal error log, verify the physical connection between the drive
and the controller. If the drive is attached to an HSC, type a SHOW DISK command at the HSC
console to verify that the drives are on line to the controller.
If no errors have been logged, or the drive internal error log is inaccessible, proceed to Section 5.9.
If a host error log or an HSC console trail has been acquired, proceed to Section 5.9.

5.7 Priority Order of Troubleshooting DSA Errors [3]

The priority order of troubleshooting DSA errors is important. The following sections describe the
importance of each error type and DSA reporting mechanisms.

5.7.1 Drive-Detected Drive Errors and Diagnostic Faults [3.1]
Give error codes in this category top priority.
Drive-detected drive errors (DDDEs) appear in host error logs and HSC console logs provided the
error threshold is set low enough. DDDEs are also available in the drive internal error log.
Drive-detected diagnostic faults (DDDF) appear in the drive internal error log, although they may
be seen at the host level. This error type is top priority.
5.7.1.1 Drive-Detected Protocol Errors Without Communication Errors [3.2]

The occurrence of drive-detected protocol errors (such as errors 07, 0C, and so on) without the
occurrence of transmission errors (errors 20, 21) indicates a controller problem or an electronic
control module (ECM) failure. Troubleshooting must be done on that basis.
The occurrence of drive-detected transmission errors with error codes 08, 09, 0D, 0E, 0F, 10, 16, 19,
1A, 29, 2A, 2B, 2E, or 2F without communication errors generally indicates a controller problem.
The drive detects these errors by analyzing packet frames as they are being received. If the drive
is at fault (in other words, replacing the controller did not fix the problem), replace the drive ECM
module.
5.7.1.2 Drive-Detected Pulse or State Parity Errors [3.3]

The occurrence of transient, drive-detected communication errors occasionally causes a protocol
error. This is generally a manifestation of communications problems. Determine if the problems
occur on the transmit or receive lines from the controller to the drive. Drive error codes associated
with pulse or parity errors are 0A, 20, or 21.

If the drive is seeing drive-detected communication errors, then the drive ECM receive circuitry,
SDI port transmit circuitry (controller), or SDI cabling is suspect. Reconfiguration might further
isolate the problem (use different drive/controller ports and cable combinations).
If the controller is seeing communication errors (these also show up as ECC errors) and the drive
is also seeing communication errors, then the whole path (drive to controller) is suspect. It is
important to make a distinction between the communication errors and ECC errors. If an SDI
subsystem is having communication errors, one of the manifestations (not the cause) is ECC errors.
If the communication errors are severe enough, data transfers are halted.
NOTE

Fix communication problems before concentrating on ECC or positioner errors.
Ensure SDI cable connections are secure enough to provide proper electrical and
mechanical continuity.
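
The guidance in Sections 5.7.1.1 and 5.7.1.2 can be condensed into a small lookup, sketched below in Python. The set and function names are hypothetical. The protocol-error set holds only the two example codes named in the text (the text says "and so on", so it is incomplete). Communication codes are checked first because communication problems should be fixed before anything else.

```python
# Drive LED error codes named in Sections 5.7.1.1 and 5.7.1.2.
PROTOCOL_EXAMPLES = {0x07, 0x0C}  # incomplete: the text lists these only as examples
FRAME_ANALYSIS = {0x08, 0x09, 0x0D, 0x0E, 0x0F, 0x10, 0x16, 0x19,
                  0x1A, 0x29, 0x2A, 0x2B, 0x2E, 0x2F}
PULSE_OR_PARITY = {0x0A, 0x20, 0x21}


def suspects(error_codes):
    """Return suspect areas for a set of drive-detected error codes."""
    codes = set(error_codes)
    if codes & PULSE_OR_PARITY:
        # Drive is seeing communication errors: resolve these first.
        return ["drive ECM receive circuitry",
                "controller SDI port transmit circuitry",
                "SDI cabling"]
    if codes & (FRAME_ANALYSIS | PROTOCOL_EXAMPLES):
        # No communication errors: controller first, then the drive ECM.
        return ["controller", "drive ECM"]
    return []


print(suspects([0x07]))        # protocol error without communication errors
print(suspects([0x07, 0x20]))  # protocol error accompanied by a communication error
```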

5.7.2 Controller-Detected EDC Error [3.4]
NOTE

EDC errors are not caused by drives.
EDC is a data protection mechanism to ensure data integrity within a disk controller. In contrast,
the ECC mechanism ensures data integrity from the controller through the drive, to the media, and
back again. ECC ensures integrity of customer data and the EDC mechanism together.
It is important to note the differences in how controllers implement the EDC mechanism:

• For the KDA/KDB/UDA family of controllers, EDC is generated on a sector of data at the bus
  interface as the data is initially read from host memory. EDC is verified on a sector basis as
  the data is written to host memory from the controller memory. Therefore, xDA/xDB controllers
  generate and check EDC. The microcode engine of the controller performs this check at the bus
  interface.

• For HSC controllers, EDC is generated on a sector of data at the K.pli port processor module
  as the data streams in from host memory over the CI bus. EDC then becomes an integral part
  of the user data as the data is transferred to the HSC data memory. As this data is read out
  of HSC data memory by the K.sdi modules and transmitted to the drive, user data EDC is
  regenerated and checked in the K.sdi and compared to the EDC characters appended to the
  data by the K.pli.
  The EDC must check OK, or the write-transfer-to-disk will be aborted. The HSC again requests
  the data from host memory and again queues the write-transfer-to-disk when data becomes
  available in the HSC data memory. If the EDC checks OK at the K.sdi on a write-to-disk, the
  EDC and ECC codes are appended to the data stream and written to disk with ECC ensuring
  data integrity of the customer data and the EDC code.
  For a disk read, the data, as it is read by the K.sdi (over the SDI read/response line), is checked
  for good ECC, then the data plus EDC characters are stored in HSC data memory. As the data
  is sent to host memory, the K.pli, while transferring the data to host memory, verifies that good
  EDC exists for the customer data block but does not transfer EDC characters to host memory.
  If EDC is bad, the K.pli informs the HSC functional code to again request the same data from
  the disk.

• For KDM controllers, EDC is generated on a sector of data at the bus interface as the data is
  initially read from host memory. EDC is verified on a sector basis at the SDI SERDES port
  interface as the data is written to disk. On a read, EDC is checked by the SDI SERDES at the
  completion of each sector read (and data correction, if applicable). EDC is checked again as the
  data is written to host memory from the controller memory.

If EDC errors are detected, the problem is a controller problem. The ECC is protecting the data
to and from the disk and checking the integrity of the data at the SDI port module logic.
NOTE

A properly functioning controller always reads bad EDC written to disks. However, if
bad EDC is written to a disk (improperly functioning controller), each time the block
with bad EDC is read, EDC errors are logged against the drive. Only after the data is
restored or rewritten to the disk with good EDC by a good controller will the errors go
away.
5.7.2.1 Controller-Detected Protocol and Transmission Errors Without Communication Errors
(Status/Event Codes 14B or 4B) [3.5]

The troubleshooting process for this type of error is very similar to the discussion in Section 5.7.1.1.
It is important to determine that the controller detected protocol errors without basic
communications errors such as:
• Protocol errors-A level 2 response from the drive had correct framing codes and checksum
  but was not a valid response under SDI protocol rules. If the opcode on the read/response line
  has an odd number of bits, it is an unknown opcode; if the response packet is bad, it is also
  classified as a protocol error.
• Transmission errors-The controller detected an invalid framing code or a checksum error in
  a level 2 response from the drive. The UDA50 also returns the same status/event code for
  controller-detected protocol errors.

Table 5-3 Summary of Controller-Detected Communication Errors

Controller-Detected Communication Errors    Status/Event Code (HSC, UDA, KDA, KDB, KDM)

Protocol                                    14B  14B
Invalid frame code, level 2 checksum
Pulse/state parity (wire)                   10B

Communication (wire) errors are described in Section 5.7.2.2.
5.7.2.2 Controller-Detected Pulse or State Parity Errors (Status/Event Code 10B) [3.6]

The procedure for handling controller-detected communication errors is very similar to the one
described in Section 5.7.1.2. The controller detected a pulse error on the state or data line, or the
controller detected a parity error in a state frame from the drive. This error is associated with the
controller and drive SDI port electronics (including interconnecting cables).
The symptoms indicate a basic (wire) communications problem within the SDI pathway, including
drive or controller port electronics. Noise can be injected through the port electronics or the cabling
between the controller and the drive. Additionally, bad cables (bent, walked on) or loose connecting
hardware (bulkhead connections) can contribute to the problem.
Pulse errors are caused by two consecutive pulses of the same polarity. SDI signal lines use an NRZ
transmission technique where no two adjacent pulses can be of the same polarity. This is detected
on either the state or read/response line.
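
The pulse error rule above (no two adjacent pulses of the same polarity) can be illustrated with a short Python sketch. It models a received pulse train as a list of +1 and -1 values; that representation and the function name are assumptions made for illustration, not a description of the controller hardware.

```python
def find_pulse_errors(pulses):
    """Return the indexes of pulses that repeat the polarity of the previous pulse."""
    return [i for i in range(1, len(pulses)) if pulses[i] == pulses[i - 1]]


good_line = [+1, -1, +1, -1, +1]
noisy_line = [+1, -1, +1, +1, -1]  # two consecutive positive pulses

print(find_pulse_errors(good_line))   # no pulse errors
print(find_pulse_errors(noisy_line))  # pulse error at index 3
```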
A state parity error is the occurrence of bad parity over the length of a single SDI RTDS state
frame or SDI read/response frame. This type of error may also result in the detection of ECC errors
during data transfer times. This occurs when the read/response line and the write/command line
are functioning as the data line.

Controller-detected transmission errors (4B) occur if an invalid framing code or a checksum error is
detected during a level 2 response from the drive.
NOTE

The UDA50 also returns this status/event code for controller-detected protocol errors.

5.7.3 Controller-Detected Communication Events and Faults [3.7]
Controller-detected communication events include:
• Loss of read/write ready-MSCP Status/Event 8B
• Loss of receiver ready-MSCP Status/Event CB
• Receiver ready collisions-MSCP Status/Event 1AB
• Drive clock dropout-MSCP Status/Event AB
• Failure of drive initialization process-MSCP Status/Event 16B
• Failure of drive to respond to controller-requested initialization-MSCP Status/Event 18B
• SERDES overrun error (in controller)-MSCP Status/Event 2A
• SDI drive command time-out-MSCP Status/Event 2B

Communication systems have faults and event irregularities. Communication faults are events, but
not all events are faults. The difference is related to timing between events and system operations
occurring at the time of the event.
For example, a loss of read/write ready is an event if no write activity is occurring at the time of
the loss. During a write, however, a loss of read/write ready is an error (fault) event.
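
When scanning error logs, the status/event codes listed in this section can be kept in a simple lookup. The Python sketch below uses only the codes and descriptions given above; the dictionary and function names are hypothetical.

```python
# MSCP status/event codes for controller-detected communication events (Section 5.7.3).
COMMUNICATION_EVENTS = {
    "8B": "Loss of read/write ready",
    "CB": "Loss of receiver ready",
    "1AB": "Receiver ready collisions",
    "AB": "Drive clock dropout",
    "16B": "Failure of drive initialization process",
    "18B": "Failure of drive to respond to controller-requested initialization",
    "2A": "SERDES overrun error (in controller)",
    "2B": "SDI drive command time-out",
}


def describe_event(code: str) -> str:
    """Look up a status/event code written as hex text, ignoring case and spaces."""
    return COMMUNICATION_EVENTS.get(code.strip().upper(), "not listed in Section 5.7.3")


print(describe_event("1ab"))
print(describe_event("8B"))
```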
5.7.3.1 Controller-Detected: LOSS OF READ/WRITE READY (Status/Event Code: 8B) [3.8]

The controller event is LOST READ/WRITE READY DURING OR BETWEEN TRANSFERS.
This error indicates read/write ready (RTDS status bit) was negated when R/W ready had been
previously asserted (indicating completion of a preceding seek) and:
• The controller attempted to initiate a transfer, or
• R/W ready was found negated at the completion of a transfer

This event usually results from a drive-detected transfer error, in which case an additional error log
message may be generated containing the drive-detected error event code.
This error may be symptomatic of a fine track servo problem in the RA90/RA92 disk drive. If there
are no other such subsequent error log entries, the loss of fine track was probably responsible for
the loss of read/write ready. Examine the drive internal error log for evidence of servo problems.
5.7.3.2 Controller-Detected: LOST RECEIVER READY (Status/Event Code: CB) [3.9]

RECEIVER READY (RTDS status bit) was negated when the controller attempted to initiate a
transfer, or RECEIVER READY was not asserted at the completion of a transfer. This includes all
cases of the controller timeout expiring for a transfer operation (level 1 real-time command).
As a consequence of this condition, the controller performs an SDI INIT then attempts to request
a GET STATUS. The extended status error log entry returned in the GET STATUS command may
indicate what the problem is.

If no information is being reported by the drive as a part of the error log sequence, approach
the problem as a drive ECM failure. Examine the drive internal error log for extended error
information.

5.7.3.3 Controller-Detected: RECEIVER READY COLLISION (Status/Event Code: 1AB) [3.10]

The controller attempted to assert RECEIVER READY (RTCS status bit), indicating it was ready
to receive a drive response. The drive RECEIVER READY (RTDS status bit) was still asserted,
indicating it was ready to receive a command from the controller.
This is not an error, but an event within the subsystem. All DSA drives and controllers occasionally
log this event. There is no performance impact because of the occasional occurrence of this event.
No data corruption is associated with the occurrence of this event if no other SDI bus errors occur
at the same time.
Acceptable event rates for RECEIVER READY collisions are less than ten per day, provided the
following events are not contributing:
• Broken physical SDI interconnects (plugging and unplugging SDI cables).
• Controller (node) initializations or HSC failovers.

NOTE

The occurrence of RECEIVER READY collisions happens primarily when both Ports A
and B are enabled at the drive.
Resolve unacceptable event rates of more than ten a day by replacing either the ECM or controller
port interface module, cables, or bulkheads.
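
The ten-per-day guideline above can be checked against a list of event times. The following is a minimal sketch, not part of any Digital utility: it assumes the timestamps of Status/Event Code 1AB entries have already been extracted from the error log, and the function name days_over_limit and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def days_over_limit(event_times, limit=10):
    """Return {date: count} for calendar days with more than `limit` events."""
    per_day = Counter(t.date() for t in event_times)
    return {day: n for day, n in sorted(per_day.items()) if n > limit}


# Hypothetical timestamps of RECEIVER READY collision entries (code 1AB).
events = [datetime(1990, 6, 1, 8, minute) for minute in range(12)]
events += [datetime(1990, 6, 2, 9, minute) for minute in range(3)]

busy = days_over_limit(events)
for day, count in busy.items():
    print(day, count)
```

Before acting on a flagged day, rule out the contributing events listed above (cable plugging and controller initializations).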
5.7.3.4 Controller-Detected: DRIVE CLOCK DROPOUT (Status/Event Code: AB) [3.11]

Either data (read/response line) or state clock (RTDS) was missing when it should have been
present. This is usually detected through a timeout.
A fatal drive condition can cause the drive to drop the drive clocks. The drive should reassert
clocks after performing a drive initialization and establishing clocks to the controller to re-establish
communications and state information between the drive and controller. The sequence of getting
status and error information then occurs. Analysis of error log message packets usually indicates
that the above sequence has occurred.
If such message packets are not being processed or received, it is possible that the condition cannot
be detected by the drive. Execute drive SDI loopback tests to try to find subtle SDI problems. The
order of emphasis is:

• ECM
• Controller port module
• Cabling (including bulkhead connectors)

5.7.3.5 Controller-Detected: DRIVE FAILED INITIALIZATION (Status/Event Code: 16B) [3.12]

The drive clock failed to resume following a controller-attempted drive initialization. This implies
the drive encountered a fatal initialization error. It may also indicate the drive was attempting its
own initialization or that the drive is looping in an initialization state or routine.
5.7.3.6 Controller-Detected: DRIVE IGNORED INITIALIZATION (Status/Event Code: 18B) [3.13]

The drive clock continued running even though the controller attempted to perform a drive
initialization. This implies the drive did not recognize the INIT command from the controller.
It may also indicate the drive was performing an initialization caused by some drive-detected
condition and, in the course of initialization, ignored the controller's attempt to initialize the drive.


5.7.3.7 Controller-Detected: SERDES OVERRUN ERROR (Status/Event Code: 2A) [3.14]
SERDES overrun (or underrun) errors indicate that the drive is too fast for the controller or, more
typically, a controller hardware fault is preventing the controller microcode from keeping up with
data transfers to or from the drive.

Because of the speed with which the RA90/RA92 disk drive handles data transfers, some SDI
controller ports may not be able to keep up with data transfers to and from the drive. This speed
sensitivity may even show up on drive ports that have successfully run other RA-type disk drives.
There is not a universal problem with Digital SDI controller port boards. The controller port board
design supports RA90/RA92 operating speeds.
The SERDES overrun problem manifests itself as transient occurrences of the error or as solid
SERDES problems preventing execution of read/write operations to the drive. For all controllers,
the SERDES occurrence looks like a single controller port failure and is seldom related to a
particular drive port.
5.7.3.8 SDI Drive Command Timeout (Status/Event Code: 2B) [3.15]
A controller may report an SDI command timeout when it issues a command to the drive and
the drive does not respond within the required timeout period. The timeout period is command-dependent.

SDI command timeouts are associated with Status/Event Code 2B. These events will frequently
occur under the following conditions:
• Powering up a drive with one or both port switches depressed, then hitting the Run switch.
• Spinning down a drive with one or both port switches depressed.

Under these two conditions, the SDI command timeout event reports can be ignored. However,
under other conditions, you should examine SDI command timeout events by looking at the logged
errors around the time of the event. The drive internal error log may also reveal clues to the
problem; however, you should verify that the time of the error, as logged in the drive, corresponds
to the time of the event.
If the controller is an HSC, verify that the device priority is correctly managed. The RA90/RA92
disk drive's place in the priority scheme is as follows:
TA90 (highest priority)
RA90/RA92
ESE2x
RA82
RA81
RA70
RA80 (lowest priority)

5.8 Media-Related Errors
Media and read/write transfer problems manifest themselves in many ways. Symptoms include:
• ECC errors (refer to Section 5.16)
• Uncorrectable ECC errors (refer to Section 5.16.1)
• Header-not-found errors (refer to Section 5.16.1)
• Invalid header errors (refer to Section 5.16.1)
• Header compare errors (refer to Section 5.16.1)
• Format errors
• Data sync timeout errors


Read/write errors may involve the read/write data path or defective media. For the SDI disk
subsystem, the read/write data path includes:
• SDI controller read/write data path circuits
• SDI cables and bulkhead connectors
• Disk drive read/write data path hardware
• Disk drive media

Use the following process to analyze read/write transfer errors:
1. Isolate the LBNs associated with the logged transfer errors in the host or HSC error log. If
there are many, randomly select 10 to 20. Use the appropriate algorithm to decode targeted
LBN numbers to the logical cylinder, group, and head. Refer to Example 5-1 for RA90 LBN
conversion, and Example 5-2 for RA92 LBN conversion.
2. Decode the LBNs in question to physical cylinders, tracks, and groups (physical read/write
heads).

5.8.1 Repeating LBNs/RBNs
LBNs or RBNs that consistently recur in the host error log should be replaced. If the controller or
system has not marked these for replacement, replace them manually by running HSC DKUTIL,
EVRLK, ZUDLx, or MDM. This is a useful procedure for blocks that consistently report ECC or
data errors.
This symptom occurs when the host bad block replacement (BBR) software does not use customer
data as a pattern to test the suspect block. The block is initially flagged for replacement. The host
executes a test of the block and finds nothing wrong. It does not revector the block, but instead
restores the original data back to the block. The user then attempts to access the data and may get
another ECC error severe enough to invoke the BBR activity again.

5.8.2 Excessive Number of Blocks Replaced Because of R/W Path Problems
Read/write data path problems may cause the replacement of a high number of good blocks. This
may lead to logical fragmentation of the disk. If this happens, the number of blocks in the RCT
recorded as revectored differs substantially from FCT information. For example, the RCT may show
a doubling of replaced blocks occurring over a short period of time. Use EVRLB, MDM, ZUDKxx, or
HSC FORMAT to reformat the disk and recover these good blocks.
NOTE

Back up customer data before executing the reformat.
Use the host error log to identify replacement blocks and to show if BBR activity is complete. Use
HSC DKUTIL to dump the factory scan (FCT) and RCT areas of the disk. Look for differences
in the FCT and what is currently in the RCT. The contents of RCT only show what blocks were
replaced; the host error log and HSC console logs supply the time of replacement.
Keep good records in the site management/cluster guide. Include results of VERIFY and BBR scans
of each disk. This information helps identify changes in block replacement activity and is part of
good site management practices.
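
The FCT/RCT comparison and the "doubling" symptom described in this section can be sketched as follows. This is an illustration only: it assumes the FCT and RCT dumps from DKUTIL have already been reduced by hand to plain lists of block numbers, and that replaced-block counts were recorded at each site visit. The function names and sample values are hypothetical.

```python
def replaced_since_factory(fct_lbns, rct_lbns):
    """Blocks present in the RCT list but not in the factory (FCT) list."""
    return sorted(set(rct_lbns) - set(fct_lbns))


def doubling_intervals(history):
    """history: (label, replaced_block_count) pairs in time order.

    Returns the labels at which the count is at least twice the
    previous reading.
    """
    flagged = []
    for (_, previous), (label, current) in zip(history, history[1:]):
        if previous > 0 and current >= 2 * previous:
            flagged.append(label)
    return flagged


# Hypothetical values transcribed from site records.
new_blocks = replaced_since_factory([10, 20], [10, 20, 35, 31])
suspect = doubling_intervals([("week 1", 14), ("week 2", 15), ("week 3", 31)])
print(new_blocks, suspect)
```

A flagged interval is only a prompt to check the read/write data path; the host error log and HSC console logs still supply the actual replacement times.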

5.8.3 LBN Correlation to Single Group/Track
Consistent failures involving one or two read/write heads usually indicate an HDA failure.


5.8.4 LBN Correlation to Head Groups
Consistent failures within head groups are usually due to head selection logic within the HDA. The
groups are as follows:
RA90 (LA)      RA90 (SA)      RA92
70-22951-01    70-27268-01    70-27492-01
HDA Rev 00     HDA Rev 01     HDA Rev 10

0-3

0-2

4-7

3--6

8-11

7-9

10-12

Replace in the following order:
1. PCM

2. HDA
5.8.4.1 LBNs Correlated to Zone Write Boundaries

Failures showing no consistency to a group or head may show consistency in write current zones.
DSA drives divide the media into different write current amplitude zones. The RA90/RA92 divides
the media into four write current amplitude zones as listed in Table 5-4.
Table 5-4 RA90/RA92 Write Zones

      RA90                              RA92
Zone  Cylinder Range  LBN               Cylinder Range  LBN
      0000-1722       0-1546428         0000-2014       0-1912234
      1723-2020       1546429-1813724   2015-2363       1912235-2243435
      2021-2335       1813725-2096289   2364-2731       2243436-2592667
      2336-2660       2096290-2377747   2732-3112       2592668-2954237

To verify this correlation, you need a substantial number of errors (greater than 100) and knowledge
of the user disk space being used. A customer using more than 50 percent of the available disk
space is probably accessing all zones of the disk. A disk using less than 25 percent of the disk space
may only be accessing a single zone. Knowledge of operating system utilization of disk space is
necessary to make this troubleshooting procedure effective.
Zone-related problems encountered with the RA90/RA92 disk drives generally are resolved by
replacing the PCM, ECM, or HDA (in that order).
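
To look for a zone correlation, failing LBNs can be tallied against the LBN ranges in Table 5-4. The following is a minimal sketch under stated assumptions: the boundary values are copied from Table 5-4, the zone index is simply the table row (0 for the first row) and is not an official zone number, and the names write_zone and zone_histogram are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Last LBN of each write zone, copied from Table 5-4 in row order.
# Index 0 is the first table row; it is not an official zone number.
ZONE_LAST_LBN = {
    "RA90": [1546428, 1813724, 2096289, 2377747],
    "RA92": [1912234, 2243435, 2592667, 2954237],
}


def write_zone(drive, lbn):
    """Return the Table 5-4 row index (0 to 3) that contains `lbn`."""
    bounds = ZONE_LAST_LBN[drive]
    if not 0 <= lbn <= bounds[-1]:
        raise ValueError(f"LBN {lbn} is outside Table 5-4 for {drive}")
    return bisect_left(bounds, lbn)


def zone_histogram(drive, lbns):
    """Count failing LBNs per Table 5-4 row."""
    return Counter(write_zone(drive, lbn) for lbn in lbns)


# Hypothetical failing LBNs taken from an error log.
failing = [12, 1546428, 1546429, 1900000, 2377747]
hist = zone_histogram("RA90", failing)
print(sorted(hist.items()))
```

With the more than 100 errors that the text calls for, a histogram that is heavily weighted toward one row suggests a zone correlation; an even spread does not.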
5.8.4.2 LBN Correlation to a Physical Cylinder

Failures consistently related to a specific cylinder may be the result of a head touchdown. Problems
involving servo detection information (dedicated and/or embedded) that prevent head tracking to
cylinders usually indicate media corruption. These problems may also involve the HDA and ECM
electronics. In a head crash, failures are usually tied to specific cylinders and may cover an area
as wide as ten cylinders. Failures confined to one to three cylinders usually indicate servo data failures.
In the RA90/RA92, logical cylinders correlate to physical cylinders.
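
The cylinder-width rule of thumb above can be applied to a list of failing cylinders as in the sketch below. It is an illustration under assumptions: cylinders are grouped when they are adjacent (max_gap=1), and the width limits of three and ten cylinders are taken from the text, while the wording of the returned labels and the function names are hypothetical.

```python
def cylinder_clusters(cylinders, max_gap=1):
    """Group failing cylinders into (first, last) runs of neighbors."""
    clusters = []
    for cyl in sorted(set(cylinders)):
        if clusters and cyl - clusters[-1][1] <= max_gap:
            clusters[-1][1] = cyl  # extend the current run
        else:
            clusters.append([cyl, cyl])  # start a new run
    return [(first, last) for first, last in clusters]


def classify_cluster(first, last):
    """Apply the one-to-three and up-to-ten cylinder guidelines."""
    width = last - first + 1
    if width <= 3:
        return "servo data suspected"
    if width <= 10:
        return "head touchdown suspected"
    return "too wide for a single-cylinder correlation"


# Hypothetical failing cylinders decoded from logged LBNs.
clusters = cylinder_clusters([101, 100, 102, 500, 503, 100])
for first, last in clusters:
    print(first, last, classify_cluster(first, last))
```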


5.8.5 Multiple Controllers Report Same Error Types
If multiple controllers report the same error types and only one drive port (after cable swap) reports
the error, it is likely an ECM problem.
If multiple controllers report the same error types and both drive ports report the same error,
replace drive components in the following order:
1. PCM
2. ECM

3. SDI cabling/interconnects
4. Power source

5. Spindle ground brush
6. HDA

5.8.6 Only Single Controller Port Affected
If errors occur to a single controller port and both drive ports have been tested to a known good
controller interface, then the problem is in the controller or cable.
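
The port-correlation rules in Sections 5.8.5 and 5.8.6 can be summarized as a small decision function. This is a sketch of the reasoning only; the function name, parameter names, and the empty-list result for cases the text does not cover are assumptions.

```python
def suspects(controllers_with_errors, drive_ports_with_errors,
             drive_ok_on_known_good_controller=False):
    """Return suspect items in replacement order, per Sections 5.8.5 and 5.8.6."""
    if controllers_with_errors >= 2:
        if drive_ports_with_errors == 1:
            # Several controllers, one drive port (after cable swap).
            return ["ECM"]
        if drive_ports_with_errors == 2:
            # Several controllers, both drive ports.
            return ["PCM", "ECM", "SDI cabling/interconnects",
                    "Power source", "Spindle ground brush", "HDA"]
    elif controllers_with_errors == 1 and drive_ok_on_known_good_controller:
        # One controller port, drive ports verified elsewhere.
        return ["controller", "SDI cable"]
    return []  # combination not covered by these two sections


print(suspects(2, 1))
print(suspects(1, 0, drive_ok_on_known_good_controller=True))
```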

5.8.7 Isolating Random R/W Transfer Errors
NOTE

You are here only because the disk drive is experiencing random read/write transfer
errors or because your checklist has led you here. If you have not pinpointed the failure,
see Section 5.9.
Random physical cylinder and head failures are generally caused by ECM/SDI/SDI-controller
interface problems. A faulty spindle ground mechanism or a power supply exceeding noise
specifications may also cause a drive to exhibit random errors.
Intermittent read/write problems involving random read/write heads and cylinders may be the
result of intermittent failures through the read/write data path. This includes SDI cabling or
read/write data path hardware in the controller.
5.8.7.1 Not Defined to a Specific Drive/Controller Port
This is a decision point for the first-time call effort with random read/write errors. If working from
a miscellaneous check or action item list, proceed to Section 5.9.

For the RA90/RA92 drive, replace parts in the following order:
1. PCM
2. ECM

3. Cabling (reconfigure)
4. Power supply
5. Spindle ground brush
6. HDA


5.9 Miscellaneous Checks
Miscellaneous checks are provided as an alternative when:
• No host error log is available.
• No HSC console trail is available.
• No errors are logged in the drive internal error log.
• Errors are transient or not reproducible through standalone diagnostics.

If you cannot access the RA90/RA92 drive internal error log from the OCP, replace FRUs in the
following order:
1. ECM

2. OCP
3. Power supply
If you cannot access the RA90/RA92 drive internal error log with DKUTIL or EVRLL/ZUDM/MDM,
perform the following:
1. Execute resident diagnostic test T00 (drive spun down).
2. Execute resident diagnostic test T00 (drive spun up).
3. Execute external loopback SDI test T09 (use loopback connector Digital part number
70-19074-01).
4. Check drive power supply and indicators. See Section 5.2.6 for the location of power supply
indicators and their meanings.
5. Check drive power supply for proper voltages and ripple (noise). See Chapter 1 for power supply
operating specifications.
6. Check spindle ground brush for excessive wear.
7. Check the SDI cable by changing the cable.
8. Check the controller port by connecting the SDI cable to another port.
Unreliable power from the power supply, controller, or source power may cause the drive to exhibit
a variety of unrelated errors. Ensure source power is within tolerances and follow suggested drive
power checks.
If all checks have been made and no problem is found, replace the ECM. The ECM is the most
likely FRU to fail, provided the failing drive has been correctly identified.
Use the Customer Support Center for problems beyond the scope of your experience or this manual.
NOTE

For transient disk subsystem errors, running host-level diagnostics on xDA/xDB
controllers seldom isolates errors without long run times. This seriously impacts system
availability to the customer. Use system-level and drive internal error logs whenever
possible.

5.10 Are You Lost?
If you feel that the problem is beyond your capabilities and you have spent too much time trying
to isolate it, use available support resources. Digital Customer Services should operate within the
Management Action Planning (MAP) guidelines for each respective area of the country/world.
If you are in the process of performing action items, complete those items and reenter the drive
fault evaluation phase after collecting new error data.

5.11 Using Host-Level Diagnostics as a Last Resort
There are significant concerns about running standalone diagnostics in troubleshooting RA90/RA92
disk problems. Running standalone diagnostics extends site time and makes the system
unavailable to the customer. Customer Services goals are to maximize system or device availability
to the customer and minimize repair time. Consider running host-level diagnostics only if you have
exhausted all options. Tables 5-5 through 5-7 contain the names of diagnostics that are compatible
with the RA90/RA92 disk drives.
CAUTION

Back up customer data before executing diagnostics on customer data areas of the disk.
Protection of customer data is your responsibility.
Follow the strategy which is in place to provide quick and accurate diagnosis, repair, and validation.
This strategy minimizes the impact on system or device availability.

5.11.1 HSC-Based Diagnostics
Use HSC utilities (DKUTIL) and diagnostics (ILEXER and ILDISK) in a cluster environment.
Though the diagnostics are in line and do not cause a loss of system availability, device availability
is an issue. With that in mind, examine the drive internal error log prior to running standalone
diagnostics.
To execute the in-line tests or utilities, the drive must first be dismounted. The rest of the disk
subsystem will not be affected. DKUTIL, ILEXER, and ILDISK do not adversely affect the drive;
however, ensure customer data is protected. While running these tests, give errors detected by the
drive or controller top priority.

5.11.2 KDM-Based Diagnostics
Use KDM utilities (DKUTIL) and diagnostics (ILEXER and ILDEVO) in a cluster environment.
Though the diagnostics are in line and do not cause a loss of system availability, device availability
is an issue. With that in mind, examine the drive internal error log before running standalone
diagnostics.
To execute the in-line tests or utilities, the drive must first be dismounted. The rest of the disk
subsystem will not be affected. DKUTIL, ILEXER, and ILDEVO do not adversely affect the drive;
however, ensure customer data is protected. While running these tests, give errors detected by the
drive or controller top priority.
5.11.2.1 On Line from VMS

Use the following procedure to access and run on-line programs on a KDM controller. See
Section 5.11.2.2 for instructions on accessing and running programs in standalone mode.
NOTE

You cannot run on-line diagnostics, exercisers, and utilities without first running
EVRLN.KDM. Follow the procedure shown here.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER
SYSGEN> EXIT
$ SET DEFAULT SYS$~
$ SET HOST/DUP/SERVER=DUP/LOAD=EVRLN.KDM PUA0/DEVICE
$ SET HOST/DUP/SERVER=DUP/TASK=ILDEVO PUA0/DEVICE


5.11.2.2 Running Standalone Programs from the VAX Diagnostic Supervisor
DS> ATTACH KDM70 HUB DUx 11 BR
                         |  |
                         |  +-- BUS REQUEST
                         +----- NODE NUMBER
DS> SELECT DUx
DS> RUN EVRLN
EVRLN> RUNL ILDEVO

5.11.3 xDA Controller-Based Diagnostics
To run standalone diagnostics or utilities (excluding EVRAE) through any UDA, KDA, KDB
controller, the operating system must be shut down and the appropriate diagnostic/supervisor
loaded.

Some diagnostics force error conditions to validate the drive's ability to detect error conditions.
Error conditions detected by the drives are logged to the drive internal error log as a normal course
of operation. Therefore, through several iterations of a standalone diagnostic, the drive internal
error log may be overwritten and the real drive-detected errors lost.
For example, running a single iteration of MDM on a MicroVAX may result in 13 error events. These
events are logged to the drive's internal error log (EEPROM) and may overwrite important error
information.
With that in mind, examine the drive internal error log before running standalone diagnostics.
A recent SDI specification change addresses this issue by having the controller disable drive
error logging during drive testing. The following diagnostic software releases incorporate the SDI
specification changes:
• XXDP-Release 135 (Q3FY88)
  - ZUDG rev C0
  - ZUDH rev C0
• MDM-Release 122 (Q3FY88)
  - NAKDAH
• VDS-Release 31 (Q4FY88)
  - EVRLF version 8.3
  - EVRLG version 8.3

If any errors occur while running disk diagnostics, go to Section 5.6.
If multiple errors occur, go to Section 5.13.1.
If no errors occur, go to Section 5.10 and call remote support.


Table 5-5 VDS-Based Off-Line Diagnostics
Diagnostic  Title
EVRLB       Drive formatter
EVRLF       Tests 1-3
EVRLG       Test 4
EVRLJ       Test 5
EVRLK       Bad block replacement utility (Scrubber)
EVRLL       Drive-resident error log utility
EVRAE       MSCP disk exerciser
EVSBA       VAX autosizer

Table 5-6 MDM-Based Off-Line Diagnostics
Diagnostic  Title
MDM         MicroVAX diagnostic supervisor (1)

(1) Currently has a problem identifying drive unit number.

Table 5-7 XXDP-Based Off-Line Diagnostics
Diagnostic  Title
ZUDH (2)    Tests 1-3
            Test 1: UNIBUS interrupt/address test
            Test 2: Executes drive-resident diagnostics
            Test 3: Disk function test (rd/wrt)
ZUDI (2)    Test 4: Disk exerciser
ZUDJ        Test 5: UDA/KDA50 subsystem exerciser
ZUDK        Formatter
ZUDL        Bad block replacement utility
ZUDM        Disk-resident error log utility

(2) Forces errors during run that are logged in the drive internal error log.

5.12 Exiting Data Collection: Action Item List Process
Your goal during the data collection phase is to collect logged subsystem events including:
• Status/event codes from error log packets
• Drive-detected master error codes
• Identified target LBN numbers


When no host or HSC error log information is available, use the drive internal error log or
operator/system console trail to identify the problem drive. In some isolated cases (less than
one percent), you will have to use a troubleshooting worksheet (described in Section 5.4.1) in place
of system logged information. You should leave this phase ready to analyze collected data or with
an action item list.

5.13 FRU Replacement
Replace an FRU only after:
• Analysis of VAXsimPLUS directed a replacement FRU based upon its analysis of occurring
  errors or error rates.
• Analysis of host error logs resulted in a list of error codes with particular emphasis placed on
  identifying drive-detected error codes. The error codes should predominantly be drive error
  codes. In some circumstances, error codes are generated by the controller.
• Analysis of the HSC console log resulted in a list of drive error codes used in identifying
  replacement FRUs.
• Analysis of the drive internal error log led to an identification of a replacement FRU.
• Analysis of miscellaneous checks or the process of elimination identified an FRU replacement.

Once an error code has been established from one of the previously mentioned sources, refer to
Section 5.19 for error code descriptions and suggested FRU replacement(s).

5.13.1 Multiple Error Codes
If a number of different error codes are detected, consider the following to decide which error code(s)
to use for troubleshooting:
• Give error codes obtained from running internal drive diagnostics top priority.
• Select an error code or symptom that indicates the fewest FRUs. Drive-detected errors
  of this type will have been derived using the least amount of circuitry to isolate the particular
  failure.
• Select the error code that occurs most often.
• Select the FRU that is most commonly indicated by different error codes.
• Select the FRU that most commonly indicates the same manufacturing code (Section 5.2.3.13).
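
The selection guidelines above can be sketched as a ranking over collected error codes. This is an illustration only: treating the first three guidelines as an ordered tie-break is an assumption, the error codes and FRU lists in the sample data are hypothetical and are not taken from Section 5.19, and the function names are invented for the example.

```python
from collections import Counter


def rank_error_codes(observations):
    """Order distinct codes: drive diagnostics first, then fewest FRUs,
    then most frequent."""
    counts = Counter(obs["code"] for obs in observations)
    first_seen = {}
    for obs in observations:
        first_seen.setdefault(obs["code"], obs)
    ranked = sorted(
        first_seen.values(),
        key=lambda o: (not o["from_drive_diagnostics"],
                       len(o["frus"]),
                       -counts[o["code"]]),
    )
    return [o["code"] for o in ranked]


def most_indicated_fru(observations):
    """FRU named by the largest number of distinct error codes."""
    frus_by_code = {obs["code"]: obs["frus"] for obs in observations}
    tally = Counter(fru for frus in frus_by_code.values() for fru in frus)
    return tally.most_common(1)[0][0]


# Hypothetical observations; codes and FRU lists are placeholders.
observations = [
    {"code": "A1", "from_drive_diagnostics": False, "frus": ["ECM", "PCM", "HDA"]},
    {"code": "A1", "from_drive_diagnostics": False, "frus": ["ECM", "PCM", "HDA"]},
    {"code": "B2", "from_drive_diagnostics": True, "frus": ["ECM", "PCM"]},
    {"code": "C3", "from_drive_diagnostics": False, "frus": ["ECM"]},
]
ranking = rank_error_codes(observations)
top_fru = most_indicated_fru(observations)
print(ranking, top_fru)
```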

5.13.2 Service Post-Verification
After replacing an FRU or repairing a drive, execute drive-resident diagnostics. You can do this
through power-up and spin-up cycles or by using tests which exercise the repaired FRUs. Compare
the results to the diagnostics executed during pre-verification testing (Section 5.6.1).
Post-verification testing accomplishes the following:
• Verifies that no new problems have been introduced when servicing or replacing FRUs.
• Verifies that a repair or replaced part corrected any problems detected during pre-verification
  testing.

If the same error code(s) occur during post-verification testing, reinstall the original FRU. Continue
troubleshooting procedures, or replace the next identified FRU in the appropriate list.


If the diagnostics pass successfully, the problem has most likely been resolved, with the following
exception:
• If the original error codes used for FRU isolation were the result of host, controller, or drive
  internal error log entries (not duplicated by running pre-verification testing), the problem may
  be due to an intermittent failure. Proceed to Section 5.13.3.

If any errors occur, you may want to reinstall the original drive FRU and go to Section 5.6.1.

5.13.3 Return Disk Drive to User
After checkout is complete, return the disk drive to the user. Have the user exercise the repaired
disk drive through customer applications. If customer applications appear to be functioning
normally, the call can be closed.
If the drive fails, return to Section 5.6 or call remote support.
If there is a question as to the correct identity of the failing disk drive, return to Section 5.5.

5.14 Performance Issues When No Errors Are Being Logged
Customer complaints of disk performance can require a fair amount of analysis. Often the
performance complaints are quite subjective. The following list of questions may help analyze
performance complaints:
1. Do the performance issues relate to all or most of the disks?
If so, ensure that system parameters comply with suggested guidelines. Cluster size of disks,
working set size parameters, paging parameters, and ACP/XQP-related parameters all can
affect performance.
2. Do the performance problems occur during image activation (when a large sized application
program is initially started)?
Many layered products require some time to fully activate. This is not a disk problem.
3. Is the performance problem noticed by users of the same image, layered product, or file on the
(same) disk?
If the disk is attached to a local controller (UDA/KDA/KDB) but is a VAX node member in a
cluster, then request that the file/image/layered software product be moved to a disk on the
HSC. Local serving of disks creates bus, VAX, and I/O overhead that impacts performance.
4. Is the performance problem noticed by users of a file/image/layered product that resides on the
same disk as the swap and page files?
If so, request the system manager monitor paging and swapping activity. High page/swap
rates decrease VMS response and create an I/O bottleneck for the page/swap disk. Request the
file/image/layered product be moved to another disk.
In addition to system parameter settings, two areas of the architecture (hardware-related)
contribute to actual loss of performance. These include:
1. Nonprimary replacements in a critical file or directory structure, such as the following
examples:
• Nonprimary replacement in VMS disk: [000000] INDEXF.SYS
• Nonprimary replacement in a frequently used directory file

The two examples are of files that may affect the perceived performance of a disk. However, the
location of a block of data within a file and how the operating system is set up equally affect
nonprimary replacement which, in turn, impacts system or disk drive performance.


A nonprimary replaced block in the INDEXF.SYS file of a disk could be very significant if it is
in the front of the file. However, if it is the last block within the file, it might not have as large
an impact on system performance.
A nonprimary replacement in a block within SYS.EXE that is loaded once by VMS into memory
(at startup) and stays resident in memory has no effect on performance. However, if the block is
within a portion of SYS.EXE that is frequently brought in by VMS, it could impact performance.
A solution is to increase the VMS working set size.
A nonprimary replaced block within a swap or paging file has little performance impact. If
the system is doing enough paging and swapping to notice the occurrence of nonprimary
replacements, the real problem may be with the user or system working set size. Performance
may improve if the system manager adjusts system parameters around paging and swapping.
VMS uses virtual block file structures, not logical blocks. VBNs do not correlate to LBNs. To
correlate an LBN to the affected file, contact someone familiar with the operating system file
structure, such as VMS ODS-2. Identifying affected files within ODS-2 is very complicated.
2. Difficulty (but success) in achieving fine track following a seek.
The RA90/RA92 disk drive utilities T36, T3B, and T39 measure various seek time parameters.
Compare measured times to drive specifications in cases where seek time is in question.
Temperature can affect the performance of T36 and T3B.

5.15 Troubleshooting VMS Mount Verification
EXE$MOUNTVER is the VMS executable mount verification process to bring disks back on line
after a problem has made the drives inaccessible to a host VAX. It is a very complicated process.
If any failure to reinitialize the disk occurs, or if EXE$MOUNTVER exceeds its allowed timeout
period (default 10 minutes), the host logs a mount verification error to the host error log.

5.15.1 VMS Mount Verification
The mount-verification feature of Files-11 disk handling generally leaves users unaware that a
mounted disk has gone off line and returned on line (or in some other way has been unreachable
and then restored). Mount verification is the default parameter for EXE$MOUNTVER, with the
following exceptions: Disks mounted /FOREIGN and disks mounted /NOMOUNTVERIFICATION
do not undergo mount verification except during cluster state transitions.
Drives dual-ported through HSC controllers should never be mounted /NOMOUNTVERIFICATION
because this may prevent VMS from failing the drive over to the secondary HSC controller.
EXE$MOUNTVER sends status messages to OPCOM. Because there are cases when mount
verification messages are needed at the operator console and OPCOM might not be able to provide
them, mount verification also sends special messages with the prefix %SYSTEM-I-MOUNTVER to
the operator console, OPA0.

5.15.2 VMS Problems Surrounding Diagnosis of "Why a Drive Mount-Verifies"
VMS calls EXE$MOUNTVER if a drive loses contact with the system. (For example, the controller
sends a command to the drive but does not get a successful response back within the
controller-specific timeout period.) The process verifies that the disk VMS reestablished contact with is the
same disk originally connected.
Sending the drive to the mount-verify state involves:
1. The host initiating an MSCP ONLINE command to the drive modifier followed by a GET UNIT
STATUS (GUS).


2. The host reading the home block and comparing the volume information (serial number, name,
etc.) for the drive before VMS lost contact and after VMS reestablished contact with the drive
during mount verification.
This sequence is repeated until success or timeout. This sequence is made evident by the drive
having a port light on and the Ready light blinking quite slowly as the controller accesses the FCT
for the on line and LBN block for the media ID, effectively doing full-stroke seeks.
The MVTIMEOUT system parameter defines the time (in seconds) allowed for a pending mount
verification to complete before it is aborted. This dynamic parameter should always be set to a
reasonable value for the typical operations at the site.
NOTE

Do not use values less than the recommended default of 600 seconds (10 minutes).
After a mount verification times out, any pending I/O requests to the volume will fail. Try to
execute the DISMOUNT/ABORT command which allows a subsequent mount to be successful if the
MV-timer has previously expired. In some extreme cases, drive failures may require a reboot of the
controller; some require a reboot of the system.
Entry and exit to or from MOUNT VERIFY are time stamped. VAXcluster time stamps may vary
across the cluster nodes due to differences in the TOY clocks and the initial clock times. Slight
variations in time stamps do not indicate multiple drive or controller failures causing MOUNT
VERIFICATION, but rather one drive or controller failure causing every node to enter MOUNT
VERIFICATION at their own locally specified time.
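
Because each node stamps the event with its own clock, entries from several nodes can be grouped into one incident when their times fall close together. The following is a minimal sketch under assumptions: the entries have been copied by hand from each node's log as (node, time) pairs, the 30-second tolerance is an arbitrary illustrative value, and the function name and node names are hypothetical.

```python
from datetime import datetime, timedelta


def group_incidents(entries, tolerance=timedelta(seconds=30)):
    """Group (node, time) entries whose times are within `tolerance`
    of the previous entry in the same group."""
    incidents = []
    for node, when in sorted(entries, key=lambda entry: entry[1]):
        if incidents and when - incidents[-1][-1][1] <= tolerance:
            incidents[-1].append((node, when))
        else:
            incidents.append([(node, when)])
    return incidents


# Hypothetical mount verification entry times from three cluster nodes.
entries = [
    ("NODEA", datetime(1990, 6, 1, 10, 0, 5)),
    ("NODEB", datetime(1990, 6, 1, 10, 0, 0)),
    ("NODEC", datetime(1990, 6, 1, 10, 0, 12)),
    ("NODEA", datetime(1990, 6, 1, 14, 30, 0)),
]
incidents = group_incidents(entries)
for incident in incidents:
    print([node for node, _ in incident])
```

Each resulting group is treated as one drive or controller failure seen by several nodes, not as several separate failures.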
Some reasons why a drive enters mount verification:
• Disk drive dropped off line because of:
  - Port switch glitch.
  - Drive fault.
• Lost communications with controller or cable fault (drive temporarily went away and came
  back).
• Drive status changed (operator physically did something with the drive).
• Someone accidentally pushed the Write Protect switch.

By noting the time duration of the mount verification and other circumstances surrounding the
mount verify status, you can determine some valuable troubleshooting information.

How long did the mount verify take?
Less than MVTIMEOUT and the drive eventually succeeded.

A few seconds-implying a glitch or a recoverable fault.

Did the drive appear on another controller after the mount verification? If so, it could be
a port-related problem.
Thirty seconds to a minute to remount probably means the drive was spun down and had to be
spun back up. Was this due to a drive fault? Did it run its spin-up diagnostics error free?

Infinite time probably means that, along with the drive disappearing, it also changed its media_ID,
or it is a different drive, or it continually fails its spin-up diagnostics, or there is a hard fault on the
drive.
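
The duration guidelines above can be sketched as a small triage helper. This is only an illustration: the function name, the return labels, and the 10-second reading of "a few seconds" are assumptions, not values from this manual. The 600-second default mirrors the recommended MVTIMEOUT value.

```python
def classify_mount_verify(duration_s, mvtimeout_s=600):
    """Return a rough hint for a mount verification lasting duration_s seconds.

    The thresholds are illustrative readings of the prose guidelines.
    """
    if duration_s >= mvtimeout_s:
        # Never completed: media ID changed, different drive, failing
        # spin-up diagnostics, or a hard fault on the drive.
        return "timeout"
    if duration_s <= 10:  # assumed meaning of "a few seconds"
        return "glitch"
    if 30 <= duration_s <= 60:
        # Drive was probably spun down and had to spin back up.
        return "spin_cycle"
    return "unclassified"
```

A result of "unclassified" simply means the duration falls outside the ranges the text describes; correlate it with the drive internal error log.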

What happened?
VMS does not log errors during the MOUNT VERIFY process, although it may log some before or
after, depending on how the drive failed.

DIGITAL INTERNAL USE ONLY

5-44 Troubleshooting and Error Codes

Did the drive see a fault during this period? (Examine the drive internal error log for error
information.)
Were any errors logged to the host or HSC console log before or after the mount verify?
Is it always the same drive?
Do any nonexistent drive numbers appear which may characterize a unit select problem?

Was there a last-fail packet from the xDA/xDB shortly after, meaning the controller
faulted/initialized as well?
Did all the drives on a port/K/controller fail?

5.15.3 Non-VMS Mount Verification
RSTS 9.x is tolerant of DSA drives dropping off line. It reinitializes the drive and puts it back on
line. Most other drives remain off line unless the driver is patched to reissue on line before every
command (as RSX does).

5.16 Troubleshooting ECC Errors on RA90/RA92 Disk Drives
Disks are getting bigger and faster. As disk bit and track density increases, the electronics and
mechanical components of the subsystem operate under tighter constraints. This means that error
recovery mechanisms within the architecture may be called upon more frequently to compensate for
these narrow tolerances.
This is one of the significant advantages of a Digital storage solution. Digital integrates into the
design of the controller and the drive error recovery attributes that enhance and ensure data
integrity and delivery to the user. Plug-compatible manufacturers (PCMs) of storage devices, by not
owning the design of both ends of the subsystem (controller and drive), are left with little capacity
to implement such techniques.
The RA90/RA92 disk drive has 14 different error recovery mechanisms (reference Appendix B) and,
therefore, affords excellent recovery potential for data errors. These error recovery mechanisms
provide the margins necessary to protect customer data at increased densities and to ensure that
the data is always delivered successfully.
In order to better determine the significance of logged correctable and uncorrectable ECC errors,
and for assistance in troubleshooting either, note the discussions and error log examples in the
sections that follow.

5.16.1 Uncorrectable ECC Errors-MSCP Status/Event E8
An uncorrectable ECC error is architecturally defined as the occurrence of a controller logging an
MSCP status/event E8 as a result of a read data error. There are two uncorrectable ECC error
types: hard and soft. Both types are reflected by a single MSCP status/event code.

The next two sections help the engineer determine whether the status/event was hard or soft,
and whether it was significant or insignificant.
5.16.1.1 Hard Uncorrectable ECC Errors

A hard uncorrectable ECC error is the occurrence of an uncorrectable ECC error that renders the
drive unable to recover data through any retry or recovery mechanism. An uncorrectable ECC error
is not considered "hard" until all attempts at getting the data are exhausted and the controller has
to terminate its attempts.
Example 5-3 shows a VMS error log error packet where the data was lost due to a hard error. The
fields of note are emphasized in bold.

******************************* ENTRY 29. *******************************
ERROR SEQUENCE 3885.                        LOGGED ON:  SID 0200620E
DATE/TIME 30-JAN-1989 19:54:03.77                       SYS_TYPE 00000000
SCS NODE: PICKUP
ERL$LOGMESSAGE ENTRY   KA750   REV# 14.   UCODE REV# 98.

I/O SUB-SYSTEM, UNIT _HSC013$DUA36:

MESSAGE TYPE          0001
                                   DISK MSCP MESSAGE
MSLG$L_CMD_REF        AF66000F
MSLG$W_UNIT           0024
                                   UNIT #36.
MSLG$W_SEQ_NUM        0054
                                   SEQUENCE #84.
MSLG$B_FORMAT
                                   DISK TRANSFER ERROR
MSLG$B_FLAGS
                                   BAD BLK REPLACEMENT REQUEST
                                   OPERATION CONTINUING
                                   OPERATION SUCCESSFUL
MSLG$W_EVENT          00E8
                                   DATA ERROR
                                   UNCORRECTABLE ECC ERROR
MSLG$Q_CNT_ID         0000F20D
                      01010000
MSLG$B_UNIT_SVR
                                   UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR
                                   UNIT HARDWARE REVISION #1.
MSLG$B_LEVEL          (value and margin annotation illegible in source)
MSLG$B_RETRY          (value and margin annotation illegible in source)
MSLG$L_VOL_SER        0000036C
                                   VOLUME SERIAL #876.
MSLG$L_HDR_CODE       000E75BD
                                   LOGICAL BLOCK #947645.
                                   GOOD LOGICAL SECTOR
CONTROLLER DEPENDENT INFORMATION
ORIG ERR              8010
                                   EDC ERROR
                                   ECC ERROR
ERR RECOV FLGS        0003
                                   LBN REPLACEMENT INDICATED
                                   ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY           00
LVL B RETRY           00
BUF DAT MEM ADR       C41B
SRC REQ #             03
DET REQ #             03
******************************************************

Example 5-3 VMS Uncorrectable ECC Error Log-Hard

The disk subsystem will attempt to recover from an uncorrectable ECC error by retrying the
transfer five times. For an RA90/RA92 disk drive, the controller would then invoke drive recovery
level 14 and execute that recovery mechanism up to five times, then invoke drive recovery level 13,
and so on, until executing the last recovery level (1).
Note that for UDA controllers, the reported recovery levels from the controller will differ from what
the other controllers will report.
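
The recovery order described above can be enumerated with a short sketch. The function name and the use of level 0 to stand for plain retries (matching the "recovery level 0" reporting discussed in this chapter) are illustrative assumptions; actual controller behavior, particularly on UDA controllers, may be reported differently.

```python
def recovery_attempts(retries=5, top_level=14, tries_per_level=5):
    """Yield (recovery_level, attempt_number) in the order described in the text.

    Level 0 stands for a plain retry with no drive recovery mechanism.
    """
    for attempt in range(1, retries + 1):
        yield (0, attempt)
    # Drive recovery levels are walked downward: 14, 13, ..., 1.
    for level in range(top_level, 0, -1):
        for attempt in range(1, tries_per_level + 1):
            yield (level, attempt)


attempts = list(recovery_attempts())
worst_case = len(attempts)  # 5 plain retries + 14 levels x 5 tries each
```

Under these assumptions an error is declared hard only after every entry in this sequence has failed.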
5.16.1.2 Soft Uncorrectable ECC Errors

A soft uncorrectable ECC error is an uncorrectable ECC error on the first read attempt that is
followed by a successful recovery level and/or retry, so that the data is read successfully
(with eight or fewer symbols in error). In such a case, the block is flagged as a BBR
candidate for testing purposes by the HSC controller (or, in the case of a UDA/KDA/KDB controller,
by the host operating system driver).
For uncorrectable ECC errors (MSCP status/event E8), the following items should be considered:
• For the RA90/RA92 disk drive, examine the error log and determine that the MSLG$_LEVEL
  and MSLG$_RETRY (for VMS) are being reported as follows:
  If the recovery level is reported as 0 and the retry count is 1 for the uncorrectable ECC errors,
  an occasional error under high I/O rates may be considered normal. The normal recovery will
  occur on the first retry with a recovery level of 0. If more than a single retry is necessary,
  and especially if other levels of recovery are necessary, this indicates potentially more serious
  error conditions, including the legitimate condition whereby a block is going bad and needs
  replacement.

  The RA90 short-arm HDA and the RA92 HDA will show improved (decreased) ECC error rates.
  The nominal distribution of uncorrectable ECC errors for an RA90 disk drive with a long-arm
  HDA operating at very high I/O rates should appear as follows:
  - Ninety percent of the errors occur in the top five heads (heads 0 through 4).
  - One of the heads (in the 0-4 range) will have no errors logged.
  - At least three of the top five heads will have errors of this type.

  You should have a sample size of at least 16 uncorrectable ECC errors for examination. If
  this distribution of errors is not met, then further analysis should be done.
  For example, if 10 of the 13 heads are logging these data errors, then consider it a general
  read path problem and troubleshoot accordingly.
  If distribution is to a single head, then consider the likelihood of a defective HDA.
  If error log information indicates that data recovery was accomplished by utilizing a
  drive error recovery level of 7 through 14 (head offset mechanism), then consider HDA
  replacement (especially if 9A, 9B, or 9C errors are being logged in the drive as well).

• Each error log entry of an uncorrectable ECC error should be followed by a BBR packet
  (reference Section 5.16.2.1). The MSCP status/event code should reflect a 34, BBR replacement
  attempted but block tested okay. Blocks in a normal drive will be retired at a very low rate (less
  than 20 percent of the time) for the normal transient occurrence of uncorrectable ECC errors on
  RA90 disk drives.
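
The head-distribution guidelines above lend themselves to a quick tally. The sketch below is illustrative only: the function name and return strings are invented here, the "10 of 13 heads" example is treated as a threshold of ten or more heads, and the observation that one of heads 0-4 normally shows no errors is not enforced.

```python
from collections import Counter

TOP_HEADS = range(0, 5)  # heads 0 through 4


def assess_head_distribution(error_heads, min_sample=16):
    """Classify a list of head numbers, one entry per logged ECC error."""
    total = len(error_heads)
    if total < min_sample:
        return "insufficient sample"
    counts = Counter(error_heads)
    if len(counts) == 1:
        return "single head: consider a defective HDA"
    if len(counts) >= 10:  # assumed reading of "10 of the 13 heads"
        return "many heads: consider a general read path problem"
    top_errors = sum(counts[h] for h in TOP_HEADS)
    top_heads_hit = sum(1 for h in TOP_HEADS if counts[h] > 0)
    if top_errors / total >= 0.9 and top_heads_hit >= 3:
        return "nominal distribution"
    return "further analysis needed"
```

For example, a sample of 16 errors with 15 of them on heads 0, 1, and 2 falls within the nominal pattern, while 20 errors all on head 3 points at the HDA.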

Example 5-4 has three fields of note (emphasized in bold). The first emphasized field denotes the
actual MSCP status/event logged (00E8), and a bit-to-text decode denoting that the read error was
an uncorrectable ECC error.
The second field of note indicates how the subsystem recovered from the error condition; in this
case, a single retry was successful with no special error recovery mechanism being invoked to aid in
the recovery of the data.

The third emphasized field is the field within an error log packet that, for an ECC-type MSCP
status/event packet, typically has no meaning and will in almost all cases indicate zeros. This section
of an error log packet will, however, contain significant information for the interpretation of MSCP
status/event 6B error packets.
******************************* ENTRY 29. *******************************
ERROR SEQUENCE 3885.                        LOGGED ON:  SID 0200620E
DATE/TIME 30-JAN-1989 19:54:03.77                       SYS_TYPE 00000000
SCS NODE: PICKUP
ERL$LOGMESSAGE ENTRY   KA750   REV# 14.   UCODE REV# 98.

I/O SUB-SYSTEM, UNIT _HSC013$DUA36:

MESSAGE TYPE          0001
                                   DISK MSCP MESSAGE
MSLG$L_CMD_REF        AE66000F
MSLG$W_UNIT           0024
                                   UNIT #36.
MSLG$W_SEQ_NUM        0054
                                   SEQUENCE #84.
MSLG$B_FORMAT
                                   DISK TRANSFER ERROR
MSLG$B_FLAGS
                                   BAD BLK REPLACEMENT REQUEST
                                   OPERATION CONTINUING
                                   OPERATION SUCCESSFUL
MSLG$W_EVENT          00E8
                                   DATA ERROR
                                   UNCORRECTABLE ECC ERROR
MSLG$Q_CNT_ID         0000F20D
                      01010000
MSLG$B_UNIT_SVR
                                   UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR
                                   UNIT HARDWARE REVISION #1.
MSLG$B_LEVEL          00           <No Drive Recovery Invoked
MSLG$B_RETRY          01           <Single retry was successful
                                   <Minimal impact event
MSLG$L_VOL_SER        0000036C
                                   VOLUME SERIAL #876.
MSLG$L_HDR_CODE       000E75BD
                                   LOGICAL BLOCK #947645.
                                   GOOD LOGICAL SECTOR
CONTROLLER DEPENDENT INFORMATION
ORIG ERR              8010
                                   EDC ERROR
                                   ECC ERROR
ERR RECOV FLGS        0003
                                   LBN REPLACEMENT INDICATED
                                   ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY           00           <For data problems, these
LVL B RETRY           00            fields should contain 'zeros'.
BUF DAT MEM ADR       C41B
SRC REQ #             03
DET REQ #             03
***********************************************

Example 5-4 VMS Uncorrectable ECC Error Log-Soft
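
The hexadecimal field values in these error log examples convert directly to the decimal text printed beside them. The short sketch below checks that arithmetic for a few fields from Example 5-4; the dictionary layout is only a convenience for illustration and is not a VMS data structure.

```python
# Hex values as they appear in the Example 5-4 error log packet.
fields = {
    "MSLG$W_UNIT": "0024",
    "MSLG$W_SEQ_NUM": "0054",
    "MSLG$L_VOL_SER": "0000036C",
    "MSLG$L_HDR_CODE": "000E75BD",
}

# Convert each hex string to its decimal value.
decoded = {name: int(value, 16) for name, value in fields.items()}

for name, value in decoded.items():
    print(f"{name:<16} {fields[name]:>8} = {value}")
```

The results correspond to unit 36, sequence 84, volume serial 876, and logical block 947645.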

5.16.2 Correctable ECC Errors-MSCP Status/Event Codes 1A8, 1C8, 1E8
Correctable ECC errors are those where the data was read with symbols in error above the drive
threshold (6-8 symbols for the RA90/RA92 disk drive). For ECC errors (MSCP status/event codes
1A8, 1C8, and 1E8), consider the following:
• For an RA90 disk drive with a long-arm HDA, an occasional ECC error (including 6-8 symbols
  in error and soft uncorrectable errors) may be considered normal when the drive has sustained
  or burst I/O rates of >30 I/Os per second.
  The RA90 short-arm HDA and the RA92 HDA show a marked improvement (decrease) in ECC
  error rates.
  The nominal distribution of correctable ECC errors for an RA90 disk drive with a long-arm
  HDA should appear as follows:
  - Ninety percent of the errors occur in the top five heads (heads 0 through 4).
  - One of the heads (in the 0-4 range) will have no errors logged.
  - At least three of the top five heads will have errors of this type.

  You should have a sample size of at least 16 correctable ECC errors for examination. If this
  distribution of errors is not met, then further analysis should be done.
  For example, if 10 of the 13 heads are logging these data errors, then consider it a general
  read path problem and troubleshoot accordingly.
  If distribution is to a single head, then consider the likelihood of a defective HDA.
  If error log information indicates that data recovery was accomplished by utilizing a
  drive error recovery level of 7 through 14 (head offset mechanism), then consider HDA
  replacement (especially if 9A, 9B, or 9C errors are being logged in the drive as well).

• Each error log entry of an ECC (6-8 symbol) error should be followed by a BBR packet
  (reference Section 5.16.2.1). The MSCP status/event code should reflect a 34, BBR replacement
  attempted but block tested okay. Blocks in a normal drive will be retired at a very low rate
  (less than 20 percent of the time) for the normal transient occurrence of correctable ECC errors
  on RA90 disk drives.

5.16.2.1 BBR Packet

ECC errors that exceed the drive threshold initiate BBR algorithms. The BBR algorithms are
provided to test, verify, and replace (if needed) defective media spots or marginal media/head spot
combinations (assuming no data path problems). In those instances where the BBR algorithms do
not determine a need for block replacement, it may be due to a transient type error situation,
or mechanisms not attributable to actual head/media margins. These above-drive-threshold
ECC errors (or uncorrectable ECC errors) may be caused by drive phenomena other than bad
media/heads.
The BBR packet, which is generated at the completion of the BBR algorithm, will contain several
important clues about the nature of the ECC error. Included in the packet is whether the block
tested good or bad, and whether the original data was recovered or restored with the FORCED
ERROR flag set, indicating the data was lost.
The following MSCP status/event codes are applicable for a BBR packet:
MSCP status/event 14-Bad block successfully replaced.
MSCP status/event 34-Block verified okay; not a bad block.
MSCP status/event 54-Replacement failure; replace command failed.
MSCP status/event 74-Replacement failure; inconsistent RCT.
MSCP status/event 94-Replacement failure; drive access failure.

MSCP status/event B4-Replacement failure; no block available.
MSCP status/event D4-Replacement failure; two successive RBNs were bad.
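
When scanning many BBR packets, a small lookup of the status/event codes listed above can help separate benign results from replacement failures. This is a sketch only; the function names are invented for illustration and the table contains just the seven codes named in this section.

```python
BBR_EVENTS = {
    0x14: "Bad block successfully replaced",
    0x34: "Block verified okay; not a bad block",
    0x54: "Replacement failure; replace command failed",
    0x74: "Replacement failure; inconsistent RCT",
    0x94: "Replacement failure; drive access failure",
    0xB4: "Replacement failure; no block available",
    0xD4: "Replacement failure; two successive RBNs were bad",
}


def describe_bbr_event(event_hex):
    """Translate an MSLG$W_EVENT hex string such as '0034' to its meaning."""
    return BBR_EVENTS.get(int(event_hex, 16), "not a BBR status/event code")


def is_replacement_failure(event_hex):
    """True for the codes this section lists as replacement failures."""
    return describe_bbr_event(event_hex).startswith("Replacement failure")
```

For example, `describe_bbr_event("0034")` returns the "block verified okay" text, which is the expected outcome for a transient ECC error.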
Example 5-5 illustrates the resulting status of the BBR replacement algorithm. In this
example, the block in question did go through BBR; however, the block was not replaced. Further
in the example, the replace flags demonstrate that the block was not replaced because the block
"verified good." The last segment of the BBR log packet reveals why the block was even tested. In
this example, the block was thought to contain a data error with a severity level of "uncorrectable
ECC."

5.17 Troubleshooting Controller-Detected Positioner Errors-MSCP
Status/Event 6B
MSCP status/event 6B is a positioner unintelligible header error (also referred to as a positioner
error mis-seek). Several considerations must be weighed when troubleshooting the MSCP 6B event.
These include:
• For RA90/RA92 disk drives, what is the I/O rate on the drive?
• Is only one SDI path noting the problem?
• Are other errors being logged at or near the same frequency as the MSCP 6B?
• For RA92 disk drives, what is the write-to-read ratio?
• What recovery level/mechanism is the controller using in order to recover from the situation?

With the RA90/RA92 disk drive, if in the examination of the error log, it can be determined that:
• the Level A retry mechanism is successful on first retry, and
• the Level B retry mechanism is not being used (reported Level B retry count = 0), and
• "all" errors are being recovered on a single retry,

then an error rate of six per day may be considered nominal for the RA90/RA92 disk drives
operating near or above 30 I/Os per second.
Example 5-6 illustrates a typical RA90 error log on a VMS system. The fields of note are
emphasized in bold.
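
The nominal-rate guideline for 6B events can be expressed as a simple check. In this sketch the function name and the tuple layout are assumptions made for illustration, and "near or above 30 I/Os per second" is approximated as 30 or more.

```python
def is_6b_rate_nominal(events, days, ios_per_second,
                       max_per_day=6, busy_threshold=30):
    """Apply the guideline for MSCP 6B events on an RA90/RA92 drive.

    events is a list of (level_a_retries, level_b_retries) tuples, one per
    logged 6B event, taken from the LVL A RETRY and LVL B RETRY fields.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    # Every event must be recovered on one Level A retry with no Level B use.
    single_retry = all(a == 1 and b == 0 for a, b in events)
    rate = len(events) / days
    return single_retry and ios_per_second >= busy_threshold and rate <= max_per_day
```

A drive that logs 10 such events over two days at 35 I/Os per second would be within the guideline; the same events with any Level B retry would not.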

5.17.1 RA92 Disk Drive With MSCP Status/Event 6B
RA92 disk drives may log more occurrences of MSCP status/event 6B than RA90 disk drives in
applications during which long sequences of write activity are occurring. This phenomenon, as a
contributor to 6B events, was recently discovered and identified. Though it occurs more often with
the RA92 disk drive, heavy write-to-read ratios could be a contributor to logged MSCP 6B events by
RA90 disk drives.

The problem occurs within the design of the heads while the head is involved in large
sequential write transfers. When the head has to switch back to read (for next header
identification), noise can result in the head that essentially disrupts the header signal as it is
read. No identifiable damage to the actual header information is exhibited on the media. Customer
data is not at risk. The noise merely disrupts the read chain momentarily as the header is being
read. By the time the next sector comes around, the read chain will have stabilized.
This head phenomenon will result in additional 6B errors being logged when the write-to-read
ratios are heavily weighted in favor of writes. Typical VMS environments may not provide this
scenario. It has been noted that typical ULTRIX/UNIX applications appear to have a higher mix of
write-to-read activity than VMS applications. However, regardless of the operating system, certain
applications may increase the potential of this phenomenon occurring when those applications, by
their nature, offer heavy write-to-read ratios.

****** ENTRY 6., ERROR SEQUENCE 4709. LOGGED ON SID 05283914
ERL$LOGMESSAGE ENTRY   KA820 REV# E   PATCH REV# 28.   UCODE REV# 20.
                       BI NODE # 2.

I/O SUB-SYSTEM, UNIT _HSC015$DUA36:

MESSAGE TYPE          0001
                                   DISK MSCP MESSAGE
MSLG$L_CMD_REF        6BBC000A
MSLG$W_UNIT           0024
                                   UNIT #36.
MSLG$W_SEQ_NUM        0002
                                   SEQUENCE #2.
MSLG$B_FORMAT
                                   BAD BLOCK REPLACEMENT ATTEMPT
MSLG$B_FLAGS
                                   OPERATION SUCCESSFUL
MSLG$W_EVENT          0034
                                   BAD BLOCK REPLACEMENT
                                   BLOCK VERIFIED GOOD
MSLG$Q_CNT_ID         0000FC15
                      01200000
                                   UNIQUE IDENTIFIER, 00000000FC15(X)
                                   MASS STORAGE CONTROLLER
                                   HSC70
MSLG$B_CNT_SVR
                                   CONTROLLER SOFTWARE VERSION #39.
MSLG$B_CNT_HVR
                                   CONTROLLER HARDWARE REVISION #0.
MSLG$W_MULT_UNT       0060
MSLG$Q_UNIT_ID        000003F6
                      02130000
                                   UNIQUE IDENTIFIER, 0000000003F6(X)
                                   DISK CLASS DEVICE (166)
                                   RA90
MSLG$B_UNIT_SVR
                                   UNIT SOFTWARE VERSION #11.
MSLG$B_UNIT_HVR
                                   UNIT HARDWARE REVISION #1.
MSLG$W_RPL_FLGS       0000
MSLG$L_VOL_SER        0000036C
                                   VOLUME SERIAL #876.
MSLG$L_BAD_LBN        00175A52
                                   BAD LOGICAL BLOCK
                                   NUMBER = 1530450.
MSLG$L_OLD_RBN        00000000
MSLG$L_NEW_RBN        000056A4
MSLG$W_CAUSE          00E8
                                   DATA ERROR
                                   UNCORRECTABLE ECC ERROR
*******************************************

Example 5-5 VMS BBR Packet

VMS LOGGED MSCP '6B' POSITIONER ERRORS

******************************* ENTRY 1. *******************************
ERROR SEQUENCE 2151.                        LOGGED ON:  SID 1105009C
DATE/TIME 26-JUL-1990 11:12:49.31                       SYS_TYPE 00000000
ERL$LOGMESSAGE ENTRY   KA88   REV# 5.   CPU # 0.

I/O SUB-SYSTEM, UNIT _HSC4$DUA39:

MESSAGE TYPE          0001
                                   DISK MSCP MESSAGE
MSLG$L_CMD_REF        56310024
MSLG$W_UNIT           0027
                                   UNIT #39.
MSLG$W_SEQ_NUM        001B
                                   SEQUENCE #27.
MSLG$B_FORMAT
                                   DISK TRANSFER ERROR
MSLG$B_FLAGS
                                   SEQUENCE NUMBER RESET
                                   OPERATION SUCCESSFUL
MSLG$W_EVENT          006B
MSLG$Q_CNT_ID         0017F20D
                      01010000
                                   UNIQUE IDENTIFIER, 00000017F20D(X)
                                   MASS STORAGE CONTROLLER
                                   HSC50
CONTROLLER DEPENDENT INFORMATION
ORIG ERR              1800
                                   HEADER COMPARE ERROR
                                   HEADER SYNC TIMEOUT
                                   SUSPECTED LOW HEADER MISMATCH
ERR RECOV FLGS        0002
                                   ERR LOGGED TO CONSOLE AND HOST
LVL A RETRY           01           <--- 1 "A" RETRY
LVL B RETRY           00           <--- AND NO "B" RETRIES
BUF DAT MEM ADR       C4BF
SRC REQ #             02
DET REQ #             02

Example 5-6 Positioner Mis-Seek MSCP Status/Event 6B

The occurrence of 6B errors caused by this phenomenon has been more pronounced on the
KDM/HSC controllers than on the KDA/KDB/UDA controllers. Since experience and engineering
evaluation have shown that the occasional occurrence of the MSCP status/event 6B, when recovered
on a single retry, is inconsequential, extra error management code has been implemented as follows:
• HSC software released after the 39x series will contain special 6B error management code that
  will look for this error signature and will not report this event characteristic of the RA90/RA92
  product.
• The KDM70 controller with microcode at revision level 2 will also contain this enhanced error
  management code for 6B errors on RA90/RA92 disk drives.

This phenomenon is being aggressively pursued by Digital and resolution details will be
communicated to the field.

5.17.2 Evaluating MSCP 6B Events
When converting a sample (20-30) of the target LBNs identified as 6B MSCP events,
look for the following:
• Single head but quite random cylinder addresses: consider the HDA.
• Single head but narrow band of cylinder addresses: consider mapping out suspect LBNs with
  DKUTIL or HDA replacement. To manually force replacement of a perceived bad block, make
  sure a current disk backup exists.
• Repeating LBNs: consider "mapping" out suspect LBNs with the BBR utility (DKUTIL).
• Random heads (10 of 13 heads): consider data path including controller SDI module.
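
The patterns above can be screened with a short script once the target LBNs have been converted to cylinder and head addresses. The sketch below assumes that conversion has already been done by other means; the function name, the tuple layout, and the 50-cylinder definition of a "narrow band" are illustrative assumptions, not values from this manual.

```python
from collections import Counter


def classify_6b_pattern(events, narrow_band=50, many_heads=10):
    """Return troubleshooting hints for a list of (lbn, cylinder, head) tuples."""
    if not events:
        return []
    hints = []
    lbn_counts = Counter(lbn for lbn, _, _ in events)
    if any(count > 1 for count in lbn_counts.values()):
        hints.append("repeating LBNs: consider mapping out with DKUTIL")
    heads = {head for _, _, head in events}
    cylinders = [cyl for _, cyl, _ in events]
    if len(heads) == 1:
        if max(cylinders) - min(cylinders) <= narrow_band:
            hints.append("single head, narrow cylinder band: "
                         "consider DKUTIL or HDA replacement")
        else:
            hints.append("single head, random cylinders: consider the HDA")
    elif len(heads) >= many_heads:
        hints.append("random heads: consider data path and controller SDI module")
    return hints
```

A sample of 20 to 30 events, as suggested in the text, gives the classification something meaningful to work with; smaller samples may yield misleading hints.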

Troubleshoot MSCP status/event 6B as follows:
• Update the drive with the latest drive microcode version.
• If errors are only happening on one port, pursue a port path problem, including ECM, SDI
  cables between drive and bulkhead, cabinet to controller cabinet, and within the controller
  cabinet and the port interface module in the controller.
• Note whether more than one drive on the requester is reporting consistent 6B events. This
  would more definitely suggest a port interface problem within the controller.
• If errors are clearly happening on both drive ports, pursue the problem as a drive problem
  first, when the event rate exceeds the guidelines indicated above and/or customer satisfaction
  dictates.

5.18 Conclusion
The DSA architecture defines a very reliable and flexible storage subsystem. This subsystem can be
maintained efficiently and effectively when consistent and methodical troubleshooting procedures
are followed.
Poorly trained or untrained Customer Services engineers are at a serious disadvantage. The cost
of supporting incorrectly identified FRUs is very high. Many FRUs are expensive to
replace, and some very expensive FRUs are not repairable. The impact to a customer can be
substantial. Impacts include:
• Necessity to back up and restore potentially large amounts of data on misdiagnosed HDA
  replacements.
• Loss of system availability when using standalone diagnostics with controllers such as
  UDA/KDA/KDB.
• Loss of drive availability when performing extensive subsystem diagnostics using an HSC
  controller.

• Increased frustration and inconvenience of dealing with repeated calls.
• Loss of confidence in Digital as a quality supplier of storage systems.
• Increased potential of data loss if improper diagnosis is made and the failure mode continues or
  gets worse.

SERVICE GOAL

The Customer Services engineer's number one goal in service efforts is to correctly
diagnose a problem on the first call and replace the correct part so the customer's disk
and data availability is minimally impacted.

5.19 Error Codes and Descriptions
This section describes RA90/RA92 disk drive error codes. Included in each error code description is
a list of suggested replacement FRUs for repairing drive problems.
Careful analysis of both system and drive internal error logs, along with drive-generated error
codes, should lead to problem isolation and correction.
Error codes are listed in hex numerical order starting with error code 01 through error code FD
(hex). The general format of the error code listings is as follows:

01 ❶ Spindle Motor Transducer Timeout ❷

Error Type: DE ❸
Error Description: ❹ The spindle was given the command to spin up by an SDI command
or from the front panel Run switch and no movement was detected by the spindle motor
transducer. See error code 13 for possible isolation help before replacing FRUs.

Fault Isolation/Correction: ❺
1. ECM
2. HDA
3. Rear flex cable assembly

Where:
❶ 01 is the error code.
❷ SPINDLE MOTOR TRANSDUCER TIMEOUT is the error message.
❸ DE is the error type.
❹ Error Description: is a brief summary of the error event.
❺ Fault Isolation/Correction: is the suggested FRU replacement order for
  troubleshooting.

01 Spindle Motor Transducer Timeout

Error Type: DE
Error Description: The spindle was given the command to spin up by an SDI command
or from the front panel Run switch, and no movement was detected by the spindle motor
transducer. See error code 13 before replacing FRUs.
Fault Isolation/Correction:
1. ECM
2. HDA
3. Rear flex cable assembly

02 Spinup Too Slow

Error Type: DE
Error Description: The spindle did not reach 1000 r/min within 20 seconds. See error code 13
before replacing FRUs.
Fault Isolation/Correction:
1. ECM
2. HDA
3. Rear flex cable assembly

03 Spindle Not Accelerating During Spinup

Error Type: DE
Error Description: The spindle did not accelerate above 1000 r/min in the allotted spinup
timeout period. See error code 13 before replacing FRUs.
Fault Isolation/Correction:
1. ECM
2. HDA
3. Rear flex cable assembly

04 Spinup Too Long to Lock on Speed

Error Type: DE
Error Description: The spindle did not reach 3600 r/min (± 18 r/min) within 30 seconds. See
error code 13 before replacing FRUs.
Fault Isolation/Correction:
1. ECM
2. HDA
3. Rear flex cable assembly

05 Invalid Drive Serial Number Code

Error Type: DF
Error Description: The drive serial number is out of acceptable range or an invalid
manufacturing plant code was read by the drive microcode.
Switches are set (or read) incorrectly on the rear flex cable assembly (S1/S2). This is neither a
fatal error nor a hard error. Clearing the fault allows the drive to continue operation. The drive
serial number is checked during the power-up sequence.

Table 5-8 Serial Number

Bits                Serial Number Range     Max Binary Value
<19:18>    Mfg      in Decimal              Bits <17:00>
           CX       0-262,143               111111111111111111
           CX       262,144-309,999         001011101011101111
                    310,000-524,287         111111111111111111 invalid
                    0-262,143               111111111111111111
                    0-262,143               111111111111111111 invalid

Fault Isolation/Correction:
1. Incorrect S1/S2 bits set on rear flex cable assembly
2. Rear flex cable assembly
3. ECM seating problem
4. ECM
06 Microcode Fault

Error Type: DF
Error Description: A hardware/software failure caused the master processor addressing to
point to a null EEPROM area.
Fault Isolation/Correction:
1. Reload drive microcode
2. ECM

07 SDI Frame Sequence Error

Error Type: RE
Error Description: Level 1 SDI commands were detected in the wrong sequence. If the same
drive is reporting errors from two controllers, start troubleshooting at the drive.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

08 SDI Lvl 2 Checksum Error

Error Type: RE
Error Description: The calculated checksum did not compare with the checksum field sent by
the controller to the drive for SDI level 2 commands. If the same drive is reporting errors from
two controllers, start troubleshooting the drive.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

09 SDI Lvl 1 Framing Error

Error Type: RE
Error Description: A sync pattern was detected by the drive on the SDI WRITE/COMMAND
line, but no SDI level 1 control message transmission or single frame command was detected.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

0A SDI Incorrect Command Opcode Parity Error

Error Type: PE
Error Description: The wrong parity was detected on the opcode byte of a level 1 or level 2
command.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

0B SDI Invalid Opcode

Error Type: PE
Error Description: The decoded opcode is not a valid (level 2) SDI opcode.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

0C SDI Command Length Error (LVL2)

Error Type: RE
Error Description: This error indicates the controller caused the drive SDI input command
buffer to overflow.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

0D SDI Invalid Command with Drive Error

Error Type: PE
Error Description: The controller issued an INITIATE SEEK command, an ERROR
RECOVERY command, or a RECALIBRATE command while the drive was faulted.
Fault Isolation/Correction:
1. Controller
2. ECM
3. SDI cable

0E SDI Lvl 1 Invalid Select Group Number

Error Type: RE
Error Description: Indications are the controller attempted to select a nonexistent group. For
RA90 and RA92 disk drives, group=head.
Fault Isolation/Correction:
1. Controller
2. ECM
3. SDI cable

0F SDI Write Enable on a Write-Protected Drive

Error Type: PE
Error Description: A drive write-protected from the OCP (front panel) was issued a WRITE
ENABLE command through an SDI CHANGE MODE command. The OCP switch state has
priority over any SDI CHANGE MODE commands.
Fault Isolation/Correction:
1. Disable Write Protect switch
2. Controller
3. ECM
4. OCP

10 SDI Command Length Error (LVL2)

Error Type: PE
Error Description: An SDI command length error, LVL2, indicates the number of bytes
expected did not equal the number of bytes received for an SDI level 2 command.
Fault Isolation/Correction:
1. Controller
2. SDI cable
3. ECM

11 Microcode Cartridge Load Occurred

Error Type: Informational Only
Error Description: This logged event indicates that a drive microcode update successfully
occurred.
This new event occurred with the introduction of the Etch-F I/O-R/W module. Etch-F revision
ECM boards are indicated by revision 1 or later in the IOP and SRV values reported with drive
internal test T45. (There are a minimal number of Etch-E revision modules that provide this
information.)
Fault Isolation/Correction:
1. Information only

12 Spindle Speed Unsafe Error

Error Type: DE
Error Description: During idle loop, a spindle speed check indicated the drive was not up to
speed at 3600 r/min (± 18 r/min). The servo processor will also detect this condition dynamically
and have the master processor log this error as well.
Fault Isolation/Correction: Disabling the brake circuit may aid in troubleshooting. The
brake can be disabled by opening either pin 4 or 5 of the rear HDA connector. Use the pin
extraction tool (P/N 29-26655-00) to avoid breaking pins.
CAUTION

The female pins in the HDA connector are delicate and must be handled with care.
When disabling the brake, cover loose pins with electrical tape to prevent them from
shorting.
1. Reseat HDA
2. ECM
3. Power supply
4. Brake
5. HDA

13 Spindle Motor Control Fault

Error Type: DE
Error Description: The motor control IC detected a condition that prevented the spindle from
getting up to speed.
Fault Isolation/Correction:
1. Reseat ECM/HDA
2. ECM
3. HDA
A number of checks are made to detect this fault. A failure of any of the following checks
results in this error:
1. If no Hall effect is seen within 700 ms after current is applied to the spindle motor.
2. If the SSI chip on the servo module which controls spindle speed rotation is operating at
less than 6.8 volts.
3. If the brake circuit is activated at the same time that current is applied to the spindle.
4. If the Hall sensor input from the spindle motor is not occurring at a 700 ms rate.
Additionally, any open condition in the spindle circuitry, including Hall sense phase or spindle
motor phase circuitry, causes this error to be asserted.
Although power supply voltages cannot be adjusted, they can be measured by removing the
small cover as shown in Figure 5-8 (power supplies bearing a serial number starting with
only). On the back of the connector, the pin numbers are visible. A very small electrical probe
is required to make connection.

Figure 5-8 Power Supply Cover Removal
(CXO-2184B; callouts: power supply, quarter-turn hold-down screws, power supply access cover)

Removal of this cover allows access to the power supply output voltage connector. To remove
the power supply cover, use a quarter-inch hex driver. Remove the hold-down screws. Next, use
a DVM or oscilloscope to measure the points to ground (black lead) as shown in Table 5-9.
Table 5-9 Power Supply Voltage Measurements

Wire Color    Voltage Measurement        Deviation
Orange        +12 Vdc                    ±0.6 V
Black         ±12 Vdc (return)
Black         ±12 Vdc (return)
Blue          -12 Vdc                    ±0.6 V
Red           +5.1 Vdc                   ±0.25 V
Red           +5.1 Vdc                   ±0.25 V
Red           +5.1 Vdc                   ±0.25 V
Red           +5.1 Vdc                   ±0.25 V
Black         +5.1 Vdc (return)
Black         +5.1 Vdc (return)
Black         +5.1 Vdc (return)
Black         +5.1 Vdc (return)
Purple        -5.2 Vdc                   ±0.17 Vdc
Purple        -5.2 Vdc                   ±0.17 Vdc
Brown         -24 Vdc                    ±2.4 Vdc
Brown         -24 Vdc                    ±2.4 Vdc
Brown         -24 Vdc                    ±2.4 Vdc
Black         ±24 Vdc (return)
Black         ±24 Vdc (return)
Yellow        +24 Vdc                    ±2.4 Vdc
Yellow        +24 Vdc                    ±2.4 Vdc
Yellow        +24 Vdc                    ±2.4 Vdc
Brown         40 kHz H
Blue          -5.2 Vdc (sense)
Black         -5.2 Vdc (sense return)
Orange        DCOK H
Red           OVTEMP H
Blue          POCK H
White         ON/OFF L

In addition to these measurements, error codes 2D and FF indicate power problems.
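When several rails are measured with a DVM, the readings can be compared against the nominal values and deviations of Table 5-9 in one pass. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and the deviation values are the ones read from the table.

```python
# Nominal voltage and allowed deviation per rail, as read from Table 5-9.
RAILS = {
    "+12 Vdc": (12.0, 0.6),
    "-12 Vdc": (-12.0, 0.6),
    "+5.1 Vdc": (5.1, 0.25),
    "-5.2 Vdc": (-5.2, 0.17),
    "-24 Vdc": (-24.0, 2.4),
    "+24 Vdc": (24.0, 2.4),
}


def out_of_tolerance(readings):
    """Return the sorted rail names whose reading is outside nominal +/- deviation."""
    bad = []
    for rail, measured in readings.items():
        nominal, deviation = RAILS[rail]
        if abs(measured - nominal) > deviation:
            bad.append(rail)
    return sorted(bad)
```

For example, readings of 12.3 V on the +12 Vdc rail, 4.7 V on the +5.1 Vdc rail, and -26.5 V on the -24 Vdc rail would flag the last two rails only.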
Along with the power supply measurements, a number of resistance checks can be made to the
HDA. The HDA must first be removed from the drive chassis. Exercise care when handling
the HDA so that connector pins are not damaged during measurements. DO NOT jam probes
into the connector housing from the front of the connector because it is easy to damage the pins
in these sockets. Access the pins from the rear of the connector or use the pin insert/extract
tool (P/N 29-26655-00) to remove pins from connectors for easier measurements. Refer to
Table 5-10 to locate opens in the circuits.
Table 5-10 lists pin-to-circuit connections.
Table 5-10 HDA Connector Pin Designations

Pin     Wire Color    Circuit
        Blue          Positioner lock solenoid (-)
        Blue          Positioner lock solenoid (+)
        White         Brake (-)
        White         Brake (+)
        Green
        Violet
        Flex          Positioner actuator flex (-)
        Orange
        Flex          Positioner actuator flex (+)
        Brown         Hall sensor ground
        Gray          Spindle motor coil C
        Red           Hall sensor 5 V input
        Blue          Spindle motor coil B
        Black         Spindle motor coil A
Grnd    Yellow        Spindle motor lamination lead exits HDA and is grounded on HDA.

Resistance measurements are checked according to Table 5-11.

Table 5-11 HDA Resistance Measurements

(-)Pin to (+)Pin    Circuit                     Measured Value
16-14               Coil A - Coil B             1.4 ohm
16-12               Coil A - Coil C             1.4 ohm
14-12               Coil B - Coil C             1.4 ohm
16 - HDA ground     Coil A - ground             20 megohm
14 - HDA ground     Coil B - ground             20 megohm
12 - HDA ground     Coil C - ground             20 megohm
9-7                 S1 - S2                     20 megohm
9-6                 S1 - S3                     20 megohm
7-6                 S2 - S3                     20 megohm
9-13                S1 - Hall 5 V               20 megohm
7-13                S2 - Hall 5 V               20 megohm
6-13                S3 - Hall 5 V               20 megohm
9-11                S1 - Hall ground            ~4.50 megohm
7-11                S2 - Hall ground            ~4.30 megohm
6-11                S3 - Hall ground            ~4.50 megohm
11-13               Hall ground - Hall 5 V      ~7 megohm
1-2                 Positioner lock solenoid    ~30 ohm
8-10                Actuator coil               ~4 ohms
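Because the coil-to-coil entries in Table 5-11 are used to locate opens, a set of readings can be screened for values far above the nominal 1.4 ohm. This is a sketch under a stated assumption: the manual gives no tolerance, so the factor of ten used below as an open-circuit threshold is hypothetical, as are the names.

```python
# Nominal coil-to-coil resistance from Table 5-11 (pins 16-14, 16-12, 14-12).
NOMINAL_COIL_OHMS = 1.4
# Hypothetical threshold: no tolerance is stated in the manual, so any
# reading above ten times nominal is treated here as an open circuit.
OPEN_FACTOR = 10.0


def open_windings(readings):
    """Return the pin pairs whose reading suggests an open winding."""
    limit = NOMINAL_COIL_OHMS * OPEN_FACTOR
    return [pins for pins, ohms in readings.items() if ohms > limit]
```

With readings of 1.4 ohm on 16-14, an open (infinite) reading on 16-12, and 1.5 ohm on 14-12, only the 16-12 pair is reported.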

14 Head Offset Margin Event

Error Type: DE
Error Description: This is not an error condition. Manufacturing sets the enable flag for the
detection of this event. If this code shows up in the field, reset the flag by taking the drive off
line and powering it down and then up.
Fault Isolation/Correction:

1. Power drive off and back on.
15 Head Offset Out-of-Band Error

Error Type: DE
Error Description: Head offset has exceeded normal head offset parameters for this drive.
This is a serious problem. Data is in danger of being lost. Do not use the drive for further
writes. Initiate prompt backup. Head offset errors can result from an over-temperature
condition. Check drive airflow and ambient room temperature. If temperature appears to
be normal, replace the HDA.
The amount of offset necessary before this error is flagged is ±3/4 of a track. After each
offset table rebuild, the servo processor tests each head value against this threshold. If a head
exceeds offset limits, the master processor asserts ATTENTION and uses the GET STATUS
response to identify which head or heads are involved.
The drive specific bytes of the drive internal error log should indicate which head has marginal
offsets.
Fault Isolation/Correction:
1. HDA
2. ECM

3. PCM
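The per-head threshold test described for code 15 amounts to comparing each entry of the rebuilt offset table against the 3/4-track limit. The sketch below assumes offsets are expressed as fractions of a track and that heads are numbered from zero; both are illustrative assumptions, not details taken from the servo processor.

```python
OFFSET_LIMIT_TRACKS = 0.75   # 3/4 of a track, from the error description


def heads_out_of_band(offsets):
    """Return the head numbers whose offset magnitude exceeds the limit."""
    return [
        head
        for head, offset in enumerate(offsets)
        if abs(offset) > OFFSET_LIMIT_TRACKS
    ]
```

The returned list corresponds to the heads that the GET STATUS response would identify after ATTENTION is asserted.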
16 SDI Invalid Group Select LVL2

Error Type: PE
Error Description: The controller attempted to select a nonexistent group. A group refers
to a head in the RA90 and RA92 disk drives. If the drive is dual-ported and logging this error
from both controllers, troubleshoot the drive.
Fault Isolation/Correction:

1. Controller
2. ECM


17 SDI Port A Command/Response Timeout

Error Type: Informational Only
Error Description: The Port A controller did not accept message response data from the
drive. This is typically a communications event and not a drive error.
Fault Isolation/Correction:
1. Communications event (typically not a drive problem)
2. Controller on Port A
3. ECM
18 SDI Port B Command/Response Timeout

Error Type: Informational Only
Error Description: The Port B controller did not accept message response data from the
drive. This is typically a communications event and not a drive error.
Fault Isolation/Correction:
1. Communications event (typically not a drive problem)
2. Controller on Port B
3. ECM
19 SDI Invalid Format Request

Error Type: PE

Error Description: The controller requested that the drive place itself in 576-byte format.
The RA90/RA92 only accepts 512-byte format. This error can also be caused by someone trying
to format the drive in 576-byte mode.
Fault Isolation/Correction:
1. Controller
2. ECM
1A SDI Invalid Cylinder Address

Error Type: PE
Error Description: The drive decoded a nonexistent cylinder address during a controller-initiated
SEEK command.
This error also occurs when a controller, while running diagnostics, attempts to test the DBN
area of the disk without first setting the drive's DB bit.
This error also occurs if an attempt is made to access cylinders beyond the DBN space if the DB
bit is set.

Fault Isolation/Correction:
1. Controller
2. ECM


1B Inner Guardband Error

Error Type: DE
Error Description: The drive hardware detected servo inner guardband information instead
of servo data information or outer guardband information. The only time the servo head is
positioned in the inner guardband area and does not generate an error is during execution of
diagnostics.
Fault Isolation/Correction:
1. ECM

2. HDA
NOTE

If an actuator current error or actuator speed error is also indicated, it is probable
that the inner guardband error is secondary. Reference the respective actuator error.
1C Outer Guardband Error

Error Type: DE
Error Description: Outer guardband information was decoded when servo or inner guardband
information was expected. The only time the servo head is positioned in the outer guardband
area and does not generate an error is during execution of a head load operation, a recalibrate,
or internal diagnostics.
Fault Isolation/Correction:
1. ECM
2. HDA
NOTE

If an actuator current error or actuator speed error is also indicated, it is probable
that the inner guardband error is secondary. Reference the respective actuator error.
1D Illegal Servo Fault

Error Type: DE
Error Description: A servo fault was detected by the GASP array; however, when the master
processor examined the register information, the error was invalid.
Fault Isolation/Correction:
1. ECM
1E Power-Up After AC Power Loss

Error Type: Information only
Error Description: Information event noting that the drive performed a power-up sequence
after ac power loss. This may be the result of turning the drive power off at the breaker, or loss
of ac power to the drive/cabinet.

This new event occurred with the introduction of the Etch-F I/O-R/W module. Etch-F revision
ECM boards are indicated by revision 1 or later in the IOP and SRV values reported with drive
internal test T45. (There are a minimal number of Etch-E revision modules that provide this
information.)

This event is different from the event logged as a result of the power supply being over
temperature.

Fault Isolation/Correction:
1. Information Only
1F Sector Overrun Error

Error Type: DE
Error Description: When a sector or index pulse occurs with either WRITE GATE or READ
GATE asserted, an overrun error is asserted. This indicates a write or read operation was
attempted through a sector/index boundary.

Fault Isolation/Correction:
1. Controller

2. ECM
20 SDI RTCS Parity Error

Error Type: DE
Error Description: A bit was dropped or picked up in data transferred on the SDI Real Time
Controller State (RTCS) line.

Fault Isolation/Correction:
1. Controller
2. SDI cable

3. ECM
21 SDI Transfer (Pulse) Error

Error Type: DE
Error Description: An extra or missing pulse was detected on the SDI WRT/CMD line or the
RTCS line. If this error occurs from both ports and/or more than one controller, troubleshoot
the drive. If only one port is involved, troubleshoot the SDI cables or the controller. See Figure
5-9.
Figure 5-9 WRT/CMD Data Format
(CXO-1325B; encoded data waveform with a bit cell time of 86.2 ns and a pulse width of 12 +/-2 ns)

On the WRT/CMD and RTCS lines, a positive transition at the leading edge of a bit cell
indicates a one; a negative transition indicates a zero. If the next bit cell contains the same
data (a one followed by a one or a zero followed by a zero), the line switches polarity in the
middle of the bit cell.
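The encoding rule in the paragraph above can be modeled by splitting each bit cell into two halves and recording the line level in each half. This is a simplified sketch of the rule as worded here, not a model of the actual pulse waveform in Figure 5-9; the function names are hypothetical, and the level held at the end of the final cell is an assumption.

```python
def encode(bits):
    """Return one (first_half, second_half) line-level pair per bit cell.

    A one starts its cell with a positive transition (level 1 afterwards),
    a zero with a negative transition (level 0 afterwards). When the next
    bit is the same, the line switches polarity in the middle of the cell
    so that the same transition can occur again at the next leading edge.
    """
    cells = []
    for i, bit in enumerate(bits):
        first = bit
        same_next = i + 1 < len(bits) and bits[i + 1] == bit
        second = 1 - first if same_next else first
        cells.append((first, second))
    return cells


def decode(cells):
    """Recover the bits from the level that follows each leading edge."""
    return [first for first, _ in cells]
```

Encoding the bits 1, 1, 0, 0, 1 gives the level pairs (1,0), (1,1), (0,1), (0,0), (1,1), and decoding those pairs returns the original bits. An extra or missing transition would break this pattern, which is the kind of condition reported as a pulse error.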
The error is detected by the TSID gate array and is passed to the SDI gate array as a PLS
ERR error. A pulse error should only be reported when the drive is executing a data transfer
operation. If a pulse error occurs during a TRANSFER command, PLS ERR will set bit 0 of
Fault Register 3 of the SDI gate array.
Fault Isolation/Correction:
1. ECM
2. SDI cables
3. Controller
4. Power supply
5. Spindle ground brush
22 Electronic Control Module Over-Temperature Error

Error Type: DE
Error Description: An over-temperature condition exists in the drive. Drive over-temperature
conditions result from high room temperature or a dirty air vent inhibiting airflow through the
drive. Additionally, a bad blower motor could cause the internal temperature of the drive to
increase, but a 2D error is more likely in this case. This over-temperature condition happens
when the detector senses 43°C (110°F).
Fault Isolation/Correction:
1. Ambient air temperature is too high
2. Cabinet door air vent needs cleaning
3. Blower assembly
4. ECM
24 Loss of Fine Track During Data Transfer

Error Type: DE
Error Description: A loss of fine track was detected when a read or write operation was ready
to begin, but not actually started. This error code is not implemented in microcode revision 7
and later.
Refer to servo event 9A.

Fault Isolation/Correction:
1. Install RA90X-0001 FCO
25 Servo Fault Error

Error Type: DE
Error Description: A servo error was detected but no condition was found that would cause
the error condition. The master processor, while in its idle loop, was scanning the servo GASP
gate array and discovered error bit(s) set. Valid conditions include:
Actuator fault
PLO error
Actuator over current error
Actuator over speed error
Track counter error
Off track error
Guardband error
Heat sink 1 error-over-temperature
Heat sink 2 error-over-temperature

Fault Isolation/Correction:
1. ECM
26 Spindle Speed Error (Servo Processor)

Error Type: DE
Error Description: Spindle is not within ±0.5% of 3600 r/min. The servo processor monitors
spindle speed. This error is different from the loss of PLO, which can occur separately from the
error. Upon detection of the loss of PLO, the master processor examines the servo processor
status to determine if it has valid servo-detected error information. If it does, this error is
asserted.
Fault Isolation/Correction:
1. ECM

2. Brake
3. HDA
27 Servo Over-Temperature Error at S1

Error Type: DE
Error Description: The thermal sensor (S1) on the servo module detected an over-temperature
condition. This results in the master processor spinning the disks down and
setting this error condition. If the over-temperature clears, the controller can initialize the
drive and try to spin it back up.
Fault Isolation/Correction:
1. Ambient air temperature too high
2. Cabinet door air vent needs cleaning
3. Blower assembly
4. ECM
28 Servo Over-Temperature Error at S2

Error Type: DE
Error Description: The thermal sensor (S2) on the ECM detected an over-temperature
condition. This results in the master processor spinning the disks down and setting this error
condition. If the over-temperature clears, the controller can initialize the drive and try to spin
it back up.
Fault Isolation/Correction:
1. Ambient air temperature too high
2. Cabinet door air vent needs cleaning
3. Blower assembly
4. ECM


29 SDI Invalid Error Recovery Level Specified

Error Type: PE
Error Description: The controller issued an SDI ERROR RECOVERY command with an
illegal recovery level. The RA90/RA92 supports 14 error recovery mechanisms. This value
is passed to the controller during a GET COMMON CHARACTERISTICS command. The
controller in this case asked for a level greater than 14.

Not all controllers report the error recovery levels in the same manner.
Fault Isolation/Correction:
1. Controller
2. ECM
2A SDI Invalid Subunit Specified

Error Type: PE
Error Description: The controller attempted a GET STATUS command to a subunit address
other than zero. (The RA90/RA92 is a single unit drive with a subunit address of zero.)
Fault Isolation/Correction:
1. Controller

2. ECM
2B SDI Invalid Diagnose Memory Region Location

Error Type: PE
Error Description: The controller or the operator attempted to execute a nonexistent internal
drive test or internal diagnostics while the drive was on line to the controller.

Fault Isolation/Correction:
1. Use valid diagnostic

2. Controller
3. ECM
2C SDI Spindle Not Ready with Seek/Recalibration Command

Error Type: PE
Error Description: A RECALIBRATE or SEEK command was issued to a spun-down disk
drive.
Fault Isolation/Correction:
1. Controller

2. ECM


2D Power Supply Over-Temperature

Error Type: DE
Error Description: A critical over-temperature condition exists in the power supply. This
condition is detected by the master processor through the OVER TEMP L signal. Within 15 ms
of detection, the dc voltages are removed in an orderly fashion. The error is stored in EEPROM
and can be read when power is restored to the drive after the over-temperature condition is
corrected or the power supply cools down sufficiently to allow power to be reapplied.

Fault Isolation/Correction:
1. Ambient air temperature too high
2. Blower assembly
3. Power supply
4. ECM
5. Rear flex cable assembly
2E SDI Spinup Inhibited by Controller Flags

Error Type: PE
Error Description: The drive cannot be spun up from the OCP while the drive is in the
AVAILABLE or ONLINE state to the controller.
NOTE
If the Run switch is selected prior to the Fault switch, a 2E LED code will be indicated.

Fault Isolation/Correction:
1. Check Run switch
2. ECM
2F SDI RUN Command with Run Switch in Stop Position

Error Type: PE
Error Description: An SDI RUN command was issued to the drive when the OCP Run switch
was in a logical stop state.

Fault Isolation/Correction:
1. Check OCP switch state

2. Controller
3. ECM
30 Write Current and No Write Gate

Error Type: DE
Error Description: Current was detected at the read/write heads and WRITE GATE was not
asserted. The PCM provides the current source for the write chips in the HDA. Drive firmware
tests for this condition during diagnostics.

Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
31 Read Gate and Write Gate Both Asserted

Error Type: DE
Error Description: The SDI gate array detected that READ GATE and WRITE GATE were
asserted at the same time.

Fault Isolation/Correction:
1. ECM

2. Controller
32 Read or Write While Faulted

Error Type: DE
Error Description: A READ or WRITE command was issued to a drive that has a fault
condition.
Fault Isolation/Correction:

1. Check error log for fault condition

2. Controller
3. ECM
33 Attempt to Write Through Bursts

Error Type: DE
Error Description: An attempt was made to assert WRITE GATE while the read/write heads
were positioned over embedded servo burst information.
Fault Isolation/Correction:
1. ECM

2. Controller
3. HDA
34 ENDEC Encoder Error

Error Type: DE
Error Description: Data to be written to media has been improperly 2/3 encoded.
Fault Isolation/Correction:
1. ECM

2. PCM


35 Write and Write Unsafe

Error Type: DE
Error Description: A problem in the write data path prevented the drive from correctly
writing data to the disk surface. One or more of the following conditions cause this error:
• No write data transitions
• No write current
• No SSI 283 (head select chip) selected
• SSI 283 stuck in read mode
The unsafe conditions are wire-ORed together and are detected on the PCM.

Fault Isolation/Correction:
1. PCM
2. HDA
3. ECM
36 Write and Servo Uncalibrated

Error Type: DE
Error Description: The firmware routines used to calibrate the read/write heads and the
servo system failed to complete successfully. The subsequent write was attempted with the
servo uncalibrated.
Fault Isolation/Correction:
1. PCM
2. ECM
3. HDA
37 Write Gate and No Write Current

Error Type: DE
Error Description: WRITE GATE was asserted but no write current was detected at the
read/write heads. The PCM sources the current when WRITE GATE is asserted.
Fault Isolation/Correction:
1. PCM
2. ECM
3. HDA
38 Read Gate and Multiple Head Chips Selected

Error Type: DE
Error Description: During a read operation, the master processor determined that more than
one head and/or more than one SSI 283 chip was selected.


Fault Isolation/Correction:
1. PCM
2. ECM

3. HDA
39 Write Gate and Off Track

Error Type: DE
Error Description: A loss of fine track was detected when WRITE GATE was asserted. This
error code is not implemented in microcode revisions 7 and later. This error code is used
exclusively with the dedicated-only servo system found on earlier drives. Refer to error code 9B.
Fault Isolation/Correction:
1. Install RA90X-0001 unless superseded by a later FCO.
3A Write Gate and Write-Protected

Error Type: WE
Error Description: A write-protected drive detected the assertion of WRITE GATE.
Fault Isolation/Correction:
1. Controller
2. ECM
3B Hard INIT Occurred to Drive

Error Type: DE
Error Description: This is not typically an error condition. It is a record of initializations
(initializations the controller started by the RTCS logical signal INIT, and initializations started
by the drive). Initializations stop mechanical movements, and the drive performs a power-up
initialize and reloads the servo processor code.
Examine previous error conditions.
With drive microcode revisions 10 or earlier, if the drive performs a hard initialization on its
own (for example, when new drive microcode has just been reloaded), this error entry will be
recorded into the EEPROM.

Microcode revisions later than 10 give a new indication of microcode reload. Refer to drive LED
code 11.

Fault Isolation/Correction:
1. Look for previous errors
2. ECM

3. Controller
3D HDA Read/Write Interlock Broken

Error Type: DE
Error Description: The cable between the PCM and the ECM is disconnected or broken.


Fault Isolation/Correction:
1. Disconnected ECM-to-PCM cable
2. Bad ECM-to-PCM cable

3. PCM
4. ECM
3E OCP Interlock Broken

Error Type: DE

Error Description: The operator control panel was removed with dc voltages still applied to
the drive.

Fault Isolation/Correction:
1. OCP flex circuit connectors
2. Bezel/blower flex circuit/servo module connectors

3. Servo module/ECM connectors

4. OCP
5. ECM
40 SDI Invalid Read Memory Region Error

Error Type: PE

Error Description: The controller issued an SDI level 2 READ MEMORY REGION command
to an invalid region of drive read memory.
Fault Isolation/Correction:
1. Operator attempted to write or read a nonexistent or protected memory location.
2. Controller
3. ECM
42 Drive Not On Line/SEEK Command Issued

Error Type: PE

Error Description: The controller issued an SDI level 2 INITIATE SEEK command and the
drive was not on line to the controller.
Fault Isolation/Correction:
1. Controller
2. SDI cable

3. ECM


43 TCR and Not Read/Write Ready Fault

Error Type: RE
Error Description: The SDI gate array has decoded a data transfer command from the
controller, but the drive is not ready to read/write; or the drive detected a loss of READ/WRITE
READY during a data transfer.
Fault Isolation/Correction:
1. Controller
2. SDI cable (poor SDI connection)
3. ECM
44 Format Command and Format Not Enabled

Error Type: RE
Error Description: A FORMAT ON SECTOR OR INDEX command or a SELECT TRACK
AND FORMAT ON INDEX command was decoded by the drive without the format bit (FO)
being set in the drive.
Fault Isolation/Correction:
1. Controller
2. ECM
45 Read Gate and Off Track Both Asserted

Error Type: DE
Error Description: A loss of fine track was detected when read gate was asserted. This error
code is not implemented in microcode revisions 7 and later. This error code is used exclusively
with the dedicated-only servo system found on earlier drives. Refer to error code 9B.
Fault Isolation/Correction:

1. Install RA90X-0001 unless superseded by a later FCO.
46 Invalid Hardware Fault

Error Type: DE
Error Description: A failure was detected for unused fault inputs to the SDI gate array.
Fault Isolation/Correction:
1. ECM
47 Invalid Disconnect Command/TT Bit Error

Error Type: PE
Error Description: An SDI DISCONNECT command was issued to the drive and the TT
modifier bit was in an incorrect state.

Fault Isolation/Correction:
1. Controller
2. ECM


48 Invalid Write Memory Byte Counter/Offset Error

Error Type: PE
Error Description: The drive detected an incorrect number of data bytes to be written in
drive memory; or the directed offset into the memory region was incorrect.
Fault Isolation/Correction:
1. Controller

2. ECM
49 Invalid Command During TOPOLOGY Command

Error Type: PE
Error Description: During the execution of an SDI level 2 TOPOLOGY command, the drive
received an illegal SDI level 2 command from another controller.
Fault Isolation/Correction:
1. Controller
2. ECM
4A Drive Disabled by Controller (DD Bit Set)

Error Type: Informational Only
Error Description: The controller issued an SDI level 2 CHANGE MODE command to a drive
with its DD bit asserted. When the controller asserts the DD bit, it disables the drive from
further I/O activity.

Fault Isolation/Correction:
1. Controller (controller error routine determined the drive should be taken out of service)

2. ECM
4B Index Error

Error Type: DE

Error Description: No index pulse was detected for one revolution of the disk.
Fault Isolation/Correction:
1. ECM

2. HDA
4C SDI Invalid Write Memory Region Error

Error Type: PE
Error Description: An SDI level 2 command was issued to a drive-defined invalid memory
region.
Fault Isolation/Correction:
1. Operator (attempting to write a nonexistent or protected memory location in drive)

2. Controller
3. ECM


4D Write Gate and Bad Embedded Servo Information

Error Type: DE
Error Description: The servo processor discovered incorrect embedded servo information
while WRITE GATE was asserted.
Fault Isolation/Correction:
1. HDA
2. PCM
3. ECM
4F Invalid Select Group (Level 1 Command) - Not Read/Write Ready

Error Type: RE

Error Description: The controller issued a level 1 SELECT GROUP command to a drive when
the drive was not read/write ready.

Fault Isolation/Correction:
1. Check OCP for drive state
2. Controller

3. ECM
50 Servo Data Bus Failure

Error Type: DF
Error Description: A communication path to the GASP array failed during resident diagnostic
testing.
Fault Isolation/Correction:
1. ECM
51 Sector/Byte Counter Error

Error Type: DF
Error Description: A resident diagnostic failure occurred during testing of the sector counter
register or byte counter register.
Fault Isolation/Correction:
1. ECM
52 Servo RAM Test Failure (Low Byte of Address)

Error Type: DF
Error Description: At power-up, the drive-resident diagnostics failed during testing of RAM
located on the servo portion of the ECM.
Fault Isolation/Correction:
1. ECM


53 Servo Processor Offset Error

Error Type: DE
Error Description: The servo system failed to offset the heads during error recovery.
Fault Isolation/Correction:

1. ECM
54 Head Select Register Loopback Error

Error Type: DF
Error Description: A drive-resident diagnostic detected a failure in the head select register.
The head select register is inside the SDI gate array.
Fault Isolation/Correction:

1. ECM
55 DSP Sanity Timeout After Load

Error Type: DE
Error Description: The servo processor microcode was reloaded from the EEPROM on the
I/O-R/W module because of a fault condition. After the microcode was reloaded in servo RAM,
the master processor initiated a servo sanity test. The sanity test timed out, indicating a
problem with the servo processor.
Fault Isolation/Correction:

1. ECM
56 Servo RAM Test Failure (High Byte of Address)

Error Type: DF
Error Description: A drive-resident diagnostic failed when testing RAM that resides on the
servo module.
Fault Isolation/Correction:

1. ECM
57 Master Processor Timer Failure

Error Type: DF

Error Description: A drive-resident diagnostic failed when testing the time count register or
output compare register. Both are located internal to the master processor.
Fault Isolation/Correction:
1. ECM
58 Dedicated Head Gain Calibration Error

Error Type: DE
Error Description: The servo processor timed out while attempting to measure and
compensate for the gain of the dedicated servo head.


Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
59 Embedded Servo Offset Calibration Error

Error Type: DE
Error Description: The servo processor timed out during a calibrate of the read/write head
offsets. This calibration occurs during all head loads and periodically thereafter.
Fault Isolation/Correction:
1. HDA (most probable, especially if only one head is involved)
2. ECM (10 of 13 heads affected)
3. PCM
5A Embedded Head Gain Calibration

Error Type: DE
Error Description: The servo processor timed out while attempting to calibrate the head gain
relative to the read/write head embedded burst information. The drive calculates this gain for
each of the read/write heads.
Fault Isolation/Correction:
1. ECM (if most heads show problem)
2. PCM (if most heads show problem)

3. HDA (most probable, especially if only 1 head is involved)
5B Bias Calibration Error

Error Type: DE
Error Description: The servo processor timed out during a bias force adjustment to the
actuator.
Fault Isolation/Correction:
1. ECM
2. PCM

3. HDA
5C Incorrect Diagnostic Index or Sector Pulse

Error Type: DF
Error Description: In testing the sector and byte counters, the master processor detected that
the sector counter was not working properly.
Fault Isolation/Correction:
1. ECM


60 Read/Write Head Select Failure

Error Type: DE
Error Description: A failure occurred when attempting to select a group (head). When a
group selection is requested, logic and firmware in the drive verify that the correct SSI 283
chip and head in the HDA have been selected. This verification takes place during functional
operation and in diagnostic mode.
Fault Isolation/Correction:
1. PCM
2. PCM-to-ECM cable
3. ECM
61 Diagnostic Index Sync Timeout Error

Error Type: DF
Error Description: A drive-resident diagnostic failed to detect an index pulse.
Fault Isolation/Correction:
1. ECM
2. HDA
62 Read Test Overall Read Failure (Three or More Bad Heads)

Error Type: DF
Error Description: During the execution of a resident diagnostic read-only test or write/read
test, data read by three or more heads from the diagnostic cylinders did not compare to the
originally written patterns.
The RA90 drive has two diagnostic cylinders (2659 and 2660) located in the inner guardband
area of the media. Only the drive can access these two cylinders; they cannot be accessed by the
controller. These are not the same cylinders used by the controller to execute controller-based
diagnostics (DBN space). Refer to drive-resident diagnostic 17.

Fault Isolation/Correction:
1. Reformat the read-only cylinder by running drive-resident diagnostic 17
2. PCM
3. ECM
4. HDA
63 Read Test Partial Failure (One or Two Bad Heads)

Error Type: DF
Error Description: During the execution of a resident diagnostic read-only test or write/read
test, data read by one or two heads from the diagnostic cylinders did not compare to the originally
written patterns. Refer to error code 62.

Fault Isolation/Correction:

1. Reformat the read-only cylinder by running drive-resident diagnostic 17
2. PCM

3. ECM
4. HDA
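Codes 62 and 63 differ only in how many heads failed the read comparison. The sketch below shows that split under the thresholds given in the two headings; the function name and the use of None for a clean pass are hypothetical.

```python
def read_test_error_code(bad_heads: int):
    """Map the number of miscomparing heads to the reported error code."""
    if bad_heads >= 3:
        return "62"   # overall read failure: three or more bad heads
    if bad_heads >= 1:
        return "63"   # partial failure: one or two bad heads
    return None       # every head compared correctly
```

In both cases the first listed correction is the same: reformat the read-only cylinder with drive-resident diagnostic 17 before suspecting the PCM, ECM, or HDA.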
64 Cannot Clear lID Error Bits

Error Type: DF
Error Description: Error detection logic internal to the lID gate array cannot be cleared.
Fault Isolation/Correction:

1. ECM
65 Diagnostic Index or Sector Not Detected

Error Type: DF
Error Description: No index pulse was detected during the execution of resident diagnostics
that read or write media.
Fault Isolation/Correction:

1. ECM
2. HDA
66 Read Test Servo Failure

Error Type: DF
Error Description: The drive internal diagnostic read or write/read test failed because of an
off-track condition.

Fault Isolation/Correction:

1. PCM
2. ECM
3. HDA
67 Cannot Execute Write Test (Read-Only Test Failed or Not Run First)

Error Type: DF
Error Description: This indicates an operator error, not a drive problem.
Service personnel must run the read-only test before attempting to run the write test.
Additionally, the read test must be successful before the write/read diagnostic is executed.

Fault Isolation/Correction:
1. Service personnel attempted to execute the write/read diagnostic without first executing the
read-only diagnostic.
2. The read-only diagnostic failed and the write/read diagnostic was attempted anyway.

3. ECM


68 This Diagnostic Cannot Execute Without Software Jumper

Error Type: DF

Error Description: A diagnostic or utility was attempted without having first selected the
Run/Stop switch. The Run/Stop switch must be selected within 1.5 seconds after initiating
certain tests with the Write Protect switch.

Fault Isolation/Correction:
1. Procedural error

2. ECM
3. OCP
69 Unable to Force Compare Error

Error Type: DF
Error Description: The drive failed to force a data compare error during a read-only
diagnostic.
Fault Isolation/Correction:
1. ECM
6A Unable to Force No-Sync Error

Error Type: DF
Error Description: The diagnostic firmware was unable to force a no-sync error.
Fault Isolation/Correction:
1. ECM
6B R/W Write/Read Test Overall Failure (Three or More Bad Heads)

Error Type: DF
Error Description: The data read from three or more heads during execution of resident
diagnostics was incorrect. The heads are positioned at the drive-reserved diagnostic cylinders
during these tests.
Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
6C R/W Write/Read Test Partial Failure (One or Two Bad Heads)

Error Type: DF
Error Description: The data read from one or two heads was incorrect. The heads were
positioned at the drive-reserved diagnostic cylinders.
Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA

6D Unable to Force Read Gate and Write Gate Together

Error Type: DF

Error Description: Drive-resident diagnostics were unable to force the simultaneous assertion
of READ GATE and WRITE GATE.
Fault Isolation/Correction:
1. ECM
6E Unable to Force Write Gate and Write Protect Error

Error Type: DF
Error Description: A write-protected drive has WRITE GATE asserted but no error was
detected.
Fault Isolation/Correction:
1. ECM
6F Diagnostic Write Attempted While Write-Protected

Error Type: DF
Error Description: Either the Write/Read Diagnostic or the Diagnostic Track Format Utility
was attempted on a write-protected drive.
Fault Isolation/Correction:
1. Drive write-protected from the OCP
2. Drive write-protected by the controller
3. ECM
70 Servo Processor Spinup Timeout

Error Type: DE
Error Description: The master processor timed out after issuing a SPINUP command to the
servo processor.

Fault Isolation/Correction:
1. ECM
71 Recalibrate Timeout Error

Error Type: DE
Error Description: The master processor timed out during a RECALIBRATE command issued
to the servo processor.
Fault Isolation/Correction:
1. ECM

72 Servo Processor Seek Timeout

Error Type: DE
Error Description: The servo processor timed out the execution of a SEEK command. This is
a gross seek error in that the servo subsystem never sensed that it got even within a cylinder of
the desired cylinder within 100 ms.
Fault Isolation/Correction:
1. ECM
2. HDA
73 Servo Processor Head Switch Timeout

Error Type: DE
Error Description: The master processor timed out before the servo processor responded to a
head switch status request.
Fault Isolation/Correction:
1. ECM
74 Offset Timeout Error

Error Type: DE
Error Description: The master processor timed out during an offset check or OFFSET
command to the servo processor.
Fault Isolation/Correction:
1. ECM
2. HDA
75 Servo Processor Unload Timeout

Error Type: DE
Error Description: The master processor timed out after issuing an UNLOAD (head)
command to the servo processor.
Fault Isolation/Correction:
1. ECM
76 Servo Processor Sanity Timeout

Error Type: DE
Error Description: The master processor timed out while waiting for a response from the
servo processor after issuing a SANITY CHECK command.
Fault Isolation/Correction:
1. ECM

77 Head Load Timeout Error

Error Type: DE
Error Description: The master processor timed out waiting for a response from the servo
processor after issuing a HEAD LOAD command.
Fault Isolation/Correction:
1. ECM
78 Servo Processor Bias Force Calibration Timeout

Error Type: DE
Error Description: The master processor issued a BIAS CALIBRATION command (diagnostic
opcode) to the servo processor. The master processor timed out while waiting for a servo
processor response.
Fault Isolation/Correction:
1. ECM
79 Dedicated Servo Calibration Timeout Error

Error Type: DE
Error Description: The master processor timed out waiting for the servo processor to respond
to a DEDICATED SERVO CALIBRATION command issued as part of a diagnostic opcode.
Fault Isolation/Correction:
1. ECM
7A Embedded Offset/Gain Calibration Timeout

Error Type: DE
Error Description: The master processor timed out while waiting for the servo processor
to respond to an EMBEDDED OFFSET CALIBRATION or EMBEDDED HEAD GAIN
CALIBRATION command issued by a diagnostic opcode.
Fault Isolation/Correction:
1. ECM
7B Invalid Test While Spindle Running

Error Type: DF
Error Description: The drive was spun up and the operator selected a diagnostic that can
only be run when the drive is spun down. (Certain diagnostics can only be executed on a
spun-down drive.) Refer to Chapter 4 for a complete listing of diagnostics and execution
requirements.
Fault Isolation/Correction:
1. Spin down drive to run selected tests
2. ECM

7C Gray Code Match Error After Settling

Error Type: DE
Error Description: Head settling on a track normally occurs following a SEEK command.
A gray code comparison is made to ensure the heads have settled on the requested track. In
this case, the servo was settling within 1/4 track of the desired track (but fine track had not
been asserted) when suddenly the servo gray coded information indicated that movement of
>1 cylinder had taken place away from the desired target cylinder. Such an occurrence may
be related to an intermittent open of the coil actuator circuitry or transient spike in voltage
establishing the holding current for the positioner.
Fault Isolation/Correction:
1. ECM
2. HDA
7D Embedded Interrupt Timeout

Error Type: DE
Error Description: The servo processor failed to detect a BURST PROTECT transition
(asserted to de-asserted state) as generated from the master processor (ECM).
Fault Isolation/Correction:
1. ECM
7E Fine Track Lost After Settling

Error Type: DE
Error Description: The actuator initially settled on track but has now moved off track and
loss of fine track has been declared by the servo subsystem. This condition has persisted for 2
seconds.
Examine head and/or cylinder correlation when considering this error. This information should
be derivable from the host error log or by doing a complete dump of the drive internal error log
with a controller.
Other contributors to this condition might be sustained vibration to the drive unit, an HDA
runout condition, or an HDA mechanical resonance problem.

Fault Isolation/Correction:
1. ECM (if totally random cylinders and heads)
2. HDA (first choice if same cylinder(s))
3. HDA (first choice if same head(s))
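The head and cylinder correlation called for above can be sketched in a few lines. This is a hypothetical illustration only: the function name, the (cylinder, head) tuple layout, and the 0.8 dominance threshold are assumptions and do not come from this manual or from any Digital tool.

```python
from collections import Counter

def suggest_fru(entries, dominance=0.8):
    """entries: (cylinder, head) pairs copied from logged 7E/7F events.

    dominance is an assumed cutoff for "same cylinder" or "same head".
    """
    if not entries:
        return "no data"
    total = len(entries)
    # Count of the single most frequent cylinder and the most frequent head
    top_cyl = Counter(c for c, _ in entries).most_common(1)[0][1]
    top_head = Counter(h for _, h in entries).most_common(1)[0][1]
    if top_cyl / total >= dominance:
        return "HDA (same cylinder)"
    if top_head / total >= dominance:
        return "HDA (same head)"
    return "ECM (random cylinders and heads)"
```

A handful of entries is rarely enough to call a pattern; the sketch is only meant to show the order in which the three cases are weighed.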
7F Servo Settling Timer Expired

Error Type: DE
Error Description: The actuator was not able to settle on track within the allotted settling
timeout period. The servo system was able to relocate to within 1/4 track of the desired
track/cylinder; however, it could not meet the fine track threshold stability criteria within
the time allotted (1.8 seconds).
Examine head and/or cylinder correlation when considering this error.

Fault Isolation/Correction:
1. ECM (if totally random cylinders and heads)
2. HDA (first choice if same cylinder(s))
3. HDA (first choice if same head(s))
80 Master Processor ROM Consistency Code Mismatch

Error Type: DF
Error Description: The master processor microcode is inconsistent with the microcode stored
in EPROM.
Fault Isolation/Correction:
1. Reload microcode
2. ECM
81 Servo Processor Settle State Timeout

Error Type: DE
Error Description: The actuator was not able to settle on track within the allotted settling
timeout period.
Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
82 Servo Processor Coarse Velocity State Timeout

Error Type: DE
Error Description: The servo processor timed out when commanded to move the actuator 256
or more cylinders.
Fault Isolation/Correction:
1. ECM
2. HDA
83 Servo Processor Fine Velocity State Timeout

Error Type: DE
Error Description: The servo processor timed out when commanded to move the actuator less
than 256 cylinders.
Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
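The split between error codes 82 and 83 depends only on the commanded seek length. A minimal sketch, assuming the seek length is simply the absolute cylinder difference (the function and constant names are hypothetical):

```python
COARSE_THRESHOLD = 256  # cylinders, per the descriptions of error codes 82 and 83

def velocity_timeout_code(current_cyl, target_cyl):
    """Return the timeout code that applies to a seek of this length."""
    distance = abs(target_cyl - current_cyl)
    # 256 or more cylinders: coarse velocity state (82); fewer: fine velocity state (83)
    return "82" if distance >= COARSE_THRESHOLD else "83"
```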

84 Servo Processor Seek Direction Error

Error Type: DE
Error Description: Servo processor actuator (positioner) and dedicated servo information
indicated that the seek direction was wrong.
Fault Isolation/Correction:
1. ECM
2. HDA
85 Master Processor RAM Test Failure

Error Type: DF
Error Description: The drive-resident diagnostics detected bad RAM internal to the master
processor.

Fault Isolation/Correction:
1. ECM
86 Static RAM Failure

Error Type: DF
Error Description: Drive-resident diagnostics detected bad RAM external to the master
processor.

Fault Isolation/Correction:
1. ECM
87 Master Processor ROM Checksum Failure

Error Type: DF
Error Description: Drive-resident diagnostics detected bad ROM internal to the master
processor.

Fault Isolation/Correction:
1. ECM
88 Master Processor EEPROM Write Violation Error

Error Type: DE
Error Description: EEPROM was addressed and written to while in read-only mode.
Fault Isolation/Correction:
1. ECM
89 Seek Speed Out of Range

Error Type: DE

Error Description: While monitoring the speed of the actuator, the servo processor
determined that seek velocity is beyond prescribed speed.

Fault Isolation/Correction:
1. ECM
2. Power supply
3. HDA
8A Servo Processor Inside of Destination Track During Settle State

Error Type: DE
Error Description: Servo processor has determined that the positioner has placed heads
inside of the destination track during settle state.
Fault Isolation/Correction:
1. ECM
8B Gray Code Error After Settling With Fine Track

Error Type: DE
Error Description: Head settling on a track normally occurs following a SEEK command.
A gray code comparison is made to ensure the heads have settled on the requested track.
In this case, the servo was settling within 1/4 track of the desired track and fine track had
been asserted when suddenly the servo gray coded information indicated that movement of
>1 cylinder had taken place away from the desired target cylinder. Such an occurrence may
be related to a significant amount of vibration in the vertical axis of the drive, or electrical
transients from the positioner control voltage and holding current circuitry.
Fault Isolation/Correction:
1. ECM
2. HDA
8C Uncalibrated and PLO Error

Error Type: DE
Error Description: A PLO error occurred and the head offsets were uncalibrated.
Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA
8D Polarity Error on Velocity Command During a Multi-Track Seek

Error Type: DE
Error Description: The polarity indication bit in a velocity command profile was clear (zero)
during a multi-track seek. This bit should have been set. (This is one of the setup functions the
servo processor checks before it executes the digital servo seek profiles.)
Fault Isolation/Correction:
1. ECM

8E Master Processor ROM/EEPROM Consistency Code Mismatch

Error Type: DF
Error Description: Master processor microcode is incompatible with EEPROM microcode.
Fault Isolation/Correction:

1. Reload microcode
2. ECM
8F EEPROM Checksum Failure

Error Type: DF
Error Description: Drive-resident diagnostics detected bad EEPROM external to the master
processor. The calculated checksum did not match the stored checksum.
Fault Isolation/Correction:
1. ECM
90 Unable to Force Index Error

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force and/or detect an index
error.
Fault Isolation/Correction:
1. ECM
91 No Interrupt Detected During R/W Force Fault

Error Type: DF
Error Description: No interrupt to the master processor was detected by the drive during the
read/write force fault diagnostic.
Fault Isolation/Correction:
1. ECM
92 Inner Guardband Without a Servo Fault Set

Error Type: DF
Error Description: The actuator was positioned in the inner guardband area and the inner
guardband flag was set; however, a servo fault condition was not detected.
Fault Isolation/Correction:
1. ECM
93 Inner Guardband/Servo Fault: No Interrupt Detected

Error Type: DF
Error Description: The actuator was positioned at a cylinder in the inner guardband area,
the inner guardband flag was set, and a servo fault error was detected, but the master processor
was not interrupted.
Fault Isolation/Correction:
1. ECM

94 SDI Loopback Test Failure on Both Ports

Error Type: DF
Error Description: Drive-resident diagnostics detected an SDI gate array or TSID gate array
failure involving both SDI ports A and B logic. If the drive internal test 09 fails, the failure
could be in the hardware external to the SDI/TSID gate array as well. During internal T09, the
testing expects SDI loopback connectors to be attached to the ECM or at the cabinet bulkhead.
Fault Isolation/Correction:
If test number 08 fails:
1. ECM

If test number 09 fails:
1. Loopback connectors are not installed
2. Defective SDI cable
3. Defective bulkhead connector
4. ECM
5. SDI connectors J101 or J102
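The two lists above amount to a lookup keyed on the failing test number. A minimal sketch for error code 94, with the entries copied from the lists and the dictionary and function names invented for illustration:

```python
# Fault isolation order for error code 94, keyed by the failing internal test
SDI_94_STEPS = {
    "08": ["ECM"],
    "09": [
        "Loopback connectors are not installed",
        "Defective SDI cable",
        "Defective bulkhead connector",
        "ECM",
        "SDI connectors J101 or J102",
    ],
}

def isolation_steps(failing_test):
    """Return the ordered checks for the failing test, or [] if it is not listed."""
    return SDI_94_STEPS.get(failing_test, [])
```

The point of the split is that test 09 involves hardware outside the gate arrays, so the external loopback path is checked before the ECM is suspected.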
95 SDI Test Failure: Port A

Error Type: DF
Error Description: A drive-resident diagnostic detected a failure with the SDI gate array or
the TSID gate array involving SDI Port A. If the drive internal test 09 fails for this error code,
the failure could be in the SDI Port A hardware external to the SDI/TSID gate array as well.
During internal T09, the testing expects SDI loopback connectors to be attached to the ECM or
at the cabinet SDI bulkhead.
Fault Isolation/Correction:
If test number 08 fails:
1. ECM

If test number 09 fails:
1. Port A loopback connectors are not installed
2. Defective SDI cable (Port A)
3. Defective bulkhead connector (Port A)
4. ECM
5. SDI connector J102 (Port A)
96 SDI Failure: Port B

Error Type: DF
Error Description: A drive-resident diagnostic detected a failure with the SDI gate array or
the TSID gate array involving SDI Port B. If the drive internal test 09 fails for this error code,
the failure could be in the SDI Port B hardware external to the SDI/TSID gate array as well.
During internal T09, the testing expects SDI loopback connectors to be attached to the ECM or
at the cabinet SDI bulkhead.

Fault Isolation/Correction:
If test number 08 fails:
1. ECM

If test number 09 fails:
1. Port B loopback connectors are not installed
2. Defective SDI cable (Port B)
3. Defective bulkhead connector (Port B)
4. ECM
5. SDI connector J101 (Port B)
98 Can't Execute Diagnostic/Jumper

Error Type: DF
Error Description: A diagnostic test could not be run because a hardware jumper was not
installed. If this error is seen in the field, do not attempt to alter jumpers.
Fault Isolation/Correction:
1. Operator (do not attempt to alter jumpers)
9A Positioner Corrected Event During Data Transfer

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction.
See the expanded discussion of 9A, 9B, and 9C events under error code 9C.

Error Type: DE
Error Description: Heads were not fine positioned or locked on track (relative to the
embedded servo information) at the time a read or write operation was ready to start. The
drive took necessary procedures to re-establish on-track condition. The drive command was
received but READ GATE or WRITE GATE had not yet been asserted.
Fault Isolation/Correction:
1. HDA (if only one head)
2. ECM (if 10 of 13 heads)
9B Write and Positioner Corrected Event

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction.
See the expanded discussion of 9A, 9B, and 9C events under error code 9C.

Error Type: DE
Error Description: The master processor determined that the selected read/write head moved
off track when WRITE GATE was asserted. The condition was corrected. (The read/write heads
must be within 57.1 microinches from track centerline.)
Fault Isolation/Correction:
1. HDA (if only one head)
2. ECM (if 10 of 13 heads)

9C Read Gate and Positioner Corrected Event

Error Type: DE
Error Description: The master processor determined that the selected read/write head moved
off track when READ GATE was asserted. The condition was corrected. (The read/write heads
must be within 57.1 microinches from track centerline.)
TROUBLESHOOTING 9A, 9B, AND 9C:

This is typically an event unless analyzed by VAXsimPLUS to be worthy of correction.
For HSC/KDM controllers, event rates of <5 per day may be considered normal for
disks that operate with fairly high I/O rates (continually or in significant bursts)
provided that the following pattern is noted:
• Ninety percent of occurrences are with the top five heads (heads 0 through 4).
• One of the top five heads will have few if any errors.
• No one head in the top five has 90 percent of the errors. (This might point to a
track/surface problem.)

If the event pattern matches this, and the event rate exceeds these guidelines, then
HDA replacement may be necessary.
If the event pattern does not match this, then further analysis is required.

For KDA/KDB/UDA controllers, event rates should not exceed 16 per day on heavily used disks
(I/O rates of 30 per second).
If these events occur over 10 of the 13 heads, then the occurrence may be related to a general
servo/read path problem. This is possibly an electronics problem that may not involve the HDA.
If these errors occur primarily on one head, there is strong head/surface correlation and
HDA replacement may be warranted.
The above number of events to be expected was determined by analysis and experience with the
RA90 HDA 70-22951-01. With the introduction of the RA92 (HDA 70-27492-01), the number
of 9A, 9B, and 9C events has decreased significantly. The phase-in of the RA92 HDA hardware
mechanics (resulting in an RA90-compatible HDA 70-27268-01) into RA90 production has
substantially reduced the occurrence of these events because of the new design.
Fault Isolation/Correction:
1. HDA (if only one head)
2. ECM (if 10 of 13 heads)
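The 9A, 9B, and 9C guidelines above can be summarized as a rough triage routine. This is a sketch under stated assumptions, not a Digital or VAXsimPLUS algorithm: the function name, the reading of "few if any" as at most one event, and the reading of "primarily on one head" as 90 percent of events are all hypothetical choices made for illustration.

```python
from collections import Counter

TOP_HEADS = range(5)     # heads 0 through 4
FEW = 1                  # assumed meaning of "few if any" events on the quiet head
ONE_HEAD_SHARE = 0.9     # assumed meaning of "primarily on one head"

def triage_9x(head_events, events_per_day, controller):
    """head_events: one head number per logged 9A/9B/9C event."""
    counts = Counter(head_events)
    total = sum(counts.values())
    if total == 0:
        return "no events"
    if len(counts) >= 10:                       # events spread over 10 of 13 heads
        return "ECM (general servo/read path)"
    if counts.most_common(1)[0][1] / total >= ONE_HEAD_SHARE:
        return "HDA (head/surface correlation)"
    if controller in ("HSC", "KDM"):
        top_share = sum(counts[h] for h in TOP_HEADS) / total
        quiet = min(counts[h] for h in TOP_HEADS)
        if top_share < 0.9 or quiet > FEW:
            return "further analysis required"
        return "normal" if events_per_day < 5 else "HDA replacement may be necessary"
    # KDA/KDB/UDA guideline: no more than 16 events per day
    return "normal" if events_per_day <= 16 else "rate exceeds guideline"
```

For example, ten events on heads 0 through 3 at three events per day on an HSC-attached drive fall inside the normal pattern, while the same pattern at eight events per day does not.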
9D Error Log Header Corrupted

Error Type: DF
Error Description: A location in EEPROM containing drive-resident error log identifier
information, device type, or descriptor size is invalid.
Fault Isolation/Correction:
1. Attempt to load new microcode
2. ECM

9E Drive Faulted, Test Cannot Run

Error Type: DF
Error Description: Drive-resident diagnostics cannot run while the drive is faulted.
Fault Isolation/Correction:
1. Check fault condition
2. ECM
9F Error Log Check Point Code

Error Type: Informational Only.
Error Description: If drive-resident diagnostic T50 has been used to place a checkpoint
between errors in the drive internal error log, a 9F entry will be seen in the drive internal error
log. This makes drive troubleshooting easier by placing a null field between errors in the drive
internal error log to partition repair activity.
Fault Isolation/Correction:
1. None (read Error Description above)
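How a 9F checkpoint partitions the error log can be shown with a short sketch. It assumes, purely for illustration, that a log dump has already been reduced to a list of two-character code strings in the order logged; the function name is hypothetical.

```python
def split_on_checkpoints(log_codes):
    """Group error codes into the runs separated by 9F checkpoint entries."""
    groups, current = [], []
    for code in log_codes:
        if code == "9F":
            groups.append(current)   # close the run that preceded the checkpoint
            current = []
        else:
            current.append(code)
    groups.append(current)           # entries logged after the last checkpoint
    return groups
```

Entries after the last checkpoint are the ones logged since the most recent repair action.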
A0 Unable to Clear SDI Array Safety Status Register

Error Type: DF
Error Description: Drive-resident diagnostics attempted to clear the SDI gate array safety
status registers but were unsuccessful.
Fault Isolation/Correction:
1. To isolate the stuck bit, check the preceding error in the drive internal error log storage silo.
Base corrective action on the preceding error.
2. ECM
A1 Unable to Force Encoder Error

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force a read/write
encoder/decoder (RWENDEC) error.
Fault Isolation/Correction:
1. ECM
A2 Unable to Force Multiple Head Select While Reading

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force read gate and multi-chips
error.
Fault Isolation/Correction:
1. PCM
2. ECM
3. HDA

A3 Unable to Force Write Gate and Write Unsafe

Error Type: DF

Error Description: A drive-resident diagnostic was unable to force write gate and write
unsafe error conditions.

Fault Isolation/Correction:
1. ECM
2. PCM
A4 Unable to Force Write Current and No Write Gate

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force write current and no write
gate error conditions and detect such a condition.

Fault Isolation/Correction:
1. ECM
2. PCM
A5 Unable to Force Write Gate and No Write Current

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force write gate and no write
current error conditions.

Fault Isolation/Correction:
1. ECM
2. PCM
A6 Unable to Force Read Gate and Off Track Error

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force read gate and off track
error conditions.

Fault Isolation/Correction:
1. ECM
A7 Unable to Force Write Gate and Off Track Error

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force write gate and off track
error conditions.

Fault Isolation/Correction:
1. R/W cable to PCM
2. ECM

A8 Unable to Force Read and Write Fault While Writing

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force a read/write-while-faulted
error condition.

Fault Isolation/Correction:
1. ECM
A9 Servo Fault/Force Fault Test

Error Type: DF
Error Description: A servo check occurred while the diagnostic firmware was attempting to
execute the force fault subtest.
Fault Isolation/Correction:
1. ECM
2. HDA
AS Forced Read and Write Fault While Reading

Error Type: DF
Error Description: Drive-resident diagnostics were unable to force a read/write-while-faulted
error condition.

Fault Isolation/Correction:
1. ECM
AD UART Overrun or Framing Error

Error Type: DE
Error Description: The master processor internal UART detected an overrun condition or a
framing error condition on data received from the OCP.
Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly
AE OCP Data Packet Checksum Error

Error Type: DE
Error Description: Data packets transmitted between the master processor and the OCP
processor are in error.
Fault Isolation/Correction:
1. ECM
2. OCP
3. Blower/bezel assembly

AF OCP Start Byte Is Not a Sync Character

Error Type: DE
Error Description: The first byte the master processor expects in a data packet transfer is a
sync character. This error indicates no sync character was received.

Fault Isolation/Correction:
1. ECM
2. OCP
3. Blower/bezel assembly
B0 OCP Invalid Response

Error Type: DE
Error Description: The OCP processor did not acknowledge a command from the master
processor.
Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly
B2 OCP Retransmit Failure

Error Type: DE
Error Description: The OCP processor can request three retransmits from the master
processor. This error indicates the OCP requested more than three consecutive retransmit
responses.

Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly
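The retransmit limit described for this error code behaves like a counter of consecutive requests. A minimal sketch, assuming the counter resets on any response that is not a retransmit request (the class and method names are hypothetical and do not describe the drive firmware):

```python
MAX_RETRANSMITS = 3  # the OCP may request three retransmits

class RetransmitMonitor:
    """Tracks consecutive retransmit requests received from the OCP."""

    def __init__(self):
        self.consecutive = 0

    def on_response(self, is_retransmit_request):
        """Return "B2" once the limit is exceeded, otherwise None."""
        if not is_retransmit_request:
            self.consecutive = 0          # a normal response ends the run
            return None
        self.consecutive += 1
        return "B2" if self.consecutive > MAX_RETRANSMITS else None
```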
B3 OCP Command Unsuccessful

Error Type: DE

Error Description: An incorrect response was received from the OCP processor after the
master processor issued a SEND STATUS command to the OCP.

Fault Isolation/Correction:
1. OCP
2. ECM

B4 OCP Command Timeout

Error Type: DE
Error Description: The OCP processor did not respond to a master processor command
within its allotted timeout period. As a result of this error, the master processor logs a B6 error
condition into EEPROM and latches B4 into the display.
Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly
B6 Master Processor UART Loopback Test Failure

Error Type: DF

Error Description: Drive-resident diagnostics were unable to transmit and receive data
through the master processor serial communications interface (SCI).
Fault Isolation/Correction:
1. ECM
B8 Master Processor UART Transmitter/Receiver Error

Error Type: DE
Error Description: The OCP failed to transmit or receive data through its master processor
serial port.
Fault Isolation/Correction:
1. OCP
B9 OCP-to-Master Processor Communications Timeout Failure

Error Type: OCP Error Code

Error Description: The master processor failed to communicate with the OCP processor
within 4 seconds after power-up.
Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly
BA OCP INIT Timeout Failure

Error Type: OCP Error Code
Error Description: The master processor failed to communicate with the OCP processor
within 4 seconds after issuing an initialize request to the OCP processor.

Fault Isolation/Correction:
1. OCP
2. ECM
3. Blower/bezel assembly

BB OCP Processor ROM Checksum Failure

Error Type: OCP Error Code
Error Description: The OCP processor performed a ROM checksum, and the calculated
checksum did not match the stored checksum.
Fault Isolation/Correction:
1. OCP
BC Cartridge Checksum Failure

Error Type: DF
Error Description: Invalid microcode was detected in the microcode update cartridge.
Fault Isolation/Correction:
1. Reseat update cartridge (retry T40)
2. Defective cartridge
3. OCP
4. ECM
BD Microcode Update Cartridge Detection Failure

Error Type: DF
Error Description: The microcode update utility (T40) was attempted without an update
cartridge in place.
Fault Isolation/Correction:
1. Cartridge is not inserted
2. Defective cartridge
3. OCP
4. ECM
BE Cartridge/EEPROM/Master Processor Consistency Check

Error Type: DF
Error Description: Microcode contained within the cartridge is inconsistent with the
microcode in the master processor, EPROM, or EEPROM. The microcode update process is
halted to prevent loading incompatible microcode. The product revision matrix documentation
shows which codes are compatible.
Fault Isolation/Correction:
1. Reseat update cartridge
2. Replace update cartridge with a compatible cartridge
3. ECM

BF Error Log Write Compare Error

Error Type: DE
Error Description: Each time the drive writes an error log entry into the error silo, it verifies
the data written. The microcode got a data compare error on the page (16 bytes) that was
written. This is not a fatal error. Should that particular silo entry be rewritten, it may or may
not fail again. This error code is not written to EEPROM but may be displayed at the time of
the error if the fault button is depressed.
Fault Isolation/Correction:
1. ECM
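The write-then-verify step behind error code BF can be illustrated with a simulated EEPROM. This is a sketch only: the function name, the use of a `bytearray` as the memory model, and the stuck-bit fault model are assumptions made for the example and do not describe the drive microcode.

```python
PAGE_SIZE = 16  # bytes per error log page, per the description above

def write_log_page(eeprom, offset, entry):
    """Write one 16-byte page, read it back, and return "BF" on a mismatch."""
    if len(entry) != PAGE_SIZE:
        raise ValueError("an error log page is 16 bytes")
    eeprom[offset:offset + PAGE_SIZE] = entry
    readback = bytes(eeprom[offset:offset + PAGE_SIZE])
    return None if readback == bytes(entry) else "BF"

class StuckBitEEPROM(bytearray):
    """Simulated faulty part: bit 0 of every byte always reads back as 0."""

    def __getitem__(self, key):
        data = super().__getitem__(key)
        if isinstance(data, int):
            return data & 0xFE
        return bytearray(b & 0xFE for b in data)
```

With a healthy `bytearray(64)` the function returns `None`; with `StuckBitEEPROM(64)` a page of `0xFF` bytes fails the compare and the function returns `"BF"`.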

C0 Hardware Revision and Microcode Incompatibility

Error Type: DE
Error Description: The microcode has determined that there is an incompatible hardware
and/or software combination, based on the revision information visible to it. The
microcode looks at the following hardware revisions in a drive:
• I/O-R/W module hardware revision jumpers
• Servo module hardware revision jumpers
• PCM switch pack (S1-1 through S1-4)
• HDA revision bits information

Most of this hardware revision information can be determined by executing drive internal test
T45 (see Chapter 4), then decoding the reported revision information.
After checking this internal revision information, the microcode modifies the hardware
revision that the drive finally reports to the subsystem and host.
Microcode revision 9 was the first release that checked for HDA revision. Subsequent microcode
revisions have been expanding on the compatibility testing. With the RA92 (microcode revisions
20 and later), a significant amount of revision checking/testing is necessary for the microcode to
properly configure itself as to the type of drive (RA90 vs RA92), type of HDA (short arm vs long
arm), type of format (RA90 vs RA92), and type of ECM (70-22942-01 vs 70-22942-02).

To determine TOTAL compatibility, you must verify:
• Code compatibility to ECM
• Code compatibility to HDA
• ECM compatibility to HDA
• PCM and HDA compatibility
• PCM switch pack setup.

Reference the compatibility tables in Chapter 3.
With microcode revisions 20 and later, the C0 LED error is a very significant fault to the
drive and must be resolved. The error type was redefined to a drive error.

Fault Isolation/Correction: If the HDA has just been replaced, replace it again with a
compatible revision or load compatible drive microcode in the ECM.

If the drive HDA and microcode were operational before the failure, then revision bits are now
being detected in error. This will require careful troubleshooting. A series of tables in the
RA90 / RA92 Disk Drive Pocket Reference Card has been prepared to help determine and
resolve this error condition. Additional tables are provided in Chapter 3.
1. If the HDA has just been replaced: load compatible microcode
2. If the PCM has just been replaced: check PCM switch pack S1-1 through S1-4 for correct
switch settings. Refer to the RA90 / RA92 Disk Drive Pocket Reference Card and the tables
in Chapter 3.
3. If the ECM has just been replaced: check microcode compatibility. Refer to the RA90 / RA92
Disk Drive Pocket Reference Card and the tables in Chapter 3.
4. R/W cable
5. PCM
6. ECM
C1 Outer Guardband Detected After HEAD LOAD Command

Error Type: DF
Error Description: The GASP gate array detected outer guardband after a HEAD LOAD
command.
Fault Isolation/Correction:
1. ECM
C2 Inner Guardband Detected After HEAD LOAD Command

Error Type: DF
Error Description: The GASP gate array detected inner guardband after a HEAD LOAD
command.
Fault Isolation/Correction:
1. ECM
C3 Seek to Outer Guardband Failed

Error Type: DF
Error Description: The servo processor was issued a SEEK command to the outer guardband
area of the disk but failed the seek.
Fault Isolation/Correction:
1. Clean cabinet air vent grill
2. ECM
3. PCM
4. Blower/bezel assembly
5. HDA

C4 Seek to Outer Guardband Not Detected

Error Type: DF
Error Description: The servo processor was issued a SEEK command to the outer guardband
area of the disk, but the OGB flag was not detected.

Fault Isolation/Correction:
1. ECM
C5 HDA and ECM Incompatibility

Error Type: DF
Error Description: The microcode has determined that the reported HDA type and ECM type
are incompatible. Specifically, the incompatible combination is an old ECM type (70-22942-01)
and an RA92 HDA.

Microcode revision 25 was the first release to check specifically for this error. Previous
microcode revisions (revision 9 and later) will report this condition as error code C0.
Fault Isolation/Correction: If the HDA or ECM has just been replaced, make sure compatible
part numbers have been used.
If the PCM has just been replaced (part of the HDA FRU assembly), make sure switches S1-1
through S1-4 are set correctly. (See Chapter 3 or compatibility tables in the RA90 / RA92 Disk
Drive Pocket Reference Card.)
If HDA, PCM, ECM and drive microcode were operational before the failure, then the switch
pack S1 on the PCM and/or the I/O-R/W and servo revision jumpers are now being detected
in error. This will require careful troubleshooting. See drive error code C0 for additional
troubleshooting information.
1. R/W cable
2. PCM (check S1 switch pack setting)
3. ECM (replace with P/N 70-22942-02)
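Which code is reported for an old ECM paired with an RA92 HDA depends on the microcode revision. A minimal sketch of that rule, assuming revisions earlier than 9 perform no such check and using hypothetical names and a hypothetical string encoding of the HDA type:

```python
OLD_ECM = "70-22942-01"  # old ECM type named in the error description

def hda_ecm_error_code(microcode_rev, ecm_part, hda_type):
    """Return the code reported for an old-ECM/RA92-HDA pairing, else None."""
    if not (ecm_part == OLD_ECM and hda_type == "RA92"):
        return None          # not the combination this entry covers
    if microcode_rev >= 25:
        return "C5"          # first revision to check specifically for this case
    if microcode_rev >= 9:
        return "C0"          # earlier revisions report it as error code C0
    return None              # assumption: revisions before 9 did not check
```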
C6 PLO Failure

Error Type: DE
Error Description: The voltage controlled oscillator (VCO) is not synchronized to the
dedicated servo information read from the media.

Fault Isolation/Correction:
1. ECM

2. PCM
3. HDA
C7 Seek to Inner Guardband Failed

Error Type: DF
Error Description: The servo processor was issued a SEEK command to the inner guardband
area of the disk but failed the seek.

Fault Isolation/Correction:
1. ECM

2. PCM

3. HDA
C8 Inner Guardband Not Detected After Seek to Inner Guardband

Error Type: DF
Error Description: A SEEK command, issued to the servo processor to seek to the inner
guardband area, failed to detect the inner guardband flag.
Fault Isolation/Correction:
1. ECM
2. HDA
C9 Analog Loop Test Failure

Error Type: DE
Error Description: The D/A and A/D circuitry did not respond correctly while tested in a loop.
The servo processor performs the analog testing on these circuits.
Fault Isolation/Correction:

1. ECM
2. PCM
CA Media Not Spinning

Error Type: DF

Error Description: Selected drive-resident diagnostics could not be executed because the
drive was spun down.
Fault Isolation/Correction:
1. Spin up drive

2. ECM
CC Servo Processor Recalibrate Failed

Error Type: DE
Error Description: A RECALIBRATE command issued to the servo processor failed.
Fault Isolation/Correction:
1. ECM

2. PCM
3. HDA

CD Track Counter (Gray Code)

Error Type: DE
Error Description: During coarse positioning, both gray code bits (X and Y) changed during
the same servo frame; or the same gray code changed (X or Y) during two consecutive servo
frames.
Fault Isolation/Correction:
1. HDA

2. ECM
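
The two fault conditions in the CD error description can be expressed as a check over successive gray code samples. The Python sketch below is a simplified illustration, not the servo firmware; the function name and the representation of each servo frame as an (X, Y) bit pair are assumptions, and the second condition is interpreted as the same bit changing in two back-to-back frames.

```python
def gray_code_fault(samples):
    """Return the index of the first frame that violates the gray code rules, or None.

    samples: list of (x, y) bit pairs, one per servo frame (hypothetical format).
    """
    prev_changed = None  # bit that changed in the previous frame: "X", "Y", or None
    for i in range(1, len(samples)):
        x_changed = samples[i][0] != samples[i - 1][0]
        y_changed = samples[i][1] != samples[i - 1][1]
        if x_changed and y_changed:
            return i  # both bits changed during the same frame
        changed = "X" if x_changed else "Y" if y_changed else None
        if changed is not None and changed == prev_changed:
            return i  # same bit changed during two consecutive frames
        prev_changed = changed
    return None
```

A normal sequence alternates which bit changes, for example (0,0), (0,1), (1,1), (1,0), (0,0), and passes the check.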
CE EEPROM Write Cycle Timeout

Error Type: DE
Error Description: During an EEPROM write operation, a location in EEPROM could not be
written within 20 milliseconds.
Fault Isolation/Correction:
1. ECM
CF Invalid Data In EEPROM

Error Type: DE
Error Description: Error information in EEPROM was found to be invalid.
Fault Isolation/Correction:
1. ECM
E0 Spindle Rotation Not Detected

Error Type: DE
Error Description: The servo system has not detected Hall sensor signal transitions. This
indicates either the spindle motor is not turning or the Hall sensor circuitry has failed. An open
motor coil (or drive circuitry) will show this symptom if that particular phase is needed to start
the spindle drive. See error code 13 before replacing FRUs.
With microcode revisions 19 and earlier, this error was spindle speed unsafe, basically the same
error detection.
With microcode revisions 20 and later, this error is simply failure to detect that the spindle has
performed any motion. The servo monitors the hall sense S1 signal (reference error code 13). If it
detects any transition on this specific motor control signal, then this check is okay.

Fault Isolation/Correction:
1. ECM

2. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear
flex cable assembly should be neatly dressed along the sides of the chassis at the rear)
3. Servo-to-spindle motor interconnect
4. Brake failure (on/open all the time)
5. HDA

6. Rear flex cable assembly

E1 Spindle Speed Out Of Range

Error Type: DE
Error Description: Spindle speed is monitored initially by input from the Hall sensors inside
the HDA spindle motor. Improper spindle speed, as detected by the Hall sensors, may prevent
proper speed control until the PLO frequency lock range is attained. Once the spindle speed
is within the PLO range, the servo system begins to look for servo data to which it can lock its
frequency. This error implies that the drive is unable to establish spindle speed rotation
within the range required (RA90 = 3600 rpm, RA92 = 3405 rpm).
An open failure of a spindle motor coil winding, or a motor drive circuitry failure, or a bad hall
sense S1 or S2 circuit will cause this type of error. See error code 13 for measurement points
and troubleshooting aids before replacing FRUs.

Fault Isolation/Correction:
1. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear
flex cable assembly should be neatly dressed along the sides of the chassis at the rear)
2. ECM

3. Continuity checks (refer to Table 5-10)
4. HDA
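
As a worked illustration of the nominal speeds quoted for error code E1, the time for one spindle revolution follows directly from the rpm figure (about 16.67 ms for the RA90 and about 17.62 ms for the RA92). The Python sketch below only performs that arithmetic; the names are hypothetical and nothing here reflects how the servo actually measures speed.

```python
# Nominal spindle speeds from the E1 error description.
NOMINAL_RPM = {"RA90": 3600, "RA92": 3405}


def rotation_period_ms(model):
    """Time for one spindle revolution at nominal speed, in milliseconds."""
    return 60_000 / NOMINAL_RPM[model]
```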
E2 A/D or D/A Converter Insane

Error Type: DE
Error Description: The servo processor detected a failure in its A/D or D/A converters during
a precheck before the head load was initiated.
Fault Isolation/Correction:
1. ECM

2. If you load microcode revision 13 (or earlier) into a 70-22942-02 (RA92-compatible) ECM, a
solid E2 error will be seen upon drive spinup.
E3 Excessive Positioner Current During Test

Error Type: DE
Error Description: The servo processor detected a failure in the power amp circuitry that
indicates a shorted condition.
Fault Isolation/Correction:
1. ECM
E4 Open Circuit Detected During Power Amp Toggle Test

Error Type: DE
Error Description: An open was detected in the power amp circuitry during a head load
sequence. Power is applied to the positioner in a toggle fashion during the head load sequence.
Reference error code 13 for information that may be useful for isolating an open circuit of the
actuator. An ohmmeter measurement might verify this condition at the HDA.

Fault Isolation/Correction:
1. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear
flex cable assembly should be neatly dressed along the sides of the chassis at the rear)

2. ECM
3. HDA
E5 Overcurrent Detected During Actuator Test

Error Type: DE
Error Description: The servo processor detected an overcurrent condition before attempting a
head load process.
Fault Isolation/Correction:
1. ECM
E6 Track Counter Clear Failure

Error Type: DE
Error Description: The track counter failed to clear indicating establishment of cylinder 0.
This is the final phase of the head load/RTZ process that must be accomplished.
Loss of PLO during this portion of the head load/RTZ process will also cause this error. See the
note in the error description for error code E8.

Fault Isolation/Correction:
1. ECM (most likely)

2. PCM
3. HDA
E7 Illegal Zone Detected

Error Type: DE
Error Description: The servo system is executing a head load or RTZ operation.
For microcode revisions 19 and earlier, the order of band detection is: outer guardband, data
area, then inner guardband area.
For microcode revisions 20 and later, the order of band detection that the servo system is
looking for is OGB, data area, then back to OGB. In this case (without an E9 error), the servo
system could not re-establish finding the OGB area (the second time). The servo system will
spend up to one second trying to re-establish the OGB area.
Loss of PLO during this portion of the head load/RTZ process will also cause this error. See the
note in the error description for error code E8.

Fault Isolation/Correction:
1. ECM (most likely)

2. HDA
3. PCM
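
The band-detection order given in the E7 error description depends on the microcode revision. A minimal Python sketch of that dependence follows; the function name and band labels are illustrative assumptions, not part of the drive microcode.

```python
def expected_band_order(microcode_rev):
    """Band sequence the servo looks for during head load/RTZ, per the E7 description."""
    if microcode_rev <= 19:
        # Outer guardband, data area, then inner guardband.
        return ["OGB", "DATA", "IGB"]
    # Revisions 20 and later return to the outer guardband after the data area.
    return ["OGB", "DATA", "OGB"]
```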

E8 Outer Guardband Timeout

Error Type: DE
Error Description: Servo is in the outer guardband (OGB) of the disk and wants to be able to
detect this region by looking for the OGB pattern from the dedicated servo information. At this
time, however, the servo cannot establish PLO lock and faults. Interruption of the servo data
stream is likely. Up to 3.4 seconds is allocated to trying to find servo data.
NOTE

PLO Loss During Head Load/RTZ: The PLO coming unlocked is a fairly serious error
to a servo system. It causes all the servo information to become unreadable. There
are now four different codes for the PLO being unlocked, depending on when it
happens:
• At the beginning of RTZ, if unable to establish lock, an E8 is reported.
• Midway through the RTZ, if lock is lost while scanning the disk for the OGB, an
  E7 is reported.
• Late in the RTZ, while going from the OGB to cylinder 0, lost lock results in an E6.
• During normal track following and seeking, lost lock causes an EC.

These are the error codes reported by the servo and logged in the error log while
functional I/O code is running. Diagnostic I/O code may log (and the OCP may
display) the I/O's error code of C6 for a PLO failure.
Fault Isolation/Correction:
1. ECM
2. PCM

3. HDA
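
The note under error code E8 relates the reported code to the point at which PLO lock is lost. As a quick-reference aid, that relationship can be sketched as a lookup table. The phase names below are hypothetical labels chosen for this illustration; only the error codes come from the note.

```python
# Servo error code reported when PLO lock is lost, keyed by operating phase.
# Phase names are illustrative labels, not terms used by the microcode.
PLO_LOSS_CODES = {
    "rtz_start": "E8",                # unable to establish lock at the beginning of RTZ
    "rtz_scan_for_ogb": "E7",         # lock lost while scanning the disk for the OGB
    "rtz_ogb_to_cylinder_0": "E6",    # lock lost going from the OGB to cylinder 0
    "track_following_or_seek": "EC",  # lock lost during normal operation
}

# Diagnostic I/O code may instead log the I/O processor's own PLO failure code.
DIAGNOSTIC_IO_PLO_CODE = "C6"


def plo_loss_code(phase):
    """Return the servo error code for a PLO loss in the given phase."""
    return PLO_LOSS_CODES[phase]
```

When E6, E7, E8, and EC appear together in an error log, this grouping suggests treating them as one underlying PLO problem rather than four separate faults.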
E9 Gray Code Timeout During the Turnaround State

Error Type: DE
Error Description: No gray code transitions were detected during a hold sequence. The
drive is attempting a head load (NRZ), is in the OGB, and has PLO locked, reading its OGB
position. At this point, the servo is attempting to move forward to look for track crossings and
the eventual detection of the data area of the disk. However, the servo cannot get the positioner
to move. The servo will spend up to 3.4 seconds trying to move the positioner.
A sticky (dragging) actuator lock pin or faulty actuator lock solenoid will also cause this error.

Fault Isolation/Correction:
1. HDA (positioner lock solenoid failure; see error code 13)
2. Rear flex cable assembly (visually inspect for damage (HDA removal necessary); the rear
flex cable assembly should be neatly dressed along the sides of the chassis at the rear)
3. ECM

EA Gray Code Timeout During Outer Guardband State

Error Type: DE
Error Description: No gray code transitions were detected during a head load sequence.
Fault Isolation/Correction:
1. Visually inspect rear flex cable assembly
2. HDA (positioner lock solenoid failure; see error code 13)
3. ECM
EB Sector Pulse Timeout During Sync-Up State

Error Type: DE

Error Description: An index pulse was detected but no sector pulse was detected in the data
area of the disk. Heads may not be positioned over the data area.
Fault Isolation/Correction:
1. ECM

2. HDA
EC Servo Fault and PLO Fault Bit Set In GASP

Error Type: DE
Error Description: The servo fault and PLO fault bits are both set in the GASP, but it
was noted by the servo processor that the PLO had come unlocked. Similar to error code 25,
however, the servo processor did see the PLO deassert, which in turn caused the servo fault bit
to set.
Fault Isolation/Correction:

1. ECM
2. PCM

3. HDA
ED Servo Watchdog Timeout

Error Type: DE
Error Description: The digital signal processor (DSP) was not interrupted on time by the
GASP. Possibly, the servo clock signal is not present or is not being detected properly. The
timeout is 820 microseconds.
Fault Isolation/Correction:
1. ECM
EE Servo Digital Signal Processor Reset

Error Type: DE

Error Description: The Servo DSP has been reset. As a result, the profiles for the drive have
not been loaded by the master processor. The DSP is sane, but has not been told what type
of HDA is present in the drive; it may be an RA90 long arm, RA90 short arm, or an RA92.
Therefore, the servo will not load its servo tables or move the actuator. This is an unusual
error condition. The master processor should have reinitialized the drive characteristics into
the servo system.

Fault Isolation/Correction:
1. Turn drive power off and on
2. ECM
EF Head Unload Failed

Error Type: DE
Error Description: The servo processor responded with an error condition to a HEAD
UNLOAD command.
Fault Isolation/Correction:
1. ECM
2. HDA
F0 Servo Microcode Update Failed

Error Type: DE
Error Description: The servo processor did not send a SUCCESSFUL acknowledgment when
the master processor attempted to load external servo processor RAM with new microcode.
When the drive powered up, a microcode update occurred or a servo timeout took place. The
master processor did a compare of EEPROM to RAM microcode. The data did not compare.
Fault Isolation/Correction:
1. I/O-R/W to servo cable connection
2. ECM
F1 Command to Servo Processor Timed Out

Error Type: DE
Error Description: The master processor attempted to issue an UNLOAD command to the
servo processor; however, the command timed out during its execution.
Fault Isolation/Correction:
1. ECM
F3 Servo Spinup Failed

Error Type: DE
Error Description: The master processor issued a SPINUP command to the servo processor
and the servo processor responded with an error condition.
Fault Isolation/Correction:
1. ECM

2. Brake assembly
3. HDA

F4 Servo Spindown Failed

Error Type: DE
Error Description: The master processor issued a command to spin down the drive. The
servo processor responded with an error condition.

Fault Isolation/Correction:
1. ECM
F5 Seek Failed

Error Type: DE
Error Description: The servo processor returned an error condition in response to a SEEK
command from the master processor.
NOTE

T65 does not check for out-of-range values. Do not exceed the maximum specified
input values. Also, the last cylinder parameter must always be equal to or greater
than the first cylinder parameter. If an invalid cylinder value is entered, a (servo)
Seek Failed error (F5) occurs.
Fault Isolation/Correction:
1. HDA
2. ECM (if 10 of 13 heads)
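
Because T65 does not validate its own inputs, it can help to check the cylinder parameters before entering them. The Python sketch below illustrates the two rules from the note above. It is not part of the drive diagnostics; the function name is hypothetical, the maximum cylinder is a caller-supplied value because this section does not state it, and the lower bound of 0 is an assumption.

```python
def check_t65_cylinders(first_cyl, last_cyl, max_cyl):
    """Return a list of problems with the T65 cylinder parameters (empty if none).

    max_cyl must be taken from the T65 test description for the drive in use;
    it is not given in this section.
    """
    problems = []
    for name, value in (("first", first_cyl), ("last", last_cyl)):
        if not 0 <= value <= max_cyl:
            problems.append(f"{name} cylinder {value} is outside 0..{max_cyl}")
    if last_cyl < first_cyl:
        problems.append("last cylinder is less than first cylinder")
    return problems
```

An empty result means the parameters satisfy both rules; any entry in the list points to a value that would be expected to produce an F5 error.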
F6 Head Switch Failed

Error Type: DE
Error Description: The servo processor responded with an error condition to a HEAD
SWITCH command initiated by the master processor.
Fault Isolation/Correction:
1. HDA

2. ECM (if 10 of 13 heads)
F7 RTZ Failed

Error Type: DE
Error Description: The master processor issued a RETURN TO ZERO (RTZ) command to the
servo processor, and the servo processor responded with an error condition.
Fault Isolation/Correction:
1. ECM
2. HDA
F8 Head Load Failed

Error Type: DE
Error Description: The master processor issued a HEAD LOAD command to the servo
processor, and the servo processor responded with an error condition and no specific error
information with it, or the head load timed out.

Fault Isolation/Correction:
For microcode revisions 19 or earlier:
1. ECM (if 10 of 13 heads)

2. PCM
3. HDA

For microcode revisions 20 or later:
1. ECM
F9 Diagnostic Command Failed

Error Type: DF
Error Description: The servo processor responded with an error or a timeout condition to a
DIAGNOSE command issued by the master processor.

Fault Isolation/Correction:
1. ECM

2. PCM
3. HDA
FA Servo Processor Failed Seek to DGN Write Cylinder

Error Type: DF
Error Description: A seek to the diagnostic (DGN) write/read cylinder failed while under
diagnostics control.

Fault Isolation/Correction:
1. ECM

2. PCM

3. HDA
FB Servo Processor Failed Seek to DGN Read Cylinder

Error Type: DF
Error Description: A seek to the diagnostic (DGN) read-only cylinder failed while under
diagnostics control.

Fault Isolation/Correction:
1. ECM
2. PCM
3. HDA

FD EEPROM Checksum Error

Error Type: DF
Error Description: An incorrect checksum was detected in one of the EEPROMs.

Fault Isolation/Correction:
1. Reload microcode

2. ECM

Removal and Replacement Procedures
6.1 Introduction
This chapter describes the removal and replacement procedures for RA90 and RA92 disk drive
components. No tools are needed to remove or replace the six major field replaceable units (FRUs)
that make up the RA90/RA92 disk drive. However, tools are required for the removal and/or
replacement of some drive components. A tools checklist is included to identify these tools. Tools
are also identified in procedures where needed.
Figure 6-1 shows an exploded view of the RA90/RA92 disk drive.

Figure 6-1 RA90/RA92 Disk Drive - Exploded View
(callouts: POWER SUPPLY, RA90/RA92 DISK DRIVE CHASSIS, RIBBON CABLE, BLOWER MOTOR ASSEMBLY; drawing CXO-2170B)

6.2 Sequence for FRU Removal
Remove RA90/RA92 FRUs in the following sequence:

CABINET FRONT PANEL/DRIVE GRILL
OCP
BLOWER MOTOR ASSEMBLY
ECM
PCM
HDA
SPINDLE GROUND BRUSH
BRAKE
SOLENOID

CABINET REAR PANEL
POWER SUPPLY

Figure 6-2 FRU Removal Sequence (drawing CXO-2200A)

Use care when removing and replacing drive components. Never force fit drive modules or
components. Generally, a steady, firm pressure and the correct alignment ensures proper seating of
drive components. If you encounter resistance during FRU removal or replacement, check for bent
pins, obstructions, or improper alignment of parts.
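
The removal sequence can also be read as a set of prerequisites: each FRU can be removed only after the items in front of it are out. The Python sketch below is a simplified illustration assembled from Figure 6-2 and the procedures in this chapter. The dictionary and function names are hypothetical, and steps such as removing power and disconnecting the ribbon cable are deliberately left out.

```python
# What must already be removed before each item can come out (simplified).
PREREQS = {
    "cabinet front access panel": [],
    "OCP": ["cabinet front access panel"],
    "blower motor assembly": ["OCP"],
    "ECM": ["blower motor assembly"],
    "PCM": ["blower motor assembly"],
    "HDA": ["blower motor assembly"],
    "spindle ground brush": ["HDA"],
    "brake": ["spindle ground brush"],
}


def removal_order(fru):
    """List the items to remove, in order, ending with the requested FRU."""
    order = []

    def visit(name):
        for dep in PREREQS[name]:
            visit(dep)
        if name not in order:
            order.append(name)

    visit(fru)
    return order
```

For example, removal_order("ECM") lists the front access panel, the OCP, and the blower motor assembly ahead of the ECM itself. Always follow the full numbered procedure for the FRU being replaced.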

6.3 Electrostatic Sensitivity
Drive components and FRUs are highly sensitive to electrostatic shock. Use proper ESD methods
when handling drive components. (Refer to Section 1.4, Electrostatic Protection.)

6.4 Power Precautions
Since hazardous voltages are present in this equipment, it is recommended that only trained service
personnel attempt to service this equipment.

WARNING
Always remove power from the unit before removing or replacing any internal part or
cable. Bodily injury or equipment damage may result from improper servicing.

6.5 Tools Checklist
Most RA90 and RA92 disk drive repairs can be performed without the use of tools. However, the
following tools are required during some procedures:
• 5/32 Hex wrench
• 1/16 Allen wrench
• 3/32 Allen wrench
• 5/32 Allen wrench
• 3/16 Allen wrench
• Pliers
• Needlenose pliers
• Medium Phillips screwdriver
• Flat-blade screwdriver

6.6 Removing/Replacing Cabinet Front and Rear Access Panels
Procedures contained in this chapter require the removal of cabinet front and rear access panels.
Panel removal and replacement procedures follow.

6.6.1 Removing/Replacing the Front Access Panel
To remove the cabinet front access panel (refer to Figure 6-3):
1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top
of the panel. Turn the fasteners counterclockwise.
2. Grasp the panel by its edges, tilt it toward you, and lift it up about 2 inches. Remove the panel
and store it in a safe place.

To reinstall the front access panel:
1. Lift the panel into place and lower it straight down until the tabs on the panel's lower edge
engage the slots in the cabinet support bracket.
2. Holding the panel flush with the cabinet, use a hex wrench to lock the quarter-turn fasteners.
Turn the fasteners clockwise.

6.6.2 Removing/Replacing the Rear Access Panel
To remove the cabinet rear access panel (refer to Figure 6-4):
1. Use a hex wrench or flat-bladed screwdriver to unlock the two quarter-turn fasteners at the top
of the panel. Turn the fasteners counterclockwise.
2. Tilt the panel toward you and lift it up to disengage the pins at the bottom.
3. Lift the panel clear of the enclosure and store it in a safe place.

To reinstall the rear access panel:
1. Lift the panel into place and fit the pins into the holes at the top of the I/O bulkhead.
2. Push the top of the panel into place and use a hex wrench to lock the quarter-turn fasteners.
Turn the fasteners clockwise.

Figure 6-3 Front Access Panel Removal
(callouts: QUARTER-TURN FASTENER, HEX WRENCH, PANEL, SUPPORT BRACKET; drawing CXO-2130C)

Figure 6-4 Rear Access Panel Removal
(callouts: HEX FASTENER, CABINET REAR BUSTLE, REAR ACCESS PANEL, I/O BULKHEAD, PINS; drawing CXO-2131C)

6.7 Removing the Operator Control Panel
The operator control panel (OCP) is secured to the bezel/blower assembly by the OCP-to-blower
connector and by flexible metal retention clips.
NOTE

Note the orientation of the OCP before removing.

To remove the OCP (refer to Figure 6-5):
1. Remove power from the drive.

2. Grip the OCP in the middle and gently pull it towards you.
3. Note OCP-to-blower connector orientation.
Reverse this process to replace the OCP. (Check for bent pins before replacing.)

Figure 6-5 OCP Removal
(callouts: DRIVE FRONT, OPERATOR CONTROL PANEL, BEZEL, CONNECTOR CAP; drawing CXO-2172C)

6.8 Removing the Blower/Bezel Motor Assembly
Although the bezel and blower motor assembly are removed as one unit from the drive chassis, the
bezel and blower motor assembly are two separate units. The blower motor assembly is the FRU.
Pay particular attention to the blower motor orientation and blower motor-to-ECM connection.
To remove the blower motor assembly (refer to Figure 6-6):
1. Remove power from the drive.
2. Remove the OCP (refer to Section 6.7).
3. Note blower motor orientation before removing.
4. Locate the four wing nuts.
5. Rotate lower then upper wing nuts counterclockwise to loosen.
6. Grasp the assembly sides and pull the assembly toward you.

Figure 6-6 Blower Motor Assembly Removal Sequence
(callouts: DRIVE CHASSIS, BLOWER ASSEMBLY, WING NUTS; drawing CXO-2173B)

To replace the blower motor assembly:
1. Ensure a good connection exists between the blower motor assembly and the ECM.
2. Check for proper connector alignment.
3. Use steady, gentle pressure to replace the blower motor assembly. Do not force the blower
assembly into position. If resistance is encountered, check for bent pins.
4. Tighten the upper and lower wing nuts in a clockwise direction.

6.8.1 Separating the Bezel and Blower Motor Assembly
Use the following procedure to separate the blower motor assembly from the bezel (refer to
Figure 6-7):
1. Place the assembly grill-side down.
2. Locate and disconnect the +24 V blower motor connector (red and black leads).
3. Locate the Phillips-head screws; loosen and remove.
4. Separate the bezel and blower motor assembly.
Reverse this procedure to reconnect the bezel and blower motor assembly. Return the assembly to
the chassis.

Figure 6-7 Bezel and Blower Motor Assembly Separation
(callouts: BEZEL, BLOWER, PHILLIPS-HEAD SCREWS, +24 V BLOWER MOTOR CONNECTOR, BLOWER CONNECTION; drawing CXO-2174B)

6.9 Removing the Electronic Control Module
Ensure proper grounding before beginning this procedure. To remove the electronic control module
(ECM) (refer to Figure 6-8):
1. Remove power from the drive.
2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Remove the ribbon cable from the preamp control module (PCM).
5. Locate the lock/release lever on the side of the ECM.
6. Grasp the ECM handle and apply pressure to the lock/release lever with your thumb.
NOTE

Do not use extreme force when applying pressure to the lock-release lever. Only firm,
steady pressure is required to remove the ECM.

Figure 6-8 ECM Removal
(callouts: RA90/RA92 DISK DRIVE CHASSIS, RED BAND, LOCK/RELEASE LEVER; drawing CXO-2176C)

7. Pull the ECM toward the front of the chassis.
8. If resistance is encountered, apply a small amount of back pressure to the ECM and, at the
same time, apply pressure to the lock release lever. Pull the ECM toward the front of the
chassis.
Reverse this procedure to replace the ECM. Apply firm (not excessive) pressure until the carrier
latch engages its detent. Reconnect the ECM-to-PCM ribbon cable.
NOTE

Do not force the ECM. If necessary, remove and examine rear connector pins to verify
nothing is bent or jammed. In very extreme cases, it may be necessary to remove the SDI
cables from the rear of the drive before inserting the ECM.

6.10 Removing the Preamp Control Module
It is not necessary to remove the HDA in order to remove the preamp control module (PCM).
Refer to Figure 6-9 while performing this procedure. Ensure proper grounding before beginning
PCM removal.

Figure 6-9 PCM Removal
(callouts: PCM TO HDA CONNECTOR, SWITCH PACK, PCM; drawing CXO-2175B)

1. Remove power from the drive.

2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Remove the ribbon cable from the PCM.

5. Remove the Phillips-head screws securing the PCM to the HDA.
6. Note the orientation of the PCM-to-HDA connector. Place your fingers on the sides and near
the PCM-to-HDA connector. Use steady, firm pressure to dismount the PCM from the HDA.
Reverse this procedure to replace the PCM. Ensure proper alignment between the HDA and
PCM-to-HDA connectors. (Check for bent pins prior to reinstalling.)

6.11 Removing/Replacing the Head Disk Assembly
This section documents the procedures for removing and replacing the HDA. Use extreme care
during HDA removal/replacement procedures to prevent damage to the HDA.
As with all static-sensitive components, ensure proper grounding when handling. Place components
on a grounded, anti-static work surface. Prior to installation, a replacement HDA must be
thermally stabilized.

WARNING
The thermal stabilization procedure is mandatory. Failure to thermally stabilize this
equipment could cause premature equipment failure.

6.11.1 Removing the HDA
Run tests T43 and T44 before replacing the HDA, to capture seek and spinup information. Record
this information on the red tag when returning the HDA.
Run tests T53 and T54 to clear stored parameters from the old HDA.
WARNING
An HDA weighs 15 kilograms (33 pounds). Use both hands during this procedure.

The positioner/head assembly must never be rotated in a counterclockwise direction.
Damage to the media and heads could occur.
Place the HDA on a grounded, anti-static work surface after it has been removed. Use proper
grounding techniques when working with drive components.
To remove the HDA (refer to Figure 6-10):
1. Remove power from the drive.

2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Remove the ribbon cable from the HDA.
5. Locate the baseplate latch assembly.
6. To unlock the HDA from the drive chassis, grasp the baseplate latch assembly and pull up and
turn until the lock is in its top position.
7. Grasp the HDA carrier handle and pull the HDA toward the front of the drive.
8. Place one hand under the HDA as you remove it from the drive chassis.

Figure 6-10 HDA Removal
(callouts: RA90/RA92 DISK DRIVE CHASSIS, BASEPLATE LATCH ASSEMBLY; drawing CXO-2177B)

9. If resistance is encountered, attempt to carefully reinsert the HDA and try this procedure again.
It may be necessary to apply a small amount of back pressure before the HDA can be removed
from the chassis.
10. Place the HDA on a grounded, anti-static work surface.

6.11.2 HDA Thermal Stabilization Procedure
The replacement HDA must be thermally stabilized before its moisture barrier bag is opened.
Prior to installation, a replacement HDA must be stored at a temperature of 16°C (60°F) or higher
for a minimum of 24 hours. The HDA may be stored in the computer room or in another storage
room under controlled temperature conditions. If stored in another storage room, the HDA must sit
for an additional hour in the computer room in which it will be installed.
CAUTION

Under no circumstances should the HDA be left overnight in an uncontrolled
temperature environment where cold temperatures could occur (for example, in a car)
and then opened/installed without a 24-hour thermal stabilization period.
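
The stabilization rules above reduce to a simple readiness check. The Python sketch below is an illustration only; the function and parameter names are hypothetical, and it assumes the storage temperature was held at or above the minimum for the whole storage period.

```python
MIN_STORAGE_TEMP_C = 16           # 16°C (60°F) or higher
MIN_STORAGE_HOURS = 24            # minimum stabilization period
EXTRA_HOURS_IN_COMPUTER_ROOM = 1  # required if the HDA was stored in another room


def hda_ready_to_open(storage_temp_c, hours_stored, stored_in_computer_room,
                      hours_in_computer_room=0.0):
    """Return True if the moisture barrier bag may be opened per Section 6.11.2."""
    if storage_temp_c < MIN_STORAGE_TEMP_C or hours_stored < MIN_STORAGE_HOURS:
        return False
    if stored_in_computer_room:
        return True
    # Stored elsewhere: the HDA must also sit in the computer room for an hour.
    return hours_in_computer_room >= EXTRA_HOURS_IN_COMPUTER_ROOM
```

For example, an HDA kept at 20°C for 24 hours in a separate storeroom is not ready until it has also spent one hour in the computer room where it will be installed.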

6.11.3 Replacing the HDA
After the thermal stabilization criteria have been met, open the HDA box and carefully cut the
heat-sealed end of the moisture barrier bag. Remove the desiccant from the moisture barrier bag
and the HDA from the foam bag. Save all HDA packing material to repackage the failing HDA.
Use the following procedure to install the replacement HDA:
• Slide the HDA into the chassis until the spring-loaded latch locks into place.

WARNING
When reinserting the HDA into the drive chassis, take care not to pinch your fingers.
There is limited clearance between the HDA handle and chassis edges.

• Turn the baseplate latch assembly until the latch drops into place and the HDA is secure. To
  ensure the HDA is secure, try sliding the drive in and out of the chassis.
• Reconnect the ECM-to-PCM ribbon cable.
• Run tests T53 and T54 to clear stored (replaced) HDA-related information.

Return the defective HDA in the replacement HDA's shipping package. Place desiccant inside the
moisture barrier bag before folding and sealing the package. Tape the red tag to the outside of the
sealed HDA package.

6.11.4 Separating the HDA and Carrier
A number of repairs require separating the HDA and carrier. Use the following procedure to
accomplish this:
1. Remove the HDA from the chassis and set it carrier-side up on a grounded, anti-static work
surface (Section 6.11.1).
2. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11.
NOTE

Remove the C clips by pressing against the spring-loaded rear HDA connector and,
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to
loosen and remove the clips.
3. Remove the rear HDA connector.
4. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper
bracket assembly.
5. Completely loosen but do not remove the four Torx-head screws with a Torx T-15 screwdriver.
Refer to Figure 6-11 for the location of the Torx-head screws.
Reverse this procedure to reassemble the HDA and the carrier.

Figure 6-11 HDA Carrier Separation
(callouts: DAMPER BRACKET ASSEMBLY; drawing CXO-2178B)

6.11.5 Removing the Spindle Ground Brush
This section documents the procedure for removing and replacing the RA90/RA92 spindle ground
brush. Because handling the HDA is necessary, extreme caution must be used.
Refer to Figure 6-12 during this procedure.

Figure 6-12 Spindle Ground Brush Removal
(callouts: HEX-HEAD SCREWS, SPINDLE GROUND BRUSH COVER; drawing CXO-2180B)

1. Remove power from the drive.
2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Disconnect the ribbon cable from the PCM.

5. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work
surface, carrier-side up.
6. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11.
NOTE

Remove the C clips by pressing against the spring-loaded rear HDA connector and,
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to
loosen and remove the clips.
7. Remove the rear HDA connector.
8. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper
bracket assembly (refer to Figure 6-11).

9. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the
location of the Torx-head screws.
10. Separate the HDA and carrier (refer to Section 6.11.4).
11. Locate and remove the spindle ground brush cover shown in Figure 6-12.
12. Locate and remove the spindle ground brush by removing the two hex-head screws that hold it
in place.
Replace the ground brush then reassemble the HDA and drive assemblies.

6.11.6 Removing the Brake Assembly
This section documents the procedures for removing and replacing the RA90/RA92 brake assembly.
Because handling the HDA is necessary, extreme caution must be used.
You will need a contact extraction tool (Digital Part Number 29-26655-00) to perform this
procedure. Refer to Figures 6-12, 6-13, and 6-14 while performing this procedure.
CAUTION

Never rotate the actuator or positioner shaft counterclockwise. HDA damage could
occur.
1. Remove power from the drive.
2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Disconnect the ribbon cable from the PCM.
5. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work
surface, carrier side up.
6. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11.
NOTE

Remove the C clips by pressing against the spring-loaded rear HDA connector and,
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to
loosen and remove the clips.


Figure 6-13  Contact Extraction Tool (CXO-2181B)
Callouts: rear HDA connector, handle, contact cavity, lance release tip

7. Remove the rear HDA connector.
8. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper
bracket assembly (refer to Figure 6-11).
9. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the
location of the Torx-head screws.
10. Separate the HDA and carrier (refer to Section 6.11.4).
11. Locate and trace the brake electrical contacts to the rear HDA connector.
12. Extract the brake electrical contacts (contacts 4 and 5) from the rear HDA connector using the
contact extraction tool from the kit.
13. Align the contact extraction tool with the front of the connector. Align the lance release tip
with the lance release slot, making sure to align the tip with the contact cavity. Refer to
Figure 6-13.


Figure 6-14  RA90/RA92 Brake Assembly Removal/Replacement (CXO-2997A)
Callouts: brake hold-down screws, brake hub

14. Push the lance release tip in until the locking lance (metal tip inside contact pin) is released
from the slot.
15. Hold the connector firm and push the handle of the contact extraction tool forward. The contact
should back out of the rear of the connector.
16. Remove the contact extraction tool and pull the brake contact from the back of the connector.
17. Locate and remove the spindle ground brush cover (refer to Figure 6-12).
18. Locate and remove the spindle ground brush.
19. Use a 5/32 Allen wrench to remove brake hold-down screws (refer to Figure 6-14).
20. Note the hex shape of the spindle and matching hex shape of the brake hub.


21. Orient the brake hub to the spindle and fit them together. Do not rotate spindle
counterclockwise.
22. Secure the brake to the baseplate with the brake hold-down screws. Refer to Figure 6-14.
23. Replace the spindle ground brush.
24. Reinstall the spindle ground brush cover.
25. Insert brake electrical contacts into slots 5 and 6 in the connector. (Ensure a secure fit by
tugging on leads.)
26. Reassemble the HDA to the HDA carrier.
27. Attach the rear HDA connector and C clips.
28. Reassemble the drive.
29. Install the HDA into the drive chassis.

6.11.7 Spindle Lock Solenoid Failure
This section covers solenoid failures. The solenoid is not a replaceable FRU; however, its failure
prevents the heads from loading and data from being recovered.
To preclude the loss of data because of a solenoid failure, this procedure allows you to bypass the
solenoid long enough to recover the data and back it up onto another disk drive or tape unit.
CAUTION

Attempt this procedure only under the worst possible situations; that is, if customer
backup data is not current or work in progress must be recovered. After performing this
procedure and recovering the data, replace the HDA according to Section 6.11.3.

Refer to Figure 6-15 while performing this procedure.
1. Remove power from the drive.
2. Remove the OCP (refer to Section 6.7).
3. Remove the blower motor assembly (refer to Section 6.8).
4. Remove the HDA from the chassis (Section 6.11.1) and set it on a grounded, anti-static work
surface, carrier side up.
5. Locate the rear HDA connector and remove the retaining C clips shown in Figure 6-11.
NOTE

Remove the C clips by pressing against the spring-loaded rear HDA connector and,
at the same time, using a small, flat-bladed screwdriver or small needlenose pliers to
loosen and remove the clips.

6. Remove the rear HDA connector.
7. Use a Phillips screwdriver to remove the two screws securing the HDA carrier to the damper
bracket assembly.
8. Loosen the four Torx-head screws with a Torx T-15 screwdriver. Refer to Figure 6-11 for the
location of the Torx-head screws.
9. Separate the HDA and carrier (refer to Section 6.11.4).
10. Locate the solenoid (refer to Figure 6-15).


Figure 6-15  Disabling the Solenoid for In-Field Data Recovery (CXO-2179B)
Callouts: tape contacts, solenoid armature, solenoid hold-down screws


11. Disconnect the electrical leads from the solenoid and place electrical tape over the lead contacts
to prevent shorting.
12. Loosen and remove the positioner lock solenoid hold-down screws with a T-15 Torx wrench.
13. Remove the solenoid and set it aside.
14. Reinstall the solenoid hold-down screws to the baseplate and tighten slightly.
15. Loop a piece of 20-gauge wire (or equivalent) approximately 6 inches long through the solenoid
armature as shown in Figure 6-15.
16. Secure one end of the wire around one of the solenoid hold-down screws and tighten the screw
securely onto the wire.
17. After looping the wire through the solenoid armature, gently pull the solenoid plunger away
from the positioner/actuator assembly until it stops (approximately a quarter inch).
18. Loop the loose end of wire around the second hold-down screw and tighten the screw securely
onto the looped wire.
CAUTION

Ensure both sides of the wire are secure and that the solenoid plunger is held back.
The aim of this procedure is to recover customer data. If the solenoid plunger slips
back, it will cause the solenoid armature to allow the positioner/actuator assembly to
lock. Data recovery will then be unsuccessful.
Reassemble the HDA, carrier, and drive.
After data has been recovered, replace the HDA according to the HDA replacement procedure in
Section 6.11.3. When returning the old HDA from the field, also return the failed solenoid.

6.12 Removing the Power Supply
This section documents the procedures for removing and replacing the RA90/RA92 power supply.
Ensure you have removed power from the correct drive. Proceed with caution whenever working
with high voltages. Refer to Figure 6-16 while performing this procedure.
WARNING

When removing and replacing drive components, take care not to pinch your fingers.
There is limited clearance between the HDA handle and chassis edges.
1. Spin down the drive.
2. Turn off the drive circuit breaker to remove power from the drive.
3. Note port cable connector locations when removing the power supply.
4. Remove the power cord from the rear of the drive.
5. Remove other cables that may interfere with the power supply removal.
6. Loosen the bottom two quarter-turn fasteners by turning in a counterclockwise direction.
7. Support the bottom of the power supply with one hand.
8. Loosen the top two quarter-turn fasteners by turning in a counterclockwise direction.

9. Remove the power supply.


Figure 6-16  Power Supply Removal (CXO-2171B)
Callouts: RA90/RA92 disk drive rear, quarter-turn fasteners

CAUTION

The power supply weighs approximately 6.8 kilograms (15 pounds). It must be supported
when being removed from the drive.
Reverse this process to replace the power supply. Check the line voltage selector switch to ensure
you have the correct voltage for your area.

6.13 Removing/Replacing the Rear Flex Cable Assembly
This section documents the procedures for removing and replacing the RA90 and RA92 rear flex
cable assembly. To facilitate the removal of the rear flex cable assembly, first remove the drive
HDA, power supply, and ECM. After these drive components have been removed, remove the drive
chassis from the cabinet and place it on a grounded, anti-static work surface.


To remove the rear flex cable assembly (refer to Figure 6-17):
1. Loosen the four Allen screws holding the rear panel assembly to the drive chassis.

2. Remove the 15 contact springs. Set the contact springs aside.
3. Remove the four Allen screws and set the rear panel assembly aside. (Set aside the drive serial
number label bracket.)

Figure 6-17  Rear Flex Cable Assembly Removal (CXO-2990A)
Callouts: black female ECM connector, drive S/N label bracket, adhesive-backed cable clamp
(remove), helical-split washer (4), green male rear connector, drive hardware revision switch pack

The next step requires the removal of the rear flex cable assembly. A number of adhesive-backed
cable clamps secure the rear flex cable assembly in place. All of the cable clamps open toward the
rear of the drive, with one exception; locate and remove this one cable clamp to
facilitate removal of the rear flex cable assembly. (See Figure 6-17 for the location of this clamp.)
4. Remove the two Allen-head screws that secure the green male rear connector to its bracket.
5. Remove the two C clips that secure the black ECM female connector to its bracket.
6. Remove the rear flex cable assembly.


The next step requires the replacement of the rear flex cable assembly. Lay the replacement rear
flex cable assembly out next to the one being replaced. Set the dip switches on the new rear flex
cable assembly to the exact settings from the replaced one. By hand, bend the rear flex cable
assembly 90 degrees in the same places as the original assembly.
NOTE

Future flex cable assemblies may use dip shunt switch packs rather than dip switch
packs. A shunt open = switch open or off.
7. Place the rear flex cable assembly on the rear panel assembly with the two connectors on their
proper brackets.

8. Secure the green male rear connector to its bracket with the two (previously removed) Allen-head screws.
9. Secure the black female ECM connector to its bracket with the two (previously removed) C clips.
10. Replace the previously removed adhesive-backed cable clamp.
11. Loosely attach the rear panel assembly to the rear of the drive chassis.

12. Replace the 15 contact springs.
13. Secure the rear panel assembly by tightening the Allen screws.
14. Return the drive chassis to the cabinet.
15. Return the drive components to the drive chassis.

6.14 Media Removal Service for Customers
The on-site media removal and disposal service is an exclusive Digital Customer Services offering.
The following tools are needed to remove drive media from the HDA. Digital part numbers for these
tools are listed in Table 6-1:
1. 1/16 Allen wrench
2. 3/32 Allen wrench
3. 5/32 Allen wrench
4. 3/16 Allen wrench
5. Torx size T-15 wrench
6. Torx size T-15 socket wrench
7. Pliers
8. Diagonal cut pliers
9. Needlenose pliers
10. Medium Phillips screwdriver
11. Flat-bladed screwdriver


Table 6-1  Digital Part Numbers for Recommended Tools

Technical Description                     Part Number
Ballpoint hex screwdriver blade, 1/16"    29-26111-00
Ballpoint hex screwdriver blade, 3/32"    29-26113-00
Ballpoint hex screwdriver blade, 5/32"    29-26117-00
Ballpoint hex screwdriver blade, 3/16"    29-26118-00
Pliers, diagonal cutters, 4"              29-19328-00
Pliers, long needlenose                   29-13461-00
Socket, Torx T-15                         29-27275-01
Screwdriver, Torx T-15                    29-22772-00
Screwdriver blade, Phillips #1            29-11001-00
Screwdriver blade, slotted, 3/16"         29-1098S-00
Screwdriver blade, Torx T-15              29-22772-00
Screwdriver blade, Torx T-10              29-26947-01

To remove the media from the HDA (refer to Figures 6-18 and 6-19):
1. Remove the PCM from the HDA and store it in an ESD bag for return to Customer Services
Logistics. Use proper ESD procedures.
2. Remove the four Torx-head screws, or three Torx-head screws and one medium Phillips-head
screw that secure the PCM plug to the HDA chassis.
3. Remove the HDA from the drive chassis (refer to Section 6.11.1).
4. Separate the HDA and carrier (refer to Section 6.11.4).
5. Use a Phillips screwdriver to remove the actuator counterweight located at the end of the
positioner shaft.
6. Use a 3/8-inch open-end wrench or a pair of medium-sized needlenose pliers to hold the 3/8-inch
nut on the positioner motor assembly located near the center of the shaft. This is a locking
nut for an expander bolt holding the positioner coil assembly to the positioner shaft. Hold
the nut and, at the same time, loosen the 3/32 Allen screw with a 3/32 Allen wrench. Turn
counterclockwise until the 3/32 Allen screw, the 3/8-inch nut, and expander bolt assembly can
be removed.
7. Use a medium-sized Phillips screwdriver to remove the three retaining screws holding the
positioner motor assembly to the HDA baseplate.
8. Cut the flex leads from the positioner motor to the HDA electrical socket with diagonal cutters.

9. Firmly grasp the positioner motor assembly at the end of the positioner shaft and lift up. If you
have difficulty sliding the positioner motor assembly off the end of the positioner shaft:
•  Loosen the four crash stop Allen screws using a 5/32 and 1/16 Allen wrench. Turn screws in
   a counterclockwise direction.
•  Reattempt to remove the positioner motor assembly from the positioner shaft.


Figure 6-18  HDA Media Removal - Top View (CXO-2991A)
Callouts: top cover Torx-head screw (13), HDA top cover, male Torx-head screw (6), air filter
assembly, top clamp ring, heads, positioner/head assembly, screws (to secure positioner/head
assembly)


Figure 6-19  HDA Media Removal - Bottom View (CXO-2992A)
Callouts: 3/32 Allen screw, retaining screw (3), positioner motor assembly, expander bolt,
crash-stop Allen screw (4), positioner shaft, airfoil screws, PCM plug, Torx-head or
Phillips-head screw, positioner collar

10. Use a flat-bladed screwdriver to detach the spring clip that secures the positioner lock pin into
the positioner collar.
11. Remove the solenoid armature that connects the lock pin to the solenoid from the lock pin.
Using a pair of needlenose pliers, remove the lock pin from the positioner shaft.
12. Use a Torx T-15 wrench to remove the three screws used to secure the positioner/head assembly
to the baseplate.


13. Use a Torx T-15 wrench to remove the two (in some cases three) screws that secure the internal
airfoil. All Torx screws should now be removed from the bottom of the baseplate.
14. Turn the HDA over to access top cover Torx-head screws.

15. Use a Torx T-15 head wrench to remove the 13 top cover Torx-head screws (refer to
Figure 6-18).
16. Remove the top cover of the HDA.

17. Remove the internal air filter assembly from the HDA.
18. Remove the HDA filter fence from the HDA assembly. It may be necessary to rotate the
positioner/head assembly so the heads are toward the inner guardband area of the media.
19. Push the loose PCM plug out of the chassis and maneuver the PCM plug and its attached cable
assembly so the positioner/head assembly can be removed from the chassis.
20. Rotate the positioner out of the way as you manually unload the heads from the media.
21. Lift the entire positioner/head assembly out of the HDA chassis.
22. Use a Torx T-15 internal socket wrench to remove the six (6) male Torx-head screws securing
the top clamp ring on the media stack, and lift clamp rings, media, and spacer rings from the
spindle hub.
23. Give the media to the customer.
24. Collect all loose pieces of hardware and remove from the site. Return hardware to Customer
Services Logistics for proper disposal.


7
Microcode Update Procedure
7.1 Introduction
This chapter describes the procedure for updating RA90/RA92 disk drive microcode when a new
version of the microcode is released.

7.2 Microcode Update Cartridge Description
The microcode update cartridge is a ROM assembly that contains updated microcode for the
RA90/RA92 disk drive microprocessor. Figure 7-1 shows the microcode update cartridge.
To update the RA90/RA92 microcode, insert the cartridge in the microcode update port and run
T40.

Figure 7-1  Microcode Update Cartridge (CXO-2164A)


7.3 Microcode Update Port Description
The microcode update port is a cutout in the operator control panel (OCP). It is located below and
to the left of the Run switch. Figure 7-2 shows the location of the RA90/RA92 microcode update
port.
To access the microcode update port, it is necessary to remove the cabinet front access panel.

Figure 7-2  Microcode Update Port (CXO-2165C)
Callouts: microcode update port, microcode update cartridge


7.4 Running Test 40 (T40)
T40 is a microcode subroutine used to load the new microcode from the microcode update cartridge
into the master processor. The new microcode may be intended as a servo microcode update, a
diagnostic update, or a functional microcode update.
During update, the new microcode is downloaded to its destination EEPROM in three separate
passes. Each pass takes approximately 20 seconds. The pass count is displayed in the OCP
alphanumeric display during the update procedure.
Pass one reads the cartridge, calculates and verifies the checksum in the cartridge, and verifies the
microcode consistency codes. If pass one fails, the update is aborted and an appropriate error code
is generated.
Pass two writes the even pages in EEPROM (16 bytes). An even page is defined as BIT04 of the
EEPROM address equal to zero.
Pass three writes the odd pages in EEPROM. An odd page is defined as BIT04 of the EEPROM
address equal to one.
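
The even/odd page split used by passes two and three can be sketched as follows. This is an illustrative model only, not drive firmware: it assumes the "(16 bytes)" in the text is the page size (which is consistent with BIT04 being the first address bit above a 16-byte offset), and the names `PAGE_SIZE`, `is_odd_page`, and `pages_for_pass` are hypothetical.

```python
PAGE_SIZE = 16        # assumed page size in bytes, read from "(16 bytes)" in the text
PAGE_PARITY_BIT = 4   # BIT04 of the EEPROM address selects even or odd pages


def is_odd_page(address: int) -> bool:
    """Return True when BIT04 of the EEPROM address is one (an odd page)."""
    return ((address >> PAGE_PARITY_BIT) & 1) == 1


def pages_for_pass(pass_number: int, eeprom_size: int) -> list[int]:
    """List the page start addresses written in pass two (even) or pass three (odd)."""
    if pass_number == 2:
        want_odd = False
    elif pass_number == 3:
        want_odd = True
    else:
        raise ValueError("only passes two and three write EEPROM pages")
    return [
        start
        for start in range(0, eeprom_size, PAGE_SIZE)
        if is_odd_page(start) == want_odd
    ]


if __name__ == "__main__":
    # For a hypothetical 64-byte region, show which pages each pass would write.
    print([hex(a) for a in pages_for_pass(2, 64)])
    print([hex(a) for a in pages_for_pass(3, 64)])
```

Under these assumptions the two passes interleave: pass two covers every other 16-byte page starting at address 0, and pass three covers the pages in between.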
After the microcode is fully loaded (indicated by [C 40]), the drive performs a reset and goes
through its normal power-up sequence of internal diagnostics. The OCP performs a reset, returns
the drive to its normal operating state, and displays the unit address.

7.5 Updating the Microcode
Remove the cabinet front access panel before beginning the microcode update procedure. Refer to
Section 6.6 for the front access panel removal procedure.
Use the following procedure when updating drive microcode:

1. Load the microcode update cartridge in the microcode update port.
2. Load test T40 (drive must be spun down).
3. Start test T40.
The following occurs in the OCP display (where S = start, P = pass, C = completed):
1. [S 40] (2 seconds).
2. [P 1] (20 seconds) Pass one checks PROM to be loaded.
3. [P 2] (20 seconds) Pass two writes the new code into the even pages in EEPROM.
4. [P 3] (20 seconds) Pass three writes the new code into the odd pages in EEPROM.
5. [C 40] (1 second) Update is complete.
6. [WAIT] (10 seconds) Exits test mode and goes through power-up hardcore sequence.
7. [0000] Returns to display the drive unit address.
Remove the microcode update cartridge from the OCP and replace the cabinet front access panel.
Select the appropriate port switches to return the drive to the available state.
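
As a rough planning aid, the display sequence above can be tabulated to estimate how long an update takes and how far along it is when a given display appears. The sketch below uses only the approximate durations listed in the procedure; the names `T40_SEQUENCE`, `seconds_before`, and `total_seconds` are hypothetical, and the display strings follow the S/P/C legend.

```python
# (display, approximate duration in seconds) in the order listed in the procedure.
# The final unit-address display has no stated duration and is left out.
T40_SEQUENCE = [
    ("[S 40]", 2),
    ("[P 1]", 20),
    ("[P 2]", 20),
    ("[P 3]", 20),
    ("[C 40]", 1),
    ("[WAIT]", 10),
]


def seconds_before(display: str) -> int:
    """Approximate seconds elapsed when the given display first appears."""
    elapsed = 0
    for name, duration in T40_SEQUENCE:
        if name == display:
            return elapsed
        elapsed += duration
    raise KeyError(f"unknown display: {display}")


def total_seconds() -> int:
    """Approximate time from starting T40 until the unit address returns."""
    return sum(duration for _, duration in T40_SEQUENCE)


if __name__ == "__main__":
    print("Pass three begins after about", seconds_before("[P 3]"), "seconds")
    print("Whole update takes about", total_seconds(), "seconds")
```

With the listed durations the whole sequence comes to a little over a minute, so a display that stays on one pass for much longer than 20 seconds suggests the update has stalled.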

7.5.1 Error Codes/Common Problems During Microcode Update
The most common problems encountered during a microcode update are as listed by error code in
Table 7-1.


Table 7-1  Common Error Codes/Problems During Microcode Update

Error Code  Reason                              Solution

            The microcode cartridge was not     Reseat the microcode update cartridge.
            detected.

            The cartridge checksum was          Reseat the cartridge and retry the update. If
            incorrect.                          it still fails, either replace the OCP or try
                                                the cartridge in another drive. Acquire a new
                                                microcode cartridge if necessary.

            Cartridge and EPROM consistency     Reseat the cartridge and try again. If the same
            check failed.                       error occurs, replace the cartridge with one
                                                containing compatible code.

            An EEPROM checksum error            Attempt to reload the cartridge code. If the failure
            occurred.                           occurs again, electronic control module (ECM)
                                                replacement may be necessary.


A
Capturing Information for LARS and CHAMPS
This appendix contains sample LARS for installation and general troubleshooting of field
replaceable units (FRUs) in the field.


Sample LARS Forms (Sheets 1 through 3)

Form fields: LINE, ACT, REPAIR TIME, DEC OPTION, VAR, DEC OPT. S/N, TYP CAL, ACT TAK,
FAIL AREA - MODULE - FCO - COMMENTS, AUTHORIZED TESTS.

Sample line items are shown for:
•  Installation
•  ECM replacement, and ECM replacement with a VAXsimPLUS theory code
•  HDA replacement, and HDA replacement with a VAXsimPLUS theory code (sample comment: HDAZ
   HEAD ONE ECC ERRORS, where Z is L or R for left- and right-mounted drives, front view)
•  PCM replacement, and PCM replacement with a VAXsimPLUS theory code
•  Blower replacement, and blower replacement with a VAXsimPLUS theory code (sample apparent
   cause: FROZEN WON'T TURN)
•  Power supply replacement, and power supply replacement with a VAXsimPLUS theory code
   (sample comment: NO VOLTAGE; NO POS 24V ERR 22)
•  OCP replacement (sample comment: SEGMENT OUT OR SWITCH IS BROKEN)
•  Miscellaneous parts and repaired items

In the theory code samples, XX and YY are VAXsimPLUS-supplied numbers.

B
RA90/RA92 Error Recovery Levels

RA90 and RA92 disk drives incorporate hardware error recovery as part of the RA90/RA92 circuitry.
Read data circuitry is altered any time the controller issues error recovery commands.
Generally, error recovery is used to assist the controller during unrecoverable or uncorrectable
errors. The intent is to enhance the controller/disk interaction to recover data that might otherwise
be lost.
The RA90/RA92 hardware recovery circuitry is divided into six functional areas, as shown in
Table B-1.

Table B-1  RA90/RA92 Hardware Error Recovery Circuits

Circuit

Description

READ THRESHOLD GAIN

There are two ways to increase the chances of reading data from a potentially
bad spot on a disk: increase read threshold or decrease read threshold. The
drive determines whether information coming off the disk is either too weak
or too strong and consequently increases or decreases the read circuitry
amplitude in an attempt to recover information.

HOLD-OVER ONE-SHOT

VCO control voltage is held stable to prevent large phase errors during a
momentary loss of read pulses from the disk.

SKEW READ GATE

A delay of one or two byte times is introduced between the moment the SDI
gate array (on the I/O-R/W module) receives the READ GATE signal from the
SDI controller and the time the I/O-R/W module acts upon the READ GATE
signal. The amount of delay (skew) changes for each revolution of the disk
when the index pulse is received. The skew time is one byte time for odd
revolutions of the disk and two byte times for even revolutions of the disk.

FAST LOCK DELAY

Fast lock delay is accomplished by the R/W ENDEC chip. The drive software
enables fast lock delay through Misc. I/O Port 0 (bit <4>) with a 2.24-microsecond
delay in addition to the delayed gate signal.

OFFSET OF HEADS

Positive and negative offsets can be applied to the servo circuitry during
attempted reads. Six combinations of offsets are utilized in the RA90. These
include plus or minus offsets of 5%, 10%, 12.4%, or 20% of the track width.

WRITE DIAGNOSTICS

Thin-film heads can sometimes take on the characteristics of the magnetic
media. The buildup of this magnetic field in the heads interferes with the
drive's ability to read the surface of the disk. Running write current through
the heads usually breaks up the magnetic alignment of the thin-film heads
substrata layers. This level of error recovery writes internal diagnostics
within the dedicated inner guardband to eliminate this problem. With normal
drive operations, this should rarely be a problem.
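
The SKEW READ GATE entry in Table B-1 describes a delay that alternates with each disk revolution. A minimal sketch of that rule is shown below. It is an illustration only: the function name `skew_byte_times` is hypothetical, and counting revolutions from 1 at the first index pulse is an assumption the manual does not state.

```python
def skew_byte_times(revolution: int) -> int:
    """Return the READ GATE skew, in byte times, for a given disk revolution.

    Per Table B-1: one byte time on odd revolutions, two byte times on even
    revolutions. Revolutions are assumed to be counted from 1.
    """
    if revolution < 1:
        raise ValueError("revolution count is assumed to start at 1")
    return 1 if revolution % 2 == 1 else 2


if __name__ == "__main__":
    # Skew applied over the first four revolutions while this circuit is enabled.
    print([skew_byte_times(r) for r in range(1, 5)])
```

Alternating the skew means that two consecutive read attempts on the same block see the READ GATE signal acted upon at slightly different points.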


The RA90/RA92 error recovery circuits are activated when the SDI controller issues an SDI ERROR
RECOVERY command to the drive. This occurs after the controller has exhausted its read retry
count (five for the RA90/RA92). An error recovery level is specified by the controller in the SDI
ERROR RECOVERY command. The level number specifies which combination of error recovery
circuits the drive is to employ. There is no controller intervention in the actual drive error recovery
process.
RA90 and RA92 disk drives employ 14 levels of error recovery, as shown in Table B-2.
Table B-2

RA90/RA92 Error Recovery Levels

Level

Description

Offset of heads by dedicated servo to +5% (offset is towards outer guardband).

Offset of heads by dedicated servo to -5% (offset is towards inner guardband).

Offset of heads by dedicated servo to +10%.

Offset of heads by dedicated servo to -10%.

Offset of heads by dedicated servo to +12.4%.

Offset of heads by dedicated servo to -12.4%.

Offset of heads by dedicated servo to +20%.

Offset of heads by dedicated servo to -20%.

Enable hold-over one shot.

Fast lock delay level.

Turn on low threshold.

Turn on high threshold.

Turn on read gate delay.

Diagnostic writes (to clear head domain cluttering).

NOP: This is the normal default state of the drive. No error recovery circuits are activated.

The drive supplies the controller with the number of error recovery levels it has at its command.
This is done by the drive in response to a GET COMMON CHARACTERISTICS command from the
controller. The actual mechanism is transparent to the user, but works as follows:
During a read data operation, the controller reads a block of data from the disk. If there are no
ECC errors, data is passed to the host operating system. However, if the controller detects an ECC
error, it compares the number of ECC symbols in error to the drive's ECC error symbol threshold.
The RA90/RA92 disk drive has an error symbol threshold of six.

As long as the error symbol threshold has not been reached, the controller can correct the data. If
the error symbol threshold is equaled or exceeded, the drive then sends an error to the host error
log and sets the BBR (bad block replacement) flag. The BBR process is actually implemented at a
later time.
The controller then determines if it can correct the data. If the data is uncorrectable, the controller
examines the drive's common characteristics to determine the drive's read retry count parameter.
The RA90/RA92 disk drive has a read retry count of five.
If, after exhausting the read retry count on a block of data, the data is still uncorrectable, the
controller determines if the drive has error recovery capabilities. The RA90/RA92 disk drive has 14
error recovery levels (see Table B-2). The controller issues an ERROR RECOVERY command to the
drive. The drive then initiates the first level of error recovery. In the case of the RA90/RA92, level
14 is used first and the drive decrements down to zero. The RA90/RA92 activates the appropriate
hardware circuits corresponding to a level 14 error recovery. The controller repeats the entire read
data block process including, if necessary, the read retry process.

If the data has still not been recovered, the controller issues another ERROR RECOVERY
command, this time specifying level 13. Again, the drive error recovery process starts and continues
until the data has been recovered or all the error recovery levels have been tried. If the read retry
operation fails and the error recovery levels fail, the controller returns an error to the host and
BBR is implemented on that block of data.
The error recovery mechanism is not restricted to ECC errors encountered during reads. Header-related
errors may also cause the hardware error recovery levels to be implemented.
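
The read, retry, and error recovery sequence described in this appendix can be summarized as a control-flow sketch. This is a simplified model, not controller firmware. The names `attempt_read`, `read_with_recovery`, and `flags_bad_block` are hypothetical, and two details are assumptions: that each read is followed by up to five retries (six attempts in all), and that level 0 stands for the normal state with no recovery circuits active.

```python
from typing import Callable, Optional

ECC_SYMBOL_THRESHOLD = 6    # error symbol threshold stated for the RA90/RA92
READ_RETRY_COUNT = 5        # read retry count stated for the RA90/RA92
ERROR_RECOVERY_LEVELS = 14  # number of error recovery levels stated for the RA90/RA92


def flags_bad_block(symbols_in_error: int) -> bool:
    """True when the ECC symbol count equals or exceeds the threshold (BBR flag set)."""
    return symbols_in_error >= ECC_SYMBOL_THRESHOLD


def read_with_recovery(attempt_read: Callable[[int], bool]) -> Optional[int]:
    """Model the controller's read sequence for one block.

    attempt_read(level) is a caller-supplied stand-in for one read attempt with
    the given recovery level active; it returns True if the data was recovered.
    Returns the level at which the read succeeded (0 means no recovery level
    was needed), or None if every level failed and an error goes to the host.
    """

    def read_and_retry(level: int) -> bool:
        # Assumed: one initial read followed by up to READ_RETRY_COUNT retries.
        for _ in range(1 + READ_RETRY_COUNT):
            if attempt_read(level):
                return True
        return False

    # Normal read with no recovery circuits active.
    if read_and_retry(0):
        return 0

    # Level 14 is tried first, then 13, and so on down through level 1.
    for level in range(ERROR_RECOVERY_LEVELS, 0, -1):
        if read_and_retry(level):
            return level

    return None  # unrecoverable: error returned to host, BBR implemented


if __name__ == "__main__":
    # Example: a block that only reads cleanly once level 12 is active.
    print(read_with_recovery(lambda level: level == 12))
```

The point of the sketch is the ordering: the full read-and-retry cycle repeats at each level, so a block that is only readable at a low-numbered level costs many attempts before it is recovered.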


C
Customer Equipment Maintenance
This appendix will assist customers in maintaining their equipment to ensure the highest level of
equipment performance and reliability. Specifically, this appendix addresses the maintenance of
60-inch storage array cabinet systems.

C.1 Customer Responsibilities
The customer is directly responsible for:

• Supplying accessories, including storage racks, cabinetry, tables and chairs, as required.
• Making the appropriate documentation available in a location convenient to the system.
• Obtaining cleaning supplies specified in this appendix.
• Performing the specific equipment maintenance described in this appendix.

C.1.1 Cleaning Supplies
To properly maintain the equipment, the customer must acquire the following items and supplies:
• Vacuum cleaner with flexible hose and nonmetallic, soft-bristle brush attachment
• Isopropyl alcohol (at least 91%) (Digital P/N 29-19665)
• Lint-free tissues or cloths
• All-purpose spray cleaner

CAUTION
When using spray cleaner, do not spray cleaner directly into computer equipment.
This could adversely affect equipment reliability or damage electrical components.

C.1.2 Ongoing Equipment Care

The following should be performed on an ongoing basis:
• Keep the immediate area in front of the storage array cabinets free of obstructions.
• Keep the exterior of the cabinets and the surrounding area clean. Use a lint-free cloth and
  isopropyl alcohol to remove sticky residue left on painted surfaces by customer cabinet number
  labels, and so forth.
• Maintain the site temperature/humidity to comply with Digital's recommended environmental
  range (reference product-specific documentation). This will ensure the highest product
  reliability and product life goals are achieved.

C.1.3 Monthly Equipment Maintenance
The following tasks should be performed on a monthly basis, or more often if environment warrants:
CAUTION
Avoid touching the operator control panel switches during cleaning operations. The
state of the drives could change and affect the operation of the subsystem.

• Vacuum and/or wipe top of storage array cabinet with a lint-free cloth.
• With a soft-bristle brush attachment, vacuum the air vent grill on the front door of the storage
  array cabinet. Leave the front door assembly attached to the storage array cabinet while
  vacuuming.

C.1.4 Maintenance Records
Digital suggests the customer keep an accurate log of all equipment maintenance. A maintenance
log form for 60-inch storage array cabinets is included in this appendix for customer use. This
form may be reproduced and inserted in the customer's site management guide for record-keeping
purposes. Refer to Figure C-1.

[Figure: blank log form titled "Customer Equipment Maintenance Log for Storage Array Cabinets", with columns for date of service, type of service performed, and cabinet S/N]

Figure C-1 Customer Equipment Maintenance Log for Storage Array Cabinets

D
Customer Services' Preventative Maintenance
The information contained in this appendix will assist Digital Customer Services engineers in
performing and planning preventative maintenance (PM) procedures for RA90/RA92 disk drive
products.

D.1 PM Checklist for RA90/RA92 Disk Drives
The following preventative maintenance steps should be performed by Digital Customer Services on
a scheduled basis at specified intervals. The PM checklist is a per storage element checklist.
Due to the frequency of this activity, we suggest that you record this activity on the RA90/RA92
Preventative Maintenance Activity Log provided in this section. This log sheet may be reproduced
and inserted in the site management guide, as appropriate.
One-Year Interval

Perform the following PM steps at 1-year intervals:
1. Utilize VAXsimPLUS to obtain the repair history of each disk drive. Examine the drive error
profile over various lengths of time to determine whether a proactive repair may be warranted.
Examination may include opening up the time window for the last week, last month, and last 3
months. Deeper examination of error logs may be necessary if there are any error rate trends
of concern. (Time: 10:00 minutes for basic error analysis with VAXsimPLUS)
2. Remove the drive(s) from service. (Time: 2:00 minutes)
3. Remove the cabinet front access panel or bezel assembly. Remove and clean each cabinet
pre-filter or air vent grill as necessary. (Time: 5:00 minutes)
4. Determine the drive microcode revision levels by examining subsystem printouts or running
drive test T45. Update microcode to the latest compatible functional revision as necessary.
(Time: 3:00 minutes)
5. From the rear of the cabinet at the I/O bulkhead panel, verify the SDI cables are dressed and
routed in an orderly fashion to prevent the cables from being tripped over or stepped on.
6. Verify the SDI connectors are securely attached to the I/O bulkhead panel.
7. Return the drive(s) to service.
The yearly PM steps can be accomplished in approximately 20 minutes per drive. Servicing more
than one drive at a time will result in reduced time per drive.

Two-Year Interval

Perform the following PM steps at 2-year intervals:
1. Remove the drive(s) from service. (Time: 2:00 minutes)
2. Remove drive power.
3. Remove the OCP and blower bezel assembly. Visually inspect the drive chassis interior for
debris. If considerable dirt/lint is present, remove the electronic control module (ECM) assembly
and head disk assembly (HDA) then vacuum the chassis. Reassemble the drive. (Time: 10:00
minutes)
4. Power up the drive and determine whether the blower motor quickly attains its speed and the
drive becomes ready. (Time: 2:00 minutes)
5. Execute drive internal test T00 for one pass. (Time: 10:00 minutes)
6. Return the drive(s) to service.
The 2-year interval PM steps can be accomplished in approximately 24 minutes per drive. Servicing
more than one drive at a time will result in reduced time per drive.
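
The per-drive totals quoted for the 1-year and 2-year checklists can be cross-checked against the individual step times. The short Python sketch below does that arithmetic. The step labels are paraphrased, steps with no stated time are left out, and the helper names are hypothetical. Because servicing several drives together reduces the time per drive, the multi-drive figure is only an upper bound.

```python
# Timed steps from the PM checklist, in minutes.
# Steps with no stated time are omitted from these lists.
ONE_YEAR_STEPS = [
    ("VAXsimPLUS basic error analysis", 10),
    ("Remove drive from service", 2),
    ("Clean cabinet pre-filter or air vent grill", 5),
    ("Check and update drive microcode", 3),
]

TWO_YEAR_STEPS = [
    ("Remove drive from service", 2),
    ("Inspect and vacuum drive chassis", 10),
    ("Power up, check blower and drive ready", 2),
    ("Run drive internal test for one pass", 10),
]


def total_minutes(steps):
    """Sum the stated times for one drive."""
    return sum(minutes for _, minutes in steps)


def upper_bound_minutes(steps, drives):
    """Worst case for a visit: the per-drive total times the drive count."""
    return total_minutes(steps) * drives


print(total_minutes(ONE_YEAR_STEPS))   # 20
print(total_minutes(TWO_YEAR_STEPS))   # 24
print(upper_bound_minutes(ONE_YEAR_STEPS, 4))  # 80
```

The sums agree with the approximately 20 and 24 minutes per drive stated in the checklist.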
Five-Year Interval (for the HDA)

In addition to the 1- and 2-year interval PM steps previously described, perform the following step
at 5-year intervals:
1. Remove and replace the spindle ground brush using procedures contained in this manual.
The 5-year interval PM steps should be accomplished within 40 minutes per drive.

RA90/RA92 PREVENTATIVE MAINTENANCE ACTIVITY LOG
FOR EACH RA90/RA92 STORAGE ELEMENT
DRIVE TYPE (circle one) RA90 / RA92
DRIVE SERIAL NUMBER ________________
CABINET ________ CABINET S/N ________
[Log table with column headings: DATE OF SERVICE, MICROCODE REV LEVEL, MAINTENANCE ACTIVITY, and ECM and HDA REV and S/N]

Index
A
Acceptance testing, 2-13
drive spun down, 2-18
drive spun up, 2-19
Add-on installation, 2-22
Address selection, 2-20

B
BBR algorithms, 5-48
BBR packet, 5-48
Blower/bezel motor assembly removal,
6-7
Blower motor, dual outlet, 3-12
Brake assembly removal, 6-17

C
Cluster installation note, 2-20
Controller byte, 5-5
Correctable ECC errors, 5-48
Cylinder address bytes, 5-6

D
Data rates, 1-6
Data storage capacity
RA90 disk drive, 1-1
RA92 disk drive, 1-1
Deskidding cabinets, 2-5
Diagnostics
host-level, 5-37
HSC-based, 5-37
KDM-based, 5-37
off-line, 5-37
power-up, 2-16
standalone, 5-37
XDA controller-based, 5-38
Diagnostics and utilities
Average Seek Timing test (T38), 4-14
Clear DD Bit utility (T55), 4-21
Clear Seeks utility (T53), 4-21
Clear Spinups utility (T54), 4-21

Diagnostics and utilities (cont'd.)
Display Drive Serial Number utility
(T47), 4-20
Display Error Log Errors utility (T41),
4-16
Display Seeks utility (T43), 4-17
Display Spinups utility (T44), 4-18
Display Time utility (T24), 4-17
Drive Revision Level utility (T45),
4-18
Drive SIN Bus test (T04), 4-7
Drive-Sensed Temperature Display
utility (T29), 4-12
Error Log Checkpoint utility (T50),
4-21
Gray Code (Track Counter) test (T29),
4-12
Guardband test (T30), 4-12
Hardcore Sequence Test (T18), 4-11
HDA Revision utility (T46), 4-20
Head Select and One Seek test
sequence (T24), 4-12
Head Select test (T06), 4-8
Head Select utility (T63), 4-22
Head Switch Timing test (T39), 4-15
idle loop tests (spun down), 4-2
idle loop tests (spun up), 4-2
Incremental Seek test (T31), 4-13
individual descriptions, 4-5
Loop-Off utility (T62), 4-22
Loop-On-Error utility (T61), 4-22
Loop-On-Test utility (T60), 4-21
Master CPU test, 4-5
Master RAM test, 4-5
Master ROM test (T01), 4-6
Master Timer test (T02), 4-6
Minimum Seek Timing test (T36),
4-14
One Seek utility (T64), 4-23
power-up, 4-1
problem OCP displays, 4-3
Random Seek test (T33), 4-13
Read/Write Force Fault test (T16),
4-11


Diagnostics and utilities (cont'd.)
Read/Write Sequence test (T19), 4-11
Read-Only Cylinder Formatter test
(T17), 4-11
Read-Only test (T14), 4-9
SDI Loopback test (external) (T09),
4-9
SDI Loopback test (internal) (T08),
4-9
Sector/Byte Counter test (T07), 4-8
Seek Parameter Input utility (T65),
4-23
sequence tests, 4-2
Serial Communications Interface test,
4-6
Servo Data Bus Loopback test (T03),
4-6
Servo RAM test, 4-6
Servo Spinup Sequence test (T20),
4-11
Tapered Seek test (T34), 4-13
Toggle Seek test (T32), 4-13
Total Drive Sequence test (spinning)
(T22), 4-12
Total Drive Sequence test (spun down)
(T23), 4-12
Total Servo Sequence test (T21), 4-11
Update Cartridge utility (spun down)
(T40), 4-15, 7-3
Variable Average Seek Timing test
(T66), 4-25
Write/Read test (T15), 4-9
Documentation
related, xiii
troubleshooting, 5-1
Drive unit address
alternate display mode, 3-21
programming, 3-19

E
ECM
description, 3-3
I/O-R/W module, 3-3
module types, compatibility, 3-3
removal, 6-10
servo module, 3-5
Electrical specifications, 1-7
Electronic control module
See ECM
Electrostatic protection
See ESD protection
Environmental limits, 1-7
Error byte, 5-4
Error code byte, 5-9
Error codes
during acceptance testing, 2-20

Error codes (cont'd.)
OCP, 2-18
Error descriptions
A0 Unable to Clear SDI Array Safety
Status Register, 5-93
A1 Unable to Force Encoder Error,
5-93
A2 Unable to Force Multiple Head
Select While Reading, 5-93
A3 Unable to Force Write Gate and
Write Unsafe, 5-94
A4 Unable to Force Write Current and
No Write Gate, 5-94
A5 Unable to Force Write Gate and No
Write Current, 5-94
A6 Unable to Force Read Gate and Off
Track Error, 5-94
A7 Unable to Force Write Gate and Off
Track Error, 5-94
A8 Unable to Force Read and Write
Fault While Writing, 5-95
AS Servo Fault/Force Fault Test, 5-95
AB Forced Read and Write Fault While
Reading, 5-95
4A Drive Disabled by Controller (DD
Bit Set), 5-75
AD UART Overrun or Framing Error,
5-95
5A Embedded Head Gain Calibration,
5-78
7A Embedded Offset/Gain Calibration
Timeout, 5-84
AE OCP Data Packet Checksum Error,
5-95
AF OCP Start Byte is Not a Sync
Character, 5-96
9A Positioner Corrected Event During
Data Transfer, 5-91
0A SDI Incorrect Command Opcode
Parity Error, 5-56
1A SDI Invalid Cylinder Address,
5-63
2A SDI Invalid Subunit Specified,
5-68
8A Servo Processor Inside of
Destination Track During Settle
State, 5-88
33 Attempt to Write Through Bursts,
5-70
6A Unable to Force No-Sync Error,
5-81
3A Write Gate and Write-Protected,
5-72
B0 OCP Invalid Response, 5-96
B2 OCP Retransmit Failure, 5-96
B3 OCP Command Unsuccessful, 5-96
B4 OCP Command Timeout, 5-97

Error descriptions (cont'd.)
B6 Master Processor UART Loopback
Test Failure, 5-97

B8 Master Processor UART
Transmitter/Receiver Error, 5-97
B9 OCP-to-Master Processor
Communications Timeout Failure,
5-97
BA OCP NMI Timeout Failure, 5-97
5B Bias Calibration Error, 5-78
BB OCP Processor ROM Checksum
Failure, 5-98
BC Cartridge Checksum Failure, 5-98
BD Microcode Update Cartridge
Detection Failure, 5-98
BE Cartridge/EEPROM/Master
Processor Consistency Check,
5-98
BF Error Log Write Compare Error,
5-99
8B Gray Code Error After Settling With
Fine Track, 5-88
3B Hard INIT Occurred to Drive, 5-72
4B Index Error, 5-75
1B Inner Guardband Error, 5-64
7B Invalid Test While Spindle Running,
5-84
6B R/W Write/Read Test Overall
Failure (Three or More Bad
Heads), 5-81
2B SDI Invalid Diagnose Memory
Region Location, 5-68
0B SDI Invalid Opcode, 5-56
9B Write and Positioner Corrected
Event, 5-91
C0 Hardware Revision and Microcode
Incompatibility, 5-99
C1 Outer Guardband Detected After
HEAD LOAD Command, 5-100
C2 Inner Guardband Detected After
HEAD LOAD Command, 5-100
C3 Seek to Outer Guardband Failed,
5-100
C4 Seek to Outer Guardband Not
Detected, 5-101
C5 HDA and ECM Incompatibility,
5-101
C6 PLO Failure, 5-101
C7 Seek to Inner Guardband Failed,
5-101
C8 Inner Guardband Not Detected
After Seek to Inner Guardband,
5-102
C9 Analog Loop Test Failure, 5-102
CA Media Not Spinning, 5-102
98 Can't Execute Diagnostic/Jumper,
5-91

Error descriptions (cont'd.)
64 Cannot Clear lID Error Bits, 5-80
67 Cannot Execute Write Test (Read-Only Test Failed or Not Run First),
5-80

CC Servo Processor Recalibrate Failed,
5-102
CD Track Counter (Gray Code), 5-103
CE EEPROM Write Cycle Timeout,
5-103
CF Invalid Data in EEPROM, 5-103
7C Gray Code Match Error After
Settling, 5-85
5C Incorrect Diagnostic Index or Sector
Pulse, 5-78
1C Outer Guardband Error, 5-64
6C R/W Write/Read Test Partial Failure
(One or Two Bad Heads), 5-81
9C Read Gate and Positioner Corrected
Event, 5-92
0C SDI Command Length Error
(LVL2), 5-57
4C SDI Invalid Write Memory Region
Error, 5-75
2C SDI Spindle Not Ready with
Seek/Recalibration Command,
5-68
8C Uncalibrated and PLO Error, 5-88
58 Dedicated Head Gain Calibration
Error, 5-77
79 Dedicated Servo Calibration
Timeout Error, 5-84
7D Embedded Interrupt Timeout,
5-85
9D Error Log Header Corrupted, 5-92
3D HDA Read/Write Interlock Broken,
5-72
65 Diagnostic Index or Sector Not
Detected, 5-80
61 Diagnostic Index Sync Timeout
Error, 5-79
1D Illegal Servo Fault, 5-64
8D Polarity Error on Velocity Command
During a Multi-Track Seek, 5-88
2D Power Supply Over-Temperature,
5-69
42 Drive Not On Line/SEEK Command
Issued, 5-73
0D SDI Invalid Command with Drive
Error, 5-57
55 DSP Sanity Timeout After Load,
5-77
6D Unable to Force Read Gate and
Write Gate Together, 5-82
4D Write Gate and Bad Embedded
Servo Information, 5-76
E0 Spindle Rotation Not Detected,
5-103

Error descriptions (cont'd.)
E1 Spindle Speed Out Of Range,
5-104
E2 A/D or D/A Converter Insane,
5-104
E3 Excessive Positioner Current
During Test, 5-104
E4 Open Circuit Detected During
Power Amp Toggle Test, 5-104
E5 Overcurrent Detected During
Actuator Test, 5-105
E6 Track Counter Clear Failure,
5-105
E7 Illegal Zone Detected, 5-105
E8 Outer Guardband Timeout, 5-106
E9 Gray Code Timeout During the
Turnaround State, 5-106
EA Gray Code Timeout During Outer
Guardband State, 5-107
EB Sector Pulse Timeout During Sync-Up State, 5-107
EC Servo Fault and PLO Fault Bit Set
in GASP, 5-107
9E Drive Faulted, Test Cannot Run,
5-93
ED Servo Watchdog Timeout, 5-107
EE Servo Digital Signal Processor
Reset, 5-107
EF Head Unload Failed, 5-108
7E Fine Track Lost After Settling,
5-85
22 Electronic Control Module Over-Temperature Error, 5-66
8E Master Processor ROM/EEPROM
Consistency Code Mismatch, 5-89
59 Embedded Servo Offset Calibration
Error, 5-78
34 ENDEC Encoder Error, 5-70
3E OCP Interlock Broken, 5-73
1E Power-Up After AC Power Loss,
5-64

0E SDI Lvl 1 Invalid Select Group
Number, 5-57
2E SDI Spinup Inhibited by Controller
Flags, 5-69
6E Unable to Force Write Gate and
Write Protect Error, 5-82
FO Servo Microcode Update Failed,
5-108
F1 Command to Servo Processor Timed
Out, 5-108
F3 Servo Spinup Failed, 5-108
F4 Servo Spindown Failed, 5-109
F5 Seek Failed, 5-109
F6 Head Switch Failed, 5-109
F7 RTZ Failed, 5-109
F8 Head Load Failed, 5-109

Error descriptions (cont'd.)
F9 Diagnostic Command Failed, 5-110
FA Servo Processor Failed Seek to DGN
Write Cylinder, 5-110
FB Servo Processor Failed Seek to
DGN Read Cylinder, 5-110
FD EEPROM Checksum Error, 5-111
6F Diagnostic Write Attempted While
Write-Protected, 5-82
8F EEPROM Checksum Failure, 5-89
9F Error Log Check Point Code, 5-93
4F Invalid Select Group (Level 1
Command) - Not Read/Write
Ready, 5-76
44 Format Command and Format Not
Enabled, 5-74
2F SDI RUN Command with Run
Switch in Stop Position, 5-69
0F SDI Write Enable on a Write-Protected Drive, 5-57
1F Sector Overrun Error, 5-65
7F Servo Settling Timer Expired, 5-85
77 Head Load Timeout Error, 5-84
14 Head Offset Margin Event, 5-62
15 Head Offset Out-of-Band Error,
5-62
54 Head Select Register Loopback
Error, 5-77
93 Inner Guardband/Servo Fault: No
Interrupt Detected, 5-89
92 Inner Guardband Without a Servo
Fault Set, 5-89
49 Invalid Command During
TOPOLOGY Command, 5-75
47 Invalid Disconnect Command/TT Bit
Error, 5-74
05 Invalid Drive Serial Number Code,
5-55
46 Invalid Hardware Fault, 5-74
48 Invalid Write Memory Byte
Counter/Offset Error, 5-75
24 Loss of Fine Track During Data
Transfer, 5-66
88 Master Processor EEPROM Write
Violation Error, 5-87
85 Master Processor RAM Test Failure,
5-87
87 Master Processor ROM Checksum
Failure, 5-87
80 Master Processor ROM Consistency
Code Mismatch, 5-86
57 Master Processor Timer Failure,
5-77
11 Microcode Cartridge Load Occurred,
5-58
06 Microcode Fault, 5-55

Error descriptions (cont'd.)
91 No Interrupt Detected During R/W
Force Fault, 5-89
74 Offset Timeout Error, 5-83
60 Read/Write Head Select Failure,
5-79
38 Read Gate and Multiple Head Chips
Selected, 5-71
45 Read Gate and Off Track Both
Asserted, 5-74
31 Read Gate and Write Gate Both
Asserted, 5-70
32 Read or Write While Faulted, 5-70
62 Read Test Overall Read Failure
(Three or More Bad Heads), 5-79
63 Read Test Partial Failure (One or
Two Bad Heads), 5-79
66 Read Test Servo Failure, 5-80
71 Recalibrate Timeout Error, 5-82
10 SDI Command Length Error (LVL2),
5-58
96 SDI Failure: Port B, 5-90
07 SDI Frame Sequence Error, 5-55
29 SDI Invalid Error Recovery Level
Specified, 5-68
19 SDI Invalid Format Request, 5-63
16 SDI Invalid Group Select LVL2,
5-62
40 SDI Invalid Read Memory Region
Error, 5-73
94 SDI Loopback Test Failure on Both
Ports, 5-90
09 SDI Lvl 1 Framing Error, 5-56
08 SDI Lvl 2 Checksum Error, 5-56
17 SDI Port A Command/Response
Timeout, 5-63
18 SDI Port B Command/Response
Timeout, 5-63
20 SDI RTCS Parity Error, 5-65
95 SDI Test Failure: Port A, 5-90
21 SDI Transfer (Pulse) Error, 5-65
51 Sector/Byte Counter Error, 5-76
89 Seek Speed Out of Range, 5-87
50 Servo Data Bus Failure, 5-76
25 Servo Fault Error, 5-66
27 Servo Over-Temperature Error at
S1, 5-67
28 Servo Over-Temperature Error at
S2, 5-67
78 Servo Processor Bias Force
Calibration Timeout, 5-84
82 Servo Processor Coarse Velocity
State Timeout, 5-86
83 Servo Processor Fine Velocity State
Timeout, 5-86
73 Servo Processor Head Switch
Timeout, 5-83

Error descriptions (cont'd.)
53 Servo Processor Offset Error, 0-1 1
76 Servo Processor Sanity Timeout,
5-83
84 Servo Processor Seek Direction
Error, 5-87
72 Servo Processor Seek Timeout,
5-83
81 Servo Processor Settle State
Timeout, 5-86
70 Servo Processor Spinup Timeout,
5-82
75 Servo Processor Unload Timeout,
5-83
56 Servo RAM Test Failure (High Byte
of Address), 5-77
52 Servo RAM Test Failure (Low Byte
of Address), 5-76
13 Spindle Motor Control Fault, 5-59
01 Spindle Motor Transducer Timeout,
5-54
01. Spindle Motor Transducer
Timeout 8, 5-53
03 Spindle Not Accelerating During
Spinup, 5-54
26 Spindle Speed Error (Servo
Processor), 5-67
12 Spindle Speed UnSafe Error, 5-58
04 Spinup Too Long to Lock on Speed,
5-54
02 Spinup Too Slow, 5-54
86 Static RAM Failure, 5-87
43 TCR and Not Read/Write Ready
Fault, 5-74
68 This Diagnostic Cannot Execute
Without Software Jumper, 5-81
69 Unable to Force Compare Error,
5-81
90 Unable to Force Index Error, 5-89
36 Write and Servo Uncalibrated,
5-71
35 Write and Write Unsafe, 5-71
30 Write Current and No Write Gate,
5-69
37 Write Gate and No Write Current,
5-71
39 Write Gate and Off Track, 5-72
Error logs, 1-4
Error recovery level byte, 5-9
Error recovery levels, B-1
Error recovery Levels
NOP: no operation, B-2
Errors related to media
See Media errors
ESD protection, 1-8
wrist strap use, 1-8

F
Fault display mode setup, 3-16
Floor loading, 2-3
Front access panel, removal, 2-7

H
HDA

brake assembly removal, 6-17
carrier separation, 6-14
description, 3-10
hardware compatibility, 3-12
installation, 6-14
removal, 6-12
spindle ground brush removal, 6-16
HDA preventative maintenance, D-2
HDA revision bits byte, 5-6
Host error logs, 5-2

I
I/O-R/W module
description, 3-3
hardware revision matrix, 3-5
Idle loop testing, 2-16
Input current (amps), 1-7
Inrush current, 1-6
Installation note, cluster, 2-20

L
Labeling, OCP, 2-13
Lamp test, OCP, 2-16
LARS examples, A-1
Latency, 1-6
Level A Retry, 5-49
Level B Retry, 5-49
Leveling cabinets, 2-6
Logical media layout, 1-3

Media errors (cont'd.)
LBN correlation to multiple groups
(heads), 5-34
LBNs correlated to zone write
boundaries, 5-34
multiple controllers report same errors,
5-35
repeating LBNs/RBNs, 5-33
single controller port affected, 5-35
Media removal service, 6-25
Microcode
compatibility with drive FRUs, 3-13
Microcode update procedure, 7-3
microcode update cartridge description,
7-1
running T40, 7-3
update port description, 7-2
Mode byte, 5-4
MSCP status/event
6B, 5-49
MSLG$_LEVEL, 5-46
MSLG$_RETRY, 5-46

N
Normal mode setup, 3-15

O
OCP
functions, 3-14
removal, 6-6
OCP error codes, 2-18
OCP labeling, international, 2-13
OCP lamp test, 2-16
On line
placing drive on line, 2-20
Operating temperature and humidity,
2-3
Operator Control Panel

See OCP

M
Maintenance activity log, C-3, D-2
Maintenance strategy, 1-3, 1-4
Manufacturing fault code, 5-9
Media errors, 5-32
drive or controller port not defined
(random R/W errors), 5-35
excessive number of blocks replaced
because of R/W path problems,
5-33
isolating random R/W transfer errors,
5-35
LBN correlated to a physical cylinder,
5-34
LBN correlation to a single group
(head), 5-33

P
Part numbers, ECM components, 3-3
Parts removal sequence, 6-3
PCM
description, 3-7
removal, 6-11
switch pack settings, 3-9
Phase requirements, 2-1
Physical characteristics, 1-6
Physical media layout, 1-3
Positioner errors, 5-49
Power, applying to drive, 2-14
Power and safety precautions, 2-1
Power cord connections, 2-11

Power dissipation, 1-7
Power supply
available voltages, 3-12
removal, 6-22
Power supply location, drive, 2-12
Power-up
resident diagnostics, 2-16
Preamp control module
See PCM
Preventative maintenance
customer responsibilities, C-1
Customer Services' responsibilities,

Servo module (cont'd.)
hardware revision matrix, 3-7
Site preparation and planning, 2-1
Software jumper, 4-4
Specifications, RA90/RA92, 1-5
Spindle ground brush removal, 6-16
Spindle lock solenoid failure, 6-20
Start/stop time, 1-6
Status/event codes
14, 5-48
34, 5-46, 5-48, 5-49
54, 5-48
74, 5-48
94, 5-48
2A, 5-32
lAB, 5-48

D-1
maintenance activity log, D-2
Previous command opcode byte, 5-6
Programming the unit address, 2-20

1AB, 5-31

AB, 5-31

R
Rear access panel, removal, 2-9
Rear flex cable removal, 6-23

Removal/replacement procedures
bezel and blower motor assembly
separation, 6-9
blower/bezel motor assembly removal,
6-7
brake assembly removal, 6-17
contact extraction tool, 6-20
ECM removal, 6-10
front access panel removal, 6-4
FRUs, sequence for removal, 6-3
HDA and carrier separation, 6-14
HDA installation, 6-14
HDA removal, 6-12
media removal service, 6-25
OCP removal, 6-6
PCM removal, 6-11
power supply removal, 6-22
rear access panel removal, 6-4
rear flex cable removal, 6-23
solenoid removal, 6-22
spindle ground brush removal, 6-16
spindle lock solenoid failure, 6-20
tools checklist, 6-3
Request byte, 5-3
Response opcode byte, 5-3
Retry count byte, 5-5

S
SDI cable connections, 2-10
Sector format, 1-1
Seek times, 1-5, 4-5
Sequence diagnostics, 4-2
Service delivery strategy, 1-4
Servo module
description, 3-5

14B, 5-29

4B, 5-29
10B, 5-29
8B, 5-30
16B, 5-31
18B, 5-31
2B, 5-32
6B, 5-49
B4, 5-48
1C8, 5-48
CB, 5-30
D4, 5-48
E8, 5-44, 5-49
1E8, 5-48
Status bytes
extended, 5-2
generic, 5-4
Subunit mask byte, 5-3

T
Temperature, effect on drive performance,
4-5
Test selection from OCP, 2-16
Theory

drive operations and theory, 3-1
Thermal stabilization, 2-3
Tools checklist, 6-3
Training, 5-1
Troubleshooting
bad block replacement (BBR), 5-24
controller byte, 5-5
controller-detected communication
events and faults, 5-30
controller-detected drive clock dropout,
5-31
controller-detected drive failed
initialization, 5-31

Troubleshooting (cont'd.)
controller-detected drive ignored
initialization, 5-31
controller-detected EDC errors, 5-28
controller-detected loss of read/write
ready, 5-30
controller-detected lost receiver ready,
5-30
controller-detected protocol and
transmission errors without
communications errors, 5-29
controller-detected pulse or state parity
errors, 5-29
controller-detected receiver ready
collision, 5-31
controller-detected SERDES error,
5-32
correctable ECC errors, 5-48
cylinder address bytes, 5-6
data collection steps, 5-26
DBN conversion, RA90, 5-6
DBN conversion, RA92, 5-8
drive-detected drive errors and
diagnostic faults (DDDE), 5-27
drive-detected protocol errors without
communication errors (DDPE),
5-27
drive-detected pulse or state parity
errors, 5-27
drive internal error log, 5-9, 5-27
drive-resident utility dump (T41),
5-14
error byte, 5-4
error code byte, 5-9
error recovery level byte, 5-9
error reporting mechanisms, 5-1, 5-15
exiting data collection/action list
process, 5-39
extended status bytes, 5-2
FRU replacement stage, 5-40
general information, 5-16
HDA revision bits byte, 5-6
host console/user terminal trails, 5-24
host error log, 5-25
host error logs, 5-2, 5-23
host-level diagnostics, 5-37
host-level diagnostics and utilities,
5-16
HSC-based diagnostics, 5-37
HSC console log, 5-24, 5-26
HSC console utility: DKUTIL, 5-12
identifying the problem drive, 5-23
identifying the problem FRU, 5-24
KDM-based diagnostics, 5-37
LBN conversion, RA90, 5-6
LBN conversion, RA92, 5-8
manufacturing fault code, 5-9

Troubleshooting (cont'd.)
miscellaneous checks, 5-36
mode byte, 5-4
OCP fault indicator/error codes, 5-14,
5-25
off-line diagnostics, 5-37
other means (to identify problem drive),
5-24
performance issues when no errors are
being logged, 5-41
post-verification testing, 5-40
Power OK indicator, 5-14
pre-verifying drive symptoms, 5-25
previous command opcode byte, 5-6
priority order of DSA errors, 5-27
RBN conversion, RA90, 5-6
RBN conversion, RA92, 5-8
receiver ready collisions: acceptable
rates, 5-31
receiver ready collisions: unacceptable
rates, 5-31
recommended training, 5-1
reference material, 5-1
request byte, 5-3
resident diagnostics limitations, 5-16
response opcode byte, 5-3
retry count byte, 5-5
returning disk to customer, 5-41
SDI drive command timeout, 5-32
standalone diagnostics, 5-37
status/event 6B, 5-52
step-by-step procedure, 5-16
subunit mask byte, 5-3
uncorrectable ECC errors, 5-44
unit number low byte, 5-3
unusual problems, 5-36
VAXsimPLUS, 5-2, 5-23, 5-25
VMS mount verification, 5-42
worksheet, 5-23
XBN conversion, RA90, 5-6
XBN conversion, RA92, 5-8
XDA controller-based diagnostics,
5-38

U
Uncorrectable ECC errors, 5-44
hard, 5-44
soft, 5-46
Unit address
see drive unit address
Unit number low byte, 5-3
Unpacking, 60-inch cabinets, 2-3
Updating microcode
See microcode update procedure

V
VAXsimPLUS, 5-2
Voltage (frequency) selection
power supply, 2-13