EK-D590A-SM-A01
June 1992
172 pages
Document:
DECsystem 5900 CPU System Manual
Order Number:
EK-D590A-SM
Revision:
A01
Pages:
172
DECsystem 5900 CPU System Manual

Order Number EK-D590A-SM. A01

Digital Equipment Corporation
Maynard, Massachusetts

First Printing, June 1992

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

Restricted Rights: Use, duplication or disclosure by the U.S. Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.

© Digital Equipment Corporation 1992. All rights reserved. Printed in U.S.A.

The following are trademarks of Digital Equipment Corporation: CompacTape, DEC, DECconnect, DECnet, DECserver, DECsystem 5900, DECwindows, RRD40, RRD56, RX, ThinWire, TK, TS, TU, TURBOchannel, ULTRIX, VAX, VAX DOCUMENT, VMS, VT, and the Digital logo.

PrestoServe is a trademark of Legato Systems Inc.

FCC NOTICE: The equipment described in this manual generates, uses, and may emit radio frequency energy. The equipment has been type tested and found to comply with the limits for a Class A computing device pursuant to Subpart J of Part 15 of FCC Rules, which are designed to provide reasonable protection against such radio frequency interference when operated in a commercial environment. Operation of this equipment in a residential area may cause interference, in which case the user at his own expense may be required to take measures to correct the interference.

S1921

This document was prepared using VAX DOCUMENT, Version 2.1.

Contents

Preface

Chapter 1 System Overview
1.1 System Hardware Configurations
1.2 External System Unit Connectors
1.3 DECsystem 5900 Functional Overview
1.4 System Module Main Components
  1.4.1 CPU Card Components
  1.4.2 Memory Modules
  1.4.3 NVRAM Module
  1.4.4 TURBOchannel and TURBOchannel Extender
  1.4.5 Internal Base System Module Connectors
  1.4.6 SCSI Drives
  1.4.7 TURBOchannel Option Modules

Chapter 2 Console Mode and Operating Mode
2.1 Modes
  2.1.1 Console Mode
    2.1.1.1 Console prompts
    2.1.1.2 To enter console mode
    2.1.1.3 Halt button
  2.1.2 Operating Mode
    2.1.2.1 To enter operating mode
2.2 System Software Management
  2.2.1 To Boot System Software
  2.2.2 To Shut Down System Software
  2.2.3 To Access ULTRIX Error Logs

Chapter 3 Troubleshooting Overview
3.1 Introduction
3.2 Power
3.3 Self-Tests
3.4 Configuration Displays
3.5 Environment Variables
3.6 Tests and Scripts
  3.6.1 Tests
  3.6.2 Scripts

Chapter 4 Troubleshooting FRUs
4.1 LED Displays
  4.1.1 Diagnostic LED Array
  4.1.2 CPU Module LEDs
  4.1.3 Drawer LEDs
4.2 Configuration Displays
  4.2.1 Configuration Overview
  4.2.2 Detailed Configuration
4.3 Error Messages
  4.3.1 Test Error Messages
  4.3.2 Memory Test Error Messages
  4.3.3 Console Exception Messages
4.4 Addresses
  4.4.1 Slot Numbers
  4.4.2 Memory Addresses
  4.4.3 Hardware Physical Addresses
4.5 ULTRIX Error Logs
  4.5.1 Examining Error Logs
  4.5.2 ULTRIX Error Log Format
  4.5.3 ULTRIX Error Log Event Types
  4.5.4 Memory Error Logs
4.6 Registers
4.7 For Further Information

Chapter 5 Troubleshooting Tools
5.1 Console Mode
5.2 Tests
  5.2.1 Slot Numbers in Test Commands and Error Messages
5.3 Power-Up Self-Tests
5.4 Console Mode Tests
  5.4.1 Using the t Command
    5.4.1.1 To display a list of available tests
  5.4.2 Common Tests
    5.4.2.1 SCSI controller (cntl) test
    5.4.2.2 SCSI send diagnostics (sdiag) test
    5.4.2.3 Ethernet external loopback test
    5.4.2.4 SCC transmit and receive test
    5.4.2.5 SCC pins test
5.5 Test Scripts
  5.5.1 To Display a List of Available Scripts
  5.5.2 To Display the Contents of a Script
  5.5.3 To Create a Test Script

Appendix A Console Commands
A.1 Using This Appendix
  A.1.1 Conventions Used in This Appendix
  A.1.2 Some Terms Used in This Appendix
  A.1.3 Rules for Entering Console Commands
A.2 Console Command Reference
  A.2.1 Console Command Format Summary
  A.2.2 ? Command
  A.2.3 boot Command
    A.2.3.1 Important information about the boot command
  A.2.4 cat Command
  A.2.5 cnfg Command
    A.2.5.1 General system configuration displays
    A.2.5.2 Base system configuration displays
    A.2.5.3 Ethernet controller configuration displays
    A.2.5.4 SCSI controller displays
  A.2.6 d Command
  A.2.7 e Command
  A.2.8 erl Command
  A.2.9 go Command
  A.2.10 init Command
  A.2.11 ls Command
  A.2.12 passwd Command
  A.2.13 printenv Command
  A.2.14 restart Command
  A.2.15 script Command
  A.2.16 setenv Command
  A.2.17 sh Command
  A.2.18 t Command
  A.2.19 test Command
  A.2.20 unsetenv Command
A.3 Console Command Error Messages

Appendix B Base System Self-Test Commands and Error Messages
B.1 Locating Individual Tests in This Appendix
B.2 Tests
  B.2.1 cache/data - Cache Data Test
    B.2.1.1 Cache data test error messages
  B.2.2 cache/fill - Cache Fill Test
    B.2.2.1 Cache fill test error messages
  B.2.3 cache/isol - Cache Isolate Test
    B.2.3.1 Cache isolate test error messages
  B.2.4 cache/reload - Cache Reload Test
    B.2.4.1 Cache reload test error messages
  B.2.5 cache/seg - Cache Segment Test
    B.2.5.1 Cache segment test error messages
  B.2.6 ecc/cor - Error Correction Coding (ECC) Correction Test
    B.2.6.1 ECC correction test error messages
  B.2.7 fpu - Floating-Point Unit Test
    B.2.7.1 FPU test error messages
  B.2.8 mem - Memory Module Test
    B.2.8.1 Memory module test error messages
  B.2.9 mem/float10 - Floating 1/0 Memory Test
    B.2.9.1 Floating 1/0 memory test error messages
  B.2.10 mem/init - Zero Memory Utility
  B.2.11 mem/select - RAM Select Lines Test
    B.2.11.1 RAM select lines test error messages
  B.2.12 misc/cpu-type - CPU-Type Utility
    B.2.12.1 CPU-type utility messages
  B.2.13 misc/halt - Halt Button Test
    B.2.13.1 Halt button test error messages
  B.2.14 misc/pstemp - Overheat Detect Test
    B.2.14.1 Overheat detect test error message
  B.2.15 misc/wbpart - Partial Write Test
    B.2.15.1 Partial write test error messages
  B.2.16 ni/cllsn - Collision Test
    B.2.16.1 Collision test error messages
  B.2.17 ni/common - Common Diagnostic Utilities
    B.2.17.1 Common diagnostic utility error messages
  B.2.18 ni/crc - Cyclic Redundancy Code Test
    B.2.18.1 CRC test error messages
  B.2.19 ni/ctrs - Display Maintenance Operation Protocol (MOP) Counters Utility
  B.2.20 ni/dma1 - Ethernet-Direct Memory Access (DMA) Registers Test
    B.2.20.1 Ethernet-DMA registers test error messages
  B.2.21 ni/dma2 - Ethernet-Direct Memory Access (DMA) Transfer Test
    B.2.21.1 Ethernet-DMA transfer test error messages
  B.2.22 ni/esar - Ethernet Station Address ROM (ESAR) Test
    B.2.22.1 ESAR test error messages
  B.2.23 ni/ext-lb - Ethernet External Loopback Test
    B.2.23.1 External loopback test error messages
  B.2.24 ni/int - Ethernet Interrupt Request (IRQ) Test
    B.2.24.1 IRQ test error messages
  B.2.25 ni/int-lb - Ethernet Internal Loopback Test
    B.2.25.1 Internal loopback test error messages
  B.2.26 ni/m-cst - Ethernet Multicast Test
    B.2.26.1 Multicast test error messages
  B.2.27 ni/promisc - Ethernet Promiscuous Mode Test
    B.2.27.1 Promiscuous mode test error messages
  B.2.28 ni/regs - Ethernet Registers Test
    B.2.28.1 Registers test error messages
  B.2.29 prcache - Prcache Quick Test
    B.2.29.1 Prcache quick test error messages
  B.2.30 prcache/arm - Disconnect Battery Command
    B.2.30.1 Prcache/arm command error message
  B.2.31 prcache/clear - Zero NVRAM Memory Command
    B.2.31.1 Prcache clear error message
  B.2.32 prcache/unarm - Connect Battery Command
  B.2.33 rtc/nvr - Nonvolatile RAM Test
    B.2.33.1 NVR test error messages
  B.2.34 rtc/period - Real-Time Clock Period Test
    B.2.34.1 RTC period test error messages
  B.2.35 rtc/regs - Real-Time Clock Registers Test
    B.2.35.1 Real-time clock registers test error messages
  B.2.36 rtc/time - Real-Time Test
    B.2.36.1 Real-time test error messages
  B.2.37 scc/access - Serial Communication Chip (SCC) Access Test
    B.2.37.1 SCC access test error messages
  B.2.38 scc/dma - Serial Communication Chip Direct Memory Access Test
    B.2.38.1 SCC DMA test error messages
  B.2.39 scc/int - Serial Communication Chip Interrupts Test
    B.2.39.1 SCC interrupts test error messages
  B.2.40 scc/io - Serial Communication Chip Input/Output (I/O) Test
    B.2.40.1 SCC I/O test error messages
  B.2.41 scc/pins - Serial Communication Chip Pins Test
    B.2.41.1 SCC pins test error messages
  B.2.42 scc/tx-rx - Serial Communication Chip Transmit and Receive Test
    B.2.42.1 SCC transmit and receive test error messages
  B.2.43 scsi/cntl - SCSI Controller Test
    B.2.43.1 SCSI controller test error messages
  B.2.44 scsi/sdiag - SCSI Send Diagnostics Test
    B.2.44.1 SCSI send diagnostics test error messages
  B.2.45 scsi/target - SCSI Target Test
    B.2.45.1 SCSI target test error messages
  B.2.46 tlb/prb - Translation Lookaside Buffer Probe Test
    B.2.46.1 TLB probe test error messages
  B.2.47 tlb/reg - Translation Lookaside Buffer Registers Test
    B.2.47.1 TLB registers test error messages

Appendix C CPU and System Registers
C.1 CPU Registers
  C.1.1 Cause Register
  C.1.2 Exception Program Counter (EPC) Register
  C.1.3 Status Register
    C.1.3.1 Diagnostic status
  C.1.4 BadVAddr Register
C.2 System Registers
  C.2.1 Data Buffers 3 to 0
  C.2.2 System Support Register (SSR)
  C.2.3 System Interrupt Register (SIR)
  C.2.4 System Interrupt Mask Register
  C.2.5 Error Address Register (EAR)
  C.2.6 Error Syndrome Register (ES)
  C.2.7 Control Register (CS)
  C.2.8 ECC Logic

Appendix D Connector Pin Assignments

Appendix E ULTRIX System Exercisers
E.1 File System Exerciser (fsx)
E.2 Memory Exerciser (memx)
E.3 Shared Memory Exerciser (shmx)
E.4 Disk Exerciser (dskx)
E.5 Mag Tape Exerciser (mtx)
E.6 Tape Exerciser (tapex)
E.7 Network Exerciser (netx)
E.8 Communications Exerciser (cmx)
E.9 Line Printer Exerciser (lpx)

Index

Figures
1-1 CPU Connectors
1-2 Block Diagram of CPU Control Paths

Tables
1 Conventions Used in This Guide
4-1 Base System Test Error Messages
4-2 Slot Numbers in Commands and Messages
4-3 Memory Module Address Ranges
4-4 Hardware Physical Addresses
4-5 Error Log Event Types
5-1 Slot Numbers in Test Commands
A-1 Console Commands
A-2 Environment Variables in the Environment Variable Display
A-3 Console Command Error Messages
B-1 Base System Module Tests and Utilities
B-2 Cache Data Test Error Codes
B-3 Cache Fill Test Error Descriptions
B-4 Cache Isolate Test Error Codes
B-5 Cache Reload Test Error Descriptions
B-6 Cache Segment Test Error Codes and Descriptions
B-7 ECC Correction Test Error Codes and Descriptions
B-8 FPU Test Error Codes
B-9 Partial Write Test Error Codes
B-10 Collision Test Error Codes and Descriptions
B-11 Common Diagnostic Utility Error Codes and Descriptions
B-12 CRC Test Error Codes and Descriptions
B-13 Ethernet-DMA Registers Test Error Codes and Descriptions
B-14 Ethernet-DMA Registers Test Error Codes and Descriptions
B-15 ESAR Test Error Codes and Descriptions
B-16 External Loopback Test Error Codes and Descriptions
B-17 IRQ Test Error Codes and Descriptions
B-18 Internal Loopback Test Error Codes and Descriptions
B-19 Multicast Test Error Codes and Descriptions
B-20 Promiscuous Mode Test Error Codes and Descriptions
B-21 Registers Test Error Codes and Descriptions
B-22 Prcache Quick Test Error Codes and Descriptions
B-23 RTC Period Test Error Codes
B-24 Real-Time Clock Register Test Error Codes and Descriptions
B-25 Real-Time Test Error Codes
B-26 SCC DMA Test Error Codes
B-27 SCC I/O Test Error Codes and Descriptions
B-28 Pin Pairs Tested by Individual Loopback Connectors
B-29 SCC Pins Test Error Codes and Descriptions
B-30 SCC Transmit and Receive Test Error Codes and Descriptions
B-31 SCSI Controller Error Codes and Descriptions
B-32 SCSI Send Diagnostics Test Error Descriptions
B-33 SCSI Target Test Error Codes and Descriptions
B-34 TLB Registers Test Error Descriptions
C-1 CPU Registers
C-2 Exception Codes
C-3 System Registers
C-4 System Support Register 0xBF840100
C-5 System Interrupt Register 0xBF840110
C-6 System Interrupt Mask Register 0xBF840120
C-7 Error Address Register 0xBFA40000
C-8 EA Error Log Types
C-9 Error Syndrome Register 0xBFA80000
C-10 Control Register 0xBFAC0000
C-11 Participating Data Bits in Check Bit Calculation
C-12 Syndrome Decoding
D-1 SCSI Cable Connector Pin Assignments
D-2 Serial Communications Connectors Pin Assignments
D-3 ThickWire Ethernet Connector Pin Assignments
D-4 Power Supply Pin Assignments
D-5 Modem Loopback Connector Pin Assignments
D-6 Ethernet Loopback Connector Pin Assignments
D-7 Summary of Loopback Connectors

Preface

Intended Audience

This guide is for Digital Service representatives who want an in-depth view of DECsystem 5900 CPU operations and troubleshooting.

How To Use This Guide

This guide explains how the CPU works to control system operations and to interface to other devices. It also explains how to troubleshoot the system.
The CPU is the same as the one in the DECstation 5000 Model 240.

For an overview of the system hardware and its configurations, see Chapter 1, "System Overview."

For information about console mode, used for maintenance operations, and operating mode, used for regular software operations, see Chapter 2, "Console Mode and Operating Mode."

For an overview of the tools that are used most often when troubleshooting the CPU and its peripherals, see Chapter 3, "Troubleshooting Overview."

For a description of the information available to help you identify failed FRUs, see Chapter 4, "Troubleshooting Information."

For a description of the tests and scripts used when troubleshooting, see Chapter 5, "Troubleshooting Tools."

For an explanation of console commands, see Appendix A, "Console Commands."

For an explanation of individual system module and memory module tests, see Appendix B, "Base System Self-Test Commands and Error Messages."

For information about CPU and system registers, see Appendix C, "CPU and System Registers."

For information about connector pin assignments, see Appendix D, "Connector Pin Assignments."

For information about ULTRIX system exercisers, see Appendix E, "ULTRIX System Exercisers."

Table 1: Conventions Used in This Guide

Convention        Use
Monospace type    Anything that appears on your monitor screen is set in monospace type, like this.
Boldface type     Anything you are asked to type is set in boldface type, like this.
Italic type       Any part of a command that you replace with an actual value is set in italic type, like this.
Note              Notes provide general information about the current topic.
Caution           Cautions provide information to prevent damage to equipment or software. Read these carefully.
Warning           Warnings provide information to prevent personal injury. Read these carefully.

Chapter 1 System Overview

This chapter provides an overview of the DECsystem 5900 hardware.
This chapter discusses the following topics:
* Basic system hardware
* System hardware configurations
* Hardware options and peripherals

1.1 System Hardware Configurations

The DECsystem 5900 is a reduced instruction set computer (RISC) desktop system based on the MIPS R3000 processor and designed to support the ULTRIX operating system. The system is usually configured as a server.

1.2 External System Unit Connectors

The external system unit connectors on the back panel connect the system to external devices. The external system unit connectors are listed here and shown in Figure 1-1.
* Not used
* TURBOchannel Extender option slot
* TURBOchannel Extender option slot
* TURBOchannel Extender I/O (connected to the TURBOchannel Extender adapter)
* TURBOchannel Extender option slot
* TURBOchannel option slot 2
* Not used
* Communications port
* System console port
* TURBOchannel slot 1
* Halt switch
* Diagnostic LEDs
* Standard Ethernet
* TURBOchannel Extender adapter (connected to the TURBOchannel Extender I/O)
* System module SCSI port
* Remote power sequence connector

Figure 1-1: CPU Connectors (MLO-007682)

1.3 DECsystem 5900 Functional Overview

Figure 1-2 shows the system block diagram from a functional perspective. The DECsystem 5900 is made up of five main subsystems (memory, SCSI, Ethernet, TURBOchannel, and the serial communication lines) plus the clocking and interrupt-handling logic that coordinates them, as shown in Figure 1-2.

The system has four main application-specific integrated circuits (ASICs) that make up much of the important logic and control the movement of data in the system. These are the Memory Buffer (MB), Memory TURBOchannel (MT), Memory Subsystem (MS), and I/O Control (IOCTL) ASICs.

The CPU daughter card mounts on the system module and contains the R3000A CPU/FPU, 64-Kbyte data and instruction caches, a Memory Buffer (MB) ASIC, and a Memory TURBOchannel (MT) ASIC.
The MB ASIC interfaces the CPU to the rest of the system and provides timing, control, and cache RAM interfacing. The MT ASIC interfaces the CPU to memory and the TURBOchannel I/O interface. As an example, data can come in from the SCSI bus and be transferred by DMA into memory by way of the MT ASIC without CPU intervention.

Located on the system module is the Memory Subsystem (MS) ASIC, which provides the interface to the memory SIMMs in the memory slots. Address and data pass through this ASIC. The MS ASIC also provides memory timing signals.

Also on the system module are the SCSI, Ethernet, serial communication chip, real-time clock, and supporting hardware for these devices. These devices are controlled by the I/O Control (IOCTL) ASIC. The devices listed above are all on TURBOchannel slot 3, as the block diagram shows. The IOCTL ASIC is also the interface to memory and other parts of the system for TURBOchannel slots 0, 1, and 2. Because all I/O data passes through the IOCTL ASIC and control signals are provided by this ASIC, you must specify the TURBOchannel path when booting or testing a device on that slot. For example, cnfg 1 shows only the devices on TURBOchannel slot 1.

For the system to perform its power-up tests, it must first read the ROM by means of the IOCTL, MT, and MB ASICs.

1.4 System Module Main Components

The system module is the largest printed circuit board in the CPU drawer and is screwed directly to the floor of the CPU drawer. The system module supports the CPU module, Ethernet, SCSI bus, communication ports, memory modules, and up to three TURBOchannel options.
Major elements on the system board include:
* 256-Kbyte power-up self-test and bootstrap ROM
* System control and status registers and diagnostic LEDs
* RTC-based system clock and 50-byte (5-year) battery backed-up RAM
* SCC-based serial lines
* Two RS232 asynchronous serial comm ports
* Error address status register
* ECC error check/syndrome status register
* LANCE-based network interface for Ethernet
* Disk/tape interface for SCSI peripherals
* TURBOchannel I/O option connectors
* DMA for SCSI, Ethernet, and two comm ports
* Halt switch

Figure 1-2: Block Diagram of CPU Control Paths

1.4.1 CPU Card Components

The CPU card is a daughter card held to the system module by four standoffs and a dual card-edge connector. It provides the following:
* R3000A MIPS processor (40 MHz) CPU/FPU
* Read/write buffer
* 64-Kbyte instruction and data caches
* Processor interface
* Memory and TURBOchannel interface
* Clock and configuration logic

1.4.2 Memory Modules

Up to 14 32-Mbyte MS02-CA single in-line memory modules (SIMMs) may be connected to the system module, for a maximum total of 448 Mbytes. The standard DECsystem 5900 has a minimum of two 32-Mbyte MS02-CAs.

1.4.3 NVRAM Module

A 1-Mbyte NVRAM memory module can be installed in the last memory slot, slot 14. This is the only slot in which the NVRAM can be used, and it is dedicated to the NVRAM. The NVRAM module is the hardware that supports the PrestoServe NFS accelerator. The NVRAM has two LEDs that indicate the condition of the on-board battery.

1.4.4 TURBOchannel and TURBOchannel Extender

TURBOchannel is an I/O interface. The system module contains three TURBOchannel option slots.
One of these slots (0) is preconfigured with an adapter module that is used to connect a TURBOchannel Extender. The remaining two slots (1 and 2) may be used for one dual or two single TURBOchannel option modules.

The TURBOchannel Extender (TCE) is a standard feature of the DECsystem 5900. The TCE is mounted in the hinged metal cover of the CPU drawer. The TCE allows a two- or three-slot TURBOchannel option module to be connected to the DECsystem 5900 while logically taking up only one TURBOchannel slot. This leaves two slots available for other TURBOchannel options. Without the TCE, certain TURBOchannel options could use up all three system TURBOchannel slots, preventing other options from being installed.

TURBOchannel enables connection to TURBOchannel options such as:
* PMAZ (single-ended SCSI)
* PMAD (Ethernet)
* DEFZA (FDDI)

As shown in Figure 1-1, there are three connections from the rear panel of the CPU drawer to optional TURBOchannel I/O devices. These are the TURBOchannel controller ports. One of them (slot 0) contains a TURBOchannel Extender adapter module and is connected to the TCE. The TCE provides a place to mount a one-, two-, or three-slot option without using all of the system TURBOchannel slots.

NOTE: Only one TURBOchannel option can be placed on the TURBOchannel Extender module. This module does not provide three logical slots; it extends one slot outside of the system board.

1.4.5 Internal Base System Module Connectors

The internal base system module connectors are the means by which the system's modular components, both standard and optional, are connected to the base system module and, through it, to each other.
* The CPU module connector connects the CPU module to the base system module.
* Fifteen memory module connectors provide the means for installing SIMMs and one optional NVRAM module.
* Three internal TURBOchannel option module connectors connect TURBOchannel option modules to the base system module. The connector closest to the power supply is referred to as logical slot number 2 in test commands and error messages. The middle connector is logical slot 1, and the connector furthest from the power supply is logical slot 0.
* The power input connectors receive dc power from the power supply for all components in the system unit enclosure.

1.4.6 SCSI Drives

The base system module comes with a SCSI controller. In addition, up to three TURBOchannel SCSI controller option modules can be added to the system. Each SCSI controller can support up to seven drives.

1.4.7 TURBOchannel Option Modules

The TURBOchannel connectors on the base system module can support three TURBOchannel option modules. Any SCSI controller, Ethernet controller, or serial controller TURBOchannel option modules operate in addition to the equivalent functions on the base system module.

Chapter 2 Console Mode and Operating Mode

This chapter discusses the following topics:
* Console mode and operating mode
* Console prompts
* Password management
* System software startup and shutdown

2.1 Modes

The system operates in two modes: console mode and operating mode.

2.1.1 Console Mode

Most maintenance operations are conducted in console mode, including the following:
* Displaying hardware configurations (see Chapter 4 and Appendix A)
* Setting environment variables (see Chapters 3 and 5 and Appendix A)
* Running diagnostic tests and scripts (see Chapters 3 and 5 and Appendixes A and B)
* Booting the system software (see the "System Software Management" section in this chapter)

NOTE: ULTRIX error logs cannot be accessed from console mode; they can be accessed only from operating mode.

Console mode operations require that at least one RAM single in-line memory module (SIMM) be installed in slot 0 of the base system module.
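
As a rough illustration of the split described in this section, the following Python sketch records which maintenance tasks belong to which mode. The task labels and the helper name mode_for are hypothetical and exist only for this example; the mapping itself is taken from the list and note above.

```python
# Illustrative lookup only: the task labels and function name are assumptions
# made for this sketch, not console commands or ULTRIX utilities.
CONSOLE = "console mode"
OPERATING = "operating mode"

TASK_MODE = {
    "display hardware configuration": CONSOLE,
    "set environment variables": CONSOLE,
    "run diagnostic tests and scripts": CONSOLE,
    "boot system software": CONSOLE,
    # The note above: error logs are reachable only from operating mode.
    "access ULTRIX error logs": OPERATING,
}


def mode_for(task: str) -> str:
    """Return the mode in which the given maintenance task is carried out."""
    return TASK_MODE[task]
```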

2.1.1.1 Console prompts

When the normal console prompt (>>) is displayed, full console functionality is available, and you can use all console commands.

When the restricted console prompt (R>) is displayed, you can enter only the boot or passwd command without qualifiers.

Console commands, including boot and passwd, are discussed in Chapters 3 and 5 and Appendix A.

2.1.1.2 To enter console mode

This section lists the methods of entering console mode in order of recommended preference. Enter console mode in one of the following ways, depending on circumstances:
* If system software is running, shut down the system software. The system enters console mode automatically when system software is shut down. This is the most orderly way to enter console mode because it prevents corruption of the data.
  NOTE: Turning off the power while ULTRIX is running can corrupt data.
* If autoboot is not enabled, turn off the system power, and then turn it on again. The system executes the power-up self-test sequence and then comes up in console mode, displaying one of the console prompts (>> or R>).
* If autoboot is enabled, you must defeat autoboot in order to enter console mode when you turn on the system power. Defeat autoboot in one of the following ways:
  - Turn off the system power, and then turn it on again. Watch the command line on the monitor. As each power-up self-test runs, the name of the test appears on the command line. When the screen does not display any test name and the cursor appears on a blank line, quickly press Ctrl-c. The system aborts the boot process and comes up in console mode, displaying one of the console prompts (>> or R>).
  - Set the haltaction environment variable by entering setenv haltaction -h.
  - When finished, restore the setting by entering setenv haltaction -b.

If the R> prompt is displayed, you can use only the boot and passwd commands until you enter the password.
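
The prompt rules can be summarized in a few lines of Python. This is a minimal sketch, not the console firmware's logic: the function name command_allowed is hypothetical, and treating "without qualifiers" as "no additional words on the command line" is an assumption of this example.

```python
# Sketch of the >> and R> prompt rules. The function name and the
# whitespace-based parsing are assumptions made for illustration only.
NORMAL_PROMPT = ">>"
RESTRICTED_PROMPT = "R>"


def command_allowed(prompt: str, command_line: str) -> bool:
    """Return True if the command line may be entered at the given prompt."""
    words = command_line.split()
    if not words:
        return False  # nothing was typed
    if prompt == NORMAL_PROMPT:
        return True  # full console functionality is available
    if prompt == RESTRICTED_PROMPT:
        # Only boot or passwd, and only with no qualifiers after the command.
        return words[0] in ("boot", "passwd") and len(words) == 1
    raise ValueError(f"unknown prompt: {prompt!r}")
```

Under these assumptions, command_allowed("R>", "passwd") is accepted, while command_allowed("R>", "cnfg") is rejected until the password has been entered and the >> prompt is displayed.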

2.1.1.3 Halt button

The Halt button on the rear panel interrupts the processor. The data in the memory is preserved.

2.1.2 Operating Mode

In operating mode, the system displays the ULTRIX prompt. Operating mode is used for regular software operation. For maintenance purposes, operating mode is used to access ULTRIX error logs.

2.1.2.1 To enter operating mode

The system enters operating mode in one of two ways: automatically (autoboot) or manually from console mode.

If autoboot is enabled, the system executes the boot command immediately after the power-up self-test. The system goes directly to operating mode without displaying the console prompt.

Autoboot is enabled when the haltaction environment variable has been set to b and the boot environment variable has been set to a meaningful bootpath. The boot and setenv commands and the environment variables are discussed in Appendix A.

Procedures for entering operating mode manually from either console prompt are described in Section 2.2.1.

2.2 System Software Management

The following system software (ULTRIX) operations are significant to the hardware maintenance process:

* Starting up (booting) system software
* Shutting down system software
* Accessing ULTRIX error logs

2.2.1 To Boot System Software

1. At either console prompt (>> or R>), enter boot and press Return. The boot process takes several minutes.

2. If the system displays the ULTRIX prompt (#) before the login: prompt appears, the system has stopped at single-user mode instead of multiuser mode. To move on to multiuser mode, press Ctrl-d to continue the boot operation. When the system displays the login: prompt, the system software has started successfully.

   The system probably stopped at single-user mode because the bootpath is set for single-user mode or because of disk corruption. See the "setenv Command" section in Appendix A for information about how to set the bootpath for multiuser mode.
   If the problem persists, clean the disks using the fsck function.

3. If the system displays a console prompt (>> or R>), the boot failed. Proceed as follows:

   a. If the system displays an error message, see the "Console Command Error Messages" section in Appendix A.
   b. If the system displays the restricted prompt (R>), enter passwd and press Return. At the pwd: prompt, enter the password and press Return. The system displays the console prompt (>>). If you cannot enter the password, see the DECsystem 5900 Service Guide.
   c. At the console prompt (>>), enter printenv and press Return to display the environment variables table.
   d. Use the setenv command to set the boot environment variable to a device or to the network that contains the system software that you want to boot. See the "boot Command" section in Appendix A.
   e. Reenter the boot command to boot the system.

2.2.2 To Shut Down System Software

If the system is running ULTRIX software, shut down the software before you perform hardware maintenance.

At the ULTRIX prompt (#), enter /etc/shutdown -h (now | hhmm | +n) and press Return. You must include one of the parameters shown in parentheses to tell the system when to shut down.

* Specify the now parameter to shut down the software immediately.
* Specify hhmm to shut down the software at a specific hour and minute.
  - Replace hh with the hour to begin the shutdown.
  - Replace mm with the minute to begin the shutdown.
* Specify +n to shut down the software in a specified number of minutes. Replace n with the number of minutes until shutdown begins.

The system displays a console prompt (>> or R>) when shutdown is complete.

2.2.3 To Access ULTRIX Error Logs

At the ULTRIX prompt (#), enter /etc/uerf -R | more and press Return.

For information about interpreting the ULTRIX error logs, see the "ULTRIX Error Logs" section in Chapter 4.

Chapter 3
Troubleshooting Overview

3.1 Introduction

This chapter provides a brief overview of the tools and techniques that are used most often when troubleshooting the CPU and its peripherals. The field service engineer will solve each problem differently as logic dictates. Each of the tools and techniques mentioned is covered in detail in its own section.

* Chapter 4 discusses the information that the field service engineer uses to identify failed field-replaceable units (FRUs).
* Chapter 5 discusses the tools that the engineer uses to test the system and its components.

In general, these are the questions that the Digital Services engineer deals with when working on a system:

* What malfunction does the user report?
* What malfunction does the field engineer observe?
* Have the proper procedures been followed?
* Has the system run properly in the past or is it a new system?
* Are the cables and connectors in order?
* Is power getting to the system and its components?
* Does the screen work? When the power-up self-test sequence runs, do error messages appear on the screen or on the diagnostic LED array on the rear panel of the system unit?
* What useful information does the cnfg display provide?
* Are the environment variables set properly?
* What useful information can the tests and scripts provide?
* Is the software version appropriate? If this problem is suspected, check with the technical support group at Digital or the module vendor for further information.

3.2 Power

If the green LED on the front of the system unit or any of the mass storage drawers does not light up, the first priority is to get power to the device.

3.3 Self-Tests

When the self-test sequence runs,

* If the system presents no error codes and displays the console prompt (>>) or boots up, go on.
* If an error code is displayed on the diagnostic LED array but not on the screen, use the LED error code to troubleshoot. See the "LED Displays" section in Chapter 4.
* If one or more error messages appear on the screen, use the error messages to troubleshoot. See the "Error Messages" section in Chapter 4 and the "Power-Up Self-Tests" section in Chapter 5.

When the console prompt (>>) appears, you can use the console tests and utilities to get more information. See Chapter 5.

3.4 Configuration Displays

To see the configuration overview display, at the console prompt (>>) enter cnfg and press Return.

To view the detailed configuration information for one module, enter cnfg slot_number and press Return. Replace slot_number with the number of the slot where the module is installed. See the "cnfg Command" section in Appendix A.

Look for the following information:

* Does all of the installed hardware appear in the configuration display?
* Does the right amount of memory appear?
* Are the SCSI IDs correct?
* Is the firmware version appropriate? If this problem is suspected, check with the technical support group at Digital or the module vendor for further information.

3.5 Environment Variables

Check the environment variables as follows: at the console prompt (>>) enter printenv and press Return. See the "printenv Command" and "setenv Command" sections in Appendix A.

Look for the following information:

* Does the boot variable refer to the correct drive and is the drive working? See the "boot Command" section in Appendix A.
* Is the haltaction variable set as desired, either to autoboot or to stop in console mode?

3.6 Tests and Scripts

The base system and the TURBOchannel option modules contain tests and scripts that can be used to test functions and components singly or in combination.

3.6.1 Tests

To view the tests that are available for a module, at the console prompt (>>) enter t slot_number/? and press Return.
Replace slot_number with the number of the slot where the module to be tested is installed.

Table B-1, "Base System Module Tests and Utilities", lists all of the tests for the base system module and indicates the function assessed by each test.

3.6.2 Scripts

To view the scripts available for a module, at the console prompt (>>) enter ls slot_number and press Return. Replace slot_number with the slot where the module to be tested is installed.

To view the contents of a script, at the console prompt (>>) enter cat script_name and press Return. Replace script_name with the name of the script.

You can write your own script to assemble a set of tests and scripts appropriate to a given troubleshooting situation. See the "To Create a Test Script" section in Chapter 5.

Chapter 4
Troubleshooting FRUs

This chapter describes the information available to help you identify failed FRUs. The types of troubleshooting information are as follows:

* LED displays
* Configuration displays
* Error messages
* Addresses
* ULTRIX error logs
* Registers

Some of the information, such as exception messages and power-up self-test error messages, is displayed automatically. Other information, such as configuration displays, test error messages, ULTRIX error logs, and registers, must be specifically generated or accessed by the engineer.

ULTRIX error logs are accessible only in operating mode. All of the other types of troubleshooting information are accessible only in console mode. See Chapter 2 for information about console and operating modes.

4.1 LED Displays

The following three LED displays provide information about malfunctions:

* Diagnostic LED array
* CPU module LEDs
* Power supply (DCOK) LED

4.1.1 Diagnostic LED Array

The LEDs at the center rear of the CPU drawer indicate that tests are proceeding. This is useful when the system cannot display error messages on the console device.

4.1.2 CPU Module LEDs

A pair of LEDs on the CPU module light up when certain power-up events occur. When the power-up self-test fails to complete, the status of the CPU module LEDs implies the following:

* If neither LED lights up, the CPU module is likely faulty.
* If only one LED lights up, the base system module is likely faulty.
* If both LEDs light up, the CPU and base system modules have completed basic communication operations with each other.

4.1.3 Drawer LEDs

The green LED on the on/off switch at the front of each drawer (and at the rear switch of some versions of the mass storage drawer) indicates when power is on in that drawer.

4.2 Configuration Displays

The configuration displays show what devices are installed in the system. When hardware does not show up on the configuration display, or shows up incorrectly, this fact can be useful for troubleshooting. You can also use configuration displays to obtain the following information about the components:

* The amount of RAM memory installed on a board
* Whether an NVRAM module is installed
* The Ethernet address of an Ethernet controller
* The SCSI ID and Device Type of SCSI devices

See the "cnfg Command" section in Appendix A.

You can request configuration information in either of two forms:

* You can request a configuration overview, which provides basic information about the hardware installed in all of the TURBOchannel slots.
* You can request detailed information about the hardware in one TURBOchannel slot.

4.2.1 Configuration Overview

For the configuration overview, at the console prompt (>>) enter cnfg and press Return.

The following is a typical configuration overview display:

  3: KN03-AA  DEC  V5.0a  TCF0  ( 64MB, 1MB NVRAM)
                                (enet: 08-00-2b-64-5b-82)
                                (scsi = 7)

The fields in the display have the following meanings:

* 3: Module slot number.
* KN03-AA: Module name (base system module).
* DEC: Module vendor.
* V5.0a: Firmware version.
* TCF0: Firmware type (TURBOchannel).
* ( 64MB, 1MB NVRAM): Installed RAM. 64 megabytes of RAM and 1 megabyte of NVRAM are installed on the base system module.
* enet: 08-00-2b-64-5b-82: The Ethernet address of the base system Ethernet controller.
* scsi = 7: The SCSI ID of the base system SCSI controller.

4.2.2 Detailed Configuration

For detailed information about the hardware in one TURBOchannel slot, at the console prompt (>>) enter cnfg slot_number and press Return. Replace slot_number with the slot number of the module for which you want configuration information.

For example, for detailed information about the base system module hardware, at the console prompt (>>) enter cnfg 3 and press Return.

The following is a typical detailed configuration display for the base system module:

  3: KN03-AA  DEC  V5.0a  TCF0  ( 96MB, 1MB NVRAM)
                                (enet: 08-00-2b-64-5b-82)
                                (scsi = 7)
     DEV   PID         VID   REV        SCSI DEV
     rz1   RZ58  (c)   DEC   DEC nnnn   DIR
     rz2   RRD42 (c)   DEC   DEC nnnn   CD-ROM

     dcache( 64 KB), icache( 64 KB)

     mem( 0): a0000000:a1ffffff ( 32 MB)
     mem( 1): a2000000:a3ffffff ( 32 MB)
     mem( 2): a4000000:a5ffffff ( 32 MB)
     mem(14): a1800000:a18fffff (  1 MB)  Presto-NVR
     mem(14): clean, batt ok, armed

This detailed configuration display example provides the following information in addition to the configuration overview:

* The SCSI drives connected to the base system module are as follows:
  - rz1 RZ58 (c) DEC DEC nnnn DIR: An RZ58 drive with SCSI ID 1, manufactured by Digital Equipment Corporation, is using firmware version nnnn and is a hard disk drive (DIR).
  - rz2 RRD42 (c) DEC DEC nnnn CD-ROM: An RRD42 optical medium with SCSI ID 2, manufactured by Digital Equipment Corporation, is using firmware version nnnn and is a CD-ROM.
* dcache( 64 KB), icache( 64 KB): The base system module data cache is 64 kilobytes.
  The base system module instruction cache is 64 kilobytes.
* The RAM configuration is as follows:
  - mem( 0): a0000000:a1ffffff ( 32 MB): 32 megabytes of memory in memory slot 0 are assigned memory addresses a0000000 to a1ffffff.
  - mem( 1): a2000000:a3ffffff ( 32 MB): 32 megabytes of memory in memory slot 1 are assigned memory addresses a2000000 to a3ffffff.
  - mem( 2): a4000000:a5ffffff ( 32 MB): 32 megabytes of memory in memory slot 2 are assigned memory addresses a4000000 to a5ffffff.
  - mem(14): a1800000:a18fffff ( 1 MB): Memory slot 14 contains a 1-megabyte NVRAM module. Addresses a1800000 to a18fffff are assigned to the NVRAM module.
  - mem(14): clean, batt ok, armed: The memory in the NVRAM module is clean, the battery is okay, and the battery is armed.

4.3 Error Messages

An error message can be a test error message or a console exception message. Test error messages are displayed when an automatic or user-initiated test fails. Console exception messages are automatically displayed when console operations fail.

This section describes the following error message types:

* Test error messages
* Memory test error messages
* Console exception messages

4.3.1 Test Error Messages

When a test fails, the message appears on the screen in the following format:

  ?TFL slot_number/test_name (n: description) [module]

  ?TFL          Identifies a test error message.
  slot_number   Identifies the module that reported the error.
  test_name     Identifies the test that failed.
  n             Indicates which part of the test failed.
  description   Describes the failure; the message may include an address.
  module        Indicates the module identification number.

For an explanation of system and memory module test error messages, see Appendix B. For information about other error messages, see Appendix A. For an explanation of TURBOchannel option module error messages, refer to the TURBOchannel Maintenance Guide.
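
The message layout described above lends itself to mechanical parsing, for example when collecting console output over a serial line. The following is a minimal sketch in Python, not part of the console firmware. The names TFL_PATTERN and parse_tfl are hypothetical, and the exact spacing of real console output is an assumption, so the pattern tolerates an optional colon after ?TFL and a missing [module] field.

```python
import re

# Hypothetical pattern for the documented layout:
#   ?TFL slot_number/test_name (n: description) [module]
# Exact spacing in real console output is an assumption.
TFL_PATTERN = re.compile(
    r"^\?TFL:?\s*(?P<slot>\d+)/(?P<test>\S+)\s*"
    r"\((?P<part>\d+):\s*(?P<description>[^)]*)\)"
    r"(?:\s*\[(?P<module>[^\]]+)\])?"
)


def parse_tfl(line):
    """Split a ?TFL message into its documented fields, or return None."""
    match = TFL_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["slot"] = int(fields["slot"])        # slot that reported the error
    fields["part"] = int(fields["part"])        # part of the test that failed
    fields["description"] = fields["description"].strip()
    return fields


message = parse_tfl("?TFL 3/scsi/cntl (3: cnt xfr) [KN03-AA]")
print(message["slot"], message["test"], message["part"], message["module"])
```

A message without a bracketed module name still parses; its module field is None.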

This is a typical error message:

  ?TFL 3/scsi/cntl (3: cnt xfr) [KN03-AA]

This error message states that the KN03-AA module in slot number 3, the base system module, failed the SCSI controller test. The explanation of the SCSI controller test in Appendix B states that the message (3: cnt xfr) means that the read and write operation reported a mismatch.

Table 4-1 lists the base system tests and the corrective action indicated when each test is listed in a test error message.

Table 4-1: Base System Test Error Messages

  Error Message     Component                  Corrective Action
  ----------------  -------------------------  ----------------------------------------
  cache/data        CPU module                 Replace the CPU module. If the problem
  cache/fill                                   persists, replace the system module.
  cache/isol
  cache/reload
  cache/seg
  fpu

  ecc/cor           Memory modules             See the appropriate test section in
  mem                                          Appendix B.
  mem/float10

  mem/select        Memory and system module   Replace the memory module that the test
                                               identifies. If the problem persists,
                                               replace the system module.

  misc/halt         System module              Replace the system module.

  misc/pstemp       Power supply               See the misc/pstemp test section in
                                               Appendix B.

  misc/wbpart       Memory modules             See the misc/wbpart test section in
                                               Appendix B.

  ni/cllsn          Base system Ethernet       See the appropriate test section in
  ni/common         controller                 Appendix B.
  ni/crc
  ni/cntrs
  ni/dma1
  ni/dma2
  ni/esar
  ni/ext-lb
  ni/int
  ni/int-lb
  ni/m-cst
  ni/promisc
  ni/regs

  prcache           NVRAM module               See the appropriate test section in
  prcache/arm                                  Appendix B.
  prcache/clear
  prcache/unarm

  rtc/nvr           System module              Replace the system module.
  rtc/period
  rtc/regs
  rtc/time

  scc/access        Serial line controllers    See the appropriate test section in
  scc/dma           and devices attached       Appendix B.
  scc/init          to them
  scc/io
  scc/pins
  scc/tx-rx

  scsi/cntl         Base system SCSI           See the appropriate test section in
  scsi/sdiag        controller or device       Appendix B.
  scsi/target

  tlb/prb           CPU module                 Replace the CPU module.
  tlb/reg

4.3.2 Memory Test Error Messages

When a memory test detects an error, the message appears on the screen in the following format:

  ?TFL 3/mem (n: board xx, MBE = yy, SBE = zz)

  ?TFL 3/mem   Indicates that a memory test failed.
  n            Represents the number of the subtest that failed.
  xx           Represents the memory slot where the faulty board is installed.
  yy           Represents the number of multiple-bit errors that occurred.
  zz           Represents the number of single-bit errors that occurred.

This is a typical memory test error message:

  ?TFL 3/mem (1: board 3, MBE = 25, SBE = 6)

In this example:

  3/mem      Indicates that the mem test failed.
  1:         Indicates that subtest number 1 failed.
  board 3    Indicates that the SIMM in slot 3 is faulty.
  MBE = 25   Indicates that 25 multiple-bit errors occurred.
  SBE = 6    Indicates that 6 single-bit errors occurred.

4.3.3 Console Exception Messages

When a console operation fails, the system displays a console exception message. When a console exception message appears, first verify that any command and address that you entered are valid. If you are sure the command and address are correct but the console exception still occurs, interpret the message to determine what caused the exception. For information about the registers, see Appendix C.

A console exception message can be recognized by the first line, which always begins with the characters ? PC:.

A console exception message includes some combination of the following entries:

  ? PC: address
  ? CR: cause
  ? SR: status
  ? VA: virtual address
  ? ER: error address
  ? CK: error syndrome

* address represents the address of the exception instruction.
* cause represents the value in the cause register.
* status represents the contents of the status register.
* virtual address represents the virtual address of the exception.
* error address represents the contents of the error address register.
* error syndrome represents the value in the error syndrome register.
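
When exception messages are captured from a console session, the entries listed above can be collected into a table keyed by their two-letter labels. The following Python sketch illustrates one way to do that. The names ENTRY_NAMES, ENTRY_PATTERN, and parse_exception are hypothetical, and the assumption that each entry sits on its own line with an optional decoded part in angle brackets is based only on the layout shown in this section.

```python
import re

# Labels documented for console exception messages.
ENTRY_NAMES = {
    "PC": "address",
    "CR": "cause",
    "SR": "status",
    "VA": "virtual address",
    "ER": "error address",
    "CK": "error syndrome",
}

# Assumed layout: "? XX: 0xVALUE <decoded,fields>", one entry per line.
ENTRY_PATTERN = re.compile(
    r"^\?\s*(?P<label>[A-Z]{2}):\s*(?P<value>0x[0-9a-fA-F]+)"
    r"\s*(?:<(?P<decoded>[^>]*)>)?"
)


def parse_exception(text):
    """Return {label: {"meaning", "value", "decoded"}} for recognized entries."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY_PATTERN.match(line.strip())
        if match is None or match.group("label") not in ENTRY_NAMES:
            continue  # skip lines that are not documented entries
        decoded = match.group("decoded")
        entries[match.group("label")] = {
            "meaning": ENTRY_NAMES[match.group("label")],
            "value": int(match.group("value"), 16),
            "decoded": [part.strip() for part in decoded.split(",")] if decoded else [],
        }
    return entries


SAMPLE = """\
? VA: 0x0
? ER: 0xd0800006 <VALID,CPU,ECCERR,ADR=2000018>
"""
entries = parse_exception(SAMPLE)
print(entries["ER"]["meaning"], hex(entries["ER"]["value"]), entries["ER"]["decoded"])
```

Entries that are absent from a given message are simply missing from the returned dictionary, which matches the statement that a message includes some combination of the entries.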

The following example shows a typical value for each of the possible entries of a console exception message. In each entry, the information in brackets is the decoded version of the hexadecimal value that precedes it.

  ? PC: 0xbfc0d0d <vtr=NRML>
  ? CR: 0x210c <CE=0,IP6,EXC=DBE>
  ? SR: 0x30080000 <CU1,CU0,CM,IPL=8>
  ? VA: 0x0
  ? ER: 0xd0800006 <VALID,CPU,ECCERR,ADR=2000018>
  ? CK: 0x8c18c321 <VLDHI,CHKHI=C,SYNHI=18,VLDLO,CHKLO=43,SYNLO=21>

4.4 Addresses

Addresses of various types appear in error and exception messages. These addresses indicate the location of the malfunction. You use addresses in test commands to indicate which module or memory location the test is to address.

This section describes the following types of addresses:

* Slot numbers
* Memory addresses
* Hardware physical addresses

4.4.1 Slot Numbers

Test commands and error messages include slot numbers that identify the hardware to which the test command or error message refers, as shown in Table 4-2.

Table 4-2: Slot Numbers in Commands and Messages

  Slot   Hardware Identified
  ----   ----------------------------------------------------------
  0      Option module in slot 0 (on left side, viewing from rear)
  1      Option module in slot 1 (middle option slot)
  2      Option module in slot 2 (on right side, viewing from rear)
  3      Base system hardware, including:
           - System module
           - CPU module
           - SIMMs
           - NVRAM
           - Serial communications controller
           - Base system SCSI controller
           - Base system Ethernet controller

4.4.2 Memory Addresses

When a memory error occurs, the error message contains the address of the error. You can identify the faulty SIMM by the address.

Addresses can appear in error messages in several formats, but you must use the kseg1 format to specify addresses in console commands. kseg1 format refers to uncached, unmapped address space. In kseg1 format, the uppermost three bits of the address are always 101 and the hexadecimal form of the address always begins with an A or a B.
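
Forcing the uppermost three bits to 101 is a simple bit operation. The following Python sketch shows the arithmetic; the names KSEG1_BASE, PHYSICAL_MASK, and to_kseg1 are hypothetical and chosen only for illustration. It keeps the low 29 bits of the address and sets the top three bits to 101.

```python
KSEG1_BASE = 0xA0000000     # uppermost three bits are 101
PHYSICAL_MASK = 0x1FFFFFFF  # low 29 bits carry the physical address


def to_kseg1(address):
    """Return the kseg1 form of an address by forcing the top three bits to 101."""
    return (address & PHYSICAL_MASK) | KSEG1_BASE


print(hex(to_kseg1(0x04040404)))  # prints 0xa4040404
print(hex(to_kseg1(0x14040404)))  # prints 0xb4040404
```

Because only the top three bits change, any result begins with the hexadecimal digit A or B, as stated above.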

For example, if an address is listed in an error message as 0x04040404, you would use 0xA4040404 to specify that address in a console command. If an address is listed in an error message as 0x14040404, you would use 0xB4040404 to specify that address in a console command.

Table 4-3 lists the memory addresses in kseg1 format by slot number for 32-Mbyte memory modules.

Table 4-3: Memory Module Address Ranges

  Memory Module Slot   32-Mbyte Address Ranges
  ------------------   ------------------------
  0                    0xA0000000 to 0xA1FFFFFF
  1                    0xA2000000 to 0xA3FFFFFF
  2                    0xA4000000 to 0xA5FFFFFF
  3                    0xA6000000 to 0xA7FFFFFF
  4                    0xA8000000 to 0xA9FFFFFF
  5                    0xAA000000 to 0xABFFFFFF
  6                    0xAC000000 to 0xADFFFFFF
  7                    0xAE000000 to 0xAFFFFFFF
  8                    0xB0000000 to 0xB1FFFFFF
  9                    0xB2000000 to 0xB3FFFFFF
  10                   0xB4000000 to 0xB5FFFFFF
  11                   0xB6000000 to 0xB7FFFFFF
  12                   0xB8000000 to 0xB9FFFFFF
  13                   0xBA000000 to 0xBBFFFFFF
  14                   0xBC000000 to 0xBDFFFFFF
  Reserved             0xA7800000

4.4.3 Hardware Physical Addresses

The hardware addresses in Table 4-4 appear in ULTRIX error logs.

Table 4-4: Hardware Physical Addresses

  Physical Address Range      Indicated Hardware
  ------------------------    ----------------------------------------
  0x00000000 to 0x1DFFFFFF    RAM
  0x1E000000 to 0x1E7FFFFF    TURBOchannel slot 0
  0x1E800000 to 0x1EFFFFFF    TURBOchannel slot 1
  0x1F000000 to 0x1F7FFFFF    TURBOchannel slot 2
  0x1F800000 to 0x1FFFFFFF    Slot 3. Base system module

  The following addresses are included in the system module address range:

  0x1F800000 to 0x1F83FFFF    System ROM
  0x1F840000 to 0x1F87FFFF    Input/output control (IOCTL) registers and
                              direct memory access (DMA) pointers
  0x1F880000 to 0x1F8BFFFF    Ethernet address PROM and EEPROM
  0x1F8C0000 to 0x1F8FFFFF    Ethernet interface
  0x1F900000 to 0x1F93FFFF    Serial communication chip (SCC(0)) registers
  0x1F940000 to 0x1F97FFFF    Reserved
  0x1F980000 to 0x1F9BFFFF    SCC(1) registers
  0x1F9C0000 to 0x1F9FFFFF    Reserved
  0x1FA00000 to 0x1FA3FFFF    Real-time clock
  0x1FA40000 to 0x1FA7FFFF    Error address (EA) register (0x1FA40000)
  0x1FA80000 to 0x1FABFFFF    Error syndrome (ES) register (0x1FA80000)
  0x1FAC0000 to 0x1FAFFFFF    Control/status (CS) register (0x1FAC0000)
  0x1FB00000 to 0x1FB3FFFF    SCSI interface
  0x1FB40000 to 0x1FB7FFFF    Reserved
  0x1FB80000 to 0x1FBBFFFF    SCSI DMA
  0x1FBC0000 to 0x1FBFFFFF    Reserved
  0x1FC00000 to 0x1FC3FFFF    Boot ROM
  0x1FC40000 to 0x1FFFFFFF    Reserved

  0x20000000 to 0x3FFFFFFF    TURBOchannel slot 0
  0x40000000 to 0x5FFFFFFF    TURBOchannel slot 1
  0x60000000 to 0x7FFFFFFF    TURBOchannel slot 2
  0x80000000 to 0xFFFFFFFF    Reserved

4.5 ULTRIX Error Logs

The system records events and errors in the ULTRIX error logs. Use the memory error logs and the error and status register error logs to troubleshoot intermittent problems. This section describes ULTRIX error log formats and error log items that are useful for troubleshooting.

The ULTRIX error logs are not the same as the test error logs that appear when you use the erl command from the console prompt. A test error log is a record of errors reported by tests run in console mode.

4.5.1 Examining Error Logs

You must be running ULTRIX to examine error logs. At the ULTRIX prompt (#) enter /etc/uerf -R | more and press Return. A full display of ULTRIX error logs, with the newest error logs first, appears on the monitor.
For information about running ULTRIX, see the "System Software Management" section in Chapter 2. Information about the uerf command in the ULTRIX man facility can be obtained by entering man uerf at the ULTRIX prompt (#).

4.5.2 ULTRIX Error Log Format

The first part of each ULTRIX error log describes the type of error and system conditions in effect when the error occurred. The format of the first part is the same for all ULTRIX error logs, regardless of the event type.

The second part of each log provides specific information about the error and its location. In the second part, the information available for troubleshooting varies according to the event type.

The first part of all ULTRIX error logs is similar to this example:

  ******************** ENTRY 6. ********************

  ----- EVENT INFORMATION -----

  EVENT CLASS                      OPERATIONAL EVENT
  OS EVENT TYPE             250.   ASCII MSG
  SEQUENCE NUMBER             5.
  OPERATING SYSTEM                 ULTRIX 32
  OCCURRED/LOGGED ON               Mon Nov 11 10:39:27 1991 PST
  OCCURRED ON SYSTEM               GRANITE
  SYSTEM ID            x82040230   HW REV: x30
                                   FW REV: x2
                                   CPU TYPE: R2000A/R3000
  PROCESSOR TYPE                   KN03
  MESSAGE                          Error count on memory module 0 reached
                                   2048, resetting count to zero.

* EVENT CLASS indicates the category of the error. The two event class categories are operational events and error events.
  - Operational events are changes in system operation that are not errors.
  - Error events are actual errors in system operation.
* OS EVENT TYPE describes the type of error or event recorded in the log. Table 4-5 lists the operating system event types and their codes. For information about memory error logs and error and status register error logs, see the "ULTRIX Error Log Event Types" and "Memory Error Logs" sections in this chapter.
* SEQUENCE NUMBER indicates the order in which the system logged the event.
* OPERATING SYSTEM indicates the system's version of ULTRIX.
* OCCURRED/LOGGED ON indicates when the error occurred.
* OCCURRED ON SYSTEM identifies the system that reported the error.
* SYSTEM ID includes the following listings:
  - The first number is the system ID.
  - HW REV indicates the system hardware revision number.
  - FW REV indicates the system firmware revision number.
  - CPU TYPE indicates the type of CPU installed in the system.
* PROCESSOR TYPE indicates the type of processor chip that the system uses.
* The MESSAGE field provides information about the event or error.

4.5.3 ULTRIX Error Log Event Types

The second line of each error log indicates the code number and event type of the error. Table 4-5 lists the error log event types.

Table 4-5: Error Log Event Types

  Code   Event Type
  ----   --------------------------------
  100    Machine check
  101    Memory error
  102    Disk error
  103    Tape error
  104    Device controller error
  105    Adapter error
  106    Bus error
  107    Stray interrupt
  108    Asynchronous write error
  109    Exception or fault
  113    CPU error and status information
  130    Error and status register
  200    Panic (bug check)
  250    Informational ASCII message
  251    Operational message
  300    System startup message
  310    Time change message
  350    Diagnostic information

The information in the second part of an error log varies according to the event type listed on line 2 of the first part of the error log. For a detailed explanation of other error logs, refer to the ULTRIX documentation for the uerf function or the documentation for the device that the error log discusses.

4.5.4 Memory Error Logs

The two examples in this section are two sequential ULTRIX error log entries that are related to each other. The two entries were generated when a correctable single-bit error occurred in the SIMM in slot 0. ENTRY 6 occurred within 1 second after ENTRY 5.

  ******************** ENTRY 6. ********************

  ----- EVENT INFORMATION -----

  EVENT CLASS                      OPERATIONAL EVENT
  OS EVENT TYPE             250.   ASCII MSG
  SEQUENCE NUMBER             5.
  OPERATING SYSTEM                 ULTRIX 32
  OCCURRED/LOGGED ON               Mon Nov 11 10:39:27 1991 PST
  OCCURRED ON SYSTEM               GRANITE
  SYSTEM ID            x82040230   HW REV: x30
                                   FW REV: x2
                                   CPU TYPE: R2000A/R3000
  PROCESSOR TYPE                   KN03
  MESSAGE                          Error count on memory module 0 reached
                                   2048, resetting count to zero.

  ******************** ENTRY 5. ********************

  ----- EVENT INFORMATION -----

  EVENT CLASS                      ERROR EVENT
  OS EVENT TYPE             101.   MEMORY ERROR
  SEQUENCE NUMBER             4.
  OPERATING SYSTEM                 ULTRIX 32
  OCCURRED/LOGGED ON               Mon Nov 11 10:39:27 1991 PST
  OCCURRED ON SYSTEM               GRANITE
  SYSTEM ID            x82040230   HW REV: x30
                                   FW REV: x2
                                   CPU TYPE: R2000A/R3000
  PROCESSOR TYPE                   KN03

  ----- UNIT INFORMATION -----

  UNIT CLASS                       MEMORY
  UNIT TYPE                        MS02
  ERROR SYNDROME                   MEMORY CRD ERROR

  ----- KN03 MEMORY REGISTERS -----

  EPC                  x800AFA3C   INVALID PC MEMINTR
  MEMORY CSR           x00006400   CHECK VALUE x0
                                   32 MB MEM MODULES
                                   ECC CORRECTION ENABLED
  PHYSICAL ERROR ADDR  x010205F8
  CHECK SYNDROME       x006308CB4  SYND BITS x34
                                   SINGLE BIT ERROR
                                   CHECK BITS xC
                                   MODULE NUM. x0
  ERROR COUNT                 3.

A troubleshooter would analyze the error logs in the preceding examples as discussed here. See Appendix C for detailed information about memory registers.

* The MESSAGE field of ENTRY 6 indicates that more than 2048 single-bit ECC errors have occurred on the memory module in memory slot 0 and the counter has been reset to zero. Since the memory error correction feature corrects single-bit errors, this is an operational event, not strictly an error.
* ENTRY 5 reports the actual single-bit error that overflowed the counter, causing it to be reset. The information under the KN03 MEMORY REGISTERS heading is useful to the troubleshooter:
  - SINGLE BIT ERROR indicates that a single-bit correctable error occurred.
  - MODULE NUM. x0 indicates that the error occurred on the module in slot 0.
  - The PHYSICAL ERROR ADDR field indicates the error address.
  - The value 32 MB MEM MODULES in the MEMORY CSR field indicates the size of the memory modules.

ECC memory is designed so that occasional single-bit errors can occur and correction will take place. If occasional errors occur on a module, the module should not be replaced. But if a particular memory module records errors frequently, the module should be replaced.

The following memory error log example describes a multiple-bit uncorrectable error. The module where the error occurred must be changed.

  ******************** ENTRY 7. ********************

  ----- EVENT INFORMATION -----

  EVENT CLASS                      ERROR EVENT
  OS EVENT TYPE             101.   MEMORY ERROR
  SEQUENCE NUMBER           193.
  OPERATING SYSTEM                 ULTRIX 32
  OCCURRED/LOGGED ON               Tue Jan 7 10:52:18 1992 PST
  OCCURRED ON SYSTEM               csselab2
  SYSTEM ID            x82040230   HW REV: x30
                                   FW REV: x2
                                   CPU TYPE: R2000A/R3000
  PROCESSOR TYPE                   KN03

  ----- UNIT INFORMATION -----

  UNIT CLASS                       MEMORY
  UNIT TYPE                        MS02
  ERROR SYNDROME                   MEMORY RDS ERROR

  ----- KN03 MEMORY REGISTERS -----

  EPC                  x8011995C
  MEMORY CSR           x00006400   CHECK VALUE x0
                                   32 MB MEM MODULES
                                   ECC CORRECTION ENABLED
  PHYSICAL ERROR ADDR  x00FD5ECC
  CHECK SYNDROME       x800080B5   SYND BITS x35
                                   SINGLE BIT ERROR
                                   CHECK BITS x0
                                   MODULE NUM. x0
  ERROR COUNT                 0.

* The ERROR SYNDROME field describes the error. The value in that field (MEMORY RDS ERROR) indicates that a multi-bit uncorrectable error occurred.
* The information under the KN03 MEMORY REGISTERS heading provides the following useful information:
  - The value MODULE NUM. x0 in the fourth line of the CHECK SYNDROME field indicates that the error occurred on the module in slot 0.
  - The PHYSICAL ERROR ADDR field indicates the error address.
  - The value 32 MB MEM MODULES in the second line of the MEMORY CSR field indicates the size of the memory modules.

Replace the indicated memory module. Multi-bit errors are not correctable, and will cause processes and the system to crash.
4.6 Registers

The system automatically displays CPU register information in the console exception message when an exception occurs.

To access system registers, from the console prompt (>>) enter e console_address and press Return. Replace console_address with the address of the register that you want to examine. Use the kseg1 format for the address. For information about the kseg1 format, see the "Memory Addresses" section in this chapter. For complete register information, see Appendix C. For information about the e command, see the "e Command" section in Appendix A.

4.7 For Further Information

For an explanation of other error logs, refer to the ULTRIX documentation for the uerf function. For an explanation of error logs for SCSI devices, refer to the documentation for the device described in the error log.

Chapter 5 Troubleshooting Tools

This chapter discusses the system troubleshooting tools. It explains how to
* Run tests
* Use test scripts

5.1 Console Mode

You have to be in console mode to perform maintenance operations, including the following:
* Run diagnostic tests
* Read error messages
* Set environment variables
* Display hardware configurations

See the "Console Mode" section in Chapter 2.

NOTE: You have to be in operating mode to use ULTRIX error logs.

5.2 Tests

The read-only memories (ROMs) on the base system module and on the TURBOchannel option modules contain numerous tests that verify the functions of the system. Tests can be used in the following ways to check system hardware operation:
* The automatic power-up self-test scripts run a comprehensive set of individual tests on the system and option module hardware.
* You can run individual tests in console mode to test specific system and option module functions.
* You can run one of several prepared scripts or create a script of your own, containing any set of tests that you find appropriate.
5.2.1 Slot Numbers in Test Commands and Error Messages

Test commands and error messages use slot numbers to identify the hardware to which the command or message refers.
* Slot 3 always refers to the base system hardware, which includes the following:
  - System module
  - CPU module
  - Memory modules (SIMMs and NVRAM)
  - Base system Ethernet controller
  - Base system SCSI controller
  - Serial line controllers
* Slot 0 refers to the TURBOchannel option slot on the left side, viewing from the back.
* Slot 1 refers to the middle TURBOchannel option slot.
* Slot 2 refers to the TURBOchannel option slot on the right side, viewing from the back.

5.3 Power-Up Self-Tests

When you turn on the system power, the system automatically runs a power-up self-test script. The monitor and the diagnostic LED array report any errors the power-up self-tests detect. Self-test error codes are discussed in the "Error Messages" section in Chapter 4 and in Appendix B.

You can specify a quick or a thorough power-up self-test script to run when the system powers up.
* The quick script, usually specified for normal power-up, is a limited script that allows the system to boot quickly.
* The thorough script runs an extensive check of system hardware. The thorough script is most useful for field service troubleshooting.

To select a power-up self-test script, use the setenv command to set the testaction environment variable. Enter setenv testaction (q | t) and press Return. The vertical bar ( | ) means that you choose one of the alternatives. In this case,
* Enter setenv testaction q to select the quick test.
* Enter setenv testaction t to select the thorough test.

You can use the powerup script to run the power-up self-tests without turning the power off and on again. To run the powerup script, at the console prompt (>>) enter powerup and press Return.
5.4 Console Mode Tests

From the console prompt (>>), use the t command to run an individual test or the sh command to run a test script. To see a list of available console commands and their formats, at the console prompt (>>), enter ? and press Return. Appendix A describes the console commands in detail.

5.4.1 Using the t Command

To run an individual test, from the console prompt (>>) enter t [-l] slot_number/test_name [arg1...argn] and press Return.

  t              Indicates the test command.
  -l             Causes the test to repeat until you press Ctrl-c or reset the system by pushing the Halt button or by switching the power off and then on.
  slot_number    Replace with the slot number of the module to be tested.
  test_name      Replace with the name of the test to be run.
  arg1...argn    Specify individual test conditions.

Table 5-1: Slot Numbers in Test Commands

  Slot Number   Component Tested
  0             Option module in slot 0 (on left side, viewing from the back)
  1             Option module in slot 1 (middle option slot)
  2             Option module in slot 2 (on right side, viewing from the back)
  3             Base system hardware, which includes
                - System module
                - CPU module
                - Memory modules (SIMMs and NVRAM)
                - Base system SCSI controller
                - Base system Ethernet controller
                - Serial line controllers

5.4.1.1 To display a list of available tests

To display a list of tests available for a module, from the console prompt (>>) enter t slot_number/? and press Return. Replace slot_number with the number of the slot where the module is installed.
A display similar to this appears on the monitor:

  cache/data     I or D[D]   address[80050000]
  cache/isol     I or D[D]
  cache/reload   I or D[D]   address[80050000]
  cache/seq      I or D[D]   address[80050000]
  fpu
  mem            board[0]    thrsld[10]   pattern[55555555]
  mem/init
  mem/float10    address[A0100000]
  mem/select
  mfg/done
  misc/pstemp
  misc/wbpart
  rtc/nvr        pattern[55]
  rtc/period
  rtc/regs
  rtc/time
  tlb/prb
  tlb/reg        pattern[55555555]

The first column lists the names of the tests available for the module in the slot that you specified. Entries in the other columns are individual test parameters. The value in brackets next to each parameter is the default value for that parameter.

5.4.2 Common Tests

This section briefly describes the following frequently used tests:
* SCSI controller test
* SCSI send diagnostics test
* Ethernet external loopback test
* SCC transmit and receive test
* SCC pins test

Appendix B describes the base system module tests and their parameters and error messages in detail. For information about the TURBOchannel module tests, refer to the TURBOchannel Maintenance Guide.

5.4.2.1 SCSI controller (cntl) test

The cntl test tests the operation of a SCSI controller. For example, to run the controller test on the base system SCSI controller, at the console prompt (>>) enter t 3/scsi/cntl and press Return.

For information about SCSI controller test error messages, see the "SCSI Controller Test" section in Appendix B.

5.4.2.2 SCSI send diagnostics (sdiag) test

The sdiag test runs the self-test for an individual SCSI device. For example, to run the SCSI send diagnostics test on device 0 connected to the base system SCSI controller, at the console prompt (>>) enter t 3/scsi/sdiag and press Return.

For information about sdiag test parameters and error messages, see the "SCSI Send Diagnostics Test" section in Appendix B.

5.4.2.3 Ethernet external loopback test

The Ethernet external loopback test tests an Ethernet controller and its connections.
First, install a ThickWire loopback connector on the external connector of the controller to be tested. Then enter the external loopback test command. For example, to test the base system Ethernet controller, at the console prompt (>>) enter t 3/ni/ext-lb and press Return.

For information about external loopback test error messages, see the "SCSI Controller Test" section in Appendix B.

5.4.2.4 SCC transmit and receive test

The SCC transmit and receive test tests the transmit and receive function of a serial port. First, install a communications adapter with an MMJ loopback connector on the serial connector to be tested. Then enter the SCC transmit and receive test command. For example, to run the internal loopback test on serial line 3, at the console prompt (>>) enter t 3/scc/tx-rx 3 int and press Return.

For information about the SCC transmit and receive test format and error messages, see the "SCC Transmit and Receive Test" section in Appendix B.

5.4.2.5 SCC pins test

The SCC pins test tests the pins on a serial communications connector. First, install a modem loopback connector on the communications connector. Then enter the SCC pins test command. For example, to test serial line 3 using a 29-24795 loopback connector, at the console prompt (>>) enter t 3/scc/pins 3 29-24795 and press Return.

For information about the SCC pins test format, the pins tested by the different loopback connectors, and the pins test error messages, see the "SCC Pins Test" section in Appendix B.

5.5 Test Scripts

The ROM for each module contains preprogrammed test scripts. A test script is a short program that includes a list of individual tests and other test scripts. When you run a test script, the system automatically runs the included tests and scripts in order.

Use the sh command to run a test script. To run a test script once and then stop, at the console prompt (>>) enter sh slot_number/test_script and press Return.
Replace slot_number with the slot number of the module that you want to test. Replace test_script with the name of the test script that you want to run. For example, to run the quick pst test script on the option module in slot 1, at the console prompt (>>) enter sh 1/pst-q and press Return.

To have a test script keep repeating until you press Ctrl-c, at the console prompt (>>) enter sh -l slot_number/test_script and press Return.

For detailed information about scripts, see the "script Command" and "sh Command" sections in Appendix A.

5.5.1 To Display a List of Available Scripts

To display a list of scripts available for a module, from the console prompt (>>) enter ls slot_number and press Return. Replace slot_number with the slot number of the module. This is a partial listing of the scripts in the base system module:

    28  1  cnfg -> code
    28  1  boot -> code
    24  1  rst-q -> rst
    24  1  rst-t -> rst
    28  1  rst-m -> powerup
    32  1  test-ni-m -> test-ni-t
    28  1  init -> code
   304  1  powerup
    44  1  reset
    36  1  halt-r
    28  1  halt-b
   192  1  pst-m
   272  1  pst-q
   196  1  pst-t
    96  1  tech
   121  1  test
  2401  1  test-cache
   132  1  test-cpu
  1928  1  test-scc-m
   868  1  test-scc-t
   124  1  test-crt
    60  1  test-misc
   268  1  test-mem-m
    80  1  test-mem-q
   184  1  test-mem-t
   196  1  test-ni-t
    88  1  test-rtc
    40  1  test-scsi
   104  1  rst
   g8g  1  cnsltest

5.5.2 To Display the Contents of a Script

To see which individual tests and other test scripts are in a specific test script, at the console prompt (>>) enter cat slot_number/script_name and press Return. Replace slot_number with the slot number of the module. Replace script_name with the name of the test script for which you want a listing. The system displays a list of the individual tests and any other test scripts that are in the test script.
The following example shows the cat command and the resulting listing of the contents of the test-rtc test script for slot 3 (the base system module):

  >>cat 3/test-rtc
  t ${#}/rtc/regs
  t ${#}/rtc/nvr
  t ${#}/rtc/period
  t ${#}/rtc/time

In the listing of a test script, the character # represents the slot number of the module where the script resides.

The cat command displays the contents of test scripts only. It does not display the contents of other objects.

For further information about the cat command, see the "cat Command" section in Appendix A.

5.5.3 To Create a Test Script

You can create a test script to test modules under conditions you choose.
1. At the console prompt (>>), enter script script_name and press Return. Replace script_name with the name you want to give the script you are creating.
2. Enter the test commands for the tests that you want to include in the script.
   - Enter test commands in the same order that you want the tests to run. You can include individual tests and test scripts.
   - Specify any test parameters that you want to include with each entry.
   - Press Return after you finish typing each individual test command.
3. To finish creating the test script, press Return twice after you enter the last test command in the test script.
4. To run the script you just created, enter sh script_name and press Return. Replace script_name with the name you assigned the script.

The system stores the test script in volatile memory (RAM). The test script is lost when you turn off the system unit or halt the system with the Halt button. You can store only one script at a time. If you use the ls command to list the test scripts for the base system, the test script you created appears in the test script list.
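The ${#} placeholder shown in the cat listing of the test-rtc script can be illustrated with a short sketch. It shows only how a reader can translate a script line into the equivalent t command for a given slot; the function names are hypothetical, and the way the console firmware actually performs the substitution is not described in this manual.

```python
def expand_script_line(line, slot_number):
    """Replace the ${#} placeholder with the slot number of the module."""
    return line.replace("${#}", str(slot_number))


def expand_script(lines, slot_number):
    """Expand every line of a script listing for one slot."""
    return [expand_script_line(line, slot_number) for line in lines]


# Contents of the test-rtc script as listed by "cat 3/test-rtc".
test_rtc = [
    "t ${#}/rtc/regs",
    "t ${#}/rtc/nvr",
    "t ${#}/rtc/period",
    "t ${#}/rtc/time",
]

for command in expand_script(test_rtc, 3):
    print(command)
```

For the base system (slot 3), the first line becomes t 3/rtc/regs, which is the form used for individual tests in the "Using the t Command" section.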
Appendix A Console Commands

This appendix explains
* The rules to follow when you type console commands
* Terms commonly used in this discussion of console commands
* The command format and purpose of each console command
* Possible console command error messages

A.1 Using This Appendix

A.1.1 Conventions Used in This Appendix

* Letters in boldface type like this are to be typed exactly as they appear.
* Letters in italic type like this are variables that you replace with actual values.
* Arguments enclosed in square brackets ([ ]) are optional.
* Ellipses (...) follow an argument that can be repeated.
* A vertical bar (|) separates choices. You can think of the bar as a symbol that means "or".
* Parentheses enclose a group of values from which you must select one value. For example, -(b | h | w) means enter -b or -h or -w.

A.1.2 Some Terms Used in This Appendix

Controller: A hardware device that directs the operation and communication between devices or other controllers. Each controller in the system has a unique controller ID number.

Script: A collection of console commands that run in a set order. Test scripts, which are collections of individual tests and may also contain other test scripts, are commonly used for troubleshooting the system.

Slot: The physical location of a module or modules.
* TURBOchannel option modules occupy slots 0, 1, and 2.
* The base system occupies slot 3. Base system hardware includes the system module, CPU module, and memory modules. The system module contains the base system SCSI and Ethernet controllers.

A.1.3 Rules for Entering Console Commands

You can use console commands when the system monitor displays the >> or R> prompt. When the system displays the R> prompt, you can use only the boot and passwd commands until you enter the console command password.

Follow these rules when you enter console commands:
* Enter uppercase and lowercase letters exactly as they appear in command lines.
  The system treats uppercase and lowercase letters as different input.
* Press Return after you enter a command.
* Enter numbers as follows:
  - Enter decimal values as a string of decimal digits with no leading zeros (for example, 123).
  - Enter octal values as a string of octal digits with a leading zero (for example, 0177).
  - Enter hexadecimal values as a string of hexadecimal digits preceded by 0x (for example, 0x3ff).
* When reading or writing to memory, enter data as bytes, halfwords, or words. Because a word is 4 bytes, successive addresses referenced by a word are successive multiples of 4. For example, the address following 0x80000004 is 0x80000008. An error occurs if you specify an address that is not on a boundary for the data size you are using.
* When reading or writing to memory, enter the address in kseg1 format. kseg1 format refers to uncached, unmapped address space. In kseg1 format, the uppermost three bits of the address are always 101 and the hexadecimal form of the address always begins with an A or a B. For example, if an address is listed in an error message as 0x04040404, you would use 0xA4040404 to specify that address in a console command. If an address is listed in an error message as 0x14040404, you would use 0xB4040404 to specify that address in a console command.
* The following key combinations have an immediate effect when the system is in console mode:
  - Ctrl-s freezes the screen display.
  - Ctrl-q releases a frozen screen display.
  - Ctrl-c aborts a command.
  - Ctrl-u erases a partially entered line.

A.2 Console Command Reference

This section describes the console commands used to test the following hardware:
* System module
* CPU module
* Memory modules
* Ethernet controllers
* SCSI controllers

Console commands in this appendix appear in the same order as they appear in the system console command Help menu.
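The number formats and the kseg1 address rule given in "Rules for Entering Console Commands" can be checked with a small sketch. The function names are hypothetical and the code runs on a workstation, not on the console. It relies on the standard MIPS layout in which kseg1 covers 0xA0000000 through 0xBFFFFFFF and maps onto the first 512 MB of physical memory, which matches the two address examples in the text.

```python
KSEG1_BASE = 0xA0000000      # uppermost three address bits are 101
PHYSICAL_LIMIT = 0x1FFFFFFF  # highest physical address reachable through kseg1


def parse_console_number(text):
    """Interpret a number the way the console rules describe it."""
    if text.startswith("0x"):
        return int(text[2:], 16)   # hexadecimal, preceded by 0x
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)        # octal, leading zero
    return int(text, 10)           # decimal, no leading zeros


def to_kseg1(physical_address):
    """Convert a physical address to its kseg1 (uncached, unmapped) form."""
    if not 0 <= physical_address <= PHYSICAL_LIMIT:
        raise ValueError("address is outside the range kseg1 can map")
    return KSEG1_BASE | physical_address


def is_aligned(address, size_in_bytes):
    """Check that an address is on a boundary for the data size (1, 2, or 4)."""
    return address % size_in_bytes == 0


print(hex(to_kseg1(parse_console_number("0x04040404"))))
print(hex(to_kseg1(parse_console_number("0x14040404"))))
print(is_aligned(0x80000004, 4))
```

An address listed in an error message can be passed through to_kseg1 before it is typed after the e or d command, and is_aligned mirrors the boundary rule for halfword and word accesses.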
For information about console commands used by TURBOchannel options not on this list, refer to the TURBOchannel Maintenance Guide.

A.2.1 Console Command Format Summary

Here are the console commands and their formats displayed in the Help menu that appears when you enter ?:

  ? [cmd]
  boot [[-z #] [-n] #/path [ARG...]]
  cat SCRPT
  cnfg [#]
  d [-bhw] [-S #] RNG val
  e [-bhwcdoux] [-S #] RNG
  erl [-c]
  go [ADR]
  init [#] [-m] [ARG...]
  ls [#]
  passwd [-c] [-s]
  printenv [EVN]
  restart
  script SCRPT
  setenv EVN STR
  sh [-belvS] [SCRPT] [ARG..]
  t [-l] #/STR [ARG..]
  unsetenv EVN

The following sections describe the console commands in detail. Note that the command descriptions do not always use the format that appears in the Help menu. Table A-1 lists the console commands.

Table A-1: Console Commands

  Command                                          Function
  ? [cmd]                                          Displays console commands and formats.
  boot [-z seconds] [-n] [bootpath] [-a]           Boots the system.
    [args...]
  cat slot_number/script_name                      Displays the contents of a script.
  cnfg [slot_number]                               Displays system configuration information.
  d [-(b | h | w)] [-Scount] rng                   Deposits data into memory.
  e [-(b | h | w)] [-c] [-d] [-o] [-u] [-x]        Examines memory contents.
    [-Scount] rng
  erl [-c]                                         Displays the error message log.
  go [address]                                     Transfers control to a specific address.
  init [slot_number] [-m]                          Resets the system or a module.
  ls [slot_number]                                 Displays the scripts and other files in a module.
  passwd [-c] [-s]                                 Sets and clears the console password.
  printenv [variable]                              Prints environment variables.
  restart                                          Attempts to restart the operating system software specified in the restart block.
  script name                                      Creates a temporary script of console commands.
  setenv variable value                            Sets an environment variable.
  sh [-b] [-e] [-l] [-v] [-S]                      Runs a script.
    [slot_number/script] [arg...]
  t [-l] slot_number/test_name                     Runs a test.
    [arg1]...
    [argn]
  test                                             Runs a comprehensive test script that checks the system hardware.
  unsetenv variable                                Removes an environment variable.

A.2.2 ? Command

Use the ? command to display a list of available console commands and their formats. The ? command format is

  ? [cmd]

* To display the format for all available console commands, omit the optional cmd parameter.
* To display the format for a single command, replace the optional cmd parameter with the name of the command for which you want a command format display.

A.2.3 boot Command

Use the boot command to boot the system software. The boot command format is

  boot [-z seconds] [-n] [bootpath] [-a] [args...]

* Include the optional -z seconds parameter to have the system wait before starting the bootstrap operation. Replace seconds with the number of seconds the system should wait before the bootstrap operation starts.
* Include the optional -n parameter to have the boot command load, but not execute, the specified file.
* Replace the optional bootpath parameter with the specification for the file you are using to boot. The file specification form depends on the type of boot device you use.
  - To boot from Ethernet, use the file specification form slot_number/protocol[/file]. Replace slot_number with the slot number of the Ethernet controller you are using to boot. The protocol parameter represents the name of the network protocol that performs the boot operation. Replace protocol with either mop or tftp. The optional file parameter represents a specific file that you use to boot. For example, to use the protocol named mop to boot from the base system Ethernet, which uses slot number 3, enter boot 3/mop and press Return.
  - To boot from a drive, use the file specification form slot_number/(rz | tz)scsi_id/file_name. Replace slot_number with the SCSI controller slot number. Use the (rz | tz) parameter to specify the type of drive that performs the boot operation.
    Specify rz to boot from a hard disk or compact disc drive. Specify tz to boot from a tape drive. Replace scsi_id with the SCSI ID for the drive you are using to boot. Replace file_name with the name of the specific file you want to boot.
    For example, to boot the file named vmunix in multiuser mode from a hard disk drive with SCSI ID 1 that is on the SCSI bus connected to the base system SCSI controller in slot 3, enter boot 3/rz1/vmunix -a and press Return.
    For example, to boot the file called vmunix from a tape drive that has SCSI ID 2 and is on the SCSI bus connected to a SCSI controller in option slot 1, enter boot 1/tz2/vmunix and press Return. The tape labeled "ULTRIX 4.L supported vol. 1 (RISC)" is the bootable tape.
* To perform a multiuser boot operation, include the -a argument. If you omit the -a argument, the system performs a single-user boot.

A.2.3.1 Important information about the boot command

* If you do not include a boot path in the boot command, the system uses the boot environment variable as the string for the boot command.
* If you include any additional arguments, you must enter the entire string in the boot command. The system ignores the boot environment variable whenever you specify any arguments in the boot command.
* If you use any spaces or tabs in the boot environment variable, you must surround the entire value with double quotation marks. For example, to set the boot environment variable to use the mop protocol to perform a multiuser boot from the base system Ethernet controller in slot 3, enter setenv boot "3/mop -a" and press Return.
* For details about the boot command parameters for each TURBOchannel option module, refer to the documentation for the TURBOchannel option module in which you are interested.

A.2.4 cat Command

Use the cat command to display the contents of a script.
The cat command format is

  cat slot_number/script_name

* Replace slot_number with the slot number of the module that has the contents you want to display.
* Replace script_name with the name of the script for which you want to display the contents.

For example, to display the individual self-tests contained in the test-rtc test script in the base system, enter cat 3/test-rtc and press Return. The following list of the individual tests that are in the test-rtc test script then appears on the monitor:

  >>cat 3/test-rtc
  t ${#}/rtc/regs
  t ${#}/rtc/nvr
  t ${#}/rtc/period
  t ${#}/rtc/time

A.2.5 cnfg Command

Use the cnfg command to display hardware configuration information. The cnfg command format is

  cnfg [slot_number]

* To display general system configuration information, enter the cnfg command without the slot_number parameter.
* To display detailed configuration information for an individual module, replace the optional slot_number parameter with the slot number of the module for which you want a configuration display.

A.2.5.1 General system configuration displays

The following sample general system configuration display is for a system with optional NVRAM, Ethernet, and SCSI modules installed:

  >>cnfg
   3: KN03-AA  DEC  V5.0a  TCF0  (24 MB, 1 MB NVRAM)
                                 (enet: 08-00-2b-24-5b-82)
                                 (scsi = 1)
   2: PMAD-AA  DEC  V5.1f  TCF0  (enet: 08-00-2b-0f-43-31)
   1: PMAZ-AA  DEC  V5.1e  TCF0  (scsi = 7)
   0: PMAG-BA  DEC  V5.2a  TCF0  (CX -- D=8)

Each line that begins with 0, 1, 2, or 3 describes the hardware in that slot: the base system in slot 3 and the modules, if any, that are in the option slots.
* The number that begins the line is the module slot number.
* The second term is the module name.
* The third term is the module vendor.
* The fourth term is the firmware version of the module.
* The fifth term is the type of firmware that is in the module ROM chip.
* The messages in parentheses in the rightmost column provide additional information about each module.
  The meaning of each message depends on the type of module being described.
  - For the system module, the three lines in this column describe base system hardware. The first line lists the amount of memory in the system. The second line lists the address for the base system Ethernet controller. The third line lists the ID of the base system SCSI controller.
  - For TURBOchannel Ethernet controllers, the additional information is the Ethernet station address.
  - For TURBOchannel SCSI controllers, the additional information is the SCSI ID for the SCSI controller.

Individual configuration displays begin with the same line that describes that module in the general system configuration.

A.2.5.2 Base system configuration displays

To obtain a base system configuration display, enter cnfg 3 and press Return. This is a sample configuration display for the base system:

   3: KN03-AA  DEC  V5.2A  TCF0  ( 24 MB,  1 MB NVRAM)
                                 (enet: 08-00-2b-0f-45-72)
                                 (SCSI = 7)
      DEV    PID              VID   REV    SCSI DEV
      tz1                                  SEQ
      rz2    RZ58   (C) DEC   DEC   nnnn   DIR
      rz4    RRD42  (C) DEC   DEC   nnnn   CD-ROM
      dcache( 64 KB), icache( 64 KB)
      mem( 0):  a0000000:a07fffff  (  8 MB)
      mem( 1):  a0800000:a0ffffff  (  8 MB)
      mem( 2):  a1000000:a17fffff  (  8 MB)
      mem(14):  a7000000:a70fffff  (  1 MB)  Presto-NVR
      mem(14):  clean, bat ok, armed

Notice that the display begins with the same information as in the general system configuration display. The rest of the display provides details regarding the devices and memory installed in the base slot. This example shows three devices and four memory modules in the base slot.

The configuration display provides the following information about the base system devices:
* The DEV column lists the general category of the drive and its SCSI ID.
  - rz indicates that the drive is a hard disk or optical compact disc drive.
  - tz indicates that the drive is a tape drive.
  - The number at the end of the entry is the drive SCSI ID.
* The PID column lists the product ID for some types of drives.
  - The term on the left indicates the specific drive type.
  - The term on the right indicates the product manufacturer.
* The VID column lists the drive vendor.
* The REV column lists the firmware version number for the drive.
* The SCSI DEV column further describes the drive type.
  - DIR, which indicates a direct access drive, appears in entries for hard disk drives.
  - SEQ, which indicates a sequential access drive, appears in entries for tape drives.
  - CD-ROM appears in entries for optical compact disc drives.

The configuration display provides the following information about the base system memory modules:
* Memory slot number.
* Address range.
* Amount of memory in the slot. The amount can be 8 or 32 megabytes for SIMMs and 1 megabyte for an optional NVRAM module. Except for the NVRAM module, the same amount of memory should be displayed for all the slots because all of the SIMMs should be the same size. The display also reveals a mixed memory installation, but ULTRIX will not work properly with mixed memory.

A.2.5.3 Ethernet controller configuration displays

To display an Ethernet controller option module configuration, enter cnfg slot_number and press Return. Replace slot_number with the slot number of the Ethernet controller option module.

To see the base system Ethernet controller configuration, display the base system configuration display. Enter cnfg 3 and press Return. The base system Ethernet controller configuration is displayed with the other base system configuration information.

The following is a sample Ethernet controller configuration display for an Ethernet controller option module in slot 1:

   1: PMAD-AA  DEC  V5.2a  TCF0  (enet: 08-00-2b-0c-e0-d1)

The Ethernet controller configuration display has the same meaning as the Ethernet controller description in the general system configuration display. For an explanation of the Ethernet controller configuration display, see the "General System Configuration Displays" section earlier in this appendix.
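The five terms and the parenthesized message that make up the first line of a cnfg display, as described in "General System Configuration Displays", can be picked apart with a short sketch. The function name, the field names, and the regular expression are assumptions made for illustration; they describe only the line layout shown in the samples and are not part of the console firmware.

```python
import re

# One slot line: slot number, module name, vendor, firmware version,
# firmware type, and an optional message in parentheses.
CNFG_LINE = re.compile(
    r"^\s*(?P<slot>[0-3]):\s+(?P<module>\S+)\s+(?P<vendor>\S+)\s+"
    r"(?P<fw_version>\S+)\s+(?P<fw_type>\S+)\s*(?P<info>\(.*\))?\s*$"
)


def parse_cnfg_line(line):
    """Return the fields of a cnfg slot line as a dict, or None if no match."""
    match = CNFG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["slot"] = int(fields["slot"])
    return fields


sample = " 1: PMAD-AA  DEC  V5.2a  TCF0  (enet: 08-00-2b-0c-e0-d1)"
print(parse_cnfg_line(sample))
```

Continuation lines, such as the device table and the mem( ) lines of the base system display, do not begin with a slot number and a colon, so the sketch returns None for them.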
A.2.5.4 SCSI controller displays

To display a SCSI controller option module configuration, enter cnfg slot_number and press Return. Replace slot_number with the slot number of the SCSI controller option module.

To see the base system SCSI controller configuration, enter cnfg 3 and press Return. The base system SCSI controller configuration is displayed with the other base system configuration information.

The following is a sample configuration display for a SCSI controller in slot 2 that supports two hard disk drives, one optical compact disc drive, and one tape drive:

   2: PMAZ-AA  DEC  V5.2a  TCF0  (SCSI = 7)
      DEV    PID              VID   REV    SCSI DEV
      rz0    RZ58   (C) DEC   DEC   nnnn   DIR
      rz1    RZ57   (C) DEC   DEC   nnnn   DIR
      rz4    RRD42  (C) DEC   DEC   nnnn   CD-ROM
      tz5                                  SEQ

In the SCSI configuration display, the first line has the same meaning as the SCSI description in the general system configuration display. For an explanation of this first line, see the section "General System Configuration Displays" earlier in this appendix.

Lines following the first line describe drives on the SCSI bus.
* The DEV column lists the general category of the drive and its SCSI ID.
  - rz indicates that the drive is a hard disk or optical compact disc drive.
  - tz indicates that the drive is a tape drive.
  - The number at the end of the entry is the drive SCSI ID.
* The PID column lists the product ID for some types of drives.
  - The term on the left indicates the specific drive type.
  - The term on the right indicates the product manufacturer.
* The VID column lists the drive vendor.
* The REV column lists the firmware version number for the drive.
* The SCSI DEV column further describes the drive type.
  - DIR, which indicates a direct access drive, appears in entries for hard disk drives.
  - SEQ, which indicates a sequential access drive, appears in entries for tape drives.
  - CD-ROM appears in entries for optical compact disc drives.

A.2.6 d Command

Use the d command to deposit values in memory.
The d command format is

  d [-(b | h | w)] [-Scount] rng

* Use one of the optional parameters -(b | h | w) to specify whether to deposit the contents as bytes, halfwords, or words.
  - Specify -b to deposit the contents as bytes.
  - Specify -h to deposit the contents as halfwords.
  - Specify -w to deposit the contents as words.
* Include the optional -Scount parameter to store the same value more than once. Replace count with the number of times that you want the value to be stored.
* Use the rng parameter to set the range of addresses across which the values are stored.
  - To deposit values at a single address, replace rng with that address.
  - To deposit a number of values across a range of addresses, replace rng with the address range. Use the form address_low:address_high to define the range. Replace address_low with the starting address for storing values and replace address_high with the ending address for storing values.
  - To deposit values at a series of addresses, replace rng with the starting address and the number of successive addresses at which you want to store values. Use the form address_low#count to specify the addresses where you store values. Replace address_low with the starting address for storing values. Replace count with the number of values you want to store.
  - To specify more than one address range, separate the range specifications with commas. Leave no spaces between the range specifications.

A.2.7 e Command

Use the e command to examine the contents of a specific address. The e command format is

  e [-(b | h | w)] [-c] [-d] [-o] [-u] [-x] [-Scount] rng

* Use one of the optional parameters -(b | h | w) to specify whether to examine the contents as bytes, halfwords, or words.
  - Specify -b to examine the contents as bytes.
  - Specify -h to examine the contents as halfwords.
  - Specify -w to examine the contents as words.
* Specify -x to display the contents in hexadecimal format.
* Specify -0 to display the contents in octal format. * Specify -u to display the contents in unsigned decimal format. * Specify -d to display the data in decimal format. * Specify -¢ to display the data as ASCII characters. Include the optional -Scount parameter to have the command repeatedly fetch the value, but display the value only once. When you enter this parameter, replace count with the number of times that you want to fetch the value. ¢ Use the rng parameter to specify the range of addresses you want to examine. To examine values at a single address, replace rng with that address. To examine values at a range of addresses, replace rng with the address range. Use the form address_low:address_high to define the range. Replace address_low with the starting address for storing values and replace address_high with the ending address for storing values. To examine values at a series of addresses, replace rng with the starting address and the number of successive addresses you want to examine. Use the form address low#count to specify the addresses where you store values. Replace address_ low with the starting address for storing values. Replace count with the number of addresses at which you want to store values. To specify more than one address range, separate each range specification with commas. Leave no spaces between the ranges. A-14 CPU System Manual A.2.8 erl Command Use the er] command to display or clear the log of the errors that occurred since the most recent power-up or reset operation. When the buffer that holds these error log fills up, no further errors are recorded. If you intend to run tests and use these logs for information, use the erl -c command to clear the logs first. The erl command format is erl [-c] ¢ To display the current error message log, use the erl command without the -c option. ¢ To clear the error message log, include the -¢ option. 
When the error log buffer is full, no more messages are added until the buffer is cleared by the erl -c command.

A.2.9 go Command

Use the go command to transfer system control to a specific system address.

The go command format is

    go [address]

* To transfer system control to the address specified in the last boot -n command, enter the go command without the address parameter. If you omit the address parameter, and if no previous boot -n command has been issued, the system ignores the go command.
* To transfer system control to the contents of a specific address, include the address parameter. Replace address with the address to which you want to transfer control.

A.2.10 init Command

Use the init command to initialize module hardware.

The init command format is

    init [slot_number] [-m]

* To initialize the entire system, specify the init command with no additional arguments.
* To initialize an individual module, replace the optional slot_number parameter with the slot number of the module that you want to initialize.
* If you perform an init operation on the system module (slot 3), include the optional -m parameter to zero all memory modules in the system module.

A.2.11 ls Command

Use the ls command to list the scripts and other objects that are in system ROM.

The ls command format is

    ls [slot_number]

To display a list of scripts or other objects that are available in an individual module, replace the optional slot_number parameter with the slot number of the module that contains the files you want to display.

This sample display is a portion of the ls display for the base system in slot 3:

    >>ls 3
     28 1 cnfg -> code
     28 1 boot -> code
     24 1 rst-q -> rst
     24 1 rst-t -> rst
     28 1 rst-m -> powerup
     32 1 test-ni-m -> test-ni-t
     28 1 init
    304 1 powerup
     44 1 reset
     36 1 halt-r -> code
     28 1 halt-b
    192 1 pst-m
    272 1 pst-q
    196 1 pst-t
     96 1 tech
    156 1 test

The third column lists the names of the scripts and other objects in the module ROM in the specified slot.
A.2.12 passwd Command

Use the passwd command to enter, set, or clear a password.

The passwd command format is

    passwd [-c] [-s]

If the console prompt is R>, you can use only the boot and passwd commands until you enter the correct password.

* To enter an existing password, enter the passwd command without any additional parameters. At the pwd: prompt, enter the password. Then press Return. After you enter the correct password, or if the system does not require a password, the system displays the console prompt >>. You can use all console commands whenever the console prompt is >>.
* To clear an existing password, include the -c parameter when you enter the passwd command. First use the passwd command to enter the existing password. After the console prompt >> appears, enter passwd -c and press Return. The system then removes the password requirement.
* To set a new password, include the -s parameter. Enter the new password at the pwd: prompt. When the pwd: prompt appears a second time, enter the password again. If the two password entries match, the system sets the new value as the password.
* The password must have at least six and no more than 32 characters. The system treats uppercase and lowercase letters as different characters.

A.2.13 printenv Command

Use the printenv command to display the list of environment variables.

The printenv command format is

    printenv [variable]

* To display the entire environment variable table, omit the optional variable parameter.
* To display an individual environment variable, replace variable with the name of the environment variable you want to display.

A.2.14 restart Command

Use the restart command to restart the system software. For the restart operation to succeed, the operating system software must have a restart block set up in memory.

The restart command format is

    restart

A.2.15 script Command

Use the script command to create a temporary set of console commands that run in an order that you specify.
The script command format is

    script name

Replace name with the name that you are giving the script.

After you press Return, enter the commands that you want to include in the script. Press Return after each command that you enter. The commands can be t or sh commands. Enter one command per line.

When you finish typing the commands that you are including, press Ctrl-d or press Return twice to complete the script.

To run the script, use the sh command described later in this appendix. When you run the script, the commands execute in the order in which you entered them when you created the script.

A.2.16 setenv Command

Use the setenv command to change an environment variable. Table A-2 lists the standard environment variables. When you change a standard environment variable (except osconsole and #), the system stores the new value in NVR and uses it until you use the setenv command to change it again or reset the NVR with the clear-NVR jumper.

The setenv command format is

    setenv variable value

When you enter the setenv command,

* Replace variable with the name of the environment variable you want to set.
* Replace value with the new value that you want to assign to the environment variable. Note that if the new value contains blank spaces or tabs, you must use double quotation marks (") at the beginning and end of the value.

Table A-2: Environment Variables in the Environment Variable Display

Environment Variable: boot
Description: Sets the default boot path. See the "boot Command" and "unsetenv Command" sections in Appendix C.

Environment Variable: console
Description: Selects the system console.
  - Set the console variable to s to enable the terminal connected to the left comm connector (viewed from the back) as the active console.

Environment Variable: EPAWS
Description: Specifies the way the system responds when a diagnostic test finds an error.
  - Set the EPAWS variable to EPAWS to cause the system to pause when a diagnostic test finds an error. Press any key to continue testing.
  - If the EPAWS variable is set to any value other than EPAWS, the system does not pause when an error occurs.

Environment Variable: haltaction
Description: Specifies the way the system responds when it halts.
  - Set the haltaction variable to b to cause the console to boot after the console performs the appropriate initialization and self-tests.
  - Set the haltaction variable to h to cause the console to halt and attempt no other action.
  - Set the haltaction variable to r to cause the console to restart and then attempt to boot if the restart operation fails.

Environment Variable: more
Description: Sets the way the screen scrolls lines of text.
  - Set the more variable to 0 to have text scroll to the end before stopping.
  - Set the more variable to a number other than zero to have scrolling pause after that number of lines has been displayed.

Environment Variable: osconsole
Description: Contains the slot numbers of the console drivers. If a TTY driver from slot x serves as the console, osconsole is set to x. If a CRT driver from slot y and a kbd driver from slot z serve as the console, osconsole is set to y,z. Although the environment variable display includes the osconsole setting, you cannot set this variable. The system automatically sets the osconsole value.

Environment Variable: testaction
Description: Specifies the type of power-up self-test that the system runs.
  - Specify q to run a quick test when the power-up self-test runs.
  - Specify t to run a thorough test when the power-up self-test runs.

Environment Variable: #
Description: Specifies the slot number of the module that contains the current script. If no script is active, the system specifies the base system module, slot number 3. Although the environment variable display includes the # setting, you cannot set this variable.

A.2.17 sh Command

Use the sh command to run a script.

The sh command format is

    sh [-b] [-e] [-l] [-v] [-S] [slot_number/script] [arg...]
* Include the optional -b parameter to execute the script directly, instead of through a subshell.
* Include the optional -e parameter to stop the script if an error occurs.
* Include the optional -l parameter to have the script loop until you press Ctrl-c.
* Include the optional -v parameter to echo the script to the console as the test runs.
* Include the optional -S parameter to suppress any error messages if the script is not found.
* To test a specific module with a specific script, include the optional slot_number/script parameter.
  - Replace slot_number with the slot number of the module that has the script you want to run.
  - Replace script with the name of the script you want to run.
* The use of any additional arguments depends on the particular script you specified in the sh command.

You can also run a script by typing slot_number/script. Replace slot_number with the slot number of the module that contains the script you want to run. Replace script with the name of the script you want to run.

For example, to run the thorough power-up self-test script for a SCSI controller and drives in slot 2, enter sh 2/pst-t and press Return.

A.2.18 t Command

Use the t command to run individual tests.

The t command format is

    t [-l] slot_number/test_name [arg...]

* Include the optional -l parameter to have the test loop until you press Ctrl-c or reset the system.
* Replace slot_number with the slot number of the module that you want to test.
* Replace test_name with the name of the individual test you want to run.
* The uses of any additional arguments depend on the particular test you are running. For an explanation of the additional arguments used in an individual test, see Appendix B.

To display the name and format of all individual tests for a module, enter t slot_number/? and press Return. Replace slot_number with the slot number of the module for which you want to display tests.
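The t command format above can be illustrated with a small helper that assembles a command line before it is typed at the console. This is only a sketch: build_t_command is a hypothetical name, not part of the console firmware, and the helper does no validation of slot numbers or test names.

```python
def build_t_command(slot_number, test_name, *args, loop=False):
    """Assemble a console line of the form: t [-l] slot_number/test_name [arg...].

    Hypothetical helper for illustration only; it formats text and does not
    talk to the console.
    """
    parts = ["t"]
    if loop:
        parts.append("-l")  # loop until Ctrl-c or a system reset
    parts.append(f"{slot_number}/{test_name}")
    parts.extend(str(arg) for arg in args)  # test-specific arguments, if any
    return " ".join(parts)


if __name__ == "__main__":
    print(build_t_command(3, "cache/data", "D", "80500000"))
    print(build_t_command(3, "fpu", loop=True))
```

Run as a script, this prints the two command lines t 3/cache/data D 80500000 and t -l 3/fpu, which could then be entered at the >> prompt.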
A.2.19 test Command

Use the test command to run a thorough test of all system hardware.

The test command format is

    test

A.2.20 unsetenv Command

Use the unsetenv command to remove an environment variable. The unsetenv command removes a standard environment variable (except osconsole and #) during the current session only. When the system is reset, reinitialized, or powered up, the values of the standard environment variables revert to their previously set values. Table A-2 in this appendix lists the standard environment variables.

The unsetenv command format is

    unsetenv variable

When you enter the unsetenv command, replace variable with the name of the environment variable that you want to remove.

NOTE: To clear the boot environment variable, use the setenv command. Enter setenv boot "" and press Return.

A.3 Console Command Error Messages

Table A-3 lists the error messages that the console commands can return.

Table A-3: Console Command Error Messages

Error Message: ?EV: ev_name
Meaning: The specified environment variable does not exist.

Error Message: ?EVV: value
Meaning: The specified environment variable value is invalid.

Error Message: ?IO: slot_number/device
Meaning: An I/O device reported an error; slot_number represents the I/O device slot number, and device represents an additional message about the error.

Error Message: ?IO: slot_number/device
Meaning: The module in the slot represented by slot_number does not recognize the device represented by device.

Error Message: ?PDE3: slot_number
Meaning: The module in the slot represented by slot_number contains an early version of the firmware. The ROM chip must be upgraded.

Error Message: ?SNF: script
Meaning: The system did not find the script that was to be run.

Error Message: ?TXT:
Meaning: The name specified in the script command is not a valid script name.

Error Message: ?STF (4: Ln#0 Pntr self test)
Meaning: A pointing-device self-test failed. This is an information message and does not prevent the system from automatically booting.

Error Message: ?STX: usage
Meaning: A console command contained a syntax error. The usage parameter lists the correct syntax.

Error Message: ?STX: error
Meaning: A console command contained a syntax error. The error parameter lists the incorrect portion of the command.

Error Message: ?TFL: slot_number/test
Meaning: A test failed; slot_number represents the slot number of the module that reported the error. test represents the name of the failed test.

Error Message: ?TNF:
Meaning: The specified test could not be found. The test name was probably entered incorrectly.

Appendix B
Base System Self-Test Commands and Error Messages

This appendix describes commands and messages for the following base system tests:

* System module tests
* CPU module tests
* Memory module tests
* Base system SCSI controller tests
* Base system Ethernet controller tests
* Initial power-up tests

B.1 Locating Individual Tests in This Appendix

When an individual test fails, the name of the test appears in the error message. For details of each base system test, see the section in this appendix that describes the test and its error messages. The tests are listed in alphabetical order.

When troubleshooting the system, you can use the t command to run any single test when the console prompt (>>) appears. You can also write a test script to run a group of individual tests. See the "t Command" and "script Command" sections in Appendix A for more information.

To help you select the individual tests that apply to a problem that you are troubleshooting, Table B-1 lists the individual base system module tests grouped by the function that they test.
Table B-1: Base System Module Tests and Utilities

Test or Utility                                       Command

Base System Module Tests
Halt button test                                      t 3/misc/halt [number]
Overheat detect test                                  t 3/misc/pstemp
Nonvolatile RAM (NVR) test                            t 3/rtc/nvr [pattern]
Real-time clock period test                           t 3/rtc/period
Real-time clock registers test                        t 3/rtc/regs
Real-time test                                        t 3/rtc/time

Serial Communication Tests
SCC DMA test                                          t 3/scc/dma [line] [loopback] [baud]
SCC interrupts test                                   t 3/scc/int [line]
SCC input/output (I/O) test                           t 3/scc/io [line] [loopback]
SCC pins test                                         t 3/scc/pins [line] [attachment]
SCC transmit and receive test                         t 3/scc/tx-rx [line] [loopback] [baud] [parity] [bits]

CPU Module Tests
Cache data test                                       t 3/cache/data [cache] [address]
Cache fill test                                       t 3/cache/fill [cache] [offset]
Cache isolate test                                    t 3/cache/isol [cache]
Cache reload test                                     t 3/cache/reload [cache] [offset]
Cache segment test                                    t 3/cache/seg [cache] [address]
CPU-type utility                                      t 3/misc/cpu-type
Floating-point unit (FPU) test                        t 3/fpu
Translation lookaside buffer (TLB) probe test         t 3/tlb/prb
TLB registers test                                    t 3/tlb/reg [pattern]

Memory Module Tests
Error correction coding (ECC) correction test         t 3/ecc/cor [address]
Memory module test                                    t 3/mem [module] [threshold] [pattern]
Floating 1/0 memory test                              t 3/mem/float10 [address]
Zero memory utility                                   t 3/mem/init
RAM select lines test                                 t 3/mem/select
Partial write test                                    t 3/misc/wbpart

NVRAM Prcache Tests
Prcache quick test                                    t 3/prcache
Disable NVRAM battery                                 t 3/prcache arm
Enable NVRAM battery                                  t 3/prcache unarm

Ethernet Controller Tests
Collision test                                        t 3/ni/cllsn
Cyclic redundancy code (CRC) test                     t 3/ni/crc
Display maintenance operation protocol (MOP)
  counters utility                                    t 3/ni/ctrs
Ethernet-direct memory access (DMA) registers test    t 3/ni/dma1
Ethernet-DMA transfer test                            t 3/ni/dma2
Ethernet station address ROM (ESAR) test              t 3/ni/esar
External loopback test                                t 3/ni/ext-lb
Interrupt request (IRQ) test                          t 3/ni/int
Internal loopback test                                t 3/ni/int-lb
Multicast test                                        t 3/ni/m-cst
Promiscuous mode test                                 t 3/ni/promisc
Registers test                                        t 3/ni/regs

SCSI Tests
SCSI controller test                                  t 3/scsi/cntl
SCSI send diagnostics test                            t 3/scsi/sdiag scsi_id [d] [u] [s]
SCSI target test                                      t 3/scsi/target scsi_id [w] [l loops]

B.2 Tests

The following sections explain the commands, parameters, and error messages for each base system module test. The tests are presented in alphabetical order.

B.2.1 cache/data - Cache Data Test

The cache data test writes data patterns to the cache and then reads them. To run the cache data test, enter

    t 3/cache/data [cache] [address]

and press Return.

When you enter the cache data test command,

* Replace cache with a value that specifies the cache you want to test.
  - Specify I to test the instruction cache.
  - Specify D to test the data cache. The default value is D.
* Replace the optional address parameter with the specific cache address where you want the test to start. Using the address parameter requires familiarity with the firmware specifications. The default address is 80500000.

B.2.1.1 Cache data test error messages

Cache data test error messages have the form

    ?TFL: 3/cache/data (code[: address=actual, sb expected])

* ?TFL: 3/cache/data indicates that the cache data test reported an error.
* code represents a number that identifies the portion of the test that failed.
* The optional address=actual, sb expected phrase indicates the actual and expected values in the cache.
  - address represents the address where the error occurred.
  - actual represents the actual value at that address.
  - expected represents the expected value for that address.

Table B-2 lists the codes used in cache data test error messages.
Table B-2: Cache Data Test Error Codes

Error Code   Description
1            Error occurred writing data pattern to cache RAM.
2            Cache parity error occurred while test was reading floating 1.
3            Error occurred when test read data pattern in cache.
4            Cache parity error occurred while test was reading floating 0.
5            Error occurred when test wrote address complement to cache RAM.
6            Cache parity data error occurred.
7            Error occurred reading address complement.
8            Cache address read caused a parity error.

B.2.2 cache/fill - Cache Fill Test

The cache fill test writes rotating data patterns to memory in spans that are twice the size of the cache and then reads the patterns. To run the cache fill test, enter

    t 3/cache/fill [cache] [offset]

and press Return.

When you enter the cache fill test command,

* Replace cache with a value that specifies the cache you want to test.
  - Specify I to test the instruction cache.
  - Specify D to test the data cache. The default value is D.
* Replace offset with a specific cache address where you want the test to start. The default address is 80500000.

B.2.2.1 Cache fill test error messages

Cache fill test error messages have the form

    ?TFL: 3/cache/fill (description)

* ?TFL: 3/cache/fill indicates that the cache fill test reported an error.
* description represents an additional message that describes the error.

Table B-3 lists the descriptions used in cache fill test error messages.

Table B-3: Cache Fill Test Error Descriptions

Error Description: (PE)
Meaning: Unexpected parity error occurred.

Error Description: (address= actual, sb expected)
Meaning: Data pattern read reported a miscompare. address represents the address where the miscompare occurred. actual represents the actual value at that address. expected represents the expected value for that address.

Error Description: (PE @ address (C))
Meaning: Parity error occurred. The address parameter lists the address where the error occurred.
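When collecting console output in a log, the three description forms in Table B-3 can be told apart mechanically. The following Python sketch shows one way to do that. It rests on assumptions that the manual does not confirm: that addresses and values print as hexadecimal digits, that the trailing (C) is literal text, and that the sample message strings in the usage note are representative. parse_cache_fill_error is a hypothetical name.

```python
import re

HEX = r"[0-9A-Fa-f]+"  # assumption: addresses and values are hexadecimal digits

# One pattern per row of Table B-3, applied to the parenthesized description.
_DESCRIPTIONS = [
    ("unexpected parity error", re.compile(r"^\(PE\)$")),
    ("parity error at address",
     re.compile(rf"^\(PE @ (?P<address>{HEX}) \(C\)\)$")),
    ("data miscompare",
     re.compile(rf"^\((?P<address>{HEX})\s*=\s*(?P<actual>{HEX}),"
                rf"\s*sb\s+(?P<expected>{HEX})\)$")),
]

PREFIX = "?TFL: 3/cache/fill"


def parse_cache_fill_error(line):
    """Classify a cache fill test error message; return None for other lines."""
    line = line.strip()
    if not line.startswith(PREFIX):
        return None
    description = line[len(PREFIX):].strip()
    for kind, pattern in _DESCRIPTIONS:
        match = pattern.match(description)
        if match:
            return {"kind": kind, **match.groupdict()}
    return {"kind": "unrecognized", "text": description}
```

For example, under these assumptions a line such as ?TFL: 3/cache/fill (80500000=00000000, sb 55555555) would be reported as a data miscompare with its address, actual, and expected fields separated.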
B.2.3 cache/isol - Cache Isolate Test

The cache isolate test isolates data patterns to the cache and then reads and compares them. To run the cache isolate test, enter

    t 3/cache/isol [cache]

and press Return.

When you enter the cache isolate test command, replace cache with a value that specifies the cache you want to test.

* Specify I to test the instruction cache.
* Specify D to test the data cache. The default value is D.

B.2.3.1 Cache isolate test error messages

Cache isolate test error messages have the form

    ?TFL: 3/cache/isol (code[: address=actual, sb expected])

* ?TFL: 3/cache/isol indicates that the cache isolate test reported an error.
* code represents a number that identifies the portion of the test that failed.
* The optional phrase address=actual, sb expected indicates the actual and expected values at the address where the error occurred.
  - address represents the address where the error occurred.
  - actual represents the actual value at that address.
  - expected represents the expected value at that address.

Table B-4 lists the codes used in cache isolate test error messages.

Table B-4: Cache Isolate Test Error Codes

Error Code   Description
1            Reading 00000000 pattern resulted in a cache parity error.
2            Reading 00000000 pattern resulted in a cache miss error.
3            Reading 00000000 pattern returned a data miscompare.
4            Reading 55555555 pattern resulted in a cache parity error.
5            Reading 55555555 pattern resulted in a cache miss error.
6            Reading 55555555 pattern resulted in a data miscompare.
7            Reading AAAAAAAA pattern resulted in a cache parity error.
8            Reading AAAAAAAA pattern resulted in a cache miss error.
9            Reading AAAAAAAA pattern resulted in a data miscompare.
10           Reading data address pattern resulted in a cache parity error.
11           Reading data address pattern resulted in a parity error.
12           Reading data address pattern returned a miscompare error.
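The codes in Table B-4 lend themselves to a simple lookup when reviewing logged test output. The Python sketch below pairs the table with a parser for the message form given above. It assumes that the code is decimal, that the address and values are hexadecimal digits, and that the address phrase may be absent; explain_isol_error and the sample messages are hypothetical.

```python
import re

# Descriptions copied from Table B-4 (code 11 is kept exactly as the table words it).
ISOL_CODES = {
    1: "Reading 00000000 pattern resulted in a cache parity error.",
    2: "Reading 00000000 pattern resulted in a cache miss error.",
    3: "Reading 00000000 pattern returned a data miscompare.",
    4: "Reading 55555555 pattern resulted in a cache parity error.",
    5: "Reading 55555555 pattern resulted in a cache miss error.",
    6: "Reading 55555555 pattern resulted in a data miscompare.",
    7: "Reading AAAAAAAA pattern resulted in a cache parity error.",
    8: "Reading AAAAAAAA pattern resulted in a cache miss error.",
    9: "Reading AAAAAAAA pattern resulted in a data miscompare.",
    10: "Reading data address pattern resulted in a cache parity error.",
    11: "Reading data address pattern resulted in a parity error.",
    12: "Reading data address pattern returned a miscompare error.",
}

HEX = r"[0-9A-Fa-f]+"  # assumption: hexadecimal digits without a prefix

_MESSAGE = re.compile(
    rf"^\?TFL:\s*3/cache/isol\s*\((?P<code>\d+)"
    rf"(?::\s*(?P<address>{HEX})\s*=\s*(?P<actual>{HEX}),"
    rf"\s*sb\s+(?P<expected>{HEX}))?\)$"
)


def explain_isol_error(line):
    """Return the code, its Table B-4 meaning, and any address details."""
    match = _MESSAGE.match(line.strip())
    if match is None:
        return None  # not a cache isolate test error message
    code = int(match.group("code"))
    return {
        "code": code,
        "meaning": ISOL_CODES.get(code, "Code not listed in Table B-4."),
        "address": match.group("address"),    # None when the phrase is absent
        "actual": match.group("actual"),
        "expected": match.group("expected"),
    }
```

Keeping the table as literal text, instead of deriving each entry from a pattern and failure type, avoids silently regularizing entries such as code 11 that do not follow the wording of their neighbors.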
B.2.4 cache/reload - Cache Reload Test

The cache reload test writes rotating-parity data patterns to memory and then reads the patterns. To run the cache reload test, enter

    t 3/cache/reload [cache] [offset]

and press Return.

When you enter the cache reload test command,

* Replace cache with a value that specifies the cache you want to test.
  - Specify I to test the instruction cache.
  - Specify D to test the data cache. The default value is D.
* Replace offset with a specific cache address where you want the test to start. The default address is 80500000.

B.2.4.1 Cache reload test error messages

Cache reload test error messages have the form

    ?TFL: 3/cache/reload (description)

* ?TFL: 3/cache/reload indicates that the cache reload test reported an error.
* description represents an additional message that describes the error.

Table B-5 lists the descriptions used in cache reload test error messages.

Table B-5: Cache Reload Test Error Descriptions

Error Description: (PE)
Meaning: Unexpected parity error occurred.

Error Description: (address= actual, sb expected)
Meaning: Data pattern read reported a miscompare. address represents the address where the miscompare occurred. actual represents the actual value at that address. expected represents the expected value for that address.

Error Description: (PE @ address (C))
Meaning: Parity error occurred. The address parameter lists the address where the error occurred.

B.2.5 cache/seg - Cache Segment Test

The cache segment test checks individual cache segments. To run the cache segment test, enter

    t 3/cache/seg [cache] [address]

and press Return.

When you enter the cache segment test command,

* Replace the optional cache parameter with a value that specifies the cache you want to test.
  - Specify D to test the data cache. The default value is D.
  - Specify I to test the instruction cache.
* Replace address with a specific address you want to test. The default address is 80500000.
Note that using the optional address parameter correctly requires thorough knowledge of the firmware specifications.

B.2.5.1 Cache segment test error messages

Cache segment test error messages have the form

    ?TFL: 3/cache/seg (code: description)

* ?TFL: 3/cache/seg indicates that the cache segment test reported an error.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-6 lists the codes and descriptions used in cache segment test error messages.

Table B-6: Cache Segment Test Error Codes and Descriptions

Error Code and Description: (1: address=xxxxxxxx, sb yyyyyyyy)
Meaning: Error occurred when the system tried to read the cache contents. The value shown after address= is the actual value at the given address. The correct value follows sb.

Error Code and Description: (2: address=xxxxxxxx, sb yyyyyyyy)
Meaning: Error occurred when the system tried to read the memory contents. The value shown after address= is the actual value at the given address. The correct value follows sb.

Error Code and Description: (3: address=xxxxxxxx, sb yyyyyyyy)
Meaning: Error occurred when the system performed a read and write operation on the uncached memory. The value shown after address= is the actual value at the given address. The correct value follows sb.

Error Code and Description: (4: address=xxxxxxxx, sb yyyyyyyy)
Meaning: Cache data was inconsistent. The value shown after address= is the actual value at the given address. The correct value follows sb.

B.2.6 ecc/cor - Error Correction Coding (ECC) Correction Test

The error correction coding (ECC) correction test writes data patterns XOR'd with floating ones to create single-bit errors and then checks to see whether the error was detected and corrected. To run the ECC correction test, enter

    t 3/ecc/cor [address]

and press Return. Replace address with a specific address you want to test. The default address is A0010000.
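The phrase "data patterns XOR'd with floating ones" can be made concrete with a short Python sketch. It shows only how a single set bit, moved across the word, flips exactly one bit of a pattern per step; it does not model the ECC logic or the firmware test itself. The 32-bit width and the 55555555 pattern are assumptions chosen for illustration.

```python
def floating_ones(width=32):
    """Return the masks 0x1, 0x2, 0x4, ...: a single 1 bit at each position."""
    return [1 << bit for bit in range(width)]


def inject_single_bit_errors(pattern, width=32):
    """XOR the pattern with each floating-one mask, flipping one bit per result."""
    word_mask = (1 << width) - 1
    return [(pattern ^ one) & word_mask for one in floating_ones(width)]


if __name__ == "__main__":
    pattern = 0x55555555  # illustrative pattern, not taken from the ECC test
    corrupted = inject_single_bit_errors(pattern)
    # Every corrupted word differs from the original in exactly one bit.
    assert all(bin(word ^ pattern).count("1") == 1 for word in corrupted)
    print(f"{corrupted[0]:08X} {corrupted[1]:08X}")  # prints "55555554 55555557"
```

Each of the 32 results is a word that single-bit ECC hardware would be expected to detect as faulty and restore to the original pattern, which is what the test checks.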
B.2.6.1 ECC correction test error messages

ECC correction test error messages have the form

    ?TFL: 3/ecc/cor (code: description)

* ?TFL: 3/ecc/cor indicates that the ECC correction test reported an error.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-7 lists the codes and descriptions used in ECC correction test error messages.

Table B-7: ECC Correction Test Error Codes and Descriptions

Error Code and Description   Meaning
(1: xxxxxxxx vd err)         [KN03-AA] Cannot read and write location with good data.
(2: sbe not det)             Single-bit error in the data is not detected.
(3: sbe not cor)             Single-bit error in the data is not corrected.

B.2.7 fpu - Floating-Point Unit Test

The floating-point unit (FPU) test uses the FPU to perform simple arithmetic and compares the result to known values. To run the FPU test, enter

    t 3/fpu

and press Return.

B.2.7.1 FPU test error messages

FPU test error messages have the form

    ?TFL: 3/fpu (code)

* ?TFL: 3/fpu indicates that the FPU test reported an error.
* code represents a number that identifies the portion of the test that failed.

Table B-8 lists the codes used in FPU test error messages.

Table B-8: FPU Test Error Codes

Error Code   Meaning
1            Values did not match. Value should be 00000000.
2            Values did not match. Value should be 55555555.
3            Values did not match. Value should be AAAAAAAA.
4            Values did not match. Value should be FFFFFFFF.
5            Least-significant bit failed when the system was converting doubleword to word (CVT.D.W).
6            Most-significant bit failed when the system was converting doubleword to word (CVT.D.W).
7            Double miscompare occurred: n+n-n!=n
8            Double miscompare occurred: n+n==n
9            Convert float-double. Value should be 55555555.
10           FPU CSR double error occurred.
11           Single miscompare occurred: n+n-n!=n
12           Single miscompare occurred: n+n==n
13           Convert float-double. Value should be 55555555.
14           FPU CSR single error occurred. Value should be 06000000.
15           Single division failed. Value should be 00005555.
16           Single multiplication failed.
17           Double multiplication failed.
18           Double division failed.
19           Conversion error occurred. Pattern readback did not match.
21           FPU did not trap on overflow exception.
22           Did not get FPU interrupt.

B.2.8 mem - Memory Module Test

The memory module test performs a full pattern test on an entire SIMM or NVRAM memory module. To run the memory module test, enter

    t 3/mem [module] [threshold] [pattern]

and press Return.

When you enter the memory module test command,

* Include the module parameter to specify the memory module you want to test.
  - To test one memory module, specify the slot number of the memory module that you want to test. The default value is 0.
  - To test all memory modules, specify an asterisk (*).
* Replace the optional threshold parameter with the number of single-bit errors the test allows before the test fails. The default threshold is 10.
* Replace the optional pattern parameter with a specific pattern that you want to use in the test. The default pattern is 55555555.

B.2.8.1 Memory module test error messages

Set the verbose environment variable to 1 to see compare error messages in the following form:

    ?TFL:3/mem @ address=actual, expected

* ?TFL:3/mem @ indicates that the memory test reported a compare error.
* address represents the address at which the error occurred.
* actual represents the value at that address.
* expected represents the expected value at that address.
If the verbose environment variable is not set, the error messages appear in the following formats:

    ?TFL:3/mem (1: board L, MBE=M, SBE=N)
    ?TFL:3/mem (2: board L, too many SBEs:N)

where L represents the slot number of the failed memory module, M represents the number of multibit errors that occurred, and N represents the number of single-bit errors that occurred.

B.2.9 mem/float10 - Floating 1/0 Memory Test

The floating 1/0 memory test writes floating 1 and floating 0 across one location in RAM or NVRAM. To run the floating 1/0 memory test, enter

    t 3/mem/float10 [address]

and press Return. When you enter the floating 1/0 memory test command, replace the optional address parameter with a specific address at which you want to start writing 1s. The default address is A0100000.

B.2.9.1 Floating 1/0 memory test error messages

If a RAM module is tested, the floating 1/0 memory test error message is

    ?TFL: 3/mem/float10 (Err= N)

where N represents the number of errors the memory module reported.

If an NVRAM module is tested and the module contains valid data, the floating 1/0 memory test error message is

    ?TFL: 3/mem/float10 (1: tst nocomp)

B.2.10 mem/init - Zero Memory Utility

The zero memory utility floods RAM and NVRAM modules with zeros as fast as possible. If the NVRAM module contains valid data, only the scratch area is cleared. To run the zero memory utility, enter

    t 3/mem/init

and press Return.

The zero memory utility returns no error codes.

B.2.11 mem/select - RAM Select Lines Test

The RAM select lines test checks for RAM select line faults by performing a read and write operation on one location in each memory module. To run the RAM select lines test, enter

    t 3/mem/select

and press Return.
B.2.11.1 RAM select lines test error messages

The only RAM select test error message is

    ?TFL: 3/mem/select (address=actual, sb expected)

* ?TFL: 3/mem/select indicates that the RAM select lines test reported an error.
* address represents the memory address where the error occurred.
* actual represents the actual value at the listed address.
* expected represents what the value at the listed address should be.

B.2.12 misc/cpu-type - CPU-Type Utility

The CPU-type utility displays a message that identifies the CPU type. To run the CPU-type utility, enter

    t 3/misc/cpu-type

and press Return.

B.2.12.1 CPU-type utility messages

The CPU-type utility message has the form

    3/misc/cpu-type's code: NDX-type

where type represents a code that indicates the type of CPU module installed in the system. For example, the code NDX-129a identifies the KN03-GA CPU module (40-MHz).

B.2.13 misc/halt - Halt Button Test

The halt button test checks whether the halt button is connected and can generate an interrupt. To run the halt button test, enter

    t 3/misc/halt [number]

and press Return.

When you enter the halt button command, replace number with the number that specifies the type of test you want to run.

* Specify 0 to check whether the halt button is pressed. If you specify 0 and the button is pressed when the test runs, the test reports an error. The default value is 0.
* Specify a number from 1 to 9 to check whether the button responds when pressed. Press the button the same number of times as the number you specify in the test command.

B.2.13.1 Halt button test error messages

There are two halt button test error messages.

* ?TFL: 3/misc/halt (1:SIR=xxxxxxxx)
  This message indicates that the halt button is pressed in. xxxxxxxx represents the value in the system interrupt register.
* ?TFL: 3/misc/halt (2: invld bits: SIR=xxxxxxxx)
  This message indicates that the system interrupt register contains an impossible combination of halt-button bits. xxxxxxxx represents the value in the system interrupt register.

B.2.14 misc/pstemp - Overheat Detect Test

The overheat detect test checks whether the power supply is overheating. To run the overheat detect test, enter

    t 3/misc/pstemp

and press Return.

B.2.14.1 Overheat detect test error message

When the overheat detect test fails, the following error message is displayed:

    ?TFL: 3/misc/pstemp (system is *HOT*)

This message indicates that the system is overheating.

B.2.15 misc/wbpart - Partial Write Test

The partial write test writes to a specific memory address and then checks whether the written values are correct. To run the partial write test, enter

    t 3/misc/wbpart

and press Return.

B.2.15.1 Partial write test error messages

Partial write test error messages have the form

    ?TFL: 3/misc/wbpart (code)

* ?TFL: 3/misc/wbpart indicates that the partial write test reported an error.
* code represents a number that identifies the portion of the test that failed.

Table B-9 lists the codes used in partial write test error messages.

Table B-9: Partial Write Test Error Codes

Error Code  Meaning
1           Pattern that was read showed mismatch on word access.
2           Byte 0 failed partial byte write.
3           Byte 1 failed partial byte write.
4           Byte 2 failed partial byte write.
5           Byte 3 failed partial byte write.
6           Halfword 0 failed partial halfword write.
7           Halfword 1 failed partial halfword write.

B.2.16 ni/cllsn - Collision Test

The collision test checks Ethernet collision detect circuitry by forcing a collision on transmission. To run the collision test, enter

    t 3/ni/cllsn

and press Return.
B.2.16.1 Collision test error messages

Collision test error messages have the form

    ?TFL: 3/ni/cllsn (code:description)

* ?TFL: 3/ni/cllsn indicates that the collision test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-10 lists the codes and descriptions used in collision test error messages.

Table B-10: Collision Test Error Codes and Descriptions

Error Code and Description
    Meaning

3: cllsn not dtctd
    Ethernet controller chip failed to detect an Ethernet collision.
4: xmt [x]
    Ethernet controller chip transmission failed.
6: LANCE-init [x]
    Ethernet controller chip failed to initialize.

B.2.17 ni/common - Common Diagnostic Utilities

The common diagnostic utilities are run by Ethernet controller tests. You cannot run these diagnostic utilities by themselves.

B.2.17.1 Common diagnostic utility error messages

Common diagnostic utility error messages have the form

    ?TFL: 3/ni/test name (code:description)

* ?TFL: 3/ni indicates that a common diagnostic utility detected an error.
* test name represents the name of the test in which the diagnostic utility detected an error.
* code represents a number that identifies the utility that generated the error message.
* description represents additional information that describes the error.

Table B-11 lists the codes and descriptions used in common diagnostic utility error messages.

Table B-11: Common Diagnostic Utility Error Codes and Descriptions

Error Code: Description
    Meaning

700: Invld param frmt [xxxxx]
    Parameter xxxxx was not in a valid format.
901: err hltng LANCE
    STOP bit did not halt the Ethernet controller chip.
902: LANCE-init timeout
    Timeout occurred when the system tried to initialize the Ethernet controller chip.
903: LANCE-start timeout
    Timeout occurred waiting for the Ethernet controller chip to start.
904: err initing LANCE
    Utility could not initialize the Ethernet controller chip.
905: LANCE-stop timeout
    Timeout occurred in the Ethernet controller chip.
906: err initing LANCE
    I/O system failure occurred during Ethernet controller chip initialization.

B.2.18 ni/crc - Cyclic Redundancy Code Test

The cyclic redundancy code (CRC) test checks the Ethernet CRC verification and bad CRC detection abilities. To run the CRC test, enter

    t 3/ni/crc

and press Return.

B.2.18.1 CRC test error messages

CRC test error messages have the form

    ?TFL: 3/ni/crc (code:description)

* ?TFL: 3/ni/crc indicates that the CRC test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-12 lists the codes and descriptions used in CRC test error messages. In each entry, x represents a pass or fail code returned by one of the utilities that the test uses.

Table B-12: CRC Test Error Codes and Descriptions

Error Code and Description
    Meaning

2: LANCE-init [x]
    System could not initialize the Ethernet controller chip.
3: xmt [x]
    Error occurred during packet transmission.
5: fls CRC err
    Ethernet chip incorrectly flagged a good CRC as bad.
6: rcv [x]
    Error occurred receiving a packet.
7: LANCE-init [x]
    Error occurred when the test attempted to initialize the Ethernet controller chip.
8: xmt [x]
    Error occurred transmitting a data packet. The x represents a pass or fail code returned by one of the utilities that the test uses.
10: bad CRC not dtctd
    Ethernet chip did not detect a bad CRC in an incoming packet.
11: rcv [x]
    Error occurred in packet receive operation. The x represents a pass or fail code returned by one of the utilities that the test uses.

B.2.19 ni/ctrs - Display Maintenance Operation Protocol (MOP) Counters Utility

The display maintenance operation protocol (MOP) counters utility displays the current MOP counters for the base system Ethernet controller. To run the MOP counters utility, enter

    t 3/ni/ctrs

and press Return. The display MOP counters utility produces no error messages.

B.2.20 ni/dma1 - Ethernet-Direct Memory Access (DMA) Registers Test

The Ethernet-direct memory access (DMA) registers test checks the Ethernet-DMA control and error registers. The test then checks the ability of the system to detect a DMA error. To run the Ethernet-DMA registers test, enter

    t 3/ni/dma1

and press Return.

B.2.20.1 Ethernet-DMA registers test error messages

Ethernet-DMA registers test error messages have the form

    ?TFL: 3/ni/dma1 (code: description)

* ?TFL: 3/ni/dma1 indicates that the Ethernet-DMA registers test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-13 lists the codes and descriptions used in Ethernet-DMA registers test error messages.

Table B-13: Ethernet-DMA Registers Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: LDP wrt/rd [w=xxxxxxxx r=yyyyyyyy]
    LDP register values matched when they should not. The w parameter is the value that was written to the LDP register. The r parameter is the value that was read from the LDP register.
2: LIOS wrt/rd [w=xxxxxxxx r=yyyyyyyy]
    LANCE I/O slot register values matched when they should not.
    The w parameter is the value that was written to the LANCE I/O slot register. The r parameter is the value that was read from the LANCE I/O slot register.
3: LANCE select
    LANCE I/O slot register failed to select the LANCE.
4: LANCE deselect
    LANCE I/O slot register failed to deselect the LANCE.
5: err initing LANCE
    Ethernet controller chip initialization failed.
6: LANCE-init timeout
    Timeout occurred waiting for the LANCE initialization to finish.
7: MER
    Page boundary error was not recorded in the MER register.
8: SIR
    LANCE memory error bit was not set in the SIR register.
9: LANCE-start timeout
    Timeout occurred waiting for LANCE to start.

B.2.21 ni/dma2 - Ethernet-Direct Memory Access (DMA) Transfer Test

The Ethernet-direct memory access (DMA) transfer test checks Ethernet DMA operation. To run the Ethernet-DMA transfer test, enter

    t 3/ni/dma2

and press Return.

B.2.21.1 Ethernet-DMA transfer test error messages

Ethernet-DMA transfer test error messages have the form

    ?TFL: 3/ni/dma2 (code: description)

* ?TFL: 3/ni/dma2 indicates that the Ethernet-DMA transfer test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-14 lists the codes and descriptions used in Ethernet-DMA transfer test error messages.

Table B-14: Ethernet-DMA Transfer Test Error Codes and Descriptions

Error Code and Description
    Meaning

2: LANCE-init [xxxxxxxx]
    Ethernet controller chip initialization failed. xxxxxxxx represents a code that describes the LANCE failure.
3: xmt [xxxxxxxx] sz=yyyy ptrn=AA
    Ethernet controller chip transmission failed. xxxxxxxx represents a code that describes the transmission failure. The sz parameter is the packet size. The ptrn parameter is the pattern the test tried to transmit.
4: rcv [xxxxxxxx] sz=yyyy ptrn=AA
    Ethernet controller chip receive operation failed.
    xxxxxxxx represents a code that describes the receive failure. The sz parameter is the packet size. The ptrn parameter is the pattern the test tried to receive.
8: LANCE-DMA
    DMA error occurred after a packet was received.
9: LANCE-DMA
    DMA error occurred when the test began.
10: LANCE-DMA
    DMA error occurred after a packet was transmitted.

B.2.22 ni/esar - Ethernet Station Address ROM (ESAR) Test

The Ethernet station address ROM (ESAR) test checks the ESAR on the Ethernet controller. To run the ESAR test, enter

    t 3/ni/esar

and press Return.

B.2.22.1 ESAR test error messages

ESAR test error messages have the form

    ?TFL: 3/ni/esar (code: description)

* ?TFL: 3/ni/esar indicates that the ESAR test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-15 lists the codes and descriptions used in ESAR test error messages.

Table B-15: ESAR Test Error Codes and Descriptions

Error Code and Description    Meaning
2: ESAR[0-5] = 0's            First 6 bytes of the ESAR were 000000.
3: ESAR[0] = brdcst-adrs      ESAR contained broadcast address.
5: checksum                   ESAR checksum verification failed.
7: rvrs cpy !=                Reverse copy mismatch occurred.
8: frwrd cpy !=               Forward copy mismatch occurred.
9: ESAR[24-28] !=FF           Test pattern FF mismatch occurred.
10: ESAR[25-29] !=00          Test pattern 00 mismatch occurred.
11: ESAR[26-30] !=55          Test pattern 55 mismatch occurred.
12: ESAR[27-31] !=AA          Test pattern AA mismatch occurred.

B.2.23 ni/ext-lb - Ethernet External Loopback Test

The external loopback test checks the Ethernet controller and its connection to the network. Before you run the external loopback test on the base system Ethernet controller, install a ThickWire loopback connector on the Ethernet controller. To run the external loopback test, enter

    t 3/ni/ext-lb

and press Return.
B.2.23.1 External loopback test error messages

External loopback test error messages have the form

    ?TFL: 3/ni/ext-lb (code: description)

* ?TFL: 3/ni/ext-lb indicates that the external loopback test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-16 lists the codes and descriptions used in external loopback test error messages.

Table B-16: External Loopback Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: (LANCE-init [xxxxxxxx])
    LANCE initialization failed. xxxxxxxx represents a code that describes the LANCE failure.
3: (xmit [xxxxxxxx, yyyyyyyy] zzzzz)
    LANCE initialization failed. xxxxxxxx, yyyyyyyy represents a code that describes the LANCE failure. zzzzz represents a code that describes the likely cause of the failure.
4: rcv [xxxxxxxx, yyyyyyyy]
    System did not receive packet. xxxxxxxx, yyyyyyyy represents a code that describes the receive failure.
=)
    Transmitted packet was not received.
: pkt-data !=
    Fatal error occurred.

B.2.24 ni/int - Ethernet Interrupt Request (IRQ) Test

The interrupt request (IRQ) test checks whether the Ethernet controller can generate an interrupt to the R3000A chip. To run the IRQ test, enter

    t 3/ni/int

and press Return.

B.2.24.1 IRQ test error messages

IRQ test error messages have the form

    ?TFL: 3/ni/int (code: description)

* ?TFL: 3/ni/int indicates that the IRQ test reported a problem.
* code represents a number that identifies the type of failure that occurred.
* description represents additional information that describes the failure.

Table B-17 lists the codes and descriptions used in IRQ test error messages.

Table B-17: IRQ Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: int pndng
    Pending interrupt was invalid.
2: init LANCE err = x
    Error occurred when the system tried to initialize the Ethernet controller chip.
4: intr err xmt.stat=x
    System generated no interrupt on packet transmission.

B.2.25 ni/int-lb - Ethernet Internal Loopback Test

The internal loopback test sends and receives data packets to and from Ethernet in internal loopback mode. To run the internal loopback test, enter

    t 3/ni/int-lb

and press Return.

B.2.25.1 Internal loopback test error messages

Internal loopback test error messages have the form

    ?TFL: 3/ni/int-lb (code:description)

* ?TFL: 3/ni/int-lb indicates that the internal loopback test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-18 lists the codes and descriptions used in internal loopback test error messages.

Table B-18: Internal Loopback Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: rd ESAR err
    The system could not access the Ethernet station address ROM.
2: LANCE-init [xxxxxxxx]
    Error occurred initializing the Ethernet controller chip. xxxxxxxx represents a code that describes the transmission failure.
3: xmt [xxxxxxxx] sz=yyyy ptrn=zz
    System did not transmit packet. xxxxxxxx represents a code that describes the transmission failure. The sz parameter is the size of the packet. The ptrn parameter is the pattern that was in the packet.
4: rcv [xxxxxxxx] sz=yyyy ptrn=zz
    System did not receive packet. xxxxxxxx represents a code that describes the failure. The sz parameter is the size of the packet. The ptrn parameter is the pattern that was in the packet.
5: rcvd size=x, xptd=y
    Packets received and packets sent had different sizes.
6: pkt-data !=
    Data received and data sent did not match.
7
    Received CRC was incorrect.
B.2.26 ni/m-cst - Ethernet Multicast Test

The multicast test checks the Ethernet ability to filter multicast packets. To run the multicast test, enter

    t 3/ni/m-cst

and press Return.

B.2.26.1 Multicast test error messages

Multicast test error messages have the form

    ?TFL: 3/ni/m-cst (code: description)

* ?TFL: 3/ni/m-cst indicates that the multicast test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-19 lists the codes and descriptions used in multicast test error messages.

Table B-19: Multicast Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: rd ESAR err
    Error occurred reading Ethernet station address ROM.
2: LANCE-init [xxxxxxxx]
    System failed to initialize the Ethernet controller chip. xxxxxxxx represents a code that describes the initialization failure.
3: xmt [xxxxxxxx]
    Ethernet controller failed to send packet. xxxxxxxx represents a code that describes the transmission failure.
5: rcvd invld m-cst
    Ethernet controller received a multicast packet when multicast function was disabled.
6: rcv [xxxxxxxx]
    Packet receive routine reported an error. xxxxxxxx represents a code that describes the receive error.
7: LANCE-init [xxxxxxxx]
    Error occurred when the system tried to initialize the Ethernet chip. xxxxxxxx represents a code that describes the initialization error.
8: xmt [xxxxxxxx]
    Ethernet controller failed to transmit packet. xxxxxxxx represents a code that describes the transmission error.
9: rcv [xxxxxxxx]
    Ethernet did not receive expected packet. xxxxxxxx represents a code that describes the receive error.
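The Ethernet test messages in this appendix share the form ?TFL: 3/ni/test (code: description). As an illustration only, the following Python sketch splits such a line into its parts and looks up the code in a dictionary paraphrased from Table B-19. The function name, the regular expression, and the sample hexadecimal value are assumptions made for this example; they are not part of the console firmware.

```python
import re

# Meanings paraphrased from Table B-19 (multicast test error codes).
MCST_MEANINGS = {
    1: "Error reading the Ethernet station address ROM",
    2: "System failed to initialize the Ethernet controller chip",
    3: "Ethernet controller failed to send packet",
    5: "Multicast packet received while multicast function was disabled",
    6: "Packet receive routine reported an error",
    7: "Error initializing the Ethernet chip",
    8: "Ethernet controller failed to transmit packet",
    9: "Ethernet did not receive expected packet",
}

# Hypothetical pattern for "?TFL: 3/<test> (code: description)".
# Whitespace is matched loosely because the exact spacing is an assumption.
TFL_RE = re.compile(
    r"^\?TFL:\s*3/(?P<test>[\w/-]+)\s*\((?P<code>\d+)\s*:\s*(?P<desc>.*)\)\s*$"
)


def parse_tfl(line):
    """Return (test, code, description), or None if the line does not match."""
    m = TFL_RE.match(line.strip())
    if m is None:
        return None
    return m.group("test"), int(m.group("code")), m.group("desc").strip()


if __name__ == "__main__":
    # The bracketed value is a made-up placeholder, not real test output.
    test, code, desc = parse_tfl("?TFL: 3/ni/m-cst (6: rcv [0000001f])")
    print(test, code, desc, "->", MCST_MEANINGS.get(code, "unknown code"))
```

The same pattern should apply to the other ni tests by swapping in a dictionary built from the corresponding table.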
B.2.27 ni/promisc - Ethernet Promiscuous Mode Test

The promiscuous mode test checks that the Ethernet controller can receive packets in promiscuous mode. To run the promiscuous mode test, enter

    t 3/ni/promisc

and press Return.

B.2.27.1 Promiscuous mode test error messages

Promiscuous mode test error messages have the form

    ?TFL: 3/ni/promisc (code: description)

* ?TFL: 3/ni/promisc indicates that the promiscuous mode test reported a problem.
* code represents a number that identifies the type of failure that occurred.
* description represents additional information that describes the failure.

Table B-20 lists the codes and descriptions used in promiscuous mode test error messages.

Table B-20: Promiscuous Mode Test Error Codes and Descriptions

Error Code and Description
    Meaning

2: LANCE-init [xxxxxxxx]
    Ethernet controller initialization failed. xxxxxxxx represents a code that describes the initialization failure.
3: xmt [xxxxxxxx]
    Packet transmission failed. xxxxxxxx represents a code that describes the transmission failure.
5: rcvd invld adrs
    An inappropriate packet was received in nonpromiscuous mode.
6: rcv [xxxxxxxx]
    Packet receive routine failed.
7: LANCE-init [xxxxxxxx]
    System failed to initialize Ethernet controller in promiscuous mode. xxxxxxxx represents a code that describes the initialization failure.
8: xmt [xxxxxxxx]
    Packet transmission failed. xxxxxxxx represents a code that describes the transmission failure.
9: rcv [xxxxxxxx]
    Ethernet did not receive the expected packet while in promiscuous mode. xxxxxxxx represents a code that describes the receive failure.

B.2.28 ni/regs - Ethernet Registers Test

The registers test performs a read and write operation on the Ethernet registers. To run the registers test, enter

    t 3/ni/regs

and press Return.
B.2.28.1 Registers test error messages

Registers test error messages have the form

    ?TFL: 3/ni/regs (code: description)

* ?TFL: 3/ni/regs indicates that the registers test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.
  - The CSR[n] parameter, where n represents the number of a specific CSR register, indicates the actual value in the CSR register.
  - The xpctd parameter indicates the expected value for the same CSR register.

Table B-21 lists the codes and descriptions used in registers test error messages.

Table B-21: Registers Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: CSR[n]=x - xpctd 0
    Write and read operation to Ethernet CSR[n] failed. The n represents the number of the CSR involved in the failure.
3: CSR[1]=xxxx - xpctd 0xFFFE
    Writing and reading 0xFFFE failed on CSR 1. xxxx represents the actual value in CSR 1.
4: CSR[2]=xxxx xpctd 0x00FF
    Writing and reading 0x00FF failed on CSR 2. xxxx represents the actual value in CSR 2.
5: CSR[1]=xxxx bit lk frm CSR[2]
    Bit leak from CSR2 to CSR1 occurred. xxxx represents the actual value in CSR[1].
6: CSR[2]=xxxx bit lk frm CSR[1]
    Bit leak from CSR1 to CSR2 occurred. xxxx represents the actual value in CSR[2].
7: immediate write/read flr
    Immediate write and read failure occurred.

B.2.29 prcache - Prcache Quick Test

The prcache quick test of NVRAM on power-up tests the scratch area of the optional NVRAM module. The diagnostic status bit in the diagnostic register on the NVRAM module is set on failure. The optional NVRAM module can be installed in memory slot 14, and is different from the system module NVR that is tested by the rtc/nvr test. To run the prcache quick test, enter

    t 3/prcache

and press Return.

For a thorough test, first zero the NVRAM memory by entering

    t 3/prcache/clear

and pressing Return.
B.2.29.1 Prcache quick test error messages

Prcache quick test error messages have the form

    ?TFL: 3/prcache (code: description)

* ?TFL: 3/prcache indicates that the prcache quick test reported a problem.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-22 lists the codes and descriptions used in prcache quick test error messages.

Table B-22: Prcache Quick Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: board 14: MBE=X SBE=Y
    X multiple bit errors and Y single bit errors occurred on the NVRAM board in slot 14.
2: board 14, too many SBEs: X
    Too many single bit errors. X single bit errors occurred on the NVRAM board in slot 14.

B.2.30 prcache/arm - Disconnect Battery Command

The prcache/arm command turns off the battery on the NVRAM module. To run the prcache/arm command, enter

    t 3/prcache/arm [board]

and press Return.

When you enter the prcache/arm command, replace the optional board parameter with the slot number of the NVRAM module. On the DECstation 5000 Model 240, the NVRAM module must always be installed in slot 14. The default value is 14.

B.2.30.1 Prcache/arm command error message

If the prcache/arm command does not complete successfully, it returns the following error message:

    ?TFL: 3/prcache/arm (1:(tst nocomp))

B.2.31 prcache/clear - Zero NVRAM Memory Command

The prcache/clear command quickly writes zeros to all NVRAM memory addresses. To run the prcache/clear command, enter

    t 3/prcache/clear [board]

and press Return.

When you enter the prcache/clear command, replace the optional board parameter with the slot number of the NVRAM module. On the DECstation 5000 Model 240, the NVRAM module must always be installed in slot 14. The default value is 14.
If the prcache contains valid data, the system responds with the following prompt:

    prcache valid data - wrt ? (1/0)

Enter 1 to clear the cache. Enter 0 to cancel the prcache/clear command.

B.2.31.1 Prcache clear error message

If the prcache/clear command does not complete successfully, it returns the following error message:

    ?TFL: 3/prcache/clear (1:(tst nocomp))

B.2.32 prcache/unarm - Connect Battery Command

The prcache/unarm command turns on the battery on the NVRAM module. To run the prcache/unarm command, enter

    t 3/prcache/unarm [board]

and press Return.

When you enter the prcache/unarm command, replace the optional board parameter with the slot number of the NVRAM module. On the DECstation 5000 Model 240, the NVRAM module must always be installed in slot 14. The default value is 14.

The prcache/unarm command returns no error message.

B.2.33 rtc/nvr - Nonvolatile RAM Test

The nonvolatile RAM (NVR) test checks the system module nonvolatile RAM. The system module NVR is different from the optional NVRAM cache module that can be installed in memory slot 14 and that is tested by the prcache test. To run the NVR test, enter

    t 3/rtc/nvr [pattern]

and press Return.

When you enter the NVR test command, replace the optional pattern parameter with a specific pattern that you want to use in the test. The default pattern is 55.

B.2.33.1 NVR test error messages

NVR test error messages have the form

    ?TFL: 3/rtc/nvr (code: address=actual, sb expected)

* ?TFL: 3/rtc/nvr indicates that the NVR test read an incorrect pattern from the NVR.
* code represents a number that identifies the portion of the test that failed.
* address represents the address at which the error occurred.
* actual represents the value at that address.
* expected represents the expected value at that address.

B.2.34 rtc/period - Real-Time Clock Period Test

The real-time clock (RTC) period test checks the RTC periodic interrupt operation.
To run the RTC period test, enter

    t 3/rtc/period

and press Return.

B.2.34.1 RTC period test error messages

RTC period test error messages have the form

    ?TFL: 3/rtc/period (code)

* ?TFL: 3/rtc/period indicates that the RTC period test reported an error.
* code represents a number that identifies the portion of the test that failed.

Table B-23 lists the codes used in RTC period test error messages.

Table B-23: RTC Period Test Error Codes

Error Code  Meaning
1           Update-in-progress (UIP) bit remained set past the allotted time.
2           Real-time clock interrupt was pending when it should not have been.
3           Allowed time ran out while waiting for interrupt.

B.2.35 rtc/regs - Real-Time Clock Registers Test

The real-time clock registers test checks the real-time clock (RTC) registers. To run the real-time clock registers test, enter

    t 3/rtc/regs

and press Return.

B.2.35.1 Real-time clock registers test error messages

Real-time clock register test error messages have the form

    ?TFL: 3/rtc/regs (code: description)

* ?TFL: 3/rtc/regs indicates that the real-time clock register test reported an error.
* code represents a number that identifies the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-24 lists the codes and descriptions used in real-time clock register test error messages.

Table B-24: Real-Time Clock Register Test Error Codes and Descriptions

Error Code and Description
    Meaning

1
    UIP bit remained set past allotted time.
2: register=actual, sb expected
    The test failed to write pattern correctly. The register value is the actual value in the named register, followed by the expected value.

B.2.36 rtc/time - Real-Time Test

The real-time test checks times generated by the real-time clock against hard-coded time values. To run the real-time test, enter

    t 3/rtc/time

and press Return.
B.2.36.1 Real-time test error messages

Real-time test error messages have the form

    ?TFL: 3/rtc/time (code)

* ?TFL: 3/rtc/time indicates that the real-time test reported an error.
* code represents a number that identifies the portion of the test that failed.

Table B-25 lists the codes used in real-time test error messages.

Table B-25: Real-Time Test Error Codes

Meaning
    UIP bit remained set past allotted time.
    Real-time clock interrupt was pending when it should not have been.
    Allowed time ran out while waiting for interrupt.
    Allowed time ran out while waiting for second interrupt.
    UIP bit remained set past second allotted time.
    Real-time clock seconds were not set to 0 on wraparound.
    Real-time clock minutes were not set to 0.
    Real-time clock hours were not set to 0.
    Real-time clock day-of-the-week was not set to 1.
    Real-time clock date was not set to 1.
    Real-time clock month was not set to 1.
    Real-time clock year was not set to 0.

B.2.37 scc/access - Serial Communication Chip (SCC) Access Test

The serial communication chip (SCC) access test checks whether the system can perform a read and write operation on the SCC. To run the SCC access test, enter

    t 3/scc/access

and press Return.

B.2.37.1 SCC access test error messages

The only SCC access test error message is

    ?TFL: 3/scc/access (1:LnM reg-N: actual=0xXX xpctd=0xYY)

* ?TFL: 3/scc/access indicates that the read and write operation on the SCC failed.
* M represents the number of the serial line where the error occurred.
* N represents the number of the register that failed the test.
* The actual value is the value in that register.
* The xpctd value is the expected value for that register.
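When an SCC access failure is logged, comparing the actual and expected register values bit by bit can show which bits differ. The following Python sketch is one possible way to do that, assuming the message layout shown above. The function name, the regular expression, and the sample values are assumptions for illustration and do not come from the firmware.

```python
import re

# Hypothetical pattern for the SCC access message described above.
# The closing parenthesis is optional because its presence is an assumption.
ACCESS_RE = re.compile(
    r"\?TFL:\s*3/scc/access\s*\(1:\s*Ln(?P<line>\d+)\s+reg-(?P<reg>\d+):\s*"
    r"actual=0x(?P<actual>[0-9A-Fa-f]+)\s+xpctd=0x(?P<xpctd>[0-9A-Fa-f]+)\)?"
)


def parse_scc_access(msg):
    """Extract line, register, and values; return None if msg does not match."""
    m = ACCESS_RE.search(msg)
    if m is None:
        return None
    actual = int(m.group("actual"), 16)
    expected = int(m.group("xpctd"), 16)
    return {
        "line": int(m.group("line")),
        "register": int(m.group("reg")),
        "actual": actual,
        "expected": expected,
        # XOR leaves a 1 in every bit position where the two values differ.
        "differing_bits": actual ^ expected,
    }


if __name__ == "__main__":
    # Sample values are invented for the example.
    info = parse_scc_access(
        "?TFL: 3/scc/access (1:Ln2 reg-12: actual=0x5A xpctd=0x55)"
    )
    print(info)
    print("differing bits: 0x%02X" % info["differing_bits"])
```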
B.2.38 scc/dma - Serial Communication Chip Direct Memory Access Test

The serial communication chip (SCC) direct memory access (DMA) test checks the ability of the serial communication and IO-ASIC chips to perform a DMA operation. To run the SCC DMA test, enter

    t 3/scc/dma [line] [loopback] [baud]

and press Return.

* Replace the optional line parameter with a value that specifies the line to test.
  - Specify 2 to test serial line number 2. The default value is 2.
  - Specify 3 to test serial line number 3.
* Replace the optional loopback parameter with a value that specifies the type of loopback operation the test is to perform.
  - Specify intl to run an internal loopback operation. The default value is intl.
  - Specify extl to run an external loopback operation.
* Replace baud with the baud rate at which you want the test to run. You can specify one of the following baud rates:

    300    1200    2400    4800    9600    19200    38400

  The default value is 38400.

B.2.38.1 SCC DMA test error messages

SCC DMA test error messages have the form

    ?TFL: 3/scc/dma code:LnN SIR xptd=xxxxxxxx SIR=yyyyyyyy SSR=zzzzzzzz

* ?TFL: 3/scc/dma indicates that the SCC DMA test reported an error.
* code represents a number that identifies the part of the test that failed.
* N represents the number of the serial line that reported the error.
* The SIR xptd value is the expected value for the system interrupt register.
* The SIR value is the actual value in the system interrupt register.
* The SSR value is the value in the system status register.

Table B-26 lists the codes used in SCC DMA test error messages.

Table B-26: SCC DMA Test Error Codes

Error Code  Meaning
1           SIR values are invalid.
2           Miscompare occurred during DMA read and write operation.
3           Overrun occurred in the receive buffer.
4           Interrupt signal was not sent to the system.
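As a rough illustration of how the fields of an SCC DMA error message relate to Table B-26, the following Python sketch extracts the code, line number, and register values, and computes which system interrupt register bits differ from the expected value. The exact punctuation of the message (parentheses, and whether SIR and xptd are separated by a space or an underscore) is an assumption, so the pattern accepts both. All names and sample values are hypothetical.

```python
import re

# Meanings copied from Table B-26.
SCC_DMA_CODES = {
    1: "SIR values are invalid",
    2: "Miscompare occurred during DMA read and write operation",
    3: "Overrun occurred in the receive buffer",
    4: "Interrupt signal was not sent to the system",
}

# Hypothetical pattern; tolerant of an optional "(" before the code.
DMA_RE = re.compile(
    r"\?TFL:\s*3/scc/dma\s*\(?\s*(?P<code>\d+)\s*:\s*Ln(?P<line>\d+)\s+"
    r"SIR[ _]xptd=(?P<sir_xptd>[0-9A-Fa-f]{8})\s+"
    r"SIR=(?P<sir>[0-9A-Fa-f]{8})\s+"
    r"SSR=(?P<ssr>[0-9A-Fa-f]{8})"
)


def explain_scc_dma(msg):
    """Decode an SCC DMA failure line; return None if it does not match."""
    m = DMA_RE.search(msg)
    if m is None:
        return None
    code = int(m.group("code"))
    return {
        "code": code,
        "meaning": SCC_DMA_CODES.get(code, "unknown code"),
        "line": int(m.group("line")),
        # Bits set here differ between the expected and actual SIR values.
        "sir_mismatch": int(m.group("sir_xptd"), 16) ^ int(m.group("sir"), 16),
        "ssr": int(m.group("ssr"), 16),
    }


if __name__ == "__main__":
    # Register values below are invented for the example.
    print(explain_scc_dma(
        "?TFL: 3/scc/dma 4:Ln2 SIR xptd=00000040 SIR=00000000 SSR=00000100"
    ))
```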
B.2.39 scc/int - Serial Communication Chip Interrupts Test

The serial communication chip (SCC) interrupts test checks the ability of the SCC to perform internal, external, and countdown interrupts.

To run the SCC interrupts test, enter t 3/scc/int [line] and press Return.

Replace the optional line parameter with a value that specifies which line to test.

B.2.39.1 SCC interrupts test error messages

SCC interrupts test error messages have the form

    ?TFL: 3/scc/int (code: LnN RR0=xx RR3=yy SIR=zzzzzzzz)

* ?TFL: 3/scc/int indicates that the SCC interrupts test reported an error.
* code represents a number that indicates which portion of the test reported the error.
  - If the number is odd, the bits to set the interrupt to on were invalid.
  - If the number is even, the bits to set the interrupt to off were invalid.
* N represents the number of the serial line where the error occurred.
* The RR0 value is the contents of the SCC read register 0.
* The RR3 value is the contents of the SCC read register 3.
* The SIR value is the contents of the system interrupt register.

B.2.40 scc/io - Serial Communication Chip Input/Output (I/O) Test

The serial communication chip (SCC) input/output (I/O) test checks the ability of the SCC to perform an I/O operation on a serial line.

To run the SCC I/O test, enter t 3/scc/io [line] [loopback] and press Return.

When you enter the SCC I/O test command:

* Replace the optional line parameter with a value that specifies the line to test.
  - Specify 0 to test serial line 0. The default value is 0.
  - Specify 1 to test serial line 1.
  - Specify 2 to test serial line 2.
  - Specify 3 to test serial line 3.
* Replace the optional loopback parameter with a value that specifies the type of loopback operation the test performs.
  - Specify intl to run an internal loopback operation. The default value is intl.
  - Specify extl to run an external loopback operation.
B.2.40.1 SCC I/O test error messages

SCC I/O test error messages have the form

    ?TFL: 3/scc/io (code: LnN description)

* ?TFL: 3/scc/io indicates that the SCC I/O test reported an error.
* code represents a number that identifies the portion of the test that failed.
* N represents the number of the line in which the error occurred.
* description represents additional information that describes the error.

Table B-27 lists the codes and descriptions used in SCC I/O test error messages.

Table B-27: SCC I/O Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: LnN tx bfr not empty. status=xx
    System could not write a single character because the transmit buffer was not empty. N represents the line in which the error occurred. The status value is the contents of read register 0.

2: LnN char not rcvd. status=xx
    CHAR AVAIL signal not received when the system was expecting a character. N represents the line in which the error occurred. The status value is the contents of read register 0.

3: LnN expctd=xx, rcvd=yy, status=zz
    The character that was received was different from the transmitted character. N represents the line in which the error occurred. xx represents the transmitted value. yy represents the received value. The status value is the contents of read register 0.

B.2.41 scc/pins - Serial Communication Chip Pins Test

The serial communication chip (SCC) pins test checks the control pins on the communications connectors.

To run the SCC pins test, enter t 3/scc/pins [line] [loopback] and press Return.

* Replace the optional line parameter with a value that specifies the communications connector that you want to test.
  - Specify 2 to test the communications connector on the right as you face the back of the system unit.
  - Specify 3 to test the communications connector on the left as you face the back of the system unit.
* Replace the optional loopback parameter with a value that specifies the loopback hardware that you attach to the communications connector being tested.
  - Specify 29-24795 if you attach a 29-24795 loopback connector. The default value is 29-24795.
  - Specify H8571 if you attach an H8571 loopback connector.
  - Specify hm if you attach an hm loopback connector.
  - Specify H3200 if you attach an H3200 loopback connector.

Table B-28 lists the specific pin pairs that each loopback connector tests.

Table B-28: Pin Pairs Tested by Individual Loopback Connectors

Loopback connector   Pin pairs tested   Meaning
29-24795             4-5                RTS(1) to CTS(2)
                     23-6-8             SS(3) to DSR(4) and CD(5)
                                        6-23 failure implies 6 broken.
                                        8-23 failure implies 8 broken.
                                        6-23 8-23 failure implies 23 broken.
H3200                4-5                RTS to CTS
                     12-23              SI(7) to SS
H8571-A              4-5                RTS to CTS
                     6-20               DSR to DTR(6)
                     20-6-8             DTR to DSR and CD
                                        6-20 failure implies 6 broken.
                                        8-20 failure implies 8 broken.
                                        6-20 8-20 failure implies 20 broken.
hm                   4-5                RTS to CTS

(1) Request to send
(2) Clear to send
(3) Secondary request to send
(4) Data set ready
(5) Carrier detector
(6) Data terminal ready
(7) Secondary signal

B.2.41.1 SCC pins test error messages

SCC pins test error messages have the form

    ?TFL: 3/scc/pins (code: LnN: description)

* ?TFL: 3/scc/pins indicates that the SCC pins test reported an error.
* N represents the number of the serial line in which the error occurred.
* description represents additional information that describes the error.

Table B-29 lists the codes and descriptions used in SCC pins test error messages.

Table B-29: SCC Pins Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: LnN Invld param [xxxxx]
    The number used in the test command to specify the loopback was invalid. N represents the number of the serial line in which the error occurred.
    xxxxx represents the first two characters of the invalid value that was specified.

2: LnN Strtup R-xx xptd=yy actl=zz pins
    Test failed to generate the expected SCC status bits. N represents the number of the serial line in which the error occurred. The Strtup R value is the number of the SCC register that contains the status bits. The xptd value is the expected status bits. The actl value is the actual status bits. pins represents the pin pairs for which the test was set up.

3: LnN xxxxx
    Pins failed to respond properly. xxxxx represents the numbers of one or more pin pairs that failed the test.

B.2.42 scc/tx-rx - Serial Communication Chip Transmit and Receive Test

The serial communication chip (SCC) transmit and receive test checks the ability of the SCC to transmit and receive information.

To run the SCC transmit and receive test, enter t 3/scc/tx-rx [line] [loopback] [baud] [parity] [bits] and press Return.

* Replace the optional line parameter with a value that specifies the serial line to test.
  - Specify 0 to test serial line 0. The default value is 0.
  - Specify 1 to test serial line 1.
  - Specify 2 to test serial line 2.
  - Specify 3 to test serial line 3.
* Replace the optional loopback parameter with a value that specifies the type of loopback operation the test performs.
  - Specify intl to run an internal loopback operation. The default value is intl.
  - Specify extl to run an external loopback operation.
* Replace the optional baud parameter with a value that specifies the baud rate at which the test runs. You can specify one of the following baud rates:

    300  1200  2400  3600  4800  9600  19200

  The default baud rate is 9600.
* Replace the optional parity parameter with a value that specifies the type of parity that the test uses.
  - Specify none to use no parity. The default value is none.
  - Specify odd to use odd parity.
  - Specify even to use even parity.
* Replace the optional bits parameter with a value that specifies the number of bits per character that the test uses.
  - Specify 8 to use 8 bits per character. The default value is 8.
  - Specify 7 to use 7 bits per character.
  - Specify 6 to use 6 bits per character.

B.2.42.1 SCC transmit and receive test error messages

SCC transmit and receive test error messages have the form

    ?TFL: 3/scc/tx-rx (code: LnN description)

* ?TFL: 3/scc/tx-rx indicates that the SCC transmit and receive test failed.
* code represents a number that indicates the portion of the test that failed.
* description represents additional information that describes the error.

Table B-30 lists the codes and descriptions used in SCC transmit and receive error messages.

Table B-30: SCC Transmit and Receive Test Error Codes and Descriptions

Error Code and Description
    Meaning

1: LnN tx bfr not empty. status=xx
    System could not write a single character because the transmit buffer was not empty. N represents the line in which the error occurred. The status value is the contents of SCC read register 0.

2: LnN char not rcvd. status=xx
    CHAR AVAIL signal not received when the system was expecting a character. N represents the line in which the error occurred. The status value is the contents of SCC read register 0.

3: LnN expctd=xx, rcvd=yy, status=zz
    The character that was received was different from the transmitted character. N represents the line in which the error occurred. xx represents the transmitted value. yy represents the received value. The status value is the contents of SCC read register 0.

4: LnN Rx err. errs=xx
    Receiving character in FIFO reported an error. N represents the line in which the error occurred. The errs value equals the number of associated input character FIFO error bits.

B.2.43 scsi/cntl - SCSI Controller Test

The SCSI controller test checks SCSI controller operation.
To run the SCSI controller test, enter t slot_number/scsi/cntl and press Return.

Replace slot_number with the slot number of the SCSI controller to be tested.

B.2.43.1 SCSI controller test error messages

SCSI controller test error messages have the form

    ?TFL: 3/scsi/cntl (code: description)

* ?TFL: 3/scsi/cntl indicates that the SCSI controller test failed on the base system module SCSI controller.
* code represents a number that indicates the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-31 lists the error descriptions used in SCSI controller test error messages.

Table B-31: SCSI Controller Error Codes and Descriptions

Error Code and Description   Meaning
1: rd cnfg                   Values written to and read from configuration register did not match.
2: fifo flg                  First in, first out (FIFO) load and FIFO flags did not match.
3: cnt xfr                   Write and read operation for TCL register reported a mismatch.
4: illg cmd                  Command was illegal and did not generate an interrupt.
5: int reg                   Controller cannot clear internal interrupt register.
6: rd cnfg                   Mismatch occurred when reading the write/read configuration register.

B.2.44 scsi/sdiag - SCSI Send Diagnostics Test

The SCSI send diagnostics test runs the self-test for an individual SCSI device. You can specify whether the test alters drive parameters and includes a write operation.

To run the SCSI send diagnostics test, enter t slot_number/scsi/sdiag scsi_id [d] [u] [s] and press Return.

* Replace slot_number with the slot number of the module to be tested.
* Replace scsi_id with the SCSI ID of the device you want to test. The default value is 0.
* Include the optional d and u parameters to specify the conditions that those parameters set for the specific drive you are testing. Note that the result of including d and u depends on the specific drive. To determine the effect of including d and u, refer to the service guide for the drive that you want to test.
* Include the optional s parameter to suppress the error message display.

B.2.44.1 SCSI send diagnostics test error messages

SCSI send diagnostics test error messages have the form

    ?TFL: 3/scsi/sdiag (code: description)

* ?TFL: 3/scsi/sdiag indicates that the SCSI send diagnostics test reported an error on the base system module SCSI controller.
* code represents a number that indicates the portion of the test that failed.
* description represents additional information that describes the failure.

Table B-32 lists the codes and descriptions used in SCSI send diagnostics test error messages.

Table B-32: SCSI Send Diagnostics Test Error Descriptions

Error Description   Meaning
1: dev ol           Test could not bring the unit on line.
2: dev ol           Test could not bring the unit on line.
3: sdiag            Device failed the send diagnostics test.

B.2.45 scsi/target - SCSI Target Test

The SCSI target test performs a read test on a specific SCSI device. If you include the optional write parameter, the test also performs a write test.

To run the SCSI target test, enter t 3/scsi/target scsi_id [w] [l loops] and press Return.

* Replace the scsi_id parameter with the SCSI ID of the device you want to test.
* Specify the optional w parameter to include a write operation in the SCSI target test.
* Specify the optional l parameter to have the test repeat up to 9 times. If you include the l parameter, replace loops with the number of times you want the test to repeat.

CAUTION: This test can destroy existing data if it is run with the w option. The test writes over existing data at random.

B.2.45.1 SCSI target test error messages

SCSI target test error messages have the form

    ?TFL: 3/scsi/target (code: description)

* ?TFL: 3/scsi/target indicates that the SCSI target test reported an error.
* code represents a number that indicates the portion of the test that failed.
* description represents additional information that describes the failure.

CAUTION: Always follow antistatic procedures when handling electronic components.

Table B-33 lists the codes and descriptions used in SCSI target test error messages.

Table B-33: SCSI Target Test Error Codes and Descriptions

Error Code and Description   Meaning
1: (dev ol) N                Test could not bring the device on line. N represents the SCSI ID of the device that could not be brought on line.
2: (tst nocomp) N            Command entered from the keyboard aborted the test. N represents the SCSI ID of the device being tested.
3: (rodev) N                 Test cannot perform write operation. Device is a read-only device. N represents the SCSI ID of the device specified in the test.
4: (dev type) N              Test does not test the specified device. N represents the SCSI ID of the device specified in the test.
6: (rdCap) N                 Read capacity command failed. N represents the SCSI ID of the device that could not be brought on line.
7: (rzWr) N                  Write operation failed. N represents the SCSI ID of the device that failed.
8: (rzRd) N                  Read operation failed. N represents the SCSI ID of the device that failed.
9: (cmp) N                   Write and read values did not match. N represents the SCSI ID of the device involved in the miscompare.
10: (wrFlMrk) N              Write file mark failed. N represents the SCSI ID of the device being tested.
11: (tzWr) N                 Write operation failed. N represents the SCSI ID of the device that failed.
12: (wrFlMrk) N              Write file mark failed. N represents the SCSI ID of the device being tested.
13: (spc) N                  Space (-2) operation failed. N represents the SCSI ID of the device involved in the failure.
14: (spc) N                  Space (1) operation failed. N represents the SCSI ID of the device involved in the failure.
15: (tzRd) N                 Read operation failed. N represents the SCSI ID of the device being tested.
16: (cmp) N                  Write and read values did not match. N represents the SCSI ID of the device involved in the miscompare.
B.2.46 tlb/prb - Translation Lookaside Buffer Probe Test

The translation lookaside buffer (TLB) probe test checks whether all TLB registers respond to an address match operation.

To run the TLB probe test, enter t 3/tlb/prb and press Return.

B.2.46.1 TLB probe test error messages

The only TLB probe test error message is

    ?TFL: 3/tlb/prb (match(0,N)=actual, sb expected)

* ?TFL: 3/tlb/prb (match(0,N)) indicates that the value at address 0 did not match the value at the address represented by N.
* actual represents the actual value found at the address represented by N.
* expected represents the expected value at the address represented by N.

B.2.47 tlb/reg - Translation Lookaside Buffer Registers Test

The translation lookaside buffer (TLB) registers test performs a read and write operation on the TLB.

To run the TLB registers test, enter t 3/tlb/reg [pattern] and press Return.

Replace the optional pattern parameter with the pattern you want to use for the read and write operation. The default pattern is 55555555.

B.2.47.1 TLB registers test error messages

TLB registers test error messages have the form

    ?TFL: 3/tlb/regs (description)

* ?TFL: 3/tlb/regs indicates that the TLB registers test reported an error.
* description represents additional information that describes the error.

Table B-34 lists the descriptions used in TLB registers test error messages.

Table B-34: TLB Registers Test Error Descriptions

Error Description
    Meaning

tlblo [N]= actual, sb expected
    Pattern in TLB low (LO) register was not the expected pattern. N represents the number of the register with the incorrect value. The actual and expected values follow.

tlbhi [N]= actual, sb expected
    Pattern in TLB high (HI) register was not the expected pattern. N represents the number of the register with the incorrect value. The actual and expected values in the register follow.
Appendix C
CPU and System Registers

This appendix describes the CPU and system registers, which contain information that can be useful when troubleshooting. The system automatically displays CPU register information on the screen when exceptions occur. Use the e command in console mode to access system registers.

C.1 CPU Registers

Table C-1 lists the CPU registers.

Table C-1: CPU Registers

Register   Description
Cause      Cause of last exception
EPC        Exception program counter
Status     Status register
BadVAddr   Bad virtual address (read only)

When an exception occurs, the system automatically displays CPU register information in one of two formats. The first format is as follows:

    ?TFL slot_number/test_name (CUX, cause=xxxxxxxx) [KN03-GA]
    ?TFL slot_number/test_name (UEX, cause=xxxxxxxx) [KN03-GA]

where

* slot_number represents the slot number of the module being tested.
* test_name represents the name of the test being run.
* xxxxxxxx represents the contents of the cause register.

The second format is as follows:

    ? PC: 0x451<vtr=nrml>
    ? CR: 0x810<ce=0,ip4,exc=AdEL>
    ? SR: 0x30080000<cu1,cu0,cm,ipl=8>
    ? VA: 0x451
    ? ER: 0x100003f0
    ? MER: 0x2000

where the values on each line are as follows:

    PC  = Address of the exception instruction
    CR  = Contents of the cause register
    SR  = Contents of the status register
    VA  = Virtual address of the exception
    ER  = Contents of the error address register
    MER = Contents of the memory error register

Refer to Chapter 9 for detailed troubleshooting information.

C.1.1 Cause Register

The cause register is a 32-bit read/write register that describes the nature of the last exception. A 4-bit exception code indicates the cause of the exception, and the remaining fields contain detailed information relevant to the handling of certain types of exceptions.
The branch delay (BD) bit indicates whether the exception program counter (EPC) was adjusted to point at the branch instruction that precedes the next restartable instruction. For a coprocessor unusable exception, the coprocessor error (CE) field indicates the coprocessor unit number referenced by the instruction that caused the exception.

The interrupt pending (IP) field indicates which external, internal, coprocessor, and software interrupts are pending. You can write to IP<1:0> to set or reset software interrupts. The remaining bits, IP<7:2>, are read only and represent external, internal, or coprocessor interrupts. The number and assignment of the IP bits are implementation-dependent. R3000A processors have six external interrupts. IP5 is used for the MIPS floating-point coprocessor interrupt. IP2 is normally used for system bus (I/O) interrupts.

The cause register has the following format:

     31   30  29  28  27          16
    +----+---+------+--------------+
    | BD | 0 |  CE  |      0       |
    +----+---+------+--------------+
      1    1    2          12

     15      10  9   8  7   6  5       2  1   0
    +----------+------+------+---------+------+
    |    IP    |  SW  |  0   | ExcCode |  0   |
    +----------+------+------+---------+------+
         6        2      2        4       2

* BD indicates whether the last exception occurred during execution in a branch delay slot (0 = normal, 1 = delay slot).
* CE indicates the coprocessor unit number referenced when a coprocessor unusable exception occurs.
* IP indicates whether an interrupt is pending.
* SW indicates which of two software interrupts is pending.
* ExcCode is the exception code field. Table C-2 lists the exception codes and their meanings.
* 0 is unused (ignored on write, zero when read).
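The field layout above can be turned into a short decoder for cause register values read from an exception display. This is a minimal sketch in Python, not firmware code: the function name is hypothetical, and the bit positions are the ones given in the format diagram (BD in bit 31, CE in bits 29:28, IP in bits 15:10, SW in bits 9:8, ExcCode in bits 5:2).

```python
def decode_cause(value):
    """Split a 32-bit cause register value into its documented fields."""
    return {
        "BD": (value >> 31) & 0x1,       # branch delay slot flag
        "CE": (value >> 28) & 0x3,       # coprocessor unit number
        "IP": (value >> 10) & 0x3F,      # six hardware interrupt pending bits
        "SW": (value >> 8) & 0x3,        # two software interrupt pending bits
        "ExcCode": (value >> 2) & 0xF,   # 4-bit exception code
    }


# 0x810 is the CR value shown in the sample exception display in section C.1.
fields = decode_cause(0x810)
print(fields)
```

For 0x810 the decoder yields an exception code of 4, which agrees with the exc=AdEL annotation in the sample display, and a single pending hardware interrupt bit at register bit 11.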
Table C-2: Exception Codes

Number   Mnemonic   Description
0        Int        Interrupt
1        Mod        TLB modification exception
2        TLBL       TLB miss exception (load or instruction fetch)
3        TLBS       TLB miss exception (store)
4        AdEL       Address error exception (load or instruction fetch)
5        AdES       Address error exception (store)
6        IBE        Bus error exception (instruction fetch)
7        DBE        Bus error exception (data reference: load or store)
8        Sys        Syscall exception
9        Bp         Breakpoint exception
10       RI         Reserved instruction exception
11       CpU        Coprocessor unusable exception
12       Ov         Arithmetic overflow exception
13-31               Reserved

C.1.2 Exception Program Counter (EPC) Register

The exception program counter (EPC) register indicates the virtual address at which the most recent exception occurred. This register is a 32-bit read-only register that contains an address at which instruction processing can resume after an exception is serviced.

For synchronous exceptions, the EPC register contains the virtual address of the instruction that was the direct cause of the exception. When that instruction is in a branch delay slot, the EPC register contains the virtual address of the immediately preceding branch or jump instruction.

If the exception is caused by recoverable, temporary conditions (such as a TLB miss), the EPC register contains the virtual address of the instruction that caused the exception. Thus, after the conditions are corrected, the EPC register contains a point at which execution can be legitimately resumed.

The EPC register has the following format:

     31                            0
    +-------------------------------+
    |              EPC              |
    +-------------------------------+
                   32

EPC is the exception program counter.

C.1.3 Status Register

The status register (SR) is a 32-bit read/write register that contains the kernel/user mode, interrupt enable, and diagnostic state of the processor.

The SR contains a three-level stack (current, previous, and old) of the kernel/user (KU) bit and the interrupt enable (IE) bit. The stack is pushed when each exception is taken.
The stack is popped by the restore from exception (RFE) instruction. These bits can also be directly read or written.

The status register has the following format:

     31    28 27        16 15       8 7   6   5     4     3     2     1     0
    +--------+------------+----------+-----+-----+-----+-----+-----+-----+-----+
    |   CU   |     DS     |    IM    |  0  | KUo | IEo | KUp | IEp | KUc | IEc |
    +--------+------------+----------+-----+-----+-----+-----+-----+-----+-----+
        4          12          8        2     1     1     1     1     1     1

* The coprocessor usability (CU) field is a 4-bit field that individually controls the usability of each of the four coprocessor unit numbers (1 = usable, 0 = unusable). Coprocessor zero is always usable in kernel mode, regardless of the setting of the CU0 bit.
* The diagnostic status (DS) field is an implementation-dependent 12-bit field that is used for self-testing and checking the cache and virtual memory system. For a detailed description of the DS field, see the "Diagnostic Register" section in this appendix.
* The interrupt mask (IM) field is an 8-bit field that controls the enabling of each of eight interrupt conditions: the external, internal, coprocessor, and software interrupts (0 = disable, 1 = enable). If interrupts are enabled, an external interrupt occurs when corresponding bits are set in both the interrupt mask field of the SR and the interrupt pending (IP) field of the cause register. The actual width of this register is machine-dependent. For a description of the IP field, see the "Cause Register" section in this appendix.
* KUo is the old kernel/user mode (0 = kernel, 1 = user).
* IEo is the old interrupt enable setting (0 = disable, 1 = enable).
* KUp is the previous kernel/user mode (0 = kernel, 1 = user).
* IEp is the previous interrupt enable setting (0 = disable, 1 = enable).
* KUc is the current kernel/user mode (0 = kernel, 1 = user).
* IEc is the current interrupt enable setting (0 = disable, 1 = enable).
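The status register layout and the three-level KU/IE stack can be modeled in a few lines, which may help when reading an SR value from an exception display. This is a sketch under stated assumptions: the function names are hypothetical, the field positions come from the format diagram above, and the push and pop behavior (current pair set to kernel mode with interrupts disabled on an exception, old pair left unchanged by RFE) is the usual R3000-family behavior rather than something this section spells out.

```python
MASK32 = 0xFFFFFFFF


def decode_status(sr):
    """Split a 32-bit status register value into its documented fields."""
    return {
        "CU": (sr >> 28) & 0xF,    # coprocessor usability, one bit per unit
        "DS": (sr >> 16) & 0xFFF,  # diagnostic status field
        "IM": (sr >> 8) & 0xFF,    # interrupt mask
        "KUo": (sr >> 5) & 1, "IEo": (sr >> 4) & 1,
        "KUp": (sr >> 3) & 1, "IEp": (sr >> 2) & 1,
        "KUc": (sr >> 1) & 1, "IEc": sr & 1,
    }


def push_on_exception(sr):
    """Model the stack push: current -> previous, previous -> old.

    Assumption: the new current pair is kernel mode (0), interrupts disabled (0).
    """
    stack = sr & 0x3F
    new_stack = (stack << 2) & 0x3F
    return (sr & ~0x3F & MASK32) | new_stack


def pop_on_rfe(sr):
    """Model the RFE pop: previous -> current, old -> previous.

    Assumption: the old pair keeps its value.
    """
    low4 = (sr >> 2) & 0xF
    return (sr & ~0xF & MASK32) | low4


# 0x30080000 is the SR value shown in the sample exception display in C.1.
print(decode_status(0x30080000))

user_mode = 0x0000FF03                 # hypothetical value: KUc=1, IEc=1, IM all set
in_handler = push_on_exception(user_mode)
print(hex(in_handler), hex(pop_on_rfe(in_handler)))
```

In the usage lines, the pushed value has KUp and IEp set and KUc and IEc clear, and the pop restores the original current pair.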
C.1.3.1 Diagnostic status

The diagnostic facilities depend on the characteristics of the cache and virtual memory system of the implementation. Therefore, the layout of the diagnostic status field is implementation-dependent.

The diagnostic status field is normally used for diagnostic code and, in certain cases, for operating system diagnostic facilities (such as reporting parity errors). On some machines it is used for relatively rare operations such as flushing caches. Normally, this field should be set to 0 by operating system code.

The diagnostic status bits are BEV, TS, PE, CM, PZ, SwC, and IsC. This set of bits provides a complete fault detection capability, but is not intended to provide extensive fault diagnosis.

The diagnostic status field has the following format:

     27     23  22    21   20   19   18    17    16
    +---------+-----+----+----+----+----+-----+-----+
    |    0    | BEV | TS | PE | CM | PZ | SwC | IsC |
    +---------+-----+----+----+----+----+-----+-----+
         5       1    1    1    1    1     1     1

* BEV controls the location of UTLB miss and general exception vectors (0 = normal, 1 = bootstrap). When this bit is set, the UTLB miss exception vector is relocated to address 0xbfc00100 and the general exception vector is relocated to address 0xbfc00180.
* TS indicates that TLB shutdown occurred.
* PE indicates that a cache parity error occurred. This bit can be cleared by writing 1 to this bit position.
* CM indicates whether a data cache miss occurred while the system was in cache test mode (0 = hit, 1 = miss).
* PZ controls the zeroing of cache parity bits (0 = normal, 1 = parity forced to zero).
* SwC controls the switching of the data and instruction caches (0 = normal, 1 = switched).
* IsC controls isolation of the cache (0 = normal, 1 = cache isolated).
* 0 is unused (ignored on write, zero when read).

C.1.4 BadVAddr Register

The bad virtual address (BadVAddr) register is a 32-bit read-only register that contains the most recently translated virtual address for which a translation error occurred.
The bad virtual address register has the following format:

     31                            0
    +-------------------------------+
    |           BadVAddr            |
    +-------------------------------+
                   32

BadVAddr is the bad virtual address.

C.2 System Registers

This section describes the system registers used for troubleshooting.

C.2.1 Data Buffers 3 to 0

The data buffers are general-purpose 32-bit read/write registers used by the I/O control ASIC. These registers can be read and written for test purposes. Any direct memory access (DMA) or access to a peripheral device can overwrite these registers. To ensure proper testing, disable all DMA engines.

Table C-3 lists the system registers.

Table C-3: System Registers

Register   Console Address   Description
SSR        0xBF840100        System support register
SIR        0xBF840110        System interrupt register
Mask       0xBF840120        System interrupt mask register
EAR        0xBFA40000        Error address register
ES         0xBFA80000        Memory error check/syndrome register
CS         0xBFAC0000        Memory bank size and ECC diagnostics register

Use the e command to examine the contents of a system register from the console. Enter the e command in the following format:

    e [options] [console_address]

See Appendix C, "Console Commands," for information about the formats and options used with the e command and the other console commands.

C.2.2 System Support Register (SSR)

The system support register (SSR) can be both read from and written to. Bits <31:16> are used internally by the I/O control ASIC. Bits <15:0> generate signals visible outside the I/O control ASIC.

Table C-4: System Support Register 0xBF840100

Bits    Access   Description
31      R/W      Communication port 1 transmit DMA enable (1=enable, 0=disable)
30      R/W      Communication port 1 receive DMA enable (1=enable, 0=disable)
29      R/W      Communication port 2 transmit DMA enable (1=enable, 0=disable)
28      R/W      Communication port 2 receive DMA enable (1=enable, 0=disable)
27:23   R/W      Reserved
22      R/W      Reserved
21      R/W      Reserved
20      R/W      Reserved
19      R/W      Reserved
18      R/W      SCSI DMA direction, 0 = transmit (read from memory)
17      R/W      SCSI DMA enable (1=enable, 0=disable)
16      R/W      LANCE DMA enable (1=enable, 0=disable)
15      R/W      DIAGDN (diagnostic flag)
14:13   R/W      TXDIS (serial transmit disable)
12      R/W      Reserved
11      R/W      SCC reset (active low)
10      R/W      RTC reset (active low)
9       R/W      53C94 SCSI controller reset (SCSI active low)
8       R/W      LANCE reset (Ethernet active low)
7:0     R/W      LEDs

SSR<31>
When set to 1, this bit enables communication port 1 (serial line 2) to transmit DMA to SCC(0)-B. Communication port 1 is the right comm port, viewed from the back.

SSR<30>
When set to 1, this bit enables communication port 1 (serial line 2) to receive DMA from SCC(0)-B. Communication port 1 is the right comm port, viewed from the back.

SSR<29>
When set to 1, this bit enables communication port 2 (serial line 3) to transmit DMA to SCC(1)-B. Communication port 2 is the left comm port, viewed from the back.

SSR<28>
When set to 1, this bit enables communication port 2 (serial line 3) to receive DMA from SCC(1)-B. Communication port 2 is the left comm port, viewed from the back.

SSR<27:19>
These bits are reserved.

SSR<18>
This bit, set to 0 on power-up or reset, determines the direction of the SCSI DMA transfer. If the bit is 0, memory data is supplied to the 53C94 SCSI controller upon demand from the address specified by the SCSI DMA pointer. If the bit is set to 1, data bursts of two words supplied from the 53C94 SCSI controller are written to memory.

SSR<17>
When set to 1, this bit enables SCSI DMA; 0 disables it.

SSR<16>
When set to 1, this bit enables LANCE DMA so that the Ethernet interface can begin data transfer.

SSR<15> DIAGDN
This bit reflects the state of the DIAGDN pin on the motherboard, which is used by manufacturing diagnostics.
SSR<14:13> TXDIS
These bits allow diagnostics to disable the EIA drivers on the serial lines. When the TXDIS bits are 0, the EIA drivers are active. When the TXDIS bits are 1, the EIA drivers are disabled. Because the TXDIS signals are automatically cleared at power up or reset, the EIA drivers are enabled by default. TXDIS<0> disables communication port 1 (serial line 2), and TXDIS<1> disables communication port 2 (serial line 3).

SSR<12>
This bit is reserved.

SSR<11>
This bit can be read from and written to. The SCC UARTs are placed in a hard reset state when this bit is 0. This bit is cleared to 0 at power up or reset, resetting the two SCCs.

SSR<10>
This bit can be read from and written to. The time-of-year (TOY) controller is placed in a hard reset state when this bit is 0. This bit is cleared to 0 at power up or reset, resetting the TOY. When reset, the TOY loses neither its date nor its 50 bytes of permanent storage.

SSR<9>
This bit can be read from and written to. The 53C94 SCSI controller is placed in a hard reset state when this bit is 0. This bit is cleared to 0 at power up or reset, resetting the 53C94 SCSI controller.

SSR<8>
This bit can be read from and written to. LANCE is placed in a hard reset state when this bit is 0. This bit is cleared to 0 at power up or reset, resetting LANCE.

SSR<7:0>
These bits are reserved; not in use.

C.2.3 System Interrupt Register (SIR)

The SIR consists of two sections. Bits <31:16> are set by the DMA engine for various DMA conditions. These bits are always set by the system and can be cleared by writing 0 to them. Writing 1 has no effect. These bits are cleared to 0 during system power up or reset.

Bits <15:0> reflect the status of specific system devices and are read-only. A few of these are not usually used as interrupts and should be masked. These bits may or may not be reset to 0 during system power up or reset, depending on the state of the interrupting device.
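The write-0-to-clear rule for SIR bits <31:16> is easy to get backwards, so a small software model may help. This is a sketch only: the function names are hypothetical, and the model covers just the rule stated above (writing 0 clears a bit in the upper half, writing 1 leaves it alone, and the lower half is read-only). It does not model the hardware setting bits again after the write.

```python
W0C_MASK = 0xFFFF0000   # bits <31:16>: cleared by writing 0
RO_MASK = 0x0000FFFF    # bits <15:0>: read-only device status


def sir_after_write(current, written):
    """Return the modeled SIR contents after software writes `written`."""
    upper = current & written & W0C_MASK  # a 0 in `written` clears the bit
    lower = current & RO_MASK             # writes never change these bits
    return upper | lower


def value_to_clear(bits_to_clear):
    """Build a write value that clears only the given bits.

    Every other position is written as 1, which has no effect.
    """
    return 0xFFFFFFFF & ~bits_to_clear


# Hypothetical register contents: bits 31 and 28 set, plus device bit 9.
current = 0x90000200
after = sir_after_write(current, value_to_clear(1 << 31))
print(hex(after))
```

Writing all ones except the target bit is the safe pattern here, because a read-modify-write that writes back 0 in other positions would clear interrupts that software has not yet handled.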
Table C-5: System Interrupt Register 0xBF840110

Bits    Access  Description
31      R/W0C   Communication port 1 transmit page end interrupt
30      R/W0C   Communication port 1 transmit DMA memory read error
29      R/W0C   Communication port 1 receive half page interrupt
28      R/W0C   Communication port 1 receive DMA page overrun
27      R/W0C   Communication port 2 transmit page end interrupt
26      R/W0C   Communication port 2 transmit DMA memory read error
25      R/W0C   Communication port 2 receive half page interrupt
24      R/W0C   Communication port 2 receive DMA overrun
23      R/W0C   Reserved
22      R/W0C   Reserved
21      R/W0C   Reserved
20      R/W0C   Reserved
19      R/W0C   SCSI DMA interrupt (DMA buffer pointer loaded)
18      R/W0C   SCSI DMA overrun error
17      R/W0C   SCSI DMA memory read error
16      R/W0C   LANCE DMA memory read error
15      R       Reserved
14      R       NVR mode jumper
13      R       TURBOchannel slot 2 interrupt
12      R       TURBOchannel slot 1 interrupt
11      R       TURBOchannel slot 0 interrupt
10      R       NRMMOD manufacturing mode jumper
9       R       SCSI interrupt from 53C94 SCSI controller
8       R       Ethernet interrupt
7       R       SCC(1) serial interrupt (communication 2)
6       R       SCC(0) serial interrupt (communication 1)
5       R       Reserved
4       R       PSWARN power supply warning indicator
3       R       Reserved
2       R       SCSI data ready
1       R       PBNC
0       R       PBNO

NOTE: Communication port 1 is the same as serial line 2. Communication port 2 is the same as serial line 3.

SIR<31>
This interrupt is generated by the communication port 1 transmit DMA logic. The DMA transmitter, when enabled, transmits bytes until the pointer reaches a 4-Kbyte page boundary. At this point, it stops DMA and interrupts the processor. DMA is disabled whenever this bit is set. To restart, clear this bit by writing 0; writing 1 has no effect.

SIR<30>
When a parity error, page crossing error, or maximum transfer length error occurs during a communication port 1 transmit DMA, this bit is set and the DMA is disabled.
The DMA pointer contains the error address. Check the memory sections for more information. To restart, software must clear this bit by writing 0; writing 1 has no effect.

SIR<29>
When the receive DMA pointer associated with communication port 1 reaches a half-page (2-Kbyte) boundary, this bit is set. Software must disable DMA, load a new pointer, and restart DMA without being interrupted. Clear this bit by writing 0; writing 1 has no effect. The value of this bit is informational only and does not stop the DMA.

SIR<28>
When the receive DMA pointer associated with communication port 1 reaches a page boundary, this bit is set and the DMA is disabled. To restart, clear this bit by writing 0; writing 1 has no effect. Note that bit <29> is set whenever this bit is set.

SIR<27>
This interrupt is generated by the communication port 2 transmit DMA logic. The DMA transmitter, when enabled, transmits bytes until the pointer reaches a page boundary. At this point, it stops DMA and interrupts the processor. DMA is disabled whenever this bit is set. Clear this bit by writing 0; writing 1 has no effect. Clearing this bit may restart the DMA if the DMA enable bit is still on.

SIR<26>
When a parity error, page crossing error, or maximum transfer length error occurs during a communication port 2 transmit DMA, this bit is set and the DMA is disabled. The DMA pointer contains the error address. Check the memory sections for more information. To restart, software must clear this bit by writing 0; writing 1 has no effect.

SIR<25>
When the receive DMA pointer associated with communication port 2 reaches a half-page (2-Kbyte) boundary, this bit is set. Software must disable DMA, load a new pointer, and restart DMA quickly. Clear this bit by writing 0; writing 1 has no effect. This bit is always set when bit <24> is set. The value of this bit is informational only and does not stop the DMA.
SIR<24>
When the receive DMA pointer associated with communication port 2 reaches a page boundary, this bit is set and the DMA is disabled. To restart, clear this bit by writing 0; writing 1 has no effect. Note that bit <25> is also set whenever this bit is set.

SIR<23:20>
These bits are reserved.

SIR<19>
This interrupt is set whenever the SCSI DMA buffer pointer associated with the SCSI port is loaded into the SCSI DMA pointer register. Software uses this interrupt to load a new buffer pointer into the SCSI buffer pointer register. Clear this interrupt by writing 0 to it.

SIR<18>
This bit is set when the buffer pointer is not reloaded soon enough. It indicates an overrun condition because the data buffer space is exhausted. DMA is disabled when this bit is set. Clear this bit by writing 0 to it.

SIR<17>
This bit is set when the SCSI DMA encounters a memory read error during a DMA. DMA is disabled when this bit is set. Clear this bit by writing 0 to it.

SIR<16>
This bit is set to 1 when the LANCE DMA encounters a memory read error, disabling DMA. The LANCE then enters a timeout state, interrupting the processor to handle the problem. The address of the error is visible in the LPR. Clear this bit by writing 0 to it; writing 1 to it has no effect.

SIR<15>
This bit is reserved.

SIR<14> UNSCUR
When this bit is set, the contents of the NVRAM in the TOY clock chip are set to default system values. Any password that had been saved is lost.

SIR<13> TCO 2 interrupt
This bit reflects the value of the TURBOchannel slot 2 interrupt.

SIR<12> TCO 1 interrupt
This bit reflects the value of the TURBOchannel slot 1 interrupt.

SIR<11> TCO 0 interrupt
This bit reflects the value of the TURBOchannel slot 0 interrupt.

SIR<10> NRMMOD
This bit reflects the state of the manufacturing jumper on the module. When the jumper is absent, NRMMOD is 0, and the console should perform its normal power up or reset tests and boot.
When the jumper is installed, NRMMOD is 1, and the console executes manufacturing tests.

SIR<9>
This bit follows the state of the interrupt from the 53C94 SCSI controller chip. This interrupt indicates that the transfer is complete.

SIR<8>
This bit follows the state of the interrupt from the LANCE.

SIR<7>
This interrupt is generated by SCC(1), which contains communication port 2 (ch B). Software must read the SCC(1) internal registers to determine the appropriate course of action. Communication port 2 is the same as serial line 3.

SIR<6>
This interrupt is generated by SCC(0), which contains communication port 1 (ch B). Software must read the SCC(0) internal registers to determine the appropriate course of action. Communication port 1 is the same as serial line 2.

SIR<5>
This bit follows the state of the time-of-year clock interrupt.

SIR<4>
This bit follows the state of the power supply warning indicator. When the power supply overheats, this bit is set to 1. When this bit is set, the operating system should report an error.

SIR<3>
This bit is reserved.

SIR<2>
This bit indicates SCSI receive data in the FIFO of the 53C94 SCSI controller. When transfers are aligned and the DMA is enabled, data is moved from the FIFO to main memory by the I/O control ASIC, and this interrupt is masked by software. Unaligned transfers cannot use DMA and thus cannot use this interrupt to signal when the processor must move data to memory.

SIR<1>
This bit reflects the state of the halt button on the back of the system unit. This bit is set to 0 when the button is pushed. This interrupt should always be masked.

SIR<0>
This bit reflects the state of the halt button on the back of the system unit. This bit is set to 1 when the halt button is pushed. On R3000A systems, this interrupt should always be masked. The halt interrupt is also presented at the processor interface, so it should be visible to the CPU.
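As a reading aid for the read-only half of the SIR, the sketch below turns a raw register value into a list of the asserted device conditions. The bit positions follow Table C-5; the dictionary and function names are hypothetical, and only a subset of bits <15:0> is covered (reserved bits, the jumpers, and the halt-button bits are left out).

```python
# Hypothetical helper; bit positions follow Table C-5, names are illustrative.
SIR_STATUS_NAMES = {
    13: "TURBOchannel slot 2 interrupt",
    12: "TURBOchannel slot 1 interrupt",
    11: "TURBOchannel slot 0 interrupt",
    9: "SCSI interrupt (53C94)",
    8: "Ethernet interrupt (LANCE)",
    7: "SCC(1) serial interrupt",
    6: "SCC(0) serial interrupt",
    4: "PSWARN power supply warning",
    2: "SCSI data ready",
}


def active_status_bits(sir: int) -> list[str]:
    """List the named SIR<15:0> conditions that are set, highest bit first."""
    return [
        name
        for bit, name in sorted(SIR_STATUS_NAMES.items(), reverse=True)
        if sir & (1 << bit)
    ]


if __name__ == "__main__":
    # Bit 31 is in the DMA half of the register and is ignored here.
    for line in active_status_bits((1 << 31) | (1 << 9) | (1 << 4)):
        print(line)
```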
C.2.4 System Interrupt Mask Register

Table C-6: System Interrupt Mask Register 0xBF840120

Bits    Access  Description
31:0    R/W     Interrupt mask

<31:0>
These bits, if 0, mask the corresponding interrupt observable in the SIR. Bit <0> masks SIR<0>, bit <1> masks SIR<1>, and so on. The mask does not prevent an interrupt from showing up in the SIR; it merely keeps the CPU from being interrupted. All bits of the interrupt mask are set to 0 on power up, masking all interrupts. Software must set to 1 the bits for those interrupts that it wants enabled.

C.2.5 Error Address Register (EAR)

The error address register (EAR) (address: 0xBFA40000) is the primary error log register that records the physical address of TC I/O timeouts, TC DMA overruns, and memory ECC errors. The EA register is cleared by system reset or by a processor write. When an error occurs, EA.VALID is set along with the log bits. Table C-7 shows the format of the EA register during reads.

Table C-7: Error Address Register 0xBFA40000

Base    Size    Name
31      1       VALID
30      1       CPU
29      1       WRITE
28      1       ECCERR
27      1       RSRVD
0       27      ADDRESS

EAR<31> - EA.VALID
This bit is set to 1 when error information is clocked into the register. When EA.VALID is already set, error logging is disabled. That is, the EA register indicates only the first error that occurred if there are multiple errors.

EAR<30> - EA.CPU
If this bit is 1, the error occurred during a processor transaction. If this bit is 0, the error occurred during a TC DMA transaction.

EAR<29> - EA.WRITE
If this bit is 1, the error occurred on an I/O write or memory write transaction. If this bit is 0, the error occurred on an I/O read or memory read transaction.

EAR<28> - EA.ECCERR
If this bit is 1, an ECC error occurred. If this bit is 0, an I/O timeout or DMA overrun occurred.

EAR<27> - EA.RSRVD
This bit is reserved and stuck at 0.

EAR<26:0> - EA.ADDRESS
This field records the value of the pipelined address in effect at the time the error occurred.
For I/O transactions and partial memory writes, this is the word address issued by the processor. For DMA overrun errors, this is the word address of the last valid word transferred (127). For processor and DMA memory reads, this is the word address in the memory controller. However, due to pipelining of the memory controller, the column field of the word address has advanced five stages before the ECC error status is available. Software must extract ADDRESS[11:0], perform a signed subtract of five, and then reinsert this value into ADDRESS[11:0] to recover the address of the word that contained the ECC error. Table C-8 lists the values of bits <30>, <29>, and <28> for the different types of system errors.

During read conflicts, the memory controller may service the same read request several times (while stalling the processor) until conflicting write data in the write buffer has been flushed. It is possible for ECC read errors to occur during processor reread conflicts when the processor is stalled. However, after the write buffer is flushed, the error is overwritten with new data, so the processor will not receive a bus error on termination of the read.

Also, if the processor is waiting for a memory space partial write to complete, and a multibit ECC error occurs during the read/modify/write of the partial data, invalid data and valid ECC check bits will be loaded into memory. In this case, the ensuing read will complete without causing an exception even though the read data is invalid. If the address is a cached location, invalid data will be loaded into the cache and the cache entry will be incorrectly marked valid.

Regardless of the type of masked error, a memory interrupt will be generated, and the offending ECC read error or processor partial write error will be correctly logged in the EA and ES registers.
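The EA register layout and the column-field adjustment described above can be sketched as follows. This is a minimal illustration, not firmware from the manual: the class and function names are hypothetical, the error-type strings are copied from Table C-8, and the subtraction is assumed to wrap within the 12-bit column field (the manual says only "signed subtract of five").

```python
from typing import NamedTuple


class ErrorAddress(NamedTuple):
    valid: bool    # EA<31>
    cpu: bool      # EA<30>
    write: bool    # EA<29>
    eccerr: bool   # EA<28>
    address: int   # EA<26:0>, a word address


# (CPU, WRITE, ECCERR) -> error type, as listed in Table C-8.
ERROR_TYPES = {
    (0, 0, 0): "DMA read overrun",
    (0, 0, 1): "DMA memory read",
    (0, 1, 0): "DMA memory write",
    (0, 1, 1): "Invalid combination",
    (1, 0, 0): "Processor I/O read timeout",
    (1, 0, 1): "Processor memory read ECC",
    (1, 1, 0): "Processor I/O write timeout",
    (1, 1, 1): "Processor partial memory write ECC",
}


def decode_ear(value: int) -> ErrorAddress:
    """Split a 32-bit EA register value into its fields."""
    return ErrorAddress(
        valid=bool((value >> 31) & 1),
        cpu=bool((value >> 30) & 1),
        write=bool((value >> 29) & 1),
        eccerr=bool((value >> 28) & 1),
        address=value & 0x07FFFFFF,
    )


def error_type(ea: ErrorAddress) -> str:
    """Look up the Table C-8 description for a decoded EA value."""
    return ERROR_TYPES[(int(ea.cpu), int(ea.write), int(ea.eccerr))]


def corrected_ecc_address(address: int) -> int:
    """Undo the five-stage column advance for memory-read ECC errors.

    Assumption: the result wraps modulo 4096 inside ADDRESS[11:0].
    """
    column = ((address & 0xFFF) - 5) & 0xFFF
    return (address & ~0xFFF) | column


if __name__ == "__main__":
    ea = decode_ear(0xD0001234)
    print(error_type(ea), hex(corrected_ecc_address(ea.address)))
```

The correction applies only to the read cases the text names (processor and DMA memory reads); for I/O transactions and partial writes the logged address is used as is.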
Table C-8: EA Error Log Types

bit <30>  bit <29>  bit <28>
CPU       WRITE     ECCERR    Error Type
0         0         0         DMA read overrun
0         0         1         DMA memory read
0         1         0         DMA memory write
0         1         1         Invalid combination
1         0         0         Processor I/O read timeout
1         0         1         Processor memory read ECC
1         1         0         Processor I/O write timeout
1         1         1         Processor partial memory write ECC

If the MB ASIC has prefetching enabled, it is possible to log processor read hard ECC errors without a processor error if the ECC error occurs in the prefetched portion of the cache block.

C.2.6 Error Syndrome Register (ES)

The error syndrome (ES) register (address: 0xBFA80000) is a slave error log register that records check bits and syndrome bits of the last memory read. The ES register is frozen when EA<31> is 1. The ES register is cleared by system reset and processor writes. Table C-9 shows the format of the ES register during reads. The syndrome bytes are only valid if EA<31> is 1. The CHKHI byte is only valid if the VLDHI bit <31> is set to 1. The CHKLO byte is only valid if the VLDLO bit <15> is set to 1.

Table C-9: Error Syndrome Register 0xBFA80000

Base    Size    Name
31      1       VLDHI
24      7       CHKHI
23      1       SNGHI
16      7       SYNHI
15      1       VLDLO
8       7       CHKLO
7       1       SNGLO
0       7       SYNLO

ES<31> - ES.VLDHI
This bit is set to 1 whenever the CHKHI field <30:24> is updated.

ES<30:24> - ES.CHKHI
In the absence of errors, this field records the last check bits read from the high bank of memory (odd word). Once an error occurs and the EA.VALID bit (EA<31>) is set to 1, this field is frozen.

ES<23> - ES.SNGHI
This bit records the single-versus-double bit error output of the ECC logic at the time that an error was detected by the high bank of memory. If it is 1, a single-bit error occurred. If it is 0, a double-bit error occurred. This bit is valid when the ES.SYNHI <22:16> field is valid.

ES<22:16> - ES.SYNHI
This field records the syndrome bits calculated by the ECC logic at the time that an error was detected by the high bank of memory (odd words).
The EA.ADDRESS field (EA<26:0>) must be used to determine whether the error pertains to a low or high word of memory. This field is undefined for low bank errors. The syndrome can be used to determine which bit was in error. See Section C.2.8 for a description of the syndrome logic.

ES<15> - ES.VLDLO
This bit is set to 1 whenever the CHKLO field <14:8> is updated.

ES<14:8> - ES.CHKLO
In the absence of errors, this field records the last check bits read from the low bank of memory (even word). Once an error occurs and the EA.VALID bit (EA<31>) is set to 1, this field is frozen.

ES<7> - ES.SNGLO
This bit records the single-versus-double bit error output of the ECC logic at the time that an error was detected by the low bank of memory. If it is 1, a single-bit error occurred. If it is 0, a double-bit error occurred. This bit is valid when the ES.SYNLO <6:0> field is valid.

ES<6:0> - ES.SYNLO
This field records the syndrome bits calculated by the ECC logic at the time that an error was detected by the low bank of memory (even words). The EA.ADDRESS field (EA<26:0>) must be used to determine whether the error pertains to a low or high word of memory. This field is undefined for high bank errors. The syndrome can be used to determine which bit was in error. See Section C.2.8 for a description of the syndrome logic.

C.2.7 Control Register (CS)

The control (CS) register (address: 0xBFAC0000) controls the memory array size decoding via the CS.BNK32M bit <10>. The CS register also controls the ECC data path. CS is a read/write register that is cleared by system reset. Table C-10 shows the format of the CS register during reads and writes.
Table C-10: Control Register 0xBFAC0000

Base    Size    Name
16      16      RSRVD2
15      1       DIAGCHK
14      1       DIAGGEN
13      1       CORRECT
11      2       RSRVD1
10      1       BNK32M
7       3       RSRVD0
0       7       CHECK

CS<31:16> - CS.RSRVD2
This field must be written with zeros.

CS<15> - CS.DIAGCHK
This bit controls a diagnostic multiplexor in the ECC read data path. If the CS.DIAGCHK bit <15> is 0, check bits from the memory array are used during memory reads. If the CS.DIAGCHK bit <15> is 1, the CS.CHECK field <6:0> specifies the check bits during memory reads. Since CS is cleared by system reset, check bits are read from memory by default.

CS<14> - CS.DIAGGEN
This bit controls a diagnostic multiplexor in the ECC write data path. If the CS.DIAGGEN bit <14> is 0, check bits are calculated from the processor or TC data word during memory writes. If the CS.DIAGGEN bit <14> is 1, the CS.CHECK field <6:0> specifies the check bits during memory writes. Since CS is cleared by system reset, check bits are generated from processor or TC data by default.

CS<13> - CS.CORRECT
This bit controls whether or not the ECC logic corrects single-bit errors in memory read data. When this bit is 1, the single-bit error in the read data is complemented as specified by the ECC syndrome. When this bit is 1 and the ECC logic detects a multibit error, the output of the ECC logic is undefined. The state of this bit does not affect memory interrupts, error logging, or bus errors; it only controls modification of memory data. Since CS is cleared by system reset, ECC correction is disabled by default.

CS<12:11> - CS.RSRVD1
This field must be written with zeros.

CS<10> - CS.BNK32M
This bit controls the memory bank stride. If this bit is 0, the stride is 8 Mbytes. If this bit is 1, the stride is 32 Mbytes. Power up/reset software sets this bit and determines whether each memory module is an 8- or 32-Mbyte module. Then, if no 32-Mbyte modules are found, this bit is cleared.
Since CS is cleared by system reset, the memory bank stride defaults to 8 Mbytes.

CS<9:7> - CS.RSRVD0
This field must be written with zeros.

CS<6:0> - CS.CHECK
This field specifies the diagnostic check value used by the CS.DIAGCHK and CS.DIAGGEN multiplexors.

C.2.8 ECC Logic

This section describes the error correction code (ECC) logic. MT generates seven check bits for each word written to the memory arrays. For each word read from the memory arrays, MT verifies that the check bits are consistent with the data bits. If a single-bit error is detected, the erroneous bit is automatically corrected if CS.CORRECT (bit CS<13>) is 1. If a single- or double-bit error is detected and EA.VALID (bit EA<31>) is 0, the EA and ES registers are written and frozen with the address, check, and syndrome bits of the memory word.

Table C-11 lists the data bits included in the exclusive-or logic for each check bit. The ES.CHKLO (ES<14:8>), ES.CHKHI (ES<30:24>), and CS.CHECK (CS<6:0>) fields correspond to:

    64*C16 | 32*C8 | 16*C4 | 8*C2 | 4*C1 | 2*C0 | CX

Table C-11: Participating Data Bits in Check Bit Calculation

Bit   Parity  Data Bits 0-15              Data Bits 16-31
CX    Even    0 4 6 7 8 9 11 14           17 18 19 21 26 28 29 31
C0    Even    0 1 2 4 6 8 10 12           16 17 18 20 22 24 26 28
C1    Odd     0 3 4 7 9 10 13 15          16 19 20 23 25 26 29 31
C2    Odd     0 1 5 6 7 11 12 13          16 17 21 22 23 27 28 29
C4    Even    2 3 4 5 6 7 14 15           18 19 20 21 22 23 30 31
C8    Even    8 9 10 11 12 13 14 15       24 25 26 27 28 29 30 31
C16   Even    0 1 2 3 4 5 6 7             24 25 26 27 28 29 30 31

Table C-12 lists the significance of each syndrome code logged in the ES register. The multibit syndrome codes are shown for completeness; MT does not report these as hard errors with the assertion of either p.mc.~rErr or t.mo.~err as appropriate. MT only reports double-bit errors as hard errors. If the operating system detects a multibit error syndrome code, it should log the error and shut down immediately.
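The relationship between Table C-11 and the syndrome codes can be illustrated with a short sketch: a single flipped data bit produces a syndrome equal to the combined weights of every check bit that bit participates in. The code below builds that lookup from the participation lists. It is an illustration under stated assumptions, not the MT implementation: it assumes the weighting 64*C16 | 32*C8 | 16*C4 | 8*C2 | 4*C1 | 2*C0 | CX, assumes C8 covers data bits 8-15 and 24-31 and C16 covers 0-7 and 24-31, and classifies the remaining codes by parity of the syndrome weight (even = double-bit, odd = multibit), which is an inference from the code structure rather than a statement in the manual. All names are hypothetical.

```python
# Participation lists transcribed from Table C-11, keyed by the check bit's
# weight in the syndrome (CX=0x01, C0=0x02, C1=0x04, ... C16=0x40).
PARTICIPANTS = {
    0x01: [0, 4, 6, 7, 8, 9, 11, 14, 17, 18, 19, 21, 26, 28, 29, 31],   # CX
    0x02: [0, 1, 2, 4, 6, 8, 10, 12, 16, 17, 18, 20, 22, 24, 26, 28],   # C0
    0x04: [0, 3, 4, 7, 9, 10, 13, 15, 16, 19, 20, 23, 25, 26, 29, 31],  # C1
    0x08: [0, 1, 5, 6, 7, 11, 12, 13, 16, 17, 21, 22, 23, 27, 28, 29],  # C2
    0x10: [2, 3, 4, 5, 6, 7, 14, 15, 18, 19, 20, 21, 22, 23, 30, 31],   # C4
    0x20: list(range(8, 16)) + list(range(24, 32)),                     # C8 (assumed)
    0x40: list(range(0, 8)) + list(range(24, 32)),                      # C16 (assumed)
}
CHECK_NAMES = {0x01: "CX", 0x02: "C0", 0x04: "C1", 0x08: "C2",
               0x10: "C4", 0x20: "C8", 0x40: "C16"}


def data_bit_syndromes() -> dict[int, str]:
    """Map each single-data-bit syndrome to the name of the failing bit."""
    table = {}
    for bit in range(32):
        syndrome = 0
        for weight, bits in PARTICIPANTS.items():
            if bit in bits:
                syndrome |= weight
        table[syndrome] = f"D{bit}"
    return table


SINGLE_DATA = data_bit_syndromes()


def classify(syndrome: int) -> str:
    """Classify a 7-bit syndrome code."""
    if syndrome == 0:
        return "None"
    if syndrome in CHECK_NAMES:      # exactly one check bit disagrees
        return CHECK_NAMES[syndrome]
    if syndrome in SINGLE_DATA:      # matches one data bit's signature
        return SINGLE_DATA[syndrome]
    # Inferred rule: every single-bit signature has odd weight, so two
    # errors always give an even-weight syndrome.
    return "Double" if bin(syndrome).count("1") % 2 == 0 else "Multi"


if __name__ == "__main__":
    for code in (0x00, 0x20, 0x23, 0x4A, 0x62, 0x03, 0x07):
        print(f"{code:02X}: {classify(code)}")
```

The legible entries of Table C-12 (for example 23 = D8, 4A = D1, 62 = D24) agree with this construction, which is a useful cross-check when reading a damaged copy of the table.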
Table C-12: Syndrome Decoding

Syndrome  Error    Syndrome  Error    Syndrome  Error    Syndrome  Error
00        None     20        C8       40        C16      60        Double
01        CX       21        Double   41        Double   61        Multi
02        C0       22        Double   42        Double   62        D24
03        Double   23        D8       43        Multi    63        Double
04        C1       24        Double   44        Double   64        D25
05        Double   25        D9       45        Multi    65        Double
06        Double   26        D10      46        Multi    66        Double
07        Multi    27        Double   47        Double   67        D26
08        C2       28        Double   48        Double   68        D27
09        Double   29        D11      49        Multi    69        Double
0A        Double   2A        D12      4A        D1       6A        Double
0B        D17      2B        Double   4B        Double   6B        D28
0C        Double   2C        D13      4C        Multi    6C        Double
0D        Multi    2D        Double   4D        Double   6D        D29
0E        D16      2E        Double   4E        Double   6E        Multi
0F        Double   2F        Multi    4F        D0       6F        Double
10        C4       30        Double   50        Double   70        D30
11        Double   31        D14      51        Multi    71        Double
12        Double   32        Multi    52        D2       72        Double
13        D18      33        Double   53        Double   73        Multi
14        Double   34        D15      54        D3       74        Double
15        D19      35        Double   55        Double   75        D31
16        D20      36        Double   56        Double   76        Multi
17        Double   37        Multi    57        D4       77        Double
18        Double   38        Multi    58        D5       78        Double
19        D21      39        Double   59        Double   79        Multi
1A        D22      3A        Double   5A        Double   7A        Multi
1B        Double   3B        Multi    5B        D6       7B        Double
1C        D23      3C        Double   5C        Double   7C        Multi
1D        Double   3D        Multi    5D        D7       7D        Double
1E        Double   3E        Multi    5E        Multi    7E        Double
1F        Multi    3F        Double   5F        Double   7F        Multi

Appendix D
Connector Pin Assignments

This appendix lists pin assignments for the following connectors:

* SCSI cable connectors
* Serial communications connectors
* ThickWire Ethernet connectors
* Modem loopback connectors
* Ethernet loopback connectors

It also provides a summary of loopback connectors.

Table D-1: SCSI Cable Connector Pin Assignments

Pin   Signal       Pin   Signal
50    ~I/O         25    GND
49    ~REQ         24    GND
48    ~C/D         23    GND
47    ~SEL         22    GND
46    ~MSG         21    GND
45    ~RST         20    GND
44    ~ACK         19    GND
43    ~BSY         18    GND
42    GND          17    GND
41    ~ATN         16    GND
40    GND          15    GND
39    RSVD         14    GND
38    TERMPWR      13    NC
37    RSVD         12    GND
36    GND          11    GND
35    GND          10    GND
34    ~PARITY      9     GND
33    ~DATA<7>     8     GND
32    ~DATA<6>     7     GND
31    ~DATA<5>     6     GND
30    ~DATA<4>     5     GND
29    ~DATA<3>     4     GND
28    ~DATA<2>     3     GND
27    ~DATA<1>     2     GND
26    ~DATA<0>     1     GND

Table D-2: Serial Communications Connectors Pin Assignments

Pin   Source          Signal       CCITT(1)  EIA(2)  Description
1                     GND          102       AB      Signal ground
2     KN03A-AA        TX           103       BA      Modem transmitted data
3     Modem/printer   RX           104       BB      Modem received data
4     KN03A-AA        RTS          105       CA      Request to send
5     Modem/printer   CTS          106       CB      Clear to send
6     Modem/printer   DSR          107       CC      Data set ready
7                     GND          102       AB      Signal ground
8     Modem/printer   CD           109       CF      Carrier detector
9                                                    Unconnected
10                                                   Unconnected
11                                                   Unconnected
12    Modem/printer   SI           112       CI      SPDMI
13                                                   Unconnected
14                                                   Unconnected
15    Modem/printer   TxCk (DCE)   114       DB      Modem transmit clock
16                                                   Unconnected
17    Modem/printer   RxCk (DCE)   115       DD      Modem receive clock
18                                                   Unconnected
19                                                   Unconnected
20    KN03A-AA        DTR          108.2     CD      Data terminal ready
21                                                   Unconnected
22    Modem/printer   RI           125       CE      Ring indicator
23    KN03A-AA        SS           111       CH      DSRS
24                                                   Unconnected
25                                                   Unconnected

(1) Comite Consultatif International Telegraphique et Telephonique, an international consultative committee that sets international communications standards
(2) Electronic Industries Association

Table D-3: ThickWire Ethernet Connector Pin Assignments

Pin   Source      Signal   Description
1                          Shield
2     XCVR        ACOL+    Collision presence
3     KN03A-AA    ATX+     Transmission
4                 GND      Ground
5     XCVR        ARX+     Reception
6     XCVR        GND      Power return
7                 CTL+     Control output
8                 GND      Ground
9     XCVR        ACOL-    Collision presence
10    KN03A-AA    ATX-     Transmission
11                GND      Ground
12    XCVR        ARX-     Reception
13    KN03A-AA    +12V     Power
14                GND      Ground
15                CTL-     Control output

Table D-4: Power Supply Pin Assignments

Cable       Pin   Signal
Red               5 volt supply
Black             5 volt return
Multilead   1     POK
            2     +12 volt return
            3     +12 volt supply
            4     WARN
            5     -12 volt return
            6     -12 volt supply

Table D-5: Modem Loopback Connector Pin Assignments

From Pin No.  Signal    To Pin No.  Signal
P4-2          TX2       P4-3        RX2
P4-4          RTS2      P4-5        CTS2
P4-6          DSR2      P4-20       DTR2
P4-12         SPDMI2    P4-23       DSRS2
P4-18         LLPBK2    P4-8        CI2
P4-18         LLPBK2    P4-22       RI2
P4-18         LLPBK2    P4-25       TMI2

Table D-6: Ethernet Loopback Connector Pin Assignments

From Pin No.  Signal  To Pin No.  Signal  Description
P6-3          TRA+                REC+    Through capacitor
P6-10         TRA-    P6-12       REC-    Through capacitor
P6-13         PWR                 RET     Through resistor and LED

Table D-7: Summary of Loopback Connectors

Function                            Standard/Unique  Part Number   Option Number
Communications loopback connector   Standard         12-15336-13   H3200
ThickWire loopback connector        Standard         12-22196-02   N/A
ThinWire T-connector                Standard         12-25869-01   H8223
ThinWire terminator                 Standard         12-26318-01   H8225

Appendix E
ULTRIX System Exercisers

The ULTRIX operating system contains a set of commands called exercisers. The exercisers reside in the /usr/field directory and allow you to test all or part of your system by exercising specified parts.

NOTE: The ULTRIX exercisers are not a mandatory subset and may not be installed on your system. Subset UDTEXER must be installed for the exercisers to be present.

The following ULTRIX-based exercisers are currently available and can be used to exercise and test the DECsystem 5900:

* fsx = file system exerciser
* memx = memory exerciser
* shmx = shared memory exerciser
* dskx = disk exerciser
* mtx = magnetic tape exerciser
* tapex = tape exerciser program
* netx = TCP/IP network exerciser
* cmx = communications exerciser
* lpx = line printer exerciser

To run these exercisers, the operator must be logged in as superuser (root) and then change directory to /usr/field. All of the exercisers can be run in either the foreground or the background and can be canceled at any time by pressing CTRL/C in the foreground. More than one exerciser can be run at the same time. To run more than one exerciser simultaneously, a shell script called syscript is used.
The syscript command asks which exercisers are to be run, how long they are to run, and how many are to be run at one time. The syscript command can be used to exercise a device, a subsystem, or the entire system.

Each time an exerciser is invoked, a new logfile is generated in the /usr/field directory. The logfile is a record of the exerciser's results and consists of the starting and stopping times and error and statistical information.

E.1 File System Exerciser (fsx)

The file system exerciser (fsx) is used to exercise a file system locally. Fsx exercises the specified file system by initiating multiple processes that create, write, close, open, read, validate, and unlink a test file of random data. The format of the fsx command is:

    fsx -h -ofile -pn -fpath -tmin &

-h       Prints the help message for fsx.
-ofile   Saves the output in the designated file.
-pn      Specifies the number of fsx processes to initiate. The maximum is 250; the default is 20.
-fpath   Specifies the pathname of the directory of the file system to test. The default is /usr/field.
-tmin    Specifies the number of minutes fsx is to run.
&        Runs fsx in the background.

The following example starts 5 processes and tests the /usr file system for 60 minutes in the background:

    # fsx -p5 -f/usr -t60 &

E.2 Memory Exerciser (memx)

The memx command exercises system memory. The memx command runs ones and zeros, zeros and ones, and random data patterns in the allocated memory being tested. The format of the memx command is:

    memx -h -ofile -s -mn -pn -tmin &

-h       Prints the help message for the memx command.
-ofile   Saves the output in the designated file.
-s       Disables the automatic invocation of shmx.
-mn      Specifies the amount of memory in bytes to test. The default is the total amount of memory divided by 20.
-pn      Specifies the number of memx processes to initiate. The maximum is 20, which is also the default.
-tmin    Specifies the number of minutes memx is to run.
&        Runs memx in the background.
The following example disables the shared memory exerciser, tests 4095 bytes of memory, starts 5 processes, and runs for 60 minutes in the background:

    # memx -s -m4095 -p5 -t60 &

E.3 Shared Memory Exerciser (shmx)

The shmx command tests shared memory. Shmx spawns a background process called shmxb, and together shmx and shmxb exercise the shared memory segments. They take turns writing and reading each other's data. The format of shmx is:

    shmx -h -ofile -ti -mj -sk -v &

-h       Prints the help message for the shmx command.
-ofile   Saves output in the designated file.
-ti      Indicates the time in minutes (i) to run shmx.
-mj      Specifies the size in bytes (j) of the memory segment to be tested.
-sk      Specifies the number of memory segments (k). The maximum and default is 6.
-v       Uses the fork system call instead of the vfork system call to spawn shmxb.
&        Runs shmx in the background.

The following example runs shmx for 180 minutes, tests 100,000 bytes on three memory segments, and runs in the background:

    # shmx -t180 -m100000 -s3 &

E.4 Disk Exerciser (dskx)

The dskx command exercises disk drives. The dskx command exercises specified partitions and file systems on the designated disk.

CAUTION: The -p and -c options destroy data on the device you are testing. Use extreme caution when using either of these options.

The format of the dskx command is:

    dskx -h -ofile -pdevpart -cdev -rdev -dmin -tmin &

-h          Prints the help message for the dskx command.
-ofile      Saves the output in the designated file.
-pdevpart   Performs random seeks, writes, and reads of random data and block sizes on the specified partition (part) of the device (dev), and validates the data. You cannot test partition c because the test would corrupt the bad block information.
-cdev       Performs random seeks, writes, and reads on all partitions of the specified device (dev) except partition c. Partition c is not tested because the test would corrupt the bad block information.
-rdev       Performs random seeks and reads on all partitions of the specified device (dev). It is safe to use the -r option on all disks because it will not overwrite data.
-tmin       Specifies the run time in minutes.
-dmin       Specifies in minutes how often you want the dskx command to print diagnostics to the terminal. The default is to print diagnostics upon the completion of the exercise.
&           Runs dskx in the background.

The following example runs dskx on rz0 for 20 minutes and displays diagnostic information on the terminal every 5 minutes:

    # dskx -rrz0 -t20 -d5

E.5 Mag Tape Exerciser (mtx)

The mtx command writes, reads, and validates random data on a tape device from BOT to EOT. The format of the mtx command is:

    mtx -h -ofile -rn -fn -sdev# -ldev# -vdev# -adev# -tmin &

-h       Prints the help message for the mtx command.
-ofile   Saves output in the designated file.
-rn      Specifies the record length in bytes; the default is 10240.
-fn      Specifies the length of the files in numbers of records.
-sdev#   Performs a short 512-byte record test that writes, reads, and validates records on the device (dev). The variable dev# is the raw device name and number, such as rmt0h.
-ldev#   Performs a long 10240-byte record test that writes, reads, and validates records on the device (dev). The variable dev# is the raw device name and number, such as rmt0h.
-vdev#   Performs a variable-length record test that writes, reads, and validates random record lengths from 512 to 20480 bytes on the device (dev). The variable dev# is the raw device name and number, such as rmt0h.
-adev#   Performs the short, long, and variable record length tests on the device (dev). The variable dev# is the raw device name and number, such as rmt0h.
-tmin    Specifies the run time in minutes.
&        Runs mtx in the background.
The following example writes 20480-byte records to rmt0h for 60 minutes and runs in the background:

    # mtx -r20480 -lrmt0h -t60 &

E.6 Tape Exerciser (tapex)

The tapex command is similar to the mtx command but performs additional tests, for example, positioning tests for records and files. There are over 30 options that can be used with the tapex command, and space does not allow for their inclusion here. To view the tapex command help file, use the following command:

    # tapex -h

E.7 Network Exerciser (netx)

The netx command exercises the TCP/IP network. The netx command sets up a stream socket connection with netx acting as the client and the miscd utility acting as the server in the TCP/IP internet domain. Using the connection, netx writes random data to the miscd server. The server loops the data back to netx, and then the data is read and verified against the original data. The format of the netx command is:

    netx -h -pn nodename -tmin &

-h         Prints the help message for netx.
-pn        Specifies the port number to use in the internet domain. The variable n must be less than 32768.
nodename   The name of the remote or local system host running the server.
-tmin      The run time in minutes.
&          Runs netx in the background.

The following example runs a test on node tinker for 60 minutes in the background:

    # netx tinker -t60 &

E.8 Communications Exerciser (cmx)

The cmx command exercises terminal communications. The cmx command writes, reads, and validates random data and packet lengths on the communications line or lines specified. The format of the cmx command is:

    cmx -h -ofile -tmin -l line-1 line-2 line-n... &

-h          Prints the help message for cmx.
-ofile      Saves the output to the designated file.
-tmin       Specifies the run time in minutes.
-l line..   Specifies the line you want to test.
&           Runs cmx in the background.

The following example runs a test on tty00 for 45 minutes in the background.
The following example runs a test on tty00 for 45 minutes in the background.

# cmx -l 00 -t45 &

E.9 Line Printer Exerciser (lpx)

The lpx command exercises line printers by printing a rolling character pattern repeatedly to the device. The format of the lpx command is:

lpx -h -ofile -pn -ddev -tmin

-h        Prints the help message for the lpx command.
-ofile    Saves the output to the designated file.
-pn       Specifies the pause period in n minutes.
-ddev     Specifies the line printer you want to test.
-tmin     Specifies the run time in minutes.
&         Runs lpx in the background.

The following example runs a test on lp1 for 60 minutes in the background.

# lpx -t60 -dlp1 &
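A rolling character pattern of the kind the line printer exerciser (lpx, E.9) prints can be sketched in Python. The manual does not specify the exact pattern, so the character set, the line width and the one-character shift per line are all assumptions here, and the function name rolling_lines is hypothetical.

```python
import string

# Printable characters used for the pattern (an assumed character set).
CHARSET = string.ascii_uppercase + string.digits


def rolling_lines(count, width=40, charset=CHARSET):
    """Yield count lines; each line starts one character later than the previous one."""
    n = len(charset)
    for row in range(count):
        yield "".join(charset[(row + col) % n] for col in range(width))


if __name__ == "__main__":
    for line in rolling_lines(5):
        print(line)
```

Because every line is shifted by one position, each print column eventually receives every character in the set, which is one plausible reason for using such a pattern to exercise a printer.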